Pigeon Hole Principle


Pigeon Hole Principle
On account that I have been seeing a lot of pigeon hole lately (not only in competitive), I’m making this to explain some techniques that you can use to deal with pigeon hole. Now pigeon hole does not come up super often, but it is a concept that anyone should be familiar with. It’s also necessary in some harder probability problems.

Basic idea

The Pigeon Hole Principle states that if n+1 or more pigeons are placed in n holes, then one hole must contain two or more pigeons.
Art of Problem Solving

Yes, this concept seems like a no-brainer. However, its application can often times be really subtle and at times almost impossible to detect. Luckily, it’s also really difficult to invent a pigeon hole problem that stems from a completely original or gimmicky idea, seeing that it is such a well studied technique.

The extended version of this theorem is also very straightforward to state and understand.

The extended version of the Pigeonhole Principle states that if k objects are placed in n boxes then at least one box must hold at least kn\lceil\frac{k}{n}\rceil objects.
Art of Problem Solving

With the basic concept covered, we can now move onto some warmup problems.

Warmup Problems

Let’s start with a math problem.

Problem 1: If a Martian has an infinite number of red, blue, yellow, and black socks in a drawer, how many socks must the Martian pull out of the drawer to guarantee he has a pair? [Source]

Solution: We basically have 4 holes each of a different color. A pair of socks with the same color is similar to a hole with 2 pigeons in it. By the Pigeon Hole Principle, we need 4+1=5 pigeons so that one of our 4 holes have at least 1 pigeon.

Problem 2: Given a n×nn \times n square, prove that if 5 points are placed randomly inside the square, then two of them are at most n2\frac{n}{\sqrt{2}} units apart.[Source]

Solution: We divide the n×nn \times n square into four n2×n2\frac{n}{2} \times \frac{n}{2} squares (pigeonholes). Consequently, at least two points (pigeons) are inside the same n2×n2\frac{n}{2} \times \frac{n}{2} square.

Now that was not too challenging. Let’s start on an actual competitive programming problem.

Problem 3: You are selling computational powers on your N machines. There are 2 types of purchase. A customer can either buy 1 seconds of computational power or Q seconds of computational power. After all purchases have been confirmed, you will assign these tasks to the machines such as to minimize the maximum number of seconds you assign to any computer. If customers purchase A of the first type of power and B units of the second type, how many seconds will you be able to finish all the computation? [Source]
Constraints:
  • 2Q1032\le Q\le10^3
  • 1N,A,B1061\le N,A,B\le10^6
Solution: To start with, we deal with the case where we only have the first type of purchase. This means that we can use pigeon hole principle and conclude that we need at least AN\lceil\frac{A}{N}\rceil seconds to finish. Now we just need to perform something to reduce the problem to the case where there are all type-1 purchase. Let’s assign the second type of purchase first evenly. This means there are at least BN\lfloor\frac{B}{N}\rfloor of the type-2 will be assigned to each computer, leaving Bmod  NB\mod{N} left.

Now if Bmod  NB\mod{N} is 0, we successfully reduced the problem to only type-1 purchase. Otherwise, we need to fill the empty space of the N(Bmod  N)N-(B\mod{N}) columns with only BN\lfloor\frac{B}{N}\rfloor type-2. Time for me to demonstrate with a drawing:

After distributing the type-2 purchases
After you successfully fill the blue region with (N(Bmod  N))Q(N-(B\mod{N}))\cdot Q type-1 purchases, there are only type-1 purchase and you can solve it with the method described above. If you cannot then the answer is clearly QBNQ\cdot\lceil\frac{B}{N}\rceil.

Enough warmups, show me something cool

Yes, I know it’s hard to control your excitement when it comes to pigeon hole problems. Here are my stash of problems that had me go “heck yeah”.

Problem 1: Given N integers and a number M. Output “Yes” if there exist a subset of the N integers such that they sum up to a multiple of M, and “No” otherwise. [Source]

Constraints:
  • N106N \le 10^6
  • M103M\le10^3
Solution: Say we have that N<MN < M. This means that we can do knapsack in O(M2)O(M^2). Now suppose that NMN\ge M. We will use pigeon hole to show how a solution always exist. Even stronger, we are going to prove that a subarray (and not only subset), that sums up to a multiple of n, exists.

We have the array A of integers. Let aia_i be the prefix sum of all elements with index less than or equal to i in A modulo M. More formally,

ai=j=0iAimod  M a_i = \sum_{j=0}^{i}A_i \mod{M}

Observe that aia_i must be in the range [0,M1][0,M-1] by definition of the modulo operation. If aia_i is 0, we are done, since we can just take the prefix of everything up to i and it will be divisible by M by definition of modulo. This means aia_i is in the range [1,M1][1,M-1], which gives us M-1 holes.

Now the question is, how many pigeons do we have. It is clear that the answer is N. We know NM>M1N\ge M>M-1 so there must be a pair (i,j) such that i and j are different and ai=aja_i = a_j. We can just take the subarray from i+1 to j and the sum will be ajai0mod  Ma_j-a_i \equiv 0 \mod{M}. So the answer is always yes if NMN \ge M.

Let’s now go conquer the reason this blog was made in the first place:

Problem 2: Dachey (or someone) is practicing for a chess tournament and plays at least 1 game per day for 77 days. To prevent from burning out, Dachey only plays at most 132 games in total. Prove that there exist a contiguous set of days where Dachey plays exactly 21 games. [Source = Discrete Math II]

Best Solution: Let aia_i again denote the cumulitative sum of games Dachey plays until day i. We have the following condition:

0<a1<a2<a3<...<a77132 0 < a_1 < a_2 < a_3 < ... < a_{77} \le 132

Now we attempt to add 21 to every number to yield:
21<a1+21<a2+21<a3+21<...<a77+21153 21 < a_1+21 < a_2+21 < a_3+21 < ... < a_{77}+21 \le 153

Now if the second sequence has any integer common with the first we are done. We notice that in total there are 772=15477 \cdot 2 = 154 values here, and none of the values in the first list is equal to one another (and same with the second). Since there are 154 pigeons, at least 2 must go into one of the 153 values. Suppose we have a pair, we know that they also cannot belong to the same list. Thus, there always exist at least 1 common number in between the two lists. Let this be (i,j) where i is in the first list and j in the second. We can easily conclude that j<ij<i, and that ajai=21a_j-a_i=21.

Problem 3: An array of integers of size n1n-1 is called beautiful iff every element in it is in the range [1,n1][1,n-1] and non of its subarray sum to 0 modulo nn. Given n and k, find the kthk^{th} lexicographic beautiful array. [Source]
Constraints:
  • 2N10002\le N\le1000
  • 1K10181\le K\le10^{18}
Hint: What does the cummulative sum of any beautiful array of size n1n-1 tell you about the form of the beautiful array?
Best Solution: First, we recognize that if we can quickly compute how many ways to finish the beautiful subarray given values of the first mm positions, then finding kthk^{th} is done in a binary-search like manner.
Suppose you have the first mm values. Attempt to place 1 in the (m+1)th(m+1)^{th} place, if permissible, and compute the number of ways to finish in a beautiful array. If this is less than or equal to kk, we can safely put 1 in the (m+1)th(m+1)^{th} place. Otherwise, we must place something higher than 1, so we can safely subtract the number of ways to finish if you place 1 from kk and continue.

So let’s find out ways to count number of beautiful arrays. First, start with the fact that if the array has to be of size nn then this is an impossible task. Given an array A of size n, let denote the cumulative array of A modulo m by:
ai=j=0iAimod  na_i = \sum_{j=0}^{i}A_i \mod{n}

Again, we can see that aia_i must be in range [0,n1][0,n-1] and if aia_i is 0, we are done. So aia_i is in range [1,n1][1,n-1] and there are nn values. You can extrapolate the rest.

The same could be said about the array A of size n1n-1. All of its aia_i must be in the range [1,n1][1,n-1] and no pair of indices has the same value. This means we can conclude that the array aa is actually a permutation of integers from the range [1,n1][1,n-1] (kudos to Darrin for seeing this).

This means we can quickly compute the number of beautiful arrays. Upon closer inspection, we also notice that we can quickly compute the number of ways to finish a beautiful array given the first mm values, since if we remove those mm values from the range [1,n1][1,n-1], any permutation of the rest of the n1mn-1-m values gives a beautiful array.

I heard enough solutions, how about some practice problems

These are some of the problems I have found on the topic. They are not trivial, but they are very interesting.

Problem 1: Given two multisets A and B of size n, with integers in the range [1,n][1,n]. Find one subset of A and one subset of B with equivalent sums or determine that it is impossible. [Source]

Extension: Given a list of NN integers in the range [1,K][1,K] and another list of KK integers in the range [1,N][1,N], prove that there must be one subarray from the first list and one subarray from the second list with the same sum.

Constraints:
  • 1n1061\le n\le 10^6
Problem 2: Consider a permutation of length N. We want to sort the permutation by swapping elements, but the cost of swapping any two elements in indices (i,j)(i,j) is ji|j-i|. Find out the minimum cost as well as the way to sort the permutation. [Source to be determined]

Constraints:
  • 1n1031\le n\le 10^3

Additional problems

These are some extra problems I have found. I will add more when I see them. Do suggest in the comments if you know any neat ones.

Written with StackEdit.

Comments

Popular posts from this blog

Dynamic Programming

Welcome

Graph Theory