Pigeon Hole Principle

On account that I have been seeing a lot of pigeon hole lately (not only in competitive), I’m making this to explain some techniques that you can use to deal with pigeon hole. Now pigeon hole does not come up super often, but it is a concept that anyone should be familiar with. It’s also necessary in some harder probability problems.

Basic idea

The Pigeon Hole Principle states that if n+1 or more pigeons are placed in n holes, then one hole must contain two or more pigeons.
– Art of Problem Solving

Yes, this concept seems like a no-brainer. However, its application can often times be really subtle and at times almost impossible to detect. Luckily, it’s also really difficult to invent a pigeon hole problem that stems from a completely original or gimmicky idea, seeing that it is such a well studied technique.

The extended version of this theorem is also very straightforward to state and understand.

The extended version of the Pigeonhole Principle states that if k objects are placed in n boxes then at least one box must hold at least

\lceil\frac{k}{n}\rceil

objects.
– Art of Problem Solving

With the basic concept covered, we can now move onto some warmup problems.

Warmup Problems

Let’s start with a math problem.

Problem 1: If a Martian has an infinite number of red, blue, yellow, and black socks in a drawer, how many socks must the Martian pull out of the drawer to guarantee he has a pair? [Source]

Solution: We basically have 4 holes each of a different color. A pair of socks with the same color is similar to a hole with 2 pigeons in it. By the Pigeon Hole Principle, we need 4+1=5 pigeons so that one of our 4 holes have at least 1 pigeon.

Problem 2: Given a

n \times n

square, prove that if 5 points are placed randomly inside the square, then two of them are at most

\frac{n}{\sqrt{2}}

units apart.[Source]

Solution: We divide the

n \times n

square into four

\frac{n}{2} \times \frac{n}{2}

squares (pigeonholes). Consequently, at least two points (pigeons) are inside the same

\frac{n}{2} \times \frac{n}{2}

square.

Now that was not too challenging. Let’s start on an actual competitive programming problem.

Problem 3: You are selling computational powers on your N machines. There are 2 types of purchase. A customer can either buy 1 seconds of computational power or Q seconds of computational power. After all purchases have been confirmed, you will assign these tasks to the machines such as to minimize the maximum number of seconds you assign to any computer. If customers purchase A of the first type of power and B units of the second type, how many seconds will you be able to finish all the computation? [Source]
Constraints:

$2\le Q\le10^3$
$1\le N,A,B\le10^6$

Solution: To start with, we deal with the case where we only have the first type of purchase. This means that we can use pigeon hole principle and conclude that we need at least

\lceil\frac{A}{N}\rceil

seconds to finish. Now we just need to perform something to reduce the problem to the case where there are all type-1 purchase. Let’s assign the second type of purchase first evenly. This means there are at least

\lfloor\frac{B}{N}\rfloor

of the type-2 will be assigned to each computer, leaving

B\mod{N}

left.

Now if

B\mod{N}

is 0, we successfully reduced the problem to only type-1 purchase. Otherwise, we need to fill the empty space of the

N-(B\mod{N})

columns with only

\lfloor\frac{B}{N}\rfloor

type-2. Time for me to demonstrate with a drawing:

After distributing the type-2 purchases

After you successfully fill the blue region with

(N-(B\mod{N}))\cdot Q

type-1 purchases, there are only type-1 purchase and you can solve it with the method described above. If you cannot then the answer is clearly

Q\cdot\lceil\frac{B}{N}\rceil

Enough warmups, show me something cool

Yes, I know it’s hard to control your excitement when it comes to pigeon hole problems. Here are my stash of problems that had me go “heck yeah”.

Problem 1: Given N integers and a number M. Output “Yes” if there exist a subset of the N integers such that they sum up to a multiple of M, and “No” otherwise. [Source]

Constraints:

$N \le 10^6$
$M\le10^3$

Solution: Say we have that

N &lt; M

. This means that we can do knapsack in

O(M^2)

. Now suppose that

N\ge M

. We will use pigeon hole to show how a solution always exist. Even stronger, we are going to prove that a subarray (and not only subset), that sums up to a multiple of n, exists.

We have the array A of integers. Let

a_i

be the prefix sum of all elements with index less than or equal to i in A modulo M. More formally,

a_i = \sum_{j=0}^{i}A_i \mod{M}

Observe that

a_i

must be in the range

[0,M-1]

by definition of the modulo operation. If

a_i

is 0, we are done, since we can just take the prefix of everything up to i and it will be divisible by M by definition of modulo. This means

a_i

is in the range

[1,M-1]

, which gives us M-1 holes.

Now the question is, how many pigeons do we have. It is clear that the answer is N. We know

N\ge M&gt;M-1

so there must be a pair (i,j) such that i and j are different and

a_i = a_j

. We can just take the subarray from i+1 to j and the sum will be

a_j-a_i \equiv 0 \mod{M}

. So the answer is always yes if

N \ge M

.

Let’s now go conquer the reason this blog was made in the first place:

Problem 2: Dachey (or someone) is practicing for a chess tournament and plays at least 1 game per day for 77 days. To prevent from burning out, Dachey only plays at most 132 games in total. Prove that there exist a contiguous set of days where Dachey plays exactly 21 games. [Source = Discrete Math II]

Best Solution: Let

a_i

again denote the cumulitative sum of games Dachey plays until day i. We have the following condition:

0 &lt; a_1 &lt; a_2 &lt; a_3 &lt; ... &lt; a_{77} \le 132

Now we attempt to add 21 to every number to yield:

21 &lt; a_1+21 &lt; a_2+21 &lt; a_3+21 &lt; ... &lt; a_{77}+21 \le 153

Now if the second sequence has any integer common with the first we are done. We notice that in total there are

77 \cdot 2 = 154

values here, and none of the values in the first list is equal to one another (and same with the second). Since there are 154 pigeons, at least 2 must go into one of the 153 values. Suppose we have a pair, we know that they also cannot belong to the same list. Thus, there always exist at least 1 common number in between the two lists. Let this be (i,j) where i is in the first list and j in the second. We can easily conclude that

j&lt;i

, and that

a_j-a_i=21

.

Problem 3: An array of integers of size

n-1

is called beautiful iff every element in it is in the range

[1,n-1]

and non of its subarray sum to 0 modulo

n

. Given n and k, find the

k^{th}

lexicographic beautiful array. [Source]
Constraints:

$2\le N\le1000$
$1\le K\le10^{18}$

Hint: What does the cummulative sum of any beautiful array of size

n-1

tell you about the form of the beautiful array?
Best Solution: First, we recognize that if we can quickly compute how many ways to finish the beautiful subarray given values of the first

m

positions, then finding

k^{th}

is done in a binary-search like manner.
Suppose you have the first

m

values. Attempt to place 1 in the

(m+1)^{th}

place, if permissible, and compute the number of ways to finish in a beautiful array. If this is less than or equal to

k

, we can safely put 1 in the

(m+1)^{th}

place. Otherwise, we must place something higher than 1, so we can safely subtract the number of ways to finish if you place 1 from

k

and continue.

So let’s find out ways to count number of beautiful arrays. First, start with the fact that if the array has to be of size

n

then this is an impossible task. Given an array A of size n, let denote the cumulative array of A modulo m by:

a_i = \sum_{j=0}^{i}A_i \mod{n}

Again, we can see that

a_i

must be in range

[0,n-1]

and if

a_i

is 0, we are done. So

a_i

is in range

[1,n-1]

and there are

n

values. You can extrapolate the rest.

The same could be said about the array A of size

n-1

. All of its

a_i

must be in the range

[1,n-1]

and no pair of indices has the same value. This means we can conclude that the array

a

is actually a permutation of integers from the range

[1,n-1]

(kudos to Darrin for seeing this).

This means we can quickly compute the number of beautiful arrays. Upon closer inspection, we also notice that we can quickly compute the number of ways to finish a beautiful array given the first

m

values, since if we remove those

m

values from the range

[1,n-1]

, any permutation of the rest of the

n-1-m

values gives a beautiful array.

I heard enough solutions, how about some practice problems

These are some of the problems I have found on the topic. They are not trivial, but they are very interesting.

Problem 1: Given two multisets A and B of size n, with integers in the range

[1,n]

. Find one subset of A and one subset of B with equivalent sums or determine that it is impossible. [Source]

Extension: Given a list of

N

integers in the range

[1,K]

and another list of

K

integers in the range

[1,N]

, prove that there must be one subarray from the first list and one subarray from the second list with the same sum.

Constraints:

$1\le n\le 10^6$

Problem 2: Consider a permutation of length N. We want to sort the permutation by swapping elements, but the cost of swapping any two elements in indices

(i,j)

|j-i|

. Find out the minimum cost as well as the way to sort the permutation. [Source to be determined]

Constraints:

$1\le n\le 10^3$

Additional problems

These are some extra problems I have found. I will add more when I see them. Do suggest in the comments if you know any neat ones.

Written with StackEdit.

Search This Blog

Powered by MathJax

Codeburners