BrainDen.com - Brain Teasers
• 0

# Colored balls in a bag

## Question

There are N balls in the bag. Each one has a distinct color. You draw one ball randomly from the bag, record its color and put it back. Calculate the probability that after T draws you've seen M distinct colors.
(T > 0, M > 0, M <= min(T, N) )

## Recommended Posts

• 0

If you number the colors of the balls {1, 2, 3 ... N} then you can consider the T balls that you draw as an ordered list (B

1, B2, B3 … BT) where the value of each BX term is the number corresponding to the color pulled on draw number X. So if there were four colored balls {1=red, 2=blue, 3=white, 4=purple} and you had three draws and got (red, white, red), then you would have (B1=red=1, B2=white=3, B3=red=1).

If there are N different colored balls and you draw T balls with replacement, then there are NT different possible outcomes for the ordered list (B1, B2, B3 ... BT). The probability that you will draw M different colors is [the number of different ordered lists that include M different colors] divided by [the total number of different possible ordered lists, which is NT].

The enumerate the different possible (B1, B2, B3 ... BT) lists with M different colors, I will temporarily lose generality: WITH loss of generality, number the colors that are drawn from color #1 to color #M so the terms in the list (B1, B2, B3 ... BT) each must take on a value from 1 to M, and each number from 1 to M must appear in the list at least once. To convince you that this does lose generality, consider the case in the first paragraph with N=4 and T=3 – if you number the colors from 1 to M and solve for M=1, then only one color (color #1) can appear in the list and the only solution is (1, 1, 1), but in actuality there are four solutions: (red, red, red,), (blue, blue, blue), (white, white, white), and (purple, purple, purple). In general, if you consider the solution found by renumbering the colors from 1 to M, then to get the real solution you must then multiply by (N choose M) to account for each of the different color combinations that could be used in the set from 1 to M. So the answer will be along the lines of (N choose M) * (ways of arranging M different colors in a list that's T elements long).

The number of ways of making a list that's T elements long using only M=1 colors is easy: that's just 1. In fact, that's too easy... to make things more complicated I'll rewrite that as 1T.

The number of ways of making a list that's T elements long using M=2 colors can be calculated as (total number of ways of making a list with either M=1 or M=2 colors) minus (total number of ways of making a list with M=1 color for each of those two colors). The first of those two terms is just 2T, and in general would be MT. The second term for the case of M=2 would be (number of solutions if you have 1 color) times (number of ways of picking 1 color out of the M colors) = 1T times (2 choose 1). So the final solution for M=2 is

2T – (2 choose 1)*1T

(Remember that I'm writing 1 as 1T to make things complicated.)

The number of ways of making a list that's T elements long using M=3 can be similarly calculated as (number of ways of making a list with M=1 or M=2 or M=3 colors) minus (number of ways of making a list with M=2 colors) minus (number of ways of making a list with M=1 colors). That gives
3T – (3 choose 2)*(2T – (2 choose 1)*1T) – (3 choose 1)*1T

These terms are starting to get long, so for simplicity I'll write SX as the solution for M=X (remembering that we're losing generality by renumbering the colors from 1 to M as I pointed out earlier).
S1 = 1T
S2 = 2T – (2 choose 1)*S1
S3 = 3T – (3 choose 2)*S2 – (3 choose 1)*S1
SM = MT – (M choose M-1)*SM-1 – (M choose M-2)*SM-2 – … – (M choose 1)*S1

To account for all different combinations of colors that could be used in the solution set (reversing the loss of generality imposed by renumbering from 1 to M), the final answer would be
(N choose M)*SM
which is converted to a probability by dividing by NT.

I cannot come up with a good way of simplifying that to keep the calculation from becoming unwieldy with a bunch of recursion.

#!/usr/bin/perl
use strict;
use warnings;

use List::Util qw(min sum);
use Math::Counting ':student';
# If you don't already have List::Util and Math::Counting installed,
#   then type "cpan List::Util" and "cpan Math::Counting" from the
#   command prompt

my \$iterations=10000;

for(my \$balls=1; \$balls<21; \$balls++) {
for(my \$draws=1; \$draws<21; \$draws++) {

# First, run the simulation
# @colors is an array where \$colors[X] is the number of trials
#   where X different color balls were drawn
# @drawn is an array where \$drawn[X] is zero if the color X
#   has not been drawn and is one if color X has been drawn
my @colors = (0) x (\$balls+1);
for(my \$iteration=0; \$iteration<\$iterations; \$iteration++) {
my @drawn = (0) x (\$balls+1);
for(my \$draw=0; \$draw<\$draws; \$draw++) {
\$drawn[int(rand(\$balls))] = 1;
}
\$colors[sum(@drawn)]++;
}

# Next, find out what the calculated answer would be
# S(X) and M are as they are explained in the answer on brainden
my @s = (0);
for(my \$m=1; \$m<\$balls+1; \$m++) {
\$s[\$m] = \$m**\$draws;
for(my \$iteration=1; \$iteration<\$m; \$iteration++) {
\$s[\$m] -= combination(\$m, \$iteration)*\$s[\$iteration];
}
}

# Last, print the results
print "Balls (N): \$balls, Draws (T): \$draws\n";
for(my \$color=1; \$color < min(\$balls, \$draws)+1; \$color++) {
print "Simulated  probability of drawing \$color colors (M): " . (\$colors[\$color] / \$iterations) . "\n";
print "Calcualted probability of drawing \$color colors (M): " . (combination(\$balls, \$color)*\$s[\$color]/(\$balls**\$draws)) . "\n";
}
print "\n";
}
}

##### Share on other sites

• 0

Required probability = selecting T balls from an infinite choice of m colors where atleast 1 ball from each of m colors is present/ selecting T balls from an infinite choice of n colors (not all n colors need be present in the selection)

Calculating Denominator = Selecting T balls from an infinite selection of n color balls:

Assume a string of n-1 0s. These 0s divide the string into n sections. We can fill T 1s in these T sections in n+T-1CT ways. Now, assume each section represents one of the n distinct colors. The number of 1s in each section represents the number of balls of that color. So, this representation gives us the number of ways selecting T balls from a set of n color balls where T<=n

Hence, Denominator = n+T-1CT

Now, Numerator can be arrived at as

first selecting m colors = nCm ways

then selecting 1 ball from each of these m colors = 1 way

then selecting the remaining T-m balls from an infinite selection of these m color (not all m need be present in the selection) = m+(T-m)-1CT-m (refer denominator) = T-1CT-m

Hence required probability = nCm*T-1CT-m/n+T-1CT

##### Share on other sites

• 0

There will be M draws of a different color, each with pi = (N+1-i)/N for i=1 to M.
Then there will be (T-M) draws of any of the M specified colors, each with p = M/N

Sequence does not matter, so take them in the above order.

First M draws: p1 = N! / NM (N-M)!

Last T-M draws: p2 = (M/N)T-M

Answer: p = p1p2 = N! MT-M / NT(N-M)!

##### Share on other sites

• 0

M different colors means: exactly M different colors, not: at least M colors, right?

##### Share on other sites

• 0

Bonanova: using that formula with 3 balls being drawn 3 times...

probability of drawing 1 color

3! 13-1 / 33(3-1)! = 6*1 / 27*2 = 6/54

probability of drawing 2 colors

3! 23-2 / 33(3-2)! = 6*2 / 27*1 = 12/27

probability of drawing 3 colors

3! 33-3 / 33(3-3)! = 6*1 / 27*1 = 6/27

Sum of all three of those is less than 1.

I'm having trouble following m00li's logic. I got lost as soon as he started talking about selecting balls from an infinite choice of colors (which was pretty soon).

##### Share on other sites

• 0

Bonanova: using that formula with 3 balls being drawn 3 times...

probability of drawing 1 color

3! 13-1 / 33(3-1)! = 6*1 / 27*2 = 6/54

probability of drawing 2 colors

3! 23-2 / 33(3-2)! = 6*2 / 27*1 = 12/27

probability of drawing 3 colors

3! 33-3 / 33(3-3)! = 6*1 / 27*1 = 6/27

Sum of all three of those is less than 1.

I'm having trouble following m00li's logic. I got lost as soon as he started talking about selecting balls from an infinite choice of colors (which was pretty soon).

You are not alone Plasmid. I am having trouble too I must have been high on something that day

My solution is for cases where a 'selection' is being made out of n different colors. I should have calculated permutations and not selections.

##### Share on other sites

• 0

My formula made such sense to me (this is often the case until someone explains the logic that I overlook) that I posted it. Then, embarrassingly, it did not agree with simulations.

The second part of the probability (after M colors appeared in M draws)
appeared to be incorrect by a factor of 2. I've been chasing that factor of 2
for several days, but only part time owing to preparations for a choral
performance this weekend. More, perhaps, to follow this week.

plasmid, your post #7 may have found it, since you have agreement with simulation.
I'll read it after I've given my approach another shot.

FWIW here is my simulation of 10000 cases for N = 10.
The 10 columns are for M = 1 ... 10.
The 20 rows are for T = 1 ... 20
The table entries are percent, rounded to nearest integer.

My equations agree for the major diagonal.

But they underestimate the probabilities elsewhere.

10 MNTHIST 10000
100 0 0 0 0 0 0 0 0 0
10 90 0 0 0 0 0 0 0 0
1 27 72 0 0 0 0 0 0 0
0 6 43 50 0 0 0 0 0 0
0 1 18 51 30 0 0 0 0 0
0 0 6 34 45 15 0 0 0 0
0 0 2 18 43 31 6 0 0 0
0 0 1 8 33 40 17 2 0 0
0 0 0 4 21 41 28 6 0 0
0 0 0 2 12 36 35 14 1 0
0 0 0 1 7 28 38 22 4 0
0 0 0 0 4 20 38 29 8 1
0 0 0 0 2 15 34 35 13 1
0 0 0 0 1 10 30 38 19 3
0 0 0 0 1 6 24 39 25 5
0 0 0 0 0 4 20 39 30 7
0 0 0 0 0 3 16 37 35 10
0 0 0 0 0 2 12 34 39 13
0 0 0 0 0 1 9 32 42 17
0 0 0 0 0 1 6 28 44 22

##### Share on other sites

• 0

More recursions:

Let f(t,m) denote the number of ways of arranging m colored balls in t slots, where t >= m and supply of each m coloured ball is infinite. Then final answer is nCmf(t,m) / nt

Here are the different recursions I have come up with for f(t,m) (alas, no direct formula in m,n)

1) mt - mC1f(t,1) - mC2f(t,2) - mC3f(t,3) .... - mCm-1f(t,m-1)

2) tC1f(t-1,m-1) + tC2f(t-2,m-1) + tC3f(t-3,m-1) + ... + tCt-m+1f(m-1,m-1)

3) mf(t-1,m-1) + m2f(t-2,m-1) + m3f(t-3,m-1) + ... + mt-m+1f(m-1,m-1)

4) m( f(t-1,m) + f(t-1,m-1) )

where f(t,1) = 1, f(t,2) = 2t-2, and f(t,t) = t!

##### Share on other sites

• 0

So, I wonder. Is there no closed form solution?

##### Share on other sites

• 0

More recursions:

Let f(t,m) denote the number of ways of arranging m colored balls in t slots, where t >= m and supply of each m coloured ball is infinite. Then final answer is nCmf(t,m) / nt

Here are the different recursions I have come up with for f(t,m) (alas, no direct formula in m,n)

1) mt - mC1f(t,1) - mC2f(t,2) - mC3f(t,3) .... - mCm-1f(t,m-1)

2) tC1f(t-1,m-1) + tC2f(t-2,m-1) + tC3f(t-3,m-1) + ... + tCt-m+1f(m-1,m-1)

3) mf(t-1,m-1) + m2f(t-2,m-1) + m3f(t-3,m-1) + ... + mt-m+1f(m-1,m-1)

4) m( f(t-1,m) + f(t-1,m-1) )

where f(t,1) = 1, f(t,2) = 2t-2, and f(t,t) = t!

And finally one kind of non recursive form:

5) f(t,m) = mt - mC1(m-1)t + mC2(m-2)t - mC3(m-3)t + ....(-1)m-1(mCm-1) (This one gives the AHA insight into the solution)

Edited by m00li

## Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

×   Pasted as rich text.   Paste as plain text instead

Only 75 emoji are allowed.

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.