BMAD Posted May 6, 2014 Report Share Posted May 6, 2014 There are N balls in the bag. Each one has a distinct color. You draw one ball randomly from the bag, record its color and put it back. Calculate the probability that after T draws you've seen M distinct colors. (T > 0, M > 0, M <= min(T, N) ) Quote Link to comment Share on other sites More sharing options...

0 plasmid Posted May 11, 2014 Report Share Posted May 11, 2014 If you number the colors of the balls {1, 2, 3 ... N} then you can consider the T balls that you draw as an ordered list (B_{1}, B_{2}, B_{3} … B_{T}) where the value of each B_{X} term is the number corresponding to the color pulled on draw number X. So if there were four colored balls {1=red, 2=blue, 3=white, 4=purple} and you had three draws and got (red, white, red), then you would have (B_{1}=red=1, B_{2}=white=3, B_{3}=red=1). If there are N different colored balls and you draw T balls with replacement, then there are N^{T} different possible outcomes for the ordered list (B_{1}, B_{2}, B_{3} ... B_{T}). The probability that you will draw M different colors is [the number of different ordered lists that include M different colors] divided by [the total number of different possible ordered lists, which is N^{T}]. The enumerate the different possible (B_{1}, B_{2}, B_{3} ... B_{T}) lists with M different colors, I will temporarily lose generality: WITH loss of generality, number the colors that are drawn from color #1 to color #M so the terms in the list (B_{1}, B_{2}, B_{3} ... B_{T}) each must take on a value from 1 to M, and each number from 1 to M must appear in the list at least once. To convince you that this does lose generality, consider the case in the first paragraph with N=4 and T=3 – if you number the colors from 1 to M and solve for M=1, then only one color (color #1) can appear in the list and the only solution is (1, 1, 1), but in actuality there are four solutions: (red, red, red,), (blue, blue, blue), (white, white, white), and (purple, purple, purple). In general, if you consider the solution found by renumbering the colors from 1 to M, then to get the real solution you must then multiply by (N choose M) to account for each of the different color combinations that could be used in the set from 1 to M. So the answer will be along the lines of (N choose M) * (ways of arranging M different colors in a list that's T elements long). The number of ways of making a list that's T elements long using only M=1 colors is easy: that's just 1. In fact, that's too easy... to make things more complicated I'll rewrite that as 1^{T}. The number of ways of making a list that's T elements long using M=2 colors can be calculated as (total number of ways of making a list with either M=1 or M=2 colors) minus (total number of ways of making a list with M=1 color for each of those two colors). The first of those two terms is just 2^{T}, and in general would be M^{T}. The second term for the case of M=2 would be (number of solutions if you have 1 color) times (number of ways of picking 1 color out of the M colors) = 1^{T} times (2 choose 1). So the final solution for M=2 is 2^{T} – (2 choose 1)*1^{T} (Remember that I'm writing 1 as 1^{T} to make things complicated.) The number of ways of making a list that's T elements long using M=3 can be similarly calculated as (number of ways of making a list with M=1 or M=2 or M=3 colors) minus (number of ways of making a list with M=2 colors) minus (number of ways of making a list with M=1 colors). That gives 3^{T} – (3 choose 2)*(2^{T} – (2 choose 1)*1^{T}) – (3 choose 1)*1^{T} These terms are starting to get long, so for simplicity I'll write S_{X} as the solution for M=X (remembering that we're losing generality by renumbering the colors from 1 to M as I pointed out earlier). S_{1} = 1^{T} S_{2} = 2^{T} – (2 choose 1)*S_{1} S_{3} = 3^{T} – (3 choose 2)*S_{2} – (3 choose 1)*S_{1} S_{M} = M^{T} – (M choose M-1)*S_{M-1} – (M choose M-2)*S_{M-2} – … – (M choose 1)*S_{1} To account for all different combinations of colors that could be used in the solution set (reversing the loss of generality imposed by renumbering from 1 to M), the final answer would be (N choose M)*S_{M} which is converted to a probability by dividing by N^{T}. I cannot come up with a good way of simplifying that to keep the calculation from becoming unwieldy with a bunch of recursion. #!/usr/bin/perl use strict; use warnings; use List::Util qw(min sum); use Math::Counting ':student'; # If you don't already have List::Util and Math::Counting installed, # then type "cpan List::Util" and "cpan Math::Counting" from the # command prompt my $iterations=10000; for(my $balls=1; $balls<21; $balls++) { for(my $draws=1; $draws<21; $draws++) { # First, run the simulation # @colors is an array where $colors[X] is the number of trials # where X different color balls were drawn # @drawn is an array where $drawn[X] is zero if the color X # has not been drawn and is one if color X has been drawn my @colors = (0) x ($balls+1); for(my $iteration=0; $iteration<$iterations; $iteration++) { my @drawn = (0) x ($balls+1); for(my $draw=0; $draw<$draws; $draw++) { $drawn[int(rand($balls))] = 1; } $colors[sum(@drawn)]++; } # Next, find out what the calculated answer would be # S(X) and M are as they are explained in the answer on brainden my @s = (0); for(my $m=1; $m<$balls+1; $m++) { $s[$m] = $m**$draws; for(my $iteration=1; $iteration<$m; $iteration++) { $s[$m] -= combination($m, $iteration)*$s[$iteration]; } } # Last, print the results print "Balls (N): $balls, Draws (T): $draws\n"; for(my $color=1; $color < min($balls, $draws)+1; $color++) { print "Simulated probability of drawing $color colors (M): " . ($colors[$color] / $iterations) . "\n"; print "Calcualted probability of drawing $color colors (M): " . (combination($balls, $color)*$s[$color]/($balls**$draws)) . "\n"; } print "\n"; } } Quote Link to comment Share on other sites More sharing options...

0 m00li Posted May 6, 2014 Report Share Posted May 6, 2014 Required probability = selecting T balls from an infinite choice of m colors where atleast 1 ball from each of m colors is present/ selecting T balls from an infinite choice of n colors (not all n colors need be present in the selection) Calculating Denominator = Selecting T balls from an infinite selection of n color balls: Assume a string of n-1 0s. These 0s divide the string into n sections. We can fill T 1s in these T sections in ^{n+T-1}C_{T} ways. Now, assume each section represents one of the n distinct colors. The number of 1s in each section represents the number of balls of that color. So, this representation gives us the number of ways selecting T balls from a set of n color balls where T<=n Hence, Denominator = ^{n+T-1}C_{T} Now, Numerator can be arrived at as first selecting m colors = ^{n}C_{m} ways then selecting 1 ball from each of these m colors = 1 way then selecting the remaining T-m balls from an infinite selection of these m color (not all m need be present in the selection) = ^{m+(T-m)-1}C_{T-m} (refer denominator) = ^{T-1}C_{T-m} Hence required probability = ^{n}C_{m*}^{T-1}C_{T-m}/^{n+T-1}C_{T} Quote Link to comment Share on other sites More sharing options...

0 bonanova Posted May 9, 2014 Report Share Posted May 9, 2014 There will be M draws of a different color, each with p_{i} = (N+1-i)/N for i=1 to M. Then there will be (T-M) draws of any of the M specified colors, each with p = M/N Sequence does not matter, so take them in the above order. First M draws: p_{1} = N! / N^{M} (N-M)! Last T-M draws: p_{2} = (M/N)^{T-M} Answer: p = p_{1}p_{2} = N! M^{T-M} / N^{T}(N-M)! Quote Link to comment Share on other sites More sharing options...

0 bonanova Posted May 10, 2014 Report Share Posted May 10, 2014 M different colors means: exactly M different colors, not: at least M colors, right? Quote Link to comment Share on other sites More sharing options...

0 plasmid Posted May 10, 2014 Report Share Posted May 10, 2014 Bonanova: using that formula with 3 balls being drawn 3 times...probability of drawing 1 color3! 1^{3-1} / 3^{3}(3-1)! = 6*1 / 27*2 = 6/54probability of drawing 2 colors3! 2^{3-2} / 3^{3}(3-2)! = 6*2 / 27*1 = 12/27probability of drawing 3 colors3! 3^{3-3} / 3^{3}(3-3)! = 6*1 / 27*1 = 6/27Sum of all three of those is less than 1.I'm having trouble following m00li's logic. I got lost as soon as he started talking about selecting balls from an infinite choice of colors (which was pretty soon). Quote Link to comment Share on other sites More sharing options...

0 m00li Posted May 11, 2014 Report Share Posted May 11, 2014 Bonanova: using that formula with 3 balls being drawn 3 times... probability of drawing 1 color 3! 1^{3-1} / 3^{3}(3-1)! = 6*1 / 27*2 = 6/54 probability of drawing 2 colors 3! 2^{3-2} / 3^{3}(3-2)! = 6*2 / 27*1 = 12/27 probability of drawing 3 colors 3! 3^{3-3} / 3^{3}(3-3)! = 6*1 / 27*1 = 6/27 Sum of all three of those is less than 1. I'm having trouble following m00li's logic. I got lost as soon as he started talking about selecting balls from an infinite choice of colors (which was pretty soon). You are not alone Plasmid. I am having trouble too I must have been high on something that day My solution is for cases where a 'selection' is being made out of n different colors. I should have calculated permutations and not selections. Quote Link to comment Share on other sites More sharing options...

0 bonanova Posted May 12, 2014 Report Share Posted May 12, 2014 My formula made such sense to me (this is often the case until someone explains the logic that I overlook) that I posted it. Then, embarrassingly, it did not agree with simulations. The second part of the probability (after M colors appeared in M draws) appeared to be incorrect by a factor of 2. I've been chasing that factor of 2 for several days, but only part time owing to preparations for a choral performance this weekend. More, perhaps, to follow this week.plasmid, your post #7 may have found it, since you have agreement with simulation.I'll read it after I've given my approach another shot.FWIW here is my simulation of 10000 cases for N = 10.The 10 columns are for M = 1 ... 10.The 20 rows are for T = 1 ... 20The table entries are percent, rounded to nearest integer. My equations agree for the major diagonal. But they underestimate the probabilities elsewhere. 10 MNTHIST 10000100 0 0 0 0 0 0 0 0 0 10 90 0 0 0 0 0 0 0 0 1 27 72 0 0 0 0 0 0 0 0 6 43 50 0 0 0 0 0 0 0 1 18 51 30 0 0 0 0 0 0 0 6 34 45 15 0 0 0 0 0 0 2 18 43 31 6 0 0 0 0 0 1 8 33 40 17 2 0 0 0 0 0 4 21 41 28 6 0 0 0 0 0 2 12 36 35 14 1 0 0 0 0 1 7 28 38 22 4 0 0 0 0 0 4 20 38 29 8 1 0 0 0 0 2 15 34 35 13 1 0 0 0 0 1 10 30 38 19 3 0 0 0 0 1 6 24 39 25 5 0 0 0 0 0 4 20 39 30 7 0 0 0 0 0 3 16 37 35 10 0 0 0 0 0 2 12 34 39 13 0 0 0 0 0 1 9 32 42 17 0 0 0 0 0 1 6 28 44 22 Quote Link to comment Share on other sites More sharing options...

0 m00li Posted May 25, 2014 Report Share Posted May 25, 2014 More recursions: Let f(t,m) denote the number of ways of arranging m colored balls in t slots, where t >= m and supply of each m coloured ball is infinite. Then final answer is ^{n}C_{m}f(t,m) / n^{t} Here are the different recursions I have come up with for f(t,m) (alas, no direct formula in m,n) 1) m^{t} - ^{m}C_{1}f(t,1) - ^{m}C_{2}f(t,2) - ^{m}C_{3}f(t,3) .... - ^{m}C_{m-1}f(t,m-1) 2) ^{t}C_{1}f(t-1,m-1) + ^{t}C_{2}f(t-2,m-1) + ^{t}C_{3}f(t-3,m-1) + ... + ^{t}C_{t-m+1}f(m-1,m-1) 3) mf(t-1,m-1) + m^{2}f(t-2,m-1) + m^{3}f(t-3,m-1) + ... + m^{t-m+1}f(m-1,m-1) 4) m( f(t-1,m) + f(t-1,m-1) ) where f(t,1) = 1, f(t,2) = 2^{t}-2, and f(t,t) = t! Quote Link to comment Share on other sites More sharing options...

0 bonanova Posted May 25, 2014 Report Share Posted May 25, 2014 So, I wonder. Is there no closed form solution? Quote Link to comment Share on other sites More sharing options...

0 m00li Posted May 25, 2014 Report Share Posted May 25, 2014 (edited) More recursions: Let f(t,m) denote the number of ways of arranging m colored balls in t slots, where t >= m and supply of each m coloured ball is infinite. Then final answer is ^{n}C_{m}f(t,m) / n^{t} Here are the different recursions I have come up with for f(t,m) (alas, no direct formula in m,n) 1) m^{t} - ^{m}C_{1}f(t,1) - ^{m}C_{2}f(t,2) - ^{m}C_{3}f(t,3) .... - ^{m}C_{m-1}f(t,m-1) 2) ^{t}C_{1}f(t-1,m-1) + ^{t}C_{2}f(t-2,m-1) + ^{t}C_{3}f(t-3,m-1) + ... + ^{t}C_{t-m+1}f(m-1,m-1) 3) mf(t-1,m-1) + m^{2}f(t-2,m-1) + m^{3}f(t-3,m-1) + ... + m^{t-m+1}f(m-1,m-1) 4) m( f(t-1,m) + f(t-1,m-1) ) where f(t,1) = 1, f(t,2) = 2^{t}-2, and f(t,t) = t! And finally one kind of non recursive form: 5) f(t,m) = m^{t} - ^{m}C_{1}(m-1)^{t} + ^{m}C_{2}(m-2)^{t} - ^{m}C_{3}(m-3)^{t} + ....(-1)^{m-1}(^{m}C_{m-1}) (This one gives the AHA insight into the solution) Edited May 25, 2014 by m00li 1 Quote Link to comment Share on other sites More sharing options...

## Question

## BMAD

## Link to comment

## Share on other sites

## 10 answers to this question

## Recommended Posts

## Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.