Best Answer plasmid, 11 May 2014 - 06:02 PM

_{1}, B

_{2}, B

_{3}… B

_{T}) where the value of each B

_{X}term is the number corresponding to the color pulled on draw number X. So if there were four colored balls {1=red, 2=blue, 3=white, 4=purple} and you had three draws and got (red, white, red), then you would have (B

_{1}=red=1, B

_{2}=white=3, B

_{3}=red=1).

If there are N different colored balls and you draw T balls with replacement, then there are N

^{T}different possible outcomes for the ordered list (B

_{1}, B

_{2}, B

_{3}... B

_{T}). The probability that you will draw M different colors is [the number of different ordered lists that include M different colors] divided by [the total number of different possible ordered lists, which is N

^{T}].

The enumerate the different possible (B

_{1}, B

_{2}, B

_{3}... B

_{T}) lists with M different colors, I will temporarily lose generality:

*WITH*loss of generality, number the colors that are drawn from color #1 to color #M so the terms in the list (B

_{1}, B

_{2}, B

_{3}... B

_{T}) each must take on a value from 1 to M, and each number from 1 to M must appear in the list at least once. To convince you that this does lose generality, consider the case in the first paragraph with N=4 and T=3 – if you number the colors from 1 to M and solve for M=1, then only one color (color #1) can appear in the list and the only solution is (1, 1, 1), but in actuality there are four solutions: (red, red, red,), (blue, blue, blue), (white, white, white), and (purple, purple, purple). In general, if you consider the solution found by renumbering the colors from 1 to M, then to get the real solution you must then multiply by (N choose M) to account for each of the different color combinations that could be used in the set from 1 to M. So the answer will be along the lines of (N choose M) * (ways of arranging M different colors in a list that's T elements long).

The number of ways of making a list that's T elements long using only M=1 colors is easy: that's just 1. In fact, that's too easy... to make things more complicated I'll rewrite that as 1

^{T}.

The number of ways of making a list that's T elements long using M=2 colors can be calculated as (total number of ways of making a list with either M=1 or M=2 colors) minus (total number of ways of making a list with M=1 color for each of those two colors). The first of those two terms is just 2

^{T}, and in general would be M

^{T}. The second term for the case of M=2 would be (number of solutions if you have 1 color) times (number of ways of picking 1 color out of the M colors) = 1

^{T}times (2 choose 1). So the final solution for M=2 is

2^{T} – (2 choose 1)*1^{T}

(Remember that I'm writing 1 as 1^{T} to make things complicated.)

The number of ways of making a list that's T elements long using M=3 can be similarly calculated as (number of ways of making a list with M=1 or M=2 or M=3 colors) minus (number of ways of making a list with M=2 colors) minus (number of ways of making a list with M=1 colors). That gives

3^{T} – (3 choose 2)*(2^{T} – (2 choose 1)*1^{T}) – (3 choose 1)*1^{T}

These terms are starting to get long, so for simplicity I'll write S_{X} as the solution for M=X (remembering that we're losing generality by renumbering the colors from 1 to M as I pointed out earlier).

S_{1} = 1^{T}

S_{2} = 2^{T} – (2 choose 1)*S_{1}

S_{3} = 3^{T} – (3 choose 2)*S_{2} – (3 choose 1)*S_{1}

S_{M} = M^{T} – (M choose M-1)*S_{M-1} – (M choose M-2)*S_{M-2} – … – (M choose 1)*S_{1}

To account for all different combinations of colors that could be used in the solution set (reversing the loss of generality imposed by renumbering from 1 to M), the final answer would be

(N choose M)*S_{M}

which is converted to a probability by dividing by N^{T}.

I cannot come up with a good way of simplifying that to keep the calculation from becoming unwieldy with a bunch of recursion.

#!/usr/bin/perl use strict; use warnings; use List::Util qw(min sum); use Math::Counting ':student'; # If you don't already have List::Util and Math::Counting installed, # then type "cpan List::Util" and "cpan Math::Counting" from the # command prompt my $iterations=10000; for(my $balls=1; $balls<21; $balls++) { for(my $draws=1; $draws<21; $draws++) { # First, run the simulation # @colors is an array where $colors[X] is the number of trials # where X different color balls were drawn # @drawn is an array where $drawn[X] is zero if the color X # has not been drawn and is one if color X has been drawn my @colors = (0) x ($balls+1); for(my $iteration=0; $iteration<$iterations; $iteration++) { my @drawn = (0) x ($balls+1); for(my $draw=0; $draw<$draws; $draw++) { $drawn[int(rand($balls))] = 1; } $colors[sum(@drawn)]++; } # Next, find out what the calculated answer would be # S(X) and M are as they are explained in the answer on brainden my @s = (0); for(my $m=1; $m<$balls+1; $m++) { $s[$m] = $m**$draws; for(my $iteration=1; $iteration<$m; $iteration++) { $s[$m] -= combination($m, $iteration)*$s[$iteration]; } } # Last, print the results print "Balls (N): $balls, Draws (T): $draws\n"; for(my $color=1; $color < min($balls, $draws)+1; $color++) { print "Simulated probability of drawing $color colors (M): " . ($colors[$color] / $iterations) . "\n"; print "Calcualted probability of drawing $color colors (M): " . (combination($balls, $color)*$s[$color]/($balls**$draws)) . "\n"; } print "\n"; } }