Thanks everyone for all the great feedback! I guess I feel a little better that it's not just me getting tripped up by this.
Regarding Michael Morton's suggestion to "calculate the probability using the two binomial probabilities and summing all possibilities", that is what I originally thought of too, and I believe that is correct. However, that also requires manual calculations for each combination of k/n1/n2, and I was trying to figure out if there was a generalizable solution. And yes, there is independence of x1 and x2, and p1 and p2 are both known.
Regarding Jeffrey Finman's question about the original conditions, he brings up a good point. Please let me clarify what I meant in my original e-mail. When I wrote that "k>=min(n1,n2)", it was more an observation about the particulars of my dataset and the desired values of k. Strictly speaking, given the data I have, k could be (and sometimes is) any number between 0 and n1+n2. However, for the purposes of my particular analysis, the desired values of k for which I need to do the calculation always happen to meet the condition "k>=min(n1,n2)". But yes, in general it is possible for k to equal X1 (if X2 is zero) or vice versa.
Regarding Margot's original suggestion to take the weighted sum of the two individual distributions, with weights n1/(n1+n2) and n2/(n1+n2), when I tried that I found something interesting. Compared to the manual calculations I did for a few cases, when p1=p2 her solution matched the manual summation exactly. However, when p1-p2=0.05, the weighting solution started to deviate from the manual summation by somewhere in the neighborhood of 0.0005 to 0.001, for the cases I tested of n's in the range of 2 to 4.
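A minimal sketch of the comparison described above. Margot's message is not quoted in this thread, so the reading of her suggestion used here is an assumption: a single Binomial(n1+n2, p_bar) with p_bar = (n1*p1 + n2*p2)/(n1+n2). That reading is at least consistent with the exact match reported when p1=p2. All function names are hypothetical.

```python
from math import comb


def binom_pmf(x, n, p):
    """Pr(X = x) for X ~ Binomial(n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def exact_sum_pmf(k, n1, p1, n2, p2):
    """Pr(X1 + X2 = k) by direct summation over the split of k."""
    return sum(binom_pmf(i, n1, p1) * binom_pmf(k - i, n2, p2)
               for i in range(0, k + 1))


def pooled_pmf(k, n1, p1, n2, p2):
    """ASSUMED reading of the weighting idea: one binomial with a
    success probability weighted by n1/(n1+n2) and n2/(n1+n2)."""
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    return binom_pmf(k, n1 + n2, p_bar)


def max_abs_gap(n1, p1, n2, p2):
    """Largest absolute difference between the two pmfs over all k."""
    return max(abs(exact_sum_pmf(k, n1, p1, n2, p2)
                   - pooled_pmf(k, n1, p1, n2, p2))
               for k in range(n1 + n2 + 1))
```

Under this reading, `max_abs_gap(3, 0.3, 2, 0.3)` should be zero up to floating-point error, while `max_abs_gap(2, 0.5, 2, 0.45)` should be small but nonzero, since the pooled binomial has a slightly different variance from the true sum when p1 and p2 differ.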
Also, regarding the suggestions to sum over all possibilities, that was my first thought as well, but I was hoping there was some distribution or mathematical identity or something else similar that would allow for generalizing the solution without having to specify the lower and upper bounds of summation for each case. If not, then I guess I'll have to approach the problem differently.
Thanks again,
Gabe
-------------------------------------------
Gabriel Farkas
-------------------------------------------
Original Message:
Sent: 02-06-2013 11:34
From: Jeffrey Finman
Subject: probability calculation question
Are the original conditions as stated correct?
The text seems to imply it is possible for k to equal X1 (if X2 is zero) or vice versa (which presumably would be the default understanding), whereas the condition is stated as k>=min(n1, n2), which is an entirely different matter.
-------------------------------------------
Jeffrey Finman Ph.D.
Jupiter Point Pharma Consulting, LLC
-------------------------------------------
Original Message:
Sent: 02-06-2013 11:24
From: James Baldwin
Subject: probability calculation question
Scott has it right. An explicit enumeration is what you'll need to do. (The "outer" function in R would be handy for this.)
If I understand the text description, what you have is the following:
x1 ~ Binomial(n1,p1)
x2 ~ Binomial(n2,p2)
k = x1 + x2
Need conditional distribution k | k > min(n1,n2).
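The setup above can be sketched in Python rather than R, as an explicit enumeration in the spirit of an outer product over all (x1, x2) pairs. This is only an illustration: the function names are hypothetical, and the cutoff is passed as a `threshold` parameter because the thread states the condition both as k > min(n1,n2) and as k >= min(n1,n2).

```python
from math import comb


def binom_pmf_vector(n, p):
    """List of Pr(X = x) for x = 0..n, X ~ Binomial(n, p)."""
    return [comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(n + 1)]


def sum_pmf_vector(n1, p1, n2, p2):
    """List of Pr(x1 + x2 = k) for k = 0..n1+n2, assuming independence."""
    a = binom_pmf_vector(n1, p1)
    b = binom_pmf_vector(n2, p2)
    out = [0.0] * (n1 + n2 + 1)
    # Enumerate every (x1, x2) pair and add its joint probability
    # to the cell for k = x1 + x2.
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def conditional_pmf(n1, p1, n2, p2, threshold):
    """Dict k -> Pr(K = k | K > threshold) for K = x1 + x2."""
    pmf = sum_pmf_vector(n1, p1, n2, p2)
    tail = sum(pmf[threshold + 1:])
    if tail == 0.0:
        raise ValueError("conditioning event has probability zero")
    return {k: pmf[k] / tail for k in range(threshold + 1, len(pmf))}
```

For the strict condition, call `conditional_pmf(n1, p1, n2, p2, min(n1, n2))`; for the non-strict version k >= min(n1,n2), pass `min(n1, n2) - 1` as the threshold.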
Jim
-------------------------------------------
James Baldwin
Station Statistician
US Forest Service
-------------------------------------------
Original Message:
Sent: 02-06-2013 11:11
From: Scott Berry
Subject: probability calculation question
I think you have to write this as the convolution...
Pr(X_1 + X_2 = k) = sum_{i=0:k} Pr(X_1 = i) Pr(X_2=k-i)
Each Pr(*) part is the binomial pmf -- I don't think this reduces down at all (unless p_1 = p_2). Some of these terms become zero when i > n_1 or k-i > n_2, etc., so it is easy to set up an R function for this, but I don't see any simple analytical formula falling out of this.
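A minimal sketch of the convolution above, written in Python instead of R and using hypothetical function names. The summation bounds are tightened to max(0, k-n_2) and min(k, n_1), which skips exactly the terms that would be zero, so the bounds do not need to be worked out by hand for each combination of k, n_1 and n_2.

```python
from math import comb


def binom_pmf(x, n, p):
    """Pr(X = x) for X ~ Binomial(n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def sum_pmf(k, n1, p1, n2, p2):
    """Pr(X_1 + X_2 = k) for independent X_1 ~ Binomial(n1, p1)
    and X_2 ~ Binomial(n2, p2), by convolution."""
    lo = max(0, k - n2)  # below this, k - i would exceed n2
    hi = min(k, n1)      # above this, i would exceed n1
    return sum(binom_pmf(i, n1, p1) * binom_pmf(k - i, n2, p2)
               for i in range(lo, hi + 1))
```

When p_1 = p_2 = p the result should agree with the Binomial(n_1 + n_2, p) pmf, which gives a quick sanity check of the function.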
-------------------------------------------
Scott Berry
Berry Consultants
-------------------------------------------
Original Message:
Sent: 02-05-2013 21:00
From: Gabriel Farkas
Subject: probability calculation question
Hi all,
For some reason I can't seem to remember how to address this seemingly basic probability question I'm facing, nor can I find the answer anywhere online, and so I was hoping someone in this group might be able to help me out.
I have a situation where there are 2 sets of events of type 1 or type 2, both of which follow their own binomial distributions, with p1 and p2 (and p1 is not necessarily equal to p2). I need to find the probability of exactly k successes in exactly n1+n2 trials, where success in a type 1 trial is indistinguishable from success in a type 2 trial, for my purposes. I know that in all circumstances k>=min(n1,n2); that is, it's possible for all k successes to have occurred during trials of only one type, although by no means guaranteed, and in most cases there certainly would be m successes in type 1 trials and k-m successes in type 2 trials.
The solution is trivial when k=n1+n2, or when p1=p2, but I was having trouble generalizing it for larger n's. I thought at first this might follow the multinomial distribution or multivariate hypergeometric, but neither of those seems to be what I'm looking for.
If anyone has any guidance on how to tackle this, it would be greatly appreciated.
Thanks,
Gabe Farkas
-------------------------------------------
Gabriel Farkas
-------------------------------------------