In other words, it is the probability distribution of the number of successes in a collection of $n$ independent yes/no experiments with success probabilities $p_1, p_2, \dots, p_n$. The ordinary binomial distribution is a special case of the Poisson binomial distribution, when all success probabilities are the same, that is $p_1 = p_2 = \cdots = p_n = p$.
Definitions
Probability Mass Function
The probability of having $k$ successful trials out of a total of $n$ can be written as the sum
$$\Pr(K = k) = \sum_{A \in F_k} \prod_{i \in A} p_i \prod_{j \in A^c} (1 - p_j),$$ [1]
where $F_k$ is the set of all subsets of $k$ integers that can be selected from $\{1, 2, 3, \dots, n\}$. For example, if $n = 3$, then $F_1 = \{\{1\}, \{2\}, \{3\}\}$. $A^c$ is the complement of $A$, i.e. $A^c = \{1, 2, 3, \dots, n\} \setminus A$.
$F_k$ will contain $n!/\left((n-k)!\,k!\right)$ elements, the sum over which is infeasible to compute in practice unless the number of trials $n$ is small (e.g. if $n = 30$, $F_{15}$ contains over $10^{8}$ elements). However, there are other, more efficient ways to calculate $\Pr(K = k)$.
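For small $n$ the subset sum above can be evaluated directly. The following Python sketch (function name is illustrative, not from any cited package) enumerates the $\binom{n}{k}$ subsets:

```python
from itertools import combinations
from math import prod

def poisson_binomial_pmf_direct(k, p):
    """Pr(K = k) by summing over all subsets A of size k.
    Feasible only for small n, since there are C(n, k) subsets."""
    n = len(p)
    total = 0.0
    for A in combinations(range(n), k):
        Ac = set(range(n)) - set(A)
        # product of successes over A times product of failures over its complement
        total += prod(p[i] for i in A) * prod(1 - p[j] for j in Ac)
    return total

p = [0.1, 0.5, 0.9]
pmf = [poisson_binomial_pmf_direct(k, p) for k in range(len(p) + 1)]
```

The probabilities over $k = 0, \dots, n$ sum to one, which makes a convenient sanity check.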
As long as none of the success probabilities are equal to one, one can calculate the probability of $k$ successes using the recursive formula [2][3]
$$\Pr(K = k) = \begin{cases} \displaystyle\prod_{i=1}^{n} (1 - p_i), & k = 0, \\[1ex] \displaystyle\frac{1}{k} \sum_{i=1}^{k} (-1)^{i-1} \Pr(K = k - i)\, T(i), & k > 0, \end{cases}$$
where
$$T(i) = \sum_{j=1}^{n} \left( \frac{p_j}{1 - p_j} \right)^{i}.$$
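A direct Python transcription of this recursion might look as follows (illustrative sketch, not code from the cited references):

```python
from math import prod

def poisson_binomial_pmf_recursive(p):
    """Full PMF [Pr(K=0), ..., Pr(K=n)] via the recursive formula.
    Requires all p_i < 1; numerically unstable for n much above 20."""
    n = len(p)
    # T[i] = sum_j (p_j / (1 - p_j))^i for i = 1..n (index 0 unused)
    T = [None] + [sum((pi / (1 - pi)) ** i for pi in p) for i in range(1, n + 1)]
    pr = [prod(1 - pi for pi in p)]  # Pr(K = 0)
    for k in range(1, n + 1):
        pr.append(
            sum((-1) ** (i - 1) * pr[k - i] * T[i] for i in range(1, k + 1)) / k
        )
    return pr

pmf = poisson_binomial_pmf_recursive([0.1, 0.5, 0.9])
```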
The recursive formula is not numerically stable, and should be avoided if $n$ is greater than approximately 20. An alternative is to use a divide-and-conquer algorithm: if we assume $n$ is a power of two, denoting by $f(p_{i:j})$ the Poisson binomial of $p_i, \dots, p_j$ and $*$ the convolution operator, we have
$$f(p_{1:n}) = f(p_{1:n/2}) * f(p_{n/2+1:n}).$$
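A minimal Python sketch of this divide-and-conquer scheme (illustrative names; here $n$ need not be a power of two, since the two halves are simply allowed to differ in size by one):

```python
def convolve(a, b):
    """Discrete convolution of two PMFs given as lists of probabilities."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poisson_binomial_pmf_dc(p):
    """Split p in half, recurse, then convolve the two halves' PMFs."""
    if len(p) == 1:
        return [1 - p[0], p[0]]  # PMF of a single Bernoulli trial
    mid = len(p) // 2
    return convolve(poisson_binomial_pmf_dc(p[:mid]),
                    poisson_binomial_pmf_dc(p[mid:]))

pmf = poisson_binomial_pmf_dc([0.1, 0.5, 0.9, 0.3])
```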
Properties
Mean and Variance
Since a Poisson binomial distributed variable is a sum of $n$ independent Bernoulli distributed variables, its mean and variance will simply be sums of the means and variances of the $n$ Bernoulli distributions:
$$\mu = \sum_{i=1}^{n} p_i, \qquad \sigma^2 = \sum_{i=1}^{n} p_i (1 - p_i).$$
For fixed values of the mean ($\mu$) and size ($n$), the variance is maximal when all success probabilities are equal ($p_i = \mu/n$) and we have a binomial distribution. When only the mean is fixed, the variance is bounded from above by the variance of the Poisson distribution with the same mean, which is attained asymptotically[citation needed] as $n$ tends to infinity.
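As a quick numerical illustration in Python (the values of $p_i$ are arbitrary examples):

```python
p = [0.2, 0.5, 0.7, 0.9]
n = len(p)

mu = sum(p)                               # mean: sum of the Bernoulli means
var = sum(pi * (1 - pi) for pi in p)      # variance: sum of the Bernoulli variances

# With mu and n fixed, equal probabilities p_i = mu/n (i.e. a binomial
# distribution) maximize the variance, so binom_var bounds var from above.
p_bar = mu / n
binom_var = n * p_bar * (1 - p_bar)
```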
Entropy
There is no simple formula for the entropy of a Poisson binomial distribution, but the entropy is bounded above by the entropy of a binomial distribution with the same number parameter $n$ and the same mean. Therefore, the entropy is also bounded above by the entropy of a Poisson distribution with the same mean.[6]
The Shepp–Olkin concavity conjecture, due to Lawrence Shepp and Ingram Olkin in 1981, states that the entropy of a Poisson binomial distribution is a concave function of the success probabilities $p_1, p_2, \dots, p_n$.[7] This conjecture was proved by Erwan Hillion and Oliver Johnson in 2015.[8] The Shepp–Olkin monotonicity conjecture, from the same 1981 paper, is that the entropy is monotone increasing in $p_i$ if all $p_i \le 1/2$. This conjecture was also proved by Hillion and Johnson, in 2019.[9]
Chernoff bound
The probability that a Poisson binomial distributed variable gets large can be bounded using its moment generating function as follows (valid for any $t > 0$, and with the final choice of $t$ below requiring $s \ge \mu$):
$$\Pr(K \ge s) \le e^{-st}\, \operatorname{E}\!\left[e^{tK}\right] = e^{-st} \prod_{i=1}^{n} \left(1 - p_i + p_i e^{t}\right) \le \exp\!\left(-st + \mu\left(e^{t} - 1\right)\right),$$
where $\mu = \sum_{i=1}^{n} p_i$. Taking $t = \ln(s/\mu)$, which minimizes the last expression, gives
$$\Pr(K \ge s) \le \exp\!\left(s - \mu - s \ln\frac{s}{\mu}\right).$$
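The bound can be checked against the exact tail probability; the Python sketch below computes the exact PMF by adding one Bernoulli trial at a time and compares the tail to the optimized Chernoff bound (function names are illustrative):

```python
from math import exp, log

def poisson_binomial_pmf(p):
    """Exact PMF by sequentially convolving in one Bernoulli trial at a time."""
    dist = [1.0]
    for pi in p:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - pi)      # trial fails: count stays at k
            new[k + 1] += q * pi        # trial succeeds: count moves to k+1
        dist = new
    return dist

def chernoff_bound(p, s):
    """Upper bound on Pr(K >= s) with the minimizing t = ln(s/mu); needs s >= mu > 0."""
    mu = sum(p)
    t = log(s / mu)
    return exp(-s * t + mu * (exp(t) - 1))

p = [0.1, 0.2, 0.3, 0.4]          # mu = 1.0
s = 3
exact_tail = sum(poisson_binomial_pmf(p)[s:])
bound = chernoff_bound(p, s)
```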
Computational Methods
The reference [10] discusses techniques for evaluating the probability mass function of the Poisson binomial distribution. The following software implementations are based on it:
An R package poibin was provided along with the paper,[10] and is available for computing the CDF, PMF, quantile function, and random number generation of the Poisson binomial distribution. For computing the PMF, a DFT algorithm or a recursive algorithm can be specified to compute the exact PMF, and approximation methods using the normal and Poisson distributions can also be specified.
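The DFT approach can be sketched as follows. This is a Python illustration of the idea of taking the inverse DFT of the characteristic function, not the poibin package's actual R code:

```python
import cmath

def poisson_binomial_pmf_dft(p):
    """PMF via the inverse DFT of the characteristic function.
    phi(l) = prod_j (1 - p_j + p_j * exp(i*w*l)) is the characteristic
    function sampled at w*l, w = 2*pi/(n+1); since K takes values 0..n,
    the inverse DFT of these samples recovers Pr(K = k) exactly."""
    n = len(p)
    m = n + 1
    w = 2 * cmath.pi / m
    phi = []
    for l in range(m):
        z = cmath.exp(1j * w * l)
        val = 1 + 0j
        for pj in p:
            val *= 1 - pj + pj * z
        phi.append(val)
    pmf = []
    for k in range(m):
        s = sum(phi[l] * cmath.exp(-1j * w * l * k) for l in range(m))
        pmf.append((s / m).real)   # imaginary parts cancel up to rounding
    return pmf

pmf = poisson_binomial_pmf_dft([0.1, 0.5, 0.9])
```

In practice an FFT would replace the quadratic double loop, which is what makes this approach efficient for large $n$.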
References
Shah, B. K. (1973). "On the distribution of the sum of independent integer valued random variables". American Statistician. 27 (3): 123–124. JSTOR 2683639.
Hong, Yili (March 2013). "On computing the distribution function for the Poisson binomial distribution". Computational Statistics & Data Analysis. 59: 41–51. doi:10.1016/j.csda.2012.10.006.