Negative Binomial Distribution
Consider a statistical experiment in which each trial results in a success with probability p or a failure with probability q = 1 - p. If the experiment is repeated indefinitely and the trials are independent of each other, then the random variable X, whose value is the number of the trial on which the rth success occurs, has a negative binomial distribution with parameters r and p. The probability mass function of X is
$$P(X = x) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r}, \qquad x = r, r+1, r+2, \ldots$$
For the rth success to occur on the xth trial, there must have been (r - 1) successes and (x - r) failures among the first (x - 1) trials. The number of ways of distributing (r - 1) successes among (x - 1) trials is $\binom{x-1}{r-1}$.
The probability of any one particular arrangement of (r - 1) successes and (x - r) failures is $p^{r-1}(1-p)^{x-r}$.
The probability that the xth trial is itself a success (the rth success) is p.
Thus, the product of these three terms is the probability that there are r successes and x-r failures in the x trials, with the rth success occurring on the xth trial.
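The three-term product described above can be sketched in Python as follows. This is a minimal illustration, not a library routine; the function name neg_binomial_pmf and its argument order are assumptions made for this example.

```python
from math import comb


def neg_binomial_pmf(x: int, r: int, p: float) -> float:
    """Probability that the r-th success occurs on trial x (hypothetical helper)."""
    if x < r:
        # The r-th success cannot occur before trial r.
        return 0.0
    # Term 1: ways to place (r - 1) successes among the first (x - 1) trials.
    ways = comb(x - 1, r - 1)
    # Term 2: probability of one such arrangement of successes and failures.
    arrangement = p ** (r - 1) * (1 - p) ** (x - r)
    # Term 3: probability that trial x is itself a success.
    return ways * arrangement * p


# The probabilities over x = r, r + 1, ... should sum to (nearly) 1.
total = sum(neg_binomial_pmf(x, 3, 0.4) for x in range(3, 400))
print(f"{total:.6f}")
```

Keeping the three terms as separate variables mirrors the derivation; in practice they collapse to a single expression.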
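A random variable X, having a negative binomial distribution with parameters r and p, is the sum of r independent random variables, each one geometrically distributed with parameter p. Intuitively, X is the number of trials needed for the first success, plus the number of additional trials needed for the second success, ..., plus the number of additional trials needed for the rth success. Writing X = X_1 + X_2 + ... + X_r, where each X_i is geometric with mean 1/p and variance (1 - p)/p^2, the mean and variance of X are derived as follows:
$$E(X) = E(X_1) + \cdots + E(X_r) = r \cdot \frac{1}{p} = \frac{r}{p}$$
$$\mathrm{Var}(X) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_r) = r \cdot \frac{1-p}{p^{2}} = \frac{r(1-p)}{p^{2}}$$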
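The sum-of-geometrics view can be checked by simulation. The sketch below builds each draw of X by adding r independent geometric waiting times and compares the sample mean and variance with r/p and r(1 - p)/p^2. The function names, the seed, the sample size, and the choice r = 3, p = 0.4 are all assumptions made for illustration.

```python
import random
import statistics


def trials_until_success(rng: random.Random, p: float) -> int:
    """One geometric draw: number of trials up to and including the first success."""
    count = 1
    while rng.random() >= p:  # rng.random() < p counts as a success
        count += 1
    return count


def draw_negative_binomial(rng: random.Random, r: int, p: float) -> int:
    """Trial number of the r-th success, built as a sum of r geometric draws."""
    return sum(trials_until_success(rng, p) for _ in range(r))


rng = random.Random(12345)  # fixed seed so the run is repeatable
r, p = 3, 0.4
samples = [draw_negative_binomial(rng, r, p) for _ in range(20000)]

sample_mean = statistics.fmean(samples)
sample_var = statistics.pvariance(samples)
print(f"mean: sample {sample_mean:.3f}, theory {r / p:.3f}")
print(f"variance: sample {sample_var:.3f}, theory {r * (1 - p) / p ** 2:.3f}")
```

The sample values should land close to the theoretical ones, with small differences due to sampling noise.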
In fact, a geometric distribution with parameter p is the same as a negative binomial distribution with parameters r = 1 and p.
EX. A phenomenal major-league baseball player has a batting average of 0.400. Beginning with his next at-bat, the random variable X, whose value is the number of the at-bat on which his rth hit occurs (walks, sacrifice flies and certain types of outs are not considered at-bats), has a negative binomial distribution with parameters r and p = 0.400. It has the following probability mass function:
$$P(X = x) = \binom{x-1}{r-1} (0.4)^{r} (0.6)^{x-r}, \qquad x = r, r+1, r+2, \ldots$$
The probability that this hitter's second hit comes on the fourth at-bat (r = 2, x = 4) is
$$P(X = 4) = \binom{3}{1} (0.4)^{2} (0.6)^{2} = 3 (0.16)(0.36) = 0.1728$$
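As a numerical check of this example, the short Python sketch below evaluates the probability that the second hit comes on the fourth at-bat, along with the probabilities for nearby at-bats. The function name prob_rth_hit_on_at_bat and the constant P_HIT are hypothetical names chosen for this illustration.

```python
from math import comb

P_HIT = 0.400  # batting average from the example


def prob_rth_hit_on_at_bat(x: int, r: int, p: float = P_HIT) -> float:
    """Probability that the r-th hit occurs on at-bat number x."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)


# Second hit on the fourth at-bat.
print(f"{prob_rth_hit_on_at_bat(4, 2):.4f}")  # prints 0.1728

# Probability that the second hit falls on each of at-bats 2 through 6.
for at_bat in range(2, 7):
    print(at_bat, f"{prob_rth_hit_on_at_bat(at_bat, 2):.5f}")
```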