
5.3 Some Standard Discrete Distributions 147
This PMF can be deduced by counting rules. There are $\binom{m}{n}$ different
ways of selecting the $n$ balls from a box with a total of $m$ balls. Among
these (each equally likely), there are $\binom{k}{x}$ ways of selecting $x$
white balls from the $k$ white balls in the box and, similarly,
$\binom{m-k}{n-x}$ ways of choosing the black balls. The probability is the
ratio of the number of favorable selections, $\binom{k}{x}\binom{m-k}{n-x}$,
to the total number of selections, $\binom{m}{n}$. The PDF and CDF of
$\mathcal{HG}(40, 15, 10)$ are shown in Fig. 5.4.
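The counting argument can be checked numerically. The following is a minimal sketch, assuming the parameters of the distribution in Fig. 5.4 (m = 40, k = 15, n = 10); it uses only the base MATLAB function nchoosek, and the variable names x and pmf are illustrative rather than taken from the text.

```matlab
% Sketch: hypergeometric PMF built directly from the counting rule.
% m = population size, k = number of white balls, n = sample size.
m = 40; k = 15; n = 10;

% Admissible values of x: the sample cannot contain more white balls than
% exist (k) or than it holds (n), nor need more black balls than m - k.
x = max(0, n - (m - k)):min(n, k);

% P(X = x) = C(k, x) * C(m - k, n - x) / C(m, n)
pmf = arrayfun(@(xi) nchoosek(k, xi) * nchoosek(m - k, n - xi), x) ...
      / nchoosek(m, n);

disp([x' pmf']);   % two columns: x and P(X = x)
disp(sum(pmf));    % the probabilities should sum to 1 up to rounding
```

If the Statistics Toolbox is available, the vector pmf should agree with hygepdf(x, m, k, n).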
It can be shown that the mean and variance for the hypergeometric
distribution are, respectively,
$$\mu = \frac{nk}{m} \quad \text{and} \quad \sigma^2 = \left(\frac{nk}{m}\right)\left(\frac{m-k}{m}\right)\left(\frac{m-n}{m-1}\right).$$
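As a sanity check rather than a proof, the closed-form mean and variance can be compared with moments computed directly from the PMF. The sketch below assumes the same illustrative parameters (m = 40, k = 15, n = 10) and uses only base MATLAB; all variable names are hypothetical.

```matlab
% Sketch: compare closed-form mean and variance with moments from the PMF.
m = 40; k = 15; n = 10;

% Here n <= k and n <= m - k, so the support is simply 0, 1, ..., n.
x = 0:n;
pmf = arrayfun(@(xi) nchoosek(k, xi) * nchoosek(m - k, n - xi), x) ...
      / nchoosek(m, n);

% Moments computed from the definition
mu_pmf  = sum(x .* pmf);
var_pmf = sum((x - mu_pmf).^2 .* pmf);

% Closed-form expressions from the text
mu_formula  = n*k/m;                                   % 3.75 for these values
var_formula = (n*k/m) * ((m - k)/m) * ((m - n)/(m - 1));

fprintf('mean:     %.6f (PMF)  %.6f (formula)\n', mu_pmf, mu_formula);
fprintf('variance: %.6f (PMF)  %.6f (formula)\n', var_pmf, var_formula);
```

The factor (m - n)/(m - 1) is what distinguishes this variance from the binomial variance with success probability k/m; it is below 1 whenever n > 1.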
The MATLAB commands for the hypergeometric CDF, PDF, quantile, and random
number generation are hygecdf, hygepdf, hygeinv, and hygernd. WinBUGS does
not have a built-in command for a hypergeometric distribution.
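A brief usage sketch of the four commands follows. It assumes the Statistics Toolbox is installed and uses the illustrative parameters m = 40, k = 15, n = 10. The argument order (population size, number of marked items, sample size) follows the convention hygecdf(x, m, k, n); the seed passed to rng is an arbitrary choice made here for reproducibility.

```matlab
% Sketch: the four hypergeometric commands with m = 40, k = 15, n = 10.
m = 40; k = 15; n = 10;

p3 = hygepdf(3, m, k, n);     % P(X = 3)
F3 = hygecdf(3, m, k, n);     % P(X <= 3)
q  = hygeinv(0.5, m, k, n);   % smallest x with P(X <= x) >= 0.5

rng(1);                       % arbitrary seed, for reproducibility only
r = hygernd(m, k, n, 1, 5);   % 1-by-5 vector of random draws

fprintf('P(X = 3) = %.4f, P(X <= 3) = %.4f, median = %d\n', p3, F3, q);
disp(r);
```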
Example 5.9. CASES. In a group of 40 people, 15 are “CASES” and 25 are
“CONTROLS.” A sample of 10 subjects is selected [(A) with replacement and
(B) without replacement]. Find the probability
P(at least 2 subjects are CASES).
%Solution
%(A) - with replacement (binomial case);
%Let X be the number of CASES. The event
%X is at least 2 is the complement of X <= 1.
disp('(A) Bin(10, 15/40): P(X >= 2)'); 1 - binocdf(1, 10, 15/40)
% ans = 0.9363
% or
1 - binopdf(0, 10, 15/40) - binopdf(1, 10, 15/40)
% ans = 0.9363
%(B) - without replacement (hypergeometric case); hygecdf(x, m, k, n),
% where m is the size of the population,
% k is the number of cases among m, and n is the sample size.
disp('(B) HyGe(40,15,10): P(X >= 2)'); 1 - hygecdf(1, 40, 15, 10)
% ans = 0.9600, or
1 - hygepdf(0, 40, 15, 10) - hygepdf(1, 40, 15, 10)
% ans = 0.9600
Example 5.10. Capture-Recapture Models. Suppose that an unknown number m
of animals inhabit a particular region. To assess the population size,
ecologists often apply the following capture-recapture scheme. They catch k
animals, tag them, and release them back into the region. After some time,
when the tagged animals are expected to be well mixed with the untagged, a
second catch of size n is made. Suppose that x animals in the second sample
are found to be tagged.
If catching any animal is assumed equally likely, the number x of tagged
animals in the second sample is hypergeometric
$\mathcal{HG}(m, k, n)$. Ecologists use