
P is known as a transition matrix. The ijth element of $P^n$ gives the n-step transition probability. If the limit as $n \to \infty$ is $\pi_j$, then the $\pi_j$ constitute the stationary distribution of the chain. [Statistics in Civil Engineering, 1997, A. V. Metcalfe, Edward Arnold, London.]
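As a minimal numerical sketch (using a hypothetical two-state transition matrix chosen for illustration, not one from the entry), powers of $P$ give the n-step probabilities, and for large $n$ every row approaches the stationary distribution:

```python
import numpy as np

# Hypothetical 2-state transition matrix P; each row sums to 1.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# The ij-th element of P^n is the n-step transition probability.
P_n = np.linalg.matrix_power(P, 100)

# For large n every row of P^n approaches the stationary
# distribution (pi_1, pi_2), which satisfies pi P = pi.
pi = P_n[0]
print(pi)  # approximately [5/6, 1/6] for this P
```

For this matrix the limit can also be found exactly by solving $\pi P = \pi$ with $\pi_1 + \pi_2 = 1$, giving $\pi = (5/6,\ 1/6)$.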
Markov chain Monte Carlo methods (MCMC): Powerful methods for indirectly simulating
random observations from complex, and often high dimensional probability distributions.
Originally used in image processing for the restoration of blurred or distorted pictures, the
technique has now become of considerable importance in many branches of statistics, in
particular in applications of
Bayesian inference
where the integrations needed to evaluate the
posterior distribution
itself or particular features of interest such as moments, quantiles, etc.,
have, until recently, been a source of considerable practical difficulties. But although most
applications of MCMC have been in Bayesian inference, the methods are often also
extremely useful in classical
likelihood
calculations. In general terms the problem attacked
by MCMC methodology is that of obtaining characteristics of interest (for example, mean, variance, etc.) of the marginal distribution, $f(x)$, arising from a given joint distribution, $g(x, y_1, \ldots, y_q)$, as
$$f(x) = \int \cdots \int g(x, y_1, \ldots, y_q)\, dy_1 \cdots dy_q$$
The most natural and straightforward approach would be to calculate f(x) and then use it to
obtain the required characteristics. In general, however, the necessary integrations are
extremely difficult to perform, either analytically or numerically. MCMC methods effec-
tively allow generation of samples from f(x) without requiring f(x) explicitly. By simulating a
large enough sample, the mean, variance or any other characteristic of f(x) (or f(x) itself) can
be calculated to any desired degree of accuracy. The essential feature of the MCMC
approach is a cleverly constructed
Markov chain
, the
stationary distribution
of which is
precisely the distribution of interest, $f$. A number of different methods are available for creating this Markov chain, which consists of a sequence of random variables $\{X_0, X_1, X_2, \ldots\}$.
The Metropolis–Hastings algorithm, for example, samples at each stage a candidate point $Y$ from a so-called proposal distribution that may depend on the current point $X_t$. The candidate point is then accepted with probability $\alpha(X_t, Y)$ where
$$\alpha(X, Y) = \min\left(1,\ \frac{f(Y)\,q(X \mid Y)}{f(X)\,q(Y \mid X)}\right)$$
and $q$ is the proposal distribution. If the candidate point is accepted, the next state becomes $X_{t+1} = Y$; otherwise $X_{t+1} = X_t$ and the chain does not move. Remarkably, the proposal distribution $q$ can take any form and the stationary distribution of the chain will still be $f$.
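A minimal random-walk Metropolis sketch, assuming for illustration a standard normal target (known only up to its normalizing constant) and a symmetric normal proposal, in which case the $q$ terms in $\alpha$ cancel:

```python
import math
import random

def target(x):
    # Unnormalized density of interest f: here an (assumed)
    # standard normal; the normalizing constant is not needed.
    return math.exp(-0.5 * x * x)

def metropolis_hastings(n_samples, x0=0.0, step=1.0, seed=0):
    # Random-walk Metropolis: the proposal q(y|x) = Normal(x, step^2)
    # is symmetric, so q(x|y)/q(y|x) = 1 and the acceptance
    # probability reduces to min(1, f(y)/f(x)).
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_samples):
        y = x + rng.gauss(0.0, step)            # candidate point Y
        alpha = min(1.0, target(y) / target(x))
        if rng.random() < alpha:                 # accept w.p. alpha
            x = y                                # X_{t+1} = Y
        chain.append(x)                          # else X_{t+1} = X_t
    return chain

chain = metropolis_hastings(50_000)
mean = sum(chain) / len(chain)
var = sum((v - mean) ** 2 for v in chain) / len(chain)
```

Because the chain's stationary distribution is the target, the sample mean and variance should approach 0 and 1 for a long enough run; the step size here is an arbitrary tuning choice.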
An example of the application of this algorithm in a simple case is shown in Fig. 92. The most widely used form of MCMC in statistical applications is Gibbs sampling where, for two variables for example, a sample from $f(x)$ is generated by sampling instead from the two conditional distributions, $h(x \mid y)$ and $r(y \mid x)$. So now, beginning with an initial value $Y_0$, a sequence $\{Y_0, X_0, Y_1, X_1, Y_2, X_2, \ldots, Y_k, X_k\}$ is generated by sampling $X_j$ from $h(\cdot \mid Y_j)$ and $Y_{j+1}$ from $r(\cdot \mid X_j)$. Under reasonably general conditions the distribution of $X_k$ converges to $f(x)$ as $k \to \infty$. So for $k$ large enough the final observation in the sequence is effectively a sample point from $f(x)$. See also data augmentation algorithm. [Markov Chain Monte Carlo in Practice, 1996, edited by W. R. Gilks, S. Richardson and D. J. Spiegelhalter, Chapman and Hall/CRC Press, London.]
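The two-variable Gibbs scheme above can be sketched for an assumed illustrative target, a bivariate normal with correlation $\rho$, whose conditionals $h(x \mid y)$ and $r(y \mid x)$ are both normal:

```python
import random

def gibbs_bivariate_normal(n_samples, rho=0.6, y0=0.0, seed=0):
    # Gibbs sampling for an (assumed) standard bivariate normal with
    # correlation rho.  Both conditional distributions are normal:
    # X|Y=y ~ N(rho*y, 1 - rho^2) and Y|X=x ~ N(rho*x, 1 - rho^2).
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5
    y = y0
    xs = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)   # X_j sampled from h(.|Y_j)
        y = rng.gauss(rho * x, sd)   # Y_{j+1} sampled from r(.|X_j)
        xs.append(x)
    return xs

# The marginal distribution of X_k converges to f(x) = N(0, 1),
# so the sample mean and variance should approach 0 and 1.
xs = gibbs_bivariate_normal(50_000)
mean = sum(xs) / len(xs)
var = sum((v - mean) ** 2 for v in xs) / len(xs)
```

Only the conditional distributions are ever sampled; the marginal $f(x)$ is never computed explicitly, which is the point of the method.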
Markov illness–death model: A model in which live individuals are classified as either having,
or not having, a disease A, and then move between these possibilities and death as indicated
in Fig. 93.[Medical Decision Making, 1994, 14, 266–72.]