7 The Britannica Guide to Statistics and Probability 7
observation that {T₁ > t} = {N(t) = 0}. Hence, P{T₁ ≤ t} = 1 − P{N(t) = 0} = 1 − exp(−μt), and by differentiation one obtains the exponential density function.
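The identity {T₁ > t} = {N(t) = 0} can be checked numerically. The sketch below (the rate μ = 1.5 and horizon t = 0.8 are illustrative choices, not from the text) samples the Poisson count N(t) directly and compares the empirical frequency of {N(t) = 0} with exp(−μt):

```python
import numpy as np

# Sketch: for a Poisson process with rate mu, N(t) ~ Poisson(mu*t).
# The event {T1 > t} (no arrival by time t) equals {N(t) = 0},
# so P{T1 > t} should match exp(-mu*t).
rng = np.random.default_rng(0)
mu, t = 1.5, 0.8                              # assumed rate and time horizon
counts = rng.poisson(mu * t, size=200_000)    # samples of N(t)
empirical = np.mean(counts == 0)              # estimates P{N(t) = 0}
theoretical = np.exp(-mu * t)                 # = P{T1 > t}
print(round(empirical, 3), round(theoretical, 3))
```

With 200,000 samples the two values agree to about two decimal places, consistent with the exponential survival function 1 − P{T₁ ≤ t}.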
The Cauchy distribution does not have a mean value
or a variance, because the integral (15) does not converge.
As a result, it has a number of unusual properties. For
example, if X₁, X₂, . . . , Xₙ are independent random variables having a Cauchy distribution, then the average (X₁ + ⋯ + Xₙ)/n also has a Cauchy distribution. The variability of the average is exactly the same as that of a single
observation. Another random variable that does not have
an expectation is the waiting time until the number of
heads first equals the number of tails in tossing a fair coin.
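The claim that averaging does not reduce the variability of Cauchy observations can be illustrated by simulation. The sketch below (the choice of the standard Cauchy distribution and n = 100 are assumptions for the demo) compares the spread of sample averages with that of single observations, using the interquartile range, a measure of spread that exists even when the mean and variance do not:

```python
import numpy as np

# Sketch: the average of n independent standard Cauchy variables is again
# standard Cauchy, so its spread does not shrink with n -- unlike the
# normal case, where the spread of the average falls as 1/sqrt(n).
rng = np.random.default_rng(1)
n, reps = 100, 100_000
averages = rng.standard_cauchy((reps, n)).mean(axis=1)  # (X1 + ... + Xn)/n
singles = rng.standard_cauchy(reps)                     # single observations

def iqr(x):
    # Interquartile range: robust spread measure, finite for Cauchy data.
    return np.percentile(x, 75) - np.percentile(x, 25)

print(round(iqr(averages), 2), round(iqr(singles), 2))  # both near 2.0
```

Both interquartile ranges come out near 2, the exact value for the standard Cauchy distribution, confirming that averaging 100 observations gains nothing.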
Conditional Expectation and Least Squares Prediction
An important problem of probability theory is to predict
the value of a future observation Y given knowledge of a
related observation X (or, more generally, given several
related observations X₁, X₂, . . .). Examples are to predict
the future course of the national economy or the path of a
rocket, given its present state.
Prediction is often just one aspect of a “control” prob-
lem. For example, in guiding a rocket, measurements of
the rocket’s location, velocity, and so on are made almost
continuously. At each reading, the rocket’s future course is
predicted, and a control is then used to correct its future
course. The same ideas are used to automatically steer large tankers transporting crude oil, for which even slight gains in efficiency result in large financial savings.
Given X, a predictor of Y is just a function H(X). The
problem of “least squares prediction” of Y given the obser-
vation X is to find that function H(X) that is closest to Y
in the sense that the mean square error of prediction,