does not hold for many actual data. In such a situation, it is recommended to draw a histogram of the data. The histogram might reveal a difference in the distributions even though the two sets of data have the same mean and variance.
Therefore, the features of a time series cannot always be captured completely by the mean, the variance and the covariance function alone.
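As a minimal sketch of this point (not taken from the text), the following Python snippet simulates two series with the same mean and variance but clearly different distributions; the choice of a shifted exponential alternative and all variable names are illustrative assumptions.

```python
# Illustrative sketch: two series with mean 0 and variance 1
# whose histograms nevertheless look quite different.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
N = 1000

y_gauss = rng.standard_normal(N)                  # standard normal: mean 0, variance 1
y_exp = rng.exponential(scale=1.0, size=N) - 1.0  # shifted exponential: mean 0, variance 1

fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharex=True, sharey=True)
axes[0].hist(y_gauss, bins=30, density=True)
axes[0].set_title("Gaussian series")
axes[1].hist(y_exp, bins=30, density=True)
axes[1].set_title("Shifted exponential series")
plt.tight_layout()
plt.show()
```

Both sample means are close to 0 and both sample variances are close to 1, yet the histograms differ markedly in shape: the skewness of the second series is invisible to the mean and variance alone.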
In general, it is necessary to examine the joint probability density function of the time series $y_1, \ldots, y_N$, i.e., $f(y_1, \ldots, y_N)$. For that purpose, it is sufficient to specify the joint probability density function $f(y_{i_1}, \ldots, y_{i_k})$ of $y_{i_1}, \ldots, y_{i_k}$ for arbitrary integers $k$ and arbitrary time points satisfying $i_1 < i_2 < \cdots < i_k$.
In particular, when this joint distribution is a $k$-variate normal distribution, the time series is called a Gaussian time series. The features of a Gaussian time series can be completely captured by the mean vector and the variance-covariance matrix.
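As a sketch of this property (assuming, purely for illustration, an autocovariance of the form $C_k = 0.8^k$, which is not taken from the text), a Gaussian time series can be simulated once its mean vector and variance-covariance matrix have been specified:

```python
# Minimal sketch: a Gaussian time series is fully determined by its mean vector
# and variance-covariance matrix.  The autocovariance C_k = 0.8**k used to fill
# the matrix is an assumed, illustrative choice.
import numpy as np
from scipy.linalg import toeplitz

N = 200
mean_vector = np.zeros(N)              # constant mean, taken to be 0 here
autocov = 0.8 ** np.arange(N)          # assumed C_0, C_1, ..., C_{N-1}
cov_matrix = toeplitz(autocov)         # stationary variance-covariance matrix

rng = np.random.default_rng(1)
y = rng.multivariate_normal(mean_vector, cov_matrix)   # one realization y_1, ..., y_N
```

No further specification is needed: every distributional property of the simulated series is implied by `mean_vector` and `cov_matrix`.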
When the distribution of a time series is invariant with respect to a time shift, i.e., the probability distribution does not change with time, the time series is called strongly stationary. Namely, a time series is called strongly stationary if its distribution function satisfies the following relation
\[
f(y_{i_1}, \ldots, y_{i_k}) = f(y_{i_1-\ell}, \ldots, y_{i_k-\ell}), \qquad (2.4)
\]
for an arbitrary time shift $\ell$ and arbitrary time points $i_1, \ldots, i_k$.
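For illustration, two special cases of (2.4) can be written out. Taking $k = 1$ gives
\[
f(y_i) = f(y_{i-\ell}) \qquad \text{for all } i \text{ and } \ell,
\]
so every $y_i$ has the same marginal distribution, and hence the same mean and variance. Taking $k = 2$ gives
\[
f(y_i, y_j) = f(y_{i-\ell}, y_{j-\ell}) \qquad \text{for all } i, j \text{ and } \ell,
\]
so the joint distribution of $(y_i, y_j)$, and in particular $\mathrm{Cov}(y_i, y_j)$, depends only on the lag $j - i$. Thus strong stationarity implies weak stationarity.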
As noted above, the properties of Gaussian distributions are completely specified by the mean, the variance and the covariance. Therefore, for Gaussian time series, weak stationarity is equivalent to strong stationarity.
2.2 The Autocovariance Function of Stationary Time Series
Under the assumption of stationarity, the mean value function $\mu_n$ of a time series becomes a constant and does not depend on time $n$. Therefore, for a stationary time series, it can be expressed as
\[
\mu = E(y_n), \qquad (2.5)
\]
where $\mu$ is called the mean of the time series $y_n$. Further, the covariance of $y_n$ and $y_{n-k}$, $\mathrm{Cov}(y_n, y_{n-k})$, becomes a value that depends only on the time difference $k$. Therefore, it can be expressed as
\[
C_k = \mathrm{Cov}(y_n, y_{n-k}) = E\{(y_n - \mu)(y_{n-k} - \mu)\}, \qquad (2.6)
\]