is called the $\chi^2$ distribution with $k$ degrees of freedom. In particular,
for $k = 2$, it becomes an exponential distribution. The sum of the squares
of $k$ independent standard Gaussian random variables follows the $\chi^2$
distribution with $k$ degrees of freedom.
(f) Double exponential distribution. The distribution with density function
$$ g(x) = e^{x - e^{x}} \qquad (4.9) $$
is called the double exponential distribution. The logarithm of an
exponential random variable with unit rate follows the double exponential
distribution.
(g) Uniform distribution. The distribution with density function
$$ g(x) = \begin{cases} (b - a)^{-1}, & \text{for } a \le x < b \\ 0, & \text{otherwise} \end{cases} \qquad (4.10) $$
is called the uniform distribution over $[a, b)$.
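
The two distributional facts stated in items (e) and (f) can be checked roughly by simulation. The following is a minimal sketch using only Python's standard library, not the simulation method of this book. The seed, the sample size `n`, and the evaluation points `t` and `y0` are arbitrary choices made for illustration. It assumes the standard $\chi^2$ density, which for $k = 2$ is $\frac{1}{2} e^{-x/2}$ (an exponential distribution with mean 2), and a unit-rate exponential variable in (f).

```python
import math
import random

rng = random.Random(0)  # arbitrary fixed seed, for reproducibility only
n = 100_000             # number of simulated draws (illustrative choice)

# (e) Sum of squares of k = 2 independent N(0, 1) draws.
chi2 = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
t = 3.0
chi2_tail = sum(x > t for x in chi2) / n
# Tail probability P(X > t) of an exponential distribution with mean 2.
exp_tail = math.exp(-t / 2.0)

# (f) Logarithm of unit-rate exponential draws.
log_exp = [math.log(rng.expovariate(1.0)) for _ in range(n)]
y0 = 0.0
empirical_cdf = sum(y <= y0 for y in log_exp) / n
# Integrating g(x) = exp(x - exp(x)) from -infinity to y0 gives 1 - exp(-exp(y0)).
exact_cdf = 1.0 - math.exp(-math.exp(y0))

print(f"(e) P(X > {t}):  simulated {chi2_tail:.3f}, exponential {exp_tail:.3f}")
print(f"(f) P(Y <= {y0}): simulated {empirical_cdf:.3f}, from (4.9) {exact_cdf:.3f}")
```

In each printed line the simulated value should be close to the theoretical one, up to Monte Carlo error of order $1/\sqrt{n}$.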
Example Figure 4.1 shows the density functions defined in (a)–(f)
above. By the simulation methods to be discussed in Chapter 16, data
$y_1, \cdots, y_N$ can be generated that take various values according to the
density function. The generated data are called realizations of the random
variable. Figure 4.2 shows examples of realizations with the sample size
$N = 20$ for the distributions of (a)–(c) and (f) above.
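
As a rough illustration of what "generating realizations" means, the sketch below draws $N = 20$ values from the double exponential distribution (f) and the uniform distribution (g). It is a stand-in that relies on Python's standard `random` module, not on the methods of Chapter 16. The function names `double_exponential` and `uniform_ab`, the seed, and the interval $[0, 1)$ are hypothetical choices made for this example.

```python
import math
import random

def double_exponential(n, rng):
    """Return n draws with density exp(x - exp(x)).

    Each draw is the logarithm of a unit-rate exponential draw.
    """
    return [math.log(rng.expovariate(1.0)) for _ in range(n)]

def uniform_ab(n, a, b, rng):
    """Return n draws from the uniform distribution over [a, b)."""
    return [a + (b - a) * rng.random() for _ in range(n)]

rng = random.Random(1)  # arbitrary fixed seed, for reproducibility only
N = 20                  # sample size used in the example above

y_f = double_exponential(N, rng)
y_g = uniform_ab(N, 0.0, 1.0, rng)

print("double exponential:", " ".join(f"{y:6.2f}" for y in y_f))
print("uniform on [0, 1): ", " ".join(f"{y:6.2f}" for y in y_g))
```

Changing the seed gives a different set of realizations of the same random variable.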
If a probability distribution or a density function is given, we can
generate data that follow the distribution. On the other hand, in statistical
analysis, when data $y_1, \cdots, y_N$ have been obtained, they are considered
to be realizations of a random variable $Y$. That is, we assume
a random variable $Y$ underlying the data, and when we obtain the data,
we consider them as realizations of that random variable. Here, the density
function $g(y)$ defining the random variable is called the true model.
Since this true model is usually unknown to us, given a set of data, it is
necessary to estimate the probability distribution that generates the data.
For example, we estimate the density function shown in Figure 4.1 from
the data shown in Figure 4.2. Here, the density function estimated from
data is called a statistical model and is denoted by $f(y)$.
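
The distinction between the true model $g(y)$ and a statistical model $f(y)$ can be made concrete with a small sketch. Everything specific in it is an assumption made for illustration: the true model is taken to be Gaussian with mean 1 and standard deviation 2, the statistical model is restricted to the Gaussian family, and its parameters are set to the sample mean and the sample variance (divisor $N$). In practice $g$ is unknown and only the data are available.

```python
import math
import random
import statistics

rng = random.Random(2)  # arbitrary fixed seed, for reproducibility only

# True model g(y): assumed here to be N(1, 2^2), purely for demonstration.
TRUE_MEAN, TRUE_SD = 1.0, 2.0
N = 20
y = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]  # realizations of Y

# Statistical model f(y): a Gaussian density whose parameters come from the data.
mu_hat = statistics.fmean(y)
var_hat = statistics.pvariance(y, mu_hat)  # sample variance with divisor N

def gaussian_density(x, mean, var):
    """Density of a Gaussian distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def g(x):
    """True model (known only because this is a simulation)."""
    return gaussian_density(x, TRUE_MEAN, TRUE_SD ** 2)

def f(x):
    """Statistical model estimated from the N observations."""
    return gaussian_density(x, mu_hat, var_hat)

print(f"estimated mean {mu_hat:.3f}, estimated variance {var_hat:.3f}")
for x in (-2.0, 0.0, 2.0, 4.0):
    print(f"x = {x:5.1f}   g(x) = {g(x):.4f}   f(x) = {f(x):.4f}")
```

With only 20 observations, $f$ generally differs from $g$; the two tend to agree more closely as $N$ grows.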
In ordinary statistical analysis, the probability distribution is sufficient
to characterize the data, whereas for time series data, we have to
consider the joint distribution $f(y_1, \cdots, y_N)$ as shown in Chapter 2. In