Under these two conditions, the stationary solution is unique. Moreover, the distribution at time $t_n$, in the limit $n \to \infty$, asymptotically converges towards the stationary solution, regardless of the initial distribution at time $t_1$. This property will be demonstrated in the second part of this appendix.
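To make this convergence concrete, here is a minimal numerical sketch for a two-state Markov chain. The transition matrix `P` and the two initial distributions below are hypothetical illustrations, not the specific chain of the appendix's example; iterating $p_{n+1} = p_n P$ from very different starting points drives both sequences to the same stationary solution.

```python
import numpy as np

# Hypothetical two-state transition matrix (rows sum to 1); the entries are
# illustrative only, not taken from the appendix's beta/M example.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

# Two very different initial distributions at time t_1.
p_a = np.array([1.0, 0.0])
p_b = np.array([0.2, 0.8])

# Iterate p_{n+1} = p_n P; both sequences approach the same stationary solution.
for _ in range(50):
    p_a = p_a @ P
    p_b = p_b @ P

print(p_a)  # ~ [0.75, 0.25]
print(p_b)  # ~ [0.75, 0.25], independent of the initial distribution
```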
Assuming that the conditions of uniqueness are satisfied in the previous example, the entropy $H(X)_{t=t_n}$ converges towards the limit:
$$
H(X)_{t=t_\infty} \equiv H_\infty
= -\mu_1 \log \mu_1 - \mu_2 \log \mu_2
= -\frac{\beta}{M}\log\frac{\beta}{M} - \left(1 - \frac{\beta}{M}\right)\log\left(1 - \frac{\beta}{M}\right). \tag{D14}
$$
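As a quick numerical check of Eq. (D14), the limit entropy can be evaluated for illustrative values of $\beta$ and $M$; the values passed below are arbitrary, and the function simply implements the right-hand side of (D14) with base-2 logarithms, so the result is in bits per symbol.

```python
import numpy as np

def h_infinity(beta, M):
    """Entropy limit of Eq. (D14) for the two-state stationary solution
    mu_1 = beta/M, mu_2 = 1 - beta/M, in bits (log base 2)."""
    mu1 = beta / M
    mu2 = 1.0 - mu1
    return -(mu1 * np.log2(mu1) + mu2 * np.log2(mu2))

print(h_infinity(beta=5, M=10))  # uniform case (beta = M/2): 1.0 bit/symbol
print(h_infinity(beta=2, M=10))  # nonuniform case: ~0.722 bit/symbol < 1
```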
It is easily verified that when the stationary solution is uniform ($\beta = M/2$), then $H_\infty = H_{\max} = \log 2 \equiv 1$ bit/symbol, which represents the maximum possible entropy for a two-state distribution (Chapter 4). In the general case where the stationary solution is nonuniform ($\beta \neq M/2$), we have, therefore, $H_\infty < H_{\max}$. This means that the system evolves towards an entropy limit that is lower than the maximum. Here comes the interesting conclusion for this first part of the appendix: assuming that the initial distribution is uniform and the stationary solution nonuniform, the entropy will converge to a value $H_\infty < H_{\max} = H(X)_{t=t_1}$. This result means that the entropy of the system decreases over time, in apparent contradiction with the second law of thermodynamics. Such a contradiction is lifted by the argument that a real physical system has no reason to be initiated with a uniform distribution, which would give maximum entropy for the initial conditions. In this case, and if the stationary distribution is uniform, the entropy will grow over time, which represents a simplified version of the second law, as we shall see in the second part. Note that the stationary distribution does not need to be uniform for the entropy to increase. The condition $H_\infty > H(X)_{t=t_1}$ is sufficient, and it is in the domain of physics, not mathematics, to prove that such a condition is representative of real physical systems.
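The decrease of entropy towards $H_\infty$ from a uniform initial distribution can be illustrated with a short simulation sketch; the transition matrix is again a hypothetical two-state example whose stationary solution is nonuniform (approximately $[0.75, 0.25]$, so $H_\infty \approx 0.81$ bit/symbol), not the $\beta/M$ chain discussed above.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy of a probability vector, in bits."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical transition matrix with a nonuniform stationary solution
# (~[0.75, 0.25]); illustrative only.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

p = np.array([0.5, 0.5])  # uniform initial distribution: H = 1 bit/symbol
for n in range(20):
    print(n, round(entropy_bits(p), 4))  # entropy decreases towards ~0.8113
    p = p @ P
```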
Proving the second law of thermodynamics
The second part of this appendix provides an elegant information-theory proof of the second law of thermodynamics.² The tool used to establish this proof is the concept of relative entropy, also called the Kullback–Leibler distance, which was introduced in Chapter 5.
Considering two joint probability distributions p(x, y), q(x, y), the relative entropy
is defined as the quantity:
$$
D[\,p(x,y) \,\|\, q(x,y)\,]
= \left\langle \log\frac{p(x,y)}{q(x,y)} \right\rangle_{X,Y}
= \sum_{x \in X}\sum_{y \in Y} p(x,y)\,\log\frac{p(x,y)}{q(x,y)}. \tag{D15}
$$
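For illustration, the double sum in Eq. (D15) can be evaluated directly on two small joint distributions over a $2 \times 2$ alphabet; the matrices below are hypothetical, chosen only so that each sums to one, with $q$ taken uniform.

```python
import numpy as np

# Hypothetical joint distributions p(x, y) and q(x, y) over a 2x2 alphabet,
# given as matrices that each sum to 1 (illustrative values only).
p = np.array([[0.4, 0.1],
              [0.2, 0.3]])
q = np.array([[0.25, 0.25],
              [0.25, 0.25]])

# Relative entropy (Kullback-Leibler distance) of Eq. (D15), in bits.
D_pq = np.sum(p * np.log2(p / q))
print(D_pq)  # ~0.154 bit; nonnegative, and zero only when p = q
```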
² T. M. Cover and J. A. Thomas, Elements of Information Theory (New York: John Wiley & Sons, 1991), Ch. 2.