Appendix L (Chapter 13) Converse proof of the channel coding theorem
This appendix provides the converse proof of the CCT.¹
The converse proof must show that to achieve transmission with an arbitrary level of accuracy (or an arbitrarily low error probability), the condition R ≤ C must be fulfilled. The demonstration seeks to establish two properties, which I shall name A and B.
Property A
To begin with, we must establish the following property, referred to as Fano’s inequality,
according to which:
\[
H(X^n \mid Y^n) \le 1 + p_e nR, \tag{L1}
\]
where $p_e$ is the error probability of the code ($p_e = 1 - \tilde{p}$), i.e., the probability that the code will output a codeword that is different from the input-message codeword.
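To give a feel for the bound (the numbers here are illustrative, not from the original): for a code of rate $R = 0.5$ bit per symbol, block length $n = 1000$, and error probability $p_e = 0.01$, Eq. (L1) gives
\[
H(X^n \mid Y^n) \le 1 + 0.01 \times 1000 \times 0.5 = 6 \text{ bits},
\]
i.e., of the $nR = 500$ message bits, at most 6 bits of uncertainty remain once the output is known; as $p_e \to 0$, the residual uncertainty is bounded by a single bit.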
To demonstrate Fano's inequality, we define $W \in \{1, \ldots, 2^{nR}\}$ as the integer that labels the $2^{nR}$ possible codewords in the input-message codebook. We can view the code as generating an output integer label $W'$, to which the label $W$ of the input-message codeword may or may not correspond. We can, thus, write $p_e = p(W' \neq W)$. We then define a binary random variable $E$, which tells whether or not a codeword error occurred: $E = 1$ if $W' \neq W$, and $E = 0$ if $W' = W$. Thus, we have $p(E = 1) = p_e$ and $p(E = 0) = 1 - p_e$.
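Since $E$ is a binary variable, its entropy is the binary entropy function evaluated at $p_e$, a fact worth recording here because (anticipating the end of the proof) it is the origin of the constant 1 in Eq. (L1):
\[
H(E) = -p_e \log_2 p_e - (1 - p_e) \log_2 (1 - p_e) \le 1,
\]
with equality at $p_e = 1/2$.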
We can then introduce the conditional entropy $H(E, W \mid Y^n)$, which is the average uncertainty remaining about $E, W$ given knowledge of the output codeword source, $Y^n$. Referring back to the chain rule in Eqs. (5.22) and (5.23), we can expand $H(E, W \mid Y^n)$ in two different ways:
\[
\begin{aligned}
H(E, W \mid Y^n) &= H(E \mid Y^n) + H(W \mid E, Y^n) \\
&= H(W \mid Y^n) + H(E \mid W, Y^n).
\end{aligned} \tag{L2}
\]
Since $E$ is given by the combined knowledge of the label $W$ and the output source $Y^n$, we have $H(E \mid W, Y^n) = 0$. We also have $H(E \mid Y^n) \le H(E)$, since conditioning on $Y^n$ cannot increase the uncertainty in $E$. Substituting these results into Eq. (L2), we obtain $H(W \mid Y^n) \le H(E) + H(W \mid E, Y^n)$.
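As a side check, the double expansion in Eq. (L2) is easy to verify numerically. The sketch below is my addition, not part of the original proof; it assumes NumPy and an arbitrary toy joint distribution $p(e, w, y)$, and computes $H(E, W \mid Y)$ both ways to confirm that they coincide:

import numpy as np

# Toy joint distribution p(e, w, y): |E| = 2, |W| = 4, |Y| = 3 (sizes arbitrary).
rng = np.random.default_rng(0)
p = rng.random((2, 4, 3))
p /= p.sum()  # normalize so the entries form a probability distribution

def H(q):
    """Shannon entropy (in bits) of a distribution given as an array."""
    q = q[q > 0]
    return -(q * np.log2(q)).sum()

def H_cond(joint, cond_axes):
    """H(remaining axes | cond_axes) = H(joint) - H(marginal on cond_axes)."""
    other = tuple(a for a in range(joint.ndim) if a not in cond_axes)
    return H(joint) - H(joint.sum(axis=other))

lhs = H_cond(p, cond_axes=(2,))                    # H(E, W | Y)
way1 = (H_cond(p.sum(axis=1), cond_axes=(1,))      # H(E | Y)
        + H_cond(p, cond_axes=(0, 2)))             # + H(W | E, Y)
way2 = (H_cond(p.sum(axis=0), cond_axes=(1,))      # H(W | Y)
        + H_cond(p, cond_axes=(1, 2)))             # + H(E | W, Y)
assert np.isclose(lhs, way1) and np.isclose(lhs, way2)

The identity holds exactly for any joint distribution; np.isclose merely absorbs floating-point rounding.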
¹ Inspired by T. M. Cover and J. A. Thomas, Elements of Information Theory (New York: John Wiley & Sons, 1991), pp. 203–9.