22.3 A graphical and numerical illustration 471
0
0.1
0.2
0.3
0.4
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
Message length
Parameter
ε
(n,p)
Figure 22.2 Lower bound for the parameter ε(n, p), calculated from Eq. (22.39), with zoom in
the region 0 <ε(n, p) ≤ 0.4.
fraction of the ε(n, p) values progressively form a cluster in the region between ε = 0
and ε(n, 0) ≤ 0.4. This effect is also illustrated in Fig. 22.2, which shows a zoom in
this region. It is clear that for a message length n sufficiently large, any value ε>0 can
be reached by ε(n, p). Furthermore, we observe that given a message length n and any
ε>0 it is always possible to define a subset of codewords satisfying ε(n, p) ≤ ε,as
indicated in the figure by the circled clusters. In the case n = 19, for instance, the cluster
consists of a subset of five codewords satisfying ε(19, p) ≤ 0.29681, and this group of
codewords is precisely the ε-typical subspace we have been looking for! Let us take
a closer look at the corresponding data. Table 22.4 shows, for each codeword type µ of
length n = 19, from left to right: the individual codeword probability p(µ), the number of
possible permutations N of the codeword type µ, the codeword-type probability Np(µ),
the codeword parameter ε, and the associated value R = S + ε. We have defined our
ε-typical subspace as generated by the five codeword types for which ε(19, p) ≤
0.29681. As the table shows, these codeword types are of the form |λ
18
a
λ
1
b
, |λ
17
a
λ
2
b
,
|λ
16
a
λ
3
b
, |λ
15
a
λ
4
b
, and |λ
14
a
λ
5
b
. Summing the N data, it can be found that there exist 16 663
such codewords, out of 524 288 possibilities, representing only about 3% of the codeword
space . Summing the values Np(µ) in the table, one finds that the probability of any
received codeword belonging to this typical subspace is p() = 0.90208, which is
relatively high. This result yields the lower bound δ = 1 − p() = 0.09792 and hence,
using λ
a
= 0.8535, the transmission fidelity
˜
F =
(
1 − δ
)
2
+ δλ
19
a
= 0.81859 ≈ 82%,
which is fair. From the table, we also find that the maximum value of R, or the best
compression factor achievable under this coding scheme, is given by the highest value
of ε in the subspace , namely ε
max
= 0.29681, or R = 0.89769 ≈ 89%.
Having defined the ε-typical subspace (with the choice ε ≤ 0.29681), the next
step is to analyze the effect of increasing the message length n. It is only a matter of
tabulating a spreadsheet that outputs the same data as shown in Table 22.4, along with
the values p(), δ = 1 − p(), and
˜
F =
(
1 − δ
)
2
+ δλ
n
a
, for each value of n. Care must
be taken to effect the summation yielding p() only in the spreadsheet domain where
0 <ε≤ 0.29681. The fidelity
˜
F calculated for message lengths from n = 19 to n = 100