8.4 Mean code length and coding efficiency
H_3 = 2.640 trit/symbol. We, thus, find that η = 74.5%. As we shall see further in this
chapter, such a ratio defines coding efficiency, a parameter that cannot exceed unity. Put
simply:
The mean codeword length (or effective code entropy) cannot be less than the source entropy.
This is a fundamental property of codes, which was originally demonstrated by Shannon.
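This bound can be checked numerically. The sketch below, in Python, computes the coding efficiency η = H/L for the Morse figures quoted above; note that the mean codeword length of about 3.544 trit/symbol is derived here from the quoted H_3 and η values, not given explicitly in this excerpt.

```python
# Coding efficiency: eta = H / L, where H is the source entropy and L the
# mean codeword length (same units). Shannon's bound L >= H gives eta <= 1.
def coding_efficiency(H: float, L: float) -> float:
    """Return eta = H / L, e.g. with H and L both in trit/symbol."""
    return H / L

# Morse code viewed as a ternary code: H_3 = 2.640 trit/symbol (from the
# text); the mean codeword length implied by eta = 74.5% is roughly
# 2.640 / 0.745 ~ 3.544 trit/symbol (a derived figure, not from the text).
eta = coding_efficiency(2.640, 3.544)
print(f"eta = {100 * eta:.1f}%")  # close to 74.5%
```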
We can thus conclude that with a 74% coding efficiency, the Morse code (analyzed as
a ternary code) is a reasonably good choice. Yet, despite its popularity and usefulness
in the past (and until as recently as 1999), the Morse code is quite far from optimal.
The main reason for this is the use of the blank, which is first required for the code to
be uniquely decodable, and second for being usable by human operators. Such a blank,
however, does not carry any information whatsoever, and it takes precious codeword
resources! This observation shows that we could obtain significantly greater coding
efficiencies if the Morse code was uniquely decodable without making use of blanks.
This improved code could be transmitted as uninterrupted bit or trit sequences. But it
would be only intelligible to machines, because human beings would be too slow to
recognize the unique symbol patterns in such sequences. Considering then both binary
and ternary codings (with source entropies H_source shown in Table 8.3), and either fixed
or variable symbol or codeword sizes, there are four basic possibilities:
(i) Fixed-length binary codewords with l = 5 bit (2^5 = 32), giving L ≡ 5.000 bit/symbol, or H_source/L = 83.7%;
(ii) Fixed-length ternary codewords with l = 3 trit (3^3 = 27), giving L ≡ 3.000 trit/symbol, or H_source/L = 88.0%;
(iii) Variable-length binary codewords with lengths between l = 1 and l = 10 bits, giving L ≡ 4.212 bit/symbol, or H_source/L = 99.33% (with optimal codes using 3 ≤ l ≤ 9 bits);
(iv) Variable-length ternary codewords with lengths between l = 1 and l = 3 trits, giving L ≡ 2.744 trit/symbol, or H_source/L = 96.2%.
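The four efficiencies can be reproduced from the mean codeword lengths above. In the sketch below, the source entropies (roughly 4.185 bit/symbol and 2.640 trit/symbol) are assumed values consistent with the percentages quoted; the text's Table 8.3 is not reproduced in this excerpt.

```python
# Coding efficiency eta = H_source / L for the four schemes listed above.
# Source entropies are assumed figures consistent with the quoted ratios.
H_BIT = 4.185   # bit/symbol (assumed binary source entropy)
H_TRIT = 2.640  # trit/symbol (ternary source entropy, from the text)

cases = {
    "(i)   fixed-length binary,  l = 5": (H_BIT, 5.000),
    "(ii)  fixed-length ternary, l = 3": (H_TRIT, 3.000),
    "(iii) variable-length binary":      (H_BIT, 4.212),
    "(iv)  variable-length ternary":     (H_TRIT, 2.744),
}

for name, (H, L) in cases.items():
    print(f"{name}: eta = {100 * H / L:.1f}%")
```

Case (iii) comes out near 99.4% with these assumed entropies, i.e. within rounding of the 99.33% quoted in the text.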
The result shown in case (iii) is derived from an optimal coding approach (Huffman
coding) to be described in Chapter 9. The result shown in case (iv) will be demonstrated
in the next section. At this stage, we can just observe that all the alternative solutions
(i)–(iv) have coding efficiencies significantly greater than that of the Morse code. We also
note that the efficiency seems to be greater for the multi-level (M-ary) codes with M > 2,
but I will show next that this is not always true.
Consider, for simplicity, the case of fixed-length M-ary codes, for which it is straight-
forward to calculate the coding efficiency. For instance, quaternary codewords, made
with n quad characters, called 0, 1, 2, and 3, can generate 4^n symbol possibilities.
Since 4^2 < 26 < 4^3 = 64, codewords with 3 quads are required for the A–Z alphabet.
The mean codeword length is, thus, L = 3 quad/symbol. The source entropy is
H_4 = (ln 2/ln 4) H_2 = 2.092 quad/symbol. Thus, the coding efficiency is η = H_4/L =
69.7%, which is lower than that of ternary, fixed-length coding (88.0%), as seen in case
(ii). The reason is that going from ternary to quaternary coding does not reduce the
codeword length, which remains equal to three. The situation would be quite different