one flips the bit that disagrees with the others, returning their state to one of the logical states. As long as $p \leq \frac{1}{3}$, the probability of a net error occurring using this method is reduced from its original value $p$ to an improved value of $3p^2$. The price paid is the reduction of transmission rate by a factor of three, because three physical bits are used to transmit each logical bit. A quantum version of this method is considered later, in Section 10.4.
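To make the arithmetic concrete, the following is a minimal Monte Carlo sketch (an illustration, not taken from the text; the function names are ours) that simulates the three-bit repetition code over a binary symmetric channel with flip probability $p$ and compares the estimated logical error rate with the quoted leading-order value $3p^2$; the exact value, counting two- and three-bit flips, is $3p^2(1-p) + p^3 = 3p^2 - 2p^3$.

```python
import random

def send_repetition(bit, p):
    """Encode one logical bit as three physical bits, flip each
    independently with probability p (a binary symmetric channel),
    and decode by majority vote."""
    received = [bit ^ (random.random() < p) for _ in range(3)]
    return int(sum(received) >= 2)

def logical_error_rate(p, trials=200_000):
    """Estimate the probability that majority-vote decoding fails."""
    return sum(send_repetition(0, p) for _ in range(trials)) / trials

p = 0.1
print(f"raw error probability p:      {p}")
print(f"simulated logical error rate: {logical_error_rate(p):.4f}")
print(f"exact 3p^2 - 2p^3:            {3*p**2 - 2*p**3:.4f}")
```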
4.6 Data compression
Data compression is a method of encoding that reduces the length of the strings required to capture a quantity of information, given some knowledge of the statistics of the states provided, for example, by a transmitting source. The Shannon entropy, $H(A)$, of a random variable $A$ provides a lower bound on the average length of its shortest description. A basic result of the theory of data compression is the noiseless coding theorem, which bounds data compression by stating that a message cannot be compressed to fewer bits per signal than its Shannon entropy, as follows.
For any $\delta, \epsilon > 0$:

(i) With $H(A) + \delta$ available bits per signal, there exists a coding-decoding scheme with fidelity $F_M > 1 - \epsilon$, for all $M$ sufficiently large;

(ii) With $H(A) - \delta$ available bits per signal, for any coding-decoding scheme, the fidelity $F_M < \epsilon$, for all $M$ sufficiently large, where the fidelity is given by

$$F_M = \sum_{A^M} p(A^M)\, p_{\mathrm{exact}}(A^M)\,, \qquad (4.35)$$

$A^M = a_{i_1} a_{i_2} \ldots a_{i_M}$ being a bitstring (block) with prior probability as distributed by the sender, Alice, $p(A^M) = p_{i_1} p_{i_2} \cdots p_{i_M}$, $p_{i_J}$ being the probability of a given $a_{i_J}$.
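Both directions of the theorem can be seen numerically with a toy scheme (our illustration; the construction and names are assumptions, not the book's). For a Bernoulli source with $P(1) = q$, encode faithfully only the $2^{MR}$ most probable length-$M$ blocks, so that $p_{\mathrm{exact}}(A^M) = 1$ on those blocks and $0$ elsewhere; the fidelity (4.35) is then just the total probability of the retained blocks, and it crosses from small to nearly 1 as the rate $R$ passes $H(A)$:

```python
from math import comb, log2

def H(q):
    """Binary Shannon entropy in bits per signal."""
    return -q * log2(q) - (1 - q) * log2(1 - q)

def fidelity(q, M, rate):
    """Fidelity (4.35) of a scheme that keeps only the 2**(M*rate) most
    probable length-M blocks of a Bernoulli(q) source (q < 1/2, so blocks
    with fewer 1s are more probable) and loses all other blocks."""
    budget = 2.0 ** (M * rate)          # number of blocks we can label
    kept_prob, kept_count = 0.0, 0
    for k in range(M + 1):              # k = number of 1s in the block
        p_block = q**k * (1 - q)**(M - k)
        take = min(comb(M, k), budget - kept_count)
        if take <= 0:
            break
        kept_prob += take * p_block     # these blocks are decoded exactly
        kept_count += take
    return kept_prob

q, M, delta = 0.11, 1000, 0.05
print(f"H(A) = {H(q):.3f} bits per signal")
print(f"F_M at H+delta: {fidelity(q, M, H(q) + delta):.3f}")  # close to 1
print(f"F_M at H-delta: {fidelity(q, M, H(q) - delta):.3f}")  # close to 0
```

Increasing $M$ sharpens the step between the two regimes, in line with parts (i) and (ii).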
This theorem provides a statistical justification for the Shannon entropy
being considered a measure of uncertainty; see [379]. It also allows one to
interpret the Shannon entropy as the mean number of bits needed to code
the output of a source using an ideal code. The Shannon entropy can thus
be viewed as a measure of the resources required to represent the information
provided by a source. A quantum analogue of this result is discussed in Section
10.8.
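As a concrete illustration of this interpretation (our example, not the book's), a Huffman code is an essentially ideal symbol-by-symbol code: its average codeword length $L$ satisfies $H(A) \leq L < H(A) + 1$, with equality on the left when the probabilities are powers of $1/2$. A minimal sketch:

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the distribution probs."""
    # Heap entries: (subtree probability, unique tiebreaker, symbols in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # merge the two least probable
        p2, t, s2 = heapq.heappop(heap)   # subtrees; each merge deepens
        for i in s1 + s2:                 # their symbols by one bit
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, t, s1 + s2))
    return lengths

for probs in ([0.5, 0.25, 0.125, 0.125],   # dyadic: L equals H(A) exactly
              [0.4, 0.3, 0.2, 0.1]):       # non-dyadic: H(A) <= L < H(A) + 1
    L = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
    H = -sum(p * log2(p) for p in probs)
    print(f"H(A) = {H:.3f} bits, average Huffman length L = {L:.3f} bits")
```

Huffman coding over blocks of $M$ symbols at a time reduces the overhead per signal to below $1/M$, approaching $H(A)$ in the limit, which is the content of part (i) of the theorem above.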
Different methods of data compression operate with different efficiencies,
depending on the statistical properties of the message. Generally, use of typ-
ical sequences is not the most efficient method of compressing information.
The sender, Alice, can, for example, use block coding to compress information by jointly taking strings of $M$ signals and coding them as shorter data sequences without the redundancies naturally contained in an arbitrary signal, as mentioned above. The receiver, Bob, can then decode (or decompress) these sequences, reconstructing them with any desired level of accuracy.
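To see block compression approach the entropy bound in practice, one can feed a long string from a biased binary source to a general-purpose compressor (a rough illustration under our assumptions; zlib is not tailored to this source, so it will not reach $H(A)$ exactly, and by part (ii) it cannot beat it on average):

```python
import random, zlib
from math import log2

random.seed(1)
q, M = 0.11, 100_000                      # P(1) = q, block of M signals
bits = ''.join('1' if random.random() < q else '0' for _ in range(M))
packed = int(bits, 2).to_bytes((M + 7) // 8, 'big')   # 1 bit per signal
compressed = zlib.compress(packed, 9)

H = -q * log2(q) - (1 - q) * log2(1 - q)
print(f"H(A)          = {H:.3f} bits per signal")
print(f"zlib achieves   {8 * len(compressed) / M:.3f} bits per signal")
```

The compressed rate lands between $H(A)$ and 1 bit per signal; an ideal code matched to the source statistics would approach $H(A)$ itself.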