128 Information coding
the date of the French Revolution?). Despite the oddity of their code rules, it is noteworthy
that Roman numerals are still in use, for instance to represent the hours on clock
dials, number pages in book prefaces, express copyright dates, enumerate cases in
mathematical descriptions, or count series in games, such as US football. It is also
interesting to note that the Roman-numeral system was in fact inspired by the Greek
system in use in 400 BC.
2
The ten-character dictionary of our decimal system, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}was first
used by the Hindus in 400 BC, and then transmitted later to the West by the Arabs, hence
the misnomer Arabic numerals, which should rather be Hindu-Arabic numerals.The
introduction of a 0 character made it possible to greatly simplify numerals. Indeed, with
only two- or three-character codewords (00–99 or 000–999), up to 100 or 1000 numerals
can be generated. The hexadecimal system is based on the 16-character alphabet
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}.
The advantage of the hexadecimal system is that the codeword length is shorter than
in the decimal case, at the expense of using a greater number of alphabet characters. The
drawback of both decimal and hexadecimal codes is that they are based on relatively
large alphabets. In practice, such characters may not be simple to generate, write, or
faithfully interpret, as we all know from handwriting experience. Such characters need,
in turn, to be coded through a simpler alphabet. The most basic code (and number
representation) corresponds to the binary system, which uses the two-character {0, 1}
alphabet.
3
A single-character, binary codeword is referred to as a bit, short for binary
digit. In the following, we will equivalently refer to codewords as “numbers.”
Binary and decimal numbers are conventionally written as the ordered charac-
ter sequences B = ...b
3
b
2
b
1
b
0
and D = ...d
3
d
2
d
1
d
0
, respectively. The conversion
between decimal and binary numbers is given by the following power expansion:
...d
3
d
2
d
1
d
0
≡ ...d
3
× 10
3
+ d
2
× 10
2
+ d
1
× 10
1
+ d
0
× 10
0
= ...b
3
× 2
3
+ b
2
× 2
2
+ b
1
× 2
1
+ b
0
× 2
0
(8.1)
≡ ...b
3
b
2
b
1
b
0
.
not 9 ≡ VIII. To generate numbers greater than 10 ≡ X,theruleisasfollows:11≡ XI, 12 ≡ XII, . . . ,
18 ≡XVIII, 19 ≡XIX, and 20 ≡XX, and. Then 30 ≡XXX, 31 ≡XXXI, . . . , up to 39 ≡XXXIX. Because
of the absence of a zero, the powers of ten would be character-consuming should they have to be repeated by
as many X. To make numbers more compact, the Romans chose to represent 50, 100, 500, and 1000 by the
symbols L, C, D, and M, respectively, with the subtractive rule 40 ≡ XL, 400 ≡ CD, 90 ≡ XC, 900 ≡ CM.
Thus, 2006 is represented by MMVI, while 1999 is represented by MCMXCIX. There are additional rules
for representing greater numbers, see for instance:
http://en.wikipedia.org/wiki/Roman_numerals#IIII_or_IV.3 F, and
http://ostermiller.org/calc/roman.html.
For more on the numeral systems used by different civilizations through history, see
http://en.wikipedia.org/wiki/Arabic_numerals.
2
See: http://en.wikipedia.org/wiki/Greek_numerals.
3
The unary system uses a single-character alphabet {1}. Numbers are represented by this character’s repeti-
tion, starting from zero: 0 = 1, 1 = 11, 2 = 111, 3 = 1111, etc. Albeit not practical, such a code can have
interesting applications, for instance, in Turing machines (see Chapter 7).