Overview of data compression standards 597
CRC trailer for error detection. The freeware archiver IZArc supports an impressive list
of archive file formats, including those used for CD and DVD images. It also provides
256-bit AES encryption, repair of corrupted archives, and several other advanced
features.[25]
Understandably, the compression rate achievable by any file-archiving program
depends strongly on the data file type (e.g., text, slides, tabulated data, HTML web page,
executables, etc.) and on anything of a different format that the file might also contain (e.g.,
raw or uncompressed pictures, equation fields, and other types of embedded additions).
An approximate comparison is shown in Ref. [26].
This study indicates that pure text files can
be zipped down to 19–27% of their original sizes (a size reduction of 73–81%), the
leaders seemingly being 7-ZIP and RAR, at 19–20% of the original size. For executables, 7-ZIP
leads at 27% of the original size (the other competitors being confined to 36–40%), and
for raw images the achievable figure degrades to 50–60%. These figures must, however,
be weighed against the coding and decoding times, which tend to be longer
for the leaders, though not systematically. Thus, each case is a tricky matter of finding
the right trade-off between minimal archive size, the time required to compress the file,
and the time taken to recover the uncompressed data. Ideally, compression should be
performed as a systematic background routine, in such a way as not to slow down other
computer tasks.
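The trade-off between archive size and coding/decoding time can be explored with a
minimal Python sketch. It assumes the standard-library codecs zlib, bz2, and lzma as
stand-ins (they are not the archivers compared in the study cited above), and the sample
text and the helper name measure are hypothetical, introduced only for illustration.

```python
import bz2
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Return (name, size ratio, encode seconds, decode seconds) for one codec."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    if restored != data:
        raise ValueError(f"{name}: round trip failed")
    return name, len(packed) / len(data), t1 - t0, t2 - t1


# Hypothetical sample: repetitive English-like text, not a file from the study.
SAMPLE = b"".join(
    b"Line %d: the quick brown fox jumps over the lazy dog.\n" % i
    for i in range(20000)
)

# Standard-library codecs used as stand-ins for full archiving programs.
CODECS = [
    ("zlib", zlib.compress, zlib.decompress),
    ("bz2", bz2.compress, bz2.decompress),
    ("lzma", lzma.compress, lzma.decompress),
]

results = [measure(name, c, d, SAMPLE) for name, c, d in CODECS]

for name, ratio, t_enc, t_dec in results:
    # ratio is compressed size / original size; the size reduction is 1 - ratio.
    print(f"{name:5s} size {ratio:6.1%}  reduction {1 - ratio:6.1%}  "
          f"encode {t_enc * 1e3:7.1f} ms  decode {t_dec * 1e3:7.1f} ms")
```

The absolute numbers depend on the machine and on the data. Synthetic text of this kind
compresses far better than real documents, so only the relative ordering of size and
time is meaningful here.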
The external and permanent storage of computer files is popularly based on the CD-ROM
(compact-disk, read-only memory),[27] which is closely similar to the previously
described audio-CD, both in terms of looks and storage space (700–800 Mbytes). A
key difference is that the CD-ROM is primarily designed for computer data and, hence,
includes an error-correction function, which is achieved through Reed–Solomon
(RS) encoding (see Chapter 11). The CD-ROM can be “burnt” according to three preset
modes: audio (for music tracks or copying audio-CDs), and mode 1 and mode 2 for PC
data. The CD-ROM has 333 000 sectors of 2352 bytes each (which gives a capacity of
783.2 Mbytes). In the audio mode, no error correction is used; thus, the full sector
length (2352 bytes) can be used for storing audio files. Both mode 1 and mode 2 use
a 16-byte header in each sector, for the purposes of synchronization and identification.
Unlike mode 1, mode 2 does not include error correction, which leaves 2352 − 16 =
2336 bytes for payload. In mode 1, the 288-byte trailer for error correction leaves a
payload of 2336 − 288 = 2048 bytes. CD-ROMs are characterized by three different
possible read and write speeds: A = write-once, B = rewrite, and C = read-only. By
definition, the 1× speed rating corresponds to 150 kbyte/s. Thus, a 32× write-once speed
corresponds to 4.8 Mbyte/s. A 700 Mbyte CD-ROM can, thus, be copied in approximately
700/4.8 = 146 s, or approximately 2 min 30 s. The different speed ratings for the A, B,
and C functions are then identified under the generic label A×, B×, C×, for instance
12×, 10×, 32×.[28]
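The sector and speed arithmetic above can be checked with a short Python sketch. The
constants are the figures quoted in the text. The function names are hypothetical, and
the sketch assumes decimal units (1 kbyte = 1000 bytes, 1 Mbyte = 10^6 bytes), which is
the convention that reproduces the 783.2 Mbyte and 4.8 Mbyte/s values.

```python
SECTORS = 333_000        # sectors per disk, as quoted in the text
SECTOR_BYTES = 2352      # raw sector length
HEADER_BYTES = 16        # synchronization and identification (modes 1 and 2)
TRAILER_BYTES = 288      # error-correction trailer (mode 1 only)
BASE_RATE = 150_000      # 1x rating in bytes/s, assuming 1 kbyte = 1000 bytes


def payload_bytes(mode):
    """Usable bytes per sector for 'audio', 'mode1', or 'mode2'."""
    if mode == "audio":
        return SECTOR_BYTES
    if mode == "mode2":
        return SECTOR_BYTES - HEADER_BYTES
    if mode == "mode1":
        return SECTOR_BYTES - HEADER_BYTES - TRAILER_BYTES
    raise ValueError(f"unknown mode: {mode}")


def capacity_mbytes(mode):
    """Total payload capacity of the disk in Mbytes (10**6 bytes)."""
    return SECTORS * payload_bytes(mode) / 1e6


def copy_time_seconds(size_mbytes, speed_rating):
    """Time to transfer size_mbytes at a given multiple of the 1x rate."""
    return size_mbytes * 1e6 / (speed_rating * BASE_RATE)


for mode in ("audio", "mode2", "mode1"):
    print(f"{mode:6s} payload {payload_bytes(mode)} bytes/sector, "
          f"capacity {capacity_mbytes(mode):.1f} Mbytes")

print(f"700 Mbytes at 32x: {copy_time_seconds(700, 32):.0f} s")
```

Under these assumptions, the mode 1 payload of 2048 bytes per sector corresponds to
about 682 Mbytes of user data, against 783.2 Mbytes in the audio mode.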
[25] See, for instance: http://en.wikipedia.org/wiki/Izarc; www.izarc.org/.
[26] See, for instance: http://en.wikipedia.org/wiki/Comparison_of_file_archivers.
[27] See, for instance: http://en.wikipedia.org/wiki/CD-ROM.
[28] See, for instance: http://en.wikipedia.org/wiki/Izarc; www.izarc.org/.