10 2 Spam and its economic significance
ISPs, E-mail Service Providers (ESPs), and organizations for data or
privacy protection receive reports from the public or from their
subscribers and customers. For example, SpamCop (www.spamcop.net) and
Abuse.net (www.abuse.net) operate reporting services and provide
complaint-based blacklists.
Technical tool-based approach
The technical tool-based approach usually does not require the active
participation of users. Compared with the other two approaches, it is
therefore generally more accurate and objective, because it does not
depend on users' subjective interpretation. However, this approach
cannot assess subjective reactions to spam, such as the actions users
take to reduce spam or their reactions to fraudulent or illegal types
of spam. The technical tool-based approach also depends on the accuracy
of its technical methods, which require constant updating in order to
recognize new forms of spam as they develop. Technical tools do not
guarantee 100% accuracy: false positives (non-spam that is mistakenly
classified as spam) and false negatives (spam that is mistakenly not
classified as spam) affect the accuracy of any spam measurement that
uses the technical tool-based approach.
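The effect of such classification errors on a measured spam portion can be
made concrete with a small sketch. It assumes a filter whose false-positive
and false-negative rates are known; the function name and all rates below are
hypothetical and chosen only for illustration.

```python
def measured_spam_portion(true_spam_portion, false_positive_rate, false_negative_rate):
    """Return the spam portion a filter would report.

    true_spam_portion:   actual fraction of messages that are spam (0..1)
    false_positive_rate: fraction of non-spam mistakenly classified as spam
    false_negative_rate: fraction of spam mistakenly not classified as spam
    """
    # Spam that the filter correctly recognizes as spam.
    caught_spam = true_spam_portion * (1.0 - false_negative_rate)
    # Legitimate mail that the filter wrongly counts as spam.
    misflagged_non_spam = (1.0 - true_spam_portion) * false_positive_rate
    return caught_spam + misflagged_non_spam


# Hypothetical values: 70% of traffic is spam, 1% false positives,
# 5% false negatives.
reported = measured_spam_portion(0.70, 0.01, 0.05)
print(f"reported spam portion: {reported:.3f}")  # reported spam portion: 0.668
```

Under these assumed rates the tool would report about 66.8% rather than 70%,
because the spam it misses outweighs the legitimate mail it misclassifies.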
In the following, we are interested in those types of statistics that
are best produced with technical tool-based approaches, such as the
total amount of spam, the type or content of spam messages, or the
geographic origins of spam. Organizations that collect large volumes of
data and provide such statistics include Symantec, MessageLabs,
Ironport, Sophos, and Commtouch. The Symantec Probe Network consists of
millions of decoy e-mail addresses that are configured to attract a
stream of spam traffic that is representative of spam activity across
the Internet as a whole [169]. MessageLabs collects data from its global
network of control towers that scan millions of e-mails daily
[122, p. 10]. Ironport uses the SenderBase traffic monitoring network
and claims that this network samples 25% of the world’s e-mail [84].
Sophos uses spam traps in its global network and analyzes millions of
e-mails each day to determine whether they are spam or not [162].
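As a minimal sketch of how statistics on spam categories or origins might be
derived from trap data, the following tallies messages received at decoy
addresses by a chosen field. The record layout, field names, and sample
records are hypothetical; it is assumed that each message has already been
assigned an origin and a category by some earlier step, and nothing here
reflects the actual methods of the vendors named above.

```python
from collections import Counter


def portion_by_key(messages, key):
    """Return each value's share of all trap messages for the given field."""
    counts = Counter(msg[key] for msg in messages)
    total = sum(counts.values())
    # An empty input yields an empty result, so no division by zero occurs.
    return {value: count / total for value, count in counts.items()}


# Hypothetical records of messages delivered to decoy addresses.
trap_messages = [
    {"origin": "US", "category": "products"},
    {"origin": "US", "category": "financial"},
    {"origin": "CN", "category": "products"},
    {"origin": "KR", "category": "health"},
]

print(portion_by_key(trap_messages, "origin"))
# {'US': 0.5, 'CN': 0.25, 'KR': 0.25}
```

The same function applied with `"category"` as the key gives the share of
each spam category in the sample.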
The following statistics are affected not only by the intrinsic elements
mentioned above, but also by some extrinsic factors, as Table 2.3 shows.
Furthermore, the statistics focus on three issues of spam: (1) portions and
trends in the development of spam categories, (2) categories of spam, and (3)
origin of spam.
Figure 2.1 shows the development of spam over almost two years, as recorded
by MessageLabs and Symantec. However, Symantec has not yet provided data on
the spam portion for 2006. Although the development of the
spam portion is similar, the levels differ quite considerably. The figure indi-
cates that the spam portion decreases; however, the numbers do not neces-