(expected) value of the conflict among evidential claims expressed by each
given probability distribution function p.
3.2.1. Simple Derivation of the Shannon Entropy
Suppose that a particular alternative in a finite set X of considered alternatives occurs with the probability p(x). When this probability is very high, say p(x) = 0.999, then the occurrence of x is taken almost for granted and, consequently, we are not much surprised when it actually occurs. That is, our uncertainty in anticipating x is quite small and, therefore, our observation that x has actually occurred contains very little information. On the other hand, when the probability is very small, say p(x) = 0.001, then we are greatly surprised when x actually occurs. This means, in turn, that we are highly uncertain in our anticipation of x and, hence, the actual observation of x has very large information content. We can conclude from these considerations that the anticipatory uncertainty of x prior to the observation (and the information content of observing x) should be expressed by a decreasing function of the probability p(x): the more likely the occurrence of x, the less information its actual observation contains.
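To make this contrast concrete, the short sketch below evaluates one candidate decreasing function, the negative base-2 logarithm, at the two probabilities used above. The logarithmic choice (and the use of Python) is an illustrative assumption at this point, anticipating the form that the derivation in this subsection eventually singles out.

import math

def surprise(p: float) -> float:
    """One candidate measure of anticipatory uncertainty:
    -log2(p), a decreasing function of the probability p."""
    return -math.log2(p)

# Nearly certain outcome: observing it carries very little information.
print(surprise(0.999))  # about 0.0014 bits
# Very unlikely outcome: observing it is highly informative.
print(surprise(0.001))  # about 9.97 bits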
Consider a random experiment with n considered outcomes, i = 1, 2, ..., n, whose probabilities are p_1, p_2, ..., p_n, respectively. Assume that p_i > 0 for all i ∈ ℕ_n, which means that no outcomes with zero probabilities are considered.
The uncertainty in anticipating a particular outcome i (and the information obtained by actually observing this outcome) should clearly be a function of p_i. Let s(p_i) denote this function. To measure the anticipatory uncertainty in a meaningful way, function s should satisfy the following properties:
(s1) s(p_i) should decrease with increasing p_i.
(s2) s(1) = 0.
(s3) s should behave properly when applied to joint outcomes of independent experiments.
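Before property (s3) is elaborated, a minimal numerical check may help. It assumes the logarithmic form s(p) = -log2(p), anticipating the result of the derivation, and verifies (s1), (s2), and the additive behavior over independent outcomes that (s3) is meant to capture.

import math

def s(p: float) -> float:
    # Assumed candidate for the uncertainty function; the logarithmic
    # form anticipates the outcome of the derivation.
    return -math.log2(p)

# (s1): s decreases with increasing p.
assert s(0.2) > s(0.5) > s(0.9)

# (s2): a certain outcome carries no anticipatory uncertainty.
assert s(1.0) == 0.0

# (s3): for independent experiments, the joint probability is the
# product p*q (Eq. (3.26) below), and the uncertainties add up.
p, q = 0.3, 0.6
assert math.isclose(s(p * q), s(p) + s(q))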
To elaborate on property (s3), let r_ij denote the joint probabilities of outcomes of two independent experiments. Assume that one of the experiments has n outcomes with probabilities p_i (i ∈ ℕ_n) and the other one has m outcomes with probabilities q_j (j ∈ ℕ_m). Then, according to the calculus of probability theory,

    r_ij = p_i q_j                                                  (3.26)

for all i ∈ ℕ_n and all j ∈ ℕ_m. Since the experiments are independent, the anticipatory uncertainty of a particular joint outcome ⟨i, j⟩ should be equal to