Klir G.J. Uncertainty and Information. Foundations of Generalized Information Theory


$$NH(E) = \frac{H(E)}{\log_2 |X|}. \tag{2.14}$$

The range of $NH$ is clearly the unit interval $[0, 1]$, independent of $X$ and $E \subseteq X$. Moreover, $NH$ is invariant with respect to the choice of measurement units.

Assume now that a given set of possible alternatives, $E$, is reduced by the outcome of an action to a smaller set $E' \subset E$. Then, the amount of information obtained by the action, $I_H(E, E')$, is measured by the difference $H(E) - H(E')$. That is,

$$I_H(E, E') = \log_2 \frac{|E|}{|E'|}. \tag{2.15}$$

When the action eliminates all alternatives in $E$ except one (i.e., when $|E'| = 1$), we obtain $I_H(E, E') = \log_2 |E| = H(E)$. This means that $H(E)$ may also be viewed as the amount of information needed to characterize one element of set $E$.
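The normalized Hartley measure of Eq. (2.14) and the information of Eq. (2.15) can be checked numerically with a minimal sketch. The function names and the sets `X`, `E`, and `E_prime` below are illustrative assumptions, not part of the original text.

```python
import math

def hartley(alternatives):
    """Hartley measure H(E) = log2|E| of a finite, nonempty set E (in bits)."""
    return math.log2(len(alternatives))

def normalized_hartley(alternatives, universe):
    """NH(E) = H(E) / log2|X|, Eq. (2.14); assumes |X| >= 2."""
    return hartley(alternatives) / math.log2(len(universe))

def information(before, after):
    """I_H(E, E') = log2(|E| / |E'|), Eq. (2.15); E' must be a nonempty subset of E."""
    if not set(after) <= set(before):
        raise ValueError("E' must be a subset of E")
    return hartley(before) - hartley(after)

# Hypothetical data: a universal set of 16 alternatives, 8 of them possible.
X = set(range(16))
E = {0, 1, 2, 3, 4, 5, 6, 7}
E_prime = {2, 5}

print(hartley(E))                # 3.0
print(normalized_hartley(E, X))  # 0.75
print(information(E, E_prime))   # 2.0
print(information(E, {5}))       # 3.0, equal to H(E)
```

The last line illustrates the remark above: reducing $E$ to a single alternative yields exactly $H(E)$ bits of information.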

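Consider now two universal sets, $X$ and $Y$, and assume that a relation $R \subseteq X \times Y$ describes a set of possible alternatives in some situation of interest. Consider further the sets

$$R_X = \{x \in X \mid \langle x, y \rangle \in R \text{ for some } y \in Y\},$$

$$R_Y = \{y \in Y \mid \langle x, y \rangle \in R \text{ for some } x \in X\},$$

which are usually referred to as projections of $R$ on sets $X$, $Y$, respectively. Then three distinct Hartley measures are applicable, $H(R_X)$, $H(R_Y)$, and $H(R)$, which are defined on the power sets of $X$, $Y$, and $X \times Y$, respectively. The first two,

$$H(R_X) = \log_2 |R_X|, \tag{2.16}$$

$$H(R_Y) = \log_2 |R_Y|, \tag{2.17}$$

are called simple or marginal Hartley measures. The third one,

$$H(R) = \log_2 |R|, \tag{2.18}$$

is called a joint Hartley measure.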

Two additional Hartley measures are defined,

$$H(R_X \mid R_Y) = \log_2 \frac{|R|}{|R_Y|}, \tag{2.19}$$

$$H(R_Y \mid R_X) = \log_2 \frac{|R|}{|R_X|}, \tag{2.20}$$

32 2. CLASSICAL POSSIBILITY-BASED UNCERTAINTY THEORY

which are called conditional Hartley measures. These definitions can be generalized by restricting the set of possible conditions to $R'_Y \subseteq R_Y$ and $R'_X \subseteq R_X$, respectively. The generalized definitions are:

$$H(R_X \mid R'_Y) = \log_2 \frac{|R \cap (R_X \times R'_Y)|}{|R'_Y|}, \tag{2.21}$$

$$H(R_Y \mid R'_X) = \log_2 \frac{|R \cap (R'_X \times R_Y)|}{|R'_X|}. \tag{2.22}$$

Observe that the ratio $|R|/|R_Y|$ in Eq. (2.19) represents the average number of elements of $R_X$ that are possible alternatives under the condition that an element of $R_Y$ has already been selected. This means that $H(R_X \mid R_Y)$ measures the average nonspecificity regarding possible choices from $R_X$ for all possible choices from $R_Y$. Function $H(R_Y \mid R_X)$ defined by Eq. (2.20) clearly has a similar meaning, with the roles of sets $R_X$ and $R_Y$ exchanged. The generalized forms of conditional Hartley measures defined by Eqs. (2.21) and (2.22) obviously have the same meaning under the restricted sets of possible conditions $R'_Y$ and $R'_X$, respectively.

The marginal, joint, and conditional Hartley measures are related in numerous ways. To describe these various relations generically, it is useful (and a common practice) to identify only the universal sets involved and not the actual subsets of possible alternatives. That is, the generic symbols $H(X)$, $H(Y)$, $H(X \times Y)$, $H(X \mid Y)$, and $H(Y \mid X)$ are used instead of their specific counterparts $H(R_X)$, $H(R_Y)$, $H(R)$, $H(R_X \mid R_Y)$, and $H(R_Y \mid R_X)$, respectively. As is shown later in this book, the generic descriptions of the relations have the same form in every uncertainty theory, even though the related entities are specific to each theory and change from theory to theory.

The equations

$$H(X \mid Y) = H(X \times Y) - H(Y), \tag{2.23}$$

$$H(Y \mid X) = H(X \times Y) - H(X), \tag{2.24}$$

which follow immediately from Eqs. (2.19) and (2.20), express in generic form the relationship between marginal, joint, and conditional Hartley measures. As is demonstrated later in this book, these important equations hold in every uncertainty theory when the Hartley measure is replaced with its counterpart in the other theory.
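As a rough illustration of how the marginal, joint, and conditional Hartley measures fit together, the following sketch computes all five from a relation given as a set of pairs and checks Eqs. (2.23) and (2.24). The relation `R` and the helper names are hypothetical and chosen only for this example.

```python
import math

def projections(R):
    """Projections R_X and R_Y of a relation R given as a set of (x, y) pairs."""
    R_X = {x for x, _ in R}
    R_Y = {y for _, y in R}
    return R_X, R_Y

def hartley_measures(R):
    """Marginal, joint, and conditional Hartley measures of R, Eqs. (2.16)-(2.20)."""
    R_X, R_Y = projections(R)
    return {
        "H(X)": math.log2(len(R_X)),
        "H(Y)": math.log2(len(R_Y)),
        "H(XxY)": math.log2(len(R)),
        "H(X|Y)": math.log2(len(R) / len(R_Y)),
        "H(Y|X)": math.log2(len(R) / len(R_X)),
    }

# Hypothetical relation with four possible pairs.
R = {("a", 1), ("a", 2), ("b", 2), ("c", 3)}
m = hartley_measures(R)

# Eqs. (2.23) and (2.24): conditional = joint - marginal (up to rounding).
print(math.isclose(m["H(X|Y)"], m["H(XxY)"] - m["H(Y)"]))  # True
print(math.isclose(m["H(Y|X)"], m["H(XxY)"] - m["H(X)"]))  # True
```

The comparison uses `math.isclose` because the two sides are computed through different floating-point operations.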

If possible alternatives from $X$ do not depend on selections from $Y$, and vice versa, then $R = X \times Y$ and the sets $R_X$ and $R_Y$ are called noninteractive. Then, clearly,

$$H(X \mid Y) = H(X), \tag{2.25}$$

$$H(Y \mid X) = H(Y), \tag{2.26}$$

$$H(X \times Y) = H(X) + H(Y). \tag{2.27}$$

2.2. HARTLEY MEASURE OF UNCERTAINTY FOR FINITE SETS 33

In the general case, when sets $R_X$ and $R_Y$ are not necessarily noninteractive, these equations become the inequalities

$$H(X \mid Y) \le H(X), \tag{2.28}$$

$$H(Y \mid X) \le H(Y), \tag{2.29}$$

$$H(X \times Y) \le H(X) + H(Y). \tag{2.30}$$

The following functional, which is usually referred to as information transmission, is a useful indicator of the strength of constraint between possible alternatives in sets $X$ and $Y$:

$$T_H(X, Y) = H(X) + H(Y) - H(X \times Y). \tag{2.31}$$

When the sets are noninteractive, $T_H(X, Y) = 0$; otherwise, $T_H(X, Y) > 0$. Using Eqs. (2.23) and (2.24), $T_H(X, Y)$ can also be expressed in terms of the conditional uncertainties:

$$T_H(X, Y) = H(X) - H(X \mid Y), \tag{2.32}$$

$$T_H(X, Y) = H(Y) - H(Y \mid X). \tag{2.33}$$

The maximum value, $\hat{T}_H(X, Y)$, of information transmission associated with relations $R \subseteq X \times Y$ is obtained when

$$H(X \mid Y) = H(Y \mid X) = 0.$$

This means that

$$H(X \times Y) - H(Y) = 0 \quad \text{and} \quad H(X \times Y) - H(X) = 0$$

and, hence,

$$H(X \times Y) = H(X) = H(Y).$$

This implies that $|R| = |R_X| = |R_Y|$. These equalities can be satisfied only for $|R| = 1, 2, \ldots, \min\{|X|, |Y|\}$. Clearly, the largest value of information transmission is obtained for

$$|R| = |R_X| = |R_Y| = \min\{|X|, |Y|\}.$$

Hence,

$$\hat{T}_H(X, Y) = \min\{\log_2 |X|, \log_2 |Y|\}. \tag{2.34}$$


The normalized information transmission, $NT_H$, is then defined by the formula

$$NT_H(X, Y) = \frac{T_H(X, Y)}{\hat{T}_H(X, Y)}. \tag{2.35}$$
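The information transmission and its normalized form can be illustrated with a small sketch. It assumes relations are given as sets of pairs and uses two hypothetical relations on $X = Y = \{0, 1, 2, 3\}$: the full Cartesian product (noninteractive sets) and the diagonal (the strongest possible constraint). Function names are illustrative.

```python
import math

def information_transmission(R):
    """T_H(X, Y) = H(X) + H(Y) - H(X x Y), Eq. (2.31), computed from the projections of R."""
    R_X = {x for x, _ in R}
    R_Y = {y for _, y in R}
    return math.log2(len(R_X)) + math.log2(len(R_Y)) - math.log2(len(R))

def max_transmission(X, Y):
    """Maximum transmission min{log2|X|, log2|Y|}, Eq. (2.34)."""
    return min(math.log2(len(X)), math.log2(len(Y)))

def normalized_transmission(R, X, Y):
    """NT_H(X, Y), Eq. (2.35); assumes min(|X|, |Y|) >= 2 so the denominator is nonzero."""
    return information_transmission(R) / max_transmission(X, Y)

X = {0, 1, 2, 3}
Y = {0, 1, 2, 3}
full = {(x, y) for x in X for y in Y}   # no constraint between X and Y
diagonal = {(x, x) for x in X}          # y is fully determined by x

print(information_transmission(full))         # 0.0
print(information_transmission(diagonal))     # 2.0
print(normalized_transmission(diagonal, X, Y))  # 1.0
```

The diagonal relation satisfies $|R| = |R_X| = |R_Y| = \min\{|X|, |Y|\}$, which is the condition under which the maximum in Eq. (2.34) is attained.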

2.2.4. Examples

The meaning of uncertainty measured by the Hartley functional depends on the meaning of the set $E$. For example, when $E$ is a set of predicted states of a variable (from the set $X$ of all states defined for the variable), $H(E)$ is a measure of predictive uncertainty; when $E$ is a set of possible diseases of a patient determined from relevant medical evidence, $H(E)$ is a measure of diagnostic uncertainty; when $E$ is a set of possible answers to an unsettled historical question, $H(E)$ is a measure of retrodictive uncertainty; when $E$ is a set of possible policies, $H(E)$ is a measure of prescriptive uncertainty. The purpose of this section is to illustrate the utility of the Hartley measure on simple examples in some of these application contexts.

EXAMPLE 2.1. Consider a simple dynamic system with four states whose purpose is prediction. Let $S = \{s_1, s_2, s_3, s_4\}$ denote the set of states of the system, and let $R$ denote the state-transition relation on $S^2$ (the set of possible transitions from present states to next states) that is defined in matrix form by the basic possibility function $r_R$ in Figure 2.1a. Entries in the matrix $M_R$ are values $r_R(s_i, s_j)$ for all pairs $\langle s_i, s_j \rangle \in S^2$. All possible transitions from present states to next states (for which $r_R(s_i, s_j) = 1$) are also illustrated by the directed arcs (edges) in the diagram in Figure 2.1b. It is assumed that transitions occur only at specified discrete times. The system is clearly nondeterministic, which means that its predictions inevitably involve some nonspecificity. For convenience, let $S_t$ denote the set of considered states of the system at some specified initial time $t$, and let $S_{t+k}$ for some $k \in \mathbb{N}$ denote the set of considered states of the system at time $t + k$. Clearly, $S_t = S_{t+k} = S$ for any $k \in \mathbb{N}$.

Figure 2.1. Illustration to Examples 2.1 and 2.2. (a) Matrix $M_R$ of the relation $R \subseteq S \times S$; rows correspond to present states (time $t$) and columns to next states (time $t + 1$), both in the order $s_1, s_2, s_3, s_4$. (b) Diagram of the possible state transitions.

$$M_R = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

Since the purpose of the system is prediction, it makes sense to ask the system questions regarding possible future states or sequences of future states. For each question, the system provides us with a particular prediction that, in general, is not fully specific. The Hartley measure allows us to calculate the actual amount of nonspecificity in this prediction. The following are a few examples illustrating the use of the Hartley measure for this purpose:

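(a) Assuming that any of the four states is possible at time $t$, what is the average nonspecificity in predicting the state at time $t + 1$? Applying Eq. (2.23) for $X \times Y = S_{t+1} \times S_t$, the answer is

$$H(S_{t+1} \mid S_t) = H(S_{t+1} \times S_t) - H(S_t) = \log_2 8 - \log_2 4 = 1.$$

(b) Assuming that only states $s_1$ and $s_2$ are possible at time $t$, what is the average nonspecificity in predicting the state at time $t + 1$? Applying Eq. (2.23) for $X \times Y = \{s_1, s_2\} \times S_{t+1}$, the answer is

$$H(S_{t+1} \mid \{s_1, s_2\}) = H(\{s_1, s_2\} \times S_{t+1}) - H(\{s_1, s_2\}) = \log_2 5 - \log_2 2 = 1.32.$$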

(c) Assuming that any of the four states is possible at time $t$, what is the average nonspecificity in predicting the sequence of states of length $n$? For any $n \ge 1$, the answer is given by the formula

$$H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid S_t) = H(S_t \times S_{t+1} \times \cdots \times S_{t+n}) - H(S_t). \tag{2.36}$$

For $n = 1$ this formula becomes the one in Example 2.1a. To apply this formula for any $n \ge 2$, we need to determine the number of possible sequences of the respective lengths. This can be done easily by using the matrix representation of the state-transition relation. For $n = 2$, the total number of possible sequences is obtained by adding all entries in the resulting matrix of the matrix product $M_R \times M_R$. In our example,

$$M_R \times M_R = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \times \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 1 & 0 \\ 1 & 2 & 1 & 0 \\ 0 & 2 & 2 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}.$$

By adding all entries in the resulting matrix $M_R \times M_R$, we obtain 16, and this is exactly the number of possible sequences of length 2. Moreover, the sums of entries in the individual rows of the resulting matrix are equal to the number of possible sequences of length 2 that begin in states assigned to the respective rows. That is, there are 5, 4, 5, 2 possible sequences of length 2 that begin in states $s_1, s_2, s_3, s_4$, respectively. Similarly, the sums of the entries in the individual columns of the matrix are equal to the number of possible sequences of length 2 that terminate in states assigned to the respective columns. The same results apply to sequences of lengths 3, 4, and so on, but we need to perform, respectively,

$$M_R \times M_R \times M_R, \quad (M_R \times M_R \times M_R) \times M_R,$$

and so on.

Determining the number of possible sequences for $n \in \mathbb{N}_{10}$ (that is, $n = 1, 2, \ldots, 10$) and calculating the average predictive nonspecificity for each $n$ in this range by Eq. (2.36), we obtain the following sequence of predictive nonspecificities: 1, 2, 3.04, 4.11, 5.16, 6.21, 7.27, 8.32, 9.37, 10.43. As expected, the predictive nonspecificities increase with $n$. This means qualitatively that long-term predictions by a nondeterministic system are less specific than short-term predictions by the same system.

Assume now that only one state, $s_i$, is possible at time $t$ and we want to calculate again the nonspecificity in predicting the sequence of states of length $n$. In this case,

$$H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid \{s_i\}) = H(\{s_i\} \times S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n}) - H(\{s_i\}).$$

As already mentioned, the number of sequences of states of length $n$ that begin with state $s_i$, which we need for this calculation, is obtained by adding the entries in the respective row of the matrix resulting from the required chain of $n - 1$ matrix products. For $s_i = s_1$ in our example and $n \in \mathbb{N}_{10}$, we obtain the following predictive nonspecificities: 1.58, 2.32, 3.46, 4.46, 5.55, 6.58, 7.65, 8.70, 9.75, 10.81. As expected from the high initial nonspecificity $H(S_{t+1} \mid \{s_1\})$, all these values are above average. On the other hand, the following values for $s_i = s_4$ are all below average: 0, 1, 2, 3.17, 4.17, 5.25, 6.29, 7.35, 8.40, 9.45.
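The counting procedure of Example 2.1 can be sketched in a few lines of plain Python. The matrix is the one shown in Figure 2.1a; the function names are assumptions made for this illustration, and the matrix product is written out by hand to keep the sketch free of external libraries.

```python
import math

# State-transition matrix M_R from Figure 2.1a (rows: present states s1..s4).
M = [
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]

def matmul(A, B):
    """Ordinary matrix product of two square integer matrices."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def sequence_counts(M, n):
    """Row sums of M^n: the number of length-n sequences beginning in each state."""
    power = M
    for _ in range(n - 1):          # a chain of n - 1 matrix products
        power = matmul(power, M)
    return [sum(row) for row in power]

def average_nonspecificity(M, n):
    """Eq. (2.36): log2(number of all sequences) - log2(number of initial states)."""
    return math.log2(sum(sequence_counts(M, n))) - math.log2(len(M))

def nonspecificity_from(M, n, i):
    """Nonspecificity of length-n predictions when only state s_(i+1) is possible at time t."""
    return math.log2(sequence_counts(M, n)[i])

print(sequence_counts(M, 2))            # [5, 4, 5, 2]
print(average_nonspecificity(M, 1))     # 1.0
print(average_nonspecificity(M, 2))     # 2.0
print(round(nonspecificity_from(M, 1, 0), 2))  # 1.58
```

Looping `n` over `range(1, 11)` gives the full lists of values for the average case and for any chosen initial state.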

EXAMPLE 2.2. Consider the same system and the same types of predictions as in Example 2.1. However, let the focus in this example be on the predictive informativeness of the system rather than its predictive nonspecificity. That is, the aim of this example is to calculate the amount of information contained in each prediction of a certain type made by the system. In each case, we need to calculate the maximum amount of predictive nonspecificity, obtained in the face of total ignorance, and the actual amount of predictive nonspecificity associated with the prediction made by the system. The amount of information provided by the system is then defined as the difference between the maximum and actual amounts of predictive nonspecificity.

In general, the distinguishing feature of total ignorance within the classical possibility theory is that all recognized alternatives are possible. In our example, the recognized alternatives are transitions from states to states, each of which is represented by one cell in the matrix in Figure 2.1a. Predictions of states or sequences of states are determined via these transitions. Maximum nonspecificity in each prediction is obtained when all the recognized transitions are possible. When, on the other hand, only one transition from each state is possible, each prediction is fully specific and, hence, the system is deterministic.

The following are examples that illustrate the use of the Hartley measure for calculating informativeness of those types of predictions that are examined in Example 2.1:

(a) Let $H(S_{t+1} \mid S_t)$ have the same meaning as in Example 2.1, and let $\hat{H}(S_{t+1} \mid S_t)$ be the average nonspecificity in predicting the state at time $t + 1$ in the face of total ignorance. Then, the average amount of information, $I_H(S_{t+1} \mid S_t)$, contained in the prediction made by the system (or the informativeness of the system with respect to predicting the next state) is given by the formula

$$I_H(S_{t+1} \mid S_t) = \hat{H}(S_{t+1} \mid S_t) - H(S_{t+1} \mid S_t).$$

Since there are 16 possible transitions in the face of total ignorance,

$$\hat{H}(S_{t+1} \mid S_t) = \log_2 16 - \log_2 4 = 2.$$

From Example 2.1a, $H(S_{t+1} \mid S_t) = 1$. Hence, $I_H(S_{t+1} \mid S_t) = 2 - 1 = 1$.

(b) In this case,

$$\hat{H}(S_{t+1} \mid \{s_1, s_2\}) = \hat{H}(\{s_1, s_2\} \times S_{t+1}) - \hat{H}(\{s_1, s_2\}) = \log_2 8 - \log_2 2 = 2.$$

Then, using the result in Example 2.1b, we have

$$I_H(S_{t+1} \mid \{s_1, s_2\}) = \hat{H}(S_{t+1} \mid \{s_1, s_2\}) - H(S_{t+1} \mid \{s_1, s_2\}) = 2 - 1.32 = 0.68.$$

(c) The number of possible sequences of states of length $n$ in the face of total ignorance is clearly equal to $4^{n+1}$. Hence,

$$\hat{H}(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid S_t) = \hat{H}(S_t \times S_{t+1} \times \cdots \times S_{t+n}) - \hat{H}(S_t) = \log_2 4^{n+1} - \log_2 4 = 2n$$

and consequently,

$$I_H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid S_t) = 2n - H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid S_t).$$


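Using the values of $H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid S_t)$ calculated for $n \in \mathbb{N}_{10}$ in Example 2.1c, we readily obtain the corresponding values of $I_H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid S_t)$: 1, 2, 2.96, 3.89, 4.84, 5.79, 6.73, 7.68, 8.63, 9.57.

When only one state, $s_i$, is possible at time $t$, the number of possible sequences of states of length $n$ in the face of total ignorance is equal to $4^n$, which means that

$$\hat{H}(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid \{s_i\}) = \hat{H}(\{s_i\} \times S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n}) - \hat{H}(\{s_i\}) = \log_2 4^n = 2n.$$

Hence,

$$I_H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid \{s_i\}) = 2n - H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid \{s_i\}).$$

Using the values of $H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid \{s_1\})$ calculated for $n \in \mathbb{N}_{10}$ in Example 2.1c, we obtain the corresponding values of $I_H(S_{t+1} \times S_{t+2} \times \cdots \times S_{t+n} \mid \{s_1\})$: 0.42, 1.68, 2.54, 3.54, 4.45, 5.42, 6.35, 7.30, 8.25, 9.19.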
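The informativeness calculations of Example 2.2 can be sketched as follows. The sketch assumes the matrix of Figure 2.1a and uses the fact, stated in the example, that the maximum nonspecificity for sequences of length $n$ is $2n$ bits (in general $n \log_2 |S|$) both in the average case and for a single known initial state. The function names and the `start` parameter are illustrative assumptions.

```python
import math

# State-transition matrix M_R from Figure 2.1a (rows: present states s1..s4).
M = [
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]

def sequence_counts(M, n):
    """Number of possible length-n sequences beginning in each state (row sums of M^n)."""
    counts = [1] * len(M)
    for _ in range(n):
        counts = [sum(M[i][j] * counts[j] for j in range(len(M)))
                  for i in range(len(M))]
    return counts

def informativeness(M, n, start=None):
    """Maximum minus actual predictive nonspecificity for sequences of length n.

    start=None: any state is possible at time t (average case).
    start=i:    only state s_(i+1) is possible at time t.
    """
    counts = sequence_counts(M, n)
    if start is None:
        actual = math.log2(sum(counts)) - math.log2(len(M))
    else:
        actual = math.log2(counts[start])
    maximum = n * math.log2(len(M))   # total ignorance: every transition possible
    return maximum - actual

print(informativeness(M, 1))                     # 1.0
print(informativeness(M, 2))                     # 2.0
print(round(informativeness(M, 1, start=0), 2))  # 0.42
```

Here the row sums are obtained by repeatedly applying the matrix to a vector of ones, which avoids forming the full matrix powers.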

EXAMPLE 2.3. Consider a system with four variables, $x_1, x_2, x_3, x_4$, which take their values from the set $\{0, 1\}$. These variables are constrained via a particular 4-dimensional relation $R \subseteq \{0, 1\}^4$, but this relation is not known. We only know how the following pairs of the four variables are related: $\langle x_1, x_2 \rangle$, $\langle x_1, x_4 \rangle$, $\langle x_2, x_3 \rangle$, $\langle x_3, x_4 \rangle$. Let $R_{12}, R_{14}, R_{23}, R_{34}$ denote, respectively, these partial relations on $\{0, 1\}^2$, and let $P = \{R_{12}, R_{14}, R_{23}, R_{34}\}$. The partial relations are defined in Figure 2.2a. All of the introduced relations can also be represented by their basic possibility functions. Let $r, r_{12}, r_{14}, r_{23}, r_{34}$ denote these functions.

If relation $R$ were known, the four partial relations (or any of the other partial relations) would be uniquely determined as specific projections of $R$ via the max operation of possibility theory. For example, using the labels introduced for all overall states (elements of the Cartesian product $\{0, 1\}^4$) in Figure 2.2c, we have

$$r_{12}(0, 0) = \max\{r(s_0), r(s_1), r(s_2), r(s_3)\},$$

$$r_{12}(0, 1) = \max\{r(s_4), r(s_5), r(s_6), r(s_7)\},$$

$$r_{12}(1, 0) = \max\{r(s_8), r(s_9), r(s_{10}), r(s_{11})\},$$

$$r_{12}(1, 1) = \max\{r(s_{12}), r(s_{13}), r(s_{14}), r(s_{15})\}.$$

In our case, $R$ is not known and we want to determine it on the basis of information in the partial relations (projections of $R$). This inverse problem, illustrated in Figure 2.2b, is usually referred to as system identification. In general, $R$ cannot be determined uniquely from its projections. We can only determine a family, $\mathcal{R}_P$, of all relations that are consistent with the given projections in set $P$. Clearly, $\mathcal{R}_P \subseteq \mathcal{P}(\{0, 1\}^4)$. It is convenient to determine $\mathcal{R}_P$ in two steps.

Figure 2.2. System identification (Example 2.3). (a) Partial relations, listed as value pairs: $R_{12} = \{00, 01, 11\}$, $R_{14} = \{00, 01, 11\}$, $R_{23} = \{00, 10, 11\}$, $R_{34} = \{00, 01, 10\}$. (b) The identification problem (from the given projections to $R$). (c) Potential states of $R$ (Cartesian product $\{0, 1\}^4$), labeled $s_0 = 0000$, $s_1 = 0001$, ..., $s_{15} = 1111$ in binary order of $x_1 x_2 x_3 x_4$. (d) Cylindric closure, the possible states of $R$ (states that are consistent with $R_{12}, R_{14}, R_{23}, R_{34}$): $s_0 = 0000$, $s_1 = 0001$, $s_4 = 0100$, $s_5 = 0101$, $s_6 = 0110$, $s_{13} = 1101$.

First, we determine the set of all overall states (elements of $\{0, 1\}^4$ in our case) that are possible under the given information. These are states that are consistent with the given projections. In our case, a particular overall state $\langle x_1, x_2, x_3, x_4 \rangle$ is possible if and only if

$$\langle x_1, x_2 \rangle \in R_{12} \text{ and } \langle x_1, x_4 \rangle \in R_{14} \text{ and } \langle x_2, x_3 \rangle \in R_{23} \text{ and } \langle x_3, x_4 \rangle \in R_{34}.$$

The possibility of each overall state $\langle x_1, x_2, x_3, x_4 \rangle \in \{0, 1\}^4$ thus can be determined by the equation

$$r(x_1, x_2, x_3, x_4) = \min\{r_{12}(x_1, x_2), r_{14}(x_1, x_4), r_{23}(x_2, x_3), r_{34}(x_3, x_4)\}.$$

The resulting set of all possible overall states, which is usually called a cylindric closure of the given projections, is shown in Figure 2.2d. The term “cylindric closure” emerged from a classical method for determining the set of all possible overall states from given projections. In this method (less efficient than the one described here), the cylindric extension is constructed for each projection with respect to the remaining dimensions and the intersection of all these cylindric extensions is the cylindric closure. The unknown relation $R$ is guaranteed to be a subset of the cylindric closure.
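The first identification step can be sketched as a direct membership test over all sixteen overall states. The partial relations below are an assumption: they are taken to be the projections of the six states listed in Figure 2.2d, which must coincide with the relations of Figure 2.2a if the cylindric closure is complete with respect to them. Variable indices are 0-based, and the names `PARTIAL`, `cylindric_closure`, and `label` are hypothetical.

```python
from itertools import product

# Assumed partial relations, keyed by 0-based variable index pairs.
PARTIAL = {
    (0, 1): {(0, 0), (0, 1), (1, 1)},  # R12 on <x1, x2>
    (0, 3): {(0, 0), (0, 1), (1, 1)},  # R14 on <x1, x4>
    (1, 2): {(0, 0), (1, 0), (1, 1)},  # R23 on <x2, x3>
    (2, 3): {(0, 0), (0, 1), (1, 0)},  # R34 on <x3, x4>
}

def cylindric_closure(partial, n_vars=4, values=(0, 1)):
    """All overall states whose every relevant pair of values lies in the given projection."""
    return [
        state
        for state in product(values, repeat=n_vars)
        if all((state[i], state[j]) in rel for (i, j), rel in partial.items())
    ]

def label(state):
    """Index k of the label s_k, reading x1 x2 x3 x4 as a binary number."""
    return int("".join(str(bit) for bit in state), 2)

closure = cylindric_closure(PARTIAL)
print([label(state) for state in closure])  # [0, 1, 4, 5, 6, 13]
```

The `all(...)` test is the crisp counterpart of the min equation above: a state has possibility 1 exactly when each of the four partial possibility values is 1.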

Once the set of all possible overall states (the cylindric closure) is determined, the next step is to determine all its subsets that are complete in the sense that they cover all possible states of the given projections. In our example, there are eight such subsets, one of which is the cylindric closure itself:

$$\{s_0, s_1, s_4, s_5, s_6, s_{13}\}, \quad \{s_0, s_1, s_4, s_6, s_{13}\}, \quad \{s_0, s_1, s_5, s_6, s_{13}\}, \quad \{s_0, s_4, s_5, s_6, s_{13}\},$$

$$\{s_1, s_4, s_5, s_6, s_{13}\}, \quad \{s_0, s_1, s_6, s_{13}\}, \quad \{s_0, s_5, s_6, s_{13}\}, \quad \{s_1, s_4, s_6, s_{13}\}.$$

Each of these subsets of the Cartesian product $\{0, 1\}^4$ can be the unknown relation $R$, but we have no basis to decide which one it is. We therefore identified a family, $\mathcal{R}_P$, of all possible overall relations. Each of these relations is both consistent and complete with respect to the given projections in $P$. The identification nonspecificity is given by the Hartley measure

$$H(\mathcal{R}_P) = \log_2 |\mathcal{R}_P| = \log_2 8 = 3.$$
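The second identification step can be sketched by brute force: enumerate the subsets of the cylindric closure and keep those whose projections reproduce every given partial relation. As an assumption, the partial relations are again taken to be the projections of the six states of Figure 2.2d; all names in the sketch are hypothetical.

```python
import math
from itertools import combinations, product

# Assumed partial relations, keyed by 0-based variable index pairs.
PARTIAL = {
    (0, 1): {(0, 0), (0, 1), (1, 1)},  # R12
    (0, 3): {(0, 0), (0, 1), (1, 1)},  # R14
    (1, 2): {(0, 0), (1, 0), (1, 1)},  # R23
    (2, 3): {(0, 0), (0, 1), (1, 0)},  # R34
}

def cylindric_closure(partial, n_vars=4):
    """Overall states consistent with all given projections."""
    return [s for s in product((0, 1), repeat=n_vars)
            if all((s[i], s[j]) in rel for (i, j), rel in partial.items())]

def projection(states, i, j):
    """Projection of a set of overall states onto variables i and j."""
    return {(s[i], s[j]) for s in states}

def complete_subsets(partial):
    """Subsets of the cylindric closure whose projections equal the given relations."""
    closure = cylindric_closure(partial)
    family = []
    for size in range(1, len(closure) + 1):
        for subset in combinations(closure, size):
            if all(projection(subset, i, j) == rel for (i, j), rel in partial.items()):
                family.append(subset)
    return family

def label(state):
    """Index k of the label s_k, reading x1 x2 x3 x4 as a binary number."""
    return int("".join(str(bit) for bit in state), 2)

family = complete_subsets(PARTIAL)
print(len(family))             # 8
print(math.log2(len(family)))  # 3.0
for relation in family:
    print(sorted(label(s) for s in relation))
```

Exhaustive enumeration is adequate here because the closure has only six states (63 nonempty subsets); for larger closures a more selective search would be needed.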
