for distinct real numbers $a_1, a_2, \dots, a_m$ and disjoint events $A_1, A_2, \dots, A_m$, each of positive
probability, whose union is $\Omega$.
Next, let $X$ be any other real-valued random variable on $\Omega$. What is our best guess of
$X$, given $Y$? Think about the problem this way: if we know the value of $Y(\omega)$, we can tell
which event $A_1, A_2, \dots, A_m$ contains $\omega$. This, and only this, known, our best estimate for
X should then be the average value of X over each appropriate event. That is, we should
take
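$$
E(X \mid Y) :=
\begin{cases}
\dfrac{1}{P(A_1)} \displaystyle\int_{A_1} X \, dP & \text{on } A_1 \\[2ex]
\dfrac{1}{P(A_2)} \displaystyle\int_{A_2} X \, dP & \text{on } A_2 \\
\qquad \vdots & \\
\dfrac{1}{P(A_m)} \displaystyle\int_{A_m} X \, dP & \text{on } A_m.
\end{cases}
$$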
We note for this example that
• $E(X \mid Y)$ is a random variable, and not a constant.
• $E(X \mid Y)$ is $\mathcal{U}(Y)$-measurable.
• $\int_A X \, dP = \int_A E(X \mid Y) \, dP$ for all $A \in \mathcal{U}(Y)$.
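The construction and the three properties above can be checked by hand on a small finite sample space. The following is a minimal sketch, not part of the original notes: it assumes a fair six-sided die, takes X to be the face value and Y to be the parity indicator, and the names `omega`, `P`, `conditional_expectation`, and `integral` are all hypothetical choices made for illustration. Exact rational arithmetic is used so that the integral identity can be tested with equality.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: a fair six-sided die (not from the original text).
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}

def X(w):
    return w          # the face value

def Y(w):
    return w % 2      # 1 on odd faces, 0 on even faces

def conditional_expectation(X, Y, omega, P):
    """Return (cond, atoms): cond maps each outcome w to E(X | Y)(w),
    and atoms lists the events A_i = {Y = a_i}."""
    # Group outcomes into the disjoint events on which Y is constant.
    events = {}
    for w in omega:
        events.setdefault(Y(w), []).append(w)
    cond = {}
    for A in events.values():
        prob_A = sum(P[w] for w in A)
        # Average of X over A: (1 / P(A)) * integral of X over A.
        avg = sum(X(w) * P[w] for w in A) / prob_A
        for w in A:
            cond[w] = avg
    return cond, list(events.values())

def integral(f, A):
    """Integral of f over the event A with respect to P."""
    return sum(f(w) * P[w] for w in A)

cond, atoms = conditional_expectation(X, Y, omega, P)

# Every set in U(Y) is a union of atoms; check the integral identity on each.
for r in range(len(atoms) + 1):
    for combo in combinations(atoms, r):
        A = [w for atom in combo for w in atom]
        assert integral(X, A) == integral(lambda w: cond[w], A)
```

In this example E(X | Y) equals 3 on the odd faces and 4 on the even faces, so it is a genuine random variable rather than a constant, and it is constant on each event where Y is constant, which is what U(Y)-measurability amounts to in the finite case.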
Let us take these properties as the definition in the general case:
DEFINITION. Let $Y$ be a random variable. Then $E(X \mid Y)$ is any $\mathcal{U}(Y)$-measurable
random variable such that
$$
\int_A X \, dP = \int_A E(X \mid Y) \, dP \quad \text{for all } A \in \mathcal{U}(Y).
$$
Finally, notice that it is not really the values of Y that are important, but rather just
the σ-algebra it generates. This motivates the next
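DEFINITION. Let $(\Omega, \mathcal{U}, P)$ be a probability space and suppose $\mathcal{V}$ is a σ-algebra, $\mathcal{V} \subseteq \mathcal{U}$.
If $X : \Omega \to \mathbb{R}^n$ is an integrable random variable, we define
$$
E(X \mid \mathcal{V})
$$
to be any random variable on $\Omega$ such that
(i) $E(X \mid \mathcal{V})$ is $\mathcal{V}$-measurable, and
(ii) $\int_A X \, dP = \int_A E(X \mid \mathcal{V}) \, dP$ for all $A \in \mathcal{V}$.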
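As a quick sanity check on conditions (i) and (ii), it may help to look at the two extreme choices of the σ-algebra. The sketch below is an added illustration rather than part of the original text, and the equalities hold almost surely.

```latex
% Smallest case: the trivial sigma-algebra carries no information.
% A V-measurable random variable must be a constant c, and (ii) with A = \Omega
% forces c = \int_\Omega X \, dP.
\mathcal{V} = \{\emptyset, \Omega\}
\quad\Longrightarrow\quad
E(X \mid \mathcal{V}) = \int_\Omega X \, dP = E(X).

% Largest case: X itself is U-measurable and satisfies (ii) trivially.
\mathcal{V} = \mathcal{U}
\quad\Longrightarrow\quad
E(X \mid \mathcal{V}) = X.
```

With no information the best estimate is the overall mean, and with full information the best estimate is the random variable itself.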
Interpretation. We can understand $E(X \mid \mathcal{V})$ as follows. We are given the “information”
available in a σ-algebra $\mathcal{V}$, from which we intend to build an estimate of the random
variable $X$. Condition (i) in the definition requires that $E(X \mid \mathcal{V})$ be constructed from the