
with n. With two variables, there are three dags; with three variables
there are 25 dags; with five variables there are about 29,000 dags; and with ten
variables there are about 4.2 × 10^18 possible models. As can be seen in (6.6),
this grows exponentially in n.
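Equation (6.6) is presumably Robinson's recursion for counting dags on n labelled nodes. As a sketch (the function name num_dags is ours, not the book's), the counts quoted above can be reproduced directly:

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def num_dags(n: int) -> int:
    """Robinson's recursion: count the dags on n labelled nodes.

    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n,k) * 2^(k(n-k)) * a(n-k), a(0) = 1,
    where k counts the nodes with no incoming arc.
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
        for k in range(1, n + 1)
    )


for n in (2, 3, 5, 10):
    print(n, num_dags(n))
```

Running this gives 3 dags for two variables, 25 for three, 29,281 for five, and roughly 4.2 × 10^18 for ten, matching the figures in the text. The exponential factor 2^(k(n-k)) in each term is what drives the explosive growth of the model space.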
This problem of exponential growth of the model space does not go away nicely,
and we will return to it in Chapter 8 when we address metric learners of causal
structure. In the meantime, we will introduce a heuristic search method for learning
causal structure which is effective and useful.
6.3 Conditional independence learners
We can imagine a variety of different heuristic devices that might be brought to bear
upon the search problem, and in particular that might be used to reduce the size of
the space. Thus, if we had partial prior knowledge of some of the causal relations
between variables, or prior knowledge of temporal relations between variables, that
could rule out a great many possible models. We will consider the introduction of
specific prior information later, in the section on adaptation (Section 9.4).
But there must also be methods of learning causal structure which do not depend
on any special background knowledge: humans (and other animals), after all, learn
about causality from an early age, and in the first instance without much background.
Evolution may have built some understanding into us from the start, but it is also
clear that our individual learning ability is highly flexible, allowing us severally and
communally to adapt ourselves to a very wide range of environments. We should like
to endow our machine learning systems with such abilities, for we should like our
systems to be capable of supporting autonomous agency, as we argued in Chapter 1.
One approach to learning causal structure directly is to employ experimentation
in addition to observation: whereas observing a correlation between X and Y
guarantees that there is some causal relation between them (via the Common Cause
Principle), any of a large variety of causal relations will suffice to explain it. If,
however, we intervene, changing the state of X, and subsequently see a correlated
change in the state of Y, then we can rule out both Y being a cause of X and some
common cause being the sole explanation. So, experimental learning is clearly a
more powerful instrument for learning causal structure.
Our augmented model for causal reasoning of Chapter 3 suggests that learning
from experimental data is a special variety of learning from observational data; it is,
namely, learning from observational samples taken over the augmented model. Note
that adding the intervention variable I_X to the common causal structure of Figure
6.1 (b) yields an augmented structure in which I_X is a new parent of X. Observations
of I_X, without observations of X itself, can be interpreted as causal manipulations
of X. And, clearly, in such cases experimental data interpreted as an observation
in the augmented model will find no dependency between the causal intervention
and Y, since the intervening v-structure blocks the path between them.
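This can be checked with a small simulation. The sketch below assumes a linear Gaussian common-cause model with our own variable names (Z the common cause of X and Y, I_X the intervention variable); it is illustrative, not the book's example. Observationally X and Y are correlated, but the intervention I_X is independent of Y, because the v-structure I_X -> X <- Z blocks the only path between them when X is not conditioned on:

```python
import random


def corr(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


random.seed(0)
N = 10_000

# Common cause Z of both X and Y (linear Gaussian, purely illustrative).
z = [random.gauss(0, 1) for _ in range(N)]
y = [zi + random.gauss(0, 1) for zi in z]

# Observational regime: X is generated from Z as usual.
x_obs = [zi + random.gauss(0, 1) for zi in z]

# Experimental regime: the intervention I_X sets X outright,
# severing the Z -> X arc in the manipulated model.
i_x = [random.gauss(0, 1) for _ in range(N)]
x_int = list(i_x)

print(corr(x_obs, y))  # substantial: the common cause Z induces correlation
print(corr(i_x, y))    # near zero: the v-structure I_X -> X <- Z blocks the path
```

The first correlation is large (about 0.5 in theory for these coefficients), while the second hovers near zero, which is exactly the signature that lets experimental data rule out the common-cause-only explanation.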
© 2004 by Chapman & Hall/CRC Press LLC