EBNN generalizes more accurately than standard BACKPROPAGATION, especially
when training data is scarce. For example, after 30 training examples, EBNN
achieved a root-mean-squared error of 5.5 on a separate set of test data,
compared to an error of 12.0 for BACKPROPAGATION. Mitchell and Thrun (1993a)
describe applying EBNN to learning to control a simulated mobile robot, in
which the domain theory consists of neural networks that predict the effects
of various robot actions on the world state. Again, EBNN, using an
approximate, previously learned domain theory, outperformed BACKPROPAGATION.
Here BACKPROPAGATION required approximately 90 training episodes to reach the
level of performance achieved by EBNN after 25 training episodes. O'Sullivan
et al. (1997) and Thrun (1996)
describe several other applications of EBNN to real-world robot perception and
control tasks, in which the domain theory consists of networks that predict the
effects of actions for an indoor mobile robot using sonar, vision, and laser range
sensors.
EBNN bears an interesting relation to other explanation-based learning
methods, such as PROLOG-EBG described in Chapter 11. Recall from that chapter that
PROLOG-EBG also constructs explanations (predictions of example target values)
based on a domain theory. In PROLOG-EBG the explanation is constructed from a
domain theory consisting of Horn clauses, and the target hypothesis is refined by
calculating the weakest conditions under which this explanation holds. Relevant
dependencies in the explanation are thus captured in the learned Horn clause
hypothesis. EBNN constructs an analogous explanation, but it is based on a domain
theory consisting of neural networks rather than Horn clauses. As in PROLOG-EBG,
relevant dependencies are then extracted from the explanation and used to refine
the target hypothesis. In the case of EBNN, these dependencies take the form
of derivatives because derivatives are the natural way to represent dependencies
in continuous functions such as neural networks. In contrast, the natural way to
represent dependencies in symbolic explanations or logical proofs is to describe
the set of examples to which the proof applies.
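To make this concrete, the following sketch shows how such derivatives might be extracted from a neural-network domain theory. It uses JAX for automatic differentiation; the tiny network, its parameters, and the example point are hypothetical placeholders rather than anything from the original EBNN systems.

```python
# Minimal sketch of extracting EBNN-style "dependencies" from a
# neural-network domain theory using JAX. The architecture, parameters,
# and example point are hypothetical placeholders.
import jax
import jax.numpy as jnp

def domain_theory(params, x):
    """A tiny feedforward network standing in for one domain-theory
    network: it maps a state x to a predicted target value."""
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return (h @ w2 + b2).squeeze()

# Hypothetical parameters for a 3-input, 4-hidden-unit network.
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = (jax.random.normal(k1, (3, 4)), jnp.zeros(4),
          jax.random.normal(k2, (4, 1)), jnp.zeros(1))

x = jnp.array([0.5, -1.0, 2.0])      # one training instance

# The "explanation" is the domain theory's prediction for x; the
# extracted dependencies are the derivatives of that prediction with
# respect to each input feature.
predicted_value = domain_theory(params, x)
dependencies = jax.grad(domain_theory, argnums=1)(params, x)
print(predicted_value, dependencies)  # slope targets for the learner
```

In EBNN these extracted slopes serve as additional training constraints on the target network, alongside the observed target values.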
There are several differences in capabilities between EBNN and the symbolic
explanation-based methods of Chapter 11. The main difference is that EBNN
accommodates imperfect domain theories, whereas PROLOG-EBG does not. This
difference follows from the fact that EBNN is built on the inductive mechanism
of fitting the observed training values, using the domain theory only as an
additional constraint on the learned hypothesis.
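To illustrate how fitting observed values while treating the domain theory as a weighted constraint might look, here is a minimal sketch, again in JAX. The squared-error form, the per-example weights mu, and all shapes and data are illustrative assumptions; the published EBNN criterion differs in detail.

```python
# Sketch of an EBNN-style training criterion: fit observed target values
# (the inductive component) and the domain theory's derivatives (the
# analytical component), with per-example weights mu that can be lowered
# where the domain theory proves inaccurate. All data are placeholders.
import jax
import jax.numpy as jnp

def target_net(params, x):
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return (h @ w2 + b2).squeeze()

def ebnn_loss(params, xs, values, slopes, mus):
    def per_example(x, v, s, mu):
        value_err = (target_net(params, x) - v) ** 2        # inductive fit
        slope = jax.grad(target_net, argnums=1)(params, x)  # net's derivatives
        slope_err = jnp.sum((slope - s) ** 2)               # analytical fit
        return value_err + mu * slope_err
    return jnp.sum(jax.vmap(per_example)(xs, values, slopes, mus))

# Hypothetical data: 5 examples with 3 features, observed values,
# domain-theory slopes, and per-example weights on the analytical term.
ks = jax.random.split(jax.random.PRNGKey(1), 5)
params = (jax.random.normal(ks[0], (3, 4)), jnp.zeros(4),
          jax.random.normal(ks[1], (4, 1)), jnp.zeros(1))
xs = jax.random.normal(ks[2], (5, 3))
values = jax.random.normal(ks[3], (5,))
slopes = jax.random.normal(ks[4], (5, 3))
mus = jnp.full(5, 0.5)

grads = jax.grad(ebnn_loss)(params, xs, values, slopes, mus)  # for descent
```

Gradient descent on such a loss fits values and slopes simultaneously; driving an example's mu toward zero discounts the domain theory where its predictions have proved unreliable, which is how an imperfect theory can still help without dominating the observed data.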
A second important difference follows from the fact that PROLOG-EBG learns a
growing set of Horn clauses, whereas EBNN learns a fixed-size neural network.
As discussed in Chapter 11, one difficulty in learning sets of Horn clauses is
that the cost of classifying a new instance
grows as learning proceeds and new Horn clauses are added. This problem is
avoided in EBNN because the fixed-size target network requires constant time to
classify new instances. However, the fixed-size neural network suffers the
corresponding disadvantage that it may be unable to represent sufficiently
complex functions, whereas a growing set of Horn clauses can represent
increasingly complex functions. Mitchell and Thrun (1993b) provide a more
detailed discussion of
the relationship between EBNN and symbolic explanation-based learning methods.