
128
Chapter 4. Programs
§19. Declarative Error Diagnosis
129
Note that, by means
of
the metacalls, succeed and fail, the top-down algorithm
has decoupled the diagnosis
of
the program from whatever transformation,
compilation
or
advanced control was applied to the program. In other words, the
top-down algorithm is essentially independent
of
the underlying computational
behaviour
of
the logic programming system, which could therefore be changed or
improved without affecting the diagnoser.
We
have tried to minimise the number
of
oracle calls made by the top-down
algorithm, without being too concerned about its computational complexity.
Nevertheless, this algorithm makes rather extravagant use
of
metacalls and hence
could be prohibitively expensive for some programs. It would be possible to
reduce this cost by building the erroneous refutation (or finitely failed tree) once at
the beginning
of
the diagnosis and then searching this refutation (or tree) for the
error. Wrong could be easily adapted to this approach, but missing would seem to
require more extensive changes, along the lines
of
[6].
The top-down algorithm for diagnosing missing answers for definite programs
is similar to Shapiro's algorithm for missing answers [92, p.55]. We now briefly
compare the top-down algorithm for diagnosing incorrect answers for definite
programs with the single-stepping and divide-and-query algorithms
of
Shapiro [92].
For this comparison, it is convenient to assume that, for all three algorithms, the
final computation tree
of
the erroneous computation is first constructed and the
algorithms search this tree for the incorrect clause instance. The final computation
tree is the AND-tree corresponding to the refutation obtained by applying all the
mgu's
used in the refutation to all the nodes in the tree. For simplicity, we also
assume that the goal (body) is a single atom. Thus some instance
of
this atom is
the root
of
the final computation tree and its children are instances
of
atoms in the
body
of
the input clause invoked by the goal.
The single-stepping algorithm finds the error by doing a post-order traversal
of
the final computation tree. Suppose the algorithm has just queried the oracle about
all the children
of
some node and found them to be valid.
It
then queries the
oracle about the node itself.
If
this node is not valid, then an incorrect clause
instance has been found.
If
this node is valid, then the algorithm continues the
post-order traversal. This algorithm is essentially a bottom-up algorithm. It has
the disadvantage that its worst case query complexity is equal to the number
of
nodes in the tree. A version
of
the single-stepping algorithm is as follows.
wrong(v and w, x)
~
wrong(v, x)
wrong(v and w, x)
~
wrong(w, x)
wrong(x, z)
~
clause(x,
Xl
if
y)
1\
succeed(y, y)
1\
wrong(y, z)
wrong(x,
Xl
if
y)
~
unsatisfiable(x,
Xl)
1\
clause(x,
Xl
if
y)
1\
valid(y, y)
The divide-and-query algorithm is an improvement in that its query complexity
is optimal to within a constant factor. The idea
of
this algorithm is as follows. It
finds a node
in
the tree such that the weight
of
the subtree rooted at that node is
as
close as possible to half the weight
of
the entire tree.
It
then queries the oracle
about this node.
If
this node is not valid, then the algorithm recursively enters the
subtree rooted at this node.
If
not, the algorithm calculates a new
"middle"
node
for the entire tree with this subtree deleted. It is shown in [92] that this algorithm
has logarithmic query complexity. Unfortunately,
it
is rarely possible to divide the
tree in half. Usually, we must settle for a
"middle"
node which is the root
of
a
subtree with somewhat smaller weight. This detracts from the performance
of
the
divide-and-query algorithm.
If
the tree has n nodes and branching factor b, then
the worst case query complexity is blog n (not log n, as a superficial analogy with
the binary search algorithm might suggest).
The top-down algorithm searches the final computation tree as follows. First,
the oracle is queried about the root node, which
is
presumably not valid. It then
queries each child
of
the root node in turn.
If
they are all valid, then an incorrect
clause instance has been found. Otherwise,
it
enters the subtree rooted at the
leftmost child which
it
finds to be not valid and continues the search in the same
way
in
this subtree. The top-down algorithm does indeed search the tree in a top-
down fashion. Note that it would be easy to add the flexibility
of
querying the
children in some preferred order.
If
the final computation tree has branching factor
b and height h, then the worst case query complexity
of
the top-down algorithm is
bh.
We
now compare in more detail the query complexity
of
the top-down and
divide-and-query algorithms. First, the top-down algorithm can perform worse
than the divide-and-query algorithm. Suppose the tree is linear and the error is
right at the bottom
of
the tree. The top-down algorithm queries all nodes in the
tree, while the divide-and-query algorithm only queries the logarithm
of
this
number. On the other hand, suppose the tree has two subtrees, the one on the right
being very much greater than the one on the left, and the only error is in the left
subtree. The top-down algorithm will quickly find the error by immediately