Lloyd J.W. Foundations of Logic Programming


Chapter 1. Preliminaries

§2. First Order Theories

Because clauses are so common in logic programming, it will be convenient to adopt a special clausal notation. Throughout, we will denote the clause

$\forall x_1 \ldots \forall x_s\, (A_1 \lor \ldots \lor A_k \lor \neg B_1 \lor \ldots \lor \neg B_n)$

where $A_1,\ldots,A_k,B_1,\ldots,B_n$ are atoms and $x_1,\ldots,x_s$ are all the variables occurring in these atoms, by

$A_1,\ldots,A_k \leftarrow B_1,\ldots,B_n$

Thus, in the clausal notation, all variables are assumed to be universally quantified, the commas in the antecedent $B_1,\ldots,B_n$ denote conjunction and the commas in the consequent $A_1,\ldots,A_k$ denote disjunction. These conventions are justified because

$\forall x_1 \ldots \forall x_s\, (A_1 \lor \ldots \lor A_k \lor \neg B_1 \lor \ldots \lor \neg B_n)$

is equivalent to

$\forall x_1 \ldots \forall x_s\, (A_1 \lor \ldots \lor A_k \leftarrow B_1 \land \ldots \land B_n)$

To illustrate the application of the various concepts in this chapter to logic programming, we now define definite programs and definite goals.

Definition A definite program clause is a clause of the form

$A \leftarrow B_1,\ldots,B_n$

which contains precisely one atom (viz. A) in its consequent. A is called the head and $B_1,\ldots,B_n$ is called the body of the program clause.

Definition A unit clause is a clause of the form

$A \leftarrow$

that is, a definite program clause with an empty body.

The informal semantics of $A \leftarrow B_1,\ldots,B_n$ is "for each assignment of each variable, if $B_1,\ldots,B_n$ are all true, then A is true". Thus, if n>0, a program clause is conditional. On the other hand, a unit clause $A \leftarrow$ is unconditional. Its informal semantics is "for each assignment of each variable, A is true".

Definition A definite program is a finite set of definite program clauses.

Definition In a definite program, the set of all program clauses with the same predicate symbol p in the head is called the definition of p.

Example The following program, called slowsort, sorts a list of non-negative integers into a list in which the elements are in increasing order. It is a very inefficient sorting program! However, we will find it most useful for illustrating various aspects of the theory.

In this program, non-negative integers are represented using a constant 0 and a unary function symbol f. The intended meaning of 0 is zero and f is the successor function. We define the powers of f by induction: $f^0(x) = x$ and $f^{n+1}(x) = f(f^n(x))$. Then the non-negative integer n is represented by the term $f^n(0)$. In fact, it will sometimes be convenient simply to denote $f^n(0)$ by n. Lists are represented using a binary function symbol "." (the cons function written infix) and the constant nil representing the empty list. Thus the list [17,22,6,5] would be represented by 17.(22.(6.(5.nil))). We make the usual right associativity convention and write this more simply as 17.22.6.5.nil.

SLOWSORT PROGRAM

sort(x,y) ← sorted(y), perm(x,y)
sorted(nil) ←
sorted(x.nil) ←
sorted(x.y.z) ← x≤y, sorted(y.z)
perm(nil,nil) ←
perm(x.y,u.v) ← delete(u,x.y,z), perm(z,v)
delete(x,x.y,y) ←
delete(x,y.z,y.w) ← delete(x,z,w)
0≤x ←
f(x)≤f(y) ← x≤y

Slowsort contains definitions of five predicate symbols, sort, sorted, perm, delete and ≤ (written infix). The informal semantics of the definition of sort is "if x and y are lists, y is a permutation of x and y is sorted, then y is the sorted version of x". This is clearly a correct top-level description of a sorting program. Similarly, the first clause in the definition of sorted states that "the empty list is sorted". The intended meaning of the predicate symbol delete is that delete(x,y,z) should hold if z is the list obtained by deleting the element x from the list y. The above definition for delete contains obviously correct statements about the delete predicate.
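The declarative reading of slowsort can be mimicked in an ordinary programming language. The following is a minimal sketch in Python, not part of the original text: it assumes Python integers stand in for the terms $f^n(0)$ and Python lists stand in for the cons/nil terms, and the function names (`delete`, `perm`, `is_sorted`, `slowsort`) are chosen here only to mirror the predicate symbols. It enumerates permutations and tests each one, which is a generate-and-test reading of the clauses rather than the search order a resolution-based system would follow.

```python
from typing import Iterator, List, Optional, Tuple


def delete(items: List[int]) -> Iterator[Tuple[int, List[int]]]:
    """Yield every (u, z) with delete(u, items, z): z is items minus one occurrence of u."""
    for i, u in enumerate(items):
        yield u, items[:i] + items[i + 1:]


def perm(items: List[int]) -> Iterator[List[int]]:
    """Yield every list v such that perm(items, v) holds."""
    if not items:                       # perm(nil,nil) <-
        yield []
        return
    for u, z in delete(items):          # perm(x.y,u.v) <- delete(u,x.y,z), perm(z,v)
        for v in perm(z):
            yield [u] + v


def is_sorted(items: List[int]) -> bool:
    """Every adjacent pair is in order; this covers all three sorted clauses."""
    return all(a <= b for a, b in zip(items, items[1:]))


def slowsort(items: List[int]) -> Optional[List[int]]:
    """sort(x,y) <- sorted(y), perm(x,y): return the first sorted permutation."""
    for y in perm(items):
        if is_sorted(y):
            return y
    return None


print(slowsort([17, 22, 6, 5]))  # [5, 6, 17, 22]
```

The sketch inspects up to n! permutations of an n-element list, which is why the program deserves its name.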

Definition A definite goal is a clause of the form

$\leftarrow B_1,\ldots,B_n$

that is, a clause which has an empty consequent. Each $B_i$ (i=1,...,n) is called a subgoal of the goal.

If $y_1,\ldots,y_r$ are the variables of the goal

$\leftarrow B_1,\ldots,B_n$

then this clausal notation is shorthand for

$\forall y_1 \ldots \forall y_r\, (\neg B_1 \lor \ldots \lor \neg B_n)$

or, equivalently,

$\neg \exists y_1 \ldots \exists y_r\, (B_1 \land \ldots \land B_n)$

Example To run slowsort, we give a goal such as

← sort(17.22.6.5.nil,y)

This is understood as a request to find the list y, which is the sorted version of 17.22.6.5.nil.

Definition The empty clause, denoted $\Box$, is the clause with empty consequent and empty antecedent. This clause is to be understood as a contradiction.

Definition A Horn clause is a clause which is either a definite program clause or a definite goal.

§3. INTERPRETATIONS AND MODELS

The declarative semantics of a logic program is given by the usual (model-theoretic) semantics of formulas in first order logic. This section discusses interpretations and models, concentrating particularly on the important class of Herbrand interpretations.

Before we give the main definitions, some motivation is appropriate. In order to be able to discuss the truth or falsity of a formula, it is necessary to attach some meaning to each of the symbols in the formula first. The various quantifiers and connectives have fixed meanings, but the meanings attached to the constants, function symbols and predicate symbols can vary. An interpretation simply consists of some domain of discourse over which the variables range, the assignment to each constant of an element of the domain, the assignment to each function symbol of a mapping on the domain and the assignment to each predicate symbol of a relation on the domain. An interpretation thus specifies a meaning for each symbol in the formula. We are particularly interested in interpretations for which the formula expresses a true statement in that interpretation. Such an interpretation is called a model of the formula. Normally there is some distinguished interpretation, called the intended interpretation, which gives the principal meaning of the symbols. Naturally, the intended interpretation of a formula should be a model of the formula.

First order logic provides methods for deducing the theorems of a theory. These can be characterised (by Gödel's completeness theorem [69], [99]) as the formulas which are logical consequences of the axioms of the theory, that is, they are true in every interpretation which is a model of each of the axioms of the theory. In particular, each theorem is true in the intended interpretation of the theory. The logic programming systems in which we are interested use the resolution rule as the only inference rule.

Suppose we want to prove that the formula

$\exists y_1 \ldots \exists y_r\, (B_1 \land \ldots \land B_n)$

is a logical consequence of a program P. Now resolution theorem provers are refutation systems. That is, the negation of the formula to be proved is added to the axioms and a contradiction is derived. If we negate the formula we want to prove, we obtain the goal

$\leftarrow B_1,\ldots,B_n$

Working top-down from this goal, the system derives successive goals. If the empty clause is eventually derived, then a contradiction has been obtained and later results assure us that

$\exists y_1 \ldots \exists y_r\, (B_1 \land \ldots \land B_n)$

is indeed a logical consequence of P.

From a theorem proving point of view, the only interest is to demonstrate logical consequence. However, from a programming point of view, we are much more interested in the bindings that are made for the variables $y_1,\ldots,y_r$, because these give us the output from the running of the program. In fact, the ideal view of a logic programming system is that it is a black box for computing bindings and our only interest is in its input-output behaviour. The internal workings of the system should be invisible to the programmer. Unfortunately, this situation is not true, to various extents, with current PROLOG systems. Many programs can only be understood in a procedural (i.e. operational) manner, because of the way they use cuts and other non-logical features.

Returning to the slowsort program, from a theorem proving point of view, we can regard the goal ← sort(17.22.6.5.nil,y) as a request to prove that $\exists y$ sort(17.22.6.5.nil,y) is a logical consequence of the program. In fact, we are much more interested that the proof is constructive and provides us with a specific y which makes sort(17.22.6.5.nil,y) true in the intended interpretation.

We now give the definitions of pre-interpretation, interpretation and model.

Definition A pre-interpretation of a first order language L consists of the following:
(a) A non-empty set D, called the domain of the pre-interpretation.
(b) For each constant in L, the assignment of an element in D.
(c) For each n-ary function symbol in L, the assignment of a mapping from $D^n$ to D.

Definition An interpretation I of a first order language L consists of a pre-interpretation J with domain D of L together with the following:
For each n-ary predicate symbol in L, the assignment of a mapping from $D^n$ into {true, false} (or, equivalently, a relation on $D^n$).
We say I is based on J.

Definition Let J be a pre-interpretation of a first order language L. A variable assignment (wrt J) is an assignment to each variable in L of an element in the domain of J.

Definition Let J be a pre-interpretation with domain D of a first order language L and let V be a variable assignment. The term assignment (wrt J and V) of the terms in L is defined as follows:
(a) Each variable is given its assignment according to V.
(b) Each constant is given its assignment according to J.
(c) If $t_1',\ldots,t_n'$ are the term assignments of $t_1,\ldots,t_n$ and f' is the assignment of the n-ary function symbol f, then $f'(t_1',\ldots,t_n') \in D$ is the term assignment of $f(t_1,\ldots,t_n)$.

Definition Let J be a pre-interpretation of a first order language L, V a variable assignment wrt J, and A an atom. Suppose A is $p(t_1,\ldots,t_n)$ and $d_1,\ldots,d_n$ in the domain of J are the term assignments of $t_1,\ldots,t_n$ wrt J and V. We call $A_{J,V} = p(d_1,\ldots,d_n)$ the J-instance of A wrt V. Let $[A]_J = \{A_{J,V} : V \text{ is a variable assignment wrt } J\}$. We call each element of $[A]_J$ a J-instance of A. We also call each $p(d_1,\ldots,d_n)$ a J-instance.

Definition Let I be an interpretation with domain D of a first order language L and let V be a variable assignment. Then a formula in L can be given a truth value, true or false, (wrt I and V) as follows:

(a) If the formula is an atom $p(t_1,\ldots,t_n)$, then the truth value is obtained by calculating the value of $p'(t_1',\ldots,t_n')$, where p' is the mapping assigned to p by I and $t_1',\ldots,t_n'$ are the term assignments of $t_1,\ldots,t_n$ wrt I and V.

(b) If the formula has the form $\neg F$, $F \land G$, $F \lor G$, $F \rightarrow G$ or $F \leftrightarrow G$, then the truth value of the formula is given by the following table:

$F$     $G$     $\neg F$   $F \land G$   $F \lor G$   $F \rightarrow G$   $F \leftrightarrow G$
true    true    false      true          true         true                true
true    false   false      false         true         false               false
false   true    true       false         true         true                false
false   false   true       false         false        true                true

(c) If the formula has the form $\exists x\, F$, then the truth value of the formula is true if there exists $d \in D$ such that F has truth value true wrt I and V(x/d), where V(x/d) is V except that x is assigned d; otherwise, its truth value is false.

(d) If the formula has the form $\forall x\, F$, then the truth value of the formula is true if, for all $d \in D$, we have that F has truth value true wrt I and V(x/d); otherwise, its truth value is false.

Clearly the truth value of a closed formula does not depend on the variable assignment. Consequently, we can speak unambiguously of the truth value of a closed formula wrt an interpretation. If the truth value of a closed formula wrt an interpretation is true (resp., false), we say the formula is true (resp., false) wrt the interpretation.
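The truth value definition is directly executable when the domain is finite. The following Python sketch is an illustration under assumptions that are not in the original text: formulas are encoded as nested tuples such as `("forall", "x", F)` or `("atom", "p", t1, ..., tn)`, an interpretation is a dictionary with the hypothetical keys `domain`, `constants`, `functions` and `predicates`, and the domain is finite so that the quantifier cases can be checked by enumeration.

```python
def term_value(term, interp, assignment):
    """Term assignment: variables via V, constants and function symbols via the pre-interpretation."""
    if isinstance(term, str):                      # a variable or a constant name
        if term in assignment:
            return assignment[term]
        return interp["constants"][term]
    f, *args = term                                # ("f", t1, ..., tn)
    return interp["functions"][f](*(term_value(t, interp, assignment) for t in args))


def truth_value(formula, interp, assignment=None):
    """Truth value of a formula wrt an interpretation and a variable assignment."""
    V = assignment or {}
    op = formula[0]
    if op == "atom":                               # ("atom", "p", t1, ..., tn)
        _, p, *args = formula
        return bool(interp["predicates"][p](*(term_value(t, interp, V) for t in args)))
    if op == "not":
        return not truth_value(formula[1], interp, V)
    if op in ("and", "or", "implies", "iff"):
        f = truth_value(formula[1], interp, V)
        g = truth_value(formula[2], interp, V)
        return {"and": f and g, "or": f or g, "implies": (not f) or g, "iff": f == g}[op]
    if op in ("exists", "forall"):                 # ("exists", "x", F)
        _, x, body = formula
        results = (truth_value(body, interp, {**V, x: d}) for d in interp["domain"])
        return any(results) if op == "exists" else all(results)
    raise ValueError(f"unknown formula: {formula!r}")


# Domain {0, 1, 2}; p(x, y) holds when y is the cyclic successor of x.
interp = {
    "domain": range(3),
    "constants": {},
    "functions": {},
    "predicates": {"p": lambda x, y: y == (x + 1) % 3},
}
forall_exists = ("forall", "x", ("exists", "y", ("atom", "p", "x", "y")))
exists_forall = ("exists", "y", ("forall", "x", ("atom", "p", "x", "y")))
print(truth_value(forall_exists, interp))  # True
print(truth_value(exists_forall, interp))  # False
```

Swapping the order of the quantifiers changes the truth value, so the order of quantification matters even in a three-element domain.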

Definition Let I be an interpretation for a first order language L and let W be a formula in L.
We say W is satisfiable in I if $\exists(W)$ is true wrt I.
We say W is valid in I if $\forall(W)$ is true wrt I.
We say W is unsatisfiable in I if $\exists(W)$ is false wrt I.
We say W is nonvalid in I if $\forall(W)$ is false wrt I.

Definition Let I be an interpretation of a first order language L and let F be a closed formula of L. Then I is a model for F if F is true wrt I.

Example Consider the formula $\forall x \exists y\, p(x,y)$ and the following interpretation I. Let the domain D be the non-negative integers and let p be assigned the relation <. Then I is a model of the formula, as is easily seen. In I, the formula expresses the true statement that "for every non-negative integer, there exists a non-negative integer which is strictly larger than it". On the other hand, I is not a model of the formula $\exists y \forall x\, p(x,y)$.

The axioms of a first order theory are a designated subset of closed formulas in the language of the theory. For example, the first order theories in which we are most interested have the clauses of a program as their axioms.

Definition Let T be a first order theory and let L be the language of T. A model for T is an interpretation for L which is a model for each axiom of T.

If T has a model, we say T is consistent.

The concept of a model of a closed formula can easily be extended to a model of a set of closed formulas.

Definition Let S be a set of closed formulas of a first order language L and let I be an interpretation of L. We say I is a model for S if I is a model for each formula of S.

Note that, if $S = \{F_1,\ldots,F_n\}$ is a finite set of closed formulas, then I is a model for S iff I is a model for $F_1 \land \ldots \land F_n$.

Definition Let S be a set of closed formulas of a first order language L.
We say S is satisfiable if L has an interpretation which is a model for S.
We say S is valid if every interpretation of L is a model for S.
We say S is unsatisfiable if no interpretation of L is a model for S.
We say S is nonvalid if L has an interpretation which is not a model for S.

Now we can give the definition of the important concept of logical consequence.

Definition Let S be a set of closed formulas and F be a closed formula of a first order language L. We say F is a logical consequence of S if, for every interpretation I of L, I is a model for S implies that I is a model for F.

Note that if $S = \{F_1,\ldots,F_n\}$ is a finite set of closed formulas, then F is a logical consequence of S iff $F_1 \land \ldots \land F_n \rightarrow F$ is valid.

Proposition 3.1 Let S be a set of closed formulas and F be a closed formula of a first order language L. Then F is a logical consequence of S iff $S \cup \{\neg F\}$ is unsatisfiable.

Proof Suppose that F is a logical consequence of S. Let I be an interpretation of L and suppose I is a model for S. Then I is also a model for F. Hence I is not a model for $S \cup \{\neg F\}$. (An interpretation which is not a model for S is trivially not a model for $S \cup \{\neg F\}$ either.) Thus $S \cup \{\neg F\}$ is unsatisfiable.

Conversely, suppose $S \cup \{\neg F\}$ is unsatisfiable. Let I be any interpretation of L. Suppose I is a model for S. Since $S \cup \{\neg F\}$ is unsatisfiable, I cannot be a model for $\neg F$. Thus I is a model for F and so F is a logical consequence of S. ∎

Example Let $S = \{p(a),\ \forall x\,(p(x) \rightarrow q(x))\}$ and F be q(a). We show that F is a logical consequence of S. Let I be any model for S. Thus p(a) is true wrt I. Since $\forall x\,(p(x) \rightarrow q(x))$ is true wrt I, so is $p(a) \rightarrow q(a)$. Hence q(a) is true wrt I.

Applying these definitions to programs, we see that when we give a goal G to the system, with program P loaded, we are asking the system to show that the set of clauses $P \cup \{G\}$ is unsatisfiable. In fact, if G is the goal $\leftarrow B_1,\ldots,B_n$ with variables $y_1,\ldots,y_r$, then proposition 3.1 states that showing $P \cup \{G\}$ unsatisfiable is exactly the same as showing that $\exists y_1 \ldots \exists y_r\, (B_1 \land \ldots \land B_n)$ is a logical consequence of P.

Thus the basic problem is that of determining the unsatisfiability, or otherwise, of $P \cup \{G\}$, where P is a program and G is a goal. According to the definition, this implies showing every interpretation of $P \cup \{G\}$ is not a model. Needless to say, this seems to be a formidable problem. However, it turns out that there is a much smaller and more convenient class of interpretations, which are all that need to be investigated to show unsatisfiability. These are the so-called Herbrand interpretations, which we now proceed to study.

Definition A ground term is a term not containing variables. Similarly, a ground atom is an atom not containing variables.

Definition Let L be a first order language. The Herbrand universe $U_L$ for L is the set of all ground terms, which can be formed out of the constants and function symbols appearing in L. (In the case that L has no constants, we add some constant, say, a, to form ground terms.)

Example Consider the program

p(x) ← q(f(x),g(x))
r(y) ←

which has an underlying first order language L based on the predicate symbols p, q and r and the function symbols f and g. Then the Herbrand universe for L is

{a, f(a), g(a), f(f(a)), f(g(a)), g(f(a)), g(g(a)),...}.
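The Herbrand universe is infinite as soon as the language has a function symbol, but its terms can be enumerated level by level. The following Python sketch is only an illustration: the function name `herbrand_universe`, the `depth` bound and the encoding of terms as strings are assumptions made here, not notation from the text. It adds the constant a when no constants are supplied, as the definition prescribes.

```python
from itertools import product
from typing import Dict, Iterable, Set


def herbrand_universe(constants: Iterable[str], functions: Dict[str, int], depth: int) -> Set[str]:
    """Ground terms of nesting depth at most `depth`; `functions` maps each symbol to its arity."""
    terms = set(constants) or {"a"}       # add a constant if the language has none
    for _ in range(depth):
        new_terms = set(terms)
        for f, arity in functions.items():
            for args in product(sorted(terms), repeat=arity):
                new_terms.add(f"{f}({','.join(args)})")
        terms = new_terms
    return terms


universe = herbrand_universe([], {"f": 1, "g": 1}, depth=2)
print(sorted(universe, key=lambda t: (len(t), t)))
# ['a', 'f(a)', 'g(a)', 'f(f(a))', 'f(g(a))', 'g(f(a))', 'g(g(a))']
```

With two unary function symbols the number of terms roughly doubles with each extra level, so any practical use needs a depth bound like the one assumed here.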

Definition Let L be a first order language. The Herbrand base $B_L$ for L is the set of all ground atoms which can be formed by using predicate symbols from L with ground terms from the Herbrand universe as arguments.

Example For the previous example, the Herbrand base for L is

{p(a), q(a,a), r(a), p(f(a)), p(g(a)), q(a,f(a)), q(f(a),a),...}.
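A finite portion of the Herbrand base can be built from a finite set of ground terms by filling every argument position of every predicate symbol in all possible ways. The Python sketch below is illustrative only: `herbrand_base` is a name chosen here, atoms are encoded as strings, and the three ground terms passed in are just the first few members of the universe from the example.

```python
from itertools import product
from typing import Dict, Iterable, Set


def herbrand_base(predicates: Dict[str, int], ground_terms: Iterable[str]) -> Set[str]:
    """All ground atoms over the given terms; `predicates` maps each symbol to its arity."""
    terms = sorted(ground_terms)
    return {
        f"{p}({','.join(args)})"
        for p, arity in predicates.items()
        for args in product(terms, repeat=arity)
    }


base = herbrand_base({"p": 1, "q": 2, "r": 1}, ["a", "f(a)", "g(a)"])
print(len(base))             # 15
print("q(a,f(a))" in base)   # True
```

Three terms give 3 atoms for p, 9 for q and 3 for r, which accounts for the 15 atoms.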

Definition Let L be a first order language. The Herbrand pre-interpretation for L is the pre-interpretation given by the following:
(a) The domain of the pre-interpretation is the Herbrand universe $U_L$.
(b) Constants in L are assigned themselves in $U_L$.
(c) If f is an n-ary function symbol in L, then the mapping from $(U_L)^n$ into $U_L$ defined by $(t_1,\ldots,t_n) \mapsto f(t_1,\ldots,t_n)$ is assigned to f.

An Herbrand interpretation for L is any interpretation based on the Herbrand pre-interpretation for L.

Since, for Herbrand interpretations, the assignment to constants and function symbols is fixed, it is possible to identify an Herbrand interpretation with a subset of the Herbrand base. For any Herbrand interpretation, the corresponding subset of the Herbrand base is the set of all ground atoms which are true wrt the interpretation. Conversely, given an arbitrary subset of the Herbrand base, there is a corresponding Herbrand interpretation defined by specifying that the mapping assigned to a predicate symbol maps some arguments to "true" precisely when the atom made up of the predicate symbol with the same arguments is in the given subset. This identification of an Herbrand interpretation as a subset of the Herbrand base will be made throughout. More generally, each interpretation based on an arbitrary pre-interpretation J can be identified with a subset of J-instances, in a similar way.

Definition Let L be a first order language and S a set of closed formulas of L. An Herbrand model for S is an Herbrand interpretation for L which is a model for S.

It will often be convenient to refer, by abuse of language, to an interpretation of a set S of formulas rather than the underlying first order language from which the formulas come. Normally, we assume that the underlying first order language is defined by the constants, function symbols and predicate symbols appearing in S. With this understanding, we can now refer to the Herbrand universe $U_S$ and Herbrand base $B_S$ of S and also refer to Herbrand interpretations of S as subsets of the Herbrand base of S. In particular, the set of formulas will often be a program P, so that we will refer to the Herbrand universe $U_P$ and Herbrand base $B_P$ of P.

Example We now illustrate these concepts with the slowsort program. This program can be regarded as the set of axioms of a first order theory. The language of this theory is given by the constants 0 and nil, function symbols f and "." and predicate symbols sort, perm, sorted, delete and ≤. The only inference rule is the resolution rule. The intended interpretation is an Herbrand interpretation. An atom sort(l,m) is in the intended interpretation iff each of l and m is either nil or is a list of terms of the form $f^k(0)$ and m is the sorted version of l. The other predicate symbols have the obvious assignments. The intended interpretation is indeed a model for the program and hence a model for the associated theory.

Next we show that in order to prove unsatisfiability of a set of clauses, it suffices to consider only Herbrand interpretations.

Proposition 3.2 Let S be a set of clauses and suppose S has a model. Then S has an Herbrand model.

Proof Let I be an interpretation of S. We define an Herbrand interpretation I' of S as follows:

$I' = \{p(t_1,\ldots,t_n) \in B_S : p(t_1,\ldots,t_n) \text{ is true wrt } I\}$.

It is straightforward to show that if I is a model, then I' is also a model. ∎

Proposition 3.3 Let S be a set of clauses. Then S is unsatisfiable iff S has no Herbrand models.

Proof If S is satisfiable, then proposition 3.2 shows that it has an Herbrand model. ∎

It is important to understand that neither proposition 3.2 nor 3.3 holds if we drop the restriction that S be a set of clauses. In other words, if S is a set of arbitrary closed formulas, it is not generally possible to show S is unsatisfiable by restricting attention to Herbrand interpretations.

Example Let $S = \{p(a),\ \exists x\, \neg p(x)\}$. Note that the second formula in S is not a clause. We claim that S has a model. It suffices to let D be the set {0, 1}, assign 0 to a and assign to p the mapping which maps 0 to true and 1 to false. Clearly this gives a model for S.

However, S does not have an Herbrand model. The only Herbrand interpretations for S are ∅ (the empty set) and {p(a)}. But neither of these is a model for S.
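This example is small enough to check exhaustively. The Python sketch below is an illustration, with names chosen here (`is_model`, `a_value`, `p_true`): an interpretation is described by its domain, the element assigned to the constant a, and the set of domain elements on which p is true, and the function tests whether both p(a) and "there exists x such that p(x) is false" hold.

```python
def is_model(domain, a_value, p_true):
    """True iff the interpretation satisfies both p(a) and 'exists x such that not p(x)'."""
    p_of_a = a_value in p_true
    some_x_not_p = any(d not in p_true for d in domain)
    return p_of_a and some_x_not_p


# The non-Herbrand interpretation from the example: D = {0, 1}, a is 0, p is true only on 0.
print(is_model({0, 1}, 0, {0}))    # True

# The two Herbrand interpretations: the domain is {a}, a is assigned itself,
# and p is true on nothing (the empty set) or on a (the set {p(a)}).
herbrand_interpretations = [set(), {"a"}]
print([is_model({"a"}, "a", p_true) for p_true in herbrand_interpretations])  # [False, False]
```

The Herbrand domain has a single element, so p(a) and the existence of an element falsifying p cannot hold together; the two-element domain avoids this.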

The point is worth emphasising. Much of the theory of logic programming is concerned only with clauses and for this Herbrand interpretations suffice. However, non-clausal formulas do arise naturally (particularly in chapters 3, 4 and 5). For this part of the theory, we will be forced to consider arbitrary interpretations.

There are various normal forms for formulas. One, which we will find useful, is prenex conjunctive normal form.

Definition A formula is in prenex conjunctive normal form if it has the form

$Q x_1 \ldots Q x_k\, ((L_{11} \lor \ldots \lor L_{1m_1}) \land \ldots \land (L_{n1} \lor \ldots \lor L_{nm_n}))$

where each Q is an existential or universal quantifier and each $L_{ij}$ is a literal.

The next proposition shows that each formula has an "equivalent" formula, which is in prenex conjunctive normal form.

Definition We say two formulas W and V are logically equivalent if $\forall(W \leftrightarrow V)$ is valid.

In other words, two formulas are logically equivalent if they have the same truth values wrt any interpretation and variable assignment.

Proposition 3.4 For each formula W, there is a formula V, logically equivalent to W, such that V is in prenex conjunctive normal form.

Proof The proof is left as an exercise. (See problem 5.) ∎

When we discuss deductive database systems in chapter 5, we will base the theoretical developments on a typed first order theory. The intuitive idea of a typed theory (also called a many-sorted theory [33]) is that there are several sorts of variables, each ranging over a different domain. This can be thought of as a generalisation of the theories we have considered so far which only allow a single domain. For example, in a database context, there may be several domains of interest, such as the domain of customer names, the domain of supplier cities, and so on. For semantic integrity reasons, it is important to allow only queries and database clauses which respect the typing restrictions.

In addition to the components of a first order theory, a typed first order theory has a finite set, whose elements are called types. Types are denoted by Greek letters, such as $\tau$ and $\sigma$. The alphabet of the typed first order theory contains variables, constants, function symbols, predicate symbols and quantifiers, each of which is typed. Variables and constants have types such as $\tau$. Predicate symbols have types of the form $\tau_1 \times \ldots \times \tau_n$ and function symbols have types of the form $\tau_1 \times \ldots \times \tau_n \rightarrow \tau$. If f has type $\tau_1 \times \ldots \times \tau_n \rightarrow \tau$, we say f has range type $\tau$. For each type $\tau$, there is a universal quantifier $\forall_\tau$ and an existential quantifier $\exists_\tau$.

Definition A term of type $\tau$ is defined inductively as follows:
(a) A variable of type $\tau$ is a term of type $\tau$.
(b) A constant of type $\tau$ is a term of type $\tau$.
(c) If f is an n-ary function symbol of type $\tau_1 \times \ldots \times \tau_n \rightarrow \tau$ and $t_i$ is a term of type $\tau_i$ (i=1,...,n), then $f(t_1,\ldots,t_n)$ is a term of type $\tau$.

Definition A typed (well-formed) formula is defined inductively as follows:
(a) If p is an n-ary predicate symbol of type $\tau_1 \times \ldots \times \tau_n$ and $t_i$ is a term of type $\tau_i$ (i=1,...,n), then $p(t_1,\ldots,t_n)$ is a typed atomic formula.
(b) If F and G are typed formulas, then so are $\neg F$, $F \land G$, $F \lor G$, $F \rightarrow G$ and $F \leftrightarrow G$.
(c) If F is a typed formula and x is a variable of type $\tau$, then $\forall_\tau x\, F$ and $\exists_\tau x\, F$ are typed formulas.

Definition The typed first order language given by an alphabet consists of the set of all typed formulas constructed from the symbols of the alphabet.

We will find it more convenient to use the notation $\forall x/\tau\ F$ in place of $\forall_\tau x\, F$. Similarly, we will use the notation $\exists x/\tau\ F$ in place of $\exists_\tau x\, F$. We let $\forall(F)$ denote the typed universal closure of the formula F and $\exists(F)$ denote the typed existential closure. These are obtained by prefixing F with quantifiers of appropriate types.

Definition A pre-interpretation of a typed first order language L consists of the following:
(a) For each type $\tau$, a non-empty set $D_\tau$, called the domain of type $\tau$ of the pre-interpretation.
(b) For each constant of type $\tau$ in L, the assignment of an element in $D_\tau$.
(c) For each n-ary function symbol of type $\tau_1 \times \ldots \times \tau_n \rightarrow \tau$ in L, the assignment of a mapping from $D_{\tau_1} \times \ldots \times D_{\tau_n}$ to $D_\tau$.

Definition An interpretation I of a typed first order language L consists of a pre-interpretation J with domains $\{D_\tau\}$ of L together with the following:
For each n-ary predicate symbol of type $\tau_1 \times \ldots \times \tau_n$ in L, the assignment of a mapping from $D_{\tau_1} \times \ldots \times D_{\tau_n}$ into {true, false} (or, equivalently, a relation on $D_{\tau_1} \times \ldots \times D_{\tau_n}$).
We say I is based on J.

It is straightforward to define the concepts of variable assignment, term assignment, truth value, model, logical consequence, and so on, for a typed first order theory. We leave the details to the reader. Generally speaking, the development of the theory of first order logic can be carried through with only the most trivial changes for typed first order logic. We shall exploit this fact in chapter 5, where we shall use typed versions of results from earlier chapters.

The other fact that we will need about typed logics is that there is a transformation of typed formulas into (type-free) formulas, which shows that the apparent extra generality provided by typed logics is illusory [33]. This transformation allows one to reduce the proof of a theorem in a typed logic to a corresponding theorem in a (type-free) logic. We shall use this transformation process as one stage of the query evaluation process for deductive database systems in chapter 5.

§4. UNIFICATION

Earlier we stated that the main purpose of a logic programming system is to compute bindings. These bindings are computed by unification. In this section, we present a detailed discussion of unifiers and the unification algorithm.

Definition A substitution $\theta$ is a finite set of the form $\{v_1/t_1,\ldots,v_n/t_n\}$, where each $v_i$ is a variable, each $t_i$ is a term distinct from $v_i$ and the variables $v_1,\ldots,v_n$ are distinct. Each element $v_i/t_i$ is called a binding for $v_i$. $\theta$ is called a ground substitution if the $t_i$ are all ground terms. $\theta$ is called a variable-pure substitution if the $t_i$ are all variables.

Definition An expression is either a term, a literal or a conjunction or disjunction of literals. A simple expression is either a term or an atom.

Definition Let $\theta = \{v_1/t_1,\ldots,v_n/t_n\}$ be a substitution and E be an expression. Then $E\theta$, the instance of E by $\theta$, is the expression obtained from E by simultaneously replacing each occurrence of the variable $v_i$ in E by the term $t_i$ (i=1,...,n). If $E\theta$ is ground, then $E\theta$ is called a ground instance of E.

Example Let E = p(x,y,f(a)) and $\theta$ = {x/b, y/x}. Then $E\theta$ = p(b,x,f(a)).

If $S = \{E_1,\ldots,E_n\}$ is a finite set of expressions and $\theta$ is a substitution, then $S\theta$ denotes the set $\{E_1\theta,\ldots,E_n\theta\}$.
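Applying a substitution is a single recursive pass over the expression. The Python sketch below is illustrative and rests on an encoding assumed here, not taken from the text: a variable is a Python string, and a constant, compound term or atom is a tuple whose first element is the symbol name, so the constant a is `("a",)` and f(a) is `("f", ("a",))`. A substitution is a dictionary from variable names to terms.

```python
def apply(expr, theta):
    """Return the instance of expr by theta, replacing all bound variables simultaneously."""
    if isinstance(expr, str):                       # a variable
        return theta.get(expr, expr)
    return (expr[0],) + tuple(apply(arg, theta) for arg in expr[1:])


a, b = ("a",), ("b",)
E = ("p", "x", "y", ("f", a))                       # p(x,y,f(a))
theta = {"x": b, "y": "x"}                          # {x/b, y/x}
print(apply(E, theta))                              # ('p', ('b',), 'x', ('f', ('a',)))
```

The x introduced by the binding y/x is not rewritten to b, because each variable occurrence in the original expression is replaced exactly once; this is what "simultaneously" means in the definition.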

Definition Let $\theta = \{u_1/s_1,\ldots,u_m/s_m\}$ and $\sigma = \{v_1/t_1,\ldots,v_n/t_n\}$ be substitutions. Then the composition $\theta\sigma$ of $\theta$ and $\sigma$ is the substitution obtained from the set

$\{u_1/s_1\sigma,\ldots,u_m/s_m\sigma,\ v_1/t_1,\ldots,v_n/t_n\}$

by deleting any binding $u_i/s_i\sigma$ for which $u_i = s_i\sigma$ and deleting any binding $v_j/t_j$ for which $v_j \in \{u_1,\ldots,u_m\}$.

Example Let $\theta$ = {x/f(y), y/z} and $\sigma$ = {x/a, y/b, z/y}. Then $\theta\sigma$ = {x/f(b), z/y}.
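The definition of composition translates almost line for line into code. The Python sketch below is illustrative, under an encoding assumed here: variables are strings, other terms are tuples headed by their symbol name, and substitutions are dictionaries. The helper names `apply` and `compose` are chosen for this sketch.

```python
def apply(expr, theta):
    """Instance of expr by theta (variables are strings, other terms are tuples)."""
    if isinstance(expr, str):
        return theta.get(expr, expr)
    return (expr[0],) + tuple(apply(arg, theta) for arg in expr[1:])


def compose(theta, sigma):
    """Composition of theta and sigma, following the definition step by step."""
    result = {u: apply(s, sigma) for u, s in theta.items()}   # bindings u_i / (s_i sigma)
    result = {u: s for u, s in result.items() if s != u}      # delete u_i / (s_i sigma) when they coincide
    for v, t in sigma.items():
        if v not in theta:                                    # delete v_j / t_j when v_j is some u_i
            result[v] = t
    return result


a, b = ("a",), ("b",)
theta = {"x": ("f", "y"), "y": "z"}                 # {x/f(y), y/z}
sigma = {"x": a, "y": b, "z": "y"}                  # {x/a, y/b, z/y}
print(compose(theta, sigma))                        # {'x': ('f', ('b',)), 'z': 'y'}
```

The binding for y disappears because applying sigma to z gives y, which would bind y to itself, and the bindings x/a and y/b of sigma are dropped because x and y are already bound by theta.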

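Definition The substitution given by the empty set is called the identity substitution.

We denote the identity substitution by $\varepsilon$. Note that $E\varepsilon = E$, for all expressions E. The elementary properties of substitutions are contained in the following proposition.

Proposition 4.1 Let $\theta$, $\sigma$ and $\gamma$ be substitutions. Then
(a) $\theta\varepsilon = \varepsilon\theta = \theta$.
(b) $(E\theta)\sigma = E(\theta\sigma)$, for all expressions E.
(c) $(\theta\sigma)\gamma = \theta(\sigma\gamma)$.

Proof (a) This follows immediately from the definition of $\varepsilon$.

(b) Clearly it suffices to prove the result when E is a variable, say, x. Let $\theta = \{u_1/s_1,\ldots,u_m/s_m\}$ and $\sigma = \{v_1/t_1,\ldots,v_n/t_n\}$. If $x \notin \{u_1,\ldots,u_m\} \cup \{v_1,\ldots,v_n\}$, then $(x\theta)\sigma = x = x(\theta\sigma)$. If $x \in \{u_1,\ldots,u_m\}$, say $x = u_i$, then $(x\theta)\sigma = s_i\sigma = x(\theta\sigma)$. If $x \in \{v_1,\ldots,v_n\} \setminus \{u_1,\ldots,u_m\}$, say $x = v_j$, then $(x\theta)\sigma = t_j = x(\theta\sigma)$.

(c) Clearly it suffices to show that if x is a variable, then $x((\theta\sigma)\gamma) = x(\theta(\sigma\gamma))$. In fact, $x((\theta\sigma)\gamma) = (x(\theta\sigma))\gamma = ((x\theta)\sigma)\gamma = (x\theta)(\sigma\gamma) = x(\theta(\sigma\gamma))$, by (b). ∎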

Proposition 4.1(a) shows that $\varepsilon$ acts as a left and right identity for composition. The definition of composition of substitutions was made precisely to obtain (b). Note that (c) shows that we can omit parentheses when writing a composition $\theta_1 \ldots \theta_n$ of substitutions.

Example Let $\theta$ = {x/f(y), y/z} and $\sigma$ = {x/a, z/b}. Then $\theta\sigma$ = {x/f(y), y/b, z/b}. Let E = p(x,y,g(z)). Then $E\theta$ = p(f(y),z,g(z)) and $(E\theta)\sigma$ = p(f(y),b,g(b)). Also $E(\theta\sigma)$ = p(f(y),b,g(b)) = $(E\theta)\sigma$.

Definition Let E and F be expressions. We say E and F are variants if there exist substitutions $\theta$ and $\sigma$ such that $E = F\theta$ and $F = E\sigma$. We also say E is a variant of F or F is a variant of E.

Example p(f(x,y),g(z),a) is a variant of p(f(y,x),g(u),a). However, p(x,x) is not a variant of p(x,y).
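Whether two expressions are variants can be tested without searching for substitutions. The Python sketch below is an illustration that relies on a standard characterization rather than on the definition itself: two expressions are variants exactly when they have the same shape and their variables correspond one-to-one. The encoding (variables as strings, other terms as tuples headed by the symbol name) and the name `variants` are assumptions made here.

```python
def variants(e, f):
    """True iff e and f differ only by a one-to-one renaming of variables."""
    forward, backward = {}, {}          # variable of e -> variable of f, and the reverse

    def walk(s, t):
        if isinstance(s, str) or isinstance(t, str):
            if not (isinstance(s, str) and isinstance(t, str)):
                return False            # a variable facing a non-variable
            # The pairing must be consistent in both directions.
            return forward.setdefault(s, t) == t and backward.setdefault(t, s) == s
        if s[0] != t[0] or len(s) != len(t):
            return False                # different symbols or arities
        return all(walk(x, y) for x, y in zip(s[1:], t[1:]))

    return walk(e, f)


a = ("a",)
print(variants(("p", ("f", "x", "y"), ("g", "z"), a),
               ("p", ("f", "y", "x"), ("g", "u"), a)))   # True
print(variants(("p", "x", "x"), ("p", "x", "y")))         # False
```

The second call fails because x would have to correspond to both x and y, so no renaming can turn p(x,x) into p(x,y).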

Definition Let E be an expression and V be the set of variables occurring in E. A renaming substitution for E is a variable-pure substitution $\{x_1/y_1,\ldots,x_n/y_n\}$ such that $\{x_1,\ldots,x_n\} \subseteq V$, the $y_i$ are distinct and $(V \setminus \{x_1,\ldots,x_n\}) \cap \{y_1,\ldots,y_n\} = \emptyset$.

Proposition 4.2 Let E and F be expressions which are variants. Then there exist substitutions $\theta$ and $\sigma$ such that $E = F\theta$ and $F = E\sigma$, where $\theta$ is a renaming substitution for F and $\sigma$ is a renaming substitution for E.

Proof Since E and F are variants, there exist substitutions $\theta_1$ and $\sigma_1$ such that $E = F\theta_1$ and $F = E\sigma_1$. Let V be the set of variables occurring in E and let $\sigma$ be the substitution obtained from $\sigma_1$ by deleting all bindings of the form x/t, where $x \notin V$. Clearly $F = E\sigma$. Furthermore, $E = F\theta_1 = E\sigma\theta_1$ and it follows that $\sigma$ must be a renaming substitution for E. ∎

We will be particularly interested in substitutions which unify a set of expressions, that is, make each expression in the set syntactically identical. The concept of unification goes back to Herbrand [44] in 1930. It was rediscovered in 1963 by Robinson [88] and exploited in the resolution rule, where it was used to reduce the combinatorial explosion of the search space. We restrict attention to (non-empty) finite sets of simple expressions, which is all that we require. Recall that a simple expression is a term or an atom.

Definition Let S be a finite set of simple expressions. A substitution $\theta$ is called a unifier for S if $S\theta$ is a singleton. A unifier $\theta$ for S is called a most general unifier (mgu) for S if, for each unifier $\sigma$ of S, there exists a substitution $\gamma$ such that $\sigma = \theta\gamma$.

Example {p(f(x),a), p(y,f(w))} is not unifiable, because the second arguments cannot be unified.

Example {p(f(x),z), p(y,a)} is unifiable, since $\sigma$ = {y/f(a), x/a, z/a} is a unifier. A most general unifier is $\theta$ = {y/f(x), z/a}. Note that $\sigma = \theta\{x/a\}$.

It follows from the definition of an mgu that if $\theta$ and $\sigma$ are both mgu's of $\{E_1,\ldots,E_n\}$, then $E_1\theta$ is a variant of $E_1\sigma$. Proposition 4.2 then shows that $E_1\sigma$ can be obtained from $E_1\theta$ simply by renaming variables. In fact, problem 7 shows that mgu's are unique modulo renaming.

We next present an algorithm, called the unification algorithm, which takes a finite set of simple expressions as input and outputs an mgu if the set is unifiable. Otherwise, it reports the fact that the set is not unifiable. The intuitive idea behind the unification algorithm is as follows. Suppose we want to unify two simple expressions. Imagine two pointers, one at the leftmost symbol of each of the two expressions. The pointers are moved together to the right until they point to different symbols. An attempt is made to unify the two subexpressions starting with these symbols by making a substitution. If the attempt is successful, the process is continued with the two expressions obtained by applying the substitution. If not, the expressions are not unifiable. If the pointers eventually reach the ends of the two expressions, the composition of all the substitutions made is an mgu of the two expressions.

Definition Let $S$ be a finite set of simple expressions. The disagreement set of $S$ is defined as follows. Locate the leftmost symbol position at which not all expressions in $S$ have the same symbol and extract from each expression in $S$ the subexpression beginning at that symbol position. The set of all such subexpressions is the disagreement set.

Example Let $S = \{p(f(x),h(y),a),\ p(f(x),z,a),\ p(f(x),h(y),b)\}$. Then the disagreement set is $\{h(y),\ z\}$.
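The definition of the disagreement set can be sketched in code. This is an illustration under stated assumptions, not the book's formulation: expressions are stored as trees rather than symbol strings (variables as Python strings, everything else as tuples `(symbol, arg1, ..., argn)`), so "leftmost symbol position" becomes a left-to-right descent through the arguments. The function name `disagreement_set` is hypothetical.

```python
def disagreement_set(exprs):
    """Return the disagreement set of a collection of expressions.

    Assumed representation: variables are str; constants and compound
    expressions are tuples (symbol, arg1, ..., argn).
    """
    exprs = set(exprs)
    if len(exprs) <= 1:
        return set()                 # all expressions identical: no disagreement
    # The symbol at the current position: the variable itself, or the
    # function/predicate symbol together with its number of arguments.
    symbols = {e if isinstance(e, str) else (e[0], len(e)) for e in exprs}
    if len(symbols) > 1:
        return exprs                 # the expressions disagree right here
    # Same leading symbol everywhere: scan the argument positions left to right.
    for column in zip(*(e[1:] for e in exprs)):
        found = disagreement_set(column)
        if found:
            return found
    return set()

a, b = ("a",), ("b",)
S = [
    ("p", ("f", "x"), ("h", "y"), a),
    ("p", ("f", "x"), "z", a),
    ("p", ("f", "x"), ("h", "y"), b),
]
print(disagreement_set(S) == {("h", "y"), "z"})   # True
```

The third argument also differs (a versus b), but the second argument is the leftmost position of disagreement, so only $h(y)$ and $z$ are returned.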

We now present the unification algorithm. In this algorithm, $S$ denotes a finite set of simple expressions.


UNIFICATION ALGORITHM

1. Put $k=0$ and $\sigma_0=\varepsilon$.

2. If $S\sigma_k$ is a singleton, then stop; $\sigma_k$ is an mgu of $S$. Otherwise, find the disagreement set $D_k$ of $S\sigma_k$.

3. If there exist $v$ and $t$ in $D_k$ such that $v$ is a variable that does not occur in $t$, then put $\sigma_{k+1} = \sigma_k\{v/t\}$, increment $k$ and go to 2. Otherwise, stop; $S$ is not unifiable.

The unification algorithm as presented above is non-deterministic to the extent that there may be several choices for $v$ and $t$ in step 3. However, as we remarked earlier, the application of any two mgu's produced by the algorithm leads to expressions which differ only by a change of variable names. It is clear that the algorithm terminates because $S$ contains only finitely many variables and each application of step 3 eliminates one variable.
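The three steps of the algorithm can be sketched as a short program. This is a minimal illustration, not an efficient or definitive implementation. It assumes the tree representation used purely for this sketch (variables as Python strings, other expressions as tuples `(symbol, arg1, ..., argn)`, substitutions as dicts), and it resolves the non-determinism of step 3 by an arbitrary but fixed choice. All function names are hypothetical.

```python
def apply(term, subst):
    """Apply a substitution (dict variable -> term) to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0], *(apply(arg, subst) for arg in term[1:]))

def compose(theta, gamma):
    """Composition theta gamma: theta is applied first, then gamma."""
    result = {v: apply(t, gamma) for v, t in theta.items()}
    result.update({v: t for v, t in gamma.items() if v not in theta})
    return {v: t for v, t in result.items() if t != v}

def occurs(v, term):
    """Occur check: does the variable v occur in term?"""
    if isinstance(term, str):
        return v == term
    return any(occurs(v, arg) for arg in term[1:])

def disagreement_set(exprs):
    exprs = set(exprs)
    if len(exprs) <= 1:
        return set()
    symbols = {e if isinstance(e, str) else (e[0], len(e)) for e in exprs}
    if len(symbols) > 1:
        return exprs
    for column in zip(*(e[1:] for e in exprs)):
        found = disagreement_set(column)
        if found:
            return found
    return set()

def unify(exprs):
    """Return an mgu of exprs as a dict, or None if exprs is not unifiable."""
    sigma = {}                                        # step 1: sigma_0 is the identity
    while True:
        current = {apply(e, sigma) for e in exprs}
        if len(current) == 1:                         # step 2: singleton, so stop
            return sigma
        d = disagreement_set(current)
        # Step 3: look for a variable v and a term t in d with v not in t.
        # Sorting only makes the (otherwise arbitrary) choice reproducible.
        choice = next(
            ((v, t)
             for v in sorted(x for x in d if isinstance(x, str))
             for t in sorted(d - {v}, key=repr)
             if not occurs(v, t)),
            None,
        )
        if choice is None:
            return None                               # not unifiable
        v, t = choice
        sigma = compose(sigma, {v: t})                # sigma_{k+1} = sigma_k {v/t}

a = ("a",)
print(unify([("p", ("f", a), ("g", "x")), ("p", "y", "y")]))   # None
print(unify([("p", "x", "x"), ("p", "y", ("f", "y"))]))        # None (occur check)
print(unify([("p", a, "x", ("h", ("g", "z"))),
             ("p", "z", ("h", "y"), ("h", "y"))]))
```

The last call returns the bindings z/a, x/h(g(a)) and y/g(a). Each pass through the loop recomputes every instance from scratch, which keeps the sketch close to the stated algorithm at the cost of efficiency.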

Example Let $S = \{p(f(a),g(x)),\ p(y,y)\}$.

(a) $\sigma_0 = \varepsilon$.

(b) $D_0 = \{f(a),\ y\}$, $\sigma_1 = \{y/f(a)\}$ and $S\sigma_1 = \{p(f(a),g(x)),\ p(f(a),f(a))\}$.

(c) $D_1 = \{g(x),\ f(a)\}$. Thus $S$ is not unifiable.

Example Let $S = \{p(a,x,h(g(z))),\ p(z,h(y),h(y))\}$.

(a) $\sigma_0 = \varepsilon$.

(b) $D_0 = \{a,\ z\}$, $\sigma_1 = \{z/a\}$ and $S\sigma_1 = \{p(a,x,h(g(a))),\ p(a,h(y),h(y))\}$.

(c) $D_1 = \{x,\ h(y)\}$, $\sigma_2 = \{z/a,\ x/h(y)\}$ and $S\sigma_2 = \{p(a,h(y),h(g(a))),\ p(a,h(y),h(y))\}$.

(d) $D_2 = \{y,\ g(a)\}$, $\sigma_3 = \{z/a,\ x/h(g(a)),\ y/g(a)\}$ and $S\sigma_3 = \{p(a,h(g(a)),h(g(a)))\}$.

Thus $S$ is unifiable and $\sigma_3$ is an mgu.

In step 3 of the unification algorithm, a check is made to see whether $v$ occurs in $t$. This is called the occur check. The next example illustrates the use of the occur check.

Example Let $S = \{p(x,x),\ p(y,f(y))\}$.

(a) $\sigma_0 = \varepsilon$.

(b) $D_0 = \{x,\ y\}$, $\sigma_1 = \{x/y\}$ and $S\sigma_1 = \{p(y,y),\ p(y,f(y))\}$.

(c) $D_1 = \{y,\ f(y)\}$. Since $y$ occurs in $f(y)$, $S$ is not unifiable.

Next we prove that the unification algorithm does indeed find an mgu of a unifiable set of simple expressions. This result first appeared in [88].

Theorem 4.3 (Unification Theorem)
Let $S$ be a finite set of simple expressions. If $S$ is unifiable, then the unification algorithm terminates and gives an mgu for $S$. If $S$ is not unifiable, then the unification algorithm terminates and reports this fact.

Proof We have already noted that the unification algorithm always terminates. It suffices to show that if $S$ is unifiable, then the algorithm finds an mgu. In fact, if $S$ is not unifiable, then the algorithm cannot terminate at step 2 and, since it does terminate, it must terminate at step 3. Thus it does report the fact that $S$ is not unifiable.

Assume then that $S$ is unifiable and let $\theta$ be any unifier for $S$. We prove first that, for $k \geq 0$, if $\sigma_k$ is the substitution given in the $k$th iteration of the algorithm, then there exists a substitution $\gamma_k$ such that $\theta = \sigma_k\gamma_k$.

Suppose first that $k=0$. Then we can put $\gamma_0 = \theta$, since $\theta = \varepsilon\theta$. Next suppose, for some $k \geq 0$, there exists $\gamma_k$ such that $\theta = \sigma_k\gamma_k$. If $S\sigma_k$ is a singleton, then the algorithm terminates at step 2. Hence we can confine attention to the case when $S\sigma_k$ is not a singleton. We want to show that the algorithm will produce a further substitution $\sigma_{k+1}$ and that there exists a substitution $\gamma_{k+1}$ such that $\theta = \sigma_{k+1}\gamma_{k+1}$.

Since $S\sigma_k$ is not a singleton, the algorithm will determine the disagreement set $D_k$ of $S\sigma_k$ and go to step 3. Since $\theta = \sigma_k\gamma_k$ and $\theta$ unifies $S$, it follows that $\gamma_k$ unifies $D_k$. Thus $D_k$ must contain a variable, say, $v$. Let $t$ be any other term in $D_k$. Then $v$ cannot occur in $t$ because $v\gamma_k = t\gamma_k$. We can suppose that $\{v/t\}$ is indeed the substitution chosen at step 3. Thus $\sigma_{k+1} = \sigma_k\{v/t\}$.

We now define $\gamma_{k+1} = \gamma_k \setminus \{v/v\gamma_k\}$. If $\gamma_k$ has a binding for $v$, then

$$
\begin{aligned}
\gamma_k &= \{v/v\gamma_k\} \cup \gamma_{k+1} \\
&= \{v/t\gamma_k\} \cup \gamma_{k+1} && \text{(since } v\gamma_k = t\gamma_k\text{)} \\
&= \{v/t\gamma_{k+1}\} \cup \gamma_{k+1} && \text{(since } v \text{ does not occur in } t\text{)} \\
&= \{v/t\}\gamma_{k+1} && \text{(by the definition of composition).}
\end{aligned}
$$

If $\gamma_k$ does not have a binding for $v$, then $\gamma_{k+1} = \gamma_k$, each element of $D_k$ is a variable and $\gamma_k = \{v/t\}\gamma_{k+1}$. Thus $\theta = \sigma_k\gamma_k = \sigma_k\{v/t\}\gamma_{k+1} = \sigma_{k+1}\gamma_{k+1}$, as required.

Now we can complete the proof. If $S$ is unifiable, then we have shown that the algorithm must terminate at step 2 and, if it terminates at the $k$th iteration, then $\theta = \sigma_k\gamma_k$, for some $\gamma_k$. Since $\sigma_k$ is a unifier of $S$, this equality shows that it is indeed an mgu for $S$.


The unification algorithm which we have presented can be very inefficient. In the worst case, its running time can be an exponential function of the length of the input. Consider the following example, which is taken from [9]. Let

$$S = \{p(x_1,\ldots,x_n),\ p(f(x_0,x_0),\ldots,f(x_{n-1},x_{n-1}))\}.$$

Then $\sigma_1 = \{x_1/f(x_0,x_0)\}$ and

$$S\sigma_1 = \{p(f(x_0,x_0),x_2,\ldots,x_n),\ p(f(x_0,x_0),f(f(x_0,x_0),f(x_0,x_0)),f(x_2,x_2),\ldots,f(x_{n-1},x_{n-1}))\}.$$

The next substitution is $\sigma_2 = \{x_1/f(x_0,x_0),\ x_2/f(f(x_0,x_0),f(x_0,x_0))\}$, and so on. Note that the second atom in $S\sigma_n$ has $2^k-1$ occurrences of $f$ in its $k$th argument ($1 \leq k \leq n$). In particular, its last argument has $2^n-1$ occurrences of $f$. Now recall that step 3 of the unification algorithm has the occur check. The performance of this check just for the last substitution will thus require exponential time. In fact, printing $\sigma_n$ also requires exponential time. This example shows that no unification algorithm which explicitly presents the (final) unifier can be linear.
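The growth claimed in this example is easy to confirm for small $k$. The sketch below is an illustration only. It assumes terms are nested tuples `("f", left, right)` with the variable $x_0$ as the string `"x0"`, and the helper names `bound_term` and `count_f` are hypothetical. It builds the term that the $k$th argument of the second atom becomes under $\sigma_n$ and counts its occurrences of $f$.

```python
def bound_term(k):
    """Term obtained from f(x_{k-1}, x_{k-1}) once x_1, ..., x_{k-1} are bound.

    bound_term(0) is the unbound variable x0.
    """
    if k == 0:
        return "x0"
    prev = bound_term(k - 1)
    return ("f", prev, prev)

def count_f(term):
    """Count the occurrences of f by walking the whole term."""
    if isinstance(term, str):
        return 0
    return 1 + sum(count_f(arg) for arg in term[1:])

for k in range(1, 6):
    print(k, count_f(bound_term(k)), 2**k - 1)
# 1 1 1
# 2 3 3
# 3 7 7
# 4 15 15
# 5 31 31
```

Note that `bound_term` reuses the same Python object for both arguments, so the term is stored compactly, while any procedure that walks it symbol by symbol, such as `count_f` or a naive occur check, still visits $2^k-1$ occurrences of $f$.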

Much more efficient unification algorithms than the one presented above are known. For example, [67] and [80] give linear algorithms (see also [68]). In [80], linearity is achieved by the use of a carefully chosen data structure for representing expressions and avoiding the explicit presentation of the unifier, which is instead presented as a composition of constituent substitutions. Despite its linearity, this algorithm is not employed in PROLOG systems. Instead, most use essentially the unification algorithm presented earlier in this section, but with the expensive occur check omitted! From a theoretical viewpoint, this is a disaster because it destroys the soundness of SLD-resolution. We discuss this matter further in §7.

§5. FIXPOINTS

Associated with every definite program is a monotonic mapping which plays a very important role in the theory. This section introduces the requisite concepts and results concerning monotonic mappings and their fixpoints.

Definition Let $S$ be a set. A relation $R$ on $S$ is a subset of $S \times S$.

We usually use infix notation, writing $(x,y) \in R$ as $xRy$.

Definition A relation $R$ on a set $S$ is a partial order if the following conditions are satisfied:

(a) $xRx$, for all $x \in S$.

(b) $xRy$ and $yRx$ imply $x=y$, for all $x,y \in S$.

(c) $xRy$ and $yRz$ imply $xRz$, for all $x,y,z \in S$.

Example Let $S$ be a set and $2^S$ be the set of all subsets of $S$. Then set inclusion, $\subseteq$, is easily seen to be a partial order on $2^S$.

We adopt the standard notation and use $\leq$ to denote a partial order. Thus we have (a) $x \leq x$, (b) $x \leq y$ and $y \leq x$ imply $x=y$ and (c) $x \leq y$ and $y \leq z$ imply $x \leq z$, for all $x,y,z \in S$.
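For a small finite set, the three conditions can be checked by brute force. The following sketch is illustrative only: the helper names `powerset` and `is_partial_order` and the three-element set are assumptions made for the example. It confirms that inclusion is a partial order on $2^S$, and shows a relation that fails condition (b).

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_partial_order(elements, leq):
    """Check conditions (a), (b) and (c) for the relation leq on elements."""
    reflexive = all(leq(x, x) for x in elements)
    antisymmetric = all(x == y
                        for x, y in product(elements, repeat=2)
                        if leq(x, y) and leq(y, x))
    transitive = all(leq(x, z)
                     for x, y, z in product(elements, repeat=3)
                     if leq(x, y) and leq(y, z))
    return reflexive and antisymmetric and transitive

P = powerset({1, 2, 3})
print(is_partial_order(P, frozenset.issubset))               # True
# Comparing sizes is reflexive and transitive but not antisymmetric:
# {1} and {2} are related both ways without being equal.
print(is_partial_order(P, lambda x, y: len(x) <= len(y)))    # False
```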

Definition Let $S$ be a set with a partial order $\leq$. Then $a \in S$ is an upper bound of a subset $X$ of $S$ if $x \leq a$, for all $x \in X$. Similarly, $b \in S$ is a lower bound of $X$ if $b \leq x$, for all $x \in X$.

Definition Let $S$ be a set with a partial order $\leq$. Then $a \in S$ is the least upper bound of a subset $X$ of $S$ if $a$ is an upper bound of $X$ and, for all upper bounds $a'$ of $X$, we have $a \leq a'$. Similarly, $b \in S$ is the greatest lower bound of a subset $X$ of $S$ if $b$ is a lower bound of $X$ and, for all lower bounds $b'$ of $X$, we have $b' \leq b$.

The least upper bound of $X$ is unique, if it exists, and is denoted by lub($X$). Similarly, the greatest lower bound of $X$ is unique, if it exists, and is denoted by glb($X$).

Definition A partially ordered set $L$ is a complete lattice if lub($X$) and glb($X$) exist for every subset $X$ of $L$.

We let $\top$ denote the top element lub($L$) and $\bot$ denote the bottom element glb($L$) of the complete lattice $L$.

Example In the previous example, $2^S$ under $\subseteq$ is a complete lattice. In fact, the least upper bound of a collection of subsets of $S$ is their union and the greatest lower bound is their intersection. The top element is $S$ and the bottom element is $\emptyset$.
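The statement that lub is union and glb is intersection can be checked against the definitions on a small instance. This is a sketch under assumptions: $S$ is taken to be a three-element set, the lattice is enumerated explicitly, and `lub` and `glb` are hypothetical helpers that search for the least upper bound and greatest lower bound directly rather than computing a union or intersection.

```python
from itertools import combinations

S = frozenset({1, 2, 3})
# L is the powerset of S; <= on frozensets is set inclusion.
L = [frozenset(c) for r in range(len(S) + 1)
     for c in combinations(sorted(S), r)]

def lub(X):
    """Least upper bound of X in L, found directly from the definition."""
    uppers = [a for a in L if all(x <= a for x in X)]
    return next(a for a in uppers if all(a <= other for other in uppers))

def glb(X):
    """Greatest lower bound of X in L, found directly from the definition."""
    lowers = [b for b in L if all(b <= x for x in X)]
    return next(b for b in lowers if all(other <= b for other in lowers))

X = [frozenset({1}), frozenset({1, 2}), frozenset({1, 3})]
print(lub(X) == frozenset().union(*X))       # True: lub is the union
print(glb(X) == S.intersection(*X))          # True: glb is the intersection
print(lub(L) == S, glb(L) == frozenset())    # True True: top and bottom
```

With this definition the empty collection has lub equal to the bottom element and glb equal to the top element, since every element of $L$ is both an upper and a lower bound of it.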

Definition Let $L$ be a complete lattice and $T : L \to L$ be a mapping. We say $T$ is monotonic if $T(x) \leq T(y)$, whenever $x \leq y$.
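To make the definition concrete, monotonicity can be tested exhaustively on a small powerset lattice. The two mappings below are invented for illustration and are not taken from the text, and `is_monotonic` is a hypothetical helper.

```python
from itertools import combinations, product

S = frozenset({1, 2, 3})
# The complete lattice 2^S under inclusion (<= on frozensets).
L = [frozenset(c) for r in range(len(S) + 1)
     for c in combinations(sorted(S), r)]

def is_monotonic(T):
    """Check that x <= y implies T(x) <= T(y) for all x, y in L."""
    return all(T(x) <= T(y)
               for x, y in product(L, repeat=2)
               if x <= y)

def add_one(x):
    """Illustrative mapping: adjoin the element 1."""
    return x | {1}

def complement(x):
    """Illustrative mapping: complement relative to S."""
    return S - x

print(is_monotonic(add_one))      # True
print(is_monotonic(complement))   # False: complement reverses inclusion
```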

Definition Let $L$ be a complete lattice and $X \subseteq L$. We say $X$ is directed if every finite subset of $X$ has an upper bound in $X$.

Definition Let $L$ be a complete lattice and $T : L \to L$ be a mapping. We say $T$ is continuous if $T(\mathrm{lub}(X)) = \mathrm{lub}(T(X))$, for every directed subset $X$