A NETWORK-OF-AUTOMATA MODEL FOR QUESTION-ANSWERING

A NETWORK-OF-AUTOMATA MODEL FOR QUESTION-ANSWERING IN SEMANTIC MEMORY by Joseph Fiksel

TECHNICAL REPORT NO. 218 October 31, 1973

PSYCHOLOGY AND EDUCATION SERIES

Reproduction in Whole or in Part Is Permitted for Any Purpose of the United States Government

Research and reproduction of this report was partially supported by Contract NIH GM 14789-05, NSF GJ 443X3, and EC 443X4.

INSTITUTE FOR MATHEMATICAL STUDIES IN THE SOCIAL SCIENCES

STANFORD UNIVERSITY

STANFORD, CALIFORNIA

TABLE OF CONTENTS

CHAPTER I
1.1 Introduction
1.2 The Question-Answering System

CHAPTER II
2.1 Mathematical Formulation
2.2 The Fundamental Algorithm
2.3 An Inferential Search Algorithm
2.4 Some Simple Illustrations

CHAPTER III
3.1 First-Order Language Questions
3.2 Illustration
3.3 Complex Expressions
3.4 Model-Theoretic Semantics in a Network Representation

REFERENCES

CHAPTER I

1.1

Introduction. This dissertation proposes a new theory of how people answer questions on the basis of information available in memory. This is an important but currently unsolved puzzle for cognitive psychology. A "question-answering system" is normally understood to mean a computer program capable of (i) accepting and interpreting propositional information and questions, (ii) storing the information according to some internal representation scheme, and (iii) answering questions by retrieving the relevant information and employing deductive reasoning. The underlying goal in question-answering research is to develop a computational theory which simulates some aspects of the way humans answer questions. However, in artificial intelligence work of this type, the actual psychological processes which underlie human performance are usually considered in only a casual, intuitive fashion. Instead, emphasis has typically been placed on achieving efficient programs and data structures which enable computations to proceed within the limitations of present-day serial computers.

The goals of this paper, however, are to be distinguished from these traditional goals of artificial intelligence work.

We try to model (at least roughly) the actual psychological mechanisms involved in question-answering, as well as the representational design of human semantic memory. Thus, these efforts would more properly fall under the domain of theoretical psychology. Indeed, the criteria used for testing the validity of the question-answering model are drawn from the growing literature in experimental psychology dealing with human reaction times in question-answering. At the same time, computational efficiency is given no weight at all in the construction of our model. In fact, the parallel-processing search algorithm which is used lends itself very poorly to realization on a serial computer. But then, there are no necessary reasons to suppose that memory-searching methods of the brain are analogous to serial rather than parallel processes.

Nevertheless, there are important similarities between this model and previous question-answering systems, in terms of the problematics of language representation, memory organization, and inference.

Simmons (1969) provides an excellent review of the progress in the field of question-answering. The principal obstacles to a workable system seem to have been (i) correctly analyzing the semantics of a question and (ii) developing efficient techniques or heuristics for searching through a large data base (memory). Following the example of most others, we assume that the first problem (i) is solvable in principle, and that questions (and statements) are to be transformed (by some unanalyzed semantic processor) into an unambiguous formal language. Coles (1968) has already dealt with this problem with some measure of success.

One of the more successful question-answering systems was created by Green and Raphael (1968), who employed formal mathematical theorem-proving methods (so-called "Robinson resolution" methods) to deduce the answer to a question. Another well-known system was developed by Winograd (1972), who conversed with a hypothetical "robot" operating in a small world of colored blocks on a table. Both these systems are unconcerned with providing an accurate psychological description of question-answering.

Of more direct relevance to our work are the graph-structure belief systems of Colby (1968), and the semantic networks of Quillian (1966), which necessarily sacrifice efficient realizability in order to suggest a plausible structure for semantic memory. By formalizing the notion of a network of automata with parallel search techniques, this paper builds upon the foundations provided by these latter papers.

1.2

The Question-Answering System. The general components and information flow of a proposed question-answering system are diagrammed in Figure 1. A question formulated in natural language is input to the semantic processor, which encodes it into a precise formal language. The control terminal accepts this formal language query, and initiates a search for the required information in the memory store. When the control terminal receives a reply from the memory, it outputs the information in the formal language. Finally the semantic processor transforms this output into an answer in natural language.

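[Figure 1. Information flow in the question-answering system. Questions in natural language enter the SEMANTIC PROCESSOR, which exchanges formal-language expressions with the CONTROL TERMINAL; the terminal sends a search command to the MEMORY STORE and receives a response; answers leave the semantic processor in natural language.]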

The scope of this paper includes only the control terminal and the memory, not the semantic processor. Hence, we will not be concerned with all the thorny issues of synonymity, syntactic ambiguity, and other aspects of natural language which hinder mechanical translation. The focus is rather on the organization of the memory store and the search techniques employed to extract information. In Chapter II we assume that the questions received by the terminal have already been rephrased in terms of the network structure of memory.

Then, in Sections 3.1 and 3.4, we consider questions that are stated in a formal language. The elements of the network represent formal semantic concepts, not words, and several words (e.g., rabbit, bunny, hare) may all map into a single concept.

In simplest terms, information may be roughly divided into items (e.g., any concept which can act as the subject of a proposition) and relations between these items. Each item corresponds to a node of the network, and to every relation between two items there corresponds a directed arc joining the appropriate nodes. The set of arcs (relations) in the network is partitioned into a finite set of types, each type represented by a label on its arcs. For example, the phrase "Rabbits have stomachs" could be represented as two nodes, RABBIT and STOMACH, with an arc labelled "HAS-AS-PARTS" directed from RABBIT to STOMACH, as illustrated in Figure 2(a). It is assumed that a certain finite set of concepts and relations is used as "primitives", from which complex meanings can be constructed.
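The node-and-labelled-arc representation described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `LabeledNetwork`, its method names, and the `LEXICON` table are hypothetical and do not come from the report; only the RABBIT, STOMACH, and HAS-AS-PARTS example and the rabbit/bunny/hare synonyms are taken from the text.

```python
from collections import defaultdict


class LabeledNetwork:
    """Hypothetical container: nodes are concepts, arcs are typed relations."""

    def __init__(self):
        self.out_arcs = defaultdict(set)  # node -> {(relation, target)}
        self.in_arcs = defaultdict(set)   # node -> {(relation, source)}

    def add_arc(self, source, relation, target):
        # One directed, labelled arc per relation between two items.
        self.out_arcs[source].add((relation, target))
        self.in_arcs[target].add((relation, source))

    def successors(self, node, relation):
        return {t for (r, t) in self.out_arcs[node] if r == relation}

    def predecessors(self, node, relation):
        return {s for (r, s) in self.in_arcs[node] if r == relation}


# Several words may map into a single concept node (illustrative table).
LEXICON = {"rabbit": "RABBIT", "bunny": "RABBIT", "hare": "RABBIT",
           "stomach": "STOMACH"}

net = LabeledNetwork()
net.add_arc("RABBIT", "HAS-AS-PARTS", "STOMACH")  # "Rabbits have stomachs"
```

Storing both outgoing and incoming arcs is a convenience of this sketch; it anticipates searches that follow arcs in either direction.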

Schank (1972) is developing a system of semantic primitives along with a parser (semantic processor) which will encode natural-language phrases into this system.

Our theory supposes that each node of the network contains a finite-state automaton, whose input is the state of each of its "neighbors" in the network, and whose state-changes are a function of these inputs. Thus we postulate a network of identical automata, each interacting only locally with its neighbors as it undergoes state transitions from moment to moment. We also postulate a control mechanism, external to the network, called the terminal, which may transmit and receive simple pulses to and from the network automata. An automaton can be "excited" by the terminal and can notify the terminal if it achieves a particular, designated state. Otherwise, however, the operation of the automata proceeds without any global control.

In general, the state-transition rules for the automata could be probabilistic, as exemplified by simple word-association norms. However, we will assume that question-answering is basically a deterministic search procedure. For example, suppose that the terminal receives the question "Does a rabbit have a stomach?" The query causes the terminal to excite the two node-automata RABBIT and STOMACH. Since it is seeking a path in the network of the form shown in Figure 2(a) or (b), the state-transition function is specified roughly as follows: RABBIT sends a signal along all outward-directed arcs that are labelled HAS or SUBSET-OF; similarly, STOMACH sends a signal along all inward-directed arcs that are labelled HAS. (The exact manner in which these signals propagate through the network is explained in Section 2.2.) When some intermediate node receives the proper signals from two directions, it notifies the terminal that an "answer" path has been found, and the terminal composes an appropriate answer.

If a notification is received, the answer returned is YES. If no notification is received after a certain time, the answer is NO, indicating the absence of a path in the network of the desired type. Later we shall discuss the return of factual answers rather than simply YES or NO.
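The YES/NO behaviour just described can be imitated by a short sequential simulation. This is a rough sketch, not the parallel automaton algorithm of Section 2.2: the function name `answer_has`, the `time_limit` parameter, and the intermediate node MAMMAL are assumptions made for illustration. A signal spreads out of the subject along SUBSET-OF arcs, a second signal steps backward from the part along HAS arcs, and a YES is reported when the two meet at some node within the time limit.

```python
def answer_has(arcs, subject, part, time_limit=5):
    """arcs: set of (source, label, target). Returns "YES" or "NO"."""
    # Nodes reached by the signal sent backward from `part` along HAS arcs.
    holders = {s for (s, label, t) in arcs if label == "HAS" and t == part}
    # Nodes reached so far by the signal sent outward from `subject`.
    reached = {subject}
    for _ in range(time_limit + 1):
        if reached & holders:          # the two signals meet at some node
            return "YES"
        reached |= {t for (s, label, t) in arcs
                    if s in reached and label == "SUBSET-OF"}
    return "NO"                        # no notification within the time limit


# Figure 2(a): a direct HAS arc.
ARCS_A = {("RABBIT", "HAS", "STOMACH")}
# Figure 2(b), with a hypothetical intermediate node MAMMAL.
ARCS_B = {("RABBIT", "SUBSET-OF", "MAMMAL"), ("MAMMAL", "HAS", "STOMACH")}
```

With these sample networks, `answer_has(ARCS_B, "RABBIT", "STOMACH")` needs one extra round compared with `ARCS_A`, which mirrors the idea that longer paths take longer to confirm.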

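[Figure 2. Two network paths that answer "Does a rabbit have a stomach?": (a) a HAS arc from RABBIT to STOMACH; (b) a SUBSET arc out of RABBIT leading toward STOMACH through an intermediate node.]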

CHAPTER II

2.1

Mathematical Formulation. A directed network may be formally defined as a doublet \((\mathcal{N}, \mathcal{A})\), where \(\mathcal{N}\) is a (possibly infinite) set of nodes, and \(\mathcal{A} \subset \mathcal{N} \times \mathcal{N}\) is a set of ordered pairs corresponding to arcs of the network. Thus, if \((x,y) \in \mathcal{A}\), where \(x, y \in \mathcal{N}\), there is a directed arc running from node \(x\) to node \(y\), and \(x\) and \(y\) are said to be adjacent.

A labeled directed network may be defined as a triplet \((\mathcal{N}, \mathcal{A}, \mathcal{T})\), where \(\mathcal{N}\) is a finite set of nodes, \(\mathcal{T}\) is a finite set of relational types, and \(\mathcal{A} \subset \mathcal{N}^2 \times \mathcal{T}\). That is, if \((x,y;u) \in \mathcal{A}\), then there is an arc from \(x\) to \(y\), with the arc labeled as type \(u\). Now, the degree of a node is the number of arcs incident to that node, either incoming or outgoing. Let the maximum degree of \((\mathcal{N}, \mathcal{A}, \mathcal{T})\) be \(d\).

The type of automata used in this paper are quite simple mathematically, and may be imagined as little demons with exactly \(d\) "limbs" or tentacles radiating from a central cell. For example, \(d = 5\) in the figure below. Let \([d] = \{1, 2, 3, \ldots, d\}\); then an arbitrary numbering of the limbs can be established by associating each limb with an element of \([d]\). We will embed these automata in the network by placing one at each node. All the automata in the network are assumed to be identical. Henceforth, we shall refer to the automaton at node \(x\) as "automaton \(x\)".

[Figure: two five-limbed automata \(x\) and \(y\) with numbered limbs, illustrating \(p(x,3) = (y,5)\) and \(p(y,3) = (y,3)\).]

The limbs of each automaton are connected to the limbs of other automata in such a way that every arc \((x,y;u) \in \mathcal{A}\) corresponds to a connection between one limb of \(x\) and one limb of \(y\). More precisely, we define a mapping \(p : \mathcal{N} \times [d] \to \mathcal{N} \times [d]\), which maps every limb into itself or some other limb. \(p(x,r) = (y,s)\) means that the \(r\)th limb of automaton \(x\) is connected to the \(s\)th limb of automaton \(y\), implying an arc between \(x\) and \(y\). Note that \(p\) composed with itself is the identity mapping, since \(p(p(x,r)) = p(y,s) = (x,r)\). Unless node \(x\) has degree \(d\), some of the limbs of \(x\) will remain unconnected. If \(p(x,r) = (x,r)\), we say that limb \((x,r)\) is not used, or "dead". In effect, it is connected to itself; by this device, \(p\) is defined over the entire set of limbs in the network. The purpose of the dead limbs is to make all the automata identical, regardless of the effective degree of their nodes.

Let \(\mathcal{T}\) be a set with the property that for every \(u \in \mathcal{T}\) there exists an inverse relation of type \(u^{-1} \in \mathcal{T}\); that is, for every \(x\) and \(y\) the arc \((x,y;u)\) is equivalent to the arc \((y,x;u^{-1})\). Thus the direction of an arc may be arbitrarily chosen. If \(u = u^{-1}\), the relation \(u\) is symmetric, and the arc may be considered undirected.

Let \(g_x = (g_x^1, \ldots, g_x^d)\) be the relational vector associated with automaton \(x\). \(g_x^r \in \mathcal{T}\) is the relation on the arc corresponding to the limb \((x,r)\), assuming this arc is directed away from \(x\). This means that if \(p(x,r) = (y,s)\) and \(g_x^r = u\), then \(g_y^s = u^{-1}\). Of course, dead limbs are assigned the null relation, \(\Omega \in \mathcal{T}\). That is,

\[ p(x,r) = (x,r) \;\Rightarrow\; g_x^r = \Omega. \]
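The limb mapping and the relational vector can be made concrete with a small construction. The sketch below is an assumption-laden illustration: the function name `build_limbs`, the dictionary encodings of the mapping p and of the relational vectors g, the `"NULL"` placeholder for the null relation, and the `"H^-1"` spelling of an inverse label are all hypothetical. It also assumes the network has no arc from a node to itself.

```python
def build_limbs(arcs, d, inverse, null="NULL"):
    """arcs: list of (x, y, u) triples. Returns (p, g).

    p maps each limb (node, r) to the limb it is connected to.
    g maps each limb (node, r) to the relation seen looking away from node.
    """
    nodes = sorted({n for (x, y, _) in arcs for n in (x, y)})
    # Every limb starts out dead: connected to itself, null relation.
    p = {(n, r): (n, r) for n in nodes for r in range(1, d + 1)}
    g = {limb: null for limb in p}
    next_free = {n: 1 for n in nodes}
    for (x, y, u) in arcs:
        r, s = next_free[x], next_free[y]
        if r > d or s > d:
            raise ValueError("node degree exceeds d")
        next_free[x] += 1
        next_free[y] += 1
        p[(x, r)], p[(y, s)] = (y, s), (x, r)   # p is its own inverse
        g[(x, r)], g[(y, s)] = u, inverse[u]    # the far end sees u^-1
    return p, g


INVERSE = {"H": "H^-1", "H^-1": "H"}
p, g = build_limbs([("RABBIT", "STOMACH", "H")], d=2, inverse=INVERSE)
```

In this example each node has one live limb and one dead limb, so both automata have the same shape even though their effective degree is 1.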

We now have completely described the structure of a network of automata from a topological point of view. Next we must discuss the dynamics of this network under the condition that an automaton may undergo a "change of state" at successive discrete instants of time, \(t = 0, 1, 2, 3, \ldots\).

Formally, an automaton is defined as a triple \((\mathcal{E}, \mathcal{I}, \Phi)\), where \(\mathcal{E}\) is the set of possible states of the automaton at any instant of time, \(\mathcal{I}\) is the set of possible inputs, and \(\Phi\) is a state transition function

\[ \Phi : \mathcal{E} \times \mathcal{I} \to \mathcal{E} \]

which determines the state of the automaton at the next instant of time, given its current state and input.

The state of automaton \(x\) at time \(t\) will be written \(e_{x,t} \in \mathcal{E}\), and will consist of a \(d\)-component vector \(e_{x,t} = (e_{x,t}^1, e_{x,t}^2, \ldots, e_{x,t}^d)\), where \(d\) is the number of limbs of the automaton. Thus a state will actually be a composite of \(d\) elementary states, one corresponding to each limb. In this dissertation, we shall have \(\mathcal{E} = \mathcal{L}^d\), where \(\mathcal{L}\) is a finite set of elementary limb states.

The input to automaton \(x\) at instant \(t\) will be written \(A_{x,t} \in \mathcal{I}\). For each automaton \(x\), the time-invariant relational vector \(g_x\) will form one part of the input vector. The other part will be the vector \(f_{x,t} = (f_{x,t}^1, \ldots, f_{x,t}^d)\), the limb states of all limbs connected to those of \(x\). In particular, if \(p(x,r) = (y,s)\) then \(f_{x,t}^r = e_{y,t}^s\). Verbally, the time-varying input to \(x\) will consist of a portion of the states of all node-automata adjacent to \(x\). Hence \(A_{x,t} = (f_{x,t}, g_x)\), and \(\mathcal{I} = \mathcal{L}^d \times \mathcal{T}^d\). Using the above notation we have

\[ e_{x,t+1} = \Phi(e_{x,t}, A_{x,t}). \]
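One synchronous transition of the whole network, in which every automaton reads the limb states of its neighbours at time t and moves to its state at time t+1, might be sketched as follows. The function `step`, the dictionary encodings of the limb mapping and relational vectors, and the toy transition rule `spread` are assumptions for illustration only; the report's actual transition functions are defined later and are considerably more elaborate.

```python
def step(states, p, g, phi, d):
    """states: node -> tuple of d limb states. Returns the states at t+1."""
    new_states = {}
    for x, e_x in states.items():
        f_x = []
        for r in range(1, d + 1):
            y, s = p[(x, r)]               # limb (x, r) is joined to limb (y, s)
            f_x.append(states[y][s - 1])   # state of that limb at time t
        g_x = tuple(g[(x, r)] for r in range(1, d + 1))
        # Every automaton applies the same function to (own state, input).
        new_states[x] = phi(e_x, (tuple(f_x), g_x))
    return new_states


def spread(e, inputs):
    """Toy rule (hypothetical): become all-"S" once any neighbour limb shows "S"."""
    f, _g = inputs
    return tuple("S" for _ in e) if "S" in f else e


p = {("x", 1): ("y", 1), ("y", 1): ("x", 1)}
g = {("x", 1): "H", ("y", 1): "H^-1"}
states = {"x": ("S",), "y": ("I",)}
after = step(states, p, g, spread, d=1)
```

Because `new_states` is built from the old `states` only, all automata change state simultaneously, which is the sense of "parallel" used in the text.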

These network automata differ from classical automata in three respects: the inputs of a classical automaton are usually symbols from an alphabet, and are distinct from the internal states of the automaton; here the states are equivalent to inputs. Also, a classical automaton is treated as an isolated entity, while here many automata are connected together. Finally, the classical automaton may have rudimentary storage such as a push-down stack, whereas the network automata, being much simpler, have no storage capabilities whatsoever.

Note that the sequence of states of each automaton in the network is determined solely by the states of its immediate neighbours. This is what is meant by the phrase "local interaction". If you think again in terms of the little demons, none of them realize that they are part of a network. All they know is that their tentacles are gripping those of a number of identical demons. If a demon's neighbours are in a happy state, he may be happy too. On the other hand, if he were just in a sad state, and his neighbours were happy, this may send him into a deeper depression.

Abandoning the metaphor, and returning to the mathematical formulation, such networks of finite automata have a remarkable property. If the set \(\mathcal{L}\) and the function \(\Phi\) are appropriately chosen, the automata can perform complex tasks involving properties of the network as a whole, in the absence of any global guidance. Rosenstiehl, Fiksel, and Holliger [1972] deal extensively with the problem-solving capabilities of automata in non-labeled graphs. Below we demonstrate that automata are capable of performing question-answering in the special case of a labeled directed network. The node-automata's only communications with the terminal are at time 0, and at the end of their operation. This is tantamount to switching the network on and allowing it to change state until it switches itself off. The set of automata compute locally and in parallel, until they reach a stationary terminating state.

2.2

The Fundamental Algorithm. The parallel processing operation of the network of automata requires communication with the control terminal at the initial and final stages. Only the input and output of the control terminal will be specified at present; nothing will be said about its internal computational procedures until Section 3. Let us postulate a \((d+1)\)st limb joining each automaton to the terminal, with possible limb states and inputs I, 0, 1, and 2. Ordinarily, the state of this limb and its input from the terminal are I (meaning "idle"). We shall say that the terminal sends a type-1 (type-2) excitation to automaton \(x\) if the input to the \((d+1)\)st limb of \(x\) becomes 1 (2) for a single instant of time, and then returns to state I. The purpose of these excitations will be made clear in Theorem 1, which is our fundamental theorem.

The remaining state, 0, is used by the terminal to control the activity of the network as a whole. The terminal can freeze the network by sending an input of 0 to the \((d+1)\)st limb of every automaton. When an automaton receives an input of 0, it ceases to perform state transitions, leaving its remaining \(d\) limbs in their current states. It remains frozen until the input changes to an I, whereupon it places all its limbs in state I. Thus, after freezing, the terminal can initialize the network into the idle state.

We shall now show in detail how the network of automata deals with a simple class of questions.

Given a sequence \(q\) of relations and two nodes \(x\) and \(y\), the network is to investigate the existence of a path between \(x\) and \(y\) whose labeled arcs match that sequence of relations. When an arc labeled \(A\) is traversed in the backward direction, it appears in the sequence \(q\) as \(A^{-1}\). Let \(Q(a, \mathcal{T})\) be the set of all relational sequences \(q = A_1 A_2 \cdots A_n\), where \(A_i \in \mathcal{T}\) and \(n \le a\). Note that the \(A_i\) need not be distinct from one another. For convenience, let the upper bound \(a\) be an even number. Because the length \(n\) is bounded by \(a\), it is sufficient to take [...] elementary limb states [...]
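The question posed above, whether some path between x and y carries a given sequence of relations with backward traversals written as inverses, can be checked sequentially with a few lines of code. This sketch is not the parallel automaton algorithm of the report; the function name `path_exists`, the `"A^-1"` string convention for an inverse relation, and the sample arcs are assumptions for illustration.

```python
def path_exists(arcs, x, q, y):
    """arcs: set of (source, label, target); q: list of relation labels.

    A label written "A^-1" means an arc labeled A traversed backward.
    """
    frontier = {x}                       # nodes reachable after matching a prefix of q
    for label in q:
        if label.endswith("^-1"):
            base = label[:-3]
            frontier = {s for (s, l, t) in arcs if l == base and t in frontier}
        else:
            frontier = {t for (s, l, t) in arcs if l == label and s in frontier}
    return y in frontier


# Hypothetical arcs: MOOSE is an element (E) of MAMMAL and has-as-parts (H) HORNS.
ARCS = {("MOOSE", "E", "MAMMAL"), ("MOOSE", "H", "HORNS")}
```

Here the sequence `["E^-1", "H"]` from MAMMAL first walks the E arc backward to MOOSE and then the H arc forward to HORNS.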

[...] and let \(\mathcal{D}\) be an automaton with states \(\mathcal{E}_D\) and elementary limb states \(\mathcal{L}_D\). We say that \(\mathcal{D}\) is stacked upon \(\mathcal{C}\) if [...], so that for \(\mathcal{C}\) the state vector at node \(x\) is

\[ e_{x,t+1} = \Phi_C(e_{x,t}, A_{x,t}), \]

whereas for \(\mathcal{D}\) the state vector at node \(x\) is

\[ \beta_{x,t+1} = \Phi_D(\beta_{x,t}, B_{x,t}, e_{x,t+1}). \]

In other words, for its \((t+1)\)st transition, \(\mathcal{D}\) takes into account the state of \(\mathcal{C}\) at time \(t+1\), as well as its own state and input at time \(t\). Notice that \(\mathcal{D}\) is dependent upon \(\mathcal{C}\), but \(\mathcal{C}\) is independent of \(\mathcal{D}\). The stacked automata may be considered as a single automaton. If \(\mathcal{D}\) is stacked upon \(\mathcal{C}\), the composition of \(\mathcal{C}\) and \(\mathcal{D}\) is written as [...]. Stacking is useful when each node-automaton is called upon to perform several clearly separable functions. Let [...]

[...] does not hinder the operation of \(\Phi_q\). The only difference is that when the network is frozen after event $, certain limbs are left with markers on them. Now, it is known that event $ occurred as a result of condition 4 being satisfied at automaton [...]

[...] However, the relevant information may be encoded in the network in a different form, such as:

ROBIN --SUBSET-OF--> BIRD --HAS-AS-PARTS--> FEATHERS

What is necessary is a rule of inference which says: the sequence of relations SUBSET-OF, HAS-AS-PARTS is equivalent to the single relation HAS-AS-PARTS.

Below we show how the search algorithm for the network of automata can be extended to incorporate exactly this sort of inference-making.

Returning to the formal theory, we define a production as an operator \(\theta\) on the set \(\mathcal{T}^*\) of relational sequences such that for a given \(a, b \in \mathcal{T}^*\), \(\theta a = b\). If \(p = c_1 a c_2\), where \(c_1, c_2 \in \mathcal{T}^*\), then the production may be applied to the subsequence \(a\), replacing it by \(b\); this operation is denoted by \(\theta p = c_1 b c_2\). Applying \(\theta\) to \(p\) thus results in a single replacement of \(a\) by \(b\) at some unspecified point in the sequence \(p\).

Let \(\Theta\) be a given set of "admissible" productions and let \(Q(a, \mathcal{T})\) be defined as in Section 2.2, with \(a\) any positive integer. Then the path \(p \in \mathcal{T}^*\) is called reducible to \(q \in Q(a, \mathcal{T})\) with respect to \(\Theta\) if there exists a sequence \(\theta_1, \ldots, \theta_m\) in \(\Theta\) such that \(\theta_1 \cdots \theta_m p = q\). This is abbreviated \(p \xrightarrow{\Theta} q\), or, if it is understood that \(\Theta\) is invariant, \(p \to q\). By convention, we say that \(q \to q\).

The productions are to be interpreted as semantic rules of inference concerning relations in the network. \(p \to q\) means that \(q\) may be inferred from \(p\) through the application of a series of these rules of inference. Thus, to ask a question \((x,q,y)\) is to ask whether any path \(p\) exists between \(x\) and \(y\) such that \(p \to q\).
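Reducibility of one relational sequence to another under a set of productions can be tested by exhaustively applying productions at every position. The sketch below is a plain sequential search offered for illustration, not the network algorithm; the function name `reducible` and the tuple encoding of productions are assumptions. It also assumes that no production lengthens a sequence, which guarantees that the search terminates.

```python
def reducible(p, q, productions):
    """p, q: tuples of relation labels; productions: set of (lhs, rhs) tuple pairs."""
    if any(len(rhs) > len(lhs) for lhs, rhs in productions):
        raise ValueError("this sketch assumes productions never lengthen a sequence")
    target = tuple(q)
    seen, stack = {tuple(p)}, [tuple(p)]
    while stack:
        cur = stack.pop()
        if cur == target:                # covers the convention q -> q
            return True
        for lhs, rhs in productions:
            n = len(lhs)
            for i in range(len(cur) - n + 1):
                if cur[i:i + n] == lhs:  # apply the production at position i
                    nxt = cur[:i] + rhs + cur[i + n:]
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return False


# The rule of inference quoted in the text: SUBSET-OF, HAS-AS-PARTS => HAS-AS-PARTS.
RULES = {(("SUBSET-OF", "HAS-AS-PARTS"), ("HAS-AS-PARTS",))}
```

With `RULES`, the path SUBSET-OF, HAS-AS-PARTS from ROBIN to FEATHERS reduces to the single relation HAS-AS-PARTS, whereas the reversed order does not.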

In the service of consistency, we can think of the question-answering algorithm presented in Theorem 1 as dealing with the case where \(\Theta\) is empty.

For present purposes, \(\Theta\) can be restricted to contain only elementary productions, defined as productions \(\theta\) of the form \(\theta(B_1 B_2) = B_i\), where \(B_1, B_2 \in \mathcal{T}\), and \(i = 1\) or \(2\). These can be written more concisely as \(B_1 B_2 \to B_i\). Of course, the symmetry of labeled arcs in \((\mathcal{N}, \mathcal{A}, \mathcal{T})\) requires that if \(B_1 B_2 \to B_i\) is a production in \(\Theta\) then \(B_2^{-1} B_1^{-1} \to B_i^{-1}\) also belongs to \(\Theta\). \(\Theta\) may be thought of as a set of ordered triples of relations in \(\mathcal{T}\): \(\Theta \subset \mathcal{T}^3\).

A set of productions \(\Theta\) is called associative if it satisfies the following conditions:

(i) If \(AB \to A\) is in \(\Theta\) and \(BC \to B\) is in \(\Theta\), then \(AC \to A\) is also in \(\Theta\).

(ii) If \(BA \to A\) is in \(\Theta\) and \(BC \to B\) is in \(\Theta\), then \(CA \to A\) is also in \(\Theta\).

The associative property implies that if \(p \xrightarrow{\Theta} A\), \(A \in \mathcal{T}\), then for every pair of consecutive relations \(A_1, A_2\) in \(p\) there is a production whose left-hand side is \(A_1 A_2\). The proof of this statement can easily be demonstrated by induction on the length of \(p\).
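The associativity conditions can be checked mechanically for a small production set. The sketch below reads the two conditions as: (i) if AB → A and BC → B are admissible then AC → A must be, and (ii) if BA → A and BC → B are admissible then CA → A must be. That reading, the function name `missing_for_associativity`, and the triple encoding are assumptions; the inverse productions required by arc symmetry are left out for brevity. The relation names E (element of), C (subclass of), and H (has-as-parts) follow the illustrations given later in the report.

```python
def missing_for_associativity(productions):
    """productions: set of (B1, B2, Bi) triples meaning B1 B2 -> Bi.

    Returns the triples that the two conditions require but that are absent.
    """
    keep_left = {(a, b) for (a, b, r) in productions if r == a}    # AB -> A
    keep_right = {(b, a) for (b, a, r) in productions if r == a}   # BA -> A
    missing = set()
    for (a, b) in keep_left:                 # condition (i): AB -> A ...
        for (b2, c) in keep_left:            # ... and BC -> B
            if b2 == b and (a, c, a) not in productions:
                missing.add((a, c, a))       # then AC -> A is required
    for (b, a) in keep_right:                # condition (ii): BA -> A ...
        for (b2, c) in keep_left:            # ... and BC -> B
            if b2 == b and (c, a, a) not in productions:
                missing.add((c, a, a))       # then CA -> A is required
    return missing


THETA = {("E", "C", "E"), ("E", "H", "H")}   # EC -> E and EH -> H
```

Under this reading, `THETA` alone is not associative: condition (ii) with B = E, A = H, C = C demands the extra production CH → H.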

The set of paths (sequences of nodes) in the network which can be interpreted as a YES answer to the question \((x,q,y)\) may now be written \(P_\Theta(q)\): the paths between \(x\) and \(y\) whose relational sequence \(p\) satisfies \(\theta_1 \cdots \theta_m p = q\) for some \(\theta_1, \ldots, \theta_m\) in \(\Theta\). In general, \(P_\Theta(q)\) can be extremely large, since the \(\theta_i\) need not be distinct. If cycling phenomena occur, \(P_\Theta(q)\) may even be infinite. However, provided that \(P_\Theta(q)\) is non-empty, the fundamental path-searching algorithm can be easily extended to find the shortest path in \(P_\Theta(q)\). This second major result is derived in the theorem below. Of course, if \(P_\Theta(q) = \emptyset\), the search process could continue forever; thus the terminal must fix some time limit after which, if no path has been found, it returns a NO answer.

The new "inferential search" algorithm requires an expanded set of limb states.

Define \(G\) to be the set of all combinations of distinct elements from the set of signals \(\{0_1, 0_2, 1_1, 1_2, \ldots, a_1, a_2\}\). In other words, \(G\) is the set of all unordered groups of signals \((S_1 S_2 \cdots S_\mu)\) where

(i) each \(S_i\) is one of these signals, \(i = 1, \ldots, \mu\);

(ii) \(j \neq i \Rightarrow S_j \neq S_i\);

(iii) \(1 \le \mu \le 2(a+1)\).

From combinatorial theory,

\[ \operatorname{card} G = \sum_{\mu=1}^{2a+2} \binom{2a+2}{\mu} = 2^{2a+2} - 1, \]

that is, one less than the sum of the binomial coefficients of order \(2a+2\). In practice we have never found the length of a question \(q\) to exceed 3. Hence, \(a = 3\) seems like a reasonable upper bound, in which case card \(G = 255\).
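The count of signal groups can be confirmed by brute-force enumeration. The sketch below assumes that there are 2(a+1) distinct signals, modelled here as hypothetical pairs (k, side) with k running from 0 to a and side equal to 1 or 2; the function name `card_G` is likewise an illustration. It counts every non-empty unordered group and compares the result with the binomial sum and the closed form.

```python
from itertools import combinations
from math import comb


def card_G(a):
    """Number of non-empty unordered groups drawn from 2(a+1) distinct signals."""
    signals = [(k, side) for k in range(a + 1) for side in (1, 2)]
    by_enumeration = sum(1
                         for mu in range(1, len(signals) + 1)
                         for _ in combinations(signals, mu))
    by_formula = sum(comb(2 * a + 2, mu) for mu in range(1, 2 * a + 3))
    # Both counts must agree with the closed form 2^(2a+2) - 1.
    assert by_enumeration == by_formula == 2 ** (2 * a + 2) - 1
    return by_formula
```

For a = 3 there are 8 signals, so the function returns 255, matching the value of card G quoted in the text.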

Let \(\mathcal{L}_1 = \{\omega, I\} \cup G\) be the set of limb states for the network. Then \(e_{z,t}^r = (S_1 S_2 \cdots S_\mu) \in G\) means that at time \(t\) the limb \((z,r)\) simultaneously transmits the signals \(S_1, S_2, \ldots, S_\mu\). Reception of a signal \(k_i\) is interpreted exactly as in Section 2.2. Notice that \(\mathcal{L}_0 \subset \mathcal{L}_1\), and that for \(\Theta = \emptyset\), \(P_\Theta(q) = Z(q)\). Thus the following theorem is in a precise sense a generalization of Theorem 1. However, the complexity of the state transition function must increase enormously, as evidenced by the fact that card \(\mathcal{L}_0 = 8\), whereas card \(\mathcal{L}_1 = 257\), for \(a = 3\). As before, we define \(\mathcal{E}_1 = \mathcal{L}_1^d\) and \(\mathcal{I}_1 = \mathcal{L}_1^d \times \mathcal{T}^d\).

Theorem 2: Let \((\mathcal{N}, \mathcal{A}, \mathcal{T})\) be any labeled, directed network, and let \(\Theta\) be an associative set of elementary productions. Then to each question \((x,q,y)\) with \(x, y \in \mathcal{N}\) and \(q \in Q(a, \mathcal{T})\), there corresponds a transition function \(\psi_q\) having the following property: Suppose that the automaton \((\mathcal{E}_1, \mathcal{I}_1, \psi_q)\) is embedded at each node of \(\mathcal{N}\), with limbs connected according to \(p\). Assuming that \(P_\Theta(q) \neq \emptyset\), let \(p_0\) be the shortest path in \(P_\Theta(q)\), with length \(\nu\). Then after \(x\) and \(y\) are excited by the terminal the network of automata will verify the existence of \(p_0\) within \(\nu/2\) time units.

Proof:

Again, the proof is constructive. The terminal initiates the search procedure at time \(t = 0\) by exciting \(x\) with a type-1 and \(y\) with a type-2 excitation. For any \(A \in \mathcal{T}\), define the two sets

\[ \mathrm{Pre}(A) = \{ B \in \mathcal{T} : BA \to A \text{ is in } \Theta \} \quad \text{and} \quad \mathrm{Post}(A) = \{ B \in \mathcal{T} : AB \to A \text{ is in } \Theta \}. \]

Let \(q = A_1 \cdots A_n\) be a specific sequence of \(Q(a, \mathcal{T})\). The function \(\psi_q\) is similar to \(\Phi_q\) except that it permits the sequence \(q\) to be interspersed with other relations according to the admissible productions. In effect, it employs the productions in reverse, expanding \(A_k\) whenever possible into \(A_k B\) for each production of the form \(A_k B \to A_k\) in \(\Theta\). This is accomplished by allowing a \(k\) signal to be propagated along \(B\) arcs without being increased to \((k+1)\).

The following are the transition rules for \(\psi_q\). An alternative reading of each rule is obtained by substituting the phrases in parentheses.

1. If automaton \(x\) (\(y\)) receives a type-1 (type-2) excitation, then:
   (a) it places all limbs labeled \(A_1\) (\(A_n^{-1}\)) in state \(1_1\) (\(1_2\)),
   (b) for each \(B \in \mathrm{Pre}(A_1)\) (\(C \in \mathrm{Post}(A_n)\)), \(x\) (\(y\)) places all limbs labeled \(B\) (\(C^{-1}\)) in state \(0_1\) (\(0_2\)).
   This transition rule is illustrated in Figure 7.

2. If limb \((z,r)\) receives a \(k_1\) (\(k_2\)) signal, then:
   (a) \(z\) places all limbs labeled \(A_{k+1}\) (\(A_{n-k}^{-1}\)) in state \((k+1)_1\) (state \((k+1)_2\)),
   (b) if \((g_z^r)^{-1} \in \{A_k\} \cup \mathrm{Post}(A_k)\) [\(g_z^r \in \{A_{n-k+1}\} \cup \mathrm{Pre}(A_{n-k+1})\)], then for each \(B \in \mathrm{Pre}(A_{k+1}) \cup \mathrm{Post}(A_k)\) [\(C \in \mathrm{Pre}(A_{n-k+1}) \cup \mathrm{Post}(A_{n-k})\)], \(z\) places all limbs labeled \(B\) (\(C^{-1}\)) in state \(k_1\) (\(k_2\)),
   (c) if \((g_z^r)^{-1} \in \mathrm{Pre}(A_{k+1})\) [\(g_z^r \in \mathrm{Post}(A_{n-k})\)], then for each \(B \in \mathrm{Pre}(A_{k+1})\) [\(C \in \mathrm{Post}(A_{n-k})\)], \(z\) places all limbs labeled \(B\) (\(C^{-1}\)) in state \(k_1\) (\(k_2\)).
   (Parts (b) and (c) allow the \(k\) signal to propagate along any arcs which could vanish from the sequence \(p\) through the application of productions.)
   (d) whenever \(z\) is required by (a), (b), and (c) to place a limb in the states \(S_1, \ldots, S_\mu\), where \(1 \le \mu \le 2(a+1)\), it places that limb in state \((S_1 \cdots S_\mu) \in G\).

3. If \(e_{z,t}^r \in G\), then \(e_{z,t+1}^r = I\). Rule 2 may supersede this [...]
[Figure: a sample sub-network whose nodes include STOMACH, BIRD, WINGS, MYTHICAL CREATURE, MAMMAL, ROBIN, DEVIL, MOOSE, and EARS, joined by arcs labeled C, E, and H.]

[...] occur are far from obvious; Collins and Quillian (1969), among others, have attempted to determine experimentally the actual structure of such sub-networks. For the present we are concerned purely with illustration of the network operation, and not with claims about the detailed structure of memory.

Before a question is processed the entire network is re-initialized into the idle state I. A simple example of a question is "Is a bird an animal?", coded as (BIRD, C, ANIMAL). The transition function \(\Phi_q\) would send signals \(1_1\) from BIRD and \(1_2\) from ANIMAL, and event $ would occur at the BIRD automaton due to crossing of signals (Rule 4b) after the first step.

A slightly more complex example is the question "Is there a mammal with horns?" The corresponding triple would be (MAMMAL, \(E^{-1}H\), HORNS), which will search for a path \(E^{-1}H\) between the nodes MAMMAL and HORNS. More explicitly, the question is: "Does there exist \(z\) such that \(z\) is an element of MAMMAL and \(z\) has-as-parts HORNS?" Figure 10 demonstrates the state transitions of the automata involved in answering this question. At time \(t = 0\), the terminal excites the two nodes MAMMAL and HORNS. The search signals then fan out along admissible paths (\(E^{-1}\) out of MAMMAL and \(H^{-1}\) out of HORNS) until the node MOOSE receives the two simultaneous, complementary signals \(1_1\) and \(1_2\). The terminal is immediately informed of the discovery of a solution path (event $), and it can return a YES answer, and identify the node MOOSE as a positive instance.

If BIRD were substituted for MAMMAL in the question, then no path satisfying \(E^{-1}H\) would have been found, since no bird has true horns (not even the owl). At time \(t = 2\), the terminal would return a NO answer, indicating a failure to retrieve a positive instance. In Chapter III we shall deal with stronger types of negation.

[Figure 10. State transitions for the question (MAMMAL, \(E^{-1}H\), HORNS): (a) excitation of the two nodes at \(t = 0\); (b) event $ at \(t = 1\).]

We can also introduce productions into the scheme, allowing elementary inferences in the question-answering procedure. The sentence "\(p \to q\)" can be interpreted as "the string of relations \(p\) implies the string of relations \(q\) between any two nodes." Two intuitively legitimate elementary productions would be:

\(\theta_1\): \(EC \to E\) (X an element of Y, Y a subclass of Z implies X an element of Z)

\(\theta_2\): \(EH \to H\) (X an element of Y, Y has-as-parts Z implies X has-as-parts Z)

In order for the set of admissible productions to be associative, \(\theta_1\) and \(\theta_2\) require a third production

\(\theta_3\): \(CH \to H\) (X a subclass of Y, Y has-as-parts Z implies X has-as-parts Z)

Fortunately, \(\theta_3\) also seems intuitively correct.

The inferential algorithm described in Section 2.3 permits the application of these productions. For example, consider the question: "Does a robin have wings?" Symbolically, this can be represented as the triple (ROBIN, H, WINGS). Using a production-sensitive transition function \(\psi_q\) for the path \(q = H\), the network of automata will display the path \(EH\) as a solution using the production \(\theta_2\): \(EH \to H\). The shortest path reducible to \(q\) is always the first one to be discovered, due to the parallel nature of the search. Even in an asynchronous network, the length of time required to find a path is roughly proportional to the length of that path.

Another example of applying a production is with the question, "Does a rabbit have a stomach?" which transforms to (RABBIT, H, STOMACH), where \(q = H\) again. In this case, the only solution path found in the network of automata is \(p = ECH\). But \(p \to q\), since \(\theta_1 p = EH\), and \(\theta_2(\theta_1 p) = H\).

Several other types of questions can be handled by such automata. "What mammal has no horns?" requests an instance of a node $x$ for which there is a path (x, E, MAMMAL) but no path (x, H, HORNS). The terminal would symbolize this question as a modified triple, with the variable node $x$ inserted at that point in the string where node identification is required, viz., (MAMMAL, $E^{-1}\, x \not{H}$, HORNS). The slash through the $H$ indicates which portion of the path must be absent. The search algorithm would be the same as for the question $q = E^{-1}H$, except that the rule for events becomes: any node $x$ which receives a type-1 signal from MAMMAL, but no type-2 signal from HORNS after a finite time interval (say $t = 4$), notifies the terminal. For the subnetwork illustrated in Figure 10, $x$ = RABBIT will be the only node satisfying this condition. Hence, the answer will be "rabbit", rather than YES or NO.

The network can also handle questions involving several component solution paths. For instance, consider the question: "Is there an animal with both horns and ears?" The terminal would decompose this into two queries: (ANIMAL, $E^{-1}\, x\, H$, EARS) and (x, H, HORNS). Although the order of inquiry is arbitrary, the second component depends on the node identified as the answer to the first component. The initial answer path is $C^{-1}E^{-1}H \to E^{-1}H$, so that $x$ = MOOSE and $x$ = RABBIT are both possible answers. The second component is processed by exciting both RABBIT and MOOSE with type-1 signals, and the solution is found to be MOOSE. The sequential mechanism seems psychologically plausible, since it is difficult to imagine asking ourselves two questions simultaneously. Multiple queries are treated more formally in the context of propositional information by Fiksel and Bower (in press).
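The two kinds of query just described, one with a path that must be absent and one answered in two sequential components, can be sketched on a toy network. The arcs below are an assumption made for illustration and are not a copy of Figure 10. The helper names are hypothetical, and signal timing (the finite interval after which a missing type-2 signal is reported) is not modelled; set operations stand in for it.

```python
# Toy arcs (source, label, target); assumed for illustration only.
ARCS = {
    ("MAMMAL", "C", "ANIMAL"),
    ("RABBIT", "E", "MAMMAL"),
    ("MOOSE", "E", "MAMMAL"),
    ("RABBIT", "H", "EARS"),
    ("MOOSE", "H", "EARS"),
    ("MOOSE", "H", "HORNS"),
}


def subclasses(cls):
    """Return cls together with every class below it along C arcs."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for (a, r, b) in ARCS:
            if r == "C" and b in found and a not in found:
                found.add(a)
                changed = True
    return found


def elements_of(cls):
    """Nodes x that are elements of cls or of one of its subclasses."""
    classes = subclasses(cls)
    return {a for (a, r, b) in ARCS if r == "E" and b in classes}


def has_part(node, part):
    """True if there is a direct H arc from node to part."""
    return (node, "H", part) in ARCS


def members_without(cls, part):
    """'What <cls> has no <part>?': the H path must be absent."""
    return {x for x in elements_of(cls) if not has_part(x, part)}


def members_with_all(cls, parts):
    """Answer one component at a time, narrowing the candidate set."""
    candidates = elements_of(cls)
    for part in parts:
        candidates = {x for x in candidates if has_part(x, part)}
    return candidates


print(members_without("MAMMAL", "HORNS"))
print(members_with_all("ANIMAL", ["EARS", "HORNS"]))
```

In this toy network the first call yields RABBIT and the second yields MOOSE, matching the answers in the text. The second function mirrors the sequential mechanism: only nodes that survive the first component are tested against the next.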


CHAPTER III

3.1

First-Order Language Questions.

Referring again to Figure 1 of Section 1.2, we shall now consider possible candidates for the formal language used as input to the control terminal. So far, questions put to the terminal have been phrased in terms of finding specified paths between nodes. However, in order to interact properly with the semantic processor, the terminal must have the ability to analyze questions that are represented in a precise formal language, and to translate them into path-searching procedures that are executable by the network of automata. The computational strategy employed by the node-automata has been sufficiently developed in Chapter II; in this chapter the supervisory role of the terminal will be expanded, allowing the system to answer a broad, well-defined range of sophisticated questions.

A suitable beginning choice for a formal language is the well-known first-order language of predicate calculus. We shall use a simple version in which the atomic sentences all consist of a binary relation between two concept-names. The non-logical vocabulary of this language will be a set $\mathcal{N}$ of nodes in a semantic network, and a set of binary relations, corresponding to the set $\mathcal{A}$ of arc-labels. A possible realization for this language is the real world model implied by the semantics of the vocabulary. The semantic network serves as a representation for this real world model, so that the truth of any first-order language sentence could be tested by consulting the information stored in the network. Our purpose is to determine what classes of sentences can be successfully tested using

finite computational procedures.

Formally speaking, let $L = (\mathcal{N}, \mathcal{A}, J)$ be a labeled, directed network, as defined in Section 2.1. Let $\mathcal{L}$ be a first-order language with equality, having $\mathcal{N}$ as the set of individual constants. The set of binary relations in $\mathcal{L}$ is isomorphic to the set $\mathcal{A}$, and we shall use the same symbol to designate corresponding members of both sets. In general, if $R \in \mathcal{A}$ then the binary relation $R$ holds (in the semantic model $L$) between every pair $a, b \in \mathcal{N}$ such that $(a, b; R) \in J$. This would be written in $\mathcal{L}$ as the atomic sentence $aRb$, or, since $R^{-1}$ denotes the inverse relation of $R$, as $bR^{-1}a$.

Sentences of $\mathcal{L}$ can be constructed from atomic sentences using the existential ($\exists$) and universal ($\forall$) quantifiers and the usual logical connectives $\wedge$ (and), $\vee$ (or), $\neg$ (not), $\to$ (implication), and $\equiv$ (if and only if). A sentence of $\mathcal{L}$ can be considered a question to be answered in the sense that it is either true or false with respect to the semantic network model. We

now give a complete characterization of $\mathcal{L}$:

Def: A term of $\mathcal{L}$ is any variable ($x, y, z$, etc.) or any individual constant from the set $\mathcal{N}$ ($a, b, c$, etc.).

Def: An atomic formula is any expression $v_1 R v_2$ where $R \in \mathcal{A} \cup \{=\}$, and $v_1, v_2$ are terms.

Def: A formula is any expression derived from the following rules:

1. Every atomic formula is a formula.

2. If $\mathcal{F}$ and $\mathcal{G}$ are formulae, so are $\neg\mathcal{F}$, $\mathcal{F} \wedge \mathcal{G}$, $\mathcal{F} \vee \mathcal{G}$, $\mathcal{F} \to \mathcal{G}$, $\mathcal{F} \equiv \mathcal{G}$.

3. If $\mathcal{F}$ is a formula and $x$ a variable, then $(\exists x)\mathcal{F}$ and $(\forall x)\mathcal{F}$ are formulae.

Def: A sentence of $\mathcal{L}$ is any formula which contains no free variables. A variable is free if it is not in the scope of a quantifier.
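The definitions of term, atomic formula, formula and sentence can be mirrored in a small data structure. The sketch below is one possible encoding, not the report's; the class names (`Var`, `Const`, `Atom`, `Not`, `BinOp`, `Quant`) and the string tags for connectives and quantifiers are assumptions. It computes the free variables of a formula and uses them to test whether the formula is a sentence.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


Term = Union[Var, Const]


@dataclass(frozen=True)
class Atom:
    """Atomic formula v1 R v2, where rel is an arc label or '='."""
    left: Term
    rel: str
    right: Term


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class BinOp:
    """op is one of 'and', 'or', 'implies', 'iff' (assumed tags)."""
    op: str
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:
    """kind is 'exists' or 'forall' (assumed tags)."""
    kind: str
    var: Var
    body: "Formula"


Formula = Union[Atom, Not, BinOp, Quant]


def free_vars(f):
    """Names of variables not bound by any enclosing quantifier."""
    if isinstance(f, Atom):
        return {t.name for t in (f.left, f.right) if isinstance(t, Var)}
    if isinstance(f, Not):
        return free_vars(f.body)
    if isinstance(f, BinOp):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Quant):
        return free_vars(f.body) - {f.var.name}
    raise TypeError(f"not a formula: {f!r}")


def is_sentence(f):
    """A sentence is a formula with no free variables."""
    return not free_vars(f)


open_formula = Atom(Var("x"), "H", Const("WINGS"))
closed = Quant("exists", Var("x"), open_formula)
print(is_sentence(open_formula), is_sentence(closed))
```

Here `open_formula` has the free variable $x$ and so is not a sentence, while quantifying over $x$ produces one.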

A possible realization for the first-order language pair

'ti(=

(D; v>

where

D

=

'1(

U / ( and

every element .of the non-logical vocabulary

v

0 S}.

from

From Proposition 1, if

aRb

is ~roof:

nodes

is decidable by

with the set of productions

By axioms l(b) and 3, aRb

there exists a path. p

Corollary 3:

aRb

a = b

The required procedure consists of

asking .thequestion. (a,R, b).,

e

and

a

to

b

such that

is decidable so is

iff.

e R. p....,......,.. ~(aRb).

Any sentence of $\mathcal{L}$ which contains no quantifiers is decidable.

Proof: If a sentence contains no quantifiers, it contains no variables (or else the variables would be free). From the definition of a formula, the sentence must consist of atomic sentences joined by logical connectives. Hence, by Proposition 1 and Theorem 3, it is decidable.

In order to prove our first major result we require the following lemma.

Lemma 1: (i) $(\exists x)(xRa)$ and (ii) $(\exists x)(\neg xRa)$ are decidable, where $a \in \mathcal{N}$ and $R \in \mathcal{A}$.

Proof: (i) The inferential algorithm for deciding $(bRa)$ is modified as follows: node $a$ sends type-2 signals searching for a path $p$ such that $p \to R^{-1}$. Any node $x$ that receives a signal satisfies $xRa$; (ii) Let $X = \{x \in \mathcal{N} : xRa\}$ be the set of nodes contacted in (i). Then $(\exists x)(\neg xRa)$ iff $\mathcal{N} - X \neq \emptyset$. Notice that axiom 1(c) excludes the possibility of cycling due to productions: thus $X$ is found after a finite time.
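The set-based argument of Lemma 1 can be made concrete with a short sketch. It assumes the relation is given as a finite set of triples $(x, R, a)$ that already hold after any productions have been applied; the data and the names `NODES`, `HOLDS`, `contacted`, `exists_related` and `exists_unrelated` are hypothetical. The signal propagation itself is replaced by a direct lookup.

```python
# Hypothetical node set and triples (x, R, a) assumed to hold in the network.
NODES = {"ROBIN", "PENGUIN", "BIRD", "WINGS"}
HOLDS = {
    ("ROBIN", "E", "BIRD"),
    ("PENGUIN", "E", "BIRD"),
    ("ROBIN", "H", "WINGS"),
    ("PENGUIN", "H", "WINGS"),
    ("BIRD", "H", "WINGS"),
}


def contacted(rel, a):
    """X = {x in NODES : x rel a}, the nodes reached in part (i)."""
    return {x for x in NODES if (x, rel, a) in HOLDS}


def exists_related(rel, a):
    """Decide (exists x)(x rel a): X is non-empty."""
    return bool(contacted(rel, a))


def exists_unrelated(rel, a):
    """Decide (exists x)(not x rel a): NODES - X is non-empty."""
    return bool(NODES - contacted(rel, a))


print(exists_related("E", "BIRD"), exists_unrelated("E", "BIRD"))
```

Both decisions reduce to emptiness tests on finite sets, which is why finiteness of the search (no cycling) is the essential requirement in the proof.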

Theorem 4: Let $\mathcal{F}(x)$ be a formula with $x$ as the only free variable, containing no quantifiers. Then the sentence $(\exists x)\mathcal{F}(x)$ is decidable.

Proof: $\mathcal{F}(x)$ can be expressed in disjunctive normal form as $\mathcal{F}(x) = \mathcal{H}_1 \vee \mathcal{H}_2 \vee \ldots \vee \mathcal{H}_m$, where $\mathcal{H}_i = \bigwedge_{j=1}^{n_i} \mathcal{H}_{ij}(x)$, $i = 1, \ldots, m$, and the $\mathcal{H}_{ij}(x)$ are either atomic formulae or their negations, involving only $x$ and individual constants. Then $(\exists x)\mathcal{F}(x) \equiv \bigvee_{i=1}^{m} (\exists x)\mathcal{H}_i$, so that by Proposition 1 that sentence is decidable if $(\exists x)\mathcal{H}_i$ is decidable for each $i$. In what follows we shall suppress the subscript $i$. Let $\mathcal{H} = \bigwedge_{j=1}^{n} \mathcal{H}_j(x)$. Ignoring negations, the $\mathcal{H}_j(x)$ fall into three possible types:

1. $xRx$
2. $xRa$ or $aRx$
3. $aRb$

where $a, b \in \mathcal{N}$ and $R \in \mathcal{A}$. Since the type 3 formulae do not depend upon $x$, they can be taken outside the scope of the quantifier. Hence $(\exists x)\mathcal{H} = \mathcal{H}^3 \wedge (\exists x)(\mathcal{H}^1 \wedge \mathcal{H}^2)$, where $\mathcal{H}^k$ conjoins all $\mathcal{H}_j(x)$ of type $k$, $k = 1, 2, 3$. Again, by Proposition 1, the sentence is decidable if both $\mathcal{H}^3$ and $(\exists x)(\mathcal{H}^1 \wedge \mathcal{H}^2)$ are decidable. Since $\mathcal{H}^3$ contains no quantifiers, it is decidable by Corollary 3.

Consider the type 1 formula $xRx$. If $R = C$ or $C^{-1}$, by axiom 1(a) the formula is true for all $x$. On the other hand, if $R \notin \{C, C^{-1}\}$, the formula is false for all $x$, by axiom 2. Therefore, either $\mathcal{H}^1$ is true, in which case the expression becomes $(\exists x)\mathcal{H}^2$, or $\mathcal{H}^1$ is false, in which case the expression is false.

It remains to show that $(\exists x)\mathcal{H}^2$ is decidable. We can rewrite $aRx$ as $xR^{-1}a$, so that $\mathcal{H}^2 = \bigwedge_{j=1}^{p} \mathcal{H}_j(x)$, where each $\mathcal{H}_j(x)$ has the form $x R_j a_j$ with $a_j \in \mathcal{N}$ and $R_j \in \mathcal{A}$, $j = 1, \ldots, p$. If $p = 1$, $(\exists x)\mathcal{H}^2$ is decidable by Lemma 1. As in the proof of that lemma, let $X_1 = \{x \in \mathcal{N} : \mathcal{H}_1(x)\}$. For any finite $p \geq 2$, $(\exists x)\mathcal{H}^2$ is decidable by the following procedure:

A. Find $X_1$ as defined above, and set $j = 2$.

B. (recursive) Find $X_j = \{v \in X_{j-1} : \mathcal{H}_j(v)\}$. This set is determined by asking the question $(\exists v)\mathcal{H}_j(v)$ as in Lemma 1 and admitting only those solution nodes which are already in $X_{j-1}$. (We assume that the terminal can "mark" a specific set of nodes for subsequent comparison; notice $X_j \subseteq X_{j-1}$.)

C. (Exit rule) If $X_j = \emptyset$ for any $j$, then $(\exists x)\mathcal{H}^2$ is

false. On the other hand, i f (i)

if

j

p


f (x) , \;fa H

R (ii).

~

axiom lea).

;C(x) =

(3

by axiom 8

x). (>.xRa).

R = C -7 'f(x) • \;fa

by axioms 7, 10, 11

c-l~ y' (x)., Va by axiom 10

R

is equivalent to (i) with

R = H R =

-1 H is equivalent to (i) with

R= H

is equivalent to (i). with

--1 R= H is equivalent to (i). with

H --1 H H -1 H

The proofs of all the above assertions are easy. Thus, Theorem 4 still holds in the case of the animal taxonomy network. We shall give an example of the question-answering procedure for a sentence with one quantifier:

Example: "Here is a riddle: What warm-blooded animal has wings, cannot fly, but can swim?" The corresponding sentence in $\mathcal{L}$ is:

$(\exists x)(x\,C\,\text{an} \wedge x\,H\,\text{warm-blood} \wedge x\,H\,\text{wings} \wedge \neg(x\,H\,\text{flight}) \wedge x\,H\,\text{swimming})$

The relevant portion of $(\mathcal{N}, \mathcal{A}, J)$ is shown in Figure 12. By Thm. 4, the procedure for deciding whether such a creature exists is as follows:

A. $X_1 = \{x \mid x\,C\,\text{an}\}$.


B. (1)
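The successive-restriction procedure of Theorem 4, applied to the riddle, can be sketched as follows. The facts below are a hypothetical stand-in for Figure 12, which is not reproduced here, and are assumed to be already closed under the productions. Treating a negated conjunct by keeping the nodes for which the atomic sentence fails is also an assumption of this sketch. The names `NODES`, `FACTS` and `decide_exists` are illustrative.

```python
# Hypothetical nodes and triples (x, R, a); not the contents of Figure 12.
NODES = {"penguin", "robin", "trout", "an",
         "warm-blood", "wings", "flight", "swimming"}
FACTS = {
    ("penguin", "C", "an"), ("robin", "C", "an"), ("trout", "C", "an"),
    ("penguin", "H", "warm-blood"), ("robin", "H", "warm-blood"),
    ("penguin", "H", "wings"), ("robin", "H", "wings"),
    ("robin", "H", "flight"),
    ("penguin", "H", "swimming"), ("trout", "H", "swimming"),
}


def decide_exists(conjuncts):
    """Decide (exists x) of a conjunction of literals (rel, const, positive).

    Step A builds X_1 from the first conjunct; step B restricts X_{j-1}
    to X_j; the exit rule returns False as soon as some X_j is empty.
    Assumes at least one conjunct is supplied.
    """
    candidates = None
    for (rel, const, positive) in conjuncts:
        solutions = {x for x in NODES
                     if ((x, rel, const) in FACTS) == positive}
        if candidates is None:
            candidates = solutions               # step A: X_1
        else:
            candidates = candidates & solutions  # step B: X_j within X_{j-1}
        if not candidates:
            return False, set()                  # exit rule
    return True, candidates


riddle = [
    ("C", "an", True),
    ("H", "warm-blood", True),
    ("H", "wings", True),
    ("H", "flight", False),   # "cannot fly"
    ("H", "swimming", True),
]
print(decide_exists(riddle))
```

With these assumed facts the candidate set shrinks from the three animals to the penguin alone, so the sentence is true and the surviving node is the answer to the riddle.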

(]x)Q(x) "'(Vx)Q(x) ... (3x) ("'Q(x» Let

X = {x E

Then (ii)

'fL n

-

I2 n : "'Q(x)} X + 0 ... (:Ix) Q(x) •

Conversely, suppose

(3x)Q(x)

(Corollary 2)

is decidable.