A NETWORK-OF-AUTOMATA MODEL FOR QUESTION-ANSWERING IN SEMANTIC MEMORY by Joseph Fiksel
TECHNICAL REPORT NO. 218 October 31, 1973
PSYCHOLOGY AND EDUCATION SERIES
Reproduction in Whole or in Part Is Permitted for Any Purpose of the United States Government
Research and reproduction of this report was partially supported by Contract NIH GM 14789-05, NSF GJ 443X3, and EC 443X4.
INSTITUTE FOR MATHEMATICAL STUDIES IN THE SOCIAL SCIENCES
STANFORD UNIVERSITY
STANFORD, CALIFORNIA
TABLE OF CONTENTS

CHAPTER I
  1.1 Introduction
  1.2 The Question-Answering System
CHAPTER II
  2.1 Mathematical Formulation
  2.2 The Fundamental Algorithm
  2.3 An Inferential Search Algorithm
  2.4 Some Simple Illustrations
CHAPTER III
  3.1 First-Order Language Questions
  3.2 Illustration
  3.3 Complex Expressions
  3.4 Model-Theoretic Semantics in a Network Representation
REFERENCES
CHAPTER I
1.1 Introduction.

This dissertation proposes a new theory of how people answer questions on the basis of information available in memory. This is an important but currently unsolved puzzle for cognitive psychology. A "question-answering system" is normally understood to mean a computer program capable of (i) accepting and interpreting propositional information and questions, (ii) storing the information according to some internal representation scheme, and (iii) answering questions by retrieving the relevant information and employing deductive reasoning. The underlying goal in question-answering research is to develop a computational theory which simulates some aspects of the way humans answer questions. However, in artificial intelligence work of this type, the actual psychological processes which underlie human performance are usually considered in only a casual, intuitive fashion. Instead, emphasis has typically been placed on achieving efficient programs and data structures which enable computations to proceed within the limitations of present-day serial computers.

The goals of this paper, however, are to be distinguished from these traditional goals of artificial intelligence work. We try to model (at least roughly) the actual psychological mechanisms involved in question-answering, as well as the representational design of human semantic memory. Thus, these efforts would more properly fall under the domain of theoretical psychology. Indeed, the criteria used for testing the validity of the question-answering model are drawn from the growing literature in experimental psychology dealing with human reaction times in question-answering. At the same time, computational efficiency is given no weight at all in the construction of our model. In fact, the parallel-processing search algorithm which is used lends itself very poorly to realization on a serial computer. But then, there are no necessary reasons to suppose that memory-searching methods of the brain are analogous to serial rather than parallel processes.

Nevertheless, there are important similarities between this model and previous question-answering systems, in terms of the problematics of language representation, memory organization, and inference.
Simmons (1969) provides an excellent review of the progress in the field of question-answering. The principal obstacles to a workable system seem to have been (i) correctly analyzing the semantics of a question and (ii) developing efficient techniques or heuristics for searching through a large data base (memory). Following the example of most others, we assume that the first problem (i) is solvable in principle, and that questions (and statements) are to be transformed (by some unanalyzed semantic processor) into an unambiguous formal language. Coles (1968) has already dealt with this problem with some measure of success.

One of the more successful question-answering systems was created by Green and Raphael (1968), who employed formal mathematical theorem-proving methods (so-called "Robinson resolution" methods) to deduce the answer to a question. Another well-known system was developed by Winograd (1972), whose program conversed with the user as a hypothetical "robot" operating in a small world of colored blocks on a table. Both these systems are unconcerned with providing an accurate psychological description of question-answering.
Of more direct relevance to our work are the graph-structure belief systems of Colby (1968), and the semantic networks of Quillian (1966), which necessarily sacrifice efficient realizability in order to suggest a plausible structure for semantic memory. By formalizing the notion of a network of automata with parallel search techniques, this paper builds upon the foundations provided by these latter papers.

1.2 The Question-Answering System.

The general components and information flow of a proposed question-answering system are diagrammed in Figure 1. A question formulated in natural language is input to the semantic processor, which encodes it into a precise formal language. The control terminal accepts this formal language query, and initiates a search in the memory store for the required information. When the terminal receives a reply from memory, it outputs the information in the formal language. Finally, the semantic processor transforms this output into an answer in natural language.

[Figure 1. Information flow in the question-answering system. Natural-language questions enter the semantic processor and natural-language answers leave it; the semantic processor and the control terminal exchange formal-language expressions; the control terminal sends search commands to the memory store and receives responses from it.]
The scope of this paper includes only the control terminal and the memory, not the semantic processor. Hence, we will not be concerned with all the thorny issues of synonymity, syntactic ambiguity, and other aspects of natural language which hinder mechanical translation. The focus is rather on the organization of the memory store and the search techniques employed to extract information. In Chapter II we assume that the questions received by the terminal have already been rephrased in terms of the network structure of memory. Then, in Sections 3.1 and 3.4, we consider questions that are stated in a formal language.

The elements of the network represent formal semantic concepts, not words, and several words (e.g., rabbit, bunny, hare) may all map into a single concept. In simplest terms, information may be roughly divided into items (e.g., any concept which can act as the subject of a proposition) and relations between these items.
Each item corresponds to a node of the network, and to every relation between two items there corresponds a directed arc joining the appropriate nodes. The set of arcs (relations) in the network is partitioned into a finite set of types, each type represented by a label on its arcs. For example, the phrase "Rabbits have stomachs" could be represented as two nodes, RABBIT and STOMACH, with an arc labelled "HAS-AS-PARTS" directed from RABBIT to STOMACH, as illustrated in Figure 2(a). It is assumed that a certain finite set of concepts and relations is used as "primitives", from which complex meanings can be constructed.
Schank (1972) is developing a system of semantic primitives along with a parser (semantic processor) which will encode natural language phrases into this system.

Our theory supposes that each node of the network contains a finite-state automaton, whose input is the state of each of its "neighbors" in the network, and whose state-changes are a function of these inputs. Thus we postulate a network of identical automata, each interacting only locally with its neighbors as it undergoes state transitions from moment to moment. We also postulate a control mechanism, external to the network, called the terminal, which may transmit and receive simple pulses to and from the network automata. An automaton can be "excited" by the terminal and can notify the terminal if it achieves a particular, designated state. Otherwise, however, the operation of the automata proceeds without any global control.

In general, the state-transition rules for the automata could be probabilistic, as exemplified by simple word-association norms. However, we will assume that question-answering is basically a deterministic search procedure. For example, suppose that the terminal receives the question "Does a rabbit have a stomach?" The query causes the terminal to excite the two node-automata RABBIT and STOMACH. Since it is seeking a path in the network of the form shown in Figure 2(a) or (b), the state-transition function is specified roughly as follows: RABBIT sends a signal along all outward-directed arcs that are labelled HAS or SUBSET-OF; similarly, STOMACH sends a signal along all inward-directed arcs that are labelled HAS. (The exact manner in which these signals propagate through the network is explained in Section 2.2.) When some intermediate node receives the proper signals from two directions, it notifies the terminal that an "answer" path has been found, and the terminal composes an appropriate answer. If a notification is received, the answer returned is YES. If no notification is received after a certain time, the answer is NO, indicating the absence of a path in the network of the desired type. Later we shall discuss the return of factual answers rather than simply YES or NO.
[Figure 2. Two network paths from RABBIT to STOMACH: (a) a single arc labeled HAS; (b) a path that includes an arc labeled SUBSET.]
CHAPTER II
2.1 Mathematical Formulation.

A directed network may be formally defined as a doublet \((\mathcal{N}, \mathcal{A})\), where \(\mathcal{N}\) is a (possibly infinite) set of nodes, and \(\mathcal{A} \subset \mathcal{N} \times \mathcal{N}\) is a set of ordered pairs corresponding to arcs of the network. Thus, if \((x, y) \in \mathcal{A}\), where \(x, y \in \mathcal{N}\), there is a directed arc running from node \(x\) to node \(y\), and \(x\) and \(y\) are said to be adjacent.

A labeled directed network may be defined as a triplet \((\mathcal{N}, \mathcal{A}, \mathcal{J})\), where \(\mathcal{N}\) is a finite set of nodes, \(\mathcal{J}\) is a finite set of relational types, and \(\mathcal{A} \subset \mathcal{N}^2 \times \mathcal{J}\). That is, if \((x, y; u) \in \mathcal{A}\), then there is an arc from \(x\) to \(y\), with the arc labeled as type \(u\). Now, the degree of a node is the number of arcs incident to that node, either incoming or outgoing. Let the maximum degree of \((\mathcal{N}, \mathcal{A}, \mathcal{J})\) be \(d\).

The type of automata used in this paper are quite simple mathematically, and may be imagined as little demons with exactly \(d\) "limbs" or tentacles radiating from a central cell. For example, \(d = 5\) in the figure below. Let \([d] = \{1, 2, 3, \ldots, d\}\); then an arbitrary numbering of the limbs can be established by associating each limb with an element of \([d]\). We will embed these automata in the network by placing one at each node. All the automata in the network are assumed to be identical. Henceforth, we shall refer to the automaton at node \(x\) as "automaton \(x\)".

[Figure: two automata \(x\) and \(y\), each with \(d = 5\) numbered limbs, illustrating \(\rho(x, 3) = (y, 5)\) and \(\rho(y, 3) = (y, 3)\).]
The limbs of each automaton are connected to the limbs of other automata in such a way that every arc \((x, y; u) \in \mathcal{A}\) corresponds to a connection between one limb of \(x\) and one limb of \(y\). More precisely, we define a mapping

\[ \rho : \mathcal{N} \times [d] \to \mathcal{N} \times [d], \]

which maps every limb into itself or some other limb. \(\rho(x, r) = (y, s)\) means that the \(r\)th limb of automaton \(x\) is connected to the \(s\)th limb of automaton \(y\), implying an arc between \(x\) and \(y\). Note that \(\rho\) composed with itself is the identity mapping, since \(\rho(\rho(x, r)) = \rho(y, s) = (x, r)\). Unless node \(x\) has degree \(d\), some of the limbs of \(x\) will remain unconnected. If \(\rho(x, r) = (x, r)\), we say that limb \((x, r)\) is not used, or "dead". In effect, it is connected to itself; by this device, \(\rho\) is defined over the entire set of limbs in the network. The purpose of the dead limbs is to make all the automata identical, regardless of the effective degree of their nodes.

Let \(\mathcal{J}\) be a set with the property that for every \(u \in \mathcal{J}\), there exists an inverse relation of type \(u^{-1} \in \mathcal{J}\); that is, for every \(x\) and \(y\) the arc \((x, y; u)\) is equivalent to the arc \((y, x; u^{-1})\). Thus the direction of an arc may be arbitrarily chosen. If \(u = u^{-1}\), the relation \(u\) is symmetric, and the arc may be considered undirected.

Let \(g_x = (g_x^1, \ldots, g_x^d)\) be the relational vector associated with automaton \(x\). \(g_x^r \in \mathcal{J}\) is the relation on the arc corresponding to the limb \((x, r)\), assuming this arc is directed away from \(x\). This means that if \(\rho(x, r) = (y, s)\) and \(g_x^r = u\), then \(g_y^s = u^{-1}\). Of course, dead limbs are assigned the null relation, which is itself taken to be an element of \(\mathcal{J}\); that is, if \(\rho(x, r) = (x, r)\), then \(g_x^r\) is the null relation.
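As a rough illustration of this construction, the sketch below builds the limb mapping and the relational vectors for a three-node network. Everything named here is an assumption made for the example: the function names `embed` and `inverse`, the `"^-1"` suffix used to write inverse relations, the use of `None` for the null relation, and the restriction to networks without self-loops whose degree does not exceed `d`.

```python
D = 3          # limbs per automaton (assumed maximum degree)
NULL = None    # stands in for the null relation on dead limbs

def inverse(label):
    """Return the inverse relation label, e.g. 'H' <-> 'H^-1'."""
    return label[:-3] if label.endswith("^-1") else label + "^-1"

def embed(nodes, arcs, d=D):
    """Build rho (limb -> limb) and g (node -> list of d relations)."""
    # Start with every limb dead, i.e. connected to itself.
    rho = {(x, r): (x, r) for x in nodes for r in range(1, d + 1)}
    g = {x: [NULL] * d for x in nodes}
    next_free = {x: 1 for x in nodes}
    for (x, y, u) in arcs:
        r, s = next_free[x], next_free[y]
        next_free[x] += 1
        next_free[y] += 1
        rho[(x, r)], rho[(y, s)] = (y, s), (x, r)   # connect the two limbs
        g[x][r - 1], g[y][s - 1] = u, inverse(u)    # g_x^r = u, g_y^s = u^-1
    return rho, g

nodes = ["RABBIT", "MAMMAL", "STOMACH"]
arcs = [("RABBIT", "MAMMAL", "E"), ("MAMMAL", "STOMACH", "H")]
rho, g = embed(nodes, arcs)

# rho composed with itself is the identity on every limb, dead or not.
print(all(rho[rho[limb]] == limb for limb in rho))
```

Dead limbs fall out of the initialization: any limb that no arc claims keeps pointing at itself and keeps the null relation.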
We now have completely described the structure of a network of automata from a topological point of view. Next we must discuss the dynamics of this network under the condition that an automaton may undergo a "change of state" at successive discrete instants of time, \(t = 0, 1, 2, 3, \ldots\). Formally, an automaton is defined as a triple \((\mathcal{S}, \mathcal{H}, \varphi)\), where \(\mathcal{S}\) is the set of possible states of the automaton at any instant of time, \(\mathcal{H}\) is the set of possible inputs, and \(\varphi\) is a state transition function

\[ \varphi : \mathcal{S} \times \mathcal{H} \to \mathcal{S}, \]

which determines the state of the automaton at the next instant of time, given its current state and input.

The state of automaton \(x\) at time \(t\) will be written \(e_{x,t}\) and will consist of a \(d\)-component vector \((e^1_{x,t}, e^2_{x,t}, \ldots, e^d_{x,t})\), where \(d\) is the number of limbs of the automaton. Thus a state will actually be a composite of \(d\) elementary states, one corresponding to each limb. In this dissertation, we shall have \(\mathcal{S} = \mathcal{E}^d\), where \(\mathcal{E}\) is a finite set of elementary limb states.

The input to automaton \(x\) at instant \(t\) will be written \(\lambda_{x,t}\). For each automaton \(x\), the time-invariant relational vector \(g_x\) will form one part of the input vector. The other part will be the vector \(f_{x,t} = (f^1_{x,t}, \ldots, f^d_{x,t})\), the limb states of all limbs connected to those of \(x\). In particular, if \(\rho(x, r) = (y, s)\) then

\[ f^r_{x,t} = e^s_{y,t}. \]

Verbally, the time-varying input to \(x\) will consist of a portion of the states of all node-automata adjacent to \(x\). Hence \(\lambda_{x,t} = (f_{x,t}, g_x)\), and \(\mathcal{H} = \mathcal{E}^d \times \mathcal{J}^d\). Using the above notation we have

\[ e_{x,t+1} = \varphi(e_{x,t}, \lambda_{x,t}). \]
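The update rule, in which each automaton's next state depends only on its own state, its relational vector, and the states of the limbs attached to its own limbs, can be sketched as a synchronous step function. This is a toy under stated assumptions: the names `step` and `spread`, the two limb states `"on"` and `"I"`, and the three-node chain are all invented for illustration, and `spread` is not one of the paper's transition functions.

```python
def step(states, rho, g, phi, d):
    """One synchronous transition: every automaton reads the limb states
    of its neighbors at time t, and all move to time t+1 together."""
    new_states = {}
    for x, e_x in states.items():
        f_x = []
        for r in range(1, d + 1):
            y, s = rho[(x, r)]            # limb (x, r) is attached to (y, s)
            f_x.append(states[y][s - 1])  # f_x^r = e_y^s
        new_states[x] = phi(e_x, (tuple(f_x), g[x]))
    return new_states

def spread(e_x, lam):
    """Toy transition function: turn every limb on once any own limb
    or any attached limb is on; otherwise stay idle."""
    f_x, _g_x = lam
    active = "on" in e_x or "on" in f_x
    return tuple("on" if active else "I" for _ in e_x)

D = 2
# Chain A - B - C; limbs (A, 2) and (C, 2) are dead (attached to themselves).
RHO = {("A", 1): ("B", 1), ("B", 1): ("A", 1),
       ("B", 2): ("C", 1), ("C", 1): ("B", 2),
       ("A", 2): ("A", 2), ("C", 2): ("C", 2)}
G = {"A": ("R", None), "B": ("R^-1", "R"), "C": ("R^-1", None)}

s0 = {"A": ("on", "on"), "B": ("I", "I"), "C": ("I", "I")}
s1 = step(s0, RHO, G, spread, D)
s2 = step(s1, RHO, G, spread, D)
print(s1["C"], s2["C"])
```

Because `step` builds the new states from a snapshot of the old ones, activity spreads exactly one arc per time unit, which is the sense in which interaction is local.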
These network automata differ from classical automata in three respects. The inputs of a classical automaton are usually symbols from an alphabet, and are distinct from the internal states of the automaton; here the states are equivalent to inputs. Also, a classical automaton is treated as an isolated entity, while here many automata are connected together. Finally, the classical automaton may have rudimentary storage such as a push-down stack, whereas the network automata, being much simpler, have no storage capabilities whatsoever.

Note that the sequence of states of each automaton in the network is determined solely by the states of its immediate neighbours. This is what is meant by the phrase "local interaction". If you think again in terms of the little demons, none of them realize that they are part of a network. All they know is that their tentacles are gripping those of a number of identical demons. If a demon's neighbours are in a happy state, he may be happy too. On the other hand, if he were just in a sad state, and his neighbours were happy, this may send him into a deeper depression.

Abandoning the metaphor, and returning to the mathematical formulation, such networks of finite automata have a remarkable property. If the set of states and the transition function are appropriately chosen, the automata can perform complex tasks involving properties of the network as a whole, in the absence of any global guidance. Rosenstiehl, Fiksel, and Holliger [1972] deal extensively with the problem-solving capabilities of automata in non-labeled graphs.
Below we demonstrate that automata are capable of performing question-answering in the special case of a labeled directed network. The node-automata's only communications with the terminal are at time 0, and at the end of their operation. This is tantamount to switching the network on and allowing it to change state until it switches itself off. The set of automata compute locally and in parallel, until they reach a stationary terminating state.

2.2 The Fundamental Algorithm.

The parallel processing operation of the network of automata requires communication with the control terminal at the initial and final stages. Only the input and output of the control terminal will be specified at present; nothing will be said about its internal computational procedures until Section 3. Let us postulate a \((d+1)\)st limb joining each automaton to the terminal, with possible limb states and inputs I, 0, 1, and 2. Ordinarily, the state of this limb and its input from the terminal are I (meaning "idle"). We shall say that the terminal sends a type-1 (type-2) excitation to automaton \(x\) if the input to the \((d+1)\)st limb of \(x\) becomes 1 (2) for a single instant of time, and then returns to state I. The purpose of these excitations will be made clear in Theorem 1, which is our fundamental theorem.

The remaining state, 0, is used by the terminal to control the activity of the network as a whole. The terminal can freeze the network by sending an input of 0 to the \((d+1)\)st limb of every automaton. When an automaton receives an input of 0, it ceases to perform state transitions, leaving its remaining \(d\) limbs in their current states. It remains frozen until the input changes to an I, whereupon it places all its limbs in state I. Thus, after freezing, the terminal can initialize the network into the idle state.

We shall now show in detail how the network of automata deals with a simple class of questions. Given a sequence \(q\) of relations and two nodes \(x\) and \(y\), the network is to investigate the existence of a path between \(x\) and \(y\) whose labeled arcs match that sequence of relations. When an arc labeled \(A\) is traversed in the backward direction, it appears in the sequence \(q\) as \(A^{-1}\). Let \(Q(\alpha, \mathcal{J})\) be the set of all relational sequences \(q = A_1 A_2 \cdots A_l\), where \(A_i \in \mathcal{J}\) and \(l \le \alpha\). Note that the \(A_i\) need not be distinct from one another. For convenience, let the upper bound \(\alpha\) be an even number. Because the length \(l\) is bounded by \(\alpha\), it is sufficient to take
[...] elementary limb states \(\mathcal{E}_C\), and let \(\mathcal{D}\) be an automaton with transition function \(\varphi_D\) and elementary limb states \(\mathcal{E}_D\). We say that \(\mathcal{D}\) is stacked upon \(\mathcal{C}\) if [...] so that for \(\mathcal{C}\) the state vector at node \(x\) is

\[ e_{x,t+1} = \varphi_C(e_{x,t}, \lambda_{x,t}), \]

whereas for \(\mathcal{D}\) the state vector at node \(x\) is

\[ \beta_{x,t+1} = \varphi_D(\beta_{x,t}, \lambda_{x,t}, e_{x,t+1}). \]

In other words, for its \((t+1)\)st transition, \(\mathcal{D}\) takes into account the state of \(\mathcal{C}\) at time \(t+1\), as well as its own state and input at time \(t\). Notice that \(\mathcal{D}\) is dependent upon \(\mathcal{C}\), but \(\mathcal{C}\) is independent of \(\mathcal{D}\). The stacked automata may be considered as a single automaton, the composition of \(\mathcal{C}\) and \(\mathcal{D}\). Stacking is useful when each node-automaton is called upon to perform several clearly separable functions.

Let [...] does not hinder the operation of \(\varphi_q\). The only difference is that when the network is frozen after event $, certain limbs are left with markers on them. Now, it is known that event $ occurred as a result of condition 4 being satisfied at automaton
[...] → FEATHERS

However, the relevant information may be encoded in the network in a different form, such as:

ROBIN --SUBSET-OF--> BIRD --HAS-AS-PARTS--> FEATHERS

What is necessary is a rule of inference which says: the sequence of relations SUBSET-OF, HAS-AS-PARTS is equivalent to the single relation HAS-AS-PARTS. Below we show how the search algorithm for the network of automata can be extended to incorporate exactly this sort of inference-making.

Returning to the formal theory, we define a production as an operator \(\theta\) on the set \(\mathcal{J}^*\) of relational sequences such that for a given \(a, b \in \mathcal{J}^*\), \(\theta a = b\). If \(p = c_1 a c_2\), where \(c_1, c_2 \in \mathcal{J}^*\), then the production may be applied to the subsequence \(a\), replacing it by \(b\); this operation is denoted by \(\theta p = c_1 b c_2\). Applying \(\theta\) to \(p\) thus results in a single replacement of \(a\) by \(b\) at some unspecified point in the sequence \(p\).

Let \(\Theta\) be a given set of "admissible" productions and let \(Q(\alpha, \mathcal{J})\) be defined as in Section 2.2, with \(\alpha\) any positive integer. Then the path \(p \in \mathcal{J}^*\) is called reducible to \(q \in Q(\alpha, \mathcal{J})\) with respect to \(\Theta\) if there exists a sequence \(\theta_1, \ldots, \theta_m\) in \(\Theta\) such that \(\theta_1 \cdots \theta_m\, p = q\). This is abbreviated \(p \xrightarrow{\Theta} q\), or if it is understood that \(\Theta\) is invariant, \(p \to q\). By convention, we say that \(q \to q\).
The productions are to be interpreted as semantic rules of inference concerning relations in the network. \(p \to q\) means that \(q\) may be inferred from \(p\) through the application of a series of these rules of inference. Thus, to ask a question \((x, q, y)\) is to ask whether any path \(p\) exists between \(x\) and \(y\) such that \(p \to q\). In the service of consistency, we can think of the question-answering algorithm presented in Theorem 1 as dealing with the case where \(\Theta\) is empty.

For present purposes, \(\Theta\) can be restricted to contain only elementary productions, defined as productions \(\theta\) of the form \(\theta(B_1 B_2) = B_i\), where \(B_1, B_2 \in \mathcal{J}\) and \(i = 1\) or \(2\). These can be written more concisely as \(B_1 B_2 \to B_i\). Of course, the symmetry of labeled arcs in \((\mathcal{N}, \mathcal{A}, \mathcal{J})\) requires that if \(B_1 B_2 \to B_1\) is a production in \(\Theta\), then \(B_2^{-1} B_1^{-1} \to B_1^{-1}\) also belongs to \(\Theta\). \(\Theta\) may be thought of as a set of ordered triples of relations in \(\mathcal{J}\): \(\Theta \subset \mathcal{J}^3\).
A set of productions \(\Theta\) is called associative if it satisfies the following conditions:

(i) If \(AB \to A\) is in \(\Theta\) and \(BC \to B\) is in \(\Theta\), then \(AC \to A\) is also in \(\Theta\).
(ii) If \(BA \to A\) is in \(\Theta\) and \(BC \to B\) is in \(\Theta\), then \(CA \to A\) is also in \(\Theta\).

The associative property implies that if \(p \xrightarrow{\Theta} A\), \(A \in \mathcal{J}\), then for every pair of consecutive relations \(A_1, A_2\) in \(p\) there is a production whose left-hand side is \(A_1 A_2\). The proof of this statement can easily be demonstrated by induction on the length of \(p\).
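The two closure conditions, read here as (i) AB → A and BC → B require AC → A, and (ii) BA → A and BC → B require CA → A, can be checked mechanically. The sketch below is only an illustration of that reading: the function name and the encoding of a production B1 B2 → Bi as a triple `(B1, B2, Bi)` are assumptions, and the separate requirement that inverse productions also belong to the set is not checked.

```python
def missing_for_associativity(productions):
    """Return the productions that conditions (i) and (ii) require but
    that are absent from `productions`, a set of (B1, B2, result) triples."""
    missing = set()
    keep_left = {(a, b) for (a, b, r) in productions if r == a}    # AB -> A
    keep_right = {(b, a) for (b, a, r) in productions if r == a}   # BA -> A
    for (a, b) in keep_left:                  # (i): AB -> A and BC -> B
        for (b2, c) in keep_left:
            if b2 == b and (a, c, a) not in productions:
                missing.add((a, c, a))        # then AC -> A is required
    for (b, a) in keep_right:                 # (ii): BA -> A and BC -> B
        for (b2, c) in keep_left:
            if b2 == b and (c, a, a) not in productions:
                missing.add((c, a, a))        # then CA -> A is required
    return missing

# E = element-of, C = subclass-of, H = has-as-parts.
theta = {("E", "C", "E"), ("E", "H", "H")}
print(missing_for_associativity(theta))
print(missing_for_associativity(theta | {("C", "H", "H")}))
```

With only EC → E and EH → H present, the check reports that CH → H is missing; adding it leaves nothing outstanding.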
The set of paths (sequences of nodes) in the network which can be interpreted as a YES answer to the question \((x, q, y)\) may now be written:

\[ P_\Theta(q) = \{\, p : p \text{ is a path between } x \text{ and } y,\ \theta_1 \cdots \theta_m\, p = q \text{ for some } \theta_1, \ldots, \theta_m \in \Theta \,\}. \]

In general, \(P_\Theta(q)\) can be extremely large, since the \(\theta_i\) need not be distinct. If cycling phenomena occur, \(P_\Theta(q)\) may even be infinite. However, provided that \(P_\Theta(q)\) is non-empty, the fundamental path-searching algorithm can be easily extended to find the shortest path in \(P_\Theta(q)\). This second major result is derived in the theorem below. Of course, if \(P_\Theta(q) = \emptyset\), the search process could continue forever; thus the terminal must fix some time limit after which, if no path has been found, it returns a NO answer.

The new "inferential search" algorithm requires an expanded set of limb states.
Define \(G\) to be the set of all combinations of distinct elements from the set of \(2(\alpha + 1)\) signals. In other words, \(G\) is the set of all unordered groups of signals \((S_1, S_2, \ldots, S_\mu)\), where

(i) each \(S_i\), \(i = 1, \ldots, \mu\), is a signal,
(ii) \(j \ne i \Rightarrow S_j \ne S_i\),
(iii) \(1 \le \mu \le 2(\alpha + 1)\).

From combinatorial theory,

\[ \operatorname{card} G = \sum_{\mu=1}^{2\alpha+2} \binom{2\alpha+2}{\mu} = 2^{2\alpha+2} - 1, \]

that is, one less than the sum of the binomial coefficients of order \(2\alpha + 2\). In practice we have never found the length of a question \(q\) to exceed 3. Hence, \(\alpha = 3\) seems like a reasonable upper bound, in which case \(\operatorname{card} G = 255\).
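The count can be confirmed by brute-force enumeration. The sketch below assumes, for illustration, that the signal alphabet consists of the pairs (k, i) with k running from 0 to α and direction i equal to 1 or 2, which gives the 2α + 2 signals that the formula presupposes; the function name is likewise hypothetical.

```python
from itertools import combinations

def signal_groups(alpha):
    """Enumerate every non-empty unordered group of distinct signals."""
    # Assumed signal alphabet: (k, i) for k = 0..alpha and i = 1, 2.
    signals = [(k, i) for k in range(alpha + 1) for i in (1, 2)]
    groups = []
    for mu in range(1, len(signals) + 1):
        groups.extend(combinations(signals, mu))
    return groups

alpha = 3
G = signal_groups(alpha)
print(len(G), 2 ** (2 * alpha + 2) - 1)   # prints "255 255"
```

The exponential growth in α is why a small bound on question length matters for keeping the set of limb states manageable.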
Let \(\mathcal{E}_1 = \{\omega, I\} \cup G\) be the set of limb states for the network. Then \(e^r_{z,t} = (S_1 S_2 \cdots S_\mu) \in G\) means that at time \(t\) the limb \((z, r)\) simultaneously transmits the signals \(S_1, S_2, \ldots, S_\mu\). Reception of a signal \(k_i\) is interpreted exactly as in Section 2.2. Notice that \(\mathcal{E}_0 \subset \mathcal{E}_1\), and that for \(\Theta = \emptyset\), \(P_\Theta(q) = Z(q)\). Thus the following theorem is in a precise sense a generalization of Theorem 1. However, the complexity of the state transition function must increase enormously, as evidenced by the fact that \(\operatorname{card} \mathcal{E}_0 = 8\), whereas \(\operatorname{card} \mathcal{E}_1 = 257\), for \(\alpha = 3\). As before, we define \(\mathcal{S}_1 = \mathcal{E}_1^d\) and \(\mathcal{H}_1 = \mathcal{E}_1^d \times \mathcal{J}^d\).

Theorem 2: Let \((\mathcal{N}, \mathcal{A}, \mathcal{J})\) be any labeled, directed network, and let \(\Theta\) be an associative set of elementary productions. Then to each question \((x, q, y)\) with \(x, y \in \mathcal{N}\) and \(q \in Q(\alpha, \mathcal{J})\), there corresponds a transition function \(\psi_q\) having the following property: Suppose that the automaton \((\mathcal{S}_1, \mathcal{H}_1, \psi_q)\) is embedded at each node of \(\mathcal{N}\), with limbs connected according to \(\rho\). Assuming that \(P_\Theta(q) \ne \emptyset\), let \(\sigma\) be the shortest path in \(P_\Theta(q)\), with length \(\nu\). Then after \(x\) and \(y\) are excited by the terminal the network of automata will verify the existence of \(\sigma\) within \(\nu/2\) time units.

Proof:
Again, the proof is constructive. The terminal initiates the search procedure at time \(t = 0\) by exciting \(x\) with a type-1 and \(y\) with a type-2 excitation. For any \(A \in \mathcal{J}\), define the two sets

\[ \mathrm{Pre}(A) = \{\, B \in \mathcal{J} : BA \to A \text{ is in } \Theta \,\} \]

and

\[ \mathrm{Post}(A) = \{\, B \in \mathcal{J} : AB \to A \text{ is in } \Theta \,\}. \]

Let \(q = A_1 \cdots A_n\) be a specific sequence of \(Q(\alpha, \mathcal{J})\). The function \(\psi_q\) is similar to \(\varphi_q\) except that it permits the sequence \(q\) to be interspersed with other relations according to the admissible productions. In effect, it employs the productions in reverse, expanding \(A_k\) whenever possible into \(A_k B\) or \(B A_k\) for each production of the form \(A_k B \to A_k\) or \(B A_k \to A_k\) in \(\Theta\). This is accomplished by allowing a \(k_i\) signal to be propagated along \(B\) arcs without being increased to \((k+1)_i\). The following are the transition rules for \(\psi_q\).
An alternative reading of each rule is obtained by substituting the phrases in parentheses.

1. If automaton \(x\) (\(y\)) receives a type-1 (type-2) excitation, then:
   (a) it places all limbs labeled \(A_1\) (\(A_n^{-1}\)) in state \(1_1\) (\(1_2\)),
   (b) for each \(B \in \mathrm{Pre}(A_1)\) (\(C \in \mathrm{Post}(A_n)\)), \(x\) (\(y\)) places all limbs labeled \(B\) (\(C^{-1}\)) in state \(0_1\) (\(0_2\)).
   This transition rule is illustrated in Figure 7.

2. If limb \((z, r)\) receives a \(k_1\) (\(k_2\)) signal, then:
   (a) \(z\) places all limbs labeled \(A_{k+1}\) (\(A_{n-k}^{-1}\)) in state \((k+1)_1\) (state \((k+1)_2\)),
   (b) if \((g_z^r)^{-1} \in \{A_k\} \cup \mathrm{Post}(A_k)\) [\(g_z^r \in \{A_{n-k+1}\} \cup \mathrm{Pre}(A_{n-k+1})\)], then for each \(B \in \mathrm{Pre}(A_{k+1}) \cup \mathrm{Post}(A_k)\) [\(C \in \mathrm{Pre}(A_{n-k+1}) \cup \mathrm{Post}(A_{n-k})\)], \(z\) places all limbs labeled \(B\) (\(C^{-1}\)) in state \(k_1\) (\(k_2\)),
   (c) if \((g_z^r)^{-1} \in \mathrm{Pre}(A_{k+1})\) [\(g_z^r \in \mathrm{Post}(A_{n-k})\)], then for each \(B \in \mathrm{Pre}(A_{k+1})\) [\(C \in \mathrm{Post}(A_{n-k})\)], \(z\) places all limbs labeled \(B\) (\(C^{-1}\)) in state \(k_1\) (\(k_2\)).
   (Parts (b) and (c) allow the \(k\) signal to propagate along any arcs which could vanish from the sequence \(p\) through the application of productions.)
   (d) whenever \(z\) is required by (a), (b), and (c) to place a limb in the states \(S_1, \ldots, S_\mu\), where \(1 < \mu \le 2(\alpha + 1)\), it places that limb in state \((S_1 \cdots S_\mu) \in G\).

3. If \(e^r_{z,t} \in G\), then \(e^r_{z,t+1} = I\). Rule 2 may supersede this
[Figure: sample semantic sub-network whose nodes include STOMACH, BIRD, WINGS, MYTHICAL CREATURE, MAMMAL, ROBIN, DEVIL, MOOSE, and EARS, joined by arcs labeled C, E, and H.]
[...] occur are far from obvious; Collins and Quillian (1969), among others, have attempted to determine experimentally the actual structure of such sub-networks. For the present we are concerned purely with illustration of the network operation, and not with claims about the detailed structure of memory.

Before a question is processed the entire network is re-initialized into the idle state I. A simple example of a question is "Is a bird an animal?", coded as (BIRD, \(C\), ANIMAL). The transition function \(\varphi_q\) would send signals \(1_1\) from BIRD and \(1_2\) from ANIMAL, and event $ would occur at the BIRD automaton due to crossing of signals (Rule 4b) after the first step.

A slightly more complex example is the question "Is there a mammal with horns?" The corresponding triple would be (MAMMAL, \(E^{-1}H\), HORNS), which will search for a path \(E^{-1}H\) between the nodes MAMMAL and HORNS. More explicitly, the question is: "Does there exist \(z\) such that \(z\) is an element of MAMMAL and \(z\) has-as-parts HORNS?" Figure 10 demonstrates the state transitions of the automata involved in answering this question. At time \(t = 0\), the terminal excites the two nodes MAMMAL and HORNS. The search signals then fan out along admissible paths (\(E^{-1}\) out of MAMMAL and \(H^{-1}\) out of HORNS) until the node MOOSE receives the two simultaneous, complementary signals \(1_1\) and \(1_2\). The terminal is immediately informed of the discovery of a solution path (event $), and it can return a YES answer, and identify the node MOOSE as a positive instance.

If BIRD were substituted for MAMMAL in the question, then no path satisfying \(E^{-1}H\) would have been found, since no bird has true horns (not even the owl). At time \(t = 2\), the terminal would return a NO answer, indicating a failure to retrieve a positive instance. In Chapter III we shall deal with stronger types of negation.
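The two-ended search in these examples can be imitated in a few lines. The sketch below is a serial stand-in for the parallel automata, under stated assumptions: the arc list is a guess at the sub-network consistent with the examples in the text, arcs are stored as (source, label, target), inverse relations are written with a `"^-1"` suffix, and the names `neighbors` and `answer` are hypothetical. Signals from x follow the first half of q, signals from y follow the second half backwards, and a node reached from both sides plays the role of event $.

```python
ARCS = {
    ("BIRD", "C", "ANIMAL"), ("MAMMAL", "C", "ANIMAL"),
    ("ROBIN", "E", "BIRD"), ("RABBIT", "E", "MAMMAL"),
    ("MOOSE", "E", "MAMMAL"), ("DEVIL", "E", "MYTHICAL CREATURE"),
    ("ANIMAL", "H", "STOMACH"), ("BIRD", "H", "WINGS"),
    ("MOOSE", "H", "HORNS"), ("MOOSE", "H", "EARS"),
    ("RABBIT", "H", "EARS"), ("DEVIL", "H", "HORNS"),
}

def inverse(label):
    return label[:-3] if label.endswith("^-1") else label + "^-1"

def neighbors(arcs, node, label):
    """Nodes reached from `node` by one arc read as `label`; an arc
    (s, u, t) reads as u when leaving s and as u^-1 when leaving t."""
    out = {t for (s, u, t) in arcs if s == node and u == label}
    out |= {s for (s, u, t) in arcs if t == node and inverse(u) == label}
    return out

def answer(arcs, x, q, y):
    """Return (YES/NO, meeting nodes) for the question (x, q, y)."""
    k = len(q) // 2
    front = {x}
    for label in q[:k]:                    # signals spreading out of x
        front = {m for z in front for m in neighbors(arcs, z, label)}
    back = {y}
    for label in reversed(q[k:]):          # signals spreading out of y
        back = {m for z in back for m in neighbors(arcs, z, inverse(label))}
    meet = front & back
    return ("YES" if meet else "NO"), meet

print(answer(ARCS, "MAMMAL", ["E^-1", "H"], "HORNS"))
print(answer(ARCS, "BIRD", ["E^-1", "H"], "HORNS"))
```

With these assumed arcs the mammal question meets at MOOSE, while the bird question finds no meeting node and so yields NO. This version matches q exactly; it does not apply productions.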
[Figure 10. State transitions for the question (MAMMAL, \(E^{-1}H\), HORNS): (a) excitation of the two nodes at \(t = 0\); (b) event $ at \(t = 1\).]
We can also introduce productions into the scheme, allowing elementary inferences in the question-answering procedure. The sentence "\(p \to q\)" can be interpreted as "the string of relations \(p\) implies the string of relations \(q\) between any two nodes." Two intuitively legitimate elementary productions would be:

\(\theta_1\): \(EC \to E\)   (X an element of Y, Y a subclass of Z implies X an element of Z)
\(\theta_2\): \(EH \to H\)   (X an element of Y, Y has-as-parts Z implies X has-as-parts Z)

In order for the set of admissible productions to be associative, \(\theta_1\) and \(\theta_2\) require a third production

\(\theta_3\): \(CH \to H\)   (X a subclass of Y, Y has-as-parts Z implies X has-as-parts Z)

Fortunately, \(\theta_3\) also seems intuitively correct.
The inferential algorithm described in Section 2.3 permits the application of these productions. For example, consider the question: "Does a robin have wings?" Symbolically, this can be represented as the triple (ROBIN, \(H\), WINGS). Using a production-sensitive transition function \(\psi_q\) for the path \(q = H\), the network of automata will display the path \(EH\) as a solution using the production \(\theta_2\): \(EH \to H\). The shortest path reducible to \(q\) is always the first one to be discovered, due to the parallel nature of the search. Even in an asynchronous network, the length of time required to find a path is roughly proportional to the length of that path.

Another example of applying a production is with the question, "Does a rabbit have a stomach?" which transforms to (RABBIT, \(H\), STOMACH), where \(q = H\) again. In this case, the only solution path found in the network of automata is \(p = ECH\). But \(p \to q\), since \(\theta_1 p = EH\), and \(\theta_2(\theta_1 p) = H\).
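The reduction of ECH to H can be reproduced by exhaustively applying elementary productions. This is a minimal sketch, assuming single-character relation labels and productions encoded as (B1, B2, result) triples; the function name `reducible` is hypothetical, and the search is a plain serial rewrite rather than the network algorithm.

```python
def reducible(p, q, productions):
    """True if the relation sequence p can be rewritten into q by
    repeatedly replacing an adjacent pair B1 B2 with B1 or B2."""
    p, q = tuple(p), tuple(q)
    seen, stack = {p}, [p]
    while stack:
        current = stack.pop()
        if current == q:               # also covers the convention q -> q
            return True
        for i in range(len(current) - 1):
            for (b1, b2, result) in productions:
                if current[i:i + 2] == (b1, b2):
                    shorter = current[:i] + (result,) + current[i + 2:]
                    if shorter not in seen:
                        seen.add(shorter)
                        stack.append(shorter)
    return False

THETA = {("E", "C", "E"), ("E", "H", "H"), ("C", "H", "H")}
print(reducible("ECH", "H", THETA))   # prints True
print(reducible("HE", "H", THETA))    # prints False
```

Each rewrite shortens the sequence by one relation, so the search always terminates. Because the three productions form an associative set, reducing EC first or CH first leads to the same result.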
Several other types of questions can be handled by such automata. "What mammal has no horns?" requests an instance of a node \(x\) for which there is a path \((x, E, \text{MAMMAL})\) but no path \((x, H, \text{HORNS})\). The terminal would symbolize this question as a modified triple, with the variable node \(x\) inserted at that point in the string where node identification is required, viz., \((\text{MAMMAL},\ E^{-1}\, x \not{H},\ \text{HORNS})\). The slash through the \(H\) indicates which portion of the path must be absent. The search algorithm would be the same as for the question \(q = E^{-1}H\), except that the rule for event $ becomes: any node \(x\) which receives a type-1 signal from MAMMAL, but no type-2 signal from HORNS after a finite time interval (say \(t = 4\)), notifies the terminal. For the subnetwork illustrated in Figure 10, \(x = \text{RABBIT}\) will be the only node satisfying this condition. Hence, the answer will be "rabbit", rather than YES or NO.

The network can also handle questions involving several component solution paths. For instance, consider the question: "Is there an animal with both horns and ears?" The terminal would decompose this into two queries: \((\text{ANIMAL},\ E^{-1}\, x\, H,\ \text{EARS})\) and \((x, H, \text{HORNS})\). Although the order of inquiry is arbitrary, the second component depends on the node identified as the answer to the first component. The initial answer path is \(C^{-1}E^{-1}H \to E^{-1}H\), so that \(x = \text{MOOSE}\) and \(x = \text{RABBIT}\) are both possible answers. The second component is processed by exciting both RABBIT and MOOSE with type-1 signals, and the solution is found to be MOOSE. The sequential mechanism seems psychologically plausible, since it is difficult to imagine asking ourselves two questions simultaneously. Multiple queries are treated more formally in the context of propositional information by Fiksel and Bower (in press).
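Both kinds of question, one with an absent path and one with two component paths, can be sketched with simple set operations. As before this is an illustration under assumptions: the arcs are a guess at the sub-network consistent with the examples, the helper names `members` and `has` are hypothetical, and `has` looks only at direct H arcs rather than parts inherited through productions.

```python
ARCS = {
    ("BIRD", "C", "ANIMAL"), ("MAMMAL", "C", "ANIMAL"),
    ("ROBIN", "E", "BIRD"), ("RABBIT", "E", "MAMMAL"),
    ("MOOSE", "E", "MAMMAL"), ("DEVIL", "E", "MYTHICAL CREATURE"),
    ("ANIMAL", "H", "STOMACH"), ("BIRD", "H", "WINGS"),
    ("MOOSE", "H", "HORNS"), ("MOOSE", "H", "EARS"),
    ("RABBIT", "H", "EARS"), ("DEVIL", "H", "HORNS"),
}

def members(arcs, cls):
    """Nodes that are elements of `cls`, directly or through its
    subclasses (the inference EC -> E applied along C arcs)."""
    classes, frontier = {cls}, {cls}
    while frontier:
        frontier = {s for (s, u, t) in arcs
                    if u == "C" and t in frontier} - classes
        classes |= frontier
    return {s for (s, u, t) in arcs if u == "E" and t in classes}

def has(arcs, x, part):
    """True if x has a direct has-as-parts arc to `part`."""
    return (x, "H", part) in arcs

# "What mammal has no horns?": E path present, H path to HORNS absent.
hornless = {x for x in members(ARCS, "MAMMAL") if not has(ARCS, x, "HORNS")}

# "Is there an animal with both horns and ears?", answered in two stages.
with_ears = {x for x in members(ARCS, "ANIMAL") if has(ARCS, x, "EARS")}
with_both = {x for x in with_ears if has(ARCS, x, "HORNS")}

print(hornless, with_both)
```

The second stage filters only the candidates produced by the first, mirroring the sequential treatment in which the nodes identified by the first component are the ones excited for the second.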
CHAPTER III
3.1
First~OrderLanguage·Questions.
Referring again to Figure 1 of Section 1.2, we shall now consider possible candidates for the formal language used as input to the control terminal. So far, questions put to the terminal have been phrased in terms of finding specified paths between nodes. However, in order to interact properly with the semantic processor, the terminal must have the ability to analyze questions that are represented in a precise formal language, and to translate them into path-searching procedures that are executable by the network of automata. The computational strategy employed by the node-automata has been sufficiently developed in Chapter II; in this chapter the supervisory role of the terminal will be expanded, allowing the system to answer a broad, well-defined range of sophisticated questions.
A suitable beginning choice for a formal language is the well-known first-order language of predicate calculus. We shall use a simple version in which the atomic sentences all consist of a binary relation between two concept-names. The non-logical vocabulary of this language will be a set $\mathcal{N}$ of nodes in a semantic network, and a set of binary relations, corresponding to the set $\mathcal{A}$ of arc-labels. A possible realization for this language is the real world model implied by the semantics of the vocabulary. The semantic network serves as a representation for this real world model, so that the truth of any first-order language sentence could be tested by consulting the information stored in the network. Our purpose is to determine what classes of sentences can be successfully tested using finite computational procedures.
Formally speaking, let $L = (\mathcal{N}, \mathcal{A}, \mathcal{J})$ be a labeled, directed network, as defined in Section 2.1. Let $\mathcal{L}$ be a first-order language with equality, having $\mathcal{N}$ as the set of individual constants. The set of binary relations in $\mathcal{L}$ is isomorphic to the set $\mathcal{A}$, and we shall use the same symbol to designate corresponding members of both sets. In general, if $R \in \mathcal{A}$ then the binary relation $R$ holds (in the semantic model $L$) between every pair $a, b \in \mathcal{N}$ such that $(a, b; R) \in \mathcal{J}$. This would be written in $\mathcal{L}$ as the atomic sentence $aRb$, or $bR^{-1}a$, since $R^{-1}$ denotes the inverse relation of $R$. Sentences of $\mathcal{L}$ can be constructed from atomic sentences using the existential ($\exists$) and universal ($\forall$) quantifiers and the usual logical connectives $\wedge$ (and), $\vee$ (or), $\sim$ (not), $\rightarrow$ (implication), and $\equiv$ (if and only if). A sentence of $\mathcal{L}$ can be considered a question to be answered in the sense that it is either true or false with respect to the semantic network model. We now give a complete characterization of $\mathcal{L}$:
Def: A term of $\mathcal{L}$ is any variable ($x, y, z$, etc.) or any individual constant from the set $\mathcal{N}$ ($a, b, c$, etc.).
Def: An atomic formula is any expression $v_1 R v_2$ where $R \in \mathcal{A} \cup \{=\}$, and $v_1, v_2$ are terms.
Def: A formula is any expression derived from the following rules:
1. Every atomic formula is a formula.
2. If $\mathcal{F}$ and $\mathcal{G}$ are formulae, so are $\sim\mathcal{F}$, $\mathcal{F} \wedge \mathcal{G}$, $\mathcal{F} \vee \mathcal{G}$, $\mathcal{F} \rightarrow \mathcal{G}$, and $\mathcal{F} \equiv \mathcal{G}$.
3. If $\mathcal{F}$ is a formula and $x$ a variable, then $(\exists x)\mathcal{F}$ and $(\forall x)\mathcal{F}$ are formulae.
Def: A sentence of $\mathcal{L}$ is any formula which contains no free variables. A variable is free if it is not in the scope of a quantifier.
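The definitions of term, atomic formula, formula, and sentence can be made concrete with a small Python sketch. The class names (`Var`, `Const`, `Atom`, `Not`, `BinOp`, `Quant`) and the string tags for connectives and quantifiers are hypothetical choices made for illustration; the sketch only shows how the recursive definition yields a test for "contains no free variables".

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:          # a variable such as x, y, z
    name: str


@dataclass(frozen=True)
class Const:        # an individual constant, i.e. a node name
    name: str


Term = Union[Var, Const]


@dataclass(frozen=True)
class Atom:         # atomic formula: left rel right
    left: Term
    rel: str
    right: Term


@dataclass(frozen=True)
class Not:          # negation of a formula
    body: "Formula"


@dataclass(frozen=True)
class BinOp:        # op is one of "and", "or", "implies", "iff"
    op: str
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:        # kind is "exists" or "forall"
    kind: str
    var: Var
    body: "Formula"


Formula = Union[Atom, Not, BinOp, Quant]


def free_vars(f):
    """Names of variables not in the scope of a quantifier binding them."""
    if isinstance(f, Atom):
        return {t.name for t in (f.left, f.right) if isinstance(t, Var)}
    if isinstance(f, Not):
        return free_vars(f.body)
    if isinstance(f, BinOp):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Quant):
        return free_vars(f.body) - {f.var.name}
    raise TypeError(f"not a formula: {f!r}")


def is_sentence(f):
    """A sentence is a formula with no free variables."""
    return not free_vars(f)
```

For instance, `Atom(Var("x"), "H", Const("HORNS"))` is a formula but not a sentence, whereas wrapping it in `Quant("exists", Var("x"), ...)` gives a sentence.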
Apossible.realization for the first-order language pair
'ti(=
(D; v>
where
D
=
'1(
U / ( and
every element .of the non-logical vocabulary
v
0 S}.
from
From Proposition 1, if
aRb
is ~roof:
nodes
is decidable by
with the set of productions
By axioms l(b) and 3, aRb
there exists a path. p
Corollary 3:
aRb
a = b
The required procedure consists of
asking .thequestion. (a,R, b).,
e
and
a
to
b
such that
is decidable so is
iff.
e R. p....,......,.. ~(aRb).
Corollary 3: Any sentence of $\mathcal{L}$ which contains no quantifiers is decidable.
Proof: If a sentence contains no quantifiers, it contains no variables (or else the variables would be free). From the definition of a formula, the sentence must consist of atomic sentences joined by logical connectives. Hence, by Proposition 1 and Theorem 3, it is decidable.
In order to prove our first major result we require the following lemma.
Lemma 1: $(\exists x)(xRa)$ and $(\exists x)(\sim xRa)$ are decidable, where $a \in \mathcal{N}$ and $R \in \mathcal{A}$.
Proof: (i) The inferential algorithm for deciding $(bRa)$ is modified as follows: node $a$ sends type-2 signals searching for a path $p$ such that $p \Rightarrow R^{-1}$. Any node $x$ that receives a type-2 signal satisfies $xRa$.
(ii) Let $X = \{x \in \mathcal{N} : xRa\}$ be the set of nodes contacted in (i). Then $(\exists x)(\sim xRa)$ iff $\mathcal{N} - X \neq \emptyset$.
Notice that axiom 1(c) excludes the possibility of cycling due to productions; thus $X$ is found after a finite time.
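The set construction in Lemma 1 can be sketched in Python. This is an illustration under a simplifying assumption: the relation is given as an explicit finite set of arcs that is already closed under the productions, so the type-2 signal search is replaced by a direct lookup. The node names, arc data, and function names are hypothetical.

```python
# Hypothetical toy data; "H" is read as "has".
NODES = {"MOOSE", "RABBIT", "HORNS", "EARS"}
ARCS = {
    ("MOOSE", "H", "HORNS"),
    ("MOOSE", "H", "EARS"),
    ("RABBIT", "H", "EARS"),
}


def related_to(nodes, arcs, rel, a):
    """X = {x in nodes : x rel a}, the nodes a search outward from a would contact."""
    return {x for (x, r, y) in arcs if r == rel and y == a and x in nodes}


def exists_related(nodes, arcs, rel, a):
    """Decide (exists x)(x rel a): true iff X is non-empty."""
    return bool(related_to(nodes, arcs, rel, a))


def exists_unrelated(nodes, arcs, rel, a):
    """Decide (exists x)(not x rel a): true iff the complement of X is non-empty."""
    return bool(nodes - related_to(nodes, arcs, rel, a))


print(related_to(NODES, ARCS, "H", "HORNS"))
print(exists_unrelated(NODES, ARCS, "H", "HORNS"))
```

Because the node set is finite, both tests terminate, which is the point of part (ii) of the proof.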
Theorem 4: Let $\mathcal{F}(x)$ be a formula with $x$ as the only free variable, containing no quantifiers. Then the sentence $(\exists x)\mathcal{F}(x)$ is decidable.
Proof: $\mathcal{F}(x)$ can be expressed in disjunctive normal form as $\mathcal{F}(x) = \mathcal{M}_1 \vee \mathcal{M}_2 \vee \cdots \vee \mathcal{M}_m$, where $\mathcal{M}_i = \bigwedge_{j=1}^{n_i} \mu_{ij}(x)$, $i = 1, \ldots, m$, and the $\mu_{ij}(x)$ are either atomic formulae or their negations, involving only $x$ and individual constants. Then $(\exists x)\mathcal{F}(x) \equiv \bigvee_{i=1}^{m} (\exists x)\mathcal{M}_i$, so that by Proposition 1 that sentence is decidable if $(\exists x)\mathcal{M}_i$ is decidable for each $i$.
In what follows we shall suppress the subscript $i$. Let $\mathcal{M} = \bigwedge_{j=1}^{n} \mu_j(x)$. Ignoring negations, the $\mu_j(x)$ fall into three possible types:
1. $xRx$
2. $xRa$ or $aRx$
3. $aRb$
where $a, b \in \mathcal{N}$ and $R \in \mathcal{A}$. Let $\mathcal{M}_k$ conjoin all $\mu_j(x)$ of type $k$, $k = 1, 2, 3$. Since the type 3 formulae do not depend upon $x$, they can be taken outside the scope of the quantifier. Hence $(\exists x)\mathcal{M} = \mathcal{M}_3 \wedge (\exists x)(\mathcal{M}_1 \wedge \mathcal{M}_2)$. Again, by Proposition 1, the sentence is decidable if both $\mathcal{M}_3$ and $P = (\exists x)(\mathcal{M}_1 \wedge \mathcal{M}_2)$ are decidable. Since $\mathcal{M}_3$ contains no quantifiers, it is decidable by Corollary 3.
Consider the type 1 formula $xRx$. If $R = C$ or $C^{-1}$, by axiom 1(a) the formula is true for all $x$. On the other hand, if $R \notin \{C, C^{-1}\}$, the formula is false for all $x$, by axiom 2. Therefore, either $\mathcal{M}_1$ is true, in which case $P$ becomes $(\exists x)\mathcal{M}_2$, or $\mathcal{M}_1$ is false, in which case the expression $P$ is false.
It remains to show that $(\exists x)\mathcal{M}_2$ is decidable. We can rewrite $aRx$ as $xR^{-1}a$, so that $\mathcal{M}_2 = \bigwedge_{j=1}^{p} \mu_j(x)$, where each $\mu_j(x)$ is $xR_ja_j$ or its negation, $a_j \in \mathcal{N}$ and $R_j \in \mathcal{A}$, $j = 1, \ldots, p$. If $p = 1$, $(\exists x)\mathcal{M}_2$ is decidable by Lemma 1. As in the proof of that lemma, let $X_1 = \{x \in \mathcal{N} : \mu_1(x)\}$. For any finite $p \geq 2$, $(\exists x)\mathcal{M}_2$ is decidable by the following procedure:
A. Find $X_1$ as defined above, and set $j = 2$.
B. (recursive) Find $X_j = \{v \in X_{j-1} : \mu_j(v)\}$. This set is determined by asking the question $(\exists v)\mu_j(v)$ as in Lemma 1 and admitting only those solution nodes which are already in $X_{j-1}$. (We assume that the terminal can "mark" a specific set of nodes for subsequent comparison; notice $X_j \subseteq X_{j-1}$.)
(Exit rule) If $X_j = \emptyset$ for any $j$, $(\exists x)\mathcal{M}_2$ is
false. On the other hand, i f (i)
if
j
p
f (x) , \;fa H
R (ii).
~
axiom lea).
;C(x) =
(3
by axiom 8
x). (>.xRa).
R = C -7 'f(x) • \;fa
by axioms 7, 10, 11
c-l~ y' (x)., Va by axiom 10
R
is equivalent to (i) with
R = H R =
-1 H is equivalent to (i) with
R= H
is equivalent to (i). with
--1 R= H is equivalent to (i). with
H --1 H H -1 H
The proofs of all the above assertions are easy.
Thus, Theorem 4 still
holds in the case of the animal taxonomy network.
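The marking procedure used in the proof of Theorem 4 (find $X_1$, then repeatedly keep only the marked nodes that also satisfy the next conjunct, stopping as soon as the marked set is empty) can be sketched in Python. As a simplifying assumption, relations are stored as explicit arcs already closed under the axioms, and each conjunct is encoded as a hypothetical triple `(rel, a, positive)` standing for $xRa$ or its negation. The toy taxonomy data are invented for illustration.

```python
# Hypothetical toy taxonomy; "C" is read as "is a kind of", "H" as "has".
NODES = {"MOOSE", "RABBIT", "TROUT", "ANIMAL", "EARS", "HORNS"}
ARCS = {
    ("MOOSE", "C", "ANIMAL"), ("RABBIT", "C", "ANIMAL"), ("TROUT", "C", "ANIMAL"),
    ("MOOSE", "H", "EARS"), ("RABBIT", "H", "EARS"),
    ("MOOSE", "H", "HORNS"),
}


def satisfying(candidates, arcs, literal):
    """Members of `candidates` satisfying one conjunct (rel, a, positive)."""
    rel, a, positive = literal
    related = {x for (x, r, y) in arcs if r == rel and y == a}
    return candidates & related if positive else candidates - related


def decide_exists(nodes, arcs, literals):
    """Decide (exists x)(mu_1(x) and ... and mu_p(x)) for p >= 1.

    Returns (truth value, set of witnesses).
    """
    marked = satisfying(nodes, arcs, literals[0])     # step A: X_1
    for literal in literals[1:]:                      # step B: X_j from X_{j-1}
        if not marked:                                # exit rule: empty set, stop
            break
        marked = satisfying(marked, arcs, literal)
    return bool(marked), marked


# "Is there an animal with ears but no horns?"
print(decide_exists(NODES, ARCS, [("C", "ANIMAL", True),
                                  ("H", "EARS", True),
                                  ("H", "HORNS", False)]))
```

Each pass can only shrink the marked set, so the loop performs at most $p$ lookups over a finite node set.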
We shall give an example of the question-answering procedure for a sentence with one quantifier:
Example: "Here is a riddle: What warm-blooded animal has wings, cannot fly, but can swim?"
The corresponding sentence in $\mathcal{L}$ is:
$(\exists x)(x\,C\,\text{an} \wedge x\,H\,\text{warm-blood} \wedge x\,H\,\text{wings} \wedge \sim x\,H\,\text{flight} \wedge x\,H\,\text{swimming})$
The relevant portion of $(\mathcal{N}, \mathcal{A}, \mathcal{J})$ is shown in Figure 12. By Thm. 4, the procedure for deciding whether such a creature exists is as follows:
A. $X_1 = \{x : x\,C\,\text{an}\}$.
B. (1)
(]x)Q(x) "'(Vx)Q(x) ... (3x) ("'Q(x» Let
X = {x E
Then (ii)
'fL n
-
I2 n : "'Q(x)} X + 0 ... (:Ix) Q(x) •
Conversely, suppose
(3x)Q(x)
(Corollary 2)
is decidable.