PRODUCTION NOTE: University of Illinois at Urbana-Champaign Library, Large-scale Digitization Project, 2007.
TECHNICAL REPORTS

Technical Report No. 236

HWIM: A COMPUTER MODEL OF LANGUAGE COMPREHENSION AND PRODUCTION

Bertram Bruce
Bolt Beranek and Newman Inc.

March 1982

Center for the Study of Reading

The National Institute of Education
U.S. Department of Health, Education and Welfare
Washington, D.C. 20208

University of Illinois at Urbana-Champaign
51 Gerty Drive
Champaign, Illinois 61820

Bolt Beranek and Newman Inc.
50 Moulton Street
Cambridge, Massachusetts 02138
CENTER FOR THE STUDY OF READING

Technical Report No. 236

HWIM: A COMPUTER MODEL OF LANGUAGE COMPREHENSION AND PRODUCTION

Bertram Bruce
Bolt Beranek and Newman Inc.

March 1982

University of Illinois at Urbana-Champaign
51 Gerty Drive
Champaign, Illinois 61820

BBN Report No. 4800
Bolt Beranek and Newman Inc.
50 Moulton Street
Cambridge, Massachusetts 02138
Bill Woods, Bonnie Webber, Scott Fertig, Andee Rubin, Ron Brachman, Marilyn Adams, Phil Cohen, Allan Collins, and Marty Ringle contributed useful suggestions and criticisms to this paper. Cindy Hunt helped in the preparation. The system development and much of the writing was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by ONR under Contract No. N00014-75-C-0533. Additional work on the paper itself was supported by the National Institute of Education under Contract No. HEW-NIE-C-76-0116. Views and conclusions contained here are those of the author and should not be interpreted as representing the official opinion or policy of DARPA, NIE, the U.S. Government, or any other person or agency connected with them.
EDITORIAL BOARD

Paul Jose and Jim Mosenthal, Co-Editors

Harry Blanchard, Nancy Bryant, Larry Colker, Avon Crismore, Roberta Ferrara, Anne Hay, Asghar Iran-Nejad, Jill LaZansky, Ann Myers, Kathy Starr, Cindy Steinberg, William Tirre, Paul Wilson

Michael Nivens, Editorial Assistant
HWIM: A Computer Model of Language Comprehension and Production

Abstract

This paper discusses a computer natural language system called "HWIM," which accepts either typed or spoken inputs and produces both typed and spoken responses. HWIM is an example of a relatively complete language system in which one can see how the many components of language processing interact. The paper focuses on HWIM's discourse processing capabilities and on the integration of diverse types of knowledge for the purpose of comprehending and producing natural language.
People design computer programs to understand natural language for two reasons. One is to make computers more useful by facilitating communication between the computer and computer users. The other is that the design of such programs can inform theories of human language use. The program then becomes a model of human language comprehension or production. Decisions made in the design process in order to enhance efficiency, or simply to make the program succeed, suggest characteristics of the processes humans carry out when performing similar tasks. Also, the extent to which a theory of language use actually "works" when it is implemented as a computer model is one measure of its correctness. Moreover, the act of expressing a theory in a computer program forces us to be explicit and unambiguous about what the theory is. By examining the development, the structure, and the operation of computer programs that successfully use natural language, one may gain insights into human use of natural language, including reading and writing, comprehension and production, speaking and listening.

This paper discusses some general issues of language comprehension and production through examination of a complete natural language system called "HWIM" (for "Hear what I mean").
HWIM was developed over a five year period as the Bolt Beranek and Newman speech understanding system. It was designed to understand natural language (typed or spoken); to answer questions, perform calculations, and maintain a data base; to respond in natural language (typed and spoken); and to help in formulating more general models of language understanding. Some salient features of HWIM are the following:
o Both its inputs and outputs used a relatively rich grammar with a 1000 word vocabulary.

o Utterances were assumed to be part of an on-going dialogue, so that HWIM had to have a model of both the discourse and the user.

o the maintenance of dynamic models of the task, the user, the discourse, and specific facts about the world

o the use of such knowledge to constrain possible interpretations of utterances, thus increasing the likelihood of successful understanding

o the use of the same knowledge in generating both written and spoken responses

o a formalism for expressing response generation rules which provides a framework for the application of semantic and discourse level knowledge

This paper focuses on the component of the system embodying semantic and pragmatic knowledge. A fuller treatment of the system can be found in Bruce (in press) and in Woods, Bates, Brown, Bruce, Cook, Klovstad, Makhoul, Nash-Webber, Schwartz, Wolf, and Zue (Note 9). For discussions of other speech understanding systems, see Erman, Hayes-Roth, Lesser, and Reddy (1980); Lea (1980); and Walker (1978).

HWIM's Job Responsibilities

HWIM was designed to serve as an assistant for the manager of a travel budget. The task of the system was to assist the travel budget manager, helping to record the trips taken or proposed and to produce such summary information as the total money allocated to various budget items. In order to carry out its duties, HWIM had to converse with the manager. This required

o facilities for making the system's own knowledge state known to the travel budget manager

o inference mechanisms and data base structures that facilitate freer expression of commands and questions by the travel budget manager
Steps in Processing

Though its conversations were simplified relative to natural inter-personal conversation, much about the design of the system can be seen in a trace of its processing, and in this section we see such a trace. The sentences shown here would normally occur as part of an ongoing dialogue. While their interpretations and resulting responses might be significantly different in context, here can be seen the flow of control within the system, the types of interpretations produced, the inference capabilities, and response generation. The structure of HWIM is shown in Figure 1
(and discussed further in Appendix A).

When HWIM engages in a text dialogue, the system reduces to just two processes, Trip and Syntax. In that mode Syntax is a subprocess to Trip, though it may itself invoke Trip as a subprocess of its own to evaluate tests on arcs in the grammar. To see one possible flow of control (Figure 2), consider the sentence

Enter a trip for Jerry Wolf to New York.

Trip reads the input and does a morphological analysis on each word. It then calls Syntax to parse and interpret the sentence. During processing, Syntax may encounter a test, e.g., "Does 'Jerry Wolf' denote a known person?", which is to be performed by Trip. Control passes to Trip, which answers "yes," back to Syntax again. When the parse is complete, control returns to Trip. Ambiguity in the input can cause a question to be formulated for the travel budget manager. The manager's response would cause a return to Syntax, and so on to the end of the session.

Figure 1. The structure of HWIM.
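The Trip/Syntax control flow traced above can be sketched in modern Python (HWIM itself was a LISP system); the toy grammar, the `arc_test` callback, and all names here are illustrative assumptions, not HWIM's code:

```python
def syntax(words, arc_test):
    # Toy "grammar": an enter-trip command names the traveler after "for".
    i = words.index("for")
    name = " ".join(words[i + 1:i + 3])
    # A test on a grammar arc: Syntax calls back into Trip to ask
    # "Does 'Jerry Wolf' denote a known person?"
    if not arc_test(name):
        raise ValueError("unknown person: " + name)
    return {"command": "ENTER-TRIP", "TRAVELER": name}

def trip(sentence, known_people):
    words = sentence.rstrip(".").split()
    # Trip calls Syntax to parse; Syntax may invoke Trip's test mid-parse.
    return syntax(words, arc_test=lambda n: n in known_people)

print(trip("Enter a trip for Jerry Wolf to New York.", {"Jerry Wolf"}))
```

The callback stands in for the subprocess call by which Syntax has Trip evaluate tests on arcs.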
To see the steps HWIM takes in reading, comprehending, and then responding to some text, imagine that the travel budget manager types:

1. "Enter a trip for Jerry Wolf to New York."

   ...

   "When did Bill go to Mexico?"

6. "Ok."

The program makes a first pass on the input, converting the words as typed into words as they appear in its dictionary and removing punctuation. For example, "$23.16" would become "twenty three dollar-s and sixteen cent-s". In this case the modified word list is simply (WHEN DID BILL GO TO MEXICO).
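As a rough illustration of this first pass, here is a Python sketch; the small number-to-words helper is an assumption made for the sketch, standing in for HWIM's dictionary-driven expansion:

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def number_words(n):
    """Spell out 0-99; enough for the "$23.16" example."""
    if n < 20:
        return [ONES[n]]
    return [TENS[n // 10]] + ([ONES[n % 10]] if n % 10 else [])

def prepass(text):
    """Expand dollar amounts into dictionary words and strip punctuation,
    producing the word list handed to the parser."""
    def expand(m):
        dollars, cents = int(m.group(1)), int(m.group(2))
        return " ".join(number_words(dollars) + ["dollar-s", "and"]
                        + number_words(cents) + ["cent-s"])
    text = re.sub(r"\$(\d+)\.(\d\d)", expand, text)
    text = re.sub(r"[^\w\s-]", "", text)   # remove punctuation
    return text.upper().split()

print(prepass("When did Bill go to Mexico?"))
# ['WHEN', 'DID', 'BILL', 'GO', 'TO', 'MEXICO']
```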
For spoken inputs HWIM has to consider many possible word lists because the input is ambiguous. This could be viewed as analogous to problems faced by a person in reading (see Rumelhart, 1977; Woods, 1980b). For typed inputs, HWIM is a perfect decoder; thus, only one word list needs to be considered. In order to simplify the discussion, the example here will focus on a single word list.

Figure 2. One possible flow of control for a dialogue interchange.
The parser in HWIM uses a pragmatic grammar which contains significant static knowledge of the travel budget management domain. Collapsing the task specific knowledge into the grammar was viewed as an expedient to gain computational (processing) efficiency. It helped to constrain parses early and enabled simultaneous parsing and semantic interpretation. This is in contrast with the two phase method used in the LUNAR parser (Woods, Kaplan, & Nash-Webber, Note 10), which produced purely syntactic parse trees that were subsequently given to a semantic interpreter to transform into executable interpretations. More recently, techniques such as the use of cascaded ATN's, as in the RUS parser (Bobrow, Note 2; Woods, 1980a), have been developed to gain much of the efficiency of pragmatic or semantic grammars while maintaining a clean separation between general syntactic knowledge and task specific knowledge.

For the example sentence, the parser derives the syntactic structure:

SQ
   QADV WHEN
   SUBJ NPR BILL
   AUX TNS PAST
       VOICE ACTIVE
   VP V GO
      PP PREP TO
         NP NPR MEXICO
      ADV THEN

and, taking advantage of the semantic knowledge in the grammar, we get in addition:

(FOR: THE A0009 /
      (FIND: LOCATION (COUNTRY 'MEXICO)) : T ;
      (FOR: THE A0008 /
            (FIND: PERSON (FIRSTNAME 'BILL)) : T ;
            (FOR: ALL A0010 /
                  (FIND: TRIP (TRAVELER A0008)
                              (DESTINATION A0009)
                              (TIME (BEFORE NOW))) : T ;
                  (OUTPUT: (GET: A0010 'TIME]

The interpretation is a quantified expression which can (almost) be directly executed. Translated, it says, roughly: Find the location which has the country name "Mexico" and call it A0009. Find the person whose first name is "Bill" and call him A0008. Find, in turn, all trips whose traveler is A0008, whose destination is A0009, and which occurred prior to now; label each A0010 as it is enumerated, and output its specific time. The T's which appear in the interpretation are there in place of potential restrictions which might have been applied to the location, person, or trip classes being searched.

Before the interpretation can be executed it undergoes an optimization (see Reiter, 1976) and modification to match the data base representations.
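As a sketch of how such a quantified interpretation could be executed, here is a Python rendering (HWIM's command language was LISP) against a toy data base; the data base contents and helper names are invented, and the (TIME (BEFORE NOW)) restriction is elided:

```python
# Toy semantic-network data base; contents are illustrative only.
DB = {
    "LOCATION": [{"NAME": "MEXICO", "COUNTRY": "MEXICO"}],
    "PERSON":   [{"FIRSTNAME": "BILL", "LASTNAME": "WOODS"}],
    "TRIP":     [{"TRAVELER": "BILL WOODS", "DESTINATION": "MEXICO",
                  "TIME": "OCT 15-17 1975"}],
}

def find(cls, **links):
    """FIND: enumerate members of a class whose links match."""
    return [e for e in DB[cls]
            if all(e.get(k) == v for k, v in links.items())]

def for_the(candidates):
    """FOR: THE -- the description must denote exactly one item;
    otherwise HWIM would start a clarification dialogue."""
    if len(candidates) != 1:
        raise ValueError("description is not unique")
    return candidates[0]

# (FOR: THE A0009 / (FIND: LOCATION (COUNTRY 'MEXICO)) ...)
a0009 = for_the(find("LOCATION", COUNTRY="MEXICO"))
# (FOR: THE A0008 / (FIND: PERSON (FIRSTNAME 'BILL)) ...)
a0008 = for_the(find("PERSON", FIRSTNAME="BILL"))
# (FOR: ALL A0010 / (FIND: TRIP ...) ; (OUTPUT: (GET: A0010 'TIME)))
times = [a0010["TIME"]
         for a0010 in find("TRIP",
                           TRAVELER=a0008["FIRSTNAME"] + " " + a0008["LASTNAME"],
                           DESTINATION=a0009["NAME"])]
print(times)   # ['OCT 15-17 1975']
```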
In this example there is no entity called "LOCATION" in the semantic network data base which serves the purpose implied in the interpretation. There are cities, states, coasts, countries, etc., and there is a link called "LOCATION", but no concept with instances as we might expect. The interpretation thus references a fictitious node. The optimizer recognizes this and modifies the interpretation so that a search is done among all the possible "locations". In HWIM, this modification is specific to the fictitious "location" node referenced. More recent knowledge representation systems, such as KL-ONE, would formalize the ability to form abstractions from the more specific concepts (Brachman, Note 3).

The optimized interpretation is translated into to-be-executed commands and placed on a queue (Bobrow, Cohen, Klovstad, Webber, & Woods). The queue can contain previous queries or commands from the speaker as well as system initiated utterances; it represents a step towards a "demand model of discourse" (see the next section).

The (FOR: --) expression above is a LISP form which can be evaluated directly. The FOR: function manages quantification in the expected way. Here it expects a unique location and a unique person which match the respective descriptions. FOR: first looks for the place whose country name is "Mexico"; there is a single such place, i.e., the country, Mexico. However, there are 8 items in the data base representing people whose first name is "Bill." Given the apparent inconsistency between the data base and the command, FOR: is forced to take the initiative of the dialogue and ask the manager for help. The response generation program produces the question:

Do you mean Billy Diskin, Bill Huggins, Bill Levison, Bill Patrick, Bill Plummer, Bill Russell, Bill Merriam, or Bill Woods?

When the system asks a question (as in this example) it expects an answer. This means that "incomplete" utterances may be accepted which would otherwise have to be complete commands. Here the parser will allow a name as a complete utterance. In this context, the person types

Bill Woods.

This sentence undergoes a pre-pass to produce the word list (BILL WOODS), which the parser interprets as

(! THE A0011 PERSON ((FIRSTNAME BILL) (LASTNAME WOODS)))
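The uniqueness check behind this exchange can be sketched as follows (an illustrative Python re-rendering, not HWIM's code); the `ask` callback stands in for the response generation and the re-parse of the user's answer:

```python
# The eight data base entries matching the first name "Bill",
# taken from the dialogue above.
CANDIDATES = ["Billy Diskin", "Bill Huggins", "Bill Levison", "Bill Patrick",
              "Bill Plummer", "Bill Russell", "Bill Merriam", "Bill Woods"]

def resolve(candidates, ask):
    """FOR: THE expects a unique referent. When several candidates match,
    the main computation is suspended and the system takes the dialogue
    initiative, asking a clarification question; the (possibly
    "incomplete") answer narrows the set and the computation resumes."""
    if len(candidates) == 1:
        return candidates[0]
    question = ("Do you mean " + ", ".join(candidates[:-1])
                + ", or " + candidates[-1] + "?")
    answer = ask(question)            # the insertion sequence
    return resolve([c for c in candidates if answer in c], ask)

who = resolve(CANDIDATES, ask=lambda q: "Bill Woods")
print(who)   # Bill Woods
```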
The optimized interpretation is

(IOTA 1 A0011 /
      (FIND: PERSON (FIRSTNAME 'BILL) (LASTNAME 'WOODS)) : T)

This interpretation uses the operator, IOTA, a function that returns the one item in the class defined by the (FIND: --) expression which meets the restriction, T (i.e., no restriction). In this case there is exactly one item matching the description, namely BILL WOODS.

At this point, we are in the middle of what Schegloff (1972) has called an "insertion sequence." In the process of answering the user's information question, the system began executing the FOR: expression; HWIM's question to the user, plus the user's answer, constitute a question-answer sequence inserted with respect to that primary question. The execution of the FOR: expression is suspended for the duration of the insertion sequence, but can be resumed when the needed information is processed. Having now a unique DESTINATION and a unique TRAVELER, FOR: can find all trips whose TIME is (BEFORE NOW) and, for each, output its TIME.

Finding the trips can require considerable search in the data base. Checking TRAVELER requires only a specific search, but this is not so for DESTINATION and TIME. For example, the DESTINATION of a TRIP is computed from the DESTINATIONs of its component LEG/OF/TRIPs, and the DESTINATION of a LEG/OF/TRIP may be computed from the PURPOSE of the LEG/OF/TRIP, e.g., attending a conference. Information about trips, budgets, conferences, etc. is never assumed to be complete, since it reflects a changing real world. Furthermore, the range of askable questions precludes computing in advance all the implicit information. (The procedures that do these computations are discussed in Woods et al., Note 9.)

In this example there is only one trip that matches the description, and its time is printed out by the response generation programs:

Bill Woods's trip from Boston, Massachusetts to Juarez, Mexico was from October 15th to 17th, 1975.

Discourse Model

In any conversation there are many forces operating to determine what will be said next, including the memories, intentions, and perceptions of the participants in the conversation (Bruce, 1980, Note 4). A discourse model is an idealized representation of such forces as they are used by a listener in understanding an utterance and by a speaker in constructing one. (See also Reichman, 1978; Stansfield, 1974; Deutsch, 1974; Sidner, Note 8.)
The discourse context is an important factor in even the restricted domain of person-computer communication, but in many interactive natural language understanding systems there is no explicit "discourse model". Most artificial problem domains are deliberately constructed so that only a few interaction modes are sensible. For example, the person may only ask questions; the system may only answer the questions. This greatly simplifies determination of the speaker's intent, and hence, the full understanding of the utterance. There is also no need to maintain a data base of participants in a conversation (together with their presumed purposes, attitudes, concepts of the world, etc.) when the only other participant in the conversation is "the USER." Even in those cases where the problem definition permits a general discourse and the system is able to cope with it, the means of coping are rarely delineated, made explicit, and used to advantage.

One formulation of a model of modes of interaction and intents (Bruce, 1975c) was suggested by incremental simulations (see Woods & Makhoul, 1973) of dialogues between a travel budget manager and HWIM, in which a system builder played the part of an ideal HWIM (Bruce, 1975b). An intent is the purpose behind an utterance. Patterns of intents, such as user-ask-question, system-answer-question, user-enter-new-information, system-point-out-contradiction, user-make-editing-change, system-confirm-change, etc., constitute the modes of interaction. An augmented transition network (ATN) grammar was used to represent some of the common modes of interaction found in travel budget management dialogues. At any point, the manager could be seen as being in one of several states, e.g., trying to determine the consequences of a proposed trip, examining the budget, or entering new trip information. We considered several formulations of a discourse model in which these states would be written into a grammar, with a modified ATN parser that steps through the grammar on the basis of the input sentence structure and the then-current state of the discourse and data base. At any given state the parser can predict the most likely next intent, and hence such things as the class of likely verbs for the next utterance. This model was not used in the final version of the system, primarily because it was too rigid to function alone as a discourse model.

Another formulation of the user/discourse model involves the notion of demands made by participants in the dialogue.
(Both of these formulations are discussed more fully in Bruce, 1975b.) The latter formulation allows us to model how the computation of a response can be pushed down while a whole dialogue takes place to obtain some missing information, and how a computation can spawn subsequent expectations or digressions. Some elements of this demand model, together with other aspects of the discourse model, are discussed below:

(1) Demands: These are demands for service of some sort made upon the system by the user or by the system itself. They include such things as unanswered questions and contradictions which have been pointed out but not resolved in some way. An unanswered question is typically a demand with high priority. Demands of lower priority include such things as a notice by the system that the manager is over his budget; such a notice might not be communicated until after direct questions had been answered. This is an additional influence on the discourse structure.

(2) Counter-demands: These are questions the system has explicitly or implicitly asked the user (e.g., "This trip will put you over the budget" implicitly asks the user to review the budget, canceling budget items or adjusting cost estimates). The system can reasonably hold these as long as it does not expect that most counter-demands will be met, nor expect too strongly that they will be answered.

(3) Current discourse state: The discourse area of the data base also contains an assortment of items that define the current discourse context, including:

o LOCATION, a pointer to the current location of the speaker, e.g., the city

o TIME, a pointer to the current time and date

o SPEAKER, a pointer to the current speaker

o the last mentioned person, place, time, trip, budget, conference, etc.

There is also a representation of the current TOPIC (see Figure 3). This is the active focus of attention in the dialogue. It can be the actual budget, a hypothetical budget, a particular trip, or a conference. The current topic is used as an anchor point for resolving definite references and deciding how much detail to give in responses. It can also generate certain modes of interaction.
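The current-discourse-state items listed above can be sketched as a record (a modern Python dataclass; the field values and fallback behavior are illustrative assumptions, not HWIM's representation):

```python
from dataclasses import dataclass, field

@dataclass
class DiscourseState:
    LOCATION: str = "Cambridge"            # current location of the speaker
    TIME: str = "1976-03-15"               # current time and date
    SPEAKER: str = "travel budget manager" # current speaker
    last_mentioned: dict = field(default_factory=dict)  # person, trip, ...
    TOPIC: object = None                   # active focus of attention

    def resolve_definite(self, kind):
        """Anchor a definite reference ("that meeting") on the last
        mentioned item of the right kind, falling back to the topic."""
        return self.last_mentioned.get(kind, self.TOPIC)

state = DiscourseState(TOPIC="the actual budget")
state.last_mentioned["conference"] = "the ASA meeting"
print(state.resolve_definite("conference"))  # the ASA meeting
print(state.resolve_definite("trip"))        # the actual budget
```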
For example, if the manager says "Enter a trip," the system notes that the current topic has changed to an incompletely described trip. This results in demands that cause standard fill-in questions to be asked. If the manager wants to complete the trip description later, then the completion of the trip description becomes a low priority demand.

The system has a primitive one-queue implementation of the "demand model." This queue contains executable procedures which represent the speaker's previous queries and commands as well as commands initiated by the system to examine the consequences of its actions, give information to the user, or check for data base consistency. These procedures are related by functional dependencies and relative priorities. The major types of demands are the following:

o DO: means execute the specified command.

o TEST: means evaluate the form and answer "no" if NIL or "yes" otherwise.

o RESPOND: means give the user some information (which may or may not be part of an answer to a direct query).

o PREVENT: means monitor for a subsequent possible action and block its normal execution (as in "Do not allow more than three trips to Europe.").

Figure 3. A discourse state.
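The one-queue demand model with these four demand types can be sketched as a priority queue (a Python illustration; the priorities and handler bodies are stand-ins, not HWIM's procedures):

```python
import heapq

class DemandQueue:
    """Executable demands ordered by priority (lower number = higher
    priority); insertion order breaks ties."""
    def __init__(self):
        self._heap, self._n = [], 0

    def add(self, priority, kind, payload):
        heapq.heappush(self._heap, (priority, self._n, kind, payload))
        self._n += 1

    def run(self, execute):
        results = []
        while self._heap:
            _, _, kind, payload = heapq.heappop(self._heap)
            if kind == "DO":          # execute the specified command
                results.append(execute(payload))
            elif kind == "TEST":      # "no" if NIL, "yes" otherwise
                results.append("yes" if execute(payload) else "no")
            elif kind == "RESPOND":   # give the user some information
                results.append(payload)
            elif kind == "PREVENT":   # would install a monitor/blocker
                results.append("monitoring: " + payload)
        return results

q = DemandQueue()
q.add(2, "RESPOND", "You are over budget.")  # low priority notice
q.add(1, "DO", "enter-trip")                 # direct command runs first
results = q.run(execute=lambda cmd: cmd)
print(results)   # ['enter-trip', 'You are over budget.']
```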
Understanding

The process of understanding written natural language requires the integration of diverse knowledge sources. An optimal process must apply various types of knowledge in a balance that avoids over- or under-constraining the set of potential accounts of the input. Ultimately the understanding process should produce a meaning representation for the natural language input, whether written or spoken. This representation may be produced directly or via some contingent knowledge structures (Bobrow & Brown, 1975). The exact form of these structures varies from one system to another, but typically includes such things as parse trees. In HWIM, integration of knowledge sources for linguistic processing is accomplished by means of (1) an ATN grammar (see Figure 4) that incorporates much of the syntactic, semantic, and pragmatic knowledge in the system, (2) a semantic network (see Nash-Webber, 1975) that represents remaining linguistic and world knowledge, and (3) tests on arcs in the grammar used to link the two representations. These structures are used to produce the account of the input that serves as a representation of its meaning.

Figure 4. A small portion of the pragmatic grammar.
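How tests on arcs might consult the semantic network and discourse state can be sketched as follows; the network contents and test names are invented examples, echoing the questions discussed in the next section:

```python
# Toy semantic network; entries are illustrative.
NETWORK = {
    "PERSON": {"Jerry Wolf", "Bill Woods"},
    "PLACE":  {"San Francisco, California", "Juarez, Mexico"},
}

def arc_test(kind, phrase, context=()):
    """Return True if the arc may be taken, e.g. 'Does "Jerry Wolf"
    denote a known person?'"""
    if kind == "known-person":
        return phrase in NETWORK["PERSON"]
    if kind == "valid-place":
        return phrase in NETWORK["PLACE"]
    if kind == "in-context":   # "the registration fee" needs a conference
        return "conference" in context
    return False

print(arc_test("known-person", "Jerry Wolf"))                      # True
print(arc_test("known-person", "Jerry Woods"))                     # False
print(arc_test("in-context", "registration fee", {"conference"}))  # True
```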
The decision to include semantic knowledge in a grammar has both its merits and its defects. On the one hand, there is an obvious advantage in terms of efficiency when semantic knowledge can be applied early on, in the most direct way, to constrain possible interpretations of an utterance. Moreover, one avoids the seemingly unnecessary step of building a syntactic parse tree for an utterance, as in HWIM (see also Erman, Hayes-Roth, Lesser, & Reddy, 1980; Burton & Brown, Note 5). On the other hand, as the model of natural communication becomes more complex, more fluid, and more intricately interconnected, it is not clear to what extent the pragmatic grammar approach will work. Despite the fact that we could incorporate semantic and pragmatic knowledge in the grammar as if that knowledge were fundamentally no different from syntactic knowledge, no natural language system has yet approached the complexity and variety that human discourse exhibits. All of this points to the desirability of a unified linguistic processor, such as exists in the human use of language.

The inclusion of non-syntactic knowledge in the grammar was a consequence of having a limited domain of discourse. As such, the pragmatic grammar is not a complete knowledge source by itself for linguistic processing. It is, however, closely linked to the semantic network via tests on the arcs. These tests allow the grammar to capture more volatile constraints on the input, such as those provided by the discourse model or the factual data base. For example:

o Is "Jerry Woods" a valid name?

o Is "San Francisco, California" a valid place?

o What words can go with "speech" as a project descriptor (e.g., "understanding")?

o Does "What is the registration fee?" make sense in this context?

o Does "How much is in the budget?" make sense in this context?

o Does "When is that meeting?" make sense in this context?

o Does "November 15th" make sense in this context?

With the grammar encoding semantic and pragmatic, as well as syntactic, information, it is possible for the parser to build a procedural "meaning" representation as well as a purely syntactic one. The meaning representations are in a command language whose expressions are atomic, consisting of a functional operator applied to arguments, or compound, making use of a quantification operator (FOR:) applied to another expression. The operators in atomic expressions specify operations to be performed on the data base or interactions with the user.

The parser builds interpretations by accumulating, in registers, the semantic head, quantifier, and link-node pairs of what is being described in the sentence (see Woods et al., Note 9, for a more complete description). For example, the sentence

I will go to the ASA meeting.

yields an interpretation in the command language of the form

(FOR: THE A0018 /
      (FIND: MEETING (SPONSOR 'ASA)) : T ;
      (FOR: 1 A0019 /
            (BUILD: TRIP (TO/ATTEND A0018)
                         (TRAVELER SPEAKER)
                         (TIME (AFTER NOW))) : T ; T))

This interpretation is built up in the following way. A constituent describing a person is found at the start of the sentence. The parser transforms the pronoun "I" into the link-node pair (TRAVELER SPEAKER) and returns this as the interpretation of that constituent. The word "will" adds (TIME (AFTER NOW)) to the list of link-node pairs being accumulated. (The grammar does not accept constructions like "will have gone," so that "will" can always be interpreted as marking a future event.) The word "go," in the context of our travel domain, indicates that a trip is being discussed. Next, the constituent "the ASA meeting" produces

(TO/ATTEND (! THE Y MEETING ((SPONSOR ASA))))

(The ! indicates that a FOR: expression will have to be built as part of the interpretation.) The top level of the grammar has thus accumulated the link-node pairs (TIME (AFTER NOW)), (TRAVELER SPEAKER), and (TO/ATTEND (! THE Y MEETING ((SPONSOR 'ASA)))), with the semantic head TRIP. The appropriate action (in this case a BUILD:) is created, to produce

(BUILD: TRIP (TO/ATTEND (! THE Y MEETING ((SPONSOR 'ASA)))) (TRAVELER SPEAKER) (TIME (AFTER NOW)))

Finally, the necessary quantificational expressions are expanded around the BUILD: expression.
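The accumulation of link-node pairs into a BUILD: expression, as traced above, can be sketched in Python (tuples stand in for HWIM's LISP forms; the constituent dispatch is a deliberately tiny assumption made for the sketch):

```python
def interpret(constituents):
    """Accumulate link-node pairs from constituents, then wrap the
    appropriate action (here a BUILD:) around them."""
    pairs, head = [], None
    for c in constituents:
        if c == "I":
            pairs.append(("TRAVELER", "SPEAKER"))
        elif c == "will":
            pairs.append(("TIME", ("AFTER", "NOW")))  # always future
        elif c == "go":
            head = "TRIP"          # in the travel domain, a trip
        elif c == "the ASA meeting":
            pairs.append(("TO/ATTEND",
                          ("!", "THE", "MEETING", ("SPONSOR", "ASA"))))
    return ("BUILD:", head) + tuple(pairs)

print(interpret(["I", "will", "go", "the ASA meeting"]))
```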
Response Generation

Once a satisfactory theory for an utterance has been constructed, it must be acted upon by the system. Regardless of the type of action taken, some appropriate response should also be made. From the speaker's point of view the response should be explicit, concise, and easy to understand. The response may affect the speaker's way of describing entities in the domain as well as illuminate the capabilities and operation of the system.

The effects of generated responses are many. In a domain such as travel budget management there are many standard objects without names, e.g., "the budget item for two trips to a West Coast conference". Names chosen (constructed) by HWIM provide the manager with a handle to use and thus strongly influence the ways the objects are referred to subsequently. Careful construction of responses can also indicate to the manager the legitimate structures handled by HWIM and the paths to pursue. Responses also show the system's focus of attention. The manager can thus get a better idea, not only of the facts he seeks, but also of how well the system is understanding and where more clarification may be useful. Responses exhibit exactly the types of knowledge embodied in the program, and they give the manager confidence in its capabilities.

Types of Knowledge Used in HWIM

In order to carry out its role effectively, HWIM must have a large amount of knowledge in a readily accessible form. This knowledge is needed for understanding both spoken utterances and written sentences, carrying out commands, answering questions, and generating responses. Success in performing these tasks is evidence that the types of knowledge embodied in the system are sufficient for natural communication, at least at the level shown by sample dialogues. The history of the development of HWIM (Nash-Webber & Bruce, 1976) suggests that each category of knowledge is also necessary for natural communication to take place. The kinds of knowledge used by HWIM can be categorized as follows:

o Acoustic-Phonetic forms--Knowledge of phonemes and their relation to acoustic parameters.

o Phonological rules--Knowledge of phoneme clusters and pronunciation variations describable by rules.

o Lexical forms--Knowledge of words, their inflections, parts of speech and phonemic spellings. For example, "entered" is the past form of "to enter".

o Sentential forms--Knowledge of grammatical structures at the level of a phrase, clause, or sentence. For example, a sentence may start with a command verb, such as "schedule."

o Discourse forms--Knowledge of idealized discourse, e.g., what type of utterance is likely to be produced in a given state of the discourse.
Parameters
of
Semantic--Knowledge of how
are
words
related
speech
and
conveyed
structurally
and
vary
it
text input and speech or text communicative
This is
meaning.
used in parsing and in constructing responses.
situations
can
along a number of dimensions (see Rubin, 1980)
For and a successful communicator must use knowledge
the benefactive case for schedule should be
example,
or
In general,
output. used
communicative situation--HWIM's
operation is a function of the modality in which operates;
o
the
the
to
situation
filled by a person who will be taking the trip being
and
interpret
of
respond
appropriately. scheduled. o o
o World--Knowledge of specific trips, projects, budgets, and conferences. Whereas HWIM's semantic knowledge is essentially fixed, its world knowledge changes every time a trip is taken or money is shifted from one budget item to another.

o User--Knowledge about each manager, e.g., what budget groups they belong to and what they may know about the data base. For example, only certain users may be allowed to modify budget totals.

o On-going discourse--Knowledge of the current discourse state, e.g., the current topic and the objects available for anaphoric reference. This is used to relax constraints in the pragmatic grammar. For example, if a conference is under discussion, then "What is the fee?" is a meaningful utterance. If not, then the speaker would have to say something like, "What is the registration fee for the ACL conference?"

o Self--Knowledge about the system's own knowledge (often called "meta-knowledge"). One example is the knowledge that data on trips and budgets is marked to indicate whether it came from the travel budget manager or was generated by HWIM. Thus it knows when its own knowledge is incomplete. Another is that HWIM knows what elements constitute a complete description of an object, such as a trip.
o Task--Knowledge of the task domain, e.g., that a manager is concerned with maintaining an accurate record of all trips, taken or planned, and with staying within the budget.

o Strategic--Knowledge concerning the representation and use of other knowledge. This knowledge would be needed in even a "blank" system, i.e., one that had no word- or domain-specific knowledge.
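The use of on-going discourse knowledge to relax pragmatic constraints, as described above, can be sketched in a few lines. This is an illustrative modern Python sketch, not HWIM's actual code (which ran on a DEC PDP-10); the names DiscourseState and interpret_definite_np are invented for the example.

```python
# Illustrative sketch only: a definite noun phrase like "the fee" is accepted
# only when the on-going discourse state supplies an object that can fill it.

class DiscourseState:
    """Tracks objects currently available for anaphoric reference."""

    def __init__(self):
        self.salient = []  # most recently mentioned objects first

    def mention(self, obj):
        self.salient.insert(0, obj)

    def find(self, attribute):
        """Return the most salient object carrying the attribute, or None."""
        for obj in self.salient:
            if attribute in obj["attributes"]:
                return obj
        return None


def interpret_definite_np(noun, discourse):
    """Interpret e.g. 'the fee'. The requirement that every attribute be
    anchored to an explicitly named object is relaxed when the discourse
    state supplies a referent; otherwise the speaker must name one."""
    referent = discourse.find(noun)
    if referent is None:
        raise ValueError(
            f"'the {noun}' has no referent; say e.g. "
            f"'the {noun} for the ACL conference'")
    return referent


state = DiscourseState()
state.mention({"type": "conference", "name": "ACL",
               "attributes": {"fee", "location", "dates"}})

# With a conference under discussion, "What is the fee?" is meaningful:
print(interpret_definite_np("fee", state)["name"])  # prints ACL
```

With an empty discourse state the same question is rejected, mirroring the paper's point that the speaker would then need the fuller form of the question.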
Conclusion

HWIM represents a synthesis of a number of lines of research in areas such as parsing, inference, data base design, and generation. One reason for building a computer model of language understanding is increased precision: in the course of designing and debugging a computer program, one must attend to details that can be glossed over in less procedure-oriented approaches. For example, designers of computer models of speech act generation must resolve theoretical questions about the process that are glossed over in speech act definitions alone (Cohen & Perrault, 1979). A second reason for computer models is that the task of the system can be defined to require complete use of language (within a restricted domain, of course). Thus, one can see what components, each representing aspects of language theory, are needed and how they might interact. Winograd's (1972) blocks world program, which accepted typed questions, commands, and assertions, performed actions, and generated natural language responses, could be put in the latter category. There is value in both approaches; in fact, they tend to complement each other, with the first addressing specific design questions and the second showing what can and cannot be done. HWIM is in both categories, but its principal value probably lies more in the second than the first. To some extent, it shows what can be done in a fairly complete system that understands natural language (typed or spoken), answers questions and performs various actions, and responds in natural language (typed and spoken).

Problems that arose in the design of HWIM were precursors of those that are central issues in AI research today. For example, the speech act issues for HWIM are similar to those studied by Cohen (Note 6), Cohen and Perrault (1979), and Allen (Note 1). Questions of knowledge representation closely related to those faced in HWIM have been pursued by Brachman (1979), Schank and Abelson (1977), and Fahlman (1979). (Also see Brachman & Smith, 1980.) Language generation in a discourse context similar to HWIM's has been studied by McDonald (Note 7), for instance. Finally, the issue of interactions among syntax, semantics, and pragmatics is crucial in the work of many, including Bobrow (Note 2) and Woods (1980a).
The characteristics of HWIM reflect the goal of natural communication between a person and a computer assistant. Even in its limited domain, it illustrates the extent to which natural communication depends upon diverse kinds of knowledge in both communicants. The structure of HWIM can provide a useful framework for obtaining a better understanding of natural communication.

Reference Notes

1. Allen, J. A plan-based approach to speech act recognition (Tech. Rep. No. 131/79). Toronto: Department of Computer Science, University of Toronto, January 1979.

2. Bobrow, R. J. The RUS system (BBN Report No. 3878). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1978.

3. Brachman, R. J., Bobrow, R., Cohen, P., Klovstad, J., Webber, B. L., & Woods, W. A. Research in natural language understanding: Annual report (1 Sept. 78--31 August 1979). Cambridge, Mass.: Bolt Beranek and Newman Inc.

4. Bruce, B. C. Belief systems and language understanding (BBN Report No. 2973). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1975. (a)

5. Burton, R. R., & Brown, J. S. Semantic grammar: A technique for constructing natural language interfaces to instructional systems (BBN Report No. 3587). Cambridge, Mass.: Bolt Beranek and Newman Inc., May 1977.

6. Cohen, P. R. On knowing what to say: Planning speech acts (Tech. Rep. No. 118). Toronto: Department of Computer Science, University of Toronto, 1978.

7. McDonald, D. Natural language production as a process of decision making under constraint (MIT AI Lab Tech. Rep.). Cambridge, Mass.: Massachusetts Institute of Technology, 1980.

8. Sidner, C. L. Towards a computational theory of definite anaphora comprehension in English discourse (MIT AI Lab Tech. Rep. No. 537). Cambridge, Mass.: Massachusetts Institute of Technology, 1979.

9. Woods, W. A., Bates, M., Brown, G., Bruce, B., Cook, C., Klovstad, J., Makhoul, J., Nash-Webber, B., Schwartz, R., Wolf, J., & Zue, V. Speech understanding systems: Final technical progress report (30 October 1974--29 October 1976). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1976.

10. Woods, W. A., Kaplan, R. M., & Nash-Webber, B. L. The lunar sciences natural language information system: Final report (BBN Report No. 2378). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1972.

References
Bobrow, D. G., & Winograd, T. An overview of KRL, a knowledge representation language. Cognitive Science, 1977, 1, 3-46.

Bobrow, R. J., & Brown, J. S. Systematic understanding: Synthesis, analysis, and contingent knowledge in specialized understanding systems. In D. G. Bobrow & A. Collins (Eds.), Representation and understanding: Studies in cognitive science. New York: Academic Press, 1975.

Brachman, R. J. On the epistemological status of semantic networks. In N. V. Findler (Ed.), Associative networks: Representation and use of knowledge by computers. New York: Academic Press, 1979.

Brachman, R. J., & Smith, B. C. (Eds.). Special Issue on Knowledge Representation. SIGART Newsletter, No. 70, February 1980.

Bruce, B. C. Pragmatics in speech understanding. Proceedings of the Fourth International Joint Conference on Artificial Intelligence, Tbilisi, USSR, 1975. (b)

Bruce, B. C. Discourse models for language comprehension. American Journal of Computational Linguistics, 1975, 35, 19-35. (c)

Bruce, B. C. Analysis of interacting plans as a guide to the understanding of story structure. Poetics, 1980, 9, 295-311.

Bruce, B. C. Natural communication between person and computer. In W. Lehnert & M. Ringle (Eds.), Strategies for natural language processing. Hillsdale, N.J.: Erlbaum, in press.

Cohen, P. R., & Perrault, C. R. Elements of a plan-based theory of speech acts. Cognitive Science, 1979, 3, 177-212.

Deutsch, B. G. The structure of task oriented dialogues. In L. D. Erman (Ed.), Proceedings of the IEEE Symposium on Speech Recognition. Pittsburgh, PA: Carnegie Mellon University, 1974.

Erman, L. D., Hayes-Roth, F., Lesser, V. R., & Reddy, D. R. The Hearsay-II speech-understanding system: Integrating knowledge to resolve uncertainty. Computing Surveys, 1980, 12, 213-253.

Fahlman, S. E. NETL: A system for representing and using real-world knowledge. Cambridge, Mass.: MIT Press, 1979.

Nash-Webber, B. L. The role of semantics in automatic speech understanding. In D. G. Bobrow & A. Collins (Eds.), Representation and understanding: Studies in cognitive science. New York: Academic Press, 1975.

Nash-Webber, B., & Bruce, B. Evolving uses of knowledge in a speech understanding system. Proceedings of the COLING-76 Conference, Ottawa, Canada, June 1976.

Reichman, R. Conversational coherency. Cognitive Science, 1978, 2, 283-327.

Reiter, R. Query optimization for question-answering systems. Proceedings of the COLING-76 Conference, Ottawa, Canada, June 1976.

Rubin, A. D. A theoretical taxonomy of the differences between oral and written language. In R. J. Spiro, B. C. Bruce, & W. F. Brewer (Eds.), Theoretical issues in reading comprehension. Hillsdale, N.J.: Erlbaum, 1980.

Rumelhart, D. E. Toward an interactive model of reading. In S. Dornic (Ed.), Attention and performance VI. Hillsdale, N.J.: Erlbaum, 1977.

Schank, R. C., & Abelson, R. P. Scripts, plans, goals and understanding: An inquiry into human knowledge structures. Hillsdale, N.J.: Erlbaum, 1977.
Schegloff, E. A. Notes on a conversational practice: Formulating place. In D. Sudnow (Ed.), Studies in social interaction. New York: Free Press, 1972.

Stansfield, J. L. Programming a dialogue teaching situation. Unpublished doctoral dissertation, University of Edinburgh, 1974.

Walker, D. E. (Ed.). Understanding spoken language. New York: Elsevier North-Holland, 1978.

Winograd, T. Understanding natural language. New York: Academic Press, 1972.

Woods, W. A. Cascaded ATN grammars. American Journal of Computational Linguistics, 1980, 6, 1-12. (a)

Woods, W. A. Multiple theory formation in high-level perception. In R. J. Spiro, B. C. Bruce, & W. F. Brewer (Eds.), Theoretical issues in reading comprehension. Hillsdale, N.J.: Erlbaum, 1980. (b)

Woods, W. A., & Makhoul, J. Mechanical inference problems in continuous speech understanding. Proceedings of the Third International Joint Conference on Artificial Intelligence, Stanford, Calif., 1973.

APPENDIX A

The Structure of HWIM

The structure of HWIM is shown in Figure 1. Components of the system are shown in ovals, with thick arrows representing flow of control between components. Major data structures are shown in clouds, with thin arrows indicating data flow. The system is implemented on TENEX, a virtual memory time-sharing system for the DEC PDP-10. Each component exists in a separate TENEX process (called a "fork"). Among the reasons for this multiple-process job structure (see Woods et al., 1976) are that most of the components are so large, in terms of program and data requirements, that they cannot exist in the same TENEX address space with another component.

The Trip component is the one primarily discussed in this paper. It controls the system and is the major component for acting on and responding to an utterance. If it is given typed input, it calls Syntax to parse the sentence. Otherwise it calls Speech Understanding Control. Most of the other components perform single functions:

Dictionary Expander expands a dictionary of baseform pronunciations using a set of phonological rules (applied once at system loadup time).

Real Time Signal Acquisition (RTIME) digitizes and stores the speech signal.

Signal Processing (PSA) converts the speech signal into a parametric representation.

Acoustic-Phonetic Recognizer (APR) operates on the parametric representation of the speech signal to produce a set of hypotheses represented in a segment lattice.

Lexical Retrieval searches the expanded dictionary and matches phonetic pronunciations against portions of the segment lattice.

Verifier generates an idealized spectral representation of a word (or words) using a speech synthesis-by-rule program and matches it against the parametric representation of the speech signal in a given region.

Syntax judges the grammaticality of a given word sequence; predicts possible word extensions at each end of the sequence; and builds a formal representation of an utterance.

Interpreter (present in an early version only) takes a syntactic representation of an utterance and builds a procedural representation of the meaning.

Talker generates speech from phoneme-prosodic cue strings.
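The control flow described above can be summarized in a short sketch. This is a loose Python paraphrase of the appendix, not HWIM's implementation (which ran each component in its own TENEX fork); the function bodies are invented stand-ins for the real components.

```python
# Loose sketch of HWIM's top-level control flow as described in Appendix A.
# Each function stands in for a component that really ran in a separate
# TENEX fork; the bodies are placeholders, not the original code.

def syntax_parse(text):
    # Syntax: parse a typed sentence into a formal representation.
    return {"utterance": text, "formal": True}

def speech_understanding_control(signal):
    # Stands in for the speech path: RTIME digitizes and stores the signal,
    # PSA produces a parametric representation, APR builds a segment lattice
    # of hypotheses, Lexical Retrieval and the Verifier match pronunciations
    # against it, and Syntax judges the resulting word sequences.
    recognized = "<word sequence recovered from signal>"
    return {"utterance": recognized, "formal": True}

def trip(input_value, typed):
    """Trip controls the system: dispatch on input modality, then act on
    and respond to the resulting formal representation."""
    if typed:
        parse = syntax_parse(input_value)
    else:
        parse = speech_understanding_control(input_value)
    # ... interpretation, data base update, response via the Talker ...
    return parse

result = trip("Schedule a trip to the ACL conference.", typed=True)
print(result["formal"])  # prints True
```

The dispatch on modality mirrors the paper's point that typed input bypasses the entire acoustic front end and goes straight to Syntax.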