
CENTER FOR THE STUDY OF READING

Technical Report No. 236

HWIM: A COMPUTER MODEL OF LANGUAGE COMPREHENSION AND PRODUCTION

Bertram Bruce
Bolt Beranek and Newman Inc.

March 1982

University of Illinois at Urbana-Champaign
51 Gerty Drive
Champaign, Illinois 61820

BBN Report No. 4800
Bolt Beranek and Newman Inc.
50 Moulton Street
Cambridge, Massachusetts 02138

The National Institute of Education
U.S. Department of Health, Education and Welfare
Washington, D.C.

Bill Woods, Bonnie Webber, Scott Fertig, Andee Rubin, Ron Brachman, Marilyn Adams, Phil Cohen, Allan Collins, and Marty Ringle contributed useful suggestions and criticisms to this paper. Cindy Hunt helped in the preparation. The system development and much of the writing was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by ONR under Contract No. N00014-75-C-0533. Additional work on the paper itself was supported by the National Institute of Education under Contract No. HEW-NIE-C-76-0116. Views and conclusions contained here are those of the author and should not be interpreted as representing the official opinion or policy of DARPA, NIE, the U.S. Government, or any other person or agency connected with them.

EDITORIAL BOARD

Paul Jose and Jim Mosenthal, Co-Editors

Harry Blanchard, Asghar Iran-Nejad, Nancy Bryant, Jill LaZansky, Larry Colker, Ann Myers, Avon Crismore, Kathy Starr, Roberta Ferrara, Cindy Steinberg, Anne Hay, William Tirre, Paul Wilson

Michael Nivens, Editorial Assistant

Abstract

This paper discusses a computer model of language comprehension and production. The system, called "HWIM," accepts either typed or spoken inputs and produces both typed and spoken responses. HWIM is an example of a relatively complete natural language system in which one can see how the many diverse types of components of language processing interact. The paper focuses on HWIM's discourse processing capabilities and on the integration of diverse types of knowledge for the purpose of comprehending and producing natural language.

People design computer programs to understand natural language for two reasons. One is to make computers more useful by facilitating communication between the computer and computer users. The other is that the design of such programs can inform theories of human language use. The program then becomes a model of human language comprehension or production. Decisions made in the design process in order to enhance efficiency, or simply to make the program succeed, suggest characteristics of the processes humans carry out when performing similar tasks. Also, the extent to which a theory of language use actually "works" when it is implemented as a computer model is one measure of its correctness. Moreover, the act of expressing a theory in a computer program forces us to be explicit and unambiguous about what the theory is. By examining the development, the structure, and the operation of computer programs that successfully use natural language, one may gain insights into human use of natural language, including reading and writing, speaking and listening.

This paper discusses some general issues of language comprehension and production through examination of a complete natural language system called "HWIM" (for "Hear what I mean"). HWIM was developed over a five year period as the Bolt Beranek and Newman speech understanding system. It was designed to understand natural language (typed or spoken); to answer questions, perform calculations, and maintain a data base; to respond in natural language (typed and spoken); and to help in formulating more general models of language understanding. Both its inputs and outputs used a relatively rich grammar with a 1000 word vocabulary. Utterances were assumed to be part of an on-going dialogue, so that HWIM had to have a model of both the discourse and the user. Some salient features of HWIM are the following:

o  the maintenance of dynamic models of the task, the user, the discourse, and specific facts about the world

o  the use of such knowledge to constrain possible interpretations of utterances, thus increasing the likelihood of successful understanding

o  the use of the same knowledge both for interpretation and for generating written and spoken responses

o  a formalism for expressing response generation rules which provides a framework for the application of semantic and discourse level knowledge

This paper focuses on the component of the system embodying semantic and pragmatic knowledge. A fuller treatment of the system can be found in Bruce (in press) and in Woods, Bates, Bruce, Brown, Cook, Klovstad, Makhoul, Nash-Webber, Schwartz, Wolf, and Zue (Note 9). For discussions of other speech understanding systems, see Erman, Hayes-Roth, Lesser, and Reddy (1980); Lea (1980); and Walker (1978).

HWIM's Job Responsibilities

HWIM was designed to serve as an assistant for the manager of a travel budget. The task of the system was to converse with the travel budget manager, helping to record the trips taken or proposed and to produce such summary information as the total money allocated to various budget items. In order to carry out its duties, HWIM needed:

o  inference mechanisms and data base structures that facilitate freer expression of commands and questions by the travel budget manager

o  facilities for making the system's own knowledge state known to the travel budget manager

Steps in Processing

In this section we see a trace of HWIM's processing. The sentences shown here would normally occur as part of an ongoing dialogue. While their interpretations and resulting responses might be significantly different in context, what can be seen here is the flow of control within the system, the types of interpretations produced, the inference capabilities, and the response generation. Though the conversations were simplified relative to natural inter-personal conversation, the design of the system can be seen at work in such a trace.

The structure of HWIM is shown in Figure 1 (and discussed further in Appendix A). When HWIM engages in a text dialogue, the system reduces to just two processes, Trip and Syntax. In that mode Syntax is a subprocess to Trip, though it may itself invoke Trip as its own subprocess to evaluate tests on arcs in the grammar. To see one possible flow of control (Figure 2), consider the sentence

     Enter a trip for Jerry Wolf to New York.

Trip reads the input and does a morphological analysis of each word. It then calls Syntax to parse and interpret the sentence. During processing, Syntax may encounter a test, e.g., "Does the name 'Jerry Wolf' denote a known person?", which is to be performed by Trip. Control passes to Trip, which answers "yes," and then back to Syntax again. When the parse is complete, control returns to Trip. Ambiguity in the input can cause a question to be formulated for the travel budget manager. The manager's response would cause a return to Syntax, and so on to the end of the session.

Figure 1. The structure of HWIM.

To see the steps HWIM takes in reading, comprehending, and then responding to some text, imagine that the travel budget manager types:

     When did Bill go to Mexico?

The program makes a first pass on the input, converting the words as typed into words as they appear in its dictionary and removing punctuation. For example, "$23.16" would become "twenty three dollar-s and sixteen cent-s". In this case the modified word list is simply

     (WHEN DID BILL GO TO MEXICO)

For spoken inputs HWIM has to consider many possible word lists because the input is ambiguous. This could be viewed as analogous to problems faced by a person in reading (see Rumelhart, 1977; Woods, 1980b). For typed inputs, HWIM is a perfect decoder; thus, only one word list needs to be considered. In order to simplify the discussion, the example here will focus on a single word list.
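As an illustration only, the following Common Lisp sketch suggests the kind of first-pass normalization just described, under the assumption that stripping punctuation, upcasing, and splitting into symbols is all that is needed; the dollar-amount expansion ("$23.16" becoming "twenty three dollar-s and sixteen cent-s") would require a further table-driven step not shown here, and none of this is HWIM's actual code.

    ;; Hypothetical sketch of the first pass: strip punctuation, upcase,
    ;; and return a word list such as (WHEN DID BILL GO TO MEXICO).
    (defun normalize-input (string)
      (let ((clean (map 'string
                        (lambda (ch)
                          (if (alphanumericp ch) (char-upcase ch) #\Space))
                        string))
            (words '())
            (start 0))
        (loop
          (let ((end (position #\Space clean :start start)))
            (when (> (or end (length clean)) start)
              (push (intern (subseq clean start end)) words))
            (unless end (return (nreverse words)))
            (setf start (1+ end))))))

    ;; (normalize-input "When did Bill go to Mexico?")
    ;; => (WHEN DID BILL GO TO MEXICO)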

The parser in HWIM uses a pragmatic grammar which contains significant static knowledge of the travel budget management domain. Collapsing the task specific knowledge into the grammar was viewed as an expedient to gain computational (processing) efficiency. It helped to constrain parses early and enabled simultaneous parsing and semantic interpretation. This is in contrast with the two phase method used in the LUNAR parser (Woods, Kaplan, & Nash-Webber, Note 10), which produced purely syntactic parse trees that were subsequently given to a semantic interpreter to transform into executable interpretations. More recently, techniques such as the use of cascaded ATN's, as in the RUS parser (Bobrow, Note 2), have been developed (Woods, 1980a) to gain much of the efficiency of pragmatic or semantic grammars while maintaining a clean separation between general syntactic knowledge and task specific knowledge.

Figure 2. One possible flow of control for a dialogue interchange.

For the example sentence, the parser derives the syntactic structure

     SQ  QADV WHEN
         SUBJ NPR BILL
         AUX TNS PAST
         VOICE ACTIVE
         VP V GO
            PP PREP TO NP NPR MEXICO
            ADV THEN

and, taking advantage of the semantic knowledge in the grammar, we get in addition:

     [FOR: THE A0009 / (FIND: LOCATION (COUNTRY 'MEXICO)) : T ;
       (FOR: THE A0008 / (FIND: PERSON (FIRSTNAME 'BILL)) : T ;
         (FOR: ALL A0010 / (FIND: TRIP (TRAVELER A0008)
                                        (DESTINATION A0009)
                                        (TIME (BEFORE NOW))) : T ;
           (OUTPUT: (GET: A0010 'TIME]

The interpretation is a quantified expression which can (almost) be directly executed. Translated, it says roughly: Find the location which has the country name "Mexico" and call it A0009. Find the person whose first name is "Bill" and call him A0008. Find, in turn, all trips whose traveler is A0008, whose destination is A0009, and which occurred prior to now. As each is enumerated, label it A0010 and output its specific time. The T's which appear in the interpretation are there in place of potential restrictions which might have been applied to the location, person, or trip classes being searched.
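For readers who want the quantifier reading spelled out in runnable form, here is a minimal Common Lisp sketch of that nested enumeration over a toy, in-memory data base. HWIM's actual FOR:, FIND:, and OUTPUT: operators were LISP forms defined inside the system and are not reproduced in this report; the data layout and the find-items function below are invented purely for illustration.

    ;; Toy data base: each item is (class (property value) ...).
    (defvar *data-base*
      '((person   (firstname bill) (lastname woods) (id a0008))
        (location (country mexico) (id a0009))
        (trip     (traveler a0008) (destination a0009)
                  (time (october 15 1975)) (id a0010))))

    ;; Return every item of CLASS whose properties include all RESTRICTIONS.
    (defun find-items (class &rest restrictions)
      (remove-if-not
       (lambda (item)
         (and (eq (first item) class)
              (every (lambda (r) (member r (rest item) :test #'equal))
                     restrictions)))
       *data-base*))

    ;; "Find all trips whose traveler is A0008 and whose destination is
    ;; A0009, and output the time of each" -- the innermost quantifier of
    ;; the example (the (TIME (BEFORE NOW)) restriction is omitted).
    (dolist (trip (find-items 'trip '(traveler a0008) '(destination a0009)))
      (format t "~&TIME: ~A~%" (second (assoc 'time (rest trip)))))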

Before the interpretation can be executed it undergoes an optimization (see Reiter, 1976) and modification to match the data base representations. In this example there is no entity called "LOCATION" in the semantic network data base which serves the purpose implied in the interpretation. There are cities, states, coasts, countries, etc., and there is a link called "LOCATION", but no concept with instances as we might expect. The interpretation thus references a fictitious node. The optimizer recognizes this and modifies the interpretation so that a search is done among all the possible "locations". (In HWIM, this modification is specific to the fictitious "location" node referenced. More recent knowledge representation systems, such as KL-ONE, would formalize the ability to form such abstractions from the more specific concepts; Brachman, Note 3.)

The optimized interpretation is placed on a queue of to-be-executed commands. The (FOR: --) expression above is a LISP form which can be evaluated directly. The FOR: function manages quantification in the expected way. Here it expects a unique location and a unique person which match the respective descriptions. There is a unique "location" whose country name is "Mexico," i.e., the country Mexico. However, there are 8 items in the data base representing people whose first name is "Bill." Given this apparent inconsistency between the data base and the interpretation, the system is forced to take the initiative of the dialogue and ask the manager for help. The response generation program produces the question:

     Do you mean Billy Diskin, Bill Huggins, Bill Levison, Bill Patrick,
     Bill Plummer, Bill Russell, Bill Merriam, or Bill Woods?

When the system asks a question (as in this example), the next utterance is expected to be an answer. In most contexts commands are accepted and executed directly; however, "incomplete" commands may also be accepted (see the next section). In this context, the person types an answer, and here the parser will allow a name as a complete utterance:

     Bill Woods.

This sentence undergoes a pre-pass to produce the word list (BILL WOODS), which the parser interprets as

     (! THE A0011 PERSON ((FIRSTNAME BILL) (LASTNAME WOODS)))

The optimized interpretation is

     (IOTA 1 A0011 / (FIND: PERSON (FIRSTNAME 'BILL) (LASTNAME 'WOODS)) : T)

IOTA is an operator, a function that returns the one item in the class defined by the (FIND: --) expression which meets the restriction, T (i.e., no restriction). In this case there is exactly one item matching the description, namely BILL WOODS.
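A toy rendering of the IOTA idea, again only a sketch and not HWIM's code: return the single item satisfying a description, or signal the ambiguity that, as in the eight-Bills case above, forces a clarifying question. The find-items helper is the illustrative one sketched earlier.

    ;; Hypothetical IOTA: exactly one matching item, or an ambiguity marker.
    (defun iota-1 (candidates)
      (cond ((null candidates) nil)
            ((null (rest candidates)) (first candidates))
            (t :ambiguous)))

    ;; (iota-1 (find-items 'person '(firstname bill)))
    ;;   => :AMBIGUOUS when several Bills are in the data base
    ;; (iota-1 (find-items 'person '(firstname bill) '(lastname woods)))
    ;;   => the single PERSON item for Bill Woods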

At this point, we are in the middle of what Schegloff (1972) has called an "insertion sequence." In the process of answering the user's question, the system began executing a FOR: expression. Evaluating that expression required specific information that was obtainable only by asking the user a question. HWIM's question to the user, plus the user's answer, constitute an insertion sequence with respect to the primary question-answer sequence. The execution of the FOR: expression is suspended for the duration of the insertion sequence, but can be resumed when the needed information is processed.
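The suspend-and-resume control structure can be illustrated with a small, purely hypothetical Common Lisp sketch: the pending computation is parked as a closure and re-entered when the user's answer arrives. HWIM itself managed such suspended work through its demand queue (described under the discourse model below), not through the code here.

    (defvar *suspended* nil)

    ;; Park RESUME-FN, ask QUESTION, and wait for the next user input.
    (defun need-clarification (question resume-fn)
      (setf *suspended* resume-fn)
      (format t "~&~A~%" question))

    ;; Resume the parked computation with the user's ANSWER, if any.
    (defun user-answered (answer)
      (when *suspended*
        (let ((k *suspended*))
          (setf *suspended* nil)
          (funcall k answer))))

    ;; (need-clarification "Do you mean Bill Woods or Bill Huggins?"
    ;;                     (lambda (who) (format t "~&Finding trips for ~A~%" who)))
    ;; (user-answered "Bill Woods")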

Having now a unique TRAVELER and a unique DESTINATION, FOR: can find all trips whose TIME is (BEFORE NOW) and, for each, output its TIME. Finding the trips, e.g., checking TRAVELER, DESTINATION, and TIME, can require considerable search in the data base. For example, the DESTINATION of a TRIP is computed from the DESTINATIONs of its component LEG/OF/TRIPs, and the DESTINATION of a LEG/OF/TRIP may in turn be computed from the PURPOSE of the LEG/OF/TRIP, e.g., attending a conference. Information about trips, budgets, conferences, etc. is never assumed to be complete, since it is not so in the real world. Furthermore, the range of askable questions precludes computing in advance all the implicit information. (The procedures that do these computations are discussed in Woods et al., Note 9.) In this example there is only one trip that matches the description, and its time is printed out by the response generation programs:

     Bill Woods's trip from Boston, Massachusetts to Juarez, Mexico was
     from October 15th to 17th, 1975.

Discourse Model

In any conversation there are many forces operating to determine what will be said next, including the intentions, perceptions, and memories of the participants in the conversation (Bruce, 1980, Note 4). A discourse model is an idealized representation of such forces as they are used by a speaker in constructing an utterance and by a listener in understanding one (see also Reichman, 1978; Stansfield, 1974; Deutsch, 1974; Sidner, Note 8).

The discourse context is an important factor in even the restricted domain of person-computer communication, but in many interactive natural language understanding systems there is no explicit "discourse model". Most artificial problem domains are deliberately constructed so that only a few interaction modes are sensible. For example, the person may only ask questions; the system may only answer the questions. This greatly simplifies the determination of the speaker's intent, and hence, the full understanding of the utterance. There is also no need to maintain a data base of models of the participants in a conversation (together with their presumed purposes, attitudes, concepts of the world, etc.) when the only other participant in the conversation is "the USER." Even in those cases where the problem definition permits a general discourse and the system is able to cope with it, the means of coping are rarely delineated, made explicit, and used to advantage.

Incremental simulations (Woods & Makhoul, 1973) of dialogues between a travel budget manager and HWIM (a system builder played the part of an ideal HWIM) suggested several formulations of the user/discourse model. One formulation was in terms of modes of interaction and intents (Bruce, 1975c). An intent is the purpose behind an utterance. Patterns of intents, such as

     user-ask-question
     system-answer-question
     user-make-editing-change
     system-confirm-change
     user-enter-new-information
     system-point-out-contradiction

constitute the modes of interaction. An augmented transition network (ATN) grammar was written to represent some of the common modes of interaction found in travel budget management dialogues. At any point, the manager could be seen as being in one of several states, e.g., trying to determine the consequences of a proposed trip, examining the budget, or entering new trip information. We considered a discourse model in which these states would be represented in the network, with a modified ATN parser that steps through the grammar on the basis of the input sentence structure and the then-current state of the discourse. Given any state, the parser can predict the most likely next intent, and hence such things as the class of likely verbs for the next utterance. This model was not used in the final version of the system, primarily because it was too rigid to function alone as a discourse model.

Another formulation of the user/discourse model involves the notion of demands made by participants in the dialogue. (Both of these formulations are discussed more fully in Bruce, 1975b.) The latter formulation allows us to model how one computation can be pushed down while a whole dialogue takes place to obtain missing information, and how a computation can spawn subsequent expectations or digressions. Some elements of this demand model, together with other aspects of the discourse model, are explained below:

(1) Demands: These are demands for service of some sort made upon the system by the user or by the system itself. A typical high priority demand is a question that the system cannot answer without some information from the user; such a question leads to an embedding of the computation. Demands of lower priority include such things as a notice by the system that the manager is over his budget; such a notice might not be communicated until after direct questions had been answered.

(2) Counter-demands: These are questions the system has explicitly or implicitly asked the user (e.g., "This trip will put you over budget" implicitly asks the user to review the budget, canceling budget items or adjusting cost estimates). While the system cannot hold the user to these, nor expect too strongly that they will be met, it can reasonably expect that most counter-demands will be resolved in some way. This is an additional influence on the discourse structure.

(3) Current discourse state: The discourse area of the data base also contains an assortment of items that define the current discourse context, including:

o  LOCATION, a pointer to the current location of the speaker, e.g., the city

o  TIME, a pointer to the current time and date

o  SPEAKER, a pointer to the current speaker

o  the last mentioned budget, conference, person, place, time, trip, etc.

There is also a representation of the current TOPIC (see Figure 3). This is the active focus of attention in the dialogue. It can be the actual budget, a hypothetical budget, a particular trip, or a conference. The current topic is used as an anchor point for resolving definite references and for deciding how much detail to give in responses. It can also generate certain modes of interaction. For example, if the manager says "Enter a trip," the system notes that the current topic has changed to an incompletely described trip. This results in demands that cause standard fill-in questions to be asked. If the manager wants to complete the trip description later, then the completion of the trip description becomes a low priority demand.

Figure 3. A discourse state.

The system has a primitive one-queue implementation of the "demand model." This queue contains executable procedures which represent the speaker's previous queries and commands as well as commands initiated by the system to examine the consequences of its actions, give information to the user, or check for data base consistency. These procedures are related by functional dependencies and relative priorities. The major types of demands are the following:

o  DO: means execute the specified command.

o  TEST: means evaluate the form and answer "no" if NIL or "yes" otherwise.

o  RESPOND: means give the user some information (which may or may not be part of an answer to a direct query).

o  PREVENT: means monitor for a subsequent possible action and block its normal execution (as in "Do not allow more than three trips to Europe.").
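As a rough illustration of this one-queue idea, here is a hypothetical Common Lisp sketch in which demands carry a kind, a priority, and a form. The kind names follow the list above, but the queue discipline and data layout are invented for this example and are not HWIM's implementation.

    (defstruct demand
      kind       ; one of :do, :test, :respond, :prevent
      priority   ; larger number = more urgent
      form)      ; the command, form, or text associated with the demand

    (defvar *demand-queue* '())

    ;; Add a demand and keep the queue sorted, most urgent first.
    (defun post-demand (kind priority form)
      (push (make-demand :kind kind :priority priority :form form)
            *demand-queue*)
      (setf *demand-queue*
            (sort *demand-queue* #'> :key #'demand-priority)))

    ;; Remove and return the most urgent pending demand, or NIL.
    (defun next-demand ()
      (pop *demand-queue*))

    ;; A clarification question outranks a low-priority budget notice:
    ;; (post-demand :respond 1 "You are over budget on item 3.")
    ;; (post-demand :respond 9 "Do you mean Bill Woods or Bill Huggins?")
    ;; (demand-form (next-demand))  ; => the clarification question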

Understanding

The process of understanding written natural language requires the integration of diverse knowledge sources. An optimal process must apply various types of knowledge in a balance that avoids over- or under-constraining the set of potential accounts of the input. Ultimately the understanding process should produce a representation of the meaning of the natural language input, whether written or spoken. This representation may be produced directly or via some number of contingent knowledge structures (Bobrow & Brown, 1975). The exact form of these structures varies from one system to another, but typically includes such things as parse trees.

In HWIM, the integration of knowledge sources for linguistic processing is accomplished by means of (1) an ATN grammar (see Figure 4) that incorporates much of the syntactic, semantic, and pragmatic knowledge in the system, (2) a semantic network (see Nash-Webber, 1975) that represents remaining linguistic and world knowledge, and (3) tests on arcs in the grammar used to link the two representations. These structures are used to produce the account of the input that serves as a representation of its meaning.

Figure 4. A small portion of the pragmatic grammar.

The decision to include semantic knowledge in a grammar has both its merits and its defects. On the one hand there is an obvious advantage in terms of efficiency when semantic knowledge can be applied early on, in the most direct way, to constrain the possible interpretations of an utterance. Moreover, one avoids the seemingly unnecessary step of building a syntactic structure, such as a parse tree, for an utterance (see also Erman, Hayes-Roth, Lesser, & Reddy, 1980; Burton & Brown, Note 5). On the other hand, as the grammar becomes more complex, more interconnected, and more fluid, it is not clear that the pragmatic grammar approach will work. The fact that we could incorporate semantic and pragmatic knowledge in the grammar, and treat that knowledge as if it were fundamentally no different from syntactic knowledge, was a consequence of having a limited model of the discourse; no system has yet approached the complexity and variety that human natural language exhibits.

The pragmatic grammar is not a complete knowledge source by itself for linguistic processing. It is, however, closely linked to the semantic network via tests on the arcs. These tests allow the grammar to capture more volatile constraints on the input, such as those provided by the discourse model or the factual data base. For example:

o  Is "Jerry Woods" a valid name?

o  Is "San Francisco, California" a valid place?

o  What words can go with "speech" as a project descriptor (e.g., "understanding")?

o  Does "What is the registration fee?" make sense in this context?

o  Does "How much is in the budget?" make sense in this context?

o  Does "When is that meeting?" make sense in this context?

o  Does "November 15th" make sense in this context?

All of this points to the desirability of a unified linguistic processor, as exists to some extent in HWIM.

With the grammar encoding semantic and pragmatic, as well as syntactic, information, it is possible for the parser to build a procedural "meaning" representation as well as a purely syntactic one. The meaning representations are in a command language whose expressions are atomic, consisting of a functional operator applied to arguments, or compound, making use of a quantification operator (FOR:) applied to another expression. The operators in atomic expressions specify operations to be performed on the data base or interactions with the user.

The parser builds interpretations by accumulating, in registers, the semantic head, quantifier, and links of the nodes being described in the sentence (see Woods et al., Note 9, for a more complete description). For example, the sentence

     I will go to the ASA meeting.

yields an interpretation in the command language of the form

     (FOR: THE A0018 / (FIND: MEETING (SPONSOR 'ASA)) : T ;
       (FOR: 1 A0019 / (BUILD: TRIP (TO/ATTEND A0018)
                               (TRAVELER SPEAKER)
                               (TIME (AFTER NOW))) : T ; T))

This interpretation is built up in the following way. A constituent describing a person is found at the start of the sentence. The parser transforms the pronoun "I" into the link-node pair (TRAVELER SPEAKER) and returns this as the interpretation of that constituent. The word "will" adds (TIME (AFTER NOW)) to the list of link-node pairs being accumulated. (The grammar does not accept constructions like "will have gone," so that "will" can always be interpreted as marking a future event.) The word "go," in the context of our travel domain, indicates that a trip is being discussed. Next, the constituent "the ASA meeting" produces the interpretation (TO/ATTEND (! THE Y MEETING ((SPONSOR ASA)))). (The ! indicates that a FOR: expression will have to be built as part of the interpretation.) The top level of the grammar has thus accumulated the link-node pairs

     (TIME (AFTER NOW))
     (TRAVELER SPEAKER)
     (TO/ATTEND (! THE Y MEETING ((SPONSOR 'ASA))))

with the semantic head TRIP. The appropriate action (in this case a BUILD:) is created, to produce

     (BUILD: TRIP (TO/ATTEND (! THE Y MEETING ((SPONSOR 'ASA))))
             (TRAVELER SPEAKER)
             (TIME (AFTER NOW)))

Finally, the necessary quantificational expressions are expanded around the BUILD: expression.
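A hypothetical sketch of this register-accumulation step, for illustration only: constituents contribute link-node pairs, and the semantic head selects the action they are wrapped in. The pair and operator names follow the text (the trailing colon of BUILD: is dropped so the symbols read as ordinary Common Lisp); the functions are invented here and are not HWIM's parser.

    ;; Map a few constituents of the example sentence to link-node pairs.
    (defun interpret-constituent (constituent)
      (cond ((equal constituent '(pronoun i))   '(traveler speaker))
            ((equal constituent '(aux will))    '(time (after now)))
            ((equal constituent '(np the asa meeting))
             '(to/attend (! the y meeting ((sponsor asa)))))
            (t nil)))

    ;; Wrap the accumulated link-node pairs in an action keyed to HEAD.
    (defun build-interpretation (head constituents)
      (cons 'build
            (cons head
                  (remove nil (mapcar #'interpret-constituent constituents)))))

    ;; (build-interpretation 'trip
    ;;   '((pronoun i) (aux will) (np the asa meeting)))
    ;; => (BUILD TRIP (TRAVELER SPEAKER) (TIME (AFTER NOW))
    ;;           (TO/ATTEND (! THE Y MEETING ((SPONSOR ASA)))))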

Response Generation

Once a satisfactory theory for an utterance has been constructed, it must be acted upon by the system. Regardless of the type of action taken, some appropriate response should also be made. From the speaker's point of view the response should be explicit, concise, and easy to understand. The response may affect the speaker's way of describing entities in the domain as well as illuminate the capabilities and operation of the system.

The effects of generated responses are many. In a domain such as travel budget management there are many standard names for objects, e.g., "the budget item for two trips to a West Coast conference". Names chosen (constructed) by HWIM give the manager a handle to use and thus strongly influence the ways the objects are referred to subsequently. Careful construction of responses can also indicate to the manager the legitimate structures HWIM can handle and the paths to pursue. Responses exhibit exactly the types of knowledge embodied in the program; the manager can thus get a better idea, not only of the facts he seeks, but also of how well the system is understanding and where more clarification may be useful. Responses can also show the system's focus of attention.

Types of Knowledge Used in HWIM

In order to carry out its role effectively, HWIM must have a large amount of knowledge in a readily accessible form. Such knowledge is needed for carrying out commands, answering questions, and generating responses; it is also necessary for natural communication to take place. Success in performing these tasks is evidence that the kinds of knowledge embodied in the system are sufficient for natural communication, at least at the level shown by sample dialogues, and the history of the development of HWIM (Nash-Webber & Bruce, 1976) suggests that each category of knowledge is relied on. The kinds of knowledge used by HWIM can be categorized as follows:

o  Acoustic-Phonetic forms--Knowledge of phonemes and their relation to acoustic parameters.

o  Phonological rules--Knowledge of phoneme clusters and pronunciation variations describable by rules.

o  Lexical forms--Knowledge of words, their inflections, parts of speech, and phonemic spellings. For example, "entered" is the past form of "to enter".

o  Sentential forms--Knowledge of grammatical structures at the level of the phrase, clause, or sentence. For example, a sentence may start with a command verb, such as "schedule." This knowledge is needed for understanding both spoken utterances and written sentences, and for generating responses.

o  Discourse form--Knowledge of idealized discourse, e.g., what type of utterance is likely to be produced in a given state of the discourse.

o  Semantic--Knowledge of how words are related and how meaning is conveyed structurally. This is used in parsing and in constructing responses. For example, the benefactive case for "schedule" should be filled by a person who will be taking the trip being scheduled.

o  Parameters of the communicative situation--HWIM's operation is a function of the modality in which it operates: speech or text input, and speech or text output. In general, communicative situations can vary along a number of dimensions (see Rubin, 1980), and a successful communicator must use knowledge of the situation to interpret and respond appropriately.

o  World--Knowledge of specific trips, projects, budgets, and conferences. Whereas HWIM's semantic knowledge is essentially fixed, its world knowledge changes every time a trip is taken or money is shifted from one budget item to another.

o  User--Knowledge about each possible travel budget manager, what groups they belong to, and what they may know about the data base. For example, only certain users may be allowed to modify budget totals.

o  On-Going discourse--Knowledge of the current discourse state, e.g., the current topic and the objects available for anaphoric reference. This knowledge is used to relax constraints in the pragmatic grammar. For example, if a conference is under discussion, then "What is the registration fee?" is a meaningful utterance. If not, then the speaker would have to say something like, "What is the registration fee for the ACL conference?"

o  Self--Knowledge about the system's own knowledge (often called "meta-knowledge"). One example is the knowledge that data on trips and budgets is marked to indicate whether it came from the travel budget manager or was computed by HWIM. Another is that HWIM knows what elements constitute a complete description of any object, such as a trip. Thus it knows when its own knowledge is incomplete.

o  Task--Knowledge of the task domain, e.g., that a manager is concerned with maintaining an accurate record of all trips, taken or planned, and with staying within the budget.

o  Strategic--Knowledge concerning the representation and use of other knowledge. This knowledge would be needed in even a "blank" system, i.e., one that had no word- or domain-specific knowledge.

Conclusion

HWIM represents a synthesis of a number of lines of research in areas such as parsing, inference, data base design, and generation. Problems that arose in the design of HWIM were precursors of those that are central issues in AI research today. For example, the speech act issues for HWIM are similar to those studied by Cohen (Note 6), Cohen and Perrault (1979), and Allen (Note 1). Questions of knowledge representation closely related to those faced in HWIM have been pursued by Bobrow and Winograd (1977), Brachman (1979), and Fahlman (1979); see also Brachman and Smith (1980). Language generation in a discourse context similar to HWIM's has been studied by McDonald (Note 7), for instance. Finally, the issue of interactions among syntax, semantics, and pragmatics is crucial in the work of many, including Schank and Abelson (1977), Woods (1980a), and Bobrow (Note 2).

One reason for building computer models is that increased precision is required: in the course of designing and debugging a program, one must face details of the process that can be glossed over in less procedure-oriented approaches. For example, designers of computer models of speech act generation (Cohen & Perrault, 1979) must resolve theoretical questions about representing speech act definitions. A second reason for computer models is that the task of the system can be defined to require complete use of language (within a restricted domain, of course). Thus one can see what components, each representing aspects of language theory, are needed and how they might interact. Winograd's (1972) blocks world program, which accepted typed commands, questions, and assertions, performed actions, and generated natural language responses, could be put in the latter category. HWIM is in both categories, but its principal value probably lies more in the second than in the first. To some extent, it shows what can be done in a fairly complete system that understands natural language (typed or spoken), answers questions and performs various actions, and responds in natural language (typed and spoken). There is value in both approaches; in fact, they tend to complement each other, with the second showing what can and cannot be done and the first addressing specific design questions.

The characteristics of HWIM reflect the goal of natural communication between a person and a computer assistant. Even in its limited domain, it illustrates the extent to which natural communication depends upon diverse kinds of knowledge in both communicants. The structure of HWIM can provide a useful framework for obtaining a better understanding of natural communication.

Reference Notes

1. Allen, J. A plan-based approach to speech act recognition (Tech. Rep. No. 131/79). Toronto: Department of Computer Science, University of Toronto, January 1979.

2. Bobrow, R. J. The RUS system (BBN Report No. 3878). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1978.

3. Brachman, R. J., Bobrow, R., Cohen, P., Klovstad, J., Webber, B. L., & Woods, W. A. Research in natural language understanding: Annual report (1 Sept. 78--31 August 1979). Cambridge, Mass.: Bolt Beranek and Newman Inc.

4. Bruce, B. C. Belief systems and language understanding (BBN Report No. 2973). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1975. (a)

5. Burton, R. R., & Brown, J. S. Semantic grammar: A technique for constructing natural language interfaces to instructional systems (BBN Report No. 3587). Cambridge, Mass.: Bolt Beranek and Newman Inc., May 1977.

6. Cohen, P. R. On knowing what to say: Planning speech acts (Tech. Rep. No. 118). Toronto: Department of Computer Science, University of Toronto, 1978.

7. McDonald, D. Natural language production as a process of decision making under constraint (MIT AI Lab Tech. Rep.). Cambridge, Mass.: Massachusetts Institute of Technology.

8. Sidner, C. L. Towards a computational theory of definite anaphora comprehension in English discourse (MIT AI Lab Tech. Rep. No. 537). Cambridge, Mass.: Massachusetts Institute of Technology, 1979.

9. Woods, W. A., Bates, M., Brown, G., Bruce, B., Cook, C., Klovstad, J., Makhoul, J., Nash-Webber, B., Schwartz, R., Wolf, J., & Zue, V. Speech understanding systems: Final technical progress report (30 October 1974--29 October 1976). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1976.

10. Woods, W. A., Kaplan, R. M., & Nash-Webber, B. The lunar sciences natural language information system: Final report (BBN Report No. 2378). Cambridge, Mass.: Bolt Beranek and Newman Inc., 1972.

References

Bobrow, D. G., & Winograd, T. An overview of KRL, a knowledge representation language. Cognitive Science, 1977, 1, 3-46.

Bobrow, R. J., & Brown, J. S. Systematic understanding: Synthesis, analysis, and contingent knowledge in specialized understanding systems. In D. G. Bobrow & A. Collins (Eds.), Representation and understanding: Studies in cognitive science. New York: Academic Press, 1975.

Brachman, R. J. On the epistemological status of semantic networks. In N. V. Findler (Ed.), Associative networks: Representation and use of knowledge by computers. New York: Academic Press, 1979.

Brachman, R. J., & Smith, B. C. (Eds.) SIGART Newsletter, Special Issue on Knowledge Representation, No. 70, February 1980.

Bruce, B. C. Pragmatics in speech understanding. Proceedings of the Fourth International Joint Conference on Artificial Intelligence, Tbilisi, USSR, 1975. (b)

Bruce, B. C. Discourse models for language comprehension. American Journal of Computational Linguistics, 1975, 35, 19-35. (c)

Bruce, B. C. Analysis of interacting plans as a guide to the understanding of story structure. Poetics, 1980, 9, 295-311.

Bruce, B. C. Natural communication between person and computer. In W. Lehnert & M. Ringle (Eds.), Strategies for natural language processing. Hillsdale, N.J.: Erlbaum, in press.

Cohen, P. R., & Perrault, C. R. Elements of a plan-based theory of speech acts. Cognitive Science, 1979, 3, 177-212.

Deutsch, B. G. The structure of task oriented dialogues. In L. D. Erman (Ed.), Proceedings of the IEEE Symposium on Speech Recognition. Pittsburgh, PA: Carnegie Mellon University, 1974.

Erman, L. D., Hayes-Roth, F., Lesser, V. R., & Reddy, D. R. The Hearsay-II speech-understanding system: Integrating knowledge to resolve uncertainty. Computing Surveys, 1980, 12, 213-253.

Fahlman, S. E. NETL: A system for representing and using real-world knowledge. Cambridge, Mass.: MIT Press, 1979.

Lea, W. A. (Ed.) Trends in speech recognition. Englewood Cliffs, N.J.: Prentice-Hall, 1980.

Nash-Webber, B. L. The role of semantics in automatic speech understanding. In D. G. Bobrow & A. Collins (Eds.), Representation and understanding: Studies in cognitive science. New York: Academic Press, 1975.

Nash-Webber, B., & Bruce, B. C. Evolving uses of knowledge in a speech understanding system. Proceedings of the COLING-76 Conference, Ottawa, Canada, June 1976.

Reichman, R. Conversational coherency. Cognitive Science, 1978, 2, 283-327.

Reiter, R. Query optimization for question-answering systems. Proceedings of the COLING-76 Conference, Ottawa, Canada, June 1976.

Rubin, A. D. A theoretical taxonomy of the differences between oral and written language. In R. J. Spiro, B. C. Bruce, & W. F. Brewer (Eds.), Theoretical issues in reading comprehension. Hillsdale, N.J.: Erlbaum, 1980.

Rumelhart, D. E. Towards an interactive model of reading. In S. Dornic (Ed.), Attention and performance VI. Hillsdale, N.J.: Erlbaum, 1977.

Schank, R. C., & Abelson, R. P. Scripts, plans, goals and understanding: An inquiry into human knowledge structures. Hillsdale, N.J.: Erlbaum, 1977.

Schegloff, E. A. Notes on a conversational practice: Formulating place. In D. Sudnow (Ed.), Studies in social interaction. New York: Free Press, 1972.

Stansfield, J. L. Programming a dialogue teaching situation. Unpublished doctoral dissertation, University of Edinburgh, 1974.

Walker, D. E. (Ed.) Understanding spoken language. New York: Elsevier North-Holland, 1978.

Winograd, T. Understanding natural language. New York: Academic Press, 1972.

Woods, W. A. Cascaded ATN grammars. American Journal of Computational Linguistics, 1980, 6, 1-12. (a)

Woods, W. A. Multiple theory formation in high-level perception. In R. J. Spiro, B. C. Bruce, & W. F. Brewer (Eds.), Theoretical issues in reading comprehension. Hillsdale, N.J.: Erlbaum, 1980. (b)

Woods, W. A., & Makhoul, J. Mechanical inference problems in continuous speech understanding. Proceedings of the Third International Joint Conference on Artificial Intelligence, Stanford, Calif., 1973.

APPENDIX A

The Structure of HWIM

The structure of HWIM is shown in Figure 1. Components of the system are shown in ovals, with thick arrows representing flow of control between components; major data structures are also shown, with thin arrows indicating data flow. The system is implemented on TENEX, a virtual memory time-sharing system for the DEC PDP-10. Each component exists in a separate TENEX job structure (called a "fork"). Among the reasons for this multiple-process structure (see Woods et al., 1976) are that most of the components are so large, in terms of program and data requirements, that they cannot exist in the same TENEX address space with another component.

The Trip component is the one primarily discussed in this paper. It controls the system and is the major component for acting on and responding to an utterance. If it is given a typed input, it calls Syntax to parse the sentence. Otherwise it calls Speech Understanding Control. Most of the other components perform single functions:

Dictionary Expander expands a dictionary of baseform pronunciations using a set of phonological rules (applied once at system loadup time).

Real Time Signal Acquisition (RTIME) digitizes and stores the speech signal.

Signal Processing (PSA) converts the speech signal into a parametric representation.

Acoustic-Phonetic Recognizer (APR) operates on the parametric representation of the speech signal to produce a set of hypotheses about phonetic segments, represented in a segment lattice.

Lexical Retrieval searches the expanded dictionary and matches idealized spectral representations of words against portions of the segment lattice.

Verifier generates a parametric representation of a word (or words), using a speech synthesis-by-rule program, and matches it against the parametric representation of the speech signal.

Syntax judges the grammaticality of possible word sequences; predicts extensions at each end of a given sequence; and builds a formal representation of an utterance.

Interpreter (present in an early version only) takes a syntactic representation of an utterance and builds a procedural representation of the meaning.

Talker generates speech from phoneme-prosodic cue strings.