Some Interface Issues in Developing Intelligent Communication Aids for People with Disabilities

Kathleen F. McCoy, Patrick Demasco, Christopher A. Pennington
Department of CIS and Applied Science and Engineering Laboratories
University of Delaware / A.I. duPont Institute Hospital for Children
P.O. Box 269, Wilmington, DE 19899
{mccoy, demasco, penningt}@asel.udel.edu

Arlene Luberoff Badman
Prentke Romich Company
1022 Heyl Road, Wooster, OH 44691
alb@prentrom.com

ABSTRACT

Augmentative and Alternative Communication (AAC) is the field of study concerned with providing devices and techniques to augment the communicative ability of a person whose disability makes it difficult to speak in an understandable fashion. For several years, we have been applying natural language processing techniques to the field of AAC in order to develop intelligent communication aids that attempt to provide linguistically "correct" output while speeding communication rate. In this paper we describe some of the interface issues that must be considered when developing such a device. We focus on a project aimed at a group of users who have cognitive impairments that affect their linguistic ability. A prototype system is under development which will hopefully not only prove to be an effective communication aid, but may provide some language intervention benefits for this population.

Keywords
Intelligent Augmentative Communication Devices, Natural Language Processing, Interfaces for People with Disabilities

INTRODUCTION

Augmentative and Alternative Communication (AAC) is the field of study concerned with providing devices or techniques to augment the communicative ability of a person whose disability makes it difficult to speak in an understandable fashion. A variety of AAC devices and techniques exist today. Some are non-electronic word boards containing words and phrases in standard orthography and/or iconic representations. A person using a non-electronic aid selects locations on the board and depends on the listener to appropriately interpret the selection. Electronic communication aids may use the same sorts of selectable items, but may also include speech synthesis. These presumably provide more independence for the person using the system since he/she does not need to rely on a partner to interpret the selections. Whichever approach is used, the communication rate of the person using AAC is likely to be extremely slow, and using the aid will require a great deal of cognitive and physical effort. In addition, the listener will be required to expend effort to understand the person using AAC.

Some common methods used to improve access time, communication rate, and cognitive and physical effort include abbreviation expansion (where the person using the system memorizes a set of unique abbreviations for some set of words/phrases) and letter/word prediction (where the system attempts to predict the next word of input based on the first few letters; typically these predictions are displayed on the screen and may be accessed very easily).

The above techniques are most often thought of in conjunction with a system whose selectable items are letters. However, another way to speed communication rate and ease of use is to have selectable items which are themselves words or phrases. Using such a system, some efficient users, in an effort to speed interactive communication, may use telegraphic language. While telegraphic language will usually be "functional" and get the point across, its use often has adverse side effects. For instance, it may give communicative partners the impression that the user is less intelligent because of their non-standard language use.

For a number of years, we have been concerned with a particular type of communication aid that takes (essentially) telegraphic input on the part of the user and produces well-formed English sentences. This project was first motivated by considering linguistically mature users who would (no doubt) like their device to output well-formed sentences (in fact, naturally "think" in terms of such sentences). However, because of the nature of their physical impairment and the time that it takes to compose well-formed sentences, such users must often settle for output that is not as desirable. For these users we envision a system that expands their telegraphic input yet does not interfere with their control over the dialogue. A research prototype of such a system has been developed (and is briefly explained below).

Here, however, we focus on a different population of users. Imagine a user who not only has a physical disability which causes them to require an AAC device, but also has cognitive impairments which affect their linguistic ability. A system that expands telegraphic utterances for this population might not only provide more appropriate output, but might be viewed as a language intervention tool that could provide feedback of well-formed sentences. At the very least, this hypothesis should be tested.
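To make the two rate-enhancement techniques mentioned above concrete, the following minimal sketch (ours, not from the paper) shows abbreviation expansion as a table lookup and word prediction as prefix matching over a vocabulary; the abbreviation table and vocabulary are invented for illustration.

```cpp
// Sketch of two rate-enhancement techniques: abbreviation expansion and
// prefix-based word prediction. Vocabulary and abbreviations are invented.
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Abbreviation expansion: a memorized abbreviation maps to a full word/phrase.
std::string expandAbbreviation(const std::map<std::string, std::string>& table,
                               const std::string& key) {
    auto it = table.find(key);
    return it != table.end() ? it->second : key;  // unknown keys pass through
}

// Word prediction: offer every vocabulary word that starts with the letters
// typed so far (a real device would rank by frequency and show the top few).
std::vector<std::string> predict(const std::vector<std::string>& vocabulary,
                                 const std::string& prefix) {
    std::vector<std::string> candidates;
    for (const auto& word : vocabulary)
        if (word.compare(0, prefix.size(), prefix) == 0)
            candidates.push_back(word);
    return candidates;
}

int main() {
    std::map<std::string, std::string> abbrev = {{"chpt", "chapter"},
                                                 {"asap", "as soon as possible"}};
    std::vector<std::string> vocab = {"chair", "chapter", "cheese", "school"};

    std::cout << expandAbbreviation(abbrev, "chpt") << "\n";  // chapter
    for (const auto& w : predict(vocab, "ch"))                // chair chapter cheese
        std::cout << w << " ";
    std::cout << "\n";
}
```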

We briefly describe a prototype intelligent AAC system which is under development via a joint effort between (1) the Applied Science and Engineering Laboratories (ASEL) of the University of Delaware and the A.I. duPont Institute Hospital for Children, and (2) the Prentke Romich Company (PRC is a well-known manufacturer of communication aids). The system itself is based on the Compansion system [4], [2], [5]. Compansion is a research prototype that takes as input uninflected content words (e.g., "apple eat john") and produces a well-formed English sentence using those content words (e.g., "The apple was eaten by John"). The system is intended as a word-based system that speeds communication rate while still allowing the output of full (i.e., non-telegraphic) sentences.

We describe an ongoing effort that draws on expertise from these two institutions to develop an intelligent language aid that would provide Compansion-like output in a practical communication system. The project combines the experience gained through the development of Compansion and other natural language processing technology at ASEL with the interface, access methods, and practical experience provided by PRC. Our effort is geared toward a specific population: people who use communication devices and have expressive language difficulties due to cognitive impairments.

Natural language processing techniques in an augmentative communication aid have the potential to analyze and correct utterances that are ungrammatical before they are spoken by the communication aid. For example, a person with expressive language difficulties may use a telegraphic expression such as "I go store yesterday." Natural language processing within a communication aid might expand this into a syntactically correct sentence by detecting that the verb "go" needs to appear in the past tense, and that the destination "store" should be preceded by the preposition "to" and a determiner. Using this information, the communication aid might suggest to the user that the sentence "I went to the store yesterday" be spoken as output. If this is acceptable to the user, it will be output. If it is not, other possibilities might be suggested. It is expected that a device such as described here may not only help the user generate understandable English sentences, but could also provide the user with feedback that might be beneficial in overcoming their expressive difficulties.

We briefly describe our collaborative effort and highlight decisions made concerning the usability of the system for the particular population under study. When possible, we attempt to highlight how certain decisions would be made differently with other populations of users.

COMPUTER-BASED AAC SYSTEMS

A typical computer-based AAC system can be viewed as providing the user with a "virtual keyboard" that enables the user to select items to be output to a speech synthesizer or other application. A virtual keyboard can be thought of as consisting of three components: (1) a physical interface providing the method for activating the keyboard (and thus selecting its elements), (2) a language set containing the elements that may be selected, and (3) a processing method that creates some output depending on the selected items. All three of these elements must be tailored to an individual depending on his/her physical and cognitive circumstances and the task they are intending to perform.

For example, for people with severe physical limitations, access to the device might be limited to a single switch. A physical interface that might be appropriate in this case involves row-column scanning of the language set that is arranged (perhaps in a hierarchical fashion) as a matrix on the display. The user makes selections by appropriately hitting the switch when a visual cursor crosses the desired items. In row-column scanning the cursor first highlights each row, moving down the screen at a rate appropriate for the user. When the cursor comes to the row containing the desired item, the user hits the switch, causing the cursor to advance across the selected row, highlighting each item in turn. The user hits the switch again when the highlighting reaches the desired item in order to select it. For users with less severe physical disabilities, a physical interface using a keyboard may be appropriate. The size of the keys on the board and their activation method may need to be tailored to the abilities of the particular user.

Independent of the physical interface is the language set, which must also be tuned to the individual. For instance, the language set might contain letters, words, phrases, icons, pictures, etc. If, for example, pictures are selected, the processing method might translate a sequence of picture selections into a word or phrase that will be output as the result of the series of activations. Alternatively, consider a language set consisting of letters. A method called abbreviation expansion could take a sequence of key presses (e.g., "chpt") and expand that set into a word (e.g., "chapter").

The use of a computer-based AAC device generally involves many trade-offs. Assuming a physical interface of row-column scanning, a language set consisting of letters would give the user the most flexibility, but would cause standard message construction to be very time consuming. On the other hand, a language set consisting of words or phrases might be more desirable from the standpoint of speed, but then the size of the language set would be much larger, causing the user to take longer (on average) to access an individual member. In addition, if words or phrases are used, typically the words would have to be arranged in some hierarchical fashion, and thus there would be a cognitive/physical/visual load involved in accessing the individual words and phrases.
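The row-column scanning procedure described above can be summarized in code. The sketch below is our illustration, not any actual device firmware: a timer tick advances the highlight, and a single switch press first picks a row and then an item; the grid contents and scan policy are invented for illustration.

```cpp
// Sketch of single-switch row-column scanning over a small selection grid.
#include <cstddef>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

class RowColumnScanner {
public:
    explicit RowColumnScanner(std::vector<std::vector<std::string>> grid)
        : grid_(std::move(grid)) {}

    // Called on every tick of the scan timer (rate tuned per user):
    // advances the row highlight, or the column highlight within a chosen row.
    void tick() {
        if (!rowChosen_) row_ = (row_ + 1) % grid_.size();
        else             col_ = (col_ + 1) % grid_[row_].size();
    }

    // Called when the user hits the switch. The first press picks the
    // highlighted row; the second press selects the highlighted item.
    // Returns the selected item, or "" while the scan is still in progress.
    std::string press() {
        if (!rowChosen_) { rowChosen_ = true; col_ = 0; return ""; }
        rowChosen_ = false;
        return grid_[row_][col_];
    }

private:
    std::vector<std::vector<std::string>> grid_;
    std::size_t row_ = 0, col_ = 0;
    bool rowChosen_ = false;
};

int main() {
    RowColumnScanner scanner({{"yes", "no"}, {"eat", "drink"}});
    scanner.tick();                         // highlight moves to the second row
    scanner.press();                        // choose that row; column scan starts
    scanner.tick();                         // highlight moves to "drink"
    std::cout << scanner.press() << "\n";   // selects "drink"
}
```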

TARGET POPULATION

In developing a communication device for a particular population, all three aspects of the interface must be carefully considered. However, because the physical interface is independent of the other two aspects, we primarily concentrate on an appropriate language set and processing method in this work.

Here we consider a population of users (young adults) who have cognitive impairments that affect their expressive language ability. Whether a child with cognitive impairments is verbal or nonverbal, some general characteristics of expressive language difficulties may include the following [3], [8]: (1) short telegraphic utterances; (2) sentences consisting of concrete vocabulary (particularly nouns); (3) morphological and syntactical difficulties such as inappropriate use of verb tenses, plurals, and pronouns; (4) word additions, omissions, or substitutions; and (5) incorrect word order. While such children may have the ability to functionally communicate their wants and needs, intervention to assist them in their language production may be quite beneficial both from a social and an educational standpoint.

A LANGUAGE SET AND LOW-LEVEL PROCESSING APPROPRIATE FOR THE TARGET POPULATION

The speech output communication aids that PRC designs for commercial use incorporate an encoding technique called semantic compaction, commercially known as Minspeak® (a contraction of the phrase "minimum effort speech"). The purpose behind Minspeak® is to reduce the cognitive demand as well as the number of physical activations required to generate effective, flexible communication. Its success stems from the use of a relatively small set of icons that are rich in meaning and associations. These icons can be combined to represent a vocabulary item such as a word, phrase, or sentence, so that only two or three activations are needed to retrieve an item. This small set of icons can permit a large vocabulary to be stored in the device. When icons have only one meaning, problems arise from a lack of space on the device's display for adequate vocabulary. Since they are rich in meaning, icons designed for Minspeak® can be combined in a large number of distinct sequences to represent a core lexicon easily.

Minspeak® was first utilized with PRC's Touch Talker™ and Light Talker™ communication aids. With these Minspeak® systems, icons on the overlay remain in fixed positions; once learned, this allows the individual using the system to find them quickly and automatically. With the design of prestored vocabulary programs known as Minspeak® Application Programs (MAPs™), a large vocabulary is prestored in a well-organized fashion using a logical, paradigmatic structure that greatly facilitates learning and effective communication. One of these MAPs™, Communic-Ease™, contains basic vocabulary appropriate for a user chronologically 10 or more years of age with a language age of 5-6 years. Communic-Ease™ has proven to be an effective interface for users in our target population, providing access to approximately 580 single words divided into 38 general categories. Most of these words are coded as 2-icon sequences. The first icon in the sequence (the category icon) establishes the word category. For example, one icon indicates a body part word, the <MASKS> icon indicates a feeling word, and the <APPLE> icon indicates a food word. The second icon denotes the specific word. For example, <MASKS> followed by a second icon produces the word "happy"; <APPLE> followed by a second icon produces the word "eat".

In addition to the words which are accessed via the icon sequences, Communic-Ease™ contains some morphology and allows the addition of endings to regular tense verbs and regular noun plurals. However, note that to accomplish this, additional keystrokes are required. Also, it is possible to spell words that are not included in the core vocabulary. In practice, however, users with either slow access methods or poor language ability tend to produce telegraphic messages consisting of key word sequences.
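As a rough illustration of the 2-icon access scheme, a device can treat the lexicon as a map from icon pairs to words. This is our sketch only: the text does not give the actual Communic-Ease™ word icons, so the second icons below are invented placeholders.

```cpp
// Illustrative 2-icon lookup: a category icon plus a word icon retrieves a
// single word. Icon names here are invented; Communic-Ease stores ~580 words.
#include <iostream>
#include <map>
#include <string>
#include <utility>

int main() {
    // (category icon, word icon) -> word
    std::map<std::pair<std::string, std::string>, std::string> lexicon = {
        {{"MASKS", "SUN"},   "happy"},   // feeling words under <MASKS>
        {{"MASKS", "CLOUD"}, "sad"},
        {{"APPLE", "MOUTH"}, "eat"},     // food words under <APPLE>
        {{"APPLE", "CUP"},   "drink"},
    };

    auto it = lexicon.find({"MASKS", "SUN"});
    if (it != lexicon.end())
        std::cout << it->second << "\n";  // prints "happy"
}
```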

HIGHER-ORDER PROCESSING DECISIONS

The above considers the language set (and implicitly the physical interface) and some low-level processing decisions appropriate for a communication device for the population under study. The Communic-Ease™ MAP™ allows a user to access words in a manner that is reasonable for this population. The use of this MAP™ has been directed at the problem of vocabulary access, but additional (higher-order) processing would be necessary to address the expressive language problems faced by this population as outlined earlier. Notice that the expressive language problems of those with cognitive impairments result in very telegraphic output. Thus, in an effort to find some higher-order processing to address these problems, we turn to the Compansion project.

The Compansion Project

Natural language processing (NLP) is the sub-field of Artificial Intelligence devoted to capturing the regularities of natural languages (such as English), and developing computational mechanisms to process them. Typically the field has been broken into three areas: syntax (the ordering of words and phrases to make legal sentences), semantics (the way the meanings of individual words can be combined to form a meaningful whole), and pragmatics (the way the current context affects the meaning of a sentence). While there was some use of NLP techniques in augmentative communication prior to the Compansion project, their use has been fairly limited. For instance, some syntactic knowledge has been used in the context of word prediction and language tutoring [9], [10], [6], [7], [13]. Also, several systems developed at the University of Dundee, such as PROSE [12], CHAT [1], and TALKSBACK [11], [12], use semantic (and pragmatic) knowledge to allow the user to access context-appropriate chunks of meaningful language with minimal effort. The major research emphasis with these systems has been the development of schemes which use NLP knowledge to access prestored pieces of text which are appropriate in the current conversation. This use of NLP is quite different from that in the Compansion system, which attempts to process the user's spontaneous language constructions.

The Compansion system uses techniques from several areas of NLP in order to take a telegraphic message and transform it into a well-formed English sentence. The system is broken into several components. First, the word order parser takes the string of words input to the system and, using syntactic techniques, identifies the part of speech of each input word (e.g., noun, verb, adjective), identifies certain modification relationships (e.g., which noun is being modified by an adjective), and passes clausal-sized chunks (such as found in a sentential complement) to the remainder of the system.

The second major component of the system is the semantic parser. This is perhaps the most sophisticated component of the system. The semantic parser takes the individual content words (e.g., nouns and a single verb) identified by the previous component as a clausal-sized chunk, and attempts to fit them together as a meaningful sentence. The major reasoning of the semantic parser is driven by semantic (case frame) information associated with verbs, in conjunction with a knowledge base classifying words according to their semantic type. For example, the case frame for the verb "eat" indicates that it prefers an animate actor and a food-item theme. Using this information and assuming an input of "apple eat John," the semantic parser can correctly infer that "John" is the person doing the eating and that "apple" is the food being eaten. Once this is determined, the remaining components of the system (using further semantic and syntactic processing) are able to generate a well-formed English sentence which captures the meaning identified.
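A toy rendering of this case-frame reasoning may help. Note that the actual Compansion implementation is a Lisp research prototype; the C++ below is purely our illustration, with a two-word knowledge base invented for the example.

```cpp
// Toy case-frame reasoning: semantic-type classifications plus a verb frame
// let the system guess who is eating what, regardless of input word order.
#include <iostream>
#include <map>
#include <string>
#include <vector>

enum class SemType { Animate, Food, Other };

int main() {
    std::map<std::string, SemType> knowledgeBase = {
        {"John", SemType::Animate}, {"apple", SemType::Food}};

    // Case frame for "eat": prefers an animate actor and a food-item theme.
    std::vector<std::string> input = {"apple", "eat", "John"};
    std::string actor, theme;
    for (const auto& word : input) {
        auto it = knowledgeBase.find(word);
        if (it == knowledgeBase.end()) continue;          // skip the verb itself
        if (it->second == SemType::Animate) actor = word; // fills the actor role
        else if (it->second == SemType::Food) theme = word;
    }
    // Later components would inflect the verb and add function words.
    std::cout << "The " << theme << " was eaten by " << actor << ".\n";
}
```

Downstream components would then choose a surface realization; the paper's example output is the passive "The apple was eaten by John."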

The primary goal of the Compansion system prototype was to demonstrate the feasibility of an NLP-based approach to AAC techniques (i.e., is it even possible). While results so far have been encouraging, additional steps are necessary to establish its viability in an actual product for a specific user population. For example, one of the major problems we have is that Compansion requires a large amount of information associated with each word. While we have been working on technologies to provide this information automatically, it is currently nearly impossible to handle totally unrestricted vocabulary. Thus the system will work best with a population whose vocabulary needs can be well identified.

A second problem that has not been dealt with effectively within the Compansion project is the particular interface through which the user interacts with the system. While the research prototype system has a particular word-based interface associated with it, research efforts have dealt with the front-end component as a separate "black box" that provides the system with words. We have not tackled the many issues involved with developing a front-end appropriate for a specific population of users. Notice that because the processing required of users is fairly involved (i.e., they must not only select words but must also accept/reject the expanded sentence produced by the system), the interface requirements are quite complex. Experience with the specific population using the device is required to develop an appropriate interface. Finally, the research effort on the Compansion system to date has been done on workstations in the Lisp language. Thus the system would not run well on the portable microcomputers that would be appropriate for use in real-world situations.

PROVIDING COMPANSION-LIKE PROCESSING

This project attempts to integrate PRC's expertise in AAC product development and support with ASEL's research expertise in applying NLP techniques to AAC. PRC has had success with systems and MAPs™ designed specifically for the target population. In this project, we use one of these MAPs™ (the Communic-Ease MAP™) to provide the base vocabulary and icon sequences used in the envisioned system. In addition, the physical access method is provided by PRC hardware, and PRC's expertise with the population is crucial for the overall system development. ASEL's Compansion system provides the theoretical basis for the necessary NLP techniques. Much of our work involves the development and integration of an intelligent parser/generator with existing PRC hardware and software, the refinement of the knowledge base (lexicon and grammar) for the target population, and the development of an appropriate user interface.

Prototype Development

Envisioned System

The envisioned system will combine the PRC Liberator™ system, a modified Communic-Ease MAP™, and an intelligent parser/generator.¹ A simplified block diagram of the system is shown in Figure 1.

1. We note that the users will have the ability to spell (and misspell) words. While we do have a support project concerned with providing lexical information for a large number of words, the system will use a recovery method, described later, when faced with unknown words or unanticipated grammatical constructions.

[Figure 1. Block diagram of the envisioned system (speech and print output not shown): the overlay keyboard sends icon selections to the Communic-Ease Minspeak Application Program, which exchanges input words, icon predictions, output phrases, and word and category predictions with the Intelligent Parser/Generator (IPG); feedback of selected icons and transformed sentences appears on the Liberator LCD display. The user interacts with the keyboard/overlay and LCD display. New components that will be added to the current product are shown in bold.]

The Liberator™ Overlay/Keyboard accepts user input via a variety of methods (e.g., direct selection), and also limits user choices via Icon Prediction. With Icon Prediction, only icons that are part of a valid sequence are selectable. The user selects icon sequences that are transduced into words or commands according to the Communic-Ease MAP™. In normal operation, icon labels and the transduced words are sent to the Liberator™ LCD display to give the user feedback (words may also be spoken).

In the proposed system, these components are supplemented with an intelligent parser/generator (IPG) that is currently under development at ASEL. IPG is responsible for generating well-formed sentences from the user's selected words. It also provides further constraints on the Icon Prediction process. For example, if the user selected "I have red," the system might only allow icon sequences for words that can be described by a color (e.g., shoe, face).

As previously mentioned, the major role of IPG is to transform telegraphic input into well-formed sentences. For example, the input "John go store yesterday" could be transformed into "John went to the store yesterday." In some cases there may be multiple interpretations of a given input. For example, "John go store Tuesday" can be interpreted as "John went to the store on Tuesday" or "John will go to the store on Tuesday." In these cases the system will order the sentences based on a number of heuristics. For example, in the previous situation, the system might prefer the past tense if previous sentences were in the past tense.
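A sketch of such heuristic ordering follows, under the assumption (ours, not the paper's) that each candidate carries a base score plus a bonus when its tense matches the recent discourse; the scores are invented for illustration.

```cpp
// Sketch of ordering candidate expansions by a discourse-tense heuristic.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct Candidate {
    std::string sentence;
    std::string tense;
    double score;
};

int main() {
    std::string recentTense = "past";  // e.g., set by the user's earlier output
    std::vector<Candidate> candidates = {
        {"John will go to the store on Tuesday.", "future", 1.0},
        {"John went to the store on Tuesday.",    "past",   1.0}};

    for (auto& c : candidates)
        if (c.tense == recentTense) c.score += 0.5;  // discourse-tense bonus

    // Present the highest-scoring interpretation first.
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate& a, const Candidate& b) { return a.score > b.score; });

    for (const auto& c : candidates) std::cout << c.sentence << "\n";
}
```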

Transformation rules. The final version of IPG will be capable of performing a variety of linguistic transformations which have been motivated by a study of current Communic-Ease™ users. (Most of these are supported by the current ASEL prototype.) Some of these transformation rules are summarized in Table 1. For each rule in the table, an example input and transformation is shown. These examples (except for the conjunction rule) were obtained from transcripts of a Communic-Ease™ user where a family member or teacher was interpreting the input and transforming it into a well-formed sentence.

Table 1. Transformation rules for the intelligent parser/generator, with example inputs and human translations taken from recorded transcripts of a Communic-Ease™ user.

Transformation Rule | Example Input | Example Output
determiner inferencing | I want cookie please mom | I want a cookie please mom.
verb inflection (subject-verb agreement) | She work in school | She works in a school.
verb inflection (tense addition) | I go my old school Tuesday | I went to my old school on Tuesday.
multiple verbs | I go circus see clowns | I am going to go to the circus to see the clowns.
noun inflection | Cook 50 dinner | Cook 50 dinners.
preposition inferencing | I dance five girls | I danced with five girls.
verb inferencing | Mr. Keylor industrial arts teacher | Mr. Keylor is the industrial arts teacher.
subject inferencing | No time | I have no time.
conjunctions | Mom Dad be home | Mom and Dad are home.
possessives | I see neighbor house | I saw a neighbor's house.
tense defaulting (tense, once set, will stay the same until changed) | 1) Dad birthday party on tomorrow Sunday 2) Grandma papa and Pat sing happy birthday 3) Dad open present 4) We eat cake and ice cream | 1) Dad's birthday party will be tomorrow, Sunday. 2) Grandma, Papa and Pat will sing Happy Birthday. 3) Dad will open presents. 4) We will eat cake and ice cream.
word order modification | I football watch Friday 10pm | I am going to watch football on Friday at 10pm.
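One way to picture this rule-driven design (our sketch, not the project's code) is a table of named, individually switchable rules; the string-level rewrites below are drastically simplified stand-ins for real linguistic transformations.

```cpp
// Sketch of transformations as a table of named, switchable rules.
#include <functional>
#include <iostream>
#include <string>
#include <vector>

struct Rule {
    std::string name;
    bool enabled;
    std::function<std::string(const std::string&)> apply;
};

int main() {
    std::vector<Rule> rules = {
        {"subject inferencing", true, [](const std::string& s) {
             return s == "No time" ? std::string("I have no time") : s; }},
        {"conjunctions", true, [](const std::string& s) {
             return s == "Mom Dad be home" ? std::string("Mom and Dad are home") : s; }}};

    rules[1].enabled = false;  // e.g., disabled by a clinician for one user

    std::string utterance = "No time";
    for (const auto& r : rules)
        if (r.enabled) utterance = r.apply(utterance);
    std::cout << utterance << ".\n";  // "I have no time."
}
```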

One major advantage of describing transformations as a set of rules is that it is relatively easy to parameterize the system to affect its overall strategy. For example, a clinician or teacher could disable any of the transformation rules depending on the particular user's abilities or educational goals. For some rules it will also be possible to specify information about how the rule should be applied. For example, preferences for determiner inferencing could be adjusted (e.g., prefer "the" over "a/an").

Interface Issues

Beyond the basic operation described above, there are a number of interface issues that need to be resolved before a completed product is developed. These issues are being explored in early system prototypes with iterative user testing. Because it is likely that different users will have different requirements (especially if the system is used by a larger population than just the target population), our methodology is to develop the interface and system function with a series of "switches" that can be set to customize the system's behavior. This will allow the system to be tuned to the needs of particular users.

The first issue of concern is how the intended population can best interact with the system when multiple sentences are generated from an input. As mentioned above, IPG can order output sentences according to a variety of rules. However, if the user has the cognitive ability to select their desired sentence,² it will be important to determine the best way to (1) present multiple options to the user, and (2) allow users to select from a list of possible choices. A number of possibilities exist, including providing a list on the LCD screen, offering each sentence one at a time with some user-activated key to request the next choice, and asking a clarification question of the user (e.g., "Did you mean John already went to the store?"). Options such as these will be explored during prototype development. We discuss how some of these options might be presented in the section on Presentation Layout below.

A second major interface issue revolves around incremental versus non-incremental processing. In incremental processing, the system would attempt to transform input on a word-by-word basis. For example, if the user selected the word "cat," the system might expand it to "The cat." In contrast, non-incremental processing would wait for the entire sentence to be entered and then produce an output sentence (or sentences). For example, the input "cat hungry" would be transformed into "The cat is hungry" or "Is the cat hungry?" depending on the final terminating character (i.e., period or question mark). Because of the cognitive load involved in incremental processing, our initial prototype is being developed for non-incremental processing. This decision could likely be different given a higher functioning population of users. Thus, incremental versus non-incremental processing is one of the switches that will be built into the final prototype.

Editing Functionality. Another issue that must be addressed is the editing permitted by the system. (This is even more crucial when incremental processing is considered.) The editing capabilities of the system will be parameterized to fit the needs of the user. For example, the system might allow deletions only at the end of the string the user has selected. On the other hand, higher functioning users might choose "full editing" capabilities that would allow additions/deletions from the middle of the currently selected string (as an example).

2. In cases where the user is lower-functioning and does not have this ability, the system may be a useful teaching aid. In such a case the user may be provided with the expansion that the system considers its best choice.
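The "switches" described in this section might be collected into a per-user settings structure along the following lines; this is our sketch only, and every field name is invented.

```cpp
// Hypothetical per-user settings for the switch-based customization idea.
#include <string>

struct UserSettings {
    bool incrementalProcessing = false;   // expand word-by-word vs. whole sentence
    bool fullEditing = false;             // mid-string edits vs. end-only deletion
    bool offerOriginalInput = true;       // always include the raw input as a choice
    double confidenceThreshold = 0.5;     // below this, fall back to the raw input
    std::string preferredDeterminer = "the";  // e.g., prefer "the" over "a/an"
};

int main() {
    UserSettings settings;                    // defaults tuned for one user
    settings.incrementalProcessing = false;   // per the initial prototype
    return 0;
}
```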

Tied in with editing issues are concerns raised when the user is permitted to spell any word. Spelling, of course, introduces the possibility of unknown words (and misspellings). Note that unknown words are a serious difficulty for the intelligent aspects of the system, which require part-of-speech and some semantic information for each word. As mentioned before, the Liberator™ system provides Icon Prediction. When Icon Prediction is used, the user is "forced" to select only valid sequences (because only these key sequences are allowed to be selected). Icon Prediction has proven very useful, especially when users are still learning the appropriate icon sequences for their desired vocabulary.

One method of handling misspellings is to force only valid words by extending "icon prediction" into the spelling mode (using, of course, a fairly substantial dictionary). The intuition is that the system would only allow sequences of letters that matched some element of the dictionary. This would preemptively restrict any misspellings that did not result in a word in the dictionary. However, it would not prevent the user from typing inappropriate words -- i.e., a word that is actually in the dictionary but not the word intended by the user. Thus, if Icon Prediction is used in spelling mode, the system must have the ability to process inappropriately used words. If Icon Prediction is not used in spelling mode, then the system must be able to handle misspellings. One method for doing so would be to assign some default part-of-speech (e.g., noun) and very general semantic information to these words. The effectiveness of this solution must be tested with users. Of course, the original input would be made available as an output choice when either no expanded sentences were generated or when the "heuristic score" for each generated sentence was below a preset confidence level. This behavior can also be set as a default parameter so that the original input is always one of the choices presented to the user.
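A minimal sketch of extending icon prediction into spelling mode: after each keystroke, only letters that continue some dictionary word remain selectable. A real device would use a large dictionary with an efficient index (e.g., a trie); the linear scan and tiny dictionary below are only illustrative.

```cpp
// Sketch of dictionary-constrained letter selection for spelling mode.
#include <iostream>
#include <set>
#include <string>
#include <vector>

std::set<char> allowedNextLetters(const std::vector<std::string>& dictionary,
                                  const std::string& typedSoFar) {
    std::set<char> allowed;
    for (const auto& word : dictionary)
        if (word.size() > typedSoFar.size() &&
            word.compare(0, typedSoFar.size(), typedSoFar) == 0)
            allowed.insert(word[typedSoFar.size()]);  // valid continuation
    return allowed;
}

int main() {
    std::vector<std::string> dictionary = {"cat", "cake", "car", "dog"};
    for (char c : allowedNextLetters(dictionary, "ca"))
        std::cout << c << " ";   // k r t -- the only selectable letters
    std::cout << "\n";
}
```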



Presentation Layout. The most significant change to the current physical interface will be in the display, which must now accommodate a list of generated sentences. The current Liberator™ display contains a window that shows each current icon/key selection (Icon Buffer) and a second window for message construction and editing (Text Buffer). In the example below, <MASKS> is an icon that has "emotions" as one of its semantic associations; the second icon in the sequence is often used to indicate a positive concept. Together they represent the Minspeak® encoding for the word "happy".

[Display mockup: the Icon Buffer shows the two selected icons, and the Text Buffer shows the word "happy".]

With the integration of IPG, it is necessary to add a third window that will be used to show generated sentences (Gen Buffer). The display below illustrates one of a number of possible configuration layouts.

[Display mockup: the Icon Buffer and Text Buffer appear as before, with a Gen Buffer listing candidate expansions (e.g., "... am ... happy", "... was ... happy", plus a "more" option).]

Each possible layout is determined by setting a group of parameters associated with presentation options. Some of the more important parameters include the size of the Gen Buffer, scrolling behavior, highlighting options, and whether or not the Text Buffer is replaced by the current choice. Audio feedback can also be provided, since users who are pre-literate or have visual difficulties may benefit from having each of the potential sentences spoken on a private audio channel. This is often referred to as audio scanning.

In the simplest scenario the display collapses the Text Buffer and the Gen Buffer (size=0, replacement=T). When the input sequence is complete, the contents of the Text Buffer are replaced by the most highly rated expanded sentence. The user could then cycle through the other possible expanded sentences one at a time until their desired sentence is found.
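These presentation parameters might be grouped as follows; the field names are ours, and the collapsed display just described corresponds to a Gen Buffer size of 0 with replacement enabled.

```cpp
// Hypothetical presentation-layout parameters for the display configurations.
#include <iostream>

struct PresentationLayout {
    int genBufferSize = 3;           // lines in the Gen Buffer (0 = collapsed)
    bool replaceTextBuffer = false;  // overwrite Text Buffer with the top choice
    bool highlightCurrentChoice = true;
    bool scrollGenBuffer = true;
    bool audioScanning = false;      // speak each candidate on a private channel
};

int main() {
    // The simplest scenario described above: size=0, replacement=T.
    PresentationLayout collapsed{0, true, false, false, false};
    std::cout << "Gen Buffer lines: " << collapsed.genBufferSize << "\n";
}
```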

We anticipate that this may be a useful configuration for users who are not cognitively comfortable with selecting from among a group of possible alternatives. Part of the evaluation process will include determinations of this sort.

Development Methodology

The system described in the previous section is being developed via a joint effort between PRC and ASEL. The prototype will combine PRC's Liberator™ platform and Communic-Ease MAP™ with ASEL's current generation intelligent parser/generator. In the implementation, the Liberator™ will function primarily as the user's keyboard, and a tablet-based portable computer will contain the parser/generator. The portable computer will also replace the Liberator™'s LCD display and provide easy modification of the user interface. The two systems will be connected via an RS-232 link and physically arranged as shown in Figure 2.

[Figure 2. The system prototype will combine a PRC Liberator™ with a tablet-based portable computer connected by an RS-232 line. This strategy allows for rapid initial prototype development.]

The intelligent parser/generator includes three major software components. The parser/generator module is written in C++ (and is still under development). The system dictionary will include all of the words contained within the Communic-Ease MAP™ along with a variety of words that users may spell. This knowledge base will contain semantic knowledge, such as noun categories and properties, as well as morphological properties, such as word endings. Finally, syntactic knowledge is captured in system grammars that are based on an Augmented Transition Network (ATN) formalism. The network grammar uses both syntactic and semantic properties of the input words.
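For readers unfamiliar with the formalism, the sketch below gives a drastically simplified taste of an Augmented Transition Network: states connected by arcs labeled with word categories, traversed over the input. Real ATN arcs also carry tests and register actions (e.g., agreement checks, structure building), which are omitted here; the lexicon and network are invented, not the project's grammar.

```cpp
// Toy transition network accepting one pattern: PRON V P DET N.
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main() {
    std::map<std::string, std::string> lexicon = {
        {"she", "PRON"}, {"works", "V"}, {"in", "P"},
        {"a", "DET"}, {"school", "N"}};

    // state -> (category -> next state); state 5 is the accepting state.
    std::map<int, std::map<std::string, int>> arcs = {
        {0, {{"PRON", 1}}}, {1, {{"V", 2}}},
        {2, {{"P", 3}}},    {3, {{"DET", 4}}}, {4, {{"N", 5}}}};

    std::vector<std::string> sentence = {"she", "works", "in", "a", "school"};
    int state = 0;
    bool ok = true;
    for (const auto& word : sentence) {
        const std::string& cat = lexicon[word];      // look up word category
        auto next = arcs[state].find(cat);           // follow a matching arc
        if (next == arcs[state].end()) { ok = false; break; }
        state = next->second;
    }
    std::cout << (ok && state == 5 ? "parsed" : "rejected") << "\n";
}
```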

Our project methodology is to develop and test the robustness and usability of the system in phases. Currently we are in the process of developing a parser written in C++. At the same time we are collecting transcripts of users using the Communic-Ease MAP™. In some cases, in addition to the actual keystrokes produced by the system user, we have collected and transcribed video tapes of the user (and have asked a communication partner to, whenever appropriate, speak the expanded sentence they believe the user intends). This data provides a range of input and output sequences that our system must be able to account for. We are currently in the process of developing the grammar and necessary knowledge bases to handle the transcribed sessions. At the same time, we are working out the final details of the user interface so that the system can be field tested with actual augmentative communicators. Because the user interface has been constructed with flexibility in mind (i.e., through parameterization), it will also be evaluated through iterative testing. This will help us determine the most appropriate interface configuration (or possibly set of configurations) for the targeted user population; however, we would still retain the ability to make minor adjustments to suit the needs of a particular user.

CURRENT STATUS

The implementation of the core intelligent parser/generator is written in C++ and runs on either a Sun or a PC platform. We are currently in the process of finishing the development of the software that is necessary to complete the integration of the intelligent parser/generator with the PRC Liberator™ keyboard running the Communic-Ease MAP™. The construction of the system grammar is driven by an analysis of our collected transcripts described above.

Several evaluations of the completed prototype system are planned. For instance, a theoretical evaluation of the grammar coverage is ongoing. As has been stated, we have collected key selections from current users of the Communic-Ease MAP™. In some situations, we also have an interpretation of those keystrokes provided by the communication partner in a videotaped session. These video sessions have been transcribed and aligned with the keystroke data. While some of this data is being used to develop the grammar, we have set aside a portion of it to be used for testing purposes.

This test data will allow us to test the system's grammar in several ways. First, the robustness of the grammar can be tested by determining the number of completed utterances found in the collected data that can be handled by the grammar. Second, the appropriateness of the grammar can be tested by determining how often the grammar's output matches the interpretation provided by the communication partner in the video sessions. Because we have much more keystroke data than transcribed video data, we also plan a test of grammar appropriateness by comparing the output of the grammar with that generated by a human faced with the same sequence of words.

In addition to the theoretical grammar testing described above, we also plan an informal evaluation of the usability of the system. We plan to iteratively refine the interface by doing usability studies of our prototype with current users of the Communic-Ease MAP™. This will be possible once the integration of the Liberator™ and the PC program is complete.

ACKNOWLEDGMENTS

This work has been supported by a Small Business Research Program Phase I Grant from the Department of Health and Human Services Public Health Service, and a Rehabilitation Engineering Research Center Grant from the National Institute on Disability and Rehabilitation Research of the U.S. Department of Education (#H133E30010). Additional support has been provided by the Nemours Foundation. The authors would like to thank Clifford Kushler of the Prentke Romich Company for his collaboration on the project. In addition we thank John Gray for his discussions and implementation of many of the C++ aspects of the system, and Marjeta Cedilnik for her work on the grammar (and transformation rules).

REFERENCES

1. Alm, N., Newell, A., & Arnott, J. A communication aid which models conversational patterns. In R. Steele & W. Gerrey (Eds.), Proceedings of the Tenth Annual Conference on Rehabilitation Technology (pp. 127-129). Washington, DC: RESNA, 1987.

2. Demasco, P. W. & McCoy, K. F. Generating text from compressed input: An intelligent interface for people with severe motor impairments. Communications of the ACM, 35(5), 68-78, 1992.

3. Kumin, L. Communication skills in children with Down Syndrome: A guide for parents. Rockville, MD: Woodbine House, 1994.

4. McCoy, K. F., Demasco, P., Gong, Y., Pennington, C., & Rowe, C. Toward a communication device which generates sentences. In J. J. Presperin (Ed.), Proceedings of the Twelfth Annual RESNA Conference (pp. 41-43). Washington, DC: RESNA Press, 1989.

5. McCoy, K. F., Demasco, P. W., Jones, M. A., Pennington, C. A., Vanderheyden, P. B., & Zickus, W. M. A communication tool for people with disabilities: Lexical semantics for filling in the pieces. In Proceedings of the First Annual ACM Conference on Assistive Technologies (pp. 107-114). New York: ACM, 1994.

6. McCoy, K. F. & Suri, L. Z. Designing a computer tool for deaf writers acquiring written English. Presented at ISAAC-92. Abstract appears in Augmentative and Alternative Communication, 8, 1992.

7. Newell, A. F., Arnott, J. L., Beattie, W., & Brophy, B. Effect of the "PAL" word prediction system on the quality and quantity of text generation. Augmentative and Alternative Communication, 8, 304-311, 1992.

8. Roth, F. P. & Casset-James, E. L. The language assessment process: Clinical implications for individuals with severe speech impairments. Augmentative and Alternative Communication, 5, 165-172, 1989.

9. Swiffin, A. L., Arnott, J. L., & Newell, A. F. The use of syntax in a predictive communication aid for the physically impaired. In R. Steele & W. Gerrey (Eds.), Proceedings of the Tenth Annual Conference on Rehabilitation Technology (pp. 124-126). Washington, DC: RESNA, 1987.

10. Vandyke, J., McCoy, K., & Demasco, P. Using syntactic knowledge for word prediction. Presented at ISAAC-92. Abstract appears in Augmentative and Alternative Communication, 8, 1992.

11. Waller, A., Alm, N., & Newell, A. Aided communication using semantically linked text modules. In J. J. Presperin (Ed.), Proceedings of the 13th Annual RESNA Conference (pp. 177-178). Washington, DC: RESNA, 1990.

12. Waller, A., Broumley, L., & Newell, A. Incorporating conversational narratives in an AAC device. Presented at ISAAC-92. Abstract appears in Augmentative and Alternative Communication, 8, 1992.

13. Wright, A., Beattie, W., Booth, L., Ricketts, W., & Arnott, J. An integrated predictive word processing and spelling correction system. In J. Presperin (Ed.), Proceedings of the 15th Annual RESNA Conference (pp. 369-370). Washington, DC: RESNA, 1992.
