PARSING SPOKEN LANGUAGE WITHOUT SYNTAX: A MICROSEMANTIC APPROACH

Jean-Yves ANTOINE
CLIPS - IMAG
Université Joseph Fourier - CNRS - INPG
BP 53, F-38041 Grenoble Cedex, FRANCE
Email: Jean-Yves.Antoine@imag.fr

ABSTRACT

Parsing spontaneous speech is a difficult task because of the ungrammatical nature of most spoken utterances. To overcome this problem, we propose in this paper to handle spoken language without considering syntax. We thus describe a microsemantic parser which is based solely on an associative network of semantic priming. Experimental results on spontaneous speech show that this parser is a robust alternative to standard syntax-driven ones.

1. INTRODUCTION

The need for a robust parsing of spontaneous speech is more and more essential as spoken human-machine communication undergoes an impressive development. However, the extreme structural variability of spoken language seriously hinders the achievement of such an objective. Because of its dynamic and uncontrolled nature, spontaneous speech indeed presents a high rate of ungrammatical constructions:

• hesitations
• repetitions
• self-corrections
• ellipses
• interruptions
• inserted comments

As a result, spontaneous speech rapidly defeats most syntactic parsers, in spite of the frequent addition of ad hoc corrective methods [Stallard 91, Seneff 92, Young 93]. Most speech systems therefore exclude a complete syntactic parsing of the sentence. They restrict the analysis, on the contrary, to a simple keyword extraction [Appelt 92, Clementino 93]. This selective approach has led to significant results in some restricted applications (ATIS...). It seems however unlikely to be appropriate for higher-level tasks, which involve a more complex communication between the user and the computer. Thus, neither the syntactic methods nor the selective approaches can fully satisfy the constraints of robustness and exhaustiveness that spoken human-machine communication requires. This paper presents a detailed semantic parser which masters most spoken utterances. In a first part, we describe the semantic knowledge our parser relies on. We then detail its implementation. Experimental results, which suggest the suitability of this model, are finally provided.

2. MICROSEMANTICS

Most syntactic formalisms (LFG [Bresnan 82], HPSG [Pollard 87], TAG [Joshi 87]) attach a major importance to subcategorization, which accounts for the grammatical dependencies inside the sentence. We consider, on the contrary, that subcategorization issues from a lexical semantic knowledge we will further call microsemantics [Rastier 94].

Figure 1 — Microsemantic structure of the sentence "I select the left device". [The figure shows (to select) governing (I) through AGT and (device) through OBJ; (device) in turn governs (the) through DET and (left) through ATT.]

Our parser thus aims at building a microsemantic structure (figure 1) which fully describes the meaning dependencies inside the sentence. The corresponding relations are labeled with several microsemantic cases (table 1) which only intend to cover the system's application field (computer-helped drawing). Although some microsemantic cases share a similar denomination with Fillmore's ones [Fillmore 77], noticeable theoretical differences separate our model from standard case grammars [Antoine 94a: 144].

Table 1 — Microsemantic cases. Linked words are italicized in the original. Case markers (prepositions) are capitalized.

Label   Semantic case             Example
DET     determiner                Remove the stairway
AGT     agent                     I remove the stairway
ATT     attribute                 Remove the left balcony
OBJ     object / theme            Remove the balcony
LOC     location / destination    Place a window ON the frontage
OWN     meronomy / ownership      The floor OF the kitchen is tiled
MOD     modality                  I want to erase the balcony
INS     instrument                Remove the balcony WITH the eraser
DAT     dative                    Give the room a balcony
TPS     time                      Save BEFORE leaving
SCE     source                    Move the balcony FROM the back to the front
QTE     quantity                  There are two stairs
BUT     goal                      Move it IN ORDER TO place the stairway
COO     coordination              Draw a door and a window
TAG     case marker               Move it FROM the back
REF     anaphoric reference       I remove the door which is on the right

The microsemantic parser achieves a fully lexicalized analysis. It relies on a microsemantic lexicon in which every entry represents a single lexeme¹. Each lexeme is described by the following feature structure:

PRED     lexeme identifier
MORPH    morphological realizations
SEM      semantic domain
SUBCAT   subcategorization frame

Example: to draw

[ Pred   = 'to draw'
  Morph  = {'draw', 'draws', 'drew', 'drawn'}
  Subcat = [ AGT   = /element/ + /animate/
             OBJ   = /element/ + /concrete/
             (LOC) = /property/ + /on_element/ + /place/ ]
  Sem    = /task-domain/ ]

The microsemantic subcategorization frames describe the meaning dependencies the lexeme dominates². Their arguments are not ordered. The optional arguments are in brackets, as opposed to the compulsory ones. Finally, adverbial phrases are not subcategorized.
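
As an illustration, here is a minimal Python sketch of how such a lexicon entry could be represented; the field names mirror the feature structure above, while the class itself and its layout are our own assumptions, not the paper's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Lexeme:
    """One microsemantic lexicon entry (PRED / MORPH / SEM / SUBCAT)."""
    pred: str                     # lexeme identifier
    morph: set[str]               # morphological realizations
    sem: str                      # semantic domain (semantic field)
    subcat: dict[str, list[str]] = field(default_factory=dict)  # case -> features
    optional: set[str] = field(default_factory=set)             # optional cases

# The example entry for "to draw", transcribed from the paper:
TO_DRAW = Lexeme(
    pred="to draw",
    morph={"draw", "draws", "drew", "drawn"},
    sem="task-domain",
    subcat={
        "AGT": ["/element/", "/animate/"],
        "OBJ": ["/element/", "/concrete/"],
        "LOC": ["/property/", "/on_element/", "/place/"],
    },
    optional={"LOC"},
)
```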

3. SEMANTIC PRIMING

Any speech recognition system involves a high perplexity which requires the definition of top-down parsing constraints. This is why we based the microsemantic parsing on a priming process.

3.1. Priming process

Semantic priming is a predictive process in which some already uttered words (priming words) call some other ones (primed words) through various meaning associations. It pursues a double goal:

• It constrains the speech recognition.
• It characterizes the meaning dependencies inside the sentence.

Each priming step involves two successive processes. At first, the contextual adaptation favors the priming words which are consistent with the semantic context. The latter is roughly modeled by two semantic fields: the task domain and the computing domain. Then, the relational priming identifies the lexemes which share a microsemantic relation with one of the already uttered words. These relations issue directly from the subcategorization frames of these priming words.

¹ Lexeme = lexical unit of meaning.
² The microsemantic subcategorization frames should hereby be compared with the a-structure of the late LFG [Bresnan 89].

Figure 2 — Structure of the priming network. [The figure shows a multi-layered network: the inputs (priming words) undergo temporal forgetting, then the contextual adaptation stage — (1) context specification, (2) the contextual layer of semantic fields, (3) contextual focusing — and the relational priming stage — (4i) argumentative dispatching, (5i) case-based priming within case-based sub-networks (AGT, OBJ, BUT, ...), (6) priming collection — down to the outputs (primed words). A structural layer S handles the prepositions and the coordinations.]

3.2. Priming network

The priming process is carried out by an associative multi-layered network (figure 2) which results from the compilation of the lexicon. Each cell of the network corresponds to a specific lexeme. The inputs represent the priming words. Their activities are propagated up to the output layer, which corresponds to the primed words. An additional layer (structural layer S) furthermore handles the coordinations and the prepositions. We will now describe the propagation of the priming activities. Let us consider the following notations:

• $t$ — current step of analysis
• $a_i^j(t)$ — activity of the cell $j$ of the layer $i$ at step $t$, with $i \in \{1, 2, 3, 4\alpha,\dots, 5\alpha,\dots, 6, S\}$
• $\omega_{i,j}^{k,l}(t)$ — synaptic weight between the cell $k$ of the layer $i$ and the cell $l$ of the layer $j$.

Temporal forgetting — At first, the input activities are slightly modulated by a process of temporal forgetting:

$a_1^i(t) = a_{\max}$  if $i$ corresponds to the current word,
$a_1^i(t) = a_{\max}$  if $i$ corresponds to the primer of the current word,
$a_1^i(t) = \mathrm{Max}\big(0,\; a_1^i(t-1) - \Delta_{forget}\big)$  otherwise.

Although it favors the most recent lexemes, this process does not prevent long-distance primings.

Contextual adaptation — Each cell of the second layer represents a peculiar semantic field. Its activity depends on the semantic affiliations of the priming words:

$$a_2^j(t) = \sum_i \omega_{1,2}^{i,j}(t) \cdot a_1^i(t) \qquad (1)$$

with:

$\omega_{1,2}^{i,j}(t) = \omega_{\max}$  if the priming lexeme $i$ belongs to the semantic field $j$,
$\omega_{1,2}^{i,j}(t) = -\omega_{\max}$  otherwise.

Then, these contextual cells dynamically modulate the initial priming activities:

$$a_3^j(t) = a_1^j(t) + \sum_i \omega_{2,3}^{i,j}(t) \cdot a_2^i(t) \qquad (2)$$

with:

$\omega_{2,3}^{i,j}(t) = \Delta_{context}$  if the priming lexeme $j$ belongs to the semantic field $i$,
$\omega_{2,3}^{i,j}(t) = -\Delta_{context}$  otherwise.

The priming words which are consistent with the current semantic context are therefore favored.
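
As a concrete reading of the forgetting step and of equations (1) and (2), here is a minimal NumPy sketch; the constants ($a_{\max}$, $\Delta_{forget}$, $\omega_{\max}$, $\Delta_{context}$) take arbitrary illustrative values since the paper does not give them, and all names are ours:

```python
import numpy as np

A_MAX, D_FORGET = 100.0, 10.0    # illustrative values only
W_MAX, D_CONTEXT = 1.0, 0.2

def contextual_adaptation(a1, membership, current, primer=None):
    """One priming step up to layer 3.

    a1         -- input-layer activities at t-1, shape (n_lexemes,)
    membership -- boolean matrix, shape (n_lexemes, n_fields):
                  membership[i, j] is True iff lexeme i belongs to semantic field j
    current    -- index of the current word; primer -- index of its primer, if any
    """
    # Temporal forgetting: decay everything, then refresh the current word and its primer.
    a1 = np.maximum(0.0, a1 - D_FORGET)
    a1[current] = A_MAX
    if primer is not None:
        a1[primer] = A_MAX

    # (1) Semantic-field activities: +w_max from member lexemes, -w_max from the others.
    w12 = np.where(membership, W_MAX, -W_MAX)           # (n_lexemes, n_fields)
    a2 = a1 @ w12                                       # (n_fields,)

    # (2) Contextual modulation of the initial priming activities.
    w23 = np.where(membership, D_CONTEXT, -D_CONTEXT)   # same membership pattern
    a3 = a1 + w23 @ a2                                  # (n_lexemes,)
    return a1, a3
```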

Relational priming — The priming activities are then dispatched among several sub-networks which perform parallel analyses on distinct microsemantic cases (figure 3). The dispatched activities therefore represent the priming power of the priming lexemes on each microsemantic case:

$$a_{4\alpha}^i(t) = \sum_j \omega_{3,4\alpha}^{i,j}(t) \cdot a_3^j(t) = \omega_{3,4\alpha}^{i,i}(t) \cdot a_3^i(t) \qquad (3)$$

The dispatching weights are dynamically adapted during the parsing (see section 4). Their initial values derive from the compilation of the lexical subcategorization frames:

$\omega_{3,4\alpha}^{i,j}(0) = 0$  if $i \neq j$,
$\omega_{3,4\alpha}^{i,i}(0) = \delta_{\max}$  if the case $\alpha$ corresponds to a compulsory argument of the lexeme $i$, or if the latter should fulfill $\alpha$ alone,
$\omega_{3,4\alpha}^{i,i}(0) = \delta_{\min}$  if the case $\alpha$ corresponds to an optional argument of the lexeme $i$, or if the latter should fulfill $\alpha$ thanks to a preposition,
$\omega_{3,4\alpha}^{i,i}(0) = 0$  otherwise.

The inner synaptic weights of the case-based sub-networks represent the microsemantic relations between the priming and the primed words:

$\omega_{4\alpha,5\alpha}^{i,j}(t) = \omega_{\max}$  if $i$ and $j$ share a microsemantic relation which corresponds to the case $\alpha$,
$\omega_{4\alpha,5\alpha}^{i,j}(t) = \omega_{\min}$  otherwise.

The outputs of the case-based sub-networks, as well as the final priming excitations, are then calculated through a maximum heuristic:

$$a_{5\alpha}^j(t) = \mathrm{Max}_i \left( \omega_{4\alpha,5\alpha}^{i,j}(t) \cdot a_{4\alpha}^i(t) \right) \qquad (4)$$

$$a_6^j(t) = \mathrm{Max}_\alpha \left( a_{5\alpha}^j(t) \right) \qquad (5)$$

The lexical units are finally sorted into three coarse sets:

$a_6^i(t) > T_{high}$ — primed words
$T_{high} > a_6^i(t) > T_{low}$ — primable words
$a_6^i(t) < T_{low}$ — rejected words
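
To make equations (3) to (5) and this final sorting concrete, here is a continuation of the numerical sketch above, again with NumPy and with constants and names of our own choosing:

```python
import numpy as np

def relational_priming(a3, w_dispatch, relations, t_high=80.0, t_low=40.0):
    """Relational priming over the case-based sub-networks.

    a3         -- adapted activities, shape (n_lexemes,)
    w_dispatch -- dispatching weights per case, shape (n_cases, n_lexemes),
                  initialized to delta_max / delta_min / 0 from the subcat frames
    relations  -- boolean tensor, shape (n_cases, n_lexemes, n_lexemes):
                  relations[c, i, j] is True iff lexemes i and j share a
                  microsemantic relation corresponding to case c
    t_high, t_low -- illustrative threshold values
    """
    W_MAX, W_MIN = 1.0, 0.1
    # (3) Argumentative dispatching: priming power of each lexeme on each case.
    a4 = w_dispatch * a3                                # (n_cases, n_lexemes)

    # (4) Case-based priming through a maximum heuristic over priming lexemes i.
    w45 = np.where(relations, W_MAX, W_MIN)             # (n_cases, n_lex, n_lex)
    a5 = (w45 * a4[:, :, None]).max(axis=1)             # (n_cases, n_lexemes)

    # (5) Priming collection: best excitation across all cases.
    a6 = a5.max(axis=0)                                 # (n_lexemes,)

    # Final sorting into primed / primable / (implicitly) rejected words.
    primed = np.flatnonzero(a6 > t_high)
    primable = np.flatnonzero((a6 > t_low) & (a6 <= t_high))
    return a6, primed, primable
```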

The primed words aim at constraining the speech recognition, thereby guaranteeing the semantic coherence of the analysis (figure 3). These constraints can be relaxed by considering the primable words. Every recognized word is finally handled by the parsing process with its priming relation (see section 4).

Figure 3 — Architecture of the microsemantic parser. [The figure sketches the speech understanding loop: the priming stage provides the primed words to the speech recognition; the recognized words, together with their priming relations, feed the microsemantic parsing, which builds the microsemantic structure; the network adaptation stage tunes the synaptic weights in return.]

3.3. Prepositions

Prepositions restrict the microsemantic assignment of the objects they introduce. As a result, the prepositional cells of the structural layer dynamically modulate the case-based dispatching weights to prohibit any inconsistent priming. Rule (3') therefore stands for (3):

$$a_{4\alpha}^i(t) = \omega_{3,4\alpha}^{i,i}(t) \cdot \Big( \sum_k \omega_{\alpha}^{k}(t) \cdot a_S^k(t) \Big) \cdot a_3^i(t) \qquad (3')$$

with:

$\omega_{\alpha}^{k}(t) = \omega_{\max}$  if $\alpha$ is consistent with the preposition $k$,
$\omega_{\alpha}^{k}(t) = 0$  otherwise,

and:

$a_S^k(t) = a_{\max}$  while the object of an uttered preposition $k$ is not assigned a case,
$a_S^k(t) = 0$  otherwise.

Finally, the preposition fills the TAG argument of the introduced object.
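
Read as code, rule (3') multiplies the dispatching weights by a prepositional gate; a minimal sketch, with our own names and illustrative constants:

```python
import numpy as np

def gated_dispatch(a3, w_dispatch, prep_consistent, prep_pending):
    """Rule (3'): prepositional gating of the argumentative dispatching.

    prep_consistent -- boolean matrix, shape (n_cases, n_preps):
                       True iff case alpha is consistent with preposition k
    prep_pending    -- boolean vector, shape (n_preps,): True while the object
                       of an uttered preposition k has not been assigned a case
    """
    W_MAX, A_MAX = 1.0, 100.0
    w_alpha = np.where(prep_consistent, W_MAX, 0.0)   # (n_cases, n_preps)
    a_s = np.where(prep_pending, A_MAX, 0.0)          # (n_preps,)
    gate = w_alpha @ a_s                              # (n_cases,)
    return (w_dispatch * gate[:, None]) * a3          # (n_cases, n_lexemes)
```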

3.4. Coordinations

For the time being, the parser deals only with logical coordinations (and, or, but...). In such cases, the coordinated elements must share the same microsemantic case. This constraint is worked out by recalling the already fulfilled microsemantic relations, which were all previously stacked. The dispatching is thus restricted to the recalled relations every time a coordination occurs:

$\omega_{3,4\alpha}^{i,j}(t) = \omega_{3,4\alpha}^{i,j}(0)$  if a relation directed by the lexeme $i$ on the case $\alpha$ was stacked,
$\omega_{3,4\alpha}^{i,j}(t) = 0$  otherwise.

Then, the priming proceeds ordinarily. The coordinated words are finally considered the COO arguments of the conjunction, which is assigned to the shared microsemantic case (figure 4).

"I want to draw a circle and..."

[ Pred = 'to draw'
  AGT  = [ Pred = 'I' ]
  MOD  = [ Pred = 'to want' ]
  OBJ  = [ Pred = 'circle'
           DET = [ Pred = 'a' ] ] ]

"I want to draw a circle and a square"

[ Pred = 'to draw'
  AGT  = [ Pred = 'I' ]
  MOD  = [ Pred = 'to want' ]
  OBJ  = [ Pred = 'and'
           COO = [ Pred = 'circle'
                   DET = [ Pred = 'a' ] ]
           COO = [ Pred = 'square'
                   DET = [ Pred = 'a' ] ] ] ]

Figure 4 — Current state of the microsemantic structure³ during the parsing of a coordination.

3.5. Back priming

Generally speaking, the priming process provides a set of words that should follow the already uttered lexemes. In some cases, a lexeme might however occur before its priming word:

(a) I want to enlarge the small window    (small is back-primed by window)

Back-priming situations are handled through the following algorithm. Every time a new word occurs:
1. If this word was not primed, it is pushed onto a back-priming stack.
2. Otherwise, a specific priming step checks whether this word back-primes some stacked ones. Back-primed words are then popped out.
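
This algorithm translates almost directly into code; a sketch, where the stack representation and the back_primes predicate are our own:

```python
def handle_word(word, primed, back_primes, stack):
    """Back-priming step for each newly recognized word.

    primed      -- set of words currently primed by the network
    back_primes -- predicate: back_primes(w, stacked_word) is True iff w
                   back-primes the stacked word (it fills one of w's cases)
    stack       -- back-priming stack of not-yet-attached words
    """
    if word not in primed:
        stack.append(word)          # e.g. "small" waits for its priming word
        return []
    # A specific priming step: pop every stacked word this new word back-primes.
    popped = [w for w in stack if back_primes(word, w)]
    stack[:] = [w for w in stack if w not in popped]
    return popped
```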

³ The illustrated parse trees are actually simplified: the MORPH and SUBCAT fields are for instance missing.

4. MICROSEMANTIC PARSING

4.1. Unification

The microsemantic parsing relies on the unification of the subcategorization frames of the lexemes that are progressively recognized. This unification must respect four principles:

• Unicity — Any subcategorized argument⁴ must be fulfilled by at most a unique lexeme or a coordination.
• Coherence — Any lexeme must fulfill at most a unique subcategorized argument.
• Coordination — Coordinated lexemes must fulfill the same subcategorized argument.
• Relative completeness — Any subcategorized case might remain unfulfilled, although the parser must always favor the most complete analysis.

The principle of relative completeness is motivated by the high frequency of incomplete utterances (ellipses, interruptions...) that spontaneous speech involves. The parser only aims at extracting an unfinished microsemantic structure which pragmatics should then complete. As noticed previously with the coordinations, these principles preventively govern the contextual adaptation of the priming network, so that any incoherent priming is excluded. This adaptation is achieved through the dynamic modulation of the synaptic weights.
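
As an illustration only, a minimal checker for the first two principles over a flat set of (governor, case, dependent) links — the representation is ours, not the paper's:

```python
from collections import Counter

def respects_principles(links):
    """links: list of (governor, case, dependent) microsemantic relations.

    Checks Unicity (at most one filler per governor/case slot; COO slots are
    exempt, since coordinated lexemes attach to the conjunction through COO)
    and Coherence (each dependent fills at most one slot). Relative
    completeness is not a filter: unfulfilled cases are allowed, the parser
    simply prefers the most complete analysis.
    """
    slot_fillers = Counter((gov, case) for gov, case, _ in links if case != "COO")
    if any(n > 1 for n in slot_fillers.values()):
        return False                                       # Unicity violated
    dependents = Counter(dep for _, _, dep in links)
    return all(n == 1 for n in dependents.values())        # Coherence
```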

4.2. An example of microsemantic parsing

Let us consider a small parsing example to see exactly how the analysis proceeds:

(b) one creates a table in the kitchen as well as a chair.

At first, the pronoun one primes the following lexemes (each primed word is supplied with its excitation $a_6^i$, here 100 for all of them): to cancel, to go, to leave, to repeat, to launch, to use, to modify, to group, to dissociate, to copy, to cut, to zoom, to reduce, to enlarge, to draw, to start, to duplicate, to erase, to move, to place, to choose, to create, to put, to select, to display, to quit, to open, to close, to kill, to call.

These words were all primed through the same relation (AGT). Since the semantic context is furthermore not yet relevant, the priming excitations are all identical. Several recognition hypotheses are usually preserved among the primed words; we will only consider in this example the relevant hypothesis. The parser thus processes to create, which is linked to one in the microsemantic structure:

to create
  AGT  one

Then occurs a new step of priming (excitations in parentheses): bathroom (102), toilet (102), kitchen (102), sitting-room (102), bedroom (102), house (102), square (102), triangle (102), circle (102), lozenge (102), line (102), wall (102), window-1 (102), door (102), stairway (102), room (102), table (102), chair (102), bed (102), now (102), then (102), here (97), there (97), on the right (97), on the left (97), window-2 (97), display (97).

⁴ Excepting MOD and ATT arguments. A similar exception is observed in LFG with the adjunct objects.

The analysis of to create favored the task domain, thereby enhancing the priming of the first meaning of window:

(c1) Draw a window upstairs — task domain, activity = 102
(c2) Close the compilation window — computing domain, activity = 97

The next word (the determiner a) was not primed. It is consequently memorized in the back-priming stack. Then occurs table, which is preferably identified as the object of to create (LOC corresponds only to an optional argument):

table:  to_create  OBJ  100
        to_create  LOC   71

The determiner a is then back-primed by table and the parser provides the following microsemantic structure:

to create
  OBJ  table
    DET  a
  AGT  one

Then the analysis goes on word after word. At last, the parser provides the following accurate structure:

to create
  AGT  one
  OBJ  as well as
    COO  table
      DET  a
    COO  chair
      DET  a
  LOC  kitchen  (TAG in)
    DET  the

5. LINGUISTIC ABILITIES

As illustrated by the previous example, the microsemantic parser masters rather complex sentences. The study of its linguistic abilities offers a persuasive view of its structural power.

5.1. Linguistic coverage

Although our parser is dedicated to French applications, we expect our semantic approach to be easily extended to other languages. We will now study several linguistic phenomena the parser masters easily.

Compound tenses and passive — According to the microsemantic point of view, the auxiliaries appear as a mark of modality of the verb. As a result, the parser considers any auxiliary an ordinary MOD argument of the verb, whether the VP corresponds to a compound tense (d) or a passive (e).

♦ compound tenses

(d1) J'ai mangé. (= *Je avoir manger)
     *I has eaten. / I ate.

[ Pred = 'manger'
  MOD  = [ Pred = 'avoir' ]
  AGT  = [ Pred = 'je' ] ]

(d2) Je suis venu. (= *Je être venir)
     *I am come. / I came.

[ Pred = 'venir'
  MOD  = [ Pred = 'être' ]
  AGT  = [ Pred = 'je' ] ]

♦ passive voice

(e) Le carré est effacé [par le logiciel].
    The square is erased [by the software].

[ Pred = 'effacer'
  MOD  = [ Pred = 'être' ]
  OBJ  = [ Pred = 'carré'
           DET = [ Pred = 'le' ] ]
  AGT  = [ Pred = 'logiciel'
           DET = [ Pred = 'le' ]
           TAG = 'par' ] ]

Interrogations — Three interrogative forms are met in French:

♦ closed questions

(f1) subject inversion: Déplaçons-nous le carré ?
     *Move we the square? / Do we move the square?
(f2) est-ce-que: Est-ce-que nous déplaçons le carré ?
     *Is-it-that we move the square?
(f3) intonation: Nous déplaçons le carré ?
     *We move the square?

♦ open questions

(g1) subject inversion: Où déplaçons-nous le carré ?
     *Where move we the square? / Where do we move the square?
(g2) est-ce-que: Où est-ce-que nous déplaçons le carré ?
     *Where is-it-that we move the square?
(g3) intonation: Nous déplaçons le carré où ?
     *We move the square where?

Since the parser ignores most word-order considerations, the interrogative utterances are processed like any declarative ones. This approach suits spontaneous speech perfectly, as it rarely involves a subject inversion. Closed questions are consequently characterized either by a prosodic analysis (figure 5a) or by the adverbial phrase est-ce-que (figure 5b). Open questions are on the contrary introduced explicitly by an interrogative pronoun which stands for the missing subcategorized argument (figure 5c).

a — utterances (f1) and (f3):

[ Pred = 'déplacer'
  AGT  = [ Pred = 'nous' ]
  OBJ  = [ Pred = 'carré'
           DET = [ Pred = 'le' ] ] ]

b — utterance (f2):

[ Pred = 'déplacer'
  AGT  = [ Pred = 'nous' ]
  OBJ  = [ Pred = 'carré'
           DET = [ Pred = 'le' ] ]
  MOD  = 'est-ce-que' ]

c — utterance (g2):

[ Pred = 'déplacer'
  AGT  = [ Pred = 'nous' ]
  OBJ  = [ Pred = 'carré'
           DET = [ Pred = 'le' ] ]
  LOC  = [ Pred = 'où' ]
  MOD  = 'est-ce-que' ]

Figure 5 — Microsemantic structures of several interrogative sentences.

Relative clauses — Every relative clause is considered an argument of the lexeme the relative pronoun refers to. The microsemantic structures of the main and the relative clauses are however kept distinct to respect the principle of coherence. The two parse trees are indirectly related by an anaphoric relation (REF):

(h) The door encumbers the window which is drawn on the right.

[ Pred = 'to encumber'
  AGT  = [ Pred = 'door'
           DET = [ Pred = 'the' ] ]
  OBJ  = [ Pred = 'window' = [1]
           DET = [ Pred = 'the' ] ] ]

[ Pred = 'to draw'
  MOD  = [ Pred = 'is' (passive) ]
  OBJ  = [ Pred = 'that'
           REF = [1] ]
  LOC  = [ Pred = 'on the right' ] ]

Subordinate clauses — Provided the dependent clause is not a relative one, the subordinate verb is subcategorized by the main one. As a result, subordinate clauses are parsed like any ordinary object:

(i) You draw a square as soon as the circle is erased.

[ Pred = 'to draw'
  AGT  = [ Pred = 'you' ]
  OBJ  = [ Pred = 'square'
           DET = [ Pred = 'a' ] ]
  BUT  = [ Pred = 'to erase'
           TAG = 'as soon as'
           MOD = [ Pred = 'is' ]
           OBJ = [ Pred = 'circle'
                   DET = [ Pred = 'the' ] ] ] ]

Word-order phenomena — Other word-order phenomena, which are very frequent in spontaneous speech, are also mastered by the microsemantic parser. The following constructions are for instance correctly parsed:

(j1) Ensuite, en face de la chambre, dessiner la salle de bains.
     Then, in front of the bedroom, draw the bathroom.

(j2) C'est le carré qu'on déplace.
     *It is the square that we move.

5.2. Spontaneous constructions

The previous examples show that the microsemantic parser achieves a linguistic coverage which noticeably meets the needs of a natural spoken human-machine communication. Its suitability is however still more patent when considering spontaneous speech. The parser indeed masters most of the spontaneous ungrammatical constructions without any specific mechanism.

Hesitations — Hesitations always involve a silent pause, which usually occurs with an interjection:

(k) Dessine ... euh ... un carré bleu.
    Draw ... well ... a blue square.

The interjection is considered an empty lexeme, so that it does not alter the parsing, even when the hesitation occurs with some other spontaneous phenomena (repetitions, self-corrections...).

Repetitions and self-corrections — Repetitions and self-corrections seem to violate the principle of unicity. They indeed involve several lexemes which share the same microsemantic case:

(l1) repetitions:       * Select the device ... the right device.
(l2) self-corrections:  * Close the display ... the window.

These constructions are actually considered a peculiar coordination where the conjunction is missing [De Smedt 87, Levelt 89: 484-495]. They are consequently parsed like any coordination.

(m) * Draw a door ... no ... a french window !

Some self-corrections (m) are otherwise explicitly introduced by a negative adverb which appears as a logical connector.

Some self-corrections (m) are otherwise explicitly introduced by a negative adverb which appears as a logical connector. Ellipses and interruptions — The principle of relative completeness is mainly designed for the ellipses and the interruptions. Our parser is thus able to extract alone the incomplete microsemantic structure of any interrupted utterance. On the contrary, the criterion of relative completeness is deficient for most of the ellipses like (t), where the upper predicate to move is omitted : (n)

* [Move] The left door on the right too.

The parser actually lacks too much information to unite the arguments door and on the right in the same microsemantic structure. Such wide ellipses should nevertheless be recovered at an upper pragmatic level.

Comments — Generally speaking, comments do not share any microsemantic relation with the sentence they are inserted in:

(o) * Draw a vertical line ... that's it ... on the right.

For instance, the idiomatic phrase that's it is a mark of confirmation which is related to (o) at the pragmatic level and not at the microsemantic one. As a result, the microsemantic parser cannot unify the main clause and the comment. We expect however further studies on pragmatic marks to enhance the parsing of these constructions. Despite this weakness, the robustness of the microsemantic parser is already substantial. The following experimental results will thus suggest the suitability of our model for spontaneous speech parsing.

6. EXPERIMENTAL RESULTS

This section presents several experiments that were carried out on our microsemantic analyzer as well as on an LFG parser [Zweigenbaum 91]. These experiments were achieved on the literal written transcription of three corpora of spontaneous speech (table 2) which all correspond to a collaborative task of drawing between two human subjects.

Table 2 — Description of the experimental corpora.

Corpus     Task                    Number of utterances   Average utterance length
corpus 1   architectural drawing   260                    11.8 words per utterance
corpus 2   abstract drawing        157                    11.3 words per utterance
corpus 3   elementary drawing      179                     5.7 words per utterance

The first subject was simulating a computer-helped drawing system while the second one — the instructor — was playing the user of the system. Both subjects were facing a personal computer in two separate rooms. They were interacting via audio and visual (mouse + display) channels of communication. The dialogues were totally unconstrained, so that the corpora correspond to natural spontaneous speech. We compared the two parsers according to their robustness and their perplexity.

6.1. Robustness

Table 3 provides the accuracy rates of the two parsers. These results show the benefits of our approach. Around four utterances out of five (x̄ = 83.5%) are indeed processed correctly by the microsemantic parser, whereas the LFG's accuracy is limited to 40% on the first two corpora. Its robustness is noticeably higher on the third corpus, which presents a moderate ratio of ungrammatical utterances. The overall performances of the LFG suggest nevertheless that a syntactic approach is not suitable for spontaneous speech, in contrast to the microsemantic one. Most of the microsemantic failures furthermore involve a comment or a wide ellipsis, which should be handled at a pragmatic level.

Table 3 — Average robustness of the LFG and microsemantic parsers. Accuracy rate = number of correct analyses / number of tested utterances.

Parser                 corpus 1   corpus 2   corpus 3   x̄       σn
LFG parser             0.408      0.401      0.767      0.525   0.170
Microsemantic parser   0.853      0.785      0.866      0.835   0.036

Besides, the independence of microsemantics from the grammatical shape of the utterances warrants that our parser's robustness remains relatively unaltered in all circumstances (standard deviation σn = 0.036). This microsemantic stability was also observed in an inter-individual study (table 4), whereas the same investigation showed that the LFG robustness fluctuates strongly (from 18% to 74%) among several different speakers. These significant performances are however not without cost with regard to perplexity.

Table 4 — Individual accuracy rates of the LFG and microsemantic parsers on corpus 1.

Parser                 speaker 1   speaker 2   speaker 3   speaker 4   speaker 5   speaker 6   σn
LFG parser             0.74        0.25        0.58        0.18        0.27        0.41        0.20
Microsemantic parser   0.88        0.74        0.96        0.84        0.68        0.80        0.09

6.2. Perplexity

Two kinds of perplexity were studied. On the one hand, the priming perplexity characterizes the top-down constraining power of the microsemantic parser. On the other hand, the structural perplexity measures the bottom-up degree of microsemantic ambiguity.

Priming perplexity — Table 5 shows that the microsemantic parser primes one fifth of the lexicon on average. This top-down perplexity seems quite moderate and should be compared with that of a context-free grammar. The priming perplexity furthermore remains unaltered whatever the grammatical regularity of the utterance.

Table 5 — Average priming perplexity (corpora 1 and 2). Any utterance which is correctly parsed by the LFG analyzer is considered "grammatical".

Criterion                                   "grammatical" sentences   "ungrammatical" sentences
number of primed words / lexicon size       0.22                      0.20
rank of the expected word / lexicon size    0.17                      0.16

Structural perplexity — As mentioned above, the microsemantic parser ignores to a large extent most of the constraints of linear precedence. This tolerant approach is motivated by the frequent ordering violations spontaneous speech involves. It however leads to a noticeable increase of the bottom-up perplexity. This deterioration is particularly patent for sentences which include at least eight lexemes (figure 6).

Figure 6 — Structural perplexity of the LFG and microsemantic parsers. [The figure plots the final number of structural hypotheses (0 to 20) against the number of lexemes in the utterance (2 to 10).]

At first, we proposed to reduce this perplexity through a cooperation between the microsemantic analyzer and an LFG parser [Antoine 94b]. Although this cooperation achieves a noticeable reduction of the perplexity, it is however ineffective when the LFG parser collapses. This is why we intend at present to insert directly some ordering constraints spontaneous speech never violates. [Rambow 94] established that any ordering rule should be expressed lexically. We consequently suggest to partially order the arguments of every lexical subcategorization. Thus, each frame will be assigned a few equations which will characterize some ordering priorities among its arguments. Many authors (HPSG [Pollard 87], TAG [Joshi 87], case grammars [???]) have already proposed to distinguish between subcategorization and word-order phenomena. Our approach presents however several peculiarities:

♦ partial ordering — The position of some arguments should not be specified. This tolerance is once again motivated by robustness considerations.

♦ relative constraints — The ordering equations will only define additional relative constraints which the priming process should enforce.

We expect this solution, which is not yet implemented, to noticeably restrict the structural perplexity of the parser without altering its robustness. Our analyzer would then be highly suitable for spontaneous speech parsing.
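
This extension is explicitly not implemented in the paper; purely to illustrate the idea, the ordering equations could be encoded as relative precedence pairs attached to each frame (all names and values here are hypothetical):

```python
def satisfies_ordering(frame_order, filled_positions):
    """frame_order: set of (case_a, case_b) pairs meaning 'a precedes b'.
    A partial order: unlisted argument pairs stay unconstrained, for robustness.
    filled_positions: dict mapping each fulfilled case to its word position."""
    return all(
        filled_positions[a] < filled_positions[b]
        for a, b in frame_order
        if a in filled_positions and b in filled_positions
    )

# e.g. a frame might only require that the agent precedes the object,
# leaving LOC unordered (hypothetical constraint):
ORDER_TO_DRAW = {("AGT", "OBJ")}
```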

7. CONCLUSION

In this paper, we argued that the structural variability of spontaneous speech prevents its parsing by standard syntactic analyzers. We then described a semantic analyzer, based on an associative priming network, which aims at parsing spontaneous speech without considering syntax. The linguistic coverage of this parser, as well as its robustness, have clearly shown the benefits of this approach. We furthermore expect the insertion of word-order constraints to noticeably decrease the perplexity of the microsemantic analyzer, so that the latter achieves a first-rate parsing of spontaneous speech.

REFERENCES

[Antoine 94a] J.Y. Antoine, "Cooperation syntaxe-sémantique pour la compréhension de la parole spontanée", PhD Thesis, INPG, Grenoble, France, 1994.
[Antoine 94b] J.Y. Antoine, J. Caelen, B. Caillaud, "Automatic adaptive understanding of spoken language by cooperation of syntactic parsing and semantic priming", Proceedings ICSLP'94, Yokohama, Japan, 799-802, 1994.
[Appelt 92] D. Appelt, E. Jackson, "SRI International February 1992 ATIS Benchmark Test Results", Proc. of the fifth DARPA Workshop on Speech and Natural Language, Harriman, NY, Morgan Kaufmann Inc., 1992.
[Bresnan 89] J. Bresnan, J. Kanerva, "Locative inversion in Chichewa: a study of factorization in grammar", Linguistic Inquiry, 20, 1-50, 1989.
[Clementino 93] D. Clementino, L. Fissore, "A Man-Machine Dialogue System for Speech Access to Train Time Table Information", Eurospeech'93, Berlin, FRG, 1863-1866, 1993.
[De Smedt 87] K. De Smedt, G. Kempen, "Incremental sentence production, self-correction and coordination", in G. Kempen (ed.), "Natural Language Generation", Kluwer, Dordrecht, NL, 1987.
[Fillmore 77] C.J. Fillmore, "The case for case reopened", in P. Cole, J. Sadock (eds.), "Syntax and Semantics 8: grammatical relations", Academic Press, New York, NY, 59-82, 1977.
[Joshi 87] A. Joshi, "The relevance of Tree Adjoining Grammar to generation", in G. Kempen (ed.), "Natural Language Generation", Reidel, Dordrecht, NL, 1987.
[Levelt 89] W. Levelt, "Speaking: from intention to articulation", MIT Press, Cambridge, MA, 1989.
[Pollard 87] C. Pollard, I. Sag, "Information-based syntax and semantics", CSLI Lecture Notes, 13, University of Chicago Press, IL, 1987.
[Rambow 94] O. Rambow, A. Joshi, "A Formal Look at Dependency Grammars and Phrase-Structure Grammars, with Special Consideration of Word-Order Phenomena", in L. Wanner (ed.), "Current Issues in Meaning-Text Theory", Pinter, London, 1994.
[Rastier 94] F. Rastier, "La Microsémantique", in F. Rastier, M. Cavazza, A. Abeillé, "Sémantique pour l'analyse", Masson, Paris, 43-82, 1994.
[Searle 69] J.R. Searle, "Speech acts", Cambridge University Press, Cambridge, UK, 1969.
[Seneff 92] S. Seneff, "Robust Parsing for Spoken Language Systems", ICASSP'92, vol. I, 189-192, San Francisco, CA, 1992.
[Stallard 91] D. Stallard, R. Bobrow, "Fragment processing in the DELPHI System", Proc. of the fourth DARPA Workshop on Speech and Natural Language, Pacific Grove, CA, Morgan Kaufmann Inc., 1991.
[Young 93] S.R. Young, W. Ward, "Semantic and pragmatically based re-recognition of spontaneous speech", Eurospeech'93, Berlin, FRG, 2143-2147, 1993.