Commas and Spaces: The Point of Punctuation

Robin L. Hill & Wayne S. Murray
University of Dundee
email:

[email protected]

[email protected]

11th Annual CUNY Conference on Human Sentence Processing, New Brunswick, New Jersey, March 19-21, 1998

Abstract

While it has been widely assumed that punctuation may play a critical role in parsing, there has been relatively little direct empirical investigation of its effects. Most researchers have either avoided the use of punctuation or have simply assumed that it will serve a disambiguating role. There has been little or no consideration of how ‘disambiguation’ might occur or whether it is equally effective across different structures. Previous work using self-paced reading (Hill & Murray, 1997) has in fact shown that simplistic conclusions related to the role of punctuation are unlikely to be supported. These studies showed that while punctuation can play a potent disambiguating role in some structures, the effect is by no means universal. These conclusions, however, depend on the assumption that punctuation acts in the same way with word-by-word self-paced reading as it does in more natural reading tasks. The studies reported here therefore extended this work by the monitoring of subjects’ eye movements while reading three types of locally ambiguous items with and without inserted punctuation. Since results from an earlier pilot study showed effects of punctuation on saccade length, an additional condition of increased spacing, without punctuation, was also included. The results showed potent effects of punctuation on first pass ‘garden pathing’ in two structures (early closure and reduced relative clause sentences), but not in sentences with prepositional phrase ambiguities. Punctuation also had effects on local processing difficulty, suggesting that it can cue some types of parsing decision at particular points in a sentence. A frequent effect of inserted punctuation was to increase processing time on sections of a sentence immediately preceding a comma, while facilitating processing which followed.

Punctuation and increased spacing between words had similar effects on saccade length into a region, increasing these by more than the added character space, but while spacing manipulations did impinge on reading time, they did not have an equivalent disambiguating effect. Punctuation therefore appears to convey information related to structure that is more potent than the simple ‘chunking’ of text, but this effect is limited to particular structural conditions. Interestingly, there was little effect of punctuation when it was consistent with a ‘preferred’ parse. Its role appears to be more closely related to the avoidance (in some circumstances) of incorrect decisions.

1. Punctuation

Punctuation is in the strange position of being simultaneously both necessary and mysterious in its usage. The unique role of this type of graphical feature in language, along with its level of importance, is extremely unclear and undeveloped. It appears very much an intuitive art rather than an explicit science. Of particular interest is the use of punctuation as a potential disambiguating mechanism in language processing. The study reported here used eye-tracking to further explore the effects of punctuation found in self-paced reading by Hill and Murray (1997).

1.1. Why Commas?

Commas are the most frequently occurring punctuation mark (Francis & Kucera, 1982; Johanson & Hofland, 1989) as well as the most controversial in function and variable in use (Quirk et al., 1985). Their flexible qualities could either imply strong structural significance or render them effectively arbitrary and uninformative.

1.2. Why Spaces?

Spacing was examined in order to ensure that any punctuation effects aren’t simply attributable to stretching the gap between words. Spatial features, such as line breaks, can cue clausal segmentation (Kennedy et al., 1989). A possible interpretation of the insertion of a comma is that it is only the equivalent of inserting an additional space. Therefore, a set of experimental items with double spacing in place of commas was also included.

2. The Experiment

Aims:
• To determine whether commas can act as a disambiguating mechanism in otherwise temporarily ambiguous sentences.
• To determine whether any effects of commas are simply attributable to increasing the space between words.

Three classes of sentences were examined: prepositional phrase ambiguities, early / late closure ambiguities and reduced relatives. Essentially, all models of sentence processing predict ‘garden-path’ effects with these items, although their explanations of the underlying mechanisms may differ. There were 6 versions of each sentence type: one predicted to induce a garden-path effect, along with a non-garden-path counterpart; and copies of these with either commas or an extra space. E.g.,

The vet injected the cat with the needle before leaving for a rather late lunch.
The vet injected the cat with the collar before leaving for a rather late lunch.
The vet injected the cat, with the needle, before leaving for a rather late lunch.
The vet injected the cat, with the collar, before leaving for a rather late lunch.
The vet injected the cat  with the needle  before leaving for a rather late lunch.
The vet injected the cat  with the collar  before leaving for a rather late lunch.

The eye movements of 36 subjects were recorded while they read 4 examples of each version for each of the 3 structures (72 experimental items) amongst 24 filler items. One third of all fillers contained commas. Another third contained randomly placed double spaces. All sentences were displayed on a single line.
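As a rough illustration of the design, the six versions of an item can be generated from a single template. This is only a sketch: the template string, the `{sep}` and `{noun}` placeholders, the function name, and the VP/NP labels applied to the vet example are assumptions introduced here for illustration, not the authors' procedure for preparing materials.

```python
from itertools import product

# Hypothetical template: "{noun}" is the attachment-biasing noun and
# "{sep}" marks the two positions where a comma or an extra space may go.
TEMPLATE = ("The vet injected the cat{sep} with the {noun}{sep} "
            "before leaving for a rather late lunch.")

# Assumed labels: "needle" favours attachment to the verb (VP),
# "collar" favours attachment to the noun phrase (NP).
NOUNS = {"VP": "needle", "NP": "collar"}

# The three marking conditions; an extra space yields a double space.
SEPARATORS = {"no commas": "", "commas": ",", "spaces": " "}


def build_versions(template=TEMPLATE):
    """Return the six versions of one item, keyed by (attachment, marking)."""
    versions = {}
    for (attachment, noun), (marking, sep) in product(NOUNS.items(),
                                                      SEPARATORS.items()):
        versions[(attachment, marking)] = template.format(noun=noun, sep=sep)
    return versions


# Item count as stated in the text: 3 structures x 6 versions x 4 examples.
N_STRUCTURES, N_VERSIONS, N_EXAMPLES = 3, 6, 4
N_EXPERIMENTAL_ITEMS = N_STRUCTURES * N_VERSIONS * N_EXAMPLES

for key, sentence in build_versions().items():
    print(key, "->", sentence)
```

Crossing two attachment types with three marking conditions gives the six versions, and the product of structures, versions and examples reproduces the 72 experimental items reported.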

3. Results

For analysis purposes all sentences were segmented into 5 zones (4 in the case of reduced relatives). The zone boundaries are indicated by lines in the exemplars listed for each structure. The location of the punctuation is indicated by a parenthesised comma.
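The exemplar notation can be read mechanically, which may help when checking zone numbers against the figures. The following is a minimal sketch, assuming the `|` and `(,)` conventions just described; the function names are hypothetical, and dividing a zone time by its word count is only one plausible reading of the "Msec per word" axis, not a documented analysis step.

```python
def parse_exemplar(marked, with_commas):
    """Split a marked exemplar into zones.

    "|" separates zones and "(,)" marks an optional comma position.
    An empty string is kept for an empty zone (as between "||"), so that
    zone numbers stay aligned across reduced and unreduced relatives.
    """
    zones = []
    for raw in marked.split("|"):
        text = raw.strip().replace("(,)", "," if with_commas else "")
        zones.append(text)
    return zones


def msec_per_word(zone_time_ms, zone_text):
    """Assumed normalisation: zone reading time divided by word count."""
    n_words = len(zone_text.split())
    if n_words == 0:
        return None  # empty zone, nothing to normalise
    return zone_time_ms / n_words


NP_EXEMPLAR = ("The natives | stalked the soldiers(,) | from the fort(,) | "
               "before launching a | fast and furious attack.")
RR_EXEMPLAR = ("The critic(,) || played the music(,) | "
               "listened very attentively | before saying no.")

for number, zone in enumerate(parse_exemplar(NP_EXEMPLAR, True), start=1):
    print(f"Zone {number}: {zone!r}")
```

Keeping the empty zone in the reduced relative means that "Zone 3" and "Zone 4" refer to the same material in both the reduced and unreduced versions.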


3.1. Prepositional Phrases

VP) The natives | stalked the soldiers(,) | from the rear(,) | before launching a | fast and furious attack.
NP) The natives | stalked the soldiers(,) | from the fort(,) | before launching a | fast and furious attack.

First Pass reading (Figure 1) of the prepositional phrase (Zone 3) indicates that the only reliable difference is that of punctuation on NP attachment. Commas therefore appear to trigger the garden-path effect at the point of attachment. A strong structural difference is found in Zone 4 for VP attachment (mean 169 msec) compared with NP attachment (mean 199 msec). Although there is a trend for shorter times with both commas and spaces (Zone 4 NP), this is not significant. Commas do not prevent processing difficulties here; they just provoke them sooner.

The total time spent in the critical zones (Figure 2) shows that there are clear garden-path effects but no differences due to punctuation or spacing.

Total Second Pass reading (Figure 3) shows a different picture. The garden-path effect only shows up in the unpunctuated NP case. Commas do prevent the need for major global regressions, whereas they don’t stop smaller, localised ones. There is also an effect of spacing in the NP case, significant by subjects but not by items.

Total Reading Times for the sentences (Figure 4) show strong structural effects, but these are not modulated by punctuation or spacing.
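The distinction between First Pass, Second Pass and total reading time can be made concrete with a small sketch. The text does not spell out how each measure was computed, so the definitions below are common eye-tracking conventions adopted here as assumptions: first pass is the summed fixation time from first entering a zone until the eyes leave it, and second pass is whatever remains of the total. The fixation record is invented purely for illustration and is not data from the study.

```python
def reading_times(fixations, zone):
    """Compute assumed reading-time measures for one zone.

    fixations: list of (zone_number, duration_ms) tuples in temporal order.
    Simplification: a zone first fixated only after later zones have been
    read is still counted as first pass here.
    """
    total = sum(duration for z, duration in fixations if z == zone)

    first_pass = 0
    entered = False
    for z, duration in fixations:
        if z == zone:
            entered = True
            first_pass += duration
        elif entered:
            break  # the eyes have left the zone, so first pass is over

    return {
        "first_pass": first_pass,
        "second_pass": total - first_pass,
        "total": total,
    }


# Hypothetical fixation record: the reader regresses from Zone 4 to Zone 3.
EXAMPLE_FIXATIONS = [(1, 200), (2, 250), (3, 220), (4, 300),
                     (3, 180), (4, 210), (5, 260)]

print(reading_times(EXAMPLE_FIXATIONS, zone=3))
print(reading_times(EXAMPLE_FIXATIONS, zone=4))
```

Under these definitions a regression back into a zone adds to its second pass and total time but leaves its first pass time unchanged, which is why the three measures can show different patterns for the same sentences.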

3.1.1. Prepositional Phrase Summary

• Commas ‘bring on’ the garden-path phenomenon at the point of attachment, but they don’t eliminate it.
• However, they do prevent the need to go back and re-read the sentence.
• Problems seem to be spotted quicker and dealt with immediately in punctuated PPs.
• Effects of spacing are unclear. It doesn’t appear to do much other than help reduce the need for second pass reading.

3.2. Early / Late Closure

EC) After the woman | had visited(,) | the tall doctor | spoke very gravely | to the surgeon.
LC) After the woman | had visited | the tall doctor(,) | he spoke very gravely | to the surgeon.

First Pass reading (Figure 5) shows that the presence of punctuation (and to a lesser degree, spacing) produces a slight delay in Zone 2 in Early Closure sentences and a similar tendency in Zone 3 for Late Closures. A large garden-path effect is evident in Zone 4. There is also a suggestion of help from additional spacing, but this was not significantly different from the "no comma" condition and did not differ from the comma condition. Commas also facilitated reading in Late Closure sentences; something not found in self-paced reading. Spacing again exerted an intermediate effect and was not reliably different from the other conditions. Total Second Pass (Figure 6) shows only a weak structural effect. Total Reading Times (Figure 7) highlight the overall benefit of commas with Early Closure. Spacing did not show a clear effect.

3.2.1. Early / Late Closure Summary

• Marginal delays on encountering commas.
• This pays off in the long run with quicker subsequent processing.
• Garden-path effects greatly reduced, if not eliminated, in Early Closure sentences by the presence of a comma.
• Spacing appears to have some modest processing effects, but is not significantly powerful.


3.3. Un/Reduced Relatives

RR) The critic(,) || played the music(,) | listened very attentively | before saying no.
UR) The critic(,) | who was | played the music(,) | listened very attentively | before saying no.

The First Pass reading results (Figure 8) show a clear structural difference in Zone 3 with longer times in the unreduced items. However, these did contain an extra prior zone and this may just be indicative of integration into a more complex structure at that point. There is also a non-significant tendency for longer times with commas. A strong garden-path effect can be seen in Zone 4. Commas drastically reduce this effect. An extra space also squashes this, but to a lesser degree. A strong structural difference remains in Total Second Pass reading (Figure 9) but again punctuation minimises it. Spaces do not influence second pass reading at all. The usual overall pattern is seen in Total Reading Time (Figure 10) but the data were too variable to show reliable effects of punctuation or spacing.

3.3.1. Un/Reduced Relative Summary

• Commas facilitate both the First Pass and Second Pass reading of Reduced Relatives.
• Increased spacing also has a beneficial effect (though smaller) on First Pass reading but no effect on Second Pass.
• No obvious effects of punctuation or spacing on Unreduced Relatives.

4. Conclusions



• Commas accelerate the onset of processing difficulties in NP-attaching Prepositional Phrases rather than preventing them, although they do prevent the need to re-read sentences.
• Commas do remove/reduce difficulties associated with Early Closure or Reduced Relative ambiguities. Both Early Closures and Reduced Relatives are regarded as ‘difficult’ ambiguities (whereas PPs are not) and this emphasises the importance and power of punctuation.
• An initial encounter with commas results in a slight delay, but this is followed by faster subsequent processing.
• Strategic double-spacing may slightly aid First Pass reading in some structures but does not influence Second Pass reading. It typically occupies a middle ground between the results for punctuated and unpunctuated ambiguous sentences, but there is no clear ‘facilitatory’ effect.

Commas therefore have a strong, structurally dependent influence on sentence processing. They can successfully prevent the need for major reanalysis, either eliminating garden-path effects or enabling rapid repair. However, commas are doing more than simply physically segmenting text. Increased spacing may have a minor effect on processing under certain circumstances, but this may perhaps be due to ’segmentation’ rather than guiding or aiding parsing. The general pattern of results with punctuation in these locally ambiguous sentences mirrors that found with self-paced reading. What is clearer from this study is the tendency for commas to induce a slight delay on the punctuated word followed by faster processing on the subsequent word. This might be evidence of some sort of ’clausal wrap-up effect’ or just a change in the raw pattern of eye-movements accompanied by parafoveal processing, in advance of fixation, of the word(s) immediately following the comma. As the self-paced procedure denied the reader any possibility of parafoveal information, this may explain the slight discrepancy. In the case of unambiguous structures, these losses and gains effectively balance each other out and produce little overall benefit.


Prepositional Phrases (conditions plotted: No Commas, Commas, Spaces)

Fig. 1. First Pass Reading (msec per word; Zone 3 VP, Zone 3 NP, Zone 4 VP, Zone 4 NP)

Fig. 2. Total Time Spent Reading in Zone (msec; Zone 3 VP, Zone 3 NP, Zone 4 VP, Zone 4 NP)

Fig. 3. Total Second Pass (msec; VP, NP)

Fig. 4. Total Reading Times (msec; VP, NP)

Early / Late Closure (conditions plotted: No Commas, Commas, Spaces)

Fig. 5. First Pass Reading (msec per word; Zones 2 to 4 for EC and LC)

Fig. 6. Total Second Pass (msec; EC, LC)

Fig. 7. Total Reading Times (msec; EC, LC)

Un/Reduced Relatives (conditions plotted: No Commas, Commas, Spaces)

Fig. 8. First Pass Reading (msec per word; Zone 3 RR, Zone 3 UR, Zone 4 RR, Zone 4 UR)

Fig. 9. Total Second Pass (msec; RR, UR)

Fig. 10. Total Reading Times (msec; RR, UR)

References

Francis, W.N. & Kucera, H. (1982) Frequency Analysis of English Usage: Lexicon and Grammar. Houghton Mifflin: Boston.

Hill, R.L. & Murray, W.S. (1997) Punctuated Parsing: Signposts along the Garden Path. Poster presented at the Tenth Annual CUNY Conference on Human Sentence Processing, Santa Monica, California, March 20-22.

Johanson, S. & Hofland, K. (1989) Frequency Analysis of English Vocabulary and Grammar. Based on the LOB Corpus. Vol. 1: Tag Frequencies and Word Frequencies. Clarendon Press: Oxford.

Kennedy, A., Murray, W.S., Jennings, F. & Reid, C. (1989) Parsing Complements: Comments on the Generality of the Principle of Minimal Attachment. Language and Cognitive Processes, 4, 51-76.

Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. (1985) A Comprehensive Grammar of the English Language. Longman: London.
