Controlling thematic choices in discourse: towards a specification of contextual constraints*

Julia Lavid
Universidad Complutense de Madrid
Abrego 19, 1D
28224 Pozuelo, Madrid, Spain
Phone/Fax: +34-1-518-5799
e-mail: [email protected]

ABSTRACT

This paper investigates the relationship between the thematic structure of texts and two contextual factors which contribute to the characterization of text types: the purpose and the subject matter of discourse. This is done through the empirical analysis of a corpus of sixty discourses varying in both factors. The analysis yielded statistically significant correlations between the two factors and specific thematic choices, thus validating the hypothesis that thematic selection in discourse is not random but contextually controlled. The paper also addresses issues of computational representation by outlining a semantic interface for Theme which mediates between high-level contextual sources of control and lexicogrammatical choices, and which might be useful in the application context of text generation.

1 INTRODUCTION

The purpose of this paper is to validate the claim that the thematic structure of discourse is not random but is controlled by factors from the communicative context. To do so, the paper presents the methodology and results of an empirical analysis of a corpus of sixty discourses varying in both discourse purpose and subject matter, in which statistically significant correlations were found between these two contextual factors and specific thematic selections. The results of this analysis are then used in the application context of text generation to specify high-level sources of control of grammatical theme as specified in the computational grammar Nigel. This is done by outlining a semantic system for Theme in English which abstracts from lexicogrammatical categories and realizes text-building categories, and by representing the contextual motivations for thematic choices as inquiries.

The paper is organized as follows. Section 2 illustrates the discourse function of theme as a signpost for the reader of the organizational strategy selected to structure texts. Section 3 presents hypothetical correlations between contextual factors and thematic choices, which are validated in Section 4 through text analysis and statistical analysis of a corpus of sixty discourses varying in the two identified contextual factors. Finally, Section 5 addresses issues of computational representation by outlining a semantic interface for Theme which mediates between high-level contextual sources of control and grammatical selection in the context of a multistratal text generation architecture.

2 THEME AND CHAINING STRATEGIES

The grammatical account of theme assumed here is described in (Matthiessen 1995: 531) as 'the resource for setting up the local context for each clause in a text'.
Its theoretical basis is Halliday's definition of theme as 'the point of departure of the message', which is realized through initial position in the clause. However, as proposed in (Lavid 1994), theme also fulfills a guiding role in discourse, acting as 'a signpost for the reader of a specific chaining strategy'. Chaining strategies are 'devices used by the writer to steer his/her text along a specific line of development or frame with the purpose of achieving a maximally profitable text organization, in view of the discourse purpose and the subject matter' (Lavid 1994). Example (1) below illustrates the use of a temporal chaining strategy to steer the text:

EXAMPLE (1)
(01) Berne was founded in 1191 by Berchtold V, Duke of Zähringen, for strategic reasons
(02) and enlarged by stages (...).
(03) In 1353 Berne joined the Swiss Confederation.
(04) After the fire of 1405, which almost completely destroyed the wooden-built town, the houses were rebuilt of sandstone.
(05) The medieval structure of the city originating from that time has remained unchanged up to the present day.
(06) From the 14th to the 16th century, Berne reached the zenith of its power by enlarging its territory and gaining great political influence.
(07) In 1798 Napoleon's troops invaded Berne and thus the collapse of the Ancien Régime was witnessed.
(08) In 1834 Berne became a university town
(09) and in 1848 the Federal Capital of Switzerland as well as the capital of the Canton of Berne (second largest Swiss canton).
(10) Berne was declared a World Landmark by the United Nations and was given the title of "Europe's Most Beautiful Floral City".

* Part of the work reported in this paper was carried out by the author in the context of the Esprit BR Project 6665 DANDELION: 'Discourse Functions and Discourse Representation - An Empirically and Linguistically Motivated, Interdisciplinary Approach to Natural Language Texts'.

The global temporal chain consists of eight temporal scopes which refer to different points along a temporal line of development reflecting a chronological order of events. Some of these temporal scopes are explicitly signalled by linguistic markers (bold type in the text), among which temporal themes predominate (bold and underlined in the text); other temporal scopes, however, are implicit and have to be inferred from other linguistic signals in the text. The temporal chaining strategy is therefore independent of its linguistic realization by temporal themes: sometimes it is implicit, and sometimes it is realized by a temporal expression in rhematic position. The former happens in clause 10 of the passage, where a new temporal scope is opened: the events depicted (Berne's declaration as a World Landmark and the award of a title) could not have happened in 1848 but must have occurred at some unspecified time in the 20th century, which the reader is left to infer from the reference to the United Nations. The latter happens in the first clause of the passage, where the temporal reference which signals the first temporal scope (in 1191) appears in rhematic position.
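
The clause-by-clause analysis just described can be sketched in code. This is only an illustration under stated assumptions, not the procedure used in the study: the `Clause` record, its field names and the `correlation` helper are hypothetical, and the annotations are entered by hand from Example (1). A clause that opens a temporal scope is labelled `gtc-t` when its theme is temporal and `non` when it is not; clauses that open no scope receive `-`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    """Hand-made annotation of one clause of Example (1) (hypothetical schema)."""
    number: int
    marking: Optional[str]     # signal of a newly opened temporal scope; None if no scope opens
    theme_type: Optional[str]  # semantic type of the theme, recorded only where a scope opens


def correlation(clause: Clause) -> str:
    """Label the clause: temporal scope signalled by a temporal theme or not."""
    if clause.marking is None:
        return "-"  # no new temporal scope in this clause
    return "gtc-t" if clause.theme_type == "temporal" else "non"


# Annotations for the ten clauses of the Berne passage.
CLAUSES = [
    Clause(1, "in 1191", "topical"),          # temporal reference sits in the rheme
    Clause(2, None, None),
    Clause(3, "In 1353", "temporal"),
    Clause(4, "After the fire", "temporal"),
    Clause(5, None, None),
    Clause(6, "From the 14th", "temporal"),
    Clause(7, "In 1798", "temporal"),
    Clause(8, "In 1834", "temporal"),
    Clause(9, "in 1848", "temporal"),
    Clause(10, "United Nations", "topical"),  # implicit scope, inferred by the reader
]

labels = [correlation(c) for c in CLAUSES]
scopes = sum(1 for c in CLAUSES if c.marking is not None)
temporal_themes = labels.count("gtc-t")
print(scopes, temporal_themes)  # prints: 8 6
```

Keeping the scope marking separate from the theme type mirrors the point made above: a temporal scope can exist without being realized by a temporal theme.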

Table 1 below illustrates the correlations between the global temporal chaining strategy, its linguistic encoding and the thematic selection options used by the writer to signal that strategy.

Table 1: Chaining strategies and theme selection

| Sentence # | Global temporal chain: temporal scopes | Global temporal chain: linguistic marking | Thematic selection: semantic types | Thematic selection: realization | Correlation |
|---|---|---|---|---|---|
| 1 | 1 | in 1191 | topical | Berne | non |
| 2 | - | - | - | - | - |
| 3 | 2 | In 1353 | temporal | In 1353 | gtc-t |
| 4 | 3 | after the fire | temporal | after the fire | gtc-t |
| 5 | - | - | - | - | - |
| 6 | 4 | from the 14th | temporal | from the 14th | gtc-t |
| 7 | 5 | In 1798 | temporal | In 1798 | gtc-t |
| 8 | 6 | In 1834 | temporal | In 1834 | gtc-t |
| 9 | 7 | In 1848 | temporal | In 1848 | gtc-t |
| 10 | 8 | United Nations | topical | Berne | non |

gtc-t = correlation between global temporal chaining strategy and temporal thematic selection; non = no correlation between temporal chaining strategy and temporal thematic selection.

Other chaining strategies found in the corpus (organizing texts at both a global and a local level of analysis) include the following:

1. Characterization strategy: line of development consisting of references to a concept or to characteristics of the generic class to which the concept belongs (including attributes and parts).
2. Participant strategy: line of development consisting of references to different participants in the text.
3. Spatial strategy: spatial line of development consisting of different spatial scopes along an imaginary tour through which the reader is guided to see a number of sights, or of different locations with respect to a central place of observation.
4. Sequential strategy: presentation of a sequence of steps in a procedure.
5. Through- or counter-argument strategy: series of arguments in favour of or against the thesis defended by the writer.

3 CONTEXTUALIZING THEME: TOWARDS A SPECIFICATION OF CONSTRAINTS

Having outlined the guiding function of theme in discourse, this section concentrates on its behaviour in specific contexts of use. The question is: how does the context constrain the deployment of this resource? In other words, which contextual factors determine thematic choice in discourse? In this paper we investigate the influence of two factors, namely the discourse purpose and the subject matter, on the thematic structure of discourse. The discourse purpose refers to the overall communicative goal that the writer has in mind, while the subject matter is the abstract propositional context of the text.¹ For illustration purposes, we present in tabular form the hypothetical correlations between contextual factors and the selection of specific chaining strategies and thematic selection options.

Table 2: Correlations between contextual factors and thematic choices

| Contextual factors: discourse purpose / text type | Contextual factors: subject matter | Phenomena studied: chaining strategy | Phenomena studied: theme selection | Texts: samples |
|---|---|---|---|---|
| Expository | Generic concepts | Characterization | Topical | Encyclopedia entries |
| Descriptive | Place relations | Spatial | Location | Travel guides |
| Narrative | Events and participants | Temporal; Participant | Temporal; Topical | Histories of cities, biographies |
| Instructive | Steps in a procedure | Sequential | Temporal; Process | Recipes, instruction manuals |
| Argumentative | Facts and ideas | Through- and counter-argument patterns | Conjunctives | Editorials, essays, etc. |

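
The hypothesised correlations in Table 2 can also be read as a lookup from discourse purpose to the chaining strategies and theme types expected to signal them. The sketch below is purely illustrative: the `HYPOTHESES` dictionary, its keys and the `expected_themes` function are hypothetical names introduced here, and nothing in it is meant to reflect how the Nigel grammar or any generation system represents these constraints.

```python
# Hypothetical encoding of Table 2: for each discourse purpose, the subject
# matter it is paired with and the theme types expected per chaining strategy.
HYPOTHESES = {
    "expository": {
        "subject_matter": "generic concepts",
        "strategies": {"characterization": ["topical"]},
    },
    "descriptive": {
        "subject_matter": "place relations",
        "strategies": {"spatial": ["locative"]},
    },
    "narrative": {
        "subject_matter": "events and participants",
        # temporal is the global strategy, participant a minor local one
        "strategies": {"temporal": ["temporal"], "participant": ["topical"]},
    },
    "instructive": {
        "subject_matter": "steps in a procedure",
        # temporal and/or process themes, depending on the genre
        "strategies": {"sequential": ["temporal", "process"]},
    },
    "argumentative": {
        "subject_matter": "facts and ideas",
        "strategies": {
            "through-argument": ["conjunctive"],
            "counter-argument": ["conjunctive"],
        },
    },
}


def expected_themes(purpose: str) -> dict:
    """Return the strategy -> theme-type mapping hypothesised for a purpose."""
    return HYPOTHESES[purpose.lower()]["strategies"]


print(expected_themes("Narrative"))
# prints: {'temporal': ['temporal'], 'participant': ['topical']}
```

A nested mapping is used because a single discourse purpose may be associated with more than one strategy, as in the narrative and argumentative rows.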
These correlations can be explained as follows:

If the discourse purpose is expository and the subject matter deals with whole classes of objects or generic concepts, the hypothesis is that these two factors together control the selection of a global characterization chaining strategy, signalled by the selection of topical themes.

If the discourse purpose is descriptive and the subject matter deals with place relations, the hypothesis is that these two factors will constrain the selection of a global spatial chaining strategy, signalled by the selection of locative themes.

If the discourse purpose is narrative and the subject matter deals with events and participants, the hypothesis is that the most typical global strategy selected by the writer will be a temporal one. At a more local level of analysis, a participant strategy is frequently found as a minor chaining strategy. The temporal strategy is typically signalled by the selection of temporal themes, and the participant strategy by topical ones.

If the discourse purpose is instructive and the subject matter deals with steps in a procedure, the hypothesis is that these two factors will determine the selection of a sequential chaining strategy, signalled by temporal and/or process themes, depending on the genre.

If the discourse purpose is argumentative and the subject matter deals with complex facts and ideas, the hypothesis is that these will determine the selection of two typical chaining strategies, through-argument and counter-argument, both signalled by the selection of conjunctive themes.

In order to test these hypotheses, empirical text analysis was carried out on a corpus of sixty texts varying in both of these factors. The analysis methodology and the results obtained are explained in the next section.

4 ANALYSIS METHODOLOGY

Empirical text analysis was carried out on a corpus of sixty texts.
The texts were distributed in sets of ten varying in both discourse purpose and subject matter, except for those with an instructive discourse purpose, where we selected twenty texts in order to compare the different thematic selections depending on the genre. To test the influence of these two contextual factors on the type of theme selected to signal a specific chaining strategy, we searched for the characteristic scopes which define the line of development of a given chaining strategy, and checked what type of theme was selected to signal each scope. Due to space limitations, we present the individual analysis in tabular form only for those texts which reflected a narrative discourse purpose and whose subject matter concentrated on events and participants. For the rest of the texts, we explain the analysis methodology and the results obtained.

¹ The discourse purposes studied in this paper have been circumscribed to those corresponding to the well-known discourse types found in several existing text typologies, e.g. narrative, expository, descriptive, instructive and persuasive (or argumentative).

4.1.1 NARRATIVE TEXTS

We selected ten historical passages, including biographies and the history of two cities. All of them reflected a narrative discourse purpose and concentrated on events and the participants involved in those events. The hypothesis was that these two contextual factors together would control the selection of temporal themes as markers of a global temporal chaining strategy. In order to uncover this relationship, we searched for temporal scopes along a temporal line of development. Each time a new temporal scope was introduced in the text, we checked what type of theme was selected to signal that scope. The results of the analysis are shown in Table 3 below.

Table 3: Correlations between global temporal strategy and thematic selection

| Sample: text Nº | Global chain: Nº temporal scopes | Thematic selection: LOCAT. | TOP. | TIME | PRO. | CON. | OTHER |
|---|---|---|---|---|---|---|---|
| N_1 | 8 | 0 | 2 | 6 | 0 | 0 | 0 |
| N_2 | 6 | 0 | 0 | 6 | 0 | 0 | 0 |
| N_3 | 10 | 0 | 1 | 9 | 0 | 0 | 0 |
| N_4 | 5 | 0 | 0 | 5 | 0 | 0 | 0 |
| N_5 | 8 | 0 | 2 | 6 | 0 | 0 | 0 |
| N_6 | 5 | 0 | 1 | 4 | 0 | 0 | 0 |
| N_7 | 8 | 0 | 2 | 6 | 0 | 0 | 0 |
| N_8 | 10 | 0 | 2 | 7 | 0 | 0 | 1 |
| N_9 | 4 | 0 | 0 | 4 | 0 | 0 | 0 |
| N_10 | 16 | 0 | 4 | 12 | 0 | 0 | 0 |

top. = topical, locat. = locative, con. = conjunctive, pro. = process.

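
The strength of the relationship between the two columns of interest in Table 3 can be checked with a product-moment (Pearson) coefficient. The following is a minimal sketch, not the author's own computation: the `TABLE_3` dictionary and the `pearson` function are names introduced here for illustration, the values are copied from the table, and no significance test is attempted.

```python
from math import sqrt

# (number of temporal scopes, number of temporal themes) per narrative text,
# copied from the "Nº temporal scopes" and "TIME" columns of Table 3.
TABLE_3 = {
    "N_1": (8, 6), "N_2": (6, 6), "N_3": (10, 9), "N_4": (5, 5), "N_5": (8, 6),
    "N_6": (5, 4), "N_7": (8, 6), "N_8": (10, 7), "N_9": (4, 4), "N_10": (16, 12),
}


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


scopes = [pair[0] for pair in TABLE_3.values()]
temporal_themes = [pair[1] for pair in TABLE_3.values()]
r = pearson(scopes, temporal_themes)
print(f"r = {r:.4f}")  # prints: r = 0.9606
```

With the ten rows of the table, the sum of cross-products is 73 and the sums of squares are 110 and 52.5, which gives a coefficient of roughly 0.96.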
The Pearson correlation coefficient (Butler 1985:143) showed a statistically significant correlation between the number of temporal scopes (which define the global temporal strategy and characterize narrative texts) and the selection of temporal themes (r=0.960; p