Annotating metadiscourse markers in the English-Spanish MULTINOT corpus: preliminary steps

Julia Lavid and Lara Moratón
Universidad Complutense de Madrid

The work reported in this paper is part of a larger research effort within the MULTINOT project (Lavid et al. 2015), which focuses on the multidimensional annotation of a register-diversified bilingual corpus of comparable and parallel English and Spanish texts with lexicogrammatical, semantic and discourse features. The aim is to develop a multifunctional resource that can be used by a variety of potential users in a number of theoretical and applied contexts. While previous work by members of the research team has focused on the annotation of features such as modality (Zamorano et al. 2014), global discourse structures, rhetorical relations and thematic patterns (Arús, Moratón and Lavid 2013, Moratón and Lavid 2013), in this paper we report on the recent extension of our annotation tasks to metadiscourse markers (Hyland 2004), as potential realisational devices and markers of some of the previously annotated features. For this task we use a subpart of the MULTINOT corpus, namely, sixty-two newspaper texts, consisting of sixteen news reports, sixteen editorials and twenty letters to the editor, evenly divided into English and Spanish, all of them collected from British and Spanish high-circulation newspapers between 2009 and 2013 and preprocessed with the GATE platform (Cunningham et al. 2002). We found Hyland’s distinction between interactive and interactional markers particularly useful as the basis for the design of the annotation scheme, although we decided to use Halliday’s terminology and distinguish between ‘textual’ (interactive) and ‘interpersonal’ (interactional) markers, given its wider acceptance. The former are concerned with ways of organising discourse to anticipate readers’ knowledge and include transitions, frame markers, endophoric markers, evidentials and code glosses. The latter focus on the participants in the interaction and “seek to display the writer’s persona and a tenor consistent with the norms of the disciplinary community” (Hyland 2004, 139). These include hedges, boosters, attitude markers, engagement markers and self-mention markers. In the paper we present an annotation scheme for these metadiscourse markers in English and Spanish, report on experiments to validate it, and discuss the problems encountered during the annotation phase. We also report on the genre-specific and language-specific variation found in the distribution of these metadiscourse markers in the annotated corpus.
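
The two-level taxonomy described above can be sketched as a small data structure. This is only an illustrative sketch and not the MULTINOT annotation scheme itself: the names `Dimension`, `SCHEME`, `Annotation` and `dimension_of`, the character-offset fields, and the language codes are all assumptions introduced here; only the two dimensions and their ten marker categories come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    """Top-level split, using Halliday's terms (Hyland's terms in comments)."""
    TEXTUAL = "textual"              # Hyland: interactive
    INTERPERSONAL = "interpersonal"  # Hyland: interactional


# Marker categories per dimension, as listed in the abstract.
SCHEME = {
    Dimension.TEXTUAL: (
        "transition", "frame marker", "endophoric marker",
        "evidential", "code gloss",
    ),
    Dimension.INTERPERSONAL: (
        "hedge", "booster", "attitude marker",
        "engagement marker", "self-mention",
    ),
}


def dimension_of(category: str) -> Dimension:
    """Return the dimension a marker category belongs to."""
    for dimension, categories in SCHEME.items():
        if category in categories:
            return dimension
    raise ValueError(f"unknown marker category: {category!r}")


@dataclass(frozen=True)
class Annotation:
    """One annotated marker (hypothetical record layout, not the project's)."""
    start: int      # character offset where the marker begins (assumed)
    end: int        # character offset where the marker ends (assumed)
    category: str   # one of the categories in SCHEME
    language: str   # "en" or "es" (assumed codes)

    def __post_init__(self) -> None:
        dimension_of(self.category)  # reject categories outside the scheme
        if self.language not in ("en", "es"):
            raise ValueError(f"unsupported language: {self.language!r}")
        if not 0 <= self.start < self.end:
            raise ValueError("offsets must satisfy 0 <= start < end")

    @property
    def dimension(self) -> Dimension:
        return dimension_of(self.category)
```

For instance, `Annotation(start=10, end=17, category="hedge", language="en").dimension` evaluates to `Dimension.INTERPERSONAL`, while an unknown category raises `ValueError`. Keeping the category list in one mapping means the validity check and the dimension lookup cannot drift apart.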

REFERENCES

Cunningham, H., Maynard, D. and Bontcheva, K. (2002). GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL'02). Philadelphia.

Hyland, K. (2004). Disciplinary interactions: Metadiscourse in L2 postgraduate writing. Journal of Second Language Writing, 13, 133-151. http://dx.doi.org/10.1016/j.jslw.2004.02.001

Lavid, J., Arús, J., DeClerck, B. and Hoste, V. (2015). Creation of a high quality, register-diversified parallel corpus for linguistic and computational investigations. In Current Work in Corpus Linguistics: Working with Traditionally-conceived Corpora and Beyond. Selected Papers from the 7th International Conference on Corpus Linguistics (CILC2015). Procedia - Social and Behavioral Sciences, 198, 249-256.

Moratón, L. and Lavid, J. (2013). Thematic Progression Patterns in English and Spanish Newspaper Genres. Paper presented at COLDOC Conference, 13-14 November 2013. Published online in Actes du IX Colloque de Linguistique de Doctorants et Jeunes Chercheurs, pp. 109-118. https://coldoc2013fr.files.wordpress.com/2014/11/livret_actescoldoc_version2.pdf

Zamorano, J.R., Carretero, M. and Lavid, J. (2014). The annotation of modality and evidentiality in English-Spanish comparable and parallel texts. Paper presented at EMEL’14 Congress, Evidentiality and Modality in English (6-8 October 2014), Universidad Complutense de Madrid.


