Natural Language Processing: many questions, no answers

Floriana Grasso

May 30, 2000

Introduction

Computational Linguistics, as a subfield of Linguistics, or Natural Language Processing (NLP), as a subfield of Artificial Intelligence (two research areas that nowadays can safely be considered merged), concentrates on the “study of computer systems for understanding and generating natural language” [10], in order to develop “a computational theory of language, using the notions of algorithms and data structures from Computer Science” [2]. Typical problems for these research fields are [2]: How is the structure of sentences identified? How can knowledge and reasoning be modelled? How can language be used to accomplish specific tasks? In more recent times, these problems have been studied not so much from the traditional linguistic viewpoint (syntax and semantics) as with a focus on problems of belief models, planning processes, and the functional properties of both discourse and text. I will focus my discussion on three problems in NLP. The choice of problems is admittedly biased by my personal research interests, but I do believe that these problems in particular can (and should) benefit most from considerations coming from the field of Argumentation theory. This paper is merely an attempt to build up a canvas of issues on which further discussion is needed.

Problem 1: Natural Language Generation

Natural Language Generation (NLG) is concerned with the “construction of computer systems that can produce understandable texts in English or other human language from some underlying non-linguistic representation of information” [30]. Typically, an NLG tool would start with, say, a database of facts on a given subject, for instance illnesses^1:

    isa(diabetes, disease)
    symptom(diabetes, tiredness)
    symptom(diabetes, nausea)
    treatment(diabetes, insulin-injection)

and build up a piece of text, possibly personalized to the addressee, which in some way “translates” these facts into natural language, for instance:

    Diabetes is a disease. Its symptoms include tiredness and nausea.
    It can be treated with insulin injections.

In order to achieve this in a more “intelligent” way than just gluing words together, an NLG system needs knowledge of, at least:

- the domain (diabetes, and illnesses in general);
- grammar and lexicon;
- how to reach the intended purpose (what does “explaining” mean?);
- discourse organization (why is a sentence “good”? Why is it “understandable”?);
- the addressee (would the text be different if addressed to an audience of physicians? How?).

Let us impudently ignore here the first two elements in the list: we may lightly assume that a computer system can collect and record data in some structured form (a database, a knowledge base, etc.) and can query such an information structure. We may also assume that a computer system can process a grammar and keep a dictionary of terms.

^1 For some reason, research in NLG has seen a proliferation of systems on medical or epidemiological domains [4, 5, 6, 12, 25, 31, 32].
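Before moving on, a minimal sketch of the “gluing words together” baseline may make the starting point concrete. This is my own illustration, not a system from the literature: the facts are hypothetical Python triples, each predicate is mapped to a sentence template, and the clumsy repetition in the output is precisely what the kinds of knowledge listed above are meant to remove.

    # A toy baseline: one template per predicate, one sentence per fact.
    # All names (FACTS, TEMPLATES, glue) are hypothetical.
    FACTS = [
        ("isa", "diabetes", "disease"),
        ("symptom", "diabetes", "tiredness"),
        ("symptom", "diabetes", "nausea"),
        ("treatment", "diabetes", "insulin injections"),
    ]

    TEMPLATES = {
        "isa": "{subj} is a {obj}.",
        "symptom": "a symptom of {subj} is {obj}.",
        "treatment": "{subj} can be treated with {obj}.",
    }

    def glue(facts):
        """Translate each fact independently: no aggregation, no pronouns."""
        sentences = []
        for pred, subj, obj in facts:
            s = TEMPLATES[pred].format(subj=subj, obj=obj)
            sentences.append(s[0].upper() + s[1:])
        return " ".join(sentences)

    print(glue(FACTS))
    # Diabetes is a disease. A symptom of diabetes is tiredness.
    # A symptom of diabetes is nausea. Diabetes can be treated with
    # insulin injections.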

We may further leave aside problems related to the choice of words (should I call it “disease”, “illness”, “pathology”, or something else?) and/or of grammatical constructs (should I use the passive or the active form?), although they undoubtedly give rise to an important series of pragmatic problems.

The first problem to consider on our way to producing our piece of text on diabetes therefore becomes: how can we structure the text? What pieces of data should be included, and why? And how should they be put together? All computer systems which, more or less automatically, have produced natural language text have approached this problem on the basis of a common assumption: text has a structure, which can be determined, at least partially, by the speaker’s goal. The main difference between these systems lies in deciding what this structure is, and how it can be modelled.

Early systems are based on the concept of a “recipe”, or schema: if most explanations (e.g. in an encyclopedia) have the same structure (start with the identification, “X is a Y”, then pass to listing peculiar attributes of X, etc.), then we can apply the same structure to all texts having the same purpose (to explain). We can go a step further, think of other, different goals the speaker might have beyond mere explanation (instruct, provide evidence, justify, compare, etc.), and create a schema for each of them.

More recent systems tend to be based on a different, less rigid approach. Following Austin’s intuition that utterances are performatives just as physical actions are [3], many researchers in NLP have acknowledged that the most flexible way to produce natural language text is to use a planning approach. Just as a robot may have the goal of building a brick tower, and can decompose this problem into smaller and smaller tasks until easily executable basic actions can be performed (lift arm, pick up block, etc.), so a natural language tool may decompose a communicative goal into its steps, and use a plan-based mechanism to achieve them. We can then benefit from the huge achievements of the planning research community, and implement a system capable of processing “communicative operators”, such as (adapted from [27]):

    NAME:          persuade-by-motivation
    EFFECT:        (PERSUADED ?hearer (DO ?hearer ?act))
    CONSTRAINTS:   (STEP ?act ?goal) AND (GOAL ?hearer ?goal) AND (MOST-SPECIFIC ?goal)
    DECOMPOSITION: (FORALL ?goal (MOTIVATION ?act ?goal))
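To see how such an operator library might be used, here is a toy sketch of top-down goal decomposition, of the kind plan-based generators perform. It is my own illustration, not Moore’s implementation [27]: the operators are heavily simplified (no unification, no constraint checking against a knowledge base), and all goal and operator names are hypothetical.

    # Each operator mirrors the NAME/EFFECT/CONSTRAINTS/DECOMPOSITION slots above.
    OPERATORS = [
        {
            "name": "persuade-by-motivation",
            "effect": ("PERSUADED", "hearer", "do-exercise"),
            "constraints": [],  # a real system checks these against a KB
            "decomposition": [("MOTIVATION", "do-exercise", "stay-healthy")],
        },
        {
            "name": "motivate-by-assertion",
            "effect": ("MOTIVATION", "do-exercise", "stay-healthy"),
            "constraints": [],
            "decomposition": [("ASSERT", "exercise-promotes-health")],
        },
    ]

    def plan(goal):
        """Recursively decompose a communicative goal into primitive speech acts."""
        if goal[0] == "ASSERT":          # primitive act: realize as a sentence
            return [goal]
        for op in OPERATORS:
            if op["effect"] == goal:     # plain equality, no unification here
                steps = []
                for sub in op["decomposition"]:
                    steps.extend(plan(sub))
                return steps
        raise ValueError(f"no operator achieves {goal}")

    print(plan(("PERSUADED", "hearer", "do-exercise")))
    # [('ASSERT', 'exercise-promotes-health')]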

The “only” problem we are left with, therefore, is to decide how such communicative goals can be defined in the first place, and how they can be achieved, that is, decomposed into smaller, more manageable problems, so that we can feed our planner with a suitable library of operators similar to the one above. Guidance on these issues typically comes from discourse theories, the most widely used of which is perhaps Rhetorical Structure Theory (RST) [21]. RST has been used in many applications (see for example [15]) and, despite some criticisms [28], is generally seen as “the” theory for generating text. This is most surprising since, while extremely useful for generating descriptive texts, RST deals very poorly with other genres of text: it says very little, for instance, about how persuasive text can be generated. Moreover, it assumes that texts have a hierarchical structure, which is in many circumstances too strong a constraint, if not an artificial imposition.
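For concreteness, a sketch of the hierarchical structure that RST assumes: a text is a tree in which every span is either a leaf clause or a nucleus-satellite pair linked by a rhetorical relation. The relation names follow Mann and Thompson [21], but the tree over the diabetes example, and the simple nucleus-first linearization, are my own hypothetical illustration.

    from dataclasses import dataclass
    from typing import Union

    @dataclass
    class Leaf:
        text: str

    @dataclass
    class Span:
        relation: str                    # e.g. "ELABORATION", "EVIDENCE"
        nucleus: Union["Span", Leaf]     # the central claim
        satellite: Union["Span", Leaf]   # the supporting material

    tree = Span(
        relation="ELABORATION",
        nucleus=Leaf("Diabetes is a disease."),
        satellite=Span(
            relation="ELABORATION",
            nucleus=Leaf("Its symptoms include tiredness and nausea."),
            satellite=Leaf("It can be treated with insulin injections."),
        ),
    )

    def linearize(node):
        """Read the text off the tree, nucleus first (one of many possible orders)."""
        if isinstance(node, Leaf):
            return node.text
        return linearize(node.nucleus) + " " + linearize(node.satellite)

    print(linearize(tree))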

Generation from Predefined Text: Summarization

An important subfield of NLG, which is rapidly becoming a field in its own right, summarization can be seen, at a shallower level, as the extraction of excerpts from a text that can convey the main message(s) of the text without too many details. At a deeper level, it can be seen as the extraction of the meaning of a text, and the reproduction of a shorter version of the text itself. In both cases, a discourse structure is needed to understand what to keep and what to throw away, a decision which, again, should be based on the satisfaction of a given communicative goal.
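To fix ideas, a sketch of the shallower view: sentences are scored by how many of the document’s frequent content words they contain, and the top-scoring ones are kept in their original order. This is a generic word-frequency heuristic of my own, not a system from the literature; the stopword list and the cut-off are hypothetical.

    from collections import Counter

    STOPWORDS = {"a", "an", "the", "is", "it", "its", "of", "with", "and", "can", "be"}

    def content_words(sentence):
        words = (w.lower().strip(",") for w in sentence.split())
        return [w for w in words if w not in STOPWORDS]

    def summarize(text, k=1):
        """Keep the k sentences richest in frequent content words."""
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        freq = Counter(w for s in sentences for w in content_words(s))
        ranked = sorted(sentences, key=lambda s: sum(freq[w] for w in content_words(s)),
                        reverse=True)
        kept = set(ranked[:k])
        # Preserve the original order of the selected excerpts.
        return ". ".join(s for s in sentences if s in kept) + "."

    doc = ("Diabetes is a disease. Its symptoms include tiredness and nausea. "
           "It can be treated with insulin injections.")
    print(summarize(doc))   # Its symptoms include tiredness and nausea.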

Problem 2: Intelligent Dialogue Agents

A natural extension of the ability to create text for a purpose would be to use this text in a conversation with another partner, whether human or not. Most of the problems described above for the generation of a piece of text appear here again, in the production of the single sentence needed for each dialogue move. They are accompanied by many others, though, as a consequence of having the audience directly intervening in the generation. Such problems may involve shallower aspects of the dialogic activity, such as repairing conversation failures and keeping the conversation “on focus”. Others are architectural: how should the generation of messages be interleaved with the management of the dialogue, and how should turn taking be organized (e.g. with dialogue games)? But perhaps most importantly, an “intelligent” dialogic agent needs to be able to understand the implications of each single sentence the partner communicates.
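The architectural point can be made concrete with a toy dialogue manager that tracks the conversation history and enforces a small “dialogue game”, i.e. a table of which moves may legally follow which. This is my own hypothetical sketch, not a reference architecture; the move names and the legality table are invented.

    LEGAL_REPLIES = {   # a toy dialogue game: which moves may follow which
        "question": ["answer", "clarification-request"],
        "answer": ["question", "acknowledge"],
        "clarification-request": ["answer"],
        "acknowledge": ["question"],
    }

    class DialogueManager:
        def __init__(self):
            self.history = []            # (speaker, move, content) triples

        def legal_moves(self):
            if not self.history:
                return ["question"]
            _, last_move, _ = self.history[-1]
            return LEGAL_REPLIES[last_move]

        def take_turn(self, speaker, move, content):
            """Record a turn, rejecting moves the game does not allow here."""
            if move not in self.legal_moves():
                raise ValueError(f"{move!r} is not a legal reply here")
            self.history.append((speaker, move, content))

    dm = DialogueManager()
    dm.take_turn("user", "question", "What are the symptoms of diabetes?")
    dm.take_turn("system", "answer", "They include tiredness and nausea.")
    print(dm.legal_moves())   # ['question', 'acknowledge']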

Natural Language Understanding has been a much studied problem in NLP [2]. What I am referring to here, however, is not, or not only, the classical problem of parsing a sentence in order to understand its meaning, with its plethora of issues in grammar/syntax, lexicon, inherent language ambiguities, anaphora resolution, referring expressions, etc. An intelligent dialogue agent rather needs to be able to perform a “rhetorical parsing” of the text, as defined by Marcu in his PhD thesis [23]. Again we need to refer to a discourse structure, and to the ways in which a communicative goal can generate such a structure (and the ways in which it can be recognized by the other party). Moreover, the agent should perform a “plan recognition” activity as the conversation proceeds, understanding the final aim of a piece of text in terms of what the speaker was trying to achieve, and what his/her expectations are on how the hearer will react. The agent’s immediate reaction will then, hopefully, be more appropriate, and the planning of future dialogue moves can be amended accordingly. This is important not only for debates, or eristic situations in general, but for any type of dialogue: explanatory, tutoring, etc.
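Plan recognition can be pictured as planning run backwards: given an observed primitive act, chain up through the operator library to hypothesize the goal that could explain it. The sketch below uses the same hypothetical goals as the planning sketch earlier and, like it, is my own illustration; real plan recognition must also cope with ambiguity (several goals may explain the same act), which this deliberately ignores.

    # Compact (name, effect, decomposition) triples for the same toy operators.
    OPERATORS = [
        ("persuade-by-motivation",
         ("PERSUADED", "hearer", "do-exercise"),
         [("MOTIVATION", "do-exercise", "stay-healthy")]),
        ("motivate-by-assertion",
         ("MOTIVATION", "do-exercise", "stay-healthy"),
         [("ASSERT", "exercise-promotes-health")]),
    ]

    def recognize(observed):
        """Return the chain of goals that could explain the observed act."""
        chain = [observed]
        changed = True
        while changed:
            changed = False
            for _, effect, decomposition in OPERATORS:
                if chain[-1] in decomposition:   # this act serves that effect
                    chain.append(effect)
                    changed = True
        return chain

    print(recognize(("ASSERT", "exercise-promotes-health")))
    # [('ASSERT', ...), ('MOTIVATION', ...), ('PERSUADED', 'hearer', 'do-exercise')]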

Problem 3: Representing the Hearer (and representing the Speaker too!)

Deciding what a natural language tool should know of the audience the message is addressed to (whether as one-shot text or as a dialogue turn) is a problem which encompasses all the issues above. In the majority of systems designed in the last decade, messages are adapted, in their content and presentation style, to the hearer’s characteristics (represented in a “mental model” of the hearer) and, generally in a more implicit way, to the speaker’s characteristics. Adaptation is based on assumptions about the hearer’s mental state and the way this state is influenced by the communication of each individual component of the message, and by the understanding of the relationships among these components. The aspects of the mental state that are represented are, in the large majority of cases, the hearer’s beliefs and knowledge of domain topics and his/her goals about the domain state; in some cases, this may extend to other aspects, such as the ability to perform domain actions, the interest in topics, or the preference for domain states. When a speaker’s model is represented as well, it includes second-order beliefs and goals of the same type.

Less studied is the problem of investigating whether, when and how extra-rational factors, such as emotions, personalities, etc., should be taken into account when designing an NLG or dialogue tool. How can we, for instance, produce “utterances empathetic to both the speaker and the hearer” [11], deciding whether we should offset a piece of information, or stress another one? How can we enrich the hearer’s mental model with domain-related personal preferences, concerns, worries, and related features? How can we represent variables like the “social distance” between the speaker and the hearer, the “power” that the hearer has over the speaker, and the two agents’ “basic desires” [33]? How might these variables affect strategies for planning text? How might they affect the definition of single sentences, in terms of choice of lexicon, referring expressions, elimination of redundant data, etc.^2? A completely different issue, but still worth considering, involves the generation of “lifelike” systems that can actually show their own “emotions” in generating natural language [20], as studies have shown that, although conscious of interacting with a machine, humans have a clear perception of the computer system’s personality [26].

^2 And, on the subject of treating redundancy and sentence planning in general: can redundancy be used as a rhetorical device? Would the impact of this section be different if its title were changed to “Representing the Hearer and the Speaker”, as current sentence planning algorithms would suggest?
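As a concrete illustration, a sketch of a hearer model carrying both the classical slots (beliefs, goals) and some of the extra-rational variables above, together with a toy content-selection rule that consults it. The social distance and power variables echo Walker et al. [33], but the numeric scales, the field names, and the selection rule itself are hypothetical.

    from dataclasses import dataclass, field

    @dataclass
    class HearerModel:
        beliefs: set = field(default_factory=set)   # domain facts assumed known
        goals: set = field(default_factory=set)     # desired domain states
        worries: set = field(default_factory=set)   # topics to soften or offset
        social_distance: float = 0.5                # 0 = intimate .. 1 = distant
        hearer_power: float = 0.5                   # power of hearer over speaker

    def select_content(facts, hearer):
        """Keep facts the hearer does not know; flag worrying ones for hedging."""
        plan = []
        for fact, topic in facts:
            if fact in hearer.beliefs:
                continue                 # redundant, so omit (but see footnote 2)
            hedge = topic in hearer.worries or hearer.hearer_power > 0.7
            plan.append((fact, "hedged" if hedge else "plain"))
        return plan

    patient = HearerModel(beliefs={"diabetes-is-disease"}, worries={"treatment"})
    facts = [("diabetes-is-disease", "definition"),
             ("symptoms-tiredness-nausea", "symptoms"),
             ("treated-with-insulin", "treatment")]
    print(select_content(facts, patient))
    # [('symptoms-tiredness-nausea', 'plain'), ('treated-with-insulin', 'hedged')]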

A Summary of Open Problems

Discourse processing: what is the structure of particular genres of discourse (instructions, narratives, explanations, arguments)? Can they be represented by one model (for example, an extension of RST, or an alternative), or should it be acknowledged that different tools have to use different approaches? How can rhetorical relations be used to glue together pieces of information? Can the same theory be used for understanding?

Text Planning: how can we define high-level communicative goals in an operative, “implementable”, and computationally tractable way? How are these goals achieved, i.e. planned for? How can a listener recognize these goals, and use them to infer the meaning of a sentence?

Affect and Pragmatic Features: should extra-rational factors, such as emotions, be taken into account? How can affective goals be defined? How can they be achieved? How does the text change when formality, politeness, or empathy are to be accounted for? What about things like irony, deception, metaphor, or humour?


Evaluation: the mother of all problems, and a final issue in both generation and dialogue: how do we evaluate the text produced? In particular, how can we say that a text is coherent? Some researchers have given hints on what coherence is not^3, and evaluation heuristics have been identified for some categories of systems [9]. However, researchers in NLP have still to agree on a standard approach, and this is probably not surprising (how can we say that a text is “good”?).

^3 For instance, [13] claims it is neither cohesion (“I once went to Naples. I will start working tomorrow” is cohesive, as both sentences refer to the same individual, but is not coherent), nor the use of explicit connectives, nor even mere comprehensibility.
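The footnote’s point can be demonstrated with a toy cohesion metric of my own devising: it happily gives Hobbs’s Naples pair a non-zero score, although the pair is not coherent, which is exactly why such surface measures cannot serve as coherence tests.

    def cohesion(s1, s2):
        """Fraction of words in s2 that also appear in s1: a toy surface metric."""
        w1 = {w.lower().strip(".,") for w in s1.split()}
        w2 = {w.lower().strip(".,") for w in s2.split()}
        return len(w1 & w2) / max(len(w2), 1)

    print(cohesion("I once went to Naples.", "I will start working tomorrow."))
    # 0.2 -- the shared referent "I" makes the pair cohesive, yet it is
    # incoherent, and the metric cannot tell the difference.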

Can Argumentation Help?

As most, if not all, of the Argumentation process is based on the use of language, it is surprising that Argumentation theory and NLP meet only on relatively rare occasions (see [7, 8, 16, 18, 22, 24, 29] inter alia), and then for addressing specific problems, which almost never go beyond the generation or understanding of a single sentence or paragraph, typically supporting or attacking a claim. An important, common misconception which pervades all these experiences, in my opinion, is that Argumentation theory can only help in dealing with specific genres of text (persuasive texts, or debates). If we start from the point of view that any text, even a mere explanation, can be seen as discourse meant to change the addressee’s mind (an instructional text has to be, for instance, “believable” in order to achieve its purpose, even assuming the student is cooperative and not particularly sceptical), then NLP and Argumentation theory researchers can efficaciously collaborate to try and answer many unresolved, or partially resolved, questions on how systems can be built for “linguistically mediated problem solving” [14, Introduction].

References

[1] Proceedings of the 5th European Workshop on Natural Language Generation, May 1995.

[2] J.F. Allen. Natural Language Understanding. The Benjamin/Cummings Publishing Company, Inc., 2nd edition, 1995.

[3] J. Austin. How To Do Things With Words. Oxford University Press, 1975.

[4] B.G. Buchanan, J. Moore, D.E. Forsythe, G. Carenini, S. Ohlsson, and G. Banks. An Intelligent Interactive System for Delivering Individualized Information to Patients. Artificial Intelligence in Medicine, 7(2):117–154, 1995.

[5] G. Carenini, V. Mittal, and J. Moore. Generating Patient Specific Interactive Explanations. In Proceedings of the 18th Symposium on Computer Applications in Medical Care (SCAMC94). McGraw-Hill Inc., 1994.

[6] A. Cawsey, K. Binsted, and R. Jones. Personalised Explanations for Patient Education. In Proceedings of the 5th European Workshop on Natural Language Generation [1], pages 59–74.

[7] R. Cohen. Analyzing the Structure of Argumentative Discourse. Computational Linguistics, 13(1-2):11–24, 1987.

[8] M. Elhadad. Using Argumentation in Text Generation. Journal of Pragmatics, 24:189–220, 1995.

[9] J.R. Galliers and K. Sparck Jones. Evaluating Natural Language Processing Systems. Technical Report 291, University of Cambridge, Computer Laboratory, March 1993.

[10] R. Grishman. Computational Linguistics: An Introduction. Studies in Natural Language Processing. Cambridge University Press, 1986.

[11] I. Haimowitz. Modeling All Dialogue System Participants to Generate Empathetic Responses. Computer Methods and Programs in Biomedicine, 35:321–330, 1991.

[12] G. Hirst, C. DiMarco, E. Hovy, and K. Parsons. Authoring and Generating Health-Education Documents that are Tailored to the Needs of the Individual Patient. In A. Jameson, C. Paris, and C. Tasso, editors, User Modeling: Proceedings of the 6th International Conference, pages 107–118. Springer Wien, 1997.


[13] J. Hobbs. Towards an Understanding of Coherence in Discourse. In W. Lehnert and M. Ringle, editors, Strategies for Natural Language Processing, chapter 8, pages 223–243. Lawrence Erlbaum Associates, Hillsdale, NJ, 1982.

[14] H. Horacek and M. Zock, editors. New Concepts in Natural Language Generation. Communication in Artificial Intelligence Series. Pinter Publishers, 1993.

[15] E. Hovy. Automated Discourse Generation using Discourse Structure Relations. Artificial Intelligence, 63(1-2):341–385, 1993.

[16] X. Huang. Planning Argumentative Texts. In Proceedings of the 15th International Conference on Computational Linguistics (COLING94), 1994.

[17] K. Jokinen, M. Maybury, M. Zock, and I. Zukerman, editors. Proceedings of the ECAI-96 Workshop “Gaps and Bridges: New Directions in Planning and NLG”, 1996.

[18] N. Karacapilidis and D. Papadias. A Computational Approach for Argumentative Discourse in Multi-Agent Decision Making Environments. AI Communications, 11(1):21–33, 1998.

[19] J. Lewi and B. Hayes-Roth, editors. Proceedings of the 1st International Conference on Autonomous Agents (Agents97). ACM Press, February 1997.

[20] A.B. Loyall and J. Bates. Personality-Rich Believable Agents that Use Language. In Lewi and Hayes-Roth [19], pages 106–113.

[21] W. Mann and S. Thompson. Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Text, 8(3):243–281, 1988.

[22] D. Marcu. The Conceptual and Linguistic Facets of Persuasive Arguments. In Jokinen et al. [17], pages 43–46.

[23] D. Marcu. The Rhetorical Parsing, Summarization, and Generation of Natural Language Texts. PhD thesis, University of Toronto, Department of Computer Science, 1997.

[24] M. Maybury. Communicative Acts for Generating Natural Language Arguments. In Proceedings of the 11th National Conference on Artificial Intelligence (AAAI93), pages 357–364. AAAI Press / The MIT Press, 1993.

[25] S. Miksch, K. Cheng, and B. Hayes-Roth. The Patient Advocate: A Cooperative Agent to Support Patient-Centered Needs and Demands. In AMIA Annual Fall Symposium, October 1996.

[26] Y. Moon and C. Nass. How “Real” Are Computer Personalities? Psychological Responses to Personality Types in Human-Computer Interaction. Communication Research, 23(6):651–674, 1996.

[27] J. Moore. Participating in Explanatory Dialogues. MIT Press, Cambridge (Mass.), 1995.

[28] J. Moore and C. Paris. Planning Text for Advisory Dialogues: Capturing Intentional and Rhetorical Information. Computational Linguistics, 19(4):651–695, 1993.

[29] C. Reed. Generating Arguments in Natural Language. PhD thesis, University College London, 1998.

[30] E. Reiter and R. Dale. Building Applied Natural Language Generation Systems. Natural Language Engineering, 3(1):57–87, 1997.

[31] E. Reiter and L. Osman. Tailored Patient Information: Some Issues and Questions. In Proceedings of the ACL-1997 Workshop “From Research to Commercial Applications: Making NLP Technology Work in Practice”, pages 29–34, 1997.

[32] P. Szolovits, J. Doyle, W.J. Long, I. Kohane, and S.G. Pauker. Guardian Angel: Patient-Centered Health Information Systems. Technical report, Massachusetts Institute of Technology, Laboratory for Computer Science, 1994.

[33] M.A. Walker, J. Cahn, and S. Whittaker. Improvising Linguistic Style: Social and Affective Bases for Agent Personality. In Lewi and Hayes-Roth [19], pages 96–105.
