actions in practice and ii) novel forms of authoring should be developed for. Interactive ... Interactive Storytelling: From AI Experiment To New Media. 3.
INTERACTIVE STORYTELLING: FROM AI EXPERIMENT TO NEW MEDIA Marc Cavazza, Fred Charles and Steven J. Mead School of Computing and Mathematics University of Teesside, TS1 3BA Middlesbrough {m.o.cavazza, f.charles, steven.j.mead}@tees.ac.uk
Abstract The recent development of Interactive Storytelling techniques has not yet generated widespread interest in the traditional storytelling community. Part of the explanation lies in the lack of maturity of authoring formalisms and techniques. In this paper, we review some underlying problems in the light of our own experience in developing Interactive Storytelling technologies. We introduce the various elements for story generation and presentation in an Interactive Storytelling system. We discuss the connection between AI formalisms used in Interactive Storytelling systems and the formalism used in narrative analysis and the implication of these formalisms for authoring interactive narratives.
1. INTRODUCTION Research in Interactive Storytelling has developed in a spectacular fashion as more research groups in computer graphics and Artificial Intelligence joined this area of investigation [3,4] [7] [9] [10] [11] [14] [16] [17] [18] [20,21]. Most Interactive Storytelling systems described so far are based on AI techniques that control either plot generation or real-time character behaviour. They essentially appear as real-time 3D animations or virtual environments populated by autonomous synthetic actors performing according to a baseline plot, with which the user can interact. The initial diversity of approaches has led place to a certain form of convergence in the techniques supporting interactive storytelling, for instance AI planning systems [20]. These techniques are essential to the generation of a sequence of actions forming the skeleton of a plot, from a structural perspective. However, while most of the development has been dedicated to solving fundamental problems (such as narrative control, the preservation of story coherence despite user intervention), the representational aspects of the AI formalisms used have not been dedicated sufficient attention. As a result, there is now a major discrepancy between the many references to narratology [19], film theory and acting theory that appear as part of many discussions on interactive storytelling and the actual AI formalisms that support the behaviours of virtual actors. Throughout this paper, we will illustrate the discussion with our virtual sitcom system, which is inspired from the set of characters and situations in the popular Friends™ series. The system has been fully implemented using the 3D game engine Unreal Tournament™ (see Figure 1) as a graphic environment, into which a specific AI-based module for Interactive Storytelling has been integrated [3].
Copyright held by the author
1
Marc Cavazza, Fred Charles and Steven J. Mead
Fig 1. System Architecture
2. THE ORIGINS: NARRATIVE CONTROL AND INTERACTION At the heart of any research in interactive storytelling is the contradiction between storytelling and interactivity [11]. Apart from the advocates of “emergent storytelling”, who claim that people can make sense of any events through a process of “storification”, most researchers have considered technical solutions to this problem to be the groundwork for developing interactive storytelling technologies. As a result, most interactive storytelling systems tend to rely on a mechanism for the generation of narrative actions, which can accept user interaction, i.e. real-time modification of the story environment, and still incorporate the consequences of such intervention within the main plot. It can be said that the tension between storytelling and interactivity can be solved, at least in part, through the real-time generation of narrative actions whose conditions are constantly updated. This real-time generation should however preserve story consistency, in particular the forms of transitions between narrative events relevant to the story genre. All this has led to rely on formalisms for action selection that could accommodate narrative knowledge as well, so as to guarantee this form of consistency. Most work in Interactive Storytelling has made explicit reference to narrative formalisms, from Aristotle’s theories [12] to Propp’s [15] or Barthes’ [1]. The rationale being that these descriptive formalisms could be used for generative purposes as well, following a theoretical model that has been that of Computational Linguistics. This approach is however faced with two orders of difficulty: i) narrative formalisms have to be properly related to the AI mechanisms that control narrative actions in practice and ii) novel forms of authoring should be developed for Interactive Storytelling systems. The latter is certainly a major challenge, as current scriptwriters follow a wide range of practices: from working from a description of character profiles to using quasi-formal languages to generate soap opera episodes.
2
Interactive Storytelling: From AI Experiment To New Media 3. PRACTICE VS. THEORY: PLANS AS ROLES Because, at a structural level, a story presents itself as a sequence of actions, it comes as no surprise that AI planning, which precisely deals with action selection, should be the most frequently used AI technique to implement storytelling systems. Planning techniques can formalise the plot itself, in which case the plot-plan controls the unfolding of the story as a whole, each planning action being a narrative action dispatched onto the various characters taking place in that action. Another approach is to formalise each characters’ role as a plan and have the story driven by the dynamic interaction between characters’ roles. Using planning techniques within a characterbased approach [4], we have been essentially concerned with the following aspects of planning: i) the relation between planning and the authoring process for interactive narratives ii) how planning would model relevant narrative phenomena, such as beats, suspense or even the comic nature of situations and iii) how planning supports story variation, i.e. the change in the course of action. When using plans to represent characters’ roles, we are working under the following assumption: the actor’s behaviour in the story will correspond to an actual plan, whose various stages correspond to narrative actions. This means that, from a diegetic perspective, the actions witnessed by the spectator actually take the shape of a plan (e.g. to rob a bank, seduce a lover, etc.). From a representational perspective, this consists in: i) distributing the various narrative functions which represent the storyline into each actor taking a role in these functions and ii) instantiating the plan of each actor in terms of actual narrative events. The latter point ensures that the behaviour of the actor is actually perceived as a plan as well. If the plan is to be perceived as such by the spectator, this means that elements of the planning process itself can be given an interpretation in terms of dramatic value. Traditional narrative elements such as antagonism, which are usually represented through narrative functions can be shown to have a translation in planning terms as well: for instance, two rival characters will have competing plans for the same object, or one character will have as part of its plan to contrast a key action from another character’s role. To represent the roles of each character in our Interactive Storytelling system, we have made use of Hierarchical Task Networks (HTN) [13]. HTN planning is recognised as being appropriate to knowledge-intensive domains such as Interactive Storytelling, and this was the initial rationale for selecting it. Overall, HTN planning consists in using a hierarchical task model (represented as an AND/OR graph), in which each task is decomposed into sub-tasks until they can be described in terms of primitive actions (in Interactive Storytelling, these actions are those actually played in the story). The HTN can then be searched to extract a task decomposition corresponding to a solution plan. In our case, we adapted the search mechanism so as to obtain a real-time version interleaving planning and execution. HTN implements forward search from the goal state which brings the unique property that during planning itself the state of the world is known at all times. We have given several descriptions of our HTN planner [3,4] and will rather concentrate in the next sections on the representational aspects.
3
Marc Cavazza, Fred Charles and Steven J. Mead
Fig. 2. HTN representation of lead character behaviour. Authoring an HTN can be seen as the description of a character’s role in a traditional sense. Its actions have to be decomposed into lower-level tasks, until they can be described as elementary physical interventions on stage. From a narrative perspective, the natural hierarchy of an HTN can thus be used to describe several levels of representations, helping to attribute a narrative interpretation to the HTN nodes. The top-level node will correspond to the story goal, the first layer nodes the main temporal decomposition (scenes) and lower-level nodes to various tactics used in pursuing the scene goals. This is how in our Friends™ scenario the top-level goal (to invite Rachel out (see Figure 2)) corresponds to the baseline story, while the first level of decomposition describes the various steps required to achieve that objective (such as learning about her taste, finding a way to talk to her in private, etc.). Further decompositions correspond to the various courses of action to achieve a given subgoal: for instance, learning about Rachel’s preferences can be achieved by observing her, asking her friends, etc. (see Figure 3). The higher-level nodes of the HTN tend to represent narrative actions while the lower-level ones describe episodic actions that can contribute to their resolution, but for which much variability is allowed.
Fig. 3. Illustration of story instantiation. The HTN representation thus offers a good compromise between authoring and the provision of story variability within a baseline story. The reason is that total ordering supports the decomposition into scenes, preserving a basic temporal structure for the high-level actions, while offering variability in terms of action selection at lower-levels. This formalism shows dual properties from plot-based formalisms: narrative functions can be said to be “distributed” across the roles of different actors. However, in doing so they cease to be explicitly represented or readily identifiable.
4
Interactive Storytelling: From AI Experiment To New Media One major limitation of HTNs is that the plans they support can only fail due to “external” causes. That is to say, the plan allocated to each character is a rational plan corresponding to narrative objectives. Though this plan accounts for possible alternative choices that may lead to overall success or failure, the structure of the plan itself is not an element of the story. The notion of “internal failure” corresponds to a plan, whose structure would be generated on the fly by the character, in a story genre where this is customary (there are many such genre from crime stories to psychological dramas or cartoons). This plan would fail due to its flawed nature, but this failure would generate dramatic or comic situation. This kind of plan can be generated with STRIPS-based representations [6] using a search-based planning technique such as HSP [2]. This form of plan-based representations have a potential to advance our formal understanding of basic narrative mechanisms, but this comes at the expense of narrative control, as these plans would be assembled from a repertoire of possible actions [8]. Authoring of these elementary actions, together with the conditions for invoking them is also bound to be more difficult than when relying on a baseline plot architecture like with HTN structures.
4. STAGING: TURNING ACTIONS INTO A STORY A mere sequence of actions does not constitute a story. Each individual action has to be properly staged, including the transition between various actions, successfully completed or not. For instance, the process of re-planning, due to a previous action failure, represents the deliberation of a character to find a new action in order to fulfil his original objective. This planning mechanism is invisible, although the character’s dramatisation must provide the spectator with some form of visual representation of such a process, for example using the animation of a character waiting for a new idea to pop into his head. The spectator comprehends the dramatisation of actions for individual characters, and relates them together as part of the overall on-going situation. This describes a story event happening within the same scene. However, the most important problem for staging a consistent story arises from having many situations actually taking place simultaneously at different locations on the virtual stage. To attempt to solve this problem, we developed an automatic camera control system, based around two essential components. The first component of the camera system is a “virtual director”, which classifies the characters actions being currently executed according to their narrative relevance [5]. The virtual director then makes an informed choice, from this heuristic classification, on the most appropriate scene to be viewed by the spectator. The scenes presented to the spectator refer to narrative situations led by the execution of the protagonist actions. We can then associate the narrative impact of the actual scene to the narrative relevance of the protagonist actions. The character actions are classified according to heuristic values, which we call story weights. These values are attributed at the time of story authoring to the essential elements of an action: characters (action protagonist or not), activities (actions performed by the protagonist) and objects (which actions are performed onto). Each of the heuristic value associated to these components are then combined according to a formalised template representing actions, i.e. (subject, verb, object). The director selects scenes based on the subject nature of the shot, e.g. a character going to a café or two characters having a conversation. For example, we know where a character is going, though the producer cannot influence or modify its course. The modification of such behaviour would come from the interactive nature of the system. Another character passing by could stop momentarily the others to have a mundane
5
Marc Cavazza, Fred Charles and Steven J. Mead chat. For example, when an object (a diary, a bouquet of flowers) is judged strategic, it will be gratified with a higher story weight than a banal object (a glass, a bottle). The same principle applies for the allocation of weights for actions, as “talking” is more meaningful to the narrative than “walking”. This judgment seems rather artificial and subjective, though in accordance to the subjective decisions of the director. Optimally, the system should gather enough information to reason at a higher narrative level. Once the selection of the scene to be presented is made, we need to decide on the style of the preview to use. This task is passed onto the second component, which is the “virtual cinematographer”, a lower-level module that describes how the scene should be viewed according to well-defined film idioms (Figure 4). This component selects the idiom associated to it. This method works upon the assumption that to any typical type of event, there is an appropriate attached idiom. It also selects an adapted pre-set of visual preferences to apply. Once the scene is selected, the system binds any unbound variables in the idiom specification and passes the information to the “virtual cinematographer”. During this process, the system may query the application for additional information, such as the specific location/orientation and dimensions of the various characters. Moreover, the system will constantly analyse the screen contents to immediately correct the camera settings if occlusions are detected. The scene is then rendered using the animation parameters and descriptions of the current environment sent by the application, and camera specifications (position/orientation) sent by the camera control system.
Figure 4: The presentation of a conversation between two characters: Establishing shot (a), Over-the-shoulder shots from either character’s point of view (b, c).
5. CONCLUSIONS Interactive Storytelling is still in its infancy, with many research issues remaining to be addressed. Yet, it is worth analysing the early feedback expressed by many potential users, scriptwriters and authors. Once the very possibility of interfering with a storyline has gained acceptance, a significant trend is to express concern about the ability of the system to generate interesting stories. The “strong” version of this argument, which is that system will never be able to generate interesting stories, is a classical objection to any AI approaches. Like other traditional objections, it probably misses the point that AI systems in their knowledge-intensive version are still using human (meta)-knowledge and that the enterprise of story generation should turn to become an endeavour in obtaining better knowledge of what an interesting story is. This should certainly be a long-term research programme, yet much is to be gained in pursuing it, probably even by the sceptics. The authoring problem has not been dedicated much attention, probably because most Interactive Storytelling prototypes
6
Interactive Storytelling: From AI Experiment To New Media have not yet reached a critical scale that will open them to experimentation outside the research environments in which they have been developed.
REFERENCES [1] [2] [3]
[4]
[5] [6] [7]
[8]
[9]
[10]
[11]
[12]
[13]
[14] [15] [16]
Barthes, R. Introduction a l’Analyse Structurale des Recits (in French). Communications, 8, pp.1-27, 1966. Bonet, B. and Geffner. H. Planning as Heuristic Search. Artificial Intelligence Special Issue on Heuristic Search, 129(1), pp. 5-33, 2001. Cavazza, M., Charles, F. and Mead, S.J. Character-based Interactive Storytelling, IEEE Intelligent Systems, special issue on AI in Interactive Entertainment, pp. 17-24, 2002. Cavazza, M., Charles, F. and Mead, S.J. Interacting with Virtual Characters in Interactive Storytelling. ACM Joint Conference on Autonomous Agents and Multi-Agent Systems, Bologna, Italy, pp. 318-325, 2002. Charles, F., Lugrin, J.L., Cavazza, M., and Mead, S.J., 2002. Real-Time Camera Control for Interactive Storytelling. GameOn 2002, London, UK. Fykes R. and Nilsson N., STRIPS: A new approach to the application of theorem proving to problem solving Artificial Intelligence, 2:189-208, 1971. Grabson and Braun, 2001 A Morphological Approach to Interactive Storytelling, Proceedings of Cast01, Living in Mixed Realities, Sankt Augustin, Germany, pp. 337-340. Lozano, M., Mead, S.J., Cavazza, M. and Charles, F. Search Based Planning: A Model for Character Behaviour, in Proceedings of the 3rd on Intelligent Games and Simulation, GameOn-2002, London, UK, November 2002. Magerko, B. A Proposal for an Interactive Drama Architecture, AAAI 2002 Spring Sympo-sium Series: Artificial Intelligence and Interactive Entertainment, March 2002. Mateas, M. and Stern A. 2002. A Behavior Language for Story-Based Believable Agents. IEEE Intelligent Systems, special issue on AI in Interactive Entertainment, pp. 39-47. Mateas, M., 1997. An Oz-Centric Review of Interactive Drama and Believable Agents. Technical Report CMU-CS-97-156, Department of Computer Science, Carnegie Mellon University, Pittsburgh, USA. Mateas, M.: A Neo-Aristotelian Theory of Interactive Drama. Working Notes of the AAAI Spring Symposium on Artificial Intelligence and Interactive Entertainment, AAAI Press 2000. Nau, D., Cao, Y., Lotem, A., & Muñoz-Avila, H. 1999. SHOP: Simple hierarchical ordered planner. Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (pp. 968-973). Stockholm: AAAI Press. Prada, R., Machado, I., and Paiva, A.: Teatrix: Virtual Environments for Story Creation. Proceedings of Intelligent Tutoring Systems 2000, Canada, 2000. Propp, V.,. Morphology of the Folktale. University of Texas Press:Austin and London, 1968. Riedl, M., Saretto, C.J. and Young, R.M. 2003. Managing interaction between users and agents in a multiagent storytelling environment, in the Proceedings of the Second International Conference on Autonomous Agents and Multi-Agent Systems. (to appear)
7
Marc Cavazza, Fred Charles and Steven J. Mead [17] Sgouros, N.M., Papakonstantinou, G. and Tsanakas, P., 1996. A Framework for Plot Con-trol in Interactive Story Systems, Proceedings AAAI’96, Portland, AAAI Press, 1996. [18] Swartout, W., Hill, R., Gratch, J., Johnson, W.L., Kyriakakis, C., LaBore, C., Lindheim, R., Marsella, S., Miraglia, D., Moore, B., Morie, J., Rickel, J., Thiebaux, M., Tuch, L., Whitney, R. and Douglas, J. Toward the Holodeck: Integrating Graphics, Sound, Character and Story. in Proceedings of the Autonomous Agents 2001 Conference, 2001. [19] Szilas, N.: Interactive Drama on Computer: Beyond Linear Narrative. Papers from the AAAI Fall Symposium on Narrative Intelligence, Technical Report FS99-01, AAAI Press, 1999. [20] Young, R.M. 2000. Creating Interactive Narrative Structures: The Potential for AI Approaches. AAAI Spring Symposium in Artificial Intelligence and Interactive Entertainment, AAAI Press. [21] Young, R.M.: An Overview of the Mimesis Architecture: Integrating Intelligent Narrative Control into an Existing Gaming Environment. Working Notes of the AAAI Spring Symposium on Artificial Intelligence and Interactive Entertainment, AAAI Press 2001.
8