Use of Ontologies for Knowledge Representation of a Film Scene

Yannis Christodoulou, Nikoletta Mavrogeorgi, Petros Kalogirou
National Technical University of Athens
Department of Electrical & Computer Engineering
Zografou 157 73, Athens, Greece
[email protected], [email protected]

Abstract

In the present paper we explore Ontologies as a means to capture knowledge of the film-making domain. More specifically, we intend to open a discussion to guide the development of DirectorNotation, an open-source tool under development for representing media content from its preproduction and production phases onward, using a symbolic notation system in which a film director can formally record his intentions. DirectorNotation’s authors suggest Ontologies for formalizing and reasoning over their notation system. The current paper discusses this technical choice, considers its limitations, and makes suggestions about the work that needs to be performed if these limitations are to be overcome.

1. Introduction

Cinematography has been evolving for over a century [16]-[19]. During this time, various principles have been established for different aspects of cinematic expression, such as camera work, acting and editing. These principles have served as the basis for the development of automated tools, e.g. [8]-[15], that emulate the director’s role either by automatically putting together raw footage to create a film or by creating original multimedia content through the conversion of text into animations. For example, [15] serves as an automated director: it accepts a formalized screenplay and outputs camera directives in the form of a 3D movie. Similarly, [13] automatically converts natural language instructions into 3D scenes.

DirectorNotation [1]-[4] is the next logical step in this evolution. It is a system that simultaneously works towards both an artistic and a technical formalisation of the domain of cinematography, at a greater level of detail than achieved before. DirectorNotation is developed by our colleagues at the National Technical University of Athens as an open source project. As such, their work calls for open collaboration and community contributions to their development efforts. The current paper is an early response to this call. We have discussed DirectorNotation with its authors, and although we consider that this work has considerable merit, we wish to call attention to certain challenges that remain open. We aim to offer both suggestions and constructive criticism of some technical ideas.

The goal of this paper is to start a scientifically grounded discussion on the implementation decisions that need to be made by the open source project of [4], so as to help place the artistic system on the best possible technical foundation. Every possible design will face trade-offs, so there will be no perfect solution; we therefore believe that an open discussion among multiple researchers can help form a more successful open source project.

2. What is DirectorNotation

DirectorNotation makes an important step beyond the baseline of previous work, such as [8]-[15]. The key difference between DirectorNotation and these systems is that the former aims at becoming a powerful tool in the hands of a director (i.e. a professional human artist who possibly lacks computer skills), rather than trying to replace the (human) director in the film-making process with an “intelligent” software tool. Its main purpose is to analyse and represent media content during its production phase, rather than to capture knowledge about already existing media. For that purpose DirectorNotation suggests a logical symbolic structure: a notation system in which a film director can formally record his ideas and conceptualization regarding the creation of a film (similar to the way that a composer uses notes to record a piece of music), rather than using natural language or some kind of technical formal language for inputting filming directives. This form of representation will also be fully computer-processable, leading to various applications such as automatic synthesis of animated storyboards; to achieve this, however, the notation needs to be translated into an intermediate technical form.

As a result, we see that DirectorNotation is valuable for many new media and entertainment applications. For instance, the customization of content that a system can hope to perform when the content is delivered over an interactive medium such as the Internet will require a representation of how the content needs to be presented in different possible delivery scenarios. DirectorNotation allows both the artistic representation of a director’s ideas and the automatic processing of these ideas, so it can provide many new solutions in this area. In this paper, we focus on the technical problems that must be overcome before such objectives can be realised.

3. Using Ontologies for Knowledge Representation

Ontologies have become increasingly popular among Knowledge Representation methods in recent years. According to [5] [6] [7], their primary use is to create a shared understanding of a specific domain, enabling communication among heterogeneous domain applications. A salient benefit that derives directly from the nature of Ontologies is that once the knowledge of a specific domain is captured, it can be reused and even expanded without the need of “reinventing the wheel”. In other words, the ontology may serve as a common agreement among domain experts that can be leveraged by domain applications, without each application having to build the domain knowledge separately and for its own purposes. Ontologies also aim at good practical performance, whereas older knowledge representation tools suffered from more severe computational limitations (e.g. queries that were undecidable or required exponential time to process).

We should make a distinction between Ontologies and Knowledge Bases (KBs), in the sense that an ontology mainly serves as a vocabulary through which we can describe and declare facts regarding a specific domain of knowledge, whilst a KB may also include reasoning functionality (e.g. executing queries about the domain). Nonetheless, various Semantic Reasoners have been developed that enhance Ontologies by adding reasoning capabilities to them, making them a powerful method for capturing domain knowledge. Therefore, when we discuss Ontologies as a general Knowledge Engineering technique, we should also discuss the reasoning tools that a system uses alongside them, in order to get a complete picture of the resulting capabilities and restrictions.

Currently the authors of DirectorNotation suggest Ontologies as the mechanism through which a piece composed in the notation can be represented in order to be further processed by other software components. The current paper discusses this technical choice, considers its limitations, and makes suggestions about the work that needs to be performed if these limitations are to be overcome.
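
To make the distinction between an ontology as a vocabulary and the reasoning performed over it more concrete, the following minimal Python sketch stores a small is-a hierarchy and answers subsumption queries over it. The concept names and the functions is_subconcept_of and subconcepts_of are illustrative assumptions only; they are not taken from DirectorNotation or from any existing film ontology, and a real system would rely on an ontology language and a Semantic Reasoner rather than on a dictionary.

```python
# Hypothetical toy vocabulary: each concept maps to its direct parent
# (None marks the root). None of these names come from DirectorNotation.
PARENT = {
    "SceneComponent": None,
    "Motion": "SceneComponent",
    "LinearMotion": "Motion",
    "CurvedMotion": "Motion",
    "Lighting": "SceneComponent",
}


def is_subconcept_of(concept, ancestor):
    """Return True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    current = concept
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]  # climb one level towards the root
    return False


def subconcepts_of(ancestor):
    """List every concept subsumed by `ancestor`, in declaration order."""
    return [c for c in PARENT if is_subconcept_of(c, ancestor)]


if __name__ == "__main__":
    print(is_subconcept_of("LinearMotion", "SceneComponent"))
    print(subconcepts_of("Motion"))
```

The dictionary plays the role of the vocabulary, while the two functions stand in for the simplest kind of query a reasoner would answer; keeping them separate mirrors the separation between an ontology and the reasoning tools applied to it.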

4. Formalizing DirectorNotation: Declarative vs. Procedural Approach

In this work, our intention is to put forward suggestions for the future of the DirectorNotation technology while it is still in its development phase. Thus, we intend to help guide this development, rather than simply wait for finished software to be delivered and then assess the final technical results. The specific issue this paper explores is the technical method that will be used to implement DirectorNotation’s formalization. The initial description of the system indicates that Knowledge Engineering, and specifically Ontologies with supporting reasoning capabilities, will be used to allow a film director freedom of expression in DirectorNotation. The ontology must be in a position to cover many different aspects of the filming process, such as film objects (e.g. characters) and their activities, environment setup and changes (e.g. lighting), film editing etc. As far as we know, no ontology has been developed so far that could tackle these issues.

The idea of basing DirectorNotation upon an ontology is that the film director can make declarative statements describing his ideas, rather than having to specify every detail of the film in a “bottom-up” way. While we consider this a good idea, our aim here is to explore whether Ontologies are a sufficient tool for this purpose. An alternative to utilizing Ontologies would be to follow a typical programming approach, for example choosing an object-oriented language to do the job. Thus, we encounter two entirely different approaches, which we intend to compare in an effort to come near to an optimum solution.

In the process of implementing the idea of DirectorNotation, the first and most critical step is to determine the most efficient method for capturing the knowledge of the film production domain. Envisioning the use of Ontologies for the formalization of DirectorNotation, we inevitably come across some salient benefits. First of all, the idea of developing an ontology that captures knowledge regarding the film-making process is likely to attract a wide range of developers from different fields of entertainment (movie industry, computer games, interactive entertainment etc) who wish to structure applications based upon this knowledge and who will willingly contribute to the development of the ontology by suggesting their own improvements and extensions, exploiting the expandability that the use of Ontologies entails. As such, the expandable nature of Ontologies is a feature that the developers of DirectorNotation should take into account.

Moreover, at first glance, expressing film direction intentions declaratively seems to be a much more appropriate method to adopt than having to design exhaustive algorithms that would inevitably have to include every single detail in order to serve a practical purpose. The main notion is that declarative thinking is more intuitive because it allows a top-down approach, where fundamental ideas are notated first and additional detail may be filled in as required. Besides, DirectorNotation seems to be structured in a way that has many similarities to the nature of an ontology. Using Ontologies for formalizing DirectorNotation will also encourage flexibility, by making the knowledge model tolerant to changes in the artistic notation system. In contrast, a procedural implementation would require dramatic alterations in order to cope with such changes.

On the other hand, implementing DirectorNotation through a procedural model (in the sense that knowledge is modelled in the form of procedures or sequences of actions, and numerical/geometrical computation is part of the model itself) would certainly tackle, with ensured efficiency, every aspect of the notation that requires specific values to be calculated in order to represent a certain activity in a film scene, whereas any Knowledge Representation method intrinsically lacks such computational capabilities.

We propose that the developers carefully consider, from the current early phase of their work, the need to find approximate matches to descriptions that a director gives declaratively in notation. For example, it is stated in [1] that when a director specifies a “profile shot”, he does not mean an angle of exactly 90° between the camera facing direction and the direction in which the actor is looking; rather, a profile shot refers to a range of acceptable angles around the exact orthogonal orientation. It makes sense that a film director will often (perhaps, in fact, usually) make approximate declarations such as this, rather than giving exact measurements in his instructions. When multiple such declarations specify conditions that must all hold at the same time, the ensuing problem is one of geometrical constraint satisfaction, either in 3D space or even specified in terms of perspective (e.g. other notation in [1] gives a representation for which part of the film screen an actor should appear in). These geometric problems will be combined with conditions that are described in a more conceptual framework, e.g. that one actor is “close” to another, that an object is in the “background”, that the camera “follows” an actor, etc. Thus, a purely procedural solution is also not sufficient, and the aforementioned advantages of reasoning over a semantic model are also needed.

Similar concerns arise in the context of the modelling challenge itself. DirectorNotation introduces a notation system through which a film scene can be described by capturing the director’s intentions. In the scope of the present paper, the term “film scene” includes the description of the set where the action takes place (environment, lighting etc) and the description of the actors present in the set and their actions. Moreover, the actions of the cameras that are used for shooting the scene are also considered film scene components. As such, the knowledge that needs to be recorded in order to thoroughly specify a film scene includes not only the modeling of static scene components such as the scene set, but also more dynamic components such as motion activities of actors (displacement, body gestures etc) as well as the movement of the camera(s) that take part in the shooting. The term “static” does not imply that those components will necessarily remain unaffected throughout the duration of the scene. However, changes in static components can be thought of as qualitative changes (e.g. coloring, weather conditions) that can be conceptually expressed, rather than quantitative changes that also require computation in order to determine them.
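
The “profile shot” example can be illustrated with a short Python sketch that matches a computed angle against a range of acceptable values instead of an exact 90°. The tolerance of 20 degrees, the 2D simplification and the function names are assumptions made purely for illustration; [1] does not specify these values, and a real implementation would work in 3D and take its ranges from the notation’s definitions.

```python
import math

# Assumed tolerance around the exact orthogonal orientation (not from [1]).
PROFILE_TOLERANCE_DEG = 20.0


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 2D direction vectors."""
    norm = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    if norm == 0:
        raise ValueError("direction vectors must be non-zero")
    cosine = (u[0] * v[0] + u[1] * v[1]) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def is_profile_shot(camera_dir, gaze_dir, tolerance=PROFILE_TOLERANCE_DEG):
    """True if the camera/gaze angle falls within the accepted range around 90 degrees."""
    return abs(angle_between_deg(camera_dir, gaze_dir) - 90.0) <= tolerance


if __name__ == "__main__":
    print(is_profile_shot((0.0, 1.0), (1.0, 0.0)))   # exactly orthogonal
    print(is_profile_shot((1.0, 0.2), (0.0, 1.0)))   # roughly orthogonal
    print(is_profile_shot((0.0, 1.0), (0.0, -1.0)))  # camera faces the actor head-on
```

The numerical part (computing the angle) and the conceptual part (deciding that the angle counts as a profile shot) are deliberately kept in separate functions, since the former belongs to computation and the latter to the vocabulary that a director uses.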

5. Combining Declarative and Procedural Requirements

As implied in the previous section, the knowledge needed in order to record a film scene can be divided into two categories. The first includes all the components of the scene that can be characterized as “descriptive”, in the sense that they can intrinsically be expressed in a declarative way, such as the set description (e.g. environment, lighting), the description of scene objects (e.g. characters) and their relative positioning within the set etc. The second category includes all the procedural knowledge, in the sense that some form of computation is part of the knowledge itself, such as the sequence of positions of a character during a scene when these movements are not predetermined but intended to achieve a given goal (e.g. the instruction “remain close to object x while also always keeping object y in your field of view”). For instance, although modeling a single linear movement could easily be tackled declaratively, the modeling of a complex constraint-determined geometry is clearly a difficult task that cannot be achieved fully in a purely conceptual model.

A middle ground between the declarative and procedural approaches would be to develop an external computational module that handles computation tasks and inputs the results to the ontology. Although such a hybrid system sounds promising in that it effectively carries out both the declarative and procedural requirements of DirectorNotation, it may lead to the creation of an incomplete ontology, a problem which can be avoided with careful design and perhaps some duplicate work on the procedural and ontology modules of the system. We suggest that much care should be given to the workflow of how to create semantics-based reasoning and procedural/numerical computations as two discrete sub-systems, and combine them under a common architectural schema (Figure 1).

Figure 1. Suggested architecture of a unified formalization model which combines semantics-based reasoning and procedural/numerical computation

To achieve this, an intermediate control module is needed between the two sub-systems. The control module takes as input a piece of media content expressed through notation (e.g. a certain film scene), and attempts to match notation statements with ontology concepts. During the matching process, the controller identifies cases where certain declarations cannot be evaluated inside a reasoner, as they involve geometry computation and possibly constraint approximation (if multiple constraints must hold at the same time) that can be tackled only in a procedural manner. In this case, any procedural logic of the scene can be represented declaratively (e.g. in the form of workflows), but the reasoner must work on the procedural knowledge (e.g. in order to derive a specific work-plan from a workflow, with respect to the constraints) in constant collaboration with a computational module that handles geometrical computations and constraint approximation requirements, producing specific numerical values. Similarly, when numerical results are computed, the controller will need to be able to submit appropriate queries to the ontology, to find the (presumably more general) concept of which the computed result is an instance. Note that this collaborative relation between the reasoner and the procedural module may prove rather complex in cases where several iterations of the process are required (depending on the number and complexity of geometrical constraints) just to specify a single shot of a given scene. In Figure 2, we illustrate the complex control flow of the suggested system.

Finally, let us assume another scenario where, at some point during a film scene, a set of constraints forces us to diverge from the linear motion path that the camera was supposed to follow. More specifically, we want to frame a certain object’s course, which forces the camera to diverge from its linear path. At the same time we want to achieve the minimum curvature in the camera’s path, in order to maintain the framing of another object (possibly in the background) as much as possible. Solving this problem clearly also requires a computational module that calculates the minimum degree of curvature with respect to the aforementioned constraints. Once the value has been calculated and the path has been determined, it will be input to the ontology in order to make a declarative statement about the camera’s final motion path.
Inputting the result to the ontology requires determining which subconcept of the concept “motion” (given that there is such a concept in the ontology) better approximates the result. However, if the resulting path is far too complex, it would be more convenient to (declaratively) divide it into simpler fragments in order to achieve more satisfactory approximations within the knowledge model. Alternatively, the ontology itself can be edited so that an approximation for the resulting path can be created (for instance, by expanding the “motion” concept with more specialized sub-concepts). We consider that these methods could also work in collaboration in order to achieve optimum results, depending on the case. For example, a certain (relatively complex) motion path may be evaluated as a rather common case (in the sense that it is often encountered), and as such the ontology editing approach should be preferred over the fragmentation method. However, if the level of complexity does not promise a satisfactory approximation, the fragmentation method can be used first, and then new concepts can be created to represent each fragment.

Figure 2. A possible workflow of such a combinational system, in which the knowledge representation sub-system (ontology, procedural logic) collaborates with the computational sub-system (geometry, constraints approximation) through a central control module, in order to specify e.g. a dialogue scene.

There might, however, be cases where, regardless of the complexity of the problem, a conceptual approach must be followed exclusively. For instance, dramatically complex motion paths may be recognized as idioms (e.g. the abnormal walking of a drunken character). In such cases, it may be better to leave all (possibly massive) computations behind and deal with them in a conceptual manner (e.g. declaring them as “asymmetrical curved motion” concepts, accompanied by the appropriate annotation), leaving interpretation freedom to the actor.
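
As a minimal sketch of the control module discussed in this section, the Python code below routes each notation statement either to the reasoner or to the computational sub-system, and then maps a computed motion path back to the closest sub-concept of “motion”. All names here (Statement, GEOMETRIC_RELATIONS, route, classify_motion, the relation labels, the two sub-concepts and the tolerance) are hypothetical and chosen only for illustration; they do not describe the actual DirectorNotation design, and the deviation-from-chord test is a simple stand-in for the curvature computation mentioned above.

```python
import math
from dataclasses import dataclass


@dataclass
class Statement:
    """A hypothetical notation statement: subject, relation, optional value."""
    subject: str
    relation: str
    value: object = None


# Assumed set of relations that need geometry rather than pure reasoning.
GEOMETRIC_RELATIONS = {"close_to", "keeps_in_view", "follows"}


def route(statement):
    """Decide which sub-system should evaluate the statement."""
    if statement.relation in GEOMETRIC_RELATIONS:
        return "computational"
    return "reasoner"


def max_deviation(path):
    """Largest distance of any path point from the straight chord first -> last."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:  # closed path: measure distance from the start point instead
        return max(math.hypot(x - x0, y - y0) for x, y in path)
    return max(abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / length
               for x, y in path)


def classify_motion(path, tolerance=0.1):
    """Map a computed 2D path to an assumed sub-concept of 'motion'."""
    return "LinearMotion" if max_deviation(path) <= tolerance else "CurvedMotion"


if __name__ == "__main__":
    print(route(Statement("camera", "follows", "actor")))
    print(route(Statement("set", "has_lighting", "dim")))
    print(classify_motion([(0, 0), (1, 0), (2, 0)]))
    print(classify_motion([(0, 0), (1, 1), (2, 0)]))
```

In this sketch a path that fits no sub-concept well would simply be labelled “CurvedMotion”; the fragmentation and ontology-editing strategies described above would refine that step, for example by splitting the path and classifying each fragment separately.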

6. Conclusions

DirectorNotation is a tool under development for analyzing and representing media content from its production phase, by introducing a symbolic notation system in which a film director can formally record his intentions. We argued that DirectorNotation cannot be adequately formalized using exclusively either a knowledge representation method (such as Ontologies) or a purely procedural model, as each approach on its own faces trade-offs and would lead to inefficient and possibly unsuccessful solutions. We proposed combining these two approaches into a unified architectural schema. Furthermore, we suggested a potential system control flow in which the procedural and knowledge representation modules interact with each other through an intermediate controller, in a way that can cater for both the declarative and procedural requirements of DirectorNotation.

7. References

[1] A.Yannopoulos, K.Savrami, T.Varvarigou, A Theoretical Foundation and Formal Introduction for DirectorNotation, accepted with revisions for publication in Multimedia Systems, Springer Verlag, 2008, manuscript communicated by authors
[2] A.Yannopoulos, K.Savrami, T.Varvarigou, DirectorNotation as a Tool for AmI & Intelligent Content: an Introduction by Example, accepted for publication in the International Journal of Cognitive Informatics and Natural Intelligence (IJCiNi), 2008, manuscript communicated by authors
[3] A.Yannopoulos, K.Savrami, T.Varvarigou, Artistic Notation Systems Integrated in Software Engineering of Knowledge Technologies: emphasising film, animation and games, under review, ACM JOCCH (Journal of Computers and Cultural Heritage), Special Issue on “Research Agendas in ICT and Cultural Heritage”, 2008, manuscript communicated by authors
[4] http://www.answer-project.org, valid Jan 2008
[5] B.Chandrasekaran, T.R.Johnson, V.R.Benjamins, Ontologies: What are they? Why do we need them, IEEE Intelligent Systems and Their Applications, 1999
[6] N.F.Noy, D.L.McGuinness, Ontology Development 101: A Guide to Creating Your First Ontology, http://www.ksl.stanford.edu/people/dlm/papers/ontology101/ontology101-noy-mcguinness.html, valid Jan 2008
[7] M.Uschold, M.Gruninger, Ontologies: Principles, Methods and Applications, Knowledge Engineering Review, Volume 11 Number 2, June 1996
[8] B.Tomlinson, B.Blumberg, D.Nain, Expressive autonomous cinematography for interactive virtual environments, Agents 2000: 317-324
[9] D.B.Christianson, S.E.Anderson, L.He, D.Salesin, D.S.Weld, M.F.Cohen, Declarative Camera Control for Automatic Cinematography, AAAI/IAAI, Vol. 1 1996: 148-155
[10] C.B.Callaway, E.Not, A.Novello, C.Rocchi, O.Stock, M.Zancanaro, Automatic cinematography and multilingual NLG for generating video documentaries, Artif. Intell. 165(1): 57-89 (2005)
[11] D.A.Friedman, Y.A.Feldman, Automated cinematic reasoning about camera behavior, Expert Syst. Appl. 30(4): 694-704 (2006)
[12] J.Shen, T.Aoki, H.Yasuda, S.Miyazaki, E-Movie Creation by Rule-Based Reasoning from the Director's Viewpoint - E-Movie: Computer Animation & Real Images, IEE EWIMT, Nov. 2004
[13] B.Coyne, R.Sproat, WordsEye: an automatic text-to-scene conversion system, Proc. 28th annual conference on computer graphics and interactive techniques, pp. 487-496, Aug. 2001
[14] J.Pickering, P.Olivier, Declarative Camera Planning: Roles and Requirements, Smart Graphics 2003: 182-191
[15] D.Friedman, Y.Feldman, Knowledge-based Formalization of Cinematic Expression and its Application to Animation, in Proceedings of Eurographics '02, 2002
[16] D.Arijon, Grammar of the Film Language, Focal Press, Boston, 1976
[17] N.T.Proferes, Film Directing Fundamentals: see your film before shooting, Focal Press, London, 2nd edition, 2005
[18] J.V.Mascelli, The Five C's of Cinematography, Cine/Grafic Publications, Hollywood, 1965
[19] P.Wheeler, Practical Cinematography, Focal Press, Boston, 1999
