Opera of Meaning: film and music performance with semantic associative search

Shlomo DUBNOV a and Yasushi KIYOKI b

a Music, California Institute of Telecommunication and Information Technology, University of California in San Diego, La Jolla, 92093, USA
b Faculty of Environmental Information, KEIO University, Fujisawa, Kanagawa 252, Japan
Abstract. Artists have recently been exploring ways of incorporating large amounts of information and networking into their medium. One of the main challenges in applying information technology to film and opera lies in relating different types of media to the meaning of the story narrative. Opera of Meaning is a new format for distributed, collaborative and interactive viewing in which different media elements are associated dynamically through semantic and impression search performed by the public during the performance, in the context of a main story. This opens new research questions in database modeling and semantic technology related to story meaning, media auto-tagging, automatic editing and mixing, user interaction, social networking and more. We plan to offer this format to artists, producers and the public, opening a new venue for the social creation and experience of impression and meaning in digital media.

Keywords. Opera, Film, Kansei, Expectancy, Mathematical Model of Meaning
1. Introduction

Opera of Meaning (OoM) is a format for presenting media content, such as a dramatic work, entertainment or a lecture, that uses a realization of a script (the main story) together with a collection of related clips and other electronic information (the related contents), which are presented dynamically in parallel with the main story during the show. A central feature of Opera of Meaning is that the public can interact with the related contents during the story presentation, both by providing information to the performance system regarding the selection of related contents and by posting their own information on designated publicly viewed display areas. The main story can be live, prerecorded or a combination of the two, such as a TV show, music performance, film screening or other type of electronic transmission. In a recent piece based on an ancient historical text [1][2], a live performance took place on a central stage simultaneously with the projection of films on a series of screens surrounding the audience. At dedicated times the live performance was stopped in order to present films and video interviews with experts about the story, and to conduct a public discussion in which audience members presented their own contents, such as text and graphics, on the surrounding screens and expressed themselves via electronic chats and votes on questions related to the story. The name “Opera of Meaning” is intended to emphasize the interplay of meanings and the exchange of opinions that occur during the performance of a story. This approach is related to works that explore the social and educational roles of film and theater media, and is inspired by methods of debate and commentary that are common in traditional religious study settings such as the Jewish Talmud or Tibetan Buddhism.

1.1. Relation between information technology and story telling

Storytelling is the ancient art of conveying events in words, images and sounds, often by improvisation or embellishment. When a story is presented in a dramatic manner in one or more acts set to music and artistic visual expression, it becomes film or opera. Today, many methods are used to introduce more sophisticated structures and additional information into traditional stories. Simultaneity, split screens and sophisticated editing have become an inherent part of digital aesthetics. Metafiction and the use of a commentator allow stories to be embedded inside or in parallel to other stories, dealing with the relation between fiction and reality. Information technology introduces yet more new and unexpected ways to tell and understand stories. In OoM, the use of related materials presents an opportunity to “embed” user-defined stories in the original plot. The commentary method, besides its obvious function as a learning tool, is understood as a shift in the readers' attention from plot to characterization. Building upon the contemporary culture of social “tagging”, OoM offers additional characterizations of the main story in ways that are very different from traditional hyperlink approaches.
In addition, the public interaction aspects are designed to maintain the dynamic flow and coherence of the presentation, inspired by methods of machine improvisation [3] that use string matching over a database of music materials to retrieve, recombine and create music.
2. System Concept and Architecture

Our system combines concepts of Kansei, narrative, commentary and debate. In the field of multimedia database systems, the concept of Kansei is related to data definition and data retrieval with impression information for multimedia data, such as images, music and video. The multimedia data, together with its semantic and impression descriptors, are used to define the "information world", relevant to a particular story, in which the user can operate. The closed-world assumption in database modeling suits the Opera performance, because the appropriate information resources can be prepared in advance from a closed world rather than drawn from the open world (WWW). According to our OoM concept, the DB design and story annotation processes are included in the scenario design.

2.1. The Opera performance system

The main principles behind the design of the opera performance system are:
1. presentation design that establishes clear relations between the main story and related contents
2. timeline organization according to different levels of viewing and user interaction activities, such as performance acts, commentary and debate
3. the closed-world database and the logical structure of the story are included in the scenario design
4. public participation is allowed according to reactive, deliberative and reflective interventions on informative and Kansei levels

The presentation system is designed according to a principle of separation between the main story and related contents, in a manner that maintains the centrality of the main story and provides the surrounding materials in a spatially and temporally coherent way. By careful control of the amount, rate and type of information from related contents that is allowed to be displayed, the total sum of audiovisual materials enhances the audience experience through double coding and through engagement by active participation. The architecture of the theater version of the system is shown in Figure 1.
Figure 1. Performance system architecture
2.2. Multidatabase system architecture

In order to support various types of data (different story scenes, the logic of the story structure, rules for content composition and presentation, and databases of related text, images and video), a meta-level active database functionality is used to integrate the heterogeneous resources. Three main operators retrieve the content relevant to a story scene. The semantic component matches informative contents, such as specific keywords related to the story contents. The impression component operates on the Kansei level and “filters” the data according to mood, emotions and other aesthetic criteria. The story / world component considers the role of data in the story presentation, such as identifying functions related to point of view, handling chronological and spatial editing considerations, and keeping track of the expectations and questions raised by the audience in response to story segments. This architecture is presented in Figure 2. The database system interfaces with the opera performance system through a set of actions defined by the script and by user queries.
Figure 2. Multidatabase system architecture
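As a rough illustration of how the three operators of Section 2.2 could cooperate, the following minimal Python sketch chains them as successive filters over a small in-memory collection. The chaining order, the `Clip` fields, the function names and the sample clips are all assumptions made for this example; the paper does not specify how the operators are combined or how the data is stored.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    """Hypothetical record for one item of related content."""
    name: str
    keywords: set   # informative descriptors (semantic level)
    moods: set      # impression descriptors (Kansei level)
    role: str       # function of the clip in the story presentation

def semantic(clips, query_keywords):
    # Semantic component: keep clips sharing at least one keyword with the query.
    return [c for c in clips if c.keywords & query_keywords]

def impression(clips, mood):
    # Impression component: filter by the requested mood.
    return [c for c in clips if mood in c.moods]

def story_world(clips, allowed_roles):
    # Story / world component: keep clips whose role fits the current scene.
    return [c for c in clips if c.role in allowed_roles]

def retrieve(clips, query_keywords, mood, allowed_roles):
    # Assumed combination: apply the three operators one after another.
    return story_world(impression(semantic(clips, query_keywords), mood), allowed_roles)

# Invented sample data, for illustration only.
clips = [
    Clip("temple_map", {"jerusalem", "temple"}, {"solemn"}, "background"),
    Clip("expert_interview", {"temple", "siege"}, {"solemn", "tense"}, "commentary"),
    Clip("festival_footage", {"banquet"}, {"joyful"}, "background"),
]
result = [c.name for c in retrieve(clips, {"temple"}, "solemn", {"commentary"})]
print(result)
```

In a commentary phase that only admits clips with the role "commentary", the query above keeps a single clip, the expert interview. A real system would replace the exact keyword and mood matching with the semantic associative search of Section 3.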
3. The Mathematical Model of Meaning for Semantic Associative Search

In the design of the OoM system, one of the important issues is how to deal with human “Kansei”. The concept of “Kansei” covers several meanings related to sensitive recognition, such as “impression”, “human senses”, “feelings”, “sensitivity”, “psychological reaction” and “physiological reaction”. In the OoM system, the concept of Kansei is related to data definition and data retrieval with Kansei information for multimedia data, such as images, music, video and stories. The key task is to retrieve images, music, video and stories dynamically according to the user’s impression given as Kansei information. The field of Kansei was originally introduced under the word “aesthetics” by Baumgarten in 1750. The aesthetica of Baumgarten was established and then succeeded by Kant with his ideological aesthetics. In the research field of multimedia database systems, it is becoming important to deal with human Kansei information in order to define and extract media data according to the impressions and senses of individual users. As one of the database systems dealing with Kansei information, a semantic associative search system based on the Mathematical Model of Meaning (MMM) has been proposed [4][5][6][7]. The MMM realizes media data retrieval by receiving keywords that represent the user’s impression and the media data contents. The MMM provides semantic functions for computing the specific meanings of keywords, which are used for retrieving media data unambiguously and dynamically. The main feature of this model is that the semantic associative search is performed in an orthogonal semantic space. This space is created for dynamically computing semantic equivalence or similarity between the metadata items of media data and keywords.

The basic principle of the MMM is that each media data item, which can be text, image, animation, music or movie, carries various meanings. That is, the meaning of a media data item is not fixed statically; it is fixed only when the context for explaining the content of the item is known. The MMM defines semantic functions for performing the semantic interpretation of content and for selecting semantically related media data items according to the given context. In MMM, metadata expressed in terms of English words are assigned to each media data item, and the data items are mapped onto an orthogonal space. Each media data item is placed as a single coordinate point in the space and is dynamically extracted by semantic associative search. In this semantic space, which has approximately 2000 dimensions in the current implementation, each context corresponds to one of the subspaces, called a “semantic subspace”. When a context is given, the semantic subspace corresponding to that context is selected. This selection reflects the recognition of the context given as an aspect. Each data item is also mapped onto the semantic subspace selected according to the given context, and the relationships between data items are dynamically computed by using the metric of the selected semantic subspace, which reflects the context. In MMM, the number of phases of contexts is almost infinite, currently approximately 2^2000.

3.1. The Outline of the Mathematical Model of Meaning

In this section, we briefly review the outline of the semantic associative search method based on the Mathematical Model of Meaning. The model has been presented in detail in [4][5][6][7].
The semantic associative search method consists of three steps as follows:

STEP1: Creation of the metadata space: The semantic associative search for information resources is realized by the Mathematical Model of Meaning. A metadata space is created as a basis for computing the relationships between data items. When m data items are given as the basic data items for creating the space, each data item is characterized by n features. The m basic data items are given in the form of an m x n matrix M. By computing the eigenvalue decomposition of the correlation matrix M^T M, an orthogonal semantic space is created (M^T represents the transpose of M). It is defined as the metadata space MDS.

STEP2: Mapping data onto the semantic space: A set of keywords given as a query and the target data items, both of which are characterized by the same n features as used in STEP1, are mapped as vectors onto the metadata space MDS. The MMM measures the association or correlation between context words and each candidate data item. Suppose a sequence of associated context words (e.g. peaceful, silent) is given to search for a data item. We can regard these words as forming the context. The context is used to select a subspace from the metadata space MDS.

STEP3: Semantic associative search: First, when a context for explaining the meaning of a query is given, the semantic subspace is dynamically extracted from the metadata space. In this model, each context given by a user corresponds to one of the semantic subspaces. Second, the target data items are mapped onto the subspace. Then, by calculating the correlation of each data item in the selected subspace, the data items that are highly related to the given context can be extracted. Since the subspace reflects the given context words, the norm of the data item projected onto the selected subspace is regarded as the correlation between the data item and the given context words. That is, a data item with a larger norm is more highly related to the given context, and the data items with the highest norms are returned as the result set for that context.
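The three steps can be illustrated with a small NumPy sketch. It is a toy instance under stated assumptions, not the implementation of [4][5][6][7]: the three features, the word vectors and the two media items are invented, and the subspace selection rule (average the context words in the metadata space and keep the axes whose normalized weight reaches a threshold) is an assumed simplification of how a context selects a semantic subspace.

```python
import numpy as np

def build_metadata_space(M, tol=1e-10):
    """STEP1: orthonormal axes from the eigendecomposition of M^T M."""
    eigvals, eigvecs = np.linalg.eigh(M.T @ M)  # M^T M is symmetric, so eigh applies
    return eigvecs[:, eigvals > tol]            # keep axes with non-zero eigenvalue

def select_subspace(Q, context_vectors, threshold=0.5):
    """STEP2/3: pick the axes on which the averaged context words weigh heavily.

    The averaging and thresholding rule is an assumption of this sketch.
    """
    center = np.mean([Q.T @ v for v in context_vectors], axis=0)
    center = center / np.max(np.abs(center))    # largest weight becomes 1
    return np.abs(center) >= threshold          # boolean mask over the axes

def rank_items(Q, axes, items):
    """STEP3: score each item by the norm of its projection onto the subspace."""
    scores = {name: float(np.linalg.norm((Q.T @ v)[axes]))
              for name, v in items.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Invented toy data: the n = 3 features are [calm, quiet, loud].
words = {
    "peaceful": np.array([1.0, 0.8, 0.0]),
    "silent": np.array([0.8, 1.0, 0.0]),
    "noisy": np.array([0.0, 0.0, 1.0]),
}
M = np.vstack(list(words.values()))             # m x n matrix of basic data items
Q = build_metadata_space(M)                     # metadata space MDS
axes = select_subspace(Q, [words["peaceful"], words["silent"]])

items = {
    "lake_at_dawn": np.array([0.9, 0.9, 0.1]),
    "street_market": np.array([0.1, 0.0, 1.0]),
}
ranking = rank_items(Q, axes, items)
print(ranking)
```

With the context (peaceful, silent), only the axis shared by the calm and quiet features is selected, so the hypothetical item lake_at_dawn is ranked above street_market. Changing the context to the word noisy would select a different axis and reverse the order, which is the context dependence that the model is meant to capture.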
4. Production of story in OoM format

Production of OoM content consists of script writing / score composition, context annotation, related content collection, collage and montage design, and formulation of public interaction rules for the performance, commentary and debate phases. This is similar to the process that was used to create the Kamza and Bar Kamza performance [1][2]. The starting point in this process is a textual script that must be rendered into a musical and visual depiction of the story. This text is extended into a multiple-track script, which is described in Section 4.1. The next step is the collection of related content and its association with the main story. The goal of related content is to expand the story into additional meaning and impression domains by creating associative links to other sources of information. These associations are developed during a deep analysis phase and may include aspects of literary criticism and social analysis, as well as historical, economic, political, psychological, emotional and religious aspects, depending on the specific story and the public who wishes to participate in the collaborative production.

4.1. Script format and method

The scripting format for the Opera is used to define the variable contexts and consists of the following:

Scenario: A timeline describing the main story scenes, related content display and public interaction activities. The complete story consists of multiple acts, divided into performance, commentary and debate.

Score: For each story scene the script provides performance instructions written in multiple simultaneous lines resembling a musical score. Each line in the script provides metadata for associating user queries with the story context, and instructions for audio mixing and visual display layout. It also provides rules and authorizations for audience participation during scenes.

Tracks: The different elements contributing to the final display consist of the main story and related media that are retrieved dynamically during the performance. In the live version of the Opera, the main story consists of the text and music score provided to the actors or musicians. In the film version of the opera system, the main story is represented by pre-recorded media, such as film, slide show, text or audio narration. Related contents consist of additional images, video clips, text, graphics and possibly sounds that are selected from a database.

Conductor / Editor: This is the part of the audio-visual management system (also considered the control room) that is in charge of coordinating the different tracks according to the instructions provided in the score. Since many details of the final presentation are decided “on the fly”, the score is in fact a type of structured improvisation, and the conductor’s role is to monitor the overall result and make editorial decisions.

The Opera system requires careful composition of the visual and sonic elements so that the combination of main story and related contents achieves a certain level of coherence and engagement. The main and related contents must be organized not only semantically but also aesthetically, in terms of Kansei relations, spatial arrangement and temporal structure.
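One possible way to represent the scenario and score described above as data is sketched below in Python. The paper does not define a concrete file format, so every class, field and action name here (`Scene`, `ScoreLine`, `allowed_actions`, "query", "post", "vote") is a hypothetical choice made for illustration, as are the timings of the two sample scenes.

```python
from dataclasses import dataclass, field
from enum import Enum

class Phase(Enum):
    PERFORMANCE = "performance"
    COMMENTARY = "commentary"
    DEBATE = "debate"

@dataclass
class ScoreLine:
    """One of the simultaneous lines of the score for a scene (hypothetical)."""
    track: str            # e.g. "main_story" or "related_contents"
    context_words: list   # metadata associating user queries with the story context
    instruction: str      # audio mixing or display layout instruction

@dataclass
class Scene:
    """One entry on the scenario timeline (hypothetical)."""
    title: str
    phase: Phase
    start_s: float        # start time in seconds
    end_s: float          # end time in seconds (exclusive)
    score: list = field(default_factory=list)
    allowed_actions: set = field(default_factory=set)  # audience authorizations

def scene_at(scenario, t):
    # Return the scene whose time span contains t, or None outside the timeline.
    for scene in scenario:
        if scene.start_s <= t < scene.end_s:
            return scene
    return None

def is_allowed(scenario, t, action):
    # The conductor could use this check before accepting an audience request.
    scene = scene_at(scenario, t)
    return scene is not None and action in scene.allowed_actions

# Invented two-scene scenario, for illustration only.
scenario = [
    Scene("Act 1", Phase.PERFORMANCE, 0, 600,
          score=[ScoreLine("related_contents", ["banquet", "tense"], "side screens only")],
          allowed_actions={"query"}),
    Scene("Debate 1", Phase.DEBATE, 600, 900,
          allowed_actions={"query", "post", "vote"}),
]
print(is_allowed(scenario, 30, "vote"), is_allowed(scenario, 700, "vote"))
```

In this sketch voting is refused during the performance scene and accepted during the debate scene, reflecting the idea that the score carries the participation rules for each scene while the conductor enforces them.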
5. Presentation systems

The presentation design aspect of the OoM system refers to decisions about the placement of the main story, the audience, the graphical user interface (GUI) and the related contents displays in a physical or remote environment. A separate audio design provides optimal sound reinforcement and surround sound effects. The two examples of presentation design presented below are a shared common space called the "hyper-cinema theater" and a personal multi-display viewing system.

5.1. Hypercinema theater

The principal goal of the hypercinema theater is to provide a space for the public presentation of works that integrate immersive and multi-layered video projections and audio reinforcement for film or for mixed live and cinematic productions. The design elements include lighting, scenery and video projection on two or more screens. Additional aspects of the design include the division of the physical theater space into areas for main story performance and for audience presence. The audience can be seated in several configurations relative to the main story performance area. In all configurations the audience is surrounded or flanked by projection screens that display the related contents. The main story area can be used for mounting a film projection display or as a stage for live performance. The audience area can also be equipped with microphones that are fed into the audio system. Lighting is used to mark transitions between the different acts of the performance, the projections, and the debate sections when present. A local wireless network is used for communication between the audience and the performance system, such as posting display requests, conducting chats or submitting votes during the different phases of the performance.

5.2. Personal multi-display system

A personal display system is used for viewing pre-recorded content with personal interaction.
It consists of a single display or multiple displays that are logically and functionally divided into three areas corresponding to the main story, the related contents and the GUI. For instance, a graphics expansion module could be used to extend the desktop across multiple screens. In such a case custom software must be provided that divides the two screens into the three functional areas. One option is to split the first display between the main story player and the GUI, and to devote the second screen solely to related contents. Another option is to use the first display for the GUI only and to divide the second display in software between the main story and related content. In a later version of the system, networked shared viewing will be developed. This requires synchronized streaming of audiovisual content to multiple users and communication between the users and the server.
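The two layout options for a dual-screen setup can be made concrete with a short Python sketch that computes the rectangle of each functional area on an extended desktop. The assumptions are ours, not the paper's: both screens have the same resolution and sit side by side, a shared screen is split vertically into a left and a right part, and the option labels "A" and "B", the default resolution and the 75% split are arbitrary illustrative choices.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int

def split_horizontally(rect, fraction):
    # Divide a rectangle into a left part (given fraction of the width) and the rest.
    left_w = int(rect.width * fraction)
    left = Rect(rect.x, rect.y, left_w, rect.height)
    right = Rect(rect.x + left_w, rect.y, rect.width - left_w, rect.height)
    return left, right

def layout(option, screen_w=1920, screen_h=1080, fraction=0.75):
    """Map the three functional areas onto two side-by-side screens."""
    first = Rect(0, 0, screen_w, screen_h)
    second = Rect(screen_w, 0, screen_w, screen_h)  # extended desktop to the right
    if option == "A":
        # First display shared by main story and GUI, second for related contents.
        main_story, gui = split_horizontally(first, fraction)
        return {"main_story": main_story, "gui": gui, "related_contents": second}
    if option == "B":
        # First display for the GUI only, second shared by story and related contents.
        main_story, related = split_horizontally(second, fraction)
        return {"gui": first, "main_story": main_story, "related_contents": related}
    raise ValueError("option must be 'A' or 'B'")

layout_a = layout("A")
layout_b = layout("B")
print(layout_a["main_story"], layout_b["related_contents"])
```

A windowing toolkit would then place the player, the GUI and the related contents viewer in the returned rectangles. Keeping the geometry in one function makes it straightforward to switch between the two options or to adjust the split ratio.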
6. Applications

The system will be used to create content and produce events in OoM format. For the Kamza and Bar Kamza story from the Talmud, a recording of a stage performance will be used as the main media, and collections of images, maps, text and movies will be stored in the database. Another original story, "Gadget" (in preparation), is an adaptation of the Far Eastern folk tale "Mirror", which presents social criticism through a humorous plot [9].
Acknowledgements

This work is partially funded by a Heiwa-Nakajima-Zaidan fellowship. The Debate and Commentary Play project is supported by the California Institute of Telecommunications and Information Technology and the UCSD Chancellor’s Interdisciplinary Collaboratories program.
References

[1] D. Ramsey and D. Sutro, “An Opera of Meaning Integrates Live Performance, Internet, Multimedia and Audience Participation at UC San Diego”, UCSD News, February 2008, http://ucsdnews.ucsd.edu/newsrel/arts/02-08OperaOfMeaning.asp
[2] Debate and Commentary Play, http://kamzaandbarkamza.wikidot.com
[3] S. Dubnov and G. Assayag, “Memex and Composer Duets: computer-aided composition using style mixing”, Open Music Composers Book 2, Collection Musique/Science, Editions DELATOUR France, 2008.
[4] Y. Kiyoki, T. Kitagawa and T. Hayama, “A metadatabase system for semantic image search by a mathematical model of meaning”, ACM SIGMOD Record, Vol. 23, No. 4, pp. 34-41, Dec. 1994.
[5] Y. Kiyoki, T. Kitagawa and Y. Hitomi, “A fundamental framework for realizing semantic interoperability in a multidatabase environment”, Journal of Integrated Computer-Aided Engineering, Vol. 2, No. 1 (Special Issue on Multidatabase and Interoperable Systems), pp. 3-20, John Wiley & Sons, Jan. 1995.
[6] Y. Kiyoki, A. Miyagawa and T. Kitagawa, “A multiple view mechanism with semantic learning for multidatabase environments”, Information Modelling and Knowledge Bases (IOS Press), Vol. IX, May 1998.
[7] Y. Kiyoki, T. Kitagawa and T. Hayama, “A Metadatabase System for Semantic Image Search by a Mathematical Model of Meaning”, in Multimedia Data Management – using metadata to integrate and apply digital media –, A. Sheth and W. Klas (editors), McGraw-Hill (book), Chapter 7, March 1998.
[8] Y. Sato and Y. Kiyoki, “A semantic associative search method for media data with a story”, Proceedings of the 18th IASTED International Conference on Applied Informatics, pp., Feb. 2000.
[9] J. H. Grayson, “They First Saw a Mirror: a Korean folktale as a form of social criticism”, JRAS, Series 3, 16, 3 (2006), pp. 261-277.