DEMIS: A Dynamic Event Model for Interactive Systems Hua Jiang, G. Drew Kessler, Jean Nonnemaker Department of Computer Science and Engineering, Lehigh University 19 Memorial Drive West, Bethlehem, PA 18015, USA
{huj2, gdk2, jen2}@lehigh.edu
ABSTRACT Modern interaction systems are usually event-driven. New input devices often require new event types, and handling input from the user becomes increasingly complex. Frequently, the WIMP (Windows, Icons, Menus, Pointer) paradigm widely used today is not suitable for interactive applications, such as virtual reality applications, that use more than the standard mouse and keyboard input devices.
computer experts. Many useful applications have been built during these two decades using the WIMP model. While these interfaces satisfy conventional desktop needs fairly well, they are reaching their limits as new types of applications emerge. These new applications are often complex, making use of multiple types of input devices and multi-modal interactions, such as immersive virtual environments that use 3D tracking, button, mini joystick, voice recognition, and other novel input devices.
D.3.3 [Programming Languages]: Language Constructs and Features – abstract data types, polymorphism, control structures.
One reason for the inadequacy of the WIMP interface is the basic event-model currently used. This model is object-oriented and event-driven. Instead of letting applications actively retrieve signals and information from input devices, networks, or other applications, the windowing system manages the events and sends events to the relevant applications. An application wades through the events it observes on the event queue until events it is interested in appear. However, many input devices, such as 3D tracking, are polled by the application separately (to retrieve 6 DOF position and state of any buttons, etc.), and are not associated with events that go through the windowing system event queue.
General Terms
This approach has proved applicable and efficient for some applications, but has several weaknesses:
In this paper, we present the design and implementation of the Dynamic Event Model for Interactive Systems (DEMIS). DEMIS is a middleware layer between the operating system and the application that supports various input device events while using generic event recognition to detect composite events.
Categories and Subject Descriptors
Algorithms, Computer Interfaces, Middleware.
Keywords Input devices, human-computer interaction, event recognition, composite events
1. INTRODUCTION The WIMP (Windows, Icons, Menus, Pointing Device) paradigm has been the standard human-computer interface since the Apple Macintosh was introduced in early 1984. Microsoft Windows has made a similar presence with the concept of the Graphical User Interface (GUI) on the PC, as has Motif on the Unix workstation. The introduction of the GUI has significantly changed the interaction concept between humans and computers, improving the efficiency of communication as well as making computers understandable to a wider range of users than
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. VRST’02, November 11-13, 2002, Hong Kong. Copyright 2000 ACM 1-58113-000-0/00/0000…$5.00.
• Uniformly handling event and state input, which require different methods to interpret and respond.
• Lack of the potential to support multiple input devices. It is not enough to identify different types of devices; it is also necessary to distinguish different devices of the same kind.
• Inability to determine causality and concurrence between events. In the case of multi-modal interaction, the simple “one event, one handle” approach turns out to be weak. Often, a group of events with some causal relationship makes more sense to users as one single logical event.
• Inconvenient for applications to use temporal relationships between events. Each event comes at a specific time. However, once events go through the event-model, the temporal relationship is lost. Time-sensitive applications such as virtual environments require a more informative model.
Our system, the Dynamic Event Model for Interactive Systems (DEMIS), provides a framework to address these problems. DEMIS consists of two parts, an Input Model system and an Event Model system. The Input Model is directly related to the operating system and includes a device database to manage the information related to different sources, devices, and
components. It is responsible for combining and translating low-level operating system events into the richer, more context-sensitive events defined by our system. The Event Model is built directly upon the Input Model. The core of the Event Model is an event-recognition algorithm that takes events generated by the Input Model, matches them with pre-defined event relationships, and outputs logical events, which can be directly used by applications. The system also provides a new class template called Attribute. By declaring variables of a standard type as an Attribute instance, the variables are able to store the history of values for retrieval at any given time. DEMIS is designed to be flexible enough to fit within current windowing systems and may be deployed in part or in whole. Attribute and the Input Model can each be used individually, depending on the goal of the application; the Event Model, however, depends on the Input Model. By using DEMIS as a support system, we believe that building applications will be more efficient and clear, and that the applications will be easier to maintain and update.
2. RELATED WORK Real-time interactive systems must handle two types of input: events that may occur at any time and values that change over time. Most systems that collect and handle input data for an interactive application shoehorn the data into one of these two categories. In contrast, Dix and Abowd (1996) treat events as atomic, non-persistent occurrences in the world and status as entities that persist and are observable in the world. The authors argue for a formal model that is specifically designed to reflect both the status and behavior of events within the interface. Their formalism includes two main components, a high-level requirements language and a low-level language for constructing widgets. According to their modified PIE model (which considers status-input as well as event-input and status-output), the researchers simply add event-outputs as found in integrators or simple event-response systems. Another approach, described in (Jacob, Deligiannidis, and Morrison, 1999), combines a data-flow or constraint-like component for the continuous relationships with an event-based component for discrete interactions. Individual continuous relationships can be enabled or disabled. This model abstracts away many of the details of specific input devices, treating them only in terms of the discrete events they produce and the continuous values they provide. Both of these works accomplished much in identifying the significance of events in changing state or relationships between state changes. However, since events and the data affected are combined, they ignore the possible changes outside the effects of events. Elliott et al. (1994) present a paradigm and toolkit, called TBAG, for the rapid prototyping of interactive, animated 3D graphics programs. It is designed to provide a fundamentally continuous treatment of naturally continuous phenomena such as time and motion. The solution given by TBAG is a demand-driven constraint system, not an event-driven system.
Constraints can be established and removed dynamically. Elliott (2000) proposes a declarative approach to event-oriented programming based on a powerfully expressive event language. Both works are
based on Haskell, a general purpose, purely functional programming language. Much work has been done on the recognition of composite events that describe a set of individual events and their relationship to each other. Some of this work has been applied in areas such as acoustic analysis (Paliouras, 1997), the connection between vision and natural language systems (Andre, Herzog, and Rist 1988), on-line monitoring (Al-Shaer, Abdel-Wahab, and Maly, 1997), Active Databases (Motakis and Zaniolo, 1997), and multimodal interfaces (Oviatt and Cohen, 2000). The TCN system (Paliouras, 1997) uses a set of rules to determine high-level events based on a directed acyclic graph (DAG). TCN uses bottom-up, backward reasoning, propagation and self-convergence algorithms to map the sequence of the low-level input event stream to high-level events. The algorithm starts to infer a high-level event only when it encounters the terminal events. A high-level event cannot be instantiated unless all its sub-events are completely recognized. This assumption is acceptable when events are processed after they have all been collected, but does not hold for real-time interactive systems where the events are continuously collected. Unlike TCN, the core of the VITRA system, described by Andre, Herzog, and Rist (1988), is a course diagram. The work of these authors builds an event model for each high-level event. For each event model, there is one special instance (daemon event), waiting for a trigger-condition. When the condition becomes true, the daemon event will automatically generate a new daemon to wait for the next trigger-condition to become true. Compared with TCN, VITRA can recognize events even when they are incomplete. However, the event handler is complex. Algorithms in the Hifi system (Al-Shaer, Abdel-Wahab, and Maly, 1997) are intended to reside on different sites of a distributed system to serve as separate filtering functions.
The system supports three kinds of event filtering procedures: identity-based (which determines the generator), content-based (which checks for a valid attribute value), and correlation-based (which checks for a given relationship among a set of events). Only events that pass through the three filtering procedures are sent to applications. The UCLA system (Motakis and Zaniolo, 1997) describes semantic descriptions and classifications of composite events. The paper introduces the logical concepts of And, Or, Seq and Not events. The And event indicates that all sub-events should occur for the composite event to occur. The Or event indicates that the occurrence of any sub-event leads to the occurrence of the composite event. A set of sub-events that occur in time sequence makes up a Seq event. A Not event occurs when no instance of the sub-event occurs. The system uses formal semantics to define the meaning of the original event expressions in applying the method to an active database. The QuickSet system (Oviatt and Cohen, 2000) allows for events which may not be discrete and sequenced in order but continuous and simultaneous. It uses a novel hierarchical recognition technique called Members-Teams-Committee (MTC), which is 3-tiered, executing in a bottom-up, layer-by-layer manner. Events from the voice modality and the gesture modality are separately
recognized using the respective modality knowledge; composite events of voice and gesture require a specific integrator. When the number of input modalities increases, the number of integrators will also increase, leading to greater complexity of event recognition.
3. DEMIS DEMIS is a middleware library between the operating system (Host OS) and the application (APP). As shown in Figure 1, it consists of two subsystems (“input-model” and “event-model”) and the “attribute” class.
[Figure 1 diagram: blocks labeled Host OS, Input-Model, Event-Model, Attribute, and APP]
We implement Attribute as a template class so that it can be used to represent many types of data. Each instantiated attribute stores a linked list of nodes, each representing a constant value or a function for a time period. The list is sorted in time-descending order, since we generally care more about recent changes. An attribute can also change its value at any given time. For example, consider an attribute with its value defined within [ts, ∞). At a specific time t, t > ts, something happens that causes the attribute to call method setVatT with the new value and t. The attribute will keep the old value before t, and use the new value starting at t. This is especially useful in an interactive virtual environment, because the data values change frequently.
Figure 1. DEMIS system overview
In the next section, we discuss the Attribute class in detail. In section 3.2, we describe the Input-Model. In section 3.3, we describe our Event Model, which includes events, event trees, event types, an event recognition algorithm, and two policies used to help recognition. In the last section, we explain how these parts work together.
3.1 Attribute In our system, we introduce a new base class for data, called Attribute. Attribute differs from other data-types in that it can store and retrieve data by time. Each attribute-type data has a “start time”, an “end time”, and “getVatT” and “setVatT” methods. Instances of Attribute can not only store constant values, but can also store functions for segments of time, as shown in Figure 2.
[Figure 2 plot: attribute value (levels v1 and v2) against time (ms), with time marks ts, t1, t2, te]
Figure 2. Attribute Example
Figure 2 shows an attribute whose lifetime is from ts to te, including ts and excluding te. Its value, however, changes during this period. It has a constant value of v1 within the interval [ts, t1). It is represented by a time-based function within [t1, t2), and is, again, a constant value v2 within [t2, te). Usually, an attribute can have a time span over the interval [ts, t∞) (where ts stands for the start time of the system, and t∞ means it does not change as long as the system is running). We use the notation [t1, t2) instead of [t1, t2] to show that discrete changes occur at t2, as demonstrated by Figure 2. Figure 3 shows an example of a discrete change, which occurs at t2.
[Figure 3 plot: attribute value against time (ms), with time marks t1, t2, t3]
Figure 3. Example of Discrete Change of an Attribute
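The behavior described in this section can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the paper names only the class Attribute and the methods getVatT and setVatT, so the Segment helper, the use of a Python list in place of a linked list, the open-ended lifetime [ts, ∞), and the choice to discard segments that start at or after the time of a new assignment are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar, Union

T = TypeVar("T")


@dataclass
class Segment(Generic[T]):
    """Hypothetical history node: a constant or a function of time, valid from `start`."""
    start: float
    value: Union[T, Callable[[float], T]]


class Attribute(Generic[T]):
    """Time-indexed value with an open-ended lifetime [start_time, infinity)."""

    def __init__(self, start_time: float, initial: Union[T, Callable[[float], T]]):
        self.start_time = start_time
        # Newest segment first (time-descending order, as in the text).
        self._segments: List[Segment[T]] = [Segment(start_time, initial)]

    def setVatT(self, value: Union[T, Callable[[float], T]], t: float) -> None:
        """Use `value` from time t onward; the history before t is kept."""
        if t < self.start_time:
            raise ValueError("t lies before the attribute's start time")
        # Assumption: segments starting at or after t are overwritten.
        self._segments = [s for s in self._segments if s.start < t]
        self._segments.insert(0, Segment(t, value))

    def getVatT(self, t: float) -> T:
        """Return the value at time t, evaluating a stored function if needed."""
        if t < self.start_time:
            raise ValueError("t lies before the attribute's start time")
        for seg in self._segments:  # newest first
            if seg.start <= t:
                # Assumption: a callable entry is a function of time, not a constant.
                return seg.value(t) if callable(seg.value) else seg.value
        raise ValueError("no segment covers t")


# Shape of Figure 2: constant, then a time-based function, then constant again.
height = Attribute(0.0, 1.0)                                  # v1 on [0, 100)
height.setVatT(lambda t: 1.0 + 0.5 * (t - 100.0), 100.0)      # function on [100, 200)
height.setVatT(3.0, 200.0)                                    # v2 on [200, infinity)
print(height.getVatT(50.0), height.getVatT(150.0), height.getVatT(200.0))
```

Because each segment is valid on a half-open interval, a query at exactly t = 200 already returns the new value, which matches the [t1, t2) convention used above.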
3.2 Input Model (IM) The Input Model has two important functions in DEMIS: it serves as an extensible device database, and it performs low-level event recognition. As a device database, the input model manages all devices attached to the system. We treat various input devices in a general way. Each device is of some specific source type (e.g. a mouse is a mouse-type device and a keyboard is a keyboard-type device). Each source type may have multiple actual devices (e.g. we can have two mice attached to the host machine); therefore, we use device IDs to identify different devices of the same source type. Each device consists of several components that produce input values (e.g. a 2D mouse has the components left button, right button, and 2D cursor position, while a 3D mouse has a 3D position and orientation component rather than a 2D positional component). The model also includes the concept of active components. Each device type has a set of components, with the opportunity to specify which components are enabled or active, and which components are not actually used. The system provides two standard input devices: one keyboard and one mouse. Other kinds of input devices, such as a tracking device or 3D mouse to be used in a virtual reality application, can be registered into the device database. We use the vector [source type, device-ID, component] to identify the source of an event. For example, a [mouse, 0, left button] event comes from the left button on the first mouse of the computer system, while a [mouse, 1, right button] event comes from the right button on the second mouse. Similarly, a click event from [mouse-source, 0, left-button] stands for a left mouse button click from the first mouse. In order to identify the event source, the input model also performs low-level event recognition. Each registered source type has its own recognition process to recognize its own source events. This process is directly related to the operating system.
For example, for mice and keyboard sources on the Windows OS,
the process will read directly from the Windows message queue. Provided with the message type and extra information, Windows events are labeled by the source vector and translated into primitive DEMIS events. These events can be used in the event model described in the next section. The input model not only encapsulates events in a general way, but also uses an input data type to represent data uniformly. In Windows, event data is hidden inside two event parameters, making it hard to understand and use. DEMIS introduces a unified base class, InputData, and the data type associated with each kind of event is a subclass of it.
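The device database and the source vector can be sketched roughly as follows. This is only an illustration under stated assumptions: the names SourceVector, Device, DeviceDatabase, register, and source are hypothetical and do not come from the paper, and device IDs are assumed to be assigned per source type in registration order, starting from 0.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple, Optional, Set, Tuple


class SourceVector(NamedTuple):
    """[source type, device-ID, component], as described in the text."""
    source_type: str
    device_id: int
    component: str


@dataclass
class Device:
    source_type: str
    device_id: int
    components: Set[str]
    active: Set[str]  # the subset of components that is enabled


class DeviceDatabase:
    """Hypothetical registry of the devices attached to the system."""

    def __init__(self) -> None:
        self._devices: Dict[Tuple[str, int], Device] = {}
        self._next_id: Dict[str, int] = {}

    def register(self, source_type: str, components: Iterable[str],
                 active: Optional[Iterable[str]] = None) -> Device:
        # Assumption: IDs count up from 0 separately for each source type.
        device_id = self._next_id.get(source_type, 0)
        self._next_id[source_type] = device_id + 1
        comps = set(components)
        enabled = set(comps) if active is None else set(active)
        if not enabled <= comps:
            raise ValueError("active components must belong to the device")
        device = Device(source_type, device_id, comps, enabled)
        self._devices[(source_type, device_id)] = device
        return device

    def source(self, source_type: str, device_id: int, component: str) -> SourceVector:
        """Build the source vector for an event, rejecting inactive components."""
        device = self._devices[(source_type, device_id)]  # KeyError if unregistered
        if component not in device.active:
            raise ValueError(f"{component!r} is not an active component")
        return SourceVector(source_type, device_id, component)


db = DeviceDatabase()
db.register("keyboard", ["keys"])
db.register("mouse", ["left button", "right button", "2D position"])
db.register("mouse", ["left button", "right button", "2D position"],
            active=["left button", "right button"])
print(db.source("mouse", 1, "right button"))
```

Keeping the active set separate from the full component set lets a device stay registered while some of its components are ignored, which is one way to read the "active components" concept above.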
3.3 Event Model (EM) Our event model (EM) is built upon the input model described above, and is therefore not directly tied to the host operating system. Events in DEMIS have several features: time, source, and relationship. An event uses time stamps to represent the occurrence time as well as the stop time. This time information is important for determining causality between events. The source of the event is denoted by the event source vector discussed in the previous section. It is our goal to make a clear division between the application (APP) and the event model so as to relieve most pressure from the application side. The DEMIS event model performs the complex task of identifying relationships between different events and providing events tailored to the APP. In the following sections, we will discuss how an event is expressed in DEMIS, relationships between events, and finally, our event recognition algorithm.
3.3.1 Event We define an event set as E = (N, T, S, G), where E is the set of events, N is the set of event name strings, T is the set of times, S is the set of sources and G is the set of Event Trees (ET). Event Trees will be described in more detail later. For each e∈E, there is an event name, n(e)∈N, a start time, st(e)∈T, an end time, et(e)∈T, a source vector, s(e)∈S, and an event tree, g(e)∈G. Each event type has a unique name to aid in discrimination among different events. An event can be specified as e[t1, t2], where e∈E, t1 = st(e), t2 = et(e), and t1 ≤ t2. For combinations of events, the time relationship is more complicated. For example, we define two temporal relations between events as follows (Figure 4).
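As a rough illustration, an event with its time stamps and the two temporal relations of Figure 4 might be written as below. This is a sketch under assumptions rather than the DEMIS code: the names Event, precedes, and concurrent are hypothetical, the event tree g(e) is omitted, and the concurrency test is taken to be plain interval overlap, which is the reading consistent with the sequential condition (end of the first event no later than the start of the second).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """An event e[start, end] with a name n(e) and a source vector s(e)."""
    name: str
    start: float                     # st(e)
    end: float                       # et(e)
    source: Tuple[str, int, str]     # [source type, device-ID, component]

    def __post_init__(self) -> None:
        if self.start > self.end:
            raise ValueError("start time must not exceed end time")


def precedes(a: Event, b: Event) -> bool:
    """a -> b: a has ended by the time b starts."""
    return a.end <= b.start


def concurrent(a: Event, b: Event) -> bool:
    """a || b: assumed to mean that the two time intervals overlap."""
    return a.start < b.end and b.start < a.end


src = ("mouse", 0, "left button")
e1 = Event("e1", 0.0, 10.0, src)
e2 = Event("e2", 15.0, 20.0, src)
e3 = Event("e3", 5.0, 15.0, src)

print(precedes(e1, e2))                        # True: e1 ends before e2 starts
print(concurrent(e1, e3), concurrent(e3, e1))  # True True: the relation is symmetric
print(concurrent(e1, e2))                      # False: the intervals do not overlap
```

Under this reading, two events with positive duration are either concurrent or ordered by the sequential relation, never both.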
3.3.2 Event-Tree There are two categories of events in DEMIS, the primitive event (PE) and the composite event (CE). A primitive event is the simplest event. It represents basic events that can be detected by the input model (e.g. “mouse-click”). A composite event consists of a set of events that can be either primitive events or composite events. For example, a “multi-click” event could be expressed as a set of “mouse-click” events. These “mouse-click” events are called sub-events of the “multi-click” event. In DEMIS, a primitive event is simply expressed by name, time, and source; a composite event is specified by an event-tree (ET). Figure 5 shows an example of an event-tree. The root of the tree represents the composite event, e, while e1, e2, e3 are sub-events of e.
[Figure 5 diagram: root node e with child nodes e1, e2, e3]
Figure 5. Example Composite Event
An event tree is made up of nodes and edges, where nodes represent events and edges represent relationships between events. A circle node represents a primitive event while a rectangle node represents a composite event. A one-way edge represents a “consists of” relationship between a composite event and its sub-events. All sub-events are labeled with ascending numbers from left to right. In Figure 5, for example, sub(e) = {e1, e2, e3}. In general, where ei < e denotes that ei is a sub-event of e,

$$\mathrm{sub}(e) = \begin{cases} \{\, e_i \mid e_i \in E,\ e_i < e,\ i \in N \,\}, & \text{iff } e \text{ is a CE} \\ \emptyset, & \text{iff } e \text{ is a PE} \end{cases}$$

Given e∈E, we define the degree of e as deg(e),

$$\deg(e) = \begin{cases} 0, & \text{iff } e \text{ is a PE} \quad (1) \\ \deg(e_1), & \text{iff } e \text{ is a “not” CE and } e_1 < e \quad (2) \\ \max(\deg(e_1), \ldots) + 1, & \text{iff neither (1) nor (2), and } e_i < e \quad (3) \end{cases}$$
In Figure 5, deg(e1) = deg(e2) = deg(e3) = 0; deg(e) = max(deg(e1), deg(e2), deg(e3)) + 1 = 1. deg(e) is an important factor in the event recognition algorithm, as we will demonstrate later. In DEMIS, we define several types of composite events.
[Figure 4 diagram: events e1, e2, and e3 on a time axis with marks t1, t5, t2, t3, t4]
• Concurrent event, which we diagram as a rectangle node with a “||” symbol. Given e∈E, if ei∈sub(e) and ej∈sub(e), then ei || ej.
• Sequential event, which we diagram as a rectangle node with a “~” or “→” symbol. Given e∈E, if ei∈sub(e), ej∈sub(e), and i < j, then ei → ej.
a) e1 → e2    b) e1 || e3
Figure 4. Example Relationships Between Two Events
Given e1[t1, t2] and e2[t3, t4], we define e1 → e2 iff t2 ≤ t3. Figure 4(a) shows an example of this. Given e1[t1, t2] and e3[t5, t3], we define e1 || e3 iff (t5 < t2) and (t1 < t3), that is, iff the two time intervals overlap. Figure 4(b) shows an example of the second case. Obviously, if e1 || e3, then e3 || e1 as well.
• Not event, which we diagram as a rectangle node with a “!” symbol. A not event has only one sub-event. If the sub-event happens, the not event cannot occur. Not events are used extensively in the event recognition (ER) algorithm.
• Repeating event, which is diagrammed as a rectangle node with a “-” symbol. A repeating event consists of a sub-event that starts the composite event; zero or more sub-events of the same type; and a sub-event that ends the composite event. This type of event is especially useful to allow multiple event objects of the same type to happen in a sequential way. For example, to draw a series of connected segments, we use a mouse click to get the start and end points of each segment. We keep the segment series growing using mouse click events, no matter how many times, and we stop it growing using a double-click mouse event. Therefore, all these events could be described as a repeating composite event: a click as the “start” sub-event, followed by zero or more clicks, then followed by a double-click as the “end” sub-event.
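The event tree, the composite event types listed above, and the functions sub(e) and deg(e) can be put together in a short sketch. It is only an illustration: Kind, EventNode, and the string labels of the enumeration are hypothetical names chosen here, and the structural checks in the constructor are assumptions derived from the prose (a primitive event has no sub-events, a not event has exactly one).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Kind(Enum):
    PRIMITIVE = "PE"
    CONCURRENT = "||"
    SEQUENTIAL = "->"
    NOT = "!"
    REPEATING = "-"


@dataclass
class EventNode:
    """A node of an event tree; composite nodes list their sub-events left to right."""
    name: str
    kind: Kind = Kind.PRIMITIVE
    children: List["EventNode"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.kind is Kind.PRIMITIVE and self.children:
            raise ValueError("a primitive event has no sub-events")
        if self.kind is not Kind.PRIMITIVE and not self.children:
            raise ValueError("a composite event needs at least one sub-event")
        if self.kind is Kind.NOT and len(self.children) != 1:
            raise ValueError("a not event has exactly one sub-event")


def sub(e: EventNode) -> List[EventNode]:
    """sub(e): the direct sub-events of e (empty for a primitive event)."""
    return list(e.children)


def deg(e: EventNode) -> int:
    """deg(e), following the three cases of the definition."""
    if e.kind is Kind.PRIMITIVE:
        return 0                                  # case (1)
    if e.kind is Kind.NOT:
        return deg(e.children[0])                 # case (2)
    return max(deg(c) for c in e.children) + 1    # case (3)


clicks = [EventNode(f"mouse-click-{i}") for i in (1, 2, 3)]
multi_click = EventNode("multi-click", Kind.SEQUENTIAL, clicks)
no_click = EventNode("no-click", Kind.NOT, [EventNode("mouse-click")])
combined = EventNode("combined", Kind.CONCURRENT, [multi_click, no_click])

print(deg(multi_click))  # 1, as for the tree in Figure 5
print(deg(no_click))     # 0, a not event takes the degree of its sub-event
print(deg(combined))     # 2
```

Since a not event does not add a level, wrapping a sub-event in a not node leaves the degree of the enclosing composite event unchanged.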
3.3.3 Event Recognition Algorithm Our event recognition algorithm operates at two levels: low-level event recognition and high-level recognition. The low-level event recognition is accomplished in the input model stage discussed earlier. The event-model handles high-level recognition, which we will describe below. Once a composite event type is described by the APP, that specific type of event is registered with the DEMIS run-time system, which will determine when instances of composite events occur. Our algorithm will generate possible composite events whose sub-events match events that have occurred and have been recognized. Instances of an event are created when conditions occur that make it possible that the event has occurred or will occur. After it is created, the state of the event instance is described by the Finite State Machine diagram given in Figure 6.
[Figure 6 diagram: states sNew, sProcess, sFlag, sHold, sExit; transition labels: part recog, fully recog, dispatch, timeout, subeve of]
Figure 6. Event FSM Diagram
The “sNew” state represents the beginning state for a newly created event. “sProcess” represents the state an event is in while it is being recognized. Within this state, only some of the sub-events of the composite event have been recognized. Once all of its sub-events occur as given by the composite event specification, the event will enter the “sFlag” state, which indicates that the event has been recognized. The state may
proceed to the “sHold” state if the event is a sub-event of other events that might be recognized. The “sExit” state is reached when this event is no longer used, and therefore can be destroyed. The event object is dynamically created; once the time constraints fail for the composite event, the event is “timed out” and enters the “sExit” state. In this way, the dynamically created event object is dynamically validated or discarded. An event could be dispatched to APP if it is a recognized composite event or a sub-event of some other recognized composite event. A primitive event is recognized by the input model and labeled “sFlag” before it enters the event model. The recognition of composite events is more complicated. Each composite event has an event tree. A composite event’s sub-events may be primitive events or composite events with their own event trees. The complexity involved in the recognition of a composite event increases with the event’s degree. First, we consider the simplest case, where given a composite event e, deg(e) = 1. Based on e’s type, there will be one of several cases as described below.
1) Sequential event Let E = (N, T, S, G), given a composite event e∈E, sub(e) = { ei | i∈N }, e is called a sequential event if ei → ej (where i