Software Solutions for Self-Organizing Multimedia-Appliances

Michael Hellenschmidt, Thomas Kirste Fraunhofer Institute for Computer Graphics, Darmstadt, Germany

Abstract The vision of Ambient Intelligence is based on the ubiquity of information technology, the presence of computation, communication, and sensorial capabilities in an unlimited abundance of everyday appliances and environments. But enabling an ensemble of devices to spontaneously act and cooperate coherently requires software technologies that support self-organization. We discuss the salient properties of such a software infrastructure and propose a solution to these challenges, the “SodaPop” middleware. SodaPop uses a two-stage approach to structuring multi-agent systems and provides unique facilities for coordinating the activities of competing agents. Furthermore, we describe the application of SodaPop to realize a smart conference room. Here we introduce a principal component topology that is dynamically extensible with new devices. The article ends with a description of some resolution strategies and of an application programmer's interface that is available from our project site. Key words: Ambient Intelligence, Multimedia Appliances, Middleware, Self-Organization

1 Introduction

The vision of Ambient Intelligence [1–3] is based on the ubiquity of information technology, the presence of computation, communication, and sensorial capabilities in an unlimited abundance of everyday appliances and environments.

Preprint submitted to Elsevier Science, 23 April 2004

Fig. 1. Typical environments we’d like to be smart: High-tech conference rooms

A rather popular scenario illustrating this vision is the “smart conference room” (or “smart living room”, for consumer-oriented projects) that automatically adapts to the activities of its current occupants (cf. e. g., [4–6]). Such a room might, for instance, automatically switch the projector to the current lecturer's presentation as she approaches the speaker's desk 2 , and subdue the room lights—turning them up again for the discussion. Of course, we expect the environment to automatically fetch the presentation from the lecturer's notebook. And the lecturer should be able to use her own wireless presentation controller to move through her slides—if she doesn't have one, she might use a controller device at the speaker's desk.

Such a scenario doesn't sound too difficult: it can readily be constructed from common hardware available today and, using pressure sensors and RFID tagging, doesn't even require expensive cameras and difficult image analysis to detect who is currently at the speaker's desk. Setting up the application software for this scenario that drives the environment's devices in response to sensor signals doesn't present a major hurdle either. So it seems as if Ambient Intelligence is rather well understood, as far as information technology is concerned. Details like image and speech recognition, as well as natural dialogues, of course need further research, but building smart environments from components is technologically straightforward, once we understand what kind of proactivity users will expect and accept.

But this holds only as long as the device ensembles that make up the environment are anticipated by the developers. Today's smart environments in the various research labs are usually built from devices and components whose functionality is known to the developer. So, all possible interactions between devices can be considered in advance, and suitable adaptation strategies for coping with changing ensembles can be defined. When looking at the underlying software infrastructure, we see that the interaction between the different devices, the “intelligence”, has been carefully handcrafted by the software engineers who have built this scenario. This means: significant (i. e., unforeseen) changes of the ensemble require a manual modification of the smart environment's control application.

2 For the smart living room, such as [7], this reads: “switch the TV set to the user's favorite show, as he takes his seat on the sofa.”

This is obviously out of the question for real world applications, where people continuously buy new devices for embellishing their home. And it is a severe cost factor for institutional operators of professional media infrastructures such as conference rooms and smart offices.

As an example of such changes, consider the smart conference room above: if one participant's notebook has a built-in camera, the room could additionally support video conferencing and gesture interaction. Things can be even more challenging: imagine a typical ad hoc meeting, where some people meet in a perfectly average room. All attendants bring notebook computers, at least one brings a beamer, and the room has some light controls. Of course, all devices will be accessible by wireless networks. So it would be possible for this chance ensemble to provide the same assistance as the deliberate smart conference room above. Enabling this kind of ambient intelligence, the ability of devices to configure themselves into a coherently acting ensemble, requires more than setting up a control application in advance. Here, we need software infrastructures that allow a true self-organization of ad-hoc appliance ensembles, with the ability to accommodate non-trivial changes to the ensemble. (See also [8] for a similar viewpoint on this topic.)

In this paper, we discuss the salient properties of such a software infrastructure and propose a solution to these challenges, the “SodaPop” system. SodaPop uses a two-stage approach to structuring multi-agent systems and provides unique facilities for coordinating the activities of competing agents.

The further structure of this paper is as follows: In Section 2, we review the requirements for self-organizing ensembles. In Section 3, we introduce our solution proposal for a software infrastructure that supports such ensembles.

Section 4 then outlines how a typical ensemble is managed with the help of this infrastructure. Based on this, a comparison of our approach with other activities is then given in Section 5. Finally, in Section 6, we outline the next steps.

2 Requirements

When looking at the challenges of self-organization as indicated in the previous section, we can distinguish two different aspects of self-organization:

Architectonic Integration refers to the integration of the device into the communication patterns of the ensemble. For instance, the attachment of an input device to the ensemble's interaction event bus.

Operational Integration describes the aspect of making new functionality provided by the device (or emerging from the extended ensemble) available to the user. For instance, if you connect a CD player to an ensemble containing a CD recorder, the capability of “copying” will now emerge in this ensemble.

Although a thorough coverage of “self-organization” requires the handling of both aspects, we concentrate on the aspect of architectonic integration in this paper 3 .

3 Operational integration can be realized based on an explicit modeling of the semantics of device operations as “precondition / effect” rules, which are defined over a suitable environment ontology. These rules can then be used by a planning system for deriving strategies for reaching user goals, which consider the capabilities of all currently available devices. See [9] for details.

A central requirement for a software infrastructure supporting architectonic integration is that it should support ensembles that are built from individual devices in an ad hoc fashion by the end user. This situation is for instance common in the area of home entertainment infrastructures, where users liberally mix devices from different vendors. From this it follows that the infrastructure must not rely on a central controller—any device must be able to operate stand-alone. Furthermore, some infrastructures may change over time—due to hardware components entering or leaving the infrastructure or due to changes in the quality-of-service available for some infrastructure services, such as bandwidth in the case of wireless channels. Therefore, such an architecture should meet the following objectives:

• Ensure independence of components,
• Allow dynamic extensibility by new components,
• Avoid central components (single points of failure, bottlenecks),
• Support a distributed implementation,
• Allow flexible re-use of components,
• Enable exchangeability of components,
• Provide transparent service arbitration.

When interacting with a smart environment, users may be giving a lecture, enjoying a TV show at home, calling a colleague over the mobile phone, etc. These very different situations not only influence the strategies for ambient intelligence provided by the environment—they also have a strong impact on the hardware infrastructure available for implementing this behavior. It becomes clear that a broad range of different hardware infrastructures has to be considered as implementation platform—for example:

• mobile personal communicators with wireless access to stationary servers,
• wearable computers using augmented reality displays for interaction,
• a set of networked consumer electronic components, without a central controller,
• the local information network of modern cars,
• the user's PC at home, communicating with globally distributed information servers,
• public terminals, etc.

From these considerations, substantial challenges arise with respect to the software infrastructure that is required for implementing the ambient intelligence. It needs to support functions such as:

• Distributed implementation of components. As soon as more than one host is available (or required) for implementing the architecture, a distribution scheme must be developed. The distribution scheme may either simply allocate different functional components on different hosts (relying on the assumption that inter-component communication is less frequent than intra-component communication) or it may distribute individual components across multiple hosts (making each component virtually available everywhere, but creating challenges with respect to managing a consistent internal state). Clearly, the right choice depends on the concrete infrastructure that is available.

• Communication mechanisms. Once a distributed implementation is considered, the choice of communication concept is no longer a matter of taste. Distributed shared memories or distributed blackboards, for example, are a much more heavyweight communication scheme than message passing and bus architectures—but they simplify communication design for knowledge-based systems. Again, the right choice cannot be made without considering the concrete infrastructure, the specific communication needs of the components in question, and the distribution model.

• Ad-hoc discovery of system components. In some infrastructures, new components may join an existing system in an ad-hoc fashion. Consider, e. g., a personal communicator establishing contact to a point-of-sales terminal, where both components are equipped with their own version of the assistance system. Both systems must be able to integrate with each other and discover each other's components and functionalities, in order to provide an interoperable service (such as using the mobile communicator's input and output devices for the stationary terminal's controller application).

In the next section, we introduce a software infrastructure for managing self-organizing ensembles that we have developed based on these considerations.

Fig. 2. Devices and Data Flows (left: devices on their own, each with its private User Interface → Control Application → Actuators pipeline, e. g. Device A (TV set) and Device B (VCR); right: the same devices ensembled via a shared Event Channel and Action Channel)

3 A Self-Organizing Middleware

3.1 Devices and Data Flows

When developing a middleware concept, it is important to look at the communication patterns of the objects that are to be supported by this middleware. For smart environments, we need to look at physical devices which have at least one connection to the physical environment they are placed in: they observe user input, or they are able to change the environment (e. g., by increasing the light level, by rendering a medium, etc.), or both. When looking at the event processing in such devices, we may observe a specific event processing pipeline, as outlined in Figure 2: devices have a User Interface component that translates physical user interactions to events, the Control Application is responsible for determining the appropriate action to be performed in response to such an event, and finally the Actuators physically execute these actions. It seems reasonable to assume that all devices employ a similar event processing pipeline (even if certain stages are implemented trivially, being just a wire connecting the switch to the light bulb). It would then be interesting to extend the interfaces between the individual processing stages across multiple devices, as outlined on the right-hand side of Figure 2. This would allow a dialogue component of one device to see the input events of other devices, or it would enable a particularly clever control application to drive the actuators provided by other devices. By turning the private interfaces between the processing stages in a device into public channels, it might be possible to achieve an architectonic integration. So, the underlying approach of our proposal for a middleware is to develop a system model that provides the essential communication patterns of such data-flow based multi-component architectures.

The model we have developed so far is called SODAPOP (for: Self-Organizing Data-flow Architectures suPporting Ontology-based problem decomPosition). In the following, we give a brief overview of the salient features of this model. Note that the “channels” outlined in Figure 2 are not the complete story. Much more elaborate data processing pipelines can easily be developed (such as outlined in [10]). The point of SODAPOP is not to fix a specific data flow topology, but rather to allow arbitrary such topologies to be created ad hoc from the components provided by the devices in an ensemble.

3.2 Basic Elements of SODAPOP

The SODAPOP model [11] introduces two fundamental organization levels:

• Coarse-grained self-organization based on a data-flow partitioning.
• Fine-grained self-organization for functionally similar components, based on a kind of “pattern matching” approach.

Consequently, a SODAPOP system consists of two types of elements:

Channels, which read a single message at a time and map it to multiple messages which are delivered to components (conceptually, without delay). Channels have no memory, may be distributed, and they have to accept every message. Channels provide for the spatial distribution of a single event to multiple transducers.

Transducers, which read one or more messages during a time interval and map them to one (or more) output messages. Transducers are not distributed, they may have a memory, and they do not have to accept every message. Transducers provide for the temporal aggregation of multiple events to a single output. In general, a transducer may have multiple input and output channels (m : n, rather than just 1 : 1). The “User Interface” or “Control Application” boxes in Figure 2 are transducers.

The criterion for discriminating between transducers and channels is the amount of memory they may employ for processing a message – i. e., the complexity they create when trying to implement them in a distributed fashion: channels may use no memory. This requirement clearly makes sense when considering that we may want to use channels as “cutting points” for distributing a system: implementing distributed shared memory is expensive. Communication primitives for potentially distributed systems therefore should not provide such a facility “for free”. In addition, the “no memory” constraint provides a hard criterion for discriminating between the functions a channel is allowed to provide and the functions that require a transducer.

Finally, it becomes obvious that persistence functionality (such as provided by blackboard-based communication infrastructures, e. g. LINDA [12] or FLiPSiDE [13]) shall not be part of a channel, as persistence clearly violates the concept of memory-free channels.
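To make the distinction concrete, the following Java sketch models the two element types. It is illustrative only and is not the actual SODAPOP API (which is shown in Section 4.3); the interface names and signatures are our own.

    import java.util.List;

    // Illustrative sketch only, not the actual SODAPOP API (cf. Section 4.3).
    // A channel is memory-free: routing may depend only on the incoming
    // message (and the subscribers' utilities), never on stored state, so
    // channels can serve as "cutting points" for distributing a system.
    interface Channel<M> {
        List<M> route(M message);   // must accept every message, conceptually without delay
    }

    // A transducer may keep internal state: it can aggregate several input
    // messages over a time interval before emitting output, and it may
    // refuse messages it cannot handle.
    interface Transducer<I, O> {
        boolean accepts(I message);
        List<O> consume(I message); // may emit nothing until a sequence is complete
    }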

3.3 Channels & systems

Channels accept (and deliver) messages of a certain type t; transducers map messages from a type t to a type t′. A system is defined by a set of channels and a set of transducers connecting these channels. So, a system is a graph where channels represent points (nodes) and transducers represent edges 4 . Channels and transducers are equally important in defining a system. Channels are identified via Channel Descriptors. Conceptually, channel descriptors encode the channel's ontology (the meaning of the messages), so that transducers can be automatically connected to channels that speak the languages they understand.

4 Rather: a multigraph, because we may have several edges connecting the same two nodes.

3.4 Communication patterns

The middleware for multimodal event processing and multi-agent approaches should support at least the following two communication patterns:

• Events that travel in a data-flow fashion through the different transducers. When an event e is posted by a transducer t, it (t) does not expect a reply. Rather, it expects that other system components (i. e., the called transducers) know how to continue with processing the event.

• RPCs that resemble normal remote procedure calls. When an RPC is called by a transducer, it expects a result. Here, the calling transducer determines the further processing of the result.

Events and RPCs describe different routing semantics with respect to result processing. When considering the ensemble architecture in Figure 2, the flow from User Interface to Actuators is a typical event processing pipeline, where at each level we have a set of transducers that cooperate in order to translate an event received at the input (upper) level into an event posted at the output (lower) level. Event- and RPC-like result routing semantics correspond to the different types of channels a transducer may subscribe to. Event- and RPC-channels are the two basic channel types provided by SODAPOP.
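The difference between the two routing semantics can be summarized in two illustrative interfaces (again a sketch with hypothetical names, not the actual API):

    // Illustrative sketch, hypothetical names. An event is fire-and-forget:
    // the poster expects the receiving transducers to continue processing.
    interface EventChannel<E> {
        void post(E event);
    }

    // An RPC returns a result: the calling transducer itself determines
    // the further processing of that result.
    interface RpcChannel<Q, R> {
        R call(Q request);
    }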

3.5 Subscriptions

Events and RPCs are (in general) posted without specific addressing information: in a dynamic system, a sender can never be sure which receivers are currently able to process a message. It is up to the channel on which the message is posted to identify a suitable message decomposition and receiver set (service arbitration). A channel basically consists of a pipe into which event generators push messages (events or RPCs) which are then transmitted to the consumers (transducers) subscribing to this channel. When subscribing to a channel, an event consumer declares:

• the set of messages it is able to process,
• how well it is suited for processing a certain message,
• whether it is able to run in parallel to other message consumers on the same message,
• whether it is able to cooperate with other consumers in processing the message.

These aspects are described by the subscribing consumer's utility. A utility is a function that maps a message to a utility value, which encodes the subscriber's handling capabilities for the specific message. A transducer's utility may depend on the transducer's state. Examples for such utility functions are discussed in Section 4.

3.6 Message handling

When a channel processes a message, it evaluates the subscribing consumers' handling capabilities and then decides which consumers will effectively receive the message (the receiver set). Also, the channel may decide to decompose the message into multiple (presumably simpler) messages which can be handled better by the subscribing consumers. (Obviously, the consumers then solve the original message in cooperation.) The basic process of message handling is shown in Figure 3. How a channel determines the effective message decomposition and how it chooses the set of receiving consumers is defined by the channel's decomposition strategy. Both the transducers' utility and the channel's strategy are eventually based on the channel's ontology – the semantics of the messages that are communicated across the channel. Now, to summarize, self-organization is achieved by two means in SODAPOP:

Fig. 3. Basic message handling process in SODAPOP: a publishing transducer posts a message on the channel; the channel receives the request, evaluates the subscribing transducers' utility functions, decomposes the message, and delegates the parts to the selected subscribing transducers (e.g., actuator components)

(1) Identifying the set of channels that completely cover the essential message processing behavior for any appliance in the prospective application domain 5 .

(2) Developing suitable channel strategies that effectively provide a distributed coordination mechanism tailored to the functionality which is anticipated for the listening components.

Then any device is able to integrate itself autonomously into an ensemble, and any set of devices can spontaneously form an ensemble. In the following, we describe how the SODAPOP infrastructure can be used in a concrete application setting.

5 Admittedly, “essential” and “complete” are problematic notions, which would deserve a tangible definition. Additional work is required here.

4 Example

Our example environment is the “smart conference room” already outlined in Section 1. Such a high-tech conference room should react to device changes or to the actions of persons in a way that we can consider “intelligent”. To give some examples: If a person stands up and walks to the speaker stand, the microphone of the speaker stand should be switched on, and the room lights and the speaker's desk lighting should be set appropriately for the lecturer and the audience. Furthermore, if the lecturer possesses a personal notebook computer that holds the presentation, the presentation should start immediately as the lecturer reaches the speaker stand. And if we additionally assume that the lecturer brings along a new device needed for the presentation (e.g. modern loudspeakers for movies included in the presentation), we would want the conference room to be autonomously able to choose the best sound presentation device.

4.1 Devices and Channels

To realize this ambient intelligent behaviour, we first look at the devices that might be present in our conference room. We are considering:

• a microphone at the speaker stand, with a simple loudspeaker
• a beamer to display presentations
• two different lightings, one that lights the presentation area and one that lights the speaker stand
• a pressure sensor in the ground of the speaker stand
• pressure sensors installed in the chairs (of the first row of the audience area)
• places for personal devices, such as notebook computers

Based on these devices, we defined a fundamental component topology that realizes the above scenario. Using the SODAPOP approach for self-organizing device ensembles, we can identify three fundamental channels that group the components potentially contributed by the available devices into four processing levels:

• the level of sensory components, which emit atomic events. An atomic event can be a reaction to environment changes (e.g. from pressure sensors) as well as initiated by a person (e.g. by using an on/off switch)
• the level of parsing components, which analyse one or more atomic events and interpret those events as goals that represent environment changes intended by the user
• the level of assistant components, which are able to map the above goals to sequences of device actions that will achieve the desired effect
• the level of actors, which are able to accomplish the given function calls

According to this, we identified three channels connecting the component levels:

• the event channel, which allocates the events to the different parser components
• the goals channel, which passes a constructed goal from the parsing level to the most appropriate assistant component
• the operations channel, which passes concrete function calls to the most appropriate actor

Fig. 4. An overall topology for appliance ensembles: sensor components emit atomic events onto the event channel (strategy: event interpretation); parsers publish goals onto the goals channel (strategy: opinion based selection algorithm); assistants publish operations onto the operations channel (strategy: opinion based selection algorithm); actors execute them

Finally, in order to guarantee a dynamic workflow and extensibility of the device ensemble, we assigned strategies to each channel. The event channel acts upon the event interpretation strategy, as described in Section 4.4.2, whereas the goals and the operations channel use the opinion based selection algorithm (Section 4.4.1) in order to allocate tasks to the most appropriate assistant or actor component, respectively. Consequently, our overall component architecture for dynamic ambient intelligent device ensembles looks as illustrated in Figure 4. Possible workflows are highlighted in colour to point out that different devices, each bringing along their own sensor, parser, assistant, and actor components, can build a device ensemble dynamically. The topology in Figure 4 bears resemblance to the data flow of devices shown in Figure 2. However, our overall topology introduces two levels of interpretation: one for event interpretation, and one for goal interpretation and the construction of concrete function calls.
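As a sketch of what this topology means for an individual device, the following fragment wires a hypothetical room-lights device to the channels, mirroring the form of the Java API shown in Section 4.3. The channel name "operations" and the strategy identifier "obs" appear there; the channel names "events" and "goals", the strategy identifier "evi" for event interpretation, and the handler classes are our assumptions based on the topology described above.

    // A sketch only: "events"/"goals" and the strategy id "evi" are
    // assumptions; "operations" and "obs" follow Section 4.3.
    public class RoomLightsDevice {
        Channel events, goals, ops;
        public RoomLightsDevice() {
            // sensor side: the light switch writes atomic events
            events = new Channel("events", OutPipe, "evi");
            // assistant side: compete for goals such as "room light on"
            goals = new Channel("goals", InPipe, "obs");
            goals.subscribe(new RoomLightsAssistant(this));
            // actor side: compete for operations such as "darken room lights"
            ops = new Channel("operations", InPipe, "obs");
            ops.subscribe(new RoomLightsActor(this));
        }
    }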

4.2 Ensemble Dynamics

For our test scenario, we used a lab room with controllable lighting, a beamer (including the presenter PC), several pressure sensors, and a microphone (together with a simple loudspeaker component). To make notebooks aware of the presence of their user, we implemented a simple awareness software that emits an event containing the IP number of the notebook if the user presses a special shortcut. After implementing the channels as identified in Section 4.1 using the Java version of SodaPop (see Section 4.3), devices equipped with these definitions can readily be plugged together, then exhibiting a spontaneous “intelligent” behavior as expected. A typical ensemble based on this channel set is shown in Figure 5. Now we discuss the inner messaging dynamics of this ensemble and the emergence of “intelligence” as the ensemble is dynamically extended.

Fig. 5. A conference room's appliance ensemble: pressure sensors (seat / speaker stand), beamer, personal laptop awareness, room lights, speaker stand light, and microphone, each contributing its own sensor or switch, parser, assistant, and actor components

Referring to the initial situation depicted in Figure 5, each device's interaction component sends single events that are immediately claimed by its own parser component. A device's parser component publishes simple goals (e.g. “room light on”) to the goals channel; via the device's own assistant, these result in operations that are immediately (and only) claimed by the device's own actor component. So, each device is directly (and solely) controlled by its own interaction component. Not wrong, but then, not very intelligent (or interesting).

However, let's now dynamically extend the ensemble by a new device that provides another parser and another assistant component, as outlined in Figure 6. Assume that the new parser is configured to claim a sequence of events, composed of a chair occupation change event (if someone is standing up), the awareness event of a personal notebook, and a floor occupation event (provided by the floor sensor at the speaker stand), which are to be generated within a specific time interval (see the events marked with (1) in Figure 6). The parser interprets this sequence of events as the goal “prepare room for the presentation from pc with ip number x”, where x is the IP number of the notebook that emitted the awareness event, and publishes it on the goals channel (see (2) in Figure 6). Given this goal, the newly introduced assistant component will win the competition on the goals channel, and it will then create a strategy for achieving this goal by constructing the following action sequence:

• switch on beamer
• get presentation from notebook and start it
• switch speaker stand lights on
• darken room lights
• switch on microphone
• switch on loudspeakers

Each of these actions is then published on the “operations” channel, which distributes them to the different actors, going through another competition cycle for each action (see points (4) in Figure 6).

Fig. 6. Communication flow in an ensemble (the intelligent scenario): (1) atomic events from the seat sensor, the laptop awareness, and the speaker stand sensor; (2) the new parser publishes the goal; (3) the new assistant wins the competition for the goal; (4) operations are delegated to the individual actors; (5) the newly added loudspeakers join the competition

Looking back at our scenario, we observe that our lecturer has brought along a modern loudspeaker device she needs for her presentation. Plugged into the existing device environment, the new loudspeakers extend the topology as shown in Figure 6 at point 5. Now there are two possible loudspeakers available. Both receive the request to tender for the task “speakers on”, and both compete on the operations channel with the aspects they consider important for accomplishing the task. In our example ensemble, the new loudspeakers raise more aspects and claim more fidelity for them (e.g. stereo effect, powerful integrated amplifier) than the simple loudspeaker (for more details see Section 4.4.1). So the “operations” channel selects the new speakers as the device with the highest performance to accomplish the task “speakers on”. Both changes in ensemble behavior, as well as the original ensemble synthesis from the individual devices, have occurred spontaneously, without any human intervention, based on the devices' channel topology and the channels' competition strategies.

4.3 The Java API

For the SODAPOP infrastructure, a Java API is available. Also, several example components (such as for the scenario above) are written in Java. In the following, we give an example of how the core transducers of the simple loudspeaker (that comes along with the microphone) and of the new loudspeakers look when using this Java API. First the simple loudspeaker:

    1 public class MySpeaker {
    2     static Channel ops = null;
    3     public MySpeaker(){
    4         ops = new Channel("operations", InPipe, "obs");
    5         ops.subscribe(new MySpeakerHandler(this));}}

This means that a transducer constructs its part of the channel (see line 4) by giving the name of the channel, stating whether it wants to listen to channel messages (“InPipe”) or to write to the channel (“OutPipe”), and naming the strategy the channel should apply (here: obs, the opinion based selection algorithm). After that, for handling utility value operations and for executing tasks, a SodaPopHandler is subscribed to the channel (see line 5). The new loudspeakers are implemented in the same manner. As mentioned above, by means of their utility value functions the transducers provide the channel with the aspects they raise when receiving a request to tender. For the simple loudspeaker, the SodaPopHandler looks like:

    1  public class MySpeakerHandler extends SodaPopHandler {
    2      // utility value function
    3      public String utilityValue(String content){
    4          // evaluate the message and return the utility value
    5          if (content.equals("speaker on"))
    6              return "(sound (importance 1.0) (confidence 1.0) (fidelity 1.0))";
    7          else return "False";}
    8
    9      // message handler function
    10     public void doAct(String content){
    11         // switch loudspeaker on
    12     }}

Thus, the simple loudspeaker raises only one aspect (see line 6): it can play sound. But the new loudspeakers, equipped with an internal amplifier and a pair of stereo speakers, raise more aspects they consider relevant:

    1  public class NewSpeaker extends SodaPopHandler {
    2      public String utilityValue(String content){
    3          if (content.equals("speaker on")) {
    4              String v = "(sound (importance 0.25) (confidence 0.3) (fidelity 1.0))";
    5              v += "(stereo (importance 0.5) (confidence 0.4) (fidelity 0.8))";
    6              v += "(amplifier (importance 0.25) (confidence 0.3) (fidelity 1.0))";
    7              return v; }
    8          else return "False";}
    9
    10     // message handler function
    11     public void doAct(String content){
    12         // switch loudspeaker on
    13     }}

The utility value function of the new loudspeakers raises three different aspects, concerning the sound, the stereo effects, and the internal amplifier (see lines 4–6). It is obvious that the overall performance of the new loudspeakers is higher than the performance of the simpler ones. Thus, the task to switch on loudspeakers is given to the newer ones. Disconnecting the newer ones will restore the old situation.

4.4 Channel Strategies

4.4.1 Opinion Based Selection Algorithm

If we assume that there is more than one component that claims to be able to perform a given task, the channel has to deliver the task to the most appropriate component. But how can a channel provide this without precise knowledge about the application domain? In order to use the fundamental topology outlined in Figure 4 in different scenarios, the channel cannot provide any specific knowledge about the domain; otherwise the extension with new devices would not be possible. Thus, we need to provide a strategy that uses the components' knowledge about the application domain. We use the following approach: Each component taking part in a request to tender for accomplishing a task provides the channel with several aspects it considers relevant to solve the given task. The channel then creates an objectified opinion by combining the available domain experts' (the components') subjective opinions in a suitable way (see below). Finally, it rates each domain expert's solution proposal based on this objective opinion. For each aspect it raises, a component provides the channel with the following values:

• the relative importance of the aspect,
• a confidence value describing the confidence of the component that the aspect indeed has the assigned importance,
• and a fidelity value that describes how well the component thinks it can consider this aspect or adjust it to the ideal value.

Additionally, restrictions are defined:

• the importance and confidence values assigned to aspects must add up to 1, to prevent components from claiming arbitrarily high importance and confidence values,
• the fidelity value of each aspect has to be in the range [0, 1].

To compute the performance of each component, first all opinions have to be combined, and for each aspect the effective objective importance has to be calculated. Then, for each tender, the performance can be computed as the sum of all aspects' fidelities multiplied by their associated objective importance. The component with the highest performance can then be selected by the channel, and the task can be allocated to it. Note that this strategy needs no information on what an aspect means ontologically.

It is not interested in what the concrete ideal value is or what it means. The strategy computes the performance of each component only from the aspects that are given by the components. Thus, this strategy is applicable to any application domain and on any level of a component topology where the most appropriate component for a given task has to be found.
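To illustrate the computation, the following sketch rates tenders in the spirit of this strategy. The paper leaves the exact combination function open; the sketch assumes that the objective importance of an aspect is the normalized sum of importance times confidence over all tendering components (one plausible choice among several).

    import java.util.*;

    // A sketch of the opinion based selection algorithm. The combination of
    // subjective opinions into objective importances is NOT specified in the
    // paper; we assume: objective importance of an aspect = normalized sum
    // of importance * confidence over all tenders.
    class Aspect {
        final String name;
        final double importance, confidence, fidelity;
        Aspect(String name, double importance, double confidence, double fidelity) {
            this.name = name; this.importance = importance;
            this.confidence = confidence; this.fidelity = fidelity;
        }
    }

    class OpinionBasedSelection {
        // Returns the index of the tender (component) with the highest performance.
        static int select(List<List<Aspect>> tenders) {
            // 1. Combine the subjective opinions into objective importances.
            Map<String, Double> weight = new HashMap<>();
            double total = 0;
            for (List<Aspect> tender : tenders)
                for (Aspect a : tender) {
                    weight.merge(a.name, a.importance * a.confidence, Double::sum);
                    total += a.importance * a.confidence;
                }
            // 2. Performance of a tender: sum of fidelities times objective importance.
            int best = -1;
            double bestPerformance = -1;
            for (int i = 0; i < tenders.size(); i++) {
                double performance = 0;
                for (Aspect a : tenders.get(i))
                    performance += a.fidelity * (weight.get(a.name) / total);
                if (performance > bestPerformance) {
                    bestPerformance = performance;
                    best = i;
                }
            }
            return best;
        }
    }

With the two loudspeakers from Section 4.3, this assumed combination yields objective weights of about 0.80 for “sound”, 0.15 for “stereo”, and 0.06 for “amplifier”; the simple loudspeaker then scores about 0.80 and the new loudspeakers about 0.97, so the new loudspeakers win, as described above.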

4.4.2 Event Interpretation

Another challenge is to resolve between competing interpretations of (sequences of) sensor readings and interaction events. Consider the example topology shown in Figure 4. Here we have multiple event sources—such as sensors and input devices—feeding their observations into the “event” channel, and several event interpreters (“parsers”) listening to these events. In this topology, a sensor might deliver an event such as “person sensed at speaker stand”, and a parser p may interpret this event as “turn up room light” (based on the idea that the listeners may want to take notes during a talk). Let us assume that we now add another parser, q, to the system, which provides an interpretation for the event sequence “beamer turned on” followed by “person sensed at speaker stand”, namely “turn down room light” (based on the idea that slides are difficult to see in glaring ambient light). Let us write a for “beamer turned on” and b for “person sensed at speaker stand”. The question is now how to interpret the event sequence ⟨a, b⟩ in a system where both parsers p and q are active. Without further precautions, q would issue “turn down room light” after seeing ⟨a, b⟩, while p issues “turn up room light” in response to the event sequence's trailing b (after ignoring a). Obviously, p and q substantially disagree on how to interpret this event sequence. On the other hand, to our intuition it seems rather clear that q's interpretation is more appropriate, as it accounts for a larger number of events (i. e., it is more specialized to the situation). The problem we have here is essentially identical to the problem of ambiguous grammars, where parsers (for instance, created by yacc(1)) conventionally choose the longer parse. The situation in our case is somewhat more complex, as we have to consider a distributed and dynamically changing parsing mechanism. However, the basic idea is the same: if two parsers compete for interpretations of non-disjoint event sequences, the parser supporting the longer sequence will win. The major snag here is if the shorter event sequence ends before the longer one: imagine q would (also) react to ⟨b, a⟩. After seeing b, p will have a complete parse. However, q might be able to provide a better parse in case it sees an a in the (near) future. But when the next event is something different (e. g., a timeout), then p should be allowed to proceed. While an in-depth discussion of the algorithm for distributed resolution of competing parses is out of the scope of this paper, we want to outline our approach:

Our solution is that parsers which are able to interpret an event within the context of a longer parse claim a “lock” on this event. Only unlocked events may immediately be interpreted by a parser; locked events are queued by all parsers that are able to process them. Once a parser is able to assemble a completed event sequence and its interpretation, it “withdraws” these events from the other parsers' queues and provides its interpretation. On the other hand, if a parser sees an event that cannot be successfully integrated into the sequence, it “releases” the events it had locked, freeing the other parsers to process their queues.
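A minimal sketch of this locking discipline from a single parser's point of view is given below. All names are hypothetical, the shared lock table and the queue withdrawal protocol are reduced to their simplest form, and the actual distributed implementation is more involved.

    import java.util.*;

    // Hypothetical sketch of the parse arbitration, seen from one parser:
    // a parser matching a longer event sequence locks the events it has
    // consumed so far; on a complete parse it withdraws them, on a
    // mismatch it releases them. For brevity, a breaking event is not
    // re-examined as the possible start of a new sequence.
    class SequenceParser {
        private final List<String> pattern;    // e.g. ["beamer on", "person at stand"]
        private final List<String> matched = new ArrayList<>();
        private final Set<String> locks;       // lock table shared among all parsers

        SequenceParser(List<String> pattern, Set<String> locks) {
            this.pattern = pattern;
            this.locks = locks;
        }

        void onEvent(String event) {
            if (!pattern.get(matched.size()).equals(event)) {
                // Event breaks our sequence: release our locks so that the
                // other parsers may process their queued events.
                locks.removeAll(matched);
                matched.clear();
                return;
            }
            matched.add(event);
            if (matched.size() < pattern.size()) {
                // Event fits a longer parse in progress: lock it, forcing
                // competing parsers to queue it instead of interpreting it.
                locks.add(event);
            } else if (locks.contains(event)) {
                // Another parser pursues a longer parse containing this event:
                // keep our complete parse queued until that lock is withdrawn
                // or released.
            } else {
                // Complete and unlocked: release our own locks, withdraw the
                // events from the other parsers' queues, and interpret.
                locks.removeAll(matched);
                emitInterpretation(matched);
                matched.clear();
            }
        }

        void emitInterpretation(List<String> events) { /* publish goal on the goals channel */ }
    }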

5 Related Work and Assessment

There are other approaches that address the problem of dynamic, self-organizing systems, such as HAVi [14], Jini [15], the Galaxy Communicator Architecture [16,17], or SRI's Open Agent Architecture (OAA) [18,19]. Especially Galaxy and OAA provide architectures for multi-agent systems. Also, the pattern-matching approach in SODAPOP is not new. Comparable concepts can be found in Galaxy and in the OAA, as well as in earlier works on Prolog or in the pattern-matching lambda calculus. Here the SODAPOP approach provides a certain refinement at the conceptual level by replacing language-specific syntactic pattern-matching functionality (cp. the Prolog-based pattern matching of OAA) by a language-independent facility based on utility value computation functions that are provided by transducers.

Another paradigm in a ubiquitous computing environment are publish-subscribe systems. When a publisher posts a message, the message is forwarded to any subscribers who have requested it (this is usually done by one or more central components). All components that have registered for certain messages get the published messages for further processing. Examples for systems based on the publish-subscribe mechanism are Elvin [20], which implements content-based routing on individual servers, or SIENA [21] and Gryphon [22], which are distributed content-based systems. The main disadvantage of pure publish-subscribe approaches is obvious: there is no strategy in case different subscribers have conflicting interpretations of the same message.

Therefore SODAPOP introduces important differences to these other approaches by using a two-stage approach to system decomposition and self-organization. Coarse-grained structuring is provided by defining channels; fine-grained structure is supported by pattern matching. This approach makes it possible to apply different strategies at different stages of the workflow. Two strategies, one for event interpretation and one for choosing the most appropriate component for a given task, were developed and applied to different channels. They are applicable to any application domain, in contrast to the decomposition and recombination strategies of Galaxy and OAA. Galaxy provides a centralized hub component, which uses routing rules for modeling how messages are transferred between the different system components, whereas OAA provides Prolog-based strategy mechanisms. Both approaches require a re-design of their rule bases in case they are extended by other components. Another disadvantage of Galaxy and OAA is the use of heavyweight routing components that incorporate arbitrary memory. Consequently, they are not suited for a distributed implementation.

Other research initiatives address the intelligent control of users' environments. The Easy Living project [23,24] makes access to information and services possible (e.g. a personal presentation). To this end, Easy Living uses a centralized architecture with two main components: a room server holds information about all devices and services, whereas a rule engine uses sensor information to control the room devices. This approach bears resemblance to the Galaxy architecture in its use of a centralized rule engine, but the split-up of room information and rules into two components allows more flexibility. Nevertheless, the world model as well as the rules have to be extended (or exchanged) in case the device infrastructure changes. The Intelligent Classroom of Northwestern University [25,26] uses a declarative approach to make a classroom more intelligent. To this end, two rule systems are used: the first rule system interprets user gestures and user utterances into user goals, whereas the second rule system infers the environment's possible reactions. That is similar to our approach of a principal component topology as described in Section 4, where we differentiate between a parser level, which constructs goals from sequences of events, and an assistant level, which infers operation calls for the actors. But instead of the rule-system approach of the Intelligent Classroom, we chose a distributed solution, where the channels act like moderators and are able to choose the most appropriate strategy that is offered by the available transducers.

In our experience, it is dangerous to provide only a single granularity for decomposing a complex system structure. The single granularity necessarily has to be fine in order to provide the required flexibility. When trying to fix the overall structure of the system, such a fine granularity provides too much detail and quickly leads to a proliferation of interfaces that are shared by only a few components. This danger specifically exists when the interface discussion is carried out by several project partners in parallel, for instance. However, the proliferation of interfaces is a Bad Thing, because it obstructs the interoperability of system components, a prime goal of SODAPOP. The SODAPOP approach provides abstractions that allow a top-down structuring of the system (channels) as well as a bottom-up structuring (within-channel decomposition). In addition, it explicitly includes a data-flow based mechanism for constructing systems out of components, based on SODAPOP Event Channels.

6 Current State and Next Steps

The SODAPOP infrastructure is currently implemented using the functional language Haskell [27] as well as in Java. The Haskell implementation serves as a proof of concept of SODAPOP, as well as a testbed for experimenting with different data-flow topologies and alternative channel ontologies (e. g. decomposition and recombination strategies). For using SODAPOP as a foundation for software projects dealing with distributed environments, we implemented the SODAPOP infrastructure also in the more conventional language Java. This implementation offers the application of different channel strategies (e. g. event interpretation (see Section 4.4.2), the opinion based selection algorithm (see Section 4.4.1), or a strategy to distribute multi-modal system output to different render components by problem decomposition). It is applied within the project DynAMITE [28] and is available from the project's web site. This downloadable software offers an API to implement one's own channels and transducers, as well as to apply different channel strategies to them. Some demos are also available (e. g. a home entertainment device infrastructure) that illustrate the application of the software to one's own scenarios. Furthermore, we implemented the conference room scenarios as described in this article to demonstrate the broad applicability of the SODAPOP approach.

But further work has to be done. The following points give an overview of our current work:

• providing a strategy for transducers to accomplish a task in a collaborative way (e. g. a transducer that controls a beamer device collaborates with a transducer that controls the lighting to maximize the contrast for the audience),
• providing special problem decomposition and recombination strategies for different ontologies (e. g. a multi-modal input strategy as a supplement to the multi-modal output strategy mentioned above),
• demonstrators using graphical interfaces to allow fast and efficient experiments without the need to install “real” devices.

Furthermore, QoS guarantees have to be defined. Currently we have no mechanism for making global statements about a set of channels and transducers. These statements could contain both constraints on the topology of the channel / transducer network as well as constraints on their temporal behaviour. Although technologies for specifying and verifying such properties exist (e. g., temporal logic, Petri nets, or process calculi), it has not yet been investigated which of these technologies suits the needs of SODAPOP best and how they can be integrated into an essentially decentralized system concept.

Acknowledgement

The work underlying the project DynAMITE has been funded by the German Ministry of Education and Research under the grant signature BMB-F No. FKZ 01 ISC 27A. The content of this publication is under the sole responsibility of the authors.

References

[1] K. Ducatel, M. Bogdanowicz, F. Scapolo, J. Leijten, J.-C. Burgelman, Scenarios for ambient intelligence 2010, ISTAG report, European Commission, Institute for Prospective Technological Studies, Seville (Nov 2001). URL ftp://ftp.cordis.lu/pub/ist/docs/istagscenarios2010.pdf

[2] N. Shadbolt, Ambient Intelligence, IEEE Intelligent Systems (2003) 2–3.

[3] E. Aarts, Ambient intelligence: A multimedia perspective, IEEE Multimedia (2004) 12–19.

[4] D. Franklin, K. Hammond, The intelligent classroom: providing competent assistance, in: Proceedings of the Fifth International Conference on Autonomous Agents, ACM Press, 2001, pp. 161–168. URL http://doi.acm.org/10.1145/375735.376037

[5] Stanford Interactive Workspaces iWork, Project Overview, http://iwork.stanford.edu/ (Oct. 2003).

[6] Oxygen, MIT Project Oxygen, Pervasive, Human-centered Computing, http://oxygen.lcs.mit.edu/ (2002).

[7] B. Brumitt, B. Meyers, J. Krumm, A. Kern, S. A. Shafer, EasyLiving: Technologies for intelligent environments, in: HUC, 2000, pp. 12–29. URL citeseer.nj.nec.com/brumitt00easyliving.html

[8] D. Servat, A. Drogoul, Combining amorphous computing and reactive agent-based systems: a paradigm for pervasive intelligence?, in: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, ACM Press, 2002, pp. 441–448. URL http://doi.acm.org/10.1145/544741.544842

[9] T. Heider, T. Kirste, Supporting goal-based interaction with dynamic intelligent environments, in: Proc. 15th European Conference on Artificial Intelligence (ECAI'2002), Lyon, France, 2002.

[10] T. Herfet, T. Kirste, M. Schnaider, EMBASSI: multimodal assistance for infotainment and service infrastructures, Computers & Graphics 25 (4) (2001) 581–592.

[11] T. Heider, T. Kirste, Architecture considerations for interoperable multi-modal assistant systems, in: Proc. 9th International Workshop on Design, Specification, and Verification of Interactive Systems (DSV-IS'2002), Rostock, Germany, 2002.

[12] D. Gelernter, Mirror Worlds, Oxford University Press, New York, 1993.

[13] D. G. Schwartz, Cooperating Heterogeneous Systems, Kluwer Academic Publishers, Dordrecht, 1995.

[14] HAVi, Inc., The HAVi Specification – Specification of the Home Audio/Video Interoperability (HAVi) Architecture – Version 1.1, www.havi.org (May 2001).

[15] Sun Microsystems, Inc., Jini Technology Core Platform Specification – Version 1.1, www.jini.org (Oct. 2000).

[16] S. Seneff, E. Hurley, R. Lau, C. Pao, P. Schmid, V. Zue, Galaxy-II: A Reference Architecture for Conversational System Development, in: ICSLP 98, Sydney, Australia, 1998.

[17] S. Seneff, R. Lau, J. Polifroni, Organization, Communication, and Control in the GALAXY-II Conversational System, in: Proc. Eurospeech 99, Budapest, Hungary, 1999.

[18] SRI International AI Center, The Open Agent Architecture, http://www.ai.sri.com/oaa/ (2000).

[19] D. L. Martin, A. J. Cheyer, D. B. Moran, The Open Agent Architecture: a framework for building distributed software systems, Applied Artificial Intelligence 13 (1/2) (1999) 91–128.

[20] B. Segall, D. Arnold, J. Boot, M. Henderson, T. Phelps, Content based routing with Elvin4, in: Proc. AUUG2K (Jun. 2000).

[21] A. Carzaniga, D. S. Rosenblum, A. L. Wolf, Achieving scalability and expressiveness in an internet-scale event notification service, in: Symposium on Principles of Distributed Computing, 2000, pp. 219–227.

[22] R. Strom et al., Gryphon: An information flow based approach to message brokering, in: International Symposium on Software Reliability Engineering '98, Fast Abstract (1998).

[23] Microsoft, Inc., Easy Living Project Overview, www.research.microsoft.com/easyliving/ (Oct. 2001).

[24] B. Brumitt, B. Meyers, J. Krumm, A. Kern, S. Shafer, Easy Living: Technologies for Intelligent Environments, Handheld and Ubiquitous Computing (Sep. 2000).

[25] Northwestern University, Chicago, USA, The Intelligent Classroom, dent.infolab.nwu.edu/infolab/projects/project.asp?ID=11.

[26] J. Flachsbart, D. Franklin, K. Hammond, Improving Human Computer Interaction in a Classroom Environment using Computer Vision, in: Proc. Intelligent User Interfaces, 2000.

[27] S. Peyton Jones, J. Hughes, et al., Haskell 98: A Non-strict, Purely Functional Language, www.haskell.org (Feb. 1999).

[28] DynAMITE, DynAMITE Project Overview, www.dynamite-project.org (2004).
