An Architecture for Context-Aware Adaptive Data Stream Mining
Pari Delir Haghighi, Mohamed Medhat Gaber, Shonali Krishnaswamy, Arkady Zaslavsky, and Seng Loke Center for Distributed Systems and Software Engineering Monash University, Australia {pari.delirhaghighi, shonali.krishnaswamy, Arkady Zaslavsky}@infotech.monash.edu.au CSIRO ICT Center, Australia
[email protected] Department of Computer Science and Computer Engineering La Trobe University, Australia
[email protected]
Abstract. In resource-constrained devices, adaptation of data stream processing to variations of data rates, availability of resources and environment changes is crucial for consistency and continuity of running applications. Context-aware and resource-aware adaptation, as a new dimension of research in data stream mining, enhances and improves distributed data stream processing tasks. Context-awareness is one of the key aspects of ubiquitous computing as applications’ successful operations rely on detecting changes and adjusting accordingly. This paper presents a general architecture for context-aware adaptive mining of data streams that aims to dynamically and autonomously adjust input, output and algorithms of data stream mining according to changes in context and resource availability.
1. Introduction
Processing data streams is a challenging area of study because of their unpredictable and continuous nature. The literature presents various techniques and approaches that address the issues of data stream processing in both data mining and querying. Recently, however, the emergence and growth of mobile computing and networking, together with the importance of using mobile devices for data stream mining in certain application domains (e.g. health or bushfire monitoring applications), have introduced new research challenges that need to be addressed. Data stream applications running on resource-constrained devices need to consider not only the limitations of computational resources such as memory, battery level and CPU speed, but also the issues of variable data rates, mobility, disconnections and environmental changes.
Nearly all pervasive systems utilize context to perform their tasks, which makes context-awareness an essential requirement of these systems [1, 2]. To perform data stream mining in heterogeneous and distributed computing environments, applications need to monitor context changes and react or adapt to them in order to continue performing their set tasks. Context-aware and resource-aware adaptation of data stream mining improves streaming tasks and helps ensure consistent and continuous operation. A study of the current state of the art in data stream mining [3-5] indicates that methods and algorithms have been introduced for performing data stream mining on mobile devices, but that very limited work focuses on context-aware and resource-aware adaptation. One of the innovative adaptive works in data stream mining on resource-constrained devices is Algorithm Output Granularity (AOG) [6], which provides adaptability with respect to the available memory on a device. Examples of light-weight data stream mining algorithms that have been developed using AOG include LWC, LWClass and LWF [7].

In this paper we introduce a novel architecture for context-aware adaptive data stream mining that aims to provide real-time and dynamic adaptation strategies by considering the current context and the availability of resources, and to ensure the continuity and consistency of running applications. To explain the different parts of our architecture throughout the paper, we use the example of a health monitoring application for heart patients. The details of the example are provided only for the purpose of illustration and may lack the necessary medical accuracy and correctness.

Applications for healthcare biosensor networks have recently been gaining popularity, as they provide a convenient and safe way to monitor patients remotely and to generate warnings and emergency calls.
One of the main biosensors used for monitoring heart patients is the ECG (electrocardiogram) sensor, which sends heart beat rates as a continuous data stream to a PDA [8] or to a base station using an ISM band [9]. Data stream querying or mining needs to be performed on the ECG sensor streams, either locally on a mobile device or on a central workstation, for monitoring and analyzing heart beats.

This paper is structured as follows: Section 2 provides a general view of our proposed architecture for context-aware adaptive data stream mining. Section 3 discusses context and situation modeling and how situations are inferred. Section 4 focuses on adaptation strategy and correlation functions. Finally, Section 5 concludes the paper and discusses future work.
2. An Architecture for Context-Aware Adaptive Data Stream Mining
In this section, we introduce an architecture for context-aware adaptive data stream mining. The architecture consists of two main parts, as illustrated in Figure 1. The first part is the situation manager, which provides context-awareness and includes components for context/situation modeling and inference. The second part, the strategy manager, is responsible for adjusting adaptation strategy parameters based on correlation functions and for invoking strategies.
Figure 1. A general architecture for context-aware adaptive data stream mining
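The interplay of the two parts can be illustrated with a minimal sketch. It is not the implementation behind Figure 1: the class and function names, the `battery_level` threshold and the `sampling_rate` parameter are all hypothetical, and a plain lookup table stands in for the correlation functions that the strategy manager is described as using.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Context = Dict[str, float]      # context attribute name -> current value
Parameters = Dict[str, float]   # adaptation strategy parameter -> setting


@dataclass
class SituationManager:
    """Infers the occurring situation from current context attribute values."""
    # Hypothetical representation: situation name -> predicate over the context.
    situations: Dict[str, Callable[[Context], bool]]

    def infer(self, context: Context) -> str:
        # Return the first situation whose predicate holds (insertion order).
        for name, holds in self.situations.items():
            if holds(context):
                return name
        return "unknown"


@dataclass
class StrategyManager:
    """Chooses adaptation strategy parameters for an inferred situation."""
    # Simplification: a lookup table replaces the correlation functions.
    strategies: Dict[str, Parameters]

    def adapt(self, situation: str) -> Parameters:
        # Return a copy so callers cannot mutate the stored settings.
        return dict(self.strategies.get(situation, {}))


def adapt_once(situation_manager: SituationManager,
               strategy_manager: StrategyManager,
               context: Context) -> Tuple[str, Parameters]:
    """One adaptation step: infer the situation, then pick parameters."""
    situation = situation_manager.infer(context)
    return situation, strategy_manager.adapt(situation)


# Hypothetical configuration, for illustration only.
situation_manager = SituationManager({
    "low_battery": lambda c: c["battery_level"] < 20,
    "normal": lambda c: True,
})
strategy_manager = StrategyManager({
    "low_battery": {"sampling_rate": 0.25},
    "normal": {"sampling_rate": 1.0},
})

print(adapt_once(situation_manager, strategy_manager, {"battery_level": 10.0}))
```

Keeping inference and adaptation in separate objects mirrors the two-part split of the architecture: the way situations are modeled can change without touching how strategies are parameterized.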
3. Situation Manager
The situation manager consists of three components: Situation Repository, Situation Modeling and Situation Inference. These components work together to reason about the occurring situation based on the current context attribute values. Contextual information used for inferring situations may include sensed context collected from sensors, static context, or internal context such as the battery level of mobile devices.

3.1 Situation Modeling
We have based our context representation and modeling on the Naïve Context Spaces (NCS) Model [10, 11], but have extended and modified it to suit our purpose.
The NCS model and its extension in [12] are used as a powerful tool for reasoning about context and addressing the uncertainties of sensed information. The core of the NCS model is the concept of situations. The NCS model represents contextual information as geometrical objects in multidimensional space called situations [12]. A situation space is a tuple of regions of attribute values related to a situation. Each region is a set of accepted values for an attribute based on a pre-defined predicate, and each context state is a collection of values of context attributes at a given time. The NCS model extends the definition of context by describing it as “the set of facts, assumptions and predictions along with methods/algorithms of interpreting/discovering/processing that information” [12].

Considering our example for monitoring heart patients, from the list of related context attributes we consider the following: a1 (temperature), a2 (age), a3 (location), a4 (time), a5 (heart_rate), and a6 (battery_level). Weights are values from 0 to 1 that represent the importance of each context attribute in a situation; the weights of a situation sum to 1. Table 1 shows examples of pre-defined situations based on the aforementioned context attributes.

Table 1. Examples of situations
Columns: Situation; Context attributes; Regions and their predicates
[Table body only partly recoverable; surviving entries are listed in their original order]
S1 (normal): a1, a3, a5, a6
S2 (sleep): a4, a5
S4: a1, a2, a3
Predicate fragments: AND 100>; 34>; 70>; NOT HOME; 100 >
Numeric values (presumably weights): 0.3, 0.2, 0.5
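The situation-space model of Section 3.1 (one region per context attribute, each defined by a predicate and carrying a weight) can be sketched as a small data structure. This is an illustration under stated assumptions rather than the NCS inference itself: the weighted-sum `confidence` measure is one simple possibility chosen here, and the `resting` situation with its attribute names, predicates and weights is hypothetical and not taken from Table 1.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Region:
    """Accepted values of one context attribute, defined by a predicate."""
    predicate: Callable[[Any], bool]
    weight: float  # importance of the attribute within the situation, in [0, 1]


@dataclass
class SituationSpace:
    """A situation as a set of regions, one per relevant context attribute."""
    name: str
    regions: Dict[str, Region]

    def __post_init__(self) -> None:
        # Enforce the constraints on weights: each in [0, 1], summing to 1.
        weights = [region.weight for region in self.regions.values()]
        if any(w < 0.0 or w > 1.0 for w in weights):
            raise ValueError("each weight must lie between 0 and 1")
        if abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights of a situation must sum to 1")

    def confidence(self, state: Dict[str, Any]) -> float:
        # Assumption: confidence is the total weight of the attributes whose
        # current value falls inside the corresponding region.
        return sum(
            region.weight
            for attribute, region in self.regions.items()
            if attribute in state and region.predicate(state[attribute])
        )


# Hypothetical situation and context state, for illustration only.
resting = SituationSpace("resting", {
    "heart_rate": Region(lambda bpm: 50 <= bpm <= 100, 0.5),
    "location": Region(lambda place: place == "home", 0.3),
    "battery_level": Region(lambda percent: percent > 20, 0.2),
})
state = {"heart_rate": 72, "location": "office", "battery_level": 80}
print(round(resting.confidence(state), 2))
```

In this example the heart rate and battery level fall inside their regions while the location does not, so the confidence is about 0.7, the combined weight of the two matching attributes.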