Tracking Concept Drifting with an Online-Optimized Incremental Learning Framework

Jun Wu*, Dayong Ding
AI Lab, Tsinghua University, Beijing 100084, P. R. China
{wujun01, ddy01}@mails.tsinghua.edu.cn

Xian-Sheng Hua
Microsoft Research Asia, Beijing 100080, P. R. China
[email protected]

Bo Zhang
AI Lab, Tsinghua University, Beijing 100084, P. R. China
[email protected]
ABSTRACT
Concept drifting is an important and challenging research issue in the field of machine learning. This paper addresses semantic concept drifting in time series, such as video streams, over a relatively long period of time. An Online-Optimized Incremental Learning framework is proposed as an example learning system for tracking the drifting concepts. Furthermore, a set of measures is defined to track the process of concept drifting in the learning system. These tracking measures are also applied to determine the corresponding parameters used for model updating in order to obtain optimal up-to-date classifiers. Experiments on the TREC Video Retrieval Evaluation 2004 data set not only reveal the internal concept drifting process of the learning system, but also show that the proposed learning framework is promising for tackling the issue of concept drifting.
Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing-indexing methods; I.2.10 [Artificial Intelligence]: Vision and Scene Understanding-video analysis.
General Terms
Algorithms, Experimentation.
Keywords
Incremental Learning, Gaussian Mixture Model, Video Content Analysis, Concept Drifting, TREC Video Retrieval Evaluation
1. INTRODUCTION
In time series, the underlying data distribution, or the concept that we are trying to learn from the data sequences, typically evolves constantly over time. Often these changes make models built on old data inconsistent with the new data, so the models must be updated promptly [1]. This problem, known as concept drifting [2], complicates the task of learning concepts from data. An effective learner should be able to track such changes and quickly adapt to them [1]. Modeling concept
drifting in time sequences has therefore become an important and challenging task.

Klinkenberg et al. [3][4] propose a method to recognize and handle concept changes with support vector machines; it maintains an automatically adjusted window on the training data so that the estimated generalization error is minimized. Fan [5] points out that additional old data do not always help produce a more accurate hypothesis than using the most recent data only; they increase the accuracy only in some "lucky" situations. In [6] Fan also demonstrates a random decision-tree ensemble based engine, named StreamMiner, for mining concept drifts in data streams. StreamMiner systematically selects among old and new data chunks to compute the optimal model that best fits the changing data stream. Wang et al. [7] propose a general framework for mining drifting concepts in data streams using an ensemble of classifiers weighted by their expected classification accuracy on the test data under the time-evolving environment.

Though many methods have been proposed to deal with concept drifting, few researchers have examined how to track concept drifting from a systematic viewpoint. This paper addresses this issue based on a novel online learning framework, termed OOIL (Online-Optimized Incremental Learning) [11]. The evolving processes of the drifting concepts, as well as several tracking measures related to concept drifting, will be investigated.

The remainder of this paper is organized as follows. Section 2 briefly introduces the issue of concept drifting. The online-optimized incremental learning framework is presented in Section 3. Section 4 discusses how to track concept drifting in detail. Experiments are reported in Section 5, followed by conclusions and future work in Section 6.
2. CONCEPT DRIFTING
For incoming data streams, there are two important issues: data sufficiency and concept drifting [5]. Traditional machine learning schemes typically do not consider concept drifting. Indeed, if there is no concept drifting and the training data set is sufficient, there is no need to update the models. When concept drifting occurs, however, the old data should be considered alongside the new data to enhance the performance of the system. How much old data should be used, and how to use these data, are not trivial issues [5]; one simple combination scheme is sketched below.
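As a concrete (though simplified) illustration of combining old and new data, the sketch below loosely follows the weighted-ensemble idea of Wang et al. [7] cited in Section 1: a classifier trained on each past data chunk is kept, and every member is re-weighted by its accuracy on the newest chunk, so that members trained on outdated concepts lose influence. The class and function names, the use of scikit-learn decision trees, and the fixed ensemble size are our own illustrative assumptions, not part of [7] or of the OOIL framework.

```python
# Minimal sketch of an accuracy-weighted classifier ensemble for drifting
# data streams, loosely following Wang et al. [7]. All concrete choices
# (decision trees, ensemble size, weighting rule) are illustrative assumptions.
from sklearn.tree import DecisionTreeClassifier

class WeightedEnsemble:
    def __init__(self, max_members=5):
        self.max_members = max_members
        self.members = []  # list of (classifier, weight) pairs

    def update(self, X_chunk, y_chunk):
        """Add a classifier trained on the newest chunk and re-weight old ones."""
        # Members trained on outdated concepts score poorly on the new chunk
        # and therefore lose influence in the vote.
        self.members = [(clf, clf.score(X_chunk, y_chunk)) for clf, _ in self.members]
        self.members.append((DecisionTreeClassifier().fit(X_chunk, y_chunk), 1.0))
        # Keep only the highest-weighted members to bound memory.
        self.members.sort(key=lambda m: m[1], reverse=True)
        self.members = self.members[:self.max_members]

    def predict(self, X):
        """Weighted vote over per-member class-probability estimates.
        Assumes every chunk contains examples of every class, so the
        probability columns of all members align."""
        total = sum(w for _, w in self.members) or 1.0
        probs = sum(w * clf.predict_proba(X) for clf, w in self.members) / total
        return probs.argmax(axis=1)
```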
* Supported by National NSF of China (No.60135010), National NSF of China (No.60321002) and the Chinese National Key Foundation Research Development Plan (2004CB318108).
Most existing research on concept drifting in machine learning is mainly concerned with the final classification results. A more sophisticated approach is to determine, in a quantitative way and before classification, whether there is concept drifting in the system and how much the concepts are drifting. This paper follows this idea and defines a set of measures to investigate the intrinsic properties of the learning system; an illustrative drift measure is sketched below.
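One plausible (and deliberately simple) measure of this kind quantifies how far the distribution of a new batch has moved away from the distribution of the data the current model was built on, e.g., by a symmetric Kullback-Leibler divergence between single Gaussians fitted to the old and new data. The specific divergence and the notion of a threshold below are our assumptions for illustration; they are not the tracking measures defined by the OOIL framework itself.

```python
# Illustrative drift measure: symmetric KL divergence between Gaussians
# fitted to an old and a new batch of feature vectors (rows = samples).
# The measure and any threshold applied to it are illustrative assumptions.
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for d-dimensional Gaussians."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def drift_score(old_batch, new_batch):
    """Symmetric KL divergence between Gaussians fitted to the two batches."""
    mu0, cov0 = old_batch.mean(axis=0), np.cov(old_batch, rowvar=False)
    mu1, cov1 = new_batch.mean(axis=0), np.cov(new_batch, rowvar=False)
    return gaussian_kl(mu0, cov0, mu1, cov1) + gaussian_kl(mu1, cov1, mu0, cov0)

# A large score suggests the underlying concept (data distribution) has
# drifted and the model should be updated with emphasis on the newest batch.
```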
3. OOIL FRAMEWORK - AN EXAMPLE LEARNING SYSTEM
3.1 Framework Overview
As mentioned above, the scenario we investigate is a batch learning problem: data are assumed to arrive over time in batches. Different from traditional batch learning [16][17], we assume that all upcoming batches are unlabeled; only a small pre-labeled training data set is required at the beginning of the whole learning process. A minimal sketch of this setting follows.
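The sketch below illustrates this setting under our own simplifying assumptions: a Gaussian mixture model per concept is fitted to the small pre-labeled set, each unlabeled batch is classified by the current models, and samples assigned to a concept with sufficient confidence are used to refresh that concept's model. The confidence threshold and the refit-from-scratch update are illustrative choices, not the exact OOIL update rule.

```python
# Minimal sketch of batch incremental learning with only a small pre-labeled
# set: per-concept GMMs classify each unlabeled batch, and confidently
# assigned samples are folded back into the models. All specific choices
# (GaussianMixture, threshold, refit-from-scratch) are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_concept_models(X_labeled, y_labeled, n_components=3):
    """One GMM per concept, trained on the small pre-labeled set."""
    models, history = {}, {}
    for c in np.unique(y_labeled):
        history[c] = X_labeled[y_labeled == c]
        models[c] = GaussianMixture(n_components=n_components).fit(history[c])
    return models, history

def process_batch(models, history, X_batch, threshold=-50.0):
    """Classify an unlabeled batch and update the per-concept models."""
    concepts = sorted(models)
    # Log-likelihood of every sample under every concept model.
    loglik = np.stack([models[c].score_samples(X_batch) for c in concepts], axis=1)
    pred = np.array(concepts)[loglik.argmax(axis=1)]
    confident = loglik.max(axis=1) > threshold   # crude confidence test
    for c in concepts:
        new_data = X_batch[(pred == c) & confident]
        if len(new_data):
            history[c] = np.vstack([history[c], new_data])
            models[c] = GaussianMixture(models[c].n_components).fit(history[c])
    return pred
```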
3.2 Symbol Definitions
To describe our system more clearly, the main symbols used in this paper are listed in Table 1 and Table 2. Note that y^t and x^t denote the t-th batch and the first portion of the t-th batch, respectively. For simplicity, when there is no confusion we write y for any batch in D and x for the part (a small portion) of y, i.e., the superscript t is omitted.

Table 1. Main symbol definitions (1)
symbols                                        meanings
Y = [Y(1), Y(2), ..., Y(d)]^T                  feature vector
y = [y(1), y(2), ..., y(d)]^T                  an outcome of Y
y*_c = {y_{1,c}, y_{2,c}, ..., y_{n(c),c}}     pre-labeled training set
y^t = {y^t_1, y^t_2, ..., y^t_{N(t)}}          t-th batch
D = {y^1, ..., y^{t-2}, y^{t-1}, y^t, ...}     set of all batches
x^t = {x^t_1, x^t_2, ..., x^t_T}               part of the t-th batch (1