MINING TIME SERIES DATA

Chapter 1
MINING TIME SERIES DATA

Chotirat Ann Ratanamahatana, Jessica Lin, Dimitrios Gunopulos, Eamonn Keogh
University of California, Riverside

Michail Vlachos
IBM T.J. Watson Research Center

Gautam Das
University of Texas, Arlington

Abstract

Much of the world’s supply of data is in the form of time series. In the last decade, there has been an explosion of interest in mining time series data. A number of new algorithms have been introduced to classify, cluster, segment, index, discover rules, and detect anomalies/novelties in time series. While the algorithms that solve these problems draw on a multitude of different techniques, they all have one common factor: they require some high-level representation of the data, rather than the original raw data. These high-level representations are necessary as a feature extraction step, or simply to make the storage, transmission, and computation of massive datasets feasible. A multitude of representations have been proposed in the literature, including spectral transforms, wavelet transforms, piecewise polynomials, eigenfunctions, and symbolic mappings. This chapter gives a high-level survey of time series data mining tasks, with an emphasis on time series representations.

Keywords: Data Mining, Time Series, Representations, Classification, Clustering, Time Series Similarity Measures

1. Introduction

Time series data accounts for an increasingly large fraction of the world’s supply of data. A random sample of 4,000 graphics from 15 of the world’s newspapers published from 1974 to 1989 found that more than 75% of all graphics were time series (Tufte, 1983). Given the ubiquity of time series data, and the exponentially growing sizes of databases, there has recently been an explosion of interest in time series data mining. In the medical domain alone, large volumes of data as diverse as gene expression data (Aach and Church, 2001), electrocardiograms, electroencephalograms, gait analysis and growth development charts are routinely created. Similar remarks apply to industry, entertainment, finance, meteorology and virtually every other field of human endeavour. Although statisticians have worked with time series for more than a century, many of their techniques hold little utility for researchers working with massive time series databases (for reasons discussed below).

Below are the major tasks considered by the time series data mining community.

Indexing (Query by Content): Given a query time series Q, and some similarity/dissimilarity measure D(Q, C), find the most similar time series in database DB (Chakrabarti et al., 2002; Faloutsos et al., 1994; Kahveci and Singh, 2001; Popivanov et al., 2002).

Clustering: Find natural groupings of the time series in database DB under some similarity/dissimilarity measure D(Q, C) (Aach and Church, 2001; Debregeas and Hebrail, 1998; Kalpakis et al., 2001; Keogh and Pazzani, 1998).

Classification: Given an unlabeled time series Q, assign it to one of two or more predefined classes (Geurts, 2001; Keogh and Pazzani, 1998).

Prediction (Forecasting): Given a time series Q containing n data points, predict the value at time n + 1.

Summarization: Given a time series Q containing n data points where n is an extremely large number, create a (possibly graphic) approximation of Q which retains its essential features but fits on a single page, computer screen, etc. (Indyk et al., 2000; Wijk and Selow, 1999).
Anomaly Detection (Interestingness Detection): Given a time series Q, assumed to be normal, and an unannotated time series R, find all sections of R which contain anomalies or “surprising/interesting/unexpected” occurrences (Guralnik and Srivastava, 1999; Keogh et al., 2002; Shahabi et al., 2000).

Segmentation: (a) Given a time series Q containing n data points, construct a model Q̄ from K piecewise segments (K