Distributed and parallel time series feature extraction for industrial big data applications
Maximilian Christ, Andreas W. Kempa-Liehr, Michael Feindt
Blue Yonder GmbH, Karlsruhe, Germany
[email protected] · @MaxBenChrist · maximilianchrist.com
ACML, 16.11.2016
iPRODICT Intelligent Process Prediction based on Big Data Analytics
Time Series Classification / Regression
[Figure: a time series from an ERP system with metainformation (Brand=SuperSteel, C=0.05%) is mapped to a target label: Good? Bad?]
Industrial applications pose several requirements:
- Inhomogeneous sources: several time series and metainformation have to be considered simultaneously
- Decentral processing: latency issues, memory capacities exceeded, so processing has to happen close to the source
- Big Data: has to scale with the number of time series, the number of devices, and the length of the time series
- Robustness: labeled samples are expensive, overfitting is bad
- Explainability: clients ask to justify results, traceability of results is mandatory
Two approaches to time series classification:
(A) directly: k-NN with Dynamic Time Warping distance (DTW)*
(B) feature-based: 9000 features with a linear discriminant analyzer and filtering**

                         (A) DTW    (B) feature-based
Inhomogeneous sources       ✕              √
Decentral processing        ?              √
Big Data                    ?              √
Robustness                  ?              √
Explainability              ✕              √

* Ratanamahatana, Chotirat Ann, and Eamonn Keogh. "Making time-series classification more accurate using learned constraints." SDM, 2004.
** Fulcher, Ben D., and Nick S. Jones. "Highly comparative feature-based time-series classification." IEEE Transactions on Knowledge and Data Engineering 26.12 (2014): 3026-3037.
Examples of features: maximum, minimum, median, mean, number of peaks. A feature maps a time series to a real number:

f: (t_1, ..., t_l) ↦ f(t_1, ..., t_l) ∈ ℝ
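The feature map above can be sketched in a few lines. This is an illustrative selection of calculators, not code from the talk; `number_peaks` is a hypothetical minimal peak counter:

```python
import numpy as np

def number_peaks(x):
    """Count local maxima: samples strictly larger than both neighbours."""
    x = np.asarray(x, dtype=float)
    return float(np.sum((x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])))

# Each feature is a map f: (t_1, ..., t_l) -> R from a time series to a scalar.
features = {
    "max": np.max,
    "min": np.min,
    "median": np.median,
    "mean": np.mean,
    "number_peaks": number_peaks,
}

ts = [1.0, 3.0, 2.0, 5.0, 4.0, 4.0]
row = {name: float(f(ts)) for name, f in features.items()}
```

Because every feature only needs the raw values of one time series, the calculators are trivially parallelizable across time series and across features.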
Feature Extraction
Each of the m time series is mapped to one row of the design matrix X, whose entry x_ij is feature j evaluated on time series i:

X = ( x_11  x_12  ...  x_1n )
    ( x_21  x_22  ...  x_2n )
    ( ...                   )
    ( x_m1  x_m2  ...  x_mn )
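Building the design matrix is a nested map over time series and feature calculators. A minimal sketch, assuming a hypothetical list of calculators (any map from a sequence to a scalar works):

```python
import numpy as np

# Hypothetical feature calculators; any sequence -> scalar map can be used.
calculators = [np.max, np.min, np.median, np.mean]

def design_matrix(time_series_list):
    """Map m time series to an m x n design matrix X whose entry x_ij
    is feature j evaluated on time series i."""
    return np.array([[f(ts) for f in calculators] for ts in time_series_list])

X = design_matrix([[0.0, 1.0, 2.0], [4.0, 4.0, 1.0]])
# one row per time series, one column per feature
```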
FeatuRe Extraction based on Scalable Hypothesis tests (FRESH)
Many time series: 53 time series × 250 features each gives more than 13k features in total. Not all of them are relevant.
Robustness: labeled samples are expensive, overfitting is bad.
FRESH is a three-step procedure: 1. feature extraction, 2. feature significance testing, 3. multiple testing.
Feature significance is addressed by an individual hypothesis test for each feature.
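The per-feature test can be sketched as follows. The slides do not fix a particular test here; as an illustration, a Mann-Whitney U test per feature column (a common choice for a real-valued feature against a binary target) is assumed:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def feature_p_values(X, y):
    """One hypothesis test per feature column: does the feature's
    distribution differ between the two classes?"""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    return np.array([
        mannwhitneyu(col[y == 0], col[y == 1], alternative="two-sided").pvalue
        for col in X.T
    ])

rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = np.column_stack([
    rng.normal(size=40),                                     # irrelevant feature
    np.where(y == 0, 0.0, 5.0) + rng.normal(0.0, 0.1, 40),   # relevant feature
])
pvals = feature_p_values(X, y)
```

Each test only touches one feature column and the label vector, so this step parallelizes over features just like the extraction step.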
FER = (number of wrongly extracted features) / (number of extracted features)

FRESH controls the false extraction rate asymptotically, E(FER) = q, for
1. any distribution
2. any dependency structure
by the Benjamini-Yekutieli procedure. The expected false extraction rate q is the only parameter of FRESH.

Example: with q = 7% and 100 extracted features, 7 irrelevant and 93 relevant features are expected.
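The Benjamini-Yekutieli step-up procedure itself is short. A minimal sketch, following the standard definition of the procedure (the thresholds and the harmonic correction are not code from the talk):

```python
import numpy as np

def benjamini_yekutieli(p_values, q=0.07):
    """Benjamini-Yekutieli step-up procedure: boolean mask of features to
    keep, controlling the expected false extraction rate at level q under
    any dependency structure between the features."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # harmonic correction that makes the control valid under
    # arbitrary dependence between the hypotheses
    c_m = np.sum(1.0 / np.arange(1, m + 1))
    # keep the i smallest p-values where p_(i) <= q * i / (m * c_m)
    thresholds = q * np.arange(1, m + 1) / (m * c_m)
    below = p[order] <= thresholds
    keep = np.zeros(m, dtype=bool)
    if below.any():
        k = int(np.max(np.nonzero(below)[0]))
        keep[order[: k + 1]] = True
    return keep

# two clearly relevant features survive, two irrelevant ones are dropped
mask = benjamini_yekutieli([1e-8, 2e-7, 0.4, 0.9], q=0.07)
```

Only the sorted p-values are needed, so the filtering step runs in O(m log m) regardless of how the features were computed.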
Industrial applications + FRESH:
- Inhomogeneous sources: feature-based
- Decentral processing: highly parallel
- Robustness: individual feature testing
- Big Data: highly parallel, linear runtime
- Explainability: feature-based
http://github.com/blue-yonder/tsfresh