N95-23686

Learning Time Series for Intelligent Monitoring*

Stefanos Manganaris and Doug Fisher
Dept. of Computer Science, Vanderbilt University
Box 1679, Station B, Nashville, TN 37235, U.S.A.
Tel. (615) 343-4111, Fax (615) 343-8006
E-mail: stefanos@vuse.vanderbilt.edu
KEY WORDS AND PHRASES: Bayesian learning, minimum message length, monitoring, time series data.
ABSTRACT

We address the problem of classifying time series according to their morphological features in the time domain. In a supervised machine-learning framework, we induce a classification procedure from a set of preclassified examples. For each class, we infer a model that captures its morphological features, using Bayesian model induction and the minimum message length approach to assign priors. In the performance task, we classify a time series in one of the learned classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such and are not assigned to any of the known classes. We report results from experiments in a monitoring domain of interest to NASA.

*This research has been supported by a grant from NASA Ames (NAG 2-834).
INTRODUCTION

Performance improvement in classification tasks has been a traditional focus of machine learning research. Our research is motivated by applications in temporal domains. In such domains, the behavior of objects of interest often varies with time: objects are usually described by functions of values over time, i.e., time series, and by time-invariant attributes. Functions of time can be decomposed into sequential intervals with different characteristics. We study univariate time series, where each object is described by one time series; the terms "signature" and "feature" will be used synonymously with univariate time series. We focus on classifying objects according to their morphological features in the time domain (i.e., the shape of their plots).

INDUCTION OF CLASS MODELS AND CLASSIFICATION

A set of preclassified signatures (training examples) is presented to a learner. Signatures in the same class are assumed to share morphological characteristics, and the learner infers, for each class, a model that captures them. To capture morphological features, we consider a space of models that represent signatures by polynomials, one polynomial per interval of time.
We use polynomials of up to degree two, but the method can be easily generalized. To facilitate probabilistic predictions, we assume a Gaussian noise distribution and independence of sampling errors; the variance of the noise is constant within an interval. For each model, we estimate the coefficients of the polynomial and the variance of the noise that maximize the likelihood of the training data. For example, Figure 1 shows a signature and the class model induced from it.

Figure 1: A signature and the model induced from it.
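The per-interval fitting step is simple enough to sketch. The following is a minimal illustration, assuming numpy; under Gaussian noise with constant variance in an interval, the maximum-likelihood coefficients are the least-squares coefficients and the maximum-likelihood variance is the mean squared residual. The function names are ours, for illustration, not CALCHAS code.

    import numpy as np

    def fit_interval(t, y, degree=2):
        # Least-squares fit == maximum-likelihood fit under Gaussian noise.
        coeffs = np.polyfit(t, y, degree)
        residuals = y - np.polyval(coeffs, t)
        variance = np.mean(residuals ** 2)   # ML estimate of the noise variance
        return coeffs, variance

    def log_likelihood(t, y, coeffs, variance):
        # Gaussian log-likelihood of the samples in the interval.
        residuals = y - np.polyval(coeffs, t)
        n = len(y)
        return (-0.5 * n * np.log(2 * np.pi * variance)
                - np.sum(residuals ** 2) / (2 * variance))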
We use a Bayesian model induction technique [1] to find, for each class, the model that best describes its training signatures. For a given class, we search a space of parameterized models, partitioned into families with different numbers of parameters, for the model M with maximum posterior probability in light of the training data D and prior information I:

    P(M|D,I) = P(M|I) P(D|M,I) / P(D|I)                                (1)

We use the minimum message length approach to assign priors [5, 6]: the prior probability of a model, P(M|I), is determined by the length of a message that describes it (M), the message length being the negative logarithm of the prior, -log2 P(M|I). Similar techniques have been used in computer vision for surface reconstruction [3] and in engineering design [4], among other applications.
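As a concrete reading of equation (1) and the message-length prior, the sketch below scores a piecewise-polynomial model by a two-part message: bits for the model plus bits for the data given the model. The flat cost per parameter is an illustrative stand-in for an MML coding scheme such as those of [5, 6], which is not spelled out here; the sketch reuses log_likelihood from above. Minimizing the total message length is equivalent to maximizing the posterior P(M|D,I).

    import numpy as np

    def message_length_bits(segments, bits_per_param=16.0):
        # segments: list of (t, y, coeffs, variance) tuples, one per interval,
        # e.g., as produced by fit_interval above.
        model_bits = 0.0   # -log2 P(M|I): cost of describing the model
        data_bits = 0.0    # -log2 P(D|M,I): cost of the data given the model
        for t, y, coeffs, variance in segments:
            model_bits += bits_per_param * (len(coeffs) + 1)  # coeffs + variance
            data_bits -= log_likelihood(t, y, coeffs, variance) / np.log(2)
        return model_bits + data_bits

    # Among candidate segmentations and degrees, the best model is the one
    # with the shortest total message, i.e., the maximum-posterior model.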
We treat classification as hypothesis testing. To classify a signature S, for each known class C we consider the hypothesis that S belongs to C, and we compute the evidence in its favor in light of the data D and prior information I [2]:

    e(C|D,I) = 10 log10 [ P(C|D,I) / P(¬C|D,I) ]                       (2)

The posterior probability P(C|D,I) is computed from the likelihood of S under the induced class models and the priors of all known classes, and from the posterior probability of a special "novel" class. The likelihood of the "novel" class is set to an arbitrary low value, and its prior to a low value as well. Under normal circumstances, when S is supported by one of the known classes, the "novel" class plays no role in the posterior. When all known classes have very low likelihoods, however, the probability of S belonging to any one of them tends to zero; only when all known classes have low posterior probabilities does the "novel" class become a viable alternative. We assign S to the class with maximum evidence when there is enough evidence to support that decision.
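A minimal sketch of this decision rule, under the same assumptions: posteriors are computed over the known classes plus the "novel" pseudo-class, and evidence is measured in decibels as in equation (2). The names novel_loglik, novel_prior, and min_db are our own illustrative stand-ins for the arbitrary low values and the decision threshold the text mentions.

    import numpy as np

    def evidence_db(posteriors, c):
        # Evidence for class c in decibels, per equation (2).
        p = np.clip(posteriors[c], 1e-12, 1.0 - 1e-12)
        return 10.0 * np.log10(p / (1.0 - p))

    def classify(log_likelihoods, priors, novel_loglik, novel_prior, min_db=0.0):
        # Posterior over the known classes plus the special "novel" class.
        names = list(log_likelihoods) + ["NOVEL"]
        log_post = np.array(
            [log_likelihoods[c] + np.log(priors[c]) for c in log_likelihoods]
            + [novel_loglik + np.log(novel_prior)])
        post = np.exp(log_post - log_post.max())
        post /= post.sum()                  # normalize: divide by P(D|I)
        posteriors = dict(zip(names, post))
        best = max(posteriors, key=posteriors.get)
        # Commit only when there is enough evidence to support the decision.
        if evidence_db(posteriors, best) < min_db:
            return "NO DECISION", posteriors
        return best, posteriors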
A MONITORING APPLICATION

Electrical power controllers at NASA monitor Electrical Generation and Integrated Loading (EGIL) telemetry data from the Space Shuttle to detect various events that take place onboard, e.g., a change in the state of operation of an electrical device. Typically, the onset or termination of operation of a device has a signature with distinguishing morphological characteristics in the current it draws from a power bus, based on which the controllers identify the event. There are over two hundred different events of interest, making this a challenging monitoring task. Signatures are extracted from the telemetry stream whenever a change in one of the currents exceeds a preset threshold. All signatures have a preset duration (6 sec. after the triggering change), and their baselines are normalized by subtracting a suitable DC value.
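A sketch of this extraction step, assuming the telemetry is available as a numpy array of current samples. The text says only that a change exceeding a preset threshold triggers a 6-second signature and that baselines are normalized by subtracting "a suitable DC value"; using the pre-trigger level as that DC value is our assumption.

    import numpy as np

    def extract_signatures(current, rate_hz, trigger_threshold, duration_s=6.0):
        # Cut fixed-length, baseline-normalized signatures out of a current stream.
        n = int(duration_s * rate_hz)
        signatures = []
        changes = np.abs(np.diff(current))
        i = 1
        while i < len(current) - n:
            if changes[i - 1] > trigger_threshold:
                window = current[i:i + n].astype(float)
                window -= current[i - 1]   # subtract pre-trigger DC baseline (assumption)
                signatures.append(window)
                i += n                     # skip past this signature
            else:
                i += 1
        return signatures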
We have designed and implemented CALCHAS, a Bayesian induction system for time series data. We report results from a set of experiments designed to demonstrate the feasibility of automating the EGIL classification task; here we focus on the effect of the amount of training on classification accuracy. Our data set includes signatures from ten events; the average number of signatures per class is about 65. We ignore two of the events in the experiments reported here. In each run, we train CALCHAS with an equal number of randomly selected training signatures per class and evaluate its performance on the remaining signatures. We use the percentage of correctly classified test signatures, i.e., the agreement of automatic with manual labeling, as our measure of performance. We vary the amount of training from one to eight signatures per class.
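The protocol just described can be summarized in code. In the sketch below, train_fn and classify_fn stand in for CALCHAS internals that are not published here; each confusion-matrix cell collects one percentage per run, to be reported as a mean with its standard deviation, as in Table 1.

    import random
    from collections import defaultdict

    def run_experiment(signatures_by_class, k, train_fn, classify_fn, n_runs=10):
        # In each run: train on k randomly chosen signatures per class,
        # test on the remainder, and record per-cell percentages.
        cells = defaultdict(list)   # (actual, assigned) -> [pct per run]
        for _ in range(n_runs):
            train, test = {}, []
            for cls, sigs in signatures_by_class.items():
                idx = random.sample(range(len(sigs)), k)
                train[cls] = [sigs[i] for i in idx]
                test += [(cls, sigs[i]) for i in range(len(sigs)) if i not in idx]
            models = train_fn(train)
            counts = defaultdict(lambda: defaultdict(int))
            for actual, sig in test:
                counts[actual][classify_fn(models, sig)] += 1
            for actual, row in counts.items():
                total = sum(row.values())
                for assigned, c in row.items():
                    cells[(actual, assigned)].append(100.0 * c / total)
        return cells   # report mean and standard deviation per cell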
The results are summarized in Table 1. Each entry shows the percentage of test signatures of the class in the row that were classified in the class shown in the column; the diagonal indicates the correct classifications. Entries are averaged over ten different runs; percentages with standard deviations are shown. For each class, the top row was obtained with one training signature per class, the bottom row with eight. Across these experiments, CALCHAS correctly classified an average of 74% of the test signatures.

In general, increased training set sizes resulted in higher classification accuracies. A notable exception involves the classes WCS, ΔWCS, and GAL. We suspect that these classes correspond to disjunctive concepts: there are many signatures in each of them whose patterns differ significantly from one another, and signatures of one were classified as instances of another in a confused labeling. CALCHAS is currently unable to handle disjunctive concepts, and its accuracy is significantly lower for signatures belonging to such classes.

The Bayesian learning approach offers significant technical and practical advantages. It provides an estimate of the confidence in each classification. When the evidence does not sufficiently support any one class, no classification is made, as opposed to making an arbitrary one. By following a principled way of discerning morphological features from measurement noise, it mitigates the problem of overfitting. Signatures with sufficiently novel features, belonging to classes not present in the training set, are recognized as such and classified as NOVEL; potentially costly mistakes are avoided.
Table 1: Classification of EGIL signatures (assumed univariate--see text). [Confusion matrix: one row per actual class (PHO, VAC, H2O, CAB, PRP, WCS, ΔWCS, TPS, RCR, GAL, UN1, UN3), with sub-rows for training set sizes of 1 and 8 signatures per class; one column per assigned class, plus NOVEL; entries are mean percentages with standard deviations over ten runs.]
ACKNOWLEDGMENTS

We would like to thank Wayne Iba, Phil Laird, Ron Saul, Robert Shelton, and Deepak Kulkarni for providing the data we used in the experiments and information on the domain.

REFERENCES

[1] Peter Cheeseman. On finding the most probable model. In J. Shrager and P. Langley, editors, Computational Models of Scientific Discovery and Theory Formation, chapter 3, pages 73-95. Morgan Kaufmann, 1990.

[2] E. T. Jaynes. Probability theory: the logic of science. Fragments of this book are available from the author ([email protected]), 1993.

[3] Edwin P. D. Pednault. Some experiments in applying inductive inference principles to surface reconstruction. In Proc. of the Eleventh International Joint Conf. on Artificial Intelligence, pages 1603-1609, 1989.

[4] R. Bharat Rao and Stephen C-Y. Lu. Learning engineering models with the minimum description length principle. In Proc. of the Tenth National Conf. on Artificial Intelligence, pages 717-722, 1992.

[5] J. Rissanen. A universal prior for integers and estimation by minimum description length. The Annals of Statistics, 11(2):416-431, 1983.

[6] C. S. Wallace and P. R. Freeman. Estimation and inference by compact coding. Journal of the Royal Statistical Society B, 49(3):252-265, 1987.