Descriptive Modeling


Based in part on Chapter 9 of Hand, Mannila, & Smyth and Section 14.3 of HTF
David Madigan

Data Mining Algorithms
“A data mining algorithm is a well-defined procedure that takes data as input and produces output in the form of models or patterns” (Hand, Mannila, and Smyth)

• “well-defined”: can be encoded in software
• “algorithm”: must terminate after some finite number of steps

Models

Prediction
• Linear regression
• Piecewise linear
• Nonparametric regression
• Classification

Probability Distributions
• Parametric models
• Mixtures of parametric models
• Graphical Markov models (categorical, continuous, mixed)

Structured Data
• Time series
• Markov models
• Mixture Transition Distribution models
• Hidden Markov models
• Spatial models

Patterns

Global
• Clustering via partitioning
• Hierarchical clustering
• Mixture models

Local
• Outlier detection
• Changepoint detection
• Bump hunting
• Scan statistics
• Association rules

What is a descriptive model?
• “presents the main features of the data”
• “a summary of the data”
• Data randomly generated from a “good” descriptive model will have the same characteristics as the real data
• Chapter focuses on techniques and algorithms for fitting descriptive models to data

Estimating Probability Densities
• Parametric versus non-parametric
• Log-likelihood is a common score function:

$$S_L(\theta) = -\sum_{i=1}^{n} \log p(x(i); \theta)$$

• Fails to penalize complexity
• Common alternatives:

$$S_{BIC}(M_k) = 2 S_L(\hat{\theta}_k; M_k) + d_k \log n$$

$$S_{VL}(M_k) = -\sum_{x \in D_v} \log \hat{p}_{M_k}(x \mid \theta)$$
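The three score functions can be sketched for a concrete case. The example below is illustrative only: it assumes a univariate normal model with d = 2 parameters, synthetic data, and a 150/50 train/validation split, none of which come from the slides. The helper names (`fit_normal`, `neg_log_likelihood`, `bic_score`) are hypothetical.

```python
import math
import random

def normal_logpdf(x, mu, sigma):
    """Log density of a univariate normal at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def fit_normal(data):
    """Maximum-likelihood estimates of the mean and standard deviation."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, math.sqrt(var)

def neg_log_likelihood(data, mu, sigma):
    """S_L: negative log-likelihood of the data under the fitted model."""
    return -sum(normal_logpdf(x, mu, sigma) for x in data)

def bic_score(data, mu, sigma, d):
    """S_BIC = 2 * S_L + d * log(n), where d is the number of parameters."""
    return 2 * neg_log_likelihood(data, mu, sigma) + d * math.log(len(data))

# Synthetic data (an assumption for illustration), split into training and validation sets.
rng = random.Random(0)
sample = [rng.gauss(5.0, 2.0) for _ in range(200)]
train, valid = sample[:150], sample[150:]

mu_hat, sigma_hat = fit_normal(train)
s_l = neg_log_likelihood(train, mu_hat, sigma_hat)    # score on the training data
s_bic = bic_score(train, mu_hat, sigma_hat, d=2)      # penalizes the 2 parameters
s_vl = neg_log_likelihood(valid, mu_hat, sigma_hat)   # S_VL: score on held-out data D_v

print(f"S_L = {s_l:.2f}, S_BIC = {s_bic:.2f}, S_VL = {s_vl:.2f}")
```

With all three scores written as negative log-likelihoods, lower values are better. S_L alone always favors the more flexible model, which is why the penalized and held-out variants are used for comparing models of different complexity.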

Parametric Density Models
• Multivariate normal
• For large p, number of parameters dominated by the covariance matrix
• Assume Σ=I?
• Graphical Gaussian Models
• Graphical models for categorical data
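To make the point about the covariance matrix concrete, here is a small sketch that counts free parameters of a p-dimensional multivariate normal: p for the mean plus p(p+1)/2 for a full symmetric covariance matrix. The function name `mvn_param_count` and the `"diagonal"` option are illustrative additions, not part of the slides.

```python
def mvn_param_count(p, covariance="full"):
    """Number of free parameters in a p-dimensional multivariate normal."""
    mean_params = p
    if covariance == "full":
        cov_params = p * (p + 1) // 2   # symmetric p x p matrix
    elif covariance == "diagonal":
        cov_params = p                  # one variance per dimension
    elif covariance == "identity":
        cov_params = 0                  # Sigma fixed to I, nothing to estimate
    else:
        raise ValueError(f"unknown covariance structure: {covariance}")
    return mean_params + cov_params

for p in (2, 10, 100):
    print(p, mvn_param_count(p, "full"), mvn_param_count(p, "identity"))
```

For p = 100 the full model has 5150 parameters, of which 5050 belong to the covariance matrix, while fixing Σ = I leaves only the 100 mean parameters.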

Mixture Models

$$f(x) = p \, \frac{(\lambda_1)^{x} e^{-\lambda_1}}{x!} + (1 - p) \, \frac{(\lambda_2)^{52 - x} e^{-\lambda_2}}{(52 - x)!}$$

“Two-stage model”

$$f(x) = \sum_{k=1}^{K} \pi_k f_k(x; \theta_k)$$
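The “two-stage” reading of a mixture (first pick a component k with probability π_k, then draw x from f_k) can be sketched as follows. This is a minimal illustration, not a fitting procedure: the parameter values p = 0.6, λ1 = 2, λ2 = 3 are made up, and the function names are hypothetical. The Poisson sampler uses Knuth's multiplication method so that only the standard library is needed.

```python
import math
import random

def poisson_pmf(k, lam):
    """Poisson probability mass at a non-negative integer k."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def two_poisson_mixture(x, p, lam1, lam2, weeks=52):
    """The two-component form: one Poisson on x, one on weeks - x."""
    return p * poisson_pmf(x, lam1) + (1 - p) * poisson_pmf(weeks - x, lam2)

def mixture_density(x, weights, components):
    """General form: f(x) = sum_k pi_k * f_k(x)."""
    return sum(w * f(x) for w, f in zip(weights, components))

def sample_poisson(lam, rng):
    """Knuth's method: count uniforms until their running product drops to exp(-lam) or below."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_two_stage(weights, samplers, rng):
    """Stage 1: choose a component index; stage 2: draw from that component."""
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return k, samplers[k](rng)

# Illustrative (made-up) parameter values.
p, lam1, lam2, weeks = 0.6, 2.0, 3.0, 52
weights = [p, 1 - p]
components = [
    lambda x: poisson_pmf(x, lam1),
    lambda x: poisson_pmf(weeks - x, lam2),
]
samplers = [
    lambda rng: sample_poisson(lam1, rng),
    # A draw above `weeks` would give a negative x; that is vanishingly unlikely for small lam2.
    lambda rng: weeks - sample_poisson(lam2, rng),
]

rng = random.Random(1)
draws = [sample_two_stage(weights, samplers, rng) for _ in range(1000)]
print(two_poisson_mixture(1, p, lam1, lam2), draws[:5])
```

The sampled values fall into two groups, one near 0 and one near 52, which is the bimodal shape that a single Poisson cannot describe.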