Azure Machine Learning Algorithm Flowchart

Predicts values inside ranks but can only be used on data that represents an order or rank

Neural Network Regression

Can be used for modeling complex relationships but is difficult to interpret and computationally expensive

Linear Regression

Simple to use and predicts a single numeric outcome

Discovers groups inside data, but is sensitive to outliers and to the selection of initial cluster points

K-Means
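
The sensitivity to initial cluster point selection noted for K-Means can be illustrated with a minimal sketch. This is a plain one-dimensional version of the standard assign-and-update loop written for illustration, not the Azure Machine Learning module; the helper names `kmeans_1d` and `inertia` and the sample points are assumptions.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Plain 1-D k-means: assign each point to its nearest center, then re-average."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


def inertia(centers, clusters):
    """Sum of squared distances from each point to its cluster center."""
    return sum((p - c) ** 2 for c, pts in zip(centers, clusters) for p in pts)


# Hypothetical data: three obvious groups on a line.
points = [0.0, 1.0, 5.0, 6.0, 10.0, 11.0]

# Starting points spread across the data recover the three groups.
good_centers, good_clusters = kmeans_1d(points, [0.0, 5.0, 10.0])

# Starting points bunched at one end get stuck in a worse solution.
bad_centers, bad_clusters = kmeans_1d(points, [0.0, 1.0, 5.0])

good_inertia = inertia(good_centers, good_clusters)
bad_inertia = inertia(bad_centers, bad_clusters)
```

Both runs converge, but the second one merges two of the groups and splits the first, which is why implementations commonly try several initializations and keep the one with the lowest inertia.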

Bayesian Linear Regression

Ordinal Regression

Complex relationship between features?

More than one outcome?

Predicting a ranked value?

Large number of features?

Decision Forest Regression

Any type of number

Predicting a range of values?

Very fast training time but can be memory intensive

Boosted Decision Tree Regression

Fast Forest Quantile Regression

Predicting a count?

Perfect when you want to predict a range of values
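
Predicting a range rather than a single value means estimating quantiles. The sketch below shows the general idea with the pinball (quantile) loss, whose minimizer is the requested quantile. It illustrates quantile estimation in general and is not how the Fast Forest Quantile Regression module works internally; the function names and sample data are assumptions.

```python
def pinball_loss(y_true, prediction, tau):
    """Average pinball loss of a constant prediction for quantile level tau."""
    total = 0.0
    for y in y_true:
        diff = y - prediction
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)


def best_constant(y_true, tau):
    """Pick the observed value that minimizes the pinball loss."""
    return min(y_true, key=lambda q: pinball_loss(y_true, q, tau))


# Hypothetical observations.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

lower = best_constant(values, 0.25)   # lower end of the predicted range
median = best_constant(values, 0.5)   # central estimate
upper = best_constant(values, 0.75)   # upper end of the predicted range
```

A quantile regression model applies the same idea per input row, so each prediction comes with a lower and an upper bound instead of one number.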

Find groups

For when you have lots of normal data and few anomalies, but has long training times and scalability issues

What do you want to predict?

One-Class Support Vector Machine

For dealing with anomalies based on time series data

Find anomalies

Very fast and handles missing and noisy data well, but can overfit easily and is memory intensive

Used for predicting counts, cannot be used on negative numbers
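
Why count prediction excludes negative numbers can be seen in a minimal sketch of Poisson regression. The model predicts the exponential of a linear function, so its output is always positive, and the fit assumes non-negative counts. This is a hand-written gradient-ascent fit for illustration, not the Azure Machine Learning module; the function names, learning rate, and data are assumptions.

```python
import math


def fit_poisson(xs, ys, lr=0.01, steps=5000):
    """Fit log(mean) = a + b * x by gradient ascent on the Poisson log-likelihood."""
    if any(y < 0 for y in ys):
        raise ValueError("Poisson regression needs non-negative counts")
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [y - math.exp(a + b * x) for x, y in zip(xs, ys)]
        grad_a = sum(residuals) / n
        grad_b = sum(r * x for r, x in zip(residuals, xs)) / n
        a += lr * grad_a
        b += lr * grad_b
    return a, b


def predict_count(a, b, x):
    """Predicted mean count; exp() keeps it strictly positive."""
    return math.exp(a + b * x)


# Hypothetical data: counts that grow with x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1, 2, 4, 7, 12]

a, b = fit_poisson(xs, ys)
predictions = [predict_count(a, b, x) for x in xs]
```

Even for inputs far outside the training range, such as `predict_count(a, b, -10.0)`, the prediction stays above zero, which suits counts and is the reason negative targets are rejected.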

Class membership

Time Series Anomaly Detection

Multiclass Logistic Regression

PCA-Based Anomaly Detection

Difficult to interpret and has long training times

Complex dataset?

Two-Class Deep Support Vector Machine

Two-Class Decision Jungle

Two-Class Averaged Perceptron

Noisy or missing data?

Memory intensive but easy to use and fast

Two-Class Boosted Decision Tree

Small dataset?

Handles noisy and missing data very well at a cost of increasing training time

Better memory efficiency than trees or forests

Two-Class Logistic Regression

Linearly separable pattern detection and fast

Multiclass Decision Jungle

Linear problem?

Linear problem?

Use for numeric variables; not suited to non-linear problems

More than two output classes?

Two-Class Neural Network

Good for data with many features that are possibly correlated

Only supports numeric variables

Poisson Regression

Complex relationships between features?

Noisy or missing data?

Handles data with many features very well but has high memory consumption

Multiclass Decision Forest

Computationally expensive and difficult to interpret but can handle complex relationships very well

Multiclass Neural Network

Two-Class Support Vector Machine

Positive answer direction

Continuous or categorical input?

Very fast to train

Negative answer direction

Can easily overfit, especially with noisy data

Two-Class Bayes Point Machine

Rating where three stars represent excellent and one star poor

Training time

Accuracy

Two-Class Decision Forest

Customization options

© 2017 Dataheroes | http://www.dataheroes.nl

Special thanks to Tomaž Kaštrun
