A MACHINE LEARNING PIPELINE
FOR MULTIPLE SCLEROSIS COURSE DETECTION
FROM CLINICAL SCALES AND PATIENT REPORTED OUTCOMES
Samuele Fiorini1, Alessandro Verri1, Andrea Tacchino2, Michela Ponzio2, Giampaolo Brichetto2, Annalisa Barla1
1- DIBRIS - Università degli Studi di Genova, Italy
2- AISM - Scientific Research Area, Italian MS Foundation, Genova, Italy
[email protected], {alessandro.verri, annalisa.barla}@unige.it, {andrea.tacchino, michela.ponzio, giampaolo.brichetto}@aism.it
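Machine Learning Pipeline
[Figure: pipeline diagram. Data → Min-Max Scaling → Data Exploration (Principal Components Analysis, Linear Discriminant Analysis) → Problem Setting → ℓ1ℓ2 Feature Selection → Nested list of features → Linear Classification (RLS, LR, SVM; KNN is also evaluated) → Best Model]
Formulas shown in the diagram:
ℓ1ℓ2 Feature Selection: $\|Y - wX\|_2^2 + \mu \|w\|_2^2 + \tau \|w\|_1$, followed on the selected features by $\|Y - \tilde{w}\tilde{X}\|_2^2 + \lambda \|\tilde{w}\|_2^2$
Linear Classification: $f_w(x) = w^T x$
RLS: $\|Y - wX\|_2^2 + \lambda \|w\|_2^2$
LR: $\mathrm{Logit}(Y, f_w) + \lambda \|w\|_2^2$
SVM: $\mathrm{Hinge}(Y, f_w) + \lambda \|w\|_2^2$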
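The three regularized linear classifiers named in the pipeline (RLS, LR, SVM) share the decision function f_w(x) = w^T x and the ℓ2 penalty, and differ only in the loss. The following is a minimal sketch of the three objectives, not the authors' implementation. It assumes labels coded as -1/+1, losses summed over samples, and the usual logistic loss log(1 + exp(-y f_w(x))) for "Logit". The function names and the toy data are hypothetical.

```python
import math

def linear_score(w, x):
    """Decision function f_w(x) = w^T x for one sample x."""
    return sum(wj * xj for wj, xj in zip(w, x))

def l2_penalty(w, lam):
    """Squared l2 norm of w, scaled by the regularization parameter."""
    return lam * sum(wj * wj for wj in w)

def rls_objective(w, X, y, lam):
    """Squared loss plus l2 penalty (RLS)."""
    loss = sum((yi - linear_score(w, xi)) ** 2 for xi, yi in zip(X, y))
    return loss + l2_penalty(w, lam)

def lr_objective(w, X, y, lam):
    """Logistic loss log(1 + exp(-y * f_w(x))) plus l2 penalty (LR)."""
    loss = sum(math.log1p(math.exp(-yi * linear_score(w, xi)))
               for xi, yi in zip(X, y))
    return loss + l2_penalty(w, lam)

def svm_objective(w, X, y, lam):
    """Hinge loss max(0, 1 - y * f_w(x)) plus l2 penalty (linear SVM)."""
    loss = sum(max(0.0, 1.0 - yi * linear_score(w, xi))
               for xi, yi in zip(X, y))
    return loss + l2_penalty(w, lam)

# Hypothetical toy data: one sample per row, 2 features, labels in {-1, +1}.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1, -1, 1]
w = [1.0, -1.0]

for name, objective in [("RLS", rls_objective),
                        ("LR", lr_objective),
                        ("SVM", svm_objective)]:
    print(name, round(objective(w, X, y, lam=0.1), 4))
```

Here each sample is stored as a row of X, so the score is computed per sample rather than with the matrix product wX used in the formulas.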
Multiple Sclerosis PRO and CS data understanding
Using Patient Reported Outcomes (PRO), Clinical Scales (CS) and anthropometric measures, the aim is to learn a statistical model for the classification of MS courses by means of machine learning techniques. The proposed classifier is based only on a meaningful subset of the available features.
Dataset Description
Name     Type      #Items   Description
AGEo     measure   1        Age at the onset of the disease
AGEd     measure   1        Age at the disease diagnosis
AGEv     measure   1        Age at the examination
W        measure   1        Weight [kg]
H        measure   1        Height [cm]
MFIS     PRO       21       Modified Fatigue Impact Scale
HADS     PRO       14       Hospital Anxiety and Depression Scale
LIFE     PRO       11       Life Satisfaction Index
OAB      PRO       8        Overactive Bladder Questionnaire
FIM      CS        19       Functional Independence Measure
MOCA     CS        11       Montreal Cognitive Assessment
PASATT   CS        1        Paced Auditory Serial Addition Task
SDMT     CS        1        Symbol Digit Modality Test

MS Course               Abbreviation   #Patients
Relapsing Remitting     RR             170
Secondary Progressive   SP             205
Primary Progressive     PP             68
Progressive Relapsing   PR             8
Benign                  B              6

In the study, we considered PRO and CS for 457 patients represented by 91 features.

Data Exploration
[Figure: two panels, PCA and LDA]
The data exploration outcome is a binary classification setting: RR vs. ALL.
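The class counts in the dataset description make the imbalance of the RR vs. ALL setting explicit. The sketch below groups the four non-RR courses into ALL and computes the share of the larger class, which is the accuracy obtained by always predicting that class. The counts are copied from the MS Course table. The function name is illustrative.

```python
# Patient counts per MS course, copied from the dataset description.
counts = {"RR": 170, "SP": 205, "PP": 68, "PR": 8, "B": 6}

def rr_vs_all_baseline(counts):
    """Return (n_rr, n_all, majority_share) for the RR vs. ALL setting."""
    n_rr = counts["RR"]
    # Every course other than RR is merged into the ALL class.
    n_all = sum(n for course, n in counts.items() if course != "RR")
    majority_share = max(n_rr, n_all) / (n_rr + n_all)
    return n_rr, n_all, majority_share

n_rr, n_all, share = rr_vs_all_baseline(counts)
print(f"RR: {n_rr}, ALL: {n_all}, majority-class accuracy: {share:.1%}")
# prints "RR: 170, ALL: 287, majority-class accuracy: 62.8%"
```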
ℓ1ℓ2 Feature Selection
The outcome of ℓ1ℓ2 FS is a set of nested lists of relevant features with increasing levels of correlation.
For optimal values of λ and τ, the parameter µ governs the amount of correlation included in the model.
For µ1, the smallest value of µ, ℓ1ℓ2 FS provides a list of discriminant features ranked according to their selection frequency, i.e. how many times each feature was selected in a double cross-validation procedure.

Name       Selection Frequency   Description
LIFE 004   100%                  These are the best years of my life
AGEv       100%                  Age at the examination
FIM 011    100%                  Type of transfer: tub or shower
FIM 012    100%                  Locomotion: walking
FIM 014    100%                  Locomotion: stairs
LIFE 009   88%                   I would not change my past life even if I could
MFIS 020   62%                   I have limited my physical activities
LIFE 005   25%                   Most of the things I do are boring or monotonous
MFIS 014   25%                   I have been physically uncomfortable
HADS 007   25%                   I can sit at ease and feel relaxed

The best features from the sparsest model (µ = µ1) are used as prototypes in a centroid-based clustering procedure, with the Pearson correlation as similarity measure. The features being clustered are those on the list associated with the maximum value of µ (µ = µ8). The outcome is the identification of groups of maximally correlated features.

Linear Classification
A set of linear models is tested in the RR vs. ALL classification scenario. We recall that, since the two classes are not balanced, the accuracy score of a trivial classifier that always predicts the majority class is around 62.8% (287 of 457 patients). Our experiments show that the ℓ1ℓ2 feature selection can significantly improve the performance of the considered linear classifiers.
In the tables below, OLS can be considered a particular case of RLS with λ = 0.
OLS and RLS are the algorithms that benefit most from the ℓ1ℓ2 feature selection step.
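The relation between OLS and RLS can be checked on a one-dimensional toy problem. The sketch below assumes a single feature, no intercept, and the objective sum_i (y_i - w x_i)^2 + λ w^2. Setting its derivative to zero gives w = sum_i x_i y_i / (sum_i x_i^2 + λ). The function name and the data are hypothetical and unrelated to the MS dataset.

```python
def rls_1d(x, y, lam):
    """Minimizer of sum((y_i - w * x_i)**2) + lam * w**2 for a scalar w."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

# Hypothetical toy data lying exactly on the line y = 2x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

w_ols = rls_1d(x, y, lam=0.0)    # lam = 0: plain least squares (OLS)
w_rls = rls_1d(x, y, lam=14.0)   # lam > 0: the weight is shrunk toward zero
print(w_ols, w_rls)  # prints "2.0 1.0"
```

With λ = 0 the penalty vanishes and the estimate is the ordinary least squares slope. Increasing λ only adds to the denominator, so the weight can only move toward zero.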
$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$
$\mathrm{Precision} = \frac{TP}{TP + FP}$
$\mathrm{Recall} = \frac{TP}{TP + FN}$
$F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$

Accuracy (%)
           OLS     RLS     KNN     LR      SVM
NO FS      71.79   72.42   72.44   77.28   75.69
ℓ1ℓ2 FS    78.32   78.24   74.99   77.30   75.82

F1 Score
           OLS     RLS     KNN     LR      SVM
NO FS      0.618   0.623   0.620   0.652   0.634
ℓ1ℓ2 FS    0.701   0.702   0.666   0.623   0.670
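The four scores used in the evaluation can be computed directly from the entries of a binary confusion matrix. The sketch below follows the standard definitions. The counts are hypothetical and are not results from this study, and the function assumes that no denominator is zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumes tp + fp > 0, tp + fn > 0 and precision + recall > 0.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts, for illustration only.
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives, it can rank classifiers differently from accuracy when the two classes are not balanced.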
References
L1L2Signature - http://slipguru.disi.unige.it/Software/L1L2Signature/
C. De Mol, S. Mosci, M. Traskine, and A. Verri, “A regularized method for selecting nested groups of relevant genes from microarray data”. Journal of Computational Biology, vol. 16, no. 5, pp. 677–690, 2009.
A. Barla, S. Mosci, L. Rosasco, and A. Verri, “A method for robust variable selection with significance assessment”. ESANN, 2008, pp. 83–88.
F. D. Lublin and S. C. Reingold, “Defining the clinical course of multiple sclerosis: results of an international survey”. Neurology, vol. 46, no. 4, pp. 907–911, 1996.
C. Granger, A. Cotter, B. Hamilton, R. Fiedler, and M. Hens, “Functional assessment scales: a study of persons with multiple sclerosis”. Archives of Physical Medicine and Rehabilitation, vol. 71, no. 11, pp. 870–875, 1990.