Sampling and Digital Filtering Effects When Recognizing Postural Control with Statistical Tools and the Decision Tree Classifier

ScienceDirect Procedia Computer Science 108C (2017) 129–138
Available online at www.sciencedirect.com

International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland

Luiz H. F. Giovanini 1,2, Simone M. Silva 1,3, Elisangela F. Manffra 1,3, and Julio C. Nievola 1,2

1 Pontifícia Universidade Católica do Paraná, Curitiba, Paraná, Brazil
2 Programa de Pós-Graduação em Informática
3 Programa de Pós-Graduação em Tecnologia em Saúde


Abstract

Postural control can be investigated from time series data of the center-of-pressure (COP) displacements. Detrended fluctuation analysis and scaled windowed variance are commonly employed to measure fractality in the COP signal, while sample entropy and multiscale sample entropy are often used to address its regularity and complexity, respectively. Based on COP data from 19 post stroke adults and 19 healthy matched subjects, we first support previous findings that the sampling and/or the digital filtering of those data may influence the interpretations on postural control provided by such types of metrics. Then, we show evidence that digital filtering leads to less accurate information on the entropy in postural sway with either traditional statistical tools or the decision tree (DT) classifier. Thus, when computing entropy-related features, it is not advisable to filter the data. However, if fractal features are considered instead, the use of digital filters and downsampling techniques can provide more discriminative information. When combining fractal and entropy-related features, both original and processed COP data should be considered for either DT or other popular classifiers. Lastly, with the aid of a DT, we could classify the individuals with an accuracy of 77.6% for fractal features only (best case), 68.4% for entropy-related features only, and 76.3% after combining fractal and entropy-related features.

© 2017 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the International Conference on Computational Science

Keywords: Posture, Center-of-Pressure, Signal Processing, Feature Extraction, Machine Learning

Luiz H.F. Giovanini et al. / Procedia Computer Science 108C (2017) 129–138
ISSN 1877-0509. DOI: 10.1016/j.procs.2017.05.117

1 Introduction

Posturography is a well-known, widely used technique within the motor control field to measure the natural sway exhibited by the human body in quiet stance1. This is usually done with the help of a force platform to obtain time series data of the center-of-pressure (COP) displacements in the anterior-posterior (AP) (i.e., sagittal plane) and medial-lateral (ML) (i.e., frontal plane) directions1. The parameterization of the COP signal provides metrics that can be used as clinical indicators for many useful purposes, including decisions on rehabilitation programs. For example, the scaling exponent provided by detrended fluctuation analysis2 (DFA-α) has been widely adopted to measure long-range correlations (or fractality) in the COP signal, providing information on the regularity (or predictability) of the body sway patterns. Sample entropy (SE) is
another metric commonly used to address regularity in postural control. Thus, the reliability of such metrics and of the interpretations they provide may affect the diagnosis and treatment of people with impaired balance control.

In the literature, a large number of posturographic metrics have been designed not only for the COP displacement (COPd) signal but also for the COP velocity (COPv), which is indeed the most accurate form of sensory information used to maintain the upright stance3. They are mainly organized into magnitude metrics, related to the overall amount of postural sway, and structural metrics, associated with the temporal patterns in the sway dynamics1. In the motor control community, there is considerable interest in finding metrics able to characterize and distinguish the underlying mechanisms of postural control of different populations. Over the last decades, this has been done with traditional statistical tools, where the candidate metrics are individually compared across groups or balance tasks1. More recently, however, studies in the machine learning field have successfully employed supervised classification methods for this purpose (see, e.g., ref.4), which make it possible to combine multiple clinical indicators to perform pattern recognition.

There is a lot of variation in how the COP data are acquired and processed in the literature1, mainly in terms of sampling frequency and digital filtering, which may affect the structures of the signal5. Accordingly, ref.5 has shown that structural metrics like DFA-α and SE are sensitive to those processing techniques, especially when they are computed from COPv rather than COPd. In fact, the statement “preprocessing tools may impact downstream conclusions” is well known in the signal processing community and applies to many other databases and research fields beyond postural control.
Even so, this is a very important matter, as it helps to build guidelines on how much the results actually reflect the observed phenomenon and how much they are influenced by processing techniques. Thus, it is important to highlight some limitations observed in ref.5. First of all, the conclusions were drawn from a small sample of eight young healthy subjects performing only a single balance task, and only COP data in the AP direction were analyzed. Furthermore, critical remarks on DFA-α were given in ref.6, which concluded that it is always important to confirm the results with another method. Hence, to the best of our knowledge, further investigations are needed to provide more reliable directions on how acquisition and processing tools may affect the interpretations reached with structural metrics. Additionally, even knowing that the sampling and the digital filtering of the COP data may influence the conclusions on postural control5, it remains unclear whether original or processed data are better to distinguish the postural strategies adopted by different populations, which poses a big challenge in the motor control field.

Therefore, using statistical tools and a machine learning model, our main purpose in this paper is to investigate the effects of those processing tools on recognizing healthy and post stroke individuals based on structural metrics. Stroke is the leading cause of serious, long-term disabilities in the US and figured as the second leading cause of death throughout the world in 2000 (5.7 million deaths) and 2012 (6.7 million deaths)7. Based on four structural metrics, two fractal and two entropy-related, extracted from COP-AP and COP-ML data of 19 post stroke adults and 19 healthy matched subjects performing two quiet standing balance tasks, this paper makes the following contributions.
First, we show that unfiltered data are better suited to address the entropy in postural control with either statistical tools or the decision tree (DT) classifier. Thus, when computing only entropy-related features from COP time series, it is not advisable to process them. However, things change when using fractal metrics instead. In this case, if statistical tools are considered, our findings encourage one to apply digital filtering and downsampling to the data, whereas both original and processed versions of the COP signals are relevant for the DT model. When combining fractal and entropy-related features, it is also advisable to consider both original and processed COP data for either DT or other popular classifiers. With a DT learned from entropy-related features, we established a cut-off score that allows one to distinguish between healthy and stroke individuals with 68.4% accuracy. We also provide a DT based on fractal metrics able to recognize such groups with 77.6% accuracy. Lastly, we show that the DT is competitive with other popular machine learning methods in distinguishing healthy from stroke sway profiles.

The rest of the paper is organized as follows. First, we describe and motivate our experiments in Section 2. Then, we present and discuss our results in Section 3. Finally, we conclude and discuss future work in Section 4.




2 Methods

2.1 Database

With the approval of the Ethics Committee of PUCPR (permission no. 991.103), we used the posturographic dataset collected by ref.8 from 19 stroke patients (55.1 ± 6.7 years old) and 19 healthy matched subjects (53.6 ± 5.9 years old). The volunteers stood upright and barefoot on a force platform (AMTI OR6-7), as still and symmetrically as possible, in two quiet standing tasks: with eyes open (EO) and eyes closed (EC). COPd in the AP and ML directions was recorded for 60 s in each task at a sampling frequency of 100 Hz, which can properly capture the multiple scales of the postural control dynamics9. Next, to reproduce configurations commonly adopted by related studies5, our dataset was downsampled and/or filtered (dual-pass, 2nd-order, 10 Hz low-pass Butterworth), thus creating six different COPd time series for each individual in each condition: original, downsampled to 50 Hz, and downsampled to 25 Hz, each in unfiltered (original) and filtered versions. Lastly, COPv data were computed as the first derivative of the COPd signals1.
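The unfiltered half of this pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say how the downsampling or the derivative was implemented, so plain decimation (keeping every k-th sample) and forward differences are assumed here, the function names are hypothetical, and the Butterworth filtering step is deliberately left out.

```python
def downsample(signal, factor):
    """Keep every `factor`-th sample (plain decimation; an assumption)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return signal[::factor]


def velocity(displacement, fs):
    """Approximate the first derivative with forward differences (units per second)."""
    return [(b - a) * fs for a, b in zip(displacement, displacement[1:])]


FS_ORIGINAL = 100  # Hz, sampling frequency of the recordings
DURATION_S = 60    # s, length of each trial

# Synthetic ramp standing in for a real COPd trace: it moves 1 unit per second.
copd_100 = [i / FS_ORIGINAL for i in range(FS_ORIGINAL * DURATION_S)]

# Unfiltered COPd and COPv at the three sampling frequencies used in the paper.
versions = {}
for fs in (100, 50, 25):
    copd = downsample(copd_100, FS_ORIGINAL // fs)
    versions[fs] = {"copd": copd, "copv": velocity(copd, fs)}
```

With a 60 s recording this yields 6000, 3000, and 1500 points at 100 Hz, 50 Hz, and 25 Hz, which are the series lengths that the window choices in Section 2.2 have to accommodate.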

2.2 Data Parameterization

Four well-known structural metrics were analyzed in this paper: the scaling exponent from detrended fluctuation analysis2 (DFA-α), the Hurst exponent from scaled windowed variance10 (SWV-H), sample entropy11 (SE), and multiscale sample entropy12 (MSE). Importantly, another study13 has shown that DFA works well on fractional Gaussian noise (fGn), while SWV is more appropriate when analyzing fractional Brownian motion (fBm). In this context, the COPd dynamics can be modeled as an fBm, whereas the COPv structures can be described as an fGn13. Therefore, we computed DFA-α only from COPv data, and SWV-H only from COPd data. Another key point is that entropy estimations are influenced by the time series length, which should be substantially larger than the scales under study to ensure good results14. Because of this, we computed SE and MSE only from COPd and COPv time series recorded at 100 Hz, considering both unfiltered and filtered versions. All the signal processing in this paper was performed using Matlab R2013b15.

Based on the fractality of the COP signal, the DFA-α and SWV-H metrics have been used to address the regularity in postural control5,16. These metrics provide information on the persistence (0.5 < α ≤ 1.0; H > 0.5) or anti-persistence (0 < α < 0.5; H < 0.5) of a given time series, where the larger the metric, the “smoother” the signal. Reference values have been reported for uncorrelated data (e.g., white noise) (α = 0.5; H = 0), Brown noise (α = 1.5; H = 0.5), and 1/f noise (α = 1)16,17. The application of both DFA and SWV methods involves the segmentation of the time series into non-overlapping windows over multiple passes, using a set of window lengths w = wmin, …, wmax that must be carefully chosen6,10. Following published guidelines10, we performed DFA on COP data recorded at 100 Hz using wmin = 5 data points (0.05 s) and wmax = 1000 (10 s), whereas SWV was computed for wmin = 4 (0.04 s) and wmax = 1200 (12 s).
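To make the windowing concrete, here is a minimal sketch of textbook first-order DFA in plain Python. It is not the authors' Matlab implementation: the function names are hypothetical, the input is synthetic white noise standing in for a COPv trace, and restricting the window lengths to those that tile the 6000-point series exactly is an assumption of this sketch.

```python
import math
import random


def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_alpha(series, window_lengths):
    """Scaling exponent alpha from first-order (linear) DFA."""
    mean = sum(series) / len(series)
    profile, running = [], 0.0
    for value in series:                  # integrate the mean-centred series
        running += value - mean
        profile.append(running)

    log_w, log_f = [], []
    for w in window_lengths:
        t = list(range(w))
        n_windows = len(profile) // w     # non-overlapping windows
        residual = 0.0
        for k in range(n_windows):
            segment = profile[k * w:(k + 1) * w]
            slope, intercept = linear_fit(t, segment)   # local linear trend
            residual += sum((y - (slope * x + intercept)) ** 2
                            for x, y in zip(t, segment))
        log_w.append(math.log(w))
        # log of the root-mean-square fluctuation F(w)
        log_f.append(0.5 * math.log(residual / (n_windows * w)))
    alpha, _ = linear_fit(log_w, log_f)   # slope of log F(w) against log w
    return alpha


# Synthetic stand-in for a 60 s COPv trace sampled at 100 Hz (6000 points).
random.seed(0)
white_noise = [random.gauss(0.0, 1.0) for _ in range(6000)]

# Window lengths from 5 to 1000 points that divide the series length exactly.
windows = [w for w in range(5, 1001) if len(white_noise) % w == 0]
alpha_noise = dfa_alpha(white_noise, windows)
```

For uncorrelated noise the estimate should land near the reference value α = 0.5 quoted above, and for a random walk near 1.5, which makes the sketch easy to sanity-check before applying it to real data.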
The remaining data were all processed by both methods using the same parameters: w = 5 (0.1 s), …, 600 (12 s) and w = 4 (0.16 s), …, 500 (20 s) for signals downsampled to 50 Hz and 25 Hz, respectively. We considered all window lengths between wmin and wmax for which the signals could be segmented into an integer number of non-overlapping windows. As SWV yields better results for detrended time series10, bridge detrending was applied to the data recorded at 100 Hz, while the remaining signals were all processed using linear detrending. This choice was made based on our time series lengths, as suggested by ref.10.

Similarly to the fractal metrics, SE has been successfully employed to describe the regularity of the postural control system, where the larger the value, the less structured the body sway patterns within the COP signal13,16,18. Briefly, SE computes the negative natural logarithm of the conditional probability that sequences similar for m data points remain similar, within a tolerance r, after adding one more point (m+1) to them11. Complementarily, MSE has been used to address the complexity of COP dynamics over multiple timescales19,20. To this end, SE is calculated for consecutive coarse-grained time series and then plotted as a function of scale, which is known as the MSE curve. Then, a complexity index is usually computed by summing up the entropy values, as proposed by ref.12. Prior to the analyses, both nonstationarities and long-range correlations were removed from our data because they may mask the complexity of a time series19. For this purpose, as
suggested in ref.19, the data were detrended by subtracting, from each COP signal, the 5 intrinsic mode functions of lowest frequency, whose predominant energy typically ranges from 0.05 to 1 Hz. Then, after normalizing the time series to unit variance, SE was computed taking m = 2 and r = 0.15 for COPd data19,20 and m = 2 and r = 0.55 for COPv data13. Next, the complexity index was calculated for each signal by examining temporal structures ranging from 0.03 s (33 Hz) for scale 1 to 0.3 s (3 Hz) for scale 10, as in ref.19. In this case, the removal of frequencies below 1 Hz by the detrending process ensures that no relevant information is lost for SE and MSE analyses. Furthermore, the choice of m = 2 and a maximum scale factor of 10 is in line with a rule of thumb reported for SE computation, which suggests that one needs about 10^m to 20^m data points for good results14.
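The SE and MSE definitions above can be sketched in plain Python as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the empirical mode decomposition detrending is omitted, the tolerance r is applied as an absolute value to a series already scaled to unit variance, and returning infinity when no template matches are found is a convention of this sketch only.

```python
import math
import random
import statistics


def sample_entropy(series, m=2, r=0.15):
    """SampEn(m, r) of a series that is already normalized to unit variance."""
    n = len(series)
    matches_m = 0    # template pairs similar over m points
    matches_m1 = 0   # pairs that stay similar over m + 1 points
    for i in range(n - m):
        for j in range(i + 1, n - m):
            if all(abs(series[i + k] - series[j + k]) <= r for k in range(m)):
                matches_m += 1
                if abs(series[i + m] - series[j + m]) <= r:
                    matches_m1 += 1
    if matches_m == 0 or matches_m1 == 0:
        return float("inf")  # undefined at this tolerance (sketch convention)
    return -math.log(matches_m1 / matches_m)


def coarse_grain(series, scale):
    """Average consecutive, non-overlapping blocks of `scale` points."""
    n_blocks = len(series) // scale
    return [sum(series[k * scale:(k + 1) * scale]) / scale
            for k in range(n_blocks)]


def complexity_index(series, max_scale=10, m=2, r=0.15):
    """Sum of SampEn over scales 1..max_scale (area under the MSE curve)."""
    return sum(sample_entropy(coarse_grain(series, s), m, r)
               for s in range(1, max_scale + 1))


def to_unit_variance(series):
    """Center the series and scale it to unit variance."""
    mean = statistics.mean(series)
    sd = statistics.pstdev(series)
    return [(x - mean) / sd for x in series]


# Synthetic examples: an irregular signal and a perfectly regular one.
random.seed(1)
noise = to_unit_variance([random.gauss(0.0, 1.0) for _ in range(400)])
sine = to_unit_variance([math.sin(2 * math.pi * i / 50) for i in range(400)])
se_noise = sample_entropy(noise)
se_sine = sample_entropy(sine)
```

The regular sine yields a lower SE than the noise, matching the interpretation that larger values mean less structured patterns. The pairwise loop is quadratic in the series length, so a 6000-point COP series at ten scales would call for a vectorized implementation; for COPv the paper's r = 0.55 would be passed instead of the default.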

2.3 Experiments

Two main experiments were performed in this paper. For the first one, we put all the fractal metrics together in a single dataset, while a second dataset with all the entropy-related metrics was composed for the second experiment. Both experiments consisted of analyzing such datasets using first traditional statistical tools and then a machine learning model. With the statistical tools, we aimed at assessing the impact of data processing techniques on the interpretations of postural control, as well as identifying the most discriminative metrics to distinguish our healthy from stroke volunteers in each processing condition. Then, with the machine learning approach, we focused on investigating how metrics resulting from different processing tools would be combined to recognize the postural strategies adopted by those individuals. Lastly, to investigate the robustness of our main approach based on DT against other methods, we also performed a third experiment evaluating the performance of multiple classifiers on a dataset comprising all the fractal and entropy metrics.

2.3.1. Traditional statistical analysis

Since our fractal and entropy-related metrics did not show a normal distribution (Shapiro-Wilk test, 95% confidence), the Mann-Whitney U test was used for the comparisons. For the first experiment, in order to investigate the impact of the sampling and digital filtering of the COP data, the DFA-α and SWV-H mean values were first compared across data recorded at 100 Hz, 50 Hz, and 25 Hz. Then, they were compared across filtered and unfiltered data. After that, to identify the best fractal metric for distinguishing our groups, those results were compared across health status over all processing conditions. Next, for the second experiment, we compared the SE and MSE mean values across data processing techniques to check the influence of the digital filtering of COP data.
Finally, those results were compared across health status to identify which condition, filtering or not filtering the data, provides more discriminative information on the groups.

2.3.2. Machine learning approach

For the first and second experiments of this paper, we decided to use the decision tree (DT) model due to its capability of providing a set of classification rules that can be visually interpreted21. For the third experiment, beyond DT, we also considered five other popular state-of-the-art methods: multilayer perceptron (MLP), support vector machines (SVM), k-nearest neighbors (kNN), naïve Bayes (NB), and Random Forest (RF). These models were learned using the WEKA environment22 with the following algorithms, respectively: C4.5, MultilayerPerceptron (10 thousand training epochs, validation set size of 10%), LibSVM (with the cost set to 10), IBk (for k = 1, 3, and 5), NaiveBayes, and RandomForest, all otherwise run with default parameters. As we have a small number of examples (38 “healthy” and 38 “post stroke”, from 19 subjects per group × 2 tasks), the performance of each model was assessed using the leave-one-out technique over 10 repetitions, with accuracy (recognition rate), sensitivity, and specificity adopted as performance metrics. Since the results of the third experiment did not show a normal distribution (Shapiro-Wilk test, 95% confidence), we used the Mann-Whitney U test to compare the performance metrics obtained from DT against those resulting from the other methods. Importantly, before training the classifiers, we first performed feature selection in all experiments using the Correlation-based Feature Selection method. Besides removing redundancies among our features, it helped us to obtain a standard feature set for evaluating multiple classifiers in the third experiment.
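The evaluation protocol can be illustrated with a small leave-one-out loop in Python. The classifier here is a single-threshold decision stump, which is only a stand-in for WEKA's C4.5 and not the model used in the paper; the function names and the feature values are hypothetical toy data, and “stroke” is treated as the positive class.

```python
def fit_stump(values, labels):
    """Pick the threshold on one feature that minimizes training errors.

    The stump predicts "stroke" when value <= threshold, "healthy" otherwise.
    """
    best_threshold, best_errors = None, None
    for threshold in sorted(set(values)):
        errors = sum(
            (label == "stroke") != (value <= threshold)
            for value, label in zip(values, labels)
        )
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def leave_one_out(values, labels):
    """Return (accuracy, sensitivity, specificity) with "stroke" as positive."""
    tp = tn = fp = fn = 0
    for i in range(len(values)):
        # Train on every instance except the i-th, then test on the i-th.
        train_values = values[:i] + values[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        threshold = fit_stump(train_values, train_labels)
        predicted = "stroke" if values[i] <= threshold else "healthy"
        if labels[i] == "stroke":
            tp += predicted == "stroke"
            fn += predicted != "stroke"
        else:
            tn += predicted == "healthy"
            fp += predicted != "healthy"
    accuracy = (tp + tn) / len(values)
    return accuracy, tp / (tp + fn), tn / (tn + fp)


# Toy feature values (not from the paper), lower for the "stroke" class.
toy_values = [0.55, 0.58, 0.60, 0.62, 0.70, 0.72, 0.75, 0.78]
toy_labels = ["stroke"] * 4 + ["healthy"] * 4
accuracy, sensitivity, specificity = leave_one_out(toy_values, toy_labels)
```

Even on separable toy data the held-out instance nearest the class boundary can be misclassified, because the threshold is refit without it. One caveat under the paper's setup is that each subject contributes two instances (EO and EC), so holding out a single instance still leaves the other task of the same subject in the training set.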




3 Results and Discussion

3.1 First Experiment

The effects of downsampling and digital filtering on the DFA-α and SWV-H estimations can be visually observed in Fig. 1 for COP-ML from the EO task (the AP direction and the EC task displayed qualitatively similar results). The mean values of such metrics are presented and compared across health status in Table 1 for COP-AP data from the EO task. Metrics from COP-ML did not show any statistical difference, and the EC task showed statistically equivalent results. Finally, Fig. 2 (left panel) displays the DT model that was learned from the two metrics retained after the feature selection step: DFA-α from filtered COPv-AP data downsampled to 25 Hz, and SWV-H from original (unfiltered, 100 Hz) COPd-AP data. This model was able to distinguish healthy from stroke individuals with 77.6% accuracy, 76.3% sensitivity (“stroke” class), and 78.9% specificity (“healthy” class).
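As a consistency check, the three rates can be reproduced from whole-number counts, assuming they are computed over the 76 instances (38 per class) described in Section 2.3.2, with “stroke” as the positive class. The counts TP = 29 and TN = 30 are inferred here from the reported percentages; they are not given in the paper.

```latex
\begin{align*}
\text{sensitivity} &= \frac{TP}{TP + FN} = \frac{29}{38} \approx 0.763,\\
\text{specificity} &= \frac{TN}{TN + FP} = \frac{30}{38} \approx 0.789,\\
\text{accuracy} &= \frac{TP + TN}{76} = \frac{29 + 30}{76} = \frac{59}{76} \approx 0.776.
\end{align*}
```

Under this reading, the model misclassifies 9 stroke instances and 8 healthy instances.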

[Fig. 1 plot: four panels (Healthy, COPv; Stroke, COPv; Healthy, COPd; Stroke, COPd) showing DFA-α (COPv panels) and SWV-H (COPd panels) against sampling frequency (100 Hz, 50 Hz, 25 Hz).]

Fig. 1. Mean values of DFA-α from COPv and SWV-H from COPd data for unfiltered (black dashed line) and filtered (gray solid line) versions. The error bars denote the standard deviation. Asterisks placed above or below an error bar of a specific sampling frequency denote a significant difference (p < .05) from the other two sampling frequencies for filtered or unfiltered data, respectively. Stars placed between error bars denote significant difference (p < .05) between filtered and unfiltered data for a specific sampling frequency.

                         Unfiltered COP data               Filtered COP data
Metric        Group      100 Hz    50 Hz     25 Hz         100 Hz     50 Hz     25 Hz
DFA-α (COPv)  Stroke     .81±.08   .74±.09   .63±.09       .97±.1     .8±.1     .64±.09
              Healthy    .78±.09   .78±.08   .7±.07        1.03±.08   .87±.08   .72±.07
              p-value    .466      .122      .002          .029       .007      .001
SWV-H (COPd)  Stroke     .8±.07    .75±.08   .64±.09       1.02±.09   .82±.1    .66±.09
              Healthy    .76±.09   .77±.09   .71±.07       1.06±.08   .89±.08   .73±.07
              p-value    .153      .414      .090          .009       .023      .004

Table 1. Statistical comparison among mean values ± standard deviations of DFA-α and SWV-H from COP-AP data at EO task across health status. Comparisons with p < 0.05 are marked in bold.

Fig. 2. Decision tree model learned in the first (left panel) and second (right panel) experiments of this paper.

3.1.1. Statistical analysis: downsampling and filtering effects

In this paper, except for the unfiltered data of healthy subjects, the downsampling of the COP signal from 100 Hz to 25 Hz yielded a statistically significant reduction of the DFA-α and SWV-H mean values toward more random, less structured patterns of postural sway (see Fig. 1). These findings are consistent with the previous assertion that downsampling may lead to an interpretation of less stability in the balance control process5,23. This was expected for DFA-α from the COPv signal, which has enhanced spectral components at the higher frequencies due to the differentiation process13. However, the COPd time series comprise frequencies up to 1 Hz only13, so these findings are surprising in terms of SWV-H. Our results may also support previous reports that 100 Hz is a suitable sampling frequency to record the COP trajectories9.

Digital filtering of the data is a common procedure intended to remove eventual artifacts unassociated with the motor control process1. On the other hand, it may also modify the structures within the signal by removing components that are actually inherent to the postural control and/or by adding artificial deterministic components24. Our findings show that both
DFA-α and SWV-H are sensitive to such effects for data recorded at 100 Hz and 50 Hz (see Fig. 1), suggesting more regular and structured patterns of variability than unfiltered data, which would be interpreted as increased postural stability23. These results were expected for DFA-α from COPv data, where the filtering at 10 Hz helps to “smooth” the signal by attenuating its high-frequency components. Moreover, the lower the sampling frequency, the smaller the effect of the filtering in this case, since downsampling already discards most of the content above 10 Hz, in accordance with the Nyquist theorem (the representable band ends at 25 Hz for 50 Hz sampling and at 12.5 Hz for 25 Hz sampling). However, these findings were not expected for SWV-H, since the COPd signal remains qualitatively similar after being filtered at 10 Hz5. Hence, based on our sampling- and filtering-related results, we conclude that the SWV method is highly influenced by the higher spectral components of the time series. Also, our results do not support the suggestion of ref.5 that fractal metrics computed from COPd are more robust to digital filtering than those extracted from COPv.

3.1.2. Discriminative metrics with statistical tools

For the unfiltered COP data, both fractal measures were discriminative only after the downsampling to 25 Hz (see Table 1). Using filtered data, however, SWV-H was discriminative for 25 Hz and 50 Hz, whereas DFA-α was able to distinguish the groups over all sampling frequencies. Thus, when computing fractal metrics from COP signals of healthy and stroke adults, it is advisable to filter and downsample the data. Besides showing that COPv time series are slightly superior to COPd, these results also suggest that DFA and SWV have similar capabilities to handle unfiltered COP data, but DFA is superior when dealing with filtered data. In this paper, only COP-AP time series provided fractal metrics able to distinguish the body sway patterns of our groups.
This is consistent with previous reports that COP-AP is more discriminative than COP-ML when assessing postural control with sensorial manipulations, such as visual deprivation25. In accordance with related works16,20, our healthy volunteers yielded statistically higher DFA-α and SWV-H values than the stroke individuals, thus suggesting more regular and structured patterns of postural variability, just as expected. 3.1.3. DT model First of all, in a similar fashion to the statistical analysis, only metrics obtained from COP-AP data were chosen in the feature selection step. However, while the selected DFA-α from filtered, downsampled (to 25 Hz) COPv-AP data outperformed all other metrics in that analysis by showing the lowest p-value, the selected SWV-H from original COPd-AP time series not even distinguished the groups with statistical significance (see Table 1). In other words, when analyzing fractal features with a DT rather than statistical tools, one should consider both original and processed versions of the COP signal rather than processed versions only. As can be seen in Fig 2 (left panel), the selected DFA-α was took as the most discriminative feature and placed at the top of the DT model. This means that the COPv information is slightly better than COP position, as observed in the statistical analysis. When such feature is equal or lower than 0.63, which means less structured patterns of body sway, most of instances are classified as “stroke”, whereas they are mainly recognized as “healthy” otherwise (see Fig. 2, left panel). These rules are in line with the traditional interpretations provided by DFA-α on the regularity in postural control.
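The root split described above amounts to a single threshold rule. The following is a minimal sketch, assuming only the 0.63 cut-off stated in the text; the function name and the sample values are hypothetical, and any further splits of the tree in Fig. 2 are not reproduced.

```python
DFA_ALPHA_CUTOFF = 0.63  # cut-off stated in the text for the root split


def classify_by_dfa_alpha(dfa_alpha: float) -> str:
    """Return the majority class on each side of the root split."""
    return "stroke" if dfa_alpha <= DFA_ALPHA_CUTOFF else "healthy"


# Hypothetical DFA-alpha values, for illustration only.
for value in (0.55, 0.63, 0.80):
    print(f"DFA-alpha = {value:.2f} -> {classify_by_dfa_alpha(value)}")
```

Because the rule returns only the majority class of each branch, it will misclassify the minority instances that fall on either side of the threshold.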




3.2 Second Experiment

A statistical comparison of the mean SE and MSE results across the healthy and stroke groups is shown in Fig. 3 for COP-ML data from the EC motor task (our best scenario), which also shows the effects of filtering the signals. COP-AP time series and the EO task displayed qualitatively (but not statistically) similar results in terms of both health status and filtering effects. A full statistical comparison across health status and filtering condition is presented in Table 2. Regarding the machine learning analysis, four metrics were chosen in the feature selection step: (i) SE from filtered COPd-AP, (ii) SE from filtered COPv-ML, (iii) SE from unfiltered COPv-ML, and (iv) MSE from unfiltered COPv-AP data. Fig. 2 (right panel) displays the DT model learned from these features, which classified our healthy and stroke volunteers with 68.4% accuracy, 42.1% sensitivity (“stroke” class), and 94.7% specificity (“healthy” class).
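The three rates follow from a confusion matrix in which “stroke” is the positive class. The sketch below is illustrative only: the counts are hypothetical, chosen because 19 individuals per group happen to reproduce the reported percentages; the actual group sizes are not restated in this section.

```python
def classification_rates(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: stroke individuals classified correctly/incorrectly.
    tn/fp: healthy individuals classified correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # share of stroke cases recognized
    specificity = tn / (tn + fp)   # share of healthy cases recognized
    return accuracy, sensitivity, specificity


# Hypothetical counts (assumption): 8 of 19 stroke and 18 of 19 healthy correct.
acc, sens, spec = classification_rates(tp=8, fn=11, tn=18, fp=1)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
```

With balanced groups, accuracy is the mean of sensitivity and specificity, which is why a model that recognizes almost all healthy volunteers can still reach 68.4% accuracy while missing more than half of the stroke volunteers.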

[Fig. 3 panels: SE from COPd, SE from COPv, MSE from COPd, MSE from COPv. Each panel compares unfiltered (Unfilt.) and filtered (Filt.) data for the stroke and healthy groups.]

Fig. 3. Mean values of SE and MSE from COPd and COPv data in ML direction for healthy and stroke individuals performing the EC task. The error bars denote the standard deviation. The asterisks or stars placed above connecting lines denote significant difference between the two connected conditions with p < 0.05 or p < 0.01, respectively.

Stroke vs. healthy individuals

             EO task                    EC task
             AP           ML            AP           ML
             UF    F      UF    F       UF    F      UF    F
SE (COPd)    ns    ns     ns    ns      ns    ns     *     ns
SE (COPv)    ns    ns     *     ns      ns    ns     [illegible]  ns
MSE (COPd)   ns    ns     ns    ns      ns    ns     ns    ns
MSE (COPv)   ns    ns     ns    ns      ns    ns     ns    ns

Unfiltered vs. filtered data

             EO task                    EC task
             AP           ML            AP           ML
             S     H      S     H       S     H      S     H
SE (COPd) and SE (COPv): * ✫ ✫ ✫ ✫ ✫ ✫ ✫ ✫ (only 9 of the 16 entries are legible)
MSE (COPd)   ns    ns     ns    ns      ns    ns     ns    ns
MSE (COPv)   ✫     ✫      ✫     ✫       ✫     ✫      ✫     ✫

ns denotes not significant (p > 0.05), * denotes p < 0.05, and ✫ denotes p < 0.01

Table 2. Statistical values for SE and MSE resulted from unfiltered (UF) and filtered (F) data across health status, as well as for those resulted from stroke (S) and healthy (H) groups across data processing condition.

3.2.1. Statistical analysis: digital filtering effects

As can be seen in Fig. 3 and Table 2, except for the MSE from COPd, the digital filtering of the data yielded a statistically significant decrease in our mean SE and MSE results, thus suggesting more regular, less complex patterns of postural sway, as observed for our fractal metrics. Similarly, these findings support the view that digital filters may remove and/or modify some components of the COP signal that are actually rooted in the postural control dynamics [5, 24]. Such results were expected for COPv, since filtering at 10 Hz gives the signals a more deterministic behavior by “smoothing” them. Concerning COPd, as we examined SE for temporal structures of 0.03 s (33 Hz), the high-frequency components of the time series were considered during the computations, so the digital filtering affected the results by attenuating these components. On the other hand, with the scaling process in MSE, we examined progressively slower temporal patterns up to 0.3 s (3 Hz). Moreover, 95% of the sway energy in COPd comprises frequencies up to 1 Hz only [13], which is well below the cut-off frequency of our filter (10 Hz). Thus, while sensitive at the lower time scales, SE was robust to the filtering at the higher scales considered in this paper. With these effects averaged over the scales, the MSE results from COPd data remained unaffected by the filtering process.
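For readers less familiar with the two measures, the sketch below computes sample entropy following the definition of Richman and Moorman (ref. 11) and the coarse-graining step of multiscale entropy (ref. 12). It is a plain-Python illustration, not the authors' routines: m = 2 and r = 0.2 times the standard deviation are common defaults assumed here, and the test signals are synthetic.

```python
import math
import random
import statistics


def sample_entropy(x, m=2, r=None):
    """SampEn = -ln(A / B): B and A count template pairs that match
    for m and m + 1 points (Chebyshev distance <= r), self-matches excluded."""
    if r is None:
        r = 0.2 * statistics.pstdev(x)  # assumed default tolerance
    n = len(x)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for too few matches
    return -math.log(a / b)


def coarse_grain(x, scale):
    """Average consecutive, non-overlapping windows of `scale` samples."""
    n_windows = len(x) // scale
    return [sum(x[i * scale:(i + 1) * scale]) / scale for i in range(n_windows)]


def multiscale_entropy(x, max_scale, m=2):
    """SampEn of each coarse-grained series; r is fixed from the original series."""
    r = 0.2 * statistics.pstdev(x)
    return [sample_entropy(coarse_grain(x, s), m, r) for s in range(1, max_scale + 1)]


# Synthetic signals (assumption): a regular sine wave and white noise.
rng = random.Random(0)
regular = [math.sin(2 * math.pi * i / 25) for i in range(200)]
irregular = [rng.gauss(0.0, 1.0) for _ in range(200)]
print("SampEn regular  :", sample_entropy(regular))
print("SampEn irregular:", sample_entropy(irregular))
print("MSE irregular   :", multiscale_entropy(irregular, max_scale=3))
```

Scale 1 of the multiscale curve is the ordinary sample entropy, so smoothing that removes fast fluctuations mainly affects the lowest scales, while the larger scales average over longer windows.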


3.2.2. Discriminative metrics with statistical tools

In this paper, unlike SE, the MSE results were not able to distinguish healthy from stroke individuals in any situation (see Table 2). In other words, for our sample, COP regularity outperformed COP complexity in capturing the decrease in healthiness caused by stroke. However, the groups were successfully distinguished only by SE from unfiltered time series. Therefore, when extracting entropy features from COP data, digital filtering should not be applied. This is in line with a previous report that the use of digital filters may provide a skewed view of the postural control process [5]. It is also important to remark that the discriminative SE results were obtained from COP-ML signals only, which contrasts with previous findings that COP-ML is less discriminative than COP-AP when assessing balance control with sensory manipulations, such as visual deprivation [25]. Furthermore, lower p-values were found for SE from COPv time series, supporting the previous assertion that velocity can better detect physiological- or visual-related changes in postural control [26].

Over the last years, different interpretations of postural control have been proposed based on SE analysis [13, 14, 27]. On the one hand, more irregular sway patterns (high SE values) can be associated with more efficient balance control and taken as a sign of healthiness, whereas injured physiological systems may be rigid or regular (low SE values), indicating ineffective motor control strategies. On the other hand, irregularity can be linked with unstructured systems unable to handle new postural challenges. Our SE results obtained from COPd time series (see Fig. 3 and Table 2) fit the former situation, supporting the view that post stroke individuals are less able to explore the phase space than healthy subjects. Related works have also found lower SE values for post stroke adults [16], elderly fallers [19], and children with cerebral palsy [27]. However, the results derived from COPv data are in line with the second interpretation, in which an increase in irregularity indicates a decrease in healthiness. Similarly, ref. 20 reported higher SE values for older people compared to young subjects.

3.2.3. DT model

Out of the four metrics chosen in the feature selection step, only the SE from unfiltered COPv-ML data was selected during the learning process (see Fig. 2, right panel). Hence, when recognizing healthy and stroke individuals with a DT model and entropy-related features, digital filtering of the COP time series is not recommended. This is in line with the results observed in the statistical analysis, but contrasts with our previous findings for fractal metrics, where both original and processed data were relevant. Furthermore, while our results support that COP velocity is more accurate than COP position for balance assessment [3], they contradict previous reports that COP-ML is less discriminative than COP-AP regarding the capabilities of postural control [25]. The latter was also observed in this paper during the traditional statistical analysis of our SE results. As shown in Fig. 2 (right panel), the DT provided a cut-off score to distinguish healthy from stroke sway patterns based on the SE from unfiltered COPv-ML data. When it is equal to or lower than 0.96, the majority of the instances are recognized as “healthy”, whereas they are mainly labeled as “stroke” otherwise. These rules are consistent with the traditional interpretation that more regular COP fluctuations (low SE values) may indicate better conditions of postural control [13]. Importantly, this cut-off score recognizes healthy people considerably more accurately than stroke people, since we observed 42.1% sensitivity against 94.7% specificity. This could be explained by the fact that our sample comprises post stroke volunteers with different levels of disability, so the less impaired ones may have performed the tasks with capabilities similar to those of the healthy ones.

3.3 Third Experiment

First of all, five metrics were taken in the feature selection step: (i) DFA-α from filtered COPv-AP data downsampled to 25 Hz, (ii) SWV-H from original COPd-AP, (iii) SE from unfiltered COPv-ML, (iv) SE from filtered COPd-AP, and (v) SE from filtered COPv-ML, but only the first three were chosen by the DT model during the learning process. These results are consistent with those from our previous experiments, suggesting that both original and processed versions of the COP data are relevant when combining fractal and entropy-related metrics for DT analyses. Table 3 displays the accuracy, sensitivity, and specificity of the classification models learned from the selected features. As can be seen, the RF model was the only one to significantly exceed the accuracy and the specificity reached with the DT approach. Most importantly, in terms of sensitivity, the DT was competitive with or superior to the other methods, which means it is able to properly recognize post stroke sway profiles. These results are encouraging because, compared to the other models, DT has the advantage of providing a set of rules that are easily interpretable by any professional (see Fig. 2), which is particularly interesting for real-world applications in healthcare and medicine [21].

Classification model   Accuracy (%)   Sensitivity (%)   Specificity (%)
RF                     83.0*          80.8              85.3*
DT                     76.3           76.3              76.3
k-NN (k = 5)           76.3           71.1              81.6
MLP                    75.5           71.6              79.5
k-NN (k = 3)           75.0           68.4†             81.6
NB                     72.4           71.1              73.7
k-NN (k = 1)           68.4†          73.7              63.2†
SVM                    63.2†          78.9              47.4†

* denotes a statistically (p < 0.05) higher value compared to the result from DT model
† denotes a statistically (p < 0.05) lower value compared to the result from DT model

Table 3. Mean values of accuracy, sensitivity, and specificity of the classifiers in the third experiment.
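When reading Table 3, it helps to separate values that are numerically higher than the DT result from those flagged as statistically higher. The sketch below transcribes the table and lists both sets for a given metric; the data structure and function name are illustrative and not part of the original study.

```python
METRICS = ("accuracy", "sensitivity", "specificity")

# (value in %, flag) per metric, transcribed from Table 3.
# "*" = significantly higher than DT, "†" = significantly lower, "" = no flag.
TABLE_3 = {
    "RF":         ((83.0, "*"), (80.8, ""), (85.3, "*")),
    "DT":         ((76.3, ""), (76.3, ""), (76.3, "")),
    "k-NN (k=5)": ((76.3, ""), (71.1, ""), (81.6, "")),
    "MLP":        ((75.5, ""), (71.6, ""), (79.5, "")),
    "k-NN (k=3)": ((75.0, ""), (68.4, "†"), (81.6, "")),
    "NB":         ((72.4, ""), (71.1, ""), (73.7, "")),
    "k-NN (k=1)": ((68.4, "†"), (73.7, ""), (63.2, "†")),
    "SVM":        ((63.2, "†"), (78.9, ""), (47.4, "†")),
}


def higher_than_dt(metric, significant_only=False):
    """Models whose value exceeds the DT value for the given metric."""
    col = METRICS.index(metric)
    dt_value = TABLE_3["DT"][col][0]
    names = []
    for model, row in TABLE_3.items():
        if model == "DT":
            continue
        value, flag = row[col]
        if significant_only:
            if flag == "*":
                names.append(model)
        elif value > dt_value:
            names.append(model)
    return names


for metric in METRICS:
    print(metric, higher_than_dt(metric), higher_than_dt(metric, significant_only=True))
```

For specificity, for instance, several models are numerically above the DT, but only RF carries the significance flag, which is the sense in which RF is the only model to outperform the DT.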

4 Conclusions, Future Work and Acknowledgement

In summary, when trying to distinguish healthy from stroke patterns of postural sway, the findings of this paper suggest that:

- If statistical tools and fractal features are used, then filtered, downsampled COPd-AP or COPv-AP data should be considered.
- If a DT model and fractal features are adopted, it is advisable to use COPv-AP and COPd-AP data in both original and processed versions.
- If statistical tools or a DT model are used with entropy-related features, then original COPv-ML time series can provide better results.
- If a DT or other popular classifier is considered with fractal and entropy-related features, COPd-AP/ML and COPv-AP/ML data should be used in original and processed versions.

Furthermore, we also support previous findings that downsampling and/or digital filtering of the COP signal may influence the interpretations of postural control provided by structural metrics [5]. Therefore, before attempting to make interpretations or comparisons based on such metrics, it is always important to check whether some processing tool is being applied to the data and to identify how much it affects the results. Accordingly, it is advisable to always report to the community which data processing tools were used, so that future work intended to develop normative data (e.g., cut-off scores) can be better informed. Importantly, the contributions of this paper were drawn from a sample of healthy and stroke individuals, so there is no guarantee that they generalize to other populations. To ensure the repeatability of all experiments performed in this study, the routines and structural metrics are available to download at https://goo.gl/SHM9V5.

In future work, we intend to address some limitations of this study that may have prevented better recognition rates. For example, instead of individual classifiers, it would be better to combine multiple machine learning models, as well as to consider more posturographic features able to describe other useful properties of postural control beyond fractal and entropy-related metrics. Moreover, we also intend to perform feature selection using wrapper methods, which allow reaching more discriminative features for each classifier in a personalized fashion. These methods and features could be used to recognize post stroke individuals with different levels of disability and in chronic and acute phases, which we also intend to pursue. Lastly, we did not cross-check our results because no similar public dataset was available, so we encourage other researchers to do so.

Luiz H. F. Giovanini is thankful to PUCPR for his scholarship. The authors acknowledge the collaboration of the Ana Carolina Moura Xavier Rehabilitation Hospital.


References

1. Duarte, M. & Freitas, S. M. Revision of posturography based on force plate for balance evaluation. Braz. J. Phys. Ther. 14, 183–192 (2010).
2. Peng, C.-K. et al. Mosaic organization of DNA nucleotides. Phys. Rev. E 49, 1685 (1994).
3. Jeka, J., Kiemel, T., Creath, R., Horak, F. & Peterka, R. Controlling human upright posture: velocity information is more accurate than position or acceleration. J. Neurophysiol. 92, 2368–2379 (2004).
4. Saripalle, S. K. et al. Classification of body movements based on posturographic data. Hum. Mov. Sci. 33, 238–250 (2014).
5. Rhea, C. K., Kiefer, A. W., Wright, W. G., Raisbeck, L. D. & Haran, F. J. Interpretation of postural control may change due to data processing techniques. Gait Posture 41, 731–735 (2015).
6. Bryce, R. M. & Sprague, K. B. Revisiting detrended fluctuation analysis. Sci. Rep. 2, (2012).
7. WHO. The top 10 causes of death. Available at: http://www.who.int/mediacentre/factsheets/fs310/en/.
8. Silva, Simone Massaneiro. Análise do controle postural de indivíduos pós-acidente vascular encefálico frente a perturbações dos sistemas visual e somatossensorial. Master thesis. Pontifícia Universidade Católica do Paraná, 2012.
9. Ruhe, A., Fejer, R. & Walker, B. The test–retest reliability of centre of pressure measures in bipedal static task conditions–a systematic review of the literature. Gait Posture 32, 436–445 (2010).
10. Cannon, M. J., Percival, D. B., Caccia, D. C., Raymond, G. M. & Bassingthwaighte, J. B. Evaluating scaled windowed variance methods for estimating the Hurst coefficient of time series. Phys. Stat. Mech. Its Appl. 241, 606–626 (1997).
11. Richman, J. S. & Moorman, J. R. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol.-Heart Circ. Physiol. 278, H2039–H2049 (2000).
12. Costa, M., Goldberger, A. L. & Peng, C.-K. Multiscale entropy analysis of complex physiologic time series. Phys. Rev. Lett. 89, 068102 (2002).
13. Kirchner, M., Schubert, P., Schmidtbleicher, D. & Haas, C. T. Evaluation of the temporal structure of postural sway fluctuations based on a comprehensive set of analysis tools. Phys. Stat. Mech. Its Appl. 391, 4692–4703 (2012).
14. Borg, F. G. & Laxåback, G. Entropy of balance-some recent results. J. Neuroengineering Rehabil. 7, 1 (2010).
15. Guide, M. U. The mathworks. Inc Natick MA 5, 333 (1998).
16. Roerdink, M. et al. Dynamical structure of center-of-pressure trajectories in patients recovering from stroke. Exp. Brain Res. 174, 256–269 (2006).
17. Duarte, M. & Zatsiorsky, V. M. Long-range correlations in human standing. Phys. Lett. A 283, 124–128 (2001).
18. Donker, S. F., Roerdink, M., Greven, A. J. & Beek, P. J. Regularity of center-of-pressure trajectories depends on the amount of attention invested in postural control. Exp. Brain Res. 181, 1–11 (2007).
19. Costa, M. et al. Noise and poise: enhancement of postural complexity in the elderly with a stochastic-resonance–based therapy. EPL Europhys. Lett. 77, 68008 (2007).
20. Duarte, M. & Sternad, D. Complexity of human postural control in young and older adults during prolonged standing. Exp. Brain Res. 191, 265–276 (2008).
21. Han, J., Kamber, M. & Pei, J. Data mining: concepts and techniques. (Elsevier, 2011).
22. Hall, M. et al. The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl. 11, 10–18 (2009).
23. Amoud, H. et al. Fractal time series analysis of postural stability in elderly and control subjects. J. NeuroEngineering Rehabil. 4, 12–12 (2007).
24. Riley, M. A. & Turvey, M. T. Variability and determinism in motor behavior. J. Mot. Behav. 34, 99–125 (2002).
25. Ganesan, M., Lee, Y.-J. & Aruin, A. S. The effect of lateral or medial wedges on control of postural sway in standing. Gait Posture 39, 899–903 (2014).
26. Prieto, T. E., Myklebust, J. B., Hoffmann, R. G., Lovett, E. G. & Myklebust, B. M. Measures of postural steadiness: differences between healthy young and elderly adults. IEEE Trans. Biomed. Eng. 43, 956–966 (1996).
27. Donker, S. F., Ledebt, A., Roerdink, M., Savelsbergh, G. J. & Beek, P. J. Children with cerebral palsy exhibit greater and more regular postural sway than typically developing children. Exp. Brain Res. 184, 363–370 (2008).
