International Journal of Medical Informatics 114 (2018) 6–17
Rest tremor quantification based on fuzzy inference systems and wearable sensors
Luis A. Sanchez-Perez a,b,⁎, Luis P. Sanchez-Fernandez b, Adnan Shaout a, Juan M. Martinez-Hernandez c, Maria J. Alvarez-Noriega c
a Department of Electrical and Computer Engineering, University of Michigan – Dearborn, MI, USA
b Instituto Politecnico Nacional, Centro de Investigacion en Computacion, Mexico City, Mexico
c Instituto Politecnico Nacional, Escuela Nacional de Medicina y Homeopatia, Mexico City, Mexico
ARTICLE INFO
Keywords: Rest tremor; Wearable sensors; Fuzzy inference; Tremor quantification; Continuous scale
ABSTRACT
Background: Currently the most consistent, widely accepted and detailed instrument to rate Parkinson’s disease (PD) is the Movement Disorder Society sponsored Unified Parkinson Disease Rating Scale (MDS-UPDRS). However, the motor examination is based upon subjective human interpretation trying to capture a snapshot of PD status. Wearable sensors and machine learning have been broadly used to analyze PD motor disorder, but most ratings and examinations still lie outside MDS-UPDRS standards. Moreover, logical connections between features and output ratings are unclear and complex to derive from the model, thus limiting the understanding of the structure in the data.
Methods: Fifty-seven PD patients underwent a full motor examination in accordance with the MDS-UPDRS on twelve different sessions, gathering 123 measurements. Overall, 446 different combinations of limb features correlated to rest tremor amplitude are extracted from gyroscopes, accelerometers, and magnetometers and fed into a fuzzy inference system to yield severity estimations.
Results: A method to perform rest tremor quantification that fully adheres to the MDS-UPDRS, based on wearable sensors and a fuzzy inference system, is proposed. It enables a reliable and repeatable assessment while still computing features suggested by clinicians in the scale. This quantification is straightforward and scalable, allowing clinicians to improve inference by means of new linguistic statements. In addition, the method is immediately accessible to clinical environments and provides rest tremor amplitude data with respect to the timeline. A better resolution is also achieved in tremor rating by adding a continuous range.
1. Introduction
Tremor is the most frequent initial motor disorder associated with Parkinson’s disease (PD) [1]. The Movement Disorder Society (MDS) sponsored Unified Parkinson Disease Rating Scale (UPDRS), henceforth referred to as the scale, contains several items to rate different tremor categories [2,3]. One of these items evaluates tremors that may appear at any time during the entire examination when some body parts are moving but others are at rest. The MDS has sponsored various revisions [2–5] of the original UPDRS [6] in order to incorporate current scientific knowledge, and the scale has since become the most widely used PD clinical rating scale [7–14]. However, all motor examination items from the scale are currently quantified using low-resolution ranges based on subjective observations gathered by the examiner [15]. Recently, wearable sensors have been widely used not only to detect,
but also to objectively quantify motor signs in PD patients [16,17]. The most common approaches rely on inertial sensors (accelerometers and gyroscopes) [18–26], digitography [27,28], surface electromyography [29], force detection surfaces [28], and more sophisticated, complex and costly setups such as video motion analysis systems [30]. Tremors in PD patients have been widely studied using wearable sensors [19,29,31–35], and recent review articles have focused on highlighting advantages and disadvantages given the current progress and limitations [16,36,37]. In this sense, tremor quantification is being performed in two steps: detection and severity rating. Detection has mainly relied on frequency thresholds [19,31–33,35] and dynamic classifiers [29,34]. In spite of the high accuracy reported for the latter models, some caution must be taken when trying to describe tremors as a time-dependent variable, since human behavior is adaptable to circumstances, flexible, changeable, and tremors may disappear suddenly
⁎ Corresponding author at: University of Michigan – Dearborn, 215G ELB, 4901 Evergreen Rd, Dearborn, MI 48128, USA.
E-mail addresses: [email protected] (L.A. Sanchez-Perez), [email protected] (L.P. Sanchez-Fernandez), [email protected] (A. Shaout), [email protected] (J.M. Martinez-Hernandez), [email protected] (M.J. Alvarez-Noriega).
https://doi.org/10.1016/j.ijmedinf.2018.03.002
Received 10 March 2017; Received in revised form 27 January 2018; Accepted 8 March 2018
1386-5056/ © 2018 Elsevier B.V. All rights reserved.
Several works have used wearable sensors and computer tools to quantitatively evaluate tremor manifestations in PD. These quantifications include differentiation between essential and parkinsonian tremors [40], detection and monitoring during daily life activities and motor exercises [19,31], and severity estimation [29,32–35]. However, all previous works have at least one of the following fundamental limitations:
1. Features extracted from physical properties measured by the sensors are often not linked directly to the rating process defined in the scale. For instance, the main feature used to rate rest tremors, including re-emergent rest tremors, is the amplitude measured in centimeters. However, in previous works [19,29,31–35,40], all features are statistical representations of the energy of inertial signals, intended to represent tremor intensity. Therefore, clinicians find it hard to reason about the PD patient's status in those terms. According to the scale, when the maximum tremor observed is between 1 and 3 cm, the rating should be Mild. Nonetheless, finding the equivalent rating in terms of inertial signal energy is not straightforward and requires supervised machine learning algorithms.
even in severe PD patients [16,35]. The severity rating estimation has been performed mainly through linear regression and classification using raw and frequency domain features from accelerometers, gyroscopes and surface electromyographic signals [19,29,31–35]. Although all previous features are related to tremor amplitude, their relationship to the output ratings inside these models is unclear and complicated to derive, thus limiting the understanding of the structure in the data. Moreover, if the model performs poorly there is no meaningful feedback about which relationships are inaccurate. This is currently a significant issue in PD motor disorder analysis based on wearable sensors and machine learning [16]. Additionally, quantifying tremors using a linear combination of the inputs must be further analyzed, particularly when all tremor severity ranges used in the scale exhibit a logarithmic behavior [38], which can also be inferred by examining the limits established in the tremor items defined in the same scale. On the other hand, tremor quantification using classification requires large datasets to avoid overfitting and to represent a reasonably large combination of the inputs from different patients [16]. What is more, although tremor ratings in previous works are expressed for short time windows using categories similar to those proposed in the scale, an overall rest tremor rating such as the one required in the scale is never performed due to model constraints. In this respect, both the classification models and the scale use a discrete range, thus producing a floor/ceil effect, i.e., even when a particular combination of inputs represents a stage between two severity ratings, one of the two must still be chosen [39]. Despite the current limitations of the works previously mentioned, tremor severity estimation using wearable sensors remains appealing since it enables objective and accurate quantification of signs.
To overcome current limitations, in this work the same motor examination procedure described in the scale is performed to collect data from wearable sensors. Also, a useful approach to compute tremor amplitude at any time is provided. Consequently, all features extracted are closely related to the highest amplitude tremors recorded during the entire session, as proposed in the scale. The main purpose is to quantify exactly what the examiner is looking at when performing a subjective rating by means of observation. In this respect, clinicians' knowledge is modelled using a rule-based fuzzy inference system in the same way the entire scale is built, but avoiding sharp boundaries between severity ratings and the floor/ceil effect by adding a continuous range. This approach is primarily intended to steer these quantification models towards a clinimetric validation for regulatory approval using the scale as the guideline.
– In this paper, advanced digital signal processing techniques are used to acquire features that allow the existing knowledge already modeled in the scale to be applied directly.
2. Computer-generated severity ratings are not related to the scale terms and guidelines [19,31], so reasoning about them is difficult, which discourages clinical applicability.
– In this paper, examiners and clinicians can make sense of any quantified input/output since it strictly follows the scale definitions.
3. Since the relationships between features derived from inertial signal energy and the final rating are not clear, machine learning algorithms must be used to associate inputs with outputs. The most common and advanced approaches so far are multivariate linear regression [33,35] and classification [19,29,34], where the main goal is to treat each severity rating as a class and determine the mathematical relationships using supervised learning to associate a group of features with a severity rating or class. Three major drawbacks arise from this:
a. These techniques do not generally supply explanatory power. In other words, they are “black boxes” that ingest data to produce generalized outputs [16]. Mathematical relationships between inputs and outputs are difficult to understand. Therefore, if the algorithm goes wrong, it is difficult to find out why.
b. If the dataset is not large enough, i.e., the selected PD patients do not properly represent all severity ratings, then overfitting can occur and classification techniques will fail.
c. All classes used during supervised learning correspond to severity ratings given by examiners, which are still subject to subjective quantification and reasoning. Every patient was subjectively rated as Normal, Slight, Mild, Moderate or Severe before applying any supervised machine learning algorithm, thus adding uncertainty to the model.
2. Review
The motivation of this work is to provide an objective and accurate quantification of the rest tremor severity covered by the motor examination items in the scale, using wearable sensors, advanced digital signal processing techniques, and fuzzy inference systems. In general, several improvements over the current rating process motivate the proposal in this paper:
– In this paper, the existing knowledge is modelled by means of fuzzy inference systems imitating how clinicians perform the rating process through if-then rules, but with a consistent, quantitative and adaptable model. For instance, in the scale, tremors between 1 and 3 cm correspond to a Mild level while tremors between 3 and 10 cm correspond to a Moderate level. However, given an accurate procedure to quantify tremor amplitude A, it is uncertain how and why 3 cm is the proper threshold to differentiate between Mild and Moderate. This uncertainty can be computationally modelled using fuzzy logic rules such as:
– Avoid the lengthy review of video material used by examiners as the main tool for applying the scale, which often discourages its use. Instead, computer ratings can be delivered in real time.
– Provide consistent quantification of features, including tremor amplitude, using wearable sensors and advanced computer tools, so that any given set of inputs always yields the same corresponding outputs. This is difficult to achieve by subjective human observation.
– Computationally represent the existing clinician knowledge gathered in the scale using a fuzzy inference system with if-then rules, which is similar in spirit to the rating process in the scale.
– Avoid using discrete rating values (0, 1, 2, 3, or 4) to prevent the floor/ceil effect [39].
IF A is Medium THEN Rating is Mild.
IF A is High THEN Rating is Moderate.
interfere with individual items in the motor examination. Measurements were conducted at three separate locations since the patients gathered for this study are being treated at different institutions or belong to various PD associations. Each motor examination lasted between 20 and 25 min. Some PD patients underwent the same examination on different days (sessions) over a time span ranging from one to six months. This is particularly helpful for clinical follow-up. A general review of the database is presented in Table 1. This work was carried out in accordance with The Code of Ethics of the World Medical Association and with Data Protection and Privacy Laws. All data were collected with explicit written informed consent.
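Table 1 Database statistical review.
| Characteristic | Value |
| --- | --- |
| PD patients: total | 57 |
| PD patients: women | 22 |
| PD patients: men | 35 |
| Measurements: total | 123 |
| Sessions (days) | 12 |
| Patient on levodopa (yes/no) | 111/12 |
| Last levodopa dose (min): mean ± standard deviation | 246.9 ± 278.0 |
| Last levodopa dose (min): range | 5–1440 |
| Age (years): mean ± standard deviation | 66.4 ± 9.0 |
| Age (years): range | 48–85 |
| Disease duration (years since diagnosis): mean ± standard deviation | 7.9 ± 6.4 |
| Disease duration (years since diagnosis): range | 1–27 |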
3.2. Data acquisition
One sensor unit per limb was worn by PD patients during the entire motor examination period. For the upper extremities, the unit was placed on the dorsal side of the hand, as in similar works [18,27]. For the lower extremities, the unit was set up on the front of the shank just above the ankle. In this study, sensor orientation does not have any effect because it is computed at any given time, but the position is extremely important since the estimated amplitude of tremors is determined by the unit displacement from that position. Each unit (Fig. 1a) comprises a triaxial accelerometer (resolution, 13 bits; dynamic range, ± 16 g), triaxial gyroscope (resolution, 16 bits; dynamic range, ± 300°/s), triaxial magnetometer (resolution, 12 bits; dynamic range, ± 8 G), Bluetooth transmitter, and lithium battery. The four units are synchronized at a sample rate of 50 Hz and transmit the data wirelessly to a computer where the measurement system is running. Although prototype units are used in this work, it takes only 2–3 min to arrange all units. Motor examination sessions were continuously videotaped, using visual and audio cues to synchronize sensor and video data. Additionally, the beginning and end of every exercise was marked by the computer operator when the examiner asked the patient to start and stop the exercise. Therefore, unconstrained and constrained periods during the motor examination can also be distinguished.
Disease duration (years since diagnosis): 7.9 ± 6.4 (mean ± standard deviation); range 1–27.
where no crisp threshold between Medium and High is required. Both rules are clearly self-explanatory, modelling clinician inference similarly to the scale rating guidelines.
4. Whether classification models are used or the scale is simply applied without any computer aid, the motor examination ratings are still delivered in a discrete fashion (0, 1, 2, 3 or 4), leading to the floor/ceil effect [39]. That is to say, even when a particular combination of inputs represents a stage between two severity ratings, one of the two must still be chosen.
– In this paper, a continuous range from 0 to 4 is used, thus avoiding the floor/ceil effect; such a range is readily obtained by means of fuzzy inference systems. This is particularly helpful for follow-ups.
3. Materials and methods
3.1. Subjects and study protocol
Overall, fifty-seven PD patients at different stages of the disease participated in this study. A general description of the study population is given in Table 1. Each patient underwent a complete scale motor examination, although the results discussed in this paper cover only rest tremor ratings with respect to the limbs. All subjects were screened for dementia and other neurological deficits, concurrent medical problems such as stroke or paralysis, as well as orthopedic problems that could
3.3. Tremor quantification
An overview of the tremor quantification procedure is shown in Fig. 1b. The main objective is to provide amplitude data about the tremors registered at any given time. Examiners currently use this information to rate tremor items in the scale, but they rely only on observation.
Fig. 1. General overview of the model.
crucial to subtract the right gravity reading and obtain the dynamic acceleration actually produced by tremors. In this work the limb displacement caused by the tremor is considered the tremor’s amplitude. Although positioning the unit at another limb location will produce different results in terms of tremor amplitude, the entire method proposed herein remains valid and applicable. In order to compute the unit orientation with respect to the Earth frame, an attitude heading reference system (AHRS) is implemented using two approaches [42,43]. Both use the gyroscope as the main source for detecting orientation changes and the accelerometer and magnetometer to compute correction parameters. Given that the unit orientation represents how the sensor frame (X, Y, Z axes) is arranged with respect to the Earth frame, the static acceleration can be computed by projecting the gravity vector onto the sensor frame. Afterwards, the dynamic acceleration on each sensor axis due to tremors can be determined by subtracting the static acceleration from the accelerometer readings. Other approaches to compute static acceleration have relied on low-pass filtering [31], but these are much less accurate because the direction of gravity cannot be assumed to change slowly during tremor displacements. Gravitational artifacts, which are avoided in this work by means of accurate orientation computation, have been reported as one major disadvantage of accelerometry [36]. Having the dynamic acceleration enables displacement estimation using double integration. However, because examiners and clinicians pay attention to each tremor individually, as suggested in the scale, a method to compute the amplitude of every tremor is still required. In this respect the following algorithm is proposed:
3.3.1. Tremors detection
Since this study is focused on the rest tremor item from the scale, only those tremors observed in a limb not involved in voluntary movements can be analyzed, e.g., tremors during the finger/nose maneuvers (kinetic tremors) are not considered because the arm is being intentionally moved. In this sense, a tremor/voluntary movement detection algorithm is needed. On the one hand, tremors are oscillatory involuntary movements with a frequency usually reported from 4 to 12 Hz, although this range sometimes varies slightly [19,31–35]. Tremors can produce both limb displacement and orientation changes, and therefore any sensor reading could reflect these involuntary movements. However, the tremor/voluntary movement detection proposed herein is based on the gyroscope signals because they are less affected by undesired sources such as gravity (accelerometer) or soft/hard iron distortions (magnetometer). In this work the main frequency component observed in tremors with respect to gyroscope signals is over 4 Hz. On the other hand, an extensive spectrum analysis showed that the main frequency component of all voluntary movements performed by patients during motor examinations falls below 3 Hz. Based on the above, the spectrogram of every gyroscope signal (X, Y, Z axes) is computed using the short-time Fourier transform, thus obtaining a power representation for each frequency component at any given time. Subsequently, a signal segment is only considered for tremor quantification if there is no voluntary movement involved and tremors are present. In this work, a voluntary movement is present when the maximum power of the frequency components below 3 Hz is higher than the maximum power of the frequency components over 4 Hz by at least 15 dB, and it is also over −50 dB. Similarly, a tremor is present when the opposite happens, but the difference need only be higher than 10 dB.
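The band-power comparison described above can be sketched for a single analysis frame as follows. This is only an illustrative sketch, not the authors' implementation: the Hann window, the power normalisation (and hence the dB reference), the exclusion of the 0 Hz bin, the application of the −50 dB floor to tremors as well, and the helper names `band_peaks_db`, `classify_frame` and `is_analyzable` are all assumptions. A naive discrete Fourier transform is used so that the example needs only the standard library.

```python
import cmath
import math

FS = 50.0  # sample rate in Hz, as reported for the sensor units


def band_peaks_db(frame, fs=FS):
    """Return (peak dB below 3 Hz, peak dB above 4 Hz) for one gyroscope frame."""
    n = len(frame)
    # Periodic Hann window; the window type is an assumption of this sketch.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    low_db, high_db = -math.inf, -math.inf
    for k in range(1, n // 2 + 1):  # skip the 0 Hz bin (assumed sensor bias)
        freq = k * fs / n
        coeff = sum(frame[i] * window[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))
        # Power normalisation and the 1e-12 guard are illustrative choices.
        power_db = 10 * math.log10(abs(coeff) ** 2 / n ** 2 + 1e-12)
        if freq < 3.0:
            low_db = max(low_db, power_db)
        elif freq > 4.0:  # the 3-4 Hz gap is deliberately ignored
            high_db = max(high_db, power_db)
    return low_db, high_db


def classify_frame(low_db, high_db, floor_db=-50.0):
    """Apply the 15 dB / 10 dB margins and the -50 dB floor from the text."""
    if low_db > floor_db and low_db - high_db >= 15.0:
        return "voluntary"
    if high_db > floor_db and high_db - low_db > 10.0:
        return "tremor"
    return "none"


def is_analyzable(frames_xyz):
    """A segment is kept if any axis shows tremor and no axis shows voluntary movement."""
    labels = [classify_frame(*band_peaks_db(f)) for f in frames_xyz]
    return "tremor" in labels and "voluntary" not in labels
```

In a full pipeline this decision would be repeated for each short-time frame of the spectrogram; the frame length and overlap are not stated in the text and would have to be chosen experimentally.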
The gap between 3 and 4 Hz is not analyzed due to the uncertainty about setting a threshold when both tremors and voluntary movements are very close in frequency. The selected thresholds are chosen experimentally after analyzing several gyroscope spectrograms. Tremors or voluntary movements are detected any time the conditions described above are met for any of the axes. The first two columns from Fig. 2 depict the tremors detection outcomes for the upper extremities during two different exercises. The first column presents the gyroscope signal for all axes, while the second column shows only the gyroscope spectrogram for the X axis, along with the detection results using a step signal representation. Although the item 3.15 (postural tremor) is used to evaluate re-emergent rest tremor, which interferes with holding objects against gravity [41], this exercise is still used in Fig. 2a and b for demonstration and discussion purposes only. Yet some rest tremor can also be identified when the patient is no longer supporting arms against gravity (seconds 40–62 and 90–135). It is worth noting that the patient is asked to perform item 3.15 twice. A strong frequency content below 3 Hz can be observed in both spectrograms from Fig. 2c (seconds 1–20) and Fig. 2d (seconds 32–52) corresponding to the exercise performed for item 3.6 (pronation/supination). Moreover, some rest tremors are present on the left upper extremity (LUE) when the patient is performing the task with the right upper extremity (RUE), especially near the end of the task. Both situations are correctly identified by the detection algorithm described herein. Data regarding lower extremities during all the exercises from Fig. 2 is provided in Appendix A. Additionally, upper extremities data for another PD patient performing the same exercises is also given in Appendix A.
1. High-pass filter the dynamic acceleration signals (X, Y, and Z) with a cutoff frequency of 4 Hz to remove voluntary movements.
2. Calculate the velocity vector norm after a single integration of the dynamic acceleration, i.e., integrate each axis signal (X, Y, and Z) and then compute the norm.
3. Since tremors are oscillatory movements, a tremor can be considered to occur in time between two contiguous points of low velocity (ideally zero), i.e., assuming the limb is at rest at the beginning, it accelerates until the point of maximum velocity and then decelerates, ideally to zero velocity again. Even though zero velocity is rarely observed in the measurements, considering each tremor to happen between the two valleys on either side of a local velocity maximum produces excellent results with real tremor data as well as in several controlled experiments and simulations.
4. Integrate the velocity once over the time window between each pair of consecutive valleys, which yields the limb displacement produced by a tremor.
5. Keep only the analyzable tremors according to the detection algorithm described above.
Thus, the tremor amplitude with respect to time is obtained, as depicted in the third column of Fig. 2. This is extremely useful because it not only enables quantification of the tremor amplitude as the main feature to rate severity, but also shows what is happening at any given time. For instance, Fig. 2a shows that at the beginning of the task the patient does not experience any postural tremor on the RUE. However, as time goes on, tremors become higher in amplitude. Additionally, when the patient finishes the task some rest tremors appear as well. Fig. 2d confirms that while the patient is performing pronation/supination movements with the RUE, several rest tremors (some above 1 cm) are present in the LUE.
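Steps 2 to 4 of the algorithm above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the inputs are assumed to be dynamic acceleration that has already been gravity-compensated and high-pass filtered (step 1), trapezoidal integration is used, valleys are taken as simple local minima of the speed, and the displacement of each tremor is taken as the integral of the speed between valleys. The function names `cumulative_trapezoid` and `tremor_amplitudes` are hypothetical.

```python
import math


def cumulative_trapezoid(values, dt):
    """Running trapezoidal integral of a sampled signal, starting from zero."""
    out = [0.0]
    for prev, cur in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def tremor_amplitudes(ax, ay, az, dt):
    """Per-tremor displacement from dynamic acceleration (steps 2-4).

    With acceleration in cm/s^2 and dt in seconds, amplitudes are in cm.
    """
    # Step 2: integrate each axis once, then take the velocity norm.
    vx, vy, vz = (cumulative_trapezoid(a, dt) for a in (ax, ay, az))
    speed = [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(vx, vy, vz)]
    # Step 3: valleys are interior local minima of the speed.
    valleys = [i for i in range(1, len(speed) - 1)
               if speed[i] <= speed[i - 1] and speed[i] < speed[i + 1]]
    # Step 4: integrate the speed between each pair of consecutive valleys.
    amplitudes = []
    for start, end in zip(valleys, valleys[1:]):
        amplitudes.append(cumulative_trapezoid(speed[start:end + 1], dt)[-1])
    return amplitudes
```

For a synthetic 5 Hz oscillation with a 2 cm peak-to-peak displacement, each detected tremor comes out close to 2 cm when the signal is sampled finely; at the 50 Hz rate used in the study the numerical integration error would be larger, and real data would additionally require the filtering of step 1 and the selection of step 5.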
Overall, although this patient's RUE is highly affected during holding activities (postural tremors), the left hand exhibits more rest tremors of higher amplitude. In this respect, a video is provided along with this paper to showcase the tremor quantification performance. A detailed description of the video is given in Appendix C in Supplementary materials.
3.3.2. Limb displacement estimation
Since the accelerometers used in this work measure proper acceleration, gravity produces a non-zero reading even when the unit is not moving (static acceleration), e.g., if the unit is standing still on a table while the accelerometer Z axis is pointing towards the floor, then that axis will read approximately 1 g. Any orientation change, except a rotation of the unit around the gravity vector, will produce a different static acceleration reading. Thus, knowing the orientation of the unit becomes
Fig. 2. Review of some data during upper extremities tremor quantification for a patient performing two different motor examination exercises: (a) RUE during Item 3.15, (b) LUE during Item 3.15, (c) RUE during Item 3.6, (d) LUE during Item 3.6.
Afterwards, all tremors with an amplitude higher than 10% of the maximum are considered relevant to compute rating features (this group of tremors is called T). Four features are extracted based on T:
3.4. Severity estimation
An overview of the severity estimation procedure is depicted in Fig. 1c. The scale bases the final rating on the maximum tremor amplitude that is seen, using well-defined sharp limits [2,3]. In this sense, all features proposed in this work are based on the tremors of highest amplitude quantified during the entire motor examination. Yet, new rating features could still be proposed and analyzed by multidisciplinary teams using the framework set out in this paper.
1. A: average amplitude of the tremors higher than or equal to 90% of the maximum. At least five tremors are required to meet this condition; otherwise the current maximum is considered an outlier and removed. Examiners tend to rate groups of high amplitude tremors instead of an individual occurrence due to the high frequency of tremors, e.g., a frequency of 5 Hz means that around ten displacements can happen in one second. For this reason, the average is selected over the maximum.
3.4.1. Features extraction
All tremors detected and quantified are sorted in descending order.
2. P[100,90]: percentage of tremors higher than or equal to 90% of the maximum with respect to the number of tremors in T.
3. P(90,70]: percentage of tremors lower than 90% of the maximum but at least 70% of the maximum with respect to the number of tremors in T.
4. P(70,50]: percentage of tremors lower than 70% of the maximum but at least 50% of the maximum with respect to the number of tremors in T.
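The four features can be sketched as below. This is an illustrative reading of the definitions rather than the authors' implementation: the order of operations (outliers are removed first, then T is formed relative to the remaining maximum), the behaviour when fewer than five tremors are available (returning `None`), and the names `rating_features`, `P100_90`, `P90_70` and `P70_50` are assumptions.

```python
def rating_features(amplitudes, min_group=5):
    """Compute A, P[100,90], P(90,70] and P(70,50] from tremor amplitudes (cm)."""
    ordered = sorted(amplitudes, reverse=True)
    # Drop the current maximum as an outlier until at least `min_group`
    # tremors lie at or above 90% of it.
    while ordered and sum(a >= 0.9 * ordered[0] for a in ordered) < min_group:
        ordered.pop(0)
    if not ordered:
        return None  # assumption: too few tremors to compute the features
    peak = ordered[0]
    # Group T: tremors higher than 10% of the (outlier-free) maximum.
    relevant = [a for a in ordered if a > 0.1 * peak]
    top = [a for a in relevant if a >= 0.9 * peak]

    def share(lo, hi):
        # Percentage of T with lo*peak <= amplitude < hi*peak.
        return 100.0 * sum(lo * peak <= a < hi * peak for a in relevant) / len(relevant)

    return {
        "A": sum(top) / len(top),
        "P100_90": 100.0 * len(top) / len(relevant),
        "P90_70": share(0.7, 0.9),
        "P70_50": share(0.5, 0.7),
    }
```

For example, with one isolated 10 cm tremor, five tremors of 5 cm, three of 4 cm, two of 3 cm and one of 0.1 cm, the 10 cm tremor is discarded as an outlier and the sketch yields A = 5 cm with 50%, 30% and 20% for the three percentages.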
These features target not only the tremors of higher amplitude, as required in the scale, but also those of lower amplitude, a need that has been raised in previous works [39], where measuring lower tremor levels is strongly encouraged. Still, a severity assessment based on A, P[100,90], P(90,70] and P(70,50] complies with the scale since all the features are closely related to the maximum tremor amplitude that is seen.

Table 2 Model rating linguistic values.
| Linguistic value | Linear combination |
| --- | --- |
| Normal | $0.0 + 1.0 \cdot A + 0.075 \cdot \left(\frac{P_{[100,90]}}{25}\right) + 0.050 \cdot \left(\frac{P_{(90,70]}}{25}\right) + 0.025 \cdot \left(\frac{P_{(70,50]}}{25}\right)$ |
| Slight | $2.0 + 0.1 \cdot A + 0.150 \cdot \left(\frac{P_{[100,90]}}{25}\right) + 0.075 \cdot \left(\frac{P_{(90,70]}}{25}\right) + 0.025 \cdot \left(\frac{P_{(70,50]}}{25}\right)$ |
| Mild | $3.0 + 0.0 \cdot A + 0.175 \cdot \left(\frac{P_{[100,90]}}{25}\right) + 0.075 \cdot \left(\frac{P_{(90,70]}}{25}\right) + 0.025 \cdot \left(\frac{P_{(70,50]}}{25}\right)$ |
| Moderate | $3.5 + 0.0 \cdot A + 0.175 \cdot \left(\frac{P_{[100,90]}}{25}\right) + 0.125 \cdot \left(\frac{P_{(90,70]}}{25}\right) + 0.025 \cdot \left(\frac{P_{(70,50]}}{25}\right)$ |
| Severe | $4.0 + 0.0 \cdot A + 0.200 \cdot \left(\frac{P_{[100,90]}}{25}\right) + 0.125 \cdot \left(\frac{P_{(90,70]}}{25}\right) + 0.025 \cdot \left(\frac{P_{(70,50]}}{25}\right)$ |
1. IF A is Low THEN Rate is Normal
2. IF A is Slight THEN Rate is Slight
3. IF A is Medium THEN Rate is Mild
4. IF A is Medium AND P[100,90] is High THEN Rate is Moderate
5. IF A is Medium AND P(90,70] is High THEN Rate is Moderate
6. IF A is High THEN Rate is Moderate
7. IF A is High AND P[100,90] is High THEN Rate is Severe
8. IF A is High AND P(90,70] is High THEN Rate is Severe
9. IF A is Medium AND P[100,90] is High AND P(90,70] is High AND P(70,50] is High THEN Rate is Severe
10. IF A is Severe THEN Rate is Severe
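Rules of this form are evaluated in a Takagi-Sugeno system as a weighted average of the rule consequents, with each rule weighted by the minimum membership value of its antecedents. The sketch below illustrates that mechanism only. The ramp-shaped membership functions, their breakpoints, the bias-only consequents and the reduced three-rule set are hypothetical placeholders, since the actual membership functions are given only graphically (Fig. 3); `ts_infer`, `ramp_up` and `ramp_down` are illustrative names.

```python
def ts_infer(rules, inputs):
    """Weighted average of rule consequents; weight = minimum antecedent membership."""
    num = den = 0.0
    for antecedent, consequent in rules:
        weight = min(mf(inputs[name]) for name, mf in antecedent)
        num += weight * consequent(inputs)
        den += weight
    return num / den if den > 0 else None  # None when no rule fires


def ramp_up(lo, hi):
    """Membership rising linearly from 0 at `lo` to 1 at `hi` (hypothetical shape)."""
    return lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo)))


def ramp_down(lo, hi):
    """Membership falling linearly from 1 at `lo` to 0 at `hi` (hypothetical shape)."""
    return lambda x: min(1.0, max(0.0, (hi - x) / (hi - lo)))


# Hypothetical memberships; the breakpoints are NOT taken from the paper.
a_medium = ramp_down(2.0, 4.0)
a_high = ramp_up(2.0, 4.0)
p_high = ramp_up(0.0, 50.0)

# Reduced rule set with bias-only consequents, for illustration only.
RULES = [
    ([("A", a_medium)], lambda x: 2.0),               # A Medium -> Mild
    ([("A", a_high)], lambda x: 3.0),                 # A High -> Moderate
    ([("A", a_high), ("P", p_high)], lambda x: 4.0),  # A High AND P High -> Severe
]
```

With these placeholder memberships, `ts_infer(RULES, {"A": 2.5, "P": 0.0})` blends the first two rules with weights 0.75 and 0.25 and returns 2.25, a value between two discrete ratings. This is how a continuous output arises even when every consequent is a constant.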
3.4.2. Fuzzy inference system
Clinician knowledge captured in the scale over the years has led to tremor severity assessment based on if-then rules, e.g., if the tremor amplitude is between one and three centimeters then the evaluation is Mild [2,3]. In this sense, a Takagi-Sugeno [44] fuzzy inference system of if-then rules is proposed to model clinician knowledge in close adherence to the scale, but avoiding sharp boundaries between categories. The first step is to fuzzify all inputs by assigning linguistic values to various overlapping membership functions defined along the range of the feature. In this work, five linguistic values for A (Low, Slight, Medium, High, Severe) and three for P[100,90], P(90,70] and P(70,50] (Low, Medium, High) are used, as drawn in Fig. 3. This overlapping “fuzzifies” the boundaries between the categories outlined by the membership functions. For instance, A = 6.5 represents an input that belongs to the same degree to both categories Medium and High, since the corresponding membership functions evaluate to similar values (approximately 0.4). Modelling the uncertainty of the category boundaries in this way is essential in a setting where all amplitude quantifications so far have been done merely by visual identification. Afterwards, the model’s output Rate is fuzzified using the linguistic values defined in the scale (Normal, Slight, Mild, Moderate, Severe). Takagi-Sugeno [44] systems perform output fuzzification by setting each linguistic value as a linear combination of the inputs, w0 + w1 A + w2 P[100,90] + w3 P(90,70] + w4 P(70,50], instead of using membership functions. All linear combinations used in this work are shown in Table 2. It is important to notice that these linear combinations do not represent the model’s final output. For instance, we could define all linear combinations using the bias term w0 only (w1 = w2 = w3 = w4 = 0) and the model output would not necessarily be w0 in all cases.
All membership functions and weights are manually/intuitively adjusted, as is done in most fuzzy inference systems, and can be further tweaked to yield different behavior. Once all features and the output are fuzzified, the inference process is performed based on the following if-then rules:
The fuzzification makes it easy to represent the inference process as a collection of self-explanatory rules where all relationships between inputs and outputs are clearly specified. The model’s final output is obtained by means of rule aggregation and defuzzification, using a weighted average of all rule consequents where the antecedents determine each rule's weight. In this setting, the consequent is the linear combination corresponding to the linguistic value used in the rule, and the antecedent is the minimum membership value of the inputs involved in that same rule. All the rules are evaluated and considered for computing the final output, even when their contribution is negligible. This fuzzy quantification framework is scalable and easily adjustable by adding, modifying or deleting rules or even features.
4. Results and discussion
The computer model proposed herein is applied to 446 different combinations of limb features obtained from 123 measurements. All fuzzy scores are computed by the model, whereas crisp scores are determined using feature A as the maximum amplitude seen, according to the sharp boundaries of the rest tremor item defined in the scale. Crisp scores are used for comparison purposes only and symbolize the ratings that examiners might assign to these cases, under the assumption that they have access to precise tools to quantify tremor amplitude as proposed herein, which is not the case in most situations. Although a comparison to previous works [19,29,34] performing classification of rest tremor severity into Normal, Slight, Mild, Moderate and Severe ratings is
Fig. 3. Inputs membership functions used in the fuzzy inference system (smooth transitions/no crisp boundaries).
11
International Journal of Medical Informatics 114 (2018) 6–17
L.A. Sanchez-Perez et al.
Fig. 4. Comparison between MDS-UPDRS fuzzy and crisp scores.
Fig. 5. Features from group of tremors T and model output for the RUE of the same patient on four different days.
given using fuzzy scores as query points. This corroborates that the proposed model captures the scale logarithmic behavior [38] sought by clinicians while still performing accurate quantitative assessments. The scale floor effect is easily spotted near sharp boundaries, e.g., near A = 1 and A = 3. In this sense, two cases are highlighted in Fig. 4, where feature A is only 0.3 cm apart (2.9 and 3.2) but still the crisp scores (2 and 3) represent two different severity ratings. In contrast, the corresponding fuzzy scores (2.39 and 2.59) reflect that indeed, these two cases are not too different in terms of rest tremor severity. The stairs-like distribution of crisp scores can be clearly avoided by adding a continuous range like the one proposed herein while keeping a close
possible, the fact that all ratings assigned to known observations during supervised learning are also determined under subjective quantification and reasoning based on the scale, strongly discourages this comparison (refer to Section 2). Furthermore, since feature A is now available, a classification algorithm is not necessary anymore, since all rules required to infer about the severity rating are specifically defined in the scale. For instance, when 3 ≤ A < 10 then the severity rating should be Moderate. An overall graphical comparison between fuzzy and crisp scores for the 446 different combinations of limb features is drawn in Fig. 4. Additionally, logarithmic and third order polynomial fitting curves are
12
International Journal of Medical Informatics 114 (2018) 6–17
L.A. Sanchez-Perez et al.
A key aspect of the model proposed herein is that expert examiners and clinicians are not replaced by computer models. Instead, their knowledge and experience are reproduced by means of fuzzy inference systems. In this respect, examiners and clinicians can actively engage in the computer model design by adding, modifying and removing linguistic statements capturing relationships between input features and output ratings. This opens a whole window to collaboration between disciplines. The idea is to bring neurologists, expert examiners, clinicians, physical therapists and engineers together so as to build computer tools to improve motor examination and rating of PD patients. This collaboration certainly will improve patient access to motor examination procedures since now accurate and consistent results are obtained in less time, having deep understanding and feedback on what produces the output rating.
relationship to the scale rating process. Yet, a discrete value (0, 1, 2, 3 or 4) is computable by simply rounding the model outputs. In addition, the proposed model is applicable to other types of tremors such as postural tremors (including re-emergent rest tremor) and kinetic tremors using the proper modifications. Fig. 5 presents a comparison between rest tremor features of the same PD patient in four different days, where only tremors on group T sorted in descending order are shown. It is worth mentioning that the time since the last levodopa dose was different on each day. Fig. 5a show that 48% (P[100,90] + P (90,70] + P(70,50]) of relevant tremors are at least higher than haft of the maximum amplitude seen (A = 4.83). In this sense, despite the maximum amplitude seen in Fig. 5c (A = 4.25) is below that from Fig. 5a, the 75% of relevant tremors are over A/2. This reveals that the PD patient experienced a higher percentage of relevant tremors close to A during the motor examination captured in Fig. 5c when compared to Fig. 5a. In this sense, although A in Fig. 5a is higher than in Fig. 5c, a close fuzzy score is output in both cases (3.40 and 3.43) because tremors of lower amplitude are also weighted using P[100,90], P(90,70] and P(70,50]. Rules 4 and 5 have the major contribution to the final rating of Fig. 5a and c although all rules are aggregated even when its value is negligible (refer to Fig. 3 and Table 2 for further understanding). The corresponding crisp score in both cases is 3 since A ≥ 3. The previous comparison underlines that both cases should be rated between Moderate and Severe (3 ≤ Rate ≤ 4) since none of these ratings fully describes by itself the patient status in terms of rest tremors. Furthermore, patients under the same category could have features noticeably different in terms of severity. This is evident by comparing Fig. 
5b and c where the features P[100,90], P(90,70] and P(70,50] are very close in both cases but A differs by one centimeter, which still does not produce any change in the crisp score since 3 ≤ A < 10. However, the proposed model differentiates both cases by rating Fig. 5a with 3.40 and Fig. 5b with 3.11. Additionally, Fig. 5d exposes the scale floor effect; although the value A = 2.99 is very close to the next severity rating, according to the scale 1 ≤ A < 3, so the corresponding crisp score will be still two. The proposed model avoids this undesired effect by means of fuzzy boundaries between categories. Indeed, the fuzzy score better represents this case where only 14% of relevant tremors are over A/2 as well. Four additional cases are given in Appendix B.
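As a small illustration of the crisp scoring and the rounding discussed in this section, the sketch below maps the maximum amplitude A (in cm) to a discrete rating using sharp boundaries and rounds a continuous fuzzy score to the nearest rating. The boundaries at 1 cm and 3 cm and the range 3 ≤ A < 10 are taken from the text; treating A = 0 as Normal, 0 < A < 1 as Slight and A ≥ 10 as Severe is an assumption based on the rest tremor amplitude item of the scale for the extremities. The function names crisp_score and discretize are hypothetical, and rounding half up is an assumed convention, since the text only states that the output is rounded.

```python
import math


def crisp_score(amplitude_cm: float) -> int:
    """Discrete rating from sharp amplitude boundaries (assumed limb thresholds)."""
    if amplitude_cm < 0:
        raise ValueError("amplitude must be non-negative")
    if amplitude_cm == 0:
        return 0  # Normal (assumed)
    if amplitude_cm < 1:
        return 1  # Slight (assumed lower range)
    if amplitude_cm < 3:
        return 2  # Mild: 1 <= A < 3
    if amplitude_cm < 10:
        return 3  # Moderate: 3 <= A < 10
    return 4      # Severe (assumed upper range)


def discretize(fuzzy_score: float) -> int:
    """Round a continuous score half up and clamp it to the 0..4 rating range."""
    return min(4, max(0, math.floor(fuzzy_score + 0.5)))


# The two cases highlighted in Fig. 4: amplitudes 0.3 cm apart, one rating apart.
print(crisp_score(2.9), crisp_score(3.2))
# Their reported fuzzy scores fall on either side of 2.5 when rounded.
print(discretize(2.39), discretize(2.59))
```

The crisp function jumps from 2 to 3 between A = 2.9 and A = 3.2, whereas the reported fuzzy scores 2.39 and 2.59 differ by only 0.2 before rounding, which is the resolution that the continuous range preserves.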
Summary Points
• Motor examination in Parkinson’s disease patients is mainly based upon subjective human interpretation.
• Logical connections between features and output ratings are not clear and complex to derive from current tremor quantification models.
• Several features proposed in Parkinson’s disease rating scales, including rest tremor amplitude, are quantifiable using wearable sensors.
• Tremor severity models based on fuzzy inference systems allow modelling clinician knowledge avoiding sharp boundaries between severity categories.
Author contribution
Luis A. Sanchez-Perez, PhD: Conception and Design, Data Acquisition, Data Interpretation, Drafting and Revising.
Luis P. Sanchez-Fernandez, PhD: Conception and Design, Data Acquisition, Data Interpretation and Revising.
Adnan Shaout, PhD: Data Interpretation and Revising.
Juan M. Martinez-Hernandez, MD, MSc: Data Acquisition and Revising.
Maria J. Alvarez-Noriega, MD: Data Acquisition and Revising.
5. Conclusions
Currently, the most widely accepted instrument to rate tremors during motor examinations in PD patients is the scale. However, since discrete values are assigned to all severity ratings (0, 1, 2, 3 or 4), sharp boundaries must be defined throughout the input range. This should be carefully reviewed, particularly when subjective observations determine which inputs are used to infer tremor severity. In this paper, results reveal that a rest tremor severity model based on fuzzy inference systems prevents the floor/ceiling effect inherent in discrete ranges. What is more, clinician knowledge is still modeled similarly to the human inference process captured in the scale using if-then rules, yielding a scalable and easily adjustable model. Additionally, subjective observations of tremor amplitude are replaced by a novel rest tremor quantification method using wearable sensors, leading to greater certainty in the definition of the inputs, which is particularly useful for follow-ups.
Relevant conflicts of interest/financial disclosures
Nothing to report.
Funding sources
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Acknowledgements
We thank the Mexican Council of Science and Technology (CONACYT) as well as the Instituto Politecnico Nacional (IPN) for their support.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.ijmedinf.2018.03.002.
Appendix A
Figs. A1 and A2
Fig. A1. Review of some data during lower extremities tremor quantification for a patient performing two different motor examination exercises: (a) RUE during Item 3.15, (b) LUE during Item 3.15, (c) RUE during Item 3.6, (d) LUE during Item 3.6.
Fig. A2. Review of some data during upper extremities tremor quantification for a patient performing two different motor examination exercises: (a) RUE during Item 3.15, (b) LUE during Item 3.15, (c) RUE during Item 3.6, (d) LUE during Item 3.6.
Appendix B Fig. B1
Fig. B1. Features from group of tremors T and model output for an extremity of four different PD patients.
References
[1] G.T. Stebbins, C.G. Goetz, D.J. Burn, J. Jankovic, T.K. Khoo, B.C. Tilley, How to identify tremor dominant and postural instability/gait difficulty groups with the movement disorder society unified Parkinson’s disease rating scale: comparison with the unified Parkinson’s disease rating scale, Mov. Disord. 28 (5) (2013) 668–670, http://dx.doi.org/10.1002/mds.25383.
[2] C.G. Goetz, S. Fahn, P. Martinez-Martin, et al., Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (MDS-UPDRS): process, format, and clinimetric testing plan, Mov. Disord. 22 (1) (2007) 41–47, http://dx.doi.org/10.1002/mds.21198.
[3] C.G. Goetz, B.C. Tilley, S.R. Shaftman, et al., Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (MDS-UPDRS): scale presentation and clinimetric testing results, Mov. Disord. 23 (15) (2008) 2129–2170, http://dx.doi.org/10.1002/mds.22340.
[4] The unified Parkinson’s disease rating scale (UPDRS): status and recommendations, Mov. Disord. 18 (7) (2003) 738–750, http://dx.doi.org/10.1002/mds.10473.
[5] R. Elble, P. Bain, M.J. Forjaz, et al., Task force report: scales for screening and evaluating tremor: critique and recommendations, Mov. Disord. 28 (13) (2013) 1793–1800, http://dx.doi.org/10.1002/mds.25648.
[6] S. Fahn, R.L. Elton, UPDRS Program Members, S. Fahn, C.D. Marsden (Eds.), Unified Parkinson’s Disease Rating Scale, vol 2, Macmillan Healthcare Information, Florham Park, NJ, 1987.
[7] P. Martinez-Martin, C. Rodriguez-Blazquez, M. Alvarez-Sanchez, et al., Expanded and independent validation of the Movement Disorder Society–Unified Parkinson’s disease rating scale (MDS-UPDRS), J. Neurol. 260 (1) (2013) 228–236, http://dx.doi.org/10.1007/s00415-012-6624-1.
[8] M. Škorvánek, Z. Košutzká, P. Valkovič, R. Ghorbani Saeedian, Z. Gdovinová, N. Lapelle, J. Huang, B.C. Tilley, G.T. Stebbins, C.G. Goetz, Validation of the slovak version of the movement disorder society – unified parkinson’s disease rating scale (MDS-UPDRS), Ces a Slov Neurol a Neurochir 76 (4) (2013) 463–468.
[9] A.J. Espay, D.E. Beaton, F. Morgante, C.A. Gunraj, A.E. Lang, R. Chen, Impairments of speed and amplitude of movement in Parkinson’s disease: a pilot study, Mov. Disord. 24 (7) (2009) 1001–1008, http://dx.doi.org/10.1002/mds.22480.
[10] A. Kishore, A.J. Espay, C. Marras, et al., Unilateral versus bilateral tasks in early asymmetric Parkinson’s disease: differential effects on bradykinesia, Mov. Disord. 22 (3) (2007) 328–333, http://dx.doi.org/10.1002/mds.21238.
[11] D.A. Gallagher, C.G. Goetz, G. Stebbins, A.J. Lees, A. Schrag, Validation of the MDS-UPDRS Part I for nonmotor symptoms in Parkinson’s disease, Mov. Disord. 27 (1) (2012) 79–83, http://dx.doi.org/10.1002/mds.23939.
[12] C.G. Goetz, G.T. Stebbins, T.A. Chmura, S. Fahn, W. Poewe, C.M. Tanner, Teaching program for the movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale: (MDS-UPDRS), Mov. Disord. 25 (9) (2010) 1190–1194.
[13] D. Verbaan, S.M. van Rooden, C.P. Benit, E.W. van Zwet, J. Marinus, J.J. van Hilten, SPES/SCOPA and MDS-UPDRS. Formulas for converting scores of two motor scales in Parkinson’s disease, Park Relat Disord. 17 (8) (2011) 632–634, http://dx.doi.org/10.1016/j.parkreldis.2011.05.022.
[14] P. Martínez-Martín, C. Rodríguez-Blázquez, M.J. Forjaz, et al., Relationship between the MDS-UPDRS domains and the health-related quality of life of Parkinson’s disease patients, Eur. J. Neurol. 21 (3) (2014) 519–524, http://dx.doi.org/10.1111/ene.12349.
[15] H.J. Lee, S.K. Kim, H. Park, et al., Clinicians’ tendencies to under-rate Parkinsonian tremors in the less affected hand, PLoS One 10 (6) (2015), http://dx.doi.org/10.1371/journal.pone.0131703.
[16] K.J. Kubota, J.A. Chen, M.A. Little, Machine learning for large-scale wearable sensor data in Parkinson’s disease: concepts, promises, pitfalls, and futures, Mov. Disord. 31 (9) (2016) 1314–1326, http://dx.doi.org/10.1002/mds.26693.
[17] W. Maetzler, J. Domingos, K. Srulijes, J.J. Ferreira, B.R. Bloem, Quantitative wearable sensors for objective assessment of Parkinson’s disease, Mov. Disord. 28 (12) (2013), http://dx.doi.org/10.1002/mds.25628.
[18] M.M. Koop, A. Andrzejewski, B.C. Hill, G. Heit, H.M. Bronte-Stewart, Improvement in a quantitative measure of bradykinesia after microelectrode recording in patients with Parkinson’s disease during deep brain stimulation surgery, Mov. Disord. 21 (5) (2006) 673–678, http://dx.doi.org/10.1002/mds.20796.
[19] S. Patel, K. Lorincz, R. Hughes, et al., Monitoring motor fluctuations in patients with Parkinson’s disease using wearable sensors, IEEE Trans. Inf. Technol. Biomed. 13 (6) (2009) 864–873, http://dx.doi.org/10.1109/TITB.2009.2033471.
[20] N. Millor, P. Lecumberri, M. Gomez, A. Martinez-Ramirez, M. Izquierdo, Kinematic parameters to evaluate functional performance of sit-to-stand and stand-to-sit transitions using motion sensor devices: a systematic review, IEEE Trans. Neural Syst. Rehabil. Eng. 22 (5) (2014) 926–936, http://dx.doi.org/10.1109/TNSRE.2014.2331895.
[21] S.J. Ozinga, A.G. Machado, M. Miller Koop, A.B. Rosenfeldt, J.L. Alberts, Objective assessment of postural stability in Parkinson’s disease using mobile technology, Mov. Disord. 30 (9) (2015) 1214–1221, http://dx.doi.org/10.1002/mds.26214.
[22] M.D. Djurić-Jovičić, N.S. Jovičić, S.M. Radovanović, I.D. Stanković, M.B. Popović, V.S. Kostić, Automatic identification and classification of freezing of gait episodes in Parkinson’s disease patients, IEEE Trans. Neural Syst. Rehabil. Eng. 22 (3) (2014) 685–694, http://dx.doi.org/10.1109/TNSRE.2013.2287241.
[23] E. Sejdić, K.A. Lowry, J. Bellanca, M.S. Redfern, J.S. Brach, A comprehensive assessment of gait accelerometry signals in time, frequency and time-frequency domains, IEEE Trans. Neural Syst. Rehabil. Eng. 22 (3) (2014) 603–612, http://dx.doi.org/10.1109/TNSRE.2013.2265887.
[24] H. Dai, H. Lin, T.C. Lueth, Quantitative assessment of parkinsonian bradykinesia based on an inertial measurement unit, Biomed. Eng. Online 14 (2015) 68, http://dx.doi.org/10.1186/s12938-015-0067-8.
[25] N. Kostikis, D. Hristu-Varsakelis, M. Arnaoutoglou, C. Kotsavasiloglou, A smartphone-based tool for assessing parkinsonian hand tremor, IEEE J. Biomed. Heal. Inf. 19 (6) (2015), http://dx.doi.org/10.1109/JBHI.2015.2471093.
[26] B.K. Scanlon, B.E. Levin, D.A. Nation, et al., An accelerometry-based study of lower and upper limb tremor in Parkinson’s disease, J. Clin. Neurosci. 20 (6) (2013), http://dx.doi.org/10.1016/j.jocn.2012.06.015.
[27] S. Louie, M.M. Koop, A. Frenklach, H. Bronte-Stewart, Quantitative lateralized measures of bradykinesia at different stages of Parkinson’s disease: the role of the less affected side, Mov. Disord. 24 (13) (2009) 1991–1997, http://dx.doi.org/10.1002/mds.22741.
[28] M.M. Koop, N. Shivitz, H. Brontë-Stewart, Quantitative measures of fine motor, limb, and postural bradykinesia in very early stage, untreated Parkinson’s disease, Mov. Disord. 23 (9) (2008) 1262–1268, http://dx.doi.org/10.1002/mds.22077.
[29] S.H. Roy, B.T. Cole, L.D. Gilmore, et al., High-resolution tracking of motor disorders in Parkinson’s disease during unconstrained activity, Mov. Disord. 28 (8) (2013) 1080–1087, http://dx.doi.org/10.1002/mds.25391.
[30] S. Stuart, B. Galna, S. Lord, L. Rochester, A protocol to examine vision and gait in Parkinson’s disease: impact of cognition and response to visual cues, F1000 Res. 4 (2015) 1379, http://dx.doi.org/10.12688/f1000research.7320.2.
[31] D.G.M. Zwartjes, T. Heida, J.P.P. Van Vugt, J.A.G. Geelen, P.H. Veltink, Ambulatory monitoring of activities and motor symptoms in Parkinson’s disease, IEEE Trans. Biomed. Eng. 57 (11) (2010) 2778–2786, http://dx.doi.org/10.1109/TBME.2010.2049573.
[32] A. Salarian, H. Russmann, C. Wider, P.R. Burkhard, F.J.G. Vingerhoets, K. Aminian, Quantification of tremor and bradykinesia in Parkinson’s disease using a novel ambulatory monitoring system, IEEE Trans. Biomed. Eng. 54 (2) (2007) 313–322, http://dx.doi.org/10.1109/TBME.2006.886670.
[33] J.P. Giuffrida, D.E. Riley, B.N. Maddux, D.A. Heldman, Clinically deployable Kinesia™ technology for automated tremor assessment, Mov. Disord. 24 (5) (2009) 723–730, http://dx.doi.org/10.1002/mds.22445.
[34] B.T. Cole, S.H. Roy, C.J. De Luca, S.H. Nawab, Dynamical learning and tracking of tremor and dyskinesia from wearable sensors, IEEE Trans. Neural Syst. Rehabil. Eng. 22 (5) (2014) 982–991, http://dx.doi.org/10.1109/TNSRE.2014.2310904.
[35] H. Dai, P. Zhang, T. Lueth, Quantitative assessment of parkinsonian tremor based on an inertial measurement unit, Sensors 15 (10) (2015) 25055, http://www.mdpi.com/1424-8220/15/10/25055.
[36] D. Haubenberger, G. Abbruzzese, P.G. Bain, et al., Transducer-based evaluation of tremor, Mov. Disord. 31 (9) (2016) 1327–1336, http://dx.doi.org/10.1002/mds.26671.
[37] A. Sanchez-Ferro, M. Elshehabi, C. Godinho, et al., New methods for the assessment of Parkinson’s disease (2005–2015): a systematic review, Mov. Disord. 31 (9) (2016), http://dx.doi.org/10.1002/mds.26723.
[38] R.J. Elble, S.L. Pullman, J.Y. Matsumoto, J. Raethjen, G. Deuschl, R. Tintner, Tremor amplitude is logarithmically related to 4- and 5-point tremor rating scales, Brain 129 (10) (2006) 2660–2666, http://dx.doi.org/10.1093/brain/awl190.
[39] M.J. Forjaz, A. Ayala, C.M. Testa, et al., Proposing a Parkinson’s disease–specific tremor scale from the MDS-UPDRS, Mov. Disord. 30 (8) (2015) 1139–1143, http://dx.doi.org/10.1002/mds.26271.
[40] J. Costa, H.A. González, F. Valldeoriola, C. Gaig, E. Tolosa, J. Valls-Solé, Nonlinear dynamic analysis of oscillatory repetitive movements in Parkinson’s disease and essential tremor, Mov. Disord. 25 (15) (2010) 2577–2586, http://dx.doi.org/10.1002/mds.23334.
[41] J. Jankovic, K.S. Schwartz, W. Ondo, Re-emergent tremor of Parkinson’s disease, J. Neurol. Neurosurg. Psychiatry 67 (5) (1999) 646–650.
[42] S.O.H. Madgwick, A.J.L. Harrison, R. Vaidyanathan, Estimation of IMU and MARG orientation using a gradient descent algorithm, IEEE Int. Conf. Rehabil. Robot. 2011 (2011) 1–7, http://dx.doi.org/10.1109/ICORR.2011.5975346.
[43] R. Mahony, T. Hamel, J.M. Pflimlin, Nonlinear complementary filters on the special orthogonal group, IEEE Trans. Automat. Contr. 53 (5) (2008) 1203–1218, http://dx.doi.org/10.1109/TAC.2008.923738.
[44] T. Takagi, M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Trans. Syst. Man Cybern. (SMC-15 (1)) (1985) 116–132, http://dx.doi.org/10.1109/TSMC.1985.6313399.