2014 Health Innovations and Point-of-Care Technologies Conference Seattle, Washington USA, October 8-10, 2014
Relating Sensor-Based Tremor Metrics to a Conventional Clinical Scale
David Western1, Member, IEEE, Simon A. Neild1, Rick A. Hyde1, Rosemary Jones2, Angela Davies-Smith2
Abstract— Clinical scales based on visual observation are routinely used to assess pathological tremors, informing diagnoses and treatment selection. Because these scales incorporate elements that are subjective and limited in accuracy, sensor-based measurements of tremor are increasingly being considered as a means of improving on current practice. The translational potential of such measurement systems can be limited by their complexity and insufficient characterisation of how the new measures relate to the established scales and the prior literature that is based on those scales. In this study, we considered two simple sensor-based measures of tremor amplitude and compared them against an established clinical scale, the Fahn-Tolosa-Marin Tremor Rating Scale, among a cohort of 21 people with multiple sclerosis. Both metrics were found to be strongly correlated with the clinical scores (Pearson’s r = 0.83 in both cases). Sources of discrepancy are discussed, suggesting avenues for further refinement of tremor assessment methods.
1 D. Western ([email protected]), S. Neild, and R. Hyde are with the Department of Mechanical Engineering, University of Bristol, U.K. 2 R. Jones and A. Davies-Smith are with the MS Research Unit, Bristol & Avon Multiple Sclerosis (BrAMS) Centre, Frenchay Hospital, Bristol, UK. This is a summary of independent research funded by the National Institute for Health Research (NIHR)’s Invention for Innovation (i4) Programme. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. This work was also supported by the UK EPSRC, Account ref. EP/P501326/1 and by the charity MS Research.
978-1-4799-2918-4/14/$31.00 ©2014 IEEE
I. INTRODUCTION
Tremor, or involuntary oscillatory movement, can arise as a symptom of various neurological pathologies including Parkinson’s Disease and Multiple Sclerosis. It can be severely disabling, as well as socially isolating. Diagnosis, monitoring, and treatment selection for individuals with tremor are conventionally based on observations carried out by a medical professional during a battery of set tasks. As summarised by Elble et al. [1], various scales have been developed to translate these observations into quantitative descriptions of tremor severity or its impact on functional ability and quality of life. These scales all incorporate subjective, qualitative measures of the movement. Good inter-rater and intra-rater reliability have been reported for many of these scales, but coarse resolutions (e.g. 0-4 integer scoring) must be imposed to achieve this. As a result, the scales can be insensitive to small changes in symptoms, limiting their efficacy in tracking an individual’s symptom development or assessing new treatments.
Motion tracking sensors are increasingly being considered as a means of improving the precision, accuracy, and objectivity of tremor assessments. Despite the long-standing availability of numerous suitable technologies, current clinical practice remains dominated by assessments based on
visual observation. A likely barrier to the translation of motion capture technologies into clinical practice is that many sensor-based measurements cannot be intuitively related to visual observation, especially without engineering expertise or specialist training. This particular obstacle to progress is the focus of the present paper.
Various authors have contributed methods for quantifying pathological tremor in a manner suited to the extraction of summary parameters. Arguably the most widely used is the Weighted Fourier Linear Combiner (WFLC) [2], which has been implemented in the control architecture of numerous tremor suppression devices. An alternative method based on the Hilbert-Huang transform was proposed by Mellone et al. [3]. Gao and Tung modelled tremor as a diffusion process, extracting several descriptive parameters based on these models [4].
II. AIMS
All of these approaches have shown merit in providing useful descriptions of tremor, and could eventually play a key role in the use of motion capture technology in clinical tremor assessments, working beyond the limits of visual observation. However, the level of abstraction that separates the extracted parameters from the observed movement can make them difficult to understand for interpreters without mathematical expertise, and is therefore likely to slow the adoption of this technology into clinical practice.
Hence we sought to determine whether a simple tremor metric can approximate the judgement made by a medical professional during a tremor severity assessment, while retaining the objectivity and accuracy afforded by motion capture technology.
If sensor-based tremor metrics are to be incorporated into clinical practice, it is necessary to understand how the new metrics relate to the scales used in current practice.
This relationship will determine the extent to which treatment guidelines and prior research findings based on the established scales can inform the application and interpretation of the new metrics. For that reason, this study compares two candidate metrics against scores from the Fahn-Tolosa-Marin Tremor Rating Scale (FTMTRS) [5], which is well established and widely used [1].
III. METHODS
A. Tremor Metrics
1) Consideration of Translational Potential: The precise nature of any instance of tremor is known to vary greatly depending on the individual’s etiology and the task to be
performed, and on other factors such as body temperature [6]. Furthermore, tremor patients are often affected by other symptoms such as weakness, fatigue, limited range-of-movement, or cognitive deficits that limit their ability to carry out prescribed tasks. These considerations are substantial barriers to the development of reliable, fully automated tremor measurements. A more practical system would make use of the clinician’s judgement to guide the measurement process and vet the reliability of individual measurements. Such supervision requires that the process for calculating any tremor metric from sensor readings must be readily understood by the clinician. Any such metric should therefore be intuitive in its construction. To further benefit from user supervision, the metric should be derived from displacement readings or other parameters that are easily verified by independent (visual) observation, such that gross errors (due to sensor misalignment, for example) are readily identified. It is important to note that incorporating user supervision in the implementation of any sensor-based metric sacrifices some objectivity in favour of versatility and robustness.
2) Proposed Metrics: Based on consultation with neurologists and physiotherapists, two metrics were proposed to summarise tremor amplitude in a way that satisfies the criteria identified above. Both metrics are suitable for summarising displacement amplitude in an intuitive manner, but are equally applicable to other movement parameters such as acceleration or angular rate.
Average tremor amplitude (ATA): This metric is based in the frequency domain, reflecting the fact that tremor is, by definition, ‘rhythmic’ or ‘oscillatory’. Once a suitable movement parameter has been recorded and the signal section of interest has been isolated, Welch’s method is used to calculate the power spectral density (PSD(f)) of this signal section.
The tremor frequency fT is identified by the largest peak within the 2-10 Hz bandwidth of this spectrum. To allow for a broad tremor peak, lower and upper boundary frequencies f1 and f2 are identified as the nearest frequency bins either side of the peak for which the PSD is less than half its peak value. The tremor amplitude is then extracted by integrating the square root of PSD within this range, as described in (1).
$$\mathrm{ATA} = 4 N T_S \sum_{f=f_1}^{f_2} \sqrt{\mathrm{PSD}(f)} \qquad (1)$$
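The band-selection step that precedes (1) can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: it assumes the PSD has already been estimated (for example by Welch's method) and is supplied as two parallel lists, and the function name `tremor_band`, its parameters, and the simplification of treating the largest bin inside the 2-10 Hz band as the tremor peak are all assumptions made for illustration.

```python
def tremor_band(freqs, psd, lo=2.0, hi=10.0):
    """Return (fT, f1, f2) for a one-sided PSD sampled at the bins in freqs.

    fT is the bin with the largest PSD inside [lo, hi] Hz.
    f1 and f2 are the nearest bins below and above fT whose PSD is less
    than half the peak value. If the PSD never drops below half the peak
    on one side, the boundary is clamped to the first or last bin
    (an assumption of this sketch, not something stated in the paper).
    """
    in_band = [i for i, f in enumerate(freqs) if lo <= f <= hi]
    if not in_band:
        raise ValueError("no frequency bins inside the search band")

    peak = max(in_band, key=lambda i: psd[i])
    half_peak = psd[peak] / 2.0

    # Walk downwards from the peak until the PSD falls below half its peak.
    i1 = peak
    while i1 > 0 and psd[i1] >= half_peak:
        i1 -= 1

    # Walk upwards from the peak in the same way.
    i2 = peak
    while i2 < len(psd) - 1 and psd[i2] >= half_peak:
        i2 += 1

    return freqs[peak], freqs[i1], freqs[i2]


# Hypothetical PSD values on a 1 Hz grid, for illustration only.
example_freqs = [float(k) for k in range(13)]
example_psd = [9.0, 1.0, 1.0, 1.0, 2.0, 6.0, 10.0, 7.0, 3.0, 1.0, 1.0, 1.0, 1.0]
f_t, f_1, f_2 = tremor_band(example_freqs, example_psd)
```

In this illustrative input the large value at 0 Hz is ignored because it lies outside the 2-10 Hz search band, which mirrors the intent of restricting the peak search to the tremor bandwidth.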
N is the number of frequency bins between f1 and f2, inclusive, and TS is the sample period of the recording. The factor 4 is used so that, when the signal is a perfect sinusoid, ATA returns the peak-to-peak (rather than zero-to-peak) amplitude of that sinusoid, for a more intuitive representation of visual perception of amplitude. From an engineering perspective, a frequency-domain measure such as ATA is an intuitive description of an oscillatory signal. However, to other interpreters the transformation from the time domain to the frequency domain may obscure the metric’s interpretation. Furthermore, in some forms of tremor, particularly those that arise in multiple sclerosis,
the spectral content of the movement can be highly nonstationary. In these cases, the behaviour of a metric based in the frequency domain can be difficult to predict. In light of these concerns, we developed a second metric that is based in the time domain, as described below. Maximum tremor envelope (MTE): The MTE was designed to reflect the consulted clinicians’ comments that, when the observed tremor amplitude is highly variable (as seen in Fig. 1, for example), they allow the largest oscillations to dominate the clinical scoring. Notably, there are no official guidelines as to how to make this judgement when scoring based on the FTMTRS. To calculate the MTE, the signal must first be subjected to band-pass filtering or some other form of processing to separate the tremor from the voluntary movement components and high-frequency artefacts. The most appropriate method for doing so may be decided on an ad hoc basis. The upper and lower bounds of the tremor envelope are then defined by connecting the consecutive maxima and consecutive minima with linear segments. The MTE is then defined as the largest difference between these two bounds at any instant within the section of interest. This process is illustrated in Fig. 1.
Fig. 1. An illustration of the method by which the maximum tremor envelope (MTE) is calculated. The solid line shows the hand displacements of a subject with multiple sclerosis, filtered to leave only the tremor component and projected onto the principal tremor axis. The dashed lines show the bounds of the tremor envelope.
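The envelope construction illustrated in Fig. 1 can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' code: the input is assumed to be a one-dimensional, already filtered tremor signal sampled at uniform intervals, the envelope is evaluated only where both bounds are defined (between the first and last extremum of each kind), and the function names are hypothetical.

```python
def local_extrema(x):
    """Return the sample indices of local maxima and local minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def _interpolate(t, knots, values):
    """Linearly interpolate between (knot, value) pairs at sample index t."""
    for k in range(len(knots) - 1):
        t0, t1 = knots[k], knots[k + 1]
        if t0 <= t <= t1:
            return values[k] + (values[k + 1] - values[k]) * (t - t0) / (t1 - t0)
    raise ValueError("sample index lies outside the envelope")


def maximum_tremor_envelope(x):
    """Largest gap between the upper and lower envelope bounds of x."""
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        raise ValueError("need at least two maxima and two minima")

    upper_values = [x[i] for i in maxima]
    lower_values = [x[i] for i in minima]

    # Only compare the bounds where both envelopes exist.
    start = max(maxima[0], minima[0])
    end = min(maxima[-1], minima[-1])

    # Both bounds are piecewise linear with knots at sample indices, so
    # checking every sample index is enough to find the largest gap.
    widest = 0.0
    for t in range(start, end + 1):
        gap = _interpolate(t, maxima, upper_values) - _interpolate(t, minima, lower_values)
        widest = max(widest, gap)
    return widest


# Hypothetical signal whose oscillation grows and then decays.
example_signal = [0, 1, 0, -1, 0, 2, 0, -2, 0, 1, 0, -1, 0]
example_mte = maximum_tremor_envelope(example_signal)
```

Because the largest oscillation dominates the result, a short burst of large tremor raises the value even when most of the recording is quiet, which matches the clinicians' stated scoring habit.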
In reality, the problem of distinguishing tremor from other movement components is not always straightforward. In some cases the bandwidth of the voluntary movements is distinct from that of the tremor, but such an assumption cannot be relied upon. Several methods have been proposed for distinguishing tremor from other movement components (e.g. [2], [3], [7]), with varying degrees of validation. However, none of these has been sufficiently validated to be assumed reliable across the full breadth of known tremor characteristics and etiologies, which are summarized in [6]. This task should therefore be regarded as an open research question, and one that is beyond the scope of this paper. To minimise the influence of voluntary movements, the proposed metrics were applied only to postural recordings in this study.
B. Tremor Recordings and Pre-Processing
Movement recordings were obtained from a cohort of 21 subjects (15 female; ages 30-60), each of whom had
a diagnosis of multiple sclerosis and exhibited tremor in one or both of their upper limbs. For each recording, a single motion-tracker unit (Xbus Kit, Xsens Technologies B.V., Enschede, The Netherlands) was attached to the dorsal side of the subject’s hand. At a sample frequency of 100 Hz, the sensor unit returned a quaternion describing its orientation in three dimensions, calculated from MEMS sensor readings (triple-axis accelerometer, gyroscope, and magnetometer) using the manufacturer’s proprietary Kalman Filter, XKF-3. The unit also returned temperature-calibrated accelerometer readings.
The orientation readings were used to rotate the accelerometer readings into a global reference frame, so that the gravitational component could be removed and the accelerations could be integrated twice to yield displacement estimates. This approach is commonly known as inertial navigation. The resulting displacement estimates were translated to yield the position of the subject’s middle fingertip, which was assumed to be rigidly connected to the sensor unit; one subject was excluded from analysis because they were unable to hold their fingers straight.
For MTE calculations, the estimates were then high-pass filtered to remove drift and non-tremor components of the movement. In this case, an 8th-order Bessel filter with a cut-off frequency of 1.8 Hz was applied in the forward and backward directions to minimise distortion of the tremor signal in the time domain. To extract a one-dimensional signal, the filtered three-dimensional displacement readings were projected onto the principal tremor axis, the dominant component of the coordinates as identified by MATLAB®’s ‘princomp’ function, which is an implementation of principal component analysis. This vector describes the direction in which the filtered displacement exhibits the greatest variance.
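The projection onto the principal tremor axis can be illustrated without MATLAB. The Python sketch below is not the ‘princomp’ implementation used in the study: it estimates the dominant eigenvector of the 3×3 sample covariance matrix by power iteration, which is an assumption chosen to keep the example dependency-free, and the function name, starting vector, and iteration count are hypothetical. The sign of the returned axis is arbitrary, as it is for any principal component.

```python
import math


def project_onto_principal_axis(points, iterations=200):
    """Project 3-D displacement samples onto their direction of greatest variance.

    points is a sequence of (x, y, z) tuples. Returns (axis, projection),
    where axis is a unit vector and projection is the 1-D signal obtained
    by projecting the mean-centred samples onto that axis.
    """
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    centred = [[p[k] - mean[k] for k in range(3)] for p in points]

    # Sample covariance matrix of the centred coordinates.
    cov = [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(3)]
        for i in range(3)
    ]

    # Power iteration: repeated multiplication by the covariance matrix
    # converges towards the eigenvector with the largest eigenvalue,
    # provided the starting vector is not orthogonal to it.
    axis = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        product = [sum(cov[i][j] * axis[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(v * v for v in product))
        if norm == 0.0:
            raise ValueError("samples have no variance along the start vector")
        axis = [v / norm for v in product]

    projection = [sum(c[k] * axis[k] for k in range(3)) for c in centred]
    return axis, projection


# Hypothetical samples lying exactly along the direction (1, 2, 2).
example_points = [(t, 2.0 * t, 2.0 * t) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
example_axis, example_projection = project_onto_principal_axis(example_points)
```

For real recordings a library eigen-decomposition would normally be preferred over power iteration; the sketch is only meant to show what "the direction of greatest variance" means in terms of the covariance of the filtered coordinates.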
The proposed tremor metrics were applied to readings from the MEMS sensor unit using this method, and were compared against equivalent measures from a ‘gold standard’ camera-based system (Qualisys A.B., Gothenburg, Sweden), using a single subject. The metrics from the two systems were found to agree to within 0.7 mm.
C. Postural Task
As discussed in Section III-A.2, to minimise the influence of voluntary movements on the results and their interpretation, the only movement used in this study was a postural task. The seated subject copied a demonstrator, holding their arm out horizontally in front of their body, with their palm facing downward and their fingers straightened and held slightly apart. Each subject was invited to be recorded on two separate days, with an interval of 21 to 381 days between visits. On each day, the subject was asked to perform the task twice with each arm, separated by intervals of approximately 5 to 15 minutes, during which other activities were performed. For various reasons such as patients’ time constraints or limited stamina, the full set of recordings was not gathered from all subjects. The data in this paper represent 126 recordings from 21 different subjects.
D. Clinical Scores
From videos of each of the patient recordings, the severity of postural tremor was rated according to the clinical scale (FTMTRS), which has the following thresholds: 0 = No tremor; 1 = Slight, < 0.5 cm, may be intermittent; 2 = Moderate, 0.5–1 cm, may be intermittent; 3 = Marked, 1–2 cm; 4 = Severe, > 2 cm.
IV. RESULTS
Fig. 2 compares the proposed metrics with the established clinical scores (FTMTRS). Both metrics are strongly correlated with the clinical scores (r = 0.831 for both). However, the metrics do not consistently fall within the thresholds described by the FTMTRS. As can be expected from their definitions, MTE appears to be more likely than ATA to overestimate the FTMTRS score, whereas ATA is more prone to underestimation.
The same-day repeatability of the FTMTRS scores was assessed in terms of the absolute difference between the two measurements taken from a single arm on a single visit. These differences were assessed for all subjects, across all visits (where the necessary data were available): mean 0.4, median 0, max 2. Because the sensor-based metrics were logarithmically related to the clinical scores, their repeatability was assessed in terms of the ratio between the larger and smaller of the two measurements taken on the same day (ATA: mean 1.9, median 1.5, max 5.6; MTE: mean 1.9, median 1.5, max 7.1).
V. DISCUSSION
In compiling data from multiple prior studies, Elble et al. [8] found that tremor was logarithmically related to FTMTRS scores for patients with various forms of essential tremor (postural, kinetic, writing), using a variety of motion capture technologies. However, the metric used and the method of extracting displacement data from sensor readings are not described in detail for all of the studies concerned. Nevertheless, our study finds that the same relationship holds among tremor patients with multiple sclerosis when hand displacement is sensed by inertial navigation. The data presented by Elble et al.
[8], like our own, show that the sensor-based measures of tremor amplitude are not confined within the thresholds of the FTMTRS scores. Potential explanations for this discrepancy can be divided into three categories: variability in the clinical scoring, inaccuracy in the sensor-based measurement, and ambiguity in the definition of the clinical scale. Variability in the clinical scoring is one of the principal motivating factors for the development of more objective tremor measurements. Hooper et al. [9] assessed the intra- and inter-rater reliability of the FTMTRS (the reliability of the scale when the same video is scored repeatedly by the same rater (intra-) or a different rater (inter-)), applied to people with multiple sclerosis performing this postural task. The intra-rater reliability was found to be excellent, with a Pearson’s correlation coefficient of 0.99, but the inter-rater reliability was lower, at 0.87. The fact that the latter
value is similar to the correlation coefficients found in our study between the sensor-based metrics and the FTMTRS suggests that the variability of the FTMTRS may contribute substantially to the discrepancies observed here.
The principal source of error in the sensor-based metrics is likely to be the method of translating the displacements of the sensor unit to give the displacements of a point-of-interest on the hand. Its accuracy is dependent on the validity of the assumption that the point-of-interest is connected to the sensor unit as a rigid body. A limitation of our study is that this aspect of the approach has not been directly validated. However, the subjects were monitored to ensure that they kept their fingers straight during the period-of-interest, and one subject was excluded from the analysis on this basis. When not attempting a direct comparison with a clinical scale, it should be possible to choose the point-of-interest to better suit the rigid-body assumption.
Ambiguity in the definition of the FTMTRS scale is likely to contribute significantly to the observed discrepancies. The scale states clear amplitude thresholds, but does not specify at which point on the hand these amplitudes should be measured. Furthermore, it does not specify whether the thresholds refer to the maximum amplitude, the average, or some other representative summary. In this sense, the greater specificity inherent in sensor-based tremor measurements may stimulate a refinement of the vocabulary used to describe tremor and related movement disorders.
An important attribute of any subjective or objective tremor metric is its test-retest repeatability, which reflects the variability of the symptom itself, rather than just the performance of the metric [1], and which informs the clinical significance that can be ascribed to any observed change in the metric. The repeatability measures reported at the end of Section IV indicate that the severity of symptoms varied substantially between two equivalent measurements taken on the same visit. This variability is likely to be attributable to our use of a cohort with tremor in multiple sclerosis, which is known to be more complex and irregular than other forms of tremor. These results demonstrate a limitation of any tremor rating based purely on amplitude for this cohort, since a substantial variation can occur without clinical significance. Nevertheless, the emerging availability of sensor-based assessments at least allows these symptoms and their variability to be assessed on a continuous scale, which will be essential to the development of improved tremor assessment techniques.
[Figure 2. Upper panel: Average Tremor Amplitude (cm, logarithmic axis) against clinical score (0-4); fitted line log y = 0.646x − 1.677, r = 0.831. Lower panel: Maximum Tremor Envelope (cm, logarithmic axis) against clinical score (0-4); fitted line log y = 0.546x − 1.187, r = 0.831.]
Fig. 2. A comparison of the proposed tremor metrics (ATA in the upper panel; MTE in the lower panel) against clinical scores (FTMTRS) applied to the same postural recordings. Dotted boxes mark out the defined thresholds for the different grades of the FTMTRS. The solid line presents the logarithmic relationship fitted to the data by the least-squared-error method. The fitted parameters of this relationship are presented on each graph, along with the Pearson’s correlation coefficient between the data (r-value).
REFERENCES
[1] R. Elble, P. Bain, M. Joo Forjaz, D. Haubenberger, C. Testa, C. G. Goetz, A. F. G. Leentjens, P. Martinez-Martin, A. Pavy-Le Traon, B. Post, C. Sampaio, G. T. Stebbins, D. Weintraub, and A. Schrag, “Task force report: Scales for screening and evaluating tremor: Critique and recommendations,” Movement Disorders, vol. 28, no. 13, pp. 1793–1800, 2013. [Online]. Available: http://onlinelibrary.wiley.com/doi/10.1002/mds.25648/abstract
[2] C. N. Riviere, S. G. Reich, and N. V. Thakor, “Adaptive Fourier modeling for quantification of tremor,” Journal of Neuroscience Methods, vol. 74, no. 1, pp. 77–87, June 1997. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0165027097022632
[3] S. Mellone, L. Palmerini, A. Cappello, and L. Chiari, “Hilbert-Huang-based tremor removal to assess postural properties from accelerometers,” IEEE Transactions on Biomedical Engineering, vol. 58, no. 6, pp. 1752–1761, June 2011.
[4] J. B. Gao and W.-w. Tung, “Pathological tremors as diffusional processes,” Biological Cybernetics, vol. 86, no. 4, pp. 263–270, Apr. 2002, PMID: 11956807.
[5] S. Fahn, E. Tolosa, and C. Marin, “Clinical rating scale for tremor,” in Parkinson’s Disease and Movement Disorders, 2nd ed. Baltimore, MD: Williams & Wilkins, 1993, pp. 225–234.
[6] G. Deuschl, P. Bain, and M. Brin, “Consensus statement of the Movement Disorder Society on tremor,” Movement Disorders, vol. 13, no. S3, pp. 2–23, Jan. 1998. [Online]. Available: http://onlinelibrary.wiley.com/doi/10.1002/mds.870131303/abstract
[7] L. Z. Popović, T. B. Šekara, and M. B. Popović, “Adaptive band-pass filter (ABPF) for tremor extraction from inertial sensor data,” Computer Methods and Programs in Biomedicine, vol. 99, no. 3, pp. 298–305, Sept. 2010. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0169260710000702
[8] R. J. Elble, S. L. Pullman, J. Y. Matsumoto, J. Raethjen, G. Deuschl, and R. Tintner, “Tremor amplitude is logarithmically related to 4- and 5-point tremor rating scales,” Brain, vol. 129, no. 10, pp. 2660–2666, Oct. 2006, PMID: 16891320. [Online]. Available: http://brain.oxfordjournals.org/content/129/10/2660
[9] J. Hooper, R. Taylor, B. Pentland, and I. R. Whittle, “Rater reliability of Fahn’s tremor rating scale in patients with multiple sclerosis,” Archives of Physical Medicine and Rehabilitation, vol. 79, no. 9, pp. 1076–1079, Sept. 1998, PMID: 9749687.