Veterinary Surgery 34:445–449, 2005
Accuracy and Optimization of Force Platform Gait Analysis in Labradors with Cranial Cruciate Disease Evaluated at a Walking Gait
RICHARD EVANS, PhD, CHRIS HORSTMAN, DVM, and MIKE CONZEMIUS, DVM, PhD, Diplomate ACVS
Objective: To determine the combination of ground reaction forces (GRFs) that best discriminates between lame and non-lame dogs, and to compare the sensitivity of force platform gait analysis and visual observation at detecting gait abnormalities in Labradors after surgery for rupture of the cranial cruciate ligament (CCL).
Animals: All dogs were adult Labrador Retrievers: 17 free of orthopedic and neurologic abnormalities, 100 with unilateral CCL rupture, and 131 studied 6 months after surgery for unilateral CCL injury, 15 of which had observable lameness.
Procedure: Dogs were walked over a force platform with GRFs recorded during the stance phase. Analytic properties of force platform gait analysis were calculated for several combinations of forces. The probability of visual observation detecting a gait abnormality was compared with that of force platform gait analysis.
Results: A combination of peak vertical force (PVF) and falling slope was optimal for discriminating sound and lame Labradors. After surgery, many dogs (75%) with no observable lameness failed to achieve GRFs consistent with sound Labradors.
Conclusion: A force platform is an accurate method of assessing lameness in Labradors with CCL rupture and is more sensitive than visual observation. Assessing lameness with a combination of GRFs is better than using univariate GRFs.
Clinical Relevance: Therapies for stifle lameness can be accurately and objectively evaluated using 2 vertical ground reaction forces obtained from a force platform.
© Copyright 2005 by The American College of Veterinary Surgeons
Key words: ground reaction forces, gait analysis, force platform, peak vertical force (PVF), vertical impulse (VI), cranial cruciate ligament rupture, Labrador Retriever, dog.
From the Orthopaedic Research Laboratory, Iowa State University, College of Veterinary Medicine, Ames, IA.
Presented at the 2003 Veterinary Orthopaedic Society Conference, Steamboat Springs, CO, February 23, 2003.
Dr. Horstman’s current address is Department of Clinical Sciences, College of Veterinary Medicine Sciences Mississippi State, MS 39762.
Address reprint requests to Dr. Evans, PhD, Orthopaedic Research Laboratory, Iowa State University, College of Veterinary Medicine, Ames, IA 50011. E-mail: [email protected].
Submitted March 2005; Accepted May 2005
0161-3499/04
doi:10.1111/j.1532-950X.2005.00067.x

INTRODUCTION

FORCE PLATFORM gait analysis has been used as an objective method for assessing lameness in horses, dogs, and other species.1 Budsberg et al.2 used the force platform to objectively measure ground reaction forces (GRFs) before and after stabilization of canine stifles. They also compared affected and normal hind limbs after surgery. McLaughlin et al.3 objectively measured the effect of triple pelvic osteotomy on dogs with bilateral hip dysplasia with force platform gait analysis. McLaughlin and Roush4 compared 2 scapulohumeral arthrotomy techniques using GRFs, radiography, and subjective lameness scores. Use of a force platform detected group differences but subjective scoring did not. Jevens et al.5 used force platform analysis to compare outcomes of intracapsular and extracapsular methods for treatment of cranial cruciate ligament (CCL) rupture.

Although these reports have all made important contributions, there are no formal investigations of the ability of GRFs generated by force platform gait analysis to discriminate lame and normal dogs, or to quantify the level of accuracy. Also, there have been no multivariate approaches to determine which GRFs, or combination of forces, best discriminate between lame and normal dogs. This becomes increasingly important when clinical decisions and outcome measures are based on force platform data. Finally, many conclusions are drawn after only visual observation of gait and, to our knowledge, the sensitivity of observation to discriminate normal from abnormal gait has not been compared with the discriminant ability of a force platform.

Our purpose was 3-fold. First, we describe a methodology that can be used to quantify the accuracy of a force platform and to determine approximately optimal GRFs from it.6 This methodology is independent of species and disease. Second, using this methodology, we formally assess the accuracy of the force platform using gait data generated by normal Labradors and Labradors with unilateral CCL rupture before surgery. Finally, we show that a force platform is more sensitive than visual observation of gait, using this method with gait data generated from normal Labradors and those 6 months after surgery for unilateral CCL rupture. The probability of detecting a gait abnormality was compared between visual observation of gait and the computational method. Our hypothesis was that optimally weighted combinations of GRFs provide better discrimination between lame and sound dogs than univariate GRFs.
MATERIALS AND METHODS

Dogs

We studied normal Labrador Retrievers (n = 17), Labradors with unilateral CCL rupture evaluated preoperatively (n = 100), and Labradors 6 months after surgery for unilateral CCL rupture (n = 131). The postoperative group consisted of the 6-month follow-up of the preoperative group plus an additional 31 postoperative dogs that were included to increase the power of that part of the study. For all dogs, owner history, physical examination, and force platform gait analysis were performed. Dogs with unilateral CCL rupture (preoperative and postoperative groups) had injury to the medial meniscus that required partial or complete meniscectomy. Normal dogs had no history of lameness, no observable lameness, and no physical examination abnormalities, and had no radiographic evidence of effusion or degenerative joint disease in either stifle or any other joint.
Dogs were diagnosed with unilateral CCL rupture by owner history, physical examination, and radiographs; the diagnosis was confirmed by surgical exploration of the affected joint. All dogs were free of other orthopedic and neurologic abnormalities, including bilateral CCL rupture. Gait analysis was performed before surgery in the preoperative group.

For the first part of the study, normal and preoperative groups were the "gold standard" and were used to determine the GRFs that best discriminated lame and sound Labradors with stifle injury, and to quantify discriminatory capability.

For the second part of the study, postoperative Labradors were used to compare methods of detecting gait abnormalities. The objective in this part of the study was to determine if visual observation was as sensitive at detecting lameness as a force platform. Note that this was a different problem than correlating lameness scores and GRFs. Dogs were studied 6 months after surgery for unilateral CCL rupture by physical examination, visual observation of lameness (present, absent), and force platform gait analysis. Visual scoring was performed by 1 board-certified surgeon who observed the dogs during the force platform gait analysis. The surgeon was unaware of the affected limb and of the force platform data analysis. We chose a binary visual observation score because we wanted to determine if visual observation could discriminate between lame and sound dogs, and to compare that ability to the accuracy of the force platform.
Gait Analysis

Computer-assisted force platform gait analysis was performed using a biomechanical platform (OR6-6-1000, Advanced Mechanical Technology Inc., Watertown, MA) embedded in a 10 m walkway. Three sets of retroreflective photocell sensors were attached in series and positioned in the walkway, each 1 m apart with the middle sensor positioned at the middle of the force plate; they were used to determine velocity and acceleration over the 2 m measurement region (Me 92-TPAD Retroflective Photocell, Sircon Controls, Mississauga, Ontario, Canada). Gait analysis was performed at the walk (velocity, 1–1.3 m/s; acceleration, ±0.5 m/s²).

For the first part of the study, GRFs were obtained from clinically normal dogs and from preoperative gait analysis of the CCL rupture dogs. Rising slope, falling slope (FS), the ratio of time to peak vertical force (PVF) to stance time, PVF, vertical impulse (VI), and combinations of these 5 variables were used. Rising slope was defined as the slope of the straight line that connected the start of the stance phase (zero force) to the point of maximum force. It represented the rate at which a dog loaded the limb. FS was defined as the slope of the straight line that connected the point of maximum force to the end of the stance phase (zero force). It represented the rate at which a dog unloaded the limb. PVF and VI were selected because they are commonly used measures of stifle lameness. The other 3 variables were selected because their interpretations suggest they may have value in predicting lameness. Forces in the x and y planes may also have predictive value, but are not generally reported for assessment of stifle injury.
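The 5 variables defined above can be illustrated with a short sketch. It is not the software used in the study: the function name `stance_features`, the sampled-curve input format, and the use of the trapezoidal rule for VI (taken here as the area under the force-time curve) are assumptions made for illustration, and the example curve is invented.

```python
def stance_features(times, forces):
    """Summarize one stance-phase vertical force curve.

    times  : sample times in seconds, strictly increasing
    forces : vertical force at each sample, in % of body weight,
             assumed to start and end at (or near) zero force
    """
    if len(times) != len(forces) or len(times) < 3:
        raise ValueError("need at least 3 paired samples")

    peak = max(range(len(forces)), key=lambda i: forces[i])
    if peak == 0 or peak == len(forces) - 1:
        raise ValueError("peak force must lie inside the stance phase")

    t_start, t_peak, t_end = times[0], times[peak], times[-1]
    pvf = forces[peak]

    # Vertical impulse: area under the force-time curve (trapezoidal rule).
    vi = sum(
        0.5 * (forces[i] + forces[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )

    return {
        "pvf": pvf,
        "vi": vi,
        # Straight line from the start of stance to the point of peak force.
        "rising_slope": (pvf - forces[0]) / (t_peak - t_start),
        # Straight line from the point of peak force to the end of stance.
        "falling_slope": (forces[-1] - pvf) / (t_end - t_peak),
        # Time to peak force as a fraction of total stance time (unitless).
        "time_to_peak_ratio": (t_peak - t_start) / (t_end - t_start),
    }


# Made-up five-sample curve, for illustration only (not study data).
example = stance_features([0.0, 0.1, 0.2, 0.3, 0.4],
                          [0.0, 20.0, 40.0, 30.0, 0.0])
```

With this definition the falling slope is always negative, and a steeper (more negative) value corresponds to faster unloading of the limb.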
For the second part of the study, visual observation of gait score and GRFs were obtained from dogs 6 months after surgery for unilateral CCL rupture. GRF measurement units (N) were expressed as a percent of body weight. Slopes were expressed as a percent of body weight/second, and the ratio of time to PVF to stance time was unitless.

The accuracy of the force platform was quantified using the area under the receiver-operating characteristic (ROC) curve, the sensitivity (the probability of classifying a normal dog as normal), and the specificity (the probability of classifying a lame dog as lame) for several GRFs and combinations of forces. These values reflect the ability of a force platform to be used as a clinical tool.

The optimal set of GRFs was selected using logistic regression, with sound/not sound as the binary dependent variable and GRFs as candidate explanatory variables. A standard forward and backward model building procedure was used to select the optimal set of GRFs. This iterative, computer-intensive method builds a descriptive yet parsimonious model when there are many candidate combinations of GRFs that could accurately model lameness. At every step in the process, the single best fitting GRF was included in the model (the forward step), and then the current model was reviewed to remove any single GRF that was redundant after the forward step (the backward step). This procedure continued until no further improvements could be made in the model.
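The forward and backward procedure described above can be sketched as follows. This is a generic outline rather than the software used in the study: `stepwise_select` and `toy_criterion` are hypothetical names, the criterion is assumed to be a "lower is better" fit measure (for example an AIC from a fitted logistic regression), and the numbers inside `toy_criterion` are invented purely so the sketch runs.

```python
def stepwise_select(candidates, criterion):
    """Forward-backward selection over candidate variable names.

    criterion(subset) must return a number where lower is better.
    """
    selected = []
    best = criterion(tuple(selected))
    improved = True
    while improved:
        improved = False

        # Forward step: add the single variable that most improves the fit.
        trials = [(criterion(tuple(selected + [c])), c)
                  for c in candidates if c not in selected]
        if trials:
            score, name = min(trials)
            if score < best:
                selected.append(name)
                best = score
                improved = True

        # Backward step: drop any variable that has become redundant.
        for name in list(selected):
            reduced = [s for s in selected if s != name]
            score = criterion(tuple(reduced))
            if score < best:
                selected = reduced
                best = score
                improved = True
    return selected, best


def toy_criterion(subset):
    """Invented stand-in for a model-fit criterion (NOT study data)."""
    gain = 0
    if "PVF" in subset:
        gain += 30
    if "FS" in subset:
        gain += 25
    if "VI" in subset:
        # VI is made nearly redundant once PVF is in the model.
        gain += 1 if "PVF" in subset else 28
    return 100 - gain + 2 * len(subset)  # penalty of 2 per variable


chosen, score = stepwise_select(["RS", "FS", "ratio", "PVF", "VI"],
                                toy_criterion)
```

Because every accepted step strictly lowers the criterion and the number of subsets is finite, the loop always terminates.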
Data Analysis

GRFs measured from the lame hind limb were compared with the GRFs from a randomly selected hind limb from the normal group. GRFs are continuous measures, and a cutoff value was required to classify dogs as normal or lame. Greiner et al.7 documented many cutoff criteria; we used the cutoff that maximized Youden’s index (sensitivity + specificity − 1).

ROC analysis, as described by Greiner et al.,7 was used to assess the diagnostic properties of GRFs obtained from the force platform. An ROC curve is a plot that represents the relationship between sensitivity and specificity of a diagnostic test. The area under the curve (AUC) is a commonly used summary of test accuracy that ranges from 0.5 (a diagnostic test that cannot distinguish between diseased and non-diseased) to 1 (a test that has perfect sensitivity and specificity at some cutoff). The AUCs (± SEM for the areas) under the ROC curves were obtained for GRFs using the Wilcoxon statistic provided by the ROCKIT 0.9b software package (University of Chicago, Chicago, IL). Cutoff values for GRFs were then determined by maximizing Youden’s index. These cutoffs were used to obtain the sensitivities and specificities for all GRFs.

These values were then applied to dogs that were studied 6 months after unilateral CCL rupture surgery. The probability that an individual Labrador could be discriminated from the normal population of Labradors was calculated for data collected from visual observation of gait and from GRFs generated by force platform gait analysis. All data were expressed as mean ± SEM. Statistical significance was set at P < .05.
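Cutoff selection by Youden’s index, as described above, can be illustrated with a minimal sketch. The function name `youden_cutoff`, the convention that a dog is called sound when its score is at or above the cutoff, and the example scores are assumptions for illustration only; sensitivity and specificity follow the definitions used in this paper (sound classified as sound, lame classified as lame).

```python
def youden_cutoff(sound_scores, lame_scores):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index.

    A dog is classified as sound when score >= cutoff.
    sensitivity = fraction of sound dogs classified as sound
    specificity = fraction of lame dogs classified as lame
    """
    best = None
    # Every observed score is a candidate cutoff.
    for cutoff in sorted(set(sound_scores) | set(lame_scores)):
        sens = sum(s >= cutoff for s in sound_scores) / len(sound_scores)
        spec = sum(s < cutoff for s in lame_scores) / len(lame_scores)
        j = sens + spec - 1  # Youden's index
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    _, cutoff, sens, spec = best
    return cutoff, sens, spec


# Invented scores for illustration only (not study data).
cutoff, sens, spec = youden_cutoff([40, 45, 50, 38], [20, 30, 39, 35])
```

When several cutoffs tie on Youden’s index, this sketch keeps the lowest one; a clinician who preferred a more specific test could break ties the other way.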
RESULTS

Mean ± SEM body weight was 29.3 ± 0.8 kg for normal dogs, 36.7 ± 0.7 kg for preoperative dogs, and 37.63 ± 0.68 kg for 6-month CCL rupture dogs. Mean weight of normal and lame groups was statistically different, suggesting the possibility that weight was confounded with force platform GRFs, so GRFs were normalized for body weight. Pearson’s correlation with weight was calculated for each of the 5 GRFs, and none was statistically different from zero (P > .05). In essence, although body weights were statistically different between groups, the difference had no significant effect on normalized GRFs.

Table 1 is a summary of the range of PVF and VI for normal and preoperative CCL rupture dogs. The range of GRFs for preoperative CCL rupture dogs was large and overlapped that for normal dogs. Table 2 lists the area under the ROC curve (AUC) for the GRFs we measured. The combination of GRFs that best discriminated lame and sound dogs was PVF with FS. The AUC generated from this combination was 0.98, and was higher than and statistically different from all the other AUC scores. PVF and VI are commonly used GRFs; their AUCs were 0.89 and 0.84, respectively.

Table 3 compiles sensitivities and specificities for GRFs at cutoffs determined by maximizing Youden’s index. The combined PVF–FS score had a sensitivity of 93% and a specificity of 94%, which was better than all other GRFs and other combinations. The values in this table are cutoff dependent, but they emphasize the ability of the force platform to discriminate between lame and sound Labradors, and show that the multivariate method provides considerably better discrimination. The score equation that combines PVF and FS is as follows:

$$\operatorname{logit}(\text{Probability of being sound}) = 5.7 + 33.4\,\mathrm{FS} + 0.44\,\mathrm{PVF}$$

Note that FS is negative, and a large negative value indicates that a dog was unloading the limb quickly. This equation accounts for those lame dogs that have a larger PVF than expected, but then unload the limb quickly.
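As an illustration of how the score equation converts a pair of GRFs into a probability of soundness, the sketch below applies the inverse logit to the linear score. The coefficients are copied as printed in the equation above; the function names are hypothetical, the inputs are assumed to be in the same units as the study's FS and PVF, and the example inputs are invented.

```python
import math

# Coefficients as printed in the score equation in the text.
INTERCEPT = 5.7
COEF_FS = 33.4
COEF_PVF = 0.44


def soundness_logit(fs, pvf):
    """Linear score: logit of the probability of being sound."""
    return INTERCEPT + COEF_FS * fs + COEF_PVF * pvf


def soundness_probability(fs, pvf):
    """Inverse logit of the score, written to avoid overflow."""
    z = soundness_logit(fs, pvf)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Two invented dogs with the same PVF but different falling slopes.
p_shallow = soundness_probability(fs=-0.3, pvf=40.0)
p_steep = soundness_probability(fs=-0.9, pvf=40.0)
```

Because the FS coefficient is positive and FS itself is negative, a steeper falling slope lowers the score, so `p_steep` is smaller than `p_shallow` even though PVF is identical.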
Figure 1 is a scatter plot of FS against PVF and demonstrates the advantage of the multivariate approach to lameness assessment. Sound dogs are labeled by circles and lame dogs by diamonds. There are lame dogs with FS (PVF) values that are in the range of those for sound dogs, so these GRFs do not individually perform well for distinguishing lame from sound dogs. In combination, however, they separate the groups: for example, most lame dogs with larger PVF (e.g., > 35) distinguish themselves from the sound dogs with a steeper FS.

For the second part of the study, PVF–FS probabilities (of soundness) were calculated for 15 dogs with observable lameness 6 months after surgery, and 116 dogs with no observable lameness at 6 months. The largest probability of a dog with observable lameness having normal limb function was 0.35, which suggests that for dogs with observable lameness, the probability calculation generated from GRFs agrees with visual observation. In contrast, most dogs with no observable lameness (75% had a probability of soundness < .42) could still be discriminated from normal dogs with high probability when using the calculation generated from GRFs. This suggests that the probability calculation generated from force platform gait analysis is more sensitive than visual observation of gait when trying to discriminate normal from abnormal gait.

Table 1. Range of Ground Reaction Forces in Lame and Sound Dogs

Dogs               Peak Vertical Force               Vertical Impulse
                   (N expressed as % body weight)    (N expressed as % body weight)
Lame (n = 17)      14–48.4                           2.6–17.9
Sound (n = 131)    34.2–54.6                         10.8–17.2

Table 2. Areas Under the Receiver-Operating Characteristic (ROC) Curve for Ground Reaction Forces (GRFs) and the Combination of Peak Vertical Force with Average Falling Slope

Ground Reaction Force        AUC
Falling slope (FS)           0.80a
Rising slope (RS)            0.80a
Vertical impulse             0.84a,b
Peak/total time ratio        0.88b
Peak vertical force (PVF)    0.89b
PVF–FS                       0.98c

Different superscripts indicate a statistical difference (P < .05) in AUC. Larger areas under the curve (AUC) indicate GRFs that better discriminate sound and lame dogs.

Table 3. Sensitivities and Specificities for Ground Reaction Forces for Cutoffs Determined by Youden’s Index

Ground Reaction Force        Sensitivity    Specificity
Falling slope (FS)           0.1            1
Rising slope (RS)            0.03           1
Vertical impulse             0.88           0.82
Peak/total time ratio        0.90           0.76
Peak vertical force (PVF)    0.94           0.82
PVF–FS                       0.93           0.94

Fig 1. Scatterplot of falling slope (Newtons as % body weight/s²) against peak vertical force (Newtons as % body weight). Sound dogs (circles) typically have both large peak vertical force and shallow (small negative) falling slope compared with lame dogs (diamonds). [Axes: x, PVF, 10 to 55; y, Average Falling Slope, −0.1 to −1.2.]

DISCUSSION

McIntosh and Pepe6 described a statistical method of combining cancer biomarkers, each of which is an inferior test, so that the combined test is always as good as or better than the individual biomarkers. Historically, manuscripts have used a single GRF to describe limb function in a population of dogs or to discriminate gait among groups of dogs. However, it is possible that no one GRF completely describes lameness. We applied the statistical method described in an effort to optimize the use of GRFs as a diagnostic test, to determine if the multivariate approach improved sensitivity and specificity, and to compare the best GRF combination with visual observation of gait for discriminating lame dogs from normal dogs.

Our results suggest that the multivariate approach is superior to the univariate one, and that the combination of PVF and FS maximized Youden’s index. We also think that this combination makes clinical sense. Many dogs known to be lame (preoperative CCL rupture group) had a high PVF. A univariate approach would not have discriminated them from the normal population; however, most of those dogs concurrently had a large negative FS value, and the combination separated them from the normal Labradors. The large negative FS value indicates that a dog was unloading the limb quickly. This could be because of limb pain once the peak force reaches a normal threshold.

The assessment of accuracy of force platform analysis is dependent on many factors.8 The results presented are for normal Labradors and those with CCL rupture. The accuracy of force platform analysis may change with different causes of lameness and different breeds. As an obvious example, the results (e.g., the score equation and the area under the ROC curve) could be different if the disease were bilateral. Other GRFs (e.g., x or y forces) may be predictive for
other causes of lameness. However, the statistical methodology used in this paper could be used to determine the accuracy of the force platform for other diseases and breeds. The optimal combination of GRFs and the coefficients of the score equation may be different, but the combination will always be at least as good as a univariate analysis.6

The area under an ROC curve is a measure of the accuracy of a diagnostic test with continuous outcomes that is independent of an arbitrary cutoff value. The area under the ROC curve for the PVF–FS score was interpreted as the probability that a randomly selected normal dog would have a larger PVF–FS probability of soundness than a randomly selected lame dog. That is, given 2 randomly chosen dogs (1 lame, 1 normal), there is a 98% chance that force platform gait analysis will correctly discriminate the lame from the normal Labrador.

Cutoff selection is an inherently ambiguous process, because many external factors can influence the diagnostic properties of a test. Also, there are many analytic methods to select cutoffs for diagnostic tests with continuous outcomes. We chose a common method that determines a cutoff by maximizing the sum of sensitivity and specificity. However, if a clinician preferred a more sensitive (or specific) test, then another cutoff would have to be selected. Note that if sensitivity (specificity) is increased then specificity (sensitivity) is decreased.

As a comparison with GRFs from force platform gait analysis, the sensitivity of several canine heartworm tests ranges between 30% and 95% (depending on the worm’s sex and the total worm burden).9 Computed tomography imaging for detection of fragmented medial coronoid process in dogs has a sensitivity of 88.2% and a specificity of 84.6%.10 This suggests that force platform gait analysis compares favorably with some commonly used diagnostic tests in veterinary medicine.
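The pairwise reading of the area under the ROC curve discussed above can be made concrete with a small sketch. It computes the AUC directly as the fraction of (sound, lame) pairs in which the sound dog has the higher soundness score, counting ties as one half, which is the Wilcoxon (Mann-Whitney) form of the statistic. The function name and the scores are invented for illustration; this is not the ROCKIT implementation.

```python
def auc_pairwise(sound_scores, lame_scores):
    """AUC as the probability that a randomly chosen sound dog has a
    higher score than a randomly chosen lame dog (ties count 0.5)."""
    wins = 0.0
    for s in sound_scores:
        for l in lame_scores:
            if s > l:
                wins += 1.0
            elif s == l:
                wins += 0.5
    return wins / (len(sound_scores) * len(lame_scores))


# Invented soundness probabilities, for illustration only.
auc = auc_pairwise([0.9, 0.8, 0.6], [0.1, 0.3, 0.7])
```

In this toy example 8 of the 9 possible pairs are ordered correctly, so the AUC is 8/9. Scores that separate the groups perfectly give 1, and scores that carry no information give 0.5.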
Perhaps the most important finding from this study was generated from the comparison of visual observation of gait and force platform gait analysis. A clinical conclusion that a dog has normal limb function is commonly based on visual observation of the patient’s gait, but is it a sensitive enough test to make generalizations about outcome? We found that all dogs with observable gait abnormalities had a very low probability of being normal (the largest probability of being normal was 35%). Here, normal is defined as having GRFs consistent with the sound Labradors in the multivariate sense. In contrast, 75% of the dogs that had no observable gait abnormalities had a probability of < 50% of being sound, and the abnormality was easily discriminated by their GRFs. In fact, 12 of these dogs had a < 10% chance of having normal gait. Unfortunately, these data create a clinical dilemma. If visual observation of gait
cannot reliably discriminate abnormal from normal gait (in this model), and it has been previously reported that there is no relationship between limb function and radiographic osteoarthrosis score in dogs with stifle osteoarthrosis,11 how does one simply tell an owner that their dog’s limb function has or has not returned to normal? To date, we are not aware of a simple diagnostic test to determine that a dog has normal limb function.

Using a Labrador CCL rupture model, we found that a multivariate approach of GRFs generated by force platform data was superior to a univariate approach. We also found that the combination of PVF–FS had sensitivity and specificity that are comparable with some commonly used diagnostic tests in veterinary medicine. Finally, we found that visual observation of gait of Labradors 6 months after unilateral CCL rupture surgery was an inferior method of discriminating normal from abnormal gait when compared with GRFs.

REFERENCES

1. Corr SA, McCorquodale CC, McGovern RE, et al: Evaluation of ground reaction forces produced by chickens walking on a force plate. Am J Vet Res 64:76–82, 2003
2. Budsberg SC, Verstraete MC, Soutas-Little RW, et al: Force plate analyses before and after stabilization of canine stifles for cruciate injury. Am J Vet Res 49:1522–1524, 1988
3. McLaughlin RM Jr, Miller CW, Taves CL, et al: Force plate analysis of triple pelvic osteotomy for the treatment of canine hip dysplasia. Vet Surg 20:291–297, 1991
4. McLaughlin RM, Roush JK: A comparison of two surgical approaches to the scapulohumeral joint in dogs. Vet Surg 24:207–214, 1995
5. Jevens DJ, DeCamp CE, Hauptman J, et al: Use of forceplate analysis of gait to compare two surgical techniques for treatment of cranial cruciate ligament rupture in dogs. Am J Vet Res 57:389–393, 1996
6. McIntosh MW, Pepe MS: Combining several screening tests: optimality of the risk score. Biometrics 58:657–664, 2002
7. Greiner M, Pfeiffer D, Smith RD: Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests. Prev Vet Med 45:23–41, 2000
8. Begg CB: Biases in the assessment of diagnostic tests. Stat Med 6:411–423, 1987
9. Courtney C, Zeng Q: Comparison of heartworm antigen test kit performance in dogs having low heartworm burdens. Vet Parasitol 96:317–322, 2001
10. Carpenter LG, Schwarz PD, Lowry JE, et al: Comparison of radiologic imaging techniques for diagnosis of fragmented medial coronoid process of the cubital joint in dogs. J Am Vet Med Assoc 203:78–83, 1993
11. Gordon WJ, Conzemius MG, Riedesel E, et al: The relationship between limb function and radiographic osteoarthrosis in dogs with stifle osteoarthrosis. Vet Surg 32:451–454, 2003