Upper Extremity Reachable Workspace Evaluation with Kinect
Gregorij KURILLO a,b,1, Jay J. HAN b, Štěpán OBDRŽÁLEK a, Posu YAN a, Richard T. ABRESCH b, Alina NICORICI b, Ruzena BAJCSY a
a University of California at Berkeley, Berkeley, USA
b University of California at Davis Medical Center, Sacramento, USA
Abstract. We propose a novel low-cost method for quantitative assessment of the upper extremity workspace envelope using the Microsoft Kinect camera. In the clinical environment there are currently no practical and cost-effective methods available for arm-function evaluation in three-dimensional space. In this paper we examine the accuracy of the proposed technique for workspace estimation with the Kinect in comparison with a motion capture system. The experimental results show that the developed system is capable of capturing the workspace with sufficient accuracy and robustness. Keywords. Goniometry, Kinect, reachable workspace, upper-limb evaluation
Introduction Accurate and reliable assessment of the upper extremity is critical for the diagnosis and characterization of various neuromuscular conditions and injuries, for tracking the progress of therapy, and for evaluating the effects of drug or surgical interventions. The various functional tests administered in clinical practice are by and large subjective and provide only coarse qualitative information on a patient's mobility [1]. Assessment of the upper extremity is also performed through range of motion (ROM) measurements using manual goniometry, where the angular range of motion of the joints in various body planes is obtained [2]. The ROM assessment can be either passive, where the therapist moves the joint, or active, where the patient exercises the joint through the available range of rotation without help. Active ROM assessment is therefore often used as an indicator of upper extremity function. Manual goniometry, as opposed to functional tests, provides quantitative results on a patient's joint mobility; however, it is quite subjective, time-consuming, and suffers from low accuracy and reliability [3]. In biomechanics and kinesiology laboratories, various motion capture techniques are applied to capture the full kinematics of an individual's limbs and to perform three-dimensional workspace analysis with high accuracy [4]. Clinical physiotherapy and physician/surgical practice, however, have no 3D tools for arm function/workspace evaluation, which hampers uniform and objective comparison.
1 Corresponding Author: Gregorij Kurillo, University of California at Berkeley, 736 Sutardja Dai Hall # 1764, Berkeley CA 94720-1764, USA; E-mail: [email protected]
Figure 1. The Kinect-based reachable workspace analysis system aims to quantify and visualize upper-extremity function. By capturing movement in various body planes through a visual-feedback-driven assessment protocol, the surface area of the maximal reachable points is obtained. The space is further divided into quadrants, denoted by Roman numerals (left) and corresponding colors (right).
To address the lack of clinical tools in this area, we propose a new methodology for the assessment of the reachable workspace as an evaluation of the upper extremity that can be closely associated with the functional status of the upper limb. The concept of the Kinect-based assessment of the reachable workspace is illustrated in Figure 1. In our pilot study [5] we developed the concept using a stereo camera system to characterize upper-limb dysfunction in patients with various neuromuscular conditions. In this work we take advantage of the Microsoft Kinect camera with full-body tracking capabilities, and a custom-designed 3D virtual environment for visual feedback to ensure consistency of the assessment. In this paper we present a protocol for upper extremity evaluation which aims to capture the active ROM while reducing the time needed in traditional goniometry. We evaluate an individual's reachable workspace by analyzing its surface area, which provides quantitative and visual information on the range of upper limb movement. Finally, we compare the accuracy of the Kinect-based assessment with a commercial motion capture system.
1. Methods & Materials 1.1. Kinect System The Kinect camera captures depth and color images at 30 frames per second (fps), generating a cloud of three-dimensional points from an infrared pattern projected onto the scene. The resolution of the depth sensor is 320 x 240 pixels, providing depth accuracy of about 10-40 mm in the range of 1-4 m [6]. The accompanying Microsoft Kinect SDK features real-time tracking of human limbs. The system is primarily intended for gaming and human-computer interaction; therefore, it is important to evaluate its accuracy and robustness in medical applications [7]. 1.2. Motion Capture For validation of the Kinect skeleton tracking data, we simultaneously captured the subjects with the commercial marker-based motion capture system Impulse (PhaseSpace, Inc., San Leandro, CA). The Impulse motion capture system can uniquely identify and track the 3D position of LED markers at a frequency of 480 Hz with sub-millimeter accuracy. For the experiments we used a tight-fitted shirt equipped with 18 markers. In addition
we applied three markers on the dorsal side of each hand and three markers on a tight-fitted cap to mark the top of the head. In total, 27 markers were used to capture the upper body. Since it is difficult to position markers on anatomical landmarks, we used the skeletonization method integrated in the Recap 2 software (PhaseSpace, Inc.). In our past experiments we showed that the Recap skeleton fitting was superior to the MotionBuilder (AutoDesk, Inc.) algorithm for the purpose of biomechanical evaluation [7]. For each subject we recorded a calibration procedure which involved exercising movement in the wrist, elbow and shoulder joints while keeping the rest of the body in a T-pose. From the calibration data, the algorithm determines the location of the joints and is thus able to fit an anthropometric skeleton to the marker data. The Recap skeletonization algorithm is able to provide a skeleton even when some markers are occluded. The temporal synchronization of the Kinect camera and the Impulse motion capture system was performed using the Network Time Protocol (NTP). The motion capture server provided the NTP service for clock synchronization, while the desktop computer with the Kinect updated its clock accordingly. We used the Meinberg NTP client to achieve accurate synchronization on the Windows system. The geometric calibration of the two systems was performed in two steps. In the first step we calibrated the intrinsic camera parameters of the Kinect using a standard checkerboard. In the second step we collected 2D projections of an LED marker captured by the Kinect camera and the corresponding 3D locations captured by the Impulse cameras. From these data, the rotation and position of the Kinect camera with respect to the motion capture coordinate system were determined. Figure 2 shows the output of the two skeletonization algorithms after the calibration, and the vertical and horizontal shoulder angles during the performance of the movement protocol. 1.3. Visualization To facilitate objective evaluation of the upper extremity reachable workspace, the users were presented with visual feedback as shown in Figure 3a. The 3D environment features a video of the therapist performing the protocol and a mirrored 3D image of the user as captured by the Kinect camera. We found that the visual feedback of the user provides important visual cues for following the movement protocol. For the feedback we deliberately displayed only a texture-less 3D image, since patients may not be comfortable watching a full (textured) video of themselves. Furthermore, we implemented a visualization of the results that the examiner can review following the data acquisition (Figure 1). The visualization framework was implemented using the Ogre 3D graphics engine (www.ogre3d.org). 1.4. Upper Extremity Reachable Workspace Protocol We developed a simple set of movements consisting of first lifting the arm from the resting position to above the head while keeping the elbow extended, performing the same movement in vertical planes at about 0, 45, 90, and 135 degrees. The second set of movements consists of horizontal sweeps at the levels of the umbilicus and the shoulder. Both vertical and horizontal movements were performed in one recording session lasting less than 1 minute. The movement protocol was developed and refined through a series of experiments with healthy persons and individuals with various forms of neuromuscular disease [5]. Two videos of the protocol performed by a kinesiologist were recorded (i.e. for the left and right side) and used as feedback.
Figure 2. (a) Comparison of two different skeleton poses represented by PhaseSpace Recap (blue) and Kinect (red). (b) Extracted joint angles for the right elbow and shoulder during the execution of the movement protocol. The results are compared between the Kinect (solid line) and PhaseSpace Recap skeleton (dotted line).
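The second calibration step in Section 1.2 recovers the rotation and position of the Kinect with respect to the motion capture frame from marker correspondences. The paper solves this from 2D projections and 3D positions (a camera-pose/PnP problem); as a simplified illustration, the sketch below solves the closely related 3D-3D case with the Kabsch algorithm, which applies once the Kinect marker observations are back-projected to 3D using the depth data. The function name and setup are ours, not the paper's.

```python
import numpy as np

def rigid_align(src, dst):
    """Kabsch algorithm: find rotation R and translation t minimizing
    sum_i |R @ src_i + t - dst_i|^2 over corresponding 3D points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Center both point sets, then take the SVD of the covariance matrix.
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    # Correct for a possible reflection so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t
```

Given a handful of simultaneously observed marker positions in both coordinate systems, `R` and `t` map Kinect coordinates into the motion capture frame.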
1.5. Experimental Procedure & Subjects The validation of the Kinect-based reachable workspace evaluation was performed on 10 healthy subjects. We collected simultaneous recordings of the motion capture markers and the Kinect skeleton data. After donning the suit with markers, we collected calibration data for each subject. During the entire procedure the subject was seated on a chair and instructed to keep the back upright. Each subject first watched the instructional video in full-screen mode. The kinesiologist provided additional instructions in person on body posture and limb positioning during various sequences of the task. Next, the subject performed three repetitions of the protocol on each side of the body while observing the visual feedback provided on a 55" TV screen. 1.6. Analysis The captured data were post-processed by first aligning all the modalities in the spatial and temporal domains. Using the Recap software, we obtained the reference skeleton for each individual from their calibration sequence. The skeleton was then fitted to each recording trial and stored in the standard motion capture BVH format. The motion capture data were pre-processed by filtering spikes and interpolating occluded markers through cubic interpolation. The skeleton data from the Kinect and the motion capture were further analyzed in Matlab (MathWorks, Inc., Natick, MA). The reachable workspace is defined as the set of all points, relative to the torso, that an individual can reach by moving their hands. Its envelope can be characterized by the encompassing surface area. It is not practical or feasible to ask the subject to reach all possible points; therefore, we use the trajectory obtained from movements in various standardized body planes. In 3D space, the obtained hand trajectory can be interpreted as a point cloud whose points lie on the surface of the reachable envelope of the arm.
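The pre-processing described in Section 1.6 fills occluded-marker samples with cubic interpolation. A minimal stand-in for one marker coordinate is sketched below; it fits a single cubic polynomial to the valid samples, whereas a per-gap spline would match the paper's description more closely. The function name and data are illustrative only.

```python
import numpy as np

def fill_gaps(signal, valid):
    """Fill occluded samples of a 1-D marker coordinate by fitting one
    cubic polynomial to the valid samples and evaluating it at the gaps
    (a simplified stand-in for per-gap cubic interpolation)."""
    t = np.arange(len(signal), dtype=float)
    coeffs = np.polyfit(t[valid], np.asarray(signal, float)[valid], deg=3)
    out = np.asarray(signal, dtype=float).copy()
    out[~valid] = np.polyval(coeffs, t[~valid])
    return out
```

In practice this is applied independently to the x, y and z coordinates of each occluded marker before the skeleton fitting.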
Since the arm trajectory covers only a portion of the space, it is not possible to determine the enclosed surface by a simple Delaunay triangulation. Instead, we approximate the shoulder joint by a spherical joint and parameterize the trajectory in spherical coordinates, with two angles corresponding to the shoulder flexion/extension and abduction/adduction measurements in goniometry. This is a reasonable approximation since the skeleton model only provides a simple kinematic chain of the body segments. After the mapping into spherical coordinates, we can determine the boundary of the trajectory as a concave polygon. The polygon is determined using the alpha shape [8] with radius π/4 to tightly fit the data points. Finally, the boundary of the polygon is projected back to Cartesian coordinates to obtain the equivalent 3D trajectory. The resulting boundary lies on the spherical surface, which can then be culled accordingly to retain only the surface inside the point cloud of hand positions. Furthermore, we divide the workspace area into several quadrants that correspond to clinically significant functional subspaces, e.g. above/below the shoulder and left/right side of the body. The sagittal plane divides the surface into the left and right sides of the workspace, and the horizontal plane (at the level of the shoulder joint) divides the top and bottom parts of the workspace. The reported surface area was calculated for the entire workspace envelope and for the individual quadrants. To allow for comparison between subjects, we normalized the absolute surface area as the portion of the frontal unit hemisphere covered by the hand movement. It is determined by dividing the absolute area by the factor 2πR². The parameter R, which represents the average distance of the hand from the shoulder, is determined by a least-squares sphere fitting algorithm. A relative surface area of 1.0 would thus correspond to the entire frontal hemisphere that the subject could reach, with its origin in the shoulder joint.
Figure 3. Extracted skeleton, the corresponding hand trajectories and reachable workspace as acquired with the Kinect camera (a) and the PhaseSpace motion capture system (b). (c) Average relative surface area with standard deviation as measured in 10 subjects, combining the results for the left and right arm (N=6).
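The normalization described above divides the absolute surface area by 2πR², with R obtained from a least-squares sphere fit of the hand positions. Both steps can be sketched as follows, using the standard algebraic linearization of the sphere fit (function names are illustrative, not the paper's):

```python
import numpy as np

def fit_sphere(points):
    """Least-squares sphere fit via algebraic linearization:
    |p - c|^2 = R^2  =>  2 p.c + t = |p|^2, with t = R^2 - |c|^2,
    which is linear in the unknowns (c, t)."""
    p = np.asarray(points, dtype=float)
    A = np.hstack([2.0 * p, np.ones((len(p), 1))])
    b = (p ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center, t = sol[:3], sol[3]
    radius = np.sqrt(t + center.dot(center))
    return center, radius

def relative_surface_area(absolute_area, radius):
    """Normalize the absolute envelope area by the frontal hemisphere
    area 2*pi*R^2, so that 1.0 means the full frontal hemisphere."""
    return absolute_area / (2.0 * np.pi * radius ** 2)
```

Here `points` would be the hand positions expressed relative to the shoulder joint, so the fitted center should lie near the origin and `radius` approximates the arm length.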
2. Results For the analysis of the reachable workspace, the exact location of the joints is not as critical as the relative location of the hand with respect to the shoulder joint of the skeleton. As shown in Figure 2, there is an offset between the two skeletons, originating from the different approaches to fitting the kinematic structure into the data. In the case of Recap, the skeleton is fitted to the markers, which are located on the front and back sides of the torso, providing a more accurate determination of the body center. The Kinect, on the other hand, only sees the frontal body surface and fits the skeleton closer to this surface. The skeleton data can be used to extract the joint angles during the performance of the movement protocol, as shown in Figure 2b. The output compares the rotation angle of the elbow joint and two rotation angles of the shoulder joint (around the vertical and horizontal body axes). Note that the elbow joint was kept straight during the movement, except when reaching the extreme positions in the two vertical sweeps. Although the variability of the measured elbow angle is relatively large (about 20º), the time when the elbow was flexed is clearly visible. The shoulder angle output shows high accuracy in the extreme positions (e.g. the four peaks in the horizontal angle), which could be used to automatically determine the active ROM. Figures 3a and 3b show the measured workspace envelope in one of the subjects as determined from the Kinect and Recap skeletons. Although the hand trajectory from the Kinect shows more variability due to the noisiness of the skeleton fitting algorithm, the final result is robust to this noise. The absolute surface area for this example was 1.09 m² for the Kinect and 1.11 m² for the motion capture data. We show only the results for the frontal hemisphere. The individual results of all the subjects are presented in Figure 3c, where the data for the left and right sides are combined as the average of three trials on each side. The results show that in the majority of the subjects, the difference between the motion capture and the Kinect is relatively small. Figure 4a presents the average relative surface area with the standard deviation as measured in 10 subjects, each repeating the task three times. The results are compared for the left and right sides as acquired by the motion capture and the Kinect. The segmentation of the bars corresponds to the average division of the surface area into the four quadrants. The corresponding numerical values are shown in Figure 4b.
Figure 4. (a) Average relative surface area and the corresponding division into quadrants as captured by the two systems in 10 subjects, 3 trials each. The error bars show the standard deviation of the total surface area. (b) Data of the 10 control subjects and their corresponding mean and standard deviation, compared between the two acquisition systems.
The statistical analysis shows that there is no significant difference between the performance with the left and right arm in either of the systems (paired-samples t-test; Mocap: t9 = 0.038, p = 0.971; Kinect: t9 = 1.424, p = 0.188). We can observe a small offset in the average relative surface area, with the Kinect consistently exhibiting slightly larger values. The average absolute surface area, however, is nearly identical (Figure 4b). The relative surface area is affected by the scaling factor R, which represents the radius of the sphere fitted to the hand trajectory. In the Kinect data the radius is smaller due to the slightly different positioning of the shoulder joints. The results, however, show a consistent offset which can be modeled as a systematic error. The analysis of variance pertaining to the data in Figure 4a shows that the overall performance of the two systems is not significantly different (one-way ANOVA; F3,36 = 1.68, p = 0.189).
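The paired-samples t statistic reported above is straightforward to compute; the p-value additionally requires the Student-t CDF (e.g. from scipy.stats). A minimal sketch, with clearly synthetic data for illustration only (none of the paper's measurements are reproduced here):

```python
import numpy as np

def paired_t(a, b):
    """Paired-samples t statistic: t = mean(d) / (std(d) / sqrt(n)),
    where d = a - b; the test has n - 1 degrees of freedom."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    n = d.size
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return t, n - 1

# Synthetic left/right relative surface areas for 10 hypothetical subjects.
rng = np.random.default_rng(42)
left = 0.6 + 0.1 * rng.standard_normal(10)
right = left + 0.01 * rng.standard_normal(10)
t, df = paired_t(left, right)
```

With real data, `left` and `right` would hold each subject's per-side averages over the three trials, matching the t9 values quoted in the text (df = 9 for 10 subjects).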
3. Conclusions In this paper we presented initial results on the development of a quantitative assessment of upper extremity function through the reachable workspace area using the Microsoft Kinect camera. The comparison with the motion capture system showed that the Kinect-based measurement is sufficiently accurate and robust for this type of evaluation. We plan to perform further evaluation in a larger group of healthy subjects and patients in a clinical setting (e.g. patients with various types of neuromuscular diseases, persons after shoulder injury). We will validate the proposed method against standardized outcome measures pertaining to upper-extremity function. Furthermore, we will investigate how the system could be applied to perform automatic goniometric measurements of the shoulder joint. The developed concept also has potential for tele-medicine applications, where an individual's functional data could be collected at home via an internet-connected Kinect and sent to clinical sites, where further off-line analysis can take place. The 3D visualization proposed in this work could also be applied to augment real-time feedback during the performance of upper-limb exercises.
Acknowledgements The research was supported by the Center for Information Technology Research in the Interest of Society (CITRIS) at the University of California, Berkeley; the National Science Foundation (NSF): #1111965; and the U.S. Department of Education/NIDRR: #H133B031118 and #H133B090001.
References
[1] E. Croarkin, J. Danoff and C. Barnes, Evidence-based rating of upper-extremity motor function tests used for people following a stroke, Physical Therapy 84 (2008), 62-74.
[2] M. Mullaney, M. McHugh, C. Johnson and T. Tyler, Reliability of shoulder range of motion comparing a goniometer to a digital level, Physiother Theory Pract 26 (2010), 327-333.
[3] R. Gajdosik and R. Bohannon, Clinical measurement of range of motion. Review of goniometry emphasizing reliability and validity, Phys Ther 67 (1987), 1867-1872.
[4] N. Klopcar, M. Tomsic and J. Lenarcic, A kinematic model of the shoulder complex to evaluate the arm-reachable workspace, Journal of Biomechanics 40 (2007), 86-91.
[5] G. Kurillo, J. J. Han, R. T. Abresch, A. Nicorici, P. Yan and R. Bajcsy, Development and application of stereo camera-based upper extremity workspace evaluation in patients with neuromuscular diseases, PLoS ONE 7 (2012).
[6] K. Khoshelham and S. O. Elberink, Accuracy and resolution of Kinect depth data for indoor mapping applications, Sensors 12 (2012), 1437-1454.
[7] Š. Obdržálek, G. Kurillo, F. Ofli, R. Bajcsy, E. Seto, H. Jimison and M. Pavel, Accuracy and robustness of Kinect pose estimation in the context of coaching of elderly population, in Proceedings of EMBC, 34th International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, 2012.
[8] H. Edelsbrunner and E. P. Mücke, Three-dimensional alpha shapes, in VVS '92: Proceedings of the 1992 Workshop on Volume Visualization, 1992.