sensors | Article

Markerless Knee Joint Position Measurement Using Depth Data during Stair Walking

Ami Ogawa 1,*, Akira Mita 2, Ayanori Yorozu 3 and Masaki Takahashi 2

1 School of Science for Open and Environmental Systems, Graduate School of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan
2 Department of System Design Engineering, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan; [email protected] (A.M.); [email protected] (M.T.)
3 Graduate School of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan; [email protected]
* Correspondence: [email protected]; Tel.: +81-45-563-1151

Received: 4 September 2017; Accepted: 21 November 2017; Published: 22 November 2017
Abstract: Climbing and descending stairs are demanding daily activities, and the monitoring of them may reveal the presence of musculoskeletal diseases at an early stage. A markerless system is needed to monitor such stair walking activity without mentally or physically disturbing the subject. Microsoft Kinect v2 has been used for gait monitoring, as it provides a markerless skeleton tracking function. However, few studies have used this device for stair walking monitoring, and the accuracy of its skeleton tracking function during stair walking has not been evaluated. Moreover, skeleton tracking is not likely to be suitable for estimating body joints during stair walking, as the posture of the body differs from that in walking on level surfaces. In this study, a new method of estimating the 3D position of the knee joint was devised that uses the depth data of Kinect v2. The accuracy of this method was compared with that of the skeleton tracking function of Kinect v2 by simultaneously measuring subjects with a 3D motion capture system. The depth data method was found to be more accurate than skeleton tracking: the mean 3D Euclidean distance error of the depth data method was 43.2 ± 27.5 mm, while that of skeleton tracking was 50.4 ± 23.9 mm. This method indicates the possibility of stair walking monitoring for the early discovery of musculoskeletal diseases.

Keywords: stair climbing; stair descending; knee joint position; gait measurement; depth data; skeleton tracking; markerless measurement; Kinect v2; VICON; 3D motion capture system
1. Introduction

Due to a significant increase in the number of elderly people aged 65 years and older [1], there is a growing need for a method to find musculoskeletal diseases at an early stage. Stair walking monitoring is considered to be one of the solutions to meet this need. Subtle signs of musculoskeletal diseases are likely to be discovered earlier in tasks involving walking up and down stairs, because ascending and descending stairs are demanding activities [2–4]. In the case of knee osteoarthritis (OA), for example, which is a musculoskeletal disease that raises the risk of death [5], knee pain occurs in stair walking activities before it does in other activities [6]. In addition, compared with controls, patients with knee OA show higher external knee adduction moments during stair walking [7,8], which contributes to the development of chronic knee pain [9]. However, it is difficult for people to recognize the gradual progression of musculoskeletal diseases by themselves. People do not usually go to see a doctor or have their musculoskeletal functions monitored until the diseases have clearly developed. Furthermore, even if they have gait checks made in a hospital, their performance in walking tests is not always the same as what it would be in a real living environment [10].
A markerless method should be used in order not to mentally or physically disturb the subject. Previous studies have used 3D motion capture systems [11–19], electromyography (EMG) measurement devices [13,20,21], and electro-goniometers [20] to monitor stair walking. It is necessary to attach reflective markers or electrodes to the subject's body so that these devices can record data. This makes these devices impractical for everyday living environments. Markerless sensors such as force plates [4,16,19–21] are widely used, but they are expensive and difficult to install. Video cameras must be installed on the side walls of stairs [2], but many households do not have stairwells wide enough. In addition, the use of cameras is best avoided owing to privacy concerns. Stair walking monitoring in daily living environments without any disturbance is what is needed to screen people for clinical treatment.

Depth sensors overcome many of the shortcomings of these other devices. They can be placed so that they face the subject's frontal plane, and markers or electrodes are unnecessary, so installation is convenient. Skeleton tracking, which is a function provided by Kinect for Windows [22], makes it possible to perform a markerless joint position estimation based on machine learning using a number of datasets. Kinect can acquire kinematic and spatiotemporal parameters, and it has been used for level-walking performance evaluations [14,18].

As for stair walking monitoring, on the other hand, skeleton tracking has problems. In our previous study, we found empirically that the skeleton tracking of Kinect v2, the second generation of Kinect, is not good at estimating the knee joint positions during stair climbing [23]. Differences between actual knee joint positions and ones estimated by skeleton tracking are shown in Figure 1. The large differences suggest that a new method of stair walking monitoring should be developed.
Figure 1. Knee joint position estimation during stair climbing. The knee joint positions detected by the skeleton tracking of Kinect v2 (orange circles) differ significantly from the actual ones (green squares).
The goal of our study is to develop a markerless method of stair walking monitoring. The present article reports (1) a new markerless method of knee joint position estimation by body range definition and leg data extraction that uses depth data acquired by Kinect v2, (2) an investigation on knee joint position estimation using the skeleton tracking function of Kinect v2, and (3) a comparison of the applicability of our method and skeleton tracking. A stair walking experiment was carried out with a precise 3D motion capture system.

This paper proceeds as follows. A description of our method is presented in Section 2. Experimental results are described in Section 3, and a discussion of the results and the validity of our method is presented in Section 4. Section 5 concludes the paper.

2. Materials and Methods

2.1. Definition and System Flow

We decided to use Kinect v2 for Windows (Microsoft, Redmond, WA, USA), which is an affordable and sufficiently high-quality depth sensor [24]. The joint positions of the skeleton tracking feature of Kinect v2 do not correspond exactly to the anatomical joint centers. They express positions on the surface of the joint because their values are based on depth data. Moreover, our method is based on depth data, so we aimed to calculate the knee joint position on the surface in the same way as
the skeleton tracking of Kinect v2. Although our target is not the anatomical knee joint center, we use the term 'knee joint position'. For ease of comparison, we put the reflective markers of the 3D motion capture system on the corresponding Kinect v2 knee joint positions, thus avoiding the need for conversion to the anatomical joint center.

The flow of our method is shown in Figure 2. The method consists of three phases: data acquisition, preprocessing, and definition of body parts. The data acquisition phase was programmed in C++, and the other two phases were programmed in MATLAB 2017a (MathWorks Inc., Natick, MA, USA). These phases are completely independent and offline.

[Figure 2 flowchart: Start → Data acquisition (background depth acquisition; walking subjects' depth acquisition) → Preprocessing (3D data calculation of background; 3D data calculation for all frames; background elimination; noise rejection) → Definitions of body parts (body range definition; leg range definition; knee position calculation) → End]
Figure 2. Flow of our system.

2.2. Kinect for Windows v2 by Microsoft
Kinect v2 is composed of an RGB camera, a depth camera, and a microphone array (Figure 3a). It acquires RGB data, depth data, and IR data at 30 fps. It also has a skeleton tracking function and a face tracking function. Its horizontal and vertical visual angles are 70° and 60°, respectively. Depth data can be obtained in the range of 500 to 8000 mm, while the skeleton tracking data can be acquired in the range of 500 to 4500 mm. Although Kinect v2 has no tilt motor, it can be tilted from −32° to 14°. The accuracy of the depth data acquired by Kinect v2 was evaluated in [25]; it was found that the error was less than 4 mm in an elliptical area with a 3.5 m major axis around the subject. We acquired the depth data (Figure 3b) for our method and skeleton tracking data for comparison. As our system is supposed to be used in homes, depth data is more suitable than RGB data in terms of privacy. The coordinate axes of Kinect v2 are defined as follows: mediolateral = X, vertical = Y, and anteroposterior = Z. We will simply refer to 'Kinect v2' as 'Kinect' in what follows.
Figure 3. (a) Kinect v2 and its camera positions; (b) depth data acquired by Kinect v2.

2.3. Experimental Setup and Subjects
Our proposed method was compared with the skeleton tracking of Kinect v2. Their results were evaluated with data collected by a 3D motion capture system (gold standard) that had been used in numerous previous studies [11–19]. The experiment was conducted at the Multi Media Room at Keio University in September 2015. The Multi Media Room is large enough to place a stage, stairs, two Kinects with tripods, and seven cameras with tripods of the 3D motion capture system. The setup is illustrated in Figure 4a. The staircase had six steps and a 1000-mm-high stage. The size of the stage was (W) 1000 mm × (D) 2000 mm × (H) 1000 mm. The width of the stairs was 900 mm, the run length was 304 mm, the riser height was 166 mm, and the stair angle was 28.6°. The difference in height between the stage and the top of the stairs was 170 mm.

Although stair climbing has been observed from the side of the subject in most previous studies, the Kinects had to be set at the front or the back in our study, because our system is to be used in living environments. To determine an appropriate position, two Kinects were used; the lower one was set at 940 mm above the ground and 2200 mm away from the stairs; the upper one was set at 1500 mm above the stage and 1600 mm away from the stairs. The tilt of the upper one was −20°, and the tilt of the lower one was 0°. The tilts and positions of the Kinects were determined so that the whole body of each subject was within the devices' visual range. Also, the distance from each Kinect to the subject was within 4 m because of the limitation of the skeleton tracking function.

For the use of the 3D motion capture system, four reflective markers were attached to the front and the back of the knees of each subject (Figure 4b), and each subject wore easily fitting pants of motion capture suits to reduce the influence of the clothes' texture on the depth data. We recorded the 3D positions of the reflective markers at 200 Hz by using seven cameras (Bonita B10 by VICON [26]) set around the walking course. The beginning of both Kinects' data acquisition was temporally synchronized with a voltage level change recorded by the VICON system, so the time counters of the two Kinects and VICON were synchronized. We used reflective markers of VICON as landmarks (Figure 4c) for alignment of each Kinect and VICON. The coordinates of the landmarks were subtracted from the data collected by the upper and lower Kinects.

Eight healthy students of Keio University, whose information is shown in Table 1, volunteered for the experiment. They provided informed consent. Each subject ascended and descended the stairs three times. The total number of ascents and descents for each subject was six. Though previous studies had considered the effect of the speed of approach to the stairs [27,28], the purpose of this experiment was to examine the accuracy of our method. Thus, subjects started to ascend and descend the stairs from a standing position with a self-selected comfortable speed.
Figure 4. (a) Experimental setup; (b) positions of markers on a subject; (c) positions of markers on stairs as landmarks.
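As an illustration of the landmark-based alignment described above, the following Python sketch (the original pipeline was implemented in C++ and MATLAB; the array names and numeric values here are hypothetical) expresses one Kinect's 3D points in the landmark-anchored frame shared with VICON by subtracting the landmark coordinates:

```python
import numpy as np

def align_to_landmark(points_xyz, landmark_xyz):
    """Express Kinect 3D points relative to a VICON landmark so that both
    Kinects and VICON share a common origin (Section 2.3).

    points_xyz   : (N, 3) array of X, Y, Z points from one Kinect [mm]
    landmark_xyz : (3,) coordinates of that Kinect's landmark [mm]
    """
    return np.asarray(points_xyz) - np.asarray(landmark_xyz)

# Hypothetical usage for one frame of the upper Kinect:
frame = np.array([[120.0, -340.0, 2100.0]])   # one 3D point [mm]
landmark = np.array([50.0, -900.0, 2150.0])   # placeholder landmark [mm]
aligned = align_to_landmark(frame, landmark)
```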
Table 1. Characteristics of the subjects.

No         Sex   Age     Mass (kg)   Height (mm)   Inter-ASIS Distance (mm)   Leg Length (mm)   Knee Width (mm)   Ankle Width (mm)
1          F     24      48          1540          280                        800               100               70
2          M     24      55          1730          290                        910               115               70
3          M     24      80          1770          310                        910               110               90
4          M     21      73          1800          260                        960               110               80
5          F     24      43          1510          270                        780               90                80
6          F     24      51          1625          300                        840               110               75
7          F     25      44          1540          270                        800               100               80
8          F     24      49          1580          250                        830               100               75
Mean ± SD        24 ± 1  55 ± 13     1636 ± 107    279 ± 19                   854 ± 61          104 ± 8           78 ± 6
2.4. Data Acquired by Kinect

Kinect provides depth data in the form of a 424 by 512 matrix per frame. Each cell of the matrix has the distance from Kinect to the object. Figure 3b shows an example of a depth image acquired by Kinect. The depth data in each cell is expressed as the color of the pixel.

The depth data was recorded every frame while the subject ascended and descended the stairs. Also, a clock program saved the time at which every depth frame was collected. All these data were stored in memory once and written out as CSV files when the subject finished the stair walking test. The depth data of the background, which was used for the background elimination phase, was acquired before the stair walking test when no one was around the stairs. Figure 3b is a depth image of the acquired background depth data.

2.5. Preprocessing

2.5.1. Three Dimensional Data Calculation and Tilt Correction

The X matrix and Y matrix were calculated from the acquired depth data, i.e., the Z matrix, and the visual angle of Kinect, as follows:

XK(x, y) = ((col/2 − y + 1) × ZK(x, y) × tan α)/(col/2), (1)

YK(x, y) = ((row/2 − x + 1) × ZK(x, y) × tan β)/(row/2), (2)

where col = 512, α = 35°, row = 424, and β = 30°. The definitions of XK, YK, and ZK are shown in Figure 5a.

When the Kinect was tilted, a tilt correction had to be made to the Z and Y matrices. The method, shown in Figure 5b, used Equations (3) and (4):

YR(x, y) = ZK(x, y) × sin θ + YK(x, y) × cos θ, (3)

ZR(x, y) = ZK(x, y) × cos θ − YK(x, y) × sin θ, (4)

where θ means the tilt angle of the sensor. The preprocessing was applied to both the background data and each frame of data.
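As a concrete illustration of Equations (1)–(4), the sketch below (Python with NumPy; the published pipeline used MATLAB, and the function and variable names are hypothetical) converts one 424 × 512 depth matrix into X, Y, and Z matrices and applies the tilt correction:

```python
import numpy as np

ROW, COL = 424, 512                   # Kinect v2 depth frame size
ALPHA = np.deg2rad(35.0)              # half of the 70° horizontal visual angle
BETA = np.deg2rad(30.0)               # half of the 60° vertical visual angle

def depth_to_xyz(z, tilt_deg=0.0):
    """Apply Equations (1)-(4): compute X and Y from the depth matrix Z [mm]
    and rotate Y and Z by the sensor tilt angle θ."""
    x_idx = np.arange(1, ROW + 1).reshape(-1, 1)   # row index x (1-based, as in the paper)
    y_idx = np.arange(1, COL + 1).reshape(1, -1)   # column index y
    xk = (COL / 2 - y_idx + 1) * z * np.tan(ALPHA) / (COL / 2)   # Eq. (1)
    yk = (ROW / 2 - x_idx + 1) * z * np.tan(BETA) / (ROW / 2)    # Eq. (2)
    th = np.deg2rad(tilt_deg)
    yr = z * np.sin(th) + yk * np.cos(th)          # Eq. (3)
    zr = z * np.cos(th) - yk * np.sin(th)          # Eq. (4)
    return xk, yr, zr

# e.g., for the upper Kinect tilted by −20°:
# X, Y, Z = depth_to_xyz(depth_frame, tilt_deg=-20.0)
```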
Figure 5. (a) Three-dimensional data calculation; (b) tilt correction.
2.5.2. Background Elimination and Noise Rejection

The X, Y, and Z matrices of the background data were subtracted from those of each frame data. Then the 3D Euclidean distances of each cell were calculated. The cells whose 3D Euclidean distances were less than 50 mm were regarded as background. The background cells in the X, Y, and Z matrices of each frame data were assigned zero. Figure 6a,b shows examples before and after the background elimination.
Zero was also assigned to the appropriate cells of X, Y, and Z for the ranges in Equations (5)–(7). These ranges were determined according to the positions of the stairs and Kinect shown in Figure 4a:

Z(x, y) > 4000, (5)

X(x, y) < −800, (6)

X(x, y) > 800. (7)

After that, the histogram of the Z matrix was calculated. The range that had the largest frequency except 0 was defined as the subject's position on the Z axis. All data points that were farther than 500 mm from the subject's position were deleted for noise rejection. Figure 6c shows the same data as Figure 6b, after background elimination and before noise rejection. Figure 6d shows the result after noise rejection.
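A minimal sketch of the background elimination and noise rejection above (Python; the `bins="auto"` histogram setting is an assumption, as the paper does not specify the bin width for the Z histogram):

```python
import numpy as np

def eliminate_background(x, y, z, xb, yb, zb, bg_tol=50.0):
    """Assign zero to background cells (3D Euclidean distance to the
    background frame < 50 mm) and to cells outside Equations (5)-(7)."""
    dist = np.sqrt((x - xb) ** 2 + (y - yb) ** 2 + (z - zb) ** 2)
    bg = (dist < bg_tol) | (z > 4000) | (x < -800) | (x > 800)
    return np.where(bg, 0, x), np.where(bg, 0, y), np.where(bg, 0, z)

def reject_noise(x, y, z, window=500.0):
    """Find the subject as the most frequent nonzero Z bin, then delete
    points farther than 500 mm from that position."""
    counts, edges = np.histogram(z[z != 0], bins="auto")
    k = int(np.argmax(counts))
    subject_z = 0.5 * (edges[k] + edges[k + 1])
    keep = (z != 0) & (np.abs(z - subject_z) <= window)
    return x[keep], y[keep], z[keep]
```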
Figure 6. Background elimination and noise rejection: (a) before background elimination; (b) after background elimination; (c) after background elimination and before noise rejection; (d) after noise rejection.
2.6. Definitions of Body Parts

2.6.1. Body Range Definition and Leg Data Extraction

The histogram of the Y matrix was calculated for the purpose of extracting the body from the recorded data. The bins with more than a frequency of 50 were used as the range of the body. The frequency threshold of 50 was very small, although the bin width was set automatically. For example, in the case of the lowest density of plots, when the smallest person (1540 mm tall) among the participants was at the farthest position from the Kinect, the Z axis distance was about 3300 mm and the body consisted of about 3900 plots. In this case, the maximum frequency was about 400, and the bin width was 100 mm. The cut-off of 50 worked as noise rejection for the Y axis. Among the extracted bins, the largest Y value was defined as the start position of the subject's body, and the smallest Y value was defined as the end position of the subject's body along the Y axis. Figure 7a shows the extracted 3D plots of the subject's body.

The leg range was defined as the lower half of the height; 50% of the height from the bottom on the Y axis was defined as the boundary dividing the body into upper and lower parts. The sitting height ratio (SHR), which is calculated as (sitting height/stature) × 100, is commonly used for body proportion evaluations [29]. According to a worldwide study on body proportion, the mean SHR of adults in a nation varies from 47.3 to 55.8 [30]. Thus, 50% is justified as a percentage of leg length to stature. The yellow line in Figure 7b expresses the defined boundary for the SHR of 50%. The plots below the boundary were taken to be the leg area.
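The body range definition above can be sketched as follows (Python; `bins="auto"` stands in for the automatic bin width mentioned in the text, and the function name is hypothetical):

```python
import numpy as np

def body_and_leg_boundary(y, min_count=50):
    """Histogram the nonzero Y values, keep bins with frequency > 50, take
    the largest/smallest kept Y as the body's start/end, and place the leg
    boundary at 50% of the body height (the SHR-justified split)."""
    counts, edges = np.histogram(y[y != 0], bins="auto")
    kept = np.where(counts > min_count)[0]
    y_bottom, y_top = edges[kept[0]], edges[kept[-1] + 1]
    boundary = y_bottom + 0.5 * (y_top - y_bottom)
    return y_top, y_bottom, boundary   # plots below `boundary` form the leg area
```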
Figure 7. Body range definition and leg data extraction: (a) body extraction by using a histogram of Y values; (b) body plots and the boundary (yellow line) separating the body into upper and lower parts; (c) example of X-Z section at an arbitrary Y position; (d) representative positions of left and right legs; (e) separated left leg and its representative positions; (f) separated right leg and its representative positions.
Two clusters appear in the X-Z section in the leg area. An example in an arbitrary X-Z section is shown in Figure 7c. In this figure, the clusters form a curve, because the leg is in the shape of an ellipse. Thus, we defined the peak plots on the Z axis of the two clusters as the representative positions of the left and the right legs in the X-Z section. The representative positions of both legs were defined all along the positions on the Y axis (Figure 7d). As the surface of the subject's body was not so smooth, there were actually more than two peaks in the X-Z section, and it was not easy to determine the optimal representative positions of each leg. Therefore, we selected the two peaks that had the largest peak width and set a rule that the peaks had to be separated by more than each subject's ankle width (the thinnest leg area) to determine the optimal representative positions of each leg.

The boundary position on the X axis was calculated as the average of all representative positions of the left and right legs. The plots were separated into left and right legs by the position of the boundary on the X axis. The representative positions in each X-Z section were then re-determined by finding the minimum plots (Figure 7e,f).

2.6.2. Knee Joint Position Calculation

The knee joint position was defined as follows. Figure 8a,b shows plots of the left and right legs in the Z-Y plane. The yellow lines in the figures express the representative positions of the left and right legs at each height, and the purple lines connect the top and bottom representative positions. The knee joint position was defined as the point on the yellow line that had the largest distance from this purple line. Figure 8c,d shows the estimated knee joint positions as green dots and the leg areas as red dots.
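Given the representative positions (the yellow line), the knee joint definition of Section 2.6.2 reduces to a farthest-point-from-chord search, sketched here in Python (array and function names hypothetical):

```python
import numpy as np

def knee_from_representatives(rep_y, rep_z):
    """Return the representative position farthest from the straight line
    (the purple line) joining the bottom and top representative positions.

    rep_y, rep_z : 1D arrays of one leg's representative positions, ordered
                   along the Y axis
    """
    p0 = np.array([rep_y[0], rep_z[0]])     # bottom representative position
    p1 = np.array([rep_y[-1], rep_z[-1]])   # top representative position
    chord = p1 - p0
    rel = np.stack([rep_y, rep_z], axis=1) - p0
    # perpendicular distance to the chord via the 2D cross product
    dist = np.abs(rel[:, 0] * chord[1] - rel[:, 1] * chord[0]) / np.linalg.norm(chord)
    k = int(np.argmax(dist))
    return rep_y[k], rep_z[k], dist[k]
```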
Figure 8. Knee joint position extraction: (a) plots of the left leg with representative positions (yellow line) and line connecting the top and bottom of the representative positions (purple line); (b) plots of the right leg with representative positions (yellow line) and line connecting the top and bottom of the representative positions (purple line); (c) plots of the left leg area (red dots) and estimated knee joint position of the left leg (green dot); (d) plots of the right leg area (red dots) and estimated knee joint position of the right leg (green dot).
There was a problem that the knee joint position might be falsely estimated when the leg is extended. Therefore, whether the leg was extended or not was checked after the knee joint position was estimated once. The leg was considered to be extended if the distance from the estimated knee joint position to the purple line was less than a threshold, which was determined to be 100 mm after examining the subjects' data. In the case of leg extension, the distance between the estimated knee joint position and the latest knee joint position was calculated. If this distance was larger than a second threshold, the leg area was reduced to an area covering a distance of 100 mm on the Y axis from the latest knee joint position. Then the purple and yellow lines were drawn again, and the knee joint position was re-estimated. By extracting the leg area around the correct knee joint position, the risk of misrecognition could be reduced. The second threshold was determined to be 150 mm by considering the sampling time and the displacement of the knee joint; this threshold was also used to reject abnormal knee joint position data, as described in Section 2.7. The range of 100 mm was determined through consideration of the frame rate and the walking speed. This method uses the latest knee joint position, which is why it was applied only after the knee joint position had been acquired once.
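The extension check above can be expressed as the following condition (a sketch under the stated 100 mm and 150 mm thresholds; the re-estimation step itself reuses the chord-search routine on the reduced leg area):

```python
import numpy as np

def needs_reestimation(knee_yz, chord_dist, last_knee_yz,
                       extension_tol=100.0, jump_tol=150.0):
    """True if the leg looks extended (knee candidate closer than 100 mm to
    the purple line) and the candidate jumped more than 150 mm from the
    latest knee position; the leg area is then re-cut to 100 mm on the Y
    axis around the latest knee and the knee is estimated again."""
    extended = chord_dist < extension_tol
    jumped = np.linalg.norm(np.asarray(knee_yz) - np.asarray(last_knee_yz)) > jump_tol
    return extended and jumped
```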
2.7. Data Analysis

We obtained the depth data and skeleton tracking data of the subjects from Kinect at about 30 frames per second and the 3D position data of four markers from the VICON system at 200 frames per second. We estimated the 3D knee joint positions using the proposed method. To calculate the error for each point of Kinect data, the VICON data nearest to that time were selected. In order to avoid the errors caused by interpolation for upsampling, the VICON data were downsampled to the same number of frames as that of Kinect. Data points that showed sudden changes (>150 mm) from the previous data points were rejected as abnormal.

The present method was evaluated using Pearson's correlation coefficients (r), 95% confidence intervals of the intraclass correlation coefficient (ICC) case 2 (ICC (2,1)), signal-to-noise ratio (SNR), and the 3D Euclidean distance (3D error). The front (back) knee joint position was estimated from depth data captured by the upper (lower) Kinect device during stair ascents. The front (back) knee joint position was estimated from the depth data captured by the lower (upper) Kinect device during stair descents. All analyses were applied to compare our method with VICON and to compare the Kinect skeleton tracking with VICON.

The r and SNR values were applied to every frame of each trial of each subject. On average, 230 frames (samples) were collected for each trial. Based on the calculation of G*Power 3 1.9.2 (Universität Düsseldorf, Düsseldorf, Germany), the appropriate sample size for correlation analysis was 82. Thus, the sample size was sufficient. Values for r were calculated using MATLAB 2017a to assess the linear relationship between the knee joint positions acquired by the two methods. The ICC (2,1), which expresses the inter-rater reliability, was used to evaluate the agreement between the results obtained by the two measurement systems. This was calculated using IBM SPSS Statistics, version 24 (IBM, Armonk, NY, USA). We used SNRs based on the variance of the signals to quantify the noise relative to the VICON data, as described previously [18]. The SNRs, defined in Equations (8)–(10), included the usual transformation into decibels [31]:

SNRx = 20 log10(variance(XVICON)/variance(Xerror)), (8)

SNRy = 20 log10(variance(YVICON)/variance(Yerror)), (9)

SNRz = 20 log10(variance(ZVICON)/variance(Zerror)), (10)

where Xerror, Yerror, and Zerror are the errors of the values estimated by our method compared with the VICON values:

Xerror = XProposed − XVICON, (11)

Yerror = YProposed − YVICON, (12)

Zerror = ZProposed − ZVICON. (13)
The 3D error between the estimated position and the marker position of VICON was calculated as follows:

3Derror = √(Xerror² + Yerror² + Zerror²). (14)
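A sketch of this evaluation pipeline (Python; `kinect_t` and `vicon_t` are hypothetical 1D timestamp arrays) that pairs each Kinect sample with the nearest-in-time VICON sample and computes Equations (8)–(14):

```python
import numpy as np

def evaluate(kinect_xyz, kinect_t, vicon_xyz, vicon_t):
    """Downsample VICON by nearest-time selection (no interpolation), then
    compute per-axis errors, SNRs, Pearson's r, and the 3D error.

    kinect_xyz : (N, 3) knee positions from the proposed method [mm]
    vicon_xyz  : (M, 3) marker positions from VICON [mm], with M > N
    """
    idx = np.argmin(np.abs(vicon_t[None, :] - kinect_t[:, None]), axis=1)
    vicon = vicon_xyz[idx]
    err = kinect_xyz - vicon                                          # Eqs. (11)-(13)
    snr = 20 * np.log10(np.var(vicon, axis=0) / np.var(err, axis=0))  # Eqs. (8)-(10)
    err3d = np.linalg.norm(err, axis=1)                               # Eq. (14)
    r = np.array([np.corrcoef(kinect_xyz[:, a], vicon[:, a])[0, 1] for a in range(3)])
    return r, snr, err3d
```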
3. Results

As shown in Table 2, all results of ry and rz are larger than 0.9. This indicates a strong correlation, which is defined as an 'excellent relationship' [18,32], between our method and the VICON measurement in the case of the Y and Z axes position measurement. In the case of the X axis position measurement, on the other hand, the values of rx are so small that our method and the VICON measurement are in a 'poor' or 'moderate relationship' [18,32]. The tendency of the SNRs is similar to that of the values of r. The values are assessed as follows [18]: SNR < −20 dB means that the data are 'altered or influenced by noise'; −20 dB ≤ SNR ≤ 20 dB means that the data are 'influenced by small noise or small systematic bias'; and SNR > 20 dB means that the data are 'accurate enough'. The SNRy and SNRz values indicate that the data of the Y and Z axes are 'accurate enough', but SNRx indicates that the data of the X axis are 'influenced by small noise'. The ICC (2,1) values are over 0.800 for the X axis and over 0.900 for the Y and Z axes; these are considered 'good' and 'excellent reliability', respectively [33]. The results for the Kinect skeleton tracking showed a similar tendency to those for our method.

The results of our method are shown in Figure 9a together with those of the Kinect skeleton tracking for comparison. For all conditions, the 3D errors for our method, except that of the third condition (lower Kinect, descent captured from the front), were smaller than those for the Kinect skeleton tracking. This means that our method has a smaller bias error than skeleton tracking, except for the third condition.
Figure 9. 3D error of our method compared with the Kinect skeleton tracking method: (a) unadjusted 3D error; (b) zero-mean shifted 3D error; (c) relation between resolution and distance from the Kinect sensor; (d) normalized 3D error. Data are shown as mean ± standard deviation.
To eliminate the bias error, the mean value of the data for each dimension was subtracted from the data of each dimension, for example:

zeromeanXProposed = XProposed − mean(XProposed), (15)

zeromeanXVICON = XVICON − mean(XVICON). (16)
Figure 9b shows the zero-mean shifted 3D errors for our method, together with those for skeleton tracking. Compared with Figure 9a, it can be seen that the zero-mean shifted error for the third condition (lower Kinect, descent captured from the front) of our method is significantly improved from the un-shifted data, as there was a large bias error. The skeleton tracking results also had bias errors.

The distance from the Kinect affects the resolution of the data. To normalize the results to mitigate this issue, we used the following process. Four plots were chosen per six risers to examine the resolution of Kinect at various distances. The distance and the number of plots between pairs of selected plots on the X and Y axes were acquired. We calculated the resolution for each axis as the distance between the two selected plots divided by the number of plots. Figure 9c shows that there is a linear relationship between the distance and the number of plots, and the data for both axes correlate well. These relations are expressed below:

Resolutionx = 0.0027 × Distance − 0.0265, (17)

Resolutiony = 0.0028 × Distance − 0.1238. (18)
To consider the effect of resolution, the 3D error of each frame was divided by its resolution, as calculated using the equations above. Equation (17) was used for normalizing the 3D error because the distances of the X axis were larger than those of the Y axis. Figure 9d shows that the normalized 3D error of our method for both ascent and descent was the smallest when the data were captured from the back. Conversely, the skeleton tracking method showed better results when the data were measured from the front compared with from the back.

Table 2. Accuracy of knee joint positions estimated by our method relative to VICON values: Pearson's correlation coefficients and signal-to-noise ratio.

                                 rx             ry             rz             SNRx           SNRy          SNRz
Ascent    Upper Kinect, Front    0.561 (0.180)  0.983 (0.011)  0.998 (0.001)  −4.21 (4.57)   30.3 (5.91)   51.0 (5.72)
          Lower Kinect, Back     0.687 (0.162)  0.978 (0.022)  0.998 (0.002)  0.744 (7.05)   28.7 (6.23)   49.6 (6.96)
Descent   Lower Kinect, Front    0.339 (0.222)  0.994 (0.004)  0.996 (0.002)  −10.7 (5.14)   36.8 (4.11)   43.2 (5.73)
          Upper Kinect, Back     0.608 (0.223)  0.994 (0.007)  0.999 (0.001)  −5.47 (5.58)   44.6 (9.66)   54.6 (5.90)
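A one-line sketch of the resolution normalization described above (Python; `distance_mm` is the hypothetical per-frame subject distance):

```python
def normalized_3d_error(err3d_mm, distance_mm):
    """Divide each frame's 3D error by the X-axis resolution at the subject's
    distance, using Equation (17) (chosen because the distances on the X axis
    were larger than those on the Y axis)."""
    resolution_x = 0.0027 * distance_mm - 0.0265   # Eq. (17)
    return err3d_mm / resolution_x
```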
4. Discussion

4.1. Accuracy Comparison and Applications of Our Method

Data were successfully acquired for all subjects, and our method calculated the subjects' knee joint positions during stair walking from Kinect's depth data. Since our system works without any restrictions on the subject, it can be installed in houses and used for daily stair walking measurements. The most remarkable thing is that it can be used by non-professionals. The system is simple to set up: the only hardware is a Kinect on a tripod connected to a PC. Our system may be able to be used to screen subjects for abnormal gait on stairs, and it may enable doctors to monitor the rehabilitation of patients after they have been discharged from a hospital by installing it in the patients' houses.

From Figure 9d, we conclude that our method estimates the knee joint position more accurately than skeleton tracking during stair walking tasks. Kinect is normally set up facing the target subject,
as it was originally developed as a game controller. Thus, even when the subject turns his or her back to the Kinect, the skeleton tracking recognizes the subject as if he or she were facing it. This fact is probably why the results of skeleton tracking were better when the data were captured from the front than from the back. From these results and those of the previous study shown in Figure 1, it is clear that the accuracy of skeleton tracking depends on the tilt of the Kinect. The ratio of run to rise of a staircase has an influence on stair walking monitoring. It is important that Kinect can capture the subject's whole body while it is less tilted. However, it is difficult to do so when the gradient of the staircase is steep, as in Figure 1 (an angle of 35.6°). Accordingly, our method is likely to be more versatile than skeleton tracking.

A previous study indicated that a 3D error of more than 0.05 m for the skeleton joint positions was a large error in several tasks [18]. That study used a clinical parameter and knee displacement in the evaluation of walking on the spot. The accuracy and reliability of all clinical analyses were high enough, while the accuracy (3D error) of the knee joint position was 0.04 ± 0.01 m. On the other hand, our method had accuracies of 60.1 ± 30.1 mm (lower Kinect, ascent captured from the back) and 43.2 ± 27.5 mm (upper Kinect, descent captured from the back). Although both of these values had larger errors than in the previous study, the r values of our method were better than theirs on the X, Y, and Z axes. Therefore, we conclude that our method is accurate enough to estimate knee joint positions. The source of the 'poor' or 'moderate relationship' with the reference values in rx can be explained by the small movement of the knee joint on the X axis. Although the absolute errors on each axis are not reported here in order to avoid redundancy, their orders of magnitude were the same. Thus, the correlation of the 3D positions of the two methods can be considered 'good'.

The ability of our method to acquire actual clinical parameters should be confirmed in order to clarify its usefulness. In addition, we tested our method on only young healthy subjects. Additional experiments should be conducted with elderly persons and patients with musculoskeletal diseases. Moreover, our experiment used a six-step staircase, and the maximum distance from the subjects to the Kinect was about 4 m. That means our results only cover stair walking tasks in an area up to 4 m from the Kinect. Its usability on other staircases, which have different ratios of runs to rises, should also be examined.

4.2. Descent

Figure 9a,b indicates that the lower Kinect had significantly poorer results and that its data had large bias errors. Figure 10a–c shows the knee displacements on the X, Y, and Z axes of a trial during a descent captured by the upper Kinect, and Figure 10d–f shows those captured by the lower Kinect. Comparing these figures, it can be seen that the plots of our method for the Y axis in Figure 10e of the lower Kinect are rather different. They are always higher by a certain amount than the VICON data. Figure 10h shows subject 6's descent in the Z-Y plane captured by the lower Kinect. For comparison, Figure 10g shows the same subject's descent captured at the same time as Figure 10h but with the upper Kinect. The green dot shows the estimated knee joint position, and the red dots show the leg area. In Figure 10h, the silhouette of the knee area is round.
The leg area plots in 3D of the same data as in Figure 10h are shown in Figure 10i. The arrows in Figure 10h,i point to the correct knee joint positions, where the reflective marker of VICON was placed. Depth plots could not be acquired at the reflective markers, so their positions appear as holes in the data in Figure 10i. According to the two figures, it can be seen that the estimated knee joint position (the green dot) is located above the correct knee joint position. It is hard to find the correct knee joint position by searching for the position that is farthest from the purple line, as in Figure 8a, because the front silhouette of the knee area is round.
Figure 10. Front silhouette around the knee is round during descent: (a) knee displacement on X axis during stair descent captured by upper Kinect; (b) knee displacement on Y axis during stair descent captured by upper Kinect; (c) knee displacement on Z axis during stair descent captured by upper Kinect; (d) knee displacement on X axis during stair descent captured by lower Kinect; (e) knee displacement on Y axis during stair descent captured by lower Kinect; (f) knee displacement on Z axis during stair descent captured by lower Kinect; (g) estimated knee joint position (green dot) and leg area (red dots) during descent captured by upper Kinect; (h) estimated knee joint position (green dot), leg area (red dots), and the position of the reflective marker of VICON (data hole indicated by an arrow) during descent captured by lower Kinect; (i) 3D visualization of Figure 10h; the estimated knee joint position (green dot) is located above the position of the reflective marker of VICON (hole indicated by an arrow).
The above results were compensated by subtracting a constant bias error, as described in Section 3. However, in Figure 9d, the results for descents captured from the back are better than those captured from the front. Hence, we conclude from these results that stair descents should be captured from the back. The results likely include not only the bias error but also other errors.
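As a concrete illustration of the compensation step mentioned above, the sketch below estimates a constant bias as the mean per-axis offset between the method and the VICON reference over a set of frames and subtracts it. The array names are hypothetical; the actual procedure is the one described in Section 3.

```python
import numpy as np

def compensate_bias(est, ref):
    """est, ref: (N, 3) time-aligned knee positions from our method
    and from VICON (hypothetical arrays, in mm)."""
    bias = (est - ref).mean(axis=0)   # constant per-axis offset (X, Y, Z)
    return est - bias                 # bias-corrected estimates

# Residual errors after subtraction are the "other errors" noted above:
# they vary with posture and viewpoint, so a constant offset cannot
# remove them.
```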
4.3. Ascent

Figure 11a–c shows the knee displacement on the X, Y, and Z axes, as captured by the upper Kinect during stair ascents, and Figure 11d–f shows those captured by the lower Kinect. Comparing these figures, it is clear that the errors in the Y axis data captured by the upper Kinect are larger than the others (Figure 11b). One of the main error sources was that the method did not work well when the silhouette around the knee was round. Figure 11g,h contains data of subject 8 climbing the stairs, captured simultaneously from the front and back by the upper and lower Kinect. The data show that the back of the knee bent at a sharper angle than the front.
Figure 11. Comparison of results during ascent captured from the front and the back: (a) knee displacement on X axis during stair ascent captured by upper Kinect; (b) knee displacement on Y axis during stair ascent captured by upper Kinect; (c) knee displacement on Z axis during stair ascent captured by upper Kinect; (d) knee displacement on X axis during stair ascent captured by lower Kinect; (e) knee displacement on Y axis during stair ascent captured by lower Kinect; (f) knee displacement on Z axis during stair ascent captured by lower Kinect; (g) estimated knee joint position (green dot) and leg area (red dots) during ascent captured by upper Kinect; (h) estimated knee joint position (green dot) and leg area (red dots) during ascent captured by lower Kinect.

According to the heights of ascent in Figure 9b,d, we can see that the lower Kinect data were affected by the sensor resolution, as the distance from the lower Kinect was longer than that from the upper Kinect. Conversely, other disturbances affected the upper Kinect data more strongly, as indicated by the larger standard deviation of the upper Kinect data. In light of the discussion in Section 4.2, we conclude that stair walking should be captured from the back, not the front.
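To see why distance matters, consider the lateral size of one depth pixel: it grows linearly with range, so the farther sensor resolves the knee region more coarsely. A back-of-the-envelope sketch using nominal Kinect v2 depth-stream specifications (the sensor-to-knee distances are hypothetical):

```python
import math

# Nominal Kinect v2 depth stream: 512 x 424 pixels, ~70.6 deg horizontal FOV.
H_PIXELS, H_FOV_DEG = 512, 70.6

def lateral_mm_per_pixel(distance_mm):
    """Approximate lateral footprint of one depth pixel at a given range."""
    width = 2 * distance_mm * math.tan(math.radians(H_FOV_DEG / 2))
    return width / H_PIXELS

for d in (1500, 2500, 3500):  # hypothetical sensor-to-knee distances
    print(f"{d} mm away: ~{lateral_mm_per_pixel(d):.1f} mm per pixel")
# The footprint grows linearly with distance, which is consistent with
# the resolution effect seen in the lower Kinect data.
```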
4.4. Possible Sources of Error

We checked all errors that behaved differently from the others, such as the plots pointed to by the arrow in Figure 12a. Apart from bias errors, there were few errors while the leg was bent; accordingly, the method provides accurate knee joint positions when the knee joint angles are large.

The strongest error sources involved misrecognition of the thigh position, and there were two of them. One was the round shape of the area around the knee, as already mentioned in Sections 4.2 and 4.3. The other was the influence of the reflective markers of VICON. The examples shown in Figure 12b,c correspond to the times indicated by the arrows in Figure 12a. From Figure 12c, it can be seen that the reflective marker on the thigh made the surface uneven, causing a deficiency of data points, and this region was falsely recognized as the knee joint position. The data points of the reflective markers could not be acquired; they appear as holes in Figures 10i and 12c. In total, 674 frames were falsely recognized for this reason: 444 frames of ascents and 134 frames of descents captured by the upper Kinect, and 96 frames of descents captured by the lower Kinect. Incidentally, misrecognition of the shin position happened in 34 frames during ascents captured by the upper Kinect.

The errors also depended on the subject's bodily proportions and posture. As an example of gender differences, misrecognition due to the bulge of the gluteus maximus muscle happened only in the male subjects' data: the boundary between the gluteus maximus and the thigh was often falsely recognized as the knee joint position during ascents captured from the back. A misrecognized frame of subject 6, who is male, is shown in Figure 12d, and ascending data captured from the back of subject 10, who is female, are shown in Figure 12e. The male subject had a more muscular build, and when his leg was almost completely extended, other uneven parts of the leg surface were very likely to be falsely recognized as knee joint positions. All such errors happened during ascents captured by the lower Kinect. On the other hand, the left and right legs were relatively difficult to distinguish in the female subjects' data, as the female subjects tended to position their legs closer together when they walked. Figure 12f shows an error caused by confusion of the left and right leg data: some of the left leg plots were misrecognized as right leg plots, and as the left leg was positioned ahead, the representative plots were defined incorrectly. In some frames, the left and right legs could not be separated by the vertical plane, so a more complex boundary plane should have been used; a minimal version of the vertical-plane split is sketched below.
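The sketch below shows the vertical-plane separation in its simplest form. The fixed lateral threshold and array names are assumptions for illustration, and the fixed plane is exactly what fails when the feet come close together:

```python
import numpy as np

def split_legs(points, boundary_x=None):
    """Split leg-area points into left/right by a vertical plane x = boundary_x.

    points: (N, 3) array with columns (X lateral, Y vertical, Z depth), in mm.
    If boundary_x is None, use the median lateral coordinate.
    """
    if boundary_x is None:
        boundary_x = np.median(points[:, 0])
    left = points[points[:, 0] < boundary_x]
    right = points[points[:, 0] >= boundary_x]
    return left, right

# When the legs nearly touch, points of one leg cross the plane and are
# assigned to the other leg (Figure 12f); a curved or per-height boundary
# would be needed in those frames.
```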
Subjects who moved their hands in front of their legs when climbing the stairs also produced errors, which happened only in frames of ascents captured from the front by the upper Kinect. When a subject climbed the stairs, the legs moved up and came closer to the arms, and we speculate that this caused the errors. Figure 12g is an example of an error caused by a hand. The hands should have been rejected in the body range definition. Nevertheless, such noise was never observed in the data captured from the back, because the subjects moved their arms only in front of the body, never behind it. Therefore, the hands do not have to be rejected as long as the system captures the subject from the back; this fact reinforces the conclusion that stair walking should be observed from the back.

The density of the data plots became quite low when the subjects' shins were parallel to the radial rays of the depth sensor. The case of the upper Kinect is shown in Figure 12h. In our study, this situation occurred in frames captured from the front; examples are shown in Figure 12i,j. Despite the low density of plots on the shins, the knee joint position estimation worked well.
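The low shin density has a simple geometric reading: a depth sensor samples a surface at a rate roughly proportional to the cosine of the angle between the viewing ray and the surface normal, so a shin lying almost along the rays returns very few points. A small illustrative calculation (the angles are hypothetical, loosely inspired by the geometry of Figure 12h):

```python
import numpy as np

# Angle between the viewing ray and the shin surface normal.
# 0 deg  -> the ray hits the shin face-on (dense sampling);
# 90 deg -> the shin is parallel to the rays (no returns).
for deg in (0, 40, 70, 85):
    density = np.cos(np.radians(deg))  # relative sample density
    print(f"{deg:2d} deg off face-on: {density:.2f} of the face-on density")
# At 85 deg, under 10% of the face-on density remains, which matches the
# sparse shin plots seen in Figure 12i,j.
```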
Figure 12. (a) Subject 8's knee displacement on Y axis during third stair ascent (abnormal data indicated by an arrow); (b) estimated knee joint position (green dot) that was misrecognized and leg area (red dots); (c) a data hole made by a reflective marker on the thigh was falsely recognized as the knee joint position; (d) instance in which the boundary of the gluteus maximus and thigh was falsely recognized during ascent captured from the back in a male subject; (e) comparison with Figure 12d: the same situation for a female subject; (f) error caused by confusion of left and right leg data; (g) error caused by noise due to hands; (h) location giving the lowest density of plots of the shins in the case of the upper Kinect; (i) example of low-density plots of the shins during ascent captured by the upper Kinect; (j) example of low-density plots of the shins during descent captured by the lower Kinect.
5. Conclusions

This study suggested that a markerless measurement system based on body range definition and leg data extraction can be used to estimate knee joint positions during stair walking activities. The accuracy of our method, which uses depth data from Kinect v2, and that of the skeleton tracking function of Kinect v2 were evaluated using data acquired by a 3D motion capture system as a reference. The experimental results indicate that our estimation method is more accurate during stair walking than the skeleton tracking of Kinect v2. They show that our system can estimate the knee joint positions within 43.2 ± 27.5 mm of 3D Euclidean distance error, while Pearson's correlation coefficients for the anteroposterior dimension (Z axis) and the vertical dimension (Y axis) are over 0.9. The results also suggest that stair walking activity should be measured from the back, because there is less noise in the depth data captured from the back than from the front. In the future, an experiment will be carried out with more subjects, including people who have recently developed musculoskeletal diseases, and with variously designed staircases, in an attempt to evaluate the practicality of the present knee joint position estimation method.
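For readers who want to reproduce the headline accuracy figures from their own recordings, the sketch below computes the mean ± SD of the 3D Euclidean distance error and the axis-wise Pearson correlations between estimated and reference trajectories. The array names are hypothetical, and this is not the authors' evaluation code:

```python
import numpy as np

def evaluate(est, ref):
    """est, ref: (N, 3) time-aligned knee positions in mm.

    Returns mean and SD of the per-frame 3D Euclidean error,
    plus Pearson's r for each axis (X, Y, Z).
    """
    err = np.linalg.norm(est - ref, axis=1)          # per-frame 3D error
    r = [np.corrcoef(est[:, k], ref[:, k])[0, 1]     # axis-wise correlation
         for k in range(3)]
    return err.mean(), err.std(), r

# A result comparable to the paper would read, e.g., mean error 43.2 mm,
# SD 27.5 mm, with r > 0.9 on the Y and Z axes.
```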
Acknowledgments: This work was supported by the Japan Society for the Promotion of Science (JSPS KAKENHI Grant-in-Aid for Scientific Research, Grant Number 15H02890) and, in part, by a MEXT Grant-in-Aid for the "Program for Leading Graduate Schools."

Author Contributions: A.O. developed and implemented the algorithm for estimating knee joint positions, analyzed the results, and wrote the paper; A.M. supervised A.O.'s work and analyzed the results; M.T., A.Y. and A.O. organized and conducted the experiment; all authors contributed to revising this paper.

Conflicts of Interest: The authors declare no conflict of interest.