Towards a neuro fuzzy tennis coach - IEEE Xplore

25-29 July, 2004 * Budapest, Hungary

Towards a Neuro Fuzzy Tennis Coach: Automated Extraction of the Region of Interest (ROI)

Boris Bačić
Auckland University of Technology, Private Bag 92006, Auckland 1020, New Zealand
E-mail: [email protected]

Abstract - This article introduces a case study on building an automated tennis coach in the area of connectionist methods for data analysis and modelling of human sporting activities. The main part of the article proposes automatic extraction of the Region Of Interest (ROI) from multidimensional time series representing human motion. Using this method with off-line learning on a small test data set, this research demonstrates that it is possible to build a small-scale prototype of an automated tennis coach that can provide simple feedback on players' technique. In the process, this study shows that the automated ROI selection method presented in this article was able to detect 100% of ROIs from the test data.

I. INTRODUCTION

Sport coaching can be represented as a cyclic process of observing what is occurring and providing constructive intervention. This study introduces an experimental system that can provide simple feedback similar to the advice a coach might give to encourage a player to follow simple heuristic rules [1]. An example of such feedback is "good stroke" or "more shoulder turn". An obvious question is: in order to build an automated coach, what skills and knowledge are needed and how should we integrate them?

To develop an artificial intelligence (AI) or even evolving intelligence (EI) coach for any sport, the research effort must be holistic; that is, it must incorporate aspects of several research directions, as depicted in Figure 1.

Expected research activities are: modelling and testing of predictive theories (experimental hypotheses); converting coaching knowledge into knowledge suitable for building connectionist systems; exploratory and explanatory research to develop and test hypotheses/heuristics involving cause-effect relationships; and development of a working system as a proof of concept.

Prior research addressing tennis data processing includes the classification of TV sports news [2], tennis stroke classification [3-7], and identifying players, tennis balls [8], and a tennis court in a video. Which data acquisition technique will prevail in any given circumstance is still not clear [9, 10]. Examples are obtrusive wearable electromagnetic sensors [11], infra-red reflective markers [10] and non-obtrusive camera systems [8]. Many numerical methods, including 2D and 3D motion analysis algorithms for human body modelling, are published or available on the Internet [12-16].

II. PROTOTYPE

A prototype automated coach should be able to observe machine-interpreted biomechanical data and provide humanlike feedback.

0-7803-8353-2/04/$20.00 © 2004 IEEE

Fig. 1. Holistic Research Efforts in Automated Coaching

A. Milestone 1: Accepting the First Hypothesis

A first prototype has been developed and tested [17]. In this instance all the ROI data were prepared manually by an expert, a professional tennis coach. Input data acquisition: 2D data were obtained from 9 cameras recording infrared reflective markers at a rate of 50 fps. With some manual intervention the data were converted into a 3D stick figure (Fig. 2). Architecture design: a Radial Basis Function (RBF) connectionist system was implemented to classify good and bad strokes.

Postmortem: The hypothesis that a machine observing a short period of time (around 100 ms) can classify good and bad tennis shots/technique has been proved. With a small data set, “leave one out” cross validation was used. No attempts were made at this stage to obtain synthetic data or interpolated data from data containing missing values. The RBF architecture achieved a classification accuracy of 99.9%. Input data were not filtered; polynomial curve fitting acted as a low-pass filter. While the system could indicate good or bad technique, it could not be used to provide any information related to injury prevention. Input data needed to be insensitive to a player’s position.
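The "leave one out" evaluation mentioned above can be illustrated with a minimal sketch. It is not the prototype's implementation: the network below is a generic Gaussian RBF classifier with one centre per training sample, and the two-dimensional feature vectors, the kernel width and the ridge term are all made-up values chosen only so the example runs.

```python
import numpy as np

def rbf_design(X, centres, width):
    """Gaussian RBF activation of every sample in X for every centre."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf(X, y, width=1.0, ridge=1e-3):
    """Use each training sample as a centre; solve regularised least squares."""
    H = rbf_design(X, X, width)
    w = np.linalg.solve(H.T @ H + ridge * np.eye(len(X)), H.T @ y)
    return X, w

def predict_rbf(model, X, width=1.0):
    centres, w = model
    return (rbf_design(X, centres, width) @ w > 0.5).astype(int)

def leave_one_out_accuracy(X, y, width=1.0):
    """Train on all strokes but one, test on the held-out stroke, repeat."""
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        model = fit_rbf(X[keep], y[keep], width)
        hits += int(predict_rbf(model, X[i:i + 1], width)[0] == y[i])
    return hits / len(X)

# Hypothetical toy features: 8 "good" and 8 "bad" strokes (not the paper's data).
rng = np.random.default_rng(0)
good = rng.normal(loc=2.0, scale=0.3, size=(8, 2))
bad = rng.normal(loc=-2.0, scale=0.3, size=(8, 2))
X = np.vstack([good, bad])
y = np.array([1] * 8 + [0] * 8)
acc = leave_one_out_accuracy(X, y)
```

Leave-one-out is a natural choice when the data set is small, because every stroke is used for testing exactly once while the remaining strokes are all available for training.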

B. Milestone 2: Automating a Neuro Fuzzy Tennis Coach

After the initial hypothesis had been proved in Milestone 1, the next step was to design an automated low-level data provider for the Region Of Interest (ROI), as shown in Figure 3. In Figure 2, a player holding a tennis racquet is visually represented as a stick figure from a 3D marker time series, presented in a manner similar to frames in a video. During the initial experiment, markers attached to the racquet were required to classify good and bad stroke techniques. For the ROI calculation, only three markers attached to the player’s body, at approximately half of the player’s height, have been used:
1. PSGT ... Player’s Side Great Trochanter,
2. SSGT ... opposite Side Great Trochanter,
3. PSHD ... Player’s Hand marker.

NB: The Great Trochanter is an anatomical bone marker position commonly used in biomechanical data acquisition.

Fig. 2. Stick Figure in Right-handed Coordinate System

Design decisions: No synthetic data to be used at this stage. Raw data not to be filtered or interpolated. Input data supplied in off-line mode. Data containing missing values should not be processed. Working with a small data set is still a design constraint.

Fig. 3. System Block Diagram (blocks: 3D Model Stick Figure; ROI Extraction; Feature Extraction Technique; Neural Net (RBF))

III. AUTOMATED ROI SELECTION

With reference to Figure 3, the architecture of the system has been designed around the idea of operation in both on-line and off-line monitoring modes. At present all the test data $U$ have been supplied in off-line mode (as in Equation 2).


A. Input Data

For the ROI extraction process, the experimental input data can be described as follows:

Multidimensional representation of n markers as a 3D stick figure (Fig. 2) approximating a human holding a racquet. Within each frame $M$, a marker $m_i$ is defined by its 3D coordinates:

$m_i(t) = (x_i(t), y_i(t), z_i(t))$ (1)

where $M$ is a set of n markers as in Figure 2.

A small data set of tennis strokes $S_i$ (i.e. characteristic forehands and backhands), originally arranged into ten clusters for ROI extraction algorithm testing purposes. Each cluster contains a series of five or six tennis strokes.

A chaotic time series of tennis strokes $S_i$, with a random time delay between two successive strokes. Each stroke $S_i$ is a set of consecutive frames $M(t)$ of individual duration $k$:

$S_i \in U, \quad S_i = \{ M(t) \mid t_i \le t \le t_{i+k-1} \}$ (2)

Finite state automata of typical sets of motion sequences constituting tennis strokes. Some motion sequences contain typical postures, or a set of rigid body positions (e.g. hitting the ball phases). A motion sequence is described as a 3D time series of rigid body positions over an arbitrary time period.

B. Experimental Development Sequence

1) Determining Redundant Data: With reference to the stick figure (Fig. 2), only three markers (bolded in Fig. 2 as SSGT, PSGT and PSHD) have been used to compute the ROI interval. Figure 4 shows visual observation of relevant kinematics data [18, 20], with markers presented separately in the x, y and z axes. While a marker's position is obtained through the acquisition system, a marker's velocity is calculated as the first-order derivative of relative position displacement over time.

- Position $(x, y, z)$
- Velocity $(\dot{x}, \dot{y}, \dot{z})$ (3)

NB: the notation $\dot{x}$ represents the first derivative of x with respect to time, dx/dt.
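As an illustration of the velocity calculation described above, the following sketch approximates the first-order derivative of a marker's position with finite differences. The function name, the array layout (one row per frame, one column per axis) and the frame rate used in the example are assumptions made for this sketch only.

```python
import numpy as np

def marker_velocity(position, fps):
    """Finite-difference velocity of one marker.

    position: array of shape (frames, 3) holding x, y, z per frame.
    fps:      acquisition frame rate in frames per second.
    Returns an array of the same shape holding dx/dt, dy/dt, dz/dt.
    """
    dt = 1.0 / fps
    # Central differences inside the series, one-sided at both ends.
    return np.gradient(position, dt, axis=0)

# Hypothetical marker moving at constant velocity (2, 0, -1) units per second.
fps = 50
t = np.arange(5) / fps
position = np.column_stack([2.0 * t, np.zeros_like(t), -1.0 * t])
velocity = marker_velocity(position, fps)
```

Because the raw data are not filtered, a finite-difference derivative of this kind amplifies measurement noise; the sketch makes no attempt to smooth it.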

Fig. 4. Markers' Displacement and Velocity Time Series

By observing the x and z axes' time series (Fig. 4) it is possible to recognise typical patterns related to forehand and backhand tennis strokes. As a further observation, the y axis time series indicates the player's intention related to the energy transfer to spin the ball in a particular manner (e.g. slice or topspin). For the ROI selection method the y axis data are redundant.

2) Sliding Window Approach - Phase One: In phase one, incoming data have been processed in successive windows using arbitrary data intervals of one second. Within this window the system can evaluate the presence of a tennis stroke, $S_i$. The rule for detecting a particular stroke type (e.g. backhand or forehand) relies on two mutually dependent parameters: relative stroke magnitude (i.e. local stroke maximum), a descriptor insensitive to a player's absolute displacement (i.e. position), and swing velocity towards hitting orientation. If both conditions of phase one are met for a given window $W_i$ (e.g. five window starting times marked as vertical dashed lines in Fig. 5), then the phase two computation is invoked.
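The phase one rule can be sketched as below. This is an illustration under stated assumptions rather than the system's code: the stroke magnitude is taken here as the distance of the hand marker from a body reference point in the transverse plane, the windows are non-overlapping, and the threshold values and the synthetic swing are hypothetical.

```python
import numpy as np

def phase_one(hand_xz, body_xz, fps, vel_thr, mag_thr, window_s=1.0):
    """Return (start, stop) frame ranges of the windows in which both
    phase one conditions hold; stop is exclusive."""
    dt = 1.0 / fps
    # Hand position relative to the body, so the descriptor does not
    # depend on where the player stands (assumed definition).
    magnitude = np.linalg.norm(hand_xz - body_xz, axis=1)
    # Hand speed from the first-order derivative of position.
    speed = np.linalg.norm(np.gradient(hand_xz, dt, axis=0), axis=1)

    size = int(round(window_s * fps))
    hits = []
    for start in range(0, len(hand_xz) - size + 1, size):
        stop = start + size
        local_stroke_max = magnitude[start:stop].max()
        hand_velocity = speed[start:stop].max()
        if hand_velocity > vel_thr and local_stroke_max > mag_thr:
            hits.append((start, stop))  # phase two would be invoked here
    return hits

# Synthetic 3-second recording with one swing in the second window.
frames = np.arange(150)
x = np.interp(frames, [0, 60, 70, 80, 149], [0.2, 0.2, 1.0, 0.2, 0.2])
hand = np.column_stack([x, np.zeros_like(x)])
body = np.zeros_like(hand)
windows = phase_one(hand, body, fps=50, vel_thr=1.0, mag_thr=0.5)
```

Requiring both conditions at once keeps a slow reach (large magnitude, low velocity) or a quick small movement (high velocity, small magnitude) from triggering the more expensive phase two computation.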

Fig. 5. Stroke Magnitude and Velocity (horizontal axis: window time [sec])

Figure 5 above shows the SSGT, PSGT and PSHD markers and the superimposed swing velocity.

3) Determining ROI Interval - Phase Two: When invoking the phase two process, the sliding window data interval $W_i$ is extended to include the prior and post window neighbours, $W'_i = (W_{i-1}, W_i, W_{i+1})$. Ultimately, the time series of the three markers (bolded in Fig. 2 as SSGT, PSGT and PSHD) are presented as marker traces in the 2D transverse plane (x, z) in Figure 6, below. To visualise the motion pattern dynamics, an additional ‘virtual’ marker is computed for visual evaluation of the centre of the pelvis (or human body) in the transverse plane.

Fig. 6. 2D Transverse Plane (legend: PSHD, PSGT, BodyCentre; annotation: local stroke max.)

C. The ROI Extraction Algorithm

The ROI extraction algorithm presented in Figure 3 shows the relationship between two separate computational levels, where phase one invokes phase two whenever needed.

Phase one: Sliding window sequence
    Initialise parameters and read input data
    WHILE DO
        update W_i data structure
        calculate “local stroke maximum”
        calculate maximum “hand velocity”
        IF (“hand velocity” > “velocity threshold”) AND
           (“local stroke maximum” > “stroke magnitude threshold”)
            invoke Phase two
        END IF
    END WHILE

Phase two: Determining ROI interval
    Initialise parameters and read input data as W'_i
    determine a frame number of local stroke maximum as Lmax_i
    WHILE DO
        reduce frame number
        determine ROIend_i frame number as in Equations (5) and (6)
        determine which marker is “near rear hip”
        determine ROIstart_i
    END WHILE

Note that determining which hip marker is “near rear hip” provides information to distinguish between backhand and forehand strokes.

D. Experimental Results

The results obtained from the prototype system have been compared to those acquired from a human expert, who used a 3D visualisation tool to annotate time periods of selected ROIs for a series of tennis strokes $U$:

$U = \{ S_i \mid i = 1, \ldots, p \}, \quad p \in \mathbb{N}$

Due to human factors, quantisation (i.e. digitising) error, and the limited number of frames per second, a ROI time interval $ROI_i$ could be determined only within reasonable levels of accuracy, as shown in Table 1.

$ROI_i \subset S_i$ (4)

$ROI_i = [ROIstart_i, ROIend_i]$ (5)
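A comparison between computed and expert-annotated ROI intervals of this kind can be sketched as follows. The helper names and the frame numbers are hypothetical and serve only to show the arithmetic: per-stroke differences in start and end frame, a containment check in the sense of Equation (4), and the usual summary statistics.

```python
from statistics import mean, median

def roi_within_stroke(roi, stroke):
    """Check that the ROI frame interval lies inside the stroke interval."""
    return stroke[0] <= roi[0] <= roi[1] <= stroke[1]

def roi_differences(computed, expert):
    """Per-stroke (start, end) frame differences, computed minus expert."""
    return [(c[0] - e[0], c[1] - e[1]) for c, e in zip(computed, expert)]

def summarise(values):
    """Summary statistics over a list of per-stroke values."""
    return {
        "average": mean(values),
        "max": max(values),
        "min": min(values),
        "median": median(values),
        "range": max(values) - min(values),
    }

# Hypothetical (start, end) frame numbers for three strokes.
computed = [(100, 112), (300, 309), (520, 528)]
expert = [(101, 111), (300, 310), (518, 528)]

diffs = roi_differences(computed, expert)
end_summary = summarise([d_end for _, d_end in diffs])
inside = roi_within_stroke((100, 112), (80, 140))
```

Reporting the differences in frames rather than seconds keeps the comparison independent of the frame rate, at the cost of hiding the quantisation error that one frame represents.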

The variance in the $ROIend_i$ parameter in the algorithm is influenced by human (fuzzy) interpretation of the end of the $ROI_i$ interval. As shown in Figure 6, the computed $ROIend_i$ frame number is further reduced from the local stroke maximum $Lmax_i$ by a factor which depends on angle $\theta$:

$ROIend_i = f(Lmax_i, \theta)$ (6)

This has been defined as parameter p. In Table 1 below, the best results are achieved when p = 0.25.

TABLE 1
Experimental Results

Average | 8.263 | 0.789 | -0.16 | 7.316 | 0.947
Max     | 13    | 3     | 1     | 12    | 4
Min     | 5     | 0     | -1    | 5     | -1
Median  | 8     | 1     | 0     | 6     | 1
Range   | 8     | 3     | 2     | 7     | 5

E. Proposed Methodologies

The system as a whole will implement hybrid solutions for ROI and stroke interpretation inspired by studies of speech and gesture recognition and evolving connectionist systems.

TABLE 2
Proposed Solutions

Method              | Possible Applications
HMM                 | Motion and kinematics sequence (i.e. kinetic link) recognition [18, 19].
RBF or similar FNN  | Rigid body position classification. Biomechanical key values of specific motions observed to be compared within fuzzy ranges.
EFuNN, Fuzzy DENFIS | Adaptive learning and lifelong learning of ever-evolving tennis techniques [20].
Rule based          | Hard coded solution.

As the research to date has used only a small data set and expert knowledge has been available regarding the generalisation requirements, a hard-coded rule-based solution has been used and has produced a very positive outcome in terms of stroke classification.

IV. CONCLUSIONS AND FUTURE CHALLENGES

The two-stage algorithm structure allows additional tennis stroke technique algorithms to be added for detection and ROI extraction. Fuzzy information such as a player’s height (which can be estimated from the average height of SSGT and PSGT) should influence observations about the magnitude of the swing attribute. Building on previously stored player profiling data, it would be possible to take into account a player’s skill level and devise an individualised strategy for feedback/intervention.
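One way such height information could feed into the swing magnitude threshold is sketched below. Everything in it is an assumption made for illustration: the doubling of the hip marker height (the markers sit at roughly half the player's height), the triangular membership functions for "short", "medium" and "tall", their breakpoints, and the scale factors combined by a weighted average.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def estimate_height(ssgt_heights, psgt_heights):
    """Rough player height: twice the mean height of the two hip markers."""
    both = list(ssgt_heights) + list(psgt_heights)
    return 2.0 * sum(both) / len(both)

def magnitude_threshold(height_m, base=0.5):
    """Scale a base stroke magnitude threshold by fuzzy height classes."""
    rules = [
        (tri(height_m, 0.9, 1.2, 1.6), 0.7),  # short player: lower threshold
        (tri(height_m, 1.2, 1.7, 2.0), 1.0),  # medium player: unchanged
        (tri(height_m, 1.7, 2.1, 2.5), 1.2),  # tall player: higher threshold
    ]
    total = sum(m for m, _ in rules)
    if total == 0.0:
        return base  # height outside all classes: fall back to the base value
    return base * sum(m * s for m, s in rules) / total

height = estimate_height([0.85, 0.86], [0.84, 0.85])
threshold = magnitude_threshold(height)
```

A graded threshold of this kind changes smoothly between a child and a very tall adult, which a single fixed value cannot do.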

The same fuzzy rules could be applied for automatic ROI extraction (instead of using an arbitrary threshold for determining the phase two precondition) to achieve better generalisation for teaching both children and very tall players. The next stage of the research will explore possibilities for synthetic data creation, derived from existing data sets. The system will have additional fuzzy rules implemented to provide better feedback. More output classes will also be used to improve the feedback component. Finally, different solutions, including evolving connectionist systems for adaptive training, will be explored (as per Table 2) and integrated as appropriate into the system. In conclusion, the favourable experimental results to date give promise of an interesting and innovative area of research to pursue.


FUZZ-IEEE 2004

ACKNOWLEDGMENTS

I wish to express my appreciation to Professor Nik Kasabov, Professor Stephen MacDonell and Gordon Grimsey for the support, contribution and helpful remarks they gave me on this article. Tennis data have been obtained in collaboration with the Polyclinic for physical therapy and rehabilitation "Peharec", Pula (Croatia), including the support and helpful contribution of Petar Bačić.

REFERENCES

[1] S. Dreyfus and H. L. Dreyfus, Mind over machine: The Free Press, 1986.
[2] Y. Ariki and Y. Sugiyama, "Classification of TV sports news by DCT features using multiple subspace method," in Proc. 14th Int. Conference on Pattern Recognition (ICPR '98), 1998, pp. 1488-1491.
[3] G. Sudhir, J. C. M. Lee, and A. K. Jain, "Automatic classification of tennis video for high-level content-based retrieval," IEEE Computer Society, pp. 81-90, 1997.
[4] M. Petkovic, W. Jonker, and Z. Zivkovic, "Recognizing strokes in tennis videos using hidden Markov models [Electronic version]," in Proc. International Conference on Visualization, Imaging and Image Processing, Marbella, Spain, 2001.
[5] G. Pingali, A. Opalach, Y. Jean, and I. Carlbom, "Visualization of sports using motion trajectories: Providing insights into performance, style, and strategy," in Proc. 12th IEEE Visualization 2001 Conference (VIS 2001), San Diego, CA, 2001, pp. 75-82.
[6] D. Zhong and S.-F. Chang. (20 Jan). Structure analysis of sports video using domain models. [Online]. Available:
[7] Image segmentation and feature extraction for recognizing strokes in tennis game videos. [Online]. Available: http://monetdb.cwi.nl/acoi/DMW/publications/31.pdf
[8] G. Pingali, A. Opalach, and Y. Jean, "Ball tracking and virtual replays for innovative tennis broadcasts [Electronic version]," in Proc. 15th International Conference on Pattern Recognition (ICPR'00), Barcelona, Spain, 2000.
[9] R. Boulic and P. Baerlocher. (2002, Dec.). Cinématique inverse pour personnage en 3D, solutions analytiques et variationnelles. Journal de CFAO. [Online]. Available: http://vrlab.epfl.ch/Publications/pdf/Boulic_Baerlocher_Journal_CFAO_01.pdf
[10] L. Herda, P. Fua, R. Plänkers, R. Boulic, and D. Thalmann, "Using skeleton-based tracking to increase the reliability of optical motion capture," Human Movement Science Journal, vol. 20, pp. 313-341, 2001.
[11] Skill Technologies. (2004, Jan.). Integrated solutions and software for motion measurement. [Online]. Available:
[13] (2004, Jan.). 3-D analysis of human movement - home page. [Online]. Available: http://www.utc.edu/Human-Movement/
[14] V. M. Zatsiorsky, Kinematics of human motion. Champaign, IL: Human Kinetics, 1998.
[15] D. A. Winter, Biomechanics and motor control of human movement: Wiley, 1990.
[16] (2004, Jan.). International Society of Biomechanics. [Online]. Available: http://www.isbweb.org/
[17] B. Bacic, "Automating System for Interpreting Biomechanical 3D Data using ANN: A case study on Tennis," in Proc. 3rd Conference on Neuro-Computing and Evolving Intelligence 2003 - NCEI'03, Auckland, New Zealand, 2003, pp. 101-102.
[18] M. Petkovic, W. Jonker, and Z. Zivkovic. (2003, Dec.). Recognizing strokes in tennis videos using hidden Markov models. Presented at International Conference on Visualization, Imaging and Image Processing. [Online]. Available: http://monetdb.cwi.nl/acoi/DMW/publications/32.pdf
[19] W. Wolf, B. Ozer, and T. Lv, "Smart cameras for embedded systems," IEEE Computer Society, vol. 35(9), pp. 48-53, 2002.
[20] N. K. Kasabov, Evolving connectionist systems: Methods and applications in bioinformatics, brain study and intelligent machines: Springer Verlag, 2002.
[21] B. Bacic, "A general connectionist development environment for sports data indexing and analysis - a case study on tennis," presented at NeuroComputing Colloquium & Workshop - NCC&W02, Auckland University of Technology, Auckland, New Zealand, 2002.
[22] H-Anim. (2003, Jan.). Specification for a standard humanoid. [Online]. Available: http://h-anim.org/Specifications/H-Anim1.1/
[23] M. Pollefeys and L. V. Gool, "From images to 3D models," Communications of the ACM, vol. 45(7), pp. 50-55, 2002.