Automatic Classification of Hand Drawn Geometric Shapes using Constructional Sequence Analysis

R.M. Guest1, S. Chindaro1, M.C. Fairhurst1 and J.M. Potter2
1 Department of Electronics, University of Kent, Canterbury, UK, CT2 7NT
2 Kent and Canterbury Hospital, Canterbury, Kent.
Email: [email protected]

Abstract
A method for automatically assessing the constructional sequence from a neuropsychological drawing task using Hidden Markov Models is presented. We also present a method of extracting and identifying the position of individual pen strokes relating to individual sides of a shape within a drawing to form training and testing sequences. Our results from two experiments using data from patients with visuo-spatial neglect show the HMM classifier is able to generalise on incorrectly extracted sequences and obtain a diagnostic classification which can be used alongside other forms of conventional assessment.
1. Introduction
Figure drawing tasks are used widely for the diagnosis and monitoring of a variety of neuropsychological conditions such as dyspraxia, Parkinson’s disease and stroke. Analysis of these tests typically relies on the assessment of the presence of components (e.g. sides of a shape) within the completed drawing. In the analysis of many clinical conditions it has been shown that important diagnostic information can also be obtained by observing the process and sequence in which the test subject responded [1].

In order to assess whether pattern recognition methods can be applied to the assessment of constructional aspects of neuropsychological test performance, the experiments described in this paper use data captured using a pen-based computer system from test subjects with the post-stroke condition of visuo-spatial neglect (VSN). VSN is a dysfunction caused by brain damage, the main effect of which is to cause subjects to fail to respond to stimuli in the visual field on the opposite side to the location of the lesion. Assessment of VSN has conventionally included a variety of drawing tests alongside clinical and functional evaluation. Assessing figure drawing tasks on a static level (that is, only considering the outcome of the drawing) fails to capture any information concerning the underlying strategy of drawing execution, which has been shown to be an important diagnostic indicator for other tasks of VSN detection [2].
In this paper we describe experiments carried out using Hidden Markov Models (HMM) to model and predict classification from drawing construction sequence recognition. To facilitate this we also present a method for automated shape edge and sequence extraction from two commonly used drawing tasks. The aim of this paper is to demonstrate that the execution strategy - the sequence in which drawing strokes are made - can be effectively used as a diagnostic indicator of VSN in stroke patients.
2. Experimental Protocol
The data for this research were collected as part of a wider study of the assessment of VSN in stroke patients [3]. Test subjects were required to complete a series of tasks, which were printed on sheets of paper overlaid on a standard graphics digitisation tablet. As the test subject completed the task using a cordless pen, status data such as pen position, tablet pressure and pen tilt were stored in a time-stamped pen response file for feature extraction.

Two drawing tasks were conducted. In the first of the drawing tasks, figure copying (FC), a square and a cross were printed at the top centre of separate overlays. The test subject was required to copy the shape directly below the printed image. The second drawing task, drawing from memory (DFM), uses only the square. The test subject is asked to draw the shape on a separate blank overlay. The cross shape in particular was chosen because of its use in a copying trial by Warrington, James and Kinsbourne [4], where it was found to give the best separation between VSN and non-VSN subjects based on observed performance characteristics such as simplification of drawings and difficulty in producing angles between sides. Figure 1 shows the model shapes used in the FC task.

A total of 28 VSN subjects, collected over a four-year period, were included in the trial. These subjects were identified using standard clinical assessments. A group of 55 stroke subjects without VSN were used as the Stroke Control group.

Two experiments were conducted using these data. In the first experiment, sequences were automatically extracted using a purpose-built algorithm and were then
Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR 2003) 0-7695-1960-1/03 $17.00 © 2003 IEEE
utilised in the training and testing of the HMMs. In the second experiment, drawing sequences were obtained manually using a playback routine which reproduced the drawings dynamically utilising the timing data captured during the drawing task. These manually obtained sequences were presented to the HMM. Comparing the results from the two experiments allows an observation as to the robustness of the generalisation ability of the HMM given that some of the sequences in the first experiment may not be correctly detected. It also enables the accuracy of the automated sequence extraction method to be measured.
Figure 1: Figure Copying Shape Models

3. Drawing Sequence Analysis Using Hidden Markov Models
Hidden Markov Models (HMM) are statistical models of sequential data that have been used successfully in image processing and computer vision [5,6], speech recognition [7,8] and modelling of biological sequences [9,10] among other applications. The HMM is a natural way of recognising a pattern which can be represented as a sequence of discrete observation symbols [11].

To facilitate an analysis of drawing sequencing, it is necessary to have a coded model of each task (Figure 2) where each side of the shape is assigned a unique identifier. For the application described in this paper, an HMM observation symbol is the identification number of an individual shape side extracted from the drawing.

Figure 2: Drawing Sequence Models

The first task towards identifying the drawing order is the segmentation of the subject’s drawing into individual strokes. This was followed by stroke labelling with reference to the coding models. Once these strokes and their sequencing have been identified, each stroke acts as a state in the physical model of the events sequence which can be presented to an HMM.

3.1. Strokes and Sequence Extraction
Figure 3 shows a flowchart of the method used to automatically extract side sequences from shape drawing response files.

[Figure 3 flowchart steps: Open capture file; Read in coordinate pair; Obtain 8 point chain code; Smooth chain code to remove movement noise; Identify sections; Repeat until EOF; Square: Identify largest 2 horizontal and vertical sections; Cross: Identify largest 4 horizontal and vertical sections; Order sides according to position; Obtain sequence]

Figure 3: Sequence Extraction

The algorithm extracts a conventional 8-point chain code from pairs of pen coordinates within the response file. Once a chain code is generated for the entire file it is then smoothed to eliminate any spurious pen movements. This is achieved by sliding a 5-code width window across the sequence and calculating the modal directional code within the current position. All directional codes within the window are set to this modal value. Figure 4 shows this process with the extracted chain code in Figure 4a with a window around the first 5 codes. This window is then smoothed as shown in Figure 4b by setting all values to the modal code of 1. The window then moves on to the next block of 5 codes.

a) 1121111121221
b) 1111111121221

Figure 4: Window Modal Smoothing
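The chain coding and window smoothing steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the direction numbering (0 = +x, counted counter-clockwise in steps of 45 degrees) is an assumed convention that the paper does not specify, and the window is moved in non-overlapping blocks of 5 codes, which is one reading of the description of Figure 4.

```python
import math
from collections import Counter


def chain_code(points):
    """Convert consecutive (x, y) pen samples to 8-direction codes.

    Assumed convention (not from the paper): 0 = +x direction,
    codes increase counter-clockwise in 45 degree steps.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # pen did not move between samples
        codes.append(round(math.atan2(dy, dx) / (math.pi / 4)) % 8)
    return codes


def modal_smooth(codes, width=5):
    """Set every code in each block of `width` codes to the block's mode."""
    smoothed = []
    for start in range(0, len(codes), width):
        block = codes[start:start + width]
        mode = Counter(block).most_common(1)[0][0]
        smoothed.extend([mode] * len(block))
    return smoothed


# First window of the Figure 4a example: 11211 becomes 11111.
figure_4a = [int(c) for c in "1121111121221"]
print(modal_smooth(figure_4a)[:5])
```

Applied to the whole of the Figure 4a code, the same routine would also smooth the second and third blocks; Figure 4b shows only the state after the first window has been processed.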
The sequence of drawn sides is obtained by assessing the two longest vertical and horizontal sections in each direction within the square chain code (the four longest when assessing the cross) and mapping the location identification relative to the sequence model shown in Figure 2. By establishing the order in which the sides were drawn an output sequence can be obtained.
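For the square, the side-ordering step described above might look like the following sketch. It rests on several assumptions that are not in the paper: the chain code uses the 0 to 7 convention with 0 = +x and 2 = +y, every code represents a unit-length step, and sides are reported by position name (top, right, bottom, left) rather than by the numeric identifiers of Figure 2. All function names are hypothetical.

```python
from itertools import groupby

# Unit step for each direction code (assumed convention: 0 = +x, 2 = +y).
STEP = {0: (1, 0), 1: (1, 1), 2: (0, 1), 3: (-1, 1),
        4: (-1, 0), 5: (-1, -1), 6: (0, -1), 7: (1, -1)}
HORIZONTAL = {0, 4}
VERTICAL = {2, 6}


def sections(codes):
    """Split a smoothed chain code into runs of identical direction."""
    runs, x, y, index = [], 0, 0, 0
    for code, group in groupby(codes):
        n = len(list(group))
        dx, dy = STEP[code]
        runs.append({"start": index, "code": code, "length": n,
                     "mid": (x + dx * n / 2, y + dy * n / 2)})
        x, y, index = x + dx * n, y + dy * n, index + n
    return runs


def square_side_order(codes):
    """Return side names in drawing order, or None if a side is missing."""
    runs = sections(codes)

    def longest_two(kind):
        picked = [r for r in runs if r["code"] in kind]
        return sorted(picked, key=lambda r: r["length"], reverse=True)[:2]

    horiz, vert = longest_two(HORIZONTAL), longest_two(VERTICAL)
    if len(horiz) < 2 or len(vert) < 2:
        return None
    # Label the chosen sections by their relative position in the drawing.
    bottom, top = sorted(horiz, key=lambda r: r["mid"][1])
    left, right = sorted(vert, key=lambda r: r["mid"][0])
    named = [("top", top), ("bottom", bottom), ("left", left), ("right", right)]
    # The drawing sequence is the order in which the sections started.
    named.sort(key=lambda item: item[1]["start"])
    return [name for name, _ in named]


# A square drawn clockwise starting with the top side.
clockwise = [0] * 10 + [6] * 10 + [4] * 10 + [2] * 10
print(square_side_order(clockwise))
```

The cross would follow the same pattern with the four longest sections in each orientation. Mapping the position names to observation symbols requires the side numbering of Figure 2.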
3.2. Model Selection
A number of different models have been proposed and implemented using HMMs [12]. The left-to-right or Bakis model [13] was adopted for this investigation. In this model the underlying state index either increases or stays the same as time increases. The sample sequences extracted from the tasks represent successive strokes, which should have a sequential relationship likely to be captured by the left-to-right model. Using this model, the number of unknown variables is reduced significantly and a representative model has a higher chance of being realised with a limited number of training samples. The Bakis model also supports multiple observation sequences [14], which is a requirement of our system as we have multiple training samples for each of the groups. It is assumed that the observation sequences are independent of each other. The observation sequence $O$ therefore consists of several independent sequences $O^{(k)}$, $k = 1, 2, \ldots, K$, where $O^{(k)}$ is a training sequence for a given class and $K$ is the total number of samples used for that class. The goal is to maximise $P(O \mid \lambda)$, where $\lambda$ represents a particular class model, which is obtained by:

$$P(O \mid \lambda) = \prod_{k=1}^{K} P(O^{(k)} \mid \lambda) = \prod_{k=1}^{K} P_k$$

Since the re-estimation formulas are based on frequencies of occurrence of various sequences, the formulas for multiple observation sequences are computed by adding together the individual frequencies of occurrence for each observed sequence.

3.3. Strokes and Sequence Extraction
The transition behaviour of the sequences obtained from the feature extraction routine can be modelled in terms of the probability of each transition using HMM. We can therefore compute a model for each of the two groups of patients (VSN and Stroke Control). The maximum number of states is governed by the maximum number of transitions for each model: 4 for the square and 8 for the cross.

The Viterbi algorithm is widely used in predicting the most likely next state (based on training sequences) for a given sequence of testing observations in HMM problems [15]. For a single HMM, the Viterbi algorithm can be expressed as [16]:

$$\delta_1(i) = \pi_i \, b_i(O_1), \quad 1 \le i \le N$$

where $N$ is the number of states, $O$ is the observation sequence, $\pi_i$ is the initial state probability vector, $\delta_1(i)$ is the probability that symbol $O_1$ is observed at time $t = 1$ and state $i$, and $b_i$ is the model output symbol probability matrix. After initialisation, a recursive method is applied as follows:

$$\delta_t(j) = \max_{1 \le i \le N} \left[ \delta_{t-1}(i) \, a_{ij} \right] b_j(O_t), \quad 2 \le t \le T, \quad 1 \le j \le N$$

where $\delta_t(j)$ is the maximum probability that the symbols up to $O_t$ are observed with the model in state $j$ at time $t$, and $a_{ij}$ is the transition matrix of the underlying Markov chain, that is, the probability of transition to state $j$ given the current state $i$. The termination criterion is:

$$p^* = \max_{1 \le i \le N} \left[ \delta_T(i) \right]$$

The maximum of $\delta_T(i)$ over the states $i$ is the scored probability of the observed sequence against the model. The corresponding predicted state is the final state of the optimal path [16].
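As an illustration of the scoring step, the sketch below runs the initialisation, recursion and termination of the Viterbi score on small made-up models and assigns a sequence to whichever class model scores higher. The model parameters, the two-state left-to-right structure and the function names are assumptions chosen for the example; they are not the trained models from this study. Observation symbols are assumed to be integers indexing the columns of the output matrix.

```python
def viterbi_score(obs, pi, A, B):
    """Return (p_star, final_state) for an observation sequence.

    pi[i]   : initial probability of state i
    A[i][j] : probability of moving from state i to state j
    B[i][o] : probability of emitting symbol o in state i
    """
    n = len(pi)
    # Initialisation: delta_1(i) = pi_i * b_i(O_1)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Recursion: delta_t(j) = max_i[delta_{t-1}(i) * a_ij] * b_j(O_t)
    for symbol in obs[1:]:
        delta = [max(delta[i] * A[i][j] for i in range(n)) * B[j][symbol]
                 for j in range(n)]
    # Termination: p* = max_i delta_T(i)
    final_state = max(range(n), key=lambda i: delta[i])
    return delta[final_state], final_state


def classify(obs, models):
    """Pick the class whose model gives the highest Viterbi score."""
    return max(models, key=lambda name: viterbi_score(obs, *models[name])[0])


# Hypothetical two-state left-to-right models (zero backward transition).
left_to_right = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "A": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.2, 0.8]]),
    "B": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.8, 0.2]]),
}
print(viterbi_score([0, 1], *models["A"]))
print(classify([0, 1], models))
```

For longer sequences the products become very small, so a practical version would usually work with sums of log probabilities instead.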
4. Results
When training a Hidden Markov Model on a group of sequences, the resultant model is represented in a single prototype. We therefore have a model for each of the two groups of patients, VSN and Stroke Control, and for each drawing task we apply the method to. The success of the training phase can be judged by evaluating the model’s output using the same samples as used for training. The approach used to assess the overall accuracy of the method involved the random partitioning of the available datasets into equally sized training and test sets for a ten-fold cross-validation experiment. We therefore have different subjects represented in the training and testing datasets for each trial. The recognition performance presented in the tables is averaged across all trials.
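One way to read the evaluation protocol is as ten trials, each with a fresh random split of the subjects into equally sized training and test halves, with the recognition rate averaged over trials. The sketch below follows that reading, which is an assumption; the function names, the fixed seed and the `evaluate` callback are hypothetical.

```python
import random


def random_half_splits(subject_ids, trials=10, seed=0):
    """Yield (train, test) lists of equal size for each trial."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    for _ in range(trials):
        shuffled = list(subject_ids)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        yield shuffled[:half], shuffled[half:]


def mean_recognition_rate(subject_ids, evaluate, trials=10, seed=0):
    """Average the rate returned by evaluate(train, test) over all trials."""
    rates = [evaluate(train, test)
             for train, test in random_half_splits(subject_ids, trials, seed)]
    return sum(rates) / len(rates)


# 28 subject identifiers, matching the size of the VSN group.
for train, test in random_half_splits(range(28), trials=2):
    print(len(train), len(test))
```

In use, `evaluate` would train one model per group on the training subjects and return the percentage of test subjects assigned to the correct group.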
4.1. Automatic Sequence Extraction and Classification
The results presented in Table 1 below are for the fully automated stroke extraction and HMM sequence recognition process. Accuracy results are presented for the detection of individual groups (VSN/Stroke Control) and overall recognition rate for a particular task. An overall recognition rate of 69.0% was obtained using the sequences extracted from the cross drawing task. Of the three tasks the square figure copying task produced the lowest recognition rate of 56.0%. These results provide the benchmarks against which the performance using manually verified sequences can be compared.

Columns: Task; Group; Group Based Recog. Rate (%) - Training Set; Overall Recog. Rate (%) - Training Set; Group Based Recog. Rate (%) - Test Set; Overall Recog. Rate (%) - Test Set
Rows: Square (FC), Square (DFM) and Cross (FC), each for the VSN and Stroke Control groups
Values as extracted: 68.0; 56.0; 52.4; 58.0; 70.0; 68.0; 64.0; 65.1; 58.0; 62.2; 72.4; 68.0; 71.2; 69.0; 70.0; 54.0; 60.2; 70.0
Table 1: Results of classification using HMM with automatically extracted sequences

4.2. Manual Sequence Extraction
The results obtained using the automatic strokes and sequence extraction modules were manually verified for correctness of sequence. Table 2 shows the rate of success of the automatic procedure. Incorrect sequence detection was due to a number of factors. In some cases all the straight lines would be successfully extracted, but the sequence would be distorted by the subject returning within the drawing sequence to repeat a line they had drawn earlier. Some cases would result in two parallel lines being drawn whilst the subject attempts to re-emphasize a particular line. In the case of the cross, besides these distortions, the subject might not have followed the anticipated model line sequences.

Group            Drawing Task    Strokes/Sequence Extraction Rate (%)
VSN              Square (DFM)    82.4
VSN              Square (FC)     82.4
VSN              Cross (FC)      56.6
Stroke Control   Square (DFM)    92.7
Stroke Control   Square (FC)     83.7
Stroke Control   Cross (FC)      71.2
Table 2: Results of strokes and sequence extraction rate

4.3. Recognition Results for Verified Sequences
Table 3 shows the results of the second experiment: testing and training HMM on manually extracted sequences. An overall improvement in recognition performance was observed in all the drawing tasks, with the cross again producing the highest recognition rate. As would be expected, the recognition rates are higher for manually extracted ‘correct’ sequences. However, the small difference between the two experiments’ results shows both the minimal trade-off between accuracy and speed in using the automated stroke recognition system and the generalisation ability of the HMM classifier.

Columns: Task; Group; Group Based Recog. Rate (%) - Training Set; Overall Recog. Rate (%) - Training Set; Group Based Recog. Rate (%) - Test Set; Overall Recog. Rate (%) - Test Set
Rows: Square (FC), Square (DFM) and Cross (FC), each for the VSN and Stroke Control groups
Values as extracted: 70.2; 58.0; 63.5; 60.3; 56.8; 62.7; 74.0; 72.4; 62.2; 68.1; 78.0; 70.1; 67.8; 75.4; 77.0; 76.0; 72.7; 70.0
Table 3: Results of classification using HMM with automatically plus manually extracted sequences
5. Conclusions
We have presented an objective VSN detection scheme based on computer-based drawing sequence analysis. Results for a stroke identification and sequence extraction technique for drawing components have been presented, showing a high success rate for the square drawing tasks. Despite lower accuracy rates in automatically extracting sequences from the cross drawing task, once a sequence was extracted a higher recognition rate was achieved. It has also been shown that HMMs can be used to provide a useful diagnostic indicator for the condition of VSN. Further work will concentrate on combining the sequencing data with other static and dynamic features, the classification outcome of which can be used alongside other forms of assessment in forming a diagnosis.
6. Acknowledgements
The authors acknowledge the support of the UK Engineering and Physical Sciences Research Council.
7. References
[1] J.J. Muir, L.C. Kirk, S.C. Casey, M.K. Morris, R.D. Morris, ‘Neuropsychological predictors of constructional abilities: Drawing and assembling’, Archives of Clinical Neuropsychology, 14(8), pp. 734-735, 1999.
[2] R.M. Guest, M.C. Fairhurst, J.M. Potter, ‘Diagnosis of visuo-spatial neglect using dynamic sequence features from a cancellation task’, Pattern Analysis and Applications, 5, pp. 261-270, 2002.
[3] R.M. Guest, M.C. Fairhurst, J.M. Potter, N. Donnelly, ‘Computer Assessment of Line Drawing in Visual Neglect’, Cerebrovascular Diseases, 10, S2, p. 96, 2000. ISSN 3-8055-7096-1.
[4] E.K. Warrington, M. James, M. Kinsbourne, ‘Drawing disability in relation to laterality of cerebral lesion’, Brain, 89, pp. 53-82, 1966.
[5] J. Li, A. Najmi, R.M. Gray, ‘Image Classification by a Two-Dimensional Hidden Markov Model’, IEEE Transactions on Signal Processing, 48(2), p. 517, Feb 2000.
[6] K. Aas, L. Eikvil, R.B. Huseby, ‘Applications of hidden Markov chains in image analysis’, Pattern Recognition, 32(4), p. 703, 1999.
[7] R.E. Donovan, P.C. Woodland, ‘A hidden Markov-model-based trainable speech synthesizer’, Computer Speech & Language, 13(3), p. 223, Jul 1999.
[8] S.M. Ahadi, P.C. Woodland, ‘Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models’, Computer Speech & Language, 11(3), p. 187, Jul 1997.
[9] R.J. Boys, D.A. Henderson, D.J. Wilkinson, ‘Detecting homogeneous segments in DNA sequences by using hidden Markov models’, Applied Statistics, 49(2), p. 269, 2000.
[10] A.C. Camproux, P. Tuffery, S. Hazout, ‘Hidden Markov model approach for identifying the modular framework of the protein backbone’, Protein Engineering, 12(12), p. 1063, Dec 1999.
[11] H. Kunsch, S. Geman, A. Kehagias, ‘Hidden Markov random fields’, The Annals of Applied Probability, 5(3), Aug 1995.
[12] Y. Bengio, ‘Markovian Models for Sequential Data’, Neural Computing Surveys, 2, pp. 129-162, 1999.
[13] R. Bakis, ‘Continuous speech word recognition via centi-second acoustic states’, in Proc. ASA Meeting (Washington, DC), April 1976.
[14] L.R. Rabiner, ‘A tutorial on Hidden Markov Models and Selected Applications in Speech Recognition’, Proc. of the IEEE, Vol. 77, No. 2, pp. 257-286, Feb. 1989.
[15] A.J. Viterbi, ‘Error bounds for convolutional codes and an asymptotically optimal decoding algorithm’, IEEE Trans. Information Theory, IT-13, pp. 260-269, 1967.
[16] L.R. Rabiner and B.H. Juang, ‘An Introduction to Hidden Markov Models’, IEEE ASSP Magazine, pp. 4-16, January 1986.