An interactive system for the automatic generation of huge handwriting databases from a few specimens

Moussa Djioua, Réjean Plamondon
École Polytechnique de Montréal, Laboratoire Scribens, Département de Génie Électrique, C.P. 6079, Succursale Centre-Ville, Montréal QC, H3C 3A7, Canada
{moussa.djioua, rejean.plamondon}@polymtl.ca

Abstract
The Sigma-Lognormal model of the Kinematic Theory of rapid human movements has been implemented in an interactive software tool that allows the generation of databases of unlimited size from a few online handwriting specimens of letters and words. Online trajectories of a target word produced by a few writers are fitted with Sigma-Lognormal parameters using the interactive system. Thereafter, the fiducial pattern of the word is constructed from the mean values of the extracted parameters, and the writer variability is circumscribed by their standard deviations. Typical simulation results obtained by randomly fixing the parameters inside these realistic intervals are presented to highlight the ability of the generator to produce a large variety of multi-writer and writer-dependent handwriting patterns, as observed in real data. Overall, this software tool provides new insights into the development of huge databases for the training and testing of online handwriting classifiers and recognizers.

1. Introduction
Various pattern recognition methods use a discontinuous representation scheme to describe complex handwriting gestures as the superimposition of strokes. These primitives constitute a specific class of rapid human movements from which complex movements are planned and executed [1],[9],[11]. The Sigma-Lognormal model is one of these methods [7]. It characterizes each individual stroke trajectory with six parameters, and word patterns can then be constructed by the vectorial superimposition of a specific set of strokes, localized both in time and space. Recent theoretical developments and simulation results have highlighted the human-like quality of the patterns generated with this model [1],[7], and a software tool to simulate realistic signatures from fiducial specimens has been designed [2].

This paper is a follow-up to these studies. It presents an improved version of the tool, particularly adapted for the generation of huge databases of handwriting specimens. Among other things, this software facilitates the interactive extraction of Sigma-Lognormal parameters from the X-Y trajectory and the velocity profile of a target word. To do so, an initial solution is first estimated through an interactive curve-fitting process. Then, a Levenberg-Marquardt nonlinear optimization algorithm is activated and converges iteratively towards the optimum solution in the least-squares sense [5]. The Sigma-Lognormal parameter extractor is exploited to fit word trajectories produced by various writers. Fiducial patterns and realistic variability limits can thus be estimated from a few specimens. The template can then be used to generate huge databases of this word by automatically varying its parameters inside realistic writer-dependent or multi-writer intervals. The resulting synthetic database can then be exploited for the training and testing of online handwriting classifiers and recognizers.

This study is organized as follows. The Sigma-Lognormal model is summarized in Section 2, while the software system is presented in Section 3. In Section 4, the original methodology for the fiducial construction and the estimation of the variability limits is described using a typical example. Some samples of writer-dependent and writer-independent (multi-writer) databases are also displayed and commented on.

978-1-4244-2175-6/08/$25.00 ©2008 IEEE

2. Overview of the Sigma-Lognormal model
The Sigma-Lognormal model is the highest level of representation in the family of models supported by the Kinematic Theory [7]. It characterizes complex trajectories carried out by an end-effector in 2D and 3D spaces using a discontinuous representation scheme and considering single strokes as primitives. In this model, a stroke has a lognormal velocity profile and a sigmoid direction profile (erf function) [1]. The formal expression of the velocity $\vec{v}(t)$ of a complex movement is given by the following equation:

$$\vec{v}(t) = \sum_{i=1}^{L} \vec{v}_i(t); \quad L \geq 2 \qquad (1)$$

where $L$ represents the number of strokes involved in the generation of a given pattern and $\vec{v}_i(t)$ is the velocity profile of the movement that generates the $i$th stroke. In 2D space, the X-Y Cartesian coordinates of the trajectory are given by:

$$x(t) = x_0 + \sum_{i=1}^{L} \int_{t_0}^{t} v_i(\tau) \cos[\varphi_i(\tau)] \, d\tau \qquad (2.a)$$

$$y(t) = y_0 + \sum_{i=1}^{L} \int_{t_0}^{t} v_i(\tau) \sin[\varphi_i(\tau)] \, d\tau \qquad (2.b)$$

with

$$v_i(t) = \frac{D_i}{\sigma_i (t - t_{0i}) \sqrt{2\pi}} \exp\left( -\frac{[\ln(t - t_{0i}) - \mu_i]^2}{2\sigma_i^2} \right) \qquad (3.a)$$

and

$$\varphi_i(t) = \theta_{di} + \frac{\theta_{fi} - \theta_{di}}{2} \left[ 1 + \mathrm{erf}\left( \frac{\ln(t - t_{0i}) - \mu_i}{\sigma_i \sqrt{2}} \right) \right] \qquad (3.b)$$

Each curved stroke, indexed by $i$, is completely described in a 2D space by six Sigma-Lognormal parameters: $P = \{ t_{0i}, D_i, \mu_i, \sigma_i, \theta_{di}, \theta_{fi} \}$. This set of parameters reflects both the motor control process and the neuromuscular response. Indeed, from a functional point of view, the parameters $t_{0i}$ are the times of occurrence of the input commands (represented by the parameters $D_i$, $\theta_{di}$, $\theta_{fi}$), which control the distance and the direction of each stroke: $D_i$ represents the length, $\theta_{di}$ the starting direction angle and $\theta_{fi}$ the ending direction angle of the curved stroke $i$. The neuromuscular systems that react to these impulse commands are characterized by their log time delay $\mu_i$ and their log response time $\sigma_i$ (for more details, see [1-2],[7]).

3. Presentation of the system
The mathematical expressions described in equations (1) to (3) have been implemented in an interactive software tool, providing an ergonomic human-machine interface to assist an operator in the synthesis of artificial 2D trajectories and the estimation of an initial set of Sigma-Lognormal parameters. Figure 1 depicts a typical example of how an experimenter can synthesize a set of letters by individually acting on the six Sigma-Lognormal parameter values of each superimposed stroke. It also illustrates the interactive Sigma-Lognormal fitting result, as obtained here with 14 strokes, of the target word “lune” both in the X-Y and in the velocity spaces. Following this primary fitting, the resulting estimated parameters are used as an initial solution for the automatic optimization process, which uses the iterative Levenberg-Marquardt algorithm [5] and the analytical expression of the Sigma-Lognormal Jacobian to converge towards an optimal solution.

Figure 1: Typical display used for the interactive estimation of an initial solution by fitting both the trajectory and the velocity of a handwriting pattern. The Sigma-Lognormal parameters of each stroke are interactively tuned using this human-machine interface. In this example, the real and the synthetic X-Y trajectories and their corresponding velocity profiles are well matched by 14 superimposed strokes ($SNR_{velocity} = 32.54$ dB and $SNR_{trajectory} = 47.32$ dB).
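To make equations (1) to (3) concrete, the following is a minimal sketch of a Sigma-Lognormal synthesizer, not the authors' implementation. The names `Stroke`, `speed`, `direction` and `synthesize` are hypothetical, the integrals of equations (2.a) and (2.b) are approximated with the trapezoidal rule, and the default sampling rate of 200 Hz is borrowed from the experiment of Section 4.

```python
import math
from dataclasses import dataclass


@dataclass
class Stroke:
    """Six Sigma-Lognormal parameters of one stroke (field names are illustrative)."""
    t0: float       # time of occurrence of the input command (s)
    D: float        # stroke length
    mu: float       # log time delay
    sigma: float    # log response time
    theta_d: float  # starting direction angle (rad)
    theta_f: float  # ending direction angle (rad)


def speed(s: Stroke, t: float) -> float:
    """Lognormal speed profile of equation (3.a); zero at and before t0."""
    dt = t - s.t0
    if dt <= 0.0:
        return 0.0
    z = (math.log(dt) - s.mu) / s.sigma
    return s.D / (s.sigma * dt * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * z * z)


def direction(s: Stroke, t: float) -> float:
    """Sigmoid (erf) direction profile of equation (3.b)."""
    dt = t - s.t0
    if dt <= 0.0:
        return s.theta_d
    z = (math.log(dt) - s.mu) / (s.sigma * math.sqrt(2.0))
    return s.theta_d + 0.5 * (s.theta_f - s.theta_d) * (1.0 + math.erf(z))


def synthesize(strokes, t_end, fs=200.0, x0=0.0, y0=0.0):
    """Superimpose the strokes (equation 1) and integrate equations (2.a), (2.b)."""
    n = int(round(t_end * fs)) + 1
    ts = [k / fs for k in range(n)]
    # Cartesian velocity components summed over all strokes.
    vx = [sum(speed(s, t) * math.cos(direction(s, t)) for s in strokes) for t in ts]
    vy = [sum(speed(s, t) * math.sin(direction(s, t)) for s in strokes) for t in ts]
    xs, ys = [x0], [y0]
    for k in range(1, n):
        # Trapezoidal rule (an assumption; any quadrature scheme would do).
        xs.append(xs[-1] + 0.5 * (vx[k - 1] + vx[k]) / fs)
        ys.append(ys[-1] + 0.5 * (vy[k - 1] + vy[k]) / fs)
    return ts, xs, ys, vx, vy
```

A useful sanity check follows from the model itself: substituting the lognormal cumulative distribution for the integration variable shows that a single stroke ends at $x_0 + D(\sin\theta_f - \sin\theta_d)/(\theta_f - \theta_d)$ and $y_0 + D(\cos\theta_d - \cos\theta_f)/(\theta_f - \theta_d)$, so a call such as `synthesize([Stroke(0.0, 2.0, -1.5, 0.3, 0.0, math.pi / 2)], t_end=2.0)` should finish close to $(4/\pi, 4/\pi)$. A word pattern is obtained by passing several time-shifted strokes in the same list.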

4. Typical application
The construction of classifiers and recognizers of on-line handwriting notes or signatures collected on a PDA or Tablet PC has been an intensive research topic for more than a decade [10], and various methods have been proposed to construct databases in order to train and test the performance of these automatic systems [3-4],[6],[10]. In this perspective, one of the potential applications of the present software tool is the generation of huge databases from a few specimens of on-line handwriting patterns that reflect the personal style of a single writer or the global writing habits of a set of writers.

To illustrate this point, an experiment has been conducted in which six writers were asked to write three specimens of the word “lune” on a digitizer. The X-Y trajectories were sampled at 200 Hz, and low-pass and derivative filters were used to calculate a numerical velocity profile. Each trial pattern of each writer was then fitted by the Sigma-Lognormal model and the optimum parameter sets were obtained using the above-mentioned interactive protocol. A fiducial pattern for the word produced by each subject was then constructed from the mean values $\bar{P}_i$ of the corresponding Sigma-Lognormal parameters. The writer-specific variability of a given fiducial was also circumscribed by the standard deviations $STD_{P_i}$ of its parameter sets. Overall, this process led to six fiducial patterns, represented by the matrix $[\bar{P}]_{ip}^{(w)}$, and six intervals

$$[I]_{ip}^{(w)} = \left[ [\bar{P}]_{ip}^{(w)} - [k_p] * [STD]_{ip}^{(w)}, \; [\bar{P}]_{ip}^{(w)} + [k_p] * [STD]_{ip}^{(w)} \right]$$

(where $k_p = [k_{t_0} \; k_D \; k_{\theta_d} \; k_{\theta_f} \; k_\mu \; k_\sigma]$ is a factor range vector; $p = 1, \ldots, 6$ refers to one of the six stroke parameters, $i = 1, \ldots, 14$ to the stroke number and $w = 1, \ldots, 6$ to the writer identification number). The intervals reflect the realistic variability of each individual writer. Figure 2 shows the three specimens from which the fiducial (Figure 3b) produced by writer #2 was calculated, and Figure 3 depicts the fiducial patterns of the six writers.

Figure 2. Three specimens written by subject #2, from which the fiducial and its corresponding variability intervals are calculated. (Panels (a) to (c); axes: X (cm), Y (cm).)

Figure 3. Fiducial patterns of the target word for the six subjects as reproduced from the mean values of their Sigma-Lognormal parameters. (Panels (a) to (f); axes: X (cm), Y (cm).)

According to the Kinematic Theory, the distortions observed in handwriting data can be reproduced by randomly varying the Sigma-Lognormal parameters around their mean values, within the limits fixed by their standard deviations, as measured from a few trials [1]. Global deformations can be obtained by varying the parameters with the same factor. For example, to produce scale changes of a pattern, the parameter $D_i$ of each constituting stroke is multiplied by a positive homothetic factor, and to get a global rotation, the direction parameters $(\theta_{di}, \theta_{fi})$ of each stroke are changed by the same positive or negative offset [1]. A second option offered by this model is to produce local deformations by individually varying the parameters. This mixes, in the same pattern, various kinds of deformations such as local scale changes, rotations, and local sharpening or smoothing of the trajectory. Such a mix can lead to large deformations; therefore, to get readable patterns one must choose small range factors [1]. In this study, the components of the range factor vector $k_p$ were empirically fixed using the following values: $k_{t_0} = 0.05$; $k_D = 1$; $k_{\theta_d} = k_{\theta_f} = 0.35$; $k_\mu = k_\sigma = 0.5$. Two main databases have been constructed: (1) a writer-dependent one, made up of patterns synthesized from the fiducial $[\bar{P}]_{ip}^{(w)}$ and the interval $[I]_{ip}^{(w)}$ of a given writer $w$, and (2) a writer-independent one, made up of patterns generated using the fiducial of the fiducials as a reference pattern and the global intervals as constructed from the between-subjects variability.

Figures 4a-f illustrate typical samples of a writer-dependent database, randomly selected from the 10000 patterns automatically generated from combinations of Sigma-Lognormal parameters randomly chosen within the interval $[I]_{ip}^{(2)}$ of writer #2. As one can see, the resulting synthetic patterns (Figures 4b-f) look like the fiducial one (Figure 3b or 4a). Moreover, comparing with the real specimens (Figures 2a-c), we can see that the generation process has qualitatively preserved the “style” of the writer.

Figures 5a-f depict typical samples of a huge writer-independent database, randomly selected from the 10000 multi-writer specimens automatically constructed as follows: a fiducial (Figure 5a) of the six fiducials (Figures 3a-f) has been calculated from the mean values of the parameters $[\bar{P}]_{ip}^{(w)}$ and, by applying the above processing, the variability has been circumscribed with the standard deviations of $[\bar{P}]_{ip}^{(w)}$, which define the intervals within which the parameters were randomly chosen. As one can see, the resulting patterns qualitatively preserve some similarity with the fiducial of the fiducials but look less similar to any of the six individual fiducials (Figure 3).
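The fiducial construction, the interval estimation and the random generation described in this section can be sketched as follows. This is an illustrative reading of the procedure and not the authors' code: the function names and the dictionary layout are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and the parameters are drawn uniformly inside their intervals, which is also an assumption since the text only states that they are randomly chosen. The range factors are those reported above.

```python
import random
import statistics

PARAMS = ("t0", "D", "mu", "sigma", "theta_d", "theta_f")

# Range factors k_p reported in the text.
K = {"t0": 0.05, "D": 1.0, "theta_d": 0.35, "theta_f": 0.35, "mu": 0.5, "sigma": 0.5}


def fiducial_and_intervals(specimens, k=K):
    """Return (fiducial, intervals) from several fitted specimens of one word.

    specimens[s][i][p] is the value of parameter p for stroke i in specimen s.
    """
    fiducial, intervals = [], []
    for i in range(len(specimens[0])):
        mean_i, interval_i = {}, {}
        for p in PARAMS:
            values = [spec[i][p] for spec in specimens]
            m = statistics.mean(values)
            sd = statistics.stdev(values)  # sample standard deviation (assumption)
            mean_i[p] = m
            interval_i[p] = (m - k[p] * sd, m + k[p] * sd)
        fiducial.append(mean_i)
        intervals.append(interval_i)
    return fiducial, intervals


def draw_pattern(intervals, rng):
    """Local deformations: draw each parameter uniformly inside its own interval."""
    return [
        {p: rng.uniform(lo, hi) for p, (lo, hi) in stroke.items()}
        for stroke in intervals
    ]


def scale_pattern(pattern, factor):
    """Global scale change: multiply every D_i by the same positive factor."""
    return [{**stroke, "D": stroke["D"] * factor} for stroke in pattern]


def rotate_pattern(pattern, offset):
    """Global rotation: add the same angular offset to theta_d and theta_f."""
    return [
        {**stroke,
         "theta_d": stroke["theta_d"] + offset,
         "theta_f": stroke["theta_f"] + offset}
        for stroke in pattern
    ]
```

Under these assumptions, a writer-dependent database would be obtained by calling `draw_pattern` repeatedly (10000 times in the experiment above) on the intervals of one writer and feeding each drawn parameter set to a trajectory synthesizer implementing equations (1) to (3). The writer-independent case can reuse `fiducial_and_intervals` unchanged by passing the six individual fiducials in place of the specimens of one writer.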

Figure 4: Typical samples from a huge writer-dependent database constructed from the fiducial depicted in (a), and the variability interval computed for subject #2. (Panels (a) to (f); axes: X (cm), Y (cm).)

If the control parameters are slightly varied, the style of the subject is globally preserved. This is quite apparent if one looks, for example, at the shape of the letter ‘n’. Although more work will be needed to provide quantitative comparisons of the writers’ styles, these simulations already highlight the wide possibilities offered by the software tool for the interactive construction of huge databases which reflect the wide but realistic variability of a single writer or a group of writers.

Figure 5. Typical examples from the 10000 simulation results obtained by randomly varying the Sigma-Lognormal parameters inside realistic intervals constructed from the fiducials of the six subjects. The pattern depicted in (a) represents the fiducial of the fiducials and (b-f) some distorted ones. (Panels (a) to (f); axes: X (cm), Y (cm).)

5. Conclusion
In this study, a software system dedicated to the extraction of Sigma-Lognormal parameters and the synthesis of handwriting patterns has been presented and used for the definition of word fiducials and their variability limits, as calculated from a few specimens produced by a single writer or by many writers. This tool relies on the implementation of the theoretical background of the Kinematic Theory, and particularly on the vectorial superimposition of primitive strokes having an erf direction profile and a lognormal velocity profile. This software package allows a user to create any kind of two-dimensional trajectory, such as models of letters, words, etc. A specific application dealing with the automatic generation of thousands of realistic writer-independent and writer-dependent patterns of a given word from a few specimens produced by six writers has been reported to highlight the flexibility of the tool.

However, improvements are necessary. For example, the parameter extraction process must be made entirely automatic, and assessments must be made to circumscribe the performance and limits of the tool in the generation of realistic training data. This work opens up interesting perspectives for the generation of huge databases of letters and words to develop and test online handwriting classifiers and recognizers. It could also be useful for the characterization of specific writing styles, whatever the language used.

Acknowledgment
This work was supported by grant RGPIN-915 from NSERC to Réjean Plamondon.

References
[1] Djioua M. and Plamondon R. (2007) Analysis and Synthesis of Handwriting Variability using the Sigma-Lognormal Model, Proceedings of the 13th Conference of the International Graphonomics Society, 13, 19-22.
[2] Djioua M., O’Reilly C. and Plamondon R. (2006) An interactive trajectory synthesizer to study outlier patterns in handwriting recognition and signature verification, Proc. of the 18th Int. Conf. on Pattern Recognition (ICPR’06), 1, 1124-1127.
[3] Kherallah M., Hadded L., Mitiche A. and Alimi A. M. (2008) On-Line Recognition of Handwritten Digits Based on Trajectory and Velocity Modeling, Pattern Recognition Letters, 29, 580-594.
[4] Liwicki M. and Bunke H. (2005) Handwriting Recognition of Whiteboard Notes, Proc. 12th Conf. of the Int. Graphonomics Society (IGS2005), 12, 118-122.
[5] Marquardt D. W. (1963) An algorithm for least-squares estimation of non-linear parameters, Journal of the Society for Industrial and Applied Mathematics, 11, 431-441.
[6] Mouchère H., Bayoudh S., Anquetil E. and Miclet L. (2007) Synthetic On-Line Handwriting Generation by Distortions and Analogy, Proc. 13th Conf. of the Int. Graphonomics Society (IGS2007), 13, 10-13.
[7] Plamondon R. and Djioua M. (2006) A multi-level representation paradigm for handwriting stroke generation, Human Movement Science, 25, 586-607.
[8] Plamondon R., Lopresti D., Schomaker L.R.B. and Srihari S. (1999) On-line handwriting recognition, in J. G. Webster (Ed.), Wiley Encyclopedia of Electrical and Electronics Engineering, 15, 123-146, ISBN 0-471-13946-7.
[9] Thoroughman K. A. and Shadmehr R. (2000) Learning of action through adaptive combination of motor primitives, Nature, 407, 742-747.
[10] Varga T., Kilchhofer D. and Bunke H. (2005) Template-based Synthetic Handwriting Generation for the Training of Recognition Systems, Proceedings of the 12th Conf. of the Int. Graphonomics Society, 12, 206-211.
[11] Woch A. and Plamondon R. (2004) Using the Framework of the Kinematic Theory for the Definition of a Movement Primitive, Motor Control, 8, 547-557.
