Pattern Recognition 33 (2000) 399–412

On-line recognition of cursive Korean characters using graph representation

Keechul Jung*, Hang Joon Kim

Department of Computer Engineering, Kyungpook National University, Sangyuk-dong, Pookgu, Taegu, 702-701, South Korea

Received 31 March 1997; accepted 4 March 1999

Abstract

The automatic recognition of cursive Korean characters is a difficult problem, not only due to the multiple possible variations in the shapes of characters, but also because of the interconnections of neighboring graphemes within an individual character. This paper proposes a recognition method for Korean characters using graph representation. This method uses a time-delay neural network (TDNN) and a graph-algorithmic post-processor for grapheme recognition and character composition, respectively. The proposed method was evaluated using multi-writer cursive characters in a boxed input mode. For a test data set containing 26,500 hand-written cursive characters, a 92.3% recognition rate was obtained. © 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Cursive character; Graph representation; TDNN; Grapheme interpretation graph; Viterbi; Grapheme

1. Introduction

The automatic recognition of cursive characters has recently enjoyed increased interest from researchers, especially in view of the development of pen-based notebook computers for a more natural and efficient man–machine interface. To develop a recognition method for a set of characters, we need to understand the characteristics of the character set. Korean, like Chinese and Japanese, is a large-alphabet language, and it has many similar characters constructed from graphemes. Moreover, on-line Korean characters vary widely in their number of strokes and shapes. These characteristics make it difficult to recognize on-line Korean characters. Korean character recognition methods can be classified according to their recognition unit: the character [1], the stroke [2] or stroke segment [3], and the grapheme [4,5]: (1) The use of the character as a unit requires excessive

* Corresponding author. Tel.: +82-53-940-8694; fax: +82-53-957-4846. E-mail address: [email protected] (K. Jung).

memory space and processing time. Accordingly, a large-classification technique is often used; however, this results in an incomplete algorithm with many errors. (2) Since there can be many variations of stroke segments for a given stroke, a cursive stroke necessarily includes many different representations. Whole cursive strokes have been used as recognition units in previous research, yet this has a drawback in that it restricts the addition of new strokes. (3) As Korean characters are composed of two or three graphemes, some researchers have used the grapheme as a recognition unit. For hand-printed Korean characters, this approach is efficient, since the character recognition problem is reduced to a grapheme recognition problem, which is simpler and suited to the composition rules of Korean characters. However, segmenting a hand-written character into graphemes without contextual knowledge is a very difficult task. In this paper, we propose a grapheme-based recognition method for Korean characters using a graph representation [6,7]. The graph representation of a character consists of a set of nodes and edges: nodes hold the grapheme recognition results, and edges hold the transition probabilities between two graphemes. After all the graphemes of a character are identified (a grapheme

0031-3203/00/$20.00 © 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S0031-3203(99)00062-X


Fig. 1. Block diagram of proposed system.

interpretation graph is constructed), a character interpretation score can be computed using the Viterbi algorithm. This character interpretation score ultimately determines the stroke segmentation positions. Such recognition-based segmentation produces many "tentative graphemes" (candidate cuts), because without contextual knowledge it is difficult to segment graphemes from a hand-written character. The character recognition scores determine the definite stroke segmentation (definite cuts) into graphemes, and once the graphemes of a character are identified, the character can be easily recognized. Most Korean character recognition systems try to segment graphemes from a character; however, this is not easy, as stroke contacts frequently occur due to various writing styles. Such a recognition-based segmentation method is frequently used for off-line character recognition and on-line English recognition [7–10]. In the proposed method, the boundaries of tentative graphemes are defined using heuristic segmentation points. Corner points are assumed to be the most appropriate heuristic segmentation points, since Korean characters are composed mostly of straight lines (except for circular graphemes). A TDNN is used to identify the graphemes. This network is a multilayer feed-forward network, the layers of which perform successively higher-level feature extractions to produce a final classification. TDNNs have also been very successful for speech and character recognition [6,11,12]. In the character composition stage, we identify the most appropriate path (character) in the graph produced by the TDNN. Fig. 1 shows a diagram of the processes and information available at the various stages of our system. This paper is organized as follows. A detailed explanation of each step of the proposed system is included in Section 2. Experimental results are outlined in Section 3, and the final conclusions are summarized in Section 4.

2. Methodology

The proposed recognition system consists of six steps: input data, preprocessing, candidate cut generation,

feature extraction, grapheme recognition, and character recognition. The block diagram in Fig. 1 outlines the proposed recognition system, and a detailed explanation of each step follows. In Fig. 1, the rounded rectangles give the names of the stages (denoting the activity in each step), and the rectangles beneath indicate the method or tool used in that stage. The input to the system is a set of handwritten components sampled by a digitizer. A component consists of a pair of vectors x(t), y(t), t ∈ {0, 1, …, i, …, n}, that define the tablet's pen position along the two orthogonal axes of its writing surface, sampled at fixed time intervals. Because the raw data contain numerous variations in writing speed, in stage (b) the data are resampled in order to obtain a constant number of regularly spaced points on the trajectory. Stage (c) defines "tentative graphemes" delimited by "corner points". These tentative graphemes are then passed to stage (d), which performs the feature extraction for each tentative grapheme. Stage (e) recognizes the graphemes using a TDNN. The recognition results are then gathered into a grapheme interpretation graph, and finally, in stage (f), the graph of grapheme recognition scores is used to identify the best path. This approach is introduced in Ref. [13].

2.1. Input data and preprocessing

An input character is a sequence of absolute (x, y) positions on a tablet. The tablet dimension used is 5×5 inches and the resolution is 500 LPI. In this study, no constraints are placed on the writers other than a boxed input mode. Writers are not given any guidelines about writing speed or style, only instructions about operating the tablet. To reduce geometric distortion, the input sequence of positions is resampled to points regularly spaced in arc length using linear interpolation (shown in Fig. 2b). The pen trajectory is slightly smoothed using


Fig. 2. Examples of preprocessing.

Fig. 3. Examples of candidate cut generation.

Eq. (1), where x_i denotes the x-position and y_i the y-position in the x–y coordinate system:

x_i = (−3x_{i−2} + 12x_{i−1} + 17x_i + 12x_{i+1} − 3x_{i+2}) / 35.   (1)

The arrows in Fig. 2a denote the writing direction in a stroke, which is the locus from the pen-down to the pen-up position. As shown in Fig. 2a, the strokes of a Korean character are usually directed from top to bottom and left to right.

2.2. Candidate cut generation

Corner points [14] are used for candidate cut generation. This method detects the local maximum convex or concave curvatures of a stroke using a chain-code representation. An example of detected corner points is shown in Fig. 3a. Intuitively, corner points are the pen-start, pen-end, and abrupt direction-change positions. In Fig. 3b, all strokes are numbered to show the order of the tentative graphemes.
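As a concrete illustration, the arc-length resampling of Section 2.1, the smoothing of Eq. (1), and a simple corner-point detector might be sketched as follows. This is an illustrative sketch, not the authors' implementation; in particular, the corner criterion here is a plain direction-change threshold rather than the chain-code method of Ref. [14].

```python
import numpy as np

def resample(x, y, n):
    """Resample a pen trajectory to n points regularly spaced in arc
    length, using linear interpolation (stage (b) of the system)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Cumulative arc length along the raw trajectory.
    s = np.concatenate([[0.0], np.cumsum(np.hypot(np.diff(x), np.diff(y)))])
    t = np.linspace(0.0, s[-1], n)       # evenly spaced target positions
    return np.interp(t, s, x), np.interp(t, s, y)

def smooth(p):
    """5-point smoothing of Eq. (1) on one coordinate sequence;
    the first and last two points are left unchanged."""
    p = np.asarray(p, float)
    q = p.copy()
    for i in range(2, len(p) - 2):
        q[i] = (-3*p[i-2] + 12*p[i-1] + 17*p[i] + 12*p[i+1] - 3*p[i+2]) / 35
    return q

def corner_points(x, y, threshold=np.pi / 4):
    """Mark pen-start, pen-end, and abrupt direction changes as corner
    points (candidate cuts): a point is a corner when the writing
    direction turns by more than `threshold` radians."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    theta = np.arctan2(np.diff(y), np.diff(x))   # per-segment direction
    turn = np.abs(np.diff(theta))
    turn = np.minimum(turn, 2 * np.pi - turn)    # wrap angle differences
    corners = [0] + [i + 1 for i in range(len(turn)) if turn[i] > threshold]
    return corners + [len(x) - 1]
```

Note that the coefficients of Eq. (1) sum to 35, so the smoothing filter leaves straight-line (and constant) trajectories unchanged.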

2.3. Feature extraction

At this stage, the feature vectors of all possible tentative graphemes are extracted for use as input data for the TDNN (grapheme recognizer). One or more segments are combined to form a tentative grapheme; accordingly, a tentative grapheme represents a combined segment that links one or more segments. These tentative graphemes are normalized to an appropriate input size before their feature vectors are extracted, in order to reduce the time and scale distortion in the hand-written data. Each tentative grapheme is normalized into seven feature vectors of 54 points each, with components bounded between 0 and +1. Some examples of tentative graphemes are shown in Fig. 4. An example of feature vectors is illustrated in Fig. 6. In this figure, time increases along the horizontal axis from left to right, and each line of boxes corresponds to one feature vector. The features include pen up/down, pen coordinates, direction, and curvature [11] (Fig. 5 shows the meaning of direction (θ) and curvature (φ)). The direction of the pen movement is represented by two features: the sine and cosine values of the direction for


Fig. 4. Examples of tentative graphemes.

each position are calculated. The curvature of the pen movement also has two features: the sine and cosine values of the angle between two consecutive line segments. The curvature and direction are normalized between 0 and +1. These feature vectors capture local information along the pen trajectory. They are then passed to a TDNN in order to extract higher-order features and finally classify the grapheme. All the components are bounded and vary between 0 and +1 as follows:

f_0 = 1 if the pen is up, 0 otherwise,
f_1 = (x − x_min) / (x_max − x_min),
f_2 = (y − y_min) / (y_max − y_min),
f_3 = (cos θ + 1)/2,
f_4 = (sin θ + 1)/2,
f_5 = (cos φ + 1)/2,
f_6 = (sin φ + 1)/2.

Fig. 5. Meaning of curvature (φ) and direction (θ).

In Fig. 6, position A indicates a drastic directional change; therefore, the darkness of the feature vectors representing direction and curvature also changes suddenly. Position B is in the middle of a gentle curve, and the intermediate representation around point B shows a non-changing curvature (features f_5 and f_6 do not vary much) and a smooth curve. Point C is on a straight line, indicated by the constant direction and zero curvature, as seen in the intermediate representation. All these position correspondences between an input character and the intermediate representation can be detected by careful observation.

2.4. Grapheme recognition

A TDNN is very useful for representing the time relationships between events [12]; thus, it was used as the grapheme recognition engine. The extracted feature vectors were used as the input to the TDNN in order to obtain recognition scores. These results were then accumulated into a grapheme interpretation graph, which was used to identify the best path when deciding on a result character.

2.4.1. Time-delay neural network architecture

For the recognition of graphemes, a three-layer time-delay neural network was constructed; its overall structure is shown in Fig. 7. At the lowest level, the number of input nodes is equal to the dimension of the input vector (each tentative grapheme is normalized into seven feature vectors of 54 points each). Each neuron has a time-restricted input field. The output layer gives the grapheme recognition scores: there is one output per grapheme class (and three outputs for nil graphemes), thereby providing scores for all the graphemes of the Korean characters.

2.4.2. Training and recognition

The TDNN was trained with graphemes from hand-printed and cursive character data (Fig. 14). At the start of the experiments, all the segmentations into graphemes were completed automatically, as only hand-printed data were used, which can easily be divided into segments. Accordingly, the grapheme recognition rate for the cursive data, as shown in Table 3, was relatively lower. The TDNN was trained with a back-propagation algorithm, the purpose of which was to minimize the sum of the squared errors between the TDNN outputs and the target values during training. The target value was set at +1 for a correct interpretation and at 0 for all other classes.
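The time-restricted input field described above can be viewed as a 1-D convolution over the frame sequence. A minimal numpy sketch of one such layer follows; the layer sizes are illustrative and do not reproduce the paper's exact architecture or trained parameters.

```python
import numpy as np

def tdnn_layer(frames, weights, bias):
    """One time-delay layer: every output unit looks at a sliding window
    of `delay` consecutive input frames, so its weights are shared
    across time. frames: (n_in, T); weights: (n_out, n_in, delay)."""
    n_in, T = frames.shape
    n_out, _, delay = weights.shape
    out = np.empty((n_out, T - delay + 1))
    for t in range(T - delay + 1):
        window = frames[:, t:t + delay]          # time-restricted field
        out[:, t] = np.tensordot(weights, window,
                                 axes=([1, 2], [0, 1])) + bias
    return np.tanh(out)

# Example input: 7 feature rows x 54 time steps, as in Section 2.3.
rng = np.random.default_rng(0)
frames = rng.standard_normal((7, 54))
hidden = tdnn_layer(frames,
                    rng.standard_normal((10, 7, 5)) * 0.1,   # 10 units, window 5
                    np.zeros(10))
print(hidden.shape)   # (10, 50)
```

Stacking such layers gives the successively higher-level feature extractions the paper describes, with the final layer producing one score per grapheme class.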


Fig. 6. Example of feature vectors from the data in Fig. 4. The tentative graphemes of the character shown in Fig. 4 have been resampled with 54 points. The color of each box indicates the value of the component, and all these components are bounded between 0 and +1 [6].
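The seven per-point features of Section 2.3 might be computed as follows. This is a sketch; the exact angle conventions for θ and φ are assumptions based on Fig. 5.

```python
import numpy as np

def grapheme_features(x, y, pen_up):
    """Compute per-point features f0..f6: pen state, normalized
    coordinates, and the sine/cosine of the direction theta and of the
    turning angle phi, each scaled into [0, 1]."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    span = lambda v: max(v.max() - v.min(), 1e-12)   # avoid divide-by-zero
    f0 = np.asarray(pen_up, float)                   # 1 if pen is up
    f1 = (x - x.min()) / span(x)                     # normalized x
    f2 = (y - y.min()) / span(y)                     # normalized y
    theta = np.arctan2(np.gradient(y), np.gradient(x))   # direction
    phi = np.concatenate([[0.0], np.diff(theta)])        # curvature
    f3, f4 = (np.cos(theta) + 1) / 2, (np.sin(theta) + 1) / 2
    f5, f6 = (np.cos(phi) + 1) / 2, (np.sin(phi) + 1) / 2
    return np.stack([f0, f1, f2, f3, f4, f5, f6])
```

Resampling each tentative grapheme to 54 points before this step yields the fixed-size 7×54 input expected by the TDNN.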

Fig. 7. Time-delay neural network: (a) what is done at a neuron used in (b); (b) architecture of the network; (c) parameters of the network architecture [11].

The training of the TDNN was achieved with several sets of 1183 grapheme examples (corresponding to 500 selected characters) produced by eight writers. Some characters consisted of two graphemes, that is, a first consonant and a vowel, while other characters had three graphemes, including a last consonant. The actual distribution of all the training sets of graphemes extracted from the hand-printed data is shown in Fig. 8. The horizontal axis denotes 67 Korean graphemes. The distribution of samples among the different grapheme classes in the training set was non-uniform, and some of them were omitted. However, the omitted graphemes are of no concern: since they were not tested, they had no effect on the recognition rate. These missing graphemes are also rarely used in daily life. As shown in Fig. 15, the recognition rates of the

grapheme classes that had a relatively small class distribution were somewhat lower than those of the other classes. The recognizer was trained with both valid graphemes and "counter-examples". Counter-examples are meaningless graphemes (nil graphemes), which should receive a nil interpretation. These include stroke segments that connect two graphemes or stroke segments within a grapheme; they are given three outputs in the TDNN's output layer because they have many variations in their feature vectors (Fig. 14c). Even though only hand-printed data were used to train the grapheme recognizer, a reasonable recognition rate for cursive data could still be acquired. When the recognition rate for cursive characters exceeds about 90%, training data from cursive characters can be automatically collected using the proposed recognition


Fig. 8. Distribution of the graphemes used in training and testing.

Fig. 9. All graphemes derived from Korean characters.

system. Accordingly, this saves on the labor required to manually segment cursive characters and slightly enhances the grapheme recognizer. Mis-recognized characters must still be manually segmented for use as training data.

2.5. Character recognition: searching for the best path on the graph

Korean has different characteristics from English or Chinese. To build a good recognition system, it is important to know the characteristics of the target characters. The following are the basic structure and composition rules of Korean characters. Ten of the simple graphemes are vowels and the rest are consonants. A character is composed of 1–4 simple consonants and 1–3 simple vowels. As this composition rule is too complicated, most researchers define and use complex graphemes. If complex graphemes are adopted, a single character includes one first consonant, one vowel, and one optional last consonant. Complex vowels and consonants are made by combining simple vowels and simple consonants, respectively. Fig. 9 shows the complete set of graphemes derived from Korean characters. Korean characters have a two-dimensional structure. Fig. 10 shows each type, where VV denotes a vertical vowel, HV a horizontal vowel, C1 the first consonant, and C2 the last consonant. For every character, there must be one first consonant and at least one vowel. If it

Fig. 10. The structure of Korean characters: (a) general structure; (b) six types of Korean characters.

exists, the first consonant should be on the left of a vertical vowel and on top of a horizontal vowel. The optional last consonant is below the first consonant and vowel. Except for the vertical vowel, all graphemes have vertical relationships in two dimensions. There are 11,172 possible characters under the simple grapheme combination rules, and about 3000 of them are used daily. The frequency of each character, grapheme, and stroke is a very important factor in developing a character recognition system. The frequency of each character was checked from newspaper articles: from three months of articles, 175,955 characters were selected and their frequencies tabulated. From the results, it was clear


Fig. 11. Grapheme interpretation graph produced by the TDNN. (a), (b) and (c) represent the recognized graphemes, thereafter, these graphemes are combined into one character.

that only a small portion of the possible 11,172 characters is used frequently. Therefore, it could be determined that only 500 characters cover 99% of Korean text. Fig. 8 also demonstrates a huge difference in usage among the various graphemes. In this research, character recognition is associated with a graph whose nodes contain grapheme recognition scores. A one-to-one correspondence exists whereby every path through the graph branches corresponds to a particular legal segmentation of the input character and, conversely, every possible legal segmentation of the input character corresponds to a particular path through the graph. In Fig. 11, the vertical and horizontal lines indicate the start and end segment numbers of tentative graphemes, respectively. The various shaded blocks in Fig. 11 signify the recognized results for each sequence of a tentative grapheme: the darker the block, the higher the recognition score. Each tentative grapheme has a double index, like the i–j grapheme in Fig. 11, which connects the ith segment of the input character at the grapheme starting point to the jth segment at the grapheme ending point. For example, Fig. 11a represents the recognition score for the combined tentative grapheme of the input character covering segments one through six.

In this graph representation, the Viterbi algorithm provides a convenient method for rapidly determining the best-scoring path (corresponding to an interpretation for a character). There are many possible character interpretations in the graph, extending from the starting node of the first column to the terminal node. The probability of a given path is the product of all its node and edge values. Each node in the graph is assigned a value derived from the recognizer score for the tentative grapheme corresponding to that node. An edge value signifies the transition probability between two graphemes. The probabilities needed to build a conventional HMM-type network are denoted as {π, A, B} in HMM notation. These statistics can be calculated by examining the characters within the database and tabulating the frequency with which each grapheme follows every other grapheme. All of these statistical counts are then normalized to yield the final probabilities. The following is the notation used in Fig. 12. In the implementation, the outputs of the TDNN were treated as symbol probabilities (the B matrix of the HMM; Eq. (2)). X(i, j) represents the node in the grapheme interpretation graph for a tentative grapheme:

x(i, j) = (the value of the TDNN output node corresponding to grapheme x) / (the sum of the values of all TDNN output nodes).   (2)

P(Y(i, j) | X(a, b)) is the transition probability from node X(a, b) to node Y(i, j) on the grapheme interpretation graph. Table 1a presents the initial state probability distribution of the first consonant from the grapheme interpretation graph (Eq. (3)). This can


Fig. 12. Search algorithm.

be denoted as p(x(1, m) | x(0, 0)) in the algorithm of Fig. 12. As shown in Table 1, graphemes such as ( ) are frequently used as the first consonant.

Table 1b presents the transition probabilities from first consonants to vowels, whereas Table 1c presents those from vowels to last consonants. These are given by the following equations:

n(x) = (the number of characters beginning with x) / (the total number of characters),   (3)

p(y | x) = (the number of transitions from x to y) / (the number of transitions from x).   (4)
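Eqs. (3) and (4) amount to simple frequency counts over a character database. They might be implemented as follows; the grapheme labels here are placeholder strings, not the actual Korean grapheme set.

```python
from collections import Counter

def estimate_probabilities(characters):
    """Estimate the initial probabilities n(x) of Eq. (3) and the
    transition probabilities p(y|x) of Eq. (4) from a list of
    characters, each given as a sequence of grapheme labels."""
    first = Counter(seq[0] for seq in characters)
    trans, context = Counter(), Counter()
    for seq in characters:
        for a, b in zip(seq, seq[1:]):
            trans[(a, b)] += 1     # count of transition a -> b
            context[a] += 1        # count of transitions out of a
    n = {g: c / len(characters) for g, c in first.items()}
    p = {(a, b): c / context[a] for (a, b), c in trans.items()}
    return n, p
```

Normalizing by the number of transitions out of each grapheme ensures that the outgoing probabilities from any grapheme sum to one.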


Table 1. Probabilities used in composing graphemes

Character interpretation is achieved by searching for the best path using the Viterbi algorithm. In Fig. 12, a formal statement of the algorithm is outlined, which is actually just a simple version of forward dynamic programming.

When investigating the best path using the Viterbi algorithm, some improper (uncomposable) character interpretations can occur. These search paths are pruned, whereby any path that corresponds to an improper


Fig. 13. Korean grapheme composition rule.

Fig. 14. Examples of the training and test data (a) hand-printed data; (b) cursive data; (c) ligature (dotted line indicates nil graphemes).

character interpretation can be removed from the graph. This pruning is implemented within the Viterbi algorithm: the algorithm tests whether a path obeys the Korean grapheme composition rule, namely that a Korean character is written structurally with a head consonant, a vowel, and an optional bottom consonant. For example, in a Korean character, either a ligature or a vowel can follow a first consonant, and either a ligature or a last consonant can follow a vowel. Fig. 13 illustrates the basic Korean composition rule. Each node of Fig. 13 indicates a possible state in composing a character, and the double-lined circle signifies a terminal node where the character composition is completed. According to the algorithm in Fig. 12, the transition probability between graphemes, denoted as p(x(m, i−1) | x(i, j)), is only multiplied in the case of a solid-line transition.
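The search of Fig. 12 is forward dynamic programming over the grapheme interpretation graph. A simplified sketch follows: it maximizes the product of node scores and transition weights, but omits the composition-rule pruning described above, and the data layout is an assumption rather than the paper's implementation.

```python
def best_path(n_segments, node_score, trans):
    """node_score maps (i, j) -> (grapheme, score) for the tentative
    grapheme covering segments i..j (1-based, inclusive); trans(a, b)
    is the transition weight between consecutive graphemes. Returns
    (score, [(i, j, grapheme), ...]) for the best segmentation."""
    best = {0: (1.0, [])}        # best[j]: best interpretation of 1..j
    for j in range(1, n_segments + 1):
        candidates = []
        for i in range(1, j + 1):
            if (i, j) not in node_score or (i - 1) not in best:
                continue
            prev_score, prev_path = best[i - 1]
            grapheme, score = node_score[(i, j)]
            # Transition weight from the previous grapheme, if any.
            w = trans(prev_path[-1][2], grapheme) if prev_path else 1.0
            candidates.append((prev_score * w * score,
                               prev_path + [(i, j, grapheme)]))
        if candidates:
            best[j] = max(candidates, key=lambda c: c[0])
    return best.get(n_segments)
```

The composition-rule pruning would be added by also tracking the state of Fig. 13 in each partial path and discarding extensions that leave no route to a terminal node.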

3. Experimental results

In order to verify the proposed recognition method, training and test data were used as shown in Fig. 14. Experiments were performed in several different environments, and an ACECAT tablet was used as an alternative input device. The experiments used a database containing a total of 60,079 unconstrained on-line handwritten Korean characters. This database

consisted of four subsets. Database 1 was generated by eight different writers with a few constraints (e.g., consecutive graphemes could not be connected, i.e., non-cursive). Database 2 was from KAIST (Korea Advanced Institute of Science & Technology) and was written with no constraints. Databases 3 and 4 were written by 30 individuals with no constraints. All characters were written in a boxed input mode. Table 2 shows the number of characters in the training and test sets. The TDNN was simulated in a C-language program on an IBM Pentium-compatible machine. Character recognition speed varied from 0.12 s to 2.27 s depending on the character. Accordingly, an additional feature that can cope with this recognition-speed variation needs to be identified. Fig. 15 presents the recognition results for each grapheme. For example, the distributions of and are smaller than the others; accordingly, the recognition rates of these graphemes are lower. The grapheme recognition rates are presented in Table 3; this result was obtained after training the TDNN using only the hand-printed data. Fig. 16 shows an example of character composition. The nodes marked with a circle are definitively segmented graphemes based on a character interpretation result. Table 4 presents the recognition results. The evaluation of the proposed recognition system was completed in three steps. Test 1 used only hand-printed data


Table 2. Database used in experiments

Database  Characteristics   For training                  For testing
DB1       Hand-printed      500 characters × 8 writers
DB2       From KAIST        19,579 characters             17,500 characters
DB3       Cursive           500 characters × 8 writers
DB4       Cursive           500 characters × 20 writers   500 characters × 10 writers

Fig. 15. Grapheme recognition result for each grapheme.

Table 3. Experimental grapheme recognition results

                        Hand-printed data   Cursive data
No. of graphemes        5372                10,744
Correct recognitions    5116                9242
Recognition rate (%)    95.2                86.0

as training data, whereas tests 2 and 3 included both hand-printed and cursive data. The average recognition rate for cursive characters was about 92.3% (test 3). When the TDNN was trained with more data, slightly higher rates were achieved. The data in Table 4 also indicate a better performance for the proposed recognition system compared with that of Ref. [8]. Accordingly, the proposed


Fig. 16. An example of character composition: (a) input character; (b) grapheme interpretation graph.

Table 4. Recognition results

Test No.  Training data        Testing data   Ref. [8] (%)   Proposed method (%)
1         DB1                  DB1            100            100
                               DB2            78.7           84.3
                               DB3            83.2           91.9
                               DB4            79.3           80.2
2         DB1+DB2              DB1            100            100
                               DB2            90.8           93.2
                               DB3            82.1           92.1
                               DB4            82.7           86.4
3         DB1+DB2+DB3+DB4      DB1            100            100
                               DB2            89.2           91.4
                               DB3            91.2           94.8
                               DB4            90.7           93.5

Fig. 17. Examples of mis-recognition.

method can produce a reasonable recognition rate and can be applied in a variety of practical environments. Fig. 17 shows examples of mis-recognition. The main causes of recognition errors are: (1) difficulty in segmenting a cursive character stroke into tentative graphemes; and (2) as shown in the second mis-recognition example, the recognizer sometimes cannot discriminate between a ligature and a grapheme segment.


References

Table 5. Error ratio according to error type

Mis-recognition type                    Rate (%)
Grapheme mis-recognition of TDNN        21.9
Character composition error             10.7
Candidate cut generation error          22.4
Ambiguous character                     20.2
Excessive hook                          15.4
Etc.                                     9.4

Table 5 illustrates the various error types produced by the proposed recognition system.

4. Conclusion

This paper has presented a grapheme-based Korean character recognition method using graph representation. Grapheme recognition results were used to determine definite stroke segmentation points, and a TDNN was utilized as the grapheme recognizer. A TDNN has desirable properties related to the dynamic structure of an on-line character: it can detect the temporal structure of characters and their temporal relationships, and the features learned by the network are insensitive to time shifts. At the character composition stage, the combination of the Viterbi search with a Korean grapheme composition rule provides the capability to cope with non-ideal cases, including the addition of redundant strokes between well-separated strokes in a hand-printed letter, the deletion of short strokes between graphemes, abrupt changes between curves, or smooth curves between abrupt strokes. Complex graphemes and graphemes with many corner points, like , require a longer time for recognition. The development of a more refined composition rule for graphemes will be attempted to minimize these longer recognition times. The fifth and sixth mis-recognition examples indicate the drawback of using corner points as segmentation points. Accordingly, additional information needs to be identified to compensate for this shortcoming. Future research will investigate the application of the proposed method to character segmentation in an on-line run-on mode.

[1] H.D. Lee, T.K. Kim, T. Agui, M. Nakajima, On-line recognition of cursive Hangul by extended matching method, J. KITE 26 (1) (1989) 29–37.
[2] P.K. Kim, J.Y. Yang, H.J. Kim, On-line cursive Korean character recognition using extended primitive strokes, The Third PRICAI (Pacific Rim International Conference on A.I.), August 1994, pp. 816–821.
[3] O.S. Kwon, Y.B. Kwon, An on-line recognition of Hangul handwriting using dynamic generation of line segments, Proceedings of the Twentieth KISS Spring Conference, 1993, pp. 151–154.
[4] T.K. Kim, E.J. Rhee, On-line recognition of successively written Korean characters by suitable structure analysis for Korean characters, J. Korea Inform. Sci. Soc. (KISS) 20 (6) (1988) 171–181.
[5] D.G. Sim, Y.K. Ham, R.H. Park, Online recognition of cursive Korean characters using DP matching and fuzzy concept, Pattern Recognition 27 (12) (1994) 1605–1620.
[6] H. Weissman, M. Schenkel, I. Guyon, C. Nohl, D. Henderson, Recognition-based segmentation of on-line run-on handprinted words: input vs. output segmentation, Pattern Recognition 27 (3) (1994) 405–420.
[7] K.C. Jung, S.K. Kim, H.J. Kim, Recognition-based segmentation of on-line cursive Korean characters, Proc. ICNN 6 (1996) 3101–3106.
[8] K.C. Jung, S.K. Kim, H.J. Kim, Grapheme-based on-line recognition of cursive Korean characters, J. KITE 33-B (9) (1996) 124–134.
[9] D.J. Lee, S.W. Lee, A new methodology for gray-scale character segmentation and recognition, Proceedings of the Third International Conference on Document Analysis and Recognition, Montreal, Canada, August 1995, pp. 524–527.
[10] J.H. Bae, S.H. Park, H.J. Kim, Character segmentation and recognition in Hangul document with alphanumeric characters, J. KISS 23 (9) (1996) 941–949.
[11] I. Guyon, P. Albrecht, Y. Le Cun, J. Denker, W. Hubbard, Design of a neural network character recognizer for a touch terminal, Pattern Recognition 24 (2) (1991) 105–119.
[12] A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, K. Lang, Phoneme recognition using time-delay neural networks, IEEE Trans. Acoust. Speech Signal Process. 37 (1989) 328–339.
[13] C.J.C. Burges, O. Matan, Y. Le Cun, J. Denker, L.D. Jackel, C.E. Stenard, C.R. Nohl, J.I. Ben, Shortest path segmentation: a method for training neural networks to recognize character strings, IJCNN'92, Baltimore, vol. 3, IEEE Press, New York, 1992.
[14] X. Li, N.S. Hall, Corner detection and shape classification of on-line handprinted Kanji strokes, Pattern Recognition 26 (9) (1993) 1315–1334.

About the Author: KEECHUL JUNG has been a Ph.D. student in the Artificial Intelligence Laboratory of the Department of Computer Engineering at Kyungpook National University since March 1996. He received the degree of Master of Science in Computer Engineering. His research interests include image processing and pattern recognition.


About the Author: HANG JOON KIM received the B.S. degree in Electrical Engineering from Seoul National University, Seoul, South Korea, in 1977, the M.S. degree in Electrical Engineering from the Korea Advanced Institute of Science and Technology in 1979, and the Ph.D. degree in Electronic Science and Technology from Shizuoka University, Japan, in 1997. From 1979 to 1983, he was a full-time Lecturer at the Department of Computer Engineering, Kyungpook National University, Taegu, South Korea, and from 1983 to 1994 an Assistant and then Associate Professor in the same department. Since October 1994, he has been a Professor at Kyungpook National University. His research interests include image processing, pattern recognition, and artificial intelligence.
