ISSN 2347 - 7911

International Journal for Innovations in Engineering, Science and Management
Available online at: www.ijiesm.com
Volume 2, Issue 9, September 2014

Handwritten character recognition system with Devanagari script (SWARS)

Kanak Upmanyu Dept. CSE, Integral University Lucknow, UP [email protected]

Mr. Shahid Hussain Dept. CSE, Integral University Lucknow, UP [email protected]

Dr. Rizwan Beg Dept. CSE, Integral University Lucknow,

[email protected]

Abstract- Automatic recognition of handwritten Hindi characters is very difficult because characters are written in many different ways: strokes may be curved or cursive, and the same character varies in size, dimension, orientation, format and thickness. In offline recognition, handwritten text on paper is scanned optically and processed by OCR (optical character recognition). Devanagari script has 13 vowels and 33 consonants. This paper presents an offline recognition system for handwritten Hindi vowels (SWARS) using a neural network, which can be used in common applications such as government records, commercial forms, bank cheques, post code recognition, bill processing systems, signature verification and passport readers. The network is trained with a gradient descent approach to recognize Devanagari script characters from document images.

Keywords- Neural network, Devanagari script, handwriting recognition, OCR, Feature extraction.

1. INTRODUCTION

Handwritten character recognition is an important area of image processing and pattern recognition. Its aim is to translate human-readable characters into machine-readable characters. Handwritten characters are non-uniform in nature: a particular character can be written in distinct sizes and styles by different people, and the same writer may write the same character in different styles at different times. Hindi handwritten characters are not always made of smooth curves and perfectly straight lines. Character recognition systems can be of two types:

1.1 OFFLINE SYSTEM

In an offline character recognition system, the image of a character is converted into a bit pattern by an optical digitizing device such as a camera or optical scanner. The recognizer can process and recognize both handwritten and printed text. All characters come from paper documents in scanned image format, and they are written in different ways.

1.2 ONLINE SYSTEM

In an online character recognition system, characters are recognized in real time. Online systems have timing information available for recognition: they record the position of the pen as a function of time directly from the interface.

2. LITERATURE SURVEY

The first OCR research report on handwritten Devanagari Hindi characters was published in 1977. In this paper, the implementation is done in MATLAB, which allows matrix manipulation, plotting of functions and implementation of algorithms. A two stage classification technique using ANN was proposed by Arora et al [2] and obtained a 90.74% recognition rate. A back-propagation based neural network recognition system was published in 2012 by Gunjan Singh and Sushma Lehri with a 93% recognition rate [1]. A fuzzy model based recognition system was proposed with an accuracy of 90.65%. Hammandlu and Murthy proposed a fuzzy model based recognition of Hindi numerals and characters and obtained 92.67% accuracy [10]. Training and classification of a neural network for handwritten character recognition using MATLAB was described by Ziga Zadnik [6]. A neural network based approach for recognition of Devanagari characters was proposed by Shilpa Srivastava Khare with 94% accuracy.

2.1 DEVANAGARI SCRIPT

Character recognition for Devanagari script is more complex because of its set of conjuncts. Sanskrit is a very ancient language that is no longer commonly spoken, although it is still in use today; Hindi is an expressive and attractive language. Devanagari has 33 consonants (VYANJANS) and 13 vowels (SWARS). Hindi is one of the official languages of India and an Indo-Aryan language. Devanagari is written from left to right. It is the third most language in India. An akshar may be formed by a vowel alone or by a combination of consonants with a vowel, and consonants may also combine with other consonants. Character recognition of Devanagari script is therefore a very difficult task.

2.2 OPTICAL CHARACTER RECOGNITION

OCR stands for optical character recognition. This technology allows a machine to recognize characters using an optical mechanism. Documents on paper are human readable, but a computer cannot understand them directly. OCR systems are developed to convert documents from human-readable to computer-readable form. OCR is an area of pattern recognition concerned with processing handwritten characters to improve communication between man and machine.

All Rights Reserved © 2014 IJIESM

3. PROPOSED RECOGNITION SYSTEM

The proposed recognition system has four stages:

Scanning: Handwritten data samples are taken on paper from five different people. All the data samples are scanned with an optical digitizing device such as an optical scanner or camera, which converts the scanned data into a bitmap image.

Pre-processing: Various operations are performed during pre-processing. Thresholding is used to convert the image from gray scale to binary form.

Feature extraction: The binary image is transformed into a one-dimensional vector. This vector is used to train the back-propagation neural network and to recognize the character.

Recognition using back-propagation neural network: In the proposed recognition system a back-propagation neural network is used for classification. A back-propagation neural network is a multilayer feed-forward network (multilayer perceptron, MLP) trained with the gradient-descent based delta learning rule, also called the back-propagation learning rule. An MLP consists of an input layer, an output layer and one or more hidden layers. Hidden layers lie between the input and output layers. Information flows in the forward direction only. The back-propagation learning rule minimizes the total squared error of the output computed by the network.

3.1 PROPOSED ALGORITHM

The system performs character recognition by exploiting gradient descent for its ability to recognize handwritten Hindi (SWARS) characters. The following steps are used.

A. A database of Hindi handwritten characters is created in different handwritings from 5 different persons.

B. Pre-processing of training image.
(a) Binarization of image using function. bw=im2bw()
(b) Edge detection of image using function. edge1=edge(uint8(bw2))
(c) Dilation of image using function. se=strel('square',3); e2=imdilate(edge1,se);

C. Extraction and scaling the normalized characters to 30x30 scale using boundary value analysis. imcrop() imresize(bw2,[30,30]);

4. RESULTS

Firstly, system training is done on Hindi (SWARS) characters written in five different handwritings. These are used as the data set or samples. The character set is shown in the following figure:

Figure 1. Sample data set
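
Each sample in the data set passes through the pre-processing and scaling steps of Section 3.1 before it reaches the network. The following is a minimal sketch of that pipeline in plain Python, not the authors' MATLAB code. The helper names, the threshold of 128, the assumption of dark ink on light paper and the nearest-neighbour scaling are illustrative choices that the paper does not specify, and the edge-detection step is left out for brevity.

```python
def binarize(gray, threshold=128):
    """Threshold a grayscale image (rows of 0-255 ints).
    Assumption: dark ink on light paper, so dark pixels become 1."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def dilate(bw):
    """Dilate with a 3x3 square: a pixel is 1 if it or any neighbour is 1."""
    h, w = len(bw), len(bw[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            out[r][c] = int(any(
                bw[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ))
    return out


def crop_to_bounding_box(bw):
    """Keep only the smallest rectangle that contains every 1 pixel."""
    rows = [r for r, row in enumerate(bw) if any(row)]
    cols = [c for c in range(len(bw[0])) if any(row[c] for row in bw)]
    if not rows:  # blank image: nothing to crop
        return bw
    return [row[cols[0]:cols[-1] + 1] for row in bw[rows[0]:rows[-1] + 1]]


def resize_nearest(bw, size=30):
    """Scale to size x size by nearest-neighbour sampling."""
    h, w = len(bw), len(bw[0])
    return [[bw[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def to_feature_vector(gray, size=30):
    """Binarize, dilate, crop, scale, then flatten row by row."""
    bw = resize_nearest(crop_to_bounding_box(dilate(binarize(gray))), size)
    return [px for row in bw for px in row]


# Toy input: a 5x6 white image with a short dark vertical stroke.
gray = [[255] * 6 for _ in range(5)]
for r in (1, 2, 3):
    gray[r][2] = 0

vector = to_feature_vector(gray)
print(len(vector))
```

With a 30x30 target size the flattened vector has 900 entries, which would be the width of the network's input layer under these assumptions.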

Character no.   Output code   No. of epochs   Recognition accuracy
1.              0001          950             95%
2.              0010          920             92.99%
3.              0100          800             94%
4.              0101          876             89%
5.              0110          954             93.68%
Average recognition accuracy                  92%

Table 1: Results
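
The average in the last row can be recomputed from the five per-character accuracies. The short check below assumes that the average is a plain arithmetic mean, which the paper does not state. Under that assumption the mean is about 92.93%, which agrees with the reported 92% when the fractional part is dropped.

```python
# Per-character recognition accuracies from Table 1, in percent.
accuracies = [95.0, 92.99, 94.0, 89.0, 93.68]

# Assumption: the reported average is an unweighted arithmetic mean.
mean_accuracy = sum(accuracies) / len(accuracies)
print(f"{mean_accuracy:.2f}%")
```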

Each character is entered as an input vector. In total, 65 characters are taken as scanned images. Of these, 70% are used for training and 30% for testing. The recognition results are shown in Table 1.
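
A 70/30 split of 65 samples does not come out to whole numbers (70% of 65 is 45.5), and the paper does not say how the samples were assigned. The sketch below shows one possible way to do it. The function name, the fixed seed, the random shuffle and rounding the training count down are all assumptions made for illustration.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and test sets.
    Assumption: the training count is rounded down."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 65 sample indices stand in for the 65 scanned character images.
train, test = split_samples(range(65))
print(len(train), len(test))
```

Rounding down gives 45 training and 20 test samples; rounding up would give 46 and 19.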

5. CONCLUSION

In this paper, we have presented a system for handwritten Hindi character recognition. Features are extracted by converting the gray-scale image to a binary image. Recognition approaches depend on the nature of the data to be recognized. Experimental results show that the back-propagation network achieves a recognition accuracy of 92%. Two hidden layers are used in the back-propagation feed-forward neural network, which allows it to recognize non-linear patterns with good accuracy. This is one of the strengths of the proposed method. In future, other features can be included to increase the recognition rate further.
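
To make the training procedure concrete, the following is a minimal sketch of a feed-forward network with two hidden layers, trained by gradient descent on the squared error as described above. It is written in plain Python and is not the authors' MATLAB implementation. The class name, layer sizes, sigmoid activation, learning rate, weight initialisation and the two toy patterns are all assumptions; the 4-bit targets only mimic the output codes of Table 1.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class MLP:
    def __init__(self, sizes, seed=0):
        rng = random.Random(seed)
        # weights[l][j] holds the weights into unit j of layer l+1;
        # the last entry of each row is the bias weight.
        self.weights = [
            [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes[:-1], sizes[1:])
        ]

    def forward(self, x):
        """Return the activations of every layer, input first."""
        activations = [list(x)]
        for layer in self.weights:
            inputs = activations[-1] + [1.0]  # append bias input
            activations.append([
                sigmoid(sum(w * v for w, v in zip(row, inputs)))
                for row in layer
            ])
        return activations

    def train_step(self, x, target, lr=0.5):
        """One gradient-descent update; returns the squared error
        E = 1/2 * sum((o - t)^2) measured before the update."""
        acts = self.forward(x)
        out = acts[-1]
        # Output-layer deltas: dE/dnet for sigmoid units.
        deltas = [(o - t) * o * (1 - o) for o, t in zip(out, target)]
        for l in range(len(self.weights) - 1, -1, -1):
            layer = self.weights[l]
            inputs = acts[l] + [1.0]
            # Back-propagate the deltas before changing this layer's weights.
            prev_deltas = [
                sum(layer[j][i] * deltas[j] for j in range(len(layer)))
                * a * (1 - a)
                for i, a in enumerate(acts[l])
            ]
            for j, row in enumerate(layer):
                for i in range(len(row)):
                    row[i] -= lr * deltas[j] * inputs[i]
            deltas = prev_deltas
        return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))


# Toy data: two 4-pixel "images" with 4-bit target codes.
patterns = [
    ([1, 0, 0, 1], [0, 0, 0, 1]),
    ([0, 1, 1, 0], [0, 0, 1, 0]),
]
net = MLP([4, 5, 5, 4])  # input, two hidden layers, output
first_error = sum(net.train_step(x, t) for x, t in patterns)
for _ in range(2000):
    last_error = sum(net.train_step(x, t) for x, t in patterns)
print(first_error > last_error)
```

For 30x30 character images the input layer would have 900 units instead of 4. The hidden-layer sizes are not given in the paper and would need to be chosen experimentally.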

6. REFERENCES

[1] Gunjan Singh and Sushma Lehri, "Recognition of Handwritten characters using back-propagation Neural network", 2012.
[2] Arora et al, "Two stage classification technique".
[3] Sunith Bandaru, "Handwritten character recognition using neural networks".
[4] Shruti Agrawal and Dr. Naveen Hemarjani, "Offline handwritten character recognition with Devanagari script", 2013.
[5] Neha Sahu and Shipra Gupta, "Neural network based approach for recognition for Devanagari characters", 2014.
[6] Prof. Primoz Protocnik and Ziga Zadnik, "Handwritten character recognition: Training a Simple NN for classification using MATLAB".
[7] Dr. Manu Pratap Singh, "Character Recognition using MATLAB".
[8] Govindan V. K. and Shivprasad A.P, "Character Recognition".
[9] Verma B.K, "Handwritten Hindi character recognition using Multilayer Perceptron and Radial basis function in neural networks".
[10] Hammandlu and Murthy, "Fuzzy model based recognition of Hindi numerals and characters".
