Sign Language Translation (Translasi Bahasa Isyarat) - Digilib ITS

SIGN LANGUAGE TRANSLATION

By: Juniar Prima Rakhman

7405 040 043

Tutors: Nana Ramadijanti, S.Kom, M.Kom; Edi Satriyanto, S.Si, M.Si

Introduction

• Hearing-impaired people use sign language as their primary means of communication.
• Sign language uses hand shape, facial expression, movement and body gesture.
• Sign language is unfamiliar to most people.
• Communication between hearing-impaired and hearing people can therefore be difficult.
• There is a need for a system that translates sign language into spoken or written language.

Objective

• To create a system that helps simplify communication between hearing-impaired people and hearing people (i.e. people who are unfamiliar with sign language).

Problems

• How to detect the user's hand?
• How to extract the shape of the given hand sign?
• How to classify the hand shape?

Limitations

• One user at a time.
• Sufficient light intensity is required.
• Only translates ASL (American Sign Language) or SIBI (Sistem Isyarat Bahasa Indonesia).
• Does not translate movement, facial expression or body gesture.
• Only translates the static finger-spelling signs from A to Z.
• The background must not have the same color as skin.

System Design

[Figure: block diagram of the system, grouped into three stages: Processing, Training and Classification.]

Blocks named in the diagram: Object Detection, Frame Capture, Skin Detection, Noise Removal, Thresholding, Motion Detection, Data Training, Assign Class Label, Input, Normalize, Convert to Feature Vector, Build Feature Space, Vote for K nearest data.

Object Detection (Haar Classifier)

1. The method used to detect the hand (open and closed) in this final project is the Haar Classifier.
2. It builds a boosted rejection cascade, which works by rejecting negative data in order to arrive at the positive data.
3. It is a supervised learning method, so it needs training data (both positive and negative samples) to detect a given object.
4. After training, a cascade is built. Its stages work like a decision tree, deciding which objects to detect and which to ignore.
5. In this project the Haar Classifier is used to initialize the region of interest (ROI): the user's hand must be inside this box to be classified.
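The early-rejection idea behind a cascade can be illustrated with a toy sketch. This is not the OpenCV Haar cascade and it uses no Haar features or boosting. The two stage tests, their thresholds and the name `run_cascade` are hypothetical, chosen only to show that a window is dropped as soon as one stage rejects it.

```python
from typing import Callable, List, Sequence

Window = Sequence[Sequence[int]]   # grayscale patch, rows of 0-255 values
Stage = Callable[[Window], bool]   # True means "may still contain the object"


def mean_brightness(window: Window) -> float:
    pixels = [p for row in window for p in row]
    return sum(pixels) / len(pixels)


def contrast(window: Window) -> int:
    pixels = [p for row in window for p in row]
    return max(pixels) - min(pixels)


def run_cascade(window: Window, stages: List[Stage]) -> bool:
    """Accept a window only if every stage accepts it."""
    for stage in stages:
        if not stage(window):
            return False  # rejected early, later stages are never evaluated
    return True


# Hypothetical stages with made-up thresholds, cheapest test first.
stages: List[Stage] = [
    lambda w: mean_brightness(w) > 50,
    lambda w: contrast(w) > 30,
]

flat_dark = [[10, 10], [10, 10]]
textured = [[60, 120], [90, 200]]

print(run_cascade(flat_dark, stages))  # False, rejected by the first stage
print(run_cascade(textured, stages))   # True, passed every stage
```

In a real cascade each stage is a boosted classifier learned from the positive and negative training samples, but the control flow is the same: most windows are discarded by the first few cheap stages.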

[Figures: Object Detection Result (1), (2) and (3)]

Object Detection at Various Angles

V = detected, X = not detected.

Subject      Straight   30°    45°    90°    180°
1            V          V      V      X      X
2            V          X      X      X      X
3            V          V      X      X      X
4            V          X      X      X      X
5            V          X      X      X      X
6            V          V      X      X      X
7            V          V      X      X      X
8            V          V      X      X      X
9            V          V      X      X      X
10           V          X      X      X      X
Percentage   100%       60%    10%    0%     0%

Object Detection at Various Distances

V = detected, X = not detected.

Subject      30 cm   50 cm   70 cm   90 cm   110 cm
1            V       V       V       V       X
2            V       V       X       X       X
3            V       V       V       V       X
4            V       V       V       V       X
5            V       V       V       X       X
6            V       V       V       X       X
7            V       V       V       X       X
8            V       V       X       X       X
9            V       V       V       X       X
10           X       X       V       V       X
Percentage   90%     90%     80%     40%     0%

Skin Color Values

RGB Color Model: r > 95 and g > 40 and b > 20 and max(r,g,b) - min(r,g,b) > 15 and abs(r-g) > 15 and r > g and r > b and g > b



HSV Color Model: H = 48
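The RGB rule can be written as a per-pixel test. The sketch below assumes 8-bit channel values and assumes the garbled term in the rule reads `max(r,g,b) - min(r,g,b) > 15`. The function names `is_skin_rgb` and `skin_mask` are illustrative and are not taken from the project.

```python
def is_skin_rgb(r: int, g: int, b: int) -> bool:
    """Apply the RGB skin rule from the slide to one 8-bit pixel."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b and g > b
    )


def skin_mask(image):
    """Return a binary mask (1 = skin) for an image given as rows of (r, g, b) tuples."""
    return [[1 if is_skin_rgb(*pixel) else 0 for pixel in row] for row in image]


image = [[(200, 120, 90), (0, 0, 255), (255, 255, 255)]]
print(skin_mask(image))  # [[1, 0, 0]]
```

The mask is the binary image that the later noise removal and normalization steps operate on.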

Skin Detection Result

[Figure panels: Original, RGB, HSV]

Skin Color Detection Comparison (1) to (3)

Comparison of skin color detection at dim light (1), medium light (2) and intense light (3).

[Figure panels at each light level: Original, RGB, HSV]

Frame rates shown under the panels, identical at all three light levels:
• Original: 30 FPS
• RGB: 23-24 FPS
• HSV: 12-13 FPS

Skin Detection on HSV model

Advantages:
• Accurately separates skin-colored pixels from non-skin pixels at low to medium light intensity.
• Uses only two variables (H and S). V (value/intensity) is ignored, so skin color can still be detected accurately in dim light.

Disadvantages:
• High cost of conversion.
• High CPU load (low FPS processed).
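Why ignoring V helps in dim light can be seen in an idealized example. The sketch below uses Python's standard `colorsys` module and assumes that dimming simply scales all three channels by the same factor, which real cameras only approximate. The pixel values are made up for illustration.

```python
import colorsys


def hue_saturation_value(r: int, g: int, b: int):
    """Convert one 8-bit RGB pixel to (h, s, v), each in the range 0 to 1."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


bright = (200, 120, 90)  # hypothetical skin-like pixel
dim = (100, 60, 45)      # the same color at half the intensity

h1, s1, v1 = hue_saturation_value(*bright)
h2, s2, v2 = hue_saturation_value(*dim)

# H and S are unchanged by the uniform scaling; only V drops.
print(round(h1, 4) == round(h2, 4), round(s1, 4) == round(s2, 4), v1 > v2)
```

A threshold on H and S therefore gives the same answer for both pixels, while a rule with absolute RGB thresholds such as `r > 95` accepts the bright pixel but may reject the dim one. The price is the per-pixel conversion, which matches the lower frame rate reported for HSV.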

Noise Removal and Thresholding

Erode: [image]

Dilate: [image]
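As a minimal sketch of the two operations on a binary image: the code below assumes a 3 x 3 square structuring element and treats pixels outside the image as 0. Both are assumptions, since the slide does not give the kernel or the border handling used in the project.

```python
def _neighborhood(img, y, x):
    """Yield the 3x3 neighborhood of (y, x); pixels outside the image count as 0."""
    height, width = len(img), len(img[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            yield img[ny][nx] if 0 <= ny < height and 0 <= nx < width else 0


def erode(img):
    """Keep a pixel only if its whole 3x3 neighborhood is set."""
    return [[1 if all(_neighborhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 neighborhood is set."""
    return [[1 if any(_neighborhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


noisy = [
    [1, 0, 0, 0, 0],  # isolated noise pixel in the corner
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
cleaned = dilate(erode(noisy))
for row in cleaned:
    print(row)
```

Eroding first removes the isolated pixel and shrinks the 3 x 3 blob to its center, and dilating afterwards grows the blob back to its original size, so only the noise is lost.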

Motion Detection

• Motion is detected by comparing the average pixel difference between the current and the previous frame.
• If no motion is detected, the image is classified.
• If there is too much motion (e.g. the user's hand is waving), the ROI is reset.
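The frame-differencing decision can be sketched as follows. The two thresholds and the intermediate "wait" state are assumptions made for illustration, since the slide gives no numeric values and does not say what happens between "no motion" and "too much motion".

```python
def mean_abs_diff(prev_frame, curr_frame):
    """Average absolute difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += abs(p - c)
            count += 1
    return total / count


def motion_state(prev_frame, curr_frame, still_threshold=2.0, reset_threshold=40.0):
    """Map the average difference to an action (thresholds are hypothetical)."""
    diff = mean_abs_diff(prev_frame, curr_frame)
    if diff < still_threshold:
        return "classify"   # hand is still, classify the current image
    if diff > reset_threshold:
        return "reset_roi"  # too much motion, e.g. a waving hand
    return "wait"           # assumed: some motion, keep watching


previous = [[10, 10], [10, 10]]
print(motion_state(previous, [[10, 10], [10, 10]]))      # classify
print(motion_state(previous, [[20, 20], [20, 20]]))      # wait
print(motion_state(previous, [[200, 200], [200, 200]]))  # reset_roi
```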

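Normalization

public IplImage normalisasi(IplImage imgSrc, int lebar, int tinggi)

public CvRect cariBoundingBox(IplImage imgSrc)

(normalisasi = normalize, lebar = width, tinggi = height, cariBoundingBox = find bounding box)

[Figure: a hand image imgSrc captured at far, medium and near distance, passed through the steps below.]

1. Find the bounding box of the hand.
2. Normalize the aspect ratio (width = height).
3. Scale to 150 x 150 pixels.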
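The normalization steps can be sketched on a binary image stored as nested lists. This is not the project's `normalisasi` or `cariBoundingBox` code. It assumes the aspect ratio is equalized by zero-padding the shorter side and that scaling uses nearest-neighbor sampling, neither of which the slide specifies.

```python
def find_bounding_box(img):
    """Return (top, left, bottom, right), inclusive, of the set pixels, or None if empty."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]


def normalize(img, size=150):
    """Crop to the bounding box, pad to a square, then scale to size x size."""
    box = find_bounding_box(img)
    if box is None:
        return [[0] * size for _ in range(size)]
    top, left, bottom, right = box
    crop = [row[left:right + 1] for row in img[top:bottom + 1]]
    height, width = len(crop), len(crop[0])
    side = max(height, width)
    # Assumed: pad with zeros (centered) so that width equals height.
    pad_top, pad_left = (side - height) // 2, (side - width) // 2
    square = [[0] * side for _ in range(side)]
    for y in range(height):
        for x in range(width):
            square[pad_top + y][pad_left + x] = crop[y][x]
    # Nearest-neighbor scaling to the target size.
    return [[square[y * side // size][x * side // size] for x in range(size)]
            for y in range(size)]


hand = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(find_bounding_box(hand))  # (1, 1, 1, 2)
print(normalize(hand, size=4))
```

Because every image is cropped to its bounding box first, a hand far from the camera and a hand near the camera end up filling the same 150 x 150 frame.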

Classifying Data

1. The system checks whether the data has already been trained. If not, it loads the sample data to create the feature space.
2. Normalize each binary image to 150 x 150 pixels.
3. Convert both the input and the sample data from pixel matrices to vectors.
4. Build the feature space.
5. Classify the input by taking the majority class among its K nearest data points.

K Nearest Neighbors

1. Determine the parameter K (the number of nearest neighbors).
2. Calculate the distance between the new vector and every vector in the feature space (the training data).
3. Sort the distances and take the K smallest as the nearest neighbors.
4. Gather the category Y of each nearest neighbor.
5. Use the majority category among the nearest neighbors as the prediction for the query instance.
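The five steps map directly onto a few lines of code. The sketch below assumes Euclidean distance, which the slide does not name, and uses a tiny made-up training set. The function names are illustrative.

```python
import math
from collections import Counter


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_classify(query, train_data, class_data, k):
    """Return the majority label among the k training vectors nearest to query."""
    # Step 2: distance from the query to every training vector.
    distances = [(euclidean(query, vec), label)
                 for vec, label in zip(train_data, class_data)]
    # Step 3: sort and keep the k smallest distances.
    distances.sort(key=lambda pair: pair[0])
    # Step 4: gather the labels of the nearest neighbors.
    nearest_labels = [label for _, label in distances[:k]]
    # Step 5: majority vote (on a tie, the label seen first among the neighbors wins).
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 3-pixel feature vectors, for illustration only.
train_data = [[0, 0, 1], [0, 1, 1], [1, 1, 0], [1, 0, 0]]
class_data = ["A", "A", "B", "B"]

print(knn_classify([0, 0, 1], train_data, class_data, k=3))  # A
```

In the project each vector would have 150 x 150 = 22500 components instead of 3, but the procedure is unchanged.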

Building Feature Space

Pixels matrix:

0 0 1
0 0 1
0 0 1

Feature vector: 0 0 1 0 0 1 0 0 1

classData matrix (labels)    trainData matrix (feature space)
A                            0 0 1 0 0 1 0 0 1
B                            1 0 1 1 1 1 1 1 0
C                            0 1 1 1 0 0 0 0 0
Z                            0 0 0 1 1 1 1 0 0
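Building the two matrices amounts to flattening each pixel matrix into one row and storing its label at the same index. The sketch below assumes row-major flattening and uses the 3 x 3 pixel matrix from the slide. The function names are illustrative, while `train_data` and `class_data` mirror the slide's trainData and classData.

```python
def to_feature_vector(pixels):
    """Flatten a pixel matrix row by row into a single feature vector."""
    return [value for row in pixels for value in row]


def build_feature_space(samples):
    """Turn (label, pixel_matrix) pairs into parallel train_data and class_data lists."""
    train_data = [to_feature_vector(matrix) for _, matrix in samples]
    class_data = [label for label, _ in samples]
    return train_data, class_data


sample_a = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]
train_data, class_data = build_feature_space([("A", sample_a)])
print(train_data)  # [[0, 0, 1, 0, 0, 1, 0, 0, 1]]
print(class_data)  # ['A']
```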

Determining Class of New Data

Finding the distance between two vectors U and V:

U          1 0 1 0 0 1 0 0 1
V          0 1 1 1 1 1 1 0 1
|U - V|    1 1 0 1 1 0 1 0 0

If the K nearest data points, ordered from nearest to farthest, have the classes Y = {A, A, B, B, C, B}:
• If K = 3, the input is classified as A (2/3).
• If K = 6, the input is classified as B (3/6).
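The numbers on this slide can be reproduced in a few lines. The sketch assumes Euclidean distance, which the slide does not state, and assumes Y lists the neighbors from nearest to farthest. The helper name `vote` is illustrative.

```python
import math
from collections import Counter

u = [1, 0, 1, 0, 0, 1, 0, 0, 1]
v = [0, 1, 1, 1, 1, 1, 1, 0, 1]

# Element-wise squared difference; for binary values it equals |u - v|.
squared_diff = [(a - b) ** 2 for a, b in zip(u, v)]
distance = math.sqrt(sum(squared_diff))
print(squared_diff)        # [1, 1, 0, 1, 1, 0, 1, 0, 0]
print(round(distance, 3))  # 2.236, the square root of 5


def vote(sorted_labels, k):
    """Return (label, count) of the majority class among the first k labels."""
    return Counter(sorted_labels[:k]).most_common(1)[0]


y = ["A", "A", "B", "B", "C", "B"]
print(vote(y, 3))  # ('A', 2)
print(vote(y, 6))  # ('B', 3)
```

The example also shows that the result depends on K: the same neighbor list yields A for K = 3 and B for K = 6.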

Overall Result

• Average accuracy: 89.68%.
• Identified signs: 19.
• Unidentified signs: 7.

[Figure: bar chart of accuracy (%, axis from 0 to 100) per hand sign, for the signs A, B, C, D, F, G, H, I, K, L, O, P, Q, R, U, V, W, X, Y.]

Conclusion

1. The maximum distance for object detection is 90 cm.
2. For optimal object detection, the hand has to be held straight.
3. Skin color detection on the HSV color model proved very effective in dim to medium light.
4. Noise removal and size normalization of the training and input data proved effective in creating consistent training data.
5. Using the KNN algorithm, the classification of the signs can be considered accurate, as shown by the average accuracy of 89.68% on the 19 identified signs.