SIGN LANGUAGE TRANSLATION
By: Juniar Prima Rakhman
7405 040 043
Tutors: Nana Ramadijanti, S.Kom, M.Kom; Edi Satriyanto, S.Si, M.Si
Introduction
Hearing-impaired people use sign language as their primary means of communication. Sign language combines hand shape, facial expression, movement and body gesture. Because most people are unfamiliar with sign language, communication between hearing-impaired and hearing people can be difficult. There is a need for a system that translates sign language into spoken or written language.
Objective
To create a system that simplifies communication between hearing-impaired people and hearing people (i.e. people who are unfamiliar with sign language).
Problems
How to detect the user's hand? How to extract the shape of the given hand sign? How to classify the hand shape?
Limitations
One user at a time. Light intensity must be sufficient. Only ASL (American Sign Language) or SIBI (Sistem Isyarat Bahasa Indonesia) is translated. Movement, facial expression and body gesture are not translated. Only static finger-spelling signs from A to Z are translated. The background must not have the same color as skin.
System Design
[Block diagram with three stages]
Processing: Frame Capture, Object Detection, Skin Detection, Noise Removal, Thresholding, Motion Detection
Training: Data Training, Assign Class Label, Normalize, Convert to Feature Vector, Build Feature Space
Classification: Input, Normalize, Convert to Feature Vector, Vote for K nearest data
Object Detection (Haar Classifier)
1. The method used to detect the hand (open and closed) in this final project is the Haar Classifier.
2. It builds a boosted rejection cascade, which works by rejecting negative data in order to arrive at the positive data.
3. It is a supervised learning method that needs training data (both positive and negative samples) to detect a given object.
4. After training is done, the cascade is built. It contains stages that work as a decision tree to decide which objects to detect and which to ignore.
5. In this project, the Haar Classifier is used to initialize the region of interest: the user's hand must be within the box to be classified.
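The rejection idea in points 2 and 4 can be illustrated with a toy cascade. This is only a sketch of the control flow, not a Haar classifier: the stage features (`mean`, `spread`), their thresholds, and the flat list standing in for an image window are all hypothetical.

```python
def mean(window):
    """Average intensity of the window (hypothetical stage feature)."""
    return sum(window) / len(window)


def spread(window):
    """Intensity range of the window (hypothetical stage feature)."""
    return max(window) - min(window)


# Each stage is (feature function, minimum value needed to pass).
# The thresholds are made up for illustration only.
STAGES = [(mean, 50), (spread, 30)]


def run_cascade(window, stages=STAGES):
    """Return True only if the window passes every stage.

    A window is rejected as soon as one stage fails, so most negative
    windows are discarded after evaluating only the first stages.
    """
    for feature, threshold in stages:
        if feature(window) < threshold:
            return False  # rejected early, later stages are skipped
    return True
```

A trained cascade would learn its stages from the positive and negative samples rather than have them written by hand.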
Object Detection Results (1) to (3) [screenshots]
Object Detection at Various Angles
(V = detected, X = not detected)
Subject      Straight   30°    45°    90°    180°
1            V          V      V      X      X
2            V          X      X      X      X
3            V          V      X      X      X
4            V          X      X      X      X
5            V          X      X      X      X
6            V          V      X      X      X
7            V          V      X      X      X
8            V          V      X      X      X
9            V          V      X      X      X
10           V          X      X      X      X
Percentage   100%       60%    10%    0%     0%
Object Detection at Various Distances
(V = detected, X = not detected)
Subject      30 cm   50 cm   70 cm   90 cm   110 cm
1            V       V       V       V       X
2            V       V       X       X       X
3            V       V       V       V       X
4            V       V       V       V       X
5            V       V       V       X       X
6            V       V       V       X       X
7            V       V       V       X       X
8            V       V       X       X       X
9            V       V       V       X       X
10           X       X       V       V       X
Percentage   90%     90%     80%     40%     0%
Skin Color Values
RGB Color Model: r > 95 and g > 40 and b > 20 and max(r,g,b) - min(r,g,b) > 15 and abs(r-g) > 15 and r > g and r > b and g > b
HSV Color Model: H = 48
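The RGB rule above translates directly into a per-pixel test. The sketch below reads the max/min term as a difference (max minus min); the function name `is_skin_rgb` and the sample pixel values are illustrative, not taken from the project.

```python
def is_skin_rgb(r, g, b):
    """Apply the RGB skin-color rule to one pixel (channels in 0..255)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15  # enough spread between channels
        and abs(r - g) > 15                   # red clearly separated from green
        and r > g and r > b and g > b         # channel ordering r > g > b
    )


# Illustrative pixels (not from the slides):
print(is_skin_rgb(224, 172, 104))  # a light skin-like tone
print(is_skin_rgb(30, 60, 200))    # a blue pixel
```

Because the rule uses absolute channel thresholds, a darker copy of the same tone such as (56, 43, 26) fails the `r > 95` test.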
Skin Detection Result [panels: Original, RGB, HSV]
Skin Color Detection Comparison (1) to (3) [panels: Original, RGB, HSV]
Light intensity   Original   RGB         HSV
Dim               30 FPS     23-24 FPS   12-13 FPS
Medium            30 FPS     23-24 FPS   12-13 FPS
Intense           30 FPS     23-24 FPS   12-13 FPS
Skin Detection on HSV model
Advantages:
Accurately separates skin-color pixels from non-skin-color pixels at low to medium light intensity. It uses only 2 variables (H and S); V (value/intensity) is ignored, so skin color can still be detected accurately in dim light.
Disadvantages:
High cost of color conversion. High CPU load (fewer frames processed per second).
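A small sketch of why ignoring V helps in dim light, using Python's standard `colorsys` module. The thresholds are assumptions: `hue_max_deg=48.0` is only one possible reading of the "H = 48" value on the slide, and the saturation bounds are hypothetical.

```python
import colorsys


def is_skin_hs(r, g, b, hue_max_deg=48.0, sat_min=0.2, sat_max=0.7):
    """Classify a pixel from hue and saturation only (channels in 0..255).

    All three thresholds are assumed values for illustration.
    """
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # V is deliberately unused, so brightness does not affect the result.
    return h * 360.0 <= hue_max_deg and sat_min <= s <= sat_max


bright = (224, 172, 104)  # illustrative skin-like tone
dim = (56, 43, 26)        # the same tone at a quarter of the brightness
print(is_skin_hs(*bright), is_skin_hs(*dim))
```

Scaling all three channels by the same factor changes V but leaves H and S unchanged, so both pixels get the same answer. The per-pixel conversion is the extra cost the slide lists as a disadvantage.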
Noise Removal and Thresholding
Erode: [figure]
Dilate: [figure]
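Erosion followed by dilation removes small specks from a thresholded (binary) image while keeping larger blobs. The sketch below assumes a 3 x 3 neighbourhood and clips the window at the image border; the slides do not state the kernel size or border handling the project used.

```python
def _morph(img, combine):
    """Apply `combine` (min or max) over each pixel's 3 x 3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = combine(window)
    return out


def erode(img):
    """A pixel stays 1 only if every neighbour in the window is 1."""
    return _morph(img, min)


def dilate(img):
    """A pixel becomes 1 if any neighbour in the window is 1."""
    return _morph(img, max)


# 7 x 7 binary image: a 3 x 3 blob plus one isolated noise pixel at (5, 5).
noisy = [[0] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        noisy[y][x] = 1
noisy[5][5] = 1

cleaned = dilate(erode(noisy))  # the blob survives, the speck is removed
```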
Motion Detection
• Motion is detected by comparing the average pixel difference between the current and previous frames.
• If no motion is detected, the image is classified.
• If there is too much motion (e.g. the user's hand is waving), the ROI is reset.
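The frame-difference check can be sketched as follows, with frames as grayscale 2D lists. Both thresholds (`still_below`, `reset_above`) and the intermediate "wait" state are assumptions; the slides only describe the "no motion" and "too much motion" cases.

```python
def mean_abs_diff(prev, curr):
    """Average absolute pixel difference between two same-sized frames."""
    total = sum(
        abs(a - b)
        for prev_row, curr_row in zip(prev, curr)
        for a, b in zip(prev_row, curr_row)
    )
    return total / (len(curr) * len(curr[0]))


def motion_state(prev, curr, still_below=2.0, reset_above=40.0):
    """Decide what to do with the current frame (thresholds are assumed)."""
    diff = mean_abs_diff(prev, curr)
    if diff < still_below:
        return "classify"   # hand is still: pass the image to the classifier
    if diff > reset_above:
        return "reset_roi"  # too much motion, e.g. a waving hand
    return "wait"           # assumed behaviour for moderate motion
```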
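Normalization
public IplImage normalisasi(IplImage imgSrc, int lebar, int tinggi)   (normalisasi = normalize, lebar = width, tinggi = height)
public CvRect cariBoundingBox(IplImage imgSrc)   (cariBoundingBox = find bounding box)
Steps: IplImage imgSrc -> Bounding Box -> Normalize aspect ratio (width = height) -> Scale to 150 x 150 pixels
[Example inputs: Far, Medium, Near]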
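The normalization steps (bounding box, square aspect ratio, fixed-size scaling) can be sketched on a binary image stored as a 2D list. This is not the project's `normalisasi` or `cariBoundingBox`: the Python names are hypothetical, and padding to a square and nearest-neighbour scaling are assumed choices that the slides do not specify.

```python
def find_bounding_box(img):
    """Return (x, y, width, height) of the non-zero pixels, or None."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:
        return None
    return cols[0], rows[0], cols[-1] - cols[0] + 1, rows[-1] - rows[0] + 1


def normalize(img, size=150):
    """Crop to the bounding box, pad to a square, scale to size x size."""
    box = find_bounding_box(img)
    if box is None:
        return [[0] * size for _ in range(size)]
    x, y, w, h = box
    side = max(w, h)
    # Centre the cropped shape on a square canvas (width = height).
    square = [[0] * side for _ in range(side)]
    ox, oy = (side - w) // 2, (side - h) // 2
    for j in range(h):
        for i in range(w):
            square[oy + j][ox + i] = img[y + j][x + i]
    # Nearest-neighbour scaling to the target size.
    return [
        [square[r * side // size][c * side // size] for c in range(size)]
        for r in range(size)
    ]
```

With this scheme a hand seen far from or near to the camera ends up at the same size and position, which is what makes the training and input data comparable.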
Classifying Data
1. The system checks whether the data has already been trained. If not, it loads the sample data to create the feature space.
2. Normalize each binary image to 150 x 150.
3. Convert both the input and the sample data from pixel matrices to vectors.
4. Build the feature space.
5. Classify the input data by determining the majority class among the K nearest data.
K Nearest Neighbors
1. Determine the parameter K (the number of nearest neighbors).
2. Calculate the distance between the new vector and all the vectors inside the feature space (training data).
3. Sort the distances and take the K smallest to determine the nearest neighbors.
4. Gather the category Y of each nearest neighbor.
5. Use the majority category of the nearest neighbors as the prediction for the query instance.
Building Feature Space
Pixels matrix:
0 0 1
0 0 1
0 0 1
Feature vector: 0 0 1 0 0 1 0 0 1

classData matrix (labels)   trainData matrix (Feature Space)
A                           0 0 1 0 0 1 0 0 1
B                           1 0 1 1 1 1 1 1 0
C                           0 1 1 1 0 0 0 0 0
Z                           0 0 0 1 1 1 1 0 0
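Building the feature space amounts to flattening each pixel matrix row by row and keeping the labels in a parallel list. The sketch below uses hypothetical function names; `train_data` and `class_data` mirror the trainData and classData matrices.

```python
def to_feature_vector(pixels):
    """Flatten a 2D pixel matrix row by row into a 1D feature vector."""
    return [p for row in pixels for p in row]


def build_feature_space(samples):
    """Turn (label, pixel matrix) pairs into parallel train/class lists."""
    train_data, class_data = [], []
    for label, pixels in samples:
        train_data.append(to_feature_vector(pixels))
        class_data.append(label)
    return train_data, class_data


# The 3 x 3 pixel matrix from the slide, stored under label "A".
sample_a = [[0, 0, 1],
            [0, 0, 1],
            [0, 0, 1]]
train_data, class_data = build_feature_space([("A", sample_a)])
```

In the real system each sample is a 150 x 150 binary image, so every feature vector has 22,500 elements.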
Determining Class of New Data
Finding the distance between two vectors:
U:         1 0 1 0 0 1 0 0 1
V:         0 1 1 1 1 1 1 0 1
|U - V|:   1 1 0 1 1 0 1 0 0
If the labels of the nearest data, ordered from nearest to farthest, are Y = {A, A, B, B, C, B}:
If K = 3, the input is classified as A (2/3).
If K = 6, the input is classified as B (3/6).
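The numbers on this slide can be checked in a few lines. The sketch assumes a squared Euclidean distance (for binary vectors this equals the count of differing positions) and that Y is ordered from nearest to farthest; the function names are illustrative.

```python
from collections import Counter

U = [1, 0, 1, 0, 0, 1, 0, 0, 1]
V = [0, 1, 1, 1, 1, 1, 1, 0, 1]


def squared_distance(u, v):
    """Sum of squared element-wise differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def majority(labels, k):
    """Most common label among the first k entries of a sorted label list."""
    return Counter(labels[:k]).most_common(1)[0][0]


Y = ["A", "A", "B", "B", "C", "B"]  # neighbour labels, nearest first

print(squared_distance(U, V))       # U and V differ in five positions
print(majority(Y, 3), majority(Y, 6))
```

The example shows how the choice of K changes the outcome: the two nearest A neighbours win at K = 3, but are outvoted by three B neighbours at K = 6.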
Overall Result
Average Accuracy: 89.68%.
Identified Signs: 19.
Unidentified Signs: 7.
[Bar chart: Accuracy % (0 to 100) per hand sign; hand signs on the axis: A B C D F G H I K L O P Q R U V W X Y]
Conclusion
1. The maximum object detection distance is 90 cm.
2. For optimal object detection, the hand has to be held straight.
3. Skin color detection on the HSV color model proved very effective in dim to medium light.
4. Noise removal and size normalization of the training and input data proved effective in producing consistent training data.
5. Using the KNN algorithm, the classification of the signs can be considered accurate, as shown by the average accuracy of 89.68% on 19 identified signs.