2010 International Conference on Pattern Recognition

Off-line Signature Verification Using Graphical Model

Hairong Lv, Xinxin Bai, Wenjun Yin, Jin Dong
IBM China Research Lab
{lvhr, baixx, yinwenj, dongjin}@cn.ibm.com

Abstract

In this paper, we propose a novel probabilistic graphical model to address the off-line signature verification problem. Unlike previous work, our approach introduces the concept of feature roles according to the distribution of features in genuine and forged signatures, with all these features represented by a single graphical model. We also propose several new techniques to improve the performance of the signature verification system. Results based on the signatures of 200 persons (16000 signature samples) indicate that the proposed method outperforms other popular techniques for off-line signature verification.

1. Introduction

The handwritten signature is one of the most widely accepted personal attributes for identity verification. As a symbol of consent and authorization, especially given the prevalence of credit cards and bank cheques, the handwritten signature has long been a target of fraud. Therefore, with the growing demand for processing individual identifications faster and more accurately, the design of an automatic signature verification (ASV) system faces a real challenge.

Handwritten signature verification can be divided into on-line and off-line verification. In general, off-line signature verification is the more challenging problem because all handwriting features, such as the handwriting order, writing-speed variation and skillfulness, need to be recovered from the gray-level pixels. During the last few years, researchers have made great efforts on off-line signature verification and achieved some success [1][2][3][4][5][6].

In our view, most of the writer-dependent information exists in the starting, ending and turning points of the strokes. We call them landmark points. In our previous work [8], five types of landmark points were extracted, and the number of matched landmark points was then used to verify the authenticity of testing signatures. In this paper, we present a novel probabilistic graphical model to address the off-line Chinese signature verification problem. The model captures the variations and dependence of signature landmark points naturally, while landmark points, including unexpected and wild ones (described in Section 3), are all taken into consideration.

The rest of this paper is organized as follows. In Section 2 we discuss how to represent signatures. The details of the probabilistic graphical model are discussed in Section 3. In Section 4 we present the verification strategies and demonstrate experimental results, followed by conclusions in Section 5.

1051-4651/10 $26.00 © 2010 IEEE DOI 10.1109/ICPR.2010.922

2. Signature representation

In our previous work [8], we extract five types of feature points, called landmark points, from the thinned image: turning points, isolated points, trifurcate points, intersection points and termination points, as shown in Figure 1.

Figure 1: An example of five classes of landmark points

Each landmark point $P$ has three attributes: coordinates $(x, y)$, directions $\theta_1, \theta_2 \in [0, 2\pi)$ and landmark point type (turning, isolated, trifurcate, intersection or termination), that is:

$$P = \{x, y, \theta_1, \theta_2, \text{type}\}$$

The distance between two landmark points $P_1$ and $P_2$ of the same type is defined by Equation (1):

$$d(P_1, P_2) = w_1 \sqrt{(P_1.x - P_2.x)^2 + (P_1.y - P_2.y)^2} + w_2 \left( |P_1.\theta_1 - P_2.\theta_1| + |P_1.\theta_2 - P_2.\theta_2| \right) \tag{1}$$

where $P_1.x$ means the value of $x$ in landmark point $P_1$, and the others have similar meanings. If $P_1$ and $P_2$ have different types, $d(P_1, P_2) = \infty$. Here, we set $w_1 \approx w_2 \approx 0.5$ empirically.
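The distance of Equation (1) can be sketched in a few lines of Python. This is only an illustration: the `Landmark` class and the function name are hypothetical, and the sketch assumes that the positional term is a Euclidean distance and that the direction differences are taken in absolute value, without any wrap-around at 2π.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmark:
    """Hypothetical container for P = {x, y, theta1, theta2, type}."""
    x: float
    y: float
    theta1: float
    theta2: float
    kind: str  # 'turning', 'isolated', 'trifurcate', 'intersection', 'termination'


def landmark_distance(p: Landmark, q: Landmark,
                      w1: float = 0.5, w2: float = 0.5) -> float:
    """Weighted sum of positional and directional differences, as in Equation (1)."""
    if p.kind != q.kind:
        # Landmark points of different types are never comparable.
        return math.inf
    positional = math.hypot(p.x - q.x, p.y - q.y)
    # Assumption: plain absolute differences, no angular wrap-around.
    directional = abs(p.theta1 - q.theta1) + abs(p.theta2 - q.theta2)
    return w1 * positional + w2 * directional
```

With the empirical weights of 0.5, two turning points that are 5 pixels apart and differ by 0.5 and 1.0 radians in their two directions are at distance 0.5 × 5 + 0.5 × 1.5 = 3.25.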

3. The Proposed Method

As pointed out in Section 1, different landmark points play different roles in the signature verification process. In this section, we will use these landmark points as the features to represent a signature with a probabilistic graphical model [7].

[Figure labels: common points; commensal group 1; commensal group 2; alternative group; CPT (Conditional Probability Table) estimated from training samples (genuine and forgeries); the probability of R (true or false) is computed from the true or false values of A, B, C, D, E, F1, F2, G, …]

Figure 4: Graphical model of signatures.

3.1. Motivation

Here, we use the example shown in Figure 2 to explain our motivation. In Figure 2, there are two writing styles for the same Chinese characters, the upper three and the lower three. The landmark points A, B, C, D and E occur in all the genuine signatures. They may be the common attributes of the person's signature and we call them common points. In Figure 3, among the upper three characters, F1 and F2 occur together; they form a group that we call a commensal group (Group 1 in Figure 3). Among the lower three characters, only G (Group 2 in Figure 3) occurs at the corresponding location. The two commensal groups do not occur at the same time, but the total number of times they occur is close to the number of genuine samples in the training set. They are regarded as alternative groups.

Figure 4 shows the proposed graphical model for the characters. It is a tree structure. The root R represents whether the signature is genuine. The common points A, B, C, D and E are directly connected to the root. We add some hypothesis nodes such as H, H1 and H2 to the model; they represent writing styles. Furthermore, we use W (wild points) to represent landmark points that do not occur consistently, while U (unexpected points) represents landmark points that have no corresponding nodes in the model.

[Figure labels: A, B, C, D, E: common points; F1, F2: Group 1; G: Group 2; Group 1 and Group 2: alternative group]

Figure 2: A simple example of different landmark points which play different roles in graphical models

3.2. Landmark Point Clustering

The purpose of this procedure is to find the common points and commensal groups from the genuine training samples. The clustering algorithm is described by the procedure point_clustering.

Suppose that we finally get $M$ clusters $C_i$ $(i = 1, 2, \ldots, M)$. If $C_i$ contains only one landmark point, we discard the cluster; otherwise, we compute the center $C_i$ and the radius $r(C_i)$ of $C_i$ by (2):

$$C_i = \frac{\sum_{P \in C_i} P}{|C_i|}, \qquad r(C_i) = \rho \max_{P \in C_i} d(P, C_i), \quad (\rho \approx 1.2) \tag{2}$$
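As a rough illustration of the center and radius in (2), the following Python sketch summarizes one cluster. All names are hypothetical. It assumes that every point in a cluster has the same type, that the center is the component-wise arithmetic mean (directions are averaged naively, ignoring wrap-around), and that the distance reads Equation (1) as a Euclidean positional term plus absolute direction differences.

```python
import math
from typing import List, NamedTuple, Optional, Tuple


class Landmark(NamedTuple):
    x: float
    y: float
    theta1: float
    theta2: float
    kind: str


def distance(p: Landmark, q: Landmark, w1: float = 0.5, w2: float = 0.5) -> float:
    """Equation (1) as read here; infinite for landmark points of different types."""
    if p.kind != q.kind:
        return math.inf
    positional = math.hypot(p.x - q.x, p.y - q.y)
    directional = abs(p.theta1 - q.theta1) + abs(p.theta2 - q.theta2)
    return w1 * positional + w2 * directional


def cluster_center(points: List[Landmark]) -> Landmark:
    """Component-wise mean of the cluster members (an assumption of this sketch)."""
    n = len(points)
    return Landmark(
        x=sum(p.x for p in points) / n,
        y=sum(p.y for p in points) / n,
        theta1=sum(p.theta1 for p in points) / n,
        theta2=sum(p.theta2 for p in points) / n,
        kind=points[0].kind,
    )


def summarize_cluster(points: List[Landmark],
                      rho: float = 1.2) -> Optional[Tuple[Landmark, float]]:
    """Return (center, radius) per (2), or None for a singleton cluster."""
    if len(points) < 2:
        return None  # clusters with a single landmark point are discarded
    center = cluster_center(points)
    radius = rho * max(distance(p, center) for p in points)
    return center, radius
```

The factor ρ slightly enlarges the radius beyond the farthest training point, so a new landmark point that lies just outside the observed spread can still fall inside the cluster.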

To make the algorithm clear, we define an $N \times M$ matrix $MASK = [m_{i,j}]$, where $m_{i,j} = 0$ or $1$. $m_{i,j} = 1$ means that the $i$-th signature matches the cluster $C_j$; otherwise, the $i$-th signature does not match $C_j$.

3.2.1. Common Points

If

$$Freq(C_j) = \frac{\sum_{k=1}^{N} m_{k,j}}{N} \geq \alpha, \quad (\alpha \approx 0.8)$$

then $C_j$ is regarded as a common point. Note that $\sum_j Freq(C_j) \neq 1$ in general, because one signature can match several clusters.

If

$$Freq(C_j) = \frac{\sum_{k=1}^{N} m_{k,j}}{N} \leq \beta, \quad (\beta \approx 0.2)$$

then $C_j$ is put into W (wild points, as shown in Figure 4).

3.2.2. Commensal Group

Consider a subset $\{C_{j1}, C_{j2}, \ldots, C_{jK}\}$, $JK = \{j1, \ldots, jK\}$. If

$$Freq(C_{j1}, \ldots, C_{jK}) = \frac{\sum_{k=1}^{N} \left( \prod_{j \in JK} m_{k,j} \right)}{N} > \lambda, \quad (\lambda \approx 0.2)$$

and

$$\frac{\sum_{k=1}^{N} \left( \prod_{j \in JK} m_{k,j} \right)}{\sum_{k=1}^{N} \max_{j \in JK} m_{k,j}} \geq \mu, \quad (\mu \approx 0.8)$$

then $\{C_{j1}, C_{j2}, \ldots, C_{jK}\}$ can be regarded as a commensal group. Here, each $C_j$ $(j \in JK)$ is neither a common point nor a wild point.

3.2.3. Alternative Group

If two commensal groups satisfy $\{A_1, A_2, \ldots, A_I\} \cap \{B_1, B_2, \ldots, B_J\} = \emptyset$ and $Freq(A_1, A_2, \ldots, A_I) + Freq(B_1, B_2, \ldots, B_J) \geq \kappa$, $(\kappa \approx 0.8)$, then the two commensal groups are called an alternative group.

3.2.4. Matching

Given a landmark point set $LP$ and the cluster set $C_i$, $i = 1, 2, \ldots, M$, if $\exists P \in LP$ such that $d(P, C_i) < r(C_i)$, then we define $MATCH(C_i, LP) = TRUE$; otherwise it is $FALSE$.

3.3 Conditional Probability Table

As pointed out in Section 3.1, each node of the graphical model has a value of TRUE or FALSE, except U and W. For each parent node $N_1$ and child node $N_2$, the conditional probability is defined by a table:

$$P(N_2 \mid N_1) = \begin{bmatrix} P(N_2 = T \mid N_1 = T) & P(N_2 = F \mid N_1 = T) \\ P(N_2 = T \mid N_1 = F) & P(N_2 = F \mid N_1 = F) \end{bmatrix}$$

It is convenient to evaluate these probabilities from the training set, which contains $N$ genuine signatures $LPG_i$ and $L$ forgeries $LPF_i$. We use an example to explain the process. If a common point A is directly connected to the root R, then $P(A = T \mid R = T)$ is the frequency of occurrence of A when the signature is genuine:

$$P(A = T \mid R = T) = Freq(A), \qquad P(A = F \mid R = T) = 1 - Freq(A) \tag{3}$$

$P(A = T \mid R = F)$ is the frequency of occurrence of A in forgeries:

$$P(A = T \mid R = F) = \frac{\sum_{i=1}^{L} MATCH(A, LPF_i)}{L}, \qquad P(A = F \mid R = F) = 1 - P(A = T \mid R = F) \tag{4}$$

3.4 Wild Points and Unexpected Points

Generally speaking, if a signature is a forgery, more of its landmark points fall into U than into W; otherwise, more of its landmark points fall into W than into U. We define $N_{aver}(T/U)$ as the average number of landmark points of a genuine signature that fall into U in the training set. Similarly, we can define $N_{aver}(F/U)$, $N_{aver}(T/W)$ and $N_{aver}(F/W)$. Usually, $N_{aver}(T/U) < N_{aver}(F/U)$ and $N_{aver}(T/W) > N_{aver}(F/W)$.

3.5 Inference algorithm

Given the value (TRUE or FALSE) of each leaf node, $N_W$ (the number of landmark points that fall into W) and $N_U$ (the number of landmark points that fall into U), our goal is to evaluate the posterior probability of R being TRUE or FALSE. We use the algorithm developed by Pearl [9]. Our graphical model is very simple, but it differs slightly from a general Bayesian belief network. We explain the bottom-up propagation below and highlight the differences between our algorithm and the general model that are caused by alternative groups, wild points and unexpected points.

We define parameters $\lambda(Y) = (\lambda_T(Y), \lambda_F(Y)) = (\lambda_T, \lambda_F)$ to evaluate the contribution of node Y to its parent. The definition differs for different types of nodes. We use Figure 4 to describe our ideas.

3.5.1. Common Points

Suppose that A in Figure 4 has been assigned a value (T or F). Then

$$\lambda(A) = (\lambda_T, \lambda_F) = (\sigma P(A \mid R = T), \; \sigma P(A \mid R = F)) \tag{5}$$

Here, $\sigma$ is used to satisfy $\lambda_T + \lambda_F = 1$.

3.5.2. Wild Points

If $N_{aver}(T/W) < N_{aver}(F/W)$, then $\lambda(W) = (0.5, 0.5)$; else:

$$\lambda_T = \max\left(1, \frac{N_W}{N_{aver}(T/W) + N_{aver}(F/W)}\right), \qquad \lambda(W) = (\lambda_T, \lambda_F) = (\lambda_T, 1 - \lambda_T) \tag{6}$$

3.5.3. Unexpected Points

If $N_{aver}(T/U) > N_{aver}(F/U)$, then $\lambda(U) = (0.5, 0.5)$; else:

$$\lambda_F = \max\left(1, \frac{N_U}{N_{aver}(T/U) + N_{aver}(F/U)}\right), \qquad \lambda(U) = (\lambda_T, \lambda_F) = (1 - \lambda_F, \lambda_F) \tag{7}$$

3.5.4. Commensal Group

Suppose that the commensal group $H_1$ has $K$ children $F_1, F_2, \ldots, F_K$, and each child has a value of TRUE or FALSE. Counting TRUE as 1 and FALSE as 0,

$$\lambda(H_1) = (\lambda_T, \lambda_F) = \left( \frac{\sum_{i=1}^{K} F_i}{K}, \; 1 - \frac{\sum_{i=1}^{K} F_i}{K} \right) \tag{8}$$

3.5.5. Alternative Group

Firstly, we evaluate the probability of H by (9):

$$P(H = T) = \max(\lambda_T(H_1), \lambda_T(H_2)), \qquad P(H = F) = 1 - \max(\lambda_T(H_1), \lambda_T(H_2)) \tag{9}$$

Then:

$$\lambda(H) = \left( \frac{\lambda'_T}{\lambda'_T + \lambda'_F}, \; \frac{\lambda'_F}{\lambda'_T + \lambda'_F} \right) \tag{10}$$

Here,

$$\begin{aligned} \lambda'_T(H) &= P(H = T) \times P(H = T \mid R = T) + P(H = F) \times P(H = F \mid R = T) \\ \lambda'_F(H) &= P(H = T) \times P(H = T \mid R = F) + P(H = F) \times P(H = F \mid R = F) \end{aligned} \tag{11}$$
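The bottom-up messages for common points, commensal groups and alternative groups can be sketched as follows. This is a minimal illustration rather than the authors' implementation: all function names are hypothetical, wild and unexpected points are left out, and the final combination assumes the product rule that the paper states as Equation (12) in its verification strategy.

```python
from math import prod
from typing import Iterable, Sequence, Tuple

Lam = Tuple[float, float]  # (lambda_T, lambda_F)


def normalize(t: float, f: float) -> Lam:
    """Scale a pair so that its two entries sum to 1 (assumes t + f > 0)."""
    total = t + f
    return (t / total, f / total)


def lambda_common(observed: bool, p_true_genuine: float,
                  p_true_forgery: float) -> Lam:
    """Common point: likelihood of the observed value under R=T and R=F."""
    if observed:
        return normalize(p_true_genuine, p_true_forgery)
    return normalize(1.0 - p_true_genuine, 1.0 - p_true_forgery)


def lambda_commensal(children: Sequence[bool]) -> Lam:
    """Commensal group: fraction of children that are TRUE, and its complement."""
    frac = sum(children) / len(children)
    return (frac, 1.0 - frac)


def lambda_alternative(lam_h1: Lam, lam_h2: Lam, p_h_true_genuine: float,
                       p_h_true_forgery: float) -> Lam:
    """Alternative group: the stronger commensal group decides P(H = T)."""
    p_h = max(lam_h1[0], lam_h2[0])
    t = p_h * p_h_true_genuine + (1.0 - p_h) * (1.0 - p_h_true_genuine)
    f = p_h * p_h_true_forgery + (1.0 - p_h) * (1.0 - p_h_true_forgery)
    return normalize(t, f)


def posterior_root(children: Iterable[Lam]) -> Lam:
    """Combine the children's messages into (P(R=T), P(R=F)) by a normalized product."""
    msgs = list(children)
    return normalize(prod(m[0] for m in msgs), prod(m[1] for m in msgs))
```

For instance, a commensal group with three of its four children matched yields λ = (0.75, 0.25); two such messages arriving at the root combine to P(R = T) = 0.9.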

3.5.6. Verification Strategy

Suppose the root node R has $K$ children $F_1, F_2, \ldots, F_K$, including common points, wild points, unexpected points and alternative nodes. We can compute the parameter $\lambda$ of each child by using (5)~(11). Then:

$$P(R = T) = \varepsilon \prod_{i=1}^{K} \lambda_T(F_i), \qquad P(R = F) = \varepsilon \prod_{i=1}^{K} \lambda_F(F_i) \tag{12}$$

Here, $\varepsilon$ is used to satisfy $P(R = T) + P(R = F) = 1$. If $P(R = T) > P(R = F)$, the character is judged genuine; otherwise, it is determined to be a forgery. For each signer, if his or her signature contains N characters, we build a graphical model for each Chinese character, giving N models in total. If every model determines that its corresponding character is genuine, the signature is genuine; otherwise, it is considered a forgery.

4. Experiments

We collected the database ourselves. It contains 200 signers in total. For each signer, there are 25 genuine signatures, 25 random forgeries, 15 simple forgeries and 15 skilled forgeries. We carried out an experiment to test the approach on this database. For each signer, we randomly select 15 genuine signatures, 5 random forgeries, 5 simple forgeries and 5 skilled forgeries for the training set. The remaining samples are used for testing. The experimental results are shown in Figure 5 as ROC curves. The EER is 5.43% on the whole database.

[Figure 5 panels plot False Acceptance Rate (%) against False Rejection Rate (%). EER labels recovered from the panels: 3.74%, 5.43%, 3.5%, 11.23%.]

Figure 5: ROC curves (from top left to bottom right: random forgeries, simple forgeries, skilled forgeries, whole database)

For comparison, we also tested some classical methods and the basic landmark point matching method on our database; the results are listed in Table 2. They indicate that the graphical model provides better performance than basic landmark point matching and other popular methods. A possible reason for the improvement is that our model captures the variations and dependence of signature landmark points, as well as the unexpected and wild landmark points, in the same framework.

Table 2: Performance Comparison

Method                                        EER (%)
Basic landmark point correspondence           7.2
Graphic approach (proposed)                   5.4
HMM [10]                                      8.7
Tracking of feature [12]                      8.3
Structural feature correspondence [11]        7.9
Extended shadow code [13]                     9.1

5. Conclusion and Future Work

In this paper, a probabilistic model representation for Chinese signatures is proposed. It provides better performance in signature verification. Currently, we use discrete landmark points as the features and discard the strokes between them; the strokes will be considered in future work. In addition, our model only considers simple relationships among landmark points; more complex relationships will also be explored in the future.

References

[1] H.R. Lv, W.Y. Wang, C. Wang and Q. Zhuo. Off-line Chinese signature verification based on support vector machines. Pattern Recognition Letters, 26(15):2390-2399, 2005
[2] E.J.R. Justino, F. Bortolozzi and R. Sabourin. Off-line signature verification using HMM for random, simple and skilled forgeries. International Conference on Document Analysis and Recognition, pp. 1031-1034, 2001
[3] C.L. Liu, R.W. Dai and Y.J. Liu. Modified wigner distribution and application to writer identification. Chinese Journal of Computers, 1997
[4] M.A. Ferrer, J.B. Alonso and C.M. Travieso. Offline geometric parameters for automatic signature verification using fixed-point arithmetic. IEEE Transactions on PAMI, 27(6):993-997, 2005
[5] S.H. Peter, H.Y.M. Deng, C.W. Liao and H.R.T. Ho. Wavelet-Based Off-Line Handwritten Signature Verification. Computer Vision and Image Understanding, 76(3):173-190, 1999
[6] G.S. Ng and H.S. Ong. A neural network approach for offline signature verification. TENCON'93, 2:770-773, 1993
[7] Finn V. Jensen. An introduction to Bayesian network. UCL Press Limited, University College London, 1996
[8] Lv, H.R. Off-line Chinese signature verification based on landmark points matching. Pattern Recognition Letters, submitted
[9] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1988
[10] Justino Edson J R. A comparison of SVM and HMM classifiers in the off-line signature verification. Pattern Recognition Letters, 26(9):1377-1385, 2005
[11] Huang Kai, Yan Hong. Off-line signature verification using structural feature correspondence. Pattern Recognition, 35(11):2467-2477, 2002
[12] Fang B, Leung C H, Tang Y Y, Tse K W, Kwok P C K, Wong Y K. Off-line signature verification by the tracking of feature and stroke positions. Pattern Recognition, 36(1):91-101, 2003
[13] R Sabourin, G Genest. An extended-shadow-code-based approach for off-line signature verification: Part I. Evaluation of the bar mask definition. ICPR, Jerusalem, Israel, 1994
