2012 NIRMA UNIVERSITY INTERNATIONAL CONFERENCE ON ENGINEERING, NUiCONE-2012, 06-08DECEMBER, 2012


Implement Credit Card Fraudulent Detection System Using Observation Probabilistic In Hidden Markov Model

Ashphak Khan, Tejpal Singh, Amit Sinhal

Abstract-- The internet has become the most popular mode of payment for online transactions. Banking systems provide e-cash, e-commerce, and e-services through online transactions, and the credit card is one of the best ways to pay online. As a result, the risk of fraudulent credit card transactions has also been increasing. Credit card fraud detection is one of the key issues for credit card companies, mortgage companies, banks, and financial institutions. Many techniques exist for credit card fraud detection, and the hidden Markov model (HMM) is one of the best tools in engineering practice for a credit card fraud detection system. The hidden Markov model generates observation symbols for online transactions. An HMM-based system that uses observation probabilities first studies the spending profile of the cardholder and then checks each incoming transaction against that spending behavior. We show that a clustering model can be used to classify legitimate and fraudulent transactions by grouping the data into regions of the parameter space, and we present a hidden Markov model for fraud detection in credit card applications. We present experimental results to show the effectiveness of our approach.

Index Terms--Hidden Markov Model, credit card fraud, credit card fraud detection system, online transaction, e-commerce, clustering.

I. INTRODUCTION

Today our lives have become more comfortable. We do many things online, such as online banking, e-cash, online shopping, online ticket booking, online recharge, and online payment of fees, all of which involve transactions. The credit card is one of the best and most conventional ways of making an online transaction, and it is the most important mode of payment over the internet. An online transaction needs only some of the card details, such as the secure code, expiration date, and card number, because it is mostly done by phone or over the internet. The use of credit cards is prevalent in modern society, and in day-to-day life online transactions to purchase goods and services have increased. According to a Nielsen study conducted in 2007-2008, 28% of the world's total population has been using the internet [1]. Of the online population, 85% have used the internet to shop, and the rate of online purchasing increased by 40% from 2005 to 2008. In developed countries, and to some extent in developing countries, the credit card is the most accepted payment mode for online and offline transactions. As credit card usage increases worldwide, the chances that an attacker steals credit card details and then makes fraudulent transactions are also increasing. There are several ways

to steal credit card details, such as phishing websites, stolen or lost credit cards, counterfeit cards, theft of card details, and intercepted cards [3]. As more transactions are performed online, fraud increases day by day, which makes credit card fraud detection an interesting area of research. Credit card fraud is a security weakness for credit card companies, banking systems, and businesses; as a result, billions of dollars are lost every year through online shopping, e-cash, and online banking as well as offline banking. A survey of the USA shows that yearly credit card fraud has increased every year [4], as shown in Figure 1.

[Figure 1: Yearly credit card fraud. Axes: amount in million dollars ($) by year, 2006 to 2010.]

Credit cards can be used for both online and offline transactions, and they fall into two broad categories. The first is the physical credit card, where the cardholder is present at the point of sale. The sales counter can use an EMV (Europay, MasterCard, and Visa) machine, and the transaction is carried out in front of the cardholder. The second is the virtual credit card, where the cardholder is not present; internet banking is part of this category. Online banking is a challenging part of the traditional banking system. Credit card use in modern society grows day by day, and preventing credit card fraud in online transactions is a difficult task. In this paper we propose using the observation probability of a hidden Markov model for credit card fraud detection, based on the spending profile of every transaction; all transactions are divided into three purchase categories, as shown in Section V.

II. LITERATURE STUDY OF CREDIT CARD FRAUD

Credit card fraud is an important and interesting area of


research. Several techniques have been developed for detecting credit card fraud in online transactions. They are based on artificial intelligence, fuzzy logic, data mining, machine learning, genetic algorithms, decision trees, Bayesian networks, neural networks, clustering algorithms, and so on, and they evaluate various kinds of fraudulent credit card transactions.

Ghosh and Reilly [5] proposed a neural network method to detect fraudulent credit card transactions. They built a detection system trained on a large sample of labeled credit card account transactions. The sample contains example fraud cases due to lost cards, stolen cards, application fraud, stolen card details, counterfeit fraud, and so on. They tested it on a data set of all transactions of credit card accounts over a subsequent period.

Kokkinaki proposed a decision tree technique. Decision trees are simple and easy to implement, and they reduce misclassification of incoming transaction data, but they are not dynamically adaptive to online transactions. A decision tree is defined recursively; it contains nodes and edges that are labeled with attribute names and with values of attributes, respectively [6].

Maes et al. suggested a fraud detection technique based on Bayesian networks, in which fraud detection is improved by removing highly correlated attributes. The ANN was found to be faster at fraud prediction in the testing phase when using transaction profiles, while the Bayesian approach gave better fraud detection results than the neural network [7].

Chan and Stolfo proposed a multi-classifier meta-learning technique for credit card transactions and reported a 46% improvement in overall fraud detection; experiments are required to determine the best class distribution for each training run [8].

Kim proposed a method that improves the fraud detection classifier, compared only against neural networks, using an unsupervised data mining algorithm; the method is only able to find local minima in the error function. Chiu and Tsai described a web-based knowledge sharing scheme for rule-based algorithms, in which fraudulent transactions from fraud investigations are centralized to increase model accuracy over distributed datasets. Millions of transactions are processed every day, and the data are highly skewed: far more transactions are legitimate than fraudulent. Highly efficient techniques are therefore required to scale down the data and identify the fraudulent transactions rather than the legitimate ones [9].

Syeda et al. proposed using granular neural networks to improve the speed of data mining and knowledge discovery in credit card fraud detection, and implemented such a system for this purpose [10]. Bentley et al. suggested an approach based on genetic programming that establishes logic rules capable of classifying credit card transactions into suspicious and non-suspicious classes [11]. Bolton and Hand [12] proposed unsupervised credit card fraud detection based on the frequency of transactions and on observing abnormal spending behavior. Break point analysis and group analysis, used as unsupervised tools, are successful at detecting local anomalies and can monitor behavior in a continuous manner. These algorithms do not differentiate between accounts; they treat all accounts equally.

In this paper we propose a credit card fraud detection system that uses the observation probability of a hidden Markov model. The hidden Markov model is one of the best methods for modeling the spending profile observed at each state transition. It is a statistical model that is well established in engineering practice and well suited to a fraud detection system (FDS). A hidden Markov process is a doubly embedded random process: it operates at two different levels, a "hidden" level of state transitions and an "open" level of observations. In this data mining technique we divide transactions into three sub-categories. Our fraud detection system represents the online transaction data generated by a credit card as a sequence of spending-profile symbols. Credit card data sets are not easily available, because they are a critical part of the banking system and banks do not release them for security reasons. We therefore use a dummy data set to evaluate the fraud detection system and its accuracy; the results and the fraud categories are presented in the experimental results section.

III. MATHEMATICAL MODEL

We apply the hidden Markov model to credit card fraud detection. A Markov process shows its initial state and state transitions directly, whereas an HMM does not show its state transitions directly; it provides observations that depend on the state. The hidden Markov model is probably one of the simplest and easiest models that can be used for sequential data, i.e., data samples that depend on each other. An HMM-based system initially studies the spending profile of the cardholder and then checks each incoming transaction against the spending behavior of the cardholder.

The states of an HMM are not directly visible; an external observer sees only the observations they produce. Hidden Markov models have been applied successfully in data mining, speech recognition, bioinformatics, robotics, artificial intelligence, voice recognition, and other areas.

[Figure 2: States of the hidden Markov model. Hidden states Q1 to Q4 with observations O1 to O4.]

The elements needed to describe a hidden Markov model in terms of hidden states and observation symbols are:

• The number of "hidden" states is N. We denote the set of states:
$$\{S_1, S_2, S_3, \ldots, S_N\} \qquad (3.1)$$

• The sequence of states is Q. We write:
$$Q = \{q_1, q_2, q_3, \ldots\} \qquad (3.2)$$




• The number of observation symbols is M. We denote the sequence of observations:
$$O = \{o_1, o_2, o_3, \ldots, o_M\} \qquad (3.3)$$

The hidden Markov model over these observation symbols is then written as
$$\chi = (T, U, \pi) \qquad (3.4)$$

where T is the set of state transition probabilities, $T = \{a_{ij}\}$, with
$$a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \quad 1 \le i, j \le N, \qquad \sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N \qquad (3.5)$$
and $q_t$ denotes the current state.

U is the observation probability distribution, $U = \{b_i(v)\}$, with
$$b_i(v) \ge 0, \quad 1 \le i \le N, \quad 1 \le v \le M \qquad (3.6)$$
where v ranges over the set of observation symbols. The probability distribution for each state is
$$b_i(v) = P(O_t = v \mid q_t = S_i), \quad 1 \le i \le N, \quad 1 \le v \le M$$

π is the initial state distribution, $\pi = \{\pi_i\}$, where
$$\pi_i = P(q_1 = S_i), \quad 1 \le i \le N$$

The notation $\chi = (T, U, \pi)$ denotes an HMM with discrete probability distributions, while
$$\lambda = (T, C_{jm}, \mu_{jm}, \Sigma_{jm}, \pi) \qquad (3.7)$$
denotes an HMM whose observation distributions are described by weighting coefficients, mean vectors, and covariance matrices.

K-Means

Choose K vectors at random from the training vectors, here denoted x. These vectors serve as the initial centroids $\mu_k$, which are then to be found correctly. Assign every vector in the training set to a cluster k by choosing the cluster closest to the vector:
$$k^* = \arg\min_k \left[ d(x, \mu_k) \right] \qquad (3.8)$$

From this clustering (done for one state j), the parameters $(C_{jm}, \mu_{jm}, \Sigma_{jm})$ are found, where, in eqs. (3.7) and (3.8), $C_{jm}$ = weighting coefficients, $\mu_{jm}$ = mean vectors, and $\Sigma_{jm}$ = covariance matrices.

IV. CREDIT CARD FRAUD DETECTION

In this section we show how the observation probability of the hidden Markov model is applied in the fraud detection system.

[Figure 3: Proposed fraud detection system. Flowchart blocks: incoming credit card transaction; create cluster (low, mid, high); generate observation symbol O_{R+1}; profile analyzer; customer details DB; calculate φ1; calculate φ2; test Δφ/φ1 ≥ Threshold; GT; FT.]

We consider three different spending profiles of the cardholder, which depend on price range: high (h), medium (m), and low (l). For this set of symbols we define V = {l, m, h} and M = 3. The price ranges of the proposed symbols are low ($0, $500), medium ($500, $1000), and high ($1000, up to the credit card limit). We use these symbols to characterize the cardholder's spending profile.

[Figure 4: Categories of low, mid, and high transactions. Labels: Low, Mid, High; OLL, OLM, OLH, OML, OMM, OMH, OHL, OHM.]

We form an initial sequence from the existing spending behavior of the cardholder. Let $O_1, O_2, \ldots, O_R$ be a sequence of R symbols, recorded from the cardholder's transactions up to time t. We feed this sequence into the HMM to compute its probability of acceptance. Let this probability be $\phi_1$, which can be calculated as

$$\phi_1 = P(o_1, o_2, o_3, \ldots, o_R \mid \lambda)$$
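The acceptance probability $\phi_1$ can be evaluated with the standard forward algorithm. The block below is a minimal sketch under stated assumptions, not the authors' implementation: it uses the discrete form $(T, U, \pi)$ of eq. (3.4) because the symbols l, m, h are discrete; the three-state parameter values, the names `to_symbol` and `sequence_probability`, and the half-open handling of the $500 and $1000 boundaries are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical 3-state model; the numbers are illustrative, not from the paper.
T = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])        # a_ij, each row sums to 1
U = np.array([[0.80, 0.15, 0.05],
              [0.20, 0.60, 0.20],
              [0.10, 0.20, 0.70]])     # b_i(v), columns ordered l, m, h
PI = np.array([0.5, 0.3, 0.2])         # initial state distribution

SYMBOL_INDEX = {"l": 0, "m": 1, "h": 2}


def to_symbol(amount):
    """Map a transaction amount to l/m/h using the price ranges in the text.

    Assumption: boundaries are half-open, so exactly $500 counts as medium
    and exactly $1000 counts as high.
    """
    if amount < 500:
        return "l"
    if amount < 1000:
        return "m"
    return "h"


def sequence_probability(obs, T, U, pi):
    """Forward algorithm: P(o_1, ..., o_R | model) for symbol indices obs."""
    alpha = pi * U[:, obs[0]]              # initialization step
    for o in obs[1:]:
        alpha = (alpha @ T) * U[:, o]      # induction step
    return float(alpha.sum())              # termination step


# Illustrative amounts in dollars (hypothetical spending history).
amounts = [120, 45, 640, 80, 1500]
obs = [SYMBOL_INDEX[to_symbol(a)] for a in amounts]
phi_1 = sequence_probability(obs, T, U, PI)
print(phi_1)
```

For long sequences the raw product underflows, so a practical version would scale `alpha` at each step or work in log space.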

Let $O_{R+1}$ be the new symbol generated at time t+1, when a transaction is about to be processed. The total number of symbols is then R+1. To keep only R symbols, we drop $O_1$ and retain the R symbols from $O_2$ to $O_{R+1}$. Let the probability of the new sequence of R symbols be $\phi_2$:

$$\phi_2 = P(o_2, o_3, o_4, \ldots, o_{R+1} \mid \lambda)$$

Hence, we find

$$\Delta\phi = \phi_1 - \phi_2$$

If $\Delta\phi > 0$, the HMM assigns the new sequence, i.e., the one ending in $O_{R+1}$, a lower probability. The transaction is then considered fraudulent if and only if the relative change in probability reaches a predefined threshold value:

$$\Delta\phi / \phi_1 \ge \text{Threshold value}$$

The threshold value can be determined empirically. If the fraud detection system finds that the present transaction is malicious, the card-issuing bank rejects the transaction and the FDS does not add the symbol $O_{R+1}$ to the available sequence. If the transaction is genuine, the FDS adds this symbol to the sequence and uses it for future fraud detection.

V. EXPERIMENTAL RESULT AND ANALYSIS

Producing results for credit card fraud detection is difficult because banks do not provide their data, which is highly sensitive. We therefore use manually constructed data to show the results and accuracy of the fraud detection system. Table 1 lists the observations generated by online transactions of different purchase categories. With its help, we calculate the probability of each spending profile, high, low, and medium (h, l, and m), for every category (1, 2, and 3).

Table 1: List of all transactions to date

Transaction no   Category   Amount in $
1st              1          5
2nd              3          10
3rd              2          40
4th              2          75
5th              1          28
6th              3          115
7th              1          54
8th              2          110
9th              2          180
10th             1          119
11th             1          140
12th             1          240
13th             3          125
14th             2          280
15th             2          430
16th             1          520
17th             3          180
18th             2          560
19th             3          355
20th             2          560

According to this table we propose:

[Fig. 5: Transaction amount of each category]

Figure 5 shows the purchase amounts for the different categories (1, 2, and 3), considering low, medium, and high amounts. We simulate a large data set of spending profiles with our proposed FDS (fraud detection system) and find the probability of each observation sequence.

[Fig. 6: Different spending profile of each category]

We show the fraud categories of each transaction sequence; the three categories, low, medium, and high, are shown in Figure 4. We propose the mean distribution probability of the observation sequence, and according to it we separate genuine transactions from false transactions in the online credit card transaction data. We then generate tables from the given data for thresholds of 10% to 90% in steps of 10, using sequence lengths of 5 and 10 only; the same threshold operation can also be applied with on the order of a hundred states.

Table 2: Different sequence length of GT & FT

Threshold (%)   Sequence length 5   Sequence length 10
10              0.98                0.043
20              0.85                0.16
30              0.82                0.13
40              0.87                0.098
50              0.8                 0.075
60              0.83                0.04
70              0.85                0.025
80              0.98                0.03
90              1.00                0.042

For the accuracy of the system, we consider the number of genuine transactions, the number of false transactions, and the total number of transactions.
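The detection rule from Section IV (compute $\phi_1$ on the stored sequence, slide the window by one symbol to get $\phi_2$, and flag the transaction when $\Delta\phi > 0$ and $\Delta\phi/\phi_1$ reaches the threshold) can be sketched as follows. This is an illustrative sketch only: the function names, the one-state example model, and the guard for $\phi_1 = 0$ are assumptions introduced here and do not come from the paper.

```python
import numpy as np


def forward_probability(obs, T, U, pi):
    """P(obs | model) by the forward algorithm for a discrete HMM."""
    alpha = pi * U[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ T) * U[:, o]
    return float(alpha.sum())


def is_fraud(history, new_symbol, T, U, pi, threshold):
    """Apply the rule: fraud iff delta > 0 and delta / phi_1 >= threshold."""
    phi_1 = forward_probability(history, T, U, pi)
    window = history[1:] + [new_symbol]      # drop O_1, append O_{R+1}
    phi_2 = forward_probability(window, T, U, pi)
    if phi_1 == 0.0:                         # assumption: no evidence, do not flag
        return False
    delta = phi_1 - phi_2
    return bool(delta > 0 and delta / phi_1 >= threshold)


def update_history(history, new_symbol, T, U, pi, threshold):
    """Return (next_history, flagged). A flagged symbol is not added."""
    if is_fraud(history, new_symbol, T, U, pi, threshold):
        return list(history), True
    return history[1:] + [new_symbol], False


# Hypothetical one-state model of a mostly low-spending cardholder.
# Symbol indices: 0 = low, 1 = medium, 2 = high.
T_example = np.array([[1.0]])
U_example = np.array([[0.80, 0.15, 0.05]])
pi_example = np.array([1.0])

history = [0, 0, 0, 0, 0]
history, flagged = update_history(history, 2, T_example, U_example,
                                  pi_example, threshold=0.5)
print(history, flagged)
```

The threshold is passed as a fraction here, so the 10% to 90% settings of Table 2 would correspond to 0.1 through 0.9 under this convention.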

[Fig. 7: Mean distribution of fraud transactions]

The hidden Markov model reduces the complexity of the system and improves accuracy on large-scale transaction data. Accuracy is computed as

$$\text{Accuracy in \%} = \frac{GT - FT}{N} \times 100$$

where GT = number of genuine transactions, FT = number of fraud transactions, and N = total number of transactions. We calculate the mean distribution from the given table and plot the genuine and false transactions of the observation states. In Figure 8 we compare the accuracy of all methods with the HMM.

VII. CONCLUSION

An efficient credit card fraud detection system is an utmost requirement for card-issuing banks, for all types of online transactions made with credit cards. In this paper we have implemented a hidden Markov model for credit card fraud detection. Using the hidden Markov model makes detection straightforward and reduces the complexity of the system. We have also explained how the hidden Markov model can detect whether an incoming transaction is fraudulent or not. Comparative studies reveal that the accuracy of the system is 87-90% over a wide variation in the input data. We divide the transaction amounts into three groups, high, medium, and low, based on different ranges of transaction amount; each group corresponds to an observation symbol. Compared with other techniques, the hidden Markov model method is very low in fraud detection rate. The system is also scalable for handling large volumes of transactions.

VIII. REFERENCES

[1] Internet usage world statistics, http://www.ternetworldstats.com/stats.html, 2011.
[2] The Nilson Report, "U.S. credit card projected," The Nilson Report, pp. 7-8, October 2010.
[3] Mhamane S., Lobo L.M.R.J., "Fraud detection in online banking using HMM," International Conference on Information and Network Technology (ICINT), vol. 37, pp. 200-204, IACIT Press, Singapore, 2012.
[4] Sentinel Annual Report, http://ftc.gov/sentinel/report/sentinel-annualreport.pdf, 2010.
[5] Ghosh, S., and D.L. Reilly, "Credit card fraud detection with a neural network," Proceedings of the 27th Hawaii International Conference on System Sciences, Los Alamitos, CA: IEEE Computer Society, pp. 621-630, 1994.
[6] Kokkinaki, A., "On Atypical Database Transactions: Identification of Probable Frauds using Machine Learning for User Profiling," Knowledge and Data Engineering Exchange Workshop, pp. 107-113, 1997.
[7] Maes, S., K. Tuyls, B. Vanschoenwinkel, and B. Manderick, "Credit Card Fraud Detection Using Bayesian and Neural Networks," Proceedings of the 1st International NAISO Congress on Neuro Fuzzy Technologies, Havana, Cuba, 2002.
[8] Chan, Philip L., and Salvatore J. Stolfo, "Toward Scalable Learning with Non-uniform Class and Cost Distribution: A Case Study in Credit Card Fraud Detection," Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp. 164-168, 1998.
[9] Kim, M., and T. Kim, "A Neural Classifier with Fraud Density Map for Effective Credit Card Fraud Detection," Proceedings of IDEAL, pp. 378-383, 2002.
[10] Syeda, M., Zhang, Y.Q., and Pan, Y., "Parallel Granular Networks for Fast Credit Card Fraud Detection," Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 572-577, 2002.
[11] Bentley, Peter J., Kim, Jungwon, Jung, Gil-ho, and Choi, Jong-uk, "Fuzzy Darwinian Detection of Credit Card Fraud," Proceedings of the 14th Annual Fall Symposium of the Korean Information Processing Society, 2000.
[12] Bolton, Richard J., and Hand, David J., "Statistical Fraud Detection: A Review," Statistical Science, vol. 10, no. 3, pp. 235-255, 2002.
[13] Rabiner, Lawrence R., "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
[14] Srivastava, Abhinav, Kundu, Amlan, Sural, Shamik, and Majumdar, Arun K., "Credit Card Fraud Detection Using Hidden Markov Model," IEEE Transactions on Dependable and Secure Computing, vol. 5, no. 1, pp. 37-48, 2008.
[15] V. Bhusari, S. Patil, "An Application of Hidden Markov Model in Credit Card Fraud Detection," International Journal of Distributed and Parallel Systems (IJDPS), vol. 2, no. 6, 2011.
[16] V. Dheepa, R. Dhanapal, "Analysis of Credit Card Fraud Detection Methods," International Journal of Recent Trends in Engineering, vol. 2, no. 3, pp. 126-127, November 2009.
[17] Fonzo, Valeria De, Aluffi-Pentini, Filippo, and Valeria, Arun K., "Hidden Markov Models in Bioinformatics," Current Bioinformatics, vol. 2, pp. 49-61, 2007.
[18] Li Xie, Valery A. Ugrinovskii, Ian R. Petersen, "A Posterior Probability Distances Between Finite-Alphabet Hidden Markov Models," IEEE, vol. 53, no. 2, pp. 783-793, 2007.
[19] S.B. Cho and H.J. Park, "Efficient Anomaly Detection by Modeling Privilege Flows Using Hidden Markov Model," Computer and Security, vol. 22, no. 1, pp. 45-55, 2003.
[20] S.S. Joshi and V.V. Phoha, "Investigating Hidden Markov Models Capabilities in Anomaly Detection," Proc. 43rd ACM Ann. Southeast Regional Conf., vol. 1, pp. 98-103, 2005.
[21] Chiu, C., and C. Tsai, "A Web Services-Based Collaborative Scheme for Credit Card Fraud Detection," Proceedings of the 2004 IEEE International Conference on e-Technology, e-Commerce and e-Service, 2004.
[22] Ngai, E.W.T., Yong Hu, Y.H. Wong, Yijun Chen, and Xin Sun, "The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature," Decision Support Systems, pp. 559-569, 2011.
[23] Shajith Ikbal, Tanveer Faruquir, "HMM Based Event Detection in Audio Conversation," IBM India Research Lab, vol. 8, pp. 1497-1500, IEEE, 2008.
[24] Mhamane S.S., Lobo L.M.R.J., "Use of Hidden Markov Model as Internet Banking Fraud Detection," International Journal of Computer Applications (0975-8887), vol. 45, no. 21, May 2012.
[25] D. Ourston, S. Matzner, W. Stump, and B. Hopkins, "Application of Hidden Markov Models to Detecting Multi-stage Network Attacks," Proc. of the 36th Hawaii International Conference on System Sciences (HICSS'03), IEEE, 2002.