International Journal of Pure and Applied Mathematics Volume 114 No. 11 2017, 157-165 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu Special Issue
Detection of Fraudsters in Electronic Auction Using Hidden Markov Model with X-Means Clustering Algorithm
T.S. Swathy Lakshmy¹, Aswathi Sivadasan² and L. Nitha³
¹Department of Computer Science and I.T, School of Arts and Sciences, Amrita University, Kochi. [email protected]
²Department of Computer Science and I.T, School of Arts and Sciences, Amrita University, Kochi. [email protected]
³Department of Computer Science and I.T, School of Arts and Sciences, Amrita University, Kochi. [email protected]
Abstract
Electronic auctions play a vital role in daily life, and their convenience has made online auction sites very popular. These sites are, however, exposed to fraudulent activity. In this paper, we detect fraudulent activity in online auctions by analyzing customers' transaction histories. To detect the fraudsters who undermine electronic auctions, we propose an improved mechanism that combines a Hidden Markov Model (HMM) with the X-means clustering algorithm. X-means is a modified version of the K-means clustering algorithm that addresses some of the problems of the K-means approach. A Hidden Markov Model is a statistical model that gives approximate values from limited input and provides a probabilistic description of fake bidding behavior in an online auction. We compare the results of the current model, which uses a Hidden Markov Model with K-means, against our proposed model. The results indicate that our model is more accurate and consumes less time.
Key Words: Online auction, fraud detection, X-means clustering, Hidden Markov Model (HMM).
1. Introduction
Over recent years, online auctions have grown massively. This is mainly because online auctions remove the physical barriers of traditional auctions, such as place, presence and time. Online auction websites provide a common platform where buyers and sellers can meet and bid for products of interest. This is advantageous and cost effective. Because of their ease of use and rapid growth, however, there are loopholes that let fraudsters place fake bids in an auction. Through these fake bids, fraudsters earn money from customers legally or illegally. Fraud in online auctions occurs in three stages: pre-auction, in-auction and post-auction. In this paper, we concentrate on online auction fraud in the in-auction phase. We use a Hidden Markov Model (HMM) and the X-means clustering algorithm to detect and categorize fraudsters on online auction websites. A Hidden Markov Model is a statistical model that gives approximate values from limited input and provides a probabilistic description of fake bidding behavior in an online auction. In an HMM, the states are not directly visible, but the output that depends on the state is visible [1]. The K-means clustering algorithm is a numerical, unsupervised, non-deterministic and iterative clustering method. In K-means, each cluster is represented by a centroid, which is the mean value of the objects in the cluster [18]. The X-means clustering algorithm is an improved version of K-means that produces purer group assignments by repeatedly attempting to subdivide clusters and keeping the best resulting splits until some stopping criterion is reached [9].
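To make the centroid idea concrete, the following is a minimal one-dimensional K-means sketch in Python. It is an illustration only, not the implementation used in this paper. The function names `k_means_1d` and `nearest_centroid`, the explicitly supplied initial centroids and the toy bid values are all assumptions introduced for the example.

```python
def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to the value x."""
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))


def k_means_1d(points, centroids, max_iterations=100):
    """Lloyd-style K-means on 1-D data, starting from the given centroids.

    Returns the final centroids and the clusters assigned to them.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for x in points:
            clusters[nearest_centroid(x, centroids)].append(x)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, clusters


# Hypothetical bid amounts forming two obvious groups.
bids = [1.0, 2.0, 3.0, 21.0, 22.0, 23.0]
centroids, clusters = k_means_1d(bids, [1.0, 23.0])
print(centroids)  # [2.0, 22.0]
```

Note that the number of clusters is fixed in advance by the length of the initial centroid list. X-means removes that requirement by deciding for itself whether a cluster should be split.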
2. Literature Review
This section surveys prior work on fraud in electronic auctions and the detection methods that other researchers have proposed. Priyanka Gupta et al. [1] proposed an algorithm that applies a Hidden Markov Model and the K-means algorithm. Their work detects electronic auction fraud by considering customers' bidding behavior, generalized into only three classes: low, medium and high. A bid is treated as fraudulent when it falls outside the range. Jau-Shien Chang et al. [2] proposed a method that applies the X-means clustering algorithm after phase profiling of transaction histories, in order to disclose the hidden statuses of fraudsters by monitoring behavioral fluctuations. Additionally, they proposed a two-way monitoring procedure that combines forward status monitoring and backward status monitoring to correct fraudster misjudgment by observing the statuses of a suspicious account in each direction. Jie Zhang et al. [3] adopted a combined approach, using a fuzzy logic expert system model and a statistical hedonic regression model, to evaluate users' bidding behavior on an electronic auction website. This integrated approach overcomes the drawbacks of each method by combining the advantages of both. Wen-Hsi Chang et al. [4] proposed a new early fraud detection mechanism that addresses timeliness and accuracy simultaneously. To determine the most relevant attributes that discriminate between fraudsters and ordinary merchants, they used an improved wrapper technique and a complementary phased modelling procedure. Their results imply that the binary reputation method is not faultless but is beneficial. Ankit Mundra et al. [5] elaborated the required architecture and functioning of the various units of the Online Hybrid Model (OHM). They demonstrated the strength of these units in terms of their operational interactions and logical design. They also evaluated the performance of the model and showed that OHM is a very efficient approach to recognizing or preventing electronic auction fraud. Jarrod Trevathan et al. [6] proposed an algorithm that identifies shill bidding in electronic auctions by computing a score, called the shill score, which combines several distinct factors that play a part in an auction. If fraudulent activity occurs, the shill score will be higher. Xiling Cui et al. [7] attempted to identify bidding strategies empirically in online single-unit auctions and to estimate their outcomes. A research model was developed and used to examine the relationship. Early bidding, snipe bidding and agent-supported ratchet bidding are the three main bidding strategies they found. Their findings suggest that snipe bidding produces the best winning outcome.
Wen-Hsi Chang et al. [8] proposed a novel two-stage phased modelling framework that combines hybrid phased models with successive filtering techniques to recognize latent fraudsters by observing the phased features of a potential fraudster's lifecycle. This improves the recognition of hidden fraudsters disguised as genuine accounts. Their results also show the effectiveness of the chosen measuring attributes, which are useful for improving the performance of the recognition model. Duen Horng Chau et al. [11] aim to tackle electronic auction fraud with a novel method for identifying auction fraudsters, which involves determining and extracting characteristic features from exposed fraudsters by examining their transaction history, which exists as a graph.
Andrea Marin [13] described the efficiency and limits of Hidden Markov Models and the contexts in which they can be used effectively. Keerthi A. et al. [14] used a Hidden Markov Model to reduce fraudulent practices in credit card usage. They further proposed an improvement to the current model that uses an improved K-means clustering algorithm, which overcomes some disadvantages of K-means and detects fraud more efficiently. Yungchang Ku et al. [16] proposed a method that identifies probable fraudsters using social network analysis and a decision tree, which helps buyers avoid fraudulent activity in electronic auctions. Jinal C. Thosani et al. [17] modeled the sequence of operations in credit card transaction processing using a Hidden Markov Model, both to detect fraud and to ensure that authentic transactions are accepted. The HMM decides whether an incoming transaction is fraudulent based on the resulting probability.
3. Proposed Work
A. Hidden Markov Model Study
A Hidden Markov Model is a model to which we provide inputs and which exhibits close-to-correct probabilistic behavior, which directly helps in estimating the probability that shill bids are present in an active auction [1]. HMMs are used in many areas, such as bioinformatics, robotics, speech recognition, data mining, voice recognition and artificial intelligence. An HMM is defined as follows [1, 19, 20]:
1. $N$ is the number of states in the model, denoted $S = \{S_1, S_2, \ldots, S_N\}$.
2. At any instant $t$ the state is denoted by $q_t$.
3. $M$ is the number of distinct observation symbols, denoted $V = \{v_1, v_2, \ldots, v_M\}$.
4. The state transition probability matrix is $A = [a_{ij}]$, where $a_{ij} = \Pr(q_{t+1} = S_j \mid q_t = S_i)$.
5. The observation symbol probability matrix is $B = [b_j(m)]$, where $b_j(m) = \Pr(v_m \text{ at } t \mid q_t = S_j)$.
6. The initial state probability vector is $\pi = [\pi_i]$, where $\pi_i = \Pr(q_1 = S_i)$.
The HMM involves two stages:
1. Training stage
2. Detection stage
B. Fraud Detection
In an online auction, the customer needs to register on the auction website with a unique registration ID and password. For fraud prevention, the registration phase should be verified. In the initial stage, the customer has to provide personal details such as full name, date of birth, gender, address, nationality, contact number, email ID and card details. The HMM parameters are defined using basic and important attributes from the auction site (eBay), such as bid, bid time, bidder rate, open bid and value of bid. To complete the HMM, the observation symbols are calculated using the X-means clustering algorithm: the observation symbols are the clusters formed by X-means. In the detection phase, each new bid placed by a customer is analyzed against that customer's bidding habits, and the bid is judged legitimate or not on the basis of the behavioral fluctuations.
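The detection phase can be illustrated with a small sketch. The text does not spell out the exact decision rule, so the rule below is an assumption: the likelihood of the customer's recent symbol sequence is computed with the standard forward algorithm, the window is shifted to include the new bid's symbol, and the bid is flagged when the likelihood drops by more than a chosen fraction. The names `forward_likelihood` and `is_suspicious`, the threshold of 0.5, the two-state toy parameters `A`, `B`, `PI` and the symbol coding (0 = low, 1 = medium, 2 = high cluster) are hypothetical and are not taken from the paper.

```python
def forward_likelihood(obs, A, B, pi):
    """Pr(obs | A, B, pi) for a discrete HMM, computed by the forward algorithm.

    A[i][j] is the transition probability from state i to state j,
    B[j][m] is the probability of emitting symbol m in state j,
    pi[i] is the probability of starting in state i.
    """
    if not obs:
        return 1.0
    n = len(pi)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: move to state j and emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)


def is_suspicious(history, new_symbol, A, B, pi, threshold=0.5):
    """Flag a new bid if it makes the recent sequence much less likely.

    Assumed rule: compare the old window with the window shifted by one
    symbol, and flag when the relative drop in likelihood exceeds threshold.
    """
    old = forward_likelihood(history, A, B, pi)
    new = forward_likelihood(history[1:] + [new_symbol], A, B, pi)
    if old == 0.0:
        return False  # no usable baseline for this customer
    return (old - new) / old > threshold


# Toy parameters (assumed): state 0 mostly emits low bids, state 1 high bids.
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.15, 0.05], [0.05, 0.15, 0.8]]
PI = [0.9, 0.1]

habit = [0, 0, 0, 0]  # a customer who habitually places low bids
print(is_suspicious(habit, 2, A, B, PI))  # True: a sudden high bid
print(is_suspicious(habit, 0, A, B, PI))  # False: consistent with habit
```

In practice the matrices would be estimated in the training stage from each customer's history, and the threshold would need to be tuned on labelled data.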
4. Comparison between Current Model and Proposed Model and Result
The existing system uses a Hidden Markov Model with the K-means clustering algorithm to detect and categorize fraudsters in an online auction. From our study of prior work [2], however, the X-means clustering algorithm provides better results than the K-means clustering algorithm. In our proposed work, we therefore used X-means instead of K-means. We took a dataset from the online auction site eBay and performed X-means and K-means clustering on the same dataset. With the percentage split and maxIterations attributes held constant for both clustering algorithms, the number of clustered instances generated was considerably larger for X-means (Fig. 1) than for K-means (Fig. 2). The number of iterations performed and the time taken to build the model were also lower for X-means. On the basis of iterations, time and clustered instances, we conclude that the X-means clustering algorithm is more efficient than the K-means clustering algorithm. X-means can work with larger datasets than K-means while consuming less time. It improved the accuracy rate and hence made the model cost effective. The figures below show the WEKA Explorer output with the clustered instances for X-means and K-means:
Fig. 1: Output for X-means clustering algorithm
Fig. 2: Output for K-means clustering algorithm
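The splitting idea that distinguishes X-means from K-means can be sketched as follows. This is a simplified illustration under stated assumptions, not the WEKA implementation and not the exact procedure of [9]: the data are one-dimensional, a cluster is split at its mean as a stand-in for a local 2-means run, the score is a standard Bayesian Information Criterion (BIC) for hard-assigned clusters under a shared-variance Gaussian model, and there is no global K-means refinement between rounds. The function names `bic`, `split_at_mean` and `x_means_1d` are hypothetical.

```python
import math


def bic(clusters):
    """BIC of a hard clustering of 1-D points (shared-variance Gaussians).

    Higher is better. This is an assumed, simplified form of the criterion.
    """
    total = sum(len(c) for c in clusters)
    k = len(clusters)
    sse = sum(sum((x - sum(c) / len(c)) ** 2 for x in c) for c in clusters)
    variance = max(sse / max(total - k, 1), 1e-12)  # guard against zero
    log_likelihood = -sse / (2 * variance)
    for c in clusters:
        log_likelihood += len(c) * math.log(len(c) / total)  # mixing weight
        log_likelihood -= len(c) / 2 * math.log(2 * math.pi * variance)
    free_parameters = (k - 1) + k + 1  # weights, centroids, shared variance
    return log_likelihood - free_parameters / 2 * math.log(total)


def split_at_mean(points):
    """Split a 1-D cluster into the points at or below and above its mean."""
    mean = sum(points) / len(points)
    return [x for x in points if x <= mean], [x for x in points if x > mean]


def x_means_1d(points):
    """Keep splitting clusters while the split improves the local BIC."""
    if not points:
        return []
    pending, accepted = [list(points)], []
    while pending:
        cluster = pending.pop()
        left, right = split_at_mean(cluster)
        splittable = len(cluster) >= 3 and left and right
        if splittable and bic([left, right]) > bic([cluster]):
            pending.extend([left, right])  # children may split again
        else:
            accepted.append(cluster)  # splitting does not help: keep as is
    return accepted


# Two well-separated groups of hypothetical bid values are split once.
print(len(x_means_1d([1.0, 2.0, 3.0, 21.0, 22.0, 23.0])))  # 2
# Evenly spread values are left as a single cluster.
print(len(x_means_1d([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])))  # 1
```

Unlike the K-means approach, the number of clusters here is an output of the procedure rather than an input, which is the property the proposed model relies on when deriving observation symbols.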
5. Conclusion
Online auctions have become one of the major concerns in worldwide marketing. They help buyers and sellers purchase and sell products from any corner of the world at their fingertips. Because of their ease of use and rapid growth, there are loopholes that let fraudsters place fake bids in an auction. Auction fraudsters damage the auction by placing shill bids that drive an item to a falsely high price, so real buyers cannot get the product at an affordable rate. There are several approaches to detecting in-auction fraudsters. Here we compared an existing approach that uses a Hidden Markov Model with K-means against our proposed model that uses a Hidden Markov Model with X-means. From our study, we found that the approach using a Hidden Markov Model with X-means is more effective at detecting online fraudsters than the former approach.
References
[1] Priyanka Gupta, Ankit Mundra, Online In-Auction Fraud Detection Using Online Hybrid Model, International Conference on Computing, Communication and Automation (2015).
[2] Jau-Shien Chang, Wen-Hsi Chang, Analysis of fraudulent behavior strategies in online auctions for detecting latent fraudsters, Electronic Commerce Research and Applications 13 (2014), 79-97.
[3] Jie Zhang, Edmund L. Prater, Ilya Lipkin, Feedback reviews and bidding in online auctions: An integrated hedonic regression and fuzzy logic expert system approach, Decision Support Systems, 55 (2013), 89-902.
[4] Wen-Hsi Chang, Jau-Shien Chang, An effective early fraud detection method for online auctions, Electronic Commerce Research and Applications 11 (2012), 346-360.
[5] Ankit Mundra, Nitin Rakesh, Ghrera S.P., Empirical study of Online Hybrid Model for Internet frauds Prevention and Detection, International Conference on Human Computer Interactions (2013).
[6] Jarrod Trevathan, Wayne Read, Detecting Collusive Shill Bidding, Fourth International Conference on Information Technology (2007).
[7] Xiling Cui, Vincent S. Lai, Bidding strategies in online single-unit auctions: Their impact and satisfaction, Information & Management 50 (2013), 314-321.
[8] Wen-Hsi Chang, Jau-Shien Chang, A novel two-stage phased modeling framework for early fraud detection in online auctions, Expert Systems with Applications 38 (2011), 11244-11260.
[9] Dan Pelleg, Andrew Moore, X-means: Extending K-means with Efficient Estimation of the Number of Clusters, Proceedings of the Seventeenth International Conference on Machine Learning (2000), 727-734.
[10] Fei Dong, Sol M. Shatz and Haiping Xu, Combating online in-auction fraud: Clues, techniques and challenges, Computer Science Review 3 (2009), 245-258.
[11] Duen Horng Chau, Christos Faloutsos, Fraud Detection in Electronic Auction, https://www.researchgate.net/publication/249906880_Fraud_Detection_in_Electronic_Auction (2005).
[12] Duen Horng Chau, Shashank Pandit, and Christos Faloutsos, Detecting Fraudulent Personalities in Networks of Online Auctioneers, European Conference on Principles of Data Mining and Knowledge Discovery (2006), 103-114.
[13] Andrea Marin, Hidden Markov Models applied to Data Mining, https://www.researchgate.net/publication/268303004_Hidden_Markov_Models_applied_to_Data_Mining (2006).
[14] Keerthi A., Remya M.S., Nitha L., Detection of credit card frauds using hidden markov model with improved K-means clustering algorithm, International Journal of Applied Engineering Research, 10(55) (2015).
[15] Wenli Wang, Zoltan Hidvegi and Andrew B. Whinston, An Intermediation Shill-Deterrent Fee Schedule (SDFS), https://www.researchgate.net/publication/2485434_An_Intermediation_Shill-Deterrent_Fee_Schedule_SDFS (2001).
[16] Yungchang Ku, Yuchi Chen, Chaochang Chiu, A Proposed Data Mining Approach for Internet Auction Fraud Detection, Pacific-Asia Workshop on Intelligence and Security Informatics 4430 (2007), 238-243.
[17] Jinal C. Thosani, Chetashri Bhadane, Harsh M. Avlani, Zalak H. Parekh, Credit Card Fraud Detection Using Hidden Markov Model, International Journal of Scientific & Engineering Research 5 (2014).
[18] Jyoti Yadav, Monika Sharma, A Review of K-mean Algorithm, International Journal of Engineering Trends and Technology (IJETT) 4 (2013).
[19] Rabiner, Lawrence, Biing-Hwang Juang, An introduction to hidden Markov models, ASSP Magazine 3(1) (1986), 4-16.
[20] Rabiner L.R., A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77(2) (1989), 257-286.