Kernel neighbor density with parallel computing mechanism for anomaly detection algorithm*

Rui Ma1, Hui Cao1,2, Shuzhi Sam Ge2 and Hongliang Ren2

Abstract— Anomaly detection is an important research direction in the field of data mining and industrial dataset preprocessing. This paper proposes a kernel neighbor density definition with a parallel computing mechanism for anomaly detection algorithms. The kernel neighbor density formula calculates the density of points in a high-dimensional feature space. The definition adopts the median operation because the breakdown point of the median is the largest possible, so the definition is a very robust estimate of the data location; a parallel computing mechanism is introduced to improve the efficiency of the algorithms. We use two real datasets and three different kernel functions to evaluate the performance of the algorithms. The experimental results confirm that the presented kernel neighbor density definition improves the performance of the algorithms and that the Gaussian kernel function has the best effect.
I. INTRODUCTION

Anomaly detection, the task of finding objects that deviate strongly from the rest of the objects according to suitable metrics, has become an important task in data mining. The anomalous object is usually called an outlier. Hawkins describes an outlier intuitively as: "An outlier is an object that deviates from other objects so much as to be suspected that it was generated by a different mechanism" [1]. Hence, an outlier cannot simply be treated as noise; it may represent an interesting phenomenon or contain very meaningful knowledge from an application perspective. Anomaly detection is used in many practical domains, such as financial fraud detection for detecting the abnormal usage of credit cards [2], network intrusion detection [3], medical security monitoring [4], and product qualification tests [5]. Anomaly detection has therefore become an important research direction in data mining, and various schemes for eliminating outliers or minimizing their influence have been presented. At present, anomaly detection algorithms can generally be divided into statistics-based, distance-based, clustering-based and density-based approaches.

*This work is supported by the National Natural Science Foundation of China under Grant 61375055, the Program for New Century Excellent Talents in University under Grant NCET-12-0447, the Natural Science Foundation of Shaanxi Province of China under Grant 2014JQ8365 and the A*STAR Industrial Robotics Program of Singapore under grants R-261-506-007-305 and R-261-506-008-305.

1 The State Key Laboratory of Electrical Insulation and Power Equipment, Electrical Engineering School, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
[email protected], [email protected]

2 The Faculty of Engineering & Advanced Robotics Centre, National University of Singapore, 117580, Singapore
samge, [email protected]
Statistics-based approaches were the earliest used for anomaly detection [6]. They assume a model or distribution for a given dataset [7] and evaluate objects according to how well they fit the model. These approaches work well when there is sufficient knowledge about the dataset. Clustering-based approaches first cluster all the objects and then evaluate the degree to which each object belongs to a cluster [8]. In general, clustering algorithms detect outliers as by-products of the clustering process. Although clustering-based approaches can find clusters and outliers at the same time, their effectiveness depends heavily on the clustering result. Distance-based anomaly detection methods judge an object to be an outlier based on the distance to its nearest neighbors [9]. They are convenient to use, but they cannot effectively handle datasets with regions of different density, and their results are sensitive to parameter selection. The concept of data density is based on the distance between objects and the number of objects within a given range. From the density-based viewpoint, outliers are objects that lie in regions of low density. In general, density-based anomaly detection algorithms give a quantitative measure of the degree of outlierness [10][11][12], which depends on how isolated an object is with respect to its surrounding neighborhood. Therefore, density-based anomaly detection algorithms work well on datasets with regions of different density. Nevertheless, the weakness of this solution is that it is very sensitive to the parameters defining the neighborhood.

This paper proposes a kernel neighbor density definition with a parallel computing mechanism for anomaly detection algorithms. To make the differences between data objects more apparent, objects are mapped to a high-dimensional feature space and a kernel neighbor density definition is proposed. The definition adopts the median operation because the breakdown point of the median is the largest possible, so it is a very robust estimate of the data location; a parallel computing mechanism is introduced to improve the efficiency of the algorithms. We use two real datasets and three different kernel functions to evaluate the performance of the algorithms.

The rest of the paper is organized as follows. Section 2 reviews some existing work. Section 3 explains the proposed definition and the parallel mechanism in detail. Section 4 presents the experimental results. Finally, Section 5 concludes the paper.
II. RELATED WORK

In this section, we review existing density-based anomaly detection algorithms and look back at some existing definitions.

Density-based anomaly detection algorithms are widely used because of their good performance and their ability to identify local anomalies. The local outlier factor (LOF) [10] was the first concept to quantify the degree to which an object is an outlier. The LOF of an object is based on the average of the ratios of the local reachability densities of its neighbors to the local reachability density of the object itself. A parameter called MinPts (minimum number of points) determines the size of the neighborhood of the object. Hence, the lower the local reachability density of an object and the larger the local reachability densities of its MinPts-nearest neighbors, the larger the LOF value of the object. Literature [11] presents a connectivity-based outlier factor (COF) algorithm, which considers the way an object and other objects are "connected". The COF of an object is based on the ratio of the average chaining distance of the object to the average of the average chaining distances of its k-distance neighbors to their own neighbors. The COF algorithm can deal with low-density patterns in which the density of an outlier is similar to that of its neighborhood. Literature [12] proposes a density-similarity-neighbor-based outlier factor (DSNOF) algorithm for anomaly detection. DSNOF constructs the average series cost (ASC) of each object, and the DSNOF value of an object is the ratio of the ASC of the object to the average ASC of its k-distance neighbors.

Let D be a dataset and p an object of D. We recall two definitions, the k-distance and the k-distance neighborhood, which were discussed in literature [10].

Definition 2.1 (k-distance of an object p). For any positive integer k, the k-distance of object p, denoted k-distance(p), is defined as the distance d(p,o) between p and an object o of D such that:
(i) for at least k objects $o' \in D \setminus \{p\}$ it holds that $d(p,o') \le d(p,o)$, and
(ii) for at most k-1 objects $o' \in D \setminus \{p\}$ it holds that $d(p,o') < d(p,o)$.

Definition 2.2 (k-distance neighborhood of an object p). Given the k-distance of p, the k-distance neighborhood of p contains every object whose distance from p is not greater than the k-distance. The k-distance and the k-distance neighborhood are illustrated in Fig. 1.
Fig. 1. k-distance(o), for k = 4 and k = 5.
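The two definitions can be made concrete with a short sketch. The following is a minimal NumPy illustration; the point values and helper names are ours, not taken from the paper. Note that, because of distance ties, the k-distance neighborhood may contain more than k objects.

```python
# Minimal sketch of Definitions 2.1 and 2.2 (illustrative values, not from the paper).
import numpy as np

def k_distance(p, data, k):
    """Distance from p to its k-th nearest other object in data."""
    dists = np.linalg.norm(data - p, axis=1)
    dists = np.sort(dists[dists > 0])           # simplification: drop p itself (distance 0)
    return dists[k - 1]

def k_distance_neighborhood(p, data, k):
    """All objects whose distance from p is not greater than k-distance(p)."""
    kd = k_distance(p, data, k)
    dists = np.linalg.norm(data - p, axis=1)
    return data[(dists <= kd) & (dists > 0)]

# Example: with ties, the neighborhood may hold more than k objects.
D = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [5.0, 5.0]])
p = D[0]
print(k_distance(p, D, k=3))                    # -> 1.0
print(len(k_distance_neighborhood(p, D, k=3)))  # -> 4, four neighbors tied at distance 1
```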
III. ALGORITHMS

In this section, the kernel neighbor density definition and the parallel mechanism are explained in detail.

A. Kernel neighbor density

In this subsection, we briefly introduce the kernel function and then introduce the definition of kernel neighbor density. The kernel function is often used in classification or regression problems [13]. According to pattern recognition theory, a nonlinear mapping can change a linearly inseparable problem into a separable one. But if this technique is applied directly to classification or regression in a high-dimensional space, some problems arise, such as determining the form and parameters of the nonlinear mapping function and obtaining the dimension of the feature space. The biggest obstacle is the curse of dimensionality in the high-dimensional feature space; using a kernel function can effectively avoid these problems. Assume $x, z \in X$, $X \subseteq R^n$, and let the nonlinear function $\Phi$ realize the mapping from the input space X to the feature space F, with $F \subseteq R^m$ and $n \ll m$. According to the kernel technique, $K(x,z) = \langle \Phi(x), \Phi(z) \rangle$ is the inner product, and K(x,z) is the kernel function. The kernel function transforms an inner product calculation in the high-dimensional feature space into a kernel evaluation in the low-dimensional input space, which neatly sidesteps computational problems such as the curse of dimensionality and thus lays a solid foundation for solving complex problems in high-dimensional spaces. Three commonly used kernel functions are:

∙ Gaussian kernel: an example of a radial basis function kernel, $K(x,y) = \exp(-\|x-y\|^2/\sigma^2)$, $\sigma > 0$.
∙ Polynomial kernel: a non-stationary kernel, well suited for problems where all the training data are normalized, $K(x,y) = (x \cdot y + 1)^d$, $d > 0$.
∙ Laplacian kernel: completely equivalent to the exponential kernel, except for being less sensitive to changes in the sigma parameter; being equivalent, it is also a radial basis function kernel, $K(x,y) = \exp(-\gamma\|x-y\|)$, $\gamma > 0$.

More kernel functions can be found in literature [14]. To make the differences between data objects more apparent, we map the objects to the high-dimensional feature space. According to the kernel function, the distance in the feature space can be expressed as

$d_K(x,y) = \sqrt{K(x,x) - 2K(x,y) + K(y,y)}$

This kernel distance formula calculates the distance between points in the high-dimensional space, and different kernel functions produce different effects.
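As an illustration, below is a minimal sketch of the three kernels and the induced kernel distance; the parameter values and sample points are our assumptions, not taken from the paper.

```python
# Minimal sketch of the three kernels and d_K(x, y) = sqrt(K(x,x) - 2K(x,y) + K(y,y)).
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / sigma ** 2)

def polynomial_kernel(x, y, d=2):
    return (np.dot(x, y) + 1) ** d

def laplacian_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.linalg.norm(x - y))

def kernel_distance(x, y, kernel):
    # Distance between the images of x and y in the feature space,
    # computed without evaluating the mapping Phi explicitly.
    return np.sqrt(kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y))

x = np.array([1.0, 2.0])
y = np.array([2.0, 0.5])
for k in (gaussian_kernel, polynomial_kernel, laplacian_kernel):
    print(k.__name__, kernel_distance(x, y, k))
```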
Let $N_k(p)$ denote the k-distance neighborhood of the object p. Based on the kernel distance formula, we propose a new density definition for an object, called the kernel neighbor density of the object.

Definition 2.3 (kernel neighbor density of an object p). The kernel neighbor density is defined as

$kerden(p) = \dfrac{|N_k(p)|}{\mathrm{median}\{d_K(p,o) \mid o \in N_k(p)\}}$
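A minimal sketch of this definition is given below, using the Gaussian kernel as an example; the helper names, the choice to build the neighborhood with the kernel distance, and the sample data are our assumptions rather than details fixed by the paper.

```python
# Minimal sketch of kerden(p) = |N_k(p)| / median{ d_K(p, o) : o in N_k(p) }.
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / sigma ** 2)

def kernel_distance(x, y, kernel=gaussian_kernel):
    return np.sqrt(kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y))

def kernel_neighbor_density(p_idx, data, k, kernel=gaussian_kernel):
    p = data[p_idx]
    d = np.array([kernel_distance(p, o, kernel) for o in data])
    d[p_idx] = np.inf                        # exclude p itself
    k_dist = np.sort(d)[k - 1]               # k-distance measured in the feature space
    neighborhood = d[d <= k_dist]            # may hold more than k objects (ties)
    # Median rather than mean: its 50% breakdown point keeps the estimate
    # robust against a few extreme neighbor distances.
    return len(neighborhood) / np.median(neighborhood)

rng = np.random.default_rng(0)
D = np.vstack([rng.normal(0, 1, size=(50, 2)), [[8.0, 8.0]]])   # 50 inliers + 1 isolated point
print(kernel_neighbor_density(0, D, k=5))    # point in the dense region -> larger value
print(kernel_neighbor_density(50, D, k=5))   # the isolated point -> smaller value
```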
This kernel neighbor density formula calculates the density of points in the high-dimensional space. In our definition, we adopt the median operation because the breakdown point of the median is 50%, which is the largest possible [15]; the definition is therefore a very robust estimate of the data location. Moreover, using the size of $N_k(p)$ in the definition handles the case where several neighbors of an object lie at the same distance from it.

B. Parallel computing mechanism

In recent years, with the rapid increase in data volume, data mining has faced a difficult problem: the contradiction between data complexity and the limited capacity of computing systems has become increasingly prominent, and traditional stand-alone systems exhibit slow speed, low efficiency and high energy consumption. Using a PC cluster for distributed parallel computing provides a scalable and easy-to-implement solution for massive data mining, and research on parallel computing environments is one of the hotspots in high-performance computing. Fig. 2 shows a simple diagram of parallel computing.
Fig. 2. A simple framework for parallel computing.

As shown in Fig. 2, a parallel computing system consists of three kinds of roles. The client submits a job and divides it into multiple tasks. The scheduler manages the job and assigns the tasks to different workers. Each worker performs the actual computation and returns its results to the scheduler, which aggregates them and returns the summary to the client. This is the basic workflow of a parallel computing platform.

We introduce this parallel computing mechanism into density-based anomaly detection algorithms. Most density-based anomaly detection algorithms assign an outlier factor to each object; the outlier factor formulas of the LOF, COF and DSNOF algorithms are

$LOF_{MinPts}(p) = \dfrac{\sum_{o \in N_{MinPts}(p)} \frac{lrd_{MinPts}(o)}{lrd_{MinPts}(p)}}{|N_{MinPts}(p)|}$

$COF_k(p) = \dfrac{|N_k(p)| \cdot ac\text{-}dist_k(p)}{\sum_{o \in N_k(p)} ac\text{-}dist_k(o)}$

$DSNOF(p) = \dfrac{|N_k(p)| \cdot ASC(p)}{\sum_{o \in N_k(p)} ASC(o)}$

In the LOF, COF and DSNOF algorithms, the outlier factor of each object is associated only with the objects in its neighborhood. The calculation of each object's outlier factor can therefore be treated as an independent task, and all the tasks can be assigned to different workers. Using a parallel platform in this way can greatly improve the running efficiency of density-based anomaly detection algorithms, as shown in the sketch below.
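The following is a minimal sketch of this per-object parallelization, using Python's multiprocessing.Pool as a stand-in for the client/scheduler/worker platform of Fig. 2. The outlier score used here is a simplified density-ratio built on the kernel neighbor density, not the exact LOF, COF or K-DSNOF formula; names, data and parameter values are illustrative assumptions.

```python
# Minimal sketch: each object's outlier factor is an independent task handed to a worker.
import numpy as np
from multiprocessing import Pool

def gaussian_kernel(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / sigma ** 2)

def kernel_distance(x, y):
    return np.sqrt(gaussian_kernel(x, x) - 2.0 * gaussian_kernel(x, y) + gaussian_kernel(y, y))

def kerden(i, data, k):
    d = np.array([kernel_distance(data[i], o) for o in data])
    d[i] = np.inf
    neighborhood = d[d <= np.sort(d)[k - 1]]
    return len(neighborhood) / np.median(neighborhood)

def outlier_factor(args):
    # One independent task: the score of object i depends only on its neighborhood,
    # so each object can be processed by a different worker.
    i, data, k = args
    d = np.array([kernel_distance(data[i], o) for o in data])
    d[i] = np.inf
    neighbor_idx = np.argsort(d)[:k]
    neighbor_density = np.mean([kerden(j, data, k) for j in neighbor_idx])
    return neighbor_density / kerden(i, data, k)   # > 1 suggests lower density than the neighbors

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal(0, 1, size=(60, 2)), [[7.0, 7.0]]])
    k = 5
    with Pool(processes=4) as pool:                 # the "scheduler" assigning tasks to workers
        scores = pool.map(outlier_factor, [(i, data, k) for i in range(len(data))])
    print(np.argsort(scores)[-3:])                  # indices of the top-3 candidate outliers
```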
IV. EXPERIMENTS RESULTS

In this section, the kernel neighbor density is adopted in the DSNOF algorithm, constructing the K-DSNOF algorithm. We present experiments on real datasets to evaluate the effectiveness and performance of the LOF algorithm, the COF algorithm and the K-DSNOF algorithm, and we also compare the effect of the three different kernel functions. All algorithms were implemented on the parallel computing platform. Before reporting the experimental results, we introduce three metrics, namely precision, recall and rank power, for measuring the effectiveness of an anomaly detection algorithm.

A. Effectiveness metrics

Precision, recall and rank power are three classical metrics of the effectiveness of an information system [16][17][18], and they can be applied to measure the effectiveness of an anomaly detection algorithm. We assume a dataset $D = D_o \cup D_n$, where $D_o$ and $D_n$ respectively represent the set of all outliers and the set of all non-outliers. Let $O_m$ be the set of outliers among the objects in the top m positions returned by an anomaly detection algorithm, for any integer $m \ge 1$; $|O_m|$ denotes the number of objects in $O_m$, and similarly for $|D_o|$. Precision is the percentage of outliers among the top m ranked objects returned by the algorithm:

$\mathrm{Precision} = \dfrac{|O_m|}{m}$

Recall is the percentage of the total outlier set included within the top m ranked objects:

$\mathrm{Recall} = \dfrac{|O_m|}{|D_o|}$
Literature [11] gives a slightly revised definition of rank power, with values ranging from 0 to 1, so that a value of 1 indicates the best performance and 0 the worst. Assume that an anomaly detection algorithm returns m objects placed at positions 1 to m. Let $n = |O_m|$ and, for $1 \le i \le n$, let $L_i$ denote the position of the i-th detected outlier. Rank power is then defined as

$\mathrm{RankPower} = \dfrac{n(n+1)}{2\sum_{i=1}^{n} L_i}$
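A minimal sketch of the three metrics is given below; the ranked list and labels are a toy example of our own, not data from the paper.

```python
# Minimal sketch of precision, recall and rank power for a ranked output.
def precision_recall_rankpower(ranked, true_outliers, m):
    top_m = ranked[:m]
    positions = [i + 1 for i, obj in enumerate(top_m) if obj in true_outliers]
    n = len(positions)                        # |O_m|: outliers found in the top m positions
    precision = n / m
    recall = n / len(true_outliers)
    # Rank power is 1 when the n detected outliers occupy the first n positions.
    rank_power = n * (n + 1) / (2 * sum(positions)) if n else 0.0
    return precision, recall, rank_power

# Toy example: 10 ranked objects, 4 true outliers.
ranked = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
true_outliers = {"a", "b", "d", "h"}
print(precision_recall_rankpower(ranked, true_outliers, m=5))
# -> (0.6, 0.75, 0.857...): outliers at positions 1, 2, 4, so RP = 3*4 / (2*7)
```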
According to the three metrics discussed above, the larger the precision and recall are, the more effective the anomaly detection algorithm is. In addition, for the same precision and recall, the larger the rank power is, the more effective the algorithm is.

B. Real datasets

As indicated by literature [19], one way to measure the effectiveness of an anomaly detection algorithm is to run it on a dataset to discover rare classes; this method has been used in literature [20][21][22][23][11]. Therefore, we use the same method, and two real datasets, the Johns Hopkins University ionosphere dataset and the image segmentation dataset, are adopted in the experiments. The effectiveness of an algorithm is mainly measured by precision, recall and rank power, denoted Pr, Re and RP, respectively. Moreover, m indicates the top m ranked instances returned by the algorithm and $N_{rc}$ represents the number of detected instances of the rare class. K1, K2 and K3 represent the Gaussian kernel, the polynomial kernel and the Laplacian kernel, respectively. In the following, the experimental results are reported and discussed.

1) Johns Hopkins University ionosphere dataset: This dataset has 351 instances with 34 attributes and contains two classes, labeled good and bad. There are neither duplicated records nor records with missing attribute values. Following the experimental technique used in [19], [20], [21], [22], [23] and [11], we randomly remove some instances from the dataset to produce a very unbalanced distribution. The resulting dataset has 235 instances, of which 225 are labeled good and 10 are labeled bad. Table I exhibits the class distribution of the dataset.

TABLE I
THE CLASS DISTRIBUTION OF THE JOHNS HOPKINS UNIVERSITY IONOSPHERE DATASET

Case                          Class    Percentage of instances
Commonly occurring classes    good     95.74
Rare classes                  bad      4.26
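For reproducibility, the following is a minimal sketch of how such an unbalanced version of the dataset could be produced; the file name, column layout and random seed are assumptions on our part, not details specified by the paper.

```python
# Minimal sketch (assumed file name and layout): keep all "good" instances and 10 "bad" ones.
import pandas as pd

df = pd.read_csv("ionosphere.data", header=None)    # 34 attributes + class label in the last column
good = df[df.iloc[:, -1] == "g"]
bad = df[df.iloc[:, -1] == "b"].sample(n=10, random_state=0)
unbalanced = pd.concat([good, bad]).reset_index(drop=True)
print(len(unbalanced), (unbalanced.iloc[:, -1] == "b").mean())   # 235 objects, roughly 4.26% rare
```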
We run the LOF algorithm, the COF algorithm and the K-DSNOF algorithm on the dataset to find the rare class. Following the same value used in literature [11], we let k be 12, which is about 5% of the number of instances in the dataset. We set $\sigma$ to 1 in the Gaussian kernel, d to 2 in the polynomial kernel, and $\gamma$ to 2 in the Laplacian kernel. The experimental results are exhibited in Table II. For m from 5 to 30, the K-DSNOF algorithm with the Gaussian kernel detects all instances in the rare class. The K-DSNOF algorithm with the Laplacian kernel detects all instances in the rare class for m = 20 and has a good rank power. The COF algorithm, the LOF algorithm and the K-DSNOF algorithm with the polynomial kernel cannot detect all instances in the rare class, but the rank power of K-DSNOF with the polynomial kernel is still far better than that of the LOF and COF algorithms. Consequently, the K-DSNOF algorithm performs better on this dataset. For other k values, the experimental results are almost consistent with Table II; for example, for k equal to 11 and 13, the results are shown in Tables III and IV, respectively.

2) Image segmentation dataset: This dataset has 210 instances with 19 attributes. It contains seven equally sized classes, labeled brickface, sky, foliage, cement, window, path and grass. There are neither duplicated records nor records with missing attribute values. We follow ways similar to those used in [19], [20], [21], [22], [23] and [11] to produce a very unbalanced distribution. We randomly select 5 instances each from the classes brickface and sky to form two small clusters, and we randomly select 2 instances each from the classes foliage, cement and window to form the rare classes. Table V exhibits the class distribution of the dataset.

TABLE V
THE CLASS DISTRIBUTION OF THE IMAGE SEGMENTATION DATASET

Case                          Classes                    Percentage of instances
Commonly occurring classes    path, grass                78.95
Small cluster classes         brickface, sky             13.16
Rare classes                  foliage, cement, window    7.89
We run the LOF algorithm, the COF algorithm and the K-DSNOF algorithm on the dataset to find the rare classes. Following the same value used in literature [11], we let k be 4, which is about 5% of the number of instances in the dataset. We set $\sigma$ to 45 in the Gaussian kernel, d to 10 in the polynomial kernel, and $\gamma$ to 0.05 in the Laplacian kernel. The experimental results are exhibited in Table VI. Among the top 5 ranked instances returned by the algorithms, the K-DSNOF algorithm with the Gaussian kernel detects more rare instances than the others, while the rank power of the LOF algorithm is larger than that of K-DSNOF with the Laplacian kernel and K-DSNOF with the polynomial kernel. However, for m from 5 to 20, K-DSNOF with the Laplacian kernel and K-DSNOF with the polynomial kernel detect more rare instances than the COF and LOF algorithms, and K-DSNOF with the Gaussian kernel remains better than the others. Among the top 20 ranked instances returned by the algorithms, only the K-DSNOF algorithms detect all instances in the rare classes.
TABLE II
DETECTED RARE CLASS IN JOHNS HOPKINS UNIVERSITY IONOSPHERE DATASET FOR k = 12

                 m =   5     10    15    20    25    30
LOF       Nrc          2     5     7     7     8     9
          Pr           0.4   0.5   0.47  0.35  0.32  0.3
          Re           0.2   0.5   0.7   0.7   0.8   0.9
          RP           0.6   0.5   0.48  0.48  0.43  0.41
COF       Nrc          3     5     6     7     7     8
          Pr           0.6   0.5   0.4   0.35  0.28  0.27
          Re           0.3   0.5   0.6   0.7   0.7   0.8
          RP           1     0.71  0.64  0.55  0.55  0.44
K1-DSNOF  Nrc          5     10    10    10    10    10
          Pr           1     1     0.67  0.5   0.4   0.33
          Re           1     1     1     1     1     1
          RP           1     1     1     1     1     1
K2-DSNOF  Nrc          5     8     8     9     9     9
          Pr           1     0.8   0.53  0.45  0.36  0.3
          Re           0.5   0.8   0.8   0.9   0.9   0.9
          RP           1     0.97  0.97  0.85  0.85  0.85
K3-DSNOF  Nrc          5     8     9     10    10    10
          Pr           1     0.8   0.6   0.5   0.4   0.33
          Re           0.5   0.8   0.9   1     1     1
          RP           1     1     0.88  0.8   0.8   0.8

TABLE III
DETECTED RARE CLASS IN JOHNS HOPKINS UNIVERSITY IONOSPHERE DATASET FOR k = 11

                 m =   5     10    15    20    25    30
LOF       Nrc          2     4     7     7     8     9
          Pr           0.4   0.4   0.47  0.35  0.32  0.3
          Re           0.2   0.4   0.7   0.7   0.8   0.9
          RP           0.6   0.5   0.47  0.47  0.42  0.4
COF       Nrc          3     5     6     7     7     8
          Pr           0.6   0.5   0.4   0.35  0.28  0.27
          Re           0.3   0.5   0.6   0.7   0.7   0.8
          RP           1     0.71  0.62  0.53  0.53  0.43
K1-DSNOF  Nrc          5     10    10    10    10    10
          Pr           1     1     0.67  0.5   0.4   0.33
          Re           1     1     1     1     1     1
          RP           1     1     1     1     1     1
K2-DSNOF  Nrc          5     8     9     9     9     9
          Pr           1     0.8   0.6   0.45  0.36  0.3
          Re           0.5   0.8   0.9   0.9   0.9   0.9
          RP           1     1     0.92  0.92  0.92  0.92
K3-DSNOF  Nrc          5     8     9     10    10    10
          Pr           1     0.8   0.6   0.5   0.4   0.33
          Re           0.5   0.8   0.9   1     1     1
          RP           1     1     0.88  0.78  0.78  0.78
TABLE IV
DETECTED RARE CLASS IN JOHNS HOPKINS UNIVERSITY IONOSPHERE DATASET FOR k = 13

                 m =   5     10    15    20    25    30
LOF       Nrc          2     5     7     7     8     9
          Pr           0.4   0.5   0.47  0.35  0.32  0.3
          Re           0.2   0.5   0.7   0.7   0.8   0.9
          RP           0.6   0.48  0.49  0.49  0.44  0.42
COF       Nrc          3     5     6     7     7     9
          Pr           0.6   0.5   0.4   0.35  0.28  0.3
          Re           0.3   0.5   0.6   0.7   0.7   0.9
          RP           1     0.71  0.64  0.54  0.54  0.43
K1-DSNOF  Nrc          4     9     10    10    10    10
          Pr           1     1     0.67  0.5   0.4   0.33
          Re           0.4   0.9   1     1     1     1
          RP           0.71  0.83  0.85  0.85  0.85  0.85
K2-DSNOF  Nrc          5     8     8     8     8     9
          Pr           1     0.8   0.53  0.4   0.32  0.3
          Re           0.5   0.8   0.8   0.8   0.8   0.9
          RP           1     0.95  0.95  0.95  0.95  0.69
K3-DSNOF  Nrc          5     8     8     9     10    10
          Pr           1     0.8   0.53  0.45  0.4   0.33
          Re           0.5   0.8   0.8   0.9   1     1
          RP           1     1     1     0.85  0.72  0.72
TABLE VI
DETECTED RARE CLASSES IN IMAGE SEGMENTATION DATASET FOR k = 4

                 m =   5     10    15    20
LOF       Nrc          3     3     4     5
          Pr           0.6   0.3   0.27  0.25
          Re           0.5   0.5   0.7   0.8
          RP           1     1     0.53  0.42
COF       Nrc          3     4     5     5
          Pr           0.6   0.4   0.33  0.25
          Re           0.5   0.7   0.8   0.8
          RP           0.86  0.63  0.5   0.5
K1-DSNOF  Nrc          4     5     6     6
          Pr           0.8   0.5   0.4   0.3
          Re           0.7   0.8   1     1
          RP           0.83  0.83  0.7   0.7
K2-DSNOF  Nrc          3     4     6     6
          Pr           0.6   0.4   0.4   0.3
          Re           0.5   0.67  1     1
          RP           0.86  0.71  0.57  0.57
K3-DSNOF  Nrc          3     5     5     6
          Pr           0.6   0.5   0.33  0.3
          Re           0.5   0.8   0.8   1
          RP           0.5   0.54  0.54  0.48
Hence, the experimental results show that the K-DSNOF algorithm performs better on this dataset.

In summary, the above experimental results on the real datasets verify that the K-DSNOF algorithm can mine outliers more effectively than the LOF and COF algorithms. By comparing the three kinds of kernel function, we find that the Gaussian kernel function has the best effectiveness and stability among them.

V. CONCLUSIONS

In this paper, we propose a kernel neighbor density definition with a parallel computing mechanism for anomaly detection algorithms. The proposed algorithm has the following advantages. First, the presented kernel neighbor density definition improves the performance and effectiveness of the algorithms. Second, through the experimental comparison,
we found that the Gaussian kernel function has the best effect and stability among the three kinds of kernel function. Third, the parallel computing mechanism is introduced to improve the efficiency of the algorithms. In future research work, we will construct a new algorithm based on the kernel neighbor density and verify it with more datasets.

REFERENCES

[1] D. M. Hawkins, Identification of Outliers, Monographs on Applied Probability and Statistics, Chapman and Hall, London, 1980.
[2] Y. Kim, S. Y. Sohn, Stock fraud detection using peer group analysis, Expert Syst. Appl., vol. 39, pp. 8986-8992, Aug. 2012.
[3] A. Chaudhary, V. N. Tiwari, A. Kumar, Design an anomaly based fuzzy intrusion detection system for packet dropping attack in mobile ad hoc networks, in Proc. IEEE International Advance Computing Conference, Gurgaon, India, 2014, pp. 256-261.
[4] M. Zhang, A. Raghunathan, N. K. Jha, MedMon: securing medical devices through wireless monitoring and anomaly detection, IEEE Trans. Biomed. Circuits Syst., vol. 7, pp. 871-881, Dec. 2013.
[5] M. H. Chang, C. C. Chen, D. Das, and M. Pecht, Anomaly detection of light-emitting diodes using the similarity-based metric test, IEEE Trans. Ind. Informat., vol. 10, pp. 1852-1863, Aug. 2014.
[6] V. J. Hodge, J. Austin, A survey of outlier detection methodologies, Artif. Intell. Rev., vol. 22, pp. 85-126, Oct. 2004.
[7] H. Wang, M. Tang, Y. Park, and C. E. Priebe, Locality statistics for anomaly detection in time series of graphs, IEEE Trans. Signal Process., vol. 62, pp. 703-717, Feb. 2014.
[8] H. Izakian, W. Pedrycz, Anomaly detection and characterization in spatial time series data: a cluster-centric approach, IEEE Trans. Fuzzy Syst., vol. 22, pp. 1612-1624, Dec. 2014.
[9] Y. Wang, Q. Miao, E. W. M. Ma, K. L. Tsui, and M. G. Pecht, Online anomaly detection for hard disk drives based on Mahalanobis distance, IEEE Trans. Rel., vol. 62, pp. 136-145, Mar. 2013.
[10] M. M. Breunig, H. P. Kriegel, R. T. Ng, and J. Sander, LOF: identifying density-based local outliers, in Proc. Int. Conf. on Management of Data, Dallas, USA, 2000, pp. 93-104.
[11] J. Tang, Z. X. Chen, A. W. Fu, and D. W. Cheung, Capabilities of outlier detection schemes in large datasets, framework and methodologies, Knowl. Inf. Syst., vol. 11, pp. 45-84, Jan. 2007.
[12] H. Cao, G. Q. Si, Y. B. Zhang, L. X. Jia, Enhancing effectiveness of density-based outlier mining scheme with density-similarity-neighbor-based outlier factor, Expert Syst. Appl., vol. 37, pp. 8090-8101, Dec. 2010.
[13] C. Huang, X. Q. Ding, C. Fang, and D. Wen, Robust image restoration via adaptive low-rank approximation and joint kernel regression, IEEE Trans. Image Process., vol. 23, pp. 5284-5297, Dec. 2014.
[14] C. Saunders, M. O. Stitson, J. Weston, Support vector machine reference manual, Department of Computer Science, Royal Holloway University of London, Egham, UK, Technical Report CSD-TR-98-03, 1998.
[15] M. Daszykowski, K. Kaczmarek, Y. V. Heyden, B. Walczak, Robust statistics in data analysis - a review: basic concepts, Chemometr. Intell. Lab. Syst., vol. 85, pp. 203-219, Feb. 2007.
[16] R. Baeza-Yates, B. Ribeiro-Neto, Modern Information Retrieval, Boston: Addison-Wesley Longman Publishing Co., Inc., 1999.
[17] X. N. Meng, Z. X. Chen, On user-oriented measurements of effectiveness of web information retrieval systems, in Proc. Int. Conf. on Internet Computing, Las Vegas, USA, 2004, pp. 527-533.
[18] G. Salton, Automated Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Boston: Addison-Wesley Longman Publishing Co., Inc.
[19] C. C. Aggarwal, P. S. Yu, Outlier detection for high dimensional data, in Proc. ACM SIGMOD Int. Conf. on Management of Data, Santa Barbara, USA, 2001, pp. 37-46.
[20] F. Jiang, Y. F. Sui, C. G. Cao, Some issues about outlier detection in rough set theory, Expert Syst. Appl., vol. 36, pp. 4680-4687, Apr. 2009.
[21] T. M. Hu, S. Y. Sung, Detecting pattern-based outliers, Pattern Recognit. Lett., vol. 24, pp. 3059-3068, Dec. 2003.
[22] Z. Y. He, X. F. Xu, S. C. Deng, Discovering cluster-based local outliers, Pattern Recognit. Lett., vol. 24, pp. 1641-1650, Jun. 2003.
[23] M. Ye, X. Li, M. E. Orlowska, Projected outlier detection in high-dimensional mixed-attributes data set, Expert Syst. Appl., vol. 36, pp. 7104-7113, Apr. 2009.