ANOMALY DETECTION IN MOBILE COMMUNICATION NETWORKS USING THE SELF-ORGANIZING MAP

REWBENIO A. FROTA, GUILHERME A. BARRETO AND JOÃO C. M. MOTA

Abstract. Anomaly detection is a pattern recognition task whose goal is to report the occurrence of abnormal or unknown behavior in a given system being monitored. In this paper we propose a general procedure for the computation of decision thresholds for anomaly detection in mobile communication networks. The proposed method is based on Kohonen's Self-Organizing Map (SOM) and the computation of nonparametric (i.e. percentile-based) confidence intervals. Through simulations we compare the performance of the proposed and standard SOM-based anomaly detection methods with respect to the false positive rates produced.

1. Introduction

The multi-service character of today's mobile radio technology brings totally new requirements to the network optimization process and to radio resource management algorithms, which differ significantly from those of traditional speech-dominated technologies [11]. One of these new aspects is related to quality of service (QoS) requirements. Because of them, the operation and maintenance of such cellular networks is becoming more and more challenging: mobile cells interact and interfere more, have hundreds of adjustable parameters, and monitor and record several hundred different variables in each cell, thus producing a huge amount of data.

Considering scenarios with thousands of cells, it is clear that optimum handling of the radio access network (RAN) requires effective data mining methods for performance analysis based on Key Performance Indicators (KPIs). KPIs are a set of essential measurements that summarize the behavior of the cellular network of interest, and they can be used for system acceptance, benchmarking and system specification. A good choice of KPIs for monitoring and analyzing the collected data is crucial to understanding the reasons for the various operational states of the cellular network, noticing abnormal behaviors, analyzing them and providing possible solutions.

Data mining is an expanding area of research in artificial intelligence and information management whose objective is to extract relevant information from large databases [3]. Typical data mining and analysis tasks include classification, regression, and clustering of data, aiming at determining parameter/data dependencies and finding various anomalies in the data. In this paper, we are interested in the clustering capabilities of the Self-Organizing Map (SOM) [7] applied to the detection of anomalous states of a CDMA2000 cellular system.
The authors are with the Department of Teleinformatics Engineering, Federal University of Ceará (UFC), CP 6005, CEP 60455-760, Fortaleza, Ceará, Brazil. Emails: {rewbenio, guilherme, mota}@deti.ufc.br.


The SOM is an important unsupervised competitive learning algorithm, able to extract statistical regularities from the input data vectors and encode them in its weights without supervision [12]. Such a learning machine will then be used to build a compact internal representation of the cellular network, in the sense that the data vectors representing its behavior are projected onto a reduced number of prototype vectors (each representing a given cluster of data), which can be further analyzed in search of hidden data structures.

In addition to data clustering tasks, the SOM is also widely used for the visualization of data cluster structures [1]. This visualization ability is particularly suitable for network optimization purposes, as discussed in a number of recent studies [8, 13, 9]. In addition, Laiho et al. [8] also applied the SOM to monitoring objects in the cellular network, such as base stations and radio network controllers, looking for anomalous or abnormal behaviors. They argue that this approach makes it much easier to monitor a large number of cells in a network, since only abnormal observations have to be examined. By the same token, we propose a general procedure to detect abnormal behavior of a CDMA2000 cellular network using the SOM as a data modelling tool and nonparametric (i.e. percentile-based) confidence intervals to define reliable decision thresholds.

The remainder of the paper is organized as follows. In Section 2, we describe the fundamentals of the SOM algorithm. In Section 3, we review standard methods for computing single decision thresholds for anomaly detection tests. In Section 4, we introduce a percentile-based technique for computing interval-based decision thresholds for anomaly detection tests. Computer simulations are presented in Section 5 and the paper is concluded in Section 6.
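To make the procedure outlined above more concrete, the following Python sketch trains a small SOM and derives a percentile-based decision interval from it. It is an illustration under stated assumptions, not the authors' implementation: we assume a one-dimensional map with a Gaussian neighborhood and exponentially decaying learning rate and neighborhood width, we assume the monitored statistic is the quantization error (the distance from a data vector to its best-matching prototype), and we assume a 95% interval. The function names (train_som, quantization_errors, decision_interval, is_anomalous) are hypothetical; the defaults of 40 neurons and 50 epochs are borrowed from one of the simulation settings reported in the paper.

```python
import numpy as np


def train_som(data, n_neurons=40, n_epochs=50, seed=0):
    """Train a 1-D SOM (assumed topology) and return its weight matrix."""
    rng = np.random.default_rng(seed)
    # Initialize the prototypes from randomly chosen training vectors.
    idx = rng.choice(len(data), size=n_neurons, replace=True)
    weights = data[idx].astype(float)
    positions = np.arange(n_neurons)
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            # Assumed schedules: exponential decay of rate and width.
            lr = 0.5 * (0.01 / 0.5) ** frac
            sigma = (n_neurons / 2) * (0.5 / (n_neurons / 2)) ** frac
            # Best-matching unit: prototype closest to the input vector.
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood around the winner on the 1-D grid.
            h = np.exp(-((positions - winner) ** 2) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def quantization_errors(weights, data):
    """Distance from each data vector to its best-matching prototype."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return dists.min(axis=1)


def decision_interval(errors, confidence=0.95):
    """Percentile-based interval [tau_minus, tau_plus] of the errors."""
    alpha = 1.0 - confidence
    tau_minus, tau_plus = np.percentile(
        errors, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return tau_minus, tau_plus


def is_anomalous(weights, x, tau_minus, tau_plus):
    """Flag x when its quantization error falls outside the interval."""
    e = quantization_errors(weights, np.atleast_2d(x))[0]
    return not (tau_minus <= e <= tau_plus)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    normal_data = rng.normal(size=(500, 4))  # synthetic stand-in for KPI vectors
    som = train_som(normal_data)
    # The thresholds are computed once, after training.
    lo, hi = decision_interval(quantization_errors(som, normal_data))
    print("decision interval:", lo, hi)
    print("far-away vector flagged:", is_anomalous(som, np.full(4, 10.0), lo, hi))
```

In this sketch the two thresholds are computed a single time from the training data, so testing a new vector only costs one best-matching-unit search and two comparisons.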

2. The Self-Organizing Map

The Self-Organizing Map (SOM) is one of the most popular neural network architectures, usually designed to build an ordered representation of spatial proximity among vectors of an unlabelled data set. Neurons in the SOM are put together in an output layer, A, in one-, two- or even three-dimensional arrays. Each neuron i ∈ A is associated with a weight vector w_i ∈ [...]

[...] τ+, and hence the test is almost never positive for anomalies, even when the presented data vector is truly abnormal, so we will have fewer false positive (false alarm) cases.

The third set of simulations compares the influence of the size of the training set on the SOM-based anomaly detectors. The purpose of these tests is to give a rough idea of which method requires less training data to obtain a high classification accuracy. In these simulations, the number of neurons and the number of training epochs were set to 40 and 50, respectively. In Figure 3 we observe a decrease in the false positive rates as the training set grows. Toward the end of the range, we observe an increase in these rates, caused by the need for more training epochs to better handle the larger amount of data.

As a general conclusion we can state that the best overall detection performances were provided by the pairs (SOM, DI) and (SOM, Tanaka). However, it is worth noting that the


detection test implemented by the pair (SOM, DI) is computationally faster than that of the pair (SOM, Tanaka), since the decision threshold of Tanaka's method must be computed for every new input vector, while the two thresholds, τ− and τ+, of the proposed DI method are computed only once at the end of the SOM's training phase.

6. Conclusion and Further Work

In this paper we proposed a general procedure to design anomaly detection systems for mobile communication networks based on Kohonen's Self-Organizing Map (SOM). We illustrated through simulations that the proposed method performs better on average than standard methods with respect to the false positive (false alarm) rates produced. Currently we are developing SOM-based anomaly detectors for serially correlated data, such as financial time series, in order to detect changes in regime and novelties associated with the temporal evolution of a given stock market. Applications in engineering, such as the detection of faults in an electric induction motor, are also being developed.

References

1. A. Flexer, On the use of self-organizing maps for clustering and visualization, Intelligent Data Analysis 5 (2001), no. 5, 373–384.
2. F. Gonzalez and D. Dasgupta, Neuro-immune and self-organizing map approaches to anomaly detection: A comparison, Proceedings of the First International Conference on Artificial Immune Systems (Canterbury, UK), 2002, pp. 203–211.
3. David J. Hand, H. Mannila, and P. Smyth, Principles of data mining, MIT Press, 2001.
4. T. Harris, A Kohonen SOM based machine health monitoring system which enables diagnosis of faults not seen in the training set, Proceedings of the International Joint Conference on Neural Networks (IJCNN'93), vol. 1, 1993, pp. 947–950.
5. V. J. Hodge and J. Austin, A survey of outlier detection methodologies, Artificial Intelligence Review 22 (2004), no. 2, 85–126.
6. A. J. Höglund, K. Hätönen, and A. S. Sorvari, A computer host-based user anomaly detection system using the self-organizing map, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00) (Como, Italy), vol. 5, 2000, pp. 411–416.
7. T. Kohonen, The self-organizing map, Proceedings of the IEEE 78 (1990), no. 9, 1464–1480.
8. J. Laiho, M. Kylväjä, and A. Höglund, Utilisation of advanced analysis methods in UMTS networks, Proceedings of the IEEE Vehicular Technology Conference (VTS/spring) (Birmingham, Alabama), 2002, pp. 726–730.
9. J. Laiho, K. Raivio, P. Lehtimäki, K. Hätönen, and O. Simula, Advanced analysis methods for 3G cellular networks, IEEE Transactions on Wireless Communications 4 (2005), no. 3, 930–942.
10. A. Muñoz and J. Muruzábal, Self-organising maps for outlier detection, Neurocomputing 18 (1998), 33–60.
11. R. Prasad, W. Mohr, and W. Konhäuser, Third generation mobile communication systems, Universal Personal Communications, Artech House Publishers, 2000.
12. J. C. Principe, N. R. Euliano, and W. C. Lefebvre, Neural and adaptive systems: Fundamentals through simulations, John Wiley & Sons, 2000.
13. K. Raivio, O. Simula, J. Laiho, and P. Lehtimäki, Analysis of mobile radio access network using the Self-Organizing Map, Proceedings of the IFIP/IEEE International Symposium on Integrated Network Management (Colorado Springs, Colorado), 2003, pp. 439–451.
14. J. W. Sammon, Jr., A nonlinear mapping for data structure analysis, IEEE Transactions on Computers C-18 (1969), 401–409.
15. M. Tanaka, M. Sakawa, I. Shiromaru, and T. Matsumoto, Application of Kohonen's self-organizing network to the diagnosis system for rotating machinery, Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC'95), vol. 5, 1995, pp. 4039–4044.
