A Data Collection Approach for Mobile Botnet Analysis and Detection
Meisam Eslahi1,2, Mohammad Reza Rostami3, H. Hashim1, N.M. Tahir1, Maryam Var Naseri2
1 Faculty of Electrical Engineering, Universiti Teknologi MARA, Malaysia.
2 Faculty of Computing, Engineering & Technology, Asia Pacific University of Technology & Innovation, Malaysia.
3 Advanced Informatics School, University Technology Malaysia.
Abstract: Recently, MoBots or Mobile Botnets have become one of the most critical challenges in mobile communication and cyber security. The integration of mobile devices with the Internet, along with their enhanced features and capabilities, has made them an environment of interest for cyber criminals. As a result, the spread of sophisticated malware such as Botnets has significantly increased in mobile devices and networks. However, Bots and Botnets have only recently migrated to mobile devices and have not been fully explored yet. The efficiency of current security solutions is therefore highly limited by the lack of available Mobile Botnet datasets and samples, and providing a valid dataset to analyse and understand Mobile Botnets has become a crucial issue in mobile security and privacy. In this paper we present an overview of the currently available datasets and samples and discuss their advantages and disadvantages. We also propose a model for implementing a mobile Botnet test bed to collect data for further analysis.
Keywords: Mobile malware; smartphone security; Botnets; network traffic; Dataset.

I. INTRODUCTION
The first generation of Botnets (e.g. IRC, P2P, and HTTP) is known for executing a wide range of dangerous attacks (e.g. spamming, information theft) on computer networks, mostly targeting less-monitored computers with high-bandwidth connections such as home computers and university servers [1, 2]. Recently, the use of mobile devices and smartphones has significantly increased. For instance, BYOD (Bring Your Own Device) is widely adopted by organizations, whose employees connect their personal mobile devices to the enterprise network to access corporate information and conduct daily business functions [3]. Although the advanced capabilities of mobile networks have brought convenience to daily activities such as mobile banking, they are not as well protected as computer networks, and mobile users pay less attention to security issues and updates [4]. As observed in [5], mobile devices are widely used by organizations (e.g. BYOD), yet the number of employees who follow the policies required to secure the mobile environment is extremely low. Therefore, mobile devices have become a common environment for Botmasters to conduct their malicious activities with less risk of being detected [6, 7]. Mobile botnets such as Zitmo, DroidDream, Smartroot, AnserverBot, Ikee.B and TigerBot are proof that botnets have migrated to mobile networks, where current security solutions may not be fully applicable [4, 6].

A series of regularly updated reports published by North Carolina State University in collaboration with NQ Mobile [8] has shown that the new generation of Botnets and organized C&C-based cybercrimes have targeted mobile networks. Since mobile Botnets are a new phenomenon, the number of available datasets and data samples is relatively low. Therefore, in this paper we propose a new approach to build a dataset for mobile Botnet analysis. The remainder of this paper is organized as follows: Section II explains the characteristics of mobile Bots and Botnets. Section III considers several sources of datasets and data samples that can be employed and analysed in mobile Botnet studies, followed by their current state and challenges in Section IV. Section V proposes an approach to create a proper dataset for HTTP-based mobile Botnets. Finally, Section VI presents the overall conclusions of this paper.
II. MOBILE BOTNET CHARACTERISTICS
Similar to traditional Botnets in computer environments, mobile Botnets consist of three main elements: the Bot, the Command and Control (C&C) mechanism, and the Botmaster [9, 10]. Botmasters design and implement small malicious mobile applications called Bots to infect mobile devices and convert them into Zombies [11]. The main difference between mobile Bots and other types of malware is that the infected devices are connected to each other to form a network of Mobile Bots, or MoBots [2, 12]. Moreover, Botmasters regularly use C&C mechanisms (e.g. SMS or the Internet) to send orders to the Bots or update them with new settings [13]. Accordingly, the life cycle of a mobile Botnet can be divided into three main stages: Infection and Propagation; Operational Activities and Communications; and Perform Botmaster's Tasks and Attacks [14].

A. Infection and Propagation
Botmasters use several methods and techniques, such as social engineering, malicious email attachments, exploitation of mobile vulnerabilities, infected URLs, SMS/MMS, and Bluetooth, to infect new mobile devices and make them members of a Botnet without the users' knowledge [4, 15]. They can also use other malicious and Trojanized third-party applications (e.g. cracked games) as a vector to propagate their Bots on computers or mobile devices [16]. In general, Botmasters try to take advantage of smartphones as less-monitored devices with high-bandwidth connectivity to propagate their Bots and infect as many victims as possible [2, 13].

B. Operational Activities and Communications
This step is the most critical phase of each Botnet, in which the Bots interact with C&C servers to get new commands or updates from the Botmaster. New orders are periodically received by the Bots on affected mobile devices, and the results are reported to the Botmaster right after command execution [6, 9]. In effect, the Command and Control servers act as an interface between the Botmaster and the Bots and allow them to communicate with each other. Botmasters have used a number of technologies to create their command and control mechanisms, as follows:
1) SMS: The significant characteristics of the Short Message Service, such as its number of subscribers, simple functionality and high level of availability, have made SMS a common medium for designing new SMS-based command and control mechanisms for mobile Botnets in a large number of academic studies [6, 17]. However, in the real world the number of SMS-based mobile Botnets is quite low, as this service is not free and the communication costs may alert mobile users and lead to the Bots being detected [17].
2) Internet/Web: A large number of mobile Botnets use web-based technologies and protocols (e.g. HTTP), as mobile devices are well integrated with the Internet. In this approach the Botmaster publishes the commands on, for example, certain websites, from which the Bots on infected mobile devices retrieve them [18]. By using standard web-based protocols and services, the Bots can hide their communication flows within normal web traffic, which makes them stealthier and harder to detect. In addition, since web services and the Internet are widely used by mobile users, HTTP-based Botnets are considered one of the most dangerous types of Botnet and are widely used by Botmasters [13]. Zhou and Jiang [19] from North Carolina State University, in collaboration with NQ Mobile [6], managed to identify and collect more than 1,200 mobile malware samples within one year. Based on their observation, more than 97% of the collected mobile Bots communicate with their command and control server via HTTP-based web traffic. In contrast, only two Bots, NickyBot and GPSSMSSpy, employ the SMS service to establish their communication mechanism [21]. Table 1 shows the command and control mechanisms used by real examples of mobile Botnets.

TABLE 1: REAL MOBILE BOTNETS C&C MECHANISM [20]
Mobile Bot     SMS    Internet/Web
AnserverBot    -      √
Geinimi        -      √
DroidKungFu    -      √
GoldDream      -      √
NickyBot       √      -
Zitmo          -      √
GPSSMSSpy      √      -

C. Perform Botmaster's Tasks and Attacks
The main aim of a Botmaster is to interact dynamically with the Bots in order to deliver commands to them and perform several types of attack [2]. In fact, mobile Botnets can be considered a platform for conducting every type of malicious activity in mobile devices and networks, including the following [4]:
• Damage Firmware
• Spamming (e.g. email, SMS, and MMS)
• Theft of Sensitive Information
• Intercept SMS
• Record Audio
• Click Fraud and Adware
• Downloading Additional Content
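As a side illustration of the periodic command polling described in Section II-B, the following minimal Python sketch flags client/server pairs whose HTTP requests arrive at nearly constant intervals. It is not a method taken from this paper: the record format (client, host, timestamp in seconds), the function name periodic_flows and the thresholds min_requests and max_cv are all hypothetical choices made for the example, and a real Bot may randomise its polling interval to evade such a check.

```python
from collections import defaultdict
from statistics import mean, pstdev


def periodic_flows(requests, min_requests=5, max_cv=0.1):
    """Return sorted (client, host) pairs whose request intervals are nearly constant.

    requests: iterable of (client, host, timestamp_seconds) tuples (assumed format).
    max_cv:   illustrative cutoff on the coefficient of variation of the intervals.
    """
    # Group request timestamps by (client, host) pair.
    times = defaultdict(list)
    for client, host, ts in requests:
        times[(client, host)].append(ts)

    flagged = []
    for key, stamps in times.items():
        if len(stamps) < min_requests:
            continue  # too few requests to judge regularity
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        # A low spread relative to the mean interval suggests timer-driven polling.
        if avg > 0 and pstdev(gaps) / avg <= max_cv:
            flagged.append(key)
    return sorted(flagged)
```

For instance, a client contacting the same host every 60 seconds would be flagged, whereas a user browsing at irregular moments would not. Regular intervals alone are only a hint, since benign applications (e.g. mail or update checks) also poll periodically.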
In addition to the attacks discussed above, mobile Botnets can seriously threaten classified corporate information, financial assets and intellectual property in organizations that employ BYOD. Recently, Bring Your Own Device has become one of the most popular models for enterprises to provide mobility and flexibility in the workplace [22]. BYOD is widely used by organizations (up to 80%) in countries such as Spain, Brazil, Malaysia and Singapore, where the number of employees who follow the policies required to secure BYOD is extremely low [5]. In addition, current mobile security solutions for BYOD provide limited protection and only focus on managing devices (i.e. MDM), applications (i.e. MAM) and information (i.e. MIM) based on certain policies [23, 24]. Therefore, with the increasing number of mobile devices and users, mobile security and mobile Botnet detection and analysis have become a globally critical issue.

III. DATA SOURCES FOR MOBILE BOTNET ANALYSIS
As highlighted by Eslahi et al. [6], mobile Botnets have only recently appeared and have not been fully explored yet. Therefore, the main challenge for mobile Botnet detection and analysis is the limited understanding of these newly emerging cybercrimes due to the lack of sufficient samples and benchmark datasets [19, 21]. Hence, this section reviews the available malware samples and datasets that can be analysed to look for evidence of mobile Bot and Botnet activities.

A. Available Malware Samples
One of the key challenges for mobile security researchers is the lack of comprehensive mobile malware samples to start with. Although several websites and communities share malware instances, only a few reliable data samples have been collected by research communities, as follows:
1) Android Malware Genome Project: One of the most significant and comprehensive malware data samples on the Android platform was collected by North Carolina State University in collaboration with NQ Mobile [20]. They successfully collected more than 1200 real instances from 49 Android malware families within one year. The main significance of this data sample is that more than 93% of the collected malware converts mobile phones into Bots and uses HTTP-based web traffic to communicate with Botmasters via command and control servers [19].
2) M0Droid: M0Droid is a project conducted by University Putra Malaysia to analyse and detect Android malware via behavioural analysis and pattern recognition techniques. The research team released their samples in two categories, called the Malware and Goodware datasets [25]. The main advantage of M0Droid is that it collects both malware and trusted applications, which facilitates anomaly-based detection by allowing normal behaviour to be analysed along with suspicious activities.

B. Real-World Behavioural Datasets
In addition to the aforementioned malware samples, a number of features such as system information, phone usage, location and movement can help in further analysis.
Currently there are two comprehensive behavioural data collections for smartphones, as follows:
1) Mobile Data Challenge (MDC): The MDC is one of the most comprehensive mobile behavioural datasets. It was collected through large-scale research to generate a unique set of data and extend the possibilities for further data analysis [26]. Table 2 shows the type and amount of data in detail. As shown, the MDC, which was conducted with 200 individuals over two years, can be considered one of the most complete real datasets in terms of data variety, scale and temporal dimension.

TABLE 2: THE MDC COLLECTED DATASET [27]
Data type                  Quantity
Calls & SMS                220,334
Pictures taken & Videos    30,217
Bluetooth scans            15,362,182
WLAN scans                 12,568,788
Audio samples              218,021
Application events         3,569,860
Phone book entries         34,053
Duration (values as listed in the source; row assignment not recoverable): 3,907 hours, 1817 hours, -
2) Data for Development (D4D) Challenge: From December 2011 to April 2012 the Orange lab managed to collect more than 2.5 billion records of calls and text messages from 5 million users [27]. The research has already released several datasets in three categories: Mobility traces, Aggregate communication, and Communication sub-graphs. The last two datasets may be used by mobile malware analysts as normal patterns to distinguish the communication patterns of Bots from those of normal users in infected mobile networks.

IV. CURRENT STATE AND CHALLENGES
The aforementioned malware samples and datasets can be used in both static and dynamic mobile malware and Botnet analysis. Static analysis refers to the examination and evaluation of a mobile application (e.g. an apk file) without executing it [7]. The Genome Malware [20] and M0Droid [25] collections provide a large number of malware apk files that can be used directly for static analysis. The main step in static analysis is to decompile the apk file back into source code, which can then be analysed to look for malicious intentions [28]. The apkinspector is a powerful GUI-based tool that provides both analysis functions and graphic features such as CFG, Call Graph, Dalvik codes, Smali codes and Java codes [29]. However, disassembling an apk file may not reproduce the exact original source code. In addition, a number of features such as system calls, risky APIs, suspicious rooting keywords, and the level and types of permissions can be extracted and used as parameters for anomaly-based detection [15, 30]. Anubis is an online solution that generates advanced reports on the static analysis of an apk file, covering activities, services, required and used permissions, features and URLs [31]. In contrast to static examination, dynamic analysis attempts to evaluate the behaviour and actions of an application during its execution [15].
In this technique a mobile application is executed in a virtual environment such as the Android Emulator or Android-x86 [32]. Several features and activities, including file operations, started services, processes, threads, file changes, and information flow tracking, can be used as parameters in dynamic analysis [15, 31]. Another common parameter for dynamic analysis is energy consumption, which is used in a number of studies for mobile malware detection because mobile device resources such as battery life, CPU and memory are limited [4]. However, Barroso [33] termed Botnets "Silent Threats", as the Bots on infected targets will not make any unusual or suspicious use of the battery, CPU, memory or other device resources, which would otherwise expose their presence. Therefore, this method is unlikely to work for Mobile Botnets [6]. Regardless of the advantages and disadvantages of the aforementioned techniques, their main focus is to detect mobile malware in the Infection and Propagation stage (e.g. file change analysis and permissions) and the Perform Botmaster's Tasks and Attacks stage (e.g. suspicious system calls and risky APIs) of a Bot life cycle. As discussed earlier, the main
difference between mobile Botnets and other threats is that they dynamically communicate with their controller, called a Botmaster [6]. Therefore, in addition to the methods mentioned above, an operational behaviour analysis can be employed to analyse Bot communication characteristics, methods and C&C servers. The SMS communication patterns and statistics of the MDC and D4D datasets can be used as a normal pattern to compare with the malicious behaviour of SMS-based mobile Botnets [34]. However, as discussed earlier, the majority of real-world Mobile Botnets communicate via the HTTP protocol over the Internet. A large number of studies on computer-based Botnet detection have adopted behavioural analysis by collecting network traffic for a specific period (i.e. a passive approach) and analysing it to identify evidence of Bot and Botnet activities [2]. Therefore, further studies must be conducted to design and propose new detection models for HTTP-based mobile Botnets as well. On the other hand, to the best of our knowledge there is no benchmark dataset available for either mobile Bot generated traffic or normal mobile web traffic [13]. Therefore, in the next section we propose an approach to establish a testbed and create a dataset for HTTP-based mobile Botnet analysis and detection.

V. PROPOSED DATA COLLECTION APPROACH
As discussed earlier, the current lack of sufficient mobile HTTP traffic datasets (benign and malicious) is a major challenge for mobile HTTP Botnet detection [13]. Therefore, in this paper we propose a data collection approach that creates normal mobile network traffic in addition to malicious, Bot-generated traffic. The selection of real mobile malware is an important issue in any experiment; thus, the samples must be collected from valid sources to ensure that the malicious APKs are definitely Bots [35].
As discussed in Section III, the Genome Malware [19] is the benchmark sample of malware, with more than 1172 mobile Bot instances. However, the original command and control servers of these Bots may no longer be fully accessible, because they have been detected or shut down. Therefore, accurate network traffic cannot simply be generated by submitting the APK file to services such as Anubis [31], CopperDroid [36] and SandDroid [37]. Hence, the proposed approach depicted in Fig 1 aims to build a proper dataset that overcomes the aforementioned issues.

A. Generating Mobile Botnet Dataset
The first step in generating the malicious dataset is to set up a virtual mobile Botnet. Three main elements are needed, as follows:
1) Android Virtual Environments: The experiment requires several infected mobile devices to act as a Botnet; therefore, an Android platform called Android-x86 is used to create a group of victim mobile devices and form a mobile network testbed [32].
[Figure 1 blocks: Android Virtual Environments, Infected Traffic, Real Mobile Devices, Normal Traffic, Data Cleaning, Data Labelling, Data Aggregation, Final Dataset]
Fig 1: Dataset Building Process
2) Mobile Bots: The virtual mobile devices are infected by a repackaged form of real mobile Bots, in which the original command and control server IP addresses and URLs are replaced by the new addresses of the virtual C&C servers. The same approach is taken in [38], where the Geinimi, DroidKungFu-B and DroidKungFu Bot samples are modified with new settings to perform network connections in an experimental environment.
3) Virtual Command and Control Servers: As pointed out by Eslahi et al. [13], Bot communication patterns and characteristics can be used to distinguish Botnet from normal activities. On the other hand, most of the original command and control servers of the mobile Bots in the Android Malware Genome Project dataset have been detected and shut down. Therefore, virtual command and control servers must be implemented in order to simulate a completely functional Mobile Botnet in the proposed testbed.

B. Generating Normal Traffic Dataset
Although normal traffic can also be generated in a simulated environment, in order to generate more realistic data the mobile devices of a group of volunteers are equipped with a network sniffer module to collect the network traffic. Based on the literature, existing studies used 4 to 8 mobile devices to collect real data over a period of two weeks to three months [38]. However, as pointed out by Burguera et al. [39], the more participants there are, the more reliable and accurate the collected dataset is.
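The remaining stages of Fig 1 (data cleaning, data labelling and data aggregation) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the implementation used in this work: the record fields (timestamp, src, dst, app, bytes), the label values "malicious" and "normal", and the function names clean, label and build_dataset are hypothetical, and the cleaning rule shown here only drops incomplete or obviously erroneous records.

```python
def clean(records):
    """Drop records that are incomplete or carry an invalid byte count."""
    required = ("timestamp", "src", "dst", "app", "bytes")  # assumed schema
    cleaned = []
    for rec in records:
        if any(rec.get(field) in (None, "") for field in required):
            continue  # incomplete record: a required attribute is missing
        if not isinstance(rec["bytes"], int) or rec["bytes"] < 0:
            continue  # erroneous value
        cleaned.append(dict(rec))
    return cleaned


def label(records, traffic_class):
    """Attach a class label; the originating application stays in the 'app' field."""
    return [{**rec, "label": traffic_class} for rec in records]


def build_dataset(bot_records, normal_records):
    """Clean the volunteers' traffic, label both sources, and merge them by time."""
    normal = label(clean(normal_records), "normal")
    malicious = label(bot_records, "malicious")
    return sorted(malicious + normal, key=lambda rec: rec["timestamp"])
```

In this sketch only the traffic collected from real devices is cleaned, on the assumption that testbed traffic is already controlled; each record keeps its originating application so that traffic can later be grouped per application.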
C. Data Cleaning
Data cleaning is the main part of dataset preprocessing, especially when the human factor is involved. Unlike data from controlled environments (e.g. a test bed), real-world data are generally incomplete (e.g. lacking certain attributes of interest) and noisy, containing errors or outliers [40]. Therefore, data cleaning is applied to the collected normal traffic to smooth noisy data, identify or remove errors, and resolve inconsistencies.

D. Data Labelling and Aggregation
Both malicious and normal data are labelled to facilitate the training and evaluation processes. In addition, the collected traffic is labelled according to the application from which it originated. Finally, the two aforementioned datasets are aggregated to create the final dataset, from which researchers can extract their desired features for future analysis.

VI. CONCLUSION
This paper presents an overview of the current state of available datasets and data samples on Mobile Botnets, along with their challenges and shortcomings. As Mobile Botnets have only recently appeared, current studies on Mobile Botnet analysis and detection are highly limited by the lack of available Mobile Botnet datasets. Moreover, the majority of real-world Mobile Botnets communicate via the HTTP protocol over the Internet. Therefore, this paper proposes a new approach to build a dataset for HTTP-based Mobile Botnet analysis.

Acknowledgments
This project was supported by the Ministry of Education Research Acculturation Grant Scheme (RAGS): 600RMI/RAGS 5/3(208/2013) and the Research Management Institute (RMI) Grant: 600-RMI/DANA 5/3/PSI (281/2013), Universiti Teknologi MARA (UiTM).

REFERENCES
[1] J. Kok and B. Kurz, "Analysis of the BotNet Ecosystem," in Proceedings of the 10th Conference of Telecommunication, Media and Internet Techno-Economics (CTTE), 2011, pp. 1-10.
[2] M. Eslahi, R. Salleh, and N. B. Anuar, "Bots and botnets: An overview of characteristics, detection and challenges," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE), 2012, pp. 349-354.
[3] H. Romer, "Best practices for BYOD security," Computer Fraud & Security, pp. 13-15, 2014.
[4] M. La Polla, F. Martinelli, and D. Sgandurra, "A Survey on Security for Mobile Devices," Communications Surveys & Tutorials, pp. 1-26, 2012.
[5] A. Drury and R. Absalom. 2012. BYOD: an emerging market trend in more ways than one. Available: http://ovum.com/research/byod-anemerging-market-trend-in-more-ways-than-one/
[6] M. Eslahi, R. Salleh, and N. B. Anuar, "MoBots: A new generation of botnets on mobile devices and networks," in Proceedings of the IEEE Symposium on Computer Applications and Industrial Electronics (ISCAIE), 2012, pp. 262-266.
[7] M. Chandramohan and H. Tan, "Detection of Mobile Malware in the Wild," Computer, vol. 1-1, 2012.
[8] X. Jiang. 2011. AnserverBot, New Sophisticated Android Bot Found in Alternative Android Markets. Available: http://www.csc.ncsu.edu/faculty/jiang/
[9] Z. J. Balhare and V. S. Gulhane, "A Study on Security for Mobile Devices," International Journal of Research in Advent Technology, vol. 2, 2014.
[10] Q. Liao and Z. Li, "Portfolio optimization of computer and mobile botnets," International Journal of Information Security, vol. 13, pp. 1-14, 2014.
[11] N. Leavitt, "Mobile Security: Finally a Serious Problem?," Computer, vol. 44, pp. 11-14, 2011.
[12] H. Lee, T. Kang, S. Lee, J. Kim and Y. Kim, "Punobot: Mobile Botnet Using Push Notification Service in Android," in Information Security Applications, ed: Springer International Publishing, 2014, pp. 124-137.
[13] M. Eslahi, H. Hashim, and N. M. Tahir, "An efficient false alarm reduction approach in HTTP-based botnet detection," in Proceedings of the IEEE Symposium on Computers & Informatics (ISCI), 2013, pp. 201-205.
[14] R. A. Rodríguez-Gómez, G. Maciá-Fernández, and P. García-Teodoro, "Survey and taxonomy of botnet research through life-cycle," ACM Comput. Surv., vol. 45, pp. 1-33, 2013.
[15] S. Mohite and P. R. S. Sonar, "A Survey on Mobile Malware: A War without End," International Journal of Computer Science and Business Informatics, vol. 9, pp. 23-35, 2014.
[16] G. Dini, F. Martinelli, I. Matteucci, M. Petrocchi, A. Saracino, and D. Sgandurra, "Evaluating the Trust of Android Applications through an Adaptive and Distributed Multi-criteria Approach," in Proceedings of the IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 2013, pp. 1541-1546.
[17] J. Hua and K. Sakurai, "Botnet command and control based on Short Message Service and human mobility," Computer Networks, vol. 57, pp. 579-597, 2013.
[18] S. S. C. Silva, R. M. P. Silva, R. C. G. Pinto, and R. M. Salles, "Botnets: A survey," Computer Networks, vol. 57, pp. 378-403, 2013.
[19] Y. Zhou and X. Jiang, "Dissecting Android Malware: Characterization and Evolution," in Proceedings of the Symposium on Security and Privacy (SP), 2012, pp. 95-109.
[20] Y. Zhou and X. Jiang. 2012. Android Malware Genome Project. Available: http://www.malgenomeproject.org/
[21] X. Jiang and Y. Zhou, "A Survey of Android Malware," in Android Malware, ed: Springer New York, 2013, pp. 3-20.
[22] B. Morrow, "BYOD security challenges: control and protect your most sensitive data," Network Security, vol. 2012, pp. 5-8, 2012.
[23] A. Armando, G. Costa, and A. Merlo, "Bring your own device, securely," presented at the 28th Annual ACM Symposium on Applied Computing, Coimbra, Portugal, 2013.
[24] M. Eslahi, M. V. Naseri, H. Hashim, N. M. Tahir, and E. H. M. Saad, "BYOD: Current State and Security Challenges," presented at the IEEE Symposium on Computer Applications & Industrial Electronics, Penang, Malaysia, 2014.
[25] M. Damshenas and A. Dehghantanha. (2013). M0Droid. Available: http://m0droid.uni.me/
[26] J. K. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T.-M.-T. Do, O. Dousse, J. Eberle, and M. Miettinen, "The Mobile Data Challenge: Big Data for Mobile Computing Research," presented at the Mobile Data Challenge Workshop (MDC) in conjunction with Pervasive, Newcastle, 2012.
[27] N. Kiukkonen, J. Blom, O. Dousse, D. Gatica-Perez, and J. Laurila, "Towards Rich Mobile Phone Datasets: Lausanne Data Collection Campaign," presented at the Pervasive Services (ICPS), Berlin, 2010.
[28] G. Suarez-Tangil, J. E. Tapiador, P. Peris-Lopez, and J. Blasco, "Dendroid: A text mining approach to analyzing and classifying code structures in Android malware families," Expert Systems with Applications, vol. 41, pp. 1104-1117, 2014.
[29] APKInspector. 2014. apkinspector. Available: https://code.google.com/p/apkinspector/
[30] S.-H. Seo, A. Gupta, A. Mohamed Sallam, E. Bertino, and K. Yim, "Detecting mobile malware threats to homeland security through static analysis," Journal of Network and Computer Applications, vol. 38, pp. 43-53, 2014.
[31] Andrubis. 2014. Anubis - Malware Analysis for Unknown Binaries. Available: http://anubis.iseclab.org/
[32] Android-x86. 2014. Android-x86 Project - Run Android on Your PC. Available: http://www.android-x86.org/
[33] D. Barroso, "Botnets-The Silent Threat," ENISA, vol. 3, pp. 1-9, 2007.
[34] I. Vural and H. Venter, "Mobile botnet detection using network forensics," presented at the Proceedings of the Third future internet conference on Future internet, Berlin, Germany, 2010.
[35] L. Deshotels, V. Notani, and A. Lakhotia, "DroidLegacy: Automated Familial Classification of Android Malware," presented at the Proceedings of ACM SIGPLAN on Program Protection and Reverse Engineering Workshop 2014, San Diego, CA, USA, 2014.
[36] Copperdroid. 2014. CopperDroid. Available: http://copperdroid.isg.rhul.ac.uk/
[37] Sanddroid, "SandDroid - An automatic android program analysis sandbox," 2014.
[38] A. Shabtai, L. Tenenboim-Chekina, D. Mimran, L. Rokach, B. Shapira, and Y. Elovici, "Mobile malware detection through analysis of deviations in application network behavior," Computers & Security, vol. 43, pp. 1-18, 2014.
[39] I. Burguera, U. Zurutuza, and S. Nadjm-Tehrani, "Crowdroid: behavior-based malware detection system for Android," presented at the Proceedings of the 1st ACM workshop on Security and privacy in smartphones and mobile devices, Chicago, Illinois, USA, 2011.
[40] D. Z. Markov. 2014. Data preprocessing. Available: http://www.cs.ccsu.edu/~markov/ccsu_courses/DataMining-3.html