2012, Vol.17 No.5, 421-427 Article ID 1007-1202(2012)05-0421-07 DOI 10.1007/s11859-012-0864-x

Research on Android Malware Detection and Interception Based on Behavior Monitoring

PENG Guojun1,2, SHAO Yuru2, WANG Taige2, ZHAN Xian2, ZHANG Huanguo1,2

1. Key Laboratory of Aerospace Information Security and Trust Computing, Ministry of Education, Wuhan 430072, Hubei, China;
2. School of Computer, Wuhan University, Wuhan 430072, Hubei, China

© Wuhan University and Springer-Verlag Berlin Heidelberg 2012

Abstract: Focusing on the sensitive behaviors of malware, such as stealing private data and running up charges, this paper proposes a new method to monitor software behaviors and detect malicious applications on the Android platform. Based on the theory and implementation of the Android Binder interprocess communication mechanism, a prototype system that integrates behavior monitoring and intercepting, malware detection, and identification is built in this work. The malware detection experiment uses 50 samples, including 40 normal samples and 10 malicious samples. The theoretical analysis and experimental results demonstrate that this system is effective in malware detection and interception, with a true positive rate equal to 100% and a false positive rate less than 3%.

Key words: Android; software behavior; smartphone security; malware detection

CLC number: TP 305

Received date: 2012-05-09

Foundation item: Supported by the National Natural Science Foundation of China (61103220) and the Fundamental Research Funds for the Central Universities (6082013) and the National Natural Science Foundation of Hubei (2011CDB456), and Chenguang Program (2012710367)

Biography: PENG Guojun, male, Associate professor, Ph. D., research direction: malicious code, network and information system security. E-mail: [email protected]

0 Introduction

Smartphones are becoming more and more popular all over the world. At the same time, security threats to smartphones are becoming increasingly serious, since both stealing private data stored on smartphones and running up users' telephone charges bring economic benefits to attackers. Currently, the most popular smartphone operating system is Android, which is free, open-source, and based on embedded Linux. The Android platform provides many easy-to-use programming interfaces. On one hand, these rich interfaces make it convenient to develop applications; on the other hand, they also make malicious applications easy to create. How to detect malware and protect Android devices from attack has become a key issue and a great challenge.

Behavior monitoring and intercepting techniques have been widely used in security tools on desktop platforms. With the growth in the number of Android devices, several results on behavior monitoring and intercepting on the Android platform have been reported. Schmidt et al [1] ported Linux-based software (ClamAV, Nmap, etc.) to the Android platform to enhance security and to monitor kernel-level events such as system calls and file modifications. The information they extracted was used to create a behavioral model of an Android smartphone. However, no real Android devices were available at that time, so they had no way to fully test their system. Based on software behavior monitoring, Burguera et al [2] built a system called Crowdroid. The tool they mainly used was strace,


Wuhan University Journal of Natural Sciences 2012, Vol.17 No.5

which is also available in Linux. Their system hijacked system calls to collect information about events generated by Android applications and wrote it to an output file. The output file could be uploaded to remote servers for further analysis to detect malware. There are many other studies of software behavior monitoring and intercepting on Android. However, almost all of them share one feature: they monitor and intercept system calls at the kernel level.

We find that monitoring and intercepting system calls at the kernel level lacks practicability. The most important reasons are as follows:

① It is not efficient. System calls are the basic interfaces provided by an operating system, and they are the only way for a process to enter kernel mode from user mode [3]. On one hand, a Java API invoked by an Android application eventually resolves into many system calls, so it is difficult to identify the behavior from the system calls. On the other hand, monitoring and intercepting system calls at the kernel level affects all running processes.

② It is not applicable to real Android devices. The kernel of a real Android device cannot use loadable kernel modules. In other words, code written by developers never has the chance to run at the kernel level without recompiling the kernel. Therefore, it is impossible to monitor and intercept system calls at the kernel level on real Android devices.

In this paper, a new method to monitor and intercept sensitive behaviors of Android applications is proposed. We also build a prototype system to collect the details of sensitive behaviors and offer an effective algorithm to detect malware. Unlike traditional approaches, this system does not rely on a malware database, and it performs better in detecting unknown malware.

This paper is organized as follows: Section 1 introduces the Android Binder interprocess communication (IPC) mechanism, as well as the technique of using it for sensitive behavior monitoring and intercepting. Section 2 presents a prototype system, including the framework and some key techniques. Section 3 shows the experimental results. Section 4 gives possible future work to enhance the system and discusses the prospects of this work.

1 Android IPC Mechanism

IPC is a set of methods for exchanging data among multiple running processes [4]. Linux has several traditional IPC mechanisms, including pipes, System V IPC, sockets, etc. However, for reasons of efficiency and security, Android builds a new kind of IPC mechanism based on the client/server (C/S) model, which is called Binder. Many system-related operations in Android, such as SMS operations, telephone operations, and media operations, are provided as system services. Each service resides in a server. If an application needs a certain service, all it has to do is connect to the relevant server and then request the service data from it [5].

1.1 Binder IPC Mechanism Model

To implement the Binder IPC mechanism, the virtual device /dev/binder is mounted in the kernel [5]. The process requesting data is the client, and the process that provides the relevant data is the server. Both of them run in user space. When using the Binder IPC mechanism, client and server seem to communicate with each other directly. In fact, the data transmitted between client and server is transported by the Binder driver, as shown in Fig. 1.

Fig. 1 Android Binder IPC model

The essence of the Binder IPC mechanism is memory sharing, and the Binder driver is responsible for managing the shared memory. The work of the Binder driver is completely transparent to both client and server. Client and server interact with the Binder driver through the function ioctl, which is used to control I/O devices in Linux. It provides a way to send control parameters and data to a device at the same time. Unlike ordinary functions, ioctl has a variable argument list. Its prototype is

    int ioctl(int fd, int cmd, ...);

The first parameter, fd, is a file descriptor of the virtual device /dev/binder. The parameter cmd is the control command for the device.

1.2 Software Behaviors and Binder IPC Data

If the second parameter of ioctl, namely cmd, is BINDER_WRITE_READ, the client or server is going to read data from or write data to /dev/binder. In this case, a pointer to a binder_write_read structure is passed as the third parameter of ioctl. Through the binder_write_read structure, another structure, binder_transaction_data, can be found. It plays the role of a packet in network communication, and the Binder IPC data is encapsulated in it. Its member variables sender_euid and sender_pid indicate the (effective) user ID and the process ID of the sender, and buffer points to a buffer containing the real payload data. The relation of these structures is shown in Fig. 2.

Fig. 2 Relation of data structures in Binder IPC
Note: The definitions of the structures shown in Fig. 2 can be found in 〈src〉/bionic/libc/kernel/common/linux/binder.h, where 〈src〉 is the root directory of Android's source code.

We find that when client and server communicate through the Binder IPC mechanism, the behaviors of the client can be identified exactly by analyzing the data sent from client to server, that is, the data received by the server from the client. For example, when an application tries to get the user's location information, it acts as a client and accesses the server responsible for the location service. The data this application sends to the server contains the string "android.location.ILocationManager". Therefore, on the server side, by analyzing the data sent from the client side, we know that a certain application is trying to get the user's location. Because every Android application has a unique UID, it is easy to find out which application is performing this behavior from the sender's user ID field (sender_euid) in binder_transaction_data. Furthermore, once we have hijacked the data in Binder IPC, the relevant behavior is hijacked too.
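The server-side identification described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (libmonitor.so is written in C). It assumes that the interface descriptor appears in the payload as a UTF-16LE string, which is how Android parcels usually encode strings, and falls back to an ASCII match. Only the ILocationManager entry comes from the paper; the ISms entry, the function names, and the sample UID table are illustrative assumptions.

```python
# Interface descriptor -> behavior label.
# Only the ILocationManager entry is taken from the paper; the ISms entry
# is an assumed mapping added for illustration.
DESCRIPTOR_TO_BEHAVIOR = {
    "android.location.ILocationManager": "Accessing location",
    "com.android.internal.telephony.ISms": "Sending SMS",
}


def extract_behaviors(payload):
    """Return the behaviors whose interface descriptor occurs in the payload."""
    found = []
    for descriptor, behavior in DESCRIPTOR_TO_BEHAVIOR.items():
        # Assumption: descriptors are stored as UTF-16LE; also try ASCII.
        if (descriptor.encode("utf-16-le") in payload
                or descriptor.encode("ascii") in payload):
            found.append(behavior)
    return found


def attribute(sender_uid, payload, uid_to_package):
    """Pair each detected behavior with the application that owns the UID."""
    package = uid_to_package.get(sender_uid, "unknown")
    return [(package, behavior) for behavior in extract_behaviors(payload)]


# Hypothetical payload: a 4-byte header followed by the descriptor string.
sample_payload = b"\x00\x00\x00\x00" + "android.location.ILocationManager".encode("utf-16-le")
sample_events = attribute(10042, sample_payload, {10042: "com.example.app"})
```

In this sketch the UID-to-package table is passed in explicitly; on a device, that mapping would have to be obtained from the system.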

2 Framework and Implementation

Although Android provides dozens of system services, only three server processes run to manage these services. The details are shown in Table 1.

Table 1 Servers and system services

Server process       Services being managed
com.android.phone    Services related to telecommunication, such as SMS, telephone, etc.
mediaserver          Services related to media, such as recording voice and videos, etc.
system_server        Other services, such as location, network connection, etc.

In Section 1.2, we explained that the behaviors of a client can be identified by analyzing the data sent from client to server. If hundreds of different applications are installed on an Android device, there are hundreds of clients, whereas there are only three servers. It is therefore more efficient and stable to analyze Binder IPC data on the server side to identify the behaviors of clients.

To intercept the Binder IPC data, we compile our code into a shared library named libmonitor.so. By using the injectso technique, we make libmonitor.so run inside the server processes. Then, we hook the function ioctl and read the Binder IPC data.

Following this idea, we design and implement a system to monitor and intercept software behaviors. Moreover, the system uses a novel algorithm to evaluate the threats of applications. The framework and key techniques are introduced in Sections 2.1 and 2.2, respectively.

2.1 Framework

Figure 3 presents the framework of the system.

Fig. 3 Framework of the behavior monitoring and malware detecting system
The dotted arrow lines represent invoking relationships; the black dashed arrow lines represent communications between different levels of the Android system; the gray dashed arrow lines represent the flow direction of data


Next, we present a brief description of the modules shown in Fig. 3.

● Intercepting Module: Hooks the function ioctl in the server processes, then reads the real payload data and passes it to the next module.

● Extracting Module: According to predefined mappings between special strings in the real payload data and software behaviors, this module extracts the behaviors of applications.

● Identifying Module: This module compares the received software behaviors with the predefined sensitive behaviors. If some behaviors are sensitive, the module reads the blacklist or whitelist in the configuration file. Behaviors in the blacklist or not in the whitelist are passed to the next module; otherwise, they are ignored.

● Communicating Module: This module is responsible for sending data about sensitive behaviors from the Linux level to the Android application level. The application level is written in Java, whereas libmonitor.so is written in C. They have to communicate with each other through a language-independent interface.

● Injector: As the name implies, this module is used to inject libmonitor.so into the running server processes.

● Interacting Module: At first, this module executes a Linux shell command to start the injector. After libmonitor.so has been injected successfully, this module waits for and receives the data sent from the communicating module and stores the data in a SQLite database.

● Threat Evaluating Module: This module reads the data describing the sensitive behaviors of each application stored in the database and evaluates the threat of each application by calculating its Threat Point (TP). If the Threat Point of an application is larger than the threshold value, the application is treated as malware. This module then warns the user and blocks the sensitive behaviors of this application.

2.2 Key Techniques

Since we want to acquire the Binder IPC data on the server side, our code has to run in the server processes. Building on a study of the dynamic linking mechanism in Linux, we use the injectso technique to inject libmonitor.so into the server processes [6]. Linux does not provide APIs for hooking, and neither does Android. In order to hook the function ioctl, we redirect it after injecting libmonitor.so. The goal is explicit: replace the address of ioctl with the address of hooked_ioctl. hooked_ioctl is implemented in libmonitor.so, and with it we can intercept the Binder IPC data.


2.2.1 Injectso

The four steps of injectso are shown in Fig. 4. We implement the injectso procedure in the injector.

Fig. 4 Steps of injectso

To manipulate the target process, the injector uses the function ptrace. ptrace can control the state of a process and modify the virtual memory, and even the registers, of a running process. Loading a shared library into the target process also needs another important function, dlopen. We modify the registers so that the PC register points to dlopen in the target process. Injectso on Android is similar to injectso on Linux [7].

2.2.2 Redirecting

The Global Offset Table and the Procedure Linkage Table (GOT and PLT) are used by the dynamic linking mechanism in Linux. Shared libraries are ELF (Executable and Linkable Format) files, which consist of many sections. In particular, two sections, rel.got and rel.plt, contain useful information for GOT and PLT [8]. Zhang et al [6] illustrated how GOT and PLT work during dynamic linking. Based on their research, and taking mediaserver as an example, the method of hooking ioctl is shown in Fig. 5. The address at which libbinder.so is loaded in a server process can be found in the file /proc/〈target_pid〉/maps (〈target_pid〉 is the process ID of the target process). Searching from this address, we can find the GOT of libbinder.so and the index of ioctl in the relocation table. Then, we can calculate the address of the ioctl entry in the GOT [9].

2.2.3 Communication between Linux level and Android application level

libmonitor.so is injected into server processes running at the Linux level, but the MalwareDetector.apk of the system runs in the Dalvik virtual machine [10] at the Android application level. We use the Unix Domain Socket (also called Local Socket) for communication between the two levels.


Fig. 5 Hook ioctl by using redirecting
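The first step of the redirecting procedure in Section 2.2.2, finding where libbinder.so is loaded from the maps file of the target process, can be sketched in Python as follows. This only illustrates the lookup and is not the authors' C code. The function name and the sample addresses are hypothetical, and the sketch assumes the usual six-column layout of /proc/〈pid〉/maps (address range, permissions, offset, device, inode, path).

```python
def find_library_base(maps_text, library_name):
    """Return the lowest start address mapped from library_name, or None."""
    base = None
    for line in maps_text.splitlines():
        fields = line.split()
        # Anonymous mappings have no path column; skip them.
        if len(fields) < 6 or not fields[5].endswith("/" + library_name):
            continue
        start = int(fields[0].split("-")[0], 16)
        if base is None or start < base:
            base = start
    return base


# Hypothetical excerpt of a maps file; the addresses are made up.
SAMPLE_MAPS = """\
00008000-0000a000 r-xp 00000000 b3:09 1234       /system/bin/mediaserver
a8125000-a812c000 rw-p 00025000 b3:09 5678       /system/lib/libbinder.so
a8100000-a8125000 r-xp 00000000 b3:09 5678       /system/lib/libbinder.so
be9f0000-bea11000 rw-p 00000000 00:00 0
"""

libbinder_base = find_library_base(SAMPLE_MAPS, "libbinder.so")
```

On a rooted device the text would come from reading /proc/〈target_pid〉/maps of the server process; locating the GOT entry of ioctl from this base address additionally requires parsing the ELF relocation table, which is not shown here.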

The Android application framework provides classes that support the Unix Domain Socket, such as LocalSocket and LocalServerSocket.

2.2.4 Threat evaluation

We have analyzed many normal applications and malware on Android and collected statistics. Although many normal applications have some sensitive behaviors, their number and combinations are limited. For example, an SMS application can read and send SMS, but it will never record audio. Games installed by the user may access the Internet, but it is abnormal for them to send SMS or make telephone calls.

We assign a weight to every sensitive behavior and to every combination of sensitive behaviors. Widely used sensitive behaviors have a smaller weight; accessing the Internet is one example, because most applications check for updates automatically. Android permission groups can be classified into different danger levels [11], and so can the combinations of sensitive behaviors. For example, almost all IM applications combine accessing the Internet with reading the contacts, so the weight of this combination is small. However, the weight of the combination of sending SMS and making telephone calls is larger, because normal applications hardly ever use it.

The sensitive behaviors of much malware recur at a fixed time interval, such as reading the user's private information every hour. This is a very typical feature of malware. In this case, we increase the weight by multiplying it by a factor.

Some malware, especially spyware, has no user interface (UI) because it does not need to interact with users. This is another typical feature of such malware. We increase the Threat Point by adding a certain value directly. The assignment of weights is shown in Table 2.

Table 2 Assignment of weight for behaviors/combinations

Behaviors/Combinations         Weight   Repeatable (Y/N)
Widely used                    1        1/1
Not widely used                3        1/1
Normal combinations            1        1/1
Dangerous combinations         2        2/1
Very dangerous combinations    4        4/1

In the third column of Table 2, the two values before and after the slash are the factors by which the weight is multiplied when the behavior is repeatable and not repeatable, respectively. We calculate the Threat Point with the two equations shown below:

$$N_c = C_{N_b}^{2} \tag{1}$$

$$TP = \frac{\sum_{i=1}^{N_b} W_i R_i + \sum_{j=1}^{N_c} W_j R_j + \alpha}{N_b + N_c} \tag{2}$$

In Eq.(1), N_b is the number of sensitive behaviors, and N_c is the number of binary combinations of these behaviors. In Eq.(2), TP is short for Threat Point. W_i and W_j are the weights of the corresponding behavior or combination. R_i and R_j are the factors by which the weight is multiplied, depending on whether the behavior or combination is repeatable. α is the value that is added when an application has no user interface. The threshold of the Threat Point is determined by studying samples. The details are explained in Section 3.
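As a worked illustration of Eqs.(1) and (2), the following Python sketch computes a Threat Point from the weights in Table 2. Several details are assumptions and do not come from the paper: the value of α, the class assigned to each example behavior and combination, the rule that unlisted pairs count as normal combinations, and the use of a single repeatable flag for the whole application (as in the Repeatable column of Table 5).

```python
from itertools import combinations

# class -> (weight W, (factor R if repeatable, factor R if not)), from Table 2.
WEIGHTS = {
    "widely_used": (1, (1, 1)),
    "not_widely_used": (3, (1, 1)),
    "normal": (1, (1, 1)),
    "dangerous": (2, (2, 1)),
    "very_dangerous": (4, (4, 1)),
}


def weighted(cls, repeatable):
    """Return W * R for one behavior class or combination class."""
    weight, (r_yes, r_no) = WEIGHTS[cls]
    return weight * (r_yes if repeatable else r_no)


def threat_point(behaviors, combo_classes, repeatable, has_ui, alpha):
    """Evaluate Eq.(2).

    behaviors: dict, behavior name -> "widely_used" or "not_widely_used"
    combo_classes: dict, frozenset({a, b}) -> combination class; pairs that
        are not listed are treated as "normal" (an assumption)
    alpha: value added when the application has no UI (hypothetical)
    """
    n_b = len(behaviors)
    if n_b == 0:
        return 0.0
    pairs = list(combinations(sorted(behaviors), 2))  # N_c = C(N_b, 2) pairs
    total = sum(weighted(cls, repeatable) for cls in behaviors.values())
    total += sum(
        weighted(combo_classes.get(frozenset(pair), "normal"), repeatable)
        for pair in pairs
    )
    if not has_ui:
        total += alpha
    return total / (n_b + len(pairs))


# Hypothetical application: three behaviors, one very dangerous combination.
behaviors = {
    "access_internet": "widely_used",
    "send_sms": "not_widely_used",
    "make_call": "not_widely_used",
}
combo_classes = {frozenset({"send_sms", "make_call"}): "very_dangerous"}

# (1 + 3 + 3) + (1 + 1 + 4 * 4) + 2 = 27, divided by 3 + 3 gives 4.5
tp = threat_point(behaviors, combo_classes, repeatable=True, has_ui=False, alpha=2.0)
```

With these illustrative inputs the result is 4.5; the same behaviors without repetition and with a user interface give 13/6, about 2.17. Both numbers depend entirely on the assumed classes and on the assumed α.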

3 Experimental Results

According to the official Android developer website, the most widely used Android versions at present are 2.2 and 2.3, which together hold a market share of over 85% [12]. Therefore, we test the system mainly on these two versions. Note that executing injectso requires root privileges; the real Android devices in our experiments have been rooted.

3.1 Behavior Monitoring and Intercepting

Our tests verify that the system is able to monitor all the sensitive behaviors listed and to intercept most of them. The detailed result is shown in Table 3.

Table 3 Detailed result of tests of sensitive behavior: monitoring and intercepting

No.   Sensitive behavior        Monitored   Intercepted
0     Reading SMS               √           √
1     Sending SMS               √           √
2     Reading contacts          √           √
3     Reading call logs         √           √
4     Making telephone calls    √           √
5     Accessing Internet        √           —
6     Reading browser history   √           √
7     Accessing location        √           √
8     Using camera              √           —
9     Recording                 √           —

Since we have not yet finished intercepting all sensitive behaviors, a few cells in the last column are marked with "—".

3.2 Malware Detection

We test the malware detection capability of the system in two steps. In the first step, we use 35 applications to determine the threshold of the Threat Point. Of these, 25 applications are downloaded from their official websites and installed as normal samples, and 10 malware samples chosen from ContagioMobile [13] are installed as malicious samples. The sensitive behaviors of some remote control Trojans may not be triggered properly, so these Trojans are not included in our list of malicious samples. The result is presented in Section 3.2.1.

In the second step, we use 40 applications, the top 5 of eight random categories in Google Play, as normal samples. Then, we choose another eight kinds of malware from ContagioMobile and two self-written remote control Trojans as malicious samples. The result is presented and analyzed in Section 3.2.2.

3.2.1 Determine threat point threshold

Running the system, we calculate the Threat Point of all the samples and plot the statistics, as shown in Fig. 6.

Fig. 6 Threat point of samples to determine threshold

Normal samples are numbered from 1 to 25, and malware samples from 26 to 35. The Threat Points of all the normal samples are lower than 2.60. In contrast, the Threat Points of all the malware samples are higher than 2.60. Therefore, we choose 2.60 as the threshold.

3.2.2 Malware detection

After the threshold is determined, we set this value in the system to detect malware. The overall result is shown in Table 4.

Table 4 Overall result of system test for malware detection

Type        Total number   Detected number   True positive rate/%   False positive rate/%
Malicious   10             10                100                    —
Normal      40             1                 —                      2.5

As expected, the system detects all the malicious samples with a low false positive rate. The detailed result is shown in Fig. 7. In order, the eight categories we choose from Google Play are Games, Transportation, Sports, Communication, Social, Music, Tools, and Entertainment. The top 5 applications of each category are numbered from 1 to 40.
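The two rates in Table 4 follow directly from the counts in the table. The short Python sketch below only reproduces that arithmetic; the function name is an illustrative choice.

```python
def rate_percent(flagged, total):
    """Share of samples flagged as malware, in percent."""
    return 100.0 * flagged / total


# Counts taken from Table 4.
true_positive_rate = rate_percent(10, 10)   # malicious samples detected
false_positive_rate = rate_percent(1, 40)   # normal samples wrongly flagged
```

This gives a true positive rate of 100% and a false positive rate of 2.5%, matching the abstract's claim of a false positive rate below 3%.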

Fig. 7 Threat point of samples in system test

In Fig. 7, we can see that applications in the fourth category (App ID: 16-20), namely Communication, have higher Threat Points. Communication tools have more sensitive behaviors because they need to read private data, such as contacts, call logs, and SMS, and even help users send SMS and make phone calls. They are designed to do this work. Malicious samples are numbered from 41 to 50 in Fig. 7. Their sensitive behaviors, Threat Points, and other features are shown in Table 5.

Table 5 Threat point of malicious samples

Sample                       Behaviors        Repeatable   UI   TP
Copy9                        0,2,5,6,7        Y            N    3.75
Nickispy.B                   0,1,5,7          Y            N    3.73
Kidlogger                    0,1,2,3,5,6,7    Y            N    4.07
RogueSPPush                  0,1,5            Y            Y    2.67
Geinimi-CacheMate            0,1,5,7          Y            Y    3.50
Geinimi.A-SPL meter          0,1,5,7          Y            Y    3.50
Geinimi.B-GoldenMiner        0,1,5,7          Y            Y    3.50
Geinimi-Kosenkov protector   0,1,5,7          Y            Y    3.50
Self-written1                0,2,3,5,7,9      N            N    3.30
Self-written2                0,2,3,5,9        N            N    3.53

In Table 5, the numbers in the second column are the behavior numbers defined in the first column of Table 3. Each of them indicates a kind of sensitive behavior. As Table 5 shows, the Threat Point of Kidlogger is the highest. Kidlogger is spyware that has the most sensitive behaviors among all the malicious samples, and its dangerous and very dangerous behavior combinations are repeatable. In addition, Kidlogger has no user interface. After examining the only false positive, we find that it is a security protection tool named Lookout Antivirus. It is reasonable for this application to perform many sensitive behaviors because of its special functions. Since false positives are inevitable, we can add trusted applications to the whitelist to eliminate them.
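The whitelist remedy can be made concrete with a small Python sketch of the filtering rule stated for the Identifying Module in Section 2.1: a sensitive behavior is passed on when the application is in the blacklist or is not in the whitelist. The sketch assumes that both lists hold application package names; the package names used here, including the one standing in for the security tool, are hypothetical.

```python
# The ten sensitive behaviors listed in Table 3.
SENSITIVE_BEHAVIORS = {
    "Reading SMS", "Sending SMS", "Reading contacts", "Reading call logs",
    "Making telephone calls", "Accessing Internet", "Reading browser history",
    "Accessing location", "Using camera", "Recording",
}


def should_forward(package, behavior, blacklist, whitelist):
    """Decide whether a behavior is passed on for threat evaluation."""
    if behavior not in SENSITIVE_BEHAVIORS:
        return False  # non-sensitive behaviors are ignored
    # Forward if blacklisted or not whitelisted; otherwise ignore.
    return package in blacklist or package not in whitelist


# Hypothetical configuration: the trusted security tool is whitelisted.
whitelist = {"com.example.securitytool"}
blacklist = set()

trusted = should_forward("com.example.securitytool", "Reading SMS", blacklist, whitelist)
untrusted = should_forward("com.example.game", "Sending SMS", blacklist, whitelist)
```

In this sketch an application that appears in both lists is still forwarded, because the blacklist test comes first; the paper does not state how such a conflict is resolved.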

4 Conclusion

The experimental results demonstrate that the system performs well in monitoring and intercepting sensitive behaviors of Android applications. For detecting and identifying malware, we propose a new method with high practical value. The technique of behavior monitoring and intercepting proposed in this paper can be used in security protection tools to implement dynamic defense. This technique makes it possible for security tools to detect and identify unknown malware without a malware database. With the rapid growth of Android malware, the system and the techniques it uses have very promising prospects for application.

References

[1] Schmidt A D, Schmidt H G, Clausen J, et al. Enhancing security of Linux-based Android devices [EB/OL]. [2011-11-19]. http://www.dai-labor.de/fileadmin/files/publications/lk2008-android_security.pdf.
[2] Burguera I, Zurutuza U, Nadjm-Tehrani S. Crowdroid: behavior-based malware detection system for Android [C]//Proc 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices. New York: ACM Press, 2011: 15-26.
[3] Egele M, Scholte T, Kirda E, et al. A survey on automated dynamic malware analysis techniques and tools [J]. ACM Computing Surveys, 2012, 44(2): 1-49.
[4] Wikipedia. Inter-process communication [EB/OL]. [2012-01-07]. http://en.wikipedia.org/wiki/Inter-process_communication.
[5] Schreiber T. Android binder [EB/OL]. [2012-03-29]. http://www.nds.rub.de/media/attachments/files/2012/03/binder.pdf.
[6] Zhang Hejun, Zhang Yue. Research and application of dynamic link mechanism in Linux [J]. Computer Engineering, 2006, 32(22): 64-66(Ch).
[7] Xfocus Team. Injecting shared library [EB/OL]. [2011-12-14]. http://www.focus.net/articles/200208/438.html.
[8] TIS Committee. Executable and linkable format [EB/OL]. [2011-10-30]. http://www.skyfree.org/linux/references/ELF_Format.pdf.
[9] Anonymous. Runtime process infection [EB/OL]. [2011-12-05]. http://www.phrack.org/issues.html?issue=59&id=8.
[10] Li T S, Jing S, Xu J H, et al. The research of dalvik virtual machine on the Android platform [C]//Proc 3rd International Conf on Manufacturing Science and Engineering. Xiamen: IEEE Press, 2012: 2534-2537.
[11] Tang W, Jin G, He J M, et al. Extending Android security enforcement with a security distance model [EB/OL]. [2012-01-06]. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=06006288. (DOI: 10.1109/ITAP.2011.6006288)
[12] Android Developers Guide. Android platform versions' current distribution [EB/OL]. [2011-10-30]. http://developer.android.com/resources/dashboard/platform-versions.html.
[13] ContagioMobile Blog. Collection of 96 mobile malware samples [EB/OL]. [2011-11-04]. http://contagiominidump.blogspot.com.


