International Journal of
Comput Syst Sci & Eng (2017) 2: 71–80 © 2017 CRL Publishing Ltd
Computer Systems Science & Engineering
Detecting local covert channels using process activity correlation on android smartphones Marcin Urbanski1 , Wojciech Mazurczyk1, Jean-Francois Lalande2 , Luca Caviglione3 1 Warsaw University of Technology, Institute of Telecommunications, Warsaw, Poland E-mail:
[email protected],
[email protected] 2 INSA Centre Val de Loire, University of OrlØans, Bourges, France. E-mail:
[email protected] 3 Institute of Intelligent Systems for Automation (ISSIA), Genova, Italy. E-mail:
[email protected]
Modern malware threats utilize many advanced techniques to increase their stealthiness. To this aim, information hiding is becoming one of the preferred approaches, especially to ex ltrate data. However, for the case of smartphones, covert communications are primarily used to bypass the security framework of the device. The most relevant case is when two colluding applications" cooperate to elude the security policies enforced by the underlying OS. Unfortunately, detecting this type of malware is a challenging task as well as a poorly generalizable process. In this paper, we propose a method for the detection of malware exploiting colluding applications. In more details, we analyze the correlation of processes to spot the unknown pair covertly exchanging information. Experimental results collected on an Android device showcase the effectiveness of the approach, especially to detect low-attention raising covert channels, i.e., those active when the user is not operating the smartphone.
1.
INTRODUCTION
More than ten years of advancements in security systems imposed malware developers to study new techniques to make their “products" stealthier. To this aim, utilizing some form of information hiding is a relevant trend, which presumably started in 2006 when the Trojan.Downbot [22] allowed to attack and damage numerous institutions during the “Operation Shady RAT". Specifically, the threat hid its presence for months by encoding control commands within HTML pages or JPEG images, which appeared as licit traffic [7]. From 2011, the proliferation of information hiding-capable malware constantly increased and showcased three major groups of techniques to cloak communications [12]: (i) methods embedding secrets via digital media steganography, e.g., by manipulating pixels of a picture or samples of an audio track. The most notable examples are Duqu [2] and Trojan.Zbot [17] using digital images as hidden data carriers; (ii) methods injecting data within network traffic, e.g., by encoding bits into well-defined values of the throughput. In this class, the most diffuse threats are Regin [19] and Linux.Fokirtor [18] exploiting flows pro-
vol 32 no 2 March 2017
duced by popular network services, e.g., the secure shell; (iii) methods hiding information by modulating the status of shared hardware/software resources, e.g., by artificially altering the load of the CPU to signal some events to other processes. A possible example is Airhopper [4] using the GPU of the infected host to produce electromagnetic signals to communicate with the attacker. As regards targets of such threats, smartphones are generally more vulnerable than desktops as they offer different carriers for embedding secrets. In fact, modern devices natively incorporate camera, GPS, IEEE 802.11, Bluetooth, cellular connectivity, together with a mixed variety of sensors. However, for the case of smartphones, information hiding is primarily used to implement a communication path between two processes running on the same device, defined as a local covert channel. This is usually exploited by a pair of “colluding applications" cooperating to void the security framework of the underlying OS [13]. The archetypal approach is Soundcomber [16], which uses information hiding to bypass the security layer of the Android OS. In essence, the part of the malware that has insufficient privileges to access the network implements a covert inter-process commu-
71
DETECTING LOCAL COVERT CHANNELS USING PROCESS ACTIVITY CORRELATION ON ANDROID SMARTPHONES
nication service with another application having the proper permissions to leak data outside the device. Unfortunately, this type of malware is very difficult to spot, since: the pair of processes involved in the exchange is unknown; covert channels are characterized by a very low bitrate and often empowered with techniques to reduce their impact on the device (e.g., to avoid performance degradation); each information hiding technique is tightly coupled with the carrier, thus making the detection a poorly generalizable procedure. In this paper, we propose a method to spot the presence of local covert channels of mobile devices by considering the activity statistics of processes. To this aim, we take advantage of the fact that to create an effective local covert channel, the pair of colluding processes should be active at nearly the same time. Despite the presence of different techniques, this work focuses on low-attention raising covert channels, which increase their stealthiness by activating themselves when the user is not operating the smartphone. We point out that this class of threats is the most difficult to identify, and it is quickly becoming the preferred choice to build the next generation of mobile malware [6]. To offer a comprehensive analysis, we also partially investigate the case of channels activating when the user is busy (e.g., the device is used to watch videos or browse the Web). Concerning our approach, we firstly introduce three different techniques to identify active processes. Then we use such information to compute an ad-hoc metric to discover the colluding pair. The effectiveness of the proposed solution has been tested on an Android device. Additionally, owing to its generality and since we rely upon simple and high-level metrics, our detection framework can be deployed into a device without causing major computational burdens. Furthermore, it can be easily adapted to spot different information hiding-capable threats. The rest of the paper is structured as follows. Section 2 provides the background on local covert channel and surveys works dealing with the colluding applications threat and its detection methods. Section 3 presents the proposed detection method, while Section 4 deals with the experimental results collected on a real Android device in different use cases. Lastly, Section 5 concludes the paper.
2.
BACKGROUND AND RELATED WORK
Covert channels allow two endpoints to exchange data in a stealthy manner. During the years, different methods have been proposed, e.g., timing mechanisms [20], alteration of fields within protocols of the TCP/IP stack [14, 10] and exploitation of cloud computing environments[15]. Even if covert channels can be also used to prevent censorship, they are mainly used to increase the stealthiness of malware, especially to exfiltrate sensitive data, download additional code to be executed in infected hosts, as well as to cloak communications towards Command & Control servers [12]. According to their communication range, two main types of covert channels exist: network covert channels and local covert channels. The former enables to transmit data across the Internet, while the latter limits the exchange of information within the device. Detection and mitigation of both types of threats are important. However, in this paper, we focus on local covert
72
channels operating within a smartphone, since it is the most popular hazard found in recent malware [13].
2.1
Android Security and Colluding Applications
Before introducing the colluding applications threat, we provide some background on security of Android OS, which is widely used in the related literature as well as the target of many realworld attacks. Put briefly, Android relies on Unix and Java security paradigms borrowed from standard Linux distributions. Specifically, processes are isolated from each others by means of identifiers (defined as PIDs), which are dynamically generated when an application starts. Applications can exchange information by using specific messages named intents, which are designed to reduce hazards. To isolate processes from the underlying OS, before version 5.0, Android used the Dalvik Virtual Machine (VM), i.e., a re-implementation of the Java VM that can check at runtime bytecodes and enforce security policies. Instead, now Android uses an ahead-of-time compiler, which transforms the bytecode in native executables embedding security policies to be applied. As regards security policies, they are expressed via permissions. Then, each application needs a specific authorization to access resources (e.g., network and sensors), system data or components (e.g., the list of processes or the ability to keep the smartphone awake). The most “sensitive" permissions are enforced by the security framework of the OS or by the kernel, while the remaining ones are managed by the application itself. As an example, the access to the network is enforced via the Unix group inet, and bypassing such a constraint requires to attack the Linux kernel. Therefore, a malware wanting to exfiltrate sensitive information, must obtain permissions such as INTERNET and READ_CONTACTS, which guarantee the access to the network layer and the address book of the device, respectively. However, a process requiring resources not strictly related to its scope could appear suspicious both from the perspective of a user or tools used to increase the security of the device (as discussed in Sections 2.3 and 2.4). To bypass such a protection mechanism, an attacker could use a covert channel. The most notable example of information hiding-capable mobile malware is Soundcomber [16], which covertly transmits the keys pressed during a call, e.g., to exfiltrate a PIN of a bank account. Since the sensitive data is acquired by using the microphone, some security/privacy policies of the OS could prevent the malware to access the network. To bypass such a limit, the part that has insufficient privileges can use a colluding application to leak data outside the device. The latter could be an innocent looking software, for instance an organizer or a game, having the access to the TCP/IP protocol stack. Figure 1 depicts such a use case, which is commonly defined as the “colluding applications" scenario [11]. In more details, the Secret Sender has access to the data but has not the permission to access the network, while the Secret Receiver has access to the network and will be able to exfiltrate the received data to an external server. The communication between the two colluding processes is granted by a local covert channel, which can be created by using different techniques.
computer systems science & engineering
´ M. URBANSKI ET AL
The number of free blocks is controlled by the transmitting side via ad-hoc crafted write/delete operations;
• Processor statistics or frequency: secret bits are transferred by encoding the information into the usage statistics of the CPU available in /proc/stat or by observing the trend of its frequency.
Later, Lalande and Wendzel [6] described additional ways to enable local covert channels for mobile devices. While designing the channels, authors primarily focused on how to reduce their “attention raising" characteristics. Especially, they propose to activate the malware when the user is not performing any task with his/her smartphone (denoted in the following as “idle state" or simply as “idle"), since establishing the covert channel during busy periods could degrade the performances of the smartphone and reveal the presence of the ongoing attack. More specifically the authors of[6] propose:
Figure 1 Colluding applications scenario: Secret Sender and Secret Receiver using some form of information hiding to establish a local covert channel to exchange data outside their sandboxes.
2.2
Local Covert Channels for Mobile Devices
To implement the local covert channel between the colluding applications, Soundcomber exploits the most popular functionalities of modern smartphones and mobile devices [16]. Specifically: • Vibration/volume settings: one process is differencing the status of the vibration/volume and the other infers secret data bits from this event; • Screen state: secret bits are transferred by acquiring and releasing the wake-lock permission that controls the screen state; • File lock: secret bits are exchanged between processes by competing for the exclusive access to a file. Along the lines of Soundcomber, other techniques for developing covert communication on a smartphone have been recently proposed. The work by Marforio et al. [11] investigates seven new methodologies to enhance the bandwidth of classical covert channels. For instance, different locks are used simultaneously as to increase the throughput of exchanged information. In more details, the authors of [11] propose the following methods: • Vibration/volume settings: as it done by Soundcomber, the covert channel is created by modifying vibration/volume settings, yet in this case they are applied to multiple settings at the same time (e.g., message and ring tones); • Automatic intent: secret bits are exchanged by using OSwide notifications, i.e., broadcast messages sent automatically to interested processes when the modification of a subscribed variable happens; • Type of intent: secret bits are transferred by encoding data into an intent, which is an event of interest. In this case, the secret is encoded in the “nature" of the event, rather than its value; • Threads enumeration: the covert channel is created by encoding data into the number of currently active threads reported in the /proc directory; • Unix socket: similar to the previous case, but here the secret data bits are encoded into the state of socket(s); • Free space: secret bits are exchanged by encoding data into the amount of free space in the storage unit of the device.
vol 32 no 2 March 2017
• Task list and screen state: secret bits are exchanged by encoding them into the time for which the sender application stays active after the screen is switched off. Based on this measurement, the latter can infer secret bits. Authors also presented a simplified version only relying upon the state of the screen. Even if effective, this method is fragile as it assumes that an user will not interrupt the receiving application by operating the device; • Process priority and screen state: similar to the previous method, the main difference is that the receiving side measures the time for which a given process priority remains set to a reference value. Authors also evaluated a simplified version without the synchronization. In this case, the covert channel is created by scanning the priorities of all the processes until detecting the predetermined value. Concerning the screen state, it helps to detect when the user is not operating the smartphone, e.g., during the night or when the battery is charging. This dramatically reduces the chance of any human detection of possible side effects of the covert channel, e.g., visual/audio notifications or a reduction in the responsiveness of the device. Nevertheless, the price to pay for such an increased stealthiness is a reduction of the achievable throughput (less than 1 bit/s in many cases). Recently, Al-Haiqi et al. [1] proposed a new covert channel using the vibration feature to transmit data and the accelerometer to act as a receiver. Clearly, activating the vibration is an observable event, thus this method relies upon functionalities to start the transmission when the user is supposed to be “away". The obtained throughput depends on the hardware of the phone, thus is highly variable and equals to 5 bit/s, on the average.
2.3
Detection Methods
From the general Android security perspective, literature mainly showcases efforts to track information flows [3], which is a mandatory preliminary step before taking into account local covert channels. Another relevant field deals with the extension of existing solutions such as Taintdroid [3] or inter component analysis approaches [8]. However, their adoption to detect a local covert channel violating the security/information policies is
73
DETECTING LOCAL COVERT CHANNELS USING PROCESS ACTIVITY CORRELATION ON ANDROID SMARTPHONES
still an open research problem. More specifically, Lalande and Wendzel [6] demonstrated that their low-attention raising local covert channels are able to successfully exfiltrate sensitive user data even in the presence of tools for tracking information flows, such as Taintdroid. As regards detection of local covert channels, the literature presents only a very limited amount of work, thus highlighting difficulties in developing effective or general countermeasures. The most relevant effort is [5], where Hansen et al. aim at detecting three types of covert channels (i.e., those based upon vibration, volume, and wake lock). Specifically, authors propose to count the number of events, which are the changes of settings related to vibration, volume and the wake lock, and evaluate them within different time windows to reveal anomalous bursts. Unfortunately, the decision is made by comparing the number of events against a threshold: this requires to know their type and to fix the threshold in advance. Another recent trend is to use more general indicators to spot the presence of colluding applications, for instance by using anomalous energy consumptions of devices [23], [24]. Therefore, in this paper we focus on the detection of local covert channels in which colluding applications are primarily active at the same time as they offer increased bandwidths and are also widely adopted in malware found in the wild [13]. As regards the proposed detection technique, we introduce a more general solution considering different rules to decide whether a process is active and a set of metrics to spot colluding applications. Our approach prevents the need of any a-priori knowledge on the behavior of a process, while guaranteeing an easy implementation. As for any other available countermeasures (see, e.g., [5] and references therein), an attacker knowing its internals could defeat the method by reducing the bandwidth of the covert channel, i.e., the concurrent activity of colluding applications. However, the effectiveness of our detection approach demonstrates that previously unknown covert channels would be made unusable (i.e., they are spotted) or at least severely impaired.
2.4
Additional Malware Protections
To void the effectiveness of a malware using information hiding, another possible approach solely concentrates on the part devoted to the exfiltration of the data, i.e., the Secret Receiver for the colluding applications case considered in this work. However, this could be a naive idea. For example, a detection tool can inspect the network packets transmitted outside the device, or parse the bytecode of a suspicious application. Unfortunately, such solutions would have all the drawbacks of network-based Intrusion Detection System (IDS) or antivirus. In particular, the Secret Receiver may avoid the detection of the data exfiltration by using advanced defense techniques that are briefly described in the following. First, the related traffic can be further injected into another covert channel in order to bypass inspection policies [13]. Additionally, the attacker can protect the external server identity by using a network distributed architecture like The Onion Router (TOR). In fact, modern malware, such as Simplocker [9], already uses this workaround to cover the verification of user’s payments
74
when asking for ransoms. Lastly, the code of the malware can be obfuscated [21] in order to defeat static analysis methods. Thus, if a benign application embeds the Secret Receiver, it becomes very difficult to distinguish the suspicious code accessing the network from the legitimate part. Summing up, these reasons motivate the need of a systemlevel method that early recognizes the presence of the malware when both the Secret Sender and Secret Receiver jointly operate within the device.
3.
DETECTION METHODOLOGY
In this section, we describe the detection method used to identify the unknown pair of processes acting as colluding applications. To mitigate the complexity and guarantee an efficient implementation on many devices, we only use activity statistics provided by the Android OS and we introduce a condensed metric. For each process, we gather from /proc/[pid]/stat the following values: 1. utime – the amount of time that the process has been scheduled in user mode; 2. stime – the amount of time that the process has been scheduled in kernel mode; 3. cutime – the amount of time that the process waited for children when scheduled in user mode; 4. cstime – the amount of time that the process waited for children when scheduled in kernel mode. Then, the number of ticks assigned to the process from its creation is given by the sum of all values, i.e., utime + stime + cutime + cstime. The measure is repeated after a t, which is randomly selected from a uniform distribution. This allows to reduce the impact of the proposed framework over the device in terms of battery drains and CPU usage. For each process, the number of cycles allocated between two adjacent samplings are evaluated by computing their difference, and we define such a quantity as Mi . For example, if the measurement of a process at the instant i is equal to 713 clock cycles, and the value changes to 715 at i + 1, the difference is of 2 clock ticks, thus Mi = 2. This means that the process has performed some operations during the period of time between two measurements, i.e., t. However it must be highlighted that such a mechanism could fail to correctly identify an active process, especially when in the presence of a very limited number of performed operations. Therefore, to decide whether a process is active, we introduce the following three alternative decision rules: 1. threshold: the process is considered active if the number of ticks Mi exceeds a given threshold T . 2. relative: the process is considered active if the number of ticks Mi is greater than the average number of ticks consumed by the other processes running on the device. 3. sliding window: the process is considered active if the average of the last N measurements (i.e., Mi−N−1 , Mi−N−2 , . . . , Mi−1 ) is lower than the current measured value Mi .
computer systems science & engineering
´ M. URBANSKI ET AL
Table 1 Exemplary calculation of the total counter TC (x) for processes a, b and c during five consecutive measurements. Measure M1 M2 M3 M4 M5 TC (x)
Process a Active Not active Active Not active Active TC (a) = 3
Process b Active Active Active Active Active TC (b) = 5
Process c Not active Not active Active Active Active TC (c) = 3
would be severely impaired (or almost ineffective to exfiltrate a relevant amount of sensitive data). This also requires a proper tuning campaign of the parameters. Therefore, Section 4 discusses how to experimentally find the most suitable values for the A F to detect colluding applications both when the phone is idle and when the user performs some operations.
4. Table 2 Exemplary calculation of the activity factor A F (x, y) for processes a, b and c. Pair
Activity
A F (x, y) [%]
a, b
Active during 1, 3, 5
3 A F (a, b) = (3+5−3) · 100 = 60
a, c
Active during 3, 5
2 A F (a, c) = (3+3−2) · 100 = 50
b, c
Active during 3, 4, 5
3 A F (b, c) = (3+5−3) · 100 = 60
Upon choosing a variant of the decision rule, a process can be considered active/inactive. However, this only gives “instantaneous" information. Thus, we introduce a total counter, denoted as TC (x), considering the time for which the given process x is active during its lifetime. Table 1 shows an example of results for TC (x), for three processes a, b and c during five consecutive measurements. As a second step, we evaluate the behavior of all the pairs of processes running on the smartphone in order to recognize if they are two colluding applications. To this aim, we introduce the activity counter denoted as AC (x, y) representing the number of measurements for which processes x and y were both active. The two metrics defined above are then used to obtain a condensed indicator, defined as Activity Factor ( A F ), to decide whether a pair of processes is covertly exchanging data. Specifically: A F (x, y) =
AC (x, y) × 100 TC (x) + TC (y) − AC (x, y)
(1)
where, AC (x, y) is the activity counter of the pair x and y, TC (x) and TC (y) are the total counters for process x and y, respectively. In more details, A F (x, y) allows to assess the correlation between processes by offering a percentage of “concurrent" activity. Hence, it can be used as a condensed indicator to judge whether a pair of processes is implementing a colluding applications threat and exchanging data through a local covert channel. Its simple nature allows an easy implementation on resourcelimited devices and reduces impacts in terms of performance degradation if used in real-time. As an example, Table 2 reports computed A F (x, y) values for the scenario previously presented in Table 1. As for any other countermeasures available (see, e.g., [5] and references therein), an attacker knowing the static threshold could defeat the detection techniques by reducing the bandwidth of the covert channel, i.e., the concurrent activity of colluding applications. Nevertheless, the usage of techniques relying upon past average values reduces such a fragility since the detection method enjoys a sort of “self-adaptivity", which prevents to use static values that the threat can bypass at runtime. In general, the performances of our approach in identifying correlations in the activity of processes demonstrate that undetected covert channels
vol 32 no 2 March 2017
EXPERIMENTS AND RESULTS
To have a realistic testbed, we implemented the following three state-of-the-art information hiding techniques used to establish a local covert channel between the colluding applications on the Android OS-based device. The main idea of all three covert channels have been described in Section 2. Specifically we have implemented the File Lock, Volume Settings and Type of Intent covert channels. We selected such methods since they are an effective representation of the usage of different resources available in a modern smartphone, as well as they are characterized by quite mixed performances in terms of throughput values. In more details with the type of intent channel it is possible to transmit 1 byte between the Secret Sender and Secret Receiver at once, with the volume settings 4 bits (when an application using a 16-valued audio output is used), and only 1 bit with the file lock method. Such behaviors impact also on the duration of the transmission, and possibly, the detectability of the colluding applications. For each method, we used the colluding applications to exchange a message of 100 bytes as to simulate the exfiltration of sensitive information, e.g., an entry of the address book (name, email, and phone number). Each trial lasted 20 minutes and has been repeated 10 times as to have average values with a proper confidence level. We point out that such timeframe has been selected to increase the indeterminacy on “when" colluding applications start to interact with each other. This represents a very realistic set-up, since in real-world use cases the time at which a covert channel is established is unknown. Additionally, this choice allows to test our method in a worst-case, i.e., the duration of a transmission could be spread over a long period, hence making the detection harder. The duration of the trial should not be confused with the time needed by our method to perform the detection. In fact, postponing the spot of colluding applications of 20 minutes is not acceptable since it is sufficient to complete the exfiltration of data in many cases. Instead, our approach reacts after few samples. Unfortunately, a quick detection could account for a loss of accuracy, therefore investigating such a trade-off is part of our ongoing research. As regards the detection method, the time between two consecutive measurements t was randomly chosen from a uniform distribution ranging from 200ms to 2s. All the trials have been repeated in two scenarios: when the smartphone is idle and when the user is performing some activity.
4.1
Scenario 1: idle phone
In this section, we consider the smartphone as idle, i.e., there is no deliberate user’s activity. We recall that this scenario is very realistic and, as today, the preferred choice of malware
75
DETECTING LOCAL COVERT CHANNELS USING PROCESS ACTIVITY CORRELATION ON ANDROID SMARTPHONES
developers for exfiltrating data without raising the attention of the victim [6]. The first results are collected by using the decision rule based on a threshold T presented in Section 3. Table 3 showcases the values for AC (x, y), TC (x) + TC (y) and A F (x, y) for the pair of processes involved in the covert channel communication. In the following, to avoid burdening the notation, we drop the dependence from the process, i.e., we simply refer to AC , TC and A F . Since the parameter T plays a major role, we investigated different values, i.e., T = 1, 2, 5, 10. For all the covert channels we can observe that if T is equal to 1 or 2 clock ticks, the standard deviation σ of A F has the best values, thus reducing statistical errors. On the contrary, as T increases “slow" processes could remain undetected and marked as idle for long bursts of measurements. Therefore, using such a condensed parameter to discriminate if a pair of processes are colluding could lead to incoherent results or false positives/negatives due to the high σ . Figure 2 portraits the A F for the five most active pairs of processes, for T = 1, 2, 5, 10, and when data is exchanged through the type of intent covert channel. In this case, the A F value of the pair of colluding applications (labeled as ‘A’ in the figure) is greater than the one of other processes, thus allowing to easily recognize them. As it can be seen, the lower the value of T , the better is the “spread" of A F among the most interacting processes, thus making the detection of colluding applications more clear and accurate. Figure 3 and 4 present the same analysis for the volume settings and the file lock covert channels. Results are very similar to those obtained for the type of intent channel, and the best activity factor is obtained for smaller values of T . Then, we present results when using the “relative" decision rule, i.e., the pair of processes is active when it gets more clock ticks than the average of other entities running on the device. Table 4 presents the values for the parameters of the processes acting as colluding applications. The best performances are obtained when detecting colluding applications exchanging data through the file lock method. This is due to the behavior of the channel, which is active for a longer period if compared to other methods. In fact, when using the file lock, the data transmission lasts 8 times longer than for the type of intent case. Figure 5 showcases the five most active pairs of processes sorted according to the A F , and again, our method correctly identifies the pair ‘A’ as the colluding one. As it can be noted, better results are obtained for hiding methods that send secret data in smaller parts and thus require more time to complete the transfer. Moreover, the covert channel using the file lock information hiding scheme is characterized by the highest A F , and also has the lower σ compared to the other two channels. Hence, the covert channel using the file lock technique can be recognized with the greater accuracy. Lastly, we investigate the results when using the sliding window rule to detect the pairs of applications covertly exchanging data. Table 5 presents obtained experimental results for all three considered covert channels, and the best performances are achieved when detecting the colluding applications sending information through the file lock covert channel. Figures 6–8 illustrate the A F results for the five most active process pairs for this detection method, and also in this case, ‘A’ is the one implementing the colluding applications threat. However, despite the high value of σ , colluding applications can be clearly identified,
76
Table 3 Statistics for the pair of processes identified as colluding applications when using the threshold decision rule.
T
1 2 5 10 1 2 5 10 1 2 5 10
Activity Total Activity counter counter Factor AC TC A F [%] Avg. σ Avg. σ Avg. σ Type of Intent covert channel 8.20 9.14 14.50 21.53 66.80 15.41 2.70 1.25 4.40 1.58 60.83 18.45 2.10 1.60 7.10 9.85 55.40 32.65 1.00 0.94 2.60 4.74 56.88 47.73 Volume Settings covert channel 11.60 8.57 16.60 14.32 72.97 11.09 3.40 1.17 5.20 1.55 64.71 11.10 6.1 0 4.04 13.10 10.30 58.04 22.15 0.70 0.48 1.30 0.48 55.00 43.78 File Lock covert channel 70.70 3.50 76.70 4.35 92.27 3.42 34.50 3.44 37.70 4.16 91.76 5.54 25.90 19.23 32.40 22.86 77.07 9.80 1.00 0.47 1.80 0.79 60.00 32.58
Table 4 Statistics for the colluding applications using the relative decision rule.
Activity Total Activity counter counter Factor AC TC A F [%] Avg. σ Avg. σ Avg. σ Intent 12.30 8.14 19.90 13.20 63.22 11.51 Volume 23.50 6.00 37.60 20.08 68.31 13.06 File Lock 95.70 11.00 114.60 13.76 83.68 4.48 Channel Type
and also in this case, the detection of threats using the file lock covert channel is the most accurate. To summarize, all the decision rules used to mark a process as active proven to be effective to feed our method based on the condensed A F metric to identify colluding applications. In more details, the variant using the threshold T yielded to the best results, and the file lock covert channel was the one that can be detected with the best accuracy. It must be also noted that in the majority of cases, a pair of processes having an A F > 60% can be considered suspicious as the other pairs of processes typically did not reach values higher than 10%. Therefore, as a preliminary step, we do consider colluding applications those having an A F > 60%, which also leads to a number of false positives smaller than the 5%. As hinted, knowing this threshold in advance could allow a malware developer to beat the method, for instance by slowing down the transmission (or increase the degree of sophistication of the covert channel) as to have colluding applications with a reduced concurrent activity. However, this would lead to a covert data exchange highly impaired, and thus not effective for the exfiltration of sensitive data. At the same time, the “good" statistical properties of our method, especially the behavior of σ , guarantee to counteract methods with scarce activity at the price of an increased amount of false positives. Nevertheless, high values of σ could lead to too many false positives/negatives, thus reducing the effectiveness of the
computer systems science & engineering
´ M. URBANSKI ET AL
90
80
80
70
70
70
60
60
60
80
70 60
AF [%]
80
AF [%]
90
50
50
AF [%]
100
90
100
90
AF [%]
100
100
50
50
40
40
40
30
30
30
30
20
20
20
20
10
10
10
10
0
0
0
40
A
B
C
D
E
A
B
C
D
0 A
E
(a) T = 1
B
C
D
A
E
B
(b) T = 2
C
D
E
Pair of processes
Pair of processes
Pair of processes
Pair of processes
(c) T = 5
(d) T = 10
Figure 2 Five most active pairs of processes when measuring activity using the threshold decision rule for the type of intent covert channel.
80
60
AF [%]
AF [%]
70
50 40
100
100
90
90
80
80
70
70
60
60
50
100 90 80 70
AF [%]
90
AF [%]
100
50
60 50
40
40
40
30
30
30
30
20
20
20
20
10
10
10
0
A
B
C
D
0
E
A
B
Pair of processes
C
D
10
0
E
A
Pair of processes
(a) T = 1
B
C
D
E
0
A
B
Pair of processes
(b) T = 2
C
D
E
Pair of processes
(c) T = 5
(d) T = 10
100
100
90
90
90
80
80
80
80
70
70
70
70
60
60
60
60
50
50
AF [%]
100
90
AF [%]
100
AF [%]
AF [%]
Figure 3 Five most active pairs of processes when measuring activity using the threshold decision rule for the volume settings covert channel.
50
50
40
40
40
30
30
30
30
20
20
20
20
10
10
10
10
0
0
0
A
B
C
D
E
A
B
Pair of processes
C
D
E
40
A
(a) T = 1
B
C
D
E
0
A
B
Pair of processes
Pair of processes
(b) T = 2
C
D
E
Pair of processes
(c) T = 5
(d) T = 10
100
100
90
90
90
80
80
80
70
70
70
60
60
60
50
AF [%]
100
AF [%]
AF [%]
Figure 4 Five most active pairs of processes when measuring activity using the threshold decision rule for the file lock covert channel.
50
50
40
40
40
30
30
30
20
20
20
10
10
0
A
B
C
D
E
Pair of processes
(a) Type of intent covert channel
0
10
A
B
C
D
E
Pair of processes
(b) Volume settings covert channel
0
A
B
C
D
E
Pair of processes
(c) File lock covert channel
Figure 5 Five most active pairs of processes when measuring activity using the relative decision rule.
vol 32 no 2 March 2017
77
DETECTING LOCAL COVERT CHANNELS USING PROCESS ACTIVITY CORRELATION ON ANDROID SMARTPHONES
80 70
AF [%]
AF [%]
60 50 40
100
100
100
90
90
90
80
80
80
70
70
70
60
60
60
50
AF [%]
90
AF [%]
100
50
50
40
40
40
30
30
30
30
20
20
20
20
10
10
10
0
0
0
A
B
C
D
E
A
B
Pair of processes
C
D
E
10
A
B
D
0
E
A
B
Pair of processes
Pair of processes
(a) N = 2
C
(b) N = 5
C
D
E
Pair of processes
(c) N = 10
(d) N = 20
Figure 6 Five most active pairs of processes when measuring activity using the sliding window decision rule for the type of intent covert channel.
90
80
80
80
70
70
70
70
60
60
60
60
50
50
AF [%]
100
90
80
AF [%]
100
90
AF [%]
100
90
AF [%]
100
50
50
40
40
30
30
30
30
20
20
40
20
20
10
10
10
0
0
0
A
B
C
Pair of processes
(a) N = 2
D
E
A
B
C
D
E
40
10
A
B
C
D
E
0
A
Pair of processes
Pair of processes
(b) N = 5
B
C
D
E
Pair of processes
(c) N = 10
(d) N = 20
Figure 7 Five most active pairs of processes when measuring activity using the sliding window decision rule for the volume settings covert channel.
Table 5 Statistics for the pair of processes identified as colluding applications when using the sliding window decision rule.
N
2 5 10 20 2 5 10 20 2 5 10 20
Activity Total Activity counter counter Factor AC TC A F [%] Avg. σ Avg. σ Avg. σ Type of Intent covert channel 7.30 5.96 14.00 17.11 64.73 19.98 4.50 1.27 5.50 2.95 89.26 18.82 4.20 1.87 9.70 11.14 73.56 34.60 5.60 5.36 9.60 9.11 61.01 16.91 Volume Settings covert channel 9.50 2.68 18.90 15.67 67.94 28.92 7.60 2.32 9.70 2.50 79.01 13.69 12.80 6.23 21.10 10.41 63.83 15.70 4.60 0.97 7.50 2.01 62.83 8.84 File Lock covert channel 33.00 4.35 39.40 6.36 84.19 4.48 33.10 3.41 43.50 5.84 76.56 6.20 58.40 19.90 91.50 34.79 65.79 10.98 31.00 5.12 38.10 4.51 81.17 7.61
proposed approach. In this case, a possible solution could be using additional metrics, such as the CPU/memory usage statistics. Obviously, such aspects require a proper modeling. Alas, how the different rules to mark a process as active influence the A F is part of our ongoing research, as well as using some of the aforementioned per-process statistics.
78
4.2
Scenario 2: user activity
As said, the best performances in terms of spotted applications are achieved when active processes are identified with the threshold method and T = 1. Therefore, for the set of trials considering a busy smartphone, we limit our considerations to such a case. Furthermore, for the sake of brevity, we limit our investigation to the file lock covert channel, since it is the most difficult to recognize from a theoretical point of view, also due to its very limited bandiwdth. The duration of this round of tests is 15 minutes, which has been selected to have a proper time horizon as discussed in Section 4.1. The pattern characterizing the user is subdivided into 5 minute slots as follows: download of a video stream, Facebook and writing, and web browsing. Concerning the parameters used in our framework, the pair of processes implementing the colluding applications threat are characterized by, on the average, AC = 54.50 and σ = 2.12, TC = 94.80 and σ = 6.16, and A F = 57.65 and σ = 5.02. Figure 9 depicts the five processes having the highest concurrent activity and, also in this case, ‘A’ denotes the pair of colluding applications. As shown, even if the user is interacting with the device, the A F of the pair exploiting information hiding is the highest one, but its average value is lower than in the idle case. Moreover, pairs B - E achieve significantly higher A F compared to the idle case, e.g., B has A F > 40%. By analyzing the most correlated processes, it turns out that they are tightly coupled with the behavior of the user. For instance, pairs B and C aggressively interact due to the presence of the com.touchtype.swiftkey process, i.e., an on-screen keyboard used by the Facebook App and the Web Browser, respectively. Thus, also in this scenario, colluding applications can be spotted by simply evaluating the A F , yet its variance would lead to many false positives. Therefore, when in the presence of a busy de-
computer systems science & engineering
´ M. URBANSKI ET AL
80 70
AF [%]
AF [%]
60 50
100
100
100
90
90
90
80
80
80
70
70
70
60
60
60
50
AF [%]
90
AF [%]
100
50
50
40
40
40
40
30
30
30
30
20
20
20
20
10
10
10
0
A
B
C
D
0
E
A
Pair of processes
B
C
D
E
0
B
C
Pair of processes
Pair of processes
(a) N = 2
10
A
(b) N = 5
(c) N = 10
D
E
0
A
B
C
D
E
Pair of processes
(d) N = 20
Figure 8 Five most active pairs of processes when measuring activity using the sliding window decision rule for the file lock covert channel.
vice, the proposed approach should be supported by a minimal knowledge of the processes used by the Android OS to offer system-wide services. In other words, the “black box" methodology used for the idle case (i.e., the nature of the processes has not be considered) could partially reduce its performances when in the presence of applications tightly interacting with other software components. Luckily, having a sort of database of wellknown “clean" applications supposed to support other components is straightforward. For instance, if two applications have to communicate “by design", this can be inferred by evaluating the related manifests. In this case, both applications will have broadcast receiver/listener subscribed to a common intent. Hence, this knowledge can be used to increase the accuracy of the detection.
additional information or use artificial intelligence tools to take advantage of unknown relations. Instead, when operating online, the proposed approach is used to perform a real-time detection of the steganographic threat. Hence, its computing/memory requirements could not be too demanding, e.g., to not impair the phone or to avoid excessive battery drains. To this aim, sampling all the processes with a too aggressive t would be unfeasible. Luckily, using a sampling time of 2 seconds yielded to good results. As regards processes, a proper “pruning" procedure should be used, since sampling all the processes could be unfeasible. A possible idea, which is part of our ongoing research, is to adopt a sort of statistical sampling also by omitting well-known applications (e.g., system daemons), which could be assumed as “clean". Indeed, the latter could be inspected via third party tools like antivirus.
100 90 80
5.
CONCLUSIONS AND FUTURE WORKS
AF [%]
70 60 50 40 30 20 10 0
A
B
C
D
E
Pair of processes Figure 9 Five most active process pairs when activity is measured with the threshold rule and T = 1 for the file lock covert channel.
4.3
Implementation Issues
The proposed approach has to be intended as a possible “detection engine" for an Android tool able to spot information hidingbased threats exploiting the colluding application technique. In this perspective, it can be deployed both offline and online. When operating offline, the tool can spot (past) attacks by elaborating a trace containing the activity of processes. This requires to develop a sort of log of à la top. In this case, the computation could be intensive, but it can be made on a desktop to offload the device. Therefore, one might also imagine to tune the parameters (i.e., T , N, and suitable values for A F ) in order to augment the effectiveness of the technique, also by mixing
vol 32 no 2 March 2017
In this paper we presented a method to detect local covert channels on Android-based mobile devices exploiting the colluding applications architecture. To this aim, we introduced a condensed activity factor to be used with different decision rules to mark a process as active. Experimental results showcased the effectiveness of the proposed approaches to detect malware using low-attention raising local covert channels, i.e., those active when the user is idle. This is not a limitation, since they are at the basis of the most effective information hiding-based threats. Best performances have been achieved when using the decision rule based on a threshold, but at the price of having a method with some fragilities. Instead, the relative and the sliding window rules offer reduced detection rates, but a greater robustness against possible workarounds as they use self-adaptive parameters. Part of our ongoing research is aimed at removing some limitation of the framework. For instance a relevant effort will be devoted to test the approach against covert channels operating when the user actively performs some actions. In this case, we envisage also to exploit some form of behavioral-modeling, for instance to reduce possible false positives due to two legitimate processes aggressively interacting (e.g., the keyboard daemon and a client-interface). Unfortunately, the literature currently lacks of such approaches. Another future research item deals with the development of a mathematical model in to better understand limits of the approach and the maximum theoretical bandwidths, which can be achieved by two colluding applications without being spotted.
79
DETECTING LOCAL COVERT CHANNELS USING PROCESS ACTIVITY CORRELATION ON ANDROID SMARTPHONES
A relevant part of our efforts deals with the implementation of a production-quality Android tool exploiting the discussed correlation-based process as the engine to spot malware exfiltrating data. In this extent, a specific activity will be devoted to tune the parameters as well as to select processes to reduce the computational effort. Lastly, considering additional metrics, such as the CPU/RAM usage, jointly with the consumed energy, is another future research action. On one hand, it allows to consider asynchronous colluding applications, i.e., those cooperating without being concurrently active. On the other hand, such an enriched set of features could augment the effectiveness of the approach and further reduce the computational efforts needed to sample processes.
REFERENCES 1. A. Al-Haiqi, M. Ismail, R. Nordin, “A New Sensors-Based Covert Channel on Android", The Scientific World Journal, 2014. 2. B. Bencsáth, G. Pék, L. Buttyán, M. Félegyházi, “Duqu: A Stuxnet-like Malware found in the wild", CrySyS Lab Technical Report, Apr. 2014. 3. W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, A. N. Sheth, “TaintDroid: an Information-flow Tracking System for Realtime Privacy Monitoring on Smartphones", in Proc. of the 9th USENIX Symposium on Operating Systems Design and Implementation, pp. 393 - 407, Oct. 2010. 4. M. Guri, G. Kedma, A. Kachlon, Y. Elovici, “AirHopper: Bridging the air-gap Between Isolated Networks and Mobile Phones Using Radio Frequencies", in Proc. of the 9th International Conference on Malicious and Unwanted Software (MALWARE), pp. 58 - 67, Oct. 2014. 5. M. Hansen, R. Hill, S. Wimberly, “Detecting Covert Communication on Android", in Proc. of the 37th Annual IEEE Conference on Local Computer Networks, pp. 300 - 303, Oct. 2012. 6. J.-F. Lalande, S. Wendzel, “Hiding Privacy Leaks in Android Applications Using Low-Attention Raising Covert Channels", in Proc. of 1st International Workshop on Emerging Cyberthreats and Countermeasures, pp. 701 - 710, Sept. 2013. 7. H. Lau, “The Truth Behind the Shady RAT", online: http://www.symantec.com/connect/blogs/truth-behind-shadyrat, last accessed April 2016. 8. L. Li, A. Bartel, J. Klein, Y. L. Traon, S. Arzt, S. Rasthofer, E. Bodden, D. Octeau, P. Mcdaniel, R. Siegfried, “I know what Leaked in your Pocket: Uncovering Privacy Leaks on Android Apps with Static Taint Analysis", Technical Report, Apr. 2014. 9. R. Lipovsky, “Eset Analyzes Simplocker – first Android file-encrypting, tor-enabled Ransomware", online: http://www.welivesecurity.com/2014/06/04/simplocker, last accessed April 2016. 10. N. Lucena, G. Lewandowski, S. Chapin, “Covert Channels in IPv6", in G. Danezis, D. Martin, (Eds.), Privacy Enhancing Technologies", Lecture Notes in Computer Science", vol. 3856, pp. 147 - 166, Springer Berlin Heidelberg, 2006.
80
11. C. Marforio, H. Ritzdorf, A. Francillon, S. Capkun, “Analysis of the Communication Between Colluding Applications on Modern Smartphones", in Proc. of the 28th Annual Computer Security Applications Conference, pp. 51 - 60, Dec. 2012. 12. W. Mazurczyk, L. Caviglione, “Information Hiding as a Challenge for Malware Detection", IEEE Security & Privacy, vol. 13, no. 2, pp. 89 - 93, Mar 2015. 13. W. Mazurczyk, L. Caviglione, “Steganography in Modern Smartphones and Mitigation Techniques", IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 334 - 357, Jan. 2015. 14. S. J. Murdoch, S. Lewis, “Embedding Covert Channels into TCP/IP", in Proc. of the 7th International Conference on Information Hiding", pp. 247 - 261, 2005. 15. T. Ristenpart, E. Tromer, H. Shacham, S. Savage, “Hey, you, get off of my cloud: Exploring Information Leakage in third-party Compute Clouds", in Proc. of the 16th ACM Conference on Computer and Communications Security, pp. 199 - 212, 2009. 16. R. Schlegel, K. Zhang, X.-y. Zhou, M. Intwala, A. Kapadia, X. Wang, “Soundcomber: A Stealthy and Context-Aware Sound Trojan for Smartphones", in Proc. of the Network and Distributed System Security Symposium, pp. 17 - 33, Feb. 2011. 17. J. Segura, “Hiding in plain sight: a story about a Sneaky Banking Trojan", https://blog.malwarebytes.org/securitythreat/2014/02/hiding-in-plain-sight-a-story-about-a-sneakybanking-trojan/, last accessed April 2016. 18. Symantec Security Response, “Linux back door uses Covert Communication Protocol", http://www.symantec.com/connect/blogs/linux-back-door-usescovert-communication-protocol, last accessed April 2016. 19. Symantec Security Response, “Top-tier Espionage tool Enables Stealthy Surveillance", http://www.symantec.com/connect/blogs/regin-top-tierespionage-tool-enables-stealthy-surveillance, last accessed April 2016. 20. J. Wray, “An Analysis of Covert Timing Channels", in Proc. of IEEE Computer Society Symposium on Research in Security and Privacy, pp. 2 - 7. May 1991. 21. I. You, K. Yim, “Malware Obfuscation Techniques: A brief Survey", in Proc. of the International Conference on Broadband, Wireless Computing Communication and Applications, pp. 297 - 300, 2010. 22. E. Young, E. Ward, “Trojan.downbot", https://www.sym antec.com/security_response/writeup.jsp?docid=2011-0524131248-99, last accessed April 2016. 23. L. Caviglione, M. Gaggero, J. F. Lalande, W. Mazurczyk, M. Urba´nski, “Seeing the Unseen: Revealing Mobile Malware Hidden Communications via Energy Consumption and Artificial Intelligence", IEEE Transactions on Information Forensics and Security, vol. 11, no. 4, pp. 799 - 810, April 2016. 24. A. Merlo, M. Migliardi, L. Caviglione, “A Survey on EnergyAware Security Mechanisms", Pervasive and Mobile Computing, vol. 24, pp. 77 - 90, Dec. 2015.
computer systems science & engineering