Improving energy efficiency in distributed intrusion detection systems

Journal of High Speed Networks 19 (2013) 251–264 DOI 10.3233/JHS-130476 IOS Press


Mauro Migliardi a,∗ and Alessio Merlo b

a Centro Ingegneria Piattaforme Informatiche, University of Genoa and University of Padua, Genoa, Italy. E-mail: [email protected]
b Dipartimento di Ingegneria, E-Campus University, Novedrate, Italy. E-mail: [email protected]


Abstract. Recent studies show that malicious traffic has a significant impact on the behavior of networks; thus, early dropping of malicious packets could enhance network performance. Furthermore, recent developments in the field of green networking show that it is possible to modulate power consumption according to the traffic level. Intrusion Detection Systems burden the basic routing task with packet inspection; however, the resulting traffic reduction could have beneficial effects both on network performance and, with next-generation routing devices, on network energy consumption. In this paper we focus on energy consumption. We present an approach to distribute packet inspection over the network nodes and a model to evaluate the changes in network energy consumption according to early or late discovery and discard of rogue packets; we then apply our technique and model to evaluate, by means of simulation, how aggressive distributed intrusion detection could be beneficial in terms of network energy savings.


Keywords: Network security, green networking, distributed intrusion detection, simulation, evaluation

1. Introduction


Recent studies show that malicious traffic has a significant effect on the behavior of networks [21], even when it does not produce catastrophic network failures such as those caused by the widespread Nimda/RedCode infection [41]. Moreover, the capability of malicious activities to generate significant amounts of traffic is the basis of all anomaly-based malicious traffic detection systems; in fact, if malicious traffic were completely negligible compared to normal aggregate traffic, no anomalies would ever be generated. Furthermore, recent works on network threats ([22,23,33] and [8]) suggest that the number of controlled nodes participating in the malicious activities of botnets is dramatically increasing. This growth is a side effect of two major evolutionary trends.

The first trend is the explosive growth of Internet connectivity available to the masses and of the number of users that are “always on” [2]. This phenomenon has enormously increased the appeal of Internet-based services and applications, creating a very dynamic market; at the same time, however, it has multiplied the number of attack points available to users bent on mischief, malfeasance and crime, automatically increasing the likelihood that any kind of attack finds a weak spot. The second trend is the growth of mobile access through smartphones and tablets; in fact, our reliance on mobile access to data for daily activities and business [42], together with the vulnerability of such devices to coordinated attacks [39], makes mobile access networks a critical component and another very interesting target for attacks.

For these reasons, it is of paramount importance to achieve an early and complete sanitization of all network traffic. This means both dropping malicious packets before they reach their intended destinations, to prevent them from damaging those destinations, and removing such malicious traffic as soon as possible, to prevent it from squandering network resources. Such a sanitization will also prevent negative effects on higher-layer security solutions for distributed applications (e.g. [24]). Early sanitization of traffic, though, implies analyzing all the network traffic on the fly. For core networks this is a daunting task and, as it requires a non-negligible amount of computational power, it may introduce undesired delays in the core business of the network, i.e., the forwarding of packets.

In this paper we propose to leverage a fully distributed approach to the task of detecting network intrusions; we then model the energy consumption of both packet delivery and packet analysis, and we simulate different scenarios to show that aggressive intrusion detection may prove beneficial in terms of network energy consumption.

This paper is an extended version of the research work presented in [25]. In more detail, this paper provides a wider analysis of the related works, it introduces the concept of probabilistic recognition of malicious packets, distinguishing among different durations of the analysis process itself, and, finally, it presents a larger, more significant set of simulation results adopting realistic values for energy consumption and packet traffic.

This paper is structured as follows: in Section 2 we provide an introduction to Intrusion Detection Systems. In Section 3 we briefly summarize how the activities of an Intrusion Detection System could be parceled over several nodes, while in Section 4 we provide a model for assessing the energy consumption of a distributed analysis. In Section 5 we simulate several different scenarios and we evaluate the energy leakage due to late discovery of malicious traffic. Finally, in Section 6 we provide some concluding remarks.

* Corresponding author: Mauro Migliardi, Centro Ingegneria Piattaforme Informatiche (CIPI), University of Genoa and University of Padua, Via Opera Pia 13, 16145 Genoa, Italy. E-mail: [email protected].

0926-6801/13/$27.50 © 2013 – IOS Press and the authors. All rights reserved


2. Intrusion detection systems

The most commonly adopted instrument of network traffic sanitization is the Intrusion Detection System (IDS) [1,12,31,34]. A first categorization of intrusion detection systems distinguishes between Host Intrusion Detection Systems (HIDS) and Network Intrusion Detection Systems (NIDS). While the former are dedicated to the analysis of the traffic targeted at a single host (hence the name), the latter are dedicated to analyzing all the traffic traveling across the network; thus, only the systems in this second category are suited to the task of sanitizing the network traffic before it reaches its destination. Although there are different types of NIDS, it is generally possible to define two major categories based on how they perform the detection itself [3,16]. Thus, we can divide NIDS into:

• Anomaly detectors;
• Signature detectors.

The first category of NIDS works on the assumption that it is possible to define a statistical model that, taking into account the effects of daily, weekly and monthly variations, is capable of describing all the traffic that crosses the network in a normal situation. Starting from this assumption, any traffic that cannot be described by this model (i.e. any anomaly with respect to the statistical forecast) is considered abnormal and requires further analysis, because it is likely to be caused by a malicious intrusion. These NIDS may recognize the effects of threats never seen before, but they have two major weaknesses: first, they mostly recognize the effects of an intrusion, not the attempt itself; second, an exact definition of normalcy is very difficult to achieve, thus anomaly-based NIDS are prone to generating a large number of false positives. As an example, a perfectly legitimate change in network traffic caused by the opening of a very popular web site or service will cause an anomaly in network traffic and alert anomaly-based detectors.

The second category of NIDS is based on the same mechanism used by most virus control programs: a database of “signatures”. A signature is a pattern of bytes that has been identified as distinctive of a packet belonging to an intrusion attempt. These NIDS are capable of eliminating intrusion attempts before they have spread an infection, but they cannot detect new threats. Thus, they can stop any known worm from spreading, but they are mostly impotent when a new threat appears. To enhance the robustness of this category of NIDS, some implementations add “behavioral signatures”, i.e. they do not check just for a specific byte pattern in a packet, but they also check for unusual behavior [16].

All of these systems are dedicated to the recognition and elimination of malicious packets from network streams. In most cases, they are positioned in fringe networks and they concentrate on avoiding the delivery of malicious packets to their final targets. This positioning is commonly due to the amount of traffic that travels through central or core routers. In fact, the amount of CPU time consumed to perform a complete intrusion detection analysis on a packet is non-negligible and may introduce a significant delay in the process of packet forwarding. The introduction of such a delay is not a viable option, as it would disrupt most of the applications active on the Internet that require real-time packet delivery.

However, the protection provided by fringe-placed IDSs has two major limitations. First, while the IDS can prevent an attack endpoint from actually being hit by malicious packets, if the host has been compromised in some other way (such as a simple social engineering attack) the IDS is usually unable to prevent the compromised host from participating in further mischief, such as taking part in a Distributed Denial of Service attack. Second, the late drop of malicious packets does not allow any reduction of the amount of resources squandered in routing them.

In order to overcome this impasse, some projects have studied how to evolve IDSs from a monolithic, single-node infrastructure toward a distributed architecture capable of integrating the results of the analysis performed on several nodes. These systems are commonly called Distributed Intrusion Detection Systems (DIDS). In past works, several different approaches have been studied, e.g. the application of bio-inspired techniques such as Artificial Immune Systems [14], the application of Bayesian networks [19] and the application of Fuzzy Logic [32]. However, the focus of these projects is the enhancement of the detection capability of DIDS and their actual efficiency in countering distributed attacks such as Distributed Denial of Service; they do not directly address the capability to perform a complete analysis inside the nodes of a core network.

In the same direction, a dedicated Working Group of the IETF has written an RFC that defines the Intrusion Detection Message Exchange Format (IDMEF) [10]. This protocol aims at unifying the format with which different nodes involved in a coordinated effort to recognize distributed attacks exchange information. However, the protocol is XML based, thus it tends to generate large chunks of data that may by themselves constitute significant traffic and a burden to the network.

All these studies have focused on cooperative strategies only from the effectiveness perspective; in fact, their stated goal was to maximize the detection ratio while minimizing the occurrence of false positives. However, effectiveness is not the only aspect to take into account when dealing with DIDS; in fact, it is well known both that the energy consumption of network infrastructures is a growing problem [40], and that all of these components are computationally demanding and significantly energy consuming.

Other works have focused on accelerating the process of intrusion detection by parallelizing it. This goal has been pursued by means of dedicated parallel architectures such as FPGA [30] and array processors [20], or by providing memory-efficient algorithms for the string matching task [19].

None of these projects, though, aims at devising a methodology to parcel the computational cost of intrusion detection so that it can be performed in a distributed fashion among the nodes that the packet to be analyzed naturally traverses. A fully distributed approach to the task of intrusion detection would allow opportunistic tuning of the trade-off between the effort dedicated to fast packet switching and the effort dedicated to early recognition of packets belonging to intrusion attempts.

3. Distributed intrusion detection

There are many studies that present a distributed design for intrusion detection; among these we may cite Hi-dra [18], DIPS [15,38] and Intrusion Detection Force [37]. All of these adopt an approach where detection is distributed but decision and policy building are centralized. Starting from a different approach, the main idea behind our Distributed Intrusion Detection scheme is that every network node involved in packet routing/switching may perform a portion of the search for malicious packets on the traffic that flows through it, while the remaining portion of the analysis is delegated to the nodes further along the path. Thus, the load of detecting intrusion attempts among all the packets flowing from node A to node B is divided among a subset (that may coincide with the whole) of the nodes along the path.

There are two fundamental assumptions at the basis of this scheme. The first assumption is that a node can be modeled as a selector: each packet in an input queue has to be processed to select the correct output queue. The length of the input queue is a measure of the load of the router from a processing point of view, while the length of the output queue is independent of the processing load of the node. The second fundamental assumption is that the process of detecting an intrusion can be modeled as a visit to a Directed Acyclic Graph (DAG). This assumption derives from the analysis of the internal structure of one of the most widespread signature detection tools, namely SNORT [7]. According to its characteristics and network-level options, each packet navigates the DAG to be matched against the different misuse signatures. When a match is found, the packet is tagged as bad. If no match is found, the packet is tagged as legitimate. We assume that the navigation of the DAG may be suspended and resumed at a later stage; thus, a router may tag a partially checked packet with its current position in the DAG to allow a following router to resume the checking activity.

On the basis of these assumptions, there are two different flavors of the algorithm that selects the portion of traffic to be checked in each node. The first methodology tries to advance as much as possible in the intrusion detection check of every packet in the input queue. Each packet has an intrusion detection time quantum (IDQ) that is calculated on the basis of the current load of the router. The second methodology tries to complete the check of as many packets as possible.
At each intrusion detection step, the router calculates how much Slack Time (ST) it has, given the current state of its input queue. Once the ST has been calculated, the router performs the complete intrusion detection analysis of as many packets as it can in ST time. The packets detected as bad are immediately discarded; the other checked packets are tagged as legitimate and forwarded. In order to avoid locking up the packet forwarding process, and thus introducing excessive delay into it, if the value of ST allows performing the complete intrusion detection analysis of a packet, this analysis is still performed in steps of IDQ length. Once the allowed amount of complete analysis has been performed, a new ST value is computed. The check of the remaining packets is delegated to routers in future hops and, until the router has routed a number of packets equal to the Input Queue Length measured at the previous detection step, it does not engage in any further intrusion detection activity.

Obviously, the information about how far a packet has proceeded in the analysis DAG has to be transmitted to the next hop to continue the process; however, adding dedicated packets to forward this knowledge has two negative effects:

(1) Introducing additional traffic may induce additional congestion in a network that is already under attack;
(2) The IDS packet may get lost or be delayed, forcing the next hop to wait or ask for a retransmission.

Thus, we chose to propagate the information about how far the analysis has proceeded using the analyzed packets themselves as a vehicle. The IP protocol offers a way to attach additional information on how to handle datagrams, namely the IP Options field. Using an appropriate IP option, it is possible to add to a packet the information about the part of the analysis already conducted on it. Moreover, a host can also tell the next node along the path what kind of attack (involving more than one packet) it thinks could be happening.
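The suspend-and-resume mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the DAG is flattened into an ordered list of checks, and the names `TaggedPacket`, `position`, `verdict` and `analyze_quantum` are hypothetical. Each router runs at most `idq` checks and leaves in the packet the position from which the next hop should resume.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A check returns True when the payload matches a misuse signature.
Check = Callable[[bytes], bool]


@dataclass
class TaggedPacket:
    payload: bytes
    position: int = 0               # hypothetical tag: index of the next check to run
    verdict: Optional[str] = None   # None (undecided), "bad" or "legitimate"


def analyze_quantum(pkt: TaggedPacket, checks: List[Check], idq: int) -> TaggedPacket:
    """Run at most `idq` checks, starting from the position carried by the packet."""
    end = min(pkt.position + idq, len(checks))
    for i in range(pkt.position, end):
        if checks[i](pkt.payload):
            pkt.verdict = "bad"     # match found: a router would discard the packet here
            pkt.position = i
            return pkt
    pkt.position = end              # the next hop resumes from this check
    if end == len(checks):
        pkt.verdict = "legitimate"  # every check ran without a match
    return pkt
```

A packet that leaves a router with `verdict` still set to `None` is only partially checked; the following router calls `analyze_quantum` again with its own quantum and continues from `position`.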
Using IP options has an additional significant advantage: since they are already part of a known protocol, every IPv4-enabled machine can handle them. If a router is running software capable of handling the information, it will take the information contained in the option and start its analysis from that point. If it does not understand this protocol, it will simply forward the packet as is. This is very important, because there is no need for all the hosts on the network to understand this protocol; thus, it is possible to introduce this functionality inside core networks in a stepwise fashion without requiring a global hardware/software upgrade.

We expect the information about the advancement of the IDS analysis to travel unencrypted; in fact, encrypting it would introduce a significant burden on the network nodes. This solution, however, does not represent a security risk; in fact, it is easy to prevent tampering with that information by any malicious entity unless a router inside the core network is compromised. We can consider three possible ways of compromising the information included in a network packet:

(1) Manipulating the information inside the network to short-circuit the analysis;
(2) Removing the information inside the network to force the analysis to start from scratch;
(3) Putting false negative results in malicious packets at the source.


Case 1 causes the analysis to stop before the packet is recognized as malicious, so that the packet is forwarded to its destination. However, it requires control of the software inside a router in the core network, thus it can currently be considered unfeasible. Case 2 causes the node after the one where the manipulation takes place to spend more resources than necessary. This, if repeated, could force the network nodes to deplete their analysis resources and would reduce the detection capability of the whole system. However, once again, performing such a manipulation requires control of the software inside a core network router, thus we can consider this kind of attack unfeasible too. Finally, case 3 prevents any analysis of a malicious packet, allowing its unperturbed delivery to the target. This attack does not need a compromised core network router, as the information that the packet has already been checked and judged legitimate may be injected directly in the attacking source node. However, to prevent this kind of attack it is enough to clean any IDS information from every packet entering the core network. If this is done systematically, any node participating in the distributed intrusion detection can trust the information found in the packet itself, as it must come from another safe node inside the core network. Thus, even if this latter case is feasible, it is trivially thwarted. For further details about the workload distribution mechanism, the communication protocols involved and the cryptography used to protect the traveling information in case it has to traverse insecure nodes, see [27,28] and [29].
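The countermeasure to case 3 can be illustrated with a short sketch. It is only a toy model under stated assumptions: packets are represented as dictionaries, and the field name `ids_progress` is a hypothetical stand-in for the IP option carrying the IDS information, whose actual format is not specified here.

```python
from typing import Any, Dict

IDS_TAG = "ids_progress"  # hypothetical field name standing in for the IDS IP option


def scrub_at_ingress(packet: Dict[str, Any]) -> Dict[str, Any]:
    """Boundary router: return a copy of the packet without any IDS information."""
    return {key: value for key, value in packet.items() if key != IDS_TAG}


def resume_position(packet: Dict[str, Any]) -> int:
    """Core router: position to resume the analysis from (0 means start from scratch)."""
    return packet.get(IDS_TAG, 0)
```

A packet forged at the source with `ids_progress` already set would, after `scrub_at_ingress`, be analyzed from position 0 by the first core router that inspects it.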


4. A probabilistic model for energy consumption in distributed packet analysis


In this section, we briefly introduce a general model describing the energy consumption of a distributed intrusion detection system, already presented in [9] and [26]. We then extend it with probabilistic components able to model (1) the early detection of the state of a packet (i.e. marking the packet as good or bad without a complete analysis) and (2) false positives and false negatives in the detection activity.

A Core Network (as opposed to an Access Network) of an Internet Service Provider (ISP) is defined as a set of connected nodes. Each node supports different operations, among which we are interested in packet forwarding, storing and analysis. A packet reaching a boundary node (nodes A, B, C, D, E, F, G in Fig. 1) is forwarded through the network nodes to another boundary node (e.g. node G in Fig. 1), towards its destination. Each operation inside the network obviously has an energy cost. However, we are interested in modeling only the energy consumption related to intrusion detection analysis, to packet routing and delivery, and to the interaction of the two. Thus, we ignore any other cost such as, for instance, costs caused by network management.

In our model the core network is a set of links and nodes connected in any topology. As the basic function of the nodes is routing packets toward their destination, we assume that all of the nodes are routers with additional intrusion detection capabilities. Thus, we indicate with $RSet = \{R_k\}_{k \in \mathbb{N}}$ the set of routers in the core network and with $LSet = \{h_l\}_{l \in \mathbb{N}}$ the set of links (hops) between routers. As remarked, an IDS like SNORT [7] divides the analysis of a packet into a set of independent analysis units (au), such that, as shown in [28] and [29], the analysis of a packet can be carried out by different routers along the path to the destination.
Fig. 1. Routers in an internet service provider core network. (Colors are visible in the online version of the article; http://dx.doi.org/10.3233/JHS-130476.)

Given $Aus = \{au\}$, the set of all possible analysis units of an IDS, we define a single packet $p_i$ as the ordered sequence of analysis units that the IDS should execute for its security assessment: $p_i = [au_1, au_2, \ldots, au_{M_i}]$, where $au_j \in Aus$. We identify the packet in terms of analysis units only; therefore, two packets in the network that require analysis by the IDS through the same sequence of units are considered identical. Any subsequence of a packet is thus itself considered a packet; in other words, we refer to any sequence of analysis units as a packet. This model assumes that:


(1) the whole sequence of analysis units should be executed by the IDS in order to flag the packet as good or bad, and
(2) the analysis units must be executed in order, namely $au_j$ should be executed before $au_{j+1}$.

The execution of each analysis unit has an energy cost. Thus, we associate an energy value $E_{au_j}$ with the analysis unit $au_j$, and we define the energy cost (limited to the intrusion detection analysis) for packet $p_i$ as:

$$E_{p_i} = \sum_{j=1}^{M_i} E_{au_j}. \tag{1}$$

Analysis units are executed on routers. With the previous definition of packet, we model a router as a packet consumer. In more detail, given a packet $p_i = [au_1, au_2, \ldots, au_{M_i}]$ entering a router, the router processes the first part of the analysis sequence (e.g. $[au_1, au_2, \ldots, au_{j'}]$ for some $j'$) and forwards the remaining sequence (i.e. the packet $[au_{j'+1}, au_{j'+2}, \ldots, au_{M_i}]$) to some neighbor.

Since the main goal of each node in the core network is to deliver a packet to its destination without introducing excessive latency, only a part of the node capability can be used for intrusion detection purposes at any moment. Furthermore, the size of this part may change over time, depending on the workload of the router. Hence, we indicate with $E_{R_k}$ the amount of energy that, in a given moment, the IDS on router $R_k$ can use for analyzing the content of a packet. Therefore, given a packet $p_i$, the number of consecutive analysis units that the router can process is given by the maximum value $au^i_{R_k}$ such that $E_{R_k} \geq \sum_{j=1}^{au^i_{R_k}} E_{au_j}$. We refer to $au^i_{R_k}$ as the number of analysis units that the router $R_k$ can process for the packet $p_i$ in a given moment.

Besides the energy used for the analysis, it is also necessary to take into account the energy used for routing the packet from the entrance point (router A in Fig. 1) to the exit one (router G in Fig. 1). Given the set of links ($LSet = \{h_l\}_{l \in \mathbb{N}}$) and a packet $p_i$, we tag each link in $LSet$ with a value $E^i_{h_l}$ that represents the whole energy consumed by forwarding the packet $p_i$ on that link. Note that the activity of the router (both analysis and forwarding) has no effect on the size of the packet. For this reason, from the delivery perspective, the packet is considered invariant during its travel through the ISP. Therefore, the energy cost of delivery is independent of the order of the links. Thus, given a subset of $L$ links $H_L \subseteq LSet$ that connect A to B, we define the energy cost of forwarding the packet $p_i$ along such a path as:

$$E_{H_L} = \sum_{l=1}^{L} E^i_{h_l} \quad \text{for } h_l \in H_L. \tag{2}$$

The total energy consumption for packet analysis and routing in an ISP network (ISPN) with IDS analysis is globally defined as:

$$E_{TOT_i} = E_{p_i} + E_{H_L} = \sum_{j=1}^{M_i} E_{au_j} + \sum_{l=1}^{L} E^i_{h_l}. \tag{3}$$
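The quantities defined in Eqs. (1)-(3), together with the per-router analysis budget, can be illustrated with a small numerical sketch. The function names and the energy values used in the usage note are hypothetical and chosen only for illustration; `energy_if_dropped` is an added helper showing the deterministic case in which a bad packet is fully analyzed but discarded before traversing the whole path.

```python
from typing import Sequence


def analysis_energy(unit_costs: Sequence[float]) -> float:
    """Eq. (1): energy of all the analysis units of a packet."""
    return sum(unit_costs)


def routing_energy(link_costs: Sequence[float]) -> float:
    """Eq. (2): forwarding energy summed over the links of the path."""
    return sum(link_costs)


def total_energy(unit_costs: Sequence[float], link_costs: Sequence[float]) -> float:
    """Eq. (3): analysis energy plus routing energy."""
    return analysis_energy(unit_costs) + routing_energy(link_costs)


def units_within_budget(unit_costs: Sequence[float], budget: float) -> int:
    """Largest number of consecutive units whose cumulative cost fits the router budget."""
    spent = 0.0
    count = 0
    for cost in unit_costs:
        if spent + cost > budget:
            break
        spent += cost
        count += 1
    return count


def energy_if_dropped(unit_costs: Sequence[float], link_costs: Sequence[float],
                      links_traversed: int) -> float:
    """Full analysis cost plus routing cost of only the links actually traversed."""
    return analysis_energy(unit_costs) + routing_energy(link_costs[:links_traversed])
```

For example, with unit costs `[2, 3, 5]` and four links of cost `4` each, the total energy of a delivered packet is 26, while a bad packet discarded after the first link costs 14; a router with a budget of 6 can process the first two analysis units.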


This model is fully deterministic and assumes (see assumption (1)) that each packet must be fully checked in order to come to a decision on its state (i.e. good or bad). Thus, an energy-aware IDS strategy can only optimize the consumption related to the routing of bad packets (reducing $\sum_{l=1}^{L} E^i_{h_l}$), but it cannot reduce the costs related to analysis. Furthermore, the same assumption does not reflect how actual IDSs work. For instance, in SNORT each packet is iteratively compared with malware signatures.

It is important to notice that the current model assumes that the analysis activity is perfect, namely that the packet is eventually recognized as what it really is (good or bad), without considering false negatives or false positives. This is not completely adherent to reality; however, the fact that the analysis is not 100% accurate does not invalidate our model. In fact, the analysis in itself produces the same results in our scenario as in the case in which it is performed all at once in a network fringe node. Furthermore, a deeper, more sophisticated analysis capable of providing better accuracy may introduce changes in cost values but will not change the nature of the analysis itself.

According to our model, we can assume that each comparison corresponds to an analysis unit. In this case, all analysis units are checked only if (1) the packet is good and the IDS contains no rules to recognize a good packet, or (2) the packet is bad and its signature matches the last one in the signature DB of the IDS. The first case may occur; however, there are IP packet configurations that cannot support an intrusion attempt, and thus it is possible to tag a packet as good before having executed all the analysis steps. The second case is the very worst case in signature-based detection: in the average case, a bad packet may be recognized after 1/2 of the signature checks.
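The average-case figure quoted above can be checked with a short calculation. As an illustrative assumption (not stated in the text), suppose the signature database holds $N$ signatures and the matching signature of a bad packet is equally likely to be at any position $k$. The expected number of comparisons is then:

```latex
\mathbb{E}[\text{checks}] = \sum_{k=1}^{N} k \cdot \frac{1}{N}
  = \frac{1}{N} \cdot \frac{N(N+1)}{2}
  = \frac{N+1}{2} \approx \frac{N}{2}
```

With, say, $N = 1000$ signatures, this gives about 500 comparisons, i.e. roughly half of the worst case of 1000.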
To sum up, in signature-based IDSs, assuming that all checks are performed to decide whether a packet is good or bad is a conservative approach. Thus, modeling the energy consumption due to the discard of bad packets as something that, in any case, requires the full analysis cost is a conservative, worst-case assumption too. To soften this worst-case assumption, we need to extend the original model so that it is possible to relax assumption (1). To achieve this goal, we add to each analysis unit a probabilistic component that models the event that the specific analysis unit is capable of recognizing the packet as bad. To this aim, we provide a new definition of packet.

Definition 1. A packet $P_M$ of length $M$ is a triple $(S_{P_M}, PList_{P_M}, EList_{P_M})$, where:

• $S_{P_M}$ is the state of the packet (i.e. good or bad),
• $PList = [PB_1, \ldots, PB_M]$ is a list of probabilities related to the single analysis units. Each value $PB_j$ (
