A Game Theoretic Investigation of Deception in Network Security

Thomas E. Carroll
Pacific Northwest National Laboratory
902 Battelle Boulevard, P.O. Box 999, MSIN J4-45
Richland, WA 99352 USA
Email: [email protected]

Daniel Grosu
Department of Computer Science, Wayne State University
5143 Cass Avenue, Detroit, MI 48202 USA
Email: [email protected]
Abstract

We perform a game theoretic investigation of the effects of deception on the interactions between an attacker and a defender of a computer network. The defender can employ camouflage by either disguising a normal system as a honeypot, or by disguising a honeypot as a normal system. We model the interactions between defender and attacker using a signaling game, a non-cooperative two-player dynamic game of incomplete information. For this model, we determine which strategies admit perfect Bayesian equilibria. These equilibria are refined Nash equilibria in which neither the defender nor the attacker will unilaterally choose to deviate from their strategies. We discuss the benefits of employing deceptive equilibrium strategies in the defense of a computer network.
1 Introduction

Defenders can employ deception to increase the effectiveness of their defenses. Deception increases the attackers' uncertainty. Attackers respond to this uncertainty by expending additional resources (e.g., time, money) on reconnaissance and on developing situational awareness. Furthermore, deception impedes attackers from deploying highly effective, tailored attacks. Deception has a long history of effective use within the military and is now being deployed for the protection of information systems [7, 17, 20, 24].

It is common practice nowadays for a defender to deploy honeypots on her network. A honeypot is a computer system that serves as a trap to detect unauthorized accesses [27]. Unlike normal systems, honeypots produce a rich source of information detailing the attack methods used when attackers attempt to compromise them. This reason alone is why (rational) attackers avoid them [16]. Attackers may be able to determine if a system is a honeypot by considering clues such as slow I/O and other time delays, unusual system calls, temptingly obvious file names (e.g., "passwords"),
and the addition of data in memory [8]. To complicate the job of detecting honeypots, Rowe et al. [24] have proposed the use of "fake honeypots", which are normal systems that have been disguised to appear as honeypots. This is a form of deception in which objects are camouflaged or masked to appear as some other object [3]. Once fake honeypots are deployed, attackers will avoid compromising them, thinking that the systems are actually honeypots. As this defensive technique becomes common knowledge, attackers must expend additional resources to determine whether a system is a true honeypot or not.

In this work, we perform a game theoretic investigation of deception in network security. The scenario we examine is as follows. A defender deploys honeypots in her network. She can choose to employ camouflage, which means either disguising normal systems as honeypots or honeypots as normal systems. The attacker observes the system without being able to detect its real type. If the system is camouflaged, she observes the disguise; otherwise, she observes the system's actual type. The attacker then determines whether or not to proceed with compromising the system. We model the defender-attacker interaction as a signaling game, a dynamic game of incomplete information. A dynamic game is a game in which players take turns choosing their actions. In the scenario under study, the defender moves first by choosing whether or not to disguise the system, after which the attacker decides whether to compromise the system. The incomplete information arises from the attacker's uncertainty about the system's type. We determine and characterize the perfect Bayesian equilibria of the game. At an equilibrium, neither the defender nor the attacker has an incentive to unilaterally deviate by changing her strategy. We show that camouflage is an equilibrium strategy for the defender. Finally, we discuss the benefits of these deceptive equilibrium actions in defending a network.
1.1 Related Work

Recently, honeypots have become one of the main tools used to collect information about attacks. A honeypot is a system that is specifically set up to trap unauthorized accesses [27]. Unlike a normal system, a honeypot has rich logging facilities that record all activities and permit detailed analysis. Since its sole purpose is engaging hackers, any activity directed at it is by definition unauthorized. The Honeynet Project is a volunteer project that has deployed honeynets, networks of honeypots, around the world. One goal of the project is to collect data from these honeynets and to analyze and correlate activities in order to achieve early warning and prediction [25]. In their current form, honeypots have been successful in identifying large scale worm outbreaks [18], understanding denial-of-service attacks [19], and creating attack signatures [13].

Besides honeypots, other deception techniques have been proposed for defending information systems. Cohen [6] provides a comprehensive discussion of deception as a means to protect information systems. Among his many conclusions is that deception has a positive effect for the defenders and, conversely, a negative effect for the attackers. Cohen and Koike [7] showed how deception can control the path of an attack. Rowe et al. [24] showed how fake honeypots, normal systems disguised as honeypots, decreased the number of attacks a network witnesses. Several tools for evaluating honeypot deceptions were proposed in [23].

Game theory has been used for studying various security related problems. The existing work in this area is in its early stages and is mainly focused on very simple scenarios for intrusion
detection. Browne [4] proposed a game theoretic model for the coordinated attack and defense problem in military networks. The attackers' and defenders' interactions are modeled using a static noncooperative game. Syverson [26] proposed the use of game theory to model secure computations in distributed computing. In his model the distributed computation is viewed as a two-player noncooperative game between the 'good' network (composed of 'good' nodes) and the 'bad' network (composed of 'bad' nodes). It is suggested that a better model for distributed secure computation should be based on stochastic games. Lye and Wing [15] used a two-player stochastic game between the attacker and the system. They modeled a scenario in which an attacker attempts to launch a DoS (Denial of Service) attack and capture some important data from a machine in the network. They computed the Nash equilibria for the game that models this scenario. Hespanha and Bohacek [12] investigated a zero-sum routing game between the designer of the routing algorithm and an adversary that attempts to intercept packets. Notable is the work by Alpcan and Başar [1, 2] in which they modeled the interaction between the intrusion detection system and the attacker as a noncooperative game with dynamic information. For special games they obtained the Nash equilibrium solutions in closed form. Liu et al. [14] presented an incentive-based method for modeling the attacker's intent, objectives, and strategies. They discussed the possibility of using different types of games for modeling and inferring the attacker's intent. They limited their study to a Bayesian repeated game that models DDoS attack and defense. Grossklags et al. [11] provided a game-theoretic analysis of security investment decision-making. Garg and Grosu [9] developed a model for deception in honeynets using a dynamic game in extensive form. Patcha and Park [22] used a signaling game to study intrusion detection in ad hoc wireless networks. To the best of our knowledge, our current paper is the first work that models deception in computer network security as a signaling game.
1.2 Organization

This paper is organized as follows. In Section 2, we describe signaling games and the perfect Bayesian equilibrium solution concept. In Section 3, we model the defender-attacker interaction as a signaling game. In Section 4, we determine and characterize the equilibria of this game. In Section 5, we discuss how the defender can use the analysis in order to better protect her network and examine the benefits of camouflage in relation to the equilibria. Finally, we summarize the work and outline future directions in Section 6.
2 Signaling Games

The analysis performed in this paper is based on game theory, a subfield of economics that focuses on analyzing the interactions among decision makers. In our setting we have two decision makers, the defender and the attacker. The defender employs deception, which masks the type of her systems (i.e., normal system or honeypot). Due to the incomplete information, the attacker is uncertain upon initial inspection whether the system she is attacking is a normal system, which is beneficial to compromise, or a honeypot, which is harmful.
Figure 1: A signaling game.

The defender and the attacker play a dynamic game: the defender plays first, followed by the attacker. Most importantly, the attacker observes some information about the defender's action, which she uses to optimize her choice of action. Dynamic games are usually specified in extensive form. This form represents the game as a tree. Each decision node represents a state in which a player must choose an action. Each leaf gives a payoff for a particular sequence of choices. We model the interaction between the defender and the attacker as a signaling game [10], a non-cooperative two-player dynamic game of incomplete information. An extensive form representation of a basic signaling game is given in Fig. 1. A game of incomplete information can be transformed into a game of imperfect information by adding a hypothetical player called Nature (denoted by C here) and conditioning the payoffs on Nature's unknown moves. The Nature player moves first by randomly choosing the type of Player 1 from an a priori probability distribution over all of Player 1's types. This distribution is known by all players. In our example, Nature assigns type T with probability Pr(T) = P ∈ [0, 1] and type B with probability Pr(B) = 1 − P. The type is private to Player 1 and cannot be directly observed by Player 2. Once Player 1 learns her type, she decides what signal or message to send to Player 2. The signal provides indirect information to Player 2 about the type of Player 1. In our example, Player 1 can send either signal t (signaling that Player 1 is of type T) or b (signaling that Player 1 is of type B). Note that Player 1 can send signal t even in the case her real type is B, or send signal b even in the case her real type is T. Player 2 observes the signal and then responds with an action, either L or R. The set of decision nodes is partitioned into information sets. An information set is a set of one or more decision nodes of a player that determines the possible moves conditioned on what the player has observed so far. In other words, two or more decision nodes are part of the same information set if the player whose turn it is to move cannot distinguish between them. Decision nodes belonging to the same information set are indicated by a dotted line connecting them. Player 1 has two information sets, one where Nature assigns T and the other where Nature assigns B. These are singleton information sets since Player 1 knows her type selected by the Nature player. Player 2 also has two information sets, one where she receives signal t and the other where she receives signal b. These two sets are each composed of two nodes, since Player 2 does not
know the real type of Player 1 when she makes her decision of playing L or R. The game in our example has eight outcomes. One example of an outcome is Nature assigning T to Player 1, Player 1 sending b, and Player 2 responding with L. Each outcome results in a payoff. A payoff measures the benefit that each player earns if that outcome is reached. Payoffs corresponding to outcome i are represented as tuples (xi, yi), the first component being Player 1's payoff and the second being Player 2's payoff.

A strategy is a plan that accounts for all contingencies and consists of one action for each information set. Continuing with the above example, one of Player 1's strategies is to send t if she is of type B, and to send b if she is of type T. For Player 2, one strategy is to respond with L if she receives t, and to respond with R if she receives b. Since Player 1 has two information sets and two signals, she has 2^2 = 4 strategies. Similarly, Player 2 has 2^2 = 4 strategies. The players are self-interested and welfare-maximizing, which induces them to choose strategies that maximize their payoffs. A strategy profile consists of a tuple of strategies, one strategy for each player. When investigating non-cooperative games, we are interested in strategy profiles that result in equilibria.

Definition 1 (Nash equilibrium) A Nash equilibrium is a strategy profile in which no player can improve her payoff by unilaterally changing her strategy.

A Nash equilibrium results in a steady state in which each player chooses not to deviate, as doing so reduces her payoff. Nash proved that a game with a finite set of players, each having a finite set of actions, has at least one equilibrium (possibly in mixed strategies) [21].

In a signaling game, an information set can be on the equilibrium path or off the equilibrium path. The equilibrium path determined by a given equilibrium is composed of all the decision nodes reached by following the equilibrium strategies. An information set is on the equilibrium path if it is reached with positive probability when the equilibrium strategies are played, and it is off the equilibrium path if it is reached with zero probability. Certain Nash equilibria result in unlikely plays at information sets off the equilibrium path. The perfect Bayesian equilibrium (PBE) is a refined equilibrium concept that eliminates the possibility of players threatening to play strategies which are strictly dominated at any information set off the equilibrium path. Besides strategy profiles, PBE requires that players have beliefs. A player has beliefs about which decision node within an information set has been reached. Beliefs are represented as a probability distribution over the set of decision nodes within an information set. In Fig. 1, Player 2 assigns belief q = Pr(T|t) to the upper information set and p = Pr(T|b) to the lower information set. That is, at the upper information set Player 2's belief is that Player 1 is of type T with probability q and of type B with probability 1 − q. Similarly, at the lower information set Player 2's belief is that Player 1 is of type T with probability p and of type B with probability 1 − p. Gibbons [10] defines perfect Bayesian equilibria as strategy profiles and beliefs that satisfy the following four requirements.

Requirement 1 At each information set, the player with the move must have a belief about which decision node in the information set has been reached.
Requirement 2 The action taken at an information set by the player with the move must be optimal given the player's belief at the information set and the player's subsequent moves. This is referred to as sequential rationality.
Requirement 3 At information sets on the equilibrium path, beliefs are determined by Bayes' law and the players' equilibrium strategies.

Requirement 4 At information sets off the equilibrium path, beliefs are determined by Bayes' law and the players' equilibrium strategies where possible.

Definition 2 (Perfect Bayesian equilibrium) A perfect Bayesian equilibrium consists of a strategy profile and beliefs that satisfy Requirements 1 to 4.

In plain language, the four requirements state that the players have beliefs (Requirement 1), that they choose moves based on those beliefs (Requirement 2), and that the beliefs are correct (Requirements 3 and 4). Requirements 3 and 4 guarantee that the equilibrium strategies and the beliefs are consistent with each other.

For signaling games, a PBE in which the sender gives the same signal regardless of type is called a pooling equilibrium. Pooling equilibria require that Player 2 have two sets of beliefs, one for the information set on the equilibrium path and one for the information set off the equilibrium path. The beliefs on the equilibrium path are given by probability p = P (respectively q = P), and the beliefs off the equilibrium path are given by q ∈ [0, 1] (respectively p ∈ [0, 1]). A separating equilibrium is a PBE in which the sender gives a unique signal for each type. Player 2's beliefs are such that p = 1 and q = 0, or q = 1 and p = 0. Finally, a hybrid strategy equilibrium is a PBE in which Player 1 of type T chooses t with certainty and Player 1 of type B randomizes between pooling with t and separating with b, or in which Player 1 of type B chooses b with certainty and Player 1 of type T randomizes between pooling with b and separating with t. In the first case q ∈ (0, 1) and p is either zero or one, while in the second case p ∈ (0, 1) and q is either zero or one.

In the remainder of this work, we examine the defender-attacker interaction by modeling it as a signaling game. We then characterize the perfect Bayesian equilibria of this game and draw conclusions on the defender's strategies.
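As an illustration of Requirement 3, the following Python sketch computes Player 2's on-the-equilibrium-path belief by Bayes' law from the prior P and a given (possibly mixed) signaling strategy of Player 1. The function and the numerical values are illustrative only and are not taken from the model above.

    # Illustrative sketch (not from the paper): Bayes-consistent beliefs for Player 2.
    # prior: probability that Player 1 is of type T, i.e., Pr(T) = P.
    # strategy: Pr(signal | type) for types 'T' and 'B' and signals 't' and 'b'.
    def belief(prior, strategy, signal):
        """Return Pr(T | signal); None if the signal has zero probability (off path)."""
        pr_signal_T = strategy['T'][signal]
        pr_signal_B = strategy['B'][signal]
        total = prior * pr_signal_T + (1.0 - prior) * pr_signal_B
        if total == 0.0:
            return None  # off the equilibrium path: Bayes' law places no restriction
        return prior * pr_signal_T / total

    # Example: a pooling strategy in which both types send t.
    pooling_on_t = {'T': {'t': 1.0, 'b': 0.0}, 'B': {'t': 1.0, 'b': 0.0}}
    print(belief(0.3, pooling_on_t, 't'))  # q = Pr(T|t) = 0.3, i.e., the prior P
    print(belief(0.3, pooling_on_t, 'b'))  # None: signal b is off the equilibrium path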
3 The Deception Game

A defender protects her network by deploying honeypots, traps to detect unauthorized access. To improve efficacy, she camouflages her systems: she can disguise normal systems as honeypots and honeypots as normal systems. Rowe et al. [24] showed that this technique effectively enhances computer network security. After the network is created, an attacker attempts to compromise its systems. The attacker can successfully compromise normal systems, but not honeypots. If the attacker attempts to compromise a honeypot, the defender observes the actions and can later improve her defenses. We model this interaction between defender and attacker as the signaling game presented in Fig. 2. The notation used in this paper is summarized in Tab. 1.

The game begins when the attacker attempts to compromise a system on the defender's network. Nature (C) chooses the type of the system that the defender will protect: either honeypot (H) or normal system (N), with probability Ph and 1 − Ph, respectively. We interpret Nature assigning the type as the attacker randomly selecting a system within the network to compromise. The defender can signal that the system is a honeypot (h) or a normal system (n), independent of the system's actual type.
Figure 2: The deception signaling game.
A "fake honeypot" [24] is a system of type N for which the defender signals h. The attacker receives the signal and then chooses an action. The attacker will either attack the system without determining the system type (A), condition the attack on determining the system type (T), or retreat (W). Figure 2 shows that there are twelve potential outcomes.

We assume that the costs of setting up the systems, incorporating camouflage, and managing the systems are fixed costs and therefore are not considered in the defender's decision making process. We further assume that normal systems generate revenue for the defender and that honeypots do not; in this work we exclude the possibility of a firm running honeypots for profit. If one of the normal systems is compromised, the defender incurs a loss of cc. This includes both the cost due to loss of business and the cost of restoring the system. If the attacker attempts to compromise a honeypot, the defender observes these actions, learning how to improve her defenses. In this case the defender gains vo, the benefit from observing the attacker. If the attacker retreats before attacking a normal system, the defender gains vg.

The attacker incurs cost ca when attacking a system, irrespective of success. If the attack succeeds, she gains va, leading to a "profit" of va − ca. The attacker additionally loses co when attacking a honeypot. The attacker has tests to determine whether a system is normal or not. The tests to determine if a system is normal or a honeypot cost cn and ch, respectively. After receiving the results of a test, the attacker either moves forward with her attack or abandons the attempt. If she tests a system that turns out to be a honeypot, she incurs either cn or ch, both of which are less than ca + co. If she tests a system that turns out to be normal, her payoffs are either va − ca − cn or va − ca − ch.

The strategies of each player are given in Tab. 2. The defender has four pure (deterministic) strategies.
Table 1: Deception game notation.

  Defender
    H    System is a honeypot
    N    System is normal (i.e., not a honeypot)
    h    Signal that the system is a honeypot
    n    Signal that the system is normal
    vo   Benefit of observing an attack on a honeypot (vo > 0)
    vg   Benefit of avoiding an attack on a normal system (vg ≥ 0)
    cc   Cost due to a compromised normal system (cc > 0)
  Attacker
    A    Attack without determining the system type
    T    Condition the attack on determining the system type
    W    Do not attempt an attack
    Ph   Probability that the attacked system is a honeypot (Ph ∈ [0, 1])
    q    Attacker's belief at the upper information set that the system is a honeypot (q ∈ [0, 1])
    p    Attacker's belief at the lower information set that the system is a honeypot (p ∈ [0, 1])
    ca   Cost of compromising a system (ca ≥ 0)
    cn   Cost of testing if a system is normal (cn ∈ [0, ca])
    ch   Cost of testing if a system is a honeypot (ch ∈ [0, ca])
    co   Cost due to being observed (co > 0)
    va   Benefit of compromising a normal system (va ≥ ca)
Table 2: The players' pure strategies.

  Defender (signal sent for each system type)
    type    s1    s2    s3    s4
    N       n     n     h     h
    H       n     h     n     h

  Attacker (response to each received signal)
    signal  t1    t2    t3    t4    t5    t6    t7    t8    t9
    n       A     A     A     T     T     T     W     W     W
    h       A     T     W     A     T     W     A     T     W
Each defender strategy specifies a signal for each system type. Strategies s1 and s4 signal normal system (n) and honeypot (h), respectively, regardless of the actual system type. In s2 the defender signals the actual type of the system, while in s3 she signals the opposite of the actual type, e.g., the defender signals honeypot if the system is normal. The attacker has nine pure strategies, each giving a plan of action for every signal that she can receive. In the case of strategy t4, the attacker plays T if she receives signal n and plays A if she receives h.
4 Deception Game's Equilibria

In this section we examine the deception game for the existence and properties of its perfect Bayesian equilibria (PBE). We choose PBE as it provides good predictions of how the deception game will be played by the attacker and the defender. We first examine the game for pure strategy PBE, either separating or pooling, and characterize the beliefs necessary for their existence. We then perform a similar analysis for the game's hybrid strategy PBE.
4.1 Separating Equilibria

We first examine the existence of separating equilibria. This reduces to examining the possible equilibria involving the defender's strategies s2 and s3. We begin by examining whether any equilibrium contains strategy s2 (reveal the true system type). When faced with this strategy, the attacker chooses A if she receives n as the signal, since the payoff va − ca obtained by playing A is greater than the payoff va − ca − ch obtained by playing T and the zero payoff obtained by playing W. When faced with h, she responds with W, as the resultant payoff of zero is greater than A's payoff of −ca − co and T's payoff of −cn. The attacker thus plays A and W upon receiving n and h, respectively, which is strategy t3. The defender's best response to t3 is s4: by sending h instead of n, the defender increases her payoff from −cc to vg. Turning to the defender's strategy s3, the attacker's best response is t7: she selects action W when she receives n and A when she receives h. In response to t7, the defender switches to strategy s2 as, again, her payoff for normal systems increases from −cc to vg. Consequently, this game does not have any separating equilibria, as neither s2 nor s3 results in a steady state.
4.2 Pooling Equilibria

We now investigate the existence of pooling equilibria. Potential equilibria must involve either strategy s1 or s4. If the defender plays strategy s1, the attacker receives signal n and will choose A if that action results in an expected payoff greater than the expected payoff of her other two actions. The attacker's expected payoff for playing A must be greater than or equal to the expected payoff for playing T,

    Ph(−ca − co) + (1 − Ph)(va − ca) ≥ Ph(−ch) + (1 − Ph)(va − ca − ch),

which gives

    Ph ≤ ch / (ca + co).    (1)
The attacker's expected payoff for playing A must also be greater than or equal to the expected payoff for playing W,

    Ph(−ca − co) + (1 − Ph)(va − ca) ≥ 0,

which gives

    Ph ≤ (va − ca) / (va + co).    (2)
We now need to determine the beliefs and actions for the off-equilibrium path of sending signal h. First, strategy t3 does not lead to a steady state, as the defender's best response is s3, not s1. The defender changes her response because signaling h results in a payoff of vg, which is greater than the payoff of −cc when she sends n. The attacker responds with t1 if she has a set of beliefs for the off-equilibrium path, q, that gives the maximum expected payoff when playing A. Thus, the expected payoff obtained by playing A should be greater than or equal to the expected payoff obtained by playing T. Therefore, q must satisfy

    q(−ca − co) + (1 − q)(va − ca) ≥ q(−cn) + (1 − q)(va − ca − cn),

which gives

    q ≤ cn / (ca + co).    (3)
The expected payoff obtained by playing A should also be greater than or equal to the expected payoff obtained by playing W. Therefore q must also satisfy

    q(−ca − co) + (1 − q)(va − ca) ≥ 0,

which gives

    q ≤ (va − ca) / (va + co).    (4)
Similarly, the attacker responds with t2 if she has a set of beliefs that gives the maximum expected payoff when playing T. The attacker's expected payoff obtained by playing T is greater than or equal to the payoff obtained by playing A if

    q(−cn) + (1 − q)(va − ca − cn) ≥ q(−ca − co) + (1 − q)(va − ca),

which is equivalent to

    q ≥ cn / (ca + co).    (5)
The attacker's expected payoff obtained by playing T is greater than or equal to the payoff obtained by playing W if

    q(−cn) + (1 − q)(va − ca − cn) ≥ 0,

which is equivalent to

    q ≤ 1 − cn / (va − ca).    (6)
The attacker responds with T if the expected payoff obtained by playing T is greater than or equal to the one obtained by playing A,

    Ph(−ch) + (1 − Ph)(va − ca − ch) ≥ Ph(−ca − co) + (1 − Ph)(va − ca),

which gives

    Ph ≥ ch / (ca + co),    (7)
and if the expected payoff obtained by playing T is greater than or equal to the one obtained by playing W,

    Ph(−ch) + (1 − Ph)(va − ca − ch) ≥ 0,

which gives

    Ph ≤ 1 − ch / (va − ca).    (8)
Neither t4 nor t6 leads to a steady state. If the attacker plays t4, the defender switches to s2 as her payoff increases from 0 to vg. If the attacker plays t6, the defender's best response is s3. Strategy t5 leads to an equilibrium if the off-equilibrium path beliefs satisfy conditions (5) and (6).

The attacker responds with W when she receives signal n if the expected payoff obtained by playing W is at least as large as the expected payoff obtained by playing A,

    0 ≥ Ph(−ca − co) + (1 − Ph)(va − ca),

which gives

    Ph ≥ (va − ca) / (va + co).    (9)
The payoff obtained by playing W should also be at least as large as the expected payoff obtained by playing T,

    0 ≥ Ph(−ch) + (1 − Ph)(va − ca − ch),

which gives

    Ph ≥ 1 − ch / (va − ca).    (10)
Strategies s1 and t8 together with beliefs that satisfy (5) and (6) result in a PBE. The attacker plays W on the off-equilibrium path (strategy t9) if she has a set of beliefs for which the expected payoff obtained by playing W is at least as large as the one obtained by playing A,

    0 ≥ q(−ca − co) + (1 − q)(va − ca),

which gives

    q ≥ (va − ca) / (va + co),    (11)
and at least as large as the payoff obtained by playing T,

    0 ≥ q(−cn) + (1 − q)(va − ca − cn),

which gives

    q ≥ 1 − cn / (va − ca).    (12)
If A is played on the off-equilibrium path, the defender switches to strategy s2. Thus, t7 is not part of a steady state. The other pooling equilibria, those incorporating s4, can be found following a similar methodology. We summarize the potential equilibria and their conditions in Tab. 3 and the expected payoffs in Tab. 4. We denote by Ei (i = 1, . . . , 10) the pooling perfect Bayesian equilibria (PBE). The table gives each equilibrium as a tuple (si, ti, p, q), where si is the defender's strategy, ti is the attacker's strategy, and p, q are the beliefs of the attacker. For an equilibrium to exist, the beliefs must satisfy both the on-equilibrium and off-equilibrium path conditions. If no equilibria exist, the players will alternate between strategies and the outcome of the game becomes uncertain. Some of the equilibria have identical on-equilibrium paths. A unique path is needed to have a unique outcome. Six outcomes are given by the ten equilibria: the on-equilibrium paths of E1 and E2 are identical, as are those of E4 and E5, E6 and E7, and E9 and E10.
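To make the analysis above concrete, the following Python sketch encodes the attacker's expected payoffs and the defender's payoffs described in Section 3 and checks, for given parameter values, which pooling profiles survive the attacker best-response and defender no-deviation tests. The parameter names mirror the notation of Tab. 1; the enumeration procedure is our own sketch rather than part of the original derivation, and the numerical values in the example run (including vo, vg, and cc, which Case Study 1 leaves unspecified) are purely illustrative.

    # Illustrative sketch: checking the pooling equilibria of the deception game.
    # Notation follows Tab. 1; the concrete numbers below are only an example.

    def attacker_best_responses(b, test_cost, va, ca, co):
        """Attacker's best responses at an information set where she believes the
        system is a honeypot with probability b and testing costs test_cost."""
        payoff = {
            'A': b * (-ca - co) + (1 - b) * (va - ca),                # attack blindly
            'T': b * (-test_cost) + (1 - b) * (va - ca - test_cost),  # test first
            'W': 0.0,                                                 # retreat
        }
        best = max(payoff.values())
        return {a for a, u in payoff.items() if u >= best - 1e-12}

    def defender_payoff(sys_type, attacker_action, vo, vg, cc):
        """Defender's payoff as a function of the true type and the attacker's action."""
        if sys_type == 'N':
            return {'A': -cc, 'T': -cc, 'W': vg}[attacker_action]
        return {'A': vo, 'T': 0.0, 'W': 0.0}[attacker_action]

    def pooling_equilibria(Ph, va, ca, co, cn, ch, vo, vg, cc, grid=1001):
        """Enumerate pooling profiles (pooled signal, on-path response, off-path response)
        that satisfy attacker sequential rationality and defender no-deviation."""
        results = []
        for pooled in ('n', 'h'):
            # Test cost depends on the received signal: ch if signaled normal, cn if signaled honeypot.
            on_cost, off_cost = (ch, cn) if pooled == 'n' else (cn, ch)
            on_best = attacker_best_responses(Ph, on_cost, va, ca, co)
            for r_on in on_best:
                for r_off in ('A', 'T', 'W'):
                    # Does some off-path belief in [0, 1] rationalize r_off?
                    supported = any(r_off in attacker_best_responses(i / (grid - 1), off_cost, va, ca, co)
                                    for i in range(grid))
                    # Neither defender type may gain by deviating to the other signal.
                    no_dev = all(defender_payoff(t, r_on, vo, vg, cc) >=
                                 defender_payoff(t, r_off, vo, vg, cc) for t in ('N', 'H'))
                    if supported and no_dev:
                        results.append((pooled, r_on, r_off))
        return results

    # Example run with the illustrative values of Case Study 1 (vo, vg, cc chosen arbitrarily):
    print(pooling_equilibria(Ph=0.10, va=5.0, ca=3.0, co=0.10, cn=0.50, ch=0.25,
                             vo=1.0, vg=1.0, cc=1.0))
    # Expected to report pooling on n with (T, T) and pooling on h with (A, A) and (A, T),
    # i.e., the analogues of E3, E6, and E7 in Tab. 3.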
4.3 Hybrid Strategy Equilibria

We now investigate the deception game for hybrid strategy equilibria. The defender in a hybrid strategy equilibrium mixes a pooling strategy with a separating strategy. We first examine cases in which the attacker plays a pure strategy. Let us consider the case of t1, in which the attacker plays A in response to either signal. For A to be the best response to n, its payoff must be greater than the payoffs of the other choices. The attacker's beliefs must satisfy the conditions p ≤ ch/(ca + co) and p ≤ (va − ca)/(va + co) (the conditions obtained in the computation of the pure strategy equilibrium E1). The defender of type H will randomize if her expected payoffs of signaling n and h are equal. These expected payoffs are equal, as can be observed directly from the definition of the game. Furthermore, the attacker must have beliefs given by q = 1. But the choice of A does not maximize the attacker's payoff; she would prefer W. Let us now consider the case in which the attacker responds with W to either signal, which is strategy t9. Her beliefs are p ≥ (va − ca)/(va + co) and p ≥ 1 − ch/(va − ca). The defender of type H will randomize between h and n, while the defender of type N will signal n.
Table 3: Pooling equilibria and their conditions.

  E1: (s1, t1, p = Ph, q)
      on-equilibrium path:  Ph ≤ ch/(ca + co) and Ph ≤ (va − ca)/(va + co)
      off-equilibrium path: q ≤ cn/(ca + co) and q ≤ (va − ca)/(va + co)
  E2: (s1, t2, p = Ph, q)
      on-equilibrium path:  Ph ≤ ch/(ca + co) and Ph ≤ (va − ca)/(va + co)
      off-equilibrium path: q ≥ cn/(ca + co) and q ≤ 1 − cn/(va − ca)
  E3: (s1, t5, p = Ph, q)
      on-equilibrium path:  Ph ≥ ch/(ca + co) and Ph ≤ 1 − ch/(va − ca)
      off-equilibrium path: q ≥ cn/(ca + co) and q ≤ 1 − cn/(va − ca)
  E4: (s1, t8, p = Ph, q)
      on-equilibrium path:  Ph ≥ (va − ca)/(va + co) and Ph ≥ 1 − ch/(va − ca)
      off-equilibrium path: q ≥ cn/(ca + co) and q ≤ 1 − cn/(va − ca)
  E5: (s1, t9, p = Ph, q)
      on-equilibrium path:  Ph ≥ (va − ca)/(va + co) and Ph ≥ 1 − ch/(va − ca)
      off-equilibrium path: q ≥ (va − ca)/(va + co) and q ≥ 1 − cn/(va − ca)
  E6: (s4, t1, p, q = Ph)
      on-equilibrium path:  Ph ≤ cn/(ca + co) and Ph ≤ (va − ca)/(va + co)
      off-equilibrium path: p ≤ ch/(ca + co) and p ≤ (va − ca)/(va + co)
  E7: (s4, t4, p, q = Ph)
      on-equilibrium path:  Ph ≤ cn/(ca + co) and Ph ≤ (va − ca)/(va + co)
      off-equilibrium path: p ≥ ch/(ca + co) and p ≤ 1 − ch/(va − ca)
  E8: (s4, t5, p, q = Ph)
      on-equilibrium path:  Ph ≥ cn/(ca + co) and Ph ≤ 1 − cn/(va − ca)
      off-equilibrium path: p ≥ ch/(ca + co) and p ≤ 1 − ch/(va − ca)
  E9: (s4, t6, p, q = Ph)
      on-equilibrium path:  Ph ≥ (va − ca)/(va + co) and Ph ≥ 1 − cn/(va − ca)
      off-equilibrium path: p ≥ ch/(ca + co) and p ≤ 1 − ch/(va − ca)
  E10: (s4, t9, p, q = Ph)
      on-equilibrium path:  Ph ≥ (va − ca)/(va + co) and Ph ≥ 1 − cn/(va − ca)
      off-equilibrium path: p ≥ (va − ca)/(va + co) and p ≥ 1 − ch/(va − ca)

Table 4: The expected payoffs of the pooling equilibria.

  equilibrium   Defender                  Attacker
  E1            Ph vo − (1 − Ph)cc        (1 − Ph)(va − ca) − Ph(ca + co)
  E2            Ph vo − (1 − Ph)cc        (1 − Ph)(va − ca) − Ph(ca + co)
  E3            −(1 − Ph)cc               (1 − Ph)(va − ca − ch) − Ph ch
  E4            (1 − Ph)vg                0
  E5            (1 − Ph)vg                0
  E6            Ph vo − (1 − Ph)cc        (1 − Ph)(va − ca) − Ph(ca + co)
  E7            Ph vo − (1 − Ph)cc        (1 − Ph)(va − ca) − Ph(ca + co)
  E8            −(1 − Ph)cc               (1 − Ph)(va − ca − cn) − Ph cn
  E9            (1 − Ph)vg                0
  E10           (1 − Ph)vg                0
Since p is on the equilibrium path, it must satisfy Requirement 3, which states that beliefs are determined by Bayes' law and the players' equilibrium strategies. The probability that the defender of type H plays n is then

    Pr(n|H) = ((1 − Ph) / Ph) · (p / (1 − p)).    (13)
This is equilibrium E11 in Tab. 5, in which the defender plays strategy s1 with probability Pr(n|H) and s2 with probability 1 − Pr(n|H). Now consider the case in which the defender of type N randomizes between h and n, and the attacker responds with t1. The attacker's beliefs are as stated above. The expected payoff for the defender of type N sending n is equal to the payoff of her sending h, and the defender of type H is indifferent between signals. Equilibrium E12 exists if

    Pr(n|N) = (Ph / (1 − Ph)) · ((1 − p) / p).    (14)
The other equilibria in which the defender randomizes between signaling h and n and the attacker plays a pure strategy can be determined using a similar analysis. All these equilibria and their conditions are given in Tab. 5 and the resulting expected payoffs in Tab. 6. We denote by Ei (i = 11, . . . , 14) the hybrid strategy equilibria. The table gives each equilibrium as a tuple ((si, sj | pr1, pr2), ti, p, q), where si, sj are the defender's strategies, pr1 is the probability of playing si, pr2 is the probability of playing sj, ti is the attacker's strategy, and p, q are the beliefs of the attacker.

Now consider the cases in which the attacker randomizes over two responses. The defender of type H will randomize if the attacker plays T and W in response to n (respectively h) and W in response to h (respectively n). The defender of type N will send h instead of n, since when signaling h her payoff of vg is greater than the payoff of (1 − p)(vg − cc) obtained when signaling n. Also, the defender of type N will send n instead of h, since when signaling n her payoff of vg is greater than the payoff of (1 − p)(vg − cc) obtained when signaling h. Now suppose the defender of type N is to randomize. The attacker would choose the pure strategy A since it maximizes her payoff. But this choice induces the defender of type H to deviate (vo > (1 − p)vo). Therefore, there is no equilibrium in which the attacker mixes two strategies.

Due to similar conditions, E11 is likely to coexist with either E1 or E2 or both. Since the expected payoffs are equal, the hybrid strategy equilibrium and the pooling equilibrium are strategically equivalent, i.e., the hybrid strategy equilibrium is neither better nor worse than the pooling equilibrium. The other three hybrid strategy equilibria have similar associations to pooling equilibria.
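The Bayes-consistency requirement behind equations (13) and (14) can be checked numerically. The short Python sketch below recovers the attacker's on-path belief implied by the mixing probabilities of E11 and E12; the values chosen for Ph and p are illustrative and are not taken from the paper.

    # Illustrative sketch: mixing probabilities of E11 and E12 and the beliefs they induce.
    Ph, p = 0.4, 0.25  # example prior and target belief with p < Ph, as E11 requires

    pr_n_given_H = (1 - Ph) / Ph * p / (1 - p)        # equation (13)
    # Bayes' law: Pr(H | n) when type N always signals n and type H signals n with pr_n_given_H
    belief_E11 = Ph * pr_n_given_H / (Ph * pr_n_given_H + (1 - Ph))
    print(round(belief_E11, 6))  # recovers p = 0.25

    Ph, p = 0.2, 0.5   # example prior and target belief with p > Ph, as E12 requires
    pr_n_given_N = Ph / (1 - Ph) * (1 - p) / p        # equation (14)
    # Bayes' law: Pr(H | n) when type H always signals n and type N signals n with pr_n_given_N
    belief_E12 = Ph / (Ph + (1 - Ph) * pr_n_given_N)
    print(round(belief_E12, 6))  # recovers p = 0.5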
5 Case Studies and Discussion

The question now is how the defender can best use the above analysis. Normally, multiple equilibria pose a problem, as there is no coordination mechanism to designate which equilibrium will be played.
Table 5: Hybrid strategy equilibria and their conditions.

  E11: ((s1, s2 | Pr(n|H), 1 − Pr(n|H)), t9, p, q = 1)
       p ≥ (va − ca)/(va + co);  p ≥ 1 − ch/(va − ca);
       Pr(n|H) = ((1 − Ph)/Ph)(p/(1 − p))  (implies p < Ph)
  E12: ((s1, s3 | Pr(n|N), 1 − Pr(n|N)), t1, p, q = 0)
       p ≤ ch/(ca + co);  p ≤ (va − ca)/(va + co);
       Pr(n|N) = (Ph/(1 − Ph))((1 − p)/p)  (implies p > Ph)
  E13: ((s4, s3 | Pr(h|H), 1 − Pr(h|H)), t9, p = 1, q)
       q ≥ (va − ca)/(va + co);  q ≥ 1 − cn/(va − ca);
       Pr(h|H) = ((1 − Ph)/Ph)(q/(1 − q))  (implies q < Ph)
  E14: ((s4, s2 | Pr(h|N), 1 − Pr(h|N)), t1, p = 0, q)
       q ≤ cn/(ca + co);  q ≤ (va − ca)/(va + co);
       Pr(h|N) = (Ph/(1 − Ph))((1 − q)/q)  (implies q > Ph)

Table 6: The expected payoffs of the hybrid strategy equilibria.

  equilibrium   Defender                  Attacker
  E11           (1 − Ph)vg                0
  E12           Ph vo − (1 − Ph)cc        (1 − Ph)(va − ca) − Ph(ca + co)
  E13           (1 − Ph)vg                0
  E14           Ph vo − (1 − Ph)cc        (1 − Ph)(va − ca) − Ph(ca + co)
But since the defender moves first, she dictates the resulting equilibrium by selecting her strategy (the "first mover advantage"). The attacker observes the signal and then chooses her best response. The defender chooses her strategy as follows. She determines the set of existent equilibria by evaluating the conditions for E1-E14. The ten pooling equilibria comprise two strategies, either s1 or s4. Each hybrid strategy equilibrium comprises a unique hybrid strategy. If a strategy is represented more than once in the set, she computes the attacker's payoffs for the equilibria that contain the strategy and excludes from the set the ones that do not maximize that payoff. Assuming belief sets that are most advantageous to the attacker, the defender chooses the equilibrium strategy that maximizes her payoff, breaking ties by choosing the strategy that minimizes the attacker's payoff. In the following we investigate three case studies to illustrate the above decision process.

Case Study 1. Assume a defender has 10 percent (Ph = 0.10) of the machines in her network set up as honeypots. Suppose the defender believes that the attack costs are ca = 3.00, ch = 0.25, cn = 0.50, and co = 0.10, and that the attacker values a compromised system at va = 5.00. We begin by evaluating the equilibria conditions set forth in Tab. 3 and Tab. 5. Only pooling equilibria E3, E6, and E7 are possible. Equilibrium E3 involves strategy s1; E6 and E7 involve strategy s4. We note that E6 and E7 have an identical equilibrium path, resulting in the same outcome. Thus, the
attacker is indifferent between her equilibrium strategies in E6 and E7. Equilibrium E3 gives the defender a payoff of −0.90cc, while E6 and E7 give a payoff of 0.10vo − 0.90cc. Therefore, the defender will choose to implement strategy s4, a deceptive strategy in which all normal systems are camouflaged as honeypots.

Case Study 2. Consider another case in which the defender has 15 percent (Ph = 0.15) of her network set up as honeypots and believes that ca = 3.00, ch = 0.50, cn = 0.25, and co = 0.10. After evaluating the conditions, equilibria E1, E2, and E12 (with p = 5/31 and Pr(n|N) = 15/442) are feasible. The expected payoffs for these equilibria are equal. The defender can choose to implement strategy s1, in which all systems appear as normal systems, or randomize over s1 and s3, in which all the honeypots are disguised as normal systems and approximately 0.96 (1 − Pr(n|N)) of the normal systems are camouflaged as honeypots. In either case, the expected payoff is 0.15vo − 0.85cc for the defender and 0.85va − 3.015 for the attacker. There are some situations in which disguising all normal systems as honeypots is not feasible. This is the case when camouflaging normal systems as honeypots induces higher overheads in the response time, especially for systems running critical processes. In such situations, choosing a hybrid strategy (e.g., the one used in E12), which does not require camouflaging all normal systems, is appropriate.

Case Study 3. Most attacks are heavily automated. Networks of compromised systems ("botnets") provide inexpensive processing for highly parallel automated attacks. This essentially reduces the costs associated with testing and attacking to zero, and therefore ca = ch = cn = 0. With this in mind, the defender evaluates the equilibria conditions. For example, to check when E1 is possible we use Tab. 3. Since ch = 0, the first on-equilibrium path condition gives Ph ≤ 0. This is satisfied only when Ph = 0, which means that all systems in the network are normal systems. Equilibria E1, E2, E6, and E7 occur only when Ph = 0; equilibria E4, E5, E9, and E10 occur only when Ph = 1. Supposing that the defender has both honeypots and normal systems within her network, these equilibria are not possible. This is because Ph = 1 means that all systems are honeypots and Ph = 0 means that all systems are normal. Each hybrid strategy equilibrium requires beliefs that are one or zero. Given this, the defender will not randomize and, thus, the hybrid strategy equilibria are excluded from consideration. Equilibria E3 and E8 are possible since all the on-equilibrium and off-equilibrium path conditions are satisfied. These conditions can easily be verified using Tab. 3. According to the payoffs given in Tab. 4, both equilibria result in the defender losing (1 − Ph)cc. Consequently, the defender will choose to implement either strategy s1, in which all honeypots are disguised as normal systems, or s4, in which all normal systems are disguised as honeypots. These two strategies are strategically equivalent and thus equally efficient for the defender to use.
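As a check on Case Study 1, the short Python sketch below evaluates the belief thresholds of Tab. 3 for the stated parameter values. The threshold expressions are transcribed directly from Tab. 3; only the on-path conditions on Ph are tested, since for these particular parameter values an admissible off-path belief exists for every row, and the local variable names are our own.

    # Illustrative sketch: evaluating the Tab. 3 on-path conditions for Case Study 1.
    Ph, ca, ch, cn, co, va = 0.10, 3.00, 0.25, 0.50, 0.10, 5.00

    t_h = ch / (ca + co)         # ch/(ca + co), approx. 0.081
    t_n = cn / (ca + co)         # cn/(ca + co), approx. 0.161
    t_w = (va - ca) / (va + co)  # (va - ca)/(va + co), approx. 0.392

    conditions = {
        'E1': Ph <= t_h and Ph <= t_w,
        'E2': Ph <= t_h and Ph <= t_w,
        'E3': t_h <= Ph <= 1 - ch / (va - ca),
        'E4': Ph >= t_w and Ph >= 1 - ch / (va - ca),
        'E5': Ph >= t_w and Ph >= 1 - ch / (va - ca),
        'E6': Ph <= t_n and Ph <= t_w,
        'E7': Ph <= t_n and Ph <= t_w,
        'E8': t_n <= Ph <= 1 - cn / (va - ca),
        'E9': Ph >= t_w and Ph >= 1 - cn / (va - ca),
        'E10': Ph >= t_w and Ph >= 1 - cn / (va - ca),
    }
    print([e for e, ok in conditions.items() if ok])  # expected: ['E3', 'E6', 'E7']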
6 Conclusion

In this paper we investigated the use of deception in the interaction between the defender of a network and an attacker. We modeled the interaction between the defender and the attacker as a signaling game. We examined the game and determined and characterized its potential pooling equilibria. We also examined its hybrid strategy equilibria, which involve non-deterministic choices of strategies. A hybrid strategy allows the defender to have a network in which a mix of all types of
systems coexists (honeypots, camouflaged honeypots, normal systems, and camouflaged normal systems), allowing a much richer set of equilibria and leading to effective deception strategies. For future work, we plan to examine deception games with different payoff structures and to implement our game models in a decision support system that can be used to deploy effective defensive deception strategies in computer networks.
Acknowledgments

This research was supported, in part, by NSF grant DGE-0654014. A short version of this paper [5] was published in the Proc. of the Workshop on Security, Privacy and Trust of Computer and Cyber-Physical Networks (SecureCPN 2009). The authors wish to express their thanks to the editor and the anonymous referees for their helpful and constructive suggestions, which considerably improved the quality of the paper.
References

[1] T. Alpcan and T. Başar. A game theoretic approach to decision and analysis in network intrusion detection. In Proc. of the 43rd IEEE Conference on Decision and Control, pages 2595-2600, Dec. 2003.
[2] T. Alpcan and T. Başar. A game theoretic analysis of intrusion detection in access control systems. In Proc. of the 43rd IEEE Conf. on Decision and Control, volume 2, pages 1568-1573, 2004.
[3] J. Bowyer Bell and B. Whaley. Cheating and Deception. Transaction Publishers, 1991.
[4] R. Browne. C4I defensive infrastructure for survivability against multi-mode attacks. In Proc. of the Military Communication Conference, pages 417-424, Oct. 2000.
[5] T. E. Carroll and D. Grosu. A game theoretic investigation of deception in network security. In Proc. of the 18th IEEE International Conference on Computer Communications and Networks (ICCCN 2009), Workshop on Security, Privacy and Trust of Computer and Cyber-Physical Networks (SecureCPN 2009), 2009.
[6] F. Cohen. A note on the role of deception in information protection. Computers and Security, 17(6):483-506, 1998.
[7] F. Cohen and D. Koike. Misleading attackers with deception. In Proc. of the 5th IEEE SMC Information Assurance Workshop, pages 30-37, 2004.
[8] X. Fu, W. Yu, D. Cheng, X. Tan, K. Streff, and S. Graham. On recognizing virtual honeypots and countermeasures. In Proc. of the 2nd IEEE Int. Symp. on Dependable, Autonomic and Secure Computing (DASC '06), pages 211-218, 2006.
[9] N. Garg and D. Grosu. Deception in honeynets: a game-theoretic analysis. In Proc. of the 8th IEEE Information Assurance Workshop (IAW '07), pages 107-113, 2007.
[10] R. Gibbons. Game Theory for Applied Economists. Princeton University Press, Princeton, NJ, USA, 1992.
[11] J. Grossklags, N. Christin, and J. Chuang. Secure or insure? A game-theoretic analysis of information security games. In Proc. of the 17th Intl. World Wide Web Conference, April 2008.
[12] J. P. Hespanha and S. Bohacek. Preliminary results in routing games. In Proc. of the 2001 American Control Conference, pages 1904-1909, June 2001.
[13] C. Kreibich and J. Crowcroft. Honeycomb: Creating intrusion detection signatures using honeypots. ACM SIGCOMM Computer Communications Review, 34(1):51-56, 2004.
[14] P. Liu, W. Zang, and M. Yu. Incentive-based modeling and inference of attacker intent, objectives, and strategies. ACM Transactions on Information and System Security, 8(1):78-118, 2005.
[15] K. Lye and M. Wing. Game strategies in network security. International Journal of Information Security, 4(1-2):71-86, 2005.
[16] B. McCarty. The honeynet arms race. IEEE Security & Privacy, 1(6):79-82, 2003.
[17] J. Michalski, C. Price, E. Stanton, E. Lee, K.-S. Chua, Y.-H. Wong, and C.-P. Tan. Final report for the network security mechanisms utilizing network address translation LDRD project. Technical Report SAND2002-3613, Sandia National Laboratories, Nov. 2002.
[18] D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, and N. Weaver. Inside the Slammer worm. IEEE Security & Privacy, 1(4):33-39, 2003.
[19] D. Moore, C. Shannon, D. J. Brown, G. M. Voelker, and S. Savage. Inferring internet denial-of-service activity. ACM Trans. Comput. Syst., 24(2):115-139, 2006.
[20] S. Murphy, T. McDonald, and R. Mills. An application of deception in cyberspace: Operating system obfuscation. In Proc. of the 5th Intl. Conf. on Information Warfare and Security (ICIW 2010), pages 241-249, 2010.
[21] J. F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences of the United States of America, 36(1):48-49, Jan. 1950.
[22] A. Patcha and J.-M. Park. A game theoretic approach to modeling intrusion detection in mobile ad hoc networks. In Proc. of the 5th IEEE SMC Information Assurance Workshop, pages 280-284, 2004.
[23] N. C. Rowe. Measuring the effectiveness of honeypot counter-counterdeception. In Proc. of the 39th Annual Hawaii Int. Conf. on System Sciences (HICSS '06), page 129.3, 2006.
[24] N. C. Rowe, E. John Custy, and B. T. Duong. Defending cyberspace with fake honeypots. J. Comput., 2(2):25-36, 2007.
[25] L. Spitzner. The Honeynet Project: Trapping the hackers. IEEE Security & Privacy, 1(2):15-23, 2003.
[26] P. Syverson. A different look at secure distributed computation. In Proc. of the 10th IEEE Computer Security Foundations Workshop, pages 109-115, June 1997.
[27] The Honeynet Project. Know Your Enemy: Learning about Security Threats. Addison-Wesley Professional, 2004.