Efficiently Computing the Likelihoods of Cyclically Interdependent Risk Scenarios

This is a preprint of an article published in Computers & Security:

http://dx.doi.org/10.1016/j.cose.2016.09.008

Steve Muller a,b,c, Carlo Harpes a, Yves Le Traon b, Sylvain Gombault c, Jean-Marie Bonnin c

a itrust consulting s.à r.l., {steve.muller, harpes}@itrust.lu
b University of Luxembourg, [email protected]
c Telecom Bretagne, {sylvain.gombault, jm.bonnin}@telecom-bretagne.eu

Abstract

Quantitative risk assessment provides a holistic view of risk in an organisation, which is, however, often biased by the fact that risk shared by several assets is encoded multiple times in a risk analysis. An apparent solution to this issue is to take all dependencies between assets into consideration when building a risk model. However, existing approaches rarely support cyclic dependencies, although assets that mutually rely on each other are encountered in many organisations, notably in critical infrastructures. To the best of our knowledge, no author has provided a provably efficient algorithm for computing the risk in such an organisation, notwithstanding that some heuristics exist. This paper introduces the dependency-aware root cause model (DARC), which is able to compute the risk resulting from a collection of root causes using a poly-time randomized algorithm, and concludes with a discussion on real-time risk monitoring, which DARC supports by design.

Keywords: Cyclic dependencies, cyclic causal graphs, risk analysis, risk assessment, quantitative assessment, dependency graph.

1. Introduction

Risk management constitutes an important aspect of decision making, especially if the outcome is uncertain or has a large-scale impact on an organisation, which is why it forms a basis for many information security standards, including ISO/IEC 27xxx [1].

Preprint submitted to Computers & Security, November 24, 2016.

Risks can be evaluated in two ways [2]: qualitatively and quantitatively. Furthermore, some authors have suggested combining [3] or converting [4] both methods to get better results, but this topic is beyond the scope of this paper.

In a qualitative assessment, risk scenarios are identified and then estimated in terms of probability and impact on a discrete (and often abstract) scale consisting of a few values, such as 'low', 'normal', 'high' and the like. A previously defined set of unacceptable tuples ⟨probability, impact⟩ makes it possible to distinguish between risk scenarios for which counter-measures need to be implemented (so as to reduce risk) and scenarios that are critical for an organisation.

[Figure 1: table mapping frequency ('rarely', 'normal', 'often', 'very often') against impact ('low', 'normal', 'high', 'critical'); table omitted in this extraction.]

Figure 1: Sample table that can be used in a risk analysis. White cells denote acceptable, black cells unacceptable risk scenarios.

In contrast, quantitative assessments focus more on the potential damage that is inflicted on an organisation, e.g. in financial terms. So instead of qualifying a risk scenario as above, its likelihood and impact are expressed numerically; for example, by stating that scenario X is estimated to occur every 5 years and, when it does, to cause a loss of 10 k€, leading to an expected loss of 2 k€ per year. Note that, unlike above, quantitative risk analyses provide an integral view of the risk faced by an organisation, since all scenarios can be inspected and compared to one another at once, thanks to the numerical value of the expected losses. Consequently, the urgency of securing a specific asset can be readily deduced, which is not so obvious to achieve in qualitative analyses. A major drawback of many risk assessment methods is the fact that the


determined risk is biased: indeed, assets often share¹ the same risk scenarios, which are included in the risk analyses of all these assets individually. Although this is the way to go when considering each asset on its own, the risk scenario in question is going to be accounted for multiple times in the global risk analysis, the outcome of which thereby becomes distorted. It is thus sensible to eliminate any redundancy from the risk assessment, for instance by assigning each scenario to the most related asset. However, by proceeding so, the separate view on an individual asset is no longer complete, since it does not cover every possible risk scenario. To counteract this behaviour, several authors [5] [6] [7] [8] [9] [10] incorporate the assets and/or risk scenarios, together with their interdependencies, into a hierarchical graph, based on which they deduce the risk for an asset, a group of assets or the whole organisation by reading off all subordinate risk scenarios.

[Figure 2: table comparing the holistic view and the individual per-asset view of the plain, redundancy-free and dependency-aware risk assessment models; table omitted in this extraction.]

Figure 2: Flaws and strengths of the several risk assessment models. Check marks indicate correct outcome.

However, hardly any risk assessment model supports cyclic dependencies, although cycles exist in every (sub-)system where the compromise of one component affects the whole (sub-)system. This is especially true in the context of Industrial Control Systems (ICS), where cascading effects can be devastating; for instance, power availability and a voltage control system (requiring electricity to work) constitute a common example of interdependent assets.

¹ For illustration, take a server hosting a critical service. The well-functioning of the responsible software is threatened on the one hand by vulnerabilities (bugs, security flaws) of the service itself, but also, on the other hand, by any down-time of the server.


Cases from the non-ICS realm include a web service with a database (containing the administrator password) and an administration interface (which permits reading out the database); this setting is further elaborated in Figure 3.

[Figure 3: graph on the nodes 'User Database', 'Admin Web Interface', 'Web Interface' and 'Backup Location', with edges labelled by incidents such as password disclosure, modified credentials, XSS, SQL injection, unauthorized access, users cannot connect, confidential data leak and database dump disclosure; graph omitted in this extraction.]

Figure 3: Simple example of a cyclic dependency graph represented by the model introduced in this paper. 'x → y' means 'x can lead to y'. This illustration depicts a poorly designed web service hosting confidential and valuable data (e.g. medical information) where the administrator can change any user's password and can retrieve any of the regularly made backups of the user account database. Note the dependency cycle 'admin interface → backup location → user database' (dotted lines).

Some authors [11] [6] propose a solution to deal with cyclic dependency graphs using graph unfolding techniques, but they fail at providing a complexity analysis for their methods. This paper introduces a novel approach for computing the risk faced by cyclically dependent assets. Indeed, the proposed algorithm is based on a randomised (non-deterministic) simulation and, in contrast to other algorithms, is provably efficient (which is important when doing real-time risk monitoring).

Section 2 presents related papers dealing with (cyclic) dependencies in risk assessments. The risk model used by the algorithm is defined in Section 3, along with the algorithm itself, whereas the proof of its correctness and running time can be found in the Appendix. The conducted experiments and their results are presented in Section 4, Section 5 discusses a generalisation of the model to support more complex dependencies, and the paper closes with a conclusion in Section 6.


2. Related work

Breier [7] encodes information security assets and their dependencies in a directed graph, where edges denote causal relations between nodes. The model supports the use of logical 'and' (for assets that depend on all parents) and 'or' (for assets that depend on one of the parents) operations. In the risk computation, dependencies manifest themselves as added-value impact to the risk of dependent assets.

Aubigny et al. [12] establish a risk ontology for highly interdependent (critical) infrastructures, based on the estimation of quality of service (QoS). The proposed model supports risk prediction and incorporates a data structure which allows QoS information to be shared among interconnected infrastructures.

Xiaofang et al. [13] classify assets into three layers, viz. business, information and system. On the lowest layer, risk is computed traditionally as risk = impact × likelihood. Dependencies appear in the model as weighted impact added to the risk of dependent higher-level assets.

MAGERIT [14] is a risk assessment methodology supported by the Spanish government which also deals with asset dependencies by embedding them into a graph. Assets are characterized by their security objectives, viz. confidentiality, integrity and availability, which are linked together whenever they have an impact on each other. The methodology does not specify an explicit way to perform the risk assessment, though.

Fenz et al. [15] use a very abstract approach by establishing a framework describing in detail how to semi-automatically generate a Bayesian network (thus encoding dependencies in a causal graph) from an ontology. Rahmad et al. [16] aim at improving on MAGERIT by combining it with thoughts from [15]. A major difference consists in the fact that they use an exhaustive list of threat scenarios instead of security objectives, which considerably increases the size of the model.

Baiardi and Sgandurra [17] provide a methodology (Haruspex) based on attack trees that permits the likelihood of a threat to be deduced. The link to asset dependency is that, in a certain sense, causal relations can be deduced from the attack path of an intruder compromising one asset after another in order to reach his goal. The model, which is based on a Monte Carlo simulation, requires a lot of parameters and estimates before it can be used.

McQueen et al. [18] propose a methodology to deduce an expected time-to-compromise (representing risk) from attack paths which encode dependencies between services in a SCADA environment.

Utne et al. [9] focus on cascading effects in critical infrastructures by analysing the interdependencies between several high-level services (such as electricity) and their impact on society. They also provide a framework to quantify the several magnitudes involved in the risk assessment, allowing an explicit computation of risk. Most interestingly, the risk itself is expressed as the expected number of people affected by an incident.

Wang et al. [6] present an algorithm for computing probabilities of occurrence for cyclic dependencies, but they do not analyse its running time, which is generally exponential, so that it only works for small or sparse graphs. Homer et al. [11] adapt the concepts and algorithms known from Bayesian networks to the realm of risk assessments and apply a graph unfolding technique to generalise the model to cyclic dependency graphs. They do not guarantee any running time bounds, either.

3. Modelling the Risk Analysis Context

3.1. Risk Assessment

The context of a risk analysis is primarily determined by the set of assets, which can be virtually anything of value to a company or institution. In terms of risk, each asset is characterized by its impact on business when one of its (security-related) properties can no longer be guaranteed. Such properties are called security aspects in the following, and include² most notably the three notions [19] below.

² Note that on the one hand, some of these aspects might not be applicable to certain assets. On the other hand, it is sometimes sensible to add further properties, or to be more specific about existing ones, especially if the impact considerably changes when doing so. For instance, one usually wants to distinguish between temporary unavailability, causing business interruption, and permanent loss, which may be fatal to business.

• Confidentiality: the assumption that sensitive information is known only to a well-defined group of people

• Integrity: the guarantee that an asset remains in a well-defined state

• Availability: the property that an asset can be accessed in the way that was previously defined

The impact itself is expressed in financial terms; this approach has the notable advantage that estimates have an objective meaning and can thus be easily compared to each other. More precisely, the impact is defined to be the financial damage caused by the threatened security aspect, or the amount of money necessary to recover back to the original state. That way, one can compute the loss expectancy (LE), which represents the total losses to be expected in a given period, typically a year:

    LE = Σ_{s : risk scenario} impact(s) · likelihood(s),        (1)

where likelihood(s) denotes the number of times that scenario s is expected to occur in the given period (i.e., the expected frequency), a quantity usually estimated by a risk assessor, a methodology or a tool. This paper will present an algorithm for efficiently computing the likelihood function; see Section 3.5.

3.2. Dependencies

A considerable flaw in many risk management models is the lack of understanding of asset dependencies. Indeed, consider a hard disk hosting valuable data (and suppose, for the sake of the example, that no back-up is available); then a hard disk failure does not only require the physical disk to be replaced (which is cheap), but also implies the complete loss of core data (which may be business-ending). Consequently, dependency-unaware models do not give disk health monitoring the attention it deserves.

In fact, asset dependencies represent nothing else than relations of cause and consequence of security incidents on given assets. Causal graphs are thus a natural candidate for encoding these relationships in a mathematical model. Recall that a causal graph on a vertex set of events is a directed graph such that two incidents are linked whenever one causes the other.

3.3. Likelihoods and Probabilities

Whereas many approaches found in the literature (e.g. [7], [20], [21]) use an abstract scale (like 'low', 'medium', 'high') for describing the likelihood of an event, this paper relies on concrete physical magnitudes that can be used directly in a computation.

Magnitude    Unit   Description
Impact       €      Damage faced if the threat occurs
Likelihood   1/y    Expected occurrence per year
LE           €/y    Expected loss per year

Figure 4: Table summarizing the notions involved in the assessment of risk.

A reason why one tends to prefer an abstract scale over a number is that precise values are rarely known, so only a rough estimate can be given. However, to make the assessment task easier, one can still restrict to a given set of discrete values to choose from for orientation, and interpolate to express slight nuances. One possible such mapping is given in Figure 5, but it can (and has to) be adapted to the setting in question.

Note the fundamental difference between 'likelihood' and 'probability'. Probabilities are only meaningful when characterising a random event a priori: it happens or not with a certain chance. Likelihoods, however, express the a posteriori statistical occurrence of events over time.


level       likelihood       /y
very low    every 30 years   0.0333
low         every 10 years   0.1
moderate    every 3 years    0.333
high        once a year      1
very high   once a month     12

Figure 5: One possible mapping of an abstract scale to a concrete likelihood.

Mathematically, probabilities are unit-less and lie within [0, 1], whereas likelihoods can be arbitrarily large and have as unit 1/time.

Both notions 'probability' and 'likelihood' can easily be confused, because they are somewhat related. It is meaningless to say that a risk scenario occurs with a certain probability, though. Indeed, consider the statement 'there is a 10% chance of fire' and suppose that the damage caused by fire is 100 k€; then it is clearly not obvious³ how to determine the expected damage over a given period. If, however, one estimates the likelihood of fire to be on average once every 10 years, then the expected damage is trivially 10⁵ € · (1/10 y) = 10 k€/y.

The model introduced in this paper makes use of both notions: on the one hand, the likelihood of an incident or risk scenario, and on the other hand, the probability that it entails another (dependent) incident. In order to distinguish between the two, write P for the probability measure and L for the likelihood.

3.4. The Dependency-Aware Root Cause (DARC) Model

Pick a set of nodes V, each representing a security incident. A possible choice is to opt for V ⊆ A × S, where A is the set of all assets and S the set of security properties. For instance, S := {C, I, A} could comprise confidentiality, integrity and availability threat scenarios. In that case, for readability, write a.s instead of (a, s) ∈ A × S to denote the security aspect s of an asset a.

³ Is it 10% per day? Per year? How can one then express events that are estimated to occur very often, like every day? Keep in mind that probabilities have an intrinsic upper bound of 100%.

Let E ⊆ V × V be the set of (directed) edges such that (α, β) ∈ E iff security incident α has an impact on incident β. For readability, write α → β instead of (α, β) if the set of edges E is understood from the context. Illustrating the notation, the hard-disk example above can be rewritten in a very short and intuitive form: HDD.A → Data.A, which reads 'if the availability of the hard drive is compromised, then so is the availability of any data stored on it'. The tuple (V, E) constitutes a directed graph which is henceforth referred to as the (asset) dependency graph. It is not required to be acyclic.

Whereas manual estimation of the full probability distribution of the several incidents is theoretically possible, statistical experiments in real-world systems are usually infeasible, for they would require the simulation of threats on business processes. Instead, the proposed model uses a slightly simplified approach, basing itself on estimating the probability that a particular incident causes another, independently of possible other causes⁴. Formally, this describes a mapping p : E → [0, 1], where p(α → β) denotes the probability that β is entailed by α. Note that this is not the same as P[β | α], since β could also be entailed by other events; p(α → β) is often denoted P[β | do(α)] or P[β | set(α)] in the literature [22].

Define a root cause to be a vertex without parent nodes in the dependency graph. These are the causes for which an explicit likelihood needs to be specified later on, whereas for non-root nodes it is deduced from the model. An example of a dependency graph is depicted in Figure 6. The edge weights represent the values of p.

⁴ In particular, the parent causes of an event are related by a boolean OR operation. The model can be extended to support arbitrary boolean formulas as well; see Section 5.


[Figure 6: graph on the incidents DoS, Router.I, Power.A, Server.A and Database.A, with edge probabilities 0.1, 0.01, 0.5, 0.001 and 0.8; graph omitted in this extraction.]

Figure 6: An example illustrating the representation of the model as a graph. A and I stand for the availability and integrity properties of the assets, respectively. Edge weights encode the values of the probability map p. Note how the model does not make a difference between external threats (circled) and security properties of assets (boxed).
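To make the notation concrete, the dependency graph and its probability map p can be encoded with plain dictionaries. The edge set below loosely mirrors the nodes of Figure 6 but is hypothetical, since the figure's exact edges are not recoverable from this extraction:

```python
# Sketch of a DARC dependency graph. Nodes are security incidents;
# edges carry the probability map p. Edge set is illustrative only.
p = {
    # (alpha, beta): p(alpha -> beta)
    ("DoS", "Router.I"): 0.1,
    ("Power.A", "Server.A"): 0.5,
    ("Router.I", "Server.A"): 0.01,
    ("Server.A", "Database.A"): 0.8,
}

nodes = {v for edge in p for v in edge}

# Root causes are exactly the vertices without parents in the graph.
targets = {beta for (_, beta) in p}
root_causes = nodes - targets

print(sorted(root_causes))  # ['DoS', 'Power.A']
```

Nothing in the encoding forbids cycles: adding an edge ("Database.A", "Power.A") would be perfectly legal, which is the point of the model.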

3.5. Computing the probability distribution

The complexity of the DARC model lies in the fact that one is interested in the probability that a certain sequence of events cause each other, which is different from the problem of finding such a sequence. Indeed, for the former, one needs to determine all such sequences and compute the probability that one of them occurs. For this, simply listing those sequences and adding up their probabilities of occurrence yields wrong results, since some edges are accounted for twice.

3.5.1. The acyclic case

If the dependency graph is acyclic, well-established theory can be used to describe the full probability distribution. Indeed, by its definition, the dependency graph together with the associated probability distribution constitutes a Bayesian network. This case has been extensively studied [23], notably in [24] and [11]. This paper will thus focus on cyclic dependencies. It is important to note that already in this simpler case, it is computationally infeasible to determine the full probability distribution of general Bayesian networks [25]. If one assumes further properties of the graph, efficient algorithms do exist, though [26]. A new approach will thus be required to support cyclic graphs as well.


3.5.2. The general case

For cyclically related security incidents, computing their likelihoods constitutes a more delicate problem than it seems.

[Figure 7: a simple cyclic dependency graph on the nodes A, B, C, D, E and F; graph omitted in this extraction.]

Figure 7: A simple cyclic dependency graph.

By the way the model is defined, an event (i.e., a node) is not triggered multiple times throughout the course of the experiment, but once and for all. For a concrete instance of the random experiment, the issue consists in finding all events that are activated by any of the root causes. See for example Figure 7: event E can be caused by C or F, but inspecting the situation in more detail, E is only caused by either of the two event chains A → B → C → E or F → E. In particular, E can only be triggered by C if C is not already triggered by the chain F → E → D → B → C.

Consequently, the probability that a risk scenario occurs cannot be expressed by the probabilities of its direct parents alone, but has to involve all cycle-free paths from a root node. The computational effort for enumerating all such paths can be huge⁵, which is also why all efforts at finding an efficient deterministic algorithm failed. Instead, the randomized algorithm given in Algorithm 1 aims to give a good approximation; note that risk assessments as introduced in this paper do not require input data to be precise and are stable with respect to fluctuations. The running time of Algorithm 1 is polynomial in its input data.

⁵ In the worst case, namely in complete graphs, the running time is exponential in the number of vertices.


Algorithm 1 Compute probability matrix
Input: Graph G = (V, E) with root nodes V_R ⊂ V
Input: Probability map p : E → [0, 1]
Input: ε > 0, δ > 0
Output: Probabilities C : V_R × V → [0, 1] that a root node causes a node, each value with absolute error at most ε. The algorithm will fail with probability at most δ.

 1: γ := (1 + √ε)/ε
 2: N := (6γ/ε²) · ln(2n/δ), where n := |V|
 3: for (v_r, v) ∈ V_R × V do
 4:     C(v_r, v) ← 0
 5: end for
 6: loop N times
 7:     Sample a random graph G′ from G according to p
 8:     for v_r ∈ V_R do
 9:         for v ∈ V(G′) do
10:             if ∃ path in G′ from v_r to v then
11:                 C(v_r, v) ← C(v_r, v) + 1/N
12:             end if
13:         end for
14:     end for
15: end loop
16: return C
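A direct Python transcription of Algorithm 1 may help the reader. Note that the constants γ and N are a reconstruction from the garbled extraction, chosen to match the paper's stated Θ(ε⁻³ · ln(2n/δ)) iteration count, so this is a sketch rather than a reference implementation:

```python
import math
import random
from collections import defaultdict, deque

def compute_probability_matrix(nodes, p, roots, eps=0.1, delta=0.01, rng=random):
    """Monte Carlo sketch of Algorithm 1: estimate C[vr, v], the probability
    that root cause vr entails node v. Each round samples a subgraph G' by
    keeping every edge (a, b) independently with probability p[a, b], then
    credits all nodes reachable from each root (breadth-first search)."""
    n = len(nodes)
    # Lines 1-2 of the pseudocode (reconstructed constants; an assumption).
    gamma = (1 + math.sqrt(eps)) / eps
    N = math.ceil(6 * gamma / eps ** 2 * math.log(2 * n / delta))
    C = defaultdict(float)
    for _ in range(N):
        # Line 7: sample a random graph G' from G according to p.
        adj = defaultdict(list)
        for (a, b), prob in p.items():
            if rng.random() < prob:
                adj[a].append(b)
        # Lines 8-14: for every root, credit all nodes reachable in G'.
        for vr in roots:
            seen, queue = {vr}, deque([vr])
            while queue:
                u = queue.popleft()
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        queue.append(w)
            for v in seen:
                C[vr, v] += 1 / N
    return C
```

One BFS per root costs O(m), so each of the N rounds costs O(|V_R| · m) ≤ O(n · m), matching the overall bound stated below. The sketch is trivially parallelisable across rounds, as Section 4 observes.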


More precisely, it is bounded by

    O(n · m · ln(2n/δ) · ε⁻³),

where n is the number of vertices, m is the number of edges, δ is the probability that the algorithm output is wrong and ε is an upper bound for the absolute error of the computed values. Observe the logarithmic dependency on δ, which permits amplifying the algorithm's accuracy without significantly increasing its running time. The proof of correctness, running time and error probability is given in the annex.

3.6. Real-Time Risk Monitoring

The DARC model lends itself to real-time⁶ risk monitoring in the sense that the likelihoods of the root nodes, which usually have to be estimated manually, can be automatically determined by external sources such as intrusion detection systems (IDS) or security information and event management (SIEM) appliances.

Observe that the function C : V_R × V → R computed in Algorithm 1 only depends on the probability weights p encoded into the graph, but not on the likelihoods of the root causes. Consequently, since the model (including p) is not supposed to change during the risk monitoring phase, the above algorithm is only invoked once, namely after the model design phase. As such, C can be considered static.

Since C expects two arguments, it can be viewed as a two-dimensional matrix C ∈ R^(V_R × V), where each row represents a root node ∈ V_R and each column an arbitrary node ∈ V. In fact, an entry of C denotes the probability that a given root node (= row) entails any given node (= column).

⁶ Note that 'real-time' denotes a process where an explicit bound on the running time is known.

If the vector L_R ∈ R^(V_R) denotes the (estimated) likelihoods of the root causes, then the likelihoods L of all nodes can be computed as

    L = L_R^⊤ · C,

where · denotes matrix multiplication and L_R^⊤ the transpose of the vector L_R. Moreover, let I ∈ R^V be the vector that holds the direct impact caused by each node. The global risk can finally be written as

    risk = L_R^⊤ · C · I.

In a more explicit fashion,

    risk = Σ_{v_r ∈ V_R} Σ_{v ∈ V} L_R(v_r) · C(v_r, v) · I(v),

where C is computed from the model by Algorithm 1, the impacts I are manually estimated by a risk assessor and the root cause likelihoods L_R are dynamically monitored by external sources (IDS, SIEM, ...).
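The monitoring-phase computation is a plain matrix product and can be sketched as follows; all numeric values here are invented for illustration:

```python
# Sketch of the monitoring-phase formula risk = L_R^T * C * I.
# C comes from Algorithm 1 and stays static; L_R would be refreshed by
# external sources (IDS, SIEM, ...). All numbers are hypothetical.
roots = ["DoS", "Power.A"]
nodes = ["DoS", "Power.A", "Server.A", "Database.A"]

# C[(vr, v)]: probability that root cause vr entails node v.
C = {
    ("DoS", "DoS"): 1.0, ("DoS", "Server.A"): 0.01,
    ("Power.A", "Power.A"): 1.0, ("Power.A", "Server.A"): 0.5,
    ("Power.A", "Database.A"): 0.4,
}

L_R = {"DoS": 2.0, "Power.A": 0.1}  # root cause likelihoods, 1/year
I = {"DoS": 0.0, "Power.A": 1_000, "Server.A": 5_000, "Database.A": 50_000}

def global_risk(roots, nodes, C, L_R, I):
    """Expected loss per year: sum of L_R(vr) * C(vr, v) * I(v)."""
    return sum(L_R[vr] * C.get((vr, v), 0.0) * I[v]
               for vr in roots for v in nodes)

print(global_risk(roots, nodes, C, L_R, I))  # 2450.0 (EUR per year)
```

Since C is static, each monitoring update only re-runs this cheap double sum with fresh values of L_R, never Algorithm 1 itself.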

4. Experiments

The algorithm is constructed in such a way that it can be interrupted at any point, yielding a less precise, but complete solution. More precisely, if it is aborted after αN steps, for 0 < α < 1, then the relative error ε will increase at most by a factor α^(−1/3): this estimate follows directly from the definition of N (the number of simulation iterations) and is formally proven in Lemma 1 in the appendix. Moreover, since all simulations are run in an independent manner, they can perfectly be run in parallel (profiting from the multi-threading capabilities of a CPU) or in a distributed way.

To test the performance of the algorithm on 'average' graphs, dependency graphs have been generated uniformly at random. A typical risk analysis may


[Figure 8: plot of execution time against the number of nodes n (200 to 1,000); plot omitted in this extraction.]

Figure 8: Execution time of Algorithm 1 in seconds, depending on the graph size n, with ε = 0.1 and δ = 0.01 and an average of 5 neighbours per node.

cover up to 50 different assets⁷, each of which generally encounters 3 to 5 threats⁸, so a related graph is composed of a few hundred nodes. It is sensible to assume that nodes are not connected (on average) to more than a few edges, so a typical graph will consist of a few thousand edges at most. The simulation was performed on a dual-core 2.5 GHz processor (i7-3537U). The results are depicted in Figures 8, 9, 10 and 11; as expected, they reflect the running time computed in Proposition 1 in the appendix. The precise numbers can be found in Table .1 in the appendix.

In order to compare the performance of Algorithm 1 to other approaches, similar experiments have been conducted with straight-forward deterministic algorithms. For example, consider the simple recursive algorithm which conditions on the existence of each edge. It relies on the mathematical

⁷ Based on experience from past risk analyses performed by itrust consulting.
⁸ Most often, these threats include the criticality, integrity and availability aspects of each asset, which can be further sub-divided (e.g. temporary unavailability vs. permanent loss).


[Figure 9: plot of execution time against the relative error ε (0.1 to 0.5); plot omitted in this extraction.]

Figure 9: Execution time of Algorithm 1 in seconds, depending on the precision ε of the results, with n = 500 and δ = 0.01 and an average of 5 neighbours per node.

observation that, for each edge e ∈ E,

    Pr[∃ path v ⇝ w] = p(e) · Pr[∃ path v ⇝ w | e] + (1 − p(e)) · Pr[∃ path v ⇝ w | ¬e],

which can easily be turned into a recursive algorithm computing the probability that a path exists between any two nodes v and w. Unfortunately, such algorithms have exponential running time and take more than a few minutes already for small graphs (|V| ≥ 20, |E| ≥ 200).

All other attempts at solving the problem in a deterministic way resulted in similarly bad execution times.
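For comparison, the edge-conditioning recursion above can be sketched as follows; it is exact, but runs in time exponential in the number of edges, as noted:

```python
from collections import defaultdict, deque

def path_probability(p, v, w):
    """Exact (exponential-time) probability that a path v ~> w exists,
    by conditioning on each edge in turn:
    Pr[path] = p(e) * Pr[path | e] + (1 - p(e)) * Pr[path | not e]."""
    edges = list(p)

    def reachable(present):
        # BFS over the edges fixed as 'present' in the current branch.
        adj = defaultdict(list)
        for a, b in present:
            adj[a].append(b)
        seen, queue = {v}, deque([v])
        while queue:
            u = queue.popleft()
            for x in adj[u]:
                if x not in seen:
                    seen.add(x)
                    queue.append(x)
        return w in seen

    def go(i, present):
        # Branch on edge i being present or absent (2^|E| leaves).
        if i == len(edges):
            return 1.0 if reachable(present) else 0.0
        e = edges[i]
        return (p[e] * go(i + 1, present + [e])
                + (1 - p[e]) * go(i + 1, present))

    return go(0, [])

print(path_probability({("A", "B"): 0.5, ("B", "C"): 0.5}, "A", "C"))  # 0.25
```

Already at |E| ≈ 20 this explores about a million worlds, which illustrates why the randomized Algorithm 1 is preferred.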

5. Extension to boolean formulas

The dependency graph is based on the concept of causality; that is, the parents of a node represent alternative causes, each of which can engender the consequential scenario. Formally, the dependency relationship of a vertex v₀


[Figure 10: plot of execution time against the correctness parameter δ (10⁻⁴ to 10⁻¹); plot omitted in this extraction.]

Figure 10: Execution time of Algorithm 1 in seconds, depending on the correctness δ of the algorithm output, with n = 500 and ε = 0.1 and an average of 5 neighbours per node.

and its parent nodes

Pv0 ⊂ V (G)

can be expressed as

_

ρ(v0 ) :=

Ix ,

x∈Pv0 where

Ix

denotes the boolean variable encoding whether the event

x

occurs or

not. The beauty of Algorithm 1 lies in the fact that it does not depend at all on the topology of the graph or on the form of the dependencies. In fact, generalising the `or' relations to arbitrary boolean expressions

ρ(·)

is straight-forward and

does not change the main lines or the proof of the algorithm; yet considerable adaptations have to be made in order to nd whether a node is triggered or not (line 10 in the algorithm). For general boolean formulas the eort for computing the likelihoods is considerably higher, since a recursive search might no longer be possible (see e.g. Figure 12). A dierent theory, such as boolean satisability [27], is required in these matters. Moreover, the comparatively good running time of Algorithm 1 was due to the fact that evaluating the likelihood (sc. nding all cycle-free paths) can be implemented in an ecient way.

[Figure 11: Execution time of Algorithm 1 in seconds, depending on the graph size n and m, with ε = 0.1 and δ = 0.01.]

For more complex boolean formulas this may not necessarily be the case (indeed, the boolean satisfiability problem sat is NP-complete⁹ [27]), so that deterministic (and possibly even error-free) algorithms could outperform the simulation variant.
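The generalised trigger check can still be simulated when the formulas are monotone. The sketch below is not the paper's Algorithm 1; it assumes monotone boolean trigger formulas and evaluates each sampled round as a least fixed point (all nodes start untriggered and the formulas are re-evaluated until nothing changes), which is one way of realising the "likelihood zero for nodes inside a never-entered cycle" convention discussed with Figure 12. All node names and probabilities are illustrative.

```python
import random

def sample_triggered(root_prob, formula, nodes, rng):
    """One simulation round: sample which root causes fire, then take the
    least fixed point of the (monotone) trigger formulas over all nodes."""
    triggered = {v for v, p in root_prob.items() if rng.random() < p}
    changed = True
    while changed:  # terminates: nodes are only ever added
        changed = False
        for v in nodes:
            if v not in triggered and formula[v](triggered):
                triggered.add(v)
                changed = True
    return triggered

def estimate_likelihoods(root_prob, formula, nodes, rounds=20000, seed=0):
    """Estimate, for every node, the probability of being triggered."""
    rng = random.Random(seed)
    hits = dict.fromkeys(nodes, 0)
    for _ in range(rounds):
        for v in sample_triggered(root_prob, formula, nodes, rng):
            hits[v] += 1
    return {v: hits[v] / rounds for v in nodes}

# Toy example: C fires iff both root causes A and B fire (an 'and' relation).
probs = estimate_likelihoods(
    root_prob={"A": 0.5, "B": 0.5},
    formula={"A": lambda t: False, "B": lambda t: False,
             "C": lambda t: "A" in t and "B" in t},
    nodes=["A", "B", "C"],
)
```

With plain 'or' formulas this reduces to the reachability sampling of Algorithm 1; the fixed-point loop, however, may need up to n passes over all formulas per round, which is where the extra cost for general formulas shows up.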

6. Conclusion and Outlook

This paper provides a simple and lightweight approach for encoding asset dependencies into a graph structure. Since that graph is not assumed to be acyclic, the model can also be used in environments with interdependencies, such as in Industrial Control Systems (ICS) or Critical Infrastructures (CI). The major contribution of this piece of work (apart from the DARC model) is Algorithm 1, which computes the resulting risk in such a graph in a provably efficient way. Indeed, as it turned out, any deterministic approach that we could think of is computationally too complex¹⁰ to serve as a basis for any usable algorithm.

[Figure 12: Endless loop in dependencies for general boolean formulas: in order to evaluate A ∧ X, one needs to evaluate both parents, including X and thus Y ∧ B and Y. But Y can only be evaluated if A ∧ X is known already. It is not so clear how to proceed in such a case: one solution is to set the likelihood to zero for all non-reachable nodes, because in fact the cycle can never be entered; however, this might not be sensible in all use-cases.]

The DARC model was developed with the intention of creating a tool that continuously computes and monitors the current risk faced by an organisation, taking all dependencies into account. For now, it merely encodes the causal relationships between incidents (i.e. A causes B), so that a quantitative risk assessment can only be performed in a very basic way (risk = likelihood × impact); this approach has been discussed in Section 3.6. Therefore, the next steps consist in embedding the DARC model into a whole risk methodology, by including more fine-grained notions into the model (such as threat exposure, vulnerabilities or preventive measures). Doing so will also enable more sources of risk information to be integrated into the monitoring tool, for instance software agents that rate and report the performance of preventive security measures.

⁹ Given an arbitrary boolean formula ρ on variables x1, . . . , xn, the sat problem consists in determining whether there is an assignment α ∈ {0, 1}ⁿ such that ρ(x1 := α1, . . . , xn := αn) = 1.

¹⁰ That is, the running time is exponential in the number of nodes and edges, which rapidly becomes a problem already for small graphs (|V| ≥ 30).


7. Acknowledgements

This work was supported by the Fonds National de la Recherche, Luxembourg (project reference 10239425).

Appendix

Proposition 1. Algorithm 1 is correct with probability 1 − δ and terminates within time
\[
O\Big(n \cdot m \cdot \ln\frac{2n}{\delta} \cdot \varepsilon^{-3}\Big).
\]
Moreover, each computed value lies within an interval of ±ε around the true value.

Proof.

Fix a root node vr ∈ VR. For v ∈ V and 1 ≤ i ≤ N, let Xi(v) be the indicator variable that there is a path from vr to v in the i-th random experiment. Observe that
\[
\mathbb{E}\Big[\frac{1}{N}\sum_{i=1}^{N} X_i(v)\Big] = \mathbb{E}[X_1(v)] = \mathbb{P}[X_1(v) = 1],
\]
that is, the quantity approximated by the algorithm (left-hand side) equals the probability that node v is reachable by vr. So if the random experiments do not deviate too much from their expectations, the algorithm output is correct up to a certain relative error, which is determined in the following.

Define γ and N as in the algorithm, and note that γ < ε < 1. Fix v ∈ V and suppose for now that μ(v) ≥ γ. Using a two-sided Chernoff bound [28],
\[
\mathbb{P}\Big[\Big|\sum_{i=1}^{N} X_i(v) - N\mu(v)\Big| > N\mu(v)\,\varepsilon\Big]
\le 2\exp\Big(-\frac{\varepsilon^2}{3}\,N\gamma\Big)
= \frac{\delta}{n^2}
\quad\text{(by the choice of } N\text{)}.
\]
If however μ(v) ≤ γ < ε, using a one-sided Chernoff bound,
\[
\mathbb{P}\Big[\sum_{i=1}^{N} X_i(v)/N > \varepsilon\Big]
= \mathbb{P}\Big[\sum_{i=1}^{N} X_i(v) > \Big(1 + \frac{\varepsilon - \mu(v)}{\mu(v)}\Big) N\mu(v)\Big]
\le 2\exp\Big(-\Big(\frac{\varepsilon - \mu(v)}{\mu(v)}\Big)^2 \frac{N\mu(v)}{3}\Big)
\le 2\exp\Big(-(\varepsilon - \gamma)^2 \frac{N}{3\gamma}\Big)
= 2\exp\Big(-\Big(\frac{\varepsilon - \gamma}{\varepsilon\gamma}\Big)^2 \ln\frac{2n^2}{\delta}\Big). \quad (*)
\]
By the definition of γ it holds that ε − γ > εγ, and thus (∗) ≤ δ/n². To summarize, with probability at least 1 − δn⁻², the following two statements hold:

• If μ(v) ≥ γ, then the relative error of the random experiment is at most ε; however, since μ(v) < 1, this also implies that the absolute error e(v) := |(1/N) ∑i Xi(v) − μ(v)| is at most ε.

• If μ(v) ≤ γ, then the absolute error e(v) is at most ε.

Note that the statements above hold for any fixed vertex node v0 and any fixed root vr,0. Using a union bound,
\[
\mathbb{P}[\exists v_r\,\exists v : e(v) > \varepsilon] \le n^2 \cdot \mathbb{P}[e(v_0) > \varepsilon] = \delta,
\]
yielding the desired error probability for the algorithm. Regarding the running time, note that the inner for-loop can be implemented (e.g. using a breadth-first search) in linear time O(m) at most, for each of the n root nodes, whereas the sampling requires time O(m), resulting in a total execution time of O(n · m · N).
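The concrete choice of γ and N is made in Algorithm 1, which is not reproduced in this excerpt. A sketch consistent with the proof, assuming γ = ε/2 (which satisfies ε − γ > εγ whenever ε < 1) and N chosen so that 2 exp(−ε²Nγ/3) ≤ δ/n², is:

```python
import math

def sample_count(n, eps, delta, gamma=None):
    """Number of Monte Carlo rounds N so that every estimated likelihood is
    within +/-eps of the true value with probability at least 1 - delta.
    gamma is the threshold from Algorithm 1 (assumed here, not quoted from
    the paper); gamma = eps/2 satisfies eps - gamma > eps*gamma for eps < 1."""
    if gamma is None:
        gamma = eps / 2
    assert 0 < gamma < eps < 1 and eps - gamma > eps * gamma
    # Chosen so that 2*exp(-(eps**2/3)*N*gamma) <= delta / n**2.
    return math.ceil(3 * math.log(2 * n**2 / delta) / (eps**2 * gamma))

# For n = 500, eps = 0.1, delta = 0.01 (the parameters of Figure 11):
N = sample_count(500, 0.1, 0.01)
```

With γ = ε/2 this gives N = O(ln(2n²/δ) · ε⁻³), matching the ε⁻³ factor in Proposition 1, and the total running time O(n · m · N) follows.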

Lemma 1. For fixed δ, if Algorithm 1 is aborted after αN iterations, for 0
