SDPS-2012, Printed in the United States of America, June 2012. © 2012 Society for Design and Process Science

CLOUD RISK-O-METER: AN ALGORITHM FOR CLOUD RISK ASSESSMENT AND MANAGEMENT

Mehmet Sahinoglu, Scott Morton
Informatics Institute, Auburn University Montgomery
Montgomery, Alabama 36124-4023, USA

ABSTRACT

Many times, management will not know how the receiving end, i.e., the CLOUD user, evaluates the network, despite the numerous self-assessment efforts executed by the CLOUD owners and managers who want to deliver a good product with minimal glitches. The hosting side needs to know what the customer base thinks, so that management (or host) can take countermeasures against the vulnerabilities whose threats would otherwise cause the risk factor to increase without control. To that effect, management must conduct dynamic information-gathering surveys to find out what is missing. One such algorithm developed by the authors is the CLOUD Risk-O-Meter, or CLOUD RoM. The proposed software tool not only assesses the risk content as a percentage but also, utilizing game-theoretic approaches, executes cost-minimal recovery management by generating a prioritized list of precautions to monitor the desired mitigation process.

INTRODUCTION

Even with all of the data centers' assurances of complete security, it is still less safe to host important data on a virtual CLOUD server than on a dedicated physical machine (Anthes, 2010). Some strong voices include the following: "Imagine what would happen if hackers gained access to thousands of people's data. It would be nothing less than a catastrophe (especially for businesses), and the data center would pretty much have to stop all or some outgoing data while they solve the problem, which means downtime for not only one but many clients and their sites and data" (Greengard, 2010). Boland (2011) studies the private Cloud. See Srinivasan and Getov (2011) for more details. The CLOUD Risk-O-Meter is an automated tool for gathering information and for quantifying, assessing, and cost-effectively managing risk. It further provides objective, dollar-based mitigation advice, allowing users to see where their funds are best allocated to lower risk to an acceptable level. This section examines CLOUD computing risk in the context of vulnerability categories (Grobauer et al., 2011), the threats presented, and specific countermeasures. Threat countermeasures are used to mitigate risk and lower it to a desirable level. Using game-theoretic optimization techniques, the user can see how his/her budgetary resources are best spent toward an optimal allocation plan, so as to lower undesirable risk to a more tolerable level (Sahinoglu, 2011). Before delving into the CLOUD RoM, it is timely to briefly summarize the essentials of the Security (or Risk) Meter methodology (Sahinoglu, 2005, 2007, 2008, 2009, 2010). In summary, innovative quantitative risk measurements are needed to compare risk alternatives objectively and to manage risk, as opposed to conventional guesswork with hand calculators.

RISK-O-METER - A BRIEF SUMMARY

The Security Meter (SM), or Risk-O-Meter (RoM), design provides the quantitative tool that is imperative in the security world. For a practical and accurate statistical design, security breaches are recorded so as to estimate the model's input probabilities using the risk equations developed. Undesirable threats (with and without bluffs) that take advantage of hardware and software vulnerabilities can break down availability, integrity, confidentiality, and nonrepudiation, as well as other aspects of software quality such as authentication, privacy, and encryption. Fig. 1 illustrates the constants in the SM (or RoM) model, namely the utility cost (dollar asset) and a criticality constant. The probabilistic inputs are vulnerability, threat, and lack of countermeasure, all valued between 0 and 1 (Sahinoglu, 2005). See Benini and Sicari (2008) for more resources. With reference to Fig. 1, the SM is described in the two subsections that follow.
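In compact notation (with vi, tij, and CMij denoting the vulnerability, threat, and countermeasure probabilities), the risk relations described in the sections that follow can be restated as:

```latex
% Security Meter / Risk-o-Meter risk equations, restating the text
\text{Residual Risk (RR)} = \sum_{i}\sum_{j} v_i \, t_{ij}\, (1 - CM_{ij}),
  \qquad \sum_i v_i = 1, \quad \sum_j t_{ij} = 1 \ \text{for each } i
\text{Final Risk} = \text{RR} \times \text{Criticality}
\text{ECL (\$)} = \text{Final Risk} \times \text{Capital Cost}
```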

Probabilistic Tree Diagram

Given that a simple sample system or component has two or more outcomes for each risk factor (vulnerability, threat, and countermeasure), the following probabilistic framework holds within the tree-diagram structure of Fig. 2: ∑vi = 1, ∑tij = 1 for each i, and LCM + CM = 1 for each ij. Using the probabilistic inputs, we get residual risk = vulnerability x threat x lack of countermeasure, where "x" denotes multiplication. That is, if we add all the residual risks due to lack of countermeasures, we obtain the overall residual risk. We then apply the criticality factor to the residual risk to calculate the final risk, and apply the capital investment cost to the final risk to determine the expected cost of loss (ECL), which helps to budget for avoiding (before an attack) or repairing (after an attack) the entire risk. That is, final risk = residual risk x criticality, and ECL ($) = final risk x capital cost.

Algorithmic Calculations

Fig. 1 leads to the example probabilistic tree diagram of Fig. 2, on which the calculations are performed. For example, out of 100 malware attempts, the number of penetrating attacks not prevented gives the estimate of the LCM percentage. One can then trace the root cause of the threat retrospectively in the tree diagram. A cyber-attack example: 1) a hacking attack occurs as a threat; 2) the firewall software does not detect it; 3) as a result of this attack, whose root threat is known, the "network" vulnerability is exploited. This illustrates the "line of attack" on the tree diagram, as in Fig. 2. Of the attacks not prevented by a certain countermeasure (CM), how many were caused by threat 1 or 2, etc., to a particular vulnerability 1 or 2, etc.? We calculate, as in Fig. 2, the residual risk (RR) = vulnerability x threat x LCM for each branch and then sum the RRs to obtain the total residual risk (TRR). Let's assume we have the input risk tree diagram of Fig. 3 and the input risk probability chart of Table 1 for a sample health-care study, where only the highlighted boxes in the tree diagram of Fig. 3 are selected for the case study.

Risk Management Clarifications for Table 1 and Figures 3, 4

Using input Table 1 and the results from Figs. 2 and 3, and so as to improve the base risk by mitigating from 26% to 10%, we implement the four first-prioritized recommended actions:
1) Increase the CM capacity for the vulnerability "Outpatient Facilities" and its threat "Patient Records" from the current 70% to 100%.
2) Increase the CM capacity for the vulnerability "Urgent Care's Surgery Centers" and its threat "Patient Records" from the current 96% to 100%.
3) Increase the CM capacity for the vulnerability "Local Health Centers" and its threat "Patient Records" from the current 72% to 98.54%.
4) Increase the CM capacity for the vulnerability "Local Health Centers" and its threat "Internet" from the current 70% to 99.99%.
In taking these actions, as in Fig. 4, a total amount of $510 is dispensed (< $513.30 as advised), each action within the limits of the optimal costs annotated, staying below the breakeven cost of $5.67 per 1% improvement. The next step proceeds with optimization toward the next desirable percentage once these acquisitions or services are provided, such as mitigating from 10% to 5%, if the budget allows. The RoM tool can serve as an auditing expert system to circumvent criticisms regarding budgeting plans to manage the risk.

CLOUD RISK-O-METER (OR ROM) ESSENTIALS

The CLOUD RoM has two embedded versions. The CLOUD RoM provider version is geared toward service providers and corporate users. The CLOUD RoM client version is geared toward individual and smaller corporate end users, for whom a new vulnerability titled Client Perceptions (PR) & Transparency is included. Akin to Fig. 3, let's begin with a relevant comprehensive tree diagram, given in Fig. 5. A thorough list of vulnerabilities for the CLOUD RoM with their related threats (Sahinoglu and Morton, 2011):

Accessibility & Privacy
  Threats: Insufficient Network-based Controls; Insider/Outsider Intrusion; Poor Key Management & Inadequate Cryptography; Lack of Availability
Software Capacity
  Threats: Software Incompatibility; Unsecure Code; Lack of User-Friendly Software; Inadequate CLOUD Applications
Internet Protocols
  Threats: Web Applications & Services; Lack of Security & Privacy; Virtualization; Inadequate Cryptography
Server Capacity & Scalability
  Threats: Lack of Sufficient Hardware; Lack of Existing Hardware Scalability; Server Farm Incapacity to Meet Customer Demand; Incorrect Configuration
Physical Infrastructure
  Threats: Power Outages; Unreliable Network Connections; Inadequate Facilities; Inadequate Repair Crews
Data & Disaster Recovery
  Threats: Lack of a Contingency Plan; Lack of Multiple Sites; Inadequate Software & Hardware; Recovery Time
Managerial Quality
  Threats: Lack of Quality Crisis-Response Personnel; Inadequate Technical Education; Insufficient Load Demand Management; Lack of Service Monitoring
Macro-Economic & Cost Factors
  Threats: Inadequate Payment Plans; Low Growth Rates; High Interest Rates; Adverse Regulatory Environment
Client Perceptions (PR) & Transparency
  Threats: Lack of PR Promotion; Adverse Company News; Unresponsiveness to Client Complaints; Lack of Openness





Nature of CLOUD Risk Assessment Questions - A Quantitative Example

Questions are designed to elicit the user's responses regarding the perceived risk from particular threats and the countermeasures the users may employ to counteract those threats. For example, for the Internet Protocols vulnerability, the questions regarding Virtualization include both threat and countermeasure questions.

Threat questions would include, as examples:
 Do your provider's virtualization appliances have packet-inspection settings left at their defaults?
 Is escape to the hypervisor likely in the case of a breach of the virtualization platform?
 Does your provider use Microsoft's Virtual PC hypervisor?
 Does your provider fail to scan the correct customer system?
 Does your provider fail to monitor its virtual machines?

Countermeasure questions would include, as examples:
 Did the provider's virtualization appliances inspect all packets?
 Did the provider extend their vulnerability- and configuration-management process to the virtualization platform?
 Did the provider patch the vulnerability or switch to another platform?
 Did the provider read in current asset or deployment information from the CLOUD and then dynamically update the IP address information before scans commence?
 Did the provider utilize Network Access Control-based enforcement for continuous monitoring of its virtual-machine population and for virtual-machine-sprawl prevention?

Risk calculation and mitigation would include, as examples:
 Essentially, the users respond yes or no to these questions. These responses are used to calculate the residual risk.
 Using a game-theoretic optimization approach, the calculated risk index is used further to generate a cost-optimal plan to lower the risk from unwanted or unacceptable levels to tolerable ones.
 Mitigation advice is generated to show the user in what areas the risk can be reduced to optimized or desired levels, such as from 50% to 45% in the screenshot (displaying the threat, countermeasure, and residual-risk indices; the optimization options; and the risk-mitigation advice).

Risk Management Clarifications for Figures 5 and 6

Using the RoM results from the risk assessment step of Fig. 6, and so as to mitigate the base risk from 50.32% down to 40%, we implement the four prioritized RoM-recommended actions:
1) Increase the CM capacity for the vulnerability "Accessibility & Privacy" and its threat "Lack of Availability" from the current 32% to 99.97%.
2) Increase the CM capacity for the vulnerability "Data & Disaster Recovery" and its threat "Recovery Time" from the current 48% to 82.44%.
3) Increase the CM capacity for the vulnerability "Managerial Quality" and its threat "Insufficient Load Demand Management" from the current 52.50% to 100%.
4) Increase the CM capacity for the vulnerability "Macro-Economic & Cost Factors" and its threat "Adverse Regulatory Environment" from the current 55% to 100%.
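The paper selects CM increases with game-theoretic optimization; as a simplified, hypothetical stand-in for that step, the sketch below greedily raises CM on the branches that remove risk most cheaply until a target residual risk is reached. The branch data and per-unit CM costs are invented for illustration and do not reproduce the paper's figures.

```python
# Greedy sketch of cost-optimal mitigation: raise countermeasure (CM) levels
# on the cheapest branches (dollars per unit of risk removed) until the total
# residual risk falls to the target. A simplification of the game-theoretic
# step the RoM actually uses; all inputs below are hypothetical.

def plan_mitigation(branches, target_risk):
    """branches: dicts with v, t, cm, and unit_cost ($ per unit CM increase)."""
    risk = sum(b["v"] * b["t"] * (1.0 - b["cm"]) for b in branches)
    plan, spent = [], 0.0
    # Cheapest risk reduction first: $ per unit of residual risk removed.
    for b in sorted(branches, key=lambda b: b["unit_cost"] / (b["v"] * b["t"])):
        if risk <= target_risk:
            break
        removable = b["v"] * b["t"] * (1.0 - b["cm"])   # risk left if CM -> 100%
        reduction = min(removable, risk - target_risk)
        new_cm = b["cm"] + reduction / (b["v"] * b["t"])
        cost = (new_cm - b["cm"]) * b["unit_cost"]
        plan.append((b["name"], b["cm"], new_cm, cost))
        spent += cost
        risk -= reduction
    return plan, spent, risk

branches = [  # hypothetical vulnerability/threat branches
    {"name": "Accessibility/Lack of Availability", "v": 0.25, "t": 0.4,
     "cm": 0.32, "unit_cost": 900.0},
    {"name": "Disaster Recovery/Recovery Time", "v": 0.25, "t": 0.5,
     "cm": 0.48, "unit_cost": 1500.0},
    {"name": "Managerial/Load Demand", "v": 0.25, "t": 0.4,
     "cm": 0.525, "unit_cost": 2000.0},
]

plan, spent, achieved = plan_mitigation(branches, target_risk=0.15)
for name, old_cm, new_cm, cost in plan:
    print(f"{name}: CM {old_cm:.0%} -> {new_cm:.2%} (${cost:,.2f})")
```

Starting from a total residual risk of 0.1805, only the cheapest branch needs its CM raised (from 32% to 62.5%) to reach the 0.15 target, mirroring the way the RoM prioritizes a short list of actions within a budget.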

In taking these actions, a total positive change of 194.91% is the minimized total change possible for guaranteeing the improvement (from 50.32% to 40%). Under the given circumstances, using the game-theoretic principles for the risk management step, out of a total CLOUD investment of $1M, a total amount of $101,000 is dispensed (< $103,222.73 as advised), each action within the limits of the optimal costs annotated, staying below the breakeven cost of $5.29 per 1% improvement. The next step proceeds with optimization toward the next desirable percentage once these acquisitions or services are provided, such as mitigating to 35% from 40% if budget remains. See Fig. 6 for the mitigation step.

CONCLUSIONS AND DISCUSSIONS

CLOUD computing, also viewed as a fifth utility after water, electric power, telephony, and gas, is set to expand dramatically if issues of availability and security can be trustfully resolved. The CLOUD simulator tool (CREA) by the authors, and its further refinement, will aid in that expansion (Sahinoglu, 2011). Monte Carlo VaR is an alternative method for day-to-day monitoring of the CLOUD (Kim et al., 2009). The CLOUD Risk-O-Meter breaks new ground in that it provides the user a quantitative assessment of risk as well as recommendations to mitigate that risk. A cross-section of draft questions (subject to change as required by the CLOUD management organization) is listed in Table 2 after the References. As such, it will be a highly useful tool for both end users and the IT professionals involved in CLOUD service provision, given mounting customer complaints about breaches of reliability (Worthen and Vascellaro, 2009). Both CRAM and Monte Carlo VaR (simulation methods), only cited here, and the CLOUD RoM (an information-gathering customer-survey method) will provide quantitative risk assessment and management solutions if the correctly collected data needed for these approaches can be justified. The Markov-chain method, by contrast, is useful only for small-scale problems, as a theoretical comparison alternative, mainly because large-scale problems with excessive numbers of Markov states are intractable to compute even with supercomputers exceeding 500 servers. Further research demands reliable random data-collection practices to render the two recommended methods, i.e., discrete event simulation (DES) and the RoM (Risk-o-Meter), useful and applicable ("most bang for the buck") in aiding managers with assessing and managing CLOUD risk, a task conventionally left to chance.

REFERENCES

Anthes, G. (2010). "Security in the Cloud", Communications of the ACM, 53(11), pp. 16-18. DOI:10.1145/1839676.1839683.
Benini, M. and Sicari, S. (2008). "Risk Assessment in Practice: A Real Case Study", Computer Communications, Vol. 31, No. 15, pp. 3691-3699.
Boland, R. (2011). "Approval Granted for Private Software to Run in Secure Cloud", SIGNAL (www.afcea.org), Information Security, pp. 35-38.
Greengard, S. (2010). "Cloud Computing and Developing Nations", Communications of the ACM, 53(5), pp. 18-20.
Grobauer, B., Walloschek, T., Stocker, E. (2011). "Understanding Cloud Computing Vulnerabilities", IEEE Security & Privacy, Vol. 9, No. 2, pp. 50-57.
Kim, H., Chaudhuri, S., Parashar, M., Marty, C. (2009). "Online Risk Analytics on the Cloud", CCGRID '09: Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, IEEE Computer Society, Washington, DC, USA.
Leavitt, N. (2009). "Is Cloud Computing Really Ready for Prime Time?", IEEE Computer, January issue, pp. 15-20.
Sahinoglu, M. (2005). "Security Meter - A Practical Decision-Tree Model to Quantify Risk", IEEE Security and Privacy, 3(3), April/May 2005, pp. 18-24.
Sahinoglu, M. (2007). Trustworthy Computing: Analytical and Quantitative Engineering Evaluation, New York: John Wiley and Sons Inc.
Sahinoglu, M. (2008). "An Input-Output Measurable Design for the Security Meter Model to Quantify and Manage Software Security Risk", IEEE Transactions on Instrumentation and Measurement, 57(6), pp. 1251-1260.
Sahinoglu, M. (2008). "Generalized Game Theory Applications to Computer Security Risk", Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, May 18-21.
Sahinoglu, M. (2009). "Can We Quantitatively Assess and Manage Risk of Software Privacy Breaches?", IJCITAE - International Journal of Computers, Information Technology and Engineering, Vol. 3, No. 2, pp. 65-70.
Sahinoglu, M., Cueva-Parra, L. (2011). "CLOUD Computing", WIREs Computational Statistics, 3, pp. 47-68. DOI: 10.1002/wics.139.
Sahinoglu, M., Morton, S. (2011). "CLOUD Computing Risk Assessment with Risk-o-Meter", AFITC (Air Force Information Technology Conference), Montgomery, AL.
Sahinoglu, M., Yuan, Y.-L., Banks, D. (2010). "Validation of a Security and Privacy Risk Metric Using Triple Uniform Product Rule", International Journal of Computers, Information Technology and Engineering, Vol. 4, No. 2, pp. 125-135.
Srinivasan, S., Getov, V. (2011). "Navigating the Cloud Computing Landscape - Technologies, Services, and Adopters", IEEE Computer, March issue, pp. 22-28.
Worthen, G., Vascellaro, J. (2009). "E-Mail Glitch Shows Pitfalls of Online Software", Media and Marketing, The Wall Street Journal, pp. B4-5.


FIGURES AND TABLES

Fig. 1. Security Meter model of probabilistic and deterministic inputs and calculated outputs.

Fig. 2. General-purpose tree diagram (V-branches, T-twigs, LCM-limbs) for the RoM model.

Fig. 3. Health-care-related Security Meter tree diagram ("Ambulatory Healthcare Cybersecurity") with highlighted selections. Vulnerabilities: HC Clinician Settings, Outpatient Facilities, Urgent Care/Ambulatory Surgery Centers, Local Health Centers, Pharmacies, Homes/Residential Facilities; threats include Patient Records, Internet, Insurance Records, Staff, HIPAA, Pharmacy Records, Customer Fraud, Doctor's Records, and Prescriptions.

Fig. 4. Example of game-theoretic cost-optimal risk management for input Table 1 and the tree diagram of Fig. 3.


Fig. 5. Tree Diagram for CLOUD Risk-O-Meter: Comprehensive (both Client and Host inclusive).

Table 1. Vulnerability-Threat-Countermeasure Input Risk Data for Fig. 3 and 4 re: Healthcare Tree Diagram and RoM



Table 2. Example of Survey Questions for CLOUD Risk-o-Meter.

Fig. 6. RoM Risk Assessment Results Applying Fig. 5.

