Keynote Talk 2

Securability: The Key Challenge for Autonomic and Trusted Computing

Miroslaw Malek
Humboldt-Universität zu Berlin, Germany
e-mail: [email protected]

Abstract— Securability, the key challenge for autonomic and trusted computing, focuses on both dependability and security. The main techniques to meet this challenge will be outlined, and concepts such as proactive fault management, failure prediction, translucency and the QQ-plane will be introduced.

Keywords: autonomic computing, dependability, proactive fault management, QQ-plane, securability, security, translucency
OVERVIEW

The main hurdle on the way to wide acceptance of cloud computing is securability. We broaden the meaning of securability to include not only the ability to secure the system but also the ability to keep it running. In a nutshell, securability is a system or service property that integrates dependability and security. Solving the securability challenge is the key to attracting more enterprises to move their operations to computing clouds. As daily life and enterprise operations become increasingly dependent on such systems, the demand for securability will become a sine qua non condition for the emerging networks and systems. This trend will simply force system manufacturers and operators to deliver higher levels of securability in addition to the already well-established scalable performance and cost effectiveness.

Meeting users' expectations will not be easy, as securability is and will remain a permanent challenge due to: ever-increasing system complexity; growing connectivity and interoperability; dynamicity (frequent configurations, reconfigurations, updates, upgrades and patches); the ever-growing number and diversity of cyber-attacks; and the demand for global, 7x24 access and utilization (any place, any time). In some cases, an additional problem is the integration of the physical world with the virtual (cyber) world, which is not easy to accomplish [1]. Also, dependability frequently requires replication, which may be detrimental to security. Self-x, autonomic and trustworthy computing approaches will have to integrate proactive fault and attack management methods to cope effectively with ever more complex systems and networks. With the universal use of cloud computing, the Internet of Things and smartphones as the main human-machine interaction devices, securability, in addition to the low-power challenge, will dominate this decade.

SECURABILITY

As with performability in the past [2], where the challenge was to optimize a tradeoff between performance and dependability, there is a need to pose a securability (security/dependability) challenge, especially for emerging networks and systems such as multiclouds. We define securability as a property of a system or service that expresses the reliance that can be placed on the system or service even in the presence of hostile attacks and other attempts to breach security. There is no security without dependability and vice versa. But merging these two worlds is by no means trivial and requires expertise in both areas, which is relatively rare as the two fields continue to grow separately. The separation goes so far that even the terminology in the two areas continues to differ. Unreliable systems are usually vulnerable to security breaches, and low-security systems are vulnerable to hostile attacks that result in crashes or data loss.

With the ever-growing complexity of computer and communication systems, analytical methods for securability evaluation do not scale, especially with respect to the availability and security of Information Technology Organizations (IT-Organizations). As an alternative to analytical approaches, best practices and generic frameworks such as the IT Infrastructure Library (ITIL) and the Control Objectives for Information and Related Technology (CobiT) can be used to derive a quantifiable concept for evaluating and improving the availability and security of an IT-Organization.
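To make the notion of a quantifiable availability concept concrete, the following minimal Python sketch combines per-component availabilities for series and redundant structures. All component names and figures are purely hypothetical, and the independence assumption is a simplification; this illustrates standard availability algebra, not ITIL, CobiT or any particular tool.

    # Toy illustration only: combining assumed component availabilities
    # under an independence assumption. All figures are hypothetical.

    def series(*avail: float) -> float:
        """Availability of a series structure: every component must be up."""
        result = 1.0
        for a in avail:
            result *= a
        return result

    def parallel(*avail: float) -> float:
        """Availability of a redundant structure: one working component suffices."""
        all_down = 1.0
        for a in avail:
            all_down *= (1.0 - a)
        return 1.0 - all_down

    # Hypothetical per-layer availabilities for a single server instance.
    hardware, operating_system, application = 0.999, 0.998, 0.995
    node = series(hardware, operating_system, application)

    # Two redundant nodes; people and processes remain a single point of failure.
    cluster = parallel(node, node)
    personnel = 0.99
    service = series(cluster, personnel)

    print(f"single node : {node:.5f}")
    print(f"cluster     : {cluster:.5f}")
    print(f"end-to-end  : {service:.5f}")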
Such management and standards-compliance aspects are handled by a separate community, comprised mainly of practitioners and relatively few researchers, who have developed them. Since securability is a difficult problem to tackle, we need several diverse techniques. Here, we present a set of different approaches that we have developed for this purpose. We start with a comprehensive approach to availability, followed by a RAID-like approach to multicloud security. We also formulate the securability challenges and potential solutions by
considering dependability, security, translucency, quantitative evaluation and qualitative assessment in the form of a QQ-plane, thus providing the basis for proactive securability management.

DEPENDABILITY WITH SHIP-IT

The raison d'être of computer/communication systems is the seamless execution of business processes and the delivery of services on demand, anywhere and anytime. The SHIP-IT approach (Software, Hardware, Infrastructure, Personnel, IT-Organization) takes a holistic view of system, people and infrastructure in optimizing business continuity [3]. Our approach is based on three pillars:

1) comprehensive online service availability evaluation (SHIP-IT),
2) identification of single points of failure, and
3) proactive fault management, including seamless failure avoidance techniques that use runtime monitoring and prediction.

The power of SHIP-IT lies in a rare combination of the classic, scientific approach to dependability with best practices and frameworks (reference models) from industry. We argue that by applying the presented techniques, business process availability may be improved by an order of magnitude or more. In a multicloud environment, where business processes and services run across many organizations and several domains, higher availability can be achieved by rapid failover from one organization's cloud to a cloud owned by another company. In such cases, failure prediction methods are essential, as moving applications and data across cloud domains is not a trivial undertaking.

SECURITY

In a multicloud environment there is an excellent opportunity to dramatically enhance security by exploiting multiple clouds, each of which holds only a part of the data, so that no attacker can retrieve an entire file by compromising a single machine. A file is divided into n+1 parts such that at least n parts are needed to decrypt the data in the file. As in some RAID systems, we place the n+1 parts on n+1 servers or disk drives for fault tolerance, in case one of the servers fails. An authenticated file owner must have access to at least n parts in order to read the file [4]. In a similar way, information should be dispersed when transmitting it over the network [5]. Depending on the performance/securability requirements, various RAID configurations can be used to optimize the system accordingly.

TRANSLUCENCY

Translucency [6] is a concept that helps to decide at which level the most cost-efficient availability gains can be achieved. Translucency requires the ability to assess the effectiveness and cost of dependability improvement techniques at every system level, be it hardware, software, operating system, service or business process. Such a capability is highly desirable, as system providers and sophisticated users would be able to easily assess where they can get “the biggest bang for the buck.” A similar approach can be applied to securability by deciding at which level system dependability or service protection measures are most effective.

QQ-CHALLENGE

In securability evaluation, both quantitative measurement and qualitative assessment (QQ) play an important role. For practical reasons, security and/or dependability can often be assessed only qualitatively, but for comparative analysis quantitative evaluation is generally preferred. The QQ-plane allows us to pursue both approaches by marking the behavior of a system with respect to both quantitative measures (e.g., along the x-axis) and qualitative measures (along the y-axis).
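As a minimal sketch of the idea in Python (all system names, metrics and scores below are illustrative assumptions, not output of any particular tool), candidate systems can be represented as points with one quantitative and one qualitative coordinate and then compared:

    # Minimal sketch with hypothetical systems and scores: each system becomes
    # a point on the QQ-plane, with a quantitative measure on the x-axis (here,
    # measured availability) and a qualitative assessment on the y-axis (here,
    # a questionnaire-based maturity level from 1 to 5).

    from dataclasses import dataclass

    @dataclass
    class QQPoint:
        name: str
        quantitative: float  # e.g., measured availability in [0, 1]
        qualitative: int     # e.g., assessed security maturity, 1..5

    systems = [
        QQPoint("cloud A", quantitative=0.9995, qualitative=4),
        QQPoint("cloud B", quantitative=0.9999, qualitative=2),
        QQPoint("on-premise", quantitative=0.9950, qualitative=3),
    ]

    # A simple comparative ranking that takes both axes into account.
    for s in sorted(systems, key=lambda p: (p.qualitative, p.quantitative), reverse=True):
        print(f"{s.name:>10}: x = {s.quantitative:.4f}, y = {s.qualitative}")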
For instance, the SHIP-IT software tool allows us to model and analyze systems using the QQ approach, focusing on the evaluation of IT-enterprise availability. This approach can easily be extended to securability, assuming the existence of adequate metrics for quantitative evaluation and of questionnaires and methods for qualitative assessment. With the rising proliferation of services, it is also imperative to assess the securability of services and business processes as well as of the supporting IT infrastructure and its management.

PROACTIVE SECURABILITY MANAGEMENT

The essence of proactive methods is the ability to predict the behavior of a system. A number of failure prediction and avoidance techniques have been developed [7] that can be adapted to securability. Predictive securability would anticipate threats and potential attacks and predict failures that could bring down the system, allowing methods for dealing with threats, attacks and failures to be developed and deployed proactively, thus creating a truly comprehensive Proactive Securability Management.

ONE STEP AHEAD

Emerging networks and systems will pose a continuous challenge to the scientific and industrial communities, as well as to society as a whole, for decades to come. Keeping systems secure and running is becoming essential not only for critical infrastructures but for our daily lives as well. The securability challenges are many, and they will have to be faced to ensure our economic and social well-being. In addition to classical methods, Proactive Securability Management based on autonomic and trusted computing methodologies holds great promise: by anticipating and reacting to potential security violations and failures, it stays one step ahead of emerging problems.
REFERENCES

[1] E. A. Lee, “CPS foundations,” DAC 2010, pp. 737-742, 2010.
[2] J. F. Meyer, “On Evaluating the Performability of Degradable Computing Systems,” IEEE Trans. Computers 29 (8), pp. 720-731, 1980.
[3] M. Malek, “Online Dependability Assessment through Runtime Monitoring and Prediction,” EDCC 2008, p. 181, 2008.
[4] A. Shamir, “How to Share a Secret,” Commun. ACM 22 (11), pp. 612-613, 1979.
[5] M. O. Rabin, “Efficient dispersal of information for security, load balancing, and fault tolerance,” J. ACM 36 (2), pp. 335-348, 1989.
[6] V. Stantchev and M. Malek, “Architectural Translucency in Service-Oriented Architectures,” IEE Proc.-Software, vol. 153, no. 1, pp. 31-37, 2006.
[7] F. Salfner, M. Lenk, and M. Malek, “A survey of online failure prediction methods,” ACM Computing Surveys, 42 (3), Article 10, 42 pages, March 2010.
Biographical Sketch

Miroslaw Malek is professor and holder of the Chair in Computer Architecture and Communication at the Department of Computer Science at Humboldt University in Berlin. His research interests focus on dependable architectures and services in parallel, cloud, distributed and embedded computing environments, including failure prediction, dependable architectures and service availability. He has participated in two pioneering parallel computer projects, contributed to the theory and practice of parallel network design, developed the comparison-based method for system diagnosis, co-developed comprehensive WSI and network testing techniques, proposed the consensus-based framework for responsive (fault-tolerant, real-time) computer systems design and has made numerous other contributions, reflected in over 200 publications and nine books. He has supervised 26 Ph.D. dissertations and three habilitations (ten of his students are professors) and has founded, organized and co-organized numerous workshops and conferences. He has served and serves on the editorial boards of several journals and is a consultant to governments and companies on technical and strategic issues in information technology. Malek received his PhD in Computer Science from the Technical University of Wroclaw in Poland, spent 17 years as professor at the University of Texas at Austin and was also, among others, visiting professor at Stanford, Universita di Roma “La Sapienza”, Politecnico di Milano, Keio University, Technical University in Vienna, New York University, Chinese University of Hong Kong, and guest researcher at Bell Laboratories and IBM T.J. Watson Research Center.