SECURITY AND SURVIVABILITY OF DISTRIBUTED SYSTEMS: AN OVERVIEW Kyandoghere Kyamakya Klaus Jobmann Michael Meincke University of Hannover Institute for Communications Engineering (IANT) Hannover, Germany

0-7803-6521-6/$10.00 (C) 2000 IEEE

ABSTRACT

Society is growing increasingly dependent upon large-scale, highly distributed systems that operate in unbounded network environments which, like the Internet, have no central administrative control and no unified security policy. Despite the best efforts of security practitioners, no amount of system hardening can assure that a system connected to an unbounded network will be invulnerable to attack. The discipline of network survivability and security can help ensure that such systems deliver essential services and maintain essential properties such as integrity, confidentiality and performance, despite the presence of intrusion. Unlike traditional security policies, which require a central control instance or administration, survivability is intended to address unbounded network environments. Furthermore, since survivability requires robustness under conditions of intrusion, failure, or accident, it includes the concept of fault tolerance. This paper formulates the basic issues to be solved in this new field, discusses and comments on some current solution concepts, and finally outlines the most challenging future research avenues.

1. THE CONTEXT

Information technology penetrates into many aspects of life for an increasing number of people throughout the world, enriching us but also producing systems of such complexity that they create new dependencies and risks to society. Therefore, policies must be adopted and processed to distribute this infrastructure and related services and to protect them from damage and misuse. Normally, infrastructure systems are tested daily by accidents, natural disasters, and human errors, and the engineers, managers and network administrators have substantial opportunity to harden their systems, learn from errors, and prepare for future stresses. Were this the only concern, societies might take comfort in relying on professional and economic drivers for greater telecommunications and information systems safety and reliability. But these infrastructures face not only the random process of failure and error; they are also frequently maliciously attacked through the very devices that otherwise enhance their operation.

Currently in the survivability field considerable emphasis is placed upon providing robust hardware and software as the enabling technology for system trustworthiness. As the information society expands, issues of information management may take an increasingly dominant role. Within the survivability concept it is explicitly acknowledged that totally secure systems are currently not practical, and hence strategies must be adopted to manage the (uninterrupted) flow of services and information under attack or high-stress conditions.

Fig.1: Building blocks

There are two principal factors that are involved in the definition of survivability requirements on computer systems in the Information Society scenario [9]:

• Information systems are becoming critical because they are relied upon for an increasing range of functions and services that underpin the social infrastructure.
• These information systems are dependent on computing and network infrastructures which, due to technological convergence of information and communication technology and economic drivers, are becoming increasingly homogeneous and “open”. Some of them, such as the Internet, are inherently insecure and provide little guarantee of service quality. Software applications, on the other hand, are becoming increasingly heterogeneous, leading to problems with generalizing security practices.

Fig.2: Threat Classes

We assume that it is feasible to identify the presence of critical functions in a networked information infrastructure (see Fig.1), to define dependability requirements for functions with challenging characteristics (such as distribution, legacy, mobility, changing threats) and finally to allocate these requirements to (sub)-systems and components (see Fig.2).

2. THE PROBLEM

Experience with network systems has shown that no amount of hardening can guarantee invulnerability to attack. Despite best efforts, systems will continue to be breached¹. Thus, it is vital to expand the current view of information systems security to encompass system behavior that contributes to survivability in spite of intrusions or accidents. Network systems must be robust in the presence of attack and be able to survive attacks that cannot be completely repelled. Survivability requirements give rise to the following fundamental questions:

• What are the critical functions and services in such an emerging scenario, and what are the survivability needs that our society requires?
• What types of network services and computing systems are required to support these critical services with their survivability requirements?
• How can a minimum essential service level be guaranteed, and what comprises an essential service?
• What mechanisms need to be introduced to allow the recovery, following a failure, of ‘global’ systems and services?
• And, significantly, how can existing infrastructures sustain such requirements?

Finding answers and technical solutions to these questions is critical for the information society if the privacy, security and safety of industry, commerce, government and individuals are to be assured [1].

¹ The US Defense Information Systems Agency reports that the Department of Defense is attacked 250,000 times a year. Los Alamos National Laboratories is attacked daily, with 22 proven outsider intrusions in the last five months. From “Security Measures”, Albuquerque Journal, March 24, 1998, pp. B1-B2.

2.1. TECHNOLOGICAL CHARACTERISTICS

The information infrastructure outlined above (Fig.1) has a set of technological characteristics that are considered the main drivers of the new survivability/dependability concerns:

• Large-scale distribution: Computer communications infrastructures on which emerging services and applications will be built are evolving towards large-scale distributed infrastructures, with a wide geographic span and the inclusion of a high variety of legacy components. Due to legacy and configurability, infrastructures become unbounded in structure and functions.
• “Open” systems: Due to technological convergence and economic drivers, computing platforms, software and communications systems are becoming increasingly homogeneous and “open”. Computing platforms are uniform in their operating system (Microsoft Windows or UNIX). Wide area networks are constructed using routers, most of which are sold by a small number of manufacturers (Cisco Systems, for example). The protocol stacks are also increasingly uniform. A single protocol stack, descending from the core TCP/IP protocol, manages the whole Internet.
• Integration of systems: Interconnection between communications infrastructures and other large-scale systems adds to the complexity of the interactions between components (concept of “system of systems”). These systems are becoming interoperable and interdependent. In addition, the trustworthiness of the communications infrastructure becomes a key factor in the overall dependability of the system.
• Mobility and flexibility: Geographical as well as functional flexibility is exacerbated by the introduction in networks of mobile and adaptable elements with embedded software (e.g. mobile computing, mobile code and personal communications). Systems are subject to varied user and operational environments.
• Universality: Hardware, software, and information infrastructures are becoming pervasive across all social, educational, commercial and industrial sectors. The much discussed concept of an Information Society is rapidly becoming an everyday reality.

2.2. SURVIVABILITY CONCERNS

The problems related to the survivability (dependability) of distributed systems can be grouped under two broad classes: Complexity and Vulnerability.

COMPLEXITY

The rapid advances in distributed systems and services, largely facilitated by the availability of high-bandwidth digital communication networks, have severely escalated the complexity of system design, implementation and analysis. This complexity is a dependability concern for the following reasons: scalability, system complexity, service complexity, infrastructure complexity, failure model uncertainty, technological evolution, global environments.

VULNERABILITY

The deployment of distributed systems across open communication networks, such as the Internet, creates a substantial new set of operational threats and vulnerabilities:

• Multiple threat types.
• Dependence on public network infrastructures: the majority of applications and service providers will depend on (public) network infrastructures to integrate their distributed high-level services in a secure and reliable way.
• Cascade effect: network infrastructures (Internet, intranets, fixed phone, wireless) are becoming interrelated through the physical hardware they use. Interconnection between various network infrastructures such as health, commerce, finance, etc. increases the vulnerability of the latter to possible cascade effects [2].
• User profile: the market for the majority of highly-deployed products and services is composed of non-specialist or poorly trained users, who rarely have either the appropriate professional background or the organizational support to manage security or dependability issues.
• Mobile code/agents: mobile agents offer a new paradigm for distributed computing. Java “applets”, IBM “Aglet” or Microsoft’s ActiveX controls are three examples that attract much attention in the literature and press. A mobile agent system provides a distributed computing infrastructure on which applications belonging to different (usually untrusted) agent owners can execute concurrently.
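The cascade effect named in Section 2.2 can be illustrated with a small dependency-graph sketch. This is not a method from the cited literature; it is a minimal illustration under a deliberately pessimistic assumption, namely that an infrastructure fails as soon as any one of the infrastructures it relies on has failed. The function name `cascade` and the dependency map `DEPENDENTS` are hypothetical and chosen only for this example.

```python
from collections import deque


def cascade(dependents, initially_failed):
    """Return every infrastructure that fails, directly or transitively.

    `dependents` maps an infrastructure to the infrastructures that rely
    on it. Assumption (illustrative only): a dependent fails as soon as
    any single provider it relies on has failed.
    """
    failed = set(initially_failed)
    queue = deque(initially_failed)
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, ()):
            if dependent not in failed:
                failed.add(dependent)
                queue.append(dependent)
    return failed


# Hypothetical dependency map, not taken from the paper.
DEPENDENTS = {
    "power": ["internet", "fixed_phone"],
    "internet": ["commerce", "finance"],
    "fixed_phone": ["health"],
    "finance": ["commerce"],
}

if __name__ == "__main__":
    # A fault in one infrastructure propagates to those built on it.
    print(sorted(cascade(DEPENDENTS, ["internet"])))
```

Even in this toy model, a single fault low in the dependency chain takes every infrastructure above it with it, which is why interconnection raises the vulnerability of sectors such as health, commerce and finance.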

3. THE SOLUTION

Basically, the ideal solution approach should rely on the fact that systems can exhibit large variations in survivability requirements. Thus, survivability requires that system requirements be organized into essential services and non-essential services, perhaps organized in terms of user categories or business criticality. Essential services must be maintained even during successful intrusion; non-essential services are to be recovered after intrusions have been dealt with. Essential services may be further stratified into a number of levels, each embodying fewer and more vital services, as a function of increasing severity and duration of intrusion. Concerning survivability under physical attacks that destroy links and nodes in networks, the best approach consists in the design and analysis of multiple-priority (traffic) restoration techniques that provide service continuity, where possible while also minimizing network congestion [3,4,5].

It is also possible that the set of essential services varies in a more dynamic manner, depending on a particular attack scenario and the resulting situation. In this dynamic case, services that are essential under one scenario may not be essential under another, resulting in different combinations of essential services that are scenario dependent. Thus, definitions of requirements for essential services must be augmented with appropriate survivability requirements.

The information survivability program at DARPA defines survivability as: “The ability of a system to continue the adequate performance of its critical services and functions even after (unforeseen) successful attacks have taken place” [6]. From this definition it is apparent that one novel aspect of survivability (compared to traditional dependability definitions) is the ability to handle the effects of an attack and adjust a system accordingly, whereas traditional security techniques rely on excluding the possibility of intrusion. Due to its nature, the survivability scope can usefully be extended to include:

• Accidental failures and events
• Networked infrastructures and their emerging services such as electronic commerce
• The cascade effect caused by the unexpected propagation of fault events across networked systems

3.1. ROBUST SYSTEMS

To deliver survivable, trustworthy information systems there are many fundamental issues that need to be addressed in the design of computer systems. Two key themes are fault tolerance and secure operating environments.

FAULT TOLERANCE

With the development of complex distributed systems and ‘systems of systems’, it will be essential to deploy systems that are robust to failure. This could be achieved by developing computer systems that are (autonomously) dynamically reconfigurable and modular. These fault-tolerant computing methods are deployed in real-time control and embedded systems but have yet to be widely exploited in the context of information systems [11, 12].

SECURE OPERATING ENVIRONMENTS

Most operating systems are designed with security as an ‘add-on’ feature rather than as an intrinsic property of the system. As a result, security and security policies are poorly implemented, often weak, and largely neglected. Operating systems are required that make security an active but unobtrusive feature of the system. Cryptography, firewalls, partitioning, run-time environments, interfaces, intrusion detection and isolation must all be considered [13].

3.2. SECURE END-TO-END NETWORKING

The Internet was developed as an ‘open’ digital medium. This characteristic renders it highly unsuitable for dependable applications. Methods must be developed to provide secure end-to-end networking which do not detract from the accessibility of the medium, yet maintain integrity of data and continuity of operation. Many technical issues are involved, including cryptography, key signatures and management systems, and secure mobile code (e.g. ‘wrappers’), as well as non-technical issues such as electronic commerce and privacy laws.

3.3. DESIGN, ANALYSIS & INTEGRATION

Design and analysis methods are required to assess the integrity and security of a networked information system, particularly in the (common) case where legacy systems are integrated. Formal methods may be one approach; others need to be considered. Two representative analysis methods may be cited: a graph-based approach to network survivability analysis [7] and the SNA (Survivable Network Analysis), a method for survivability analysis developed by CERT [6].

3.4. SYSTEM VALIDATION & VERIFICATION

The failure modes of complex, distributed systems are not understood, nor have they been adequately modeled. Multi-disciplinary analysis techniques need to be developed (or extended from existing techniques) to facilitate dependable design methods. New intrusion detection and diagnosis methods should support operation of critical functions in the presence of attacks under failures.

4. ESTABLISHED RESEARCH AVENUES

There is no single technical solution that will lead to the successful implementation of survivable network systems. The dependability concerns are spread across a broad range of hardware, software and networking themes (in addition to social and legal issues). Consequently, the survivability programs provide a focussed framework to draw together many disparate computer science, electronics, communications and security themes in a multidisciplinary approach to the total problem. A number of research programs have been instigated (predominantly in the US and Europe, but also in other industrial nations, though at smaller scales) which seek to meet the research goals posed by the quest for ‘survivability’ or ‘critical infrastructure protection’. Overall, these research programs revolve around four key themes:

• High confidence networking (& fault tolerance)
• High confidence computing systems
• Wrappers and composition
• Survivability of large-scale systems

4.1. US DARPA SURVIVABILITY PROGRAMS

The US has established a significant survivability research program led by DARPA [8]. In substance, the program exploits established research methods in areas such as distributed operating systems, cryptography and fault tolerance, but also incorporates more radical viewpoints, such as the design of operating systems using principles observed in the human immune system. Let us briefly present, as a sample, two DARPA research programs.

One of these programs is “Self-Configuring Survivable Multi-Networks for Information Systems: COSMOS” (URL: www.cstp.umkc.edu/research/cosmos). This project specifically addresses survivability under physical attacks that destroy links and nodes in networks, but its authors expect that many of their results will extend to non-lethal attacks which destroy or corrupt network control information and databases. In their efforts, they seek to develop a comprehensive set of solutions for the network design and management aspect of providing adequate service continuity in the event of a major attack on multi-networks.

The CERT Coordination Center is active in another interesting project, where it is developing a Survivable Network Analysis (SNA) method to evaluate the survivability of systems in the context of attack scenarios. Also under development is a Survivable System Simulator that will provide for the analysis, testing, and evaluation of survivability solutions in unbounded networks (URL: www.sei.cmu.edu/organization/programs/protect-critical-systems.html).
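The idea of evaluating survivability against an attack scenario that destroys nodes can be sketched in a few lines. The code below is neither the COSMOS design nor the CERT SNA method; it is a minimal illustration that assumes a service survives when its client can still reach its server over undestroyed nodes. The topology `EDGES`, the service table `SERVICES` and all function names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def reachable(adjacency, source, destroyed):
    """Breadth-first search over the nodes that were not destroyed."""
    if source in destroyed:
        return set()
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in destroyed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def service_status(adjacency, services, destroyed):
    """Map each service name to True if its client still reaches its server."""
    destroyed = set(destroyed)
    return {
        name: server in reachable(adjacency, client, destroyed)
        for name, (client, server, _essential) in services.items()
    }


def essential_services_survive(adjacency, services, destroyed):
    """True if every service marked essential survives the scenario."""
    status = service_status(adjacency, services, destroyed)
    return all(
        status[name]
        for name, (_client, _server, essential) in services.items()
        if essential
    )


# Hypothetical topology and services: name -> (client, server, essential).
EDGES = [("a", "b"), ("b", "d"), ("a", "c"), ("c", "d"), ("d", "e")]
SERVICES = {"billing": ("a", "d", True), "archive": ("a", "e", False)}
NETWORK = build_adjacency(EDGES)

if __name__ == "__main__":
    for scenario in (["b"], ["e"], ["b", "c"]):
        print(scenario, service_status(NETWORK, SERVICES, scenario))
```

In this toy network the redundant path through `c` keeps the essential service alive when `b` is destroyed, while losing both `b` and `c` isolates the client. A real analysis would also have to consider capacity, traffic priorities and corrupted control information, which this sketch ignores.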

4.2. EUROPEAN SURVIVABILITY PROGRAMS*

Critical infrastructures do exist in the various nations of the EU (European Union), mostly regulated by local/national legislation. Some of them, such as utilities and financial services, cross the borders of individual nations and have done so for decades. In this respect, initiatives are underway for EU regulation in the areas of privacy and key management. Table 1 (at the end of this paper) summarizes an illustrative (rather than exhaustive) list of EU funded projects and the main issues addressed.

* Details of CEU funded projects can be found on the CORDIS database, URL: http://www.cordis.lu/info/frames/if009_en.htm

4.3. OTHER RESEARCH INITIATIVES WORLD-WIDE

Because information survivability and security is such a topical theme, one can observe many other research initiatives world-wide (not only in Europe and the US), funded by either governments or industry. The results of all these initiatives are published in an ever increasing number of conferences, workshops, magazines and journals, thus providing a significant contribution to the emergence of a really secure information society.

5. PROMISING RESEARCH DIRECTIONS FOR THE FUTURE

Survivability will remain an interesting research field for the future. Reference [6] provides a good summary of some of the challenging research issues in the area of survivable systems:

• Adapting and developing architectural description techniques that adequately describe large-scale distributed systems with survivability attributes.
• Intruder usage models.
• Representation of intruder environments.
• Evaluation models of survivability as a global emergent property from architectural specification.
• Pilot tests of real distributed systems for refining the analysis technology and instruments.

6. CONCLUSION

The economic and social benefits of Communication and Information Systems are self-evident. However, unless appropriate measures are taken to create dependable systems, there is a substantial risk that individual and corporate security, privacy and safety could be compromised. Communication and Information Systems based on current technology and methods, particularly those that are distributed and interconnected, are highly susceptible to new types of vulnerabilities. These vulnerabilities may be attributed to poor design, poor implementation, misuse or malicious attack. Consequently, major initiatives are required to promote the development of trustworthy Communication and Information Systems. Many of the pertinent issues are being explored within the theme of information system survivability.

This paper has first described the context of an information society that relies on distributed networked infrastructures. These infrastructures are unfortunately highly insecure. The fundamental problem is that no amount of hardening can guarantee invulnerability to attack. The substance of the solution to this fundamental problem has been structured around answers to a series of fundamental questions. Finally, it has been underlined that the purpose and goal of all research initiatives (present and future) is to find adequate answers and technical solutions to these critical questions for the information society. Furthermore, most research initiatives have one characteristic in common: they basically exploit established research methods in areas such as distributed operating systems, cryptography and fault tolerance, and some even incorporate more radical viewpoints, such as the design of systems using principles observed in the human immune system.

REFERENCES

[1] T. Jackson, M. Wilikens, Survivability of Networked Information Systems and Infrastructures: First Deliverable of an explanatory study. European Commission Special Report JRC/ISIS/STA/DAS/Projects/Survivability/Study, Dec. 1998, pp. 1-37.
[2] Report to the President’s Commission on Critical Infrastructure Protection. Special Report CMU/SEI-97-SR-003, January 1997. Software Engineering Institute, Carnegie Mellon Univ. http://www.sei.cmu.edu/pub/documents/97.reports/
[3] D. Medhi and D. Tipper, “Multi-Layered Network Survivability - Models, Analysis, Architecture, Framework and Implementation: An Overview,” in Proc. of the DARPA Information Survivability Conference and Exposition (DISCEX 2000), Hilton Head Island, South Carolina, January 25-27, 2000.
[4] W.D. Grover, “Distributed Restoration of the Transport Network,” in Telecommunication Network Management into the 21st Century: Techniques, Standards, Technologies, and Applications, IEEE Press, 1993.
[5] K. Kyandoghere, “VP Control for ATM Networks with Call-Level QoS Guarantees,” IEICE Transactions on Communications, Vol. E81-B, No. 1, January 1998, pp. 32-44.
[6] R.J. Ellison, D.A. Fischer, R.C. Linger, H.F. Lipson, T. Longstaff, N.R. Mead, Survivable Network Systems: An Emerging Discipline, Technical Report CMU/SEI-97-TR-013, ESC-TR-97-013, CERT, 1999.
[7] C. Phillips and L.P. Swiler, “A Graph-Based System for Network-Vulnerability Analysis,” ACM 1998 NSPW 9/98, (1999 ACM 1-58113-168-2), Charlottesville, VA, USA.
[8] Andrew S. Riddle, Peter A. Wilson and Roger C. Molander, Strategic Information Warfare, National Defense Research Institute (US), RAND. Published 1996 by RAND.
[9] A. Avizienis, H. Kopetz, J.C. Laprie (eds), Dependability: Basic Concepts and Terminology, Springer Verlag.
[10] Encyclopedia of Computer Science, Third Edition. IEEE Press. Edited by A. Ralston, E.D. Reilly.
[11] A. Avizienis, “Toward Systematic Design of Fault-Tolerant Systems”, Computers, Vol. 30, No. 4, April 1997.
[12] F. C. Gärtner, “Fundamentals of Fault-Tolerant Distributed Computing in Asynchronous Environments”, ACM Computing Surveys, Vol. 31, Issue 1, 1999.
[13] P. A. Loscocco et al., “The Inevitability of Failure: The Flawed Assumption of Security in Modern Computing Environments”, Proc. of the 21st National Information Systems Security Conference, pages 304-314, October 1998. (URL: http://www.cs.utah.edu/flux/flask)

Table 1: A sample of EU funded Survivability projects

| Key Theme | Program ID | Issues addressed |
|---|---|---|
| High Confidence Networking | E2S - End to End Security over the Internet (ESPRIT) | Contribute to the growth of electronic commerce on the Internet by developing and installing end-to-end security mechanisms for commercial transactions using the Internet infrastructure. |
| High Confidence Networking | IMMUNE - End to End Survivable Broadband Network (RACE) | The key issues that the project addressed were the design and planning of survivable networks and distributed recovery techniques for networks and multi-layer networks. |
| High Confidence Computing | DELTA 4 (ESPRIT) | Formulate, develop and demonstrate an open system, fault-tolerant, distributed computer connection architecture conforming to the OSI model. The architecture was capable of being configured to support a range of performances and dependabilities and to manage distributed processing, as well as offering transparent fault-tolerant and network management to the user. |
| Wrappers and Composition | RENAISSANCE - Methods and Tools Support For the Evolution And Re-Engineering of Legacy Systems (ESPRIT) | Develop a systematic method for system evolution and re-engineering which is geared to the requirements of the commercial systems domain, and which can maximize the investment in legacy systems and information. The method takes into account technology changes which have led to customer pressure to migrate applications from centralized mainframes to object-based, distributed client-server systems. |
| Survivability of Large Scale Systems in EU | MV2VTS - Multi Modal Verification for Teleservices and Security Applications (ACTS) | Addresses the issues of secured access to local and centralized services in a multi-media environment. The main objective is to extend the scope of application of network-based services by adding novel and intelligent functions, enabled by automatic verification systems combining multimodal strategies (secured access based on speech, image and other information). |
| Other related Research Areas | | Cryptography, mobile code and agents, fault tolerant systems, formal methods, complex system analysis. |
