Metrics in Risk Determination for Large-Scale Distributed Systems Maintenance

Maureen Ann Raley
University of Alabama in Huntsville
2312 South Pierce Street
Arlington, VA 22202-1519 USA
Telephone: 703-325-3510 (Fax: 703-325-6485)
Email:
[email protected]
Letha Hughes Etzkorn
Computer Science Department
University of Alabama in Huntsville
Huntsville, AL 35899 USA
Telephone: 256-824-6291 (Fax: 256-824-6239)
Email: [email protected]

Abstract
This paper presents a comparison of management metrics used in the development and maintenance of large-scale government systems dependent on information sharing through computer networking. The paper suggests that the correct choice and use of metrics increases the likelihood of project success and, further, that a more rigorous means of analyzing feedback from project progress reports could reduce the amount of rework and difficulty encountered [1]. The paper draws on data from an inventory database recorded during the nationwide upgrade of a networked computer system to propose a framework for maintenance-phase metrics, as well as to suggest a metrics suite. The paper also proposes future work using entropy metrics for risk assessment, validated against the results from the standard project metrics.

Categories and Subject Descriptors
C.5.3 [Microcomputers]: Workstations, Personal Computers; D.2.9 [Software Engineering]: Management, Quality Assurance, Configuration Management; H.1.2 [Information Systems]: Models and Principles, User/Machine Systems, Human Factors

General Terms
Management, Measurement, Performance, Human Factors, Standardization

Keywords
Project management, software maintenance, system upgrade, COTS-based systems, COTS, lessons learned, quality control, metrics, entropy metrics.

1. INTRODUCTION
When confronted with the maintenance and upgrade of a large, geographically distributed, networked computer system, one of the questions project management must address is how to assess project risk and which metrics should be chosen to best enable an assessment that will result in the delivery of a system that functions as expected, meets schedule, and is within budget, as well as adheres to well-defined parameters during the project. In general, system priorities during maintenance can be assumed to be (1) system availability, (2) data integrity, (3) performance, and (4) security, and maintenance work must ensure these priorities are not compromised [2]. The correct metrics, chosen to assess the progress of a project and the risk to completing its goals, can provide timely information about the critical points of the project. However, if the measurement indicators are chosen badly or are not well used, the results are, at best, a waste of resources and, at worst, a markedly greater probability that the project fails to meet its goals.

2. BACKGROUND
Risk analysis, standard project metrics, COTS-based systems, and dependency graphs are the key areas that provide background to this research.

2.1 Risk Analysis
A risk is a potential problem with causes and effects; in other words, a risk is an event or circumstance that has a probability of occurring and would have an adverse effect if it occurs. Increasingly, we depend on computer systems to behave acceptably in applications with extremely critical requirements, by which we mean that the failure of systems to meet their requirements may result in serious consequences, such as loss of human life and resources.

Risk management is an essential process in managing software development as well as in gauging the overall progress of system development. Risk management is a method of managing that concentrates on identifying and controlling areas or events that have a potential of causing unwanted change. Risk analysis involves an assessment of the impact of the consequences if the problem event were to occur. The key concepts of risk management can help software managers assess problem situations and formulate proactive solutions; however, too often organizations approach risk management by assuming the project will run exactly to plan. To practice risk management effectively, the project manager must be able to identify the risks; determine the possibility of each risk manifesting; estimate exposure if the risks occur; determine which risks must be managed; take action on the risks that can be controlled; and plan contingencies for risks that are beyond immediate action. Most organizations now have formal programs in risk management, and government directives often promote a risk-based approach. However, what is needed between identifying an abstract set of risks and then managing the project through risk reduction is not always clear.
2.2 Metrics
Metrics are performance data collected to provide a measure of the quality of the product or the status of the project under development. Metrics may deal with software development (faults, reliability, complexity, design or requirements stability, depth and breadth of testing, etc.). Metrics may also address project characteristics, such as cost, schedule, and manpower, providing insight into the progress made toward achieving the project goal. Metrics for software development include project management metrics and quality metrics. Project metrics are primarily focused on the continuous assessment of cost and schedule, and can include manpower stability, testing progress, and resource utilization. Projects may use Gantt or PERT charts with stoplight-colored percentages to indicate risk assessment. Software quality metrics are more concerned with "ilities," such as quality, reliability, complexity, dependability, maintainability, and stability. The Software Engineering Institute (SEI) at Carnegie Mellon University (CMU) [3] recommended operational mechanisms for three important project management functions:
• Project planning – estimating costs, schedules, and defect ratings
• Project management – tracking and controlling costs, schedules, and quality
• Project improvement – providing baseline data, tracing root causes of problems and defects, identifying changes from baseline data, and measuring trends.
A decade or so ago, a number of different initiatives proposed sets of metrics, most of which overlapped. The US Government had various sets of metrics in place for use in the development of military acquisition systems. The SEI/CMU proposed the use of differing sets of metrics at each of the levels of the Capability Maturity Model, and recommended a basic set of cost, schedule, software size (development progress), software effort (manpower), and quality (fault profiles). Table 1, "Software Metrics Sets Comparison" [4], compares these metrics sets, which are typical of software development projects, as taken from a decision paper [5] for metrics selection for a US Government acquisition project.
2.3 COTS-Based Systems
Increasingly, computing systems are structured around commercial off-the-shelf (COTS) hardware and software components. US vendors produce the majority of COTS computer hardware and software products. Government computer networks, in fact most if not all computer networks, are COTS-based. In the future, government and commercial computer-based systems are likely to increase their use of COTS, due to the wide spectrum of items available, the consistency of the product, and the low cost of purchase versus the cost of in-house development. Fundamental differences exist between COTS and conventional software development and maintenance. COTS systems require a different lifecycle methodology than traditionally built computer systems. Development and maintenance models (Waterfall, Spiral, and Prototyping, for example) and architectural modeling must address a commercial rather than an organizational approach, since software releases, updates, and bug fixes are scheduled to suit the vendor. COTS-based system testing is generally done using "black box" methodology, as the internal code is not available or may carry legal restrictions on reverse engineering or modification by the end user. Because of the different development and maintenance cycles, the metrics chosen for a COTS system may differ from those for custom-built systems, or the metrics may need to be applied differently. Metrics for COTS-based systems must therefore be chosen for their ability to function under different methodologies.
2.4 Risk Dependency Graphs and Component Dependency Graphs in Risk Analysis
Dependency graphs are typically directed graphs used as probabilistic models for reliability analysis at the architectural level. These graphs represent components, component reliabilities, link and interface reliabilities, transitions, and transition probabilities, and are used at the component level in the determination of software coupling and cohesion [1]. PERT and Gantt charts, normally used in project management, show schedule (time) dependencies by providing a graphical depiction of the interrelation of project events. Deviations in the start and completion times for each event directly impact the overall project completion date. Such schedule dependencies are an example of risk dependency graphs in project management and provide insight into one aspect (schedule risk) of project management risk analysis [1].

Table 1. Software Metrics Sets Comparison

# | Software T&E Panel (STEP) | Software Engineering Institute (SEI) | US Government AMC-P-70-14 | Objective
1 | Cost | --- | Cost Deviations | Track development expenditures
2 | Schedule | Schedule Progress | Schedule Deviations | Track progress vs. schedule; readiness to proceed to next phase
3 | Computer Resource Utilization | Computer Resource Utilization | Computer Resource Utilization | Track planned vs. actual memory utilization, I/O channels, throughput
4 | Software Engineering Environment (CMM level) | --- | Software Development Tools | Assessment of contractor's development environment
5 | Requirements Traceability | --- | --- | Traces requirements to design, code, and test
6 | Requirements Stability | Software Volatility, Unit Development Progress | Requirements Definition & Stability | Measures changes in requirements and the effect on development effort
7 | Design Stability | Software Volatility, Design Complexity | Design Structure | Track design changes and their effect on development and configuration
8 | Complexity | Software Size, Design Complexity | --- | Provides indication of potential problem areas where test effort should be focused
9 | Breadth of Testing | Testing Progress | Test Coverage | Measures the extent and success of test coverage; provides indication of sufficiency of testing
10 | Depth of Testing | Testing Progress | Test Sufficiency | Measures the degree to which the required functionality has been successfully demonstrated
11 | Fault Profiles | --- | Defect (Fault) Density | Measures the number of faults vs. time; tracks open and closed trouble reports by priority
12 | Reliability | --- | --- | Attempts to predict system downtime by tracking trouble reports using math models
13 | Manpower (optional) | Personnel | Development Manpower | Tracks personnel loading and turnover
14 | Development Progress (optional) | Design Progress, Unit Development Progress, Testing Progress | Development Progress (Completeness) | Provides percentage of development completion across all phases and work elements
15 | --- | --- | Incremental Release Content | Tracks schedule and units per release to track functional preservation
16 | --- | --- | Supportability | Indicates how easy/difficult the software will be to maintain
3. DISCUSSION
Research involving the use of component dependency graphs for risk analysis on the components of COTS-based systems took place at West Virginia University from 1999 to 2003 under Ibrahim, Wang, Yacoub, Ammar, and Robinson [6-8]. In 1999, Yacoub, Cukic, and Ammar [7] developed a probabilistic model and a reliability analysis technique named "Scenario-Based Reliability Analysis" (SBRA) for component-based software whose analysis is strictly based on execution scenarios. A probabilistic model named the "Component Dependency Graph" (CDG) was constructed, and the reliability of the application was analyzed as a function of the reliabilities of its components and interfaces. SBRA was used to identify critical components and critical component interfaces, and to investigate the sensitivity of the application reliability to changes in the reliabilities of components and their interfaces. This use of the CDG included the development of a metrics suite in 1999 by Yacoub, Ammar, and Robinson to measure the quality of designs at an early development phase. The suite consists of metrics for dynamic complexity and object coupling based on execution scenarios. The proposed measures were obtained from executable design models and were used to assess the quality of a pacemaker application.
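To make the CDG idea concrete, the sketch below computes an application reliability estimate from component and interface reliabilities along execution scenarios, in the spirit of SBRA rather than as a reproduction of the published model; the component names, reliabilities, and scenario probabilities are invented for illustration.

```python
# Illustrative sketch only: names, reliabilities, and scenario probabilities
# are hypothetical, not taken from the cited SBRA work.
component_reliability = {"UI": 0.999, "Scheduler": 0.995, "DB": 0.990}
interface_reliability = {("UI", "Scheduler"): 0.998, ("Scheduler", "DB"): 0.997}

# Each execution scenario is a path through the components, with an
# estimated probability of occurring in operation.
scenarios = [
    {"path": ["UI", "Scheduler", "DB"], "probability": 0.7},
    {"path": ["UI", "Scheduler"], "probability": 0.3},
]

def scenario_reliability(path):
    """Product of component and interface reliabilities along one path."""
    r = 1.0
    for component in path:
        r *= component_reliability[component]
    for link in zip(path, path[1:]):
        r *= interface_reliability[link]
    return r

# Application reliability as a probability-weighted combination of scenarios.
app_reliability = sum(s["probability"] * scenario_reliability(s["path"])
                      for s in scenarios)
print(f"Estimated application reliability: {app_reliability:.4f}")
```

A sensitivity analysis of the kind described above would then vary one component's reliability and observe the change in the application-level figure.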
Ibrahim, Yacoub, and Ammar in 1999 [8] explored the use of the CDG to obtain the overall system/subsystem risk factor in a modeling and simulation environment. Commercial simulation products were used to obtain simulation statistics for the development of an automated architectural risk assessment. Models were developed and used to obtain dynamic complexity and dynamic coupling measures for all architecture elements. Severity analysis was performed, and a risk factor for each architectural component was determined. Wang et al. in 2003 [9] worked on the premise that most faults in software systems are found in only a few of a system's components, whose early identification would allow the developer to focus on defect detection activities. A prototype tool called the Architecture-level Risk Assessment Tool (ARAT), based on the risk assessment methodology presented by Ibrahim, Yacoub, and Ammar in 2001, was developed to provide risk assessment based on measures obtained from CDG analysis. ARAT can be used in the software development design phase for architectural-level software risk assessment.

3.1 An Example of a Large-Scale Distributed COTS-Based System in the Maintenance Phase
In the recent past, a US Government entity undertook a massive upgrade of its nationwide, networked computer system. The purpose of the upgrade was to standardize the software and hardware throughout the agency, modernize the computers and software components, dispose of obsolete components, and introduce a layered operating system with administrative and user privileges. The overall benefit of the upgrade would be a system more conducive to configuration management control and easier to maintain, with barriers to prevent installation of end-user software (games, favorite programs) and less susceptibility to integrity compromise by external or internal hacking. The upgrade effort had a firm deadline. Any computer not meeting compliance goals at the deadline would be disconnected from the network and removed from operational use, or would be accorded strictly standalone status, meaning no transfer of data by any means other than hardcopy printout; "sneakernet" transfer by floppy drive, modem, or other electronic means was prohibited [2].

The numerous networked systems were distributed throughout the continental United States and included diverse operations centers, large data entry centers, computer centers where in-house software was developed and maintained, and field offices, where operations activities were based. The various computer subsystems formed a mixed network of mainframe computers and client/server computers. The software was predominantly commercial off-the-shelf (COTS) software for business automation, plus operating systems for the networks and the computers. A notable portion of the software consisted of in-house programs for data entry and data analysis, which also resided on the system, although the amount of in-house software was significantly less than that of the COTS software. During the upgrade, the entire nationwide computer system had to remain available and perform as usual, the data on the system had to remain uncorrupted and of the highest integrity, and the system had to remain secure. Weekly reporting on the upgrade project, which received a high level of scrutiny, was taken from the nationwide inventory database. It is from this database and the weekly reporting that we can find the basis for a suite of metrics for risk analysis for systems maintenance.
3.2 Component-Based Reporting Data
The weekly status reports were generated in hardcopy from the inventory database (Oracle) and showed the weekly status of the project for the agency's two divisions over 60 weeks using the following categories:
(1) Date of report
(2) Total devices (hardware and software items, i.e., a client computer, a software package)
(3) Number of devices that can be made compliant, irrespective of actual status
(4) Number of devices whose compliance status has no impact on the project goal, i.e., irrelevant devices
(5) Number of devices that need to be replaced
(6) Number of "isolate to standalone" devices that HAVE been isolated
(7) Number of "isolate to standalone" devices that HAVE NOT been isolated
(8) Total number of devices that CAN be modified, i.e., "modifiable" (Devices Modifiable)
(9) Number of "modifiable" devices that HAVE NOT been modified ("Devices to be Modified")
(10) Percentage compliant – [Total devices – (those not isolated + those not replaced + those not modified)] / Total devices
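The percentage-compliant figure in item (10) follows directly from the other counts; the sketch below shows one way it might be computed from a single week's record. The field names and numbers are invented for illustration and are not the actual database schema.

```python
# Hypothetical weekly report record; counts are invented for illustration.
week = {
    "total_devices": 12000,
    "not_isolated": 300,    # "isolate to standalone" devices not yet isolated
    "not_replaced": 450,    # obsolete devices not yet replaced
    "not_modified": 900,    # modifiable devices not yet modified
}

def percent_compliant(report):
    """Item (10): [total - (not isolated + not replaced + not modified)] / total."""
    noncompliant = (report["not_isolated"]
                    + report["not_replaced"]
                    + report["not_modified"])
    return 100.0 * (report["total_devices"] - noncompliant) / report["total_devices"]

print(f"Percent compliant: {percent_compliant(week):.1f}%")
```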
3.3 Component-Based Data as a Basis for a Metrics Suite for Risk Analysis
Using the inventory reporting data above, this research proposes the following suite of metrics to assess risk during the system upgrade. Note that "device" indicates a unit or an item, such as a software package or a computer.
• Device modification progress – this metric will measure the progress being made in modifying the devices that can be made compliant by applying an upgrade, i.e., software patches, circuit cards, additional memory, or a larger-capacity hard drive.
• Device isolation progress – this metric will measure the progress in isolating to standalone status the legacy systems that cannot be retired or modified. These systems contain data and legacy programs that must be accessed for a finite number of years (three to seven) before they can be retired. During this time, functions performed by these systems will be migrated to compliant systems, so that all "isolate to standalone" devices will be retired within seven years.
• Compliance growth – this metric will measure the progress of the project toward the goal of 100% compliance.
• Inventory stability – this metric will measure the disorder due to an inaccurate inventory in the inventory database system. Several circumstances could contribute to an inaccurate database. Within the client/server system, portability, the lack of restrictions on software installation (allowing installation of any software on any computer), and the previous lack of restrictions on computer relocation made "desktop" computers and their software difficult to track accurately. With any computer system, mainframe or client/server, failure to rigorously perform product inventory, as well as database entry and modification, led to a large number of missing or possibly stolen devices, although the discrepancy would be higher on the client/server side because of its size. The introduction of radio frequency identification (RFID) devices or similar automated mechanisms could alleviate inventory inaccuracies in the future.
• Device disposal progress – this metric will measure the progress in disposing of obsolete devices. Most, if not all, of these devices will be replaced by compliant equipment.
Component graphs for two of these metrics are shown in Figures 1 and 2. Each distributed site is a component providing input to the metric.

Figure 1. Risk Component Graph for Modified Devices (site components SC1–SC10 and R1–R4 as inputs to the "Devices Needing Modification" node).

Figure 2. Risk Component Graph for Standalone Devices (site components CC1–CC3, HQ1–HQ7, NC1–NC3, and Aux as inputs to the "Devices Isolate to Standalone" node).
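As a rough illustration of how each distributed site feeds the component graph of Figure 1, the sketch below aggregates per-site counts of devices still needing modification into the overall device modification progress metric. The site labels mirror the figure, but the counts and the total are invented for illustration.

```python
# Hypothetical per-site input to the "Devices Needing Modification" component
# graph of Figure 1; site labels follow the figure, counts are invented.
devices_needing_modification = {
    "SC1": 40, "SC2": 25, "SC3": 60, "SC4": 10, "SC5": 5,
    "SC6": 15, "SC7": 30, "SC8": 20, "SC9": 35, "SC10": 45,
    "R1": 8, "R2": 12, "R3": 7, "R4": 3,
}

def modification_progress(outstanding_by_site, total_modifiable):
    """Device modification progress: share of modifiable devices already modified."""
    outstanding = sum(outstanding_by_site.values())
    return 100.0 * (total_modifiable - outstanding) / total_modifiable

print(f"Modification progress: "
      f"{modification_progress(devices_needing_modification, 1500):.1f}%")
```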
4. FUTURE STUDIES
Because entropy measures the amount of uncertainty in a system, the next step in this research will be to develop a complete set of risk component graphs, in addition to the two illustrated in Figures 1 and 2, as input to a suite of entropy metrics, and to examine their effectiveness for risk analysis and identification in the maintenance phase of large-scale computer system upgrades. The output of the proposed entropy metrics on the risk component graphs will be validated against the weekly project status reports taken from the inventory database.
4.1 Entropy, as Applied to Information Theory
Information theory deals with assessing the amount of information in a message. The basis of information theory was laid out in Shannon's seminal paper, "A Mathematical Theory of Communication" [10]. Shannon's theory focuses on measuring the uncertainty that is related to information. He proposed to measure the amount of uncertainty, or entropy, in a distribution by the following definition: let X be a discrete random variable on a finite set X = {x_1, x_2, ..., x_n}, with probability distribution function p(x) = Pr(X = x). The entropy H(X) is defined as

H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)

and, if we partition the outcomes into groups, as with our inventory reporting data, the entropy of the resulting probability distribution P = (p_1, ..., p_n) over the n groups is

H_n(P) = -\sum_{k=1}^{n} p_k \log_2 p_k, where p_k \ge 0 for all k \in \{1, 2, ..., n\} and \sum_{k=1}^{n} p_k = 1.
By defining the amount of uncertainty in a distribution, H_n describes the minimum number of bits required to uniquely distinguish the distribution. In other words, it defines the best possible compression for the distribution, that is, for the output of the system. This fact has been used to measure the quality of compression techniques against the theoretically possible minimum compressed size [11].
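The following is a short sketch of the calculation defined above, assuming (hypothetically) that the groups are distributed sites and the counts are each site's outstanding non-compliant devices: an even spread across sites yields high entropy (high disorder), while concentration in a single site drives the value toward zero.

```python
import math

def entropy(counts):
    """Shannon entropy H = -sum(p_k * log2 p_k) over the nonzero groups."""
    total = sum(counts)
    probabilities = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probabilities)

# Hypothetical outstanding-device counts per site (same idea as Figure 1).
print(entropy([40, 25, 60, 10, 5]))   # spread across sites -> higher entropy
print(entropy([135, 1, 1, 1, 2]))     # concentrated in one site -> lower entropy
```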
4.2 Entropy Metrics
Chapin in 1988 [12] examined the software development lifecycle and the maintenance cycle and determined that the two life cycles differ both in which stages are included and in the relative effort applied to the stages. He concluded that the differences have major implications for the choice of aids and tools, and for the management of application software maintenance or support. In 1989 [13], he proposed an entropy metric for use during the maintenance cycle based on message flow data. He expanded this research into COTS-based systems consisting of COTS software components, reused software, and object-oriented software, and proposed a metric to measure the complexity of the interaction between the diverse components in order to assess the changes in system complexity affecting maintainability.
The use of an entropy-based approach for predicting uncertainty in large-scale project management was researched in 1981 by Martin, Lenz, and Glover [14] for predicting cost growth in an Air Force weapons system development project. Probabilities were determined using input from a questionnaire-based survey of the principal individuals involved in the development process. The results using this method were accurate to within 3% of the actual program cost. Because cost projections for weapons systems are notoriously inaccurate (the cost goal for this particular system was about one-third of the actual cost), Martin's entropy cost metric was considered a preferred method of cost growth prediction.
Hassan and Holt [11] examined software complexity from a different perspective, focusing on the process instead of the code. The authors theorized that a chaotic development process would negatively affect the outcome of the code, and introduced an entropy metric based on the frequency with which a module is modified over a finite time period. By comparing entropy and modification count during modification of open source operating system software modules, they found a 13-45% improvement in prediction error when using entropy.
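As a hedged sketch of the Hassan and Holt idea (a simplified reading, not their exact formulation), the entropy of the distribution of modifications across modules in a period can be read as a process-complexity signal. The module names and change log below are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical change log: which module was touched in each modification
# during one reporting period (module names are invented).
changes = ["inventory.c", "report.c", "inventory.c", "net.c",
           "inventory.c", "report.c", "inventory.c"]

def change_entropy(change_log):
    """Entropy of the modification-frequency distribution across modules."""
    counts = Counter(change_log)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Higher entropy means changes are scattered across many modules, which
# Hassan and Holt associate with a more chaotic development process.
print(f"Change entropy: {change_entropy(changes):.3f} bits")
```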
5. SUMMARY
An effective suite of metrics for risk identification and analysis in the maintenance phase of large distributed COTS-based computer systems, when used properly, can improve the probability that the project is completed successfully. It can also reduce the amount of rework brought on by failure to adapt management strategy to changes in system status, by enabling early identification of risk. Assuming the appropriate risk mitigation measures are enacted, early identification of risks can make the difference as to whether a project reaches its goals, stays within budget, and finishes on time. Further study using both project management metrics and entropy metrics, as well as a mix of the two, is planned using data gathered during the nationwide system upgrade. The new areas examined by this research are the application of entropy metrics to risk component graphs, the comparison of results obtained from entropy metrics and traditional project metrics, and the determination of an appropriate set of metrics to analyze risk.
6. REFERENCES
1. Raley, M.A. and L.H. Etzkorn, in Proceedings of the International Conference on Software Engineering Research and Practice (SERP '04), 2004, Las Vegas, NV: CSREA Press.
2. Raley, M.A. and L.H. Etzkorn, Case Study: Lessons Learned During a Nationwide Computer System Upgrade, in ACM SE 2004, 2004, Huntsville, AL: ACM.
3. Carleton, A.D., et al., Software Measurement for DoD Systems: Recommendations for Initial Core Measures, 1992, Software Engineering Institute, Carnegie Mellon University: Pittsburgh, PA 15213.
4. Raley, M.A., Multiple Launch Rocket System (MLRS) Improved Fire Control System (IFCS) Software Metrics, in Apple Clarisworks, 1993, Project Executive Office (PEO) Tactical Missiles (TM) Multiple Launch Project Office (MLRS): Redstone Arsenal, AL.
5. Raley, M.A. and H.E. Wright, Improved Fire Control System (IFCS) Contract DAAH01-92-C0432 Definition (Cost Reduction) -- Software Metrics, 1993, Project Executive Office (PEO) Tactical Missiles (TM) Multiple Launch Project Office (MLRS): Redstone Arsenal, AL, p. 2.
6. Ibrahim, A., S.M. Yacoub, and H.H. Ammar, Architectural-Level Risk Analysis for UML Dynamic Specifications, in 9th International Conference on Software Quality Management (SQM 2001), 2001, Loughborough University, Leicestershire, England LE11 3TU: British Computer Society.
7. Yacoub, S.M., B. Cukic, and H.H. Ammar, Scenario-Based Reliability Analysis of Component-Based Software, in Tenth International Symposium on Software Reliability Engineering (ISSRE '99), 1999, Boca Raton, FL: IEEE Computer Society Press.
8. Yacoub, S.M., H.H. Ammar, and T. Robinson, Dynamic Metrics for Object Oriented Designs, in Sixth IEEE International Symposium on Software Metrics, 1999, Boca Raton, FL: IEEE Computer Society Press.
9. Wang, T., et al., Architectural Level Risk Assessment Tool Based on UML Specifications, in 25th International Conference on Software Engineering, 2003, Portland, OR: IEEE Computer Society Press.
10. Shannon, C.E., A Mathematical Theory of Communication, The Bell System Technical Journal, 1948, 27: p. 379-423, 623-656.
11. Hassan, A.E. and R.C. Holt, Studying the Chaos of Code Development, in International Workshop on Principles of Software Evolution, 2003, Helsinki, Finland.
12. Chapin, N., Software Maintenance Lifecycle, in 1988 International Conference on Software Maintenance (ICSM '88), 1988, Arizona: IEEE Computer Society Press.
13. Chapin, N., An Entropy Metric for Software Maintainability, in Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences, 1989, Los Alamitos, CA: IEEE Computer Society Press.
14. Martin, M.D., J.O. Lenz, and W.L. Glover, Uncertainty Analysis for Program Management, A Decade of Project Management [web page], 1981, Available from: http://www.welchco.com/02/14/01/60/81/01/01.