International Journal of Distributed Systems and Technologies, 4(2), 1-14, April-June 2013
Testing-Effort Dependent Software Reliability Model for Distributed Systems

Omar Shatnawi, Department of Computer Science, Al al-Bayt University, Mafraq, Jordan
ABSTRACT

Distributed systems are being developed in the context of the client-server architecture. Client-server architectures dominate the landscape of computer-based systems, and client-server systems are developed using the classical software engineering activities. Developing distributed systems is an activity that consumes time and resources. Even though the degree of automation of software development activities has increased, resources remain an important limitation. Reusability is widely believed to be a key direction for improving software development productivity and quality. Software metrics are needed to identify where resources are required; they are an extremely important source of information for decision making. In this paper, an attempt has been made to describe the relationship between the calendar time, the fault removal process, and the testing-effort consumption in a distributed development environment. The software fault removal phenomenon and the testing-effort expenditures are described by a non-homogeneous Poisson process (NHPP) and testing-effort curves, respectively. Actual software reliability data cited in the literature have been used to demonstrate the proposed model. The results are fairly encouraging.

Keywords:
Distributed Development Environment, Fault Severity, Software Component, Software Engineering, Software Reliability, Software Testing-Effort
INTRODUCTION

The software development environment is changing from a host-concentrated one to a distributed one due to cost and quality aspects and the rapid growth in network computing technologies. Distributed systems are being developed in the context of the client-server architecture. Client-server architectures dominate the landscape of computer-based systems. Everything from automatic teller networks to the Internet exists because software residing
on one computer—the client—requests services and/or data from another computer—the server. Client-server software engineering blends conventional principles, concepts, and methods with elements of object-oriented and component-based software engineering to create client-server systems. Client-server systems are developed using the classical software engineering activities. Distributed systems are growing rapidly in response to the improvement of computer hardware and software, and this is matched by
DOI: 10.4018/jdst.2013040101
the evolution of the technologies involved (Zhao et al., 2010). Developing large-scale distributed systems (LSDS) is complex, time-consuming, and expensive. LSDS developments are now common in air traffic control, telecommunications, defense, and space. In these systems a release will often be developed over 2-4 years and cost in excess of 200-300 person-years of development effort (Kapur et al., 2004b). Due to their complexity, LSDS are hardly ever "perfect" (Lavinia et al., 2011). Features define the content of a release. The features are realized by mapping the full set of system feature requirements across the various components. Normally this involves building on a large existing software base made up of existing components that are then modified and extended to engineer the new release. Requirement changes are common in these developments because of the long lead times involved. Change is inevitable, since user requirements, component interfaces, and developers' understanding of their application domain all change. This adds to the complexity of planning and controlling the in-progress development, both at the level of the individual components and at the level of the release. A further complexity factor is the integration and validation of the release once the individual components are delivered. These large-scale systems must achieve high reliability because of their nature. In the final release validation, software defects arise not only from the new software but also from latent defects in the existing code. Substantial regression tests must be run to check out the existing software base, which may amount to millions of lines of code (Kapur et al., 2004b).

Successful operation of any computer system depends largely on its software components. Thus, it is very important to ensure the quality of the underlying software, in the sense that it performs the functions it is designed and built for. To express the quality of the software to the end users, objective attributes such as reliability and availability should be measured. Software reliability is the most dynamic quality attribute (metric), which can measure and predict the operational quality of the software. A software reliability model (SRM) is a tool that can be used to evaluate the software
quantitatively, develop test cases, track schedule status, and monitor the change in reliability performance. In particular, SRMs that describe software failure occurrence or the fault removal phenomenon in the system testing phase are called software reliability growth models (SRGMs). Among others, non-homogeneous Poisson process (NHPP) models can be easily applied in actual software development. Some SRGMs relate the cumulative number of faults detected by software testing to the testing period (Lyu, 1996; Xie, 1991; Musa, 1999; Kapur et al., 1999; Pham, 2000; Shatnawi, 2009a). These models assume the testing-effort to be constant throughout the testing period. Other SRGMs have incorporated the expenditures due to testing-effort (Yamada et al., 1985; Kuo et al., 2001; Kapur & Bardhan, 2002; Kapur et al., 2006; Huang et al., 2007; Kapur et al., 2008). They assume that the fault detection rate is proportional to the current fault content and the software testing-effort expenditures. Testing-effort expenditures are the resources spent on software testing. It has been observed that the relationship between the testing time and the corresponding number of faults removed is either exponential or S-shaped. An interesting inference can be drawn from this analysis: all these models are robust, can be used for any testing environment, and can be termed black-box models, since they are applied without any information about the nature of the software being tested. However, to develop what is called a white-box model, one needs to know about the software technology that has been used to develop the software. Thus, it is imperative to clearly understand the software development environment, and accordingly there is a need to develop a model that can explicitly account for the software technology used to develop the software now being tested. Such a modelling approach was adopted earlier by Pham (2000), Shatnawi (2009b), Kapur et al. (2004a), Ohba (1984), and Yamada et al. (1985). Such an approach is very well suited to object-oriented programming and distributed development environments (Kapur et al., 2004a; Shatnawi, 2009b).
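To make the testing-effort-dependent formulation concrete, the following sketch (ours, not taken from the cited works) illustrates an NHPP mean value function in the style of Yamada et al. (1985), in which the fault detection rate per unit of testing-effort is proportional to the remaining fault content; the function names and parameter values are purely illustrative.

import numpy as np

def cumulative_effort(t, alpha, beta, gamma):
    # Cumulative testing-effort W(t) consumed by testing time t
    # (Weibull-type curve: gamma = 1 gives the exponential curve,
    # gamma = 2 the Rayleigh curve).
    return alpha * (1.0 - np.exp(-beta * t ** gamma))

def mean_faults(t, a, r, alpha, beta, gamma):
    # m(t) = a * (1 - exp(-r * W(t))): the removal rate per unit of
    # testing-effort is proportional to the remaining fault content.
    return a * (1.0 - np.exp(-r * cumulative_effort(t, alpha, beta, gamma)))

# Hypothetical values: 100 initial faults, detection rate 0.05 per effort
# unit, 60 total effort units consumed at rate 0.2 per unit time.
print(mean_faults(np.linspace(0.0, 20.0, 5), a=100, r=0.05,
                  alpha=60.0, beta=0.2, gamma=1.0))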
Considerable evidence from industry case studies indicates that substantial business benefit can be derived from aggressive software reuse. In an ideal setting, a software component that is developed for reuse would be verified to be correct and would contain no defects. In reality, formal verification is not carried out routinely, and defects can and do occur. However, with each reuse, defects are found and eliminated, and a component's quality improves as a result. In a study conducted at Hewlett-Packard (HP), Lim (1994) reports that the defect rate for reused code is 0.9 defects per KLOC, while the rate for newly developed software is 4.1 defects per KLOC. The present scenario of the software development life cycle has evolved into a distributed environment because of the development of network technology and the ever-increasing demand for sharing resources to optimize cost. A software development process typically consists of four phases: specification, design, coding, and testing (Ohtera & Yamada, 1990). During the testing phase, a lot of development resources are consumed to detect and correct faults latent in the developed software system. Testing distributed systems is an example of an activity that consumes time and resources (Cristescu & Ciovica, 2010). Therefore, an SRGM is needed to estimate the current reliability level and the time and resources required to achieve the objective reliability level. The proposed NHPP-based SRGM attempts to account for the relationship between the amount of testing-effort and the number of software faults detected during testing in a distributed development environment. The proposed model is based on the assumption that the software system is composed of a finite number of reused and newly developed components. The reused components do not consider the effect of the impact (severity) of the fault type (complexity) on the software reliability growth phenomenon (i.e., the growth is uniform), whereas the newly developed components do. Accordingly, the fault removal process is modelled separately for each component, and the total fault removal phenomenon is the
sum of the fault removal processes of all the components. Then, we review some forms of testing-effort functions, such as the exponential, Rayleigh, Weibull, and logistic curves. After that, we describe the methods used for parameter estimation and the criteria used for validation and evaluation, respectively. We then validate and compare the model with other existing models by applying them to actual software reliability data cited from real software development projects. Finally, conclusions are drawn.
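As a concrete reference for the review mentioned above, the cumulative testing-effort curves can be written as simple functions of testing time t. The sketch below is ours; the parameter names (alpha for the total expected effort, beta and gamma for consumption-rate and shape parameters, A for the logistic constant) follow common usage in the cited literature and are not fixed by this paper.

import numpy as np

def exponential_effort(t, alpha, beta):
    # W(t) = alpha * (1 - exp(-beta * t))
    return alpha * (1.0 - np.exp(-beta * t))

def rayleigh_effort(t, alpha, beta):
    # W(t) = alpha * (1 - exp(-beta * t**2))
    return alpha * (1.0 - np.exp(-beta * t ** 2))

def weibull_effort(t, alpha, beta, gamma):
    # Generalises the two curves above: gamma = 1 (exponential), gamma = 2 (Rayleigh).
    return alpha * (1.0 - np.exp(-beta * t ** gamma))

def logistic_effort(t, alpha, A, beta):
    # W(t) = alpha / (1 + A * exp(-beta * t))
    return alpha / (1.0 + A * np.exp(-beta * t))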
SOFTWARE RELIABILITY MODELLING

Computer software traditionally ran in standalone systems, where the user interface, application ‘business’ processing, and persistent data resided in one computer, with peripherals attached to it by buses or cables. Few interesting systems, however, are still designed this way. Instead, most computer software today runs in distributed systems, where the interactive presentation, application business processing, and data resources reside in loosely-coupled computing nodes and service tiers connected together by networks (Buschmann et al., 2007). Therefore, the software development environment is changing from a host-concentrated one to a distributed one due to cost and quality aspects and the rapid growth in network computing technologies. A distributed system is a computing system in which a number of components cooperate by communicating over a network. The explosive growth of the Internet and the World Wide Web in the mid-1990s moved distributed systems beyond their traditional application areas, such as industrial automation, defense, and telecommunication, and into nearly all domains, including e-commerce, financial services, health care, government, and entertainment (Buschmann et al., 2007). Software components also have failure rates specified during software design using a software engineering paradigm; these failure rates reflect the reliability of the system, which is desired to be high (Raza & Vidyarthi, 2011). Under this environment, software modules/components can
be developed at different geographical locations, and components used in other software can be reused. It is empirically known that the growth curve of the cumulative number of detected faults is exponential when a software system consisting of several software components is tested in the testing phase (Ohba, 1984), while it is S-shaped when a newly developed software component is tested (Yamada et al., 2000). For software systems developed under such an environment, several models have been proposed (Yamada et al., 2000; Kapur et al., 2004b; Shatnawi, 2007). Yamada et al. (2000) constructed a software reliability model based on an NHPP, which incorporates the exponential software reliability model (Goel & Okumoto, 1979) and the delayed S-shaped model (Yamada et al., 1983), for software systems developed under such an environment. They assume that a software system consists of a number of reused software components and newly developed software components. Under these assumptions they formulate an NHPP-based model for a distributed development environment. The formulated model was found to describe only the purely exponential growth curve and the highly S-shaped growth curve, depending on the values of the weight parameters. Although the authors argue that if the values of the weight parameters can be estimated reasonably, more accurate software reliability assessment measures can be obtained, they have assumed prespecified testing-weight parameters for each software component. In addition, they assumed that as the testing time increases, the testing-effort also increases: if the testing time becomes quite large, the testing-effort also becomes quite large. In reality, no software developer will spend infinite resources on testing, irrespective of the testing period. Thus, there is a great need to develop a more realistic software reliability model. To address these issues, we integrate a testing-effort function into the Yamada et al. (2000) model to obtain a better description of the software fault removal phenomenon in a distributed development environment. Unlike the Yamada et al. (2000) model, the proposed
model incorporates the effect of testing-effort, and the values of the testing-weight parameters are estimated so as to capture a wide class of reliability growth curves. The model considers the software to consist of a finite number of reused and newly developed components. Each component has its own characteristics, and thus the faults detected in a particular component have their own peculiarities. Therefore, the fault removal process for each component can be modelled separately, and the total fault removal phenomenon is the sum of the fault removal processes of all the components. We feel the problem can be better formulated using the proposed model developed below.
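The following is a minimal sketch of the combined growth curve described above, assuming (as in Yamada et al., 2000) exponential growth for reused components and delayed S-shaped growth for newly developed components, mixed through weight (proportion) parameters. The function names and parameter values are ours, for illustration only; the proposed model further makes the removal process depend on testing-effort, as formulated in the following sections.

import numpy as np

def reused_growth(t, b):
    # Exponential (Goel-Okumoto) growth curve, normalised to 1.
    return 1.0 - np.exp(-b * t)

def new_growth(t, b):
    # Delayed S-shaped growth curve, normalised to 1.
    return 1.0 - (1.0 + b * t) * np.exp(-b * t)

def total_mean_faults(t, a, reused, newly_developed):
    # a: total fault content; reused / newly_developed: lists of
    # (weight, detection-rate) pairs whose weights together sum to 1.
    m = sum(p * reused_growth(t, b) for p, b in reused)
    m += sum(q * new_growth(t, b) for q, b in newly_developed)
    return a * m

# Hypothetical example: one reused component (weight 0.2) and two newly
# developed components (weights 0.5 and 0.3).
print(total_mean_faults(np.linspace(0.0, 25.0, 6), a=150,
                        reused=[(0.2, 0.30)],
                        newly_developed=[(0.5, 0.15), (0.3, 0.10)]))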
Model Assumptions

1. The fault removal phenomenon follows an NHPP with mean value function m(t).
2. The software is subject to failures during execution, at random times, caused by the manifestation of remaining faults in the system.
3. The software is composed of a finite number of newly developed and reused components. Accordingly, faults are of two types, which are of different severity. The testing-effort required depends on the severity of the faults.
4. From the discussion in the introduction it follows that the ratio of faults in reused components to those in newly developed components is about 1 to 4.
5. Software reliability growth in the reused components is uniform, while in the newly developed components it is not.
6. Each time a failure occurs, an immediate (delayed) effort takes place to determine the cause of the failure in order to remove it. The time delay between the failure observation and its subsequent fault removal is assumed to represent the severity of the faults.
7. The fault removal process (i.e., the debugging process) is perfect.
8. The mean number of faults detected/removed by the current testing-effort expenditures is proportional to the mean number of remaining faults in the system (a sketch of the implied equation follows this list).
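Taken together with a testing-effort expenditure rate w(t), assumption 8 implies, for a component whose removal is not delayed (e.g., a reused component i), the relationship sketched below. This is our sketch under the stated assumptions, using the notation introduced in the next subsection and writing W(t) for the cumulative testing-effort with W(0) = 0:

\frac{1}{w(t)}\,\frac{dm_i(t)}{dt} = b_i\,\bigl(a_i - m_i(t)\bigr)
\quad\Longrightarrow\quad
m_i(t) = a_i\left(1 - e^{-b_i\,W(t)}\right),
\qquad W(t) = \int_0^t w(x)\,dx .

For the newly developed components, assumption 6 adds a delay between failure observation and fault removal, which would give the corresponding mean value functions a delayed S-shaped form in W(t).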
Model Notations

a: Overall fault content of the software; (∑ai + ∑aj = a), a > 0
ai: Initial fault content of fault-type i; (i = 1, 2, …, m), ai (= a hi)
aj: Initial fault content of fault-type j; (j = 1, 2, …, n), aj (= a hj)
hi(j): Proportion of fault-type i (j); (0 < hi(j) < 1)