Implementation of First Time Right Practice in Software Development Process
D. Duka and L. Hribar
Ericsson Nikola Tesla, Poljicka cesta 39, Split, Croatia
Phone: +385 21 435820; Fax: +385 21 435834; E-mail: [email protected]
Abstract - First Time Right (FTR) is a concept aimed at improving the software development process. The practice focuses on designing in quality (preventive actions) instead of solving problems (corrective actions). The overall initiative results in continuous improvements and provides a mechanism for propagating lessons learned between projects to further improve the organization. It also provides a clear and effective way to set and follow quality expectations at the project level. FTR enables and emphasizes management commitment and control by using verifiable criteria for the desired quality outcome, and introduces a common measurement framework for systematic data collection, analysis and comparison. This paper gives an overview of the FTR concept. Proposals for practice improvement and some initial results analysis are also presented.
I. INTRODUCTION
The pressure to improve the software development process is not new, but in today's competitive environment there is even greater emphasis on delivering a better service at lower cost. Service providers care about the quality of the services in which they invest. Software vendors care about their clients' needs and about efficiency in achieving them, a symptom of which is little to no rework - i.e. getting it done right the first time. The process named First Time Right (FTR) serves both these needs. It has both an external and an internal focus. Externally it focuses on the client; internally it can delineate where errors reside - notably whether they are within or upstream of the performance measurement group or system. In other words, FTR splendidly balances customer focus and internal accountability. It delivers this balance at little cost in terms of both start-up and ongoing maintenance. First Time Right was established as a step towards the Operational Excellence program at Ericsson, an initiative mainly established for Quality Assurance (QA). In simple words, Quality Assurance represents the planned and systematic activities implemented within the quality system and
demonstrated as needed to provide adequate confidence that an entity will fulfill requirements for quality [1]. Time and cost are easily visible during a project, while software quality can usually be assessed only late in the project, and the effects of decisions/actions on quality improvements are difficult to evaluate [2]. Rework needed on developed products has a broad impact on the organization:
• Costs of faults found later are higher,
• Negative loops require additional resources,
• Customer and sponsor satisfaction suffers,
• Organization capability is reduced.
The benefits of less rework during all the development and testing phases and maintenance are the following:
A. More resources are available instead of being involved with maintenance.
B. More resources at Function Test (FT) are available for function failures instead of being involved with faults slipped through from previous verification phases.
C. More resources are available instead of being involved with System Test (ST) and Design Follow-Up (DFU) activities, i.e. activities between the first official release and global deployment.
D. Less temporary spill-over of competent resources due to emergencies from early development phases and design towards maintenance.
All the stated benefits per specific product development phase are graphically interpreted in Figure 1 [3], which shows how a positive spill-over of resources (green) is caused by reduced rework. On the other side, unwanted rework might cause a negative spill-over of resources (red), jeopardizing the later phases.
Fig. 1. Benefits of less Rework (resource spill-over across the phases Early Phases, Design & Implementation, FT, DFU and Maintenance; the markers A-D correspond to the benefits listed above)
First Time Right practice has recently been applied at Ericsson to software development projects with a time frame of 6-12 months. As described in this paper, special principles and guidelines had to be followed while building the FTR model as a part of the regular software development process. Success factors and some important considerations needed to fulfill the intended effect are also presented here. In order to measure FTR effectiveness, the Fault Slip Through (FST) metric was chosen. Although gathering data for a more accurate statistical analysis will require a longer period of time, some initial analysis was performed, pointing to potential areas of improvement in the product life cycle.
II. FTR PRINCIPLES
The First Time Right practice is based on the following quantitative criteria:
• Mandatory documents,
• Best practices,
• Quantitative specifications of processes,
• Specific measurements.
Those criteria are specified for each of the Quality Indexes: QI1, QI2, QI3, QI4, QI5 [3]. Each quality index includes a list of documents/practices to be performed during the development process through five FTR phases:
• Always-on,
• Early phases,
• Design & implementation,
• Early verification,
• Verifications.
The general meaning of a Quality Index can be summarized by its position in the typical Project Management Triangle (see Fig. 2):
Fig. 2. Project Management Triangle (vertices Time, Cost and Quality; QI1 lies at the Time vertex and QI5 at the Quality vertex, with QI2, QI3 and QI4 in between. Time needs are dominant within a controlled quality towards the QI2 side, quality needs are dominant within the planned times towards the QI4 side, and costs are kept at the levels most appropriate to support the application of the relevant QI-s.)
QI1 - characterizes fast prototype/demonstrator development of new and innovative functionalities, either used as a gap filler for a limited time span or eventually evolving towards a real product.
QI2 - characterizes developments of products where time pressure is dominant, with appropriate levels of quality and costs. QI2 allows prediction of product quality only on an approximated basis, while ensuring the achievement of the challenging time targets.
QI3 - characterizes developments that are balanced as far as quality and time are concerned. It allows prediction of product quality on a statistical basis, while providing confidence in the time targets.
QI4 - characterizes developments of products where product quality is dominant, with appropriate levels of time and costs. It allows a meaningful prediction of product quality, while providing a consistent level of confidence in the time targets.
QI5 - characterizes developments of high quality products, while optimizing the impacts on the levels of time and costs.
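The five quality indexes form an ordered scale from time-dominant to quality-dominant development. As a purely illustrative sketch (the mapping below only paraphrases the characterizations above and is not part of any Ericsson data model), the scale can be captured in a small lookup structure:

```python
# Hypothetical lookup of the QI1-QI5 scale described in the text;
# the labels paraphrase the paper's characterizations, nothing more.
QUALITY_INDEXES = {
    "QI1": "fast prototypes/demonstrators",
    "QI2": "time pressure dominant",
    "QI3": "balanced quality and time",
    "QI4": "product quality dominant",
    "QI5": "high quality, optimized time/cost impact",
}

def quality_prediction_basis(qi: str) -> str:
    """How well product quality can be predicted for each QI,
    per the descriptions above (illustrative only)."""
    return {
        "QI1": "none",         # gap-filler prototypes
        "QI2": "approximate",
        "QI3": "statistical",
        "QI4": "meaningful",
        "QI5": "meaningful",   # assumed: at least as predictable as QI4
    }[qi]

print(quality_prediction_basis("QI3"))  # -> statistical
```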
III. FTR MODEL AND QUALITY GATES
The relationship between the Quality Gates (QG) and the FTR practice is described in Figure 3:

Fig. 3. The FTR Model and Six Quality Gates (a QI is established and updated for each feature to be developed, and the agreed QI is then applied during development and assessments; along the TG1-TG2-MS8 timeline the gates are QG1: Internal TG1 Assessment, QG2: Productivity Strategy Meeting, QG3: Internal TG2 Assessment, QG4: CMM Assessment, QG5: Functional Configuration Audit, QG6: Product Quality Assessment)
QG1 - Risks and benefits of Feasibility Process tailoring are presented, discussed and validated. After TG1, discussions for selecting the Quality Index start.
QG2 - The Project Steering Meeting (PSM) uses the preliminary (or definitive) QI as one of the inputs. The purpose of the PSM is to synchronize project needs with processes, methods and tools. Before Toll Gate 2 (TG2 - project execution start), the final decision on the Quality Index must be made.
QG3 - Risks and benefits of Execution Process tailoring are presented, discussed and validated, taking into account the selected QI as one of the inputs. Between TG2 and Milestone 8 (MS8 - first official release), at any important scope change or requirement change, the already established QI is validated or changed. The Monthly Quality Report shows the status of application of the FTR criteria.
QG4 - Process tailoring approaches are analyzed through a Capability Maturity Model (CMM) assessment (as defined at TG2), together with the implementation and effectiveness of the FTR criteria (according to the decided QI).
QG5 - The preconditions for measuring the Requirement Implementation Coverage and the relevant measurements are evaluated.
QG6 - The implementation of the FTR criteria and the relevant effectiveness are validated.
IV. FTR GUIDELINES
During the project feasibility study phase, the Project Manager and the FTR local responsible analyze the feature that has to be developed, taking into account the main project constraints (time, quality, cost) in order to propose the most appropriate QI (see Fig. 4). The Provisioning Office Manager provides the top management point of view about the expected QI. In the next step, the Project Manager evaluates the consequences of the proposed QI on the time schedule and cost. The final decision regarding the QI is made at the Management Team meeting. A Change Request (CR) regarding FTR requirements is issued when needed [4].

Fig. 4. FTR Guideline Before Project Execution Start (the original flow runs through three steps: 1. analysis of the feature and a preliminary proposal of a Project-QI, based on FTR responsible and PM inputs; 2. a preliminary proposal of a Mgmt-QI by the PPO manager, with PPO manager, PA manager and LM inputs; 3. project planning scenarios and the final decision on the Selected-QI. A second flow is used when CR-s during design have important impacts: the CR-s update the Project-QI, the Mgmt-QI, the project planning scenarios and finally the QI itself.)
The Quality Plan (QP) prepared after the project execution start has to be updated, taking into account FTR activities. The project can also add activities to the selected QI-s (documented in the Quality Plan). At any important scope or requirement change, the already established QI is validated and changed if needed. After the execution phase is finished, a Product Quality Assessment is performed by the Line Manager, Project Manager, Configuration Manager, Process Owners and the FTR local responsible in order to validate the implementation of the FTR criteria and its effectiveness.
The FTR practice is applied to each individual deliverable. It aims to deliver features with the expected functionalities, within the expected times and with a quality level which minimizes (ideally nullifies) and makes predictable the rework caused by faults found after function test, i.e. the faults slipped through from previous verification phases and system test faults.

IV. FAULT SLIP THROUGH
As the quality level of the final product is set at the beginning of the project, a large number of faults can result in project delays and cost overruns. The number of faults in a large software project has a significant impact on project performance. In order to keep up project performance, development teams require fast and accurate feedback on quality at the early stages of the design cycle [5]. Figure 5 [6] demonstrates how the cost of faults typically rises by development stage. The implication of such a cost curve is that identifying software faults earlier in the development cycle is the quickest way to make development more productive. In fact, the cost of rework could be reduced by up to 30-50 percent by finding more faults earlier.

Fig. 5. Cost of Rework (fault removal cost rises across the development phases, from Requirements, Prestudy and Feasibility through Execution to DFU and Maintenance)

The test strategy, test processes and the complete development process define in which phase different kinds of faults are supposed to be detected. Fault Slip Through (FST) represents the number of faults not detected in a certain activity; these faults have instead been detected in a later activity [7]. Figure 6 visualizes the difference between Fault Latency and FST.

Fig. 6. Example of Fault Latency and FST (for a fault inserted in Design, supposed to be found in Unit Test but actually found in System Test, Fault Latency spans the phases Design, Coding, Unit Test, Function Test, System Test and Operation from insertion to detection, while FST spans only the phases from where the fault was supposed to be found to where it was actually found)

When measuring FST, the norm for what is considered right is the test strategy of the organization. That is, if the test strategy states that certain types of tests are to be performed at certain levels, the FST measure determines to what extent the test process adheres to this test strategy. This means that all faults that are found later than supposed are considered as slipped [8].

The primary purpose of measuring FST is to make sure that the test process finds the right faults in the right phase, i.e. commonly earlier. The main effects of doing this are:
• Fewer stopping faults (slipping faults tend to be stopping).
• Less redundant testing (when finding the right faults in the right phase).
• Less process variation because of fewer faults in late phases → improved delivery precision.
• Avoiding the same mistakes (faults) → supports a learning organization.
• Improved product quality, i.e. only some of the faults slipping from earlier phases are found in Integration and Verification (I&V).
• Earlier and cheaper fault removal.
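Given fault records that carry both the phase where a fault was supposed to be found (per the test strategy) and the phase where it actually was found, FST per phase reduces to a simple count. A minimal sketch under assumed phase names and record fields (not the Ericsson tooling):

```python
# Illustrative FST computation; the phase list and fault records are
# hypothetical examples, not Ericsson data.
PHASES = ["Design", "Coding", "Unit Test",
          "Function Test", "System Test", "Operation"]

def slipped(fault):
    """A fault slipped if it was found later than the phase where the
    test strategy says it should have been found."""
    return PHASES.index(fault["found"]) > PHASES.index(fault["expected"])

def fst_to_phase(faults, phase):
    """Share of the faults found in `phase` that slipped there from an
    earlier phase, i.e. the fault slip through to that phase."""
    found_here = [f for f in faults if f["found"] == phase]
    if not found_here:
        return 0.0
    return sum(slipped(f) for f in found_here) / len(found_here)

faults = [
    {"inserted": "Design", "expected": "Unit Test", "found": "System Test"},
    {"inserted": "Coding", "expected": "Function Test", "found": "Function Test"},
    {"inserted": "Design", "expected": "Unit Test", "found": "Function Test"},
]

# Fault latency spans insertion to detection; FST counts only faults
# found later than the phase where they were supposed to be found.
print(fst_to_phase(faults, "Function Test"))  # 1 of 2 faults slipped -> 0.5
```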
Although the quality of a product is built in during the early phases, FST can be related to the whole software development process, from specification to design and test. The test at the end is only meant to be a confirmation of adherence to the requirements. However, FST is not supposed to be used just as a measure; it is a concept for continuous improvement. Otherwise its intentions will not be fulfilled [9].
Implementation process
When getting started with FST measurements, some activities are required to make it work. Generic advice is hard to give in this area since many activities depend on how different organizations operate. In order to implement FST in the organization, the following steps are recommended [8]:
• Determine the business goal of introducing FST, i.e. whether the goal is to reduce the number of faults or to reduce the lead-time.
• Create a common understanding and commitment.
• Identify a driver. Someone with authority and true interest in implementing the concept is crucial to make the implementation successful. Otherwise, like most other improvement work, it will be down-prioritized every time a project emergency occurs (which tends to be very often).
• Make sure that the test strategy is well-defined and communicated throughout the organization.
• Perform a baseline measurement on a finished project (or at least on a subset of defects from a project). Doing this is a good test to see that the measure can be applied in relation to the defined test strategy.
• Add the measure to the local Trouble Report (TR) process. If the measure is included in the TR process, follow-up will be a lot easier.
• Educate. People must understand how to report the measure and, even more importantly, why it is important to measure.
• Identify a pilot project to apply it in. This first project needs extra attention regarding follow-up of how well the measurement reporting works.
• Visualize results early to determine status and to further show people that it is important (people tend to care more about visible measurements).

Success factors / important considerations
From experience, the concept will not have the intended effect in practice unless some factors are adhered to.

Generic success factors are:
• One must accept the current situation and then improve on it; in particular, reward systems might cause accurate data to be withheld and counter-productive improvements to be implemented.
• Organizations with successful measurement programs actively respond to what the measurement data tell them.
• Appropriate and timely communication of metrics results is critical if they are to be used - make the results visible.
• Measurement results must be examined in the context of the processes and environments that produced them.
• The effort to collect metrics should not add significantly to the organization's workload.
• The success of a measurement program depends on the impact it has on decision making, not on its longevity.

FST specific success factors are:
• When introducing FST measurement into the fault reporting process, a guard is needed, at least in the beginning, to make sure that faults are classified as they should be (for example a Maintenance Handling Office expert, test leader or fault review board).
• Make sure to have a common and project-independent test strategy developed and communicated, because it is very tightly connected to the FST measure. Developing and deploying a common test strategy is in itself a driver for improvements since it tends to identify flaws in the process.
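The baseline-measurement step recommended above amounts to retroactively classifying a finished project's trouble reports against the test strategy, with a guard catching fault types the strategy does not cover. A hedged sketch of such a classification pass (the TR fields and the strategy mapping are invented for illustration, not the Ericsson TR process):

```python
# Hypothetical baseline pass over closed trouble reports (TRs).
def baseline_fst(trouble_reports, strategy):
    """Count, per detection phase, how many TRs slipped according to
    `strategy`, a mapping fault_type -> phase where it should be found."""
    order = ["Unit Test", "Function Test", "System Test"]
    slipped_per_phase = {p: 0 for p in order}
    unclassified = []
    for tr in trouble_reports:
        expected = strategy.get(tr["fault_type"])
        if expected is None:
            # A guard (e.g. a fault review board) should resolve these
            # rather than letting them silently skew the measurement.
            unclassified.append(tr)
            continue
        if order.index(tr["found_in"]) > order.index(expected):
            slipped_per_phase[tr["found_in"]] += 1
    return slipped_per_phase, unclassified

strategy = {"logic": "Unit Test", "interface": "Function Test"}
trs = [
    {"fault_type": "logic", "found_in": "Function Test"},
    {"fault_type": "interface", "found_in": "Function Test"},
    {"fault_type": "timing", "found_in": "System Test"},
]
counts, todo = baseline_fst(trs, strategy)  # one slip, one TR to review
```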
V. FIRST TIME RIGHT RESULTS
The FTR practice was applied to a SW development project at Ericsson. The main reason for that was the conclusion that many faults inherited from early development phases cause a lot of rework and significantly increase the final product cost. In order to address this problem, a central part of the process change was to introduce the First Time Right approach. The concept was implemented in a new product release and the results were compared with a similar previous project. During both projects, personnel turnover was very low and the root development processes were stable.
The chosen metric was Fault Slip Through, i.e. measuring fault slippage from different development phases. The results are shown in the following table [10]:

TABLE 1. FAULT STATISTIC – PROJECT COMPARISON

                               Without FTR   With FTR   Improvement
Product FST to Function Test       77%         61%         21%
Product FST to System Test         59%         51%         14%
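The Improvement column in Table 1 is consistent with the relative reduction of the slippage percentage, (without - with) / without; this formula is inferred from the numbers rather than stated in the source:

```python
def relative_improvement(without_ftr, with_ftr):
    """Relative reduction of the FST percentage between the two
    projects, rounded to whole percent (formula assumed, see text)."""
    return round(100 * (without_ftr - with_ftr) / without_ftr)

print(relative_improvement(77, 61))  # -> 21 (FST to Function Test)
print(relative_improvement(59, 51))  # -> 14 (FST to System Test)
```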
As can be seen from the table, slippage from Unit Test to Function Test and from Function Test to System Test was reduced. The lower result in the second case can be explained by the fact that one feature in Function Test did not fully follow the new approach (due to the specific test environment it was tested off-site), causing higher fault slippage than expected. However, in order to verify the complete FTR results, a new validating tool in the form of additional fault codes has to be applied. Additional fault codes provide a flexible and standard way to collect and analyze data on defects found during development. Results coming from the additional fault codes analysis will be used to:
• Give feedback to the ongoing development project,
• Improve processes,
• Optimize and validate the FTR model.

Additional fault codes will provide the following information:
• Error injection during development activities,
• Effectiveness of each verification activity,
• Fault pattern over project phases.

The achieved FTR results, together with the new information that will be obtained after the implementation of additional fault codes, will help us to improve the weak development areas and also to decrease the amount of rework needed in later phases of the product cycle.

VI. CONCLUSION
The FTR process re-focuses attention on the quality requirements associated with the first delivery. It also takes into account the real development requirements (time dominant and quality dominant factors), while mitigating the risks of poor quality when time is dominant. The practice allows establishing a clear agreement and commitment between the organization parties involved with the feature development (Project Manager vs. Organization Managers) by means of a shared decision about a QI. This process is built up by a bottom-up approach, using the experience and competence of relevant experts. A measurement framework exists, providing good Key Performance Indicators associated with the practice goal (e.g. fault slip through at any development and test phase). The practice is applied to development projects and the results are to be validated on a statistical basis, by means of a meaningful number of applications. The statistical prediction of the rework level at ST/DFU/Maintenance requires time (e.g. 2-3 years) for gathering data and elaborating them into a consistent and proven statistical model. However, improvements might be visible in a shorter term. Nevertheless, the implementation of FTR provides the opportunity to review/reinforce the underlying processes, because it highlights process lacks and quantitative levels of process application. It also provides the raw material to produce a trend line of quality, showing whether the process has been measurably improved or not.

REFERENCES
[1] A. U. Rehman, "Quality Cost Analysis", Feditec Enterprise
[2] T. P. Ryan, "First Time Right Ratio", Measuring the Measurers, November 2008
[3] ***, "FTR Practice", Internal Ericsson Documentation, Sweden, 2006
[4] ***, "FTR Application Guideline", Internal Ericsson Documentation, Sweden, 2006
[5] L. Hribar, S. Sapunar and A. Burilovic, "First Time Right in AXE using One Track and Stream Line Development", Proceedings MIPRO, 2008
[6] L-O. Damm, "Early and Cost-Effective Software Fault Detection", Blekinge Institute of Technology, 2007
[7] ***, "Fault Slip Through Measurements", Internal Ericsson Documentation, Sweden, 2005
[8] L-O. Damm, L. Lundberg and C. Wohlin, "Faults-slip-through – A Concept for Measuring the Efficiency of the Test Process", Software Process: Improvement and Practice, Special Issue, Wiley, 2006
[9] Z. Antolic, "FST Measurement Process Implementation in CPP Software Verification", Proceedings MIPRO, 2007
[10] L-O. Damm and L. Lundberg, "Results from Introducing Component-Level Test Automation and Test-Driven Development", Journal of Systems and Software, vol. 79, issue 7, July 2006