ATM Software System Development

Evaluate the Failure of Critical System
Software Engineering for Dependent System
Nasreen Iqbal, De Montfort University

Definition of failure: “‘not trying’, not the outcome”

Safety can be defined as freedom from disaster, loss and accident; the failure of a critical system results in a serious hazard. Incompatible requirements lead to the failure of critical systems, and such a failure can mean economic loss, a threat to life and physical damage. The LAS CAD project, which spanned five years (1987 - 1992), serves as a good example of a software failure that covers almost every classification of critical system.

11.1 Critical System classification:

Critical systems (CS) are divided into the following subdivisions:

11.1.1 Safety-critical systems

Under this classification, failure results in loss of life, injury or damage to the environment. Such failure arises when the requirements of the system are not met, when relevant knowledge is absent, when stakeholders mislead the project, or for other critical reasons. LAS falls under this classification.

11.1.2 Mission-critical systems

Under this classification, failure results in the failure of a goal-directed activity; the London Ambulance Service system is an example.

11.1.3 Business-critical systems

Failure of a business-critical system leads to high economic losses; an example is a customer accounting system in a bank.

11.2 Safety Management:

The first standard rule is to follow the principles of safety management to make sure that every safety claim is addressed. The most important area of safety management is the safety case documentation, which includes the following claims:
• Identification of hazards, failures, failure mechanisms and safety targets
• Risk assessment
• Supporting evidence

[Figure: Overall safety lifecycle. Phases shown: 3 Hazard and risk analysis; 4 Overall safety requirements; 5 Safety requirements allocation; Overall planning (6 Overall operation and maintenance planning, 7 Overall safety validation planning, 8 Overall installation and commissioning planning); 9 Safety-related systems: E/E/PES realisation (see E/E/PES safety lifecycle); 10 Safety-related systems: other technology realisation; 11 External risk reduction facilities realisation; 12 Overall installation and commissioning; 13 Overall safety validation; 14 Overall operation, maintenance and repair; 15 Overall modification and retrofit (back to appropriate overall safety lifecycle phase); 16 Decommissioning or disposal.]

11.3 The LAS:

In 1990 new requirements were drawn up for a new system, which aimed to provide:
• An automated system
• Automated handling of the bulk of calls
• Location identification, from tracking ambulance locations through to mobilization
• Human allocation
In 1991 the proposed system consisted of elements that addressed these requirements:
• Computer-aided dispatch
• Computer map display
• Automatic vehicle location system (AVLS)

[Figure: The Computer Aided Dispatch (CAD) system, showing agents, system functions and communication channels. Agents: Hospital, Ambulance Crew. CAD system functions: Call Taking, Resource Identification, Resource Mobilization, Resource Management.]

The London Ambulance Service system first went into service in 1992 in an unstable condition. The stakeholders included the 'Director of Support Services', the 'system manager' and the 'contract analyst', who had experience with the previous LAS system, and a few others who had very little influence. In March 1992 the progress report listed a number of daily system failures. These issues were reviewed and staff training was proposed to overcome the situation, but in the end flaws in the software and the physical system led to complete failure.

This report follows the international standard taxonomies of critical systems to discuss the arguments about the failure, and aims to make sure that the evidence presented is enough to demonstrate the requirement for software safety. One of the goal-based standards is Defence Standard 00-56, which sets out requirements relating to safety management and requires system developers to demonstrate how they achieve safety.

11.4 LAS Investigation

One direct result of this shutdown was the death of patients after long waits for an ambulance to arrive. The death toll was estimated at 10 or more, even though no findings established the reason for the late arrival of the ambulances. The investigation blamed a wide range of factors, including technical, managerial, human and environmental issues. Some blame was placed on incomplete software, inadequate testing and inadequate staff training. Managerial issues included the lack of change management and of coordination with the contractor from the beginning.

11.5 Factors That Contributed to the LAS Disaster

• The system was not ready for operation, and neither the CAD team nor the other parties had agreed that it was ready for full implementation.
• The CAD software was reported as not tuned and never fully tested; hardware reliability was also not fully tested.
• Data communication problems to and from the mobile data stations were unresolved.
• There were doubts about the accuracy of the Automatic Vehicle Location System (AVLS).
• Staff, including management and crews, were untrained and had no confidence in the system.
• The data available in the system was inaccurate or incomplete.
• A large number of exception messages was recorded, which the staff had to deal with.

Casualties: 20-30 died due to ambulance delays.

[Figure: Source of error: system design. Cause-and-effect diagram relating the parties (LAS Board, Executive Directors, LAS Management, CAD Management, Staff, Contractors) to the life-cycle phases (Requirement Analysis, Designing, Coding, Testing, Training, Installation, Maintenance). Causes labelled: No Staff Involvement, No Training, No Management Training, No Follow-up from LAS, No Backup Plan, No Maintenance Plan.]

11.6 Viewpoint and its breakdown:

The groups of stakeholders involved in the construction of a critical system have different points of view on the system, and these are directly related to their role in it. In the case of the LAS framework, the manager and the ambulance staff stand out among the stakeholders. Their points of view should have carried considerable weight in motivating the system, but there is apparently no evidence that the system took into account any opinion other than that of the system manager, as stated below:

Excerpt 2.1: Refer to evidence number [3016]


Senior management believed that implementing this system would be enough to bring about change, and that these changes would be reflected in the working methods of the staff and ambulance crews; yet consultation with the staff concerned was minimal during implementation. As a result many problems surfaced when the system was introduced. Early reactive strategies were missing, such as comparing different views of a given situation, resolving differences of opinion through negotiation, and validation during the presentation of the system requirements. The LAS report gives clear indications of the management viewpoint in the selection of the project. It also indicates that the only proposal to meet all the requirements, including cost and timeline, was the one from (LAS), and that the proposal was finalized on the basis of these major requirements.

Excerpt 2.3: Refer to evidence number [3116]

The report submitted that management was not fully convinced by the previous system and forced through the change from the previous system to the new one. The shift of decision-making from man to machine was not 'a welcome change', and it left the system detached from its users. IEEE member and risk management expert Robert Charette commented that bad decisions by project managers are a major cause of software failure today.

Excerpt 2.4: Refer to evidence numbers [3117-c] and [3116]

The management viewpoint was that the system would impose a working methodology on the staff; as a result staff found themselves in a "strait jacket" within which they still tried to exercise local flexibility. Management had full confidence that the CAD system would eliminate all the fuzzy manual logic from the work environment and provide a user-friendly system that would act as a supervisor for the staff.

Excerpt 2.5: Refer to evidence number [1007-b]

Here we can conclude that management and the ambulance staff, who had handled these matters before the system was introduced, were in conflict over the allocation procedure. Management believed in adopting fast-moving change; as a result the development process was speeded up, but the differences between staff and management remained. All the evidence in this case indicates that the reconciliation of viewpoints was ignored and the system was implemented from the management viewpoint alone, which caused the stakeholders' requirements to be miscommunicated.


The consequences were disastrous, as reported: negative attitudes towards the system and the absence of joint ownership by the ambulance staff led to failures at installation. The report also pointed to circumstantial evidence of damage to or misuse of the equipment. The main finding was that poor communication between management and staff led to mistrust of the system.

Excerpt 2.2: Refer to evidence number [3117]

This evidence reports that management did not consider it an ideal solution for the new system to take over the old responsibilities of the stations and crews, who previously decided resource allocation independently and allocated the ideal resource to each incident. Evidence 3072 shows that LAS management was under the impression that the contractor must provide a project manager, but most of the time the project management responsibilities, and even supplier-related issues, were handled by the Director of Support Services and the contract analyst. No evidence indicates that project management duties were performed by SO other than as project planner.

Conclusion

The management and staff viewpoints make clear that the system was implemented under a non-negotiable timetable. The report also indicated that the lack of the PRINCE project management methodology left the requirement analysis phase neglected. Management's misperception that the old working methodology could be eliminated, insufficient information about the staff, and minimal involvement of the staff in the implementation drove this project to disaster. As per evidence 3070, various issues were carried forward to the Senior Manager for review, but none of them were followed up. Future work is planned to validate these findings along with other safety case lessons. Management must also perform statistical studies of critical projects to establish the impact of each socially derived critical risk factor, including user involvement and acceptance.


C12

EVALUATE THE DESIGN OF CRITICAL SYSTEM

The design of a critical system follows international standards such as the UK MoD Def Stan 00-55 and 00-56, in an effort to obtain a safe system and prevent failures. The discussion will focus on the standards for designing a critical system and on finding solutions to the LAS disaster, with lessons learned. The development of a critical system is based on dependability; the discussion will examine dependability and its importance in the LASCAD system. Dependability has four pillars:
• Reliability: delivers services as specified
• Availability: delivers services when required
• Security: protects itself from external attack
• Safety: concerned with threats to human life

12.1 LAS CS Design & Solutions

The development of a critical system is based on various properties; one of them is requirements and their traceability as the system evolves. Changes during the system development life cycle are inevitable, but they require documented traceability. The LAS system was implemented with many issues, and one of them was the traceability of the changes made by the team. The CAD team entered this project without prior relevant experience, with inexperienced staff and uncertain plans and concepts.

12.2 Selection of team Member:

It is clear that the 'selection of the team' is the most suitable way to avoid such a disaster. The prime job of the Tender Board and LAS management is to select the project in terms of the actual requirements and specification of the proposed system. The tender should be raised so as to attract vendors, and a number of vendors and suppliers should be on the shortlist. Evaluation of the vendors should be based on best industry practice or IS standard expertise, and the selection responsibilities must be assigned to a management group with final authority to produce the best report for the Tender Board.

Refer Evidence: [3032], [3031] and [3040]

As we know, the selection of the team was inadequate in terms of experience and knowledge. The development cycle depends mainly on follow-up reports and documentation. Here I must conclude that the CAD team was mainly responsible, because CAD management was unable to manage its team; on the other hand, LAS management was also responsible, since it should have tracked the development process rather than relying completely on contractors. We must also note that experienced suppliers and standard products are good choices: they follow a standard pattern in which they are experienced and therefore require less guidance and tracking. In many cases, however, new groups may perform just as well.


LAS management had to understand that, having selected an inexperienced group, its own responsibility increased: to track all the development stages, check progress reports and hold regular meetings with the entire team to make sure that all the requirements were addressed. The report's solution correctly pointed out that a few expert groups, such as purchasing specialists and contract specialists, should be involved in vendor selection so that the critical system is seen from multiple views. I disagree that the low bid always fails; the discussion should not emphasize the low bid, but the fact that the LAS Tender Board approved the critical system with incomplete requirement information.

12.3 Refined Requirement Specification

An analyst should understand the requirements and be able to use the tools that embody the quality standard for requirements. "The system shall" is the phrase used to state a requirement clearly, as referred to in IEEE Recommended Practice 830. The requirements should cover all the specifications, functional and non-functional, including model designs that demonstrate a simulation of the system. A requirement specification is complete only when every section of the organization, above all the operational departments, has been involved in its preparation. In the LAS system the main staff, the crews and control room service providers, had no involvement in the requirement specification, which made their work environment and situation tragic. As the evidence says:

Refer Evidence: [3011]

The evidence says that the requirement specification itself did not fully address all the typical and complex areas. It is also noticeable that LAS management represented a partial viewpoint in the requirements documents: a collection of managerial viewpoints that excluded the viewpoints of staff and operational staff. That means the specification was not complete in its functional requirements. The other evidence:

Excerpt: Refer Evidence: [3116]

12.4 Assess Project Risk in Advance:

A risk is not always a recorded problem, but something that may appear in the future. From this we must learn to balance the negative aspects of a risk against the possible benefits. The assessment of risk in a project is built on assumptions. The analysis said that bugs form when no assumptions are made and everything in the project appears to go smoothly. Assumptions can be dropped once they are no longer valid for the project, but major bugs must not be ignored at any point in the project life cycle.
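One way to make this weighing of risks concrete is a small risk register. The sketch below is illustrative only: the 1 to 5 scales, the `exposure = likelihood * impact` scoring rule, the threshold and the scores themselves are assumptions, not taken from the LAS report. Only the risk names echo factors discussed in this report.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int       # hypothetical scale: 1 (rare) to 5 (almost certain)
    impact: int           # hypothetical scale: 1 (minor) to 5 (threat to life)
    mitigation: str = ""  # empty string means no mitigation is planned yet

    @property
    def exposure(self) -> int:
        # Assumed scoring rule: exposure is likelihood multiplied by impact.
        return self.likelihood * self.impact


def rank_risks(risks):
    """Return the risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda risk: risk.exposure, reverse=True)


def unmitigated(risks, threshold=15):
    """Return high-exposure risks that still have no mitigation recorded."""
    return [r for r in risks if r.exposure >= threshold and not r.mitigation]


# Illustrative entries; the scores are invented for the example.
register = [
    Risk("Staff not trained on the new system", likelihood=4, impact=4),
    Risk("Vehicle location data inaccurate", likelihood=3, impact=5,
         mitigation="Crew confirms position by radio"),
    Risk("No fallback if the system crashes", likelihood=2, impact=5),
]

for risk in rank_risks(register):
    print(risk.exposure, risk.name)
print("Needs mitigation:", [r.name for r in unmitigated(register)])
```

A register of this kind can be reviewed at every phase, so that a high-exposure risk without a mitigation is visible before the next phase starts.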

[Figure: relationship between Hazard, Problem and Risk.]

Risk management is built on the hazards in the system. Identifying the risks in a critical system helps to eliminate hazards and build a safe system. Risk identification should cover every area, i.e. project and team size, business impact, development environment and the technology to be used in the project, and mitigation should be planned as per the IS standard. The risk checklist and its solutions must be addressed throughout the entire life cycle. LAS was a high-risk project involving threats to life, but as per the report the project team failed, or avoided, to trace the critical risks in the analysis phase. The development process never focused on risk and its mitigation.

Excerpt: Refer Evidence: [4017]

LAS management should prepare a list of the related critical risks, and this checklist should be addressed in the testing phase by the testing team. The planned test must contain the risk checklist, development documentation and proper test data, and should involve each operational section of the LAS team. The post-test document must be presented to LAS management with all the details, so that management can review it before initiating the next phase.

12.5 Traceability

The LASCAD software error was the result of carelessness and the absence of quality assurance of program and code changes. If LAS had paid attention to the development organization and imposed genuine expectations, the CAD system would likely not have failed. Traceability refers to the ability to describe and follow the life of a requirement in both the forward and backward directions. During the software life cycle, requirements traceability analysis plays an important part: it ensures that all of the requirements have been adequately considered during each phase of the project and that nothing has been skipped in the developed system because of missing requirements.
• Requirements management deals with the process of managing changes during the requirements engineering process and system development.
• Requirements traceability is concerned with the relationships between requirements (dependencies), their sources and the system design.
• CASE tools are necessary for requirements storage and management.
The following sample traceability chart demonstrates how every business area is addressed and linked to the others according to risk priority.

Traceability Matrix: Business to Requirement

Business Requirement ID | Business Requirement Short Description | Architecture Design Document | Technical Specification | System Component | Design Elements | Priority | Functional Requirement ID | Use Case ID
(sample rows; the only surviving cell values are the Use Case IDs UC01, UC01 and UC07)
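A matrix of this kind can be sketched as a list of links that is queried in both directions. This is a minimal illustration, not the structure used in the LAS project: the field names are a subset of the column headings above, the use case IDs UC01 and UC07 come from the sample, and every other identifier (`BR01`, `FR01`, `CallTakingScreen`, and so on) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceLink:
    business_req_id: str
    functional_req_id: str
    use_case_id: str
    design_element: str
    priority: str


# Hypothetical rows; only the use case IDs appear in the source table.
links = [
    TraceLink("BR01", "FR01", "UC01", "CallTakingScreen", "High"),
    TraceLink("BR01", "FR02", "UC01", "IncidentRecord", "High"),
    TraceLink("BR02", "FR03", "UC07", "VehicleLocator", "Medium"),
]


def trace_forward(links, business_req_id):
    """Forward trace: business requirement to the design elements that realise it."""
    return sorted({link.design_element for link in links
                   if link.business_req_id == business_req_id})


def trace_backward(links, design_element):
    """Backward trace: design element to the business requirements behind it."""
    return sorted({link.business_req_id for link in links
                   if link.design_element == design_element})


def untraced(business_req_ids, links):
    """Business requirements that no link covers, i.e. gaps in the matrix."""
    covered = {link.business_req_id for link in links}
    return [req for req in business_req_ids if req not in covered]


print(trace_forward(links, "BR01"))
print(trace_backward(links, "VehicleLocator"))
print(untraced(["BR01", "BR02", "BR03"], links))
```

The `untraced` check is the point of the exercise: a requirement with no link is one that could be skipped in the developed system without anyone noticing.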

Pre traceability

Traceability is complete when it runs in both the forward and backward directions. The backward direction is called pre-RS traceability. The CAD system should have established pre-traceability so that stakeholders could investigate and keep track of each record.

Excerpt: Refer Evidence: [3082]

The evidence showed that the CAD developers were constantly changing the system or adding new features without a PIR plan or traceable documentation. This neglect was reflected in the testing and maintenance phases: LAS received incomplete staff training, and system maintenance was completed without traceable details, which caused several bugs and errors. According to the report, CAD also had several tracking systems, called PIR, OS, etc., during the development process, but the inquiry found that programmers made several changes without following the proper PIR guidelines; as a result the changes made to the LAS system have no records. As the evidence says:

Excerpt 2: Refer Evidence: [3088]

Traceability provides the "umbrella" over configuration management and versioning that completes the requirements of the entire software life cycle. The CAD system was implemented in pieces, so configuration management was important to bring it under one umbrella. Much of the evidence says that the CAD developers failed to identify such changes; an independent reviewer therefore recommended that the change log be placed under more formal controls. The fragments below illustrate this scenario:

Excerpt 3: Refer Evidence: [3093]

It was recorded that during these 9 months the system was never stable; changes and enhancements were made continually to the CAD software. The Data track system was amended and enhanced without tracking records. The MDTs and the RIF system were never followed up by the developer.

Refer Evidence: [3098]

The proposed solution is to establish traceability from the beginning of the development cycle; resources that have to be coordinated towards the same goals may follow a number of standards, such as those of the IEEE [26] and the SEI [57]. It is clear that traceability has long been used in commercial tools and research products, and it is in practice in the current environment as well. The best software enterprise practice is to keep traceability up to date and follow the standard.

12.6 Make a proper Schedule and a Backup:

Every phase (analysis, development, testing, implementation and maintenance) must follow the project planner and its schedule, from the start through to Go Live, under the control of the project scheduler. The lack of a schedule planner in the development process may delay the project or stretch it indefinitely. The following screenshot gives an idea of how to prepare the planner in Microsoft Project; it includes the tasks and schedule of each phase with their time frames, the resources and their assigned tasks, and the documents handed over at the end of each phase. The project plan should follow the waterfall strategy throughout the life cycle of the project.

Refer Evidence: [5004.a-b]
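As a rough illustration of the waterfall schedule described above, the sketch below checks that each phase starts only after the previous one has finished. The phase names follow the text; the dates, the `Phase` record and the `schedule_problems` helper are hypothetical and are not taken from the LAS project plan.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Phase:
    name: str
    start: date
    end: date


def schedule_problems(phases):
    """List violations of a strict waterfall plan (phases given in order)."""
    problems = []
    for phase in phases:
        if phase.end < phase.start:
            problems.append(f"{phase.name}: ends before it starts")
    # In a strict waterfall, a phase may not start until the previous one ends.
    for previous, current in zip(phases, phases[1:]):
        if current.start <= previous.end:
            problems.append(
                f"{current.name}: starts before {previous.name} is finished")
    return problems


# Placeholder dates, chosen so that Testing overlaps Development.
plan = [
    Phase("Analysis", date(2024, 1, 1), date(2024, 1, 31)),
    Phase("Development", date(2024, 2, 1), date(2024, 4, 30)),
    Phase("Testing", date(2024, 4, 15), date(2024, 5, 31)),
    Phase("Implementation", date(2024, 6, 1), date(2024, 6, 15)),
    Phase("Maintenance", date(2024, 6, 16), date(2024, 12, 31)),
]

for problem in schedule_problems(plan):
    print(problem)
```

A check like this flags a compressed schedule at planning time, before a fixed Go Live date forces testing and training to overlap with development.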


The evidence reported that CAD initially presented a project planner to the Tender Board and LAS management, but its development team never followed it. LAS management therefore had a misconception when announcing the project's Go Live: the announcement of "go live" on October 26, 1992 put the project at risk. With this decision the development process was accelerated and completed without sufficient time; the LAS staff also had little or no training in the new system, so the risk was not addressed properly.

Refer Evidence: [5004.c]

The other important aspect of project management is to develop a backup plan for the critical system within change management. If the system fails, the installation does not respond properly, the network fails or the system crashes, the backup plan must be able to roll the system back to its previous state. The evidence recorded that CAD had no backup plan and no backup machine.

Refer Evidence: [1009-c]

12.7 Pay Attention to References from Outside Agencies

A third party or outside consultancy that specializes in critical safety cases can be a key resource when finalizing the system, and may help the project team to eliminate inappropriate candidates. The third-party advisor mainly deals with the synopsis of the project and presents its criticality; its viewpoint is neutral and emphasizes environmental aspects. In the LASCAD project many third parties offered advice. For example, 'the managing director of a competing software house' sent various memos to LAS management in June and July 1991 that described the project as 'totally and fatally flawed'. Another consultant described 'LAS's specifications as poor', with various areas left undefined, and said that "this certainly is a mistake of wrong choices." The outcome shows that LAS management disregarded the third-party advice, which could have contained the crisis of the LAS system and its failure.

12.8 Lesson Learned

The inquiry emphasized the requirements for future development of the LASCAD system and gave details of the importance of engaging with stakeholders and their impact on project success:
• Focus on repairing the reputation of CAD within the service.
• Resistance includes stakeholder feedback on the system requirements that affect the local environment, on the apparent risks in addressing local and system requirements, and on the mitigation of the impact of revolutionary changes.
• Management should be willing to welcome the majority of stakeholders and invest in the resources that facilitate risk mitigation.
• Sources of information and feedback should remain open, which includes the importance of user acceptance and user confirmation of the Go-Live date even if it is delayed, as well as third-party consultancy.
• Risk was mitigated through pressure, which reduced the feedback loops between stakeholders and resulted in poor communication between them; as a result the project was a misfit in its environment and failure occurred.
• Software and system design must provide quality assurance and stakeholder satisfaction, value and status. Providers should not emphasize a fully automated system; including user acceptance will help to build a cooperative, user-supported system. The benefit is that it reduces risk and enables user involvement, incorporating users' local knowledge and experience.
• A sense of ownership and control should be developed in all stakeholders; they should believe in the technology required in critical systems.
• User training must be completed with input from the users, and their feedback must be presented to management.
• Management and staff must have total, demonstrable confidence in the reliability of the system.
• The new system should be introduced in a planned and scheduled way.

It is concluded that the failures of the project were diagnosed at the technical and project management levels, which frequently point to symptoms of failure rather than to its actual source: the fundamental social complexity of the project. Throughout its life cycle the project went ahead without analysis of stakeholder impact and lacked risk analysis and mitigation of critical risks. These neglected areas require attention in order to reduce the risk of project failure and build a system that is safe for human lives.

