Use Case-Based Acceptance Testing of a Large Industrial System: Approach and Experience Report

Serguei Roubtsov, Petra Heck
Laboratory for Quality Software (LaQuSo) at Eindhoven University of Technology
Den Dolech 2, P.O. Box 513, 5600 MB, The Netherlands
[email protected], [email protected]

Abstract

This paper describes the preparation and execution of the site acceptance testing of a large-scale industrial system. A use case based approach to testing is developed. The approach introduces three-level test artifact specifications. At the highest level, test scenarios are used to validate system use cases. The lower levels consist of test scripts and test cases, which unfold test scenarios into test procedures with corresponding test data. A proposed requirements traceability model links all the test artifacts to each other and to the requirements, allowing testers to maintain control over requirements coverage during the entire testing process. The paper also describes a successful practical application of the approach, as well as our observations regarding the usage of modern testing techniques and tools in industry.

1. Introduction

… … …

… …

1.1. E-Ticketing system for the Netherlands

The Dutch nationwide e-ticketing system is an initiative of five public transport organisations (PTOs): Nederlandse Spoorwegen, Connexxion, GVB (Amsterdam), RET (Rotterdam) and HTM (The Hague). This alliance serves 90% of the country's public transport market. In 2002 these companies founded the Trans Link Systems (TLS) company, whose purpose was "The implementation of a robust and future-fixed system of electronic public transport tickets in the Netherlands for the benefit of a customer-oriented, safe and cost-effective provision of services by the public transport companies to passengers" [7]. In the same year TLS issued a tender in order to find the best suppliers. The tender was won by the East-West Consortium (EW), consisting of three major international companies: Accenture, Thales and Vialis, with important subcontractors MTR Corporation and Octopus Cards Ltd (both from Hong Kong) [7]. The Hong Kong Octopus Card solution [3] was chosen as a prototype for the Dutch e-ticketing system. The system is planned to be implemented nationwide in 2007.

Figure 1. E-ticketing system architecture

The system's multi-level architecture is presented in Fig. 1. At the lowest level, Level 0, a customer uses an electronic ("smart") card at different Level 1 front-end devices - check-in and check-out gates, add-value machines, Point Of Sale Terminals (POST) - to travel throughout the entire country using different transportation means such as train, bus, metro and tram. Each card contains an electronic purse (e-purse) with a certain amount of money to be used for travelling
(so-called "easy-trip" ticket). Also different PTO products - like season or discount tickets - may be put on the card. Among those products, so-called inter-operable (IO) products can be used and sold within the networks of several transport organisations. Some cards may be personalized so that the customer's account number and address are available in the central database. This feature supports e-purse auto-reload functionality, which allows, if necessary, for the e-purse to be filled with an additional amount of money on the fly, using front-end devices.

Card usage transactions and device audit registers from front-end devices are collected by Level 2 station processing systems (SPS) and sent through a secure network to central processing systems (CPS), each of which controls the transactions of a single transport organisation. This is Level 3 equipment. Transactions from different PTOs are collected by the Central Back Office (CBO) hardware and software equipment (Level 4). The major tasks of the CBO are:

• Registration of newly produced and initialized e-tickets in the database called Card Master
• Daytime transaction validation against validation rules predefined by business logic, security and technical requirements
• Card transaction history storage in the Card Master
• End-of-day settlement, i.e. settlement report generation for the PTOs and bank interface file generation
• Blacklisting: generation and distribution of the lists of lost and stolen cards
• Transport application (product) management and device management (registration, deregistration, etc.)
• Card production, initialization and distribution

All the tasks above except the last one are related to the CBO part which is called the Central Clearing House System (CCHS). In its turn, the CCHS is divided into two parts: the Clearing Operator (CO), which settles batches of transactions, and the Card Issuer (CI), which handles the Card Master. The card production related tasks are separated from the CCHS for legal reasons into the Initialization and Personalization Center (IPC), governed by a subsidiary of TLS, Trans Link Systems Card Issuer.
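To make the Level 4 validation task more concrete, the sketch below shows in Python what a simplified daytime transaction check might look like. The field names, rule set and data values are our own illustrative assumptions; they are not the actual CBO business rules.

```python
# Illustrative sketch only: field names and rules are assumptions,
# not the actual CBO validation logic.
from dataclasses import dataclass

@dataclass
class CardTransaction:
    card_id: str
    device_id: str
    amount_cents: int          # negative for a fare deduction at check-out
    epurse_balance_cents: int  # balance recorded on the card before the transaction

def validate_transaction(tx: CardTransaction,
                         blacklist: set,
                         registered_devices: set) -> list:
    """Return the list of violated rules; an empty list means the transaction is accepted."""
    violations = []
    if tx.card_id in blacklist:
        violations.append("card is blacklisted (lost or stolen)")
    if tx.device_id not in registered_devices:
        violations.append("transaction originates from an unregistered device")
    if tx.epurse_balance_cents + tx.amount_cents < 0:
        violations.append("e-purse balance would become negative")
    return violations

# Example: a check-out of EUR 2.50 on a card holding EUR 1.00.
tx = CardTransaction("card-001", "gate-42", -250, 100)
print(validate_transaction(tx, blacklist=set(), registered_devices={"gate-42"}))
```

In the real CBO the rules come from the business logic, security and technical requirements mentioned above, and the blacklist is the one generated and distributed by the CBO itself.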

1.2. LaQuSo assignment

The Laboratory for Quality Software (LaQuSo) [1], which is a sub-department within the Department of Mathematics and Computer Science at the Eindhoven University of Technology (TU/e), was involved in the preparation and execution of acceptance testing of the Central Back Office system.

According to the contract between TLS and East-West, part of the TLS activities, including factory acceptance testing (FAT, at the supplier's site) and site acceptance testing (SAT, at the customer's site), was outsourced to East-West. This covered all phases from test planning and preparation to test execution and result interpretation. Accordingly, the developers of the system were assigned to perform its final acceptance effort. This approach has its drawbacks. The supplier has an interest both in delivering a properly working system which satisfies the customer and in meeting the deadlines within budget. Therefore it is recommended that acceptance testing be performed by the customer or the end user organization [12]. For this reason TLS asked LaQuSo, as an independent third party, for its support and expertise in fulfilling several functions:

• to take part in SAT planning;
• to validate completeness and correctness of requirements coverage by test cases;
• to assess test specification and reporting documents provided by the East-West Consortium;
• to witness tests and to evaluate their results.

During the accomplishment of those functions some subtasks arose, such as:

• reviewing the contractual documents;
• gathering user requirements and specifying them formally.

The SAT cycle started in spring 2004. In close collaboration with TLS and East-West, LaQuSo accomplished its mission, and a small-scale pilot system was launched in the Rotterdam region in summer 2005.

The purpose of this paper is to provide an experience report on a large-scale industrial case study, which could be of practical interest to test professionals, and to make some observations regarding the validation and verification techniques used and needed in industry, which should provide useful feedback for academia. The rest of the paper is organized as follows. Section 2 describes the results of a field survey (2.1) and the testing approach which was proposed as a result of this survey (2.2 and 2.3). Section 3 is about the project team activities during the SAT preparation phase. Section 4 covers the test execution stage of the project: testing of the CBO software system (4.1) and of the outsourced operations and processes supporting the system (4.2). Subsection 4.3 describes the test results. Section 5 summarizes the results of the project as lessons learned.

2. Case study and approach

2.1. Field survey at Trans Link Systems

The project was started with a thorough analysis of the contractual and design documents regarding the Central
Back Office system, which were:

• the contract;
• the business rules document;
• the conceptual design;
• the high level system design;
• the EW processes supporting the system.

It was discovered that no user requirements were formally listed at that moment. Moreover, a number of the design documents had not been completely finalized. The reason for this, as we realized, was that the project was still an evolving organism, with many stakeholders from both sides having different interests and responsibilities. This especially concerned the so-called "Dutch delta": the specifics of the Dutch e-ticketing system compared with its Hong Kong prototype. Besides some minor issues (e.g., the daylight-saving switch twice a year), there were such differences as the expected extensive usage of products on a card, certain legal requirements regarding electronic money circulation in the Netherlands, as well as stricter data privacy legislation.

Regarding a testing approach, it was specified contractually that a use case based approach had to be implemented. What was meant by that was that test cases had to be designed according to system usage scenarios, triggered mostly by events at the Level 0/Level 1 interface. No further elaboration, including the design of the scenarios themselves, was done. On the other hand, the IEEE 829 standard [11] was mandated as the basis for software test documentation. An additional constraint on test execution was that during the Central Back Office SAT no connection with the lower levels would be available. Also, it was settled among the parties that the system (software) and the operations (the supporting human-machine processes) would be tested separately.

Thus, after the survey we found out that:

• user requirements had to be retrieved from the project documentation and listed formally;
• it had to be expected that changing documentation would lead to multiple iterations of requirements adjustment and tracing;
• the testing approach needed to be elaborated.

The next subsection describes the approach that we developed together with the East-West testing team and used during the site acceptance testing.

2.2. Approach

Use case diagrams and scenarios, being part of the Unified Modelling Language (UML) [4], are broadly used in software design at the phase of functional requirements specification (see, e.g., [9]). They provide a view on the behavior of a software system from the user's perspective.

That is why use case scenarios seem to be a useful basis for acceptance testing, which is a process of comparing the software system to its initial requirements and the current needs of its end users [12]. On the other hand, a use case based approach to testing, although mentioned as one of the accepted techniques in the testing literature (see, e.g., [10]), is rarely described there in detail. Some publications whose ideas we found useful are available online [5, 6] in the form of short presentations and manuals. Their main idea is to focus on use case scenarios in order to provide sound coverage of functional requirements, whereas the transformation of those scenarios into test cases seems to be straightforward: just represent each path of a use case scenario as a separate test case at the necessary level of detail (test case preconditions, user input and system output, expected results, etc.).

Our experience has shown us that such a direct mapping is not so easy. Firstly, when use case scenarios are designed for large systems, like the one we had in hand, they are inevitably at a very high level of abstraction. In our situation almost every scenario would contain very rough steps covering different levels of the system. In contrast, designing test cases requires very precise specification of each step. In order to avoid possible misinterpretation of scenario steps by a test case procedure, additional checks against use cases and the system specification would be required. Secondly, in our system almost all triggers and many steps of use case scenarios belonged to the lower levels, which were out of scope. Therefore, each use case needed to be adjusted to the Level 4 functionality. Thirdly, for a large system there could be too many test cases. At the same time, many of them would follow the same, or partly the same, control flow (test procedure). Accordingly, the problem of how to reuse test procedures had to be solved.

Another task was how to make the use case based approach compliant with the IEEE 829 standard. We could not find any indication in the literature as to how use case based testing might be put into the test documentation scheme of this standard, which for the test preparation phase looks like this [11]:

• test plan;
• test design specification;
• test case specification;
• test procedure.

Finally, after a lot of discussions among the parties, a three-level test specification approach was proposed by the LaQuSo team and accepted by TLS and EW. At the highest level use case scenarios were introduced. They were given the name "test scenarios". A typical test scenario is
presented in Fig. 2. It describes the sequence of events triggered in the system at Level 1 by a passenger who purchases an inter-operable product which is loaded into his/her card at a station service terminal (POST). The format of the test scenario does not differ much from the traditional one, which can be found e.g. in [9].

Figure 2. Test scenario

The steps of the main test scenario represent the most frequent system behavior in the form of a sequential control flow, whereas variations and exceptions are the alternatives to this normal control flow. The difference between variations and exceptions is that variations represent normal alternatives of the system behavior, whereas exceptions describe the system behavior in case of unexpected failure. In other words, variations as well as the main control flow are the prototypes for different positive test cases with similar test procedures, while exceptions are the prototypes for negative test cases, which have to demonstrate the system's ability to recover from failures. Variations and exceptions are presented in our format by their short descriptions only and not by sequences of steps like the main flow. At this point our specification differs from the standard one [9], in which each alternative path has to be described properly. This compromise between the desired level of detail and the lack of information was proposed by our counterparts. We were bound to accept it because, firstly, not all the processes, especially the ones covering exceptional situations, were in place at that moment and, secondly, it was agreed that all variations and exceptions would be described in detail at the lower levels of the test specification.

The list of related test scenarios included in the test scenario specification had to contain test scenarios which are either used by this one at certain steps or, the other way around, use it as a step. This would give us the opportunity to reuse some control flows in lower level test specifications. The relationships within the test scenario "Buy PTO-IO product" are shown in Fig. 3.

Figure 3. Relation between test scenarios

Our test scenario reuses the test scenario "Perform Daytime Validation for CO/CI", which is depicted by the «include» relation directed to the reused scenario. The test scenario "Buy a card with PTO product (preloaded)" can reuse ours. Finally, two possible variations of the main scenario are connected to it by «extend» relations.

The second level of the test specification was presented by test scripts, which had to contain a procedure describing how to perform the testing of steps from test scenarios. Each test script had to become an elaborate description of just one path of a certain (parent) test scenario and had to contain the following items:

• ID;
• reference to the parent test scenario/variation/exception;
• description ("how to do") of each step of the parent main scenario/variation/exception;
• input data types;
• pass and fail criteria.

A test script description at any step could contain a reference to another test script. In this way the idea of test procedure specification reuse could be implemented. The exact format of the test script was not defined at that moment. The reason for this was that testing had to be split into two parts, system and operations testing, and both testing teams had different preferences for how to visualize test scripts. Besides, the format could partly depend on the test automation tools, which might also be chosen differently by the teams.

The lowest level was presented by test cases, which were considered as instances of test scripts and had to contain:

• ID;
• reference to the parent test script;
• steps from the parent test script;
• input data values for each step;
• expected results for each step.

The exact format of a test case was not specified either. However, the mandatory items listed above would allow test cases to reuse the same test script in two different ways: first, by using different data values of the same data type specified in the parent test script and, second, by performing only some of the steps specified in the parent test script or by altering those steps. In this way the main goal of coverage could be achieved: for every main scenario, variation or exception at least one test case had to be designed.

Regarding compliance with the IEEE 829 standard, even if our approach did not follow it literally, at least its spirit was preserved. Table 1 shows the mapping between our testing artifacts and the items of this standard.

Table 1. Mapping to IEEE 829 Standard
Our Item       | IEEE 829 Item
Test plan      | Test plan
Test scenario  | Test design specification
Test script    | Test procedure
Test case      | Test case specification

What is different is that in the standard both the test design and the test case specifications are used to develop the corresponding test procedure, whereas in our three-level approach test scripts are designed from test scenarios and test cases are then derived from test scripts. The rationale behind this is that use case based testing is essentially a process-centric approach; this process is roughly known at the very beginning as a use case. The logical way, therefore, is to unfold it into a detailed script and then to split the script into several cases.

The list of test scenarios was meant to be a supplementary document for the test plan, which had to be provided by EW.
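As an illustration of the three-level structure and of the coverage rule just stated (at least one test case per main scenario, variation or exception), the following Python sketch models the three artifact levels and their parent references. All class and field names are ours, invented for the example; they are not taken from the project documentation.

```python
# Illustrative data model of the three-level test specification; all names are invented.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class TestScenario:                  # highest level: a use case scenario
    scenario_id: str
    main_flow: List[str]
    variations: List[str] = field(default_factory=list)
    exceptions: List[str] = field(default_factory=list)

    def paths(self) -> List[str]:
        """Each main flow, variation and exception must be covered by at least one test case."""
        return ([self.scenario_id + "/main"]
                + [self.scenario_id + "/var:" + v for v in self.variations]
                + [self.scenario_id + "/exc:" + e for e in self.exceptions])

@dataclass
class TestScript:                    # middle level: how to test one path of a scenario
    script_id: str
    parent_path: str                 # reference to a main scenario, variation or exception
    steps: List[str]
    input_data_types: List[str]
    pass_fail_criteria: str

@dataclass
class TestCase:                      # lowest level: an instance of a test script
    case_id: str
    parent_script: str
    input_values: Dict[str, str]
    expected_results: List[str]

def uncovered_paths(scenarios, scripts, cases) -> List[str]:
    """Paths with no script, or whose scripts have no test case, violate the coverage goal."""
    cased_scripts = {c.parent_script for c in cases}
    covered = {s.parent_path for s in scripts if s.script_id in cased_scripts}
    return [p for sc in scenarios for p in sc.paths() if p not in covered]
```

A non-empty result of uncovered_paths points at exactly those scenarios, variations or exceptions for which the reviewing process would demand additional scripts or cases.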

2.3. Traceability issues

Besides the test specification approach, another major issue had to be solved: how to provide coverage and traceability of user requirements through all the levels of test artifacts. If we analyze the structure of the proposed testing documentation, we can describe it by the traceability model represented by the UML class diagram in Fig. 4.

Figure 4. Requirement traceability model

The figure shows that all testing artifacts are traceable from the test cases up to a particular main scenario, variation or exception: all the unidirectional associations reflect explicit references in the corresponding test artifacts to their parent ones. However, requirement specifications were not yet included in our specification approach. The obvious way to trace requirements would be to put references to the corresponding requirements into each test scenario. However, it is well known that user requirements and design specifications, the highest level of which is represented by use case scenarios, are usually expressed in different languages. As a result such references would always be quite arbitrary. Moreover, some test scenarios would cover some requirements only partially, and no reference could reflect such partial coverage. Our solution was straightforward: a requirements traceability matrix [8], which places all main test scenarios, variations and exceptions (in columns) against the requirements (in rows) (Fig. 5).

Figure 5. Requirement traceability matrix (fragment)

If a requirement is covered by any particular element in the columns, the corresponding cell has to be marked. Of course, we realize that such a mapping is also subjective, but at least the association presented by a matrix is traceable in both directions and, therefore, the analysis of coverage becomes more comprehensible: we could analyze how any particular requirement is spread among test scenarios and how many requirements are relevant to any test scenario, variation and exception. We also made an attempt to quantify the degree of coverage by assigning each mark a number from 0 to 1, where 1 meant complete coverage. For example, the number 0.1, widely present in the matrix, corresponded by agreement to so-called common requirements like "All types of transactions have to be stored in the Card Master". Surely there has to be a test scenario which checks this "all types", but if in any other scenario one particular type of transaction is tested, then this has to be marked as 0.1. Adding up the numbers in each row, we were able to at least figure out which requirements were not covered at all and which ones were possibly "over-tested". Adding up the numbers in each column, it was possible to find test scenarios, variations and exceptions which did not cover any requirement at all and, therefore, were probably redundant.
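The row and column analysis described above boils down to summing the weights of a sparse matrix. A minimal Python sketch, with invented requirement and scenario identifiers:

```python
# Illustrative coverage analysis over a weighted requirements traceability matrix.
# Requirement and scenario identifiers are invented for the example.
coverage = {   # (requirement, scenario/variation/exception) -> weight in [0, 1]
    ("REQ-01", "TS-01/main"): 1.0,
    ("REQ-02", "TS-01/main"): 0.1,   # partial coverage of a "common" requirement
    ("REQ-02", "TS-02/main"): 0.1,
}
requirements = ["REQ-01", "REQ-02", "REQ-03"]
scenarios = ["TS-01/main", "TS-02/main", "TS-03/var:expired-card"]

req_totals = {r: sum(w for (req, _), w in coverage.items() if req == r)
              for r in requirements}
scn_totals = {s: sum(w for (_, scn), w in coverage.items() if scn == s)
              for s in scenarios}

not_covered = [r for r, total in req_totals.items() if total == 0]
possibly_redundant = [s for s, total in scn_totals.items() if total == 0]
print(not_covered)          # ['REQ-03']: requirement not covered at all
print(possibly_redundant)   # ['TS-03/var:expired-card']: covers no requirement
```

Large row totals hint at possibly "over-tested" requirements; zero column totals flag scenarios, variations or exceptions that are probably redundant, exactly the two checks described above.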

3. Test preparation

During the SAT preparation phase our responsibilities as representatives of the customer were:

• to create the list of user requirements to test against;
• to assess the soundness and coverage of the part of the test plan which contained the test scenarios;
• to validate the test scripts and cases against the test scenarios.

All three tasks followed each other in time and took about 4 months to accomplish.

3.1. Requirements specification

First of all, all five sources of requirements listed in 2.1 were thoroughly reviewed in an attempt to retrieve user requirements. To be put on the list, a requirement had to be literally mentioned in at least one of those documents, as shown in the traceability model presented in Fig. 6: the OCL (Object Constraint Language, [4]) invariant means that for each requirement at least one document source has to exist.

Figure 6. Documentation traceability model

The task of requirements retrieval was not so straightforward because, for example, the high level design document by definition contained system rather than user requirements and needed to be compared to the other documents in order to filter out the implementation specifics. Besides, the high level design document was based on the Hong Kong prototype, which at some points did not correspond to the requirements for the new system. In addition, some requirements which we found in the contract were argued by EW to be service level agreement specifications rather than system requirements. Also, a certain number of outdated and ambiguous requirements was discovered. The way we chose to eliminate ambiguity was to prioritize the project specification documents according to the list shown at the beginning of subsection 2.1, starting with the contract as having the highest priority, and to adjust those contradictory requirements which belonged to lower priority documents. As a result, all this took several iterations of review and, this time, the East-West team were our reviewers.

There was no requirements specification tool available at Trans Link Systems at the moment, so we just put the
requirements specifications into an Excel workbook as separate sheets, which listed:

• IPC functional requirements;
• IPC non functional requirements;
• IPC reporting requirements;
• CCHS functional requirements;
• CCHS non functional requirements;
• CCHS reporting requirements.

The reason for such a separation was that the IPC and CCHS parts of the CBO had to be tested separately. Besides, reporting requirements mostly assumed validation means other than testing, such as audits by domain experts. Since no requirements traceability tool was available at that time, such an old-fashioned means of specification made it hard to trace every change in the referred documents, which happened occasionally during the specification period. Finally, the requirements list was agreed upon and test scenario design was started.
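A minimal sketch of the bookkeeping this amounts to: each listed requirement is checked against the invariant of Fig. 6, i.e. it must cite at least one of the five source documents. The requirement identifiers below are invented; only the rule itself comes from the text.

```python
# Illustrative check mirroring the invariant of Fig. 6: every listed requirement
# must be traceable to at least one source document. Identifiers are invented.
SOURCE_DOCUMENTS = {"contract", "business rules", "conceptual design",
                    "high level system design", "EW processes"}

requirements = {                 # requirement id -> source documents it was retrieved from
    "CCHS-F-001": {"contract", "conceptual design"},
    "CCHS-NF-002": {"business rules"},
    "IPC-F-001": set(),          # violates the invariant: no documented source
}

orphans = [rid for rid, sources in requirements.items()
           if not (sources & SOURCE_DOCUMENTS)]
print("requirements without a documented source:", orphans)   # -> ['IPC-F-001']
```

With a requirements-management tool this check comes for free; in the Excel workbook it had to be repeated by hand after every change in the referred documents.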

3.2. Test plan evaluation

Both sides had agreed that the test scenarios had to become a mandatory addendum to the test plan. From the LaQuSo project team's point of view this addendum was the document to focus on. The reviewing process, just like the other ones, took several iterations. At the first step we checked the quality of each presented test scenario, both its content and its formal structure. The content was validated against the documents available at that moment, against logic and, sometimes, against common sense. The structure was checked against the agreed format (Fig. 2). Some typical errors encountered during reviewing, together with their probable impact on future test scripts and cases, are shown in Table 2.

Table 2. Typical Errors in Test Scenarios and Their Effects on Test Scripts/Cases
Error type                                              | Effect on test scripts/cases
Missing preconditions                                   | Missing input data
Missing steps                                           | Missing steps
Conditionals ("if" - "then" - "else") present at steps  | Missing scripts and/or cases
Actor of a step is unclear                              | Impossible to tell user input from system reaction
Missing post conditions                                 | Missing/wrong pass/fail criteria
Missing variations/exceptions                           | Missing scripts and/or cases

At the next iteration the completeness of the test scenarios document was assessed. For this purpose the requirements traceability matrix was used. By that time a lot of contractual issues had been settled, which demanded one more walk through the requirements specification list. A few requirements were agreed as "not for the launch release" and, therefore, we pulled them out of scope. The first version of the requirements traceability matrix revealed that about 90% of the functional requirements were more or less (in terms of our quantified criterion) covered by the test scenarios, whereas the coverage of non functional requirements was more modest: about 60%. Those of them which we accepted as covered were mostly performance and system management/maintenance requirements. For performance issues it was decided that functional test scenarios could be reused with stress loads of data. A separate set of test scenarios called "IT operations" was designed for system management/maintenance issues. Very predictably, security, capacity, availability and scalability requirements were almost not covered, since they do not fit well into the use case based approach. We agreed with EW to separate the remaining uncovered requirements into three groups:

• out of Level 4 scope: not a CBO requirement;
• out of SAT scope: covered at other testing phases (FAT, system integration or security certification tests); EW had to present the corresponding reports for reviewing instead;
• requiring other validation techniques (such as audit).

After sorting out the uncovered requirements, a residual of about 5% still remained on which we did not agree. This residual was written down and marked as "Open issues". Partly those issues resulted from still remaining open issues in the project itself. The next task was to analyze the quality of coverage. During this analysis some test scenarios covering too many requirements were proposed to be split apart, and others were advised to be joined together when it was feasible to consider one test scenario as a variation of another. After a few more iterations the set of test scenarios was accepted by TLS and the project moved to the phase of test script and test case design.

3.3. Test scripts and cases: reviewing process

The EW testing team produced two sets of test scripts: the first one was meant for the system SAT and the second one for operations testing. The developers of the system
prepared the first set, and the people who would be responsible for its operational support in the future prepared the other. As we mentioned earlier, the format of test scripts and cases was not fixed before the test preparation phase. As a result, each testing team had chosen its own format. The variant of a test script proposed for the system testing is shown in Fig. 7. This script corresponds to the test scenario presented in Fig. 2. (The real content is removed for non-disclosure reasons.)

Figure 7. Test script (fragment)

As the field descriptions in Fig. 7 show, this format is a kind of joint version of a test script and a test case. All agreed mandatory items of both of them (described in 2.2) are recognizable in the format. The column "Test Data Specification" contains the types of data prepared for the test, e.g. a card number, a product ID, etc., as well as actual values of that data. As we noted earlier, the CCHS was supposed to be disconnected from the lower levels during testing and test data simulators were developed to be used instead. This column contains all this simulated data. It has to be emphasized here that in this column several test cases were usually merged. The next column, "Actions...", contains a detailed description of the testing steps together with system input data.

The "Expected Results" column contains the expected system outputs, which are the pass criteria for each test step. The solution providing test script reuse is as follows. Any sequence of steps is allowed to have a unique ID so that it can be referred to from any other test script or from variations or exceptions of this one. Variations and exceptions are not unfolded but just provided with those references or, sometimes, with short descriptions. Several columns representing different system levels provide an easy way of separating CBO related steps from lower level steps and, within the CBO, of splitting up system related steps and pure operational steps performed by TLS or EW. For example, in Fig. 7 only steps 6 to 9, marked by ticks as System Level 4 related, were meant to be tested during the system part of the SAT. In addition, the column "Test Cycle" provides references to the test sessions in which different test scripts had to be organized.

The format seems very convenient for testers to follow during actual testing; normally it contains all the data related to the test in one sheet. Several test cases covering the main scenario as well as its variations and exceptions are merged into one script (if there were too many test cases to fit into a column, they were presented in a separate "test case template" referring to the same test script). However, this format does not correspond to the model in Fig. 4: variations and exceptions have not been elaborated into separate test scripts. Thus, the one-to-one relation between a test script and a main scenario, variation or exception does not hold. As a result, the coverage of every variation and exception by test cases is hard to prove. It was quite obvious how to improve this format: firstly, test cases, at least one for each variation and exception, had to be extracted into separate documents and, secondly, in each joint test script, each variation and exception had to have a reference to the corresponding test case. This way back references from variations and exceptions to test cases would prove coverage, and splitting of test scripts would not be required. However, consensus about such an improvement was not achieved, mostly due to the very tight schedule. Therefore, during the reviewing process much of our effort was directed to coverage assessment in whatever way possible. Too many variations and exceptions were ruled out by the developers as "impossible for a normally functioning CCHS" or "would be a failure", etc. On the contrary, we tried to draw them back into scope as related to system auto-recovery procedures or IT operations maintenance.

The operations test document for the IPC testing, which had to be performed first, had essentially the same format as the one described above, except that, in addition to system related steps, operational steps were described in detail as well. The resultant test scripts suffered from the same weakness: it was very hard to assess the coverage of test scenarios, variations and exceptions.

During the preparation of the CCHS operations testing the EW operations team used the Mercury TestDirector tool [2]. All test scripts were prepared using the corresponding functionality of that tool. Although its requirements traceability functionality was not used, each main scenario, variation and exception which the developers considered relevant was described in a separate test script. If necessary, test scripts could be quickly redesigned to reuse them with different test data. Thus, this refined approach allowed improving test scenario traceability, at least for the operations related test scripts. Reviewing this set of test documents, we concentrated on coverage from another point of view: whether or not all test scenarios which contained operational steps had been addressed. As a result of the reviewing process a few uncovered test scenarios were detected and included in the following versions.

Regarding the content of the tests presented by both testing teams, although a certain effort was made to assess its quality, we realized that our counterparts knew the system much better than we did. So, we decided to rely on their awareness of the system and try to achieve positive results, which meant finding bugs, during actual testing.
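The step-sequence reuse described in this subsection, where a named sequence of steps can be referenced from other test scripts or from variations and exceptions, can be pictured as a simple expansion. The sequence identifiers and step descriptions below are invented; the mechanism is the one described above.

```python
# Illustrative expansion of test script steps that reference reusable step sequences by ID.
# Sequence identifiers and step descriptions are invented for the example.
step_sequences = {
    "SEQ-VALIDATE": ["import simulated transactions", "run daytime validation",
                     "check the validation report"],
    "SEQ-BUY-IO":   ["create POST sale record", "ref:SEQ-VALIDATE",
                     "check that the product appears in the Card Master"],
}

def expand(sequence_id, sequences):
    """Recursively resolve 'ref:<ID>' steps into the steps of the referenced sequence."""
    steps = []
    for step in sequences[sequence_id]:
        if step.startswith("ref:"):
            steps.extend(expand(step[len("ref:"):], sequences))
        else:
            steps.append(step)
    return steps

print(expand("SEQ-BUY-IO", step_sequences))
```

The expanded list is what a tester actually walks through; the unexpanded references are what makes the scripts maintainable when a shared flow, such as daytime validation, changes.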

4. Test execution

Two testing teams were organized by EW: the one for system testing consisted of system developers, and the other included persons who were supposed to run the CBO system after the launch. LaQuSo witnessed both test cycles.

4.1. CBO system testing

The IPC system testing implied actual production of e-tickets of different kinds as well as performance testing. The testers were willing to leave some time for free-format testing. This lively process revealed quite a lot of bugs, especially related to the so-called "Dutch delta". It required a few more sessions of regression testing, although, in general, the system functioned as specified. In contrast, during the CCHS system testing rigid sets of test data were predefined in advance by a simulator. Moreover, these sets of transactions were generated quite randomly. As a result, the history of transactions related to one particular card had nothing to do with a real-life situation. For example, during testing it was possible to have a card in the Card Master which was blocked before it was actually initialized, or whose e-purse was auto-reloaded before the card was sold, and it was uncertain whether this was a

testing approach side effect or a system failure. Such a test data preparation approach was, of course, quite acceptable for system integration testing; however, it made it hard to support the use case based approach, in which the triggers of system events have to be in a logical order. Our first effort was to persuade the testers to prepare comprehensible test data that had to be explained to us before each test session. This was done, and the test preconditions and results became clearer. Nevertheless, the LaQuSo witnesses had the impression that the tests were very well rehearsed. It was obvious that the test data had been carefully selected and used many times before. Of course, this had nothing to do with any artifice of the testers; it was a clear drawback of the organizational approach in which the developers were made responsible for the SAT. It is well known [10] that developers honestly believe that their system is correct and that what needs to be done is to prove this to the customer. A tester, on the contrary, has to have the opposite attitude: for a tester, finding a bug is a real positive result [10]. However, it is psychologically very hard for a developer to adopt such an attitude. As a result, very few minor bugs were found. However, we were well aware that there are always undiscovered bugs in any system. So, we decided to do two things: first, to try to "shake the boat" during the operations testing and, second, to walk through the requirements list one more time in order to directly compare the requirements with the content of the tests we saw. This way, having already more knowledge of the system, we would be able to assess the soundness of the test content.

4.2. CBO operations testing

The operations testing team consisted of persons responsible for system support in the future. Therefore, they were willing to find system weaknesses. Both the IPC and CCHS systems were subjected to thorough examination. The success was especially apparent with the CCHS. Dozens of defects were discovered, both in the Graphical User Interface (GUI) and in the back-end business logic, just through unhurried analysis of the GUI response to different inputs, using negative testing, boundary-value testing and other known testing techniques [12].

The defects were classified on a predefined scale: C - critical; H - high severity; M - medium severity; L - low severity. It was agreed between the parties that all C and H issues had to be resolved before the launch. The resolution of M-level bugs was not immediately mandatory; L-level "cosmetic" issues were meant to be resolved in after-launch releases. Consequently, a list of regression test items emerged and was put into the TestDirector defect-tracking system.

After that an additional walk through the requirements
list was done by the LaQuSo team in order to compare the requirements with the content of the tests performed. We proposed adding a few more tests to the list of regression tests. As a result, a series of regression tests was prepared in collaboration between the LaQuSo/TLS team and the testers. Regarding particular operational issues, the major problems concerned, quite unexpectedly, paperwork and human-to-human processes, e.g. the forms which had to provide input for the system and how to obtain the information for those forms. It was also found that the interaction between the TLS and EW business processes was not completely clear at that moment.
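The launch criterion agreed between the parties in 4.2 reduces to a simple gate over the defect list: no open C or H defects. The defect records below are invented; only the rule itself is taken from the text.

```python
# Illustrative launch-readiness gate over the agreed severity scale.
# Defect records are invented; the rule (no open C or H defects) is from the text.
defects = [
    {"id": "D-101", "severity": "C", "status": "closed"},
    {"id": "D-102", "severity": "H", "status": "closed"},
    {"id": "D-103", "severity": "M", "status": "open"},    # may be deferred
    {"id": "D-104", "severity": "L", "status": "open"},    # cosmetic, after launch
]

blocking = [d["id"] for d in defects
            if d["severity"] in ("C", "H") and d["status"] != "closed"]
print("ready for launch" if not blocking else "blocked by " + ", ".join(blocking))
```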

4.3. Test results

All regression tests were finalized by April 2005. Of the roughly 130 issues previously encountered, all critical and high severity ones, as well as about 50% of the medium ones, were resolved and the corresponding defects were closed. The rest were postponed until new releases as change or enhancement requests. The final requirements coverage analysis performed by the LaQuSo project team revealed that roughly 94% of the listed requirements were covered during the SAT. The remaining part lay on the boundaries between the Central Back Office and other parts of the system (network security, bank interface, etc.) and could not be tested with only the Level 4 system in isolation. Although the tests took much longer than expected, they were completed successfully before the system launch.

5. Conclusion. Lessons learned

Site acceptance testing of the highest level of the nationwide e-ticketing system was prepared and performed on the basis of the use case based approach. Although not all the steps of the approach were implemented as planned, we consider the results achieved by the LaQuSo project team successful. To summarize, we express our experience regarding the project as lessons learned, which we believe could be useful both for industry and for academia.

• The use case based approach is a suitable technique for site acceptance testing.
• To be successful for large scale systems, the use case based approach requires decomposition into multi-level test specifications.
• Our three-level use case based approach - test scenarios, test scripts, test cases - has proven to be a feasible solution.

• Outsourcing of site acceptance testing to the developers has its drawbacks: the probability of ending up with a demo version of factory acceptance or even system integration testing is quite high. It is preferable that acceptance tests be performed by a party independent of the supplier, either the customer or an independent third party. This eliminates the chance that a conflict of interest occurs.
• When doing acceptance testing, rely on the people who will run the system in the future; they are usually eager to find bugs.
• Sound coverage of requirements demands an iterative process going from requirements specification down to test case design and back.
• Modern requirements-management tools can improve requirements traceability and acceptance test efficiency drastically.

6. Acknowledgment

We thank Joop Akkerman from the Atos Origin company for fruitful cooperation during the fulfillment of the project and for sharing insights into the subject of the paper. We thank Prof. Kees van Hee from the Eindhoven University of Technology for valuable scientific support and advice.

References

[1] LaQuSo. "Laboratory for Quality Software", June 2006; http://www.laquso.com.
[2] Mercury. "Mercury TestDirector", June 2006; http://www.mercury.com/us/products/quality-center/testdirector/.
[3] MTR. "Using the Octopus card", June 2006; http://www.mtr.com.hk/eng/train/octopus.html.
[4] OMG. "Unified Modeling Language Specification v.2.0", June 2006; http://www.omg.org/mda/specs.htm.
[5] P. McBreen. "Creating Acceptance Tests from Use Cases", June 2006; http://www.informit.com/articles/article.asp?p=26652.
[6] R. Collard. "Use Case Testing", June 2006; http://www.stickyminds.com/sitewide.asp?Function=edetail&ObjectType=ART&ObjectId=6857.
[7] TLS CV. "Trans Link Systems", June 2006; http://www.translink.nl.
[8] B. Ramesh and M. Jarke. Toward Reference Models for Requirements Traceability. IEEE Transactions on Software Engineering, 27(7):58-93, 2001.
[9] A. Cockburn. Writing Effective Use Cases. Addison-Wesley, 2000.
[10] E. Dustin. Effective Software Testing: 50 Specific Ways to Improve Your Testing. Addison-Wesley, 2003.
[11] IEEE. IEEE Standard for Software Test Documentation. IEEE Std 829-1998, 1998.
[12] G. J. Myers. The Art of Software Testing. John Wiley & Sons, 2nd edition, 2004.
