ASSESSMENT OF ROBUSTNESS FOR A WEB-BASED SYSTEM

Tor Stålhane
Norwegian University of Science and Technology, N-7491 Trondheim, Norway
[email protected]

Hue T. Pham
Norwegian University of Science and Technology, N-7491 Trondheim, Norway
[email protected]

ABSTRACT

The WebSys project has developed a method for robustness assessment that applies Failure Mode and Effect Analysis to use cases, combined with the set of robustness stereotypes developed by I. Jacobson. The paper gives a short presentation of the method and shows how it can be applied by assessing the robustness of two systems. We then show that the results from testing these two systems give robustness estimates that are in good agreement with the theoretical results. We have also extended I. Jacobson's method for analysing robustness so that it covers more complex systems.

KEYWORDS

Robustness, FMEA, Web, Use case, Analysis, Testing.

1. INTRODUCTION

The WebSys project is a research project supported by the Norwegian Research Council. The goal of the project is to study the development of web systems in order to find ways to perform trade-offs between robustness and time to market. As part of this research we have looked into the possibility of adapting methods from reliability and safety analysis so that they can be used to analyse robustness.

Robustness has many facets, but in the WebSys project we have chosen to focus on robustness as "the ability to react in a helpful way to incorrect user input". As a consequence, we study the system's behaviour at the user interface. A system is robust if all incorrect input (1) gives a helpful and easy to understand error message, (2) does not cause the system to crash or lose information and, if at all possible, (3) resets the system to the state where it was when it received the incorrect input.

Robustness is not only needed for web systems, but the need is more pressing there than in many other IT systems due to the users' diversity in computer knowledge and experience. In addition, users are less faithful on the Internet. In most cases there are many sites that provide the same services, and if a customer is dissatisfied with a site he will quickly move to another. A non-robust web site will cause the user many inconveniences, such as making him type in the same information several times, giving him strange, incomprehensible error messages and, probably worst of all, allowing him to insert incorrect information, thus exposing him to serious failures later in this or another session.

Earlier papers published on robustness have focused on handling program failures in an orderly manner, mostly by using exceptions – see for instance (DeVale, J and Koopman, P, 2002) and (Mao, C-Y and Lu, Y-S, 2005). Others have studied the possibility of estimating robustness based on test results – see for instance (Groot, P et al, 2000) – but in most cases robustness has just been discussed in general terms as part of a system quality model, such as in (Mich, L. et al, 2003). I. Jacobson has developed a set of UML stereotypes and a set of rules that can be used to analyse the robustness of a UML system description (Jacobson, I. et al, 1992, Pender, T, 2003). Unfortunately, it is not possible to assign a numerical value to the result of this

robustness analysis. We have used Jacobson's stereotypes in the analysis in order to identify the control objects – the objects that control the system's interface with the user. Figure 1 shows the three stereotypes introduced by Jacobson.

[Figure 1. Jacobson's stereotypes for robustness analysis: boundary object, control object and entity object]

The rules that apply to the stereotypes are as follows: (1) actors can only talk to boundary objects, (2) boundary objects can only talk to control objects and UML actors, (3) entity objects can only talk to control objects and (4) control objects can talk to boundary objects, other control objects and entity objects. In this way we ensure that all information is handled by a control object before storage or display. A small sketch of how these rules can be checked mechanically is given at the end of this section.

Robustness is not an add-on – a characteristic that can be added to any system when and if the need arises. If robustness is important we need to consider it from day one. We were thus looking for a method that could be applied as soon as the use cases were available. The method should serve both the developers – how do we build robustness into the system? – and quality assurance – how robust will the system be?

The WebSys project has evaluated several methods that could be used to assess robustness. One possibility could be to use a reliability growth model – see for instance (Lyu, M.R, 1996) – but the need for a large amount of failure data makes this impractical except as an after-the-fact analysis, which is not particularly helpful during system development. Instead, we decided to use the Failure Mode and Effect Analysis – FMEA – method. FMEA stems from the analysis of mechanical, electric and electronic systems. However, due to its simplicity and versatility, FMEA has been used with success in software development for a long time – see for instance (Reifer, D.J, 1979) and (Bowles, J.B and Wan, C, 2001). In addition to being simple to use, it allows us to draw on the experience of all the participants in the analysis process.

Most of the work done on the use of FMEA in software development has focused on safety and reliability. In the WebSys project, however, we found that FMEA was just right for our purpose. The most important factors leading to our choice were that FMEA
• can be used on functions, scenarios and components. We could thus start to use FMEA as soon as we had the first use cases.
• is easy to understand and use. We could thus involve all stakeholders – e.g. customers, designers, developers and testers.
• can use both quantitative and qualitative data. We could thus start with whatever information was available and improve the robustness assessments as we got more information.

Our first paper on the use of FMEA in robustness assessment was published in 2004 (Zhou, J and Stålhane, T, 2004). This was a theoretical paper, focusing on getting a clear definition of the problem and showing how FMEA could be used in a practical setting. When we had constructed what we consider to be a sound theoretical foundation, we decided to test the method on two systems developed by the university's department for IT services – DAIM, a system for the administration of master theses, and EpN, a planning tool for university courses.

The intention of this paper is not to prove that the method works. What we want to do is to give a demonstration of the concept, to show how it works and to show that the theoretical results agree with the results obtained by performing robustness tests on the two systems in question. We will thus first describe the assessment process, based on use cases and FMEA, and then compare the results obtained in this process with the results obtained by a robustness test.
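To make the communication rules concrete, the following is a minimal sketch – our illustration, not part of Jacobson's method – of how the rules could be checked mechanically for a set of interactions extracted from a UML model. The stereotype names and the interaction format are assumptions made for the example.

```python
# Sketch of a mechanical check of Jacobson's communication rules.
# The rule table below encodes rules (1)-(4) from the text.
ALLOWED = {
    "actor":    {"boundary"},                       # rule (1)
    "boundary": {"control", "actor"},               # rule (2)
    "entity":   {"control"},                        # rule (3)
    "control":  {"boundary", "control", "entity"},  # rule (4)
}

def rule_violations(interactions):
    """Return all (source, target) pairs that break the rules.

    Each interaction is a pair of stereotype names, e.g. ("actor", "boundary").
    """
    return [(src, dst) for src, dst in interactions
            if dst not in ALLOWED.get(src, set())]

# A boundary object talking directly to an entity object violates rule (2):
print(rule_violations([("actor", "boundary"),
                       ("boundary", "control"),
                       ("boundary", "entity")]))  # [('boundary', 'entity')]
```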

2. ANALYSIS AND TEST

2.1 FMEA and robustness analysis and assessment

The FMEA can best be described as a structured brainstorming. The structuring is provided by the FMEA table. The table's content is not fixed but is adapted to the situation at hand. The table we have found convenient for our use is shown in table 1.

Table 1. Template for the robustness FMEA table

Use case Id:

Control object | Robustness failure mode | Seriousness S | Mitigation M | System effect | Indicators
ID | Failure description | Score | Score | Effect description | Observable effect

The columns are used as follows:
• Control object: the object we are currently analysing. This can be a module, a component or a process, depending on the level of analysis.
• Robustness failure mode: non-fulfilment of an explicit or implicit robustness requirement.
• Seriousness – S: a number indicating the seriousness of the failure mode. We have used numbers from 0 (not critical at all) to 5 (highly critical).
• Mitigation – M: a number indicating how well the failure mode is handled by the system. We have used numbers between 0 (not handled at all) and S (handled completely).
• System effect: the effect of the failure mode on the system under analysis.
• Indicators: events or relationships that can be observed if the failure mode occurs.

When the FMEA table is finished we can assess the robustness – Rb_F – for each failure mode as:

$$Rb_F = \frac{M_F}{S_F} \qquad (1)$$

In order to keep it simple, we assess the robustness of each use case scenario as the average robustness over all failure modes in the use case – equation (2) – and the robustness of the system as the average over all use cases in the system – equation (3):

$$Rb_U = \overline{Rb_F} \qquad (2)$$

$$Rb = \overline{Rb_U} \qquad (3)$$

The use of score values in this type of arithmetic may cause a certain amount of concern among mathematical and statistical purists. Ever since S.S. Stevens' paper in 1946, there has been an ongoing discussion on what is allowed and what is not when it comes to numbers, measurement and arithmetic (Stevens, S.S, 1946). We will not repeat this discussion here but instead refer the interested reader to the works of John W. Tukey (Tukey, J.W, 1961).
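As a concrete illustration of equations (1) to (3), the sketch below computes the robustness values from a set of (M, S) score pairs. The data layout is an assumption made for the example; the arithmetic is exactly the averaging defined above.

```python
# Sketch of the robustness arithmetic in equations (1)-(3).
# A use case is given as a list of (M, S) pairs, one per failure mode.

def failure_mode_robustness(m, s):
    """Equation (1): Rb_F = M_F / S_F."""
    return m / s

def use_case_robustness(failure_modes):
    """Equation (2): the average Rb_F over all failure modes in the use case."""
    values = [failure_mode_robustness(m, s) for m, s in failure_modes]
    return sum(values) / len(values)

def system_robustness(use_cases):
    """Equation (3): the average Rb_U over all use cases in the system."""
    values = [use_case_robustness(uc) for uc in use_cases]
    return sum(values) / len(values)

# A use case with two failure modes, scored M=2/S=5 and M=3/S=5:
print(use_case_robustness([(2, 5), (3, 5)]))  # (0.4 + 0.6) / 2 = 0.5
```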

2.2 Two examples

2.2.1 DAIM – An administration system for master theses

DAIM is a role-based system and uses PHP in addition to JavaScript, XML and SQL. The structure of the DAIM system is relatively simple. When a user enters input, the system validates it first in the client and then at the web server before saving it. However, not all input data is validated, either at the client or at the server. There are no interactions between objects at multiple layers at the same time; the application is a sequential process. Thus, the Jacobson model can be used to model all possible use cases of the DAIM system without problems.
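DAIM itself is written in PHP and JavaScript; as a language-neutral illustration of the client-then-server validation described above, the sketch below repeats the check on the server regardless of what the client has done, so that invalid input can never reach the database. The field name and the validation rule are invented for the example.

```python
# Illustrative sketch of server-side re-validation: the server never trusts
# that client-side validation has run. The field and rule are invented.

def validate_credits(raw_value):
    """Check a numeric form field; return (ok, value_or_error_message)."""
    try:
        value = float(raw_value)
    except (TypeError, ValueError):
        return False, "Credits must be a number."
    if not 0 < value < 100:
        return False, "Credits must be between 0 and 100."
    return True, value

def handle_submission(form):
    ok, result = validate_credits(form.get("credits"))
    if not ok:
        # Robust behaviour: prompt the user and leave the stored data unchanged.
        return {"status": "error", "message": result}
    return {"status": "saved", "credits": result}  # only validated input is saved

print(handle_submission({"credits": "7.5"}))  # {'status': 'saved', 'credits': 7.5}
print(handle_submission({"credits": "abc"}))  # error message, nothing is saved
```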

The following small example is taken from the robustness assessment of the DAIM system as presented in (Pham, T.H.T, 2006). The use case diagram is shown in figure 2 and the corresponding FMEA table is shown in table 2.

[Figure 2. Use case for the function "Fill in contract", showing Main page, Result page, Validate, Update, Show, Web server and Database]

From this use case we can identify the control objects Validate, Update and Show. The FMEA allows us to document all possible robustness related failure modes for each control object, as shown in table 2.

Table 2. FMEA table for the use case UC02 – Fill in contract

Use case Id: UC02 – Fill in contract

Control object | Robustness failure mode | Seriousness S | Mitigation M | System effect | Indicators
Validate | Invalid input is received but not handled | 5 | 2 (0.4) | System crashes or invalid input is saved to the database | No response is produced; invalid input is saved
Update | Error output from "Validate" is received but not handled | 5 | 2 (0.4) | System crashes or invalid input is saved to the database | No response is produced; failure in "Validate information"
Update | Database contains incorrect data or user input is not handled | 5 | 3 (0.6) | Incorrect data is presented to the user | Information found is incorrect
Show | Error output from the "Update" module is not handled | 3 | 2 (0.7) | User must retype information | No response is produced
Show | Output from "Update" contains incorrect data that is not handled | 5 | 2 (0.4) | Incorrect data is presented to the user | Database contains incorrect data
Sum for use case UC02 | | 23 | 11 (2.5) | |
Assessed robustness: 0.48 (0.50)
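Working through table 2 gives the assessed value in the bottom row: the ratio of the summed scores, and (in parentheses) the average of the normalized mitigation scores used later for comparison with the tests:

$$Rb_{UC02} = \frac{\sum M_F}{\sum S_F} = \frac{11}{23} \approx 0.48, \qquad \overline{M_F/S_F} = \frac{0.4 + 0.4 + 0.6 + 0.7 + 0.4}{5} = 0.50$$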

2.2.2 EpN – A planning system for university courses

EpN uses the WebObjects framework. This framework has a multi-layer architecture and provides built-in packages that contain several processes. For example, a built-in validation function is applied to validate user input. When a user triggers a request, several objects at several layers are invoked, forming a matrix structure. When we first applied Jacobson's model to assess the robustness of the EpN system we experienced some problems. The reason is that Jacobson's model lacks the ability to model concurrent processes, such as concurrent processes at multiple layers. For example, application, session and component are concurrent objects that must all be alive when performing a request. This information cannot be modelled using the original version of Jacobson's analysis model. For example, if a process inside one module interacts with a process inside another module, it is impossible to show which process is a member of which module. In order to solve this problem, we use a box to show whether a process is inside an object or a module. Figure 3 shows the result of applying our extended version of Jacobson's analysis model to the problem.

[Figure 3. Analysis diagram for Set/change title in EpN]

Figure 3 illustrates a normal request-response loop in the EpN system. The control objects "Detect request", "Encode/Decode", "Process request", "Generate response", "Sleep" and "WebObjects Control" are built-in packages from the WebObjects framework. "Process request" awakens objects at several layers in order to get them ready for performing a request. Of the built-in packages, only "Generate response" influences input robustness; the remaining control objects that influence input robustness are "Check input complete", "Format & Validate Decimal number" and "Save Changes".
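To make the extended analysis concrete, the sketch below models the request-response loop of figure 3 as a simple pipeline of control objects and marks the ones that must enter the robustness FMEA. The stage names come from the analysis above; the code and the exact stage ordering are our illustration, not part of WebObjects.

```python
# The request-response loop of figure 3 as a pipeline of control objects.
# True marks the stages that influence input robustness per the analysis.
PIPELINE = [
    ("Detect request",                   False),
    ("Encode/Decode",                    False),
    ("Process request",                  False),
    ("Check input complete",             True),
    ("Format & Validate Decimal number", True),
    ("Save Changes",                     True),
    ("Generate response",                True),
    ("Sleep",                            False),
]

# Only the robustness-relevant control objects get rows in the FMEA table:
fmea_objects = [name for name, relevant in PIPELINE if relevant]
print(fmea_objects)
```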

2.3 Testing for robustness

2.3.1 Testing method

Based on the robustness requirements of the target system, we can identify the use cases that influence input robustness. When the system receives incorrect input, a test is passed if the system behaves as defined in the robustness requirement. If not, the test reveals a failure – critical or not critical. There are two types of tests, intended to test different issues related to input robustness. The first type tests the reaction to invalid input; each invalid input is considered one test. The second type checks that the input is complete. In this case the form is submitted with at least one parameter missing, but the input is not necessarily invalid. Several inputs are filled in and counted as one test. These test cases are marked by an [L] in the input data field – see figure 5. A tester collects test data through observation of the test results and uses the template shown in figure 4 to capture the preconditions, the expected reaction and the system's observed behaviour due to incorrect or incomplete input.

Use case ID: use case name | Date:
Precondition:
Field: Input data/Action | Expected reaction | Observed reaction
… | … | …
Total number of tests: | Number of tests passed:

Figure 4. Form for collecting data during testing

The total number of tests and the number of tests passed are summed up when the tests are finished. As shown in equation (4), the robustness of use case U is estimated as the sum of all tests passed (TP) in use case U divided by the total number of tests (T) in use case U. When we estimate the robustness based on the test results, we can choose between two alternatives: we can (1) give the tests importance weights to match the severity score of the problem each test aims to discover, or (2) normalize the mitigation scores with the severity score and use this number (M/S). For this first evaluation of our method, we have chosen the latter approach. The normalized mitigation numbers are shown in parentheses in table 2 above. For each use case, the robustness estimate is the average over all tests pertaining to this use case. For the robustness of the total system, the average is taken over all tests applied to the system, as shown in equation (3) above.

$$Rb_U = \frac{\sum_{j \in U} TP_j}{\sum_{j \in U} T_j} \qquad (4)$$
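The bookkeeping behind equation (4) is straightforward; the sketch below, with an invented record format of (use case id, passed) pairs, computes both the per-use-case estimates and the system estimate taken over all tests.

```python
# Sketch of the test-based robustness estimate in equation (4).
# A test record is a (use_case_id, passed) pair; the format is an assumption.
from collections import defaultdict

def robustness_from_tests(records):
    """Equation (4) per use case, plus the system average over all tests."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for use_case, ok in records:
        total[use_case] += 1
        passed[use_case] += int(ok)
    per_use_case = {u: passed[u] / total[u] for u in total}
    system = sum(passed.values()) / sum(total.values())
    return per_use_case, system

records = [("UC02", True), ("UC02", False), ("FR04", False), ("FR04", False)]
print(robustness_from_tests(records))
# ({'UC02': 0.5, 'FR04': 0.0}, 0.25)
```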

2.3.2 Testing DAIM

Whenever possible, AutAT (Automatic Acceptance Testing of web applications) was used to automate the tests – see (Øvstetun, T.M et al, 2005). There are, however, some limitations in AutAT; e.g. the tool does not support testing of dynamic web pages. Such tests were therefore performed manually. The automated tests were created in AutAT and then converted to Watir in order to be run. An example of a test description is shown in figure 5.

The robustness requirement for DAIM is that "DAIM should always respond to the user's action either by correct result or prompting the user due to user mistake or internal component failure". Based on this requirement, all tests were intended to check the system's behaviour in the face of internal component failures and user mistakes. When DAIM receives incorrect input, it responds in one of four ways: (1) the system recognizes an invalid input and prompts the user for new input, (2) the system recognizes the input, and a default or old value is kept and saved without prompting the user, (3) the system does not discover that the input is invalid and saves it in the database, or (4) the system does not discover that the input is invalid and abnormal software behaviour is exposed directly to the user. Type (1) is the proper software behaviour, (2) is a failure, while (3) and (4) are critical failures. The robustness of each use case and of the entire system was estimated from the test results as shown in equation (4). The tests intended to check the system's behaviour due to internal component failure are the cases where the user does not fill in any input; in these cases there is no input to validate. Based on the testing of all relevant use cases in DAIM we obtain an estimated robustness of 0.67. The assessment described in section 2.2.1 gives us 0.65. Thus, the deviation between the robustness estimated by testing and by the assessment framework is approximately 3%.
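The reported deviation follows directly from the two values:

$$\frac{|0.67 - 0.65|}{0.65} \approx 0.03 \approx 3\%$$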

2.3.3 Testing EpN

EpN is a large system with 19 functional requirements. Only seven of these requirements allow the user to give input. In this case study, only use cases that require user input were tested and considered in the input robustness. The WebObjects framework has also generated many processes that have no influence on input robustness. Many of the EpN web pages are dynamic, and the AutAT tool cannot be used to test this system since the tool does not yet support dynamic web pages. The tests were therefore performed manually.

FR04: Set/Change title | Date: 10.07.06

Precondition:
1) Log on with a valid username and password
2) Choose a faculty by clicking a link, such as "Faculty for IT technology"
3) Click the button "new subject..."
4) Click the button "Title"
5) Fill in the following in the indicated fields.

Field: Input data | Expected reaction | Observed reaction
[L] Bokmål: | System detects that the filled-in input is not complete and prompts the user | As expected
Bokmål:, Engelsk:, Nynorsk:, URL:, Emnekodeforslag:, Forkortet: | None of these fields can include only integers. System detects invalid input and prompts the user. | Invalid input is not discovered and is saved without prompting.
Study points: | The credit cannot have 4 digits. System prompts the user. | 1234 is converted to 1234.0 and saved without prompting.
Study points: | The credit cannot have 5 digits. System prompts the user. | System does not save input with 5 digits as credit. System retains the old value of the credit without prompting the user.
Study points: | The credit cannot be a letter. System prompts the user. | Invalid input is recognised; an old or default value of the credit is retained without prompting the user.

Total tests: 10 | Tests passed: 1

Figure 5. Test description for "Set/change title"

The failures are not categorized as critical or non-critical, only as failure or no failure. A test is passed if invalid input is recognized and the system prompts the user for new input. Based on the testing of all relevant use cases in EpN we obtain an estimated robustness of 0.35. The assessment described in section 2.2.2 gives us 0.33. In this case the deviation between the robustness estimated by testing and by the assessment framework is approximately 7%.

2.3.4 The influence of the level of analysis

The differences between estimated and assessed robustness can stem from several factors. Firstly, the difference between the level of Jacobson's diagram and the requirements is important. The level of Jacobson's analysis model will vary, from a use case (high level) to a user story (medium level); a use case can be based on one or several user stories. If Jacobson's analysis is applied at a high level, the analysis of the robustness failure modes and their possible causes will also be at a high level. For example, a possible robustness failure cause at a high level is "erroneous user input", which at a medium level will be split into, say, illegal characters and invalid format. Analysing robustness failure modes and causes at a high level will not give the same estimated robustness value as an analysis at a medium level. The chosen level is, however, dependent on the available information: the more information is available, the more detailed the robustness assessment can be, thus giving a more correct value.

2.4 Threats to validity

The main threat to the validity of the results is the question of whether the assessment and the test-based estimates really measure the same factor, and whether they are independent. To some degree the assessment and the tests depend on each other, since both are derived from the system's requirements. We have, however, tried to keep the assessment and the tests as independent as possible in the following way:
• The assessment is based on the use cases, the system design and – where available – the code.
• The tests are based on the textual system requirements and are thus separated from the use cases and other system descriptions and models.

In this way we have tried to avoid the tests being contaminated by other system knowledge. On the other hand, only a large amount of user testing can give the final verdict on the quality of our assessments.

3. CONCLUSION

This paper reports two important results. First and foremost, we have demonstrated that our method for the assessment of robustness gives results that are in good agreement with the results obtained by testing. The simplicity of the method makes it easy to involve all stakeholders, and the numerical values obtained from the assessments allow the customer to assign robustness requirements either to each use case scenario or to the total system. The design decisions needed to get a high enough mitigation score are left to the development organization.

Secondly, we have improved on Jacobson's method for robustness analysis. The stereotypes invented by Jacobson must be extended with a box that is used to separate the processes inside each module. In this way, the interactions between modules become visible. It is, however, still impossible to model concurrent processes; thus, concurrent processes must be converted to sequential processes for the robustness analysis.

ACKNOWLEDGEMENT

Throughout this work we received a lot of help from the persons who developed DAIM and EpN – especially Kai T. Dragland, J. Grønsberg and Arne Venstad.

REFERENCES

Bowles, J.B. and Wan, C., 2001. Software failure modes and effects analysis for a small, embedded control system. Proceedings of the Reliability and Maintainability Symposium. Philadelphia, PA, USA, pp. 1-6.
DeVale, J. and Koopman, P., 2002. Robust Software – No More Excuses. Proceedings of the International Conference on Dependable Systems and Networks. Bethesda, MD, USA, pp. 145-154.
Groot, P. et al, 2000. Torture tests: a quantitative analysis for the robustness of Knowledge-Based Systems. Proceedings of the 12th International Conference on Knowledge Engineering and Knowledge Management. McLean, VA, USA, pp. 403-418.
Jacobson, I. et al, 1992. Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley.
Lyu, M.R., 1996. Handbook of Software Reliability Engineering. McGraw-Hill.
Mao, C-Y. and Lu, Y-S., 2005. Improving the Robustness and Reliability of Object-Oriented Programs through Exception Analysis and Testing. Proceedings of the 10th IEEE International Conference on Engineering of Complex Computer Systems. Shanghai, China, pp. 432-439.
Mich, L. et al, 2003. Evaluating and Designing Web Site Quality. IEEE Software, January-March 2003, pp. 34-43.
Pender, T., 2003. UML Bible. Wiley Publishing Inc., Indianapolis, IN, USA.
Pham, T.H.T., 2006. WebSys Robustness Assessment and Testing. Master Thesis, NTNU, Trondheim, Norway.
Reifer, D.J., 1979. Software failure modes and effects analysis. IEEE Transactions on Reliability, vol. R-28, pp. 247-249.
Stevens, S.S., 1946. On the theory of scales of measurement. Science, no. 103, pp. 677-680.
Tukey, J.W., 1961. Data Analysis and Behavioral Science or Learning to Bear the Quantitative Man's Burden by Shunning Badmandments. In The Collected Works of John W. Tukey, vol. III. Belmont, CA, USA, pp. 187-389.
Zhou, J. and Stålhane, T., 2004. A Framework for Early Robustness Assessment. Proceedings of Software Engineering and Applications. MIT, Cambridge, MA, USA.
Øvstetun, T.M. et al, 2005. Automatic Acceptance Testing of Web Applications. Master Thesis, NTNU, Trondheim, Norway.
