A Quantitative Method for Quality Evaluation of Web Sites and Applications

Luis Olsina (1), Gustavo Rossi (2, 3)

(1) GIDIS, Department of Informatics, Faculty of Engineering, UNLPam, Calle 9 y 110, (6360) General Pico, La Pampa, Argentina. E-mail: [email protected]
(2) LIFIA, UNLP, Calle 50 y 115, (1900) La Plata, Buenos Aires, Argentina. E-mail: [email protected]
(3) CONICET, Argentina

Abstract. In this paper, a quantitative evaluation strategy to assess the quality of Web sites and applications (WebApps) is discussed. We give an overview of the WebQEM (Web Quality Evaluation Method) and its supporting tool by presenting an E-commerce case study. The methodology is useful to systematically assess characteristics, subcharacteristics and attributes that influence product quality. We show that the implementation of the evaluation yields global, partial and elementary quality indicators that can help different stakeholders in understanding and improving the assessed product. Concluding remarks and in-progress research are finally presented.

Keywords: Web Engineering, Quantitative Evaluation, WebQEM, Quality Characteristics and Attributes.

1. Introduction

The Web is playing a central role in diverse application domains such as business, education, industry, and entertainment. As a consequence, there are increasing concerns about the ways in which WebApps are developed and the degree of quality delivered. Thus, there are compelling reasons for a systematic and disciplined use of engineering methods and tools for developing and evaluating Web sites and applications [9].

We need sound evaluation methods for obtaining reliable information about a product's quality. These methods should identify which attributes and characteristics should be used to obtain meaningful indicators for specific evaluation goals and a given user viewpoint. It is widely known that the quality of software products can be described in terms of quality characteristics as defined in the ISO/IEC 9126-1 standard [5]. "However, the state of the art in software measurement is such that, in general, the direct measurement of these characteristics is not practical. What is possible is to assess these characteristics based on the measurement of lower abstraction attributes of the product" ([6], p. 3). We consider an attribute to be a directly or indirectly measurable (tangible or abstract) property of an entity (a WebApp in our case). In addition, we can use a quality model (in the form of a quality requirement tree) to specify such characteristics, subcharacteristics and attributes. These quality, cost, or productivity requirements are often referred to as non-functional requirements in the literature. In this context, stakeholders should consider which characteristics and attributes influence product quality and quality in use (though ensuring product quality is often not sufficient to guarantee quality in use; that discussion is beyond the scope of this article).
Specifically, some characteristics that influence product quality are those prescribed in the ISO 9126-1 standard, i.e., usability, functionality, reliability, efficiency, portability, and maintainability. In order to define and specify the quality requirement tree for a given assessment goal and user viewpoint, we should consider diverse attributes, e.g., Broken Links, Orphan Pages, Quick Access Pages, Table of Contents, Site Map, Links Colour Style Uniformity, and Permanence of Main Controls, to quote just a few. It must be admitted, however, that designing a rigorous non-functional requirement model that gives a strong correlation between attributes and characteristics is a hard endeavour.

In this paper, we present the Web Quality Evaluation Method (WebQEM) [14, 15] and some aspects of its supporting tool, WebQEM_Tool [16]. We show that, by using the methodology for assessment purposes, we can give recommendations both by controlling quality requirements in new Web development projects and by evaluating requirements in operational phases. We show that we can discover absent features or poorly implemented requirements, i.e., design and implementation drawbacks related to the interface, navigation, accessibility, search mechanisms, content, reliability and performance, among others.

Though our method can be applied to assess all aspects of Web sites and applications, we focus on those perceived by the user (navigation, interface, reliability, etc.) rather than on other product attributes (such as quality of code or design). In this sense we emphasize Web site characteristics and attributes from a general visitor viewpoint.

The rest of this paper proceeds as follows. In Section 2, we describe the evaluation process to which WebQEM adheres. In Section 3, we discuss a comprehensible example in the field of E-commerce in order to illustrate both the methodology and the first version of the supporting tool. Finally, some concluding remarks and future work are presented.

2. Overview of the Evaluation Process in the WebQEM Methodology

The WebQEM process steps are grouped into the following four major technical phases:

1. Quality Requirements Definition and Specification;
2. Elementary Evaluation (both Design and Implementation stages);
3. Global Evaluation (both Design and Implementation stages);
4. Conclusion of the Evaluation (regarding Recommendations).

Figure 1 shows the evaluation process underlying the methodology, including the phases, stages, main steps, inputs and outputs. This model is inspired by the ISO process model for evaluators [6]. We next give an overview of the major technical phases (the evaluation process also has planning and scheduling steps).

2.1 Quality Requirements Definition and Specification. In this phase, evaluators must clarify the evaluation goals and the intended user viewpoint. They should select a quality model, for instance the ISO-prescribed characteristics, in addition to attributes customized to the Web domain. The relative importance of these components should be identified considering the WebApp's audience and the extent of the coverage required. Regarding the user profile, at least three abstract evaluation views of quality may be defined, i.e., the visitor, developer and manager views. For example, the visitor category can be decomposed into general and expert visitor subcategories. Thus, taking into account the domain and product descriptions, the agreed goals, and the selected user view (i.e., the explicit and implicit user needs), characteristics, subcharacteristics and attributes should be specified in a quality requirement tree. At the end of this phase, a quality requirement specification document is produced.
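Such a quality requirement tree can be sketched as a simple recursive data structure. The sketch below is illustrative only: the node names and weights are hypothetical, not taken from the case study in Section 3.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node in a quality requirement tree: a characteristic,
    a subcharacteristic, or a (leaf) measurable attribute."""
    name: str
    weight: float = 1.0                      # relative importance among siblings
    children: list = field(default_factory=list)

    def is_attribute(self):
        # Leaves of the tree are the directly measurable attributes
        return not self.children

# Illustrative fragment of a requirement tree (hypothetical weights)
usability = Node("Usability", 0.3, [
    Node("Global Site Understandability", 0.4, [
        Node("Table of Contents", 0.5),
        Node("Site Map", 0.5),
    ]),
    Node("Feedback and Help Features", 0.6),
])

def leaf_attributes(node):
    """Collect the measurable attributes (leaves) of a subtree."""
    if node.is_attribute():
        return [node.name]
    return [a for c in node.children for a in leaf_attributes(c)]
```

Walking the tree with `leaf_attributes(usability)` yields the attributes that must be measured in the Elementary Evaluation phase.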

2.2 The Elementary Evaluation. In this phase, two major stages are defined, as depicted in Fig. 1: the design and the implementation of the elementary evaluation. With each measurable attribute Ai from the requirement tree we can associate a variable Xi, which takes a numerical value from a direct or indirect metric. However, the value of this metric does not by itself represent the level of satisfaction of the elementary requirement. For that reason, it is necessary to define an elementary criterion function, which yields an elementary indicator or preference value.

For instance, consider the Broken Links attribute, which counts links that lead to missing destination pages. A possible indirect metric is:

X = #Broken_Links / #Total_Links_of_Site

Now, how do we interpret the measured value? What are the best, worst and intermediate preferred values? The following criterion function is one way to determine the elementary quality preference EP:

EP = 1 (or 100%)         if X = 0
EP = (Xmax - X) / Xmax   if 0 < X < Xmax
EP = 0 (or 0%)           if X >= Xmax

where Xmax is some agreed upper threshold, such as 0.06.

The elementary quality preference EP is frequently interpreted as the percentage of satisfied requirement for a given attribute, and it is defined in the range from 0 to 100% (so the scale type and the unit of the metrics become normalized [20]). Furthermore, to ease the interpretation of preferences, we group them into three acceptability levels, namely: unsatisfactory (from 0 to 40%), marginal (from 40 to 60%), and satisfactory (from 60 to 100%); this is exemplified in Section 3.4.
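The criterion function and the acceptability levels described above can be coded directly. In this sketch, the threshold value 0.06 follows the example in the text, and the band boundaries are the 40% and 60% cut-offs given above; the example input values are invented for illustration.

```python
X_MAX = 0.06  # agreed upper threshold for the Broken Links ratio

def elementary_preference(x, x_max=X_MAX):
    """Map a measured value X (e.g. broken links / total links)
    to an elementary preference EP in [0, 1]."""
    if x <= 0:
        return 1.0
    if x >= x_max:
        return 0.0
    return (x_max - x) / x_max

def acceptability(ep):
    """Group a preference into the three acceptability levels."""
    if ep < 0.4:
        return "unsatisfactory"
    if ep < 0.6:
        return "marginal"
    return "satisfactory"

# Example: a site with 3 broken links out of 200 total links
x = 3 / 200                        # X = 0.015
ep = elementary_preference(x)      # (0.06 - 0.015) / 0.06 = 0.75
level = acceptability(ep)          # "satisfactory"
```

Note that the function is piecewise linear: a site with no broken links scores 100%, and the preference decays linearly to 0% as the ratio approaches the agreed threshold.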

In the implementation stage, the selected metrics are applied to the Web application as shown in Fig. 1. Some values can be measured observationally, while others can be obtained automatically by using computerized tools.

[Figure 1: a process diagram. Inputs (the Web audience's needs, managerial requirements, evaluation goals, Web product descriptions and components, the ISO/IEC 9126 model, and WebQEM models and tools) feed the Quality Requirements Definition phase, which produces the Quality Requirements Specification. The Elementary Evaluation comprises metric selection, elementary preference criteria definition, measurement implementation (measured values) and elementary preference implementation (elementary results). The Partial/Global Evaluation comprises global preference criteria definition, the aggregation schema, and partial/global preference implementation (scored values, final result). The Conclusion of the Evaluation yields documentation and recommendations.]

Figure 1. The evaluation process underlying the WebQEM methodology. The phases, stages, main processes, inputs and outputs are shown.

2.3 The Global Evaluation Phase. Again, two major stages are defined: the design and the implementation of the partial/global quality evaluation. In the design stage, aggregation criteria and a scoring model should be selected. The goal of quantitative aggregation and scoring models is to make the evaluation process well structured, accurate, and comprehensible to evaluators. There are at least two types of models: those based on linear additive scoring models [2], and those based on nonlinear multi-criteria scoring models [1], in which different relationships among attributes and characteristics can be designed. In both cases, the relative importance of indicators is taken into account by means of weights. For example, if our procedure is based on a linear additive scoring model, the aggregation and computation of partial/global indicators or preferences (P/GP), considering relative weights (W), is based on the following formula:

P/GP = (W1 EP1 + W2 EP2 + ... + Wm EPm)    (1)

such that, if each elementary preference EPi lies in the unit interval (0 <= EPi <= 1) and the weights are positive and sum to one (W1 + W2 + ... + Wm = 1, with Wi > 0), then the resulting preference P/GP also lies in the unit interval.
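Formula (1) is a plain weighted sum, and can be sketched in a few lines. The weights and elementary preferences below are invented for illustration; the sketch assumes the weights sum to one, as the linear additive model requires.

```python
def global_preference(weights, prefs):
    """Linear additive scoring, formula (1): P/GP = sum(Wi * EPi).
    Assumes each EPi is in [0, 1] and the weights sum to 1."""
    assert len(weights) == len(prefs)
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * ep for w, ep in zip(weights, prefs))

# Illustrative values only (three sibling attributes)
weights = [0.3, 0.3, 0.4]
prefs   = [0.75, 0.50, 0.90]              # elementary preferences
gp = global_preference(weights, prefs)    # 0.225 + 0.15 + 0.36 = 0.735
```

Applied bottom-up over the requirement tree, the same computation aggregates elementary preferences into partial preferences for each subcharacteristic, and partial preferences into the global quality preference.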