Metrics for World Wide Web Information Systems

K. Lenz, A. Oberweis, A. v. Poblotzki
Lehrstuhl für Wirtschaftsinformatik II, University of Frankfurt, D-60054 Frankfurt/Main, Germany

Abstract: The number of information systems in the world wide web is growing continuously. However, the development process of web information systems is not yet sufficiently supported by adequate methods during all phases. In particular, methods for cost estimation are missing. Thus, the development of web information systems bears the risk not only of unforeseen high implementation cost but also of uncontrollable maintenance cost. In this paper we present measures for world wide web information systems based on a conceptual model. Existing cost estimation methods in software engineering are transferred to the development of web information systems. Furthermore, computing the size of an information system allows its classification and helps to find similar web information systems as references.

1 Introduction

Information systems in the world wide web are in widespread use, and the supply of information through this new medium is growing continuously. However, the development of web information systems is not yet sufficiently supported by appropriate methods during all phases of the development process. Implementation and maintenance are supported by multiple tools such as extended web page editors. For the design of web information systems, modeling concepts (Garzotto et al. (1995), Lenz, Oberweis (1998)) similar to those for the modeling of database systems have been proposed. For the planning phase, methods such as those known from software engineering (Sommerville (1992)) are still missing. For example, the lack of cost estimation concepts makes it necessary to calculate development and implementation cost based on intuition or on experience from past projects. Analytical cost estimation concepts require that the relevant parameters influencing the development of web information systems are known and can be quantified with suitable measures.

The metrics developed for that purpose can also be used to distinguish between web information systems according to different relevant characteristics. The information supply differs remarkably, for example, in size and in implementation and maintenance time and cost. Metrics also allow web information systems to be positioned in a multidimensional space, which may serve as a basis for statistical analysis such as cluster analysis and for classification. Additionally, the semantic distance between two web information systems can be determined. This makes it possible, e.g., to find a suitable past project as a reference for a future development project.

In this paper, we introduce metrics for the development of world wide web information systems based on a conceptual model of the information system. To this end, we first describe the underlying conceptual model for web information systems, the so-called page link model. In section 3, existing software metrics and cost estimation methods for software engineering are briefly surveyed. In section 4, we develop a metric for the size of web information systems and discuss relevant factors that influence the estimation of implementation and maintenance cost. Finally, a brief outlook on future work is given.

2 Page Link Model

The page link model supports the modeling of structural aspects of web information systems by a graphical representation. Page types stand for classes of web pages with identical structure and comparable content. They are represented by rectangles in the page link scheme. If all web pages of one page type are combined into a single page, we call this page a list page. The list page may be grouped by a grouping criterion, which is then indicated in the page link scheme. Page types have attributes that describe the content of the web pages of this page type. In analogy to page types, links with comparable anchor, target and purpose are combined into a link type. Link types are represented by arrows between page types. We distinguish between links between two pages (unidirectional link, bidirectional link) and a whole structure of links between several pages (index link, guided tour link, and their combination, the index guided tour link). The graphical representation of the most important components of the page link model is shown in Figure 1.

Figure 1: Components of the page link model (page type, page type with list page and grouping criteria, unidirectional link, bidirectional link, index link, guided tour link)

The page link scheme for a web information system can be derived from an extended Entity Relationship scheme (Silberschatz et al. (1997)). Figure 2 shows a very simple example for a supplier of products such as books, software and accessories. A book can describe or belong to several software products. Conversely, a software product can be described by or belong to several books. Figure 2 (a) shows the extended Entity Relationship scheme for these aspects, whereas Figure 2 (b) presents the derived page link scheme. A list page is created for all products, and the products are grouped by their type (book, software or accessory). On this list page, each product should have its link to a specific web page with a detailed product description, the price etc. To simplify the example, we have omitted the attributes of the page types here.

Figure 2: Page link scheme for a supplier. (a) Extended Entity Relationship scheme with the entity types supplier, product, book, software and accessory and the relationship types r1 (supplies) and r2 (belongs to); (b) derived page link scheme.

A detailed description of the page link model and the derivation process of a page link scheme from an extended Entity Relationship scheme can be found in (Lenz, Oberweis (1998)).
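The components of the page link model can be written down as a small data structure. The following is only a sketch of one possible representation, not the notation of the page link model itself: the class names, the attribute names `name`, `description` and `price`, and the choice of an index link from supplier to product are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class LinkKind(Enum):
    """Link kinds named in the page link model."""
    UNIDIRECTIONAL = "unidirectional link"
    BIDIRECTIONAL = "bidirectional link"
    INDEX = "index link"
    GUIDED_TOUR = "guided tour link"
    INDEX_GUIDED_TOUR = "index guided tour link"


@dataclass
class PageType:
    """A class of web pages with identical structure and comparable content."""
    name: str
    attributes: List[str] = field(default_factory=list)
    has_list_page: bool = False
    grouping_criterion: Optional[str] = None


@dataclass
class LinkType:
    """Links with comparable anchor, target and purpose between two page types."""
    source: str
    target: str
    kind: LinkKind


@dataclass
class PageLinkScheme:
    page_types: Dict[str, PageType] = field(default_factory=dict)
    link_types: List[LinkType] = field(default_factory=list)

    def add_page_type(self, page_type: PageType) -> None:
        self.page_types[page_type.name] = page_type

    def add_link_type(self, source: str, target: str, kind: LinkKind) -> None:
        # Both ends of a link type must be known page types.
        for name in (source, target):
            if name not in self.page_types:
                raise KeyError(f"unknown page type: {name}")
        self.link_types.append(LinkType(source, target, kind))


# Supplier example; attribute names and the link kind are hypothetical.
scheme = PageLinkScheme()
scheme.add_page_type(PageType("supplier", attributes=["name"]))
scheme.add_page_type(
    PageType(
        "product",
        attributes=["description", "price"],
        has_list_page=True,
        grouping_criterion="type",
    )
)
scheme.add_link_type("supplier", "product", LinkKind.INDEX)
```

Keeping page types and link types as separate collections mirrors the scheme's rectangles and arrows and makes it straightforward to count both when a size measure is computed later.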

3 Software Metrics and Cost Estimation

Software metrics are needed to quantify relevant characteristics of the software and of the software development process. Generally, characteristics can be measured in basic units such as 'lines of code' (LOC) or 'function points' for software code. Counting lines of code is not very suitable for the development of web information systems, for almost the same reasons for which it has been criticized in the area of general software development (Humphrey (1995)). A widely accepted method for estimating the size of a software system is the function point method (Garmus, Herron (1996)). The basic unit 'function point' describes important software functions such as input or output. Metrics in basic units can be weighted and combined into more complex formulas. This allows the definition of so-called derived metrics in order to calculate a measure such as the size of a software system or productivity.

If software metrics are used during the planning phase of the development process, the basic units cannot yet be quantified at that time. Therefore, one has to rely on existing measures which have been validated by already finished projects and which serve as a basis for the estimation of the basic units. In order to obtain a relatively high degree of expressiveness, the estimation must be as simple as possible to compute and at the same time as precise as possible. The number of influencing parameters has to be restricted, but without diminishing the quality of the measure by too much simplification. Traceability of the estimation can facilitate error detection, refinement or the adaptation to changing external conditions.

4 Metrics for Web Information Systems

The size of a software system and its implementation cost can be estimated by the function point method. The important functions of the system are counted and weighted, and the development cost can then be calculated with formulas that are to be customized to the specific project requirements. This concept can be transferred to the development of web information systems, where so-called 'web points' have to be counted. Then corresponding weights are to be found, and finally formulas for the calculation of size and cost have to be generated.

Until now, descriptions of the size of information systems have often been restricted to the number of web pages or the storage space. However, the size and maintenance cost of a web information system also depend on the number of links. For example, more maintenance work has to be done for an information system consisting of only three pages with many links to pages of other information systems than for an information system of about twenty pages and only a few corresponding links. Consequently, 'web points' have to be counted for both web pages and links. For that purpose, we use the page link model.

4.1 Size of a Web Information System

First, we consider the size of a web information system. Knowing the size of the information system gives us a first idea of how expensive and time-intensive the development process will be. It helps to compare a specific development project to other projects and thus to find reference projects. Let $n$ ($n > 1$) be the number of different page types of the page link scheme. Then for each page type $i$ ($i = 1, \ldots, n$), $p_i$ denotes the estimated average number of pages at one time. The number of attributes of that page type is $a_i$. A very simple measure for the size of the page type $i$ is

$$p_i \cdot a_i.$$

Now we have to look closer at the link types: the Kronecker symbol $\delta_{ij}$ indicates whether there exists a link from page type $i$ to page type $j$ ($\delta_{ij} = 1$) or not ($\delta_{ij} = 0$). $l_{ij}$ denotes the estimated average number of links from one page of page type $i$ to a page of page type $j$. Let $\omega_{ij}$ denote a weight. The weight has to be chosen adequately depending on the link type. For example, different weights are possible for context links, structural links for navigational help, links without local administration possibility and local links for the navigation on the web page itself. For simplicity, we suppose $\omega_{ij} = \omega_{ji} \;\forall i, j$. In Figure 3, a small excerpt from the page link model of our example is shown together with the relevant variables.

Figure 3: Detail of a page link scheme together with variables for size determination. Page type $i$ (supplier), with $p_i$ the estimated average number of pages and $a_i$ the number of attributes, is connected by a unidirectional link (weight $\omega_{ij} = \omega_{ji}$, $\delta_{ij} = 1$, $\delta_{ji} = 0$, $l_{ij}$ the estimated average number of links) to page type $j$ (product, grouped by type), with $p_j$ and $a_j$ defined analogously.

We then compute

$$
l^{\dagger}_{ij} =
\begin{cases}
\omega_{ij} & \text{unidirectional link,} \\
\omega_{ij}\,(1 + l_{ij}) & \text{index link with integrated list page,} \\
1 + \omega_{ij}\,(1 + l_{ij}) & \text{index link with extra list page,} \\
\omega_{ij}\,(1 + l_{ij}) & \text{guided tour link,} \\
2\,\omega_{ij}\,(1 + l_{ij}) & \text{index guided tour link}
\end{cases}
$$

as the number of links weighted by the respective link type. With this, we can define a measure for the size of a web information system $S$ as

$$
S = \sum_{i=1}^{n} p_i \Bigl( a_i + \sum_{j=1}^{n} \delta_{ij}\, l^{\dagger}_{ij} \Bigr).
$$
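The size measure can be evaluated mechanically once the page types and link types are estimated. The sketch below assumes that the case distinction above reads, in order, ω, ω(1 + l), 1 + ω(1 + l), ω(1 + l) and 2ω(1 + l) for the five link types, and that the inner sum of S runs over the weighted link counts of all existing link types of a page type. The function names, the string keys for the link types and all numbers are hypothetical.

```python
def weighted_links(kind, l, omega=1.0):
    """Weighted number of links for one link type (assumed reading of the cases)."""
    if kind == "unidirectional":
        return omega
    if kind == "index_integrated_list_page":
        return omega * (1 + l)
    if kind == "index_extra_list_page":
        return 1 + omega * (1 + l)
    if kind == "guided_tour":
        return omega * (1 + l)
    if kind == "index_guided_tour":
        return 2 * omega * (1 + l)
    raise ValueError(f"unknown link type: {kind}")


def size(pages, attributes, links, omega=1.0):
    """Size S of a web information system.

    pages:      page type -> estimated average number of pages p_i
    attributes: page type -> number of attributes a_i
    links:      (i, j) -> (link type, estimated average number of links l_ij);
                a missing key (i, j) corresponds to delta_ij = 0
    omega:      one common weight, following the simplification omega_ij = omega
    """
    total = 0
    for i, p_i in pages.items():
        link_sum = sum(
            weighted_links(kind, l_ij, omega)
            for (source, _target), (kind, l_ij) in links.items()
            if source == i
        )
        total += p_i * (attributes[i] + link_sum)
    return total


# Hypothetical estimates for the supplier example.
pages = {"supplier": 10, "product": 200}
attributes = {"supplier": 4, "product": 5}
links = {("supplier", "product"): ("index_extra_list_page", 20)}

S = size(pages, attributes, links)
print(S)  # 10 * (4 + 22) + 200 * 5 = 1260.0
```

With these numbers the supplier page type contributes 10 · (4 + 22) = 260 and the product page type 200 · 5 = 1000, so the links account for a noticeable share of the total even in a small scheme.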

This estimation of the size of a page type can be refined further. For example, each attribute could be weighted according to the underlying domain of the attribute (text, image, video, etc.). Additionally, the average size of the attribute values could be estimated. These measures must be normalized in order to make them comparable and then to integrate them into the formula for size estimation. Finally, development projects must show whether the increase in estimation precision gained by adding another weight or other measures justifies the additional effort.

In contrast to the syntactical size S, which is a measure for the presentation of information, the semantical size of a web information system measures the extent and relevance of the information itself. In order to find information systems similar to a specific web information system, a comparison should be based on both syntactical and semantical size. But as a first step, the syntactical size of a web information system helps to categorize its implementation and maintenance cost for the development process.
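One way to write down the attribute weighting mentioned above is to replace the plain attribute count by a weighted sum. This is only an illustration under assumptions: the symbols $\alpha_{ik}$ (a normalized weight for the domain of attribute $k$ of page type $i$) and $s_{ik}$ (a normalized average size of its values) are hypothetical and do not appear in the size measure itself.

```latex
% Hypothetical refinement of the page type size p_i * a_i:
% \alpha_{ik} = normalized domain weight of attribute k of page type i (assumed)
% s_{ik}      = normalized average size of the values of attribute k (assumed)
p_i \cdot a_i
\quad\longrightarrow\quad
p_i \sum_{k=1}^{a_i} \alpha_{ik}\, s_{ik}
```

For $\alpha_{ik} = s_{ik} = 1$ the refined expression reduces to the simple measure $p_i \cdot a_i$, so the refinement can be introduced one attribute domain at a time.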

4.2 Implementation Cost

As a result of a pre-Delphi study, we identified three cost categories that occur in the prelaunch phase of web information systems. The cost caused by the technical requirements is related to the hardware and to the implementation environment. Hardware-related cost comprises (among others) internet access, routers, servers, backup systems, and proxy and firewall systems. Computers, operating systems, development software and the training needed for the implementation of the web information system cause cost related to the implementation environment. The third category concerns the content of the information system. Content-related cost is the cost for the concept of the media content, the media planning, text, pictures, the layout and the programming.

4.3 Maintenance Cost

The prediction of maintenance cost is more difficult than the prediction of implementation cost, because maintenance time cycles depend on external influences. On the other hand, the cost of maintenance exceeds the implementation cost many times over during the lifetime of a system (Sommerville (1992)). The method used to identify the implementation cost can be used to identify the maintenance cost as well. As a result of the pre-Delphi study, we identified two categories: first, the technical basis with the cost for connections, internet fees and site administration, and second, the information contents with the cost for updates, online marketing, mailing lists and response services. Maintenance can occur periodically, permanently or only a few times (for example once). It can be either predictable or unpredictable; in the latter case maintenance work is done spontaneously. Maintenance expenditures are caused by four reasons: the correction of faults, the adjustment of the site, the extension of the functionality and the improvement of performance. A metric has to cover the objects of maintenance as well as the probability that one of these four reasons occurs. Referring to the objects of a site, the complexity of maintenance can be described by calculating the expenditure for the basic operations 'insert', 'delete' and 'update'.
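How the probability of the four maintenance reasons and the expenditure for the basic operations might be combined can be sketched as an expected value. The formula below is an assumption made for illustration, not a metric defined in this paper: for each reason, the estimated number of 'insert', 'delete' and 'update' operations is multiplied by an effort per operation and weighted by the probability that the reason occurs in a period. All names and numbers are hypothetical.

```python
REASONS = (
    "correction of faults",
    "adjustment of the site",
    "extension of functionality",
    "improvement of performance",
)
OPERATIONS = ("insert", "delete", "update")


def expected_maintenance_effort(probability, operations, effort):
    """Expected maintenance effort per period (illustrative assumption).

    probability: reason -> probability that the reason occurs in the period
    operations:  reason -> {operation: estimated number of operations}
    effort:      operation -> effort for one operation, e.g. in hours
    """
    total = 0.0
    for reason in REASONS:
        # Effort caused by this reason if it occurs.
        per_reason = sum(
            operations.get(reason, {}).get(op, 0) * effort[op]
            for op in OPERATIONS
        )
        total += probability.get(reason, 0.0) * per_reason
    return total


# Hypothetical estimates for one page type.
probability = {
    "correction of faults": 0.5,
    "adjustment of the site": 0.25,
    "extension of functionality": 0.25,
    "improvement of performance": 0.0,
}
operations = {
    "correction of faults": {"update": 2},
    "adjustment of the site": {"update": 4},
    "extension of functionality": {"insert": 4},
}
effort = {"insert": 2.0, "delete": 0.5, "update": 1.0}

print(expected_maintenance_effort(probability, operations, effort))  # 4.0
```

In this example the three reasons with nonzero probability contribute 1.0, 1.0 and 2.0 hours respectively. Summing such values over all page types and link types of a page link scheme would give one figure per period, but how the objects are to be aggregated is left open here.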

5 Outlook

A Delphi study is currently being conducted to find a complete list of the cost factors that determine the cost of development and maintenance. The Delphi technique has gained considerable acceptance inside and outside MIS research for forecasting problems (Niedermann et al. (1991), Saunders, Jones (1992), Malhotra et al. (1993), Robeson (1988)). Any Delphi study comprises several rounds of opinion gathering from an expert panel. In each round, members of the panel are asked to react in writing to a shared document that summarizes the evolving consensus as well as the current positions and arguments of all members of the panel (Martino (1983)). Besides the identification of a complete list of cost determinants, the Delphi method allows the factors to be ranked and validated.

Metrics for the implementation and maintenance cost can be derived by assigning those determinants to the page link scheme. Weights that have to be found for these metrics (e.g. for the different types of maintenance) will be validated by already finished projects. Furthermore, a database that stores information about the development process will be created for finished web information system projects. Based on the metrics for size, implementation cost and maintenance cost, a metric for the semantic distance between two web information systems is to be developed. This metric will make it possible to identify past projects in the database as references for a future web information system project.
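The lookup of a reference project can be sketched as a nearest-neighbour search over metric vectors. Since the semantic distance metric is still to be developed, the sketch substitutes a plain Euclidean distance on min-max normalized values as a stand-in. The function name, the project names, the choice of three metrics (size, implementation cost, maintenance cost) and all numbers are hypothetical.

```python
import math


def nearest_reference(past_projects, new_project):
    """Return the name of the past project closest to the new project.

    past_projects: project name -> tuple of metric values
    new_project:   tuple of metric values in the same order
    Euclidean distance on min-max normalized metrics is used here only
    as a placeholder for the semantic distance.
    """
    vectors = list(past_projects.values()) + [new_project]
    dims = len(new_project)
    lows = [min(v[d] for v in vectors) for d in range(dims)]
    spans = [max(v[d] for v in vectors) - lows[d] for d in range(dims)]

    def scale(v):
        # Map every metric to [0, 1]; a constant metric contributes nothing.
        return [
            (v[d] - lows[d]) / spans[d] if spans[d] else 0.0
            for d in range(dims)
        ]

    target = scale(new_project)
    return min(
        past_projects,
        key=lambda name: math.dist(scale(past_projects[name]), target),
    )


# Hypothetical database entries: (size S, implementation cost, maintenance cost).
past_projects = {
    "shop A": (1260, 40, 10),
    "portal B": (9000, 300, 120),
    "brochure C": (150, 5, 1),
}
print(nearest_reference(past_projects, (1500, 50, 12)))  # shop A
```

Normalizing before measuring the distance keeps a metric with large absolute values, such as the size, from dominating the comparison.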

References

LENZ, K., OBERWEIS, A. (1998): Design of World Wide Web Information Systems. In: I. Balderjahn, R. Mathar, and M. Schader (eds.): Classification, Data Analysis, and Data Highways. Springer, 262-269.

GARZOTTO, F., PAOLINI, P., and SCHWABE, D. (1995): HDM - A Model-Based Approach To Hypermedia Application Design. ACM Transactions on Information Systems, 11, 1-26.

GARMUS, D., HERRON, D. (1996): Measuring the Software Process: A Practical Guide to Functional Measurements. Prentice Hall.

HUMPHREY, W.S. (1995): A Discipline for Software Engineering. Addison-Wesley.

MALHOTRA, M.K., STEELE, D.C., GROVER, V. (1993): Important Strategic and Tactical Manufacturing Issues in the 1990s. Decision Sciences, February 1993.

MARTINO, J.P. (1983): Technological Forecasting for Decision Making, Second Edition. New York.

NIEDERMANN, F., BRANCHEAU, J.C., WETHERBE, J.C. (1991): Information Systems Management Issues for the 1990s. MIS Quarterly, 12, 475-495.

ROBESON, J.F. (1988): The Future of Business Logistics: A Delphi Study Predicting Future Trends in Business Logistics. Journal of Business Logistics, 2, 1-14.

SAUNDERS, C.S., JONES, J.W. (1992): Measuring Performance of the Information Systems Function. Journal of Management Information Systems, Spring 1992, 63-82.

SOMMERVILLE, I. (1992): Software Engineering. Addison-Wesley.

SILBERSCHATZ, A., KORTH, H.F., and SUDARSHAN, S. (1997): Database System Concepts, Third Edition. McGraw-Hill.