Information and Software Technology 38: 391-399 (1996)

Estimating CASE development size from outline specifications

Sophie Cockcroft
School of Computer and Information Science, University of Otago, P.O. Box 56, Dunedin, New Zealand

Size is a significant variable in cost/estimation modelling and is widely regarded as one of the most important determinants of effort. The research documented here was carried out in response to problems associated with estimation in project management. This paper discusses some empirical evidence arising from a study of design size estimation. The approach used here is to count the number of lines in the automatically generated CASE report. These are then used in the estimation equations as a measurement unit. Results from this study are shown to support earlier conjectures that design size can be estimated from early system specifications in CASE development environments. The study concludes that accurate size estimations can be made using data from early system specifications.

Keywords: Computer-aided software engineering, CASE, CASE application products, size measurement, size estimation, software process modelling

INTRODUCTION

It has been suggested by Tate and Verner1 that measures extracted from software specifications may be useful in estimating subsequent development sizes in a CASE environment. They isolated a sequence of sizes of interest during the development of a CASE application. These sizes relate to products yielded at various stages along the development path. The expression used for these products, ‘CASE application products’, was abbreviated to ‘CAPs’ in their paper; this abbreviation is also adopted here.

The work presented here was designed primarily to discover whether these conjectures relating to development size were borne out in practice. In addition, a specific size measurement unit was used, namely the number of lines in an automatically generated CASE report. This novel approach to size measurement was first suggested by Tate and Verner2. To the author's knowledge this is the first time that this measure has been used empirically in the context of size estimation in a data centred CASE business system development.

It is common knowledge that, in the field of software project management, there are widespread cost and schedule overruns. What the project manager is seeking, early in a software project, is an accurate estimate of effort and hence cost for the purposes of project planning and bidding for contracts or providing cost estimates for customers. Software size is a major input to most cost estimation models, which reflects its widely accepted role as one of the single most important determinants of effort1,4,5,6,7. The importance of an accurate size estimate in any system development should not therefore be underestimated. A number of factors have been suggested as the weak link in the cost estimation chain including selection of personnel8 and software sizing9. It is the issue of software sizing that is of interest here.

The research documented here is the enaction and measurement of a software development process that mimics a real world situation in a number of ways. The developments done using this process yield a small development data set which nevertheless is large enough to derive size estimation equations. The empirical data collected are used to test whether the relationships between CAPs postulated by Tate and Verner1 hold and, if they do hold, how strong the relationships are and how they can be used for estimation purposes. It is acknowledged that, since this is a laboratory based study, its results are not necessarily generalisable to the whole CASE community. The provision of empirical evidence, which is difficult to collect in ‘real world’ situations, should however be of value.

BACKGROUND

Motivations of software process modelling

The objective of modelling the software process, from the point of view of measurement, is that it should isolate the essential components of the given process, and hence give us definite activities and/or products to measure. Estimation in general, and size estimation in particular, is intimately linked to measurement in that it is through the use of metrics that accurate estimation will be achieved.

Measurement within the software process supports two processes: management/control and estimation/prediction, as illustrated in Figure 1. In reality this representation of software process modelling motivations is somewhat simplified. It is the goals of management/control and estimation/prediction that are of interest here.

Control

With regard to control in software development, DeMarco used the phrase 'You can't control what you can't measure'10. Control comes under two categories: firstly controlling the process, that is 'doing it right', and secondly controlling progress, costs, etc.11 Software management metrics, which are collected at regular intervals, represent current assessments of the work to be done, the work accomplished, the resources used, and the status of the products being generated12. In addition, the use of metrics is a proven technique for guiding the actions that are taken to improve the software process. Anderson13 emphasised the need to collect data during development that describes production processes and partial and intermediate products, in order to assess technical achievements, project status and progress.

Estimation

Another motivation for measurement is prediction, preferably early on in the project14,15. It is acknowledged that early phases of software development create products that express the properties of the final version and hence can be used as input to a formal model for prediction and control16. This general acceptance of the value of early measurements has led to a proliferation of design measures17. Lehman18 stated the objective of measurement as 'a tool for increased understanding of the technology, to provide models and mechanisms for forecasting'. Further development of measurement technology, and its integration into environment design, is of fundamental importance and must become a major research focus and goal.

The application of software process modelling in this work

The aims of software process modelling in this work were somewhat different from those of other software modelling researchers. In addition to understanding and formalising the process it is necessary to provide means, through more elaborate and complete models, of measuring and managing it. In this case a domain specific product network model is considered to be an appropriate choice because the model is fitted to both the environment (i.e. data centred CASE development) and the measurement needs (i.e. collection of CAP information). The major focus here is on the measurement of size at various stages in CASE development.


Tate and Verner1 introduced the product network model, which is specific to data centred business systems and the CASE development environment. In this model a network of products is defined, each of which can be measured in one or more ways (see Figure 2). Arcs between the product nodes indicate dependence of later products on earlier products in respect of the product attributes of interest, the order of product development in Figure 2 being defined by alphabetic labelling ‘a’ to ‘j’. Processes are not modelled explicitly because they are not of primary interest for this study. However, each product can be regarded as a combination of a process to 'produce that product' and the product itself. The inputs to the process are the products that a given process is dependent on, namely those linked by arcs to it. The diagram presented in Figure 2 is in fact an abstraction from a domain specific software process model, which focuses on development product dependencies for the purposes of investigating size relationships between products related by these dependencies.


The following section provides some background to the choice of size measurement and estimation technique used in this work. References are given to the original articles for a more detailed treatment of the individual techniques.

Measuring software size

There are problems in determining the size of any software product that consists of a large number of different objects (e.g. entities, relationships, data flows, processes). During the development of a software product there are a number of product sizes of interest. In this case the application concerned is a database together with a set of transactions that update and report on the database contents. There are a number of products (e.g. data flow diagram, data model) that are measurable during development. One goal of this work is to measure how much of each product of interest is produced during development.

For the purposes of software process management, specific measures should be made. Traditionally the problem of size measurement has been solved either by finding a common measure for all objects (traditionally lines of code), or by constructing a composite measure that assigns different weights to different objects appropriate for the purpose in hand and then adds the weighted object counts or measures together. Verner19 classified software size metrics under four categories: textual metrics, object counts, vector metrics and synthetic metrics.

Textual metrics These are analogous to commonly used measures of the size of natural language in printed form; examples would be sections, paragraphs, sentences or statements in the COBOL language.

Object counts These are counts of specification or design objects, of objects occurring in other descriptions of software, or within the software itself. Objects that have been counted include functional primitives10, files, flows and processes, logical transaction types, entities and relationships, objects, attributes and operations.

Vector metrics These separately count more than one object type within the software or within some description of it. As the name suggests these use a vector of size metrics rather than a single size value. The metric vector approach suggested by Basili20 gives a description of the size of the data model through, for example, the number of entities, relationships and attributes.

Synthetic metrics Sometimes known as composite metrics, these are single values produced by the application of some function to one or more types of object count. They can be regarded as functions of metric vectors. The following can be defined as synthetic metrics: Software science metric21, function weight10, M1 22, FPA23 and MarkII24.
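As a minimal sketch of the distinction between vector and synthetic metrics, the following Python fragment holds the size of a data model as a vector of object counts and then collapses it to a single value with a weighted sum. The class name, the counts and the weights are hypothetical and chosen only for illustration; they are not taken from any of the metrics cited above.

```python
from typing import NamedTuple


class DataModelSize(NamedTuple):
    """Vector metric: one count per object type (hypothetical structure)."""
    entities: int
    relationships: int
    attributes: int


def synthetic_size(vector, weights):
    """Synthetic metric: a function of the metric vector, here a weighted sum."""
    return sum(weight * count for weight, count in zip(weights, vector))


# Hypothetical counts and weights, for illustration only.
model = DataModelSize(entities=4, relationships=3, attributes=20)
HYPOTHETICAL_WEIGHTS = (5.0, 3.0, 1.0)

size = synthetic_size(model, HYPOTHETICAL_WEIGHTS)
print(model)  # the vector keeps the three counts separate
print(size)   # the synthetic metric reduces them to one number
```

The vector form preserves which object types contribute to size, while the synthetic form is easier to use as a single input to an estimation equation.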


For applications developed with CASE tools there is a further choice, that of lines in a data model report. This is the measure chosen in this case for two reasons. Firstly it represents a synthesis of the sizes, that is, entities, relationships and attributes. Secondly it is an easy to collect and consistent measure of size within a specific development and measurement environment. Intuitively this measure can be regarded as consistent because reports are automatically generated and report lines are automatically counted. In addition this measurement technique rules out subjectivity.

The CASE tool used for these developments is Deft 4.2. Deft provides a report generation facility. For the purposes of this work the size measure being used is the size of the report generated by Deft 4.2, which is analogous to lines of code or lines of documentation, which are common measures. This approach to size measurement has already been suggested as part of the discussion of how CAP sizes should be measured, and in what units 1.

The report size, or more specifically the number of lines in a formatted report, is used as a size measure in this case for three reasons, also noted by Tate25, which are as follows:

• Report lines are highly correlated with tokens. A correlation of line counts against token counts in an earlier study26 produced a result of r² = 0.97.

• Report lines are a natural measure of quantity for requirements.

• Report lines are much easier to count than tokens, using an automatic line counting facility provided by the operating system.
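The automatic line count described above can be sketched in a few lines of Python. This is an illustration only: the function name, the option to skip blank lines and the sample report text are assumptions, and the actual layout of a Deft 4.2 report is not shown in this paper.

```python
import io


def count_report_lines(stream, skip_blank=False):
    """Count the lines of a text report, similar in spirit to `wc -l`."""
    count = 0
    for line in stream:
        if skip_blank and not line.strip():
            continue  # optionally ignore lines that contain only whitespace
        count += 1
    return count


# Hypothetical report fragment; a real report would be read from a file instead.
sample_report = "ENTITY Student\n  ATTRIBUTE Name\n\n  ATTRIBUTE Postcode\n"

print(count_report_lines(io.StringIO(sample_report)))
print(count_report_lines(io.StringIO(sample_report), skip_blank=True))
```

Whether blank lines should count towards size is a measurement decision; what matters for consistency is that the same rule is applied to every report.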

Estimating software size

Size is a key consideration in estimating the cost of software development, which in turn is critical to successful project management. Numerous techniques have been developed to estimate size, but their usefulness depends on the unique combination of circumstances of the estimator. Estimated software size is the key input to most software cost models and to date the measurement unit has been source lines of code (SLOC). Although it has been established that software size is important, there are two major problems with its estimation. First, the suitability of estimation models over a range of projects has not been proved. Second, within a given project different models may be appropriate at different phases. As a first step to understanding these problems and analysing the capabilities and limitations of the various types of model, Ferens27 classified software sizing models into six categories. These are illustrated in Table 1. A regression model is chosen as the basis for size estimation in the current work. It will therefore be discussed at greater length later with particular reference to the current data. The use of regression to obtain component explanation and size estimation equations was a feature of Verner and Tate's ‘Explanation then Estimation’ model28 and the MERMAID method29.

METHODOLOGY

The estimation problem that this research seeks to address is that of estimating later system sizes from early system specifications, based on the product-network model1. The data for these size estimations are collected through the use of a nested application system model. This model, which is designed for controlled laboratory system development, involves incremental development leading to twelve practical system designs. The chosen size measurement unit is lines of CASE generated report, the CASE tool used in this case being Deft 4.2. As stated earlier, a regression based size estimation method was adopted.

Estimation problem

There are three estimation goals in the current work. The first goal is to estimate the user interface size from selected flows in the general data flow model; the second is to estimate the functional specification, database and design size from the general data model, general data flow model or both. The third estimation goal is to conduct an investigation into the general feasibility of estimating any later size from any earlier size. In terms of the product network in Figure 2:

• Estimate Size_h + Size_l + Size_i from any or all of Size_a + Size_b + Size_c

• Estimate Size_g from selected flows in b

Note that the alphabetic labelling corresponds to that of Figure 2.


Data collection

As outlined earlier, for the purposes of this work, we are concerned with a network of sizes of CASE application products. These sizes are derived from a number of increments or subsystems. Two full system specifications were designed. The first was a case study design for a database system for a distance learning college. This system contains only the most basic student details. The second is the same system with more student details, and it drew heavily on the design of the first system.

The functionality of the systems in terms of reporting and querying facilities varied according to the schematic diagram in Figure 3. It can be seen that by varying the functionality available in a given system a total of six system designs can be derived from the basic student address system, excluding of course help with output. These are as follows:

• Basic
• Basic + Input
• Basic + Output
• Basic + Input + Output
• Basic + Input + Help
• Basic + Input + Output + Help
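The nested design can be illustrated with a short enumeration. The sketch below assumes that the six configurations listed above follow a single rule, namely that Help is only offered alongside Input; that rule is an inference from the list, not something stated explicitly in the paper. It also assumes that each configuration is built at two levels of student detail, which is how the study arrives at twelve systems.

```python
from itertools import combinations

OPTIONAL_FEATURES = ("Input", "Output", "Help")
DETAIL_LEVELS = ("basic student details", "extended student details")


def configurations():
    """Enumerate feature sets built on the common core system."""
    result = []
    for size in range(len(OPTIONAL_FEATURES) + 1):
        for combo in combinations(OPTIONAL_FEATURES, size):
            # Assumed rule: Help is only available together with Input.
            if "Help" in combo and "Input" not in combo:
                continue
            result.append(("Basic",) + combo)
    return result


configs = configurations()
systems = [(level, config) for level in DETAIL_LEVELS for config in configs]

for config in configs:
    print(" + ".join(config))
print(len(systems), "systems in total")
```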

Having completed the basic address type systems, which include such details as Name, Student Code, Address, Postcode, Country and Country Code, further data is added to each student record. This additional data includes Sex, Age, Dept, Year of study, E-mail, Fax and Telephone number. When the systems that have more student details are taken into account this makes the total number of systems derived, and hence the total number on which we can gather data, twelve. Statistically this represents the minimum set of data points that can be meaningfully used in a correlation30. Though, strictly speaking, the resulting 12 systems are not independent, they are different. Moreover they are different in different system 'dimensions', namely data (basic), transaction processing (input), report processing (output) and user interface (help). Thus they are, in some ways, more different than many regular IS systems. For the purposes of this study it seems reasonable to regard them as effectively independent data points.

The systems were developed in three stages. In the first stage data flows were merely named; this stage corresponds to the outline data model in Figure 2. In the second stage attribute names were added to the data flows, corresponding to the detailed data model in Figure 2. In the third stage full descriptions of all entities and attributes were added, together with the contents of help messages. User interfaces were developed as well as program structure diagrams and process descriptions to give a detailed system design.

RESULTS

A number of multiple linear regressions were carried out in an attempt to find a linear relationship between the sizes of the final products developed with Deft 4.2 and the early system specifications. The final products are the user interface and the full design; the early system specifications are the data flow diagrams, the data models, the number of attribute names in the early data model and the number of flows in the early data flow diagrams. These early system specifications were used as independent variables, in various combinations, to derive estimation equations. These equations were used to predict firstly the user interface size and secondly the full design size.
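To make the idea of a regression based estimation equation concrete, the following Python sketch fits a straight line by ordinary least squares. It is deliberately simplified: the study used multiple linear regression with several combinations of predictors, whereas this sketch uses a single predictor. The function names and all data values are hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Apply the fitted estimation equation to a new early size."""
    return intercept + slope * x


# Hypothetical sizes in report lines: early data flow diagram vs. user interface.
early_dfd_lines = [10, 20, 30, 40]
user_interface_lines = [35, 65, 95, 125]

intercept, slope = fit_line(early_dfd_lines, user_interface_lines)
print(f"size = {intercept:.2f} + {slope:.2f} * early_size")
print(predict(intercept, slope, 50))
```

With more than one independent variable the same principle applies, but the coefficients are obtained by solving the normal equations rather than from the two sums used here.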


Model accuracy

The level of accuracy of size estimation deemed to be adequate in cost estimation models such as COCOMO8 has traditionally been set at ±20%. That is, estimates should be within 20% of actual. It has been suggested31 that a refinement of the measure is used in size estimation. This refinement was based on the levels of accuracy expected by managers in a survey of fifty-four British companies. Moores32 states that a model of size will be deemed to be accurate if:

1. The estimate is within 20% of the actual 80% of the time

2. The mean magnitude of relative error (MMRE) is less than 20%

where

\[
\mathrm{MMRE} = \overline{\mathrm{MRE}} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left|\mathrm{Actual}_i - \mathrm{Estimate}_i\right|}{\mathrm{Actual}_i} \times 100\%
\]

Equation 1

and

n = number of data points
Actual_i = actual value for the ith data point
Estimate_i = estimated value for the ith data point
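The two accuracy criteria can be checked mechanically. The following Python sketch computes the MMRE of Equation 1 and the percentage of estimates falling within a given tolerance of the actual value. The function names and the data are hypothetical; the absolute value in the relative error is assumed from the word 'magnitude'.

```python
def mre(actual, estimate):
    """Magnitude of relative error for one data point, as a percentage."""
    return 100.0 * abs(actual - estimate) / actual


def mmre(actuals, estimates):
    """Mean magnitude of relative error over all data points (Equation 1)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def percent_within(actuals, estimates, tolerance):
    """Percentage of estimates whose MRE does not exceed the tolerance."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    hits = sum(1 for err in errors if err <= tolerance)
    return 100.0 * hits / len(errors)


# Hypothetical actual and estimated sizes in report lines.
actuals = [100, 200, 400, 500]
estimates = [90, 210, 400, 625]

criterion_1 = percent_within(actuals, estimates, 20.0) >= 80.0
criterion_2 = mmre(actuals, estimates) < 20.0
print(mmre(actuals, estimates), percent_within(actuals, estimates, 20.0))
print("accurate under both criteria:", criterion_1 and criterion_2)
```

In this made-up example the MMRE criterion is met but the first criterion is not, because only three of the four estimates fall within 20% of actual. This shows why the two conditions are stated separately.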

The level of accuracy of size estimation models expected in the above survey should be regarded as an ideal. In practice this accuracy target is rarely achieved. A survey of previous empirical work33 showed that predictions within 30% of actual 70% of the time were a more realistic accuracy target for effort estimation. There is little evidence that size is any easier to predict than effort, so there seems little point in expecting a higher level of accuracy in size models than in effort models. According to these standards the results presented in the next section represent accurate estimations. The MMRE values illustrated in column 3 of Tables 3 and 5 are based on Equation 1.

Since the aim of this study is the early prediction of system size, the independent variables used in the regression are taken from the first two stages of the development. That is, firstly the general systems in which flow, process, data store, and external entity names only are added; and secondly the detailed systems in which names of attributes are included in the above objects. The level in which process and attribute descriptions are included was not used as a predictor because this level represents a major part of the development and at this stage most of the work has been done so it does not provide an early indication of size.

The specific linear regressions carried out to predict the user interface sizes are given in Table 3 (the independent variables are listed in development sequence). The ranking of these measures by their effectiveness in predicting user interface size does not correspond to the development sequence. That is, measurements taken at a later stage in development do not necessarily yield more accurate predictions, as might have been expected; Table 3 illustrates this point. There is a trade off, in this case, between the level of accuracy of prediction using a given variable and how far we have to go down the development path to obtain it. In Table 3 the ‘percentage of work done’ illustrates the proportion of the total work done in developing the user interface, i.e. how far down the development track, in terms of size, we have gone. For example, development of the detailed data flow diagram represents 16.60% of the total work involved in designing the user interface, the measurement unit being CASE report lines. The earliest predictor of user interface size, for this data set, is the general data flow diagram, but the most reliable is the general number of flows with attribute names added or the detailed data flow diagram size. Since 16.60% of the development has to be done in order to obtain the latter estimation, but only 9.12% for the former, there is a strong argument for adding attribute names at an earlier stage of system design in order to facilitate prediction.

Full design size estimation

For the purposes of this work the full design size consisted of the following: user interface size, program structure diagram size, generated SQL for each complete design, and the fully detailed system model for each system, i.e. data flow diagram and data model. This follows the description of software designs put forward by Barnard34 and, unlike the functional specification, also contains some information about the system's processing functions. The specific linear regressions carried out to predict the design size are given in Table 4.

Table 5 shows the effectiveness of various independent variables in predicting design size. It will be seen that, as with the User Interface prediction, the general model with detailed attributes added is a better predictor of size than the general data model alone or even the detailed data model for this data set. Again the best predictors of size are the detailed data flow diagram and the general number of flows and detailed attribute names. The earliest predictor for this data set is the general data flow diagram.

SUMMARY


This work has addressed some of the existing problems of size estimation for the purposes of project management. It is acknowledged that the software industry still has difficulty estimating the resources required for development in a specific application class and development environment. The industry is often unable to explain after the event what went wrong when size/effort/cost estimates are wide of the mark.

In this work twelve data centred business systems were built using the CASE tool Deft 4.2. The systems were built according to a nested development model, whereby a number of systems different in functionality and focus were built upon a basic common core system. During development, size data was collected.

For the purposes of this work a domain specific, process-product network was chosen, which uses a number of aspects of conventional structured systems analysis approaches. The use of a systems analysis approach and in particular process-product networks is favoured because their representation of activities and products or product components makes them readily understandable and also makes them suitable for measurement. The limitation of the study to a finite and controlled environment, together with the use of domain specific models, excludes extraneous factors that would have a greater impact in more heterogeneous situations. Thus estimation and management are simpler in these cases. In this work an abstracted process model has been presented for the purposes of illustrating size dependence (see Figure 2). This network of sizes is used for measurement and estimation purposes.

An unorthodox measure was selected for the purposes of this work, namely CASE generated report line counts. This measure is justified here firstly because it represents a synthesis of the sizes of diverse objects, such as entities, relationships and attributes, and secondly because it is an easy to collect and consistent measure of size within the specific development and measurement environment.

A regression based size estimation technique was chosen for this work because this type of technique is based on historical data and is widely applicable in such studies.

LIMITATIONS OF THE STUDY

The main limitations of this work are as follows. Firstly, only twelve systems were developed; secondly, the same data set was used for calibration and validation. The first of these has two major implications: statistically the sample is small (although it does represent the minimum size on which correlations can usefully be performed), and it reduces the independence of the data points. The reduced independence does not however make the systems any less ‘true to life’ since re-use is a common phenomenon in business applications. The second limitation is that the same data set was used for developing regression equations and testing the effectiveness of those equations. Ideally different data sets or fractions of the same large data set would be used for these two purposes. However, this approach is accepted practice in the field of software engineering economics. Finally, this work is not intended to be generalised to CASE developed systems as a whole; however, the provision of empirical evidence to support conjectures relating to this particular size estimation problem should be of some value.
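The ideal mentioned above, in which one fraction of the data is used for calibration and another for validation, can be sketched as follows. This was not done in the study. The alternating split, the function names and all data values are hypothetical, and a single predictor is used for brevity.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mmre(actuals, estimates):
    """Mean magnitude of relative error, as a percentage."""
    errors = [100.0 * abs(a - e) / a for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


# Hypothetical early and final sizes in report lines.
early = [10, 20, 30, 40, 50, 60]
final = [36, 64, 96, 124, 156, 184]

# Calibrate on every other point, validate on the points held back.
intercept, slope = fit_line(early[0::2], final[0::2])
held_out_estimates = [intercept + slope * x for x in early[1::2]]
validation_mmre = mmre(final[1::2], held_out_estimates)
print(round(validation_mmre, 2))
```

With only twelve data points such a split leaves very few points on each side, which is one practical reason why calibration and validation on the same set is often accepted for small samples.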

CONCLUSION


The specific problem area addressed was that of early size estimation in the context of data centred business systems developed using a CASE tool. Good predictions of user interface size and design size were obtained. The earliest predictor in both cases was the general DFD size (item b of Figure 2). The most reliable in both cases was the general DFD with detailed attributes or the detailed DFD (item e of Figure 2). It was noted that one would have to proceed further down the development path to obtain the latter measurement and that the addition of attribute names alone early in design improves the estimation value of the measurements. It should be noted however, that this rule may only apply to data sets gathered under the same conditions as the one under consideration here.

In summary it is found that accurate size estimations can be made from early system specifications using the number of lines in the automatically generated CASE report as a measurement unit. It would appear that in this environment the addition of attribute names early in development improves accuracy of early prediction.

The main contribution of this work has been the carrying out of empirical work that provides evidence to support hypotheses put forward in previous work. It has been demonstrated that accurate size estimation from early specifications is possible within the context of a given CASE environment.

Suggested further work to build on this could include the development of the system designs right through to implementation stage, testing the prediction equations developed on a different data set and repeat studies using different CASE tools.


REFERENCES

1. Tate, G and Verner, J M ‘Approaches to measuring size of application products with CASE tools’ Information and Software Technology Vol. 33 No. 9 (November 1991) 622-628
2. Tate, G and Verner, J M ‘Software Metrics in CASE development’ Proceedings IEEE COMPSAC, Kogakuin University, Tokyo, Japan (11-13 September 1991) 565-570
3. Albrecht, A J and Gaffney, J E ‘Software function, source lines of code, and development effort prediction: a software size validation’ IEEE Transactions on Software Engineering Vol. 9 No. 6 (November 1983) 639-648
4. Banker, R and Kemerer, C F ‘Scale economies in new software development’ IEEE Transactions on Software Engineering Vol. 15 No. 10 1199-1205
5. Boehm, B W and Papaccio, P N ‘Understanding and controlling software costs’ IEEE Transactions on Software Engineering Vol. 14 No. 10 (October 1988) 1462-1477
6. Kulkarni, D A, Greenspan, J B, Kreigman, D A, Logan, J J and Roth, T ‘A generic technique for developing a software sizing and estimation model’ COMPSAC 88 155-161
7. Lederer, A L and Prasad, J ‘Nine management guidelines for better cost estimating’ Communications of the ACM Vol. 35 No. 2 (1992) 51-59
8. Boehm, B W ‘Software Engineering Economics’ Prentice Hall (1981)
9. Verner, J M and Tate, G ‘A Model for Software Sizing’ Journal of Systems and Software 7 (1987) 173-177
10. De Marco, T ‘Controlling software projects: management, measurement and estimation’ Yourdon Press (1982)
11. Dale, C J and van der Zee, H ‘Software productivity metrics: who needs them?’ Information and Software Technology Vol. 34 (November 1992) 731-738
12. Clapp, J ‘Getting started on software metrics’ IEEE Software Vol. 10 (January 1993) 108-109
13. Anderson, O ‘The use of Software engineering data in support of project management’ Software Engineering Journal (November 1990) 350-356
14. Shepperd, M ‘Products, processes and metrics’ Information and Software Technology Vol. 34 (October 1992) 674-680
15. Farbey, B ‘Software quality metrics: considerations about requirements and requirement specifications’ Information and Software Technology Vol. 32 (January/February 1990) 60-64
16. Fenton, N E and Kaposi, A A ‘Metrics and software structure’ Information and Software Technology Vol. 29 (July/August 1987) 301-320
17. Roche, J M ‘Software metrics and measurement principles’ ACM SIGSOFT Software Engineering Notes Vol. 19 No. 1 (January 1994) 77-85
18. Lehman, M M ‘Software engineering, the software process and their support’ Software Engineering Journal (September 1991)
19. Verner, J M ‘A Generic model for software size estimation based on component partitioning’ Ph.D. Thesis (1989)
20. Basili, V R ‘The TAME project: Towards improvement oriented software environments’ IEEE Transactions on Software Engineering Vol. 14 No. 6 (June 1988) 758-773
21. Halstead, M H ‘Elements of Software Science’ Elsevier, New York (1977)
22. Van der Poel, K and Schach, S ‘A Software metric for Cost Estimation and Efficiency Measurement in Data Processing System Development’ Journal of Systems and Software 3 (1983) 187-191
23. Albrecht, A J ‘Programming Productivity: Issues for the eighties’ IEEE Press, Silver Spring, MD (1981) 34-43
24. Symons, C R ‘Function Point Analysis: Difficulties and Improvements’ IEEE Transactions on Software Engineering Vol. 14 No. 1 (1988) 2-11
25. Tate, G ‘Software process modelling and metrics: a CASE study’ Information and Software Technology Vol. 35 (June/July 1993) 323-330
26. Verner, J M and Tate, G ‘Estimating Size and effort in fourth generation development’ IEEE Software (July 1988) 15-20
27. Ferens, D V ‘Software size estimation techniques’ Proceedings of the IEEE NAECON (1988) 701-705
28. Verner, J M and Tate, G ‘A Software Size model’ IEEE Transactions on Software Engineering Vol. 18 No. 4 (April 1992)
29. Kitchenham, B A ‘Empirical studies of assumptions that underlie software cost estimation models’ Information and Software Technology Vol. 34 (April 1992) 211-219
30. McLean, A ‘Statistics and Financial Mathematics for Business’ Prentice Hall (1982)
31. Moores, T T ‘A model to size the development of prolog programs’ City Polytechnic of Hong Kong Departmental working papers No. 94/13 (1994)
32. Moores, T T and Edwards, J S ‘Could large UK corporations and computing companies use software cost estimating tools? - a survey’ European Journal of Information Systems Vol. 1 No. 5 (1992) 311-319
33. Tate, G and Verner, J M ‘Software Costing in Practice’ in Information and Software Economics, edited by Richard Veryard, Butterworth Scientific, UK (1990)
34. Barnard, H J, Metz, R and Price, A L ‘A Recommended Practice for Describing Software Designs: IEEE Standards Project 1016’ IEEE Transactions on Software Engineering Vol. SE12 No. 2 (1986) 258-263


Table 1. Comparison of size estimation techniques, based on Ferens27

| Technique | Example | Description | Advantages | Limitations |
| Expert judgment | Program evaluation and review technique (PERT)8 | One or more experts are asked for their opinions on the factors that affect software size. | No historical data required; useful early in a program | Lack of objectivity; knowledge of "expert" |
| Function points | Function Point Analysis (FPA)23 | Function points are computed by analysing five program characteristics: inputs, outputs, enquiries, master files and program interfaces. | Well validated for data processing information systems; much recent work has been carried out to widen its applicability to other areas | Collection relatively difficult to automate |
| Parametric | Price-SZ | Use input parameters consisting of numerical or descriptive values of selected program attributes. | Consider many factors; can be calibrated | Suitability for new, unique programs |
| Analogy | Software Size Estimator (SSE) | Estimate size by comparing software with similar software of known size. | Based on historical data; useable early in a program | Truly similar program may not exist |
| Regression | Ikaturas Takayanagi; Verner and Tate28 | Developed by performing regression analysis on historical data. | Based on historical data; transcend need for direct analogy | Database limited; require high correlation coefficient and low standard error |
| Composite | Size Planner | Combine various elements of component techniques. | Flexibility; advantages of each component | None beyond limitations of components |


Table 2. Inputs to linear regressions (user interface size)

Dependent variable: User interface size

Independent variables:
1. General data flow diagram size
2. General no. of flows
3. General no. of flows and detailed attributes
4. General model size (DFD+DM)
5. General model size (DFD+DM) + detailed attributes
6. Detailed data flow diagram size
7. Detailed no. of flows
8. Detailed model size (DFD+DM)


Table 3. Effectiveness of various independent variables in predicting user interface size

| Independent variable | % of work done | MMRE | Prediction within 20% | Prediction within 15% | Prediction within 10% | Estimating equation used |
| Gen. DFD | 6.30% | 12.20% | 66.7% | 66.7% | 50.0% | -1590.12763 + (22.63024*general dfd size) |
| Gen. no of flows | 7.71% | 12.46% | 66.7% | 66.7% | 50.0% | -576.929106 + (90.314144*general data flows) |
| Gen. no of flows + no. of attribute names | 9.12% | 8.81% | 91.7% | 75.0% | 66.7% | -1046.662683 + (86.347534*general data flows) + (24.660429*detailed attributes) |
| Gen. model (dm+dfd) | 9.20% | 12.32% | 75.0% | 66.7% | 66.7% | -2283.74098 + (22.62828*general model size) |
| Gen. model + detailed attribute names | 10.61% | 9.49% | 83.3% | 75.0% | 66.7% | -2587.17798 + (21.737138*general model) + (19.991625*detailed attributes) |
| Det. DFD | 16.60% | 10.26% | 83.3% | 75.0% | 75.0% | -1379.53 + (9.934677*detailed dfd size) |
| Det. flows | 18.01% | 11.00% | 83.3% | 75.0% | 58.3% | -597.913 + (91.27514*detailed data flows) |
| Det. model size (dm + dfd) | 19.57% | 14.03% | 75.0% | 75.0% | 50.0% | -1858.67 + (9.609181*detailed model size) |
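As an illustration of how the estimating equations in Table 3 might be applied, the following minimal Python sketch evaluates three of them. The coefficients are copied from the table; the dictionary keys, the function name `estimate_ui_size` and the input value of 200 are hypothetical and chosen only for the example. Sizes are assumed to be counts of lines in the generated CASE report, as described in the paper.

```python
# Intercepts and slopes copied from the "Estimating equation used" column of Table 3.
# The short keys and input names are illustrative labels, not names from the paper.
UI_SIZE_EQUATIONS = {
    "gen_dfd": (-1590.12763, {"general_dfd_size": 22.63024}),
    "gen_flows": (-576.929106, {"general_data_flows": 90.314144}),
    "gen_flows_attrs": (
        -1046.662683,
        {"general_data_flows": 86.347534, "detailed_attributes": 24.660429},
    ),
}


def estimate_ui_size(equation, **inputs):
    """Evaluate one linear estimating equation: intercept + sum(slope * input)."""
    intercept, slopes = UI_SIZE_EQUATIONS[equation]
    missing = sorted(set(slopes) - set(inputs))
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    return intercept + sum(slope * inputs[name] for name, slope in slopes.items())


if __name__ == "__main__":
    # Hypothetical general DFD report of 200 lines.
    print(round(estimate_ui_size("gen_dfd", general_dfd_size=200), 2))  # 2935.92
```

Because every equation has a large negative intercept, small inputs give negative estimates, so a sketch like this is only meaningful within the range of sizes the regressions were fitted on.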


Table 4. Inputs to linear regressions (design size)

Dependent variable: Design size

Independent variables:
1. General data flow diagram size
2. General no. of flows
3. General no. of flows and detailed attributes
4. General model size (DFD+DM)
5. General model size (DFD+DM) + detailed attributes
6. Detailed data flow diagram size
7. Detailed no. of flows
8. Detailed model size (DFD+DM)


Table 5. Effectiveness of various independent variables in predicting design size

| Prediction variable | % of work done | MMRE | Prediction within 20% | Prediction within 15% | Prediction within 10% | Estimating equation used |
| Gen. dfd | 3.08% | 13.39% | 66.7% | 66.7% | 50.0% | -2933.024066 + (43.239699*dfd size) |
| Gen. no of flows | 3.78% | 13.86% | 66.7% | 66.7% | 50.0% | -996.094546 + (172.529789*flows) |
| Gen. model (dm+dfd) | 4.53% | 13.49% | 66.7% | 66.7% | 50.0% | -4258.524268 + (43.237035*general model size) |
| Gen. no of flows and det. attrib | 4.36% | 9.20% | 83.3% | 75.0% | 58.3% | -1971.071186 + (164.296713*flows) + (51.185062*attributes) |
| Gen. model + attribute names | 5.10% | 10.00% | 83.3% | 58.3% | 58.3% | -4900.878983 + (41.350551*general model size) + (42.320854*detailed attributes) |
| Det. DFD | 8.26% | 9.11% | 83.3% | 83.3% | 75.0% | -2559.70203 + (19.114278*size of detailed dfd) |
| Det. flows | 8.97% | 11.94% | 75.0% | 75.0% | 58.3% | -1048.72114 + (174.788353*no of detailed flows) |
| Det. model size (dm + dfd) | 9.71% | 11.45% | 75.0% | 75.0% | 66.7% | -3505.67795 + (18.504466*detailed model size) |
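The MMRE and "prediction within" columns of Tables 3 and 5 can be computed along the following lines. This sketch assumes the conventional definitions (magnitude of relative error = |actual - estimate| / actual, MMRE as its mean, and prediction within x% as the fraction of estimates whose error does not exceed x%); the definitions used in the study are not restated in this part of the paper, and the function names and sample figures below are hypothetical.

```python
def magnitude_of_relative_error(actual, estimate):
    """MRE for one observation: absolute error as a fraction of the actual size."""
    return abs(actual - estimate) / actual


def mmre(actuals, estimates):
    """Mean magnitude of relative error over paired actual and estimated sizes."""
    errors = [magnitude_of_relative_error(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def prediction_within(actuals, estimates, level):
    """Fraction of estimates whose MRE is at most `level` (e.g. 0.20 for 20%)."""
    errors = [magnitude_of_relative_error(a, e) for a, e in zip(actuals, estimates)]
    return sum(1 for err in errors if err <= level) / len(errors)


if __name__ == "__main__":
    # Hypothetical sizes, not data from the study.
    actual_sizes = [100, 200, 400]
    estimated_sizes = [110, 150, 400]
    print(f"MMRE: {mmre(actual_sizes, estimated_sizes):.1%}")
    print(f"Within 20%: {prediction_within(actual_sizes, estimated_sizes, 0.20):.1%}")
```

A lower MMRE and a higher prediction-within percentage both indicate a better-fitting equation, which is how the rows of the two tables can be compared.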


Figure 1. Interactions between motivations in software process modelling
[Figure 1 labels: Understanding, Management, Estimation, Measurement]

Figure 2. CASE application product components whose sizes may be of interest (Tate and Verner1)
[Figure 2 labels, components lettered a to l: general data model, detailed data model, outline system model, general dataflow model, detailed dataflow model, database design, system model, detailed functional specification, design, generated code, actions (text in DFD descriptions of processes), user interface]

Figure 3. Schematic diagram of functionality of proposed systems
[Figure 3 labels: Help, Enhanced Output, Enhanced Basic Input]
