Nordic Journal of Computing, 14(2008), 282-300.
Estimation of Real-Time Software Component Size
Kenneth Lind¹ and Rogardt Heldal
Chalmers University of Technology, Gothenburg, Sweden
Abstract. For distributed networks which will be mass-produced, such as computer systems in modern vehicles, it is crucial to develop cost-efficient hardware. A distributed network in a vehicle can consist of 100 ECUs (Electronic Control Unit). In this paper we consider the amount of memory needed for these ECUs. They should contain enough memory to survive several software generations, without inducing unnecessary cost of too much memory. We show that UML Component Diagrams can be used to collect enough information for estimating memory size using an FSM method. We develop two linear models capable of estimating memory size for two common types of software components before the software is available. We support our findings by two experiments containing several software components from the automotive industry.
Keywords. Functional Size Measurement, COSMIC Function Points, UML components, system architecture, software code size.
1. Introduction
One of the key attributes of a system architecture is extensibility [1]. The automotive industry integrates new features as software in a dedicated ECU (Electronic Control Unit) connected to a communications network. The extensibility of such a network is limited by the available communication bandwidth and the available node count. The downside of this integration method is that today’s vehicles contain up to 100 ECUs. The system architectures of the future are expected to have fewer ECUs. This means that more functionality has to be integrated on each ECU and the ECUs have to be more powerful. It is vital that these future ECUs have enough spare memory and spare processing capacity to last for the expected life of the system architecture, which might be as long as 10 years. However, spare memory and processing capacity mean additional cost, and the cost pressure in the automotive industry is increasing over time. Cost reduction activities are part of everyday business, and spare resources
¹ The author is also affiliated with Saab Automobile AB, a part of General Motors.
are always questioned. Therefore, it is important to be able to estimate the amount of spare memory and spare processing capacity needed in a system. The significance of the added cost can be illustrated by an example: let us assume that an average vehicle has a system architecture containing 50 ECUs, and that each ECU contains 2 cents worth of spare memory that will never be used. GM (General Motors) produces 9 million vehicles each year. This adds up to 9 million USD (0.02*50*9 million) extra annual cost, yielding no benefit either to the customer or to the vehicle producer. This illustrates the problem at hand. This paper is an extension of our work reported in [2], and also contains results from our replicated experiment reported in [3]. We have investigated FSM (Functional Size Measurement) [4], [5], [6], [7], [8], [9] as a method of estimating software code size based on requirement specifications. In our study containing two experiments, we found that UML Component Diagrams can capture the information needed for applying FSM. The work has been performed using available UML Component Diagrams within GM (General Motors), the producer of vehicles like Saab, Opel, Cadillac, Chevrolet, etc. With help of FSM, UML components, and some textual description in the requirement specification related to the components, we have been able to estimate software code size of software components within GM. For the UML components we considered we also had access to the resulting software components. We found significant correlation between the measured value and actual software code size. Based on this correlation we have developed two linear models capable of estimating software code size before the software is available. Our finding is one of the factors in deciding the memory size of the different ECUs which exist in a vehicle. Our goal is that our research shall contribute to a better balance between software size and available memory. 
This is important for developing more cost-efficient system architectures. Another contribution from this work is that we investigate bytes of compiled code as the software code size metric, instead of the commonly used SLOC (Source Lines of Code) metric [10]. SLOC is often used for cost estimation, but we are interested in software code size rather than cost. Therefore we chose bytes as the metric instead of SLOC. The paper is organized as follows: The next section provides sufficient background on FSM and UML Components to understand the paper. Then we define the research problem, the statistical hypothesis, and the strategy for measurement and analysis. Thereafter we describe how the study has been carried out. The linear model is designed and evaluated. Thereafter, validity threats are evaluated, and the interpretation of results is discussed, followed by related work. Finally, conclusions and suggestions for further work are presented.
2. Background
First we will look at Functional Size Measurement and describe the method we will use in more detail. Thereafter we will look at components used at GM and how they are related to UML. Components play a key role in our work since we use them to perform our Functional Size Measurements.
2.1 Functional Size Measurement
Functional size is defined as “size of the software derived by quantifying the Functional User Requirements” [4]. Functional User Requirements (FUR) describe what the software is expected to do for its users. Examples are data transfer, data transformation, data storage, and data retrieval. Functional size is independent of software language and development methods. There are several FSM methods available. The original method was described by Albrecht in 1979 [11], [12], [13]. A comprehensive literature survey covering several methods is found in [14]. Examples include IFPUG FPA (Function Point Analysis) [15], [16], COSMIC Function Points (CFP) [17], [18], MkII FPA [19], and NESMA [20]. FSM is typically used for development cost estimation and project planning. In this experiment, CFP is chosen because it is known to be suitable for real-time software, such as automotive real-time systems [17], and because it is a “second generation” method that complies with the ISO/IEC 14143-1:2007 standard for FSM methods [4]. The COSMIC Method defines a standardized measure of software functional size; see Figure 1, which shows the structure of the method. The result of the method is a functional size measure expressed in CFP units.
Figure 1. The steps of the COSMIC Method. [17]
The Measurement Strategy phase (see Figure 1) of the method defines the purpose and scope of the measurement. The purpose of the measurement in this work is to estimate the software code size based on the measured functional size. The scope of the measurement is defined based on the UML Component Diagram and the requirement specification. This is described as part of the Experiment operation section of the paper. The Mapping Phase (see Figure 1) of the method identifies the functional processes. A functional process is a component comprising a unique set of data movements (entry, exit, read, write; see Figure 2). This phase is based on the interfaces in the UML Component Diagram and the requirement specification, where the data movements can be identified. The Measurement Phase (see Figure 1) is basically a calculation of the data movements involved in the measured component. The types of data movements are defined in Figure 2.
Figure 2. The Generic Software Model with data movement types. [17]
As can be seen in Figure 2, there are four different data movement types:
• Entry sub-processes move data across the boundary and into the functional process.
• Exit sub-processes move data across the boundary to a functional user.
• Read sub-processes move data from persistent storage to the functional process.
• Write sub-processes move data from the functional process to persistent storage.
Each data movement is equivalent to 1 CFP, and operates on a data group. A data group is a set of attributes, where each attribute describes a unique aspect of the same object.
Persistent storage (see Figure 2) enables a functional process to store a data group beyond the life of the functional process, or to retrieve a data group stored by another functional process. By convention, the data movements of a functional process are also assumed to represent the data manipulation of the functional process. This is one of the limitations of the COSMIC method, and the consequence is that the method does not capture complex calculations or the treatment of large amounts of data.
2.2 UML Components
GM uses UML Component Diagrams to show how the customer feature is divided into its smallest entities, called “distributable components”, and the interfaces between them. A distributable component is normally implemented in software, and executes on one ECU. A distributable component shall never be split up into more pieces, but can be used in several features. The UML Component Diagram is developed from the requirement specification and modeled in the Rhapsody tool [21], as part of the system architecture development activities within GM. This is described further in [22]. The experiments in this paper will use existing Component Diagrams of the type shown in Figure 3.
Figure 3. UML Component Diagram of the Indicate Vehicle Speed feature.
In Figure 3, we see that the distributable components are modeled as component stereotypes (denoted “Distributable” followed by the name of the distributable component). Signals (serial data signals, hardware I/O signals, etc) are modeled as interfaces and classes. Data transfers between distributable components or between a distributable component and an interface are modeled as flows. Flows between two
distributable components, or between a distributable component and an interface that is not a hardware interface, convey the corresponding signal. This notation deviates somewhat from standard UML 2.x [23]. For instance, flows are not usually part of Component Diagrams. The reason for GM to use flows is primarily to increase understandability among engineers not familiar with UML, which is important since the Component Diagrams are sent for review to stakeholders within the company to obtain feedback and approval. The HMI_ControlSpeedometer distributable component seen in Figure 3 receives two external inputs (VehSpeedAvgDriven, DispMeasSysExt) and produces one output (VehicleSpeedDisplayValue). The component diagram in Figure 3 can be translated to standard UML 2.x; see Figure 4 which concentrates on the HMI_ControlSpeedometer distributable component.
Figure 4. UML explicit representation of the HMI_ControlSpeedometer distributable component.
As can be seen in Figure 4, in this diagram required interfaces show what signals the component requires from other components, and provided interfaces show what signals the component provides to other components. As we can see from the diagram, required interfaces are related to the component by a dependency, and provided interfaces are related to the component by a realization relationship. Signals are shown with the keyword in the interface. A general description of software components is found in [24].
3. Experiment definition
Each experiment carried out in this work is defined as:
Research problem:
• “How accurately can software code size in bytes of compiled code be estimated by using a suitable FSM method?”
• “To what extent can UML Component Diagrams capture the information needed for applying FSM?”
Setup:
• Identify a set of software components of the same type, developed by one software development team for one ECU, where each component corresponds to an isolated set of requirements.
• Apply the chosen FSM method to calculate a Function Point value for each chosen component.
• Obtain the actual software code size, in number of bytes of compiled code, for each software component.
• Use a subset of data points (Function Point value, software code size) to build a linear model.
• Use the rest of the data points to evaluate the model.
We anticipate that a number of factors might affect software code size. Besides functional size of the requirements, other factors that might affect the software code size are team experience, development tools and methods, software language and infrastructure, compiler, etc. In our experiment we focus on the relation between functional size and software code size, so we want to control the influence by all other factors. By investigating software components developed by one software development team, the variation due to experience is minimized and the variation due to tools and methods is removed. By investigating software components developed for one ECU, the variation due to software language, infrastructure, and compiler is removed. So, with this setup we think that other factors are controlled to the extent needed to capture only the relationship we want to study. The experiment setup is summarized in Figure 5. This paper contains two experiments investigating two types of software components. The first experiment investigates software components of Display & Indication type. They typically perform small calculations and display information of vehicle data, like vehicle speed, engine speed, etc. This type of functionality is representative of at least 20-25% of the features in a typical vehicle today. The second experiment investigates software components of Comfort & Convenience type. They are characterized by a combination of event-based user inputs causing changes of one or several digital or analog output(s), and represent at least 25-35% of the features in a typical vehicle today. Other types of functionality contained in a vehicle are Control & Calculation (characterized by continuous control and/or complex calculations), and Infotainment & Telematic (characterized by manipulation of large amounts of data).
[Figure 5 diagram labels: Requirement specification (Component A, B, C, D); UML Component Diagram; Functional Size Measurement (COSMIC); Function points component A, B, C, D; Other factors (keep constant); SW code size estimator; SW code size (Component A, B, C, D); Code size component A, B, C, D]
Figure 5. The setup of the experiments.
As can be seen in Figure 5, requirement specifications and UML Component Diagrams are the two sources of input to the FSM method. We already mentioned in section 2 that COSMIC Function Points (CFP) is the FSM method chosen in this work. Experiment data are collected by calculating CFP for each component, and by obtaining the actual software code size for the same components. Based on the experiment data, a linear model is developed by using linear regression. The linear model is called SW code size estimator in Figure 5. The purpose of the SW code size estimator is to estimate the software code size before the software is available. For validation purposes, the estimated results will be compared to real values for software components developed by GM. The fact that we have empirical data to base our results on is an advantage compared to similar work reported in academia. (See for instance [28].)
4. Experiment planning
The null hypothesis can be derived from our research problem in the following way: Our approach is to estimate software code size from CFP based on a linear model. A linear model is defined as y = k*x + b. The parameters k and b are calculated during the linear regression operation. Our statistical hypothesis needs to reflect whether the linear model is statistically significant or not. This is expressed as the following null hypotheses and alternative hypotheses:
For the k parameter:
H0k: k = 0
H1k: k ≠ 0
For the b parameter:
H0b: b = 0
H1b: b ≠ 0
H0k and H0b are rejected if the p-value of the corresponding test statistic is less than 0.05.
A linear regression operation is evaluated on four grounds: whether the statistical assumptions for linear regression hold, how much variation in the data is explained by the model (captured by R², the coefficient of determination), the probability with which the null hypothesis can be rejected, and how well the model estimates the data points not used for building it. The criterion for deciding if the model is good enough is that the residual between actual software code size and estimated software code size shall be within 10%.
5. Experiment operation
As can be seen in Figure 5, requirement specifications and UML Component Diagrams are the two sources of input to the FSM method. The requirement specifications typically contain textual requirements on the performance, behavior, activation criteria, etc. for a customer feature. They also contain the information that is missing from the Component Diagrams, like calibration parameters, configuration parameters, etc. The UML Component Diagram is developed from the requirement specification as part of the system architecture development activities within GM. So the Component Diagrams are already available for this experiment. In the first experiment, 12 software components developed for the Instrument Panel Cluster (IPC) were chosen at random out of the around 60 software components identified as candidates fulfilling the prerequisites stated in the experiment definition. The COSMIC Method was applied to each of the software components in order to measure the size in CFP units. An example is shown below, where CFP is calculated for the HMI_ControlSpeedometer component in Figure 3.
1 Functional process:
• 4 Entry (Clock tic, System PM, Engine VDA, VehSpeedAvgDriven/NonDriven)
• 1 Exit (VehicleSpeedDisplayValue)
• 3 Read (DispMeasSysExt, P_GAGE_HOME_DUR, P_SPEEDOMETERGAGE_ENABLED)
CFP = 8
We can see above that one of the Entry sub-processes (VehSpeedAvgDriven), the Exit sub-process (VehicleSpeedDisplayValue), and one of the Read sub-processes (DispMeasSysExt) are obtained from the UML Component Diagram in Figure 3. The other sub-processes are obtained from the textual description related to the component in the requirement specification, but could be incorporated in the Component Diagram.
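The count in the example above can be reproduced with a short sketch. The data movement list is transcribed from the example; the function name `cosmic_size` and the tuple layout are illustrative assumptions, not part of the COSMIC Method or of any tool used in the study.

```python
from collections import Counter

# Data movements of the HMI_ControlSpeedometer functional process,
# transcribed from the example as (movement type, data group).
MOVEMENTS = [
    ("Entry", "Clock tic"),
    ("Entry", "System PM"),
    ("Entry", "Engine VDA"),
    ("Entry", "VehSpeedAvgDriven/NonDriven"),
    ("Exit", "VehicleSpeedDisplayValue"),
    ("Read", "DispMeasSysExt"),
    ("Read", "P_GAGE_HOME_DUR"),
    ("Read", "P_SPEEDOMETERGAGE_ENABLED"),
]

VALID_TYPES = {"Entry", "Exit", "Read", "Write"}


def cosmic_size(movements):
    """Return (total CFP, per-type counts); each data movement counts as 1 CFP."""
    for kind, _ in movements:
        if kind not in VALID_TYPES:
            raise ValueError(f"unknown data movement type: {kind}")
    per_type = Counter(kind for kind, _ in movements)
    return sum(per_type.values()), dict(per_type)


total_cfp, breakdown = cosmic_size(MOVEMENTS)
print(total_cfp, breakdown)
```

Keeping the data group next to each movement makes it easy to see which entries come from the Component Diagram and which come from the textual requirements.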
After applying the COSMIC Method to each of the software components, the actual software code size in number of bytes of compiled code was obtained for the same software components. The resulting data is shown in Table 1, where the left column titled “Component” contains the name of the feature using the measured component, the second column shows actual software code size, the third column shows CFP, and the right column indicates the data points used for evaluation only.
Component            Actual software code size (bytes)   CFP   Comment
Speed Warning        388                                 4     Build/Evaluation
Engine Speed         556                                 6     Build/Evaluation
Compass indication   636                                 7     Build/Evaluation
Vehicle Speed        740                                 8     Build/Evaluation
Turn Signal ind.     922                                 11    Evaluation only
Odometer             1122                                14    Build/Evaluation
Gear indication      1126                                14    Build/Evaluation
Driver Workload      1164                                14    Build/Evaluation
Distance to Dest.    1242                                15    Evaluation only
Outside Air Temp     1910                                25    Evaluation only
Gauges Night Panel   2182                                30    Evaluation only
Display Dimming      4244                                60    Evaluation only
Table 1. Experiment data for the 12 software components.
The data points are mainly in the interval 4 CFP to 15 CFP, which is the anticipated size for the majority of the distributable components. Three data points are well above this interval, 25 CFP, 30 CFP and 60 CFP. The intent is to use them for evaluation only, with the purpose of testing how well the linear model extrapolates outside the “normal” interval. In addition, two more data points are saved for evaluation purposes. In the second experiment, 15 software components developed for the Body Control Module (BCM) were chosen at random out of around 100 software components. The COSMIC Method was applied to each of the software components to measure the size in CFP units. Here the sizes of the software components are in the interval 4 CFP to 26 CFP.
6. Data analysis
As a first step we look at the scatter plot of the data in the first experiment; see Figure 6. There seems to be a linear relationship between CFP and software code size. This is confirmed by calculating the correlation coefficients between CFP and software code size. The experiment data is far from normally distributed (tested by a Kolmogorov-Smirnov and a Shapiro-Wilk test), so the non-parametric Spearman’s rank coefficient is calculated. The Spearman’s rank coefficient is equal to 0.993, and the correlation is significant at the 0.01 level.
[Scatter plot of all data; x-axis: CFP (0 to 70); y-axis: SW code size in bytes (0 to 4500)]
Figure 6. Scatter plot of IPC experiment data, software code size (in bytes) vs CFP.
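As an illustration of the rank correlation step, the following sketch computes Spearman’s coefficient for the 12 (CFP, bytes) pairs of Table 1 using only the Python standard library. Tied CFP values receive the mean of their ranks, which is a common convention; whether the statistics package used in the study handled ties the same way is an assumption. The function names are illustrative.

```python
import math

# (CFP, actual bytes) for the 12 IPC components in Table 1
CFP = [4, 6, 7, 8, 11, 14, 14, 14, 15, 25, 30, 60]
SIZE = [388, 556, 636, 740, 922, 1122, 1126, 1164, 1242, 1910, 2182, 4244]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's rank coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


rho = spearman(CFP, SIZE)
print(round(rho, 3))
```

The only ties in this data set are the three components with 14 CFP, which is why the coefficient falls slightly below 1 even though code size never decreases with CFP.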
The next step is to build a linear model based on linear regression, fitting a line that minimizes the sum of the squared distances to each data point. The general approach applied in this report is to choose a subset of the experiment data, build a linear model based on linear regression, evaluate the resulting model, and compare it to other models. This is repeated for a number of possible subsets of the experiment data. The linear model showing overall best results has the following values:
• R² = 0.997 (indicating that the model captures much of the variation in the data)
• k = 73.691
• 95% confidence interval for k is [68.556 ... 78.825] (since 0 is far below the interval, we can safely reject H0k: k = 0)
• b = 107.375
• 95% confidence interval for b is [52.352 ... 162.399] (since 0 is far below the interval, we can safely reject H0b: b = 0).
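The regression step can be sketched with ordinary least squares in plain Python. The text does not state which data points form the build subset of the chosen model; the sketch assumes the six points listed in `BUILD`, a choice made here because a fit on them gives values close to the reported ones. The t quantile is a hard-coded table value, and all names are illustrative.

```python
import math

# ASSUMPTION: build subset of the chosen model as (CFP, bytes); not stated in the text.
BUILD = [(4, 388), (6, 556), (7, 636), (14, 1122), (14, 1126), (14, 1164)]

# Two-sided 95% Student t quantile for n - 2 = 4 degrees of freedom (table value).
T_CRIT_DF4 = 2.7764


def fit_line(points, t_crit):
    """Least-squares fit of y = k*x + b with R^2 and confidence intervals."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    k = sxy / sxx
    b = mean_y - k * mean_x
    sse = syy - k * sxy          # residual sum of squares
    s2 = sse / (n - 2)           # residual variance estimate
    se_k = math.sqrt(s2 / sxx)
    se_b = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    return {
        "k": k,
        "b": b,
        "r2": 1 - sse / syy,
        "ci_k": (k - t_crit * se_k, k + t_crit * se_k),
        "ci_b": (b - t_crit * se_b, b + t_crit * se_b),
    }


model = fit_line(BUILD, T_CRIT_DF4)
for name, value in model.items():
    print(name, value)

# H0k (k = 0) and H0b (b = 0) are rejected at the 5% level when 0 lies
# outside the corresponding 95% confidence interval.
reject_h0k = not (model["ci_k"][0] <= 0 <= model["ci_k"][1])
reject_h0b = not (model["ci_b"][0] <= 0 <= model["ci_b"][1])
```

Checking whether 0 lies inside a 95% confidence interval is equivalent to a two-sided t-test at the 0.05 level, which is the rejection rule stated for H0k and H0b.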
Since both H0k and H0b can be rejected, we conclude that the parameters k and b of the linear model are statistically significant. The results in Table 2 were given by using the model to estimate the data points. The left column titled “Component” contains the name of the feature using the measured component, the second column shows actual software code size, the third column shows estimated software code size, the fourth column contains the residual (calculated as the difference between actual and estimated software code size divided by actual software code size), and the right column shows CFP. The data points used for building the model are highlighted.
Component            Actual software code size (bytes)   Est. software code size (bytes)   % Residual   CFP
Speed Warning        388                                 402                               -3.6         4
Engine Speed         556                                 550                               1.2          6
Compass ind.         636                                 623                               2.0          7
Vehicle Speed        740                                 697                               5.8          8
Turn Signal ind.     922                                 918                               0.4          11
Odometer             1122                                1139                              -1.5         14
Gear indication      1126                                1139                              -1.2         14
Driver Workload      1164                                1139                              2.1          14
Distance to Dest.    1242                                1213                              2.4          15
Outside Air Temp     1910                                1950                              -2.1         25
Gauges Night Panel   2182                                2318                              -6.2         30
Display Dimming      4244                                4529                              -6.7         60
Table 2. Evaluation of chosen model.
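The evaluation in Table 2 can be sketched as follows, using the reported coefficients and the residual definition given in the text. The 10% threshold is the acceptance criterion from the experiment planning; function and variable names are illustrative assumptions.

```python
K, B = 73.691, 107.375  # coefficients of the chosen model (first experiment)

# (component, actual bytes, CFP), rows of Table 2
DATA = [
    ("Speed Warning", 388, 4), ("Engine Speed", 556, 6),
    ("Compass ind.", 636, 7), ("Vehicle Speed", 740, 8),
    ("Turn Signal ind.", 922, 11), ("Odometer", 1122, 14),
    ("Gear indication", 1126, 14), ("Driver Workload", 1164, 14),
    ("Distance to Dest.", 1242, 15), ("Outside Air Temp", 1910, 25),
    ("Gauges Night Panel", 2182, 30), ("Display Dimming", 4244, 60),
]


def estimate(cfp):
    """Estimated software code size in bytes for a given functional size."""
    return K * cfp + B


def percent_residual(actual, estimated):
    """Residual as defined in the text: (actual - estimated) / actual, in %."""
    return 100.0 * (actual - estimated) / actual


rows = [
    (name, actual, round(estimate(cfp)),
     round(percent_residual(actual, estimate(cfp)), 1), cfp)
    for name, actual, cfp in DATA
]
worst = max(abs(residual) for _, _, _, residual, _ in rows)
within_criterion = worst <= 10.0  # acceptance criterion: residual within 10%

for row in rows:
    print(row)
print("largest absolute residual (%):", worst)
```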
In the second experiment, we developed a linear model in the same way as in the first experiment. The results are included in Table 3. The information in the table is obtained in the same way as in Table 2.
Component (in row order): Remote PRNDL Ill. Trunk Lamp Truckbed Cargo Lamp Panic Alarm Front Zone Int. Lights Rear Close Cargo Lamp Power Sounder Ignition Switch Lamp Tonneau Release Dedicated DRL IL Inadv. Load Protect. Horn Interior Lights Interior Dimming Manual Liftgate
Actual software code size (bytes)   Est. software code size (bytes)   % Residual   CFP
932                                 998                               -7.0         4
1492                                1441                              3.4          7
1504                                1441                              4.2          7
1612                                1589                              1.4          8
1614                                1589                              1.5          8
1908                                1885                              1.2          10
1968                                2033                              -3.3         11
2202                                2328                              -5.7         13
2460                                2476                              -0.7         14
2530                                2476                              2.1          14
2592                                2772                              -6.9         16
3218                                3215                              0.1          19
3834                                3807                              0.7          23
4134                                4103                              0.8          25
4530                                4250                              6.2          26
Table 3. Evaluation of chosen model in [3].
The chosen model in the first experiment is defined by the following equation:
$$\mathit{SWcodesize}_{est} = 73.691 \cdot \mathit{CFP} + 107.375$$
The chosen model in the second experiment is defined by the following equation:
$$\mathit{SWcodesize}_{est} = 147.857 \cdot \mathit{CFP} + 406.139$$
The experiment data for both experiments are plotted in Figure 7.
Figure 7. Scatter plot of experiment data for both experiments, software code size vs CFP.
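To make the difference between the two models concrete, the sketch below applies both equations to one BCM data point taken from Table 3 (14 CFP, 2460 bytes). The dictionary keys and function names are illustrative assumptions.

```python
# (k, b) of the two chosen models
MODELS = {
    "IPC": (73.691, 107.375),   # first experiment, Display & Indication
    "BCM": (147.857, 406.139),  # second experiment, Comfort & Convenience
}


def estimate_bytes(component_type, cfp):
    """Estimated code size in bytes using the model for the given type."""
    k, b = MODELS[component_type]
    return k * cfp + b


def percent_residual(actual, estimated):
    """(actual - estimated) / actual, expressed in percent."""
    return 100.0 * (actual - estimated) / actual


# One BCM data point from Table 3: 14 CFP, 2460 bytes of compiled code.
actual_bytes, cfp = 2460, 14
own_model = percent_residual(actual_bytes, estimate_bytes("BCM", cfp))
other_model = percent_residual(actual_bytes, estimate_bytes("IPC", cfp))

print(f"BCM model residual: {own_model:.1f} %")
print(f"IPC model residual: {other_model:.1f} %")
```

For this data point the BCM model stays within 1% while the IPC model underestimates the size by more than half, which suggests that a model should only be applied to the component type it was built for.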
7. Evaluation of validity threats
As part of planning, performing and evaluating an experiment, it is important to consider possible threats to the validity of the results. [25] lists four types of validity threats: to conclusion validity, internal validity, construct validity, and external validity. According to [25] the priorities between validity threats for experiments in applied research are in decreasing order: internal, external, construct, and conclusion. Internal validity in our experiments is increased by a random selection of components to measure within the group of available components, and by minimizing the effect of experience and other factors in the experiments. (See Figure 5.) On the other hand, internal validity is decreased by the limited number of data points. However, we believe that a random choice of 12 software components out of around 60 in total in
the first experiment, and 15 software components out of around 100 in total in the second experiment, is enough. External validity in our experiments is maximized by selecting a group of components that are representative of the type of functionality we want to measure. The components investigated in the first experiment are taken from the Instrument Panel Cluster ECU, and mainly perform small calculations and display information of vehicle data, like vehicle speed, engine speed, etc. This type of functionality is representative of at least 20-25% of the features in a typical vehicle today. The components investigated in the second experiment are taken from the Body Control Module ECU, and are characterized by a combination of event-based user inputs causing changes of one or several digital or analog output(s). This type of functionality is representative of at least 25-35% of the features in a typical vehicle today. Construct validity in our experiments is affected by the fact that the COSMIC method is applied manually, with some room for interpretations. As a consequence, it cannot be guaranteed that another person would calculate exactly the same CFP measure. This could be improved by automating the calculation of CFP, and will be investigated in future work. Conclusion validity in our experiments is affected by the power of the statistical test and if the assumptions for the test are violated. We have established significant correlation in the data with the assumptions checked as valid, and the resulting coefficients from linear regression are statistically significant. In summary, these experiments represent one of several steps towards a feasible method to estimate software code size. Future experiments will strengthen the overall validity by increasing the number of data points, investigating other functionality types, and automating the calculation of CFP.
8. Interpretation of results
As can be seen in Table 2, the linear model in the first experiment is able to estimate the software code size of the data points not included in the build of the model within 5.8% accuracy. This is deemed a good result. Based on this result we conclude that the model is valid within the normal interval where we have statistical data, i.e. 4 CFP to 15 CFP. The data for Outside Air Temperature, Display Dimming and Gauges Night Panel were collected for evaluating the model’s ability to extrapolate outside the normal interval. As is shown in Table 2, the model is able to estimate the software code sizes of these software components within 6.7% accuracy. This is an indication that the linear model might be able to perform in a larger interval than expected. However,
more statistical data is needed before we can make any conclusion regarding a larger interval. As can be seen in Table 3, the linear model in the second experiment is able to estimate the software code size within 7% accuracy. So the two models perform with similar accuracy within their respective context. As can be seen in Figure 7, there is high correlation within the data for one experiment. This is true both for IPC components and for BCM components. But the correlation between the two component types is much lower. It is also clearly seen that the linear model developed for one component type will not perform well for the other component type. The reasons behind this difference will be investigated in future studies. For now we only indicate some possible reasons for this difference: methods and tools, team experience, software component type, software language and compiler, software infrastructure in the target ECU, requirement complexity not captured by COSMIC FSM, etc. We have investigated how to estimate software code size based on UML Component Diagrams available before the actual software is available. The purpose of estimating the software code size is to estimate how much memory size the complete application software of an ECU will require from start of production, and to add sufficient (but not too much) additional spare memory for future functional growth. One additional factor to consider is the potential growth of the software component from the first issued requirement specification (for hardware sourcing purposes) until final issue of the requirement specification (for production purposes). This potential growth during development might be caused by correcting errors found during validation, new requirements due to increased expectations from the stakeholders, etc. In the general case, only the first issue of the requirement specification is available when we measure the functional size, and estimate the corresponding software code size. 
But we want to estimate the software code size at start of production, and this is not necessarily the same size. This factor is outside the scope of this paper, and will be investigated in future work. The general idea of our work is to estimate the code size of all software components contained in the ECU (including expected growth during development), add these up and add spare memory for expected future functional growth. The result is the needed memory size. These experiments concentrate on the part of the application software stored in ROM type memory. The size of RAM memory needed (for storage of variables and parameter values) is obtained fairly straightforwardly from the requirement specification, just by counting the variables and their size. This information is available before the UML Component Diagrams are available. Therefore we can obtain the amount of both RAM and ROM memory needed, by using information in the requirement specification and UML Component Diagrams.
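The sizing idea described above (sum the component estimates, allow for growth during development, then add spare memory) can be sketched as below. The growth factor and spare fraction are purely hypothetical placeholders: the paper leaves growth during development to future work and gives no values for either. The example component sizes are likewise made up for illustration.

```python
def required_rom_bytes(cfp_per_component, k, b, growth_factor, spare_fraction):
    """Sketch of an ECU ROM budget built from per-component estimates.

    growth_factor  -- hypothetical relative growth during development
    spare_fraction -- hypothetical spare memory for future functional growth
    """
    estimated = sum(k * cfp + b for cfp in cfp_per_component)
    at_start_of_production = estimated * (1 + growth_factor)
    return at_start_of_production * (1 + spare_fraction)


# Three made-up IPC-type components of 4, 8 and 14 CFP, sized with the
# first-experiment model; 10% growth and 20% spare are illustrative only.
budget = required_rom_bytes([4, 8, 14], k=73.691, b=107.375,
                            growth_factor=0.10, spare_fraction=0.20)
print(round(budget), "bytes")
```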
One of the challenges when applying the COSMIC Method is to find the right level of granularity. The level of granularity is increased as the functional requirements are divided into parts, which in turn might be divided into smaller parts. Each of these “parts levels” corresponds to a level of granularity. One example is shown in Figure 8. When measuring the size in CFP for Process A or Process B, the entry/exit pair X/Y contributes 1 CFP each. But, when measuring the size in CFP for Process 1, X/Y does not show (i.e. they do not contribute to the size in CFP). This means that CFP(Process 1) ≠ CFP(Process A) + CFP(Process B)! Hence, it is important to measure the size at “the Process A and Process B” level of granularity, in order to be able to compare different CFP measures. In our approach, the UML Component Diagram helps in defining the right level of granularity.
Figure 8. Process 1 is on a lower level of granularity than Process A and Process B.
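The granularity effect can be reproduced with a small counting sketch. It assumes, hypothetically, that X/Y denotes the two ends of a single transfer from Process A to Process B, and that each process also has one data movement to or from the environment (named `In` and `Out` here). Only Entries and Exits are counted; Reads and Writes are left out to keep the example short. The names `Movement` and `cfp` are illustrative and not part of the COSMIC method.

```python
from typing import NamedTuple


class Movement(NamedTuple):
    name: str
    source: str  # sender of the data group
    target: str  # receiver of the data group


def cfp(processes, movements):
    """Count 1 CFP for every Entry or Exit crossing the boundary of `processes`."""
    size = 0
    for m in movements:
        inside_source = m.source in processes
        inside_target = m.target in processes
        # Exactly one end inside the boundary: an Exit or an Entry.
        # Both ends inside: internal to the measured scope, not counted.
        if inside_source != inside_target:
            size += 1
    return size


movements = [
    Movement("In", "environment", "A"),   # hypothetical Entry into Process A
    Movement("X/Y", "A", "B"),            # Exit of A, Entry of B
    Movement("Out", "B", "environment"),  # hypothetical Exit from Process B
]

size_a = cfp({"A"}, movements)
size_b = cfp({"B"}, movements)
size_1 = cfp({"A", "B"}, movements)  # Process 1 contains both A and B

print(size_a, size_b, size_1)  # 2 2 2
```

Under these assumptions CFP(Process A) + CFP(Process B) = 4, while CFP(Process 1) = 2, because the transfer between A and B lies inside the boundary of Process 1.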
9. Related work The majority of published work on FSM methods deals with the problem of estimating development effort (man-hours) or development cost for a software program, based on requirement specifications; see [26] and [27]. The former paper [26] describes how COSMIC FP can be used in the telecommunications domain. It is shown that COSMIC FP yields a higher correlation with the real project sizes than IFPUG FPA. The latter paper [27] describes how IFPUG FPA can be used to estimate the total development cost, and to optimize the partitioning of hardware and software based on cost and reusability. The starting point for the proposed method is a system-level description in UML class diagrams and sequence diagrams.
Axelsson [28] performed work similar to ours based on IFPUG FPA [15], but with few details on how to obtain the results and without empirical data. An example illustrates how IFPUG FPA can be used to evaluate different system architecture alternatives, based on development cost, product cost, and risk.
[29] studies conceptual similarities and differences between the FSM methods Mk II FPA, COSMIC FP, and IFPUG FPA. A case study is conducted on a document management system. The case study shows that conversion between different FSM methods requires more work, in the form of both theoretical and empirical studies. [30] applies Mk II FPA and COSMIC FP to a real-time software system. One result is that COSMIC FP can be applied earlier in the development cycle than Mk II FPA.
In [31] a method is proposed to measure software quality attributes like maintainability, usability, and reliability, based on COSMIC FP. The software quality attributes are measured based on basic architectural metrics like coupling, cohesion, and complexity. The method is applied to the Key Word in Context case study in [32].
[33] presents a Function Point-like approach to measure the size of components using specifications written in UML. An interface complexity measure is combined with an interaction measure into a system-level size measure called Component Points. No empirical validation is reported.
[34] and [35] report ongoing activities with the purpose of automating the FP measurement. In [34], it is described how UML sequence diagrams can capture the information needed as input to the COSMIC FP method. This makes it possible to automate the calculation of FP, by extracting the number of messages exchanged in the sequence diagrams. In [35], it is shown that UML class diagrams and sequence diagrams can be used as input for a software tool that automatically calculates the IFPUG FP.
[36] proposes a research area to handle the software complexity of data manipulations of a functional process. Resulting methods can be integrated into the COSMIC FP method to provide a more complete software size measure. This would complement the current COSMIC method, as it only addresses the complexity of data movements.
10. Conclusions and Further Work This paper describes how UML Component Diagrams and requirement specifications can be used to collect enough information for estimating software code size using an FSM method. We have used COSMIC Function Points (CFP) as the FSM method, to calculate Function Points on several components for which software is already developed. Significant correlation was found between the CFP values and the actual software code size. Based on this correlation we have designed two linear models capable of estimating the software code size of two common types of components to within 7%. The linear models are valid for component sizes between 4 CFPs and 15 CFPs, and between 4 CFPs and 26 CFPs, respectively. These sizes are expected to be typical for these types of components. However, estimated values for components above these intervals indicated that the models might be able to perform in a larger interval. This is something we plan to investigate.
These experiments represent a significant step towards a feasible method to estimate software code size as support for deciding the memory size of the ECUs in a system architecture. Our two experiments investigated components of two different types developed by two different software teams. Both experiments show that COSMIC FSM can be used to estimate software code size given the type of components and the software team. But the model developed by investigating one of the teams with a given type of components cannot be used by the other team. Whether this is due to differences between component types or other factors will be investigated in future work.
In the current work not all the information needed for CFP comes from UML Component Diagrams. We believe it might be possible to include this information in the Component Diagram. This would open up the possibility to automatically calculate Function Points. Future work will investigate this possibility.
Acknowledgements We would like to thank Jonas Hellgren, Stefan Edvardsson, Robert Baillargeon, and Rick Flores at GM for their input to the experiments, Petter Modal at Chalmers University of Technology for his advice on the statistical parts of the paper, and Marcin Zalewski at Chalmers University of Technology for his valuable comments throughout the paper. This research was funded by VINNOVA, the Swedish Research Agency.
References
[1] IEEE 100-2000, “The Authoritative Dictionary of IEEE Standards Terms Seventh Edition”.
[2] Lind, K, and Heldal, R, “Estimation of Real-Time System Software Size using Function Points”, Proc. of the Nordic Workshop on Model Driven Engineering (NWMoDE), 15-28, 2008.
[3] Lind, K, and Heldal, R, “Estimation of Real-Time Software Code Size using COSMIC FSM”, Proc. of the IEEE Intl. Symposium on Object/component/service-oriented Real-time distributed Computing (ISORC 2009), 244-248, March 2009.
[4] ISO/IEC 14143-1:2007, “Information Technology - Software Measurement - Functional Size Measurement - Part 1: Definitions of concepts”, 2007.
[5] ISO/IEC 14143-2:2002, “Information Technology - Software Measurement - Functional Size Measurement - Part 2: Conformity Evaluation of Software Size Measurement Methods to ISO/IEC 14143-1”, 2002.
[6] ISO/IEC TR 14143-3:2003, “Information Technology - Software Measurement - Functional Size Measurement - Part 3: Verification of Functional Size Measurement Methods”, 2003.
[7] ISO/IEC TR 14143-4:2002, “Information Technology - Software Measurement - Functional Size Measurement - Part 4: Reference Model”, 2002.
[8] ISO/IEC TR 14143-5:2004, “Information Technology - Software Measurement - Functional Size Measurement - Part 5: Determination of Functional Domains for Use with Functional Size Measurement”, 2004.
[9] ISO/IEC 14143-6:2006, “Guide for the Use of ISO/IEC 14143 and Related International Standards”, 2006.
[10] Boehm, B, Abts, C, Winsor Brown, A, Chulani, S, Clark, B, Horowitz, E, Madachy, R, Reifer, D, and Steece, B, “Software Cost Estimation with COCOMO II”, Prentice-Hall Inc., 2000, ISBN 0-13-026692-2.
[11] Albrecht, A, “Measuring application development productivity”, Proc. of the IBM Applications Development Symposium, Monterey, CA, 83-92, Oct. 1979.
[12] Albrecht, A, and Gaffney, J, “Software function, source lines of code, and development effort prediction: A software science validation”, IEEE Trans. Softw. Eng. SE-9, 6, 639-648, 1983.
[13] Albrecht, A, “AD/M Productivity Measurement and Estimate Validation”, IBM Corporate Information Systems, IBM Corp., Purchase, NY, 1984.
[14] Gencel, C, and Demirors, O, “Functional Size Measurement Revisited”, ACM Trans. Softw. Eng. Methodol. 17, 3, Article 15 (June 2008).
[15] IFPUG, Function Point Counting Practices Manual, Release 4.1, IFPUG, Westerville, OH, 1999.
[16] ISO/IEC 20926:2003, “Software Engineering - IFPUG 4.1 Unadjusted FSM Method - Counting Practices Manual”, 2003.
[17] COSMIC, The Common Software Measurement International Consortium Functional Size Measurement Method, Version 3.0, Measurement Manual, 2007.
[18] ISO/IEC 19761:2003, “Software engineering - COSMIC-FFP - A functional size measurement method”, 2003.
[19] ISO/IEC 20968:2002, “Software Engineering - MkII Function Point Analysis Counting Practices Manual”, 2002.
[20] ISO/IEC 24570:2005, “Software Engineering - NESMA Functional Size Measurement Method v.2.1 - Definitions and Counting Guidelines for the Application of Function Point Analysis”, 2005.
[21] Telelogic Rhapsody, http://modeling.telelogic.com/products/rhapsody/.
[22] Baillargeon, R, and Flores, R, “From Algorithms to Software – A Practical Approach to Model-Driven Design”, SAE paper 2007-01-1622.
[23] OMG, Unified Modeling Language (UML), Superstructure, V2.1.2, Object Management Group, http://www.uml.org/.
[24] Szyperski, C, “Component Software: Beyond Object-Oriented Programming”, 2nd ed., Addison-Wesley Professional, Boston, 2002, ISBN 0-201-74572-0.
[25] Wohlin, C, Runeson, P, Höst, M, Ohlsson, M, C, Regnell, B, and Wesslén, A, “Experimentation in Software Engineering: An Introduction”, Kluwer Academic Publishers, 2000, ISBN 0-7923-8682-5.
[26] Afsharian, S, Giacomobono, M, and Inverardi, P, “A Framework for Software Project Estimation Based on COSMIC, DSM and Rework Characterization”, Intl. Conf. of Software Engineering (ICSE’08), 15-24, 2008.
[27] Fornaciari, W, Micheli, P, Salice, F, and Zampella, L, “A First Step Towards Hw/Sw Partitioning of UML Specifications”, Proc. of the Design Automation and Test in Europe Conf. and Exhibition (DATE’03), 668-673, 2003.
[28] Axelsson, J, “Cost Models with Explicit Uncertainties for Electronic Architecture Trade-off and Risk Analysis”, Intl. Council on Systems Engineering (INCOSE), 2006.
[29] Gencel, C, and Demirors, O, “Conceptual Differences Among Functional Size Measurements Methods”, IEEE First Intl. Symposium on Empirical Software Engineering and Measurement, 305-313, 2007.
[30] Gencel, C, Demirors, O, and Yuceer, E, “A Case Study on Using Functional Size Measurement Methods for Real Time Systems”, Proc. of the 15th Intl Workshop on Software Measurement (IWSM), 159-178, 2005.
[31] Zayaraz, G, Thambidurai, P, Srinivasan, M, and Rodrigues, P, “Software Quality Assurance through COSMIC FFP”, ACM Sigsoft Software Engineering Notes, Vol. 30, No. 5, 1-5, Sept. 2005.
[32] Zayaraz, G, and Thambidurai, P, “Quantitative Measurement of Software Architectural Qualities through COSMIC FFP”, Proc. of the IEEE Annual India Conf., 1-4, 2006.
[33] Wijayasiriwardhane, T, and Lai, R, “A Method for Measuring the Size of a Component-Based System Specification”, IEEE The 8th Intl. Conf. on Quality Software, 329-337, 2008.
[34] Levesque, G, Bevo, V, and Tran Cao, D, “Estimating Software Size with UML Models”, Canadian Conference on Computer Science & Software Engineering (C3S2E '08), 81-87, 2008.
[35] Uemura, T, Kusumoto, S, and Inoue, K, “Function-point analysis using design specifications based on the Unified Modelling Language”, Journal of Software Maintenance and Evolution: Research and Practice 13, 4, 223-243, July/Aug. 2001.
[36] Tran-Cao, D, Levesque, G, and Abran, A, “Measuring Software Functional Size: Towards an Effective Measurement of Complexity”, Proc. of the Intl. Conf. on Software Maintenance (ICSM’02), 370-376, 2002.