Applying Statistical Methodology to Optimize and Simplify Software Metric Models with Missing Data

Victor K.Y. Chan, Macao Polytechnic Institute
[email protected]
W. Eric Wong & Jin Zhao, Department of Computer Science, University of Texas at Dallas
[email protected]

ABSTRACT
During the construction of a software metric model, the decision on whether a particular predictor metric should be included is most likely based on an intuitive or experience-based assumption that the predictor metric has a statistically significant impact on the target metric. However, a model constructed based on such an assumption may contain redundant predictor metric(s) and/or unnecessary predictor metric complexity, because the assumption made before the model construction is not verified after the model is constructed. To resolve the first problem (i.e., possible redundant predictor metric(s)), we propose a statistical hypothesis testing methodology to verify “retrospectively” the statistical significance of the impact of each predictor metric on the target metric. If the variation of a predictor metric does not correlate enough with the variation of the target metric, the predictor metric should be deleted from the model. For the second problem (i.e., unnecessary predictor metric complexity), we use “goodness-of-fit” to determine whether certain categories of a categorical predictor metric should be combined. In addition, missing data often appear in the data sample used for constructing the model. We use a modified k-nearest neighbors (k-NN) imputation method to deal with this problem. A study using data from the “Repository Data Disk - Release 6” is reported. The results indicate that our methodology can be useful in trimming redundant predictor metrics and identifying unnecessary categories initially assumed for a categorical predictor metric in the model.
Keywords: software metrics, models, model optimization, model simplification, missing data, imputation method

1. INTRODUCTION

Software metrics are measurements of some properties of a software system or its specifications. To explore the possible correlations among these metrics, appropriate models have to be constructed. All such models are known as software metric models, although other terminologies have also been used. In general, each model gives a relationship between a specific target metric (a dependent variable) and one or more predictor metrics (independent variables). For example, one can develop a model to predict the “work effort required for completing a project” (the target metric) in terms of the project’s predictor metrics, such as “the number of function points of that project.” Such a model is very useful because it can help us estimate the expected work effort of a software project at an early development stage. Based on this estimate, we can decide whether the project should be continued or what kind of adjustments (e.g., hiring more developers) are necessary in order to complete the project before a certain deadline.
One of the challenges encountered in the construction of a software metric model is the selection of its predictor metrics. In general, whether a predictor metric should be included in a software metric model is based on an intuitive or experience-based assumption that this predictor metric has a statistically significant impact on the target metric. There exists a vast amount of literature on the construction of software metric models [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]. However, none of these studies provides a theoretically rigorous test to verify “retrospectively” the statistical significance of the impact of each predictor metric on the target metric. Stated differently, the assumption made before the model construction is not carefully verified after the model is constructed. As a result, the impact of a predictor metric on the target metric may not be as statistically significant as originally assumed, and the constructed model may contain redundant predictor metric(s). To resolve this problem, we propose a statistical hypothesis testing methodology to help us verify “retrospectively” whether to retain a predictor metric in the constructed model. If the variation of a predictor metric does not correlate enough with the variation of
the target metric, the predictor metric should be deleted from the model.

In addition, we face another challenge for a categorical predictor metric – how to determine its categorization. Are all the categories initially assumed for a given categorical predictor metric necessary? Should any of them be combined in order to reduce the complexity of the categorical predictor metric? For example, for a categorical predictor metric “development type,” one can assume it has three categories: “new development,” “enhancement,” and “redevelopment.” One point we should verify after the model is constructed is whether we really need these three categories. If the model will not be compromised with statistical significance by combining two categories (say “enhancement” and “redevelopment”) into one single category (say “enhancement and redevelopment”), then the total number of categories for “development type” should only be two instead of three. We propose to use the “goodness-of-fit” to verify “retrospectively” the original categorization of a categorical predictor metric and determine whether certain categories of that categorical predictor metric should be combined. More details of our methodology appear in Section 3.

Another point worth noting is that missing data often appear in a data sample that is used to construct a metric model [13, 14, 15]. Since the problem of missing data is not unique to software engineering, it is no surprise that a wide range of methods have been proposed to deal with missing data [16]. These methods can be classified into three classes: listwise deletion (LD) methods, imputation methods, and maximum likelihood (ML) methods. The disadvantage of the LD methods is that they waste a considerable amount of information and may introduce a bias in the data. As discussed in [14], the ML methods are not susceptible to bias when data are missing at random (MAR). Chan and Wong [17] reported a study using the ML methods to optimize and simplify software metric models. One problem with the ML methods is that they assume the data come from a multivariate normal distribution, which may not be the case for categorical predictor metrics. The basic idea of an imputation method is to replace missing data by appropriate estimates that are obtained based on the known data (i.e., the non-missing data). Recent studies [18, 19] show that the k-nearest neighbors (k-NN) imputation method appears to be more robust and sensitive than other imputation methods. In this paper, we use the k-NN imputation method to deal with missing data.

The remainder of this paper is organized as follows. Section 2 gives a mathematical perspective of model optimization and simplification. Our methodology is explained in Section 3. A study demonstrating the use of our methodology is presented in Section 4. Finally, our conclusion appears in Section 5.
2. REGRESSION MODEL

We use linear regression analysis to determine the relationship between a target metric (or dependent variable) and one or more predictor metrics (or independent variables). The general form of such a model is

y = β0 + β1 x1 + · · · + βp xp + βp+1 xp+1 + · · · + βq xq,    (1)

where y is the target metric, x1, · · · , xp are the continuous predictor metrics, xp+1, · · · , xq are the categorical predictor metrics that take on values of discrete levels (for example, the categorical predictor metric “development type” takes values 1, 2, and 3 corresponding to “new development,” “enhancement,” and “redevelopment,” respectively), and β0, β1, · · · , βq are the regression coefficients to be estimated during the model construction.

However, there might be a situation where the variation of a particular predictor metric xl does not correlate enough with the variation of the target metric y after considering the variation of the other predictor metrics x1, · · · , xl−1, xl+1, · · · , xq or, equivalently, the predictor metric xl does not vary with the target metric y with sufficiently high statistical significance after considering the variation of the other predictor metrics. This situation is tantamount to the true, but unknown, regression coefficient βl not being statistically significantly different from zero. From the view of statistics, it is equivalent to the following hypothesis test:

H0 : βl = 0 versus H1 : βl ≠ 0.

If we accept the null hypothesis H0, which means that the predictor metric xl does not impact the target metric y with enough statistical significance, then model (1) should be optimized and simplified by omitting the term βl xl; the predictor metric xl is redundant. If the null hypothesis is rejected (i.e., the alternative hypothesis H1 is accepted), then inclusion of the predictor metric xl in model (1) is warranted.

Furthermore, assuming that the target metric y is impacted by a particular categorical predictor metric xc, p + 1 ≤ c ≤ q, we need to verify whether the impact can be exerted with less complexity than initially assumed (in other words, whether model (1) can be optimized and simplified by combining two or more categories of the categorical predictor metric xc into one single category). For example, the categorical predictor metric “development type” has three categories: “new development,” “enhancement,” and “redevelopment.” If the model will not be compromised with statistical significance by combining two categories (e.g., “enhancement” and “redevelopment”) into one single category (e.g., “enhancement
and redevelopment”), then the total number of categories for the categorical predictor metric “development type” can be reduced and unnecessary predictor metric complexity is averted. More details are given in the next section.
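To make this retrospective check concrete, here is a minimal sketch (in Python with statsmodels, which the paper does not use; the data frame and column names are hypothetical) that fits a linear model with one continuous predictor and one dummy-coded categorical predictor and reports, for every coefficient, the test of H0: βl = 0.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical sample: work effort, function points, and a 3-category development type.
df = pd.DataFrame({
    "work_effort":  [620, 1800, 950, 400, 2300, 1200, 760, 1500],
    "function_pts": [110, 420, 260, 90, 510, 300, 180, 350],
    "dev_type":     ["new", "enh", "new", "redev", "new", "enh", "redev", "enh"],
})

# C(dev_type) expands the categorical predictor into dummy variables.
fit = smf.ols("work_effort ~ function_pts + C(dev_type)", data=df).fit()

# The summary lists, for each coefficient, the t statistic and p-value of H0: beta_l = 0;
# a large p-value marks the corresponding predictor (or dummy) as a candidate for removal.
print(fit.summary())

A predictor whose coefficient cannot be distinguished from zero at the chosen significance level is exactly the kind of redundant term that the methodology in Section 3 removes retrospectively.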
3. METHODOLOGY

In this section we develop a statistical methodology to optimize and simplify software metric models. Since missing data are often encountered in the data samples that are used to construct software metric models, many techniques have been developed to replace missing data with estimates that are obtained based on the known (non-missing) data [20, 21]. Of these techniques, imputation methods are especially useful in situations where a complete data set is required for the analysis [22]. For example, in the case of multiple regression analysis, all observations must be complete. Recent studies, such as [18, 19], show that the k-nearest neighbors (k-NN) imputation method appears to be more robust and sensitive than other imputation methods. We first present an overview of the k-NN method in Section 3.1, followed by the statistical methodology in Section 3.2.

3.1 k-NN Imputation Method

The basic idea of the imputation method is to fill in missing data with values based on other observations without missing data. The k-NN method works by finding the k complete cases most similar to the missing case to be imputed, where similarity is measured by Euclidean distance. A weighted average of values from the k most similar cases is then imputed as an estimate for the missing value in the missing case.

We now introduce some notation to help explain the k-NN method. The data sample is divided into observations with missing values (the missing cases) and observations without missing values (the complete cases). Let mi = (mi1, mi2, · · · , miq)T be the vector of all metrics measured on the ith observation in the missing cases, and cj = (cj1, cj2, · · · , cjq)T be the vector of all metrics measured on the jth observation in the complete cases.
The Euclidean distance between a missing case i and a complete case j is given by

E_{ij} = \sqrt{ \sum_{l=1}^{q} (m_{il} - c_{jl})^2 }.

We denote the k nearest complete cases to the missing case i as i_1, i_2, · · · , i_k, and their Euclidean distances as E_{ii_1}, E_{ii_2}, · · · , E_{ii_k}, respectively. For the k-NN imputation method, the known values of these k nearest cases are then used to derive a weighted value for the missing value; here the weight is derived by an inverse-distance weighting algorithm. For example, if mi1 is a missing value, the imputed value is

\hat{m}_{i1} = \frac{ \sum_{t=1}^{k} E_{ii_t}^{-1} c_{i_t 1} }{ \sum_{t=1}^{k} E_{ii_t}^{-1} }.

Once all missing data have been imputed, the data sample is complete and multivariate regression analysis can be applied.
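A minimal numpy sketch of this inverse-distance-weighted k-NN imputation follows (the function name is ours; as a simplifying assumption, distances are computed only over the columns that are observed in the missing case, a detail the paper does not spell out).

import numpy as np

def knn_impute(missing_row, complete_rows, k=10):
    """Fill the NaN entries of `missing_row` using the k nearest complete rows,
    weighting each neighbour by the inverse of its Euclidean distance."""
    observed = ~np.isnan(missing_row)
    # Euclidean distance to every complete case, over the observed columns only.
    diffs = complete_rows[:, observed] - missing_row[observed]
    dist = np.sqrt((diffs ** 2).sum(axis=1))
    nearest = np.argsort(dist)[:k]
    weights = 1.0 / np.maximum(dist[nearest], 1e-12)   # inverse-distance weights
    imputed = missing_row.copy()
    for col in np.where(~observed)[0]:
        imputed[col] = np.dot(weights, complete_rows[nearest, col]) / weights.sum()
    return imputed

# Example: impute the second metric of a case that is missing it.
complete = np.array([[1.0, 10.0], [2.0, 14.0], [3.0, 30.0]])
print(knn_impute(np.array([2.1, np.nan]), complete, k=2))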
3.2 Model Optimization and Simplification

As mentioned in Section 2, one objective of our methodology is to verify, with certain statistical significance, whether the target metric y varies with a predictor metric xl after considering the variation of the other predictor metrics. From a statistics point of view, it is equivalent to the following hypothesis:

H0 : βl = 0.
This test serves to verify whether βl is statistically significantly different from 0. If the above null hypothesis is rejected, then inclusion of predictor metric xl in the model is warranted. Otherwise, the predictor metric xl is redundant and the model should be optimized and simplified by omitting xl .
In order to test the hypothesis H0, we define a procedure based on a modified k-NN method and a Monte Carlo simulation. The detailed steps of the procedure are:

• Step 1. If mil is a missing value of a continuous metric, i.e., 1 ≤ l ≤ p, a continuous distribution is generated from the known values of the k nearest cases by using a nonparametric kernel method, and we assume that the missing value mil is drawn from this continuous distribution. The probability density function is given by

f_{il}(x) = \frac{1}{h} \sum_{t=1}^{k} \frac{ E_{ii_t}^{-1} }{ \sum_{t=1}^{k} E_{ii_t}^{-1} } \ker\!\left( \frac{x - c_{i_t l}}{h} \right),    (2)

where ker(·) is the kernel function and h is the bandwidth. The selection of the kernel function and the bandwidth is discussed in [23].

• Step 2. If mil is a missing value of a categorical metric, i.e., p + 1 ≤ l ≤ q, a discrete distribution over the known values of the k nearest cases is given by

P(x = c_{i_t l}) = \frac{ E_{ii_t}^{-1} }{ \sum_{t=1}^{k} E_{ii_t}^{-1} }, \quad t = 1, 2, · · · , k.    (3)

• Step 3. By independently sampling from the distributions in Steps 1 and 2, a Monte Carlo simulation can be used to generate N completed data samples. For each of the N completed data samples, the F test [24] can be used to test the hypothesis H0 at a given significance level s. The number of acceptances of H0 among the N simulated data samples is denoted as a, which has an approximately normal distribution with mean N(1 − s) and variance Ns(1 − s).
• Step 4. The test at level α is to reject the hypothesis H0 if

a < N(1 − s) − µα \sqrt{ N s (1 − s) },    (4)

where µα is the upper αth quantile of the standard normal distribution. Otherwise, we accept the hypothesis H0.
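The whole procedure (Steps 1–4) can be summarized in a short sketch. This is only an illustration under stated assumptions: Python with numpy, scipy, and statsmodels (none of which the paper names), a `neighbours` mapping assumed to have been prepared as in Section 3.1, and the uniform kernel that Section 4 later adopts.

import numpy as np
import statsmodels.formula.api as smf
from scipy.stats import norm

def complete_one_sample(df, neighbours, continuous_cols, h, rng):
    """Steps 1-2: fill each missing cell by drawing from its k-NN based distribution.
    `neighbours[(row, col)]` holds (neighbour_values, inverse_distance_weights)."""
    filled = df.copy()
    for (row, col), (values, weights) in neighbours.items():
        probs = np.asarray(weights, dtype=float)
        probs = probs / probs.sum()
        centre = rng.choice(np.asarray(values), p=probs)      # pick one neighbour value
        if col in continuous_cols:
            # uniform kernel: the chosen mixture component is flat on (centre - h, centre + h)
            filled.loc[row, col] = float(centre) + h * rng.uniform(-1.0, 1.0)
        else:
            filled.loc[row, col] = centre                      # categorical: keep the drawn level
    return filled

def monte_carlo_test(df, neighbours, continuous_cols, full_formula, restricted_formula,
                     N=2000, s=0.05, alpha=0.10, h=0.63096, seed=0):
    """Steps 3-4: count F-test acceptances of H0 over N completed samples and
    compare the count a with the threshold N(1-s) - mu_alpha * sqrt(N s (1-s))."""
    rng = np.random.default_rng(seed)
    a = 0
    for _ in range(N):
        filled = complete_one_sample(df, neighbours, continuous_cols, h, rng)
        full = smf.ols(full_formula, data=filled).fit()
        restricted = smf.ols(restricted_formula, data=filled).fit()
        _, p_value, _ = full.compare_f_test(restricted)        # F test for the dropped term
        a += int(p_value > s)                                  # H0 accepted at level s
    threshold = N * (1 - s) - norm.ppf(1 - alpha) * np.sqrt(N * s * (1 - s))
    return a, a >= threshold    # second value True means H0 is accepted overall

For a single coefficient, the F test used here is equivalent to the squared t test reported in the regression summary.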
Another objective of our methodology is to verify whether the target metric y will be impacted differently by a particular categorical predictor metric that combines two or more categories into one single category as opposed to treating each category separately. The purpose of the verification is to avoid unnecessary differentiation between any pair of categories of a categorical predictor metric. We conduct this verification by using the goodness-of-fit. If the model provides a better fit by combining two or more categories into one single category for a particular categorical predictor metric, then the total number of categories for that particular categorical predictor metric can be reduced and the model can be optimized and simplified.
The fit of a model can be evaluated by using the mean magnitude of relative error (MMRE), which is widely used. The MMRE is defined as [25]

\mathrm{MMRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{ | y_i - \hat{y}_i | }{ y_i },

where y_i is the true and \hat{y}_i the estimated value of the target metric, and n is the number of observations. The best model has the lowest MMRE value. The detailed steps are as follows:

• Steps 1 and 2 are the same as discussed above.

• Step 3. By combining different categories into one single category for a particular categorical predictor metric, we get different models. For example, the categorical predictor metric “development type” has three categories: “new development,” “enhancement,” and “redevelopment.” By combining “new development” and “redevelopment” into one category, we get one model; by combining all three of them into one category, we get another model. By exhausting all possible combinations, we get three additional models: the first combines “new development” and “enhancement”; the second combines “enhancement” and “redevelopment”; the third is the original model without any combination.

• Step 4. N completed data samples are generated by a Monte Carlo simulation for each of the models constructed in Step 3. For each of the simulated samples, its MMRE is calculated. Then we compare the arithmetic means of the MMRE values for these models and identify the one with the lowest arithmetic mean. According to this model, the number of categories of the given categorical predictor metric may be reduced and unnecessary predictor metric complexity is averted.
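A compact sketch of this goodness-of-fit comparison is given below (Python; the helper names are ours, and for brevity the MMRE is computed on a single completed sample rather than averaged over the N Monte Carlo samples as in Step 4).

import numpy as np
import statsmodels.formula.api as smf

def mmre(y_true, y_pred):
    """Mean magnitude of relative error, as defined above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / y_true))

def merge_categories(series, groups):
    """Collapse categories: all values in the same group share one new label,
    e.g. groups=[{2, 3}] merges categories 2 and 3 of a predictor."""
    label_of = {}
    for group in groups:
        label = "+".join(sorted(str(g) for g in group))
        for g in group:
            label_of[str(g)] = label
    return series.astype(str).map(lambda v: label_of.get(v, v))

def mmre_for_merge(df, target, formula, cat_col, groups):
    """Refit the model after merging categories of `cat_col` and return its MMRE."""
    recoded = df.copy()
    recoded[cat_col] = merge_categories(recoded[cat_col], groups)
    fit = smf.ols(formula, data=recoded).fit()
    return mmre(recoded[target], fit.predict(recoded))

# Example for a three-category predictor: compare no merge and every possible merge,
# and keep the categorization with the lowest (mean) MMRE.
# candidates = [[], [{1, 2}], [{1, 3}], [{2, 3}], [{1, 2, 3}]]
# scores = {str(c): mmre_for_merge(df, "work_effort",
#           "work_effort ~ function_pts + C(dev_type)", "dev_type", c) for c in candidates}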
4. A CASE STUDY

The data sample used in our study is the “Repository Data Disk - Release 6” from the International Software Benchmarking Standards Group (ISBSG). It is recognized as one of the most comprehensive, reliable, systematic and representative data samples of its kind [13, 14, 26]. This data sample consists of 55 metrics for a collection of 789 projects. However, most of these metrics suffer from missing data for some of the projects. In the following, a regression model is constructed with “summary work effort” as the dependent variable (i.e., the target metric) and a set of continuous and categorical metrics as independent variables (i.e., the predictor metrics). Six of these predictor metrics (namely, x1, x2, x5, x6, x7, and x8 in Table 1) have been used to construct software cost estimation models based on the same data set [26, 27]. In order to demonstrate the use of our methodology, we add two additional predictor metrics (x3 and x4) which are seldom used. Table 1 gives details of these metrics.

Table 1: Metrics used in our study

y  (Summary work effort): total effort (in hours) spent on the project. Dependent variable.
x1 (Function points): the adjusted function point count.
x2 (Maximum team size): the maximum number of project team members.
x3 (Project elapsed time): total elapsed time (in months) for the project.
x4 (User base-locations): number of physical locations being serviced by the installed system.
x5 (Development platform): the primary development platform. Three categories: 1 for “microcomputers,” 2 for “mid-range computers,” and 3 for “mainframe computers.”
x6 (Programming language type): the type of programming language used for the project. Four categories: 1 for “2GL,” 2 for “3GL,” 3 for “4GL,” and 4 for “application generators.”
x7 (Development type): whether the development is a new development, an enhancement, or a redevelopment. Three categories: 1 for “new development,” 2 for “enhancement,” and 3 for “redevelopment.”
x8 (Resource level): how people spend their resources. Four categories: 1 for “development team effort,” 2 for “development team support,” 3 for “computer operations involvement,” and 4 for “end users or clients.”
The model is given as follows:

y = β0 + β1 x1 + β2 x2 + · · · + β8 x8,    (5)
where β0, β1, · · · , β8 are the regression coefficients. The “retrospective” verification of the possible redundancy of a predictor metric xl, l = 1, 2, · · · , 8, can be conducted by testing the null hypothesis H0 : βl = 0. Rejection of the null hypothesis implies that inclusion of the predictor metric xl in model (5) is warranted. Otherwise, the predictor metric xl is redundant and needs to be omitted.
As discussed earlier, this test can be conducted by using the k-NN method and a Monte Carlo simulation. We let k = 10 and use the uniform kernel

ker(u) = (1/2) I_{(−1,1)}(u),

where I_{(−1,1)} denotes the indicator function of the set (−1, 1), i.e., I_{(−1,1)}(x) = 1 if −1 < x < 1 and 0 otherwise. The bandwidth is h = 10^{−0.2} ≈ 0.63096. Two thousand data samples are generated using a Monte Carlo simulation. For each of the data samples, the significance level s is set to 0.05. For each hypothesis, the number of acceptances among the 2000 simulated data samples is denoted as a, which has an approximately normal distribution with mean 1900 and variance 95. Thus, if we choose the significance level α = 0.1, the null hypothesis H0 will be rejected if

a < 1900 − 1.28155 × √95 ≈ 1887.

Otherwise, we accept the hypothesis H0.
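As a quick arithmetic check of these numbers (a few lines of Python, simply re-deriving the constants quoted above):

import math

print(10 ** -0.2)                        # 0.63096, the bandwidth h
N, s, alpha = 2000, 0.05, 0.10
mean, var = N * (1 - s), N * s * (1 - s) # 1900 and 95
mu_alpha = 1.28155                       # upper 0.10 quantile of the standard normal
print(mean - mu_alpha * math.sqrt(var))  # about 1887.5, i.e. the cutoff of roughly 1887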
Table 2 shows, out of the 2000 simulations, how many times each hypothesis is accepted and its overall acceptance/rejection. For example, the hypothesis for x4 is accepted 1992 times, which is greater than 1887. This implies that the hypothesis for x4 is accepted. Hence, the corresponding predictor metric “user base-locations” is redundant and should be deleted from model (5). The other predictor metrics should be retained in the model.

Table 2: Results of the hypothesis testing

        Number of acceptances of H0    Result
  x1       0                           Rejection
  x2    1710                           Rejection
  x3     104                           Rejection
  x4    1992                           Acceptance
  x5    1746                           Rejection
  x6    1775                           Rejection
  x7    1811                           Rejection
  x8      73                           Rejection

On the other hand, the “retrospective” verification of possible unnecessary differentiation between any pair of categories of a categorical predictor metric can be conducted by comparing the arithmetic mean values of MMRE (refer to Section 3.2). The results for the four categorical predictor metrics (x5, x6, x7, and x8) are given in Tables 3 and 4; the number of simulations for each model in these tables is 2000. The notation “(1,2),3” means that two categories, “1 and 2” and “3,” are used; that is, category 1 and category 2 are combined, whereas category 3 remains a separate category. Similarly, “(1,2,3,4)” means that all four categories are combined into one single category, and “(1,2),3,4” means three categories, namely “1 and 2,” “3,” and “4.” Other notations follow the same convention. From Tables 3 and 4, we make the following observations:

• For x5 (“development platform”), categories 1 and 2 should be combined.
• For x6 (“language type”), categories 1, 2 and 3 should be combined.
• For x7 (“development type”), categories 2 and 3 should be combined.
• For x8 (“resource level”), categories 2 and 4 should be combined, whereas the other categories should remain the same without further simplification.

Table 3: Mean values of MMRE for metrics x5 and x6

  Models     MMRE for x5    MMRE for x6
  (1,2,3)      7.80182        7.64853
  (1,2),3      7.56072        7.69204
  (1,3),2     10.0994         7.70269
  (2,3),1      8.04854        7.55085
  1,2,3        7.65893        7.65893

Table 4: Mean values of MMRE for metrics x7 and x8

  Models        MMRE for x7    MMRE for x8
  (1,2,3,4)       7.62675        6.56982
  (1,2),3,4       7.65051        7.33335
  (1,3),2,4       7.64776        8.01296
  (1,4),2,3       7.94576        6.37836
  (2,3),1,4       7.76990        7.88843
  (2,4),1,3       7.90618        6.02991
  (3,4),1,2       7.84770        7.53150
  (1,2,3),4       7.60925        7.36693
  (1,2,4),3       7.69927        5.69695
  (1,3,4),2       7.86093        6.11643
  (2,3,4),1       7.88959        6.79328
  (1,2),(3,4)     7.68877        7.11619
  (1,3),(2,4)     7.74769        7.18419
  (1,4),(2,3)     8.36456        6.34281
  1,2,3,4         7.65893        7.65893

To conclude, of the eight predictor metrics, the results of our hypothesis testing suggest that only x4 should be deleted from the model. And, of the four categorical predictor metrics (x5, x6, x7, and x8), the results of the “goodness-of-fit” comparison suggest that certain categories should be combined as discussed above. With this information, the original model constructed based on an intuitive or experience-based assumption (as to which predictor metrics should be included) can be optimized and simplified.
5. CONCLUSION

While constructing a prediction model, the selection of predictor metrics is initially based on an intuitive or experience-based assumption that these metrics have a statistically significant impact on the target metric. However, no theoretically rigorous test has been applied to verify “retrospectively” the statistical significance of the impact of any of the selected predictor metrics on the target metric after the model is constructed. Thus, inclusion of redundant predictor metric(s) and/or unnecessary predictor metric complexity may result, leading to a waste of resources in the collection of data for redundant predictor metrics or predictor metrics of unnecessary complexity. Moreover, missing data often appear in the data samples that are used to construct software metric models. To overcome these problems, we apply the k-NN imputation method and a statistical methodology to verify “retrospectively” the aforesaid statistical significance. Based on the verification results, we make recommendations on
how to optimize/simplify the original model. A study using data from the “Repository Data Disk - Release 6” is reported to demonstrate the use of our methodology. The results indicate that our methodology can be useful in trimming redundant predictor metrics and identifying unnecessary categories initially assumed for a categorical predictor metric in the model.
6. REFERENCES

[1] A. Albrecht and J. Gaffney, “Software Function, Source Lines of Code, and Development Effort Prediction,” IEEE Transactions on Software Engineering, 9(6):639-648, November 1983.
[2] B. Kitchenham and N. Taylor, “Software Project Development Cost Estimation,” Journal of Systems and Software, 5(4):267-278, November 1985.
[3] C.L. Ramsey and V.R. Basili, “An Evaluation of Expert Systems for Software Engineering Management,” IEEE Transactions on Software Engineering, 15(6):747-759, June 1989.
[4] D.W. Aha, “Case-Based Learning Algorithms,” Proceedings of the DARPA Case-Based Reasoning Workshop, pp. 147-158, Washington, D.C., May 1991.
[5] S. Horikawa, T. Furuhashi and Y. Uchikawa, “On Fuzzy Modeling Using Fuzzy Neural Networks with the Back-Propagation Algorithm,” IEEE Transactions on Neural Networks, 3(5):801-806, September 1992.
[6] T. Mukhopadhyay, S.S. Vicinanza and M.J. Prietula, “Examining the Feasibility of a Case-Based Reasoning Model for Software Effort Estimation,” MIS Quarterly, 16(2):155-171, June 1992.
[7] L. Breiman, J.H. Friedman, R.A. Olshen and C.J. Stone, Classification and Regression Trees, New York: Chapman & Hall, 1993.
[8] J.-S. R. Jang, “ANFIS: Adaptive-Network-Based Fuzzy Inference System,” IEEE Transactions on Systems, Man, and Cybernetics, 23(3):665-685, May-June 1993.
[9] A. Lakhotia, “Rule-Based Approach to Computing Module Cohesion,” Proceedings of the 15th International Conference on Software Engineering, pp. 35-44, Baltimore, Maryland, May 1993.
[10] J. Matson, B. Barrett and J. Mellichamp, “Software Development Cost Estimation Using Function Points,” IEEE Transactions on Software Engineering, 20(4):275-287, April 1994.
[11] C. Walston and C. Felix, “A Method of Programming Measurement and Estimation,” IBM Systems Journal, 16(1):54-73, 1977.
[12] L. Briand, K. El Emam and I. Wieczorek, “Explaining the Cost of European Space and Military Projects,” Proceedings of the 21st International Conference on Software Engineering, pp. 303-312, Los Angeles, California, May 1999.
[13] I. Myrtveit, E. Stensrud and U.H. Olsson, “Assessing the Benefits of Imputing ERP Projects with Missing Data,” Proceedings of the 7th IEEE International Software Metrics Symposium, pp. 78-84, London, UK, April 2001.
[14] I. Myrtveit, E. Stensrud and U.H. Olsson, “Analyzing Data Sets with Missing Data: An Empirical Evaluation of Imputation Methods and Likelihood-Based Methods,” IEEE Transactions on Software Engineering, 27(11):999-1013, November 2001.
[15] K. Strike, K. El Emam and N. Madhavji, “Software Cost Estimation with Incomplete Data,” IEEE Transactions on Software Engineering, 27(10):890-908, October 2001.
[16] R.J.A. Little and D.B. Rubin, Statistical Analysis with Missing Data, 2nd ed., New York: Wiley, 2002.
[17] V. Chan and W.E. Wong, “Optimizing and Simplifying Software Metric Models Constructed Using Maximum Likelihood Methods,” Proceedings of the 29th International Computer Software and Applications Conference, pp. 65-70, Edinburgh, UK, July 2005.
[18] O. Troyanskaya, M. Cantor, G. Sherlock et al., “Missing Value Estimation Methods for DNA Microarrays,” Bioinformatics, 17(6):520-525, June 2001.
[19] M.H. Cartwright, M.J. Shepperd and Q. Song, “Dealing with Missing Software Project Data,” Proceedings of the 9th IEEE International Software Metrics Symposium, pp. 1-12, Sydney, Australia, September 2003.
[20] B. Ford, “An Overview of Hot-Deck Procedures,” in Incomplete Data in Sample Surveys, Theory and Bibliographies, W. Madow, I. Olkin and D. Rubin, eds., Volume 2, 1983.
[21] I. Sande, “Hot-Deck Imputation Procedures,” in Incomplete Data in Sample Surveys, Theory and Bibliographies, W. Madow, I. Olkin and D. Rubin, eds., Volume 3, 1983.
[22] R. Little, “Missing-Data Adjustments in Large Surveys,” Journal of Business & Economic Statistics, 6(3):287-296, July 1988.
[23] J. Simonoff, Smoothing Methods in Statistics, New York: Springer-Verlag, 1996.
[24] G.A.F. Seber, Linear Regression Analysis, New York: Wiley, 1977.
[25] S. Conte, H. Dunsmore and V.Y. Shen, Software Engineering Metrics and Models, Menlo Park, CA: Benjamin/Cummings, 1986.
[26] R. Jeffery, M. Ruhe and I. Wieczorek, “Using Public Domain Metrics to Estimate Software Development Effort,” Proceedings of the 7th IEEE International Software Metrics Symposium, pp. 16-27, London, UK, April 2001.
[27] L. Angelis, I. Stamelos and M. Morisio, “Building a Software Cost Estimation Model Based on Categorical Data,” Proceedings of the 7th IEEE International Software Metrics Symposium, pp. 4-15, London, UK, April 2001.