A History-Based Automatic Scheduling Model for Personnel Risk Management

Hsinyi Jiang, Carl K. Chang, Jinchun Xia, Shuxing Cheng
Department of Computer Science, Iowa State University
{hsinyij, chang, jxia, scheng}@cs.iastate.edu

Abstract

Personnel risk is an issue which has not been researched well but plays an important role in determining whether a software project succeeds or fails. Most existing work relies on subjective expertise, while an objective view is lacking. Furthermore, to the best of our knowledge, the demand for an automatic tool to support risk management has not yet been answered. In this research, based on objective historical data, we extend our earlier model, the capability-based scheduling framework, by including risk analysis. We thereby provide a novel approach that mitigates personnel risks while achieving a project schedule with minimum cost.

1 Introduction

Software risk management has been recognized as a critical issue in software development, since efficiently mitigating risks can directly reduce cost [2]. Risks are potential problems which can cause the failure of a software project, whether functional failure of the software, quality deficiency of the product, or cost/schedule problems of the project itself. Among all the risks, personnel risk, which relates to the capabilities and activities of employees, is the one which tends to be ignored, yet it may greatly impact product quality.

Though many approaches have been proposed to mitigate software risks, very few of them ever addressed this issue. Besides, one drawback of existing risk management approaches is their emphasis on subjective information such as expertise. We believe it is important to include objective information because it gives insight into the real situation of software projects. Furthermore, automatic processing of historical data (of similar projects) saves human labor and usually introduces fewer mistakes. For this reason, we base our risk analysis on historical data.

From early on, many researchers tried to integrate risk management into the feedback loop of software development to provide adaptation. However, this kind of integration is mostly manual. Keshlaf and Hashim [6] emphasized the demand for an automatic tool to avoid tedious and error-prone manual operations. To the best of our knowledge, however, such a tool is still lacking in the current literature.

To address all the problems discussed above, we conduct research based on project scheduling, because the assignments of a project can directly or indirectly impact software quality. In our earlier work [14] [17], we proposed a capability-based scheduling model which analyzes human capabilities and uses a Genetic Algorithm (GA) to find an optimal schedule (including employee assignments). In this paper, we extend the capability-based scheduling model by including risk analysis. As a result, schedules computed by the extended model have a lower degree of personnel risk, and thus the probability of project success can be increased. Based on historical data, the framework we propose here can automatically identify, analyze, and mitigate project/software risks.

The contributions of this work are summarized below.
(1) Personnel risks, which were never considered in the current literature, are addressed in this paper;
(2) Objective information from historical data, instead of subjective expertise, is used for risk analysis;
(3) A comprehensive risk-based project scheduling approach is proposed, which offers end-to-end support for risk management by extracting and analyzing risk factors from historical data, monitoring risk factors during software project management, and adjusting the assignment in response to estimation errors;
(4) An automatic tool is provided to support this comprehensive approach.

2 Related Work

Many approaches have been tried to address software risk management. Gayet and Briand developed METRIX to predict high-risk software components based on experts' opinions and historical data [1]. Foo and Muruganantham proposed the Software Risk Assessment Model (SRAM) [5], which considers nine critical risk elements. Xu and his colleagues [12] investigated software risk control based on the Capability Maturity Model (CMM). A concept for dealing with risks at the design level was presented by Gary McGraw [15]. Feather [13] incorporates the logical fault trees of probabilistic risk assessment (PRA) into the Defect Detection and Prevention (DDP) process. A risk management database was used by Fussell and Field [16] to manage risks. There is also a commercial tool, @Risk (Palisade) [19], which integrates Monte Carlo simulation into Microsoft Project and Excel. Through this tool, users can explicitly include uncertainties in their estimations so that all possible outcomes can be generated as results; it helps users make design decisions.

Unfortunately, most of this work bases its analysis on subjective data, such as experts' experience, which may be biased. Moreover, though most of these approaches try to quantify and rank different risks [16] [6], their analysis rests on risk factors already known to experts, which is clearly not enough. Additionally, none of them offers a comprehensive approach that integrates risk analysis into software project management together with an automatic tool to support it.

3 Definitions and Assumptions

Two terms, project risk and software risk, are defined as follows.

Definition 1 Project risk is the probability of an undesired event, such as a budget overrun or a schedule delay, which can result in failure of a project.

Definition 2 Software risk is the probability of an undesired event, such as a logic error or a design error, which can result in functional failures of the software product.

Moreover, we define risk factors as human activities and attributes which give rise to both kinds of risk, such as coding on Friday noon, learning a new type of technique slowly, or managers being on vacation. In addition, we categorize project historical data based on similar project characteristics, i.e., complexity, size, type of the project, and number of staff [4]. There are two risk factor sets for each category of projects in our system, INDEP and OTHERS. They are defined as follows.

Definition 3 INDEP is the set of mutually (probabilistically) independent risk factors; the remaining risk factors identified from the historical data are stored in the set OTHERS. Both sets are initialized to be empty.

Processing of the historical data works as follows. For each entry in the historical data, we identify its risk factors (we can always do so according to Assumption 1). For example, suppose we identify risk factor u from the current historical data entry. We then compare u with all the risk factors stored in INDEP and OTHERS. If u is found in INDEP or OTHERS, we increase its occurrence frequency; otherwise we add u to OTHERS. If the occurrence frequency of a risk factor u reaches a pre-determined threshold, the minimum support used for data mining in our model, we move u from OTHERS to INDEP. To add u to INDEP, we perform one of the following two steps: (1) if u is independent of all the risk factors in INDEP, then u is directly added to INDEP; (2) if u is dependent on a set of risk factors in INDEP, then u can be represented by those risk factors, so we increase the frequencies of those risk factors instead. Using the minimum support as the threshold, instead of other criteria, is beneficial: when applying data mining we only need to deal with the risk factors in INDEP, because elements in OTHERS do not yet have enough data to be analyzed. (A small sketch of this maintenance procedure is given at the end of this section.)

To inspect the probabilistic independence between two risk factors, mutual information from information theory [10] [11] is introduced. The mutual information is defined as

I(X;Y) = H(X) - H(X|Y) = \sum_{x,y} P(x,y) \log_2 \frac{P(x,y)}{P(x)P(y)},

where X and Y are random variables (the occurrences of risk factors) and H(·) is the entropy function. It can be shown that I(X;Y) ≥ 0, and that X and Y are independent iff I(X;Y) = 0. Since we estimate the probabilities of risk factors from real data, I(X;Y) can be considered to be 0 if its value is less than a very small number. We therefore claim that X is independent of Y whenever I(X;Y) < ε, for some sufficiently small ε > 0. Based on mutual information, we can determine the dependencies between risk factors selected from OTHERS and the elements stored in INDEP, and thus the two steps described above can be performed.

To construct the automatic model, we make several assumptions.

Assumption 1 Every risk can be analyzed in terms of several risk factors. In other words, every risk event is caused by several risk factors.

Assumption 2 At the end of a task/project, we (or experts) can judge the success or failure of the task/project, and if it fails, we can usually identify its causing risk factors.

Assumption 3 There is at least a certain amount of historical data stored in our database for each category of projects.

Assumption 4 The tasks of a project, together with their task types and task precedence, are recorded in the historical data.

Assumption 5 We can add checkpoints to the project.
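To make the INDEP/OTHERS maintenance procedure described above concrete, the following is a minimal sketch in Python. It assumes risk-factor occurrences are available as 0/1 indicator lists over the historical entries; the constants (MIN_SUPPORT, EPSILON) and the function names are illustrative choices, not part of the original model.

```python
import math
from collections import defaultdict

EPSILON = 1e-3      # "very small number" below which I(X;Y) is treated as 0 (assumed value)
MIN_SUPPORT = 5     # pre-determined threshold (minimum support); assumed value

def mutual_information(x, y):
    """I(X;Y) for two equal-length 0/1 indicator sequences."""
    n = len(x)
    joint = defaultdict(int)
    for a, b in zip(x, y):
        joint[(a, b)] += 1
    px = {v: sum(1 for a in x if a == v) / n for v in (0, 1)}
    py = {v: sum(1 for b in y if b == v) / n for v in (0, 1)}
    mi = 0.0
    for (a, b), c in joint.items():
        pxy = c / n
        if pxy > 0 and px[a] > 0 and py[b] > 0:
            mi += pxy * math.log2(pxy / (px[a] * py[b]))
    return mi

def update_risk_factor(u, freq, indep, others, occurrences):
    """Process one observed risk factor u from a historical data entry."""
    freq[u] = freq.get(u, 0) + 1
    if u not in indep and u not in others:
        others.add(u)
    if u in others and freq[u] >= MIN_SUPPORT:
        others.discard(u)
        # decide between step (1) and step (2) using the mutual-information test
        dependent_on = [v for v in indep
                        if mutual_information(occurrences[u], occurrences[v]) >= EPSILON]
        if not dependent_on:
            indep.add(u)                # step (1): independent of every member of INDEP
        else:
            for v in dependent_on:      # step (2): represent u by the factors it depends on
                freq[v] = freq.get(v, 0) + freq[u]
```

In this sketch the independence test uses the empirical joint distribution over historical entries; the exact estimation of P(x, y) is left open in the text, so this is only one plausible realization.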

4 Automatic Model for Personnel Assignments

Our goal is to build an objective automatic model that mitigates personnel risks through the assignments of software projects, based on historical data. As discussed in Section 1, our risk management approach needs to be adaptive: it should be able to make timely changes to the assignment and hence increase the probability of project success. As a result, we design our approach with two working stages, a scheduling stage and an adaptation stage. At the scheduling stage, information about risk factors is extracted from historical data. Risk factors and employees' capabilities are taken as input to compute an optimal task assignment (including time schedules). Note that the information about risk factors may not be exactly accurate for the current project; the personnel risk-based scheduling at this stage therefore serves mainly a preventive purpose. At the adaptation stage, which runs alongside software project development, newly observed values of risk factors may be found to conflict with the ones extracted from historical data, and we may need to recompute task assignments. The risk-based rescheduling at this stage serves a compensating purpose.

We use the framework depicted in Figure 1 to determine risk factors and compute their frequencies. This framework is embedded in our capability-based scheduling model [14][17]. The output of this framework is used as the risk-dimension input for the Genetic Algorithm that calculates optimal assignments.

Figure 1. The framework to determine risk factors of a project/tasks. (Its components include: input of project complexity, task models, and types of tasks; the history records database, i.e., the case base, with the minimum support and minimum confidence supplied as input for data mining; a data mining process relating task/project types to risk factors from failed tasks/projects; a risk analysis and data analysis stage that extracts the data of the employees with respect to similar failed projects and the risk factors; an item characteristic curve fitting process; and extraction of the key risk factors for each type of task/project by analyzing the item parameters.)

4.1 Risk Factors

The risk-based scheduling is based on item response theory (IRT) [18] and data mining technology. To monitor risk factors related to the project, we exploit three tables: an employee-risk factor table (see Table 1), a task-risk factor table (see Table 2, where "1" represents "related" and "0" indicates "not related"), and a rescheduling employee-risk factor table (see Table 3). For each employee, Table 1 records the occurrence frequency of each risk factor as a causing factor of task failure. For example, suppose there are three tasks t1, t2, t3 related to risk factor 1, with durations d1, d2, d3, respectively, and 10% of t1, 20% of t2, and 30% of t3 are assigned to employee 1. If task t1 fails, we compute the occurrence frequency of risk factor 1 as a causing factor of failure to be 0.1×d1 / (0.1×d1 + 0.2×d2 + 0.3×d3). Note that we weight by task duration, not by number of tasks. Table 3 records the same kind of frequency, but its data is collected at runtime from the current project instead of from the historical data of past projects.

Table 1. An Employee-Risk Factor Table

Risk Factors     Employee 1    Employee 2
Risk Factor 1    1/2           1/13
Risk Factor 2    1/9           1/3

Table 2. A Task-Risk Factor Table

Tasks     Risk Factor 1    Risk Factor 2
Task 1    1                0
Task 2    1                1

Table 3. A Rescheduling Employee-Risk Factor Table

Risk Factors     Employee 1    Employee 2
Risk Factor 1    1/2           0
Risk Factor 2    0             1/6
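As an illustration of how the entries of Table 1 are obtained from the duration-weighted computation described above, here is a minimal sketch; the data structures and function name are assumptions introduced for the example.

```python
def risk_factor_frequency(assignments, failed_tasks):
    """
    assignments: list of (task_id, duration, proportion_assigned_to_employee)
                 for every task related to one risk factor.
    failed_tasks: set of task_ids that failed with this risk factor identified as a cause.
    Returns the duration-weighted occurrence frequency for the (employee, risk factor) pair.
    """
    total = sum(duration * prop for _, duration, prop in assignments)
    failed = sum(duration * prop for tid, duration, prop in assignments if tid in failed_tasks)
    return failed / total if total > 0 else 0.0

# Worked example from the text: 10% of t1, 20% of t2, 30% of t3 assigned to employee 1,
# and only t1 failed. With d1 = d2 = d3 = 10 the frequency is
# 0.1*10 / (0.1*10 + 0.2*10 + 0.3*10) = 1/6.
freq = risk_factor_frequency(
    [("t1", 10, 0.1), ("t2", 10, 0.2), ("t3", 10, 0.3)], {"t1"})
```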

The risk factors of employees in the employee-risk factor table are initialized based on the historical data, and are updated while the project is running. Basically, we consider risk factors at three scales:

(1) risk factors pertaining to the whole project (since we believe that the complexity and the type of a project may cause problems which employees cannot easily solve),

(2) risk factors within a single task (since different types of tasks may have different problems),

(3) risk factors within a scenario of several certain types of tasks (since risks might be transferred [4] [12] and accumulated).

These three types of risk factors can be computed using the following algorithm:

(1) Apply data mining and IRT (computing the difficulty parameter b and the discrimination parameter a if there are more than two risk factors; details can be found in Section 4.2) to the historical data to determine the set of the first type of risk factors for the project, and add them to the tasks in the project.

(2) Apply data mining and IRT to the historical data to determine the set of the second type of risk factors for the tasks, and add the specific risk factors to the specific tasks.

(3) Apply data mining and IRT to the historical data to determine the set of the third type of risk factors, that is, the largest frequent itemset of tasks with some risk factors (which appear after all the tasks in the itemset), and add the specific risk factors to the specific tasks.

4.2 Item Response Theory (IRT)

IRT was originally designed for testing purposes [18]; it provides a scale of measurement for the abilities of a person. Since our research mainly focuses on personnel risks, we have to know the abilities of each employee to understand those risks well. Moreover, for our risk analysis model, we estimate the detailed abilities of employees (their levels with respect to each type of risk factor) instead of their overall abilities. IRT relies upon the individual items of a test rather than upon an aggregation of the items such as a test score [7]. Hence, IRT matches the intentions of our research on risks. The model exploited in this paper is the two-parameter model

P(\theta) = \frac{1}{1 + e^{-a(\theta - b)}},

where a is the discrimination parameter, b is the difficulty parameter, and θ is the ability level. The discrimination parameter a indicates how well the risk factor discriminates: the larger a is, the better the discrimination of the risk factor. The difficulty parameter b gives the difficulty level of the risk factor; since we consider failure rates with respect to the risk factor, a larger b indicates a lower difficulty level.

To estimate the parameters of risk factors, we compute the "failure scores" (loss) of employees based on historical data. Our computation focuses on failed projects. According to Assumption 2, we can usually identify the risk factors of a failed project. Then, for each risk factor, if it is related to a failed project, we estimate the working proportion of the project duration for each employee and compute the employee's failure score in that project. The failure score of an employee in the project is

<Loss of the Project> = P \cdot <Cost of the Project>,

where P is the proportion of the project duration for which the employee is assigned. For each employee, we calculate the average of his/her failure scores over all the failed projects as his/her final failure score. That is, the failure score of an employee, say Employee k, is

<Loss> = \frac{1}{|F_k|} \sum_{i \in F_k} P_i \cdot <Cost of the Project i>,

where F_k is the index set of all the failed projects in which Employee k is involved in the historical data, |F_k| is the number of elements in F_k, and P_i is the proportion of the duration of project i for which the employee is assigned. Note that it is not efficient to compute the failure scores of employees every time we want to estimate the parameters of the risk factors. A more efficient way is to record the failure scores of employees with respect to the risk factors in INDEP in our historical data (as a part of the employee model) and to update them only when there are modifications to the sets INDEP and OTHERS.

After the failure scores of the employees with respect to a risk factor are known, these failure scores will be distributed over a range. We divide the range into n equal parts. Since the range of the levels in IRT is [-3, 3], we also divide [-3, 3] into n equal parts (the midpoint of each part is the level of that part), and each part corresponds, in order, to one sub-range of failure scores. We compute the average of the failure rates of the employees with respect to similar projects in the historical data in each part (if there is any employee in that part) and treat it as the observed value. With these observed values, the fitting process for the risk factor characteristic curve can be started. The procedure used to fit the curve is based on maximum likelihood estimation. Note that the number n depends on the number of employees; that is, if the number of employees is larger, we can choose a larger n.
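The following sketch illustrates this fitting step under simplifying assumptions: failure rates are binned into n levels spread evenly over [-3, 3], and the two-parameter logistic curve is then fitted by maximizing a binomial log-likelihood over a coarse parameter grid. The grid search and the binomial weighting are choices made only for this example; the text specifies maximum likelihood estimation without fixing the numerical procedure.

```python
import math

def two_parameter_logistic(theta, a, b):
    """P(theta) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fit_risk_factor_curve(observed):
    """
    observed: list of (level, mean_failure_rate, n_employees) for the non-empty bins,
              where the levels are the midpoints of the n equal parts of [-3, 3].
    Returns (a, b) maximizing a binomial log-likelihood over a coarse grid.
    """
    best, best_ll = (1.0, 0.0), float("-inf")
    for a in [0.1 * i for i in range(1, 41)]:          # a in (0, 4], assumed search range
        for b in [-3.0 + 0.1 * j for j in range(61)]:  # b in [-3, 3]
            ll = 0.0
            for theta, rate, m in observed:
                p = min(max(two_parameter_logistic(theta, a, b), 1e-9), 1 - 1e-9)
                ll += m * (rate * math.log(p) + (1 - rate) * math.log(1 - p))
            if ll > best_ll:
                best_ll, best = ll, (a, b)
    return best

# Example: levels are midpoints of n = 6 parts of [-3, 3], i.e. -2.5, -1.5, ..., 2.5.
obs = [(-2.5, 0.05, 4), (-1.5, 0.10, 5), (-0.5, 0.30, 6),
       (0.5, 0.55, 6), (1.5, 0.75, 5), (2.5, 0.90, 3)]
a_hat, b_hat = fit_risk_factor_curve(obs)
```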

4.3 Objective Functions and Constraints

Our previous work focuses on optimizing the cost of a project based on employees' capabilities using a GA [14][17]; only the capability dimension of input is considered there. In this paper, a new dimension of input, risk, is introduced, and we continue to employ the GA as our optimization method.

The newly added input dimension requires new objective functions in the Genetic Algorithm. Practically, it imposes a multi-objective optimization problem which needs to be solved. One of the classical methods commonly used to solve multi-objective optimization problems is the weighted sum method [9]. With this method, we can assign weights to both objective functions in proportion to our preferences and obtain a particular trade-off solution. Before we convert the multi-objective optimization problem into a single-objective one, we need to normalize the objectives, because the cost and the risk factors of employees are represented on different scales of magnitude. Since the calculation of each risk factor parameter of a task assignment is proportionally based on the failure rates of the employees involved in the task, the value of each risk factor parameter is still a probability. Hence, if we make the summation of the weights assigned to the risk factors equal to 1, the objective function related to risk factors lies within the range [0, 1]. Because the range of the risk objective is thus easy to predict, we also normalize the cost to the range [0, 1]. Within the collected similar historical data, the maximum cost can be determined; let it be the denominator. That is,

COST_normal = \frac{Cost}{Maximum\ Cost}.

In our view, the risk factors differ from task to task, so partially controlling the constraints in each task is significant. In order to deal with the constraints easily by using the same objective function for all tasks, and to make the summation of the weights assigned to the risk factors equal to 1, the objective function related to risk factors can be written as

RISK_normal = \frac{1}{\sum_{j \in I} r_j} \sum_{i \in I} r_i \cdot <Risk Factor i>,   (1)

where I is the index set of all the risk factors related to the project and r_i ≥ 0 depends on how important the risk factor is (a rule based on the parameters a and b can be given to generate these weights). Moreover, r_i = 0 if risk factor i is not related to the currently scheduled task. For example, if there is a task of type "coding in JAVA" with only one risk factor, "lack of the technology in JAVA", then only the coefficient of that risk factor is nonzero and all the others are zero. Furthermore, some risk factors are related to special events; these risk factors may be treated in the same way. The term <Risk Factor i> in (1) is

<Risk Factor i> = \sum_{k \in E_j} P_k \cdot <Failed Rate of Em k>,

where E_j is the index set of the employees assigned to the currently scheduled task T_j and P_k is the proportion of the task which is assigned to Employee k. Note that \sum_{k \in E_j} P_k = 1. The term <Failed Rate of Em k> is the failure rate of Employee k with respect to Risk Factor i. Since the GA maximizes the fitness function, and we want to minimize both the cost and the risk of a project, the fitness function is defined as

Fitness = \alpha \cdot \frac{1}{COST_normal} + (1 - \alpha) \cdot \frac{1}{RISK_normal},

where 0 < α < 1 is the weight for the cost objective.
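A minimal sketch of these normalized objectives and the weighted-sum fitness follows. The dictionary-based data structures, the example numbers, and the division-by-zero guard are assumptions added for illustration; the formulas themselves follow Section 4.3 above.

```python
def risk_factor_value(proportions, failure_rates):
    """<Risk Factor i> = sum over employees on the task of P_k * <Failed Rate of Em k>."""
    return sum(p * failure_rates[k] for k, p in proportions.items())

def risk_normal(weights, factor_values):
    """RISK_normal: importance-weighted average of the risk factor values (weights normalized to 1)."""
    total_w = sum(weights.values())
    return sum(w * factor_values[i] for i, w in weights.items()) / total_w

def cost_normal(cost, maximum_cost):
    return cost / maximum_cost

def fitness(cost, maximum_cost, weights, factor_values, alpha=0.5):
    """Weighted-sum fitness maximized by the GA; alpha weights the cost objective."""
    c = max(cost_normal(cost, maximum_cost), 1e-9)   # guard against division by zero (added here)
    r = max(risk_normal(weights, factor_values), 1e-9)
    return alpha * (1.0 / c) + (1.0 - alpha) * (1.0 / r)

# Example: one task assigned 60/40 to two employees, two risk factors for the project.
rf1 = risk_factor_value({"em1": 0.6, "em2": 0.4}, {"em1": 0.5, "em2": 0.0})
rf2 = risk_factor_value({"em1": 0.6, "em2": 0.4}, {"em1": 1/9, "em2": 1/6})
f = fitness(cost=800, maximum_cost=1000,
            weights={"rf1": 2.0, "rf2": 1.0}, factor_values={"rf1": rf1, "rf2": rf2})
```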

4.4 Rescheduling

The second stage of our approach, the adaptation stage, is designed to detect unexpected performance of employees and to conduct rescheduling.

4.4.1 Identification of unexpected performance

To trigger rescheduling, we first generate thresholds from Table 1, which indicate the expected performance of employees (e.g., if the failure rate of an employee with respect to a risk factor is p, 0 < p < 1, we set this threshold to min{p(1 + ε), 1} for some ε > 0). The second step of identifying unexpected performance is to update Table 3 using runtime data, which reflects the employee's performance in the current project. According to Assumption 5, we can add checkpoints to the project; depending on the size of the project, we can decide how many checkpoints are required. At each checkpoint, we check all the finished tasks, identify risk factors, and update Table 3 accordingly.
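The threshold generation can be sketched as follows; the ε value and the nested-dictionary representation of Table 1 are assumptions made for the example.

```python
EPS = 0.2   # assumed tolerance above the historical failure rate

def expected_performance_thresholds(table1, eps=EPS):
    """For each (employee, risk factor): threshold = min(p * (1 + eps), 1)."""
    return {emp: {rf: min(p * (1 + eps), 1.0) for rf, p in rates.items()}
            for emp, rates in table1.items()}

# Table 1 from the text: historical failure rates per employee and risk factor.
table1 = {"employee1": {"rf1": 1/2, "rf2": 1/9},
          "employee2": {"rf1": 1/13, "rf2": 1/3}}
thresholds = expected_performance_thresholds(table1)
```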

4.4.2 Trigger of rescheduling

The major work of the rescheduling trigger is to calculate the ratio of unexpected performance of employees; if this ratio is large, say greater than 40%, we reschedule the rest of the project. As described in Section 4.1, Table 2 stores information about whether a task is related to a risk factor, and another table in the original capability-based scheduling model records whether a task is assigned to an employee. Therefore, it is possible to find out whether an employee is related to a certain risk factor between any two checkpoints. If we construct a new table which records this information, using "1" for "related" and "0" for "not related", we can count the total number of employee-risk factor pairs appearing in the segment between two checkpoints. We name this number OVERALL; it is used as the denominator when calculating the ratio of unexpected performance.

The next step is to find out how many unexpected cases occurred during the current segment (between the previous checkpoint and the current checkpoint). This is done by comparing the values in Table 3 to the thresholds defined in Section 4.4.1. However, not all risk factors in Table 3 are compared to their thresholds: practically, if a risk factor has a < 0.5, which means it has a low degree of discrimination, rescheduling results will not differ much from the current schedule, so it is skipped. Finally, we calculate the unexpected performance ratio based on the analysis described above. If this ratio exceeds a pre-defined value (for example, 40%), the project is rescheduled automatically.
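A sketch of the trigger computation follows. The per-segment pair list, the 40% cut-off, and the discrimination filter match the description above; the function names and input structures are assumptions for the example.

```python
def unexpected_performance_ratio(segment_pairs, table3, thresholds, discrimination, a_min=0.5):
    """
    segment_pairs: iterable of (employee, risk_factor) pairs active between two checkpoints;
                   their count is OVERALL, the denominator of the ratio.
    table3: runtime failure rates, table3[employee][risk_factor].
    thresholds: expected-performance thresholds derived from Table 1 (Section 4.4.1).
    discrimination: discrimination parameter a of each risk factor; factors with a < a_min
                    are not compared because they discriminate poorly.
    """
    overall = 0
    unexpected = 0
    for emp, rf in segment_pairs:
        overall += 1
        if discrimination.get(rf, 0.0) < a_min:
            continue
        if table3.get(emp, {}).get(rf, 0.0) > thresholds[emp][rf]:
            unexpected += 1
    return unexpected / overall if overall else 0.0

def should_reschedule(ratio, cutoff=0.4):
    return ratio > cutoff
```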

4.4.3 Rescheduling process

Before rescheduling, the data from the current project (stored in Table 3) is pushed into the historical data, and Table 1 is updated correspondingly: its old values and the new values from Table 3 are given weights and summed together. We then run the Genetic Algorithm again to find an optimal schedule for the rest of the project.
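The weighted update of Table 1 before rescheduling can be sketched as below; the weight value and the dictionary representation are assumptions, since the choice of weights is not specified above.

```python
def update_table1(table1, table3, w_history=0.7):
    """Blend historical rates (Table 1) with runtime rates (Table 3) per employee and risk factor."""
    updated = {}
    for emp, rates in table1.items():
        updated[emp] = {}
        for rf, old in rates.items():
            new = table3.get(emp, {}).get(rf)
            updated[emp][rf] = old if new is None else w_history * old + (1 - w_history) * new
    return updated
```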

5 Conclusion and Future Work

This paper presents a comprehensive approach which extracts risk information from historical data, integrates risk analysis into project scheduling, and performs rescheduling based on runtime monitoring. With the objective information from historical data, we believe this approach is superior to other existing work. Our risk-capability-based model generates and systematically analyzes historical data, which provides the basis for future analysis. In this research, IRT and data mining are used to process risk factors, and the results are fed to a Genetic Algorithm which computes optimal task assignments. In addition, a rescheduling mechanism is designed to detect and mitigate potential risks.

References

[1] B. E. Gayet, L. C. Briand, "METRIX: a tool for software-risk analysis and management", Annual Reliability and Maintainability Symposium, pp. 310-314, 1994.
[2] R. P. Higuera, Y. Y. Haimes, "Software Risk Management", Technical Report CMU/SEI-96-TR-012, ESC-TR-96-012, 1996.
[3] D. A. Coley, An Introduction to Genetic Algorithms for Scientists and Engineers, World Scientific Publishing Co. Pte. Ltd., 1999.
[4] Y. F. Brown et al., Software Risk Management: A Practical Guide, Department of Energy Quality Managers, Software Quality Assurance Subcommittee, 2000.
[5] S. Foo, A. Muruganantham, "Software Risk Assessment Model", IEEE International Conference on Management of Innovation and Technology (ICMIT), vol. 2, pp. 536-544, 2000.
[6] A. A. Keshlaf, K. Hashim, "A Model and Prototype Tool to Manage Software Risks", First Asia-Pacific Conference on Quality Software, pp. 297-305, 2000.
[7] F. B. Baker, The Basics of Item Response Theory, ERIC Clearinghouse on Assessment and Evaluation, 2001.
[8] C. K. Chang, M. J. Christensen, T. Zhang, "Genetic Algorithms for Project Management", Annals of Software Engineering 11, pp. 107-139, 2001.
[9] K. Deb, Multi-Objective Optimization using Evolutionary Algorithms, John Wiley & Sons, Ltd, 2001.
[10] J. Cheng, R. Greiner, J. Kelly, D. Bell, W. Liu, "Learning Bayesian networks from data: An information-theory based approach", Artificial Intelligence, vol. 137, pp. 43-90, 2002.
[11] D. J. C. MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, 2003.
[12] R. Xu, L. Qian, X. Jing, "CMM-based software risk control optimization", IEEE International Conference on Information Reuse and Integration (IRI), pp. 499-503, 2003.
[13] M. S. Feather, "Towards a Unified Approach to the Representation of, and Reasoning with, Probabilistic Risk Information about Software and its System Interface", 15th International Symposium on Software Reliability Engineering (ISSRE), pp. 391-402, 2004.
[14] Y. Ge, Capability-based Software Project Scheduling with System Dynamics and Heuristic Search, master's thesis, Iowa State University, 2004.
[15] G. McGraw, "Risk Analysis in Software Design", IEEE Security & Privacy, pp. 32-37, 2004.
[16] L. Fussell, S. Field, "The Role of the Risk Management Database in the Risk Management Process", 18th International Conference on Systems Engineering (ICSEng), pp. 364-369, 2005.
[17] Y. Ge, C. K. Chang, "Capability-based Project Scheduling with Genetic Algorithms", International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA), p. 161, 2006.
[18] Item Response Theory: Information from Answers.com, Answers Corporation, 2007; http://www.answers.com/topic/item-response-theory.
[19] @RISK 4.5 for Excel, Palisade Corporation, 2007; http://www.palisade.com/.
