2016 9th International Workshop on Cooperative and Human Aspects of Software Engineering

Post-completion Error in Software Development

Fuqun Huang
Institute of Interdisciplinary Scientists
Seattle, WA 98115
[email protected]

ABSTRACT
Post-completion errors have been observed in a variety of tasks by psychologists, but there is a lack of empirical studies in software engineering. This paper investigates whether post-completion errors occur in software development and how likely software developers are to commit this error when a post-completion scenario is presented. An experimental study was conducted in the context of a programming contest. In the experiment, a programming task specification that contained a post-completion sub-task requirement was presented to the subjects. The results showed that 41.82% of the subjects committed the post-completion error in the same way: forgetting to design and implement a software requirement that is supposed to be the last sub-task and is not necessary for the completion of the main sub-task. This percentage was significantly higher than the percentage of subjects committing any other error. The study confirms that the post-completion error occurs in software development and, moreover, that different software developers tend to commit this error in the same way, with high likelihood, at the location where a post-completion scenario is presented. Strategies are proposed to prevent post-completion errors in software development.

CCS Concepts
• Software and its engineering → Software functional properties → Correctness
• Software and its engineering → Software creation and management → Designing software
• Software and its engineering → Software development process management → Risk management
• Social and professional topics → Project and people management

Keywords
Post-completion errors; software development; programming errors; common-cause failure; software psychology; human errors in software engineering.

1. INTRODUCTION
Post-completion error is a specific type of human error in which one tends to omit a sub-task that is carried out at the end of a task but is not a necessary condition for the achievement of the main sub-task [6]. Post-completion errors are a prevalent erroneous pattern that has been found to recur across various tasks. Well-known post-completion errors in routine procedural tasks include forgetting to retrieve one's bank card from an automated teller machine (ATM) after a transaction and forgetting to retrieve the original document from a photocopier [6]. Post-completion errors also occur in problem-solving situations [19].

There have been a number of studies constructing the cognitive model of post-completion errors [5, 23, 27, 33] and identifying factors that provoke or mitigate their occurrence [2, 4, 7, 20, 24, 34]. However, most of these studies were conducted in procedural tasks rather than problem-solving tasks [19]. The few studies on post-completion errors in computer science focus on the users of computer systems, aiming to provide better design rules for human-computer interfaces [8, 9, 24, 28, 31]. To the best of the author's knowledge, there is little research on software developers' post-completion errors in a programming context. Human errors are the primary cause of software defects, since computer programs are a purely cognitive product that describes its designers' thoughts [11, 15, 17, 18, 29]. Understanding the human error mechanisms of software developers will advance various approaches that are currently used to defend against software defects, such as defect prevention, defect prediction, defect detection and fault tolerance [13, 15, 16]. Post-completion errors have been shown to occur in various tasks, and this type of error is so obstinate that motivation enhancement and training have been found to be ineffective in preventing it [4]. Therefore, post-completion errors in the context of software development are a significant topic worthy of scientific research.

This paper aims to explore a series of questions about the post-completion error in software development:

RQ1: Does the post-completion error occur in software development?

RQ2: How likely is the post-completion error to occur under post-completion scenarios in software development?

RQ3: How does the post-completion error manifest itself in the context of software development?

RQ4: What strategies can be used to defend against post-completion errors in software engineering?

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CHASE'16, May 16 2016, Austin, TX, USA
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-4155-4/16/05…$15.00
DOI: http://dx.doi.org/10.1145/2897586.2897608

The paper is organized as follows: Section 2 presents the methodology, including the task, the experimental procedures and the subjects; Section 3 presents the data and results; Section 4 discusses the implications of the study; Section 5 concludes.


2. METHODOLOGY
This study was conducted by embedding a task that contains a post-completion scenario in a programming contest, and investigating how the contestants committed the post-completion error. The following sections describe the task, the procedures and the subjects of the study.

2.1 Task
The task used in this study was called the "jiong" problem. The requirement specification of the task, shown in Figure 1, was presented to the programming contestants.

Print a Chinese word "jiong" in a nested structure.

Inputs
There is an integer in the first line that indicates the number of input groups. Each input group contains an integer n (1≤n≤7).

Outputs
Print a word "jiong" after each input group, and then print a blank line.

Sample Inputs
3
1
2
3

Sample Outputs
+――――――+
∣∣∣∣∣∣∣∣
∣∣╱∣∣╲∣∣
∣∣∣∣∣∣∣∣
∣∣+――+∣∣
∣∣∣∣∣∣∣∣
∣∣∣∣∣∣∣∣
+―+――+―+
[The sample outputs for n = 2 and n = 3 are the corresponding doubly and triply nested "jiong" characters; the ASCII art is omitted here.]

Figure 1. The task specification of the "jiong" problem.

The "jiong" task in Figure 1 has two sub-tasks. The first sub-task is the main task, which requires the programmers to design an algorithm to compute the multiple-level nested Chinese character "jiong". To succeed in this main sub-task, a software developer first needs to extract the numerical relations between the width, height and nesting levels of the character from the sample outputs. Then, the developer needs to design a recursive or iterative algorithm to compute any "jiong" with one to seven nesting levels.

The second sub-task is to print a blank line between the output "jiong"s. This sub-task comes at the end of the super task, but it is not an essential step for the completion of the main sub-task. This is the post-completion sub-task that some subjects were expected to omit. More specifically, if the general psychological pattern of post-completion errors manifests itself in software development, it should manifest as the program fault "the blank line after a 'jiong' is missing".

2.2 Procedures
2.2.1 Context setting
The working memory load during a task is the direct factor determining how much attention one can allocate to the post-completion sub-task, and thus it influences how likely one is to commit a post-completion error [6, 23]. Therefore, an appropriate experimental context that simulates subjects' cognitive load is extremely important. It determines the external validity [32] of the study, i.e., the extent to which the likelihood of post-completion errors observed in the experiment can be generalized to industrial programming situations.

To better simulate the cognitive load and attention allocation that programmers experience in their work, the researcher set the task of interest in a programming contest. This setting has advantages that alternative contexts (e.g., a single task in a controlled laboratory environment) do not have. The subjects in the programming contest were under time pressure to correctly solve as many problems as possible within the limited contest time, so they experienced high cognitive load during the task of interest. At the same time, the task was presented in a natural way, hidden among several other tasks, so a subject would not devote all of his or her attention and time to every detail of it. This is a good simulation of programmers' real working situations: programmers have deadlines to submit their programs; they are under time pressure to solve a variety of problems that constitute a milestone task, but they have the freedom to allocate their effort across the smaller problems.
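For concreteness, the overall shape of a solution to the task in Figure 1 can be sketched as follows. This is an illustrative Python sketch (the contestants actually wrote C), and the rendering of the nested character is abstracted behind a placeholder render_jiong function, since the point here is the output structure: the trailing blank line after each group is the post-completion step.

```python
def render_jiong(n):
    """Placeholder for the main sub-task: computing the n-level nested
    "jiong" as a list of text lines. The real contest solutions derived
    the character's width and height from n and recursed over the
    nesting levels; that algorithm is elided here."""
    return ["<%d-level jiong>" % n]

def solve(raw_input):
    tokens = raw_input.split()
    groups = int(tokens[0])      # first line: number of input groups
    out = []
    for i in range(1, groups + 1):
        out.extend(render_jiong(int(tokens[i])))
        # Post-completion sub-task: the blank line after each "jiong".
        # The main output is already complete at this point, which is
        # exactly why this line is easy to omit.
        out.append("")
    return "\n".join(out)

# Renders three groups, each followed by a blank line:
print(solve("3\n1\n2\n3"))
```

A submission that omits the single out.append("") line still renders every "jiong" correctly and fails only on formatting, which is exactly the fault pattern the experiment was designed to observe.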


Based on the above considerations, the experiment was conducted in the 7th Annual Programming Contest of Beihang University (BUAA). The contest originally aimed to select candidates for the Asian qualifying rounds of the ACM International Collegiate Programming Contest (ACM-ICPC). The task of this study, the "jiong" problem, was presented as one of the eight problems that the contestants could choose to solve.

The time limit for the contest was 4 hours. The scoring rules required the contestants to solve as many problems as they could, as fast as possible, to win the contest:

a) The contestants' performances were first ranked by their total scores. The total score equals the total number of problems solved successfully;

b) If two contestants had equal total scores, they were ranked by problem-solving time: the less time, the higher the rank.

2.2.2 Online Judge System
Just as programmers have tools that help them debug their programs in real working situations, the contestants received feedback on their submissions from an Online Judge System similar to the one used in the ACM-ICPC [21]. A contestant could first compile and run a program in his or her local environment, then submit it to the Online Judge System on the server. For each problem, participants could submit an unlimited number of program versions until the system "accepted" one version or the contestant quit. The types of submission result that the Online Judge System fed back to the students were as follows:

• Accepted (AC). The output of the program matches the Online Judge's expected output.

• Wrong Answer (WA). The output of the program does not match what the Online Judge expects.

• Presentation Error (PE). Presentation errors occur when the program produces correct output for the Online Judge's secret data but does not produce it in the correct format. For the "jiong" problem, if a submission contained the post-completion error (forgetting the blank line between outputs), the Online Judge System would not accept the submission and would indicate a "Presentation Error".

• Runtime Error (RE). The program performs an illegal operation when running on the Online Judge's input. Illegal operations include invalid memory references, such as accessing outside an array boundary, as well as common mathematical errors such as division by zero or overflow.

• Time Limit Exceeded (TL). The Online Judge has a specified time limit for every problem. When the program does not terminate within that limit, this error is generated.

• Compile Error (CE). The program does not compile with the specified language's compiler.

2.2.3 Code inspection
After the programming contest, a software engineer performed a code inspection to identify the faults introduced by the contestants on the "jiong" task. Notably, the code inspection did not focus only on the post-completion error: all of the faults were recorded, as data for another study on the contribution of cognitive styles to programming error diversity [16].

Since the researcher was concerned with the error-proneness underlying programming activities, all of the errors a contestant made during the entire process of solving the "jiong" problem were of interest.

As the contestants were allowed to debug and re-submit their programs, each person could submit more than one version. The code inspector was required to review all the versions from a contestant and record all of the faults contained in those versions. However, for each contestant, each fault was counted only once even if it appeared in several versions.

2.3 Subjects
Fifty-five contestants from BUAA tried the "jiong" problem; these 55 contestants constituted the subjects of this study. Based on the educational background survey [16], all of the subjects were undergraduate students from the department of computer science or software engineering. All had just finished their course on the C language, and their answers to the questions on programming experience showed that they had no programming experience beyond the C-language course training provided by the university.

3. RESULTS
The post-completion error "the blank line after a 'jiong' is missing" was committed by 23 subjects, i.e., 41.82% of the 55 subjects. To frame the following discussion, the author proposes a simple but straightforward measure called the "Error Committing Ratio" (ECR) to capture the likelihood that an error is committed by a group of people:

ECR_i = (n_i / N) × 100%

where i is the error sequence number, n_i is the number of subjects committing error i, and N is the total number of subjects.

Besides the post-completion error, another 22 errors were found in the code inspection, and the numbers of subjects committing them were summarized. Interested readers can find their detailed descriptions in [16].

In total, 23 errors were introduced by the 55 subjects. Thirteen were unique errors committed by only one subject; the other 10 (including the post-completion error) were common errors committed by two or more subjects. Detailed descriptions of the other common errors are summarized in [16].

The descriptive statistics for the ECR of the other 22 errors are provided in Table 1, which shows that the ECR of the post-completion error (41.82%) is far higher than that of any other error.
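As a quick check of the reported figures, the ECR definition can be applied directly to the counts in this study (23 of 55 subjects for the post-completion error, and a single subject for each unique error):

```python
def ecr(n_i, N):
    """Error Committing Ratio: percentage of N subjects committing error i."""
    return 100.0 * n_i / N

# Post-completion error: 23 of the 55 subjects.
print(round(ecr(23, 55), 2))  # 41.82

# A unique error, committed by a single subject, gives the smallest
# possible ECR in this study:
print(round(ecr(1, 55), 2))   # 1.82
```

The second value matches both the minimum and the median reported in Table 1, consistent with most of the other errors being committed by only one subject.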

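The distinction the Online Judge System draws between a Wrong Answer and a Presentation Error (Section 2.2.2) is what made the post-completion fault automatically detectable. The verdict logic can be sketched as follows; this is an illustrative simplification, not the real judge's implementation:

```python
def judge(actual, expected):
    """Simplified verdict logic: an exact match is Accepted; a match
    after collapsing all whitespace means the content is right but the
    layout is wrong, i.e. a Presentation Error; anything else is a
    Wrong Answer."""
    if actual == expected:
        return "AC"
    if "".join(actual.split()) == "".join(expected.split()):
        return "PE"
    return "WA"

# Forgetting the blank line between "jiong"s changes only the layout:
expected = "jiong\n\njiong\n\n"
print(judge("jiong\n\njiong\n\n", expected))  # AC
print(judge("jiong\njiong\n", expected))      # PE: blank lines missing
print(judge("gnoij\n", expected))             # WA
```

Under this scheme, every submission containing the post-completion error (and no other fault) is flagged as PE, which is how the error could be identified uniformly across all 55 subjects.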

Table 1. Descriptive statistics for the ECR of the other 22 errors.

  Descriptive statistic    ECR_i
  Minimum                  1.82%
  Maximum                  12.82%
  Mean                     3.88%
  Median                   1.82%
  Standard deviation       4.67%

  (Total number of other errors I = 22; total number of subjects N = 55.)

A Wilcoxon signed-rank test [30] was performed to examine whether the number of subjects committing the post-completion error is the same as the numbers committing the other errors. The Wilcoxon signed-rank test was chosen because it is a non-parametric test that is robust and requires no assumptions about the distribution of the data; a Kolmogorov-Smirnov test showed that the numbers of subjects committing the errors were not normally distributed.

The result shows that the ECR of the other errors is significantly lower than that of the post-completion error (Median = 1.82%, z = −4.22, p < 0.001). This means the post-completion error is far more obstinate and risky than any other error: once a post-completion scenario in software development is presented, people have a high probability of committing the post-completion error.

4. DISCUSSION
The discussion of the findings is organized according to the research questions proposed at the beginning of the paper.

RQ1: Does the post-completion error occur in software development? This study has confirmed that the post-completion error occurs in the context of software development.

RQ2: How likely is the post-completion error to occur under post-completion scenarios in software development? This study shows that post-completion errors tend to be committed with a high likelihood. The post-completion error in this study was committed by more people than any other type of error [25].

RQ3: How does the post-completion error manifest itself in the context of software development? The post-completion error manifests itself in software development as the omission of the post-completion sub-task by software developers, similar to the way it manifests in other procedural tasks. However, it is notable that post-completion errors in a software development context do not necessarily occur at the last line of code. In this experiment, the code that prints the blank line is not located at the last line of the program. This is different from post-completion errors in procedural tasks, where the error generally occurs at the last single step of "operation" or "action". In software development, the post-completion task can be a logical task represented as the last stage of the whole problem-solving process during the developer's problem representation stage [15]. This post-completion task may contain several steps of operations (e.g., typing a series of strings) and/or several steps of applying knowledge rules (e.g., programming language rules), and may contain lines of code that are not necessarily located at the end of the program.

RQ4: What strategies can be used to defend against post-completion errors in software engineering? This study has significant implications for the prediction and prevention of post-completion errors. The author proposes three ways to prevent post-completion errors in software development. The first and simplest is to eliminate the post-completion task if it is not necessary. In the example of this study, "printing a blank line between the outputs" was not essential for the programming coach to select the most skilled and talented contestants; thus, this post-completion task was eliminated in the next year's programming contest [26]. The second approach is to change the procedure of the task when possible. For example, many ATMs now require a user to withdraw his or her bank card first before continuing to the next step (e.g., withdrawing cash). The third strategy is to highlight (e.g., with bright colors and/or bold font) the places of post-completion tasks in the requirement documents, since visual cues are an effective way to reduce post-completion errors [3, 7].

This study was conducted with a small programming task, and its subjects were students majoring in computer science. This context differs from industrial projects, in which tasks are larger and the expertise level of software developers tends to be higher. This difference is a threat to external validity. However, the findings should still have significant implications for future studies on the post-completion erroneous behaviors of professional programmers, since expertise is related to task or domain, i.e., an expert in one task can be a novice in another [1, 10, 35]. Though the task is simpler and smaller than real industrial tasks, it contains function points that cover all three performance levels: skill-based, rule-based and knowledge-based performance [16, 18, 22, 25]. Though the programmers in this experiment had less expertise, they confronted all three levels of performance, as experienced programmers do in real industrial practice. Overall, the subjects in this study are considered competent programmers for the task presented to them.

This study has also opened a promising area worthy of software researchers' pursuit: using human error theories to predict software defects. The general pattern of the post-completion error predicts the exact location and form of the "blank line missing" error in this study. This cannot be achieved by any of the traditional software defect prediction models [12, 14]. Based on the program complexity metrics widely used in traditional defect prediction models [12], the task of "printing a blank line between the outputs" should be the location least likely to contain a defect, as it is very simple. In reality, this location is riskier than any of the more difficult locations. Such a phenomenon can only be well explained by human error theories.
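At the code level, the "change the procedure" strategy discussed under RQ4 can also be applied by restructuring a program so that the post-completion step can no longer be skipped. The following Python sketch (an illustration by analogy, not part of the original study) folds the easily forgotten trailing blank line into the output routine itself, so the separator is emitted automatically for every group:

```python
def emit_groups(groups, render):
    """Render each input group and emit the group separator automatically.
    `render` maps one input group to its output lines; the blank-line
    separator is produced here, once, instead of being an extra step the
    programmer must remember at each call site."""
    out = []
    for g in groups:
        out.extend(render(g))
        out.append("")  # separator guaranteed structurally, never forgotten
    return "\n".join(out)

# Hypothetical renderer standing in for the "jiong" algorithm:
print(emit_groups([1, 2], lambda n: ["<%d-level jiong>" % n]))
```

With this structure the post-completion requirement is satisfied by construction, which is the software analogue of the ATM returning the bank card before dispensing cash.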

5. CONCLUSION
This paper has presented an experimental study of post-completion errors in software development. The study shows that software developers tend to commit the post-completion error with a high likelihood once a post-completion scenario is presented in the software requirements. Strategies such as eliminating unnecessary post-completion requirements and highlighting post-completion requirements in requirement documents can be used to prevent this error. The study has also initiated a promising area, using human error theories to predict software defects, which has been shown to be powerful in accounting for the exact location and form of the post-completion software fault.

6. ACKNOWLEDGMENTS
The author acknowledges all the participants of the 7th Annual Programming Contest of Beihang University. The author also thanks Professor Bin Liu, Dr. You Song, Mr. George Yue, Mr. Zongquan Ma and Mr. Zixing Li for their support in this study.

7. REFERENCES
[1] B. Adelson and E. Soloway, 1986. A Model of Software Design. International Journal of Intelligent Systems 1, 195-213.
[2] M. G. Ament, A. L. Cox, A. Blandford, and D. Brumby, 2010. Working memory load affects device-specific but not task-specific error rates. In Proceedings of CogSci.
[3] M. G. Ament, A. Y. Lai, and A. L. Cox, 2011. The effect of repeated cue exposure on post-completion errors. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 850-855.
[4] J. Back, W. L. Cheng, R. Dann, P. Curzon, and A. Blandford, 2007. Does being motivated to avoid procedural errors influence their systematicity? In People and Computers XX—Engage. Springer, 151-157.
[5] M. D. Byrne, 2003. A mechanism-based framework for predicting routine procedural errors. In Proceedings of the Twenty-Fifth Annual Conference of the Cognitive Science Society. Cognitive Science Society, Austin, TX.
[6] M. D. Byrne and S. Bovair, 1997. A working memory model of a common procedural error. Cognitive Science 21, 1, 31-61.
[7] P. H. Chung and M. D. Byrne, 2008. Cue effectiveness in mitigating postcompletion errors in a routine procedural task. International Journal of Human-Computer Studies 66, 4, 217-232.
[8] P. Curzon and A. Blandford, 2000. Using a verification system to reason about post-completion errors. In Design, Specification and Verification of Interactive Systems.
[9] P. Curzon and A. Blandford, 2004. Formally justifying user-centred design rules: a case study on post-completion errors. In Integrated Formal Methods. Springer, 461-480.
[10] F. Détienne, 1995. Design Strategies and Knowledge in Object-Oriented Programming: Effects of Experience. Human–Computer Interaction 10, 2-3, 129-169.
[11] F. Détienne, 2002. Software Design—Cognitive Aspects. Springer-Verlag, New York, NY, USA.
[12] N. E. Fenton and M. Neil, 1999. A Critique of Software Defect Prediction Models. IEEE Transactions on Software Engineering 25, 5, 675-689.
[13] F. Huang and B. Liu, 2011. Systematically Improving Software Reliability: Considering Human Errors of Software Practitioners. In 23rd Psychology of Programming Interest Group Annual Conference (PPIG 2011), York, UK. DOI= http://dx.doi.org/10.13140/2.1.4881.9520.
[14] F. Huang and B. Liu, 2013. Study on the Correlations between Program Metrics and Defect Rate by a Controlled Experiment. Journal of Software Engineering 7, 3, 114-120. DOI= http://dx.doi.org/10.3923/jse.2013.
[15] F. Huang, B. Liu, and B. Huang, 2012. A Taxonomy System to Identify Human Error Causes for Software Defects. In The 18th International Conference on Reliability and Quality in Design. International Society of Science and Applied Technologies, Boston.
[16] F. Huang, B. Liu, Y. Song, and S. Keyal, 2014. The links between human error diversity and software diversity: Implications for fault diversity seeking. Science of Computer Programming 89, Part C, 350-373. DOI= http://dx.doi.org/10.1016/j.scico.2014.03.004.
[17] F. Huang, B. Liu, S. Wang, and Q. Li, 2015. The impact of software process consistency on residual defects. Journal of Software: Evolution and Process. DOI= http://dx.doi.org/10.1002/smr.1717.
[18] F. Huang, B. Liu, and Y. Wang, 2013. Review of Software Psychology. Computer Science 40, 3, 1-7.
[19] S. Y. Li, A. Blandford, P. Cairns, and R. M. Young, 2005. Post-completion errors in problem solving. In Proceedings of the Twenty-Seventh Annual Conference of the Cognitive Science Society.
[20] S. Y. Li, A. L. Cox, A. Blandford, P. Cairns, R. M. Young, and A. Abeles, 2006. Further investigations into post-completion error: the effects of interruption position and duration. In Proceedings of the 28th Annual Meeting of the Cognitive Science Conference, 471-476.
[21] M. J. P. V. D. Meulen and M. A. Revilla, 2008. The Effectiveness of Software Diversity in a Large Population of Programs. IEEE Transactions on Software Engineering 34, 6, 753-764.
[22] J. Rasmussen, 1983. Skills, rules, and knowledge; signals, signs, and symbols, and other distinctions in human performance models. IEEE Transactions on Systems, Man and Cybernetics 13, 3, 257-266.
[23] R. M. Ratwani and J. G. Trafton, 2010. A generalized model for predicting postcompletion errors. Topics in Cognitive Science 2, 1, 154-167.
[24] R. M. Ratwani and J. G. Trafton, 2011. A real-time eye tracking system for predicting and preventing postcompletion errors. Human–Computer Interaction 26, 3, 205-245.
[25] J. Reason, 1990. Human Error. Cambridge University Press, Cambridge, UK.
[26] Y. Song, 2012. The 8th Annual Programming Contest of Beihang University. College of Software, Beihang University.
[27] J. G. Trafton, E. M. Altmann, and R. M. Ratwani, 2011. A memory for goals model of sequence errors. Cognitive Systems Research 12, 2, 134-143.
[28] W. F. Van Der Vegte and N. C. Moes, 2012. Towards improved user-product testing with cognitively enhanced scenarios. In ASME 2012 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 681-687.
[29] G. M. Weinberg, 1971. The Psychology of Computer Programming. Van Nostrand Reinhold Company.
[30] F. Wilcoxon, 1945. Individual comparisons by ranking methods. Biometrics Bulletin, 80-83.
[31] S. Wiseman, P. Cairns, and A. Cox, 2011. A taxonomy of number entry error. In Proceedings of the 25th BCS Conference on Human-Computer Interaction. British Computer Society, 187-196.
[32] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén, 2012. Experimentation in Software Engineering. Springer, New York.
[33] S. D. Wood and D. E. Kieras, 2002. Modeling human error for experimentation, training, and error-tolerant design. In Proceedings of the Interservice/Industry Training, Simulation, and Education Conference, 1075-1085.
[34] S.-Y. Yau and S. Y. Li, 2015. Working Memory and the Detection of Different Error Types: Novel Predictions for Error Detection. In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 1031-1036.
[35] N. Ye and G. Salvendy, 1994. Quantitative and qualitative differences between experts and novices in chunking computer software knowledge. International Journal of Human-Computer Interaction 6, 1, 105-118.
