An Introduction to Program Comprehension for Computer Science Educators

Carsten Schulte
Department of Computer Science, Freie Universität Berlin, Germany
[email protected]

Teresa Busjahn
Department of Computer Science, Freie Universität Berlin, Germany
[email protected]

Tony Clear
School of Computing and Mathematical Sciences, AUT University, New Zealand
[email protected]

James H. Paterson
Glasgow Caledonian University, Glasgow, UK
[email protected]

Ahmad Taherkhani
Department of Computer Science and Engineering, Aalto University, Finland
[email protected]

ABSTRACT

The area of program comprehension comprises a vast body of literature, with numerous conflicting models having been proposed. Models are typically grounded in experimental studies, mostly involving experienced programmers. The question of how to relate this material to the teaching and learning of programming for novices has proven challenging for many researchers.

In this critical review from a computer science educational perspective, the authors compare and contrast the way in which different models conceptualize program comprehension. This provides new insights into learning issues such as content, sequence, learning obstacles, effective learning tasks and teaching methods, as well as into the assessment of learning.

Categories and Subject Descriptors

K.3.2 [Computers & Education]: Computer and Information Science Education – computer science education, information systems education.

General Terms

Experimentation, Human Factors.

Keywords

Program Comprehension, CS Ed Research, Pedagogy

1 INTRODUCTION

There has long been debate on teaching and learning programming and how learners can make progress (sequence of learning topics, learning tasks, teaching methods, etc.). Several ITiCSE working groups have been centered on this area (see, e.g., [44-46,58]). Learning programming involves reading and understanding given program texts, which ties in nicely with program comprehension (PC). There is a rich body of research literature in that field, and a range of models of PC have been proposed, usually based on empirical studies of programmers. PC models tend to have a set of common elements. They typically describe an assimilation process, cognitive structures such as the mental representation of the code, and the knowledge base needed to construct the mental representations.

These typical elements can be seen as an analysis of (parts of) the skills and knowledge needed by introductory programmers. In other words, they can be used to develop a model of learning to read and understand programs – a kind of educational model of program comprehension. From this model we can draw conclusions and hypothesize about educational research issues such as desirable properties and usage of example programs, appropriate teaching/learning methods, tools that facilitate deep learning, and the way in which the sequencing of learning steps might affect skills and knowledge development.

In this article we have analyzed a selection of PC models from an educational point of view. The models were selected to be representative of the various schools of thought in PC. We initially conducted a critical review of the literature, and working from that basis we performed the following steps: choice of models for detailed analysis; establishing a procedure for extracting data and mapping to educational questions; analysis of selected models using this procedure; drawing conclusions.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ITiCSE-WGR’10, June 26–30, 2010, Bilkent, Ankara, Turkey. Copyright 2010 ACM 978-1-4503-0677-5/10/06...$10.00.


There are some parallels between this work and the earlier work of Robins et al., who reviewed a wide range of literature relating to the psychological/educational study of programming [75]. Here we study in depth a focused subset of the literature, relating specifically to models of PC. The aim is to map these models, which themselves have no specific connection to education, to an educational model of PC, and in doing so to draw conclusions and seek new ideas for teaching and learning. The report is organized as follows:

1. Introduction to PC, including an overview of the evolution of models of PC and choice of models for analysis
2. Analysis of data: characterizations of relevant models and mapping of models to the common framework of the Block Model [80]
3. Implications for teaching and learning

2 Introduction to Program Comprehension

In this section we briefly summarize the key movements in empirical research in program comprehension, which addresses the cognitive processes applied by programmers when understanding programs. Comprehension is usually conceptualized as a process in which an individual constructs his or her own mental representation of the program. Models of program comprehension generally build theories of the cognitive processes on the basis of empirical studies of programmers performing tasks which require them to read and understand a program. This research field has been largely motivated by the need to maintain existing software and the desire to develop automated tools to support program comprehension [91]. As a result of this focus on professional practice, studies have largely involved expert programmers who are experienced professionals, although some studies, such as [25], [65] and [87], have also evaluated differences between the cognitive processes of experts and novices.

Program comprehension, from its beginnings in the 1970s, continues to be an active research field [83]. Over this time, a diverse array of models has emerged, often building on the work of previously proposed models. Despite their differences, most models contain a set of common elements, illustrated in Figure 1.

A model consists of an element which is external to the programmer, comprising the way, or ways, in which the program is represented to the programmer. There are two key elements of the model which are internal to the programmer [42].

One of these is the cognitive structure, which may include both the programmer's existing knowledge base about programming and/or about the program's application domain and the programmer's mental representation of the program. The other internal element is the assimilation process by which programmers build their mental representation using the knowledge base and the representation which they are given of the program. PC models are often characterized according to aspects of the cognitive structures or assimilation processes they describe.

Figure 1: Key elements of program comprehension models

The models which have been described have differing, sometimes contradictory views of these elements. Exton comments that: "When considering comprehension strategies it is important that they are presented as possible techniques for program comprehension rather than a 'correct method for comprehending programs,' as any single method might work for a given individual but may not for others" [23].

2.1 External Representation

External representation consists of any material and data related to the target program that is not part of the programmer's internal knowledge and is utilized by the programmer to comprehend the program. The target program can be presented in different ways and formats using different methods and tools. For example, some studies provide only the program code to the subjects (in hard copy and/or on a computer screen), while others may provide the documentation of the program as well.

2.2 Cognitive Structure

Cognitive structure consists of the internal knowledge of a programmer, which can be divided into a knowledge base and a mental representation. The knowledge base exists in the form of the programmer's general programming knowledge, programming language-related knowledge, domain-related knowledge, and so on. The mental representation is the internal knowledge which programmers build from the target program while comprehending it, using their knowledge base. Programmers have different backgrounds. Experts have a richer knowledge base than novices, which helps them to comprehend the target program faster and more easily. When experts read a program, they are able to efficiently link the parts of the program to those already existing in their knowledge base and build the mental representation. For novices this is a more difficult task, because the target program is more likely to be completely new and unfamiliar to them, since their knowledge base is not comprehensive and the process of linking (i.e., cross-referencing) is missing or inefficient.
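The difference between an expert's plan-guided reading and a novice's statement-by-statement reading can be sketched in code. The following Python sketch is our illustration, not part of any reviewed model; the plan names and statement kinds are invented for the example.

```python
# Minimal sketch (not from any of the reviewed models): a "knowledge base"
# of familiar plans, represented as sequences of statement kinds, and a
# matcher that links a code fragment to a known plan. An expert with a
# richer knowledge base can chunk more fragments; a novice (empty base)
# recognizes nothing and must process each statement individually.

# Hypothetical plan names and statement kinds, for illustration only.
EXPERT_KNOWLEDGE_BASE = {
    "find-maximum plan": ("init-to-first", "loop", "compare-update"),
    "accumulator plan": ("init-to-zero", "loop", "add-to-total"),
}
NOVICE_KNOWLEDGE_BASE = {}

def recognize(fragment, knowledge_base):
    """Return the name of a known plan matching the fragment, if any."""
    for name, pattern in knowledge_base.items():
        if tuple(fragment) == pattern:
            return name
    return None  # no chunk formed; fall back to statement-by-statement reading

fragment = ["init-to-zero", "loop", "add-to-total"]
print(recognize(fragment, EXPERT_KNOWLEDGE_BASE))  # accumulator plan
print(recognize(fragment, NOVICE_KNOWLEDGE_BASE))  # None
```

The sketch only captures the linking step; in the models discussed below, recognition also drives further hypotheses and inferences.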

2.3 Assimilation Process

The assimilation process pertains to the strategy that programmers use to extract information from source code in order to build their mental representation of the code and to comprehend it.

2.3.1 Top-down comprehension

Top-down models describe the assimilation process as one in which comprehension consists of applying knowledge about the domain of the program and mapping this knowledge to the microstructure of the code. According to Brooks, the assimilation process is driven by hypotheses and is described as an opportunistic process driven by beacons in the code [9]. Soloway and Ehrlich proposed a schema, or plan-based, top-down model in which programmers use programming plans and rules of programming discourse to decompose goals and plans into lower-level plans [87]. The cognitive structures here are the programming plans and rules which form the knowledge base and the plans which form the programmer's mental representation.

2.3.2 Bottom-up comprehension

Models which are categorized as bottom-up describe an assimilation process in which programmers start with individual code statements and chunk or group these into higher-level abstractions. This process is repeated at successively higher levels until a complete mental representation of the program is formed. In the model of Shneiderman and Mayer the knowledge base consists of syntactical and semantic knowledge, and novices are more focused on syntactical knowledge [83]. Pennington describes a process in which two distinct models evolve [65,66]. The program model is built by chunking microstructures in the text into macrostructures. The program model is essentially a control-flow abstraction of the program. The situation model includes data flow within the program, and the function or goals of the program. Programmers who exhibited a high level of comprehension were observed to cross-reference frequently between the program and situation models.

Pennington's model was based on studies of programmers working in the procedural paradigm. More recent work has extended that model to take account of the increasing prevalence of object-oriented (OO) programming. Corritore and Wiedenbeck used Pennington's model as a framework to study the comprehension of programmers who were expert in either procedural or OO programming [15]. Burkhardt et al. studied the comprehension by expert and novice programmers of an OO program, and also considered the influence of the task: read-to-recall or read-to-reuse [10]. Their model extends Pennington's to include aspects specific to the object-oriented paradigm. They observed the evolution of a situation model only in novices undertaking a read-to-reuse task, while experts may follow opportunistic strategies as described in the following section.

2.3.3 Opportunistic strategies

Some authors do not accept the dominance of either top-down or bottom-up comprehension. Letovsky views the programmer as an opportunistic processor capable of exploiting both bottom-up and top-down cues as they become available [42]. Littman et al. observed that some programmers adopt an as-needed, or opportunistic, approach, which focuses on the necessities of the task at hand [48]. This contrasts with the systematic approach taken by other programmers who form a more complete mental representation of the program. O'Brien et al. distinguish between two variants of top-down comprehension, 'expectation-based' and 'inference-based', and assert that these processes are used in conjunction with bottom-up comprehension [63].

2.3.4 Integrated models

The integrated meta-model [54] builds on the previous models described here, and identifies three mental representations: top-down (domain) model, program model and situation model. The assimilation process involves switching between models as required to build the three models simultaneously. The authors stress the importance of the knowledge base. Corritore and Wiedenbeck described the results of their study as being consistent with the need for an integrated model [15]. More recently Douce has described a working memory model which takes account of the visual, spatial and linguistic capabilities of programmers [20]. The working memory model contains a 'central executive' component which communicates with knowledge components that are related to previous models.

In addition to the aforementioned models, there are some models that deal with program comprehension from different perspectives.

Constructivist models: Rajlich describes the process of program comprehension from the point of view of the constructivist theory of learning [70]. According to this view, program comprehension starts with the pre-existing knowledge of the programmer and continues through processes of assimilation and adaptation. Exton provides an overview of program comprehension models and contrasts these with constructivist theory [23].

Characteristics of representation model: Fix et al. focused specifically on five characteristics of the mental representation of the program, and showed that possession of these characteristics distinguished experts from novices [25]. While this work does not include all the elements we look for in a program model, its focus on what characteristics need to be developed in order to become expert is of interest here.

3 Selection of models for analysis

The aim of the selection of models was to find a set of models which are influential individually and which represent the main model types, for example top-down and bottom-up (see Figure 2). Other reviews of the work in program comprehension also mention a similar choice of models, see [19,53,62,91]. Models selected for detailed analysis are highlighted in the figure. Arrows indicate direct antecedence or strong influence, while the dotted area indicates that the model of von Mayrhauser and Vans integrates the enclosed previous models. Pennington's model is clearly influential, as a number of other models use it as a basis, so this was selected as an initial candidate. The authors discussed this model collectively and in doing so derived the schema for analysis which is described in the following section of the report.

4 Schema for Analysis

The set of analysed models is shown in Table 1. The table gives a short characterization of each model. Citations (retrieved July 2010) in Google Scholar and the ACM Digital Library are counted, and references to educational publications citing the chosen set of PC models are listed in the last column.

Figure 2: Evolution of PC models. Models selected for detailed analysis are shaded.

Table 1: PC models selected for detailed analysis. Note: (CER: Computing Education Research)

Authors | Year | Model type | Citations Google Scholar | Citations ACM digital library | Citations in CER papers
Soloway and Ehrlich | 1984 | top-down | 560 | 44 | [4,16,24,26,29,30,41,46,56,57,67,71,75,77,85,89,92,100]
Letovsky | 1986 | opportunistic | [43]: 270 | [42]: 3 | [1,2,5,27,72,75,76]
Pennington | 1987 | bottom-up | [66]: 384 | [65]: 47 | [31,38,46,50,51,60,61,71,78-80,94,101]
Fix, Wiedenbeck and Scholz | 1993 | representation characteristics | 45 | 8 | [11,50,51,69,84,96]
von Mayrhauser and Vans | 1995 | integrated meta-model | [54]: 289, [55]: 111 | 24 | [13,36,80]
Burkhardt, Détienne and Wiedenbeck | 2002 | bottom-up (novices), possibly opportunistic (experts) | 39 | 11 | [8,80]

As a first step each chosen model was described in an abstract and its impact was determined by the citation count in Google Scholar and the ACM Digital Library. Afterwards the model was characterized according to the key elements in Figure 1. Since many models are based on experimental studies, information on the empirical base of the respective models was compiled, e.g. the research methods used. In the next step, each model was mapped to the Block Model, in order to find counterparts.

Subsequently, implications for teaching and learning were derived. Main aspects concerned the kind of knowledge and skills that are needed or have to be learned, and the difference between experts and novices. Last but not least, implications for further research and finally empirical validations were considered.

5 Introduction to the Block Model

The Block Model [80] is an educational model of program comprehension, which describes the core aspects of understanding a program text. These aspects are organized as a matrix, where the columns represent dimensions of PC and the rows the hierarchical levels of comprehension (Figure 3).

Macro structure | Understanding the overall structure of the program text | Understanding the "algorithm" of the program | Understanding the goal/the purpose of the program (in its context)
Relations | References between blocks, e.g. method calls, object creation, accessing data, ... | Sequence of method calls, "object sequence diagrams" | Understanding how subgoals are related to goals, how function is achieved by subfunctions
Blocks | 'Regions of Interest' (ROI) that syntactically or semantically build a unit | Operation of a block, a method, or a ROI (as sequence of statements) | Function of a block, maybe seen as subgoal
Atoms | Language elements | Operation of a statement | Function of a statement; goal only understandable in context
Dimension | Text surface | Program execution (data flow and control flow) | Functions (as means or as purpose), goals of the program
Duality | Structure | Structure | Function

Figure 3: Block Model [80]

The conceptualization of the hierarchical levels mainly follows Kintsch's generalized and expanded version of earlier work on text comprehension [33].

According to Kintsch, the comprehension process is conceptualized as bottom-up and yet chaotic and flexible. First, comprehension has to start at some point with sensing (reading) the (program) text – which is a sequential process. Word by word, new information is constructed and added to the internal mental model of the reader. This process usually moves from bottom to top in the matrix: from words (atoms), to blocks, to inferences about the relations between blocks, to recognizing the holistic macrostructure at the fourth level. This process may also be a cyclic process [33]: for each sentence, each word in turn is perceived by the reader and immediately incorporated in the mental representation. At the end of a sentence – in our context we should say: of a perceived program block – the capacity of the Short-Term Memory (STM) is reached, and information needs to be transferred and integrated into the Working Memory (WM) so that short-term memory is freed for the next cycle. (STM is only able to store information for a short time, while with WM it is also possible to process and manipulate the information. Therefore, STM can be considered as a subset of WM [22]. From the program comprehension perspective, STM can be used for storing the information that a programmer has recently acquired from the code. Processing that information and linking it to other information obtained so far from the target program takes place in WM.) In this integration process only some information (not all) of the former cycle is transferred; the mental representation is abstracted stepwise from the perceived material.

As the information from the text has to be abstracted and reduced, this raises the interesting question of which types of information need to be transferred to the next level. The goal is to extract the information which is needed to enable a connection or integration with prior knowledge, so that the mental representation of the text builds a coherent (Kintsch says 'gestalt-like') whole ([33], p. 93).

Each hierarchy level is more abstract and independent from the perceived information ([33], p. 10-16). At the higher levels it becomes more conscious and organized ([33], p. 93f).

In order to describe this process and the needed types of information, Kintsch proposes a distinction between the "text base" for information extracted from the text, and a "situation model" for the activated prior knowledge from long-term memory. For example, the situation model can differ from previously extracted information (= the text base) due to inferences based on prior knowledge of the reader ([33], p. 50). Of course, there is only one mental representation. The text base is used to describe and analyze those parts of the mental representation that are directly extracted from the text. The situation model contains inferences and abstractions built by integrating prior knowledge into the current understanding of the text.

This distinction is also used by several models of PC. However, the Block Model distinguishes three general types of knowledge, visualized as columns in the matrix. These three dimensions of comprehension refer to text surface, execution and function.

Text surface is aligned to the external representation of the program. It is the actual code a person reads in order to comprehend the program (or parts of it). In the Block Model it is supplemented by the execution dimension of the program. This is the feature by which a program text is distinguished from other types of texts. In the discussion on learning programming, understanding the execution is an important aspect for learning [80].

Based on knowledge of the text surface and the execution, a text base is constructed as a mental representation. By inferences and additional domain knowledge, a situation model is also constructed, which includes comprehension of the function, the goals and purposes of the program.

These three dimensions are separated into two sides: structure and function. This separation resembles the separation of text base and situation model. But it also draws on ideas from the philosophy of technology [37]. Thereby technical artifacts, such as source code, have a dual nature. This duality is called structure vs. function. Structure draws on the empirically observable and objectively measurable properties of the program: the way the program text is structured, and accordingly the resulting execution of the program. Thus the first two dimensions are conceptualized as structure. In contrast, function draws on the intended goals, purposes and desired effects of the so-to-speak 'physical' structure. The crucial point in the philosophical debate is that function cannot directly be measured or inferred from structure; instead, it is a socially grounded interpretation.
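This duality can be illustrated with a minimal sketch (ours, not from [80]). Structurally, the function below always computes a sum; whether that sum functions as an invoice total or as a checksum is an interpretation supplied by the surrounding context, not something measurable from the code itself.

```python
# Sketch (our illustration, not from the Block Model paper): identical
# structure, different functions depending on contextual interpretation.

def total_of(values):
    total = 0              # atom level: a single assignment statement
    for v in values:       # block level: loop plus accumulation form one unit
        total += v
    return total           # macro level: "sum the inputs"

# Two contexts assign different *functions* to the same structure:
invoice_total = total_of([20, 5])          # function: total price of an order
checksum = total_of([0x48, 0x69]) % 256    # function: integrity check over bytes

print(invoice_total)  # 25
print(checksum)       # 177
```

The comments also hint at the hierarchical levels: the same lines can be read as atoms, as a block, or as part of the program's macro structure.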

The last dimension therefore is conceptualized as function in this philosophical perspective. The interesting aspect of this separation is that it suggests a qualitatively different aspect of understanding that has to be distinguished from the usual idea of 'information extraction'. In order to comprehend function, a reader has to interpret the program text. Noteworthy for an educational model is that understanding of the function and goals of a program relies on inferences, and on several knowledge types extracted from the code. This has important educational implications. The educational difficulty is to build bridges between these two sides of the duality. Teaching approaches in particular should avoid understanding of function that is detached from the understanding of the underlying structure – and should avoid teaching structure without bridging to function, too.

Overall, within the comprehension process maybe all, maybe only one dimension is involved. Which types of information are extracted, and how the process of abstraction and inference develops, depends on the already constructed mental representation.

Comprehension is influenced a) by the material already read, b) by the knowledge base, c) by additional information, and d) by goals or intentions of the reader. All these factors are constrained or even constructed by the teaching approach. For example, by giving information about function and goals at the macro level, the bottom-up reading process turns into a top-down process in which already constructed preliminary understandings (hypotheses) are refined, become more detailed, or are rejected.

Another interesting aspect of the comprehension model is the assumption that with the increasing competence of the reader fewer errors occur in: a) extracting information, b) integrating it, and c) activating the relevant prior knowledge [33]. In addition, with greater expertise more processes are performed automatically or unconsciously, so that cognitive resources are freed for more complex and intentional processing [33].

Based on the model, different learning paths can be built. Therefore the blocks can be arranged in different orders. But not all blocks always need to be taken into account, only the core issues.

In addition, application of the model should distinguish between micro-sequences and macro-sequences. Micro-sequences focus on teaching and learning (~comprehending) one singular program text, maybe implementing one sorting algorithm. They focus on one example only. Macro-sequences focus on a course, in which more than one example or goal is taught.

However, the Block Model gives an abstracted overview, and as such remains rather vague concerning teaching issues like goals, content and sequence of teaching. By analyzing different models of program comprehension in more detail, the authors strive to obtain more detailed educational knowledge of these issues.

6 Results of Analysis

We now present the results of detailed analysis of each of the set of selected models shown in Table 1.

6.1 Soloway and Ehrlich

The model of Soloway and Ehrlich [87] is based on natural language text comprehension research. From the perspective of this model, a computer program consists of plans, which are "generic program fragments that represent stereotypic action sequences in programming" [87]. These plans are composed by applying rules of programming discourse, that is, programming conventions. In a well-written program, good programming conventions are used to govern the composition of plans in a program, and as a result, the program is comprehensible. Atypical plans and bad programming conventions result in a program that is difficult to comprehend.

6.1.1 Characterization of the Model

The model of Soloway and Ehrlich can be characterized as follows.

6.1.1.1 External Representation Used

The external representation consists of the program code. Program code is given to the programmers on paper in either plan-like form (a program that uses only typical plans, written in a way consistent with discourse rules) or unplan-like form (a program composed in a manner that does not conform to the plans and discourse rules).

6.1.1.2 Conceptualization of Assimilation Process

Comprehension takes place by reading the code and finding and understanding familiar plans. Rules of programming discourse (e.g., proper variable names) play an important role in understanding plans and help the programmer to make sure that a plan is indeed the one he or she has intended it to be. Improper rules of programming discourse result in misinterpretation of plans, which in turn leads to failure in understanding the program. Critical lines (those that carry the information that makes the program a plan-like program) are the key representatives of a plan, which experts utilize in the process of recognizing the plan.

The assimilation process in the model of Soloway and Ehrlich is top-down [53].

6.1.1.3 Conceptualization of Cognitive Structure

The programmer's knowledge base is divided into programming plans and rules of programming discourse. The mental representation of a program consists of plans that are composed in a program. The composition is governed by rules of programming discourse. In addition, critical lines, which play the role of beacons, are essential in recognizing plans, and thus are of high importance in programmers' knowledge base.

6.1.2 Empirical Basis

In order to show the validity of these claims, the two following studies were conducted by Soloway and Ehrlich.

6.1.2.1 Study 1

Two different versions of programs were used: a plan-like program and an unplan-like program, as defined before. These two versions were created for 4 different short program types (so a total of 8 programs for each participant). From each program, one line of code was replaced with a blank line. Experts (45) and novices (94) were asked to fill the blank line so that it best completed the program (participants were not told what problems the programs were intended to solve). According to the Soloway and Ehrlich model, expert programmers should perform better with plan-like programs than with unplan-like programs. Whether a program was plan-like or unplan-like should not critically affect novices' understanding of the programs, since they have less

From this information, plans are inferred. Plans relate to the different levels of the function dimension in the block model (see 2 in Figure 4). The horizontal arrow in the mapping indicates that inferences are linked from (1) to (2).

knowledge than experts, that is, they are not expected to have acquired as many of the plans and discourse rules. The first study showed, that for plan-like programs experts performed significantly better than novices, while no such differences were found on unplan-like programs. Overall, all subjects solved more plan-like problems.

6.1.5

6.1.2.2 Study 2 This study presented experts (41) with essentially the same (but complete) programs as in the first study. Each program was presented to each expert three times (each time for 20 seconds) and they were asked to recall the programs verbatim. The programs included both plan-like and unplan-like programs. On the first time, participants were asked to recall as much of the program as possible, and on the second and third times they were asked to add to or correct their original recall using a different color pencil each time. In this study, the idea is that if plans help experts to understand programs more efficiently, they should recall more of the plan-like programs earlier. The results were presented in terms of the number of recalled critical lines of the plan-like and unplanlike programs. The changes of difference between the recalled critical lines in the two program types over trials were also analyzed. This is intended to demonstrate the importance of plans and plan-like programs in recalling and comprehension.

6.2

Empirical results for the Pennington model show that programmers attaining high levels of comprehension tend to think about both the program world and the domain world to which the program applies while studying the program. This is called a crossreferencing strategy and contrasts with strategies in which programmers focus on program objects and events or on domain objects and events, but not both. This model is based on the text comprehension model of Kintsch and van Dijk [34].

Subsequent Empirical Work

The model of Soloway and Ehrlich is one of the oldest models in the program comprehension field and constitutes the basis of many later studies (see, for example, [53], [54]).

6.1.4

Pennington

The model of Pennington [64,65] distinguishes between a program model (text base) and a domain model (situation model), which are developed sequentially (program model first). Comprehension is best when the two models are interrelated. The model was developed from studies which focused on differences in comprehension strategies between programmers who attained high and low levels of program comprehension.

The results of the second study showed that experts recalled significantly more plan-like lines of code correctly, thus supporting Soloway and Ehrlich’s model.

6.1.3

How do experts and novices differ?

Experts have (more comprehensive) knowledge of programming plans and are able to use these plans when applying rules of programming discourse, while novices are not. As a result, experts perform considerably better with plan-like programs than with unplan-like programs, whereas the difference in performance for novices is not as clear (since they have not yet developed many plans).
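To make the distinction concrete, the following is a hypothetical sketch (ours, in Python rather than the Pascal-style materials used in the studies; the function names are illustrative) of the same "find the maximum" plan written in a plan-like and an unplan-like way. The unplan-like version violates the discourse rule that a variable's name should reflect its role:

```python
def plan_like_max(numbers):
    # Plan-like: the variable name 'maximum' matches its role in the plan,
    # following the discourse rule "variable names reflect purpose".
    maximum = numbers[0]
    for n in numbers[1:]:
        if n > maximum:
            maximum = n
    return maximum

def unplan_like_max(numbers):
    # Unplan-like: the code still computes a maximum, but the name
    # 'minimum' violates the discourse rule and misleads a plan-based
    # reader, who expects the name to signal the plan's goal.
    minimum = numbers[0]
    for n in numbers[1:]:
        if n > minimum:
            minimum = n
    return minimum
```

Both functions behave identically; only their comprehensibility to a plan-based reader differs, which is the effect the two studies measured.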

6.2.1 Characterization of the Model

The model of Pennington can be characterized as follows.

6.2.1.1 External Representation Used

Mapping to Block Model

In the first ‘short text’ study [66], several 15-line FORTRAN and COBOL program segments were presented to computer professionals. In the second study [65], a minimally documented 200-line FORTRAN program (converted from an earlier COBOL program) was used. “The documentation included an introductory set of comments describing the program as one that keeps track of the space allocated for wiring … and the wiring assigned to that space during the design of a building... the FORTRAN version contained one-line comments corresponding to COBOL paragraph headers” ([66], p. 329). In both studies, the program text was presented on the computer display and subjects could scroll forward or backward, jump to another place in the program, split the screen into halves, and scroll either half. Thus the external representations were primarily the program code supported by the features of the development environment, plus some complementary domain information.

As a top-down model, the Soloway and Ehrlich model can be considered as consisting of the following two phases: recognizing/assuming plans and verifying the recognized plans. Programmers perform the first phase using their knowledge base and programming plans. In the second phase, rules of programming discourse and critical lines are used to ensure that the plans are recognized correctly. The mapping of the Soloway and Ehrlich model onto the block model as shown in Figure 4 pertains to the second phase.

6.2.1.2 Conceptualization of Assimilation Process

Pennington proposes that understanding of overall program control flow precedes the more detailed understanding of program functions. In particular, she suggests that program readers build at least two mental models of the program they are studying: a program model and a domain model. The program model is characterized by an abstract knowledge of the program's text structures. The domain model relates objects and functions in the problem domain to source-language entities. Pennington asserts that the comprehension strategies of the subjects in these studies can be characterized as program-level, domain, or cross-referencing, the latter being a strategy that combines features of the other two. That is, the programmers concentrated either on the program, on the problem domain, or somehow effectively related the two. It was the cross-referencing readers who performed best.

Figure 4. Soloway & Ehrlich mapped to the Block Model

Critical lines are single statements in plans, and thus can be regarded as relating to atoms in the Block Model. Consequently, rules of programming discourse can be related to atoms, as they are often associated with sentences. Comprehension therefore starts with, and is focused somewhat upon, the atom level and the text surface (see 1 in Figure 4).

After the study period, subjects wrote summaries explaining what the program did and answered 20 comprehension questions. They were given an additional 30 minutes to implement the requested change, after which a second summary was written and 20 more comprehension questions were answered.

6.2.1.3 Conceptualization of Cognitive Structure

Pennington does not refer explicitly to the notion of a knowledge base. However, she does explore ‘text structure’ knowledge and ‘plan knowledge’ as macrostructures in the construction of mental representations of program texts.

Using her analysis of the data, Pennington asserted that the subjects’ comprehension strategies could be characterized in the terms of the model resulting from the study.

6.2.3

The “idea that text structure units play an organizing role in memory suggests three main features of program comprehension: (1) Comprehension proceeds by segmenting statements at the detail level into phrase-like groupings that then combine into higher order groupings. (2) Syntactic markings provide surface clues to the boundaries of these segments. (3) The segmentation reflects the control structure of the program” ([66], p. 307). “Plans correspond to a vocabulary of intermediate level programming concepts such as searching, summing, hashing, counting, etc., and there are hundreds (maybe thousands) of these plans. Like other forms of engineering and design, “there is a craft discipline among programmers consisting of a repertoire of standard methods of achieving certain types of goals” (Rich, 1980)” ([66], p. 307).

In this relatively recent study, Shaft & Vessey [82] build upon the work of Pennington with respect to the program model, indicating its merit as a basis for subsequent research, but they relied on von Mayrhauser and Vans [54] for the combination of the three comprehension models (program, top-down and situation/domain model).

Five types of information are extracted:

 Operations: “specific actions [which] occurred in a program” ([66], p. 101)

 Control flow: the execution sequence of a program

 Data flow: transformations of data objects

 State: “the state of all aspects of the program that are necessarily true at that point in time” ([65], p. 101)

 Function: purpose and role of the program (goals / hierarchy of subgoals)
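As an illustration, the following hypothetical Python fragment (ours, not taken from Pennington's FORTRAN or COBOL materials) annotates a small function with the five information types:

```python
def total_positive(values):
    # Function: the goal of the program is "sum the positive values".
    total = 0                  # Operations: a specific action (initialize an accumulator)
    for v in values:           # Control flow: one iteration per element
        if v > 0:              # Control flow: conditional execution
            total = total + v  # Data flow: v is transformed into a contribution to total
        # State: at this point, total holds the sum of all positive
        # elements examined so far.
    return total
```

A comprehension question about control flow might ask which statement executes after the condition fails, while a function question asks what the code is for; the types address different layers of the same text.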

Subsequent Empirical Work

Shaft & Vessey [82] cite the work of Pennington in mental representations of software developers: “Pennington’s ([65], [66]) characterization of the types of information in software as function, data flow, control flow, and state information best encompasses the relevant research and has been used widely in studies addressing the types of information found in software. Hence we use Pennington’s characterization in our research” ([82], p. 34).

The Shaft & Vessey study concluded that ‘cognitive fit’ was important for comprehension when conducting software maintenance: “Improved software comprehension is associated with better performance on the modification task only when cognitive fit exists between the maintainer’s dominant mental representation of the software and the type of modification task conducted (software modification task). When fit does not exist, improved software comprehension is associated with lower problem-solving performance on the modification task” ([82], p. 47).

6.2.4 Mapping to Block Model

Mapping of the Pennington model to the Block Model is illustrated in Figure 5. The mapping assumes that the program model is first built for the text surface dimension in a bottom-up fashion. In the second phase the domain model is built for the function dimension (as indicated by the arrow labeled “1” in Figure 5); it is unclear whether a top-down, bottom-up or combined strategy is adopted (as indicated by the double-headed arrow labeled “2” in Figure 5).

Pennington argues that “abstract knowledge of program text structures plays the initial organizing role in memory for programs, and that control flow or procedural relations dominate in the macrostructure memory representation” ([66], p. 337). She asserts that the program model is characterized by an abstract knowledge of the program's text structures. The domain model relates objects and functions in the problem domain to source-language entities. Whether or not Pennington's results indicate that program readers create two distinct mental models in succession, they certainly support the layered abstractions proposed by Brooks [9] and Letovsky [42].

6.2.2 Empirical Basis

Pennington [65] carried out two studies to develop her conceptualisation of program comprehension. In the first ‘short text’ study, using a Latin square experimental design, professional programmers were exposed to small pieces of code (15-line segments) and engaged in recall and recognition tasks indicative of comprehension.

Figure 5. Pennington mapped to the Block Model

At a third comprehension stage, links across columns in the Block Model occur through a cross-referencing strategy between program and domain models (double-headed arrow “3” in Figure 5).

In the second study of moderate-length program texts, Pennington [65] carried out an experiment using a minimally documented 200-line FORTRAN program. Subjects (drawn from the top and bottom quartile comprehenders in the first study) were asked to study the program for 45 minutes in preparation for a modification task. Some of the subjects were asked to think aloud as they examined the program.

The latter is said to be the most successful model of program comprehension. It is quite possible that the sequence 1-2-3 is an artifact of the experimental design.


6.2.5 How do experts and novices differ?

Experts tended to use cross-referencing, by which the program model and domain model were constructed and linked. Novices tended to construct either a program model or a domain model.

6.3 Letovsky

Letovsky [42], [43] describes “a computational model of the subjects’ understanding processes, in which questioning and conjecturing behaviors play a functional role” ([43], p. 325). The inherent “processes and representations constitute a cognitive model of human program comprehension abilities [...]. The basic structure of the model is a set of knowledge assimilation processes that construct a mental model of the target program by combining information gathered through reading the code and documentation with knowledge from a knowledge base of programming expertise” ([43], p. 325).

6.3.1

Figure 6. Letovsky mapped to the Block Model

While the specification relates to the dimension of functions in the Block Model, especially since Letovsky outlines the possibility of decomposing goals into subgoals, implementation and annotation cannot be mapped directly to text surface and execution. The mental model is a kind of layered network, in which the annotations constitute the layers between the implementation and the specification. These annotations are implicit in the transitions between the levels and dimensions of the Block Model. The process of acquiring this mental model is quite flexible; the opportunistic programmer uses both bottom-up and top-down strategies, therefore multiple translations to the Block Model are possible.

Characterization of the Model

The model of Letovsky can be characterized as follows.

6.3.1.1 External Representation Used

Source code combined with program documentation consisting of an overview, program routine descriptions, a hierarchy chart (the calling structure of the routines), a file description (the structure of the database file) and a sample session provided the basis for Letovsky's model.

6.3.4

6.3.1.2 Conceptualization of Assimilation Process

The assimilation process “interacts with the stimulus materials [...] and the knowledge base to construct the mental model” ([43], p. 331). The programmer is an opportunistic processor that mixes bottom-up and top-down elaboration of the mental representation.

6.4

6.3.1.3 Conceptualization of Cognitive Structure

Empirical Basis

The study used thinking-aloud protocols from six professional programmers (four expert-level program maintainers with 3 - 20 years of professional programming experience and two junior-level program maintainers with less than 3 years of professional experience). The given tasks included understanding and enhancement of a FORTRAN 77 program, which contained 14 routines, 250 lines of code and documentation. Three of the six subjects completed the task in the allotted 90 minutes.

6.3.3

6.4.1

6.4.1.1 External Representation Used

The basis is a 135-line PASCAL program, presented on paper.

Mapping to Block Model

 “Specification: an explicit, complete description of the goals of the program;

 Implementation: an explicit, complete description of the actions and data structures in the program;

 Annotation: an explanation that shows how each goal in the specification is accomplished and by which parts of the implementation, and what goal(s) in the specification is (are) subserved by each part of the implementation” ([43], p. 332).
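These three aspects can be sketched for a tiny program (a hypothetical Python example, ours, not from Letovsky's materials): the docstring states the specification as a goal and subgoals, the code is the implementation, and the comments act as annotations linking each subgoal to the code that accomplishes it.

```python
def average_passing(scores, threshold):
    """Specification: report the average of the passing scores.
    Subgoal 1: select the scores at or above the threshold.
    Subgoal 2: average the selected scores."""
    passing = [s for s in scores if s >= threshold]  # annotation: accomplishes subgoal 1
    return sum(passing) / len(passing)               # annotation: accomplishes subgoal 2
```

In a complete mental model, such annotations run in both directions: from each goal to the code realizing it, and from each code part to the goal it subserves.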

Characterization of the Model

The model of Fix, Wiedenbeck and Scholz can be characterized as follows:

6.4.1.2 Conceptualization of Assimilation Process

According to Letovsky, a complete mental model includes the following aspects:

Fix, Wiedenbeck and Scholz

The focus of this model [25] is not on the process, but on five abstract features of the mental representation of the code. In contrast to the types of information included in the mental representation, this model focuses on characteristic features of a mental representation that distinguish novices from experts. These features are drawn from prior work also discussed here, namely the models of Soloway, Letovsky and Pennington. The distilled features by which novices and experts differ are named: hierarchical and multi-layered; explicit mappings between the layers; founded on the recognition of basic patterns; well connected; well grounded in the program text.

Programmers are knowledge-based understanders, they have a “knowledge base, which encodes the expertise and background knowledge that the programmer brings to the understanding task” ([43], p. 331) and a “mental model that encodes the programmer’s current understanding of the target program. This model evolves in the course of the understanding process” ([43], p. 331).

6.3.2

How do experts and novices differ?

Letovsky conducted his study with professional programmers and therefore did not draw explicit conclusions about the differences between novices and experts. However, a developed knowledge base is necessary to become an expert.

Although the model is focused on characteristic features of the mental representation at the end of the assimilation process, some hints on this process are given. The assimilation process is mainly seen as a process of “information extraction”, based on the knowledge base (e.g. based on knowledge of plans and the hierarchical structure of programs, these types of information can be extracted from the code) [25]. In addition, the assimilation process may be influenced by reading strategies (e.g. reading the program’s code in the order of execution of the program). The authors indicate that such a strategy supports building a layered or hierarchical mental representation [25].

Understanding of interactions of parts of the program can be mapped to the level of relations in the Block Model. Again, Fix et al.’s model seems to be focused on the dimensions of text surface and functions.

6.4.1.3 Conceptualization of Cognitive Structure

The mental representation (of experts) is organized so that it has the following features:

5) Sound base in the program text: Another feature of experts’ mental representation is the ability to localize or re-localize already obtained information in the code. “Well grounded [mental] representations include specific details of where structures and operations occur physically in the program” [25]. An indicator is the ability to “fill in names of program units in a skeleton outline of the program” [25] (this task type may be seen as a precursor to Parsons’ puzzles [63]). For this, a skeleton of the program was visualized using boxes, with nested program parts shown as boxes within boxes; subjects had to fill in the appropriate names of “program units”. A second indicator is the ability to “match variable names to the procedures in which they occur” [25].

1) Hierarchical and multi-layered structure: This feature is based on Letovsky’s model [42] of a decomposition of goals into sub-goals. It is also related to the style of programming. It resembles Wirth’s ideas of procedural programming by stepwise refinement through hierarchical decompositions of tasks into subtasks [102], resulting in a modular and hierarchical structure of code. Relying on knowledge of such structures, an expert reading such hierarchical code will extract this type of information. An indicator of a hierarchical and multi-layered mental representation is e.g. the ability to “match procedure names to the procedures which they called” [25]. Layers may refer to the levels of the Block Model. It remains somewhat unclear whether such layers refer only to the dimension of text surface, to text surface and function, or even to all three dimensions.
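A hypothetical Python sketch (ours) of such a stepwise-refined structure: the top-level task is decomposed into named subtasks, so matching procedure names to their call sites, the indicator mentioned above, recovers the task/subtask hierarchy.

```python
def word_frequency_report(text):
    # Top-level task, refined into three subtasks.
    words = tokenize(text)        # subtask 1: split the input into words
    counts = count_words(words)   # subtask 2: count occurrences
    return format_report(counts)  # subtask 3: render the result

def tokenize(text):
    return text.lower().split()

def count_words(words):
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts

def format_report(counts):
    return [f"{w}: {n}" for w, n in sorted(counts.items())]
```

A reader with a hierarchical mental representation can summarize the program at the top level without holding the detail of each subtask in mind.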

It is not clear whether Fix et al. conceptualized the mental representation as having the needed information included, or as organized in such a way that the needed information can easily be reconstructed. In terms of the Block Model and other models relying on the idea of chunking, the latter would be the case. That is, the mental representation is conceptualized as a succession of different mental representations for the different layers, and moving up these layers would involve chunking. The idea of a mental representation as being well grounded in the code, however, refers to a movement downwards from abstractness to concreteness. Therefore, we might call the mental operations needed ‘de-chunking’.

2) Explicit mappings between the layers: These layers are also connected to an abstraction from the code. For example, it is assumed that it is easy to understand “the action of each line of code in isolation”, as well as the overall goals (e.g. via “mnemonic names or documentation” [25]). The problem in comprehension is the “mapping between high-level goals and their code representation” [25]. Explicit mapping between layers refers to a mental representation that enables a person “to link specific segments of code to program goals” [25]. An indicator is the ability to write a description of two procedures, “telling what program goals they realized” [25]. As two procedures are involved, answering the question requires one to “map between high level goals and the program code” [25]. In the empirical study, “subjects were also asked to write a sentence or two about how the procedure carried out its goals” [25].

6.4.2 Empirical Basis

20 novices and 20 experts had 15 minutes to read a 135-LOC PASCAL program on paper (3 pages). The code was then withdrawn, and a booklet with 11 questions, each on a single page, was given to the subjects. After filling out a page, participants were not allowed to return to it. The session was open-ended: experts used 1 hour on average, novices 1¼ hours.

Rephrased in terms of the Block Model, mapping between layers refers to the ability to link the dimensions of function and program execution to the dimension of the text surface.

6.4.3 Subsequent Empirical Work

No directly linked subsequent empirical work on the specific five abstract characteristics with regard to the article was found. However, there is work on some of the features, or on comparison of experts and novices in program understanding:

In contrast to the authors’ claims, when mapped to the Block Model this “mapping between layers” seems to happen across levels as well. For example, the above-mentioned indicator where subjects were asked to describe the goals of two procedures requires comprehending goals at the level of relations.

 Burkhardt/Détienne/Wiedenbeck [10]: “experts and novices differ in the elaboration of their situation model but not their program model”.

 LaToza et al. [40]: “Differences between experts and novices included that the experts explained the root cause of the design problem and made changes to address it, while novice changes addressed only the symptoms. Experts did not read more methods but also did not visit some methods novices wasted time understanding. Experts talked about code in terms of abstractions such as ‘caching’ while novices more often described code statement by statement. Experts were able to implement a change faster than novices. Experts perceived problems that novices did not perceive and were able to explain facts that novices could not explain.”

 Corritore and Wiedenbeck [14]: novices “form detailed, concrete mental representations of the program text…their mental representations were primarily procedural in nature, with little or no modeling using real-world referents.” In contrast, “more advanced novices were using more abstract concepts in their representations”.

3) Foundation on the recognition of basic patterns: This feature of the mental representation refers to plan knowledge as introduced by Soloway and Ehrlich [87]. Experts are likely to recognize plans while reading code. An indicator is the ability to “label complex code segments with a plan label” [25], such as “linear search”, “input loop until end of file”, “sort routine” or “binary search” [25].

4) Sound connections: This feature of the mental representation refers to the ability to understand “how parts of the program interact with one another” [25]. These interactions seem hard to understand because they may “embody instances of delocalized plans, that is, plans in which the code implementing them is scattered throughout the program” [25]. An indicator is the ability to “list names used for same data objects in different program units” [25].
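The plan labels and delocalized plans discussed for the Fix et al. features might be illustrated as follows (a hypothetical Python sketch, ours; the studies themselves used Pascal-style code):

```python
def linear_search(items, target):
    # A segment an expert would tag with the plan label "linear search".
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

# A delocalized plan: the parts of one running-total plan are scattered
# across three functions, so the plan is hard to see in any one place.
def make_account():
    return {"balance": 0}          # plan part 1: initialize the total

def deposit(account, amount):
    account["balance"] += amount   # plan part 2: update the total

def balance_report(account):
    return account["balance"]      # plan part 3: use the total
```

Recognizing the first segment as "linear search" is pattern recognition; seeing that the three account functions jointly implement one plan requires the "sound connections" between program parts.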

6.4.4

The model was developed based upon “results from an exploratory experiment that identifies dynamics of the cognition process when maintenance engineers work with large operational software products” [55]. The study aimed to find a code comprehension process model, using the Integrated Comprehension model as a guide for large-scale program understanding. Rather than a strict top-down or bottom-up process, the authors found that programmers frequently switch between program, situation and domain (top-down) models, and that for maintenance activities the effective understanding of large-scale code needs significant domain information.

Mapping to Block Model

As the Fix et al. model focuses on features of the mental representation, the mapping focuses, interestingly, on dynamic aspects. The Fix et al. model highlights the idea that experts can easily navigate within their mental representation to access different types of information.

6.5.1

While the matrix of the Block Model visualizes key elements of a mental representation, the mapping with arrows indicates these dynamic features of experts’ mental representations. Experts’ mental representations can be seen as highly dynamic or navigational, and therefore they can easily and flexibly navigate between dimensions and levels.

Characterization of the Model

The model of Von Mayrhauser & Vans can be characterized as follows:

6.5.1.1 External Representation Used

The second study investigated “the cognition processes of programmers with little prior knowledge of the application domain as they attempted to understand large scale production code” ([55], p. 426). The external representation appears to be solely the program code; it does not appear from the report [54] that any other specification documents were used.

6.5.1.2 Conceptualization of Assimilation Process

The model integrates aspects of the five models proposed by prior researchers: Letovsky ([42], 1986); Shneiderman and Mayer ([83], 1979); Brooks ([9], 1983); Soloway, Adelson and Ehrlich ([86], 1988); and Pennington ([65], 1987). The model accommodates a mental representation of the code, a body of knowledge (knowledge base) stored in long-term memory, and a process for combining the knowledge in long-term memory with new external information (such as code) into a mental representation. The integrated metamodel responds to the cognition needs of large-scale software systems. It combines relevant portions of the other models and adds behaviors not found in them, for example when a programmer switches between top-down and bottom-up code comprehension.

Figure 7. Fix, Wiedenbeck and Scholz mapped to the Block Model

In Figure 7, arrow (1) indicates what Fix et al. have called “founded on the recognition of basic pattern”, “well connected”, and “explicit mappings”: navigation between dimensions. Arrow (2) indicates what Fix et al. have called “hierarchical and multi-layered”, and “well grounded”: navigation between levels.

6.4.5 How do experts and novices differ?

“Prior work showed that subjects frequently switch between all model components (i.e., understanding is built at all levels of abstraction simultaneously)” ([55], p. 429).

The overall idea of this model is that comprehension relies on the ability to ‘navigate’ between different elements of the mental representation, horizontally as well as vertically (from the perspective of the Block Model). Experts are more likely to map aspects of function and execution to concrete code. They are also more likely to have a layered and hierarchical mental representation which is well connected, allowing navigation to, and focusing on, specific aspects, as well as ‘explicit mappings’ between these different elements. Novices do not have such an elaborated mental representation. They can show each of the five characteristics, or beginnings thereof, but are less likely to build such a well-connected or ‘navigational’ mental representation.

6.5

This model suggests a more complex process of alternating models of comprehension as different avenues of attack become exhausted. The knowledge base seems to be a central reference point for such model switching, but holds different types of data for each of the three models currently in use. It is argued that the three model components (program, domain, situation) are applied at differing levels of abstraction depending upon code size. The authors define a classification scheme based upon code length in lines of code (LOC): small scale, < 200 LOC; several modules, 200 to 2,000 LOC; program size, 2,000 to 40,000 LOC; large-scale system, > 40,000 LOC. It is suggested that for large code bases understanding occurs at a higher level of abstraction.
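The scale reads as a simple threshold scheme; the following helper (ours, purely illustrative, with the treatment of the exact boundary values an assumption) encodes it:

```python
def code_scale(loc):
    # von Mayrhauser & Vans LOC classification; inclusive/exclusive
    # handling at the stated cut-offs is our assumption.
    if loc < 200:
        return "small scale"
    if loc <= 2000:
        return "several modules"
    if loc <= 40000:
        return "program size"
    return "large scale system"
```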

Von Mayrhauser & Vans

This model [55,54] resulted from a desire to address gaps in program comprehension knowledge and in experimental research applicable to the work of software developers engaged in maintenance activities. The absence of experimental studies using large-scale bodies of code was noted. An integrated metamodel of program comprehension was proposed, drawing upon the work of prior researchers in program comprehension. Four major components were included: “the top-down, situation and program models and the knowledge base. The first three reflect comprehension processes; the fourth is needed to successfully build the other three” [54].

Four levels of aggregation are outlined: action types at the component level (e.g. program model); episode level (action sequences representing lowest level strategies); aggregate processes (repeated episode sequences); session level processes (overall maintenance task process). Examples of each of these based upon state machine diagrams are presented. Table 6 in [54] presents 50 or so discrete action types at the lowest component level.


The resulting highest, session-level process, characterised as ABC, captured the maintenance session for one module: it alternated between aggregate-level processes PA, PB and PC (codes for the repeated episode sequences noted above) by reaching the end of a block, then chunking and storing the knowledge gained. These were represented in a state model to demonstrate transitions between these aggregate processes within the overall maintenance session. Thus the aggregate-level processes represent investigation towards building chunks, and “at the session-level the purpose of each aggregate process is to understand a block of code (using different detail steps and information) and then to chunk and store the learned information” ([55], p. 434).

objects and the relationships of objects during program comprehension” ([15], p. 6). Corritore & Wiedenbeck studied the differences in comprehension of a procedural (C) and an OO (C++) program of some 800 LOC which implemented the same functionality and was supported by significant external documentation. 30 professional programmers, half of whom were conversant with each paradigm, participated in the study, completing three maintenance tasks over two separate two-hour sessions. “Compared to the procedural group, the OO group showed a pattern in which their strategy changed more over time with the activity. While initially they used a top-down approach to comprehension, later their strategy shifted towards a more bottom-up orientation as they made program modifications. This shift became more pronounced as time progressed, and was most clearly shown by the sharp decline in their documentation use in the second modification. It appears that the more abstract files provided the information needed for general comprehension of the program, but once the modifications began the OO programmers shifted to a more bottom-up orientation. Thus, the modification tasks focused the information needs of the programmers on the code. While the same decline in the use of abstract information occurred in the procedural group, it was less pronounced because external documentation was used less from the start” ([15], p. 17).

6.5.1.3 Conceptualization of Cognitive Structure

Various types of information are extracted from the knowledge base:

1) Top-down structures: such as programming plans at three levels (strategic, tactical and implementation) and “rules of discourse”.

2) Program model structures: related to program-domain knowledge (text structure knowledge, plan knowledge, rules of discourse).

3) Situation model structures: related to problem domain knowledge (in particular, functional knowledge).

6.5.2 Empirical Basis

The model is grounded in the experience of 11 professional programmers in moderate-scale and reasonably representative experiments on the comprehension processes involved in maintenance activities. Data from the analysis of a single programmer was taken as representative of the others for this study.

These results for OO programs are broadly consistent with the von Mayrhauser & Vans findings that rather than a strict top down or bottom up process, programmers frequently switch between program, situation and domain (top down) models, and for maintenance activities effective understanding of large scale code needs significant domain information.

The subject had to become familiar with “the main module of a terminal emulation program. To do this, he had to understand both the program and the X.25 communications protocol. He had recently taken over responsibility for this code and wanted to understand it so that he would be able to fix bugs and enhance the software in the future. Thus his was a general understanding task. The software the subject worked on and the specific assignment he performed is typical of maintenance tasks encountered in industry. The software consisted of approximately 85-90K lines of [non standard Pascal] code.” ([55], p. 427)

Shaft and Vessey [82] cite the von Mayrhauser & Vans model as illustrative of the multiple representations of software: “Software maintainers incorporate these different types of information into their mental representations of the software. They may develop multiple mental representations, each of which emphasizes a particular type of information. The types of representations that they form depend on their experience with software and with their knowledge of the application domain of the software, among other factors. A series of protocol analysis studies have led to the characterization of these representations as domain, program, and situation models (Vans et al. 1999; von Mayrhauser and Vans 1995, 1996; von Mayrhauser et al. 1997)” ([82], p. 34).

“A two hour programming session, during which [a representative subject] worked on a maintenance task, was audio-taped and then transcribed for analysis. He was in the process of understanding one module in a system for which he recently took over responsibility ([55], p. 426).” Protocol analysis using think aloud techniques was used as a research technique, complemented with analysis of ‘action types’, and of dynamic code understanding through process discovery and strategy analysis at three levels (episodic, aggregate, session).

6.5.3

The study concluded that due to problems with cognitive fit, comprehension of software in itself may be insufficient: “Prior research investigating software comprehension and modification views them as distinct tasks. Our findings indicate that they should be viewed as interrelated tasks because of the complex interrelationship between them. Seeking high levels of comprehension as a way of improving the ability to conduct other software related tasks is beneficial when the software maintainer’s mental representation of the software and their mental representation of the modification task emphasize the same type of knowledge. However, when there is a mismatch between the software maintainer’s mental representation of the software and the mental representation of the modification task, improvement in comprehension impedes performance on the modification task” ([82], p. 49).

Subsequent Empirical Work

Corritore and Wiedenbeck [15] referred to the model as a motivator for subsequent studies of OO programmers: “von Mayrhauser and Vans (1996) speculate that OO comprehension might be characterized by a more top-down approach compared to procedural comprehension…As suggested by the work of Gilmore and Green (1984), this would be likely to occur if OO code and documentation tend to highlight higher-level abstractions more than do procedural code and documentation. Experimental results of Burkhardt, Détienne & Wiedenbeck (1997) which employed program documentation and reuse tasks show that OO experts tend to develop a domain-based abstraction in terms of function,

6.5.4

Mapping to Block Model

The mapping assumes that the program model is built for the text surface dimension in a bottom up fashion, the top down model is

76

built for the program execution dimension in a top down fashion and the domain model is likewise built for the function dimension in a top down fashion.

6.6.1.1 External Representation Used The program was presented to the study participants as C++ source code in hard copy and on computer screen, with "little documentation". The program consisted of 10 classes presented in 23 files, with a total of 550 lines of code. The program was also provided in executable form, and a C++ reference textbook was provided.

Links across columns in the Block Model occur primarily at the chunking or block levels, with the knowledge base being implicit in this linkage. The model might alternatively, or perhaps complementarily, be represented as supporting linkages also at the relations level.

6.6.1.2 Conceptualization of Assimilation Process This work essentially follows the approach of Pennington in which comprehension is seen as a process of extraction of information and the construction of program model and situation model. The evolution of these models differs between novices and experts, and differs according to the nature of the purpose of comprehension. For read-to-recall, experts are more likely to construct a situation model than novices, while for read-to-reuse the differences are smaller and novices are able to construct a situation model over time.

6.6.1.3 Conceptualization of Cognitive Structure

Figure 8 von Mayrhauser & Vans Integrated mapped to the Block Model

6.5.5

The cognitive structure presented in this model is derived from Pennington’s model, with additional elements included to take account of object-oriented program features. While plan representations upon which the situation model is built are described by Pennington as being primarily based on data flows, the model here supports delocalized plans. The following additional text relation elements are defined: objects, relationships between objects, reified objects (objects which are not part of the problem domain but are required, e.g. a string class or collection class), main goals and client-server relationships. Like Pennington, these text relations are mapped to knowledge structures and mental representations, and a distinction is drawn between text knowledge and plan knowledge. There is explicit reference in this work to complex delocalized plan knowledge and to generic programming knowledge relevant to reified objects. The mental representation of the program is conceptualized as dynamic, object and functional views, objects and relationships form an object view, while data flow and client-server form a dynamic view. The functional view relates to the main aims of the program. Dynamic and functional views occur within both the program model and the situation model, while the object view is considered to be characteristic of the situation model only.

How do experts and novices differ?

The von Mayrhauser & Vans Model is primarily focused on the comprehension strategies of experts, and addresses the processes of comprehension for “programming in the large” by professional programmers, but could be conjectured to apply equally to “programming in the small” by novices, where domain familiarity or perhaps ”triviality” may cloud the findings of comprehension studies? Therefore finding ways of being more explicit about the domain may perhaps help novices in more naturally drawing the linkages between top down, program and situation models.

6.6

Burkhardt , Détienne and Wiedenbeck

This model [10] is based on a mental model approach following van Dijk and Kinsch. Like Pennington’s model, it distinguishes between the text base, or program model, and the situation model. The authors’ motivation in this work was to study the effect on PC of three factors within a single experiment: programmer expertise, nature of the task and the development of comprehension over time (phase). Two distinct types of task were included in the study: a read-to-recall task (documenting a program) and a readto-reuse task (modifying a program). Conceptually, the model can be considered to be a descendant of Pennington’s work, adapted to take into account object-oriented program features such as objects and message passing as well as larger program structures. The authors found a four-way interaction between the three factors studied and the type of model (program or situation) developed by the participants in the study. For the read-to-recall task, novices did not develop a strong situation model and there was no interaction of expertise with phase and type of model. For the read-to-reuse task, there was a three-way interaction between phase, expertise and type of model. The key finding was that when given this type of task novices were able over time to develop a situation model.

6.6.1

6.6.2

Empirical Basis

The model is grounded in a study based on a four-factor mixed design. The between-participant factors were expertise and task type while the within-participant factors were phase and type of model tested (i.e. program model versus situation model). There were 51 participants, 30 of whom were OO experts (professional programmers experienced in OO design with C++). The remaining 21 were classed as OO novices. These were advanced undergraduate computer science students who had recently enrolled in a course introducing OO programming and C++, and who were experienced in C and other non-OO languages but had only basic knowledge of OO programming and C++. Participants were randomly allocated to one of two tasks:

Characterization of the Model

The model of Burkhardt , Détienne and Wiedenbeck can be characterized as follows:



read-to-recall task: comment the code for the use of another programmer



read-to-reuse task: construct a different system which may reuse given code by inheritance or cut & paste
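To make the program model / situation model distinction that these tasks probe concrete, consider a minimal example of our own (sketched in Python rather than the study's C++; the class and all names are invented for illustration). The comments contrast program model information (control and data flow) with situation model information (what the code means in the university domain used in the study):

```python
# Illustrative sketch (our own example), set in the study's problem
# domain: management of information within a university.

class Course:
    def __init__(self, title, capacity):
        self.title = title
        self.capacity = capacity
        self.students = []

    def enroll(self, student):
        # Program model: a guard condition followed by a list append;
        # data flows from `student` into `self.students`.
        if len(self.students) < self.capacity:
            self.students.append(student)
            return True
        # Situation model: the course is full, so enrolment is refused.
        return False

course = Course("Programming 1", capacity=2)
print(course.enroll("Ada"), course.enroll("Grace"), course.enroll("Alan"))
# → True True False
```

A reader who merely traces the guard and the append has built a program model; connecting the returned booleans to an enrolment policy requires a situation model.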

For both tasks an initial preparatory study phase preceded a second phase in which the participants carried out the assigned task. Verbal protocols were collected, and the participants answered questions after each phase, where the questions were designed to measure program comprehension. The program selected for the study was chosen so that the problem domain (management of information within a university) was familiar to all participants regardless of programming expertise. While small by industrial standards, the program was described as being larger than programs typically used in similar research. The authors comment that their findings "confirm, for the documentation group, our hypothesis that the expertise of programmers should affect the construction of the situation model but not the construction of the program model, provided that our novices are advanced students" ([10], p. 134). For the reuse task, "the effect of phase was to increase the construction of the situation model but not of the program model. For the novices in the reuse situation there was a significant increase of the situation model between phases 1 and 2" ([10], p. 134). Experts appear to form program and situation models together, possibly using opportunistic strategies.

6.6.3 Subsequent Empirical Work

The article is quite recent, so we found no direct subsequent empirical work on the interplay of task, expertise and phase, but some subsequent work citing this article.

Arisholm et al. [3] evaluated pair programming with respect to system complexity and expertise. They observed that the "benefits of pair programming in terms of correctness on the complex system apply mainly to juniors, whereas the reductions in duration to perform the tasks correctly on the simple system apply mainly to intermediates and seniors" [3].

Karahasanović et al. [32] compared as-needed and systematic comprehension strategies used by intermediate beginners in a maintenance task on an object-oriented application. They concluded that programmers should "receive training in the advantages and disadvantages of different comprehension strategies for different maintenance tasks and applications. Furthermore, more attention should be paid in training to the areas in which the participants failed to apply their declarative knowledge of object-oriented concepts, such as inheritance of functionality."

6.6.4 Mapping to Block Model

This model is distinctive in that it places emphasis on the relations level of the Block Model (as indicated by "1" in Figure 9). The key finding relates to the increased development of a situation model (2) by novices with phase, dependent on the nature of the task. Experts follow more opportunistic strategies (3).

Figure 9 Burkhardt, Détienne and Wiedenbeck mapped to the Block Model

6.6.5 How do experts and novices differ?

This study found that the differences between the comprehension of experts and novices were dependent on the task to be performed. When presented with a read-to-recall task, novice comprehension focused on the program model more than on the situation model. However, when presented with a read-to-reuse task, novices were able to develop a situation model over time. There is an implication that, given tasks which involve modifying or reusing code, some readers "may construct a representation that they do not spontaneously construct otherwise" ([10], p. 141). This should be considered when designing learning activities based on sample programs. However, it should be noted that the "novices" in this study were in fact advanced students with significant programming experience, and were novices only in the sense of lacking OO experience; some caution must be exercised in expecting this finding to apply to true novice programmers.

7 Synthesis

Our review of research in program comprehension identifies some commonalities in the different models.

Comprehension is mainly focused on program code (often presented on paper) as the dominant type of representation. Only occasionally are tools, visuals, documentation (other than code comments) or representations of the domain included.

The process of assimilation is usually seen as dependent on context factors such as goal/task, prior knowledge, and representation. Thus the process may be top-down, bottom-up, as-needed, or opportunistic.

While some models, such as that of Letovsky, list elements of the knowledge base (KB), others do not. Pennington and others essentially ignore the KB; since they study professional programmers, they assume a complete or almost complete KB. From the educational perspective, however, an important question is how to teach and learn a solid knowledge base, and of what elements the KB should consist.

Types of knowledge analyzed in the models comprise ([43], p. 331):

• Programming language semantics: the understanding of the language

• Goals: the meaning of a large set of recurring computational goals that are independent of the algorithms that compute them

• Plans: a bag of tricks or solutions to problems that were solved in the past

• Efficiency knowledge: criteria that allow inefficiencies to be detected and techniques for evaluating the resource costs of plans

• Domain knowledge: knowledge of the world, the application domain, and specialized domains

• Discourse rules: knowledge of stylistic conventions in programming

The mental representation is often conceptualized as a duality of domain and program understanding; sometimes it is seen as consisting of several connected elements and abstractions. Some models also take into account the navigability between different elements as an important feature of the mental representation.

The following figure visualizes these issues as a mapping to the Block Model.

Figure 10 Block Model add-ons

The Block Model focuses on the assimilation process, which has to be triggered by the KB; the KB is therefore added as an additional layer in Figure 10. As yet it remains unclear whether it is more useful to conceptualize the KB as consisting of particular content for each cell of the Block Model, or along the dimensions:

• KB for text surface: rules of discourse, plans, …

• KB for program execution: mental model of the underlying machine, notional machine, …

• KB for function: domain knowledge, including a set of recurrent goals in programs

A fourth distinct element of the knowledge base might be reading strategies (procedural knowledge). These foster or inhibit the abstraction from atoms to the macrostructure.

8 Prior Influence of PC in CS Education

Here we present a necessarily brief overview of how the program comprehension research surveyed in this report has contributed to CS education in the past. A useful review of learning and teaching of programming has been presented in Robins et al. [75], who proposed a programming framework comprising the three elements of knowledge, strategies, and models, which in turn contribute to the three aspects of design, generation and evaluation of programs. A particular focus of the framework, building on the work of Spohrer and Soloway, is the emphasis on "strategies for carrying out the coordination and integration of goals and plans that underlie program code" ([88], p. 412-413), with the recommendation that these notions and strategies be explicitly taught. Börstler et al. [8] have noted that "There is a large body of knowledge on program comprehension…but this is rarely applied in an educational setting". This might be a matter of conjecture, but some suggested reasons are: the focus of much of the prior PC research has been on the work of professional programmers rather than students; the early research has shown that "there is very little correspondence between the ability to write a program and the ability to read one" [101], although subsequent work from the BRACElet project tends to refute that (cf. [44,49]); and, as Mannila has observed, "Introductory programming courses tend to have a strong focus on construction, with the overall goal to get students to write programs as quickly as possible using some high-level language"; again, from the perspective of the students themselves, "when students are asked about what they find difficult about programming, they mainly answer based on their experiences from writing code – not reading it" [51]. Thus CS educators' 'focus on construction' may naturally tend to lead them away from considerations of program comprehension. Kuittinen & Sajaniemi have claimed that "We know only two examples of research into new concepts that can be utilized in teaching introductory programming: software design patterns, and roles of variables" [39]. In the latter study a research design from Pennington [65] was replicated to study the effect of a visualisation tool (PlanAni) on students' levels of code comprehension. The study found that "the teaching of roles seems to assist in the adoption of programming strategies related to deep program structures, i.e., use of variables" [39]. Here they apply the distinctions proposed by Pennington [65] whereby "program knowledge concerning operations and control structures reflect surface knowledge, i.e., knowledge that is readily available by looking at a program. In contrast, knowledge concerning data flow and function of the program reflect deep knowledge which is an indication of a better understanding of the code" [39]. The study further noted the counter-finding that "While the traditional group performed best and the animation group worst in dealing with surface structure; the opposite was true for deeper levels of program knowledge" [39]. Turning to the second research example which Kuittinen & Sajaniemi [39] considered relevant to introductory programming, the relevance of design patterns to the wider CS curriculum has been noted by Astrachan et al. [4]. In doing so they recorded the debt owed to earlier work such as that of Soloway & Ehrlich [87], which "provided foundational material for the adoption of architectural patterns by academics and software practitioners". While subsequent CS educational research work may often make reference to the PC literature, frequently it consists of somewhat incidental acknowledgements (e.g. [5,8,46,69,92]).

Al-Imamy et al. [2] in 2006 and Wiedenbeck [99] in 1986, 20 years apart, have based their work upon the PC literature (Letovsky [42]; Shneiderman & Mayer [83]) and refer to the need to reduce the initial syntax learning burden for students: the former by using a template tool, the latter by placing an explicit emphasis on "continuous practice with basic materials to the point that they become overlearned" (Wiedenbeck [99]), thereby building a level of comfort upon which higher order skills may rely. In a recent study Mannila [51] found support for the idea that novices "tend to understand concepts in isolation", consistent with the distinction between program and domain model focus observed by Pennington [65], who noted the limited, program-model-only focus exhibited by novices. The work by Lister et al. [47], using the SOLO taxonomy [7] and noting the challenge for novices in distinguishing the "forest from the trees", resonates with those findings. Mannila [51] therefore recommends the use of regular "progress reports" so that students who tend to focus on writing difficulties may also note and record their difficulties with code comprehension, to enable their teachers to intervene constructively. Much work in the area of software visualization tools to support novice learning draws upon the PC literature (cf. [60]; [78]). As one example, Cross et al. [16] have reported upon the use of the GRASP tool, where visualizations of control structures and other views were aimed at supporting multiple models of PC: top-down models, bottom-up models, and mixed models. To conclude this section, a further area in CS education which has based its insights upon the PC literature is that of debugging. McCauley et al. [56] present a useful review of the debugging-related literature, asking such questions as why bugs occur, novice-expert differences and how we can improve the teaching and learning of debugging. They note the early work by Soloway & Ehrlich and Spohrer et al. where it was observed that "breakdowns between goals and plans led to many of the bugs found" [90]. Subsequent work by Rist [72] related schemas and plans to problem, deep and surface structures in learning. Ko & Myers [35] produced a framework of debugging-related activities. Robins et al. [74] conducted a detailed study of the typical categories of bugs encountered by novice programmers. Chmiel & Loui [12] and Robertson et al. [73] have called for debugging skills to be explicitly taught. As is evident from this brief survey, the PC literature has had some impact on subsequent CS education research, but it is limited to specific areas, or serves as a more general set of foundational reference sources.

9 Educational Inferences from the review

In this section we present inferences for computing education, based on the prior analysis of program comprehension.

9.1 Goals and content

An important issue is the question of teaching and learning goals in programming. Based on the perspective of program comprehension it seems that some goals are hidden or implicit, and should get more attention.

9.1.1 Goals

As seen from the perspective of PC, the overall goal is to develop a knowledge base that enables the learner to effectively comprehend code.

Therefore the following goals should be considered as useful or relevant for courses on (introductory) programming:

• Develop sub-conscious / automated chunking strategies or skills

• Skills to effectively navigate in the mental representation, and to be able to map and navigate to the corresponding external representation ([25], p. 78f)

• Skills to read program code: most PC models emphasize the importance of understanding the program based on the program code (i.e., reading); perhaps this should be explicitly emphasized in education as well

• Ability to extract different types of information from program text

• Ability to develop a holistic understanding

• Ability to cross-reference different key elements (as an aside: cross-referencing is, following Kintsch, the ability to reconstruct prior/later mental representations from the current representation – perhaps that is part of what makes programming so hard; some kind of de-chunking process seems to be invoked)

9.1.2 Content

With regard to content, we can match insights from PC with abstract descriptions of the content of programming courses. Requirements for learning programming are developed within the discussion on learning programming, e.g. by du Boulay [21]:

1. General orientation: What is the general idea of programs, what are they for and what can be done using them?

2. The notional machine: An abstract model of the machine when it executes programs (i.e. the meaning of the running program). The properties of the notional machine are language dependent.

3. Notation: The syntax and semantics of the programming language used.

4. Structures: (Abstract) solutions to standard problems, a structured set of related knowledge. (In terms of PC: rules of discourse / plans / beacons / patterns. According to Burkhardt et al. [10], plan knowledge includes complex, delocalized plans. This relates to knowledge of relationships between objects and messages passing between these objects, which may be encapsulated in design patterns and binary association patterns.)

5. Pragmatics: The skills of planning, developing, testing, debugging and so on.

The dimensions formatted in bold are those that seem important from the perspective of program comprehension. If the first dimension is interpreted as focussing on programs' goals, then it may also be seen as important.

Two additional dimensions should be considered. The first of these is domain, or domain knowledge. In the comprehension of programs, the domain of the program and the learners' knowledge with regard to that domain play an important role, but often seem to be overlooked in discussions about programming pedagogy. There are some exceptions: Bennedsen and Caspersen emphasize conceptual modelling of the problem domain [6], and some problem-solving approaches can also be seen as emphasizing domain knowledge (e.g. Deek et al. [17], quoted in [75]).

The second additional dimension might be conceptualized as belonging to pragmatics: here reading and comprehension strategies should be added. Effective strategies include, for example, reading in the sequence of the execution and using a cycle of generate-and-test hypotheses ([25], p. 78f).

9.2 Teaching and Learning Sequence

Program comprehension is a process, and might even be seen as a learning process. Mapping this process to educational questions regarding effective sequences in teaching and learning programming should provide some interesting insights. In order to do so, we distinguish between micro-sequences and macro-sequences. Micro-sequences focus on teaching and learning (~comprehending) a program, for example, implementing one sorting algorithm; the focus is on one example only. Macro-sequences focus on a course. In both cases, different sequences can be derived from the discussion of program comprehension processes.

In Figure 11, the sequences from a) to c) apply to both micro- and macro-sequences. Sequence a) draws on the overall bottom-up process at the atom level: a programmer needs to understand the atoms he/she is reading in order to proceed. Applied to a macro-sequence, this could lead to the idea of focusing on the text surface in introductory programming, at least at the beginning. However, b) and c) suggest alternatives: start with simple examples to train overall comprehension on all levels and dimensions, and then increase the complexity of examples (b); another option is to rely on domain knowledge and start with that (c).

Figure 11 Different teaching sequences
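Sequence b) can be illustrated at the micro level with a pair of examples of our own (in Python; both functions are invented for illustration): a minimal program that can already be discussed on every level and dimension, followed by a structurally richer one built from the same atoms and plans:

```python
# Step 1: a minimal example, discussable on all levels: text surface
# (loop syntax), program execution (tracing the accumulator), and
# function (computing a sum).
def total(values):
    result = 0
    for v in values:
        result += v
    return result

# Step 2: the same atoms and plans, with added structural complexity:
# a guard, an early return, and a derived quantity (the mean).
def mean(values):
    if not values:          # execution level: what happens for []?
        return None
    return total(values) / len(values)

print(total([1, 2, 3]), mean([1, 2, 3]))  # → 6 2.0
```

The second step reuses the plan of the first, so learners can practise cross-referencing between text surface, execution and function rather than learning new atoms.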

that a more integrated and correct answer is a more convincing demonstration that the student has understood the code, and demonstrates an ability to operate at a more abstract thinking level.

The interesting question is which of these different sequences are likely to generate stronger mental representations and knowledge bases? The discussion of PC models indicates that the mental representation has to be structured in such a way that crossreferencing is possible. In several models, experts – in contrast to novices – are likely to generate a mental representation that allows easy navigation, easy translation, and cross-referencing. As indicated by Figure 12, there is a complex network of relationships. Ignoring this fact may lead to isolated knowledge and fragmented understanding of the learners when comprehending programs.

Figure 13 Mapping the Block Model to models of learning sequences

This navigation between cells of the Block Model occurs predominantly on the block and relation levels where chunking and processes of linking between program and domain models takes place (see the models of Burkhardt et al., von Mayrhauser et al. and Fix et al.).

The analysis in Figure 13 indicates the commonalities in the three different models, and demonstrates how the transition from process to object thinking is critical as the shift from a bottom-up to a broader top-down thinking process takes place. For both the structure and function elements of the Block Model, it is apparent that the thinking process develops progressively. From 1) “interiorisation” when initial familiarization with a concept is gained, (here at the single element, “atomic” or unistructural level); to 2) a “block” level of thinking where concepts are condensed into chunks or routines, seen as components or parts of a bigger whole (e.g. trees within a forest); to 3) where a broader and deeper comprehension is achieved and “the whole forest” is seen [47]. This mapping suggests that the transition is applicable across all levels and elements of the Block Model. The resulting educational challenge is to support this transition in thinking through an appropriate set of teaching and learning strategies. The Block Model may help focus these strategies on specific elements or combinations.

Figure 12 cross-referencing / navigating between elements of the mental representation An open question is the number of elements on each side that a programmer needs in order to be able to cross-reference. Crossreferencing is a skill or habit which connects the different elements and enables a programmer to operate and utilize his/her own mental representation. Thus it may be that a teaching sequence that develops skills step by step may lead to isolated skills in which the mental representation becomes more and more detailed, but learners are not able to cross-reference.

9.3

Learning tasks and Teaching methods

Some learning tasks may be derived from empirical studies on PC. In those studies some tasks were used to assess comprehension. In our context these tasks might be useful instruments for assessments of learners, and inspire the development of learning tasks to train for the desired goals. Learning tasks are somewhat related to the representations used in teaching and learning. Certain ways or methods of presenting code to the learners, together with annotations can be seen as learning tasks. The question is, based on our review of PC: What kind of representations and tasks scaffold and reinforce learning?

In order to shed light on this issue, the discussion on learning processes might be helpful: For example, in mathematics education Sfard [81] has noted that “process or operational conceptions must precede the development of structural or object notions”, meaning that in order to understand a concept in mathematics one needs to be able to progress from the executing of a procedure by following a set of step by step instructions, to a deeper understanding of the underlying concept as a whole construct. This critical distinction in levels of comprehension has been termed the “process – object transition” [44]. The SOLO taxonomy by Biggs & Collis [73] has a similar notion in its “quantitative versus qualitative” phase distinction. Mapping these notions to the Block Model [80] using the terminology of Sfard and of SOLO, may demonstrate the applicability of educational models of comprehension which differ from those generated from the alternative research stream based upon psychology of programming perspectives e.g. [55]. Thompson [93] explains that “SOLO …is based on a quantitative measure (a change in the amount of detail learnt) and a qualitative measure (the integration of the detail into a structural pattern). The lower levels focus on quantity (the amount the learner knows) while the higher levels focus on the integration, the development of relationships between the details and other concepts outside the learning domain.” Applying the SOLO taxonomy then, students’ answers are classified according to the level of integration of thought that they demonstrate. The view is

Whalley & Robbins [98] give an overview of different learning and assessment tasks: the task types listed in Figure 14 provide variation not only in terms of teaching method, but also in terms of goals and content. Using the insights from program comprehension, it is possible to hypothesize about the particular characteristics of each type of task.
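As a concrete illustration, consider a small “fixed code” (tracing) question of the kind listed in Figure 14. The code and answer options here are our own hypothetical example, not taken from [98]:

```python
# Hypothetical "fixed code" (tracing) task: the student mentally executes
# the code and selects the printed result from the provided options.
values = [3, 1, 4, 1, 5]
largest = values[0]
for v in values:
    if v > largest:
        largest = v
print(largest)  # Options: A) 1   B) 3   C) 5   D) 14   -- correct: C
```

Answering such a question requires constructing a mental representation of the execution, not merely of the text surface.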

9.3.1 Parsons’ problems

Parsons’ problems, as an example, can be interesting hands-on tasks in which learners have to rearrange given pieces of code into the correct order. However, it can be surprisingly difficult to construct suitable Parsons’ problems [18]. We can hypothesize about the possible learning effects of such a task using the knowledge obtained on program comprehension. In a Parsons’ problem, code is presented in random order and has to be re-arranged into the correct, original order. Code pieces may vary in length.

Presenting code pieces at the length of one line often corresponds to atoms only – and as such they cannot be rearranged in a meaningful way. Such a task would be like re-arranging this article from its individual words. Thus larger pieces are needed. But what is the effect? Anecdotal evidence suggests that students are able to find the original order, but they do so without really understanding the program. The reason is that students can rely on their knowledge base and rearrange the code by using beacons: pieces containing input statements are likely to go to the top, output statements to the bottom, and so on. In other words, students can focus on the dimension of the text surface, without having to access or cross-reference execution and/or function. They also do not need to chunk.

However, this still might be a useful task when the goal is to teach or reinforce knowledge of text structure: typical patterns, beacons, or rules of discourse.

Figure 14: Task types, cited from [98]

Question type – Description or example

Fixed code – Requires the student to manually execute (trace through) some code and select, from a set of provided answers, the correct outcome or result [46].

Skeleton code – Requires the selection of the correct code, from a set of provided answers, which completes the provided “skeleton” code [46].

Change in logic – Questions where the student is given a code fragment and the solution is a code fragment that should give the same result but the logic of the algorithm has been altered (or reversed).

Change in representation – Questions where the student is given a code fragment and is asked to identify the same program in an alternative representation, or vice versa [97]. This question type draws upon Soloway’s extract questions (1986); e.g., given an algorithm in pseudo code or plain English, translate this logic into valid code in language X.

Code purpose – Given a code segment, explain the purpose of that piece of code in plain English ([6], [47], [97]).

Classification – Classify a number of code segments that are very similar by identifying the similarities and differences in structure and/or purpose.

Code refactoring – Given the original piece of code and several options (answers) for refactoring the code, critique the answers; or, given a piece of code, identify “smells” in the code by highlighting sections of it; or refactor the provided code (in the form of a class) by decomposing it into methods.

Parsons’ Puzzles [64] – Given the purpose of a snippet of code and the code snippet in which the lines of code are mixed up (out of order), the code statements must be rearranged into the correct order for the code to run successfully.

Code intent – From a test case or series of test cases, determine the intent, by explanation, by answering directed questions, or by writing the actual class(es) for which this test specifies the functional intent.
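To make this concrete, the following sketch (our own hypothetical example, not taken from the literature) shows a miniature Parsons’ problem together with a simple order checker. Note how surface beacons – initialisation first, output last, indentation for the loop body – already suggest the correct order without any deeper comprehension:

```python
# A miniature, hypothetical Parsons' problem: the learner must restore the
# original order of the shuffled fragments below.
fragments = [
    "print(total)",       # output statement: a beacon suggesting "last"
    "total = 0",          # initialisation: a beacon suggesting "first"
    "for n in numbers:",  # loop header
    "    total += n",     # indented loop body: must follow the header
]

# Indices of `fragments` in the original program order.
correct_order = [1, 2, 3, 0]

def check_solution(proposed):
    """Return True iff the proposed ordering restores the original program."""
    return list(proposed) == correct_order

# An ordering guided purely by text-surface beacons already succeeds,
# without the learner cross-referencing execution or function:
assert check_solution([1, 2, 3, 0])
assert not check_solution([0, 1, 2, 3])
```

This illustrates the argument above: the task can be solved on the text-surface dimension alone.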

This argumentation may be refuted, elaborated, debated, and empirically tested. However, the main point of the discussion of the possible effects of Parsons’ problems is that a PC perspective gives us a tool, or theory, to hypothesize about and discuss the learning effects of tasks by thinking about their effects on comprehension. Figure 15 depicts the general processes involved in such tasks: in the example discussed above, the external representation of code pieces presented in random order has to be translated into an external representation with the restored, original order. In doing so, students map code pieces to their mental representation – presumably with a focus on the first dimension, the text surface. Mental processes like cross-referencing, (de-)chunking, or mapping to specific elements of the mental representation do not seem to be involved.

[Figure 15: Overview of the mental processes related to comprehension tasks – translating between external representations; mapping; chunking and de-chunking; cross-referencing]

An educational conclusion from this work is the design of learning tasks that focus on navigation/mapping and on a layered structure – perhaps the design of a layered representation of aspects of the code could be useful [25].

10 Implications for further research

Many of the empirical studies concerning PC were conducted using thinking-aloud protocols. Varying the style of these studies might help to eliminate artifacts of the experimental design and to gain new insights into PC. Given that the program comprehension models upon which von Mayrhauser & Vans have built have been largely derived from studies of professional programmers, there is considerable scope for alternative studies using novices at differing levels of experience, including (e.g. longitudinal) studies on the development of comprehension skills.

Another interesting aspect for future studies is the modification of the external representation. Different program artifacts, like UML diagrams, and different styles of code (e.g. commented vs. uncommented) should be included. Furthermore, other empirical study designs should be considered. For example, the combination of different methods, like eye-tracking and retrospective thinking-aloud, seems very promising for a deeper understanding of human comprehension [95].

Differing research designs applying selected cells of the Block Model could prove illuminating – for instance, if students have knowledge only at the atom and block level, what supporting knowledge base elements are necessary to aid comprehension? What strategies do they use to become more familiar with a longer segment of code, or a whole program? If domain knowledge is primed in some way, does this priming improve performance on comprehension tasks? A dynamic model demonstrating the aspects and specific cells of the three-dimensional Block Model (incorporating a knowledge base) applied in a comprehension study would be an interesting development.

One direction is to further explore differences between novices and experts. This is usually not a focus within research on program comprehension – it is often only a minor or side issue – but it is of course very important from an educational perspective. For example, the novices in the study by Burkhardt et al. [10] were relatively experienced in programming, but new to OO programming. However, in recent years many students have been introduced to programming through an objects-first approach, and even where this is not the case most students are introduced to OO programming within the first year or two of study. There is scope for taking a more fine-grained approach to the empirical study of “novices”, taking into account how they have learned and how their comprehension develops as they progress through their educational career.

Other implications for further research are:

- Effects and specifics of code reading: One can assume that there are different strategies for code reading, e.g. reading sequentially or in the order of execution [25], strategies that focus on specific types of knowledge to be extracted [59], and so on. All of these strategies are likely to have effects on comprehension and learning. In particular, teaching suitable program reading strategies seems to be a neglected topic.

- A related issue is the effect of the specific design of reading tasks. For example, read-to-recall might have different effects than read-to-change [10].

- “Code purpose” questions, which ask the learner to provide a summary of the program, are also likely to depend on the details of the task and on reading strategies. In addition, they might require specific schemas to evaluate and interpret the answers. There are different schemas for analysing such program summaries (see e.g. [28,44,52,97]). This can also be an issue for further research.

- The focus is often on the content of the mental representation: is the suitable/needed information included? But another important issue relates to the abstract features of the mental representation, as opposed to the types of knowledge extracted or inferred from the code. These features concern the connections between elements of the mental representation, the ability to access these elements when needed, and the ability to explicitly map different elements onto each other, including mapping to the external code base. In this article we called these features “navigability”. Specific learning tasks are needed to learn and reinforce such skills.

- Tasks supporting the development of a domain model are also needed. Learning activities which involve simply reading example code for recall, or to summarise or document it, do not help learners to develop domain models. Learning activities which involve modifying or reusing the example to meet different requirements may be particularly useful in enabling learners to develop a domain model.

11 Conclusion

During this analysis and the drawing of educational inferences, a wide range of ideas and suggestions for teaching and learning programming was obtained. It is difficult to summarize these ideas and suggestions in a conclusion, but we can highlight the critical aspects with some key points:

- The role of the domain and of domain knowledge in comprehending programs seems to be underestimated in pedagogy.

- There are many possible learning tasks for reading and comprehending programs; see e.g. Figure 14, as well as the instruments used in the different empirical studies (see section 6). These instruments might be useful as tasks for learning or as diagnostic instruments for testing learning outcomes.

- Experts’ mental representations are more than the sum of the elements and information obtained from reading the program text. Instead, experts have a flexible and navigable mental representation. This issue is seldom reflected in the discussion on teaching and learning programming.

- We need to develop all views of a program – dynamic, object, functional – and specifically for OO we need to consider object relations and dynamic object interactions.

The revised Block Model (see section 7) leads us to conjecture a process of knowledge acquisition by novice programmers that may be likened to the process of sewing a patchwork quilt. Each cell in the model can be thought of as a group of patchwork squares, and each layer of the knowledge base behind those squares as the stuffing. The process of knowledge acquisition consists of gradually sewing in the squares and padding up the stuffing until a robust, coherent, cross-referenced and internally consistent model of comprehension is established. Novice programmers’ early comprehension models can thus be characterized by a pattern of “holey knowledge” (i.e. an incomplete patchwork of fabric, with empty cells and missing stuffing). This view of the program comprehension process resonates with the notion of “fragile knowledge” identified by [68]. How long it takes to fill this framework and build a robust comprehension model is a matter of conjecture, but Winslow (1996) (cited in [75]) noted that it “takes about 10 years of experience to turn a novice into an expert programmer”. Intuitive CS educator views suggest a year of consolidation (through the CS1, CS2 sequence) to get novices to a basic level of program comprehension. Perhaps the final lesson for CS educators from this analysis of PC models is that we may need to be more patient with our students and acknowledge that some good things take time.

12 References

[1] Abbas, N. Properties of “Good” Java Examples. Umeå’s 13th Student Conference in Computer Science, 1-17.
[2] Al-Imamy, S., Alizadeh, J. et al. 2006. On the Development of a Programming Teaching Tool: The Effect of Teaching by Templates on the Learning Process. Journal of Information Technology Education. 5 (2006), 271-283.
[3] Arisholm, E., Gallis, H. et al. 2007. Evaluating Pair Programming with Respect to System Complexity and Programmer Expertise. IEEE Transactions on Software Engineering. 33, 2 (2007), 65-86.
[4] Astrachan, O., Mitchener, G. et al. 1998. Design patterns: an essential component of CS curricula. Proceedings of the Twenty-Ninth SIGCSE Technical Symposium on Computer Science Education (Atlanta, Georgia, United States, 1998), 153-160.
[5] Baldwin, L.P. and Macredie, R.D. 1999. Beginners and programming: insights from second language learning and teaching. Education and Information Technologies. 4, 2 (1999), 167-179.
[6] Bennedsen, J. and Caspersen, M. 2008. Model-Driven Programming. Reflections on the Teaching of Programming. J. Bennedsen, M. Caspersen, et al., eds. Springer Berlin/Heidelberg. 116-129.
[7] Biggs, J.B. and Collis, K.F. 1982. Evaluating the Quality of Learning: The SOLO Taxonomy (Structure of the Observed Learning Outcome). (1982).
[8] Börstler, J., Hall, M.S. et al. 2009. An evaluation of object oriented example programs in introductory programming textbooks. SIGCSE Bull. 41, 4 (2009), 126-143.
[9] Brooks, R. 1983. Towards a theory of the comprehension of computer programs. International Journal of Man-Machine Studies. 18, 6 (1983), 543-554.
[10] Burkhardt, J., Détienne, F. et al. 2002. Object-oriented program comprehension: Effect of expertise, task and phase. Empirical Software Engineering. 7, 2 (2002), 115-156.
[11] Chen, T., Monge, A. et al. 2006. Relationship of early programming language to novice generated design. SIGCSE Bull. 38, 1 (2006), 495-499.
[12] Chmiel, R. and Loui, M.C. 2004. Debugging: From novice to expert. ACM SIGCSE Bulletin. 36, 1 (2004), 17-21.
[13] Clear, T. 2005. Comprehending large code bases – the skills required for working in a “brown fields” environment. ACM SIGCSE Bulletin. 37, 2 (2005), 12-14.
[14] Corritore, C. and Wiedenbeck, S. 1991. What do novices learn during program comprehension? International Journal of Human-Computer Interaction. 3, 2 (1991), 199-222.
[15] Corritore, C.L. and Wiedenbeck, S. 2001. An exploratory study of program comprehension strategies of procedural and object-oriented programmers. International Journal of Human-Computer Studies. 54, 1 (2001), 1-23.
[16] Cross, J.H., Hendrix, T.D. et al. 1999. Software visualization and measurement in software engineering education: An experience report. Frontiers in Education Conference (1999), 12B1/5-12B1/10.
[17] Deek, F.P., Kimmel, H. et al. 1998. Pedagogical changes in the delivery of the first-course in computer science: Problem solving, then programming. Journal of Engineering Education. 87, 3 (1998), 313-320.
[18] Denny, P., Luxton-Reilly, A. et al. 2008. Evaluating a new exam question: Parsons problems. Proceeding of the Fourth International Workshop on Computing Education Research (Sydney, Australia, 2008), 113-124.
[19] Détienne, F. 1996. What model(s) for program understanding? arXiv preprint cs/0702004 (1996).
[20] Douce, C. 2008. The Stores Model of Code Cognition. 20th Psychology of Programming Interest Group, Lancaster University, September 2008. (2008).
[21] Du Boulay, B. 1986. Some difficulties of learning to program. Journal of Educational Computing Research. (1986), 57-73.
[22] Engle, R.W., Tuholski, S.W. et al. 1999. Working memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology: General. 128, 3 (1999), 309-331.
[23] Exton, C. 2002. Constructivism and Program Comprehension Strategies. International Conference on Program Comprehension (Los Alamitos, CA, USA, 2002), 281.
[24] Fincher, S., Petre, M. et al. 2004. A multi-national, multi-institutional study of student-generated software. Kolin Kolistelut – Koli Calling Proceedings. (2004), 20-28.
[25] Fix, V., Wiedenbeck, S. et al. 1993. Mental representations of programs by novices and experts. Proceedings of the INTERACT ’93 and CHI ’93 Conference on Human Factors in Computing Systems (Amsterdam, The Netherlands, 1993), 74-79.
[26] Fjuk, A., Holmboe, C. et al. 2006. Contextualizing object-oriented learning. Comprehensive Object-Oriented Learning: The Learner’s Perspective. A. Fjuk, A. Karahasanović, et al., eds. Santa Rosa: Informing Science Press. 11-26.
[27] Gerdt, P. and Sajaniemi, J. 2006. A web-based service for the automatic detection of roles of variables. SIGCSE Bull. 38, 3 (2006), 178-182.
[28] Good, J. and Brna, P. 2004. Program comprehension and authentic measurement: a scheme for analysing descriptions of programs. International Journal of Human-Computer Studies. 61, 2 (2004), 169-185.
[29] Goodyear, P. 1987. Sources of difficulty in assessing the cognitive effects of learning to program. Journal of Computer Assisted Learning. 3, 4 (1987), 214-223.
[30] Hundhausen, C.D., Brown, J.L. et al. 2006. A methodology for analyzing the temporal evolution of novice programs based on semantic components. Proceedings of the Second International Workshop on Computing Education Research (Canterbury, United Kingdom, 2006), 59-71.
[31] Jones, S.J. and Burnett, G.E. 2007. Spatial skills and navigation of source code. Proceedings of the 12th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education (Dundee, Scotland, 2007), 231-235.
[32] Karahasanović, A., Levine, A.K. et al. 2007. Comprehension strategies and difficulties in maintaining object-oriented systems: An explorative study. J. Syst. Softw. 80, 9 (2007), 1541-1559.
[33] Kintsch, W. 1998. Comprehension: A Paradigm for Cognition. Cambridge University Press.
[34] Kintsch, W. and van Dijk, T.A. 1978. Toward a model of text comprehension and production. Psychological Review. 85, 5 (1978), 363-394.
[35] Ko, A. and Myers, B. 2005. A framework and methodology for studying the causes of software errors in programming systems. Journal of Visual Languages & Computing. 16, 1-2 (2005), 41-84.
[36] Kotze, P., Renaud, K. et al. 2008. Don’t do this – Pitfalls in using anti-patterns in teaching human-computer interaction principles. Computers & Education. 50, 3 (2008), 979-1008.
[37] Kroes, P. 1998. Technological Explanations: The Relation between Structure and Function of Technological Objects. Techné: Journal of the Society for Philosophy and Technology. 3, 3 (1998), 18-34.
[38] Kuittinen, M. and Sajaniemi, J. 2004. Teaching roles of variables in elementary programming courses. Proceedings of the 9th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education (Leeds, United Kingdom, 2004), 57-61.
[39] Kuittinen, M. and Sajaniemi, J. 2004. Teaching roles of variables in elementary programming courses. Proceedings of the 9th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education – ITiCSE ’04 (Leeds, United Kingdom, 2004), 57.
[40] LaToza, T.D., Garlan, D. et al. 2007. Program comprehension as fact finding. Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (2007), 361-370.
[41] de Lemos, M.A. and de Barros, L.N. 2003. A didactic interface in a programming tutor. Proceedings of the 11th International Conference on Artificial Intelligence in Education (AIED 2003) (2003).
[42] Letovsky, S. 1986. Cognitive processes in program comprehension. Papers Presented at the First Workshop on Empirical Studies of Programmers (Washington, D.C., United States, 1986), 58-79.
[43] Letovsky, S. 1987. Cognitive processes in program comprehension. Journal of Systems and Software. 7, 4 (Dec. 1987), 325-339.
[44] Lister, R., Clear, T. et al. 2009. Naturally occurring data as research instrument: analyzing examination responses to study the novice programmer. SIGCSE Bull. 41, 4 (2009), 156-173.
[45] Lister, R., Schulte, C. et al. 2006. Research perspectives on the objects-early debate. ACM SIGCSE Bulletin. 38, 4 (2006), 146-165.
[46] Lister, R., Seppälä, O. et al. 2004. A multi-national study of reading and tracing skills in novice programmers. ACM SIGCSE Bulletin. 36, 4 (2004), 119-150.
[47] Lister, R., Simon, B. et al. 2006. Not seeing the forest for the trees. ACM SIGCSE Bulletin. 38, 3 (2006), 118-122.
[48] Littman, D.C., Pinto, J. et al. 1986. Mental models and software maintenance. Papers Presented at the First Workshop on Empirical Studies of Programmers (Washington, D.C., United States, 1986), 80-98.
[49] Lopez, M., Whalley, J. et al. 2008. Relationships between reading, tracing and writing skills in introductory programming. Proceeding of the Fourth International Workshop on Computing Education Research – ICER ’08 (Sydney, Australia, 2008), 101-112.
[50] Mannila, L. 2006. Progress reports and novices’ understanding of program code. Proceedings of the 6th Baltic Sea Conference on Computing Education Research: Koli Calling 2006 (Uppsala, Sweden, 2006), 27-31.
[51] Mannila, L. 2007. Novices’ progress in introductory programming courses. Informatics in Education. 6, 1 (2007), 139-152.
[52] von Mayrhauser, A. and Lang, S. 1999. A coding scheme to support systematic analysis of software comprehension. IEEE Transactions on Software Engineering. 25, 4 (Aug. 1999), 526-540.
[53] von Mayrhauser, A. and Vans, A.M. 1994. Program Understanding – A Survey. Colorado State University Computer Science Technical Report CS94-120. (1994).
[54] von Mayrhauser, A. and Vans, A.M. 1995. Program comprehension during software maintenance and evolution. Computer. 28, 8 (Aug. 1995), 44-55.
[55] von Mayrhauser, A. and Vans, A.M. 1996. Identification of dynamic comprehension processes during large scale maintenance. IEEE Transactions on Software Engineering. 22, 6 (1996), 424-437.
[56] McCauley, R., Fitzgerald, S. et al. 2008. Debugging: a review of the literature from an educational perspective. Computer Science Education. 18, 2 (2008), 67-92.
[57] McCracken, W.M. 2002. Models of designing: understanding software engineering education from the bottom up. 15th Conference on Software Engineering Education and Training (CSEET’02) (2002), 0055.
[58] Mead, J., Gray, S. et al. 2006. A cognitive approach to identifying measurable milestones for programming skill acquisition. SIGCSE Bull. 38, 4 (2006), 182-194.
[59] Mosemann, R. and Wiedenbeck, S. 2001. Navigation and comprehension of programs by novice programmers. 9th International Workshop on Program Comprehension (IWPC’01) (2001), 79-88.
[60] Naps, T., Cooper, S. et al. 2003. Evaluating the educational impact of visualization. Working Group Reports from ITiCSE on Innovation and Technology in Computer Science Education (Thessaloniki, Greece, 2003), 124-136.
[61] Nevalainen, S. and Sajaniemi, J. 2006. An experiment on short-term effects of animated versus static visualization of operations on program perception. Proceedings of the Second International Workshop on Computing Education Research (Canterbury, United Kingdom, 2006), 7-16.
[62] O’Brien, M. 2001. Software Comprehension – A Review and Research Direction. Technical Report 2003 (2001), 176-185.
[63] O’Brien, M.P., Buckley, J. et al. 2004. Expectation-based, inference-based, and bottom-up software comprehension. Journal of Software Maintenance and Evolution: Research and Practice. 16, 6 (2004), 427-447.
[64] Parsons, D. and Haden, P. 2006. Parson’s programming puzzles: a fun and effective learning tool for first programming courses. ACE ’06: Proceedings of the 8th Australasian Conference on Computing Education (Darlinghurst, Australia, 2006), 157-163.
[65] Pennington, N. 1987. Comprehension strategies in programming. Empirical Studies of Programmers: Second Workshop. E. Soloway, G. Olson, et al., eds. Ablex Publishing Corp. 100-113.
[66] Pennington, N. 1987. Stimulus structures and mental representations in expert comprehension of computer programs. Cognitive Psychology. 19, 3 (1987), 295-341.
[67] Perkins, D.N. 1986. Conditions of Learning in Novice Programmers. Journal of Educational Computing Research. 2, 1 (1986), 37-55.
[68] Perkins, D.N. and Martin, F. 1986. Fragile knowledge and neglected strategies in novice programmers. Empirical Studies of Programmers. E. Soloway and S. Iyengar, eds. Ablex Publishing Corp. 213-229.
[69] de Raadt, M., Watson, R. et al. 2006. Chick sexing and novice programmers: explicit instruction of problem solving strategies. Proceedings of the 8th Australasian Conference on Computing Education – Volume 52 (2006), 55-62.
[70] Rajlich, V. 2002. Program Comprehension as a Learning Process. Proceedings of the 1st IEEE International Conference on Cognitive Informatics (2002), 343-350.
[71] Ramalingam, V., LaBelle, D. et al. 2004. Self-efficacy and mental models in learning to program. Proceedings of the 9th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education (Leeds, United Kingdom, 2004), 171-175.
[72] Rist, R.S. 2004. Learning to Program: Schema Creation, Application, and Evaluation. Computer Science Education Research. S. Fincher and M. Petre, eds. Routledge. 175-198.
[73] Robertson, T.J., Prabhakararao, S. et al. 2004. Impact of interruption style on end-user debugging. Proceedings of the 2004 Conference on Human Factors in Computing Systems – CHI ’04 (Vienna, Austria, 2004), 287-294.
[74] Robins, A., Haden, P. et al. 2006. Problem distributions in a CS1 course. Eighth Australasian Computing Education Conference (ACE2006) (Hobart, Australia, 2006), 165-173.
[75] Robins, A., Rountree, J. et al. 2003. Learning and teaching programming: A review and discussion. Computer Science Education. 13, 2 (2003), 137-172.
[76] Sajaniemi, J. and Kuittinen, M. From procedures to objects: What have we (not) done. Proceedings of the 19th Annual Workshop of the Psychology of Programming Interest Group, 86-100.
[77] Sajaniemi, J. and Kuittinen, M. 1999. Three-level teaching material for computer-aided lecturing. Comput. Educ. 32, 4 (1999), 269-284.
[78] Sajaniemi, J., Kuittinen, M. et al. 2007. A study of the development of students’ visualizations of program state during an elementary object-oriented programming course. Proceedings of the Third International Workshop on Computing Education Research (Atlanta, Georgia, USA, 2007), 1-16.
[79] Sajaniemi, J., Kuittinen, M. et al. 2008. A study of the development of students’ visualizations of program state during an elementary object-oriented programming course. J. Educ. Resour. Comput. 7, 4 (2008), 1-31.
[80] Schulte, C. 2008. Block Model: an educational model of program comprehension as a tool for a scholarly approach to teaching. Proceeding of the Fourth International Workshop on Computing Education Research (Sydney, Australia, 2008), 149-160.
[81] Sfard, A. 1991. On the dual nature of mathematical conceptions: Reflections on processes and objects as different sides of the same coin. Educational Studies in Mathematics. 22, 1 (1991), 1-36.
[82] Shaft, T.M. and Vessey, I. 2006. The role of cognitive fit in the relationship between software comprehension and modification. MIS Quarterly. 30, 1 (2006), 29-55.
[83] Shneiderman, B. and Mayer, R. 1979. Syntactic/semantic interactions in programmer behavior: A model and experimental results. International Journal of Computer & Information Sciences. 8, 3 (1979), 219-238.
[84] Simon, B., Fitzgerald, S. et al. 2007. Debugging assistance for novices: a video repository. SIGCSE Bull. 39, 4 (2007), 137-151.
[85] Sims-Knight, J.E. and Upchurch, R.L. 1993. Teaching Object-Oriented Design Without Programming: A Progress Report. Computer Science Education. 4, 1 (1993), 135-156.
[86] Soloway, E., Adelson, B. et al. 1988. Knowledge and Processes in the Comprehension of Computer Programs. The Nature of Expertise. M. Chi, R. Glaser, et al., eds. Lawrence Erlbaum Associates. 129-152.
[87] Soloway, E. and Ehrlich, K. 1984. Empirical studies of programming knowledge. IEEE Transactions on Software Engineering. 10, 5 (1984), 595-609.
[88] Soloway, E. and Spohrer, J.C. 1989. Studying the Novice Programmer. L. Erlbaum Associates Inc.
[89] Sorva, J., Karavirta, V. et al. 2007. Roles of variables in teaching. Journal of Information Technology Education. 6 (2007), 407-423.
[90] Spohrer, J.C., Soloway, E. et al. 1985. A goal/plan analysis of buggy Pascal programs. Hum.-Comput. Interact. 1, 2 (1985), 163-207.
[91] Storey, M. 2006. Theories, tools and research methods in program comprehension: past, present and future. Software Quality Journal. 14, 3 (2006), 187-208.
[92] Tenenberg, J., Fincher, S. et al. 2005. Students designing software: a multi-national, multi-institutional study. Informatics in Education. 4, 1 (2005), 143-162.
[93] Thompson, E.L. 2008. How do they understand? Practitioner perceptions of an object-oriented program. Massey University, New Zealand.
[94] Thompson, E. 2006. Using a subject area model as a learning improvement model. Proceedings of the 8th Australasian Conference on Computing Education – Volume 52 (Hobart, Australia, 2006), 197-203.
[95] Uwano, H., Nakamura, M. et al. 2007. Exploiting Eye Movements for Evaluating Reviewer’s Performance in Software Review. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 90, 10 (2007), 2290-2230.
[96] Vagianou, E. 2006. Program working storage: a beginner’s model. Proceedings of the 6th Baltic Sea Conference on Computing Education Research: Koli Calling 2006 (Uppsala, Sweden, 2006), 69-76.
[97] Whalley, J.L., Lister, R. et al. 2006. An Australasian study of reading and comprehension skills in novice programmers, using the Bloom and SOLO taxonomies. ACE ’06: Proceedings of the 8th Australasian Conference on Computing Education (Darlinghurst, Australia, 2006), 243-252.
[98] Whalley, J.L. and Robbins, P. 2007. Report on the fourth BRACElet workshop. Bulletin of Applied Computing and Information Technology. 5, 1 (Jun. 2007).
[99] Wiedenbeck, S. 1986. Beacons in computer program comprehension. International Journal of Man-Machine Studies. 25, 6 (1986), 697-709.
[100] Wiedenbeck, S. 2005. Factors affecting the success of non-majors in learning to program. Proceedings of the First International Workshop on Computing Education Research (Seattle, WA, USA, 2005), 13-24.
[101] Winslow, L.E. 1996. Programming pedagogy – a psychological overview. SIGCSE Bull. 28, 3 (1996), 17-22.
[102] Wirth, N. 1971. Program development by stepwise refinement. Commun. ACM. 14, 4 (1971), 221-227.