Software Architecture and Quality in Open-Source Development

DRAFT – Please, do not quote or distribute without permission

Can We Predict the Generation of Bugs? Software Architecture and Quality in Open-Source Development

Manuel Sosa
Technology and Operations Management Area, INSEAD, Fontainebleau, France
[email protected]

Jürgen Mihm
Technology and Operations Management Area, INSEAD, Fontainebleau, France
[email protected]

Tyson Browning
Department of Information System and Supply Chain Management, Neeley School of Business, Texas Christian University, Fort Worth, Texas, US
[email protected]

Abstract

We study how software architecture relates to quality. Based on a software architecture representation that accounts for not only the hierarchical arrangement of its subsystems and components but also their dependency structure, we formally define the notion of system cyclicality. System cyclicality is an architectural property that captures the fraction of mutually interdependent components in a system. By examining multiple versions (126 releases in total) of 20 open-source, Java-based applications developed by the Apache Foundation, we empirically analyze the relationship between software architecture characteristics and the creation of bugs. Our results suggest that, while controlling for various system characteristics, system cyclicality is a key determinant of bug creation. Interestingly, we found evidence that it is not just the cycles themselves, but how hidden they are, that drive the effects of system cyclicality on bug creation. From an academic viewpoint, this work provides a theoretical and empirical basis for a causal link between the architecture of a complex system and its quality. This has important implications for the management of complex system design and development in fast-paced industries such as software. Our results suggest that managers could benefit from proactively examining the architecture of the system they develop and monitoring its cyclicality as one of their strategies to mitigate the creation of defects.

1 Introduction

Previous work has studied the implications of system architecture decisions for various aspects of
the firm (e.g., Baldwin and Clark 2000; Ulrich 1995; Yassine and Wissmann 2007). However, little attention has been devoted to understanding the link between product architecture characteristics and product performance. How does the architecture of a system influence its (conformance) quality? More specifically, to which features of the system architecture should managers pay attention during the development process to minimize the emergence of defects? To study the relationship between system architecture and quality, we examine the development of software applications—for several reasons: they are complex, they exhibit fast change rates (like fruit flies in studies of biological evolution), and they offer (through their source code) an efficient, reliable, and standardized medium to capture the architecture of their design. In addition, software applications typically have centralized repositories to reliably track the quality issues associated with each release. Limited research in both the engineering design and computer science communities has addressed the link between system architecture and quality. On the engineering design side, researchers have studied the architecture of complex products to explore how the direct and indirect connections among product components influence the propagation of design changes. This line of research has suggested that change propagation can cause rework and long development times due to the unpredictable nature of design changes propagating not only between directly connected components but also indirectly through intermediary components (Clarkson et al. 2004; Eckert et al. 2004; Sosa et al. 2007b). More recently, Gokpinar et al. (2007) found that the connectivity of an automobile’s subsystems and the extent to which their interfaces are managed significantly affected subsystems’ conformance quality. 
The literature on computer science and information systems has also examined the structure of software systems and related it to performance issues such as the time required to implement changes (Cataldo et al. 2006), the difference between open-source and closed-source development (MacCormack et al. 2006), and the factors that lead to refactoring of the source code (see Mens and Tourwé 2004 for a review). However, previous work has not yet clearly linked the features of a system’s architecture to the generation of defects in the system. That is the focus of this paper. By integrating methods used in engineering design to analyze product and process
architectures with methods from information systems on the determinants of defect proneness, we identify architectural properties that are likely to be associated with the risk of creating defects in software applications. From a user’s perspective, software applications provide certain functionality and capability. As long as the application provides these reliably and efficiently, the user is generally satisfied. However, that is rarely the case. Users typically uncover “bugs” by testing and using the software application. From a designer’s standpoint, there are many alternative ways for the software to provide the specified functionality. Designers or architects must determine how to allocate the software’s functions to its various components or groups thereof, called subsystems (or modules). Architects must also determine how the software system will be organized in terms of command and control, utilities, and other supporting infrastructure of components and subsystems. These choices determine the nature and extent of the relationships between the components and subsystems of any version of the software application, and they affect not only the ease with which the components and subsystems can be successfully modified in successive versions (MacCormack et al. 2006; Parnas 1972; Sosa et al. 2007a) but also the risk of introducing undesired functionality into the system (Jones 2000; Koru and Tian 2004). A system’s conformance quality is defined by its ability to meet its design specifications. Hence, system defects (called bugs in software applications) are identified when the system does not perform as specified. To meet system requirements a hardware product is designed and then produced. In software development, this is equivalent to architecting and writing the source code and then compiling it to transform it into an executable application that will ultimately deliver the specified functionality. Conformance quality is measured over the final product. 
Hence, defects or bugs may be uncovered by testers prior to the application’s release and by users post-release. It is important to note here that many bugs are probably uncovered and corrected during the development phase (most likely during the compiling of code), similar to design rework in the late phases of product development, in which the product is tested and prepared for production ramp-up. Although these pre-release bugs are important to manage due to the amount of design rework they typically involve, the focus of this paper is on the bugs that elude the development team and get released with the application, to be uncovered later by the users.

In software development, two basic concepts characterize a “good” design structure (Stevens et al. 1974): coupling and cohesion. Cohesion refers to the internal consistency of each software component, while coupling pertains to the strength of the connections between components. A good source code design maximizes cohesion and minimizes coupling. This principle suggests that increased coupling (greater connectivity among components) is likely to have a negative impact on the quality of a system (Koru et al. 2007; MacCormack et al. 2006). In this paper, we argue that it is not the amount of coupling but rather the fraction of components involved in cyclical dependencies that is positively associated with the generation of bugs. To test this proposition, we examined 126 releases representing multiple generations of 20 distinct open-source applications developed by the Apache Foundation.

This paper is structured as follows. Section 2 discusses the software architecture representation schemes and metrics that facilitate our study, along with the theoretical framework and hypotheses. Section 3 describes the empirical study carried out to test our hypotheses. Section 4 describes the analysis and results. Section 5 concludes by discussing the academic and managerial implications of this work.

2 Linking Software Architecture and the Creation of Bugs

2.1 The Architecture of Complex Systems

The architecture of a designed system (either hardware or software) is determined during its design process through both decomposition and integration (Alexander 1964; Simon 1981). Establishing the architecture of a (hardware) system includes breaking it down into functional and physical elements, mapping the functional elements onto the physical elements, and specifying the interfaces among the elements (Ulrich 1995; Ulrich and Eppinger 2007). Simon (1981) suggested that complex systems should be designed as hierarchical structures consisting of “nearly
decomposable systems,” with strong interfaces within subsystems and weak interfaces across subsystems. This is consistent with the independence axiom of axiomatic design, which suggests the decoupling of functional and physical elements of a product (Suh 2001), as well as with the notion of modularity, which suggests the creation of options that enable the evolution of designs (Baldwin and Clark 2000). Hence, designers typically decompose complex hardware and software systems into subsystems and components to facilitate their design. Yet, such subsystems and components must then be integrated to ensure that the product functions as a whole. Previous research in engineering design has developed methods to analyze the architecture of complex products by studying how their components interact to provide system functionality. More specifically, this stream of research has modeled products as collections of interdependent components and has developed methods to cluster components with similar dependencies into subsystems (modules) (Browning 2001; Lai and Gershenson 2006; Pimmler and Eppinger 1994) and analyzed how component connectivity patterns relate to organizational and design decisions (Sosa et al. 2003; Sosa et al. 2004; Sosa et al. 2007b). Similarly, to analyze the architecture of a software system, we examine its source code because it codifies the design of the system. Analogous to hardware products, the architecture of a software application is the scheme by which its functional elements are codified into objects in the source code and the way in which these objects interact and are grouped into subsystems and layers (e.g., Parnas 1972; Parnas 1979; Shaw and Garlan 1996). The dependency structure of the source code (i.e., the way in which its components exchange information) specifies the system’s functionality precisely. In that sense, the source code captures the “process” (or “recipe”) that determines how the system works. 
Recognizing this is important because it leads us to depart from the methods used to analyze hardware product architectures. Contrary to previous work focused on analyzing the architecture of complex software applications (MacCormack et al. 2006), we consider the software architecture as a process that specifies precisely how the objects comprising the system will interact over time to provide its functionality. In contrast, hardware product architectures have traditionally been analyzed by considering them as a collection of static physical elements and dependencies (Pimmler and Eppinger 1994; Sosa et al. 2007b). Because software architectures describe the process by which the components of the system interact to provide its functionality, we use alternative analytical techniques that have traditionally been used to analyze iterative processes such as new product development (see Browning 2001 for a review). A process architecture adds a time dimension to the elements and relationships in a system (Browning and Eppinger 2002). While software components execute much more quickly than a project’s activity network, they nevertheless execute in finite run time, with the dependencies between the elements of the source code determining the order of the actions performed by the components defined by the source code.

2.2 Architectural Representations of Software Systems

The source code of a software application consists of a collection of connected components organized into subsystems, which are in turn grouped into levels and layers (Sangal et al. 2005; Shaw and Garlan 1996). To explore the features and effects of a system’s architecture, we need to
understand these arrangements more specifically. It will clarify our discussion to refer to an example, one of the software applications we studied, version 1.3 of Ant. Traditionally, designers have represented the architecture of their source codes with block diagrams such as Figure 1, which depicts the system’s decomposition into subsystems. We define a subsystem as a set of components and/or other subsystems, where the presence of a block inside another represents a “parent” to “child” relationship in the decomposition. For example, at the first level of decomposition, Ant version 1.3 consists of four subsystems: “taskdefs,” “types,” “util,” and “*” (where “*” stands for a group of miscellaneous components). The subsystem “taskdefs” in turn breaks down into two lower-level subsystems, as does the subsystem “util.” Each of the subsystems shown in Figure 1 is ultimately a collection of components (which are not shown individually). Thus, the “z-axis” of the figure (coming out of the page) relates to the level (or depth) of decomposition. For example, the first level is formed by the four subsystems immediately below the root node (level 0), and the architecture has two levels because two of the subsystems in the first level, “taskdefs” and “util,” contain other subsystems (“compilers” and “regexp,” respectively).

Figure 1: Architecture block diagram for version 1.3 of the Apache Ant application
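The decomposition shown in the block diagram can be sketched as a nested mapping. The Python fragment below is only an illustration of the level-counting logic, not the tooling used in the study: the names `ANT_1_3`, `decomposition_levels`, and `subsystems_at_level` are hypothetical, and the name of the second child of “util” is an assumption, since the text names only “regexp”.

```python
# Hypothetical encoding of the Ant 1.3 decomposition described in the text.
# Each key is a subsystem; its value holds the subsystems nested inside it.
# Individual components (classes) are omitted, as in the block diagram.
ANT_1_3 = {
    "taskdefs": {"compilers": {}, "*": {}},
    "types": {},
    "util": {"regexp": {}, "*": {}},  # second child name is an assumption
    "*": {},
}


def decomposition_levels(tree):
    """Return the number of levels of decomposition below the root (level 0)."""
    if not tree:
        return 0
    return 1 + max(decomposition_levels(child) for child in tree.values())


def subsystems_at_level(tree, level):
    """List the subsystem names found at the given level (level 1 = top)."""
    if level == 1:
        return list(tree)
    return [
        name
        for child in tree.values()
        for name in subsystems_at_level(child, level - 1)
    ]


print(decomposition_levels(ANT_1_3))    # levels of decomposition
print(subsystems_at_level(ANT_1_3, 1))  # first-level subsystems
```

Under this encoding, the first level holds the four subsystems named in the text, and the tree has two levels because “taskdefs” and “util” contain further subsystems.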

The vertical layout (“y-axis”) of the subsystems and components in the block diagram shown in Figure 1 is also meaningful. Subsystems and components located at the bottom of the diagram are intended to serve the subsystems above. These layers are defined by the system architect’s design rules (Baldwin and Clark 2000; Sangal et al. 2005). Software is often architected in layers to provide a coherent command and control structure, such that components in higher layers can call (depend on) components in lower layers. The opposite situation, where lower-layer components depend on higher-layer ones, is possible but undesirable, for reasons we will discuss below. Thus, the block diagram conveys information about both the levels of decomposition (which relate to the physical structure of the code) and the layers of intended dependencies (which relate to the process structure of the code).

Note that where one chooses to end the decomposition and declare the lowest level is the modeler’s choice. In our analysis, we stop at the “class” level,¹ although we could go down further to the level of methods and data members and eventually even to lines of code. However, three main arguments led to our choice to decompose to the class level. First, classes tend to provide a set of common functionality (e.g., a set of low-level mathematical functions) that is maintained as one cohesive piece of software, often in a single source file by a single author. Second, the main attributes of the architecture become apparent by the class level, so further decomposition would only obscure these insights. Third, this level of decomposition is consistent with previous work focused on representing software architectures (e.g., MacCormack et al. 2006; Sangal et al. 2005). Thus, for the purposes of our analysis, we treat each Java class as an atomic component of the software architecture.

¹ Our dataset contains only Java applications, wherein files and classes are typically the same, except for “inner classes” (classes within classes), which we do not consider explicitly.

Although block diagram representations capture the hierarchical organization of components and subsystems in layers, they do not show the dependencies between components. However, as we will discuss below, determining the architectural properties that influence bug creation makes it imperative to consider both the components’ dependencies and their hierarchical organization. Dependencies among software components are formed by the “calls” made by one component to another.² To
represent these, both within and across subsystems and layers, we use a design structure matrix (DSM) representation (Browning 2001). A DSM is a square matrix of size n (where n is the number of elements) whose diagonal cells represent system elements and whose off-diagonal cells indicate dependencies between those elements. The use of DSMs to study the structure of development processes (by mapping out how information flows between activities) led to structured approaches to identify subsets of activities involved in design iterations (Browning and Eppinger 2002; Smith and Eppinger 1997a; Steward 1981). Several researchers have used the DSM representation to capture the architecture of complex products and to analyze patterns of interactions among the components, both for general products (e.g., Sharman and Yassine 2004; Sosa et al. 2003; Sosa et al. 2007b) and for software systems (Cataldo et al. 2006; MacCormack et al. 2006; Sangal et al. 2005; Sosa 2008; Sullivan et al. 2001). Similar to this latter stream of research, we also capture the dependency structure of the source code of a software application in a DSM representation. However, as we mentioned above and will further discuss below, we analyze our software architecture DSMs by considering the process-like nature of the source code they represent.

Figure 2 shows a “flat” DSM representation of Ant 1.3, where the term flat signifies its agnosticism towards hierarchical levels; it shows the 117 components of Ant 1.3 and the 463 dependencies between them. We use the convention where an off-diagonal mark in the DSM represents the dependency of the column component on the row component. Thus, a mark in cell (i, j) indicates that the object (a Java class) labeling column j depends on the object labeling row i.

² Specifically, we include the following types of dependencies: invocations (static, virtual, and interface), inheritances (extensions and implementations), data member references, and constructs (both with and without arguments).

Figure 2: A flat DSM representation of Ant 1.3

To account for the organization of components into subsystems and layers, we supplement the flat DSM with a hierarchical DSM representation. The “z-axis” (coming out of the page) of the DSM in Figure 3 shows the nested levels of decomposition and the components’ membership in subsystems. The “y-axis” of the DSM shows the ordering of the subsystems in layers. Thus, Figure 3 combines many of the visual benefits of Figures 1 and 2. This representation allows us to distinguish inter- and intra-subsystem component dependencies.

Figure 3: A hierarchical DSM representation of Ant 1.3

2.3 Identifying Component Loops in Software Architectures

A key strength of the process DSM representation and sequencing analysis is their ability to highlight component loops. We define a component loop as a subset of components whose dependencies form a complete circuit. To understand a component loop, consider the various types of dependencies that can exist among several components in a system. Figure 4 exemplifies three types of dependencies (or lack thereof) between components. In case (a), the three components are independent. Thus, procedures and data processing done by any of the components are independent of the other components (assuming sufficient processing resources). In case (b), component C provides data services to components A and B. Similarly, component B provides data to component A. As a result, there is a serial order (C, B, and A) in which these three components must be executed to ensure data availability. In cases (c) and (d), components A, B, and C are involved in a circuit or loop because they depend on each other in a cyclical manner. Procedures of component A depend on data processing performed by component B, which depends on data provided by component C, which in turn depends on data provided by component A.

Considering an information-processing view of a system, cases (a), (b), and (c) represent the three fundamental types of dependencies between elements in a product, process, or organizational system (Eppinger et al. 1994; Thompson 1967). These three cases, however, assume that the components all belong to the same group. This assumption breaks down when considering the membership of the components in different subsystems. Case (d) represents group membership by shading components A and C differently from B and (the newly added) D. Subsystem membership has two important effects on coupled dependencies. First, it may increase the size of the loop by involving additional components (such as component D) that otherwise would not be part of the loop. (Any change in data provided by component B might also affect component D, and since the loop could cause several changes in B, component D might receive several change signals.) Second, it might hide the intrinsic loop formed by A, B, and C. Because the loop crosses group membership boundaries and adds additional components due to group membership, the intrinsic loop formed by A, B, and C can get hidden. We argue that the influence of component D on the intrinsic loop is determined not only by the dependency between components B and D but also by the fact that the group membership of these components increases the likelihood of their being considered as a bundle of components.

Figure 4: Types of relationship patterns between components. Panels: (a) independence; (b) serial dependence; (c) coupled dependence; (d) extended coupled dependence
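The four cases of Figure 4 can be checked mechanically by treating a component loop as a set of components that can all reach one another by following dependencies. The sketch below is one simple way to do this in Python; it is not the authors' procedure. The function names are hypothetical, and the encoding of case (d), in which D depends on B, is an assumed reading of the statement that changes in data provided by B might affect D.

```python
def reachable(graph, start):
    """Return every component reachable from `start` by following dependencies."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def component_loops(graph):
    """Return the sets of mutually dependent components (intrinsic loops).

    `graph` maps every component to the set of components it depends on.
    """
    reach = {node: reachable(graph, node) for node in graph}
    loops, assigned = [], set()
    for node in graph:
        # A component lies on a loop only if it can reach itself.
        if node in assigned or node not in reach[node]:
            continue
        loop = {other for other in reach[node] if node in reach.get(other, set())}
        loops.append(loop)
        assigned |= loop
    return loops


# Hypothetical encodings of the four cases in Figure 4.
case_a = {"A": set(), "B": set(), "C": set()}             # independence
case_b = {"A": {"B", "C"}, "B": {"C"}, "C": set()}        # serial dependence
case_c = {"A": {"B"}, "B": {"C"}, "C": {"A"}}             # coupled dependence
case_d = {"A": {"B"}, "B": {"C"}, "C": {"A"}, "D": {"B"}}  # D depends on B (assumed)

for label, case in [("a", case_a), ("b", case_b), ("c", case_c), ("d", case_d)]:
    print(label, component_loops(case))
```

In this encoding, D is not part of the intrinsic loop in case (d): it enters the picture only through the subsystem it shares with B, which is the distinction the text draws between the intrinsic loop and its extension through group membership.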

The concept of loops or cycles (also called iterations) is not new in the process analysis literature, where DSMs have been used to identify subsets of activities that drive iterations (Browning and Eppinger 2002; Meier et al. 2007; Smith and Eppinger 1997a; Smith and Eppinger 1997b; Steward 1981).³ However, what is new in our conceptualization of component loops is twofold:

• First, we distinguish component loops in the presence of the levels and layers in which the system’s components are organized.

• Second, we relate the presence of component loops to an important measure of product (not process) quality such as bug creation. To do so, we define system cyclicality, an architectural property of the system, as the fraction of the system that involves components embedded in component loops.

Methods exist to determine the sequence of components in a process DSM that highlights the minimal subsets of coupled components (Meier et al. 2007; Steward 1981; Warfield 1973). (Our use of sequencing to identify coupling distinguishes our approach from previous work, in both the hardware and software product domains (e.g., MacCormack et al. 2006; Pimmler and Eppinger 1994), which has not differentiated between feed-forward and feedback interactions and has instead used clustering algorithms to group components.⁴) Basic sequencing orders the DSM to minimize the number of super-diagonal marks and their distance from the diagonal. A lower-triangular matrix implies a sequence of execution that maximizes the availability of data to all components. A mark (i, j) below the diagonal indicates a feed-forward dependency where component i provides data to component j (i < j), while a super-diagonal mark indicates a feedback dependency in which component j provides data to component i that has been previously executed (since i < j). Since feedback dependencies spawn loops, feedback marks are generally undesirable in process architectures.

Considering the flat DSM in Figure 2, one can identify the intrinsic component loops in Ant 1.3 by sequencing the DSM to minimize the super-diagonal marks. Figure 5 shows this result and highlights the two intrinsic component loops in the shaded blocks along the diagonal. Ant 1.3 has seven feedback marks that cause the two component loops, which contain seven and 14 interdependent components, respectively. Since 21 out of the 117 components of Ant 1.3 are involved in intrinsic component loops, there is an 18% probability that a randomly chosen component is involved in an intrinsic component loop.

³ Although component loops in the system (or product) domain may cause design iterations in the process (or work) domain, they are conceptually different. The use of the term “component” to characterize loops helps us emphasize that we are concerned with the loops present in the system/product domain.

⁴ For further discussion of the differences between sequencing (also called partitioning) and clustering algorithms, see Browning (2001).

Figure 5: Sequenced flat DSM of Ant 1.3
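The idea behind sequencing a flat DSM can be sketched as follows. This is a simplified stand-in for the cited partitioning methods, not a reproduction of them: it groups mutually dependent components into blocks and orders the blocks so that marks between different blocks fall below the diagonal, but it makes no attempt to minimize the feedback marks inside a block or their distance from the diagonal. The function names and the four-component example are hypothetical; the code assumes the row-provides, column-depends convention used for the flat DSM and relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def feedback_marks(dsm):
    """Count super-diagonal marks (row index smaller than column index)."""
    n = len(dsm)
    return sum(dsm[i][j] for i in range(n) for j in range(i + 1, n))


def reorder(dsm, order):
    """Apply the same permutation to the rows and columns of the DSM."""
    return [[dsm[r][c] for c in order] for r in order]


def sequence(dsm):
    """Return an ordering in which feedback marks remain only inside loops."""
    n = len(dsm)
    # depends_on[j] = rows i with a mark in cell (i, j), i.e. providers of j.
    depends_on = {j: {i for i in range(n) if dsm[i][j] and i != j} for j in range(n)}

    def reach(start):
        seen, stack = set(), [start]
        while stack:
            for nxt in depends_on[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reachable = {j: reach(j) for j in range(n)}
    # A block is a component together with everything mutually reachable from it.
    block = {
        j: frozenset({j} | {k for k in reachable[j] if j in reachable[k]})
        for j in range(n)
    }
    # Place each block before the blocks it depends on (providers come last).
    predecessors = {b: set() for b in set(block.values())}
    for j in range(n):
        for i in depends_on[j]:
            if block[i] != block[j]:
                predecessors[block[i]].add(block[j])
    ordered_blocks = TopologicalSorter(predecessors).static_order()
    return [component for b in ordered_blocks for component in sorted(b)]


# Hypothetical loop-free system: marks at (2,0), (1,2), and (0,3).
acyclic = [[0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
# Adding a mark at (0,1) closes a loop among components 0, 1, and 2.
loopy = [[0, 1, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

print(feedback_marks(acyclic), feedback_marks(reorder(acyclic, sequence(acyclic))))
print(feedback_marks(loopy), feedback_marks(reorder(loopy, sequence(loopy))))
```

For the loop-free example the reordered matrix is lower-triangular, whereas the example with a loop retains feedback marks no matter how its components are ordered, which is why such marks identify component loops.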

We refer to the component loops shown in a sequenced flat DSM (Figure 5) as intrinsic because they are identified without any constraints to the sequencing algorithm imposed by the hierarchical way in which the components are organized into subsystems (i.e., the levels of decomposition). Another perspective on the architecture can be obtained by applying a constrained form of sequencing to the hierarchical DSM (Figure 3), where we recursively sequence the subsystems internally at each level, from the top (root) level and then down. This approach constrains the sequencing within each subsystem and highlights connections that traverse the subsystems and layers laid out by the system architects. Since the levels and layers influence the way the developers work and the associations they realize, the number of interdependently coupled components in the hierarchical DSM captures an alternative and potentially important characteristic of the architecture.

Figure 6 shows a sequenced hierarchical DSM of Ant 1.3 (from Figure 3), which also contains two component loops. However, since the sequencing of the DSM is constrained by the need to keep each component within its subsystem, the resulting loops are much larger. By examining the blocks formed along the diagonal by enclosing all of the components involved in the two realized component loops, we find that they contain 88 components. (The algorithm used to determine a realized component loop in a sequenced hierarchical DSM is described in the Appendix.) The first component loop includes 50 components across two subsystems (“compilers” and “*”) which form the high-level subsystem “taskdefs.” The second component loop contains 38 components across four subsystems (“types”, the two subsystems that comprise the subsystem “util”, and the high-level subsystem “*”). The 88 total components involved in realized component loops imply a probability of 75% that a randomly chosen component is involved in a realized component loop. Since the sequencing algorithm on the hierarchical DSM is constrained by the actual hierarchy of the software architecture, the number of components involved in realized loops will always be greater than or equal to the number of components involved in intrinsic loops.⁵

⁵ Because many of the components in the realized component loops are not dependent on other components within the realized component loops, we also consider the size of the realized component loops minus these unconnected components. (These are the components with empty rows and columns in sub-matrices along the diagonal that define the two realized component loops of Ant 1.3.) We take this distinction into account in our analysis that relates component loops and bugs.

Figure 6: Sequenced hierarchical DSM of Ant 1.3
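The two percentages reported for Ant 1.3 follow directly from the loop sizes given in the text. The short Python check below reproduces that arithmetic; the function name `system_cyclicality` is hypothetical, and the calculation assumes that the loops do not share components, so that their sizes can simply be added.

```python
def system_cyclicality(loop_sizes, n_components):
    """Fraction of components that belong to some component loop.

    Assumes the loops are disjoint, so their sizes can be summed.
    """
    return sum(loop_sizes) / n_components


ANT_1_3_COMPONENTS = 117  # number of components reported for Ant 1.3

# Loop sizes as reported in the text.
intrinsic = system_cyclicality([7, 14], ANT_1_3_COMPONENTS)  # 21 of 117
realized = system_cyclicality([50, 38], ANT_1_3_COMPONENTS)  # 88 of 117

print(f"intrinsic cyclicality: {intrinsic:.1%}")
print(f"realized cyclicality:  {realized:.1%}")
```

Rounded to whole percentages, these are the 18% and 75% figures quoted above, and the realized value is at least as large as the intrinsic one, as the constrained sequencing implies.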

Next, we develop a theoretical argument for how component loops lead to higher bugs creation. Then, in Section 3, we empirically test such a hypothesis by using the views of component loops presented here. 2.4

Hypotheses: The Effects of Component Loops on the Creation of Bugs This paper argues that certain architectural patterns of a system can significantly impact its

number of defects. Although many bugs are uncovered and fixed during the development and testing of a software application, many bugs get shipped with the system and are uncovered by its users. We focus on this latter type of defects. In general, bugs represent undesired behaviors of software systems. Based on findings from the process system literature, we would expect that an important source of bug creation would be the presence of component loops, since they are likely to trigger iterative problem solving (Roberts et al. 2006). Iterative problem solving typically corresponds to difficult and recursive problems that require making assumptions, iterating, and/or compromising, a process which may not converge easily and therefore carries a higher risk of residual errors than serial or parallel problem solving (Eppinger et al. 1994; Krishnan et al. 1997; Terwiesch et al. 2002). Moreover, as more components are involved in such iterative problems, the probability of convergence on a feasible solution decreases (Mihm et al.

13

DRAFT – Please, do not quote or distribute without permission

2003; Smith and Eppinger 1997a), which can increase the risk of embedding bugs into the system. Hence, we hypothesize that: H1: The larger the fraction of components of version s involved in component loops, the greater the number of bugs associated with version s. Our first hypothesis conjectures that the presence of component loops will increase the risk of having bugs in the system. However, as discussed in the previous sub-sections, there are various types of component loops. Intrinsic component loops involve the minimum set of components with coupled dependences, assuming that they can be developed together without any hierarchical constraints. However, because source code is organized into modules and subsystems, intrinsic component loops are typically augmented by other components that share subsystem membership. Hence, realized component loops could provide a more realistic indication of the effects of loops as perceived by the developers. Hence, the difference between realized and intrinsic component loops is the addition of components to the intrinsic component loops due to the hierarchies of the architecture. The addition of extra components to intrinsic component loops to form the realized loops has two important effects on the risk of introducing bugs. First, it increases the size of the component loop and therefore (artificially) increases the size of the iterative problem to be solved, which in turn could lead to increase the risk of creating bugs. Second, and more importantly, the additional components could introduce “noise” into the component loops that could increase the distance between the components involved in the intrinsic loops (the potential root cause of the bugs). This not only makes the iterative problem more difficult due to lack of precision and stability of the information exchanged but also makes it less visible to the developers (Terwiesch et al. 2002). 
Iterative problem solving is even more problematic when it is not foreseen by the developers (Pich et al. 2002; Sommer and Loch 2004). Hence, we argue that realized component loops can lead to a higher number of bugs, because they are more likely to hide and disaggregate the intrinsic component loops, which otherwise could receive greater focus from the developers. The more extra components involved in realized
component loops, the greater this effect, and the higher the risk of creating more bugs. This leads to our second hypothesis:


H2: Realized component loops have a stronger positive effect on bug creation than intrinsic component loops.

3 Empirical Study: The Apache Open Source Foundation

To test our hypotheses, we study readily accessible, open-source, Java-based software
applications from the Apache Foundation (www.apache.org), one of the largest, best-established, and widely-studied open source communities of developers and users who share values and a standard development process (Roberts et al. 2006). The Apache Foundation has a “desire to create high quality software that leads the way in its field.”

We examined all the Java-based applications developed by Apache, focusing on Java because (1) it is one of the most widely used and open object-oriented programming languages, and (2) it captures components and their dependencies in a structured and explicit manner in its source code. This minimizes the risk of having components or dependencies being “masked” in the source code and only appearing later at the time of compilation. In total, we identified 69 Java-based development projects at the Apache Foundation in mid-2008. This provided our initial database. To effectively examine a causal relationship between architecture characteristics and quality, we needed to obtain a longitudinal dataset, so we down-selected to the 37 applications for which we could access data for successive major releases. That is, we discarded 32 projects because they had a limited history of only one or a few minor releases. From the 37, we selected the applications for which we could access, for successive major releases, their precompiled (“pre-built”) source code (to codify product architecture features), their bug reports (to determine number of bugs), and their release notes (to determine the innovative features and other control variables). After data purification, we compiled a set of 126 releases representing 20 applications with an average of 6 major releases (or versions) each. We used three different sources of data: bug tracking systems, precompiled .jar files [6], and release notes.

First, we examined the Bugzilla and Jira bug tracking systems of the Apache Foundation to obtain all the data for the bugs associated with each release. Each of these systems allows users and developers to enter bug reports, which are classified by their potential severity and processed by the development team in a structured way. All bugs that are not fixed by a developer during the writing of the source code, and that therefore get released with the application, go through this process. These databases thus record the status and closure of each bug associated with any release. We developed a web crawler to automate the gathering of the bug data. Second, we downloaded the precompiled versions (as signified by an existing .jar file) of each application available from the Apache archives and/or the application’s website, selecting the versions considered major releases. We did not normally use minor releases since these typically involve relatively small changes. We used a commercially available software application developed by Lattix (www.lattix.com) to translate the structure of the source code captured in the .jar file into a matrix representation such as the ones shown in Figures 5 and 6. Finally, we consulted the release notes of each version of all the applications in our sample to find data on newness, age, and other important controls.

[6] Jar files contain all the Java class specifications (including the dependencies among them) for a given Java-based software application.

3.1 Dependent variable

Number of bugs associated with version s of application i ($y_{is}$). Our main dependent variable
counts all the bugs that have been formally identified and attributed to version s of application i. The identification of a bug is carried out by developers or users (with confirmation by developers) after the release. Hence, this variable does not measure the capability of the development organization to discover bugs. Rather, it is a proxy for the number of actual defects embedded in version s of application i. As mentioned, we used the Bugzilla and Jira bug tracking systems as the data sources to quantify this variable. Out of the complete list of bugs entered into these systems, we discarded any items that could not be verified as actual bugs by the developers (Classification: ‘WORKS_FOR_ME’ or ‘INVALID’ for Bugzilla and ‘Cannot Reproduce’ or ‘Not A Problem’ for Jira). We also discarded any bugs that the developers considered duplicates of bugs already registered in the system (Classification ‘DUPLICATE’ for both Bugzilla and Jira). Attribution of a bug to a code version was primarily determined by the classification in the system (Classification according to data field ‘Affected Versions’). If no version was explicitly given in the bug description, we assumed the bug
belonged to the most recently released version with respect to the bug entry date.

3.2 Independent variables

Our key predictor variable is the extent to which version s of application i contains the various types of component loops (as discussed in Section 2) in its source code. Because we can identify component loops in the presence or absence of the constraints imposed by the hierarchical assignment of components to subsystems, we define three types of component loop measures:

• Intrinsic cyclicality ($P_{I,is}$) is the probability that a randomly chosen component in version s of application i belongs to an intrinsic component loop. (Let us recall that intrinsic component loops are defined by the set of components that share coupled dependencies in a sequenced flat DSM such as the one shown in Figure 5.) To determine $P_{I,is}$ we count the number of components involved in loops in the flat DSM of version s of application i ($C_{I,is}$) and divide it by the total number of components ($N_{is}$). Hence, $P_{I,is} = C_{I,is} / N_{is}$.

• Realized cyclicality ($P_{R,is}$) is the probability that a randomly chosen component of version s of application i belongs to a realized component loop. This measure is a function of the number of components that are involved in component loops determined while maintaining the constraints of the subsystems and layers used by programmers to organize their code ($C_{R,is}$). To identify $C_{R,is}$ we count the number of components in loops in the sequenced hierarchical DSM of version s of application i, such as the one shown in Figure 6. Hence, $P_{R,is} = C_{R,is} / N_{is}$.

• Reduced realized cyclicality ($P_{RR,is}$) is the probability that a randomly chosen component in version s of application i belongs to a reduced realized component loop. This measure is similar to $P_{R,is}$ but subtracts from its numerator the number of components that do not depend on any other component within the loop.
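As an illustration of the intrinsic cyclicality measure defined above, the following minimal Python sketch computes the fraction of components that sit in a dependency loop. It rests on assumptions that are ours, not the paper's: the dependency structure is given as a plain dictionary mapping each component to the components it depends on, an intrinsic component loop is taken to be a strongly connected component with more than one member, and a component that depends only on itself is not counted as a loop. The function names are hypothetical. Realized cyclicality would additionally require the hierarchical sequencing of the DSM, which this sketch does not attempt.

```python
from typing import Dict, Hashable, Iterable, List, Set

Dependencies = Dict[Hashable, Iterable[Hashable]]


def strongly_connected_components(deps: Dependencies) -> List[Set[Hashable]]:
    """Return the strongly connected components (Kosaraju's two-pass method)."""
    # Collect every component, including ones that only appear as targets.
    nodes = set(deps)
    for targets in deps.values():
        nodes.update(targets)
    forward = {n: list(deps.get(n, ())) for n in nodes}
    reverse = {n: [] for n in nodes}
    for src, targets in forward.items():
        for dst in targets:
            reverse[dst].append(src)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                # All successors explored: this node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in reverse finishing order.
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(component)
    return components


def intrinsic_cyclicality(deps: Dependencies) -> float:
    """Fraction of components in a loop: C_I / N (assumed SCC-based reading)."""
    components = strongly_connected_components(deps)
    total = sum(len(c) for c in components)
    in_loops = sum(len(c) for c in components if len(c) > 1)
    return in_loops / total if total else 0.0


# Hypothetical five-class system: A, B and C form a loop; D and E do not.
example = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"], "E": []}
print(intrinsic_cyclicality(example))  # 3 of 5 components are in a loop: 0.6
```

Treating loops as strongly connected components keeps the sketch independent of any particular DSM sequencing algorithm, since the coupled blocks of a sequenced flat DSM do not depend on the order in which the remaining components are listed.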

3.3 Control variables

We include two sets of control variables. First, we control for exogenous, non-architectural
features of the application that are likely to affect the creation of bugs. Second, we control for architectural characteristics that relate to the direct and indirect connectivity among the components of the application so as to test more precisely whether and how system cyclicality might influence bug creation.

3.3.1 Non-architectural Controls

• Age of application at version s. This is measured by the number of days since the first release of the application. This assumes that the application is officially “born” on the date of the first major release and ages with successive releases. The cumulative time between releases is likely to increase both the complexity of and knowledge about the architecture. Since these factors are likely to affect bug creation, it is important to control for the age of the application.

• Days since last release. The time between successive releases varies within and across applications, so it is important to control for the time span between the previous release and version s. The longer this time, the higher the probability that more changes will have been introduced into the application, which could ultimately affect bug creation.

• Newness of application at version s. New features and incremental changes to existing features add uncertainty and complexity to the structure of the application. Implementing these types of changes not only consumes development resources but also is likely to introduce unforeseeable perturbations to existing features. Hence, the number of new features and incremental changes in an application is likely to affect the creation of bugs. Using the information from the release notes, we capture both the number of new features and incremental changes associated with each release. New features add functionality, while incremental changes modify existing functions. We measure the newness of version s with two control variables that count the numbers of new features and incremental changes, respectively.

3.3.2 Architectural Controls

The following variables are measured for version s of application i:

• Size of jar file. The overall complexity of a system is a function of the amount of information it carries. We expect more complex software systems to generate a larger number of bugs. We use the size of the jar file (in kilobytes) as a proxy for the raw complexity of the source code. This variable measures the volume of information associated with the software architecture, but it does not capture how such information is broken down into components and how these components interact.



• Number of nominal subsystems. The application source codes in our data set are complex systems formed by interrelated components. To manage the complexity, developers group the components (Java classes) into subsystems. Typically, subsystems group components that collectively perform certain functions. Such a grouping is likely to affect the cognitive ability of the team to understand the architecture of the source code, and therefore it may influence their propensity to create bugs. Because the assignment of each component to a subsystem is well codified by the naming convention, we are able to count the number of distinct subsystems. Note that this measure counts only the number of component-based subsystems, not any higher-level subsystems that group together only other subsystems.

• Number of components ($N_{is}$). The number of components into which the source code has been decomposed is a basic dimension of system complexity that conditions the architecture of the system and for which we must therefore control (Kauffman and Levin 1987).



• Internal system connectivity. We use two measures to control for the direct and indirect connectivity among components:

  o Direct connectivity ($K_{is}$) measures the number of direct connections among components (Kauffman and Levin 1987).

  o Indirect connectivity measures the number of non-zero cells of the binary visibility matrix of the system after subtracting the system’s DSM. The visibility matrix (V) of a system is a square matrix (similar to the DSM) whose non-zero cells ($v_{ij}$) indicate that component i is connected to component j via a finite number of intermediary components. The visibility matrix is obtained by raising the DSM (D) to successively higher powers via Boolean multiplication until the number of empty cells in the resultant matrix stabilizes (MacCormack et al. 2006; Sharman and Yassine 2004; Warfield 2000). Hence, to measure indirect connectivity we count the number of non-zero cells in V - D. Note that because V captures both direct and indirect connectivity we must subtract D from V to control for these effects separately.

• Number of component loops. Because our key independent variables do not explicitly control for the number of component loops present in the source code, we include a control for it whose value depends on whether we are considering intrinsic or realized component loops.

Table A (in the Appendix) shows descriptive statistics and correlations between the variables included in our analysis. There were, on average, 101 bugs, 8 new features, and 23 incremental changes associated with each release.
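The direct and indirect connectivity controls described in Section 3.3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DSM is a square list of lists in which a truthy cell (i, j) means that component i depends on component j, diagonal cells are ignored when counting (a choice of ours, since the text does not say how self-reachability through a loop should be treated), and the function names are hypothetical. The visibility matrix is built as in the text, by accumulating successive Boolean powers of D until no new cells appear.

```python
from typing import List, Sequence, Tuple

BoolMatrix = List[List[bool]]


def bool_matmul(a: BoolMatrix, b: BoolMatrix) -> BoolMatrix:
    """Boolean matrix product: cell (i, j) is True if some k links i -> k -> j."""
    n = len(a)
    return [
        [any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def visibility_matrix(dsm: Sequence[Sequence[int]]) -> BoolMatrix:
    """Accumulate D, D^2, D^3, ... (Boolean) until the result stops changing."""
    n = len(dsm)
    d = [[bool(cell) for cell in row] for row in dsm]
    visible = [row[:] for row in d]
    power = [row[:] for row in d]
    while True:
        power = bool_matmul(power, d)
        merged = [
            [visible[i][j] or power[i][j] for j in range(n)] for i in range(n)
        ]
        if merged == visible:  # no new reachable pairs: V has stabilized
            return visible
        visible = merged


def connectivity(dsm: Sequence[Sequence[int]]) -> Tuple[int, int]:
    """Return (direct, indirect) connectivity, i.e. nnz(D) and nnz(V - D).

    Assumption: diagonal cells are excluded from both counts.
    """
    n = len(dsm)
    d = [[bool(cell) for cell in row] for row in dsm]
    v = visibility_matrix(dsm)
    off_diagonal = [(i, j) for i in range(n) for j in range(n) if i != j]
    direct = sum(d[i][j] for i, j in off_diagonal)
    indirect = sum(v[i][j] and not d[i][j] for i, j in off_diagonal)
    return direct, indirect


# Hypothetical four-component chain: 0 -> 1 -> 2 -> 3.
chain = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(connectivity(chain))  # three direct links and three indirect ones: (3, 3)
```

In the chain example the indirect pairs are 0 to 2, 0 to 3, and 1 to 3, which is why subtracting D from V matters: without it the three direct links would be counted in both controls.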

4 Analysis and Results

Our dependent variable is the number of bugs. Several features of our data make statistical analysis a non-trivial task. Because our dependent variable is a skewed count variable (taking non-negative values only), standard ordinary least-squares regressions can lead to inefficient and biased estimates. To deal with this issue, statisticians recommend using Poisson-like regression models developed explicitly to model the count nature of the dependent variables (Cameron and Trivedi 1998). Because the variance is significantly larger than the mean of our dependent variable, negative binomial regression models provide a more accurate estimate of the standard errors of the coefficient estimates of our regression models (Cameron and Trivedi 1998; Hausman et al. 1984). We estimate a model of the form (Cameron and Trivedi 1998, p. 279):

$$E[y_{is} \mid x_{is}, \alpha_i] = \alpha_i \exp(x_{is}'\beta)$$

That is, our regression models predict that the expected number of bugs of version s of application i depends exponentially on a set of linearly independent regressors ($x_{is}$). The exponential form of our model ensures that the expected value of the dependent variable is always greater than zero. The β coefficients shown in
Table 1 are estimated by fitting the model to data. The coefficient βj equals the proportionate change in the expected mean if the jth regressor changes by one unit. A significantly positive βj coefficient indicates that, all else being equal, an increase in regressor j increases the expected number of bugs, whereas a significantly negative βj coefficient indicates that, all else being equal, an increase in regressor j decreases the expected number of bugs. Of particular interest are the β coefficients for our key independent variables. A significantly positive coefficient for iteration propensity (that is, for the system cyclicality measures of Section 3.2) would indicate that the greater the iteration propensity in version s of application i, the greater the expected number of bugs. This would be in line with hypothesis H1.

The αi are application-specific effects, which can be either fixed or random. These effects permit observations of the same application to be correlated across versions, thereby building serial correlation directly into the model. In a fixed-effects model, the αi absorb time-invariant, unobserved, application-specific features. By doing this we effectively control for any unobserved factors such as the “culture” of the development team associated with each application, since these are likely to differ across applications but much less likely to change for the same application over successive releases. For the random-effects model, the αi are iid random variables which can be estimated by assuming a distribution for αi (typically a gamma distribution). We report estimates based on the fixed-effects model, which are consistent with the random-effects estimates of those models that pass the Hausman specification test to use random effects (Hausman et al. 1984). Finally, because software development technologies may change significantly from year to year and such developments might affect bug creation across all of the applications, we include indicator variables associated with the year of each release.

Table 1 provides the coefficient estimates of the models predicting the expected number of bugs. Model 1 includes a first set of control variables. This model shows that the effect of “time since last release” is positive and significant, indicating that the longer the time between releases the greater the likelihood of introducing a larger number of bugs. Model 2, which includes the rest of the control variables, suggests that neither the number of components (N) nor the number of direct connections among them (K) is a significant determinant of bug creation. However, the significant, negative coefficient of indirect connectivity suggests that the propagation of information through intermediary
components is likely to reduce the number of bugs. To understand this further, we estimated two additional models (not shown in Table 1) in which we distinguish feed-forward and feedback indirect connectivity (in both flat and hierarchical sequenced DSMs). The results of these alternative models indicate that it is feed-forward indirect connectivity, not feedback indirect connectivity, that is significantly and negatively associated with the number of bugs. Models 3, 4, and 5 include our three measures of system cyclicality, respectively. These models also control for the number of component loops. Model 3 shows a positive (yet not significant) coefficient estimate for intrinsic cyclicality, whereas Model 4 shows a positive and significant coefficient estimate for realized cyclicality. Finally, Model 5 shows that the effect of reduced realized cyclicality is positive but not significant. Hence, Model 4 offers the strongest empirical evidence to support H1. That is, the greater the probability that a randomly chosen component belongs to a realized component loop, the larger the expected number of bugs associated with such a version of the application. The fact that Model 4 (and not Models 3 or 5) shows the largest and only significant effect of cyclicality on the number of bugs provides empirical support to H2. Based on a test of means, the coefficient estimate of realized cyclicality (Model 4) is significantly larger than both the intrinsic and reduced realized cyclicality shown in Models 3 and 5, respectively. Hence, it is not only the presence of intrinsic component loops that may increase the risk of creating bugs, but also the fact that such cycles may be hidden from the developers by the presence of other components in the source code. 
Our results suggest that increasing the size of the subsystems whose components are involved in loops (even if they are not connected to the other components within the realized component loop) increases the risk of masking the design cycle itself and therefore the risk of creating bugs in the system.

4.1 Bug Fixing

To gain further insight into the relationship between system cyclicality and quality, we also examine the determinants of bug fixes. Bug fixes are measured by the number of bugs associated with version s of application i that have been fixed by the developers, as reported by the bug tracking systems. Analyzing the determinants of bug fixing is particularly challenging because it depends on
having uncovered the bug in the first place. Hence, when predicting bug fixing, it is imperative to consider the number of bugs as an endogenous variable. To do so, we use a three-stage procedure for estimating simultaneous equations in which the first equation predicts the number of bugs, and the second equation predicts bug fixes while controlling for the number of bugs predicted in the first equation (Kennedy 2003, p. 190). We use the reg3 procedure with fixed effects implemented in Stata version 9 after log transforming both our bugs and bug fixes data. (This is possible here because fewer than 2% of the observations in our bug data are zeros.) The results of this alternative analysis are reported in Table 2.

Table 2 shows the results of three model specifications in which we test how realized, intrinsic, and average intrinsic cyclicality affect the capability to fix bugs, conditional upon the number of bugs that have been uncovered. First, all three models confirm the results shown in Model 4 of Table 1. That is, realized cyclicality is positively associated with the expected number of bugs (in line with H1). As for bug fixing, Model 1 shows that realized cyclicality has a positive (and marginally significant) effect on the number of bug fixes. Model 2 shows a non-significant coefficient for intrinsic cyclicality. Finally, Model 3 includes the average intrinsic cyclicality as a predictor. (This variable differs from our intrinsic cyclicality measure by using the average size of intrinsic component loops in the numerator of $P_{I,is}$ rather than the total count of components in intrinsic component loops.) This model shows a negative and significant coefficient estimate for this variable, which suggests that the larger the size of the intrinsic component loops (on average), the fewer bug fixes, conditional on the number of bugs uncovered.
Hence, although Model 1 suggested, at first, that realized component loops could favor bug fixing, Model 3 helps us realize that there is a core element in realized component loops that, on average, hinders the capacity of the organization to fix bugs. From an empirical viewpoint, this is a remarkable result considering the high correlation between number of bugs and bug fixes (ρ = 0.97), which makes it very difficult to disentangle empirically the effects that might drive one but not the other. This result provides further substantial evidence for our argument that hidden component loops increase the risk of creating bugs (H2). However, it is the intrinsic section of component loops within the realized loops that affects the capability of the team to fix bugs, because the average size of intrinsic loops is likely to relate to the average size of the root cause of bugs.

5 Discussion

This paper examines the architecture of software systems and identifies architectural properties that influence the system’s quality. We take a process view of software architectures to identify component loops that are likely to cause system defects (bugs). A distinct aspect of our approach is that we identify component loops by taking into account the hierarchical structure of subsystem levels and layers. Our empirical results suggest that component loops embedded in the hierarchies in which source code is typically organized are key determinants of bugs, not only because they contain an important root cause of bugs (intrinsic component loops), but also because they hide such intrinsic component loops, thereby hindering the ability of the team to uncover those bugs during the development process.

This has important implications for managers of software development. Our results suggest that in order to manage the quality of software applications it is imperative to examine the architecture of the source code. This work provides clear guidance on where to look to determine the factors that increase the risk of bug creation. Our results suggest that the larger the fraction of components involved in realized component loops, the higher the risk of creating bugs. More specifically, our regression models (Model 4 in Table 1) indicate that a 1% increase in the propensity of having a component being part of a realized component loop in our sample increases the expected number of bugs by 1.26% (all else being equal). This is substantial considering that, in our sample, the average increase (over two successive versions of an application) in realized cyclicality was almost 4%. Hence, an application that increased its realized cyclicality by 4% with each of six successive versions would risk increasing its number of bugs by 34% in total (an increase of about 5% between two successive versions, compounded over six releases).
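The 34% figure above follows from compounding the per-release effect, which the short Python sketch below reproduces. It uses only the numbers quoted in the text (1.26% more expected bugs per 1% increase in realized cyclicality, a 4% increase per release, six releases) and adopts the same linear approximation within each release step; the function name and parameter names are hypothetical.

```python
def cumulative_bug_increase(
    effect_per_percent: float = 1.26,  # % change in expected bugs per 1% change (Model 4, Table 1)
    cyclicality_increase: float = 4.0,  # % increase in realized cyclicality per release
    releases: int = 6,
) -> float:
    """Compound the per-release increase in expected bugs over several releases."""
    # Linear approximation within one step: 1.26 * 4 = 5.04% more bugs per release.
    per_release = effect_per_percent * cyclicality_increase / 100.0
    # Compounding over successive releases gives the cumulative increase.
    return (1.0 + per_release) ** releases - 1.0


print(f"{cumulative_bug_increase():.1%}")  # roughly 34%
```

With the per-release effect rounded to 5%, as in the text, the compounded figure is 1.05 to the sixth power minus one, which is also about 34%.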
From a conceptual viewpoint, our theoretical argumentation and analysis focus on the role of system cyclicality (operationalized as three types of component loops) in bug creation. In doing so, we provide an alternative perspective on the effect of connectedness (or coupling) on quality. Instead
of assuming that connectedness among components is a negative property in bug creation, we argue that it is rather the visibility of cyclical connectedness that is an important determinant of bugs. While testing our hypotheses, we have controlled for the effects of direct and indirect connectivity among components. In so doing, we show that indirect connectivity can be good. Feed-forward indirect connectivity reduces bug creation. This result contrasts with previous work that has emphasized the cost of change propagation due to both high direct and indirect connectivity among components (Eckert et al. 2004; MacCormack et al. 2006).

Although we agree that the connectedness (or
coupling) of a system has many undesirable features that can hinder several dimensions of quality, such as the extendibility and maintainability of the design, our results suggest that systems architected in layers (such that components in lower layers serve components in higher layers) require and benefit from (direct and indirect) connections that obey this directional design rule. Hence, our results suggest a caution about the generalization that connectedness among components is necessarily an undesirable feature. Instead, our results indicate that, to proactively manage the generation of defects, it is important to identify component loops in the presence of the hierarchies used to organize the architecture of a system. Our analysis takes advantage of the dynamic nature of our data to establish a causal link between software architecture and quality. Because our dependent variable is measured after applications have been released, and our predictor variables measure architectural properties determined before the release, the potential of reverse causality can be ruled out. However, because our analysis was carried out on a sample of Java-based applications developed by the open-source foundation Apache, we need to carry out similar studies in other settings before fully generalizing our findings. Relating the architecture of software applications and quality has allowed us to establish the mechanisms by which component loops are likely to influence bug creation.

Yet, many open
questions remain to be addressed in future research. What other characteristics of an architecture might generate bugs? How do open source and “closed source” software architectures differ? Do any such differences lead to differences in quality? Our current research efforts aim to find insightful answers to these questions.


6 Acknowledgements

We appreciate support from Lattix Inc., which provided us with the software used to document
the architecture of software applications. We thank Neeraj Sangal for insightful feedback throughout this research and Jeremy Berry for helpful research assistance.

7 References

Alexander, C. 1964. Notes on the Synthesis of Form. Harvard University Press, Cambridge, MA.
Baldwin, C.Y., K.B. Clark. 2000. Design Rules: The Power of Modularity. MIT Press, Cambridge, MA.
Browning, T.R. 2001. Applying the Design Structure Matrix to System Decomposition and Integration Problems: A Review and New Directions. IEEE Transactions on Engineering Management, 48(3) 292-306.
Browning, T.R., S.D. Eppinger. 2002. Modeling Impacts of Process Architecture on Cost and Schedule Risk in Product Development. IEEE Transactions on Engineering Management, 49(4) 428-442.
Cameron, A.C., P.K. Trivedi. 1998. Regression Analysis of Count Data. Cambridge University Press, Cambridge, U.K.
Cataldo, M., P. Wagstrom, J.D. Herbsleb, K.M. Carley. 2006. Identification of Coordination Requirements: Implications for the Design of Collaboration and Awareness Tools. Proceedings of the ACM Conference on Computer-Supported Cooperative Work, Banff, Alberta, 353-362.
Clarkson, P.J., C. Simons, C. Eckert. 2004. Predicting Change Propagation in Complex Design. Journal of Mechanical Design, 126 788-797.
Eckert, C.M., P.J. Clarkson, W. Zanker. 2004. Change and Customization in Complex Engineering Domains. Research in Engineering Design, 15(1) 1-21.
Eppinger, S.D., D.E. Whitney, R.P. Smith, D.A. Gebala. 1994. A Model-Based Method for Organizing Tasks in Product Development. Research in Engineering Design, 6(1) 1-13.
Gokpinar, B., W. Hopp, S. Iravani. 2007. The Impact of Product Architecture and Organizational Structure on Efficiency and Quality of Complex Product Development. Northwestern University, Working Paper.
Hausman, T., B.H. Hall, Z. Griliches. 1984. Econometric Models for Count Data with an Application to the Patents-R&D Relationship. Econometrica, 52(4) 909-938.
Jones, C. 2000. Software Assessments, Benchmarks and Best Practices. Addison-Wesley.
Kauffman, S.A., S. Levin. 1987. Towards a General Theory of Adaptive Walks on Rugged Landscapes. Journal of Theoretical Biology, 128(1) 11-45.
Kennedy, P. 2003. A Guide to Econometrics. 5th Edition, MIT Press, Cambridge, MA.
Krishnan, V., S. Eppinger, D. Whitney. 1997. A Model-Based Framework to Overlap Product Development Activities. Management Science, 43(4) 437-451.
Koru, A.G., J. Tian. 2004. Defect Handling in Medium and Large Open Source Projects. IEEE Software, 54-61.
Koru, A.G., D. Zhang, H. Liu. 2007. Effect of Coupling on Defect Proneness in Evolutionary Open-Source Software Development. In Feller, J., B. Fitzgerald, W. Scacchi, A. Sillitti, Eds., Open Source Development, Adoption and Innovation (International Federation for Information Processing), Springer, Boston, MA, 34, 271-276.
Lai, X., J.K. Gershenson. 2006. Representation of Similarity and Dependency for Assembly Modularity. Proceedings of the ASME Design Engineering Technical Conferences - 18th International Conference on Design Theory and Methodology, Philadelphia, PA.
MacCormack, A., J. Rusnak, C.Y. Baldwin. 2006. Exploring the Structure of Complex Software Designs: An Empirical Study of Open Source and Proprietary Code. Management Science, 52(7) 1015-1030.





Table 1. Negative Binomial Regressions Predicting Expected Number of Bugs (N = 126)

Independent variables             Model 1           Model 2           Model 3           Model 4           Model 5
                                  (controls)        (controls)        (intrinsic)       (realized)        (reduced realized)
Age: Days from first release !    -.106 (.212)      -.120 (.242)      -.074 (.245)      .052 (.268)       -.006 (.269)
Days from last release !          .642* (.359)      .788** (.337)     .770** (.334)     .717** (.333)     .755** (.335)
Num of incremental changes !      2.141 (2.090)     3.317 (2.129)     2.285 (2.272)     1.786 (2.148)     2.132 (2.207)
Number of new features !          .462 (6.625)      -.975 (6.666)     -.975 (7.152)     .575 (6.659)      -.465 (6.750)
Number of nominal modules !       .185 (.139)       -.137 (.251)      .012 (.271)       -.053 (.266)      -.049 (.267)
Size of source code file !        -.106 (.212)      4.081 (7.714)     4.241 (7.884)     6.864 (7.840)     6.002 (7.827)
Number of components (N) !                          .798 (2.356)      .485 (2.376)      .542 (2.509)      .774 (2.458)
Direct connectivity (K) !                           .721 (.649)       .849 (.667)       .369 (.674)       .439 (.680)
Indirect connectivity !                             -.025** (.010)    -.028*** (.010)   -.020** (.010)    -.022** (.010)
Number of component loops                                             -.050 (.045)      .012 (.058)       .002 (.060)
Intrinsic cyclicality                                                 .583 (1.173)
Realized cyclicality                                                                    1.263** (.576)
Reduced realized cyclicality                                                                              .870 (.565)
Log Likelihood                    -515.95           -510.08           -509.38           -507.58           -508.75

* p < .1; ** p < .05; *** p < .01. Standard errors are shown in parentheses.
! Coefficients multiplied by 1000 to facilitate exposition of results.
All models include application-specific fixed effects and year effects.
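As a reading aid for Table 1, the sketch below converts a coefficient into a multiplicative effect on the expected number of bugs. It assumes the usual log link of negative binomial regression and that realized cyclicality is measured as a fraction between 0 and 1; neither is stated in the table itself. The helper name `rate_ratio` and the 0.10 increment are illustrative choices, not taken from the paper.

```python
import math


def rate_ratio(beta: float, delta: float) -> float:
    """Multiplicative change in the expected count when a predictor
    increases by `delta`, assuming a log-link count model."""
    return math.exp(beta * delta)


# Realized cyclicality coefficient from Model 4 of Table 1.
beta_realized = 1.263

# Hypothetical increase of 10 percentage points in realized cyclicality.
ratio = rate_ratio(beta_realized, 0.10)
print(f"Expected bugs are multiplied by about {ratio:.2f}")
```

Under these assumptions, a 0.10 increase in realized cyclicality corresponds to roughly 13% more expected bugs, holding the other variables fixed.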


Table 2. Three-Stage Least Squares Regressions Predicting Bugs and Bug Fixes (N = 122)
Independent variables

Model 1 (bugs)

LN(number of bugs identified) Age: Days from first release ! Days from last release !

Model 1 (bug fixes) .770*** (.141)

.862 (.783)

-.096 (.216)

.732 (.464)

Days to next release ! Num of incremental changes ! Number of new features ! Number of nominal modules ! Size of source code file ! Number of components ! Direct connectivity ! Indirect connectivity ! Number of component loops Realized cyclicality

Model 2 (bugs)

Model 2 (bug fixes)

Model 3 (bugs)

.800*** (.110) .848 (.783)

-.186 (.214)

.736* (.438) -.074 (.244)

.922*** (.088) .848 (.783)

-.239 (.173)

.842* (.459) -.058 (.197)

-.125 (.168)

.268 (2.848)

.512 (.811)

1.217 (2.831)

1.069 (.756)

.585 (2.846)

.911 (.657)

22.776** (10.268)

3.930 (4.146)

23.813** (10.106)

4.237 (2.862)

22.844** (10.232)

1.511 (2.642)

23.898* (14.456)

-4.120 (4.203)

24.599* (14.440)

-3.975 (3.766)

24.868* (14.444)

-3.559 (3.445)

.018 (.331)

-.056 (.090)

-.072 (.330)

-.142 (.096)

-.023 (.331)

-.183** (.088)

-6.387 (4.454) 1.867 (1.170) -.039** (.018)

-1.135 (1.284) .469 (.349) -.004 (.006)

-6.549 (4.438) 2.048* (1.169) -.042** (.018)

-.899 (1.172) .454 (.319) -.004 (.006)

-6.559 (4.448) 1.961 (1.170) -.041** (.018)

-.629 (.944) .228 (.271) .003 (.005)

.069 (.088) 2.435*** (.910)

-.004 (.026) .540* (.321)

.109 (.083) 1.594* (.841)

.022 (.017)

.091 (.087) 2.157** (.898)

Intrinsic cyclicality

.001 (.673)

Avg intrinsic cyclicality

-.859* (.472)

R-sq

.688 *< .1

Model 3 (bug fixes)

** < .05

.972

.683

.975

*** < .01. Standard errors are shown between parentheses.

! Coefficients multiplied by 1000 to facilitate exposition of results. All models include application-specific fixed effects and year effects.


.687

.985


Appendix

Algorithm to calculate realized component loops. To identify a diagonal block that encloses coupled components, we start drawing the block from the upper-left corner toward the lower-right corner of the sequenced hierarchical DSM, beginning with the first component that has a mark to the right of the matrix diagonal. Then, we draw a vertical line downward from that mark until it reaches the diagonal, unless we first find a mark to the right of the vertical line. If we do, we move the vertical line to the column of that mark and continue drawing downward until we either find another mark to the right of the vertical line or hit the diagonal. Once the vertical line reaches the diagonal, the block is defined and the set of coupled components is identified.
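The block-drawing procedure can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the DSM is taken to be a square 0/1 matrix already in its sequenced order, a "mark to the right of the diagonal" is read as `dsm[r][c] == 1` with `c > r`, and the row at which the line meets the diagonal is also checked for marks further right (an interpretive choice that keeps blocks disjoint). The function name `find_coupled_blocks` and the example matrix are hypothetical, and the hierarchical grouping of components is ignored.

```python
from typing import List, Tuple


def find_coupled_blocks(dsm: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (first, last) index pairs of diagonal blocks of coupled components."""
    n = len(dsm)
    blocks = []
    row = 0
    while row < n:
        # Marks to the right of the diagonal in this row.
        above = [col for col in range(row + 1, n) if dsm[row][col]]
        if not above:
            row += 1
            continue
        start = row
        line_col = max(above)  # column of the vertical line
        r = start + 1
        # Walk down the line; shift it right whenever a row has a mark beyond it.
        while r <= line_col:
            right = [col for col in range(line_col + 1, n) if dsm[r][col]]
            if right:
                line_col = max(right)
            r += 1
        # The line has reached the diagonal: components start..line_col form a block.
        blocks.append((start, line_col))
        row = line_col + 1
    return blocks


# Hypothetical 8-component DSM with above-diagonal marks at (1,3), (2,4), (6,7).
dsm = [[0] * 8 for _ in range(8)]
for r, c in [(1, 3), (2, 4), (6, 7)]:
    dsm[r][c] = 1

blocks = find_coupled_blocks(dsm)
print(blocks)
```

In this example the mark at (2,4) lies to the right of the line first drawn at column 3, so the line shifts to column 4 and the first block grows to cover components 1 through 4; components 6 and 7 form a second block.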
