592
IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, VOL. 60, NO. 3, AUGUST 2013
Joint Effect of Team Structure and Software Architecture in Open Source Software Development Ning Nan and Sanjeev Kumar
Abstract—In this study, we seek to understand socio-technical interactions in a system development context via an examination of the joint effect of developer team structure and open source software (OSS) architecture on OSS development performance. Using detailed data collected from code repositories from SoureForge.com, we find that developer team structure and software architecture significantly moderate each other’s effect on OSS development performance. Larger teams tend to produce more favorable project performance when the project being developed has a high level of structural interdependency while projects with a low level of structural interdependency require smaller teams in order to achieve better project performance. Meanwhile, centralized teams tend to have a positive impact on project performance when the OSS project has a high level of structural interdependency. However, when a project has a low level of structural interdependency, centralized teams can impair project performance. This study extends our understanding of information technology’s deep engagement in organizational life and provides directions for open source practitioners to better organize their projects to achieve greater performance. Index Terms—Collaboration network, open source software, social network analysis, software architecture, software project performance.
I. INTRODUCTION PEN source software (OSS) development projects are Internet-based communities of software developers who voluntarily collaborate to develop software that they or their organizations need [1], [2]. Compared with traditional software development, OSS development is unique in that it is selforganized by voluntary participants. Furthermore, OSS projects automatically generate detailed and public logs of developer actions and project outputs, allowing a clear view of their inner workings [1]. These unique aspects of OSS have inspired studies regarding motivations of individual participants [3], [4], mechanisms sustaining volunteer participation [5], governance of OSS projects [6], [7], architecture of OSS code [8], [9], patterns of project affiliation of OSS developers [10], [11], and knowledge creation and sharing [12], [13] in OSS projects. Insights from these OSS studies have increasingly pointed toward the
O
Manuscript received March 7, 2011; revised April 30, 2012, August 6, 2012, and October 23, 2012; accepted November 27, 2012. Date of publication January 22, 2013; date of current version July 13, 2013. Review of this manuscript was arranged by Department Editor B. C. Y. Tan. N. Nan is with the University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada (e-mail:
[email protected]). S. Kumar is with the Sheldon B. Lubar School of Business, University of Wisconsin–Milwaukee, Milwaukee, WI 53201, USA (e-mail: kumars@uwm. edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TEM.2012.2232930
inseparable role of the social and the technical in shaping OSS development processes and outcomes. For example, organizational learning processes in OSS development have been shown to vary significantly across projects characterized by different development stages and varied application types [12]. Similarly, a comparison of software architecture between open-source and proprietary projects indicates that modularity of software designs is jointly determined by software function and purposeful behaviors of software developers [14]. Extant OSS studies suggest that OSS development is particularly suited for an examination of joint effects of the social and the technical in a system development context since it promotes and digitally captures self-orchestrated interactions between software developers and software artifacts. Situated in the OSS development context, this study focuses on OSS developer team structure as the social aspect and software architecture as the technical aspect of OSS projects. Our general research question is: what is the joint effect of developer team structure and OSS project architecture on OSS development performance? The answer to our research question can serve as a first step in integrating the separate lines of work on OSS development’s social and technical dimensions into a coherent research literature by speaking to the reciprocity between an important social process (i.e., the team structure) and a critical technical factor (i.e., software architecture) in OSS development. It can also help OSS practitioners to understand the strengths and weaknesses of the OSS development process. II. THEORETICAL BACKGROUND AND PREVIOUS RESEARCH A. Inseparable Role of the Social and the Technical Researchers have long recognized the composite relationship between social processes and technical properties in an organizational work context. The organizational information processing theory (OIPT) provides a widely cited perspective of this composite relationship: to achieve optimal performance there should be a match between information processing capabilities of organizational structure (social processes) and information processing needs of a given task (technical properties) [15], [16]. In the organizational literature, information processing capabilities are typically assessed by looking at the collaboration structure of the workforce while information processing needs are evaluated by examining the level of interdependency among task units. The OIPT notion of “match” is usually modeled via the moderating role of task characteristics in the causal pathway from collaboration structure to work performance (e.g., [17], [18]). The assumption underlying this approach is that task characteristics are exogenous factors while
0018-9391/$31.00 © 2013 IEEE
NAN AND KUMAR: JOINT EFFECT OF TEAM STRUCTURE AND SOFTWARE ARCHITECTURE
collaboration structure of the workforce can be chosen and adjusted by human actors. For example, Tushman [19] proposed that high performing R&D project teams were more likely to align their team structure with the information processing needs of their tasks. Consistent with this thesis, he found significant associations between team structure and task characteristics in high performing R&D project teams (e.g., decentralized team structure for nonroutine tasks) but diminished associations in low performing project teams. The exploration of the composite relationship between organizational structure and task characteristics has been extended to the software engineering literature. Compared with tasks examined in the organizational literature, software development tasks are unique in that they are subjected to the structurational power of human actors since software artifacts are essentially encoded human intelligence [20]–[22]. In other words, task characteristics are endogenous rather than exogenous to software development research models. This unique aspect of software development tasks has motivated software engineering researchers to move beyond the unidirectional moderating role of task characteristics in conceptualizing the OIPT notion of “match.” For example, Cataldo et al developed a social– technical congruence measure that captures the proportion of collaboration activities actually occurred in development teams relative to the total number of collaboration activities required by interdependency among software development task assignments. A significant impact of this social–technical congruence on project productivity manifests the equally important roles of organizational structure and task characteristics in determining software project performance [23], [24]. In addition, Kim and Umanath [25] employed a multiplicative interaction model in examining the relationship between development team structure, software task characteristics, and project performance. This modeling approach revealed that team structure and task characteristics served mutually moderating roles in affecting project performance outcomes. In summary, both the organizational and software engineering literatures emphasize the inseparable role of organizational structure and task characteristics in organizational work performance. While the organizational literature portrays this inseparable role through the moderating effect of task characteristics on the causal pathway from organizational structure to work performance, the software engineering literature indicates that organizational structure may reciprocally moderate the effect of task characteristics on work performance due to the endogeneity of software development task characteristics. Taken together, prior organizational and software engineering research suggests that social processes and technical properties can play equally important and mutually moderating roles in software development performance. This mutually contingent (used interchangeably with mutually moderating) perspective is particularly relevant to OSS projects since the self-organized nature of OSS development can amplify the malleability of both OSS team structures [26] and OSS project architecture [8]. An understanding of the mutually moderating roles of team structures and project architecture can help OSS practitioners to orchestrate the social and the technical aspects of OSS development
593
Fig. 1. Collaboration network of an OSS project (ID#31602 on SourceForge.net).
in concert and harvest project performance gains from their joint effect. Our subsequent theorization of socio-technical interactions in OSS development is therefore grounded in the premise that OSS development team structure and OSS software architecture are equally important and mutually contingent factors in OSS development performance. B. Socio-Technical Interactions in OSS 1) Social—Development Team Structure: Following prior research, we view OSS development team structure as an important social aspect of OSS development since it manifests information processing capabilities of OSS workforce [17]–[19]. Our characterization of development team structures draws on concepts and analytical techniques of social network theory. Social network theory models individual actors as nodes of a graph joined by their relationships depicted as links between the nodes [27]. When relationships are defined as collaborations on a task, the social network is specified as a collaboration network [28]. We choose to generate collaboration networks on an intraproject level; that is, each collaboration network includes developers of a single OSS project as nodes and collaboration incidences on tasks (i.e., CVS files) of the same project as links. An intraproject collaboration network is a close-up view of relationship structures within a particular project. Compared with an interproject (i.e., community-level) network, an intraproject (i.e., project-level) network is more relevant to our research question since it allows us to evaluate how organizational structure of a particular OSS project team affects performance of the corresponding project. In OSS projects, the basic unit of work is a file in the OSS CVS system (concurrent versioning System, a database of all changes made to the source code of the project). Hence, we view collaboration tasks as the files in the CVS system. A collaboration incidence occurs when two developers make code commits to the same CVS file. A collaboration network refers to the graph made of open source developers as nodes and the collaboration incidences on the same CVS file as links. Fig. 1 shows an example of an intraproject collaboration network. In this figure, the nodes represent the developers who have submitted code commits to the CVS system of project 31602. Presence
594
IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, VOL. 60, NO. 3, AUGUST 2013
of a link between two nodes indicates that the developers represented by these two nodes have made code commits to the same CVS file in this project. We characterize OSS collaboration network structure by two commonly used measures: network size and network centralization. Network size is the number of nodes in a graph. It indicates the overall scope of the collaboration network [29]. Network size captures the open nature of OSS development where a large number of developers collaborate to develop software in contrast to traditional software development where the number of developers is comparatively lower. Network centralization indicates the extent to which a network is centralized around one or a few nodes [29]. Centralization of a network is measured in comparison to the most central network—the star network. In the star network, one central node connects to all of the other nodes while all other nodes are only connected to the central node. Any deviation from this structure indicates a reduction in network centralization. Network centralization captures the informal and organic aspect of OSS development where individual developers may work with much lower central hierarchical control compared to traditional software development. Although there are other metrics of collaboration network structure, we chose to focus on the two measures because network size and centralization reflect the major difference between the two alternative philosophies for organizing programming teams [30]–[32]: the chief programmer team [33] and the egoless team [34]. The chief programmer team has been compared to a surgical team [30]. It represents the mechanistic process that dominates the traditional software development. Such teams are intentionally small and centralized around a few programming experts [31], [35]. In contrast, the egoless team reflects the organic software development process that has emerged in the OSS community. It advocates decentralized communication and collaboration among programmers and is less concerned about team size [31], [35], [36]. An examination of network size and centralization of OSS teams helps us to understand the link between the unique characteristics of OSS collaboration network structure and OSS development performance. 2) Technical—Software Architecture: An important technical aspect of software development projects is the structural interdependency among processing elements of the software being developed [37]. Software structural interdependency is “the strength of association established by a connection from one module to another.” [38, p. 117]. To date representations of software structural interdependency have evolved from ad hoc descriptions to a repertoire of methods, techniques, and notations [39]. This repertoire of formal representations is typically referred to as software architecture. In other words, software architecture is a formal way to describe the structural interdependency of a software system in terms of components and their interconnections [39]–[41] (we use structural interdependency and software architecture interchangeably). By decomposing the overall task into parts and then designing, implementing, or maintaining each individual part, software architecture provides a feasible way to develop and manage large systems. It reduces the complexity of software development projects to associations
within and among processing elements (i.e., components) of a system [42]–[44]. Prior research shows that software architecture is an important indicator of the information processing need of a software project [37]. In general, the load and diversity of information needed to be processed by a software project increases with the level of structural interdependency among its processing elements [45]. This relationship can be particularly evident in OSS development projects since explicit software architecture is either undesirable or unenforceable with a large number of autonomous participants [39]. The structural interdependency of OSS projects often evolves as a result of past development activities. Consequently, considerable variations can exist in structural interdependency levels of different OSS projects, leading to substantially different needs for coordination/consolidation of knowledge elements. 3) Interaction Between Team Structure and Software Architecture: Although OIPT offers a widely cited notion of “match,” it does not specify how this notion should be modeled in empirical research [46]. Tushman [19] employed a median split approach to compare associations between organizational structures and task characteristics across high performing projects versus low performing projects. Other researchers have modeled the “match” between organizational structure and task characteristics as a single construct, often called alignment or congruence [23], [24], [47]. Our conception departs from these approaches because they do not richly incorporate the mutually contingent effects of team structure and software architecture. The media split approach uses performance outcome as a criterion for dichotomizing data rather than a dependent variable in the research model. It therefore throws away information regarding the magnitude of contingency effects on work performance. Similarly, when reducing socio-technical interactions to a single alignment/congruence construct, researchers lose a nuanced view of how organizational structure and task characteristics reciprocally influence each other’s impact on work performance. In this study, interaction between the social and the technical aspects in an organizational work context refers to the mutually moderating or mutually contingent relationship between team structure (network size and centralization) and software architecture (structural interdependency) in OSS development. In other words, we conceptualize socio-technical interactions in OSS projects as multiplicative interaction of team structure and software architecture [48]. Prior software engineering research [25] indicates the effectiveness of this multiplicative approach for a precise account of how effect of team structure is contingent on software architecture and how team structure simultaneously moderates the impact of software architecture on OSS development performance.
III. HYPOTHESES We develop our hypotheses following the central argument of OIPT that organizational work performance is jointly determined by information processing capabilities of the workforce
NAN AND KUMAR: JOINT EFFECT OF TEAM STRUCTURE AND SOFTWARE ARCHITECTURE
and information processing needs of the task. As discussed before, information processing capabilities of a development team is embodied by the structure of the team collaboration network. Information processing need of the software development process is represented by the structural interdependency of the software being developed. The mutually contingent effects of development team structure and software structural interdependency on project performance are specified below.
595
capabilities in this team. Therefore, Hypothesis 1: Collaboration network size and software structural interdependency mutually and positively moderate each other’s impact on OSS project performance; that is, impact of network size on project performance is more likely to be positive when software structural interdependency is higher, and impact of software structural interdependency on project performance is more likely to be favorable when network size increases.
B. Network Centralization and Structural Interdependency A. Network Size and Structural Interdependency Network size, one aspect of development team structure, has mixed implications to information processing capabilities of a team. On one hand, larger networks incur higher coordination and communication cost [49]. This is recognized by Brooks’ Law that, “adding manpower to a late software project makes it later” [30]. On the other hand, larger networks carry more diverse expertise and perspectives. They are better at specialization and division of labor among team members. As a widely cited quote states that, “given enough eyeballs, all bugs are shallow” [36]. The overall effect of network size on task performance depends on the information processing need entailed by structural interdependency of OSS projects. As mentioned earlier, the load and diversity of information needed to be processed usually increases with the level of structural interdependency of a software project [37], [45]. Projects with a lower level of structural interdependency do not take full advantage of the diverse expertise and perspectives in a large team while these projects have to bear the increased communication cost in such a team. Network size can therefore have a negative impact on the project performance of these projects. In a project with a high level of structural interdependency, capabilities of a large team in processing a heavy load of diverse information can produce salient project performance gains, compensating for the communication cost associated with a large team. The negative impact of network size on project performance can be dampened in this scenario. Reciprocating effects of network size on project performance, impact of software structural interdependency on project performance can vary across development teams with different network size. In traditional software development contexts where team size tends to be small, software structural interdependency is often found to increase software development effort which in turn can impair project performance [50]. However, recent OSS research suggests that OSS development may defy the negative effect of software structural interdependency on development effort due to its self-organized nature [12]. With the discretion to adjust development team structure according to project characteristics, an OSS team can recruit new members when a high level of structural interdependency is perceived. This subsequently allows the project to take advantage of information processing capabilities afforded by a large collaboration network. On the other hand, when a team is unwilling or unable to recruit additional members for a project with a high level of structural interdependency, project performance may be impaired as a result of insufficient information processing
Similar to network size, network centralization has mixed effects on information processing capabilities of a team. A centralized team is better at identifying and consolidating expertise in a team. It incurs lower coordination cost than a chain-like network (low network centralization). However, a centralized team structure imposes significant information processing load on the central nodes. The limitation of human cognition can impede the effectiveness of the whole team [51], [52]. Projects with a low level of structural interdependency do not require much consolidation among knowledge domains [37], [45]. The “bottle neck” effect of a few central nodes can outweigh the realized benefit of identifying and coordinating expertise in a centralized team, leading to suboptimal project performance [37], [45]. However, as the structural interdependency of a project increases, the advantage of a centralized team structure in identifying and consolidating expertise from a wider range of knowledge domains becomes important. This advantage enables a more centralized team to achieve better performance. Software structural interdependency, while moderating the effect of network centralization on project performance, is subject to a contingency effect of network centralization. The tendency for software structural interdependency to negatively affect project performance can be particularly strong in a team with chain-like structure since such a team is relatively ineffective for knowledge consolidation. This tendency can be alleviated by a centralized team given this team’s capabilities in identifying diverse information and coordinating information processing activities. Furthermore, since OSS project code reflects the organizational structure that produced it, a centralized team may evolve its project code toward a high level of structural interdependency in order to capture the intrinsic relationship among team members’ expertise. This project code can in turn serve as a means to facilitate coordination among developers, dampening the negative impact of structural interdependency on project performance [7]. Therefore, Hypothesis 2: Collaboration network centralization and software structural interdependency mutually and positively moderate each other’s impact on OSS project performance; that is, impact of network centralization on project performance is more likely to be positive in projects with a higher level of structural interdependency, and impact of software structural interdependency on project performance is more likely to be favorable when network centralization increases.
IV. DATA AND CONSTRUCTS The data for the study were collected from SourceForge.net: a central location for software developers to control and manage
596
IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, VOL. 60, NO. 3, AUGUST 2013
open source software development. It serves as a source code repository for the OSS projects. As of April 2012, SourceForge.net had more than 230,000 projects and more than 2 million developers [53]. A. Data Collection From hundreds of thousands of OSS projects hosted at SourceForge.net, we selected a sample of projects for analysis. The selection of the sample was based on three key considerations: 1) project keeps detailed CVS information in the public domain; 2) project registered between Feb 2001 and Nov 2001; and 3) project has at least ten developers in the collaboration network. The first criterion makes our data collection practically possible. We have used automated spider scripts to collect CVS history information and such data collection is only possible for publicly available CVS archives. The second criterion ensures that the sampled projects have sufficient elapsed time since starting so that significant amount of development activity has already taken place and the periodic surges or dips in development activities have evened out and will not significantly bias the construct measures. The proximate starting times for the projects in the sample also help us to eliminate longitudinal effects that may confound our results. The third criterion is to safeguard the integrity of the collaboration network measures. Collaboration network measures are sensitive to the size of the network [29]. Particularly, when the network size is small, some network measures, such as centralization, become meaningless. Therefore, we have restricted our sample to projects with at least ten developers in the collaboration network. Following the sampling criteria earlier, our final sample consists of 96 projects. Collecting detailed data from CVS archives is a complex process that can lead to data errors and inconsistencies. Previous research has highlighted the problems in collecting data from CVS archives of SourceForge [54]. Accordingly, we have taken special care in the data collection process. The data are collected using custom made automated programs that access CVS pages, parse the information in the pages and save the information in a database. The programs include safeguards to make sure that the data collected do not have errors. Furthermore, we manually checked a subset of collected data to verify that the automated data collection resulted in a usable dataset. Some parts of the dataset that were less voluminous (e.g. programming language, project status) were collected manually from corresponding SourceForge pages. B. Construct Operationalization 1) Collaboration Network Structure: As discussed earlier, we employ network size and network centralization to measure collaboration network structure. Network Size is measured by the total count of nodes in the network [27]. The network centralization measure follows the approach proposed by Freeman [55]. It expresses the degree of inequality in a network as a percentage of that of a perfect star network of the same size. The higher the value, the more centralized the network is.
We employed the widely used social network analysis software UCINET [56] to compute the structure metrics for the collaboration networks in our sample. 2) Software Structural Interdependency: A direct operationalization of software structural interdependency would require a manual review of software source code and evaluation of “the strength of association established by a connection from one module to another.” [38, p. 117] Although automatic tools (e.g., Lattix) are available for evaluating software architecture, these tools are usually limited to a few widely adopted programming languages such as C/C++ and Java. The overwhelming amount of source code and the wide range of programming languages in our sample prevent us from measuring software structural interdependency either manually or using the automatic tools. Hence, we have taken a more tractable approach of using a surrogate measure, the average number of source code task files (CVS files) per folder in the CVS file tree, to measure software structural interdependency. Prior software engineering research found that the level of software structural interdependency tends to increase exponentially as the number of modules increases linearly in a project [30], [57]. Building on this observation, recent OSS research has defined each CVS file as a module and employed the number of CVS files as a surrogate measure of interdependency among modules in an OSS project [12]. Meanwhile, in a study about OSS code MacCormack et al. [8, p. 1019] pointed out that, “in software designs, programmers tend to group source files of a related nature into ‘directories’ that are organized in a nested fashion.” This suggests that the CVS file tree structure should be considered in gauging the level of software structural interdependency in addition to the number of CVS files in a project. Our measure of structural interdependency, the average number of CVS files per folder in the CVS file tree, captures both the total number of CVS files in an OSS project and the CVS tree structure of the same project. We use the last folders in the CVS file tree of a project in calculating this structural interdependency measure. A large number of CVS files per folder indicates a high level of structural interdependency since CVS files grouped into the same folder are typically related [8] and structural interdependency tends to increase exponentially with each addition of a processing element (i.e., a CVS file) [12]. 3) OSS Development Performance: Traditional indicators of software development performance, such as cost of development and development within schedule, are not applicable to OSS contexts because OSS projects generally do not involve a budget or a deadline. Prior research on OSS development has employed various OSS project performance measures such as OSS developers’ perceived project success [58], developer’ decisions in joining an OSS project team [5], [11], [59], the percentage of resolved bugs [12], increase in lines of code (LoC) [60], promotion to higher ranks in an OSS project [3], the number of subscribers associated with a project [61], the percentage of closed change requests [4], and number of code commits [10], [62], [63]. Among these measures, we choose the number of code commits per developer per day as our measure of OSS development performance. A code commit refers to a change in the CVS archive through adding, deleting, or changing software
NAN AND KUMAR: JOINT EFFECT OF TEAM STRUCTURE AND SOFTWARE ARCHITECTURE
597
TABLE I DESCRIPTIVE STATISTICS
code. The number of code commits is an objective measure of project performance enacted by the key social actors (development teams) with respect to the main technical element (software code) of an OSS project. It has been repeatedly used by prior studies regarding effects of social processes on technical success of OSS projects [10], [62], [63]. These studies indicate that the number of code commits is particularly suited for an examination of both social and technical factors in OSS development. Our measure, the number of code commits per developer per day, allows us to compare performance across projects with different team size and duration. It transforms the discrete counts of code commits to a continuous growth rate of an OSS project. 4) Control Variables: We controlled for the following variables based on the previous literature: Product Size: In the software engineering literature, product size has been identified as an important factor in manpower, time, and productivity of a project [64]–[68]. Therefore, product size is used as a control variable. According to the literature, we measure product size as the total LoC of an OSS project. Programming Language: Software programming language is another well-recognized factor that may affect software performance [69], [70]. In our data, there are 24 different programming languages utilized by the sampled OSS projects. Many projects employ more than one language. Among the 24 languages, Java, C++, and C are the most frequently used. Due to the limited sample size, we created four dummy (binary) variables: “Java,” “C++,” and “C” to account for the top three most frequently used languages, and “other” to represent all the other languages. A project receives a value of 1 for a language dummy variable if it uses the language in consideration, a value of 0 otherwise. In other words, a project has four language values, one for each of the four dummy variables. The “other” language variable was defined as the base category and left out of the regression model in order to prevent dummy variable trap. License Type: OSS projects use different licensing schemes and the specific license type used may affect developer motivation and project success [71]. We control for license types to take into account potential project performance variations due to commercial or noncommercial nature of a license. License type is measured as a binary variable. All projects with the most
popular OSS license, the GPL (general public license, usually indicating a noncommercial OSS product) license, are given a value of 1. All other projects have a value of 0. V. DATA ANALYSIS AND RESULTS A. Hypothesis Testing and Coefficient Interpretation In the regression model, the number of code commits per developer per day, network size and product size are log transformed to account for the nonlinear relationship between project performance and network size and product size [65], [66]. This transformation also helps to mitigate heteroskedasticity and nonnormal residual problems identified by regression diagnostics. Other variables remain in their original form, because it is possible for them to take “0” as a value. A log transformation of these variables would arbitrarily truncate out meaningful data points. B. Results The ordinary least squares (OLS) technique is employed in hypothesis testing. Before applying the OLS model, we tested for outlier values in the sample. We used the widely adopted Grubbs’ test for detecting outlier observations [72]. A conservative confidence level (99.99%) was adopted for this test. One of the sampled projects was excluded from our hypothesis testing as it was detected by Grubbs’ test as an outlier (p < 0.0001). This reduced our sample size to 95 projects. As the Grubbs’ test is designed to only work with Normal distributions, we performed the Kolmogorov–Smirnov test which did not reject the normality assumption [73]. The descriptive statistics and the correlation matrix for the resulting dataset are shown in Tables I and II. Note that the variables that are part of the interaction terms are centered before being used in the regression model to account for multicollinearity. The means shown in Table I are uncentered values. Standard assumptions of OLS estimation were then assessed. The presence of heteroskedasticity was tested with the Breusch– Pagan/Cook–Weisberg test for heteroskedasticity. Since the model indicated significant heteroskedasticity problem, we conducted robust regression and calculated Huber–White robust estimates of standard errors [74]. The model incorporates several
598
IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, VOL. 60, NO. 3, AUGUST 2013
TABLE II CORRELATION MATRIX
TABLE III RESULTS OF THE OVERALL OLS MODEL
interaction effects. In such models, multicollinearity is a common problem. We tested for multicollinearity using the variance inflation factor and found significant multicollinearity (VIF is 22.569). We attempted to correct for multicollinearity by centering the interacting variables on their mean. The centering reduced multicollinearity to an acceptable level [75] (VIF is 2.03). The results of the OLS analysis are reported in Table III. H1 posits that collaboration network size and software structural interdependency mutually and positively moderate each other’s impact on OSS project performance. This concerns the coefficient of the interaction term of network size and software structural interdependency. As shown in Table III this coefficient is positive and significant (β = 0.0443, p = 0.008) in our model testing results. Hence H1 is supported. More interesting insights surfaced in post-hoc probing of this result. Following Aiken and West [48], we calculated simple slopes of the effect of network size on the number of code commits per developer per day at three values of software structural interdependency (SSI): SSIhigh = 4.774, SSI-mean = 0, SSI-low = −4.774. These three values are one standard deviation above the mean, the mean, and one standard deviation below the mean of centered SSI values, respectively. The simple slopes of network size (−0.0911 at SSI-high, −0.3026 at SSI-mean, and −0.5127 at SSI-low) are plotted in Fig. 2. To gain a complete view of the mutually mod-
Fig. 2.
Simple Slopes of Network Size.
erating effect of network size and SSI, we also computed the simple slopes of the effect of SSI on the number of code commits per developer per day at one standard deviation above the mean (NS-high = 0.5406), the mean (NS-mean = 0), and one standard deviation below the mean (NS-low = −0.5406) values of network size. These simple slopes are 0.0193, −0.0046, and −0.0285, respectively (see Fig. 3).
NAN AND KUMAR: JOINT EFFECT OF TEAM STRUCTURE AND SOFTWARE ARCHITECTURE
Fig. 3.
599
Simple Slopes of SSI.
Fig. 2 shows that network size tends to have a negative effect on project performance in terms of the number of code commits per developer per day. However, this negative effect can be mitigated by the interaction between network size and software structural interdependency since the negative simple slopes of network size become less steep as the structural interdependency values increase. In order to find the SSI value at which the effect of network size on project performance turns from negative to positive, we equate the derivative of the number of code commits with respect to network size to zero (−0.3026 + 0.0443∗SSI = 0). Since the result of this equation is centered SSI value, we add the mean of SSI to this result to gain a precise view of the inflection point. The result indicates that when there are more than 17 CVS files per folder in a project (or when software structural interdependence is 1.44 standard deviations above the mean), the effect of network size of project performance turns positive. With respect to the effect of software structural interdependency on project performance, Fig. 3 reveals that this effect is negative in small development teams but positive in large teams. Therefore, the interaction of network size and structural interdependency plays a key role in the relationship between project characteristics and project performance. By equating the derivative of the number of code commits with respect to SSI to zero (−0.0046 + 0.0443∗Network-Size = 0) and adding the mean value of network size to this result, we found that when there are more than 21 members in a project team (or when network size is 0.19 standard deviation above the mean) the effect of SSI on project performance becomes positive. H2 proposes that collaboration network centralization and software structural interdependency mutually and positively moderate each other’s impact on OSS project performance. As shown in Table III, the coefficient of the interaction term of network centralization and software structural interdependency is positive and significant (β = 0.3650, p = 0.027). Thus, H2 is supported. Following the same post-hoc probing approach for H1, we calculated the simple slopes of the effect of network centralization (NC) on the number of code commits per developer per day at SSI-high, SSI-mean, and SSI-low values (refer back to H1 post-hoc probing). The simple slopes are 1.3918,
Fig. 4.
Simple Slopes of Centralization.
Fig. 5.
Simple Slopes of SSI.
−0.3507, and −2.0932, respectively (see Fig. 4). Meanwhile, the simple slopes of the effect of SSI on the number of code commits per developer per day at one standard deviation above the mean (NC-high = 0.0709), the mean (NC-mean = 0), and one standard deviation below the mean (NC-low = −0.0709) values of network centralization are calculated. These simple slopes are 0.0213, −0.0046, and −0.0305, respectively (see Fig. 5). Figs. 4 and 5 suggest that network centralization and software structural interdependency reciprocally remedy each other’s negative effect on project performance. Setting the derivative of the number of code commits with respect to network centralization to zero (−0.3507 + 0.3650 ∗ SSI = 0), we found that at a moderate level of software structural interdependency (0.2 standard deviation above mean or more than 11 CVS files per folder) the effect of network centralization turns from negative to positive. The derivative analysis of the number of code commits with respect to SSI (−0.0046 + 0.3650 ∗ NetworkCentralization = 0) reveals that 0.18 standard deviation or 0.18 degree of network centralization is the inflection point where the effect of structural interdependency turns from negative to positive.
600
IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, VOL. 60, NO. 3, AUGUST 2013
VI. DISCUSSION In reference to our research question, the data analysis results paint a fine-grained picture of the joint effect of OSS development team structure and software architecture. They indicate that development team structure and software architecture have mutually contingent effects on OSS project performance. In particular, larger teams tend to produce more favorable project performance when the project being developed has a high level of structural interdependency. Meanwhile, more centralized teams are associated with better project performance in projects with a high level of structural interdependency. For projects with a low level of structural interdependency, smaller teams tend to produce better project performance. Project performance gains can also be achieved when less centralized teams are coupled with projects with a low level of structural interdependency. Prior research suggests that an increase in network size can lead to changes in collaboration network structures such as network centralization [76]. This implies that the interaction effect of network centralization and software structural interdependency may be moderated by different network size. Furthermore, it can be argued that the effectiveness of chief programmer teams (i.e., small and centralized collaboration network structure) or egoless teams (i.e., large and decentralized network structure) depends on the structural interdependency of the software project being developed. We performed post-hoc analysis in reference to the possible three-way interaction among network size, network centralization, and software structural interdependency. The coefficient of the three-way interaction term is not significant (β = 0.0425, p = 0.857) in the post-hoc regression analysis. This indicates that network size and network centralization do not moderate each other’s interaction with software structural interdependency; the joint effect of team structure and software architecture manifests primarily as the mutually contingent effect of network size and software structural interdependency, as well as the mutually moderating effect of network centralization and software structural interdependency. A. Research Implication A key motivation leading us to undertake this study is the lack of research directed at socio-technical interactions in a system development context. Theoretical development and empirical findings of this study help close this research gap in two important ways. First, this study contributes to a unified theory regarding the relationship between IT properties and organizing practices in a collective work process. Since IT systems are central tasks in OSS development, this study recognizes IT properties (software structural interdependency) and organizing practices (development team structure) as equally important and tightly intertwined factors in work performance outcomes. Furthermore, given the malleability of team structure and software architecture in OSS development projects, our research context allows an opportunity to capture the mutually contingent relationship between OSS development team structure and software architecture through their multiplicative interaction effect on project performance. Because the relationship between IT properties and organizing practices characterizes the deep
logic underlying various manifestations of socio-technical interaction, a unified theory about this relationship can lead us to consistent and cumulative insights regarding the nature and relevance of today’s information technology. A second contribution of this study is to conceptualize and empirically evaluate the role of socio-technical interactions in OSS projects. By integrating insights from the organizational and software engineering literatures in the overarching framework of OIPT [15], [16], we develop a theoretical perspective that bridges prior OSS research streams focusing on either OSS developer behaviors or OSS project characteristics. Our empirical findings provide formal support to the conjectures regarding the inseparable role of social and technical factors in OSS development processes e.g. [14]. They also help user in new research models that transcend linear effects and incorporate multiplicative interaction terms embodying self-organized socio-technical interactions in OSS development context. More than three decades ago information systems (IS) researchers have conceptualized an organizational work system as a function of two jointly independent, but correlative interacting systems— the social and the technical [77]. This perspective, namely the socio-technical perspective, has been recognized as an effective approach for system design [77]. Research models incorporating socio-technical interaction constructs can take advantage of the socio-technical perspective and advance OSS theories toward full and precise accounts of how self-organized OSS participants and continuously evolving OSS software architecture collaboratively produce OSS development processes and outcomes. B. Practical Implication To OSS practitioners, the main implication of our findings is that they can gain the best of both worlds by adopting a hybrid software development process that incorporates strengths of traditional software development model and OSS model [78]. Since development team size and team centralization do not moderate each other’s relationship with project performance (i.e., a lack of support to the three-way interaction in our posthoc analysis), OSS developers can combine team size and centralization in different ways in order to maximize performance gains from their respective interaction effects with software architecture. For a project with a high level of structural interdependency, they can take advantage of the OSS development model in tapping into diverse expertise and perspectives of a large collaboration network over the Internet while following the chief programmer team logic in achieving knowledge consolidation capabilities of a few central nodes. For example, the Linux development team is large yet centralized around a few core developers. This team structure allows dispersed team members to effectively tackle the structural complexity of the Linux project [13]. Small teams typical of the traditional software development model and a decentralized team structure of OSS model can be combined for projects with a low level of structural interdependency. A case in point is the project with the highest number of code commits per developer per day in our sample. This project is characterized by a low level of structural interdependency (0.66 standard deviation below the average), a
NAN AND KUMAR: JOINT EFFECT OF TEAM STRUCTURE AND SOFTWARE ARCHITECTURE
small team size (1 standard deviation below the average), and a low team centralization (0.79 standard deviation below the average). In addition, OSS practitioners can evolve their project code toward a higher or lower level of structural interdependency according to their team structure in order to induce positive effects of socio-technical interactions. For example, Mozilla was redesigned toward a lower level of structural interdependency so that a less focused team distributed across geographic and organizational boundaries could contribute to it [8]. Our empirical analysis demonstrates a feasible way for OSS practitioners to quantify their team structure and software architecture as measures used in this paper were calculated using widely available CVS information for OSS projects. While cautioning OSS practitioners against applying the inflection points (where an effect turns from negative to positive) resulted from our sample directly to their projects, these inflection points can be used as qualitative benchmarks for OSS practitioners to evaluate the socio-technical interactions in their projects. For example, since the inflection point where the effect of network centralization on project performance turns from negative to positive is slightly above the mean value of software structural interdependency (0.20 standard deviation above the mean), OSS practitioners can compare their project’s structural interdependency (as the number of CVS files per folder) to that of other projects. If this value is noticeably higher than the average of other projects, project leaders can increase their team’s centralization in order to improve project performance. C. Limitations and Future Research There are several limitations of this study. These limitations also point toward future research that can build upon and extend the insights provided by this paper. First, to gain analytical tractability, we employ a surrogate measure of software structural interdependency. Although this measure has been employed by other OSS research [12], it may not fully capture the semantic of structural interdependency. To verify findings from this study, future research can seek to apply other metrics of software structural interdependency to OSS projects. Second, we have based our analysis on the dataset collected from the CVS archives of the OSS projects. While this captures the efforts of all the developers who have contributed code to the project, it does not take into account the actions of other participants to the OSS project such as user testers and bug reporters whose contributions are not recorded in the CVS system. Future research can extend our analysis to include the impacts of peripheral participants to obtain a more inclusive view of OSS development. Third, code commits do not reflect quality of code changes. To gain a more complete perspective of OSS project performance, future research models can employ quality-related project performance outcomes such as bug reports. Last but not least, our team structure measures are extracted from CVS files. While this operationalization captures collaboration incidences among OSS developers, it only provides a partial view of social relationships among OSS team members. Future research can generate team structures from multiple data sources such as emails, project membership, as well as CVS files.
601
VII. CONCLUSION A central concern for IS researchers is to understand IT’s deep engagement in social and organizational life. In this study, we recognize OSS development projects as an ideal research context for examining one instance of IT’s engagement in organizational life: self-organized OSS development process jointly enacted by developer behaviors and project characteristics. For IS research, our theoretical development and empirical findings provide new insights into the significant role of socio-technical interactions in shaping OSS project performance. For OSS practice we show that, as the Linux inventor, Linus Torvalds points out, OSS practitioners “still have to lead it [OSS project] in the right directions to succeed.” REFERENCES [1] E. von Hippel and G. von Krogh, “Open source software and the “privatecollective” Innovation model: Issues for organization science,” Org. Sci., vol. 14, pp. 209–223, 2003. [2] B. Perens, “The open source definition,” in Open Sources: Voices Open Source Revolution. Sebastopol, CA: O’Reilly, 1999, pp. 171–188. [3] J. A. Roberts, Il-Horn Hann, and S. A. Slaughter, “Understanding the motivations, participation, and performance of open source software developers: A longitudinal study of the apache projects,” Manage. Sci., vol. 52, pp. 984–999, 2006. [4] K. Stewart and S. Gosain, “Impact of ideology in OSS development teams,” Manage. Inform. Syst. Quart., vol. 30, pp. 291–314, 2006. [5] Y. Fang and D. Neufeld, “Understanding sustained participation in open source software projects,” J. Manage. Inform. Syst., vol. 25, pp. 9–50, 2009. [6] S. K. Shah, “Motivation, governance, and the viability of hybrid forms in open source software development,” Manage. Sci., vol. 52, pp. 1000–1014, 2006. [7] E. Capra, C. Francalanci, and F. Merlo, “An empirical study on the relationship among software design quality, development effort, and governance in open source projects,” IEEE Trans. Softw. Eng., vol. 34, no. 6, pp. 765–782, Nov./Dec. 2008. [8] A. MacCormack, J. Rusnak, and C. Y. Baldwin, “Exploring the structure of complex software designs: An empirical study of open source and proprietary code,” Manage. Sci., vol. 52, pp. 1015–1030, 2006. [9] C. Y. Baldwin and K. B. Clark, “The architecture of participation: Does code architecture mitigate free riding in the open source development model?,” Manage. Sci., vol. 52, pp. 1116–1127, 2006. [10] R. Grewal, G. L. Lilien, and G. Mallapragada, “Location, location, location: How network embeddedness affects project success in open source systems,” Manage. Sci., vol. 52, pp. 1043–1046, 2006. [11] J. Hahn, J. Y. Moon, and C. Zhang, “Emergence of new project teams from open source software developer networks: Impact of prior collaboration ties,” Inform. Syst. Res., vol. 19, pp. 369–391, 2008. [12] C. L. Huntley, “Organizational learning in open-source software projects: an analysis of debugging data,” IEEE Trans. Eng. Manage., vol. 50, no. 4, pp. 485–493, Nov. 2003. [13] G. K. Lee and R. E. Cole, “From a firm-based to a community-based model of knowledge creation: The case of Linux kernel development,” Org. Sci., vol. 14, pp. 633–649, 2003. [14] A. MacCormack, J. Rusnak, and C. Y. Baldwin, “Exploring the structure of complex software designs: An empirical study of open source and proprietary code,” Manage. Sci., vol. 52, pp. 1015–1030, 2006. [15] J. R. Galbraith, “Organization design: An information processing view.’ Interfaces, vol. 4, no. 3, pp. 28–36, 1974. [16] J. R. Galbraith, Designing Complex Organizations. Boston, MA: Addison-Wesley Longman, 1973. [17] C. Gresov, “Exploring fit and misfit with multiple contingencies,” Administ. Sci. Quart., vol. 34, pp. 431–453, 1989. [18] G. L. Stewart and M. R. Barrick, “Team structure and performance: Assessing the mediating role of intrateam process and the moderating role of task type,” Acad. Manage. J., vol. 43, pp. 135–148, 2000. [19] M. L. Tushman, “Work characteristics and subunit communication structure: A contingency analysis,” Administ. Sci. Quart., vol. 24, pp. 82–98, 1979.
602
IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, VOL. 60, NO. 3, AUGUST 2013
[20] W. J. Orlikowski and D. Robey, “Information technology and the structuring of organizations,” Inform. Syst. Res., vol. 2, pp. 143–169, 1991. [21] C. De Souza, J. Froehlich, and P. Dourish, “Seeking the source: Software source code as a social and technical artifact,” presented at the GROUP’05, Sanibel Island, FL, 2005. [22] M. E. Conway, “How do committees invent?,” Datamation, vol. 14, pp. 28–31, 1968. [23] M. Cataldo, P. A. Wagstrom, J. D. Herbsleb, and K. M. Carley, “Identification of coordination requirements: Implications for the design of collaboration and awareness tools,” presented at the 20th Anniversary Conf. Comput. Supported Cooperative Work, Banff, Alberta, Canada, 2006. [24] M. Cataldo, J. D. Herbsleb, and K. M. Carley, “Socio-technical congruence: A framework for assessing the impact of technical and work dependencies on software development productivity,” in Proc. Second ACM-IEEE Int. Symp. Empirical Softw. Eng. Meas., 2008, pp. 2–11. [25] K. K. Kim and N. S. Umanath, “Structure and perceived effectiveness of software development subunits: A task contingency analysis,” J. Manage. Inform. Syst., vol. 9, pp. 157–181, 1992. [26] C. Bird, D. Pattison, R. D’Souza, V. Filkov, and P. Devanbu, “Latent social structure in open source projects,” presented at the 16th SIGSOFT Int. Symp. Found. Softw. Eng., Atlanta, GA, Nov. 9–15, 2008. [27] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. Cambridge, UK: Cambridge Univ. Press, 1994. [28] G. Madey, V. Freeh, and R. Tynan, “The open source software development phenomenon: An analysis based on social network theory,” in Proc. Amer. Conf. Inform. Syst., 2002, pp. 1806–1813. [29] R. A. Hanneman and M. Riddle, Introduction to Social Network Methods. Riverside, CA: Univ. of California Riverside, CA, 2005. [30] F. P. Brooks, The Mythical Man-Month. Reading, MA: Addison-Wesley, 1995. [31] M. Mantei, “The effect of programming team structures on programming tasks,” in Commun. ACM, vol. 24, 1981, pp. 106–113. [32] E. Yourdon, Managing the Structured Techniques. Upper Saddle River, NJ: Prentice-Hall, 1988. [33] F. T. Baker, “Chief programmer team management of production programming,” IBM Syst. J., vol. 11, pp. 56–73, 1979. [34] G. Weinberg, The Psychology of Computer Programming. New York: Van Nostrand Reinhold, 1971. [35] S. Faraj and V. Sambamurthy, “Leadership of information systems development projects,” IEEE Trans. Eng. Manage., vol. 53, no. 2, pp. 238–249, May 2006. [36] E. S. Raymond, The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. Sebastopol, CA: O’Reilly, 2001. [37] D. P. Darcy, C. F. Kemerer, S. A. Slaughter, and J. E. Tomayko, “The structural complexity of software an experimental test,” IEEE Trans. Softw. Eng., vol. 31, no. 11,, pp. 982–995, 2005. [38] W. P. Stevens, P. G. J. Myers, and L. L. Constantine, “Structured design,” IBM Syst. J., vol. 13, no. 2, pp. 115–139, 1974. [39] I. Georgiadis, J. Magee, and J. Kramer, “Self-organising software architectures for distributed systems,” in Proc. First Workshop Self-Healing Syst., 2002, pp. 33–38. [40] D. Garlan, “Software architecture: A roadmap,” presented at the Conf. Future Software Eng., Limerick, Ireland, 2000. [41] H. Muccini, P. Inverardi, and A. Bertolino, “Using software architecture for code testing,” IEEE Trans. Softw. Eng., vol. 30, no. 3, pp. 160–171, Mar. 2004. [42] J. Zhao, “On assessing the complexity of software architectures,” presented at the 3rd Int. Workshop Software Archit., Orlando, FL, 1998. [43] J. A. Stafford and A. L. Wolf, “Architecture-Level dependence analysis for software systems,” Int. J. Softw. Eng. Knowl. Eng., vol. 11, pp. 431– 451, Aug. 2001. [44] M. AlSharif, W. P. Bond, and T. Al-Otaiby, “Assessing the complexity of software architecture,” presented at the 42nd Annu. Southeast Regional Conf., Huntsville, AL, 2004. [45] D. J. Campbell, “Task complexity: A review and analysis,” Acad. Manage. Rev., vol. 13, pp. 40–52, 1988. [46] C. B. Schoonhoven, “Problems with contingency theory: Testing assumptions hidden within the language of contingency “theory”,” Administ. Sci. Quart., vol. 26, pp. 349–377, 1981. [47] M. E. Sosa, S. D. Eppinger, and C. M. Rowles, “The misalignment of product architecture and organizational structure in complex product development,” Manage. Sci., vol. 50, pp. 1674–1689, Dec. 2004. [48] S. G. West and L. S. Aiken, Multiple Regression: Testing and Interpreting Interactions. Newbury Park, CA: Sage, 1991.
[49] R. Guimera, B. Uzzi, J. Spiro, and L. A. N. Amaral, “Team assembly mechanisms determine collaboration network structure and team performance,” Science, vol. 308, pp. 697–702, 2005. [50] D. E. Harter, M. S. Krishnan, and S. A. Slaughter, “Effects of process maturity on quality, cycle time, and effort in software product development,” Manage. Sci., vol. 46, pp. 451–466, 2000. [51] M. E. Shaw, “Some effects of unequal distribution of information upon group performance in various communication nets,” J. Abnorm. Psychol., vol. 49, pp. 547–53, 1954. [52] K. Sandoe, Ed., Organizational Mnemonics: Exploring the Role of Information Technology in Collective Remembering and Forgetting (Simulating Organizations: Computational Models of Institutions and Groups Table of Contents). Menlo Park, CA: AAAI Press/The MIT Press, 1998. [53] SourceForge. (2012). What SourceForge.net. [Online]. Available: https:// sourceforge.net/apps/trac/sourceforge/wiki/What%20is%20SourceForge. net? [54] J. Howison and K. Crowston, “The perils and pitfalls of mining SourceForge,” presented at the Int. Workshop Mining Software Repositories, Edinburgh, Scotland, 2004. [55] L. C. Freeman, “Centrality in social networks: Conceptual clarification,” Social Netw., vol. 1, pp. 215–239, 1979. [56] S. P. Borgatti, M. G. Everett, and L. C. Freeman, UCINET IV Version 1.00. Columbia, SC: Analytic Technologies, 1992. [57] M. Belady and M. Lehmann, “Programming system dynamics,” in IBM Research Report RC3546. London, U.K.: Academic Press, 1971. [58] P. Agerfalk and B. Fitzgerald, “Outsourcing to an unknown workforce: Exploring opensourcing as a global sourcing strategy,” Manage. Inform. Syst. Quart., vol. 32, pp. 385–409, 2008. [59] R. P. Bagozzi and U. M. Dholakia, “Open source software user communities: A study of participation in linux user groups,” Manage. Sci., vol. 52, pp. 1099–1115, 2006. [60] J. W. Paulson, G. Succi, and A. Eberlein, “An empirical study of opensource and closed-source software products,” IEEE Trans. Softw. Eng., vol. 30, no. 4, pp. 246–256, Apr. 2004. [61] K. J. Stewart, A. P. Ammeter, and L. M. Maruping, “Impacts of license choice and organizational sponsorship on user interest and development activity in open source software projects,” Inform. Syst. Res., vol. 17, pp. 126–144, 2006. [62] L. Dahlander and S. O’Mahony, “Progressing to the center: Coordinating project work,” Org. Sci., vol. 22, p. 961, 2011. [63] P. V. Singh, Y. Tan, and V. M. Mookerjee, “Network effects: The influence of structural capital on open source project success,” Manage. Inform. Syst. Quart., vol. 35, pp. 813–830, 2011. [64] A. J. Albrecht and J. E. Gaffney, “Software function, source lines of code, and development effort prediction: A software science validation,” IEEE Trans. Softw. Eng., vol. SE-9, no. 6, pp. 648–652, Nov. 1983. [65] R. D. Banker, H. Chang, and C. F. Kemerer, “Evidence on economies of scale in software development,” Inform. Softw. Technol., vol. 36, pp. 275– 282, 1994. [66] R. D. Banker and S. A. Slaughter, “A field study of scale economies in software maintenance,” Manage. Sci., vol. 43, pp. 1709–1725, 1997. [67] D. E. Harter, M. S. Krishnan, and S. A. Slaughter, “Effects of process maturity on quality, cycle time, and effort in software product development,” Manage. Sci., vol. 46, pp. 451–466, 2000. [68] C. F. Kemerer, Software Project Management: Readings and Cases. New York: McGraw-Hill, 1996. [69] C. Jones, “Software metrics: Good, bad and missing,” Computer, vol. 27, no. 9, pp. 98–100, Sep. 1994. [70] T. M. Shaft and I. Vessey, “The role of cognitive fit in the relationship between software comprehension and modification,” MIS Quart., vol. 30, pp. 29–55, 2006. [71] M. L. Nelson, R. Sen, and C. Subramaniam, “Understanding open source software: A research classification framework,” Commun. Assoc. Inform. Syst., vol. 17, pp. 266–287, 2006. [72] F. E. Grubbs, “Procedures for detecting outlying observations in samples,” Technometrics, pp. 1–21, 1969. [73] N. Smirnov, “On the estimation of the discrepancy between empirical curves of distribution for two independent samples,” Bull. Math. Univ. Moscow, vol. 2, pp. 3–26, 1939. [74] H. White, “A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity,” Econometrica, vol. 48, pp. 817– 838, 1980. [75] J. Neter, W. Wasserman, and M. H. Kutner in Applied Linear Regression Models, Chicago, IL: IRWIN, 2004.
NAN AND KUMAR: JOINT EFFECT OF TEAM STRUCTURE AND SOFTWARE ARCHITECTURE
[76] P. V. Singh and Y. Tan, “Developer heterogeneity and formation of communication networks in open source software projects,” J. Manage. Inform. Syst., vol. 27, pp. 179–210, 2010. [77] R. P. Bostrom and J. S. Heinen, “MIS problems and failures: a sociotechnical perspective—Part I: The cause,” MIS Quart., vol. 1, pp. 17–32, 1977. [78] S. Sharma, V. Sugumaran, and B. Rajagopalan, “A framework for creating hybrid-open source software communities,” Inform. Syst. J., vol. 12, pp. 7– 25, 2002.
Ning Nan received the Ph.D. degree in business information technology from the Ross School of Business, University of Michigan, Ann Arbor, in 2006. She is currently working as an Assistant Professor at the Sauder School of Business, University of British Columbia, Vancouver, Canada.
603
Sanjeev Kumar received the Ph.D. degree in business information technology from the Ross School of Business, University of Michigan, Ann Arbor, in 2009. He is currently working as an Assistant Professor at the Lubar School of Business, University of Wisconsin—Milwaukee, Milwaukee.