Reliability Against the Odds: The Lessons of Open Source Communities for Developing Reliable Software

Ruben van Wendel de Joode, Mark de Bruijne & Michel van Eeten
Faculty of Technology, Policy and Management
Delft University of Technology
PO Box 5015, 2600 GA Delft, The Netherlands
E-mail: [email protected]

Abstract

Open source software is increasingly used in mission-critical processes of organizations, signaling its acceptance as highly reliable software. That said, the development process in open source communities differs radically from conventional approaches to reliable software development. More strikingly, open source software seems to thrive under conditions that are orthogonal to the conventional approach: extreme fragmentation, weak planning and coordination structures, and a lack of control over objectives, priorities and resource allocation. Therefore, we argue that open source communities provide us with valuable and innovative lessons for managing the production of reliable software – especially since the conditions for contemporary software development increasingly resemble those of open source communities rather than those of tightly controlled corporate environments. To draw out these lessons, this paper applies concepts from High Reliability Theory (HRT) to two recent case studies of open source communities, Linux and Apache. We identify a number of important reliability-enhancing mechanisms. Understanding these mechanisms is important to a) managers of corporate software developers participating in open source communities and b) managers who want to transfer mechanisms and lessons from open source communities to their organization.

Keywords: open source software, software reliability, High Reliability Theory (HRT).


Reliability Against the Odds: Open Source Communities and their Lessons for Reliable Software1

The remarkable track record of open source software

An increasing number of software programs are developed in open source communities. The software developed in these communities is known under a wide variety of names, of which "open source software" and "free software" are probably the most common.2 The communities consist of a wide variety of people, ranging from hobbyists and students to freelancers and programmers paid by companies (Hertel et al. 2003). They contribute varying amounts of time and effort to the development and maintenance of software. The philosophy of open source is that the source code, i.e. the human-readable part of software, should not be treated as a secret (Stallman 2002). Instead, the participants appear to agree that software and the corresponding source code should be open, visible, downloadable and modifiable by anyone interested. The software and the corresponding source code are said to be in a commons (Benkler 2002, Bollier 2001, Boyle 2003). Open source software created in these communities is successful.3 Increasing numbers of big and small organizations turn to open source software to facilitate their critical business processes.

1. This article is the product of a research project funded by the Netherlands Organisation for Scientific Research (NWO). The reference number of the project is 638.000.000.044N12.

2. In the remainder of this paper the term open source will be used.

3. The claim is not that software developed in open source communities is qualitatively better than proprietary software. The quality of proprietary software or open source software varies with each software program, and the relation between the quality of software and the way in which software is developed is far from understood. The only claim made here is that in certain niche markets open source software has gained a decent share of the market and has apparently reached a satisfactory level of quality (i.e. reliability, user friendliness, continuity, etc.).

Examples include companies like the New York Stock Exchange, internet retailer Amazon and IBM.4 Governments and municipalities have also adopted open source software. For example, the City of Newport5 and the central government of Brazil have encouraged their respective sectors of government to move away from proprietary software and towards open source software.6 We may assume that large organizations and governments in particular will only invest in and switch to open source software when they are convinced that the reliability of the software is sufficiently assured. This is especially true when the software is used to support mission-critical business processes. Reliability may be defined here as the ability of the software to perform its required functions under stated conditions for a specified period of time. As such, software reliability has become an important attribute of software quality. Indeed, a study shows that one of the most important motives for firms to participate in the development of open source software is the reliability of that software (Bonaccorsi and Rossi 2004). Furthermore, quantitative research demonstrates that the software of two of the most successful open source communities, i.e. Apache and Linux, is at least comparable in reliability to commercially developed software. Research shows that Apache has 31 software defects in 58,944 lines of source code. This amounts to a defect density of 0.53 per 1,000 lines of source code, which is said to be comparable to proprietary software programs, which have an average defect density of 0.51.7 Research by the Reasoning consulting group compared six operating systems on their implementation of a key networking component and concluded that the Linux kernel performed better than the five proprietary operating systems. The study also showed that the networking component had "8 defects in 81,852 lines"8 of source code.
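The arithmetic behind these defect-density figures is straightforward: the defect count is normalized per 1,000 lines of source code (KLOC). A minimal sketch of the calculation, using the numbers reported above:

```python
def defect_density(defects: int, sloc: int) -> float:
    """Defects per 1,000 lines of source code (KLOC)."""
    return defects / (sloc / 1000.0)

# Apache figures reported in the study cited above.
print(round(defect_density(31, 58_944), 2))  # -> 0.53
# Linux networking-component figures from the Reasoning study.
print(round(defect_density(8, 81_852), 2))   # -> 0.1
```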

4. Based on: http://www.it-director.com/article.php?articleid=2125 (November 2003), http://www.oreillynet.com/pub/a/oreilly/ask_tim/2004/amazon_0204.html (March 2004), http://news.com.com/2100-1001-275388.html?legacy=cnet (November 2003).

5. Based on: http://www.linuxdevcenter.com/pub/a/linux/2004/01/15/andy_stein_interview.html (March 2004).

6. Based on: http://www.wired.com/news/infostructure/0,1377,61257,00.html (March 2004).

7. Based on: http://www.infoworld.com/article/03/07/01/HNreasoning_1.html (March 2004).

8. Based on: http://www.reasoning.com/newsevents/pr/02_11_03.html (August 2004).

This article will explore how and why the reliability of open source software is assured. There are two reasons why this question is relevant. First, quantitative research by Hertel et al. (2003) and Hars and Ou (2002) demonstrates the growing involvement of programmers who are paid to participate in open source communities. Companies might do so for a wide variety of reasons. The challenge for the managers of these developers is to understand how to balance the company's interests while at the same time respecting the mechanisms and processes in open source communities. Otherwise, conflicts are likely to arise. In the words of a member of the Apache Software Foundation (ASF): "One of the problems is that the individuals who develop on Apache and come from Sun have deadlines. That creates frustration when they need to work together with people who do it for fun." This article will identify a set of mechanisms that contribute to the reliability of open source software, which should be recognized, and perhaps even protected, by paid programmers and their managers. Second, and more importantly, open source software development can provide a much-needed new approach to producing reliable software. The conventional approaches fit increasingly poorly with the dynamics of large-scale, innovative software development. A growing number of companies consider the organization of open source communities a viable alternative to the organizational structures and processes they have adopted to create proprietary software. A good example is the Dutch consumer electronics giant Philips.9 A large business unit of this company recently decided to change its software development practices by adopting an "inner source development methodology." This methodology is based on open source communities and aims to transfer lessons from these communities into a corporate setting, in which the source code is shared across departments but remains inside the organization. This article will identify mechanisms that are relevant for managers involved in similar processes of change. Managers should realize that opening the source code alone is not sufficient: to create an organizational structure that is capable of producing reliable software, they should understand how reliability in the communities is achieved.

9. Based on a talk by a Philips representative, given at the MMBase conference in Delft, June 9, 2004.

Software reliability against the odds

The literature on software reliability tells us that creating highly reliable software is very difficult, and increasingly so (e.g., Boots 1995, Leveson 1995, Kling 1996). Leveson (1995, p. 25) mentions a study which found that a full 10 percent of software functions failed to adhere to their original specifications. The use of software for increasingly complex and sophisticated functions has extended the scale and scope of software development to such a degree that software development is now considered a highly complex technological process. One of the latest distributions of Linux, Red Hat Linux 7.1, is estimated to contain some 30 million physical source lines of code (SLOC), while the Debian 2.2 GNU/Linux distribution includes more than 55 million physical lines of code.10 Combined with the fact that software-related errors can 'suddenly' occur, even after hundreds of thousands of hours of use, this leads to the conclusion that software is a highly complex, dynamic, large-scale technology (Leveson 1995). How can the reliability of such a technology be ensured? The literature on software reliability typically answers this question by focusing on the application of better design methods, models and procedures (e.g., Bowles 2004, Wang et al. 2004). This includes, among other things, the adoption of more effective project management tools, like project planning and project tracking and oversight (Humphrey 1989, Paulk et al. 1994). In addition, we see efforts in reliability engineering (e.g., Son and Seong 2003, Guthrie and Parikh 2004, Friedman and Voas 1995), such as new tools to obtain failure rates and measure software reliability (e.g., Gandy and Jensen 2004, Yin et al. 2004). The shared assumption in this literature is that there is command and control over the process of software development. In fact, we would argue, that is what defines the conventional approach to ensuring reliable software: better planning, coordination, tracking and allocation of resources to make sure that what comes out of the process meets reliability standards. In open source communities, the conditions for command and control over development are by and large missing. In fact, we would argue that the conditions are orthogonal to those that enable command and control.

10. Based on: Wheeler, D. A. 2001. More Than a Gigabuck: Estimating GNU/Linux's Size, http://www.dwheeler.com/sloc/redhat71-v1/redhat71sloc.html (July 2004).

As one researcher phrased it: "in general, open-source projects have conditions that tend to promote freeloading, unstable membership or low-quality contributions" (Markus et al. 2000, p. 21). The coordination mechanisms used in proprietary software development are largely absent (Ćubranić and Booth 1999, Mockus et al. 2002). It is not hard to see why. First, the communities lack 'traditional' leaders who can coordinate the development process and enforce decisions so that developers adhere to them (e.g., von Hippel and von Krogh 2003). Instead, open source developers respect the individual and are "anti-authoritarian" (Himanen 2001, p. 40). Open source communities are collections of individuals in which "everybody does what he feels like doing"11 and in which project leaders "can only suggest."12 Second, we see massive fragmentation in the technical systems and in the communities. The more popular communities consist of large numbers of subscribers. For example, the "credits" files of both the Linux kernel and Apache contain about 400 contributors to the software code, and some 4,000 people are involved in following the development processes and reporting on problems (e.g., Hertel et al. 2003, Mockus et al. 2003). Thus, the software development processes in both communities may be characterized as highly fragmented processes in which thousands of individuals are involved. Yet somehow all these people collectively manage to create reliable software. It is acknowledged that a small core of software developers is responsible for the lion's share of the actual software programming and development work in both communities (e.g., Lewis 1999, Mockus et al. 2002). However, this does not diminish the importance of the thousands of others who produce code that is also incorporated, or who perform other tasks such as testing or bug reporting. Furthermore, it should be noted that even the core groups in the communities consist of people with different backgrounds and different interests who are widely distributed. For example, Ćubranić and Booth (1999) reported that "Apache's 20 core developers are located in five different countries across three continents" (p. 63). Third, these communities lack a shared understanding of and commitment to reliable software. Different developers favor different trade-offs and views about what constitutes reliable software.

11. Cited from an interview with a maintainer of various software modules in the Linux community.

12. Cited from a personal interview with the vice-president of the Free Software Foundation.

A respondent who was interviewed in the context of this research said: "You could say there is a difference between a developer from, for example, IBM and a private developer who works on it for his hobby. The first will be working more structurally towards deadlines and with a certain approach, whereas the latter is more creative and more risky in trying things and generally has a higher pay-off/risk ratio." Or consider this statement from a maintainer in the Linux community: "[Volunteers] sit at home and work on their personal computers. Some of them are highly intelligent, but they are focused on their own situation, on small computers. Companies, however, have big systems with big databases and students have no experience with the problems facing these large-scale companies." Fourth, membership is fluid, as the legal and organizational boundaries of open source communities are blurred (Fielding 1999, Franck and Jungwirth 2003). There is no consensus about who is inside a community and who is outside: "Membership in the community is fluid; current members can leave the community and new members can also join at any time" (Sharma et al. 2002, p. 10). Anyone thinking along the lines of the conventional approach to developing reliable software would have to conclude that these conditions are devastating to ensuring high reliability. And yet, the evidence tells us certain open source software is exactly that: highly reliable – against all odds, so to speak. We cannot but conclude that the conventional approach cannot account for this reliability. Or to put it in positive terms: if we can empirically unravel some of the mechanisms that enhance the reliability of open source software, we have the beginnings of a new approach to software reliability. The new approach may also allow us to integrate findings from other empirical research that questions the conventional approach. For example, Faraj and Sproull (2000) found only a weak relationship between software methods and the performance of software development teams. Instead, their evidence demonstrated a stronger link between expertise coordination and team performance. In short, we need an empirical exploration of what makes for reliable software, and open source provides us with critical cases for such research.

Finding reliability-enhancing mechanisms: High Reliability Theory

Within organizational theory, there has been a sustained interest in the reliability, or lack thereof, of complex, large-scale technological systems. The first dominant theory was developed to explain major accidents with such systems.

The so-called Normal Accident Theory (NAT) argued that these systems had characteristics that made them inherently prone to failure: they generated complex, unexpected interactions among system components, and their components were tightly coupled (e.g., Sagan 1993, Perrow 1999). While recognizing the importance of these findings, a group of researchers pointed out that some of these systems nevertheless achieved remarkable levels of reliability, even though much seemed to conspire against them. The researchers were puzzled by the existence of organizations, which they labeled High Reliability Organizations (HROs), that could not be explained using conventional organizational theory (e.g., Roberts 1989, 1990b, Roberts and Gargano 1990).13 Like open source communities, HROs seemed to defy predictions and conventional assumptions with regard to the reliability of their performance. Organizations that were chosen as objects of study included (nuclear) aircraft carrier flight operations and nuclear power plants (e.g., Roberts 1990a, 1990b, Schulman 1993) and, more recently, organizations that operate in commercial environments under "conditions such as increased competition, higher customer expectations, and reduced cycle time [that] create unforgiving conditions with high performance standards and little tolerance for errors" (Weick et al. 1999, p. 104, Vogus and Welbourne 2003). This research evolved into what we now know as High Reliability Theory (HRT). In other words, much like our quest into open source communities, HRT wanted to explain high reliability against the odds. HRT argued that HROs have nurtured a number of traits that allow organizations and the people who work inside them to manage complex systems remarkably well and continuously maintain high levels of reliability (Rochlin 1999).14 This paper uses the findings from HRT to unravel the organization of open source communities from a reliability perspective. The list of reliability-enhancing mechanisms in table 1 may be viewed as central to the existing HRT literature.15

13. HRT is usually offset against NAT (e.g., Perrow 1999, Rijpma 2003).

14. However, HRT theorists do point out that no recipes exist to ensure fail-safe solutions to counter the risk of service disruptions due to interactive complexity and tight coupling (Roberts 1989, 1990a, 1990b).

15. This list was based upon Rochlin (1996), LaPorte (1996), Grabowski and Roberts (1996, 1997), Roe et al. (1998) and van Eeten and Roe (2002). The authors have narrowed the number of factors to six by collapsing different factors.

Table 1: Reliability-enhancing mechanisms provided by the HRT literature

1) A system of responsibility and accountability
2) Flexible decision-making processes and authority patterns
3) An organizational culture of reliability
4) Structural flexibility and redundancy
5) Reliability is a non-fungible commodity
6) Strong presence of external groups with access to credible and timely operational information

Each of these six HRT mechanisms is discussed in more detail below, where we explore their presence in the Linux and Apache case studies. We selected the Apache and Linux communities as case studies because they are known to produce reliable software. In that sense they are not representative of the wide variety of small and unknown open source communities that are, for instance, listed on SourceForge.net. The results of this explorative study therefore cannot and should not be generalized to all open source communities. This research is explorative, as the issue of reliability – and especially organizing for reliability – has until now hardly been addressed in the literature on open source communities. The goals of this article are to a) identify a preliminary set of mechanisms and characteristics in the organization of open source communities that contribute to the creation of reliable software, and b) sensitize practitioners and researchers to characteristics that until now have gone unnoticed as potential sources of reliability in the communities.

Methodology

The research on which this article is based is part of a larger research program that focuses on the reliability and continuity of open source software. The program was initiated in response to the growing adoption of open source software in the Netherlands, which is partly based on the increasingly accepted idea that open source software is reliable and ensures higher levels of continuity than proprietary software. This article is primarily focused on two communities, namely Apache and Linux. There are three reasons for this choice. First, quantitative analysis has shown that the software developed in both communities is reliable.


Second, participants in the Apache and Linux communities are widely distributed and have different agendas and interests, and thus different trade-offs concerning the reliability of the software, which makes them interesting to analyze with an organizational approach. Third, much information about the two communities is available. A number of data collection activities were undertaken for this explorative research. The most important source of information is a series of face-to-face semi-structured interviews. As part of a bigger research program,16 a total of 60 interviews were held in a wide variety of communities (e.g., Debian, Python and PostgreSQL) and organizations. These included 17 interviews with people who were in some way involved in the Linux community and 7 with people who were involved in the Apache community. The roles of the respondents in the two communities are specified in table 2. There are more roles than interviewees, as some interviewees were involved in a community in more than one way. The interviews were held in English, German and Dutch. In the remainder of this article many quotes from these interviews are given, some of which have been translated into English. Unless mentioned otherwise, these quotes are from interviews with respondents from either the Linux or the Apache community.17

Table 2: Roles of the interviewees

Role                                                                                  | Apache18 | Linux
Board members of the Apache Software Foundation                                       | 2        | –
Members of the Apache Software Foundation                                             | 3        | –
Maintainers (responsible for a specific part of the software; Moon and Sproull 2000) | –        | 5
Contributors (writing source code, filing bug reports, organizing meetings, etc.)    | –        | 1
Corporate users                                                                       | 4        | 8
Researcher                                                                            | –        | 6

In addition to the interviews, other primary sources of data were used: a) the digital material available on the official Apache website and sub-sites, and on many Linux sites; and b) open source conferences, seminars and Linux user meetings, during which field notes were made.

16. The research program is partly funded by the Netherlands Organisation for Scientific Research (NWO: reference number 014-38-413).

17. A list of respondents and their quotes used in this article may be obtained from the authors.

18. The roles of maintainer and contributor are not applicable to the Apache community, as maintainers are not present in the Apache community and as each of the respondents said they regularly contribute.

Secondary sources of information include the state of the art of research on open source communities in general and on the Apache and Linux communities in particular. One valuable source of information was the collection of research papers, articles and books available from the open source website hosted by MIT.19 Furthermore, a number of articles from a wide variety of journals, books from different disciplines and conference proceedings were used as input for this research. Another valuable source of secondary information was the news items on the Apache subsection of Slashdot, which were closely watched and filtered for all sorts of information concerning reliability.

Introducing the two open source communities

Apache

The history of Apache starts with Rob McCool of the National Center for Supercomputing Applications (NCSA). McCool laid the foundation of the Apache web server by creating an HTTP daemon, which was adopted and collaboratively improved by a number of web masters. Improvement of the daemon halted when McCool left NCSA, which forced others to develop and maintain their own extensions and bug fixes (Fielding 1999). From 1995, a group of web masters improved the daemon and formed the nucleus of what would become the Apache community. As the community grew in size, it adopted tools to support its development processes and gradually improve the performance of the software. Next to a mailing list, the community adopted a Concurrent Versions System (CVS) and created a bug-tracking system. The CVS is the database application used to support and manage the software development process (Shaikh and Cornford 2003). In January 1996, Apache httpd 1.0 was formally released (Mockus et al. 2003). Almost immediately, Apache proved popular and attracted a large number of users. Within one year of the release of the first version, the Apache server was the most widely used server on the Internet, and to this day Apache remains the most widely used web server software. Apache is a web server, which is used to host content like websites and databases. People surfing the Internet use their browser to contact a particular web server when they want to view a website.

19. See: http://opensource.mit.edu/ (March 2004).

At the lowest level of the technical system of Apache is the APR layer. APR is short for "Apache Portable Runtime." This layer takes care of the interfaces with the hardware and figures out which environment Apache is running in: Unix or Windows. The hardware of physical servers is typically highly diverse, as it consists of different machines. On top of the APR is the HTTP server: the core of the Apache web server. On top of this core there are many different add-ons, which are developed and maintained in the Apache community. Finally, users also have their own specific applications that run on top of the Apache server.
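The layering idea is worth making concrete. The sketch below is purely illustrative – it is not APR's actual interface, which is written in C – but it shows the principle: the platform is detected once, at the bottom layer, and everything above it talks to a uniform interface. The class and method names are invented for this example:

```python
import os

class PortableRuntime:
    """Illustrative stand-in for an APR-like layer: it determines once,
    at the bottom of the stack, which platform it is running on."""

    def __init__(self) -> None:
        # os.name is 'posix' on Unix-like systems and 'nt' on Windows.
        self.platform = "unix" if os.name == "posix" else "windows"

    def temp_dir(self) -> str:
        # Higher layers ask the runtime; they never branch on the platform.
        return "/tmp" if self.platform == "unix" else "C:\\Temp"

class HttpCore:
    """Stand-in for the server core: written once against the runtime
    interface rather than against Unix or Windows directly."""

    def __init__(self, runtime: PortableRuntime) -> None:
        self.scratch_dir = runtime.temp_dir()

core = HttpCore(PortableRuntime())
print(core.scratch_dir)
```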

Linux

The history of Linux started in 1991 with an Internet message on a mailing list from its original creator, Linus Torvalds, inviting others to take a look at a small software program he had written. In his message he provided the source code of the kernel and invited other people to look at it and suggest further improvements. At first, ten people downloaded the source code and five sent back bug fixes, code improvements and new features (Naughton 1999). Torvalds then took the time to review the responses and explain why he chose to adopt or ignore a suggestion. A growing number of people started to send suggestions for improvements, and a year after he had posted his original message, 1,000 people had downloaded the Linux kernel. The kernel had become a functional operating system with 40,000 lines of code.20 Since then, Linux has developed rapidly, as both the number of users and the number of contributors have risen enormously. Linux is continuously improved by thousands of volunteers and paid programmers, and by 1998 the program already boasted more than 7 million users (Ćubranić and Booth 1999). Linux has been adopted as the basis of a number of commercial software distributions, and the software is becoming increasingly popular among businesses and large multinationals. The Linux kernel is an operating system. The technical system of the Linux kernel consists of a number of smaller and bigger modules which, combined with applications, constitute a Linux software distribution. The primary tasks of the Linux kernel are to create an interface between the software and the hardware and to facilitate communication between processes running on the computer.

20. Josh McHugh (1998) Linux: The Making of a Hack, a Forbes article taken from the Internet: http://www.forbes.com/forbes/1998/0810/6203094s1.html (July 2001).

There is a wide range of applications available for the Linux kernel. The Linux.org website, for instance, lists 116 MP3 applications, 65 libraries and 36 file managers.21 Next to a wide variety of applications, there are also many different versions of the Linux kernel itself. First, there are the stable version and the development version of the Linux kernel, which are both maintained in the Linux kernel community. Second, every maintainer in the kernel community has a version of the kernel in which he integrates improvements. To ensure compatibility, the locally maintained versions have to be regularly updated with the development version of the kernel maintained by Linus Torvalds. Third, most Linux distributions use a different version of the kernel. Companies like Red Hat or SuSE select a version of the kernel, improve it and make it ready for a commercial distribution.
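At the time of this research, stable and development kernels could be told apart by the kernel's then-current numbering convention: an even minor version number (2.0, 2.2, 2.4) marked a stable series, an odd one (2.1, 2.3, 2.5) a development series. A minimal sketch of that convention (the function name is ours):

```python
def kernel_series(version: str) -> str:
    """Classify a 2.x-era Linux kernel version using the historical
    even/odd minor-number convention."""
    minor = int(version.split(".")[1])
    return "stable" if minor % 2 == 0 else "development"

print(kernel_series("2.4.26"))  # -> stable
print(kernel_series("2.5.75"))  # -> development
```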

Responsibility and accountability

HROs instill in their organizational members an extraordinary degree of responsibility and accountability, to encourage the discovery and reporting of any event that might endanger the reliable operation of the technology (Roberts et al. 1994). HROs promote and reward the discovery and reporting of errors, even one's own. Employees are provided with guidelines that tell them how to deal with this responsibility and accountability. For example, in the Navy "you own a problem until either you can fix it or you can find someone who can" (Roberts 1990a, p. 106). This rule is supplemented with another: "never break a rule unless safety will be jeopardized by carrying out the rule" (Roberts et al. 1994, p. 621). This way, the detection and reporting of error is pushed to the lowest level. Even the lowest sailor on an aircraft carrier has the duty and the authority to abort flight deck operations if he spots "foreign object debris" (FOD) on the carrier deck. In fact, this behavior is actively encouraged, and the ship's captain rewards rather than punishes those who spot debris and thereby halt the landing process (Roberts 1990b, p. 171, Roberts and Gargano 1990, p. 157). Communities lack the formal organizational structure and mechanisms to enforce upon community members any sort of responsibility and accountability towards software reliability.

21. From: http://www.linux.org/apps/index.html (April 2004).

However, we found that despite the absence of these mechanisms, individual community members in open source communities did feel accountable and responsible for the reliability of the software that was developed. One developer in the Linux community explained: "Once you have created something in the [Linux] kernel, you feel responsible. If something turns out to be wrong then it is my mistake. Usually the one who made the last change is the most appropriate person to make the change, because you know the structure and understand the software." Why does this respondent feel responsible for solving problems in his software? There are a number of mechanisms indicating that this statement is more than just a coincidental remark; they suggest that responsibility and accountability are to some degree institutionalized in both Apache and Linux. These mechanisms ensure a direct link between individual developers and the source code they have contributed to the community. One mechanism is the credits list, which lists the names of developers who contributed to the development of the software and specifies what they contributed. The project leader of the PostgreSQL community explained that the name of the person who solves an error in the software is put "next to the resolved item," and a participant in the KDE community said: "My name is attached to every KDE program I have ever translated." Eric Raymond highlights the importance of the credits list: "Surreptitiously filing someone's name off a project is … one of the ultimate crimes" (Raymond 2000). In the Linux community the software license, i.e. the General Public License (GPL), creates a more formal linkage between the individual contributors and the software they have written: "The license allows licensees to modify the licensed code so long as they 'cause the modified files to carry prominent notices stating that [the files were] changed and the date of any change'" (McGowan 2001, p. 256). The Apache software is not licensed under the GPL; however, the Apache community also has a more formal mechanism to link contributors to the source code they have written, namely the voting system. For every new commit – an upload of software – a vote is held, and according to a member of the ASF: "If you explicitly voted that you wanted the patch to remain then you commit yourself to help clean up the software when the patch turns out to be less good." Thus the voting system puts informal pressure on the contributor and on others who voted positively to change the source code when it is in need of changes. The presence of peer pressure and peer review also explains why linkages between individual developers and the software they have written give rise to responsibility and accountability.


A member of the ASF explains: "When you get commit access [in the Apache community], you are able to alter the version. But this is always subject to peer review. If you commit something and other developers don't like it they will tell you so..." Linux and Apache consist of many programmers who want to demonstrate their skills and abilities to the rest of the community. They want to make contributions that are considered important (Hertel et al. 2003). Writing source code that contains errors is considered a shame and something to be strongly avoided. A board member of the ASF states: "Online, a record is kept of who did what and when. Once you made a big mistake then it will become a 'Hall of blame.'" This creates an incentive to do things right (cf. Wayner 2000).

Flexible decision-making processes and authority patterns

To avoid the managerial and structural death trap between centralization and decentralization of authority, HROs flexibly adapt their authority and decision-making processes to the pace of the operations they are engaged in (e.g., LaPorte and Consolini 1991). HROs are capable of changing their decision patterns because their formal organizational structures are overlaid with more flexible structures. In this way, HROs are able to satisfy the requirement known as requisite variety: the complexity of the environment must be matched by the complexity of the organization. During periods of increased volatility and speed, the more formal bureaucratic decision-making structure is replaced by a leaner and more flexible decision model that focuses upon traits such as experience and expertise. Thus, HROs may be characterized as "multi-layered, nested authority systems" in which elements of the organization can adapt their level of interdependence to the pace of operations and the threat posed to reliability (LaPorte and Consolini 1991, p. 40). However, when decisions are pushed down into the organization as the pace of operations increases, there is an increasing risk of individual complacency, mistakes or biases that may have disastrous consequences for reliability. HROs maintain their vigilance by increasing the 'social friction' – bargaining, negotiation, consultation and internal communication – as the tempo of operations increases and the technology and the system become more and more complex (Roberts 1989). HROs display rich patterns of group interaction and communication that help operators inside the HRO to understand the system they operate.

By combining the observations, knowledge and expertise of multiple system operators in the decision process at the lowest level, a richer and more detailed representation of the collective situation is created. Especially at the operator level during high-tempo operations, system operators interact to such a degree that a shared mental model of the system is created, which is used as a 'collective mind' (Weick and Roberts 1993). Thus, one of the explanations for how HROs operate so reliably may be found in the way that aggregate mental processes are more fully developed in these organizations than in organizations whose prime concern is efficiency.

There are indications that multiple levels of decision-making are at work in software development in both Linux and Apache. However, it is unclear what their exact interplay with reliability is. Both Linux and Apache have a number of more formal mechanisms residing on a collective level. In Apache, the voting system is clearly a mechanism that attempts to allow for decisions on a collective level; once a piece of source code receives three or more positive votes and no negative vote, the source code is included in the Concurrent Versions System (CVS). Also, the Project Management Committees (PMCs) and the ASF are bodies erected to make decisions that involve the collective of developers. In the Linux community, Linus Torvalds and his lieutenants make decisions that have an effect on the larger community of Linux developers and users. They make the actual decisions to include a certain piece of source code in the Linux kernel (Lerner and Tirole 2002). Next to mechanisms for making decisions on a collective level, many decisions are made on an individual level, and these appear to have an effect on the processes in the communities at large. Consider for instance statements by individual developers who claim to make their own choices. A maintainer in the Linux community states: "Well, you have a pool of people who do what they want to do. Nobody gives me an assignment. Instead, I think: 'hey how weird, this process is very slow.' Well, then I myself will start to work to solve it… This is of course a very selfish approach, but I think that most people work like that. They work on what they run into." A board member of the ASF provides another example: "that is actually what the entire open source philosophy is all about. Things only get done if at least one person feels that they are important. This person will make sure that it works." Individuals making their own informed decisions outside the realm of the more collective mechanisms is a fairly institutionalized practice, partly because they have the option to create new and competing versions of what are considered to be the official versions of the software (e.g., Egyedi and van Wendel de Joode 2004).


Another indication of the freedom of individuals to make their own choices, and possibly even ignore decisions on a collective level, is: "Hackers have always respected the individual. They have always been anti-authoritarian" (Himanen 2001, p. 40). According to the vice-president of the Free Software Foundation, formal authority patterns are absent: "Many free software developers don't work together. A company like Microsoft hires developers, puts them in a room and makes them work together. We don't because we do not have walls… We cannot make them do anything, we can only suggest." There are many more examples of statements in which respondents claim to ignore formal authority and to do what they themselves believe is best. Furthermore, project leaders from communities other than Linux or Apache argue that they cannot make decisions that bind the collective; a former project leader of the Debian community explains: "Trying to lead Debian is like trying to herd cats." This raises the question of what exactly the status of Linus Torvalds or a PMC is with regard to reliability. Could it be that we should view Linus Torvalds more as a player-coach – i.e. as someone who develops software himself and at the same time motivates and stimulates others to contribute – and if so, what is his influence (von Hippel and von Krogh 2003)? Does he define the borders of what individuals may or may not do? What is the role of a PMC or the ASF, when we know that individuals in the Apache community claim to do what they themselves like doing? Do they have more responsibilities beyond dealing with issues like copyright and software licenses? Could it be that these formal authority patterns hardly play a role in the creation of software and the choices that accompany these processes? Especially in the Apache community there are many indications supporting the assumption that the frontrunners for reliability are the individual programmers. A board member of the ASF states: "We want to create high quality stuff. The quality control is done at the individual level by the programmer." According to the same respondent, the formal and more authoritarian bodies set boundaries within which individuals are allowed to maneuver: "The decision about which features to add and which coding style to follow, we try to do as decentralized as possible. On the board level we decide in which direction the community as a whole has to go." Finally, the bug-tracking systems also demonstrate the importance of individuals in creating reliable software. A bug-tracking system is essentially a forum in which individuals can report a bug.


The report of the bug is stored in the system until someone has solved the bug and removes the report from the list. At that point the system will usually send an automatic e-mail to the person who wrote the report, to let him know that his bug is solved. In essence it is a mechanism that allows individuals from all over the world to report and fix bugs, with hardly any intervention from collective authority patterns. A bug-tracking system connects individuals to problems that need to be solved, and solving them improves the reliability of the software.
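The mechanism just described – a report is filed, stays visible until someone resolves it, and the reporter is then notified automatically – is simple enough to capture in a few lines. The sketch below is illustrative only; it is not the actual tracker used by either community:

```python
from dataclasses import dataclass

@dataclass
class BugReport:
    reporter: str        # e-mail address of whoever filed the report
    description: str

class BugTracker:
    """Minimal sketch of the mechanism described above: a report stays
    in the system until someone resolves it, at which point the
    reporter is notified automatically."""

    def __init__(self) -> None:
        self.open_reports: list[BugReport] = []

    def report(self, reporter: str, description: str) -> BugReport:
        bug = BugReport(reporter, description)
        self.open_reports.append(bug)
        return bug

    def resolve(self, bug: BugReport) -> None:
        self.open_reports.remove(bug)
        self._notify(bug)

    def _notify(self, bug: BugReport) -> None:
        # A real tracker would send an e-mail here; print() stands in.
        print(f"To {bug.reporter}: your report '{bug.description}' is solved.")

tracker = BugTracker()
bug = tracker.report("user@example.org", "server hangs on restart")
tracker.resolve(bug)
```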

An organizational culture of reliability

High-reliability theorists universally agree that HROs are able to use organizational culture as a source of high reliability (e.g., Roberts 1989, Weick 2001). This culture of reliability is an important tool for ensuring that a clear understanding of the mission and goals of the organization exists throughout the organization, and for allowing the necessary amount of decentralization in the management of a highly complex technology. With the introduction of a culture of reliability comes an important set of norms, which allows organizational members to share the values, assumptions and certain essential knowledge of the system. According to Weick (2001), the organizational culture is used to socialize people "to use similar decision premises and assumptions" while operating the system, so that "decentralized operations are equivalent and coordinated" (p. 340). Through this culture of reliability, organizational members are instilled with a sense of how the system they operate should be viewed, how it should be dealt with and how people should respond to disturbances (van Eeten and Roe 2002, p. 108). The organizational culture of reliability is a defining reliability-enhancing characteristic, which enables HROs to operate in a decentralized manner and to learn without having to engage in trial-and-error learning. Specific ways in which this process of learning and conditioning is achieved are storytelling (Weick 2001) and myths and rituals (Rochlin 1999). As a prime example of this culture of reliability in HROs, system operators are conditioned to be constantly suspicious of signs that might indicate unreliability rather than reliability (Rochlin 1999, Weick 2001). There is a constant suspicion that small errors may lead to large consequences. HRO professionals are never certain that they have complete information or knowledge.


Thus, system operators are never complacent and aggressively seek to know what they don't know. The price of creating a strong culture of reliability is that tensions usually exist between the different experts engaged in the design and operation of the complex system (e.g., Schulman 1993).

Although open source communities lack the intensity of the type of socialization described in HROs, community members do in fact share a certain professional culture towards reliability. Consider a statement from a board member of the ASF: "We had to make the system reliable. This goal was sacred and we did not care about the rest." Is this statement a coincidence? Does it reflect the opinion of just one developer in the community? There are indicators suggesting that a culture focused on the discovery and fixing of errors is to some degree institutionalized in both Linux and Apache. First, Mockus et al. (2002) report that of the top fifteen problem reporters in Apache, i.e. the fifteen people who reported the most errors, only three are also core developers. Apparently, the other twelve reporters are not sufficiently skilled or motivated to become core developers, yet they are sufficiently motivated to report or even fix bugs. This means that inside the communities, specialization and a division of labor take place. This division of labor is emergent, as individuals choose for themselves what to work on: "agents choose freely to focus on problems they think to best fit their own interest and capabilities" (Bonaccorsi and Rossi 2003, p. 1247). Specialization inevitably results in conflicting requirements and priorities as to what the source code should look like and what the most important improvements are. A previous statement in this article demonstrates how differences between corporate developers and developers who participate for fun result in frustration. The latter group will find reliability less of a concern and will want to concentrate on new features and new challenges. Second, Linux and Apache are governed by a culture that stimulates a continuous drive to create elegant software. Elegance is above all about aesthetics. "Let me say this, coding is not a science. It is more of an art form. It is all about the stability or readability of the code." Elegance is a notion of beauty: "So from the standpoint of a small group of engineers, you're striving for something that's structured and lovely in its structuredness."22

22. Cited from an interview with Ellen Ullman by Scott Rosenberg in 1997. The interview is published on Salon.com: http://archive.salon.com/21st/feature/1997/10/09interview.html (August 2002).

A member of the ASF explains how elegant code is likely to lead to better code, and code that is easier to understand: "It is sort of an axiom that elegant code is also good code. Of course it is easy to come up with examples that prove differently, but most of the time it works." In Linux, a culture of wanting to create elegant code is at least partly sustained by Linus Torvalds, whose most important task, a maintainer in the Linux community claimed, is "to judge whether changes are truly needed and whether they are elegant." A respondent from the Apache community placed the striving for elegant source code in perspective. He felt that elegant source code was not always attainable, nor always the most important concern. In certain situations "it is about the question of how well something is implemented and how effective the solution is. For the web server this hardly results in discussions, because everything has to meet standards like HTTP. For other projects it works differently. There is more uncertainty and there are no standards about how something is supposed to be implemented. Then it is a trade-off between efficiency of speed or memory. Sometimes, the new code is not good, but it makes the whole better than it was." Third, both communities have coding style guides, which prescribe how the software is supposed to be written (Egyedi and van Wendel de Joode 2004). Adherence to the coding style guides creates uniformity and simplifies the task of creating and maintaining software. A member of the ASF argues that adherence in Apache is to some degree monitored by "a style police. These are one or two people who feel having the same coding style is important and they fix it when the code is written in a wrong style, or they will contact the author of the software and ask him to clean the code." The coding style guides create a culture in which one of the values is to write source code that adheres to pre-defined standards. The same respondent argues that this is different for software developed in closed source environments: "There, another department does not get to see the code and thus different departments have different coding styles. In Apache a lot of people come and go. It is good that there are people who keep the rules and style consistent. In Apache you will find that the code style looks similar. It does not matter in which piece of Apache you look."

Structural flexibility and redundancy

One of the important ways in which HROs are able to maintain continuously reliable performance is by using operational redundancy and flexibility – also known as system slack – at the system level.

Consequently, system operators use the diverse forms of redundancy and flexibility that are built into the system they manage to deal with reliability-threatening events (Roberts 1990a, 1990b). Examples of available redundancy include the cross-training of personnel to fulfill multiple tasks, the design and management of parallel or overlapping activities that can provide backup, and the relatively independent design and functioning of elements and workgroups within the system (LaPorte 1996). By employing the structural flexibility and technical redundancies in the system, HROs are able to let the system fail gracefully. As a prime example of this type of system redundancy, Schulman (1993) mentions the ability of the air traffic control system to change the horizontal and vertical separation between airplanes in the landing sequence, in order to allow for (potential) threats that might disturb reliable operation. The technology and the way it is employed largely determine the amount of redundancy or slack that exists in the system, and also largely determine the type of response that can be chosen by system operators. Both strategies allow system operators to regain some semblance of control. Only when all resources are exhausted is there no alternative but to (temporarily) shut down the system.

Structural flexibility and redundancy are highly visible in both the Apache and the Linux community. They can be roughly divided into two categories: first, the communities have software development tools that create slack and flexibility; second, the software itself provides mechanisms that are responsible for slack and flexibility. A respondent from the Apache community claims that the CVS is an important mechanism for institutionalizing structural slack and flexibility: "Whether or not the committers are good is really not that important. The CVS ensures that nothing much can be destroyed in the software. Old versions are saved allowing you to move back to an older version. Furthermore, the CVS allows the creation of multiple branches, which can exist next to each other. Everyone is free to choose what software they want to use." Another form of slack may be found in the number of participants in both communities. The development effort does not rely on one or two single programmers but on a relatively large pool of programmers who all want to contribute and demonstrate their skills. The size of the community is sustained by a continuous influx of new people (van Wendel de Joode et al. 2003).


A maintainer in the Linux community is convinced that this ensures that both communities consist of "a lot of different people, who have lots of different machines and systems that will help to spot a lot of bugs." One respondent explains how companies like IBM and Covalent contributed their programmers to develop HTTP server 2.0 and how this resulted in unreliable software: "The programmers lacked a background in the development of web-applications. This is why 2.0 is a strange beast… Web developers and system administrators will become involved in the improvement of the system, once companies actually start to use 2.0. They will make the system reliable again." One respondent explains that organizational slack also arises because the creation and maintenance of alternatives is allowed, or even stimulated: "There are many projects where two groups are taking on the same issue and basically doing the same… It seems like there are often two different ways of tackling a problem and people try both." The two groups will compete, which is considered a good thing, "as it ensures that a project will continue to evolve and improve." Consider, for instance, the slack in file systems available for Linux: "Currently there are six systems, and you have a choice between all of them." Flexibility and slack in the software are created through modularity and elegance. Both Apache and Linux software are modular, which gives rise to flexibility. The editor-in-chief of Linux Journal explains: "free software maintains complexity with such a loose structure because the interfaces are well defined." The modularity of Apache and Linux creates independence between the programmers and between different parts of the software, and thus creates flexibility. Without modularity the software would become much less reliable, because changes made in one part of the software would likely mean that 'something else does not work anymore.' Modularity decreases dependency and thus eases the implementation of changes (Kogut and Metiu 2001). A respondent from the Linux community compares it to a glass of beer: "If you have a glass of beer and drink out the rim, then I know how to model my mouth and lips. If the content under it would change but the rim would stay the same I would still know how to keep my mouth. So the code or technique doesn't matter, as long as it fits in with the interface. You can keep on innovating in between the same interface." Elegance also stimulates structural flexibility in the communities. Elegant code eases the task of making changes to the source code.


A maintainer in the Linux community is even convinced that "you can only work when the software is beautiful and elegant, to be able to implement changes easily." Thus, although the number of lines of source code is bound to increase when the functionality of the software is enriched, elegance ensures that the source code remains relatively easy to understand and hence relatively easy to change.
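The 'glass of beer' point – implementations may change freely as long as the interface stays fixed – is the classic argument for modular design. A minimal sketch of the idea; the names below are invented for this example and are not taken from either code base:

```python
from abc import ABC, abstractmethod

class FileSystem(ABC):
    """The 'rim of the glass': a fixed interface. As long as it stays
    the same, the implementation underneath may change freely."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

class SimpleFS(FileSystem):
    def read(self, path: str) -> bytes:
        with open(path, "rb") as f:
            return f.read()

class CachingFS(FileSystem):
    """A competing implementation: different internals, same rim."""

    def __init__(self) -> None:
        self._cache: dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        if path not in self._cache:
            with open(path, "rb") as f:
                self._cache[path] = f.read()
        return self._cache[path]

def serve(fs: FileSystem, path: str) -> bytes:
    # Code written against the interface is untouched by the swap.
    return fs.read(path)
```

Either implementation can be passed to serve() without changing it, which is exactly why competing modules (such as the six Linux file systems mentioned above) can coexist behind one interface.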

Reliability is a non-fungible commodity

As a result of the extreme importance of the reliability of the services that an HRO provides, we have already noted that HROs cannot afford to engage in trial-and-error learning. Another consequence of the extreme importance of reliability in HROs is that reliability is considered a non-marginizable property (e.g., Schulman 1993, Roe et al. 1998). This means that at some point reliability cannot be traded off for another commodity such as time or money. HROs are thus not allowed, or not prepared, to experiment with trade-offs (through trial and error). Consequently, little is known about exactly what it takes to achieve a minimum level of reliability that is considered satisfactory, or about the marginal impact and value of reliability. In other words, most of the technologies employed by HROs are made as reliable as possible, because no one is prepared to accept the consequences of not investing all possible effort in maintaining the reliable performance of the system.

There are a number of indications that suggest that, to some extent, the reliability of open source software is non-fungible and cannot be traded off against other software properties in either community. First, respondents have argued that a new release of the software is governed much less by deadlines and time constraints than by the quality of the code. It is the code that dictates when it is time for a new release. A member of the ASF explains: "At Red Hat people work fulltime on a distribution and they define deadlines for every new distribution. We don't. We launch a new release when we feel it is ready for it." Neither are releases dictated by the wish to please customers: when a new release or update is needed, it is made. "It has happened that the Apache Software Foundation (ASF) released four versions a week." A programmer from the company Covalent explains: "We tested the latest ASF release and discovered an error. We fixed it and reported the error to the ASF. They thought it was important enough to make a new release, which included our solution. Then we found a new error and the next day another one. This is why the ASF released four new versions that week."

Second, the voting system in the Apache community stimulates simple yes-or-no decisions. The voting system is used for every inclusion of source code, and the rule is that if even one person disagrees, he can veto the inclusion of the software, no matter how many people think the software is good. This system is prone to cause frustration, as a single person suffices to block new source code. One respondent explains why he nevertheless feels the voting system is important: “I judge the code mainly on its technical merits. Many people have many different opinions. If I think something is good, others can oppose and they have the right to veto. Some people get upset if you tell them that for this or that reason it is not accepted. A real case example was when someone made a contribution but that was rejected because it resulted in a memory leak.”

Third, at the level of the code itself, quality is arguably relatively easy to judge. Errors are obvious to anyone reading the source. “Compare it to tightening a screw with a pipe wrench, instead of a screwdriver. Obviously this is totally wrong… These are objective rules. Sometimes young and inexperienced developers like to start a discussion about it though. This is at the level of the code itself. On this level discussion is impossible.”
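To make the veto rule concrete, the following minimal sketch contrasts it with simple majority voting. It illustrates the decision rule only, not the ASF’s actual tooling; the threshold of three approvals is our assumption.

```python
def accept_patch(votes: list[int], min_approvals: int = 3) -> bool:
    """Veto-style voting: votes are +1 (approve), 0 (abstain), -1 (veto).

    A single veto rejects the patch no matter how many approvals it
    has; otherwise it needs a minimum number of approvals.
    """
    if any(v == -1 for v in votes):
        return False  # one veto overrides any number of approvals
    return sum(v == 1 for v in votes) >= min_approvals

print(accept_patch([1, 1, 1]))         # True: three approvals, no veto
print(accept_patch([1, 1, 1, 1, -1]))  # False: vetoed despite four approvals
```

Under majority voting the second patch would pass four to one; under the veto rule it cannot, which is exactly why a single dissenter can keep code out.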

However, there are also counter-arguments and examples that point in another direction. They hint that elegance and the quality of the source code are sometimes less important. A system maintainer from CNet explains: “But you have to be pragmatic… In practice there is often no time and you have to sacrifice the elegance for the practicality. An experienced developer will try to develop elegant code, but in certain situations it is just much easier to cut corners. For example officially you should store data only once. In practice this can slow down the database… that is the way things go in real life. It is a sacrifice: you learn to make trade-offs when you are more experienced. Elegant code is nice to have, but sometimes you just can’t.”
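The respondent’s example, storing data more than once to speed up a database, is the classic trade-off of denormalization. The sketch below is a constructed illustration, not CNet’s code: duplicating the author’s name makes reads cheap, but every update must now find and rewrite all copies, which is precisely the inelegance being traded for speed.

```python
# Normalized: each fact stored once; reads require an extra lookup.
authors = {1: "Alice"}
posts_normalized = [{"title": "Hello", "author_id": 1}]

def post_author_normalized(post: dict) -> str:
    return authors[post["author_id"]]  # extra lookup on every read

# Denormalized: the author's name is copied into each post.
posts_fast = [{"title": "Hello", "author_id": 1, "author_name": "Alice"}]

def post_author_fast(post: dict) -> str:
    return post["author_name"]  # cheap read, but...

def rename_author(author_id: int, new_name: str) -> None:
    # ...every copy must be updated, or the copies silently drift apart.
    authors[author_id] = new_name
    for post in posts_fast:
        if post["author_id"] == author_id:
            post["author_name"] = new_name

rename_author(1, "Alicia")
print(post_author_normalized(posts_normalized[0]))  # Alicia
print(post_author_fast(posts_fast[0]))              # Alicia
```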

Strong presence of external groups with access to credible and timely operational information
Maintaining the focus upon the goals of the high reliability organization usually requires the presence and oversight of external groups. LaPorte (1996) specifically mentions the importance of independent public bodies, stake-holding interest groups and professional peer bodies, which maintain the focus of the HRO on its reliability goals (p. 65). According to LaPorte, the chances of HROs maintaining or even enhancing their performance are increased by the “aggressive and knowledgeable oversight” of these groups. However, for external groups with an oversight function to have any effect at all, they need the necessary and relevant information, and thus depend on accurate and timely information coming from the HROs they oversee. Frequently, external groups demand measures of operational performance, which allow the oversight organizations to assess the performance of the HRO.

There are a number of mechanisms that demonstrate a strong presence of external oversight in both Linux and Apache.

First, software has a number of different dimensions along which reliability can be measured, several of which are relatively easy to quantify. External groups therefore frequently perform such measurements. Consider the examples from the introduction: for instance, the periodic research reports written by the Reading consulting company, which measured the number of errors in both Linux and Apache modules, or a recent Forrester study, which compared Microsoft and Linux on the number of bugs and the response rate to bugs.23 These research reports receive much attention and publicity, both within and outside open source communities.

23 The report is available at: http://www.forrester.com/Research/Document/Excerpt/0,7211,34340,00.html (August 2004).

Second, many stake-holding interest groups in Linux and Apache are also members of the communities. Examples are CNet, Yahoo and IBM. These companies use the software, but are also involved in the communities and contribute to the development effort. They report bugs and make the community members aware of outstanding issues. This means that external groups are more than just present; they become involved as well. A system administrator from CNet explains how the company is an external group but is also involved in the Apache community: “We are not experts in coding, but we bring in something else really valuable… We have a huge experience running major websites and we give back this experience to the community. We are not a vendor who is trying to sell it, we are a user and we have therefore a different perspective. We are much more pragmatic and we contribute in giving cases or certain situations where Apache can be improved, based on real business situations.”

Third, open source communities and the software are both open and highly transparent. Raymond (1999) suggests that this is almost a sufficient guarantee for quality, and for developers and users to find and report bugs. He argues that given a sufficient number of knowledgeable people, almost every problem in the software is shallow. Along similar lines, the actions of developers in the communities are highly transparent: “…in the open source community monitoring the behavior of users is easy because the Internet gives full transparency…” (Osterloh 2002, p. 15). Although transparency alone is not sufficient to ensure the creation of reliable software, it is a contributing factor. Not only are external groups able to read the source code and find bugs; their presence also creates pressure on developers to write reliable, high-quality software. One respondent explains: “Online, a record is kept of who did what and when. Once you made a big mistake then it will become a ‘Hall of blame.’” To prevent this from happening, two respondents describe how they make sure to check every piece of source code before they add it to the repository. They do this to avoid stupid mistakes, to avoid the laughter of the community and to show their capacities as software programmers to the entire world.

Fourth, there is a negative form of pressure to create reliable software. When software programs like Linux or Apache contain a bug or have a security flaw, anyone can report the problem. However, this does not mean that the problem is automatically solved; it could very well be that no one has an interest in solving it. Thus, as long as the flaw or bug is not fixed, anyone could exploit it. Obviously, this creates a lot of pressure to solve the problem. One of the respondents reasons: “If they have spotted a bug that can cause security problems they will give it back and say that there is something wrong but they will not tell you how to fix it, nor will they give the exploit. I don’t need a 15 year old to own my site because he abuses a bug, when I haven’t had time to upgrade my website to a newer version of Apache… If you are running a business you may not have the time to upgrade. The open source community doesn’t always realize that.”

Finally, the mere use of open source software by external groups almost automatically results in a test of the software (cf. Raymond 1999). This characteristic is one of the reasons why von Hippel and von Krogh (2003) claim that free riding in the communities hardly occurs, as even users of the software bring in something valuable: they use the software, which automatically implies testing. Even if all they do is report bugs, that already benefits the community, as any bug in the software could potentially cause an entire application to crash. According to one respondent, testing software in the real world is very important: “If you do not, you never know whether it is reliable and how you have to improve it.”
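The dimensions these external groups measure are indeed easy to quantify. The sketch below is a hypothetical illustration, not the methodology of any of the cited reports: given a bug-tracker export with report and fix dates, it computes the number of open bugs and the median time to fix, the kind of response-rate figure mentioned above.

```python
from datetime import date
from statistics import median

# Hypothetical bug-tracker export: fixed=None means still open.
bugs = [
    {"id": 1, "reported": date(2004, 3, 1), "fixed": date(2004, 3, 3)},
    {"id": 2, "reported": date(2004, 3, 2), "fixed": None},
    {"id": 3, "reported": date(2004, 3, 5), "fixed": date(2004, 3, 6)},
]

open_bugs = [b for b in bugs if b["fixed"] is None]
fix_times = [(b["fixed"] - b["reported"]).days for b in bugs if b["fixed"]]

print(f"open bugs: {len(open_bugs)}")              # open bugs: 1
print(f"median days to fix: {median(fix_times)}")  # median days to fix: 1.5
```

Because both communities keep their trackers public, any outside party can run this kind of analysis, which is part of what makes the oversight described above possible.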

Conclusion
This article demonstrates that Apache and Linux share a number of characteristics that seem to contribute to the creation of reliable software. For each of the reliability-enhancing mechanisms distinguished in HRT, both the Linux and the Apache community appear to have adopted alternative ways to stimulate the development of reliable software. The creation and adoption of mechanisms like a CVS, a bug-tracking system, credit lists and modular software contribute to the creation of reliable open source software. This approach allows us to integrate the findings of previous empirical research on open source communities and to provide a more coherent view, one that questions the more conventional command-and-control approach to creating reliable software.

It should be noted, however, that we do not claim to have found perfect substitutes for the reliability-enhancing factors described in HRT. Rather, we claim that these mechanisms constitute an explanation of how Linux and Apache succeed in creating reliable software. Further research should verify whether these mechanisms are also relevant for other open source communities.

Table 3 lists the mechanisms identified in this article. The mechanisms are clustered in principles that are thought to lie at the basis of the mechanisms and to express the specific goals the mechanisms are believed to achieve. Further research is needed on a) the individual mechanisms, b) the underlying principles and c) their factual influence on reliability. The mechanisms discovered in this article provide a first insight into open source communities and their ability to create reliable open source software. As such, they are relevant for a) those who manage open source developers and thus need to balance commercial interests with the peculiarities and specific processes of open source communities; and b) managers who are engaged in transforming internal processes to adopt an open source software development approach.


Furthermore, we argue that open source communities provide valuable and innovative lessons for managing the production of reliable software – especially since the conditions of contemporary software development increasingly resemble those of open source communities rather than those of tightly controlled corporate environments. For managers, practitioners and researchers engaged in software development, this article intends to provide insight into mechanisms that play a perhaps vital role in achieving reliable software.

Table 3: Potential reliability-enhancing mechanisms found in open source communities
(Each reliability-enhancing mechanism from HRT is paired with the principle underlying the mechanisms and the reliability mechanisms in open source communities.)

System of responsibility and accountability
  Direct linkages between developers and the source code they have written:
    - Credit lists
    - Voting system
    - Software license, e.g. the GPL
  Freedom of individual developer:
    - Bug tracking system

Flexible decision-making processes and authority patterns
  Formal, collective decision-making arena:
    - Voting system
    - Project leadership
    - Project Management Committees (PMC) and Apache Software Foundation (ASF)
    - Specialization and emergent task division

Organizational culture of reliability
    - A continuous drive to create elegant code
    - Coding style guide adopted by entire community
    - CVS

Structural flexibility and redundancy
  High amounts of organizational slack and flexibility:
    - Huge number of programmers
    - Continuous emergence of variety and alternatives
  Almost perverse tendency to create slack and flexibility in the source code:
    - Modularity
    - Elegant software

Reliability is a non-fungible commodity
  Ignoring criteria other than quality of code:
    - Voting system allows veto
    - Apparent ease to judge the quality of source code

Strong presence of external oversight
  Easy access to information:
    - Ease to measure certain dimensions of reliability
    - High levels of transparency
    - Use implies direct testing
  Ease for external parties to actually ‘make things happen’:
    - Permeable boundaries of open source communities
    - Ease to report problems

References
Benkler, Y. 2002. Coase's Penguin, or, Linux and the Nature of the Firm. Yale Law Journal 112(3).
Bollier, D. 2001. Public Assets, Private Profits: Reclaiming the American Commons in an Age of Market Enclosure. New America Foundation, Washington, D.C.
Bonaccorsi, A., C. Rossi. 2003. Why Open Source Software can succeed. Research Policy 32(7) 1243-1258.
Bonaccorsi, A., C. Rossi. 2004. Altruistic individuals, selfish firms? The structure of motivation in Open Source software. First Monday. Peer reviewed journal on the Internet 9(1).
Bowles, J. B. 2004. Code from requirements: New productivity tools improve the reliability and maintainability of software systems. Annual Reliability and Maintainability Symposium, 68-72.
Boyle, J. 2003. The second enclosure movement and the construction of the public domain. Law and Contemporary Problems 66(1/2) 33-74.
Brooks, F. P. 1995. No Silver Bullet: Essence and Accidents of Software Engineering. N. Heap, R. Thomas, G. Einon eds., Information Technology and Society, A Reader. Sage, London, U.K., 358-376.
Čubranić, D., K. S. Booth. 1999. Coordinating Open-Source Software Development. Proceedings of the IEEE 8th International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, 61-66.
van Eeten, M. J. G., E. Roe. 2002. Ecology, Engineering and Management: Reconciling Ecosystem Rehabilitation and Service Reliability. Oxford University Press, New York.
Egyedi, T. M., R. van Wendel de Joode. 2004. Standardization and other coordination mechanisms in open source software. Journal of IT Standards & Standardization Research 2(2) 1-17.
Faraj, S., L. Sproull. 2000. Coordinating Expertise in Software Development Teams. Management Science 46(12) 1554-1568.

Fielding, R. T. 1999. Shared Leadership in the Apache Project. Communications of the ACM 42(4) 42-43.
Franck, E., C. Jungwirth. 2003. Reconciling Rent-Seekers and Donators – The Governance Structure of Open Source. Journal of Management and Governance 7(4) 401-421.
Friedman, M. A., J. M. Voas. 1995. Software Assessment: Reliability, Safety, Testability. John Wiley & Sons, New York.
Gandy, A., U. Jensen. 2004. A non-parametric approach to software reliability. Applied Stochastic Models in Business and Industry 20(1) 3-15.
Grabowski, M. R., K. H. Roberts. 1996. Human and Organizational Error in Large Scale Systems. IEEE Transactions on Systems, Man, and Cybernetics 26(1) 2-16.
Grabowski, M. R., K. H. Roberts. 1997. Risk Mitigation in Large Scale Systems: Lessons from High Reliability Organizations. California Management Review 39(4) 152-162.
Guthrie, V. H., P. B. Parikh. 2004. Software safety analysis: Using the entire risk analysis toolkit. Annual Reliability and Maintainability Symposium – RAMS, 272-279.

Hars, A., S. Ou. 2002. Working for Free? Motivations for Participating in Open-Source Projects. International Journal of Electronic Commerce 6(3) 25-39.
Hertel, G., S. Niedner, S. Herrmann. 2003. Motivation of software developers in Open Source projects: an Internet-based survey of contributors to the Linux kernel. Research Policy 32(7) 1159-1177.
Himanen, P. 2001. The Hacker Ethic and the Spirit of the Information Age. Random House, New York.
von Hippel, E., G. von Krogh. 2003. Open Source Software and the “Private-Collective” Innovation Model: Issues for Organization Science. Organization Science 14(2) 209-223.
Humphrey, W. S. 1989. Managing the Software Process. Addison-Wesley Longman, Reading, MA.
Kling, R. ed. 1996. Computerization and Controversy: Value Conflicts and Social Choices (second ed.). Academic Press, San Diego, CA.
Kogut, B., A. Metiu. 2001. Open-Source software development and distributed innovation. Oxford Review of Economic Policy 17(2) 248-264.
LaPorte, T. R. 1996. High Reliability Organizations: Unlikely, Demanding and at Risk. Journal of Contingencies and Crisis Management 4(2) 60-71.


LaPorte, T. R., P. M. Consolini. 1991. Working in Practice But Not in Theory: Theoretical Challenges of “High-Reliability Organizations”. Journal of Public Administration Research and Theory 1, 19-47.
Leveson, N. G. 1995. Safeware: System Safety and Computers. Addison-Wesley, Reading, MA.

Lerner, J., J. Tirole. 2002. Some simple economics of open source. Journal of Industrial Economics 50(2) 197-234.
Lewis, T. 1999. The Open Source Acid Test. Computer 32(2) 125-127.
Markus, M. L., B. Manville, C. E. Agres. 2000. What Makes a Virtual Organization Work? Sloan Management Review 42(1) 13-26.
McGowan, D. 2001. Legal Implications of Open-Source Software. University of Illinois Law Review 2001(1) 241-304.
Mockus, A., R. T. Fielding, J. D. Herbsleb. 2002. Two Case Studies of Open Source Software Development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology 11(3) 309-346.
Moon, J. Y., L. Sproull. 2000. Essence of Distributed Work: The Case of the Linux Kernel. First Monday. Peer reviewed journal on the Internet 5(11).
Naughton, J. 1999. A Brief History of the Future: The Origins of the Internet (second ed.). Weidenfeld & Nicolson, London, U.K.
Osterloh, M. 2002. Open Source Software Production: The Magic Cauldron? Paper presented at the LINK Conference, Copenhagen, Denmark.
Paulk, M. C., C. V. Weber, B. Curtis, M. B. Chrissis. 1994. The Capability Maturity Model: Guidelines for Improving the Software Process. Addison-Wesley Longman, Reading, MA.
Perrow, C. 1999. Normal Accidents: Living with High-Risk Technologies. Princeton University Press, Princeton, NJ.
Raymond, E. S. 1999. The Cathedral and the Bazaar: Musings on Linux and Open Source from an Accidental Revolutionary. O'Reilly, Sebastopol, CA.
Raymond, E. S. 2000. Homesteading the Noosphere. First Monday. Peer reviewed journal on the Internet 4(10).
Rijpma, J. A. 2003. From Deadlock to Dead End: The Normal Accidents-High Reliability Debate Revisited. Journal of Contingencies and Crisis Management 11(1) 37-45.


Roberts, K. H. 1989. New Challenges in Organizational Research: High Reliability Organizations. Industrial Crisis Quarterly 3(2) 111-125.
Roberts, K. H. 1990a. Managing High Reliability Organizations. California Management Review 32(4) 101-113.

Roberts, K. H. 1990b. Some Characteristics of one Type of High Reliability Organization. Organization Science 1(2) 160-175.
Roberts, K. H., S. K. Stout, J. J. Halpern. 1994. Decision Dynamics in Two High Reliability Military Organizations. Management Science 40(5) 614-624.
Roberts, K. H., G. Gargano. 1990. Managing a High-Reliability Organization: A Case for Interdependence. M. A. von Glinow, S. A. Mohrman eds., Managing Complexity in High Technology Organizations. Oxford University Press, New York, 146-159.
Rochlin, G. I. 1999. Safe Operation as a Social Construct. Ergonomics 42(11) 1549-1560.

Roe, E., L. Huntsinger, K. Labnow. 1998. High reliability pastoralism. Journal of Arid Environments 39, 39-55.
Sagan, S. 1993. The Limits of Safety. Princeton University Press, Princeton, NJ.

Schulman, P. R. 1993. The Analysis of High Reliability Organizations: A Comparative Framework. K. H. Roberts ed., New Challenges to Understanding Organizations. Macmillan, New York, 33-54.
Shaikh, M., T. Cornford. 2003. Version control software for knowledge sharing, innovation and learning in OS. Paper presented at the Open Source Software Movements and Communities workshop, ICCT, Amsterdam, The Netherlands.
Sharma, S., V. Sugumaran, B. Rajagopalan. 2002. A framework for creating hybrid-open source software communities. Information Systems Journal 12(1) 7-25.
Son, H. S., P. H. Seong. 2003. Development of a safety critical software requirements verification method with combined CPN and PVS: a nuclear power plant protection system application. Reliability Engineering & System Safety 80(1) 19-32.
Stallman, R. 2002. Free Software, Free Society: Selected Essays of Richard M. Stallman (Proof Copy ed.). GNU Press, Boston, MA.
Vogus, T. J., T. M. Welbourne. 2003. Structuring for High Reliability: HR Practices and Mindful Processes in Reliability-seeking Organizations. Journal of Organizational Behavior 24, 877-903.
Weick, K. E. 2001. Making Sense of the Organization. Blackwell, Oxford, U.K.

Wang, D. F., F. B. Bastani, I. L. Yen. 2004. A systematic design method for high quality process-control systems development. International Journal of Software Engineering and Knowledge Engineering 14(1) 43-59.
Wayner, P. 2000. Free for All: How Linux and the Free Software Movement Undercut the High-Tech Titans. Harper Business, New York.
Weick, K. E., K. H. Roberts. 1993. Collective Mind in Organizations: Heedful Interrelating on Flight Decks. Administrative Science Quarterly 38, 357-381.
Weick, K. E., K. M. Sutcliffe, D. Obstfeld. 1999. Organizing for High Reliability: Processes of Collective Mindfulness. Research in Organizational Behavior 21, 81-123.
van Wendel de Joode, R., J. A. de Bruijn, M. J. G. van Eeten. 2003. Protecting the Virtual Commons: Self-organizing open source communities and innovative intellectual property regimes. T.M.C. Asser Press, The Hague, The Netherlands.
Yin, M. L., J. Peterson, R. R. Arellano. 2004. Software complexity factor in software reliability assessment. Annual Reliability and Maintainability Symposium, 190-194.

