
ETHICAL CONSIDERATIONS IN INTERNET CODE REUSE: A MODEL AND EMPIRICAL TEST

27 January 2011

Manuel Sojer,1 Oliver Alexy,2 and Joachim Henkel1,3

1 Technische Universität München, Schöller Chair in Technology and Innovation Management, Arcisstr. 21, D-80333 Munich, Germany. sojer|[email protected]

2 Imperial College Business School, Tanaka Building, South Kensington Campus, London, SW7 2AZ, United Kingdom. [email protected]

3 Center for Economic Policy Research (CEPR), London

Abstract

Internet code, such as open source software code, which is available for gratis download from the Internet, is becoming an increasingly important element of code reuse in commercial software development. However, when individual developers practice such reuse in ad-hoc fashion, for different reasons they might disregard potentially existing license obligations, thereby causing serious legal, economic, and reputational issues for their employers and for themselves. In this study, we try to explicate the underlying drivers of such behavior. We employ the theories of planned behavior, ethical work climate, expected utility, and deterrence to model Internet code reuse as an individual-level ethical decision. With the use of a survey of 869 professional software developers, we find that between 15% and 21% of these developers have in the past, under time pressure, reused Internet code in such a way that disregarded license obligations. We show that developers’ attitude toward the behavior and their perceived subjective norm influence their intention to engage in such behavior. Further, we find that developers who anticipate more severe consequences for this behavior, and those who perceive less severe time pressure in their firms, hold a more negative attitude. Additionally, our results highlight an indirect effect of an ethical work climate of compliance with laws and codes on intention mediated by subjective norm. Overall, our study advances the theoretical understanding at the intersection of knowledge reuse and ethical behavior. Moreover, we offer managerial implications with respect to how to avoid economic and legal issues from violated license obligations that result from ad-hoc Internet code reuse.

Keywords: Code reuse, information systems ethics, ethical behavior, Internet code, TPB, ethical work climate, expected utility theory, deterrence theory, open source software, PLS

Electronic copy available at: http://ssrn.com/abstract=1596009


INTRODUCTION

The reuse of existing code is crucial for firms to increase the effectiveness, efficiency, and quality of their software development activities (Kim and Stohr 1998; Krueger 1992). Past work has shown that code reuse may positively affect time-to-market (Lim 1994) and reduce development and maintenance costs (Apte et al. 1990). While the concept of code reuse dates back to McIlroy (1968), the recent emergence of “Internet code,” that is, code available for free (gratis) download from the Internet, particularly that of open source software (OSS) (e.g., Haefliger et al. 2008), has further increased potential benefits of code reuse (Ebert 2008; Spinellis and Szyperski 2004). Extant literature is overwhelmingly positive about the reuse of Internet code (e.g. Ajila and Wu 2007; Norris 2004) and mentions potential pitfalls only in passing, if at all. In particular, research explicitly dealing with obligations that result from the reuse of Internet code is scarce. Depending on the terms of the license chosen by the original creator(s) to protect the code, obligations activated by the act of reuse may have strong implications for firms and the larger software in which the reused code is embedded. For example, a developer who built software for a Cisco router chose to reuse existing OSS code available on the Internet because it helped him do his job more effectively and efficiently.1 However, since the code was licensed under the GNU General Public License (GPL), this behavior later required Cisco to make a large share of the router software available in source code form to their customers and permit them to modify and redistribute this software without a fee (Lyons 2003).2

1 When reusing the OSS code, the developer was actually creating software for a Linksys router. However, by the time the resulting license obligations surfaced, Linksys had been acquired by Cisco.

2 Similarly, for firms for which it is crucial for their business to protect the intellectual property (IP) in their software by keeping the source code secret, a situation similar to the one of Cisco described above will have dramatic consequences. For example, software development firm VMware (2008, p. 34) writes in the “risk” section of their quarterly filings to the U.S. Securities and Exchange Commission that Internet code reused in their software could, under certain situations, require them “to release the source code of our proprietary software, which could substantially help our competitors develop products that are similar to or better than ours.”

As this example makes clear, in the common situation of individual developers reusing Internet code in ad-hoc fashion (Sojer and Henkel 2010b), as opposed to systematic Internet code reuse (Ruffin and Ebert 2004), it is up to these individuals to appropriately take license obligations into account. Ad-hoc reuse may help developers deliver their work more effectively, efficiently, and in higher quality, and thus yields clear individual-level benefits for them. At the same time, dealing with license obligations may be an annoying burden for individual developers since such obligations are mainly of relevance at the firm level. As a result, while ad-hoc Internet code reuse may well optimally solve individual developers’ problems, such reuse without regard to licensing obligations may entail negative long-term consequences for their firms that may far outweigh any individual-level and short-term firm-level benefits. Since it is difficult to monitor developers’ decisions of whether or not to scrutinize and account for potential license obligations around a piece of Internet code considered for reuse, it is up to these individuals to behave ethically and choose a course of action that prevents harm to their employer (see Carroll 1991; Thong and Yap 1998). As such, ad-hoc reuse of Internet code, like other forms of knowledge-based collaboration, mandates vigilant behavior by individuals (Jarvenpaa and Majchrzak 2010). In this paper, we build on a rich tradition of ethical studies in the information systems (IS) domain (e.g., Anderson et al. 1993; Banerjee et al. 1998; Carlisle 1999; Cohen and Cornwell 1989; Mason 1986; Mingers and Walsham 2010; Moores and Chang 2006; Peace et al. 2003; Straub and Collins 1990; Walsham 1996), to analyze the factors that make an individual professional software developer more or less likely to disregard potential license obligations when reusing Internet code. As our two research questions, we ask: (1) can we model individual

professional software developers’ ad-hoc reuse of Internet code as an ethical decision and, if so, (2) which factors will render them more or less likely to make the “right” decision? We build on the theory of planned behavior to model and investigate the individual-level decision process, positing that attitude, subjective norm, and perceived behavioral control will predict individual-level intention and, thus, behavior (Ajzen 1991). In particular, we hypothesize that utility considerations (expected utility theory [e.g. Fishburn 1970; Schoemaker 1982]) and expectation of punishment (deterrence theory [e.g. Ehrlich 1973]) determine individuals’ intention (Bulgurcu et al. 2010; Peace et al. 2003) to engage in ad-hoc Internet code reuse and disregard potential license obligations via their attitude toward such behavior. Furthermore, we predict that the acceptance of wider, institutional norms about legal and ethical behavior as well as firm-specific rules (ethical work climate theory [Victor and Cullen 1988]) impacts individuals’ intention directly as well as indirectly through their perceived subjective norm. Based on a pre-study that includes 20 interviews with experts in the field, we survey 869 professional software developers with the use of a vignette study approach (Fredrickson 1986) that contains multiple scenarios of ad-hoc Internet code reuse. We then solicit individual software developers’ opinions and motivation with respect to such behavior through a questionnaire that includes potential controls for problems of common method bias (Podsakoff et al. 2003). Our findings indicate that individuals’ intention to reuse Internet code without regard to potential license obligations is driven by their attitude toward the behavior and the subjective norm they perceive regarding it.
One level deeper, in applying expected utility and deterrence theory, we find that attitude is influenced by the severity of time pressure developers perceive in their firms and by the severity of punishment they expect for their firms and themselves for disregarding potential license obligations. We also present evidence for subjective norm mediating the effect of ethical work climate on developers’ intention.

Based on our findings, our study makes three contributions to the literature on knowledge reuse and ethical behavior in the context of IS and, potentially, beyond. First, we solidify the application of expected utility and deterrence theory when investigating the concept of attitude in situations with an ethical dimension. In doing so, we also provide first evidence that individuals might consider the costs of a behavior more strongly than resulting benefits when forming an attitude in such situations. Second, regarding ethical work climate, we point out that it may not affect intention directly, but that its effect is mediated by subjective norm. Moreover, our results suggest that even in the context of ethical behaviors within firms, individuals are more strongly influenced by a work climate that invokes wider institutional norms rather than firm-specific rules. Third, our combined results highlight that whether or not to honor potential license obligations during ad-hoc Internet code reuse amounts to an individual-level decision in which ethical considerations are of crucial importance. This aspect needs to be included in discussions of potential firm-level benefits of knowledge reuse, and, in particular, Internet code reuse. In this sense, our study is an important step toward shedding more light on the microfoundations of knowledge reuse, that is, studying how individual-level factors affect the eventual success of knowledge reuse strategies at the level of the firm (see, e.g., Majchrzak et al. 2004). Finally, we also point out several managerial implications to influence the behavior of professional software developers in order to mitigate risks from their ad-hoc Internet code reuse.

THEORY AND HYPOTHESES

Internet Code Reuse as an Individual-Level Decision

Software reuse, as the software-specific form of knowledge reuse (Majchrzak et al. 2004; Markus 2001), is “the process of creating software systems from existing software rather than building software systems from scratch” (Krueger 1992, p. 131). Within software reuse, the artifact most commonly reused is code (Kim and Stohr 1998; Krueger 1992). Such code can come in the form

of components (pieces of software that encapsulate functionality, often specifically for the purpose of being reused) or snippets (that is, multiple lines of code from existing systems). The large body of OSS and other code that has become available on the Internet at no cost for download in recent years and that generally comes with the permission to be reused in other software development projects (hereafter “Internet code”) is not only an attractive resource for OSS developers in the creation of new projects (Haefliger et al. 2008; Sojer and Henkel 2010a), but can also be leveraged in commercial software development (Ebert 2008; Spinellis and Szyperski 2004). When such code is proven and thoroughly tested, developers do not need to code entirely new systems from scratch but, by reusing it, can produce artifacts of higher quality and better maintainability, often in much less development time (Crowston et al. 2010; Frakes and Kang 2005; Madanmohan and De' 2004). However, despite typically being available gratis on the Web, most Internet code is not in the public domain, but is still protected by copyright and comes under various licenses that demand that those who reuse the code comply with obligations stated in the license terms (Fitzgerald 2006; Rosen 2004; Watson et al. 2008). These obligations vary greatly by license. As a less critical condition, some licenses ask for attribution of the original creators of the reused code in the software that builds on their work. More problematic for firms are the obligations demanded by the GPL, which is also the most common OSS license. The GPL requires that other code that is tightly integrated with code under the GPL also be put under this license. Thus, code that the firm might have wanted to protect through secrecy now has to be put under the GPL, thereby granting all users of the software the right to access, modify, and redistribute the source code of the software without paying a fee to the original creator.
Typically, noncompliance with these strict rules is not a viable option for firms, since the likely enforcement of the GPL in court can have serious legal, economic, and reputational consequences (Arne 2008; Rosen 2004). Consequently,

to prevent these potential pitfalls, some firms have begun to reuse Internet code systematically by incorporating steps into their software development processes to select, evaluate, and integrate Internet code when building new software (Norris 2004; Ruffin and Ebert 2004). Given the structured processes of such a setting, it seems feasible to weigh the benefits and risks of Internet code reuse and manage potential issues from license obligations properly. However, as our introductory example has made clear, Internet code is also often reused by individual professional software developers in ad-hoc fashion when these developers, on their own and typically without telling anyone, employ Internet code as a shortcut in their work (Arne 2008; McGhee 2007). Levi and Woodard (2004, p. 8) even claim that such ad-hoc reuse of Internet code has become the “standard practice for many programmers.” Sometimes developers do not thoroughly check for the obligations that come with reuse or even intentionally ignore such obligations (Sojer and Henkel 2010b). Notably, in the survey conducted for this study, we find that between 15% and 21% of participants may have reused Internet code with a disregard for potential license obligations.

Internet Code Reuse as an Ethical Individual-Level Decision

Originating with Mason’s (1986)3 seminal contribution in which he pointed out the manifold ethical issues in the IS space, a stream of scholarly work has emerged to investigate and explain ethical and unethical behavior in the IS context. Among the issues explored in this body of research are software piracy (e.g. Moores and Chang 2006), IS abuse (e.g. Harrington 1996), data or identity theft (e.g. Banerjee et al. 1998), and privacy (e.g. Straub and Collins 1990). However, the fast-paced evolution of IS leads to the continuous emergence of new ethical issues in this domain (Mingers and Walsham 2010; Straub and Collins 1990).
3 Alternatively, Mingers and Walsham (2010) point to work in 1950 and 1985 as the roots of information ethics.

The commercial relevance of the disregard for license obligations in ad-hoc use of Internet

code is reflected in our introductory examples as well as in the startup and strong growth of firms such as Black Duck Software that provide software tools to help firms scan their code bases for Internet code reused by their developers. However, scholarly work explicitly addressing Internet code reuse in commercial software development has acknowledged this issue but not yet explored it in depth. Since most professional software developers are aware of Internet code licenses in general (Sojer and Henkel 2010b), disregarding potential license obligations has to be seen as a form of unethical behavior. It is unethical because developers’ pursuit of their goal to make their own job easier by reusing Internet code without fully accounting for the obligations coming with the reused code brings harm upon their employers. Thus, developers affect their employer’s ability to pursue its commercial goals in an unjust way through their individual behavior (see Carroll 1991; Thong and Yap 1998). Based on this argument, we develop a research model that explains such behavior.

Modeling Individual-Level Ethical Reuse Behavior

One of the theoretical models employed most frequently to investigate (un)ethical behavior in business contexts is the theory of planned behavior (TPB) (Ajzen 1991). This parsimonious model of human behavior assumes that individual behavior is predicted by the intention to engage in the behavior. TPB identifies the determinants of intention as attitude toward the behavior, perceived subjective norm regarding the behavior, and perceived behavioral control. From an ethical perspective, TPB brings together teleological and deontological considerations (also see Moores and Chang 2006) into a coherent individual-level decision-making framework. Fishbein and Ajzen’s (1975) expectancy-value model argues that the sum of positive and negative beliefs regarding the outcome and consequences of a behavior leads to the formation of an overall attitude toward it.
Similarly, the subjective norm regarding a behavior is formed by the

sum of an individual’s normative beliefs of whether others of importance support or discourage the behavior. Finally, perceived behavioral control accounts for all factors perceived as opportunities or impediments to engage in the behavior, which may be differentiated into a “capability” and a “controllability” portion (Ajzen 2002; Pavlou and Fygenson 2006). Ethical decision making and its modeling through TPB has a long tradition in the IS literature and beyond. For example, Banerjee et al. (1998) use a model drawing on TPB to describe issues such as IS abuse and privacy violations. Peace et al. (2003) find that all three components of the TPB strongly predict intention to engage in software piracy. Outside the information technology context, Flanery and May (2000) investigate the role of ethical decision making in the metal finishing industry, and find strong support for the effect of both attitude and subjective norm on intention in an ethical decision-making setting. Buchan (2005) shows a positive effect of attitude on intention regarding unethical behavior in the public accounting profession. To account for particularities of ethical decision making, and in line with its original author (Ajzen 1991), these and other scholars have developed several modifications to the TPB. For example, it is common to use the intention to engage in a behavior as the dependent variable in studies of ethical behavior and not the behavior itself because intention is a highly reliable predictor of behavior (Ajzen 1991; Beck and Ajzen 1991). We follow a similar approach in using the TPB to predict individual-level intention to engage in unethical behavior, and extend it by adding constructs driving individuals’ attitude, subjective norm, and intention. Specifically with regard to attitude, we use expected utility theory (e.g. Fishburn 1970; Schoemaker 1982) and deterrence theory (e.g. 
Ehrlich 1973) to investigate the effect of benefits and costs (as perceived by the individual) of ad-hoc Internet code reuse that potentially violates license obligations. In addition, we investigate how concepts from ethical work climate theory (Victor and Cullen 1988) influence subjective norm as well as intention. In

the following, we introduce the individual constructs of our research model and develop hypotheses, which are summarized in Figure 1.

INSERT FIGURE 1 ABOUT HERE

Predicting Intention

As pointed out earlier, a wide array of literature in various domains has shown the validity of the general TPB model in ethical decisions to predict intention based on attitude, subjective norm, and perceived behavioral control. Nevertheless, the relative importance of constructs influencing intention is expected to vary with regard to the behavior at stake (Ajzen 1991; Beck and Ajzen 1991). Consequently, it is important to analyze each specific behavior of interest and test the significance of each factor in predicting intention. Leveraging TPB as a base for our research model, we posit the following three baseline hypotheses:

Hypothesis 1: A more positive attitude toward the reuse of Internet code without regard to potential license obligations will increase developers’ intention to engage in such behavior.

Hypothesis 2: A higher level of subjective norm supportive of the reuse of Internet code without regard to potential license obligations will increase developers’ intention to engage in such behavior.

Hypothesis 3: A higher level of perceived behavioral control regarding the reuse of Internet code without regard to potential license obligations will increase developers’ intention to engage in such behavior.

Predicting Subjective Norm and Intention: Ethical Work Climate Theory

While TPB takes into account an individual’s attitude toward a behavior and the attitude of peers via subjective norm, some scholars have argued that when ethical decisions are at stake, the greater context in which these decisions are made also influences the decision makers (Trevino 1986; Wyld and Jones 1997). Following this line of thought, organizational ethics research has found empirical support for the relationship between an ethical climate within an organization

and the ethical behavior of the employees of that organization (e.g. VanSandt et al. 2006). Conceptualizing such an ethical climate within organizations, Victor and Cullen (1988, p. 101) have developed the multidimensional construct of ethical work climate as “the prevailing perceptions of typical organizational practices and procedures that have ethical content.” While ethical work climate within an organization is by definition a macro-level construct (Wyld and Jones 1997), the way it is perceived by members of the organization influences them with regard to “the types of ethical conflicts considered, the process by which such conflicts are resolved, and the characteristics of their resolution” (Victor and Cullen 1987, p. 55). In turn, integrating the dimensions of ethical work climate identified by Victor and Cullen (1988) into our research model allows us to test the effect of the organizational context on individual ethical behavior. Given the behavior of interest in our study, the two ethical work climate dimensions “law and code” and “rules” seem to be most appropriate for inclusion in the research model.4 The law and code dimension reflects how important it is in the organization to comply with laws and professional codes of conduct. It is, thus, a reflection of the strength of intrafirm adherence to institutional norms (e.g. Orlikowski and Barley 2001) that originate from the environment of the firm. Given that obligations from Internet code reuse result from legal instruments such as copyright, and that respect for IP is part of many IS codes of conduct (e.g. Anderson et al. 1993), developers in firms in which these norms are more strongly present and attended to should have a lower intention to reuse Internet code without regard to potential license obligations.

4 The other three dimensions identified by Victor and Cullen (1988) are caring, instrumental, and independence.

While the foundations for ethical deliberation are societal, professional, or legal and thus extra-organizational in the case of the law and code dimension, the rules dimension of ethical work climate reflects intra-organizational, firm-specific principles such as policies, procedures, rules, and internal norms (Wyld and Jones 1997). Given that some firms that develop software

have policies, processes, and rules of how to deal with Internet code (which typically also address the topic of obligations resulting from reuse), developers in firms where compliance with internal rules is more pronounced should have a lower intention to reuse Internet code without regard to potential license obligations. The above rationale leads to the following two hypotheses:

Hypothesis 4a: The more a firm’s ethical work climate emphasizes compliance with laws and codes, the lower will be developers’ levels of intention to disregard potential license obligations when reusing Internet code.

Hypothesis 4b: The more a firm’s ethical work climate emphasizes compliance with firm rules, the lower will be developers’ levels of intention to disregard potential license obligations when reusing Internet code.

Tetlock’s (1985, p. 298) proposition that “[b]oth individuals and small groups of individuals are constrained by the norms, procedures, and resources of the institutions in which they live and work” suggests that ethical work climate not only influences individuals’ behavioral intention as proposed in Hypotheses 4a and 4b, but also affects the subjective norm that individuals perceive from their colleagues who are subject to the same norms, procedures, and resources within their firm. Consequently, in firms with an ethical work climate of compliance with laws and codes as well as in firms with an ethical work climate of compliance with firm rules, individual developers should perceive a more negative subjective norm with regard to the reuse of Internet code without regard to potential license obligations. Thus, we posit two additional hypotheses capturing the potential mediation of the effect of an ethical work climate on intention by subjective norm:

Hypothesis 4c: The more a firm’s ethical work climate emphasizes compliance with laws and codes, the lower will be developers’ levels of subjective norms supporting Internet code reuse without regard for potential license obligations.

Hypothesis 4d: The more a firm’s ethical work climate emphasizes compliance with firm rules, the lower will be developers’ levels of subjective norms supporting Internet code reuse without regard for potential license obligations.
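
For orientation, Hypotheses 1 through 4d can be written as a pair of linear path equations. This is only an illustrative sketch: the linear form, the construct abbreviations (INT for intention, ATT for attitude, SN for subjective norm, PBC for perceived behavioral control, EWC_LC and EWC_R for the two ethical work climate dimensions), and the coefficient symbols are assumptions introduced here, not notation taken from the study.

```latex
% Assumed linear path model; all symbols are illustrative, not the authors' notation.
\begin{align}
\mathit{INT} &= \beta_{1}\,\mathit{ATT} + \beta_{2}\,\mathit{SN} + \beta_{3}\,\mathit{PBC}
               + \gamma_{1}\,\mathit{EWC}_{LC} + \gamma_{2}\,\mathit{EWC}_{R} + \varepsilon_{1}, \\
\mathit{SN}  &= \alpha_{1}\,\mathit{EWC}_{LC} + \alpha_{2}\,\mathit{EWC}_{R} + \varepsilon_{2}.
\end{align}
% Hypothesized signs:
%   \beta_1, \beta_2, \beta_3 > 0      (Hypotheses 1-3)
%   \gamma_1, \gamma_2 < 0             (Hypotheses 4a, 4b: direct effects on intention)
%   \alpha_1, \alpha_2 < 0             (Hypotheses 4c, 4d: effects on subjective norm)
% Indirect (mediated) effect of climate dimension k on intention: \alpha_k \beta_2, k = 1, 2.
```

Under the hypothesized signs, each indirect effect \alpha_k \beta_2 is negative: a stronger compliance climate lowers the subjective norm supporting the behavior, which in turn lowers intention.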

Predicting Attitude: Expected Utility Theory

Having established the antecedents of intention and subjective norm, we next address the factors that influence developers’ attitudes with regard to the reuse of Internet code with a disregard for potential license obligations. We rely on expected utility theory and deterrence theory; expected utility theory argues that individual attitudes are based on a cost-benefit estimation of the behavior in general, whereas deterrence theory focuses on anticipated costs of illegal or unethical behavior through its potential punishment. We introduce both theories in more detail and derive hypotheses related to each. When developing functionality for a software project, commercial software developers can usually choose between three alternatives: to implement the required functionality themselves (possibly reusing internal code of their firm), to reuse Internet code making sure that all resulting obligations are met, or to reuse Internet code disregarding potential license obligations.5 Expected utility theory (e.g. Fishburn 1970; Schoemaker 1982) states that developers will favor the reuse of Internet code without regard for potential license obligations if the sum of all expected benefits of this alternative, less the sum of all expected costs of this option, is greater than for the other two options. Benefits and costs involved in evaluating the alternatives can be incorporated into our research model as antecedents of attitude toward the behavior (see, e.g., Bulgurcu et al. 2010; Peace et al. 2003). High perceived benefits should lead to a more positive attitude while high perceptions of the expected costs should lead to a more negative attitude (Ajzen 1991; Peace et al. 2003).
In our setting, we argue that individual attitude will be driven, in particular, by the perceived usefulness of Internet code, the severity of time pressure the individual feels, and the expected cost of compliance to investigate and address obligations arising from reuse.
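
The comparison of the three alternatives described above can be illustrated with a small sketch. The class and function names as well as all benefit and cost figures are hypothetical and chosen purely for illustration; they are not data from the study, and the simple "benefits minus costs" score is only one possible reading of the expected utility argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One development alternative with hypothetical expected benefits and costs."""
    name: str
    expected_benefits: float  # illustrative units, not measured values
    expected_costs: float     # illustrative units, not measured values

    @property
    def net_utility(self) -> float:
        # Sum of expected benefits less sum of expected costs.
        return self.expected_benefits - self.expected_costs


def preferred_option(options):
    """Return the alternative with the highest expected net utility."""
    return max(options, key=lambda option: option.net_utility)


# Hypothetical numbers: reuse saves effort either way, but compliance adds
# investigation and coordination costs, while disregarding obligations adds
# an expected cost of punishment.
options = [
    Option("implement in-house", expected_benefits=5.0, expected_costs=4.0),
    Option("reuse with compliance", expected_benefits=8.0, expected_costs=5.0),
    Option("reuse disregarding obligations", expected_benefits=8.0, expected_costs=2.5),
]

best = preferred_option(options)
print(best.name, best.net_utility)
```

With these assumed figures, the third alternative wins because the expected cost of punishment is small relative to the cost of compliance; raising its expected cost above 5.0 would shift the preference to compliant reuse.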

5 The option to in-license and integrate commercial code is usually not available in the situation we consider (ad-hoc reuse by individual developers) due to high transaction cost and the time required.

Benefits: Usefulness of Internet Code

Internet code reuse can make developers’ jobs easier because it allows them to solve technical problems they could not solve themselves, or to develop better software in less time (e.g. Frakes and Kang 2005). Different developers may perceive these benefits differently. One developer from our qualitative pre-study explains that, “I feel the open source community is a lifesaver and I use the Internet daily to do my work, sometimes to find modules [i.e., components], but mostly […] to find examples of how to achieve a task correctly in complicated logic.”6 Quite to the contrary, another developer comments, “I have never found code from the Internet to be useful beyond showing an approach to some subject. Porting others’ code is too hard to be worth the trouble.” It seems likely that the first developer holds a more positive attitude toward Internet code reuse without regard for potential license obligations because he needs Internet code as a “lifesaver” for his job. Even if he incurs expected costs from the potential violation of license obligations, these costs should be made up for by the high benefit he perceives from the reuse of Internet code. Quite differently, the second developer should have a more negative attitude toward such Internet code reuse because he perceives little benefit from it in the first place. We thus state the following hypothesis.

Hypothesis 5a: Usefulness of Internet code will have a positive effect on attitude toward its reuse without regard for potential license obligations.

6 See the Data and Methods section for more information on our qualitative pre-study, the source of the interview quotes presented here. Where established theory exists, the quotes serve mainly to illustrate or exemplify our arguments. When no established theory exists, we employ these quotes to solidify our arguments.

Benefits: Mitigation of Time Pressure7 Efficiency gains through Internet code reuse may, in particular, mean faster completion of a project; thus, it offers the benefit of avoidance of negative consequences that come from missed deadlines. Generally, research on software quality shows that developers under time pressure tend to take shortcuts in order to meet their deadlines (e.g. Brooks 1975). Such shortcuts are “decisions made in private that are motivated by a desire to stay on schedule, but are not in the best interests of the project” (Austin 2001, p. 195). It seems likely that reusing Internet code disregarding potential license obligations is one such shortcut for professional software developers in which they “hope for the best [and] leave potential sources of difficulty unexplored” (Austin 2001, p. 195). In a theoretical model, Austin (2001) shows that developers who perceive more severe consequences from missing deadlines, such as not being considered for promotions or pay raises, are more likely to take shortcuts and ignore the negative issues that might follow as a result. Drawing on this logic, we posit the following hypothesis: Hypothesis 5b: Severity of time pressure in the developer’s firm will have a positive effect on attitude toward reusing Internet code without regard for potential license obligations. Benefits: Avoidance of the Cost of Compliance Compliance to license obligation may represent a work impediment to professional software developers (Bulgurcu et al. 2010). A benefit specific to Internet code reuse that potentially violates license obligations is thus that such violations help to avoid costs of compliance, which

7

The research model assumes a relationship between the severity of not meeting deadlines and developers’ attitude toward the reuse of Internet code without regard for potential license obligations. It does not test a relationship between the general existence of time pressure and developers’ attitude. The rationale for this is that while most software development projects have some share of time pressure, developers only react to this if they perceive “severe” consequences from not meeting the resulting deadlines. Further, the existence of time pressure may vary from project to project and could even differ for different points in time within a project. Contrary to this, the severity of time pressure should be rather stable over time within a firm and thus relates better to attitude, which is also a construct expected to be rather stable over time.


can be broken down into two components. First, developers incur the cost of investigating which obligations come with the Internet code they want to reuse. Second, they need to bear the cost of ensuring that all obligations previously identified are accounted for properly. Similar to the perceived usefulness of Internet code reuse, developers can also hold different perceptions with regard to these two cost components. For instance, with respect to the cost of thoroughly checking for the obligations of Internet code, one developer from the qualitative pre-study finds it to be rather high and explains, “my problem is that licenses are often written in legalese which is hard to comprehend.” It seems likely that developers with this position consider the avoidance of the cost of compliance as an attractive option. At the other extreme, another developer holds the opposite opinion: “Software license issues are easy to check, and clearly nobody should integrate code without checking the license.” The statement emphasizes that, due to the perceived low cost of compliance, this developer considers such behavior unattractive.

The same discrepancy can be observed with regard to the second component, which addresses the costs involved in ensuring that the obligations are accounted for. Here developers typically have to engage with others in their firm (often their supervisors) to determine whether and how the obligations of the code can be fulfilled.8 If developers expect high costs in the form of a lengthy and difficult discussion with their firm combined with a high likelihood of a negative outcome, they might very well consider it attractive to avoid this step, reuse the code right away, and disregard the obligations they are aware of. This leads us to the following hypothesis:

Hypothesis 5c: Higher costs of compliance will have a positive effect on attitude toward the reuse of Internet code without regard for potential license obligations.

8

For example, engaging with others is necessary as software developers typically can neither decide by themselves whether proprietary code affected by a “viral” effect (see “Questionnaire Development”) can be made available under an OSS license, nor can they legally assign a license to code for which they do not hold the copyright.


Predicting Attitude: Deterrence Theory

Counteracting the above benefits is the potential punishment that can result for the individual software developers and/or their firms from disregarding these potential license obligations. This cost type is closely linked to deterrence theory. In the deterrence theory literature, punishment is usually decomposed into punishment severity and punishment certainty (e.g. Tittle 1980). Based on these concepts, deterrence theory proposes that the level of unethical or illegal behavior decreases when punishment severity and/or punishment certainty are increased (Ehrlich 1973). In the IS field, Straub (1990) finds support for deterrence theory in the context of computer abuse in organizations. Peace et al. (2003) identify a link between both punishment severity and punishment certainty and the intention of employees to pirate software at their workplace. In our setting, two different types of punishment severity need to be considered (see the respective subsection for why this distinction does not apply to punishment certainty): the severity of punishment for the firm (Thornton et al. 2005) and for the individual developer (Peace et al. 2003). Thus, we look at how punishment severity for the firm, punishment severity for the individual, and punishment certainty will affect individual-level attitude toward the reuse of Internet code without regard for potential license obligations.

Costs: Punishment Severity (Firm)

Regarding punishment severity for the firm, there should be developers who are well aware of the consequences their employers might face if they disregard potential license obligations, and who also consider these consequences significant. One example of such a developer is a participant of the qualitative pre-study who explains, “a license can be legally enforced [against the firm].
You have to read it before use.” This statement illustrates that developers who know that violating obligations from reused code could create substantial (legal, economic, and reputational) trouble for their employer should have a more negative view on such reuse, and thus

do not engage in this behavior (also see Henkel 2009). In contrast, developers who are less aware of these potential problems or who perceive them as less severe should hold a more positive view thereof. We thus posit:

Hypothesis 6a: Punishment severity for the developer’s firm will have a negative effect on attitude toward the reuse of Internet code without regard for potential license obligations.

Costs: Punishment Severity (Developer)

Regarding punishment severity for individual developers, the qualitative pre-study has revealed that there exist firms with explicit rules on how their developers have to deal with Internet code and that some of these firms also strictly enforce these rules. As one developer emphasizes, “I work for a company that expressly prohibits including open source software. […] if I were to cut and paste [this type of Internet code], it would cost me my job.” While this particular firm bans reuse of Internet code altogether, there are also firms that allow only the reuse of code under certain licenses, or only code with certain obligations, and have introduced punishments for developers who do not comply with these rules. Following deterrence theory, developers who perceive more severe punishments for themselves should hold a less positive attitude toward reusing Internet code while disregarding potential license obligations. Thus, we posit:

Hypothesis 6b: Punishment severity for the developer will have a negative effect on attitude toward the reuse of Internet code without regard for potential license obligations.

Costs: Punishment Certainty

Punishment certainty captures the likelihood that someone outside the developer’s firm determines that the firm’s software contains Internet code but does not account for the license obligations. Subsequently, it will be within the firm’s discretion to identify the individual developer responsible for this issue.9 It is generally assumed that “determining whether [Internet

code] is present in a corporation’s code base is a difficult task to perform accurately” (McGhee 2007, p. 8). Yet recently, organizations such as gpl-violations.org have been founded to actively pursue violations of obligations from reused Internet code by commercial firms. Developers who are more aware of these recent developments should perceive a higher punishment certainty. Beyond that, there are various other factors that might influence the punishment certainty developers perceive, such as the programming languages employed (as the binary code created by some programming languages can be analyzed more easily than that of others) or the deployment mode of the software (e.g., embedded software vs. standalone software, or few customers vs. many customers). This leads us to posit:

Hypothesis 6c: Punishment certainty will have a negative effect on attitude toward the reuse of Internet code without regard for potential license obligations.

9

Typically, someone outside of the developer’s firm has access only to the binary code of the software, so only the firm will be able to identify the culprit. The likelihood that individual developers will be able to get away with license violations even when their employers get caught will depend on the individual’s programming skills and firm-level code tracking systems. However, these two factors are sufficiently captured by the capability and controllability dimensions of perceived behavioral control, implying that (1) punishment certainty is rather a firm-level construct and (2) additional theorizing at the level of developers is not necessary.

DATA AND METHODS

Questionnaire Development

To test our research model, we collected data with a survey questionnaire. The explicit items employed to measure the research model constructs are listed in Appendix A. As laid out below, we adhere to Boudreau et al.’s (2001) recommendations on study design in conducting a pilot study and a pre-test, using previously validated instruments whenever possible, reporting content and construct validity and reliability, and undertaking formal validation using PLS (also see Anderson and Gerbing 1998; Gefen and Straub 2005; Gefen et al. 2000; Straub et al. 2004). To ensure high content validity, whenever possible we relied on questionnaire items validated in previous studies. We draw on existing research that investigates forms of unethical


behavior (preferably in the IS context) with TPB-based models (e.g. Coyle et al. 2009; Limayem and Hirt 2003; Peace et al. 2003) and adapt their wording to match our setting. For the two ethical work climate theory constructs, we employed the original items by Victor and Cullen (1988). No suitable existing items could be identified for the constructs usefulness of Internet code, cost of compliance, and punishment certainty. Therefore, new items were developed to measure these constructs based on existing literature about Internet code reuse and a qualitative pre-study we conducted. This pre-study consisted of 20 interviews, with an average duration of 51 minutes, with industry experts in the field of Internet code reuse and further included discussions with software developers who had taken part in the pilot (see below) of our questionnaire. In the questionnaire, all resulting items were gauged on seven-point Likert scales ranging from “strongly disagree” to “strongly agree.”

Given the ethical nature of our survey, some further deliberations were necessary. First, in order to minimize social desirability effects in our data, we followed several of the steps suggested by Podsakoff et al. (2003). Specifically, the survey was designed in such a way that it could be administered anonymously through an online application, all critical survey items were presented in a nonthreatening, neutral tone, and a scenario helped to create psychological distance for the survey participants. Second, we explicitly included Strahan and Gerbasi’s (1972) version of the Marlowe-Crowne social desirability scale (1960) in our empirical design to be able to quantify any potentially remaining social desirability bias. Finally, to avoid omitting contextual information necessary to elicit realistic decision making in ethics surveys (Randall and Gibson 1990), we applied Fredrickson’s (1986) scenario method as a means to embed realism into our survey.
Participants were randomly assigned one of three scenarios. As a starting point for our scenario vignettes, we relied on the 1992 version of the ACM (Association for Computing Machinery) Code of Ethics and Professional Conduct and the work

of Anderson et al. (1993), who have illustrated this code by developing “a set of nine classes that describe situations calling for ethical decision making.” For each of these classes, Anderson et al. (1993) have developed a scenario to illustrate its ethical content. In their first scenario, addressing ethical behavior in the context of IP, they describe a developer who, under time pressure and stuck with technical problems, reuses existing code without thoroughly checking for all obligations of the code and without accounting for those obligations that she is aware of. Using our insights from the qualitative pre-study, we modified this scenario in such a way that it reflects the situation of a developer who reuses Internet code today as accurately as possible. The modified scenarios (the full texts of which are available in Appendix B) now present a professional software developer named Joe who is under time pressure to complete his module of a software development project and who is not sure how to implement a certain piece of functionality specified for his module. In order to resolve this situation, Joe reuses Internet code in an ad-hoc fashion.

In order to account for different types of Internet code reuse that disregard potential license obligations, three different versions of the scenario were derived from the above base case. In Scenario 1, the form of Internet code that Joe reuses is a snippet for which he does not check thoroughly whether there are obligations that need to be fulfilled for reuse. In Scenario 2, the form of Internet code that Joe reuses is a component, and similar to Scenario 1, he does not check thoroughly whether there are obligations that need to be fulfilled when the component is reused.
Finally, in Scenario 3, Joe, as in Scenario 1, reuses a snippet; however, he does check for the obligations resulting from reuse and finds that the snippet causes a “viral” effect, distantly related to the GPL, in that its author demands that other code combined with the reused code be made publicly available in source code form on the Internet. Joe believes that a discussion with his firm about compliance with this obligation would take valuable time and sees a chance that

his firm would, in the end, forbid him to use the snippet. Due to this, he chooses to simply ignore the obligation, alters the snippet a little bit, and integrates it into his work. We did not devise and test a potential fourth scenario that transfers the situation described in Scenario 3 to the reuse of components because the experts from our qualitative pre-study convinced us that this situation would be rather uncommon.

Before conducting the actual survey, the questionnaire was pre-tested extensively. First, it was reviewed by four scholars familiar with the topic. After that, 113 software developers from the same population as that addressed in the actual survey (see below) took part in two rounds of pilot studies to assess the quality of the instruments employed with respect to content, scope, and language. The final questionnaire was refined based on the analysis of the pre-test results and feedback from the pilot tests and fellow researchers.

Sample and Data Collection

Since professional software developers from only one or a few firms might not be representative in their beliefs and opinions, our study required a broad sample. To accommodate this need, we chose participants in software development newsgroups for our survey, with the assumption that a substantial share of the people who participate in discussions on this communication channel would also develop software for a living. To construct our survey population, we thus identified 528 newsgroups dealing with the topic of software development. In July 2009, we extracted email addresses of those 38,212 individuals who had been active in these newsgroups in the previous 12 months (see Figure 2). We eliminated 13,525 addresses due to issues such as duplication or because they clearly did not refer to potential software developers. Of the remaining 24,687 addresses, 1,212 were utilized for our pilot studies and 23,475 formed our final survey sample.
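The address counts reported for the construction of the survey population can be checked with simple arithmetic. The following minimal Python sketch uses only the figures stated in the text; the variable names are illustrative assumptions and not part of the original study.

```python
# Figures as reported in the text; variable names are illustrative only.
extracted_addresses = 38_212   # individuals active in the newsgroups
eliminated_addresses = 13_525  # duplicates and clear non-developers
pilot_addresses = 1_212        # addresses used for the pilot studies

# Addresses left after cleaning, then after setting aside the pilot group.
remaining_addresses = extracted_addresses - eliminated_addresses
final_survey_sample = remaining_addresses - pilot_addresses

print(f"Remaining after cleaning: {remaining_addresses:,}")
print(f"Final survey sample:      {final_survey_sample:,}")
```

Both results agree with the reported values of 24,687 remaining addresses and 23,475 addresses in the final survey sample.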
The survey was conducted during the fall of 2009, when we randomly selected 14,000 individuals from the population and contacted them via e-mail. Since 2,227 invitations did

not reach their recipients,10 the 1,133 valid responses we received correspond to a 9.9% response rate, which is typical for Internet surveys (Couper 2000). A non-response analysis that compares early to late respondents (Armstrong and Overton 1977) yields no indication of a non-response bias with regard to the items used in the research model. Yet, and despite the above steps taken during the design of the questionnaire, we find slight indications of social desirability biases in our data when correlating the social desirability scale with the constructs of our research model (see Table 1).

10

The high number of invitations that could not be delivered reflects the fact that some newsgroup participants provide fake contact information in their profiles.

INSERT FIGURE 2 AND TABLES 1 AND 2 ABOUT HERE

Descriptive Statistics

As expected, the majority (77%) of the respondents to our survey are professional software developers. Of these, 316 took the questionnaire with Scenario 1; 256 with Scenario 2; and 297 with Scenario 3. The demographic information presented in Table 2 as well as all analyses in the following refer exclusively to these 869 professional software developers. The software developers surveyed are, on average, 35.6 years old and have 9.7 years of experience; 98% are male. When self-assessing their software development skills, nearly three quarters of the developers consider themselves to be above average. Twenty-three percent even think of themselves as “excellent” software developers. Most developers in our sample work as programmers (51%) or software/system architects (28%). A majority live in Europe (53%) or North America (28%). Developers from Asia and South America account for 15% and 4%, respectively. Nearly all developers have received a university education: 38% hold an undergraduate degree, 39% a graduate degree, and 11% a PhD. More than half of our sample (56%) has participated in OSS projects in the past. Finally, only about one third work in firms with a policy that addresses Internet code reuse, which is consistent with findings from the pre-study that indicate many firms still do not address potential risks of ad-hoc Internet code reuse by their developers. We control for the presence of such policies in additional robustness checks.

RESULTS

We tested our research model and its hypotheses individually for each of the three scenarios using partial least squares (PLS) employing the software SmartPLS (Ringle et al. 2005). PLS is one of the powerful second-generation multivariate analysis techniques that estimate both measurement and structural models simultaneously in optimal fashion (Chin 1998). Given the medium-sized samples for the three scenarios in our study and the lack of normal distribution in most of our variables,11 PLS seemed better suited for our research than the more traditional LISREL approach (Jöreskog and Wold 1982). All constructs were modeled as reflective measures (Petter et al. 2007). To estimate the significance of path coefficients, we relied on a bootstrapping procedure with 200 samples (Henseler et al. 2009). To present and discuss our results, we follow the two-step approach suggested by Anderson and Gerbing (1998). In the first step, we test the reliability and validity of our measurement models. After that, we evaluate the structural models to assess our hypotheses.

Measurement Model Assessment

Evaluating measurement models in PLS requires an examination of their composite reliability, indicator reliability, convergent validity, and discriminant validity (Henseler et al. 2009). As seen in Table 3, all constructs meet the composite reliability threshold of 0.7 (Chin 1998) and nearly all items exceed the indicator reliability cut-off value of 0.7 (Fornell and Larcker 1981). In Scenarios 1 and 3, the items PBC2 and PBC4 are slightly below the required value of 0.7. The same is true for item CERT2 in Scenario 3. However, since in all these cases the overall constructs exceed the composite reliability threshold of 0.7, the items are retained. To ensure

11

For nearly all items included in the research model, Shapiro-Wilk and Shapiro-Francia tests of normal distribution of variables reject the hypothesis of normal distribution with p