How Development Decisions Affect Product Volatility: A Longitudinal Study of Software Change Histories
Evelyn J. Barry
Mays Business School, Texas A&M University, College Station, TX
[email protected] | (979) 845-2254 (phone) | (979) 845-5653 (fax)

Chris F. Kemerer
Katz Graduate School of Business, University of Pittsburgh, Pittsburgh, PA
[email protected] | (412) 648-1572 (phone) | (412) 624-2983 (fax)

Sandra A. Slaughter**
Graduate School of Industrial Administration, Carnegie Mellon University, Pittsburgh, PA
[email protected] | (412) 268-2308 (phone) | (412) 268-7345 (fax)
July 2003

** Contact author for this paper

GSIA Working Paper #2003-E66
DO NOT QUOTE, COPY OR DISTRIBUTE WITHOUT PERMISSION OF THE AUTHORS

Funded in part by National Science Foundation grants CCR-9988227 and CCR-9988315, a Research Proposal Award from the Center for Computational Analysis of Social and Organizational Systems, NSF IGERT at Carnegie Mellon University, and the University of Pittsburgh Katz Graduate School of Business Institute for Industrial Competitiveness. We thank M. Cusumano, A. MacCormack, W. Richmond and participants in seminars at Purdue University, Arizona State University, and the University of Calgary for their comments and suggestions on an earlier draft of this manuscript. We also thank P. Marks and D. Ko for their assistance with data coding.
Abstract

Research on product development often focuses on activities prior to product launch. However, for long-lived, adaptable products like software, post-launch updates can account for the major portion of life cycle benefits and costs. Thus, it is important to understand how development decisions can influence the dynamics of product evolution. In this study we develop a model that relates strategic, organizational, and tactical product development decisions to the volatility of software products over their life cycles. We empirically evaluate this model by analyzing longitudinal data on more than 28,000 changes made over twenty years to software products in a major merchandising firm. In this firm a 1% increase in volatility is associated with a 1.9% increase in life cycle costs for the software products. Specific results indicate that, controlling for the domain, complexity, size and age of the products, an increase in the use of shared product platforms and work automation tools, in team instability, and in the intensity of updates to a product’s components increases a product’s future volatility. Analysis of curvilinear effects indicates that these volatility increases occur at a declining rate. These effects on volatility can be at least partially offset by using a standard, rather than custom, design over a product’s life cycle. Finally, we find significant interactive effects among pairs of individual development choices such that the total volatility impact can be amplified or moderated by the joint use of some development practices. Our results contribute to research on product development by providing often difficult-to-obtain empirical insights into the dynamic behavior observed during commercial product life cycles and by revealing how this behavior relates to development decisions made, in some cases, much earlier. In addition, our findings suggest that product development costs are sensitive to changes in volatility. The insights from these analyses can help product development managers understand the importance of managing product change, improve their ability to anticipate change, and estimate the life cycle implications of their development decisions.
Key Words: Product Development Decisions; Product Life Cycle; Product Change; Product Evolution; Software Volatility; Software Development; Software Evolution; Software Maintenance.
1. Introduction

Product development decisions can have far-reaching consequences. This is particularly true in software product development environments. While observers may intuitively tend to think of software as perpetually new, the reality is that many software products evolve over life cycles that can last for a decade or more (Swanson and Dans 2000). It is estimated, for example, that the average age of enterprise general ledger systems in Fortune 1000 companies is 15 years (Kalakota and Whinston 1996). To satisfy changing information requirements, software products are often modified and enhanced over their life cycles. In fact, the activities occurring post-launch (i.e., after initial implementation) can account for as much as 90% of the total product life cycle for software (Bennett 1996). Although numerous studies have improved managers’ understanding of product development issues, the majority of this research has focused on the pre-launch phases of product development (Brown and Eisenhardt 1995; Krishnan and Ulrich 2001). With the exception of a few studies (e.g., Banker, Davis and Slaughter 1998; Banker, Datar, Kemerer and Zweig 1993; Cohen and Whang 1997), there is relatively little knowledge about the effects of design and development choices on post-launch activities. However, long life cycles characterize not only pure software products, but also a growing number of manufactured products like automobiles, airplanes and medical equipment that have significant amounts of embedded software (Koopman 1996).1 Thus, it is increasingly important for a significant fraction of product development specialists to understand the life cycle implications of software development decisions.

There is a wide range of life cycle patterns in software products and products with embedded systems: some products remain relatively stable over their life cycles while others are frequently updated (Kemerer and Slaughter 1997; Koopman 1996). In part, such differences in product volatility may be attributable to differences in environments. Software products operating in very uncertain and dynamic domains are changed often to stay in sync with their environments (MacCormack, Verganti and Iansiti 2001). However, development decisions can also influence product volatility. Microsoft, for example,
1 According to the World Semi-conductor Trade Statistics Blue Book, there are currently an estimated 5 billion embedded system microprocessors in use, representing 94% of the world market for semi-conductors, and the embedded technology sector is projected to grow exponentially in the next decade (Coopee 2000; Halfhill 2000).
uses a “daily build” process in which a new version of the software is compiled at the end of every day (Cusumano and Selby 1997; Iansiti and MacCormack 1999). This practice causes the software product to change daily (even if the environment does not change every day).

Our objective in this study is to understand how strategic, organizational and tactical development decisions influence the volatility of software products over their life cycles. Specifically, we consider business-oriented software products such as inventory management, advertising, pricing, and order processing applications. By “volatility” we refer to the length of time between product modifications that occur post-launch as a result of life cycle activities; such activities can include updates, adaptations and enhancements to the product, including the development of new features (Barry 2001).2 Volatility is neither inherently good nor inherently bad. Rather, it reflects an important investment to keep products useful when, for example, new requirements emerge during use, the business environment changes, errors are discovered and must be repaired, new equipment or technology must be accommodated, or the performance or reliability of the software must be improved. However, it can be very challenging to predict which products will need to be changed and how often because, as we have noted, not all software products change at the same rate. This makes resource allocation and planning extremely complex. While managers cannot control the rate of environmental change, a central premise of this study is that managers can predictably influence (through their design and development decisions) the nature and timing of software change processes.

This study makes several contributions to knowledge about product development for software-intensive products. First, our study provides new insights by viewing product development from a temporal perspective, examining longitudinal data about changes to software products spanning as many as twenty years. A temporal lens has the potential to provide a richer understanding of the long-term implications of management decisions as opposed to more common cross-sectional analyses (Ancona, Goodman, Lawrence and Tushman 2001). Specifically, our study extends the literature on
2 We focus on the timing dimension of volatility. Although there are potentially a variety of possible dimensions of volatility, for managers of products with long life cycles, like those examined here, a particularly difficult and important task is estimating when resources will be needed to update products. Simple heuristics, such as periodic update release strategies, may be significantly non-optimal (Banker and Slaughter 1997).
product development by developing and empirically evaluating a model associating development decisions with product volatility. A predictive model that links design and development choices to the dynamics of product evolutionary patterns has the potential to significantly improve decision-making for long-lived products like software. Using this predictive model, researchers can broaden and deepen their understanding of the transforming processes and dynamic behavior observed during product evolution; managers can improve their ability to anticipate change and to design adaptable products while retaining a life cycle perspective for product support. Our study also contributes through its compilation of a substantial archive of longitudinal data on software product change histories that enables us to empirically evaluate our model of product volatility.

In Section 2, we develop a framework that relates strategic, organizational, and tactical product development decisions to software product volatility. Section 3 describes the empirical evaluation of our model, and Section 4 presents the analysis and results. In the final sections, we consider extensions of our analysis, discuss how our findings can inform research on product development, and draw out the implications of our results for the management of product development.

2. Strategic, Organizational and Tactical Product Development Decisions

Product development involves a series of decisions about a potential product’s architecture, configuration, production process, distribution and launch. In particular, product development projects require decisions of different kinds – strategic, organizational, and tactical. In the following sections we examine these different types of product development decisions and consider how they could influence software product volatility.

2.1 Strategic Design Decisions and Software Product Volatility

Product strategy and planning involve decisions about the firm’s target market, product mix, project prioritization, resource allocation, architecture, and technology selection (Krishnan and Ulrich 2001). In this study, we focus on two important strategic design decisions that are particularly relevant to software product development: whether software assets should be shared across projects, as in platform-based product development (Meyer and Mugge 2001), and whether to use standard or product-specific designs
(Ulrich and Ellison 1999). These decisions define important dimensions of the architecture for a firm’s software product portfolio, and as such represent strategic design choices.

2.1.1 Shared Product Platform. Software platforms are sets of subsystems, components, and interfaces that form a common software base from which derivative products can be efficiently developed and produced (Meyer and Lehnerd 1997, p. xii; Wheelwright and Clark 1992). Platform-based product development is frequently used in the computing industry and is increasingly considered in other industries, although it is not always appropriate for all situations (Krishnan and Gupta 2001). In software development, a software platform typically includes a shared operating system, a set of applications, and specific components or modules that can be tailored to particular market segments (Meyer and Seliger 1998). Software platforms can afford significant economies of both scale and scope in internal software production, use and support (Banker and Slaughter 1997; Banker, Davis and Slaughter 1995).3

The decision to use a platform-based development approach could have implications for life cycle product volatility. In the context of business software development, the applications layer of the product platform typically consists of common applications such as general ledger, accounts receivable, and inventory management, as well as specific modules or components that can be used to tailor the general applications to particular market needs (Meyer and Seliger 1998). The use of a software platform can facilitate the evolutionary development of applications by flexibly accommodating new and modified components to meet new needs as the firm expands and changes (MacCormack 2001).

As a firm grows in its current market and expands into new markets, the product development manager can decide whether to extend the use of the existing software platform into these markets, or whether to purchase or develop a separate software platform that is designed to handle the new information processing requirements. Extending the existing software platform means that the current software products will be used across a larger and potentially more diverse customer base and will process a greater volume of transactions. This use could leverage economies of scale and scope in
3 In addition, software companies can leverage the platform-based approach to product development for strategic advantage by becoming the standard or channel of distribution for large-scale innovation (Gawer and Cusumano 2002; MacCormack 2001; Meyer and Seliger 1998).
information processing. However, the products in the shared software platform may need to be adjusted to handle more inputs. The products are also probably less attuned to the information requirements of new customers in different markets. Therefore, these new users are likely to demand features and functions that are not currently available. A shared platform strategy would facilitate changes to the existing software products in the shared platform to accommodate these demands. Because the products in the software platform were designed to handle a particular set and volume of information processing requirements, the products could be expected to become more volatile with the growth and expansion of the firm. This implies that, given the decision to extend the existing software product platform:

H1: An increase in firm growth is associated with an increase in the future volatility of the software products developed using a shared platform.

2.1.2 Standard versus Product-Specific Design. Related to the product platform decision is the decision of whether to use a standard or custom design for a product, i.e., the “design-select” decision (Ulrich and Ellison 1999). When developing a new product, a unique (product-specific) design can be created, a standard package or component can be sourced from an external vendor, or an existing component in the product platform can be re-used. A number of studies have examined the conditions under which a product-specific versus standard design would be preferred (von Hippel 1998; Ulrich and Ellison 1999), and there is a rich literature on transaction cost economics that informs the related decision of make versus buy for production activities in general (e.g., Williamson 1985) and for information systems development in particular (e.g., Ang and Straub 1998; Ang and Beath 1993).

In the context of this study, our interest is not to determine whether managers will choose a custom or standard (package) design for a software product. Rather, we seek to predict the future volatility implications of the software product manager’s initial decision to use either a custom or standard design for the product. We assume, and our data support the notion, that both standard and custom designed software products can be modified by the firm or its agents should information requirements change. All else being equal, we expect that products created using a standard design are less volatile over their life cycles than those created using a custom design. There are several possible reasons for this
relationship. The initial choice of a custom design suggests that the product supports a core application. In a merchandising firm, for example, a core application is a point-of-sale system while a non-core application may be a payroll system. A core application supports and enhances the competitive position of a firm (Prahalad and Krishnan 2002). This implies that the firm will invest more, all else being equal, in its initial design and in its subsequent enhancement and maintenance because these activities have the potential to yield the largest returns (Hamel and Prahalad 1994; Prahalad and Krishnan 2002). Consequently, software products that support core capabilities are more likely than those supporting non-core activities to be implemented using custom designs. Subsequently, the software for a core application is also likely to be changed more frequently because keeping a core product or component useful is of more value to the firm.

The choice of a standard or custom design could also reflect the extent to which it is important to achieve scale economies in product support. If a standard design is selected initially, the product manager (or the external vendor if sourced externally) may be reluctant to change the design after implementation. One reason is that any change to a “standard” product or component has the potential to impact all instances of use, causing unnecessary work. In addition, if the “standard” product or component is tailored after implementation, future economies of scale in modification activities would be reduced because the software is no longer the same as that in other products or components that were created using that design. Therefore, we hypothesize that:

H2: Software products created using standard (package) designs are less volatile throughout their life cycles than software products created using product-specific (custom) designs.

2.2 Organizational Decisions and Software Product Volatility

In the product development domain, organizational decisions include issues of team staffing, organizational structure, workplace design, and investments in productivity-enhancing tools and processes (Krishnan and Ulrich 2001). For software product development, of special importance are organizational decisions relating to the product team and the use of productivity-enhancing tools and processes (Pressman 2001).
2.2.1 Team Instability. The product team is central to the product development process as team members accomplish the critical tasks of articulating product specifications and transforming them into the design, development, and implementation of new products. Accordingly, Brown and Eisenhardt (1995) argue that product team aspects have a strong impact on product development performance. In particular, the composition of the team has been recognized as critical to the success of product development efforts (Clark and Wheelwright 1998; Connell, Edgar, Olex, Scholl, Shulman and Tietjen 2001). There are several important aspects of team composition that have been identified in the product development literature, such as the use of cross-functional teams, team demographics, and team tenure. In the context of software product evolution, where significant development efforts can occur post-launch, an aspect of team composition that is particularly relevant is product team instability, i.e., changes in the team of engineers who develop and update the software product over its life cycle (Sommerville 2000).

The instability of the product team can be expected to contribute to software product volatility. Generally speaking, no one is likely to be as knowledgeable about the software product as those who have experience in developing or updating it (Sacks 1994). When new members of the team are assigned to the software product, they are initially less familiar with its design and code structure, and may have little experience working with other team members. While there may be some long-term performance benefits to bringing in new team members (such as increasing the amount and variety of ideas, and facilitating transfer of knowledge about the software product), at least initially, new team members are likely to make more errors (requiring future corrections) and are likely to be less efficient when updating the software product because they are unfamiliar with it (Swanson and Beath 1989). Thus, we expect that:

H3: An increase in the turnover of a software product’s team members is associated with an increase in the product’s future volatility.

2.2.2 Work Automation. Work process has been identified as another critical success factor in product development efforts (Eisenhardt and Tabrizi 1995). There are various elements of the work process discussed in the literature, such as whether tasks are done sequentially or concurrently, whether standard or flexible processes are used, and whether the process is performed manually or is automated in
whole or in part (Brown and Eisenhardt 1995; Ulrich and Eppinger 2000). Investments in tools and processes may be motivated by a desire to improve labor productivity. Our interest in this study is not to determine whether the use of automation technology improves software development performance, per se, but rather to understand how the use of such technology influences product volatility.4

Computer-aided software engineering (CASE) tools automate software development by generating software code to match design parameters entered into the tools by software developers (Kemerer 1992). CASE tool proponents emphasize the time and effort saved by software developers in generating and changing code. The tools make it possible to minimize effort even while increasing the changes occurring in the code through re-use and adaptation of product designs. CASE tools are not unlike the programmable automation technologies used in manufacturing to facilitate mass customization. Significant efficiency advantages from using these technologies have been found to accumulate from large volume and frequent product changes in manufacturing (Kelley 1994). Similarly, CASE technology could create potential economies of scale and scope in software product customization and adaptation. These economies could enable more responsiveness to changes in information requirements, thereby increasing the likelihood of future software volatility.5 Thus, we hypothesize that:

H4: An increase in the use of work automation technology to develop a software product is associated with an increase in the product’s future volatility.

2.3 Tactical Decisions and Software Product Volatility

Project management or tactical-level decisions include those establishing the relative priority of development objectives, the planned timing and sequence of activities, project milestones and prototypes, and mechanisms for coordinating, monitoring, and controlling the project (Krishnan and Ulrich 2001). In the context of software product development, and in situations where products are adapted and evolved
4 Whether such technology investments actually translate into improved work performance has been the subject of considerable debate. In engineering, the use of computer-aided design (CAD) technology has been linked to improved performance (Robertson and Allen 1993), although some studies have found that CAD technology does not always improve the overall effectiveness of product development (Adler 1990; Salzman 1989).
5 These ideas suggest that use of CASE tools may reduce the cost per change, but products built using CASE tools may still be more volatile, and therefore more costly in total.
over their life cycles, the nature, magnitude and timing of update activities performed are very salient (Kemerer and Slaughter 1997).6

2.3.1 Update Strategy. Because an existing software product portfolio is being adapted and enhanced, and service to current customers should not be unduly disrupted, product managers must carefully consider the timing and magnitude of update activities. In particular, product managers can choose to implement a significant number of new or enhanced components at one time (i.e., a “big bang” approach) or they can scope the project into smaller phases, versions or releases to limit the update intensity, i.e., the quantity of new and changed components delivered in one time period. In the information systems literature, implementing large amounts of new or changed software at once is known as a high-risk strategy because such large changes are more difficult to fully test, and significant problems may go undetected (Whitten, Bentley and Dittman 2001). Alternatively, when feasible, the product manager could choose to implement the changes in smaller versions or releases over multiple time periods to mitigate the scope of potential problems.

An increase in update intensity (adding many new components or updating a large number of components in a software product at the same time) could be expected to increase the product’s volatility in the future, all else being equal. There are several potential reasons for this association. First, as shown in prior research on software reliability, new code can require several time periods to “stabilize”, i.e., for defects to be detected and corrected (Musa, Iannino and Okumoto 1990). In addition, software components that have been modified are more likely to require future corrections and adaptations than those that have not (Banker, Datar, Kemerer and Zweig 2002; Malaiya and Denton 1999; Yuen 1985). This is due, at least in part, to the opportunity to infuse errors into new and substantively modified code during the development process. In addition, when many components are undergoing changes, as we have noted, there may be less opportunity to test the changes thoroughly (Banker, et al. 2002). Adding or updating a large number of components at one time would require not only unit tests to verify the functioning of each component but also more complicated integration or system tests to determine
6 Update activities traditionally include tasks performed to adapt or enhance a software product by adding, changing, correcting or deleting components in the product (Lientz and Swanson 1978).
whether the new or changed code has been implemented properly such that the components interact as predicted when integrated into the product portfolio. Moreover, adding or updating a large number of components in a software product at one time could increase the likelihood of a phenomenon known as “ripple effect” where a change to one component also (and unexpectedly) impacts other components (Fyson and Boldyreff 1998). A high ripple effect signals a degradation of the product’s structure and an increase in its design instability, necessitating future changes (Gibson and Senn 1989; Li, Etzkorn and Davis 2000; Yau and Collofello 1985). Therefore, we expect that:

H5: An increase in the intensity of updates to a software product is associated with an increase in the product’s future volatility.

2.4 Other Factors Influencing Software Product Volatility

We control in this study for a number of factors that are believed to influence software product volatility but that are less amenable to managerial action in the short term: product domain, product complexity, product size, and product age.

The product domain sets the boundary of the product’s environment and includes the problem structure as well as the size and variety of the stakeholders. Each software product supports a particular task domain and fulfills an information gathering and disseminating role. Some task domains are well-structured with consistent and relatively stable information requirements for all stakeholders throughout the software product life cycle, while others are ill-structured with uncertain and widely varying requirements from a variety of stakeholders (Prahalad and Krishnan 1999). In ill-structured domains, every time the task definition and structure change, information requirements change, requiring the software product to adapt. This implies that software product volatility may vary by task domain, and we control for differences in product domain in our analysis.

Volatility should also vary with software product complexity, size and age. Software is among the most complex and abstract artifacts of human creation, and the inherent properties of software products are often reduced to measures of their complexity, size and age (Brooks 1995; Simon 1994). Both increased complexity and increased size have been shown in prior research to be significant in predicting
the future occurrence of software faults and modifications (Banker, et al. 2002; Banker and Slaughter 2000; Kemerer 1995). As the size and complexity of the software grow, the likelihood of making errors increases, requiring future work to repair and adapt the code. Thus, software product volatility may be expected to increase with size and complexity.

Product age is the final control variable. The laws of software evolution as articulated by Belady and Lehman (1985) and Lehman, Ramil, Wernick, Perry, and Turski (1997) describe changes related to software aging. As software products age, there is likely to be an increasing divergence between the software and its technical and organizational environments. Resolution of these discrepancies requires modifications, leading to increased software volatility with age.

Given the above controls for software product domain, complexity, size and age, Figure 1 depicts our research model and hypotheses.

Figure 1: Development Decisions and Software Product Volatility
[Figure 1 is a path diagram: Shared Platform (H1: +), Standard Design (H2: -), Team Instability (H3: +), Work Automation (H4: +), and Update Intensity (H5: +) each point to Product Volatility, with Product Domain, Size, Complexity, and Age as controls.]
3. Method

3.1 Research Setting and Archival Data Coding Process

We evaluate our hypotheses empirically, analyzing longitudinal data on software product change histories that we collected from the research site. The research site is a large, publicly owned conglomerate of department stores. The firm has a software product portfolio with 23 major software
products including 3,757 components. A “product” corresponds to a major business application in the firm such as sales order processing, inventory management, and pricing.7 A “component” corresponds to a software module belonging to a product; a “module” is a self-contained software program that accomplishes a specific task, such as printing a report, or providing a screen for data entry. The firm’s software portfolio accomplishes information processing for applications in merchandising, operations, fiscal and human resources business functions. A centralized information technology (IT) group supports this large software product portfolio.

Throughout the twenty-year history of the software product portfolio, the IT group maintained a comprehensive log of every update made to the software products and components from the 1970s to the 1990s, providing us with detailed data describing 28,415 individual change events. Each change event includes textual data describing the original software component creation date and author, the function of the component, the product to which the component belongs, the developer changing the component, the date of the change and a description of the change.

We developed a coding scheme to categorize each change event that relies upon and extends the standard industry categorization for software update activities (IEEE 1993; Kemerer and Slaughter 1999). To code each event, a content analytic approach was adopted using a combination of latent and manifest coding techniques (Krippendorff 1980). Manifest coding involves looking through the text of the change event for visible occurrences of certain keywords (such as “author”, “change”, and “date”). Latent coding identifies the underlying meaning in the text of the change log when keywords are not sufficient to code events. Three coders coded the change logs. The coders were chosen for their in-depth knowledge of software product development so that they could properly identify terms and acronyms and code events accurately. A coding flowchart was developed to provide a consistent procedure for coding events, and coders were trained in the use of the flowchart. Several trial data coding rounds were performed to assess and help ensure consistency between the coders. In these trials each coder independently coded the same
7 The applications include Advertising, Accounts Payable, Accounts Receivable (3 products), Sales Analysis (2 products), Capital Pricing Management, Fixed Asset Management, Financial Reporting, General Ledger, Shipping, Merchandising (3 products), Order Processing, Pricing (5 products), Payroll, and Inventory Management.
change events that were randomly selected. After the independent coding, Cohen’s kappa was computed to assess the relative pair-wise agreement between the coders (Cohen 1960). After several trials, Cohen’s kappa averaged close to 0.80, indicating substantial agreement between the coders. Subsequently, the change events for the products were divided equally between the coders, and they coded the events independently. Checks for coder drift found no evidence of a decrease in coder reliability over time. A random inspection of the coded change events by the third author of this manuscript did not find degradations in accuracy. After coding the events, the coders entered the data into relational databases for efficient and flexible storage and access. Appendix A provides an example of a coded change event.

The change event data were supplemented with archival data collected from the firm’s software change control system and included source code as well as the technologies used to create the code. The source code was available for all components – whether developed internally or externally sourced from a vendor. The source code for every component was extracted from the software change control system and was analyzed using a commercial software code analysis tool to generate measures of software size and software complexity. Because the firm is publicly held, annual reports to stockholders were available for the time frame of the study, and these reports were used to derive annual information about the firm such as revenues, profits, and the number of employees and to provide background information about the firm’s strategies, markets, and activities. Finally, annual reports were obtained from the IT group’s internal library, and provided information about the firm’s IT strategies, performance and policies.

3.2 Constructs and Measures

To determine the volatility of a software product, we begin by calculating the length of the average time interval between changes to a software product’s components in each month of its life cycle. A product with an increase in volatility will experience changes occurring at shorter, more frequent intervals (i.e., a reduced time between modification), while a product with a decrease in software volatility will experience less frequent, longer intervals between changes (i.e., an increased time between modification). Specifically, the length of the average time interval between changes is calculated by averaging the time
since the previous change (TSC) for each change event e for a component c in software product s during time period (month) t:

AvgTSC_st = (1/E_st) Σ_{e=1}^{E_st} TSC_est

where E_st = the total number of change events for the components in product s during time period t.

To facilitate comparisons of AvgTSC_st across software products of different ages, we normalize AvgTSC_st by a count of the number of months M a product s has been in existence as of the end of time period t. In addition, we reverse-score the normalized measure so that a value close to 1 indicates high volatility, and a value close to 0 indicates low volatility.8 Thus, product volatility for software product s in month t is defined as:

Volatility_st = 1 - AvgTSC_st / M_st

The explanatory variables in our model include the use of a shared product platform, standard versus custom design, team instability, work automation, and update intensity.

Increased use of a shared product platform is assessed in terms of growth in the firm’s sales revenue in each month. Sales revenue was obtained for the time frame of the study from the firm’s annual report to its stockholders.9 Growth in sales revenue is a good measure of the use of the product platform in this context. Sales revenue growth reflects an increase in the size and variety of the customer base using the firm’s software products because the IT group employed a strategic growth policy that focused on leveraging the existing software product platform to support sales in new markets. Whenever the host firm acquired another retailer, the IT group’s policy required the extension and leveraging of the host firm’s existing software products to process the acquired retailer’s sales transactions (rather than supporting a separate software platform for the acquired retailer). By relying on a shared software product platform, IT management sought to increase the efficiency of software product development and support activities as the host firm entered into new markets. The growth in sales revenue for a particular month therefore captures the increase in sales
8 A product with components that do not change in a particular month has its volatility set to “0” for that month. No products were retired during the 20-year time frame of the study.
9 The consumer price index (CPI) established by the Bureau of Labor Statistics of the U.S. government was used to deflate annual sales revenues to a common year.
transaction and related processing that month as the firm expanded the use of the platform into existing and new market segments.10

The choice of a standard or custom design is measured using a binary variable for each software product to indicate whether the product was initially created using a standard (package) design or whether a custom (product-specific) design was developed for the product in the firm. A standard design is one that the firm purchased from an external vendor and that is used in many other products in many other firms. A product-specific design refers to the development of a unique design for the product within the firm that is not used by any other product or by any other firm. It is important to note that developers in the IT group could modify either standard or custom designed products, and external vendors could also modify standard products. Further, the change logs capture all modifications to the standard and custom designed products used in this firm whether the changes are done in-house or externally. Information about whether a standard or product-specific design was used to create each software product is available from the change event logs and product source code. The variable is set to “1” if a standard design was used to create the software product and is “0” otherwise.

The team instability variable is assessed by counting how often updates to the software product are accomplished by a new team member. The variable is constructed based upon detailed information in the change event logs that identify the developers who created and modified the components in the software products. This allows us to identify changes in developer-component assignments for a software product in each month over its life cycle. The number of changes in developer-component assignments is aggregated at the product level each month to reflect the number of team member assignment changes made that month.

The extent of work automation is operationalized as the intensity of CASE tool use based upon information in the IT group’s software change control system about the technology used to create each component in a software product. Every component is flagged in the change control system as having
10 We explored alternative measures of firm growth that could potentially reflect the increased use of the product platform. These included number of employees, number of stores, and square footage of stores. These measures are all highly correlated with sales revenue and perform similarly in the analysis. However, sales revenue is the only variable that is available for the entire time frame of the change history for the software product portfolio, and is the best measure of growth.
been created using a CASE tool or manually. To calculate the measure, for every product and month, the number of components in the product that were created using a CASE tool is divided by the total number of components in the product to create a proportional measure of the use of CASE tools.

Our final explanatory variable is obtained from the change event data for each component and product. The intensity of product updates is measured as the number of components in a software product that were created or changed during the month divided by the total number of components in the software product that month. This provides a measure of the relative extent of updates in a software product for a particular month.

Our control variables include product domain, product size, product complexity, and product age. Product domain is measured using a binary variable to distinguish between the primary business functions in the firm. In this firm, as in many others, there is a basic distinction between fiscal versus non-fiscal (i.e., merchandising) functions; our measure of domain is a binary variable where fiscal = “1” and non-fiscal = “0”.11 Product size is measured by counting the total number of components in the product in a particular month. Product complexity is assessed using information obtained from commercial code complexity analysis tools (which were used to analyze the software code for all of the components in the portfolio). A complexity metric is calculated for each component by counting the total number of data elements referenced in the component. Total data complexity is a key indicator of a component’s inherent design complexity because it predicts the procedural complexity of the coded software (Al-Janabi and Aspinwall 1993); each data element translates into one decision implemented in the code (Warnier 1976). The total number of data elements referenced is summed across the components in each product to obtain a product-level measure of complexity. We then calculate product complexity by determining the difference between the total complexity in the product in one month and the total complexity in a prior month, divided by the total complexity in the prior month. This provides a measure of the relative increase in complexity for the product in a particular month. Finally, product age is determined based
11 A number of formulations of the domain variable are possible. “Fiscal” applications were among the most volatile at the data site; therefore, this domain was chosen as a conservative choice in terms of leaving less variance remaining to be explained by the hypothesized model variables.
upon the average age of the components in the product. The age of each component is obtained from information in the change event logs that record the date on which the component was created. Product age is measured by averaging the age (in months) for all components in the product in a particular month.

Operational measures of the dependent and explanatory variables in our research model are summarized in Table 1. The variables are measured for every month of each product’s life span.

Table 1: Measurement of Variables in the Research Model

Product Volatility: One minus the average time, expressed as a percentage of a month, between changes to a software product’s components (calculated for the current month).
Shared Platform: Growth in the firm’s sales revenue in inflation-adjusted dollars (calculated for the prior month).
Standard Design: A variable that indicates whether the product was created using a standard design (1) or whether it was developed using a custom, product-specific design (0).
Team Instability: A count of the number of times that changes were made in developer-component assignments for a product’s components (calculated for the prior month).
Work Automation: The percentage of a product’s components created using a CASE tool (calculated for the prior month).
Update Intensity: The percentage of a product’s components created or updated (calculated for the prior month).
Product Domain: A binary variable that indicates whether the product supports the fiscal domain (1) or the non-fiscal domain (0).
Product Size: A count of the total number of components in a product (calculated for the prior month).
Product Complexity: The total number of data elements referenced in the product (in terms of increase from the prior month to the current month).
Product Age: The average age (in months) of all of the components in a product (calculated for the prior month).
4. Analysis and Results

4.1 Model Specification

The data for evaluating our model were pooled into an unbalanced, time series, cross-sectional panel of 3,201 observations reflecting monthly life cycle data for each of the firm’s 23 software products. The panel is unbalanced because each software product has a differing number of months in its life cycle. The shortest product life cycle is 62 months, and the longest is 246 months, with an average product life cycle of just under 164 months. All 23 software products were in active use during the data collection period.
The basic framework for this analysis is the generalized regression model y_it = β′x_it + ε_it. In particular, the following equation specifies the volatility of software product s in time period (month) t:

VOLATILITY_st = β0 + β1(PRODUCT DOMAIN_s) + β2(PRODUCT SIZE_s,t-1) + β3(PRODUCT COMPLEXITY_s,t-1) + β4(PRODUCT AGE_s,t-1) + β5(SHARED PLATFORM_t-1) + β6(STANDARD DESIGN_s) + β7(TEAM INSTABILITY_s,t-1) + β8(WORK AUTOMATION_s,t-1) + β9(UPDATE INTENSITY_s,t-1) + ε_st

We lag the time-dependent explanatory variables in this specification because their impacts on software product volatility are not immediate, but rather require one time period to manifest.12 Lagging the independent variables also serves to mitigate potential concerns with endogeneity in the model (Kennedy 1998). Because the variables are measured using very different unit scales, to facilitate interpretation and comparison of the estimated coefficients we standardized each variable to its Z-score before entering it into the regression (Neter, Wasserman and Kutner 1990).

4.2 Specification Testing

We initially conducted a pooled ordinary least squares regression estimation of the equation. The assumption of normality of residuals was not rejected using the Kolmogorov-Smirnov test (Stephens 1986). White’s test (1980) suggested no problems with heteroscedasticity across individual disturbances. We did not find evidence of multi-collinearity using the criteria specified in Belsley, Kuh and Welsch (1980), as the condition index and variance inflation factors are all less than 5.
12 The duration of the lag was determined empirically. In this firm, updates to software products (including adding or changing software components) required approval and entry into the product development plan that was updated at most on a weekly basis. Thus, the effects of explanatory variables required at least one week to manifest themselves in terms of product updates. We conducted a number of robustness analyses, varying the length of the time period, and found that less than one month was too short for an effect to be realized. At lags of two or more months, the strength of the effects for the time-based variables is much less than at one month. Emergency repairs due to product failures were accomplished in the same time period that the failure occurred. However, emergency repairs represent only a small percentage of the change events (less than 10%). Thus, we specified the model using a one-month lag for the time-dependent explanatory variables. A one-month lag is also consistent with much software management practice, where phenomena are often tracked by monthly reports.
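To make this specification and the pooled-OLS diagnostics concrete, the following sketch builds the one-month lags, standardizes the variables to Z-scores, fits the pooled benchmark, and runs residual-normality, heteroscedasticity, and multicollinearity checks. It is a hedged illustration, not the authors’ code: the file name, column names, and panel layout are assumptions, and it shows the pooled benchmark rather than the final FGLS estimator.

```python
# Hedged sketch of the specification and the pooled-OLS diagnostics
# (hypothetical panel layout: one row per product-month).
import pandas as pd
import statsmodels.api as sm
from scipy.stats import kstest
from statsmodels.stats.diagnostic import het_white
from statsmodels.stats.outliers_influence import variance_inflation_factor

panel = pd.read_csv("product_months.csv")  # hypothetical input file
lagged = ["size", "complexity", "age", "platform",
          "instability", "automation", "intensity"]

# One-month lags for the time-varying regressors, within each product.
panel = panel.sort_values(["product", "month"])
panel[lagged] = panel.groupby("product")[lagged].shift(1)
panel = panel.dropna()

# Z-score all variables so coefficients are comparable across unit scales.
cols = ["volatility", "domain", "standard_design"] + lagged
panel[cols] = (panel[cols] - panel[cols].mean()) / panel[cols].std()

X = sm.add_constant(panel[["domain", "standard_design"] + lagged])
fit = sm.OLS(panel["volatility"], X).fit()

print(kstest(fit.resid / fit.resid.std(), "norm"))  # residual normality
print(het_white(fit.resid, X))                      # White's (1980) test
print([round(variance_inflation_factor(X.to_numpy(), i), 2)
       for i in range(1, X.shape[1])])              # VIFs for multicollinearity
```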
Because we are estimating a cross-sectional, time series panel, we tested for cross-sectional heteroscedasticity and cross-sectional correlation as well as serial correlation. The Lagrange multiplier test for cross-sectional correlation (Breusch and Pagan 1980) was not significant, suggesting that the disturbances are not correlated across software products (LM = 6.02, df = 23, p > 0.10). However, the Lagrange multiplier test for cross-sectional heteroscedasticity (Greene 2002) was significant, suggesting that the disturbance variance differs substantially across software products (LM = 282.38, df = 23, p < 0.01). Thus, we corrected for cross-sectional heteroscedasticity in our estimation. In addition, the Breusch-Godfrey test indicated that there was serial correlation: the serial correlation parameter is different for each software product’s time series (Breusch and Godfrey 1981).13 Therefore, we estimated the panel using feasible generalized least squares (FGLS) regression with panel-specific AR(1) corrections as well as a correction for heteroscedastic panels.14

In order to evaluate the significance of the incremental variance explained by the control and explanatory variables, we estimated our model in a hierarchical fashion. First, we estimated a “null” model (Model 1) with only an intercept term; then we estimated a model with only the control variables added (Model 2); and finally, we estimated our full model with the explanatory variables in addition to the control variables (Model 3). Likelihood ratio tests allow us to evaluate the significance of the incremental variance explained by adding each set of variables to the nested models (Greene 2002).

We also evaluated whether there were outliers in the FGLS estimation by identifying residuals more than three standard deviations from the mean (Neter, et al. 1990). Based upon this analysis, we identified five outliers in our full model (Model 3) and re-estimated it without the outliers. As there was no appreciable change in the coefficient estimates without the outliers, we report our results with all observations included.
13 This was confirmed by comparison of three estimates: 1) FGLS with no correction for serial correlation, 2) FGLS using the same AR(1) correction for each software product, and 3) FGLS using a different AR(1) correction (i.e., a panel-specific AR(1) correction) for each software product. Only the third estimate addressed the serial correlation issue in the dataset.
14 Our FGLS estimation assumes that the parameter vector is constant across software products. We attempted to estimate a model with fixed effects for each software product. The fixed effects are highly collinear with each other and with one of the main variables: three fixed effect variables as well as the variable for product domain drop out of the regression due to high collinearity. Further, adding fixed effects does not appreciably change the value, sign, and significance of the coefficient estimates for the other variables. Thus, the results are reported from our original FGLS estimation without the fixed effects.
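Some statistical packages implement this estimator directly (e.g., Stata’s xtgls with heteroscedastic panels and panel-specific AR(1)). As a rough illustration of the logic, the sketch below performs a simplified two-step FGLS: it estimates a panel-specific AR(1) parameter and disturbance scale from first-stage OLS residuals, quasi-differences within each panel (Cochrane-Orcutt-style, dropping each panel’s first observation rather than applying a full Prais-Winsten transform), and reweights by the panel scales. The function name, column names, and simplifications are illustrative assumptions, not the authors’ procedure.

```python
# Simplified two-step FGLS sketch with panel-specific AR(1) and
# heteroscedastic panels (illustrative; not the authors' estimator).
# Assumes df is sorted by time within each panel.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fgls_psar1(df, y, xs, group):
    """Estimate rho_i and sigma_i per panel from first-stage OLS residuals,
    quasi-difference within panels, reweight, and re-fit by OLS."""
    resid = sm.OLS(df[y], sm.add_constant(df[xs])).fit().resid
    pieces = []
    for _, d in df.groupby(group, sort=False):
        e = resid.loc[d.index].to_numpy()
        rho = np.corrcoef(e[:-1], e[1:])[0, 1]   # panel-specific AR(1)
        sigma = e.std(ddof=1)                    # panel-specific scale
        v = d[[y] + xs].to_numpy()
        qd = pd.DataFrame(v[1:] - rho * v[:-1], columns=[y] + xs)
        qd["const"] = 1.0 - rho                  # transformed intercept
        pieces.append(qd / sigma)                # heteroscedasticity weight
    stacked = pd.concat(pieces, ignore_index=True)
    return sm.OLS(stacked[y], stacked[["const"] + xs]).fit()

# e.g. (hypothetical columns):
# fit = fgls_psar1(panel, "volatility",
#                  ["domain", "standard_design", "size", "complexity", "age",
#                   "platform", "instability", "automation", "intensity"],
#                  group="product")
```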
4.3 Results

The descriptive statistics and pair-wise correlations for the variables in our data set are displayed in Table 2. As can be seen in Table 2, the inter-correlations are relatively modest (most are less than 0.50).

Table 2: Descriptive Statistics and Correlations

Variable                 Mean (s.d.)    (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)
1. Product Volatility    0.52 (0.46)    1.00
2. Shared Platform (a)   9.51 (3.2)     0.46***  1.00
3. Standard Design (b)   0.14 (0.35)    -.25***  0.08**   1.00
4. Team Instability      3.33 (6.29)    0.43***  0.24***  -.30***  1.00
5. Work Automation       0.16 (0.26)    0.46***  0.34***  -.27***  0.33***  1.00
6. Update Intensity      0.08 (0.16)    0.24***  0.04*    -.29***  0.48***  0.24***  1.00
7. Product Domain (c)    0.58 (0.49)    -.17**   -.17**   0.35***  -.28***  -.52***  -.26***  1.00
8. Product Size          73.79 (77.70)  0.52***  0.57***  0.01     0.45***  0.41***  -.01     -.18**   1.00
9. Product Complexity    0.07 (0.11)    0.38***  0.18**   -.14**   0.45***  0.16**   0.34***  -.11**   0.36***  1.00
10. Product Age          82.44 (58.57)  0.33***  0.46***  -.08**   0.22***  -.03     -.10*    0.18**   0.56***  0.17**  1.00

Notes: n = 3,155; † p < 0.10; * p < 0.05; ** p < 0.01; *** p < 0.001. Pearson product moment correlations are reported for pairs of continuous variables, Spearman rank correlations are reported for pairs of continuous and dichotomous variables, and Phi correlations are reported for pairs of dichotomous variables. (a) In millions of dollars. (b) Coding: 1 = standard design, 0 = product-specific design. (c) Coding: 1 = fiscal domain, 0 = non-fiscal domain.
The FGLS estimates for our models after correcting for cross-sectional heteroscedasticity and panelspecific AR(1) are shown in Table 3. As can be seen in Table 3, the variance explained in product volatility by adding the control variables (product size, complexity, age, and domain) in Model 2 to the “null” Model 1 is significant (χ2 = 96.34, df = 4, p < 0.001). The incremental variance explained in product volatility by then adding the five explanatory variables (shared platform, standard design, team instability, work automation, and update intensity) in Model 3 is significant as well (χ2 = 143.43, df = 5, p < 0.001). In the following paragraphs, we first report our results for the control variables in the equation. We then report our results for each of the hypotheses.
Table 3: FGLS Estimates

Variable                      Model 1              Model 2              Model 3
                              Coefficient (s.e.)   Coefficient (s.e.)   Coefficient (s.e.)
Intercept                     0.0487 (0.0392)      0.0076 (0.0188)      0.0039 (0.0148)
Control Variables
  Product Domain (β1)                              0.0597** (0.0240)    0.0546** (0.0213)
  Product Size (β2)                                0.2946*** (0.0274)   0.1907*** (0.0240)
  Product Complexity (β3)                          0.0292** (0.0093)    0.0397*** (0.0107)
  Product Age (β4)                                 0.2805*** (0.0265)   0.0469* (0.0218)
Explanatory Variables
  Shared Platform (β5)                                                  0.2491*** (0.0188)
  Standard Design (β6)                                                  -0.2747*** (0.0142)
  Team Instability (β7)                                                 0.0299** (0.0117)
  Work Automation (β8)                                                  0.2136*** (0.0183)
  Update Intensity (β9)                                                 0.0699*** (0.0120)
Log Likelihood                -3054.585            -3006.413            -2934.696
Likelihood Ratio Test (χ2)    --                   96.344***            143.434***
% Variance Explained:
  Incremental                 --                   6.4%                 13.7%
  Total                       30.1%                36.5%                50.2%

Notes: n = 3,155; † p < 0.10; * p < 0.05; ** p < 0.01; *** p < 0.001. Betas are standardized.
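As a quick arithmetic check, the likelihood ratio statistics in Table 3 follow directly from the reported log likelihoods; a brief sketch (using SciPy only for the chi-squared tail probabilities):

```python
# Verifying Table 3's likelihood ratio tests from the reported log
# likelihoods: LR = 2 * (llf_full - llf_restricted), df = regressors added.
from scipy.stats import chi2

ll_null, ll_controls, ll_full = -3054.585, -3006.413, -2934.696
for label, ll_r, ll_f, df in [("controls vs. null", ll_null, ll_controls, 4),
                              ("full vs. controls", ll_controls, ll_full, 5)]:
    lr = 2 * (ll_f - ll_r)
    print(f"{label}: LR = {lr:.3f}, df = {df}, p = {chi2.sf(lr, df):.2g}")
# Reproduces LR = 96.344 (df = 4) and LR = 143.434 (df = 5), both p < 0.001.
```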
As shown in Table 3, the results for our control variables are as expected. The coefficient on the binary variable for product domain is positive and significant, signifying that the products supporting the fiscal business domain are more volatile than those products supporting the non-fiscal business domain. The coefficients on software product size, product complexity and product age suggest that increases in size, complexity, and age in a prior month are significantly associated with increases in product volatility for the current month. Controlling for product domain, size, complexity, and age, we find strong empirical support for our hypotheses in Model 3. Our first hypothesis predicted that an increase in the use of a shared product platform (as measured by the growth in sales revenue) is associated with increased product volatility. This hypothesis is supported: an increase in sales revenue in a prior month is significantly associated with an increase in product volatility for the current month (β5 = 0.25, z = 13.23, p < 0.001). Our second
hypothesis predicted that products using a standard design are less volatile than products using a product-specific (custom) design, and it is also supported. That is, the coefficient on the variable distinguishing standard from custom design is significant and reflects a lower level of life cycle volatility for products using a standard design (β6 = -0.27, z = -19.38, p < 0.001). The results support our third hypothesis about the effects of team instability, indicating that an increase in the number of new developer-component assignments in a prior month leads to an increase in product volatility for the current month (β7 = 0.03, z = 2.56, p < 0.01). Our fourth hypothesis is also supported. That is, increased work automation in terms of CASE tool usage in a prior month is significantly associated with an increase in volatility for the current month (β8 = 0.21, z = 11.68, p < 0.001). Finally, the results provide support for our fifth hypothesis about the intensity of product updates and product volatility, indicating that an increase in the proportion of components added or modified in a prior month is significantly associated with increased volatility for the current month (β9 = 0.07, z = 5.81, p < 0.001).

5. Discussion of Results and Extensions

In this section, we begin by evaluating and interpreting the relative linear impact of each kind of development decision on software product volatility. We then investigate whether the rate of change in software product volatility differs for each decision variable by analyzing curvilinear effects. We also recognize that product managers can choose to implement development decisions jointly rather than individually, and it is possible that one decision could attenuate or accentuate the effect of the other on product volatility. Hence, we explore whether the different decision variables have interactive effects on software product volatility.

5.1 Linear Effects of Development Decisions on Product Volatility

The empirical results support our hypotheses and indicate that product development decisions implemented in one time period do have a significant impact on software product volatility in a future time period. Further, differences in the estimated coefficient values and their significance suggest that each kind of development decision differs in the nature and strength of its impact on software product volatility. Figure 2 graphically illustrates the relative linear impacts of the different kinds of design
decisions on software product volatility. Note that because the variables are standardized, the figure shows the effects on product volatility of an increase (decrease) in a variable in terms of standard deviations above (below) a mean of zero.

Figure 2: Linear Impact of Development Decisions on Software Product Volatility
[Line plot: product volatility (y-axis) against standard deviations from the mean (x-axis, -3 to +3), with one line for each decision variable: product platform, standard design, team instability, CASE tool, and update intensity.]
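To make the specification behind these results concrete, the following is a minimal sketch of this kind of lagged panel regression: product volatility in month t regressed on controls plus one-month-lagged, standardized development-decision variables. The file name, column names, and the use of a random-effects estimator from the linearmodels package are illustrative assumptions, not a restatement of the exact estimation procedure.

```python
# Sketch of a lagged panel specification for product volatility.
# File and column names are hypothetical.
import pandas as pd
from linearmodels.panel import RandomEffects

df = pd.read_csv("change_history_panel.csv")            # hypothetical data
df = df.set_index(["product_id", "month"]).sort_index()

decisions = ["platform_use", "standard_design", "team_instability",
             "case_tool_use", "update_intensity"]       # hypothetical names
controls = ["domain", "size", "complexity", "age"]

# Decision variables enter with a one-month lag within each product
for c in decisions:
    df[c + "_lag"] = df.groupby(level="product_id")[c].shift(1)

# Standardize the continuous regressors so coefficients are comparable
lagged = [c + "_lag" for c in decisions]
for c in lagged + ["size", "complexity", "age"]:
    df[c] = (df[c] - df[c].mean()) / df[c].std()

exog = df[controls + lagged].assign(const=1.0).dropna()
res = RandomEffects(df.loc[exog.index, "volatility"], exog).fit()
print(res)
```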
As shown in Figure 2, increases in the use of a shared product platform (Hypothesis 1) have a large and significant impact on software product volatility. For example, for a five-year-old product, an increase in the use of a shared product platform of one standard deviation above the mean (i.e., a $3.2 million increase in sales) is associated with a decrease of 14 months and 28 days in the average time interval between changes to the product’s components. This relatively large impact could reflect the frequency of changes required when extending the platform into different markets. In this firm, the product strategy involved using a common platform to handle information processing needs as the firm acquired other retailers. Because the firm extends its software product platform into other retail markets, whose information needs can differ substantially from those already supported by the platform, more frequent changes to the products could be required.

The use of a standard design (Hypothesis 2) significantly reduces product volatility. This can be seen in the downward slope of the line in Figure 2 that depicts the effect of standard design on product volatility. For example, for a five-year-old product, an increase in the use of a standard design in the
product portfolio of one standard deviation above the mean (i.e., an increase of 35%) lengthens the average time interval between changes to the product’s components by 16 months and 15 days. As we reasoned in Hypothesis 2, a firm would use a custom or product-specific design for a core application and would therefore be more willing to invest in developing the initial design and in changing it. In contrast, products that use a standard design would be kept more stable to facilitate economies of scale in updating, and our results suggest that the use of a standard design substantially reduces the frequency of changes to these products.

Team instability, as measured by the change in developer-component assignments (Hypothesis 3), increases software product volatility, although the effect is not as large as for the other decision variables (this is reflected in the relatively flatter slope of the line for team instability in Figure 2). For example, for a five-year-old product, an increase in the number of changes in developer-component assignments of one standard deviation above the mean (i.e., 6 new developer-component assignments) shortens the average time interval between changes to the product’s components by 1 month and 24 days. As we posited in Hypothesis 3, a developer who is unfamiliar with a software component may make more errors or be more inefficient in making updates, requiring future changes to the component. In addition, other developers on the team do not have prior experience working with the new developer on the particular component. This lack of familiarity may also contribute to errors and inefficiencies requiring future changes to the software product.15

Work automation (Hypothesis 4) has a large and significant impact on software product volatility, as can be seen in the steep slope of the line for CASE tools in Figure 2. For example, for a five-year-old product, an increase in the use of CASE tools of one standard deviation above the mean (i.e., an increase of 26%) shortens the average time interval between changes to the product’s components by 12 months and 24 days. As we asserted in Hypothesis 4, CASE tools automate the software production process
15 Speculating somewhat, it might well be argued that these results for changes in team membership are conservative estimates for the industry as a whole, because the data are drawn from an organization with very strong discipline and procedures for controlling and documenting changes. Those practices would tend to mitigate the knowledge lost when team members change, so the impact of team membership change estimated here, as significant as it is, might still be smaller than would be typical in other, less well-disciplined organizations.
and make it easier and faster for developers to create and adapt software products. Our results suggest that use of these tools facilitates substantial increases in software volatility.16

Finally, as shown in Figure 2, an increase in the intensity of product updates in a prior month (Hypothesis 5) moderately increases the volatility of the software product in the current month. For example, for a five-year-old product, an increase in the proportion of components added or changed of one standard deviation above the mean (i.e., an increase of 16%) shortens the average time interval between changes to the product’s components by 4 months and 6 days. This result is consistent with our hypothesis that software products that are updated extensively are likely to require future corrections and adaptations.

To summarize, the development decisions that we have examined have significant linear effects on product volatility. Increases in the use of a shared product platform and in CASE tools, as well as increases in team instability and update intensity, have all been shown to increase future product volatility, while use of a standard rather than custom design significantly decreases product volatility. These results suggest that product managers can predictably influence, through their design and development decisions, the frequency of software change processes. As such, managers could more confidently plan ahead for the allocation of scarce resources to development projects over the life cycle of software products. For example, managers could allocate fewer resources to support products developed using a standard design, or they could reduce the future volatility of a software product (and its future resource requirements) by keeping the product’s development team as stable as possible, or by phasing in updates in small releases rather than all at once. Alternatively, product managers could allocate more resources to support products developed using a custom design or CASE tools, or plan for increased resource requirements in time periods when the firm is experiencing high growth.
16 It can also be noted that managers may be choosing to apply CASE tools to systems that they expect to have significant amounts of change. However, we believe that the major determinants that might lead to such a managerial decision are already captured in the control variables of domain, size, complexity and age.
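To illustrate how the per-standard-deviation effects quoted above might feed into resource planning, the sketch below encodes the point estimates reported for a five-year-old product. The 30-days-per-month conversion and the linear extrapolation across fractional standard deviations are our simplifying assumptions, not part of the estimated model.

```python
# Back-of-envelope planning aid using the per-SD effects reported above
# for a five-year-old product (negative = interval shortens = more volatile).
# Figures come from the text; the 30-day month is an assumption.
EFFECT_DAYS_PER_SD = {
    "product_platform": -(14 * 30 + 28),   # -14 months, 28 days
    "standard_design":  +(16 * 30 + 15),   # +16 months, 15 days
    "team_instability": -(1 * 30 + 24),    # -1 month, 24 days
    "case_tools":       -(12 * 30 + 24),   # -12 months, 24 days
    "update_intensity": -(4 * 30 + 6),     # -4 months, 6 days
}

def interval_shift_days(deltas_in_sd):
    """Approximate total shift (in days) in the mean time between changes,
    given changes in decision variables expressed in standard deviations."""
    return sum(EFFECT_DAYS_PER_SD[k] * v for k, v in deltas_in_sd.items())

# Example: +1 SD CASE tool use, partially offset by adopting a standard design
print(interval_shift_days({"case_tools": 1.0, "standard_design": 1.0}))  # ~ +111 days
```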
5.2 Curvilinear and Interactive Effects of Development Decisions on Product Volatility

There are a number of extensions to our primary model that have the potential to provide additional insights. For example, product managers may be interested in measuring more precisely the sensitivity of product volatility to changes in the development decision variables (i.e., understanding the rate of change in volatility with changes in a decision variable). In addition, while our analysis evaluates the individual effect of each decision variable, it is possible or perhaps even likely that development practices are implemented jointly and thus could have interactive effects on product volatility. For example, the joint use of some practices may amplify product volatility (i.e., increase volatility beyond the individual effects of each decision variable).

To explore whether the development decision variables in our model have curvilinear and interactive effects on product volatility, we estimated two additional models. To our original model (Model 3) we added a squared term for each development decision variable17 in Model 4, and, in Model 5, we added ten interaction variables to represent all two-way interactions between the five development decision variables. The results from the estimation of these models are presented in Table 4.

As shown in Table 4, the quadratic effects and the interaction effects explain significant incremental variation in software product volatility (χ² = 102.19, df = 4, p < 0.001 and χ² = 83.92, df = 10, p < 0.001, respectively). All of the quadratic effects are negative and significant, suggesting that the increases in software product volatility associated with increases in the use of a shared platform, team instability, work automation, and update intensity all occur at a decreasing rate. Further, a plot illustrating the quadratic effects (Figure 3) shows that product volatility is most sensitive to changes in the use of a product platform and in work automation (as indicated by the steep curves of the quadratic effects for these decision variables). Product volatility is less sensitive to changes in team instability and update intensity (as indicated by the relatively flat curves of the quadratic effects for these decision variables).
17 The standard design variable is binary and thus cannot have a squared term in the model.
Table 4: Estimates of Quadratic and Interactive Effects
(estimated coefficients, standard errors in parentheses)

Variable                          Model 3               Model 4               Model 5
Intercept                         0.0039 (0.0148)       0.2983*** (0.0250)    0.8371*** (0.1071)
Control Variables
  Product Domain (β1)             0.0546** (0.0213)     0.0620*** (0.0191)    0.0568** (0.0189)
  Product Size (β2)               0.1907*** (0.0240)    0.2446*** (0.0241)    0.2029*** (0.0230)
  Product Complexity (β3)         0.0397*** (0.0107)    0.0246*** (0.0106)    0.0240* (0.0106)
  Product Age (β4)                0.0469* (0.0218)      0.0627** (0.0212)     0.0629** (0.0204)
Explanatory Variables
  Product Platform (β5)           0.2491*** (0.0188)    0.1590*** (0.0197)    0.1860*** (0.0200)
  Standard Design (β6)            -0.2747*** (0.0142)   -0.2398*** (0.0149)   -0.1831*** (0.0256)
  Team Instability (β7)           0.0299** (0.0117)     0.0555* (0.0222)      0.0609† (0.0226)
  Work Automation (β8)            0.2136*** (0.0183)    0.3960*** (0.0339)    0.3103*** (0.0178)
  Update Intensity (β9)           0.0699*** (0.0120)    0.1818*** (0.0232)    0.1848*** (0.0256)
Quadratic Effects
  Product Platform² (β10)         --                    -0.1401*** (0.0158)   -0.0741*** (0.0175)
  Team Instability² (β11)         --                    -0.0112*** (0.0034)   -0.0052† (0.0033)
  Work Automation² (β12)          --                    -0.1186*** (0.0148)   -0.0724*** (0.0149)
  Update Intensity² (β13)         --                    -0.0166*** (0.0031)   -0.0125*** (0.0034)
Interaction Effects
  Platform x Instability (β14)    --                    --                    -0.0947*** (0.0171)
  Platform x Design (β15)         --                    --                    -0.0960*** (0.0127)
  Platform x Automate (β16)       --                    --                    -0.1406*** (0.0160)
  Platform x Intensity (β17)      --                    --                    0.0692*** (0.0191)
  Automate x Instability (β18)    --                    --                    -0.0381*** (0.0098)
  Automate x Design (β19)         --                    --                    2.4574*** (0.4400)
  Automate x Intensity (β20)      --                    --                    -0.0075 (0.0095)
  Design x Instability (β21)      --                    --                    -0.1242† (0.0866)
  Intensity x Instability (β22)   --                    --                    -0.0269*** (0.0078)
  Intensity x Design (β23)        --                    --                    0.0481* (0.0243)
Log Likelihood                    -2934.696             -2883.602             -2841.640
Likelihood Ratio Test (χ²)        143.434***            102.188***            83.924***
% Variance Explained:
  Incremental                     13.7%                 2.9%                  1.7%
  Total                           50.2%                 53.1%                 54.8%
Notes: n = 3,155; † p < 0.10; * p < 0.05; ** p < 0.01; *** p < 0.001. Betas are standardized.
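As a consistency check, the likelihood ratio statistics in Table 4 follow directly from the reported log likelihoods; a short sketch reproduces them.

```python
# LR = 2 * (llf_full - llf_restricted), referred to a chi-squared
# distribution with df equal to the number of added terms.
from scipy.stats import chi2

ll_m3, ll_m4, ll_m5 = -2934.696, -2883.602, -2841.640

lr_quad = 2 * (ll_m4 - ll_m3)    # quadratic terms: 102.188, df = 4
lr_inter = 2 * (ll_m5 - ll_m4)   # interaction terms: 83.924, df = 10

print(round(lr_quad, 3), chi2.sf(lr_quad, df=4))     # p < 0.001
print(round(lr_inter, 3), chi2.sf(lr_inter, df=10))  # p < 0.001
```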
Figure 3: Quadratic Effects on Software Product Volatility
[Line plot: change in product volatility (y-axis) against change in the decision variables in standard deviations from the mean (x-axis, -3 to +3); curves for platform, CASE tool, team instability, and update intensity.]
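The sensitivity comparison in Figure 3 can be made concrete with the Model 4 coefficients from Table 4: for a standardized decision variable x with linear term b1 and quadratic term b2, the implied contribution to volatility is b1·x + b2·x², with local slope b1 + 2·b2·x. The sketch below simply evaluates these expressions; it is an illustration of the functional form, not a re-estimation.

```python
# Model 4 coefficients from Table 4: (linear b1, quadratic b2) per variable.
MODEL4 = {
    "platform":         (0.1590, -0.1401),
    "team_instability": (0.0555, -0.0112),
    "case_tools":       (0.3960, -0.1186),
    "update_intensity": (0.1818, -0.0166),
}

def contribution(var, x):
    """Model-implied contribution of a variable at x SDs from its mean."""
    b1, b2 = MODEL4[var]
    return b1 * x + b2 * x ** 2

def marginal_effect(var, x):
    """Slope of the contribution at x: how locally sensitive volatility is."""
    b1, b2 = MODEL4[var]
    return b1 + 2 * b2 * x

# At the mean, volatility is roughly seven times as sensitive to CASE tool
# use (0.396) as to team instability (0.0555):
print(marginal_effect("case_tools", 0), marginal_effect("team_instability", 0))
```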
The interaction effects also suggest several interesting patterns. Of the ten interactions, six are negative and significant, three are positive and significant, and only one interaction (between work automation and update intensity) is not significant. The negative interactions occur for six pairs of decision variables: platform × team instability; platform × design; platform × work automation; work automation × team instability; design × team instability; and team instability × update intensity. Positive interactions occur for three pairs of decision variables: work automation × design, design × update intensity, and platform × update intensity.

To help convey the intuition behind these results, two representative interactions are depicted graphically, one a significant negative interaction and one a significant positive interaction.18 Figure 4a depicts the negative interaction between team instability and update intensity. As can be seen in Figure 4a, a negative interaction indicates a moderated effect on product volatility; that is, if team instability is high, product volatility does not increase with increases in update intensity, but if team instability is low, then product volatility increases with an increase in update intensity. The other negative interactions have a similar pattern and can be interpreted similarly. In contrast is the positive interaction between platform use and update intensity, which is graphed in Figure 4b. As can be seen in Figure 4b, a positive interaction indicates an amplified effect on product volatility; that is, increased use of a shared platform further increases product volatility as update intensity increases. The other positive interactions have a similar pattern and can be interpreted similarly.

18 For the interaction that is not significant, the graph shows two upward-sloping parallel lines.

Figure 4a: Moderated Effect (Interaction between Team Instability & Update Intensity)
[Line plot: change in volatility (y-axis) against update intensity in standard deviations from the mean (x-axis, -3 to +3); separate lines for instability LO and instability HI.]

Figure 4b: Amplified Effect (Interaction between Use of Platform & Update Intensity)
[Line plot: change in volatility (y-axis) against update intensity in standard deviations from the mean (x-axis, -3 to +3); separate lines for platform use LO and platform use HI.]
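Using the Model 5 coefficients from Table 4, the moderated and amplified patterns in Figures 4a and 4b can be traced numerically. The sketch below holds all other standardized variables at their mean of zero (so their terms drop out), and the ±1 SD "LO/HI" levels are our illustrative choice rather than necessarily the levels plotted in the figures.

```python
# Predicted-volatility patterns behind Figures 4a and 4b, from Model 5.
B_INTENSITY, B_INSTABILITY, B_INT_X_INST = 0.1848, 0.0609, -0.0269
B_PLATFORM, B_PLAT_X_INT = 0.1860, 0.0692

def volatility_4a(intensity, instability):
    """Moderated effect: the negative interaction flattens the intensity slope."""
    return (B_INTENSITY * intensity + B_INSTABILITY * instability
            + B_INT_X_INST * intensity * instability)

def volatility_4b(intensity, platform):
    """Amplified effect: the positive interaction steepens the intensity slope."""
    return (B_INTENSITY * intensity + B_PLATFORM * platform
            + B_PLAT_X_INT * intensity * platform)

for x in (-3, 0, 3):  # update intensity, in SDs from the mean
    print(f"intensity={x:+d}  4a LO/HI: "
          f"{volatility_4a(x, -1):+.3f}/{volatility_4a(x, +1):+.3f}  "
          f"4b LO/HI: {volatility_4b(x, -1):+.3f}/{volatility_4b(x, +1):+.3f}")
```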
Our analysis suggests that there are indeed significant curvilinear and interactive effects of development decisions on product volatility. The curvilinear effects indicate that changes in product volatility are more sensitive to changes in some of the development decision variables and less sensitive to changes in others. This could provide useful information to product managers because they could anticipate that changes in some decision variables will require a greater response than will changes in other variables. For example, product managers could observe in one time period a change in the use of CASE tools for product A and a change in team instability for product B and could predict that product A will be impacted more than product B in the next time period, because changes in product volatility are more sensitive to changes in the use of CASE tools than to changes in team instability.

The interaction effects also provide insights of potential value to product managers. Negative interactions suggest that the impact on product volatility is moderated with joint use of some decision variables. For example, concurrent increases in the use of a product platform and in team instability do not increase volatility more than their individual effects. This may be reassuring to product managers as it implies limits to increases in product volatility. On the other hand, product managers may need to pay
more attention to decision variables that have positive interactions, as the impact on product volatility is amplified with their joint use. For example, concurrent increases in the use of a product platform and in update intensity will increase volatility more than might be expected from their individual effects.

6. Conclusions

The goal of this study is to show how development decisions influence product volatility. To that end, we developed and empirically evaluated a conceptual framework that assesses the effect of strategic, organizational, and tactical product development decisions on the volatility of software products over their life cycles. In the context of a software development effort occurring over a twenty-year period in a major merchandising firm, we find that, controlling for the domain, complexity, size and age of the products, development decisions do have a significant effect on product life cycle volatility. Increases in the use of shared product platforms and work automation, in changes to the product development team, and in the intensity of product updates lead to increases in future product volatility. On the other hand, the use of a standard design substantially reduces volatility over the product’s life cycle.

Our findings make a new contribution to the literature on product development by revealing how the dynamic behavior observed during product life cycles relates to development decisions. Although much of the literature has focused on the pre-launch phases of product development, product managers who manage long-lived products like software need to understand the impacts of their development decisions. In particular, we contend that product volatility is an important dimension that needs to be managed for products with long life cycles. Our results imply that product managers can predict which products are likely to be more or less volatile over their life cycles, depending on the particular development choices made (such as the use of platforms, tools, and standard or product-specific design; changes in team assignments; and update intensity) as well as aspects of the products less amenable to managerial control in the short term (such as product domain, size, age, and complexity).

Our study has focused on examining the impact of product development decisions over a long period of time in a single firm. This research design has several strengths. Evaluation of this firm’s unusual and extensive longitudinal data on software change histories has afforded a rare opportunity to investigate the
dynamic behavior of software products over their life cycles. The focus on one firm and development effort has the advantage of naturally controlling for contextual factors that could differ across firms. We also added variables to our analysis to control for other factors (such as product domain, size, complexity and age) that could contribute to product volatility in the firm over the time period of the study. These controls in our research design and model strengthen the internal validity of our results.

Although our focused research design enhances the internal validity of our findings, the external validity could be limited to organizational and product development contexts similar to the firm in our study. As no single study is definitive, the external validity of our results can be strengthened by replication. Thus, it is important for future research to examine the life cycle volatility impacts of product development decisions in other domains and development settings. In addition, while our intent has been to understand the antecedents of product volatility, it is important for future work to examine how volatility relates to other product attributes such as quality or costs. Future research to examine the life cycle effects of product development decisions on other outcomes is also warranted. This is particularly salient for products like software that are developed, enhanced and maintained over long life cycles.

In this firm, product development costs are quite sensitive to product volatility. A post hoc analysis of the firm’s product development costs over the most recent three-year time period shows that differences in average development costs are significantly related to differences in average volatility for the software products.19 Specifically, we find that changes in volatility have a nearly twofold impact on costs: a 1% increase in average product volatility is associated with a 1.9% increase in average product development costs. Differences in product volatility could therefore have substantive consequences for development costs, which highlights the potential value of managing and predicting change in software product development environments.
19 Development costs were available for each software product for the most recent three-year period. We averaged the development costs and the volatility for each product over the three years. The natural logarithm of average development costs was regressed on the natural logarithm of average volatility for the software products. The regression model is significant (F = 14.43, p < .01), and average volatility explains over 40% of the variance in average development costs (β = 1.90, t = 3.80, p < .01).
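The footnote's log-log regression is straightforward to replicate on comparable data. A minimal sketch follows; the file and column names are hypothetical, and the coefficient of interest is the elasticity the paper reports as 1.90.

```python
# Log-log elasticity regression: ln(avg development cost) on ln(avg volatility),
# one row per product. File and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("product_costs.csv")             # hypothetical data
X = sm.add_constant(np.log(df["avg_volatility"]))
y = np.log(df["avg_dev_cost"])
res = sm.OLS(y, X).fit()

# The slope is the elasticity: a 1% rise in volatility is associated with
# roughly a beta% rise in development cost.
print(res.params, res.tvalues, res.rsquared)
```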
The ability of managers to respond rapidly to changes in their firm’s competitive environment is increasingly essential to firm survival. If software product managers could forecast changes to the long-lived products they manage, they could more effectively plan, allocate workforce and manage change processes. However, signals in the form of past patterns of product volatility are not normally made available to managers, due to the effort required to accumulate, report, and analyze detailed change request data. In addition, the ability to forecast changes to software products depends upon whether there are predictable patterns in life cycle activities. Our study demonstrates that managers can, through their design and development decisions, predictably influence the nature and timing of software change processes. Using a decision model such as the one developed in our study, product managers can anticipate the volatility consequences of their development decisions, improve their ability to forecast change, and make more informed decisions about investments in designing flexible and adaptable products while retaining a life cycle perspective for product support.

7. References

Adler, P. 1990. The skills requirements of CAD/CAM: An exploratory study. International Journal of Technology Management, 5(2) 201-217.
Ancona, D., P. Goodman, B. Lawrence, M. Tushman. 2001. Time: A new research lens. The Academy of Management Review, 26(4) 645-663.
Ang, S., D. Straub. 1998. Production and transaction economies and IS outsourcing: A study of the U.S. banking industry. MIS Quarterly, 22(4) 535-552.
Ang, S., C. Beath. 1993. Hierarchical elements in software contracts. Journal of Organizational Computing, 3(3) 329-361.
Al-Janabi, A., E. Aspinwall. 1993. An evaluation of software design using the Demeter tool. Software Engineering Journal, 8(6) 319-324.
Banker, R., S. Datar, C. Kemerer, D. Zweig. 2002. Software errors and software maintenance management. Information Technology and Management, 3(1) 25-41.
Banker, R., S. Datar, C. Kemerer, D. Zweig. 1993. Software complexity and software maintenance costs. Communications of the ACM, 36(11) 81-94.
Banker, R., G. Davis, S. Slaughter. 1998. Software development practices, software complexity, and software maintenance performance: A field study. Management Science, 44(4) 433-450.
Banker, R., G. Davis, S. Slaughter. 1995. Application portfolio diversity and software maintenance productivity: An empirical analysis. Proceedings of the 1st Americas Conference on Information Systems, Pittsburgh, PA, 178-180.
Banker, R., S. Slaughter. 2000. The moderating effects of software structure on volatility and complexity in software enhancement. Information Systems Research, 11(3) 219-240.
Banker, R., S. Slaughter. 1997. A field study of scale economies in software maintenance. Management Science, 43(12) 1709-1725.
Barry, E. 2001. Software evolution, volatility, and life cycle maintenance patterns. PhD Thesis, Graduate School of Industrial Administration, Carnegie Mellon University, Pittsburgh, PA.
Belady, L., M. Lehman. 1985. Program evolution: Processes of software change. London: Academic Press.
Belsley, D., E. Kuh, R. Welsch. 1980. Regression diagnostics: Identifying influential data and sources of collinearity. New York: John Wiley and Sons.
Bennett, K. 1996. Software evolution: past, present and future. Information and Software Technology, 39(11) 673-680.
Breusch, T., L. Godfrey. 1981. A review of recent work on testing for autocorrelation in dynamic simultaneous models. In D. Currie, R. Nobay and D. Peel, eds., Macroeconomic Analysis: Essays in Macroeconomics and Econometrics. London: Croom Helm, 63-105.
Breusch, T., A. Pagan. 1980. The LM test and its applications to model specification in econometrics. Review of Economic Studies, 47, 239-254.
Brooks, F. 1995. The mythical man-month: Essays on software engineering, Anniversary Edition. Reading, MA: Addison-Wesley.
Brown, S., K. Eisenhardt. 1995. Product development: Past research, present findings, and future directions. Academy of Management Review, 20(2) 343-378.
Clark, K., S. Wheelwright. 1998. Managing product and process development. New York: The Free Press.
Cohen, J. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1) 37-46.
Cohen, M., S. Whang. 1997. Competing in product and service: A product life-cycle model. Management Science, 43, 535-545.
Connell, J., G. Edgar, B. Olex, R. Scholl, T. Shulman, R. Tietjen. 2001. Troubling success and good failures: Successful new product development requires five critical factors. Engineering Management Journal, 13(4) 35-39.
Coopee, T. 2000. Embedded intelligence. Infoworld, November 20, 53-54.
Cusumano, M., R. Selby. 1997. How Microsoft builds software. Communications of the ACM, 40(6) 53-61.
Eisenhardt, K., B. Tabrizi. 1995. Accelerating adaptive processes: Product innovation in the global computer industry. Administrative Science Quarterly, 40, 84-110.
Fyson, M., C. Boldyreff. 1998. Using application understanding to support impact analysis. Journal of Software Maintenance: Research and Practice, 10(2) 93-110.
Gawer, A., M. Cusumano. 2002. The elements of platform leadership. MIT Sloan Management Review, 43(3) 51-58.
Gibson, V., J. Senn. 1989. System structure and software maintenance performance. Communications of the ACM, 32(3) 347-358.
Greene, W. 2002. Econometric Analysis, 5th ed. Englewood Cliffs, NJ: Prentice-Hall, Inc.
Halfhill, T. 2000. Embedded microprocessors. Computerworld, August 28, p. 65.
Hamel, G., C.K. Prahalad. 1994. Competing for the Future. Boston, MA: Harvard Business School Press.
IEEE. 1993. IEEE Standard for Software Maintenance. New York: IEEE, 39.
Iansiti, M., A. MacCormack. 1999. Living on Internet Time: Product Development at Netscape, Yahoo, NetDynamics and Microsoft. Harvard Business School Case #9-697-052, Boston, MA: Harvard Business School Publishing.
Kalakota, R., A.B. Whinston. 1996. Electronic Commerce: A Manager's Guide. Reading, MA: Addison-Wesley.
Kelley, M. 1994. Productivity and technology: The elusive connection. Management Science, 40(11) 1406-1426.
Kemerer, C. 1995. Software complexity and software maintenance: A survey of empirical research. Annals of Software Engineering, 1, 1-22.
Kemerer, C. 1992. How the learning curve affects computer aided software engineering (CASE) tool adoption. IEEE Software, 9(5) 23-28.
Kemerer, C., S. Slaughter. 1999. An empirical approach to studying software evolution. IEEE Transactions on Software Engineering, 25(4) 1-17.
Kemerer, C., S. Slaughter. 1997. Determinants of software maintenance profiles: An empirical investigation. Software Maintenance: Research and Practice, 9, 235-251.
Kennedy, P. 1998. A Guide to Econometrics, 4th ed. Cambridge, MA: MIT Press.
Koopman, P. 1996. Embedded system design issues. Proceedings of the International Conference on Computer Design, Austin, TX, IEEE Computer Society Press, 310-317.
Krippendorff, K. 1980. Content Analysis: An Introduction to its Methodology. Newbury Park, CA: Sage Publications.
Krishnan, V., S. Gupta. 2001. Appropriateness and impact of platform-based product development. Management Science, 47(1) 52-68.
Krishnan, V., K. Ulrich. 2001. Product development decisions: A review of the literature. Management Science, 47(1) 1-21.
Lehman, M.M., J.F. Ramil, P.D. Wernick, D.E. Perry, W.M. Turski. 1997. Metrics and laws of software evolution - the nineties view. Metrics '97, The Fourth International Software Metrics Symposium, Albuquerque, NM.
Li, W., L. Etzkorn, C. Davis. 2000. An empirical study of object-oriented system evolution. Information and Software Technology, 42(6) 373-381.
Lientz, B., E.B. Swanson. 1978. Software Maintenance Management: A Study of the Maintenance of Computer Application Software in 487 Data Processing Organizations. Reading, MA: Addison-Wesley.
MacCormack, A. 2001. Product-development practices that work: How Internet companies build software. MIT Sloan Management Review, 42(2) 75-84.
MacCormack, A., R. Verganti, M. Iansiti. 2001. Developing products on ‘Internet Time’: The anatomy of a flexible development process. Management Science, 47(1) 133-150.
Malaiya, Y.K., J. Denton. 1999. Requirements volatility and defect density. Proceedings of the 10th International Symposium on Software Reliability Engineering, 285-294.
Meyer, N., A. Lehnerd. 1997. The Power of Product Platforms. New York: Free Press.
Meyer, N., P. Mugge. 2001. Make platform innovation drive enterprise growth. Research Technology Management, January-February, 25-39.
Meyer, N., R. Seliger. 1998. Product platforms in software development. MIT Sloan Management Review, 40(1) 61-74.
Musa, J., A. Iannino, K. Okumoto. 1990. Software Reliability: Measurement, Prediction, Application. New York: McGraw-Hill.
Neter, J., W. Wasserman, M.H. Kutner. 1990. Applied Linear Statistical Models: Regression, Analysis of Variance, and Experimental Designs, 3rd ed. Burr Ridge, IL: Richard D. Irwin, Inc.
Prahalad, C.K., M. Krishnan. 2002. The dynamic synchronization of strategy and information technology. MIT Sloan Management Review, 43(4) 24-33.
Prahalad, C.K., M. Krishnan. 1999. The new meaning of quality in the information age. Harvard Business Review, 77(5) 109-118.
Pressman, R. 2001. Software Engineering: A Practitioner’s Approach, 5th ed. New York: McGraw-Hill.
Robertson, D., T. Allen. 1993. CAD system use and engineering performance. IEEE Transactions on Engineering Management, 40(3) 274-282.
Sacks, M. 1994. On the Job Learning in the Software Industry: Corporate Culture and the Acquisition of Knowledge. Westport, CT: Quorum Books.
Salzman, H. 1989. Computer-aided design: Limitations in automating design and drafting. IEEE Transactions on Engineering Management, 36(4) 252-262.
Simon, H.A. 1994. The Sciences of the Artificial. Cambridge, MA: The MIT Press.
Sommerville, I. 2000. Software Engineering. New York: Addison-Wesley.
Stephens, M.A. 1986. Goodness of Fit Techniques. New York: M. Dekker.
Swanson, E.B., C. Beath. 1989. Maintaining Information Systems in Organizations. New York: John Wiley and Sons.
Swanson, E.B., E. Dans. 2000. System life expectancy and the maintenance effort: Exploring their equilibrium. MIS Quarterly, 24(2) 277-297.
Ulrich, K., D. Ellison. 1999. Holistic customer requirements and the design-select decision. Management Science, 45(5) 641-658.
Ulrich, K., S. Eppinger. 2000. Product Design and Development, 2nd ed. New York: McGraw-Hill.
Von Hippel, E. 1998. Economics of product development by users: The impact of "sticky" local information. Management Science, 44(5) 629-644.
Warnier, J. 1976. Logical Construction of Programs, 3rd ed. New York: Van Nostrand Reinhold.
Wheelwright, S., K. Clark. 1992. Revolutionizing Product Development: Quantum Leaps in Speed, Efficiency, and Quality. New York: Free Press.
White, H. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48(5) 817-838.
Whitten, J., L. Bentley, K. Dittman. 2001. Systems Analysis and Design Methods, 5th ed. New York: McGraw-Hill/Irwin.
Williamson, O.E. 1985. The Economic Institutions of Capitalism. New York: Free Press.
Yau, S.S., J. Collofello. 1985. Design stability measures for software maintenance. IEEE Transactions on Software Engineering, 11(9) 849-856.
Yuen, C.H. 1985. An empirical approach to the study of errors in large software under maintenance. 2nd IEEE Conference on Software Maintenance, Washington, D.C., 96-105.
Appendix A Coded Change Event Software change events were extracted from histories or logs that were written by developers each time they updated a software component. The data available in the change logs includes the original software component creation date and author, the function of the component, the product to which the component belongs, the developer making a change, the date of the change, and a description of the change. An example of a coded change event is presented below. A detailed explanation of the coding approach is provided in Kemerer and Slaughter (1999).
Coded Change Event (annotations on the right indicate the coded fields)

*-------------------------------------------------------------------*
 SYSTEM-ID.                           <-- PRODUCT NUMBER
 COMPONENT-ID. M110.                  <-- COMPONENT NUMBER
 AUTHOR. JOHN.                        <-- COMPONENT CREATE AUTHOR
 DATE-WRITTEN. FEBRUARY 1990.         <-- COMPONENT DATE CREATED
 DATE-COMPILED.
*-------------------------------------------------------------------*
*                     ON-LINE RECEIVING ENTRY                       *
*                          CHANGE LOG                               *
*-------------------------------------------------------------------*
* DATE: 04/05/91                      <-- COMPONENT DATE CHANGED
* PROGRAMMER: D.A.                    <-- COMPONENT CHANGE AUTHOR
* CHANGE: RESET IDOC-TYPE FLAG (INDICATING ACTIVE PO) AND           *
*         IDOC-RECEIPT-FLAG WHEN UPDATING IDOC-MRNC-DEDUCT-FLAG.    *
*-------------------------------------------------------------------*
*...
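For readers who wish to assemble similar change histories, the following is a hedged sketch of extracting change events from comment blocks like the one above. The regular expressions assume the layout shown here; real logs would need more defensive handling (multi-line change descriptions, missing fields, varying date formats).

```python
# Sketch: pull (date, programmer, description) tuples out of change-log
# comment blocks of the form shown in Appendix A.
import re

DATE_RE = re.compile(r"^\*\s*DATE:\s*(\d{2}/\d{2}/\d{2})")
PROG_RE = re.compile(r"^\*\s*PROGRAMMER:\s*(.+?)\s*\*?$")
CHANGE_RE = re.compile(r"^\*\s*CHANGE:\s*(.+?)\s*\*?$")

def parse_change_events(lines):
    """Yield one (date, programmer, description) tuple per change entry.
    Only the first line of a multi-line description is captured here."""
    event = {}
    for line in lines:
        for key, rx in (("date", DATE_RE), ("programmer", PROG_RE),
                        ("change", CHANGE_RE)):
            m = rx.match(line.strip())
            if m:
                event[key] = m.group(1)
        if len(event) == 3:
            yield (event["date"], event["programmer"], event["change"])
            event = {}

log = [
    "* DATE: 04/05/91        *",
    "* PROGRAMMER: D.A.      *",
    "* CHANGE: RESET IDOC-TYPE FLAG (INDICATING ACTIVE PO) AND *",
]
print(list(parse_change_events(log)))
```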