THREE ESSAYS ON THE COGNITIVE EFFECTS OF CATEGORIZATION PROCESSES IN MARKETS
A DISSERTATION SUBMITTED TO THE GRADUATE SCHOOL OF BUSINESS AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
Ming De Leung May 2010
© 2010 by Ming De Leung. All Rights Reserved. Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License. http://creativecommons.org/licenses/by-nc/3.0/us/
This dissertation is online at: http://purl.stanford.edu/ns482rk4930
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Hayagreeva Rao, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Michael Hannan
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. Jesper Sorensen
Approved for the Stanford University Committee on Graduate Studies. Patricia J. Gumport, Vice Provost Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
ABSTRACT
Categorical boundaries serve to partition continuously differing social actors into groups that are perceived to be alike. Work in economic and organizational sociology often draws upon findings in cognitive psychology to understand how these categorization processes impact organizational and individual outcomes. In particular, researchers have recently focused their attention on the finding that these social mechanisms of categorization lead actors who straddle multiple categories to suffer a disadvantage. Those who span multiple categories defy clear categorization and are subsequently devalued because they are difficult to understand. In the three essays to follow, I aim to contribute to the work on understanding the cognitive aspects of categorization processes in markets. In the first essay, I use a natural experiment to provide a more solid empirical foundation for the existence of a cognitively driven penalty for multiple-category membership. In my second essay, I theorize on whether the sequence of categorical affiliations affects how one is evaluated. I propose that social actors who display a history of moves between less cognitively associated categories will be seen as less credible. In my third essay, I examine how heterogeneity in audience experiences affects audience members' ability to communicate with potential market participants. I hypothesize that audience members with greater depth of expertise are able to attract more accurate offers from more focused producers, while audience members with greater breadth of experience attract less accurate offers from less focused producers.
ACKNOWLEDGEMENTS
This thesis would not have been possible without the support of many people. I am deeply indebted to my reading committee members for their guidance. I thank my advisor, Hayagreeva Rao, whose unceasing enthusiasm and energy kept me engaged and motivated and whose encouragement to think more succinctly sharpened my writing. I am also grateful to Mike Hannan, whose ability to glean the insight from my muddied early drafts and to suggest fruitful theoretical refinements and ideas improved my papers dramatically. Finally, I am thankful to Jesper Sørensen, as this thesis is greatly improved due to his uncompromising expectations and particularly useful methodological advice. I wish also to thank Chip Heath, Glenn Carroll, Bill Barnett, and Jerker Denrell for helpful suggestions and encouragement, and the Stanford Graduate School of Business for support.
I also wish to acknowledge the wonderful support from my friends and family. Thanks to my colleagues, Amanda Sharkey, Giacomo Negro, Gael Le Mens, and John-Paul Ferguson, who have co-authored, shared data, or given me valuable advice. I thank Ved Sinha, Fabio Rosati, and James Lee of Elance.com for allowing me to use their data. I happily acknowledge my gratitude to my father and mother, Cho Kin and Lin, who have instilled in me a natural curiosity and acted as my role models, and to my sisters, May May and Li Li, who have always supported everything I’ve done. Lastly, my greatest debt of gratitude goes to my wife, Nina, who has maintained unwavering confidence in me through this tumultuous journey. And to my son, Gio, who was considerate enough to be born a week after I defended my dissertation and has brought immeasurable joy to my life.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
CHAPTER 2: MULTIPLE CATEGORY MEMBERSHIP
CHAPTER 3: SEQUENTIAL CATEGORY MEMBERSHIP
CHAPTER 4: HETEROGENEOUS AUDIENCES
CHAPTER 5: SUMMARY AND DISCUSSION
APPENDICES
REFERENCES
LIST OF TABLES
TABLE 2.1 Summary Statistics for Prosper.com Labeling Analysis
TABLE 2.2 Correlations for Prosper.com Labeling Analysis
TABLE 2.3 Log-Odds Listing Will Be Fully Funded (Random and Fixed-Effects Binomial Logistic Regression)
TABLE 2.4 Listing Getting a First Bid, in Days (Piecewise Constant Hazard Rate Model)
TABLE 3.1 Summary Statistics for Elance.com Sequence Analysis
TABLE 3.2 Correlations for Elance.com Sequence Analysis
TABLE 3.3 Log-Odds of Winning a Bid (Fixed-Effects Binomial Logistic Regression by Job)
TABLE 3.4 Log-Odds of Winning a Bid (Fixed-Effects Binomial Logistic Regression by Bidder)
TABLE 3.5 Feedback Score Received (Heckman Selection Models)
TABLE 4.1 Summary Statistics for Elance.com Audience Analysis
TABLE 4.2 Correlations for Elance.com Audience Analysis
TABLE 4.3 Coefficient of Variation for Bids Placed for a Job (Random-Effects Tobit Regression by Buyer)
TABLE 4.4 Mean Categorical Focus of All Bidders for a Job (Random-Effects Tobit Regression by Buyer)
LIST OF FIGURES
FIGURE 2.1 Screen Shot of Prosper Loan Request
FIGURE 2.2 Screen Shot of Group Page with Category Labels Visible
FIGURE 2.3 Screen Shot of Group Page with Category Labels Removed
FIGURE 3.1 Illustrative Category Meta-Schema
FIGURE 3.2 Sequence of Category Moves
FIGURE 3.3 Elance.com Job Listing
FIGURE 3.4 Seller Feedback Page
FIGURE 3.5 Percent of Bidders Working in More than One Job Category
FIGURE 3.6 Meta-Schema of Elance.com Job Categories
FIGURE 3.7 Erraticism of Bidders in 2004
FIGURE 3.8 Likelihood of Winning the Bid Ranked by Erraticism (Matched Pairs of Bidders for Same Job)
FIGURE 3.9 Percent Change in Likelihood of Winning A Bid
FIGURE 4.1 Elance.com Job Description
CHAPTER 1 INTRODUCTION
Recent research in economic and organizational sociology draws upon findings in cognitive psychology to understand how categorization processes impact organizational and individual outcomes (Zuckerman 1999, 2000; Zuckerman, Kim, Ukanwa, and von Rittman 2003; Rao, Monin, and Durand 2005; Hsu 2006; Hannan, Pólos, and Carroll 2007). Categorical boundaries serve to partition continuously differing organizations into groups that critics, consumers, and investors perceive to be alike. Categorization influences economic outcomes by aiding consumers in finding the products they seek (Rosa and Spanjol 2005; Rosa, Porac, Spanjol, and Saxon 1999), facilitating financial analysts in comparing companies (Zuckerman 1999), inducing managers to divest incoherent lines of business (Zuckerman 2000), and directing the attention of movie critics (Hsu 2006). Zuckerman (1999) illustrated how the occupant of a candidate role tries to curry favor or recognition from an audience who holds the power to grant needed resources. Candidates needed to “look right” to the resource holder to be considered for evaluation. When candidates straddle multiple categories, they often fail to display the necessary characteristics of any one for audiences to adequately understand them. They risk getting ignored. Hannan, Pólos, and Carroll (2007: 108) suggest that these penalties to irregular classificatory membership may arise because a “lack of representativeness lowers confidence.”
One key argument in this vein has been that actors – be they organizations, individuals, or films – who attempt to span multiple categories, thereby defying clear categorization, are subsequently devalued. Empirical work has shown consistently that those identified with more categories fare worse than those affiliated with fewer. For example, Hsu (2006) found that movies classified in multiple genres garnered lower average ratings from both film critics and the mass audience. She argued that the less favorable ratings stemmed from the difficulties audiences had interpreting films that aimed to fit with many different tastes and, as a result, tended not to fit well with any particular one. Similarly, Zuckerman et al. (2003) showed that actors whose work experience spanned more film genres were less likely to be subsequently hired, compared to actors whose work was more concentrated in a single genre. The authors concluded that this occurred because employers implicitly screened candidates according to their fit with established film genres and thereby ignored those who had not established an identity that was definitively associated with a particular one. Finally, Zuckerman (1999) found that companies had lower market valuations when their pattern of industry participation did not fit squarely into the industry categorization system embodied in securities analysts’ division of labor. Analysts tended to ignore these diversified companies because ambiguity about the industry-based component of these firms’ identities left analysts uncertain about how to evaluate them. Overall, these findings have been interpreted as implicating cognitive psychological processes in the devaluation of organizations or individuals who span multiple categories in economic settings.
These studies of social actors who span multiple categories in markets have emerged in parallel with the established literature in cognitive and social psychology exploring related issues (Rosch 1973; Smith and Medin 1981; Murphy 2004). Foundational works on cognition demonstrate that encountering unusual category combinations prompts individuals to expend greater cognitive effort to make sense of them (Kunda, Miller, and Claire 1990; Asch and Zukier 1984). Because individuals are sometimes portrayed as “cognitive misers” (Fiske and Taylor 1984), objects which require additional effort to decode are generally ignored or even negatively viewed (Cohen and Basu 1987; Garbarino and Edell 1997).
Yet, despite its consistency with the psychological literature, extant work in market settings using observational data has been less than definitive on the source of the penalty for multiple-category membership. This is because it is difficult to completely rule out unmeasured differences, such as quality, skill, or ability, between category spanners and those affiliated with only one category. For example, in the case of movies that combine several genres, it is difficult, even in the presence of numerous statistical controls, to determine conclusively whether the reduced appeal is due to a lack of fit with an audience’s cognitive schemas or whether genre-spanning movies are objectively of poorer quality. In fact, numerous theories would suggest that operational, rather than cognitive, difficulties explain these findings. Niche width theory in organizational ecology (Freeman and Hannan 1983) highlights the operational challenges that result from organizations attempting to do many different things at once. According to this theory, organizations face similar constraints in terms of the total effort they can expend. As a
result, firms must trade off broader engagement across many areas for deeper engagement with one. Work in cognitive and social psychology does not adjudicate between these competing explanations of the multiple-category discount observed in market settings precisely because such operational issues are unlikely among individuals in lab settings.
In the second chapter to follow, co-authored with Amanda Sharkey, we aim to contribute to the literature on the role of categories in economic and organizational life by providing a more solid empirical foundation for the existence of a cognitively driven penalty for multiple-category membership in market settings. Our goal was to discern whether cognitive factors drive at least part of the multiple-category discount found in economic settings after accounting for functional differences. In particular, we employed unique evidence in the form of a natural experiment on a peer-to-peer lending website where individuals seek to borrow money from those who wish to loan it. We first demonstrated that users affiliated with multiple categories were less likely to receive full funding for their loan requests while category labels were displayed on the website. However, once this evidence of multiple-category membership was removed, these same users no longer suffered such a disadvantage.
Encouraged by this evidence of a cognitive component to the discount that markets impose on actors who span multiple categorical boundaries, I ask in my third chapter: Are there better or worse ways of expanding beyond recognized categorical boundaries? Despite this multiple-category detriment, both organizations and individuals are often pressured to extend beyond their initial province. Organizations, facing pressures to grow, often attempt to diversify from their initial
lines of business (Rumelt 1974). Competitors encourage expansion by mimicking each other’s market entry decisions (Haveman 1993). Economies of scope lead organizations to accumulate product lines in the hopes of exploiting expertise (Teece 1980) or forgoing contractual complexities (Williamson 1979). Sometimes, client expectations push firms to take on potentially disparate lines of business (Phillips and Zuckerman 2001). Individuals often engage across disparate domains as well. The market for CEOs drives turnover and succession of chief executives (Ocasio 1994). Successful entrepreneurs are seen to result from affiliations with more disparate social cliques (Burt 1992), which allow the focal actor access to complementary information. More experienced film actors are encouraged to demonstrate their skills by expanding their repertoire into multiple genres (Zuckerman et al. 2003).
This highlights a shortcoming of the theoretical approach to multi-category membership to date: it cannot distinguish more versus less successful category spanners – in short, whether they are dilettantes or Renaissance men. The line between the two could be drawn more distinctly if investigators were able to disambiguate why some candidates, given a fixed multiple-category portfolio of experiences, are more successful than others. Without a theory for why category spanners are not all similarly disadvantaged, we are left to assume that a social actor’s success in working broadly is due to chance. The latest literature has hinted at the potential existence of successful polymaths (Zuckerman et al. 2003; Ferguson 2009). However, the idea of a Renaissance man, able to successfully compile experiences across disparate categories without audience censure, proves to be theoretically elusive. For example, Zuckerman
and colleagues’ (2003) investigation of film actors stops short of highlighting factors which contribute to successful generalists and instead shows how the penalty for unfocused actors was lessened for veterans. This view is unable to explain how two equally experienced actors, who have worked in identical genres, may be more or less successful. I suggest that the historical order in which a social actor accumulated their categorical experiences should matter.
Historical context is of interest to social scientists because the consequences of past actions are often reflected in future outcomes. For example, Stinchcombe (1965) observed how founding conditions of organizations impact their future social structures. This was termed “organizational imprinting.” Baron, Hannan, and Burton (1999) verified this by showing how startups in Silicon Valley which began with higher administrative efforts eventually became more bureaucratic. Phillips (2005) uncovered evidence of the persistence of gender inequality in Silicon Valley law firms as the result of routines transferred from the parent firms of founders. Scholars who study labor markets have identified historical effects on individuals. Sørensen (2000) demonstrated how a team’s composition over time contributes to the likelihood of an individual member leaving in the future. Zuckerman and colleagues (2003) studied the career histories of film actors and showed that those with past acting experiences that were less concentrated in a genre were less likely to obtain future work.
I link these two streams of literature, multiple-category membership and history dependence, by asking the question: Are there more or less acceptable sequences of historical categorical membership? Sociologists who study career
progressions (Lawrence and Tolbert 2007; Abbott 1995; Abbott and Hrycak 1990; Kalleberg and Hudis 1979) motivate this chapter by examining past patterns of career movements and their effect on future outcomes. For example, Wilensky (1961) identified people with career patterns of more or less orderly progress and demonstrated the effect of such patterns on the individual’s level of social participation. Lawrence and Tolbert (2007) hypothesize that typical career sequences are normatively established through repeated observations. Abbott and Hrycak (1990) find that typical career sequences of musicians in 19th-century Germany can be identified and codified. Blair-Loy (1999) finds that, despite persistent beliefs among women in finance-executive careers that their success is random and accidental, their achievements are increasingly the result of a patterned career trajectory.
This chapter takes the stance that candidates who have sequentially moved between categories separated by stronger boundaries will suffer a disadvantage. By boundaries, I mean to invoke how associated categories are with one another. In this sense, I intuit a “cognitive distance” between two categories. A candidate’s past sequence of experiences serves as a cue or signal (Spence 1973) to potential evaluators. Because categories necessarily delineate potential difference, they lead audiences to examine a candidate’s past through such a prism of categorical variation. Categories become evaluative in this instance. Individuals often prefer narratives when attempting to develop an understanding. They also see patterns of behavior when none may exist. When an audience attempts to construct a narrative to interpret a candidate’s identity, a candidate with an unpredictable or unfamiliar sequence of past categorical experiences will likely be perceived as being erratic – a dilettante. On the other hand,
candidates who move between categories which are more cognitively proximate will be inferred to be more deliberate in their actions and will seem more conscientious.
While these previous two papers have attempted to identify the cognitive detriment suffered by social actors who are unable to demonstrate categorically coherent past affiliations, my fourth chapter examines how the expertise an audience member has affects outcomes in markets. Zuckerman (1999) originally posited an audience/candidate interface as a conceptualization of a market. Literature to date along this paradigm has mainly focused on how candidates who fail to adequately display categorically familiar characteristics are generally disadvantaged. This is because audiences develop expectations as to what characteristics a category member should exhibit. Therefore, social actors seeking attention will need to demonstrate categorical adherence (Zuckerman et al. 2003). Those that don’t are ignored (Zuckerman 1999), suffer critical censure (Rao et al. 2003; Hsu 2006), and are generally disadvantaged in the marketplace (Hsu et al. 2009). Much of this recent effort has been to understand the particular difficulties the candidate or producer faces. This stream of literature has only recently begun to examine the consequences of potential heterogeneity in audience members.
My fourth chapter tackles the question of heterogeneity in audiences. If audience understanding of categorical boundaries affects how producers are disadvantaged, then theorizing on the consequences of variation in audience expertise becomes important. In short, how does potential variance in audience expertise affect market outcomes? For example, Hsu (2006) identified both the mass audience as well as the critic in her examination of the consequences of movies that combined elements
from disparate genres. Her findings were that both sets of distinct audiences reacted in similar ways to movies which combined genre elements – namely that the more genres a movie spanned, the less appealing it was to both sets of audiences. However, she also demonstrated that the more expert audience members, namely the critics, paid more attention to those movies which spanned more genres while the mass audience demonstrated no such increase in attention.
This chapter extends previous work which has conceptualized audiences as varying in expertise (Hannan et al. 2007; Hsu 2006; Carroll and Swaminathan 2000) by suggesting that this expertise can be developed along two dimensions, distinguished by the depth and breadth of past experiences. Greater depth refers to the amount of experience an audience member has had in any one particular category, while breadth refers to how spread out across different categories their experiences are. This is important in markets where audience members have to solicit offers because their expertise will influence their ability to communicate their requirements. I hypothesize that audience members with greater depth of experience will propose requirements that are more categorically pure and attract more accurate offers from more categorically focused producers. On the other hand, those with broader experiences will define more categorically mixed requirements and attract less accurate offers from more categorically diverse producers.
Audience members with more experience (greater depth) in a particular category will have greater fluency in that category’s conventions. The more often they experience a category, the better their understanding of what types of exchanges occur in that category, the more familiar they become with the details particular to transactions
in that category, and the clearer their expectations for its members. This improves the audience members’ ability to use category-specific language and to more precisely specify requirements that are appropriate for that category. Producers will be better able to understand what is expected. This is hypothesized to lead to greater accuracy in the offers made. Audience members with more experience (greater depth) in a particular category will also attract producers that are more specialized in that particular category to make offers. Producers with in-depth category experience will self-select into these jobs. This is because producers with more experience in the focal category are more likely to have the skills to match these more categorically detailed requests.
The greater the breadth of different categorical experiences audience members have, the greater the diversity of category elements they will be exposed to. This broader exposure increases the familiarity these audience members will have with characteristics from disparate categories. This increases the likelihood that they will have broader preferences and more difficulty in narrowing their requirements to be specific to one category. Their requests will include elements that are more categorically mixed and will appear more ambiguous. This is hypothesized to lead to lower accuracy in the offers made. These categorically mixed requirements will also draw producers that have broader backgrounds to make offers. Producers that have worked across more categories will self-select and be more likely to make offers because their more diverse skills are more likely to match these categorically mixed requirements.
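As an illustration only, the depth and breadth dimensions described above could be operationalized along the following lines. The sketch assumes that an audience member's history is a simple list of category labels. It treats depth as the count of past experiences in a focal category and breadth as one minus the Herfindahl concentration of the member's category shares. Both measures, the function names, and the sample category labels are hypothetical and are not presented as the measures used in Chapter 4.

```python
from collections import Counter


def depth(history, focal_category):
    """Count past experiences in the focal category (hypothetical measure)."""
    return sum(1 for category in history if category == focal_category)


def breadth(history):
    """One minus the Herfindahl index of category shares (hypothetical measure).

    Returns 0.0 when all experience sits in a single category (or when the
    history is empty); larger values mean experience is spread more evenly
    across more categories.
    """
    if not history:
        return 0.0
    counts = Counter(history)
    total = len(history)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Two made-up audience members with the same amount of total experience.
specialist = ["web", "web", "web", "web"]
generalist = ["web", "legal", "design", "writing"]

print(depth(specialist, "web"), breadth(specialist))
print(depth(generalist, "web"), breadth(generalist))
```

Under these assumptions the two members have identical total experience but differ on both dimensions, which is the contrast the hypotheses above rely on.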
CHAPTER 2 MULTIPLE CATEGORY MEMBERSHIP IN MARKETS
Recent research in economic and organizational sociology draws upon findings in cognitive psychology to understand how categorization impacts a myriad of organizational and individual outcomes (Zuckerman 1999, 2000; Zuckerman, Kim, Ukanwa, and von Rittman 2003; Hannan, Pólos, and Carroll 2007). Categorical boundaries serve to partition continuously differing organizations into groups that critics, consumers and investors perceive to be alike. Categorization influences economic outcomes by aiding consumers in finding the products they seek (Rosa and Spanjol 2005; Rosa, Porac, Spanjol, and Saxon 1999), facilitating financial analysts in comparing companies (Zuckerman 1999), inducing managers to divest incoherent lines of business (Zuckerman 2000) and assisting producers in identifying their competitors (Porac, Thomas, Wilson, Paton, and Kanfer 1995). One key argument in this vein has been that actors -- be they organizations, individuals or films -- who attempt to span multiple categories, thereby defying clear categorization, are subsequently devalued. Empirical work has shown consistently that those identified with more categories fare worse than those affiliated with fewer. For example, Hsu (2006) found that movies classified in multiple genres garnered lower average ratings from critics. She argued that the less favorable ratings stemmed from the difficulties audiences had interpreting films that aimed to fit with many different tastes but as a result tended not to fit well with any particular one. Similarly, Zuckerman et al. (2003) showed that actors whose work experience spanned more film
genres were less likely to be subsequently hired, compared to actors whose work was more concentrated in a single genre. The authors concluded that this occurred because employers implicitly screened candidates according to their fit with established film genres and thereby ignored those who had not established an identity that was definitively associated with any particular one. Finally, Zuckerman (1999) found that companies had lower market valuations when their pattern of industry participation did not fit squarely into the industry categorization system embodied in securities analysts’ division of labor. Analysts tended to ignore these diversified companies because ambiguity about the category-based component of these firms’ identities left analysts uncertain about how to evaluate them. Overall, these findings have been interpreted as implicating cognitive psychological processes in the devaluation of organizations or individuals who span multiple categories in economic settings.
These studies of multiple categorization in markets have emerged in parallel with the established literature in cognitive and social psychology exploring related issues. Foundational works on cognition demonstrate that encountering unusual category combinations (e.g., Harvard-educated carpenter) prompts individuals to expend greater cognitive effort to make sense of them and produces emergent traits that are not necessarily associated with any of the constituent categories (Kunda, Miller, and Claire 1990; Asch and Zukier 1984). Because individuals are sometimes portrayed as “cognitive misers” (Fiske and Taylor 1984), objects which require additional effort to decode are generally ignored or even negatively viewed (Cohen and Basu 1987; Garbarino and Edell 1997). In a different, but related, vein, social psychological research into individuals and groups highlights the challenges multi-racial individuals face as others attempt to make sense of their ambiguous racial identities (Alipuria 2002; Williams 1997; Root 1996; Omi and Winant 1986).
Yet, despite its consistency with the psychological literature, extant work in market settings using observational data has been less than definitive on the source of the penalty for multiple-category membership. This is because it is difficult to completely rule out unmeasured differences, such as quality, skill, or ability, between category spanners and those affiliated with only one category. For example, in the case of movies that combine several genres, it is difficult, even in the presence of numerous statistical controls, to determine conclusively whether the reduced appeal is due to a lack of fit with an audience’s cognitive schemas or whether genre-spanning movies are objectively of poorer quality. In fact, numerous theories would suggest that operational, rather than cognitive, difficulties explain these findings. Niche width theory in organizational ecology (Freeman and Hannan 1983) highlights the operational challenges that result from organizations attempting to do many different things at once. According to this theory, organizations face similar constraints in terms of the total effort they can expend. As a result, firms must trade off broader engagement across many areas for deeper engagement with one. Similarly, the strategic management and finance literatures on corporate diversification report evidence of a multiple-category penalty (Laeven and Levine 2007; Burch and Nanda 2003; Berger and Ofek 1995; Lang and Stulz 1994; Wernerfelt and Montgomery 1988)¹ but explain the results with an entirely different
set of mechanisms, such as inefficiency due to rent-seeking behaviors, agency problems, or information asymmetries that are more problematic in diversified firms than in focused ones (Ozbas 2005; Scharfstein and Stein 2000; Denis, Denis, and Sarin 1997; Jensen 1996; Rotemberg and Saloner 1994; Berger and Ofek 1995; Bolton and Scharfstein 1990). Work in cognitive and social psychology does not adjudicate between these competing explanations of the multiple-category discount observed in market settings precisely because such operational issues are unlikely among individuals in lab settings.
¹ Recently, some work in this area has begun to debate whether diversification has a causal effect in lowering firm value or whether the diversification discount may instead be due to selection effects, namely that firms choosing to diversify tend to be lower-performing to begin with (e.g., Villalonga 2004; Campa and Kedia 2002). We view these emergent findings as further reason to explore the effects of multiple-category membership using a research design that eliminates selection effects.
In this paper, we aim to contribute to the literature on the role of categories in economic and organizational life by providing a more solid empirical foundation for the existence of a cognitively driven penalty for multiple-category membership in market settings. Our goal is not to compare the magnitude of effects stemming from cognitive factors to those from operational ones, but rather to discern whether cognitive factors drive at least part of the multiple-category discount found in economic settings after accounting for functional differences. In particular, we employ unique evidence in the form of a natural experiment on a peer-to-peer lending website where individuals seek to borrow money from those who wish to loan it. Labels indicating a borrower’s category affiliations were visible on the website at one time but were subsequently removed. We show how the outcomes associated with getting a loan changed for the same population of borrowers from before to after these category labels were eliminated. After the labels were removed, multiple-category affiliated borrowers were still present, but, due to the label
removal, were no longer identifiable as such. Because the only difference between the two time periods is this exogenous shock of label removal, we are able to estimate the extent to which cognitive processes related to labeling alone cause the devaluation of multiple-category members. The estimated effect is therefore net of any unmeasured quality, motivational, or other functional differences, as the individuals remained the same. These analyses provide evidence that the penalty for multiple-category membership in economic contexts is at least partially rooted in socio-cognitive processes. THEORETICAL BACKGROUND Much recent research in economic and organizational sociology explores the role of classification structures, or categories, in organizational contexts (Hannan et al. 2007; Hsu 2006; Hsu and Hannan 2005; Rao, Monin, and Durand 2005; Zuckerman et al. 2003; Zuckerman 1999, 2000). This work draws heavily on insights gleaned from the study of categorization by cognitive psychologists (Rosch and Mervis 1975; Smith and Medin 1981; Hampton 1997; see Murphy 2004 for a review). Hannan and colleagues (2007) define a clustering of similar organizations as constituting a category when members of an audience, such as employees, critics, or consumers, attach a label to the cluster, reach a high degree of consensus about what the label means, and come to agreement about the set of organizations to which the label applies. Following this, if an actor claims membership in a category and if the audience accepts that claim, the audience concurrently makes inferences as to how the actor will behave and what features it will possess (Bruner 1957). These categorylevel expectations might actually fit any particular organization claiming category
membership to a greater or lesser degree depending on how closely the actor resembles the typical category member (Rosch and Mervis 1975; Malt and Smith 1984; Porac and Thomas 1990). Thus, category membership generally corresponds in some measure to “real” characteristics of an actor, but, more precisely, it denotes the audience's perception of the actor vis-à-vis some ideal type in the prevailing classification structure (Zerubavel 1997). In this sense, we view category membership in economic settings as a positional attribute denoting an actor's place in an audience's cognitive representation of market participants (cf. Rosa, Porac, Runser-Spanjol, and Saxon 1999; Porac, Thomas, Wilson, Paton, and Kanfer 1995).

From the perspective of organizational audiences, categories are integral because they help in identifying relevant product offerings and choosing among them. Standard models of consumer decision-making posit a two-stage selection process whereby consumers first identify the choice set of all reasonably relevant offers and only then optimize among this smaller set of alternatives (e.g., Howard and Sheth 1969; Payne 1976; Lussier and Olshavsky 1979; Payne, Bettman, and Johnson 1988). Category membership looms large in the first stage because it represents a highly salient marker by which boundaries can be drawn between true contenders and irrelevant ones, thus circumscribing the set of individuals or products receiving a more thorough evaluation (Espeland and Stevens 1998; Lamont and Molnar 2002). This process of commensuration and evaluation becomes problematic, however, when audiences are unable to determine how to classify an offering, either because the offering does not seem to fit into any category at all or because it fits into too many. Confused by attempting to make sense of such offerings and left without the clear set
of expectations that category membership provides, audiences often exclude such options from further consideration and instead direct their efforts toward evaluating more clearly relevant offerings. Recent research suggests that this cognitive sensemaking account of the penalty for multiple-category membership holds in market settings such as the stock market (Zuckerman 1999), feature films (Hsu 2006) and eBay (Hsu, Hannan, and Koçak 2009).

Although the convergent findings from this line of literature support the notion that actors who defy institutionalized classification schemas are punished or ignored because evaluators have difficulty making sense of them, alternative explanations remain. In particular, both the strategic management literature and research in organizational ecology offer alternative explanations as to why multiple-category members might be devalued. These accounts rely on the potential operational difficulties that arise when actors attempt to do many different things at once. For example, niche width theory (Freeman and Hannan 1983) begins with the assumption that organizations have equal capacity for total effort expenditures and posits that organizations must allocate their attention among various opportunities by trading off more effort in any one area for less in another. Thus, actors spanning multiple categories must devote fewer resources to any particular category than a specialist would (Freeman and Hannan 1983; Peli 1997; Dobrev, Kim, and Hannan 2001; Hannan, Pólos, and Carroll 2007).2 Therefore, those occupying multiple resource niches have lower appeal in any particular one relative to specialists who focus all their energies (Hsu, Hannan and Koçak 2009).

2 Because niche width theory assumes that all organizations have the same total capacity for performance, it is not applicable to situations where generalists enjoy economies of scale/scope (Hannan, Pólos, and Carroll 2007). The setting under consideration here meets that assumption.

The strategy and finance literatures on the diversification discount similarly highlight the operational causes for why diversified conglomerates perform worse on average than single line-of-business firms in the same industry. Explanations tend to center around various capital allocation challenges in diversified firms. One line of argument is that diversified firms are valued lower than more focused firms because of cross-subsidization: resources are taken from high-performing lines of business to subsidize laggard lines that might otherwise go out of business if they were forced to stand alone (Berger and Ofek 1995; Meyer, Milgrom, and Roberts 1992). A second explanation is that there are more information asymmetries, power struggles, and incentives for rent-seeking behaviors in conglomerates, driving down value and performance (Rajan, Servaes, and Zingales 2000; Scharfstein and Stein 2000; Harris, Kriebel, and Raviv 1982). A third explanation is that conglomerates tend to have larger free cash flows and greater borrowing power, both of which tend to be funneled toward value-decreasing investments (Jensen 1996).

Taken together, then, existing research in economic sociology suggests two possible sources of the penalty for multiple-category membership, namely: 1) devaluation due to cognitive difficulties making sense of and evaluating actors that do not fit clearly into culturally shared categories and 2) devaluation driven by operational challenges that result in poorer quality and performance for actors who try to do many different things at once. Of course, these two sources may occur in tandem. However, precisely because the perceived quality of an offering tends to be at least
partly endogenous to one's position in a classification structure, establishing that cognitive factors contribute to the devaluation of actors who are ambiguously categorized can only occur after the elimination of all possible unmeasured operational differences. This is difficult to achieve with statistical controls. Existing work using observational data leaves the possibility that unmeasured performance differences, rather than cognitive factors, drive the multiple-category penalty. This motivates our use of a natural experiment.

HYPOTHESES

We begin by confirming, as has been demonstrated in previous research, that multiple-category members, identified as such to an external audience, are indeed devalued when their identities as category-spanners are apparent through labels. However, the key question we seek to answer is whether these same actors are devalued when their multiple category labels are not present. In comparing these two conditions, we provide a more solid empirical foundation from which to conclude whether cognitively induced penalties indeed exist net of functional or skill-based explanations.

Various strands of work in psychology show that the presence of multiple category labels may spark confusion and uncertainty, as each label evokes its own set of associated beliefs and expectations, and audiences are left to determine how they may fit together to form a coherent identity. Research on racial and ethnic identity documents the interactional difficulties faced by individuals belonging to multiple racial and ethnic categories (e.g., someone who is both black and white). Multi-racial adults report being constantly asked, “What are you?” as others attempt to discern how
their identity fits with the perceiver's culturally shared schemas for racial categorization (Alipuria 2002; Williams 1997). The fact that the question arises so frequently suggests the pervasive discomfort and social disorientation that perceivers feel in encountering individuals whose identity is ambiguous due to their membership in multiple racial categories (Williams 1997; Root 1996; Omi and Winant 1986).

Work on multi-racial identity deals with a particular type of multiple categorization: that of being multiply categorized in a single domain (i.e., belonging to two or more racial groups). This is distinct from being categorized across multiple domains (e.g., being white and female). The former case nearly always causes consternation whereas the latter case tends to do so to a greater extent when the particular combination is unusual or surprising (Crisp and Hewstone 2006). For example, McCloskey and Glucksberg (1978) found that individuals presented with names of categories and an object that might belong to the category (e.g., furniture-chair, furniture-cucumber) were more inconsistent in classifying the object, both with one another and with their own prior classifications, when the object was only moderately typical of the category (e.g., furniture-bookend rather than furniture-chair or furniture-cucumber). The task in that study is similar to the issue studied here in that it involves making sense of an object based only on category information. We view the inconsistency found there as suggestive of the confusion and lack of certainty that might ensue in making sense of an object with multiple category labels.

Foundational work in this area also shows that individuals engage in effortful processing to develop causal explanations for why and how a set of labels (e.g., Harvard-educated and carpenter) might describe a single entity, particularly when the
combination is unexpected (Kunda, Miller, and Claire 1990; Asch and Zukier 1984). In such cases, the combined category often is described using words not used to characterize either of the constituent categories. For example, subjects presented with a description of someone who was Harvard-educated and a carpenter inferred that the person was “non-materialistic” even though subjects did not use that word to characterize someone who was either Harvard-educated or a carpenter (Kunda, Miller, and Claire 1990). We view this as suggesting the multiplicity of interpretations possible for category combinations and argue that this ambiguity about a category-spanner's identity may result in uncertainty and avoidance.

In economic settings, we suggest that the additional effort and confusion involved in making sense of and evaluating these ambiguous offers may result in their being less likely to be chosen. Humans have been described as “cognitive misers” (Fiske and Taylor 1984); they are often represented as desiring to expend only the minimal effort necessary to make a satisfactory decision. This view is consistent with evidence that suggests individuals allocate their limited cognitive attention sparingly (Payne 1982; Russo and Dosher 1983). More concretely, Cohen and Basu (1987) suggested that because correct classification of objects reduces uncertainty, negative affect may result when people are unable to properly categorize an item. Following this, experimental evidence in consumer behavior has shown that increased cognitive effort in decision-making results in negative feelings toward the object that demands the effort. This then tends to cause decision-makers to prefer offers requiring less cognitive effort to evaluate, even after they have already expended the energy to compare both products (Garbarino and Edell 1997).
In summary, then, there are several possible reasons why having more category labels might lead to negative outcomes, even in the absence of any true differences between multiple-category and single-category members. It would seem to follow that multiple-category membership should incur less of a penalty when prominent labels are not present to highlight one's ambiguous identity. Whether the penalty for multiple-category membership disappears altogether or merely diminishes in size, however, depends on the extent to which audiences can differentiate multiple-category members from single-category members, even in the absence of labels. Because this is a matter to be resolved empirically, we predict only that the penalty for membership in multiple-category groups will attenuate when labels are removed, rather than speculating about whether it will subside entirely. Thus, we predict:

Hypothesis 1: Members of more categories, when labeled as such, will be evaluated more negatively than those belonging to fewer categories.

Hypothesis 2: The negative effect of multiple-category membership will diminish once category labels are removed.

RESEARCH SETTING

The previously mentioned difficulties of measuring the effect of being a multiple-category member, net of any underlying quality, motivational or other functional differences, motivate our use of a natural experiment. This design allows us to disentangle the effect of being labeled a member of numerous categories from the alternative functional explanations that might drive both multiple-category membership and rewards. Two factors are critical to an appropriate experimental design. The first is being able to compare equivalent treatment and control groups so the effects of any unobservable characteristics are removed. The second requirement is
that subjects cannot actively select into one condition or the other, but rather are assigned for reasons unrelated to either the treatment or outcome. Because of a fortuitous change in functionality that resulted in the removal of category labels, our research setting, a peer-to-peer lending website (www.prosper.com), meets these criteria. In the words of the website, “Prosper is an online community for lending and borrowing money. Lenders and borrowers come together to bid on personal loans: loans without a bank through peer-to-peer lending.” Users of the website who wish to borrow money for any purpose can post an unsecured loan request (a listing) for up to $25,000 to be paid back over three years. Other website users who wish to loan money bid on these loan listings by promising to fund a portion of the loan at a particular interest rate. They aim to profit from the interest they will charge. A listing that attracts enough bidders to meet the total loan request becomes a loan. Any user on the website can attempt to borrow or loan money. The website bases its format on micro-lending co-ops in which individuals would band together to lend and borrow outside of formal financial institutions. In the hopes of mimicking this sense of community, the website allowed participants to establish and/or join self-organized groups. These virtual groups are established by a self-appointed “group leader,” who is responsible for classifying the group into various categories to expedite the search process for new members. Each group is required to be labeled with at least one category. Group leaders choose from dropdown menus that allow them to select categories with which to label their group. For example, a group of nurses may choose to band together and categorize their group as,
“Science & Health / Medicine / Nursing.” Notice that “Nursing” is a category that is nested hierarchically under the high-level domain of “Science & Health,” and further nested under the category of “Medicine.” There are fourteen high-level domains, with 1,552 categories nested below, any of which can be chosen by a group leader. See Appendix A for a list of the high-level domains the category labels are organized under. Members of the site are not required to join a group, but, if they do, they are limited to joining only one. Of the 531,186 users on the website in November 2007, approximately 69,291 were affiliated with a group (~13%).

In this setting, individuals can only belong to one group, and thus are not themselves single- or multi-categorical. Instead, individuals choose to join groups that are associated with one or more categories. Thus, category membership is, strictly speaking, a group-level property. However, category membership is closely connected to an individual's identity because a person's loan listing contains a link to the page of any groups to which they belong. When viewing an individual's loan request, prospective lenders can click on the group link to see details of the group a member belongs to, including the number of categories with which their group is affiliated. This somewhat unusual feature actually makes for a stronger test of the penalty for multiple-category membership. Because an individual's membership in a multiple-category group is apparent only once someone clicks on the link to group membership on a person's loan listing, it does take some effort to detect multiple-category membership. This should serve to mitigate the negative effects of multiple-category membership and make it more difficult to find any effect of visible category labels.
Figure 2.1 shows an example of a loan listing posted online by a prospective borrower. Listings include the dollar amount and desired interest rate of the loan. Prospective borrowers also provide a short description of the purpose of the loan and submit financial information (e.g., credit rating, income, debts) verified by a third party. In addition to extensive financial and loan-related information, a listing shows other facets of a prospective borrower's profile, including the borrower's group affiliation. After reviewing such information, lenders then may choose to bid on a loan by offering to fund some portion of the borrower's total request at an interest rate that they can specify.

[insert Figure 2.1 about here]

Listings remain active for a time period, usually one or two weeks, specified by the prospective borrower. At the end of the listing period, bids on a loan are summed. If the total amount offered exceeds the amount requested, the loan is considered to be “fully funded.” This occurs for approximately 13% of loan requests by group members we examined. In the case of a funded loan, bids are aggregated to form a single loan issued in the name of the web site to the borrower, and funds go proportionally to each lender as the borrower enters repayment.

This context is uniquely suited for testing theories about the cognitively driven devaluation of multiple-category members because, as mentioned earlier, members of the site belonged to groups that were classified into varying numbers of categories. Group affiliation appeared as the hyperlinked name of the group on the page of an individual's loan listing. Clicking on the group link took users to a group page with a more detailed description of the group's mission, as well as information about the
number of members, listings and outstanding loans. Most relevant for the topic here, the page also included labels for the categories with which the group was affiliated. See Figure 2.2 below for an example of a group with category affiliation labels visible.

[insert Figure 2.2 about here]

The category labels with which groups were affiliated, once visible on the website, were subsequently removed, providing a natural experiment on the labeling effects of multiple-category affiliation. See Figure 2.3 for an image of a group page with category labels eliminated. Notice the only change is the removal of the group's categorical affiliations. We view the removal of labels as constituting an exogenous “treatment” that allows us to test how the visibility of the number of categories a group is affiliated with impacts their members, net of any functional differences. If label invisibility does not change the penalty a member of a multiple-category group faces, then the devaluation of multiple-category members likely has nothing to do with cognitive factors and may instead reflect the underlying functional differences between actors. However, if the penalty attenuates or disappears along with label visibility, then we should be more persuaded to believe devaluation occurs in part because audiences have difficulty making sense of actors who resist clear categorization.

[insert Figure 2.3 about here]

RESEARCH DESIGN

If this were a true experiment, each loan request would have been randomly assigned to a treatment or control condition. Doing so would allow us to operate under the standard experimental assumption that individuals assigned to the treatment
condition do not differ from those assigned to the control condition in any manner related to both the treatment and the outcome. However, because this occurred in a real-world setting, we had to consider whether the treatment (i.e., removal of category labels) might somehow have affected other aspects of the site that also influenced the outcomes of interest. In short, we seek assurance that the removal of labels did not interfere with the activity that occurred on the site during the period we observed. There are at least three reasons we do not believe this is a concern.

First, we confine our analysis to a window of 100 days before and 100 days after the label visibility change occurred. This limited window reduces the chance individuals could have switched between groups en masse and makes it likely that group composition stayed constant before and after the removal of labels. Second, our interview with a Prosper.com representative gave us no reason to believe the removal of labels was in any way related to how successful or unsuccessful the groups were. The labels were removed because they were initially utilized to assist individuals in finding groups matching their interests, but, as the groups feature was increasingly downplayed, the category labels became less important. This led to their removal. Third, site users did not know about the label change before it occurred. Therefore, it is difficult to imagine that prospective borrowers would have purposefully listed their loans before or after the change simply because they thought the removal of labels would somehow affect their likelihood of obtaining a loan. Given these reasons, this setting seems to be well suited for studying the cognitive effects of multiple-category membership while factoring out any unobservable underlying differences.
DATA AND MEASURES

Prosper.com freely provides data on activity on their website. We downloaded data on September 22, 2008 encompassing all transactions that had occurred on the website up to that date. To minimize the effects of any other changes that occurred over time on the website besides the removal of category labels, we analyzed data from a 200-day window, 100 days before and 100 days after the removal of category labels on September 12, 2007. We also deleted listings spanning the date of label removal, as we were uncertain what would happen to listings that had visible labels one day and none the next. This left us with a total of 13,835 listings: 8,177 listings before label removal and 5,718 listings after. Of the 267 different groups that had members posting a loan request before the removal of the labels, over 85% (228) of these groups had members posting after the label removal as well. This suggests the majority of our population under study was present in both the before (control) and after (treatment) conditions.

Dependent Variables

We analyzed the effect of multiple-category membership in two ways. First, we studied the probability that a listing resulted in a loan. Listings only become loans if they receive 100% of the amount requested by the prospective borrower. Therefore, becoming a loan indicates that the listing was found attractive by several lenders. We coded a listing as '1' if it received full funding and became a loan and '0' otherwise. Most listings on this website did not attract enough bids to become loans. Of the 13,835 listings in the dataset, only 1,846 (13.3%) became loans.
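The sample construction described under Data and Measures (a 100-day window on either side of the September 12, 2007 label removal, excluding listings that span the removal date) can be illustrated with a short sketch. This is not the procedure actually used to build the dataset. The function name `assign_period`, the use of a listing's start and end dates, and the choice to treat a listing that starts on the removal date itself as belonging to the after period are all assumptions made for the example.

```python
from datetime import date, timedelta

LABEL_REMOVAL = date(2007, 9, 12)  # removal date reported in the text
WINDOW = timedelta(days=100)       # 100 days on each side of the removal


def assign_period(start, end):
    """Return 0 (before), 1 (after), or None if the listing is excluded.

    `start` and `end` are hypothetical fields giving the first and last
    day a listing was active.
    """
    # Exclude listings that fall outside the 200-day observation window.
    if start < LABEL_REMOVAL - WINDOW or end > LABEL_REMOVAL + WINDOW:
        return None
    if end < LABEL_REMOVAL:
        return 0  # entirely before label removal (control)
    if start >= LABEL_REMOVAL:
        return 1  # entirely after label removal (treatment); assumed boundary
    return None   # active on both sides of the removal date: excluded
```

The value returned by such a function corresponds to the before/after indicator, and excluded listings would simply be dropped before analysis.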
Second, as another measure of acceptance by potential lenders, we examined the hazard of a prospective borrower's listing attracting a first bid. Although it is certainly easier to obtain a single bid than it is to attract enough bids to fully fund a loan, we suspected members of groups associated with more categories might fare poorly even in the case of this relatively lower hurdle. The inability to obtain even a single bid provides another measure of the hesitancy or reluctance lenders display when encountering a multiple-category group member's loan listing. To test this, we created a dataset in which the unit of analysis was a listing-day. Each listing-day was coded '0' until the listing received its first bid. On the day a listing received its first bid, the listing-day was coded '1' and the listing was removed from the risk set. On average, it took listings 5.6 days before garnering a first bid. Also, note that 33% of listings did not attract even a single bid, demonstrating the difficulty and importance of obtaining a first bid.

In the case where a listing has just been posted and has yet to receive a bid, prospective bidders have to rely solely on their own analysis of the loan's merits rather than making inferences about the loan's attractiveness and legitimacy on the basis of the presence of other bidders, as often happens in economic settings via informational cascades (Bikhchandani, Hirshleifer, and Welch 1992, 1998), herd behavior (Banerjee 1992), or other situations where the actions of others incrementally increase the legitimacy of a listing (Rao, Greve, and Davis 2001; Pollock and Rindova 2003). Therefore, studying whether the rate of obtaining a first bid varies by category membership, both before and after label removal, allows for the identification of a penalty prior to any effects of herd behavior.
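The listing-day structure used for the first-bid analysis can be sketched as follows. This is an illustration under stated assumptions rather than the code behind the reported models: the function name `listing_days`, the 1-based day index, and the tuple layout `(listing_id, day, event)` are hypothetical, and the sketch assumes a first bid, when present, arrives no later than the last active day.

```python
def listing_days(listing_id, days_active, first_bid_day=None):
    """Expand one listing into listing-day rows for a first-bid hazard model.

    days_active:   number of days the listing was open (hypothetical field)
    first_bid_day: 1-based day of the first bid, or None if no bid arrived
    """
    # A listing stays in the risk set until its first bid or until it closes.
    last_day = first_bid_day if first_bid_day is not None else days_active
    rows = []
    for day in range(1, last_day + 1):
        event = 1 if day == first_bid_day else 0  # 1 only on the first-bid day
        rows.append((listing_id, day, event))
    return rows


# A listing open for 7 days whose first bid arrives on day 3 contributes
# three rows; a 2-day listing with no bids contributes two rows of zeros.
with_bid = listing_days("A", 7, first_bid_day=3)
no_bid = listing_days("B", 2)
```

Listings that never receive a bid contribute only zero-coded rows, which is how the roughly one third of listings without bids would enter such a dataset as censored observations.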
Independent Variables

The key independent variable in this analysis is the number of categories with which a group was affiliated. Prior to the removal of category labels, founders of groups were required to choose at least one out of a set of 1,552 possible category labels to describe their group. Groups in this dataset belonged to anywhere between one and five categories. We measured the number of category labels as a count of these labels listed on the group's page. For example, the group “Atlanta Borrowers” was listed as “Regional / By Metro / Atlanta.” We counted this as having 1 label. By contrast, the group “BORROWERS - Free instant Listings” was labeled as belonging to the following categories: “Business & Professional/ Business & Finance/ Entrepreneurs,” “Religion & Spirituality/ Christian / Latter Day Saints,” “People & Lifestyle / Families / Other,” “Military / Other,” and “Other.” This group was operationalized as having five labels. We chose this approach because it reflected the way labels appeared visually on the group pages that prospective lenders could see when evaluating loan listings. Each label appeared as a separate line on the group page.

Control Variables

We controlled for individual characteristics that might affect a listing's success. These included the member's debt-to-income ratio, homeowner status, number of currently delinquent loans, credit rating and income. For details as to the coding scheme used for credit score and income, see Appendix B. Table 2.1 below summarizes the variables and Table 2.2 reports their correlations.

[insert Table 2.1 and Table 2.2 about here]
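The label count described under Independent Variables can be made concrete with a small sketch. It is illustrative only: the function names are hypothetical, and the representation of a group page as a list of label strings (one per displayed line) is an assumption. The point it shows is that slashes separate hierarchy levels within a single label, so a full path such as “Regional / By Metro / Atlanta” still counts as one label.

```python
def count_category_labels(label_lines):
    """Count category labels: one label per non-empty line on the group page."""
    return sum(1 for line in label_lines if line.strip())


def split_hierarchy(label):
    """Split one label into its nested levels, highest-level domain first."""
    return [level.strip() for level in label.split("/")]


# The two example groups quoted in the text.
atlanta = ["Regional / By Metro / Atlanta"]
borrowers = [
    "Business & Professional/ Business & Finance/ Entrepreneurs",
    "Religion & Spirituality/ Christian / Latter Day Saints",
    "People & Lifestyle / Families / Other",
    "Military / Other",
    "Other",
]

n_atlanta = count_category_labels(atlanta)      # one label
n_borrowers = count_category_labels(borrowers)  # five labels
```

Counting lines rather than hierarchy levels mirrors what a prospective lender would have seen on the group page.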
RESULTS We first examined how the number of categories a group affiliated with affected its member‟s likelihood of obtaining a loan when labels were visible and whether this effect diminished once labels were removed. Because the dependent variable is dichotomous, we utilized logistic regression (Long 1997) to predict the logged odds a loan receives full funding. Specifically, we estimated the following:
log
1
X
(2.1)
where represents the linear transformation of the log of the probability, , of the dependent variable occurring, in this case getting full-funding, divided by the probability of it not getting full-funding. This is estimated with as the constant, X‟ as a vector of the independent and control covariates and as the estimated coefficients of those variables. We utilized a pooled binomial logistic model as an analytic strategy, allowing us to estimate the separate effects of labeling on a particular group before and after. Specifically, we created a dummy variable equal to 1 for listings occurring after label removal and 0 for listings occurring before. We then interacted this dummy with the number of categories with which a group was labeled. This interaction term captures the change in the effect of the number of categories, from before to after labels were removed. Table 2.3 presents the results of logistic regression models predicting the log odds of a loan request being funded. Model 1 shows the results of a random effects logistic regression and estimates the overall effect of the number of categories with
which a group is labeled along with the control variables. Model 2 reports a fixed-effects estimation and includes the variable indicating the listing occurred in the period after labels were removed. It also includes the interaction of this variable with the number of categories. For completeness, Model 3 includes all variables interacted with the indicator variable for the period after labels were removed.

[insert Table 2.3 about here]

In all three models, control variables operate generally as predicted. Borrowers with a greater number of delinquent loans, greater debt-to-income ratios, worse credit ratings, and lower income levels were significantly less likely to have their loan requests met. However, homeowners were no more or less likely to get a loan funded than others, perhaps because this effect was already captured by an individual's debt-to-income ratio. Results related to the effect of the number of categories to which a group belongs are consistent with our hypotheses. Model 1 tested the main effect of the number of categories a group was labeled with on the likelihood its members would obtain loans. Because there is no within-group variance in the number of categories to which groups belonged, we utilized a random effects estimation procedure (as opposed to fixed effects). This allowed us to estimate the between-group variance by comparing the success of members from groups with more versus fewer category labels. The coefficient of the number of categories variable in Model 1 represents its overall impact on obtaining a loan. Specifically, each additional category a group is listed in decreases the odds of a member getting a loan by 17.9% (1-exp[-0.197]). This
supports Hypothesis 1 and is consistent with previous research that has shown actors affiliated with more categories fare worse than those identified with fewer. We exploit the fortuitous natural experimental setting to test our second hypothesis. Model 2 includes fixed effects for each group. Because a fixed-effects model uses only within-group variation, it removes any potential between-group differences, such as their tenure, size, or composition, that may affect the results. In short, a fixed-effects model allows us to compare outcomes for members of the same group before and after the label removal, thereby eliminating potential alternative explanations that rely on operational or skill differences. The interaction of the dummy variable indicating the listing occurred after label removal (the “after” dummy variable) with the number of categories variable captures the change in the effect of the number of categories from before to after label removal. This interaction shows a positive and significant effect in Model 2, indicating the penalty for multiple-category membership subsided after labels were removed, thus supporting Hypothesis 2. Once visible markers of multiple-category membership are removed, members of such groups significantly increased their ability to get a loan. Note that the positive coefficient (β = .207) of the after interaction with the number of categories is nearly equal in magnitude to the estimated penalty associated with each additional category membership before label removal (β = -.197). This suggests the penalty for multiple-category membership may wholly disappear in the absence of labels. In order to fully test this, we estimated the effects of the number
of categories solely on a subset of listings that occurred after the date of label removal using a similar logistic regression. Results, not reported here, confirm our suspicion. There was no longer any significant penalty for those members belonging to multiple-category groups after labels were removed. Notice the indicator of the time period after label removal itself is negative and significant, suggesting it was harder for all individuals to obtain a loan after September 12, 2007. To understand this further, in Model 3, we included all the covariates interacted with the variable indicating the listing occurred after labels were removed to control for any other potential coefficient changes. As Model 3 demonstrates, even controlling for all potential changes in dynamics between the periods before and after labels were removed, there is still a significant and positive change in the effect of the number of categories once the labels were removed. Note that the variable indicating the listing occurred after labels were removed is no longer significant, meaning any potential changes in the overall ability to get a loan in the latter time period have been captured by the included interaction variables. In fact, most of the interactions are not significant, suggesting there were no other major changes associated with the timing of the label change. The significant and negative coefficients of the interaction of the period dummy with the C, D, and E credit rating variables are in the same direction as the estimated negative main effects. This suggests those with poorer credit ratings found it even more difficult to obtain funding after label removal. We also studied the hazard of a listing obtaining a first bid. This suggested the use of an event history model (Tuma and Hannan 1984; Allison 1984). Hazard models estimate the instantaneous likelihood of an event at time t – in this case, obtaining a
first bid – provided the listing has “survived” until time t without any bids. Formally we estimate,
$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} \qquad (2.2)$$
where T is a random variable representing the time to receiving a first bid, t denotes the amount of time (in days) that listing i has been available to be bid upon, and Pr(.) represents the probability of receiving a bid over the interval (t, t+Δt) given the listing had not received one yet. The hazard model estimates the effect of category labels using all available data, including the 33% of the listings that were censored, meaning they did not receive any bids. We adopt the piecewise exponential specification, which allows the base rate of receiving a bid to vary flexibly with how long a listing is available. In particular, this approach splits time into pieces according to the number of days the listing has been posted. The baseline failure rate remains constant within each time piece, but these base rates can vary across pieces. As a result, the piecewise model does not require any strong assumption about the exact form of duration dependence (Barron, West and Hannan 1994). The P pieces are defined according to break points:

$$0 \le \tau_1 \le \tau_2 \le \cdots \le \tau_P \qquad (2.3)$$

with $\tau_{P+1} = \infty$ or, in this case, 23 days, the longest a listing was available. Exploratory analyses showed that most listings received bids early, and the likelihood of ever getting a bid decreased the more time had passed. In this case, we identified individual daily timepieces for the first three days, then the next two days, followed by
a four-day interval (1, 2, 3, 5, 9; with intervals open on the right). The results are robust across more or less stringent timepieces. The rate at which a listing gets its first bid, r(u,t), is a function of how long the listing has been posted, u, a vector of covariates of interest, X′, and a vector of the listing member's individual control covariates, Z′:
$$r_i(u,t) = \exp\left[\alpha_u + \mu_i + \beta_i X'_{it} + \gamma_i Z'_{it}\right] \qquad (2.4)$$

where $\alpha_u$ represents the set of duration-specific effects, $\mu_i$ represents the constant term for the i'th listing member, and $\beta_i$ and $\gamma_i$ represent the estimated coefficients of the covariates. Table 2.4 reports results of the piecewise exponential hazard models of a loan request receiving a first bid. We utilized the same specification strategy described above, pooling all estimates, using a dummy to represent listings posted after labels were removed, and an interaction of this indicator with the number of categories. Model 1 estimates the base model with the main effect of the number of categories. Model 2 includes the dummy indicator as well as the interaction with the number of categories. Model 3 then includes all the other covariate interactions with the dummy variable indicating the listing was posted after labels were removed. In all models, control variables behaved as expected.

[insert Table 2.4 about here]

In support of Hypothesis 1, the main effect of number of categories in Model 1 is negative and significant. The greater the number of categories a group is associated with, the lower the rate of obtaining a first bid and the longer members have to wait to get a first bid when category labels are visible. Each additional category a group has
listed adds 14.9% (1-exp[-0.161]) more days (~3.6 hours) until the member's loan request attracts its first bid. However, this penalty subsides in the period after removal of the category labels, as shown by the positive and significant interaction of the number of categories variable with the period indicator in Model 2. In support of our second hypothesis, we find a significant improvement from before to after label removal in the fortunes of members of multiply labeled groups. To calculate the total effect of belonging to one more category after label removal, the main effect is summed with the coefficient of the after interaction variable (-.161 + .117 = -.044). The result is a lingering slight negative effect, which translates to approximately 4.3% (1-exp[-0.044]) more days (~1.1 hours) until a listing gets a first bid for each additional category with which a group is affiliated. Model 3 attempts to control for any other changes that may have occurred after label removal by interacting the period dummy with every other covariate. Congruent with our hypotheses, the effect of the number of categories is still significant and negative, while the effect of the interaction with the dummy indicating listings that were posted after labels were removed is positive and significant. The lack of any significant interaction terms suggests there were no other changes to the dynamics of the borrowing and lending processes from before to after the labels were removed, increasing our confidence that only the label visibility played a role. Additionally, the period dummy becomes insignificant, indicating that any differences between the periods before and after label removal have been captured in other covariates. As mentioned above, over 85% of the groups that had members listing a potential loan before the removal of the labels also had members listing after the
removal. Because we are unable to run a fixed-effects specification with a piecewise hazard model, readers may question how this 15% difference affected the estimates of the before (control) and after (treatment) conditions. If the two populations are not identical, it would be difficult to conclude the treatment was what caused the observed effect. To address this possible difference in the two conditions, we ran the hazard model solely on the population of groups that appeared both before and after the removal of labels. This ensured the control and treatment conditions were applied to an identical population of subjects. The findings from this analysis, not reported for brevity, fully corroborate the results presented above on the full population. As a final robustness check, we also estimated Poisson models predicting the expected number of days until a listing obtained its first bid, using only those listings that received any bids at all. In contrast to the event history models presented earlier, these models exclude the listings that did not receive any bids. Recall that 33% of the listings examined did not receive even one bid, leaving us to question whether this was what drove the results. Thus, the Poisson models provide additional assurance that the delayed bidding effects we found above were not driven by listings not getting any bids at all. Results, not reported for brevity, concur with these previous findings.

DISCUSSION

One of the principal sociological contributions to our understanding of markets is the idea that rewards are in part determined by one's social position rather than being solely a function of individual attributes, such as preferences, skills, and effort (cf. White 1970; Podolny 1993; Sorensen 1996; Gould 2002). Extant research in economic sociology as well as strategic management shows that actors fare poorly
when they attempt to spread their efforts broadly across categories. This multiple-category or diversification penalty has been explained by two main streams of arguments. One focuses on the negative evaluations audience members may give to offerings that do not fit into the prevailing cognitive framework. Because they seem confusing, evaluators choose to discount or ignore them. A second line of thought is more functional in nature and relates to the difficulties generalist producers have putting forth offerings of a quality level that consistently meets the caliber of specialists. Our use of a natural experiment allowed us to more clearly disentangle the effect of having an identity that is difficult to comprehend from the effect of underlying attributes. Our findings are consistent with the extant literature in cognitive psychology, which suggests that ambiguous objects often arouse confusion. We showed that multiple-category labels operate as a causal force for devaluation even when controlling for unmeasured differences between multiple-category members and their peers whose identities are more easily comprehended by virtue of their belonging to fewer categories. Members of multiple categories are penalized merely for being branded as such. The difficulty multiple-category group members faced in obtaining a loan was eliminated after label removal. This suggests the mechanism was likely to be perceptual and underscores the importance of cognitive considerations related to identity and evaluation in economic and market-based contexts. However, the effect of multiple-category membership on the hazard of getting a first bid was not completely eliminated after label removal. We have two theories as to why this was the case. First,
there could be a memory effect, whereby the audience may have recognized the group to which a member belonged. Even though category labels were no longer visible, residual confusion towards the group may have affected a member's chances in the small time window in which we tested the second hypothesis. Second, there may have been some actual differences in the types of members who chose to be in multiple-category groups, thereby affecting how the groups were still perceived. For example, differential sorting of members may have occurred: perhaps those with poorer credit ratings ended up in groups that seemed more inclusive because of their multi-category labels. This may account for the residual negative effect we observed. However, in running the (unreported) Poisson models estimating the number of days it took a listing to receive its first bid, we utilized a fixed-effects specification to control for potential time-invariant group differences. The results of this analysis demonstrated that the penalty was completely removed along with the labels, suggesting this was not the case. Regardless, all the analyses showed a significant decrease in the penalty faced by multiple-category members after category labels no longer highlighted their ambiguous positions in the category structure. This indicates that cognitive factors drive at least part of the penalty for actors who span categories. We believe the findings from this particular setting represent a conservative estimate of the effect of being labeled a member of multiple categories. First, the categories in our setting were descriptors group leaders chose from drop-down boxes to label their groups. A priori, they do not appear to be the powerful, institutionalized forces with a myriad of mechanisms demanding conformity that many researchers (e.g.
Meyer and Rowan 1977; DiMaggio and Powell 1983) suggest social categories represent. Yet, even in a setting where transgression of the category structure might not be expected to entail severe social sanctions, we found strong effects stemming from cognitive processes inherent in evaluation. In other settings, our findings suggest category labels are likely to exacerbate any underlying differences in quality, motivations, or other attributes. This underscores the fact that category labels accentuate and amplify identity. Second, as mentioned above, it would be reasonable to believe frequent lenders might have been able to recall those groups that belonged to several categories for some time after labels were removed. This would suggest the penalty for multiple-category membership may have still been evident in some bidders' actions after label removal, which would work against finding any difference in the effect of multiple-category membership before and after labels were removed. Thus, the fact that we found a difference before and after labels were removed speaks to the power of labels. Overall, these findings suggest several areas for future research. While our research captured an average devaluation across all multiple-category members, it seems reasonable to suppose the extent to which multiple-category members are punished depends upon the strength of the particular categorical boundaries they are attempting to straddle. While others have shown that multiple-category members are viewed particularly harshly when category pairs are considered oppositional (e.g., Carroll and Swaminathan 2000), the size of the penalty may also be a function of the perceived similarity of the two categories. The cognitive burden of combining several
very dissimilar categories should be greater than that of combining relatively similar categories, perhaps leading to greater devaluation. While current methods do not suggest a sound way of discerning pair-wise similarity among the 1,552 labels in our dataset, we believe this is a fruitful area to pursue in future research. Another line of extension might explore whether multiple-category membership triggers only confusion, as shown here, or whether it sparks additional negative perceptions, such as the notion that multiple-category members are dilettantes (Durkheim 1893) or are less committed relative to specialists. In addition, while our work has focused on documenting the challenges broadly faced by those who do not fit into the prevailing categorization system, it would be useful to explore whether there are different effects depending upon the way in which one is perceived not to fit. Taking the example highlighted earlier, in what contexts does an audience view a combination of labels such as “entrepreneur” and “Silicon Valley” as more or less inclusive? Another question is whether it is worse for a social actor to not fit into any category or to fit into too many. For example, is a film classified as horror and comedy seen as being outside the category system altogether, or is it seen as attempting to straddle two potentially incompatible genres? Our data did not allow for direct testing of the particular type of negative perceptions involved in devaluing multiple-category members, but future research should explore this by measuring impressions of multiple-category members in a more nuanced manner.
FIGURE 2.1 SCREEN SHOT OF PROSPER LOAN REQUEST
FIGURE 2.2 SCREEN SHOT OF GROUP PAGE WITH CATEGORY LABELS VISIBLE
FIGURE 2.3 SCREEN SHOT OF GROUP PAGE AFTER CATEGORY LABELS WERE REMOVED
TABLE 2.1 SUMMARY STATISTICS
OBSERVATIONS = 13,835

VARIABLE                      MEAN      STD. DEV.   MIN    MAX
Full Funded Listing (=1)      0.1334    0.3400      0      1
Days to First Bid             5.6144    3.9288      1      23
Got Any Bids (=1)             0.6732    0.4690      0      1
Debt To Income Ratio          0.4897    1.2883      0      10.01
Is a Homeowner (=1)           0.3470    0.4760      0      1
Num Current Delinquent        3.8192    4.9498      0      58
AA Credit Rating              0.0175    0.1313      0      1
A Credit Rating               0.0248    0.1557      0      1
B Credit Rating               0.0510    0.2200      0      1
C Credit Rating               0.1082    0.3106      0      1
D Credit Rating               0.1671    0.3731      0      1
E Credit Rating               0.2044    0.4033      0      1
HR Credit Rating              0.4266    0.4946      0      1
Income Category 0             0.0002    0.0170      0      1
Income Category 1             0.0393    0.1945      0      1
Income Category 2             0.1637    0.3700      0      1
Income Category 3             0.4469    0.4971      0      1
Income Category 4             0.2199    0.4142      0      1
Income Category 5             0.0703    0.2557      0      1
Income Category 6             0.0501    0.2182      0      1
Number of Categories          4.3384    1.0054      1      5
After Label Removal Flag      0.4133    0.4924      0      1
TABLE 2.2 CORRELATIONS

VARIABLES                         (1)      (2)      (3)      (4)      (5)      (6)      (7)
1) Full Funded Listing (=1)       1
2) Days to First Bid(1)           -0.2842  1
3) Got Any Bids (=1)              0.2656   -0.5778  1
4) Debt To Income Ratio           -0.0154  0.0143   -0.0156  1
5) Is a Homeowner (=1)            0.0934   -0.0897  0.0773   -0.0165  1
6) Number Current Delinquent      -0.1828  0.1390   -0.1347  -0.0447  -0.0804  1
7) AA Credit Rating               0.1741   -0.0809  0.0767   0.0095   0.1221   -0.0978  1
8) A Credit Rating                0.1831   -0.0876  0.0875   0.0309   0.0825   -0.1115  -0.0214
9) B Credit Rating                0.1737   -0.0984  0.1006   0.0216   0.1035   -0.1504  -0.0310
10) C Credit Rating               0.1548   -0.1126  0.1191   0.0411   0.1312   -0.1785  -0.0466
11) D Credit Rating               0.0407   -0.0647  0.0821   0.0121   0.0334   -0.1484  -0.0599
12) E Credit Rating               -0.0893  0.0074   -0.0075  -0.0254  -0.0030  0.0023   -0.0678
13) HR Credit Rating              -0.2362  0.2063   -0.2232  -0.0361  -0.2096  0.3502   -0.1153
14) Income Category 0             -0.0067  0.0071   -0.0063  0.0293   -0.0124  0.0084   -0.0023
15) Income Category 1             -0.0205  0.0173   -0.0285  -0.0770  -0.0220  -0.0490  0.0606
16) Income Category 2             -0.0352  0.0483   -0.0646  0.3805   -0.1306  -0.0136  -0.0264
17) Income Category 3             -0.0552  0.0340   -0.0160  -0.1301  -0.0878  0.0613   -0.0737
18) Income Category 4             0.0339   -0.0323  0.0380   -0.0956  0.0751   0.0016   0.0060
19) Income Category 5             0.0600   -0.0430  0.0397   -0.0580  0.1284   -0.0330  0.0471
20) Income Category 6             0.0802   -0.0864  0.0718   -0.0539  0.1496   -0.0602  0.0953
21) Number of Categories          -0.0194  0.1060   -0.0670  0.0214   0.0382   -0.0434  0.0108
22) After Labels Removed (=1)     -0.0561  0.1881   -0.0309  -0.0203  0.0380   0.0135   -0.0094

VARIABLES                         (8)      (9)      (10)     (11)     (12)     (13)     (14)
9) B Credit Rating                -0.0370  1
10) C Credit Rating               -0.0556  -0.0808  1
11) D Credit Rating               -0.0715  -0.1039  -0.1561  1
12) E Credit Rating               -0.0810  -0.1176  -0.1766  -0.2272  1
13) HR Credit Rating              -0.1378  -0.2000  -0.3005  -0.3865  -0.4374  1
14) Income Category 0             -0.0027  -0.0039  -0.0059  0.0038   0.0019   0.0025   1
15) Income Category 1             0.0631   0.0358   0.0335   0.0148   -0.0373  -0.0537  -0.0034
16) Income Category 2             -0.0054  -0.0041  -0.0070  -0.0271  -0.0718  0.0938   -0.0075
17) Income Category 3             -0.0511  -0.0519  -0.0717  -0.0112  0.0456   0.0751   -0.0153
18) Income Category 4             -0.0108  -0.0097  0.0195   0.0160   0.0315   -0.0439  -0.0090
19) Income Category 5             0.0051   0.0467   0.0380   0.0245   0.0028   -0.0795  -0.0047
20) Income Category 6             0.0845   0.0551   0.0639   0.0035   -0.0040  -0.1159  -0.0039
21) Number of Categories          0.0187   0.0503   0.0612   0.0578   -0.0295  -0.0892  -0.0015
22) After Labels Removed (=1)     -0.0143  -0.0025  -0.0046  0.0032   0.0512   -0.0332  -0.0143

VARIABLES                         (15)     (16)     (17)     (18)     (19)     (20)     (21)
16) Income Category 2             -0.0896  1
17) Income Category 3             -0.1821  -0.3978  1
18) Income Category 4             -0.1075  -0.2349  -0.4774  1
19) Income Category 5             -0.0557  -0.1217  -0.2473  -0.1461  1
20) Income Category 6             -0.0465  -0.1017  -0.2066  -0.122   -0.0632  1
21) Number of Categories          0.0298   0.0003   -0.0277  -0.002   0.0039   0.0336   1
22) After Labels Removed (=1)     -0.0311  -0.0441  -0.0091  0.0366   0.0171   0.0102   0.0977
TABLE 2.3 LOG-ODDS LISTING WILL BE FULLY FUNDED (MAXIMUM-LIKELIHOOD BINOMIAL LOGISTIC REGRESSION)

VARIABLES                               MODEL 1             MODEL 2             MODEL 3
Debt To Income Ratio                    -0.111*** (0.027)   -0.142*** (0.031)   -0.114*** (0.034)
Is a Homeowner (=1)                     -0.108 (0.060)      0.011 (0.067)       0.062 (0.083)
Number Current Delinquencies            -0.087*** (0.011)   -0.062*** (0.011)   -0.054*** (0.014)
A Credit Rating(1)                      -0.173 (0.174)      0.026 (0.199)       0.238 (0.249)
B Credit Rating(1)                      -0.775*** (0.157)   -0.522** (0.179)    -0.412 (0.224)
C Credit Rating(1)                      -1.222*** (0.148)   -1.019*** (0.171)   -0.762*** (0.212)
D Credit Rating(1)                      -1.917*** (0.150)   -1.686*** (0.173)   -1.383*** (0.215)
E Credit Rating(1)                      -2.786*** (0.159)   -2.671*** (0.186)   -2.303*** (0.229)
HR Credit Rating(1)                     -3.273*** (0.162)   -2.979*** (0.187)   -2.834*** (0.232)
Income Category 1(2)                    0.094 (0.470)       -0.217 (0.516)      -0.153 (1.081)
Income Category 2(2)                    1.123* (0.450)      1.001* (0.493)      1.092 (1.062)
Income Category 3(2)                    1.198** (0.446)     0.998* (0.488)      1.167 (1.059)
Income Category 4(2)                    1.379** (0.447)     1.247* (0.489)      1.417 (1.061)
Income Category 5(2)                    1.470** (0.453)     1.394** (0.495)     1.623 (1.065)
Income Category 6(2)                    1.333** (0.455)     1.360** (0.498)     1.553 (1.068)
Number of Categories                    -0.197*** (0.026)
After Labels Removed (=1)                                   -1.054*** (0.310)   0.268 (1.277)
After*Number of Categories                                  0.207** (0.069)     0.201** (0.071)
After*Debt To Income Ratio                                                      -0.096* (0.047)
After*Is a Homeowner (=1)                                                       -0.145 (0.138)
After*Number Current Delinquencies                                              -0.022 (0.023)
After*A Credit Rating(1)                                                        -0.565 (0.401)
After*B Credit Rating(1)                                                        -0.280 (0.360)
After*C Credit Rating(1)                                                        -0.710* (0.344)
After*D Credit Rating(1)                                                        -0.839* (0.349)
After*E Credit Rating(1)                                                        -1.068** (0.376)
After*HR Credit Rating(1)                                                       -0.384 (0.372)
After*Income Category 1(2)                                                      0.702 (1.272)
After*Income Category 2(2)                                                      -0.256 (1.208)
After*Income Category 3(2)                                                      -0.530 (1.207)
After*Income Category 4(2)                                                      -0.516 (1.210)
After*Income Category 5(2)                                                      -0.727 (1.220)
After*Income Category 6(2)                                                      -0.588 (1.224)
Constant                                0.192 (0.477)
Observations                            13835               13226               13226
Likelihood Ratio Chi2 (df)              2034.57 (17)        1208.35 (17)        1237.30 (32)
Log Likelihood                          -4417.86            -3563.51            -3549.04
Number of Groups                                            169                 169
  Minimum                                                   2                   2
  Average                                                   78.3                78.3
  Maximum                                                   2420                2420

* p < 0.05, ** p < 0.01, *** p < 0.001, two-tailed test
Notes: Standard errors in parentheses. (1) Compared to an individual with AA Credit Rating. (2) Compared to an individual with Income Category 0 (Not Listed); there were no listers with Income Category 7. Models 2 and 3 have fewer observations because some groups had no variance in results and were therefore dropped (609 observations in 71 groups).
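The percentage interpretations of the logit coefficients reported in the text can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the helper name is hypothetical, and the coefficients are copied from Table 2.3 (Model 1 main effect, Model 2 interaction).

```python
import math


def pct_change_in_odds(coef: float) -> float:
    """Percent change in the odds for a one-unit increase in a covariate."""
    return (math.exp(coef) - 1.0) * 100.0


b_num_categories = -0.197        # Model 1: Number of Categories
b_after_x_categories = 0.207     # Model 2: After*Number of Categories

# Penalty per additional category while labels were visible.
print(round(pct_change_in_odds(b_num_categories), 1))  # -17.9

# Rough comparison of the two magnitudes, as done in the text. The two
# coefficients come from different models, so the sum is indicative only.
net = b_num_categories + b_after_x_categories
print(round(net, 3))  # 0.01
```

A net coefficient this close to zero is what motivates the separate check on post-removal listings described in the text.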
TABLE 2.4 LISTING GETTING A FIRST BID IN DAYS (3) (PIECEWISE CONSTANT HAZARD RATE MODEL) VARIABLES
MODEL 1 ***
MODEL 2 *
MODEL 3
Timepiece 1                         -0.663 (0.152)      -0.371 (0.155)      -0.630* (0.257)
Timepiece 2                         -1.573*** (0.154)   -1.266*** (0.158)   -1.525*** (0.258)
Timepiece 3                         -2.322*** (0.159)   -2.012*** (0.162)   -2.270*** (0.261)
Timepiece 4                         -2.288*** (0.155)   -1.977*** (0.159)   -2.234*** (0.259)
Timepiece 5                         -1.697*** (0.153)   -1.380*** (0.157)   -1.637*** (0.257)
Timepiece 6                         -1.102*** (0.155)   -0.773*** (0.158)   -1.028*** (0.258)
Debt To Income Ratio                -0.011 (0.009)      -0.013 (0.009)      -0.012 (0.010)
Is a Homeowner (=1)                 0.018 (0.023)       0.028 (0.023)       0.033 (0.030)
Number Current Delinquencies        -0.011*** (0.003)   -0.010*** (0.003)   -0.012*** (0.003)
A Credit Rating (1)                 -0.009 (0.087)      0.006 (0.087)       0.061 (0.109)
B Credit Rating (1)                 -0.167* (0.078)     -0.153* (0.078)     -0.111 (0.098)
C Credit Rating (1)                 -0.261*** (0.073)   -0.241*** (0.073)   -0.157 (0.092)
D Credit Rating (1)                 -0.456*** (0.072)   -0.433*** (0.072)   -0.345*** (0.091)
E Credit Rating (1)                 -0.678*** (0.072)   -0.651*** (0.072)   -0.622*** (0.092)
HR Credit Rating (1)                -0.947*** (0.072)   -0.931*** (0.072)   -0.844*** (0.091)
Income Category 1 (2)               0.252 (0.140)       0.178 (0.140)       0.299 (0.246)
Income Category 2 (2)               0.446*** (0.131)    0.385** (0.131)     0.583* (0.238)
Income Category 3 (2)               0.565*** (0.129)    0.510*** (0.129)    0.702** (0.237)
Income Category 4 (2)               0.628*** (0.130)    0.579*** (0.130)    0.753** (0.238)
Income Category 5 (2)               0.625*** (0.134)    0.586*** (0.134)    0.845*** (0.241)
Income Category 6 (2)               0.781*** (0.136)    0.738*** (0.136)    0.976*** (0.243)
Number of Categories                -0.120*** (0.010)   -0.161*** (0.012)   -0.162*** (0.012)
After Labels Removed (=1)                               -0.746*** (0.091)   -0.254 (0.335)
After*Number of Categories                              0.117*** (0.020)    0.117*** (0.020)
After*Debt To Income Ratio                                                  -0.002 (0.015)
After*Is a Homeowner (=1)                                                   -0.017 (0.047)
After*Number Current Delinquencies                                          0.003 (0.005)
After*A Credit Rating (1)                                                   -0.142 (0.181)
After*B Credit Rating (1)                                                   -0.121 (0.161)
After*C Credit Rating (1)                                                   -0.226 (0.151)
After*D Credit Rating (1)                                                   -0.238 (0.148)
After*E Credit Rating (1)                                                   -0.107 (0.149)
After*HR Credit Rating (1)                                                  -0.246 (0.149)
After*Income Category 1 (2)                                                 -0.067 (0.313)
After*Income Category 2 (2)                                                 -0.316 (0.291)
After*Income Category 3 (2)                                                 -0.296 (0.290)
After*Income Category 4 (2)                                                 -0.255 (0.292)
After*Income Category 5 (2)                                                 -0.454 (0.299)
After*Income Category 6 (2)                                                 -0.402 (0.302)
Observations                        51095               51095               51095
Subjects                            13835               13835               13835
Failures                            9315                9315                9315
Total Time at Risk                  77676               77676               77676
Wald chi2 (df)                      36972.22 (22)       36843.14 (24)       36826.18 (39)
Log-Likelihood                      -17610.03           -17529.07           -17519.07
* p < 0.05, ** p < 0.01, *** p < 0.001, two-tailed test. Note: Standard errors in parentheses. (1) Compared to an individual with AA Credit Rating. (2) Compared to an individual with Income Category 0 (Not Listed); there were no listers with Income Category 7. (3) Bids same day as listing day are coded 1, day after = 2, etc.
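As a reading aid for the interaction rows, the sketch below adds the After*Number of Categories coefficient to the Number of Categories main effect, using the values in the third column of the table. The exponentiated figures assume the coefficients sit on a log scale (the Subjects, Failures, and Total Time at Risk rows suggest an event-history specification, but this excerpt does not state the model), so they should be read as illustrative only. The function and variable names are hypothetical.

```python
import math

# Coefficients copied from the third column of the table above.
NUMBER_OF_CATEGORIES = -0.162         # Number of Categories
AFTER_X_NUMBER_OF_CATEGORIES = 0.117  # After*Number of Categories


def category_coefficient(after_labels_removed: bool) -> float:
    """Coefficient on one additional category, before or after labels were removed."""
    interaction = AFTER_X_NUMBER_OF_CATEGORIES if after_labels_removed else 0.0
    return NUMBER_OF_CATEGORIES + interaction


before = category_coefficient(False)
after = category_coefficient(True)

# Assumption: the coefficients are on a log scale, so exp() gives the
# multiplicative change in the modeled rate per additional category.
print(f"before: coefficient {before:.3f}, multiplier {math.exp(before):.3f}")
print(f"after:  coefficient {after:.3f}, multiplier {math.exp(after):.3f}")
```

Under these assumptions the penalty per additional category shrinks toward zero once the interaction term is added, which is the pattern the two Number of Categories rows describe.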
CHAPTER 3 SEQUENTIAL CATEGORY MEMBERSHIP
Researchers have recently focused their attention on the finding that social mechanisms of categorization lead candidates who straddle multiple categories to suffer a discount (Zuckerman 1999; Zuckerman et al 2003; Rao, Monin, and Durand 2005; Hsu 2006; Hsu, Hannan, and Koçak 2009). Extending the institutional view (Meyer and Rowan 1977; DiMaggio and Powell 1983), Zuckerman (1999) illustrated how the occupant of a candidate role tries to curry favor or recognition from an audience that holds the power to grant needed resources. Candidates needed to “look right” to the resource holder to be considered for evaluation. When candidates straddle multiple categories, they often fail to display the necessary characteristics of any one category for audiences to adequately understand them. They risk being ignored. Hannan, Pólos, and Carroll (2007: 108) suggest that these penalties for irregular classificatory membership may arise because a “lack of representativeness lowers confidence.”
Despite this detriment, both organizations and individuals are often pressured to extend beyond their initial province. Organizations, facing pressures to grow, often attempt to diversify from their initial lines of business (Rumelt 1974). Competitors encourage expansion by mimicking each other’s market entry decisions (Haveman 1993). Economies of scope lead organizations to accumulate product lines in the hope of exploiting expertise (Teece 1980) or forgoing contractual complexities (Williamson 1979). Sometimes, client expectations push firms to take on potentially disparate lines of business (Phillips and Zuckerman 2001). Individuals often engage
across disparate domains as well. The market for CEOs drives turnover and succession of chief executives (Ocasio 1994). Successful entrepreneurs are seen to result from affiliations with more disparate social cliques (Burt 1992), which allow the focal actor access to complementary information. More experienced film actors are encouraged to demonstrate their skills by expanding their repertoire into multiple genres (Zuckerman et al 2003).
This highlights a shortcoming of the theoretical approach to multi-category membership to date: it cannot distinguish more from less successful category spanners – in short, whether they are dilettantes or Renaissance men. The line between the two could be drawn more distinctly if investigators were able to disambiguate why some candidates, given a fixed multiple-category portfolio of experiences, are more successful than others. Without a theory for why category spanners are not all similarly disadvantaged, we are left to assume a social actor’s success in working broadly is due to chance.
The latest literature has hinted at the potential existence of successful polymaths (Zuckerman et al 2003; Ferguson 2009). However, the idea of a Renaissance man, able to successfully compile experiences across disparate categories without audience censure, proves to be theoretically elusive. For example, Zuckerman and colleagues’ (2003) investigation of film actors stops short of highlighting factors which contribute to successful generalists and instead shows how the penalty for unfocused actors was lessened for veterans. This view is unable to explain how two equally experienced actors, who have worked in identical genres, may be more or less successful.
One reason for this is that a cross-sectional view of a candidate’s experiences ignores the importance of how their past history may contribute to their present identity. Most of the work on multiple-category membership to date has implicitly suggested a candidate’s identity is an amalgam of their contemporaneous affiliations. For example, Zuckerman (2000) measured the coherence of an organization’s breadth of industries by examining its current portfolio of businesses. Rao, Monin, and Durand (2005) concluded that the identity of a French chef was indicated by the signature dishes they offered. Negro, Hannan, and Rao (2009a, 2009b) utilized the portfolio of wines produced by a winemaker in a given vintage as a measure of their membership in a certain winemaking style that year. These studies suggest an audience’s evaluation of a candidate’s multiple-category identity can be derived by collapsing their actions into one point in time and ignoring their history. I suggest the historic order in which a social actor accumulated their categorical experiences should matter.
Historical context is of interest to social scientists because the consequences of past actions are often reflected in future outcomes. For example, Stinchcombe (1965) observed how founding conditions of organizations impact their future social structures. This was termed “organizational imprinting.” Baron, Hannan, and Burton (1999) verified this by showing how startups in Silicon Valley which began with higher administrative efforts eventually became more bureaucratic. Phillips (2005) uncovered evidence of the persistence of gender inequality in Silicon Valley law firms as the result of routines transferred from the parent firms of founders. Scholars who study labor markets have identified historic effects on individuals. Sørensen (2000) demonstrated how a team’s composition over
time contributes to the likelihood of an individual member leaving in the future. Zuckerman and colleagues (2003) studied the career histories of film actors and showed that those with past acting experiences that were less concentrated in a genre were less likely to obtain future work.
This paper links these two streams of literature, multiple-category membership and history dependence, by asking the question: Are there more or less acceptable sequences of historic categorical membership? Sociologists who study career progressions (Lawrence and Tolbert 2007; Abbott 1995; Abbott and Hrycak 1990; Kalleberg and Hudis 1979) motivate this paper by examining past patterns of career movements and their effect on future outcomes. For example, Wilensky (1961) identified people with career patterns of more or less orderly progress and demonstrated the effect of this orderliness on the individual’s level of social participation. Lawrence and Tolbert (2007) hypothesize that typical career sequences are normatively established through repeated observations. Abbott and Hrycak (1990) find that typical career sequences of musicians in 19th century Germany can be identified and codified. Blair-Loy (1999) finds that, despite persistent beliefs among women in finance-executive careers that their success is random and accidental, their achievements are increasingly the result of a patterned career trajectory.
This paper takes the stance that candidates who have sequentially moved between categories separated by stronger boundaries will suffer a disadvantage. By boundaries, I mean how closely associated categories are with one another. In this sense, I intuit a “cognitive distance” between two categories. A candidate’s past sequence of experiences serves as a cue or signal (Spence 1973) to potential evaluators.
Because categories necessarily delineate potential difference, they lead audiences to examine a candidate’s past through such a prism of categorical variation. Categories become evaluative in this instance. Individuals often prefer narratives when attempting to develop an understanding. They also see patterns of behavior when none may exist. When an audience attempts to construct a narrative to interpret a candidate’s identity, a candidate with an unpredictable or unfamiliar sequence of past categorical experiences will likely be perceived as erratic – a dilettante. On the other hand, candidates who move between categories that are more cognitively proximate will be inferred to be more deliberate in their actions and will seem more conscientious.
This paper addresses yet another understudied aspect of categories – the relationships among them. Studying the existence of successful multiple-category straddlers requires an understanding of how the categories spanned may be related. This is because combinations of some categories may be more acceptable than others depending on their relationships. Zuckerman (2000) investigates this by identifying related versus unrelated business segments residing within firms. Pontikes (2009) measures the leniency of a category by examining how many other categories its members straddle. Hannan and colleagues (2007) theorize on the distinction between nested and non-nested categories. However, sustained empirical investigation of how a mélange of categories relate to one another, and what this entails for category members and their evaluation, has yet to appear. This paper attempts to develop this line of work by recognizing the varying distances separating categories and suggesting the sequence of moves between them as relevant.
I situate my investigation in the context of an online marketplace for services, www.elance.com. Elance.com assists users in finding and hiring independent professionals and small businesses on a contract basis. Sellers of freelancing services bid on jobs posted on the website by potential buyers of this temporary work. Questions of credibility should be prevalent in this virtual context.
This inquiry proceeds as follows. In the first section, I develop the theoretical background which underlies our conceptions of categories. I explain multiple-category membership and then propose a theory of how behavior which may be construed as more erratic is disadvantageous. I develop a refutable hypothesis and then test it using data on all transactions conducted from 1999 to 2004. I attempt to empirically distinguish this effect from extant sociological explanations. Results of three triangulating analyses are detailed. Concluding remarks are then made.
MULTIPLE CATEGORIES AND META-SCHEMA
Categories help us make sense of the world (Rosch 1973; Smith and Medin 1981; Murphy 2004). Without categories, each object we encounter would be seen anew and it would require extensive cognitive effort for us to comprehend each one. Instead, individuals, through repeated interactions, learn to group similar objects together. These clusters of like objects are then labeled and henceforth utilized as a category. Drawing categorical boundaries serves to lump and separate items and is essentially a social act. By placing an object in a category, the set of characteristics by which objects in that category are identified is highlighted (Zerubavel 1997). Merely grouping items serves to increase the distinction between category objects. In markets, these classification systems serve to demarcate groups of similar items – be it services
or products. This helps potential buyers sort and seek. Just as the government-controlled SIC codes in Zuckerman’s (1999) examination of the illegitimacy discount identified how financial analysts corralled similar firms for comparison, websites enact a classification system into which products and services offered have to fit in order to be found by buyers. Categories, in these instances, are predetermined.
From a sociological standpoint, categories serve to develop our expectations of actors who claim membership or are designated in them (Hannan et al 2007). When a social actor is unable to fulfill the expectations of their purported category membership, they are generally disadvantaged. Investigating this, Zuckerman (1999) found that companies which did not garner adequate financial analyst attention, due to their misfit with recognized industry groups, were ignored. This resulted in a lower stock price.
Developing a theory of how the historic movement of social actors between multiple categories matters requires an understanding of the relationship between those categories. I follow the natural intuition that the strength of the boundary between any two categories may vary. Boundary strength is evidenced by the difficulty a social actor has in moving between categories. This varies depending on the pair of categories being spanned. This intuition is reported in Zuckerman and colleagues’ (2003) investigation of film actors moving across movie genres. They find evidence which corroborates casting directors’ view of the “comedy/drama boundary as particularly difficult to cross” (Zuckerman et al 2003: 1066). Notice that boundary porousness is particular to categorical pairs. Drama and Comedy may have a strong boundary between them, while the boundary between
Comedy and Romance could be weaker. This matters when we examine multiple-category membership because the relationship between categories may drive how an audience views actors who move between them.
One way to visualize multiple categories and the strengths of relationships between them is to array them in cognitive space. Here, I conceptualize a meta-schema encompassing multiple categories (Fiske and Linville 1980). This meta-schema represents an audience’s conception of how categories may be “organized” or structured relative to one another. Figure 3.1 represents a hypothetical meta-schema that an audience may hold of how categories ‘A’ through ‘H’ are structured in terms of the boundary strengths between them. The physical distance between the categories represents the strength of the boundary – the greater the distance between two categories in this meta-schema space, the stronger the boundary; the smaller the distance, the weaker the boundary.
[insert Figure 3.1 about here]
The space, or distance, represents the familiarity an audience may hold of a category pair – hence “cognitive distance.” The relationship between any two categories will vary. For example, if instances of category ‘A’ are observed or paired with objects in category ‘C’, the smaller space between them represents the fact that this pair will seem more familiar to an audience. This may be because social actors often move between the two. Contrast this to the difficulty an audience will have in understanding or recognizing pairings of objects in category ‘A’ and category ‘H’. Referring back to the movie genre example, a movie which paired elements from the horror genre with the romance genre may lead audiences to be dismissive or
at least confused. I suggest that because horror movies and romance movies are likely very cognitively distant, pairing elements of both in a movie will not be a familiar combination to an audience.
ERRATIC JOB SEQUENCES
This paper theorizes on the intuition an audience may hold when observing a social actor’s sequential past history of engagements. In markets, cues to a seller’s credibility likely include their historic performance on previous transactions. For example, the rise of reputation systems in online markets such as eBay and Amazon demonstrates how past transactions serve to assist evaluation by future buyers (Hsu et al 2009; Resnick et al 2006). Because these exchanges are concluded remotely, signals of trustworthiness are reduced to information available online for a buyer to infer the appropriateness of a seller.
These systems should be particularly relevant in situations which require trust. For example, when taking a car to an auto shop to complete a complex repair, there are perhaps several instances whereby decisions can be made by the mechanic to either cut corners or to choose a path which may be more costly, but ensure less difficulty for the owner in the future. In these situations the car owner should wish to choose a mechanic whom they perceive as being less likely to cut corners.
I differentiate this from the existing theory of categorical confusion (Zuckerman 1999, 2000) or principle of allocation (Hannan and Freeman 1977; Dobrev, Kim, and Hannan 2001; Hsu 2006) by suggesting that, beyond the breadth of collected categorical experiences a candidate displays, what matters is the ordered path they have taken. The sequence of how categorical experiences are accrued
will affect how an external audience perceives how committed a candidate is – in short, whether or not they are merely a dilettante. Holding a candidate’s breadth or variety of category experiences constant, the less predictable a candidate’s past job experiences seem, the less convincing an audience should find them.
By sequence, I mean the order in which past categorical experiences have been accumulated. The work on career progressions has demonstrated that the sequence in which a social actor moves through a range of previous job experiences impacts the rewards they may receive (Spilerman 1977; Kalleberg and Hudis 1979; Abbott and Hrycak 1990), their level of social participation (Wilensky 1961), or their future mobility chances (Tolbert 1982), and is also a function of the social structure (Lawrence and Tolbert 2007). For example, Wilensky identified an individual’s career progression as either orderly or disorderly. He found that those who had a more orderly job history, employees who moved between either functionally related or hierarchically ordered jobs, were more likely to have stronger attachments to formal associations and their community.
See Figure 3.2 for an illustration of two different sequential categorical paths utilizing the hypothetical meta-schema from above. The two potential candidates illustrated both portray an equal variety of past experiences in that they have the same total number and type of categories in their history of job experiences. They have both accumulated experiences in categories A, B, C, D, H, and G. Their categorical breadth, at this cross-sectional point in time, is identical. However, Candidate 1, on the left, has chosen a path between categories with weaker boundaries, as demonstrated by her sequential moves of smaller distance in meta-schema space.
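The contrast between the two paths can be made concrete with a small sketch that sums the distances of successive moves through a meta-schema. The two orderings are the ones the chapter attributes to Candidates 1 and 2; the coordinates are invented purely for illustration, since the layout of Figure 3.1 is not reproduced here, and the function name is hypothetical.

```python
import math

# Hypothetical 2-D positions standing in for the meta-schema of Figure 3.1:
# A-D sit close together, G and H form a distant pair.
META_SCHEMA = {
    "A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1),
    "G": (6, 0), "H": (6, 1),
}


def cumulative_distance(sequence, positions):
    """Sum the distances between each consecutive pair of categories."""
    return sum(
        math.dist(positions[a], positions[b])
        for a, b in zip(sequence, sequence[1:])
    )


candidate_1 = ["A", "C", "D", "B", "H", "G"]
candidate_2 = ["A", "B", "C", "G", "D", "H"]

path_1 = cumulative_distance(candidate_1, META_SCHEMA)
path_2 = cumulative_distance(candidate_2, META_SCHEMA)
print(f"Candidate 1: {path_1:.2f}  Candidate 2: {path_2:.2f}")
```

With this made-up layout both candidates cover the same six categories, yet Candidate 2 crosses the wide gap three times where Candidate 1 crosses it once, so only the ordering separates their totals.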
Contrast this to Candidate 2, on the right, who has stepped between the same number and type of categories, but has taken a path of greater cognitive distance (stronger boundaries). Their sequences differ. Candidate 1’s sequence is A, C, D, B, H, G while Candidate 2’s sequence is A, B, C, G, D, H.
[insert Figure 3.2 about here]
I define erraticism as the cumulative strength of the boundaries a social actor has moved between in the past. The greater the distance of their sequential moves between categories, the more erratic the candidate will look. This parallels the work that neo-institutionalists have done on boundary crossing. For example, Rao and colleagues (2003, 2005) have examined the movement of French chefs between the categories of Nouvelle and Classical French cuisines. They find that attempts to transgress stronger institutionalized boundaries elicit greater penalties from the Guide Michelin. However, the more often transgressions occurred, the more familiar those pairings became, the less effort critics expended on deciphering such movement, and commensurately lower penalties were inflicted.
Sequence matters because the order of movement between categories is juxtaposed in a candidate’s history and highlights these transgressions. Crossing more distant categories demonstrates greater contrast, highlighting to an audience the divergence in a candidate’s past. More proximate moves are more congruent and will elicit less discrepancy in audience understanding. I follow the lead of Rao and his fellow analysts (2005), who identified the strength of boundaries as encouraging the punishment meted out by external critics: if boundaries are symbolically potent (DiMaggio 1987), then spanning those of greater strength will be more likely to result in punishment than spanning weaker ones. In this
case, movement between more proximate categories is more recognizable and therefore less likely to be penalized. White and White (1993) suggested that the boundaries between categories “get recognized from…a frequency distribution of sets of social actions.” An audience does not necessarily impose a set belief about the appropriateness of a categorical pairing, but rather they “observe and dissect regularities” (White and White 1993: 60). Therefore, the stronger a recognized boundary between two categories, the greater a social transgression it will be recorded as and the more salient it will be as a social cue.
A sequence of moves between more proximate categories in cognitive space may be viewed as more socially acceptable to an external audience because ordered progressions make more sense. As the literature on narratives would suggest, individuals have a desire to discern stories which link events (Orbach 1997; Tilly 2002) and are more persuaded by arguments when they are wrapped in a narrative. Bruner (1990) suggests we have a need for stories to help us make sense of the social world. For example, a candidate who accumulated their experiences in a stepping-stone-like fashion may be perceived as being more committed to developing necessary skills. This means a candidate with an incremental progression of jobs may project the appearance of someone with a more planned or orderly focus in how they went about expanding their scope.
Sequence highlights a candidate’s career background. Moves between more distant categories become salient because job experiences are often listed chronologically. Experimental work on presentation order and category recognition (Garner 1953; Medin and Bettger 1994) shows that the sequence in which a subject
views cues affects their subsequent judgments regarding category membership – disorderly sequences should highlight these disparities. People generally have more difficulty making absolute judgments than relative ones. This suggests sequencing of stimuli matters. Recent work by Stewart, Brown, and Chater (2002) shows that classification of items into their appropriate categories was more accurate when the display of the item was preceded by a more distant member of an opposite category than when preceded by a member of the same category.
Sequentially spanning categories that are more distant will be more salient because the categories will be juxtaposed next to one another. Past sequences that do so will induce an audience to identify such a candidate as being erratic. The greater the total cumulative distance between the categories crossed by a candidate, the more they risk a discount because an audience is more likely to interpret such a candidate as a dilettante. Therefore, I propose the following:
PROPOSITION: Ceteris paribus, the stronger the cumulative boundaries a candidate’s historic sequence of categorical moves crosses, the less credible they will seem.
SCOPE CONDITIONS
There are several scope conditions that should be noted of the above proposed theory:
1. The object or candidate under question has a history, as opposed to those objects which only belong to multiple categories in a single instance. For example, movies can be affiliated with multiple genres, but do not accumulate them over time. On the other hand, a job candidate may move between several positions and will accumulate historic evidence of their categorical affiliations.
2. The historic affiliations of the candidate are observable by the relevant external audience. For example, resumes of job seekers identifying their job histories are reviewed by the hiring organization and film actors’ roles in a sequence of movies are evaluated by casting directors.
3. The categories in question vary in a dimension of interest vis-à-vis each other. Because I claim the order of moves between categories is useful as a cue, this requires the categories to differ from one another. In essence, there needs to be varying strengths of boundaries which separate them. If there is no variation between categories, there will likely be no boundaries to speak of, in which case movement between them will do little to signify anything to an audience.
4. Category straddling is more a norm than an exception. Having a useful sequence of categorical affiliations by which to compare one candidate to another requires that candidates regularly move between categories. Where spanning is highly unlikely, the differences in candidates’ sequences may not affect evaluation.
AN ONLINE MARKETPLACE FOR SERVICES
The context under study is an online market for freelancing services. Elance.com is a marketplace where buyers of services find and hire independent professionals and small businesses on a contract basis. Freelancers (bidders) bid on projects that buyers post to the site. Each job posting is categorized into a recognizable job category. Some examples of job categories include Website Programming, Administrative Assistance, Translation Services, and Logo Design. Elance.com was founded in 1999 and as of November 2009, there were over 27,000 jobs posted each
month and over 100,000 providers of service located worldwide. Since founding, more than $225 million worth of business has been transacted on the website, with a recent average job size over $600.
The online environment is particularly suited to examining questions of categorical membership for at least two reasons. First, it allows an investigator to examine data regarding all aspects of the remote exchange. Online website transactions record a particularly detailed view of the actions of exchange participants. Because websites need to overcome issues of information asymmetries and give confidence to their users, most transaction-based websites attempt to capture and display information they deem necessary to facilitate trade.
A second reason why an online environment is particularly suited to the examination of the effects of categorization is that the voluminous information available online necessitates a rigid classification system. As detailed below, Elance.com has incorporated an extensive classification system which facilitates sorting and finding jobs for both buyers and sellers.
Elance.com jobs are posted within only one particular job category. The job-level categories are nested within higher-level domains. There are eight domains, which include, for example, “Web and Programming,” “Design and Multimedia,” and “Writing and Translation.” There are a total of 186 job-level categories nested under these domains. For example, the “Web and Programming” high-level domain includes job categories such as Website Design, Blog Programming, and Database Development. This is the level at which jobs are listed on the website, organized, searched for, and bid on by sellers. Each job can only be listed under one job category.
See Appendix C for a full list of domains and the categories nested below them. Figure 3.3 presents a sample job listing.
[insert Figure 3.3 about here]
Once a job is listed, freelancers bid on it. A bid includes the price at which the bidder is willing to complete the stated task. In making a decision to choose a bidder, buyers have access to the freelancer’s historic profile. As with other online markets, a bidder’s complete history of their past jobs and the feedback they have received are available for a potential buyer to evaluate. Notice the bidder’s list of past jobs is presented chronologically, with information regarding the categories in which the completed jobs were listed. See Figure 3.4 for an example listing of a bidder’s past jobs viewable by a buyer.
[insert Figure 3.4 about here]
Category spanning on Elance.com is prevalent among sellers with increasing experience. Figure 3.5 illustrates the percent of bidders who have worked in more than one job category by the number of jobs they have completed. As a bidder garners experience by completing jobs, they are increasingly likely to work in multiple job categories. For example, about 70% of bidders who have completed fewer than 10 jobs have worked in more than one job category. However, of bidders who have completed between 30 and 50 jobs, 87% have worked in more than one job category.
[insert Figure 3.5 about here]
The bidding concludes within a timeframe established by the buyer, generally within a week, whereupon a winning bidder is selected to perform the task. After a
winner is identified, details of the job are exchanged and delivery is accomplished online. Upon delivery of the completed task, the buyer has an opportunity to provide feedback on the seller’s performance.
As stated, the more erratic a freelancer’s past job sequence, the less credible they will appear. A candidate who is perceived to be more of a dilettante, more erratic, will be less likely to garner future work. I predict and test the hypothesis that the more erratic a bidder’s job history, the less likely they will win a subsequent job.
VARIABLES OF INTEREST
Dependent Variables
In a freelancing environment, being awarded a job is the primary measure of success. Therefore, my dependent variable is the likelihood a bidder is chosen for a job. I test this hypothesis on all transactions that occurred on the website in 2004. That year there were a total of 33,906 jobs listed that garnered 315,537 bids. Having access to each bidder’s job and bidding history, I tracked every bid made on every job listed in 2004. Unsuccessful (losing) bids were coded 0 and successful (winning) ones were coded 1. Of the 315,537 bids, 35,878 were successful. Notice that there can be multiple winners for each job.
Independent Variables
The independent variable of interest, erraticism, is a function of the boundary strength between pairs of categories. I utilize a co-occurrence measure of association to measure the strength of a boundary. As Rao et al (2005) demonstrated, the increasing incidences of French chefs who combine elements of both Nouvelle and Classical
cuisines into their repertoire led to the erosion of boundaries between the two camps. Zuckerman (2000), following Teece et al (1994), measured how cognitively related a firm’s business lines were. The measure of the coherence between two industry classifications was a function of the number of times the two appeared in a firm’s portfolio. He suggested that the more often two industries appeared together, the more familiar analysts and industry experts would be with the pairing.
I measure the boundary strength between two categories by examining how often elements of each co-occur with one another, much as Pavlov’s dogs demonstrated their association of the ringing of a bell with food by salivating. Increasing observations of two category objects co-occurring should lead to weaker boundaries because of the increasing recognition. Paralleling this, cognitive psychologists have hypothesized how concepts come to be associated with each other through repeated pairing. By associated they refer to how the observation of one category may invoke expectations of the other, or at least not evoke inconsistencies with it. This leads to some pairings of categorical objects being more acceptable than other, perhaps less associated, pairings. Therefore, I measure boundary strength between two categories as a function of how often two jobs in different job categories appear together in all sellers’ previous job histories. Formally,
\[
\text{Boundary Strength}_{ij} = 1 - \frac{\sum_{k=1}^{K} \min(J_{ki}, J_{kj})}{\sum_{k=1}^{K} J_{ki}} \tag{3.1}
\]
where the boundary strength between categories i and j is one minus a ratio: the sum across all sellers k of the smaller of the number of category i jobs and the number of category j jobs in seller k's history (J_ki and J_kj), divided by the sum across all sellers k of the total number of times
category i appears in their job histories. I subtract from one to make the variable more tractable, as the more often two job categories co-occur, the weaker the boundary between them should be. Notice that this measure is asymmetric. That is to say, the strength of association between categories i and j may not be equal to that between categories j and i. Boundary strength ranges from 0 to 1, where it equals 0 only when categories i and j always occur together and equals 1 when they never do. I calculated this measure for those categories with more than 100 total jobs, as any fewer would be very sparse and difficult to justify as shaping an audience's beliefs about any boundary. All transactions from the beginning of the website's operations in 1999 to 2003 are used. Results of the calculated associations are graphically displayed in Figure 3.6.

[insert figure 3.6 about here]

I use an MDS (multi-dimensional scaling) algorithm to depict boundary strength as a distance. Because a two-dimensional space can represent only one distance between any pair of points, I display the average of the two asymmetric distances for each category pair (i.e., (BS_ij + BS_ji) / 2). The size of each circle represents the volume of transactions between 1999 and 2003 in that job category: the larger the circle, the greater the number of transactions that have been completed. This ranged from 100 to 12,128 jobs per category. The closer two categories are, the weaker the boundary; the farther apart, the stronger.

There are two points to note from Figure 3.6. First, we can see clusters of the domains the job categories are nested within. This depicts an understandable grouping of categories based on some underlying belief in similarity of skills required to address
them. Focusing on the dense center cluster, we can see that the lower right space seems to encompass the jobs in Web and Programming, with jobs such as Simple Website, Blogs, and HTML Emails. The upper center space seems to be Design and Multimedia jobs, including Illustration, Letterhead, and Print Ads, while just to the left of this space the domain overlaps with the Administrative Support space, with Word Processing and Transcription jobs. Moving to the lower left, this cluster encompasses the jobs in the Writing and Translation arena, with the categories of Creative Writing, Resumes, and Cover Letters. Finally, farther to the left of this space appears the Legal domain, with Trademarks and Patent and Copyright work.

The second point to note is the differences in space, or strength of boundaries, within each domain. For example, the jobs in the Web and Programming domain seem to all be very tightly associated with each other, while there are stronger boundaries between jobs in the Legal arena. This may reflect an underlying belief by buyers of services as to the different strengths of the relationships among the job categories within a particular domain. For example, the job categories within legal services are less related to one another than the job categories in web and programming.

I measure how erratic a seller's job history is as the cumulative sum of the distances between the categories of the jobs they have worked sequentially. Specifically, I define erraticism as

\[ \text{Erraticism}_{k} = \sum_{n=1}^{N-1} \text{Boundary Strength}_{i_n j_{n+1}} \tag{3.2} \]

where bidder k's level of erraticism is the sum, from 1 to N-1 (N being the total number of chronologically ordered jobs completed by bidder k), of the boundary strength of job n's category compared to job n+1's category. If job n and job n+1 fall in the same category, the term is zero.

There are two things worth noting about this measure. First, as a cumulative sum, a bidder's erraticism can only increase. This seems appropriate, as the more experience one garners, the more chances there are for inconsistently juxtaposed job categories to occur. Second, because each instance of a successful job is recorded, it permanently affects how an audience will view the bidder. This also seems appropriate, as it parallels other similar contexts, such as recruiters who may view a resume listing all of an applicant's job experiences in chronological order, or the order of publications on a researcher's CV.

Figure 3.7 shows the distribution of levels of erraticism of bidders for all jobs in 2004. Notice that this is skewed to the right, as there are many sellers who recently joined the website and have not had time to garner much experience. All bidders, regardless of experience, are represented by the left-hand panel. Erraticism ranges from 0 to 348. The right-hand panel displays the erraticism of bidders that are within one standard deviation of the mean amount of work experience, as measured by total number of completed jobs. In this subset, the distribution of erraticism remains similar, with a skew to the right, a range from 0 to 85.6, and a mean of 16.9 with a standard deviation of 15.5.

[insert Figure 3.7 about here]

Control Variables

Several control variables that should affect a bidder's likelihood of being awarded a job are included in the model. First, as this is a market-based context, we should
expect the lower the price of the offer to correspond to a greater likelihood of winning, so I include the amount of the bid. The total number of times a seller and buyer have worked together in the past will likely affect the chances they will work again in the future. I expect a buyer's previous experience with a bidder to encourage future transactions. The number of previous jobs (logged) a bidder has completed in any category will capture how effective a bidder has been in the past. The number of previous jobs (logged) a bidder has completed in the focal category they are bidding on should positively impact their ability to win again.

There are two potential alternative explanations that I control for. First, the more erratic a candidate's job history, the less likely their portfolio of past jobs represents a coherent identity to a potential buyer. This is what Zuckerman (2000) termed coherence, and it is meant to represent how confusing a bidder's collection of past jobs may appear to an audience. The coherence of a candidate's past jobs should positively impact their ability to garner new business. I capture this effect by including a measure of how distant a bidder's total past portfolio of jobs is from their focal bidding category. Formally, I calculate

\[ \text{Coherence}_{ki} = \frac{\sum_{n=1}^{N} \left(1 - \text{Boundary Strength}_{i j_n}\right)}{N} \tag{3.3} \]
where the coherence of bidder k's portfolio of past job experiences relative to a job in category i they are bidding on is measured as the sum, across all previous N jobs, of one minus the boundary strength of category i (the bidding category) to j_n (each previous job's category), divided by the total number of jobs N that seller k has completed. I subtract the boundary strength measure from one in order to make the
variable more tractable. It is expected that the greater the level of coherence, the better a bidder should fare.

Second, the more erratic a bidder's past history, the greater the number of job categories they will have worked across. If this is the case, they will be less likely to have developed skills for any particular one. Organizational sociologists have termed this the principle of allocation (Hannan and Freeman 1977; Dobrev et al 2001; Hsu 2006). While past research utilizing a natural experiment has suggested that negative effects of boundary spanning in online environments can be attributed merely to the perception of an audience (Leung and Sharkey 2010), I cannot rule out the possibility that skill learned from experience matters. If a candidate is more erratic, they may face difficulty because having too broad a base of experience should make them less able in any one category. To control for this, I include a measure of the number of distinct job categories a bidder has worked in. I expect that the greater the number of distinct categories a bidder has worked across, the lower their likelihood of winning a subsequent bid. Summary statistics of the variables are presented in Table 3.1 and correlations in Table 3.2.

[insert Table 3.1 and 3.2 about here]

MODELS AND RESULTS

The risk set of interest here is all bids by bidders made in 2004 with more than two previous jobs. Because I need to calculate a sequence of past job histories, this necessitates eliminating bidders who have completed none or one job prior to their bid. Removing these bidders' bids leaves 293,954 bids from 2,128 qualified bidders. The
dependent variable was whether a bidder won a job or not and was coded 1 if they won and 0 otherwise.

I test my hypothesis in three ways. First, I match bidders for the same job who have identical past categorical job experiences and differ only in the order in which they accumulated them. I then compare the likelihood that a more versus a less erratic bidder wins the bid. Second, in order to estimate the particular effect of erraticism and to better control for other potential differences between bidders, I utilize a fixed-effects logistic regression (grouped by job) to estimate, between all bidders for a job, the effect of their erraticism on their likelihood of winning the bid. Third, in order to address potential concerns of time-invariant heterogeneity, I model a fixed-effects logistic regression on the within-bidder effect of changes in their individual erraticism on their likelihood of winning a bid.

Matching Identically Experienced Bidders

Because in real-world empirical contexts it is often difficult to find identical pairs of candidates who differ only in their treatment, we often rely on regression techniques to estimate the potential effect of differences in the independent variables of interest. However, this dataset is extensive enough to allow me to identify two bidders for the same job who differ only in their level of erraticism. In short, I matched bidders who were bidding for the same job and had identical past experiences, in terms of the number and types of past jobs, but differed in the order in which they accumulated those experiences. This allows me to control for alternative explanations, such as the principle of allocation, the coherence of a bidder, and their levels of experience.
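The matching logic described above can be illustrated with a minimal sketch. The seller names, job categories, and histories below are hypothetical toy data, and the function names are introduced here purely for illustration. For brevity the sketch computes boundary strength (Equation 3.1) from the same toy histories it then scores, whereas the measure in this chapter is built from 1999 to 2003 transactions in categories with more than 100 jobs.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: seller -> chronologically ordered job categories.
histories = {
    "A": ["web", "web", "logo", "logo"],
    "B": ["web", "logo", "web", "logo"],
    "C": ["web", "legal"],
}

def boundary_strength(histories):
    """Eq. 3.1: BS[(i, j)] = 1 - sum_k min(J_ki, J_kj) / sum_k J_ki."""
    counts = [Counter(h) for h in histories.values()]
    categories = sorted({c for h in histories.values() for c in h})
    bs = {}
    for i in categories:
        total_i = sum(c[i] for c in counts)  # sum_k J_ki, always > 0 here
        for j in categories:
            if i != j:
                overlap = sum(min(c[i], c[j]) for c in counts)
                bs[(i, j)] = 1 - overlap / total_i
    return bs

def erraticism(history, bs):
    """Eq. 3.2: sum of boundary strengths between consecutive jobs.

    A move within the same category contributes zero.
    """
    return sum(0.0 if a == b else bs[(a, b)]
               for a, b in zip(history, history[1:]))

def matched_pairs(bidders_on_job, histories):
    """Pairs of bidders with identical category counts but a different order."""
    groups = defaultdict(list)
    for k in bidders_on_job:
        key = tuple(sorted(Counter(histories[k]).items()))
        groups[key].append(k)
    return [(a, b)
            for members in groups.values()
            for a, b in combinations(members, 2)
            if histories[a] != histories[b]]

bs = boundary_strength(histories)
for a, b in matched_pairs(["A", "B", "C"], histories):
    # Same number and types of past jobs; only the sequence differs.
    print(a, erraticism(histories[a], bs), b, erraticism(histories[b], bs))
```

In this toy example sellers A and B form a matched pair: both have completed two web jobs and two logo jobs, but B alternates between the categories, so B's erraticism (about 0.4) is higher than A's (about 0.2). The example also shows the asymmetry of the measure, since the web-to-logo boundary (about 0.2) differs from the logo-to-web boundary (0).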
Of the 293,954 bids from 2,128 bidders, I identified 2,335 jobs where there were two bidders with identical categorical experiences. By identical categorical experiences I mean that both bidders had completed the same number and types of jobs on the website. They differed only in the sequence in which they accumulated them. For each pair of bidders, I ranked them in terms of their erraticism by identifying the less erratic bidder and the more erratic bidder. I then estimated the mean percent chance, for all jobs in this subset, that members of each group of bidders (more versus less erratic ones) were eventually awarded the job. Results are reported in Figure 3.8.

[insert figure 3.8 about here]

Even with this extremely limited dataset, I demonstrate that between two bidders with an identical number and type of past jobs, the one with the more erratic job sequence was less likely to be chosen. In particular, when pitted against one another, the more erratic bidder was chosen 17.7% of the time whereas the less erratic one was chosen 22.1% of the time. These differences are significant at p