Taming Big Data: Using App Technology to Study Organizational Behavior on Social Media
Sociological Methods & Research 1-29
© The Author(s) 2015
Reprints and permission: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0049124115587825
smr.sagepub.com
Christopher A. Bail1
Abstract

Social media websites such as Facebook and Twitter provide an unprecedented amount of qualitative data about organizations and collective behavior. Yet these new data sources lack critical information about the broader social context of collective behavior—or protect it behind strict privacy barriers. In this article, I introduce social media survey apps (SMSAs) that combine computational social science methods with conventional survey techniques in order to enable more comprehensive analysis of collective behavior online. SMSAs (1) request large amounts of public and non-public data from organizations that maintain social media pages, (2) survey these organizations to collect additional data of interest to a researcher, and (3) return the results of a scholarly analysis back to these organizations as incentive for them to participate in social science research. SMSAs thus provide a highly efficient, cost-effective, and secure method for extracting detailed data from very large samples of organizations that use social media sites. This article describes how to design and implement SMSAs and discusses an application of this new method to study how nonprofit organizations attract public attention to their cause on Facebook.
1 Department of Sociology, Duke University, NC, USA

Corresponding Author: Christopher A. Bail, Department of Sociology, Duke University, NC, USA. Email: [email protected]
I conclude by evaluating the quality of the sample derived from this application of SMSAs and discussing the potential of this new method to study non-organizational populations on social media sites as well.

Keywords: computational social science, apps, social media, organizational behavior
In recent years, social scientists have expressed considerable enthusiasm for the study of organizations and collective behavior using social media data (e.g., DiMaggio et al. 2001; Golder and Macy 2014; G. King 2011; Lazer et al. 2009; Lewis, Gray, and Meierhenrich 2014). Much of this enthusiasm is provoked by the unprecedented amount of data currently available from social media sites such as Facebook and Twitter. Given sufficient computing resources, scholars can now extract billions of lines of text that describe online interactions between organizations and their audiences. These data are doubly intriguing because they describe "naturally occurring" interactions between organizations and their audiences, in contrast to the more artificial or retrospective data that might be obtained from conventional survey research or in-depth interviews with representatives of organizations. The rich, longitudinal nature of these data thus holds the potential to reveal the evolution of organizations and collective behavior in situ. Ever-increasing use of social media and associated technologies among both organizations and the broader public suggests these data sources will only continue to increase in value.1

Yet the study of social media also presents daunting methodological obstacles for students of organizations and collective behavior. First, most publicly available data lack critical information about organizations and their audiences—as well as the broader social contexts in which they interact. Although scholars are already using Twitter to predict the performance of businesses in the stock market (Bollen, Mao, and Zeng 2011), the capacity of public health organizations to prevent pandemics (Paul and Dredze 2011), and even revolutions such as the Arab Spring (Howard et al. 2011), these studies are based upon analysis of social media texts that cannot address a variety of factors that are central to many theories of organizational behavior, such as organizational capacity or external opportunity structures. While some social media sites collect detailed information about such dimensions of organizational behavior, these data are carefully protected because of pervasive concern about online privacy. The combination of these
factors has led some scholars to ask whether social media data can provide anything beyond crude description of underidentified populations of organizations in online settings.

In this article, I introduce social media survey apps (SMSAs), a new research method designed to enable comprehensive analysis of organizational behavior on social media sites by synchronizing computational social science techniques with conventional survey research. Apps are small pieces of software that enable people to quickly and efficiently access information from the Internet or transmit their own data to others—either via desktop computers or via mobile devices. Although most apps are designed to improve the convenience of computing technology—often via mobile technology—I argue that app technology also provides a powerful new platform for social science research. SMSAs (1) enable researchers to request permission to access public and nonpublic data from an organization's social media page, (2) survey these organizations in order to capture additional data of interest to a researcher, and (3) return the results of a scholarly analysis back to the organization as incentive for them to share their data and participate in social science research. SMSAs thus provide an efficient and cost-effective means for secure transmission of highly granular data from very large samples of organizations that use social media.

This article provides a step-by-step guide for the design and implementation of SMSAs. I focus on apps designed to study organizations that use Facebook—the world's largest social media site—though the method could easily be extended to other social media sites as well. Because the creation of SMSAs requires significant computer programming skills that are not common among social scientists, my discussion does not assume comprehensive knowledge of software languages or Internet technology. Instead, I provide annotated software code that describes how to implement this method at the following link: https://github.com/cbail/App-for-Studying-Organizational-Behavior-on-Social-Media.

The first section below discusses how to create the Internet infrastructure for an SMSA using social media sites and cloud computing infrastructure. The second section discusses the process of requesting permission from an organization to access their social media data. Third, I explain how to extract such data from social media sites. The fourth section explains how to incorporate conventional survey-based methods within apps to collect information about the background of organizations and the broader social environment in which they interact with other organizations and the public. Fifth, I discuss how to create incentives for social media
users to install SMSAs. Sixth, I explain how to identify samples of organizations that use social media sites and recruit them to install an SMSA. In the seventh section, I present an application of this new technique as part of a study of how nonprofit organizations call public attention toward their cause on Facebook. I conclude by evaluating the quality of the sample derived from the SMSA in this study, discussing legal and logistical issues that may arise in the implementation of this new technology, and analyzing the potential of apps for social science studies of individual social media users as well as organizational populations.
Creating Online Infrastructure for an SMSA

SMSAs require substantial coordination across multiple websites in order to function. Before describing the steps of software engineering necessary to obtain social media data from organizations, I must describe the online infrastructure necessary for creating and hosting such software. The first step is to request permission to create an app from a social media site. On Facebook, this requires app developers to create what is known as a "canvas page." A canvas page is a website where the app developer must input basic information about an app—including its name, a brief description of its function, the name of the developer, and other administrative data.2

The canvas page does not host the software code or computer scripts that constitute an SMSA. Instead, such software must be uploaded to an external website. There are numerous app-hosting websites that offer various advantages in terms of scalability and cost. Among the most popular of these is the Google App Engine (see http://appengine.google.com). There, app developers must create an account—or link an existing Google account—in order to pay for the cost of Internet traffic created by app users. At present, the cost of hosting an app is very low—often less than a few dollars each month—yet such costs of course depend upon the amount of Internet traffic created by the app.

The software for the app—described in detail in each of the following sections of this article—must be written in one of several computing languages (e.g., Python, Java, or PHP).3 This software—often written in the form of multiple scripts or files that interface with each other—is then launched or "deployed," to use the language preferred by software developers, from the app-hosting website. This may be performed through additional computer scripts written by the developer or via an automated tool such as the Google App Engine Launcher, a piece of "stand-alone" software that can be installed on any computer.
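To make this concrete, the following minimal sketch shows the kind of Python script that might be deployed in this way, using the webapp2 framework bundled with the Python 2.7 App Engine runtime of that era. The route and application name (registered in an accompanying app.yaml file) are hypothetical, and a real SMSA would register additional handlers for each stage described in the sections that follow.

```python
# main.py -- minimal App Engine handler (a sketch, not the full SMSA).
# Uses webapp2, the framework bundled with the Python 2.7 runtime.
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        # A production SMSA would begin the authentication flow here.
        self.response.write('SMSA is running.')

# Routes are mapped to this script in app.yaml; '/' is a hypothetical route.
app = webapp2.WSGIApplication([('/', MainPage)], debug=False)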
Requesting Permission to Access an Organization's Social Media Data

Figure 1 describes the work flow of a hypothetical SMSA. The first step of software engineering necessary to create an SMSA is to ask an organizational representative for permission to access data from their group's social media page.4 This process is often referred to as "authentication" by software developers. Not every authentication request is the same. Authentication requests vary in the type of information being requested of the user—in this case, the person who manages an organization's social media site.

On Facebook, pages hosted by organizations are known as "fan pages." SMSAs that target organizational fan pages can access more than two years of "insights data" that are not publicly available. Insights data enable page owners to monitor visitor traffic to their pages, not unlike the popular web-monitoring tool Google Analytics. But while Google Analytics only provides raw counts of visitor traffic, Facebook insights data contain aggregate information about the age, gender, and geographic location of those who visit a Facebook fan page as well as detailed information about their interaction with the page—including the number of page views, clicks, comments, likes, and shares of each page, and so on.5 What is more, an SMSA can extract the content of all posts by organizations as well as comments from their audiences and any other information that is publicly viewable on the organization's page.

The software code used for authentication can be easily modified to request different types of information from an organization. These requests are communicated to the person who manages an organization's social media page via an online dialogue within the social media site. Many readers may already be familiar with such requests, which come in the form of "pop-up" windows within their Facebook page that explain precisely what type of information they agree to share when installing a new app. If a user agrees to such requests, Facebook creates an encrypted password known as an "authentication token," which allows the app developer to access their data for a limited amount of time—often several months. SMSA developers who wish to collect data across longer time periods must periodically request reauthentication from the person who manages the organization's social media page via the same pop-up dialogue. Because Facebook insights data are available retrospectively for as much as two years, it is possible that such reauthentication might not be necessary to address a research question. If reauthentication is necessary, however, it can be easily automated within the app software.
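As a rough illustration, the snippet below constructs the URL for Facebook's OAuth dialogue as it worked at the time of this study. The app ID and redirect URI are placeholders, and the permission name (read_insights) should be checked against current Facebook documentation, since these scopes change as the platform evolves.

```python
# Sketch of the authentication request: send the page manager to
# Facebook's OAuth dialogue. APP_ID and REDIRECT_URI are placeholders
# issued when the canvas page is registered.
import urllib

APP_ID = 'YOUR_APP_ID'
REDIRECT_URI = 'https://your-smsa.appspot.com/callback'  # hypothetical

params = urllib.urlencode({
    'client_id': APP_ID,
    'redirect_uri': REDIRECT_URI,
    # The scope determines what the pop-up dialogue asks the user to
    # share; read_insights covers the non-public insights data.
    'scope': 'read_insights',
})
auth_url = 'https://www.facebook.com/dialog/oauth?' + params
# If the user accepts, Facebook redirects to REDIRECT_URI with a code
# that the app exchanges for an authentication token.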
Figure 1. Work flow of hypothetical social media survey app (SMSA).
Extracting Data From Social Media Sites

If the person who manages an organization's social media site grants permission, an SMSA may request information from an application programming interface (API). An API is a relatively new type of Internet technology that enables app developers to request specific information from a social media site such as Facebook or Twitter. APIs are often referred to as "fire hoses" of information because they are capable of handling very frequent requests for large amounts of data in a highly efficient manner. Many of the largest social media sites and websites have developed their own proprietary API technology—from Google to the New York Times. Facebook's API—described in detail in the discussion that follows—is named the "Graph API." Unfortunately, each social media site's API has its own language for data requests—and these languages change frequently as app technology continues to mature.6 Learning to write API requests within computer programs is therefore one of the most challenging components of app development.

Thankfully, Facebook has created a useful learning tool known as the "Graph API Explorer" (see http://developers.facebook.com/tools/explorer). On this website, an app developer can input an authentication token and an API request in the appropriate query language to observe the output.7 For example, one might input a request for all posts made by a given organization, and the output window of this site would reveal the data that would be returned by a URL request. The Graph API returns these data in JSON format—which contains metadata that simplifies data cleaning or manipulation at a later point. For example, if a user produced hundreds of posts and the researcher was only interested in posts within a particular time period, JSON metadata could be used to extract only those posts within the time period of interest.

Next, SMSAs require infrastructure for storage of data from APIs. Once again, there are a variety of options for cloud-based data storage. Google provides some of the most popular cloud-storage options, such as Google Drive—where data can easily be passed into matrix or spreadsheet form. An SMSA can later request data from this location in order to deliver the scholarly analysis of interest to the user or for subsequent analysis by the researcher.
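The sketch below illustrates such a request in Python, including the JSON filtering step just described. The page ID and token are placeholders, and the endpoint and field names reflect the Graph API as it existed at the time of writing.

```python
# Sketch of a Graph API request for all posts on a page, followed by a
# date filter that uses the JSON metadata. Placeholders throughout.
import json
import urllib2

PAGE_ID = 'examplepage'   # hypothetical page ID or name
ACCESS_TOKEN = 'TOKEN'    # token obtained during authentication

url = ('https://graph.facebook.com/%s/posts?access_token=%s'
       % (PAGE_ID, ACCESS_TOKEN))
feed = json.loads(urllib2.urlopen(url).read())

# Keep only posts created in 2012, using the created_time metadata.
posts_2012 = [p for p in feed['data']
              if p.get('created_time', '').startswith('2012')]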
Surveying Organizations Using SMSAs

While the amount of data available through social media APIs is extensive, most researchers interested in working with such data will require supplemental
data to identify additional characteristics of organizations—or their broader social environments—that are not available via social media websites. The amount of information that could potentially be requested of an organization through app-based surveys is limited only by the imagination of the researcher and the patience of the organizational representative who installs the app. For example, supplemental survey questions within an SMSA can allow researchers to assess the resources of an organization or its organizational structure, since such information is not commonly described on social media pages. Or, an SMSA could ask a series of questions about off-line activity in order to place the online data in broader social context. These surveys can be conducted when the user installs the app, or multiple times across a broader study period if the study requires additional longitudinal data.

Once again, additional Internet infrastructure is necessary to host the survey component of the SMSA. An ideal solution is for the SMSA software to "redirect" the user to a URL where the survey is hosted after the extraction of their social media data. There are a variety of different options available for web-based surveys. These include Qualtrics, Survey Monkey, and Google Forms, to name but a few. An SMSA could also interface with users within social media sites or on mobile devices. An advantage of using Google Forms is that they can be easily integrated within a broader website that hosts additional information about the study—as I describe in a later section of this article. Google Forms is also convenient if the data extracted from the API are passed to a Google Drive (as described earlier), since Google Forms can be easily linked to this cloud-based data storage system.
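A minimal sketch of this redirect step appears below, reusing the webapp2 handler pattern from the infrastructure section. The form URL, the page_id parameter, and the prefilled field name entry.1 are all hypothetical; Google Forms' prefill syntax should be verified against current documentation.

```python
# Sketch of the survey hand-off: redirect the user to an external
# survey once data extraction is complete. SURVEY_URL and the
# prefilled field name ('entry.1') are hypothetical.
import urllib
import webapp2

SURVEY_URL = 'https://docs.google.com/forms/d/EXAMPLE_ID/viewform'

class SurveyRedirect(webapp2.RequestHandler):
    def get(self):
        page_id = self.request.get('page_id')
        # Pass the page ID along so the survey row can later be
        # matched to the organization's social media data.
        query = urllib.urlencode({'entry.1': page_id})
        self.redirect(SURVEY_URL + '?' + query)

app = webapp2.WSGIApplication([('/survey', SurveyRedirect)])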
Incentivizing Organizations to Install an SMSA

A major challenge for researchers who hope to develop SMSAs is to create incentives for organizations to use them. Although hundreds of millions of people share their personal information with app developers each day, public concern about protecting privacy and online identity theft continues to grow. Moreover, the remarkable proliferation of apps in recent years has created considerable competition for the attention of social media users—and many of these apps do not require users to answer survey questions or share significant amounts of private data. Even though completing such surveys on computers or mobile devices at a time of the user's choosing may be more convenient than conventional telephone surveys, steadily declining response rates to all types of social science surveys suggest people must perceive significant rewards that offset the risk and time of sharing their data.8
One way to recruit organizations to install an SMSA is to offer some type of scholarly analysis that might be of interest to them. For example, an SMSA that collects a large amount of data about a large group of organizations could help users understand how their organization compares to its peers, and how they might learn from the successes and failures of others. The application of SMSA technology described subsequently, for example, helps nonprofit organizations attract new audiences by comparing their Facebook insights data to those of their peers and presenting customized recommendations about how to optimize their outreach based upon their responses to a survey. For-profit organizations may also be inspired to install an app if it allows them to track their competitors, identify new opportunities for investment or other forms of entrepreneurship, or helps them identify ways to be more efficient based upon the experiences of other organizations in or outside their field. SMSAs designed for such competitive industries must assure organizations that their social media data and responses to survey questions will not be publicly identifiable—yet such arrangements must be made regardless of organizational field, as I explain in my discussion of legal issues surrounding SMSAs subsequently.

Offering some form of scholarly analysis to organizations is of course much more computationally demanding. In addition to authentication, social media data extraction, and survey analysis, such SMSAs require some form of social science analysis to be built into the software itself. In most cases, organizations will be unlikely to appreciate highly sophisticated forms of social science analysis such as multivariate regression models. It is therefore advisable to offer only simple descriptive analyses that help the organization understand its position vis-à-vis its peers. Such calculations are relatively straightforward from a computer programming perspective—one needs only to request a data point of interest from the user and compare it to the average score on that variable for all other users who installed the app, which can be easily calculated by calling the data stored on a cloud-based server such as Google Drive (see the sketch at the end of this section).

An SMSA that offers some form of social science analysis as incentive must also include a mechanism for delivering such information back to the organizational representative who installs the app. One option is to have the software redirect the representative to a website that displays the results of an analysis in an HTML table following the survey stage of the SMSA. A separate option is to e-mail the results to the user in a form message that is also automated by the SMSA software. A final option is to create user accounts within a supplemental website where organizational representatives can perform updates of the analysis of interest in reaction to different events within
their field or the broader social environment. This final option is much more time intensive, however, since it requires the creation of infrastructure to support individual user accounts and passwords to ensure the data cannot be compromised by a third party.9
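To make the computation concrete, the function below sketches such a peer comparison. It assumes the scores of all participating organizations have already been retrieved from cloud storage, and the metric (likes per post) and message wording are hypothetical.

```python
# Sketch of a peer-comparison calculation offered as incentive. The
# caller supplies one organization's score and the scores of all other
# app users (retrieved from, e.g., a Google Drive spreadsheet).

def peer_comparison(org_score, all_scores):
    """Compare one organization's score to the mean of its peers."""
    mean = sum(all_scores) / float(len(all_scores))
    pct_diff = 100.0 * (org_score - mean) / mean
    return ('Your page averaged %.1f likes per post, compared to a '
            'study-wide average of %.1f (%+.0f%%).'
            % (org_score, mean, pct_diff))

# Example: an organization averaging 12 likes per post among peers
# averaging 8 would be told it is 50 percent above the mean.
print peer_comparison(12.0, [8.0, 6.0, 10.0])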
Sampling and Recruitment

Unfortunately, social media sites do not offer databases of organizations that maintain pages on their site. However, if a list of organizational actors of interest can be identified by the researcher, they can be rapidly queried via the "search" function of an API.10 Such API requests do not require authentication from a user but do require the app developer to have a temporary authentication token that can be obtained from the social media site.11 If an organization is discovered using such search queries, subsequent API requests can be used to identify the e-mail address of the person who manages the organization's social media page.

E-mails to these organizational representatives can provide a direct link from which they may install the app or a link to a website that contains a more detailed description of the study, step-by-step instructions about how to participate, and a privacy policy that details the researcher's plans for maintaining the confidentiality and anonymity of social media users. This website can be hosted within Facebook, on a separate server, or both. An advantage of hosting the website within the social media site is that users can quickly navigate to the page from their own user accounts. On the other hand, a separate website hosted on another server gives the researcher much more flexibility in the style and length of text about the goals of the study, how to install the app, and privacy concerns.
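The sketch below illustrates such a search query for a researcher-supplied list of names. The organization name and token are placeholders, and the endpoint reflects the Graph API search syntax available at the time of this study.

```python
# Sketch of the sampling step: query the Graph API search endpoint for
# Facebook pages matching names on a researcher-supplied list.
import json
import urllib
import urllib2

APP_TOKEN = 'APP_LEVEL_TOKEN'  # temporary token from Facebook
org_names = ['Example Donation Alliance']  # hypothetical list

for name in org_names:
    query = urllib.urlencode({'q': name, 'type': 'page',
                              'access_token': APP_TOKEN})
    url = 'https://graph.facebook.com/search?' + query
    results = json.loads(urllib2.urlopen(url).read())
    for page in results['data']:
        # Record candidate page IDs for manual verification.
        print page['id'], page['name']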
An Application: Studying How Human Organ and Tissue Donation Advocacy Groups Attract New Audiences on Facebook

In order to further illustrate the promise of SMSAs—and how to create them—this section describes an application of this new technology to study how advocacy organizations attract the attention of "bystanders"—broad segments of the public who are unlikely to join an organization but may become otherwise sympathetic to its cause. Students of collective behavior have not yet studied bystanders because of the considerable methodological obstacles involved (Gamson 2004). While a large survey could be fielded to ask people whether they are aware of a particular cause,
for example, it could not determine whether the effect of an advocacy organization is direct or indirect.12 Such a survey would also require the identification of a sample of organizations that might reach bystanders before they achieve much public recognition. In contrast, an SMSA can quickly obtain detailed historical data about advocacy organizations and their interactions with bystanders on social media sites over vast time periods, as the following paragraphs describe. Moreover, an SMSA can monitor precisely when and where an advocacy organization contacts a bystander and extract all the texts of the relevant messages, comments, or "likes" as well as detailed information about the organization and the individuals they contact—including their gender, age, and the precise geographic coordinates from where they comment, like, or share messages.13

Between June 2012 and January 2013, I conducted a pilot study of SMSA technology designed to identify how human organ and tissue donation advocacy groups attract the attention of bystanders to their cause. Figure 2 describes the work flow of this SMSA. First, the app requests permission to access data from the advocacy group's Facebook page from a representative of the organization. Second, the app mines all Facebook messages, comments, likes, and shares from the organization's page in addition to Facebook "insights data" for the organization (120 variables that describe the size, characteristics, and engagement of their online audiences). Third, the app asks the representative from the organization to answer a series of questions about the organizational capacity of the group as well as its off-line tactics to increase organ donation. This survey is designed to collect supplementary data that are not available from Facebook in order to account for theories of resource mobilization produced by students of collective behavior (e.g., McCarthy and Zald 1977). Fourth, the app extracts a variety of additional information about the broader online environment inhabited by each organization from Google. These include data about whether and how often organizations are mentioned within the media and blogs, and relative increases in search volume for "organ donation" during the day they make a post via Google Trends. These additional data sources help evaluate theories of "discursive opportunity" within the literature on collective behavior (e.g., Koopmans and Olzak 2004). Fifth, the app uploads all of this information to a cloud server where it can be used to provide real-time updates to a database as new advocacy groups join the study. This cloud server is then used to prepare a report about how each organization might optimize its social media outreach based upon the successes and failures of its peer organizations in reaching new Facebook audiences. This report is later e-mailed to the organization's representative via
an e-mail address specified during the survey stage of the SMSA as incentive for the organization to participate in the study. In total, this SMSA collects more than 192 variables, summarized in Table 1.

Figure 2. Social media survey app (SMSA) for study of how organ donation advocacy groups reach new audiences on Facebook.

I developed a two-stage sampling scheme to identify the total universe of organ donation advocacy organizations that maintain an active Facebook page with at least 30 fans.14 First, I extracted a list of all advocacy groups working on issues related to organ donation from a database of nonprofit organizations that are registered with the US Internal Revenue Service (IRS).
Second, I identified organizations that have at least 30 fans via API queries and supplemental searches by research assistants to identify organizations that were not listed in the aforementioned database of nonprofit organizations. In total, 79 organizations were identified during this stage.

Table 1. Overview of 192 Variables Collected by SMSA.

Social media discourse: Full text of all posts by an organization as well as every comment it receives.

Public engagement with each post or organization: Number of times a post was loaded within a user's Facebook news feed, clicked on by a user, liked by a user, shared by a user, and so on.a

Characteristics of audience of each post or organization: Number of people who performed the actions above, broken down by age, gender, and city, and the size of their ego and alter social networks.a

Organizational capacity of advocacy organization: Total annual budget, number of full-time staff and volunteers, age, interorganizational networks.

Tactics of advocacy organization: Use of other online outreach tools (e.g., Twitter), frequency of online outreach, frequency with which the organization engages in nine types of off-line tactics (e.g., holding events, going door to door, producing press releases).

Broader external environment: News coverage of organization, blog coverage of organization, amount of public interest in organ donation measured via Internet search patterns, number of people on a waiting list for an organ in the state where the organization is located.

Note: SMSA = social media survey app.
a. These data are available at two levels of analysis: the entire lifetime of the post, and pooled across all posts produced by an organization within each day.

Organ and tissue donation advocacy groups within the target sample were first recruited via the e-mail address listed on their Facebook page.15 The initial recruitment e-mail provided a brief description of the app and invited respondents to visit the study's website for additional information about the benefits and goals of the research, details about the research team, and a privacy policy. A phone survey of several organizations that did not install the app after two weeks revealed that e-mail spam filters blocked some of these initial recruitment e-mails. Therefore, I coordinated a second round of recruitment in which organizations were sent a detailed letter about the study on university letterhead in order to further legitimize the research endeavor. A third and final stage of recruitment was conducted by telephone for
all organizations that did not participate during the first two stages of public outreach. Forty-six of the 79 organizations in the target sample installed the SMSA by the end of the recruitment period, for an overall response rate of 59 percent—substantially higher than most surveys of such populations, as I discuss in further detail subsequently.

In total, the app mined data about approximately 123 million interactions between Facebook users and 9,911 posts produced by the 47 organ and tissue donation advocacy groups across 1.7 years. This includes 272,116 "likes" and 26,123 time-stamped comments from 88,863 unique individuals, and daily data about the number of people who viewed and clicked on each message. The app also mined geo-coded data on the location from where people viewed, clicked, or commented upon each message. Figure 3 presents a scatterplot of the longitude and latitude of each city from which people viewed messages, without GPS shape files, in order to demonstrate the breadth and detail of these data.

Figure 3. Scatterplot of latitude and longitude coordinates of more than 123 million Facebook users who viewed a post by one of the organizations that installed the social media survey app (SMSA).

In addition to the 192 variables collected by this SMSA, many more could be created via the rich qualitative data that this method also extracts from an unusually large number of people. For example, qualitative coding of the posts of organ donation advocacy organizations could be used to create
variables that describe different types of frames or discursive tactics such groups use to call attention to their cause (e.g., Benford and Snow 2000).16 These data can also be directly tied to the likes and comments of individual social media users in order to analyze the evolution of frames as organizations interact with the broader public—or to examine how different types of frames produce different types of reactions from such audiences more broadly. Comments and likes are time-stamped, enabling rich dynamic analysis of the flow of information between advocacy groups and bystanders. Similarly, the survey data obtained via this SMSA open new lines of inquiry about the relationship between organizations' online and off-line behavior. Variables that describe the discursive strategies of these organizations, for example, could be linked to their financial and social resources or off-line tactics for calling attention to their cause (e.g., fund-raising events, television advertisements). Finally, the app mines detailed information on the demographic background of Facebook users who interact with each advocacy group. Together, these data can therefore also be used to address confounding factors in the diffusion of social media messages that have not been examined by previous studies of social media because they are not publicly available (e.g., financial resources).

According to the most recent meta-analysis available, the average response rate for Internet-based surveys is 35 percent (Donsbach and Traugott 2008:277). The response rate for this study was therefore 68 percent higher than average. Some of this very high response rate may be attributed to the use of telephone outreach, which typically creates a modest increase in survey response rates vis-à-vis mail or e-mail surveys (Lavrakas 2010; Maynard, Schaeffer, and Freese 2011). Similarly, a recent meta-analysis suggests that response rates for studies of organizational populations are slightly higher than those of the general public (Baruch and Holtom 2008) but still only reach an average of 35.7 percent. Nevertheless, the high response rate of this study is encouraging because the research design was unprecedented and therefore unfamiliar to respondents. The response rate is also encouraging since this study asked respondents to share large amounts of sensitive online data, unlike conventional survey research. The high response rate may also be attributed to the convenience of the SMSA. Analysis of the time logs created by the app software suggests most users authenticated the app and completed the survey in less than three minutes—far less time than most conventional surveys, which require lengthy questionnaires because they are unable to mine rich qualitative data from respondents within seconds. On the other hand, studies show that telephone recruitment also increases response rates (Tomaskovic-Devey, Leiter, and Thompson 1994), so further studies
are needed to determine whether future SMSAs can achieve such high response rates through e-mail contact alone.
Assessing Response Bias

The high response rate for this study indicates SMSAs hold considerable potential to decrease survey nonresponse by providing incentive for organizations to participate in social science studies with minimal time commitment. In this case, organ donation organizations were provided with a free, high-quality audit of their social media strategies using cutting-edge social science methods. In contrast, the nascent field of social media consultants provides no such systematic analysis across a large group of organizations—and charges considerable fees for their services.

Yet as the sampling discussion above noted, the use of incentives risks creating response bias. In this case, for example, organizations with limited financial resources may be more likely to install the SMSA because they are unable to afford expensive social media consultant services. Or, the incentive of a free social media outreach audit may be more attractive to younger organizations, which are more likely to depend upon such technologies vis-à-vis older, more established advocacy groups with dedicated infrastructure for media outreach. Finally, a free social media audit may create more incentive for organizations that use social media more often than others.

To evaluate possible response bias among organizations that participated in the study, I compared the total annual budget and age of organizations in and outside the study sample using data about these nonprofit organizations from the IRS.17 Figure 4 presents kernel density plots that compare the total annual budget, age, and total number of Facebook posts for each organization in and outside the study sample. These figures suggest there is only minimal variation on these three variables between advocacy groups in and outside the final sample. Two-tailed t tests further confirm that no significant differences exist between the two populations. For the total annual budget of each organization, t = .9431 and p < .352. For the age of each organization, t = .0543 and p < .957. For the number of Facebook posts produced by each organization, t = .4306 and p < .669. These analyses suggest the use of incentives did not introduce significant response bias among organizations recruited to participate in the study.

Figure 4. Kernel density plots comparing budget, age, and frequency of Facebook posts for organizations in and outside the study sample.

As an additional test for response bias, I examined the mission statements of each organization that are included within the IRS paperwork required of all nonprofit organizations. These data were obtained from the Guidestar Database of IRS 501c(3) organizations that is widely used for sampling purposes by organizational scholars (e.g., Andrews, Hunter, and Edwards 2012; Brulle et al. 2007; Minkoff, Aisenbrey, and Agnone 2008; Walker, McCarthy, and Baumgartner 2010). These statements may be used to examine whether there are any further qualitative differences in the missions of organizations in and outside the study sample. Once again, this analysis revealed no major differences. The range of missions was consistent among organizations in and outside the study sample. More specifically, there were equal numbers of organizations that focused upon increasing public awareness about organ donation, supporting those who are currently waiting for an organ transplant, or participating in the logistics of organ allocation. There were also equal numbers of organizations that focus on the national and state levels of public outreach and advocacy.

The only central difference between the target sample and the study sample was that the latter yielded fewer organizations that are involved with human eye donation—as opposed to other human organs or human tissue. Organizations such as the Lions Club and Rotary Clubs are most involved in eye donation, perhaps because it disproportionately affects the elderly male populations that frequent such groups. According to a recent study of social media user demographics, older men are among the least likely groups to use websites such as Facebook or Twitter. Groups comprising older people may be more reluctant to install an SMSA because of the negative association between age and social media use.
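For readers who wish to replicate this kind of check, the snippet below sketches the two-tailed two-sample t test reported above, assuming SciPy is available; the budget figures are purely illustrative stand-ins for the Guidestar data.

```python
# Sketch of the response-bias check: a two-tailed two-sample t test
# comparing organizations in and outside the study sample on one
# covariate. The figures below are illustrative, not the study data.
from scipy import stats

budgets_in = [52000, 310000, 87500, 1200000, 64000]   # installed app
budgets_out = [49000, 275000, 99000, 1500000]         # did not install

t_stat, p_value = stats.ttest_ind(budgets_in, budgets_out)
print 'budget: t = %.4f, p = %.3f' % (t_stat, p_value)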
Directions for Future Research With SMSAs

The rise of social media sites such as Facebook and Twitter provides an unprecedented opportunity to collect massive amounts of rich qualitative data on the evolution of social relationships as they develop. Yet while thousands—or even millions—of texts produced by organizations and their audiences can be collected, these data are almost completely unstructured. They contain little information about the organizations and the broader social context in which they interact with other organizations and the broader public. What is more, much of the most interesting data generated by social media sites is not publicly available because of pervasive concerns about online privacy. I have argued that SMSAs can address these limitations by collecting vast amounts of public and private social media data alongside supplemental survey data and offering organizations some form of scholarly analysis of these data to incentivize their participation in social science research. These new tools offer a highly efficient means for collecting high volumes of detailed data about organizations. SMSAs
are also more cost-effective than conventional survey-based methods that require infrastructure and personnel for survey administration.18 The pilot study described earlier illustrates how SMSAs can yield very high response rates and unbiased samples compared to conventional research methods—though additional studies are needed to determine whether these results can be reproduced in other organizational fields.

Future studies of SMSAs are also urgently needed to assess their viability for sites other than Facebook. The most obvious candidate for SMSA research is perhaps Twitter, since it is presently the world's second largest social media site. Because Twitter enables researchers to download large samples of tweets without requesting permission from individual users or representatives of organizations, it has already become a cornerstone of social media research. Yet the paucity of data available about Twitter users and the relationship between their online and off-line behavior creates formidable challenges for social scientists who wish to advance the study of Twitter beyond descriptive analyses. Because Twitter does not collect as much information about its users as Facebook, surveys of Twitter users conducted via SMSAs may need to be longer. Yet the technical implementation of an SMSA on Twitter would be quite similar to the Facebook example presented earlier—apart from significant differences in the query languages used for each site's API. Other prominent social media sites such as LinkedIn and Google Plus are rapidly attracting app developers as well and could also become candidates for future studies of organizational populations using SMSAs.

SMSAs also offer the potential to contribute to the rapidly increasing interest in online field experiments (e.g., Bakshy et al. 2012; Bond et al. 2012; Salganik, Dodds, and Watts 2006). These studies develop web-based interfaces to experimentally manipulate social media users' access to information or other social media users. The scholarly analysis reported by an SMSA could easily be distributed in experimental fashion in order to isolate the effect of exposing organizations to different forms of information about their environment. Such studies, of course, require informed consent and minimization of any potential harm to organizations that install an SMSA, as I discuss in further detail in the section about legal issues subsequently.
Developing SMSAs to Study Individual Social Media Users

Thus far I have only discussed SMSAs for the study of organizational populations. Yet the same technology could potentially be extended to study large groups of individual social media users as well. Indeed, SMSAs offer new
sampling and recruitment methods for the study of social media users. First, targeted samples can be easily extracted from an entire population of social media users, thanks to the remarkably detailed data that social media sites such as Facebook and Twitter maintain about their users for advertising purposes. These databases are far more detailed than conventional sampling frames such as the U.S. Census or American Community Survey. For instance, Facebook gives app developers the option of targeted advertising according to not only basic demographic variables such as age, gender, or minority status but also a variety of far more detailed information about cultural tastes, political preferences, religion, and educational status. At the time of this writing, for example, a researcher could use Facebook advertising to create a sample of expecting parents in a long-distance relationship who already have two children between ages 4 and 12.19

The cost of sampling via social media advertising services is far less than that of conventional sampling methods. Instead of hiring a phone interviewer to screen thousands of households for eligibility in a study, researchers can simply pay sites such as Facebook a predetermined fee each time a user clicks on a link to the app that appears within their "news feed." One may either set a predetermined budget for the entire sampling period or reset the amount each day in accordance with response rates in order to preserve even more research funding. In this way, a researcher may budget for a specific response rate and not pay for any responses in excess of this rate.

Another sampling advantage of SMSAs is that they are easily scalable, since individual social media users may share them with their friends. SMSAs therefore hold the potential to combine targeted stratified samples with snowball or respondent-driven sampling (RDS) approaches.20 Although RDS can introduce significant bias—particularly in smaller studies (e.g., McCreesh et al. 2012; Salganik and Heckathorn 2004)—the scale and direction of this bias can be corrected if the survey component of an app includes basic demographic questions that can be compared to conventional sampling frames. Yet further studies are needed to verify whether SMSAs could generate representative or unbiased samples of individual social media users.

A major challenge in applying SMSA technology to study individual social media users would be to create incentive for them to share their private information with researchers and answer intrusive or time-consuming survey questions. Few social scientists can offer analysis that will be of interest to large populations of social media users. Therefore, the most promising way to incentivize individual social media users to install SMSAs may be to offer them financial compensation. Or, social scientists could
offer free mobile access to the Internet to those willing to install a small number of apps on their phones or tablets each month and answer survey questions where necessary. Such a strategy would follow the example of Knowledge Networks or Time-Sharing Experiments for the Social Sciences (Freese and Visser 2012). Respondents may even prefer to install SMSAs instead of taking online surveys because (a) app-based approaches may require fewer survey questions, since extensive data about respondents can be automatically extracted from social media sites, and (b) apps enable respondents to answer surveys at a time and a place of their choosing. Such arrangements might also benefit scholars conducting longitudinal or follow-up studies, since researchers could simply connect with users via mobile technology instead of laboriously tracking their movement across mailing addresses over time.

SMSAs could also potentially obtain a wealth of new types of data about individual social media users. Because so many people now access social media sites through mobile phones or tablets (Miller 2012), an SMSA could collect detailed information about the GPS coordinates of users when they produce social media messages or collect audiovisual data produced by the user via these mobile technologies as well. Imagine, for example, a study of segregation that asks respondents to explain where they are going when they enter a new zip code for the first time. Or, consider a study that asks a user to explain why she posted a picture of herself and her friends immediately after this action was undertaken. Both of these types of fluid data collection would help explain how much social media data reveal about off-line behavior, or the broader social context in which social media messages are produced. Needless to say, the amount and type of data that could potentially be obtained from individual social media users via SMSAs will be constrained by concerns about privacy and data sharing.
Logistical and Legal Challenges

Despite the considerable promise of SMSAs, I will conclude by highlighting several important challenges that remain before this research tool can be applied on a broader scale. First, SMSA development requires significant technical skills: proficiency in at least one computer programming language as well as considerable knowledge about APIs and the web-hosting services necessary to maintain outreach websites and store data on cloud servers.21 Still, these technical barriers can be surmounted via relatively simple computer programming training or collaborations with computer scientists, who are increasingly interested in incorporating social network analysis and other forms of sociological analysis into the software
tools they produce (Bail 2014).22 The global outsourcing of software development has also significantly lowered the cost of hiring a computer programmer, so an app can now be designed for much less than the cost of fielding a conventional social survey. Finally, I have made the software code used to develop the application described earlier publicly available on my website for others to borrow or develop even further.

Another major challenge that will face those who aim to create SMSAs in the future is the sheer size of the data that might be collected. Data cleaning for the final analysis of this article took four weeks using four of the fastest computers available on the consumer market at the time of this writing. These long time periods are not only the result of the hundreds of millions of data points that must be processed, but also of rate limits placed upon apps by websites such as Facebook, Twitter, and Google. One way to improve the speed of data cleaning and analysis is to enlist powerful new cloud computing technologies that are increasingly available on university campuses or from businesses such as Amazon Web Services. Such services can be used to host the cloud servers necessary to store data for real-time analysis. Once again, renting space on cloud servers is far cheaper than the cost of a conventional mail or telephone survey, though efficient use of such tools requires familiarity with new programming languages designed for "big" data.

Next, the pilot study described earlier revealed recruitment challenges unique to this new method of data collection. In principle, recruitment of study participants should be easier with an SMSA than with a conventional telephone or mail survey, since e-mail contact can be easily automated. Yet spam filters and public suspicion of requests for sensitive Internet information stalled recruitment for my pilot study. Mail and telephone outreach resolved these problems, yet this added additional cost and time to the recruitment effort—though perhaps not as much cost or time as is required to train those who administer conventional surveys. Future SMSAs might avoid these pitfalls by recruiting study participants via the powerful Facebook advertising tools described earlier or by employing new technologies designed to ensure that messages circumvent spam filters. Response rates for studies that employ SMSAs may also improve if this new methodology becomes perceived as a legitimate means for the public to interface with academics online.

A final issue related to the use of SMSAs concerns the complex legal and institutional review board (IRB) issues that may emerge during app development and implementation. These issues are particularly difficult to navigate at this point because intellectual property law for apps and IRB protocols for online Internet studies remain in their infancy. The recent
controversy surrounding a study of emotional contagion by researchers within Facebook demonstrates the potential of social media studies to generate concern about norms for online research (Kramer, Guillory, and Hancock 2014). Still, there is no a priori reason why informed consent should be practiced any differently for SMSAs than for conventional survey research. Indeed, authentication dialogues within apps provide a natural forum for researchers to explain the risks and benefits of participating in an SMSA study and to ensure that the text of such statements loads within a web browser before authentication begins.
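Before concluding, the sketch below illustrates one simple remedy for the rate limits mentioned above: retrying failed API requests with an exponentially growing wait. The helper is hypothetical, and real limits and error behavior vary across Facebook, Twitter, and Google.

```python
# Sketch of rate-limit handling: retry a request, doubling the wait
# after each refusal. The parameters here are illustrative.
import time
import urllib2

def fetch_with_backoff(url, max_tries=5):
    """Retry a URL request, doubling the wait after each failure."""
    wait = 1
    for _ in range(max_tries):
        try:
            return urllib2.urlopen(url).read()
        except urllib2.HTTPError:
            time.sleep(wait)  # back off before retrying
            wait *= 2
    raise RuntimeError('Request failed after %d tries: %s'
                       % (max_tries, url))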
Conclusion

On balance, the technical, logistical, and legal obstacles that may arise in the creation and implementation of SMSAs should not detract from their potential to improve the scope, efficiency, cost, scalability, sampling, response rates, and convenience of studying organizations and collective behavior using social media data. Social scientists simply cannot ignore how the sea change in communication technology away from landline phones toward mobile Internet devices will profoundly recast not only the type of data available for research but also the manner in which social scientists interface with the broader public. Future technological advances will only continue to increase the wealth of social media data available to social scientists—as well as the urgency of new methods to tame this tidal wave of big data. Although I am overall optimistic about the potential of computational social science to open new lines of inquiry about organizations and collective behavior, it is clear that publicly available social media data serve little purpose beyond crude descriptions of unidentified populations of Internet users. New methods such as SMSAs are urgently needed to take advantage of the wealth of new data available on social media sites without sacrificing the rigor and representativeness of conventional social science research.

Acknowledgment

I am grateful to Paul DiMaggio, Peter Marsden, and Andrew Perrin for helpful comments on previous drafts. I thank Taylor Whitten-Brown, Steven Merritt, Raina Sheth, David Jones, and Nate Carroll for research assistance.
Author’s Note Previous versions of this manuscript were presented at the University of Michigan, Notre Dame University, the University of North Carolina at Chapel Hill, and the Annual Meetings of the American Sociological Association.
Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Robert Wood Johnson Foundation and the National Science Foundation (Award #1357223).
Notes

1. Recognizing the considerable potential of these new data sources, the U.S. National Science Foundation recently launched a large initiative to develop infrastructure to support the collection, analysis, and distribution of social media data. See National Science Foundation, "NSF Leads Federal Efforts in Big Data," Press Release #12-060, March 29, 2012, accessed August 2013, http://www.nsf.gov/news/news_summ.jsp?cntn_id=123607.
2. Canvas pages may be created via Facebook's website for software developers: http://developers.facebook.com.
3. Annotated Python code for the sample social media survey app (SMSA) discussed subsequently is provided on the author's website (https://github.com/cbail/App-for-Studying-Organizational-Behavior-on-Social-Media).
4. As previously mentioned, most data on Facebook is protected by privacy barriers, though many users—knowingly or unknowingly—continue to set their privacy settings to "public," which allows anyone access to all of the information on their personal Facebook page. The potential for social scientists to collect data from such Facebook users without their expressed permission raises both ethical and legal issues related to informed consent and ownership of social media data. I do not advocate this form of data collection from Facebook or any other social media site. By contrast, organizations, businesses, and other public entities who maintain Facebook pages typically set their privacy settings to "public" because they are explicitly interested in recruiting new online audiences. Large amounts of data can be collected from such entities via Facebook's application programming interface (API) without authentication, as I discuss elsewhere (Bail 2015).
5. For a complete list of the data available through Facebook Insights, see https://developers.facebook.com/docs/graph-api/reference/insights/.
6. An overview of the language used on Facebook's Graph API is available here: https://developers.facebook.com/docs/graph-api/.
7. If a user is already logged into Facebook when loading the Graph API Explorer into their web browser, they will see what information is available from their own Facebook account. This can be useful in early app development, when the developer does not yet have access to other user accounts to test the software.
8. On declining response rates in social science research, see Fox, Crask, and Kim (1988), Curtin, Presser, and Singer (2000), Marsden and Wright (2010), Maynard et al. (2011), and Perrin and McFarland (2011).
9. For example, various encryption algorithms, or lines of code that instruct the program to log the user out of the website after she or he has been inactive for some time (the second sketch following these notes illustrates such an inactivity logout).
10. There are a number of databases of organizational populations that could be used to this end, such as the Encyclopedia of Associations or the GuideStar database of nonprofit organizations, which is based upon data from the Internal Revenue Service. For an overview of such databases, see Andrews et al. (2012).
11. For a discussion of how to obtain such authentication tokens, see http://developers.facebook.com/docs/facebook-login/access-tokens.
12. Qualitative fieldwork is also not well suited to studying how advocacy organizations attract the attention of bystanders, since such individuals are usually not present at the sites of collective action that these methods might target for interview recruitment.
13. While such analyses are necessarily limited to the realm of social media, Earl and Kimport (2011) show that this medium is rapidly becoming one of the most important channels of communication for collective behavior more broadly. Similarly, a recent survey revealed that fully 97 percent of nonprofit organizations maintain a regular presence on Facebook in an attempt to reach new audiences. See Non-profit Technology Enterprise Network (2012). At a minimum, these data suggest social media is a critical channel for advocacy groups to reach bystanders because it is free and efficient and because messages have considerable potential to "go viral" across vast social networks. To give only one of many possible examples, consider the remarkable success of the "Kony 2012" campaign to call attention to the use of child soldiers by militant groups in Uganda.
14. The 30-fan cutoff was necessary because Facebook only provides data about audience behavior for organizations whose audience is at least this size, in order to preserve the confidentiality of its users.
15. Because this information is publicly accessible via the Facebook API, this process can be easily automated (see the first sketch following these notes).
16. Needless to say, no single researcher, or even a group of researchers, could possibly code the hundreds of thousands of texts extracted by the SMSA described earlier. Yet recent advances in automated content analysis techniques such as Latent Dirichlet Allocation can learn from the actions of human coders and extend them toward thousands or even millions of other documents within hours (the third sketch following these notes shows the skeleton of such a workflow). For a nontechnical overview of this technique, see Blei (2012); for a detailed technical explanation, see Blei, Ng, and Jordan (2003). For recent applications of this technique by sociologists, see DiMaggio, Nag, and Blei (2014), Mohr and Bogdanov (2013), and Bail (2014).
17. These data were compiled using the GuideStar database described in the discussion of sampling earlier.
18. On the other hand, app development will impose additional costs on those who lack computer programming skills or cannot identify collaborators in fields such as computer science or information science.
19. These advertising tools may not only facilitate the study of difficult-to-access populations but also enrich assessment of response bias across all populations. That is, the advertising tools of social media sites can be synchronized with conventional sampling frames such as the American Community Survey in order to ensure that the social media users sampled are representative of the broader public.
20. One may differentiate users who installed the app after viewing an advertisement within their "news feed" from those who did so at the suggestion of a Facebook friend. This enables differentiation of those recruited through targeted samples from those recruited through respondent-driven samples.
21. SMSAs that collect very large amounts of data might also require expertise in big data management tools such as Hadoop, Hive, or MapReduce.
22. For example, in 2009, 11 of the largest national science foundations in the world launched an interdisciplinary competition entitled "Digging into Data" to facilitate interaction between computer scientists and other disciplines. This cross-national interdisciplinary initiative has already distributed tens of millions of dollars to researchers worldwide, yet no sociologists had participated at the time of this writing.
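To make the unauthenticated data collection described in notes 4, 14, and 15 more concrete, the first sketch below retrieves the publicly visible profile of an organizational Facebook page and screens a hypothetical sampling frame against the 30-fan threshold. This is a minimal illustration rather than the SMSA code referenced in note 3: the Graph API base URL and field names reflect the interface as commonly documented around the time of writing, and the example page identifiers are invented.

# Sketch 1: an unauthenticated Graph API query for a public organizational
# page (notes 4, 14, and 15). The endpoint, fields, and page identifiers
# are illustrative assumptions.
import requests

GRAPH_URL = "https://graph.facebook.com"

def get_public_page(page_id):
    """Request the publicly visible profile of a Facebook page."""
    fields = "id,name,about,likes,website"  # 'likes' holds the fan count
    response = requests.get("{0}/{1}".format(GRAPH_URL, page_id),
                            params={"fields": fields})
    response.raise_for_status()
    return response.json()

def screen_by_fan_count(page_ids, minimum=30):
    """Keep pages whose fan count meets Facebook's 30-fan threshold for
    releasing audience data (note 14)."""
    return [page for page in map(get_public_page, page_ids)
            if page.get("likes", 0) >= minimum]

# Hypothetical sampling frame of organizational page identifiers.
eligible = screen_by_fan_count(["examplenonprofit1", "examplenonprofit2"])

Because such queries require no authentication token, they can be looped over an entire sampling frame, such as the GuideStar-derived list described earlier, before any organization is invited to install the app.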
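Note 9 mentions automatically logging users out after a period of inactivity as one safeguard for respondent data. The second sketch shows one way such a timeout might be configured; the article does not specify a web framework, so the use of Flask here, and the 20-minute window, are assumptions for illustration only.

# Sketch 2: an inactivity logout (note 9), assuming a Flask-based app.
from datetime import timedelta
from flask import Flask, session

app = Flask(__name__)
app.secret_key = "replace-with-a-long-random-value"  # signs session cookies

# Invalidate the session 20 minutes after the user's last request, so an
# idle respondent is effectively logged out of the survey app.
app.permanent_session_lifetime = timedelta(minutes=20)

@app.before_request
def refresh_session_expiry():
    # Mark the session permanent so the lifetime above applies; Flask then
    # refreshes the expiry on each request the respondent makes.
    session.permanent = True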
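Finally, note 16 argues that topic models such as Latent Dirichlet Allocation can extend a small amount of human coding to very large text corpora. The third sketch shows the skeleton of such a workflow using the gensim library; the library choice, topic count, and toy documents are assumptions, since the article cites the technique (Blei, Ng, and Jordan 2003) but not a particular implementation.

# Sketch 3: a minimal LDA workflow (note 16), assuming the gensim library.
from gensim import corpora, models

# Toy stand-ins for the hundreds of thousands of posts an SMSA collects.
documents = [
    "volunteers organize a food drive for local families",
    "donate today to support clean water projects",
    "our annual report shows growth in community programs",
]
tokenized = [doc.lower().split() for doc in documents]

# Map each unique word to an integer id and represent each document as a
# bag-of-words vector over that vocabulary.
dictionary = corpora.Dictionary(tokenized)
corpus = [dictionary.doc2bow(tokens) for tokens in tokenized]

# Fit the model; real applications would use far more documents and select
# the number of topics through model comparison and human validation.
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2, passes=10)
for topic_id, top_words in lda.print_topics():
    print(topic_id, top_words)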
References

Andrews, Kenneth, Anne Hunter, and Bob Edwards. 2012. "Methodological Strategies for Examining Populations of Social Movement Organizations." Working Paper, Department of Sociology, University of North Carolina, Chapel Hill.
Bail, Christopher. 2014. "The Cultural Environment: Measuring Culture with Big Data." Theory and Society 43:465-82.
Bail, Christopher. 2015. Terrified: How Anti-Muslim Fringe Organizations Became Mainstream. Princeton, NJ: Princeton University Press.
Bakshy, Eytan, Itamar Rosenn, Cameron Marlow, and Lada Adamic. 2012. "The Role of Social Networks in Information Diffusion." Pp. 519-28 in Proceedings of the 21st International Conference on World Wide Web, WWW '12. New York: ACM. Retrieved August 5, 2014 (http://doi.acm.org/10.1145/2187836.2187907).
Baruch, Yehuda and Brooks C. Holtom. 2008. "Survey Response Rate Levels and Trends in Organizational Research." Human Relations 61:1139-60.
Benford, Robert and David Snow. 2000. "Framing Processes and Social Movements: An Overview and Assessment." Annual Review of Sociology 26:611-39.
Blei, David. 2012. "Probabilistic Topic Models." Communications of the ACM 55:77-84.
Blei, David, Andrew Ng, and Michael Jordan. 2003. "Latent Dirichlet Allocation." Journal of Machine Learning Research 3:993-1022.
Bollen, Johan, Huina Mao, and Xiaojun Zeng. 2011. "Twitter Mood Predicts the Stock Market." Journal of Computational Science 2:1-8.
Bond, Robert M., Christopher J. Fariss, Jason J. Jones, Adam D. I. Kramer, Cameron Marlow, Jaime E. Settle, and James H. Fowler. 2012. "A 61-million-person Experiment in Social Influence and Political Mobilization." Nature 489:295-98.
Brulle, Robert, Liesel Turner, Jason Carmichael, and J. Craig Jenkins. 2007. "Measuring Social Movement Organization Populations: A Comprehensive Census of U.S. Environmental Movement Organizations." Mobilization 12:255-70.
Curtin, Richard, Stanley Presser, and Eleanor Singer. 2000. "The Effects of Response Rate Changes on the Index of Consumer Sentiment." Public Opinion Quarterly 64:413-28.
DiMaggio, Paul, Eszter Hargittai, W. Russell Neuman, and John Robinson. 2001. "Social Implications of the Internet." Annual Review of Sociology 27:307-36.
DiMaggio, Paul, Manish Nag, and David Blei. 2014. "Exploiting Affinities between Topic Modeling and the Sociological Perspective on Culture: Application to Newspaper Coverage of Government Arts Funding in the U.S." Poetics 41:570-606.
Donsbach, Wolfgang and Michael W. Traugott. 2008. The SAGE Handbook of Public Opinion Research. Thousand Oaks, CA: Sage.
Earl, Jennifer and Katrina Kimport. 2011. Digitally Enabled Social Change: Activism in the Internet Age. Cambridge, MA: MIT Press.
Fox, Richard J., Melvin R. Crask, and Jonghoon Kim. 1988. "Mail Survey Response Rate: A Meta-analysis of Selected Techniques for Inducing Response." Public Opinion Quarterly 52:467-91.
Freese, Jeremy and Penny Visser. 2012. "Time-Sharing Experiments for the Social Sciences." NSF Grant 0818839.
Gamson, William. 2004. "Bystanders, Public Opinion, and the Media." Pp. 242-61 in The Blackwell Companion to Social Movements, edited by D. A. Snow, S. A. Soule, and H. Kriesi. Malden, MA: Blackwell.
Golder, Scott and Michael Macy. 2014. "Digital Footprints: Opportunities and Challenges for Social Research." Annual Review of Sociology 40:129-52.
Howard, Philip N., Aiden Duffy, Deen Freelon, Muzammil Hussain, Will Mari, and Marwa Mazaid. 2011. "Opening Closed Regimes: What Was the Role of Social Media during the Arab Spring?" Retrieved May 2015 (http://ictlogy.net/bibliography/reports/projects.php?idp=2170).
King, Gary. 2011. "Ensuring the Data Rich Future of the Social Sciences." Science 331:719-21.
Koopmans, Ruud and Susan Olzak. 2004. "Discursive Opportunities and the Evolution of Right-wing Violence in Germany." American Journal of Sociology 110:198-230.
Kramer, Adam D. I., Jamie E. Guillory, and Jeffrey T. Hancock. 2014. "Experimental Evidence of Massive-scale Emotional Contagion through Social Networks." Proceedings of the National Academy of Sciences 111:8788-90.
Kristofferson, Kirk, Katherine White, and John Peloza. 2014. "The Nature of Slacktivism: How the Social Observability of an Initial Act of Token Support Affects Subsequent Prosocial Action." Journal of Consumer Research 40:1149-66.
Lavrakas, Paul. 2010. "Telephone Surveys." Pp. 471-98 in Handbook of Survey Research, edited by Peter Marsden and James Wright. Bingley, UK: Emerald Group Publishing.
Lazer, D., A. Pentland, L. Adamic, S. Aral, A. L. Barabasi, D. Brewer, N. Christakis, N. Contractor, J. Fowler, M. Gutmann, T. Jebara, G. King, M. Macy, D. Roy, and M. V. Alstyne. 2009. "Computational Social Science." Science 323:721-23.
Lewis, Kevin, Kurt Gray, and Jens Meierhenrich. 2014. "The Structure of Online Activism." Sociological Science 1:1-9.
Marsden, Peter and James D. Wright. 2010. Handbook of Survey Research. Bingley, UK: Emerald Group Publishing.
Maynard, Douglas, Nora Schaeffer, and Jeremy Freese. 2011. "Improving Response Rates in Telephone Interviews." Pp. 54-74 in Applied Conversation Analysis, edited by Charles Antaki. New York: Palgrave Macmillan.
McCarthy, John and Mayer Zald. 1977. "Resource Mobilization and Social Movements: A Partial Theory." American Journal of Sociology 82:1212-41.
McCreesh, Nicky, Simon D. W. Frost, Janet Seeley, Joseph Katongole, Matilda N. Tarsh, Richard Ndunguse, Fatima Jichi, Natasha L. Lunel, Dermot Maher, Lisa G. Johnston, Pam Sonnenberg, Andrew J. Copas, Richard J. Hayes, and Richard G. White. 2012. "Evaluation of Respondent-driven Sampling." Epidemiology 23:138-47.
Miller, Geoffrey. 2012. "The Smartphone Psychology Manifesto." Perspectives on Psychological Science 7:221-37.
Minkoff, Debra, Silke Aisenbrey, and Jon Agnone. 2008. "Organizational Diversity in the U.S. Advocacy Sector." Social Problems 55:525-48.
Mohr, John and Petko Bogdanov. 2013. "Topic Modeling, Textual Analysis, and Interdisciplinary Exchange." Poetics 41:545-69.
Non-profit Technology Enterprise Network. 2012. "Non-profit Technology Network's 4th Annual Nonprofit Social Network Survey 2012." Retrieved May 2015 (http://www.nten.org/sites/default/files/2012_nonprofit_social_networking_benchmark_report_final.pdf).
Paul, Michael J. and Mark Dredze. 2011. "You Are What You Tweet: Analyzing Twitter for Public Health." Fifth International AAAI Conference on Weblogs and Social Media.
Perrin, Andrew J. and Katherine McFarland. 2011. "Social Theory and Public Opinion." Annual Review of Sociology 37:87-107.
Salganik, Matthew J., Peter Sheridan Dodds, and Duncan J. Watts. 2006. "Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market." Science 311:854-56.
Salganik, Matthew J. and Douglas D. Heckathorn. 2004. "Sampling and Estimation in Hidden Populations Using Respondent-driven Sampling." Sociological Methodology 34:193-240.
Tomaskovic-Devey, Donald, Jeffrey Leiter, and Shealy Thompson. 1994. "Organizational Survey Nonresponse." Administrative Science Quarterly 39:439-57.
Walker, Edward, John McCarthy, and Frank Baumgartner. 2010. "Replacing Members with Managers? Mutualism among Membership and Non-membership Advocacy Organizations in the U.S." American Journal of Sociology 116:1284-337.
Author Biography

Christopher A. Bail is an assistant professor of sociology at Duke University. He studies how advocacy groups and political actors create cultural change by analyzing large corpora collected from digital sources. His research has been published by Princeton University Press, the American Sociological Review, Sociological Theory, and Theory and Society; recognized by awards from the American Sociological Association; and supported by the National Science Foundation and the Robert Wood Johnson Foundation.