Social Networks 35 (2013) 309–316

Social Networks journal homepage: www.elsevier.com/locate/socnet

Variations in network boundary and type: A study of adolescent peer influences

Thomas W. Valente a,∗, Kayo Fujimoto b, Jennifer B. Unger a, Daniel W. Soto a, Daniella Meeker c

a Institute for Prevention Research, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
b Division of Health Promotion and Behavioral Sciences, University of Texas Health Science Center at Houston, Houston, TX, United States
c Rand Corporation, Santa Monica, CA, United States

Article info

Keywords: Multi-level, Adolescents, Social influence, Network type, Boundary

Abstract

This study compares variation in network boundary and network type on network indicators such as degree and estimates of social influences on adolescent substance use. We compare associations between individual use and peer use of tobacco and alcohol when network boundary (e.g., classroom, entire grade in school, and community) and relational type (elicited by asking whom students: (a) are friends with, (b) admire, (c) think will succeed, (d) would like to have a romantic relationship with, and (e) think are popular) are varied. Additionally, we estimate Exponential Random Graph Models (ERGMs) for 232 networks to obtain a homophily estimate for smoking and drinking. Data were collected from a cross-sectional sample of 1707 adolescents in five high schools in one school district in Los Angeles, CA. Results of logistic regression models show that associations were strongest when the boundary condition was least constrained and that associations were stronger for friendship networks than for other ones. Additionally, ERGM estimations show that grade-level friendship networks returned significant homophily effects more frequently than the classroom networks. This study validates existing theoretical approaches to the network study of social influence as well as ways to estimate them. We recommend researchers use as broad a boundary as possible when collecting network data, but observe that for some research purposes more narrow boundaries may be preferred. © 2013 Elsevier B.V. All rights reserved.

1. Introduction

Social network analysis has become a prominent new paradigm in the social and behavioral sciences. In public health, etiological and application research has addressed many domains to demonstrate the importance of a network approach to understanding human behavior (Valente, 2010). The importance of social network influences on behaviors is well established and the advantages of a network approach to understanding a wide variety of phenomena are clear (Wasserman and Faust, 1994; Sacerdote, 2001; Monge and Contractor, 2003; Cross and Parker, 2004; Newman et al., 2006; Christakis and Fowler, 2007; Valente, 2010). Considerable basic network science, however, remains to be done on the best ways to collect and analyze network data in order to understand how social networks influence behavior.

This study was designed to compare variation in network boundary and relational type on estimates of social influences on adolescent substance use. We compare associations between individual and peer use of tobacco and alcohol when network boundary and type are varied. We use cross-sectional data derived from 12 network measures collected from a sample of 10th grade students in five high schools located within one school district in Los Angeles, CA.

∗ Corresponding author. Tel.: +1 323 442 8238. E-mail address: [email protected] (T.W. Valente).
0378-8733/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.socnet.2013.02.008

1.1. Boundary specification

Research on social networks acknowledges the boundary specification problem as one that has been a challenge for many years (Lincoln and Miller, 1979; Knoke and Kuklinski, 1982; Laumann et al., 1983). Generally, different methodologies are used for different scenarios: ego-centric methods when no boundary is specified, snowball methods for a partial one, and sociometric methods for a complete one (Morris, 2004). Few studies have compared the effects of boundary specification issues (Doreian, 1992), and no studies have explicitly compared network metrics and effects when boundary conditions are varied.

As Laumann and others (Laumann et al., 1983) note, most network research defines the boundary by the respondents (realist approach) or by the researcher (nominalist approach). Boundaries defined by study participants (realist approach) should provide more valid data than boundaries drawn by the researcher because they allow participants to indicate all relevant network ties, not just those contained within the boundary defined by the researcher. For most network research this provides

T.W. Valente et al. / Social Networks 35 (2013) 309–316

network information that is relevant and important to the respondent. Conversely, researcher specified boundaries (nominalist approach) permit specification of the arena of ties that the researcher considers relevant for the phenomenon of study. The data are valid from the researchers’ perspective, though not necessarily from the respondents’ perspective. For example, in the context of adolescent risk behaviors, allowing adolescents to specify their friendships regardless of location can return important friendship influences beyond the researchers’ boundary of the school. Conversely, these extra-school ties may not be relevant for understanding the level of risk influence occurring within schools.

In this study we ask two boundary specification questions. The first concerns network metrics: Do individual network metrics vary when the boundary condition is varied? For example, are individuals who are central in a smaller network also central in the larger one that contains the smaller units? The second research question addresses network effects: Do estimates of network influences on behaviors vary when boundary conditions are varied? Two competing hypotheses exist: (1) Adolescents can be influenced by friends in any environment, not just their classroom, so the widest possible boundary should give the best estimates of network influences. Conversely, (2) adolescents are most influenced by the friends they have the most contact with, so most of the influence probably occurs in the narrowest network, and the wider networks have only weak effects. We know of no research that has explicitly investigated this topic.

In this study we ask the same friendship question three times, varying the boundary from which respondents can choose. The least constrained condition allows the student to name their seven closest friends regardless of location, which we refer to as the community boundary (i.e., an ego-centric measure). In separate sections we ask respondents to name their seven closest friends in their classroom and their seven closest friends in the grade (10th grade). Facebook friends were also measured for a subset of the sample. The association between individual behaviors and those of their friends can be compared across these four networks.

We anticipate the strength of association between ego’s behavior and that of their friends to be strongest for the least constrained boundary specification (community) and weakest for the most constrained (classroom). We are uncertain about the strength of association for online networks because they are unconstrained by geography or institution (class versus non-class), but constrained because some friends are not social networking site members. Consequently, we anticipate behavioral associations for online networks to be intermediate between unconstrained (community or ego-centric) and most constrained (classroom).

1.2. Relational type

The third research question concerns variation of network associations by relational type. In addition to asking participants to name their friends, for the classroom and grade network boundaries we asked students to choose from a roster up to seven students whom they think: (1) are the most admired, (2) are most likely to succeed, (3) they would like to have a romantic relationship with, and (4) are the most popular. Consequently, we have five networks within two of the boundary conditions (class and grade) to compare network metrics and behavioral associations.

Past research on network determinants of behavior has usually been restricted to the examination of one network; and much of the history of social network analysis has been the development of methods usually restricted to one type of network (Wasserman and Faust, 1994; Freeman, 2004). However, some early network research acknowledged the role of different types of networks.
For example, the classic Coleman and others (Coleman et al., 1966) study of medical innovation measured discussion, advice, and friendship among 125 physicians in four communities in Illinois in the mid-1950s. Coleman and others showed that network influences on the uptake of tetracycline were associated with advice and discussion networks early in the diffusion process, and friendship networks later.

For adolescents, friendships are perhaps the most salient and robust relationship influencing drinking and smoking behavior (Fujimoto and Valente, 2012a, 2012b). Other networks, however, may also be important. Romantic relationships are important both for developmental reasons and for the role they might play in risk behavior (Bearman et al., 2004). Perceptions of who is popular can influence risk behavior, with studies showing that being popular puts one at risk for such behavior (Schwartz and Gorman, 2011; Tucker et al., 2011). Students perceived as being successful are perhaps more likely to perform well academically. These relationships may also be classified as relational (friendships), based on achieved attributes (admire, succeed, and popular), and/or aspirational (romantic) ones.

We expect that friendships will exhibit the strongest associations with behaviors for the students in this study. We believe this in part because friendships are implied personal relationships consisting of considerable face-to-face contact, direct communication, and affective exchange. Friendships thus create opportunities for influence and expectations for behaviors not present in the other relations. In contrast, relations based on attributes (admire, succeed and popular) should have less influence because there is less opportunity for direct persuasion or peer pressure. In addition, attribute relations reveal who in the community possesses certain attributes but do not imply that those attributes are ones a respondent aspires to. For example, an employee can admire the computing abilities of a colleague without wanting or needing to possess those attributes him/herself. Similarly, an adolescent can acknowledge the popularity or academic success of another student without necessarily wanting to emulate those qualities.

Aspirational ties should be associated with behavior since the respondent reports wanting to have a relationship with that alter, and these relationships are most likely formed when they engage in the same behavior. Friendship relations are also more likely to be associated with behavioral homophily because of their greater likelihood of reciprocity, thus creating a stronger tie among the pair.
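The first boundary question of Section 1.1 (whether students who are central in a smaller network are also central in the larger one containing it) can be made concrete with a small sketch. It assumes nominations are stored as (nominator, nominee) pairs, counts in-degree under a classroom and a grade boundary, and correlates the two. The student IDs, the toy ties, and the helper names `in_degree` and `pearson` are illustrative assumptions, not the study's data or code.

```python
from collections import Counter
from math import sqrt

def in_degree(nominations, students):
    """Count nominations received by each student, in roster order."""
    counts = Counter(nominee for _, nominee in nominations)
    return [counts.get(s, 0) for s in students]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical roster and toy ties (not study data).
students = ["a", "b", "c", "d"]
classroom = [("a", "b"), ("c", "b"), ("d", "a")]
# The grade boundary contains the classroom ties plus ties that cross classrooms.
grade = classroom + [("d", "b"), ("b", "c"), ("a", "c")]

class_deg = in_degree(classroom, students)
grade_deg = in_degree(grade, students)
r = pearson(class_deg, grade_deg)
print(class_deg, grade_deg, round(r, 2))
```

A high correlation in such a comparison would suggest that narrowing the boundary preserves the ranking of students by nominations received, even though the absolute counts shrink.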

2. Methods

The data for this study come from a cross-sectional sample of 1707 students interviewed in five schools in one school district in Los Angeles. We approached the school district superintendent and obtained support for the study. We followed up with principals and teachers from each school and asked permission to conduct paper and pencil surveys in the schools. All five high schools in the school district participated and students were interviewed in October 2010. Of the 2290 10th grade students, 2016 returned valid parental consent forms (88.0%) with 1823 agreeing to participate in the study. Some 28 of these did not provide student assent, reducing the eligible pool to 1795 students of whom 1707 completed surveys (74.5% overall participation rate). Students completed surveys during the regular school day in English (3 schools) or History (2 schools) class.

Smoking was initially coded into five categories based on responses to six questions regarding smoking frequency and intention. The distribution was highly skewed with 1085 (69.6%) students reporting never having smoked, for example. Therefore smoking was dichotomized into an ever smoked variable with students coded as ever-smokers if they responded yes to having ever smoked or smoked in the past 30 days, 479 (31.4%). Those with missing responses (n = 148) were coded alternately as ever-smokers, as non-smokers, and with a probability equal to the prevalence. Alcohol use was also coded into five categories and similarly transformed into a dichotomous variable which measured whether they ever tried alcohol, other than for religious purposes, 744 (59.8%).

Social networks were measured by providing a roster of all students in the 10th grade using a photo ID of each student with an ID number printed below the picture. We asked the same friendship question three times, varying the boundary from which respondents can choose. We first asked: “Please think of your seven BEST FRIENDS regardless of where they live or go to school. Be sure to write your friends’ real names and not their nicknames.” In separate sections we asked respondents to name their seven closest friends in their classroom and their seven closest friends in their grade (10th grade). For the classroom and grade networks, students were referred to the roster. The least constrained condition allows the respondent to name their seven closest friends regardless of location, which we refer to as the community boundary. Facebook friends were also measured for a subset of the sample (n = 517, or 30.3%), as described below.

For the other four network types, after the friend question we asked students: “Using the CLASS ROSTER, write the numbers of the 7 students in this classroom who you think: (a) Are the most admired; (b) Are the most likely to succeed; (c) You would like to have a romantic relationship with; and (d) Are the most popular.” For the grade network we substituted “grade” for “classroom.”

2.1. Facebook data collection

Survey participants provided their Facebook (FB) IDs in the survey and they were invited to “friend” the Study’s Facebook Profile. Using the Facebook API and the Facebook “areFriends” function, which produces a binary response indicating the presence of a tie between two users, all two-way connections between identified users were tested.
User identities were verified when possible by matching users’ publicly reported profile content, including birth date, location, and school. Multiple probes of connections were tested through June 2011.

2.2. Analytic approach

We first present descriptive statistics and then correlate degree scores between the networks. Rates of overlap between the five networks are then reported along with Quadratic assignment procedure (QAP) correlations. For network associations with behaviors, two analyses are reported: First an exposure model was calculated in which the key independent variable is the average behavior of the respondent’s network partners for each network using a random effects logistic regression specifying school as the random effect. If a person did not name anyone in response to a specific question, the exposure was coded as missing. Models substituting zero for students who named no one to a specific network question were also calculated. Sensitivity analyses were conducted with both of these regression models by re-calculating them with outcomes coded as zero when they were missing; and using the five-point smoking and drinking measures. Second, exponential random graph models (ERGMs) were estimated for each network measured in the study using a common model with coefficients and associated p-values averaged.

3. Results

Table 1
Behavior rates, socio-demographic characteristics, and degree scores for the sample (N = 1707).

Variable                                   Value   Standard deviation   Range
Percent ever smoke (%)                     31.4    44.6                 0–100
Percent ever drink (%)                     59.8    49.6                 0–100
Percent male (%)                           49.9    50                   0–100
Percent Hispanic (%)                       65.3    47.6                 0–100
Average age                                15.1    0.43                 14–18
Percent free lunch (%)                     57.8    49.4                 0–100
Average rooms                              3.2     1.27                 1–7
Average grades (1 = all Fs; 9 = all As)    6.1     1.98                 1–9
Self-reported health (5 = excellent)       3.45    0.95                 1–5
Percent parent smoke (%)                   26.7    44.2                 0–100
Percent sibling smoke (%)                  16.1    36.8                 0–100
Percent parent drink (%)                   40.9    49.2                 0–100
Percent sibling drink (%)                  7.6     26.5                 0–100
Nominations (degree scores)
  Friend anywhere                          5.42    1.80                 0–7
  Grade
    Friend                                 4.89    2.13                 0–7
    Admire                                 1.79    2.31                 0–7
    Succeed                                2.21    2.44                 0–7
    Romantic                               0.67    1.49                 0–7
    Popular                                1.71    2.33                 0–7
  Class
    Friend                                 3.20    2.24                 0–7
    Admire                                 1.29    1.82                 0–7
    Succeed                                1.72    2.00                 0–7
    Romantic                               0.29    0.91                 0–7
    Popular                                1.18    1.77                 0–7
  Facebook                                 24.7    33.8                 1–167

Missing data changes: 77 missing recoded to 1; 114 missing given school mean; 126 substituted mean; 153 missing; 179 missing coded as zero; 198 missing coded as zero. Facebook: N = 517.

Table 1 reports the risk behavior rates, socio-demographic characteristics, and in-degree scores for the sample (N = 1707). Some 31.4% of students reported ever-smoking and 59.8% reported ever-drinking. These rates are somewhat consistent with those reported nationally in the Monitoring the Future study, which reports 10th

grade lifetime smoking of 33% and lifetime alcohol use of 58.2% (Johnston et al., 2011); and they are consistent with California data collected by other agencies (WestEd., 2012). Half the students were male (49.9%) with 65.3% being of Hispanic/Latino descent. The average age was 15.1 years and 57.8% qualified for reduced or free lunch. Students reported an average 3.2 rooms in their home (excluding kitchen and bathrooms), average grades of 6.1 (on a scale in which 1 = all Fs and 9 = all As, and 6.1 is between “Bs” and “Bs and Cs”), and a self-reported health status of 3.45 (1 = poor and 5 = excellent). Parental smoking was 26.7% and sibling smoking was 16.1%; parental drinking was 40.9% and sibling drinking 7.6%.

The degree scores across the 12 networks are also reported in Table 1 and show some variation between boundary conditions and network type. The lowest degree average was for romantic partners within the classroom (0.29) and the highest was friends anywhere (5.42). The second highest degree score was for friends in the grade (4.89). In the survey, respondents were allowed to name an additional 12 friends for a maximum of 19, which yielded an average out-degree of 7.07. Analyses here were restricted to the first seven named so the friendship networks would be comparable to the others. For those with accessible social networking site profiles on Facebook, the average number of friends identified within these schools was 24.7 (FB links can cross schools).

To determine whether position varies by network boundary, we present a series of simple correlations between the in-degree scores for each network question within the classroom and within the grade: friendship, 0.50 (p < 0.001); admiration, 0.60 (p < 0.001); success, 0.68 (p < 0.001); romance, 0.51 (p < 0.001); and popularity,

0.64 (p < 0.001). Adding socio-demographic controls did not significantly affect the magnitude of these associations. These results indicate that students who receive many nominations when the boundary is restricted to the classroom also receive many nominations when expanded to the grade. The correlations may be inflated by the presence of many zeros; that is, the correlations are not high because of agreement but rather they are high because of the absence of nominations. The correlations were re-calculated restricting analyses to students who received at least one nomination in the classroom for each network: friendship, 0.47 (p < 0.001, n = 1533); admiration, 0.56 (p < 0.001, n = 851); success, 0.68 (p < 0.001, n = 977); romance, 0.51 (p < 0.001, n = 318); and popularity, 0.64 (p < 0.001).

We repeated these analyses for out-degree and discovered equally strong correlations for the entire sample and somewhat attenuated ones when analyzing only those with at least one nomination sent. The correlation was lowest for friendship nominations made between class and grade, 0.39 (p < 0.001), and it exceeded 0.50 for the other four networks. The correlations between the ego-centric friend out-degree and the class and grade ones were lower, 0.33 (p < 0.001) and 0.38 (p < 0.001), respectively. These data suggest that out- and in-degree scores for the students are similar whether the boundary is restricted to the classroom or the grade.

Table 2 reports the overlap in the nominations for the different networks. The off-diagonal entries indicate the proportion of the network common to each network. For example, 15% of the friend nominations within grade were also admire nominations (cell 1,2). Conversely, 46% of the admire nominations were friend nominations (cell 2,1). There are many more friend nominations, and on average 12% of them appear as nominations in the other networks whereas 42% of other network nominations appear as friend ones.

Table 2
Overlap proportions for grade and classroom networks.

                  Friend   Admire   Succeed   Romantic   Popular   Average
Grade level
  Friend          –        0.15     0.18      0.04       0.11      0.12
  Admire          0.46     –        0.36      0.07       0.35      0.31
  Succeed         0.44     0.29     –         0.05       0.18      0.24
  Romantic        0.40     0.23     0.21      –          0.20      0.26
  Popular         0.38     0.39     0.25      0.07       –         0.27
  Average         0.42     0.27     0.25      0.06       0.21
Classroom level
  Friend          –        0.18     0.23      0.03       0.14      0.15
  Admire          0.46     –        0.44      0.07       0.41      0.35
  Succeed         0.43     0.32     –         0.04       0.20      0.25
  Romantic        0.41     0.33     0.27      –          0.34      0.34
  Popular         0.43     0.46     0.31      0.08       –         0.32
  Average         0.43     0.32     0.31      0.06       0.27
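The asymmetry in the overlap proportions follows from the choice of denominator: the shared ties are divided by the size of whichever network is taken as the base. A minimal sketch, assuming each network is stored as a set of directed (nominator, nominee) pairs; the function name `overlap_proportion` and the toy ties are illustrative and are not the study's data or code.

```python
def overlap_proportion(base, other):
    """Share of directed nominations in `base` that also appear in `other`."""
    base, other = set(base), set(other)
    if not base:
        return float("nan")  # undefined when the base network has no ties
    return len(base & other) / len(base)

# Toy (nominator, nominee) pairs; IDs are illustrative, not study data.
friend = {(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (4, 1)}
admire = {(1, 2), (3, 4)}

# Same shared tie, different denominators.
friend_in_admire = overlap_proportion(friend, admire)  # 1 of 6 friend ties
admire_in_friend = overlap_proportion(admire, friend)  # 1 of 2 admire ties
print(round(friend_in_admire, 2), round(admire_in_friend, 2))
```

Because friendship networks contain many more nominations than the others, the same shared ties make up a small share of friend nominations but a large share of the smaller network's nominations.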
Two observations arise: First, there is asymmetry in the overlap in the friend and romantic relationships but not the other three. The romantic ones are particularly asymmetric as people named in the romantic network appear a third of the time as those named in response to other network questions. In contrast, the people named in other network questions (friendship, admire, succeed, popular) are only named as romantic partners 6% of the time. Second, the rate and pattern of overlap between the networks is similar in the classroom and grade-level boundaries. This is surprising as one would expect the classroom boundary networks to have more overlap (because they are much smaller) than the grade ones. Quadratic assignment procedure (QAP) correlations were also calculated to determine similarity across the networks. Like the overlap scores, the QAP correlations are weakest for the romantic network. Excluding romantic networks, correlations between the other 4 networks range from a low of 0.17 (classroom friend and popular) to a high of 0.38 (grade level admire and popular). All of the grade level network correlations were positive and statistically

significant whereas the class ones varied from −0.10 to 1, and attained statistical significance about 2/3 of the time. Admiration seems to be the network most strongly correlated with the others (3 of the top 4 correlations include admiration, in both the classroom and grade networks). There is a very slight tendency for the grade-level correlations to be higher than their respective classroom ones but the difference is modest and not universal (Table 3).

Table 3
Quadratic assignment permutation (QAP) correlations between networks: grade coefficients are above the diagonal and classroom ones below. Numbers in parentheses indicate the proportion of QAP coefficients that were statistically significant.

            Friend        Admire        Succeed       Romantic      Popular
Friend      –             0.27 (1.0)    0.28 (1.0)    0.11 (1.0)    0.20 (1.0)
Admire      0.20 (0.59)   –             0.33 (1.0)    0.11 (1.0)    0.38 (1.0)
Succeed     0.22 (0.61)   0.26 (0.60)   –             0.09 (1.0)    0.22 (1.0)
Romantic    0.10 (0.33)   0.15 (0.36)   0.12 (0.33)   –             0.11 (1.0)
Popular     0.17 (0.56)   0.30 (0.60)   0.19 (0.50)   0.12 (0.38)   –

Logistic regression models were estimated to determine if behavioral associations (i.e., the associations between individual and network substance use) varied by network boundary and type. For each model covariates normally associated with smoking and drinking were included and co-varied with outcomes in the expected directions: age, male sex, Hispanic/Latino ethnicity, qualifying for reduced or free lunch, number of rooms in the household, academic achievement, parental smoking/drinking, and sibling smoking/drinking.

Table 4
Logistic regression results of network exposure (a) effect on ever smoked and drank (N = 1707).

                 Ever smoked                     Ever drank
                 AOR (b) (std. error)   N        AOR (b) (std. error)   N
Anywhere (c)
  Friend         37.1* (10.1)           1466     47.6** (17.7)          1428
Grade
  Friend         2.34** (0.49)          1374     3.22** (0.65)          1367
  Admire         1.56 (0.39)            727      1.65* (0.40)           736
  Succeed        1.48 (0.41)            850      1.65* (0.35)           866
  Romantic       1.62 (0.52)            354      1.29 (0.44)            361
  Popular        1.88* (0.48)           678      2.46** (0.66)          690
Class
  Friend         2.04** (0.40)          1228     2.27** (0.42)          1223
  Admire         1.28 (0.34)            647      1.07 (0.24)            662
  Succeed        1.42 (0.38)            841      0.82 (0.16)            857
  Romantic       2.13 (0.84)            197      1.06 (0.49)            206
  Popular        1.29 (0.33)            610      1.32 (0.33)            624

(a) Note: Regression controls for age, male sex, Hispanic ethnicity, being on free lunch, rooms in household, academic achievement, health status, parental smoking, and sibling smoking; as well as a random effect for school.
(b) Note: AOR = adjusted odds ratio.
(c) Note: Anywhere is perceived cigarette or alcohol use by friends, not friends’ self-reports.
* p < 0.05. ** p < 0.01.

Table 4 reports results of network exposure effects which showed that friends’ smoking/drinking was associated with individual smoking/drinking for all three friendship networks. The magnitude (and SE) of the ego-centric friend network associations (adjusted odds ratios of 37.1 and 47.6 for smoking and drinking, respectively) was considerably higher than the grade or class networks. As expected, the grade-level adjusted odds ratios were higher than the classroom ones (2.34 vs. 2.04 for smoking; 3.22 vs. 2.27 for drinking). The confidence intervals for the grade and classroom estimates overlap one another, however, and so the magnitude of the difference does not achieve statistical significance. The magnitude of the friend ego-centric associations warrants further investigation by comparing the ego-centric and sociometric

data. On average, the students named 5.42 close friends anywhere for a total of 9252 named friends. Of these, some 588 had no last name provided and so could not be matched to the existing school rosters (if they were on them). Of the remaining 8664, 4648 (about half) were matched to the school rosters, leaving 4016 who were friends in another grade, another school, or elsewhere (or whose names were perhaps misspelled). Of the matched nominations, 3373 (72.6%) were study participants and provided information on their smoking status.

Participants stated that 447 of their friends were smokers while 328 of them self-reported as smoking (73.4% agreement). Conversely, participants stated that 2926 of their friends were non-smokers while 2319 self-reported as non-smokers (79.3% agreement). (Smokers were slightly less likely to report smoking peers when those peers self-reported as smokers, with 70.5% agreement, perhaps out of concern for their privacy.) This indicates that participants were fairly accurate regarding their perceptions of peer smoking.

Accuracy reports were similar for drinking: Students accurately guessed drinking friends 88% of the time, yet only accurately estimated non-drinking friends 54.6% of the time. This disparity in estimates was exacerbated by the drinking status of the respondent such that drinkers only accurately estimated non-drinking peers 43.4% of the time. This finding is consistent with previous research that has shown that students overestimate the prevalence of substance use among their peers; substance users correctly identify users, but often think the non-users are users as well (Otten et al., 2009).

When exposure was calculated via self-reports by the friends named anywhere (those identified and linked), the adjusted odds ratios were 2.31 for smoking (p < 0.01) and 2.13 for drinking (p < 0.01), magnitudes similar to the grade and class-level friendship associations.
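The exposure variable used in these models is described in the analytic approach as the average behavior of a respondent's network partners, coded as missing when no one was named. A minimal sketch of that calculation follows, assuming a dictionary of nominations and a dictionary of 0/1 self-reports. The function name `network_exposure`, the student IDs, and the choice to drop alters without a self-report are assumptions made for illustration, not the authors' code.

```python
from statistics import mean

def network_exposure(nominations, behavior):
    """Return {ego: mean behavior of named alters, or None if none is usable}."""
    exposure = {}
    for ego, alters in nominations.items():
        # Assumption: alters without a self-report are dropped from the average.
        known = [behavior[a] for a in alters if behavior.get(a) is not None]
        # Missing (None) when ego named no one or no named alter has data.
        exposure[ego] = mean(known) if known else None
    return exposure

# Hypothetical IDs and responses (not study data); "s9" is an unmatched alter.
nominations = {"s1": ["s2", "s3"], "s2": ["s1"], "s3": [], "s4": ["s2", "s9"]}
ever_smoked = {"s1": 0, "s2": 1, "s3": 0, "s4": 1}

exposure = network_exposure(nominations, ever_smoked)
print(exposure)
```

The resulting exposure scores would then enter the regression as the key independent variable; the regression itself is not sketched here.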
No other network relationship was consistently associated with behaviors across outcomes or boundary conditions. Naming popular smokers and drinkers was associated with smoking and drinking at the grade level, but not at the classroom level. Naming grade-level admire and succeed drinkers was associated with drinking, but not smoking; and not for the classroom networks. In sensitivity analyses, the results in Table 4 were quite similar when missing values on the outcomes were coded as zero, one, or a proportion equal to the sample proportion. As a supplementary analysis, we also estimated a dyadic-level regression in which the unit of analysis is the ego-alter pair, with nearly identical results (results are available upon request).

Exponential random graph models (ERGMs) were calculated on the 402 networks generated from the survey data. We omitted networks with fewer than 10 students and those with fewer than 20 links, which left 232 networks. (All of these smaller networks were classroom ones and many were the romantic ones.) The ERGMs provide a homophily estimate for smoking and drinking with controls for attributes, being linked via those attributes (homophily), and structural parameters (density, reciprocity, and triangles). Table 5 reports the parameter estimates for the structural terms and the behavioral homophily terms as well as the percentage of converged models in which homophily on smoking and drinking behavior terms were statistically significant (at alpha