NDSSL Technical Report 10-139 October 13th, 2010
Title:
Synthesis and Embedding of Subnetworks for Individual-based Epidemic Models
Authors:
Huadong Xia Jiangzhuo Chen Madhav V. Marathe Henning S. Mortveit
Contact:
Jiangzhuo Chen Email:
[email protected] Tel.: +1 703 518 8032 Fax: +1 703 518 5376
Submitted to:
2011 International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction (SBP11)
Acknowledgements:
We thank our external collaborators and members of the Network Dynamics and Simulation Science Laboratory (NDSSL) for their suggestions and comments. This work has been partially supported by NSF Nets Grant CNS-0626964, NSF HSD Grant SES-0729441, NIH MIDAS project 2U01GM070694-7, NSF PetaApps Grant OCI-0904844, DTRA R&D Grant HDTRA1-0901-0017, DTRA CNIMS Grant HDTRA1-07-C0113, NSF NETS CNS-0831633, DHS 4112-31805, DOE DE-SC0003957, NSF REU Supplement CNS-0845700, US Naval Surface Warfare Center N00178-09-D-3017 DEL ORDER 13, NSF Netse CNS-1011769 and NSF SDCI OCI-1032677.
Network Dynamics and Simulation Science Laboratory Virginia Bioinformatics Institute Virginia Polytechnic Institute and State University
Abstract
Individual-based epidemic simulations on social networks describe real-world disease dynamics better when the network more faithfully represents real person-to-person contacts. In this work we develop a methodology to synthesize and embed subnetworks of an existing contact network when more detailed activity data on the corresponding subpopulation become available. Our methodology involves multi-level refinement, and we propose a method to study which levels of detail have a significant impact on epidemic dynamics in simulations. We also develop a method to characterize our subnetwork refinement model by sensitivity analysis. We apply our methodologies to the social contact network of the New River Valley (NRV) region in Virginia. By refining the subnetworks of three high schools in the NRV region with the real class schedules of every student, we show that details of a small fraction of a contact network significantly affect the global diffusion dynamics. Using a factorial experiment design on in-class connection patterns, contact network density, and embedding constraints, we generate a multitude of refined NRV contact networks and identify the factors in our model that influence the disease dynamics in the resulting network.
1 Introduction
Agent-based epidemic simulation models have recently been developed to provide decision support for public health policy makers [6, 7, 2, 3]. In contrast to aggregate differential-equation-based models [8, 10], these models simulate the diffusion of an infectious disease over a synthetic social contact network, which is constructed from individual-level synthetic demographic and activity data. In a contact network, each node represents an individual and each edge represents a contact between two nodes due to a simultaneous visit of two persons to a location. The usefulness of such simulation models relies on the quality of the underlying contact network. There are two approaches to constructing social networks. Some are built directly from real survey data. For example, Zachary’s Karate Club Study [12] used survey data to reconstruct the friendship network among 34 members of a karate club. In [9, 11] electronic devices were used to record people’s physical proximity in order to construct real social contact networks. These methods are limited to small populations, however, because they are extremely time-consuming and costly. For contact networks over a large-scale population, we normally derive a few activity patterns from survey data and use them to assign activities to individuals based on their synthetic demographic information. Then we can compute people-people contacts from the time-labeled people-location bipartite graph [1]. To simulate the disease dynamics realistically, it is critical that the underlying social contact network closely resembles reality. To this end, we would like to assign people realistic activities. Unfortunately this is often difficult due to limited data availability. For the US synthetic population, we have coarse activity templates generated from survey data, and we choose a daily activity sequence from the templates according to the individual’s demographic properties [1].
For regions outside US, we often don’t even have survey data [4]. It is desirable to be able to refine an existing contact network based on coarse-grained activity data once more detailed data become available. In this work, we propose a methodology to improve a contact network by improving subnetworks of it with finer-grained structures. We will apply our methodology to the social contact network of the New River Valley (Virginia) region and show that the network is improved in terms of its capability to correctly simulate epidemic dynamics on it. Although more detailed data is never harmful, there is a point beyond which improvement on the contact network is no longer significant. We
have also developed a model to study how much detail is necessary to obtain a better epidemic simulation result. Our subnetwork synthesizing and embedding methodology is fully characterized by a thorough sensitivity analysis.

Our contributions. In this paper, we present a methodology to synthesize and embed subnetworks in an existing social contact network using more detailed data on a subpopulation. We apply our methodology to construct in-school networks for three high schools in the NRV region and embed them into the existing NRV contact network. With epidemic simulations, we show that the refinement is meaningful and point out that the refinement is insensitive to in-class (sublocation) mixing models but is sensitive to in-school and between-school mixing models. We emphasize that we have applied our network methodology to high schools in the NRV region because exact class schedule data from student registration information recently became available for these schools, but the methodology can be applied to other types of campus locations, such as primary schools, colleges, or big companies. We highlight our major contributions as follows.
• First, we propose a methodology to refine an existing social contact network by replacing contacts between people visiting a few campus locations with subnetworks synthesized from more realistic activity data of these people in these locations, while keeping their other contacts. The methodology is modular with multiple levels of refinement, including the sublocation (class), location (school), and between-location levels. A user of the methodology can choose a model for each level. In our case study on the NRV network, we use real activity data, but our methodology is applicable with synthetic activity data too. This is helpful when more realistic (although not real) activity data become available.
• Second, we show with our NRV case study that refinement of even a small fraction of the network can have a significant impact. Our methodology does not guarantee, however, that any refinement of subnetworks will be meaningful. Sometimes an improvement to the contact network does not lead to significantly different epidemic dynamics, which is what we really care about. To this end, we present a methodology to study whether the refinement of the existing network is significant in terms of its influence on the epidemic dynamics, and which levels of refinement matter more than others. This methodology is based on simulations.
• Finally, we present a methodology to analyze the sensitivity and to characterize the uncertainty in our subnetwork refinement methodology. We stress that sensitivity of network dynamics is different from the usual numerical sensitivity, where the model can be expressed as y = f(X), with y a numerical response variable and X the vector of independent variables. In our network dynamics sensitivity analysis, we have Y = φ(G), G = f(X), where G is the contact network, φ is the epidemic simulation, and Y is a vector characterizing the epidemic dynamics. The network construction and the simulation are separate, i.e., our response variables respond only indirectly to the independent variables. Because our sensitivity analysis methodology often involves a factorial design of epidemic simulations consisting of thousands of runs on a large-scale contact network, it is important to have a fast simulation tool. In this work we have used EpiFast, an agent-based high performance epidemic simulation system [3].

Organization of the paper. In Section 2 we present our methodology of constructing subnetworks and embedding them into an existing social contact network, as well as the data set we used for
the NRV network refinement. In Section 3.1 we answer the question of whether and which details matter. Sensitivity analysis on our network refinement methodology is described in Section 3.2. We conclude in Section 4.
2 Methods for Subnetwork Construction and Embedding
In this section, we first describe our general methodology to construct a social contact network. We apply the methodology to generate a contact network (the NRV network) for the New River Valley region in Virginia, based on synthetic population data, and contact networks for the three high schools in the NRV region, based on real population data. Next we describe how we embed the three high school networks into the NRV network to improve its quality.
2.1 Contact Network Construction
People in society perform their own activities every day, including work, study, shopping, and other activities. These activities move people between locations. We divide locations into sublocations; for example, a school is a location while its classrooms are sublocations. Contacts between people may occur if they stay at the same sublocation at the same time. Activities of people can be represented by a series of time-labeled people-location bipartite graphs G_PL [7], as shown in Figure 1. Each G_PL corresponds to a time interval during which no one moves. Next we project each bipartite graph G_PL into a people-people graph G_PP. Since contacts only happen between people who are co-located, we group people in the same sublocation together. These are the people who are within distance two of each other in G_PL, as illustrated in the “(sub)location” column in Figure 1. To construct the contact network within a sublocation, we apply one of several graph models, called sublocation models, including the complete graph, the G(n, p) graph, and the random geometric graph. For example, if we use the G(n, p) model, then any two people in the same sublocation at the same time are connected with probability p. We are interested in whether there is a significant difference between different sublocation models and between different parameter settings (e.g. p in G(n, p)) of the same model; this is studied in Section 3. We put the sublocation contact networks together to form the people-people contact network G_PP for the time interval. In G_PP, each edge has a weight equal to the duration of the corresponding time interval. Finally we aggregate the G_PP graphs for all time intervals to form the long-term G_PP (the contact network for the whole day). The construction process is illustrated in Figure 1, where the complete graph is used as the sublocation model. Note that multiple contacts between two persons, possibly in different time intervals, may be merged by adding their weights together (edge (1, 2) in Figure 1), but this is not required.

Using the above methodology, we first construct a synthetic social contact network for the NRV region, based on the synthetic population data, including demographics and activities, that our group generated from the US 2000 census and activity survey data. Details about the synthetic population construction for a US region can be found in [1]. We also construct contact networks for the three high schools in the NRV region. The whole synthetic NRV contact network also contains a subnetwork for each of the three schools. The differences between the subnetworks within the NRV network and the high school networks created separately include the following.
1. The activities used to construct the whole NRV network are synthesized and coarse, while the three high school networks are based on the real class schedules of all students (with their IDs anonymized). In the former case, most high school students have only two classes, one for the whole morning and one for the whole afternoon. In the latter case, the students have more classes and each class is shorter. Therefore, the students mix more fully in the latter case: between them there are more edges with smaller weights.
2. In the synthetic subnetworks, the sublocation model is the complete graph. That is, any two students taking the same class are connected. In the “real” school networks, we use various sublocation models, including the G(n, p) random graph and the random geometric graph.

The real class schedules of the students in the three NRV high schools are collected from the 2009 school registration data. We list some statistics of the real NRV high school class schedules in Table 1. Note that the whole NRV population size is about 150,000 while the three high schools have about 2,500 students, only 1.7% of the whole population.

Figure 1: People-location bipartite graph and contact network construction. In G_PL, a circle represents a person and a square represents a location, whereas an edge between a person and a location means that the person visits the location during the labeled time interval. In G_PP, edge (u, v) with weight w(u, v) represents a contact of duration w(u, v) between u and v.

Table 1: Statistics of real class schedules (data from registration information).

                                        school 1          school 2          school 3
no. of students                         1116              958               425
no. of students in grade 9/10/11/12     293/266/298/259   311/231/220/196   107/115/105/98
no. of classrooms                       78                66                45
no. of classes per day                  9                 5                 7
duration of classes (minutes)           45                90                50-56
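The construction in Section 2.1 can be illustrated with a minimal Python sketch. It is not the implementation used for the NRV network: the function name `build_contact_network`, the tuple layout of `visits`, and the toy schedule are assumptions made for illustration, and only the complete graph and G(n, p) sublocation models are covered (the random geometric graph would additionally need positions within a sublocation).

```python
import itertools
import random
from collections import defaultdict


def build_contact_network(visits, model="complete", p=1.0, seed=0):
    """Project time-labeled visits into a weighted people-people network.

    visits: iterable of (person, sublocation, interval, duration) tuples,
            where `interval` indexes a time interval in which no one moves
            and `duration` is the length of that interval.
    Returns {(u, v): total contact duration} with u < v.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)  # (interval, sublocation) -> co-located people
    duration_of = {}
    for person, subloc, interval, duration in visits:
        groups[(interval, subloc)].append(person)
        duration_of[interval] = duration

    weights = defaultdict(float)
    for interval, subloc in sorted(groups):
        people = sorted(groups[(interval, subloc)])
        for u, v in itertools.combinations(people, 2):
            # Complete graph keeps every pair; G(n, p) keeps a pair with probability p.
            if model == "complete" or rng.random() < p:
                # Contacts of the same pair in different intervals are merged.
                weights[(u, v)] += duration_of[interval]
    return dict(weights)


# Toy schedule (invented): two 45-minute intervals, sublocations "L1" and "L2".
visits = [
    (1, "L1", 0, 45), (2, "L1", 0, 45), (3, "L2", 0, 45),
    (1, "L1", 1, 45), (2, "L1", 1, 45), (3, "L1", 1, 45),
]
network = build_contact_network(visits)
```

In this toy schedule persons 1 and 2 share a sublocation in both intervals, so their two contacts merge into one edge whose weight is the sum of the two interval durations.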
2.2 Subnetwork Embedding
We would like to refine the synthetic NRV contact network with the three more realistic high school contact networks. To this end we need to embed the three high school networks into the whole NRV network. The embedding involves three steps, as illustrated in Figure 2.
(a) Identifying and mapping. First, identify the synthetic high school subpopulation within the global synthetic population. Unfortunately the synthetic high school subpopulation is smaller than the real high school subpopulation (see footnote 1), so we have to pick extra people from the NRV synthetic population to fill the gap. This process is illustrated in Figure 2(a).
(b) Removing school edges. Delete from the whole NRV network all edges among the identified synthetic subpopulation that are due to in-school activities (see footnote 2). Edges among them that are due to non-school activities are kept.
(c) Embedding. Embed the edges of the real high school networks into the global contact network, based on the vertex mapping from the first step.

Figure 2: Procedures for subnetwork embedding. The real subpopulation is larger than the synthetic subpopulation; we pick the adjacent red node to fill the gap.
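The three embedding steps can be sketched as follows. This is a simplified illustration under several assumptions, not the procedure as implemented for the NRV network: the function `embed_subnetwork`, the edge key layout `(u, v, activity_type)`, and the label `"school"` are hypothetical, and the extra synthetic people are simply supplied by the caller rather than chosen by adjacency as in Figure 2.

```python
def embed_subnetwork(global_edges, synthetic_students, extra_candidates,
                     real_students, real_edges):
    """Embed a real school network into a global contact network.

    global_edges: {(u, v, activity_type): weight} over synthetic person ids.
    real_edges:   {(a, b): weight} over real (anonymized) student ids.
    Returns (refined_edges, mapping from real to synthetic ids).
    """
    # (a) Identify and map: pad the synthetic subpopulation if it is too small.
    shortfall = max(0, len(real_students) - len(synthetic_students))
    if shortfall > len(extra_candidates):
        raise ValueError("not enough extra synthetic people to fill the gap")
    targets = list(synthetic_students) + list(extra_candidates)[:shortfall]
    mapping = dict(zip(real_students, targets))
    mapped = set(mapping.values())

    # (b) Remove school-type edges among the mapped synthetic people,
    #     keeping their edges from non-school activities.
    refined = {
        key: w for key, w in global_edges.items()
        if not (key[2] == "school" and key[0] in mapped and key[1] in mapped)
    }

    # (c) Embed the real in-school edges through the vertex mapping.
    for (a, b), w in real_edges.items():
        u, v = sorted((mapping[a], mapping[b]))
        refined[(u, v, "school")] = refined.get((u, v, "school"), 0) + w
    return refined, mapping


# Toy data (invented): two synthetic students, three real students.
global_edges = {
    ("s1", "s2", "school"): 240,
    ("s1", "s2", "home"): 600,
    ("s1", "x", "shop"): 30,
}
real_edges = {("r1", "r2"): 45, ("r2", "r3"): 45}
refined, mapping = embed_subnetwork(
    global_edges, ["s1", "s2"], ["x"], ["r1", "r2", "r3"], real_edges
)
```

Keeping the activity type in the edge key is one simple way to delete only in-school contacts while leaving home or shopping contacts between the same two people untouched.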
3 Epidemic Dynamics in the Refined Contact Network
In Section 2 we described our methodology to improve a synthetic contact network with better, more realistic, and more detailed data on a subpopulation, and applied it to construct three high school contact networks and embed them into the NRV synthetic network. We are interested in two questions: (1) Does the refinement make much difference, both in terms of static network structure and in terms of network dynamics? (2) How robust is our methodology? To answer them, we run structural analysis and epidemic simulations on the original NRV network as well as on a series of refined NRV networks generated by our methodology with different parameter settings. In Section 3.1, we show that although the refinement changes only a very small part of the whole network and does not seem to have a noticeable impact on the overall network structure, the epidemic dynamics of the whole population do change significantly. In Section 3.2, we conduct a series of sensitivity tests to quantify uncertainties in our methodology.

Footnote 1: The inconsistency arises mainly because the number of NRV high school students increased by about 10% from 2000, when the census data were obtained, to 2009, when the class schedule data were collected.
Footnote 2: Each edge in the contact network also has an “activity type” label. It is used to identify school-type edges between people in the identified high school subpopulation.

Naming convention for networks. In the figures in this section, we label the networks with the following naming convention. The name contains several fields separated by ‘-’: the first field gives the range of the network (NRV or School i); the second field is either “syn” or “real”, denoting the synthetic network (or its subnetwork) or the network refined by or constructed from real class schedule data; the third field denotes the sublocation model; and the fourth field denotes the parameter value in the sublocation model.

Simulation settings. We run EpiFast, a fast epidemic simulation tool [3], to simulate a strong flu-like infectious disease on the NRV synthetic contact network and various refined networks. The disease has an attack rate (total infected fraction of the population) of about 30%. We simulate the epidemic for 300 days, and for each network the simulation has 30 replicates. We report the average and plot the average epidemic curve (number of new exposures over time, aka epicurve), which represents the average number of new exposures on each day. In Figure 6 we plot the standard error around the average epicurves to give readers a sense of the variability of the simulation results, but in other figures we omit the standard error so that the curves are not too cluttered.
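As an illustration of how replicate outputs might be reduced to the quantities reported here, the sketch below averages per-day new exposures over replicates, computes the standard error, and extracts the attack rate, peak, and peak day used as the epicurve summary in Section 3.2. The function names and the toy numbers are hypothetical; the sketch assumes each person is exposed at most once and counts days from 0, and it does not reflect EpiFast's actual output format.

```python
import math


def average_epicurve(replicates):
    """Per-day mean and standard error of new exposures across replicates.

    replicates: list of equal-length lists, one list of daily counts per run.
    """
    n = len(replicates)
    means, errors = [], []
    for day in range(len(replicates[0])):
        values = [rep[day] for rep in replicates]
        mean = sum(values) / n
        # Sample variance (n - 1 in the denominator); zero for a single run.
        var = sum((x - mean) ** 2 for x in values) / (n - 1) if n > 1 else 0.0
        means.append(mean)
        errors.append(math.sqrt(var / n))
    return means, errors


def summarize(mean_curve, population_size):
    """Return (attack rate, peak, peak day) of an epicurve; days start at 0."""
    peak = max(mean_curve)
    return sum(mean_curve) / population_size, peak, mean_curve.index(peak)


# Toy example (invented): two replicates, three days, population of 30.
means, errors = average_epicurve([[0, 2, 4], [0, 4, 8]])
attack, peak, peak_day = summarize(means, population_size=30)
```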
3.1 Do Details Matter?
In this experiment we compare two contact networks: the original synthetic NRV network and the refined NRV network with real class schedule data and the complete graph sublocation model. Since the synthetic network also uses the complete graph sublocation model, the only source of difference between the two networks is the high school students’ class schedules.

Network structure. We first compare the in-school subnetwork of the synthetic network with the “real” in-school network for each of the three high schools. In Figure 3 we see two completely different degree distributions. The “real” school-1 network has a much larger average degree than the synthetic school-1 network, due to more frequent class changes and hence more mixing. In Figure 4, we show the distributions of edge weight (contact duration) in the two in-school networks. The “real” school-1 network has shorter contact durations than the synthetic school-1 network, due to shorter classes. It is worth pointing out that although the degree distributions and contact duration distributions are very different, the average weighted degree (sum of the weights of all edges incident on a node) is very similar in the two school-1 networks. We have similar observations for schools 2 and 3. Next we compare the network structure of the whole synthetic NRV network and the refined NRV network. In Figure 5, we find that the degree distributions are almost identical, although those distributions are very different in the in-school networks. This is because the involved high school subpopulation is only a small fraction (1.7%) of the whole NRV population.

Epidemic dynamics. Now we run epidemic simulations on these two NRV networks and compare their epidemic dynamics. From Figure 6, we find that they have significantly different epicurves. The synthetic NRV network has an earlier outbreak and a higher peak. This implies that the details of a small part of a contact network can have a large impact on the dynamics of the whole network. It means that our refinement of the synthetic contact network does make a difference.
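The observation that two networks can differ sharply in degree yet agree in average weighted degree can be made concrete with a small sketch. The helper `degree_stats` and both toy networks are invented for illustration and are not taken from the school data; nodes without any edge are not counted.

```python
from collections import defaultdict


def degree_stats(edges):
    """edges: {(u, v): weight}. Returns (average degree, average weighted degree)."""
    degree = defaultdict(int)
    strength = defaultdict(float)
    for (u, v), w in edges.items():
        for node in (u, v):
            degree[node] += 1
            strength[node] += w
    n = len(degree)
    return sum(degree.values()) / n, sum(strength.values()) / n


# Coarse schedule (invented): each student meets one classmate for 360 minutes.
coarse = {(1, 2): 360, (3, 4): 360}
# Fine schedule (invented): each student meets all three others for 120 minutes each.
fine = {(1, 2): 120, (1, 3): 120, (1, 4): 120, (2, 3): 120, (2, 4): 120, (3, 4): 120}

coarse_stats = degree_stats(coarse)
fine_stats = degree_stats(fine)
```

Here the fine schedule triples the average degree while leaving the average weighted degree unchanged, which mirrors the qualitative pattern described for the school-1 networks.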
Figure 3: Synthetic and “real” school-1 networks have different degree distributions (average degree around 110 vs. 27). [Plot: Degree Distribution of Network for School 1; x-axis: Degree; y-axis: Fraction of Nodes; series: School1-Real-complete, School1-Syn.]

Figure 4: Synthetic and “real” school-1 networks have different contact duration distributions. [Plot: Contact Distribution of Network of School 1; x-axis: Contact Duration (seconds); y-axis: Fraction of Contacts; series: School1-Real-complete, School1-Syn.]

Figure 5: Global networks (synthetic and refined) have similar degree distributions. [Plot: Degree Distribution of NRV Networks; x-axis: Degree; y-axis: Fraction of Nodes; series: NrvNet-Syn, NrvNet-Real-complete.]

Figure 6: Global networks (synthetic and refined) have different epidemic dynamics due to different class schedule details. [Plot: Epidemic Curve of NRV Networks: class schedule comparison; x-axis: Days; y-axis: Size of New Exposed; series: NRVNet-Real-complete, NRVNet-Syn.]
3.2 Sensitivity Test
In this section, we conduct a series of sensitivity tests for two purposes. First, our methods affect the network structure, and it is interesting to see how sensitive the network structure is to our methods. Second, the sensitivity tests help to quantify the uncertainties residing in our methods. For example, it is very hard to collect seating data and the detailed interactions that occur between individuals within a classroom. It is unlikely that such uncertainties can be eliminated, but a way to quantify them is helpful.

Note that network structure sensitivity is different from the usual numerical sensitivity studied in large scientific experiments, so we develop a new methodology for studying it. The difficulty is that network structure is hard to quantify, and measuring the networks directly, for example through the degree distribution, is not a good idea because local changes are hard to perceive in that way, as illustrated in Figure 5. Here we quantify network structure by its epidemic outcome, since the network is the fabric on which disease spreads. Specifically, we represent a network by the triple [AttackRate, Peak, PeakDay] that characterizes its epicurve. We study the following questions.

1. How sensitive are the results to the way networks are constructed within a class? We apply the G(n, p) and random geometric graph sublocation models to construct such local networks. The sublocation model decides the topology of the network, and the connectivity parameter used in the model (p in G(n, p) and the connection radius in the random geometric graph) decides the density of in-class contacts. So the question is formalized as: how sensitive is the triple [AttackRate, Peak, PeakDay] to the choice of sublocation model and the connectivity parameter value?
2. How sensitive are the results to the choice of the 1-1 map? To sample 1-1 maps, we initially pick a 1-1 mapping function in which each real student is mapped to a synthetic student in the same school. We then repeatedly apply the switch Markov chain [5] to randomize the 1-1 map. Since there are many possible 1-1 maps, we use several kinds of switching. First, we switch the 1-1 map inside each class, which is not expected to affect the network structure much. Second, we run the switch Markov chain on each group of students within the same school. Third, we put together the mapping relations from two arbitrary schools and switch them to mix between those two schools, while the remaining school switches only within itself; in this way we shuffle the students of two schools together. Fourth, we go further and switch 1-1 maps across all three schools. Following this hierarchy we essentially cover the whole sample space of 1-1 maps, and the question becomes: how sensitive is the triple [AttackRate, Peak, PeakDay] to the choice among the four types of switching operations?

To study the two questions, we conduct a systematic analysis over all three factors together using three-factor ANOVA. We denote the three factors [sublocation model, connectivity parameter, switching operation] by [L, C, S]. Factor L has two levels; factor S has four levels, as discussed above; and we choose three levels for factor C, for example 0.2, 0.5 and 0.8 for G(n, p) (see footnote 3). The combinations of factor levels are called cells in multi-factor ANOVA, so there are 2 × 3 × 4 = 24 cells in our example. Each cell corresponds to a value of the tuple [L, C, S], and for each tuple we construct 20 random network samples. The triple [AttackRate, Peak, PeakDay] for each network is calculated and used as the response variable for the three-factor ANOVA.

Footnote 3: The three values of the connectivity parameter in the random geometric graph are selected so that the constructed school networks have a density similar to that of the school networks constructed with G(n, p) at the same level.

Table 2 shows the joint analysis results based on multi-factor ANOVA. It shows that the choice of local sublocation model is not a significant factor for our methodology, which is good since it means that the uncertainty caused by the connectivity pattern inside a class is negligible. In contrast, the connectivity parameter and the switching operation are significant, as is their interaction C:S. To measure precisely how sensitive the results are to each level of these two factors, we further conduct multiple comparisons for each level pair through Tukey tests. From Table 3 we find that the results are not sensitive if we change factor S between level 1 and level 2. It means that switching the 1-1 map inside a school does not affect the network structure much. However, factor C is not robust in this way when we change its value between the three levels, as shown in Table 4. Factor C decides the network density, which is a critical and sensitive factor when we study propagation processes over networks.

Table 2: Sensitivity joint analysis with multi-factor ANOVA. * shows that the F-test is significant at the 99% confidence level.

factor      DF    attack rate                 peak                      peak day
                  SS          F value         SS        F value         SS         F value
L           1     0.0000004   1.0389          24        0.3797          0.1        0.0327
C           2     0.0088561   10628.4552*     498078    3976.3839*      13482.1    4236.0993*
S           3     0.0019159   1532.8931*      73668     392.0810*       2591.7     542.8744*
L:C         2     0.0000013   1.5858          87        0.6981          2.1        0.6690
L:S         3     0.0000019   1.5305          500       2.6622          6.1        1.2721
C:S         6     0.0011373   454.9766*       150700    401.0359*       172.0      18.0094*
L:C:S       6     0.0000041   1.6268          329       0.8745          12.8       1.3428
Residuals   456   0.0001900                   28559                     725.7

Table 3: Multiple comparisons between levels of factor S (switching operation) using the Tukey test. * shows that the two levels produce similar results at the 99% confidence level.

level pair   attack rate                  peak                        peak day
             diff          adj p-value    diff         adj p-value    diff          adj p-value
2-1          0.000022700   0.9929273*     0.3663889    0.9841899*     0.03333333    0.9969634*
3-1          0.003262525   0.0000000      16.5158333   0.0000000      -3.61666667   0.0000000
4-1          0.004543525   0.0000000      29.7380555   0.0000000      -5.32500000   0.0000000
3-2          0.003239825   0.0000000      16.1494444   0.0000000      -3.65000000   0.0000000
4-2          0.004520825   0.0000000      29.3716666   0.0000000      -5.35833333   0.0000000
4-3          0.001281000   0.0000000      13.2222222   0.0000000      -1.70833333   0.0000000

Table 4: Multiple comparisons between levels of factor C (connectivity parameter) using the Tukey test. All level pair comparisons are significant.

level pair   attack rate                  peak                      peak day
             diff          adj p-value    diff       adj p-value    diff        adj p-value
0.5-0.2      0.007057219   0              58.48542   0              -7.09375    0
0.8-0.2      0.010286744   0              75.11292   0              -12.96250   0
0.8-0.5      0.003229525   0              16.62750   0              -5.86875    0
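The switching operations on the 1-1 map described in Section 3.2 can be sketched as below. This is only one plausible reading: the sketch assumes a switch step exchanges the synthetic images of two real students drawn from the same allowed group, and that the four levels of factor S differ only in how the groups are defined (classes, schools, a union of two schools plus the third, or all schools together). The function name `switch_mapping` and the uniform choice of group are assumptions, not details given in the text or in [5].

```python
import random


def switch_mapping(mapping, groups, steps, seed=0):
    """Randomize a 1-1 map by swapping the images of two students in one group.

    mapping: {real_student: synthetic_person}.
    groups:  list of lists of real students that are allowed to mix.
    """
    rng = random.Random(seed)
    result = dict(mapping)
    eligible = [g for g in groups if len(g) >= 2]
    if not eligible:
        return result
    for _ in range(steps):
        group = rng.choice(eligible)   # simplification: groups chosen uniformly
        a, b = rng.sample(group, 2)    # two distinct students from that group
        result[a], result[b] = result[b], result[a]
    return result


# Toy map (invented): two groups of two students each.
mapping = {"r1": "s1", "r2": "s2", "r3": "s3", "r4": "s4"}
groups = [["r1", "r2"], ["r3", "r4"]]
shuffled = switch_mapping(mapping, groups, steps=100)
```

Because every step is a swap, the result is always still a 1-1 map, and students never leave the set of synthetic people assigned to their own group.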
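The bookkeeping behind the factorial design and the F values in Table 2 can be checked with a short sketch. The level labels are placeholders (factor C is indexed by density level rather than by parameter value), and the F value is recomputed from the rounded sums of squares printed in Table 2, so it matches the tabulated value only approximately.

```python
import itertools

# Factor levels as described in Section 3.2 (labels are illustrative).
levels = {
    "L": ["gnp", "geometric"],  # sublocation model
    "C": [1, 2, 3],             # density level of the connectivity parameter
    "S": [1, 2, 3, 4],          # type of switching operation
}
cells = list(itertools.product(*levels.values()))
samples_per_cell = 20
n_networks = len(cells) * samples_per_cell
# With all main effects and interactions in the model,
# residual degrees of freedom = observations - cells.
residual_df = n_networks - len(cells)


def f_value(ss_effect, df_effect, ss_residual, df_residual):
    """F statistic: mean square of the effect over the residual mean square."""
    return (ss_effect / df_effect) / (ss_residual / df_residual)


# Factor C, response "peak", using the sums of squares from Table 2.
peak_f_for_c = f_value(498078, 2, 28559, residual_df)
```

The design yields 24 cells and 480 networks, which is consistent with the 456 residual degrees of freedom reported in Table 2.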
4 Conclusion and Future Work
We propose a methodology to refine a synthetic social contact network by replacing a subnetwork with a more detailed, more realistic copy. By applying it to embed high school subnetworks in the NRV contact network, we show that an improvement to a local contact structure that is a very small part of the whole network may strongly affect the global dynamics. We also conduct a series of sensitivity tests to evaluate the robustness of our methodology from different aspects. In this paper, we replace the high school subnetworks with a more detailed structure. The simulation results suggest that high schools may be a critical community in the propagation of an infectious disease. It would be interesting to explore whether there are other communities that also strongly affect the global network dynamics when their underlying network structure changes, and what characteristics such communities share that give them this critical role in society.
References
[1] C. Barrett, R. Beckman, M. Khan, V. A. Kumar, M. Marathe, P. Stretz, T. Dutta, and B. Lewis. Generation and analysis of large synthetic social contact networks. In Proceedings of the 2009 Winter Simulation Conference, pages 1003–1014, 2009.
[2] C. L. Barrett, K. R. Bisset, S. Eubank, X. Feng, and M. V. Marathe. EpiSimdemics: an efficient algorithm for simulating the spread of infectious disease over large realistic social networks. In Proceedings of the ACM/IEEE Conference on High Performance Computing (SC), page 37, 2008.
[3] K. R. Bisset, J. Chen, X. Feng, V. A. Kumar, and M. V. Marathe. EpiFast: a fast algorithm for large scale realistic epidemic simulations on distributed memory systems. In Proceedings of the 23rd International Conference on Supercomputing (ICS), pages 430–439, 2009.
[4] J. Chen, F. Huang, M. Khan, M. Marathe, P. Stretz, and H. Xia. The effect of demographic and spatial variability on epidemics: A comparison between Beijing, Delhi, and Los Angeles. In Proceedings of the 5th International CRIS Conference on Critical Infrastructures, 2010.
[5] C. Cooper, M. Dyer, and A. J. Handley. The flip Markov chain and a randomising P2P protocol. In Proceedings of the 28th ACM Symposium on Principles of Distributed Computing, PODC ’09, pages 141–150, New York, NY, USA, 2009. ACM.
[6] S. Eubank. Scalable, efficient epidemiological simulation. In SAC ’02: Proceedings of the 2002 ACM Symposium on Applied Computing, pages 139–145, New York, NY, USA, 2002. ACM.
[7] S. Eubank, H. Guclu, A. V. S. Kumar, M. V. Marathe, A. Srinivasan, Z. Toroczkai, and N. Wang. Modelling disease outbreaks in realistic urban social networks. Nature, 429(6988):180–184, May 2004.
[8] H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42:599–653, 2000.
[9] C. Hlady, D. Curtis, M. Severson, J. Fries, S. Pemmaraju, A. Segre, T. Herman, and P. Polgreen. A near-real-time method for discovering healthcare worker social networks via wireless devices. 47th Annual Meeting of the Infectious Disease Society of America, Oct. 2009.
[10] L. A. Meyers. Contact network epidemiology: Bond percolation applied to infectious disease prediction and control. Bulletin of the American Mathematical Society, 44:63–86, 2007.
[11] I. Rhee, M. Shin, S. Hong, K. Lee, S. Kim, and S. Chong. CRAWDAD data set ncsu/mobilitymodels (v. 2009-07-23), July 2009.
[12] W. W. Zachary. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4):452–473, 1977.