Environment and Planning B: Planning and Design 2004, volume 31, pages 381 ^ 395
DOI:10.1068/b2833
A conceptual framework for an individual-based spatially explicit epidemiological model Ling Bian
Department of Geography, State University of New York at Buffalo, Amherst, NY 14261-0055, USA, e-mail:
[email protected] Received 25 June 2001; in revised form 7 August 2003
Abstract. This paper presents a conceptual framework to formalize an individual-based and spatially explicit model of the epidemiology of infectious diseases. The framework differs from that of the traditional population-based epidemiological models in terms of assumptions, conceptual models, and model structures. In particular, the author discusses four aspects of the model: (1) population segments or unique individuals as the modeling unit, (2) continuous process or discrete events to represent the disease development through time, (3) traveling wave or network dispersion to represent the disease transmission in space, and (4) within-group and between-group interactions to represent local and long-distance transmissions. Based on these conceptual discussions, a simple influenza epidemic is simulated in order to illustrate the application of the proposed framework.
Introduction The purpose of this paper is to formalize an individual-based and spatially explicit model for the epidemiology of infectious diseases. It is the common perception that infectious diseases are transmitted from individual to individual following the network of contact between them. Through this network, diseases spread through space and time. Computing models are needed to represent such spatially varying, temporally dynamic, and individual-based epidemiological phenomena. A successful computing model requires a formal conceptual model to capture the essence of the phenomenon ö the individuals, space, time, and the interplay of these essential elements öin order to guide the design and implementation of a computing model. Presently few such formal models exist. Traditional epidemiological models represent the dynamics of infectious diseases through a nonspatial and population-based approach, and time has always been represented explicitly. These models do not explicitly address causal factors for the development of epidemics. Instead, they directly estimate the size of the affected population. In a modeling process, a population is divided into susceptible, infected, and recovered segments. The development of an epidemic is represented as the change in the size of the infected population segment resulting from changes in the sizes of the other segments. These changes are assumed to be continuous and the rate of change at any given time is expressed by partial differential equations (Anderson and May, 1992). This basic population model and its derivatives provide a foundation for modern epidemiology and a tool to assess the dynamics of epidemics and population health. The simple model structure and associated small number of parameters allow for an easy modeling process. By adjusting the parameters, the models can reasonably approximate the observed health data. For well over half of a century, these traditional models have remained the mainstay for epidemiology. In the past decade, however, these population-based and nonspatial models have drawn increasing criticism. Despite the strengths of these models in predicting the health outcome at the population level, they fail to produce realistic and useful results for complex systems (Holmes, 1997; Koopman and Lynch, 1999).
382
L Bian
Several recent approaches in epidemiology attempt to seek alternative means to model population health. One approach explicitly considers discrete individuals and the interactions between them. This is represented in discrete individual transmission models. Another approach considers the spatial adjacency between the infected and the susceptible in local transmissions (hereafter referred to as the spatial adjacency model). A third approach focuses on subpopulations and the transmission of disease between them (referred to here as subpopulation models). Each of these approaches is driven by assumptions that deviate from those of the traditional population models. The discrete individual transmission models are based on the belief that individuals are different from one another and this simple fact should be a basic assumption for epidemiological studies (Keeling, 1999; Koopman and Lynch, 1999). Developed mostly for sexually transmitted diseases, these models are built on the discrete individuals and the contact between them (Adams et al, 1998; Ghani et al, 1997; Koopman and Lynch, 1999; Kretzschmar and Morris, 1996; van der Ploeg et al, 1998; Welch et al, 1998). This development challenges the traditional population-based modeling approach by explicitly incorporating causal factors into the model. The spatial variation in disease transmission, however, is not always considered in the discrete individual transmission models. The spatial adjacency models are based on the notion that disease transmission is an intrinsically spatial process. Thus, using nonspatial approaches to model a spatial process may lose insight into the epidemiological phenomena being studied (Holmes, 1997). In the spatial adjacency models, an individual is represented as a cell in a grid and the disease is transmitted locally from an infected cell to adjacent susceptible cells. When incorporating the spatial adjacency into the framework of populationbased models, the outcomes differ significantly from those of nonspatial models. However, in the spatial adjacency models, individuals are assumed to be immobile and spatially adjacent. These assumptions limit the applicability of this type of model to disease transmission between immobile objects, such as plants. For highly mobile and spatially dispersed objects, such as humans, the spatial adjacency models are inappropriate. The subpopulation models attempt to increase heterogeneity in the population in order to produce more realistic results than the traditional epidemiological models. Those models that define subpopulation by age group are called realistic age-structured models (Ferguson et al, 1997; Grenfell and Harwood, 1997). Those that place subpopulations into grid cells are called spatially structured models (Ferguson et al, 1997; Grenfell and Harwood, 1997; Keeling, 2000; Lloyd, 1995; Rhodes and Anderson, 1997; Szymanski and Caraco, 1994; Torres-Sorando and Rodriguez, 1997). The disease passes between the subpopulations through `coupling' at the subpopulation level. However, because within a subpopulation the individuals are assumed to be immobile and mix homogeneously, these models are not always directly applicable to humans (Rhodes and Anderson, 1997). Despite their shortcomings, these recent approaches reflect the effort to improve epidemiological modeling to deal with complex systems and overcome the problems in the traditional population-based epidemiological models. These approaches provide incentives as well as challenges for further studies to represent better the epidemiology of infectious diseases. In this paper I propose a framework to formalize an individual-based and spatially explicit model in the context of the epidemiology of infectious diseases. The framework is intended for infectious diseases communicable between mobile humans in an urban environment. The focus of the paper is on conceptual issues that arise in establishing such a framework. These issues include the following:
An individual-based spatially explicit epidemiological model
383
(a) assumptions, (b) conceptual models, (c) model structures, (d) computation approaches. The basic assumptions help abstract the essential aspects of epidemiology. In this study, these include the individuals, their interactions, and the spatial and temporal variations of these interactions. Based on these assumptions, the conceptual models formalize the representation of perceived reality into several principles. These include the basic elements of the perceived reality and the relationships between them. With the conceptual models established, a model structure is needed to support the explicit representation of the identified elements and relationships. Within the model structure, the computation approaches are used to estimate model outcomes, such as population health. These steps are necessary to represent the essential aspects of reality and implement them in the form of a computing model. Four aspects of such an individual-based and spatially explicit model are discussed: (1) the basic modeling unit, (2) temporal dynamics, (3) spatial variation, and (4) interactions between individuals. For each aspect, the basic assumptions are discussed in comparison with those of the traditional models. Conceptual models are then devised to reflect the assumptions and to guide the subsequent choice of model structures and computation approaches. If the conceptual models are drawn from theories in related fields, their implications to epidemiological modeling are evaluated. Based on these conceptual discussions, a simple influenza epidemic is simulated in order to illustrate the application of the proposed individual-based spatial model. Modeling unit: population segments versus unique individuals The traditional epidemiological models are rooted in general population models (Kermack and McKendrick, 1927). These models have several basic assumptions with respect to individual characteristics in the development of diseases: (1) individuals are identical; (2) the interaction between individuals is global; (3) the spatial distribution of individuals is uniform; and (4) contact rates between individuals are equal. These overly simplified assumptions significantly limit the usefulness of populationbased models for several reasons. First, the assumption of identical individuals ignores differences in infection probability among susceptibles. Individuals of a certain age, such as children and the elderly, are more vulnerable to infection than other susceptibles. Second, against the assumption of global mixing, an individual is actually in contact with only a finite number of other individuals during a given time. Third, the assumption of uniform spatial distribution of individuals is not realistic. Individuals are distributed in clusters (for example, home and workplace) and each individual travels between clusters. Fourth, the assumption of equal contact rates between all individuals is overly simplistic. For example, individuals of certain employment status, such as retired individuals, tend to have fewer contacts than do employed individuals. The recognition of these limitations, especially those of identical individuals and global mixing, has spawned a host of alternative modeling approaches, such as those mentioned in the introduction. Individual-based epidemiological modeling requires that a conceptual model be based on the following assumptions: (1) individuals are different; (2) individuals interact with each other locally; (3) individuals are mobile; and (4) the environment for individuals is heterogeneous. These considerations are widely supported in several disciplines that have experienced a major shift away from population-based and towards individual-based approaches. These individual-based models may carry different names, such as the discrete individual transmission models in epidemiology (Adams et al, 1998; Ghani et al, 1997; Koopman and Lynch, 1999; Kretzschmar and
384
L Bian
Morris, 1996; van der Ploeg et al, 1998; Welch et al, 1998), individual-based models in ecology (Bian, 2000; DeAngelis and Gross, 1992; Judson, 1994; Tyler and Rose, 1994; Westvelt and Hopkins, 1999), microsimulation in regional science (Amrhein and MacKinnon, 1988; Benenson et al, 2002), and most often, agent-based simulation in many disciplines (Bousquet et al, 2001; Gilbert and Conte, 1995; Gilbert and Doran, 1994; Lake, 2000). This shift calls for the explicit consideration of unique individuals and how they collectively affect population dynamics. The individual-based assumptions require the conceptual model to consider discrete individuals as the modeling unit. Associated with this requirement is a set of principles that form the core of the conceptual model. These principles include the representation of unique individuals, their characteristics and behaviors, the relationships between them, relationships between them and the environment, and how these characteristics and interactions change through time and space. Presently, object-oriented modeling is widely used as a modeling structure to support individual-based models (Bian, 2000; Maley and Caswell, 1993; Silvert, 1993). It is important to note that object orientation is treated here as a means of representation, modeling, and abstraction to represent human perceptions of reality, rather than as a popular programming technique (Khoshafian and Abnous, 1990; Wand, 1989; Wegner, 1990). Conceptually, object orientation is based on the assumption that the world is made of discrete objects, either tangible or conceptual. This assumption entails the use of discrete individuals as the modeling unit. Based on this assumption, object orientation furnishes the following modeling principles. These include the concept that each individual has unique attributes and behaviors. Although the attributes describe the state of the individual, behaviors may change this state in an event, either as a result of the behavior of the individual or of other individuals. These individuals can be organized in a hierarchy or as aggregates depending on the intended representation. These object-oriented principles are consistent with the requirements of an individualbased epidemiological model, and provide an ideal structure to build a model around discrete individuals. With this model structure, the representation and organization of individuals are straightforward. For example, the characteristics of an individual, such as age and occupation, can be represented as attributes of the individual, and the interactions between individuals are represented as behaviors. An individualized representation does not replace the most basic theories in epidemiology, but can significantly affect the modeling outcome. The most important estimates for forecasting epidemics include the critical population size to initiate an epidemic, the persistence of an epidemic, the total number of individuals infected, and the dynamics of epidemics through time. The difference in prediction is evidenced by recent studies using individual-based models (with various spatial assumptions), such as discrete individual transmission models and spatial adjacency models (Holmes, 1997; Judson, 1994; Keeling, 1999; van der Ploeg et al, 1998). The subpopulation models, though not based on discrete individuals, also generated predictions that differ from those of the traditional population models (Cliff et al, 2000; Ferguson et al, 1997; Grenfell and Harwood, 1997). In the traditional models, the outcome is determined by a few parameters. The individual-based models, on the other hand, depend on the accumulation of connected yet individualized infections. Thus, the infection probability of each individual and the contact between the susceptible and infected individuals ultimately determine the outcome. Keeling (1999) presents an analytical approach to formulating statistical properties of an epidemic through correlated individuals.
An individual-based spatially explicit epidemiological model
385
Time: continuous process versus discrete events The dynamics of diseases are intrinsically temporal. The traditional epidemiological models treat an epidemic as a continuous shift between the sizes of the infected, susceptible, and recovered population segments, giving a fixed total population size. Because of the continuity assumption, partial differential equations are the default model structure. These traditional epidemiological models treat time explicitly and two associated assumptions are worth discussing. First, the continuity assumption is built on the change in the size of population segments and thus is inappropriate for modeling individuals, because the concept of size does not apply to individuals. Second, the health states of the population segments are assumed to be discrete. That is, the susceptible, infected, and recovered are distinct states. The latter assumption can be adopted for the proposed individual-based model. From an ontological perspective, the epidemiology literature has always described an infection history as a sequence of distinct periods, each of which begins and ends with a discrete event. The critical periods include the latent, the infectious, and the incubation periods. The critical events include the receipt of infection, the emission of infectious material, and the appearance of symptoms. The relationships between these periods and events are illustrated in figure 1. The latent period begins with the receipt of infection and lasts as long as the infection develops internally without the emission of infectious material. Once the infected individual communicates infectious material to other susceptibles, the latent period ends and the infectious period begins. The incubation period overlaps with the latent period and part of or the entire infectious period, such that it begins with the receipt of infection and ends with the appearance of symptoms (Baily, 1975). Events
Receipt of infection
Emission of infectious material
Latent period
Ending of emissions
Infectious period
Periods Incubation period
Events
Receipt of infection
Appearance of symptoms
Figure 1. An illustration of the relationships between periods and events. Arrows represent the critical events and rectangles represent the periods. Each period begins and ends with an event. Note that the appearance of symptoms may occur before, at, or after the end of emission of infectious material.
Several implications can be drawn from these epidemiological principles and serve as a conceptual model for an individual infection process. First, it is most appropriate to represent the individual infection as a series of discrete events and periods. Second, the discrete events have no duration in their own right and only trigger the change between infection periods. Third, the discrete periods indicate the infection status of an individual and are part of the individual's characteristics. This information, along with the contact structure between individuals, ultimately determines how diseases spread in a population (as discussed in later sections of this paper). Fourth, these events and periods can be expressed in both semantic time and absolute time. The semantic time (that is, latent period, infectious period, and so on) carries meaning and
386
L Bian
quality with regard to an individual's health status. The absolute time is the commonly used Newtonian time reference. With this reference, the events can be expressed as points in time (for example, 9.00 am) and periods can be expressed as duration in time (for example, two hours). The representation of the infection times can be readily supported by an object-oriented model structure. For example, an infection event can be represented as an object-oriented event, and the infection periods should be treated as an attribute of an individual indicating the infection status. With regard to computation approaches, the discrete representation of both individuals and time may move epidemiological modeling further toward the stochastic approach and away from the deterministic approach routinely used in the traditional models. The stochastic prediction of epidemic has to rely on the probability of individual infection, while at the population level the prediction depends on the accumulative effect of individualized infections. This approach requires reasonable estimates of many stochastic aspects of infection. For example, the contact structure between individuals is especially important for modeling the spread of disease through a network of individuals. Space: wave versus network The spread of disease is a spatial process. The traditional epidemiological models focus primarily on the temporal dynamic of infection. The spatial version of these models extends the traditional models to deal explicitly with the spread of disease through space (Cliff and Haggett, 1990; Cliff et al, 1981; 2000; Ferguson et al, 2001; Rhodes and Anderson, 1997). A common form of these spatial models treats the dispersion of disease as traveling waves of infection across a landscape. Diseases spread from a center point and travel uniformly outward. The infected are on the crest of the wave, facing the susceptibles in front, and leaving the recovered behind. These models have been used successfully to predict the spread of disease between communities, such as from one city to another and from cities to rural areas. Rooted in the population-based modeling approach, these models treat a community as a homogeneous unit and the transmission of diseases is modeled between these units. Such an assumption is not directly applicable to individual-based and within-community modeling. The spatially structured models treat a smaller segment of the population as a homogeneous spatial unit than do the traditional models. The passage of diseases is thus modeled between these smaller units. Although a step away from the homogeneous community and towards a heterogeneous one, the spatial assumptions of the spatially structured models do not significantly deviate from the traditional models; thus they are also inappropriate for individual-based models. The spatial adjacency model (Holmes, 1997) brings the concept of local infection into epidemiological modeling. In this model, diseases spread through the combination of both local transmission and long-distance dispersion. The local transmission begins from an infected cell and spreads from this cell to the neighboring susceptible cells. The long-distance dispersion establishes new foci of infection throughout a community. The concept of local transmission is a significant change to the traditional models. However, the assumption behind the local transmission are not applicable to humans who are mobile and spatially dispersed. Different assumptions are required. In a human population, an individual participates in a sequence of activities on a daily basis. Some of the activities are stationary and some are mobile. Stationary activities occur at a physically fixed location, such as a home or a workplace. At these locations, the individual may interact with other individuals in a group activity. When a group dissolves, an individual travels through space and time to another location, often joining another group. This process is best described in Ha«gerstrand's time geography
An individual-based spatially explicit epidemiological model
387
(Ha«gerstrand, 1970; Lenntorp, 1978; Lo«yto«nen, 1998; Pred, 1977). In time geography, the daily (yearly or lifetime) experience of an individual is represented as a life path of movement through space and time, and this life path is presented as a trajectory in a space ^ time prism. An extended time geography can represent the spread of infection in space through both a local infection and a long-distance dispersion. Local infections occur at a stationary location and the long-distance dispersions occur through travel. For example, an individual may be exposed to infection at a stationary location through interaction within a group. The probability of infection depends on the attributes of the individual, such as age, the infection status of other individuals in the group, and the contact structure within the group. The contact structure may in turn depend on the environment of the location. At home, for example, family members are in full contact, whereas in the workplace the contact may only be partial. The local infection spreads further in space as individuals travel. For the susceptible, travel brings increased probability of infection at different locations. The infected, on the other hand, transmit diseases to the next location in space. The concept of extended time geography provides an appropriate assumption for individual-based models. This assumption requires the following conceptual model. First, the unit of representation of extended time geography is an individual. Second, because the individual is mobile, both location and time are essential attributes of the individual and must be explicitly represented. Third, location should be represented by two attributes: the absolute and semantic locations. The absolute location is referenced by the Newtonian coordinate system. The semantic location reflects the environment of a location, such as home or a workplace, and the corresponding interaction patterns. These representation requirements can be readily supported by the object-oriented model structure. Because the original intention of time geography is to represent the space ^ time constraints on an individual's life path, its focus is on a single individual. Although the group is an important concept in time geography, it is viewed from the perspective of an individual and is treated as part of the environment of a location. A reasonable representation of interactions between individuals must include two or more intersecting life paths in order to represent a group, and a collection of connected groups to represent a population. Although time geography is limited in these representations, network theory can effectively complement the necessary requirements and serve as an effective model structure. A network consists of nodes and links. For spread of diseases, a node can represent an individual or a group, and links represent the local interactions between individuals or connections between groups. Network theory mostly focuses on how a network topology affects the performance of a network, measured by how efficiently the network can support transmissions between nodes. Important topology measurements include the number of links for a node, the degree of connection between a given number of nodes, the minimum number of links between any pair of nodes, and so on (Albert et al, 2000; Keeling, 1999; Watts and Strogatz, 1998). These measurements can be used to describe the contact structure between individuals and between groups. Clearly, the network model structure can extend the single life path in time geography into a network of simultaneously active life paths. Interaction: nighttime versus daytime populations Interaction structures depend on the environment. Those that occur at home during the nighttime differ from those at the workplace during daytime. Neither the traditional models nor the alternative models consider this difference. Most studies focus only on the nighttime population, although the daytime population is equally at risk.
388
L Bian
The availability of census data has enhanced the emphasis on nighttime population because these data are based on residents at home rather than employees at workplaces. The proposed individual-based spatial model assumes that both nighttime and daytime populations as well as the connections between them are important in the development of an epidemic. Based on this assumption, the principles of the conceptual model are described below. As mentioned previously, there are two types of interaction: those within a group and those between groups. The interaction between individuals within a group occurs at a stationary location. The interaction between individuals across group boundaries is facilitated by travel. The interaction within a family is usually perceived as a simple network, in that all individuals interact with one another (figure 2). Interactions at workplaces tend to be perceived as a full or a partial mix and thus are more complex than interactions at home (figure 2). Small groups at workplaces tend to have a full mix and large groups tend to have a partial mix.
(a)
(b)
Figure 2. An illustration of contact structure: (a) a full mix, and (b) a partial mix. A large bold circle represents a group. The small open circles on the large circle represent individuals within a group. Straight lines indicate the connections between individuals.
The between-group interactions link those within a group into a network. There are three between-group links that can serve as possible pathways of infection: (a) an open path, (b) a semiopen path, and (c) a closed loop (figure 3). Depending on their occupation, most individuals interact with more than one group on a daily basis. They can receive and spread infection between groups and, thus represent an open path for infection. The semiopen path applies to those individuals who interact regularly with only one group and tend to receive and spread infection only within the group. Their contacts with outside groups are made indirectly through those members of the group who have between-group connections. The third type of path, closed loop, refers to the situation in which all members of a group interact with each other without outsidegroup connections. These individuals are less likely to receive or spread infection than other individuals. A similar model of possible infection paths is discussed in the context of sexually transmitted diseases by Ghani et al (1997). What are not considered in this study and other related works are the direction and magnitude of interactions. (a) (b) (c) Figure 3. The illustration of three possible paths of infection: (a) an open path, (b) a semiopen path, and (c) a closed loop. An open circle represents an individual who has a between-group interaction, and a straight line indicates the identical individual in two different locations. A filled circle represents an individual who has a within-group interaction only. The arrows indicate interactions between individuals.
An individual-based spatially explicit epidemiological model
389
Infection is a directional interaction in that diseases are transmitted from the infected to the susceptible. The direction of infection determines the temporal sequence of infection from individuals to groups, and eventually to the population. In addition, interaction also has magnitude, that is, the probability of infection, depending on the immune state of a susceptible individual. Of the three aforementioned principles, that is, the mixing pattern, infection pathway, and infection direction and magnitude, the network model structure can readily accommodate the infection direction and magnitude. For the remaining principles, it is necessary to treat a group, rather than an individual, as a node in the network. Thus, interactions are treated at two different scales, within a group and between groups. The within-group interactions are treated as a within-node network. The between-group interactions are treated as the network between nodes. Graphically this can be represented as a two-layer structure (figure 4), each layer representing a type of group (for example, homes or workplaces). Within a layer is the within-group local interaction, and between the layers is the between-group interaction that links the nighttime with the daytime population. Many implications can be drawn from this two-layer structure. A direct implication is the possibility of duplicate contact. For example, if several members of a family work at the same place, the contacts between these individuals are duplicated on a daily basis. This may lead to more realistic yet more complex outcomes than can be produced by the traditional and other models. Workplaces
Homes
Figure 4. An illustration of a two-layer interaction structure. The two layers represent homes and workplaces. Ovals represent nodes (homes and workplaces). A small open circle in a node represents an individual who has between-group interactions. A small filled circle represents an individual who has within-group interaction only (the within-node networks are not shown). A straight line indicates an identical individual at two different locations.
The network model structure complements the object-oriented model structure in supporting the representation of nighttime and daytime groups and the link between them. In the object-oriented model, two attributes (family identification and workplace identification) link an individual to the two groups. The size and composition of families and workplaces help estimate the mixing pattern within a group. Attributes of individuals, such as age, occupation, and health status, help estimate the probability of infection, type of infection path, and direction of the path, respectively. The interaction between individuals can be represented as the behavior of individuals. The network topology further specifies the structure of these links. The stochastic computation approach implements these groups and links through probability terms. Ultimately, the collective effect of these probabilities determines the sequence of the infection events through time and space. Model design and simulation The design of an illustrative simulation model for an influenza epidemic is described below. Certain aspects of the model were simplified because the purpose of the simulation is to illustrate the individual-based spatial modeling approach. Influenza is chosen for the simulation because it is common and readily communicable between humans.
390
L Bian
The basic design of the model follows the principles of object orientation. The identification of objects, their attributes and behaviors, and the network structures are summarized below. Most of these elements have been discussed in various sections of this paper. Additional discussion on how these elements are incorporated in the simulation is provided following the summary. Object: an individual. Attributes: age: children, elderly and other adults, and their associated probabilities of infection; occupation: with or without outside-group interaction; infection status: latent, infected, and recovery; location: home, workplace, and their associated spatial coordinates; time: nighttime, daytime, and associated absolute (Newtonian) time; connections: family identification, workplace identification. Behaviors: interaction: between-group and within-group interactions; infection and recovery. Events: receipt of infection, emission of infectious material, and ending of emissions. Network structure: within-group interaction: full or partial mix; between-group interaction: open path, semiopen path, and closed loop. The simulation described below involves a total of 1000 individuals in a metropolitan area over a period of one month. The population is assumed closed, that is, no birth, death, or migration. The recovered individuals do not reenter the susceptible pool. A stochastic computation approach is used to simulate the population. The 1000 individuals simultaneously belong to 200 families and 50 workplaces. The probability distributions of these nighttime and daytime populations are listed in table 1. Table 1. The probability distribution of nighttime and daytime populations for the simulation of an epidemic. Unit name
Number of units
Mean size
Standard deviation
Minimum size
Maximum size
Family Workplace
200 50
5 20
2 20
1 5
10 50
The nighttime and daytime populations are linked by assigning a member of a family randomly to a workplace. The simulation uses 10% infection probability for the susceptible and a one-day latent period and a four-day infection period for the infected. The location of homes is randomly assigned to the residential parts of the metropolitan area and workplaces are assigned to areas where most government, service, and industry jobs are located. The simulation uses a bidaily time step to simulate the daytime and nighttime infections. To begin, one infected individual is randomly introduced into the population. The infection for the rest of population depends on three operational steps. The first is to identify both the nighttime and the daytime group members associated with this infected individual. In the second step, certain group members are assigned an infected status according to infection probability of these members. The third step identifies those already infected individuals who have outside-group contact and continues the spread of infection throughout the population. In this illustrative simulation, a full mix and open paths are assumed at both homes and workplaces.
An individual-based spatially explicit epidemiological model
391
Number
1200
Traditional Individual based
800 400 0
1
3
5
7
9
11
13
15 Date
17
19
21
23
25
27
29
Figure 5. A comparison of simulation results based on the traditional modeling approach and individual-based spatial modeling approach. The vertical axis indicates the daily number of infected individuals, and the horizontal axis indicates the date within a period of thirty days.
Figure 5 displays the result averaged over 200 simulations for the number of daily infections over the thirty-day period. This result is compared with the simulation result using the traditional epidemiological modeling approach. Both simulations use the same population, simulation time interval, and operation steps, but the traditional approach allows a global mix in the population. As expected, results from the two approaches differ. The individual-based approach produced a lower number of daily infections, a lower and later peak, a longer period of epidemic, and a lower number of total infections over the epidemic. Because of the global mix in the traditional approach, almost the entire population is infected on the second day. In contrast, the epidemic in the individual-based approach depends on the accumulation of individualized infection paths to peak and decline over time. In a working model, the traditional approach uses certain parameters, such as a contact coefficient, to reduce the contact level within a population so the predicted level of infection can be realistic. For a working version of the individual-based approach, the size, composition, and location of families and workplaces can be estimated from census and other sources of data. With these locations, the links between the nighttime and daytime populations can be reasonably estimated by using the travel time between homes and workplaces provided by the census data. The mixing pattern within a group, infection path, and infection probability and direction can be estimated according to the population statistics. The spatial distribution of the simulated nighttime and daytime infection is displayed in figure 6 (see over). A group, either a family (circle) or a group of coworkers (triangle), resides at each location. The percentage of members of a group who are infected is represented by the degree of shading, darker shades indicating greater percentages of infection. The spatial distribution of infection on days 5, 7, and 10 are displayed in order to present different stages of the epidemic. On day 5 the epidemic is on the rise and the peak is reached on day 7, and on day 10, the epidemic is in decline. Conclusion: generality versus realism This paper presents a conceptual framework for individual-based and spatially explicit modeling of the epidemiology of infectious diseases. The framework supports the basic principles of epidemiology through an alternative modeling approach. In particular, in this paper I have discussed four aspects of modeling: (1) population segments or unique individuals as the modeling unit, (2) continuous process or discrete events for disease development through time, (3) traveling wave or network dispersion for transmission of diseases in space, and (4) interactions within and between nighttime and daytime populations. Object-oriented and network modeling structures are considered appropriate to support the proposed conceptual framework, and the stochastic computation approach is necessary to implement the model design.
392
L Bian
(a)
(b) Figure 6. Spatial distribution of population infections for a metropolitan area on three days: (a) day 5, (b) day 7, and (c) day 10. The circles represent the nighttime population and the triangles represent the daytime population. The gray shades indicate the percentage of group members who are infected, with darker shades indicating greater percentages.
An individual-based spatially explicit epidemiological model
393
(c) Figure 6 (continued).
The major shift from population-based to individual-based modeling is not unique to epidemiology. In part, this shift is stimulated by rapid improvements in computing power that support modeling of large numbers of objects easily. Related to the improved computing power is the development of new computing paradigms, such as object orientation, that provide the theoretical support for the individualized representation. Furthermore, the availability of spatial data has contributed to the spatially explicit modeling. Presently, infection data collected at the individual level are used under restrictive guidelines because of privacy concerns. Certain spatial data are available to serve as a surrogate means to calibrate model prediction. These advances in technology, theory, and data allow the incorporation of realistic assumptions and causal factors into a model that are not accounted for in the traditional models. The ultimate incentive for individual-based modeling comes perhaps from within the science communities that see not only the possibility of, but also the need for, realistic models. The emergence of individual-based models is expected to bring modeling closer to reality than the traditional models, thus providing new insights into the population health. The increased realism in individual-based modeling also raises a profound question. That is, how much realism is adequate before overwhelming the representation of the true essence of reality, or to state it differently, how to strike a balance between generality and realism. This question was best stated by Koopman and Lynch (1999) who referred to the development of laws of motion in physics. When developing these laws, Newton ignored Aristotle's insistence that the slowing of objects in motion should be a central part of the laws because it is a realistic observation. Newton's laws of motion are proven for their theoretical worth, but not for their fit to reality. In this sense, the traditional epidemiological models are better suited to the purpose of generalization. As they become more realistic, the increased complexity is too much of a burden for their deterministic model structure. On the other hand, the increased realism in the proposed individual-based model demands realistic estimates of stochastic aspects of the model, such as the infection probability of individuals. Further research is needed
394
L Bian
to address this demand. The balance between generality and realism will ultimately be determined by the combination of technology, representation theory, science, and observational data. For application concerns, the representation principles outlined in this paper can be extended for different modeling purposes. For example, the `weekend' population (for example, those at service, social, religious, or other nonhome or nonwork locations) is not addressed in this paper, but it can be added to the simulation. Environment is another important element for population health that is not addressed. It requires a conceptual model and a GIS data model to represent the continuous environment, in addition to the operations needed to represent the interactions between discrete individuals and the continuous environment. With these extensions, the role of the environment can be accommodated within the individual-based model framework. Vector-borne infectious diseases are another application that can be supported by the extension of the proposed framework. This representation may require modeling the life path of both human individuals and vector pathogens, such as mosquitoes for malaria and rodents for dengue fever. Good models require sound theories, reasonable estimates of parameters, and supporting data. In this paper I have explored the theoretical concerns of an individual-based spatial model for the epidemiology of infectious diseases, hoping to further the discussion on any aspects of model development and contribute to epidemiological research. Acknowledgements. This research was funded by the National Institute of Environmental Health Sciences, National Institute of Health under Award No. R01 ES09816-01. The author would like to thank Yuxia Huang and Alexandre Sorokine for their assistance in this research. The insightful suggestions and comments provided by the anonymous reviewers are greatly appreciated. References Adams L A, Barth-Jones D C, Chick S E, Koopman J S, 1998,``Simulations to evaluate HIV vaccine trial designs'' Simulation 71 228 ^ 241 Albert R, Jeong H, Baraba¨si A, 2000, ``Error and attack tolerance of complex networks'' Nature 406 378 ^ 382 Amrhein C G, MacKinnon R D, 1988, ``A micro-simulation model of a spatial labor market'' Annals of the Association of American Geographers 78 112 ^ 131 Anderson R M, May R M, 1992 Infectious Diseases of Humans: Dynamics and Control (Oxford University Press, New York) Baily N T J, 1975 The Mathematical Theory of Infectious Diseases and its Applications (Hafner Press, New York) Benenson I, Omer I, Hatna E, 2002, ``Entity-based modeling of urban residential dynamics: the case of Yaffo, Tel Aviv'' Environment and Planning B: Planning and Design 29 491 ^ 512 Bian L, 2000, ``Object-oriented representation for modeling mobile objects in an aquatic environment'' International Journal of Geographical Information Science 14 603 ^ 623 Bousquet F, Page C Le, Bakam I, Takfoyan A, 2001, ``Multiagent simulations of hunting wild meat in a village in eastern Cameroon'' Ecological Modeling 138 331 ^ 346 Cliff A D, Haggett P, 1990, ``Epidemic control and critical community size: spatial aspects of eliminating communicable diseases in human populations'', in London Papers in Regional Science 21. Spatial Epidemiology Ed. R W Thomas (Pion, London) pp 93 ^ 110 Cliff A D, Haggett P, Ord J K, Versey G R, 1981 Spatial Diffusion: An Historical Geography of Epidemics in an Island Community (Cambridge University Press, New York) Cliff A D, Haggett P, Smallman-Raynor M R, 2000 Island Epidemics (Oxford University Press, New York) DeAngelis D L, Gross L J (Eds), 1992 Individual-based Models and Approaches in Ecology: Populations, Communities, and Ecosystems (Chapman and Hall, New York) Ferguson N M, May R M, Anderson R M, 1997, ``Measles: persistence and synchronicity in disease dynamics'', in Spatial Ecology Eds D Tilman, P Kareiva (Princeton University Press, Princeton, NJ) pp 137 ^ 157 Ferguson N M, Donnelly C A, Anderson R M, 2001, ``The foot-and-mouth epidemic in great Britain: pattern of spread and impact of interventions'' Science 292 1155 ^ 1160
An individual-based spatially explicit epidemiological model
395
Ghani A C, Swinton J, Garnett G P, 1997, ``The role of sexual partnership networks in the epidemiology of gonorrhea'' Sexually Transmitted Diseases 24 45 ^ 56 Gilbert N, Conte R (Eds), 1995 Artificial Societies: The Computer Simulation of Social Life (UCL Press, London) Gilbert N, Doran J (Eds), 1994 Simulating Societies: The Computer Simulation of Social Phenomena (UCL Press, London) Grenfell B, Harwood J, 1997, ``(Meta) population dynamics of infectious diseases'' TREE 12 395 ^ 399 Ha«gerstrand T, 1970, ``What about people in regional science?'' Papers of the Regional Science Association 24 7 ^ 21 Holmes E E, 1997, ``Basic epidemiological concepts in a spatial context'', in Spatial Ecology Eds D Tilman, P Kareiva (Princeton University Press, Princeton, NJ) pp 111 ^ 136 Judson O P, 1994, ``The rise of the individual-based model in ecology'' TREE 9 9 ^ 14 Keeling M J, 1999, ``The effects of local spatial structure on epidemiological invasions'' Proceedings of the Royal Society of London B 266 859ö867 Keeling M J, 2000, ``Metapopulation moments: coupling, stochasticity and persistence'' Journal of Animal Ecology 69 725 ^ 736 Kermack W O, McKendrick A G, 1927, ``A contribution to the mathematical theory of epidemics'' Proceedings of the Royal Society of London A 115 700 ^ 721 Khoshafian S, Abnous R, 1990 Object Orientation: Concepts, Languages, Databases, User Interfaces (John Wiley, New York) Koopman J, Lynch J, 1999, ``Individual causal models and population system models in epidemiology'' American Journal of Public Health 89 1170 ^ 1174 Kretzschmar M, Morris M, 1996,``Measures of concurrency in networks and the spread of infectious disease'' Mathematical Biology 133 165 ^ 195 Lake M W, 2000, ``MAGICAL computer simulation of mesolithic foraging'', in Dynamics in Human and Primate Societies Eds T A Kohler, G J Gumerman (Oxford University Press, New York) pp 107 ^ 143 Lenntorp B, 1978, ``A time-geographic simulation model of individual activity programmes'', in Human Activity and Time Geography Eds T Carlstein, D Parkes, N Thrift (John Wiley, New York) pp 162 ^ 180 Lloyd A, 1995, ``The coupled logistic map: a simple model for the effects of spatial heterogeneity on population dynamics'' Journal of Theoretical Biology 173 217 ^ 230 Lo«yto«nen M, 1998, ``Time geography, health and GIS'', in GIS and Health Eds A C Gatrell, M Lo«yto«nen (Taylor and Francis, London) pp 97 ^ 110 Maley C C, Caswell H, 1993, ``Implementing I-state configuration models for population dynamics: an object-oriented programming approach'' Ecological Modeling 68 75 ^ 89 Pred A, 1977, ``The choreography of existence: comments on Hagerstrand's time geography and its usefulness'' Economic Geography 53 207 ^ 221 Rhodes C J, Anderson R M, 1997,``Epidemic threshold and vaccination in a lattice model of disease spread'' Theoretical Population Biology 52 101 ^ 118 Silvert W, 1993, ``Object-oriented ecosystem modeling'' Ecological Modeling 68 91 ^ 118 Szymanski B W, Caraco T, 1994, ``Spatial analysis of vector-borne disease: a four-species model'' Evolutionary Ecology 8 299 ^ 314 Torres-Sorando L, Rodriguez D J, 1997, ``Models of spatio-temporal dynamics in malaria'' Ecological Modeling 104 231 ^ 240 Tyler J A, Rose K A, 1994, ``Individual variability and spatial heterogeneity in fish population models'' Reviews in Fish Biology and Fisheries 4 91 ^ 123 van der Ploeg C P B,Van Vliet C, De Vlas S J, Ndinya-Achola J O, Fransen L,Van Oortmarssen G J, Habbema J D F, 1998,``STDSIM: a microsimulation model for decision support in STD control'' Interfaces 28 84 ^ 100 Wand Y, 1989, ``A proposal for a formal model of objects'', in Object-oriented Concepts, Databases, and Applications Eds W Kim, F H Lochovsky (ACM Press, New York) pp 537 ^ 559 Watts D J, Strogatz S H, 1998,``Collective dynamics of `small-world' networks'' Nature 393 440 ^ 442 Wegner P, 1990, ``Concepts and paradigms of object-oriented programming'' OOPS Messenger 1 8 ^ 87 Welch G, Chick S E, Koopman J S, 1998, ``Effects of concurrent partnerships and sex-act rate on gonnorhea prevalence'' Simulation 71 242 ^ 249 Westvelt J D, Hopkins L D, 1999, ``Modeling mobile individuals in dynamic landscapes'' International Journal of Geographical Information Science 13 191 ^ 208
ß 2004 a Pion publication printed in Great Britain