attribute income levels to Twitter users as the average income in the census sectors ... group in every street; (v) assess the relationship between dynamic ...
Digital footprints in the cityscape: 1 Finding segregated networks through Twitter locational data Abstract. From a tradition that can be traced back to the Chicago School, social segregation has been mostly seen as the natural consequence of the social division of space, taking the space as a surrogate for social isolation. We propose a change in the focus from static segregation of places to how social segregation is performed in embodied urban trajectories. Using social media locational data, we analysed trajectories of groups of social actors differentiated by income levels in Rio de Janeiro, Brazil, in order to (i) infer residential location of users as the origin of trajectories examining geographic spatiotemporal positions in tweets; (ii) attribute income levels to Twitter users as the average income in the census sectors inferred as residential location; (iii) reconstruct users’ trajectories linking tweet positions through a logic of shortest paths as a predictor for actual trajectories in urban space; (iv) map segregated networks visualising the dominant income group in every street; (v) assess the relationship between dynamic segregation and territorial segregation; and (vi) measure levels of social diversity on the streets. Finally, we discuss our findings, and the utility and limitations of the approach in grasping the local scenario of segregated networks as restrictions on social contact. Key words: segregation, mobilities, social networks, Twitter locational data.
Introduction: an alternative approach to urban segregation We will examine in this paper the relationship between forms of social segregation in the city and the mobilities of socially different actors. We suggest that the usual, purely spatial forms of segregation cannot explain the phenomenon of social segregation. Even if the answer to the question of the ways we experience social segregation still implies a role for space, we hope to show that this role cannot be reduced to territorial segregation. In contrast to a literature traditionally focused on the territorial dimension, and closer to a more recent trend in sociospatial studies focused on the positioning of actors in urban space rather than residential location, our approach explores an alternative spatiality of segregation, showing ways in which segregation is shaped by the potential of encounter within the channels of movement where co-‐presence is likely to happen – or not. This shift reinforces a change of focus from the vision of static forms of segregation inherent in places to the understanding of social segregation reproduced through the trajectories of bodies in urban space. This reformulation of the spatiality of segregation should reassert the body, the entity that carries the signals of identities and difference, as the primordial instance where segregation is socially revealed and experienced by the actor. In doing so, we intend to show that urban space maintains a key role in segregation – in fact, a more subtle and penetrating role than the usual reading of territorial segregation would have; a dynamic aspect essential to the lasting experience of segregation. In fact, this aim involves entering an elusive fabric of movement and interaction. This paper develops a method to capture such fabric, using ideas from Linton Freeman's view (1978) segregation as "restrictions on interaction" to references to the time-‐geography of Torsten Hägerstrand (1970). This method is applied in an empirical study of large-‐scale mapping of urban trajectories in the city of Rio de Janeiro. Our approach is intended to help us understand the subtle forms which render the Other invisible in our daily lives, and how subtle forms of social distance silently penetrates everyday life, transforming social differences in distance – as if we lived in different social worlds; a subtle process that operates in the final analysis, through the body. This reconsideration of segregation seems also useful if we take into account the complex patterns of daily mobility in contemporary cities. From the work of Park (1916) and Burgess (1928) to recent approaches to the contextual characteristics of segregation, mobility has been mostly represented by migration, with the slow temporality of production of residential areas and location as an 1
The title recalls Frederico de Holanda’s work “Class footprints in the landscape” (Holanda, 2000).
expression. In this view, the temporality and spatiality inherent to everyday mobility, such as the ability to access and join social situations, have been neglected. In fact, there are highly interdependent forms of mobility, which are central to complex connections in a network society (Urry, 2002:1). We wish to explore the elusive conditions of contact between the actors. We will develop these ideas in order to achieve a clearer understanding of the circumstances of encounter approaching (1) segregation as restrictions on social contact, and (2) the role of mobility in the opportunities of encounter, and its relation to social differences in an unequal society. Then we shall (3) develop a concept of social network able to include encounters and movement in urban space; (4) explain the methodological use of social media locational data to grasp movement; (5) apply this framework in an empirical study of class networks in Rio de Janeiro; and finally (6) discuss our findings on social differences and segregated networks and how they respond to the intention of breaking free from the spatial reduction of segregation to a social division of space. 1. Segregation as restrictions on social contact Segregation is usually seen as a social division of space, a form of controlling contact naturally stemming from socially homogeneous territorial areas [omitted for purposes of blind review]. This view of territorial segregation as a surrogate for social segregation may be traced back to Charles Booth’s maps of income, class and residential distribution in London in 1889, and the Chicago School’s works on residential segregation in American cities. These works established a traditional focus on causal relations in spatial patterns of residential segregation in the following decades, such as Duncan and Duncan (1955); Liberson (1961), Tauber and Tauber (1964), Guest and Weed (1976), Massey and Denton (1988) and Quillian (2002), among many others. Schelling (1969) showed how the effects of individual preferences on location unfold into patterns of residential segregation, and how an integrated society would generally unravel into a segregated one even though no individual actor strictly prefers this (see also Pancs and Vriend, 2007). Extensive empirical work reinforced the conflation. More recently, Schnell and Yoav (2001; 2005) expanded on such approaches in order to measure levels of social aggregation around individuals’ activity places other than home, such as work, leisure and friends’ location – along with isolation from other social groups. An alternative view of segregation is found in Freeman (1978: 413): "All restrictions on interaction, whether they involve physical space or not, are forms of segregation – in social space." We wish to explore Freeman's view of the restriction on interaction in order to understand segregation as a restriction of the presence of the other in our actions in the city, immersed in the delicate fabric of encounters and interactions that keeps local social systems together. Usual approaches seem ill-‐ equipped to recognize these subtle dimensions of segregation active in urban rhythms of the encounter. Our approach shares the focus on social spheres and routinized activities found in a number of more recent works (e.g. Schnell and Yoav, 2001; 2005; Dixon et al, 2005; Lee and Kwan, 2011; Selim, 2015). Schnell and Yoav’s work explores “an index that emphasizes an individual agent’s experience of isolation from or exposure to actual everyday life spaces of other groups” (2001:623). Another class of study based on Hägerstrand (1970) focuses on spatiotemporal paths of actors and interaction in personal social networks (Lee and Kwan, 2011). These approaches do not identify networks of movement of socially different actors and do not assess their overlapping levels. They tend to be centred on a representation of actors as dispersed / aggregated entities on a spatial basis, assuming proximity of location as encounter opportunities. In their turn, Lee and Kwan do assess movement in space-‐time prisms and other interesting abstract representations, but paths are not aggregated into networks of movement related to socially different actors. Our approach is based on modelling the channels of movement in public spaces and the activity places where co-‐presence and encounter may or not happen, so that ‘the other’ may be perceived and acknowledged. We would like to emphasise that, if we want to fully understand the integrative / segregative potential of the encounter, we must turn to the fabric of
2
bodily movement towards activities beyond segregated areas. This means the possibility of exploring a more complex spatiality of segregation, along with potentials of integration latent in spaces that allow the co-‐presence of socially different actors. A closer look into urban practices of different social groups requires a concept able to identify how actors perform their actions in order to access and join social situations. Let us define patterns of mobility not only in terms of distances to be overcome by actors, but as a composite capacity related to (i) distances overcome in time; (ii) the complexity of trajectories, as they indicate movement in different directions, indicating complexities in spatial behaviour; and (iii) the number of activities that actors are capable of participating (e.g. residence, work, consumption places) in time. As we shall see below, our source of spatial data unfortunately does not allow the inclusion of activities, as we did in other studies [references omitted], since tweet locations may not overlap with actual activity places. Nevertheless, the focus on mobility also allows the investigation of segregated networks, as different mobilities may be associated with different social groups and different forms of urban experience. We will argue that income is a key factor in unequal societies and may have effects on mobility as it includes transport costs and access to a number of activities. The location of activities is also important, and here approaches to residential segregation still have much to say, given that living in accessible places means being closer to more activities and having conditions to perform in greater numbers of them. Encounters can be dispersed in the streets or polarized in places of work, leisure and consumption, at bus stops, subway stations, institutional buildings and so on. These factors may have an impact on our actions, like sparks to a dense network of daily movements from residential locations. If movement could leave visible traces in space, such networks of movement could reveal the potential to encounters and bodily segregation unfolding in urban space. Mapping these networks of movement in the city is precisely an objective of this paper. In this sense, there are substantive reasons and methodological reasons for our option for analysing trajectories of bodies instead of location of activities or individuals. Substantively, our aim is to address social segregation and integration materialized in potentials of encounter in the city – a dynamical view of segregation as ‘restriction on interaction’ clearly opposed to usual views based on residential location, a tradition stemming especially from the Chicago School. Our aim is to focus precisely on movement in order to identify the role of mobility in segregation along with potential spaces for presence. Methodologically, Twitter data does offer locations, but we cannot know whether they correspond to actual places or were posted on the move. Furthermore, lines of trajectory do contain a number of actual and potential activity locations. In fact, the idea of mapping trajectories is far from new. The work of Hägerstrand (1970) was the first systematic attempt to capture trajectories and restrictions spatiotemporal hanging over our actions. Although Hägerstrand’s approach – fashionable in the early 1980s in human geography (e.g. Pred, 1981) – has lost interest since then, recent empirical approaches have taken the spirit of that work (e.g. Lee and Kwan, 2011), making use of technologies capable of recording the movement of actors and identify patterns of mobility.2 We propose to add more layers to this idea, and evaluate how the patterns of appropriation of actors shape opportunities of encounter. We wish to explore mobilities potentially related to different social groups. Networks of movement are of course evanescent features of our effective presence in space. If we could capture at least some of them, we could have a better idea of how socially differentiated groups materialize their actions. These ‘paths of action’ shape possibilities of encounter between people and may contain both spaces of co-‐presence and spaces of systematic absence of socially different actors. In other words, mapping 2
A method developed by Gonzales et al (2008) uses an extensive data base recorded through mobile phone communication in American cities to map spatial paths, showing that actors have a remarkable tendency to recursivity.
3
movement can allow an understanding of the spatiality of presence and absence as active features of social segregation. We would like to explore a particular definition of ‘social network’ useful for understanding how the spatiality of the encounter shapes the experience of social segregation. We shall not use the concept as a mathematically identifiable arrangement of personal ties, as in Social Network Analysis (SNA) in quantitative sociology. 3 We employ a definition of social network as an open set of relationships changing over time – one taking into account the social positions of the actors and the circumstances of time-‐space where groups are formed. We are interested in networks of encounters through which interaction may emerge and relationships be formed. Graphically, we do not represent actors by vertices and relationships by links, as in classic SNA. Instead, we invert this representation, seeing actors as “lifelines” (as in Hägerstrand) converging to positions in space-‐time. Encounters are the vertices, and actors’ trajectories in space-‐time, the potential links between them (Figure 1). This representation is favoured by a principle of homology in which lifelines correspond to urban trajectories, and circumstances of encounter correspond to positions in space. This inversion seeks to add the temporal and spatial dimensions as inherent dimensions of social networking, and render the spatiality of encounter more intuitive. Such a definition of network is intended to account for the probability of encounter a factor in the genesis of social relationships.
Figure 1 – Principles of homology between social and spatial networks in time.
2. Social differences and different mobilities What is the chance to meet someone from a different social group? We shall first examine a number of conditions of encounter reasonable from a material standpoint, some supported by previous empirical evidence, others still in need of further research. § First, the creation of social networks in cities depends on circumstances of encounter.4 § Second, cities are historically produced and spatially structured so as to make social situations in principle accessible to potential participants. In fact, city-‐making processes are consistently related to patterns of location and accessibility, as studies in spatial economics (from Hansen, 1959 to Glaeser, 2010) and urban studies (from Lynch, 1960 to Hillier, 2012) have systematically revealed. In a long tradition stemming from Alfred Weber (1909) and Hansen (1959), approaches in spatial economics have been able to identify actors’ preferences and location patterns amidst the apparent randomness of location. 3
See graph theoretical approaches in Freeman (1978), Wasserman and Faust (1994) and Marques (2012). The heart of this idea may be found already in Jacobs (1961; 1969); see also Giddens (1984); Hillier and Hanson (1984); Sassen (2001), Gordon and Ikeda (2011), Bettencourt (2013), and Batty (2013). 4
4
§ §
§
§
Third, activity places tend to increase the potential for convergence of actors who share similar interests and mobilities.5 Fourth, courses of action able to include more activity places may increase the potential for social contact. The higher the mobility, the larger the potential to increase personal social networks would be. Fifth, income may play a role in this process. People with smaller budgets face further restriction in mobility. As we shall see, limitations in mobility enhance localism – the dependency on proximity to do create social networks. In these cases, the density of encounters would tend to increase especially around home, and actors would tend to use places in the neighbourhood to create and maintain relationships. Furthermore, there is a range of activities not dependent on proximity to the residence, such as those around the work, which can increase the length and complexity of urban trajectories of lower income actors. Public transport and the progressive increase of private vehicle ownership in developing countries also allow a broader and more complex mobility in the city. In fact, a number of empirical studies have consistently shown that higher levels of income allow less dependence on spatial proximity.6
Our hypothesis is that similarities in patterns of mobility and appropriation of space would lead to increases in the density of encounters between socially similar actors. In turn, this spatial trend toward both higher levels of homophily and different degrees of connectivity in personal social networks, both generated by differences in income, lifestyles and mobility, may have strong implications for social performance. A number of studies support the idea of a substantial change in the dependency on proximity in the formation of personal social networks according to class and income – in Brazil, country of our case study, and other countries in the world. A recent study by Marques (2012) in Sao Paulo analyses the profiles of sociability of actors in poverty and its role in the formation of social networks, and reveals differences between personal network structures of actors from different social classes. Middle class networks tend to be less dependent on the proximity, as if they were personal deterritorialized communities (Wellman in Marques, 2012) – a very different pattern than actors in poverty. A similar relationship of income and network formation is found in analyses of cases in California USA, and Israel (Fischer and Shavit, 1995), France (Grosseti, 2007), Finland and Russia (Lonkila, 2010) and Chinese (Lee et al., 2005), among others. Their findings suggest that personal networks vary according to social class more than in relation to cultural and regional contexts. The inverse relationship between the dependence on proximity and income also finds support in Briggs (2005). These approaches to segregation, however, still tend to view the spatiality networks limited to the location of residential areas. We need to clarify how the formation of social networks is effectively performed both spatially and temporally, involving circumstances of co-‐presence and absence. Could mobility – and not proximity – be the key factor in networking? In turn, approaches to mobility that make use of geographic information derived from digital data (say, the data usage of mobile phones) are still restricted to capture spatial patterns of behaviour (e.g., Gonzales, Hidalgo and Barabasi, 2008) – with no connection with the social conditions of spatial behaviour, such as the influence of income and class. We propose that a detailed analysis of urban trajectories could clarify causal factors in the formation of segregated networks. 5
Bettencourt (2013) has recently theorized the effects of linear paths over the density of encounters. See Holanda (2000) and Marques (2012). Empirical data on transport expenses in Brazil show that higher income groups not only spend more than low-‐income groups, they spend more than proportionally (POF, 2009). 6
5
3. Mobilities and networking Now let's analyse the temporal formation of personal social networks in an urban context. It seems quite reasonable to say that increases in our access to different social situations could lead to expansions in our personal networks. It also seems reasonable to say that this increase depends on increased mobility. We know that broader and diverse personal networks allow more opportunities for social and economic activity (Marques, 2012). We arrive now at another key point of our argument. If social networks are to really offer more opportunities for activity, networks must first be formed. However, social networking in cities is a spatial capability. In other words, mobility and urban trajectories create encounters, new contacts and opportunities for activity, which generate and expand social networks. If this is the case, mobility should be considered as a key factor in overcoming localism and in the diversification of sociability. Income certainly keeps its central role, as it supports mobility, but we have to consider that a higher mobility is directly related to a higher capacity to form personal networks and perform in them. On the other hand, low mobility tends to limit interactions, as we can see in the case of poor actors. Mobility and income are associated in a circle that leads to increases or decreases in the potential to create, maintain and expand personal networks. But how does this happen? If networking depends on situations of encounter, we need to understand how mobility matters in the structure of urban encounters, as the instance in which social segregation operates in everyday life. We have seen that mobility may have a direct influence on interaction potentials. We also know that low-‐income actors, despite having more limited mobility, are not static within socially homogeneous areas. Differences in levels of localism and sociability in their personal networks suggest variations in their spatial reach and, by extension, in their condition of presence in places in the city. But how (and where) does the potential of encounter between the socially different materialize? In order to answer this question, we need to examine the effects of different mobilities in the formation of personal networks within and between social groups in order to understand the opportunities of encounter. We suggest that if social networks are formed so that actors can perform connections, then class networks (i.e., social networks operating within large-‐scale groups with common economic features that strongly influence their actions and lifestyles)7 and other forms of social grouping are shaped by probabilities of encounter in personal networks. As we hope to show in our empirical study, space matters here. Even though we do not usually think about it, our daily trajectories constitute the backbone of our interactions and shape the elusive structure of social life in the city. The distance between locations in a city, associated with different mobilities, may impose limitations on probabilities of encounter. Differences in mobility, income and lifestyle could bring inequalities in the capacity to participate in social situations. Actually, a counterfactual perspective is able to reveal the extent of this problem in the experience of segregation. Inequalities and incompatibilities in patterns of appropriation are forms of what we may call the disjunction of encounters – a way of disrupting the possibility of encounters that otherwise could happen. The disjunction of encounters may be especially active among socially different people – as consequences of the social and material conditions that prevent them to become co-‐present and acknowledge each other. Simply put, there would be a greater chance of social networks incorporate actors who share similar mobilities. These ideas begin to portray an image of the complex material fabric of sociality in a city. However, how can we understand in detail such volatile spatiality of the encounter? How can we see the fabric of personal trajectories that intertwine, only to be separated later in space-‐time? 7
We derive this notion from Giddens’ concept of social class (Giddens, 1993).
6
4. The methodological use of Twitter locational data It seems almost impossible to see the spatiality of the tremendously complex flows of convergences and divergences of our actions and paths in the city. Our own previous studies remained at a level of small scale, using social and spatial data derived from interviews (Netto et al 2010; forthcoming). They offered delicate images of segregation in social class networks operating in the city. So a key question posed to us is how can we expand this dynamic approach in order to grasp the panorama of segregation in an entire city? The answer is that we can track and record the movement of a large number of (socially different) actors in the city exploring the potential of Twitter locational data as a way of dealing with the urban environment. In fact, a number of works has recently emerged using social media locational data in order to extract information of human patterns of movement. On a substantive level, Liben-‐Nowell et al (2005) related geography and online social networks (a community of bloggers) to find that one-‐third of relationships are not dependent on geography. Lee et al (2011) examine how the use of mobile communication channels of information affects just-‐in-‐ time choices in consumption travelling behaviour. On a methodological level, Li et al (2011), Ribeiro et el (2012) and Zielinshki and Middleton (2013) developed forms to infer indirect locations from Twitter geotag and timestamp, whereas Veloso and Ferraz (2011) and Takhteyev et al (2012) inferred spatially reliable information through regression models correlating tweet frequencies with real world events. Sakaki et al (2010) filtered georeferenced tweets, whereas Boettcher and Lee (2012) applied density-‐based spatial clustering. In the spirit of these works, we conducted an empirical study in the city of Rio de Janeiro. Twitter offers particularly attractive possibilities, as it makes its metadata bank public through a principle of anonymity. The set of variables provided by Twitter API includes user IDs along with a spatiotemporal signal, the timestamp and geographic coordinates for each tweet. We collected metadata from tweets posted in Rio through the official Twitter streaming API between November 12th (0:07:13 am) and 14th (2:36:45) in 2014. We would like to show initial results from an ongoing empirical experiment in Rio. We recorded posts during a period of 56 hours, generating a database of 20,192 users and 333,407 tweets with spatiotemporal positions. We tested this time frame against a 241 hours database, with 70,403 users and 2,252,348 tweets collected along 18 days, and found a Pearson linear correlation of 0.976 (p-‐value 2.2e-‐16) between the datasets regarding the spatial distribution of tweets according to districts. We also could find a clear temporal pattern in tweeting behaviour. There is a steady increase during the day, a little peak around midday, raising again to drop immediately around midnight, to raise again in the early morning, when the cycle begins again – something like “the social pulse of the city” seen through the prism of digital information (figure 2).
Figure 2 – Total number and frequency of tweets in Rio de Janeiro November 12th (12:07:13) – 14th (2:36:45).
7
Metadata allowed us to generate a classification of users according to posting behaviour. Filters were used to ignore users that do not provide sufficient data to infer their residential location, say excluding 40% of users with only two or less tweets during the study period. Automatic tweeters posting for commercial purposes (bots), identifiable by the large number of tweets posted from the same position in space, were also excluded. Furthermore, the analysis of average Euclidean distance between tweet spatial positions revealed users whose movements were also not relevant to our study. A statistical analysis via quantile classification of the results showed that the threshold between high frequencies of short distances and the exponential distribution of long distance was 106 meters per tweet. Values lower than this were filtered, reducing the data set to 78,825 tweets posted by 4,325 users. After these filtering procedures in the initial metadata set, we selected a number of 2,543 users whose movement was relevant to this study. We also identified spatial and temporal patterns of posts in order to deduct the place of residence of users. Then the tweets of valid users were placed in the urban network via a model able to connect tweet positions through the logic of shortest paths, using Open Street Maps as the GIS layer for the street analysis. The result was a network of thousands of paths within the network of streets. It should be clear that the link between tweet positions via shortest paths should not be taken as the actual route covered by users between their tweets. Nevertheless, there are enough theoretical work and empirical evidence (especially in the fields of space syntax, urban networks and way-‐finding studies) to offer reliability to this procedure as a proxy for actual trajectories. Evidence shows that humans tend to choose the shortest path between two positions in space, in a metric sense as well as topological and angular minimization along the segments that compose an urban path (see Hillier, 2012). The next step was to differentiate users using the criterion of income. This step requires the crossing of residential locations and paths inferred from tweets with economic data. We used the 2010 Census (Brazilian Institute of Geography and Statistics, IBGE), which provides income data for areas in the city. The census block unit used to infer wage level of twitter users was the smallest one made available by the Brazilian Institute of Geography and Statistics (IBGE). Within the city of Rio de Janeiro, the census block unit has an average number of 210 households and 616 Inhabitants and a median area of 33,017 squared meters (a considerable variance is found for area). Although we could not find empirical studies on income variation within block units in Brazil, we assumed that proximity and its general relation to land values tend to correlate with similarities in income (as asserted in a line of works in urban economics, from Alonso [1964] on), especially once the size of census blocks is considered. Therefore we associated the initial positions of tweets with census data, allowing us to assign average income levels for users. We analysed income through the standard deviation of the average income per capita in Rio de Janeiro, which suggested ranges between R$ 750; R$ 1,600; R$ 2,500; R$ 3,400, R$ 6,200, and above. These values were identified as low, lower-‐middle, middle, middle-‐high and high-‐income users (differentiated by colours in figure 3). Interestingly, figure 3 shows two very different readings of segregation: residential patterns of income levels as a sign of residential segregation (left) and the pattern of distribution of tweets according to the movement of users in the city (right). Once we compare them, the more complex pattern of tweet locations within the trajectories of users reveal a picture of segregation / integration closer to users’ activities – a more dynamic view of segregation related to the positioning of bodies in urban space. Perhaps ironically, the Twitter users map seems closer to a real picture of segregation in the city. This second view resembles more closely Schnell and Yoav’s (2001; 2005) approach – but we wish to expand on this and include spaces of potential encounter also within the trajectories of actors in urban space.
8
Figure 3 – Residential segregation and a picture of segregation derived from user’s tweet locations: income levels in census blocks units (blue to red, left), and the more complex pattern of tweet locations within the trajectories of users.
In this sense, our methodological procedure can be summarized as follows: 1. Collect metadata Identification of the position in time and space of tweets posted on the city Period: 12 Nov (0:07:13) -‐ 14 Nov 2014 (02:36:45) Active Twitter users in the period: 20,192 users; 333,407 tweets 2. Filter users Minimum number of tweets per user: 3 Eliminate automatic tweeters Selected users: 2,543 users 3. Geographical analysis of tweets and residential location of users Identification of the location of the first tweet in the morning (first in the sequence of tweets) Confirmation by repeating location of the first tweet day. We used recurrent morning tweet locations, since the night position brings limitations to the sample regarding tweeting behaviour. 4. Generation of shortest paths between tweet positions An analytical procedure performed through GIS software allowed us to find the shortest paths within the urban fabric, as a proxy for the actual paths of users. 5. Crossing census data (income) with the inferred location of users Procedure performed through GIS software.
A final methodological question involves the representation Twitter in relation to the population of Rio de Janeiro – for example, how do we have all social classes and income groups that make up the city proportions in the use of Twitter? First we compared histograms of the average per capita income in Rio’s population and Twitter users. Although histograms display a similar trend, there are differences (figure 4, left). We investigated this question and found a significant correlation between population (at different levels of income) and income profile of Twitter users (figure 4, right). Linear regression (dark line) brings an adjusted R squared of 0.67, showing that the income distribution of users has a fairly reasonable degree of similarity with the income distribution of the population in general.
9
Figure 4 – Histograms of average per capita income in Rio’s population and Twitter users (left); and regression between users (Y) and population (X) in urban districts; colours display income level (right).
Therefore, the use of Twitter does not seem to be associated with specific income levels – it seems like a democratic medium in Rio, confirming the high penetration rate of Twitter in Brazil (Graham and Stephens, 2012). This empirical observation leads us to rule out the usual hypothesis of a digital divide in the city, and also shows a relative match between Twitter users and the general population (contrary to Steiger et al’s [2015] observation). 5. Overlapping networks: a digital study What does our experiment reveal about the dynamic segregation as revealed by Twitter users' trajectories in the city? We counted the number of actors’ trajectories classified by income level for each segment where there were trajectories. We also quantified the length of streets covered by users in their trajectories in order to verify the potential of co-‐presence of different income groups, seeking more precision in the description of spatial convergences and divergences between socially differentiated users. After classifying actors according to income level to generate proxies to social classes in Rio, we linked their tweets locations through shortest paths, calculated through GIS software. Trajectories are measured with the same metric length of street segments. This information is registered for each actor and accumulated for her/his income groups. Then we calculated the overlapping of income groups, using the number of actors for each group passing through each street segment. The criteria for determining visually the dominant presence of a single group over a street segment is ‘the social class with the higher number of paths overlapped in a street segment provides the colour for that segment’ – i.e. once we consider the proportion of social classes in actual numbers, when a group has one person above that percentage, it has dominant presence. Intensities matter: streets with higher number of actors of different groups have more weight than streets with similar distributions of groups but lower number of actors. Maps in figure 5 show the dominant income group in the streets that make up their trajectories. Results of the analysis seem to grasp traces of dynamic segregation and the spatiality of potential encounter.
10
Figure 5 – A picture of segregated networks: blue (low income), green (lower-‐middle), yellow (middle), orange (middle-‐ upper) and red (high income) groups. The larger map shows the dominant class network.
We may first notice strong evidences of residential segregation related to income. The poorer spread more broadly over the cityscape. Low-‐income and lower-‐middle income groups show considerable overlap, but low-‐income users are dominant in areas farther from the sea and the CBD, located in the East. Landscape amenities related to proximity to the sea are a clear factor in defining land values and higher income location (and both territorial and dynamic segregation) in Rio. Geographical topography adds complexity to residential location patterns in Rio – including the favelas scattered in the landscape, allowing the poorer to live in hills near richer areas, closer to the sea and the CBD. Such unusual cityscape enmeshes the lines of movement, as we can see in areas in South Rio, increasing the potential of encountering the socially different. Complexities considered, an overall pattern emerges, as higher income users are more likely to be found in South and Southeast Rio, near to the sea, and a gradual shift in users income levels becomes visible in trajectories as they spread farther, towards North.
11
We may render the analysis of overlapping networks more precise. Table 1 shows that the richer have a heavier share of the total length of trajectories performed by users. R5 have 8.9% of users and 11.1% of total trajectories, 20% above the average length (average-‐related factor – ARF). Users (n)
(%)
Total Length (km)
(%)
Average Length (km/user)
(ARF)
R1
600
23.6%
13,696
26.3%
22.8
1.1
R2
1,188
46.7%
22,606
43.4%
19.0
0.9
R3
235
9.2%
3,929
7.5%
16.7
0.8
R4
293
11.5%
6,091
11.7%
20.8
1.0
R5 TOTAL
227
8.9%
5,784
11.1%
25.5
1.2
2,543
100%
52,106
100%
20.5
1.0
Table 1 – Number of users by income group and a comparative analysis of their spatial behaviour.
We can also assess how isolated is the presence of a single class in Rio’s streets, and how the proportion of trajectories income groups share with one another (table 2). Lower income groups (R1 and R2) are much more segregated in their movements across the city, with 19.2% and 29.9% of their trajectories occurring in non-‐shared streets, respectively. They also show the highest degree of sharing spaces (10.4%). Richer groups (R4 and R5) share more of their trajectories with other groups. The poorer and the richer (R1 and R5) share only 0.8% of users’ trajectories. R2 displays a more socially integrative spatial behaviour – but also has a larger share of users (46.7%). R1 and R2 trajectories display less social diversity – they are easily the dominant group (i.e. their presence in a particular space is above the average of its proportion in the total number of actors, all groups considered). The fact that R1 consists of 23.6% of total users and are dominant in 30.5% of streets where they pass through suggests they are more segregated than other groups in their movements. R1 R2 R3 R4 R5
R1 19.2%
R2 10.4% 29.9%
R3 0.5% 1.5% 4.3%
R4 0.5% 1.7% 0.2% 4.3%
R5 0.8% 1.2% 0.3% 0.7% 4.6%
Table 2 – Matrix of the proportion of streets (regarding the total number of streets) appropriated exclusively by a single income group (italic), and the proportion of streets shared by different income groups in their daily paths.
Where are exactly the spaces that different social groups share in their routines? The superimposition of paths may be highlighted graphically. Lines in red show the most likely places to find combinations of income groups (figure 6).
12
Figure 6 – Spatial relations between income groups: overlapping pairs R1 (low income), R2 (lower-‐middle), R3 (middle), R4 (middle-‐upper) and R5 (high income level).
Figure 6 shows quite distinct patterns of co-‐presence between users of different income levels, visible in the extent of overlapping of networks of movement. Pairing the poorer and the rich (R1xR5 and R2xR5), maps show that the world-‐famous South Rio (Copacabana, Ipanema, Leblon) is not merely a segregated area marked by the dominating presence of the rich. It is a major area for mutual visibility. Lower-‐middle and upper income users (R2xR5) also converge in São Conrado, Barra and Recreio in Southeast Rio, in the CBD, and in Gloria, Catete and Flamengo to the Southeast. Lower-‐middle-‐income users also overlap with middle-‐income users (R2xR3) across areas like Tijuca (North Rio, near the CBD on the East) and Jacarapaguá (between North and Southwest Rio, near the ocean). Finally, poorer income users (R1xR2) share much more spaces when appropriating the city, mostly in North and West Rio. Now considering the relationship between dynamic segregation and residential segregation, how much do other social groups actually pass through territorially segregated areas? We assessed this relation crossing the average income in residential areas (census sectors) with the average income of the dominant group passing through those areas (table 3). However present in richer sectors, poorer groups (R1 and R2) strongly concentrate in poorer sectors: 72.1% of R1 trajectories happen in low-‐middle income sectors (S2). Middle-‐income users are well distributed in sectors S2-‐S5,
13
showing potential for encountering socially different actors. Richer users (R4 and R5) still appropriate rich sectors (S5): 58.58% of R5 trajectories happen in S5 areas. These areas are also very attractive to R4 users, consisting of 46.7% of the streets they move through. In turn, middle-‐ income and middle-‐upper income sectors (S3 and S4) are open to more diverse income groups.
Residential Sectors
Dominant presence of class networks R1
R2
R3
R4
R5
S1
9.1%
3.9%
3.6%
1.7%
5.2%
S2
72.1%
52.4%
28.0%
18.2%
17.2%
S3
13.5%
30.1%
29.3%
15.9%
7.4%
S4
2.0%
8.4%
20.5%
17.5%
11.4%
S5
3.3% 100%
5.2% 100%
18.7% 100%
46.7% 100%
58.8% 100%
Table 3 – Proportion of presence of income groups in residential sectors, considering local average income.
Finally, where do class networks converge more intensely? What are the streets with more social diversity, where ‘the other’ is more likely to be seen? We measured social diversity in the streetsi, i.e. the level of superimposition of networks, through Shannon information entropy, calculated as the participation of each class over the total number of actors (spaces with the presence in equal shares of all income groups contain the highest entropy), and associated different entropy levels with colours from blue to red (figure 7). 𝑃! 𝑙𝑜𝑔! 𝑃!
𝐸𝑛𝑡𝑟𝑜𝑝𝑦 = −
!
where P is the participation over the total and i is the income level
Figure 7 – Entropy map showing in red the spaces with the highest convergence of different income groups.
14
There is a small network of socially convergent streets, an interesting superimposition of trajectories of users of all income groups around South Rio and the CBD (on the East). Spaces of social convergence are to be found in denser, busier areas like the CBD and South Rio, or major centralities like Tijuca and Jacarepaguá, a little to the North. These are the most likely spaces to find socially different actors. 6. Conclusion: on social difference, spatial behaviour and segregated networks Considering our hypothesis that similarities in patterns of mobility would lead to increases in the density of encounters between similar social actors, a number of inferences can be drawn. Lower income groups (R1 and R2) have more exclusivity (or are more isolated) in their appropriation of the street network. Their overlapping level is the highest among groups. On the other hand, higher income groups show relatively little overlapping. As for the implied idea that different patterns of mobility lead to little opportunities for co-‐presence, this seems to be the case between poorer (R1) and richer users (R5). R5 and R1 are also the income groups which display the longer trajectories, suggesting on one hand the weight of income, and on the other hand the weight of segregated residential locations. In its turn, lower-‐middle and middle income users (R2 and R3) have a potentially key social role, overlapping more than any other group above or below their income grade. Our proxy to the actual scenario of dynamic segregation/integration in Rio operated through a number of analytical reductions: (i) we inferred residential location as the origin of trajectories assessing spatial and temporal repetition in users’ first tweet in the morning; (ii) we attributed income levels to Twitter users as the average income in the census sectors inferred as residential location; (iii) we reconstructed users’ trajectories linking tweet georeferences and timestamps through a logic of shortest path as a predictor for actual movement and trajectory in urban space; (iv) our mapping of segregated networks involves a reduction of the actual networks, bringing to visualisation only the dominant income group in every street (dominance assumed as the proportion of each group in the total number of users, plus at least one user); (v) our measurement of the relationship between dynamic segregation and territorial segregation is based on the average income in residential location and the dominant income group passing through the sector. In its turn, (vi) our analysis of social diversity on the streets through Shannon entropy used the actual number of users. As a proxy to the scenario of segregated networks, this experiment can only show trends of dynamic segregation and potential encounters within the trajectories of a limited number of users. Unlike previous approaches – e.g. Schnell and Yoav’s (2001; 2005) index of location of individuals and activities, based on measures of aggregation and isolation – our approach [references omitted for purposes of blind review] is geared to trace movement, relate it conceptually and empirically to patterns of appropriation according to social differentiation (in this case, based on income) and assess their role in segregation in actual trajectories of actors in urban space – measured as overlapping levels. However, our findings do not imply that residential or territorial forms of segregation cease to matter. Rather the opposite, residential locations have a key role in structuring networks of movement and circumstances of encounter – along with geography itself, shaping complex patterns of location in the case of Rio de Janeiro. In this sense, our approach is intended to shed light on the complexity of segregation captured as a highly dynamic micro-‐segregation that occurs at the level of the trajectories of our bodies and spaces. It suggests that the probability of encounter is impregnated with spatiality, interacting actively with the structure of the street network to generate potentials of convergence and co-‐presence of social groups. The odds of finding ‘the other’ seem distributed according to the spatial and temporal frames of action of different groups within a
15
city. These frames have a stronger potential for allowing cohesion in social networks sharing mobility patterns and similarities in income as resources to access places and social situations. The paths of Twitter users, mapped in space-‐time, suggest greater compatibility between certain users – and, by extension, greater potential for interaction. On the other hand, incompatibilities between patterns of mobility materialise as differences in urban trajectories in daily life, leading to the reduction of spatial opportunities of encounter. This approach consists of a form of bringing to the forefront the elusive spatiality of segregation contained in bodily movement, closer to Freeman’s (1978) seminal definition of social segregation as ‘restrictions on interaction’. Acknowledgements: A previous version of this paper was presented at the International Conference on Location-‐Based Social Media Data [ICLSM 2015]. We would like to thank the referees for their invaluable input.
References Alonso, W. (1964) Location and Land Use: Toward a General Theory of Land Rent. Cambridge, MA: Harvard University Press. Batty, M. (2013) The New Science of Cities. Cambridge, MA: The MIT Press. Bettencourt, L.M.A. (2013) The origins of scaling in cities. Science, 340, 1348-‐1441. Boettcher, A. & Lee, D. (2012) EventRadar: A real-‐time local event detection scheme using Twitter stream. In Proceedings of the IEEE International Conference on Green Computing and Communications, Besançon, France, 358–67. Briggs, X. (2005) “Social capital and segregation in the United States” In L. Varady (ed.), Desegregating the city. Albany, NY: Suny Press. Burgess, E.W. (1928) Residential Segregation in American Cities. Annals of the American Academy of Political and Social Science, 140, 105–115. Dixon, J., Tredoux, C., & Clack, B. (2005) On the micro-‐ecology of racial division: A neglected dimension of segregation. South African Journal of Psychology, 35(3), 395–411. Duncan, D. and B. Duncan (1955) A methodological analysis of segregation indexes. American Sociological Review 20, 210– 217. Fischer, C. & Y. Shavit (1995) National differences in network density: Israel and the United States. Social Networks, 17.2, 129–45. Freeman, L. (1978) Segregation in social networks. Sociological Methods & Research, 6.4, 411–429. Giddens, A. (1984) The Constitution of Society. Cambridge: Polity Press. Giddens, A. (1993) Sociology. Cambridge: Polity Press,. Glaeser, E. (2010) The triumph of the city: how our greatest invention makes us richer, smarter, greener, healthier and happier. New York: Penguim. Gonzales, M., Hidalgo, C. & Barabási, A-‐L. (2008) Understanding individual human mobility patterns. Nature, 453, 479-‐482. Gordon, P.; Ikeda, S. (2011) Does density matter? In D. Andersson; A. Andersson; Mellander, C. (eds.) Handbook of Creative Cities. [S.l.] Cheltenham: Edward Elgar Pub. Graham, M. & Stephens, M. (2012) A Geography of Twitter. WWW document, http://www.oii.ox.ac.uk/vis/?id=4fe09570. Grosseti, M. (2007) Are French networks different? Social Networks, 29.3, 391–404. Guest, A. M. & J. A. Weed. (1976) Ethnic Residential Segregation: Patterns of Change. American Journal of Sociology, 81, 1088–1111. Hägerstrand, T. (1970) What about people in regional science? Papers of the Regional Science Association, 24.1, 6–21. Hansen, W.G. (1959) “How acessibility shapes land use.” Journal of the American Institute of Planners, 25, Issue 2. Hillier, B. (2012) The genetic code for cities: Is it simpler than we think? In J. Portugali, H. Meyer, E. Stolk & E. Tan (Eds.) Complexity theories of cities have come of age. London: Springer, 67-‐89. Holanda, F. (2000) Class footprints in the landscape. Urban Design International, 5, 189-‐198. Jacobs, J. (1961) Death and Life of Great American Cities. New York: Random House. Jacobs, J. (1969) The Economy of Cities. New York: Random House. Lee, R., D. Ruan and G. Lai (2005) Social structure and support networks in Beijing and Hong Kong. Social Networks, 27.3, 249–74. Lee, J. & Kwan, M-‐P. (2011) Visualisation of socio-‐spatial isolation based on human activity patterns and social networks in space-‐time. Tijdschrift voor Economische en Sociale Geografie, 102.4, 468–485. Lee, K. H., Lippman, A., & Pentland, A. (2011) The Impacts of Just-‐In-‐Time Social Networks on People’s Choices in the Real World. The Third International Conference on Social Computing. Li, W., Serdyukov, P., Vries, A. P., Eickhoff, C., & Larson, M. (2011) The where in the tweet. In Proceedings of the Twentieth ACM International Conference on Information and Knowledge Management, Glasgow, Scotland. Liben-‐Nowell, D., Novak, J., Kumar, R., Raghavan, V. & Tomkins, A. (2005) Geographic routing in social networks. PNAS, 102.33, 11623–11628. Lynch, K. (1960) The image of the city. Cambridge: The M.I.T. Press.
16
Lonkila, M. (2010) The importance of work-‐related social ties in post-‐soviet Russia: the role of co-‐workers in the personal networks in St. Petersburg and Helsinki. Connections, 30.1, 46–56. Maloutas, T. (2004) Segregation and residential mobility spatially entrapped social mobility and its impact on segregation in Athens. European Urban and Regional Studies, 11.3, 195–211. Marques, E. C. (2012) Social Networks, Segregation and Poverty in São Paulo. International Journal of Urban and Regional Research, 36.6, 958-‐979. Massey, D.S. & Denton, N.A. (1988) The Dimensions of Residential Segregation. Social Forces, 67. 2, 281–315. Pancs, R. & Vriend, N. (2007) Schelling's spatial proximity model of segregation revisited. Journal of Public Economics, 91, 1–24. Park, R.E. (1916) The City: Suggestions for the Investigation of Human Behaviour in the Urban Environment. American Journal of Sociology, 577–612; reprinted in Park, R.E. (1957) Human Communities: The City and Human Ecology. Glencoe: Free Press. Pred, A.R. (1981) (ed.) Space and time in geography: Essays dedicated to Torsten Hägerstrand. Lund Studies in Geography, Series B, 48. Lund: Gleerup. Quillian, L. (2002) Why Is Black–White Residential Segregation So Persistent? Evidence on Three Theories from Migration Data. Social Science Research, 31, 197–229. Sassen, S. (2001) The Global City. 2 ed. Princeton: University Press [1991]. Schelling, T.C., 1969. Models of segregation. American Economic Review, 59, 488–493. Schnell, I., & Yoav, B. (2001) The Socio-‐Spatial Isolation of Agents in Everyday Life Spaces as an Aspect of Segregation, Annals of the Association of American Geographers, 91.4, 622-‐633. Schnell, I., & Yoav, B. (2005) Globalisation and the structure of urban social space: the lesson from Tel Aviv. Urban Studies, 42. 13, 2489–2510. Selim, G. (2015) The landscape of differences: Contact and segregation in the everyday encounters. Cities, 46, 16-‐25. Ribeiro, S., Davis, C., Oliveira, D., Meira, W., Gonçalves, T., & Pappa, G. (2012) Traffic Observatory: A system to detect and locate traffic events and conditions using Twitter. In Proceedings of the Fifth International Workshop on Location-‐ Based Social Networks, Redondo Beach, California: 5–11. Sakaki, T., Okazaki, M., & Matsuo, Y. (2010) Earthquake shakes Twitter users: real-‐time event detection by social sensors. In Proceedings of the Nineteenth International Conference on the World Wide Web, Raleigh, North Carolina: 851–60. Steiger, E., Albuquerque, J.P. & Zipf, A. (2015) An Advanced Systematic Literature Review on Spatiotemporal Analyses of Twitter Data. Transactions in GIS (online version before publication). Simmel, G. (1997) Simmel on Culture. In D. Frisby and M. Featherston (eds.), Sage Publications, London. Steiger Takhteyev, Y., Gruzd, A., & Wellman, B. (2012) Geography of Twitter networks. Social Networks 34: 73–81. Tauber, K.T. & Tauber, A.F. (1964) The Negro as an Immigrant Group: Recent Trends in Racial and Ethnic Segregation in Chicago. American Journal of Sociology, 69.4, 374–382. Urry, J. (2002) Mobility and proximity. Sociology, 36.2, 255–274. Veloso, A. & Ferraz, F. (2011) Dengue surveillance based on a computational model of spatio-‐temporal locality of Twitter. In Proceedings of the Third International Conference on Web Science, Koblenz, Germany. Wasserman, S. & Faust, K. (1994) Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press,. Weber, A. (1909) Theory of the Location of Industries. Chicago: University of Chicago Press. Zielinski, A., & Middleton, S. (2013) Social media text mining and network analysis for decision support in natural crisis management. In Proceedings of the Tenth International Conference on Information Systems for Crisis Response and Management, Baden-‐Baden, Germany.
17