Building a Crowd-Sourcing Tool for the Validation of Urban Extent and Gridded Population

Steffen Fritz1, Linda See1, Ian McCallum1, Christian Schill1,2, Christoph Perger3, and Michael Obersteiner1

1 International Institute for Applied Systems Analysis (IIASA), Schlossplatz 1, A-2361 Laxenburg, Austria
2 University of Freiburg, Tennenbacherstr. 4, Freiburg, Germany
3 Fachhochschule Wiener Neustadt, Johannes Gutenberg-Strasse 3, A-2700 Wiener Neustadt, Austria
{fritz,see,mccallum,obersteiner}@iiasa.ac.at,
[email protected],
[email protected]
Abstract. This paper provides an overview of the crowd-sourcing tool Geo-Wiki, which is used to collect in-situ land cover validation data from the public. The tool is now being modularized to allow for domain-specific land cover validation. Agriculture and biomass versions of Geo-Wiki are already operational. The next module, urban.geo-wiki.org, is aimed at the validation of urban extent and gridded population data. The aim of this paper is to outline the structure of this module and the datasets that will aid in the validation of this land cover type and of gridded population data. The ultimate aim of Geo-Wiki is to produce a hybrid land cover map that is better than existing products, including estimates of urban extent.

Keywords: Crowd-sourcing, urban areas, land cover validation, Google Earth.
B. Murgante et al. (Eds.): ICCSA 2011, Part II, LNCS 6783, pp. 39–50, 2011. © Springer-Verlag Berlin Heidelberg 2011

1 Introduction

Cities are one of the greatest challenges of the modern age, where the rapid urbanization of the last century has led to many problems such as pollution, climate change, food security, global health issues and increases in total water demand [1]. More than half of the world’s population currently lives in cities, but by 2050, this will increase to 80% as continued movement to cities takes place in China, India, Africa and Asia [2]. Urban areas are captured in global land cover datasets such as the GLC-2000 [3], MODIS [4] and GlobCover [5], which have been created as baseline terrestrial products in the last decade. These products are used in a number of applications such as global and regional land use modeling and land use change [6, 7, 8]. The problem with global land cover products has been highlighted in a number of comparison studies, which have revealed significant amounts of spatial disagreement in land cover types [9, 10, 11, 12], with a particular focus on the forest and cropland domains. The need for a more accurate global map of urban extent has already
been highlighted in [13] in the development of a MODIS urban extent product and gridded population. Information on urban extent, in particular changes in urbanization, and population forecasts are necessary for the estimation of future carbon emissions [14, 15]. They are also important in trying to define world urban areas and population [16]. However, future projections of urbanization and population will only be as good as the current input data, yet there are obvious discrepancies and uncertainties in the global datasets used to create these inputs.

One reason why the various global land cover maps disagree is the lack of sufficient ground-truth or in-situ data for the calibration and validation of these maps. Google Earth and Google Earth Engine provide ideal mechanisms for improving the validation of global land cover, including urban extent. By superimposing urban land cover onto Google Earth, the accuracy of urban extent from any land cover product can be assessed. Other datasets such as gridded population can also be superimposed on Google Earth and validated in the same way. However, the task of validation is enormous and is therefore an ideal candidate for crowd-sourcing, in particular because Google Earth is freely available to the public via the internet.

Crowd-sourcing involves the public provision of information to different websites on the internet [17], made possible by advances in Web 2.0 technology and an active and willing public. Examples of successful crowd-sourcing applications include the eBird project [18], which contains more than 48 million bird sightings from the public; Galaxy Zoo [19], where the public classify galaxies and work with scientists in making new discoveries; and FoldIt, a serious game in which the public creates protein structures that may lead to the treatment of a disease [20].
There are also many successful examples of spatial crowd-sourcing applications, which provide volunteered geographic information (VGI) [21]. OpenStreetMap (openstreetmap.org) is a street map wiki in which members of the public can edit and update street information for any location on the Earth’s land surface. Other examples of VGI include Wikimapia (wikimapia.org), the Degree Confluence project (confluence.org), MapAction (mapaction.org) and the European Environment Agency’s ‘Eye on Earth’, which involves the wider public in monitoring air and water pollution in the environment.

Google Earth and crowd-sourcing have been combined in the Geo-Wiki application, developed in [22], as a way of increasing the database of in-situ calibration and validation points for improving land cover. Geo-Wiki also represents one initiative in the growing trend of ‘community remote sensing’, in which remote sensing data are combined with local information to directly engage and empower communities in understanding and arguing environmental issues [23]. Geo-Wiki could be used by the public for a range of environmental applications. However, from a land cover perspective, the ultimate goal of Geo-Wiki is to use the crowd-sourced calibration/validation database to create a hybrid land cover product that is better than any of the individual products that are currently available.

The aim of this paper is to provide an overview of the Geo-Wiki application and the latest module to be developed: urban.geo-wiki.org, which covers validation of urban areas and gridded population. The need for such a module becomes evident through an examination of the disagreement in urban extent based on the GLC-2000, MODIS and GlobCover. Plans for how Geo-Wiki will be developed in the short to long term are also presented.
2 Examining Disagreement in Urban Land Cover

Urban land cover is represented by a single class in the GLC-2000, MODIS and GlobCover products. Unlike other classes such as different forest types or agriculture, the definition of urban land cover is relatively consistent between these global products, i.e. urban or built-up areas consist of artificial surfaces, with MODIS additionally specifying an urban area of greater than 50% [4]. Based on the GLC-2000, urban areas represent less than 0.5% of the Earth’s land surface [3], while the GRUMP mapping project estimates urban areas at roughly 3% of the Earth’s land surface [24]. Despite covering such a small area, urban areas represent one of the most difficult land cover types to classify correctly. One reason for this is that urban areas create many mosaics because the resolution of current land cover products is still quite coarse. There are difficulties in classification even when using Landsat TM images, and ancillary data are needed [25]. Figure 1 illustrates a section of Johannesburg using high resolution imagery on Google Earth, which is difficult to classify because of the large amount of green space.
Fig. 1. A section of Johannesburg on Google Earth (in Geo-Wiki)
Figure 2 shows the estimates of total urban area based on 8 different products [13]. Maps highlighting the spatial disagreement in the urban class were created for three pairs of products: the GLC-2000 and MODIS, the GLC-2000 and GlobCover, and MODIS and GlobCover. The percentage agreement and disagreement for each pair of land cover products is listed in Table 1. Focusing specifically on global land cover data sets, the MODIS products estimate urban areas at about 50% greater than the GLC-2000 and GlobCover. Table 1 shows that the percentage of pixels on which each pair of land cover products agrees is less than 30%. The table also highlights the overestimation by MODIS.
Table 1. Disagreement of urban areas between pairs of land cover products

Pairs of Land Cover Datasets    Matching between Land Cover Datasets         % of Total
GLC-2000 compared to MODIS      Agreement between GLC-2000 and MODIS         21.7
                                Present in GLC-2000 and not in MODIS         17.1
                                Present in MODIS and absent in GLC-2000      61.3
GLC-2000 compared to GlobCover  Agreement between GLC-2000 and GlobCover     29.6
                                Present in GLC-2000 and not in GlobCover     31.5
                                Present in GlobCover and absent in GLC-2000  38.8
MODIS compared to GlobCover     Agreement between MODIS and GlobCover        21.4
                                Present in MODIS and not in GlobCover        58.3
                                Present in GlobCover and absent in MODIS     20.3

One explanation for these differences lies in the methods that were used to classify these maps. The GLC-2000 used night time luminosity [3] set at a relatively high threshold in combination with expert knowledge to classify urban areas. GlobCover used GLC-2000 as one input in the creation of this product, so the total area is relatively consistent. However, there is still a large degree of spatial disagreement between the two products, as evidenced by the percentage of disagreement in Table 1. MODIS v.5, on the other hand, uses an automated classification algorithm trained using calibration data [4].
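The pairwise figures in Table 1 can be illustrated with a minimal Python sketch. It assumes that the percentages are taken over all pixels flagged as urban by at least one of the two products, which is consistent with each group of three rows in Table 1 summing to roughly 100%, but which is not stated explicitly in the text. The function name and the toy masks are hypothetical and are not taken from the Geo-Wiki code base.

```python
def urban_agreement(mask_a, mask_b):
    """Percentage breakdown over pixels that are urban in at least one product.

    mask_a and mask_b are equal-length sequences of 0/1 flags (1 = urban),
    one flag per co-registered pixel. This layout is an assumption made for
    illustration only.
    """
    both = only_a = only_b = 0
    for a, b in zip(mask_a, mask_b):
        if a and b:
            both += 1
        elif a:
            only_a += 1
        elif b:
            only_b += 1
    total = both + only_a + only_b
    if total == 0:
        raise ValueError("neither product contains urban pixels")
    return {
        "agreement": 100.0 * both / total,
        "only_a": 100.0 * only_a / total,
        "only_b": 100.0 * only_b / total,
    }


# Toy masks (hypothetical values, not real GLC-2000 or MODIS data).
glc = [1, 1, 0, 0, 1, 0, 0, 0]
modis = [1, 0, 1, 1, 1, 1, 0, 0]
stats = urban_agreement(glc, modis)
print(stats)
```

Pixels that neither product labels as urban are ignored, so the three values always sum to 100.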
[Bar chart; vertical axis: Urban area [1000 km2], 0 to 4000]
Fig. 2. The size of the Earth’s urban extent based on different products [13]
Figure 3 shows an example of the disagreement map between the GLC-2000 and MODIS for the city of London in the UK and the surrounding areas. The green areas show agreement, whilst the red and orange areas show disagreement, i.e. where urban areas are present in MODIS but not in GLC-2000 and vice versa, respectively. The cores of large cities show agreement, but the fringes and the smaller conurbations and towns are areas of clear disagreement. The black circle marks one such area of disagreement, the city of Milton Keynes. Figures 4 and 5 show Milton Keynes according to MODIS and GLC-2000.
Fig. 3. Urban disagreement between GLC-2000 and MODIS showing London and Milton Keynes (in the black circle)
Fig. 4. Milton Keynes shown using MODIS data
The MODIS land cover data set classifies a larger area as urban than the GLC-2000, which shows most of Milton Keynes as cropland. A comparison of the images clearly highlights how much disagreement there is, especially when viewed spatially. High resolution satellite imagery on Google Earth would be ideal for examining the validity of the urban extent as specified in both land cover maps. Spatial patterns of disagreement like that illustrated above can be found throughout the world.
Fig. 5. Milton Keynes shown using GLC-2000 data
3 Overview of the Geo-Wiki Application

Geo-Wiki was developed as part of a European-funded project called GeoBene, which was aimed at determining the benefits of Earth Observation. The project involved running global economic land use models, and uncertainties in the land cover data motivated the development of a tool to improve this input dataset. The main Geo-Wiki application can be found at geo-wiki.org, which is shown in Figure 6. The home page indicates who the top five current validators are and how many areas these individuals have validated to date. The system can be accessed as a guest or by registration, which ensures that validations are stored in the database by user or contributor.

Geo-Wiki runs on an Apache server with Gentoo Linux and is written in PHP and JavaScript. The components that comprise Geo-Wiki are: the Google Earth API, the Minnesota MapServer, a PostgreSQL database with a PostGIS extension containing a vector version of the pixels of each land cover product and the crowd-sourced validations, and the Geospatial Data Abstraction Library (GDAL). Geo-Wiki conforms to Open Geospatial Consortium (OGC) standards and the INSPIRE guidelines as much as possible.

To validate a place on the Earth’s land surface, the user zooms in until it is possible to see sufficient detail. Figure 7 shows an example of validating a portion of Milton Keynes. The red dot indicates the point where the user chose to undertake the validation. Three pixels then appear on the screen, which correspond to GLC-2000, MODIS and GlobCover, along with the legend class of each pixel. At this stage the user indicates how well the legend classes capture what they see on Google Earth. In this example, a volunteer could correct the GLC-2000 class from ‘Cultivated and managed areas’ to ‘Artificial surfaces and associated areas’, i.e. the class for urban extent. Both MODIS and GlobCover are correct at this location.
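As a rough illustration of the validation step, the following Python sketch computes the footprint of the pixel that contains a chosen point for each of the three products. It is not the Geo-Wiki implementation, which stores vector versions of the pixels in PostGIS. The cell sizes are nominal values assumed here for illustration, the grid is assumed to be a regular latitude/longitude grid anchored at 180°W, 90°N, and all names are hypothetical.

```python
import math

# Nominal cell sizes in degrees. These are illustrative assumptions,
# not values given in the paper.
CELL_SIZE_DEG = {
    "GLC-2000": 1.0 / 112.0,
    "MODIS": 1.0 / 240.0,
    "GlobCover": 1.0 / 360.0,
}


def pixel_bounds(lon, lat, cell):
    """Return (west, south, east, north) of the grid cell containing a point.

    Assumes a regular lat/long grid whose upper-left corner is at
    lon = -180, lat = 90.
    """
    col = math.floor((lon + 180.0) / cell)
    row = math.floor((90.0 - lat) / cell)
    west = -180.0 + col * cell
    north = 90.0 - row * cell
    return (west, north - cell, west + cell, north)


def footprints(lon, lat):
    """Pixel footprint of every product at the chosen validation point."""
    return {name: pixel_bounds(lon, lat, cell)
            for name, cell in CELL_SIZE_DEG.items()}


# A point in the Milton Keynes area (coordinates are approximate).
fp = footprints(-0.76, 52.04)
for name, box in fp.items():
    print(name, box)
```

Because the three products have different cell sizes, the three footprints around the same point differ in extent, which is why three separate pixels are shown to the user.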
Fig. 6. The Geo-Wiki land cover validation crowd-sourcing tool
Fig. 7. Validating a pixel of the GLC-2000 (light blue), MODIS (dark blue) and GlobCover (red) using Google Earth in a section of Milton Keynes
A disagreement map can be displayed on top of Google Earth to show those hotspots where the global land cover maps disagree the most. These disagreement maps can be used to guide the selection of areas for validation or the user can randomly
validate any area on the Earth’s land surface. Other Geo-Wiki features include: (i) the ability to display confluence points (lat/long intersections) so that validation can be undertaken systematically, together with a direct link to the information held on each intersection by the Degree Confluence project; (ii) the display of geo-tagged photos from existing projects, websites, Panoramio or uploaded by users; (iii) the display of forest and agricultural statistics by country; and (iv) the Normalized Difference Vegetation Index (NDVI) profile for 1 year averaged over a 5 year period at any point on the Earth’s surface. This latter feature can be used to differentiate between evergreen and deciduous forest or between vegetated and non-vegetated surfaces.
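The NDVI profile in feature (iv) can be sketched in Python as follows. NDVI is the standard ratio (NIR - Red) / (NIR + Red); the averaging of five one-year profiles follows the description above. The use of the seasonal amplitude as an indicator of deciduous versus evergreen cover is an assumption made for illustration, and the sample values and function names are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    if nir + red == 0:
        return 0.0
    return (nir - red) / (nir + red)


def mean_annual_profile(yearly_profiles):
    """Average several one-year NDVI profiles time step by time step."""
    n_steps = len(yearly_profiles[0])
    if any(len(p) != n_steps for p in yearly_profiles):
        raise ValueError("all yearly profiles must have the same length")
    n_years = len(yearly_profiles)
    return [sum(p[i] for p in yearly_profiles) / n_years
            for i in range(n_steps)]


def seasonal_amplitude(profile):
    """Difference between the highest and lowest NDVI in the profile."""
    return max(profile) - min(profile)


# Five hypothetical years with four time steps each (made-up values).
years = [
    [0.20, 0.60, 0.80, 0.30],
    [0.25, 0.55, 0.85, 0.35],
    [0.20, 0.65, 0.80, 0.30],
    [0.15, 0.60, 0.75, 0.25],
    [0.20, 0.60, 0.80, 0.30],
]
profile = mean_annual_profile(years)
print(profile, seasonal_amplitude(profile))
```

A profile with a large seasonal amplitude would be more typical of deciduous vegetation, while a flat, high profile would suggest evergreen cover and a flat, low profile a non-vegetated surface.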
4 Building an Urban and Population Geo-Wiki

An urban and population Geo-Wiki is currently being developed, which will be demonstrated at the conference. Users will be able to view a series of different layers on top of Google Earth, which are listed in Table 2.

Table 2. List of spatial layers in Urban and Population Geo-Wiki

Layer                                                         Source
GLC-2000 global land cover                                    [3]
MODIS v.5 global land cover                                   [4]
GlobCover global land cover                                   [5]
CORINE land cover 2000                                        [26]
Disagreement layers between pairs of the above land cover     Calculated
  products, e.g. disagreement between GLC-2000 and MODIS
VMAP                                                          [27]
HYDE                                                          [28]
IMPSA (Global Impervious Surface Area)                        [29]
GRUMP                                                         [24]
LandScan global population distribution                       [30]
Nighttime Luminosity                                          [31]
Anthropogenic Biomes                                          [32]
OpenStreetMap                                                 [33]
Soil Sealing                                                  [34]
The application will have the same look and feel as geo-wiki.org but will be customized in terms of the layers that are displayed and the way that users undertake the validation. Instead of indicating whether the land cover products are correct in terms of displaying urban areas, users will be asked to indicate the percentage of urban area that they see in the pixels. The urban disagreement layers between GLC-2000, MODIS and GlobCover will allow users to focus on hotspots of disagreement, where further validation is needed. The other layers can be used as further evidence in the validation, e.g. the presence of night time luminosity will be an indication of the degree of urbanization in a pixel. In terms of validating the gridded population data, the user will simply be asked to indicate whether or not there is evidence of human settlement.
5 Further Developments

The first version of urban.geo-wiki.org will be operational within the next two months. In addition to this recent modularization of Geo-Wiki, which has allowed for the validation of specific land cover types (e.g. cropland, biomass and urban areas), there are several ongoing developments to improve Geo-Wiki. The first involves finding innovative ways of increasing the overall volume and coverage of data crowd-sourced through this application. This applies not only to the urban domain but to all land cover classes. There is currently little incentive for citizens to validate global land cover, whether this involves forests in the Amazon or the urban extent of the city of London. One method of providing this incentive is to develop one or more game applications that encourage users to participate while simultaneously providing land cover validation information. LandSpotting, a project funded by the Austrian Research Funding Agency (FFG) that began in Feb 2011, addresses the creation and implementation of such a game. Game designers at the Vienna Technical University are currently working on the development of this application. Options that are being considered are social networking games (e.g. FarmVille, FrontierVille, Carcassonne), serious social games (e.g. JobHop, Brain Buddies, Serious Beats) and smartphone applications that utilize a GPS. A prototype of the game will be available at the end of 2011.

Another feature, which will be added in the next two months, will be the option for users to define a grid of validation points. At the moment users can choose any random point on the Earth’s surface or validate a confluence point (i.e. an intersection of integer degrees of latitude and longitude). However, users might want to define the extent of their own grid across an urban landscape and then systematically validate these points. After validation, they will be able to download these sample validations.
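A user-defined grid of validation points could be generated along the following lines. This is a minimal Python sketch under stated assumptions: the grid is regular in latitude and longitude, it is defined by a bounding box and a spacing in degrees, and the function name, rounding and example extent are all hypothetical rather than part of the planned Geo-Wiki feature.

```python
def validation_grid(west, south, east, north, spacing):
    """Regular grid of (lon, lat) points covering a bounding box.

    The spacing is given in degrees. Points start at the south-west corner
    and run row by row towards the north-east corner.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    if east < west or north < south:
        raise ValueError("bounding box is inverted")
    # A small tolerance guards against floating-point error in the division.
    n_cols = int((east - west) / spacing + 1e-9) + 1
    n_rows = int((north - south) / spacing + 1e-9) + 1
    return [
        (round(west + col * spacing, 6), round(south + row * spacing, 6))
        for row in range(n_rows)
        for col in range(n_cols)
    ]


# Hypothetical extent around Milton Keynes with a 0.02 degree spacing.
points = validation_grid(-0.80, 52.00, -0.70, 52.06, 0.02)
print(len(points), points[0], points[-1])
```

With a spacing of 1 degree and a bounding box aligned to whole degrees, the same function yields the confluence points that are already available in Geo-Wiki.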
A third major ongoing development is the ability for users to define their own legends across all Geo-Wiki modules. Users will need to specify how this legend maps onto an existing legend in order to have a consistent calibration and validation data set. These legends can also be expressed in the language of the user. This user-defined legend could be used to validate the current global land cover products, the urban products that will be included in urban.geo-wiki.org or maps supplied by the user. For example, a user might have an urban extent map for their country and might wish to validate this map using the services that will be offered in the family of Geo-Wiki modules in the future. This service would provide validation data to Geo-Wiki while simultaneously offering the community of users a tangible return for their inputs.

Another interesting addition to Geo-Wiki will be a time series of Landsat images. The Thematic Mapper on the Landsat 5 satellite has been operational since 1984 [35] and has been consistently collecting data on the Earth’s surface. This archive, which has a resolution of 30 m, will be incorporated into Geo-Wiki, which opens up new possibilities for examining temporal changes in urban land cover. This archive will be added in the next year.
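The mapping of a user-defined legend onto an existing legend can be sketched as a simple lookup in Python. The two reference class names are the GLC-2000 classes mentioned earlier in this paper; the user class names, the data layout and the function names are hypothetical and only illustrate the idea of keeping the calibration and validation data set consistent.

```python
# Two GLC-2000 classes named in this paper; a real legend would list all classes.
REFERENCE_CLASSES = {
    "Artificial surfaces and associated areas",
    "Cultivated and managed areas",
}

# Hypothetical user-defined legend and its mapping onto the reference legend.
user_legend = {
    "Dense housing": "Artificial surfaces and associated areas",
    "Industrial estate": "Artificial surfaces and associated areas",
    "Arable field": "Cultivated and managed areas",
}


def check_legend(legend, reference_classes):
    """Raise an error if a user class maps onto an unknown reference class."""
    unmapped = sorted(name for name, target in legend.items()
                      if target not in reference_classes)
    if unmapped:
        raise ValueError("classes without a valid mapping: " + ", ".join(unmapped))


def to_reference(validations, legend):
    """Translate (lon, lat, user_class) records into reference classes."""
    return [(lon, lat, legend[label]) for lon, lat, label in validations]


check_legend(user_legend, REFERENCE_CLASSES)
records = to_reference([(-0.76, 52.04, "Dense housing")], user_legend)
print(records)
```

Checking the mapping before any validations are stored means that every crowd-sourced record can be expressed in the reference legend, whatever legend or language the user works in.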
6 Conclusions

This paper has outlined the need to collect in-situ validation data to improve existing layers of urban extent and gridded population. A variant system based on the Geo-Wiki crowd-sourcing tool has been proposed, along with ongoing and planned developments. The system will be operational shortly and the first validation points will be crowd-sourced during the summer of 2011. The development of an improved urban extent and gridded population layer will be one of the main products of urban.geo-wiki.org, and these products and the algorithms for creating these layers will be the subject of a future paper.

Acknowledgements. This research was supported by the European Community’s Framework Programme via the Project EuroGEOSS (No. 226487) and by the Austrian Research Funding Agency (FFG) via the Project LandSpotting (No. 828332).
References

1. Bettencourt, L., West, G.: A unified theory of urban living. Nature 467, 912–913 (2010)
2. UN-Habitat: State of the World’s Cities 2010/2011 – Cities for All: Bridging the Urban Divide (2010), http://www.unhabitat.org/pmss/listItemDetails.aspx?publicationID=2917
3. Fritz, S., Bartholomé, E., Belward, A., Hartley, A., Stibig, H.J., Eva, H., Mayaux, P., Bartalev, S., Latifovic, R., Kolmert, S., Roy, P., Agrawal, S., Bingfang, W., Wenting, X., Ledwith, M., Pekel, F.J., Giri, C., Mücher, S., de Badts, E., Tateishi, R., Champeaux, J.-L., Defourny, P.: Harmonisation, mosaicing and production of the Global Land Cover 2000 database (Beta Version), 41 p. Office for Official Publications of the European Communities EUR 20849 EN, Luxembourg (2003) ISBN 92-894-6332-5
4. Friedl, M.A., McIver, D.K., Hodges, J.C.F., Zhang, X.Y., Muchoney, D., Strahler, A.H., Woodcock, C.E., Gopal, S., Schneider, A., Cooper, A., Baccini, A., Gao, F., Schaaf, C.: Global land cover mapping from MODIS: algorithms and early results. Remote Sensing of Environment 83, 287–302 (2002)
5. Bicheron, P., Defourny, P., Brockman, C., Schouten, L., Vancutsem, C., Huc, M., Bontemps, S., Leroy, M., Achard, F., Herold, M., Ranera, F., Arino, O.: GLOBCOVER (2008), http://ionia1.esrin.esa.int/docs/GLOBCOVER_Products_Description_Validation_Report_I2.1.pdf
6. Foley, J.A., DeFries, R., Asner, G.P., Barford, C., Bonan, G., Carpenter, S.R., Chapin, F.S., Coe, M.T., Daily, G.C., Gibbs, H.K., Helkowski, J.H., Holloway, T., Howard, E.A., Kucharik, C.J., Monfreda, C., Patz, J.A., Prentice, I.C., Ramankutty, N., Snyder, P.K.: Global consequences of land use. Science 309(5734), 570–574 (2005)
7. Verburg, P.H., Neumann, K., Nol, L.: Challenges in using land use and land cover data for global change studies. Global Change Biology (2010), doi: 10.1111/j.1365-2486.2010.02307.x
8. Liu, M.L., Tian, H.Q.: China’s land cover and land use change from 1700 to 2005: Estimations from high-resolution satellite data and historical archives. Global Biogeochemical Cycles 24, GB3003 (2010)
9. Fritz, S., You, L., Bun, A., See, L.M., McCallum, I., Liu, J., Hansen, M., Obersteiner, M.: Cropland for Sub-Saharan Africa: A synergistic approach using five land cover datasets. Geophysical Research Letters (in press, 2011a), doi: 10.1029/2010GL046231
10. Fritz, S., See, L., McCallum, I., Schill, C., Obersteiner, M., Boettcher, H., Achard, F.: Highlighting continued uncertainty in global land cover maps. Submitted to Geophysical Research Letters (2011b)
11. Fritz, S., See, L.M., Rembold, F.: Comparison of global and regional land cover maps with statistical information for the agricultural domain in Africa. International Journal of Remote Sensing 25(7-8), 1527–1532 (2010)
12. Fritz, S., See, L.: Quantifying uncertainty and spatial disagreement in the comparison of Global Land Cover for different applications. Global Change Biology 14, 1–23 (2008)
13. Schneider, A., Friedl, M.A., Potere, D.: A new map of global urban extent from MODIS satellite data. Environmental Research Letters 4, 044003 (2009), doi: 10.1088/1748-9326/4/4/044003
14. O’Neill, B.C., Dalton, M., Fuchs, R., Jiang, L., Pachauri, S., Zigova, K.: Global demographic trends and future carbon emissions. Proc. Natl. Acad. Sci. USA 107(41), 17521–17526 (2010)
15. Gaffin, S.R., Rosenzweig, C.R., Xing, X., Yetman, G.: Downscaling and Geo-spatial Gridding of Socio-Economic Projections from the IPCC Special Report on Emissions Scenarios (SRES). Global Environmental Change 14(2), 105–123 (2004)
16. Demographia: Demographia World Urban Areas: Population & Projections, 6.1. edn. Demographia, Belleville, Illinois (2010)
17. Howe, J.: Crowdsourcing: Why the power of the crowd is driving the future of business. Crown Business, New York (2008)
18. Marris, E.: Birds flock online. Nature (2010), doi: 10.1038/news.2010.395
19. Timmer, J.: Galaxy Zoo shows how well crowdsourced citizen science works (2010), http://arstechnica.com/science/news/2010/10/galaxy-zoo-shows-how-well-crowdsourced-citizen-science-works.ars
20. Cooper, S., Khatib, F., Treuille, A., Barbero, J., Lee, J., Beenen, M., Leaver-Fay, A., Baker, D., Popovic, Z.: Foldit Players: Predicting protein structures with a multiplayer online game. Nature 466(7307), 756–760 (2010)
21. Goodchild, M.F.: Citizens as sensors: The world of volunteered geography. GeoJournal 69, 211–221 (2007)
22. Fritz, S., McCallum, I., Schill, C., Perger, C., Grillmayer, R., Achard, F., Kraxner, F., Obersteiner, M.: Geo-Wiki.Org: The use of crowd-sourcing to improve global land cover. Remote Sensing 1(3), 345–354 (2009)
23. Williamson, R.A.: Community remote sensing and remote sensing capacity building and the United Nations (Updated) (2010), http://nextnowcollab.wordpress.com/2010/01/06/remote-sensing-capacity-building-and-the-united-nations/
24. CIESIN (Center for International Earth Science Information Network): Global Rural-Urban Mapping Project (GRUMP): Urban Extents (2004), http://sedac.ciesin.columbia.edu/gpw
25. Yuan, F., Bauer, M.E., Heinert, N.J., Holden, G.R.: Multi-level Land Cover Mapping of the Twin Cities (Minnesota) Metropolitan Area with Multi-seasonal Landsat TM/ETM+ Data. GeoCarto International 20(2), 5–14 (2005)
26. European Environment Agency: Corine Land Cover 2000 seamless vector data – version 13 (02/2010) (2010), http://www.eea.europa.eu/data-and-maps/data/corine-land-cover-2000-clc2000-seamless-vector-database-2
27. Danko, D.: The digital chart of the world project. Photogramm. Eng. Remote Sensing 58, 1125–1128 (1992)
28. Goldewijk, K.: Three centuries of global population growth: a spatially referenced population density database for 1700–2000. Populat. Environ. 26, 343–367 (2005)
29. Elvidge, C., Tuttle, B., Sutton, P., Baugh, K., Howard, A., Milesi, C., Bhaduri, B., Nemani, R.: Global distribution and density of constructed impervious surfaces. Sensors 7, 1962–1979 (2007)
30. Bhaduri, B., Bright, E., Coleman, P., Dobson, J.: LandScan: locating people is what matters. Geoinformatics 5, 34–37 (2002)
31. Elvidge, C., Imhoff, M., Baugh, K., Hobson, V., Nelson, I., Safran, J., Dietz, J., Tuttle, B.: Nighttime lights of the world: 1994–95. ISPRS J. Photogramm. Remote Sens. 56, 81–99 (2001)
32. Ellis, E.C., Ramankutty, N.: Putting people in the map: anthropogenic biomes of the world. Frontiers in Ecology and the Environment 6 (2008), doi: 10.1890/070062
33. OpenStreetMap, http://www.openstreetmap.org/
34. European Environment Agency: EEA-FTSP-Sealing-Enhancement Delivery Report – European Mosaic, Issue 1.0 (2009)
35. Hansen, K.: Earth-Observing Landsat 5 Turns 25 (2009), http://www.nasa.gov/topics/earth/features/landsat_bday.html