Geomatics Workbooks n° 12 – "FOSS4G Europe Como 2015"

How far do Dutch people live from attractive nature? An assessment using parallel computing with Python and FOSS4G libraries
Bob Dröge, Leon van der Meulen, Govert Schoof
Research and Innovation Support, Centre for Information Technology, University of Groningen

Abstract
How valuable is living near nature? Does nature have a positive effect on nearby residential property prices? How much are we willing to pay for nature in our living environment, and does this amount decay with distance to nature? Increasing urbanization and stress on natural landscapes make such questions increasingly important in spatial planning. However, quantifying the value of public green space is challenging, especially for large study areas, because of the high computing power required. In a recent conference paper by Daams et al. (2014), over 200,000 individual properties across the Netherlands were analyzed to give insight into Dutch people's willingness to pay for living near highly attractive public nature. Unlike existing studies of this kind, the analysis was not limited to the relation between property prices and the nearest nature, e.g. within 1 or 2 kilometers: effects from the quantity of attractive nature up to 10 kilometers away were also evaluated in the initial research process. That analysis required comprehensive and highly detailed spatial data, as the areas of all natural land use polygons, with many vertices per feature, needed to be summed for each of the 200,000 properties separately. The resources required to do so far exceeded those that a single computer, even with heavy specifications, could provide. In this paper we discuss our solution to the problem that Daams et al. (in prep.) encountered: parallel computing with Python and FOSS4G libraries. More specifically, we describe how we supported this project by specifying and applying several Python scripts and libraries, and running these on our high-performance cluster.

Keywords
Parallel computing, Nature, Property value, Big Data

1 Data
In the research by Daams et al. (in prep.), attractive natural spaces were designated with point data from the Hotspotmonitor (see also Sijtsma & Daams, 2014). The Hotspotmonitor (http://en.hotspotmonitor.eu) is an online survey in which over 13,000 people in the Netherlands, Germany, and Denmark have 'marked' attractive natural places and explained why (University of Groningen, 2014). However, these point data do not comprise information about the quantity of attractive natural space, so they were combined with land use data. Our main contribution to this data preparation was measuring the land use area of both attractive and non-attractive nature surrounding each of the 200,000 observed properties. As it was initially not clear over which distance attractive nature would affect property prices, the total area of (attractive) natural space was measured in distinct rings with different radii (1 km, 2 km, …, 10 km).

Computing the total area of natural space within multiple rings surrounding each property involves a huge number of calculations. To speed this up and run the computation efficiently on our high-performance computing cluster, which consists of over 3000 cores, we implemented a Python application that performs the computation in parallel. There are no dependencies between the computations for different properties, which makes this an embarrassingly parallel problem. Our implementation uses the Shapely module (based on the GEOS and JTS libraries) for the geometrical functionality and the Rtree module (based on libspatialindex) for the spatial indexing functionality. The following versions of software and libraries have been used:

• Python 2.7.4
• Rtree 0.7.0
• libspatialindex 1.8.1
• Shapely 1.2.18
• GEOS 3.4.2
• pyshp 1.2.0

The implementation of the Python application can be described in pseudocode as follows:

    Read the shapefile containing the natural spaces
    Read the shapefile containing the residential properties
    Build an R-tree based on the bounding boxes enclosing the natural spaces
    Start a pool of parallel worker threads
    For each property to be processed, let one worker thread do the following:
        Create a buffer of a given radius around the property
        Query the R-tree for natural spaces that possibly intersect with the buffer
        For each of these natural spaces:
            Calculate the area of the intersection with the buffer
            Add the calculated area to the sum for this particular property
    Write the calculated sums for all properties to a CSV file

This way of parallelizing the problem allows the program to use all the cores of a single machine efficiently. In order to scale beyond a single machine and make even better use of our high-performance computing facilities, we added extra input parameters to the application for specifying a range of properties to be processed. Each range is independent of all the others and can therefore be processed on a different machine, which decreases the runtime of the application even further.
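
As an illustration of the range parameters described above, the following minimal sketch splits a list of properties into contiguous index ranges, one per job. The parameter names --start and --end and the helper functions are hypothetical; the names used in the original application are not given here.

```python
import argparse


def split_ranges(n_items, n_jobs):
    """Split indices 0..n_items-1 into n_jobs contiguous half-open ranges."""
    base, extra = divmod(n_items, n_jobs)
    ranges = []
    start = 0
    for job in range(n_jobs):
        # The first `extra` jobs each take one additional item.
        end = start + base + (1 if job < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def parse_args(argv=None):
    """Parse the (hypothetical) range parameters for a single job."""
    parser = argparse.ArgumentParser(description="Process a range of properties")
    parser.add_argument("--start", type=int, default=0,
                        help="index of the first property to process")
    parser.add_argument("--end", type=int, default=None,
                        help="index one past the last property to process")
    return parser.parse_args(argv)


def select_properties(properties, start, end):
    """Return the slice of properties that this job is responsible for."""
    return properties[start:end]
```

With 200,000 properties and four jobs, split_ranges(200000, 4) gives four ranges of 50,000 properties each. Each job could then be submitted to a different machine with its own start and end values.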

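The per-property step of the pseudocode (buffer, R-tree query, summed intersection areas) might look roughly as follows with Shapely and Rtree. This is a sketch under assumptions, not the original application: the two toy polygons stand in for the natural spaces that would be read from a shapefile, the function names are hypothetical, and worker processes from the multiprocessing module are used where the pseudocode speaks of worker threads.

```python
from multiprocessing import Pool

from rtree import index
from shapely.geometry import Point, Polygon

# Toy stand-ins for the natural-space polygons (assumption: the real data
# would be read from a shapefile, e.g. with pyshp).
NATURE = [
    Polygon([(0, 0), (4, 0), (4, 4), (0, 4)]),
    Polygon([(100, 100), (110, 100), (110, 110), (100, 110)]),
]

# Build an R-tree on the bounding boxes enclosing the natural spaces.
TREE = index.Index()
for i, poly in enumerate(NATURE):
    TREE.insert(i, poly.bounds)


def nature_area(task):
    """Sum the natural area within `radius` of the property at (x, y)."""
    x, y, radius = task
    buf = Point(x, y).buffer(radius)
    total = 0.0
    # The R-tree yields candidates whose bounding boxes overlap the
    # bounding box of the buffer; the exact test is the intersection.
    for i in TREE.intersection(buf.bounds):
        total += NATURE[i].intersection(buf).area
    return total


def run_parallel(tasks, workers=2):
    """Evaluate nature_area for every task in a pool of worker processes."""
    pool = Pool(processes=workers)
    try:
        return pool.map(nature_area, tasks)
    finally:
        pool.close()
        pool.join()
```

Here run_parallel would be called with a list of (x, y, radius) tuples, ideally from a script guarded by `if __name__ == "__main__":` so that worker processes can import the module safely. The index is built at module level so that each worker has access to it without passing it between processes.
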
References

Daams, M.N., Sijtsma, F.J., Van der Vlist, A.J. (in prep.) The effect of natural space on property prices: accounting for perceived attractiveness. Presented at the 2nd Workshop on Non-Market Valuation, Aix-Marseille, France.

Sijtsma, F., & Daams, M. (2014). How near are urban inhabitants to appreciated natural areas? An exploration of Hotspotmonitor based wellbeing indicators. Results for the Netherlands, Germany, and Denmark: Report at the request of the OECD; supporting the How's life in your region? study. (348 ed.) Groningen: Urban and Regional Studies Institute / University of Groningen.

University of Groningen (15-10-2014) Unifocus 19: De Hotspotmonitor [Video file]. Retrieved from https://www.youtube.com/watch?v=0M9lMFQwrgg
