Medial-Axes-based Cartograms
Daniel A. Keim1 Stephen C. North2 Christian Panse3
Correspondence: Prof. Dr. Daniel A. Keim Computer Science Institute Universität Konstanz Fach D78 Universitätsstr. 10 D-78457 Konstanz Phone: (+49) 7531 88 3161 Fax: (+49) 7531 88 3062
Short running title: Medial-Axes-based Cartograms
Acknowledgements: no acknowledgements
1 University of Constance, Germany – [email protected]
2 AT&T Shannon Laboratory, Florham Park, NJ, USA – [email protected]
3 University of Constance, Germany – [email protected]
Medial-Axes-based Cartograms Daniel A. Keim
Stephen C. North
University of Constance, Germany
AT&T Shannon Laboratory, Florham Park, NJ, USA
Christian Panse University of Constance, Germany
[email protected], {keim, panse}@informatik.uni-konstanz.de
Abstract
Cartograms are a well-known technique for showing geography-related statistical information, such as population demographics and epidemiological data. The basic idea is to distort a map by resizing its regions according to a statistical parameter, but in a way that keeps the map recognizable. In this paper, we deal with the continuous cartogram problem which strictly retains the topology of the polygon mesh. We develop an algorithm to solve the problem which uses an iterative relocation of the vertices based on a modified medial axes transformation of the polygon mesh. Experiments using real data sets show that our algorithm is capable of producing high-quality cartograms in interactive time even for very large polygon meshes. A number of application examples show the high potential of our algorithm.
1 Introduction
Cartographers and geographers have been making cartograms long before digital computer displays were available to help with the problem. One of the oldest known cartograms was made about 300 C.E. to depict the extent of the Roman Empire for travelers (see SIDEBAR on the History of Cartograms on next page). The basic idea of a cartogram is to distort a map by resizing its regions according to some external geography-related parameter. Typical applications include display of population demographics, election results and epidemiological statistics. Cartograms are difficult to make by hand because of the difficulty of simultaneously optimizing shape and area error, while preserving the original map’s topology. So the study of automated methods for drawing cartograms has received considerable interest. A cartogram can be thought of as a generalization of an ordinary map. In this interpretation, an arbitrary parameter vector gives the desired sizes of the cartogram regions, so an ordinary map is simply a cartogram with sizes proportional to land area. Beyond the classical applications already mentioned, a key motivation for studying computer cartograms is to define a general framework for trading off shape and area adjustments in information visualization. For example in a conventional choropleth map, where a parameter vector value is encoded in the color of the regions, high values are often concentrated in densely populated areas, and low values spread out over sparsely populated areas. Such maps, therefore, tend to highlight patterns in the areas where few people live. In contrast, cartograms adjust areas in proportion to an additional parameter such as population. Patterns may then be displayed in proportion to the number of people involved instead of the raw
size of the area involved. The pictures in Figure 3, which are all population-based cartograms, show that a cartogram can give a very different overall impression than the original map. For a cartogram to be effective, a human must be able to quickly relate the displayed data to the original map. Intuitive recognition depends on preserving basic properties such as shape, orientation, and contiguity. This, however, is difficult to achieve, because in general it is impossible even to retain the original map’s topology. Even if we permit some error in shape and area in the cartogram, what remains is still a hard simultaneous optimization problem, for which known algorithms are very time-consuming.
Further, one of the ultimate goals of this work is to continuously display the behavior of an input parameter over time. For example, we would like to display network usage or transaction event levels, or their deviation from an expected value, by country, state and local region. In an on-line setting, this requires cartogram generation on the fly, and to our knowledge existing algorithms are not fast enough (see previous work in section 1).
In the following, we sketch a brief history of cartograms (cf. the historical remarks in Borden D. Dent: Cartography: Thematic Map Design, 4th Ed., Chapter 10 (1996)).
SIDEBAR: History of Cartograms
• 330 – Tabula Peutingeriana (Roman Empire) (see Figure 1(a)) Although Greek contributions to the theory of cartography based on geometry overshadowed those of Rome, the expansion of the Roman empire fostered interest in practical cartography and surveying. The Tabula Peutingeriana is an ancient Roman map, preserved in a copy from 1598, and apparently the oldest cartogram. It is a strip map of the highways of the empire formatted to fit on a 1 × 21 foot scroll. It shows 70,000 miles of roads and points of interest for travelers. Interesting distortion techniques were devised by the map’s unknown creators. For example, because the Nile River flows northward, it could not be drawn as a long river in its normal orientation, so it is shown flowing eastward, and its delta is drawn in an enlarged area.
• 600 – T-in-O maps (see Figure 1(c)) (Orbis Terrarum) were popular diagrammatic maps in the Middle Ages. The known continents (Europe, Asia and Africa) were distorted to fit inside a circle (an “O”) representing the edge of the world, and separated by the Mediterranean, Red Sea and Black Sea (the “T”). Frequently, Jerusalem was placed in the exact center. These maps were first published by the Spanish archbishop and cartographer San Isidoro (about 560-636) in his encyclopaedic work Etymologies.
• 1844 – Levasseur The mid-19th century saw an explosion of progress in statistical graphics. Scientific and demographic data were becoming available from new sources and were coupled with new techniques such as histograms and choropleth maps. As early as 1851, Charles Joseph Minard (creator of a famous thematic map of Napoleon’s 1812 march) combined geographic maps with statistical displays of circles of varying size. French statistician Emile Levasseur was not only one of the last cartographers to specialize in ornate, highly decorative maps with extensive annotations, but was also a pioneer in the use of maps for statistical displays.
He is credited with introducing such maps in an 1868 French school textbook (H. Gray Funkhouser: Historical Development of the Graphical Representation of Statistical Data (1937)) and advocated the bi-polar red/blue scale employed in some choropleth maps in this paper.
• 1948 – Raisz (Erwin Raisz: Principles of Cartography (1962)) Erwin Raisz, a cartographer noted for his much-admired landform maps and one of the last pen-and-ink cartographers, also invented a technique for making rectangular statistical cartograms. His 1938 textbook describes a manual divide-and-conquer technique for drawing them on a predefined grid. He also described a way of making cartograms based on distances measured from a single point on the map. He wrote, “Value by area cartograms are important. Our socioeconomic overview of the world will be more realistic if we think of the relative importance of its parts in the proportions of a population cartogram rather than in the proportions of a map.”
• 1961 – Tobler Waldo Tobler was apparently the first to study cartograms from the formal standpoint of conformal transformations of the plane, beginning with his landmark 1961 University of Washington thesis. His later works included practical examples and applications in the analysis of population demographics using cartograms made on computer plotters. (His thesis also contains a copy of Daniel K. Wallingford’s 1932 map “A New Yorker’s Idea of the World,” an idea reinterpreted in Saul Steinberg’s famous 1975 New Yorker Magazine cover cartoon. Tobler points out that not only distances but also areas are intentionally distorted in Wallingford’s map; note Florida’s size.)
• 1972 – Dent The late Borden Dent studied perceptual issues by conducting formal experiments on how humans recognize shapes in maps and evaluate cartograms for usability.
• 1990 – today From the mid 1990s on, there were several general attempts to apply more sophisticated techniques from computer science to cartogram generation. Nonlinear optimization and mesh-transformation approaches such as fisheye lens distortions have been explored.
END SIDEBAR: History of Cartograms
(a) Cartogram of the Roman Empire (source: http://www.fh-augsburg.de/~harsch/Chronologia/Lspost03/Tabula/ January 10, 2003)
(b) A clipping of Figure 1(a)
(c) Jerusalem Historic Site Cartogram (source: http://www.mfa.gov.il/mfa/home.asp January 10, 2003)
(d) Gillihan’s cartogram made with plasticine and a rolling pin (source ??? January 10, 2003)
(e) Wallingford’s A New Yorker’s Idea of the United States (source: http://www.georgeglazer.com/maps/newyorkmaps/nyersideas.html January 10, 2003)
Figure 1: Historic Cartogram Examples
Cartograms can be made by continuous or non-continuous distortions. The non-continuous case is much easier, since the input map topology is not necessarily preserved. In the following we examine variants of the cartogram problem in more detail.
SIDEBAR: Variants of the Cartogram Problem
The first category consists of non-continuous cartograms. As shown in figure 2(a), non-continuous cartograms can exactly satisfy area and shape constraints, but do not preserve the input map’s topology. Since the scaled polygons are drawn inside the original regions, the loss of topology does not cause perceptual problems. More critical is the problem that the size of the final polygons is restricted by their original size. Consequently, small polygons cannot be made arbitrarily large without scaling the entire map, so important areas might be difficult to see and screen usage may be poor in the cartogram.
Non-contiguous cartograms (see figure 2(b)) scale all polygons to their target sizes, perfectly satisfying the area objectives. Shapes may be slightly relaxed so polygons may touch without overlapping, and the map’s topology is also highly relaxed since polygons do not retain their adjacency relationships. The advantage is perfect area adjustment and good shape preservation; the main problem, however, is the loss of the map’s global shape and topology, which makes it difficult to perceive the generated visualization as a map.
Circular cartograms (see figure 2(c)) approach the cartogram generation problem by completely ignoring the shape of the input polygons and representing them as circles in the output. In many cases, the area and topology constraints must also be relaxed. The general applicability of this technique is therefore open to question.
The final category is continuous (and contiguous) cartograms. In contrast to the other categories mentioned, contiguous cartograms retain a map’s topology perfectly, but relax the given area and shape constraints. In general, neither shape nor area objectives can be fully satisfied, so cartogram generation involves a complex approximate optimization problem in searching for a good compromise between shape and area preservation. Figure 2(d) is an example of a continuous cartogram generated by the algorithm presented in [5]. In summary, continuous cartograms are difficult to generate, but the resulting polygonal meshes resemble the original map more closely than those generated by other cartogram variants. This study therefore focuses
on continuous cartogram generation.
END SIDEBAR: Cartogram Variants
(a) Non-Continuous Cartogram
(b) Non-Contiguous Cartogram
(c) Circular Cartogram [1]
(d) Continuous Cartogram [5]
Figure 2: Variants of the Cartogram Problem
Previous Work
Most previous approaches for the automated drawing of contiguous cartograms do not yield results comparable in quality to good hand-made drawings. One reason, first identified by Dent, is that straight lines, right angles and other features considered important in human recognition of cartograms are obliterated. Radial methods such as the conformal maps proposed by Tobler, the radial expansion method of Selvin et al., the rubber sheet method of Dougenik et al. and the line integral method of Guseyn-Zade and Tikunov [3] do not provide completely acceptable results, since the shapes of the polygons are often heavily deformed (see figure 3). Likewise, the pseudo-cartograms of Tobler expand the lines of longitude and latitude to achieve a least root mean square area error [9]. Very similar drawings are made by approaching the problem as distortion viewing by nonlinear magnification. Jackel [4] applied radial forces to change the size of polygons, moving the sides of each polygon relative to its centroid, but the solver runs slowly (taking 90 minutes to perform 8 iterations on a map of 6 New England states of the U.S.) and seems to have problems with non-convex input polygons and with self-intersections in the output, which is consistent with our experiments with a similar approach.
Another family of approaches operates on a grid or mesh imposed on the input map. The “piezopleth” method of Cauvin, Schneider and Cherrier transforms the grid by a physical pressure load model. Dorling’s cellular automaton approach trades grid cells until each region achieves the desired number of cells [1]. The combinatorial approach of Edelsbrunner and Waupotitsch [2] computes a sequence of piecewise linear homeomorphisms of the mesh that preserve its topology. While the first method is good at preserving the shape of the polygons, the second method allows a very good fit for area but does not account for shape preservation. A synthesis of both approaches was devised by Kocmoud and House, who propose a force-based model and alternately optimize the shape and the area error [6]. Although the results are better than those of most other methods, this optimization has a prohibitively high execution time. They report a running time of 18 hours for a reasonably sized map with 744 vertices. In figure 3, we present population cartograms generated by several of the methods mentioned.
(a) Tobler [9]
(b) Guseyn-Zade & Tikunov [3]
(c) Edelsbrunner & Waupotitsch [2]
(d) Kocmoud & House [6]
Figure 3: Cartogram Drawing Methods
M-CartoDraw
A new approach for computing continuous cartograms is based on the medial axes of the input mesh. The medial axis has long been used as a compact representation of the shape of an object, and is thus a natural choice for use in the computation of cartograms. We complement this with the idea of using scanlines (first exploited in [5]) to guide the process of cartogram computation. In essence, the scanline approach to cartogram construction uses a line drawn through the map as a hint to the direction in which to extend (or contract) polygons. A series of such local improvements using different scanlines leads to the final cartogram. This new approach runs several times faster than existing methods and provides results that preserve shape
accurately while reflecting the underlying data distribution. Example applications illustrate the potential of our algorithm.
Overview
The rest of this article is organized as follows. Section 2 defines the cartogram problem and discusses the area and shape error functions and complexity of the problem. Section 3 describes a new algorithm for constructing cartograms using the medial axis of the global polygon. Section 4 presents the results and some applications of the new technique, and compares it with previous approaches. Section 5 concludes with open questions and ideas for future work.
2 Cartogram Drawing
The Cartogram Problem
The cartogram problem can be defined as a map deformation problem. The input is a planar polygon mesh (map) and a set of values, one for each region. The goal is to deform the map so that the area of each region matches the value assigned to it, doing this in such a way that the shape of the regions is preserved as much as possible.
Definition: The Cartogram Problem
Input: A planar polygon mesh $P$ consisting of polygons $p_1, \dots, p_k$, and values $X = x_1, \dots, x_k$ with $x_i > 0$, $\sum x_i = 1$. Let $A(p_i)$ denote the normalized area of polygon $p_i$, with $A(p_i) > 0$, $\sum A(p_i) = 1$.
Output: A topology-preserving polygon mesh $\bar{P}$ consisting of polygons $\bar{p}_1, \dots, \bar{p}_k$ such that the function $f : \mathbb{R}^k \times \mathbb{R}^k \to \mathbb{R}$ is minimized:
\[ f(S, A) \longrightarrow \min \]
where
\[ S = \{s_1, \dots, s_k\} \text{ with } s_i = d_S(p_i, \bar{p}_i) \quad \text{(Shape Error)} \]
\[ A = \{a_1, \dots, a_k\} \text{ with } a_i = d_A(x_i, A(\bar{p}_i)) \quad \text{(Area Error)} \]
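The normalized inputs of the definition above can be prepared with a few lines of code. The following is a minimal sketch, assuming each polygon is given as a list of (x, y) vertices; the function names `polygon_area`, `normalize` and `normalized_areas` are illustrative and do not come from our implementation.

```python
def polygon_area(points):
    """Unsigned area of a simple polygon [(x, y), ...] via the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def normalize(values):
    """Scale strictly positive values so that they sum to 1."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be strictly positive")
    total = float(sum(values))
    return [v / total for v in values]


def normalized_areas(mesh):
    """Normalized areas A(p_i) for a mesh given as a list of polygons."""
    return normalize([polygon_area(p) for p in mesh])
```

With both the statistical values and the polygon areas normalized in this way, the two vectors are directly comparable, which is what the area error function relies on.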
Intuitively, topology preservation means that the faces of the input mesh must stay the same, i.e. the cyclic order of adjacent edges in $\bar{P}$ must be the same as in $P$. This can be expressed formally by saying that the pseudo-duals¹ of the planar graphs represented by $P$ and $\bar{P}$ should be isomorphic.
Area Error
In general, it is impossible to fulfill the area and shape constraints simultaneously. The functions $f(\cdot,\cdot)$, $d_S(\cdot,\cdot)$ and $d_A(\cdot,\cdot)$ therefore model the error in the output cartogram. In the simplest case, the single polygon area error function $d_A$ is the norm of the difference of the desired and actual area. If the $v_i$ and $A(\bar{p}_i)$ are normalized ($v_i > 0 \wedge \sum v_i = 1$; $A(\bar{p}_i) > 0 \wedge \sum A(\bar{p}_i) = 1$) and $w_i$ is a weight between 0 and 1 ($\sum w_i = 1$), the area error $d_A$ can be determined as
\[ d_A(v_i, A(\bar{p}_i)) = w_i \cdot |v_i - A(\bar{p}_i)| \quad \forall i = 1, \dots, k. \]
1 The pseudo-dual of a planar graph is a graph that has one vertex for each face and an edge connecting two vertices if the corresponding faces are adjacent.
The weighting factor $w_i$ is important to obtain meaningful cartograms. In polygon meshes consisting of thousands of polygons, not all areas are of equal importance. The area error in regions of high interest (i.e. high values of $v_i$) should have a higher influence on the overall error than the area error in regions of low interest, which may be negligible altogether. In some applications, only the top $x$ ($x < k$) most important polygons are of interest and the area error of all others can be ignored. This would lead to weighting factors
\[ w_i = \begin{cases} \frac{1}{x} & \text{if } v_i > \left(\frac{k-x}{k}\right)\text{-quantile} \\ 0 & \text{otherwise.} \end{cases} \]
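The weighted area error and the top-x weighting above can be illustrated as follows. This is a sketch under assumptions: the quantile condition is realized by ranking the values and picking the x largest, ties are broken by input order, and the names `top_x_weights` and `area_errors` are hypothetical.

```python
def top_x_weights(values, x):
    """Weight 1/x for the x polygons with the largest values, 0 for all others.

    Assumption: ties are broken by input order (Python's sort is stable).
    """
    k = len(values)
    if not 0 < x <= k:
        raise ValueError("x must satisfy 0 < x <= k")
    ranked = sorted(range(k), key=lambda i: values[i], reverse=True)
    weights = [0.0] * k
    for i in ranked[:x]:
        weights[i] = 1.0 / x
    return weights


def area_errors(values, areas, weights):
    """Per-polygon area error d_A = w_i * |v_i - A(p_i)| on normalized inputs."""
    return [w * abs(v - a) for v, a, w in zip(values, areas, weights)]
```

For example, with values (0.5, 0.25, 0.125, 0.125), four polygons of equal normalized area 0.25 and x = 2, only the first two polygons receive a non-zero weight, and only the first contributes to the error.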
In some of our experiments we use such a weighting function since it corresponds to the needs of the application.
Shape Error
The shape of two polygons can be compared in a number of different ways. We can approximate the curvature of the polygons by a turning angle algorithm, curvature plots such as the centroidal profile, super segments, geometric hashing, and Fourier approximations. Our implementation incorporates a Fourier transformation of the polygons’ curvatures. If $C(p)$ denotes the curvature of a polygon $p$ and $F(C(p))$ denotes its Fourier transformation, the shape error $d_S$ can be determined as
\[ d_S(p_i, \bar{p}_i) = d_{Euclid}(F(C(p_i)), F(C(\bar{p}_i))) \quad \forall i = 1, \dots, k. \]
Note that for polygons the Fourier transformation of the curvature can be determined analytically, as we introduced in [5].
Complexity
If we formalize the problem as an optimization problem as done above, there always exists a solution², although it may be difficult to find a good one. It can be shown that a simpler variant of the cartogram problem is NP-hard. The simpler version is called integer cartograms; the basic idea is that regions consist of connected grid cells and issues of shape preservation are ignored. For details the interested reader is referred to [10].
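To make the Fourier-based shape error $d_S$ defined above more concrete, the following sketch uses a discrete stand-in: the curvature $C(p)$ is approximated by the sequence of turning angles at the vertices, and only the first few coefficients of a plain discrete Fourier transform are compared. This is an assumption made for illustration; the analytic transformation of [5] is not reproduced here, and the names `turning_angles`, `dft` and `shape_error` as well as the parameter `n_coeffs` are hypothetical.

```python
import cmath
import math


def turning_angles(points):
    """Signed turning angle at every vertex of a polygon, each in [-pi, pi)."""
    n = len(points)
    angles = []
    for i in range(n):
        ax, ay = points[i - 1]
        bx, by = points[i]
        cx, cy = points[(i + 1) % n]
        heading_in = math.atan2(by - ay, bx - ax)
        heading_out = math.atan2(cy - by, cx - bx)
        turn = (heading_out - heading_in + math.pi) % (2 * math.pi) - math.pi
        angles.append(turn)
    return angles


def dft(seq, n_coeffs):
    """First n_coeffs coefficients of the normalized discrete Fourier transform."""
    n = len(seq)
    return [
        sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(seq)) / n
        for k in range(n_coeffs)
    ]


def shape_error(p, q, n_coeffs=4):
    """Euclidean distance between low-frequency Fourier coefficients.

    Assumes both polygons have the same number of vertices in the same order,
    as is the case for a polygon and its deformed counterpart.
    """
    if len(p) != len(q):
        raise ValueError("polygons must have the same number of vertices")
    m = min(n_coeffs, len(p))
    fp = dft(turning_angles(p), m)
    fq = dft(turning_angles(q), m)
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(fp, fq)))
```

Because turning angles do not change under translation and uniform scaling, a square and a larger, shifted square yield an error of zero, whereas a square and a sheared parallelogram do not.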
3 M-CartoDraw: a Medial-Axes Based Cartogram Algorithm
The basic idea of our algorithm is to incrementally reposition the vertices of the polygon mesh using the medial axes as scanlines. The medial axis of a 2D region is defined as the locus of the centers of all maximal inscribed circles of the object (see Figure 4(a) for examples). There are many equivalent definitions of the medial axis transform. It can, for example, also be defined based on the Delaunay triangulation (centers of the circumcircles of the triangles). A more intuitive definition is the prairie fire transformation [8]. Imagine that the interior of the polygon is composed of dry grass and that the exterior of the polygon is composed of unburnable wet grass. Suppose a fire is set simultaneously at all points along the polygon boundary. The fire will propagate at uniform speed toward the middle of the figure. At some points, however, the advancing line of the fire from one region of the boundary will intersect the fire front from some other region and the two fronts will extinguish each other. These points are called quench points of the fire; the set of quench points defines the skeleton of the figure.
In figure 5(a) we show the US map together with the medial axes and the population errors shown on a bipolar red/blue colormap. Blue corresponds to positive area errors, i.e. blue regions should be larger, red corresponds to negative area errors, i.e. red regions should be smaller, and the darkness indicates the magnitude of the error. In the following we describe the basic idea of the algorithm.
Basic Idea
Consider a line (called a scan line) drawn inside a polygon. Our algorithm draws line segments (called cutting lines) perpendicular to the scan line at regular intervals. Consider the two edges on the boundary of the polygon intersected by the cutting line on either side of the scan line. These edges divide the polygon boundary into two connected chains.
Now, if the area constraints require that the polygon be expanded, the algorithm applies a translation parallel to the scan line to each vertex on the two connected pieces of the boundary (in opposite directions) to stretch the polygon at that point. Similarly, if a contraction is called for, the direction of translation is reversed. The idea of our algorithm is to use medial axis segments as scan lines, effectively using the shape of the polygon to make local expansions or contractions. In figure 5(c), we show three examples of this process. In the mid-west a cutting line contracts the global shape, while in the north-east and Florida the global shape is stretched. The cutting lines are inserted at regular intervals on the medial axes, which are the “scanlines” of the algorithm. In figure 5(b), we show a few cutting lines at six different locations of the map.
2 As shown in [5], all problem variants with an absolute area constraint do not even have a solution in the general case.
(a) Rectangle with circles
(b) Triangle
(c) Polygon
Figure 4: Medial Axes
Algorithm 1: ProcessMASegment(P, X, MASeg, XY)
  for all cl ∈ CuttingLines(MASeg) do   // cutting lines on medial axes segment MASeg
    SF = ScalingFactor({p_i | p_i ∩ cl ≠ ∅}, X)   // determines the aggregated scaling factor of all p_i
    for all v ∈ P do   // for all nodes of the polygon mesh
      XY[v] = XY[v] + SF · side(v, cl) · \vec{MASeg} / |\vec{MASeg}|   // side(v, cl) = 1 if v is on the left side of cl and −1 otherwise
The processing of a single medial axis segment is described in Algorithm 1. The function ScalingFactor determines whether the global polygon is to be stretched or contracted, and also determines the amount of motion. It is computed as a weighted average of the area errors of the polygons cut by the cutting line, weighted by their scale factors. Note that the algorithm does not calculate the new positions of all global vertices for each cutting line. Instead, it aggregates the distortion vectors for each point and applies the aggregate vector after all cutting lines of a medial axis segment have been considered.
Cartogram Algorithm
The main algorithm is presented in Algorithm 2. The main loop of the algorithm iterates over all medial axes segments, computes a candidate transformation of the polygon and checks it for topology preservation and shape errors. The order of processing the sections of the medial axes is determined by their potential for reducing the area error. A shape error threshold εs is used to control the maximum error that can be introduced in one step. If the candidate transformation passes these tests, the results are made persistent; otherwise the transformation is discarded. The algorithm keeps iterating over all medial axes segments until the area error improvement of all medial axes segments is below a threshold εa. Observe that in processing an individual medial axis segment, we do not allow the algorithm to potentially increase the area error. However, over each iteration of the repeat-until loop, the area error decreases monotonically and therefore termination of the algorithm is guaranteed.
Figure 6 shows a few intermediate steps of incrementally applying M-CartoDraw to the US population cartogram problem. The area error is encoded in red for polygons that should be smaller and blue for polygons that should be larger. The algorithm quickly provides good results in some areas which are well adapted (e.g., the mid-western and New England states, California, New York, and Pennsylvania).
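The interplay of the per-segment displacement and the accept-or-discard loop described above can be sketched in a few lines. This is a simplified illustration, not our C++ implementation: the medial axis computation, the topology test and the shape test are passed in as callables (`segments`, `is_valid`), `side` is assumed to be +1 for vertices ahead of the cutting line in scan-line direction and −1 otherwise, and all function and parameter names are hypothetical.

```python
import math


def process_segment(vertices, seg_start, seg_end, scaling_factor, n_cuts=1):
    """Accumulate per-vertex offsets for one scan-line segment (cf. Algorithm 1)."""
    dx, dy = seg_end[0] - seg_start[0], seg_end[1] - seg_start[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length  # unit vector along the scan line
    offsets = [(0.0, 0.0) for _ in vertices]
    for c in range(1, n_cuts + 1):
        t = c / (n_cuts + 1)  # cutting lines at regular intervals
        cx, cy = seg_start[0] + t * dx, seg_start[1] + t * dy
        sf = scaling_factor(vertices, (cx, cy))
        for i, (x, y) in enumerate(vertices):
            # Assumed side(): +1 ahead of the cutting line, -1 behind it.
            s = 1.0 if (x - cx) * ux + (y - cy) * uy >= 0.0 else -1.0
            ox, oy = offsets[i]
            offsets[i] = (ox + sf * s * ux, oy + sf * s * uy)
    return offsets


def m_cartodraw(vertices, segments, area_error, is_valid, scaling_factor,
                eps_a=1e-3, max_rounds=100):
    """Repeat segment passes until the area error improves by less than eps_a."""
    for _ in range(max_rounds):
        e_start = area_error(vertices)
        for seg_start, seg_end in segments(vertices):
            offsets = process_segment(vertices, seg_start, seg_end, scaling_factor)
            candidate = [(x + ox, y + oy)
                         for (x, y), (ox, oy) in zip(vertices, offsets)]
            # Keep the candidate only if it passes the validity tests and
            # reduces the area error; otherwise discard it.
            if is_valid(candidate) and area_error(candidate) < e_start:
                vertices = candidate
        if abs(e_start - area_error(vertices)) < eps_a:
            break
    return vertices


def demo_rectangle(target_area=3.0):
    """Toy run: stretch a 2 x 1 rectangle along its horizontal axis."""
    def area(vs):
        xs = [x for x, _ in vs]
        ys = [y for _, y in vs]
        return (max(xs) - min(xs)) * (max(ys) - min(ys))

    rect = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
    result = m_cartodraw(
        rect,
        segments=lambda vs: [((min(x for x, _ in vs), 0.5),
                              (max(x for x, _ in vs), 0.5))],
        area_error=lambda vs: abs(target_area - area(vs)),
        is_valid=lambda vs: True,  # stand-in for topology and shape tests
        scaling_factor=lambda vs, cut: 0.25 * (target_area - area(vs)),
    )
    return area(result)
```

In the toy run, each round halves the remaining area error of the rectangle, so the loop stops once the improvement per round drops below the threshold. A real mesh would of course need an actual medial axis computation and topology check in place of the stand-ins.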
(a) Map with Area Error and Medial Axes
(b) Example Cutting Lines
(c) Stretching and Contracting the Global Polygon
Figure 5: Basic Idea of the Cartogram Algorithm
(a) step 0
(b) step 1
(c) step 2
(d) step 3
(e) step 4
(f) step 5
(g) step 6
(h) step 7
(i) step 8
(j) step 9
(k) step 10
(l) blue-red colormap used to demonstrate the area error
Figure 6: M-CartoDraw construction series; area error on step 10 is less than 3%.
Algorithm 2: M-CartoDraw(P, X)
  XY.init(P, (0.0; 0.0));   // a node array which will store the (x,y) distortion of each node
  repeat
    Earea = AreaError(P, X);
    MA = ComputeMedialAxes(GP(P));
    for all MASeg ∈ MA do   // iterate over all medial axes
      ProcessMASegment(P, X, MASeg, XY);
      if ¬(Topology(P, XY) ∧ (ShapeError(P, XY) ≤ εs) ∧ (Earea > AreaError(P, X, XY))) then
        ReverseLastStep(P, XY);
      MakePersistent(P, XY);
  until (|Earea − AreaError(P, X)| < εa)
Extensions of the Cartogram Algorithm
One problem of the proposed algorithm is that the medial axis of the global polygon may not allow us to process local regions of high area error. For example, portions of the United States map with a high area error, such as the upper mid-west, are not covered well by the medial axes (see figure 5(a)). Previous experiments with an interactive scanline-based cartogram algorithm suggest that additional scan lines in such regions can improve the quality of the resulting cartograms.
• Clustering Regions One approach is to cluster regions that have area errors in the same direction, i.e. they all need to be shrunk or expanded (see figure 7(b)). We compute the medial axes for each such cluster of regions and then apply Algorithm 2 to calculate the cartogram. The cluster regions are processed in order of decreasing aggregated area error.
• All Polygons The cluster-based approach can be extended further by computing the medial axis for each polygon in the input set. In figure 7(c) we show the polygons and their scanlines. Again, scanlines for each polygon are considered in order of decreasing area error.
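The clustering step of the first extension can be sketched as a connected-components search on the region adjacency graph. The sketch assumes that a positive signed error means the region must grow and a negative one that it must shrink, that regions with zero error are left out, and that adjacency is given as a dictionary; the name `cluster_regions` is hypothetical.

```python
from collections import deque


def cluster_regions(signed_errors, adjacency):
    """Group adjacent regions whose signed area errors have the same sign.

    signed_errors: list of floats, one per region (assumed: > 0 means grow).
    adjacency: dict mapping a region index to its neighbouring region indices.
    Returns clusters sorted by decreasing aggregated absolute area error.
    """
    seen = set()
    clusters = []
    for start, err in enumerate(signed_errors):
        if start in seen or err == 0:
            continue
        grow = err > 0
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:  # breadth-first search over same-sign neighbours
            region = queue.popleft()
            members.append(region)
            for nb in adjacency.get(region, ()):
                nb_err = signed_errors[nb]
                if nb not in seen and nb_err != 0 and (nb_err > 0) == grow:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(sorted(members))
    clusters.sort(key=lambda c: sum(abs(signed_errors[r]) for r in c), reverse=True)
    return clusters
```

Each returned cluster would then be treated as one polygon for the medial axis computation, with the clusters visited in the returned order.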
(a) Interactive Scanlines
(b) Cluster Region Medial Axes
(c) All Polygon Medial Axes
Figure 7: Extensions of the Cartogram Algorithm
4 Evaluation and Applications
The algorithm described in the previous sections has been implemented in C++ using the LEDA library [7] and runs on Windows 2000 as well as on UNIX systems. The tests were performed on a 1.5 GHz Dell PowerEdge with 4 GBytes of main memory (of which just 15 MB were needed) running Red Hat Linux. In this section, we report and discuss the results and compare the effectiveness and efficiency of our approach. Our method provides results comparable to previous approaches (see figure 3), with the geography of the United States being clearly perceivable.
(a) 0:00 am (EST)
(b) 6:00am (EST)
(c) 12:00pm (EST)
(d) 6:00pm (EST)
(e) Blue–red colormap used to demonstrate the area error
Figure 8: US long-distance call volume data during one day. The color of each polygon represents the area error. White polygons are perfectly distorted with an area error close to 0, blue polygons have to be smaller, and red polygons have to be larger. The pictures show the results of the telephone call volume at four different times during one day.
Quality
In Figure 8 we show the results of the M-CartoDraw algorithm with the AT&T call usage (Figure 8(a–d)) during one day. Note that all cartograms (Figures 8(a–d)) are of high quality because the area error is less than 5% on each cartogram. The color of each polygon represents the area error. White polygons are perfectly distorted with an area error close to 0, blue polygons have to be smaller, and red polygons have to be larger. The pictures show the results of the telephone call volume at four different times (0am, 6am, 12pm, 6pm EST) during one day. The resulting visualizations clearly reflect the different time zones of the US, and show interesting patterns of phone usage as it proceeds during the day.
(a) Shape versus Area Error
(b) Area Error
(c) Running Time
Figure 9: Quality and Efficiency Comparison
The better quality of our new algorithm can be seen in Figure 9(a), where we compare the tradeoff between area error and shape error for each incremental step of the algorithm with the Scanline approach. To measure the shape error we use the Fourier-based function introduced in [5]. Each point in Figure 9(a) corresponds to the intermediate result of one M-CartoDraw iteration step. At the beginning, the area error is larger than 36% on all maps. With each iteration of the algorithm, the area error decreases and the shape error grows due to the distortions. It is therefore natural that the curve goes from the lower right corner up to the left until the error is small enough, the area error difference is less than its threshold, or the shape distortion is larger than a given threshold. In most cases the shape error increases with decreasing area error. The diagram also shows that the final shape error depends on the initial area error. This can be explained by the fact that a polygon mesh with a high initial area error has to be more distorted than
the polygon meshes with a lower one. The picture also shows the regular decrease of the M-CartoDraw curve in contrast to the completely different behavior of the manual Scanline approach, which can be explained by the massive human interaction. Figure 9(b) shows the total area error for M-CartoDraw (left) with 3% and 5% area error, [5], and [6]. This figure shows that our new proposal is much better than [5] and the complex optimization-based approach from [6].
Efficiency
We also performed extensive experiments to evaluate the efficiency of our new algorithm. For this study we did not count the computation time needed to reduce the number of nodes, because this needs to be done only once in a precomputation step. The main advantage of our approach is that the total running time is very small; running times range from six seconds for the US state map to five minutes for the US county map (about 3000 polygons). This compares favorably with the best previously known approach [6], which (adjusted for modern CPUs) takes about two orders of magnitude longer to compute a cartogram. Finally, in Figure 9(c) we compare the running time of the M-CartoDraw algorithm with [6]. The comparison assumes that the algorithm runs on a 120 MHz computer with 32 MByte RAM. Note that the y-axis is logarithmic.
(a) 1912 Roosevelt(blue)–Wilson(red)
(b) 2000 Bush(blue)–Gore(red)
(c) Blue-red colormap
Figure 10: Elections – computed with M-CartoDraw; Area Error less than 5%
Application Examples
We ran the algorithm on a number of example data sets. Figure 10 shows the U.S. population cartogram with the presidential election results of 1912 and 2000. The area of the states in the cartograms corresponds to population and the color corresponds to the percentage of the votes. A bipolar colormap is used to show which candidate won the state. The cartograms help to understand the outcome of the elections while retaining the shape of the states. Note that the cartograms still have a remaining area error which overemphasizes the smaller states of the mid-west and therefore leads to the impression that, for example, George W. Bush won the 2000 elections with a larger amount of the popular vote.
(a) smoking97
(b) lungcancer97
(c) white–black and white–orange colormap
Figure 11: Both pictures show population cartograms colored with the normalized smoking behavior (Figure 11(a)) and lung cancer behavior (Figure 11(b)). Source: American Cancer Society, http://www.cancer.org. Linear colormap, normalized by population.
In Figure 11 we used population cartograms to show the smoking behavior in the US and its bitter after-effects in the form of lung cancer. The white–black colormap (Figure 11(c)) is used to visualize the smoking behavior. The white–orange colormap in Figure 11 is used to depict the number of lung cancer cases in each state; orange serves as a signal color to highlight the regions where the number of lung cancer cases is very high. As can easily be seen, the areas with many cancer cases are highly correlated with the areas with a high prevalence of smoking. The cartogram helps to understand where in the US the smoking prevalence is low or high, what percentage of the population does or does not smoke, and how the smoking behavior relates to lung cancer. It is interesting to see that the lowest number of cancer cases in the US occurs in the state of Utah, and the left-hand cartogram (Figure 11(a)) shows that the number of smokers there is small as well. The population cartogram also shows that this fortunate situation concerns only a minority of the US population. Note that we have normalized the smoking and cancer data of each state by population.
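The normalization by population and the linear colormaps mentioned above can be illustrated with a short sketch. The function names `per_capita` and `linear_colormap`, the RGB value chosen for orange, and the state figures are assumptions made for this example and are not taken from the paper or from the American Cancer Society data.

```python
def per_capita(counts, populations):
    """Normalize raw counts by the population of each state."""
    return {state: counts[state] / populations[state] for state in counts}


def linear_colormap(value, vmin, vmax, end_color):
    """Interpolate linearly from white (at vmin) to end_color (at vmax), RGB in 0..255."""
    t = 0.0 if vmax == vmin else (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside [vmin, vmax]
    return tuple(round(255 + t * (channel - 255)) for channel in end_color)


BLACK = (0, 0, 0)
ORANGE = (255, 165, 0)  # assumed RGB value for the signal color

# Hypothetical toy data: number of smokers and inhabitants per state.
smokers = {"X": 300_000, "Y": 90_000}
population = {"X": 1_000_000, "Y": 900_000}

rates = per_capita(smokers, population)
lo, hi = min(rates.values()), max(rates.values())
colors = {state: linear_colormap(rate, lo, hi, BLACK)
          for state, rate in rates.items()}
```

The same `linear_colormap` call with `ORANGE` as the end color would give the white–orange scale used for the lung cancer data.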
(a) 1910
(b) 1920
(c) 1930
(d) 1940
(e) 1950
(f) 1960
(g) 1970
(h) 1980
(i) 1990
(j) 2000
(k) Compare 2000 with 1900
(l) Blue–red colormap
Figure 12: The pictures show the population development of the last 100 years. Each decade is colored according to the population increase or decrease over the ten years. The colormap used (Figure 12(l)) can be interpreted as follows: dark blue - extremely large immigration, light blue - low immigration, white - no change, and light red - low emigration.
In Figure 12 we studied the development of the US population during the ten decades of the 20th century. For example, the pictures show the population history of the western states (e.g. CA) at the beginning of the 20th century and the rapid decrease during the rest of the century. Each decade is colored according to the population increase or decrease over the preceding ten years. The colormap used (Figure 12(l)) can be interpreted as follows: dark blue - extremely large immigration, light blue - low immigration, white - no change, and light red - low emigration. Note that the area of the US does not reflect the absolute number of inhabitants in each year. Figure 12(k) illustrates this by scaling the area of the whole polygon mesh according to the number of US inhabitants. The black polygon corresponds to the global polygon of the US in 2000 with 270 million inhabitants, and the grey one to the global polygon in 1900 with 76 million inhabitants.
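The scaling behind Figure 12(k) can be worked through as a small example. If the area of the global polygon is to be proportional to the number of inhabitants, lengths have to be scaled by the square root of the population ratio. The sketch below uses the two population figures quoted in the text; the rectangular outline and the helper names `polygon_area` and `scale_about_centroid` are hypothetical stand-ins for illustration and do not come from the paper.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def scale_about_centroid(points, factor):
    """Scale all vertices about the mean of the vertices by a linear factor."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(cx + factor * (x - cx), cy + factor * (y - cy)) for x, y in points]


# Population figures quoted in the text (in millions).
POP_1900, POP_2000 = 76.0, 270.0

# Area grows with the square of length, so lengths shrink by sqrt(76 / 270).
factor = math.sqrt(POP_1900 / POP_2000)

# Hypothetical stand-in for the global polygon of the year 2000.
outline_2000 = [(0.0, 0.0), (10.0, 0.0), (10.0, 6.0), (0.0, 6.0)]
outline_1900 = scale_about_centroid(outline_2000, factor)
area_ratio = polygon_area(outline_1900) / polygon_area(outline_2000)
```

With these figures the area ratio is 76/270, roughly 0.28, so the 1900 outline has about 53% of the linear extent of the 2000 outline.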
(a) Original map
(b) Map distorted by M-CartoDraw
(c) Red–blue colormap
Figure 13: Figure 13(a) shows the German map. The red–blue colormap (Figure 13(c)) is applied to the population density. In blue polygons the density is extremely high (e.g. Berlin, Hamburg, Munich), in red polygons the population density is low, and in white polygons the population corresponds to the polygon area. Figure 13(b) shows the distorted German map.
One very important design goal of the M-CartoDraw algorithm was the ability to handle maps with a large number of polygons. Figure 13(a) shows the German population with the red–blue colormap of Figure 13(c). This picture shows the three most populated regions very clearly, i.e. Berlin, Hamburg, and Munich, which are colored blue. Figure 13(b) shows the German population cartogram with approximately 400 polygons. We have used the colormap from Figure 13(c) to show the area error. The main focus was on the highly populated regions. The area error of the three largest polygons, as well as that of the polygons in the north west (white polygons), is almost zero. The bipolar colormap can also be used to show the population density, that is, the ratio between population and area. With our red–blue colormap, the colors have the following meaning: in blue polygons the density is extremely high, in red polygons the population density is low, and in white polygons the area exactly mirrors the population to which it corresponds.
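The bipolar density coloring described above can be sketched as follows. The paper does not specify how density values are mapped to colors, so the logarithmic scale, the clamping parameter `max_log2`, and the function names `density_ratio` and `bipolar_color` are assumptions made for this illustration. The only property taken from the text is the meaning of the colors: blue for high density, red for low density, and white where a polygon's share of the area matches its share of the population.

```python
import math


def density_ratio(pop, area, total_pop, total_area):
    """Population share divided by area share; 1.0 means area mirrors population."""
    return (pop / total_pop) / (area / total_area)


def bipolar_color(ratio, max_log2=3.0):
    """Map a density ratio to RGB: red (low) -> white (ratio 1) -> blue (high).

    Assumed scale: log2 of the ratio, clamped to [-max_log2, +max_log2].
    """
    t = max(-1.0, min(1.0, math.log2(ratio) / max_log2))
    fade = round(255 * (1.0 - abs(t)))
    if t >= 0.0:
        return (fade, fade, 255)  # fades from white toward blue
    return (255, fade, fade)      # fades from white toward red


# Hypothetical toy regions: (population, area).
regions = {"city": (4_000_000, 1_000.0), "rural": (1_000_000, 9_000.0)}
total_pop = sum(p for p, _ in regions.values())
total_area = sum(a for _, a in regions.values())

colors = {name: bipolar_color(density_ratio(p, a, total_pop, total_area))
          for name, (p, a) in regions.items()}
```

A logarithmic scale is chosen here so that a region twice as dense as the average and a region half as dense receive equally saturated colors; a linear scale would be an equally valid assumption.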
(a) Year 2000 Median Household Income of the U.S. Census Bureau
(b) Used Unicolor Map
Figure 14: The picture shows the year 2000 median household income reported by the U.S. Census Bureau. Color is redundantly mapped to the median household income; brighter colors correspond to higher median income.
We also used the M-CartoDraw algorithm to continuously monitor the year 2000 median household income of the US Census Bureau (source: http://www.census.gov) at the US-county level (Figure 14). Since the underlying polygon data do not change, the medial axes computations need to be run only once (for the global polygon and all polygon medial axes variants). Note that M-CartoDraw needed less than a minute to compute the cartogram shown in Figure 14. Note also that the computation time is just one order of magnitude higher than the time needed to compute the US-state cartograms, even though the input size (in this case the number of polygons) is about two orders of magnitude larger. With the help of M-CartoDraw it is possible to quickly identify the main points of interest, which in our example are the polygons with the highest median household income. Such applications open up a completely new dimension with respect to computation time.
In Figure 15(b) we give another example in which we use cartograms to visualize geography-related data. In contrast to Figure 15(a), which shows the original map of the US, Figure 15(b) shows a US population texture cartogram computed by M-CartoDraw, in which the relief was distorted according to the number of inhabitants. Due to lack of space in this paper, we will cover the texture cartogram technique in a future publication.
(a) Original map
(b) Map distorted by M-CartoDraw
Figure 15: US population cartogram with texture (source of the texture: http://fermi.jhuapl.edu, January 9, 2003)
(a) Original New York state map
(b) Map distorted by M-CartoDraw with almost zero area error
Figure 16: The right-hand picture shows a population cartogram of New York state with texture mapping, which M-CartoDraw computed from the left-hand picture. M-CartoDraw enlarges polygons to show where the population density is extremely high.
In contrast to Figure 16(a), which shows the original New York state map, Figure 16(b) shows a population texture cartogram of New York state computed by M-CartoDraw. Since the cartogram algorithm focused only on the area error, the distorted map of Figure 16(b) is, as a consequence of the extreme population distribution, a cartogram of New York state with almost zero area error. The cartogram clearly reflects that the majority of the population lives in New York City. This application example shows that an adapted version of M-CartoDraw can be used to visualize many interesting problems in real time. Another example could be a map transformation in which we distort the map according to a given route in order to point out to the user the most interesting characteristics of the route.
5
Conclusions and Future Work
In this study we analyzed and discussed the problem of efficient cartogram drawing and proposed a medial axes-based algorithm that outperforms prior art by orders of magnitude in running time while providing results that are comparable to those of previous approaches. Experiments show that the proposed algorithm offers good results for a variety of applications, even allowing an interactive display of network traffic levels in telecommunication applications, especially for previously unseen maps with a large number of polygons (3000). Although the proposed algorithm is a significant step toward fast, reliable, and effective cartogram generation, there remain promising directions for further research. It would be interesting to study general methods for computing a morph that has specified boundary properties and optimizes some function of the interior. Our method of using medial axis segments as scanlines can be generalized to allow other kinds of operators (for example, applying vectors in the direction of flow from the medial axis to the boundary).
References
[1] Dorling, D. Area Cartograms: Their Use and Creation, 1st ed. Department of Geography, University of Bristol, England, 1996.
[2] Edelsbrunner, H., and Waupotitsch, R. A combinatorial approach to cartograms. Computational Geometry (1997), 343–360.
[3] Gusein-Zade, S., and Tikunov, V. A new technique for constructing continuous cartograms. Cartography and Geographic Information Systems 20, 3 (1993), 66–85.
[4] Jackel, C. B. Using arcview to create contiguous and noncontiguous area cartograms. Cartography and Geographic Information Systems 24, 2 (1997), 101–109.
[5] Keim, D. A., North, S. C., and Panse, C. Cartodraw: A fast algorithm for generating contiguous cartograms. Information Visualization Research Group, AT&T Laboratories, Florham Park, 2002.
[6] Kocmoud, C. J., and House, D. H. Continuous cartogram construction. Proceedings IEEE Visualization (1998), 197–204.
[7] Mehlhorn, K., and Näher, S. The LEDA Platform of Combinatorial and Geometric Computing, 1st ed. Cambridge University Press, 1999. http://www.mpi-sb.mpg.de/~mehlhorn/LEDAbook.html.
[8] O'Rourke, J. Computational geometry in C, 1st ed. Cambridge Univ. Press, 1994. http://cs.smith.edu/~orourke/.
[9] Tobler, W. Pseudo-cartograms. The American Cartographer 13, 1 (1986), 43–50.
[10] Venkatasubramanian, S. Computing integer cartograms is NP-complete. Technical Report, ATT Research Labs (2002).