A Shape-based Vector Watermark for Digital Mapping

Stefan Bird, Chris Bellman

Ron van Schyndel

School of Mathematical and Geospatial Sciences RMIT University Melbourne, Australia e-mail: [email protected], [email protected]

School of Computer Science and IT RMIT University Melbourne, Australia e-mail: [email protected]

Abstract— Digital vector maps are an expensive commodity. Like any digital data, they are also very easy to copy. Piracy (or unauthorised reselling) of maps will become increasingly common in the future. This project looks at embedding a hidden message, or watermark, in a digital map so that its original authorship can be ascertained. This information enables a third party to verify a seller's rights to the map and aids in the resolution of copyright disputes. Some other vector watermarking schemes treat vector maps as a cloud of coordinates to be perturbed in some way that is independent of actual usage. These papers generally do not discuss how large a subset of the map is needed to reliably retain the watermark. Instead, we concentrate on watermarking map feature lines, so that feature extraction from a watermarked vector map does not necessarily compromise watermark integrity.

Keywords: vector watermarking, geometric watermarking

I. INTRODUCTION

In the past few years, digital rights management has become an increasingly important issue for digital media companies. The cheap availability of powerful computers has provided many new avenues for content creators to explore. However, it has also provided the means for inexpensive copying of virtually any data. This copying is extremely difficult to police. One of the most visible parts of this phenomenon is the proliferation of online music and movie trading, allegedly costing the content industry millions of dollars in lost revenue [3]. The geospatial industry also faces the issue of illegal copying [4].

A. Vector maps and Images are different

Vector maps are becoming an increasingly important tool in many areas besides geospatial mapping. Unlike images, vector maps have properties such as effectively non-lossy scaling (due to coordinates being stored as floating-point values), and they often include point-based metadata. Like other digital media, a vector map does not degrade with successive copying, but unlike multimedia, it can also better withstand various common geometric transformations reversibly. This flexibility enables more reliable secondary usage (for example, Google-maps’ use of map annotations [5], and its use of transformations in Google-Earth [14]). Because of this flexibility, vector maps can be more highly valued than equivalent raster images.

A map is generally extremely expensive to produce, yet the digital nature of any vector map leaves it vulnerable to being copied and resold by a third party without permission.

A vector map can be rendered by a viewer in order to produce an image for human consumption, just like other digital media. However, while images and videos are usually designed for direct human consumption (viewing), a vector map is usually consumed by a viewer that is application-specific. Often the resultant vector data is not directly viewed by humans at all, although derivatives of it may be, and these may even include conversion to an image for viewing. Thus, like software watermarking [6] and some forms of 3D polygon watermarking [8], vector watermarking in general cannot directly make use of deficiencies in the human visual system (which is the 'traditional' approach to making watermarks invisible) in order to find places to hide data.

Figure 1. A common method of vector watermarking is to modify vertex locations to reveal patterns that may look like sampling artefacts, but actually carry embedded information. The right side shows the contours of the cover data, and the left side shows some of the patterning with the contours removed. The random-looking patterns can clearly be seen to have little to do with the contours.

Nevertheless, some watermarking techniques rely on the assumption that the end user of a map is a human, and use devices such as drawing the same vector or vector object multiple times, or redundantly adjusting point properties. Such watermarking may not survive a resampling attack, and map editors would prune such redundancy.

Watermark transparency is usually described in human terms, but it is in fact an application-specific concept: a watermark that perturbs a map below the level of tolerance used by an application would be invisible to it. Unfortunately, this application-specific definition of transparency cannot be generalised.

Unless the watermark is completely reversible (i.e. removable), it is highly desirable that any watermarking be applied to vector data that is a public derivative of a non-public, secure master data set. An example of such usage is in the licence for vector data from Ordnance Survey Ireland, which explicitly specifies that only derivative information, not the vector data itself, can be made public in any form [11]. Clearly, in mission-critical cases, the end user must be informed of the presence of watermarks that subtly alter the data, and perhaps be provided with a method for their authorised removal if this is possible. This has been an ongoing issue in the use of watermarks for medical imagery [12], as well as for GIS.

B. Paper summary

In this paper, we propose a scheme for hiding information within map elements such that the information remains attached to the elements themselves. This is in contrast to other research in this area, which aims to hide information in the configuration of the map elements with respect to each other. As discussed in section III, we believe that it can be valid to change these configurations, but doing so would break more traditional watermarks.

After some background in section II, we introduce our main proposed scheme for vector watermarking in section III, followed by the experimental design and results showing the resiliency of the watermarking in sections IV and V. We finish with a discussion and conclusion in section VI, and suggestions for future work in section VII.

II. BACKGROUND

In this section, we describe a non-exhaustive list of terms and some of the key ideas used in the rest of the paper.

A. Terminology

Vector maps, as defined for the purpose of this paper, are collections of 2D vertices or connected points, which typically form polylines (or polygons if closed). A header classifies each of these lines as belonging to a feature, which may thus have many lines. These features could be a height-level contour line, a pressure level (isobar), or a set of land features (road, house, groups of houses classified in some way, areas of certain land use, etc). The 2D vertices are usually stored as floating-point numbers, but prior to use they are often quantised to a level of tolerance, with any change of coordinates below this level usually being quantised away.

Polylines are composed of straight-line segments, and a collection of segments can form a shape, with multiple shapes forming the polyline. When a shape’s coordinates are systematically perturbed, we define it as a stego-shape.

Like images and video, maps for human viewing are usually rendered into a 2D display, using line style, colour, and text as annotations. The software can usually display some or all lines in user-selectable colours and styles, and highlight certain features (feature separation) as listed in the file format. Since it is common practice to extract features from one map and apply them (after calibration) to another, it makes sense to create a stego-hierarchy consistent with usage. Thus, in addition to stego-shapes, we have stego-lines, stego-features, and even stego-maps to describe maps or map elements which contain hidden data.

B. Using Shapes to Embed Information

A shape is defined as a significant component in the linework of a map. A significant feature is anything that cannot be removed from the map without degrading its quality. By perturbing these shapes slightly, creating stego-shapes, we are able to embed a message into the map. A suitable decoder will recognise these stego-shapes and decode the embedded message.

The significance of quality degradation is almost entirely application-dependent. For example, for maps that are primarily viewed by humans, any degradation is acceptable as long as it is invisible to the human eye. If the map is used for (say) mining and prospecting, a watermark’s perturbation may have a bad effect on the application under certain circumstances.

The shapes can be altered by quantising one or more of their parameters to particular values. For example, if we chose to use the rotation of each shape as the host, we could quantise that rotation to (say) 30” intervals. We could then add 15” to the rotation to represent a binary 1, or subtract 15” to represent a 0. Thus when the map is analysed, we can re-quantise it to 15” intervals (because the map may have been altered in the meantime).

One of the challenges of our method is to find ways of perturbing the shapes that do not significantly damage the map, while ensuring resistance to affine transformations. Another difficulty is that not all shapes can be readily perturbed. A road network generally consists of straight lines. If this is not taken into account, manipulating the shapes would result in changes in the bearing of the roads near intersections, which is quite noticeable even if the change is very small. One way to avoid this is to simplify the mesh before looking for shapes, so that any change in point positions changes the angle of the whole road segment and is far less noticeable.

Given further restrictions on the use of the map data, we are able to adapt the watermark to work within such restrictions. For example, if local area preservation is known to be a requirement for a given map, then we may assume that only area-preserving transforms will be applied to the map. A watermark might thus be embedded within the area data, and area-preserving map projections would not disturb it.

C. Other Vector watermarks

In general, digital watermarking relies on embedding hidden information onto a ‘substrate’ of host data. For image data, there are really only three things you can do to data points to embed a watermark: you can quantise the data; you can partially replace or subsample the data; or you can merge the data with some other data (presumably the watermark). You can do this either in the raw domain or in some transform domain. You cannot usually reorder the data or change its neighbour connectivity, though you can resample and requantise the sampling grid. In contrast, drawing order is not usually a significant issue in a vector map, so information could be hidden in the drawing sequence, but it could also easily be broken that way.

A generalisation of the vector watermarking problem is discussed by Sion, Atallah, and Prabhakar [7], who look at how to embed information within real number sets. In vector watermarking, we have ‘topological linkages’ between the numbers. For example, in general a vector map has a 2D topology, but points can be linked to their neighbours in many different ways. Ideally, the hidden information should inherit the topological characteristics of the host data, but this is not an absolute requirement, and many watermarks differ in this aspect, particularly vector watermarks (see Figure 1 for an example where topological linkage was not used).

For example, in sampled audio watermarking, the host audio can be represented as a 1D stream of sample data. Similarly, the watermark would be a 1D stream somehow embedded into the host, but accessed in a similar manner to the host audio. While it is plausible that the watermark data be randomly inserted anywhere within the stream, in practice it is most useful if revealed to a compliant player in the same sequence in which the original host data is played.
Indeed, Su and Girod [13] go further and suggest that the ideal watermark would need to have a similar statistical power spectrum to its host, in order to prevent watermark isolation and possible removal via statistical methods.

Some vector watermarking algorithms, by contrast, attempt to exploit topological characteristics of the displayed vector map, ignoring the topological characteristics of the host format. This is shown in Figure 1, where patterns made by the points actually belong to different contour lines. Since the contour interval can be changed, the patterns, and so the watermark, can be destroyed without badly affecting the map.

For example, Pu, Du and Jou [9] create an adjacency map to find points that are physically close together and build a watermark local to each of these regions. While this provides some degree of image crop-resistance, if multiple map features were present when watermarked and some features were later removed, the effect on the watermark is unclear, as it depends on the adjacency of the removed points.

Voigt, Yang and Busch [10] quantise the coordinates to a tolerance greater than normally used, and then effectively perturb the points within that tolerance by a positive or negative amount determined by the spatial location in a map created using a secret key. Again, the 2D representation is part of the algorithm. Our method is an extension of this approach, applied to shapes so as to preserve the topological characteristics of the data.

III. WATERMARKED SHAPES

A. Overall structure

Figure 2 shows the overall construction of the proposed vector watermarking algorithm. Its overall appearance resembles the traditional structure of other watermarking algorithms [1], but it differs in that shapes are watermarked, not the map as a whole.

Figure 2. Overall Vector Watermark algorithm. (Block labels: map, Selectively Quantise, Find Shape, Embed into Shape, Apply Modified Shape to Map, New map; message, Replicate Msg, Encrypt, Apply ECC, Modulate. The shape steps are applied for each shape.)

Briefly, the map may be quantised to facilitate shape recognition later. The map is then scanned for shapes, and a message is embedded in each shape. The message may be split or replicated as needed to fill the space available, which depends on the shape size. Different shapes may carry parts of a message or different messages, depending on the required robustness. Each may be encrypted and have ECC applied to improve recovery. Various modulation schemes can then be applied in embedding, although in this paper this step was omitted. Finally, the transformed message is converted into perturbation codes and embedded into the shape coordinates, and the modified shape then replaces the original shape within the map. Decoding and message extraction is the logical reverse of this process. We now cover each component in more detail.

B. Preparing the map

A mesh simplification (Figure 3) is often done to remove points that may otherwise make the shape detection (and thus watermark detection) too sensitive. If this were not done, it would be trivial to remove the watermark by just simplifying the mesh. Because the shapes are in effect 'artefacts' of the line's points, the addition or removal of points will substantially change the shapes, and hence the data stored within them.

If the program makes a note of where the removed nodes are in relation to the ones that were kept, it is possible to reinsert them after the watermark has been embedded. This means the overall complexity and number of points would be unaffected. The program may need to readjust the removed points slightly so they still have the same position relative to the remaining points (which may have been moved in the process of embedding the watermark).
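The simplification step can be illustrated with a short sketch. The paper does not name the simplification algorithm, so the Douglas-Peucker method is used here purely as an assumed stand-in; the function names and the sample line are hypothetical. The function returns the indices of the points that are kept, so that the removed points remain known and could be reinserted after embedding.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length

def simplify(points, tolerance):
    """Return the sorted indices of the points kept by Douglas-Peucker."""
    if len(points) < 3:
        return list(range(len(points)))
    keep = {0, len(points) - 1}
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        worst, worst_dist = None, tolerance
        # Find the interior point farthest from the chord first-last.
        for i in range(first + 1, last):
            d = perpendicular_distance(points[i], points[first], points[last])
            if d > worst_dist:
                worst, worst_dist = i, d
        if worst is not None:
            keep.add(worst)
            stack.append((first, worst))
            stack.append((worst, last))
    return sorted(keep)

# Hypothetical line: point 1 lies within tolerance of its neighbours' chord.
line = [(0.0, 0.0), (1.0, 0.05), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
kept = simplify(line, tolerance=0.1)
removed = [i for i in range(len(line)) if i not in kept]
```

Keeping indices rather than coordinates is a design choice of this sketch: the list `removed` records which nodes were dropped, which is the minimum needed to put them back once the kept points have been perturbed.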

Figure 3. Simplifying a line by removing points

C. Defining a Shape

Critical to the success of the algorithm is the method used to define the bounds of the shape. In other words, ‘How do we decide which parts of a line segment should be used to embed the watermark, and how sure can we be of finding those same locations for decoding after some distortion has occurred?’ In light of this question, the method for shape boundary selection must satisfy the following requirements:

• Shape position must be stable: Stego-shapes existing on a line must be consistently recoverable. Slight changes in point positions should not alter the set of points each shape contains. Shapes must also be recoverable even if part of the line is added or removed. This precludes methods that rely only on an absolute point on the line (e.g. start, end or middle) as a reference.
• Stego-shapes should contain relatively few points: Having smaller shapes allows more shapes to fit on the line. This increases the storage capacity, allowing us to have a longer message or more redundancy and error checking.
• Stego-shapes should be dense: We want as many shapes as possible on each line, therefore it is desirable not to 'waste' points in between shapes. Ideally, every point on the line should be part of a shape.

The first of these is by far the most important.

The proposed Buffer-zone method looks at shorter lines amongst long ones. Long lines indicate the lack of interesting small-scale features, whereas short lines show small detail. By embedding the message in the detailed areas, a higher information capacity will result.

1) Defining Shapes using the Buffer Zone Method

The algorithm for finding shapes is to scan the feature for lines below a certain length. This length could be defined by the operator, based on map statistics such as the mean line length, or derived from a histogram of line lengths as in Figure 4. Once a segment below the cut-off length is found, the program continues until a segment longer than the cut-off length is found. The shape boundaries are thus defined as the region between these two segments, inclusive. This method establishes a buffer zone between shapes, hence the name.
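The scan described above can be sketched in a few lines. This is an illustrative reading only, not the authors' implementation: the names `segment_lengths` and `find_shapes` are hypothetical, "inclusive" is interpreted as including the long segment that terminates the shape, and a run of short segments that reaches the end of the line without a terminating long segment is simply dropped.

```python
import math

def segment_lengths(points):
    """Length of each straight segment of a polyline."""
    return [math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)]

def find_shapes(points, cutoff):
    """Return (first_segment, last_segment) index pairs, both inclusive."""
    shapes = []
    start = None
    for i, length in enumerate(segment_lengths(points)):
        if start is None:
            if length < cutoff:
                start = i                 # first short segment opens a shape
        elif length >= cutoff:
            shapes.append((start, i))     # terminating long segment closes it
            start = None
    return shapes

# Hypothetical feature: long, three short, long, two short, long.
feature = [(x, 0.0) for x in (0, 10, 11, 12, 13, 23, 24, 25, 35)]
shapes = find_shapes(feature, cutoff=5.0)
```

In this sample the long segments that do not fall inside a shape are what the text calls the buffer zone. The cut-off is passed in directly; deriving it from a histogram or from the mean line length would be a separate step.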

Figure 4. A Histogram-threshold based Buffer-Zone Method. A single feature from a map is shown. Here, the buffer zone selection is based on line length, whose threshold can be determined using a histogram of line lengths or the median line length. A stego-shape (gray line) is shown with perturbed vertices based on a grid rotated to align with the shape’s end points.

The method satisfies the first requirement: it is very difficult to move a line's length across the threshold without significantly altering the detail, especially after a simplification has already been performed. In addition, any perturbation due to the watermark will not significantly alter this arrangement.

There are drawbacks to this method, however. Large areas of the map may remain unwatermarked because their line segments are too long. This is equivalent to smooth areas in an image (such as a blue sky) not being watermarked because the watermark would become visible. Similarly, shapes could become very large in areas of high complexity.

Several improvements to this basic idea are possible. The main one, shown in Figure 5, is to watermark both the long-line and short-line patches (thus reducing the number of points that are unwatermarked). The median segment length of each map feature can be taken and the segments classified into two groups: buffer segments and non-buffer segments.

Figure 5. A Median-based Buffer-Zone Method. Using the median-threshold buffer zone results in more of the lines being watermarked.
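A minimal sketch of the median-based variant follows, under stated assumptions. Buffer segments are taken to be those whose length lies within the median segment length ± twice the map tolerance, and each contiguous run of non-buffer segments with at least `min_segments` segments is treated as one shape. The function name, the default of four segments and the sample lengths are hypothetical, and the sketch does not distinguish short from long segments within a run.

```python
import statistics

def median_buffer_shapes(lengths, tolerance, min_segments=4):
    """Group non-buffer segments into shapes separated by buffer segments.

    A segment is a buffer segment when its length is within
    median +/- 2 * tolerance (assumed band). Returns a list of shapes,
    each a list of segment indices.
    """
    median = statistics.median(lengths)
    low, high = median - 2 * tolerance, median + 2 * tolerance
    shapes, run = [], []
    for i, length in enumerate(lengths):
        if low <= length <= high:          # buffer segment closes the run
            if len(run) >= min_segments:
                shapes.append(run)
            run = []
        else:
            run.append(i)
    if len(run) >= min_segments:           # run that reaches the end of the line
        shapes.append(run)
    return shapes

# Hypothetical segment lengths for one feature; the median is 5.
lengths = [1, 1, 1, 1, 5, 9, 9, 9, 9, 9, 5, 1, 1]
shapes = median_buffer_shapes(lengths, tolerance=0.5)
```

With these sample values the two segments of length 5 act as buffers, the short run and the long run on either side of the first buffer each become a shape, and the final two-segment run is too small to be used.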

Buffer segments typically have a length that is close to the median segment length. A good range for buffer segments is the median segment length ± twice the map tolerance. This means that few segments will be classed as buffer segments (which are not watermarked), but the buffer is large enough to prevent a short line becoming a long one, and vice versa, if the map points move. Shapes are created in the non-buffer segments and consist of n > 3 segments; these shapes form contiguous sections along the feature, separated by buffer segments.

It is important to note the trade-off in the choice of n. A larger n means more segments (and thus points) within each shape. This makes the properties of the shape less likely to change and the information represented in the shape more robust. However, larger shapes imply that fewer shapes will fit in the dataset, and less data can be stored overall. Having extra data space also increases the robustness of the message, because a higher chip rate (the number of times the message is repeated in the map) can be applied to the message, or error correction codes can be embedded.

2) Watermarking a shape

Once the shapes have been found, the program has to change the properties of each shape to reflect the watermark message. Several potential properties were evaluated to find a suitable carrier. All of them are based on the geometry of the shape; although it is feasible to use some statistical measure between the shapes as well, perturbing the geometry of the shapes proved much simpler.

Shapes can be considered polylines with a start and end point, so any property that can be calculated on a polyline can also be applied to a shape. The difficulty is in finding a property that is both robust against unintentional change and easy to modify without moving any of the points outside their tolerance.

Figure 6. A detail of Figure 4 showing a stego-shape as a polyline with perturbed vertices. The baseline is formed from the end points of the shape, rotated to horizontal, and forms the basis of a coordinate system for embedding the information. (Labels: Original Line, Perturbed Line, Baseline, Native Map Tolerance, Minimum Map Tolerance.)
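The baseline coordinate system of Figure 6 can be illustrated with a hedged sketch. It expresses each vertex in a frame whose X axis runs from the shape's start point to its end point, then snaps the X and Y values of every interior vertex to a grid. The convention that an even grid index encodes 0 and an odd index encodes 1, the `step` parameter and all function names are assumptions made for illustration; the paper does not specify its quantisation rule at this level of detail.

```python
import math

def _frame(points):
    """Origin and unit direction of the chord from first to last point."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    angle = math.atan2(y1 - y0, x1 - x0)
    return x0, y0, math.cos(angle), math.sin(angle)

def to_baseline(points):
    """Coordinates as if the baseline were rotated to be horizontal."""
    x0, y0, c, s = _frame(points)
    return [((x - x0) * c + (y - y0) * s, -(x - x0) * s + (y - y0) * c)
            for x, y in points]

def snap(value, bit, step):
    """Nearest multiple of step whose index parity equals bit (assumed rule)."""
    k = math.floor(value / step)
    if k % 2 != bit:
        k += 1
    return k * step          # moves the value by less than one step

def embed(points, bits, step):
    """Store two bits (X then Y) in each interior vertex; end points stay fixed."""
    x0, y0, c, s = _frame(points)
    supply = iter(bits)      # needs 2 bits per interior vertex
    marked = [points[0]]
    for u, v in to_baseline(points)[1:-1]:
        u, v = snap(u, next(supply), step), snap(v, next(supply), step)
        marked.append((x0 + u * c - v * s, y0 + u * s + v * c))
    marked.append(points[-1])
    return marked

def extract(points, step):
    """Blind read-out: only the (possibly rotated) shape and step are needed."""
    bits = []
    for u, v in to_baseline(points)[1:-1]:
        bits += [round(u / step) % 2, round(v / step) % 2]
    return bits

shape = [(0.0, 0.0), (1.2, 0.7), (2.1, -0.4), (3.0, 3.0)]
message = [1, 0, 0, 1]
marked = embed(shape, message, step=0.1)
```

Because the frame is rebuilt from the shape's own end points, `extract` gives the same bits after the whole shape is rotated or translated. The sketch makes no attempt at scale invariance, and in practice `step` would have to be chosen so that the displacement stays within the map tolerance.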

Because this is a blind watermark, we cannot rely only on relative differences between the original and the watermarked image, as this requires the original. Instead, information is stored by selectively quantising vertex coordinate values or other properties. As shown in Figure 6, shapes offer the advantage of providing a baseline orientation, potentially allowing them to become resistant to affine transformations.

Because of its relative simplicity, its robustness and the potential to read shapes even after they have been scaled or rotated, this paper concentrates on using the X and Y coordinates of each point relative to a baseline: a line drawn from the start point to the end point of a shape. Calculations are performed as if the shape's baseline were rotated to be horizontal. The X and Y components of each non-end point are then quantised to store the information. This method has the advantage that the information is encoded multiple times within each shape and is orientation-independent.

For simplicity and to test the concept, we have used a chord from the start point to the end point of a shape to provide the baseline. To facilitate this, the start and end points are themselves never perturbed. Clearly, this exposes the watermark to a noise attack which may affect these start and end points, and hence the baseline, so we are investigating more robust methods involving sample moments of the points in a manner unaffected by any previous potential perturbation.

3) Choosing the message format

Because of added or deleted points, features rearranged on disk, damaged shapes and other attacks on the watermark, it is important to be able to locate the start of the message irrespective of where it is within the data. To aid this, each message has a 3-byte sync header prepended to it. The first two bytes contain the bit sequence 11111111 11111110. This particular sequence was chosen as being easy to find as well as impossible to occur normally within the message. For simplicity, each byte of the message is stored with a parity bit, so only 127 different characters are possible. This allows for some crude error detection using parity.
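The framing used in this subsection (two sync bytes, a length byte, then characters that each carry a parity bit) might be sketched as follows. Several details are assumptions rather than facts from the paper: the parity bit is placed last and uses even parity, the length byte is stored without parity, and character code 127 is rejected so that no payload byte consists entirely of ones. The names `frame` and `unframe` are hypothetical.

```python
SYNC = [1] * 15 + [0]        # the bit sequence 11111111 11111110

def byte_bits(value):
    """Eight bits of value, most significant first."""
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]

def with_parity(code):
    """Seven data bits followed by an even-parity bit (assumed layout)."""
    data = [(code >> shift) & 1 for shift in range(6, -1, -1)]
    return data + [sum(data) % 2]

def frame(message):
    """Sync header, length byte, then one parity-protected byte per character."""
    codes = message.encode("ascii")
    if len(codes) > 255:
        raise ValueError("a one-byte length cannot describe more than 255 characters")
    if any(code > 126 for code in codes):
        raise ValueError("code 127 is excluded in this sketch")
    bits = SYNC + byte_bits(len(codes))
    for code in codes:
        bits += with_parity(code)
    return bits

def unframe(bits):
    """Locate the sync pattern; return (text, parity_error_count) or None."""
    for start in range(len(bits) - 23):
        if bits[start:start + 16] == SYNC:
            break
    else:
        return None
    pos = start + 16
    length = int("".join(map(str, bits[pos:pos + 8])), 2)
    pos += 8
    chars, parity_errors = [], 0
    for _ in range(length):
        chunk = bits[pos:pos + 8]
        pos += 8
        if len(chunk) < 8:
            break                          # stream ended before the message did
        if sum(chunk[:7]) % 2 != chunk[7]:
            parity_errors += 1
        chars.append(chr(int("".join(map(str, chunk[:7])), 2)))
    return "".join(chars), parity_errors

framed = frame("RMIT")
```

Reading back a stream such as `[0, 1, 0] + framed` shows the purpose of the sync header: the decoder does not need to know where in the recovered bit stream the message begins.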
The third byte of the header is the message length. This allows a further redundancy check if multiple occurrences of the message are stored for robustness, since their lengths should also agree. Since only a single byte is allocated to the message length, it is not possible to encode a message longer than 255 bytes into the map. This is sufficient for most uses, however; all that would be needed would be a company name, ID number and timestamp, and optionally a client ID and/or a digital signature to prevent forgeries.

The message is stored directly after the header, possibly encoded with error correction or other codes to make message detection and reading more reliable (see next section).

4) Making the message more reliable

One of the problems in watermarking is preventing the message from being corrupted. This is usually done by incorporating a significant level of redundancy in the message format.
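One simple form of such redundancy is to store the message several times and recover each bit by a vote across the copies. The sketch below is illustrative only: it assumes the copies have already been located, aligned and cut to equal length, it resolves ties to 0, and the name `majority_vote` is hypothetical. The capacity figures are the 8000-bit and 200-bit values used in this paper's own example.

```python
def majority_vote(copies):
    """Return (bits, dissent) for equal-length copies of one message.

    dissent[i] counts the copies that disagree with the winning value of
    bit i, which gives a rough per-bit reliability measure.
    """
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("all copies must have the same length")
    bits, dissent = [], []
    for column in zip(*copies):
        ones = sum(column)
        bit = 1 if 2 * ones > len(column) else 0   # ties fall to 0 (assumption)
        bits.append(bit)
        dissent.append(ones if bit == 0 else len(column) - ones)
    return bits, dissent

# Number of repetitions available for a 200-bit message in 8000 bits of capacity.
capacity_bits, message_bits = 8000, 200
repetitions = capacity_bits // message_bits

copies = [[1, 0, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
bits, dissent = majority_vote(copies)
```

Any overhead added to the message, such as error correction bits, increases `message_bits` and therefore lowers the number of repetitions that fit in the same capacity.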

If the map has enough shapes to store 8000 bits and the message is only 200 bits (25 bytes) long, we can store the message 40 times. The number of repetitions is called the chip rate, in deference to an equivalent property in digital communications. When the multiple copies of the message are read back, a 'majority vote' is taken for each bit of the message. The number of ‘dissenters’ gives a measure of the reliability.

5) Error correction

Because the chip rate is inversely proportional to the message size, including error correction reduces the number of copies of the message stored.

The most basic form of error detection is the standard use of a parity bit. One parity bit will only detect an odd number of bit errors. By including more parity bits, it is possible to calculate which bit caused the error. These extra parity bits are called an error correction code (ECC). One of the simplest is the Hamming code [2]. Each of the parity bits depends only on certain message bits, and each message bit has a unique combination of parity bits depending on it. When an error is detected, it can be located by determining which parity bits are incorrect. This allows single-bit errors to be corrected.

IV. EXPERIMENTAL DESIGN

A Java program was developed to test the two methods of defining shapes. It provides a simple user interface showing a view of the map, as shown in Figure 7. The map shown is an Auslig (now Geoscience Australia) 9” DEM of the Grampians from the late 1990s. It contains roughly 40,000 points and 170 features. Being derived from a triangulation, it has very little fine detail, limiting its message capacity.

The program is capable of finding the shapes using both methods discussed in this paper, and of encoding the message with or without error correction. The program is then able to attack the watermark by adding random noise and attempting to read the watermark back again. Because watermarks cannot always be read perfectly, an error rate of 5% per message copy was considered acceptable, provided subsequent copies were correct.

To provide more balanced data, several messages were tested. These were: “Test Message”, “Vector Watermarker”, “Geomatics Major Project”, “RMIT University”, “Melbourne, Australia” and “Stefan Bird”. The messages were truncated or repeated as necessary to achieve the desired message length.

To determine the reliability of the watermark, we found the largest message the program could embed in a given file and read back with acceptable error. The program starts by embedding a message one character long and increases the message size by one character for each test. It considers the maximum message length to be reached when the last three message lengths had unacceptable error rates.

Finally, the watermark was tested for resiliency against an additive random-noise attack. Other attacks, which would not appear as random noise, include truncation or extension of line features and rotation of the map. Future research could concentrate on these and other kinds of attacks.

V. RESULTS

A. Finding the shapes

The Buffer Zone method had no problems finding shapes within the map, and there were no shapes that caused it to fail (for example, all of the points being in a line). However, some shapes with vertices that are coincident in the original may not have them remain coincident after watermark insertion. We will detect these conditions and allow for them in future work. There is currently a minimum shape length and a minimum feature line length requirement. Future work will also relax this requirement by merging shapes across features.

B. Encoding the message

Encoding the message was generally fast, taking less than 1 second per message over 43,700 polygons, and 10 seconds overall to load a map with over 9000 features and 1.5 million segments. This time includes finding the shapes, embedding any error correction and moving each shape's points to the correct positions. Depending on a user-selectable minimum number of segments per shape, there could be anywhere from 600 to 200,000 shapes.

Figure 7. The user interface of the program that was used for testing the approach, showing a typical contour map, with each contour line being a feature.

C. Decoding the Message Using buffer zones created a reliable watermark; with the right parameters, messages up to 70 characters could be embedded within the file shown in 6. The results shown in Figure 8. Figure 10. are typical for this kind of file.

The exact number of segments per shape appears to have a very large effect on the reliability of the watermark. This appears to be a feature inherent in the method, as the peaks and troughs are also present when error correction codes are applied (Figure 8. ). There is also a slow downward trend as the number of shape segments increases. A shape containing more segments will be more error-tolerant than a smaller one, however larger shapes also means fewer shapes. Since the message capacity is directly related to the number of shapes available, having fewer shapes reduces the message reliability, counteracting the gain by having the larger shapes. 70

E. Robustness Against Random Noise Attack The Buffer Method showed resilience to attack as shown in Figure 10. , surviving attacks up to 40% of the watermark strength. It can survive further, but only at reduced message capacities.

60 50

80

40

70

30 20 10 0 0

10

20

30

40

50

60

70

Shape Size (Segments)

Maximum Message Length (chars)

Maximum Message Lengtrh (chars)

80

It can be seen from s 8 and 9 that adding error correction reduced the reliability of the message (as shown by a lower maximum length for a given shape size). As the number of ECC bits decreased, the message reliability increased. This suggests that the chip rate (which is reduced by the error correction overheads) is significantly more important for reliability than bit error corre ction. When the error correction word was 16 bits long, the reliability was similar to having no error correction, in some cases exceeding it.

Figure 8. Message Capacity as a function of shape length

None 4 bit 8 bit 12 bit 16 bit

60 50 40 30 20 10 0 0%

D. Testing error correction codes In attempting to make the message more reliable, error correction codes (ECC) were added. Hamming codes were used for error correction. The stronger the error correction, the more ECC bits are needed and the larger the overhead. It is clear from Figure 8. Figure 9. that the best results occur when error correction is not used. Applying stronger error correction (eg 8 bit instead of 16) reduces the maximum message length; the overheads of the error correction reduce the message chip rate.

[Plot axes: Shape Size (segments) and Maximum Message Length (chars). Legend: None, 4 bits, 8 bits, 12 bits, 16 bits]

Figure 9. Adding Error Correction Coding made no significant improvement.

[Plot axis: Attack Strength (% of watermark strength), 0% to 100%]

Figure 10. Resilience against an additive random noise attack

Error correction did not help resiliency against attack, showing similar results to Figure 8.

F. Other robustness issues

The random noise attack was used to simulate several other classes of attack that the watermark would have to withstand in a real-world environment. Some of the possible attacks include:

• Point insertion/deletion – To increase resistance to this attack, the method simplifies the map beyond its tolerances before watermark insertion and detection; the deleted or inserted points are likely to be ignored.

• Feature insertion – Features inserted into the map will not contain the watermark headers, and so will be ignored. Additionally, the program is able to display to the user which parts of the map contain the watermark and which parts do not.

• Feature deletion – Features that are deleted from the map will lose the part of the watermark contained within them. The program may recover from this condition using the other copies of the message stored elsewhere in the map. Higher chip rates increase resistance to this kind of attack.

• Map scaling – The points within a shape are perturbed relative to the highest point in the shape (for the bit encoded in the Y coordinate) and the opposite shape end (for the bit encoded in the X coordinate). The current version of the program is not resistant to a scaling attack; however, it is possible for a future version to try normalising each shape to give the best reading of the X and Y bits.

• Change of map projection – A change in the projection can be seen as localised scaling of the shapes, and may be dealt with as above.

• Map rotation – Since all points are watermarked relative to their shape baseline, rotating the shape should have no effect on the data read. Rounding errors may induce random noise in the point locations; however, this is normally well below the 40% level the watermark can tolerate.

• Point quantisation – Shapes are rotated randomly (based on the first and last point) with respect to the axes used for quantisation, thus quantisation will appear as random noise. The level of this noise depends on the level of quantisation. It would be reasonable to expect a quantisation to be at least half of the map's tolerance. Assuming this induces a similar strength of random noise, the watermark is unlikely to survive this attack.

The single largest threat to the watermark as described in this paper is a re-watermarking attack on the map. Assuming the shapes are defined with the same parameters (method and segments per shape), the message can simply be overwritten (although this will damage the map). Even if the new watermark does not use exactly the same definition of the shapes, it will represent random noise of the same level as the watermark strength, which will make the original watermark unreadable. We are currently working on ways to deal with this situation.

VI. CONCLUSION

The watermarking method of using shapes to group points proved successful. In the test map, messages up to 70 characters long could be reliably encoded using the buffer zone method. Error correction codes were tried, but their overheads reduced the message reliability, showing that a higher chip rate was the better way of protecting the message. The watermark message proved resistant to random-noise attacks of up to 40% of the watermark strength, and to stronger attacks at reduced message capacity. This would make it useful as a general-purpose watermark on map data. However, it would not survive a concerted attack by a knowledgeable attacker, so it would mainly act as a deterrent against casual piracy.

VII. FUTURE WORK

There are several directions future research will take. Currently, an alternative method of finding the shapes is being investigated. This method will ‘grow shapes like crystals’ using a seeding technique. Unlike the Buffer Zone method, which may leave large, unwatermarked buffer zones, this new method promises to apply the watermark more uniformly and densely over the map, improving reliability and performance. In addition, its generation can be keyed, so that a secret key is needed to decode the watermark. The issue currently is synchronisation, for which we propose to use Barker Codes. We are also investigating some method of standardising the orientation and scale which is independent of whether the points have been marked already or not. The current baseline method is adequate but noise-prone. It may be possible to develop a watermark that cannot be overwritten. Such a watermark would prove far better for copyright protection than the current method (the current method can be overwritten, but in doing so the map will be damaged).

REFERENCES

[1] I. Cox, J. Bloom, and M. Miller, “Digital Watermarking”, Morgan Kaufmann, San Mateo, 2001.

[2] R. W. Hamming, “Error detecting and error correcting codes”, Bell System Technical Journal, vol. 29, pp. 147-160, 1950.

[3] B. Rosenblatt, “Digital Rights and Digital Television”, in Television Goes Digital, Darcy Gerbarg ed., Chapter 14, pp. 209-223, Springer, New York, 2009.

[4] T. Yamada, Y. Fujii, S. Tezuja, N. Komoda, “Line Division based Digital Watermarking System for Facilitating Fair use of Small Size Vector Map Content”, Electronics and Communications in Japan, Vol. 91, No. 9, 2008. Translated from Denki Gakkai Ronbunshi, Vol. 127-C, No. 6, June 2007, pp. 897-903. URL= http://www3.interscience.wiley.com/journal/121638021/abstract, accessed 21 June 2009.

[5] http://maps.google.com

[6] L.-h. Zhang, Y.-x. Yang, X.-x. Niu, S.-z. Niu, “A Survey on Software Watermarking”, Journal of Software 14(2), 2003, pp. 268-277.

[7] R. Sion, M. J. Atallah, S. Prabhakar, “On Watermarking Numeric Sets”, in International Workshop on Digital Watermarking (IWDW09), F. A. P. Petitcolas, H.-J. Kim (eds.), LNCS, vol. 2613, pp. 130-146, Springer, Heidelberg, 2003.

[8] R. Ohbuchi, H. Masuda, and M. Aono, “Watermarking Three-Dimensional Polygonal Models Through Geometric and Topological Modifications”, IEEE Journal on Selected Areas in Communications, vol. 16, no. 4, May 1998, pp. 551-560.

[9] Y. Pu, W. Du and C. Jou, “Toward Blind Robust Watermarking of Vector Maps”, International Conference on Pattern Recognition (ICPR'06), volume 3, 2006, pp. 930-933.

[10] M. Voigt, B. Yang, C. Busch, “Reversible Watermarking of 2D-Vector Data”, Proceedings of the 2004 Workshop on Multimedia and Security, Magdeburg, Germany, 2004, pp. 160-165.

[11] http://www.osi.ie/, accessed 21 June 2009.

[12] G. Coatrieux, H. Maitre, B. Sankur, Y. Rolland, and R. Collorec, “Relevance of watermarking in medical imaging”, in Proc. IEEE EMBS Conf. Information Technology Applications in Biomedicine, Arlington, VA, 2000, pp. 250-255.

[13] J. K. Su, B. Girod, “Power-spectrum condition for energy-efficient watermarking”, IEEE Transactions on Multimedia, vol. 4, no. 4, Dec 2002, pp. 551-560.

[14] http://earth.google.com, accessed 21 June 2009.