ICES Journal of Marine Science, 59: 226–234. 2002 doi:10.1006/jmsc.2001.1147, available online at http://www.idealibrary.com
Error detection of bathymetry data by visualization using GIS

Atanu Basu and Shivani Malhotra

Basu, A., and Malhotra, S. 2002. Error detection of bathymetry data by visualization using GIS. – ICES Journal of Marine Science, 59: 226–234.

Graphical methods are a very efficient means of detecting errors in large volumes of spatial data. Bathymetry data form the basis for nautical charts, which the fishing industry uses to exploit and manage fishing resources and which fishing communities use for studies of fish migration and habitat. The world's bathymetry data have been collected over a century and have a wide range of resolution and accuracy. The Geographic Information System (GIS) is a powerful tool to process, analyze, manage, and display spatial data. Marine GIS provides a mechanism for recording data with navigation, creating an efficient digital database, and plotting data on maps. Here, we have used ArcView 3.2 GIS software and its 3D Analyst extension module to visualize bathymetry data that were collected around the Hawaiian Islands. We have also identified some highly erroneous bathymetry data in this data set.

© 2002 International Council for the Exploration of the Sea
Keywords: bathymetry, data errors, data visualization, GIS, Hawaiian Islands. Published electronically 16 November 2001. A. Basu and S. Malhotra: Pacific Mapping Program, University of Hawaii, 2525 Correa Road, HIG 407A, Honolulu, Hawaii 96822, USA. Correspondence to A. Basu: tel: (808) 956-5061; fax: (808) 956-2580; e-mail:
[email protected]
Introduction

Graphical methods for error detection in spatial data are well suited to human comprehension of the complex, multidimensional aspects of spatial data quality. These methods may provide the most efficient means of evaluating the quality of large volumes of spatial data. Even though some researchers view graphical methods for data analysis and presentation as unscientific (Cox, 1978), there are several excellent examples of the effective use of graphical techniques to detect, evaluate, and display spatial errors (Beard and Buttenfield, 1999). These techniques have already been used in land-based applications in the areas of spatial statistics, graphics, visualization, error modeling, and cartography. In this paper, we apply these techniques on a Geographic Information System (GIS) platform to detect errors in bathymetry data.

Bathymetry survey data are used to provide electronic charting technology for the fishing industry, for delineation of bottom features by commercial fishermen, and for migration and habitat studies of fish and marine mammals. These data are also used for earthquake and fault studies, sediment and pollutant studies, studies of storm surge and tsunami effects, oil and gas exploration, mineral exploration, coastal planning, and ecosystem evaluations. Bathymetry survey
data are also the core component of nautical charts, which are used to lay out courses and navigate ships by the shortest, safest, and most economical route. Today, nautical charts are used by federal and state agencies, commercial shippers, the fishing industry, environmental groups, academia, coastal zone planners, and others.

The marine soundings that form the basis for nautical charts and bathymetry maps have been collected over the course of a century. During this time, bathymetry measurement techniques have evolved from a knotted lead line to sonar-based single-beam echo sounders, multibeams (Kleinrock, 1992; Basu and Saxena, 1999), side-scan sonars (Blondel and Murton, 1997), and laser-based Lidar Laser Line-scan systems (Estep, 1993; Whitman, 1996). Similarly, navigational techniques have improved drastically over the years, from radio navigation systems such as Loran C and Omega to satellite-based systems such as Transit and the Global Positioning System (GPS; Leick, 1990).

The collected bathymetry data have a wide range of resolution. In some areas, bathymetry data are sparsely distributed and in other areas data density is exceptionally high (Basu and Nalamotu, 1997). The collected data also have a wide range of horizontal and positional accuracy, which was the best possible at the time of collection but is considered very poor by present standards (Smith, 1993; Lee, 2000). Smith (1993) assessed the accuracy of 14 491 069 digital ship soundings in 2253 cruise surveys collected between 1955 and 1992 in the Lamont-Doherty Earth Observatory on-line database by analyzing 329 058 crossover errors (COEs) at intersecting ship tracks. The COE analysis method checks the quality of each track segment by comparing it to the other segments in the database. Smith (1993) observed that 5% of cruises with COEs yield root-mean-square amplitudes exceeding 500 m and that the cumulative median global COE has remained constant at 26 m since the late 1970s.

GIS technology is a powerful tool to process, analyze, manage, and display spatial data. In recent years, new applications of GIS have been developed, such as coastal GIS and marine GIS. Marine GIS provides a mechanism for recording data with navigation, creating an efficient digital database, and plotting data on maps. Multiple data sets, organized as maps, can be georeferenced, overlaid, and compared using GIS (Basu, 1998). Such comparisons can help promote sustainable development in coastal areas, while protecting the marine environment. Here, we have used ArcView 3.2 GIS software (ESRI, 1996) and its 3D Analyst extension module to visualize bathymetry data collected around the Hawaiian Islands. We have also identified some highly erroneous bathymetry data in this data set.

Data visualization and analysis

Data source

The National Geophysical Data Center (NGDC) is now the official distribution center for all the National Ocean Service (NOS) bathymetry, bathymetry/fishing, regional, geophysical, and other maps. NGDC's GEODAS Marine Trackline Geophysics database contains bathymetry (single beam and vertical beam of multibeam), magnetic, gravity, and seismic navigation data collected during marine cruises from 1953 to the present. The data have been collected worldwide by various US and international agencies and academic institutions. The GIS database for this work included bathymetry and marine gravity data, which were retrieved from the GEODAS CD-ROM (version 3.2) published by the NGDC. These data were collected around the Hawaiian Islands in 303 surveys by 16 organizations from four different countries. A summary of the data is presented in Table 1.

Table 1. Summary of survey lines, in nautical miles, along which the organizations collected bathymetry and gravity data around the Hawaiian Islands. The data were obtained from NGDC.

Organization                            Country Bathymetry   Gravity
Lamont-Doherty Earth Observatory        USA         10 027     7 846
NOAA                                    USA         20 400     2 263
US Geological Survey                    USA         16 365    12 161
Oregon State University                 USA          1 513     1 294
SOEST, University of Hawaii             USA         33 783    22 598
US Navy                                 USA          7 044         0
Scripps Institute of Oceanography       USA         25 636     6 135
University of Washington                USA          1 055         0
Texas A & M University                  USA            691         0
US Defense Mapping Agency               USA          1 662         0
Woods Hole Oceanographic Institution    USA            164         0
University of Rhode Island              USA            787         0
Geological Survey of Japan              Japan        1 310     2 277
University of Tokyo                     Japan        2 207       904
Geological Survey of Russia             Russia       2 201     2 967
Geological Survey of UK                 UK             394         0
Total                                              125 239    58 445

From this data set we chose an area where the density of tracklines is high. The area of study lies between 18.5°N and 20.0°N and between 156.11°W and 158.0°W (Figure 1). In this area, there are 71 294 survey points containing bathymetry values, 19 002 survey points containing gravity values, and 7890 survey points containing both values. The information for survey data points that is included in the GIS database is given in Table 2.

The survey points have both spatial and non-spatial attributes. Spatial attributes contain the positional information, and non-spatial attributes contain both temporal and thematic attributes of the survey points. The positional information of a survey point is given as its longitude and latitude; the temporal attribute is given as the year of data collection. The thematic attributes record the primary and secondary navigation systems, the bathymetry and gravity instruments that were used to collect the data, the name of the data-collecting organization, and the bathymetry and gravity data values.
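The selection of the study area and the three point counts quoted above can be illustrated with a minimal Python sketch. The `SurveyPoint` type, the function names, and the sample points are hypothetical and are not part of the GEODAS database or ArcView; west longitudes are written as negative decimal degrees, which is an assumed convention. The sketch also assumes that the bathymetry and gravity counts each include the points that carry both values, which the text does not state explicitly.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class SurveyPoint(NamedTuple):
    lon: float                     # decimal degrees, west negative (assumed convention)
    lat: float                     # decimal degrees, north positive
    depth_m: Optional[float]       # None when no bathymetry value was recorded
    gravity_mgal: Optional[float]  # None when no gravity value was recorded


# Study-area bounds quoted in the text, with west longitudes written as negatives.
LAT_MIN, LAT_MAX = 18.5, 20.0
LON_MIN, LON_MAX = -158.0, -156.11


def in_study_area(p: SurveyPoint) -> bool:
    """True when the point lies inside the rectangular study area."""
    return LAT_MIN <= p.lat <= LAT_MAX and LON_MIN <= p.lon <= LON_MAX


def count_by_content(points: Iterable[SurveyPoint]) -> Dict[str, int]:
    """Count in-area points with a depth, with a gravity value, and with both."""
    counts = {"bathymetry": 0, "gravity": 0, "both": 0}
    for p in points:
        if not in_study_area(p):
            continue
        has_depth = p.depth_m is not None
        has_gravity = p.gravity_mgal is not None
        counts["bathymetry"] += has_depth
        counts["gravity"] += has_gravity
        counts["both"] += has_depth and has_gravity
    return counts


# Invented sample points, for illustration only.
sample = [
    SurveyPoint(-157.0, 19.0, 4200.0, None),
    SurveyPoint(-157.5, 19.5, 4500.0, 12.3),
    SurveyPoint(-156.5, 18.6, None, -5.0),
    SurveyPoint(-155.0, 19.0, 3000.0, 1.0),  # east of the study area, ignored
]
counts = count_by_content(sample)
```

A rectangular test in geographic coordinates is enough here because the study area is itself defined by latitude and longitude limits.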
Figure 1. The area of study around the Hawaiian Islands, between 18.5°N and 20.0°N and between 156.11°W and 158.0°W. There are 71 294 survey points containing bathymetry values, 19 002 survey points containing gravity values, and 7890 survey points containing both values.

Table 2. Information for survey data points that is included in the GIS database.

Information type               Description
Point-ID                       Each data point has a unique ID
Longitude                      Given in degrees
Latitude                       Given in degrees
Cruise #                       Survey cruise number
NGDC #                         Each cruise has a unique NGDC number
Year                           Year of survey
Institution                    Survey institutions like Lamont, Scripps, USGS etc.
Country                        Country which conducted the survey
Primary Navigation Systems     Sextant, Loran-C, Transit Satellite, GPS etc.
Secondary Navigation Systems   Sextant, Loran-C, Transit Satellite, GPS etc.
Bathymetry Instruments         3.5 kHz Precision Depth Recorder, Sea Beam etc.
Gravity Instruments            Lacoste-Romberg, Graf-Askania, Bell etc.
Bathymetry Values              Given in meters
Gravity Values                 Given in mgal
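As an illustration only, the attributes listed in Table 2 could be held in a record type such as the one sketched below. The class name, the field names, and the sample values are hypothetical and do not reproduce the actual schema of the GIS database; attributes that may be absent for a survey are modelled as optional, and missing values are written as NA in the same way as in Table 3.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SurveyRecord:
    """One survey data point with the attributes of Table 2 (hypothetical names)."""
    point_id: int
    longitude: float                             # degrees
    latitude: float                              # degrees
    cruise: str
    ngdc_number: str
    year: int
    institution: str
    country: str
    primary_nav: Optional[str] = None            # e.g. Loran-C, GPS
    secondary_nav: Optional[str] = None
    bathymetry_instrument: Optional[str] = None
    gravity_instrument: Optional[str] = None
    depth_m: Optional[float] = None              # meters
    gravity_mgal: Optional[float] = None         # mgal


def describe(record: SurveyRecord) -> List[Tuple[str, str]]:
    """Return (attribute, value) rows, writing 'NA' for missing values."""
    rows = []
    for f in fields(record):
        value = getattr(record, f.name)
        rows.append((f.name, "NA" if value is None else str(value)))
    return rows


# Invented values, for illustration only.
example = SurveyRecord(
    point_id=1, longitude=-157.0, latitude=19.0, cruise="DEMO01",
    ngdc_number="00000000", year=1980, institution="EXAMPLE", country="USA",
    depth_m=4200.0,
)
rows = describe(example)
```

Keeping the instrument and navigation fields optional reflects the fact that older surveys do not always record them.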
Data display

The 71 294 survey data points that have bathymetry values are shown in Figure 2. A Triangulated Irregular Network (TIN) model of the ocean floor surface was created using these data points and is shown in Figure 3. The objective of TIN modeling is to convert the point objects into a mosaic of area objects that approximate a surface. The well-known seamounts in the area are labeled in Figure 3. The depth of the ocean floor in this area is between 351 and 11 159 m, with an average depth of 4016.8 m. The standard deviation of these bathymetry data values is 940.8 m. In Figure 3, a deep trench-like feature is evident, lying between Indianapolis seamount and Jaggar seamount. The trench is oriented from southeast to northwest. It is identified in the TIN model by the color range of 6000 to 11 159 m.

Figure 2. The 71 294 survey data points in the study area that have bathymetry values.

We created a raster model of the ocean floor surface by gridding the data (Figure 4). The trench-like feature, labeled as “anomalous data”, is more clearly visible in this model. Figure 5 is a 3D plot of the data points. The anomalous data points are seen to have a bottom-right to top-left orientation, and also appear to divide the display area into two parts. A 3D surface of the ocean floor has been created with these points (Figure 6). The anomalous data points create the spikes that protrude downwards from the basement of the ocean floor surface.
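The summary statistics and the gridding step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used by ArcView or its 3D Analyst extension: the text does not say which gridding method was applied, so the sketch simply averages the depths that fall in each cell, and it uses the population standard deviation because the text does not say which estimator was reported. All names and sample values are hypothetical.

```python
import math
import statistics
from collections import defaultdict
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]  # (lon, lat, depth_m)


def depth_summary(points: Sequence[Point]) -> Dict[str, float]:
    """Minimum, maximum, mean, and population standard deviation of depth."""
    depths = [d for _, _, d in points]
    return {
        "min": min(depths),
        "max": max(depths),
        "mean": statistics.fmean(depths),
        "std": statistics.pstdev(depths),  # assumption: population estimator
    }


def grid_mean_depth(points: Sequence[Point], lon0: float, lat0: float,
                    cell_deg: float) -> Dict[Tuple[int, int], float]:
    """Average the depths falling in each square cell of size cell_deg.

    Cells are indexed (row, col) from the south-west corner (lon0, lat0).
    Cells that contain no soundings are simply absent from the result.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lon, lat, depth in points:
        col = math.floor((lon - lon0) / cell_deg)
        row = math.floor((lat - lat0) / cell_deg)
        acc = sums[(row, col)]
        acc[0] += depth
        acc[1] += 1
    return {cell: total / n for cell, (total, n) in sums.items()}


# Invented soundings, for illustration only.
points = [
    (-158.00, 18.50, 4000.0),
    (-157.95, 18.55, 4200.0),
    (-157.40, 19.20, 5000.0),
    (-157.45, 19.25, 11000.0),  # an implausibly deep value
]
summary = depth_summary(points)
grid = grid_mean_depth(points, lon0=-158.0, lat0=18.5, cell_deg=0.5)
```

In this toy input the single deep value pulls the mean of its cell far from its neighbour, which is the same kind of contrast that makes anomalous soundings stand out in a gridded display.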
Figure 3. TIN model of the ocean floor surface in the study area. The TIN was created using the bathymetry values of 71 294 data points. The well-known seamounts in the area are labeled.

Figure 4. Raster model of the ocean floor surface in the study area. The raster model was created by gridding the bathymetry values of 71 294 data points.

Figure 5. 3D plot of 71 294 data points in the study area. The vertical axis shows the bathymetry values.

Figure 6. 3D bathymetric surface of the ocean floor in the study area. It was generated using 71 294 data points. The anomalous data points create the spikes that protrude downwards from the basement of the ocean floor surface.

Data analysis

Visualization reveals spatial patterns among collections of organized data items, but it is of little help in revealing detailed information about a data point other than its value. Spatial query is a complementary activity to data visualization. It permits the user to find out more details of an individual data point. GIS provides tools for interactive queries such as “What are the characteristics of this point?” Most GIS software allows the user to generate a summary table of selected characteristics of a specific point, which appears in a display window. The point is often identified interactively with the cursor.

In our study area, we were interested in looking more closely at the anomalous data points to probe their characteristics. Using the information tool button, we selected one of these points in order to display a summary table of its characteristics (Figure 7, Table 3). This survey data point was collected by the Lamont-Doherty Earth Observatory, USA, in 1974. No information is available regarding the navigation and bathymetry instruments that were used to collect this data set. On the basis of this information, and from comparing these values with those of neighbouring data points, we concluded that these bathymetry values are largely erroneous. We discarded all 229 data points that were collected in that survey cruise in the study area. We then generated the TIN model of the ocean floor surface without including these points, as shown in Figure 8. Without these points, the depth of the ocean floor in this area is between 351 and 5127 m, with an average depth of 4003.5 m. The standard deviation of these bathymetry data values is 909.7 m.

Figure 7. The display of the table of characteristics of one of the anomalous data points in the study area.

Table 3. The characteristics of the selected point.

Attribute name     Value
Shape              Point
Point–ID           36 690
Longitude          156.95160
Latitude           19.68370
Depth (m)          9 106.0
Year               1974
Cruise–ID          MMW02
NGDC–Num           01100001
Institution        LAMONT
Country            USA
Prim–Nav–System    NA
Sec–Nav–System     NA
Bathy–Inst.        NA
Gravity–Inst.      NA
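The comparison of a selected point with its neighbours, which was done interactively in the GIS, could be automated roughly as in the following Python sketch. It is a hedged illustration and not the procedure used in this study: the search radius, the depth-difference threshold, the function names, and the neighbouring soundings are all hypothetical, and distance is measured naively in degrees. Only the selected point takes its values from Table 3, with the longitude used as printed there. Soundings from the same cruise are excluded from the comparison so that a suspect cruise cannot vouch for itself.

```python
import math
import statistics
from typing import NamedTuple, Optional, Sequence


class Sounding(NamedTuple):
    point_id: int
    lon: float       # degrees, as printed in the summary table
    lat: float       # degrees
    depth_m: float
    cruise: str


def neighbour_median(target: Sounding, soundings: Sequence[Sounding],
                     radius_deg: float) -> Optional[float]:
    """Median depth of soundings from other cruises within radius_deg."""
    depths = [
        s.depth_m
        for s in soundings
        if s.cruise != target.cruise
        and math.hypot(s.lon - target.lon, s.lat - target.lat) <= radius_deg
    ]
    return statistics.median(depths) if depths else None


def is_suspect(target: Sounding, soundings: Sequence[Sounding],
               radius_deg: float = 0.1, max_diff_m: float = 1000.0) -> bool:
    """Flag a sounding that differs strongly from the neighbour median.

    Both default thresholds are arbitrary illustration values.
    """
    median = neighbour_median(target, soundings, radius_deg)
    return median is not None and abs(target.depth_m - median) > max_diff_m


# The selected point of Table 3; every other sounding below is invented.
target = Sounding(36690, 156.9516, 19.6837, 9106.0, "MMW02")
good = Sounding(1, 156.950, 19.680, 4400.0, "DEMO01")
soundings = [
    target,
    good,
    Sounding(2, 156.960, 19.690, 4450.0, "DEMO02"),
    Sounding(3, 156.940, 19.670, 4380.0, "DEMO01"),
    Sounding(4, 157.900, 18.600, 5000.0, "DEMO03"),  # outside the radius
    Sounding(5, 156.955, 19.685, 4420.0, "DEMO02"),
]
median_near_target = neighbour_median(target, soundings, radius_deg=0.1)
```

A median is used rather than a mean so that one erroneous neighbour does not by itself cause a good sounding to be flagged, although with very few neighbours even the median is not reliable.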
Comparison of Figures 3 and 8 shows that a few highly erroneous data points introduce very high frequency noise in the TIN model and create a very uneven surface. The average bathymetry value of the erroneous data points collected in the survey cruise was almost twice the maximum bathymetry value of the rest of the data points. A 3D surface of the ocean floor has been created without the erroneous data points and is shown in Figure 9. The spikes that protrude downwards from the basement of the ocean floor surface in Figure 6 are absent in this figure. Figure 9 thus represents the ocean floor surface of the study area more accurately than Figure 6.
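The effect of discarding one survey cruise, and the comparison between the mean depth of the discarded points and the maximum depth of the remaining points, can be sketched in Python as follows. The cruise identifiers and depth values are invented round numbers chosen for illustration and are not values from the data set; the function names are likewise hypothetical.

```python
import statistics
from typing import Dict, List, Sequence, Tuple

Sounding = Tuple[str, float]  # (cruise_id, depth_m)


def drop_cruise(soundings: Sequence[Sounding],
                cruise_id: str) -> Tuple[List[Sounding], List[Sounding]]:
    """Split the soundings into those kept and those from the dropped cruise."""
    kept = [s for s in soundings if s[0] != cruise_id]
    dropped = [s for s in soundings if s[0] == cruise_id]
    return kept, dropped


def depth_range_and_mean(soundings: Sequence[Sounding]) -> Dict[str, float]:
    """Minimum, maximum, and mean depth of a set of soundings."""
    depths = [d for _, d in soundings]
    return {"min": min(depths), "max": max(depths),
            "mean": statistics.fmean(depths)}


# Invented soundings, for illustration only.
soundings = [
    ("GOOD01", 1000.0),
    ("GOOD01", 4000.0),
    ("GOOD02", 5000.0),
    ("BAD01", 9000.0),
    ("BAD01", 11000.0),
]
kept, dropped = drop_cruise(soundings, "BAD01")
before = depth_range_and_mean(soundings)
after = depth_range_and_mean(kept)

# Mean depth of the dropped cruise relative to the deepest remaining sounding.
ratio = depth_range_and_mean(dropped)["mean"] / after["max"]
```

A ratio close to 2, as in this toy input, corresponds to the situation described above, in which the discarded points are on average almost twice as deep as the deepest remaining point.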
Figure 8. TIN model of the ocean floor surface in the study area. This model was created without using the erroneous data points.

Figure 9. 3D bathymetric surface of the ocean floor in the study area. This model was generated without using the erroneous data points.

Discussion

Here, we have shown how visualization of data sets using GIS helps to identify erroneous data. Errors present in a bathymetry data set arise from different sources. Some errors are introduced at the time of data collection: no measurement instrument is completely error free, and operators collecting or encoding data do not always work with complete accuracy and may therefore introduce errors. A poorly defined geodetic datum in the survey area also introduces errors in the collected data. If a bathymetry data set is built by digitizing nautical charts or ordinary maps, then errors can be introduced during the digitizing process.

Acquisition of marine data is both time-consuming and costly. Data collection is difficult because of fluctuations in the marine environment. The continuous advancement of data-acquisition technologies is improving the quality of bathymetry data, but the rate at which high-quality data are acquired is very low compared with the total volume of low-quality, historic data. Gaps in high-quality bathymetry data sets can be filled with low-quality historic bathymetry data sets if the historic data can be rectified to a quality comparable to that of the high-quality data. The first step in bathymetry data rectification is to identify good and bad data. We refer to the bathymetry data that were collected using modern bathymetry measurement equipment and Global Positioning System (GPS) navigational equipment as good data. Basu and Saxena (in press) have investigated the use of the simulated annealing (SA) global optimization technique to correct very old bathymetry data sets of the South China Sea (SCS) with the bathymetry data set that was collected in 2000 using a single-beam echo sounder and GPS.
Acknowledgements

We thank Dr Narendra K. Saxena, director of the Pacific Mapping Program (PMP), for his valuable support in carrying out this work. This is PMP contribution number 18.
References

Basu, A. 1998. Case study of land and marine data integration using GIS. Surveying and Land Information Systems, 58: 147–155.
Basu, A., and Saxena, N. K. In press. Bathymetry data correction using global optimization method. Marine Geodesy.
Basu, A., and Nalamotu, C. 1997. Marine geographic information system for the exclusive economic zone. Marine Geodesy, 20: 255–265.
Basu, A., and Saxena, N. K. 1999. A review of shallow water mapping systems. Marine Geodesy, 22: 249–257.
Beard, M. K., and Buttenfield, B. P. 1999. Detecting and evaluating errors by graphical methods. In Geographical Information Systems, Vol. 1, pp. 219–233. Ed. by P. Longley, M. F. Goodchild, D. J. Maguire, and D. W. Rhind. John Wiley & Sons Inc., New York. 580 pp.
Blondel, P., and Murton, B. J. 1997. Handbook of Seafloor Sonar Imagery. John Wiley & Sons Inc., New York. 314 pp.
Cox, D. R. 1978. Some remarks on the role in statistics of graphical methods. Applied Statistics, 27: 9.
Environmental Systems Research Institute (ESRI) 1996. ArcView GIS. ESRI, Redlands. 340 pp.
Estep, L. 1993. A review of airborne lidar hydrography (ALH) systems. Hydrography Journal, 67: 25–42.
Kleinrock, M. C. 1992. Capabilities of some systems used to survey the deep-sea floor. In CRC Handbook of Geophysical Exploration at Sea, pp. 35–86. Ed. by R. A. Geyer. CRC Press, Boca Raton.
Lee, S.-M. 2000. Constraining navigation by matching swath bathymetry and gravity measurements at ship track crossovers. Marine Geodesy, 23: 31–53.
Leick, A. 1990. GPS Satellite Surveying. John Wiley & Sons Inc., New York.
Smith, W. H. F. 1993. On the accuracy of digital bathymetry data. Journal of Geophysical Research, 98: 9591–9603.
Whitman, E. C. 1996. Laser airborne bathymetry – lifting the littoral. Sea Technology, 37: 95–98.