ARCGIS GEODATABASE DATA MODEL FOR CAVE SCIENCE

A THESIS PRESENTED TO THE DEPARTMENT OF GEOLOGY AND GEOGRAPHY IN CANDIDACY FOR THE DEGREE OF MASTER OF SCIENCE

By AARON ADDISON

NORTHWEST MISSOURI STATE UNIVERSITY MARYVILLE, MISSOURI JULY, 2006


ArcGIS Geodatabase Data Model for Cave Science Aaron Addison Northwest Missouri State University

THESIS APPROVED

Thesis Advisor

Date

Dean of Graduate School

Date


Abstract

The purpose of this research is to determine whether a usable ArcGIS geodatabase data model can be developed for use in cave science. Traditionally, cave scientists, or speleologists, have collected various data in multiple formats. In many cases, researchers are collecting the same data using different methodologies. This is undesirable not only because of the repeated work but, perhaps more importantly, because many of the more scientifically interesting caves are such fragile environments that they cannot tolerate additional, and especially redundant, data collection. Additionally, the geodatabase provides a common data format for researchers to exchange data when working with colleagues or personnel from agencies that manage caves. An ArcGIS geodatabase data model was developed utilizing ArcCatalog and standard cave feature classifications. The data model was then tested against an existing traditional cave map to determine whether or not the geodatabase model was functional. The results were encouraging, as the data model was able to handle the majority of data types and their accompanying representation. Problem areas discovered during development included cartographic map finishing and the inability of the geodatabase to support multiple geometry types in a single feature class.


TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
ACKNOWLEDGEMENTS
LIST OF ABBREVIATIONS

CHAPTER

I: INTRODUCTION
    The Geodatabase Model
    Research Objectives
    Study Area
    Rationale for Cave Science Data Model

II: LITERATURE REVIEW
    Cave Survey Software
    Application of GIS Related to Speleology
    Literature Review Summary

III: CONCEPTUAL FRAMEWORK AND METHODOLOGY
    Study Area: Great Onyx Cave
    Data Source Descriptions
    Research Methodology

IV: RESULTS AND TESTING
    Conceptual Design
    Logical Design
    Physical Design
        Survey Feature Dataset
        Passages Feature Dataset
        Hydrology Feature Dataset
        CrossSections Feature Dataset
        Profile Feature Dataset
        Object Classes
        Raster Data Classes
        Page Layout Theme
        Supporting Attribute Domains
    Testing the Data Model

V: CONCLUSIONS
    Future Research

APPENDICES
    Appendix A: ArcCatalog View of Cave Science Data Model
    Appendix B: National Speleological Society Map Symbols
    Appendix C: Missouri Speleological Survey Map Symbols
    Appendix D: Test Data Sample: Great Onyx Cave

REFERENCES


LIST OF FIGURES

Figure

1. Study area
2. WALLS shapefile export dialog
3. COMPASS shapefile export dialog
4. Study area detail: Great Onyx Cave
5. Data model thematic layers
6. Geodatabase design structure
7. SurveyStations feature class
8. SurveyVectors feature class
9. SurveyAnnotation feature class
10. SurveyStations featured linked relationship class
11. PassageWalls feature class with subtypes
12. PassageFeatures feature class with subtypes
13. Ceiling channel and Joint attribute domains
14. FloorMaterial feature class with subtypes
15. Speleothems feature class with subtypes
16. PassageAnnotation feature class
17. FeaturePoint feature class with subtypes
18. Streams feature class with subtypes
19. Pools feature class with subtypes
20. Sump Type and Water Quality attribute domains
21. Cross section of plan view
22. CrossSections feature class
23. CrossSectionsNearSurveyStation relationship class
24. ProfileLines feature class
25. ResearchProjects object class
26. Researchers object class
27. ResearchProjecthasResearcher relationship class
28. Internal Annotation attribute domains
29. Import from CAD geodatabase tool


ACKNOWLEDGEMENTS

I wish to thank my friends and fellow cavers for their assistance and support of this research project. In particular, the collective understanding of Bernie Szukalski, Alan Glennon, and David McKenzie on the topics of GIS and cave science is greatly appreciated. Without their input and suggestions, this project would not have been possible. I also need to thank my family for missed time together as I completed my research.

The Cave Research Foundation (CRF) and the National Park Service (NPS) were kind enough to provide the test data and research for the data model. The members of CRF and the NPS staff are truly dedicated to the exploration and management of the world-class cave resources found within Mammoth Cave National Park. I would especially like to thank Mark DePoy and Rick Olson with the NPS for their support, and Dave West and Bob Osburn for their suggestions for the data model.

I would also like to thank my thesis committee members, Dr. Patty Drews, Dr. Ming Hung, and Dr. Rickard Toomey III, for their support and guidance along the way. In addition, I would like to thank all of my instructors, both past and present, for taking the time to teach.

I dedicate this work to my wife Anica, and my daughters Aslee and Anora. They have supported me unwaveringly throughout my studies.


LIST OF ABBREVIATIONS

CAD: Computer Aided Drafting
COTS: Commercial Off-The-Shelf Software
CRF: Cave Research Foundation
CRWR: Center for Research in Water Resources
DOQQ: Digital Orthophoto Quarter Quadrangle
ESRI: Environmental Systems Research Institute
GIS: Geographic Information System
GRASS: Geographical Resources Analysis Support System
GOC: Great Onyx Cave
MrSID: Multi Resolution Seamless Image Database
MSS: Missouri Speleological Survey
MXD: ESRI map document
MXT: ESRI map template
OGC: Open Geospatial Consortium
NAD: North American Datum
NPS: National Park Service
NSS: National Speleological Society
SMAPS: Survey Manipulation, Analysis and Plotting Software
TIF: Tagged Image File
USGS: United States Geological Survey


CHAPTER I INTRODUCTION

Caves have been used by humans for various activities since prehistoric times. People have used caves as shelter, burial sites, storehouses, natural laboratories, and even as places to worship various gods. The scientific study of caves and the cave environment is called speleology (Moore and Sullivan 1978). Speleology crosses many different areas of scientific specialization. Obvious areas include geology and hydrology, but other areas of science are equally important, such as biology, microbiology, paleontology, anthropology, and meteorology. Speleology may be compared with the domain of geography in the sense that it is not bound by a single area of study.

Caves provide a valuable laboratory for science. They have unique environments that provide the setting for unique science. Scientists use caves to study undisturbed environments (White 1988). Biologists are using caves to learn what micro-organisms might be found on Mars (Cole 1999). Still other scientists are studying caves to better understand groundwater contamination (Pfaff and Glennon 2004). Geologists use caves to get a firsthand view of the processes that continue to form the Earth.

The decentralized character of speleology has given rise to a multitude of data gathering and data storage techniques. In many cases, researchers are collecting the same data using different methodologies. This is undesirable not only because of the repeated work but, perhaps more importantly, because many of the more scientifically interesting caves are such fragile environments that they cannot tolerate additional, sometimes redundant, data collection.

Unfortunately, there is no common ground on which the various scientific domains can conduct cross analysis with other domains. Geographic Information Systems (GIS) have shown great promise in addressing this need for the scientific community. Geographic patterns and relationships that were previously unknown or overlooked can be discovered with simple mapping tools within GIS. Analysis of more complex relationships, and even forecasting of behavior or of the presence or absence of conditions, can be conducted with the modeling tools in GIS. The cave science projects that have integrated GIS philosophies and techniques have discovered the usefulness of GIS. Unfortunately, many of these projects are also challenged by the generic nature of data storage in GIS, the steep learning curves of commercially available GIS software, and file management.

All of the scientists studying caves and the cave environment are collecting large amounts of data. Much of this data is spatial in nature. Scientists are recording the location of specific species occurrences, archeological remains, and geological features. The common thread of all data collection is that location is important. Other data being collected include inventory information on environmental conditions and areas that are at risk for contamination. These data are used by scientists looking for certain environmental conditions and the spatial relationships between the features found in the cave environment.

GIS provides many valuable tools for working with this field data. First, it allows the gathered data to be stored in a logical format that makes data maintenance and retrieval easy. Second, GIS provides an environment where cave scientists can visualize their spatial data. Visualization provides a powerful method of data analysis that can reveal data relationships not easily seen in tabular format. Finally, GIS tools allow researchers to leverage other scientists’ data and data collected by governmental agencies to conduct domain-specific research without incurring the significant effort of base data collection such as cave mapping.

Many researchers and governmental agencies are already using GIS in their cave science work. These implementations are not following any established guidelines for data storage or a data model. In some cases, such as the National Park Service (NPS), individual GIS data users are often left to their own data design without strong intra-agency coordination, let alone coordination with other agencies. This can create confusion when individuals move to other park units or when data must be exchanged between parks for research or management projects within a single park.

Data modeling can be as much an art as a structured exercise. Shaw (2005) notes that data modeling has become “more art than science”. However, he follows the statement by comparing data modeling to disciplines such as architecture or engineering, where art and creativity are at the forefront but are only useful when built on a foundation of “rules and knowledge”. Creating a usable data model for cave science or any other domain is a balancing act between the structure of the data model and the nuances of the data being stored and analyzed.

The Geodatabase Model

In 1999, Environmental Systems Research Institute (ESRI), a leading GIS software vendor, introduced a new concept for GIS data storage. The new data structure was called the geodatabase and held great promise not only for data storage, but for data topology and analysis as well. Over the last five years, ESRI and users have developed several different data models for various industries (ESRI 2006a). These data models provide a framework and a starting point for users to begin using the geodatabase for data storage. Data models may also be designed to support more specialized models that extend the usability of the data. One example of this type of extension would be the creation of a geometric network based on a street centerline feature class. Lastly, the data model structure provides a consistent format for researchers and organizations that must exchange data.

Cave science and GIS have both made great advances. As information management and storage become more and more important for speleology, the ArcGIS family of products and the geodatabase model may provide much needed tools to advance science. If a geodatabase model for speleology can be developed, it would provide an important tool for all of the researchers working on cave science.

Research Objectives

The objective of this research is to investigate whether or not a usable ArcGIS geodatabase data model can be created and implemented for speleological science. The term “usable” is an important one. It is possible to develop a model that encompasses the needs of the domain, yet is not practical to implement in real-world research. This research does not endeavor to include every imaginable scenario that may be needed in a cave science data model. Data modelers will often pride themselves on the completeness of their model design, even though the model becomes unwieldy for the community of users that it was designed to help. Esoteric data models also carry excess baggage in the form of unused feature classes, attributes, and relationships. The design goal of a “usable” data model is centered on the premise of including the features and functionality currently being used by the cave research community.

Study Area

The study area for this research is the front section of Great Onyx Cave (GOC). Great Onyx is a large cave situated under Flint Ridge in the northern area of Mammoth Cave National Park, Kentucky (Figure 1). Mammoth Cave National Park is situated in the south central region of Kentucky. Mammoth Cave was established as a National Park in 1941. Although Mammoth Cave and several other caves in the park were used by prehistoric man, Mammoth Cave received Federal protection because of its vast network of interconnected tunnels and passageways. These passages comprise the longest known cave system in the world, at just over 346 miles of mapped passage (Osburn 2005). The entire park area was designated as a World Biosphere Reserve in 1996 by the United Nations.

Great Onyx operated as a commercial cave long before the region was set aside as a national park. Edmund Turner discovered it in 1916. It was operated as an independent commercial cave until 1961, when the property was sold to the National Park Service. A road to the cave and a hotel at the entrance were constructed. Fortunately, the owners of GOC understood the need to protect this significant cave. Today, GOC is publicly accessible only on ranger-led trips operated by the National Park Service. The visitation and notable cave resources lead to significant management concerns. The Cave Research Foundation (CRF) continues to explore and document the cave today under a cooperative agreement with the NPS. Great Onyx was chosen as a test for the cave science data model because it has a complete map of the entrance area, in addition to a number of completed and ongoing research projects.

Figure 1 – Study area (map labels: Great Onyx Cave; Mammoth Cave National Park)

Rationale for Cave Science Data Model

Work on geodatabase data models is widespread compared to work on cave science. ESRI created the geodatabase model and understandably has published the major works discussing the design and usability of the geodatabase data model. As discussed previously, most of this work has been targeted towards large market segments that could benefit from the geodatabase model. Specifically, the information presented by MacDonald (2001) and Arctur and Zeiler (2004) provides guidelines for designing and implementing a new geodatabase model. These steps are outlined and detailed in the Research Methodology section of this thesis.

The ESRI geodatabase was chosen as a foundation for this research in large part because ESRI is the market leader in GIS software (ESRI 2002). There are several other GIS software packages that provide more flexibility for data storage. For example, CAD based GIS software will allow mixed geometry types within a single feature class. While such functionality is attractive, especially in the context of cartographic production, development of a data model for these software packages would almost certainly limit the already niche user base of the data model.

It is difficult to imagine research in the cave science environment that does not rest squarely on spatial data. The research community appears to be slowly embracing GIS philosophy and toolsets. This idea is supported by the fact that the various cave survey programs have started to incorporate basic GIS functionality in their software. Although the geodatabase is subject to the rapidly changing domain of GIS, this file format and data structure is gaining momentum as a replacement for legacy data formats such as ESRI’s shapefile and coverage formats. The geodatabase simplifies file management and data management, and extends the capabilities of the data for the end user.

CHAPTER II LITERATURE REVIEW

Cave Survey Software

Significant research has been done in cave science and in geodatabase models. However, a review of the literature suggests that almost no research activity has taken place in connecting these two areas. GIS is still an emerging concept to many cave science researchers. Cave science is considered a niche science by the greater GIS industry and has gone largely unnoticed by the industry leaders. Geodatabase models have concentrated on more widely used applications such as parcel mapping, utility networks, and transportation systems. Many of these models may have applications to a speleology data model.

Various types of cave survey software have been developed over the last 30 years. In most cases the programs have been developed through the effort of an individual caver with the unique combination of computer programming and cave survey skills. The following is a summary, organized by author, of significant milestones and differing points of view in the context of data modeling of cave data.

The Survey Manipulation, Analysis and Plotting Software (SMAPS) was developed by Doug Dotson in the early 1990s (Dotson 1992). SMAPS was an MS-DOS based software package designed to address all aspects of the cave management process. The software introduced a hierarchical, or tree based, file storage system and the ability to add a geographical reference to the data, both firsts for cave researchers. In later versions of the software, Dotson created a GIS option that could be added to the base module of SMAPS by including an early version of the open source Geographical Resources Analysis Support System (GRASS) GIS. The GIS functions included the ability to add attribute data to the base survey data, display surface contours, and run basic queries. In addition to its on-screen functionality, SMAPS also provided printer and plotter drivers so that researchers could generate hard copy records of their maps.

One of the first software packages to store spatial data for cave researchers was a FORTRAN based program named Ellipse, developed by David McKenzie in the mid 1970s (McKenzie 2006). Ellipse ran on a mainframe computer and was capable of generating not only line plots of collected survey data, but also the associated walls of the passages. The software gained widespread use by cave projects in the southern USA and throughout Mexico. McKenzie has continued development on his software for over thirty years. The software was ported to micro-computers in the 1980s and ultimately to the C++ language running on the Windows platform in 1994 (McKenzie 2006). The Windows version of the software is called WALLS.

WALLS utilizes a tree structure for storage of data files and a custom binary database for storing processed data. This combination of features allows the software to efficiently handle hundreds of thousands of data points. Storing the data files separately has the added benefit of allowing multiple users to work on the data files prior to processing. In addition to storing traditional survey stations, vectors, and geographical reference, WALLS also allows the user to store attribute information in the form of “flags” and “notes”. Flags are typically used to indicate specific areas of interest, while notes are used to add notation text to a feature. In both cases, the attribute data is stored as ancillary text in the data file.

McKenzie has also added the ability to export the four data layers described above, along with the passage walls, to ESRI’s shapefile format (Figure 2). These four data layers are survey stations, survey lines, flags, and notes. Survey stations are defined as point features and represent the traverse points in the cave. Survey lines are line features and connect survey stations. Flags are point features and provide a way to isolate specific stations for the purpose of alternate symbology. The note data layer is also a point feature layer; it provides a way for WALLS to replace survey station names with longer, more descriptive names. Both the flag and note data layers would be replaced with simple attribute data in the context of ArcGIS. The export functionality of WALLS has allowed the cave science community to leverage the analysis tools of ArcGIS in a wide range of applications. Later versions of the shapefile export utility allow the creation of 3D shapefiles that can be used in conjunction with ArcGIS extensions such as 3D Analyst and Spatial Analyst.

Figure 2 – WALLS shapefile export dialog (McKenzie 2006)
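The relationship between the four exported layers and the attribute-based storage suggested for ArcGIS can be illustrated with a minimal Python sketch. It is not WALLS or ArcGIS code: the class names, field names, and sample records are hypothetical, chosen only to show how flag and note point layers could collapse into ordinary attributes on a survey station.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SurveyStation:
    """A traverse point in the cave (point geometry). Hypothetical structure."""
    name: str
    x: float
    y: float
    z: float = 0.0
    # Flags and notes are held as plain attributes rather than separate layers.
    flags: List[str] = field(default_factory=list)
    note: Optional[str] = None


@dataclass
class SurveyLine:
    """A survey shot connecting two stations (line geometry)."""
    from_station: str
    to_station: str


def fold_flags_and_notes(
    stations: List[SurveyStation],
    flag_points: List[Tuple[str, str]],
    note_points: List[Tuple[str, str]],
) -> List[SurveyStation]:
    """Attach flag and note records to their stations, matched by station name."""
    by_name = {station.name: station for station in stations}
    for station_name, flag in flag_points:
        by_name[station_name].flags.append(flag)
    for station_name, text in note_points:
        by_name[station_name].note = text
    return list(by_name.values())


# Invented sample data standing in for an exported survey.
stations = [SurveyStation("A1", 0.0, 0.0), SurveyStation("A2", 10.0, 5.0, -2.0)]
lines = [SurveyLine("A1", "A2")]
flag_points = [("A2", "bat roost")]
note_points = [("A1", "Entrance gate")]

merged = fold_flags_and_notes(stations, flag_points, note_points)
```

After the fold, each station carries its own flags and note, so alternate symbology or descriptive labels can be driven from attribute values instead of from separate point layers.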

COMPASS is another widely used cave survey program. Developed by Larry Fish, it originated, like WALLS, in the 1970s as a mainframe program (Fish 2006). The software was ported to the Apple II in the late 1980s, and finally to Windows in 1994. Fish states that his design goals for the software are to “visualize and analyze” (Fish 2006). These qualities are quite evident in his work. COMPASS is an assembly of modules held together by a workbench interface. The user can input information in the Editor module, pass the information to the Compile module, and finally realize Fish’s visualization and analysis goals in the Viewer module. The Viewer module provides the interface that most researchers would use while interacting with the data. Functions such as attribute symbolization and attribute queries are supported.

Another important module of COMPASS is the CaveBase database module. This module provides attribute data storage in a custom database format created by Fish. The implementation allows the user to define database fields, import data from industry standard database formats, and leverage a custom developed query builder tool to analyze data. Many of these functions are very similar to those found in commercial GIS software such as ArcGIS.

COMPASS also has a built-in export utility for the ESRI shapefile format (Figure 3). The utility allows for control of the layers to be exported and their associated settings. Similar to WALLS, COMPASS provides the ability to export 2D or 3D shapefiles for use with various ESRI extensions. The 2D option is provided to allow compatibility with older versions of ArcGIS that did not support 3D data types. COMPASS also provides the added functionality of allowing the user to export point layers named “feature location” and “feature lines”.

Figure 3 – COMPASS shapefile export dialog (Fish 2006)

The feature layers exported by COMPASS provide a link to data stored in the CaveBase database. This functionality closely resembles the behavior between the geometry of shapefiles and the tabular information typically stored in attribute tables.
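The link between exported geometry and separately stored tabular data can be sketched as a simple key-based join. The following Python example is only an illustration under stated assumptions: the `feature_id` key, the field names, and the sample rows are invented, and the code reads neither CaveBase nor shapefile formats.

```python
def join_attributes(features, attribute_rows, key="feature_id"):
    """Merge each feature with the attribute row that shares its key value.

    Features without a matching row are returned unchanged, which mirrors
    how an unmatched record in a table join keeps only its own fields.
    """
    rows_by_key = {row[key]: row for row in attribute_rows}
    joined = []
    for feature in features:
        row = rows_by_key.get(feature[key], {})
        merged = dict(feature)
        merged.update({name: value for name, value in row.items() if name != key})
        joined.append(merged)
    return joined


# Invented sample data: two point features and one attribute record.
features = [
    {"feature_id": 1, "geometry": (12.5, 40.0)},
    {"feature_id": 2, "geometry": (18.0, 44.5)},
]
attribute_rows = [
    {"feature_id": 1, "kind": "stalagmite", "height_m": 1.2},
]

joined = join_attributes(features, attribute_rows)
```

The geometry stays with the feature, the descriptive fields stay in the table, and the shared key is the only thing that ties them together.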

Applications of GIS Related to Speleology

Cave passages range from very simple single passage systems to extremely complex spongework systems. Some work has been done in flow modeling and networks. The Center for Research in Water Resources (CRWR) team led by Dr. David Maidment has developed a watershed model for surface streams called Arc Hydro (Maidment 2002). The CRWR team has created a data model for defining hydrology networks and streams. The model also supports the inclusion of time series measurements such as staff gauges. All of these features may have valuable application towards a cave science data model in terms of network analysis and spatiotemporal data storage.

Glennon (2001) investigated the use of GIS tools for bookkeeping the large, complex spatial data sets found in some cave systems. His research outlines the variety of data sources utilized in documenting caves and their environments. Glennon leveraged GIS, specifically ArcGIS, in his research on applying morphometric relationships to active flow networks within the Mammoth Cave watershed. He also investigated GIS analysis and quantitative modeling of karst processes.

Pfaff and Glennon (2004) described work on groundwater protection. Their work with the ModelBuilder functionality in ArcGIS 9 is useful as an example of the information products that will be required from a cave science data model. They utilized a geodatabase for storage and analysis; however, they do not appear to have used a structured data model. The data model should be able to support tools such as ModelBuilder. This research also provides good examples of how karst data is used with publicly available data such as land use information and topographic data.

Moyes and Awe (2000) provided another good example of an information product from a cave science data model. In their report on using GIS for spatial analysis of an ancient cave site, they discussed the importance of the spatial component of data inventories. They described how data can be not only stored, but reclassified if needed. The report finished with an analysis produced with ArcGIS tools.

Some work has been done on developing cave science tools within the ArcGIS environment. In 1998 Bernard Szukalski, an ESRI employee and longtime caver, identified the need for a utility to assist the cave research community in translating their cave survey data to an ArcGIS compatible format. ESRI’s shapefile format was chosen as a widely implemented file type, and Szukalski coupled the functionality with basic georeferencing tools to create the CaveTools extension for ArcGIS (Szukalski 2004). This work predated the export routines developed within cave software products such as WALLS and COMPASS, which eventually superseded the CaveTools utility. Still, Szukalski’s work demonstrated that there was demand for GIS tools and capabilities within the cave science community.

There are several other specialized cave survey software programs available and used by cave researchers. These include programs such as Winkarst, WinCAPS, CaveRender, and Survex. All of these programs perform functions similar to those of the software systems detailed above, but do not appear to be as widely used as WALLS or COMPASS.

Literature Review Summary

COMPASS and WALLS are the two most popular cave survey software packages available to researchers. These software packages have been developed over many years by members of the cave community (Szukalski 2004). These packages provide basic data entry and processing functions in support of cartographic efforts. Both software modules have made some data storage efforts similar to those found in commercial GIS software, but neither provides the data analysis functions found in even the most basic GIS software. A structured data model, such as the one outlined by this research, is needed to address this notable lack of functionality. In addition, the reviewed literature indicates that the only GIS file format available for export from the cave survey programs is the shapefile. The shapefile format is adequate for simple data storage, but lacks needed information to participate in more advanced GIS analysis such as network tracing and annotation feature classes. The shapefile format also cannot support the use of subtypes and attribute domains commonly used in current versions of ArcGIS.

Glennon’s research (2001) points out the similarities between subsurface streams and surface streams. This finding would seem to support the idea that parts of the CRWR work may be usable in a cave science geodatabase model. In both the CRWR Arc Hydro data model and Glennon’s research, work focused on the flow modeling aspects of data. However, neither of these studies specifically addresses the data modeling challenges presented when attempting to store spatial data about the cave itself.

The data model should be able to support tools such as ModelBuilder. The research presented by Phaff and Glennon (2004) also provides good examples of how karst data is used with publicly available data such as land use information and topographic data. This data usage is a good example of an information product a cave science data model must support. In addition, the data model should be supportive of data types available in the geodatabase structure. These include incorporating subtypes and attribute domains, relationship classes, network feature types, and topology rules. The work of these researchers provides a solid foundation for extending the development of GIS concepts and tools in the context of cave science. This research is a step towards the goal of a spatial data model for speleologists.


CHAPTER III CONCEPTUAL FRAMEWORK AND METHODOLOGY

The most significant challenge of developing a data model for cave science is to build consensus among the potential users for the content and framework of the model. It is particularly difficult when most of the user community, in this case cave researchers, have only a cursory knowledge of GIS and data modeling in general. Humans are creatures of habit and often are slow to embrace change. When applying technology to a new area of study, it is helpful to bring users along by duplicating traditional data presentation within the new technology. For instance, the casual observer may puzzle as to why a standard CAD line weight of “1” is 0.32 inches, but closer examination reveals that a size “1” pen for the traditional board draftsman matches that width exactly. As the various CAD vendors transitioned users to their digital drafting software, they needed a common point of reference for the end users. Along these lines this research attempts to duplicate the cartographic output that both data creators and data users have established as useful. This information product is often the nexus of the data needed by the cave scientist. To facilitate this effort, the current map symbols from the Missouri Speleological Survey (MSS), and the National Speleological Society (NSS) were implemented in the data model as feature classes, subtypes and attribute domains. Great Onyx Cave was surveyed by CRF, which has adopted the MSS map symbol set as its standard. It is hoped that by incorporating the classifications of the MSS and NSS map symbols that the common point of reference will be established for researchers familiar with traditional methods.


A second challenge is incorporating geometric network tracing in the data model. Existing cave survey software does not support network creation or analysis. Many cave researchers and agencies responsible for cave management have a need to generate cost of travel analysis based on factors such as time, hazards, and fragility (Hale 2005). The three dimensional nature of many caves does not lend itself well to the geometric networking tools found in the base ArcGIS software package. ArcGIS Network Analyst will support the needs of 3D cave network modeling. The development of such custom networking tools is beyond the scope of this research, but it is an objective of this study to design the data model to support such functionality in the future.

Lastly, the data model must be documented using accepted methods. These methods include illustrations showing the thematic organization of the data, the schematic diagrams of data structures, and a written description of the data model. Failure to properly document the model may significantly impact the ability to communicate the purpose and usage of the data model.
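The cost of travel analysis discussed above can be pictured, outside of ArcGIS, as a least-cost search over survey stations connected by survey vectors. The following Python sketch is an illustration only and is not part of the data model; the station names, the edge costs, and the weights that combine time, hazard, and fragility into one cost are all hypothetical.

```python
import heapq

def travel_cost(minutes, hazard, fragility, w_hazard=10.0, w_fragility=5.0):
    """Combine time, hazard, and fragility into one edge cost (hypothetical weights)."""
    return minutes + w_hazard * hazard + w_fragility * fragility

def least_cost_route(edges, start, goal):
    """Dijkstra search over an undirected graph of survey stations.

    edges: iterable of (station_a, station_b, cost) tuples.
    Returns (total_cost, [stations on the route]) or (inf, []) if unreachable.
    """
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, station, path = heapq.heappop(queue)
        if station == goal:
            return cost, path
        if station in visited:
            continue
        visited.add(station)
        for neighbor, step in graph.get(station, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + step, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical survey vectors: (from station, to station, cost)
EDGES = [
    ("A1", "A2", travel_cost(5, 0, 0)),    # easy walking passage
    ("A2", "A3", travel_cost(5, 2, 1)),    # short but hazardous and fragile
    ("A2", "B1", travel_cost(10, 0, 0)),   # longer bypass
    ("B1", "A3", travel_cost(8, 0, 1)),    # bypass rejoins the main passage
]

route_cost, route = least_cost_route(EDGES, "A1", "A3")
```

With these made-up weights the search prefers the longer bypass through B1 over the short hazardous passage, which is the kind of trade-off a cave manager might want the network to expose.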

Study Area: Great Onyx Cave

In addition to the main cave system, there are over 300 “lesser” or small caves within Mammoth Cave National Park (House 2005). These caves range widely in length and complexity. The smallest of caves may only be tens of feet long. The largest caves in this classification are around three miles in known length. Many of these caves were used by Indians inhabiting the area before Europeans arrived. Early settlers mined the onyx and saltpeter from the caves leaving a record of modern man’s usage. Great Onyx Cave is one of these “lesser” caves in the park. The overall size of the GOC map prohibits inclusion in this thesis, but Figure 4 illustrates the level of detailed data collection for the entire cave. Great Onyx Cave has a multitude of uses and a robust data set (House 2005). Implementation of the model on the main cave system would be problematic due to the size of the system and unique challenges it presents as the longest known cave system in the world. Many sections of the system are still being surveyed and processed (Osburn 2005). The wide variation in cave characteristics of Great Onyx provides a good testing environment for a cave science geodatabase data model.

© Cave Research Foundation

Figure 4 – Study area detail: Great Onyx Cave (Gulden 2006)


Data Source Descriptions

All data needed for this research existed. The only new data created was in the context of representing the various features within the geodatabase. The core data used in the data model was provided by CRF member Bob Gulden, cartographer of the Great Onyx Cave map. Great Onyx was surveyed to CRF data standards by Gulden and others over several years. Their efforts represent hundreds of hours of underground field data collection. Gulden then drafted the original map in AutoCAD software (Gulden 2006). Utilizing different layers in CAD, the various graphics were organized into common themes. Still, the features were not attributed with any non-graphical information that would be expected in a GIS environment. The Great Onyx map represents the most complex data to be loaded into the cave science data model.

The cave survey line plot data was in the WALLS format. This data consists of all survey stations and vectors. The standard survey practices at Mammoth Cave do not utilize the flag and note fields discussed earlier in Chapter II. All survey data was converted to shapefiles using the shapefile exporter within the WALLS software (Figure 2). The shapefiles were converted to feature classes with the ArcToolbox functions provided with ArcGIS.

Research data used to test the data model developed for this thesis is based on a census of several existing CRF and NPS research projects in Great Onyx Cave. Although based on CRF and NPS data, the actual research data was not available for testing during the timeframe for this research. While most of the CRF and NPS research has spatial characteristics, none of the data for these research projects currently is associated with a map. The data was incorporated into the geodatabase data model so that it is available for analysis and visualization. These data are currently in either FileMaker Pro or Microsoft Access databases and text file formats.

Raster datasets were also tested with the data model. These data are in various file formats. Topographic data was obtained from the United States Geological Survey (USGS) in a tagged image format (TIF). These files are georeferenced to the Kentucky South State Plane coordinate system in North American Datum (NAD) 83 feet. Digital Orthophoto Quarter Quadrangle (DOQQ) photos were also obtained from USGS in the same projection as the topographic data. The geologic maps for the study area were obtained from the State of Kentucky Geologic Survey in a multiresolution seamless image database (MrSID) format. The raster files depicting geology have no explicit georeference metadata, but were determined to have the same spatial characteristics as the USGS data by inspection.

Research Methodology

The ArcGIS database modeling process follows database abstractions developed in the 1980s (Nyerges 2006). These abstractions are classification, generalization, association, and aggregation. ArcGIS modifies the terminology as follows: classification, subtypes, relationships, and topology, respectively. Zeiler (1999) suggests a geodatabase development methodology that closely follows a broader GIS implementation philosophy presented by Tomlinson (2003). This process includes determining information products, defining objects and graphical representations, matching objects to corresponding geodatabase elements and lastly organizing and implementing the system.

Perhaps the most comprehensive guidelines for geodatabase design are described by Arctur and Zeiler (2004). Their research isolates geodatabase design into three distinct phases. The conceptual design begins the design process by identifying not only the information products the geodatabase will support, but also points to the need for development of thematic layers and scale considerations of the data. The second design phase builds the logical data structure. Attribute fields and spatial characteristics of the design are outlined and assembled into a proposed geodatabase design. Arctur and Zeiler conclude the process with the physical design of the data model. This stage of the process includes reviewing and refining the geodatabase, developing workflows for the thematic layers and documentation of the modeling process.

The development of the cave science geodatabase data model follows the outline described by Arctur and Zeiler (2004) and recommended by ESRI. The process can be more specifically broken down into ten steps adapted from Arctur and Zeiler (2004). The following information describes how this methodology was used in the context of this research.

1. Identify the information products to be produced from the geodatabase. Domain experts were solicited for information using a wide variety of methods. Internet newsgroups focused on cave exploration and speleology were queried for input on the conceptual design of the data model. The speleology discussion group on the NSS website was polled for input from the caver community. Newsgroups dedicated to GIS as applied to caves and karst on the ESRI and Yahoo websites were queried. Several responses were received and suggestions were incorporated into the overall design. Interestingly, many more responses were received offering data for testing purposes or expressing interest in the results produced by this research. Many individual stakeholders were also contacted. These conversations took place by email and phone. Significant insight regarding the philosophy and design of existing cave survey software and data collection methodologies was obtained. Several cave resource managers were also contacted to better understand the types of information products and analysis that are needed by their organizations. It is anticipated that these requirements will greatly expand once the managers realize the full potential of the geodatabase. Lastly, I was able to speak with ESRI representatives to better identify market trends and future possibilities within the geodatabase data structure. One significant trend identified was enhanced cartographic support in future releases of ArcGIS. As discussed earlier, it is difficult to get complete cooperation when designing data models. There appears to be a “build it and they will come” approach to GIS within the cave research population.

2. Identify the base layers needed to support the information products. The thematic base data layers for cartography were identified from MSS and NSS map symbols. Appendices B and C show the NSS and MSS map symbols, respectively. These symbols provide graphic representation for commonly found features in the cave environment. Additionally, these map symbols are widely circulated and generally accepted among US speleologists. The top level cartographic map themes are passages, hydrology, profile and cross sections. The survey theme contains only the survey stations and vectors and is based on information exported from the COMPASS or WALLS cave surveying software. All thematic layers were defined as a graphic data type supported by the geodatabase. The spatial characteristics of the geodatabase are also important at this point of the conceptual design. The spatial extent of each feature dataset within the model was adjusted to accommodate a geographic dataset the size of the entire continental USA at a precision of one centimeter. The actual geographic extent of this XY domain is determined by setting the geographical reference of the geodatabase.
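The trade-off between spatial extent and precision in step 2 can be checked with simple arithmetic. The sketch below assumes that coordinates are stored on an integer grid limited to 2^31 - 1 storage units per axis; that limit, the use of meters as map units, and the rough 4,500 km width used for the continental USA are assumptions for illustration and are not stated in this thesis.

```python
MAX_STORAGE_UNITS = 2**31 - 1  # assumed limit of the integer coordinate grid

def max_extent(precision):
    """Largest span, in map units, that fits the grid at a given precision.

    precision: storage units per map unit (100 means 1/100 of a map unit).
    """
    return MAX_STORAGE_UNITS / precision

CM_PER_METER = 100.0
extent_m = max_extent(CM_PER_METER)   # one-centimeter cells, map units in meters
extent_km = extent_m / 1000.0         # about 21,475 km
USA_WIDTH_KM = 4500.0                 # rough east-west width, an assumption
fits = extent_km > USA_WIDTH_KM       # True under these assumptions
```

Under these assumptions a one-centimeter grid spans far more than the continental USA, so the extent described in step 2 leaves considerable headroom.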

3. Specify the scale ranges needed and data types (point, line, polygon, raster). The scale ranges for all feature classes were designed for large scale representation. Most caves are not visible at conventional smaller scales. Even at the common topographic map scale of 1:24,000, caves may be little more than a thin line on the map. The larger scale representation also eliminated the need for multiple data types for a single feature class. Features that may have been represented as a point feature when using smaller map scales could easily be represented as polygons for the larger map scales. Line features appear to be minimally impacted by the scale range. Annotation feature classes were optimized for a 1:600 scale. The resolutions of the raster datasets utilized in this research were far lower than other data in the geodatabase. As a result, the raster data was stored at maximum available resolution. The geodatabase stores raster data with multi-resolution pyramids so that large datasets render more quickly.
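The effect of map scale in step 3 is easy to quantify. The short sketch below converts a ground width to its drawn width at the two scales named in the text; the 3 meter passage width is a hypothetical value chosen for illustration.

```python
def map_width_mm(ground_width_m, scale_denominator):
    """Width on the printed map, in millimeters, of a feature of known ground width."""
    return ground_width_m * 1000.0 / scale_denominator

PASSAGE_WIDTH_M = 3.0  # hypothetical passage width

at_topo_scale = map_width_mm(PASSAGE_WIDTH_M, 24000)  # 0.125 mm
at_cave_scale = map_width_mm(PASSAGE_WIDTH_M, 600)    # 5.0 mm
```

At 1:24,000 the passage is narrower than a typical pen line, while at 1:600 it is wide enough to carry polygon features and annotation inside its walls.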


4. Describe datasets. The final step in the conceptual design process was to group the feature classes into datasets. As discussed in step 2, the top level feature datasets were identified as passages, hydrology, profile, cross sections, and survey. These datasets are based on the MSS and NSS map symbols and feature classifications. Each of these feature classes is described in detail in the Results section of Chapter IV. Raster datasets were defined as topographic, geologic and aerial photos. These datasets are seamless and are well suited to the nature of the raster dataset. A raster catalog was defined to store scanned survey notes, should they be needed.

5. Define the tabular database structure and any behavior for attributes. Tabular datasets were defined for several feature classes. Effort was made in the creation of feature classes to organize features so that common attribute data could be collected for each feature. It is anticipated that a real world implementation of the data model would not necessarily utilize all available attributes. Subtypes were created for many feature classes to further classify features within various feature classes. The subtypes also assist the user in controlling the behavior and appearance of features. A limited number of attribute domains were defined to limit the possible values for certain tabular attributes. It is expected that as the data model matures, there will be additional attribute domains suggested for incorporation into the model structure. A relationship class was defined between the Researchers object class and the ResearchProject object class. This relationship allows various types of research to be incorporated in the cartographic features of the GIS. A second relationship class was created between the CrossSections feature class and the SurveyStations feature class. This relationship associates each cross section with a survey station.
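The role of subtypes and attribute domains described in step 5 can be illustrated without ArcGIS as a simple validation of attribute values. The sketch below is plain Python, not geodatabase code; the subtype codes, the domain values, and the field names are hypothetical and do not reproduce the actual model.

```python
# Hypothetical subtype codes for a passage feature class.
PASSAGE_FEATURE_SUBTYPES = {1: "Breakdown", 2: "Fault", 3: "Stream channel"}

# Hypothetical coded-value domain restricting a text attribute.
JOINT_DIRECTION_DOMAIN = {"N-S", "E-W", "NE-SW", "NW-SE"}

def validate_feature(feature):
    """Return a list of problems found in one feature record (a dict)."""
    problems = []
    if feature.get("Subtype") not in PASSAGE_FEATURE_SUBTYPES:
        problems.append("unknown subtype code: %r" % feature.get("Subtype"))
    joint = feature.get("JointDirection")
    if joint is not None and joint not in JOINT_DIRECTION_DOMAIN:
        problems.append("value outside domain: %r" % joint)
    return problems

good = {"Subtype": 1, "JointDirection": "N-S"}
bad = {"Subtype": 9, "JointDirection": "northish"}
```

A geodatabase enforces comparable checks through its own subtype and domain definitions, so records such as the second one would be flagged as invalid during editing or validation rather than silently stored.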

6. Define spatial properties for all datasets. All feature datasets share common spatial properties. These properties are then subject to a geographical reference. The geographical reference should not be confused with the spatial properties of the geodatabase. It may be useful to think of the spatial properties as a piece of paper, and the geographical reference as a position on a desk. The combination of these two parameters determines where the paper will be located on the desk. These settings are particularly important when topology rules are defined. Feature classes must share the same spatial properties or these topology rules cannot be enforced.

7. Create prototype geodatabase design. This study has produced a prototype geodatabase design. The prototype is based on the information products collected in step 1, the data types and structure outlined in steps 2-5, and the spatial properties defined in step 6. The prototype geodatabase was created with ArcCatalog. Background information for the data model was collected during the literature review process to better understand how other data models have been designed and implemented in other fields of study.


8. Implement, review, and revise geodatabase design. The cave science geodatabase data model was implemented against a real world set of cave data for Great Onyx Cave. Great Onyx Cave is representative of the vast majority of limestone caves in the world. The data model will be implemented by cave researchers on their own data over time. This will begin a more earnest peer review process. Through the review process revisions will be made to the model and new functionality may be added to extend the model.

9. Develop workflows for data creation and data maintenance. Workflows were designed for importing cave survey data from WALLS and COMPASS. These workflows leverage the ModelBuilder functionality of ArcGIS. Shapefiles are incorporated into their corresponding feature class and attributes are mapped accordingly. Creating cartographic features is a straightforward process that leverages existing editing and data creation tools found in ArcGIS. Importing cartographic features from drafting programs such as CAD is a more complex and problematic process. CAD data was tested as a part of this research, but the overall cartographic effort in the cave research community sorely lacks any type of data standards beyond the map symbol sets. This lack of structure in drawing files must be addressed before effective workflows can be developed for importing cartographic features.
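The attribute mapping part of the import workflow in step 9 can be sketched as a lookup from exported shapefile field names to feature class field names. The sketch below is a plain Python illustration rather than a ModelBuilder model; every field name in it is hypothetical, since the actual WALLS and COMPASS export fields are not listed here.

```python
# Hypothetical mapping: exported shapefile field -> feature class field.
FIELD_MAP = {"NAME": "StationName", "ELEV": "Elevation", "PREFIX": "SurveyPrefix"}

def map_attributes(row, field_map=FIELD_MAP):
    """Rename the fields of one exported record to the target schema.

    Returns (mapped_record, unmapped_field_names) so that fields with no
    target in the data model are reported instead of silently dropped.
    """
    mapped = {target: row[source]
              for source, target in field_map.items() if source in row}
    unmapped = sorted(set(row) - set(field_map))
    return mapped, unmapped

record = {"NAME": "A1", "ELEV": 201.5, "LRUD": "2 3 1 0"}
mapped, unmapped = map_attributes(record)
```

Reporting the unmapped fields separately makes it easier to notice when an export contains attributes that the data model does not yet accommodate.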


10. Document geodatabase design using established methods. The cave science geodatabase data model was documented by several established methods. This thesis serves as written documentation of the model. Future efforts to extend this model or refine the existing model should be able to use this documentation to better understand the philosophy and methods used to create the existing data model. A poster size document illustrating the thematic layers, schematic diagram and geodatabase structure was created to better illustrate the purpose and relationships of the feature classes. This document will allow the users of the data model to easily visualize the organization and components of the model. The poster also functions as a “face” for the data model during discussions with colleagues wishing to implement the data model. Finally, an extensible markup language (XML) schema of the data model is provided so that cave researchers may leverage the data model as a geodatabase template for their own research. This template is compatible with ArcGIS and allows the user to create an empty copy of the geodatabase ready to populate with their own GIS data. The schema preserves all data structure, subtypes and relationship classes.


CHAPTER IV RESULTS AND TESTING

The main result of this research was the development of a core data model for cave science. As discussed elsewhere in this paper, the model is based on information a researcher would expect to see on a traditional cave map. The feature sets in the data model include those published by the NSS and MSS. The overall structure of the data is best presented in a large poster format that illustrates the many roles and relationships of the model. Here the various components of the data model are detailed individually utilizing color schemes established by ESRI.

Conceptual Design

The early steps of the methodology developed in Chapter III state that the information products for the data model should be identified. This process can be expressed in the context of the various thematic data the model needs to support. The themes are described not only in terms of the feature or layer names, but also outline additional metadata. The information product description includes short explanations for how the data will be used within the map, the source of the data, how the data is represented in GIS, spatial relationships to other data, map accuracy and the scale the information product is designed to support.

The cave science data model contains seven different layers or themes. These layers are survey, passages, hydrology, cross sections, profile, raster image base, and page layout (Figure 5). Documentation of the thematic layers is the final product of the first three steps in the research methodology.

Figure 5 – Data model thematic layers


Logical Design

The themes were then expanded to identify how discrete features should be modeled. All vector features were represented by creating feature classes. An ArcCatalog tree view of the data model is illustrated in Appendix A. The geometry feature type is defined as point, line or polygon. Each feature class can have only a single geometry type (MacDonald 2001), so similar features were organized into each feature class. Five feature datasets were used to group similar feature classes. Tabular data for researchers and research projects was represented by object classes. Object classes provide a mechanism for storing data in the model that does not have a spatial component.

The image base thematic layer was divided into four information products. The image themes were identified as aerial photos, geology, topographic and survey data. The aerial photo raster dataset was created so that a seamless mosaic of images covering a given study area may be stored in the geodatabase. Similar raster datasets were developed to store scanned geological maps and USGS topographic map data. A raster catalog was created to store scanned field data or other research findings that may be useful to store in the geodatabase. A raster catalog differs from a raster dataset in that the personal geodatabase only stores a pointer to the raster file and does not store the file itself in the geodatabase (Wayne 2005). This is an important distinction because it directly impacts the file size of the geodatabase and the number of files that must be delivered when colleagues share data.

The cave science geodatabase data model is comprised of sixteen feature classes organized into five feature datasets (Figure 6). The four raster classes and two object classes are also shown in the figure. Three relationship classes and nine attribute domains were developed to support the feature classes and are discussed later in this chapter.

Figure 6 – Geodatabase design structure

Physical Design

Survey Feature Dataset

As shown in Figure 6, each of the feature datasets is comprised of several feature classes. It is useful to expand each of these objects to understand how the data model is designed to function. The core feature dataset is named Survey. The Survey dataset encompasses two main feature classes and a supporting annotation feature class. The SurveyStations feature class stores a geometry type of point and represents each station surveyed in the cave (Figure 7). This information is critical as it often represents the only precisely known locations in the cave which are retrievable. The SurveyVectors feature class stores line type geometry and connects all survey stations (Figure 8). Appendix D illustrates how the various survey features appear graphically.

Consideration was given to creating a topology rule to force all survey vectors to be covered by survey stations, but the rule was not implemented for two reasons. First, topology rules must be uniformly applied across datasets and reduce data flexibility. Secondly, all of the survey data is imported from either the WALLS or COMPASS cave surveying programs and generally is not manipulated once in ArcGIS. Manipulation of data outside of WALLS or COMPASS presents an opportunity for introduction of errors and should be avoided. This is especially important in larger cave systems that may have ongoing exploration. The flexibility of the data model is maintained because new survey data sets can be imported as they become available. If data is complete and no longer needs to be maintained in a cave survey program, the data may be manually modified in ArcGIS. Implementations of this type should be aware that there are no tools for managing precision survey data in the basic ArcGIS platform, and may want to consider the addition of the ArcGIS Survey Analyst extension.

The SurveyAnnotation feature class stores annotation linked to the SurveyStations feature class (Figure 9). The name attribute of each survey station is linked to the class for display. Storing annotation in a feature class provides added flexibility for symbology when creating maps. Creating the feature linked annotation results in the establishment of a relationship class in the geodatabase. This relationship class has cardinality of one-to-many from the SurveyStations point feature class to the SurveyAnnotation feature class (Figure 10). A second relationship class was created between the SurveyStations feature class and the CrossSections feature class. This relationship class is discussed in detail later in this chapter. It should be noted that users not wishing to implement the SurveyAnnotation feature class could still use the basic tools in ArcGIS for labeling.
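Although the topology rule considered above was not implemented, the condition it would have enforced can still be checked on imported data. The following Python sketch tests whether every survey vector endpoint coincides with a survey station; it is an illustration outside ArcGIS, and the coordinates, station names, and tolerance are hypothetical.

```python
import math

def uncovered_endpoints(stations, vectors, tolerance=0.01):
    """Return vector endpoints that do not coincide with any survey station.

    stations: dict of station name -> (x, y, z)
    vectors: list of (from_point, to_point) coordinate tuples
    tolerance: maximum distance, in map units, still counted as coincident
    """
    points = list(stations.values())
    missing = []
    for from_point, to_point in vectors:
        for endpoint in (from_point, to_point):
            if not any(math.dist(endpoint, p) <= tolerance for p in points):
                missing.append(endpoint)
    return missing

STATIONS = {"A1": (0.0, 0.0, 0.0), "A2": (10.0, 0.0, -1.5)}
VECTORS = [
    ((0.0, 0.0, 0.0), (10.0, 0.0, -1.5)),    # both ends on stations
    ((10.0, 0.0, -1.5), (14.0, 3.0, -2.0)),  # far end has no station
]

problems = uncovered_endpoints(STATIONS, VECTORS)
```

Running a check like this after each import reports dangling vectors without imposing a rule on the whole dataset, which is in keeping with the reasons given for leaving the topology rule out.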

Figure 7 – SurveyStations feature class


Figure 8 – SurveyVectors feature class

Figure 9 – SurveyAnnotation feature class


Figure 10 – SurveyStations feature linked relationship class

Passages Feature Dataset

The Passages feature dataset is comprised of five feature classes and contains all of the information for the plan view of the map. The feature classes are PassageWalls, PassageFeatures, FloorMaterial, Speleothems, and PassageAnnotation. The PassageWalls feature class stores line geometry and supports a subtype for the attribute of passage type (Figure 11). Illustrations of these features may be found in Appendices B and C. The subtype implementation helps to enforce data integrity by establishing acceptable attribute values. This is especially useful when several researchers or organizations are working with a common data set or cave map. While the PassageWalls feature class establishes the limits of the cave, it does not represent the features surveyed in the cave. The PassageFeatures, FloorMaterial, and Speleothems feature classes organize and store this type of data.

One important aspect of the development of the data model that must be considered is the spatial representation of features. Many features found in caves are represented in different ways. For example, a single stalactite may be represented as a point feature, while an area of the same cave that is covered with stalactites may be represented with a polygon. This variation in cartographic representation is problematic when trying to store data in a way that it can be meaningfully retrieved. For the purposes of this data model all passage features are represented as polygons. This is based on the fact that all features represent a spatial extent of some size.

Figure 11 – PassageWalls feature class with subtypes

The PassageFeatures feature class stores features with a polygon geometry type. These features represent phenomena in the cave that have been created by primary processes such as flowing streams in the cave, faulting, and breakdown (Figure 12). This would include all of the features commonly grouped as speleogens. The type of passage feature is controlled by a subtype class. The attributes of the feature class include support for joint control direction, ceiling channel type, size, survey station tie-in and a short description. The joint control and ceiling channel data are supported by attribute domains (Figure 13). These domains standardize the values for data that the user can enter for a given attribute. In this implementation of the cave science data model, only features common to limestone caves are supported.

The FloorMaterial feature class is similar to the PassageFeatures feature class with the exception that it handles floor material that may or may not have been a result of primary cave formation. Often, materials such as cobbles, clay, and sand are transported and deposited in various places by cave streams. This same process can also bring debris into the cave. In undisturbed areas it is not uncommon to find vertebrate remains. Water processes may deposit flowstone or reduce larger rocks to gravel. Similar to previously described feature classes, the FloorMaterial feature class implements a subtype to standardize the options to those features represented in either the NSS or MSS map symbols. The type of floor material is the only required attribute, with added support for size, survey station tie-in, and a description field (Figure 14). The design of the survey station field is limited to eight characters to match the format of the station name attribute of the SurveyStations feature class. Matching the design parameters of these two fields allows for data joining when performing data analysis.
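The data joining made possible by the matching eight character station fields can be sketched as a lookup on station name. The example below is plain Python for illustration only; the field names `Station`, `Material`, and `Location`, as well as the sample records, are hypothetical.

```python
STATION_NAME_LENGTH = 8  # field width stated in the text

def join_on_station(features, stations):
    """Attach station coordinates to each feature by matching station name.

    features: list of dicts with a "Station" text attribute
    stations: dict of station name -> (x, y, z)
    Returns (joined, unmatched) lists.
    """
    joined, unmatched = [], []
    for feature in features:
        name = (feature.get("Station") or "").strip()
        if len(name) > STATION_NAME_LENGTH:
            raise ValueError("station name longer than %d characters: %r"
                             % (STATION_NAME_LENGTH, name))
        if name in stations:
            joined.append(dict(feature, Location=stations[name]))
        else:
            unmatched.append(feature)
    return joined, unmatched

STATIONS = {"A1": (0.0, 0.0, 0.0), "A2": (10.0, 0.0, -1.5)}
FLOOR = [
    {"Material": "Clay", "Station": "A2"},
    {"Material": "Sand", "Station": "Q7"},   # no such station surveyed
]

joined, unmatched = join_on_station(FLOOR, STATIONS)
```

Because both fields share the same length and format, a tie-in value either matches a station name exactly or is reported as unmatched, which is the behavior an attribute join in a GIS would rely on.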


Figure 12 – PassageFeatures feature class with subtypes

Figure 13 – Ceiling channel and Joint attribute domains

Figure 14 – FloorMaterial feature class with subtypes

The last major feature class of the Passages feature dataset is Speleothems. The Speleothems feature class stores a polygon geometry type and is supported by a subtype class for valid types of speleothems (Figure 15). Speleothems are often referred to collectively as cave formations. Features such as stalactites and stalagmites fall into this feature class. Other common features found in the Speleothems feature class include shields, rimstone dams and cave coral. The data model supports representation of discrete individual formations as well as larger spatial areas representing several individuals. The available attributes for each speleothem include all of the same field names as FloorMaterial with the addition of a field to record the facing direction of the formation. The latter may be of special interest to researchers investigating speleothems influenced by air flow direction.

Figure 15 – Speleothems feature class with subtypes

The final feature class of the Passages feature dataset, PassageAnnotation, supports the annotation needed for passage features (Figure 16). Its parameters are identical to those of the SurveyAnnotation feature class, except that feature linked annotation is not implemented. Feature linking was omitted because the annotation class supports multiple feature classes concurrently, which feature linking does not allow.

Figure 16 – PassageAnnotation feature class


Hydrology Feature Dataset

The last geographic features traditionally depicted on the plan view of a cave map are the hydrological features. In the case of limestone caves, water and its related hydrological systems are the primary mechanism for cave formation. In the context of the cave science data model, hydrology is classified as a separate feature dataset consisting of three main feature classes and an annotation feature class. This feature dataset is unique because it is the only dataset where a single feature, water, can be represented as a point, a line, or a polygon. Examples of the symbology used to depict these features may be found in Appendices B and C. Supporting all three geometry types is necessary in the data model because water is integrally tied to cave formation. It also plays a major role in many cave ecosystems as the vehicle for energy and food entering the cave environment.

Hydrological point features are stored in the FeaturePoint feature class (Figure 17). The class supports three core attributes. The feature type is controlled by a subtype and classifies rapids or riffles, waterfalls, and well casings. The other attributes supported are the size and height of the feature. The size attribute is a text field allowing the user to enter information such as the length of the riffle or width of the stream at the riffle point. The height attribute stores the vertical change of the overall feature and is a numeric field.

Streams are the feature class within the hydrology feature dataset that supports line type geometry (Figure 18). These features are used to represent flowing water in the cave. Two attribute types are supported by the Streams class. The stream type attribute is supported by a subtype class and the water quality attribute is supported by an attribute domain. The water quality attribute is entered as “Safe” or “Unsafe”. The intended context for this attribute is in relation to human consumption, but this attribute could also be used to indicate whether a particular stream was safe for researchers to conduct investigations.
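The way a subtype restricts the feature type while the size and height fields keep their own data types can be sketched in plain Python. This is only an illustration, not ArcGIS code: the integer codes, class names, and field names are hypothetical, and only the three feature categories and the text/numeric field types come from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class FeaturePointType(IntEnum):
    # Integer codes are hypothetical; only the three categories come from the text.
    RAPIDS_OR_RIFFLE = 1
    WATERFALL = 2
    WELL_CASING = 3


@dataclass
class FeaturePoint:
    feature_type: FeaturePointType
    size: str = ""        # free text, e.g. a riffle length or stream width
    height: float = 0.0   # numeric vertical change of the feature

    def __post_init__(self):
        # Unknown subtype codes raise ValueError, mimicking subtype enforcement.
        self.feature_type = FeaturePointType(self.feature_type)
        self.height = float(self.height)


# Hypothetical waterfall record
fall = FeaturePoint(2, size="stream 1.5 m wide", height=4.2)
```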

Figure 17 – FeaturePoint feature class with subtypes

Figure 18 – Streams feature class with subtypes


The polygon geometry type was used to represent pools, lakes and sumps within the cave. A sump is defined as a cave passage that continues underwater, but cannot be traversed without specialized diving equipment. These features are grouped into the Pools feature class (Figure 19). Each pool feature stores four attributes. The pool type is set by a subtype class. Water depth is stored as a numeric value. The water quality attribute is linked to the same attribute domain as the Streams feature class, so the possible values are “Safe” or “Unsafe”. Sump type is also linked to an attribute domain to indicate whether the sump is “Diveable” or “Not Diveable” (Figure 20).
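The effect of the two attribute domains on the Pools feature class can be sketched as a simple validation routine. The sketch below is plain Python, not geodatabase code; the field names and sample records are assumptions, while the domain values are those given in the text.

```python
# Coded values taken from the text; the field names are hypothetical.
WATER_QUALITY = frozenset({"Safe", "Unsafe"})
SUMP_TYPE = frozenset({"Diveable", "Not Diveable"})

POOL_DOMAINS = {"water_quality": WATER_QUALITY, "sump_type": SUMP_TYPE}


def domain_violations(record: dict, domains: dict) -> list:
    """Return a message for each populated field whose value falls outside its domain."""
    problems = []
    for field, allowed in domains.items():
        value = record.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} not in {sorted(allowed)}")
    return problems


# Hypothetical pool records
good = {"pool_type": "Sump", "depth": 2.5, "water_quality": "Unsafe", "sump_type": "Diveable"}
bad = {"pool_type": "Pool", "depth": 0.4, "water_quality": "Murky"}
```

Sharing one `WATER_QUALITY` set between pools and streams mirrors the way a single attribute domain serves both feature classes.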

Figure 19 – Pools feature class with subtypes


Figure 20 – Sump Type and Water Quality attribute domains

All of the hydrology feature classes are supported by the HydrologyAnnotation feature class. This feature class is not linked to any feature classes and simply provides a way for the data model to support various annotations that may be needed for cartography. The attributes of the feature class are identical to the PassageAnnotation class shown in Figure 16.

CrossSections Feature Dataset

While the plan view is perhaps the most used map information in cave science, cross sections and a profile view can be useful in visualizing complex aspects of the cave environment. One case where cross sections can be especially useful is in illustrating ledges and undercuts that are present, but not obvious in the plan view (Figure 21). Cross sections are normally oriented at right angles to the cave passage, but may be drawn at other angles if needed.


Figure 21 – Cross section of plan view

The ability to provide context for plan view features suggests that cross sections would be a useful addition to the cave science geodatabase data model. Cross sections were implemented by creating the CrossSections feature dataset. The dataset supports two feature classes. The primary feature class is CrossSections, which is spatially represented by polygons. The feature class has four attributes of interest (Figure 22). The overall height and width of the passage may be stored, but are not required fields. All cross sections should have the near station and facing direction fields populated. This information is necessary to correctly position and orient the cross section. All of these attributes are routinely collected as a part of cave surveying and represent no added work for field teams. The CrossSections feature class has a relationship class with the SurveyStations feature class. The relationship cardinality is one-to-many from SurveyStations to CrossSections and links the survey station name attribute (Figure 23).

An annotation feature class named CrossSectionAnnotation was developed to support the CrossSections feature class. This feature class is not linked to any other feature class and functions identically to the PassageAnnotation feature class shown in Figure 16. Any ancillary annotation supporting cartographic representation of cross sectional data should be placed in this feature class.
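The one-to-many relationship from SurveyStations to CrossSections described above can be sketched as a grouping on the station name, with unmatched cross sections reported separately. This is a plain Python illustration under assumed field names and sample data, not the relationship class itself.

```python
from collections import defaultdict


def relate(station_names, cross_sections):
    """Group cross sections under their near station; collect those that cannot be related."""
    known = set(station_names)
    related = defaultdict(list)
    orphans = []
    for xs in cross_sections:
        near = xs.get("near_station")
        if near in known:
            related[near].append(xs["id"])
        else:
            orphans.append(xs["id"])
    return dict(related), orphans


# Hypothetical station names and cross-section records
stations = ["A1", "A2", "A3"]
cross_sections = [
    {"id": 1, "near_station": "A1", "facing": 90},
    {"id": 2, "near_station": "A1", "facing": 270},
    {"id": 3, "near_station": "Z9", "facing": 0},
    {"id": 4, "near_station": None, "facing": 45},
]
related, orphans = relate(stations, cross_sections)
```

One station may carry several cross sections, while a cross section with a missing or unknown station name is flagged because it cannot be positioned on the plan view.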

Figure 22 – CrossSections feature class

Figure 23 – CrossSectionsNearSurveyStation relationship class


Profile Feature Dataset

Profiles are the final component of a cave map. The principle of showing a profile in addition to a plan view is similar to that of cross sections. The profile view runs longitudinally along the survey line. This view is often used to help visualize caves with significant vertical extent or maze-like passages. Beyond cartographic representation, this research was not able to clearly identify information products that are unique to the profile. Accordingly, the data model was designed to support only the graphic representation of the profile.

The Profile feature dataset contains two feature classes. The ProfileLines feature class stores line type geometry and has no user attributes (Figure 24). A relationship class between ProfileLines and SurveyStations was not suitable because profiles cover multiple survey stations and may extend along entire cave passages. The ProfileAnnotation feature class stores ancillary annotation related to the cartographic display of profile data. The design and functionality of the ProfileAnnotation feature class are identical to those of the PassageAnnotation feature class shown in Figure 16.

Figure 24 – ProfileLines feature class


Object Classes

The final component of the cave science data model was the creation of two object classes to store basic information about research and researchers. As stated earlier in this research, the cave science data model is designed to be a core data model. It is not practical to encompass every domain that speleology envelops. The data model should, however, provide a way to “hook” into other data models supporting cave research. The two primary areas for attaching to other data models are the SurveyStations feature class discussed earlier in this chapter and the Research object classes. The two object classes are used to store non-graphical tabular data.

The ResearchProjects object class stores three attributes about each research project in the cave in addition to a unique identification number (Figure 25). The attributes record the nearest survey station, a short description, and the type of research. The type attribute is supported by a subtype class containing different areas of scientific study. The ResearchProjects table is linked to the Researchers object class (Figure 26) through the ResearchProjecthasResearcher relationship class (Figure 27). This relationship requires that every research project have at least one researcher. The relationship runs from ResearchProjects to Researchers. The type of relationship has been set as “many to many” to support projects with more than one primary investigator.

The Researchers table stores basic information about scientists conducting projects in the cave (Figure 26). The table stores the name, email and phone contact information for each researcher. The table is simple by design and may be linked to other databases of information by the unique identifier for each entry.
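The many-to-many link between research projects and researchers, together with the rule that each project has at least one researcher, can be sketched with a junction list of identifier pairs. The identifiers, names, and research types below are hypothetical, and the code is a plain Python illustration rather than the relationship class implementation.

```python
def projects_without_researchers(projects, links):
    """Return ids of projects that appear in no (project_id, researcher_id) pair."""
    staffed = {project_id for project_id, _ in links}
    return sorted(pid for pid in projects if pid not in staffed)


# Hypothetical tables keyed by unique identifier
projects = {1: "Biology", 2: "Paleontology", 3: "Geology"}   # id -> research type
researchers = {10: "Researcher A", 11: "Researcher B"}        # id -> name

# Junction table: project 1 has two investigators, researcher 10 works on two projects
links = [(1, 10), (1, 11), (2, 10)]

unstaffed = projects_without_researchers(projects, links)
```

Here project 3 is reported because it has no entry in the junction list, which is the condition the relationship rule is meant to prevent.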


Figure 25 – ResearchProjects object class

Figure 26 – Researchers object class

Figure 27 – ResearchProjecthasResearcher relationship class

Raster Data Classes

Three raster datasets and a single raster catalog were developed for the data model. The raster datasets provide a container to create seamless mosaics of continuous raster data. These data are classified into one of three themes: Aerial Photos, Geology, and Topographic data. These data are well suited to the raster dataset structure because they are continuous and most often are produced by a single agency. The latter is significant since it provides uniformity when creating the seamless mosaic. The raster datasets have no user definable attributes.

The Survey Data raster catalog provides a structure for referencing cave survey, inventory or other field data within the geodatabase. The raster catalog does not directly store the data, but rather provides a pointer to a location outside of the geodatabase. The raster catalog is useful when two or more researchers are collaborating on a project and need to have similar structures for organizing data. It should be noted, however, that implementing the raster catalog feature of the geodatabase data model requires that all raster documents outside of the geodatabase also be delivered when exchanging data.
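Because the raster catalog only points to files outside the geodatabase, a simple pre-delivery check can confirm that every referenced file is actually present. The sketch below is plain Python with hypothetical file names; it assumes the catalog entries are available as a list of relative paths and does not read a geodatabase.

```python
import tempfile
from pathlib import Path


def missing_rasters(catalog_paths, base_dir):
    """Return catalog entries whose referenced file is absent under base_dir."""
    base = Path(base_dir)
    return [p for p in catalog_paths if not (base / p).is_file()]


# Demonstration with a temporary folder standing in for the delivery package
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sketch_A1.tif").write_bytes(b"")   # only one of two files is packaged
    report = missing_rasters(["sketch_A1.tif", "sketch_A2.tif"], tmp)
```

An empty result would indicate that the external raster documents travel with the geodatabase as the data exchange requires.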

Page Layout Theme

The final theme implemented by the data model (Figure 5) is Page Layout. The layout theme exists outside of the geodatabase structure. It is contained in the map document, also known as the MXD file. MXD files may be generated for a given project and saved as a template or MXT file. This functionality allows the reuse of map elements that are common to all output. These elements may include items such as a north arrow, bar scale, title annotation and neat line. Other annotation can be included on a per project basis.

Supporting Attribute Domains

There are four additional attribute domains that are internal to the geodatabase data model (Figure 28). These domains support the feature linked annotation established between the SurveyStations and the SurveyAnnotation feature classes. There are no user definable attributes in these domains and they should not be manually adjusted. The AnnotationStatus domain indicates whether a particular feature has annotation placed or not. The BooleanSymbolValue domain toggles between yes and no to indicate if user defined symbols are utilized. The HorizontalAlignment and VerticalAlignment attribute domains store the horizontal and vertical position for each instance of feature linked annotation.


Figure 28 – Internal Annotation attribute domains

Testing the Data Model

The cave science data model was tested on a digital cave map of Great Onyx Cave (GOC) provided by CRF member Bob Gulden. The Gulden map was originally drafted in AutoCAD, a Computer Aided Drafting (CAD) software package. CAD software supports a layer based organization of data. Although there is no community standard for cave map drawing layers, most cartographers follow some scheme simply for map organization. The GOC map loosely organizes objects into several layers. The main layers are more generalized than the data model. Layers such as “detail” and “symbols” may have numerous features crossing multiple feature classes. For example, features that would be found in the Hydrology feature dataset are found on the same layer as features that would be in the Passages feature dataset. This is expected given that the GOC map was not drawn with GIS in mind.

Two different methods were tested for importing the GOC map to the data model. The first method simply adds the CAD data to an ArcGIS project. This method is basic and relies on the generic CAD import tool found in ArcGIS. The results of this process were unacceptable as the tool does not appear to support advanced element types, such as Bezier curves, that are commonly found in CAD data such as the GOC map. The second method processed the GOC map with the “Import from CAD” tool in ArcToolBox (Figure 29). This method simplifies the complex element types to geometry that can be stored in the geodatabase. The obvious advantage to this method is that no data are lost as a result of the import process. The import tool also supports attribute data from CAD, but processing the GOC map revealed that no attributes had been created.

Twelve layers were imported and stored in a temporary geodatabase by the process. As expected, these layers did not easily support the cave science data model. Since CAD does not enforce the geometry types that ArcGIS supports, many elements that would ideally map to a single feature class in the data model are represented by both lines and polygons. In these cases, the geometry would need to be recreated as features with the appropriate geometry. Once the geometry was correctly represented in the temporary geodatabase, the various features were copied to the cave science geodatabase using the editing tools found in ArcGIS. This process was straightforward and without complication. The survey data was exported from WALLS and imported directly to the cave science geodatabase without incident. Appendix D illustrates portions of the GOC map after import to the cave science data model.
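The sorting of imported CAD elements into feature classes can be sketched as a lookup on layer name and geometry type. The layer names “detail” and “symbols” come from the GOC map as described above, but the pairings with feature classes and the sample elements are hypothetical, and the code is a plain Python illustration rather than the procedure actually used in testing.

```python
# (CAD layer, geometry type) -> target feature class.
# The pairings below are hypothetical and would differ for each cartographer's layer scheme.
LAYER_MAP = {
    ("detail", "polygon"): "FloorMaterial",
    ("detail", "line"): "Streams",
    ("symbols", "polygon"): "Speleothems",
    ("symbols", "point"): "FeaturePoint",
}


def route(elements, layer_map=LAYER_MAP):
    """Split imported elements into routed features and ones needing manual redrawing."""
    routed, manual = {}, []
    for element in elements:
        target = layer_map.get((element["layer"], element["geometry"]))
        if target is None:
            manual.append(element)
        else:
            routed.setdefault(target, []).append(element)
    return routed, manual


# Hypothetical elements from a temporary import geodatabase
elements = [
    {"id": 1, "layer": "detail", "geometry": "polygon"},
    {"id": 2, "layer": "detail", "geometry": "line"},
    {"id": 3, "layer": "symbols", "geometry": "line"},
]
routed, manual = route(elements)
```

Elements with no matching entry, such as a line drawn where the target feature class expects a polygon, fall into the `manual` list, which corresponds to geometry that would have to be recreated by hand.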

Figure 29 – Import from CAD geodatabase tool

One issue encountered was the geometry types used in drawing the map. CAD software leverages several element types not easily supported by ArcGIS. Complex element types, such as parametric B-spline geometry, are often used by cartographers to represent cave walls and other features. These element types had to be converted to simple lines before importing the data to GIS. The conversion produces a less aesthetic line, which is a problem for cartographic representation.

By far, the most significant problem with testing the prototype cave science data model was the lack of standards for imported drawings. If a cave map were drawn from the beginning using the data model, the data would be created and attributed correctly. The ArcGIS editing environment provides all of the necessary functionality for drawing and data entry. These tools could be further developed by programming to customize them for the cave research community. The imported drawings are more problematic, though, because they have all been drawn in different software packages and by different individuals. Even within the same software the drawing organization and layer scheme is left up to the individual. This makes it necessary to adjust layer mappings for each individual cartographer, depending on which layers and geometry types need to be retained when importing a map. Standards for maps not native to the data model should be established for a given project.

Limitations on the allowable geometry types for features also proved to be cumbersome. Many of the features that were to be stored in the FloorMaterial feature class had been created using line tools. The FloorMaterial feature class only supports polygonal geometry. This limitation of only storing a single geometry type would suggest that several feature classes with identical user attributes may need to be supported by the data model. This architecture may create a burdensome data model that is unwieldy for any meaningful analysis using GIS tools.

Several relationship classes were implemented in the data model. This type of structure appears to work well within the model. The cross sectional data was imported from the GOC temporary geodatabase and stored in the CrossSections feature class. The feature class was attributed with the survey station attribute to test the CrossSectionsNearSurveyStation relationship class. The relationship class appeared to work as designed and enforced that each cross section had a survey station reference. The CrossSectionsNearSurveyStation relationship class is a good example of how one feature class can have a mandatory attribute linked to a second feature class. If the map has cross sections that cannot be related to the plan view, those cross sections are not useful in analysis. The relationship class verifies that all cross section features have a station record in the attribute table.

Based on this result, it may be beneficial to establish additional relationship classes for those features that employ the “near station” record.

The object classes were relatively easy to fit into the data model. This functionality was tested by creating simulated research projects and researchers. It was hoped that actual research data could be used in the testing of the model, but ultimately data was not available in the timeframe of this research. The addition of non-graphical information is possible within the data model, but should generally be kept to a minimum to keep file sizes small and more efficient. The flexibility of GIS for linking to digital data outside of the geodatabase provides a powerful method for extending project data to other domains and research collaborators.


The data model was only tested with the ESRI personal geodatabase format. This format is based on the Microsoft Access database format. For most cave projects this format is acceptable since the limit on the Access file size is two gigabytes. Support for the geodatabase at the enterprise level is only possible by implementing ArcSDE. This middleware software brokers data transactions between the geodatabase and the user(s). ArcSDE requires specialized knowledge and hardware to operate. These requirements put this type of enterprise implementation out of reach of many end users, especially those working independently on cave science.


CHAPTER V CONCLUSIONS

The research objective for this thesis was to develop a usable ArcGIS geodatabase data model for cave science. Because speleology is a wide field of study, the data model focused on a core set of features that would be useful to all researchers. These features were defined as the cave map symbols published by the NSS and MSS. Comments were solicited from several cave researchers to help verify the data model usability. The process started with the conceptual design of the data model. There was some difficulty in getting feedback from other stakeholders in the cave research community. This may be a result of GIS just beginning to reach a larger user base within the community. Once the information products were finalized, the map symbols were organized into a logical design of feature datasets and feature classes. There were some compromises that had to be made with regard to geometry types and spatial representation of features. This appears to be a somewhat problematic solution because some feature classes would ideally support multiple geometry types. The physical design and creation of the data model was straightforward with ArcCatalog. Ultimately this proved to be the easiest part of the process. Finally the data model was tested with the CAD map of Great Onyx Cave. The testing verified a problem identified during the logical design of the data model. The geometry types used to create features in CAD did not match the geometry types created for the cave science model. This issue must be addressed before legacy data can be easily supported by the data model.


An important limitation toward the utility of the data model being developed here is potential resistance in the targeted community of users. Many cave researchers are slow to change, and many more have not yet discovered GIS as a powerful tool. Acceptance of the data model will take continued communication and ideas by a wide cross-section of actual and potential users.

Future Research

This research is only a first step towards creating a viable data model for speleologists. Most widely used data models take years to mature and have undergone several iterations. Data models should be considered a work in progress, especially in regard to spatial data. The explosive growth and development that the GIS industry has been experiencing over the last several years have yielded new functionality on an almost continual basis. The framework developed and described here can be extended in many directions. Some of these extensions will require custom programming, while others will be contingent on new features of the latest GIS software. The release of ESRI’s ArcGIS 9.2 software slated for late 2006 promises more robust cartographic functionality in the geodatabase (ESRI 2006b). New data structures will be needed to support such operations. These features will likely provide benefits valuable to the data model designed as a part of this research.

Support for automated import of data is needed to support casual GIS users. The export routines from WALLS and COMPASS create files in the shapefile format. These data files can be manually imported into the geodatabase provided that the user has a working knowledge of ArcToolBox. Creation of geoprocessing scripts is possible using the ModelBuilder module of ArcGIS. Establishing scripts would also allow for easy attribute field mapping for WALLS, COMPASS or any other cave software program that may be used on a given project. Similar geoprocessing scripts and ArcObjects code could be developed for common tasks such as searching for survey stations or drawing geometry. Tools such as these would allow non-technical GIS users to leverage the data model and software for their research.

Network modeling is another useful area to consider developing. The SurveyStations and SurveyVectors feature classes support the use of the built-in geometric modeling tools in ArcGIS. However, these tools are very basic and only support two dimensional networks. With the release of the 9.1 version of ArcGIS Network Analyst, it is possible to create non-planar networks. This type of functionality would allow cave researchers and managers to visualize attribute data related to the cave GIS data in ways not previously possible.

Another area for future research centers on added cartographic tools in ArcGIS. Of particular interest is the ability to support different symbology for features at different map scales. This type of functionality is critical to supporting different feature geometry types within a single feature class. Support for more complex geometry types such as B-spline curves would benefit the usability of the data model. Unfortunately, inclusion of such functionality is at the discretion of the software vendor.

Community wide standardization of data collection and cartographic methods is badly needed. As researchers collect more data on more domains this becomes of particular importance. The ability to collaborate is greatly diminished if every project is stored in its own data “silo” and unable to easily interact with other data. The cave science data model is designed to be a core data model for speleology. This data model may be extended into various other domains. Some users may want to extend the model to their particular area of interest such as paleontology, biology, or geology, while cave managers may want to extend the model to areas such as management concerns, interpretation, or maintenance.

Finally, compliance with the standards of the Open Geospatial Consortium (OGC) should be a long term goal for any GIS data model. The OGC supports non-vendor specific data standards. Many of the leading GIS software vendors support OGC in some manner. It is not realistic to expect all speleologists to use ESRI software or any single software package for GIS. The support for OGC standards not only expands the options for which GIS software can be used with the data model, it also expands the opportunities for the cave research community to collaborate.

This research provides a starting point for future development and refinement of the cave science data model. The literature review for this thesis suggests that this is the first attempt to apply the geodatabase structure to traditional cave maps. Work has been done in the areas of groundwater modeling and karst systems, but it did not address the research needs for science in the cave itself. GIS is the next step for cave map creation and analysis. Moving the science beyond simple geometry is critical to exploiting the spatial data as researchers continue to expand their knowledge of caves. I am hopeful that this data model is a step in that direction.


Appendix A

ArcCatalog View of Cave Science Data Model



Appendix B

National Speleological Society Map Symbols


NSS Map symbols reprinted with permission of the National Speleological Society (Dasher 1994)


Appendix C

Missouri Speleological Survey Map Symbols


MSS Map symbols reprinted with permission of the Missouri Speleological Survey (Thomson and Taylor 1991)


Appendix D

Test Data Sample: Great Onyx Cave


Sample of the Survey Feature Dataset testing (figure labels: Survey Annotation, Survey Vector, Survey Stations)


Sample of the Passages Feature Dataset testing (figure labels: Ledge, Slope in floor, Flowstone Area, Breakdown, Passage Wall)


REFERENCES

Arctur, D. and Zeiler, M., 2004, Designing Geodatabases: Case Studies in GIS Data Modeling. (Redlands: ESRI).
Cole, J., 1999, News Notes: Treasures in a Pristine Cave. Available online at: http://www.geotimes.org/oct99/newsnotes.html (accessed 9 July 2005).
Dasher, G., 1994, ON STATION A Complete Handbook for Surveying and Mapping Caves. (Huntsville: National Speleological Society, Inc.).
Dotson, D., 1992, The SMAPS Cave Management System, (Frostburg, MD: Speleotechnologies
ESRI, 2002, COTS GIS: The Value of a Commercial Geographic Information System. Available online at: http://www.esri.com/library/whitepapers/pdfs/cots-gis.pdf (accessed 28 May 2006).
ESRI, 2006a, Data Models. Available online at: http://support.esri.com/index.cfm?fa=downloads.dataModels.gateway (accessed 4 February 2006).
ESRI, 2006b, What’s Coming in ArcGIS 9.2. Available online at: http://www.esri.com/software/arcgis/about/whats-coming.html (accessed 28 May 2006).
Fish, L., 2006, COMPASS History, Goal and Philosophy. Available online at: http://fountainware.com/compass/miscitem.htm#GOALS (accessed 28 May 2006).
Glennon, J., 2001, Application of Morphometric Relationships to Active Flow Networks within the Mammoth Cave Watershed. Masters thesis. Available online at: http://www.uweb.ucsb.edu/~glennon/GlennonThesis.pdf (accessed 8 July 2005).
Gulden, B., 2006, Great Onyx Cave map. Unpublished.
Hale, E., 2005, Geometric Network for Cave Survey Lines. Available online at: http://forums.esri.com/Thread.asp?c=139&f=771&t=172648&mc=0#msgid508264 (accessed 8 June 2006).
House, S., 2005, Cave Research Foundation small caves database. Unpublished.
Klimchouk, A., Ford, D., Palmer, A., and Dreybrodt, W. (eds), 2000, Speleogenesis: Evolution of Karst Aquifers. (Huntsville: National Speleological Society, Inc.).
MacDonald, A., 2001, Building a Geodatabase. (Redlands: ESRI).
Maidment, D., 2002, Arc Hydro: GIS for Water Resources. (Redlands: ESRI).
McKenzie, D., 2006, WALLS Project Editor – Tools for Cave Survey Data Management. Available online at: http://www.utexas.edu/tmm/sponsored_sites/tss/Walls/tsswalls.htm (accessed 28 May 2006).
Moore, W. and Sullivan, G., 1978, Speleology: The Study of Caves. (St. Louis: Cave Books).
Moyes, H. and Awe, J., 2000, Spatial Analysis of an Ancient Cave Site. Available online at: http://www.esri.com/news/arcuser/1000/cave.html (accessed 8 July 2005).
Nyerges, T., 2006, Developing a Geodatabase. Available online at: http://courses.washington.edu/geog461/final_project_06/geodatabase_development.doc (accessed 3 February 2006).
Osburn, B., 2005, Exploration/Survey/Cartography Program Activities Report. Available online at: http://www.cave-research.org/eocrf/eocart.html (accessed 3 February 2006).
Pfaff, R. and Glennon, J., 2004, Working with ArcGIS 9: Building a Groundwater Protection Model. Available online at: http://www.esri.com/news/arcuser/0704/files/modelbuilder.pdf (accessed 8 July 2005).
Shaw, T., 2005, There is a Lot of New Stuff to Say About Data Modeling. Available online at: Http://www.dmreview.com/editorial/newsletter_article.cfm?nl=dmdirect&articleId=1022729&issue=20157 (accessed 28 May 2006).
Szukalski, B., 2004, ESRI Cave and Karst News #9, July 2004. Available online at: http://www.esri.com/industries/cavekarst/news_community/cavekarst_enews_0704.html (accessed 3 February 2006).
Thomson, K. and Taylor, R., 1991, The Art of Cave Mapping. Missouri Speleology 31(14).
Tomlinson, R., 2003, Thinking About GIS: Geographic Information System Planning for Managers. (Redlands: ESRI).
Wayne, C., 2005, Managing Rasters in a Personal Geodatabase. Available online at: http://www.esri.com/news/arcuser/0705/files/managerasters.pdf (accessed 23 June 2006).
White, W., 1988, Geomorphology and Hydrology of Karst Terrains. (New York: Oxford University Press).
Zeiler, M., 1999, Modeling Our World: The ESRI Guide to Geodatabase Design. (Redlands: ESRI).
