Spatial Datamodel Based Schema Creation for the ...

Spatial Datamodel Based Schema Creation for the Forest Change Phenomenon

Dr. K. R. Manjula
Senior Assistant Professor, CSE Dept., School of Computing, SASTRA University, Tirumalaisamudram, Thanjavur, Tamilnadu, India [email protected]

Dr. Singaraju Jyothi
Professor, Computer Science Dept., School of Sciences, Sri Padmavati Mahila Viswavithyalayam, Tirupathi, Andhra Pradesh, India [email protected]

Abstract: Geographic data consist of location and extent, which are not directly supported by a standardized modeling language such as the Unified Modeling Language (UML). Developing broad standard data models is therefore challenging: as the level of complexity rises, the resultant database schema becomes unusable. This paper addresses the need for well documented and organized data based on standard formats. UML is extensively used as a CASE tool notation in both academia and industry, and it has become a standard modeling language defined by the Object Management Group (OMG) and the International Organization for Standardization (ISO). UML covers simple concepts, is intuitive and is easy to use. However, associations between objects, definitions of constraints, generalization among data concepts and quality factors must also be considered when modeling geographic data. To address this issue, this paper proposes a UML based data model, extended to accommodate spatial data derived from remote sensing images. The solution is implemented through stereotyped classes that represent spatial data in a straightforward manner, including support for geometries as well as the definitions of attributes, objects, classes and the association relations among them. UML profile based models are created for the features Road, Forest, Waterbody, Builtup, Cultivation, Mining and Wasteland, and Extensible Markup Language (XML) code is generated and stored in XML Metadata Interchange (XMI) format. Using ArcGIS support, this code is then converted into schemas, and finally the schema is populated with data that is used in the study of deforestation factors using GIS and Remote Sensing. The resulting database is physically implemented, and a few of its tables are presented here.

Keywords: Data Model; OMG; OCL; Spatial Data Schema; Stereotyped Classes.

I. INTRODUCTION

A huge amount of data has been accumulated over the years, and much of it is duplicative, unformatted and disorganized. Such data was developed in an ad hoc fashion, with little or no coordination or consideration of the relations among data. Data standards, where developed, reflected the interests of small groups of individuals rather than priority areas. This has led to redundant data, which causes problems with data quality and a lack of understanding of what data is available and where to find it. The appropriate selection of a structured representation for a problem is the most important factor in its solution. The growing complexity of engineering tasks creates a need for tools that offer high quality situation monitoring and comprehension, as well as prediction of our future line of action.

978-93-80544-12-0/14/$31.00 © 2014 IEEE

A Geographic Information System (GIS) is precisely such a tool: it supports the implementation of various types of projects aimed at economical, high quality decision making. According to a general definition, a GIS hosts a powerful set of tools for collecting, storing, searching, transforming and presenting real world spatial data. The capacity of any information system typically depends on its data model. A rigorously defined data model is therefore essential, and building one requires domain specific basic concepts to be considered. This paper elaborates those concepts with the help of the Unified Modeling Language (UML).

Advantages
• The UML specification of this data model completely validates the newly created profile.
• It provides a framework to express complex topological constraints.
• It balances extended UML against intuitive notations, covering the important spatial aspects while remaining simple to use.

It has been demonstrated how, by means of a visual representation, database users (managers, mining engineers, geologists) can have a clear picture of the entire system.

II. REVIEW OF LITERATURE

Anders Friis-Christensen [1] proposes requirements for geographic data modeling and focuses on the spatiotemporal properties and roles of geographic objects. Changcheng Dong et al. [2] suggest an approach to spatial data generalization based on object oriented techniques, using the Object Model and Dynamic Model of the Object Modeling Technique (OMT); the relationships between different feature classes are examined and the generalization processes are analyzed. Jean Brodeur et al. [6] suggest the use of geospatial repositories to store the conceptual content of object oriented application database schemas and dictionaries aligned with international standards in geographic information (ISO/TC 211 and the Open GIS Consortium (OGC)). Jugurta Lisboa-Filho et al. [7] devised a UML profile that standardizes domain-specific modeling through a structured and precise UML extension, usable across the entire UML infrastructure. Kehe Wu et al. [9] propose a method for modeling power spatial data based on the object-relational (OR) model and discuss methods to organize, manage and query spatial data. Michael F. Worboys et al. [14] suggest the specific Object Oriented (OO) data model IFO and discuss current research issues and directions. Myoung-Ah Kang et al. [15] describe a specific formalism for geographic database design and a matching language for the expression of topological constraints, with a precise and easy to use textual representation of type constraints. Tomašević Aleksandra et al. [16] outline a methodology for the development of a spatial database that covers exploration works, sample analysis, infrastructure within the mine and in its environment, excavating and loading equipment, etc. According to Cliff Kottman and Carl Reed [18], the OpenGIS Specification serves as the technical basis for the Technology Development Process, which is realized according to a plan and schedule developed by the Technical Committee. Sajayasree K. K. [19] describes how formal languages have been developed to specify unambiguous constraints.

This project is economically feasible and achieves a cost benefit by supporting customization of the database according to need and by supporting reusability; it thus reduces the cost of constructing the database. Creating the collection of dataset types is the first step in designing and building a geodatabase. In this project, these elements are designed through an appropriate model with the help of ArcGIS CASE tools. The project discusses how to create each component of the geodatabase schema in UML using Enterprise Architect, and how to use the CASE tools in ArcCatalog to generate the schema for the database from the UML design.

III. METHODOLOGY

A. Data Collection
According to a general definition, a GIS is a powerful set of tools that supports users in creating queries, analyzing spatial information, managing data and maps, and presenting the results of these operations. As a first step in the process, a study area is defined for the collection of data.

1) Study Area: The study spans an area of 5000 square kilometers. The study area boundary in lat-long is E 79 39" to E 78 45" and N 13 35" to N 14 33", and its district outline is shown in figure 3.1 [12].

Fig. 3.1: Map Showing the District Outline Containing the Study Area

The following figures show the toposheets (figure 3.2 (a) and (b)), the path row coverage (figure 3.3 (a)) and the scene (figure 3.3 (b)) of the satellite for the study area.

Fig. 3.2 (a), (b): Toposheet showing forest region of Study Area

Fig. 3.3 (a), (b): Reference Map of the study: Path Row Coverage of A.P. and Scene

B. Data Preparation
This paper extends previous work; the detailed process of extracting features from the Remote Sensing (RS) images is described in the base paper titled “Construction of Spatial Dataset from Remote Sensing using GIS for Deforestation Study” [11]. The main aim is to determine the land use and land cover changes due to anthropogenic activity by studying satellite images from different decades (1991, 2001 and 2011), to assess their impact on the forest, and to define the factors of the deforestation process [4]. After digitization [12] and feature extraction [11], a set of individual Land Use Land Cover feature categories on the ground was obtained; this paper uses these feature classes, referred to as deforestation factors, in the specified study area. The final feature classes extracted in [12] [13] are listed in table 3.1.

Table 3.1: Feature classes extracted from classified remote sensing image

C. Data Design
There are three general strategies to create a geodatabase [17]:
1. Migrating existing databases to the geodatabase.
2. Using tools in ArcCatalog and ArcToolbox to create the schema for the geodatabase design.
3. Using UML and the CASE tools subsystem of ArcGIS to generate the schema.

2014 International Conference on Computing for Sustainable Global Development (INDIACom)


Among these, this paper follows the third strategy, outlined in figure 3.4, which is achieved in two stages.

Figure 3.4: Design and development of a geodatabase using UML case tool

1) Create a UML Model: UML models are first created using Enterprise Architect and exported to intermediate Extensible Markup Language (XML) files; these XML files are then used to create the spatial database schema [5].

2) Schema Design and Generation: The general strategy for geodatabase design using UML and CASE tools involves defining the entire schema for the geodatabase, generating that schema from the XML Metadata Interchange (XMI) file and then populating the schema with data. Once the schema has been generated, it can be customized to build the database and populated with existing data.

3) Modeling Database Structure: Geodatabase elements such as feature tables, feature classes and their relationship classes follow rules that govern where they are stored relative to each other in the geodatabase. They are defined in UML packages that represent the geodatabase and its feature datasets. The UML package contains:

Structural Elements
• Feature datasets
• Geometric networks
• Feature classes
• Relationship classes
• Fields
• Subtypes

Parameterized Behavior Elements
• Domains
• Connectivity rules
• Relationship rules

Custom Behavior
• Custom features
• Feature class extensions
• Custom interfaces

The very first step in the design is to create a package that holds the data features, as shown in figure 3.5.

Figure 3.5: Creation of package by selecting model wizard

The process then proceeds to create a metaclass, a metaclass based class (figure 3.6) and a subclass, and ends with saving the profile (figure 3.7).

Figure 3.6: Creation of child class and extends relationship

D. UML Profile Generation
The general steps followed for the package are presented here as an aid to understanding the work carried out in this paper [3] [7].
1. Create a Package diagram using the Profile page of the Enterprise Architect UML Toolbox.
2. Add the required number of stereotypes and metaclasses to the package, since a new data type must be defined for each deforestation factor.
3. Redefine predefined tag types, assign them to stereotypes using the tagged value connector, and define stereotype tagged values.
4. Based on the requirements, define stereotype constraints and, if required, add enumeration elements.
5. Add shape scripts, set the default appearance or modify it, and finally export the UML profile.

Figure 3.7: Saving UML profile
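The profile generation steps of Section D can be illustrated outside the modeling tool. The following is a minimal sketch under stated assumptions: the names `TaggedValue`, `Stereotype`, `build_profile` and `export_profile`, the default geometry type and the XML element names are all hypothetical, and the output does not reproduce the Enterprise Architect profile format or real XMI.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class TaggedValue:
    """A tagged value attached to a stereotype."""
    name: str
    tag_type: str
    default: str = ""


@dataclass
class Stereotype:
    """A stereotype that extends a UML metaclass."""
    name: str
    metaclass: str = "Class"
    tags: list = field(default_factory=list)
    constraints: list = field(default_factory=list)  # plain-text rules


def build_profile(factor_names, geometry="Polygon"):
    """Create one stereotype per deforestation factor.

    The geometry default is an assumption; the paper does not list
    the geometry type of each feature.
    """
    profile = []
    for name in factor_names:
        st = Stereotype(name=name)
        st.tags.append(TaggedValue("GeometryType", "enum", geometry))
        st.constraints.append("geometry must not be empty")
        profile.append(st)
    return profile


def export_profile(profile, profile_name="ForestChangeProfile"):
    """Serialize the profile to a simple XML string (illustrative, not XMI)."""
    root = ET.Element("UMLProfile", name=profile_name)
    for st in profile:
        node = ET.SubElement(root, "Stereotype",
                             name=st.name, extends=st.metaclass)
        for tag in st.tags:
            ET.SubElement(node, "TaggedValue", name=tag.name,
                          type=tag.tag_type, default=tag.default)
        for rule in st.constraints:
            ET.SubElement(node, "Constraint").text = rule
    return ET.tostring(root, encoding="unicode")


factors = ["Road", "Forest", "Waterbody", "Builtup",
           "Cultivation", "Mining", "Wasteland"]
profile = build_profile(factors)
profile_xml = export_profile(profile)
```

Keeping the stereotype definitions as plain data and serializing them in a separate function mirrors the separation between defining a profile and exporting it.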


E. Use CASE Tools and UML to Create a Geodatabase Schema
Generally, UML is well suited to object-oriented software design and to documenting the design of a Database Management System (DBMS) table schema, but it is less useful for designing the geographic elements of a geodatabase. One strategy for creating a geodatabase is therefore to employ UML to design the schema and the ArcGIS CASE tools subsystem to generate feature datasets, feature classes, tables and other items. The steps to design the data model with the CASE tools are as follows [17].

1. Design the data model in UML with Enterprise Architect and export the model to an XMI file. To design the database, a package is created under the ArcGIS workspace (Figure 3.8), and the required classes and subclasses are then created within the package, which results in the final model using the profile.

Figure 3.8: Creation of ArcGIS package

After the model is created, the spatial reference and coordinate values are set (Figure 3.9) with WGS 1984, the coordinate reference used here for India. These spatial references are essential because the data models are converted into database schemas for the deforestation factors. The model is then exported to XMI and the final package is published.

Figure 3.9: Setting the spatial reference and coordinate values

2. Add the Schema wizard to ArcCatalog. The XML workspace document is imported into ArcCatalog, where data handling is done under the ArcGIS Desktop based ArcInfo package. The fields are described (Figure 3.10) as per the design.

Figure 3.10: Description of fields

3. Generate a geodatabase schema from the XMI file or Microsoft Repository with the Schema wizard. The major steps are to import the XML file, create a geodatabase (Figure 3.11) in which the deforestation factors are stored as database tables, and finally generate the schema (Figure 3.12).

Figure 3.11: Geodatabase creation

Figure 3.12: Schema generation

Once the schemas are ready, the data is loaded into them. In this way, schemas were created for the features road, forest, builtup, cultivation, mining and wasteland, which are depicted in the implementation part of this paper.
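The export-then-generate workflow of this section can be mimicked in a few lines. The sketch below is only an analogy under stated assumptions: it uses a small hand-written XML model (not real XMI), a hypothetical `generate_schema` helper with invented field names, and SQLite in place of an ArcGIS geodatabase and its Schema wizard.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified model export; element and field names are assumptions.
MODEL_XML = """
<Model name="Deforestation" spatialReference="WGS 1984">
  <FeatureClass name="Forest">
    <Field name="label" type="TEXT"/>
    <Field name="area_sqkm" type="REAL"/>
  </FeatureClass>
  <FeatureClass name="Road">
    <Field name="label" type="TEXT"/>
  </FeatureClass>
</Model>
"""

ALLOWED_TYPES = {"TEXT", "REAL", "INTEGER"}


def generate_schema(model_xml):
    """Return one CREATE TABLE statement per feature class in the model."""
    root = ET.fromstring(model_xml)
    statements = []
    for fc in root.findall("FeatureClass"):
        # Every table gets a key and a text column for the geometry.
        columns = ["id INTEGER PRIMARY KEY", "geometry_wkt TEXT"]
        for fld in fc.findall("Field"):
            field_type = fld.get("type")
            if field_type not in ALLOWED_TYPES:
                raise ValueError(f"unsupported field type: {field_type}")
            field_name = fld.get("name")
            columns.append(f'"{field_name}" {field_type}')
        table_name = fc.get("name")
        column_sql = ", ".join(columns)
        statements.append(f'CREATE TABLE "{table_name}" ({column_sql})')
    return statements


# Apply the generated statements to an in-memory database.
conn = sqlite3.connect(":memory:")
for ddl in generate_schema(MODEL_XML):
    conn.execute(ddl)

tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

Rejecting unknown field types before any table is created is a design choice of this sketch: a faulty model then fails early instead of producing a partial schema.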

IV. IMPLEMENTATION

The model provides classes based on Vector and Raster representations and enables both representations to be associated with the same information layer for system implementation. The following section shows the relation between the abstract and implementation levels of the data model.

A. Creation of Data Models for Deforestation Factors
The procedure specified in the previous section is followed, and class diagrams for the deforestation factors identified in the study are constructed as data models. In this project, UML profile based models are created for features such as Waterbody, Forest, Mining, Cultivation, Wasteland and Builtup; a sample diagram is presented in figure 4.1.

Figure 4.1: Data model of feature: Waterbody

B. Construction of Spatiotemporal Database Using GIS Profile
After the data models specified above are created, each model is exported to an XML file and the XML code is generated. Using ArcGIS support, the code is then converted into schemas, and finally the schema is populated with data that is used in the study of deforestation factors using GIS and RS [8]. The resulting database is physically implemented, and a few of the tables used in the deforestation analysis are presented here [13].

Table 4.1: Land Use Patterns Label Count for the Year 1991

Table 4.2: Predicate Count of Adjacent_To Degraded Polygons for Year 2001
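To make the population step concrete, the sketch below loads a handful of invented rows into a SQLite table and derives a per-year label count of the kind summarized in Table 4.1. The table name `landuse`, its columns, the `label_counts` helper and every row are hypothetical placeholders, not data from the study.

```python
import sqlite3

# Invented sample rows: (year, land-use label). Not data from the study.
SAMPLE_ROWS = [
    (1991, "Forest"), (1991, "Forest"), (1991, "Cultivation"),
    (2001, "Forest"), (2001, "Mining"), (2001, "Builtup"),
]


def label_counts(conn, year):
    """Return {label: count} for one year from the landuse table."""
    rows = conn.execute(
        "SELECT label, COUNT(*) FROM landuse WHERE year = ? "
        "GROUP BY label ORDER BY label", (year,))
    return dict(rows.fetchall())


# Populate an in-memory table, then aggregate the labels for one year.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE landuse (year INTEGER, label TEXT)")
conn.executemany("INSERT INTO landuse VALUES (?, ?)", SAMPLE_ROWS)

counts_1991 = label_counts(conn, 1991)
```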

V. TESTING

The construction of OO software begins with the creation of requirements, followed by design models [10] [21]. Because of their evolutionary nature, the models begin as informal representations of system requirements and evolve into detailed models of classes, class relationships, system design and allocation. At each stage, the models can be tested in an effort to uncover errors before they propagate to the next iteration. As these models cannot be executed, conventional testing methods do not apply. The following section briefly describes the verification methods used in this paper to establish the correctness of the data models. These methods are applied according to predefined specification standards for GIS data. Correctness and consistency of the models are the most important prerequisites for using them to construct geodatabase schemas [10] [20].

A. Correctness of Models
The notation and syntax used to represent the models are tied to the specific analysis and design methods selected for the project. During analysis and design, semantic correctness is assessed by the model's conformance to the real world domain: if the model accurately reflects the real world, it is semantically correct.

B. Consistency of Models
Consistency is judged by considering the relationships among entities in the model. The following methods are used in this project to assess consistency.

1) Semantics Checker: Models need to be created by following a set of modeling rules. The semantics checker is used in this project to verify whether a model stored in the Repository or in XMI has been correctly defined. When verification finds errors, the semantics checker produces a report listing them, and the model is refined accordingly.

2) Object Constraint Language: A UML class diagram is not refined enough to provide all the relevant aspects of a specification, so additional constraints on objects need to be expressed. Object Constraint Language (OCL) is a formal expression language in which constraints are easy to read and write. The evaluation of an OCL expression is instantaneous, i.e., the state of the objects cannot change during evaluation.

The OGC Technical Committee (TC) has developed an architecture called the OpenGIS Abstract Specification. OGC exists, in part, to provide unambiguous models of real world phenomena: features, events and relationships. The OGC is developing technology that addresses the common “behavior” of digital geospatial information containers under the name of the OGC Reference Model.

a) OGC Reference Model: It describes the OGC Standards Baseline, focusing on relationships between the baseline documents. It consists of the approved OGC Abstract and Implementation Standards (Interface, Encoding, Profile, and Application Schema; normative documents) and OGC Best Practice documents (informative documents).

b) OGC Feature Classes: An OpenGIS feature type is specified by its property set. The type specifies the list of properties that distinguish the feature, as specified by the Attribute Schema. A feature is the space it occupies, and this is modeled by an OpenGIS WKS. Every feature with extent has a property named geometry. This paper has created feature classes such as forest, deforest, road, builtup, cultivation and mining based on the OGC standard. Based on the OGC reference model and specifications, this project has set the roles, role types and cardinality among the classes.

This paper uses all of these strategies for testing the models, and the generated database schemas are of good quality, so that the spatial data extracted from the remote sensing images is properly accommodated in the schema. These tables are used in the study of deforestation factors.
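The two consistency checks of this section can be pictured with a small sketch. It assumes a toy dictionary-based model; the rule set, the `check_model` function and the `area_is_positive` invariant are hypothetical stand-ins and do not reproduce the ArcGIS semantics checker or an OCL engine. The invariant is a Python analogue of an OCL expression such as `context Forest inv: self.area_sqkm > 0`, where the attribute name is likewise assumed.

```python
def check_model(model):
    """Return a list of error messages for a toy class model.

    Rules (assumed for illustration): class names are unique, every class
    declares a geometry type, and associations only reference known classes.
    """
    errors = []
    names = [c["name"] for c in model["classes"]]
    for name in sorted(set(names)):
        if names.count(name) > 1:
            errors.append(f"duplicate class: {name}")
    for c in model["classes"]:
        if not c.get("geometry"):
            errors.append(f"class {c['name']} has no geometry type")
    for source, target in model["associations"]:
        for end in (source, target):
            if end not in names:
                errors.append(f"association references unknown class: {end}")
    return errors


def area_is_positive(feature):
    """OCL-style invariant: reads a snapshot of the object, never changes it."""
    return feature["area_sqkm"] > 0


# A deliberately faulty model: Road lacks a geometry, Mine is undefined.
model = {
    "classes": [
        {"name": "Forest", "geometry": "Polygon"},
        {"name": "Road", "geometry": None},
    ],
    "associations": [("Forest", "Road"), ("Forest", "Mine")],
}
report = check_model(model)
```

Collecting all errors into a report, rather than stopping at the first one, matches the described workflow in which the model is refined after reviewing the full list.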

VI. CONCLUSIONS

This paper presents the development of a geodatabase for analyzing deforestation factors, from logical modeling with UML and CASE tools through to the implementation of the spatial database within the ArcGIS environment. A systematic approach is suggested for the development of complex geodatabases: development of a conceptual model, followed by logical data modeling, and finally creation of a physical model, upon which the database is ready for implementation. It has been demonstrated how, by means of a visual representation, database users (managers, mining engineers, geologists) can have a clear picture of the entire system.

VII. FUTURE ENHANCEMENT

The outlined solution is only the first step in creating an integral GIS system for forests as a natural resource. Further research in this area will therefore focus on extending the system with thematic classes related to population and other factors, integration with deposit modeling tools, and a 3D extension. In the era of web publication of data in all areas, the project can also be extended to create a web GIS portal with contents from the database.

REFERENCES

Journals
[1]. Anders Friis-Christensen, “Modeling Geographic Data Using UML”, Department of Computer Science, Aalborg University and National Survey and Cadastre, Denmark, pp. 361-369, 1999.
[2]. Changcheng Dong, Paul Luker, Philippa Berry and Hongji Yang, “Analysis and Modeling of Spatial Objects to Implement OOP for Spatial Data Generalization in GIS”, Proceedings: International Conference on Object Oriented Information Systems, 18-20 December, OOIS’ 95, pp. 189-199, 1996.
[3]. Dingfei Liu, Theodor J. Stewart, “Integrated Object-Oriented Framework for MCDM and DSS Modeling”, Elsevier Journal of Decision Support Systems, Volume 38, Issue 3, pp. 421-434, December 2004.
[4]. Gilberto Câmara, Ricardo Cartaxo Modesto Souza, Ubirajara Moura Freitas, Juan Garrido, “Spring: Integrating Remote Sensing And GIS By Object Oriented Data Modelling”, Elsevier Journal of Computers & Graphics, Volume 20, Issue 3, pp. 395-403, June 1996.
[5]. Indira Mukherjee, P.S. Acharya and S.K. Ghosh, “A Framework for Sharing Heterogeneous Geo-Spatial Information Using Spatial Data Modeling and Enterprise GIS”, http://www.gisdevelopment.net/publication/index.html.
[6]. Jean Brodeur, Yvan Bédard and Marie-Josée Proulx, “Modelling Geospatial Application Databases Using UML-Based repositories Aligned With International Standards in Geomatics”, GIS '00 Proceedings of the 8th ACM international symposium on Advances in geographic information systems, ACM New York, NY, USA, pp. 39-46, 2000, doi: 10.1145/355274.355280.
[7]. Jugurta Lisboa-Filho, Gustavo Breder Sampaio, Filipe Ribeiro Nalon and Karla A. de V. Borges, “A UML Profile for Conceptual Modeling in GIS Domain”, CAiSE 2010 Workshop DE@CAiSE’10, Hammamet, Tunisia, pp. 18-31, 2010.
[8]. Jyothi. S, Saritha, K and Manjula. K. R, “Classification of Deforestation Factors using Data Mining Techniques”, International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR), Vol. 3, Issue 4, pp. 159-172, Oct 2013.
[9]. Kehe Wu, Xuerong Xu, Xiaohui Wang and Yuhan Xu, “A Method for Modeling Power Spatial Data Based on Object-Relational Model”, International proceedings on Computer Science and Information Technology (ICCSIT 2011), IACSIT Press, Vol. 51, pp. 260-264, 2012, doi: 10.7763/IPCSIT.2012.V51.45.
[10]. Kumar M. and Duffy C., “An Object Oriented Shared Data Model for GIS and Distributed Hydrologic Models”, International Journal of Geographical Information Science, Volume 24, Issue 7, pp. 1061-79, 2010, doi: 10.1080/13658810903289460.
[11]. Manjula K. R, Jyothi S, Anand Kumar Varma, Vijaya Kumar Varma S, “Construction of Spatial Dataset from Remote Sensing using GIS for Deforestation Study”, International Journal of Computer Applications, Vol. 31, pp. 26-32, Oct 2011.
[12]. Manjula K. R., Jyothi S, Anand Kumar Varma, “Digitizing the Forest Resource Map Using ArcGIS”, International Journal of Computer Science Issues (IJCSI), Vol. 7, Issue 6, pp. 300-306, Nov 2010.
[13]. Manjula K. R., Jyothi S, “Mining Multilevel Spatiotemporal Association Rules for Analyzing the Factors of Deforestation”, Proceedings of First International Conference on Intelligent Infrastructure and 47th Annual National Convention, CSI, pp. 256-260, Dec 2012. http://hdl.handle.net/123456789/431.
[14]. Michael F. Worboys, Hilary M. Hearnshaw, and David J. Maguire, “Object-Oriented Data modeling for Spatial databases”, International Journal of Geographical Information Systems, Vol. 4, pp. 369-383, 1990.
[15]. Myoung-Ah Kang, François Pinet, Michel Schneider, Jean-Pierre Chanet, Frederic Vigier, “How to Design Geographic Databases? Specific UML Profile and Spatial OCL Applied To Wireless Ad Hoc Networks”, Proceedings of 7th AGILE Conference on Geographic Information Science, Database Technology, Greece, pp. 289-299, May 2004.
[16]. Tomašević Aleksandra, Kolonja Ljiljana, Obradović Ivan, Stanković Ranka, Kitanović Olivera, “Using UML Case Tools For Development of an Open Pit ARCGIS Geodatabase”, Underground mining engineering 20, pp. 89-98, 2012.

Manuals
[17]. Andrew Perencsik, Eddie Idolyantes, Joe Breman, “Creating Custom Features and Geodatabase Schemas”, ARCGIS 9 CASE Tools Manual, ESRI 2005.
[18]. Cliff Kottman and Carl Reed, “The OpenGIS® Abstract Specification”, http://www.opengeospatial.org/legal/, Version: 5.0, 2009.
[19]. Sajayasree K. K., “Object Constraint Language –OCL”, http://www.csci.csusb.edu/dick/samples/ocl.html, http://www.jeckle.de/files/UML12/apndxb.pdf.

Books
[20]. Jos Warmer and Anneke Kleppe, “The Object Constraint Language”, Addison-Wesley, Boston, MA, 2003, ISBN 0-321-17936-6.
[21]. Roger S. Pressman, “Software Engineering”, McGraw-Hill, a business unit of the McGraw-Hill companies, 2010.
