Providing Support for Multiple Collection Types in a Fuzzy Object Oriented Spatial Data Model

Ashley Morris, James Foster University of Idaho Department of Computer Science Moscow, ID, USA 83844-1010 [email protected]

Frederick E. Petry Tulane University Department of Electrical Engineering and Computer Science New Orleans, LA, USA 70118

Abstract Fuzzy set approaches are particularly suitable for issues of modeling uncertainty in spatial data. Previous work of the authors describes a framework to support uncertainty by using an object-oriented approach to modeling spatial data. The original research focused on how to incorporate spatial data into a fuzzy object model. This paper expands upon that by discussing the implications of incorporating all collection types described in the ODMG object database standard in this framework. In addition, we will look at how future collection types may be incorporated into the framework.

1. Introduction With the advent of the modern Geographic Information System (GIS), there is a need to model the underlying spatial data. New models have been proposed in recent years which enrich database models to handle this spatial data [1]. One of the hidden drawbacks of a GIS is that the spatial data stored in the GIS is inherently fuzzy and uncertain [2]. Several authors have suggested that the object oriented data model might be the best available technique to model spatial data [1,3]. There are many benefits to the object oriented paradigm, such as polymorphism, information hiding, incorporation of methods with objects, and the use of collections. As outlined in [4], when we combine a geographic spatial data model [5] with an object oriented data model that supports impreciseness and uncertainty [6], we are able to incorporate all of the benefits of the object oriented paradigm into our spatial data framework. This framework is much more appropriate for spatial data, as it incorporates fuzziness into the spatial features themselves (which are inherently uncertain and imprecise) [7,8,9], rather than using a crisp model for the spatial features and implementing a fuzzy querying technique on top of it.

One of the key elements of [4] is that we are able to represent a spatial coverage as a set of spatial objects, or features. By representing a coverage in the object oriented model as a set, we gain the use of all of the normal set operations, and we can use the notion of a set when performing spatial queries or spatial operations. Spatial queries on spatial coverages, as performed by a GIS, are typically akin to set operations on those coverages [10]: we perform unions of coverages, intersections of coverages, and differences of coverages. In this paper, we outline how we can use the different collection types provided by the object oriented model in a spatial data framework.

2. The ODMG object model The primary purpose of the Object Database Management Group (ODMG) is to set standards for persistence of object-oriented programming language objects in databases [11]. Since an ODBMS integrates the programming language with the database store, it is a much broader model than the relational model. As such, there is enhanced functionality with the ODBMS, but additional constraints are placed upon the model.

2.1. The basis for the ODMG object model The ODMG used the OMG (Object Management Group) model as a basis for their object model. The purpose of the OMG core model was to be a common ground for object request brokers, object databases, and other object based applications. The OMG model is used, but it is extended by adding components such as relationships and additional collection types.

2.2. ODMG object model collections In the ODMG object model, a collection is a structure composed of zero or more distinct elements. Each of these may be a literal, an atomic object, or another collection, so it is possible to have a collection of collections. It follows that a collection may be treated as an atomic object, and in fact the ODMG refers to a collection as a collection object. All elements comprising a collection must be of the same type: the same literal type, the same atomic type, or the same type of collection.

The original OMG model supported three distinct collection types: the set, the bag, and the list. The ODMG supports these three and additionally provides the array and the dictionary. Each of these collection types is a subclass of the class collection. A collection instance may contain no elements, but each collection must be an instance of one of the five collection types.

There are certain operations (methods) associated with the collection type; we discuss only those germane to our discussion. All collections can be tested for cardinality. There are also Boolean methods to determine whether a collection instance is ordered, is empty, or allows duplicates. These three Boolean operations are what we will use to differentiate and generalize our collection types. We can also insert or remove elements from the collection. Now we will look at the specific characteristics of each of the five ODMG collection types.
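As a rough illustration of how the ordered and allows-duplicates tests differentiate the five types, the following Python sketch tabulates the characteristics that the subsections below give for each type. The names `Traits`, `ODMG_TRAITS`, and `traits_to_drop` are hypothetical and are not part of the ODMG interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traits:
    ordered: bool
    allows_duplicates: bool


# Characteristics as described in the text; all names here are illustrative.
ODMG_TRAITS = {
    "set": Traits(ordered=False, allows_duplicates=False),
    "bag": Traits(ordered=False, allows_duplicates=True),
    "list": Traits(ordered=True, allows_duplicates=True),
    "array": Traits(ordered=True, allows_duplicates=True),
    "dictionary": Traits(ordered=False, allows_duplicates=False),  # no duplicate keys
}


def traits_to_drop(kind: str) -> list:
    """Return the characteristics that separate `kind` from a plain set."""
    traits = ODMG_TRAITS[kind]
    dropped = []
    if traits.ordered:
        dropped.append("ordered")
    if traits.allows_duplicates:
        dropped.append("allows_duplicates")
    return dropped


for kind in ODMG_TRAITS:
    print(kind, traits_to_drop(kind))
```

Under this view, generalizing a collection type to the set amounts to arguing that each listed characteristic is immaterial for spatial operations.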

2.3. The Set collection A Set object is an unordered collection of distinct elements. So when we perform an operation on a set, where the set is a collection of spatial objects, it is immaterial in what order these operations are performed. Also, a set collection does not allow duplicates. The insert_element operation is refined, so that if we attempt to add an element to the set that is already a member of the set, then the set will remain unchanged. Since a set collection type does not allow duplicates, this means that for spatial operations, we will not perform any spatial operations on the same object more than once. This makes sense, as our determination of coverages is dependent upon the qualifications of a single element as it relates to our spatial query [4].

The degree of membership of an object in our spatial query result is the same no matter how many times we perform the spatial query upon that object. Therefore, we can generalize that any spatial collection type that allows duplicates can be represented by a collection type that does not allow duplicates, but is equivalent in all of its other characteristics.

Please note that duplicates, for our purposes, are duplicate instances of the same object in a collection. This paper does not address issues of conflation of spatial objects. The framework proposed in [4] allows for multiple representations of the same spatial object in two ways. First, a feature may be stored in the database as a single object, with several representations of its spatial characteristics. Obviously, the GIS modeler has to determine that the multiple representations are, in fact, of the same feature. Therefore we can assume that the issue of conflation was handled manually a priori by the GIS modeler. Second, the database may store several features that are actually multiple representations of the same feature. For example, we may have three representations of a house: one a raster representation at 1:50,000, one a vector representation at 1:100,000, and the third a raster representation at 1:100,000. If the GIS modeler did not determine a priori that these multiple objects were, in fact, the same feature, then it is beyond the scope of the underlying database framework to conflate the objects. To restate, a duplicate, for our purposes, is a repeated occurrence of the same data object in a single collection.

Since a spatial coverage is (in our framework) by definition a set [4], when a spatial operation is performed on a collection, the order of processing of the elements in the collection is irrelevant, as spatial queries and spatial operations are associative. Therefore, we can generalize that any spatial collection type that orders items may also be represented by a collection type that does not order items, but is equivalent in all other aspects. Of course, processing features in a specific order may increase performance.

2.4. The Bag collection A Bag object is an unordered collection of elements in which duplicates are allowed. Here we show that a bag can be generalized to a set.

Let SO be a spatial operation that, applied over a set S, results in a set RS, and let ∪ be a set union operation that, performed on a bag B, eliminates its duplicates such that ∪B = S [∪B = for each x ∈ B, insert x into S, return S]. We prove that performing SO on B is equivalent to performing SO on S, in the sense that ∪RB = RS, where RB is the result bag.

Claim: if SO(S) ⇒ RS and ∪B = S, then ∪RB = RS.

(⊆) Let y ∈ ∪RB. Then ∃x ∈ B such that SO(x) = y. Since S = ∪B, x ∈ S, so SO(x) ∈ RS, so y ∈ RS. Hence ∪RB ⊆ RS.

(⊇) Let y ∈ RS. Then ∃x ∈ S such that y = SO(x), and x ∈ B since ∪B = S. So SO(x) ∈ RB (by definition), so SO(x) ∈ ∪RB. Hence RS ⊆ ∪RB.

Since ∪RB ⊆ RS and RS ⊆ ∪RB, ∪RB = RS. Because of this property of a spatial bag, we can generalize a spatial bag to a spatial set.
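The bag argument can be checked on a small example. The sketch below assumes that the spatial operation acts element by element; `so`, `collapse`, and `apply_to_bag` are hypothetical names, and the string suffix is only a stand-in for a real spatial operation such as buffering.

```python
from collections import Counter


def so(feature: str) -> str:
    # Hypothetical element-wise spatial operation (a stand-in, not a real GIS call).
    return feature + "_buffered"


def collapse(bag: Counter) -> set:
    # The "∪" of the text: insert every element of the bag into a set.
    return set(bag.elements())


def apply_to_bag(bag: Counter) -> Counter:
    # Apply SO to every occurrence in the bag, keeping multiplicities.
    result = Counter()
    for x in bag.elements():
        result[so(x)] += 1
    return result


B = Counter({"lake": 2, "road": 1, "well": 3})  # bag with duplicates
S = collapse(B)                                  # S = ∪B
RS = {so(x) for x in S}                          # SO over the set
RB = apply_to_bag(B)                             # SO over the bag

print(collapse(RB) == RS)
```

The result bag still carries the multiplicities of the input, but collapsing it yields the same coverage as operating on the set directly.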

2.5. The List collection A List object is an ordered collection of elements. According to the ODMG standard, there is no restriction on duplicate elements. So we can define a list as a collection that is ordered, and may contain duplicates (as does the bag). As we have shown previously, when performing spatial operations, the order of operations does not matter, as the resultant object will be a set (spatial coverage). As such, the spatial list can be generalized to the spatial bag, and as we have shown previously, the spatial bag can be generalized to the spatial set. The set, bag, and list are the original three collection types described by the OMG object data model. The ODMG object data model adds two additional collection types: the array and the dictionary.
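To illustrate the claim that ordering is immaterial, the following sketch applies an assumed element-wise operation to every ordering of a small list and confirms that each ordering produces the same coverage. The names `so` and `coverage` are hypothetical.

```python
from itertools import permutations


def so(feature: str) -> str:
    # Hypothetical element-wise spatial operation (a stand-in only).
    return feature.upper()


def coverage(features) -> frozenset:
    # The result of a spatial operation is a set (spatial coverage).
    return frozenset(so(f) for f in features)


L = ["river", "lake", "river", "road"]  # ordered, with a duplicate

# Every ordering of the list yields one and the same coverage.
distinct_results = {coverage(p) for p in permutations(L)}
print(len(distinct_results))
```

Processing the list as a set, `coverage(set(L))`, gives that same coverage, which is the list to bag to set generalization in miniature.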

2.6. The Array collection An array object is an ordered collection of objects that allows duplicates, is dynamically sized, and elements can be located by position. For our purposes, an array has all of the attributes of a list, with the addition of dynamic sizing, and elements being located by position. It is possible to have null values as elements in an array. Any spatial operation on a single null value will result in a coverage set consisting of a single null value. That is, the presence or absence of a null value in a collection is immaterial when performing spatial operations. Because null values contribute nothing to a spatial collection, we can generalize an array object to a list. Thus, using our previous techniques, a spatial array can be generalized to a spatial set.
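A minimal sketch of the array case, assuming that nulls are modeled as Python `None`: dropping the nulls leaves a list, and the list then collapses to a set as before. The function names are hypothetical.

```python
from typing import List, Optional, Set


def array_to_list(array: List[Optional[str]]) -> List[str]:
    # Nulls contribute nothing to a spatial collection, so drop them.
    return [x for x in array if x is not None]


def list_to_set(items: List[str]) -> Set[str]:
    # Order and duplicates are immaterial for spatial operations.
    return set(items)


A = ["well", None, "barn", None, "well"]
as_list = array_to_list(A)
as_set = list_to_set(as_list)
print(as_list, as_set)
```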

2.7. The Dictionary collection The final collection object supported by the ODMG object data model is the dictionary. A dictionary object is an unordered sequence of key-value pairs with no duplicate keys. The intent of the dictionary collection is to provide indexing of objects by a key, and a dictionary collection is a collection of object identifiers and their associated key values. Iterating a spatial operation over a spatial dictionary collection therefore raises no cases that are not already covered by the previous four collection types: the collection is unordered, and the keys serve only to locate the objects. At this point, we have shown that every spatial collection type outlined in the ODMG object data model can be generalized to the spatial set. This is due to the definition of a spatial coverage being a set of spatial objects.

2.8. Future collection types In the future, there are additional collection objects that could be supported by an object data model. We will consider one of these, the tree, and then we will look at how we could provide support for any new generic collection type. The tree is a type of data structure consisting of a set of nodes in an order such that a node may have one parent, and zero or more children [12]. The root node has no parent, and the leaf nodes have no children. Thus we can define a tree recursively such that a tree T = node; T= node and a set of trees.

Since we have established that order and duplicates do not matter when dealing with spatial data collections, these attributes are immaterial for a spatial tree. Also, as we have established for spatial array collections, nulls are immaterial to our framework. So at this point, the tree then consists of an unordered collection of unique spatial objects, which is the same definition as we have for a set. Therefore, for spatial operations, we can generalize the tree spatial collection type to the set spatial collection type. If any new collection types are implemented, we will be able to support them if we can do as we have done with these five ODMG collection types and the tree collection type, and generalize them to the set spatial collection type. Since any spatial operation on a spatial coverage will result in a spatial coverage, and the spatial coverage is by definition a set, we are able to support any new spatial collection type that we may generalize to the set.
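The tree case can be sketched as a traversal that discards structure, duplicates, and nulls. The `Tree` class below is a hypothetical representation chosen for illustration; the text defines the tree only abstractly.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Tree:
    node: Optional[str]                       # a spatial object, or None for a null
    children: List["Tree"] = field(default_factory=list)


def tree_to_set(root: Tree) -> Set[str]:
    # Walk the tree iteratively, keeping only the distinct non-null objects.
    found: Set[str] = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current.node is not None:
            found.add(current.node)
        stack.extend(current.children)
    return found


county = Tree("county", [
    Tree("parcel_a", [Tree("house"), Tree(None)]),
    Tree("parcel_b", [Tree("house")]),        # same object appears twice
])
print(tree_to_set(county))
```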

3. Formal generalization of all collection types to the Set Now, we will formally show how all of these collection types, as well as a directed graph, may be generalized to the set.

An ordered pair <x, y> can be represented in set notation as {x, {x, y}}. (Note: <x, y> = <y, x> iff x = y.)

Let x^k = { [...] if k = 1, else [...] }
Let [...] = [...]

To represent bag B, with m(x) = multiplicity of x in B:
◊: Bags ⇒ Sets
◊(B) = {x^k : x ∈ B and m(x) = k}

To represent list L, where L = (x1, … , xn):
◊(L) = [...]

To represent array A:
◊(A) = {<x, n> : x is at position n in A}

To represent dictionary D:
◊(D) = {<x, k> : k is the key for x}

To represent tree T, where T is defined recursively as (T = node; T = node and a set of trees):
◊(node n) = {n}
◊(n, {T1, … , Tk}) = {<n, ◊(T1), … , ◊(Tk)>}

To represent directed graph G, where G = {[x, y] : ∃ edge E: x •→• y}:
◊(G) = {<x, y> : [x, y] ∈ E}
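The encodings above can be mimicked in code. The sketch below is only an approximation under stated assumptions: Python tuples stand in for the set-theoretic ordered pairs, array positions are taken to start at 1, and the bag encoding pairs each element with its multiplicity, which is an assumed substitute for the paper's multiplicity notation rather than a reproduction of it. All function names are hypothetical.

```python
from collections import Counter


def encode_bag(bag: Counter) -> frozenset:
    # Assumption: (x, k) records that x occurs k times in the bag.
    return frozenset((x, k) for x, k in bag.items() if k > 0)


def encode_array(array: list) -> frozenset:
    # {<x, n> : x is at position n in A}, with positions assumed to start at 1.
    return frozenset((x, n) for n, x in enumerate(array, start=1))


def encode_dictionary(d: dict) -> frozenset:
    # {<x, k> : k is the key for x}
    return frozenset((x, k) for k, x in d.items())


def encode_tree(tree: tuple) -> frozenset:
    # A tree is given as (node, [subtrees]); a leaf encodes to {node}.
    node, children = tree
    if not children:
        return frozenset({node})
    return frozenset({(node, *[encode_tree(c) for c in children])})


def encode_graph(edges: list) -> frozenset:
    # {<x, y> : [x, y] is an edge}
    return frozenset((x, y) for x, y in edges)


print(encode_array(["a", "b", "a"]))
print(encode_tree(("r", [("c1", []), ("c2", [])])))
```

Each encoder returns a `frozenset`, so position, key, multiplicity, and edge direction survive as ordinary set members rather than as properties of the container.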

Since we can generalize all of these collection types to the set, our model can support all of these collection types. Also, since our model treats a spatial coverage as a set of features, we may be able to treat any collection of features as a spatial coverage. As new collection types may emerge in the object oriented data model, our framework for a fuzzy object oriented spatial data model will support those types as long as they can be generalized to the set.

4. Conclusions This is a work in progress. We have built a proof of concept prototype using Java and ObjectStore, implementing sets as our only collection type. Spatial coverages are simply collections of spatial objects, or features. We have taken the model proposed in [4], and shown that we can implement any collection type we desire, including some of those that are not yet implemented, by generalizing them to the set. As we stated at the outset, uncertainty and impreciseness are inherent in spatial data, due to erroneous data, missing data, imprecise data, errors in scale or measurement, and many other causes [7,8]. It is therefore imperative that a spatial data model have a way of storing and representing uncertain and imprecise data, and also a way of informing the user of the degree of certainty of the precision and accuracy of the data. This model addresses these issues by storing the fuzziness as an attribute of the model. Also, we believe that object oriented databases are a natural fit for the storage of spatial data, and that the advantages offered by the object oriented paradigm give many additional benefits to the GIS.

We believe that GIS offer the most significant opportunity for real-world application of fuzzy logic theory, especially as an extension of the work that has been done on fuzzy databases. Finally, we have shown that this framework will explicitly support every current collection type supported by the ODMG object model. In addition, our framework will support directed graphs, trees, and any other collection of spatial objects that can be generalized to the set.

5. Acknowledgments Many thanks to Bill P. Buckles for first raising the issue of multiple collection types with us, which inspired this paper.

References
[1] R. George, B. Buckles, F. Petry, and A. Yazici, “Uncertainty Modeling in Object-Oriented Geographical Information Systems”, Proceedings of Conference on Database and Expert System Applications (DEXA 92), 77-86, 1992.
[2] M. Katinsky, Fuzzy Set Modeling in Geographic Information Systems, Unpublished Master's Thesis, University of Wisconsin-Madison, Madison, WI, 1994.
[3] M. Goodchild and S. Gopal, eds., The Accuracy of Spatial Databases, Taylor and Francis, Basingstoke, UK, 1990.
[4] A. Morris, F.E. Petry, M. Cobb, “Incorporating Spatial Data into the Fuzzy Object Oriented Data Model”, Proceedings of Seventh International Conference on Information Processing and Management of Uncertainty in Knowledge Based Systems (IPMU 98), 604-611, Editions EDK, Paris, 1998.
[5] M. Feuchtwanger, Towards a Geographic Semantic Database Model, Unpublished Ph.D. Thesis, Simon Fraser University, Vancouver, 1993.
[6] R. George, A. Yazici, F. E. Petry, B. P. Buckles, “Modeling Impreciseness and Uncertainty in the Object-Oriented Data Model - A Similarity-Based Approach”, Fuzzy and Uncertain Object-Oriented Databases: Concepts and Models, 63-95, World Scientific, New Jersey, 1997.
[7] H. Couclelis, “Towards an Operational Typology of Geographic Entities with Ill-defined Boundaries”, Geographic Objects with Indeterminate Boundaries (eds. P. Burrough and A. Frank), 45-56, GISDATA Series Vol. 2, Taylor and Francis, London, UK, 1996.
[8] A. Morris and F.E. Petry, “Design of Fuzzy Querying in Object-Oriented Spatial Data and Geographic Information Systems”, Proceedings of 1998 Conference of the North American Fuzzy Information Processing Society (NAFIPS 98), 165-169, 1998.
[9] E. L. Usery, “A Conceptual Framework and Fuzzy Set Implementation for Geographic Features”, Geographic Objects with Indeterminate Boundaries (eds. P. Burrough and A. Frank), 71-86, GISDATA Series Vol. 2, Taylor and Francis, London, UK, 1996.
[10] -, ARC/INFO User’s Guide, Environmental Systems Research Institute, Redlands, CA, 1990.
[11] R.G.G. Cattell, D.K. Barry, eds., The Object Database Standard: ODMG 2.0, Morgan Kaufmann Publishers, San Francisco, 1997.
[12] R. Elmasri, S.B. Navathe, Fundamentals of Database Systems, 2nd edition, 115-128, Benjamin-Cummings, Redwood City, CA, 1994.
[13] Van Gyseghem, R. De Caluwe, “The UFO Database Model: Dealing with Imperfect Information”, Fuzzy and Uncertain Object-Oriented Databases: Concepts and Models, 123-186, World Scientific, New Jersey, 1997.
