An Ordering of Convex Topological Relations
Matthew P. Dube and Max J. Egenhofer
School of Computing and Information Science, University of Maine, 5711 Boardman Hall, Orono, ME, USA 04469-5711
[email protected],
[email protected]
Abstract. Klippel has recently identified topological relativity as an important question for geographic information theory. One way of assessing the importance of topology in spatial reasoning and in spatial theory is to analyze commonplace terms from natural language relative to conceptual neighborhood graphs, the alignment structures of choice for topological relations. Each of the terms analyzed is found to represent a convex set within the conceptual neighborhood graph of the region-region relations, giving rise to the construction of the convex ordering of region-region relations on the surface of the sphere.
Keywords: Conceptual neighborhood graph, topological spatial reasoning, convex relations, spatial language, spatial prepositions
1 Introduction
Geographic information science has been advocating a topological understanding of space to assemble cognitively plausible models that mirror the human understanding of spatial phenomena. Within this realm, such models as the 4-intersection [19], the 9-intersection [14], the Region Connection Calculus [39], and conceptual neighborhood graphs [18,24] contribute symbolically to analyzing spatial scenes for similarity and to distinguishing spatial scenes from one another [9,37]. The formally defined spatial relations are mutually exclusive, yielding atomic relations (i.e., the smallest currency to describe spatial scenes). They may, however, be combined in disjunctions (exclusive ORs) to account for vague scenarios, as often expressed in natural-language terms.
While the relations' conceptual neighborhood graphs have been studied substantially, their cognitive plausibility is still an unanswered question. Mathematically based studies [15,18] and initial psychological assessments [29,30] have demonstrated the utility of the graphs, but as of yet, topological relativity [28] to bridge formal and observed human spatial cognitive processes has not been determined explicitly. As a contribution to the discussion about the plausible value of a conceptual neighborhood graph, we assess the structure of a conceptual neighborhood graph for modeling linguistic spatial terms.
Languages (spoken or signed) have at their spatial core the ability to address not only the axiomatic building blocks per se, but to tie them together into larger groupings [45]. Spatial language can thus be explicit (as in the case of a term like disjoint), or it can be vague (as in the case of along), leading to
uncertainty [40]. Talmy [45] asserts that there seems to be an approximately closed set of universal primitives from which to construct concepts; therefore, constructions like the 9-intersection and the Region Connection Calculus move semantic terms into rigorously defined mathematical terminology that comes straight from topology. Natural-language spatial prepositions have been studied on a computational level [1], but not with the conceptual neighborhood graph as a backdrop, which would offer a rationale for relating the most similar terms. A study of road-and-park relations showed that people link particular constructions to particular language terms, and these constructions, though topologically distinct from one another, are close to each other within the neighborhood graph [36,43]. When the constructions are separated within the graph, the separation is the by-product of prototypical relations, rather than a fundamental rift.
A cognitively plausible conceptual neighborhood graph must keep the atomic relations that constitute a spatial disjunction connected to each other. For this purpose we examine the convexity of disjunctions of atomic relations. Convexity has been applied in GIS for convex hulls [38], object decomposition and reconstruction [10], in the surveyor's formula [8], and in cell trees for geometric data storage [25]. It has made but one appearance in spatial domains pertinent to the conceptual neighborhood graph [4], which studied convex relations of a particular cardinality, based on pre-convex relations [34]. In the temporal domain, convexity has been a criterion to analyze a set of relations similar to the topological ones [2] and their conceptual neighborhood graphs [24] in order to assist in temporally interpreting prose and looking for descriptive consistency [42]. Each convex subgraph of the conceptual neighborhood has been placed into a subset/superset lattice to provide a temporal aggregate neighborhood.
This paper investigates whether natural-language spatial disjunctions of the relations between two simple regions correspond to convex sets of the neighborhood graphs. The side effects of convexity—shape and path redundancy—have important impacts on people’s decision-making and certain spatial motor skills. Decision trees—a graphical structure where all connected subgraphs are convex—represent the optimal decision structure for experts [11]. Corroborative evidence (the mental manifestation of multi-path decisions) has been found to produce increases in learning [22], retention [5], argument construction [49], and memorization [50], and furthermore is an often-cited burden of proof in law [26,32]. On a motor skill level, as people learn to pick up objects at young ages, they acquire the skill of grasping the object as if it were convex, lending preference to that type of object [41]. Also, the judgment of position sees significant benefit from convexity [7]. While both spatial motor skills are dependent upon the convexity of a geometric solid or geometric scenario, the psychological impacts that would be mirrored in a graph structure lend favorably to considering its impact upon conceptual neighborhood graphs. Further evidence exists to support that reasoning itself is a spatial process [31], and convexity at its core is a spatial property (contrived through distance structure). This paper offers a platform for analyzing the conceptual similarities of spatial terms in any natural language based on the atomic region-region relations found at their hearts: an ordering of convex relations based on the conceptual neighborhood graph. The remainder of this paper is structured as follows. Section 2 summarizes the underlying model for topological relations and their conceptual neighborhood graph.
Based on the formal definition of convexity in graphs (Section 3), we derive in Section 4 the complete set of convex subgraphs of the region-region relations on the surface of the sphere in their A-neighborhood graph [17]. Commonplace English spatial prepositions [33] are then mapped onto disjunctions of the region-region relations and are shown to be members of the set of convex subgraphs (Section 5). A lattice of the convex subgraphs is constructed (Section 6), which is used for translating terms between languages (Section 7). Section 8 draws conclusions and discusses future work.
2 Topological Relations and Conceptual Neighborhood Graphs
The binary topological relations between two simple regions (that is, regions that are homeomorphic to 2-discs) are the focus of this study. The 9-intersection [20] captures these relations through the pairwise intersections of the two regions' interiors, boundaries, and exteriors. Topological invariants of these nine intersections (i.e., properties that are preserved under topological transformations) categorize topological relations. The content invariant, which distinguishes empty (ø) and non-empty (¬ø) intersections, is the most generic criterion, as other invariants can be considered refinements of non-empty intersections. A 3×3 matrix captures these specifications concisely. Pairs of regions with different 9-intersection matrices have different topological relations. For regions embedded in ℝ², eight different relations (called disjoint, meet, overlap, equal, coveredBy, inside, covers, and contains) can be distinguished with empty and non-empty intersections (Fig. 1). Another three 9-intersection matrices (called attach, entwined, and embrace) are found when the regions are embedded in S², the surface of the sphere [16].
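9-intersection matrices of Fig. 1 (each matrix is listed row by row, rows separated by semicolons; ø = empty, ¬ø = non-empty):
disjoint:   [ ø ø ¬ø ; ø ø ¬ø ; ¬ø ¬ø ¬ø ]
meet:       [ ø ø ¬ø ; ø ¬ø ¬ø ; ¬ø ¬ø ¬ø ]
overlap:    [ ¬ø ¬ø ¬ø ; ¬ø ¬ø ¬ø ; ¬ø ¬ø ¬ø ]
equal:      [ ¬ø ø ø ; ø ¬ø ø ; ø ø ¬ø ]
coveredBy:  [ ¬ø ø ø ; ¬ø ¬ø ø ; ¬ø ¬ø ¬ø ]
inside:     [ ¬ø ø ø ; ¬ø ø ø ; ¬ø ¬ø ¬ø ]
covers:     [ ¬ø ¬ø ¬ø ; ø ¬ø ¬ø ; ø ø ¬ø ]
contains:   [ ¬ø ¬ø ¬ø ; ø ø ¬ø ; ø ø ¬ø ]
attach:     [ ø ø ¬ø ; ø ¬ø ø ; ¬ø ø ø ]
entwined:   [ ¬ø ¬ø ¬ø ; ¬ø ¬ø ø ; ¬ø ø ø ]
embrace:    [ ¬ø ¬ø ¬ø ; ¬ø ø ø ; ¬ø ø ø ]
Fig. 1. The eleven topological relations between two regions embedded in S² with the relations' 9-intersection matrices and their labels [16].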
Since this set of relations is jointly exhaustive and mutually exclusive, exactly one of these eleven relations applies to any pair of simple regions. These base relations can be combined with an exclusive disjunction (XOR) to capture scenarios in which a single specific relation cannot be determined. A total of 2^11 = 2,048 different combinations are possible. The least specific case is called the universal relation, which is the exclusive disjunction of all eleven relations. The eleven cases with exactly one relation are the atomic relations. On the disjunctions and atomic relations, the intersection of relations applies, yielding in the case of no common relation the empty relation (i.e., a scenario that cannot be realized). These topological relations per se are on a nominal scale. Although there is no strict order that applies to these relations, the eleven topological relations can be
arranged such that pairs of most similar relations are grouped together [18], much like the arrangement of interval relations [24]. Different rationales for such similarity grouping lead to different arrangements of the relations, called conceptual neighborhood graphs [24]. Three basic types of neighborhood graphs are common— the A-neighborhood, which derives the similar relations from anisotropic scaling; the B-neighborhood (similarity from rotation or translation while preserving size and shape); and the C-neighborhood (similarity from isotropic scaling). Another five neighborhoods (Fig. 2) provide closure under union and intersection [17].
Fig. 2. The family of conceptual neighborhood graphs of the eleven topological relations between two regions embedded in S² [17].
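The disjunctions of the eleven base relations described above can be illustrated with plain sets. The sketch below is only one possible representation; the names BASE, all_disjunctions, and the two example disjunctions are chosen here for illustration. It enumerates every combination of base relations and shows that intersecting two disjunctions without a common relation yields the empty relation.

```python
from itertools import combinations

# The eleven base relations on the surface of the sphere.
BASE = ("disjoint", "meet", "overlap", "equal", "coveredBy", "inside",
        "covers", "contains", "attach", "entwined", "embrace")


def all_disjunctions(base=BASE):
    """Yield every combination of base relations as a frozenset."""
    for size in range(len(base) + 1):
        for combo in combinations(base, size):
            yield frozenset(combo)


DISJUNCTIONS = list(all_disjunctions())
UNIVERSAL = frozenset(BASE)   # least specific case
EMPTY = frozenset()           # scenario that cannot be realized

# Two example disjunctions (hypothetical choices for this sketch).
separated = frozenset({"disjoint", "meet", "attach"})
nested = frozenset({"coveredBy", "inside"})

if __name__ == "__main__":
    print(len(DISJUNCTIONS))             # number of combinations, 2**11
    print(separated & nested == EMPTY)   # no common relation
    print(separated & UNIVERSAL == separated)
```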
3 Convexity and Subgraphs
In this section, the mathematical underpinnings of the paper are provided to establish common terminological ground.
Definition 1. Let V be a set of vertices and E be a set of edges connecting v_i, v_j ∈ V. The construction V ∪ E is called a graph and is denoted G_{V,E}. A subset of a graph is referred to as a subgraph [3].
Definition 2. If a subgraph contains all edges that connect members of its vertex set, then the subgraph is called an induced subgraph.
Definition 3. Let H_{A,B} be an induced subgraph of G_{V,E}. If for every a_i, a_j ∈ A, every shortest path connecting a_i to a_j within G_{V,E} is found within H_{A,B}, then H_{A,B} is convex [3].
Theorem 1. Let t, u, v be vertices in a graph G such that a path exists between each pair of them. Further let d(r,s) represent the distance between a specified pair of vertices r and s. Vertex t can be found on a shortest path between u and v if and only if:
d(u,t) + d(t,v) = d(u,v).    (1)
Proof: Assume that t is on a shortest path P between u and v. We must show that d(u,t) + d(t,v) = d(u,v). The portion of P from u to t must itself be a shortest path between u and t, since any shorter path between u and t could be substituted into P to yield a path between u and v that is shorter than P. That portion therefore has length d(u,t). Similarly, the portion of P from t to v has length d(t,v). The length of P is thus d(u,t) + d(t,v), and since P is a shortest path, this sum equals d(u,v). Now assume that d(u,t) + d(t,v) = d(u,v). We must show that t sits on a shortest path. Concatenating a shortest path from u to t with a shortest path from t to v yields a connection between u and v of length d(u,v) that passes through t. Since no connection between u and v can be shorter than d(u,v), this connection is a shortest path. Therefore t must be on a shortest path.
4 Application to Simple Region-Region Relations on the Sphere
Theorem 1 allows for reducing an algorithm for determining convex subgraphs to a shortest path calculation and a sequence of tests of additive distance. For a candidate subgraph, every pair of its vertices u,v is tested against every vertex t of the graph: if the distance sum fits the condition of Eqn. 1, then t lies on a shortest path between u and v and must itself be contained within the subgraph. If any such t is missing, the subgraph cannot be convex.
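The test described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the graph is given as an adjacency dictionary, it is assumed to be connected and unweighted, and the 4-cycle TOY is a hypothetical example graph (the edges of the A-neighborhood graph are not reproduced here). The function names bfs_distances and is_convex are chosen for this sketch.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_convex(adj, subset):
    """Apply Eqn. 1: every vertex t with d(u,t) + d(t,v) = d(u,v)
    for some pair u, v of the subset must belong to the subset."""
    subset = set(subset)
    dist = {v: bfs_distances(adj, v) for v in adj}
    for u, v in combinations(subset, 2):
        for t in adj:
            if t not in subset and dist[u][t] + dist[t][v] == dist[u][v]:
                return False
    return True


# Hypothetical toy graph: the 4-cycle a - b - c - d - a.
TOY = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}

if __name__ == "__main__":
    print(is_convex(TOY, {"a", "b"}))        # adjacent vertices
    print(is_convex(TOY, {"a", "c"}))        # b and d lie on shortest paths
    print(is_convex(TOY, {"a", "b", "c"}))   # d lies on a shortest a-c path
```

Computing all distances once and then checking Eqn. 1 avoids enumerating the shortest paths themselves, which is the reduction that Theorem 1 provides.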
[Figure 3: one diagram per convex subgraph of the A-neighborhood graph, labeled C1 through C104]
Fig. 3. The complete set of convex subgraphs of the A-neighborhood graph, ranging from C1 (empty disjunction) to C104 (universal relation) [35].
We implemented this algorithm and derived the complete set of convex subgraphs for the eleven region-region relations on the sphere [16] over their A-neighborhood graph. Among the 2^11 = 2,048 subgraphs of the A-neighborhood, 104 (5%) are identified as convex (Fig. 3). The set of convex relations includes the empty relation (C1), the eleven atomic relations (C2–C12), and the universal relation (C104).
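An exhaustive derivation of this kind can be sketched as a loop over all vertex subsets. The following Python sketch is an assumption-laden illustration rather than the implementation used for Fig. 3: it runs on a hypothetical four-vertex path graph (PATH), because the edge list of the A-neighborhood graph is not given in the text, and the function names are invented for this example. Applied to the eleven-vertex A-neighborhood graph, the same loop would visit 2^11 subsets.

```python
from collections import deque
from itertools import combinations


def all_pairs_distances(adj):
    """Breadth-first distances between all pairs of a connected graph."""
    table = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        table[source] = dist
    return table


def convex_subsets(adj):
    """Return every vertex subset that passes the additive distance test."""
    dist = all_pairs_distances(adj)
    vertices = sorted(adj)
    found = []
    for size in range(len(vertices) + 1):
        for combo in combinations(vertices, size):
            chosen = set(combo)
            closed = all(
                t in chosen
                for u, v in combinations(combo, 2)
                for t in vertices
                if dist[u][t] + dist[t][v] == dist[u][v]
            )
            if closed:
                found.append(frozenset(combo))
    return found


# Hypothetical toy graph: the path p1 - p2 - p3 - p4.
PATH = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2", "p4"], "p4": ["p3"]}

if __name__ == "__main__":
    convex = convex_subsets(PATH)
    total = 2 ** len(PATH)
    print(f"{len(convex)} of {total} subsets are convex")
```

On the toy path, the convex subsets are exactly the empty set and the contiguous stretches of the path.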
5 Common Spatial Prepositions as Disjunctions of Topological Relations
From among three different perspectives of cognitive plausibility, namely (1) maintenance of human success and errors (typical of cognitive modeling), (2) inputs and outputs that match human thought (typical of artificial intelligence), and (3) rational behavior (typical of social simulations) [27], we are concerned particularly with the inputs and outputs of a machine system for inference. To get an adequate understanding of plausibility, we examine English-language terms that people typically use to describe region-region relations. Landau and Jackendoff provided a “fairly complete list of the prepositional repertoire of English” [33] that applies to region objects. Among their 78 terms are two groups of terms that are outside the scope of this investigation. First, a large contingent of terms refers primarily to direction or orientation (e.g., above/below, beneath/on top of, in front/behind, up/down, left/right, north/south/east/west). Since the present focus is on topological relations, only terms without an explicit reference to direction are considered here. Second, a fair number of terms are intransitive or map onto relations that are not binary (e.g., here/there, between, among). Since topological relations are binary in nature, only those terms that apply to exactly two regions are considered. Finally, since some of the 78 terms are considered synonyms (e.g., along and alongside, inside and in, outside and out), only non-redundant terms are considered. After consolidation, a total of sixteen spatial prepositions remain for this investigation of convex topological relations (Table 1).
Table 1. Sixteen of the 78 spatial prepositions that describe region-region configurations without an explicit reference to direction or orientation.
about       in
across      near
along       out
around      to the side of
at          together
beside      through
by          throughout
far from    with
The goal is to express these terms as disjunctions of the eleven 9-intersection relations, which then map onto subgraphs of the A-neighborhood graph. Given natural preferences for convex objects, we expect each of the sixteen spatial prepositions to have a construction from the set of 104 convex relations. Details of these mappings are given for the four predicates outside, inside, crosses, and along (Fig. 4). These terms in particular are used to describe many specific configurations of objects, and all of them abstract away a level of detail that is determined to be unimportant given the scenario. Given the nature of language, information systems have to be able to account for such abstraction mechanisms to accommodate spatial searching features.
While context can tell the human user a lot about which meaning is implied, the system itself has no knowledge of the contextual basis for the term. It is left to conduct inference on an uncertain set of terms, even though those terms are often quite precise for the person accessing the information that the system can provide.
(a) Maine is outside of Massachusetts
(b) My neighbor’s property is outside of my fence
(c) That topic is outside of my knowledge base
(d) Morocco is in Africa
(e) Lesotho is in South Africa
(f) A swimmer’s wake crosses the lake
(g) The roads cross each other
(h) The yellow line is along the road
(i) The guardrail is along the road
(j) The tidal plain is along the coastal towns
(k) US Route 1 goes along the Atlantic Ocean
(l) The glacier flows along the mountain
Fig. 4. Uses of spatial terms: (a-c) outside mapping respectively onto disjoint, meet, and attach; (d-e) in mapping respectively onto coveredBy and inside; (f-g) crosses mapping respectively onto coveredBy and overlap; and (h-l) along mapping respectively onto inside, overlap, coveredBy, disjoint, and meet.
Table 2 shows that each of the sixteen spatial prepositions indeed takes on the preferential form of convexity. While this pattern may be considered evidence enough, each of these subgraphs is of course also connected, connectedness being the more general property of which convexity is a special case. Many of the spatial problems that people encounter when translating between languages have to do with the presence of explicit terms. Only four spatial words for planar region relations (in, out, far, and near) can be found in the Natural Semantic Metalanguage [48], which indicates that few explicit terms are found across families of languages. Topologically explicit terms are those that map onto a single atomic relation (C2–C12). For the sixteen spatial prepositions, this condition applies only to through (C9) and to far from (C2). The remaining terms are topologically ambiguous. While explicit terms are preferable for communication, ambiguous terminology serves a fundamental purpose as well, as it abstracts away things that do not matter in the grand scheme of conversation or context. The image created by ambiguous expressions is not impacted substantially, even if the explicit relation changes. For example, someone asking whether two things are connected does not care how, but
cares that one can get from one place to the other without leaving either territory. Whether the items are connected only at one point or one is completely inside the other makes no difference from that perspective.
Table 2. The sixteen spatial prepositions (Table 1) equated to explicit atomic constructors.
Spatial Preposition | Union of Atomic Relations | Convex Relation
about | equal, coveredBy, inside | C30
across/crosses | overlap, coveredBy | C19
along | disjoint, meet, overlap, coveredBy, inside | C47
around | disjoint, meet | C13
at | equal, coveredBy, inside | C30
beside | disjoint, meet, attach | C29
by | disjoint, meet, attach | C29
far from | disjoint | C2
in/inside | coveredBy, inside | C18
near | disjoint, meet, attach | C29
out/outside | disjoint, meet, attach | C29
through | overlap | C9
throughout | equal, covers, contains | C31
to the side of | disjoint, meet, attach | C29
together | overlap, equal, coveredBy, inside, covers, contains, entwined, embrace | C91
with | overlap, equal, coveredBy, inside, covers, contains, entwined, embrace | C91
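The mapping of Table 2 lends itself to a simple lookup structure. The sketch below transcribes the table into a Python dictionary and derives two observations made in the text: which prepositions are topologically explicit (a single atomic relation) and which prepositions share the same convex relation. The dictionary layout and the function names explicit_terms and terms_by_convex_relation are assumptions of this sketch.

```python
from collections import defaultdict

_OUT = frozenset({"disjoint", "meet", "attach"})
_TOGETHER = frozenset({"overlap", "equal", "coveredBy", "inside",
                       "covers", "contains", "entwined", "embrace"})

# Rows of Table 2: preposition -> (atomic relations, convex relation label).
TABLE_2 = {
    "about": (frozenset({"equal", "coveredBy", "inside"}), "C30"),
    "across/crosses": (frozenset({"overlap", "coveredBy"}), "C19"),
    "along": (frozenset({"disjoint", "meet", "overlap",
                         "coveredBy", "inside"}), "C47"),
    "around": (frozenset({"disjoint", "meet"}), "C13"),
    "at": (frozenset({"equal", "coveredBy", "inside"}), "C30"),
    "beside": (_OUT, "C29"),
    "by": (_OUT, "C29"),
    "far from": (frozenset({"disjoint"}), "C2"),
    "in/inside": (frozenset({"coveredBy", "inside"}), "C18"),
    "near": (_OUT, "C29"),
    "out/outside": (_OUT, "C29"),
    "through": (frozenset({"overlap"}), "C9"),
    "throughout": (frozenset({"equal", "covers", "contains"}), "C31"),
    "to the side of": (_OUT, "C29"),
    "together": (_TOGETHER, "C91"),
    "with": (_TOGETHER, "C91"),
}


def explicit_terms(table=TABLE_2):
    """Prepositions that map onto exactly one atomic relation."""
    return sorted(t for t, (atoms, _) in table.items() if len(atoms) == 1)


def terms_by_convex_relation(table=TABLE_2):
    """Group prepositions that share the same convex relation."""
    groups = defaultdict(list)
    for term, (_, label) in sorted(table.items()):
        groups[label].append(term)
    return dict(groups)


if __name__ == "__main__":
    print(explicit_terms())
    for label, terms in terms_by_convex_relation().items():
        print(label, terms)
```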
While the sixteen spatial prepositions prevail in natural English language, the domain of mathematics offers more spatial terms for technical usage. To bring them into the same framework, the mathematical terms connected, equal, nothing, subset, superset, touching, unequal, and universal are mapped onto corresponding atomic 9-intersection relations (Table 3).
Table 3. The seven mathematical spatial terms equated to explicit atomic constructors.
Mathematical Term | Union of Atomic Relations | Convex Relation
equal | equal | C4
nothing | ø | C1
subset | coveredBy, inside | C18
superset | covers, contains | C20
touching | meet, attach | C15
unequal | disjoint, meet, overlap, coveredBy, inside, covers, contains, attach, entwined, embrace | –
universal | disjoint, meet, overlap, equal, coveredBy, inside, covers, contains, attach, entwined, embrace | C104
Two of these seven mathematical terms are explicit (equal and nothing map onto C4 and C1, respectively). Another four terms are convex; however, unequal is not a convex relation.
6 A Convex Ordering
Given that convex relations may in fact serve as links in many domains, one needs an understanding of the structure of the convex relations to one another, allowing for autonomous neighborhood graph generation by users in isolated fields. A prime example of this is the comparison of linguistically based neighborhoods that only reflect terminology that exists within a particular language. While spoken language may be descriptive enough for some, written, signed, and drawn languages may be able to convey more concepts [45]. The construction of an ordering needs a mechanism to sort the subgraphs. For neighborhood graphs in general, there are two typical approaches: physical deformations [16-18,21,24] and representational lattices [4,12,42], banking on the condition that topological deformations occur smoothly so that homeomorphic members will pass to a relation that is either a subset or superset under the 9-intersection matrix. In this disjunctive case, the representational lattice approach is taken, indicative of a “power set” construction (Fig. 5).
Fig. 5. Lattice of the 104 convex relations with different colors representing disjunctive sets of differing cardinalities. The graph is aligned in the optimal Reingold-Tilford construction.
As an example, touching (C15) is the union of the convex relations meet (C3) and attach (C10). It is also a subset of the convex relation outside (C29). These three terms mathematically are the most similar to the term touching. Any of the spatial prepositions presented here can be bounded by other linguistic terms in a similar manner.
Using the terms explored in Section 5, the lattice of convex relations can be exploited to provide a generalization/specification structure based on the terms themselves. Fig. 6 shows the ordering of the spatial prepositions from Table 2. While the example given here is for the English language, such a procedure can be employed on any language using the same principle. Comparing such word networks can point out significant differences in languages, either within the same language family or from differing families.
[Figure 6 node labels: universal (C104); together/with (C91); along (C47); throughout (C31); superset (C20); beside/near/out/outside/to the side of (C29); about/at (C30); in/subset (C18); equal (C4); across/crosses (C19); through (C9); around (C13); touching (C15); far from (C2); nothing (C1)]
Fig. 6. The lattice of convex relations for the sixteen spatial prepositions and the six mathematical spatial terms.
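The subset/superset ordering behind such a lattice can be sketched by comparing the atomic relations of the named terms. The following Python sketch takes the convex relations of Tables 2 and 3 and computes, for a given node, its immediate generalizations (the smallest named supersets). It is an illustration only: the dictionary TERMS and the function parents are names invented here, and the sketch covers only the fifteen named nodes, not all 104 convex relations.

```python
ALL = frozenset({"disjoint", "meet", "overlap", "equal", "coveredBy",
                 "inside", "covers", "contains", "attach", "entwined",
                 "embrace"})

# Convex relations of the named terms (Tables 2 and 3).
TERMS = {
    "C104": ALL,                                            # universal
    "C91": ALL - {"disjoint", "meet", "attach"},            # together/with
    "C47": frozenset({"disjoint", "meet", "overlap",
                      "coveredBy", "inside"}),              # along
    "C31": frozenset({"equal", "covers", "contains"}),      # throughout
    "C30": frozenset({"equal", "coveredBy", "inside"}),     # about/at
    "C29": frozenset({"disjoint", "meet", "attach"}),       # outside etc.
    "C20": frozenset({"covers", "contains"}),               # superset
    "C19": frozenset({"overlap", "coveredBy"}),             # across/crosses
    "C18": frozenset({"coveredBy", "inside"}),              # in/subset
    "C15": frozenset({"meet", "attach"}),                   # touching
    "C13": frozenset({"disjoint", "meet"}),                 # around
    "C9": frozenset({"overlap"}),                           # through
    "C4": frozenset({"equal"}),                             # equal
    "C2": frozenset({"disjoint"}),                          # far from
    "C1": frozenset(),                                      # nothing
}


def parents(label, terms=TERMS):
    """Smallest named proper supersets of the given convex relation."""
    atoms = terms[label]
    supers = {l: s for l, s in terms.items() if atoms < s}
    return sorted(l for l, s in supers.items()
                  if not any(other < s for other in supers.values()))


if __name__ == "__main__":
    for label in TERMS:
        print(label, "->", parents(label))
```

For touching (C15), the only immediate generalization among the named terms is C29, in line with the example given for touching and outside.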
7 Translating an Unfamiliar Term
Given the implication from the Natural Semantic Metalanguage that the vast majority of particular spatial concepts do not exist in all languages, some languages contain terms that are unknown to other languages. In creating information systems, it is plausible to assume that entries will be made in varying languages or in translations from one language to another, creating the necessity for addressing terms without a direct language parallel within two or more contributing languages. The lattice of convex relations (Fig. 6) enables a straightforward mathematical translation of one term to another term. More involved, however, is an automated language translation to invoke a suitable understanding of the term, using only a user's familiar lexicon. This translation is developed subsequently. Borrowing from the idea that any object can be constructed as a union of convex objects [10], an algorithm is designed to provide a minimal precise cover of the graph representing the foreign term, and to use the minimal set as a disjunctive description. To illustrate this concept, consider the statement: sensors detected rain north of Zanzibar.
The concept of north is foreign to the current lexicon (Fig. 6) because it is directionally charged, and direction is not a topological property. The term does, however, bring with it some implied topological information that can be used. The obvious example of north is the way that Kenya is related to Zanzibar (i.e., separated from one another and directly “above”), referring to disjoint. On the other hand, Kenya relates to Tanzania (of which Zanzibar is a part) as Kenya and Tanzania meet. One can also think of it from a latitudinal perspective: it is raining at all points with greater latitude than Zanzibar, precisely the scenario of attach. Those three possibilities represent the term outside. If it is raining on the island of Zanzibar and it is raining in Kenya and over the open water between them, this scenario represents overlap. These four atomic relations (disjoint, meet, attach, and overlap) all contribute to our understanding of north of, some more than others, and all have instances that would be expressed under that paradigm. The set, though a union of convex sets, is not a convex set itself (Fig. 7a), as it is missing at least one other possibility, entwined. If in the attach scenario precipitation slightly penetrates the island of Zanzibar, one crosses over the line from attach to entwined. Similarly, embrace could be added on to the set. The additions of entwined (Fig. 7b) or entwined and embrace (Fig. 7c) yield convex relations.
Fig. 7. The set of base relations disjoint, meet, overlap, and attach (a) is not convex, (b) becomes convex with the addition of entwined (C53), and (c) becomes convex with the addition of entwined and embrace (C59).
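The idea of describing a foreign term by a disjunction of familiar terms can be sketched as follows. This Python sketch is a simplified, non-recursive illustration and not the lattice traversal of the paper: it keeps every known term whose atomic relations are fully included in those of the foreign term, discards terms that are specializations of another fitting term, and reports which atomic relations remain uncovered. The lexicon is a hypothetical selection of terms from Tables 2 and 3, the function name describe is invented here, and the result is not guaranteed to be a cover of minimum cardinality in general.

```python
# Hypothetical familiar lexicon: term -> atomic relations (Tables 2 and 3).
LEXICON = {
    "far from": frozenset({"disjoint"}),
    "around": frozenset({"disjoint", "meet"}),
    "touching": frozenset({"meet", "attach"}),
    "outside": frozenset({"disjoint", "meet", "attach"}),
    "through": frozenset({"overlap"}),
    "in": frozenset({"coveredBy", "inside"}),
    "along": frozenset({"disjoint", "meet", "overlap",
                        "coveredBy", "inside"}),
}


def describe(foreign_atoms, lexicon=LEXICON):
    """Return (terms, uncovered) for a foreign term.

    terms: the most general known terms fully included in the foreign term.
    uncovered: atomic relations that no fitting known term accounts for.
    """
    foreign = frozenset(foreign_atoms)
    # Known terms whose atomic relations all belong to the foreign term.
    fitting = {t: a for t, a in lexicon.items() if a and a <= foreign}
    # Drop terms that are proper specializations of another fitting term.
    most_general = sorted(
        t for t, a in fitting.items()
        if not any(a < other for other in fitting.values())
    )
    covered = frozenset().union(*(fitting[t] for t in most_general))
    return most_general, foreign - covered


NORTH_OF = {"disjoint", "meet", "attach", "overlap"}

if __name__ == "__main__":
    print(describe(NORTH_OF))
    print(describe(NORTH_OF | {"entwined"}))
```

With this lexicon, the four atomic relations seeded for north of are described by the disjunction of outside and through, whereas adding entwined leaves one atomic relation without a familiar term.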
The algorithm Translate is developed to create a minimal disjunctive phrase (knownTerms) for an unfamiliar term (foreignTerm), which has been seeded with its atomic relations (foreignAtoms). It recursively traverses the lattice of known terms (e.g., Fig. 6), starting at its root C104, and tests at each node whether its atomic relations (allAtomicRels) are fully included in the seeded atomic relations of the foreign term. If so, that node's terms are added to the disjunctive phrase and no more detailed terms are needed; otherwise, all descendants of that node are examined recursively with the same procedure. To eliminate possible false hits for nodes with multiple parents, a final test is needed to clean up any terms for which a more generic term (i.e., in the lattice convexRel(t1)