
Geographical Analysis (2018) 00, 00–00

A Dynamic Classification Pattern of Spatial Statistical Services Using Formal Concept Analysis

Yumin Chen1, Jiang Zhou2, John P. Wilson3, Jingyang Wu4, Qianjiao Wu1, Jiaxin Yang1

1School of Resource and Environment Science, Wuhan University, Wuhan, China, 2Bejing Geoway Software Co Ltd., Wuhan Research Institute, Wuhan, China, 3Spatial Sciences Institute, University of Southern California, Los Angeles, CA, USA, 4Shaanxi Administration of Surveying, Mapping and Geoinformation, Xi’an, China

In ubiquitous computing environments, supported by advanced Information and Communication Technologies, the availability of geographical data is rapidly improving. Spatial statistical services, which are based on mathematical and geographic principles, provide powerful tools to mine useful information from these rapidly growing data. However, helping users find the appropriate spatial statistical service is a serious challenge. Classification, which can help to organize and manage the services effectively, might be the key to solving it. Traditional classifications, which start from a certain perspective such as a service function or data source, usually aim at a certain application. They are fixed and consider neither the links between the different attributes nor the demands of different users. Formal concept analysis (FCA) utilizes mathematical order theory, and particularly the theory of complete lattices, to comprehensively express the interrelationships between attributes and objects. Based on FCA, this article provides a dynamic classification pattern which takes the relationships between these services and the characteristics of ubiquitous environments into consideration to help different users obtain a classification that meets their demands. Moreover, with this pattern, any number of additional categories can be added to the classification scheme flexibly. Two kinds of classification results, which take three kinds of sensors and three data types, respectively, as root classes, are presented to demonstrate the feasibility and validity of the dynamic classification pattern.

Introduction

The concept of ubiquitous computing, first proposed by Mark Weiser in 1991, depicted a new computing environment that would be fully interwoven with the fabric of our everyday lives and based on a ubiquitous network (Weiser 1991; Friedewald and Raabe 2011;

Correspondence: Yumin Chen, School of Resource and Environment Science, Wuhan University, 129 Luoyu Road, Wuhan 430079, China e-mail: [email protected]

Submitted: March 08, 2017. Revised version accepted: December 18, 2017. doi: 10.1111/gean.12154 © 2018 The Ohio State University


Kim and Jang 2011; Zhu and Xu 2012). Ubiquitous computing environments are based on a ubiquitous network, which connects people and objects while minimizing technical restrictions regarding where, when, and how services are accessed in the context of the service(s) subscribed to. The classification results provided by standard interfaces rely on discipline norms and, as such, will not help people to obtain the services they need. However, people can obtain familiar and expected spatial statistical service contexts through a dynamic classification pattern that is based on people’s demands. Its dynamic and customer-oriented characteristics make the classification pattern more widely applicable. Under ubiquitous computing, supported by Information and Communication Technologies, geographic information can be produced anywhere and anytime, providing ubiquitous geographic information (UBGI) (Leem and Kim 2013; Kim et al. 2014). The explosion of UBGI has extended the use of spatial statistical analysis methods in many domains, which, in turn, makes the geographic information system (GIS) a more powerful tool in spatial analysis (Watts 2013; Ntozini et al. 2015). Nowadays, many of these spatial statistical methods, which are deployed to identify and/or characterize the relationships between spatial entities in different locations, have been coded by organizations and individuals and provided as spatial statistical analysis services (Longley 2010). However, these services have to be effectively integrated so that users can conveniently discover and search for the services appropriate for their requests. All of the services are accessible through the directory service, which maps the contents of these services to their physical positions. Availability, however, does not mean that people can discover the services they expect among the large number of complex service entries.
In this situation, a classification/taxonomy of these UBGI services in the form of a visual service resource catalog would be indispensable. This kind of classification interface can not only help service providers to share geographic spatial information but also help service clients to retrieve the needed geographic data accurately and mine information resources efficiently (Chen et al. 2013). Furthermore, a service taxonomy could improve the understanding of service functionalities among service providers and clients and offer later providers a reference for the category into which their services should be put, which would benefit the management and interoperability of geographic information (GI) services (Li and Li 2000; Yue et al. 2011; Silos et al. 2013). Most of the efforts concerned with the classification of GI services thus far have been carried out by standardization bodies such as the International Organization for Standardization and the Open Geospatial Consortium (OGC) (International Standards Organization 2005; Whiteside 2005; Major 2012). Generally speaking, the classifications established by these organizations are at a coarse and abstract level within which the spatial statistical services are but one component (Bai, Di, and Wei 2009; Michaelis and Ames 2009). There are also several classifications of spatial statistical analyses in specific software packages such as ArcGIS, GeoDa, and R (e.g., Anselin 2003; Bivand, Pebesma, and Gomez-Rubio 2008; Esri 2017). The spatial analysis tools in ArcGIS are spread across several extensions and toolboxes. The Spatial Analyst Extension, for example, includes 20 toolboxes that include utilities such as Map Algebra/raster calculator and both broad (e.g., Interpolation, Multivariate, Density) and narrowly focused tools (e.g., Hydrology, Groundwater).
The Geostatistical Analyst Extension provides access to a larger number and variety of interpolation tools, and the Spatial Statistics toolbox contains toolsets plus utilities for Analyzing Patterns, Mapping Clusters, Measuring Geographic Distributions, and Modeling Spatial Relationships. This

Yumin Chen et al.

Classification of Spatial Statistical Services

organization of toolboxes and tools sometimes makes it difficult to find the required method. Similarly, R consists of a set of disparate packages by different authors, which makes classification and discovery difficult, and GeoDa is a small spatial econometrics package used for exploratory spatial data analysis, geovisualization, and modeling. There are also many popular open source systems that perform certain tasks well. GRASS GIS, for example, offers tools to support image processing, digital terrain analysis, and many forms of statistical data analysis. QGIS provides easy-to-use data display, editing, and analysis services, and can automatically generate maps. Finally, PostGIS implements some of the OGC specifications and is one of the most popular open source GIS database systems. All of the aforementioned classifications are fixed and are organized, more often than not, under the single perspective of a specific geographic process. However, the relationships between the various services are complicated, and different results are expected according to different demands. For example, the services “Spatial autoregression” and “Kriging interpolation” obviously belong to the different subclasses “Regression” and “Interpolation” in terms of functionality, but they also share some common properties because both can handle the same type of sensor data (e.g., GPS data) and the problem of spatial autocorrelation. In this situation, it is difficult to decide which property should take priority because it may vary with the purpose at hand. In ubiquitous computing environments, traditional spatial analytical services need to be reinterpreted and organized in order to tackle the amount and variety of what is often ill-structured data.
Considering the complexities of the relationships between spatial statistical services and the defects of the classifications mentioned above, formal concept analysis may provide an appropriate and useful approach with which to describe the relationships between the various services. Formal concept analysis, first proposed by Wille and Ganter (Wille 1992; Ganter and Wille 2010), provides a method of concept formation and conceptual classification based on order and lattice theory. It provides a conceptual framework for structuring, analyzing, and visualizing data in order to make them more understandable (Alqadah and Bhatnagar 2011; Ma, Sui, and Cao 2012). Three basic notions are included in this theory. First, a formal context is a specific context of a domain which contains a set of objects, a set of attributes, and the relation between the two sets. Here, the objects and attributes correspond to the extents and intents of concepts. Usually the attributes are widely accepted basic terms in a domain. Second, a formal concept is defined as a pair consisting of a subset of objects and a subset of attributes. In this pair, the extent includes all objects belonging to the concept, while the intent comprises all attributes valid for all of those objects. The formal concept is the central notion of formal concept analysis (FCA), and often refers to a certain conceptual class or category in a specific research domain. Third, based on this definition, the formal concepts are organized into a concept lattice which depicts the complex relationships between them. Hence, with the reference meanings of these formal concepts and the structure indicated in the resultant concept lattice, the classification can be easily and explicitly described.
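To make these three notions concrete, the following minimal Python sketch builds a toy formal context and enumerates its formal concepts by closing every subset of objects. The four objects and their attribute sets are illustrative assumptions (the g/m labels only mimic the coding style used later in this article), and the function names are hypothetical. The brute-force enumeration is suitable only for very small contexts.

```python
from itertools import combinations

# Toy formal context: object -> set of attributes (illustrative values only).
CONTEXT = {
    "g31": {"m1", "m9"},
    "g32": {"m1", "m9", "m10", "m11"},
    "g35": {"m1", "m9", "m11"},
    "g43": {"m1", "m3", "m11", "m12"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(ALL_ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def extent(attributes):
    """Objects that carry every attribute in `attributes`."""
    return {g for g, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing each subset of objects."""
    concepts = set()
    names = sorted(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            b = intent(subset)  # attributes common to the subset
            a = extent(b)       # every object having those attributes
            concepts.add((frozenset(a), frozenset(b)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the largest extent (top) to the smallest (bottom).
    for a, b in sorted(formal_concepts(), key=lambda c: -len(c[0])):
        print(sorted(a), sorted(b))
```

Each printed pair is one formal concept: the first list is its extent and the second its intent. Ordering the pairs by extent size gives a rough view of the levels of the concept lattice.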
To date, FCA has been used to study the identification of taxa in paleobiological data (Belohlavek, Kostak, and Osicka 2013), to state the theoretical basis for the on-the-fly construction of component directories (Arevalo et al. 2009), to identify components for interoperable process models (Bian and Hu 2007), and to extract the common equations concealed in different concepts in order to reorganize these concepts into a single framework for the purpose of interoperability (Hu and Bian 2009). There is no doubt


that FCA is good at revealing the relationships between objects and can help to organize the objects clearly. However, in a ubiquitous computing environment, many of the spatial statistical services are produced as web services that are distributed across scattered network nodes. The directory services provide the correspondence between the spatial statistical services and their physical addresses. These directory services help to organize and manage the complex spatial statistical services effectively, but they do not make it convenient for users to find the expected services in a catalog that includes all of the spatial statistical services. In addition, the attributes that users are interested in will differ, so the classification is expected to change according to the user’s demands. In this case, a dynamic classification pattern rather than a fixed classification result makes sense. This article seeks to explore a classification pattern by which the classification system can change dynamically to correspond to different users’ demands. FCA is used in this dynamic classification to characterize the complicated relationships between the rapidly evolving suites of spatial statistical methods or services. Unlike traditional classifications, which usually start from a certain perspective and are limited by the data structure and/or the scope of the applications (Albrecht 1998), the dynamic classification pattern can avoid these limitations by analyzing the user’s needs together with the constructed concept lattice that summarizes the interrelationships between these services. Since users focus on different attributes, the classification system is dynamic and tailored to serve different demands. To demonstrate the specific steps of the proposed method, the domain of spatial statistical analysis is treated as a case study, and the remainder of this article is arranged as follows.
The next section describes the methodology used for the dynamic classification pattern. The Experiment and Results section describes an experiment that applies the dynamic classification pattern. The Discussion section discusses the practicality and significance of the proposed method, and the Conclusions section draws some conclusions.

Methodology

The dynamic classification pattern used in this article is composed of three steps. First, we analyze the demands of users and the purpose of the classification result to select the objects and attributes. Second, we construct a classification pattern using formal concept analysis. In this step, the selected objects are projected onto a set of attributes, and a new set of concepts is then extracted based on the attributes associated with each object. These are the basic concepts and unique combinations of them embedded in the original concepts. These new concepts are then organized into a multilevel hierarchical concept lattice. Lastly, by analyzing the concept lattice, which reveals the relationships between these spatial statistical analysis methods, the classification and other new information and insights can be drawn. Once the demands of users change, we can quickly obtain a new classification system using the classification pattern constructed in Step 2. The workflow is shown in Fig. 1.

Determination of objects and attributes

The first step is to analyze the demands of users and determine the objects and attributes. For brevity and clarity, we denote the object set by $G = \{g_1, g_2, \ldots, g_n\}$ and the attribute set by $M = \{m_1, m_2, \ldots, m_n\}$.


Figure 1. Workflow used to identify dynamic classification pattern.

Construction of the classification pattern

After the objects and attributes are identified, the formal context is easy to establish by projecting the objects onto the attributes. Usually, a formal context can be expressed as a matrix. First, the objects are assigned to columns and the attributes are assigned to rows. Then, if an object and an attribute are related, the corresponding cell value is set to 1; otherwise it is set to 0. Based on the formal context, we can derive the new concepts easily. New concepts, which are indicated by the combinations of attributes that are connected via various objects, are mainly determined by the rank of the formal context matrix. All of the rows and columns which correspond to the submatrix whose rank is n of the formal context matrix form a new concept. For example, a new concept Ci (g1, g4, g5, g6, m1) means that objects g1, g4, g5, and g6 have attribute m1. Formal concepts perform well in expressing multiple complex relationships between the set of spatial statistical services and the attribute set, unlike groupings of spatial statistical services, which can only emphasize a single attribute. For example, people can easily see that the group {g17 (Histogram), g18 (Box plot), g19 (Percentile plot), g20 (Scatter plot), g21 (Scatter plot matrix), g22 (Parallel coordinate plot), g23 (Conditional plot), g24 (Moran scatter plot)} contains services that belong to Statistical Graph, but fail to see that they share the common attributes of Fixed Sampling Sensors and Exploratory Spatial Data Analysis. In this respect, formal concepts perform better than groups.

Figure 2. The square matrix M showing the inclusion relationships between the concepts.

The inclusion relationships between the new concepts are identified using the definitions of super- and sub-concepts. In the original definition, the sub- and super-concept relation ($\leq$) between these concepts is defined as follows: Suppose $(A_1, B_1)$ and $(A_2, B_2)$ are both concepts of the same context ($A_i$ represents the objects and $B_i$ represents the attributes); then the concept $(A_1, B_1)$ is a subconcept of the concept $(A_2, B_2)$ if $A_1 \subseteq A_2$ (which is equivalent to $B_2 \subseteq B_1$), and $(A_2, B_2)$ is a super-concept of $(A_1, B_1)$. A set of concepts including C1, C2, C3, C4, C5, C6, C7, and C8 is next taken as an example to illustrate the specific steps of this method. First, a square matrix M with both rows and columns standing for the same set of concepts is used to display the inclusion relationships between these concepts. Specifically, if Ci is a subconcept of Cj, then the corresponding element $m_{ij} = 1$; otherwise $m_{ij} = 0$. With this operation, the square matrix M is built up as shown in Fig. 2. Second, a new matrix L is generated as follows:

$$L = M - M \times M \qquad (1)$$

Since both direct and indirect inclusion relationships are identified in matrix M, this computation filters out the transitive inclusion relationships. Hence, for the matrix L reproduced in Fig. 3, the corresponding element $L_{ij} = 1$ only if Ci is a direct subconcept of Cj. Generally speaking, the more “1s” there are along a row, the lower the level of the corresponding concept. In the example, C1 has three “1s”, which means it is at the lowest level in the concept lattice; C8 has no “1s” along its row, which means it is at the top level. However, notice that C3 and C5 have the same number of “1s” as C6 and C7. Based on this method, the formal concepts are finally organized into a concept lattice which depicts the complex relationships between them. The generated concept lattice is displayed in Fig. 4.
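The filtering step in equation (1) can be checked on a small case. The sketch below is a minimal illustration, assuming that M records only strict inclusion (zero diagonal) and that the product in equation (1) is the ordinary matrix product. The four-concept diamond (C1 below C2 and C3, which are both below C4) is a hypothetical example, not the eight-concept example of Figs. 2 and 3.

```python
# Inclusion matrix M for a hypothetical diamond-shaped lattice:
# M[i][j] == 1 when concept i is a (direct or indirect) subconcept of concept j.
NAMES = ["C1", "C2", "C3", "C4"]
M = [
    [0, 1, 1, 1],  # C1 lies below C2, C3 and (indirectly) C4
    [0, 0, 0, 1],  # C2 lies below C4
    [0, 0, 0, 1],  # C3 lies below C4
    [0, 0, 0, 0],  # C4 is the top concept
]


def matmul(a, b):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def direct_relations(m):
    """Return L = M - M*M; entries equal to 1 mark direct subconcept links."""
    mm = matmul(m, m)
    n = len(m)
    return [[m[i][j] - mm[i][j] for j in range(n)] for i in range(n)]


L = direct_relations(M)
covers = [(NAMES[i], NAMES[j])
          for i in range(len(L)) for j in range(len(L)) if L[i][j] == 1]
print(L[0])    # row of C1: the indirect link to C4 turns negative
print(covers)  # only the direct links remain
```

Entries of L that are zero or negative are simply ignored when the lattice is drawn, which is consistent with the remark in the caption of Fig. 3 that minus values carry no special meaning.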

Get classification by analyzing the demands of users

Different users have different expectations of the classification. Service providers may hope to organize their services so that they are clear and distinctive, whereas service clients may hope to


find the appropriate service quickly. We can easily obtain the classification result by considering the users’ demands together with the concept lattice that has been constructed. The classification system will vary according to a user’s requirements. Once the demands of users change, we can quickly determine the new objects and attributes. By comparing the previous formal context with the new demands, we can attach attributes to objects quickly and obtain the new concept lattice. Once this is done, a new classification system that is specifically adapted to the new demands of new and/or existing users can be constructed.

Figure 3. The square matrix L that shows the direct super- and sub-concept relationships between the concepts. The minus values in the matrix have no special meaning.
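The re-classification step described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: the context is a plain mapping from objects to attribute sets, the user's demand is expressed as a list of root attributes plus a list of candidate subclass attributes, and the function name `classify` and the sample data (including the data-type labels) are hypothetical. Each group returned is the set of objects sharing a root attribute and a subclass attribute, that is, the extent of that attribute pair in the formal context.

```python
def classify(context, roots, subclasses):
    """Group objects under each root attribute, then under each subclass attribute.

    Empty groups are dropped, so the tree only shows combinations that occur.
    """
    tree = {}
    for root in roots:
        members = [g for g, attrs in context.items() if root in attrs]
        branches = {}
        for sub in subclasses:
            group = sorted(g for g in members if sub in context[g])
            if group:
                branches[sub] = group
        tree[root] = branches
    return tree


# Hypothetical context: services described by sensor, task and data-type attributes.
context = {
    "Ordinary kriging": {"fixed sensor", "interpolation", "point data"},
    "Spatial lag model": {"fixed sensor", "regression", "polygon data"},
    "Linear directional mean": {"mobile sensor", "line description", "line data"},
}
tasks = ["interpolation", "regression", "line description"]

# One user organizes the services by sensor type ...
by_sensor = classify(context, ["fixed sensor", "mobile sensor"], tasks)
# ... another user asks for data types as root classes instead.
by_data = classify(context, ["point data", "line data", "polygon data"], tasks)

print(by_sensor)
print(by_data)
```

The same context yields two different trees, which is the sense in which the classification is dynamic: only the choice of root and subclass attributes changes between users, not the underlying object-attribute relation.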

Experiment and results

An experiment that uses spatial statistical services as the objects is presented here to demonstrate the feasibility of the dynamic classification pattern. Two different classification results are

Figure 4. The concept lattice generated to make the relationships between formal concepts more explicit. Nodes with new objects are marked by a black filled lower semicircle and nodes with new attributes are marked by a blue filled upper semicircle.


Table 1. List of selected spatial statistical services used as objects in concept lattice method*

Objects  Spatial statistical service
g1       Quadrat analysis
g2       Nearest neighbor analysis
g3       Ripley’s K function
g4       Mean center
g5       Median center
g6       Standard distance
g7       Variance
g8       Standard deviation
g9       Correlation coefficient
g10      Linear directional mean
g11      Total accessibility matrix
g12      Gamma index
g13      Alpha index
g14      Circular variance
g15      Central feature
g16      Spatial weights matrix
g17      Histogram
g18      Box plot
g19      Percentile plot
g20      Scatter plot
g21      Scatter plot matrix
g22      Parallel coordinate plot
g23      Conditional plot
g24      Moran scatter plot
g25      Getis-Ord’s G
g26      Local Moran’s I
g27      Local Geary’s C
g28      Geary’s C
g29      Moran’s I
g30      General G-statistic
g31      Ordinary least squares regression
g32      Spatial lag model
g33      Spatial error model
g34      Spatial filtering model
g35      Geographically weighted regression
g36      Natural breaks classification
g37      Geometrical interval classification
g38      Defined interval classification
g39      Standard deviation classification
g40      Inverse distance weighted interpolation
g41      Global polynomial interpolation
g42      Local polynomial interpolation
g43      Simple kriging
g44      Ordinary kriging
g45      Universal kriging

*The objects are coded g1 to g45 for convenience of description.

produced. The first takes three kinds of sensors as the root classes and several other attributes representing the basic issues contained in these services as subclasses to discover the relationships between spatial statistical services and sensors. The second takes three data types as the root classes instead, to show that the classification pattern is dynamic and that users can get different results. Since the number of existing spatial statistical services is very large and many of them provide similar capabilities, only part of the whole analysis is listed for reasons of brevity and clarity. The selected spatial statistical services listed in Table 1 were collected based on the existing software categories and reviews of pertinent literature (e.g., Yang and Cai 2010). The attributes should be widely accepted within a specific domain because they are of great importance for identifying the relationships between objects, based on which new concepts and the final concept lattice are obtained. Moreover, these attributes should be unique, such that they cannot be divided conceptually. Specifically, for this article, 15 elements were selected as attributes (Table 2). They include the sensor types and the various tasks which the basic methods perform.

Table 2. List of selected attributes used in concept lattice method*

Attributes  Description
m1          Fixed sampling sensors
m2          Mobile sensors
m3          RFID sensors
m4          Point description analysis
m5          Line description analysis
m6          Point pattern analysis
m7          Line pattern analysis
m8          Classification
m9          Regression
m10         Global spatial autocorrelation
m11         Local spatial autocorrelation
m12         Interpolation
m13         Exploratory spatial data analysis (ESDA)
m14         Central tendency
m15         Statistical graph

*The attributes are coded m1–m15 for convenience of description.


Three types of sensors, Fixed Sampling, Mobile, and radio frequency identification (RFID) sensors, were derived from a review of existing sensor types and used for the work at hand. The classification is focused on spatial statistical analysis and not the broader sensor definitions promoted by standards organizations like OGC. For example, in our classification the Fixed Sampling Sensors class refers to sensors like humidity or temperature sensors whose locations are relatively fixed (in some situations they can move within a certain area). In addition, the data collected by this kind of sensor are tied to a point but usually provide a representative average value for an area. This kind of data coupled with geo-referenced information is widely used in spatial statistical analysis. The RFID Sensors class, on the other hand, is usually used to identify entities and read their information with wireless communication technology (Kamoun and Miniaoui 2015). Unlike a mobile sensor such as GPS, which uses absolute coordinates, an RFID sensor utilizes relative geo-references to trace the location and movement of an entity. Moreover, in some applications such as transportation monitoring, RFID sensors are also used to monitor the traffic flow at road crossings. This kind of statistical data is also frequently utilized in spatial statistical analysis. Some of the other selected attributes were chosen because these functionalities are incorporated in one or more of the methods, and still others like Global and Local Spatial Autocorrelation were chosen because they are important basic issues that pervade most applications of spatial statistical services (Anselin and Rey 2010; Radersma and Sheldon 2015). After the objects and attributes are identified, we use FCA to describe the relationships among them.
Although the basic mathematical method is easy to perform, we carried out the FCA with the software Conexp-1.3 (see http://sourceforge.net/projects/conexp/ for additional details) because of the large number of concepts used in this article. With the help of the software, if an object and an attribute are related, a cross sign is placed in the corresponding cell. With this operation, the formal context was constructed as shown in Table 3. The existing software catalogue(s) and published research on spatial statistics were used to link objects and attributes together. For example, the spatial statistical services g17–g23 were listed under the Explore menu in the software GeoDa because they correspond to the attribute m13 (Exploratory Spatial Data Analysis). Given this knowledge, we can link them together. Once the specification of the formal context has been completed, new concepts are indicated by the combinations of attributes that are connected with various objects. Table 4 lists all of the new concepts derived from the formal context matrix. With the help of the software, the corresponding concept lattice is generated. The original generated concept lattice is displayed in Fig. 5. To make the relationships between these concepts more explicit, the original graph was reorganized. Compared with the original one, this new concept lattice graph (Fig. 6) explicitly displays the hierarchical structure between the formal concepts. Table 5, in turn, lists the inclusion relationships between the formal concepts at each level. In the original concept lattice reproduced in Fig. 5, every node stands for a concept. The bottom concept C1 contains all attributes of the context, and its extent would have to consist of objects having all attributes, so it does not correspond to a meaningful concept. The top concept contains all of the objects and is also a meaningless concept. The two concepts are included just to make the lattice complete.
Other concepts with lines connecting them have an inheritance relationship along a descending path: an upper node inherits all the objects of a lower node connected with it and also has its own objects. Conversely, a lower node inherits all the attributes of an upper node and also has its own attributes. Therefore, in the concept lattice in Fig. 5, only new attributes and objects are labeled at each node.
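The inheritance rule can be checked directly with the two derivation operators of FCA. The sketch below is an illustration, not the Conexp implementation: the attribute extents are an assumed transcription of the concepts listed in Table 4, and the helper names (`span`, `extent`, `intent`) are hypothetical. It verifies that C12 sits above C2, so that C12 inherits the objects of C2 while C2 inherits the attributes of C12.

```python
def span(first, last):
    """Return the object labels g<first> .. g<last>, inclusive."""
    return {f"g{i}" for i in range(first, last + 1)}


# Attribute -> objects carrying it (assumed reading of the concepts in Table 4).
EXTENT = {
    "m1": span(1, 9) | span(15, 45),
    "m2": span(1, 16),
    "m3": span(10, 23) | span(36, 45),
    "m4": span(4, 8) | {"g15"} | span(17, 23),
    "m5": {"g14", "g15"},
    "m6": span(1, 3) | {"g9"} | span(24, 30),
    "m7": span(10, 13) | {"g16"},
    "m8": span(36, 39),
    "m9": span(31, 35),
    "m10": {"g16", "g24"} | span(28, 30) | span(32, 34),
    "m11": {"g16"} | span(25, 27) | span(32, 35) | span(43, 45),
    "m12": span(40, 45),
    "m13": span(16, 30),
    "m14": {"g4", "g5", "g10", "g15"},
    "m15": span(17, 24),
}
ALL_OBJECTS = span(1, 45)


def extent(attributes):
    """Objects that carry every attribute in `attributes`."""
    objects = set(ALL_OBJECTS)
    for m in attributes:
        objects &= EXTENT[m]
    return objects


def intent(objects):
    """Attributes shared by every object in the set `objects`."""
    return {m for m, carriers in EXTENT.items() if objects <= carriers}


# Lower concept C2 and upper concept C12 as listed in Table 4.
c2_intent = intent({"g32", "g33", "g34"})
c2_extent = extent(c2_intent)
c12_extent = extent({"m1", "m9", "m11"})
c12_intent = intent(c12_extent)

print(sorted(c2_intent))
print(sorted(c12_extent))
# The upper node holds all objects of the lower node; the lower node holds
# all attributes of the upper node.
print(c2_extent <= c12_extent, c12_intent <= c2_intent)
```

In a reduced labeling such as the one used in Fig. 5, g35 would be the only object written at the node of C12 and m10 the only attribute written at the node of C2, because everything else is inherited along the connecting line.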


Table 3. The formal concept context presented by a matrix

g1 g2 g3 g4 g5 g6 g7 g8 g9 g10 g11 g12 g13 g14 g15 g16 g17 g18 g19 g20 g21 g22 g23 g24 g25 g26 g27 g28 g29 g30 g31 g32 g33 g34 g35 g36 g37 g38 g39 g40 g41 g42

m1

m2

X X X X X X X X X

X X X X X X X X X X X X X X X X

X X X X X X X X X X X X X X X X X X X X X X X X X X X X

m3

m4

m5

m6

m7

m8

m9

m10

m11

m12

m13

m14

m15

X X X X X X X X

X X

X X X X X X X X X X X X X X X

X X X X X

X

X X

X X

X

X X X X X X X X X X X X X X X

X X X X X X X X X X X X X X

X X X X X X X X X X X X

X X X X X X X

X

X X X

X X X X X X X X

X X X X

X X X X X X X

11

Geographical Analysis

Table 3 Continued m1 g43 g44 g45

X X X

m2

m3 X X X

m4

m5

m6

m7

m8

m9

m10

m11

m12

X X X

X X X

m13

m14

m15

In the cleaned-up lattice presented in Fig. 6, the set of new concepts and their relationships are organized into seven levels. Since level 1 and level 7 correspond to the meaningless concepts of C42 and C1, they need not be discussed further. At level 6, the actual bottom level, the objects are associated with the most attributes, which means the corresponding spatial statistical methods can be classified into several categories. For example, in C2 (g32, g33, g34, m1, m9, m10, m11), g32 Spatial lag model, g33 Spatial error model, and g34 Spatial filtering model can be used in or classified with the Fixed Sampling sensors (m1), Regression (m9), Global Spatial Autocorrelation (m10), and Local Spatial Autocorrelation (m11). Moreover, because of the inheritance relationships between formal concepts in different levels, these services are also involved in concepts across all the other levels. This information indicates these services could be used with a wider range of data types and that they contain more abundant meanings. So in the classification, they may appear in many subclasses. From levels 6 to 2, the spatial statistical services gradually focus on more specific aspects as the contained attributes in the concepts decrease. For instance, at level 4, C31 (g10, g11, g12, g13, g14, g15, g16, m2, m3) is a concept with no inheritance relationships with the direct upper level concepts, in which the contained objects such as g10 Linear Directional Mean, g12 Gamma Index, g14 Circular Variance are only used with the Mobile Sensor (m2) and RFID Sensor (m3) classes. This situation is understandable because these methods are not common techniques used in spatial statistical analysis. This information also reveals that the data collected by Mobile and RFID sensors can be taken as line data. Furthermore, from levels 4 to 2, few new objects are added to the formal concepts while the associated attributes are decreasing. 
Especially at levels 3 and 2, all the objects included in the concepts are inherited from the former levels. In the vertical structure, concepts at different levels with inheritance relationships can be defined as classes and subclasses in the classification. In the horizontal structure, the formal concepts at the same level mimic one another and they can be taken as subclasses of the upper level concepts. Next, by analyzing both the vertical and horizontal structures of the concept lattice, the final classification can be inferred. Taking C39 as an example, the concepts C26 (m1, m9), C27 (m1, m11), C30 (m1, m3), C34 (m1, m6), C35 (m1, m10), C36 (m1, m13), C37 (m1, m2), and C38 (m1, m4) (the objects are omitted) are super-concepts of C39 (m1) at level 2. Through the attributes contained in these super-concepts, such as m6, m9, m10, and m11, and their reference meanings, it is possible to determine the subclasses included in the class represented by m1, the Fixed Sampling Sensor class. Notably, although C30 is a super-concept of C39, m3 cannot be taken as a subclass of m1 since they are on the same level. Similarly, in C37, m2 cannot be thought of as a subclass of m1. Moreover, C9 (m1, m3, m8) in level 4 is not a direct super-concept of C39, but m8 could be taken as a direct subclass of m1 in the final classification. This result is


Table 4. The new concepts generated from the formal concept matrix

New concepts
C1   (m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,m11,m12,m13,m14,m15)
C2   (g32,g33,g34,m1,m9,m10,m11)
C3   (g25,g26,g27,m1,m6,m11,m13)
C4   (g16,m1,m2,m3,m7,m10,m11,m13)
C5   (g1,g2,g3,g9,m1,m2,m6)
C6   (g43,g44,g45,m1,m3,m11,m12)
C7   (g24,m1,m6,m10,m13,m15)
C8   (g17,g18,g19,g20,g21,g22,g23,m1,m3,m4,m13,m15)
C9   (g36,g37,g38,g39,m1,m3,m8)
C10  (g15,m1,m2,m3,m4,m5,m14)
C11  (g10,m2,m3,m7,m14)
C12  (g32,g33,g34,g35,m1,m9,m11)
C13  (g16,g32,g33,g34,m1,m10,m11)
C14  (g16,g25,g26,g27,m1,m11,m13)
C15  (g16,g43,g44,g45,m1,m3,m11)
C16  (g24,g28,g29,g30,m1,m6,m10,m13)
C17  (g16,g17,g18,g19,g20,g21,g22,g23,m1,m3,m13)
C18  (g40,g41,g42,g43,g44,g45,m1,m3,m12)
C19  (g15,g16,m1,m2,m3)
C20  (g17,g18,g19,g20,g21,g22,g23,g24,m1,m13,m15)
C21  (g10,g11,g12,g13,g16,m2,m3,m7)
C22  (g15,g16,g17,g18,g19,g20,g21,g22,g23,m1,m3,m4)
C23  (g14,g15,m2,m3,m5)
C24  (g10,g15,m2,m3,m14)
C25  (g4,g5,g15,m1,m2,m4,m14)
C26  (g31,g32,g33,g34,g35,m1,m9)
C27  (g16,g25,g26,g27,g32,g33,g34,g35,g43,g44,g45,m1,m11)
C28  (g24,g25,g26,g27,g28,g29,g30,m1,m6,m13)
C29  (g16,g24,g28,g29,g30,m1,m10,m13)
C30  (g15,g16,g17,...,g20,g21,g22,g23,g36,g37,g38,g39,g40,g41,g42,g43,g44,g45,m1,m3)
C31  (g10,g11,g12,g13,g14,g15,g16,m2,m3)
C32  (g4,g5,g6,g7,g8,g15,m1,m2,m4)
C33  (g4,g5,g10,g15,m2,m14)
C34  (g1,g2,g3,g9,g24,g25,g26,g27,g28,g29,g30,m1,m6)
C35  (g16,g24,g28,g29,g30,g32,g33,g34,m1,m10)
C36  (g16,g17,g18,g19,g20,g21,g22,g23,g24,g25,g26,g27,g28,g29,g30,m1,m13)
C37  (g1,g2,g3,g4,g5,g6,g7,g8,g9,g15,g16,m1,m2)
C38  (g4,g5,g6,g7,g8,g15,g17,g18,g19,g20,g21,g22,g23,m1,m4)
C39  (g1,g2,g3...g7,g8,g9,g15,g16,g17...g27,g28,g29...g42,g43,g44,g45,m1)
C40  (g10,g11,g12,...,g16,g17,...,g21,g22,g23,g36,g37,g38,g39,g40,g41,...,g44,g45,m3)
C41  (g1,g2,g3,g4,g5,g6,g7,g8,g9,g10,g11,g12,g13,g14,g15,g16,m2)
C42  (g1,g2,g3,g4,g5...g23,g24,g25,g26,g27...g40,g41,g42,g43,g44,g45)


Figure 5. The original concept lattice established using Conexp-1.3.

understandable because all of the objects related to m8 are also contained in the set of objects related to m3 and as a result, no new concept with only m1 and m8 was extracted. The same situation also occurred with C18 (m1, m3, m12), in which m12 can be taken as a

Figure 6. The cleaned-up concept lattice graph. Arrows point from super- toward sub-concepts. Dashed-line arrows indicate super- and sub-concepts that were not located on successive levels.


Table 5. Lists of relationships between the concepts in the concept lattice graph*

Level 1   C42(C41,C40,C39)
Level 2   C41(C37,C33); C40(C31,C29); C39(C38,C37,C36,C35,C34,C30,C27,C26)
Level 3   C38(C32,C22); C37(C32,C19,C5); C36(C29,C28,C20,C17,C14); C35(C29,C13); C34(C28,C5)
Level 4   C33(C25,C24); C32(C25); C31(C24,C23,C21,C19); C30(C22,C19,C18,C17,C15); C29(C16,C4); C28(C16); C27(C15,C14,C13,C12); C26(C12)
Level 5   C25(C10); C24(C11,C10); C23(C10); C22(C10,C8); C21(C11,C4); C20(C8,C7); C19(C10,C4); C18(C6); C17(C8,C4); C16(C7); C15(C6,C4); C14(C4,C3); C13(C4,C2); C12(C2)
Level 6   C11, C10, C9, C8, C7, C6, C5, C4, C3, C2
Level 7   C1

*The first concept preceding the brackets is a super-concept of each concept in the brackets after it. Concepts in levels 6 and 7 have no sub-concepts.

direct subclass of m1. With comprehensive analysis of the sub-concepts of C39, the subclasses of the Fixed Sampling Sensor class (m1) are identified as Global Spatial Autocorrelation (m10), Local Spatial Autocorrelation (m11), Interpolation (m12), Regression (m9), Point Description Analysis (m4), Exploratory Spatial Data Analysis (ESDA) (m13), Point Pattern Analysis (m6), Classification (m8), and Statistical Graph (m15). Finally, the spatial statistical services contained in these subclasses can be identified based on the associated objects. In this manner, the final classification was constructed as shown in Fig. 7.

Because the method is dynamic, another classification system can be obtained if the user’s demands change. In this second case, we assume that the users are familiar with the data types and with the relationships between the data types and the spatial statistical services. Thus, we obtain the new attribute set listed in Table 6. With the assistance of the classification pattern, we can confirm the following relationships among the newly selected attributes: m1 (Point) includes the subclasses m6 (Global Spatial Autocorrelation), m4 (Spatial Cluster), m5 (Regression), m8 (Interpolation), m9 (Exploratory Spatial Data Analysis [ESDA]), and m7 (Local Spatial Autocorrelation); m2 (Polyline) includes the subclasses m8 (Interpolation) and m9 (Exploratory Spatial Data Analysis [ESDA]); and m3 (Polygon) includes the subclasses m6 (Global Spatial Autocorrelation), m4 (Spatial Cluster), m5 (Regression), m8 (Interpolation), m9 (Exploratory Spatial Data Analysis [ESDA]), and m7 (Local Spatial Autocorrelation). Through the structure of these relationships, a new classification system is obtained, as shown in Fig. 8.
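The way candidate subclasses of a root class are read off the attributes of related concepts can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the intents are copied from Table 4 for a hand-picked set of concepts (those named in the text for C39, plus C20), and the function name `candidate_subclasses` and its simple co-occurrence rule are hypothetical. Applied to every concept in Table 4, the same rule would also return m5, m7, and m14, which the final classification does not place under m1, so the level-based judgment described in the text is still needed.

```python
# Intents (attribute sets) of a hand-picked subset of concepts from Table 4.
# Objects are omitted, as in the text. The selection itself is an assumption.
INTENTS = {
    "C26": {"m1", "m9"},
    "C27": {"m1", "m11"},
    "C30": {"m1", "m3"},
    "C34": {"m1", "m6"},
    "C35": {"m1", "m10"},
    "C36": {"m1", "m13"},
    "C37": {"m1", "m2"},
    "C38": {"m1", "m4"},
    "C9": {"m1", "m3", "m8"},
    "C18": {"m1", "m3", "m12"},
    "C20": {"m1", "m13", "m15"},
}

# Root classes in the first case: the three sensor attributes.
ROOTS = {"m1", "m2", "m3"}


def candidate_subclasses(intents, root, roots):
    """Return non-root attributes that share at least one concept with `root`."""
    found = set()
    for intent in intents.values():
        if root in intent:
            # Other roots (e.g. m2, m3) sit on the same level and are excluded.
            found |= intent - roots
    # Sort by the numeric part of the code so that m4 precedes m10.
    return sorted(found, key=lambda m: int(m[1:]))


subclasses_of_m1 = candidate_subclasses(INTENTS, "m1", ROOTS)
print(subclasses_of_m1)
```

Excluding the other root attributes mirrors the remark that m3 cannot be a subclass of m1 because both are sensor classes on the same level.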

Discussion

The proposed dynamic classification pattern satisfies different users’ needs within ubiquitous computing environments, and we attempted to obtain different classification systems of spatial statistical services using dynamic classification patterns. Through the experiment, we revealed complicated associations between spatial statistical services and showed that the dynamic classification pattern is more adaptable and practical.


Figure 7. The final classification showing the concept lattice graph results.


Constructing a dynamic classification pattern which is more adaptable and practical

In the final classification, we obtained two different classification systems using the dynamic classification pattern. In the first case, three kinds of sensors were taken as the root classes, and several other selected attributes, which represent the functionalities or basic issues contained in these spatial statistical methods, were taken as subclasses. In the second case, three data types were taken as the root classes.


Table 6. List of the new selected attributes used in the dynamic classification pattern*

Code   Attribute
m1     Point
m2     Polyline
m3     Polygon
m4     Spatial cluster
m5     Regression
m6     Global spatial autocorrelation
m7     Local spatial autocorrelation
m8     Interpolation
m9     Exploratory spatial data analysis (ESDA)
m10    Classification
m11    Statistical graph

*The attributes are coded m1–m11 for convenience of description.
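The attribute codes in Table 6, together with the subclass relationships stated for the second case, can be written down as a small lookup structure. The sketch below is only an illustration of how a user might query the resulting classification by data type; the names `ATTRIBUTE_NAMES`, `SUBCLASSES`, and `subclasses_for` are hypothetical and do not come from the article.

```python
# Attribute codes and names, copied from Table 6.
ATTRIBUTE_NAMES = {
    "m1": "Point",
    "m2": "Polyline",
    "m3": "Polygon",
    "m4": "Spatial cluster",
    "m5": "Regression",
    "m6": "Global spatial autocorrelation",
    "m7": "Local spatial autocorrelation",
    "m8": "Interpolation",
    "m9": "Exploratory spatial data analysis (ESDA)",
    "m10": "Classification",
    "m11": "Statistical graph",
}

# Root data types and their subclasses, in the order stated in the text.
SUBCLASSES = {
    "m1": ["m6", "m4", "m5", "m8", "m9", "m7"],
    "m2": ["m8", "m9"],
    "m3": ["m6", "m4", "m5", "m8", "m9", "m7"],
}


def subclasses_for(data_type):
    """Return the subclass names for a root data type given by name."""
    # Look up the code for the name; raises StopIteration for unknown names.
    code = next(c for c, name in ATTRIBUTE_NAMES.items() if name == data_type)
    return [ATTRIBUTE_NAMES[m] for m in SUBCLASSES[code]]


polyline_classes = subclasses_for("Polyline")
print(polyline_classes)
```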

The dynamic nature of the classification makes the classification pattern more adaptable and practical. Unlike traditional classification, which emphasizes one fixed perspective, the dynamic classification pattern considers the different demands of users in ubiquitous computing networks. In the final results, users whose data were collected by different sensors, or whose data are of different types, could immediately find out which of the available spatial statistical methods were suitable for the analysis of their data.

Revealing the complicated associations between the spatial statistical services

The dynamic classification pattern also revealed the complicated associations between the spatial statistical services, which makes this approach more comprehensive and logical. Since some spatial statistical services are related to several attributes, they occur in different subclasses of the same class. For example, the Moran’s I method (g29) emerged in both the Point Pattern Analysis (m6) and Global Spatial Autocorrelation (m10) subclasses of the Fixed Sampling Sensors (m1) class. This kind of repetition could make the classification more comprehensive and improve the query efficiency, although it may also cause problems in terms of specificity and/or redundancy.

The advantages of applying the concept lattice method are readily apparent. By analyzing the relationships between the objects and attributes, the formal concepts which represent some categories are mined. Then, the concept lattice is generated directly by the algorithm rather than being predefined, such that the classification can be drawn from the hierarchical structure and relationships contained in these concepts. This method is especially effective and reliable when dealing with a large number of interrelated entities. In fact, with this method, any number of additional categories could be added to the classification scheme flexibly. Moreover, the concept lattice reveals more complicated associations between the concepts than the classification does, since the classification can be constructed using only parts of the concept lattice. The hierarchical structure of the concept lattice, both vertically and horizontally, gives insight into the nature of these services, as was partly discussed in the Experiment and Results section.
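How formal concepts and their ordering arise from object-attribute relationships, rather than being predefined, can be illustrated on a toy context. The sketch below is a brute-force enumeration under stated assumptions and is not the tool used in the article (Fig. 5 credits Conexp-1.3). The four attribute rows are inferred here from the concepts in Table 4 in which each object appears and should be read as illustrative; all function names are hypothetical. Closing every subset of objects is exponential in the number of objects, so this approach only suits very small examples.

```python
from itertools import combinations

# Toy formal context: object -> attributes (rows inferred from Table 4).
CONTEXT = {
    "g29": {"m1", "m6", "m10", "m13"},
    "g31": {"m1", "m9"},
    "g32": {"m1", "m9", "m10", "m11"},
    "g35": {"m1", "m9", "m11"},
}
ALL_ATTRS = set().union(*CONTEXT.values())


def common_attributes(objects):
    """Derivation A -> A': attributes shared by every object in A."""
    attrs = set(ALL_ATTRS)
    for g in objects:
        attrs &= CONTEXT[g]
    return attrs


def objects_having(attrs):
    """Derivation B -> B': objects that have every attribute in B."""
    return {g for g, row in CONTEXT.items() if attrs <= row}


def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every subset of objects."""
    seen = set()
    for size in range(len(CONTEXT) + 1):
        for subset in combinations(sorted(CONTEXT), size):
            intent = frozenset(common_attributes(subset))
            extent = frozenset(objects_having(intent))
            seen.add((extent, intent))
    return seen


def is_subconcept(lower, upper):
    """(A1, B1) <= (A2, B2) holds when extent A1 is a subset of extent A2."""
    return lower[0] <= upper[0]


concepts = formal_concepts()
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

In this toy context g29 belongs both to the concept with intent {m1, m10} and to the concept with intent {m1, m6, m10, m13}, which is the kind of repeated membership described above for the Moran’s I method.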


Figure 8. Another final classification showing the concept lattice graph results.


Limitations and future enhancements

As demonstrated in this study, the dynamic classification pattern offers several advantages in both applicability and effectiveness. However, it also presents some challenges for further development and application. First and foremost, the accuracy of the final concept lattice depends on a complete and explicit description of the objects and attributes. The objects considered in this article were a selection of unique spatial statistical services and their attributes, and the choices we made will certainly have influenced the final classification. In addition, the current implementation did not consider the ways in which many data sets can be transformed between point, line, and polygon data structures, which potentially complicates the relationships elucidated in this article.

Conclusions

Several conclusions can be drawn from the above discussion. First and foremost, in today’s ubiquitous network environments, spatial statistical analysis is becoming more and more important with the increasing availability and variability of geo-referenced data and their use in a wide range of applications. Considering the characteristics of both ubiquitous network environments and spatial statistical analysis, this article utilized formal concept analysis to propose a classification pattern for spatial statistical analysis web services in ubiquitous computing environments. Second, the concept lattice that lies at the heart of this method proved to be an effective analytical tool for revealing the complex relationships between different entities sharing common properties. Last, but not least, a dynamic classification pattern, rather than a single classification result, makes good sense. The dynamic classification pattern is reasonable and feasible, and we hope the classification pattern presented in this article can contribute to the promotion of spatial statistical analysis in the rapidly expanding and evolving ubiquitous computer networks that characterize the modern world.


Acknowledgements

The research is supported by the National Key S&T Special Projects of China (project No. 2017YFB0503704), the National Natural Science Foundation of China (project No. 41671380), and the Open Research Fund Program of Shenzhen Key Laboratory of Spatial Smart Sensing and Services.

