Geospatial Information Bottom-Up: A Matter of Trust and Semantics

Mohamed Bishr, Werner Kuhn Institute for Geoinformatics, University of Muenster, Germany {m.bishr,kuhn}@uni-muenster.de

Abstract. Geographic Information Science and business are facing a new challenge: understanding and exploiting data and services emerging from online communities. With the emerging technologies of the social web, GI users have switched from being data consumers to being data producers. The challenge, we argue, is to make this user-generated GI usable. As a use case, we point to the increasing demand for up-to-date geographic information combined with the high cost of maintenance, which present serious challenges to data providers. In this paper we argue that the social web, combined with social network science, presents a unique opportunity to reduce the cost of maintaining and updating geospatial data and to provide a platform for bottom-up approaches to GI. We propose to focus on web-based trust as a proxy measure for quality and to study its spatio-temporal dimensions. We also point to work on combining folksonomies with ontologies, allowing for alternative models of metadata and semantics as components of our proposed vision.

Keywords: trust, spatio-temporal trust, social networks, social semantics, quality


Introduction

Like any information science, GI science has to address the social nature of its subject, in addition to the technical and cognitive aspects. Here, we are not focusing on economic, institutional, legal, or ethical aspects, but on the social side of quality and semantics highlighted by recent technological developments. Collaborative geospatial web applications have created a new class of GI producers, involving those that were previously only GI consumers. The huge challenges and opportunities resulting from this development call for new approaches to accommodating and handling new GI sources. We combine the notion of networks as dynamic and complex systems with a review of ongoing developments in web-based information technologies to identify new approaches to GI quality and semantics. In particular, we focus on the notion of trust in social networks and its spatio-temporal dimensions, and on attempts to combine folksonomies with ontologies.

We start by examining the social nature of geospatial information and its semantics. We are motivated by the notion of geospatial information communities, originally proposed by OGC a decade ago. An information community is defined as “a community of geodata producers and users who share a common set of feature definitions and ontology of real world phenomena” [32 p.57]. Information communities of a different kind from those we know around GI have now suddenly become reality and remain fully consistent with this definition. They are an object of study in the science of social networks, where nodes are agents engaged in activities [1, 2]. In such networks, the links affect the overall behavior of a system as well as the activities at the nodes. In our connected age [2], anything that happens and how it happens depends on such networks. The ways in which the internet keeps changing show that we are just starting to understand the implications of a networked society.
Networks of individuals are reshaping our understanding of human behavior in large-scale collaboration environments. Producers and users of GI are directly affected by these developments, but lack theories to understand their implications. Here, we are primarily interested in the network effects transforming users from content consumers to producers and in the social semantics emerging from such user communities. The open source software movement and the plethora of successful applications that emerged from it are an early example of the power that a networked society can give to “the masses”. Open source is an alternative business model for distribution of software, but the open source software development process itself is based on the collective intelligence of software developers empowered by technology that allows them to communicate, work remotely, and coordinate their efforts. The wider phenomenon includes multiple examples that are diverse and spread out over many domains. Benkler [3] asks questions such as: why do 4.5 million volunteers contribute their leftover computer cycles to create the most powerful supercomputer on Earth, SETI@Home? Or why and how can fifty thousand volunteers successfully coauthor Wikipedia 1, and give it away for free? The case of Wikipedia is indeed compelling.

Fig. 1. Wikipedia.org user contribution statistics [3]

Figure 1 illustrates the number of contributions from Wikipedia users. To put these numbers in perspective, consider that the number of English language articles more than doubled (to 1.4 million) in a period of slightly over a year ending in September 2006 [33]. Wikipedia attracted around 60 million different visitors during that month, while it is estimated that “only” 1-2% of the visitors contribute to the site. However, the impact of their collaborative effort is multiplied through the vast network. One of the most noticeable achievements of Wikipedia is how the integrity and authenticity of the information generally resist attempts at vandalism, particularly in highly controversial topics with strong political biases (e.g. articles on abortion or human rights issues). Some acts of vandalism were corrected by the contributors before they even attracted the attention of general users [3]. This is a remarkable characteristic, suggesting that with minimal governance and control, web-based collaborative applications remain effectively self-governed. However, as we illustrate later, other views of Wikipedia are less favorable, suggesting that the overall quality is being lowered by vandalism as the wiki becomes more important and hence more attractive to illicit behavior. This highlights the need for establishing trust and making it explicit in online communities.

1 www.wikipedia.org

Among the other notions emerging around Web 2.0 are folksonomies. A folksonomy is a collaboratively generated, open-ended labeling system that enables web users to categorize content, such as web pages, online photographs, web links, blog entries, or just about any information they encounter or contribute to the web themselves. Folksonomies offer interesting insights into metadata developed collaboratively by communities to categorize their contributed content. They are also being discussed by many in relation to ontologies and the semantic web vision [4, 30]. A different insight into emergent semantics [5, 6] has a more social perspective, where folksonomies can play a role in establishing a novel social semantics approach. Modern linguistics has long recognized the social (situated) nature of language and thereby of information. The geospatial domain is experiencing its own version of this trend through collaborative approaches to geospatial information maintenance, update and production.

Our research stems from an umbrella question that we initially posed: how can collaborative GI be made more usable for a wider base of users? We identify three sub-questions, which we believe are essential for the success of collaborative GI.
• In light of a potentially large flow of information, how can we enhance the governance of collaborative GI applications to ensure that high-value contributions are embraced and low-value or fraudulent contributions are discarded?
• How can we provide metadata for the collaboratively generated GI?
• How can the semantics of this collaborative GI be made explicit to enhance the overall usability of the data?

We present two applications among those shaping this trend within the GI community and draw some conclusions on research needs. Collaborative web applications, sometimes covered under the umbrella term Web 2.0, exhibit characteristic properties of real world social networks. We expect that social network science helps in understanding their dynamics of collaboration, with the purpose of making the information produced more relevant and useful to users. We use the presented applications to illustrate the observed shortcomings that bred our above-mentioned questions. A guiding scenario is then used as the basis for our research approach to establish a robust environment for collaborative geospatial applications.
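
To make the notion of a folksonomy more concrete, the following minimal Python sketch models it as a list of (user, resource, tag) assignments and counts how often each tag has been applied to a resource. The user names, resource identifiers and tags are hypothetical and serve only as an illustration; they are not taken from any of the applications discussed in this paper.

```python
from collections import Counter, defaultdict

# Each assignment records which user attached which free-form tag
# to which resource (all values below are invented for illustration).
assignments = [
    ("alice", "segment-17", "low-bridge"),
    ("bob", "segment-17", "low-bridge"),
    ("carol", "segment-17", "narrow"),
    ("bob", "segment-42", "rest-area"),
]


def tags_by_resource(assignments):
    """Build a per-resource tag frequency index from tag assignments."""
    index = defaultdict(Counter)
    for _user, resource, tag in assignments:
        index[resource][tag] += 1
    return index


index = tags_by_resource(assignments)
most_common_tag = index["segment-17"].most_common(1)[0]
```

In such a representation there is no predefined vocabulary: the set of tags simply grows with the contributions, and agreement between users shows up as repeated tags on the same resource.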

A Vision Becoming a Reality

Traditional approaches to maintenance and update of GI involve government and private sector data collection activities. The data collection methods have evolved over the years, but the organizational models underlying these methods have largely remained the same. We call these models top-down approaches. They preserve the distinction between GI providers and GI consumers, as opposed to bottom-up approaches, which blur the distinction and close the loop of GI production and maintenance in ways that can benefit providers and consumers. In early 1998, German mapping agencies held a workshop on maintenance of geospatial databases. The second author of this paper proposed to close the loop with users and offer them messaging and feedback capabilities to update road, land use, and other frequently changing data. Almost ten years later, the technological components are in place to make this vision a reality. However, as this vision unfolds, the scope of the challenges involved broadens, and the need for methodological work on collaborative GI (to which we propose a social networks approach) becomes apparent. Based on the trend discussed in the introduction, we summarize the two core motivations behind our work as follows:
a. the emergence of a new role for web users, who are traditionally content consumers, as content producers;
b. the emergence of the Social Web or Web 2.0, with a set of technologies that enable the formation of large-scale, online collaborative social communities, attempting to harness the collective knowledge and intelligence of their users.

In the next sections, we discuss these two motivations in more detail and with examples, while highlighting inherent shortcomings in currently emerging collaborative GI environments. We then present our motivating scenario and dedicate the rest of the paper to a discussion of our proposed approach to the scenario.
GI Consumers as GI Producers

Collaborating users empowered by technology have proven to be innovation drivers in online communities. The web in general and the geospatial domain in particular require an integration of available technologies and the development of new technologies to tame and benefit from social content creation. Two examples of emerging applications are presented in the next sub-sections. Our aim is to illustrate that a trend is already set, and that we can learn about the potential and examine the deficiencies by observing the behavior of large-scale user communities.

Open Street Maps

Open street map 2 is a website that takes the form of a wiki. Wikis allow users to edit content collaboratively, each user having access to the content of the website and the ability to add, remove, or change it. The aim of wikis is to leverage the collaborative intelligence of their users. Looking at wikis in a geospatial context, one realizes that they pose new challenges for semantic interoperability between the information sources provided by users. For example, the vocabularies of different users need to be harmonized or translated, and quality control mechanisms need to be established.

2 www.openstreetmap.org

Fig. 2. Openstreetmap.org illustrating the Isle of Wight (UK), which is one of few areas that are now fully mapped

Openstreetmap aims at building street maps of major cities and making these available for public use and maintenance. Users upload GPS tracks to the wiki. The GPS tracks are then edited by users with open source editors made available by the wiki project. As GPS tracks from many users are added and turned into road segments, maps of many locations are gradually collected and continually refined. To organize the process, expert users monitor the contributions of less experienced users and practice a loose form of management to ensure consistency and correctness of the updates. In wikipedia.org the staff performs this role to a certain degree and provides minor safety measures. However, the governance of a wiki remains in the hands of its users and
not in those of an external authority. In a GI application like the open street map wiki, managing the evolution of the information is certainly more complex, as many users can report the same data multiple times with varying degrees of quality and with varying semantics. A key success factor for social applications is to empower users to contribute. We can observe the following on Openstreetmap.org:
• ordinary web users are empowered by modern technology to shift their role from GI content consumers to GI content producers;
• currently users can contribute road geometries to Openstreetmap, and the system allows for more information to be introduced by the users:
  - users can report Points of Interest (POI) to enrich the road maps;
  - users can add road directions and one-way information;
  - to some extent, other forms of thematic, temporal, or spatial data could be added;
• many problems arise as Openstreetmap grows, thereby making the management of the wiki more complex (but its potential value ever greater). The question is how to effectively manage such a flow of potentially interesting information;
• the information provided lacks any useful form of metadata that can assist in judging its quality or be used in retrieval and discovery.

The dilemmas facing Wikipedia.org, the role model of wikis, are being strongly debated. An analytical study of the problems and potentials is yet to be concluded. Some critical views strongly oppose the whole notion behind wikis. The process in itself is viewed as flawed and utterly bound to fail [27]. Other articles circulating on the web make the same point, and the reasons given revolve around the anonymity of participation in Wikipedia and the ability of anyone, even if less expert than the earlier editors, to make and commit edits to the wiki. This makes it very susceptible to vandalism, lowering the overall quality of Wikipedia over time.
On the other hand, proponents of Wikipedia acknowledge the problems surrounding Wikipedia today, but argue that these are signs of a need for different management techniques now that Wikipedia has reached a critical mass. Larry Sanger, a co-founder of Wikipedia (no longer part of the project), suggests “Anti-elitism, or lack of respect for expertise” [28] to be the root cause of the problem. He does not, however, point to the technical reasons behind this problem, which pertain to the anonymity and lack of accountability for previous actions of individuals in the editing process of Wikipedia. Added to this is the fact that Wikipedia has reached a critical mass not just in size but in importance; hence it attracts more vandalism by non-experts seeking to enforce and strengthen their views, biases and beliefs. This argument can be made clear by examining articles covering controversial issues such as abortion or political figures, where one can observe a high rate of change.

Our argument is that the current state of wikis is characterized by a low sense of community and a lack of explicit trust. A wiki forms a social community by definition, albeit a virtual one. However, it lacks the explicit definition of social relations by the users. For a community to grow, and for its members to become more confident in the community, some improved form of self-governance is needed. Functioning communities rely heavily on trust among their members [7-9]; this view, as we illustrate later, is the foundation of our vision for the road network example. While the Open street map wiki might not be as inherently susceptible to vandalism as Wikipedia, it is still clear that similar problems are in place. The all too common question of how wiki maps can maintain acceptable degrees of quality over time remains a serious challenge to such a concept of open collaborative map building. Social trust [8] in that context remains a central issue; a growing community without an accompanying sense of trust is more prone to decay over time.

Way Faring (a Mapping Mashup)

Wayfaring.com is another geospatial social web application, serving as a medium where a user or a community of users can enrich datasets with their knowledge about local environments or places they visit.

Fig. 3. Wayfaring.com allows users to build their own maps and collaborate with each other on shared maps, adding nodes and road segments along with metadata in the form of folksonomies. The Google maps API provides the background against which users build their maps.

Wayfaring.com is based on a mashup model that integrates various information sources to offer new services not possible to produce from each source alone. Its users overlay personal data like favorite restaurants or travel routes over Google maps via a mashup developed on the Google maps API (Application Programming Interface). Users of wayfaring.com can collaborate on building highly sophisticated maps, composed of the local experience of many web users. Other users can view these community-generated maps and consult them for the wealth of knowledge being continuously added and updated. We can observe the following on wayfaring.com:
• much like Openstreetmap, it empowers those who were traditionally lower-end geospatial information consumers to become content producers;
• very intuitive interfaces and APIs lie behind its success - an important characteristic of Web 2.0 applications in general;
• theoretically, other mashups can connect to wayfaring and integrate its data with other data sources to offer new services;
• users can collaborate on certain maps if declared public by the original author. Despite that, wayfaring still does not provide a platform for interaction between the users to coordinate their efforts (a low sense of community, much like wiki environments);
• there is no clear mechanism for self-governance and building or assessing trust between community members. This impedes the ability to judge the value of any particular content provided by users;
• Wayfaring relies on folksonomies as a means of collecting metadata about the user-generated content. Hence, there exist layers of interesting thematic information beyond the simple marked geometries.

The Social Web Or Web 2.0

The second motive we discuss is the notion of Web 2.0 3, which began with a conference brainstorming session between O'Reilly and MediaLive International, after which the first Web 2.0 conference was initiated. The term has evolved to encompass various applications and business practices on the Web.
It is also widely used to describe only a certain set of technologies, some of which are semantic web technologies and others are dubbed social technologies enabling large scale collaboration between web users. The potential offered by this evolution of the web is key to our vision. Web 2.0 is also sometimes referred to as the Social Web. It is generally accepted that Web 2.0 encompasses the key idea of harnessing the collective intelligence of web communities. Within our research, we adopt the term Social Web to describe a set of technologies and applications enabling interaction and collaboration between web users organized in virtual communities while being centered on common goals and objectives. This understanding of the Social Web is neither comprehensive nor agreed upon by the web community. It is, however, now agreed that the evolution of the web into a social interaction medium allowing web users to produce large amounts of information content is a fundamental change to how the web was previously used. This change of perception of the web was a result of many innovative applications and technologies that provided new approaches to application integration (e.g. mapping mashups) and knowledge representation and management (e.g. folksonomies) as well as wikis, Web Based Social Networks (WBSN) and web based social trust research.

In the next subsections, we discuss two essential components of Web 2.0 that in our view are important foundations for our approach to the navigation data scenario presented later. Social networks are the basis of our understanding of the dynamics of collaborative GI environments. Together with web based social trust research, they are used as the foundation for answering our first question regarding the governance of collaborative GI environments, which then leads to more trusted folksonomies and bottom-up semantics.

3 http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web20.html?page=1

Social networks

Social network science has its roots in multiple disciplines such as sociology, anthropology, and social psychology. The term social network was coined in 1954 by J. A. Barnes. A social network is made of nodes standing for individuals or institutions. The social network indicates ways in which they are connected in social terms (friends, family, business associates, etc.). A famous but controversial hypothesis in social networking is the “Small World Phenomenon”, which suggests that any two people in the United States are connected through a chain of no more than six acquaintances [13]. Such phenomena later led to the remarkable finding of small world networks [14]. Social networks arise from human contact through daily activities at home or work over communication channels. Consequently, there are spatial and temporal regularities in social network formation and maintenance. These networks occur and sustain themselves in particular space and time contexts. To connect social network structure with dynamics, a synthesis of social network theory in relation to social dynamics was presented [15]. This demonstrated, for example, how the strength of friendship relationships might decay over time when involved parties do not communicate. The dynamics of social networks also reflect the impact of individual changes on the collective behavior of the social network. For example, the decay of the strength of the friendship between any two members of a social network over time may not only impact the level of trust between them, but also the collective level of trust across the network (e.g. if these people are considered to be influential in the network). The utility of threshold models in understanding collective behavior, and how a person becomes influential in a social network, is discussed in other work such as [16]. Social thresholds refer to the minimum fraction of one’s peers who must have made a decision before the individual in question does. In a similar manner, [17] utilizes a contagion model (inspired by how diseases are transmitted) to examine the nonlinear social effects of neighborhood dynamics as behaviors are transmitted among members.

It is believed that, much like real world social networks, online communities can be viewed through social network analysis [10, 11, 21]. Online communities organized in the form of a social network are called WBSN. They exhibit very similar dynamics to real life social networks. Trust, reputation and social proximity play a key role in governing the relations between their actors [10]. Software that supports social networking on the internet is sometimes referred to as social software: “Social software enables people to rendezvous, connect or collaborate through computer-mediated communication and to form online communities.” (Wikipedia.org 29/8/2006).
WBSN users maintain lists of contacts to whom they are directly connected and via whom they are connected to others, forming a large web of interconnected individuals communicating via the available media (e.g. blogs, comments, messages, etc.). Management of relationships within and across communities, as well as of relationships between generated content and communities, becomes essential. The examples of social collaborative applications discussed earlier do not provide facilities for establishing WBSN, but provide means for users to collaborate. Integrating social networking into our vision of collaborative geospatial applications primarily requires a more explicit account of relations between the users as nodes in a social network. Constructing collaborative GI environments as social networks or WBSN provides the first step towards improved governance of the environments, which is an essential step to ensure GI quality, as we discuss later.
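
The notion of a social threshold mentioned in this section can be illustrated with a minimal Python sketch. It is not the model of [16], only a simplified illustration under our own assumptions: each node adopts a decision once the fraction of its direct contacts who have already adopted reaches its personal threshold. The network, the names and the threshold values are hypothetical.

```python
from typing import Dict, Set


def run_threshold_model(neighbors: Dict[str, Set[str]],
                        thresholds: Dict[str, float],
                        seeds: Set[str]) -> Set[str]:
    """Return the nodes that have adopted once no further node changes."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, peers in neighbors.items():
            if node in adopted or not peers:
                continue
            # Fraction of this node's contacts that have already adopted.
            fraction = len(peers & adopted) / len(peers)
            if fraction >= thresholds[node]:
                adopted.add(node)
                changed = True
    return adopted


# Hypothetical four-person network with symmetric contact lists.
network = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat"},
}
thresholds = {"ann": 0.5, "bob": 0.3, "cat": 0.6, "dan": 1.0}

cascade = run_threshold_model(network, thresholds, seeds={"ann"})
```

With these values a single early adopter is enough to bring the whole network along, whereas raising every threshold to 1.0 leaves the seed isolated. This is the sense in which individual changes can affect the collective behavior of the network.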

Trust layer

Trust in social networks is the second step towards a tighter and better governance of collaborative GI environments. Trust in computer science has been co-opted by many subfields to mean many different things. It is for example a descriptor of security and encryption [18], a name for authentication methods or digital signatures [19], a factor in game theory [20], and a motivation for online interaction and recommender systems [21]. The computing community is also researching social aspects of trust. Trust from that perspective is a prerequisite for the existence of a community, as functioning societies rely heavily on trust among their members [7-9]. Good reputation is something that members of a community are keen on establishing and maintaining to generate trust. Thus, WBSN actors start to find formal ways of asserting trust in other actors in order to make decisions about the types of interactions to engage in, be it to produce or consume content. An attempt has been made to establish a mathematical formalization of social trust [22], such that distributed artificial intelligence could benefit from enabling trusting behavior for agents in a distributed system. Other work was done on trust in WBSN [10] for a variety of applications including email filtering and web based recommender systems. Golbeck’s [10] trust calculations proved superior in providing a measure of the quality of user movie reviews and emails respectively when compared to both the average rating and the ratings generated by traditional collaborative filtering algorithms. Some web researchers (and users) are also interested in using the social notion of trust to identify how answers provided on the web can be trusted. The Inference Web [23] aims to take opaque query answers and make the answers more transparent by providing explanations. There is also a discussion of an infrastructure extension of Inference Web [24], called Inference Web Trust (IWTrust), which can quantify users’ degree of trust in answers obtained from web applications and services. IWTrust uses trust values between users as well as trust values between users and provenance elements to come up with trust ratings for answers to web queries. The computational algorithms of social trust [10, 11, 21, 24], particularly when applied to WBSN, can be used as a measure of quality for the content created in the social network. Trusted users tend to provide information that is more valuable (i.e. of higher relative quality) to their network peers. Within our vision, we intend to use trust along similar lines to assess the value of the socially generated geospatial content. This again amounts to the same question of governance of collaborative GI communities.
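
As a rough illustration of how trust can serve as a proxy for quality, the following Python sketch weights each user's rating of a contribution by the trust placed in that user. It is a plain trust-weighted mean and not a reproduction of the algorithms in [10] or [24]; the trust values (assumed to lie between 0 and 1), the user names and the ratings are hypothetical.

```python
from typing import Dict, Optional


def trust_weighted_rating(ratings: Dict[str, float],
                          trust: Dict[str, float]) -> Optional[float]:
    """Average the ratings, weighting each rater by the trust placed in them.

    Raters without a trust value count with weight 0. Returns None when
    no rater carries any trust, since no weighted mean is defined then.
    """
    total_trust = sum(trust.get(user, 0.0) for user in ratings)
    if total_trust == 0:
        return None
    weighted_sum = sum(trust.get(user, 0.0) * rating
                       for user, rating in ratings.items())
    return weighted_sum / total_trust


# Hypothetical ratings (1 to 5) of one contributed road annotation.
ratings = {"anna": 5.0, "ben": 1.0, "cara": 4.0}
# Hypothetical trust the querying user places in each rater.
trust = {"anna": 0.9, "ben": 0.1, "cara": 0.5}

plain_average = sum(ratings.values()) / len(ratings)
weighted = trust_weighted_rating(ratings, trust)
```

In this example the low rating from a barely trusted user pulls the plain average down to about 3.3, while the trust-weighted value stays at 4.4. Which of the two is the better quality estimate depends entirely on how well the trust values reflect the raters' reliability.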

Navigation Data Scenario

To understand the challenges posed by the information community vision in its Web 2.0 incarnation, we narrow our perspective to a scenario involving navigation and road data. Our scenario is inspired by projects like HeavyRoute [34]. The volume of freight carried on Europe's roads increases continuously. Heavy Goods Vehicles (HGV) operating on European roads are also carrying more gross weight per trip. The result is faster deterioration of roads, bridges and similar infrastructure elements. At the same time, European freight carriers are operating on low margins that force them to find better ways to maximize efficiency and profitability. Driver abilities are stretched to meet working hours and find facilities like rest locations or service stations along travel routes. Some aims of the HeavyRoute project are to:
• develop an advanced route guidance system for HGVs that covers the whole of Europe to allow users to pick the safest and most cost-effective routes;
• develop databases of static, periodic and dynamic road, bridge and traffic data as well as vehicle/infrastructure interaction models that can derive optimal routes;
• design and develop innovative route guidance and driver support applications for HGVs based on database contents and effect models.

Current navigation systems relied upon by HGVs fall short of being suitable components of the desired route guidance systems. They are not designed to provide for the specific needs of the freight industry. The HeavyRoute project requires a different breed of enriched road databases to power route guidance systems. They need knowledge about specific requirements of HGVs to be coupled with the specific road dataset properties to optimize routing decisions for HGVs. Such properties are, for example:
• allowed weights on roads;
• tight road bends where HGVs cannot maneuver, heights (bridges and tunnels) that might block passages;
• road widths not passable by HGVs;
• additional temporal information, such as restricted passing hours for HGVs on certain roads or road segments;
• constantly changing properties like temporary road blockades (e.g. maintenance operations) that can take days, weeks or even months.

To enrich road network datasets using traditional top-down approaches means that data providers will need to pursue additional maintenance and update efforts. The cost of GI will keep increasing as they try to make their
datasets more suitable for requirements of future applications. Some problems can be identified with the collection of this additional road network information:
1. navigation data providers are already plagued by high costs of updating and maintaining their datasets; adding properties to the datasets will incur even more costs, both short term (to collect them) and long term (to maintain them);
2. some information is changing at such a fast rate that it will exceed the maintenance abilities;
3. some information on best routes and other “tips and tricks” is considered local knowledge of experienced HGV drivers; this information is a valuable community resource that cannot become part of the traditional data collection process.

Our proposed approach to this scenario is based on facilitating bottom-up approaches to GI production. We intend to leverage what we call the social nature of GI and its semantics to provide a collaborative environment in which communities of users can create and maintain more valuable GI. We propose a collaborative environment that allows HGV drivers (and others) to contribute information in multiple categories such as the ones discussed above. In this scenario:
• drivers will have access to a website where they can locate problematic areas they encountered during their trips and annotate those areas with additional information (tagging);
• in addition, functionality is provided to navigation systems where drivers can mark faulty instructions and the system would report the problem and location via SMS to the central server.

Drivers are assumed to be using the best available datasets, which they intend to enrich. The HeavyRoute project stems from the fact that the best available road network data are, as is, not fit for HeavyRoute’s mission. Hence, this enrichment approach tries to solve this problem.
This essentially means that such a collaborative approach serves the overall good of the HGV operators, which is an ideal environment for collaborative applications in general. In such a scenario we believe the central problem is similar to that of other collaborative GI applications. It starts with the lack of measures to judge the quality of the collaboratively generated GI, continues with the collection of metadata, and ends with the problem of the semantics of collaborative GI and how to benefit from this potentially large flow of information in establishing community semantics.

Quality of Collaborative GI

In collaborative environments, content is created in a process that appears rather arbitrary. There is no centralized control of the virtual communities involved. The lack of governance, metadata and semantics is a result of the “bottom-up” nature of the applications, where users are non-experts and particularly not geospatial information experts. In particular, there are no means of determining the standardized aspects of geospatial information quality (lineage, accuracy, consistency and completeness). The lack of quality measures, semantics and metadata adversely affects the usability of information generated in this manner. The size of the challenge becomes clear considering the potentially large numbers of users and the amount of data that is constantly evolving. We use the observations from the discussed applications to identify two major problems facing our use case scenario:
a. The governance of the collaborative web environments: This problem pertains to the fact that the geospatial applications discussed earlier provide no means of assessing the quality of the provided information. With the large flow of information expected from social applications, it becomes very difficult to identify valuable information entities and discard those that are fraudulent or less valuable. In our view this is synonymous with identifying the expert contributors and discarding those that perform acts of vandalism or disrupt the network in any other manner. This is an essential step to render the collaborative environments useful.
b. The semantics of the collaboratively generated information: The collaborative geospatial applications discussed lack any means of collecting metadata as well as any explicit semantics. Alternative approaches are needed for collecting and maintaining metadata and for capturing the semantics of the generated geospatial information.
With reference to the first problem, in the next subsections we define the first components of our vision by adopting alternative measures of quality in collaborative GI environments. We rely strongly on the notions of social networks and trust, which we extend with spatio-temporal trust. Then we discuss our approach to social semantics with reference to the second problem.

Trust as a proxy for Geospatial Information Quality

Trust in social networks is the foundation of our approach to building an alternative measure of quality for collaborative environments for geospatial information creation and maintenance. We adopt trust as a proxy measure for geospatial information quality. Trust in web-based social networks is a measure of how valuable the information produced by some network users is to others [10-12]. Trusted users tend to provide more useful information than untrusted (or less trusted) users. Quality is a subjective measure here (and always, to some extent). If some trust-rated geospatial information is useful and relevant to a larger group of users, it can then be assumed to have satisfactory quality in a more objective sense.

The spatio-temporal dimension of trust

Utilizing social trust as a measure of the quality of geospatial information generated by actors in a social network poses new and different challenges as to how social trust can most accurately assert the value of geospatial information.

Fig. 4. An example: when users in a collaborative environment provide overlapping information, which contributions are most valuable?

For example, in Figure 4, John, Mike and Jeff are HGV drivers. They have all added the same information to Map A, marking a road segment as a maintenance spot with high potential for traffic jams. Jack is searching for information to use in updating his road network data to make his

HGV routing system more accurate. The data provided by John, Mike and Jeff are relevant for Jack. Which information, or combination, would be most valuable so that it can be recommended to Jack? Trust inference is generally a complex computational task. When trying to recommend information to Jack, it is important to understand that the geospatial information generated in a WBSN has certain peculiarities stemming from the spatio-temporal properties of both the actors and the geospatial information contents. Geospatial information is about space and inherently has a temporal dimension. WBSN users are actors that are also moving in space and time. Hence, trust in geospatial information cannot be judged only on the basis of direct user-to-user ratings, as in other models of social trust such as [10]. Moreover, research on social networks, on which we intend to study trust, has shown that space and time factors influence their structure [31]. We hypothesize that trust here is affected by the spatio-temporal context of both the actors and the information entities. Hence, when giving Jack recommendations, answering questions such as the following becomes important:
1. when did John, Mike and Jeff visit the area they are producing information about?
2. how many trips did each one make through this location, and at what times?
3. what is the relation between the locations and information provided by John, Mike and Jeff and other locations and information along the same road segment (e.g. John might prove most accurate with regard to other spatio-temporal constraints)?

Current research on trust in social networks ignores the premise that social network actors, who live and interact in space and time, create information about phenomena that are also spatio-temporal. A widely cited work on trust [22] accounts for the temporal dimension of trust in a model for interacting autonomous agents. The spatio-temporal properties of phenomena as well as agents, we argue, are important for calculating trust values that can be used to assess the quality of geospatial information in a collaborative content generation environment. A major task of our research is to develop a spatio-temporal trust model for web-based social networks. Such a model will benefit a wide variety of purposes in artificial intelligence research, including the search for models of social semantics. In the next section, we examine the social aspects of metadata and semantics necessary for collaborative GI environments, which complement the spatio-temporal trust model in establishing a robust collaborative GI environment.
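To make the questions above concrete, the following is a minimal sketch of how a spatio-temporal context might modulate a social trust rating. It is not the trust model this paper sets out to develop, which remains future work. All names (Contribution, social_trust, visit_ages_days, half_life_days), the exponential decay, the saturation formula and the sample values are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from math import exp, log
from typing import List


@dataclass
class Contribution:
    """One driver's report about a road segment (hypothetical structure)."""
    contributor: str
    social_trust: float           # assumed user-to-user trust rating in [0, 1]
    visit_ages_days: List[float]  # age in days of each trip through the segment


def spatio_temporal_score(c: Contribution, half_life_days: float = 7.0) -> float:
    """Scale social trust by a recency-weighted count of visits.

    Each trip contributes a weight that halves every `half_life_days`,
    so frequent and recent visits count more than a single old one.
    """
    decay = log(2) / half_life_days
    familiarity = sum(exp(-decay * age) for age in c.visit_ages_days)
    # Saturate familiarity into [0, 1) so trip counts cannot dominate trust.
    return c.social_trust * familiarity / (1.0 + familiarity)


def rank(contributions: List[Contribution]) -> List[Contribution]:
    """Order contributions from most to least valuable under this sketch."""
    return sorted(contributions, key=spatio_temporal_score, reverse=True)


# Illustrative values only: Mike is the most trusted socially, but his
# single visit is two months old; John passed the segment repeatedly
# during the last week.
john = Contribution("John", 0.6, [1, 2, 3, 8])
mike = Contribution("Mike", 0.9, [60])
jeff = Contribution("Jeff", 0.7, [5, 40])

ranking = [c.contributor for c in rank([john, mike, jeff])]
print(ranking)
```

With these assumed values the sketch ranks John first, then Jeff, then Mike: a high social trust rating alone does not outweigh stale knowledge of the location. The third question above, consistency with neighbouring information along the same road segment, is deliberately left out of this sketch.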

Social Semantics

The second problem in the vision of collaborative geospatial content generation applications is the lack of metadata and semantics to render generated information useful for a variety of purposes. However, the evolution of web users into content producers comes with its own answers to this problem, again in the form of a collaborative approach. Folksonomies are the currently fashionable response to metadata needs. They are community-generated collections of keywords, sharable and effective, yet easy to produce and use. We have earlier identified folksonomies as a collaboratively generated, open-ended labeling system that enables internet users to categorize content, such as web pages, online photographs, web links, blog entries, or just about any information they encounter or contribute to the web themselves. The process leading to folksonomies is often referred to as social tagging, where users label resources with tags. Two widely cited examples of websites using folksonomies are Flickr (http://www.flickr.com) and Del.icio.us (http://del.icio.us). Folksonomies can be looked at as community-based metadata. Stewart Butterfield, one of the creators of Flickr and a social tagging pioneer, argues that the difference in complexity between folksonomies and classification schemes is important: “I think the lack of hierarchy, synonym control, and semantic precision are precisely why it works. Free typing loose associations is just a lot easier than making a decision about the degree of match to a pre-defined category (especially hierarchical ones). It’s like 90% of the value of a proper taxonomy but 10 times simpler” [29]. While one might not agree that folksonomies provide 90% of the benefits of a proper taxonomy, the comparison could be misleading: it would be impossible to get the users of tagging systems to use complex controlled vocabularies or formal taxonomies.

Butterfield’s view reflects a commonly known contradiction between the scope and scale of commitments: web ontologies need to carry minimal commitment in order to be adopted for a wide scope [25]. This view has been applied to folksonomies to explain the reason behind their success [4]. Faced with both the need for ontologies and the success of folksonomies, what can we do to integrate the two? In [4], a novel approach to extracting lightweight ontologies by mining the del.icio.us folksonomy through a process of graph transformation (while assuming a folksonomy is a social network centered around commonality of interest) shows impressive results on the emergence of semantics. Mika [4] calls his model the Actor-Concept-Instance model of ontologies. This model is built on an implicit realization of emergent semantics, namely that meaning necessarily depends on a community of agents (the notion of information community, again). The resulting lightweight ontologies can be used in a variety of applications, including enhancing the structure and value of a folksonomy by allowing users to search for more abstract or specialized concepts within it. This work offers an exciting insight into the value of folksonomies as community-based tools for ontology-based knowledge management. In another effort to formalize folksonomies, the tripartite model of Mika is extended to include the source of the tag (the system where the tag originated), offering a Tagging (object, tag, tagger, source) construct in an ontology for tags dubbed TagOntology [26].

Folksonomies provide an interesting insight into information community semantics that is different from that of traditional realist semantics (the meaning is in the world), but also from cognitive semantics (the meaning is in the heads), by coping with the collective emergence of meaning from the user community. Folksonomies are open-ended and continuously change and evolve in ways that reflect the current dynamics of the communities using (and thereby producing) them. They are a source and product of social networks, where users who converge on a common set of tags exhibit common interests. In our road network scenario, users can tag the geospatial information they produce with tags that convey their personal understanding of the entities. Additional automated tagging can reflect the spatio-temporal context. The collected tags can then be used to build lightweight ontologies. This is a first step towards information community semantics, and our intention is to devise methods to extract more knowledge from folksonomies by employing social network analysis methods (such as centrality measures). Better (i.e.
more structured) folksonomies will lead to better knowledge extraction, involving and assisting the users in revealing more meaning throughout the tagging and querying processes. For example, sites like Tidepool [35] provide a very intuitive interface that assists in extracting more meaning from the users of a tagging system. Each tag the user enters can be labeled with four different categories:
• Who
• Where
• What
• When
The relevance to our geospatial aspects is obvious. A user who tags a photo with “Münster” can easily mark this with the “Where” category, or even let a gazetteer do the work for her. She might also add a tag saying “2006” to the same photo, to be easily put in the “When” category, just

like a “GIScience” tag for the “What” category. One can immediately see the potential of a more elaborate tagging system: it enables users to convey more meaning through their tags. This is an important aspect of research on how to build more effective folksonomies, particularly for geospatial applications. Consequently, there are key roles for folksonomies to play in our vision for the road network scenario. Social semantics as such is at an early stage of development. It has already become clear, however, that the semantic web technology of today rests on an illusory view of top-down semantic modeling through ontologies. In practice, ontologies are built bottom-up without folksonomies, resulting in a “worst of both worlds” solution: semantic models that do not connect well (because they lack an upper level) and do not reflect community meaning (because they are built outside the communities). We are not advocating a warm fuzzy feeling about semantics that abandons formalization. Quite the contrary: the modeling challenges posed by a socially aware view of semantics are likely to be even tougher than those faced in traditional ontologies. One could argue that the simpler case should be solved first, but it is becoming apparent that idealistic views of semantics need to be complemented with social approaches to become useful.
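The step from tags to a lightweight, network-based structure discussed in this section can be illustrated with a small sketch. It assumes tagging records of the form (actor, tag, instance), loosely following the tripartite view described above, links two tags whenever they were applied to the same instance, and then computes a simple degree centrality. The records, tag names and function names are hypothetical, and the folding rule is a simplification rather than the procedure of [4].

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tagging records: (actor, tag, instance).
taggings = [
    ("john", "roadworks", "segment_12"),
    ("john", "traffic_jam", "segment_12"),
    ("mike", "roadworks", "segment_12"),
    ("mike", "low_bridge", "segment_40"),
    ("jeff", "traffic_jam", "segment_12"),
    ("jeff", "roadworks", "segment_7"),
    ("jeff", "detour", "segment_7"),
]


def tag_cooccurrence(records):
    """Fold the records into a weighted tag-tag network.

    Two tags are linked once for every instance on which both occur;
    the weight counts the number of such shared instances.
    """
    tags_by_instance = defaultdict(set)
    for _actor, tag, instance in records:
        tags_by_instance[instance].add(tag)
    weights = defaultdict(int)
    for tags in tags_by_instance.values():
        for a, b in combinations(sorted(tags), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(weights):
    """Share of the other linked tags that each tag is connected to."""
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {}
    return {tag: len(nb) / (n - 1) for tag, nb in neighbours.items()}


weights = tag_cooccurrence(taggings)
centrality = degree_centrality(weights)
print(weights)
print(centrality)
```

In this toy data set "roadworks" co-occurs with both "traffic_jam" and "detour" and therefore receives the highest centrality, which hints at it being the more general concept. Tags that never share an instance with another tag, such as "low_bridge" here, do not appear in the network at all, which is one reason why richer, more structured tagging is desirable.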

Conclusions

Our vision of information communities is that of virtual online communities communicating through network structures that resemble real-world social networks. Within those online communities, linkages between individuals are explicit and establish social networks based on common interest and, most importantly, on trust. A major characteristic of this vision is that it does not address the institutional aspects of community, sociology or trust, but the technical aspects of building virtual communities of trust. In this paper, we propose a use case of maintaining and updating road network data for HGVs via a collaborative process. We propose a research approach to the relevant technologies, from web-based social trust to folksonomies and social semantics, that we believe are essential to ensure the success of collaboratively generated geospatial information. Firstly, we propose an alternative insight into online collaborative communities. In our view, online collaborative communities are to be developed as social networks with explicit linkages between the nodes (users), which are then studied as social network actors. Social network analysis is

proposed to identify key individuals on the network and define users with varying degrees of influence. This allows for the introduction of the notion of social trust into the network using varying technologies. Secondly, we hypothesize that social trust in WBSNs has a spatio-temporal dimension. By identifying spatio-temporal trusting patterns within the communities, one can filter the collaboratively generated geospatial information and provide the governance that is currently lacking in online environments. This task requires the development of spatio-temporal trust models that resemble the trusting behavior of real-world communities. Quality of GI (as a subjective measure) can then be deduced by spatio-temporal trust filtering of the collaboration network. Thirdly, we address the aspects of collaborative GI metadata and semantics and illustrate some efforts to analyze folksonomies as a means of collecting metadata (to study the emergence of meaning from communities), as well as attempts to transform the seemingly chaotic folksonomies into more structured constructs. This in turn takes folksonomies a step further, from being community metadata to a means of establishing semantics. Our research proposes to treat the notion of social semantics as an important aspect of semantics research. It is important for our research to develop formal methods of integrating community meaning into geospatial applications as well as spatio-temporal trust models.

Acknowledgements

This work is funded by the European Commission under the SWING project (FP6-26514). In addition, the authors would like to thank the three reviewers for their insightful comments.

References:
1. Barabási, A.L., Crandall, R.E.: Linked: The New Science of Networks. Vol. 71. AAPT (2003) 409
2. Watts, D.J.: Six Degrees: The Science of a Connected Age. W. W. Norton & Company (2004)
3. Benkler, Y.: The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press (2006)
4. Mika, P.: Ontologies are us: A unified model of social networks and semantics. International Semantic Web Conference ISWC (2004)
5. Aberer, K., Cudre-Mauroux, P., Hauswirth, M.: The Chatty Web: Emergent Semantics Through Gossiping.

6. Santini, S., Gupta, A., Jain, R.: Emergent semantics through interaction in image databases. Vol. 13 (2001) 337-351
7. Cook, K.S.: Trust in society. Russell Sage Foundation New York (2001)
8. Fukuyama, F.: Trust: The Social Virtues and the Creation of Prosperity. Vol. 457 (1996)
9. Uslaner, E.M.: The Moral Foundations of Trust. Cambridge University Press (2002)
10. Golbeck, J.A.: Computing and Applying Trust in Web-based Social Networks. Department of Computing, Vol. PhD. University of Maryland (College Park), Maryland (2005)
11. Richardson, M., Agrawal, R., Domingos, P.: Trust management for the semantic web. Second International Semantic Web Conference, Sanibel Island, Florida (2003) 351–368
12. Ziegler, C.N., Lausen, G.: Spreading Activation Models for Trust Propagation. The IEEE International Conference on e-Technology, e-Commerce, and e-Service, Taipei, Taiwan (2004)
13. Milgram, S.: The small world problem. Psychology Today 2 (1967) 60-67
14. Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Vol. 393 (1998) 409-410
15. White, D.R.: Network Analysis and Social Dynamics. Vol. 35. Taylor & Francis (2004) 173-192
16. Granovetter, M.: Threshold Models of Collective Behavior. American Journal of Sociology 83 (1978) 1420
17. Crane, J.: The Epidemic Theory of Ghettos and Neighborhood Effects on Dropping Out and Teenage Childbearing. American Journal of Sociology 96 (1991) 1226
18. Kent, S., Atkinson, R.: Security Architecture for the Internet Protocol. RFC 2401, November 1998 (1998)
19. Ansper, A., Buldas, A., Roos, M., Willemson, J.: Efficient long-term validation of digital signatures. Springer (2001) 402–415
20. McCabe, K.A., Rigdon, M., Smith, V.L.: Positive reciprocity and intentions in trust games. Vol. 52 (2003) 267-275
21. Abdul-Rahman, A., Hailes, S.: A distributed trust model. ACM Press New York, NY, USA (1998) 48-60
22. Marsh, S.: Formalising Trust as a Computational Concept. Department of Mathematics and Computer Science, Vol. PhD. University of Stirling, Stirling (1994)
23. McGuinness, D.L., da Silva, P.P., Chang, C.: IWBase: Provenance Metadata Infrastructure for Explaining and Trusting Answers from the Web.
24. Zaihrayeu, I., da Silva, P.P., McGuinness, D.L.: IWTrust: Improving User Trust in Answers from the Web. Springer (2005)
25. van Elst, L., Abecker, A.: Ontologies for information management: balancing formality, stability, and sharing scope. Vol. 23. Elsevier (2002) 357-366
26. Gruber, T.: Ontology of Folksonomy: A Mash-up of Apples and Oranges. First on-Line conference on Metadata and Semantics Research (MTSR'05) (2005)

27. McHenry, R.: The Faith Based Encyclopedia. http://www.techcentralstation.com/111504A.html visited on 19/12/06
28. Sanger, L.: Why Wikipedia Must Jettison Its Anti-Elitism. http://www.kuro5hin.org/story/2004/12/30/142458/25
29. Butterfield, S.: “Sylloge.” August 4, 2004. http://www.sylloge.com/personal/2004/08/folksonomy-social-classificationgreat.html visited on 22/12/06
30. Shirky, C.: LazyWeb and RSS: Given Enough Eyeballs, Are Features Shallow Too. O’Reilly Network, January 7, 2003. http://www.oreillynet.com/pub/a/p2p/2003/01/07/lazyweb.html visited on 12/12/06
31. Metcalf, S., Paich, M.: Spatial Dynamics of Social Network Evolution. 23rd International Conference of the System Dynamics Society Vol. 51 61801
32. Bishr, Y.A., Pundt, H., Kuhn, W., Radwan, M.: Probing the concept of information communities-a first step toward semantic interoperability. In: Goodchild, M.F. (ed.): Interoperating Geographic Information Systems. Springer (1999) 55-70
33. http://www.pbs.org/mediashift/2006/11/digging_deeperyour_guide_to_wi.html visited on 15/12/06
34. http://www.ertico.com/en/activities/safety/heavyroute.htm visited on 8/12/06
35. http://storymill.net/tidepool/ visited on 13/12/06
