
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2014.2385097, IEEE Transactions on Mobile Computing


FlierMeet: A Mobile Crowdsensing System for Cross-Space Public Information Reposting, Tagging, and Sharing

Bin Guo, Senior Member, IEEE, Huihui Chen, Zhiwen Yu, Senior Member, IEEE, Xing Xie, Senior Member, IEEE, and Shenlong Huangfu

Abstract—Community bulletin boards serve an important function for public information sharing in modern society. Posted fliers advertise services, events, and other announcements. However, fliers posted offline suffer from problems such as limited spatial-temporal coverage and inefficient search support. In recent years, with the development of sensor-enhanced mobile devices, mobile crowd sensing (MCS) has been used in a variety of application areas. This paper presents FlierMeet, a crowd-powered sensing system for cross-space public information reposting, tagging, and sharing. The tags learned are useful for flier sharing and for preferred information retrieval and suggestion. Specifically, we utilize various contexts (e.g., spatio-temporal info, flier publishing/reposting behaviors, etc.) and textual features to group similar reposts and classify them into categories. We further identify a novel set of crowd-object interaction hints to predict the semantic tags of reposts. To evaluate our system, 38 participants were recruited and 2,035 reposts were captured during an eight-week period. Experiments on this dataset showed that our approach to flier grouping is effective and the proposed features are useful for flier category/semantic tagging.

Index Terms—Participatory sensing, cross-space reposting, data grouping and selection, interaction-based semantic tagging, urban sensing.

Manuscript received xx; accepted xx. This work was partially supported by the National Basic Research Program of China 973 (No. 2012CB316400), the National Natural Science Foundation of China (No. 61332005, 61373119, 61222209), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2012JQ8028). B. Guo, H. Chen, Z. Yu and S. Huangfu are with the School of Computer Science, Northwestern Polytechnical University, Xi’an, China (e-mail: [email protected]). X. Xie is with Microsoft Research Asia, Beijing, China (e-mail: [email protected]).

I. INTRODUCTION

Bulletin boards serve an important communication function within communities [1-3]. They are usually placed in public settings [4], taking advantage of the movement of people through social spaces (on the streets, near sports centres, on college campuses, at workplaces/cafes), as shown in Fig. 1. Specifically, the paper fliers posted on bulletin boards usually provide a means for people to seek and advertise local businesses (e.g., sales, driving lessons, recruitments), events (e.g., local art shows, gatherings), and services (e.g., bicycle wanted, lost object sought) [5]. They are widely accepted and used by individual users, institutions, and third-party advisers because of numerous advantages: ease of use, flexibility, low cost in deployment and publishing, durability in public/outdoor environments, and so on. Therefore, they are particularly useful for small businesses and average users. Numerous empirical studies on the use of bulletin boards have demonstrated their significance for community information sharing, socializing, viewpoint advertising, and marketing [2, 6-9]. In short, the use of bulletin boards has become part of the fabric of the social space [2], and it offers an informal, nonintrusive, and inexpensive medium for mass communication.

Fig. 1. Community boards: on the street, in the workplace.

Though the bulletin board has proved useful and significant in our daily lives, it suffers from several drawbacks. First, it has limited spatial-temporal coverage. For example, fliers on bulletin boards might quickly be covered by new fliers and are mainly visible to passers-by [1]. Second, fliers on a board are often cluttered and lack order, making it laborious for people to find the information they need. Therefore, there would be benefits in augmenting paper-based boards with digital techniques to facilitate information sharing and retrieval. Several techniques have been devoted to addressing this. For example, digital displays transform paper fliers into digital content, but their deployment and publishing costs are high, creating barriers for average content providers. Though barcodes or RFID tags connect paper fliers with the Internet [10-12], the link is often directed to proprietary websites, making it difficult to have an open and universal platform for public information sharing. Because these media have complementary features and merits, we envision a co-existence of varied public information exchange media (paper fliers, digital contents, barcode-tagged fliers, etc.) in the near future; an effective and practical approach is therefore to build an overlay above them for public information gathering and sharing. The consensus is that we should bridge the gap between these physical-space objects and their cyber-space counterparts to facilitate public information sharing, i.e., enabling cross-space content transferring and sharing.

1536-1233 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

With the recent surge of sensor-rich (e.g., accelerometer, GPS, camera, etc.) mobile phones and the prevalence of GPS-equipped cars, taxis, and buses, mobile crowd sensing (MCS) [13, 14] has become an emerging paradigm for large-scale, real-world sensing and information gathering. It has demonstrated its usefulness in a variety of application areas [15-22]. However, no existing method focuses on distributed public flier information collection and cross-space sharing.

In this paper, we propose FlierMeet, a system that enables cross-space transferring, intelligent tagging (to facilitate public information retrieval), and pervasive sharing of distributed fliers. We leverage crowdsourcing (e.g., taking a picture of the flier) to ‘repost’ fliers from the physical space to the cyber space. Context-sensitive approaches are proposed to group the distributed crowdsourced reposts and evaluate their quality for data selection. The clustered repost groups are further processed to predict their category tags (e.g., is the flier a recruitment or a sale activity) and semantic tags (e.g., is it widely noted, does it meet my preference) based on features extracted from crowd-flier interaction. The system can be applied to a variety of application areas, such as public information collection and sharing, targeted advertising, mobile socializing, and so on. Specifically, our work makes the following research contributions:

• Develops a mobile platform for participatory public information reposting, intelligent tagging, and sharing.

• Proposes context-sensitive approaches for repost grouping and high-quality repost selection, using a set of contexts such as spatio-temporal constraints, flier publishing behaviours, contexts associated with reposting behaviour (e.g., light value, motion blur, heading angle), and so on.

• Introduces a model that characterizes crowd-flier interaction and a novel set of crowdsensing-specific features that are extracted from crowd-flier interaction patterns to characterize the reposted fliers.

• Provides a combination of learning and inference models to predict varied category and semantic tags using the aforementioned features.

We evaluated FlierMeet with an eight-week, 38-person deployment using commercially available smartphones. Our findings demonstrate that FlierMeet is effective and that the proposed features are useful for flier category/semantic tagging.

II. RELATED WORK

Three research areas are closely related to our work: mobile crowd sensing, community bulletin boards, and heterogeneous crowdsourced data mining.

A. Mobile crowd sensing

Mobile crowd sensing (MCS) is a new sensing paradigm that empowers ordinary people to contribute data gathered or generated from their mobile devices. It further aggregates heterogeneous crowdsourced data in the cloud to extract hidden intelligence (e.g., from ambient audio/visual signal to location semantics [14]). From the AI perspective, MCS is founded on a distributed problem-solving model where crowds are engaged in complex problem-solving procedures through open calls [23]. There have been numerous MCS-powered applications, such as environment monitoring [19, 21], traffic planning [18, 22], social context sensing [16, 20], public safety [17], social event replay [15], etc. In this paper we focus on a novel application area of MCS, which aims to leverage crowd power to repost, analyze, and share distributed public information in urban areas.

B. Augmented bulletin boards and fliers

With the development of information and communication techniques, several approaches have been proposed to address the shortcomings of board-based fliers. The first proposal calls for replacing physical bulletin boards with digital displays, which have been used in many places (e.g., high-traffic streets, building entrances, workplaces) for information publishing. For example, Houde et al. [24] projected a digital newsletter created by members of a research group into a common gathering space. Alt et al. [6, 7] developed a networked digital display system called Digifieds and deployed it in 12 public places in Oulu, Finland. They presented their insights regarding the design and interaction techniques of digital displays after a six-month study [6]. Public displays were also found useful for promoting social ties and enhancing work performance in small groups or organizations [25, 26]. However, digital displays are often used for business purposes or for interaction within organizations. This reduces their openness to average flier-posting activities, and their deployment/maintenance cost is considerably high.

Another widely studied method is the use of barcodes or RFID tags on fliers [10-12]. By reading the associated tag, people can access a website and obtain more information about the flier. Though this method bridges the gap between the physical space and the virtual space, the information sharing problem still remains.
The cross-space link is often directed to proprietary websites (owned by different stakeholders or content providers), making it difficult to have an open and universal platform for public information sharing.

Unlike the above studies, FlierMeet presents a novel way to enhance public information dissemination and sharing. It is a cross-space reposting platform that leverages crowdsourcing and human mobility to move paper flier information into the cyber-physical computing system. Alt et al. [6] conducted an empirical study of varied paper-based notice areas, and attempted to propose a single large network of public displays to connect isolated display solutions. However, it is designed at a conceptual level, and the vision of enforcing the same policies (the same design and structure) across all displays is hard to implement.

C. Crowdsourced data mining

The extraction of high-level context or semantic information from raw user-contributed data is crucial to MCS. Cranshaw et al. [27] and Krumm et al. [28] introduced a set of location-based features to analyze the social context of geographic regions (e.g., home, office, etc.), which were extracted from crowd visiting patterns to the regions. CrowdSense@Place [16] exploited opportunistically captured image and audio clips from smartphones and proposed a set of features to assign category tags (e.g., store, restaurant) to places. MoVi [15] was an application in which smartphones collaboratively sense their environment and recognize socially “interesting” events. While existing studies mainly focus on mining contexts pertaining to places and events, we aim to identify the contexts of the reposted fliers (e.g., flier category and semantics) by leveraging both visual/textual features and crowd-flier interaction patterns. That is to say, crowd-object interaction patterns are explored to distil object contexts.

III. THE FLIERMEET SYSTEM OVERVIEW

FlierMeet is the first system that supports cross-space public information reposting, tagging, and sharing. Before getting into the technical details, we first define the data model and then present the system architecture.

A. The System Model

A cross-space reposting system consists of physical elements (e.g., fliers, humans, boards), cyber elements (e.g., reposted counterparts), as well as their interactions. The proposed data model is given in Fig. 2.

Fig. 2. The FlierMeet data model.
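As a rough illustration of the flier-group part of this data model, the following Python sketch represents reposts and flier groups and computes the three group features named in this section: the group size |fg|, the number of source boards B(fg), and the reposter set U(fg). All class, field, and method names are hypothetical and chosen only for illustration; the paper does not specify an implementation. Holding a single category tag per group is likewise an assumption, while the semantic tags follow the multi-tag approach.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Set

# Tag vocabularies taken from the pilot study described in this section.
CATEGORY_TAGS = {"ads", "academic events", "notices", "recruitments"}
SEMANTIC_TAGS = {"popular", "hot", "professional", "social", "surprise"}


@dataclass(frozen=True)
class Repost:
    """One crowdsourced repost of a flier (hypothetical field names)."""
    reposter_id: str
    board_id: str
    repost_time: datetime


@dataclass
class FlierGroup:
    """All reposts judged to belong to the same flier."""
    reposts: List[Repost] = field(default_factory=list)
    category_tag: Optional[str] = None  # assumption: one category per group
    semantic_tags: Set[str] = field(default_factory=set)  # multi-tag

    def size(self) -> int:
        """|fg|: the number of reposts in the group."""
        return len(self.reposts)

    def num_source_boards(self) -> int:
        """B(fg): how many distinct boards the reposts come from."""
        return len({r.board_id for r in self.reposts})

    def reposters(self) -> Set[str]:
        """U(fg): the set of users who reposted this flier."""
        return {r.reposter_id for r in self.reposts}

    def add_semantic_tag(self, tag: str) -> None:
        """Attach a semantic tag; a group may carry several at once."""
        if tag not in SEMANTIC_TAGS:
            raise ValueError(f"unknown semantic tag: {tag!r}")
        self.semantic_tags.add(tag)
```

Under these assumptions, a group containing three reposts made by two users at two boards has size 3, two source boards, and two reposters, and it can carry both the ‘social’ and ‘professional’ tags at the same time.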

1) Person

Person denotes a user who reposts (called a reposter in this work) or views public information through FlierMeet. Each reposter has a set of personal or social features, such as age, gender, affiliation, education background, interests, social ties, etc.

2) Board

A board refers to a place (e.g., a traditional bulletin board, a digital display) where fliers can be posted. Usually, the flier content varies depending on its location [29]. For example, fliers found on a street board generally span varied categories, while fliers seen around a department building in a university are often academic in nature. We call this the social preference of a board. In other words, while some boards attract a wide range of flier categories, others are category-specific.

3) Flier

A flier can have many copies and can be posted on one or more boards. We define the reposts of the same flier as a flier group, which has several features: the group size (i.e., the number of reposts), the number of source boards (i.e., how many boards the group members come from), and the group of users who repost this flier. Each repost has its repost time, and each flier group has its category and semantic tags. For a flier group fg, we use |fg| to denote the size of fg, B(fg) to denote the number of source boards, and U(fg) to denote the group of reposters for fg.

4) Tags

Considering the mass demand for public information viewing, we characterize fliers at different levels using two types of tags: category tags and semantic tags. In the pilot study, we considered the following category tags: (1) ads (e.g., special deals, items on sale), (2) academic events (e.g., lectures, talks, workshops), (3) notices (e.g., notices about a coming event), and (4) recruitments (e.g., job hunting, part-time jobs). In addition to category tags, in information systems we often characterize items by their semantics [30-32], e.g., a popular Twitter post or a surprise item to a user. According to the characteristics of FlierMeet, we define the following semantic tags.

a) Popular. A ‘popular’ flier refers to a flier whose content will be enjoyed by a variety of people (of different age, occupation, and so on). In distribution, a popular flier is often posted in several different places.

b) Hot. A ‘hot’ flier is defined as a specialized type of popular flier: (1) it has a wide audience; (2) its popularity usually reaches a peak value within a short time period. This sudden explosion distinguishes hot fliers from average popular fliers.

c) Professional. Compared to a ‘popular’ flier, a professional flier shows its influence on a specific community of users (i.e., it is not widely accepted by the public) who share some commonalities (e.g., major, interests, needs, behavior similarities). This is very common in our daily life.
For example, a seminar flier is often interesting to people with an academic background, while a concert event often attracts music lovers. In this paper, we define this kind of flier as ‘professional’.

d) Social. People from existing groups usually show high similarity, which motivates us to characterize a flier at the social structure level. For instance, we can recommend a ‘social’ flier to an existing online group if a proportion of members from this group repost the information (e.g., suggesting sale information to the detected group in Facebook). Groups can be extracted using community detection techniques [33, 34]. Unlike the ‘professional’ tag, which largely depends on the content of a flier and is suggested to a dynamic group of users (e.g., job hunters, music lovers, people with similar GPS trajectories [35]), the ‘social’ tag of a flier is determined by the link density of its reposters and is recommended to the members of existing groups.

e) Surprise. Most recommendation methods focus on precision in terms of proximity to user preferences, which may, however, narrow users’ horizons. Recently, surprise recommendation, which aims to broaden users’ horizons, has become a new research topic [36, 30]. There can be different definitions of a ‘surprising item.’ We characterize it from the following aspects. (1) ‘Surprise’ is a user-specific concept [36], and the surprising items can differ from user to user. (2) We believe that the type of fliers that a person commonly sees is not surprising or novel to that person. So fliers that do not often appear in a user’s daily life (e.g., fliers from places the user does not often visit) have more potential to be a surprise to the user.

Most social tagging systems adopt a multi-tag approach [37, 38], which means that multiple tags can be given to a single object. In this paper, considering that some semantic tags may overlap, we use the multi-tag approach as well. For instance, a person from an existing online social network group may also belong to a dynamic group formed based on behavior similarity, and thus a flier can have both ‘social’ and ‘professional’ tags.

B. The System Architecture

The system architecture of FlierMeet is shown in Fig. 3. We briefly describe its major components in the following.

crowd-flier interaction.

5) User interface

The user interface displays the extracted information to users in a multi-view manner, e.g., browsing by categories or semantic tags.

IV. REPOSTING, GROUPING, AND SELECTION

Crowd-powered cross-space reposting is a novel method of public information collection. In this section we first describe the reposting infrastructure and then present the flier grouping and selection methods.

A. Bulletin Board Detection

It is hard to gain prior knowledge of the deployment of bulletin boards in a city. We propose a crowdsourced clustering method for board detection that relies on a dynamic discovery process, as described below. In FlierMeet, each repost is associated with a GPS point, which is captured during reposting at the mobile client side. Assume that at a certain time t there are n reposts in the system with associated GPS points P = {p_1, p_2, …, p_n}, and that the detected board set is B = {b_1, b_2, …, b_m}. When a new repost from GPS point p_{n+1} arrives:

• If the distance between p_{n+1} and a board b_j (0