A Framework for Data Quality and Feedback in ... - CiteSeerX

22 downloads 0 Views 1MB Size Report
Nov 6, 2007 - A Framework for Data Quality and Feedback in Participatory Sensing. Sasank Reddy*, Jeff Burke*†, Deborah Estrin*, Mark Hansen*, Mani ...
Poster Abstract: A Framework for Data Quality A Framework for Data Quality and Feedback in Participatory Sensing and Feedback in Participatory Sensing

Merging models and sensing Sasank Reddy*, Jeff Burke* †, Deborah Estrin*, Mark Hansen*, Mani Srivastava* Merging models and sensing Center Embedded Networked Sensing, Sasank Reddy*, Jeff* Burke* †for , Deborah Estrin*, Mark Hansen*, Mani Srivastava* Merging models and sensing

Personalized Environmental Impact Report (PEIR) Center for Embedded Networked Sensing, †CenterMerging for*Merging Research in Engineering, Media and Performance Merging models and sensing models and sensing models and sensing Merging models and sensing Personalized Environmental Impact Report (PEIR) †Center forUniversity Research in Media and Performance ofEngineering, California, Los Angeles Personalized Environmental Impact Report (PEIR) University of California, LosImpact Angeles Personalized Environmental Impact Report (PEIR) Personalized Environmental Report (PEIR) Personalized Environmental Impact Report (PEIR) Personalized Environmental Impact Report (PEIR) • “Footprint” calculators long-term choices using coarse-grained models impact {inform sasank, jburke, choices mbs }@ucla.edu, • “Footprint” calculators inform long-term using coarse-grained models impact

{sasank, jburke,choices mbs}@ucla.edu, •• “Footprint” calculators inform long-term using coarse-grained models impact [email protected], [email protected] , impact Personalized, real-time assessment to help individuals reduce and minimize exposure [email protected], [email protected] ,impact • Personalized, real-time assessment to help individuals reduce and minimize exposure “Footprint” calculators inform long-term choices using coarse-grained models impact •••“Footprint” calculators inform long-term choices using coarse-grained models impact • “Footprint” calculators inform long-term choices using coarse-grained models impact • “Footprint” calculators inform long-term choices using coarse-grained models impact Personalized, real-time assessment to help individuals reduce impact and minimize exposure by viewing their own practices and habits seen in data inferred from models viewing their own practices and habits as as seen in data andand inferred from models real-time assessment individuals reduce impact and minimize exposure •••by Personalized, real-time assessment totohelp reduce impact and minimize exposure • Personalized, real-time assessment to help individuals reduce impact and minimize exposure •Personalized, Personalized, real-time assessment tohelp help individuals reduce impact and minimize exposure by viewing their own practices and habits as individuals seen in data and inferred from models Categories and Subject Descriptors 2 System Architecture Employ built-in capabilities of mobile handsets to scale without specialized hardware • Employ built-in capabilities of mobile handsets to scale without specialized hardware 2 System Architecture Categories and Subject Descriptors byby viewing their own practices and habits seen and inferred from models by viewing their own practices and habits asasseen inindata and inferred from models viewing their own practices and habits as seen indata data and inferred from models by ••viewing their own practices and habits as seen intraces data and inferred from models Employ built-in capabilities of mobile handsets to scale without specialized hardware C.2.4 [Computer Communication Networks]: DisOur system consists ofthree three components: applicaLeverage model-based analyses with location generated using GPS, cell tower, Our system consists ofusing components: an an application C.2.4 [Computer Communication Networks]: Dis•••Leverage model-based analyses with location traces generated GPS, cell tower, WiFiWiFi Employ built-in capabilities of mobile handsets to scale without specialized hardware Employ built-in capabilities of mobile handsets to scale without specialized hardware Leverage model-based analyses with location traces generated using GPS, cell tower, • Employ built-in capabilities of mobile handsets to scale without specialized hardware •tributed Employ built-in capabilities of mobile handsets to scale without specialized hardware tributed Systems tion runs that runs the phone to collect dataWiFi (Camthat on theonphone to collect sensorsensor data (Campaignr), Systems a sensor store (SensorBase) tocell collect/aggregate •• Leverage model-based analyses location traces generated using GPS, tower, WiFi model-based analyses with location traces generated using GPS, cell tower, WiFi apaignr), sensor store (SensorBase) to collect/aggregate data, and aWiFi •Leverage Leverage model-based analyses with location traces generated using GPS, cell tower, • Leverage model-based analyses withwith location traces generated using GPS, cell tower, WiFi General Terms data, and a set of services that interact with the phone set of services that interact with the phone and the store to General Terms Design,System Experimentation, componentsHuman Factors System components

Trace, audio, image (Fig. SensorBase Nokia n80, n95 Campaignr Location, Audio, Image SensorBase enable value added services (Fig. 1). N80, N95 Campaignr Trace, audio, image SensorBase and the store toCampaignr enable value added services 1). Nokia n80, n95 Campaignr Location, Audio, Image SensorBase N80, N95 Campaignr Trace, audio, image SensorBase Nokia n80, n95 Campaignr Location, Audio, Image SensorBase N80, N95

componentsHuman Factors Design,System Experimentation,

Keywords GPS-equipped mobile handset. GPS-equipped mobile handset. GPS-equipped mobile handset. System components System components System components System components

MobileCustom Systems, Human SensorforNetworks Keywords handset software automatic location Custom handset software for Custom handset software forautomatic automatic location location GPS-equipped mobile handset.

1

GPS-equipped handset. GPS-equipped mobile handset. GPS-equipped mobilemobile handset.

Campaignr Trace, audio, image SensorBase Campaignr Trace, audio, image SensorBase Nokia n80, n95 Campaignr Location, Audio, Image SensorBase Campaignr Trace, audio, imageSensorBase SensorBase Nokia n80, n95 Campaignr Location, Audio, Image SensorBase Campaignr Trace, audio, image Nokia n80, n95 Campaignr Location, Audio, Image SensorBase N80, N95 Nokia n80, n95 Campaignr Location, Audio, Image SensorBase N80, N95 N80, N95 Nokia n80, n95 Campaignr Location, Audio, Image SensorBase N80, N95

time-series collection, robust upload, over-the-air Introduction time-series robust upload, over-the-air Mobile Systems,collection, Human Sensor Networks time-series collection, robust upload, over-the-air

++ +

== =

upgrade/tasking, just-in-time annotation with Custom handset for automatic location Custom handset software forannotation automatic location Custom handset software for automatic location upgrade/tasking, just-in-time with Custom handset software for automatic location + + +++ = === = upgrade/tasking, just-in-time annotation with The rapid adoption ofsoftware mobile phones by society over time-series collection, robust upload, over-the-air voice or text. time-series collection, robust upload, over-the-air time-series collection, robust upload, over-the-air voice or text. time-series collection, robust upload, over-the-air voice or and text.the increasing ability to capture, 1 last Introduction the decade upgrade/tasking, just-in-time annotation with upgrade/tasking, just-in-time annotation with upgrade/tasking, just-in-time annotation with upgrade/tasking, just-in-time annotation with Server side tools to analyze individual spatioclassifying, and transmit aanalyze wide variety of data (imServer side tools individual spatioThe rapid adoption oftomobile phones by society over the Server side tools toanalyze individual spatiovoice or text. or text. voice or text. voicetemporal orvoice text. temporal patterns and calculate corresponding patterns and calculate corresponding temporal patterns and calculate corresponding last decade and increasing ability to capture, classifying, age, audio, andthe location) have enabled a new sensing Server side tools totoof analyze individual spatioimpact and exposure metrics to inform and impact and exposure metrics to inform and Server side tools analyze individual spatioside tools to analyze individual spatioimpact and exposure metrics to inform and andServer transmit a tools wide variety data (image, audio, and locaside to analyze individual spatioparadigm - Server where humans carrying mobile phones can temporal patterns and calculate corresponding advise users. advise users. temporal patterns and calculate corresponding advise users. temporal patterns and calculate corresponding temporal patterns and calculate corresponding tion)ashave enabled a new Human-in-the-loop sensing paradigm - where humans act sensor systems. sensor sysimpact and exposure metrics totoinform and impact and exposure metrics to inform and impact and exposure metrics inform and impact and exposure metrics to and carrying mobile phones can act as sensor systems. HumanWeb-based interfaces informing and advising tems raise many new challenges ininform areas of sensor data Web-based interfaces informing and advising Web-based interfaces informing and advising Campaign Management Inference / Learning advise users. Campaign Management Inference / Learning Analysis / Classification advise users. advise users. Analysis and Classification advise users. in-the-loop sensor systems raise many new challenges in ar- Impact users, which provide reports, real-time feedback, Activity type inference //Exposure Server-side users, which provide reports, real-time feedback, Activity type inference users, which provide reports, real-time feedback, Activity type inference Impact Exposure Server-side quality assessment, mobility and sampling coordination, Impact / Exposure Server-side Model classifier Figure 1. Components for Participatory Sensing. eas of sensor data quality assessment, mobility and sampling visualizations and exploratory data analysis tools Model classifier Web-based interfaces informing and advising Model classifier Figure 1. Components for Participatory Sensing. visualizations and exploratory data analysis tools Web-based interfaces informing and advising visualizations and exploratory data analysis tools Web-based interfaces informing and advising interfaces informing and advising andWeb-based user interaction procedures. for non-professional users. (For handsets and coordination, and user interaction procedures. users, which provide reports, real-time feedback, Activity type inference for non-professional users. (For handsets and / Exposure Server-side users, which provide reports, feedback, Activity type inference for non-professional users. (For handsets and users, which provide reports, real-time feedback, Activity type inference Impact / ExposureActivity Server-side / Exposure Server-side users, which provide reports, real-time feedback, type inference /Impact Exposure Server-side The idea of using mobile phones asreal-time sensing instru-ImpactImpact workstations.) Model classifier visualizations and exploratory data analysis tools Model classifier Model classifier workstations.) The idea of using mobile phones asanalysis sensing instruments visualizations and exploratory data analysis tools workstations.) visualizations and exploratory data analysis tools Model classifier visualizations and exploratory data tools 2.1 Campaignr ments isfor nothing new. Inusers. fact, (For theyhandsets have been used 2.1 Campaignr non-professional and for non-professional users. (For handsets and is nothing new. In fact, they have been used successfully in for non-professional users. (For handsets and for non-professional users. (For handsets and In enable participatory sensing from the phone, successfully in applications personal sensing applications forhealth soIn order ordertoto enable participatory sensing from the workstations.) workstations.) workstations.) personal sensing for social dynamics and workstations.) we created a software application that facilitates in situ data cial dynamics and health monitoring [1, 2]. More rephone, we created a software application that facilitates monitoring [1, 2]. More recently there has been growing collection. This application, which we refer to as Campaignr, cently thereinhas been growing excitement in using moin designed situ datatocollection. This data application, we reexcitement using mobile phones to analyze Burke, Estrin, Hansen, etthe al physical is support various collectionwhich initiatives or Burke, Estrin, Hansen, et al bile phones to analyze physical world. [3]etsuggests Burke, Estrin, Hansen, al fer to as Campaignr, is designed to support various data world. [3] suggests thatthe mobile phones can enable partici“campaigns” [3]. It is easily configurable (through an XML that mobile phones can enable participatorycomponents research. patory research. [4, 5] focus on infrastructure collection initiatives or “campaigns” [3].sensor It is easily conconfiguration file), robust to handle various modalities Estrin, Hansen, etBut Burke, Estrin, Hansen, etal al enBurke, Hansen, etal Estrin, Hansen, et al [4, 5] focus on Burke, infrastructure components needed to needed to Burke, enable this newEstrin, sensing paradigm. little atfigurable (through an XML configuration file), robust to and network connections, and contains a intuitive interface. tention been paid toparadigm. how havingBut humans able thishas new sensing little operate, attentioncarry, has handle various sensor modalities andimage, network Campaigns can be created to use audio, and conneclocation and interact sensors can affect participation, been paid towith howthese having humans operate, carry, anddata inmodalities continuous or triggered sampling. tions, and with contains a intuitive interface. Campaigns can quality, and spatio-temporal coverage. In turn, our research teract with these sensors can affect participation, data be created to use audio, image, and location modalities 2.2 SensorBase builds onand the spatio-temporal ideas of [3] and focuses on methods model quality, coverage. In turn,toour rewith continuous or triggered sampling. The recorded sensor readings from Campaignr are upthe sensing behavior individuals, access their spatial and search builds on theofideas of [3] and focuses on methloaded to a data store (SensorBase). SensorBase is a sen2.2 SensorBase temporal availability, and determine appropriate feedback ods to model the sensing behavior of individuals, access sor data archive service, based on MySQL and PHP. Users The recorded sensor readings from Campaignr are mechanisms to help in the data collection process. The retheir spatial and temporal availability, and determine apcan interact with SensorBase to publish, share, and manage maining sections of the paper are arranged as follows: a.) uploaded to a data store (SensorBase). SensorBase is a sensor data either through its web GUI front end or by using propriate to architecture help in thecreated data colprovide anfeedback overview mechanisms of our software for sensor data archive service, based on MySQL and PHP. its HTTP Post interface. Furthermore, SensorBase provides lection process. The remaining sections of the paper are participatory sensing, b.) describe models for quality of samcan interact SensorBase to publish, share, aUsers data sharing model with that enables fine-grained data managearranged follows: a.) mobility provide and an overview pling and as ideas regarding feedback, of andour c.) and manage sensor data either through its web GUI front ment so that sensor data can be audited before being released software architecture created for participatory sensing, present basic conclusions from preliminary pilot studies. end or by forum. using its HTTP Post interface. Furthermore, to a public b.) describe models for quality of sampling and ideas SensorBase provides a data sharing model that enables 2.3 Additional Services regarding mobility and feedback, and c.) present basic Copyright is held by the author/owner(s). fine-grained data management so that sensor data can be Since SensorBase and Campaignr share a common interconclusions from pilot studies. SenSys’07, November 6–9,preliminary 2007, Sydney, Australia. audited before being released to a public forum. Campaign CampaignManagement Management

Campaign Management Campaign Management Campaign Management Campaign Management

ACM 1-59593-763-6/07/0011

Copyright is held by the author/owner(s). SenSys’07, November 6–9, 2007, Sydney, Australia. ACM 1-59593-763-6/07/0011

Inference / Learning Inference / Learning

Inference / Learning Inference / Learning Inference / Learning Inference / Learning

Analysis and Classification Analysis and Classification

AnalysisAnalysis and and Classification and Classification Classification Analysis andAnalysis Classification

face, it is easy to build value-added services in the system to

2.3 Additional Services

Since SensorBase and Campaignr share a common interface, it is easy to build value-added services in the

help in various data collection scenarios. For instance, we have created applications to enable users to easily tag data, to help with the visualization of data, and to add additional inferences by combining different sources of information or learning techniques.

3

Human Factor Research

Having the participant involved in the sensing process leads us to an unique research challenge: Can we model the “human factors” involved in the sensing process and use them to make the data gathering process efficient while helping users better understand their contributions. Thus, we focus on three main ideals: methods to model the quality of sampling provided by participants, ways to make inferences from the spatial and temporal mobility of users, and mechanisms to provide feedback to enable efficient participation.

3.1

Quality of Sampling

Based on extensive review of possible campaign scenarios and sampling techniques, we came up with five distinct measures of quality involved in human-in-the-loop sampling. We believe that these metrics are essential in making the participatory sensing process more efficient by enabling methods to pick, refine, and evaluate the users needed and data obtained from a campaign. Timeliness - Represents the latency between when a phenomenon occurs and when the sample is available for a data processing unit. It is affected by the sampling, communication, and auditing phases of sensing. Capture - Describes the quality of a particular reading in terms of the ability in determining a particular feature probability of inference. It is an attribute affected by both the specifications of the sensors used and the capturing process by the participant. Relevancy - Tells how well the sample describes the phenomenon that is sought for capture. Ranges from irrelevant (not related to the item of interest) to completely relevant (describes exactly the item of interest). Coverage - Represents the spatial and temporal availability (mean and variance) associated with the coverage provided by a user. The spatial and temporal extent and resolution play a key role here as well. Responsiveness - Describes the probability of responding to a directed, in situ sensing request.

3.2

Mobility Mining

Another important aspect of our research is to mine the spatio-temporal behaviors of participants. Mobility analysis will help campaign designers to assess which users could be available for capturing data based on spatial and temporal constraints. Also, mobility mining will help enable context-sensitive feedback for participants. Currently, two methods are being researched as methods for mobility pattern recognition. The first involves point analysis of cell id and GPS data to find significant locations and learn common tracks. The second deals with the idea of “eigenbehaviors” where location analysis is performed to find behavior features with the most variance as indicators of multi-resolution patterns [1].

3.3

Feedback Mechanisms

We plan to use feedback to communicate the state of the campaign to the user and also as a tool for passive persuasion. We are researching using prompts, positive reinforcement techniques, social validation, and adaptive interfaces as feedback mechanisms. Just-In-Time Prompts provided using SMS and MMS will be based as spatio-temporal triggers and encourage additional sampling in areas of interest and provide feedback to enhance quality of sampling. Positive Reinforcement will be introduced through an incentive system that rewards credits weighted on quality of sampling. Credits can be used for obtaining additional services (ability to suggest campaigns) or out of band rewards (monetary gain). Adaptive Interfaces are essential to make feedback continually effective. By varying the time, the type, and frequency of feedback, we hope to continue the upward trajectory of quality and participation of individuals. Social Validation refers to the concept that people determine what is correct based on what other people determine as correct. This mechanism will be especially helpful in encouraging honest use of the system by pointing out what the “mass” has concluded.

4

Initial Pilot and Future Work

We performed a few pilot campaigns in order to get some ground truth information and also test the validity of our models and metrics. The pilots typically involved groups of 5-10 individuals over the span of 1-2 weeks. The first set of pilot campaigns involved users collecting location (GPS and Cell Tower) data, and the second campaign set dealt with walk-ability of sidewalks in a community. Specifically, in the second campaign set, users were asked to take geotagged pictures whenever a sidewalk was encountered that had structural problems. Based on these preliminary campaigns, we found that users have distinct mobility profiles and have a general area of coverage. Furthermore, certain users were generally more careful with their sampling and thus provided data with higher quality. Also, we noticed that users have different connectivity levels in general - the mean timeliness varied from minutes for certain individuals to hours for others. Overall, these campaigns have served us in providing preliminary validation of our models and research goals. We plan to continue developing algorithms and methods to handle the “human” aspect of human-in-the-loop sampling.

5

References

[1] N. Eagle and A. Pentland. Reality mining: sensing complex social systems. Personal and Ubiq. Computing, 10(4):255–268, 2006. [2] A. Pentland. Healthwear: medical technology becomes wearable. Computer, 37(5):42–49, 2004. [3] J. Burke, D. Estrin, M. Hansen, A. Parker, N. Ramanathan, S. Reddy, and MB Srivastava. Participatory Sensing. [4] A. Parker, S. Reddy, T. Schmid, K. Chang, G. Saurabh, M. Srivastava, M. Hansen, J. Burke, D. Estrin, M. Allman, et al. Network System Challenges in Selective Sharing and Verification for Personal, Social, and Urban-Scale Sensing Applications. [5] A.T. Campbell, S.B. Eisenman, N.D. Lane, E. Miluzzo, and R.A. Peterson. People-centric urban sensing. IEEE WICON, 2006.

Suggest Documents