Enhancing Mobile Access to Online Social Networks with ...

From Intermittent to Ubiquitous: Enhancing Mobile Access to Online Social Networks with Opportunistic Optimization

DI WU, Hunan University
DMITRI I. ARKHIPOV, University of California, Irvine
THOMAS PRZEPIORKA, Imperial College London
YONG LI, Tsinghua University
BIN GUO, Northwestern Polytechnical University
QIANG LIU, Dartmouth College

Accessing online social networks in situations with intermittent Internet connectivity is a challenge. We have designed a context-aware mobile system to enable efficient offline access to online social media by prefetching, caching and disseminating content opportunistically when signal availability is detected. This system can measure, crowdsense and predict network characteristics, and then use these predictions of mobile network signal to schedule cellular access or device-to-device (D2D) communication. We propose several opportunistic optimization schemes to enhance controlled crowdsensing, resource constrained mobile prefetch, and D2D transmissions impacted by individual selfishness. Realistic tests and large-scale trace analysis show our system can achieve a significant improvement over existing approaches in situations where users experience intermittent cellular service or disrupted network connection.

CCS Concepts: • Human-centered computing → Mobile computing; • Networks → Network mobility;

Additional Key Words and Phrases: Mobile access, opportunistic networking, mobile social networks, mobile crowdsensing, D2D communication

ACM Reference format:
Di Wu, Dmitri I. Arkhipov, Thomas Przepiorka, Yong Li, Bin Guo, and Qiang Liu. 2017. From Intermittent to Ubiquitous: Enhancing Mobile Access to Online Social Networks with Opportunistic Optimization. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 3, Article 114 (September 2017), 32 pages. http://doi.org/10.1145/3130979

This work is supported by the National Natural Science Foundation of China under Grant No. 61602168 and 61672217, the National Key R&D Program of China under Grant No. 2016YFB0200405, the Intel Collaborative Research Institute for Sustainable Connected Cities (ICRI Cities), and the University of California Center on Economic Competitiveness in Transportation (UCCONNECT).

Author’s addresses: D. Wu, Department of Computer Engineering, Hunan University, China; D. Arkhipov, Department of Computer Science, University of California, Irvine, United States; T. Przepiorka, Department of Computing, Imperial College London, United Kingdom; Y. Li, Department of Electronic Engineering, Tsinghua University, China; B. Guo, Department of Computer Systems and Microelectronics, Northwestern Polytechnical University, China; Q. Liu, Department of Computer Science, Dartmouth College, United States. D. Wu is the corresponding author (Email: [email protected]).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

© 2017 Association for Computing Machinery. 2474-9567/2017/9-ART114 $15.00 http://doi.org/10.1145/3130979

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Vol. 1, No. 3, Article 114. Publication date: September 2017.


1


INTRODUCTION

Social networks have changed the way information is disseminated and provide live coverage of developing events. Accessing social media content from Facebook, Twitter and various news-feed sites has become a constant part of people’s daily routines. Because of the growing popularity and ubiquitous usage of mobile devices, people have become accustomed to accessing social content through their mobile devices and see them as a key way to receive updates and interact with others. As a result, managing social interactions and obtaining up-to-the-minute bulletins via mobile devices is now commonplace. Mobile devices roaming across different geographical areas usually access online social networks through an Internet connection provided by cellular networks. However, as the demand for fetching social content over cellular networks grows, a stable cellular service is not always available, either because of a lack of coverage (e.g. underground metro systems, rural areas, national parks such as Yellowstone, or other remote areas such as mountains and near shores) or because of infrastructure overload (e.g. busy shopping centers, or stadiums during matches and concerts). Given intermittent cellular service in these scenarios, it is a challenge to provide agile mobile connectivity to access online social networks [14].

1.1

Motivations

Mobile prefetching has been proposed as a potential solution to facilitate mobile access to social networks for situations where people experience intermittent cellular service (and hence do not have continuous access to social media content), while also helping to reduce energy consumption by only accessing data at times of high signal strength. Since when, what and how to prefetch/access social media data are all tightly coupled, the goal of mobile prefetching in our system is to exploit cellular coverage when it is available, and then cache content and make it available to the social media applications of mobile devices. The main issues related to efficient mobile access of social media under intermittent cellular service are:

1) Mobile Access: the mobile access performance of mobile prefetching relies on an accurate prediction of cellular coverage. An opportunistic mobile signal is more likely to provide stable access to social media when signal estimates are derived from the controlled (supervised) collection and crowdsensing of signal readings [4, 5, 36]. Recent and historical information on mobile signal availability, combined with location information, can predict current and future cellular service, and networking activities can be coordinated based on these predictions.

2) Smart Caching: smart caching for prefetching social media content on resource-restricted mobile devices requires user- and device-context information. The user context can predict the level of interest that users have in specific content. The device context can provide real-time device status information, which can include residual power, storage and data plan details, to help manage resources. By exploiting this mobile and social context information [16, 18, 24], one can provide a customized and optimized caching solution for prefetching content on individual mobile devices.
Although mobile prefetching presents a feasible way to access and cache social media content in advance, before cellular service degrades, it requires strong cellular signals and direct connectivity to cellular networks during the prefetching process. When such connectivity does not exist, due to a lack of coverage or infrastructure overload, we need to assist mobile users that experience a disrupted network connection so that they are still able to access online social networks. To enhance mobile access to social media content in such situations, we can employ device-to-device (D2D) communication to provide multi-hop connectivity and forward cached data, prefetched by other mobile devices, to the device which has lost cellular service. 3GPP has integrated D2D communication as an underlay to enhance the next generation cellular service known as LTE-A [6]. D2D communications require base stations (BSs) to distribute cellular resources to communicating user equipment (UE) pairs. With the allocated cellular resources, a UE is able not only to help forward data from a BS or another UE to other UEs, but also to temporarily store some data in its buffer and wait for contact opportunities to send them out. These correspond to two D2D transmission modes, known as D2D connected transmission and D2D opportunistic transmission, respectively. Social media content that has been cached by mobile prefetching can be forwarded to mobile devices that have lost cellular service by one of the two D2D transmission modes, depending on the availability of D2D connectivity. Note that mobile devices such as smartphones are UEs, and we use the term "UE" to represent a mobile device in this paper. D2D connected transmission relies on the usual relaying technique, which requires an end-to-end path, while D2D opportunistic transmission utilizes the store-carry-forward paradigm, which does not need an end-to-end path. Thus, in order to benefit the whole system composed of mobile prefetching and D2D communication, the connected UEs, especially the UEs involved in mobile prefetching, are required to devote some of their own resources such as power and storage to D2D communications in an unselfish way [3]. However, UEs are held and controlled by humans, who are selfish. If most mobile users are unwilling to participate in D2D communication, their UE resources cannot be fully utilized and the D2D assisted transmission system will not be able to operate successfully. Therefore, user selfishness is crucial to D2D communication and needs to be studied for the design of better D2D systems underlaying cellular networks. Fig. 1 illustrates our motivation to integrate mobile prefetching and D2D communication as a holistic system, and their interaction in enhancing mobile access to online social networks under intermittent and disrupted networks.

Fig. 1. Connections of system components.
Specifically, after obtaining a crowdsensed cellular signal distribution, mobile prefetching functions with the support of a BS that has effective cellular coverage. The cached social media content from mobile prefetching can be accessed offline by its mobile user, or forwarded by D2D communication to other UEs experiencing a long period of no cellular service. The two D2D transmission modes, selected according to the availability of D2D connectivity, provide an agile solution to deliver the cached content to UEs that cannot directly access cellular networks. This social media content is forwarded and shared among mobile users because of their common interests, as reflected in the social applications on their UEs. Meanwhile, these mobile users have certain selfish inclinations regarding the mobile resources of their UEs during D2D communication; this user selfishness has a great impact on the performance of ubiquitous mobile access to social networks. In general, we present a mobile social network access solution that covers intermittent and disrupted scenarios with various communication connections.


1.2

Related Works

No public data are available regarding the exact cellular coverage and bandwidth available for mobile prefetching. Projects such as OpenSignal [26] rely on GPS to collect, aggregate, and publish signal strength data, but their data are usually outdated and cannot reflect real-time or recent cellular service quality. Though the concept of crowdsensing has been used in the search for roadside units (RSUs) for mobile access in vehicular networks [40, 44], these works do not take the further step of crowdsensing roadside signal characteristics after localizing the RSUs. Bartendr [30] introduces the idea of measurement and prediction based on signal traces to estimate what the signal strength will be ten minutes into the future. However, such estimates from existing databases are simplistic and error-prone, since the server side lacks efficient methods to evaluate the accuracy of the information contributed by various UEs from the same geographic area. Reliable learning schemes are needed to generate more accurate prediction results by fusing multiple estimates and inferring the reliability of each UE.

Techniques to determine mobile prefetching based on network conditions, user preferences and phone status have been studied recently. EarlyBird [37] develops a prefetch scheduling scheme that effectively integrates network access conditions and user content preferences to maximize delay reduction under resource constraints. O2SM [48] estimates which social media items are likely to be viewed by a user and provides hints for resource-efficient scheduling and delivery. RichNote [35] tackles three important challenges in realizing rich notification delivery, namely content and presentation utility modeling, notification selection, and scheduling of delivery. PowerTutor [47] and BatteryExtender [25] have been designed to estimate and manage battery behavior. APPM [27] can adapt to power and app usage dynamics for prefetching on a phone.
Nevertheless, all of these techniques assume constant/stable Internet access and have not specifically addressed the smart caching problem for both intermittent cellular service and disrupted network connection scenarios. When UEs have intermittent cellular service, we must be able to identify the periods of connectivity during which UEs have to select the content to prefetch, and distinguish between the types of social content displayed. Furthermore, if UEs have weak connections with BSs, resulting in a long period of cellular service loss, we should extend the networking functions of UEs to support D2D communication, so that the cached social media content from mobile prefetching can be shared and forwarded between UEs. However, the unavoidable selfishness of UEs asked to serve as helpers for D2D communications underlaying cellular networks has not been studied in existing works. Ma et al. use an analytical framework to address interference management and resource allocation in D2D underlaying cellular networks [23], and Chen et al. propose cooperative D2D communications based on social information and social properties [3], but both of these investigations assume that UEs are always willing to act in a cooperative way. This assumption may lead to an overestimation of system performance and a misinterpretation of the relevant properties. Though user selfishness in terms of wireless forwarding and routing has been addressed in delay tolerant networks [19], its impact on the system design and performance of D2D transmission remains to be examined.

1.3

System Overview

Fig. 2 presents an overview of our system, which leverages the benefits of mobile prefetching and D2D communication under intermittent cellular service. Since UEs have varying positions and fluctuating access states, we use a time frame to loosely mark a system period within which the access states and physical relationships of all the nodes remain nearly constant. The orange hexagons indicate the approximate cellular coverage areas of the BSs ($B_1$, $B_2$, $B_3$ and $B_4$), which can distribute cellular resources to their associated UEs for cellular direct transmissions or D2D transmissions. All the BSs have an effective connection to the Internet, as shown by the dashed green line connecting each BS and the Internet provider.

[Figure 2 legend: D2D opportunistic transmission; D2D connected transmission; cellular direct transmission; strong cellular signal; weak cellular signal; D2D communication area; cellular coverage area; time frames $t_i$ and $t_{i+\delta}$.]

Fig. 2. System overview underlaying cellular network.

During the access of social networks, there are two natural groups of UEs: subscribers and helpers. Subscribers are UEs which request and download social media content in a given time frame, such as $S_1$ and $S_2$ in Fig. 2. Each subscriber’s D2D communication area is denoted by a small purple circle. Helpers are UEs which do not request content in the same time frame (for example, $H_1$ and $H_2$, whose D2D communication areas are denoted by green circles in Fig. 2); they serve as "relays" for subscribers and participate in the two D2D transmission modes mentioned above, D2D connected transmission and D2D opportunistic transmission. Note that although all the UEs, including subscribers and helpers, are located in the cellular coverage areas of different BSs, some UEs may encounter intermittent cellular service caused by coverage gaps or infrastructure overload at a specific position and time frame. For example, $S_2$ can get a strong cellular signal and the corresponding service in the cellular coverage area of $B_2$ at $t_i$, but confronts a weak cellular signal, and therefore has no cellular service, in the cellular coverage area of $B_4$ at $t_{i+\delta}$. To reflect these spatial-temporal changes of cellular service for different UEs, Fig. 2 draws a UE’s D2D communication area as a solid circle when the UE has a strong cellular signal and as a dashed circle when its signal is weak. Given strong (usable) cellular signals, both subscribers and helpers can use cellular direct transmission to prefetch social media content before the cellular signal becomes weak and the corresponding cellular service is lost. The content cached by the mobile prefetching of helpers can later be forwarded to subscribers which have no cellular service due to a weak signal, through one of the two D2D transmission modes, depending on the availability of D2D connectivity.

Specifically, in the context and framework of D2D communication presented in Fig. 2, we classify mobile prefetching with the support of a direct cellular connection as cellular direct transmission, and the process in which some UEs (helpers) serve other UEs (subscribers) to access social networks through D2D communication as D2D transmission. Furthermore, a D2D transmission is defined as a D2D connected transmission if a helper is able to forward the cached social media content from a BS or another helper to a subscriber using a multi-hop connection. Otherwise, it is named a D2D opportunistic transmission, because the prefetched content needs to be stored in the buffer of a helper and wait for a contact opportunity to be sent to a nearby subscriber. The three types of transmission that exist in our system are distinguished by lines of three different colors, and their arrows indicate the direction of transmission, as shown in Fig. 2. We use these transmissions to illustrate several mobile social network access cases from the figure as follows:

• Cellular direct transmission: $S_2$ prefetches content from $B_3$ using cellular direct transmission at time $t_i$ before entering an area that has a weak cellular signal from $B_4$ at time $t_{i+\delta}$.
• D2D connected transmission: Because $S_1$ has a weak cellular signal and thus no service from $B_1$, $H_2$ can serve $S_1$ by establishing a path to immediately forward prefetched content from $B_4$ to $S_1$ using D2D connected transmission at time $t_i$.
• D2D opportunistic transmission: Since $H_1$ is serving $S_1$ across different time frames, $H_1$ can prefetch content from $B_2$ at time $t_i$, then carry and forward the content to $S_1$, which has no cellular service at time $t_{i+\delta}$.
• Coexistence of D2D transmissions: $H_1$ uses D2D connected transmission and D2D opportunistic transmission at time $t_{i+\delta}$ to enhance the content downloading of $S_1$.
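The classification above can be summarized as a small decision rule. The following is a minimal sketch only: the names `Mode` and `choose_mode` are hypothetical, and the strict priority order (cellular first, then connected, then opportunistic) is an assumption made for illustration, since the system also allows the two D2D modes to coexist.

```python
from enum import Enum


class Mode(Enum):
    CELLULAR_DIRECT = "cellular direct transmission"
    D2D_CONNECTED = "D2D connected transmission"
    D2D_OPPORTUNISTIC = "D2D opportunistic transmission"


def choose_mode(has_cellular_service: bool, relay_path_exists: bool) -> Mode:
    """Pick a transmission type for a subscriber in one time frame.

    Assumed priority (illustrative, not taken from the paper):
    1. usable cellular signal  -> prefetch directly from the BS;
    2. multi-hop path via helpers exists -> relay immediately;
    3. otherwise -> a helper stores, carries and forwards the content later.
    """
    if has_cellular_service:
        return Mode.CELLULAR_DIRECT
    if relay_path_exists:
        return Mode.D2D_CONNECTED
    return Mode.D2D_OPPORTUNISTIC


# A subscriber with a weak signal and no end-to-end relay path.
print(choose_mode(has_cellular_service=False, relay_path_exists=False).value)
```

The rule only encodes the definitions in the text: the presence or absence of an end-to-end path is what separates connected from opportunistic transmission.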
As an individual, a UE tends to be unwilling when required to selflessly cache and forward data for other users. However, to act as a helper in the D2D transmission of our system, a UE is required to contribute its battery resource in a D2D connected transmission, or both its battery and storage resources in a D2D opportunistic transmission. Therefore, it is necessary to study the impact of user selfishness on the performance of mobile access to social networks during D2D transmissions. To address all the issues relevant to our system composed of mobile prefetching and D2D communication, the key techniques of this paper and their contributions are summarized as follows:

• We design a mobile crowdsensing scheme with control items and reliability analysis for fast aggregation and dissemination of network conditions and opportunities in areas with unstable cellular service. These crowdsensed data can be used by UEs to predict network characteristics and prefetch social media content.
• We develop a context-aware optimization solution for prefetching and caching social media content based on network conditions, user preferences and UE capacities. This optimization has been implemented in a mobile system that can enable efficient offline access to online social media content under intermittent signal availability.
• We propose an analytic framework which uses a time-varying graph model to study D2D transmissions underlaying cellular networks. Our framework reveals the impact of user selfishness on these transmissions. To the best of our knowledge, it is the first work to study the impact of selfishness in D2D systems. We have evaluated our framework using realistic mobile tests and large-scale trace analysis.

Remark: Our previous work [41] presented a primitive implementation of a mobile prefetching application with a simple framework to collect signals to facilitate mobile access.
However, that framework does not contain reliable crowdsensing and context-aware mobile optimization. Later these issues were partially addressed in [39], specifically for mobile prefetching of social media content in the underground metro scenario. The system described in this paper differs significantly from these earlier works as follows:

1) The scenario considered in this paper has been generalized to any environment with intermittent cellular service, whereas our previous works in [39, 41] were limited to the London Underground scenario.
2) The crowdsensing scheme in [39] is based on a two-stage estimator, where a worker’s reliability is only evaluated by the control items. In contrast, our system in this paper proposes a joint estimator that scores the workers based on their performance on both control items and target items, and can therefore achieve more efficient crowdsensing results with fewer control items.
3) During mobile optimization of prefetching, the caching problem is modeled as a 0-1 knapsack problem in [39]; this was a simplification of the problem. In this paper the caching problem is redefined as a multidimensional 0-1 knapsack problem, which is a more accurate formulation. Additionally, we improve on our solution from previous work by including a series of novel filtering techniques.
4) As part of our improvements on previous works, D2D communication has been integrated into our system in this paper. We present a totally new analytic framework which uses a time-varying graph model to study the D2D transmissions underlaying cellular networks and the impact of selfishness in the system.
5) With respect to these new schemes in our system, the corresponding experiments have been updated or added using realistic tests and large-scale trace analysis.

The rest of this paper is organized as follows: Section 2 addresses crowdsensing of cellular signals for mobile access during mobile prefetching. Section 3 illustrates the design of smart caching in mobile prefetching. Section 4 explains our D2D framework and its selfishness models. Section 5 evaluates system performance. Section 6 concludes our paper.
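To make the multidimensional 0-1 knapsack formulation mentioned in the remark above more concrete, the following toy sketch selects content items to cache under two resource limits at once. It is an illustration under stated assumptions, not the solver used in this paper: the function name `best_cache`, the item names, utilities, and the two resource dimensions (storage in MB, energy units) are all hypothetical, and exhaustive search replaces the filtering techniques the paper refers to.

```python
from itertools import combinations


def best_cache(items, capacities):
    """Exhaustively solve a tiny multidimensional 0-1 knapsack.

    items: list of (name, utility, costs), where costs has one entry per resource.
    capacities: one limit per resource dimension.
    Returns (best total utility, names of the selected items).
    """
    best_value, best_set = 0.0, ()
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            # Total consumption of each resource for this candidate subset.
            totals = [sum(item[2][d] for item in subset) for d in range(len(capacities))]
            if all(t <= c for t, c in zip(totals, capacities)):
                value = sum(item[1] for item in subset)
                if value > best_value:
                    best_value, best_set = value, tuple(item[0] for item in subset)
    return best_value, best_set


# Hypothetical content items: (name, predicted utility, (storage MB, energy units)).
items = [
    ("video", 9.0, (40, 6)),
    ("photo_album", 6.0, (25, 3)),
    ("news_feed", 4.0, (5, 1)),
    ("status_updates", 3.0, (2, 1)),
]
capacities = (50, 8)  # assumed free storage and energy budget

print(best_cache(items, capacities))
```

With a single capacity this reduces to the ordinary 0-1 knapsack; adding further dimensions (for example a data plan quota) only lengthens the `costs` and `capacities` tuples. Exhaustive search is exponential in the number of items and is shown purely for clarity.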

2

CROWDSENSING OF NETWORKING CHARACTERISTICS

When mobile users experience intermittent cellular service, their UEs should be able to run mobile prefetching to cache social media content at locations where the UEs have a direct cellular connection and usable cellular service. This cached content can later be browsed by the users when they lose cellular support. Therefore, it is necessary to obtain objective network measurements and an understanding of the available cellular opportunities in order to build a stable mobile prefetching function [45]. As illustrated in Fig. 1, the availability and distribution of cellular signal data can be crowdsensed in advance through the coordination between a crowd server and crowd workers. The mobile prefetching module can use these signal crowdsensing results as on-demand input to establish a stable cellular connection and achieve agile mobile access and caching performance. In this section, we present a crowdsensing method to aggregate cellular signal collections from reliable workers. This method can measure the intermittent availability of cellular signal strength and spatial-temporal networking characteristics to enable effective mobile prefetching of social media content, therefore providing a reliable basis of opportunistic cellular connectivity for mobile users to access online social networks offline.

2.1

Mobile Signal Crowdsensing

A UE-based solution has been chosen for the collection of mobile signal data, such as signal strength. Recent versions of the Android operating system (since 4.2 Jelly Bean) provide functionality for measuring the signal strength, which is reported in units such as dBm or ASU. The ASU measure conforms to the relevant ETSI (European Telecommunications Standards Institute) specification 10.3.0 [15], where ASU values from 0 to 31 correspond to the range from -113 dBm to -51 dBm and an ASU of 99 represents an unknown or undetectable signal. Normally, an ASU value above 8 results in a signal indication of at least 3 bars on an Android smartphone, which is considered a usable signal for mobile access to the Internet. The data gathered from the dedicated collection runs of a UE can give us a good indication of the levels of the mobile signal at various locations. However, there is only so much data that one person can collect, and variances exist in the sample size of the data; it would be better to allow the collection of mobile signal data from any number of UEs. Given these considerations, crowdsensing [21, 22, 46], an approach to outsource or share sensing tasks among workers and then collect large amounts of sensing data from crowds (workers), is a promising solution for the aggregation and dissemination of opportunistic signal data.

[Figure 3 panels: (a) Crowdsensed data of signal strength from an underground metro line. (b) Signal tracker. (c) Crowdsensing operation.]

Fig. 3. Illustration of mobile signal crowdsensing.

We have implemented a crowd server that collects updated cellular data (e.g. 3G, 4G) from workers’ surrounding areas and allows clients to easily access the signal readings and other information (e.g. associated timing and location traces) on synchronization. For example, Fig. 3 (a) shows a subset of crowdsensed signal strength data from an underground metro line, where mobile users usually encounter intermittent cellular service. Areas shaded in dark gray are tunnel sections and areas in white are platform sections of the metro line. These crowdsensed ASU data are labeled with their collection times and locations, and therefore present an estimate of signal strength and its changes during a specific underground metro trip.

During the crowdsensing process, workers can use commodity UEs to collect surrounding signal strength data and upload the data to the crowd server. The Funf framework [7], which targets Android smartphones and offers a built-in set of data probes, can be chosen as the basic signal collection function. A framework probe is enabled (but "sleeping") when the UE starts to run the framework. At regular intervals, a probe enters a running state to run its data collection code and, upon completion, returns to the enabled state until it is scheduled to run again. We have extended the Funf framework as a background application on the UE for measuring mobile signal characteristics. Fig. 3 (b) presents our implementation of such a signal tracker. The measured ASU values of surrounding signal strength over time, as displayed in its statistics graph, are uploaded to the crowd server if the UE is a worker. Given a destination location (e.g. the specific name of a metro platform), the signal tracker can also download crowdsensed data to estimate signal availability and schedule mobile prefetching along the route to the destination.

Unfortunately, it is not always obvious how best to combine the workers to collect reliable crowdsensed data; the fact that the workers have unknown and diverse levels of expertise introduces systematic process biases [11].
There are several factors that could account for the performance differences in data collected from crowd workers, such as UE hardware, current position, time of day and others. Therefore we need a way of scoring the workers, that is, estimating their expertise, bias, and any other associated parameters, in order to combine their answers more effectively [12, 13, 20]. One way to score the workers is by using their past performance on tasks similar to the current task. However, this approach is not always practical. It is difficult and costly to maintain historical records for anonymous workers. Another problem with this approach is that a worker’s past tasks may be very different from their current task, considering the variety of traveling routes and collection locations. One alternative is to "seed" some control items with known answers into each worker’s assigned tasks, then score each worker’s reliability using these control items, and weight their answers according to their reliability [32, 42, 43]. The operation of using control items to evaluate a worker’s reliability is illustrated in Fig. 3 (c). The control items are chosen from the values of mobile signal strength (ASU values) at specific locations and times, which can be obtained from the cellular service provider or collected by trusted workers that have already been verified by the crowd server. The correct answers to these control items are known to the crowd server, but not known to a potential worker that needs reliability evaluation. When the worker participates in the crowdsensing process, the UE first reports its historical and current spatial-temporal information to the crowd server. The crowd server then matches the information against its database and selects $\ell$ items with similar spatial-temporal attributes, including $k$ control items (correct answers known by the crowd server) and $\ell - k$ target items (correct answers unknown by the crowd server), and sends these items back to the worker.
The worker's UE then obtains signal strength measurements from its signal tracker, as recorded by the signal graph over time, to answer these items. The answers are later uploaded to the crowd server for correctness verification, and further used in joint estimation (see Section 2.2) to evaluate the worker's reliability.

The whole process of mobile crowdsensing that uses control items to improve the reliability of aggregated mobile signal data for mobile prefetching can be formulated as follows. Assume there is a set $\mathcal{T}$ of target items (data to be collected by workers), associated with a set of labels $\mu_{\mathcal{T}} := \{\mu_i : i \in \mathcal{T}\}$ whose true values $\mu_{\mathcal{T}}^*$ we want to estimate. In addition, we have a set $\mathcal{C}$ of control (or training) items whose true labels $\mu_{\mathcal{C}}^* := \{\mu_i^* : i \in \mathcal{C}\}$ are known. We denote the set of workers by $\mathcal{W}$; each worker $j$ is associated with a parameter $\nu_j^*$ that characterizes their performance bias. We denote the complete vector of worker parameters by $\nu := \{\nu_j^* : j \in \mathcal{W}\}$. Denote by $n_t$ the number of target items and by $m$ the number of workers. Let $\partial_i$ be the set of workers assigned to item $i$, and let $\partial_j^t$ (and $\partial_j^c$) be the set of target (and control) items labeled by worker $j$. The assignment relationship between the workers and the target items can be represented by a bipartite graph $G_t = (\mathcal{T}, \mathcal{W}, E_t)$, where there is an edge $(ij) \in E_t$ iff item $i$ is assigned to worker $j$. Let $r$ be the number of workers assigned to each target item, and let each worker answer $\ell$ items, including $k$ control items and $\ell - k$ target items, so that $r = m(\ell - k)/n_t$.

Denote by $x_{ij}$ the label we collect from worker $j$ for item $i$. In general, we can assume that $x_{ij}$ is a random variable drawn from a probability distribution $p(x_{ij} \mid \mu_i^*, \nu_j^*)$. It can be modelled by a Gaussian model: $x_{ij} = \mu_i^* + b_j^* + \xi_{ij}$, where $\xi_{ij} \sim \mathcal{N}(0, \sigma^{*2})$, $\mu_i^*$ is the quantity of interest of item $i$, $b_j^*$ is the bias of worker $j$, and $\sigma^{*2}$ is the variance. This Gaussian model captures heterogeneous biases across workers that are commonly observed in practice, for example in their UEs' perception of signal strength; these biases can have significant effects on the estimation of reliability. Note that the biases are not identifiable solely from the crowdsensed labels $\{x_{ij}\}$, making it necessary to add some control items when decoding the answers.
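The unidentifiability of the biases can be made concrete with a short worked step. It is not part of the original derivation, but it follows directly from the Gaussian model above. For any constant $c$,

```latex
x_{ij} = \mu_i^* + b_j^* + \xi_{ij}
       = (\mu_i^* + c) + (b_j^* - c) + \xi_{ij},
\qquad \text{for all } i, j .
```

The parameter sets $\{\mu_i^*, b_j^*\}$ and $\{\mu_i^* + c, b_j^* - c\}$ therefore induce exactly the same distribution over the labels, so the labels alone determine the item values only up to a common shift. Provided the assignment graph is connected, pinning a single $\mu_i^*$ to its known value through a control item fixes $c$, which is why at least one control item is needed.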

2.2

Crowdsensing with Control Items

To incorporate control items into the mobile crowdsensing process, we introduce a consensus method that uses a joint maximum likelihood estimator to score the workers based on their performance on both control items and target items. We present the method in terms of a general model $p(x_{ij} \mid \mu_i, \nu_j)$ here; the updates for the Gaussian model reduce to simple weighted averaging updates, see Algorithm 1.

Algorithm 1 Joint Estimation with Ground Truth (for Gaussian Model)
Input: The labels $\{x_{ij}\}$, and true labels $\mu_{\mathcal{C}}^*$ of the control items.
Initialize $\hat{b}_j = 0$ and weight $\hat{w}_j = 0$ for all workers $j$, and $\hat{\mu}_{\mathcal{C}} = \mu_{\mathcal{C}}^*$.
Iterate until convergence:
  Scoring: $\hat{b}_j = \sum_{i \in \partial_j} (x_{ij} - \hat{\mu}_i) / |\partial_j|$ and $\hat{w}_j = |\partial_j| / \sum_{i \in \partial_j} (x_{ij} - \hat{\mu}_i - \hat{b}_j)^2$, for all workers $j$.
  Prediction: $\hat{\mu}_i = \sum_{j \in \partial_i} \hat{w}_j (x_{ij} - \hat{b}_j) / \sum_{j \in \partial_i} \hat{w}_j$, for all items $i \in \mathcal{T}$.

Joint Estimator: we directly maximize the joint likelihood of the crowdsensed labels $\{x_{ij}\}$ of both target and control items, with $\mu_{\mathcal{C}}$ of the control items set to the true values $\mu_{\mathcal{C}}^*$. That is,

$$[\hat{\mu}_{\mathcal{T}}, \hat{\nu}] = \arg\max_{[\mu_{\mathcal{T}}, \nu]} \; \sum_{i \in \mathcal{C}} \sum_{j \in \partial_i} \log p(x_{ij} \mid \mu_i^*, \nu_j) + \sum_{i \in \mathcal{T}} \sum_{j \in \partial_i} \log p(x_{ij} \mid \mu_i, \nu_j), \qquad (1)$$

which can be solved by block coordinate descent [34], alternately optimizing $\mu_{\mathcal{T}}$ and $\nu$, see Algorithm 2. The joint estimator estimates the workers' parameters based on both the control items and the target items, even though the true labels of the target items are unknown. This is because the labels $x_{ij}$ provide information on $\mu_i^*$ through the probabilistic distribution $p(x_{ij} \mid \mu_i^*, \nu_j^*)$.

Algorithm 2 Joint Consensus Algorithm for incorporating known items
Initialize $\hat{\mu}_{\mathcal{T}}$ and $\hat{\nu}$.
Iterate until convergence:
  Scoring: $\hat{\nu}_j = \arg\max_{\nu_j} \sum_{i \in \partial_j \cap \mathcal{C}} \log p(x_{ij} \mid \mu_i^*, \nu_j) + \sum_{i \in \partial_j \cap \mathcal{T}} \log p(x_{ij} \mid \hat{\mu}_i, \nu_j)$, for all $j \in \mathcal{W}$.
  Prediction: $\hat{\mu}_i = \arg\max_{\mu_i} \sum_{j \in \partial_i} \log p(x_{ij} \mid \mu_i, \hat{\nu}_j)$, for all $i \in \mathcal{T}$.
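The scoring and prediction updates of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `joint_estimate`, the dictionary-based data layout, the label-mean initialization of the target items, the fixed iteration count and the small floor on the squared residuals are all assumptions made here for the sake of a runnable example.

```python
def joint_estimate(labels, control_truth, n_iter=100):
    """Sketch of Algorithm 1 (Gaussian model).

    labels:        dict {(item, worker): x_ij}
    control_truth: dict {item: known true value} for the control items
    Returns (mu, bias, weight) dictionaries.
    """
    items = sorted({i for i, _ in labels})
    workers = sorted({j for _, j in labels})
    by_worker = {j: [i for (i, jj) in labels if jj == j] for j in workers}
    by_item = {i: [j for (ii, j) in labels if ii == i] for i in items}

    # Control items are pinned to their ground truth. Target items start from
    # the plain label average (an assumption; the source does not specify it).
    mu = {}
    for i in items:
        if i in control_truth:
            mu[i] = control_truth[i]
        else:
            mu[i] = sum(labels[(i, j)] for j in by_item[i]) / len(by_item[i])

    # Placeholder values; both are overwritten by the first scoring step.
    bias = {j: 0.0 for j in workers}
    weight = {j: 1.0 for j in workers}

    for _ in range(n_iter):
        # Scoring: per-worker bias and inverse-variance weight.
        for j in workers:
            resid = [labels[(i, j)] - mu[i] for i in by_worker[j]]
            bias[j] = sum(resid) / len(resid)
            sq = sum((r - bias[j]) ** 2 for r in resid)
            weight[j] = len(resid) / max(sq, 1e-9)  # floor avoids division by zero
        # Prediction: weighted average of bias-corrected labels (targets only).
        for i in items:
            if i in control_truth:
                continue
            num = sum(weight[j] * (labels[(i, j)] - bias[j]) for j in by_item[i])
            den = sum(weight[j] for j in by_item[i])
            mu[i] = num / den
    return mu, bias, weight
```

In a noise-free setting with two control items and four target items, the iteration recovers both the item values and the worker biases, which illustrates how the control items remove the common-shift ambiguity.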

2.3

Optimal Number of Control Items

From the above crowdsensing formulations, control items with known answers can be used to evaluate workers' performance, and hence improve the combined results on the target items with unknown answers. Considering the intermittent connectivity in our scenario, the total number of mobile signal readings each worker can collect is limited. To further improve crowdsensing efficiency, we would like to learn the reliability of each worker using fewer control items, so that each worker can provide answers for more target items. In other words, crowdsensing practitioners need a rule of thumb for achieving fast crowdsensing with a minimum (optimal) number of control items. Correspondingly, the computational question in fast crowdsensing is to construct an estimator $\hat{\mu}_{\mathcal{T}}$ of the true labels $\mu_{\mathcal{T}}^*$ based on the crowdsensed labels $\{x_{ij}\}$, such that the expected mean square error (MSE) on the target items, $E\|\hat{\mu}_{\mathcal{T}} - \mu_{\mathcal{T}}^*\|^2$, is minimized by the optimal choice of $k$. Here we give theoretical results for this problem by studying the properties of the joint estimator with our Gaussian model.

We first introduce some matrix notation. Let $A_t$ be the adjacency matrix of the bipartite assignment graph $G_t$ between workers and the target items, where $a_{ij} = 1$ if item $i$ is labeled by worker $j$. Let $r_i$ be the number of workers assigned to the $i$-th target item, and let $\ell_j^t$ and $\ell_j^c$ be the numbers of target items and control items assigned to the $j$-th worker, respectively. Let $R_t := \mathrm{diag}(\{r_i : i \in \mathcal{T}\})$ be the diagonal matrix formed by the degree sequence of the target items, and similarly define $L_t = \mathrm{diag}(\{\ell_j^t : j \in \mathcal{W}\})$ and $L_c = \mathrm{diag}(\{\ell_j^c : j \in \mathcal{W}\})$.

Theorem 1. For the Gaussian model with $x_{ij} = \mu_i^* + b_j^* + \xi_{ij}$, where $\xi_{ij}$ are i.i.d. noise drawn from $\mathcal{N}(0, \sigma^{*2})$, the expected MSE of the joint estimator defined in Eq.(1) is

$$E\Big[\sum_{i \in \mathcal{T}} \|\hat{\mu}_i - \mu_i^*\|^2 / n_t\Big] = \sigma^{*2}\, \mathrm{tr}\big((R_t - A_t (L_t + L_c)^{-1} A_t^T)^{-1}\big) / n_t, \qquad (2)$$

where "tr" denotes the trace of a matrix. If $A_t$ is regular, with $R_t = rI$ and $L_t = (\ell - k)I$, where $I$ is the identity matrix, this simplifies to

$$E\Big[\sum_{i \in \mathcal{T}} \|\hat{\mu}_i - \mu_i^*\|^2 / n_t\Big] = \frac{\sigma^{*2}}{r}\, \mathrm{tr}\Big(\big(I - \tfrac{\ell - k}{\ell} W\big)^{-1}\Big) / n_t, \qquad (3)$$

where $W = R_t^{-1} A_t L_t^{-1} A_t^T$.


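Proof. Assume $B := I - R_t^{-1} A_t (L_t + L_c)^{-1} A_t^T$ is invertible. The solution of the joint estimator on the Gaussian model is $\hat{\mu}_{\mathcal{T}} = \mu_{\mathcal{T}}^* + B^{-1} z_{\mathcal{T}}$, where

$$z_i = \frac{1}{r_i} \sum_{j \in \partial_i} (\xi_{ij} - \bar{\xi}_j), \qquad \bar{\xi}_j = \frac{1}{\ell_j^t + \ell_j^c} \sum_{i \in \partial_j^t \cup \partial_j^c} \xi_{ij}, \qquad \xi_{ij} = x_{ij} - \mu_i^* - b_j^*,$$

for all $i \in \mathcal{T}$. We obtain Eq.(2) by calculating $\mathrm{Var}(\hat{\mu}_{\mathcal{T}})$.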

Eq.(3) establishes an explicit connection between the MSE and the spectral structure of the bipartite graph $G_t$. Following the theory on the second eigenvalue of random regular graphs introduced in [8], we consider the eigenvalues $1 = \lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ of $W := R_t^{-1} A_t L_t^{-1} A_t^T$, where the second largest eigenvalue $\lambda_2$ famously characterizes the connectivity of the graph $G_t$. Roughly speaking, $G_t$ has better connectivity if $\lambda_2$ is small, and vice versa. Observe that

$$\mathrm{tr}\Big(\big(I - \tfrac{\ell - k}{\ell} W\big)^{-1}\Big) = \sum_{i=1}^{n_t} \Big(1 - \frac{\ell - k}{\ell} \lambda_i\Big)^{-1} \le \frac{\ell}{k} + \frac{n_t - 1}{1 - \frac{\ell - k}{\ell} \lambda_2}. \qquad (4)$$

Therefore, the joint estimator performs better when $\lambda_2$ is small, i.e. when the graph is strongly connected. Intuitively, better connectivity "couples" the items and workers more tightly together, making inference errors less likely. Besides hoping for a small error, one may also want the assignment graph to be sparse, i.e. to use fewer labels. Graphs that are both sparse and strongly connected are known as expander graphs and have been found universally important in areas like robust computer networks, error-correcting codes, and communication networks. It is well known that large sparse random regular graphs are good expanders [8], and hence a near-optimal allocation strategy for crowdsensing [17]. On such graphs, we can also estimate the optimal $k$ in a simple form.

Theorem 2. Assume $A_t$ is a random regular bipartite graph, and $n_t = m$. We have that

$$E\Big[\sum_{i \in \mathcal{T}} \|\hat{\mu}_i - \mu_i^*\|^2 / n_t\Big] = \frac{\sigma^{*2}}{\ell - k} \Big[\frac{n_t - 1}{n_t}\Big(1 + O\Big(\frac{1}{\ell}\Big)\Big) + \frac{\ell}{n_t k}\Big], \qquad (5)$$

with probability one as $n_t \to \infty$. If in addition $\ell \to \infty$, the optimal $k$ that minimizes Eq.(5) can be derived as follows: $k^* = \sqrt{\ell^2/n_t + \ell^2/n_t^2 + 1/4} - \ell/n_t - 1/2 \approx \ell/\sqrt{n_t}$.

Proof. Use Eq.(4) and the bound given in [28] for $\lambda_2$ of large random regular bipartite graphs.

Therefore, the optimal number of control items $k$ that minimizes the expected MSE should scale as $O(\ell/\sqrt{n_t})$ when using joint estimators. That is, the optimal $k$ of the joint estimator scales linearly w.r.t. the budget $\ell$. In addition, the optimal $k$ for the joint estimator decreases as the total number $n_t$ of target items increases. Because $n_t$ is usually quite large in practice, the number of control items is usually very small. In particular, as $n_t \to \infty$, we have $k^* = 1$, that is, there is no need for control items beyond fixing the unidentifiability issue of the biases.

Our system employs fast crowdsensing to aggregate the most recent signal readings and provide updated signal data to users on synchronization. The crowdsensed dataset can be used to determine the performance difference among cellular base stations, and to increase the reliability of signal traces at specific times and locations. Through crowdsensing coordination between servers and UEs involved in the process, previous signal traces can be obtained from the remote dataset in advance. UEs can then use these readings in conjunction with current location and future movements to prefetch online social media at potential areas along the route that have usable cellular signals.
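As a rough numerical illustration of this scaling, the closed-form expression for $k^*$ can be compared with a brute-force search over the leading term of Eq.(5). The sketch below is an assumption-laden check rather than part of the original analysis: it drops the $O(1/\ell)$ correction, and the function names and the example values $\ell = 100$, $n_t = 100$ are chosen here purely for illustration.

```python
import math


def closed_form_k(ell, n_t):
    """k* = sqrt(ell^2/n_t + ell^2/n_t^2 + 1/4) - ell/n_t - 1/2 (Theorem 2)."""
    return math.sqrt(ell ** 2 / n_t + ell ** 2 / n_t ** 2 + 0.25) - ell / n_t - 0.5


def leading_mse(k, ell, n_t, sigma2=1.0):
    """Leading term of Eq.(5); the O(1/ell) correction is ignored (assumption)."""
    return sigma2 / (ell - k) * ((n_t - 1) / n_t + ell / (n_t * k))


def brute_force_k(ell, n_t):
    """Integer k in [1, ell - 1] that minimizes the leading MSE term."""
    return min(range(1, ell), key=lambda k: leading_mse(k, ell, n_t))


if __name__ == "__main__":
    ell, n_t = 100, 100
    print("closed form :", closed_form_k(ell, n_t))
    print("brute force :", brute_force_k(ell, n_t))
    print("ell/sqrt(nt):", ell / math.sqrt(n_t))
```

With a budget of 100 items per worker and 100 target items, all three quantities land close to 10, so only about a tenth of each worker's budget would be spent on control items.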


3

CONTEXT-AWARE MOBILE PREFETCHING

As shown in Fig. 1, the signal crowdsensing results, as a form of mobile context information, can be used in mobile prefetching as the input describing network conditions to support the caching of social media content. Meanwhile, since cellular service is not always stable in the intermittent scenarios we have described, having mobile users constantly check social networks is a drain on limited UE resources. Therefore, besides network conditions, we also need to consider other mobile context factors to achieve smart caching during the mobile prefetching process. In this section, we design an optimization solution that lets mobile users' UEs detect cellular service opportunities by learning from crowdsensed signal traces, and choose the optimal social media items to cache based on the network condition, the user's preferences, and the UE's status.

3.1

Mobile Cost of Prefetching

Social and news-feed style networks perform metadata-based learning to sort each user's news feed according to their "user-post" affinity. However, such an ordering based on the user's click history assumes ubiquitous mobile connectivity and unlimited UE resources. For example, Facebook does not factor bandwidth vs. size of news items into its ordering algorithm. We can solve the prefetching and caching problem by intelligently using the ordering calculated by social networks and imposing on this ordering the constraints from the low-connectivity environment and the limited resources of the UE, e.g. power, storage, and data plan. Specifically, the problem reduces to a multidimensional 0-1 knapsack problem (0-1 MKP), where the dimensions refer to resource constraints.

Let $I = (i_{|I|}, \ldots, i_1)$ be the sequence of affinity-ordered news items. The ordering implies that $Q(i_{x-1}) < Q(i_x)$, where $Q(i_x)$ is the likelihood that news item $i_x$ (the $x$-th item returned by social networks) will be viewed by the user. Assume that the media-caching module is allocated a fixed block of available storage space ($M$) and a fixed energy budget ($E$). Finally, assume that there is a fixed data-plan budget, which indicates the number of bytes that may be received by the cellular interface for the purpose of caching. Each time the optimization is run, $E$ may be set to a different value; for example, if $E$ is limited to 5% of the power available to the UE at the time of operation, this value will differ from run to run. Further, suppose that the UE may trigger the caching application intelligently, for example prior to the user's entry into areas of lower connectivity. Cached content can be removed at fixed time intervals, for example, after the amount of time that one will be traveling to the area.
Finally, assume functions $f(i, t)$ and $g(i)$, which return the power and storage expenditures, respectively, for downloading and caching news item $i$ at time $t$. We adopt the energy models from [2]. The energy cost under cellular networks is modeled in Eq.(6):

$$e_{cell}(i, t) = R_{cell} + c_{cell} \times \frac{s_i}{bw_{cell}(t)} + e_{tail}, \qquad (6)$$

where:
• $R_{cell}$ is a constant indicating the ramp energy cost for a given cellular interface.
• $c_{cell}$ is a power coefficient for the interface.
• $bw_{cell}(t)$ is the bandwidth available on the cellular interface at time $t$.
• $s_i$ is the size in bytes of data item $i$.
• $e_{tail}$ is the estimate of cellular tail energy costs.

Since contents are pre-fetched in batches, we have $e_{tail} = c_{tail} \times T_{tail} / l_{avg}$, where $c_{tail}$ is the power coefficient for cellular tail energy, $T_{tail}$ is the typical cellular tail time, and $l_{avg}$ is the historical average of the number of contents downloaded in a batch by the pre-fetcher. To predict cellular tail consumption for on-demand fetches we have $e_{tail} = c_{tail} \times \min(T_{inactive}, T_{tail})$, where $T_{inactive}$ is the historical average of the idle period between two consecutive content requests from the user. We assume that $bw_{cell}(t) = 0$ when no cell connectivity is available at time $t$.

With regard to data plan usage, we define $D(t)$ to represent the number of bytes allocated as available for caching from the cellular data plan at time $t$. $D(t)$ is monotonically non-increasing; when cache data is downloaded through the cellular connection at time $t - 1$, $D(t)$ decreases by the amount of data that was downloaded at time $t - 1$. Clearly the storage cost of item $i$ is $M(i) = s_i$. Note that the bandwidth at some time $t$ is a function of the signal strength of the available wireless interface at $t$. In this optimization, we assume that all news items will be cached in a single batch with the bandwidth available at the time that the optimization is initiated. As the downloads will be batched, the ramp and tail energy costs will be incurred once per batch.
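The energy model of Eq.(6) and the two tail-energy estimates translate directly into code. The following is a minimal sketch with hypothetical function and parameter names; returning infinity when the bandwidth is zero is an interpretation adopted here for the no-connectivity case, and the numbers in the usage lines are made up for illustration rather than measured coefficients.

```python
import math


def tail_energy_batch(c_tail, t_tail, l_avg):
    """Tail energy per item when prefetching in batches: c_tail * T_tail / l_avg."""
    return c_tail * t_tail / l_avg


def tail_energy_on_demand(c_tail, t_tail, t_inactive):
    """Tail energy for an on-demand fetch: c_tail * min(T_inactive, T_tail)."""
    return c_tail * min(t_inactive, t_tail)


def cell_energy(size_bytes, bandwidth, r_cell, c_cell, e_tail):
    """Eq.(6): ramp energy + transfer energy + tail energy.

    bandwidth == 0 models "no cell connectivity"; the cost is then treated
    as infinite (an assumption of this sketch).
    """
    if bandwidth <= 0:
        return math.inf
    return r_cell + c_cell * size_bytes / bandwidth + e_tail


if __name__ == "__main__":
    e_tail = tail_energy_batch(c_tail=0.6, t_tail=10.0, l_avg=5.0)
    print(cell_energy(size_bytes=1000, bandwidth=500.0,
                      r_cell=2.0, c_cell=0.5, e_tail=e_tail))
```

Because larger batches divide the tail cost across more items, the batch estimate falls as `l_avg` grows, which is the motivation for batching downloads in the first place.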

3.2

Mobile Optimization of Prefetching

As mentioned earlier, certain optimization constants are set by a UE before prefetching occurs. Suppose that the optimization is run at time $t$. We formulate the power expenditure as $f(i, t) = e_{cell}(i, t)$ if $D(t) - s_i \ge 0$, and $f(i, t) = \infty$ otherwise. We also formulate the storage expenditure as $g(i) = s_i$. The assumptions about downloading news items for caching are based on the cellular availability at time $t$. Note that $e_{cell}(i, t)$ depends on $bw_{cell}(t)$, as described in Section 3.1. The problem is now a 0-1 MKP. In the problem formulated in Eq.(7), only the $x_i$ are variables (indicating which news items to cache); the remaining terms are constants derived at run time.

$$\begin{aligned} \max \quad & \sum_{i=1}^{|I|} (|I| - i) \times x_i, \\ \text{s.t.} \quad & \sum_{i=1}^{|I|} f(i, t) \times x_i \le E, \\ & \sum_{i=1}^{|I|} g(i) \times x_i \le M, \\ & x_i \in \{0, 1\}, \end{aligned} \qquad (7)$$

where $t$ is constant at the time the optimization is evaluated. The 0-1 MKP is an NP-hard problem, and no fully polynomial-time approximation scheme exists for it unless P=NP, so a solution with bounded sub-optimality cannot be generated efficiently in general. Our problem has the additional complication that the collection of social media items available at any one time can be presumed to be infinite. Given the difficulty of the optimization task, the best reasonable solution method is a heuristic with unbounded sub-optimality, and we must restrict the available list of items to some finite size to make the problem tractable even for such a heuristic. Balasubramanian et al. [2] show that only the first 10 results for a search query originating from a UE have a non-negligible likelihood of user interaction, and this result also applies to news-item recommender systems. Furthermore, as the sub-sequences of news items returned from social networks are non-increasing in value, our algorithm should not consider news items too late in the order. We restrict our problem to consider only 20 highly ranked news items, and then solve this restricted problem with a branch and bound procedure to arrive at a feasible solution for our original problem. Shih's [31] successful and efficient application of branch and bound to MKPs larger than the ones we are considering motivates our choice of branch and bound as our optimization technique.
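The restricted problem in Eq.(7) is small enough to be solved by a straightforward depth-first branch and bound. The sketch below is illustrative only and is not Shih's [31] procedure: it uses a deliberately simple optimistic bound (the sum of all remaining item values), and the function name, argument layout and example numbers are assumptions. Items whose energy cost is infinite, as happens when the data-plan allowance $D(t)$ is exceeded, can never be selected.

```python
import math


def branch_and_bound(values, energy, storage, e_budget, m_budget):
    """Depth-first branch and bound for a two-constraint 0-1 knapsack.

    values[i]  : objective weight of item i (|I| - rank in Eq.(7))
    energy[i]  : f(i, t); may be math.inf for items that cannot be fetched
    storage[i] : g(i) = s_i
    Returns (best_value, sorted list of chosen item indices).
    """
    n = len(values)
    # suffix[i] = sum of values[i:], an optimistic bound on what is still attainable.
    suffix = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + values[i]

    best = {"value": -1, "chosen": []}

    def search(i, value, e_used, m_used, chosen):
        if value + suffix[i] <= best["value"]:
            return  # bound: this subtree cannot beat the incumbent
        if i == n:
            best["value"], best["chosen"] = value, list(chosen)
            return
        # Branch 1: cache item i if both budgets allow it.
        if e_used + energy[i] <= e_budget and m_used + storage[i] <= m_budget:
            chosen.append(i)
            search(i + 1, value + values[i],
                   e_used + energy[i], m_used + storage[i], chosen)
            chosen.pop()
        # Branch 2: skip item i.
        search(i + 1, value, e_used, m_used, chosen)

    search(0, 0, 0.0, 0, [])
    return best["value"], best["chosen"]


if __name__ == "__main__":
    n_items = 5
    values = [n_items - rank for rank in range(1, n_items + 1)]  # 4, 3, 2, 1, 0
    print(branch_and_bound(values, [6, 3, 3, 2, 1], [5, 2, 2, 2, 1], 7, 5))
```

In the example, the top-ranked item is so expensive that caching the second and third items together yields a higher objective, which is the kind of trade-off the optimization is meant to expose.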

Given that only the top 10 retrieved items have a non-negligible likelihood of being interacted with at any given instance, our caching optimization acts as a filter on the item sequence returned. The sequence cached after filtration is a sub-sequence of the news feed. The final step in our optimization is a branch and bound optimization that filters out news items that are too resource intensive given their order in the sequence. However, branch and bound cannot be applied to an infinite collection in which the cost of an arbitrary item cannot be predicted. To set up a useful and solvable optimization problem with a finite number of variables, we first filter out items previously cached and consumed, and then remove overly resource-intensive or poor-quality items. These filters decrease the number of variables in the optimization to a size that can be solved quickly and effectively.

Metadata regarding news items arrives at the optimizer in batches; the arriving metadata may reference pieces of content already in the cache. Thus, prior to running our optimization, we must remove from consideration all news items that are already in the cache. As metadata requests are relatively cheap when batched, we request 100 metadata items at optimization time (the largest number allowed in the Facebook API [1]). If we detect that 10 or more of the received metadata items reference news items already in the cache, we request a further 100 metadata items, until our collection of candidate news items has size at least 90 and at most 100 (with preference given to earlier news items). The range 90 to 100 is chosen somewhat arbitrarily. The intent is to select a large number of items of which roughly the best half will have the possibility of being cached.
Prior to running our branch and bound optimization, we must address the fact that while the news items arrive in non-decreasing order of user affinity, the ordering does not take the energy and memory expenditures into account. By selecting only the highly ordered items returned by social networks, we may bias the result of the branch and bound optimization in favor of interesting but resource-intensive items. To fix this bias, we further filter news items by applying Toyoda's [33] primal effective gradient method to the collection of metadata items returned from social networks. We use the primal effective gradient method to filter our collection of unique news items into a set of 40 candidate items (the 40 news items with the highest effective gradients). Toyoda's method is an improvement heuristic that achieves both a feasible solution and an approximate ranking of the variables reflecting their quality when included in a solution. We use this ordering to filter out all but the best 40 items. We do not make use of the approximate solution given by Toyoda's algorithm; instead, we use only the approximate ranking. This means that even though we select 40 items at this stage, a solution that caches all 40 items may still be infeasible. Finally, we use the 40 or fewer remaining items to formulate a 0-1 MKP optimization problem. We run a branch and bound approach (a backtracking tree search) influenced by the implementation in [31] to obtain a solution of our 0-1 MKP with reasonable quality. The branch and bound search returns the final set of news items, which we then request from social networks for caching. The outline of our optimization procedure can be found in Algorithm 3.

Remark: Ideally, the optimization should run as an on-demand background service driven by cellular connectivity events when the UE is under intermittent cellular service, and it should be turned on and off automatically without any explicit user intervention.
By using the location information and the Android Geofencing APIs [10], the UE can define geographical areas which can trigger actions within applications. In these geographical areas, stable cellular service can be guaranteed in terms of the ASU value of the cellular signal, and the availability and distribution of usable cellular signal strength can be obtained in advance from the crowd server, as described in Section 2.1. When a user moves from an area of no connectivity into an area with connectivity, this event triggers the caching module of our application to sense the available news items, optimize the set of news items to be cached, and then cache them. In summary, the mobile optimization is triggered as follows:

• Obtain crowdsensing data: The signal distribution information is downloaded from the crowd server and updated locally in a UE. A list of crowdsensed coordinates of usable signal is used to generate and register geofence areas; this geographical area information is cached in the UE for mobile prefetching.
• Run optimization: When the UE enters a geofence area and regains effective cellular connectivity after a period of being disconnected, mobile prefetching is triggered, which in turn enables optimization and caching.
• Customization: A mobile user may specify a fixed interval used for mobile prefetching after the UE enters a geofence area, and the optimization should stop running once the interval has passed. Otherwise, mobile prefetching is disabled when the UE again detects no cellular service. The size of the geographical area used for geofencing can also be customized by mobile users.

Memory and energy constraints are recalculated at the beginning of each optimization. When the constraints are originally specified by the mobile users, they are specified as relative values (percentages of the full energy and memory capacity of the UE). These relative constraints are translated into absolute constraints prior to each optimization and take into account the resources available at the time of the optimization. We will explicitly illustrate the settings of these customization options in Section 5.1.

Algorithm 3 Optimize
Input: E // Millijoules available for caching.
Input: M // Bytes available for caching.
Input: C // SQLITE cache.
Input: F // HTTP session with social network.
Output: I // Items to cache.
Y = () // Y is a sequence.
while |Y| < 100 do
    H = F.getNext(100) // Next 100 meta items.
    for h ∈ H do
        if h ∉ C then
            Y = Y ⊕ (h) // ⊕ is concatenation.
        end if
        if |Y| ≥ 90 then
            break
        end if
    end for
end while
Z = PEG(Y) // PEG = "primal effective gradient". Item values come from the order in Y.
I = BB((z_1, ..., z_40)) // Branch and Bound on the 40 best-ranked items of Z; returns a set of news-item references.
return I
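The candidate-collection loop at the top of Algorithm 3 can be sketched as follows. This is an interpretation of the prose description (collect at least 90 and at most 100 uncached items, preferring earlier ones) rather than a literal transcription of the pseudocode. `fetch_next` stands in for the paginated metadata request, and `make_feed` is a hypothetical stub that simulates a feed; neither corresponds to a real Facebook API call.

```python
import itertools


def collect_candidates(fetch_next, cached_ids, batch_size=100,
                       min_items=90, max_items=100):
    """Collect uncached news-item ids in feed order.

    fetch_next(n) must return the next n metadata ids, or an empty list once
    the feed is exhausted (a guard that the source pseudocode does not include).
    """
    candidates = []
    while len(candidates) < min_items:
        batch = fetch_next(batch_size)
        if not batch:
            break  # nothing more to fetch
        for item_id in batch:
            if item_id not in cached_ids:
                candidates.append(item_id)
            if len(candidates) >= max_items:
                break
    return candidates


def make_feed(n_items):
    """Hypothetical feed stub: serves ids 0..n_items-1 in ranked order."""
    source = iter(range(n_items))

    def fetch_next(n):
        return list(itertools.islice(source, n))

    return fetch_next


if __name__ == "__main__":
    already_cached = set(range(20))  # pretend the first 20 items are cached
    picked = collect_candidates(make_feed(300), already_cached)
    print(len(picked), picked[0], picked[-1])
```

The resulting list would then be ranked by the primal effective gradient step and truncated to the 40 best items before branch and bound is run.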

4

D2D FRAMEWORK UNDER CELLULAR NETWORKS

In our system, which covers various situations of communication connections, mobile prefetching is designed to cache social media content in advance for future access to social networks; it therefore requires the availability of cellular service and direct connectivity to BSs during the prefetching process. However, UEs may experience a long period of no cellular service because of a disrupted network connection. To ensure effective mobile access to social media content in such situations, our system presents a D2D communication framework underlaying cellular networks, in which UEs are enabled to share and forward content cached from mobile prefetching to other UEs that have no cellular service. As shown in Fig. 1, the mobile prefetching module of our system has been extended to support two D2D communication modes, known as D2D connected transmission and D2D opportunistic transmission. The social media contents are shared across mobile users because of their common social interests reflected in the feeds (e.g. text, image, video) of the social applications in their UEs, and are forwarded along agile connections provided by the two D2D transmission modes. Meanwhile, an individual UE acting as a helper in D2D communication tends to show unwillingness when it is required to selflessly cache and forward social media content to other UEs. This type of selfishness is referred to as individual selfishness. In accordance with the two D2D transmission modes, individual selfishness can be divided into two components, namely connected selfishness and opportunistic selfishness, and this selfishness has a great impact on the social network access performance of both D2D transmission modes. In this section, we present our D2D communication framework underlaying cellular networks and analyze the impact of user selfishness in our D2D framework, to quantify the optimal performance achievable under different D2D transmission modes.

Fig. 4. Illustration of a dynamic graph model.

4.1

Time-varying Graph Model

If there are $b$ BSs labeled as $B = \{B_1, B_2, \cdots, B_b\}$, $h$ helpers labeled as $H = \{H_1, H_2, \cdots, H_h\}$ and $s$ subscribers labeled as $S = \{S_1, S_2, \cdots, S_s\}$, a static graph similar to Fig. 2 with $b + h + s$ nodes can be drawn for every time frame. We can use directed edges to represent the data flows between nodes in a time frame. Specifically, the edges of outgoing D2D opportunistic flows go from willing helpers to subscribers or to other helpers in a given time frame, or to themselves in the successive time frame, the latter representing the contents stored in the helpers' buffers. Similarly, the edges of D2D connected transmissions go from BSs via some voluntary helpers to subscribers, and the edges of cellular direct transmissions go directly from BSs to subscribers, all within the same time frame.

D2D opportunistic transmissions enable data flow across time frames and hence make it possible to model the time evolution of this time-varying system by linking the static graphs of different time frames. Let the entire time period be divided into $n$ time frames. We first generate $n$ graphs, one for each time frame, and then link them with directed edges to represent data flow in buffers across time frames. Fig. 4 includes all the possible transmission modes (cellular direct, D2D connected and D2D opportunistic transmissions), where BSs and UEs are represented by vertices. Directed edges between vertices represent the data flow of cellular direct transmission and/or D2D communication, and we attribute weights to the directed edges to represent the magnitudes of the data flows. Thus, each directed edge in the same row (time frame) in Fig. 4 takes one of the following forms: a black arrow for cellular direct transmission, a blue arrow for D2D connected transmission, and a red arrow for D2D opportunistic transmission. Each edge within a time frame is associated with a positive value representing the amount of data transmitted within this time frame, bounded by the product of the time-frame duration and the temporal transmission rate. A directed edge across time frames in Fig. 4 is bounded by the buffer size that the particular helper is willing to contribute and is therefore linked to node selfishness. For graphical conciseness, we omit the weights on all the directed edges in Fig. 4. It is worth emphasizing that this weighted connected dynamic graph explicitly takes user selfishness into account. To quantitatively reveal the impact of user selfishness in the two D2D modes, both the incoming flow and the outgoing flow of each node in each time frame are divided into two flows: one for D2D connected transmission and one for D2D opportunistic transmission.
• Connected incoming flow: the component of flow incoming to a node via D2D connected transmission. If the node is a helper, the flow is forwarded to another node immediately after reception and does not occupy the helper's buffer. The amount of data in this flow for helper $H_i$ in time frame $l$ is denoted $v_l(H_i)$.
• Opportunistic incoming flow: the component of flow incoming to a node via D2D opportunistic transmission. If the node is a helper, the flow is stored in the helper's buffer for opportunistic transmission at some later time frame. The amount of data in this flow for helper $H_i$ in time frame $l$ is denoted $w_l(H_i)$.
• Connected outgoing flow: the component of flow outgoing from a node that is received and forwarded immediately by a helper. This flow is not from the helper's buffer and is equal to the incoming flow at the node. The amount of data from helper $H_i$ to helper $H_j$ or to subscriber $S_j$ in time frame $l$ is denoted $x_l(H_i, H_j)$ or $x_l(H_i, S_j)$.
• Opportunistic outgoing flow: the component of flow outgoing from a helper that comes from the helper's buffer. The data in this kind of flow were received in a previous time frame and have been stored in the helper's buffer for one or more time frames. The amount of data from helper $H_i$ to helper $H_j$ or to subscriber $S_j$ in time frame $l$ is denoted $y_l(H_i, H_j)$ or $y_l(H_i, S_j)$.

Note that connected and opportunistic selfishness impact the connection states of each helper by imposing constraints on outgoing flows rather than on incoming flows.
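The construction of the time-varying graph can be sketched as a time-expanded edge list. The code below is a simplified illustration under assumptions made here: nodes are `(name, frame)` pairs, each within-frame link is given as a `(source, destination, kind, capacity)` tuple, and the cross-frame buffer edges are capped by a per-helper `buffer_limit`. None of these names come from the original system.

```python
def build_dynamic_graph(n_frames, helpers, frame_links, buffer_limit):
    """Return the edge list of a time-expanded graph.

    frame_links[l] : list of (src, dst, kind, capacity) links within frame l,
                     where kind is "cellular", "connected" or "opportunistic"
    buffer_limit[h]: buffer size helper h is willing to contribute
    Each edge is ((src, frame), (dst, frame), kind, capacity).
    """
    edges = []
    for frame in range(n_frames):
        # Edges inside one time frame (one row of the dynamic graph).
        for src, dst, kind, capacity in frame_links[frame]:
            edges.append(((src, frame), (dst, frame), kind, capacity))
        # Buffer edges link a helper to itself in the next frame.
        if frame + 1 < n_frames:
            for helper in helpers:
                edges.append(((helper, frame), (helper, frame + 1),
                              "buffer", buffer_limit[helper]))
    return edges


if __name__ == "__main__":
    links = [
        [("B1", "H1", "cellular", 10), ("B1", "S1", "cellular", 0)],
        [("H1", "S1", "opportunistic", 5)],
    ]
    for edge in build_dynamic_graph(2, ["H1"], links, {"H1": 4}):
        print(edge)
```

In this toy example, helper H1 receives content from B1 in frame 0, carries at most 4 units in its buffer across the frame boundary, and delivers to subscriber S1 in frame 1. A fully selfish helper would correspond to a buffer limit of zero.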

4.2

D2D Selfishness Modes

Mobile users in proximity can share common objects, in particular common updates from common feeds (text, images, video, etc.). When two users are subscribed to common entities, the news items generated by these entities can be uniquely identified, detected, and passed from user to user using D2D communications. The "small world" phenomenon, which is well studied both in social networks generally and in online social network sites such as Facebook, ensures that with high likelihood any two users will have at least some common associations (and thus common feed items); see, for example, Wilson et al. [38].

In the time-varying graph model, whether a connected outgoing flow or an opportunistic outgoing flow can be established depends on two factors: the physical access states between node pairs and the selfishness of UE senders. A very selfish helper may, for example, refuse to participate in some possible transmissions. Since the access states remain constant within each time frame, a Boolean matrix, referred to as the Connection Matrix, can describe the access states of UE pairs in each time frame. In this matrix, the row index of an element specifies a helper, the column index specifies a helper or subscriber, and the Boolean value indicates whether a connection between this node pair is allowed. Similarly, another Boolean matrix, referred to as the Cover Matrix, stores the access states between BSs and UEs. In this matrix, the row index of an element specifies a BS and the column index specifies a helper or subscriber. If we denote the Connection Matrix as $N$ and the Cover Matrix as $V$ for $b$ BSs, $h$ helpers and $s$ subscribers, then:

$$\mathrm{size}(N) = h \times (h + s), \qquad \mathrm{size}(V) = b \times (h + s). \tag{8}$$

Further denote the set of UEs as $UE = \{UE_1, UE_2, \cdots, UE_{h+s}\} = H \cup S$. Then, for example, if the $(i, j)$-th element of $N$ is $N_{i,j} = \text{TRUE}$, helper $H_i$ is able to establish a D2D connection with $UE_j$, where $1 \le i \le h$ and $1 \le j \le h + s$. Conversely, if the $(i, j)$-th element of $V$ is $V_{i,j} = \text{FALSE}$, BS $B_i$ is unable to transmit data to $UE_j$, where $1 \le i \le b$ and $1 \le j \le h + s$.

As mentioned previously, a helper may be unwilling to participate in D2D communication because of its connected selfishness and/or opportunistic selfishness. To model this unwillingness, we let each helper randomly forbid any potential D2D connected or opportunistic transmission that it could establish. For possible D2D connected transmissions, the forbiddance occurs with a probability $p$, which we refer to as the "connected selfishness probability", while for possible D2D opportunistic transmissions, the forbidding probability $q$ is referred to as the "opportunistic selfishness probability". Two random Boolean matrices, referred to as the Connected Selfishness Matrix $R$ and the Opportunistic Selfishness Matrix $G$, can then be generated for each time frame to store the random forbiddance for the two respective D2D modes. Clearly,

$$\mathrm{size}(R) = \mathrm{size}(G) = h \times (h + s). \tag{9}$$

More specifically, let $p_{i,j}$ be the connected selfishness probability of helper $H_i$ for $UE_j$. Then the $(i, j)$-th element $R_{i,j}$ of $R$ represents whether the connected D2D transmission mode from $H_i$ to $UE_j$ is allowed, namely,

$$\Pr\{R_{i,j} = \text{FALSE}\} = p_{i,j}, \qquad \Pr\{R_{i,j} = \text{TRUE}\} = 1 - p_{i,j}, \qquad 1 \le i \le h,\ 1 \le j \le h + s. \tag{10}$$

Similarly, let $q_{i,j}$ be the opportunistic selfishness probability of $H_i$ for $UE_j$. Then the $(i, j)$-th element $G_{i,j}$ of $G$ represents whether the opportunistic D2D transmission mode from $H_i$ to $UE_j$ is allowed, namely,

$$\Pr\{G_{i,j} = \text{FALSE}\} = q_{i,j}, \qquad \Pr\{G_{i,j} = \text{TRUE}\} = 1 - q_{i,j}, \qquad 1 \le i \le h,\ 1 \le j \le h + s. \tag{11}$$


By combining the Connection Matrix with the Connected Selfishness Matrix and the Opportunistic Selfishness Matrix, respectively, we obtain two new matrices $M$ and $W$ according to

$$M_{i,j} = N_{i,j} \,\&\, R_{i,j}, \qquad 1 \le i \le h,\ 1 \le j \le h + s, \tag{12}$$

$$W_{i,j} = N_{i,j} \,\&\, G_{i,j}, \qquad 1 \le i \le h,\ 1 \le j \le h + s, \tag{13}$$

which explicitly model the selfishness in the D2D communication underlaying system. In our proposed graph model, whether a D2D connected connection can be established in a given time frame is determined by the matrix M of that time frame, while permission for a D2D opportunistic connection depends on the matrix W. Note that the constraints of the Connection Matrix are imposed on output flows rather than input flows.
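The construction of Eqs. (10)-(13) for a single time frame can be illustrated with a minimal Python sketch. The function names (`sample_allowed`, `combine`), the toy sizes, and the probability values are hypothetical choices made only for illustration; they are not taken from our implementation.

```python
import random

def sample_allowed(prob_forbid, rng):
    """Draw a Boolean matrix whose (i, j) entry is False with
    probability prob_forbid[i][j], as in Eqs. (10)-(11)."""
    return [[rng.random() >= p for p in row] for row in prob_forbid]

def combine(connection, selfish):
    """Element-wise AND of two Boolean matrices, as in Eqs. (12)-(13)."""
    return [[n and r for n, r in zip(n_row, r_row)]
            for n_row, r_row in zip(connection, selfish)]

# Toy frame with h = 2 helpers and s = 1 subscriber (assumed values).
h, s = 2, 1
rng = random.Random(0)

# Connection Matrix N: rows are helpers, columns are all h + s UEs.
N = [[False, True, True],
     [True, False, False]]

# Helper 1 never refuses connected mode, helper 2 always refuses it;
# the opposite holds for opportunistic mode (extreme values on purpose).
p = [[0.0] * (h + s), [1.0] * (h + s)]
q = [[1.0] * (h + s), [0.0] * (h + s)]

R = sample_allowed(p, rng)   # Connected Selfishness Matrix
G = sample_allowed(q, rng)   # Opportunistic Selfishness Matrix
M = combine(N, R)            # connected links that are both possible and accepted
W = combine(N, G)            # opportunistic links that are both possible and accepted
```

With intermediate probabilities the matrices R and G change from frame to frame, so M and W would be regenerated once per time frame.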

4.3 D2D Optimization

In time frame $l$, the incoming flow of helper $H_i$ via its own buffer from the previous time frame $l - 1$ is denoted as $a_l(H_i)$, while the outgoing flow of $H_i$ via its own buffer to the successive time frame $l + 1$ is denoted as $b_l(H_i)$. Before the first time frame, namely at time frame $l = 0$, the buffer of each helper is empty, i.e. $a_0(H_i) = 0, \forall i$. Let the total amount of data received by all the subscribers in the time period considered be $D$. We take maximizing $D$ as the objective and formulate the optimization problem as follows:

$$
\begin{aligned}
\max \quad & \sum_{l} \Big[ \sum_{i,j} \big( x_l(H_i, S_j) + y_l(H_i, S_j) \big) + \sum_{k,j} c_l(B_k, S_j) \Big] \\
\text{s.t.} \quad & y_l(H_i, H_j) \le a_l(H_i), \quad \forall i, j, l, \\
& y_l(H_i, S_j) \le a_l(H_i), \quad \forall i, j, l, \\
& w_l(H_i) + a_l(H_i) \ge b_l(H_i), \quad \forall i, l, \\
& v_l(H_i) = \sum_{j} x_l(H_i, H_j) + \sum_{k} x_l(H_i, S_k), \quad \forall i, l, \\
& \sum_{j} \big( y_l(H_j, H_i) + x_l(H_j, H_i) \big) + \sum_{k} c_l(B_k, H_i) = w_l(H_i) + v_l(H_i), \quad \forall i, l,
\end{aligned}
\tag{14}
$$

where $c_l(B_k, S_j)$ denotes the amount of data transmitted via cellular direct transmission from BS $B_k$ to subscriber $S_j$ in time frame $l$. In addition, as described in Section 4.1, $x_l(H_i, H_j)$ and $x_l(H_i, S_j)$ denote the amount of data transmitted through the outgoing flow via D2D connected transmission from helper $H_i$ to helper $H_j$ and to subscriber $S_j$ in time frame $l$, respectively. $y_l(H_i, H_j)$ and $y_l(H_i, S_j)$ denote the amount of data transmitted through the outgoing flow via D2D opportunistic transmission from helper $H_i$ to helper $H_j$ and to subscriber $S_j$ in time frame $l$, respectively. $v_l(H_i)$ denotes the amount of data transmitted through the incoming flow via D2D connected transmission for helper $H_i$ in time frame $l$, and $w_l(H_i)$ denotes the amount of data transmitted through the incoming flow via D2D opportunistic transmission for helper $H_i$ in time frame $l$.

The transmission rate between each node pair must meet the resource constraint, and the total amount of data transmitted on each edge is bounded by the product of the transmission rate and the time-frame duration. Moreover, considering connection states and user selfishness, the transmitted content flows must be strictly confined to the connected and willing UEs in each time frame, as described by the matrices $V$, $M$ and $W$ introduced in Section 4.2. By adding these constraints to the optimization problem in Eq. 14, we obtain the complete constrained maximization problem, whose decision variables include all the data flows, such as the weights of all the directed edges in Fig. 4. However, not all the associated constraints are linear, so this constrained optimization problem is not a linear program. Fortunately, we can use the reformulation-linearization technique (RLT) [14] to transform the nonlinear constraints into linear expressions and, consequently, use existing optimization toolkits such as CPLEX [3] to solve the problem. Solving it yields the total amount of data that can be transmitted successfully, namely the maximized objective value. We can then derive the total achievable transmission rate by dividing this data amount by the duration.

Remark: The time-varying graph model and the formulated optimization framework allow us to understand the impact of user selfishness on the performance of D2D communications and to quantify the optimal system performance achievable under different levels of user selfishness. Both are vital for defining effective and workable D2D communication protocols in our system to enhance mobile access to online social networks under disrupted network scenarios.
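To make the per-helper constraints of Eq. 14 concrete, the following minimal Python sketch checks whether a candidate set of flows for one helper in one time frame satisfies them. It is only a feasibility check, not the RLT/CPLEX solution procedure; the function name, argument layout, and numerical tolerance are assumptions introduced for illustration.

```python
def helper_flows_feasible(a, b, w, v, x_out, y_out, x_in, y_in, c_in, tol=1e-9):
    """Check the Eq. (14) constraints for one helper H_i in one time frame l.

    a, b  : buffer inflow a_l(H_i) and buffer outflow b_l(H_i)
    w, v  : incoming opportunistic / connected amounts w_l(H_i), v_l(H_i)
    x_out : outgoing connected flows x_l(H_i, .) to helpers and subscribers
    y_out : outgoing opportunistic flows y_l(H_i, .)
    x_in, y_in : incoming D2D flows x_l(., H_i) and y_l(., H_i)
    c_in  : incoming cellular flows c_l(B_k, H_i)
    """
    # Each opportunistic outgoing flow is bounded by the buffered data.
    buffer_limits = all(y <= a + tol for y in y_out)
    # Data carried to the next frame cannot exceed buffered plus newly stored data.
    buffer_carry = w + a + tol >= b
    # Connected-mode data is relayed within the same frame.
    relay_balance = abs(v - sum(x_out)) <= tol
    # Everything received is either stored (w) or relayed (v).
    inflow_balance = abs(sum(y_in) + sum(x_in) + sum(c_in) - (w + v)) <= tol
    return buffer_limits and buffer_carry and relay_balance and inflow_balance

# Hypothetical flow values (in Mb) for a single helper and frame.
ok = helper_flows_feasible(a=2.0, b=3.0, w=1.5, v=1.0,
                           x_out=[0.4, 0.6], y_out=[0.5, 2.0],
                           x_in=[1.0], y_in=[0.5], c_in=[1.0])
too_much = helper_flows_feasible(a=2.0, b=3.0, w=1.5, v=1.0,
                                 x_out=[0.4, 0.6], y_out=[2.5],
                                 x_in=[1.0], y_in=[0.5], c_in=[1.0])
```

A check of this kind can be useful for validating a solver's output, since the rate and selfishness constraints described above would still have to be verified separately.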

5 PERFORMANCE EVALUATION

As a holistic mobile solution for enhancing opportunistic networking and context-aware access to social media content, our system has been evaluated extensively through both realistic tests and trace analysis.

5.1 Mobile Client Application

We have implemented Android mobile app prototypes for different scenarios with intermittent cellular service. The DeepOpp app, shown in Fig. 5 (a), is designed for an underground metro system that usually lacks cellular coverage. The MetrOpp app, shown in Fig. 5 (b), is designed for urban areas that usually face infrastructure (mobile access) overload. Both apps support prefetching, caching, D2D transmission, and display of social media content from social networks under intermittent connectivity. Specifically, as shown in Fig. 5, DeepOpp can obtain the metadata and download the ranked newsfeed content from Facebook, and then display the user's posted statuses, links, photos, and videos. Users can also see the number of "likes" a post has received and the individual comments on a post. The content is prefetched and cached while there is a usable mobile signal and can then be accessed offline when no mobile signal is available, as indicated by the zero-bar cellular signal indicator in Fig. 5 (a). DeepOpp users can use the refresh icon to manually update the displayed content. MetrOpp performs similar prefetching and caching of content: as shown in Fig. 5 (b), it provides users with a news feed of live football match commentary downloaded from BBC Sport's website. The prefetched stream of match commentary can be displayed to the user after the cellular signal is lost, and disseminated to other mobile users via WiFi Direct assisted D2D communication (see the full-bar WiFi signal indicator in Fig. 5 (b)).

Our mobile apps also provide controls that the user of a UE can manage. Fig. 5 (c) shows the UI for changing the system settings. In the general settings, the user can opt to enable the optimizer, download over 3G (which requires a data plan), and choose content to download.
Once the optimizer is selected, the user can control specific optimizer thresholds for power, storage, and data plan. The "Content to Download" UI, shown in Fig. 5 (d), allows users to select the preferred kinds of media content to download. Since text, image and video are the three major types of media content in social networks, users can change their preference to cache some or all of these content types during mobile prefetching. Because video usually consumes far more computing and communication resources than text and image during mobile prefetching, these content options give users a friendly control, in addition to the optimization settings, for balancing their anticipated online social network experience against the UE's real-time status (e.g. battery, storage, residual data plan).

Fig. 5. The mobile client applications: (a) DeepOpp app; (b) MetrOpp app; (c) User settings; (d) Content selection.

We have tested DeepOpp on the London Underground (simply the Underground), which consists of 11 train lines and 403 km of track covering Greater London; seven of these lines are deep-level lines with no cellular signal, and the remaining four are sub-surface lines with intermittent cellular service. We also tested MetrOpp in the area of Oxford Street, a major road in the West End of London that is considered Europe's busiest shopping street; its roughly half a million daily visitors usually overload the access infrastructure and cause poor cellular service. We present these evaluations in the following sections.

5.2 Evaluation of Mobile Prefetching

The performance of mobile prefetching was evaluated in real-world experiments using the DeepOpp app on the London Underground. About eleven hours' worth of data was collected at various times by 20 crowd workers riding the Circle Line between South Kensington and King's Cross stations. The significant amount of time spent traveling on the Underground gives us confidence in our results and allows us to break down readings by time of day. Each trial consists of a round trip. Four trials were conducted during the morning rush hour and two during the evening rush hour. During these trials, the train carriages ranged from busy to very packed and near capacity. Another two trials were carried out during the afternoon, when the carriages were generally quite empty with seats available. The mobile signal data collected earlier in a specific time window can be used as crowdsensed results for future prediction and optimization.

We first test the crowdsensing performance of the 20 workers in a real-time scenario during the eleven hours for on-demand signal search. The values of mobile signal strength (ASU values) at different underground locations are the items that these workers answer. Assume each worker answers $\ell$ items (we refer to $\ell$ as the budget), including $k$ control items (whose ASU values are known) and $\ell - k$ target items (whose ASU values are unknown). For each experiment, we first construct several sets of target and control items by randomly partitioning the items, and then randomly assign each worker $k$ control items and $\ell - k$ target items, for varying values of $\ell$ and $k$.

Fig. 6. Results of crowdsensing on real dataset: (a) MSE of signal strength vs. k (number of control items), with curves for budget values 7, 10, 15 and 25; (b) Optimal k vs. budget; (c) Optimal k vs. n_t (number of target items). Legend entries: Empirical, Optimal.

Fig. 6 shows the results of crowdsensing with the joint estimator and Gaussian model on our real dataset. Fig. 6(a) shows the signal strength MSE (mean squared error) of the joint estimator with varying $k$. The stars and circles denote the empirically and theoretically optimal $k$, respectively. Fig. 6(b) shows the optimal $k$ as the budget $\ell$ varies. In both figures, we fix $n_t = 200$, where $n_t$ is the total number of target items; $n_t$ is large because we want to probe the signal at many locations along the Underground lines, and the signal availability at each location varies. Fig. 6(c) shows the optimal $k$ with varying $n_t$ and fixed $\ell = 50$. According to our proof in Section 2.3, the optimal number of control items $k$ that minimizes the expected MSE should scale as $O(\ell / \sqrt{n_t})$ when using joint estimators. Fig. 6 shows that our prediction of the optimal $k$ matches the empirical results closely on the real dataset. Therefore, choosing a number of control items near the optimal $k$ supports the reliability analysis needed to aggregate workers' answers, while leaving each worker free to answer more target items (i.e. collect more mobile signal data) along an underground route.

We next evaluate DeepOpp on mobile access performance and compare it with other state-of-the-art schemes. The access scheduler defines how often and under which conditions the UE fetches new data to store in the cache. A basic scheduler such as O2SM [48] runs on a fixed interval and makes a fetch at every interval. A one-dimensional scheduler such as EarlyBird [37] also runs on a fixed-interval schedule but makes a fetch only if the current signal strength is above a set threshold. DeepOpp employs a different kind of scheduler, which uses the spatiotemporal signal availability crowdsensed by workers to reschedule fetches to locations known to have usable signal strength; it determines the current location and estimates future movements to reschedule at the next location whose signal is higher than a given threshold. Note that when there is no GPS signal on the Underground, DeepOpp can use the unique MAC address of station WiFi and the corresponding WiFi-to-station mapping to localize the mobile user.

Fig. 7. Performance evaluation of scheduler in mobile access: (a) Percentage of successful requests; (b) Power used per metadata requests; (c) Power used per media items requested; (d) Power used per byte downloaded.
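The difference between the three scheduling policies compared in this section can be summarized in a minimal Python sketch. The function names, the stop labels, and the ASU values below are hypothetical and serve only to contrast the decision rules; they do not reproduce the actual O2SM, EarlyBird, or DeepOpp implementations.

```python
def fixed_interval_fetch(t, interval):
    """Basic scheduler: fetch every `interval` seconds, regardless of signal."""
    return t % interval == 0

def threshold_fetch(t, interval, current_asu, min_asu):
    """One-dimensional scheduler: fetch on the fixed schedule only if the
    signal measured right now reaches the threshold."""
    return t % interval == 0 and current_asu >= min_asu

def next_fetch_location(route, position, signal_map, min_asu):
    """Location-aware scheduler: return the next stop on the route (from the
    current position onward) whose crowdsensed ASU reaches the threshold,
    or None if no such stop exists."""
    for stop in route[position:]:
        if signal_map.get(stop, 0) >= min_asu:
            return stop
    return None

# Hypothetical route and crowdsensed signal map (ASU per stop).
route = ["stop_A", "stop_B", "stop_C", "stop_D"]
signal_map = {"stop_A": 12, "stop_B": 0, "stop_C": 3, "stop_D": 15}

# At stop_B (index 1) the location-aware rule defers the fetch to stop_D,
# whereas the fixed-interval rule would attempt it as soon as the timer fires.
planned = next_fetch_location(route, 1, signal_map, min_asu=10)
```

The first two rules can issue requests where no usable signal exists, which is consistent with the failed requests and wasted power reported for them below.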

Fig. 7 (a) shows the average percentage of successful requests and its standard deviation. In our experiments on the same section of the Circle Line, O2SM and EarlyBird made significantly more requests than DeepOpp (O2SM provided over 30 times as many successful retrievals of metadata). However, a larger proportion of their requests failed, as reported by network errors from the social networks, because the basic and one-dimensional schedulers have no knowledge of the intermittent cellular service on the Underground. Therefore, while DeepOpp with crowdsensed information had an average success rate of 50%, O2SM and EarlyBird had rates of only around 25.63% and 36.27%, respectively.

The amount of power consumed per successful request, displayed in Fig. 7 (b), shows that DeepOpp needed 2.58 times less power for each successful data request than O2SM. This is due to the power wasted by O2SM on failed request attempts. The large number of requests made by EarlyBird also consumed much more power than DeepOpp. Because DeepOpp combines cellular direct transmission with additional crowdsensed data gathered as part of its prefetching optimization, it consumed much less power than O2SM and EarlyBird.

Fig. 7 (c) shows the total amount of power used by each application divided by the number of media items prefetched. DeepOpp was 2.51 times more efficient at getting media items successfully than O2SM. This is similar to the result in Fig. 7 (b), as most news-feed items from social networks contain some media items. In our trials, each metadata request contained ten feed items, which is why the power for each media item is approximately a tenth of that for each metadata request.

Fig. 7 (d) compares the data download efficiency of the three schedulers relative to the power they used: it shows the total power used divided by the total bytes downloaded over the course of the experiment. DeepOpp uses less power per byte downloaded than the other applications. O2SM, with its fixed-interval scheduler, may begin downloading at points with little signal strength and then fail because of the low signal strength at that location. Each time this happens, the data downloaded is wasted but still counts against the user's data allowance and consumes power. The overheads of these failed requests cause higher power usage. Each of the metrics in Fig. 7 shows that DeepOpp offers a significant improvement over existing solutions: on the Underground, O2SM and EarlyBird are much more likely than DeepOpp to waste data and power on failed requests.

Fig. 8. Performance evaluation of optimization solution: (a) Comparison of memory consumption; (b) Comparison of power consumption.

The 0-1 MKP presented in our paper is a classic NP-hard optimization problem and as such has several well-studied solution methods. To evaluate how our system solves the prefetching and caching problem, especially with respect to memory cost, we compared DeepOpp with a dynamic programming (DP) solution [29], a classical solution method for the 0-1 MKP. Fig. 8 compares the branch and bound solution with heuristic design adopted in DeepOpp against the DP solution. The average performance and its standard deviation, presented in Fig. 8 (a) and (b), show that DeepOpp saves 94% of memory consumption and 59% of power consumption compared with DP. The classical DP approach can easily be adapted to a problem with two "weight" budgets, memory and energy in our case. However, memory is byte addressable and residual energy is probed at millijoule accuracy in Android, so the memory and energy budgets are represented by very large numbers, which makes a direct application of the DP technique impractical. DeepOpp employs a metadata-based heuristic to order and collect news items for the branch and bound approach, and can therefore solve the 0-1 MKP with less memory and energy cost but no loss of accuracy.
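To illustrate why branch and bound avoids the budget-sized tables that DP needs, the following is a minimal, generic Python sketch of branch and bound for a 0-1 knapsack with two budgets (memory and energy). It is not the DeepOpp solver: the value-density ordering stands in for the metadata-based heuristic, the bound is a deliberately simple one, and the item values and costs are hypothetical.

```python
def knapsack_2d_branch_and_bound(items, mem_budget, energy_budget):
    """Solve a 0-1 knapsack with two budgets by depth-first branch and bound.

    items: list of (value, mem_cost, energy_cost); budgets must be positive.
    Returns (best_value, sorted list of chosen item indices).
    """
    # Assumed ordering heuristic: value per unit of normalized resource use.
    def density(item):
        value, mem, energy = item
        return value / (mem / mem_budget + energy / energy_budget + 1e-12)

    order = sorted(range(len(items)), key=lambda i: density(items[i]), reverse=True)

    # suffix_value[d]: total value of all items from depth d onward,
    # an optimistic bound that ignores the budgets.
    suffix_value = [0.0] * (len(order) + 1)
    for d in range(len(order) - 1, -1, -1):
        suffix_value[d] = suffix_value[d + 1] + items[order[d]][0]

    best = {"value": 0.0, "chosen": []}

    def search(depth, value, mem_left, energy_left, chosen):
        if value > best["value"]:
            best["value"], best["chosen"] = value, sorted(chosen)
        if depth == len(order):
            return
        # Prune: even taking every remaining item cannot beat the incumbent.
        if value + suffix_value[depth] <= best["value"]:
            return
        idx = order[depth]
        v, m, e = items[idx]
        if m <= mem_left and e <= energy_left:      # branch 1: take the item
            chosen.append(idx)
            search(depth + 1, value + v, mem_left - m, energy_left - e, chosen)
            chosen.pop()
        search(depth + 1, value, mem_left, energy_left, chosen)  # branch 2: skip it

    search(0, 0.0, mem_budget, energy_budget, [])
    return best["value"], best["chosen"]

# Hypothetical news items: (utility, memory cost, energy cost).
news_items = [(10, 5, 4), (6, 4, 3), (7, 3, 5), (4, 2, 1)]
best_value, chosen = knapsack_2d_branch_and_bound(news_items, mem_budget=8, energy_budget=8)
```

The working memory of this search grows with the number of items (recursion depth and the current selection) rather than with the numeric size of the budgets, whereas a DP table over two budgets expressed in bytes and millijoules would need one entry per budget combination.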

5.3 Evaluation of D2D Communication

The performance of D2D communication underlaying cellular networks is first evaluated using our MetrOpp app in Oxford Street. We choose an area of Oxford Street with a radius of 100 meters for our realistic tests. In our tests, 2 UEs are subscribers and the other UEs are helpers. Because helpers are mobile, a helper can use D2D connected transmission to forward the social content to subscribers or other helpers if it is away from the core area of Oxford Street, since cellular service is still available in the boundary zone. Otherwise, the helper employs mobile prefetching and D2D opportunistic transmission with epidemic routing. The tests consist of 10 runs, each lasting around 5 minutes.

Fig. 9. Performance evaluation of MetrOpp in D2D transmission: (a) Percentage of data delivered; (b) Latency for delivered data.

MetrOpp is designed to provide a news feed of football matches. The results and actions of a match are highly time sensitive, since fans want to stay as up to date as possible and new commentary comes in every minute. Therefore, two performance metrics, the percentage of data delivered and the latency of delivered data, are evaluated for the dissemination of live match commentary in the D2D scenario, as shown in Fig. 9 (a) and (b), respectively. When the number of UEs increases from 5 to 20, the average data delivery ratio rises and the average delivery latency falls, because a higher helper density in a fixed area means more contact opportunities for D2D transmissions. We also observe that the D2D link established by WiFi Direct in MetrOpp can be affected by the random movement of visitors in Oxford Street, which explains the performance deviations in both figures. However, the overall performance is still tolerable for accessing match commentary when more than 10 helpers are involved in an area with intermittent cellular service.

To further evaluate our D2D framework underlaying cellular networks and the impact of selfishness within the framework, we need to test a larger-scale scenario. Given our limited testing time and manpower, we adopt the Cambridge trace [19] for this evaluation. We use the method of [9] to compute the individual contact rates of node pairs from the average statistics of the trace, and then average the contact rates of users to run the simulations. Each simulation scenario contains 35 realistic human mobility traces. The total bandwidth is 20 MHz; 80% of the cellular resources are allocated to the BS for transmitting data to UEs, and the other 20% are used for D2D communications between UEs.
We represent the two types of individual selfishness by two probabilities, the individual connected selfishness probability (ICSP) and the individual opportunistic selfishness probability (IOSP), which reflect helpers' unwillingness to cooperate in the two D2D modes, respectively. For example, if the ICSP is $c$ ($0 \le c \le 1$), then $c$ is the probability that a helper refuses to cooperate with a UE within its D2D communication range in D2D connected communication. Fig. 10 depicts the trend of the total data transmission rate of all subscribers under the two kinds of selfishness. The result shows that a slight decrease in IOSP leads to a large increase in the data transmission rate, while ICSP has less influence on the system performance. The explanation is as follows: a helper with popular content in its storage may transmit it during every D2D contact and, consequently, can take advantage of physical proximity more efficiently. By contrast, D2D connected transmission participants have to rely on the data flow sent by the BS, and the same content may be downloaded frequently.

Fig. 10. Impact of individual selfishness on total data transmission rate. Axes: Connected Selfishness, Opportunistic Selfishness, Total Data Transmission Rate (Mbps).



Fig. 11. Interaction between individual selfishness modes: (a) Individual connected selfishness (x-axis: Individual Connected Selfishness; curves ordered by incremental opportunistic selfishness); (b) Individual opportunistic selfishness (x-axis: Individual Opportunistic Selfishness; curves ordered by incremental connected selfishness).

Intuitively, when ICSP is relatively low, the cellular resources left unoccupied by users' refusal to participate in D2D opportunistic transmission can be redistributed to D2D connected communications. We present the total data transmission rates under different selfishness metrics in Fig. 11, where each panel contains 20 lines, with IOSP ranging from 0 to 0.95 in steps of 0.05 in Fig. 11 (a), and ICSP ranging from 0 to 0.95 in steps of 0.05 in Fig. 11 (b). Not surprisingly, Fig. 11 (a) and (b) indicate that the impacts of D2D connected and D2D opportunistic selfishness are not independent. Denote the ICSP as $c$ and the IOSP as $o$. Further denote the total data transmission rate as $d = f(c, o)$, and its partial derivatives with respect to $c$ and $o$ as $p$ and $q$, respectively. We have $p(c, o) = \frac{\partial f}{\partial c}(c, o) < 0$ and $q(c, o) = \frac{\partial f}{\partial o}(c, o) < 0$. Both $p$ and $q$ are approximately monotonically decreasing functions of $c$ and $o$, defined on the interval $[0, 1]$. Thus an increase in ICSP or IOSP by a given increment results in an obvious drop in the total data transmission rate. Therefore, if users are able to refuse D2D communication requests freely, the D2D communication system is likely to perform poorly because of high D2D opportunistic selfishness.

Fortunately, there are possible solutions to this problem. If D2D connected transmission is compulsory in the protocol, the results reduce to the highest (blue) curve in Fig. 11 (b), which shows that the lost data transmission rate can be regained to some extent, from around 600 Mbps to around 700 Mbps at high IOSP. Ensuring alternative D2D connected communication choices when D2D opportunistic transmissions are frequently rejected makes the D2D communication underlaying cellular network resilient to high D2D opportunistic selfishness and avoids forcing users to devote their storage to D2D. Another possible solution is to set a minimum D2D-request-acceptance ratio and a reasonable buffer reservation, which constrains the system to the top left (the crimson part) of Fig. 10 and therefore guarantees satisfactory performance.

Fig. 12. Rate and buffer limits in individual selfishness: (a) Total data transmission rate; (b) D2D data transmission rate (Mbps). Axes in both panels: Rate Unselfishness Factor, Buffer Unselfishness Upper-Bound (Mb).

Apart from the above-mentioned ICSP and IOSP, individual selfishness also manifests in the power and buffer limits offered to D2D transmission. To quantify the impact of these two kinds of user selfishness, we use two parameters, the rate unselfishness factor (RUSF) and the buffer unselfishness upper-bound (BUSU), to measure how willing users are to contribute power and storage to D2D communications. Specifically, RUSF is the ratio of the allowed D2D communication rate to the maximum available rate, which affects both D2D communication modes, while BUSU is the upper bound on the buffer allocated by helpers, which only constrains D2D opportunistic transmission. Therefore, we can set $p_{i,j} = 0$ and $q_{i,j} = 0$ and instead use the RUSF and BUSU to represent the constraints imposed by individual selfishness. After performing the Cambridge trace-driven simulation study under the same practical network settings, we obtain the empirical results for the total data and the total D2D data received per second, depicted in Fig. 12 (a) and (b), respectively, as functions of RUSF and BUSU.

Fig. 12 shows that as the RUSF decreases, the achievable system rate drops significantly. Reducing the power or uploading rate limit, which is a form of user selfishness, can severely decrease the data transmitted via D2D communications. On the other hand, enlarging the buffer brings significant performance improvements when the buffer is small but has little impact when it is already sufficiently large, i.e. the system performance tends to saturate as the buffer size increases. More specifically, for the simulated system, an increase in the buffer size of helpers has little influence on the achievable system performance once it is larger than 70 Mb. Therefore, the results of Fig. 12 suggest that, to fully develop the potential of D2D underlaying cellular networks, service providers should discourage users from setting an uploading rate limit for D2D communications.

5.4 User Feedback and Potential Limitations

Discussions were held with some of the test users of the DeepOpp and MetrOpp applications. The main users said that they enabled the application for about 70-80% of the time they spent under intermittent and disrupted networks. All found the operation process intuitive and straightforward. Testers understood the purpose of the applications, understood that their cached social media data from mobile prefetching would be shared with other mobile users with similar social interests, and agreed to serve as helpers that store, carry and forward data during D2D transmission, so that they can enjoy the same service in return when there is no cellular service in their surroundings. The ability to obtain accurate estimations of cellular signal strength crowdsensed by reliable workers, and thus to run effective mobile prefetching, was well received by mobile users and likely contributed to more usage. Two-thirds (nearly 70%) of the users in our study were satisfied with their experience and felt that our applications built around mobile crowdsensing, smart caching and agile D2D communication are important and useful.

Other users were not satisfied with our applications, mainly for two reasons: 1) extra power expenditure caused by redundant runs of mobile prefetching when the cached social media content is already up to date; 2) running out of interesting social media items to browse when the UE is in an out-of-cache situation. The first issue cannot be totally avoided, since by its default schedule setting the mobile prefetching should attempt to update social media content under direct cellular coverage. It could be mitigated by adjusting the scheduled period at the user's request. The second issue can be addressed by adjusting system parameters specified by the user. For example, the user can choose the content to download and set the storage threshold, as shown in Fig. 5 (c).
When prompted further, users stated that they would have preferred news items they had already interacted with to disappear from the cached feed. However, to our knowledge, this is not a feature found on many social media sites, including Facebook. We noticed that the latency of delivered data for D2D communication, as shown in Fig. 9 (b), was about 40-50 seconds when there were 10 UEs including helpers and subscribers. Although this latency seems long, our test was performed in the area of Oxford Street, which has around half a million daily visitors; our users understood that with a limited number of helpers it is hard to establish either D2D connected transmission or D2D opportunistic transmission, and that weak connectivity in this situation is easily disturbed by the surrounding crowd. The latency dropped to around 10 seconds when there were 20 helpers, which is a feasible number of helpers to gather given the high population density; as a result, most of our users were satisfied. A user may choose to go to a Wi-Fi spot in the Oxford Street area for faster access to social networks. However, in our experience, most of the open or free urban Wi-Fi spots are not user-friendly. To access urban Wi-Fi, mobile users first need to go through a registration process and share private information such as their name, email address and phone number. This information is also sensitive for personal social network accounts, so mobile users usually hesitate to use urban Wi-Fi spots to access social networks.

Surprisingly, the delay of mobile prefetching was not a major problem for our users, as caching was activated automatically; once mobile prefetching has finished, users can browse the cached social media content immediately or at any later time. Lengthy caching intervals occurred naturally along our users’ daily routes, and in most cases neither the delays nor the broken transmissions were noticed. Because no user intervention was required (although the app offers an option for manual caching), users were not inconvenienced by lengthy caching intervals. We also noticed that the average successful request ratio of mobile prefetching on the Underground is around 50%, which would not be a good result in an over-ground scenario. However, our users found this performance acceptable once they learned that it was mainly caused by fast-moving trains in an Underground environment that has barely any usable cellular service, especially on deep lines. For mobile users in such harsh conditions, obtaining useful social media content matters more than the successful request ratio; they can therefore tolerate the small extra latency caused by lost requests and usually have the patience to wait while riding the Underground. Because the environmental interference cannot be avoided and our solution already performs better than other state-of-the-art approaches, our test users still considered the average successful request ratio on the Underground satisfactory.

At its current stage, our system does not handle network disconnections and connection interruptions very gracefully.
Pieces of social media content that are not fully loaded by the time the wireless connection breaks are lost and must be downloaded again in the next wireless communication area. Solving this issue would be especially useful for the unstable peer-to-peer connections that occur between devices in a crowded environment. Another limitation is that users may not be good judges of how many resources to dedicate to caching; they can experience "buyer’s remorse" after consuming the resources they intended to expend on the application. In addition, as devices are increasingly organized as a social community, cooperative communication among devices is required, for example using game-theoretic approaches, to leverage the different states of devices and their available resources when sharing and forwarding social media content. Other limitations of the current work include the need for a better incentive scheme for crowdsensing, smarter spatio-temporal decisions for mobile prefetching, and robust protection of user privacy during D2D communication. These will be studied in our future work.
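The first limitation in this paragraph, losing partially loaded content when a connection breaks, is commonly mitigated by resumable, chunked transfers. The following is a minimal sketch of that general idea and not part of the presented system; `ResumableDownload`, `ConnectionLost` and the `fetch_range(item_id, start, end)` transport callback are hypothetical names, and the sketch assumes the sender (a server or a D2D helper) can serve an arbitrary byte range of an item.

```python
class ConnectionLost(Exception):
    """Raised by a transport when the wireless link drops mid-transfer."""


class ResumableDownload:
    """Keeps the bytes received so far so a transfer can continue later."""

    def __init__(self, item_id, total_size, chunk_size=4096):
        self.item_id = item_id
        self.total_size = total_size
        self.chunk_size = chunk_size
        self.buffer = bytearray()  # survives across connection attempts

    @property
    def complete(self):
        return len(self.buffer) >= self.total_size

    def resume(self, fetch_range):
        """Fetch the remaining chunks through fetch_range.

        fetch_range(item_id, start, end) must return the bytes in
        [start, end). Returns True once the item is complete and False
        if the link dropped or the sender returned no data.
        """
        while not self.complete:
            start = len(self.buffer)
            end = min(start + self.chunk_size, self.total_size)
            try:
                chunk = fetch_range(self.item_id, start, end)
            except ConnectionLost:
                return False  # keep the partial data for the next contact
            if not chunk:
                return False  # sender had nothing to offer; retry later
            self.buffer.extend(chunk)
        return True
```

With such a scheme, a later contact in the next coverage area or with another helper would call `resume` again and transfer only the missing bytes, rather than restarting the item from the beginning.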

6 CONCLUSIONS

In this paper, we presented the design of a context-aware mobile system that enables efficient offline access to online social networks in situations where signal coverage is intermittent. The system takes advantage of crowdsensed signal variation at different locations to prefetch and cache data. It also supports D2D communication underlaying cellular networks, with a framework to analyze the impact of user selfishness in the process. We proposed opportunistic optimization schemes to enhance mobile crowdsensing, prefetching and D2D transmission, and showed a significant improvement over existing approaches. To the best of our knowledge, this is the first work to integrate mobile prefetching and D2D communication into a holistic system that handles various types of communication connections; it is designed not only to optimize mobile access to online social networks but also to handle mobile connectivity issues in both intermittent and disrupted scenarios. In the future, we will further study user feedback and address the potential limitations to improve the performance and usability of the current system.


ACKNOWLEDGMENTS

The authors would like to thank Prof. Julie A. McCann, Dr. Lambros Lambrinos and Mr. Xiang Nie for their help with the application tests. This work is supported by the National Natural Science Foundation of China under Grant No. 61602168 and 61672217, the National Key R&D Program of China under Grant No. 2016YFB0200405, the Intel Collaborative Research Institute for Sustainable Connected Cities (ICRI Cities), and the University of California Center on Economic Competitiveness in Transportation (UCCONNECT).


Received February 2017; revised May 2017; accepted June 2017

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Vol. 1, No. 3, Article 114. Publication date: September 2017.