Proximity Reasoning via Multimodal Context Fusion

FAKULTÄT FÜR INFORMATIK TECHNISCHE UNIVERSITÄT MÜNCHEN

Master’s Thesis in Informatics

Proximity Reasoning via Multimodal Context Fusion Jimmy Abualdenien

Proximity Reasoning via Multimodal Context Fusion
Proximity-Schlussfolgerung durch Multimodale Kontextfusion

Author: Jimmy Abualdenien
Supervisor: Prof. Dr.-Ing. Jörg Ott
Advisor: M.Sc. Michael Haus
Submission Date: 15.08.2017

I confirm that this master’s thesis in informatics is my own work and I have documented all sources and material used.

Munich, 15.08.2017

Jimmy Abualdenien

Acknowledgments

I would like to express my sincere appreciation to my advisor, Michael Haus, for his continuous support and guidance throughout this thesis. His valuable knowledge helped me achieve the planned goals. Furthermore, I would like to thank Prof. Dr.-Ing. Jörg Ott for supervising my thesis and for the lessons I have learned from him during my master’s study. Last but not least, I thank my family for their support.

Abstract

Mobile devices provide powerful capabilities to promote awareness of the surrounding environment via integrated sensors. These capabilities make it possible to acquire the devices’ location, which enables a wide range of Location Based Services (LBS). LBS are interested in the devices’ location from multiple aspects: acquiring their exact location and obtaining the devices’ relative position to other entities, the latter known as Proximity-based Services (ProSe). The accuracy of the devices’ location relies primarily on the technology used, ranging from hundreds of meters down to several centimeters. The rapid improvement of mobile devices’ specifications is a major incentive for realizing ProSe over a Device-to-Device (D2D) scheme. The nature of D2D brings multiple advantages, such as direct connection between devices and enhanced security. In this thesis, we focus on the design and implementation of a D2D proximity-based service which allows a group of devices to automatically establish a private meeting and maintain its attendees’ membership based on their relative position. The aim is to detect the devices’ proximity using the mobile device’s built-in microphone, WiFi adapter, and speakers. Thus, our service performs passive WiFi and ambient sound sensing, as well as active ultrasound probing, as input data for proximity reasoning. The main challenge is to deliver a fine-grained estimation of two devices’ proximity by measuring the similarity between the features extracted from the different context information. To assess the service’s feasibility, its connection performance, energy consumption and accuracy of inferring devices’ proximity are evaluated in different environments and multiple scenarios. As a result, the service’s efficiency is reasonable from the users’ perspective and its accuracy outperforms the random baseline in all scenarios; it achieves a worst-case accuracy of 94% in comparison to 23%. The environment’s context information can reveal different kinds of information about the users’ proximity and interactions. D2D communication is a promising approach for ProSe to naturally exchange data with nearby devices.

Contents

Acknowledgments
Abstract
1 Introduction & Motivation
  1.1 Location-based Services (LBS)
  1.2 Proximity-based Services (ProSe)
  1.3 Room-level Proximity Services
  1.4 Scope and Goals
  1.5 Structure of the Thesis
2 Related Work
  2.1 Discovery and Communication
    2.1.1 Client-Server Paradigm
    2.1.2 Device-to-Device Paradigm (D2D)
  2.2 Localization Techniques
    2.2.1 Cellular Network Localization
    2.2.2 Global Positioning System (GPS)
    2.2.3 Tag-based Localization
    2.2.4 Accelerometer and Gyroscope Localization
    2.2.5 Image Processing Localization
    2.2.6 WiFi Localization
    2.2.7 Audio Localization
3 Proximity Service Design
  3.1 Architecture and Communication Technology
  3.2 Communication Flow
  3.3 Proximity Detection
4 Proximity Service Implementation
  4.1 Architecture Overview
  4.2 Description of WiFi-P2P API
  4.3 Service Advertising and Discovery
  4.4 Group Formation
  4.5 Communication Flow
  4.6 Proximity Detection
    4.6.1 Proximity Detection via WiFi
    4.6.2 Proximity Detection via Audio
    4.6.3 Proximity Detection via Ultrasound
5 Evaluation
  5.1 Device-to-Device Connection Performance
    5.1.1 Idle vs. Download State Latency
    5.1.2 Proximity Detection Latency
    5.1.3 D2D Throughput Measurements
  5.2 Parameters for Proximity Evaluation
  5.3 Proximity Service Duration and Energy Consumption
    5.3.1 Group Formation
    5.3.2 Proximity Detection
  5.4 Proximity Detection Accuracy
    5.4.1 Restricted Space Environment
    5.4.2 Open Space Environment
  5.5 Discussion
6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work
7 Appendix
List of Figures
List of Tables
Bibliography

1 Introduction & Motivation

Mobile devices are the digital hub of our lifestyle, providing seamless ways to accomplish conventional daily tasks. These devices have multiple powerful characteristics, including mobility, ease of use and awareness of the surrounding environment, which induce users to view them as personal property rather than a digital tool [1]. Moreover, mobile devices offer several methods for communicating with the rest of the world, such as cellular networks, WiFi, and ad-hoc mode, besides the tremendous number of applications available for work, entertainment and social networking. Thus, these devices are becoming more pervasive; in 2016 the number of mobile devices, excluding non-smartphones and desktops, reached around 4.56 billion and is expected to reach around 6.6 billion in 2020 [2].

Nowadays, mobile devices are equipped with a rich collection of sensors and communication interfaces which provide various information about the user’s environment and situation, such as their current location and motion. Table 1.1 provides an overview of the built-in sensors in three device models: iPhone 7, LG Nexus 5X and Samsung Galaxy S8. Besides the mentioned sensors, the mobile device’s microphone, camera and communication interfaces, including WiFi, Bluetooth and NFC, can be utilized as sensors to reveal decisive information about the nearby environment.

G. D. Abowd et al. in [3] proposed a definition of the term context from an application point of view:

Context is any information that can be used to characterize the situation of an entity. An entity is a person, place or object that is considered relevant to the interaction between a user and an application, including the user and the application themselves.

The term context-aware was first introduced in [4], which referred to context as location, identities of nearby people and objects, and changes to those objects. Context awareness deals with linking changes in the environment with computer systems [5]. The development of Context-Aware Services (CAS) was established under the term ubiquitous computing. Mobile devices act as context-aware systems [6] that continuously trigger suitable actions based on changes in, e.g., time and location [7]. CAS are popular in different scenarios: applications collect local sensor data to personalize and enhance the user’s experience in a context-aware manner. For example, Google Maps [8] displays the device’s current location on a map and, while navigating, rotates the view based on the device’s direction and motion.

Table 1.1: Mobile Devices Sensors: comparison between iPhone 7, LG Nexus 5X and Samsung Galaxy S8 built-in sensors.
Sensor GPS Compass Barometer Accelerometer Ambient Light Gyroscope Proximity

iPhone 7

LG Nexus 5X

Samsung Galaxy S8

X X X X X X X

X X X X X X X X

X X X X X X X X X X X

Hall Effect Pressure Heart Rate Iris

Mobile services and applications are used in a variety of contexts and surroundings. When services are made context-aware, they can be personalized by sensing the surrounding environment and adapting accordingly. Since context can refer to any aspect of a situation, we can classify it into three general types [9]:

• Environment-specific: describes the environmental status of the user and the devices. Sensors are usually used to provide this kind of information, such as pressure, temperature and location.
• User-specific: personal attributes and preferences that can impact the context’s evaluation, such as being a student at a university or having particular hobbies.
• Device-specific: provides information about the device’s current status, including screen size, CPU usage or available memory.

A widespread approach for developing context-aware services is Mobile Cloud Computing (MCC) [10]. MCC aims to offload the applications’ computationally intensive and storage-demanding tasks to the cloud in order to leverage its powerful resources [11]. Computation offloading to the cloud can save battery resources and improve the application’s performance. However, a fast, reliable and secure network connection is required for the cloud resources to be available; otherwise, the application delivers a poor quality of service. Thus, depending on the use case, the MCC approach might not be suitable, for example when an opportunistic grouping of devices and the exchange of sensitive sensor data are required. In this case, it is more suitable and secure for the devices to communicate locally via different communication interfaces rather than through the cloud. Direct device-to-device communication is a promising approach, especially since hardware manufacturers are rapidly increasing mobile devices’ battery resources, storage and computational power to satisfy customers’ needs and demands [2].

Figure 1.1 illustrates a scenario where Bob’s and John’s families are on vacation. Each family prefers to explore the places differently. Therefore, Bob and John agree to split up and notify each other if they see any interesting place. Bob and John turn on a context-aware communication application on their mobile devices to keep track of each other’s current location and communicate when needed. The application is capable of sensing the current location’s context information, including GPS coordinates, compass and accelerometer data, via the mobile device’s built-in sensors to infer and report the distance between the devices. Since Bob’s and John’s devices are directly connected, missing internet infrastructure or a lack of cellular network coverage does not influence the service at all. Bob finds a cafe with very interesting deals and sends a picture of the deals to John through the application. John checks Bob’s current location in the application and finds the cafe.

Figure 1.1: A context-aware service scenario using device-to-device communication.

1.1 Location-based Services (LBS)

Location is the most explored type of context information. Its role in people’s digital life is changing and growing with the increase in location-based services. LBS [12] incorporate the mobile devices’ location information to deliver context-aware and personalized features and services. For instance, a car navigation application relies on the user’s current location coordinates to provide turn-by-turn directions to another location, or a smart-home application opens the garage door when the user is within a specific distance. The technology used to determine the user’s location is a critical factor for providing accurate services. There are various approaches for acquiring a user’s location information via mobile devices, such as cellular network localization techniques and the Global Positioning System (GPS).

1.2 Proximity-based Services (ProSe)

Proximity-based services (ProSe) [13] are a special class of location-based services that deliver information and trigger suitable actions based on the devices’ relative position to other entities. The purpose of ProSe applications is to discover instances of the applications running on devices within proximity of each other and to exchange application-related content [14]. ProSe are popular in different scenarios, such as public safety, network offloading, and broadcasting information in shopping malls and museums.

ProSe consist of two fundamental features: discovery of nearby devices and reliable communication. These features allow devices in proximity to find each other and establish a communication channel. ProSe can be achieved outdoors and indoors; each case relies on different localization techniques. Devices are likely to be close to each other when their geographical coordinates are in proximity or when they have similar sensor readings. In ProSe, the actual location of the two devices is not as important as the fact that they are near each other.

ProSe can be implemented at different scales and from different perspectives. A service may consider individuals and objects to be in proximity when the distance between them is less than 500 meters, or less than 5 meters, for example. Another service might also be interested in the individuals’ orientation, in addition to their relative position, to consider them in proximity, for instance to understand individuals’ interest in a specific exhibit in a museum.

Proximity detection, i.e., recognizing which devices are nearby, is an essential process in ProSe. To discover devices in proximity, we first need to gather information about each device’s surrounding environment. Then, we compare the environment information to infer the distance between the devices and decide whether they are in proximity. A device’s location can be obtained in two ways: acquiring the device’s exact location coordinates, for example via GPS, or retrieving its position relative to a known object or place, for example an access point positioned in a specific room. Depending on which way is used, there are specialized methods to calculate the distance between the devices’ locations or to check whether they are in the same area. Each way has advantages and limitations, as discussed in Section 2.2.
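For the first way, exact coordinates, a distance check against a fixed threshold can be sketched as follows. This is a minimal illustration, assuming GPS positions given as (latitude, longitude) in degrees and the haversine great-circle formula; the function names and the 5 meter default threshold are hypothetical and not taken from the implemented service.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_proximity(pos_a, pos_b, threshold_m=5.0):
    """True if two (lat, lon) positions are at most threshold_m apart.

    threshold_m is a hypothetical, service-specific parameter.
    """
    return haversine_m(*pos_a, *pos_b) <= threshold_m


# Two positions roughly 111 m apart (0.001 degrees of latitude).
bob = (48.262, 11.668)
john = (48.263, 11.668)
print(in_proximity(bob, john))                     # fails a 5 m threshold
print(in_proximity(bob, john, threshold_m=500.0))  # passes a 500 m threshold
```

Such a check is only as reliable as the underlying position fix, which is why the accuracy of the localization technique matters more than the distance formula itself.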

Figure 1.2: Inferring individuals’ proximity from WiFi and sound information.

Figure 1.2 shows how individuals’ proximity is inferred by sensing and analyzing the surrounding WiFi and sound signals. Once the information is analyzed and compared, Bob and Alice are recognized as being in proximity since the similarity between their readings is high, while Eve’s relative position is not considered close enough.
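The comparison of readings can be illustrated with a small sketch. It assumes that each device reports the set of access point identifiers it currently sees and uses the Jaccard index as the similarity measure; both the measure and the 0.5 threshold are assumptions chosen for illustration, not necessarily the measures used by the service.

```python
def jaccard(readings_a, readings_b):
    """Jaccard similarity of two collections of access point identifiers."""
    a, b = set(readings_a), set(readings_b)
    if not a and not b:
        return 0.0  # no evidence at all: treat as dissimilar
    return len(a & b) / len(a | b)


# Hypothetical scan results (access point identifiers are made up).
bob = {"ap-1", "ap-2", "ap-3"}
alice = {"ap-1", "ap-2", "ap-3", "ap-4"}
eve = {"ap-4", "ap-5"}

THRESHOLD = 0.5  # hypothetical similarity threshold

for name, scan in (("Alice", alice), ("Eve", eve)):
    similarity = jaccard(bob, scan)
    print(name, similarity, similarity >= THRESHOLD)
```

With these made-up scans, Bob and Alice share three of four access points, while Bob and Eve share none, which mirrors the situation in Figure 1.2.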

1.3 Room-level Proximity Services

ProSe are available for devices in proximity. Detecting whether two devices are in proximity means checking whether the distance between them is less than or equal to a specific threshold (e.g., 5 meters). Room-level proximity services are only available for users inside a specific room. In other words, the service should not be accessible to users outside the room. Thereby, the proximity threshold is the room boundary rather than a fixed distance. To illustrate this idea, in Figure 1.3 the service running in room1 should be available for Bob and Alice and unavailable for others in room2 and the hallway.

Figure 1.3: Example of room-level proximity service.

A definition and an abstraction of the main requirements of such applications, together with a suitable network architecture, were presented in “Room-Area Networks” (RAN) [15]. RAN is a new category that falls between personal area networks and local area networks. The major requirements are: adhere to room boundaries, work over a broadcast medium, and provide unicast, multicast, and service discovery. The authors implemented a prototype that uses acoustic channels, as they provide sharp attenuation at room boundaries. To realize a highly accurate room-level proximity service, the localization technique needs to adhere to room boundaries such as walls and doors. Furthermore, using a combination of techniques, such as sound and ultrasound in addition to WiFi signal processing, can improve the service’s accuracy and cover more scenarios [16].

1.4 Scope and Goals

The main goal of this thesis is to review the available technologies for realizing proximity-based services and to evaluate the feasibility of WiFi, sound and ultrasound localization techniques for inferring devices’ proximity. As a proof of concept, we implement a proximity-based service prototype that gives a group of people the ability to automatically establish a private meeting based on their relative position. Our approach starts by implementing means for discovering and communicating with nearby devices. Then, we collect and extract context information from the list of available access points and from ambient sound in the form of fingerprints. Once the fingerprints are generated, we implement multiple methods for measuring their similarity. Finally, we infer devices’ proximity based on the resulting measurements. In case the measurement results are not confident enough, an ultrasound text message is exchanged between the devices to confirm their proximity. Devices in proximity form a meeting and can exchange files and messages, while the other devices are disconnected. We evaluate our service by verifying its efficiency in terms of connection performance, energy consumption and accuracy of inferring devices’ proximity in different environments, including restricted and open spaces, in multiple scenarios.
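The decision flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two similarity scores are assumed to lie in [0, 1], they are fused by a plain average, and the confidence band (0.3 to 0.7) is a hypothetical choice; the callable ultrasound_probe stands in for the ultrasound message exchange and is not the service’s actual interface.

```python
from enum import Enum


class Decision(Enum):
    IN_PROXIMITY = "in_proximity"
    NOT_IN_PROXIMITY = "not_in_proximity"


def decide_proximity(wifi_similarity, audio_similarity, ultrasound_probe,
                     low=0.3, high=0.7):
    """Fuse passive similarities; fall back to active probing when unsure.

    low and high are hypothetical thresholds delimiting the uncertain band.
    ultrasound_probe is a callable returning True if the ultrasound message
    was received by the other device.
    """
    score = (wifi_similarity + audio_similarity) / 2  # assumed fusion rule
    if score >= high:
        return Decision.IN_PROXIMITY
    if score <= low:
        return Decision.NOT_IN_PROXIMITY
    # Passive sensing is not confident enough: confirm via ultrasound.
    if ultrasound_probe():
        return Decision.IN_PROXIMITY
    return Decision.NOT_IN_PROXIMITY


print(decide_proximity(0.9, 0.8, lambda: False))  # confident, probe not used
print(decide_proximity(0.5, 0.5, lambda: True))   # uncertain, probe decides
```

The point of the fallback is that the active ultrasound probe is only triggered in the uncertain band, so confident decisions are made from passive sensing alone.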

1.5 Structure of the Thesis

The rest of this master’s thesis is structured as follows. Chapter 2 consists of two main parts: the first part provides an overview of the available architectures and technologies for realizing the proximity services’ discovery and communication, and the second part reviews the different localization techniques, explaining their strengths and limitations in inferring devices’ proximity. Chapter 3 presents the high-level design and use cases of our proximity service. Chapter 4 explains the implementation of the service’s advertising, discovery and communication using Android’s WiFi-P2P API; moreover, the techniques for detecting proximity via WiFi, audio and ultrasound are reviewed. Afterwards, Chapter 5 evaluates the WiFi-Direct connection’s latency and throughput in multiple states: idle, with a download running in the background, and while the supernode is verifying the clients’ proximity. Then, the duration and energy consumption of forming a group, as well as of verifying the users’ proximity, are measured in two different environments: restricted and open spaces. Thereafter, the proximity service’s accuracy is evaluated in multiple concrete scenarios. To reason about the evaluation results, the connection’s latency and throughput findings are compared to a conventional WiFi connection, and the service’s accuracy is compared to a random baseline as a reference. Finally, in Chapter 6, we provide a brief conclusion summarizing the work done throughout this thesis, followed by some guidance for possible improvements as future work.

2 Related Work

This chapter provides an outline of the necessary information and concepts related to proximity-based services. The first section starts with an overview of the available paradigms and technologies for realizing ProSe discovery and communication. The second section reviews different localization techniques, explaining their strengths and limitations in inferring devices’ proximity.

2.1 Discovery and Communication

ProSe consist of two fundamental features: discovery of nearby devices and reliable communication. These features allow devices in proximity to find each other and establish a communication channel. As shown in Figure 2.1, ProSe can be designed and implemented following two paradigms: client-server and device-to-device (D2D). Depending on the use case, one paradigm fits better than the other in fulfilling the requirements and delivering a better quality of service.

Figure 2.1: Proximity-based services paradigms.

2.1.1 Client-Server Paradigm

In the client-server paradigm, mobile devices collect data, such as GPS and WiFi readings, and periodically send them to a centralized server, usually located in a backend, which processes the data and determines the proximity relations between devices. This paradigm makes discovering peers and services faster and more efficient in terms of energy consumption, since the main logic is executed on a powerful server instead of on the mobile device’s limited resources. However, the interaction with servers at a distant location raises critical privacy problems and is subject to network latency and bandwidth limitations, which influence the quality of the service.

2.1.2 Device-to-Device Paradigm (D2D)

D2D systems are distributed systems with a high degree of decentralization that do not need any centralized server. A D2D system consists of interconnected nodes (peers) which are equivalent in terms of function. Each node can act as a client and a server at the same time. Since there are no centralized servers, D2D networks are self-organized [17]. Nodes in D2D systems cooperate in order to provide one or more services, such as grid computing, file sharing, replication and distributed storage. Nodes can cooperate by sharing resources, such as storage, CPU cycles, network bandwidth, and data [18].

The decentralized nature of this paradigm has several advantages when applied in distributed systems. The main advantage is avoiding the single point of failure, which is a common problem in the client-server paradigm; this improves reliability, availability and robustness. Cost-effectiveness is another benefit of this paradigm: it is achieved by utilizing the existing resources and eliminating the need for an expensive infrastructure of super nodes [18].

In the case of ProSe, peer and service discovery consume additional time and energy because they are executed locally on the device rather than on a centralized server. Nevertheless, D2D alleviates privacy concerns by giving users more control over what information is shared.

D2D communication was initially implemented by means of WiFi ad-hoc mode and Tunneled Direct Link Setup (TDLS) [19]. The limitations of ad-hoc systems are their setup complexity and low data transfer rate. Additionally, devices that are configured to use those implementations are unable to concurrently manage other connections. Other approaches used Bluetooth, which on the one hand has a short radio range with low power consumption in comparison to WiFi and is available in all devices. On the other hand, it has a low data transmission rate and is not capable of broadcasting data to multiple devices, which makes it impractical for transmitting documents and media files. Recently, multiple technologies were introduced to overcome the previous limitations and support ProSe use cases in a more efficient way, such as WiFi-Direct by the WiFi Alliance [20] and LTE-Direct by 3GPP (the 3rd Generation Partnership Project) [21].

WiFi-Direct

WiFi-Direct is a wireless technology that enables devices with WiFi capability to seamlessly connect to each other, without the need for an access point or internet connection, by establishing P2P groups. Each group requires one device to act as a Group Owner (GO), which behaves as an access point for one or more clients. These roles are logical and dynamic; each device has to implement both roles and can execute them simultaneously. This technology allows devices to send data using the WiFi standards at a fast speed [20]. Figure 2.2 shows the difference between traditional WiFi and WiFi-Direct topologies.

The WiFi-Direct communication flow is initiated by device discovery, where each device collects wireless information about the surrounding devices and sends probe requests to check the other devices’ state. When a device receives a request, it responds by sending its own identification and the details of the currently joined group, if available.

Figure 2.2: WiFi and WiFi-Direct communication topology.

Once the two devices have found each other and decided to establish a connection, a GO negotiation process starts, whereby the two devices agree on which device will act as P2P GO and on the channel on which the group will operate. Then, the GO device starts a Dynamic Host Configuration Protocol (DHCP) server and assigns every client an IP address, as shown in Figure 2.3.

WiFi-Direct devices implement an authentication process based on WiFi Protected Setup (WPS) to communicate securely in a simple manner, via PIN code exchange or by pushing a button on the two P2P devices; the P2P GO is required to implement an internal Registrar, and the P2P Client is required to implement an Enrollee [20].

A prominent capability of WiFi-Direct is the ability to advertise and discover services at the link layer. This is useful prior to the establishment of a P2P group, since P2P devices can exchange queries to discover the set of available services and, based on this, decide whether to continue the group formation or not. In order to implement this feature, service discovery queries generated by a higher-layer protocol, such as Bonjour [22], are transported at the link layer using the Generic Advertisement Service (GAS) specified by 802.11u.

WiFi-Direct provides a secure scheme that makes it easy to discover and connect to other devices, which enhances the applications’ mobility and portability, as devices can connect and exchange data anytime and anywhere. A major limitation of WiFi-Direct is that it does not allow the transfer of the Group Owner role once the group is created. As a result, if the P2P GO leaves, the P2P group is destroyed and all devices are disconnected.
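The outcome of the GO negotiation mentioned above can be illustrated with a simplified sketch. It assumes the intent-value scheme of the Wi-Fi P2P specification, which the text above does not detail: each device declares a GO intent between 0 and 15, the higher intent wins, equal intents below 15 are resolved by a tie-breaker bit, and two intents of 15 make the negotiation fail. The function and parameter names are hypothetical, and the message exchange itself is omitted.

```python
def negotiate_group_owner(intent_a, intent_b, tie_breaker_a):
    """Return "A" or "B" for the device that becomes GO, or None on failure.

    intent_a, intent_b: declared GO intent values in the range 0..15.
    tie_breaker_a: tie-breaker bit sent by device A (True means A wins a tie).
    """
    for intent in (intent_a, intent_b):
        if not 0 <= intent <= 15:
            raise ValueError("GO intent must be between 0 and 15")
    if intent_a != intent_b:
        return "A" if intent_a > intent_b else "B"
    if intent_a == 15:
        return None  # both devices insist on being GO: negotiation fails
    return "A" if tie_breaker_a else "B"


print(negotiate_group_owner(7, 3, tie_breaker_a=False))    # A
print(negotiate_group_owner(7, 7, tie_breaker_a=False))    # B
print(negotiate_group_owner(15, 15, tie_breaker_a=True))   # None
```

Since the GO role cannot be transferred later, a service may want the device that is expected to stay longest to declare the higher intent.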

Figure 2.3: WiFi-Direct discovery and group formation flow.

LTE-Direct

LTE-Direct emerged as one of the most innovative technologies in 3GPP Releases 12 and 13 [23]. It leverages the LTE air interface to realize two fundamental features: D2D discovery and direct data exchange between devices that are in proximity, without the need to route via evolved NodeBs (eNBs). LTE-Direct works in licensed spectrum and is under the mobile operators’ control [23, 24]. In terms of discovery, LTE-Direct provides a common language called "expressions" for application advertising and device discovery. An expression is associated with:
• Name (required): an application-defined string that is used by the application layer. An example is "TUM Campus".
• Code (required): a binary representation of the application name, used by the physical and MAC layers over the LTE air interface.
• Metadata (optional): a collection of associated parameters and attributes in a human-readable form, such as "Summer semester 2017", used by the application layer.

LTE-Direct expressions can be either private or public. Services are mapped to public expressions through the centralized Expression Name Server (ENS) [21]. The communication in LTE-Direct has three types of links: uplink, downlink and sidelink. As shown in Figure 2.4, the "uplink" designates the link from the mobile device, or user equipment (UE), to the eNB, and the "downlink" is the link from the eNB to the mobile device. Both links are managed via the conventional Uu interface. The "sidelink" represents the D2D communication channel between devices that realizes the ProSe application; it is managed via the PC5 air interface.

Figure 2.4: LTE-Direct communication links between eNB and UEs.

The main use case for LTE-Direct ProSe is to support public safety events, such as providing wireless services for police and ambulance crews in the case of an accident. Additionally, this technology could encourage the creation of many applications that support daily use cases, such as providing ProSe in shopping malls, broadcasting information for tourists or local social networking [24].

LTE-Direct has several benefits over WiFi-Direct. The first is the discovery range: LTE-Direct has a discovery range of hundreds of meters, reaching up to 500 meters, and it allows the discovery to be "Always ON" and autonomous, whereas the range of WiFi-Direct is up to tens of meters. Additionally, the discovery process in LTE-Direct does not consume as much energy as in WiFi-Direct [21]. Qualcomm reported that LTE-Direct can discover as many as 7200 devices in 0.64 s, in comparison to 369 devices found using WiFi-Direct in 82–119 s. The simulation results showed that LTE-Direct can achieve more energy-efficient communication, improved by 5053 (MB/J) [25]. However, since LTE-Direct is under the mobile operators’ control and there are many technical challenges, the current attention of industrial applications is on the unlicensed bands, including WiFi and Bluetooth [26].

2.2 Localization Techniques

As a key ingredient of sensing, localization is a prerequisite for many services, such as LBS and ProSe. There are various approaches for retrieving the mobile device’s location information, with accuracies ranging from hundreds of meters down to several centimeters, divided into outdoor and indoor localization techniques. Outdoor localization techniques are available as a standard feature in most devices, for example for map navigation in mobile devices and car computers. However, when the user is inside a building, the accuracy of these techniques degrades because the signals are attenuated when passing through building materials [27]. Therefore, leveraging a different set of sensors and radio interfaces is required. A great number of approaches have been researched [15, 28, 29, 30, 31, 32, 33], trying to provide high accuracy while balancing other aspects, including cost and ease of installation and deployment.

2.2.1 Cellular Network Localization

Network-based systems use technologies that calculate the mobile device’s position from measurements obtained at base stations. The properties of the transmitted cellular network signal, such as signal strength and angle of arrival, are affected by the distance between the transmitter and the receiver. Therefore, these properties can be used to estimate the distance and locate the mobile device. The basic cellular network localization methods, shown in Figure 2.5, are [35, 34]:

• Cell-ID and Timing Advance (TA): when a mobile device is in an area, it registers its current location with a base station. This information can be used to estimate the mobile device’s current location, using the identification codes assigned to it. Additionally, TA is a measure of the distance between the base transceiver station (BTS) and the mobile device, which can be used to reduce the positioning error.
• Received Signal Strength (RSS): the distance between the BTS and a mobile device is approximated using the signal strength and propagation models of the environment, such as average path loss, shadowing and small-scale fading.
• Angle of Arrival (AOA): uses the angle of the signals arriving at the mobile device from two or more BTSs. Measuring the angle depends on line-of-sight conditions, and any error in the angle measurement results in major positioning errors.
• Time of Arrival (TOA): measures the propagation delay of transmissions to multiple BTSs and uses triangulation to locate the mobile device.
• Time Difference of Arrival (TDOA): a mobile device transmits a signal to two synchronized BTSs; the difference between the signal arrival times at the BTSs is then used to estimate the mobile device’s location.

Figure 2.5: Cellular Network localization methods.


The main advantage of the Cell-ID and RSS methods over AOA, TOA and TDOA is that they do not require any modifications or additional hardware installation at cellular network base stations. In terms of localization accuracy, these methods achieve only a low accuracy of hundreds of meters; thus, they are not applicable to many LBS.
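The RSS method listed above can be illustrated with a minimal sketch that inverts a log-distance path-loss model. The reference power, path-loss exponent and function name are assumptions chosen for illustration; they are not values taken from [35, 34], and a real deployment would calibrate them per environment.

```python
import math


def rss_to_distance(rss_dbm, rss_at_ref_dbm=-40.0, path_loss_exponent=3.0,
                    ref_distance_m=1.0):
    """Estimate distance from a received signal strength reading.

    Inverts the log-distance model
        rss = rss_at_ref - 10 * n * log10(d / d0)
    where all default parameter values are assumed, not measured.
    """
    exponent = (rss_at_ref_dbm - rss_dbm) / (10.0 * path_loss_exponent)
    return ref_distance_m * 10.0 ** exponent


if __name__ == "__main__":
    # A weaker signal maps to a larger estimated distance.
    for rss in (-40.0, -55.0, -70.0):
        print(rss, "dBm ->", round(rss_to_distance(rss), 2), "m")
```

Because the distance grows exponentially with the signal loss, a few dB of shadowing or fading translate into large distance errors, which is consistent with the coarse accuracy reported for this method.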

2.2.2 Global Positioning System (GPS)

GPS is the most popular outdoor localization system; it uses 2-D or 3-D triangulation to calculate position coordinates with high accuracy. However, the time and energy required to acquire a GPS coordinate are significant [36]. A typical smartphone will completely drain its battery in about 6 hours if GPS is running continuously [37, 38]. Moreover, GPS performs poorly in non-line-of-sight (NLOS) environments, such as inside buildings, where the signals are degraded after passing through building materials [27]; this makes it unsuitable for indoor localization.

Assisted GPS (A-GPS)

A-GPS was designed to increase the signal strength indoors; it consists of networks of GPS transmitters on towers. Since these transmitters are much closer to the users, their signal is stronger and can also be used for indoor localization. The main disadvantages of this system are its high cost, limited coverage and low accuracy of around 50 meters [28].

2.2.3 Tag-based Localization

Tag-based approaches require distributing sensors, such as proximity sensors, ultrasound transmitters, and BLE beacons, at fixed positions indoors. The main advantage of these approaches is their ability to produce high localization accuracy. However, the effort and cost of installing additional hardware, and sometimes the need for a centralized server, are major limitations.


Infrared Tags

The Active Badge system [39] is one of the first indoor tag-based localization systems. The system consists of infrared tags carried by users, sensors placed in each room to detect the signals sent by the infrared tags, and a central server. The system provides high accuracy for room-level localization.

Radio Frequency Identification (RFID)

RFID systems use a network of radio beacons and tags, which are accurate and reliable. However, this technique requires numerous tags to be installed, along with a central network server and transmitters [40].

BLE Beacons

BLE beacons are flexible in the sense that they are small, do not need to be plugged in, and are power efficient. The authors of [31] presented a low-cost, threshold-based approach and introduced an algorithm that takes into account both the Received Signal Strength Indicator (RSSI) of the Bluetooth Low Energy (BLE) beacons and the geometry of the rooms where the beacons are placed. They attached one BLE beacon to the center of each room’s ceiling and used a signal propagation loss method [41] to calculate the distance from the received RSSI.

ZigBee

ZigBee is a reliable, cost-effective, low-power home area wireless network technology developed by the ZigBee Alliance and based on an open global standard. ZigBee was used in [42] for room detection. The approach considers the behaviour of the RSSI through walls, floors and ceilings. Reference nodes were installed in each room as reference points for mobile devices. The system performs well given its simplicity, although a wrong room decision sometimes occurs when the mobile device is near a wall that separates two rooms.


Ultrasound

The system in [43] measures the distance between a receiver and the transmitters from the time it takes sound waves to reach the receiver from each transmitter. The system requires installing ultrasound transmitters at numerous locations around a building. This technique is very accurate; it can locate the user to within several centimeters. In the same context, Pérez et al. [44] presented LOCATE-US, in which every mobile device is extended with small dedicated hardware that consists of an ultrasonic microphone and a micro-controller. The system emits signals at a high frequency, close to 41 kHz, that are captured by the ultrasonic microphone. The signals are processed and sent over Bluetooth to the mobile device. Once the device receives them, the application determines its own position. Based on their experiments, the proposed application achieved a localization error below 10 centimeters in 80% of the cases.

Proximity

In some cases, besides the proximity or location of individuals, their orientation can carry important information. For example, in museums it is important to detect what each individual is looking at. Such information helps to understand the visitors’ interests and to identify group behavior, such as the time spent at each exhibit, common paths and patterns. Understanding the visitors’ behavior can support building marketing campaigns and applications, in addition to being useful for the museum staff, stakeholders and founders. This museum case is handled in [33] using dedicated, inexpensive and energy-efficient proximity sensors. The sensors were positioned at the base of each exhibit in a museum to measure person-to-object proximity and orientation in order to determine what each individual is looking at. Based on their evaluation, the system accurately positions visitors at exhibits at all times. The disadvantages of this system are that it is not efficient in crowded areas with a large number of individuals and that it requires additional hardware.
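For the ultrasound systems discussed earlier in this subsection, the ranging step reduces to multiplying a propagation time by the speed of sound. The following is a minimal sketch under the assumption that the transmitter and receiver share a synchronized clock; the constant and function name are illustrative and not taken from [43] or [44].

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for dry air at about 20 °C


def tof_distance(emit_time_s, arrival_time_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance travelled by a sound pulse, given emit and arrival timestamps.

    Assumes both timestamps come from synchronized clocks (an assumption of
    this sketch, not a statement about the cited systems).
    """
    if arrival_time_s < emit_time_s:
        raise ValueError("arrival time precedes emit time")
    return (arrival_time_s - emit_time_s) * speed_m_s


if __name__ == "__main__":
    # A pulse that needs 10 ms to arrive has travelled roughly 3.4 m.
    print(tof_distance(0.0, 0.010))
```

Since sound covers only about 34 cm per millisecond, a timing error of 0.1 ms corresponds to roughly 3.4 cm, which suggests why sound-based ranging can reach centimeter-level accuracy with modest timing precision.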


WiFi Adapters

The system proposed in RADAR [45] requires multiple computers with WiFi adapters acting as base stations, installed in every room. Every user needs to carry a device equipped with a WiFi router to broadcast signals. The base stations receive these signals and use triangulation to determine the user’s location. The system therefore requires installing a WiFi adapter in each room and a WiFi router on the mobile device.

Visible Light Communications (VLC)

VLC uses visible light instead of radio frequency (RF) electromagnetic waves for communication. Light-emitting diodes (LEDs) are used to modulate electrical signals into light signals at high speed. A major advantage of LED signals is that, unlike RF signals, they do not interfere with electronic equipment [46]. Additionally, LEDs have a long life expectancy, low power consumption, high tolerance to humidity, and are environmentally friendly [47]. The transmission speed of VLC systems is high [48]; a data rate of 8 Gb/s can be achieved over 1 m in an indoor free-space environment. Jung et al. [46] introduced an indoor positioning system that relies on an existing infrastructure of LED ceiling lamps and uses the time difference of arrival (TDOA) method as a positioning technique. The system works by assigning a unique frequency to each LED lamp; the LED properties, such as lighting and switching, are then used for transmission. The maximum and mean location errors during simulation were 4.5 mm and 1.8 mm, respectively.

2.2.4 Accelerometer and Gyroscope Localization

From the accelerometer and gyroscope readings it is possible to measure a person’s displacement by counting the number of steps. Additionally, the direction of each step can be tracked by analyzing the compass readings. The localization system developed in [49] continuously tracks the user’s steps, heading directions and step length. The system obtains the initial location through user input and estimates the user’s movement trajectory from the device’s sensors without the need for infrastructure assistance. Based on the evaluation results, the system achieves a mean accuracy of 1.5 m when the device is held in the hand and 2 m when it is in a pocket. The issue with this approach is that the tracking is accurate only at the beginning; the error accumulates over time.
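The step-based tracking described above, and the way its error accumulates, can be sketched as simple dead reckoning. This is an illustrative sketch only: the step length, the constant heading bias and the function names are assumptions and do not reproduce the system in [49].

```python
import math


def dead_reckon(start_xy, steps):
    """Integrate (step_length_m, heading_deg) pairs into a 2-D track.

    Headings are measured clockwise from north, so x is east and y is north.
    """
    x, y = start_xy
    track = [(x, y)]
    for length_m, heading_deg in steps:
        theta = math.radians(heading_deg)
        x += length_m * math.sin(theta)
        y += length_m * math.cos(theta)
        track.append((x, y))
    return track


def drift_after(n_steps, step_m=0.7, heading_bias_deg=5.0):
    """Position error after walking north with an assumed constant heading bias."""
    true_end = dead_reckon((0.0, 0.0), [(step_m, 0.0)] * n_steps)[-1]
    est_end = dead_reckon((0.0, 0.0), [(step_m, heading_bias_deg)] * n_steps)[-1]
    return math.hypot(est_end[0] - true_end[0], est_end[1] - true_end[1])


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, "steps ->", round(drift_after(n), 2), "m error")
```

With a constant heading bias the error grows with every step, which matches the observation that tracking is accurate only at the beginning unless the position is periodically corrected.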

2.2.5 Image Processing Localization

One approach detects a user’s movement by analysing changes between images captured by a camera [50]. Another approach provides navigation by matching the captured images to images stored in a database using the SIFT algorithm [51]. These systems produce good results. Their main disadvantage is the requirement for high computing power, which results in high energy consumption and makes them impractical for use on mobile devices. An image processing approach that does not consume a high amount of energy uses marker tags [52]. These special tags need to be installed in every room, and every tag’s information, including the room location, is stored in a database. When a user takes a picture of a room, the system recognizes the tags and matches them with the database. The main disadvantage is the need to install additional hardware, i.e., the special tags.

2.2.6 WiFi Localization

Several approaches have investigated WiFi as an indoor localization technique. WiFi information can be useful in many ways. As a simple usage, a WiFi hot spot can serve as a presence sensor that provides context about the device’s current location. For example, SpotEx [53] defines rules to identify a particular room: if the signal strength of access point X is above a specific threshold, the device is inside room A. Another way of using WiFi is to generate a fingerprint of every area or room. WiFi fingerprint-based positioning is a popular approach in indoor localization; it is completely software-based and relies on deployed access points to build up knowledge about a device’s surrounding area. Accordingly, the more access points are reachable, the more knowledge is available [54]. A WiFi fingerprint can be built from a list of the available access points’ information, including each access point’s SSID, BSSID and received signal strength indicator (RSSI) value. As an example, the system proposed in [55] creates a WiFi fingerprint for each room of a building. During navigation, the system determines the user’s location by measuring the strengths of the ambient WiFi signals and matching them with the stored fingerprints. In the same context, Kaemarungsi et al. [56] utilized the RSSI of APs and presented an analytical model. They used the Euclidean distance between a sample signal vector and a set of fingerprints stored in a database.

On a large scale, WiFi information was used in [32] to analyze mobility and interactions between approximately 1,000 students who interacted in various environments over two years. The dataset used was collected as part of the Copenhagen Networks Study [57]. The authors used Bluetooth as ground truth for physical proximity, with data collected for over a year from approximately 800 participants. Their research shows that it is possible to infer person-to-person proximity from the lists of WiFi access points scanned by smartphones. They used a number of metrics to compare two lists of WiFi scan results and used these metrics as features in a supervised machine learning approach. The features are divided into availability of access points, received signal strength, presence + RSSI, and context derived from the characteristics of the interaction dynamics, such as the time of day, the day of the week and the popularity of a location. Based on their evaluation, the system achieved a score of 89% in determining person-to-person interaction, showing that the WiFi environment indicates proximity in a more granular way than just the Bluetooth 10 meter range.

From another perspective, SpotFi [58] estimates the angle of arrival (AoA) and time of flight (ToF) of the different multi-path components of a target’s signal arriving at the AP, using the channel state information exposed by WiFi APs. Afterwards, it estimates the likelihood that each AoA and ToF pair corresponds to the direct path between the AP and the target, without any reflections. The last step is to use the estimated information to calculate the most likely location of the target that could have produced the observed RSSI and estimated AoA. Based on their experiments, SpotFi achieves a median accuracy of 40 cm.
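The fingerprint matching described in this subsection, comparing a scan against stored fingerprints by Euclidean distance, can be sketched as follows. The data layout (a mapping from BSSID to RSSI), the floor value for unseen access points and the function names are assumptions made for this illustration rather than details of [55] or [56].

```python
import math

# Assumed RSSI (dBm) substituted for access points missing from one scan.
MISSING_RSSI_DBM = -100.0


def fingerprint_distance(scan_a, scan_b):
    """Euclidean distance between two scans given as {bssid: rssi_dbm}."""
    bssids = set(scan_a) | set(scan_b)
    return math.sqrt(sum(
        (scan_a.get(b, MISSING_RSSI_DBM) - scan_b.get(b, MISSING_RSSI_DBM)) ** 2
        for b in bssids
    ))


def locate(scan, stored_fingerprints):
    """Return the label of the stored fingerprint closest to the scan."""
    return min(stored_fingerprints,
               key=lambda room: fingerprint_distance(scan, stored_fingerprints[room]))


if __name__ == "__main__":
    stored = {
        "room A": {"ap1": -40.0, "ap2": -80.0},
        "room B": {"ap1": -85.0, "ap2": -45.0},
    }
    print(locate({"ap1": -45.0, "ap2": -75.0}, stored))
```

The same distance function could also compare the scans of two devices directly, in which case a small distance would hint at person-to-person proximity instead of identifying a room.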

2.2.7 Audio Localization

A mobile device’s microphone is a sensor able to capture rich information about its surrounding environment. Using sound signals as a proximity indicator is inspired by the scenario of two people having a conversation in an area or inside a meeting room, where they hear a very similar sound in comparison to a third person standing further away or outside the room. Unlike radio waves, audio waves are more local; they adhere to boundaries and are affected by interference with the surrounding objects. Figure 2.6 demonstrates how the sound propagation of a conversation can limit the proximity indication in comparison to WiFi.

Figure 2.6: WiFi vs. sound signal propagation.

An interesting implementation at the physical layer, based on the 802.11a standard, is Blurt [15]. Blurt tolerates noise and interference; it is also computationally efficient and works over ranges of several meters. Moreover, Blurt signals are attenuated 4000x more than WiFi signals and thus adhere much more intuitively to room boundaries.

Every environment usually has a relatively unique audio fingerprint at a specific time. An audio fingerprint can be derived from an audio recording in a way that provides an identification or a summary of it. Generating an audio fingerprint from an audio file can involve different techniques and algorithms, such as measuring energy and applying the Fast Fourier transform (FFT) and Mel-frequency cepstral coefficients (MFCC) [59]. For example, the method proposed by Wirz et al. [60] utilizes the similarity of the frequency spectrum of ambient sound to conclude that a relation exists between the distance of two devices and the similarity of the recorded ambient sound. The authors of [61] investigated proximity detection on a database of personal audio recordings by applying two mechanisms: short-time cross-correlation and acoustic landmark-based fingerprinting [63, 62]. Their evaluation shows that cross-correlation between 10-second windows is effective for detecting when individuals are close enough to be in a conversation, and that a fingerprinting approach based on acoustic landmarks is comparably accurate.

Diverse approaches to applying audio fingerprinting to infer proximity or similarity have been researched. The main motivation for generating an audio or any other multimedia fingerprint is to have an efficient mechanism to establish the perceptual equality of two audio files by comparing their small meta-data instead of their large original files. Any fingerprinting system consists of two main components [64]: a method to extract fingerprints and a method to match two or more fingerprints. J. Haitsma et al. [64] elaborated on the requirements for an audio fingerprint, including robustness, reliability, size and granularity, in addition to explaining the use of hash functions [65] in extracting and matching fingerprints. An overview of various fingerprinting approaches is presented in [66]. A landmark-based fingerprint approach is presented by Wang et al. in the implementation of Shazam [59, 62]. The idea is to use audio structures as time references instead of arbitrary time frames to overcome any errors or mismatches in framing the recording. Shazam uses the time-frequency peaks of nearby frames (i.e., the dominant frequencies, since they are the most robust to noise) to generate a set of robust hash values which represent the recording.

Based on the fact that modern life is full of noises, such as the whirr of computers and the buzz of lights, Acoustic Background Spectrum (ABS) [30] assumes that each room has different persistent acoustic characteristics, even if the rooms may sound similar to humans. An ABS fingerprint is generated by dividing the recorded audio file into small frames, applying a window function such as Hamming to each frame, and then computing the power spectrum by applying the FFT. Finally, the authors filter out the frequency band of 0–7 kHz and take the log of the 5th percentile column. They stated that combining ABS with WiFi fingerprinting improved the accuracy of room-level localization from 30% to 69% and that ABS can distinguish pairs of adjacent rooms with an accuracy of 92%.

From another point of view, [16] implemented a technique to represent the fingerprint of the silence in an audio file. This technique looks at the silence pattern rather than a detailed audio characterization. In their evaluation in a noisy cafeteria, the accuracy in the worst case is 96%. However, if there are no sufficiently loud acoustic events in the environment, this approach does not work. Compared to sound fingerprinting, silence fingerprinting requires less computation and effort to build the final fingerprint. Moreover, it provides more privacy regarding conversational content.

Besides this limitation of silence fingerprinting, the similarity between frequency spectrum readings taken in two different places can be high in some cases [67]. An approach to overcome these issues is peer-assisted localization [29], which leverages acoustic ranging between peers without requiring any special hardware. A device can use nearby devices as reference points and obtain its position relative to them. In this system, each device emits a special audio signal and records an audio file, then sends the recording, together with a WiFi fingerprint, to the server. The server schedules when a device should emit the audio signal; it is also responsible for determining the devices’ locations and inferring their proximity. The emitted audio signal consists of several evenly paced beeps of equal length. Each beep is emitted in a high frequency band between 16 kHz and 20 kHz, which both makes it easier to filter out noise and renders the signal unnoticeable to most people. The server uses two methods to detect the signal: change-point detection and correlation-based detection. Experiments show that the approach can reduce the maximum and 80-percentile errors to as little as 2 m and 1 m, with negligible impact on the battery’s lifetime.

Having reviewed the available localization techniques, we can assess their capabilities and limitations. Choosing the proper technique depends on the use case’s requirements in terms of environment, expected proximity accuracy, transfer rate and cost. For example, detecting the orientation of individuals in a museum requires different techniques than a proximity-based service used inside a meeting room or detecting the presence of individuals in a shopping mall.
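As an illustration of the short-time cross-correlation idea from Section 2.2.7, the following sketch computes the maximum normalized correlation between two equally long sample windows over a small range of lags. It is a simplified stand-in, not the method of [61]: the lag range, the window contents and the function names are assumptions.

```python
import math


def normalized_xcorr(a, b):
    """Zero-mean normalized correlation of two equal-length sample lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant window carries no usable structure
    return sum(x * y for x, y in zip(da, db)) / denom


def max_xcorr(a, b, max_lag):
    """Best normalized correlation when b may be shifted by up to max_lag samples."""
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            seg_a, seg_b = a[lag:], b[:len(b) - lag]
        else:
            seg_a, seg_b = a[:len(a) + lag], b[-lag:]
        m = min(len(seg_a), len(seg_b))
        if m < 2:
            continue
        best = max(best, normalized_xcorr(seg_a[:m], seg_b[:m]))
    return best


if __name__ == "__main__":
    base = [math.sin(0.3 * i) + 0.5 * math.sin(1.1 * i) for i in range(203)]
    device_a = base[:200]
    device_b = base[3:203]  # same sound, recorded three samples later
    print(max_xcorr(device_a, device_b, max_lag=10))
```

Searching over lags matters because two devices rarely start recording at exactly the same instant; a value close to 1 would suggest that both devices heard the same sound, while values near 0 suggest unrelated environments.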


3 Proximity Service Design

This chapter presents the high-level design of our proximity-based service. We discuss the possible use-cases, explaining each use-case’s flow, expected behavior and final state. Throughout this chapter, we give an overview of the design choices made while developing the service. Details about the implementation are provided in the next chapter.

This thesis aims to implement a proximity-based service that allows a group of people in proximity to establish a private meeting to exchange files and messages. People in proximity can be inside a closed room or in an open space. Therefore, an existing network infrastructure, an Internet connection or additional hardware should not be a prerequisite for the service to work. In the same context, the service should verify and maintain the meeting’s membership automatically by sensing the surrounding area’s context information. Consequently, when participants are no longer in proximity, they should no longer have access to the associated content. The service accomplishes these requirements through the following main features; each feature is discussed in further detail throughout this chapter:

• Device-to-Device service advertisement and discovery.
• Automatic negotiation of the supernode device.
• Automatic verification of the devices’ proximity.
• Direct data exchange between devices.
• Encryption of the entire communication between devices.


3.1 Architecture and Communication Technology

In Chapter 2, we reviewed the available approaches and technologies for realizing the discovery and communication of proximity-based services. Based on that review, a D2D architecture serves our service’s requirements better than a client-server architecture because it works without any additional hardware or infrastructure. Furthermore, as discussed by M. Hazas in [68], it is preferable to perform all the processing on the mobile phone to maintain the user’s privacy. In terms of technology, we have chosen WiFi-Direct to manage the communication between devices. The main strength of WiFi-Direct is that it works in an unlicensed spectrum, in addition to its availability in modern devices such as Android devices starting from version 4.0 (API level 14) [69]. Accordingly, the service is designed to work on Android devices with WiFi-Direct capabilities. Moreover, besides the powerful built-in support for WiFi-Direct, the Android framework is reliable and has a large community behind it.

3.2 Communication Flow

Our service is designed as a standalone application; it does not require any back-end support and includes all the logic needed to perform its functionality. Besides exchanging files and messages, the devices joining the meeting collaborate in moderating the meeting’s use-cases by assigning a virtual role to each device. One such role is the supernode, which is responsible for managing the communication between devices and inferring their proximity. Our application can play two roles, Supernode and Client, and behaves accordingly in each. In the Supernode role, the device starts a sockets server, then requests and analyzes different context information from each device to infer its proximity. In the Client role, the device connects to the supernode device and responds to any context information requests.

The flow of our service consists of five main processes: Service Advertising, Services Discovery, Group Formation, Proximity Detection and Data Exchange. Initially, each device checks its currently available resources: the remaining battery level, in-use memory and CPU. Then, it advertises a service that is discoverable by nearby devices, publishing its identifier and the available resources. As soon as the service is advertised successfully, the device spends a period of time discovering and caching similar services advertised by nearby devices. When the services discovery period is over, the Group Formation process starts: all the devices agree on which device has the most powerful resources and delegate the Supernode role to it. To do so, each device compares the resources information from all the cached services, including its own resources. At the end of this process, all the devices stop advertising their service, the supernode device starts a sockets server and the rest of the devices connect to it, as demonstrated in Figure 3.1.

Figure 3.1: Device-to-Device Service group formation: demonstrates the service’s flow between two devices, Alice’s and Bob’s. After both devices have advertised and discovered each other, Bob’s device is selected as the supernode since it has better resources. Therefore, it starts a sockets server and Alice’s device connects to it.
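The Group Formation step can be sketched as a deterministic election: every device applies the same comparison to the same set of cached advertisements and therefore reaches the same result without further negotiation. The field names, the scoring weights and the tie-break by device identifier are assumptions of this sketch; the design above only states that the device with the most powerful resources is selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """Resources a device publishes with its service (hypothetical fields)."""
    device_id: str
    battery_pct: float      # remaining battery, 0-100
    free_memory_pct: float  # memory not in use, 0-100
    idle_cpu_pct: float     # CPU not in use, 0-100


def resource_score(ad):
    # Assumed weighting; how the resources are combined is not specified here.
    return 0.5 * ad.battery_pct + 0.25 * ad.free_memory_pct + 0.25 * ad.idle_cpu_pct


def elect_supernode(ads):
    """Return the device_id every device would pick from the same advertisements."""
    # Ties are broken by device_id so the result never depends on list order.
    winner = max(ads, key=lambda ad: (resource_score(ad), ad.device_id))
    return winner.device_id


if __name__ == "__main__":
    alice = Advertisement("alice", battery_pct=40, free_memory_pct=30, idle_cpu_pct=50)
    bob = Advertisement("bob", battery_pct=90, free_memory_pct=60, idle_cpu_pct=70)
    print(elect_supernode([alice, bob]))
```

Each device would include its own advertisement in the list; the elected device starts the sockets server and all others connect to it.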


Thereafter, before the connected devices start exchanging data, the supernode starts the proximity detection process to verify that the data stays confined to the devices in proximity. This process involves collecting the clients’ context information and comparing it with the supernode’s own. Once a client’s proximity is verified, it is able to communicate with all the other devices; otherwise, it is disconnected. The supernode performs the proximity detection process periodically to verify that all the devices are still in proximity. Another responsibility of the supernode is advertising the currently formed group. This way, any nearby device can discover the current group and request to join it. Whenever a new device connects to the current group, its proximity is verified in the same way before it can communicate with the rest of the devices.

In the proximity service, the communication between the devices is secure; a device needs a secret key to decrypt the exchanged messages and files. The secret keys are managed by the supernode, which generates a new secret key every time a proximity detection process is started. Once a client device’s proximity is verified, the supernode sends it the secret key in a welcome message.
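The key handling described above can be sketched as follows. The class name, message layout and key length are hypothetical, and the sketch leaves open how the welcome message itself is protected in transit, since that is not specified here.

```python
import secrets


class SupernodeKeys:
    """Issues one secret key per proximity detection round (illustrative only)."""

    def __init__(self, key_bytes=32):
        self.key_bytes = key_bytes
        self.current_key = None

    def start_proximity_round(self):
        # A fresh random key per round: devices that are not re-verified
        # never learn the key that protects later messages.
        self.current_key = secrets.token_bytes(self.key_bytes)
        return self.current_key

    def welcome(self, client_id, in_proximity):
        """Return the welcome message for a verified client, else None."""
        if self.current_key is None:
            raise RuntimeError("no proximity round has been started")
        if not in_proximity:
            return None  # the caller is expected to disconnect this client
        return {"type": "welcome", "to": client_id, "secret_key": self.current_key}


if __name__ == "__main__":
    keys = SupernodeKeys()
    keys.start_proximity_round()
    print(keys.welcome("alice", in_proximity=True)["type"])
    print(keys.welcome("mallory", in_proximity=False))
```

Rotating the key on every round ties access to the most recent proximity verification, which is what lets membership lapse automatically when a device leaves.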

3.3 Proximity Detection

The efficiency and accuracy of the Proximity Detection process are crucial for the service to accept users into a meeting. As reviewed in Section 2.2, there are various approaches to determining a user’s location, including tag-based systems, WiFi and audio. Based on that knowledge, we designed our proximity detection process to rely on the results of multiple techniques to conclude a device’s proximity.

The basic requirement of our service is to detect a device’s proximity without any additional hardware or infrastructure. To fulfill this requirement, our service leverages the mobile device’s commonly available microphone and WiFi adapter to continuously sense and exchange the surrounding area’s context information. Although this information differs from one technique to another, the generic flow in our service is the same for all cases. The process starts with the supernode requesting a specific type of context information from a client device, such as WiFi scans or an audio recording. Both devices collect the requested context information and extract useful features. As soon as the features are extracted, the client sends them to the supernode. As part of its role, the supernode compares the client’s extracted features against its own using multiple similarity and distance measurement methods and decides, based on a threshold from pre-evaluation, whether the devices are in proximity or not.

Figure 3.2: Service roles use-case scenarios: an overview of the responsibilities of the Supernode and Client roles, diagramming the supernode’s and the client’s devices as actors. Each line links an actor with a use-case that it interacts with.

To determine a device’s proximity, the supernode first derives a coarse proximity estimation for the client from the WiFi context information. Afterwards, it requests ambient sound features to obtain a finer estimation. If the estimation is not certain enough, the supernode broadcasts an ultrasound message and asks the client to decode it in order to confirm its proximity. Figure 3.3 shows a sequence diagram of the interactions between a Supernode’s and a Client’s device to infer their proximity.



Figure 3.3: Proximity Detection process’ sequence diagram.
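The staged decision described in Section 3.3 (coarse WiFi estimate, finer ambient sound estimate, ultrasound confirmation when uncertain) can be sketched as below. All threshold values, the normalization of similarities to the range 0 to 1, and the early rejection on a clear mismatch are assumptions of this sketch; the actual thresholds come from pre-evaluation.

```python
def infer_proximity(wifi_similarity, request_audio_similarity, confirm_with_ultrasound,
                    wifi_min=0.5, audio_accept=0.8, audio_reject=0.4):
    """Staged proximity decision made by the supernode for one client.

    wifi_similarity: similarity of the WiFi features, assumed in [0, 1].
    request_audio_similarity: callable that requests ambient sound features
        from the client and returns their similarity, assumed in [0, 1].
    confirm_with_ultrasound: callable that broadcasts an ultrasound message
        and returns True if the client decoded it correctly.
    """
    # Stage 1: coarse estimate from WiFi (rejecting here is an assumption).
    if wifi_similarity < wifi_min:
        return False
    # Stage 2: finer estimate from ambient sound, requested only when needed.
    audio_similarity = request_audio_similarity()
    if audio_similarity >= audio_accept:
        return True
    if audio_similarity < audio_reject:
        return False
    # Stage 3: the estimate is not certain enough, so probe actively.
    return bool(confirm_with_ultrasound())


if __name__ == "__main__":
    print(infer_proximity(0.9, lambda: 0.6, lambda: True))
```

Passing the later stages as callables keeps the more expensive audio recording and the audible-range probing from running when an earlier stage already settles the decision.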


4 Proximity Service Implementation

This chapter discusses the implementation of our proximity-based service, which realizes the use-cases presented in the service design in Chapter 3. Our service is a standalone Android application that uses WiFi-Direct technology to establish direct communication between devices. The aim is to implement a service that combines passive WiFi and sound sensing with active ultrasound probing in order to obtain a fine proximity estimation of the nearby devices. This chapter first provides an overview of Android’s architecture and the WiFi-P2P API. The WiFi-P2P API is the framework used for service advertising and discovery as well as for establishing a D2D communication channel between devices via WiFi-Direct. Next, the communication model, including the exchange of requests, responses and data messages, is explained. The remainder of this chapter describes the implementation of each proximity detection algorithm, showing how useful features are extracted and utilized in proximity reasoning.

4.1 Architecture Overview

Android applications are built with the Java programming language and the Extensible Markup Language (XML). Java is the main language used to implement the application's state, logic and layout. It is an object-oriented programming language that is compiled into bytecode, which the Android runtime can read and execute. XML is used to design the layout and controls of the user interface. The layout and controls include activities, fragments and any visual component such as buttons, which are backed by Java classes depicting what the user sees.

The architecture of our application is composed of the following main components:

• Layouts: responsible for rendering the User Interface (UI) and communicating with the controller when the user interacts with the application.
• Layout Controllers: each layout is linked to a controller that ties the application together. It is the class that defines the application's logic. As an example: when a user taps on the "start meeting" button, the layout notifies the controller about this event, and the controller starts the service advertising and discovery process.
• Services: a service is a class that performs a specific action, such as scanning the available access points, and notifies any subscribed listeners about the result. In our application there is a service for each of the following:
  – Resources management: checks the device's available memory, CPU and battery resources.
  – WiFi-P2P interactions: offers the ability to advertise a service, discover nearby services and establish a connection.
  – Devices communication: realizes the sockets in terms of hosting a server and accepting clients' connections. Additionally, it manages the exchange of messages.
  – WiFi sensing: provides the ability to scan for an up-to-date list of the available access points.
  – Sound sensing: records the ambient sound and extracts useful features from it by applying different techniques that will be described in Subsection 4.6.
  – Ultrasound broadcasting: takes a text message as an input, encodes it into a high-frequency sound signal and broadcasts it.
• Event Listeners: an event listener is an interface that contains one or more callback methods. These methods are called by a service or controller when a specific event occurs, in order to notify any subscriber.
• Fingerprinting Component: includes the logic to validate the devices' proximity by processing different requests using the application's services. In other words, for each client device, it manages the flow of context information requests and responses and produces the proximity estimation for that device.
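The relation between services and event listeners described in the component list can be illustrated with a minimal sketch. All names in it (ScanResultListener, WifiScanService, publishScanResult) are hypothetical and chosen for illustration only; they are not the classes of the actual application, and the scan itself is replaced by a value passed in by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical callback interface: notified when a WiFi scan has finished. */
interface ScanResultListener {
    void onScanCompleted(List<String> bssids);
}

/** Hypothetical service: performs one action and notifies its subscribers. */
final class WifiScanService {
    private final List<ScanResultListener> listeners = new ArrayList<>();

    /** Registers a subscriber, for example a layout controller. */
    void subscribe(ScanResultListener listener) {
        listeners.add(listener);
    }

    /** Stand-in for a real scan: the result is simply handed in by the caller. */
    void publishScanResult(List<String> bssids) {
        for (ScanResultListener listener : listeners) {
            listener.onScanCompleted(bssids);
        }
    }

    public static void main(String[] args) {
        WifiScanService service = new WifiScanService();
        service.subscribe(bssids ->
                System.out.println("Visible access points: " + bssids.size()));
        service.publishScanResult(Arrays.asList("aa:bb:cc:dd:ee:01", "aa:bb:cc:dd:ee:02"));
    }
}
```

Keeping the callback in an interface means a service does not need to know which controller or component consumes its result.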

4.2 Description of WiFi-P2P API

The Android framework supports WiFi-Direct through the WiFi-P2P API, available since version 4.0 (API level 14). Using the API, any device can advertise a service that nearby devices can discover without being connected to an existing network. The service can carry meta-data describing what it provides. The WiFi-P2P API consists of the following main parts [69]:

• The WifiP2pManager class, available in the android.net.wifi.p2p package, interacts with the device's hardware to manage peer discovery, advertising and connection.
• Listeners, such as WifiP2pManager.ActionListener, report the success or failure of WifiP2pManager method calls.
• Intents are broadcast when certain WiFi P2P events happen, such as when a new peer is discovered or when a device's WiFi state changes.

API Setup

In order to use the API, the application needs to request the following permissions to access the device's WiFi hardware, and to register broadcast receivers for the WiFi-P2P intents (see table 4.1) [69]:

• INTERNET to open network sockets. WiFi P2P does not require an internet connection, but it does use standard Java sockets, which require the INTERNET permission.
• ACCESS_WIFI_STATE and CHANGE_WIFI_STATE to access information about WiFi networks and change the WiFi connectivity state, for example to retrieve a list of the available access points.
• ACCESS_NETWORK_STATE and CHANGE_NETWORK_STATE to access information about networks and change the network connectivity state, usually used for monitoring the network connections.

Table 4.1: Wi-Fi P2P Intents: notify registered applications when the connection or the peers state changes [69].

Intent                                 Description
-------------------------------------  ----------------------------------------------------------------------
WIFI_P2P_CONNECTION_CHANGED_ACTION     Broadcast when the state of the device's Wi-Fi connection changes.
WIFI_P2P_PEERS_CHANGED_ACTION          Broadcast after discoverPeers() is called, indicating a change in the nearby devices.
WIFI_P2P_STATE_CHANGED_ACTION          Broadcast when Wi-Fi P2P is enabled or disabled on the device.
WIFI_P2P_THIS_DEVICE_CHANGED_ACTION    Broadcast when a device's details have changed, such as the device's name.

When the state of the WiFi P2P component changes, an intent is created and broadcast to the system. The system delivers the intent to the broadcast receivers that are registered to receive updates about it. The first step in registering for a WiFi-P2P intent is to create a class that extends Android's abstract BroadcastReceiver class. This class holds the logic that our application runs whenever a specific intent is triggered: when the system broadcasts an intent, the onReceive method is called. Listing 4.1 illustrates the intents handled in our application's BroadcastReceiver class.

Listing 4.1: Broadcast Receiver Definition

    import android.content.BroadcastReceiver;
    import android.content.Context;
    import android.content.Intent;
    import android.net.wifi.p2p.WifiP2pManager;

    /**
     * A BroadcastReceiver that notifies of important Wi-Fi p2p events.
     */
    public class WiFiDirectBroadcastReceiver extends BroadcastReceiver {
        @Override
        public void onReceive(Context context, Intent intent) {
            String action = intent.getAction();
            if (WifiP2pManager.WIFI_P2P_STATE_CHANGED_ACTION.equals(action)) {
                // Check to see if Wi-Fi is enabled and notify appropriate activity
            } else if (WifiP2pManager.WIFI_P2P_PEERS_CHANGED_ACTION.equals(action)) {
                // Call WifiP2pManager.requestPeers() to get a list of current peers
            } else if (WifiP2pManager.WIFI_P2P_CONNECTION_CHANGED_ACTION.equals(action)) {
                // Respond to new connection or disconnections
            } else if (WifiP2pManager.WIFI_P2P_THIS_DEVICE_CHANGED_ACTION.equals(action)) {
                // Respond to a change of this device's details
            }
        }
    }

The next step in registering for a WiFi-P2P intent is to create an IntentFilter and register it along with an instance of the BroadcastReceiver as shown in listing 4.2. The IntentFilter is used to specify which actions our application is interested in.

Listing 4.2: Broadcast Receiver Registration

    // Create new instance of IntentFilter
    IntentFilter mIntentFilter = new IntentFilter();
    // Express the interest in WiFi-P2P intents
    mIntentFilter.addAction(WifiP2pManager.WIFI_P2P_STATE_CHANGED_ACTION);
    mIntentFilter.addAction(WifiP2pManager.WIFI_P2P_PEERS_CHANGED_ACTION);
    mIntentFilter.addAction(WifiP2pManager.WIFI_P2P_CONNECTION_CHANGED_ACTION);
    mIntentFilter.addAction(WifiP2pManager.WIFI_P2P_THIS_DEVICE_CHANGED_ACTION);
    // Create new instance of WiFiDirectBroadcastReceiver
    WiFiDirectBroadcastReceiver mReceiver = new WiFiDirectBroadcastReceiver();
    // Register the receiver and its intent filter with our application
    // (called from within an Activity or Service, which provides getApplicationContext())
    getApplicationContext().registerReceiver(mReceiver, mIntentFilter);

After setting up the permissions and registering the broadcast receiver, the application is now ready to start using the API, such as discovering services (see table 4.2 for the available methods).

4.3 Service Advertising and Discovery

Service discovery is one of the fundamental phases in establishing D2D communication. For a device to be discovered, it first needs to advertise its own service so that other devices can find it. The devices have to be aware of each other's existence in order to establish a connection and communicate. The API provides an easy way to create and advertise a Bonjour service instance over the WiFi P2P network. Bonjour services are supported by various devices including smart-phones, printers and webcams. The initial step in advertising a service is to instantiate a service object using the WifiP2pDnsSdServiceInfo class (illustrated in listing 4.3) and define the following fields:

• Service Name: the name visible to other devices on the network, that is, to any device that uses NSD to look for local services.
• Service Type: specifies which protocol and transport layer the application uses. The syntax is "_<protocol>._<transportlayer>", where the protocol can be any of the available protocols at [70].


Table 4.2: WiFi P2P Methods: allow the interaction with the WiFi hardware [69].

Method                 Description
---------------------  ----------------------------------------------------------------------
initialize             Registers the application with the Wi-Fi framework. This must be called before calling any other Wi-Fi P2P method.
connect                Starts a peer-to-peer connection with a device with the specified configuration.
cancelConnect          Cancels any ongoing peer-to-peer group negotiation.
requestConnectionInfo  Requests a device's connection information.
createGroup            Creates a peer-to-peer group with the current device as the group owner.
removeGroup            Removes the current peer-to-peer group.
requestGroupInfo       Requests peer-to-peer group information.
discoverPeers          Initiates peer discovery.
requestPeers           Requests the current list of discovered peers.
addLocalService        Registers a local service for service discovery.
addServiceRequest      Adds a service discovery request.
discoverServices       Initiates service discovery to browse for instances of a service type.

• TXT Map: meta-data in the form of key/value pairs, used to publish any information that is useful to the devices that discover the service. In our implementation, for instance, the device's currently available resources are published as shown in listing 4.4.

Listing 4.3: DNS Instance Initialization

    // Service name that appears to all devices
    String serviceName = "Proximity-Meeting";
    // "_<protocol>._<transportlayer>": protocol and transport layer
    // "presence" is a P2P messaging protocol over the TCP layer
    String serviceType = "_presence._tcp";
    // Can be any data, such as the device's available resources
    Map<String, String> metadata = new HashMap<>();
    metadata.put("TotalMemory", String.valueOf(totalMemory));
    // Init service info object
    WifiP2pDnsSdServiceInfo serviceInfo =
            WifiP2pDnsSdServiceInfo.newInstance(serviceName, serviceType, metadata);

Listing 4.4: DNS Service Meta-data: publishing the device's available resources

    {
        IsCharging = true,
        BatteryScale = 100,
        BatteryLevel = 61,
        TotalCPU = 28855,
        InUseCPU = 15944,
        CPUPercentage = 55.26,
        MemoryPercentage = 54.75,
        AvailableMemory = 1071,
        TotalMemory = 1945
    }

Afterwards, the service instance is advertised for discovery using the addLocalService method of WifiP2pManager. To discover a service, it is necessary to implement and register the following WifiP2pManager listeners, which receive the information of any discovered service:

• DnsSdServiceResponseListener: invoked when a new service is discovered.
• DnsSdTxtRecordListener: invoked when a service's shared meta-data are available.

The information about a service consists of the service's advertised fields and a WifiP2pDevice instance that represents the device hosting the service, as shown in table 4.3. Service discovery is triggered by calling the discoverServices method.

Table 4.3: WifiP2pDevice object properties.

Field                Description
-------------------  ----------------------------------------------------------------------
deviceAddress        The device MAC address, which uniquely identifies a Wi-Fi p2p device.
deviceName           A user-friendly string that identifies a Wi-Fi p2p device.
primaryDeviceType    Identifies the type of the device.
secondaryDeviceType  An optional attribute that a device can provide in addition to the primary device type.
status               Device connection status.

4.4 Group Formation

In our service there are two types of services, PROXIMITY-MEETING and PROXIMITY-MEETING-HOST. A PROXIMITY-MEETING service is advertised by the devices that are trying to start a new meeting. In the Group Formation process, each device starts discovering services using the WiFi-P2P API methods. Once the discovery is completed, each device analyzes the cached services in a process called Supernode Negotiation. In this process, the services' meta-data, which represent each device's resources, are compared against each other. The outcome is the election of the supernode, the device with the most powerful resources. Eventually, all the devices connect to the elected supernode to establish a meeting.

Listing 4.5 illustrates how a device sends a connection request to the supernode device using the WiFi-P2P API. A crucial configuration property is groupOwnerIntent. It takes a value between 0 and 15 and expresses how strongly the device requesting the connection wants to take over the WiFi-Direct Group Owner (GO) role, where 0 indicates the least and 15 the highest inclination. Thus, when connecting to the elected supernode device, this value should be set to 0 in order to explicitly express that the current device does not intend to take over that role.

Listing 4.5: Connecting to the Supernode Device

    WifiP2pConfig config = new WifiP2pConfig();
    // Set the supernode's address
    config.deviceAddress = supernodeDevice.deviceAddress;
    config.wps.setup = WpsInfo.PBC;
    // Set the current device's intent to take over the group owner role to 0
    config.groupOwnerIntent = 0;
    wifiDirectService.connect(config, new WifiP2pManager.ActionListener() {
        @Override
        public void onSuccess() {
            // Connection request sent
        }

        @Override
        public void onFailure(int reason) {
            // Connection request failed
        }
    });
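The Supernode Negotiation that precedes this connection request can be sketched as follows. The text only states that the device with the most powerful resources is elected; the scoring function, its weights and the tie-break by MAC address in this sketch are assumptions made for illustration, and DeviceResources and SupernodeElection are hypothetical names. The fields mirror some of the meta-data keys of listing 4.4.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical view of the resources a device publishes in its TXT map. */
final class DeviceResources {
    final String deviceAddress;  // MAC address, used here only as a tie-breaker
    final int batteryLevel;      // percent, 0..100
    final boolean charging;
    final double cpuPercentage;  // CPU currently in use, percent
    final int availableMemory;   // same unit as AvailableMemory in listing 4.4

    DeviceResources(String deviceAddress, int batteryLevel, boolean charging,
                    double cpuPercentage, int availableMemory) {
        this.deviceAddress = deviceAddress;
        this.batteryLevel = batteryLevel;
        this.charging = charging;
        this.cpuPercentage = cpuPercentage;
        this.availableMemory = availableMemory;
    }

    /** Illustrative score only: the weights are assumptions, not the thesis' values. */
    double score() {
        double battery = charging ? 100.0 : batteryLevel;
        double freeCpu = 100.0 - cpuPercentage;
        return battery + freeCpu + availableMemory / 100.0;
    }
}

final class SupernodeElection {
    /**
     * Returns the candidate with the highest score, or null for an empty list.
     * Ties are broken by the smaller MAC address, so the result does not depend
     * on the order in which the services were discovered.
     */
    static DeviceResources elect(List<DeviceResources> candidates) {
        DeviceResources best = null;
        for (DeviceResources c : candidates) {
            if (best == null
                    || c.score() > best.score()
                    || (c.score() == best.score()
                        && c.deviceAddress.compareTo(best.deviceAddress) < 0)) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<DeviceResources> devices = Arrays.asList(
                new DeviceResources("02:00:00:00:00:01", 61, false, 55.26, 1071),
                new DeviceResources("02:00:00:00:00:02", 30, true, 80.0, 500));
        System.out.println("Elected supernode: " + elect(devices).deviceAddress);
    }
}
```

Because every device evaluates the same published meta-data with the same deterministic rule, all devices that see the same set of services reach the same decision without exchanging further messages.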

Once a meeting is established, all the devices stop advertising their services and the supernode device starts advertising a PROXIMITY-MEETING-HOST service to inform nearby devices of the meeting's existence. Advertising two types of services has two advantages: a device interested in joining an already established meeting can differentiate between a supernode device and other devices trying to start a new meeting, and a group of devices can start a new meeting without interfering with one that has already started.

In our application, a device can either create a new meeting along with other devices or join an already established meeting.

When creating a new meeting, each user is prompted to disclose whether they intend to stay for the entire meeting. The choice affects the process: if the user is staying for the entire meeting, the device advertises a PROXIMITY-MEETING service and joins the supernode negotiation. Otherwise, the application refrains from advertising a service and starts discovering services immediately. This reduces the likelihood that the meeting's group is destroyed because the supernode device leaves. A possible case in this scenario is that the device does not discover any PROXIMITY-MEETING service but discovers one or more PROXIMITY-MEETING-HOST services. In this case, the application asks the user whether to join one of the existing meetings or to cancel the process.

When joining an already established meeting, the application starts discovering services of type PROXIMITY-MEETING-HOST. If a single service is discovered, the application automatically attempts to connect to it; if multiple services are discovered, the application prompts the user to choose a meeting from the list of discovered services. To differentiate between services, each service is presented with the name and the MAC address of the device hosting it. The user can cancel the service selection, in which case the application cancels the process. A possible case in this scenario is that the device does not discover any PROXIMITY-MEETING-HOST service but discovers one or more PROXIMITY-MEETING services. When this happens, the application asks the user whether to create a new meeting with those devices. Figure 4.1 depicts the flow of both scenarios.

4.5 Communication Flow

Once a connection is established, the WiFi-P2P API notifies each device of the group formation via the WifiP2pManager.ConnectionInfoListener. The listener provides a WifiP2pInfo object that contains the formed group's information, including the GO device's IP address and a flag that specifies whether the current device is the GO. At this moment, the communication channel between the devices is established; the GO device sets up a virtual access point which assigns an IP address to each device connected to it.



Figure 4.1: Supernode Negotiation Flow Diagram: describes the flow of supernode negotiation process for both scenarios; creating a new and joining an existing meeting.


Afterwards, the devices need a way to communicate and exchange data. Our D2D communication model is based on sockets: the GO device starts a socket server and the remaining devices connect to it as clients. Exchanged data are serialized into JavaScript Object Notation (JSON), which allows complex objects to be transferred efficiently, and then transformed into bytes. In our service there are three main types of messages:

• FingerprintRequest: used by the supernode to request a fingerprint from a client. It consists of a request type, such as WiFi, Sound or Ultrasound, as well as a user-friendly message. When a client receives a request, it checks its type and behaves accordingly, and also displays the request message to the user. Listing 4.6 shows the JSON representation of a WiFi fingerprint request.
• FingerprintResponse: used by the client to return the generated fingerprint to the supernode. It consists of a response type, such as WiFi, Sound or Ultrasound, in addition to the content of the generated fingerprint. When the supernode receives a response, it checks its type and then compares the client's fingerprint with its own.
• DataMessage: used by all the devices to exchange text messages or files. It consists of a message type, which determines whether it contains a text message or a data file, in addition to the message data as an array of bytes. When a device receives a data message, it checks its type and parses the data accordingly. Additionally, a supernode device broadcasts the same message to all the clients.

Listing 4.6: WiFi Fingerprint JSON Representation

    {
        "message": "Send me the access points information around you!",
        "requestType": "WiFi"
    }
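As an illustration of the FingerprintRequest message, the following sketch builds the JSON of listing 4.6 by hand. It is only a sketch: the class layout, the enum and the hand-rolled toJson method are assumptions made to keep the example free of dependencies, whereas the service itself may well rely on a JSON library. The escaping covers only quotes and backslashes.

```java
/** Hypothetical shape of the request message sent by the supernode. */
final class FingerprintRequest {
    /** Request types named in the text. */
    enum RequestType { WiFi, Sound, Ultrasound }

    final RequestType requestType;
    final String message;

    FingerprintRequest(RequestType requestType, String message) {
        this.requestType = requestType;
        this.message = message;
    }

    /** Minimal serialization; control characters in the message are not handled. */
    String toJson() {
        return "{\"message\":\"" + escape(message) + "\","
                + "\"requestType\":\"" + requestType.name() + "\"}";
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        FingerprintRequest request = new FingerprintRequest(
                RequestType.WiFi, "Send me the access points information around you!");
        // The resulting string would be converted to bytes and written to the socket.
        System.out.println(request.toJson());
    }
}
```

On the receiving side, a client would inspect requestType first and then start the matching sensing service.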



4.6 Proximity Detection

Any sensor that can detect the presence of an object without any physical contact can be used as a proximity sensor [71]. Proximity sensors vary in their range and in the accuracy they provide. Based on the knowledge gained from studying the main concepts and reviewing prior research, we investigate the feasibility of passively sensing WiFi and audio signals, in addition to actively probing with ultrasound signals, to infer users' proximity using the hardware available on the mobile device.

Detecting devices in proximity is a crucial responsibility, which is assigned to the supernode. Whenever a new device connects, the supernode device starts verifying the client's proximity by requesting different types of context information, such as the list of available access points and features extracted from the ambient sound. As a result of the verification process, the supernode determines whether the client is in proximity. It then either accepts the client and allows it to exchange messages with the meeting members, or rejects it and terminates the connection. In general, deciding whether a device is in proximity is threshold-based: for each measurement, a minimum threshold is defined that indicates that a device is in proximity. In this section, the design and implementation of each approach our service uses as a proximity indicator are reviewed in detail.

4.6.1 Proximity Detection via WiFi

Mobile devices are able to scan the list of nearby access points. A WiFi access point exposes information that identifies it, such as its SSID and BSSID, and information about its signal, such as its frequency and received signal strength indicator (RSSI). As discussed in Subsection 2.2.6, the existence of one or more WiFi access points enables multiple techniques for inferring a device's location. For these techniques to work, there has to be at least one access point, and the more access points are deployed, the better the results can be. Since our service uses WiFi-Direct, there is always at least one access point available: the GO's access point. Therefore, WiFi localization techniques remain applicable even when no other access points are deployed.

The flow of inferring devices' proximity via WiFi information starts with the supernode device sending a FingerprintRequest message with the request type "WiFi" to a client device. When the client device receives the message, both the supernode and the client start scanning the available WiFi access points. Once the scan result is available, each device generates a WifiFingerprintResult object. This object holds the information about the scanned WiFi access points, including each access point's SSID, BSSID, frequency and RSSI. The client device wraps this object in a FingerprintResponse object, serializes it and sends it to the supernode device. When the supernode device receives the response, it parses it and applies different techniques to measure the percentage of similarity between both devices' scan results and to estimate the distance between the two devices.

The techniques used to measure the similarity of scan results vary in terms of the information used and the method applied. The simplest similarity technique is Access Point Presence Similarity. It takes two lists of scanned access points and applies the Jaccard similarity coefficient to the access points' BSSID values, regardless of their RSSI values. The Jaccard similarity coefficient [72] is a statistical measure of similarity between two sets. It is defined as the size of their intersection divided by the size of their union, as given in equation 4.1. An advantage of the Jaccard similarity coefficient is that its formula is not affected by an increase in the number of visible access points.
The resultant value lies between 0 and 1; the greater the value, the higher the probability that both devices are in proximity.

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{4.1}$$

where A and B are two sets.
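Equation 4.1 applied to two sets of BSSIDs can be sketched as follows. The class and method names are hypothetical, and returning 0 when both scans are empty is an assumption of this sketch, since the equation is undefined in that case.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class AccessPointPresence {
    /** Jaccard similarity coefficient of two BSSID sets, in the range [0, 1]. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // assumption: no visible access point gives no evidence of proximity
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> supernode = new HashSet<>(Arrays.asList("ap-1", "ap-2", "ap-3"));
        Set<String> client = new HashSet<>(Arrays.asList("ap-2", "ap-3", "ap-4"));
        // Two shared access points out of four distinct ones.
        System.out.println(jaccard(supernode, client));
    }
}
```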

Although RSSI is not considered a reliable indicator of the distance between two receivers [73], two devices in proximity are expected to capture highly similar RSSI values for the shared access points. The remaining techniques therefore take each access point's RSSI value into account. Before applying any of them, the supernode reviews its own scan results and the client's: whenever an access point is missing from one of the lists, it is added to the list that lacks it with an RSSI value of -200 dBm as a penalty. The RSSI range is between 0 and -100 dBm; the higher the value, the stronger the signal. Since -100 dBm is the minimum value at which an access point is visible, a value of -200 dBm is chosen as a penalty to amplify its effect and to differentiate it from a weak access point.

Once this review of the access points is completed, another technique is applied that uses an extended version of the Jaccard similarity coefficient, called RSSI ±10. This technique takes each access point's BSSID and RSSI into account together. When counting the intersection and union of both lists, it checks whether an access point with the same BSSID exists and whether its RSSI value is within a range of ±10 dBm. As an example, suppose the supernode and the client have both captured three access points A, B and C, with values of {-60, -65, -50} in the supernode's scan results and {-70, -59, -63} in the client's scan results, respectively. Using this technique, access points A and B are considered present, or similar, for both devices, but access point C is not matched since the difference between both readings is more than 10 dBm. The motivation for using ±10 dBm as the tolerance is that, in some cases, the orientation in which the device is held and the blocking and reflection of radio signals change the received signal strength of two adjacent devices by 4 to 10 dBm [74, 29].
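The penalty step and the RSSI ±10 technique can be sketched as follows. The exact counting rule is an assumption of this sketch: the intersection is taken as the number of BSSIDs present in both scans whose RSSI values differ by at most 10 dBm, and the union as the number of distinct BSSIDs. The names RssiSimilarity, withPenalties and rssiPlusMinus10 are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class RssiSimilarity {
    static final int MISSING_RSSI_DBM = -200; // penalty value from the text
    static final int TOLERANCE_DBM = 10;      // tolerance from the text

    /** Copy of own scan, extended by every BSSID only the other scan saw, at the penalty RSSI. */
    static Map<String, Integer> withPenalties(Map<String, Integer> own,
                                              Map<String, Integer> other) {
        Map<String, Integer> filled = new HashMap<>(own);
        for (String bssid : other.keySet()) {
            if (!filled.containsKey(bssid)) {
                filled.put(bssid, MISSING_RSSI_DBM);
            }
        }
        return filled;
    }

    /** Extended Jaccard: matching access points divided by all distinct access points. */
    static double rssiPlusMinus10(Map<String, Integer> a, Map<String, Integer> b) {
        Set<String> union = new HashSet<>(a.keySet());
        union.addAll(b.keySet());
        if (union.isEmpty()) {
            return 0.0; // assumption: empty scans give no evidence of proximity
        }
        int matched = 0;
        for (String bssid : union) {
            Integer rssiA = a.get(bssid);
            Integer rssiB = b.get(bssid);
            if (rssiA != null && rssiB != null
                    && Math.abs(rssiA - rssiB) <= TOLERANCE_DBM) {
                matched++;
            }
        }
        return (double) matched / union.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> supernode = new HashMap<>();
        supernode.put("A", -60); supernode.put("B", -65); supernode.put("C", -50);
        Map<String, Integer> client = new HashMap<>();
        client.put("A", -70); client.put("B", -59); client.put("C", -63);
        // A and B match, C differs by 13 dBm.
        System.out.println(rssiPlusMinus10(supernode, client));
    }
}
```

An access point that received the -200 dBm penalty can never fall within the tolerance, because the real reading on the other device is at least -100 dBm, so penalized entries count toward the union only.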
Furthermore, correlation-based measurements are commonly used to conclude similarity or relation between two series. These techniques take two lists of RSSI values and the result is a value between 1 and -1, where 1 refers to high correlation, 0 means the absence of a correlation and -1 indicates negative correlation. The methods implemented in our service are:

Cross-Correlation Normalized [75, 76] is a measure of similarity of two series as a function of the displacement of one relative to the other. It is commonly used for searching a long signal for a shorter known feature. Mathematically represented as:

$$xcorr(x, y) = \frac{\sum_{n=0}^{N-1} x[n]\, y[n]}{\sqrt{\sum_{n=0}^{N-1} x^2[n] \sum_{n=0}^{N-1} y^2[n]}} \tag{4.2}$$

where x and y are two vectors.

Pearson's correlation coefficient [77] is one of the standard procedures in pattern recognition used to describe the degree of overlap between two patterns. It provides information about the similarity of shape without regard to the average intensity of the signals. Mathematically represented as:

$$\rho_{X,Y} = \frac{cov(X, Y)}{\sigma_X \sigma_Y} \tag{4.3}$$

where cov is the covariance, σX is the standard deviation of X and σY is the standard deviation of Y.
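Equations 4.2 and 4.3 can be sketched for two equally long RSSI lists as follows. The class name is hypothetical, and returning 0 when a denominator is zero (for example for a constant series) is an assumption of this sketch.

```java
final class Correlation {
    /** Normalized cross-correlation without displacement, as written in equation 4.2. */
    static double normalizedCrossCorrelation(double[] x, double[] y) {
        requireSameLength(x, y);
        double xy = 0.0, xx = 0.0, yy = 0.0;
        for (int n = 0; n < x.length; n++) {
            xy += x[n] * y[n];
            xx += x[n] * x[n];
            yy += y[n] * y[n];
        }
        double denominator = Math.sqrt(xx * yy);
        return denominator == 0.0 ? 0.0 : xy / denominator;
    }

    /** Pearson's correlation coefficient, equation 4.3. */
    static double pearson(double[] x, double[] y) {
        requireSameLength(x, y);
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        // The common factor 1/n cancels between covariance and standard deviations.
        double denominator = Math.sqrt(varX * varY);
        return denominator == 0.0 ? 0.0 : cov / denominator;
    }

    private static void requireSameLength(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("series must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] supernode = {-60, -65, -50};
        double[] client = {-70, -59, -63};
        System.out.println(normalizedCrossCorrelation(supernode, client));
        System.out.println(pearson(supernode, client));
    }
}
```

Since RSSI values are all negative, every product in equation 4.2 is positive and the result stays above 0 for such input; Pearson's coefficient subtracts the means first and can therefore also become negative.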

One more approach for inferring the devices' proximity is to measure the distance between the scanned RSSI values using the following methods:

Euclidean Distance [78] is the straight-line distance connecting two points p and q. Mathematically represented as:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2} \tag{4.4}$$

Manhattan Distance [79], known as the L1 distance, calculates the distance between two vectors p and q in an n-dimensional vector space. Mathematically represented as:

$$d(p, q) = \sum_{i=1}^{n} |p_i - q_i| \tag{4.5}$$
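Equations 4.4 and 4.5 can be sketched for two RSSI vectors as follows, together with the division by the number of access points that the service applies afterwards. The class and method names are hypothetical; both vectors are assumed to list the same access points in the same order.

```java
final class RssiDistance {
    /** Euclidean distance, equation 4.4. */
    static double euclidean(double[] p, double[] q) {
        requireSameLength(p, q);
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Manhattan (L1) distance, equation 4.5. */
    static double manhattan(double[] p, double[] q) {
        requireSameLength(p, q);
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            sum += Math.abs(p[i] - q[i]);
        }
        return sum;
    }

    /** Relative distance: the distance divided by the number of access points. */
    static double perAccessPoint(double distance, int accessPointCount) {
        if (accessPointCount <= 0) {
            throw new IllegalArgumentException("at least one access point is required");
        }
        return distance / accessPointCount;
    }

    private static void requireSameLength(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] supernode = {-60, -65, -50};
        double[] client = {-70, -59, -63};
        System.out.println(perAccessPoint(euclidean(supernode, client), supernode.length));
        System.out.println(perAccessPoint(manhattan(supernode, client), supernode.length));
    }
}
```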

After calculating the distance, it is divided by the number of access points to obtain a relative distance value. For this method, the resultant value is expected to grow with the physical distance between the devices: when the distance between the devices increases, the calculated distance increases.

From another perspective, an RSSI value can be used to calculate the distance between the transmitter (access point) and the mobile device. This approach is power-based; it uses radio propagation methods to calculate a path loss model. Although this approach may produce an inaccurate estimate of the distance in meters [80] due to non-line-of-sight conditions, interference and other shadowing effects, it can still provide additional context about the distance between the device and the access point. This measurement is called Free-space path loss (FSPL) [81]. It is defined as the loss in signal strength of an electromagnetic wave as a result of travelling in a direct path from the transmitter to the receiver through free space, with no obstacles nearby to cause reflection or diffraction. Mathematically represented as:

$$FSPL(dB) = 20 \log_{10}(d) + 20 \log_{10}(f) - 27.55 \tag{4.6}$$

where f is the signal frequency in megahertz and d is the distance from the transmitter in meters.
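Equation 4.6 and its rearrangement for the distance d can be sketched as follows. The class and method names are hypothetical. How the path loss is obtained from a scan is not specified above; using the absolute RSSI value as the path loss, as in the main method, is an assumption that presumes a transmit power of 0 dBm and no antenna gains.

```java
final class FreeSpacePathLoss {
    /** Equation 4.6: path loss in dB for a distance in meters and a frequency in MHz. */
    static double pathLossDb(double distanceMeters, double frequencyMhz) {
        return 20.0 * Math.log10(distanceMeters)
                + 20.0 * Math.log10(frequencyMhz)
                - 27.55;
    }

    /** Equation 4.6 solved for the distance: d = 10^((FSPL + 27.55 - 20 log10(f)) / 20). */
    static double distanceMeters(double pathLossDb, double frequencyMhz) {
        double exponent = (pathLossDb + 27.55 - 20.0 * Math.log10(frequencyMhz)) / 20.0;
        return Math.pow(10.0, exponent);
    }

    public static void main(String[] args) {
        int rssiDbm = -60;          // example reading
        double frequencyMhz = 2412; // 2.4 GHz band, channel 1
        // Assumption: path loss approximated by |RSSI|.
        System.out.println(distanceMeters(Math.abs(rssiDbm), frequencyMhz));
    }
}
```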

4.6.2 Proximity Detection via Audio

The microphone of a mobile device is a sensor able to capture rich context information about the user's location. A notable motivation for sound sensing with mobile devices is that it does not require any additional hardware. Unlike WiFi signals, sound is largely confined by barriers, which separates the conversations of nearby groups of people. Various features can be extracted from the recorded audio signals, such as the frequency content over time. Our assumption is that the higher the similarity between the audio features extracted from two devices' recordings, the closer the devices are to each other.

Our service extracts three different types of audio features from the recorded ambient sound: Power Spectrogram, Mel-Frequency Cepstral Coefficients (MFCC) and Landmark fingerprint. Each feature requires its own steps to be generated and has specific metrics for measuring the similarity of two audio recordings. Our service's flow for this type of fingerprint starts with the supernode sending a FingerprintRequest message with the request type "Sound" to a client device. When the client device receives the message, both the supernode and the client start recording 10 seconds of audio. As soon as the recording period is completed, each device analyzes the recorded signal in order to generate an audio fingerprint.

Audio Feature: Power Spectrogram

The Power Spectrogram quantifies how the frequency content of the recording varies over time. As illustrated in Figure 4.2, the standard signal processing steps for computing it from an audio signal [83, 82] are:

Figure 4.2: Power Spectrogram extraction steps.

• Splitting the recording into small overlapping frames: all frames have the same length and the same number of samples. The number of frames per second (FPS) is called the frame rate. Typical frame lengths are 10 or 20 ms; shorter frames may not provide sufficient frequency resolution. The number of samples per frame depends on the sampling rate used for the recording; for example, at a sampling rate of 44.1 kHz, 10 ms corresponds to 441 samples.
• Smoothing each frame with a window function: a window function is a mathematical function that is zero-valued outside some chosen interval. Multiplying a frame by a window function changes the signal magnitudes near the frame boundaries. Well-known window functions include Rectangular, Hanning and Hamming.
• Converting the signal to the frequency domain using the Fast Fourier Transform (FFT) [84]: the FFT is a computational tool which facilitates signal analysis, such as power spectrum analysis and filter simulation, by means of digital computers. It is a method for efficiently computing the discrete Fourier transform of a series of data samples (referred to as a time series).
• Power calculation: the FFT yields a complex value for each frequency. To obtain the power of each frequency, the value is multiplied by its complex conjugate, that is, the squares of its real and imaginary parts are summed.

In our service, the Power Spectrogram is generated for the entire recording. In terms of complexity, this feature is the simplest, since it extracts and compares the spectrogram of each recording as a single frame. Additionally, as a preprocessing step, signal normalization is carried out to smooth out sharp areas and bring different recordings onto the same scale. To demonstrate how the power spectrogram changes with the surrounding environment, Figure 4.3 presents the amplitude of a 10-second recording from two different devices attending a meeting in the same room, and Figure 4.4 shows the amplitude of the frequencies in a crowded environment. The resultant spectrograms in Figure 4.3 are clearly highly similar to each other, and unlike that of the crowded environment.
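The framing, windowing and power steps listed above can be sketched as follows. To stay self-contained, the sketch computes a plain discrete Fourier transform instead of an FFT, which gives the same values at a higher computational cost; it also uses the Hamming window, one of the window functions named above. The class and method names are hypothetical, and the sketch processes one frame at a time.

```java
import java.util.Arrays;

final class PowerSpectrum {
    /** Splits a signal into overlapping frames; hop is the offset between frame starts (> 0). */
    static double[][] split(double[] signal, int frameLength, int hop) {
        if (signal.length < frameLength) {
            return new double[0][];
        }
        int count = 1 + (signal.length - frameLength) / hop;
        double[][] frames = new double[count][];
        for (int i = 0; i < count; i++) {
            frames[i] = Arrays.copyOfRange(signal, i * hop, i * hop + frameLength);
        }
        return frames;
    }

    /** Hamming window coefficients for a frame of the given length. */
    static double[] hamming(int length) {
        double[] window = new double[length];
        for (int n = 0; n < length; n++) {
            window[n] = length == 1
                    ? 1.0
                    : 0.54 - 0.46 * Math.cos(2.0 * Math.PI * n / (length - 1));
        }
        return window;
    }

    /** Power of each frequency bin 0..N/2 of one Hamming-windowed frame (naive DFT, O(N^2)). */
    static double[] power(double[] frame) {
        int n = frame.length;
        double[] window = hamming(n);
        double[] power = new double[n / 2 + 1];
        for (int k = 0; k < power.length; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double sample = frame[t] * window[t];
                double angle = 2.0 * Math.PI * k * t / n;
                re += sample * Math.cos(angle);
                im -= sample * Math.sin(angle);
            }
            power[k] = re * re + im * im; // value times its complex conjugate
        }
        return power;
    }

    public static void main(String[] args) {
        double[] signal = new double[256];
        for (int t = 0; t < signal.length; t++) {
            signal[t] = Math.sin(2.0 * Math.PI * 8 * t / 64.0);
        }
        double[][] frames = split(signal, 64, 32); // 50% overlap
        System.out.println(frames.length + " frames, "
                + power(frames[0]).length + " bins per frame");
    }
}
```

Bin k of a frame with N samples recorded at sampling rate fs corresponds to the frequency k · fs / N.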

Figure 4.3: Amplitude of 10 seconds recording for two devices inside the same room. Panels: (a) Device 1, (b) Device 2. Axes: Frequency (Hz) against Amplitude (au).

Figure 4.4: Amplitude of 10 seconds recording of a crowded environment. Axes: Frequency (Hz) against Amplitude (au).

When the processing is completed, the final Power Spectrogram is represented by a series of numbers. A statistical measurement, such as a correlation-based one, can therefore derive the relation and similarity between two such series. The measurements used by our service are:

• Pearson's correlation coefficient and Cross-Correlation Normalized.
• Chi-Square distance [78]: a weighted Euclidean distance that weights each variable by the inverse of the variable's overall mean count. Mathematically represented as:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} \frac{(p_i - q_i)^2}{p_i + q_i}} \tag{4.7}$$

• Cosine similarity [85, 86]: a measure of similarity between two non-zero vectors of an inner product space that measures the cosine of the angle between them. Two vectors with the same orientation have a cosine similarity of 1, two vectors at 90° have a similarity of 0, and two vectors diametrically opposed have a similarity of -1. Mathematically represented as:

$$\cos(\theta) = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\, \sqrt{\sum_{i=1}^{n} B_i^2}} \tag{4.8}$$
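Equations 4.7 and 4.8 can be sketched for two power spectra as follows. The class name is hypothetical. Two choices are assumptions of this sketch: bins where both spectra are zero are skipped in the Chi-Square distance, because the term is undefined there, and the cosine similarity of a zero vector is reported as 0. The Chi-Square distance presumes non-negative input, which holds for power values.

```java
final class SpectrumSimilarity {
    /** Chi-Square distance, equation 4.7, for non-negative vectors such as power spectra. */
    static double chiSquareDistance(double[] p, double[] q) {
        requireSameLength(p, q);
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double denominator = p[i] + q[i];
            if (denominator != 0.0) { // assumption: skip bins that are empty in both spectra
                double diff = p[i] - q[i];
                sum += diff * diff / denominator;
            }
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity, equation 4.8. */
    static double cosineSimilarity(double[] a, double[] b) {
        requireSameLength(a, b);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] device1 = {4.0, 9.0, 1.0};
        double[] device2 = {5.0, 7.0, 2.0};
        System.out.println(chiSquareDistance(device1, device2));
        System.out.println(cosineSimilarity(device1, device2));
    }
}
```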

Audio Feature: Mel-Frequency Cepstral Coefficients (MFCC)

The Mel-Frequency Cepstral Coefficients [87] are designed to mimic human perception [88] by providing fine detail in the frequency bands to which the human ear is sensitive. They are recognized as one of the most important feature sets for audio signal processing [89, 90]. As illustrated in Figure 4.5, the common steps for deriving the coefficients are:

Figure 4.5: Mel-Frequency Cepstral Coefficients (MFCC) extraction steps.

• Generate the Power Spectrogram using the previously mentioned steps, from framing up to computing the power of each frame.
• Map the powers of the obtained spectrum onto the Mel scale [91]. This scale is a perceptual scale of pitches judged by listeners to be equal in distance from one another. The result is a new spectrum, called the Mel spectrum.
• Take the logarithm of each power of the Mel spectrum.
• Convert the log Mel spectrum into the time domain using the discrete cosine transform (DCT) [92]. The DCT is a significant tool in many areas of signal processing, especially for the purposes of pattern recognition and feature extraction [93].
• Compute the Delta Energy and Delta Spectrum, which are the features related to the change in cepstral features over time.

In our service, the MFCCs are extracted and the similarity between two devices' features is measured. The result of extracting the MFCCs is a multidimensional array: a 13-order MFCC vector for each frame. Hence, to measure the similarity between the MFCC vectors of two recordings, the same coefficient in every frame of one recording should be compared against that of the other. The methods used by our service are:

• Dynamic Time Warping (DTW) [94]: a popular technique in speech recognition. It allows a non-linear mapping of one signal to another by minimizing the distance between them. Mathematically represented as:

$$DTW(Q, C) = \min \left\{ \sqrt{\sum_{k=1}^{K} w_k} \right\}$$

where $w_k$ is the matrix element $(i, j)_k$ that is the k-th element of a warping path W, a contiguous set of matrix elements that represents a mapping between Q and C. L. Muda et al. [95] provide a detailed explanation of how to use DTW for matching two MFCC vectors.

• Mahalanobis Distance [97, 96], a measure of the distance between a point P and a distribution D. It measures how many standard deviations P is away from the mean of D. The distance is zero if P is at the mean of D, and grows as P moves away from the mean. Mathematically represented as:

$$d(x, y) = \sqrt{(x - y)^{T} V^{-1} (x - y)} \qquad (4.9)$$

where x and y are two feature vectors with n dimensions each, and V is the variance-covariance matrix of the distribution of the feature vectors.
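To make the DTW definition more concrete, the following is a minimal sketch of the standard dynamic-programming formulation applied to one coefficient track, i.e. the values of a single MFCC coefficient across all frames. It assumes that the warping-path elements $w_k$ are squared differences between aligned values; the function name is hypothetical and the sketch is not the service's actual implementation.

```python
import math
from typing import Sequence


def dtw_distance(q: Sequence[float], c: Sequence[float]) -> float:
    """DTW distance between two 1-D series, using squared differences as local cost."""
    n, m = len(q), len(c)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    # cost[i][j] holds the minimal accumulated cost of aligning q[:i] with c[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (q[i - 1] - c[j - 1]) ** 2
            # extend the cheapest of the three admissible predecessor cells
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # the square root is monotonic, so it can be applied to the minimal sum
    return math.sqrt(cost[n][m])
```

Because the mapping is non-linear, a series and a time-stretched copy of it, such as `[1, 2, 3]` and `[1, 1, 2, 2, 3, 3]`, have a distance of 0.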

Audio Feature: Landmark Fingerprint

The Landmark Fingerprint [59] is generated from the most robust amplitude peaks in a time-frequency analysis of each frame of a recording. Amplitude peaks are known for their robustness in the presence of noise. Thus, the generated fingerprint can be reproduced from another recording of different quality. For a time-frequency point to be considered a peak candidate, it should have the highest energy in comparison to its neighbors. As a result, the spectrogram is transformed into a small set of tuples which are used to generate comparable hash values.

Extracting this fingerprint starts with the standard steps of computing the Power Spectrogram, followed by sorting and processing to identify the amplitude peaks in every frame and to generate a fingerprint that represents the entire recording. In detail, the steps are:

1. Divide the signal into small overlapping frames.
2. Apply a Hamming window function on each frame.
3. Apply FFT and generate the power spectrum for each frame, Figure 4.6a.
4. Normalize each frame’s amplitude.
5. Split each frame’s power spectrogram into smaller segments.
6. Identify the frequency peaks for each segment, Figure 4.6b.
7. Filter out each frame’s low peaks, Figure 4.6c.
8. Identify robust frames, i.e. frames with multiple high peaks (see Listing 4.7 for an example).

Listing 4.7: Landmark Fingerprint Peaks Grouping = [
{frame1_time, frame1_frequency, frame1_amplitude},
{frame2_time, frame2_frequency, frame2_amplitude},
...
{frameN_time, frameN_frequency, frameN_amplitude}]

9. Generate a list of hash values describing the time and frequency relationship of the recording. Each hash value combines the distance between adjacent frames with their frequencies, Figure 4.6d. The distance between frames is the time difference between their occurrences; therefore, distant events do not affect the hash value. Additionally, it is used to overcome synchronization issues. Consequently, there is a hash value for every two adjacent frames. Equation 4.10 shows how a hash value is generated from two frames (n and m) and their frequencies. Note the {padding} value, chosen in our pre-evaluation experiments, which magnifies the effect of the time difference between peaks and thereby adds more uniqueness to the hash value.

$$H(n, m) = |n_{\text{time}} - m_{\text{time}}| \cdot \text{padding}^2 + n_{\text{frequency}} \cdot \text{padding} + m_{\text{frequency}} \qquad (4.10)$$

10. After generating the hash values, all frames that have the same hash value are grouped together, as in Listing 4.8. At this point, the recording fingerprint is generated and ready for comparison.

Listing 4.8: Landmark Fingerprint Hash Grouping
{hashVal1, [frame1_time, frame2_time, frame3_time]}
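Steps 9 and 10 can be sketched as follows. The sketch assumes that each robust frame is reduced to a (time index, frequency bin) pair and that the time of the first frame of each adjacent pair is stored under the hash. The `PADDING` value is purely hypothetical, since the value chosen in the pre-evaluation experiments is not stated here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[int, int]  # (time index, frequency bin) of a robust frame; assumed representation

PADDING = 1000  # hypothetical value, not the one used by the service


def landmark_hash(n: Peak, m: Peak, padding: int = PADDING) -> int:
    """Hash of two adjacent frames following Equation 4.10."""
    n_time, n_frequency = n
    m_time, m_frequency = m
    return abs(n_time - m_time) * padding ** 2 + n_frequency * padding + m_frequency


def build_fingerprint(peaks: List[Peak], padding: int = PADDING) -> Dict[int, List[int]]:
    """Group frame times by the hash each frame forms with the next robust frame."""
    fingerprint: Dict[int, List[int]] = defaultdict(list)
    ordered = sorted(peaks)  # adjacent frames are neighbours in time order
    for n, m in zip(ordered, ordered[1:]):
        # assumption: the time of the first frame of the pair is recorded
        fingerprint[landmark_hash(n, m, padding)].append(n[0])
    return dict(fingerprint)
```

Two pairs of frames with the same time gap and the same frequencies produce the same hash value, so their times end up in the same group, which mirrors the structure of Listing 4.8.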

Figure 4.6: Landmark fingerprint: extracting fingerprint from robust frequency peaks and calculating hash keys. [Panels: (a) Recording Spectrogram, (b) All Peaks, (c) Robust Peaks, (d) Hash Calculation; axes: Time vs. Frequency (Hz).]

Measuring the similarity between two Landmark fingerprints requires a custom matching technique. Comparing two fingerprints involves:

1. Iterate through all hash values.
2. Group the frames by hash value.
3. Measure the time offset between each pair of frames (see Equation 4.11).

$$\text{offset} = |\text{frame1}_{\text{index}} - \text{frame2}_{\text{index}}| \qquad (4.11)$$

4. Create a list that contains the offsets, each with a counter that represents how many times the offset was encountered.
5. The offset associated with the highest counter represents the synchronization difference between both fingerprints, and its counter is considered the matching score between both fingerprints.
6. To obtain a similarity value between 0 and 1, the matching score is divided by the number of frames.

Note that measuring the similarity between two fingerprints relies on the hash values as keys to identify similar frames. Therefore, the format or value of the hashes is less important than the ability to identify robust time-frequency points in the recording.
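The matching steps above can be sketched as an offset histogram. The sketch assumes that a fingerprint is a mapping from hash value to a list of frame indices, as in Listing 4.8, and that the caller supplies the number of frames to normalize by. Which frame count the service uses is not specified here, so the result is additionally clamped to 1.

```python
from collections import Counter
from typing import Dict, List


def fingerprint_similarity(
    fp_a: Dict[int, List[int]], fp_b: Dict[int, List[int]], frame_count: int
) -> float:
    """Similarity in [0, 1] derived from the most frequent time offset."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    offsets: Counter = Counter()
    for hash_value, frames_a in fp_a.items():
        for index_a in frames_a:
            # only frames sharing the same hash value are paired (steps 1 to 3)
            for index_b in fp_b.get(hash_value, []):
                offsets[abs(index_a - index_b)] += 1  # Equation 4.11
    if not offsets:
        return 0.0  # no common hash value at all
    _best_offset, score = offsets.most_common(1)[0]  # step 5
    return min(1.0, score / frame_count)  # step 6, clamped as an assumption
```

A consistent offset across many hash groups indicates that both recordings contain the same sequence of robust peaks, shifted only by the synchronization difference.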

Silence Detection

Aside from extracting features, our service analyzes the signal while recording the ambient sound to measure the total percentage of silence. This step is critical for determining the quality of the recording, since a silent recording does not carry much identifiable information. The process checks the amplitude of every recorded wave: per ASHA [98], a moderate conversation has a minimum amplitude of 60 dB. Thus, any wave with an amplitude below 60 dB is counted as silent. At the end of the recording, the silence percentage of the entire recording is calculated. This percentage can be used as an indicator of the recording and fingerprint quality, as the fingerprints of two silent recordings made in two different areas are similar.
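A minimal sketch of the silence measurement is given below. How the service converts a recorded sample into a decibel value is not described in this section; the sketch assumes 16-bit PCM samples and a level of 20·log10(|sample|), which is only one possible mapping. The function names are hypothetical.

```python
import math
from typing import Sequence

SILENCE_THRESHOLD_DB = 60.0  # threshold stated in the text


def amplitude_db(sample: int) -> float:
    """Level of a PCM sample in dB relative to an amplitude of 1 (assumed mapping)."""
    magnitude = abs(sample)
    return 20.0 * math.log10(magnitude) if magnitude > 0 else 0.0


def silence_percentage(
    samples: Sequence[int], threshold_db: float = SILENCE_THRESHOLD_DB
) -> float:
    """Percentage of samples whose level is below the silence threshold."""
    if not samples:
        raise ValueError("recording is empty")
    silent = sum(1 for sample in samples if amplitude_db(sample) < threshold_db)
    return 100.0 * silent / len(samples)
```

A recording with a high silence percentage would then be treated as a low-quality input for the fingerprint comparison.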

4.6.3 Proximity Detection via Ultrasound

In several scenarios, passively recording and analyzing ambient sound is not expressive enough; for example, the fingerprints generated in any two silent rooms can be highly similar. The same holds in a noisy environment such as a crowded cafeteria, where the sound signature is similar for two distant devices since the noise signal is dominant and shared between all devices. Passively recording ambient sound is therefore not always conclusive, which calls for a different approach that provides a finer proximity estimation in those scenarios. As a supporting approach, we propose actively emitting inaudible sound signals (i.e., ultrasound) from one device to the other. The main advantage of ultrasound, besides being inaudible to humans, is its ability to avoid interference from ambient noise.

For a sound to be inaudible, it needs to be emitted at a frequency above the range of human perception. As discussed in [99], human hearing capabilities vary with age; however, the authors observed no perception of frequencies higher than 17 kHz for users aged around 20 years. In terms of mobile device support, Borriello et al. [100] and the experiments in [101] showed that it is possible to emit sounds in the range of 18 to 21 kHz from a mobile device’s speaker and successfully detect them with its default microphone. Another work [68] tested four commercial phones (HTC G1, HTC Hero, Apple iPhone 3GS, and Nokia 6210 Navigator) for playing sounds at frequencies between 17 kHz and 22 kHz and observed that all phones were capable of generating these high frequencies.

Ultrasound’s signal strength, propagation speed and time of arrival are essential characteristics that are usually used for tone detection and for measuring the distance between a transmitter and a receiver. Our approach uses ultrasound as an over-the-air modem for exchanging text messages between devices. The supernode encodes a text message into ultrasound signals and emits them. The client devices record for a specific period of time and attempt to decode the message by processing the received signals. As long as a device is able to decode the text message successfully to some extent, it is considered to be in proximity.

An important factor for successfully exchanging messages between devices using ultrasound is the technique used to encode the text message into sound signals. The resulting signals should be easily decodable and robust against synchronization issues.

Encoding a text message into sound signals and decoding the signals back to text depends on two main techniques: Reed-Solomon codes (RS-codes) [102] and preamble-based synchronization [103]. RS-codes are based on interpolation using polynomials over specific fields: a message of length k is translated into a polynomial of degree k-1 in which each character represents a polynomial coefficient, and the polynomial is then evaluated at n different points to generate the message codes. Reed-Solomon codes are considered one of the most widely used techniques in digital error control [104].

Once the RS-codes are generated from the text message, the preamble-based synchronization technique comes into play. This technique provides flags that allow the receiver to identify the message’s start and end points in order to successfully decode the message. As a result, the signal is formed from three types of flags in addition to the RS-codes:

• Preamble: represents the message’s start delimiter. Its goal is the synchronization of the receiver; it is used as a time reference for the receiver to start decoding a new message.
• Guard: is added to the message data to resist symbol interference and to allow multi-path components to fade away before the information is extracted from the next symbol.
• Tail: represents the message’s end delimiter. When the receiver detects the preamble and the tail, it can decode the complete message.

Subsequently, the transmitter emits the formed signal using a 44.1 kHz sampling rate; according to the Nyquist-Shannon theorem [105], sounds with a maximum frequency of about 22 kHz can then be emitted. Similarly, the receiver records at 44.1 kHz for a period equal to the duration of the played ultrasound. In order to decode the transmitted message from the recorded signals, the receiver attempts to recognize the embedded flags by finding the maximum correlation for each. The preamble flag is detected and removed, guard flags are filtered out, and the tail flag’s start position is used as the message’s last index. This way, the bytes in between are equivalent to the RS-codes generated before the emission. Next, the codes are passed to the Reed-Solomon decoder, which can correct errors and erasures, producing the original text message [106].
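Locating a flag by maximum correlation can be illustrated with a simplified sketch. It slides a known flag waveform over the recording and returns the position with the highest raw (unnormalized) correlation. The function name and the use of plain sample lists are assumptions for illustration; a practical receiver would typically normalize the correlation and work on band-filtered signals.

```python
from typing import Sequence


def find_flag(recording: Sequence[float], flag: Sequence[float]) -> int:
    """Return the start index at which `flag` correlates most strongly with `recording`."""
    if not flag or len(flag) > len(recording):
        raise ValueError("flag must be non-empty and no longer than the recording")
    best_index, best_score = 0, float("-inf")
    for start in range(len(recording) - len(flag) + 1):
        # raw cross-correlation of the flag with the window starting at `start`
        score = sum(recording[start + k] * flag[k] for k in range(len(flag)))
        if score > best_score:
            best_index, best_score = start, score
    return best_index
```

Applied once with the preamble waveform and once with the tail waveform, the two returned indices would delimit the samples that carry the guard flags and the RS-codes.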

Figure 4.7: Encoded ultrasound message structure; the message starts with a Preamble flag, the RS-codes are divided into parts and wrapped with Guard flags, and the end of the message is identified by the Tail flag.

Once a client device decodes the text message, it transfers it back to the supernode to check its validity. The supernode compares the client’s decoded message against the original message using the Levenshtein Edit Distance [107], a string metric for measuring the difference between two sequences. It measures the length of the shortest sequence of character insertions, deletions and substitutions required to transform R into R', mathematically represented as:

$$L_{R,R'}(i, j) = \begin{cases} \max(i, j) & \text{if } \min(i, j) = 0, \\ \min \begin{cases} L_{R,R'}(i, j-1) + 1 \\ L_{R,R'}(i-1, j) + 1 \\ L_{R,R'}(i-1, j-1) + \Delta(R(i), R'(j)) \end{cases} & \text{else,} \end{cases}$$

where Δ(R(i), R'(j)) is 0 if the two characters are equal and 1 otherwise.
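The recurrence above translates directly into a short dynamic-programming routine. The sketch below also turns the distance into a similarity between 0 and 1 by normalizing with the length of the longer string. This normalization is an assumption, since the text does not state how the similarity that is later compared against a threshold is derived from the distance.

```python
def levenshtein(r: str, r_prime: str) -> int:
    """Levenshtein edit distance computed row by row."""
    previous = list(range(len(r_prime) + 1))  # distances from the empty prefix of r
    for i, char_r in enumerate(r, start=1):
        current = [i]
        for j, char_p in enumerate(r_prime, start=1):
            delta = 0 if char_r == char_p else 1
            current.append(
                min(
                    current[j - 1] + 1,       # insertion
                    previous[j] + 1,          # deletion
                    previous[j - 1] + delta,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def message_similarity(original: str, decoded: str) -> float:
    """Similarity in [0, 1]; normalizing by the longer length is an assumption."""
    longest = max(len(original), len(decoded))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(original, decoded) / longest
```

With this normalization, a client that decodes "Proxim" instead of "Proximity" would obtain a similarity of about 0.67.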

5 Evaluation

In the previous chapters, we explained the design and implementation of the proximity service, which provides automatic meeting establishment based on the users’ relative position. The service is D2D-based and leverages WiFi-Direct as its communication channel. In this chapter, we first evaluate the latency and throughput of the WiFi-Direct connection in multiple states: idle, with a download running in the background, and while the supernode is verifying the clients’ proximity. Then, the duration and energy consumption of forming a group and of verifying the users’ proximity are measured in two different environments: restricted and open spaces. Finally, the proximity service’s accuracy is evaluated in multiple scenarios. To reason about the evaluation results, the connection’s latency and throughput are compared to a conventional WiFi connection, and the service’s accuracy is compared to a random baseline as a reference.

Table 5.1: Specifications of the device models used in the experiments [108, 109].

Specification                  Samsung Galaxy Tab A 10.1   Samsung Galaxy S5
Operating System               Android 6.0.1               Android 6.0.1
CPU                            1.6 GHz Octa-Core           2.5 GHz Quad-core
Memory                         16 GB, 2 GB RAM             16 GB, 2 GB RAM
Battery Capacity / Voltage     7,300 mAh / 3 V             2,800 mAh / 4 V
Lifetime (450 mW avg power)    48.6 hours                  25 hours

The experiments were run on four devices: three Samsung Galaxy Tab A 10.1 tablets and one Samsung Galaxy S5. Table 5.1 summarizes the devices’ specifications. Note that the tablets are used in all experiments, whereas the Samsung Galaxy S5 is only used in the Proximity Detection Latency experiment (Section 5.1.2) to evaluate the increase in latency while the supernode is verifying three clients simultaneously.
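The lifetime row of Table 5.1 appears to be consistent with dividing the battery's energy (capacity times voltage) by the assumed average power draw of 450 mW. The following sketch reproduces that arithmetic; the function name is hypothetical.

```python
def lifetime_hours(capacity_mah: float, voltage_v: float, avg_power_mw: float) -> float:
    """Estimated battery lifetime in hours at a constant average power draw."""
    if avg_power_mw <= 0:
        raise ValueError("average power must be positive")
    energy_mwh = capacity_mah * voltage_v  # mAh * V = mWh
    return energy_mwh / avg_power_mw


tablet_hours = lifetime_hours(7300, 3, 450)  # Samsung Galaxy Tab A 10.1
phone_hours = lifetime_hours(2800, 4, 450)   # Samsung Galaxy S5
```

For the tablet this gives 21,900 mWh / 450 mW, roughly 48.7 hours, and for the phone 11,200 mWh / 450 mW, roughly 24.9 hours, which matches the table up to rounding.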

5.1 Device-to-Device Connection Performance

A connection’s latency and throughput are key factors that impact the user’s quality of experience, as they determine the ability of a network to deliver predictable results. Connection latency represents how long it takes for a packet of data to travel from its source to its destination. Excessive latency results in bottlenecks that prevent the data from filling the network pipe, thus decreasing the throughput. Throughput is defined as the quantity of data that can be sent and received per unit of time.

In order to analyze the WiFi-Direct connection’s performance, its latency and throughput are evaluated in multiple scenarios: idle state, a download running in the background, and while a supernode device is verifying the proximity of one or more client devices. Furthermore, to reason about the results, the latency and throughput of a conventional WiFi connection are evaluated and used as a baseline.

For the latency experiments, the round-trip time (RTT) between the devices is measured, i.e. the time required for a packet to travel from one device to the destination and back again. Accordingly, one of the devices sends a packet to the other and measures the time, in milliseconds (ms), until an acknowledgement returns. To evaluate the throughput, the data transfer rate is measured by downloading files of different sizes and calculating the transfer speed in megabits per second (Mbits/sec).

As demonstrated in Figure 5.1, the communication flow for WiFi-Direct differs from that of conventional WiFi: with WiFi-Direct, the devices need to discover and connect to each other instead of connecting to an external access point. Additionally, after a direct connection between the devices is established, the supernode device starts a virtual access point (AP) by hosting a DHCP server that assigns every connected device an IP address. Therefore, when a client device sends a packet to the supernode, the packet is transferred directly to the supernode without any routing. With conventional WiFi, in contrast, the devices are not connected to each other but to an external access point. Thus, when a device sends a packet to another device, it is sent to the access point, which routes the packet to the correct destination.
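The two measurements described in this section, RTT and transfer speed, can be sketched independently of the transport. In the sketch below, `round_trip` is a hypothetical callable that sends one packet and blocks until the acknowledgement arrives; the actual socket code of the service is not shown in this section. The throughput helper assumes 1 megabit = 10^6 bits.

```python
import time
from typing import Callable, List


def measure_rtt_ms(round_trip: Callable[[], None], samples: int) -> List[float]:
    """Time `samples` request/acknowledgement exchanges and return each RTT in ms."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        round_trip()  # hypothetical: send a packet and wait for its acknowledgement
        results.append((time.perf_counter() - start) * 1000.0)
    return results


def throughput_mbits_per_sec(num_bytes: int, seconds: float) -> float:
    """Transfer speed in Mbits/sec for `num_bytes` downloaded in `seconds`."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return num_bytes * 8 / 1_000_000 / seconds
```

For example, a 10 MB file (10^7 bytes) downloaded in 1.25 seconds corresponds to 64 Mbits/sec.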

Figure 5.1: WiFi-Direct and conventional WiFi communication flow.

5.1.1 Idle vs. Download State Latency

This experiment measures the RTT over a period of 100 seconds. This period is selected to ensure that the experiment captures the effect of each state and yields a reasonable average. In terms of devices, two tablets (D1 and D2) are used, where D1 is responsible for sending the RTT requests to D2 and measuring how long it takes for the packet to return. When the latency of the download state is evaluated, the packet latency is measured by D1 while D2 is continuously downloading data in the background.

Results

Figure 5.2 shows that the latency patterns of WiFi and WiFi-Direct are very close in the idle state. However, when a download is running in the background, the latency patterns differ noticeably: the latency of WiFi-Direct is markedly lower than that of WiFi. This can be explained by the fact that WiFi-Direct initiates a virtual access point hosted at the supernode, which establishes a direct connection between the devices instead of communicating through a router as in the case of a conventional WiFi connection. As a result, the average latency in the idle and download states is higher by 9% and 59%, respectively, when using a conventional WiFi connection.

[Figure 5.2 panels: (a) WiFi-Direct: comparing the change of latency in the idle state to the download state; (b) WiFi: comparing the change of latency in the idle state to the download state; (c) Average latency for WiFi-Direct and WiFi in idle and download states (Idle: WiFi 60 ms, WiFi-Direct 55 ms; Download: WiFi 308 ms, WiFi-Direct 196 ms). Axes in (a) and (b): Time (s) vs. Latency (ms).]

Figure 5.2: WiFi-Direct and WiFi connections’ latency in the idle and continuous download states.

5.1.2 Proximity Detection Latency

This experiment evaluates the latency from a different perspective: it is carried out while the supernode device is collecting the clients’ context information and measuring their proximity. The goal of this experiment is to understand how much verifying the clients’ proximity affects the communication latency, especially since this process is executed periodically by the supernode to maintain the clients’ membership. The RTT is measured while the supernode is verifying the proximity of one, two and three clients simultaneously. In terms of devices, the three tablets are used; one of them acts as the supernode and the others as clients. Additionally, the Samsung Galaxy S5 is used as a client when evaluating the latency while the supernode is verifying the proximity of three clients simultaneously. To understand what impacts the latency, Table 5.2 summarizes the amount of data sent for each context type’s request and response.

Table 5.2: Proximity detection latency experiment: a summary of the amount of data sent for each context type request and response.

Context Type   Request (bytes)   Response (bytes)   Response Content
WiFi           119               186,821            Information of 110 access points
Audio          140               791,325            Features of 10 s recording
Ultrasound     141               71                 Decoded text message

Results

Figure 5.3 exhibits how the latency changes while the supernode is verifying the client devices’ proximity. When the number of devices increases, additional latency peaks start to occur, as demonstrated in Figures 5.3a and 5.3b. Comparing the average latency in this experiment, as shown in Figure 5.3c: while collecting and processing one client’s proximity, the latency increases by 61% with respect to an idle connection state. When the supernode verifies the proximity of two clients simultaneously, the connection undergoes an additional increase of 16%. Thereafter, when verifying three clients’ proximity, the connection’s latency is 21% higher than in the case of verifying two clients. Furthermore, the latency in the three-client case is 2% higher than with a continuous download running in the background. This indicates that the connection’s latency increases significantly due to the additional overhead the supernode faces when serving additional clients. In other words, since the average latency is higher than in the case of a continuous download running in the background, further delay is incurred at the supernode device while it concurrently processes and responds to the clients’ requests. Thus, we observe that the computational power of the supernode mobile device is an important factor affecting the WiFi-Direct connection’s latency when the number of connected clients increases.

[Figure 5.3 panels: (a) Change in latency while the supernode is verifying one and two clients’ proximity; (b) Change in latency while the supernode is verifying two and three clients’ proximity; (c) Average latency while the supernode is verifying one, two and three clients’ proximity simultaneously in comparison to Idle and Download states (Idle 55.95 ms, 1 Client 142.93 ms, 2 Clients 165.09 ms, Download 196.11 ms, 3 Clients 200.18 ms). Axes in (a) and (b): Time (s) vs. Latency (ms).]

Figure 5.3: The latency resulting from the supernode verifying the client devices’ proximity.

5.1.3 D2D Throughput Measurements

A factor closely related to latency that affects the connection’s performance is the throughput. In this experiment, the connection’s throughput is investigated by downloading three files of different sizes (1 MB, 10 MB and 50 MB) and measuring the transfer speed in each case. The experiment takes into account different distances between the devices: it is repeated 10 times every 5 m, from 0 m up to 20 m. At the end, the average throughput is calculated and used as the final result. In terms of devices, two tablets are used; one hosts the files and the other downloads them.

Results

Figure 5.4 depicts the average throughput for WiFi and WiFi-Direct when downloading the three files. In both cases, the average throughput is close; neither connection type has an advantage over the other. Additionally, we observed that the change in distance does not affect the throughput.

[Figure 5.4: bar chart of Speed (Mbits/sec) for the 1 MB, 10 MB and 50 MB files, WiFi vs. WiFi-Direct; the bar values range from 59 to 64 Mbits/sec.]

Figure 5.4: WiFi-Direct and conventional WiFi connections’ average throughput.

5.2 Parameters for Proximity Evaluation

The proximity service is evaluated in two different environments, restricted and open spaces, such as a closed room and a university’s entrance hall. As explained previously in Chapter 2, the main difference between both environments is that a restricted space has boundaries, such as walls and doors, separating it from the outside surroundings, which helps differentiate between the context information inside and outside that space. Conversely, devices located in an open space share the same prevalent context, e.g., the environment’s noise. Figure 5.5 depicts the size and shape of the room used in this experiment as the restricted space environment. The room size is 4.5 m × 3.7 m = 16.65 m². The 4.5 m sides are walls and each of the 3.7 m sides consists of a double-glass window and a door. The room boundaries are capable of isolating the conversation going on inside from the outside. For the open space environment, the entrance hall of the Technical University of Munich is used. The hall is large and crowded most of the time, with students entering, leaving and chatting, in addition to other students and staff eating in the nearby cafeteria.

Figure 5.5: Restricted space: sketch for the room shape.

Besides the experiment’s environment, the service configuration used for collecting and processing the different context information is a major factor affecting the duration, energy consumption and accuracy results. In this experiment, the following configurations are used, based on pre-evaluation experiments:

• WiFi: the number of WiFi scans collected in every request critically affects the algorithm’s accuracy. Based on pre-evaluation experiments, the information collected from one or two WiFi scans is not reliable enough, since the RSSI value is influenced by multiple factors as explained in Chapter 2. Additionally, the algorithm’s performance improved similarly when collecting three, four and five WiFi scans. For those reasons, three WiFi scans are collected in every request in the following experiments. In order to compare two lists of WiFi scan results, a number of metrics are employed to produce a set of reliable features. The final accuracy is constructed from the combination of the following: Cross-correlation (X-corr), Pearson’s correlation (P-corr), AP Presence, Jaccard +-10, Euclidean Distance, Manhattan Distance and the signal’s propagation effect calculated through the Free Space Path-Loss (FSPL). Table 5.3 lists the threshold used for each feature.

Table 5.3: WiFi features’ thresholds derived from pre-evaluation experiments: for each feature, the service defines a threshold; if the feature value is above a threshold, then the user is within X meters proximity.

P-corr   X-corr   Jaccard +-10   AP Presence   Manhattan   Euclidean   FSPL
0.74     0.76     0.48           0.40          0.45        0.69        2.5

• Audio and ultrasound signal parameters: Table 5.4 provides an overview of the settings used to record and analyze the ambient sound signals and to generate the ultrasound message. These parameters are derived from pre-evaluation experiments, taking into account the most reliable results and the amount of resources used. To determine these settings, we tested different recording durations (5, 10, 15 and 20 seconds) and different sampling rates (44.1, 32.0, 22.05, 11.025 and 8 kHz). We found that the most reliable results require at least 10 seconds of sound sampled at 11.025 kHz to reveal decisive information about the surrounding environment; a shorter recording or a lower sampling rate reduces the algorithm’s accuracy. Furthermore, a longer recording or a higher sampling rate increased the used resources significantly without adding noticeable advantages.

Table 5.4: Audio and ultrasound techniques signal settings derived from pre-evaluation experiments.

Method             Power Spectrogram   MFCC         Landmark     Ultrasound
Duration           10 s                10 s         10 s         7 s
Sampling Rate      11.025 kHz          11.025 kHz   11.025 kHz   44.1 kHz
Frame size         10 s                0.46 s       0.18 s       0.23 s
Frequency band     0-5 kHz             0-5 kHz      0-5 kHz      17.9-19.6 kHz
Window Function    None                Hamming      Hamming      None

• Audio: once the ambient sound is recorded, the extracted features are compared using the following metrics and thresholds: Table 5.5 defines the settings used for the Power Spectrogram, Table 5.6 those for MFCC, and a value of 0.5 is used as the similarity threshold for the Landmark fingerprint.
• Ultrasound: the word "Proximity" is exchanged between the devices. A similarity based on the Levenshtein Edit Distance with a threshold of 0.5 is used.

Table 5.5: Power Spectrogram similarity metrics thresholds derived from pre-evaluation experiments.

P-corr   X-corr   Cosine   Chi-squared
0.65     0.65     0.80     0.81

Table 5.6: MFCC similarity metrics thresholds derived from pre-evaluation experiments.

Mahalanobis   Dynamic Time Warping
3             80
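As an illustration of how per-feature thresholds such as those in Table 5.3 can be applied, the sketch below checks each WiFi feature value against its threshold using the "above threshold" rule from the table caption. How the service fuses the individual decisions into a final result is not specified in this section, so the sketch only reports the per-feature decisions and the fraction of features that pass; the function names are hypothetical.

```python
from typing import Dict

# Thresholds copied from Table 5.3
WIFI_THRESHOLDS: Dict[str, float] = {
    "P-corr": 0.74,
    "X-corr": 0.76,
    "Jaccard +-10": 0.48,
    "AP Presence": 0.40,
    "Manhattan": 0.45,
    "Euclidean": 0.69,
    "FSPL": 2.5,
}


def features_above_threshold(values: Dict[str, float]) -> Dict[str, bool]:
    """Per-feature decision: True if the value is above its Table 5.3 threshold."""
    unknown = set(values) - set(WIFI_THRESHOLDS)
    if unknown:
        raise KeyError(f"no threshold defined for: {sorted(unknown)}")
    return {name: value > WIFI_THRESHOLDS[name] for name, value in values.items()}


def passing_fraction(values: Dict[str, float]) -> float:
    """Fraction of the supplied features that indicate proximity (illustrative only)."""
    decisions = features_above_threshold(values)
    if not decisions:
        raise ValueError("no feature values supplied")
    return sum(decisions.values()) / len(decisions)
```

The same pattern would apply to the audio thresholds in Tables 5.5 and 5.6, with the direction of the comparison chosen per metric.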

5.3 Proximity Service Duration and Energy Consumption

After evaluating the WiFi-Direct connection’s performance and comparing it with conventional WiFi, this section assesses the duration and energy consumption of each phase the proximity service performs to establish a private meeting confined to nearby devices, including the Group Formation and the Proximity Detection techniques. In the Group Formation phase, each device advertises its own service, discovers the services of other devices, and negotiates which mobile device is elected as the supernode. For the Proximity Detection techniques, the supernode requests a specific type of context information from each client device. The devices collect the requested context information and extract useful features. As soon as the features are extracted, the client sends them to the supernode. As part of its role, the supernode compares the client’s extracted features against its own to decide whether the devices are in proximity or not.

The duration of a specific phase is the time it takes for the entire process to complete its intended functionality. Similarly to the effect of the network connection’s latency, a long duration delivers a poor quality of experience for the users.

Measuring the additional energy overhead consumed by the proximity service to establish a meeting is necessary for evaluating its feasibility for practical use; a small energy overhead can greatly increase the user’s satisfaction. In this section, the energy consumption of each phase is measured using the tool PowerTutor [110]. This tool measures the power consumed by the major system components, such as CPU, network interface, display, and GPS, for each application. The average foreground battery drain rate is measured in milliwatts (mW), a standard unit for expressing the rate of energy transfer over time.

In order to use PowerTutor efficiently, we have extended it by implementing a BroadcastReceiver which listens to incoming intents from any application running on the device. The supported intents allow starting and stopping the energy monitoring, in addition to saving the currently consumed energy statistics into a CSV file for each individual application. This way, the evaluation results are more accurate, since the flow reduces user errors resulting from starting and stopping the tool manually. In these experiments, the proximity service broadcasts a powertutor.start intent before any code execution, whereupon PowerTutor starts monitoring the energy consumption. Once the code of a phase has finished executing, the service broadcasts a powertutor.stop intent to stop monitoring and a powertutor.savelog intent to save a copy of the energy monitoring results to the device’s local storage.

5.3.1 Group Formation The duration and energy consumption of this phase is evaluated in three scenarios; two devices starting a meeting, three devices starting a meeting, and one device joining an already established meeting. To properly perform this experiment, the group formation functionality is triggered at the same time for all the devices. This way, the WiFi-Direct efforts to advertise and discover the services are entirely captured.

[Figure: grouped bar chart. Series: Restricted-Space Energy, Open-Space Energy, Restricted-Space Duration, Open-Space Duration. Axes: Energy (mW), Duration (s). Groups: 1 Device, 2 Devices, 3 Devices.]

Figure 5.6: Restricted and Open Spaces: Group Formation phase duration and energy consumption evaluation results.

Results Figure 5.6 shows that, in the restricted space, two devices need around 41.74 seconds to form a meeting group; when the number of devices increases to three, the duration increases by 8.58% to 45.32 seconds. A device joining an already established meeting takes around 36.43 seconds to discover the meeting service and request to join, which is 12% less than in the case of two devices. The interpretation of these timings is that each device’s service takes some time to be advertised and, in turn, to be discovered by the rest of the devices. As a result, when the number of devices participating in a meeting increases, the time required to successfully discover all of their services increases. In the open space environment the results were similar: it takes 43.65 seconds for two devices to form a meeting, 51.70 seconds for three devices, and 38.65 seconds for a device to join an already established meeting.


Furthermore, as shown in Figure 5.6, the average energy drain also increases with the number of devices. In the restricted space environment, the energy drain in the case of two devices is 6% higher than in the case of one device, whereas in the case of three devices each device’s consumption increases by 15.24%. The difference between the first two scenarios is that a device joining an already established meeting doesn’t advertise its own service, since it is not interested in establishing a new meeting. Additionally, at the time of discovery, the meeting service hosted at the supernode is already advertised and therefore immediately available for discovery. The increase in energy consumption is also related to each scenario’s duration: since each device performs service discovery during that additional period, more energy is consumed. Thus, the three-device scenario consumes more than the others. The energy consumption is similar in both environments for all scenarios.

5.3.2 Proximity Detection This experiment evaluates the methods implemented in our service to detect proximity between the devices in terms of duration and energy consumption. Each method is evaluated separately; since detecting proximity via audio employs three different techniques, a further analysis is carried out for each technique. Furthermore, considering that the supernode device is responsible for comparing the extracted features and inferring the proximity between devices, the energy is evaluated for both sides: client and supernode.

Results As illustrated in Figure 5.7, the durations in both environments are close. Collecting and analyzing the WiFi context information is expected to take this amount of time, since a single WiFi scan takes between 3 and 5 seconds. In the case of audio, most of the time is the configured duration of the recording, i.e. 10 seconds. To understand what happens in the remainder of the time, Figure 5.8 shows the duration of each single feature; the Power Spectrogram needs the most time because the resultant feature represents the spectrum of the entire 10 seconds, which is much larger than, for example, the resultant Landmark fingerprint of robust frequencies. Therefore, transferring two spectrograms and comparing their similarity takes 19.3% and 39.9% more time than MFCC and Landmark, respectively. As for the ultrasound, playing the message "Proximity" at the supernode takes most of the time: 7 seconds. Playing shorter words reduces the duration; however, long words are more robust against false positive decoding.

[Figure: grouped bar charts, (a) Restricted space and (b) Open space. Series: Energy-Client, Energy-Supernode, Duration. Axes: Energy (mW), Duration (s). Groups: WiFi, Audio, Ultrasound.]

Figure 5.7: Proximity detection techniques’ duration and energy consumption evaluation results for restricted and open space environments.

[Figure: bar chart. Series: Energy, Duration. Axes: Energy (mW), Duration (s). Groups: Power Spectrogram, MFCC, Landmark.]

Figure 5.8: Single audio features duration and energy consumption analysis.

In the same context, Figure 5.7 provides an overview of the difference in energy consumption between a client and a supernode for all the methods. In the case of WiFi, the client’s average energy consumption is 41.76 mW, while the supernode’s is 114.57 mW. The main difference between the two sides is that the supernode’s device executes a number of metrics on the scan results to infer the devices’ proximity. Thus, performing these measures increases the average energy consumption by 72.81 mW. Looking at audio, since the proximity service doesn’t exchange the raw audio, the feature extraction process occurs on both devices. As a result, the energy consumption is relatively high for both the client and the supernode in comparison to WiFi and ultrasound. This can be explained by the fact that the audio features are much larger than the WiFi scan results and require more processing than encoding and decoding an ultrasound message. To understand which audio feature increases the energy consumption, the same experiment is repeated for each feature separately, as shown in Figure 5.8; the Power Spectrogram is the most energy-expensive feature, and the Landmark fingerprint has the lowest energy consumption. The main factors influencing the amount of consumed energy are the executed algorithm and the size of the data transferred from the client to the supernode. Each feature’s algorithm affects the energy consumed by the device’s CPU, while an increase in the size of the transferred data consumes more energy from the mobile device’s WiFi component. In the case of ultrasound, the supernode requires an average of 58.7 mW to generate an inaudible signal that carries an encoded version of the word "Proximity" and broadcast it, which is low in comparison to the 330.22 mW that the client consumes. Here, the client records what the supernode broadcasts and processes it in an attempt to decode the original text message. Considering that the client only exchanges the final decoded message with the supernode, recording and processing the audio signal account for the substantial part of the energy.

5.4 Proximity Detection Accuracy After evaluating the duration and energy consumption of the methods implemented in our service to detect proximity between the devices, this section evaluates the accuracy of each method in a restricted and open space, in addition to investigating how each feature within each method behaves in different scenarios.


5.4.1 Restricted Space Environment In this experiment, a playing speaker is used to simulate an ongoing meeting. Each time the experiment was performed, the devices were placed in different locations, including the room boundaries and beside the room’s closed door, both inside and outside. This way, the experiment covers the scenarios that a meeting inside a closed room can encounter. The performance of the proximity service is assessed in the following scenarios:
• Scenario A: a speaker playing inside while the outside is silent.
• Scenario B: a speaker playing inside while there is a continuous noise outside.
• Scenario C: silence inside while there is a continuous noise outside.
• Scenario D: silence inside and outside.

Results As illustrated in Figure 5.9, our service is capable of verifying the users’ proximity with an accuracy of 99% in the four scenarios when employing the combination of the features extracted from the WiFi, audio and ultrasound context information. Based on the results, WiFi features can infer the users’ presence inside the room with an accuracy between 78% and 80% in the four scenarios. Adding audio features increases the accuracy by 17 to 19 percentage points in Scenarios A, B and C. However, in Scenario D the accuracy increases by only 1 percentage point; this is because the scenario setup requires silence inside and outside the room, which makes the audio fingerprint uninformative. Moreover, verifying the users’ proximity by broadcasting an ultrasound message stabilized the accuracy at 99% in the four scenarios. Because ultrasound signals do not penetrate the room boundaries, no device outside the room is able to decode the exchanged text message. The resulting accuracy for WiFi, Audio+ and Ultrasound+ is well above the Random Baseline in the four scenarios.

[Figure: grouped bar chart. Series: WiFi, Audio+, Ultrasound+. Axis: Accuracy. Groups: Scenario A, Scenario B, Scenario C, Scenario D, Random Baseline.]

Figure 5.9: Restricted space: proximity detection accuracy. Audio+ refers to the combination of WiFi and audio. Accordingly Ultrasound+ refers to the combination of WiFi, audio and ultrasound. Random Baseline scenario represents a ground truth accuracy for each method.

To better understand how the setup of the experiment’s scenarios affects the audio fingerprint, the individual performance of the extracted audio features is measured separately in each scenario. Figure 5.10 shows the accuracy of each feature in detecting the users’ proximity. In the four scenarios, the Power Spectrogram performs best and MFCC produces the lowest accuracy, while the combination of multiple features barely improves the accuracy: only by 1% in Scenario A. Additionally, looking at the results of Scenario D, 20% is the maximum accuracy the audio features can guarantee, which is expected since two completely silent recordings are highly similar. The Random Baseline results are lower than all the features in Scenarios A, B and C but not in Scenario D. This indicates that audio can’t be relied on as a room-level proximity indicator in the case of total silence.

[Figure: grouped bar chart. Series: Power Spectrogram, Landmark, MFCC, Combination. Axis: Accuracy. Groups: Scenario A, Scenario B, Scenario C, Scenario D, Random Baseline.]

Figure 5.10: Restricted space: audio features performance.
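Since the audio features are uninformative when both recordings are silent, a service could check the recording level before trusting them. The sketch below shows one possible check, assuming normalized samples in the range -1.0 to 1.0; the function names and the -50 dBFS floor are hypothetical and are not taken from the prototype.

```python
import math


def rms_dbfs(samples):
    """Root-mean-square level of normalized samples, in dB full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    # 10 * log10(mean square) equals 20 * log10(RMS amplitude).
    return 10.0 * math.log10(mean_square)


def audio_is_informative(samples, silence_floor_dbfs=-50.0):
    """Return True when the recording is louder than the assumed floor."""
    return rms_dbfs(samples) > silence_floor_dbfs


# Usage: a silent recording versus a 440 Hz tone sampled at 8 kHz.
silent = [0.0] * 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
print(audio_is_informative(silent), audio_is_informative(tone))
```

A real deployment would need to calibrate the floor per device, because microphone gain and noise levels differ between models.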

5.4.2 Open Space Environment In the open space environment, the accuracy evaluation is not concerned with detecting whether a device is inside or outside the meeting room. Instead, the experiment assesses the service’s capability of identifying the devices’ presence within a specific distance: devices within a radius of X meters are considered in proximity. This experiment is repeated in multiple setups with different distances:
• Scenario A: radius of 5 meters while there is a continuous noise and a speaker is playing on the meeting table.
• Scenario B: radius of 8 meters while there is a continuous noise and a speaker is playing on the meeting table.
• Scenario C: radius of 5 meters while there is a continuous noise.
• Scenario D: radius of 8 meters while there is a continuous noise.

Results As shown in Figure 5.11, the overall accuracy is high when the combination of all the techniques, i.e. Ultrasound+, is used. A crucial observation in Scenarios C and D concerns detecting proximity via audio: the WiFi accuracy is equal to that of Audio+, which means that adding the audio didn’t improve the accuracy in those particular scenarios. The explanation is that the noise is prominent and shared between all the devices, so the similarity between the audio recordings is high. However, in Scenarios A and B a speaker is playing on the meeting table, which brings additional context to the meeting area. As a result, adding the audio features improved the overall accuracy by 3.4% in Scenario A and by 4.3% in Scenario B. Interestingly, the WiFi accuracy is higher than in the restricted space scenarios, which indicates that the WiFi signal characteristics are better suited as a coarse-grained proximity indicator. Comparing the final accuracy for WiFi, Audio+ and Ultrasound+ against the Random Baseline, the service performs better in the four scenarios.

[Figure: grouped bar chart. Series: WiFi, Audio+, Ultrasound+. Axis: Accuracy. Groups: Scenario A, Scenario B, Scenario C, Scenario D, Random Baseline.]

Figure 5.11: Open space: proximity detection accuracy.

For a deeper analysis of the audio features’ performance, Figure 5.12 shows the individual features along with their accuracy for all the scenarios of this experiment. The audio features’ performance in the open space scenarios is not as reliable as in the restricted space scenarios due to the absence of boundaries. In addition, when a speaker is playing on the meeting table, the Landmark fingerprint produced the best accuracy for identifying the users’ proximity. This means that the audio features can’t detect the users’ proximity in an open space unless there is an additional conversation or sound being played in the meeting area besides the area’s shared noise. Otherwise, a random estimation of the users’ proximity can produce results similar to those of the combination of all the features.

[Figure: grouped bar chart. Series: Power Spectrogram, Landmark, MFCC, Combination. Axis: Accuracy. Groups: Scenario A, Scenario B, Scenario C, Scenario D, Random Baseline.]

Figure 5.12: Open space: audio features performance.

WiFi Metrics Evaluation As shown in the previous experiments, WiFi signals can provide useful information about the surrounding environment. Although WiFi signals do not respect room boundaries, such as walls and doors, a coarse-grained proximity estimation is achievable. To show how this information is used as a proximity indicator, Figures 5.13 and 5.14 show how each feature’s value changes as a function of the distance between the devices in meters.

[Figure: four line plots of metric value against Distance (m), from 0 to 8 m: (a) Correlation-based (Cross Correlation, Pearson Correlation), (b) Jaccard Similarity Coefficient (AP Presence, Jaccard +- 10), (c) Euclidean Distance, (d) Manhattan Distance.]

Figure 5.13: WiFi similarity methods: showing the relationship between the change in each method’s result and the distance between devices.

[Figure: line plot of the Free Space Path-Loss distance value against Distance (m), from 0 to 8 m.]

Figure 5.14: WiFi similarity method: showing how the value of Free Space Path-Loss (FSPL) changes when the distance between devices increases.

In the restricted and open space experiments, the same feature thresholds were used, and the final accuracy achieved is between 78% and 88% for all the scenarios. This steady accuracy indicates that the thresholds remain reliable when the environment changes. To reason further about how the final accuracy is produced, the performance of each feature in detecting the users’ presence within a radius of 5 meters is investigated and presented in Figure 5.15. The best performing single features are the correlation-based methods, P-corr and X-corr, over the RSSI values. This is due to multiple factors related to the environment’s setup and the data pre-processing. In terms of environment, the main factor is the total number of visible access points, since the accuracy of the correlation-based features improves as the number of visible access points increases. In the previous experiments, the average number of access points is 110. Moreover, the service considers all the access points captured by both devices, not just the shared ones: it replaces the RSSI of any missing access point with a penalty value of -200 dBm, which amplifies the effect of missing access points on the final accuracy.
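The pre-processing and metrics described above can be illustrated with a short sketch. It assumes each scan is a dictionary mapping an access point identifier to its RSSI in dBm; the -200 dBm penalty is the value stated in the text, while the function names and sample scans are hypothetical and the prototype's actual implementation may differ. The "Jaccard +- 10" variant is not covered because its exact definition is not given in this section.

```python
import math

MISSING_RSSI_DBM = -200.0  # penalty for an access point seen by one device only


def align(scan_a, scan_b):
    """Build two RSSI vectors over the union of access points."""
    bssids = sorted(set(scan_a) | set(scan_b))
    vec_a = [scan_a.get(bssid, MISSING_RSSI_DBM) for bssid in bssids]
    vec_b = [scan_b.get(bssid, MISSING_RSSI_DBM) for bssid in bssids]
    return vec_a, vec_b


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0.0 or dev_y == 0.0:
        return 0.0
    return cov / (dev_x * dev_y)


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def ap_presence_jaccard(scan_a, scan_b):
    """Shared access points divided by all observed access points."""
    union = set(scan_a) | set(scan_b)
    if not union:
        return 0.0
    return len(set(scan_a) & set(scan_b)) / len(union)


# Usage with made-up scans: two nearby devices and one distant device.
near = {"ap1": -40.0, "ap2": -60.0, "ap3": -75.0}
same_room = {"ap1": -42.0, "ap2": -63.0, "ap3": -70.0}
far = {"ap1": -80.0, "ap4": -50.0, "ap5": -65.0}
print(pearson(*align(near, same_room)), pearson(*align(near, far)))
```

Because unshared access points enter the vectors at -200 dBm, a device that sees a different set of access points is pushed toward a low or negative correlation and a large distance, which matches the amplifying effect described in the text.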

[Figure: bar chart. Axis: Accuracy. Bars: P-corr, X-corr, Jaccard +- 10, AP Presence, Manhattan, Euclidean, FSPL.]

Figure 5.15: WiFi similarity methods: single feature accuracy results for 5 meters distance radius.
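For the FSPL feature shown in Figures 5.14 and 5.15, the standard free-space path-loss relation can be inverted to turn a path loss into a distance. The sketch below uses the common form with distance in meters and frequency in MHz. How the service derives the path loss from RSSI is not described in this section, so `estimate_distance_from_rssi` and its 20 dBm transmit power are assumptions made only for illustration.

```python
import math

FSPL_CONSTANT_DB = 27.55  # constant for distance in meters and frequency in MHz


def fspl_db(distance_m, frequency_mhz):
    """Free-space path loss in dB for a given distance and frequency."""
    return (20.0 * math.log10(distance_m)
            + 20.0 * math.log10(frequency_mhz)
            - FSPL_CONSTANT_DB)


def distance_m(path_loss_db, frequency_mhz):
    """Invert the free-space model: distance in meters for a path loss."""
    exponent = (path_loss_db + FSPL_CONSTANT_DB
                - 20.0 * math.log10(frequency_mhz)) / 20.0
    return 10.0 ** exponent


def estimate_distance_from_rssi(rssi_dbm, frequency_mhz, tx_power_dbm=20.0):
    """Assumed link budget: path loss = transmit power - received power."""
    return distance_m(tx_power_dbm - rssi_dbm, frequency_mhz)


# Usage: path loss at 5 m on WiFi channel 1 (2412 MHz), then back to meters.
loss = fspl_db(5.0, 2412.0)
print(round(loss, 2), round(distance_m(loss, 2412.0), 2))
```

The free-space model ignores walls, reflections and antenna gains, so indoors it gives only a rough, relative distance.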

5.5 Discussion In this chapter, the WiFi-Direct connection’s performance was tested in terms of latency and throughput in multiple scenarios. The experiments showed that the throughput and the latency in an idle connection state are similar to those of the conventional WiFi infrastructure. However, with a continuous download running in the background, the average latency of WiFi-Direct is lower than that of conventional WiFi due to the direct connection between devices. Furthermore, the effect of verifying the users’ proximity on the connection latency was investigated. The connection’s latency increases when verifying the proximity of additional clients, reaching up to 200 ms in the case of three devices, which is 2% more than in the case of a continuous download; this is due to the additional overhead that the supernode device faces while concurrently serving multiple clients. Thus, the computational power of the supernode’s mobile device is an important factor that affects the WiFi-Direct connection’s latency.

Regarding the results of the duration and energy consumption experiments, although one device requires at least 36 seconds to join a meeting, the duration increases only moderately when more devices join: an additional 5 seconds on average for every new device. Furthermore, the average energy consumption reaches up to 213 mW for three devices forming a meeting. This battery drain does not add a high overhead, since the tablet lasts 48.6 hours at an average power of 450 mW, as shown in the device specifications in Table 5.1. For the proximity detection methods, it takes up to 43.44 seconds to verify a user’s proximity via all the methods: WiFi, audio and ultrasound. Considering the average energy consumption of each method, WiFi and ultrasound require less energy than audio. Combining all the methods results in an average consumption of 685 mW for a client device and 674 mW for a supernode. To assess the service’s suitability for practical use, Figure 5.16 compares the proximity service, while continuously detecting the users’ proximity, against some of the known application categories reported by M2 Insights [111]. The proximity service’s consumption lies between the Sports and the News & Magazine application categories. Taking into account how long users engage with those applications, we can conclude that the service’s consumption is not high from the users’ perspective. The energy consumption and duration discussed above result from using all the methods with all of their features. However, the service can perform well in a different configuration that takes less time and fewer resources. For example, proximity detection via ultrasound can be enabled only in the case of total silence, to overcome the audio limitation, and when the total accuracy is not certain enough. Besides the energy savings, this removes 8 to 9 seconds from the duration of the common scenarios. Additionally, as observed in the previous experiments, the combination of the Power Spectrogram, MFCC and Landmark fingerprint doesn’t produce a noticeable improvement. Therefore, depending on the scenario, one or two of those features can be disabled without affecting the overall accuracy, which reduces the duration by 3 to 5 seconds for each feature.
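To put the group formation overhead in perspective, the figures quoted in this chapter can be combined in a back-of-the-envelope calculation. The sketch assumes that the tablet's usable battery energy equals its stated runtime multiplied by its stated average power, and that the three-device group formation draws its average power for the whole open space duration of 51.7 seconds; both are simplifications introduced here, not measurements.

```python
def energy_mwh(power_mw, duration_s):
    """Energy in milliwatt-hours drawn at a constant average power."""
    return power_mw * duration_s / 3600.0


# Figures quoted in the text.
TABLET_RUNTIME_H = 48.6
TABLET_AVG_POWER_MW = 450.0
FORMATION_POWER_MW = 213.0    # three devices forming a meeting
FORMATION_DURATION_S = 51.7   # open space, three devices

# Assumption: usable battery energy = runtime x average power.
capacity_mwh = TABLET_RUNTIME_H * TABLET_AVG_POWER_MW
formation_mwh = energy_mwh(FORMATION_POWER_MW, FORMATION_DURATION_S)
share = formation_mwh / capacity_mwh

print(f"implied capacity: {capacity_mwh:.0f} mWh")
print(f"one group formation: {formation_mwh:.2f} mWh ({share:.4%} of capacity)")
```

Under these assumptions a single group formation costs about 3 mWh, roughly 0.01% of the implied battery energy, which is consistent with the conclusion that the overhead is low.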

[Figure: horizontal bar chart of Energy (mW): Arcade Games 916, Puzzle Games 765, Sports 690, Proximity Client 685, Proximity Supernode 674, News & Magazine 660, Social 597.]

Figure 5.16: Comparing the proximity service’s energy consumption with the Sports, News & Magazine and Social application categories [111].

The proximity service’s overall accuracy is better than the random baseline in all the scenarios of the restricted and open space environments. However, when the audio features are taken separately, the random baseline produced higher accuracy when there is total silence inside and outside the room. Similarly, in the case of a prominent noise in an open space environment, if there is no speaker playing on the meeting table, the audio features’ accuracy is lower than the random baseline. This emphasizes the limitation of passively sensing the ambient sound: it cannot extract informative features in all cases. Table 5.7 provides a summary of the evaluation of all the proximity detection techniques.

Table 5.7: Summary of the proximity detection techniques’ performance and the influencing factors.

Property                      WiFi             Audio            Ultrasound
Accuracy (Restricted Space)   Coarse-grained   Fine-grained     Fine-grained
Accuracy (Open Space)         Coarse-grained   Coarse-grained   Fine-grained
Sensing Method                Passive          Passive          Active
Duration                      Medium           Long             Short
Energy Consumption            Low              Medium           Low

Influence Factors
WiFi: Number of deployed access points; accuracy increases when there are more access points available.
Audio: Existence of a unique sound in the meeting area; doesn’t work well in case of total silence or shared dominant noise.
Ultrasound: Existence of ambient noise at frequencies of 17-20 kHz can cause signal interference and thereby errors in decoding the exchanged message.


6 Conclusion and Future Work This thesis presents the design and implementation of a device-to-device proximity-based service that allows a group of devices to automatically establish a private meeting based on their relative position. This chapter summarizes the contributions of this work and discusses possible improvements of the implemented service.

6.1 Conclusion In this thesis, we have achieved the goal of providing a fine-grained estimation of devices’ proximity via passive WiFi and ambient sound sensing, in addition to active ultrasound probing. Initially, we described the background areas and the available communication paradigms, and reviewed the different localization techniques, explaining their strengths and limitations in inferring devices’ proximity. Then, we elaborated on the architecture that enables D2D communication via WiFi-Direct and analyzed the requirements of the possible use cases in order to design the service; as a result, all devices elect the device with the most powerful resources, including the available memory, battery resources and CPU usage, as a supernode. This supernode is responsible for managing the communication between devices and verifying their proximity. Taking the service design into account, we presented the implementation of multiple approaches to extract reliable features from different types of context information and to measure the similarity between two devices’ features using a variety of metrics. When the similarity between two devices’ features is above a specific threshold derived from experiments, the devices are considered in proximity. The features include the following. First, the presence of an access point and its RSSI value. Second, three features extracted from the recorded ambient sound: Power Spectrogram, Mel-Frequency Cepstral Coefficients (MFCC) and Landmark fingerprint. When the supernode receives the extracted features from a client device, it compares them with its own features to deduce the client’s proximity to its own location. Third, the supernode device broadcasts an encoded text message via inaudible sound signals, known as ultrasound, and requests the client to decode it; if the message is successfully decoded, the client is in proximity.

Subsequently, to evaluate the feasibility of the service for practical use, the WiFi-Direct connection’s latency and throughput are tested in multiple states: idle, with a download running in the background, and while the supernode is verifying the clients’ proximity. Then, the duration and energy consumption of forming a group and of verifying the users’ proximity are evaluated in two different environments: restricted and open spaces. Thereafter, the proximity service’s accuracy is evaluated in multiple concrete scenarios. The evaluation reveals the following findings. First, when the WiFi-Direct connection’s performance is compared to conventional WiFi, (1) it has lower latency with a continuous download running in the background and similar latency with an idle connection, (2) it has a highly similar throughput, and (3) increasing the number of connected devices significantly increases the latency due to the mobile devices’ limited computational power and resources. Second, the service’s energy consumption is not high from the users’ perspective when compared to the applications available on Google Play [112]. Finally, when using the combination of all features, the service’s accuracy is better than the random baseline in all scenarios of the restricted and open space environments. However, when the audio features are taken separately, the random baseline produced higher accuracy in the cases of total silence and of prominent noise, which emphasizes that a single type of context information cannot fulfill the requirements of all scenarios.

The contribution of this thesis is the application framework that (i) enables automatic device-to-device group formation via WiFi-Direct’s service advertising and discovery, (ii) collects and extracts reliable features from passive WiFi and ambient sound sensing, (iii) infers devices’ proximity by comparing the extracted features, in addition to active ultrasound probing, and (iv) allows devices in proximity to securely exchange files and messages. An Android application is built as a prototype; see the Appendix for screenshots of the user interface.
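The supernode election summarized above can be sketched as a deterministic scoring rule that every device evaluates over the same advertised resources. The text only states that available memory, battery resources and CPU usage are considered; the equal weighting, the normalization, the tie-break on the device identifier and all names below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceResources:
    device_id: str
    free_memory_mb: float
    battery_percent: float
    cpu_usage_percent: float


def score(device, max_memory_mb):
    """Average of three normalized terms; higher means more capable."""
    memory = device.free_memory_mb / max_memory_mb if max_memory_mb else 0.0
    battery = device.battery_percent / 100.0
    idle_cpu = 1.0 - device.cpu_usage_percent / 100.0
    return (memory + battery + idle_cpu) / 3.0


def elect_supernode(devices):
    """Pick the highest-scoring device; ties are broken by device_id so
    that every participant reaches the same result independently."""
    max_memory = max(d.free_memory_mb for d in devices)
    return max(devices, key=lambda d: (score(d, max_memory), d.device_id))


# Usage: two devices with made-up resource values.
alice = DeviceResources("alice", 800.0, 40.0, 70.0)
bob = DeviceResources("bob", 1600.0, 90.0, 20.0)
print(elect_supernode([alice, bob]).device_id)
```

A deterministic rule matters here because the election runs on every device without a coordinator, so all participants must arrive at the same supernode from the same inputs.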

6.2 Future Work In this section, we discuss possible improvements to the proximity service implemented in this work. First, we introduce two additional methods of proximity detection. Second, we provide an approach for improving the service’s accuracy and automating the configuration of the service parameters. Finally, we motivate investigating the privacy challenges resulting from using device-to-device communication for proximity services.

WiFi Probe Requests As Proximity Indicator Investigate the ability to detect the presence of devices by capturing the information sent along with their probe requests. On any WiFi-enabled device, the operating system caches an access point’s information after a successful connection in order to ease connecting to it the next time. Thus, mobile devices periodically broadcast probe requests to discover whether any of the cached access points is available. Capturing these requests yields several pieces of information about the device and the access point, including MAC addresses, RSSI, and the device’s vendor and type. This information can identify which devices are in proximity and reveal additional information about their position.

Visible Light Communication (VLC) As Proximity Indicator Like audio signals, light respects the environment’s boundaries. Therefore, multiple approaches can be researched, ranging from capturing a particular pattern of light signals transmitted over a specific frequency, through measuring the time and angle of arrival, to exchanging encoded text messages. Employing VLC can be useful in a variety of scenarios, such as total silence and high ambient noise, especially since LED signals do not interfere with the surroundings’ radio frequencies.

Make Use Of Machine Learning Techniques Machine learning techniques, such as clustering based on audio and WiFi features, can be employed to improve the service’s accuracy and automatically configure the service’s parameters. For instance, when only a few access points are available, the service can retrieve more WiFi scans in each fingerprint request or, depending on the environment, enable retrieving particular context information and disable the rest. Furthermore, the service’s algorithm can adapt better to various environments when the metric thresholds that determine the devices’ proximity are derived dynamically from the surrounding conditions.

Device-to-Device Privacy Challenges Investigate the privacy challenges arising from using the device-to-device communication paradigm for proximity services. As an example, for the devices to be discovered, each device advertises a service. These services can be discovered by any nearby device; what can this information reveal about the current user? Moreover, proximity-based services rely on the environment’s context information to infer the devices’ relative position; can the supernode conclude additional information about the user? Furthermore, the current implementation encrypts the entire communication via AES encryption; is AES the most suitable choice for this kind of connection and service?


7 Appendix Prototype User Interface


Figure 7.1: Prototype screenshots: (a) services advertising and discovery, (b) client device connecting to the elected supernode device.


Figure 7.2: Prototype screenshots: showing (a) WiFi and (b) audio features’ similarity results at the supernode device.


Figure 7.3: Prototype screenshots: (a) ultrasound decoded message and (b) client device view.


Figure 7.4: Prototype screenshots: showing the ability of sending (a) messages and (b) files to the meeting devices.


List of Figures

1.1 A context-aware service scenario using device-to-device communication.
1.2 Illustrates inferring individuals proximity from WiFi and sound information.
1.3 Example of room-level proximity service.
2.1 Proximity-based services paradigms.
2.2 WiFi and WiFi-Direct communication topology.
2.3 WiFi-Direct discovery and group formation flow.
2.4 LTE-Direct communication links between eNB and UEs.
2.5 Cellular Network localization methods.
2.6 WiFi vs. sound signal propagation.
3.1 Device-to-Device Service group formation: demonstrates the service’s flow between two devices, Alice and Bob’s. After both devices advertised and discovered each other, Bob’s device is selected as the supernode since it has better resources. Therefore, it starts a sockets server and Alice connects to it.
3.2 Service roles use-case scenarios: an overview of the Supernode and the Client’s roles responsibilities; diagramming the supernode and the client’s device as actors. Each line links an actor with a use-case that it interacts with.
3.3 Proximity Detection process’ sequence diagram.
4.1 Supernode Negotiation Flow Diagram: describes the flow of supernode negotiation process for both scenarios; creating a new and joining an existing meeting.

4.2 Power Spectrogram extraction steps.
4.3 Amplitude of 10 seconds recording for two devices inside the same room.
4.4 Amplitude of 10 seconds recording of a crowded environment.
4.5 Mel-Frequency Cepstral Coefficients (MFCC) extraction steps.
4.6 Landmark fingerprint: extracting fingerprint from robust frequency peaks and calculating hash keys.
4.7 Encoded ultrasound message structure; the message starts with a Preamble flag, RS-Codes are divided into parts and wrapped with Guard flags and the end of the message is identified by the Tail flag.
5.1 WiFi-Direct and conventional WiFi communication flow.
5.2 WiFi-Direct and WiFi connections’ latency in an idle and continuous download states scenarios.
5.3 The latency resulting from the supernode verifying the client devices’ proximity.
5.4 WiFi-Direct and conventional WiFi connections’ average throughput.
5.5 Restricted space: sketch for the room shape.
5.6 Restricted and Open Spaces: Group Formation phase duration and energy consumption evaluation results.
5.7 Proximity detection techniques’ duration and energy consumption evaluation results for restricted and open space environments.
5.8 Single audio features duration and energy consumption analysis.
5.9 Restricted space: proximity detection accuracy. Audio+ refers to the combination of WiFi and audio. Accordingly Ultrasound+ refers to the combination of WiFi, audio and ultrasound. Random Baseline scenario represents a ground truth accuracy for each method.
5.10 Restricted space: audio features performance.
5.11 Open space: proximity detection accuracy.
5.12 Open space: audio features performance.
5.13 WiFi similarity methods: showing the relationship between the change in each method’s result and the distance between devices.

5.14 WiFi similarity method: showing how the value of Free Space Path-Loss (FSPL) changes when the distance between devices increases.
5.15 WiFi similarity methods: single feature accuracy results for 5 meters distance radius.
5.16 Comparing the proximity service’s energy consumption with the Sports, New & Magazine and Social application categories [111].
7.1 Prototype screenshots: (a) services advertising and discovery, (b) client device connecting to the elected supernode device.
7.2 Prototype screenshots: showing (a) WiFi and (b) audio features’ similarity results at the supernode device.
7.3 Prototype screenshots: (a) ultrasound decoded message and (b) client device view.
7.4 Prototype screenshots: showing the ability of sending (a) messages and (b) files to the meeting devices.

List of Tables

1.1 Mobile Devices Sensors: comparison between IPhone 7, LG Nexus 5X and Samsung Galaxy S8 built-in sensors.
4.1 Wi-Fi P2P Intents: notifies registered applications when the state of the connection or the peers changes.
4.2 WiFi P2P Methods: allow the interaction with the WiFi hardware.
4.3 WiFiP2pDevice object properties.
5.1 Specifications of the devices’ models used in the experiments [108, 109].
5.2 Proximity detection latency experiment: a summary of the amount of data sent for each context type request and response.
5.3 WiFi features’ thresholds derived from pre-evaluation experiments: for each feature, the service defines a threshold; if the feature value is above a threshold, then the user is within X meters proximity.
5.4 Audio and ultrasound techniques signal settings derived from pre-evaluation experiments.
5.5 Power Spectrogram similarity metrics thresholds derived from pre-evaluation experiments.
5.6 MFCC similarity metrics thresholds derived from pre-evaluation experiments.
5.7 Summary of the proximity detection techniques’ performance and the influencing factors.

Bibliography

[1] J. L. Waycott. The appropriation of PDAs as learning and workplace tools: an activity theory perspective. Place of publication not identified, 2004.
[2] Cisco. Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016–2021 White Paper. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-white-paper-c11-520862.html. Accessed: 2017-06-07.
[3] G. D. Abowd, A. K. Dey, P. J. Brown, N. Davies, M. Smith, and P. Steggles. “Towards a better understanding of context and context-awareness.” In: International Symposium on Handheld and Ubiquitous Computing. Springer. 1999, pp. 304–307.
[4] B. N. Schilit and M. M. Theimer. “Disseminating active map information to mobile hosts.” In: IEEE network 8.5 (1994), pp. 22–32.
[5] S. Poslad. Ubiquitous computing: smart devices, environments and interactions. John Wiley & Sons, 2011.
[6] M. Baldauf, S. Dustdar, and F. Rosenberg. “A survey on context-aware systems.” In: International Journal of Ad Hoc and Ubiquitous Computing 2.4 (2007), pp. 263–277.
[7] S. Zickau, D. Thatmann, T. Ermakova, J. Repschläger, R. Zarnekow, and A. Küpper. “Enabling location-based policies in a healthcare cloud computing environment.” In: Cloud Networking (CloudNet), 2014 IEEE 3rd International Conference on. IEEE. 2014, pp. 333–338.
[8] Google. Google Maps. https://www.google.de/maps. Accessed: 2017-06-08.

[9] A. G. de Prado and G. Ortiz. “Context-aware services: A survey on current proposals.” In: The Third International Conferences on Advanced Service Computing. 2011, pp. 104–109.
[10] M. Othman, S. A. Madani, S. U. Khan, et al. “A survey of mobile cloud computing application models.” In: IEEE Communications Surveys & Tutorials 16.1 (2014), pp. 393–413.
[11] N. Fernando, S. W. Loke, and W. Rahayu. “Mobile cloud computing: A survey.” In: Future generation computer systems 29.1 (2013), pp. 84–106.
[12] B. Rao and L. Minakakis. “Evolution of mobile location-based services.” In: Communications of the ACM 46.12 (2003), pp. 61–65.
[13] S. Mascetti, C. Bettini, D. Freni, X. S. Wang, and S. Jajodia. “Privacy-aware proximity based services.” In: 2009 Tenth International Conference on Mobile Data Management: Systems, Services and Middleware. IEEE. 2009, pp. 31–40.
[14] C.-J. M. Liang, H. Jin, Y. Yang, L. Zhang, and F. Zhao. “Crossroads: A framework for developing proximity-based social interactions.” In: International Conference on Mobile and Ubiquitous Systems: Computing, Networking, and Services. Springer. 2013, pp. 168–180.
[15] P. A. Iannucci, R. Netravali, A. K. Goyal, and H. Balakrishnan. “Room-area networks.” In: Proceedings of the 14th ACM Workshop on Hot Topics in Networks. ACM. 2015, p. 9.
[16] W.-T. Tan, M. Baker, B. Lee, and R. Samadani. “The sound of silence.” In: Proceedings of the 11th ACM Conference on Embedded Networked Sensor Systems. ACM. 2013, p. 19.
[17] S. Androutsellis-Theotokis and D. Spinellis. “A survey of peer-to-peer content distribution technologies.” In: ACM computing surveys (CSUR) 36.4 (2004), pp. 335–371.
[18] M. Hefeeda. “Peer-to-peer systems.” In: School of Computing Science, Simon Fraser University, Surrey, Canada (2004).

[19] G. R. Hiertz, D. Denteneer, L. Stibor, Y. Zang, X. P. Costa, and B. Walke. “The IEEE 802.11 universe.” In: IEEE Communications Magazine 48.1 (2010), pp. 62–70.
[20] W. alliance. Wifi direct specifications. http://www.wi-fi.org/discover-wi-fi/wi-fi-direct. Accessed: 2016-11-20.
[21] Qualcomm. LTE Direct ProSe. https://www.qualcomm.com/invention/research/projects/lte-direct. Accessed: 2017-04-16.
[22] W. K. Edwards. “Discovery systems in ubiquitous computing.” In: IEEE Pervasive Computing 5.2 (2006), pp. 70–77.
[23] S.-Y. Lien, C.-C. Chien, G. S.-T. Liu, H.-L. Tsai, R. Li, and Y. J. Wang. “Enhanced LTE Device-to-Device Proximity Services.” In: IEEE Communications Magazine 54.12 (2016), pp. 174–182.
[24] X. Lin, J. Andrews, A. Ghosh, and R. Ratasuk. “An overview of 3GPP device-to-device proximity services.” In: IEEE Communications Magazine 52.4 (2014), pp. 40–48.
[25] S. Mumtaz, H. Lundqvist, K. M. S. Huq, J. Rodriguez, and A. Radwan. “Smart Direct-LTE communication: An energy saving perspective.” In: Ad Hoc Networks 13 (2014), pp. 296–311.
[26] Y. Wang, J. Tang, Q. Jin, and J. Ma. “BWMesh: a multi-hop connectivity framework on Android for proximity service.” In: Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), 2015 IEEE 12th Intl Conf on. IEEE. 2015, pp. 278–283.
[27] K. Ozsoy, A. Bozkurt, and I. Tekin. “Indoor positioning based on global positioning system signals.” In: Microwave and Optical Technology Letters 55.5 (2013), pp. 1091–1097.
[28] W. G. 3. “Indoor location test bed report.” In: Communications Security, Reliability and Interoperability Council III, Mar 14 2013. 2013.

[29] H. Liu, Y. Gan, J. Yang, S. Sidhom, Y. Wang, Y. Chen, and F. Ye. “Push the limit of WiFi based localization for smartphones.” In: Proceedings of the 18th annual international conference on Mobile computing and networking. ACM. 2012, pp. 305–316.
[30] S. P. Tarzia, P. A. Dinda, R. P. Dick, and G. Memik. “Indoor localization without infrastructure using the acoustic background spectrum.” In: Proceedings of the 9th international conference on Mobile systems, applications, and services. ACM. 2011, pp. 155–168.
[31] A. I. Kyritsis, P. Kostopoulos, M. Deriaz, and D. Konstantas. “A ble-based probabilistic room-level localization method.” In: Localization and GNSS (ICL-GNSS), 2016 International Conference on. IEEE. 2016, pp. 1–6.
[32] P. Sapiezynski, A. Stopczynski, D. K. Wind, J. Leskovec, and S. Lehmann. “Inferring Person-to-person Proximity Using WiFi Signals.” In: arXiv preprint arXiv:1610.04730 (2016).
[33] C. Martella, A. Miraglia, M. Cattani, and M. van Steen. “Leveraging proximity sensing to mine the behavior of museum visitors.” In: Pervasive Computing and Communications (PerCom), 2016 IEEE International Conference on. IEEE. 2016, pp. 1–9.
[34] Y. Gu, A. Lo, and I. Niemegeers. “A survey of indoor positioning systems for wireless personal networks.” In: IEEE Communications surveys & tutorials 11.1 (2009), pp. 13–32.
[35] I. Guvenc and C.-C. Chong. “A survey on TOA based wireless localization and NLOS mitigation techniques.” In: IEEE Communications Surveys & Tutorials 11.3 (2009).
[36] J. Liu, B. Priyantha, T. Hart, H. S. Ramos, A. A. Loureiro, and Q. Wang. “Energy efficient GPS sensing with cloud offloading.” In: Proceedings of the 10th ACM Conference on Embedded Network Sensor Systems. ACM. 2012, pp. 85–98.
[37] K. Lin, A. Kansal, D. Lymberopoulos, and F. Zhao. “Energy-accuracy trade-off for continuous mobile device location.” In: Proceedings of the 8th international conference on Mobile systems, applications, and services. ACM. 2010, pp. 285–298.

[38] J. Paek, J. Kim, and R. Govindan. “Energy-efficient rate-adaptive GPS-based positioning for smartphones.” In: Proceedings of the 8th international conference on Mobile systems, applications, and services. ACM. 2010, pp. 299–314.
[39] C. Papamanthou, F. P. Preparata, and R. Tamassia. “Algorithms for location estimation based on rssi sampling.” In: International Symposium on Algorithms and Experiments for Sensor Systems, Wireless Networks and Distributed Robotics. Springer. 2008, pp. 72–86.
[40] S. P. Subramanian, J. Sommer, S. Schmitt, and W. Rosenstiel. “Inr indoor navigator with RFID locator.” In: Next Generation Mobile Applications, Services and Technologies, 2009. NGMAST’09. Third International Conference on. IEEE. 2009, pp. 176–181.
[41] H. Hashemi. “The indoor radio propagation channel.” In: Proceedings of the IEEE 81.7 (1993), pp. 943–968.
[42] K. D’hoe, G. Ottoy, J.-P. Goemaere, and L. De Strycker. “Indoor room location estimation.” In: Advances in electrical and computer engineering 8.2 (2008), pp. 78–81.
[43] S. Holm. “Ultrasound positioning based on time-of-flight and signal strength.” In: Indoor Positioning and Indoor Navigation (IPIN), 2012 International Conference on. IEEE. 2012, pp. 1–6.
[44] M. Pérez, D. Gualda, J. Villadangos, J. Ureña, P. Pajuelo, E. Díaz, and E. García. “Android application for indoor positioning of mobile devices using ultrasonic signals.” In: Indoor Positioning and Indoor Navigation (IPIN), 2016 International Conference on. IEEE. 2016, pp. 1–7.
[45] P. Bahl and V. N. Padmanabhan. “RADAR: An in-building RF-based user location and tracking system.” In: INFOCOM 2000. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE. Vol. 2. IEEE. 2000, pp. 775–784.
[46] S.-Y. Jung, S. Hann, and C.-S. Park. “TDOA-based optical wireless indoor localization using LED ceiling lamps.” In: IEEE Transactions on Consumer Electronics 57.4 (2011).

[47] T. Komine and M. Nakagawa. “Fundamental analysis for visible-light communication system using LED lights.” In: IEEE transactions on Consumer Electronics 50.1 (2004), pp. 100–107.
[48] Y. Wang, L. Tao, X. Huang, J. Shi, and N. Chi. “8-Gb/s RGBY LED-based WDM VLC system employing high-order CAP modulation and hybrid post equalizer.” In: IEEE Photonics Journal 7.6 (2015), pp. 1–7.
[49] F. Li, C. Zhao, G. Ding, J. Gong, C. Liu, and F. Zhao. “A reliable and accurate indoor localization method using phone inertial sensors.” In: Proceedings of the 2012 ACM Conference on Ubiquitous Computing. ACM. 2012, pp. 421–430.
[50] C. Hide, T. Botterill, and M. Andreotti. “Low cost vision-aided IMU for pedestrian navigation.” In: Ubiquitous Positioning Indoor Navigation and Location Based Service (UPINLBS), 2010. IEEE. 2010, pp. 1–7.
[51] J. C. Ching, C. Domingo, K. Iglesia, C. Ngo, and N. Chua. “Mobile indoor positioning using Wi-Fi localization and image processing.” In: Theory and Practice of Computation. Springer, 2013, pp. 242–256.
[52] A. Mulloni, D. Wagner, I. Barakonyi, and D. Schmalstieg. “Indoor positioning and navigation with camera phones.” In: IEEE Pervasive Computing 8.2 (2009).
[53] D. Namiot and M. Sneps-Sneppe. “Wi-Fi Proximity and Context-aware Browsing.” In: ICDT. 2012.
[54] F. Lassabe, P. Canalda, P. Chatonnay, and F. Spies. “Indoor Wi-Fi positioning: techniques and systems.” In: Annals of telecommunications-Annales des télécommunications 64.9-10 (2009), pp. 651–664.
[55] N. Pritt. “Indoor location with Wi-Fi fingerprinting.” In: Applied Imagery Pattern Recognition Workshop (AIPR): Sensing for Control and Augmentation, 2013 IEEE. IEEE. 2013, pp. 1–8.
[56] K. Kaemarungsi and P. Krishnamurthy. “Modeling of indoor positioning systems based on location fingerprinting.” In: INFOCOM 2004. Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies. Vol. 2. IEEE. 2004, pp. 1012–1022.

[57] A. Stopczynski, V. Sekara, P. Sapiezynski, A. Cuttone, M. M. Madsen, J. E. Larsen, and S. Lehmann. “Measuring large-scale social networks with high resolution.” In: PloS one 9.4 (2014), e95978.
[58] M. Kotaru, K. Joshi, D. Bharadia, and S. Katti. “Spotfi: Decimeter level localization using wifi.” In: ACM SIGCOMM Computer Communication Review. Vol. 45. 4. ACM. 2015, pp. 269–282.
[59] A. Wang et al. “An Industrial Strength Audio Search Algorithm.” In: ISMIR. Vol. 2003. Washington, DC. 2003, pp. 7–13.
[60] M. Wirz, D. Roggen, and G. Tröster. “A wearable, ambient sound-based approach for infrastructureless fuzzy proximity estimation.” In: International Symposium on Wearable Computers (ISWC) 2010. IEEE. 2010, pp. 1–4.
[61] D. P. Ellis, H. Satoh, and Z. Chen. “Detecting proximity from personal audio recordings.” In: INTERSPEECH. 2014, pp. 2519–2523.
[62] A. Wang. “The Shazam music recognition service.” In: Communications of the ACM 49.8 (2006), pp. 44–48.
[63] C. V. Cotton and D. P. Ellis. “Audio fingerprinting to identify multiple videos of an event.” In: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE. 2010, pp. 2386–2389.
[64] J. Haitsma and T. Kalker. “A highly robust audio fingerprinting system.” In: Ismir. Vol. 2002. 2002, pp. 107–115.
[65] J. Haitsma, T. Kalker, and J. Oostveen. “Robust audio hashing for content identification.” In: International Workshop on Content-Based Multimedia Indexing. Vol. 4. Citeseer. 2001, pp. 117–124.
[66] P. Cano, E. Batlle, T. Kalker, and J. Haitsma. “A review of audio fingerprinting.” In: Journal of VLSI signal processing systems for signal, image and video technology 41.3 (2005), pp. 271–284.
[67] H. Satoh, M. Suzuki, Y. Tahiro, and H. Morikawa. “Ambient sound-based proximity detection with smartphones.” In: Proceedings of the 11th ACM Conference on Embedded Networked Sensor Systems. ACM. 2013, p. 58.

[68] M. Hazas and A. Hopper. “Broadband ultrasonic location systems for improved indoor positioning.” In: IEEE Transactions on mobile Computing 5.5 (2006), pp. 536–547.
[69] A. API. Wi-Fi Peer-to-Peer. https://developer.android.com/guide/topics/connectivity/wifip2p.html. Accessed: 2016-11-20.
[70] DNS-SD. DNS SRV (RFC 2782) Service Types. http://www.dns-sd.org/ServiceTypes.html. Accessed: 2017-04-30.
[71] C. Seeger, A. Buchmann, and K. Van Laerhoven. “Wireless sensor networks in the wild: Three practical issues after a middleware deployment.” In: Proceedings of the 6th International Workshop on Middleware Tools, Services and Run-time Support for Networked Embedded Systems. ACM. 2011, p. 1.
[72] P. Jaccard. Etude comparative de la distribution florale dans une portion des Alpes et du Jura. Impr. Corbaz, 1901.
[73] P. Sapiezynski, R. Gatej, A. Mislove, and S. Lehmann. “Opportunities and challenges in crowdsourced wardriving.” In: Proceedings of the 2015 ACM Conference on Internet Measurement Conference. ACM. 2015, pp. 267–273.
[74] Z. Zhang, X. Zhou, W. Zhang, Y. Zhang, G. Wang, B. Y. Zhao, and H. Zheng. “I am the antenna: accurate outdoor AP location using smartphones.” In: Proceedings of the 17th annual international conference on Mobile computing and networking. ACM. 2011, pp. 109–120.
[75] R. N. Bracewell and R. N. Bracewell. The Fourier transform and its applications. Vol. 31999. McGraw-Hill New York, 1986.
[76] C. M. Rader, L. R. Rabiner, and R. W. Schafer. Digital Processing of Speech Signals. Signal Processing Series. 1980.
[77] E. Manders, F. Verbeek, and J. Aten. “Measurement of co-localization of objects in dual-colour confocal images.” In: Journal of microscopy 169.3 (1993), pp. 375–382.
[78] B. McCune, J. B. Grace, and D. L. Urban. Analysis of ecological communities. Vol. 28. MjM software design Gleneden Beach, 2002.

[79] P. E. Black. “Manhattan distance.” In: Dictionary of Algorithms and Data Structures 18 (2006), p. 2012.
[80] R. Poovendran, C. Wang, and S. Roy. Secure localization and time synchronization for wireless sensor and ad hoc networks. Vol. 30. Springer Science & Business Media, 2007.
[81] J. B. Andersen, T. S. Rappaport, and S. Yoshida. “Propagation measurements and models for wireless communications channels.” In: IEEE Communications Magazine 33.1 (1995), pp. 42–49.
[82] A. V. Oppenheim. Discrete-time signal processing. Pearson Education India, 1999.
[83] J. G. Proakis and D. G. Manolakis. Digital signal processing: principles, algorithms, and applications. 1996.
[84] W. T. Cochran, J. W. Cooley, D. L. Favin, H. D. Helms, R. A. Kaenel, W. W. Lang, G. Maling, D. E. Nelson, C. M. Rader, and P. D. Welch. “What is the fast Fourier transform?” In: Proceedings of the IEEE 55.10 (1967), pp. 1664–1674.
[85] A. Bhattacharyya. “On a measure of divergence between two multinomial populations.” In: Sankhya: the indian journal of statistics (1946), pp. 401–406.
[86] G. Chowdhury. Introduction to modern information retrieval. Facet publishing, 2010.
[87] S. Davis and P. Mermelstein. “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences.” In: IEEE transactions on acoustics, speech, and signal processing 28.4 (1980), pp. 357–366.
[88] F. Zheng, G. Zhang, and Z. Song. “Comparison of different implementations of MFCC.” In: Journal of Computer science and Technology 16.6 (2001), pp. 582–589.
[89] V. Peltonen, J. Tuomi, A. Klapuri, J. Huopaniemi, and T. Sorsa. “Computational auditory scene recognition.” In: Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on. Vol. 2. IEEE. 2002, pp. II–1941.
[90] M. McKinney and J. Breebaart. “Features for audio and music classification.” In: (2003).

[91] S. S. Stevens, J. Volkmann, and E. B. Newman. “A scale for the measurement of the psychological magnitude pitch.” In: The Journal of the Acoustical Society of America 8.3 (1937), pp. 185–190.
[92] N. Ahmed, T. Natarajan, and K. R. Rao. “Discrete cosine transform.” In: IEEE transactions on Computers 100.1 (1974), pp. 90–93.
[93] H. C. Andrews. “Multidimensional rotations in feature selection.” In: IEEE Transactions on Computers 100.9 (1971), pp. 1045–1051.
[94] D. J. Berndt and J. Clifford. “Using dynamic time warping to find patterns in time series.” In: KDD workshop. Vol. 10. 16. Seattle, WA. 1994, pp. 359–370.
[95] L. Muda, M. Begam, and I. Elamvazuthi. “Voice recognition algorithms using mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques.” In: arXiv preprint arXiv:1003.4083 (2010).
[96] R. De Maesschalck, D. Jouan-Rimbaud, and D. L. Massart. “The mahalanobis distance.” In: Chemometrics and intelligent laboratory systems 50.1 (2000), pp. 1–18.
[97] T. Kamei. Image feature extractor, an image feature analyzer and an image matching system. US Patent 6,243,492. June 2001.
[98] A. S.-L.-H. A. (ASHA). Noise levels. http://www.asha.org/public/hearing/Noise/. Accessed: 2016-11-20.
[99] I. Constandache, S. Agarwal, I. Tashev, and R. R. Choudhury. “Daredevil: indoor location using sound.” In: ACM SIGMOBILE Mobile Computing and Communications Review 18.2 (2014), pp. 9–19.
[100] G. Borriello, A. Liu, T. Offer, C. Palistrant, and R. Sharp. “Walrus: wireless acoustic location with room-level resolution using ultrasound.” In: Proceedings of the 3rd international conference on Mobile systems, applications, and services. ACM. 2005, pp. 191–203.
[101] J. Hoppe, F. Höflinger, and L. Reindl. “Acoustic receivers for indoor smartphone localization.” In: Proc. of International Conference on Indoor Positioning and Indoor Navigation (IPIN2012). 2012.

[102] I. S. Reed and G. Solomon. “Polynomial codes over certain finite fields.” In: Journal of the society for industrial and applied mathematics 8.2 (1960), pp. 300–304.
[103] A. Bo, G. Jian-Hua, and W. Yong. “Symbol synchronization technique in COFDM systems.” In: IEEE transactions on Broadcasting 50.1 (2004), pp. 56–62.
[104] S. B. Wicker and V. K. Bhargava. Reed-Solomon codes and their applications. John Wiley & Sons, 1999.
[105] C. E. Shannon. “Communication in the presence of noise.” In: Proceedings of the IRE 37.1 (1949), pp. 10–21.
[106] S. Kumar and R. Gupta. “Bit error rate analysis of Reed-Solomon code for efficient communication system.” In: International Journal of Computer Applications 30.12 (2011), pp. 11–15.
[107] G. Navarro. “A guided tour to approximate string matching.” In: ACM computing surveys (CSUR) 33.1 (2001), pp. 31–88.
[108] SAMSUNG. Samsung Galaxy Tab A 10.1" Specs. http://www.samsung.com/us/system/consumer/product/sm/t5/80/smt580nzkaxar/TAB-GALAXYTABADSHTNOV16TFinal11-10-16.pdf. Accessed: 2017-06-25.
[109] SAMSUNG. Samsung Galaxy Tab S5 Specs. http://www.samsung.com/uk/business/business-products/smartphones/smartphones/SM-G900FZWABTU. Accessed: 2017-07-04.
[110] L. Zhang, B. Tiwana, R. P. Dick, Z. Qian, Z. M. Mao, Z. Wang, and L. Yang. “Accurate online power estimation and automatic battery behavior based power model generation for smartphones.” In: Hardware/Software Codesign and System Synthesis (CODES+ ISSS), 2010 IEEE/ACM/IFIP International Conference on. IEEE. 2010, pp. 105–114.
[111] M. Insights. Average Foreground Battery Drain for Android App Categories – An M2 App Insight Report. http://www.m2mobileinsights.com/blog/average-foreground-battery-drain-for-android-app-categories-an-m2-app-insight-report/. Accessed: 2017-06-27.

[112] Google.com. Google Play. https://play.google.com/store. Accessed: 2017-06-27.
[113] Softel. Ping(Host) Monitor. https://play.google.com/store/apps/details?id=bg.softel.pingmonitor&hl=en. Accessed: 2017-07-04.
[114] Z. Pallagi. iPerf for Android. https://play.google.com/store/apps/details?id=com.magicandroidapps.iperf&hl=en. Accessed: 2017-07-04.
[115] MagicAndroidApps.com. WiFi Speed Test. https://play.google.com/store/apps/details?id=com.pzolee.android.localwifispeedtester&hl=en. Accessed: 2017-07-04.
[116] L. Lei, Z. Zhong, C. Lin, and X. Shen. “Operator controlled device-to-device communications in LTE-advanced networks.” In: IEEE Wireless Communications 19.3 (2012), p. 96.
[117] G. Fodor, E. Dahlman, G. Mildh, S. Parkvall, N. Reider, G. Miklós, and Z. Turányi. “Design aspects of network assisted device-to-device communications.” In: IEEE Communications Magazine 50.3 (2012), pp. 170–177.
[118] A. Smailagic and D. Kogan. “Location sensing and privacy in a context-aware computing environment.” In: IEEE Wireless Communications 9.5 (2002), pp. 10–17.
[119] N. Bulusu, J. Heidemann, and D. Estrin. “GPS-less low-cost outdoor localization for very small devices.” In: IEEE personal communications 7.5 (2000), pp. 28–34.
[120] T. He, C. Huang, B. M. Blum, J. A. Stankovic, and T. Abdelzaher. “Range-free localization schemes for large scale sensor networks.” In: Proceedings of the 9th annual international conference on Mobile computing and networking. ACM. 2003, pp. 81–95.
[121] J. Blumenthal, R. Grossmann, F. Golatowski, and D. Timmermann. “Weighted centroid localization in zigbee-based sensor networks.” In: Intelligent Signal Processing, 2007. WISP 2007. IEEE International Symposium on. IEEE. 2007, pp. 1–6.

[122] R. Behnke and D. Timmermann. “AWCL: adaptive weighted centroid localization as an efficient improvement of coarse grained localization.” In: Positioning, Navigation and Communication, 2008. WPNC 2008. 5th Workshop on. IEEE. 2008, pp. 243–250.
[123] M. Bouet and G. Pujolle. “A range-free 3-D localization method for RFID tags based on virtual landmarks.” In: 2008 IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications. IEEE. 2008, pp. 1–5.
[124] K. Doppler, M. Rinne, C. Wijting, C. B. Ribeiro, and K. Hugl. “Device-to-device communication as an underlay to LTE-advanced networks.” In: IEEE Communications Magazine 47.12 (2009), pp. 42–49.
[125] K. Doppler, J. Manssour, A. Osseiran, and M. Xiao. “Innovative concepts in peer-to-peer and network coding.” In: Celtic Telecommunication Solutions 16 (2008), p. 09.
[126] A. Madhavapeddy, D. Scott, and R. Sharp. “Context-aware computing with sound.” In: International Conference on Ubiquitous Computing. Springer. 2003, pp. 315–332.
[127] J. K. S. Lau, C.-K. Tham, and T. Luo. “Participatory cyber physical system in public transport application.” In: Utility and Cloud Computing (UCC), 2011 Fourth IEEE International Conference on. IEEE. 2011, pp. 355–360.
[128] Yelp.com. Yelp.com. https://www.yelp.com.sg/. Accessed: 2017-03-18.
[129] F.-J. Wu and T. Luo. “Infrastructureless signal source localization using crowdsourced data for smart-city applications.” In: Communications (ICC), 2015 IEEE International Conference on. IEEE. 2015, pp. 586–591.
[130] D. Lazer, A. S. Pentland, L. Adamic, S. Aral, A. L. Barabasi, D. Brewer, N. Christakis, N. Contractor, J. Fowler, M. Gutmann, et al. “Life in the network: the coming age of computational social science.” In: Science (New York, NY) 323.5915 (2009), p. 721.
[131] R. Zekavat and R. M. Buehrer. Handbook of position location: Theory, practice and advances. Vol. 27. John Wiley & Sons, 2011.

[132] J.-S. Lee, Y.-W. Su, and C.-C. Shen. “A comparative study of wireless protocols: Bluetooth, UWB, ZigBee, and Wi-Fi.” In: Industrial Electronics Society, 2007. IECON 2007. 33rd Annual Conference of the IEEE. IEEE. 2007, pp. 46–51.
[133] 802.11u-2011. “Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications Amendment 9: Interworking with External Networks.” In: 2011.
[134] A. Bhattacharyya. “On a measure of divergence between two statistical population defined by their population distributions.” In: Bulletin Calcutta Mathematical Society 35.99-109 (1943), p. 28.
[135] S. Shin. “Introduction to json (javascript object notation).” In: Presentation www.javapassion.com (2010).
[136] R. Want, A. Hopper, V. Falcao, and J. Gibbons. “The active badge location system.” In: ACM Transactions on Information Systems (TOIS) 10.1 (1992), pp. 91–102.
[137] M. Xu, L.-Y. Duan, J. Cai, L.-T. Chia, C. Xu, and Q. Tian. “HMM-based audio keyword generation.” In: Pacific-Rim Conference on Multimedia. Springer. 2004, pp. 566–574.
[138] M. Sahidullah and G. Saha. “Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition.” In: Speech Communication 54.4 (2012), pp. 543–565.
[139] V. Filonenko, C. Cullen, and J. Carswell. “Investigating ultrasonic positioning on mobile phones.” In: Indoor Positioning and Indoor Navigation (IPIN), 2010 International Conference on. IEEE. 2010, pp. 1–8.
[140] B. Clarkson, N. Sawhney, and A. Pentland. “Auditory context awareness via wearable computing.” In: Energy 400.600 (1998), p. 20.
