Development of a Contextual Thinking Engine in Mobile Devices

Ruizhi Chen¹, Tianxing Chu¹, Jingbin Liu², Yuwei Chen², Liang Chen², Wenchao Xu¹, Xiao Li¹, Juha Hyyppä², Jian Tang²

¹ Conrad Blucher Institute of Surveying & Science, Texas A&M University Corpus Christi, Corpus Christi, US. [email protected]

Abstract—This paper introduces a framework for contextual thinking in mobile devices. It is based on real-time sensing of local time, significant locations, location-dwelling states, and user states to infer significant activities. A significant activity is a well-defined activity to be inferred, for example, waiting for a bus, having a meeting, working in an office, or taking a break in a coffee shop. A significant location is defined as a geofence, which can be a node with an associated circle, or a polygon. A location-dwelling state is defined by entering a significant location, the location-dwelling duration, or exiting a significant location. A user state is a combination of user mobility states, user actions, user social states, and even psychological states. In this initial study, we focus only on the user motion states, including static, slow walking, walking, and fast moving (fast walking or driving). However, the framework and the activity inference algorithm are flexible enough to adopt other user states in the future. Using the measurements of the built-in sensors and radio signals in mobile devices, we capture a snapshot of a contextual tuple every second, which includes a time tag, the ID of a significant location, a location-dwelling duration, and a user state. The sequence of contextual tuples is used as the input for inferring the user's significant activities. The contextual thinking engine evaluates the posterior probability of each significant activity for each given contextual tuple using a Bayesian approach. An "undefined" activity is adopted to cover all activities other than the selected significant activities. A prototype of the contextual thinking engine has been developed in the Geospatial Computing Lab at Texas A&M University Corpus Christi. A test environment was set up on the campus. Six significant activities were defined and tested by two different testers over three days using two different smartphones. These significant activities are: 1) working in an office; 2) having a meeting; 3) having a lunch; 4) having a coffee break; 5) visiting the library; and 6) waiting for a bus. An "undefined" activity was included to cover all other activities. The inferred activities were then compared with the labeled activities to assess the performance of the contextual thinking engine. We demonstrated that the success rate of inference was more than 90% on average. We recognized that the positioning accuracy plays a significant role in the inference algorithm because it has a direct impact on two elements of the contextual tuple: the significant location and the location-dwelling duration.

Keywords—contextual thinking; context awareness; location awareness; smartphone positioning.

UPINLBS 2014, Corpus Christi, Texas, Nov 20-21, 2014 978-1-4799-6004-0/14/$31.00 ©2014 IEEE

² Department of Photogrammetry and Remote Sensing, Finnish Geodetic Institute, Masala, Finland

I. INTRODUCTION

Smartphones are nowadays equipped with various built-in sensors and support multiple radio signals [1, 2]. The built-in sensors include a GNSS (Global Navigation Satellite System) receiver, accelerometer, magnetometer, gyroscope, and barometer, while the radio signals include WiFi, Bluetooth, and cellular signals [2]. The sensors and radio signals offer us a unique opportunity to locate mobile users indoors and outdoors. In outdoor environments, a low-cost GNSS receiver is capable of locating mobile users with an accuracy of about 5-10 meters [3], which is adequate for many navigation and user-tracking applications and a growing number of location-based services such as Google Now. To locate mobile users indoors, measurements from other sensors such as the accelerometer, gyroscope, and magnetometer and from radio signals such as WiFi and Bluetooth are required [4-7]. The indoor positioning accuracy varies depending on the level of support from the positioning environment, e.g., the number of available WiFi access points and the spatial resolution of the radio map (or fingerprinting database). In any case, room-level positioning accuracy is achievable without any complex setup of the positioning environment [8, 9].

The location of a mobile user plays an essential role in enabling smart location-awareness applications such as Google Now. To move from location awareness to context awareness, additional information such as the local time (when), the location-dwelling duration (when), and user motion patterns and activity (what) is required. This information can be derived (sensed) using the measurements of the built-in sensors and radio signals. Accelerometers have been successfully used for detecting user motion patterns [10, 11], which are essential for creating context-aware applications [12-14]. Context awareness is more complicated than location awareness because we need to identify not only the user location, but also the activity occurring in the user's ambient environment.

This paper introduces a framework of activity inference based on real-time sensing of local time, user locations, location-dwelling duration, and user mobility context. A Naïve Bayes classifier is adopted in this study for inferring significant activities. The inferred activity can then be applied to developing context-aware applications. For example, your smartphone could automatically answer incoming calls while you are driving by sending a text or voice message to the callers. The phone


needs to detect your current activity "driving" before it can react on your behalf.

The paper is organized as follows: Section II discusses related research work; Section III presents the research methods and the theoretical background of the framework, including the algorithms for positioning, motion detection, and context inference; Section IV discusses the experimental setup, data analysis, and discussion; and Section V concludes the paper with a short description of our future study.

II. RELATED RESEARCH WORK

Despite the great potential of context-aware applications, a reliable commercial solution is, to our knowledge, not yet available. Google Now is one of the smart applications publicly available today with location-aware functionality; it pushes relevant information to mobile users based on their current location (http://www.google.com/landing/now/).

For context inference, Chen and Guinness [2] abstracted a context as a Context Pyramid, as shown in Figure 1, where the raw data from diverse sensors is the foundation. Based on the Raw Sensor Data, we can extract Physical Parameters such as position coordinates, total accelerations, heading, angular velocity, and moving speed. Features/Patterns of the physical parameters are generated for further pattern recognition in the Simple Contextual Descriptors, which infer simple contexts such as location, motion, and surroundings. Activity-Level Descriptors combine the simple contextual information into the activity level. At the top of the pyramid, Rich Context includes rich social and psychological contexts, which are ultimately expressed in natural language [2]. Inference of a rich context is a very challenging task, as it requires sensing a great deal of information to derive the required knowledge. This study attempts to infer the activity-level descriptors defined in Level 5 of Figure 1.

Figure 1. The context pyramid [2].


Pei et al. [15] developed a smartphone solution for detecting human behaviors using a LoMoCo (Location-Motion-Context) model, which utilizes user locations and motion states. They achieved an accuracy of 92.9% in detecting motion states using a Least Squares Support Vector Machine classifier. Reddy et al. [16] used a mobile phone with a GPS receiver and an accelerometer to determine the transportation modes of users, achieving 93.6% classification accuracy with discrete Hidden Markov Models. Bancroft et al. [17] used a foot-mounted device (with a GPS receiver and an inertial measurement unit) to determine different motion-related activities, such as walking, running, biking, and moving in a vehicle, reporting less than 1% classification error during 99% of the measurements. Jin et al. [18] detected several motion states using an armband-mounted accelerometer and a "fuzzy inference system," which can marginally be considered a machine learning technique. We are aware of several further studies that used both accelerometer data and standard machine learning techniques to determine user mobility or activity-related contexts [11, 19].

III. RESEARCH METHODS

The main goal of our study is to develop a framework for inferring significant activities in mobile devices. It is based on real-time sensing of local time, significant locations, location-dwelling states, and user states. A significant activity is a well-defined activity to be inferred, for example, waiting for a bus, having a meeting, working in an office, or taking a break in a coffee shop. A significant location is defined as a geofence, which can be a node with an associated circle, or a polygon. A location-dwelling state is defined by entering a significant location, the location-dwelling duration, or exiting a significant location. A user state is a combination of user mobility states, user actions, user social states, and even psychological states. In this initial study, we focus only on the user motion states, including static, slow walking, walking, and fast moving (fast walking or driving). However, the algorithms implemented in the framework are flexible enough to adopt other user states in future studies.

The implementation of the framework requires more information than the location alone. This information includes: the local time, which can be obtained directly from the smartphone; the user location indoors/outdoors, which can be derived with a positioning module implemented inside the smartphone; the user state, which can be derived, e.g., from the measurements of the motion sensors using machine learning approaches; and the location-dwelling duration, which can be obtained by analyzing the time series of user locations. A snapshot of a contextual tuple [local_time, location, location_dwelling_state, user_state] can be derived in real time by analyzing the sensor measurements and radio signal strengths. The elements in the contextual tuple belong to the physical parameters, features, and patterns listed in Levels 2 and 3 of Figure 1. The significant activities will be inferred every second by analyzing the latest set of the


contextual tuples using a Naïve Bayesian classifier. The activity classifier outputs the activity-level descriptors (significant activities) listed in Level 5 of Figure 1.

A. Ubiquitous Positioning

The ultimate goal of the positioning solution is to identify the current significant location, which is defined as a geofence that can host one or more significant activities. For example, offices, meeting rooms, cafeterias, coffee shops, libraries, and bus stops are typical significant locations of our daily working life. The geometry of a significant location can be defined as a circle or a polygonal area. To simplify the implementation of the framework in this initial study, we implemented a simple positioning solution that determines the significant locations by fingerprinting WiFi signals.

WiFi fingerprinting works in two phases: a data-training phase and a positioning phase. The data-training phase collects WiFi signals at known locations (called reference points) and generates the radio map of all reference points. The positioning phase finds the reference point y whose fingerprints best match the observed signal strengths. Our matching algorithm finds the reference point r in the database with the shortest mean signal distance l, as follows:

$$y = \arg\min_{r}\; l = \arg\min_{r}\; \frac{1}{n}\sum_{i=1}^{n} w_i^{2}\left(s'_{oi} - s_{ri}\right)^{2}$$

where n is the total number of matched access points found in both the observed RSSIs and the fingerprints, wi is the weight of the RSSI (Received Signal Strength Indication) measurement to the i-th access point, sri is the i-th RSSI fingerprint stored in the radio map (fingerprinting database), and s'oi is the normalized observed RSSI measurement, which is estimated with

$$s'_{oi} = s_{oi} - a, \qquad a = \frac{1}{n}\sum_{i=1}^{n}\left(s_{oi} - s_{ri}\right)$$

where a is the bias between the observed RSSI tuple and the fingerprints stored in the radio map. Introducing this bias correction to the observed RSSIs allows us to utilize the same fingerprinting database for different types of smartphones; it is very common for the RSSI measurements from different devices to have a significant bias.
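To make the matching step concrete, a minimal Python sketch of the bias-corrected fingerprint matching is given below. It is not the authors' implementation; the function name, the dictionary-based radio map layout, and the uniform default weights are illustrative assumptions (the paper's own weighting scheme is given next).

```python
import math

def match_fingerprint(observed, radio_map, weights=None):
    """Return the reference point whose fingerprints best match the observed
    RSSIs, using the bias-corrected mean signal distance described above.

    observed  : dict {ap_id: rssi} measured by the phone
    radio_map : dict {ref_point_id: {ap_id: rssi}} fingerprint database
    weights   : optional dict {ap_id: w_i}; defaults to 1.0 for every AP
    """
    best_point, best_distance = None, math.inf
    for ref_point, fingerprints in radio_map.items():
        # Use only access points seen in both the observation and the fingerprint.
        common = [ap for ap in observed if ap in fingerprints]
        n = len(common)
        if n == 0:
            continue
        # Device-dependent bias a: mean difference between observed and stored RSSIs.
        a = sum(observed[ap] - fingerprints[ap] for ap in common) / n
        # Weighted mean squared distance of the bias-corrected RSSIs.
        l = sum(
            (weights.get(ap, 1.0) if weights else 1.0) ** 2
            * ((observed[ap] - a) - fingerprints[ap]) ** 2
            for ap in common
        ) / n
        if l < best_distance:
            best_point, best_distance = ref_point, l
    return best_point, best_distance
```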


The weight wi is estimated with:

$$w_i = \begin{cases} 1, & n \le 10 \\ 100/n^{2}, & n > 10 \end{cases}$$

B. The User States

The user speed is determined based on the Pedestrian Dead Reckoning (PDR) algorithm developed by Chen et al. (2011) [4]. The user speed v is estimated with:

$$v = SL \cdot SF, \qquad SL = \left[\,0.7 + a\,(H - 1.75) + b\,(SF - 1.79)\,\frac{H}{1.75}\,\right] c$$

where SL is the step length, SF is the step frequency, H is the height of the pedestrian, and a, b, and c are model coefficients with values of 0.371, 0.227, and 1, respectively. The step frequency SF is detected in real time by analyzing cyclic patterns in the accelerometer measurements. This model yields an error of less than 3% in terms of travelled distance. We fixed the pedestrian height at 1.75 meters in this study instead of taking it as an input from the pedestrian. In this model, c is a personal factor that can be calibrated on-line using a known travelled distance. With this approach, user state detection is an unsupervised solution.
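A minimal sketch of the speed model and of the mapping from speed to the user-state IDs of Table 5 is shown below. The function names are illustrative, the coefficients and the fixed 1.75 m height follow the description above, and a separate accelerometer-based step-frequency detector is assumed.

```python
def step_length(step_freq, height=1.75, a=0.371, b=0.227, c=1.0):
    """Step length SL = [0.7 + a(H - 1.75) + b(SF - 1.79) * H / 1.75] * c."""
    return (0.7 + a * (height - 1.75) + b * (step_freq - 1.79) * height / 1.75) * c

def user_speed(step_freq, height=1.75):
    """Pedestrian speed v = SL * SF, with SF from an accelerometer step counter."""
    return step_length(step_freq, height) * step_freq

def user_state(speed):
    """Map speed (m/s) to the user-state IDs defined in Table 5."""
    if speed < 0.1:
        return 1   # static
    if speed < 0.7:
        return 2   # slow walking, less than one step per second
    if speed < 1.4:
        return 3   # walking, 1-2 steps per second
    return 4       # fast moving (fast walking or driving)
```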

C. The Location-Dwelling State

In this initial study, location-dwelling durations are used to define the location-dwelling states. The following location-dwelling durations are applied: 10 seconds, 60 seconds, and 300 seconds. A sliding-window approach is adopted to detect the location-dwelling states: a state is "true" if the mobile user occupies the same location for at least 90% of the time in the window, as sketched below.
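The sliding-window rule can be sketched as follows; the window sizes and the 90% occupancy threshold are taken from the text, while the function and variable names are illustrative only.

```python
def dwelling_state(location_history, windows=(10, 60, 300), threshold=0.9):
    """Return the largest window (in seconds) for which the user occupied the
    current significant location for at least `threshold` of the samples.

    location_history : list of significant-location IDs, one sample per second,
                       most recent sample last.
    Returns 10, 60, or 300, or None if no dwelling state is observed.
    """
    if not location_history:
        return None
    current = location_history[-1]
    observed = None
    for w in sorted(windows):
        if len(location_history) < w:
            break
        window = location_history[-w:]
        occupancy = sum(1 for loc in window if loc == current) / w
        if occupancy >= threshold:
            observed = w   # keep the largest window that satisfies the rule
    return observed
```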

D. Activity Inference

Activity inference is based on a Naïve Bayesian classifier, as shown in Figure 2. The inference utilizes the information of local time, user location, location-dwelling state, and user state. The posterior conditional probability p(Ak|x), which is the probability of activity Ak given the observed contextual tuple x = [local_time, significant_location, location_dwelling_state, user_state], can be derived with:

$$p(A_k \mid x) = \frac{p(A_k)\,p(x \mid A_k)}{p(x)} = \frac{p(A_k)\,p(x \mid A_k)}{\sum_{i=1}^{n_a} p(A_i)\,p(x \mid A_i)},$$

$$p(x \mid A_k) = \prod_{i=1}^{d} p(x_i \mid A_k)$$

where na is the total number of significant activities to be inferred and d is the dimension of the contextual tuple, which is four in our case.

Figure 2. Structure of the Naïve Bayes classifier used for significant activity inference: an activity node with prior p(A) is connected to four feature nodes, time p(x1|A), location p(x2|A), dwelling state p(x3|A), and user states p(x4|A).

In order to derive the posterior probability p(Ak|x), we need to know:
• the probability distributions for all p(xi|Ak), where i = 1…d and k = 1…na; in total, we need d × na probability density functions, which are the core of the activity inference engine;
• the probability distribution of the activities, p(A).

The best way of obtaining these probability distributions is through training. However, a very large data set is needed for training all of these models; this is an interesting research topic for our future study. Another approach is to adopt empirical models. For example, people normally have lunch between 11:00 AM and 1:00 PM, so the probability distribution of local time given the activity "having a lunch" can be estimated with the empirical model shown in Figure 3.

In this initial study, we adopt empirical models for all of the required probability density functions. The empirical models are based on common sense, e.g., the probability of having a meeting in a meeting room is much higher (e.g., 0.8) than at a bus stop (e.g., 0.01). Although these empirical models are only a first approximation, they are sufficient for this proof-of-concept study.
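A minimal sketch of this classifier is given below. Only the structure (prior times the product of the four per-feature likelihoods, normalized over all activities) follows the equations above; the dictionary-based representation of the empirical distributions and the small floor probability for unseen values are illustrative assumptions.

```python
def infer_activity(tuple_x, priors, likelihoods):
    """Naive Bayes inference over a contextual tuple.

    tuple_x     : dict with keys 'time', 'location', 'dwelling', 'user_state'
    priors      : dict {activity_id: p(A_k)}
    likelihoods : dict {activity_id: {feature_name: {feature_value: p(x_i | A_k)}}}
    Returns (best_activity, posterior_probabilities).
    """
    scores = {}
    for activity, p_a in priors.items():
        p = p_a
        for feature, value in tuple_x.items():
            # Unseen feature values fall back to a small floor probability.
            p *= likelihoods[activity].get(feature, {}).get(value, 1e-6)
        scores[activity] = p
    total = sum(scores.values()) or 1.0
    posteriors = {a: s / total for a, s in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors
```

In practice, the engine would evaluate this posterior once per second for the latest contextual tuple, as described above.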



Figure 3. The probability distribution of local time given the activity "having a lunch".

How to determine (train) these a priori conditional probability distributions automatically is still a challenging task because of its requirement for a very large data set; smart machine learning algorithms are needed in this case. The knowledge of these a priori probability distributions plays a significant role in the performance of the classifier.

IV. EXPERIMENT AND DATA ANALYSIS

To evaluate the performance of the Bayes classifier implemented in this framework, a test environment was set up on the campus of Texas A&M University Corpus Christi. It consists of seven activities to be inferred: six significant activities and one undefined activity that covers all activities other than the six well-defined significant activities. Table 2 lists the definitions of these activities and their prior probabilities.


TABLE 2. LIST OF ACTIVITIES

Activity-ID | Description                             | Probability distribution*
1           | Working                                 | 0.5
2           | Having a meeting                        | 0.125
3           | Having a lunch                          | 0.125
4           | Taking a coffee break                   | 0.0625
5           | Visiting library                        | 0.0625
6           | Waiting for bus                         | 0.025
7           | Other activities (undefined activities) | 0.1
Total       |                                         | 1.0
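For illustration, the activity priors of Table 2 can be transcribed directly into the priors table used by the classifier sketch in Section III; the dictionary below only restates the values in the table.

```python
# Activity priors p(A) from Table 2 (activity IDs as in the table).
ACTIVITY_PRIORS = {
    1: 0.5,      # working in office
    2: 0.125,    # having a meeting
    3: 0.125,    # having a lunch
    4: 0.0625,   # taking a coffee break
    5: 0.0625,   # visiting the library
    6: 0.025,    # waiting for a bus
    7: 0.1,      # other (undefined) activities
}
assert abs(sum(ACTIVITY_PRIORS.values()) - 1.0) < 1e-9
```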

The first feature in the contextual tuple is the local time, obtained directly from the system clock of the smartphone. For this parameter, we divided the local time into 24 one-hour intervals, as shown in Figure 3. The a priori conditional probabilities p(ti|Ak) vary depending on the given activity; Figure 3 shows those of the activity "having a lunch".

The second feature in the contextual tuple is the significant location, which is defined as a geofence. Table 3 lists all significant locations defined in the test environment on campus. It includes an undefined location that covers all locations other than the specified significant locations.

TABLE 3. LIST OF SIGNIFICANT LOCATIONS

Location-ID | Description
1           | Office
2           | Meeting room
3           | Cafeteria
4           | Coffee shop
5           | Library
6           | Bus stop
7           | All other locations

Figure 4 shows the positions of these significant locations; "Office" and "Meeting room" are located close together in the same laboratory. The a priori conditional probability density functions for the significant locations vary depending on the given activity.

Figure 4. Distribution of the significant locations (see Table 3).

The third feature in the contextual tuple is the location-dwelling state. Within the scope of this study, three location-dwelling states are used, as listed in Table 4. A location-dwelling state is "observed" if the user occupied the same significant location for more than 90% of the sliding window, which can be 10, 60, or 300 seconds in our case. In case multiple states are observed, the one with the largest window size is selected. Again, the a priori conditional probability density functions of the dwelling states vary depending on the given activity.

TABLE 4. DEFINITIONS OF THE LOCATION-DWELLING STATES

ID | Description
1  | Dwelling length between 1-10 seconds
2  | Dwelling length between 11-60 seconds
3  | Dwelling length between 61-300 seconds

The last feature in the contextual tuple is the user state. The user states include information related to the user, which can be the mobility context (static, walking, running, driving, …); social states, e.g., with whom the user is currently connected on-line and what kinds of applications he or she is currently using; and psychological states, e.g., excited, nervous, or tired. In this initial study, we consider only the four simple user mobility states defined in Table 5.


TABLE 5. DEFINITIONS OF THE USER STATES

ID | Description
1  | Static, speed < 0.1 m/s
2  | Slow walking, 0.1 m/s < speed < 0.7 m/s (less than one step per second)
3  | Walking, 0.7 m/s < speed < 1.4 m/s, 1-2 steps per second
4  | Fast moving, speed > 1.4 m/s, more than 2 steps per second, or driving

We carried out five test sessions with two different smartphones (a Samsung Galaxy S4 and a Galaxy Note 3) and two testers on three different days. Each session lasted a few hours during working hours, typically from 10:00 AM to 3:00 PM. During each test session, the tester performed all significant activities defined in Table 2 in a random order. The testers labelled the time intervals manually for each significant activity. A record of local time, significant location, and user speed was stored every second. A contextual tuple x = [local_time, significant_location, location_dwelling_state, user_state] was then derived and used to infer the significant activities. Table 6 lists the statistics of each test session, including the session length in seconds, the number of epochs with correct inference, and the success rate. The average success rate of all five sessions is 95.9%. There is no significant difference between the two devices, although the fingerprint database was generated with the Galaxy S4. Table 7 lists the confusion matrix, in which the diagonal elements show the success rate of each significant activity while the off-diagonal elements show the error rates with respect to the other activities.

TABLE 6. STATISTICS OF THE TEST SESSIONS

Session | Device        | Session length (seconds) | Num. of correct inferences | Success rate %
1       | Galaxy S4     | 13007                    | 12666                      | 97.4
2       | Galaxy S4     | 13196                    | 13016                      | 98.6
3       | Galaxy Note 3 | 14146                    | 13840                      | 97.84
4       | Galaxy Note 3 | 16443                    | 15745                      | 95.8
5       | Galaxy S4     | 17866                    | 16093                      | 90.1
Total   |               | 74658                    | 71360                      | 95.9

TABLE 7. THE CONFUSION MATRIX

Activity | 1    | 2    | 3    | 4    | 5    | 6    | 7
1        | 98.4 | 1.1  | 0.0  | 0.0  | 0.0  | 0.0  | 0.4
2        | 6.7  | 93.2 | 0.0  | 0.0  | 0.0  | 0.0  | 0.1
3        | 0.0  | 0.0  | 98.2 | 0.0  | 0.0  | 0.0  | 1.8
4        | 0.0  | 0.0  | 0.0  | 98.1 | 0.0  | 0.0  | 1.9
5        | 0.0  | 0.0  | 0.0  | 0.0  | 99.5 | 0.0  | 0.5
6        | 0.0  | 0.0  | 0.0  | 0.0  | 0.0  | 79.8 | 20.2
7        | 1.4  | 0.9  | 2.3  | 1.1  | 2.4  | 0.6  | 91.3


Note: 1 = working in office, 2 = having a meeting, 3 = having a lunch, 4 = having a coffee break, 5 = visiting library, 6 = waiting for bus, 7 = all non-significant activities, e.g., walking on campus.

The results of all five test sessions show that we can achieve an inference success rate of more than 90%, except in the case of "waiting for bus". The reason for the low success rate in this case is that the WiFi signals were very weak outdoors, so the WiFi-based positioning accuracy at the bus stop was low compared with the other significant locations. In many cases the bus stop was recognized as an "undefined location" because of mismatches between the observed RSSIs and the fingerprints. The situation will improve in the future when GPS positions are taken into account outdoors for determining the significant location.

V. CONCLUSIONS AND FUTURE WORK

This paper presented a framework for inferring significant activities using a Bayesian approach. In addition to locations, the framework takes advantage of information such as local time, location-dwelling states, and user states as input to infer the activities. Therefore, it is a solution beyond the scope of location awareness. The framework was tested in a typical campus environment at Texas A&M University Corpus Christi. We inferred six typical on-campus activities, including working in an office, having a meeting, having a lunch, taking a coffee break, visiting the library, and waiting for a bus. The test results demonstrated that the success rate of activity inference is more than 90%. In this initial study, the a priori conditional probability density functions used in the inference algorithm are taken from empirical models, which are good approximations. To obtain better models, a large data set and smart machine learning algorithms are needed to train them. This will be the topic of our future study.


REFERENCES

[1] Chen, R.: 'Ubiquitous Positioning and Mobile Location-Based Services in Smart Phones' (IGI Global, 2012).
[2] Chen, R., and Guinness, R.: 'Geospatial Computing in Mobile Devices' (Artech House, 2014).
[3] Zandbergen, P.A.: 'Accuracy of iPhone Locations: A Comparison of Assisted GPS, WiFi and Cellular Positioning', Transactions in GIS, 2009, 13, pp. 5-26.
[4] Chen, R., Pei, L., and Chen, Y.: 'A smart phone based PDR solution for indoor navigation', in Proc. ION, 2011, pp. 1404-1408.
[5] Chen, W., Chen, R., Chen, X., Zhang, X., Chen, Y., Wang, J., and Z, F.: 'Comparison of EMG-based and accelerometer-based speed estimation methods in pedestrian dead reckoning', The Journal of Navigation, 2011, 64, pp. 265-280.
[6] Liu, J., Chen, R., Pei, L., Guinness, R., and Kuusniemi, H.: 'A Hybrid Smartphone Indoor Positioning Solution for Mobile LBS', Sensors, 2012, 12, pp. 17208-17233.
[7] Zahidul, M., Kuusniemi, H., Chen, L., Pei, L., Ruotsalainen, L., Guinness, R., and Chen, R.: 'Performance Evaluation of Multi-Sensor Fusion Models in Indoor Navigation', European Journal of Navigation, 2013, 11, (2), pp. 20-28.
[8] Pei, L., Chen, R., Liu, J., Tenhunen, T., Kuusniemi, H., and Chen, Y.: 'Using Bluetooth RSSI Probability Distributions for Indoor Locating', Journal of Global Positioning Systems, 2010, 9, (1), pp. 22-32.
[9] Bahl, P., and Padmanabhan, V.: 'RADAR: An in-building RF-based user location and tracking system', in Proc. IEEE INFOCOM 2000, pp. 775-785.
[10] Pei, L., Liu, J., Guinness, R., Chen, Y., Kuusniemi, H., and Chen, R.: 'Using LS-SVM Based Motion Recognition for Smartphone Indoor Wireless Positioning', Sensors, 2012, 12, pp. 6155-6175.
[11] Susi, M., Renaudin, V., and Lachapelle, G.: 'Motion Mode Recognition and Step Detection Algorithms for Mobile Phone Users', Sensors, 2013, 13, (2), pp. 1539-1562.
[12] Dey, A.: 'Context-aware computing', in Krumm, J. (Ed.): 'Ubiquitous Computing Fundamentals' (Chapman and Hall/CRC, 2009), pp. 321-252.
[13] Guinness, R.: 'Beyond Where to How: A Machine Learning Approach for Sensing Mobility Contexts Using Smartphone Sensors', in Proc. ION GNSS+ 2013, pp. 2868-2879.
[14] Liu, J., Chen, L., Chen, Y., Tang, J., Hyyppä, J., Chen, R., Chu, T., and Hyyppä, H.: 'HMM-based pedestrian location and motion joint estimation toward a smartphone geo-context computing solution', in Proc. IEEE UPINLBS 2014.
[15] Pei, L., Guinness, R., Chen, R., Liu, J., Kuusniemi, H., Chen, Y., Chen, L., and Kaistinen, J.: 'Human Behavior Cognition Using Smartphone Sensors', Sensors, 2013, 13, pp. 1402-1424.
[16] Reddy, S., Mun, M., Burke, J., Estrin, D., Hansen, M., and Srivastava, M.: 'Using mobile phones to determine transportation modes', ACM Transactions on Sensor Networks, 2010, 6, (2), pp. 13:11-13:23.
[17] Bancroft, J., Garrett, D., and Lachapelle, G.: 'Activity and environment classification using foot mounted navigation sensors', in Proc. IPIN 2012.
[18] Jin, G., Lee, S., and Lee, T.: 'Context awareness of human motion states using accelerometer', Journal of Medical Systems, 2008, 32, (2), pp. 93-100.
[19] Ravi, N., Dandekar, N., Mysore, P., and Littman, M.: 'Activity recognition from accelerometer data', in Proc. National Conference on Artificial Intelligence, 20, (3), pp. 1541-1546.

