
Research Article

Energy-efficient recognition of human activity in body sensor networks via compressed classification

International Journal of Distributed Sensor Networks 2016, Vol. 12(12) © The Author(s) 2016 DOI: 10.1177/1550147716679668 ijdsn.sagepub.com

Ling Xiao, Renfa Li, Juan Luo and Zhu Xiao

Abstract
Energy efficiency is an important challenge to the broad deployment of wireless body sensor networks for long-term physical movement monitoring. Inspired by the theories of sparse representation and compressed sensing, we propose SRC-DRP (sparse representation–based classification with distributed random projection), a power-aware compressive classification approach for activity recognition that integrates data compression and classification. Random projection, as a data compression tool, is implemented individually on each sensor node to reduce the amount of data to be transmitted. Compressive classification is then applied directly to the compressed samples received from all nodes. The method was validated on the Wearable Action Recognition Dataset and implemented on embedded nodes for offline and online experiments. The results show that our method reduces energy consumption by approximately 20% while maintaining an activity recognition accuracy of 88% at a compression ratio of 0.5.

Keywords
Activity recognition, sparse representation, compressed sensing, random projection, energy efficiency, body sensor networks

Date received: 20 June 2016; accepted: 18 October 2016 Academic Editor: Joel Rodrigues

College of Computer Science and Electronic Engineering, Hunan University, Changsha, China

Corresponding author: Ling Xiao, Hunan University, Changsha 410082, China. Email: [email protected]

Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (http://www.uk.sagepub.com/aboutus/openaccess.htm).

Introduction
Physical activity monitoring and classification using wearable sensors has become one of the most attractive research areas in recent years owing to a wide range of health-related applications,1 in fields such as disease monitoring and diagnosis,2 child and elderly care,3 and rehabilitation and assisted living.4 A number of tiny wireless sensors, strategically attached to the human body, can constitute a wireless body sensor network (WBSN), which promises inexpensive, continuous, and remote health monitoring of people in their normal living environment. Long-term physical fitness monitoring requires continuous sensing, since physical activity can occur at any time and must be classified from the incoming stream of sensor data. The battery limitations of the WBSN severely restrict the maximum deployment time for continuous monitoring. Therefore, it is necessary to design power-efficient signal-processing approaches on the sensor nodes that minimize energy consumption while offering sufficient classification accuracy and real-time responsiveness. Our work explores energy efficiency not only in classification but also in sensing and communication.

In recent years, the theories of sparse representation and compressed sensing (CS)5 have been gaining attention in the area of signal processing. Sparse signal representation has emerged as a successful tool for analyzing a large class of signals. CS enables the reconstruction of sparse and compressible signals from a small number of non-adaptive linear measurements in the form of random projection (RP). RP6 refers to projecting a set of points from a high-dimensional space onto a randomly chosen low-dimensional subspace. RP has been shown to be a near-optimal measurement scheme within CS.7

In many applications, we are not interested in obtaining a precise signal recovery, but rather in making some kind of detection or classification decision. Inspired by CS, compressive classification (CC) was introduced by Davenport et al.8 CC is designed to classify compressed samples directly, without the need to recover the original signals. CC has several advantages when applied to physical activity recognition in a WBSN. First, dimensionality reduction by RP is data independent, whereas popular dimensionality reduction approaches such as principal component analysis (PCA) and linear discriminant analysis (LDA) are data dependent; as a result, our classification method can easily add a new class of activity or remove existing classes. Furthermore, RP is computationally simple enough to be implemented on the embedded nodes of a WBSN, whereas PCA or LDA can be too expensive to run on the nodes because of the computational burden of eigenvalue decomposition. In addition, dimensionality reduction lowers the computational cost of classification so that it can be performed in real time. RP on the sensors also cuts the communication cost by lowering the amount of sensed data to be transferred from the nodes to a base station, which is a significant source of energy consumption on a sensor. Moreover, if compression and sensing can be implemented in sensor hardware at the same time, as in the single-pixel compressive camera, the power consumed by the sensors for sampling and processing can be reduced further.
Our energy-efficient CC approach is novel in the following two aspects: (1) data compression by RP on resource-constrained WBSN platforms and (2) activity classification by sparse representation and CS operating directly on the compressed samples. To the best of our knowledge, this is the first study that explores classification and data compression together to satisfy the usability requirements of a long-term monitoring environment. The rest of the article is organized as follows. Section "Related works" summarizes related work. The sparse representation classification algorithm is reviewed in section "Background on SRC." Section "Proposed compressed classification" describes the proposed classification approach in detail. The experimental results and analysis are presented in section "Experimental results." Section "Conclusion" concludes the article and gives possible directions for future work.


Related works
WBSNs with multiple inertial sensors are widely used in studies of human body movement. Recent work involves prototyping wearable sensor systems and developing pattern recognition and machine-learning algorithms to model and recognize human activities. As for recognition techniques, a large number of classification methods have been investigated.9 Some studies used simple heuristic classifiers, whereas others employed more generic and automatic methods from the machine-learning literature, including decision trees, nearest neighbor (NN), Bayesian networks, support vector machines (SVMs), artificial neural networks (ANNs), and hidden Markov models (HMMs). Inspired by sparse representation–based classification (SRC) for face recognition,10 Yang et al.11 and Zhang and Sawchuk12 extended SRC to activity recognition in body sensor networks (BSNs). Compared with the work of Yang et al.11 and Zhang and Sawchuk,12 our research uses the theory of sparse representation and CS to combine classification and data compression. Compressed classification operates on data acquired by a compressed sampling technique. This article exploits the ability of compressed classification to conserve the sensor energy spent on communication while preserving activity recognition accuracy.

Most classification methods process and analyze raw signals sampled at the sensor nodes after transmission to a centralized server. As a result, the nodes forward a great deal of data to a base station, which is ill suited to the limited bandwidth and energy resources of a WBSN, particularly because communication generally consumes more energy than local computation. From the energy preservation point of view, it is more beneficial to perform signal processing on individual nodes, such as feature extraction13 and local activity classification.14 Wu and Tseng15 developed a data compression approach via inter-node collaboration and overhearing. Au et al.16 dynamically scheduled sensor-measurement episodes to reduce energy consumption. Ghasemzadeh et al.17 dynamically selected and activated motion sensors to reduce the number of active nodes. However, some of these methods are too complex to be implemented on embedded sensor platforms; this article instead exploits a lightweight signal-processing approach on resource-constrained sensor nodes for energy efficiency.

Background on SRC
One of the most successful applications of sparse representation and CS in pattern recognition is the SRC algorithm for face recognition,10 which uses the whole set of training samples as a redundant dictionary and casts recognition as the problem of discriminatively finding a sparse representation of the test image as a linear combination of training images.

Consider an activity recognition problem with $C$ different classes. Each class $i$ has $n_i$ training samples, $V_i = [v_{i,1}, v_{i,2}, \ldots, v_{i,n_i}] \in \mathbb{R}^{m \times n_i}$, where each $v_{i,j}$ has $m$ attributes. If the test sample $v_{k,\mathrm{test}}$ belongs to the $k$th class, it will approximately lie in the linear span of the training samples associated with the $k$th class

$$v_{k,\mathrm{test}} = \alpha_{k,1} v_{k,1} + \alpha_{k,2} v_{k,2} + \cdots + \alpha_{k,n_k} v_{k,n_k} \tag{1}$$

for some scalars $\alpha_{k,j} \in \mathbb{R}$, $j = 1, 2, \ldots, n_k$. Since the membership $k$ of the test sample is initially unknown, we construct a redundant dictionary $V$ for the entire training set as the concatenation of all $n$ training samples

$$V = [V_1, V_2, \ldots, V_C] = [v_{1,1}, v_{1,2}, \ldots, v_{C,n_C}] \in \mathbb{R}^{m \times n} \tag{2}$$

where $n = n_1 + n_2 + \cdots + n_C$. The linear representation of $v_{k,\mathrm{test}}$ can then be rewritten in terms of all training samples as

$$v_{k,\mathrm{test}} = V \alpha \tag{3}$$

where $\alpha = [0, \ldots, 0, \alpha_{k,1}, \alpha_{k,2}, \ldots, \alpha_{k,n_k}, 0, \ldots, 0]^T$ is a coefficient vector whose entries are zero except those associated with the $k$th class. As the entries of the vector $\alpha$ encode the identity of the test sample $v_{k,\mathrm{test}}$, we attempt to obtain $\alpha$ by solving the linear system of equations (3).

Assume that a set of $J$ wearable sensor nodes is attached to the human body, each consisting of a three-axis accelerometer $(x, y, z)$ and a two-axis gyroscope $(\theta, \rho)$. A sample $s$ of the activity signal at a node contains five measurement values

$$s = [x, y, z, \theta, \rho] \tag{4}$$

and an action segment with a window size of $h$ at node $j$ can be represented as a $5h \times 1$ vector

$$v_j = [s(1), s(2), \ldots, s(h)]^T \in \mathbb{R}^{5h} \tag{5}$$

For our activity classification, we require no feature extraction from the time series of data and directly use the raw measurements of the activity sequence to form the vector $v$.
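As an illustration of equations (4) and (5), the following minimal sketch stacks h five-value samples into one vector of length 5h. The function name `segment_to_vector` and the placeholder readings are assumptions made for this example; h = 40 is the window size used later in the experiments.

```python
def segment_to_vector(samples):
    """Flatten h samples [x, y, z, theta, rho] into one vector of length 5h."""
    vector = []
    for sample in samples:
        if len(sample) != 5:
            raise ValueError("each sample needs 3 accelerometer and 2 gyroscope values")
        vector.extend(sample)
    return vector


h = 40  # window size in samples
# Placeholder readings (hypothetical values, not taken from any dataset).
samples = [[0.0, 0.0, 1.0, 0.0, 0.0] for _ in range(h)]
v_j = segment_to_vector(samples)
```

With h = 40 the resulting vector has 200 entries, and no hand-crafted features are computed at any point.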

Proposed compressed classification
If the SRC method is applied to the raw motion vectors, continuous transmission of the complete sensor data to a base station rapidly depletes the sensor nodes' batteries. Our goal is to minimize the number of bits transmitted while reliably preserving the signal information at minimal implementation cost on the embedded nodes. CS theory implies that the precise choice of the number of features should not be critical for classification problems: a small number of random features contain enough information to preserve the underlying local structure and hence to correctly classify any test sample.

To reduce power consumption, the proposed activity classification framework consists of two steps, as shown in Figure 1: (1) distributed random projection (DRP) is performed individually on each sensor node and (2) a back-end device such as a PC runs the SRC method on the compressed samples received from all nodes. Data compression by RP can be implemented on the nodes of a WBSN to minimize the amount of data transferred wirelessly from the nodes to the base station.

DRP
RP has recently been viewed as a powerful tool for dimensionality reduction. Given an $M \times N$ random matrix $\Phi$ whose columns have unit length, the original $N$-dimensional sampled data $v \in \mathbb{R}^{N \times 1}$ are projected onto an $M$-dimensional ($M \ll N$) vector $\tilde{v} \in \mathbb{R}^{M \times 1}$. The dimensionality reduction process is a simple matrix multiplication, given by

$$\tilde{v}_{M \times 1} = \Phi_{M \times N}\, v_{N \times 1} \tag{6}$$

Ideally, we wish to ensure that $\Phi$ approximately preserves the pairwise distances between all pairs of signals. That is to say, for any two vectors $v_1$ and $v_2$, the distance between them is approximately preserved

$$(1 - \gamma) \le \frac{\|\Phi v_1 - \Phi v_2\|_2}{\|v_1 - v_2\|_2} \le (1 + \gamma) \tag{7}$$

for small $\gamma > 0$. One of the important results from CS theory discussed in Lu and Do18 is the restricted isometry property (RIP), which states that equation (7) is indeed satisfied with overwhelming probability by certain random matrices. It is well known that many types of random measurement matrices satisfy the RIP. Perhaps the most prominent are $M \times N$ random matrices $\Phi$ whose entries $\phi_{m,n}$ are independent and identically distributed and drawn from (1) a Gaussian distribution $\phi_{m,n} \sim \mathcal{N}(0, 1)$, (2) a Bernoulli distribution $P(\phi_{m,n} = \pm 1) = 1/2$, or (3) a sparse binary distribution such as

$$\phi_{m,n} = \begin{cases} +1, & \text{with probability } 1/6 \\ -1, & \text{with probability } 1/6 \\ 0, & \text{with probability } 2/3 \end{cases} \tag{8}$$

These three types of random matrices have been proven to satisfy the RIP with very high probability.19 In this article, we consider a special sparse binary matrix in which each column contains exactly $d$ ones ($d \ll N$) and all other entries are equal to zero. It has been shown20 that such matrices can satisfy a weaker form of the RIP.

Compared with RP, traditional dimensionality reduction methods such as PCA or LDA offer no guarantee that the distances between signals are well preserved after projection. Additionally, RP is computationally very simple: projecting $N$-dimensional data to $M$ dimensions with the random matrix $\Phi$ is just a matrix multiplication of complexity $O(MN)$. Given these computational advantages, we argue that RP is especially suitable for our purpose.

When several sensor nodes are attached to various locations of the body for activity recognition, RP is performed individually, in a distributed fashion, on each sensor node. On each node $j$, the original sampled vector $v_j \in \mathbb{R}^{N}$ is projected to a low-dimensional vector $\tilde{v}_j \in \mathbb{R}^{M}$ by the RP matrix $\Phi_j$

$$\tilde{v}_j = \Phi_j v_j \tag{9}$$

Each sensor sends the projection vector $\tilde{v}_j$ to the base station.

Figure 1. Activity recognition system architecture. Sensor placement on subject's body (WS: waist; RW: right wrist; LW: left wrist; RL: right leg; LL: left leg).
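The per-node projection can be sketched as follows for the special sparse binary matrix with exactly d ones per column. This is only an illustrative sketch: the function names, the seed, and the test vector are assumptions, and the matrix is left unscaled (dividing by the square root of d would give unit-length columns).

```python
import random


def special_sparse_binary_matrix(m, n, d, seed=None):
    """Return an m x n 0/1 matrix (list of rows) with exactly d ones per column."""
    if not 0 < d <= m:
        raise ValueError("d must satisfy 0 < d <= m")
    rng = random.Random(seed)
    phi = [[0] * n for _ in range(m)]
    for col in range(n):
        # Choose d distinct rows for the ones of this column.
        for row in rng.sample(range(m), d):
            phi[row][col] = 1
    return phi


def project(phi, v):
    """Compute phi @ v; with 0/1 entries this needs additions only."""
    return [sum(x for bit, x in zip(row, v) if bit) for row in phi]


# Hypothetical example: N = 200, M = 100, d = 10.
phi_j = special_sparse_binary_matrix(m=100, n=200, d=10, seed=0)
v_j = [float(i % 7) for i in range(200)]  # stand-in for one action segment
v_tilde_j = project(phi_j, v_j)
```

Because every entry of such a matrix is 0 or 1, the projection reduces to summing selected samples, which is one reason a matrix of this kind is attractive on a microcontroller.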

SRC with DRP
The base station collects the RP vectors $\tilde{v}_j$ from the $J$ sensors. For each sensor $j$, we construct a new redundant dictionary matrix $\tilde{V}_j$

$$\tilde{V}_j = \Phi_j V_j \tag{10}$$

For each sensor's data, the sparse representation $\alpha_j$ of $\tilde{v}_{k,\mathrm{test}}$ satisfies the following linear system

$$\tilde{v}_{k,\mathrm{test}} = \tilde{V}_j \alpha_j + e \tag{11}$$

where $\alpha_j$ is the coefficient vector of sensor $j$ and $e$ is an approximation error term modeling measurement errors. After projection, the dimension $M$ is typically much smaller than the number $n$ of all training samples. Therefore, the linear system (11) is underdetermined, and the desired $\alpha_j$ is sought as the solution to the following optimization problem

$$\hat{\alpha}_j = \arg\min \|\alpha_j\|_1 \quad \text{subject to} \quad \|\tilde{v}_{k,\mathrm{test}} - \tilde{V}_j \alpha_j\|_2 \le \varepsilon \tag{12}$$

After the sparsest representation $\hat{\alpha}_j$ is recovered, we project the coefficients onto each class subspace. Subsequently, the test sample $v_{k,\mathrm{test}}$ is assigned to the class with the smallest residual

$$r_i(v_{k,\mathrm{test}}) = \sum_{j=1}^{J} \left\| \tilde{v}_{k,\mathrm{test}} - \tilde{V}_j\, \delta_i(\hat{\alpha}_j) \right\|_2, \quad i = 1, 2, \ldots, C \tag{13}$$

where $\delta_i(\hat{\alpha}_j)$ keeps the entries of $\hat{\alpha}_j$ associated with class $i$ and sets all other entries to zero, that is, it applies the indicator pattern $[0, \ldots, 0, \underbrace{1, \ldots, 1}_{i}, 0, \ldots, 0]$ to $\hat{\alpha}_j$.
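The class decision based on the summed residuals can be sketched as below. The sketch assumes that the sparse coefficient vectors have already been obtained by some l1 solver and that the dictionary columns of every sensor follow the same class ordering; the function names and the toy numbers are hypothetical.

```python
import math


def class_residuals(v_tilde, dictionaries, coefficients, labels, classes):
    """Return {class: sum over sensors of ||v_j - V_j * delta_i(alpha_j)||_2}.

    v_tilde:      one compressed test vector per sensor
    dictionaries: one compressed dictionary per sensor, stored as a list of rows
    coefficients: one recovered coefficient vector per sensor
    labels:       class label of each dictionary column
    """
    residuals = {}
    for c in classes:
        total = 0.0
        for v, V, alpha in zip(v_tilde, dictionaries, coefficients):
            # delta_i: keep only the coefficients that belong to class c.
            masked = [a if lab == c else 0.0 for a, lab in zip(alpha, labels)]
            approx = [sum(w * a for w, a in zip(row, masked)) for row in V]
            total += math.sqrt(sum((x - y) ** 2 for x, y in zip(v, approx)))
        residuals[c] = total
    return residuals


def classify(v_tilde, dictionaries, coefficients, labels, classes):
    """Pick the class with the smallest summed residual."""
    r = class_residuals(v_tilde, dictionaries, coefficients, labels, classes)
    return min(r, key=r.get)


# Toy example: two sensors, two dictionary atoms (one per class).
labels = ["walk", "sit"]
V = [[1.0, 0.0], [0.0, 1.0]]
v_tilde = [[0.9, 0.1], [0.9, 0.1]]
alpha_hat = [[0.9, 0.1], [0.9, 0.1]]
res = class_residuals(v_tilde, [V, V], alpha_hat, labels, labels)
predicted = classify(v_tilde, [V, V], alpha_hat, labels, labels)
```

Summing the residuals over sensors lets every node contribute to a single decision instead of voting separately.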

The algorithm below summarizes the complete recognition procedure of sparse classification with RP.

Experimental results
WARD dataset
In this work, we used a publicly available dataset, the wearable action recognition database (WARD), provided by Yang et al.21 of the University of California, Berkeley. The data were recorded by five sensor nodes, each containing a three-axis accelerometer and a two-axis gyroscope, attached to the body: one on the waist, two on the wrists, and two on the ankles. The dataset consists of data recorded from 20 participants of different gender, age, height, and weight for 13 action categories: (1) stand (ST), (2) sit (SI), (3) lie down (LI), (4) walk forward (WF), (5) walk left-circle (WL), (6) walk right-circle (WR), (7) turn left (TL), (8) turn right (TR), (9) go upstairs (UP), (10) go downstairs (DO), (11) jog (JO), (12) jump (JU), and (13) push wheelchair (PU). Each participant performed five trials of each action. In total, there are 1300 (20 × 13 × 5) activity sequences.

Offline experiment
We first processed the WARD dataset offline in MATLAB. The SPGL1 (Spectral Projected Gradient for L1 minimization) toolbox22 was used to solve the sparse recovery problem in equation (12).

Experiment design. Only minimal preprocessing, median filtering with a five-point moving average, was applied to the raw sampled data in order to remove abnormal noise spikes produced by the sensors. The training set and the test set were designed as follows. From each motion sequence in the WARD database, we randomly selected a window of 40 samples ($h = 40$), without overlapping, which corresponds to 2 s at the 20 Hz sampling rate. The dimension of an activity vector $v$ of a sensor is therefore 200 (2 s × 20 Hz × 5 values). In total, there were 1300 training examples. A 20-fold leave-one-subject-out validation was performed to obtain subject-independent classification results: in each fold, all the examples from one subject were withheld for testing and the examples from the remaining subjects were used for training. In our offline experiment, we considered four types of random matrices: Gaussian, Bernoulli, very sparse, and special sparse binary. For each kind of random matrix, the validation process was repeated 10 times, with a new group of RP matrices generated randomly each time. Each group consists of five random matrices $\Phi_j$, one for each sensor node.
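The leave-one-subject-out protocol can be sketched as follows. The index layout (20 subjects, each contributing 13 × 5 = 65 examples in order) and the function name are assumptions chosen to mirror the dataset description, not the authors' MATLAB code.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) once per subject."""
    for held_out in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test


# Assumed layout: 20 subjects x 13 actions x 5 trials = 1300 examples.
subject_ids = [s for s in range(20) for _ in range(13 * 5)]
folds = list(leave_one_subject_out(subject_ids))
```

Each of the 20 folds then tests on the 65 examples of one subject and trains on the remaining 1235, so no subject appears on both sides of a split.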

Algorithm. Sparse representation classification after distributed random projection (SRC-DRP).
1: Input: for each sensor node $j$, a matrix of stacked training samples $V_j = [v_{1,1}, v_{1,2}, \ldots, v_{C,n_C}]$, a test sample $v_{k,\mathrm{test}}$, a random projection matrix $\Phi_j$, and an optional error tolerance $\varepsilon > 0$
2: Projection: $\tilde{v}_{k,\mathrm{test}} = \Phi_j v_{k,\mathrm{test}}$, $\tilde{V}_j = \Phi_j V_j$
3: Normalize the columns of $\tilde{V}_j$ and $\tilde{v}_{k,\mathrm{test}}$.
4: Solve the problem: $\hat{\alpha}_j = \arg\min \|\alpha_j\|_1$ subject to $\|\tilde{v}_{k,\mathrm{test}} - \tilde{V}_j \alpha_j\|_2 \le \varepsilon$
5: Compute the residuals $r_i(\tilde{v}_{k,\mathrm{test}}) = \sum_{j=1}^{J} \|\tilde{v}_{k,\mathrm{test}} - \tilde{V}_j\, \delta_i(\hat{\alpha}_j)\|_2$, $i = 1, 2, \ldots, C$
6: Output: $\mathrm{label}(v_{k,\mathrm{test}}) = \arg\min_i r_i(\tilde{v}_{k,\mathrm{test}})$
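Step 3 of the algorithm, the normalization of the dictionary columns, might look like the sketch below. It assumes the dictionary is stored as a list of rows and that unit Euclidean norm is intended; the function name is hypothetical, and a test vector can be handled the same way by treating it as a single-column matrix.

```python
import math


def normalize_columns(matrix):
    """Scale every column of a row-major matrix to unit Euclidean norm."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[c] ** 2 for row in matrix)) for c in range(n_cols)]
    # All-zero columns are left as zeros to avoid dividing by zero.
    return [
        [row[c] / norms[c] if norms[c] else 0.0 for c in range(n_cols)]
        for row in matrix
    ]


V_tilde = [[3.0, 0.0], [4.0, 2.0]]  # toy 2 x 2 dictionary
V_unit = normalize_columns(V_tilde)
```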

Activity recognition performance. To investigate the robustness of the proposed SRC-DRP algorithm to dimensionality reduction by RP, we selected five compression ratios (CRs): 0, 0.3, 0.5, 0.7, and 0.9. CR is defined as $(N - M)/N$. Table 1 gives the mean and standard deviation of the activity-recognition accuracies of the SRC-DRP approach under four types of RP matrices at various CRs; the first column indicates the CR. First, we obtained an activity-recognition accuracy of 90.23% when SRC-DRP was applied to the raw data with no compression (CR = 0). From Table 1, we performed a one-way analysis of variance on the mean recognition accuracies and standard deviations of the four types of matrices at each of the four CRs (0.3, 0.5, 0.7, and 0.9). For example, at a CR of 0.5, the p-value equals 0.902, which is larger than 0.05, so the difference among the matrices is not statistically significant. Similar results were obtained for the other three CRs. We also compared the recognition performance of RP with that of a conventional dimensionality reduction method, PCA; the results are given in the last column of Table 1. At the same CR, RP yields activity-recognition accuracies comparable to those of PCA under all four types of random matrices, while being computationally much less expensive. Because the recognition accuracies under the four kinds of random matrices do not differ significantly at the same CR, the special sparse binary matrix is the most suitable for implementation on the sensor platforms: it has a much lower computation cost with little loss in recognition accuracy. We used a very sparse RP matrix on the wireless wearable sensor platforms in our later experiments.
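Since CR is defined as (N - M)/N, the projected dimension M follows directly from N and the chosen CR. The short sketch below works this out for N = 200; the function name is an assumption, and rounding to the nearest integer is a choice made here to absorb floating-point error.

```python
def projected_dimension(n_original, compression_ratio):
    """Return M such that CR = (N - M) / N, rounded to the nearest integer."""
    if not 0.0 <= compression_ratio < 1.0:
        raise ValueError("compression ratio must lie in [0, 1)")
    return round(n_original * (1.0 - compression_ratio))


N = 200  # dimension of one raw action segment
dims = {cr: projected_dimension(N, cr) for cr in (0.0, 0.3, 0.5, 0.7, 0.9)}
```

For the five CRs considered, this gives M = 200, 140, 100, 60, and 20, respectively.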
We further assessed the impact of varying data dimension on the computational costs of our method.



Table 1. Activity-recognition accuracies of the SRC-DRP approach. Activity-recognition accuracies (%) are given as mean ± standard deviation.

| Compression ratio | Gaussian | Bernoulli | Sparse binary | Special sparse binary (d = 10) | PCA |
|---|---|---|---|---|---|
| 0.3 | 90.57 ± 1.34 | 90.47 ± 1.42 | 90.46 ± 1.47 | 89.51 ± 1.53 | 91 |
| 0.5 | 90.36 ± 1.64 | 90.35 ± 1.49 | 90.32 ± 1.74 | 89.49 ± 1.77 | 90.46 |
| 0.7 | 90.02 ± 1.72 | 90.29 ± 1.67 | 90.26 ± 1.59 | 89.36 ± 1.93 | 90.31 |
| 0.9 | 88.26 ± 2.29 | 88.22 ± 2.52 | 88.28 ± 2.29 | 87.31 ± 2.61 | 89.08 |

SRC: sparse representation–based classification; DRP: distributed random projection; PCA: principal component analysis.

Figure 2. Average accuracies and processing time of SRC-DRP under sparse binary random matrix.

We recorded the total time for classifying all the test samples and divided it by the number of samples to obtain the average computation time per test sample. The PC we used has an Intel Pentium Dual 2.16 GHz CPU and 952 MB of RAM. The results confirm that dimensionality reduction also reduces the computational cost. Figure 2 shows the computation time for classifying one test sample and the average recognition accuracy of SRC-DRP under very sparse RP matrices at various CRs.
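The timing procedure described above (total classification time divided by the number of test samples) can be sketched as follows. The helper name and the stand-in classifier are assumptions; any classification routine could be passed in its place.

```python
import time


def average_time_per_sample(classify_fn, test_samples):
    """Return the mean wall-clock time in seconds spent classifying one sample."""
    if not test_samples:
        raise ValueError("need at least one test sample")
    start = time.perf_counter()
    for sample in test_samples:
        classify_fn(sample)
    return (time.perf_counter() - start) / len(test_samples)


# Stand-in workload: a dummy "classifier" applied to 50 vectors of dimension 200.
dummy_samples = [[1.0] * 200 for _ in range(50)]
avg_seconds = average_time_per_sample(lambda s: sum(s), dummy_samples)
```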

Online experiment
To characterize the real-time performance and the energy consumption of the SRC-DRP method, online experiments were performed on real sensor nodes.

Sensor node platform. The sensor node includes a TI MSP430 microcontroller with 10 KB RAM, a Chipcon CC2420 IEEE 802.15.4 compliant radio, and two AA batteries. The sensor board incorporates a three-axis accelerometer and a two-axis gyroscope; each axis is reported to the node as a 12-bit value. Sensor nodes communicate with a base station attached to a PC through a USB port. As in the WARD dataset, we used five sensors.

Experiment design. We chose the special sparse binary projection matrices ($d = 10$) for RP on the embedded platform and implemented RP onboard the node with embedded intelligence. Since our aim is to study the feasibility of RP on embedded node platforms, instead of acquiring data directly from the inertial sensors, we used the serial port to input the WARD database records sampled at 20 Hz and to output the inertial data. RP was performed on each sensor node every 2 s, compressing each action segment of 200 dimensions (five-axis data over 2 s at a sampling rate of 20 Hz). The compressed data were then transmitted over the node's IEEE 802.15.4-compliant radio to a base station. Activity recognition on the compressed samples was performed on a PC.

Activity-recognition performance. To further validate the performance of the SRC-DRP method, four common classifiers were chosen for comparison: NN, nearest subspace (NS), Bayesian networks, and SVM. The subject-independent average recognition accuracies of SRC-DRP and the four classifiers are given in Table 2. For CRs of 0.3, 0.5, 0.7, and 0.9, we performed t-tests on the mean recognition accuracies and standard deviations, comparing SRC-DRP with each of the other classifiers. For example, at a CR of 0.5, the p-value between SRC-DRP and NS equals 0.0001 (p < 0.05), indicating that their difference is extremely statistically significant. Similar results were obtained for the other three CRs. Table 2 shows that, at the same CR, the SRC-DRP classifier achieves a higher average recognition rate than the other classifiers.

Table 2. Activity recognition accuracies of the four approaches. Activity recognition accuracies (%) are given as mean ± standard deviation.

| Compression ratio | SRC-DRP | NN | NS | Bayesian network | SVM |
|---|---|---|---|---|---|
| 0 | 90.23 | 81.01 | 83.23 | 87.82 | 87.91 |
| 0.3 | 89.46 ± 1.54 | 79.38 ± 1.89 | 82.85 ± 1.66 | 83.22 ± 1.66 | 83.29 ± 1.65 |
| 0.5 | 89.02 ± 1.66 | 79.27 ± 2.17 | 82.20 ± 1.83 | 83.57 ± 1.69 | 83.02 ± 1.83 |
| 0.7 | 88.06 ± 1.71 | 78.51 ± 2.58 | 81.66 ± 2.13 | 82.05 ± 1.76 | 82.51 ± 1.58 |
| 0.9 | 86.77 ± 2.38 | 76.40 ± 3.61 | 78.75 ± 5.29 | 77.87 ± 1.83 | 80.15 ± 1.63 |

SRC: sparse representation–based classification; DRP: distributed random projection; NN: nearest neighbor; NS: nearest subspace; SVM: support vector machine.

Energy evaluation. To measure the energy consumption of a node, we obtained real-time data via the inertial sensors. The total energy consumption of a sensor node comes from three main processes: (1) sensor sampling, (2) radio transmission, and (3) data processing. The power consumed by sensor sampling depends on the types of the sensors and the sampling frequency. In addition, the energy consumption for storing data in the local flash memory is relatively low. The energy consumed by data processing depends on the complexity of the code executed on the CPU. To evaluate the energy consumption of each operation on the sensor node, a 10.1-Ω resistor was placed in the power path of the platform. The voltage was measured with an oscilloscope, and the corresponding current and energy consumption were calculated. The sampling rates of both the accelerometer and the gyroscope were set to 20 Hz. The raw sensor data were logged to flash. Figure 3 presents the power consumption of a node within an operation period of 5 s, with the CR set to 0.5. Figure 3 shows that gyroscope sampling and the continuous transmission of raw accelerometer data draw the most energy. Additionally, performing RP on the sensor nodes is relatively inexpensive, and the energy saved through reduced flash logging and data transmission more than compensates for the energy cost of the additional RP processing. Compared with the case without compression, transmitting the RP-compressed data every 2 s reduced the energy consumption of transmission by 47%.

Figure 3. Energy profile of the sensor platform for each operation.

Table 3 compares the code execution time, the energy consumption, and the resulting node lifetime (calculated for two 3000 mAh batteries at 3.7 V) of the proposed approach for three settings: no compression, CR = 0.5, and CR = 0.7. On each node, the embedded RP code that compresses an action sequence of 200 dimensions executes in 26 ms (CR = 0.5) and 28 ms (CR = 0.7). The results show that data compression by RP reduces energy consumption by 18.7% and 26.1% and extends node lifetime by 23.2% and 35.6% for CRs of 0.5 and 0.7, respectively, compared with no compression. We also found that the gyroscope sampling process consumes a large amount of energy. Since the energy consumption of sampling depends on the hardware implementation and the sampling frequency, realizing compressive sampling in sensor hardware could reduce the energy consumption significantly.

Table 3. Node lifetime with different compression ratios.

| | No comp. | CR = 0.5 | CR = 0.7 |
|---|---|---|---|
| Code execution time (ms) | 0 | 26 | 28 |
| Energy consumption (mJ) | 99.8 | 81.1 | 73.8 |
| Lifetime (h) (2 × 3000 mAh at 3.7 V) | 405 | 499 | 549 |
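The relative savings quoted for Table 3 can be rechecked with a few lines of arithmetic. The sketch below only recomputes percentages from the tabulated energy and lifetime values; the dictionary layout and function names are assumptions made for this example.

```python
# Values copied from Table 3 (energy in mJ, lifetime in hours).
TABLE3 = {
    "none": {"energy_mj": 99.8, "lifetime_h": 405},
    "cr_0.5": {"energy_mj": 81.1, "lifetime_h": 499},
    "cr_0.7": {"energy_mj": 73.8, "lifetime_h": 549},
}


def percent_change(baseline, value):
    """Signed change of value relative to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def summarize(table, baseline_key="none"):
    """Energy reduction and lifetime extension (in %) relative to no compression."""
    base = table[baseline_key]
    summary = {}
    for key, row in table.items():
        if key == baseline_key:
            continue
        summary[key] = {
            "energy_reduction_pct": round(
                -percent_change(base["energy_mj"], row["energy_mj"]), 1
            ),
            "lifetime_extension_pct": round(
                percent_change(base["lifetime_h"], row["lifetime_h"]), 1
            ),
        }
    return summary


summary = summarize(TABLE3)
```

The computed values are 18.7% and 23.2% for CR = 0.5, and 26.1% and 35.6% for CR = 0.7.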

Conclusion
The aim of this article is to exploit the effectiveness of compressive classifiers in action recognition. To our knowledge, this is the first study that explores classification and data compression together in order to maximize the deployment time of a BSN for continuous monitoring while maintaining sufficient classification accuracy. An energy-efficient compressed classification approach for human activity recognition is proposed. Our approach uses RP to reduce the dimension of the sampled data directly on the sensor node instead of transmitting raw sampled data to the base station. Recognition performance was validated on the WARD database offline and on real nodes online. The recognition accuracy of our method decreases only slightly under RP. We also measured the energy consumption of a node using RP. The results show that lightweight RP data compression consumes only a little code execution time and energy while reducing the amount of data to transmit, thereby extending the limited node lifetime compared with no data compression.

Our ongoing work will be carried out both theoretically and practically. We will seek the analytical relationship between the RIP constants and the standard deviation of the results, following theoretical developments in CS. This will give a solid foundation for determining the number of projections required to obtain robust activity recognition results. We will also test the proposed algorithms in a real-time experiment with a large group of people.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (61300219).

References
1. Bulling A, Blanke U and Schiele B. A tutorial on human activity recognition using body-worn inertial sensors. ACM Comput Surv 2014; 46(3): 1–33.
2. Rigas G, Tzallas AT, Tsipouras MG, et al. Assessment of tremor activity in the Parkinson's disease using a set of wearable sensors. IEEE T Inf Technol B 2012; 16(3): 478–487.
3. Nam YY and Park JW. Child activity recognition based on cooperative fusion model of a triaxial accelerometer and a barometric pressure sensor. IEEE J Biomed Health Inf 2013; 17(2): 420–426.
4. Grunerbl A, Muaremi A, Osmani V, et al. Smartphone-based recognition of state and state changes in bipolar disorder patients. IEEE J Biomed Health Inf 2015; 19(2): 140–148.
5. Donoho D. Compressed sensing. IEEE T Inform Theory 2006; 52(4): 1289–1306.
6. Dasgupta S. Experiments with random projections. In: Proceedings of the 16th conference uncertainty in artificial intelligence, San Francisco, CA, 30 June–3 July 2000, pp.143–151. New York: ACM.
7. Candès E and Tao T. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE T Inform Theory 2006; 52(12): 5406–5425.
8. Davenport M, Duarte M, Wakin M, et al. The smashed filter for compressive classification and target recognition. In: Proceedings of the SPIE symposium on electronic imaging: computational imaging, San Jose, CA, 16–20 January 2007, pp.326–330.
9. Preece SJ, Goulermas JY, Kenney LP, et al. Activity identification using body-mounted sensors – a review of classification techniques. Physiol Meas 2009; 30(4): R1–R33.
10. Wright J, Yang A, Ganesh A, et al. Robust face recognition via sparse representation. IEEE T Pattern Anal 2009; 31(2): 210–227.
11. Yang A, Jafari R, Shankar S, et al. Distributed recognition of human actions using wearable motion sensor networks. J Ambient Intell Smart Environ 2009; 1: 103–115.
12. Zhang M and Sawchuk AA. Human daily activity recognition with sparse representation using wearable sensors. IEEE J Biomed Health Inf 2013; 17(3): 553–560.
13. Ghasemzadeh H and Jafari R. Physical movement monitoring using body sensor networks: a phonological approach to construct spatial decision trees. IEEE T Industr Inform 2011; 7(1): 66–77.
14. Ghasemzadeh H, Loseu V and Jafari R. Structural action recognition in body sensor networks: distributed classification based on string matching. IEEE T Inf Technol Biomed 2010; 14(2): 425–435.
15. Wu CH and Tseng YC. Data compression by temporal and spatial correlations in a body-area sensor network: a case study in pilates motion recognition. IEEE T Mobile Comput 2011; 10(10): 1459–1472.
16. Au L, Bui A, Batalin M, et al. CARER: efficient dynamic sensing for continuous activity monitoring. In: Proceedings of the 33rd annual international conference of the IEEE engineering in medicine and biology society, Boston, MA, 30 August–3 September 2011. New York: IEEE.
17. Ghasemzadeh H, Guenterberg E and Jafari R. Energy-efficient information driven coverage for physical movement monitoring in body sensor networks. IEEE J Sel Area Comm 2009; 27(1): 58–69.
18. Lu Y and Do M. Sampling signals from a union of subspaces. IEEE Signal Proc Mag 2008; 25(2): 41–47.
19. Achlioptas D. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J Comput Syst Sci 2003; 66: 671–678.
20. Baraniuk R, Davenport M, Devore R, et al. A simple proof of the restricted isometry property for random matrices. Constr Approx 2008; 28(3): 253–263.
21. Yang A, Kuryloski P and Bajcsy R. WARD: a wearable action recognition database, 2009, http://www.eecs.berkeley.edu/~yang/software/WAR/
22. Berg E and Friedlander MP. SPGL1: a solver for large-scale sparse reconstruction, 2007, http://www.cs.ubc.ca/labs/scl/spgl1