Enhanced Distributed Dynamic Skyline Query for Wireless ... - MDPI

4 downloads 476 Views 8MB Size Report
Feb 3, 2016 - distance routing; uniform distribution; lower bound distribution; and upper .... In DBDCS, each application uses a disc track and sector analogy to map data ... Each sensor node uses a greedy algorithm to compute a local skyline certificate, ..... Sensor MAC (SMAC) [32] and Sector Based Distance Routing.
Journal of

Sensor and Actuator Networks Article

Enhanced Distributed Dynamic Skyline Query for Wireless Sensor Networks Khandakar Ahmed 1,2,3, *, Nazmus S. Nafi 4,† and Mark A. Gregory 4,† 1 2 3 4

* †

School of Electrical and Computer Engineering, RMIT University, Melbourne, VIC 3000, Australia Melbourne Institute of Technology, Melbourne VIC 3000, Australia Department of Computer Science & Engineering, Shahjalal University of Science & Technology, Sylhet-3114, Bangladesh School of Electrical and Computer Engineering, RMIT University, Melbourne, VIC 3000, Australia; [email protected] (N.S.N.); [email protected] (M.A.G.) Correspondence: [email protected] or [email protected] or [email protected]; Tel.: +61-3-9925-1180 (ext. 51180) These authors contributed equally to this work.

Academic Editor: Stefan Fischer Received: 16 November 2015; Accepted: 28 January 2016; Published: 3 February 2016

Abstract: Dynamic skyline query is one of the most popular and significant variants of skyline query in the field of multi-criteria decision-making. However, designing a distributed dynamic skyline query possesses greater challenge, especially for the distributed data centric storage within wireless sensor networks (WSNs). In this paper, a novel Enhanced Distributed Dynamic Skyline (EDDS) approach is proposed and implemented in Disk Based Data Centric Storage (DBDCS) architecture. DBDCS is an adaptation of magnetic disk storage platter consisting tracks and sectors. In DBDCS, the disc track and sector analogy is used to map data locations. A distance based indexing method is used for storing and querying multi-dimensional similar data. EDDS applies a threshold based hierarchical approach, which uses temporal correlation among sectors and sector segments to calculate a dynamic skyline. The efficiency and effectiveness of EDDS has been evaluated in terms of latency, energy consumption and accuracy through a simulation model developed in Castalia. Keywords: wireless sensor networks; distributed data centric storage; skyline query; sector based distance routing; uniform distribution; lower bound distribution; and upper bound distribution

1. Introduction WSNs have successfully been validated as a very economic and effective platform for monitoring diverse physical environments from remote locations. Various types of query techniques have been developed over WSNs including min-max [1], top-k [2,3] and skyline [4]. Skyline and its variants such as traditional skyline [5] and dynamic skyline [6–8] have been applied in many multiple criteria decision making applications. Traditional skyline (TS) retrieves all of the points, which are not dominated by others, from a set of points [9]. Given a dataset X, a point x1 dominates x2 , if x1 is not worse than x2 for each dimension i P l (i.e., x1 [i] ď x2 [i]), and x1 is better than x2 for at least one dimension m P l (x1 [m] < x2 [m]). Figure 1a shows an example, where each point is drawn taking two dimensions i and j as coordinators; points x1 , x3 , x6 and x10 are in the skyline set considering that a point with the least value for each dimension is desirable. Dynamic skyline (DS) retrieves a set of points that are not dynamically dominated by others with respect to a data point q denoted by DS (q, X)) [10]. A point x1 dynamically dominates x2 with respect to q if for each dimension in i P l, |x1 ris ´ q ris| ď |x2 ris ´ q ris| and for at least one dimension in m P l,

J. Sens. Actuator Netw. 2016, 5, 2; doi:10.3390/jsan5010002

www.mdpi.com/journal/jsan

J. Sens. Actuator Netw. 2016, 5, 2

2 of 22

J.J. Sens. Sens. Actuator Actuator Netw. Netw. 2016, 2016, 5, 5, 22

22 of of 21 21

|x1 rms ´ q rms| ă |x2 rms ´ q rms|. In Figure 1b, points x1 , x2 and x4 are the dynamic skyline points of q. xx11m m qqm m  xx2 m m qqm m .. In In Figure Figure 1b, 1b, points points xx11,, xx22 and and xx44 are are the the dynamic dynamic skyline skyline points points of of q. q. Each Each Each point xi = 2(xi [j], xi [i]) is transformed to xi 1 = (xi [j]´q[j], xi [i]´q[i]). point point xxii == (x (xii[j], [j], xxii[i]) [i]) is is transformed transformed to to xxii′′ == (x (xii[j]-q[j], [j]-q[j], xxii[i]-q[i]). [i]-q[i]).

xx33

xx77

xx22 xx44

xx66

xx66

xx99 xx11

XX4'4' xx77 XX1'1' XX4X 6' 4 X X22''X6'

xx33

xx88 i

i

xx55

xx55

qq

xx22

xx88

xx99 xx1010

xx11

xx1010 jj

jj

(a) (a)

(b) (b) Figure 1. 1. (a) (a)TS TSand and (b) (b) DS DS (q, (q, X). X). Figure Figure 1. (a) TS and (b) DS (q, X).

Skyline Skyline query is very useful in multi-preference analysis and decision making for environmental Skyline query query is isvery veryuseful usefulin inmulti-preference multi-preference analysis analysis and anddecision decisionmaking makingfor forenvironmental environmental monitoring applications. For example, in a WSN deployed in a forest with each sensor monitoring applications. For example, in a WSN deployed in a forest with each sensor node sensing monitoring applications. For example, in a WSN deployed in a forest with each sensor node node sensing sensing temperature and wind speed, it is possible to build a fast and energy efficient forest fire detection temperature and wind speed, it is possible to build a fast and energy efficient forest fire detection temperature and wind speed, it is possible to build a fast and energy efficient forest fire detection method [11] by concentrating places with temperature fast wind speed, To method [11]by byconcentrating concentrating on places with high temperature orwind fastspeed, wind or speed, or both. To method [11] onon places with highhigh temperature or fastor both.or Toboth. illustrate illustrate the idea of dominance relations in this scenario, consider the readings in the twoillustrate the idea of dominance relations in this scenario, consider the readings in the twothe idea of dominance relations in this scenario, consider the readings in the two-dimensional attribute dimensional space shown depicting typical of ranking objects by dimensional attribute space shown in in Figure 22example depicting typical example example of more ranking objects by more more space shownattribute in Figure 2 depicting a Figure typical ofaaranking objects by than one criterion. than one criterion. Figure 2a nine records received corresponding sensor than one criterion. Figurereceived 2a lists listsfrom ninenine records received from from nine corresponding sensor nodes nodes Figure 2a lists nine records corresponding sensornine nodes deployed in different places deployed in places and values. 2b the representation 2D deployed in different different places and their their values. Figure Figure in 2badepicts depicts thePlaces representation in 2D space. space. and their values. Figure 2b depicts the representation 2D space. P1 , P2 , P3 ,in P5aa, and P6 are Places P 1 , P 2 , P 3 , P 5 , and P 6 are all dominated by other points so the skyline query returns the points Places P1, P2, Pby 3, Pother 5, andpoints P6 are so all the dominated by other points the skyline query returns the by points all dominated skyline query returns thesopoints that are not dominated any that are not dominated by any other points. Consider the point P 55 that is dominated by P 77,, as it has that are not dominated by any other points. Consider the point P that is dominated by P as it has other points. Consider the point P5 that is dominated by P7 , as it has a higher wind speed than Paa5 higher than PP55though both the same The only higher wind speed thansame though both have have theskyline same temperature. temperature. The skyline skyline query only retrieves retrieves thoughwind bothspeed have the temperature. The query only retrieves thequery dominant places in the dominant in of higher temperature and wind speed. The query will not the dominant places in terms terms ofwind higher temperature andquery windwill speed. The skyline skyline query will not terms of higherplaces temperature and speed. The skyline not exhibit any place with lower exhibit any and once by location exhibit any place place with lower lower temperature and wind windbyspeed speed once it it is is dominated dominated by aaTherefore, location with with temperature and with wind speedtemperature once it is dominated a location with higher values. the higher values. Therefore, the result set of the skyline query consists of {P 4 , P 7 , P 8 , P 9 }, which are higher values. theconsists result set of4 ,the query are consists of {Pof 4, P 7, P8, P9}, places which that are result set of the Therefore, skyline query of {P P7 , skyline P8 , P9 }, which indicators dangerous indicators of indicators of dangerous dangerous places places that that need need special special attention. attention. need special attention. Temperature Temperature

Wind Windspeed speed

P1 P1

11

99

P2 P2

22

66

P3 P3

55

33

P4 P4

9.5 9.5

1.5 1.5

P5 P5

9.0 9.0

4.5 4.5

P6 P6

66

77

P7 P7

99

88

P8 P8

77

99

P9 P9

55

10 10

P9 P9 P8 P8

P1 P1

Wind speed

Points Points

P7 P7 P6 P6 P2 P2

P5 P5 P3 P3 P4 P4

Temperature Temperature (a) (a)

(b) (b)

Figure2. 2.(a) (a)Skyline Skylineof ofsensor sensorreadings, readings,(b) (b)Representation Representation of of sensor sensor readings readings in in 2D 2D space space [11]. [11]. Figure Figure 2. (a) Skyline of sensor readings,

Another Another example example that that demonstrates demonstrates the the usefulness usefulness of of DS DS can can be be found found when when monitoring monitoring the the air air pollution pollution in in aa region region of of interest. interest. A A high high concentration concentration of of CO CO or or SO SO22,, or or both, both, can can be be aa strong strong indication indication

J. Sens. Actuator Netw. 2016, 5, 2

3 of 22

Another example that demonstrates the usefulness of DS can be found when monitoring the air pollution in a region of interest. A high concentration of CO or SO2 , or both, can be a strong indication that the location is highly polluted. An environmental scientist can issue a dynamic skyline query for CO and SO2 levels to monitor the air pollution surrounding a particular region of interest. DS query processing over moving objects is also very useful for numerous applications, such as location-aware computing, object tracking and monitoring, virtual environments, uncertain data stream, computer games, visualization, etc. Although the skyline query approach has potential for use in WSN applications, it faces challenges because of high processing complexity and cost related with updating the results. The main contributors to the high cost are the data access cost from storage locations and the processing cost while executing the user specified query for a dominance check. It is important to note that search efficiency and update criteria are two key requirements of skyline query processing and skyline result maintenance. Data-Centric Storage (DCS) [12,13], an alternate to External Storage (ES) and Local Storage (LS), is considered to be a promising and efficient storage and search mechanism. There has been a growing interest in understanding and optimizing WSN DCS schemes in recent years, where a various number of query mechanisms such as point query, range query, similarity searching, top-k query and skyline query can be applied in a consolidated framework. As part of developing such consolidated framework, this paper proposes a threshold based hierarchical approach, which uses temporal correlation among the sectors and sector segments to calculate DS in a region of interest. To our knowledge, this is the first practical demonstration of DS in DCS of WSN in current state-of-the-art. In this paper, first a novel tree building algorithm is incorporated so that each head node of a cluster could construct a tree with itself as a root. This allows exertion of three core operations such as Tree Propagation, Regular Update and Triggered Query from any head node with a particular range. The performance of the proposed model is analyzed by framing and implementing it in a distributed information delivery service running one or more applications in a WSN. In this service, a set of producer and consumer nodes exchange information by relaying packets through neighboring clusters for each application. This phenomenon is facilitated by a DCS [14] architecture, also referred to as DBDCS (Section 3.1). DBDCS is an adaptation of the magnetic disk storage platter consisting of tracks and sectors [15,16]. In DBDCS, each application uses a disc track and sector analogy to map data locations and uses distance based indexing method for storing and querying multi-dimensional similar data. The member nodes in each cluster report the sensed event to their associated head node, which aggregates the received events at the end of each epoch. The aggregated event is hashed to produce a hash key, which is mapped from a one dimensional domain into a metric space [17] (Section 3.3). The remainder of this paper is structured as follows: Section 2 provides an overview of the related work in the literature. Network architecture, data processing and mapping, insertion and skyline querying are illustrated in Section 3. This is followed by the simulation results and performance evaluation of EDDS presented in Section 4. The paper is concluded in Section 5. 2. Related Work According to the current state-of-the-art, skyline operator in database community was first introduced by Borzsonyi et al. [5]. In their work, they proposed the solution based on block nested loop (BNL) and divide-and-conquer (D & C). Inspired by BNL, Chomicki et al. [18] and Godfrey et al. [19] proposed two variants of BNL named sort-filter-skylines (SFS) and linear elimination sort for skyline (LESS), respectively. Later, two progressive processing algorithms such as Bitmap and Index were proposed by Tan et al. [20]. Nearest neighbor (NN) method and its variant NN with branch-and-bound (BB) were later presented by Kossman et al. [21] and Papadias et al. [22], respectively. However, these approaches are mainly suitable for centralized environments with high computational and energy resources and thus inappropriate for WSN environments.

J. Sens. Actuator Netw. 2016, 5, 2

4 of 22

A filter-based distributed algorithm for skyline evaluation and maintenance in WSN is proposed by Liang et al. [23]. Each sensor node uses a greedy algorithm to compute a local skyline certificate, which is a subset of the local skyline and forwards it to its parent. The parent or non-leaf node calculates the certificate based on a set of local skyline points plus the certificate received from child nodes. Following this process, the root calculates the set of skyline points, which is used as a Global Skyline Filter (GSF) or global certificate. The root broadcasts the global certificate followed by a simple merge-based algorithm to filter out the points from transmissions that cannot utilize the certificate. An energy efficient Sliding Window Skyline Monitoring Algorithm (SWSMA) is proposed in [24] to continuously maintain sliding window skylines over WSN. It reduces the amount of data transferred and reduces energy consumption, as a consequence, by devising two filter based algorithms: (1) Single point filter based algorithm referred to as a Tuple Filter (TF); and (2) Grid Filter (GF) based algorithm. Chen et al. [25] proposed two evaluation algorithms for finding the skyline points on a dataset progressively. The dataset is first partitioned into disjoint subsets. Then skyline points are returned through the examination of each subset progressively using discovered skyline points to filter out the unlikely skyline points from transmission. However, the features of DCS have not been considered in these approaches and thus direct implication of them in DCS is not suitable. Song et al. [26] propose a skyline query processing algorithm exploiting the key features of DCS. The algorithm processes a query in three stages on the basis of four assumptions: (1) choice of DCS framework is limited to KDDCS [27], GDCS [28] and DIM [29]; (2) all nodes are identified by unique ID; (3) each node knows the geographic location of its own and its neighbor nodes; and (4) each node knows the range of data stored in its neighbor nodes. In the first stage, the base station locates the Start Node (SN) where the query is to commence. In the second stage, SN creates a Skyline Query Message (SQM) and propagates the SQM to neighbor nodes to search for Candidate Nodes (CNs) where candidate data for the query are stored. This SQM transmission process is repeated until the complete candidate results are found. In the third and final stage BS generates a final query result from its gathered candidate dataset. The aforementioned assumptions made in this work are particularly expensive in a distributed environment like WSN especially DCS framework. Su et al. [30] proposed an algorithm, known as Skyline Sensor Algorithm (SkySensor), in a customized DCS method in order to collect and store all sensor readings and retrieve skyline results efficiently from the network. The major disadvantages of SkySensor are: (1) SkySensor needs increased effort from an application designer who has to write the center location of each cluster in such a way that no two clusters overlap; (2) number of clusters in a sensor network depends on the number of attributes of a tuple, therefore, an active sensor and actuator network, with higher rate of data generation and a lower number of attributes leads to high concentration of data in a small portion of the network which creates congestion and a hot spot around the cluster and limits the ultimate goal or advantage of DCS; and (3) in the case of resilience to node failure, an inefficient and old technique referred to as local replication is used, which is expected to incur high storage cost and increased data loss during a node group failure. Based on the above discussion, it is obvious that none of the work from the current literature has implemented DS in WSN, especially in DCS. It consumes great effort in pre-processing, such as gathering the local skylines of all sensors, mapping all detected data of the entire network (both inflow and outflow). The overall process is overwhelmed with higher overheads and complexity if the researcher or analysts are interested in a particular region. In this paper, a threshold based hierarchical DS approach is proposed and implemented using the temporal correlation among the sectors and sector segments. This allows reducing the transmission cost in a greater aspect and facilitates finding skyline points from both entire network and a particular region of interest. 3. Basic System Design In this section, at first, we briefly discuss the DBDCS network architecture that has been used for testing our proposed EDDS approach. Definition of skyline is restated in Section 3.2 in terms of

J. Sens. Actuator Netw. 2016, 5, 2

5 of 22

metric based searching. This is followed by the discussion of data processing, mapping and insertion technique in Sections 3.3 and 3.4. Finally, we have presented the EDDS approach in Section 3.5. 3.1. Network Architecture The surface/platter of a magnetic disk storage device consisting of tracks and sectors provides an interesting approach that J. Sens. Actuator Netw. 2016, 5, 2may be applied to a large scale WSN. This assumption led to the Disk 5Based of 21 Data Centric Storage (DBDCS) architecture, as shown in Figure 3a, dividing the rectangular field into a and sector Sj, respectively. Thetophysical deployment is mapped to arepresent m x n matrix, is the matrix of storage cells (referred as a sector) where row and column track where Ti and m sector Sj , number of tracks and n is the number sectors for each track. Hence, the nodes in the network are respectively. The physical deployment is mapped to a m x n matrix, where m is the number of tracks divided into S(mxn) sectors, each track. comprising Sector and are sector members that and n is the number sectors for each Hence, athe nodesHead in the(SH) network divided into S(mxn) SH i  1..that S  . Each communicate via one hopa to the SH (see(SH) Figure 3c), where node is configured to beto sectors, each comprising Sector Head and sector members communicate via one hop aware of the deployment layout knowing: (1)node eachisSH is assigned the sector as a the SH (see Figure 3c), where SHiby P r1..Ss. Each configured to with be aware of thenumber deployment virtualby address and (1) node id;SH andis(2) all member knownumber their own id and number tracks layout knowing: each assigned withnodes the sector as anode virtual address andofnode id; (m) (2) andallsectors (n) of the network field. As node shownidinand Figure 3b, the communication and member nodes know their own number of intra-sector tracks (m) and sectors (n) of(i.e., the communication sector members to SH or vice-versa) is constrained to communication one hop while inter-sector network field. Asfrom shown in Figure 3b, the intra-sector communication (i.e., from sector transmission multi-hop. simplification, the hop sensor nodes inside each sector areisnot shown members to SHis or vice-versa)For is constrained to one while inter-sector transmission multi-hop. explicitly in Figure 3b. Instead, an aggregated link (see Figure 3c) is shown to represent the total For simplification, the sensor nodes inside each sector are not shown explicitly in Figure 3b. Instead, an traffic from member nodes to head node. aggregated link (see Figure 3c) is shown to represent the total traffic from member nodes to head node.

Figure3.3.(a) (a)DBDCS DBDCSMapping; Mapping; (b) (b) inter-sector inter-sector communication; communication; and Figure and (c) (c) intra-sector intra-sectormember membernode nodetoto headnode nodecommunication communication head

3.2.Skyline SkylineDefinition Definitionwith withMetric-Based Metric-Based Searching Searching 3.2. Metric space space M M can can be d),d), where D is domain of objects and and d is the Metric be defined definedas asaapair pairMM= =(D, (D, where D the is the domain of objects d is distance function—d: D × D  R satisfying the following constraints (Equation (1)) for all objects a , b cD: the distance function—d: D ˆ D Ô R satisfying the following constraints (Equation (1)) for all, objects a, b, c P D: d a, b  0 (non-negativity) d pa, bq ě 0 pnon ´ negativityq d a, b  0 (identity) pidentityq d pa, bq “ 0 (1)(1) d (a, b)  d (b, a) dpa, bq “ dpb, aq (symmetry) psymmetryq cq, cď cq ptriangle inequalityq d (a, c)  d (a, b) dpa,  d (b ) dpa, bq ` dpb,(triangle inequality)

In thatsmaller smallervalues valuesare arepreferable preferabletoto larger ones a set Inthis thismetric metric space, space, considering considering that larger ones forfor a set of l-of l-dimensional data,the thedynamic dynamicskyline skylinequery queryresult resultset setcan canbebedefined definedas: as: dimensional data, Skyline(q, r):

ì" Î [1, l ], a, b Î I | a(i) £ b(i) ï i ï

and ü ï ï

(2)

J. Sens. Actuator Netw. 2016, 5, 2

6 of 22

Skyline(q, r): $ , 6 of 21 ’ & @i P r1, ls , a, b P I|apiq ď bpiq and / . X“ (2) Di P r1, ls, apiq ă bpiq and ’ / In Equation (2), q denotes the point, of the ith attribute of an object %query d pq, aq ď r a(i) denotes the value and r defines the range of the region of interest. The data space is divided into S sectors with a pivot Indenoted Equation q denotes the query a(i) key denotes the valuexof ith attribute ofas an(Figure object  the D can point, by(2), Pi, for each sector Si. Thepoint, iDistance for an object be defined and r defines the range of the region of interest. The data space is divided into S sectors with a pivot 4a, Equation (3)): point, denoted by Pi , for each sector Si . The iDistance key for an object x P D can be defined as iDistx   d Pi , x   i  c (3) (Figure 4a, Equation (3)): J. Sens. Actuator Netw. 2016, 5, 2

iDist pxq “ dfor pPi ,individual xq ` i ¨ c sectors. Given q  D , the region(3) In Equation (3), c is the separating constant of

interest with radius be defined as (Figure 4b,for Equation (4)):sectors. Given q P D, the region of In Equation (3),r ccan is the separating constant individual interest with radius r can be defined as (Figure 4b, Equation (4)):

r = éëd ( Pi , q) + i × c - r, d ( Pi , q) + i × c + rùû

(4) (4)

S2

12

13

14

15

8

9

10

11

T2

4

5

6

7

0

1

2

3

Tm

S1

T1

r “ rd pPi , qq ` i ¨ c ´ r, d pPi , qq ` i ¨ c ` rs

Sm S2

P2

S0

P0

S1

...

... 2C 3C 4C

11C 12C

... 15C

2C 3C 4C

(a)

... 11C 12C

15C

(b)

Figure 4. 4. (a) Data mapping; mapping; and and (b) (b) range range query query example. example. Figure (a) Data

3.3. Data Data Processing Processing and and Mapping Mapping 3.3.

A sensed sensed event eventEEcan canbe bedefined definedby byan anl-l-dimensional dimensionaltuple, tuple, 1, A2, A3, … Al) where A ,   1, l  A (A(A 1 , A2 , A3 , . . . Al ) where A gg, @ g gP r1, ls denotes the thegth gthattribute attribute and is domain the domain of attribute Agmember . Each member oftransmits a sector denotes and D AgDisAgthe of attribute Ag . Each node of anode sector xv y , v , ......, v the sensed event as an l-tuple , where 1 ď i ď M , M is the total number of member i1 l-tuple i2 vi1il, vki 2 ,......,vil , where 1k i k M k , Mk is the total number of transmits the sensed event as an nodes in kth sector and vij denotes the value of the jth kattribute received from ith node of kth sector. member nodes in kth sector and vij denotes the value of the jth attribute received from ith node of kth A SH node after collecting tuples from all the member nodes aggregates them at the end of each epoch sector. A SH node after collecting tuples from all the member nodes aggregates them at the end of before finding the mapping for the target SH. each epoch before finding the mapping for the target SH. Hence, after aggregation at epoch t Hence, after aggregation at epoch t M żk

v t  EkEpAgg ptqq “  vi1 , vxv k (Agg(t)) i 2 ,......, i1 , vi2 ,il ......, vil y ptq Mk

Aggi 1

(5.1) (5.1)

Aggi“1

  1 , 2 ,......., l t  “ xψ1 , ψ2 , ......., ψl y ptq Here,

! ) M M M ψj “ maxi“k1 vij , mini“k1 vij , avgi“k1 , j P r1, ls , k P r1, Ss

 j  max vij , min vij , avg Mk i 1

Mk i 1

Mk i 1

, j  1, l , k  1, S 

(5.2) (5.2) (6)

(6)

In Equation (6), Mk denotes the number of member nodes in kth sector. It is assumed that all attribute’s aggregated values of  i have been normalized to be between 0 and 1. As shown in Table 1, weights have been assigned to different attributes based on their importance in the event

J. Sens. Actuator Netw. 2016, 5, 2

7 of 22

In Equation (6), Mk denotes the number of member nodes in kth sector. It is assumed that all attribute’s aggregated values of ψi have been normalized to be between 0 and 1. As shown in Table 1, weights have been assigned to different attributes based on their importance in the event description. Hence, an attribute with higher weight has higher influence in deciding the similarity among events. Table 1. Weight settings. Attribute

Weight

A1 A2 .... Al

w1 w2 .... wl

3.3.1. Pivot Point Generation l ´´ ÿ

αmin “

¯ ¯ Aipminq {Aipmaxq ˆ wi

(7.1)

¯ ¯ Aipmaxq {Aipmaxq ˆ wi

(7.2)

i “1 l ´´ ÿ

αmax “ i “1 l ´´ ÿ

β“

¯ ¯ Aipavgq {Aipmaxq ˆ wi

(7.3)

i “1 l ´´ ÿ

δ“

¯ ¯ Aipθ q {Aipmaxq ˆ wi

(7.4)

i“1

In Equations (7.1)–(7.4), Ai(min) , Ai(max) , Ai(avg) and Aipθq denote the minimum, maximum, average and threshold value of ith attribute. Based on Equations (7.1) and (7.2), the domain of the hash key, denoted by HD , is α (αmin , αmax ). The center of gravity, denoted by β, is derived in Equation (7.3) to find the normalized center point of the domain of the hash key HD , whereas δ is the separating factor between two pivot points. However, in order to balance the load among sectors, it is important to find the range where the concentration of the data points is high. Hence, β and δ can be used to find the range for center of gravity, denoted by β (βrange´min , βrange´max ), as shown in Equation (8). β range´min “ β ´ δ β range´max “ β ` δ

(8)

Thus, the separating step, denoted by η, between two pivot points in COM range can be defined by: ` ˘ η “ β range´max ´ β range´min {S ´ 1 (9) Thus, the pivot points for S sectors can be defined in each sector head by: $ ’ i“0 & αmin , Pi “ β range´min ` i ˆ η, 0 ă i ă S ’ % α i“S max ,

(10)

J. Sens. Actuator Netw. 2016, 5, 2

8 of 22

Algorithm 1. Pivot Point Generation Algorithm (implemented at each SH node). Input: attrRangeTable (containing minimum, maximum, average and theta of each attribute), W (weights to different attributes based on their importance in the event description). Output: P (derived pivot point for each sector) 1: mapRec.minRange = mapRec.maxRange = 0 2: m = lengthof (attrRangeTable) J. Sens. Actuator 5, 2 8 of 21 3: for i Netw. = 1 to2016, m do 4: mapRec.minRange+=(attrRangeTable[i].min/attrRangeTable[i].max) ˆ W[i] 5:5: mapRec.maxRange+=(attrRangeTable[i].max)/attrRangeTable[i].max)× mapRec.maxRange+=(attrRangeTable[i].max)/attrRangeTable[i].max)ˆW[i] W[i] 6:6: mapRec.com mapRec.com += +=(attrRangeTable[i].avg)/attrRangeTable[i].max) (attrRangeTable[i].avg)/attrRangeTable[i].max)׈W[i] W[i] 7:7: mapRec.theta mapRec.theta += +=(attrRangeTable.theta)/attrRangeTable[i].max) (attrRangeTable.theta)/attrRangeTable[i].max)׈W[i] W[i] 8:8: ii == ii ++11 9:9:end endfor for 10: comLowerLimit 10: comLowerLimit= =mapRec.com mapRec.com- mapRec.theta - mapRec.theta 11: comUpperLimit = mapRec.com 11: comUpperLimit = mapRec.com+ +mapRec.theta mapRec.theta 12: - 1)- 1) 12:ηη= =(comUpperLimit (comUpperLimit- comLowerLimit)/(S - comLowerLimit)/(S 13: 13:for forj =j =0 0totoS S 14: 0)0) then P[j] = mapRec.minRange 14: ifif(j(j==== then P[j] = mapRec.minRange 15: else if (j == S) then P[j] = mapRec.maxRange 15: else if (j == S) then P[j] = mapRec.maxRange 16: else P[j] = comLowerLimit j ×j ηˆ η 16: else P[j] = comLowerLimit+ + 17: end if 17: end if 18: 11 18: j =j =j +j + 19: end for 19: end for

Figure 5. Pivot point generation example. Figure 5. Pivot point generation example.

Algorithm 1 uses Equations (8)–(10) for calculating pivot points. Figure 5 illustrates the domain and sub-domain of pivot points. 3.3.2. Mapping Mapping 3.3.2. Given ll attributes in aa WSN WSN application, application, Given attributes in in an an attribute attribute list list associated associated with with weight weight w wjj (1 (1 ď ≤ jj ≤ďl)l)in the source source SH generates the the hash hash value value by: by: the SHkk generates l



l ´´ ÿ k





M h  h “ avgiM1 vavg Ak (max) w ij i“1jv ij {A jpmaxqj ˆ w j j 1

¯

¯ (11) (11)

j“1

Hence, after each epoch, SHk forwards the aggregated event Ek   1 , 2 ,, l , t , h , where t Hence, after each epoch, SHk forwards the aggregated event Ek “ xrψ1 , ψ2 , . . . , ψl s , rt, hsy, where t i`Pi11 and denotes the epoch epochnumber, number,totothe thedestination destination sector head denoted i where, Pi ăh P Pii denotes the sector head denoted by by SHSH and P i where, Pi ď h and Pi+1 thelower lowerand andupper upperlimit limitofofith ithsub-interval, sub-interval,respectively. respectively. i`1isisthe 3.4. Insertion Within a sector, data are further distributed among nodes according to their distance from the SH. In order to do so, a sector is divided into segments. Figures 6 and 7 and Table 2 illustrate the idea of sector segmentation. Given a kth sector containing Mk member nodes, the SHk first sorts all member

J. Sens. Actuator Netw. 2016, 5, 2

9 of 22

3.4. Insertion Within a sector, data are further distributed among nodes according to their distance from the SH. In order to do so, a sector is divided into segments. Figures 6 and 7 and Table 2 illustrate the idea of sector segmentation. Given a kth sector containing Mk member nodes, the SHk first sorts all member nodes based on RSSI in ascending order. The member nodes are then divided into r segments. J. Sens. Actuator Netw. 2016, 5, 2 9 of 21 Each segment forms a circle, denoted by BpX,Yq (ri ), where the center of the circle is (X, Y) with radius ri . (X, Y) is the geographic co-ordinates for SHk . The number of segments depends on the WSN application, sector size and the number of member nodes in each sector. Thus the set of sensors that B( X ,Y ) (ri )  SensorsCoordinatex, y  :  X , Y , x, y   ri (12) are within a Euclidean distance ri from (X, Y) form the segment defined by:





BpX,Yq pri q “ tSensorsCoordinate       r , 1px,  kyq:S|pX, Yq , px, yq| ď ri u k 1

k

k

(12) (13)

β k “ pPk`1 ´ Pk q {r, (13) 1ďkďS k , i0 $  S  ’ k P & k ,  i, 0  i  r i “ 0 ! M K ) ( i ) S k 1 (14) P M K pi q “k 1P, k ` β ˆ (14) i, r 0ăiăr i   ’ k “1 % P , i “ r k`1 By Equations (13) and (14), the pivot points of r segments within the kth sector are calculated. By Equations (13) and (14), the pivot points of r segments within the kth sector are calculated. An event with hash value, denoted by h, is stored in a member sensor node of ith segment where An event with hash value, denoted by h, is stored in a member sensor node of ith segment where M ( i )  h  M (i 1) . In order to balance the load, data are distributed among the nodes inside a segment P MK piq ď h ď P MK pi`1q . In order to balance the load, data are distributed among the nodes inside a in a round fashion Algorithm 2). segment inrobin a round robin(see fashion (see Algorithm 2).



K



K

Algorithm Algorithm2.2.Search_Target_Node Search_Target_Node(segment[i]), (segment[i]),implemented implementedatateach eachSH SHnode. node. Input:segment[i] segment[i](a(adata data structure structure containing to to count thethe number of of Input: containing member membernode nodeID IDand andtally tally count number packets stored in this member node) packets stored in this member node) Output:return returnthe thetarget targetMember MemberNode NodeID. ID. Output: 1: sort segment[i] in ascending order based on segment[i].tally 1:2:sort segment[i] in=ascending order+1 based on segment[i].tally segment[i].tally segment[i].tally 2:3:segment[i].tally = segment[i].tally +1 memberNodeId = segment[i].ID 3:4:memberNodeId = segment[i].ID return memberNodeId 4: return memberNodeId

Figure Figure 6. 6. Formation Formation of of Segments Segments inside inside aa Sector. Sector.

SH m11

m12

...

m1i

First Segment

m21

m22

...

m2i

Second Segment

...

J. Sens. Actuator Netw. 2016, 5, 2

10 of 22

Figure 6. Formation of Segments inside a Sector.

SH m11

m12

...

m1i

First Segment

m21

...

m22

m2i

Second Segment

... mn1

m12

...

mni

rth Segment Figure Figure 7. Segmentation Segmentation architecture architecture of member nodes inside a sector.

J. Sens. Actuator Netw. 2016, 5, 2

10 of 21

Table 2. Member table of a SH node. Table 2. Member table of a SH node. Member Node Id

r2

r4

m1 m2 m3 m4 .. .. mi .. .. .. .. MK

Received Signal Strength Indicator RSSI1 RSSI2 RSSI3 RSSI4 .. .. RSSIi .. .. .. .. RSSIk

r1

r3

3.5. Skyline Query 3.5. Skyline Query A threshold based hierarchical approach has been used in this research to calculate DS (q, r). The A threshold based hierarchical approach has been used in this research to calculate DS (q, r). The threshold based hierarchical approach uses the temporal correlation among the sectors and segments threshold based hierarchical approach uses the temporal correlation among the sectors and segments of a sector. All the key notations used in this section are listed in the Table 3 to increase reader comfort. of a sector. All the key notations used in this section are listed in the Table 3 to increase reader comfort. Table 3. Skyline Notation. Table 3. Skyline Notation.

Symbol ei Symbol E ei θi E θi LDSiLDS i

𝑒𝑖 ≺(𝑞,𝑟) 𝑒𝑗

ei ăpq,rq

𝑒𝑖 ≼(𝑞,𝑟) 𝑒𝑗 ei ďpq,rq 𝑒𝑖 ≺(𝑞,𝑟) 𝐸 ei ăpq,rq ei ďpq,rq 𝑒𝑖 ≼(𝑞,𝑟) 𝐸

Description An event at SHi orDescription segment ri of a sector An event SHofi or segment ri of a sector Aatset events A set of events The threshold point of SH i or Segment ri The threshold point of SHi or Segment ri The local dynamic skyline of SHi or Segment ri The local dynamic skyline of SHi or Segment ri Event ej is dominated by event ei with respect to query point q and the target Event e j is dominated by event ei with respect to query point q and the target ej regionregion of interest is a circle of radius r centered of interest is a circle of radius r centeredatatq.q. Event ej is dominated by or equal to event ei with respect to query point q and Event e j is dominated by or equal to event ei with respect to query point q and ej the target of interest is a circle of radius r centered the region target region of interest is a circle of radius r centeredatatq.q. Each event in E is dominated by event ei with respect to query point q and the Each event in E is dominated by event ei with respect to query point q and the E target region of interest is a circle of radius r centered target region of interest is a circle of radius r centeredatatq.q. Each event E is in dominated by orbyequal to event ei with respect Eachin event E is dominated or equal to event ei with respecttotoquery query E point q and the target region of interest is a circle of radius r centered at q. q. point q and the target region of interest is a circle of radius r centered at

Algorithm 3. Build_Tree(), implemented at each SH. Input: n (total number of sectors (columns)), SELF_NET_ADDR Output: Node (to store the node values), L (to store the address of left child of each node), R (to store the address of right child of each node) 1: // Finding the Track (row) number of current sector 2: i = (SELF_NET_ADDR)/n; 3: index = S;

J. Sens. Actuator Netw. 2016, 5, 2

11 of 22

Algorithm 3. Build_Tree(), implemented at each SH. Input: n (total number of sectors (columns)), SELF_NET_ADDR Output: Node (to store the node values), L (to store the address of left child of each node), R (to store the address of right child of each node) 1: // Finding the Track (row) number of current sector 2: i = (SELF_NET_ADDR)/n; 3: index = S; 4: for j = 0 to S´1; 5: node [j] = (i, j) 6: if (i-1 ě 0) 7: L[j] = index; 8: N[index] = (i´1, j) 9: end if 10: Li = i´1 11: while (Li -1ě0) 12: L[index] = index + 1 13: Node[index] = (Li ´1, j) 14: index++ 15: Li –; 16: end while 17: index++ 18: if (i+1