Clustering Algorithms for Wireless Ad Hoc Networks
Lakshmi Ramachandran, Manika Kapoor, Abhinanda Sarkar, Alok Aggarwal
IBM India Research Laboratory, Block 1, IIT, Hauz Khas, New Delhi, India 110016
{rlakshmi, mkapoor, sabhinan, aggarwa}@in.ibm.com
ABSTRACT
Master controls the traffic to the Slaves¹. Inter-cluster communication is through common Slaves, also called Bridge nodes.
Efficient clustering algorithms play a very important role in the fast connection establishment of ad hoc networks. In this paper, we describe a communication model that is derived directly from that of Bluetooth, an emerging technology for pervasive computing; this technology is expected to play a major role in future personal area network applications. We further propose two new distributed algorithms for clustering in wireless ad hoc networks. The existing algorithms often become infeasible because they use models where the discovering devices broadcast their Ids and exchange substantial information in the initial stages of the algorithm.
Efficient clustering and topology construction algorithms play a very important role in the fast connection establishment of ad hoc networks. The performance of these algorithms is chiefly dependent on the device discovery time, i.e., the time taken by a node to discover and to connect to another node in its radio range (which is already part of the existing network). This device discovery time is also crucial in other situations. For example, when a large number of devices within radio range of each other are powered on, the time taken to complete the formation of the network is an important performance criterion.
We propose a 2-stage distributed O(N) randomized algorithm for an N node complete network, that always finds the minimum number of star-shaped clusters, which have maximum size. We then present a completely deterministic O(N) distributed algorithm for the same model, which achieves the same purpose. We describe in detail how these algorithms can be applied to Bluetooth for efficient scatternet formation. Finally, we evaluate both algorithms using simulation experiments based on the Bluetooth communication model, and compare their performance.
1.
In this paper, we investigate the problem of distributed cluster formation in an ad hoc wireless environment. Our model is derived from Bluetooth. It is an asynchronous system in which each node has a unique Id, but does not know the Id of any other node. A node trying to discover other nodes broadcasts a generic message and does not advertise its Id in the message. The replying node gives its Id in the reply message, but does not know which node it is replying to. However, after a device has discovered another device, much more information can be passed between them with relatively little overhead. The details of the model are given in the next section. Clearly, this model has unique features that conventional techniques cannot handle.
INTRODUCTION
Ad hoc networks are expected to play a significant role in future mobile computing applications. An ad hoc network consists of a set of self-organizing mobile nodes which require no fixed infrastructure, and which communicate with each other over wireless links. For efficient communication between nodes, ad hoc networks are typically grouped into clusters, where each cluster has a clusterhead. Bluetooth [1, 2] is an emerging technology for indoor wireless picocellular environments, and it employs a Master-Slave model for communication between nodes. In this model, each cluster has a star topology, with a Master at the centre of the star, and the
The paper is organized as follows. In Section 2 we describe the system model and give the problem statement. In Section 3 we propose a randomized O(N) distributed clustering algorithm for asynchronous complete networks of N nodes (all within radio range of each other), which can be used to construct a minimal set of star-shaped clusters of limited size, and prove the correctness of this algorithm. In Section 4 we propose a completely deterministic algorithm to achieve clustering. In Section 5 we describe in detail how these can be applied to Bluetooth. Finally, in Section 6 we evaluate both algorithms and compare their performance through detailed simulation experiments as applied to Bluetooth, and in Section 7 we summarize our contributions.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. DIAL M Workshop 2000 Boston MA USA Copyright ACM 2000 1-58113-301-4/00/08...$5.00
¹ In this paper, we use the terms 'Master' and 'Clusterhead', and 'Slave' and 'Non-clusterhead' interchangeably.
1.1
Related Work
are within radio range of each other. We assume that each node has a unique Id known to itself, but not to other nodes. The total number of nodes, N, and the maximum number of nodes that a single cluster can accommodate (excluding the clusterhead), S, are known to all the nodes. The network is asynchronous, and there is no notion of global time, with each node keeping its own local clock. There is also no centralized entity which has complete information on the whole network.
It is well known that partitioning an arbitrary graph into a set of clusters, all of maximum size, is NP-complete [20]. The election of clusterheads or leaders has been investigated in several papers [3, 5, 13, 4, 23]. There has been some work on the formation of clusters [9, 10]. Some recent work [6] has been done on cluster formation such that a node is either a clusterhead or is at most d hops away from a clusterhead. Some work has also been done on randomized algorithms for initializing packet radio networks [14]. Each of these employs a different model of communication, and allows the passing of different types of messages between the nodes during cluster formation. For example, in [5] a distributed algorithm for election of a single leader has been proposed for an asynchronous complete network with a limited number of faulty links incident at each node. The model assumes that each message sent by a node will definitely reach all nodes that are connected to it by non-faulty links. It also assumes that messages can be forwarded to other nodes if required. The information passed in the messages transmitted by a node includes the Id, the number of nodes it has eliminated in the leader election, the Ids of the nodes it has heard from, and the greatest Id that the node has seen. In [14] the initialization protocols involve checking the status of the channel after a broadcast round. In [24] the underlying topology is not assumed to be a complete graph, and the algorithm requires the exchange of neighborhood information between nodes. In [9] the nodes are required to know the entire topology. In [16] a randomized leader election algorithm has been proposed for anonymous networks, but it involves the broadcast of several rounds of messages and the forwarding of received messages. This problem reduces to the well-studied set-covering problem [11, 8], where each set has a leader and a given number of nodes.
The problem of minimizing intra-cluster communication costs is related to the facility location problem [17, 22]. Some work on Bluetooth network formation has been done in [21, 18].
1.2
All nodes use a common fixed set of frequencies to communicate. A node trying to discover another node repeatedly broadcasts a message on a sequence of frequencies. This sequence is determined by its local clock. The transmitting node listens in between broadcasts for replies. A listening node also listens on a sequence of frequencies, and a message is received only when the frequencies of the broadcasting node and the listening node match. When a listening node successfully receives a message, it sends a reply, which is also broadcast. However, the nodes use a random backoff mechanism before replying so that collisions can be assumed to be absent. The broadcast message used for discovery of other nodes does not contain the Id of the transmitting node, and the replying node does not know who it is replying to. Clearly, this model differs from other models found in the literature. Further, a node can be in one of the following states²:
INQUIRY: A device in this state broadcasts Inquiry packets, which do not contain the sender's Id or any other information, except that it is an Inquiry packet.
INQUIRY_SCAN: A device in this state listens for Inquiry packets and broadcasts an Inquiry Response packet in return. This response packet contains the sender's unique Id and clock (which can be used to determine its frequency at any future instant). A limited amount of information (at most a few bits) can be piggybacked on this packet.
PAGE: In this state, a device tries to make a connection to a node whose Id and clock are known to it, by sending Page packets, which contain the destination node's Id. If the connection is successful, then this node automatically becomes a Master.
PAGE_SCAN: In this state, a device listens for a Page packet and acknowledges it on receipt, completing the connection establishment with the node which sent the Page packet. If the connection is successful, this node automatically belongs to the cluster headed by the paging node.
CONNECTED: In this state, a device is part of a cluster and has a connection established with the clusterhead after a successful handshake as described above.
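The five states listed above can be written down as a small enumeration. The following is a minimal sketch, not part of the model definition: the names `State`, `COMPLEMENT`, `can_interact`, and `inquiry_response` are illustrative assumptions. It records that an Inquiry packet carries no Id, that the response carries only the scanner's Id and clock, and that two devices can make progress only when one is inquiring (or paging) while the other is scanning.

```python
from enum import Enum


class State(Enum):
    INQUIRY = "inquiry"
    INQUIRY_SCAN = "inquiry_scan"
    PAGE = "page"
    PAGE_SCAN = "page_scan"
    CONNECTED = "connected"


# Hypothetical helper table: each discovery/connection state and the state
# the peer must be in for a packet exchange to succeed.
COMPLEMENT = {
    State.INQUIRY: State.INQUIRY_SCAN,
    State.INQUIRY_SCAN: State.INQUIRY,
    State.PAGE: State.PAGE_SCAN,
    State.PAGE_SCAN: State.PAGE,
}


def can_interact(a: State, b: State) -> bool:
    """True when the two states are complementary (e.g. INQUIRY vs INQUIRY_SCAN)."""
    return COMPLEMENT.get(a) == b


def inquiry_response(scanner_id: int, scanner_clock: int) -> dict:
    """The reply names only the responder; the inquirer stays anonymous."""
    return {"id": scanner_id, "clock": scanner_clock}
```

For instance, `can_interact(State.INQUIRY, State.INQUIRY_SCAN)` holds, while two devices that are both in INQUIRY cannot discover each other.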
Bluetooth Application Space
The Bluetooth SIG aims to provide solutions for short-range wireless connectivity between pervasive devices, like PDAs, mobile phones, palmtops, laptops, pagers, etc. It is meant to be a cable-replacement solution for desktops, keyboards and other peripheral devices. The potential applications range from smart home appliances to wireless connectivity to backbone data networks. Bluetooth is being considered for use by the top players in the consumer electronics market. Products would include wireless headsets, cameras and portable games. The automotive industry is also looking to use Bluetooth technology as the key solution for onboard wireless communication systems, connecting vehicular and external networks. These and other applications in the office and classroom environments, like shared white boards, would make it important for the devices to quickly self-organize into an ad hoc network. Our work is intended to provide solutions to this problem.
2.
SYSTEM MODEL AND PROBLEM STATEMENT
We model the wireless ad hoc network as an undirected complete graph, where the set of nodes represents the devices in the network, and there is an edge between two nodes if they
A node is in one of these states at any given time, and since the nodes are not synchronized, the set of nodes in INQUIRY/INQUIRY_SCAN or PAGE/PAGE_SCAN is random. Clearly, two nodes should be in complementary states in order to discover each other. The INQUIRY/INQUIRY_SCAN states correspond to the device discovery phase of the connection setup. We also assume that connection establishment with a node that has been discovered using Inquiry packets is almost instantaneous due to the availability of the clock information. This means that the Page/Page Response packets are always delivered successfully if sent after a successful Inquiry process has been carried out. Any two nodes which are "connected" to each other are always in a Master-Slave configuration (one of the devices is a Master and the other is a Slave), and any messages can be passed over the link with very little overhead. We also assume that any device is equally suited to become a Master or a Slave, and each device is equally likely to request a connection establishment to any of the other devices.
² We use Bluetooth terminology to describe the states of the nodes and the messages.
2.1
Problem Statement
The main objective of this work is to develop efficient distributed algorithms for cluster formation, which can be used by a set of asynchronous, self-organizing nodes to form an ad hoc network, under the above model. When a set of nodes are powered on at the same time, they start executing these algorithms immediately in order to form a connected set of clusters. These topology construction algorithms should ensure the following.
• The nodes should be organized into star-shaped clusters, each of which has a clusterhead. Each node should either be identified as a Slave or a Master, where a Master is at the center of the star, and the rest are all Slaves.
• The maximum number of Slaves per cluster is S.
• The size of the clusters should be at their maximum.
• The exchange of roles between a Master and a Slave is expensive, and should be avoided.
• The transfer of nodes from one cluster to another after connection is established is also expensive, and should be avoided.
• At the end of the algorithm, each node knows whether it is a Master or a Slave. If it is a Slave, it knows the Master of the cluster it belongs to.
• The network should be connected, and there should be no orphan nodes.
• On termination, a single node should have complete information about all the clusters.
The second stage corrects the effect of the randomness introduced in the previous stage by using a deterministic algorithm to decide on the final set of Masters and Slaves, and to efficiently assign Slaves to Masters. A Super-master is elected, which counts the actual number of Masters and collects information about all the nodes. The Super-master can then run any centralized algorithm to form a network of desired topology. The election of the Super-master is interleaved with the cluster formation, which speeds up the ad hoc network formation. The algorithm is described below:
Stage I: Each of the N nodes conducts T rounds of Bernoulli trials with P[success] = p. A node which is successful at least once becomes a Master-designate, and the remaining nodes become Slave-designates. Let X be the random variable denoting the number of Master-designates at the end of Stage I. Let X_i be the number of successes in round i. We choose p such that P[X_i > m] is very small, where m is typically 1. From [12, 7, 15], we have
$$P[X_i > m] < \left[\frac{p}{m/N}\right]^{m} \left[\frac{1-p}{1-m/N}\right]^{N-m} \qquad (1)$$
In order to almost ensure we get less than m Master-designates per round, we equate
$$p^{m}(1-p)^{N-m} = \eta \left[\frac{m}{N}\right]^{m} \left[1-\frac{m}{N}\right]^{N-m} \qquad (2)$$
and set $\eta$ to be very small, typically 0.001. This has to be solved for p. LHS > RHS for p = m/N, and LHS < RHS for p = 1. LHS is concave in p, and the Newton-Raphson method converges quickly. We use the following Chernoff bound formulae [14, 19]:
$$P[Z > (1+\epsilon)E[Z]] < e^{-\frac{\epsilon^{2}}{3}E[Z]} \qquad (3)$$
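As an illustration of Stage I, equation (2) can be solved numerically and the Bernoulli trials simulated. The sketch below is hedged in several ways: it uses bisection on the logarithm of equation (2) instead of the Newton-Raphson iteration mentioned in the text, it assumes the root of interest is the one in (0, m/N), where bound (1) applies, and the function names `solve_p` and `simulate_stage1` are hypothetical.

```python
import math
import random


def solve_p(n: int, m: int = 1, eta: float = 0.001) -> float:
    """Solve p^m (1-p)^(n-m) = eta (m/n)^m (1-m/n)^(n-m) for p in (0, m/n).

    Works in log space; bisection is an assumed substitute for Newton-Raphson.
    """
    if not (0 < m < n and 0.0 < eta < 1.0):
        raise ValueError("need 0 < m < n and 0 < eta < 1")
    ratio = m / n
    target = math.log(eta) + m * math.log(ratio) + (n - m) * math.log(1.0 - ratio)

    def f(p: float) -> float:
        return m * math.log(p) + (n - m) * math.log(1.0 - p) - target

    hi = ratio          # f(hi) = -log(eta) > 0
    lo = hi / 2.0
    while f(lo) > 0.0:  # walk left until the sign changes
        lo /= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def simulate_stage1(n: int, p: float, rounds: int, rng: random.Random) -> int:
    """Count Master-designates: nodes with at least one success in `rounds` trials."""
    return sum(
        1 for _ in range(n) if any(rng.random() < p for _ in range(rounds))
    )
```

A call such as `simulate_stage1(40, solve_p(40), T, random.Random(0))` then gives one sample of X for a chosen number of rounds T; how T is chosen is not covered by this sketch.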
P[Z [...] 2k] = P[X > (1 + 1)k] [...]
[When X >] k, a Super-master-designate receives k messages from Proxy-slaves. It then waits for responses from all X clusters. The cluster information passed to it involves X = O(N) messages. This makes the message complexity for X > k also O(N). Clearly, any additional messages to be sent by the Super-master in order to inform the clusters about bridges and neighbors are also O(N).
3.2
Figure 3: Randomized Algorithm for X more than k.
If the actual number of Masters, X, is greater than the required number, k, then the Super-master-designate knows this since at least one cluster would not be full (see Figure 3). Each Super-master-designate inquires to collect responses from all clusters. When the total number adds up to the number of nodes in the network, the Master of the Proxy-slave with the highest Id is declared the Super-master, and all the cluster information is sent to it. Since we need exactly k clusters, the extra clusters are torn down and the nodes distributed among the k largest clusters. When the Super-master has information about all the nodes, all the clusters are informed about the identity of the Super-master and the algorithm terminates.
Lemma 1: Exactly one Super-master is elected during Stage 2 of the algorithm.
PROOF: See Appendix A.
Lemma 2: Each Slave-designate belongs to exactly one cluster and each node knows which cluster it belongs to and whether it is a Master or a Slave.
PROOF: See Appendix A.
It should be noted that the messages used towards the end carry a lot of information, but can be sent in a relatively short time since, by the time these messages are used, the Super-master knows the Id and Clock of all the nodes. The pseudocode of the algorithm executed by the nodes is given in Appendix B.
Lemma 3: Exactly k clusters are formed, with k - 1 clusters of size S + 1 and one cluster of size N mod (S + 1).
PROOF: See Appendix A.
THEOREM 1. Every execution of the 2-stage randomized algorithm produces exactly $k = \lceil N/(S+1) \rceil$ star-shaped clusters, with k - 1 clusters of size S + 1 and one cluster of size N mod (S + 1), and exactly one Super-master is elected which has information about all the nodes in the network.
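The cluster sizes claimed by Lemma 3 and Theorem 1 follow from integer division of N by S + 1. A minimal sketch, with the hypothetical helper name `cluster_sizes`, is given here; it assumes that when N is an exact multiple of S + 1 the last cluster is simply full, a case in which the expression N mod (S + 1) would otherwise read as zero.

```python
import math
from typing import List


def cluster_sizes(n: int, s: int) -> List[int]:
    """Sizes (Master included) of the clusters for n nodes and at most s Slaves each."""
    full, remainder = divmod(n, s + 1)
    sizes = [s + 1] * full
    if remainder:
        # One final, partially filled cluster of size n mod (s + 1).
        sizes.append(remainder)
    return sizes


# The number of clusters equals k = ceil(N / (S + 1)).
assert len(cluster_sizes(30, 7)) == math.ceil(30 / 8)
```

With S = 7, the Bluetooth case, 30 nodes give three clusters of eight devices and one of six.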
At this point, since a single node knows about all the nodes in the ad hoc network, any centralized algorithm can be used to connect the clusters using Bridge nodes to form the desired topology. For example, in order to form a completely connected network, where each cluster is connected to every other cluster, the Super-master selects one Bridge node between any pair of clusters, and sends the messages to the Masters.
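The completely connected example can be sketched as a centralized step run by the Super-master. This is an illustrative assumption and not the procedure of the paper: the function name `select_bridges` is hypothetical, and the policy of picking the Slave with the fewest bridge duties so far (ties broken by lowest Id) is only one of many possible choices.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def select_bridges(clusters: Dict[int, List[int]]) -> Dict[Tuple[int, int], int]:
    """Pick one Bridge (a Slave Id) for every pair of clusters.

    `clusters` maps a Master Id to the list of its Slave Ids.
    """
    duty: Dict[int, int] = {}  # how many pairs each Slave already bridges
    bridges: Dict[Tuple[int, int], int] = {}
    for a, b in combinations(sorted(clusters), 2):
        candidates = clusters[a] + clusters[b]
        if not candidates:
            raise ValueError(f"clusters {a} and {b} have no Slave to act as Bridge")
        # Assumed policy: least-loaded Slave first, lowest Id on ties.
        bridge = min(candidates, key=lambda s: (duty.get(s, 0), s))
        duty[bridge] = duty.get(bridge, 0) + 1
        bridges[(a, b)] = bridge
    return bridges
```

For k clusters this yields k(k - 1)/2 Bridge assignments, which the Super-master would then send to the Masters concerned.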
3.1
Proof of Correctness
We prove the correctness of the algorithm described above by showing that when it terminates, there are exactly $k = \lceil N/(S+1) \rceil$ star-shaped clusters, with k - 1 clusters of size S + 1 and one cluster of size N mod (S + 1). Each node knows which cluster it belongs to and whether it is a Master or a Slave. We also prove that exactly one Super-master is elected which has information about all the nodes in the network.
3.
4.
Message Complexity
We first examine the case when X = k. In the first stage, the formation of each cluster takes S messages, one from each Slave-designate to the Master-designate. There is one message from each Master-designate to the first Slave asking it to become a Proxy-slave. Therefore, the first stage requires kS + k messages. During the Super-master election, a Master can collect O(k) responses from Proxy-slaves. The cluster information sent also involves O(k) messages. Therefore, the message complexity for X = k is O(kS + k + k) = O(N) messages.
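The step from kS + k to O(N) can be spelled out as follows, using only the value of k stated in Theorem 1 and the fact that S < N:

```latex
\begin{align*}
k = \left\lceil \frac{N}{S+1} \right\rceil &< \frac{N}{S+1} + 1, \\
kS + k = k(S+1) &< N + S + 1 < 2N + 1, \\
kS + k + O(k) &= O(N) \quad \text{since } k \le N .
\end{align*}
```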
PROOF. The theorem is proved by Lemmas 1, 2, and 3. □
A DETERMINISTIC ALGORITHM FOR CLUSTER FORMATION
In this section, we present a deterministic algorithm for forming star-shaped clusters of maximum size, in a distributed fashion. The assumptions and model are the same as in the previous case. In this algorithm, the nodes elect multiple Masters autonomously instead of using Bernoulli trials to determine potential Masters. This algorithm also requires all nodes to alternate between INQUIRY and INQUIRY_SCAN states, which increases the expected time for discovering a node. It assumes that up to log S bits of information can be piggybacked on the Inquiry_response packet. However, it is much simpler to implement than the previous one. The basic idea of this algorithm is that nodes discovering each other form a tree of responses, the root of each tree being a Master
When X < k, as before, each of the X clusters takes S messages. The remaining N - SX nodes respond to either the Super-master or to the k - X newly designated Masters.
releases some nodes back into the free pool, which can be at most S - 1 in number. Therefore, the worst case message complexity for Stage-1 is O(k · (2S - 1 + S - 1)) = O(N).
5.
Bluetooth is an emerging technology for low cost, low power, indoor picocellular environments. According to the Bluetooth specification [1, 2], the smallest network unit is a piconet, consisting of a Master Bluetooth device and several Slave Bluetooth devices. A Master is responsible for controlling the traffic on the piconet, and the Slaves send packets on a slotted Time Division Duplex (TDD) channel in response to the Master's polling. The maximum number of active devices in a piconet is eight. Multiple piconets with overlapping coverage areas form a scatternet, as illustrated in Figure 5. Inter-piconet communication is through Bridges, which participate in more than one piconet by time-multiplexing. A device can be a Slave in more than one piconet (a Slave-Slave bridge), but can be a Master in only one piconet. However, it can be a Master in one piconet and a Slave in another (a Master-Slave bridge).
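The piconet rules in this paragraph (at most eight active devices, a device is Master of at most one piconet, Bridges belong to several piconets) can be checked mechanically. The sketch below is illustrative only; the data layout (a list of `(master, slaves)` pairs) and the names `find_violations` and `find_bridges` are assumptions made for the example.

```python
from collections import Counter
from typing import List, Sequence, Tuple

MAX_ACTIVE = 8  # active devices per piconet, Master included

Piconet = Tuple[str, Sequence[str]]  # (Master address, Slave addresses)


def find_violations(piconets: List[Piconet]) -> List[str]:
    """Return a description of every rule broken by the given scatternet."""
    problems = []
    master_count = Counter(master for master, _ in piconets)
    for master, count in master_count.items():
        if count > 1:
            problems.append(f"{master} is Master of {count} piconets")
    for master, slaves in piconets:
        if 1 + len(set(slaves)) > MAX_ACTIVE:
            problems.append(f"piconet of {master} exceeds {MAX_ACTIVE} devices")
        if master in slaves:
            problems.append(f"{master} is listed as its own Slave")
    return problems


def find_bridges(piconets: List[Piconet]) -> List[str]:
    """Devices that take part in more than one piconet, in any role."""
    membership = Counter()
    for master, slaves in piconets:
        for device in {master, *slaves}:
            membership[device] += 1
    return sorted(d for d, count in membership.items() if count > 1)
```

In the scatternet `[("A", ["b", "c", "x"]), ("D", ["x", "A"])]`, device `x` is a Slave-Slave bridge and device `A` is a Master-Slave bridge, and no rule is violated.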
Figure 4: Deterministic Algorithm
(see Figure 4). This parallelizes the formation of each cluster (as against the randomized algorithm, where all Slaves of a cluster had to reply to one Master-designate/Master). Each node i maintains a variable i.phase which is the number of Inquiry_responses received by it and all the nodes in its subtree. Once a node receives an Inquiry_response from another node, it increments its phase by the phase of the replying node. A node which sends an Inquiry_response goes to the PAGE_SCAN state, and is out of the competition for becoming a Master. A node whose phase is S + 1 declares itself Master, and all the nodes which replied to it (directly or indirectly) and contributed to its phase, are its Slaves. However, unlike in the previous algorithm, the Master does not at first have the Ids and Clocks of all its Slaves. Therefore, it first connects to all those nodes which directly replied to it. Once a connection is established, the Slave sends information about the replies that it had got, to its Master. This type of chaining of message exchanges eventually leads to the Master collecting information about all the nodes in its subtree. It then connects to all its Slaves directly to complete the star formation. However, there is a possibility of the phases of two nodes adding up to more than S. In such a situation, the Master who has received the response instructs either some of its Slaves or those of the responding node to go back to INQUIRY state.
The second half of the algorithm involves the election of a Super-master among the Masters. To achieve this, we repeat the above algorithm (after the clusters are formed), among the Masters. In this case, the first node which reaches a phase of k becomes the Super-master and conveys this message to all nodes. This can also be done by repeating the Super-master election strategy of Stage-2 of the randomized algorithm, described in Section 3. However, this time, it is much simpler since the number of clusters is always exactly k. But the second stage can start only after the clusters are completely formed, since the Masters are not determined until all the Slaves of their respective clusters are found. The pseudo-code for the first half of the deterministic cluster formation algorithm is given in Appendix C. The proof of correctness is straightforward, and we do not give it here because of space constraints.
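The phase-merging idea of the first half can be illustrated with a short simulation. This is a simplified sketch under stated assumptions, not the pseudo-code of Appendix C: radio discovery is replaced by picking two tree roots at random, every node is assumed to start with phase 1 (its own tree), the phase of a root is taken to be the size of its tree, and on overflow the surplus nodes released back to INQUIRY are chosen arbitrarily. The name `form_clusters` is hypothetical.

```python
import random
from typing import Dict, List, Set, Tuple


def form_clusters(n: int, s: int, seed: int = 0) -> List[Tuple[int, Set[int]]]:
    """Return (master, members) pairs; members include the Master itself."""
    rng = random.Random(seed)
    # Every node starts as the root of a one-node tree (phase 1).
    trees: Dict[int, Set[int]] = {i: {i} for i in range(n)}
    clusters: List[Tuple[int, Set[int]]] = []
    while len(trees) > 1:
        # Stand-in for discovery: one root inquires, another root responds.
        inquirer, responder = rng.sample(sorted(trees), 2)
        merged = trees.pop(inquirer) | trees.pop(responder)
        if len(merged) > s + 1:
            # Phase overflow: release surplus nodes back to the INQUIRY state.
            surplus = sorted(merged - {inquirer})[: len(merged) - (s + 1)]
            for node in surplus:
                merged.discard(node)
                trees[node] = {node}
        if len(merged) == s + 1:
            clusters.append((inquirer, merged))  # inquirer declares itself Master
        else:
            trees[inquirer] = merged
    # A leftover tree, if any, forms the single smaller cluster.
    clusters.extend(trees.items())
    return clusters
```

Under these assumptions, `form_clusters(30, 7)` always ends with three clusters of eight nodes and one of six, whatever the seed, because a tree leaves the pool only when its phase reaches exactly S + 1.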
4.1
APPLICATION TO BLUETOOTH SCATTERNETS
Figure 5: Bluetooth Scatternet (piconets of Master, Bridge, and Slave nodes)
Bluetooth devices 'discover' each other by executing the Inquiry and Page procedures. The Inquiry message broadcast by the source contains not the device address but an access code, which is used to identify a class of devices. In between broadcasts of Inquiry messages, the source also listens for responses. The device in INQUIRY_SCAN state periodically listens for a fixed interval of time, called the inquiry scan window. The PAGE and PAGE_SCAN states are used for establishing a connection. A device which successfully pages another device becomes the Master and the paged device becomes its Slave. Bluetooth also defines procedures for a Master-Slave switch, which is a role exchange between a Master and a Slave. The standard also allows a Master to transfer its Slaves to another Master. However, these procedures are expensive operations, and hence we keep them to a minimum in our solution.
Message Complexity
Each Slave of a cluster has replied to exactly one Inquiry message, accounting for S messages per cluster. Each Master then sends a message to each of its S Slaves to connect to it. If the phase of a node adds up to exactly S + 1, then the number of messages to form the cluster is thus S + S. When the phase of a node adds up to more than S + 1, it
In the randomized algorithm, we use continuous inquiry and inquiry scan, while in the deterministic algorithm, we make the devices alternate between inquiry and inquiry scan states. It is clear that an inquiry procedure can take a fair amount of time even for two devices to discover each other, and an inquiring device has no control on when the message will get caught by another device. The Inquiry messages cannot take any extra information, while the Inquiry response packet has a few undefined bits which can be used to convey useful information during device discovery. Once a connection is established, any amount of information can be exchanged between the nodes without much overhead. The messages used in our model during device discovery translate directly to those defined in the Bluetooth Inquiry and Page procedures. This makes the proposed algorithms extremely well-suited for application to Bluetooth scatternets.
6.
SIMULATION RESULTS
In this section, we compare the performance of our algorithms using simulation experiments. We have developed the simulation setup described below to capture the behavior and study the performance of these algorithms. We simulated the Bluetooth inquiry procedures in detail in order to determine the device discovery time. Since the page procedures are assumed to take almost no time once the device address and clocks are known, we do not simulate the page procedures in detail. Time is measured in slots, in keeping with the Bluetooth channel's slotted structure. We let the nodes arrive at uniformly random times, within a short time interval of each other (all nodes power on within the first 100 slots), in order to get random shifts in the phases of their clocks. The inquiry scan window is 18 slots, during which a node listens on a single frequency. At the start of the simulation, the frequencies are chosen randomly for each device, and from then on they change once every 1.28 seconds. This start frequency and the clock-offset from the start are then used to calculate the frequency corresponding to any later point in time. The inquiring device sends the inquiry message on a train of 16 frequencies, and the number of train repetitions during inquiry is at its maximum of 256. It sends this inquiry message on one train, say trainA, 256 times and then, on the complementary train of 16 frequencies, say trainB, another 256 times. It thus alternates 256 repetitions of each train for as long as it is in the inquiry mode. We assume an error-free wireless channel, where an inquiry response always reaches the inquirer. We implemented the proposed algorithms on top of these inquiry and page procedures in order to get realistic times for the scatternet formation. The total number of nodes for which we run these experiments varies from 25 to 80. The maximum number of nodes in a piconet or cluster, including the Master, is 8.
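Two pieces of this setup are simple enough to sketch directly: the alternation between trainA and trainB, and the random power-on times. The code below is only an illustration with hypothetical names (`active_train`, `power_on_slots`); it is not the simulator used for the experiments, and it indexes the schedule by train repetition because the text does not state how repetitions map onto slots.

```python
import random
from typing import List

TRAIN_SIZE = 16          # frequencies per train
REPETITIONS = 256        # repetitions of one train before switching
SCAN_WINDOW_SLOTS = 18   # inquiry scan window
POWER_ON_WINDOW = 100    # all nodes power on within the first 100 slots


def active_train(repetition: int) -> str:
    """Train used for the given repetition index (0-based): A, then B, alternating."""
    return "A" if (repetition // REPETITIONS) % 2 == 0 else "B"


def power_on_slots(n: int, seed: int = 0) -> List[int]:
    """Uniformly random power-on slot for each of n nodes."""
    rng = random.Random(seed)
    return [rng.randrange(POWER_ON_WINDOW) for _ in range(n)]
```

Repetitions 0 to 255 use trainA, repetitions 256 to 511 use trainB, and the pattern then repeats.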
Figure 6 shows, for 40 to 80 nodes, the progress of cluster formation with time, for the randomized algorithm. As can be seen, 50% of the clusters are formed in a short time, typically the first 50 slots. However, for all clusters to be formed completely, it takes as long as 12,000 slots for the last of 5 clusters (see Figure 8). This is effectively the time taken to elect the Super-master, which does not increase with the total number of nodes. The percentage of Slave-designates that have been caught by Master-designates in the first 50 slots is very high (70%). This is because the nodes are in continuous inquiry and scan, and hence, the states are always complementary.
Figure 6: Scatternet formation time for the Randomized Algorithm.
Figure 7: Cluster formation time for the Deterministic Algorithm.
For the deterministic algorithm, a node first goes into inquiry or scan, each with probability 0.5. The period of an inquiry is 2048 slots, and the number of train repetitions is as before, and the scan period is 18 slots. The cluster formation times (which in this case correspond to Master determination time) are shown in Figure 7. Although about 50% of the nodes have responded to an inquiring node in the first 50 slots, the Masters are declared elected only between 1000 and 1100 slots, irrespective of the total number of nodes. This is because the scan period is much smaller than the inquiry period. This leads to the possibility of most of the nodes inquiring (or scanning) simultaneously, reducing the period of overlap between the inquiry window of one node and the scan window of another. In stage 2, the Super-master is determined by repeating the phase-based algorithm to elect a single leader. The time taken for this is plotted against the total number of nodes in Figure 8. As expected, as the total number of nodes increases, we find a steady increase in the scatternet formation time, from about 28,000 slots for 30 nodes to 40,000 slots for 80 nodes. The deterministic algorithm which uses a phase-based Super-master election takes about 2.5 times longer to completely form the scatternet, compared to the randomized one. As mentioned before, Stage-2 of the randomized algorithm can be used for the Super-master election in this case too, in order to speed up the scatternet formation. However, in this case, Stage-2 can start only after the CLUSTER_TO, which was kept at 2500 slots in our simulations.
Figure 8: Scatternet formation time (Deterministic and Randomized algorithms, plotted against the total number of nodes)
[6]
A. D. Amis, R. Prakash, T. H. P. Vuong, and D. T. Huynh. Max-min D-cluster formation in wireless ad hoc networks. In INFOCOM, 2000.
[7]
H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on sums of observations. Ann. Math. Statist., 23:493-507.
[8]
V. Chvatal. A greedy heuristic for the set-covering problem. In Mathematics of Operations Research, pages 233-235, 1979.
[9]
B. Das, E. Sivkumar, and V. Bhargavan. Routing in ad hoc networks using a spine. In IEEE Int. Conf. On Computers, Communication, and Networks., pages 1-20, Sept. 1997.
7.
[10]
R. Dechter and L. Kleinrock. Broadcast communication and distributed algorithms. IEEE Trans. Computers, pages 210-219, 1986.
[11]
R. Duh and M. Furer. Approximation of k-set cover by semi-local optimization. In 29th Annual ACM Symp. on Theory of Computing, pages 256-264, 1997.
CONCLUSIONS AND FUTURE WORK
In this paper, we have proposed two new distributed clustering algorithms for wireless ad hoc networks. Our model is based on that of Bluetooth, an emerging technology for future indoor picocellular environments. We presented a 2-stage O(N) randomized algorithm for an N node complete network, which finds the minimum number of star-shaped clusters, all at their maximum size. We also proved the correctness of this algorithm. We then presented a completely deterministic O(N) algorithm in which clusterheads are elected autonomously by the nodes. We compared their performance using simulations on top of Bluetooth's device discovery procedures. Results show that the randomized algorithm performs better with respect to both cluster and network formation times.
[12] G. S. Fishman. Monte Carlo, Concepts, Algorithms and Applications. Springer-Verlag, 1996.
[13]
M. Gerla and J. T. C. Tsai. Multicluster, mobile, multimedia radio network. A CM Baltzer Journal of Wireless Networks, 1(3):255-265, Oct. 1995.
[14]
T. Havashi, K. Nakano, and S. Olariu. Randomized initialization protocols for packet radio networks. In
13th Int. Parallel Processing Symp. And lOth Symposium on Parallel and Distributed Processing, 1999.
[15]
Future work could include clustering algorithms for this model based on a mobility model and service classes. It would also be interesting to explore the same problem for incomplete graphs, where not all devices are within radio range of each other. Defining optimal topologies and designing efficient routing protocols are also related problems of interest.
W. Hoeffding. Probability inequalities for sums of bounded random variables. J. Amer. Statist Assoc, 58:13-29, 1963.
[16] A. Itah and M. Rodeh. The lord of the ring, or probabilistic methods for breaking s y m m e t r y in distributed networks. Technical Report RJ 3110, IBM, Yorktown Heights, N.Y, 1981.
[17]
M. R. Korupolu, C. G. Plaxton, and R. Rajaraman. Analysis of a local search heuristic for facility location problems. In Proc. of 9th Annual ACM-SIAM Syrup. on Discrete Algorithms, pages 1-10, 1998.
[2] http://www.bluetooth.net.
[18]
[3] H. H. A. Amara. Fault-tolerant distributed algorithm for election in complete networks. IEEE Trans. Computers, 37(4):449-453, 1988.
A. Mizutani, T. Aihara, S. Shimotono, and H. Ishikawa. Bluetooth scatternet. Technical Report RT5176, IBM Tokyo Research Lab, 1999.
[19]
[4] H. H. A. A m a r a and A. Kanevsky. On the complexities of leader election algorithms. 5th Int. Conf. on Computing and Information., 1993.
R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
[20]
C. V. Ramamoorthy, J. Srivatsava, and W. T. Tsai. Clustering techniques for large distributed systems. In INFOCOM, 1986.
[5] H. H. A. A m a r a and J. Lokre. Election in asynchronous complete networks with intermittant link failures. IEEE Trans. Computers, 43(7):778-788, 1994.
[21] T. Salonidis, P. Bhagwat, and L. Tassiulas. Proximity awareness and fast connection setup in Bluetooth.
8.
REFERENCES
[1] http://www.bluetooth.com.
Submitted for publication.
61
APPENDIX
A. PROOF OF CORRECTNESS
A.1 Proof of Lemma 1
PROOF. We examine the cases $X \le k$ and $X > k$ separately. We prove this lemma by showing that in each case, there is at least one Super-master-designate which becomes a Super-master, and there cannot be two Super-masters.

Case 1) $X \le k$: By Assumption-2, within the SUPERM_TO period, each of the X Master-designates/Masters catches at least one Slave. So the number of Proxy-slaves is also X. Since there are no more Proxy-slaves, the nodes time out after SUPERM_TO units. Therefore, there exists at least one Super-master-designate, which sends messages to all the Proxy-slaves to collect information about the clusters. If there is more than one Super-master-designate, they would have received responses from each other's Proxy-slaves. Since all nodes have unique Ids, there is exactly one Master with the highest Id, which becomes the Super-master. Thus, out of the existing Master-designates, exactly one Super-master is elected. If any of the new Masters with a Proxy-slave which has a higher Id enters the Super-master competition, then there is a possibility of there being two Super-masters. However, this is avoided since, once a Super-master has been elected, none of the Proxy-slaves scan any more. The first $k - X$ (or $k - X - S$, as the case may be) nodes that respond to the Super-master are forced to become Masters and hence inquire and discover the remaining nodes. The new Masters, and the Slaves of the Super-master who have been made Masters, do not assign Proxy-slaves. The only other way there can be two Super-masters is if the set of $N - X - SX$ nodes try to form a parallel network. This contradicts the fact that there are exactly X Master-designates. Thus at most one Super-master is elected out of all the N nodes.

Case 2) $X > k$: By Assumption-2, there is at least one node which becomes a Super-master-designate before the timeout SUPERM_TO. Clearly, after collecting cluster information from k Proxy-slaves, the total number of known nodes would be less than N. Hence, the node knows that the number of clusters is more than k. It tries to collect information from all the Proxy-slaves that responded to it. By Assumption-2, after SUPERM_TO, at least one of the Super-master-designates will hear from all X Proxy-slaves. Therefore, there is at least one Super-master-designate who knows about all N nodes. Since the nodes have unique Ids, and the Master with the highest Id is chosen to be the Super-master, there is exactly one Super-master that is elected. □

A.2 Proof of Lemma 2
PROOF. Each time a Slave-designate responds to an Inquiry packet, it waits to be connected immediately by the Inquirer. If it gets connected to the Inquirer, it does not scan any more and hence belongs to one cluster. On the other hand, if the Inquirer does not connect to it (because its cluster is full), then it goes back to the INQUIRY_SCAN state, and the process is repeated until it gets connected to its Inquirer. Thus a Slave-designate belongs to exactly one cluster. Master-designates and Slave-designates are determined autonomously by Stage-1, the result of which is known to them. Since the Master-designate/Master connects to the Slave, the Slave is informed about the Master's Id and hence it knows its home cluster. □

A.3 Proof of Lemma 3
PROOF. Clearly, if $X = k$, by Lemma 1 and Assumption-1, the above lemma is true.

Case 1) $X < k$: By Assumption-1, each of the X Master-designates collects S Slave-designates within the CLUSTER_TO period. Hence, there are $N - X - SX$ Slave-designates which are still scanning and do not belong to any cluster, since a Master-designate collects responses from at most S Slave-designates. By Lemma 1, exactly one node gets elected as Super-master. If $k - X < S$, then the Super-master makes $k - X$ of its Slaves Masters, which collect the remaining Slave-designates. If $k - X > S$, the Super-master makes all its S Slaves Masters, and the first $k - X - S$ new Slave-designates are made Masters. By Assumption-2 and Lemma 1, the Super-master will catch all the Masters well before the SUPERM_TO period. Therefore, there are now exactly k Masters in the system. We need to prove that all the remaining Slave-designates will belong to some cluster. By Assumption-1, within another CLUSTER_TO period (local to each new Master), each of the new Masters will catch S Slaves, except for one of them, which will have $N \bmod (S + 1)$ Slaves. Some of the new Slave-designates will fill up the gap in the Super-master's cluster itself. From Lemma 2, each Slave-designate belongs to exactly one cluster.

Case 2) $X > k$: Clearly, the size of each cluster is not at its maximum. By Assumption-1, there are no Slave-designates which do not belong to any cluster after the CLUSTER_TO period. All nodes which have received responses from k Proxy-slaves become Super-master-designates. By Lemma 1, exactly one Super-master gets elected. According to the algorithm, the remaining $X - k$ Proxy-slaves will eventually respond to the Super-master within the SUPERM_TO period. The nodes in the extra clusters are distributed such that all clusters have $S + 1$ nodes except for one, which has $N \bmod (S + 1)$ nodes. As in the previous case, a Slave-designate always belongs to exactly one cluster. □
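The final cluster sizes asserted in the proof of Lemma 3 can be checked with a few lines of arithmetic. This is a minimal sketch that assumes k is the minimum number of clusters, i.e. the ceiling of N / (S + 1), and uses S = 7 (the Bluetooth limit on active Slaves per Master) in the example; neither value is stated in this appendix, and `cluster_sizes` is a hypothetical helper.

```python
def cluster_sizes(n: int, s: int) -> list[int]:
    """Sizes (Master included) of the clusters for n nodes and at most s Slaves each.

    All clusters hold s + 1 nodes, except possibly one that holds
    n mod (s + 1) nodes when n is not a multiple of s + 1.
    """
    if n < 0 or s < 0:
        raise ValueError("n and s must be non-negative")
    full, rest = divmod(n, s + 1)
    sizes = [s + 1] * full
    if rest:
        sizes.append(rest)
    return sizes


if __name__ == "__main__":
    for n in (30, 80):
        sizes = cluster_sizes(n, 7)  # S = 7 is an assumption for illustration
        print(f"N={n}: k={len(sizes)} clusters, sizes={sizes}")
```

For N = 30 and S = 7 this gives four clusters, three of size 8 and one of size 6, which matches the "all clusters have S + 1 nodes except for one" statement.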
B. PSEUDOCODE FOR STAGE-2 OF THE RANDOMIZED ALGORITHM

if (node is a Master-designate or a Master)
    while (forever)
        broadcast Inquiry packets until CLUSTER_TO or SUPERM_TO
        if (Master-designate receives Inquiry response from Slave-designate) then
            page and connect to the node as Master-Slave pair
            if (this is the first Slave) then
                make it a Proxy-slave
            if (number of Slaves = S) then
                cluster is complete, no more Slaves are collected
        else if (response received from Proxy-slave) then
            if (CLUSTER_TO occurs and (number-of-masters known = k or SUPERM_TO occurs) and node became Master-designate in Stage-1) then
                node becomes Super-master-designate
                informs Proxy-slave
        if (node has not become Master-designate by Stage-1 and CLUSTER_TO occurs) then
            break
        if (CLUSTER_TO occurs and no responses received) then
            node becomes Slave-designate
    endwhile
endif

if (node is a Slave-designate) then
    do continuous inquiry scan
    while (node is a Slave-designate)
        if (Inquiry packet is received) then
            respond and go to PAGE_SCAN
            if (paged immediately) then
                become Slave
            else
                go back to INQUIRY_SCAN
    endwhile
endif

if (node is a Proxy-slave) then
    while (Super-master-elected message not received)
        alternate between INQUIRY_SCAN and CONNECTED states
        if (Inquiry packet is received) then
            go to PAGE_SCAN state after responding
            if (no page is requested) then
                go back to INQUIRY_SCAN
        if (cluster-status update is received from Master) then
            update cluster information
        if (Super-master-designate asks for cluster information) then
            send cluster information to it
    endwhile
    go to CONNECTED state
endif

        ...
        all N nodes are known
        make highest Id node as Super-master
    endwhile
    if (node has been asked to become Super-master and all N nodes are not known) then
        while (all N nodes are not known)
            make extra Masters when fresh nodes are discovered, which in turn collect the remaining nodes
            if (extra Slaves are caught) then
                redistribute them among the Masters which have fewer Slaves
        endwhile
    if (number of clusters formed > k) then
        tear down excess clusters and redistribute nodes among k clusters
endif

C. PSEUDOCODE FOR THE DETERMINISTIC ALGORITHM

All nodes have initial phase 1 and alternate between INQUIRY and INQUIRY_SCAN
while (node i does not become Master with a full cluster or is not a Slave in CONNECTED state)
    if (Inquiry response is received from a node j) then
        i.phase += j.phase
        if (i.phase = S) then
            connect to Slaves which replied directly
            get the Id of Slaves which replied to them in turn and connect to them
        if (i.phase > S) then
            connect to only S slaves, instruct the remaining to Inquire/scan again
    if (Inquiry packet is received) then
        send a response with the phase information and go to PAGE_SCAN
endwhile
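The phase update at the heart of the deterministic algorithm (Appendix C) can be sketched as a single decision function. This is only an illustration of the three branches in the pseudocode: the function name `on_inquiry_response` and the action labels are hypothetical, and radio states, paging and timeouts are left out.

```python
def on_inquiry_response(i_phase: int, j_phase: int, s: int) -> tuple[int, str]:
    """Return node i's new phase and next action after node j responds.

    Mirrors the pseudocode: i.phase += j.phase, then compare with S.
    The action strings are illustrative labels, not protocol messages.
    """
    new_phase = i_phase + j_phase
    if new_phase == s:
        # Exactly S: connect to direct responders and to the nodes behind them.
        return new_phase, "connect_all"
    if new_phase > s:
        # More than S: keep only S Slaves, send the rest back to inquire/scan.
        return new_phase, "connect_s_release_rest"
    # Fewer than S: keep alternating between INQUIRY and INQUIRY_SCAN.
    return new_phase, "keep_alternating"


if __name__ == "__main__":
    s = 7  # assumed cluster capacity, for illustration only
    for i_phase, j_phase in [(1, 1), (4, 3), (5, 4)]:
        print(i_phase, j_phase, on_inquiry_response(i_phase, j_phase, s))
```

Because every node starts at phase 1 and phases only add up, a node's phase counts the nodes it has accumulated so far, which is what lets the comparison with S decide when a cluster is full.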
D. NOTE ON STAGE 1 OF THE RANDOMIZED ALGORITHM
Let $Y_i$ be the random variable denoting the number of successes obtained in a round of the Bernoulli trials.
Let $Y = \sum_{i=1}^{T} Y_i$, where $Y_i \in \{0, 1, \ldots, N\}$. We set $E[Y] = k$.
Let $Z = \sum_{i} Z_i$, where $Z_i \in \{0, 1\}$.
$$P[Z - \ldots \ge \epsilon] \;\ldots$$
$$P\left[\frac{Y}{N} > \frac{k}{N} + T\epsilon\right] < e^{-2(T\epsilon)^2 / T}$$
or
$$P[Y > k + N\epsilon] < e^{-2\epsilon^2 / T}$$
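To get a feel for the closing bound of this note, it can be evaluated numerically. The sketch below assumes the bound reads $P[Y > k + N\epsilon] < e^{-2\epsilon^2/T}$, which is a reconstruction of the garbled formula and not a confirmed statement of the original; the function name `stage1_tail_bound` and the sample values of T and $\epsilon$ are hypothetical.

```python
import math


def stage1_tail_bound(eps: float, rounds: int) -> float:
    """Evaluate exp(-2 * eps**2 / T), the assumed upper bound on P[Y > k + N*eps]."""
    if rounds <= 0:
        raise ValueError("rounds (T) must be positive")
    return math.exp(-2.0 * eps ** 2 / rounds)


if __name__ == "__main__":
    # Illustrative values only; the paper's actual T is not given in this note.
    for rounds in (1, 4):
        for eps in (0.5, 1.0, 2.0):
            print(f"T={rounds}, eps={eps}: bound={stage1_tail_bound(eps, rounds):.4f}")
```

The bound tightens as $\epsilon$ grows and loosens as the number of rounds T grows, which is the qualitative behaviour one would expect from a Hoeffding-type inequality over T bounded terms.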