FINAL REPORT FOR CSC463 AND ELEC569. DECEMBER 2015
Implementation of a Fast Deployable Ad-Hoc Sensor Network with Internet Access
Chenchen Guo and Linlin Zhang
Abstract—Many devices nowadays have integrated WLAN transceivers that support both infrastructure mode and Ad-Hoc mode. Unfortunately, products that take advantage of Ad-Hoc mode can hardly be found in the commercial market. One possible reason is that it is difficult for a user-mode application to establish an entire mesh network, as root access is required. This report proposes a way to easily configure machines running Unix/Linux operating systems to build a fast deployable Ad-Hoc sensor network that supports redundant Internet gateways.
Keywords—deployable, Ad-Hoc, sensor, network, Internet.
C. Guo and L. Zhang are with the University of Victoria.

I. INTRODUCTION
As the cost of integrating an 802.11 wireless local area network transceiver into embedded systems has dropped drastically in recent years, many mobile devices can now join a WLAN or interconnect with each other to create a mesh network. However, although the optional "Ad-Hoc" mode has been standardized for more than a decade [1], most devices on the commercial market ship with applications that support infrastructure mode only. One possible reason is that a mesh network is hard to implement in a user-mode application, because root permission is required to access the wireless adapter and receive raw packets on that interface. At the same time, integrating the feature into the device driver seems redundant and may cause overhead. In this report, we propose a way to create a wireless mesh network with reliable Internet access on devices running POSIX-compliant operating systems, without modifying either the end-user applications or the device driver.

A. Advantages
For sensor network applications, Ad-Hoc mode has the following advantages compared with infrastructure mode:
• Faster Deployment
In infrastructure mode, all the nodes connect to an access point and exchange information only with that access point. Generally, the access points are connected to the Internet using Ethernet cables, which is efficient and reliable in most cases. However, sensor network applications often have to monitor a large area in which a cable connection is not available everywhere. In this case, in order to use the traditional infrastructure mode, we need to build the proper infrastructure first. The effort includes, but is not limited to: setting up a potentially large number of access points across the field, connecting the access points with excessively long Ethernet cables, amplifying the Ethernet signal with additional devices to compensate for the loss due to long cable length, etc. Constructing an environment as in Figure 1 can be time consuming and expensive. With the Ad-Hoc network shown in Figure 2, access points are no longer required. We only need to connect the Ethernet cable to one of the nodes, and every node in the mesh network can access the Internet. Since any node in the mesh network can take the Ethernet cable, choosing the one closest to the Internet connection lets us get rid of the Ethernet extender. This property allows the user to deploy a large sensor network with Internet access within a very short period of time.

[Figure 1 labels: N = Sensor Node, A = Access Point, Extender, Coast, River, House with Internet Access]
Fig. 1. Configuration using Infrastructure Mode

• Lower Cost
Using an Ad-Hoc network can sometimes significantly reduce the hardware cost of each node that requires Internet access. In the previous example, we save the budget for access points, long Ethernet cables, Ethernet extenders, etc. Moreover, when special equipment is required to access the Internet (e.g. satellite Internet), instead of giving every node a satellite transceiver, we can create an Ad-Hoc network and share the Internet access across the entire network. In this way, only one transceiver is required. Since WLAN adapters are much cheaper than satellite transceivers, a substantial part of the budget can be saved.
• Higher Reliability
In infrastructure mode, all the nodes communicate
only with the assigned access point. If one access point breaks down, all the nodes within its coverage are affected. In a wireless mesh network established using our approach, however, each node can detect Internet access (either wired or wireless), share its Internet connection with other nodes when available, or access the Internet through other nodes when it has no Internet access of its own.

[Figure 2 labels: N = Sensor Node, Coast, River, House with Internet Access]
Fig. 2. Configuration using Ad-Hoc Mode

II. DESIGN GOALS
Our goal is to design a system service that sits above the device driver and below the end-user applications, as shown in Figure 3, providing the routing, Internet gateway advertisement and automatic default gateway configuration features that a mesh network requires. Since sensor nodes are usually not as powerful as a desktop computer, we decided to build and test the entire solution on embedded systems with ARM processors, so that we could more easily monitor the impact on CPU time and power consumption. In our prototype, we chose the second-generation Raspberry Pi as the central control unit of our mobile nodes.

[Figure 3 layers, from higher level to lower level: Sensor Data Collection Tool, Data Transfer Client, Other User Applications; Babel Routing Protocol, Monitor; Ad-Hoc Mode Interface]
Fig. 3. System Structure

The remaining objectives are as follows.
• Acquire basic node-to-node connectivity by configuring the wireless adapter in Ad-Hoc mode
• Choose a proper routing protocol to achieve network-wide connectivity
• Find a way to forward packets from the local mesh network to the Internet
III. ROUTING PROTOCOL CHOICE
In our research, we found several routing solutions that might satisfy the requirements of our application, including:
• B.A.T.M.A.N. Advanced
• Babel, a loop-avoiding distance-vector routing protocol
• Project Byzantium
After careful inspection, we chose the Babel routing protocol. "Babel is a loop-avoiding distance-vector routing protocol that is designed to be robust and efficient both in networks using prefix-based routing and in networks using flat routing ('mesh networks'), and both in relatively stable wired networks and in highly dynamic wireless networks." [2] Babel is based on ideas from DSDV, AODV and Cisco's EIGRP, and offers several features that are necessary in a wireless Ad-Hoc network.
First, loop freedom. Since wireless transmission is not always reliable and the packet loss rate is much higher than on Ethernet, it is important to minimize the impact of a lost routing protocol packet or a mobility event. Babel is designed so that any routing loops or "black holes" that appear during reconvergence are corrected in a time at most proportional to the network's diameter.
The second feature is fast recovery. In many cases, after a mobility event, Babel can recover connectivity to other nodes very quickly, even without any packet exchange. Although the route might not be optimal right after such an event, it still converges to the optimal configuration within minutes.
Finally, unlike B.A.T.M.A.N. Advanced, Babel does not create a virtual network adapter. This provides better compatibility, as Babel is transparent to user-mode applications.

IV. FAILED ATTEMPTS
We encountered a significant difficulty at the very first step, configuring the wireless network adapters to work in Ad-Hoc mode, which turned out to be a software compatibility issue.
As the same problem is likely to occur again in real-world implementations, we record the entire diagnostic procedure in this section for future reference. The wireless adapter that we found to have the compatibility issue is the EW-7811Un, which uses the RT2800USB driver. We originally chose this adapter because, during our research, we found reports of people who had successfully created an Ad-Hoc network with this model. For example, the following is the desired configuration for the first node, in the format of the "interfaces" file located at /etc/network/.

    auto wlan0
    iface wlan0 inet static
        address 192.168.13.1
        netmask 255.255.255.255
        wireless-channel 6
        wireless-essid SensorNet
        wireless-mode Ad-Hoc
        wireless-ap 90:B1:7F:D8:69:5A

After stopping the Network Manager and modifying the /etc/network/interfaces file, we could observe with the iwconfig command that the working mode had switched to Ad-Hoc. Strangely, the cell ID was not the one we had set, and the frequency was always that of channel 1. A manually specified cell ID is required in order to avoid "cell splitting". Fortunately, although the cell ID differed from what we set, it was still the same on all the nodes, which indicates that all the nodes were in the same Ad-Hoc network. However, when we tried to ping other nodes at their manually assigned IP addresses, ping kept reporting "Destination Host Unreachable". To solve this problem, we made many attempts, including:
• Installing different operating systems: Ubuntu MATE, Raspbian and Arch Linux
• Using the nl80211 driver with wpa_supplicant to configure the interface
• Direct manual configuration using ifconfig, ip and iwconfig
• Trying different channels, encryption methods, IP addresses and subnet masks
Unfortunately, the only reply we got was either "Destination Host Unreachable" or a timeout message. As the wireless adapters were blinking synchronously while the ping process was running, we suspected that the messages actually went through but were discarded by the kernel driver. Therefore, we manually added an ARP record with the destination's IP address and MAC address and tried again. This time, the ping command completed successfully.
In conclusion, the following phenomena might indicate a compatibility issue:
• The channel setting does not take effect
• The explicit cell ID setting does not take effect
• Ping fails but succeeds after manual ARP table configuration
If the Ad-Hoc network is not working properly and any of the above phenomena is observed, we highly recommend testing with other wireless adapters (i.e. adapters that use different chipsets).
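The first two symptoms in the list above can be checked mechanically from the text that iwconfig prints. The following is a minimal sketch, not part of the original Monitor program: the sample output, the function names and the tolerance are assumptions made for illustration, and the sample text is made up rather than captured from the adapters used in this report.

```python
import re

# Illustrative iwconfig-style output (hypothetical, written for this sketch).
SAMPLE = '''wlan0     IEEE 802.11bgn  ESSID:"SensorNet"
          Mode:Ad-Hoc  Frequency:2.412 GHz  Cell: 02:11:87:A4:3C:5E
'''


def channel_to_ghz(channel):
    """Centre frequency of a 2.4 GHz channel (1-13): 2407 MHz + 5 MHz per channel."""
    return (2407 + 5 * channel) / 1000.0


def check_adhoc(output, want_channel, want_cell):
    """Return the list of symptoms found in iwconfig-style text."""
    symptoms = []
    mode = re.search(r"Mode:(\S+)", output)
    freq = re.search(r"Frequency:([\d.]+) GHz", output)
    cell = re.search(r"Cell: ([0-9A-Fa-f:]{17})", output)
    if not mode or mode.group(1) != "Ad-Hoc":
        symptoms.append("interface is not in Ad-Hoc mode")
    # Compare the reported frequency with the one implied by the wanted channel.
    if not freq or abs(float(freq.group(1)) - channel_to_ghz(want_channel)) > 0.001:
        symptoms.append("channel setting did not take effect")
    if not cell or cell.group(1).upper() != want_cell.upper():
        symptoms.append("explicit cell ID did not take effect")
    return symptoms


if __name__ == "__main__":
    # Channel 6 and the cell ID come from the interfaces example above.
    for symptom in check_adhoc(SAMPLE, 6, "90:B1:7F:D8:69:5A"):
        print(symptom)
```

The third symptom (ping succeeding only after a manual ARP entry) needs root access and a second node, so it is left out of this sketch.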
The problem was resolved by replacing the RT2800 adapters with adapters that use the RT5370 chipset.

V. DESIGN EVOLUTION
A. First Edition
Given that network-wide connectivity is achieved by Babel, the only thing left to take care of is forwarding data from the local Ad-Hoc network to the Internet. Therefore, we decided to write a data forwarding program. This program starts when an active Internet connection is detected. After it starts, it advertises its IP address to all the nodes that appear in its routing table (i.e. all the nodes that are reachable from it). In this version, a special sensor application (an application that gathers data from the sensor and sends it back to the server) is needed. This application keeps sending data back to the server while Internet access is available locally (e.g. through an Ethernet cable or a second WLAN adapter). If the Internet is no longer available, it starts a UDP server and listens for broadcasts from the data forwarding programs on other nodes. When a broadcast message is received, it establishes a TCP connection with that node and sends all the data to the data forwarding program on that node.

B. Second Edition
The first edition of our solution did not really give all the nodes "complete" Internet access. It was more like a VPN program running in user mode with a self-advertisement feature. To achieve better compatibility and efficiency, we would have needed to design the protocol carefully. Expecting that the actual implementation of this data forwarding program might be more complicated than anticipated, we decided to find an alternative way of forwarding data. After reading the Babel RFC document, we noticed that Babel can be configured to distribute the default gateway automatically. Therefore, in the second edition, we planned to discard the data forwarding program and achieve this feature purely using Babel and the tools provided by most Linux/Unix operating systems.

C. Final Edition
Although the second edition works well when there is only one Internet gateway in the entire mesh network, performance dropped significantly when two or more nodes had direct Internet access. By monitoring the routing table, we discovered that, if more than one node has Internet access, Babel always tries to choose the one with the lower local latency (i.e. the latency within the local Ad-Hoc network). When two nodes with similar local latency both have Internet access, Babel keeps jumping between them and breaks all the current TCP connections. (Recall that a TCP connection is uniquely identified by its 4-tuple.) To solve this problem, we decided to combine the first edition and the second edition.
Packet forwarding is done with built-in utilities, as in the second edition, while self-advertisement and default route selection are handled by our own program, as planned in the first edition. We call this program the "Monitor".

VI. MONITOR ALGORITHM
The Monitor program prototype is written as a Python script for best compatibility. It has two states, the "broadcasting state" and the "listening state", as shown in Figure 4. Since the program is designed as a system service, it can monitor the condition of the global network interface (the network interface that might have direct access to the Internet) right after the system starts, without any user intervention. When a connection is detected, it enters the "broadcasting state", tries to reach all the nodes in its routing table and advertises itself through UDP packets. When no connection is available, it enters the "listening state", starts a UDP server and checks whether any nodes with Internet access are broadcasting.
[Figure 4 contents]
Broadcasting State (entered on "Plugged"):
• Start dhclient to acquire a valid IP address
• Configure iptables to act like a NAT box
• Send hello messages to reachable nodes with predicted latency information calculated using EMA
Listening State (entered on "Unplugged"):
• Listen on UDP port
• Choose the fastest gateway
• Set default gateway and nameserver accordingly
• Keep listening and check whether a faster gateway appears
• Switch gateway only when necessary
Fig. 4. Overview of the Monitor Program
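The two states and the plugged/unplugged transitions can be sketched as a small state machine driven by polled carrier values. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, the carrier file path is passed in by the caller (on Linux the sysfs layout is /sys/class/net/<interface>/carrier), and treating an unreadable carrier file as "unplugged" is an assumption.

```python
LISTENING = "listening"
BROADCASTING = "broadcasting"


def read_carrier(path):
    """Return 1 if the carrier file at `path` reports a link, else 0."""
    try:
        with open(path) as carrier_file:
            return 1 if carrier_file.read().strip() == "1" else 0
    except (IOError, OSError):
        # Assumption: a missing or unreadable carrier file counts as unplugged.
        return 0


def next_state(carrier):
    """Plugged (1) selects the broadcasting state, unplugged (0) the listening state."""
    return BROADCASTING if carrier == 1 else LISTENING


def transitions(carrier_samples):
    """Return the states entered, in order, for a sequence of polled carrier values."""
    state = None
    entered = []
    for carrier in carrier_samples:
        new_state = next_state(carrier)
        if new_state != state:  # act only when the state actually changes
            entered.append(new_state)
            state = new_state
    return entered


if __name__ == "__main__":
    # Cable absent, then plugged in, then pulled out again.
    print(transitions([0, 0, 1, 1, 0]))
```

In a real service, each entry in the returned list would correspond to stopping one child thread and starting the other; the sketch keeps only the decision logic so that it can be run without root access.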
[Figure 5 contents: thread tree]
Main (Monitoring Internet)
• Listening Thread (Configure Default Gateway)
  - Ping Test Threads (Test Potential Gateways)
  - Listening Server Thread (Parse Broadcast Message)
• Broadcasting Thread (Discover Potential Client)
  - UDP Broadcast Thread (Send UDP Packets)
Fig. 5. Thread Tree of the Monitor Program
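The two leaf threads of the tree that exchange data are the UDP Broadcast Thread (sender) and the Listening Server Thread (parser). The report states what an advertisement carries (the sender's local IP address and its latency to the Internet) and that the listener stores an initial TTL, but not the wire format. The sketch below therefore assumes a JSON payload; the field names, the initial TTL of 5 and all function names are hypothetical.

```python
import json

INITIAL_TTL = 5  # hypothetical value; the report does not state it


def encode_advert(local_ip, global_latency_ms):
    """Build the payload a gateway node would send to each potential client."""
    return json.dumps({"ip": local_ip, "latency": global_latency_ms}).encode("utf-8")


def parse_advert(payload):
    """Return (ip, latency) for a well-formed payload, or None otherwise."""
    try:
        message = json.loads(payload.decode("utf-8"))
        return str(message["ip"]), float(message["latency"])
    except (ValueError, KeyError, TypeError):
        return None


def handle_packet(gateways, payload):
    """Add or refresh an entry in the potential gateway list (a dict keyed by IP)."""
    parsed = parse_advert(payload)
    if parsed is None:
        return  # ignore malformed packets
    ip, latency = parsed
    entry = gateways.setdefault(ip, {})
    entry["global_latency"] = latency
    entry["ttl"] = INITIAL_TTL


def age_gateways(gateways):
    """Decrement every TTL once per cycle and drop entries that have expired."""
    for ip in list(gateways):
        gateways[ip]["ttl"] -= 1
        if gateways[ip]["ttl"] <= 0:
            del gateways[ip]


if __name__ == "__main__":
    table = {}
    handle_packet(table, encode_advert("192.168.13.1", 42.5))
    print(table)
```

Keeping the parsing and table update free of socket calls makes them easy to exercise without a network; in a running service they would sit behind a UDP socket bound to the listening port.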
A. Thread Structure
Unlike in the traditional server/client model, the Monitor program acts as a client application when an Internet connection is detected and becomes a server when no Internet connection is available. Therefore, the program must create threads and execute the server or client code in a child thread, because the main program has to keep monitoring the status of the global network interface. (That is why the program is called "the Monitor".) The child thread tree is shown in Figure 5.

B. Main Thread Behaviour
When the program starts, it tries to release the DHCP lease, both to test whether it has root permission and to discard any outdated default gateway. After that, it looks for the configuration file that records the global network interface (the network interface that can access the Internet) and the local network interface (the network interface used for the Ad-Hoc network). If the configuration file exists, the program verifies it by checking whether the recorded network interfaces still exist in the current system. If the configuration file is not found or its data is outdated, the program asks the user to perform the initial configuration (i.e. choose the local network interface and the global network interface). After the configuration file is loaded or created, the program enters the main loop, which keeps checking the carrier status of the global network interface by reading the file "/sys/class/net//carrier". This naive polling method might not be very efficient, but it is necessary because the "carrier" file is not a regular file: a change in it does not trigger a file system event. When the carrier status changes to 0 (unplugged), the program tries to release the DHCP lease and starts the listening thread. When the carrier status changes to 1 (plugged in), it tries to acquire a DHCP lease and starts the broadcasting thread.

C. Listening Thread Behaviour
After the Listening Thread is started, it first resets the iptables configuration to clear any influence left by the broadcasting thread. It also tries to remove any default gateway that might have been set by other programs, so that it can configure the default gateway later. Then it starts the Listening Server Thread to listen on port 32767. The packet handler of the Listening Server Thread parses each packet and adds or updates the corresponding entry in the potential gateway list with the latest global latency (latency to the Internet) and the initial TTL (Time To Live) value. The Listening Thread itself enters a loop. In every cycle of this loop, it spawns multiple threads to ping the gateways in the potential gateway list, updates the local latency (latency in the Ad-Hoc network) of each accordingly, and calculates the total latency as the sum of the global latency and the local latency. After all the ping threads have returned, it sorts the potential gateway list and chooses the fastest gateway. If the fastest gateway is much faster than the current gateway (i.e. the difference in total latency is greater than a threshold value), it deletes the current default gateway, sets the new default gateway and configures the DNS accordingly. At the end of each cycle, it decrements the TTL value of every entry in the potential gateway list and sleeps for a fixed amount of time. The TTL value lets the program expire gateways that are no longer available.

D. Broadcasting Thread Behaviour
The Broadcasting Thread is started only when an Internet connection is available locally and the node is ready to share that connection with other nodes. Therefore, once started, it configures iptables to enable the NAT feature. Then it starts the UDP Broadcasting Thread, which keeps measuring the global delay, traverses the potential client list and sends a UDP advertisement packet to each potential client. The packet contains the node's local IP address (the IP address used in the Ad-Hoc network) and its latency to a specific server on the Internet. The Broadcasting Thread itself enters a loop in which it keeps updating the potential client list by calling the route command. It also removes an entry from the potential client list if the entry is no longer in the routing table or appears to be "unreachable".

E. Latency Measurement Method
Both the global latency and the local latency are estimated with the Exponentially Weighted Moving Average (EWMA) given below, because the dynamic range of the latency value is limited and we want a smoother result that takes historical values into consideration. The dynamic range of the latency value is limited by an upper bound: when a ping test times out, the upper bound value is returned.

estimatedDelay = oldEstimatedDelay + alpha * (measuredDelay - oldEstimatedDelay)

VII. DEPLOYMENT PROCEDURE
As we developed the entire solution with compatibility in mind, it can easily be applied to many Linux/Unix machines. The only required tools are:
• GCC compiler and Babel source code (compile Babel from source if a package is not available), or
• Precompiled Babel package
• Python 2.7
• Python package: ping 0.2
• The Monitor application
The steps for applying our solution are as follows.
1) Compile and install Babel
2) Configure the Babel service by editing the configuration file at /etc/default/babeld
3) Disable the Network Manager service using the command: sudo service network-manager stop
4) Configure the /etc/network/interfaces file following the example in Section IV
5) Install "the Monitor" program at /local
6) Run "the Monitor" program once to configure the global and local network interfaces
7) Start "the Monitor" service
8) Done
It is possible to create a shell script to make the deployment process even faster.

VIII. PERFORMANCE EVALUATION
Limited by the number of Raspberry Pis available to us, the following results might not be representative. The experimental setup is similar to Figure 2.

A. Latency Testing
Four nodes were used, with the farthest node having Internet access. The latency measured from the closest node to the farthest node is around 2 to 4 ms. The download speed is around 550 KB/s on average.

B. Gateway Switch
Four nodes were used, one of which was chosen for Internet access. The time from connecting the Ethernet cable until the default gateway is successfully configured and the first ping reply is received is around 4 seconds on average.

C. Redundant Gateway Testing
The test was done using four nodes, two of which had Internet access through different networks. When one of the networks was severely congested on purpose, an automatic gateway switch was always observed within 10 seconds.
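One way to read this result is that the switch is delayed by two mechanisms described in Section VI: the EWMA smooths the rising latency of the congested gateway, and the threshold rule only switches once the smoothed difference is large enough. The sketch below replays both rules on made-up numbers. The value of alpha, the threshold, the latency samples and the function names are all assumptions; the report does not state the values used by the Monitor.

```python
def ewma(old_estimate, measured, alpha):
    """estimatedDelay = oldEstimatedDelay + alpha * (measuredDelay - oldEstimatedDelay)."""
    return old_estimate + alpha * (measured - old_estimate)


def choose_gateway(current, totals, threshold):
    """Keep the current gateway unless the best one is faster by more than `threshold`.

    `totals` maps gateway name to total latency (global plus local), in ms.
    """
    best = min(totals, key=totals.get)
    if current not in totals:
        return best  # the current gateway has expired, so take the fastest one
    if totals[current] - totals[best] > threshold:
        return best
    return current


def cycles_until_switch(samples_a, latency_b, alpha, threshold, start):
    """Gateway "A" is current; count measurement cycles until "B" is selected."""
    estimate_a = start
    for cycle, measured in enumerate(samples_a, 1):
        estimate_a = ewma(estimate_a, measured, alpha)
        totals = {"A": estimate_a, "B": latency_b}
        if choose_gateway("A", totals, threshold) == "B":
            return cycle
    return None  # no switch within the given samples


if __name__ == "__main__":
    # Gateway A starts at 20 ms and becomes congested (200 ms per sample);
    # gateway B stays at 25 ms. alpha = 0.125 and threshold = 50 ms are assumed.
    print(cycles_until_switch([200] * 10, 25, 0.125, 50, 20))
```

With these assumed numbers the switch happens on the third cycle. How many seconds that corresponds to depends on the sleep interval of the listening loop, which the report does not give. The same threshold is what keeps two gateways of similar latency from flapping, the problem seen in the second edition.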
D. Gateway Switch Impact Testing
For client software that already has a TCP connection with a server on the Internet, a gateway switch interrupts the connection for around 5 to 30 seconds, depending on the network condition and on the timeout setting of the client software.

IX. CONCLUSIONS AND FUTURE WORK
In this report, we presented an easy way of creating a wireless mesh network with support for redundant Internet access. According to our test results, it is feasible to implement a fast deployable sensor network with reasonable performance without additional hardware or excessive effort. Its flexibility and redundancy make it well suited to some specific applications. The performance of "the Monitor" program could likely be greatly improved if it were integrated with the Babel routing protocol. In future work, we will keep improving the Monitor program, find a better way of monitoring the Internet connection, add an auto-configuration feature, enhance security and optimize power consumption.

APPENDIX A
FULL CODE OF THE MONITOR PROGRAM
Space limitations prevent us from including the full code here, so we have put it online. The full code can be downloaded at: http://www.maplerecall.com/host/elec569/monitor.py
The sample configuration file can be downloaded at: http://www.maplerecall.com/host/elec569/monitor.conf
The script required for service setup can be downloaded at: http://www.maplerecall.com/host/elec569/monitor

ACKNOWLEDGMENT
Special thanks to the users on Stack Overflow, whose responses helped us eventually identify the software compatibility issue.

REFERENCES
[1] G. Anastasi, M. Conti, and E. Gregori, "IEEE 802.11 in Ad Hoc Networks: Protocols, Performance and Open Issues," 69.
[2] J. Chroboczek, "The Babel Routing Protocol," RFC 6126, DOI 10.17487/RFC6126, April 2011, http://www.rfc-editor.org/info/rfc6126.