
Connectivity Prediction in Mobile Ad Hoc Networks for Real-Time Control

Sebastian Thelen

First published in Germany in 2015 by BoD – Books on Demand, Norderstedt as doctoral thesis at the Faculty of Mechanical Engineering of the RWTH Aachen University. 2nd unchanged edition; layout modifications for online publishing only

Copyright © 2015 Sebastian Thelen (ORCID: 0000-0002-4033-8634) http://orcid.org/0000-0002-4033-8634

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License. https://creativecommons.org/licenses/by-nd/4.0/

Acknowledgements

The ideas behind this thesis emerged from the engineering and research that I performed for the German telemedicine project TemRas. This was one of the research areas that I was lucky to be involved in during my five years of work as a scientific researcher at the Institute of Information Management in Mechanical Engineering (IMA), which is part of the institute cluster IMA/ZLW & IfU of the RWTH Aachen University. Further research in this cluster encouraged me to widen my view regarding the applicability of these ideas to other domains such as autonomous vehicles and cooperative driving. At the same time, this work gave me the necessary freedom to complete my thesis during these five years.

Hence, my sincere gratitude goes to everyone who supervised me, supported me, or worked with me at the IMA/ZLW & IfU or in the research projects. First and foremost, this is of course my adviser and head of the institute cluster, Prof. Sabina Jeschke, who always encouraged me in my work, provided critical feedback, and offered helpful support. I want to thank Prof. Klaus Henning for his role as second examiner of my thesis. Furthermore, I especially want to thank Prof. Daniel Schilberg, Tobias Meisen, Marie-Thérèse Menning, Max Haberstroh, Philipp Meisen, and Jesko Elsner for the time and effort they put into critical discussions and comments regarding my research and thoughts that contributed to the thesis. Special thanks go to Margit Werden, Jürgen Heinel, and Nicolai Mathar for their uncomplicated technical support, to Tomas Sivicki for making nice figures out of my sketched drawings, and to Christian Schwier for helping to implement the simulation studies.

I am thankful to the parties that funded my work, namely the EU’s EFRE-Fonds and the Ministry of Innovation, Science and Research of the state of North Rhine-Westphalia (Germany) for the public funding of TemRas. In addition, the involved project partners (Philips HealthCare, P3 communications, 3M, the RWTH Aachen University, and the University Hospital Aachen) contributed their own financial resources and have my gratitude.

Finally, I would never have completed the thesis without the loving support, dedication, and encouragement from my fiancée Juliane. I am also greatly thankful for the love and support I received from my parents, who have always believed in me and helped me to go my way.

Aachen, July 2015
Sebastian Thelen

Abstract

The term cyber-physical systems expresses the fundamental issues that arise when embedded systems are no longer encapsulated, closed systems but form open, interconnected systems of systems and established abstractions of system design begin to fail; namely, the aspects of time and availability of resources must no longer be hidden from application layer functions. From the numerous open research challenges that remain, this thesis addresses the prediction of local communication in mobile ad hoc networks in order to contribute to more dependable communication in the system of systems that cyber-physical systems are envisioned to form.

A research gap has been identified concerning the influence that contextual factors exert on three connectivity metrics in a mobile ad hoc network with moving nodes: end-to-end communication delay, packet delivery ratio, and streaming window width, i.e., the number of successive end-to-end transmissions without a packet loss. To fill this gap, a simulation study that follows a systematic, full factorial design using discrete event simulations is carried out. The simulation study’s outcome is analyzed with statistical data analysis methods to identify the scenario parameters that have significant influence on the connectivity metrics.

Furthermore, the thesis contributes to the current state-of-the-art research on real-time communication for control tasks via mobile ad hoc networks by proposing and evaluating three classes of prediction models for each of the three mentioned connectivity metrics. The three model classes differ in their complexity and intrusiveness regarding the network architecture. The simple black-box models fully reside in the application layer of the flow’s end-point nodes and use time-series forecasting and statistical models from reliability engineering. The cross-layer models require cooperation from intermediate nodes in the network to acquire information that is sensed along a flow’s current route. Most complex are the probabilistic network graph models that incorporate predictions of uncertain node locations and information sensed from throughout the network. Second-level adaptation models use online supervised machine learning to improve the predictions of the domain and statistical models.

The proposed prediction models are evaluated in carefully designed simulation studies using discrete event simulations that follow state-of-the-art recommendations from the computer networking community to ensure the results’ validity.

Contents

1. Introduction
  1.1. Existing Research Gaps
  1.2. Objectives
  1.3. Methodology and Structure

2. Fundamental Concepts and Definitions
  2.1. Real-Time Systems and Communication
  2.2. Networked Control Systems
  2.3. Mobile Ad Hoc Networks
    2.3.1. Computer Networking Basics
    2.3.2. Wireless Networks
    2.3.3. Routing in Mobile Ad Hoc Networks
  2.4. Data Analysis, Prediction, and Machine Learning
    2.4.1. Regression, Classification, and Measures of Error
    2.4.2. Time-Series
    2.4.3. Forecasting Methods for Time-Series
    2.4.4. Statistical Machine Learning

3. Application Scenarios
  3.1. Scenario 1: Telemedicine for Disaster Intervention
  3.2. MANETs for Telemedicine
  3.3. Scenario 2: External Sensor Assistance for Autonomous Vehicles
  3.4. MANETs for Autonomous Vehicles and Vehicular Communication

4. Related and Previous Work
  4.1. Related Research in the Computer Networking Community
    4.1.1. Enhanced Routing Protocols for Mobile Ad Hoc Networks
    4.1.2. Quality of Service Mechanisms for Mobile Ad Hoc Networks
    4.1.3. Connectivity Analysis in Wireless Sensor Networks
  4.2. Related Research in the Control Systems Community
    4.2.1. Using Real-Time Guarantees From the Network
    4.2.2. Increased Robustness Towards Connectivity Issues
  4.3. End-to-End Communication Delay Prediction
    4.3.1. Aggregating Single-Hop Communication Delay Predictions
    4.3.2. Communication Delay Prediction in the Internet
    4.3.3. Forecasting of End-to-End Communication Delay Time-Series
  4.4. Context Awareness in Mobile Ad Hoc Networks
  4.5. Node Mobility and Localization
    4.5.1. Node Mobility Prediction
    4.5.2. Uncertainty in Node Localization
  4.6. Conclusion

5. Connectivity in Mobile Ad Hoc Networks with Moving Nodes
  5.1. Design of Experiment
    5.1.1. Physical Layer Parameters
    5.1.2. Scenario Parameters
  5.2. Method for Statistical Experiment Analysis
  5.3. Results
    5.3.1. Observed Communication Delay
    5.3.2. Observed Streaming Window Width
    5.3.3. Rank Correlations and Explanatory Linear Models
    5.3.4. Autocorrelation in Communication Delay Time-Series
    5.3.5. Factor Influence Models
  5.4. Discussion
    5.4.1. Simulation Performance
    5.4.2. Connectivity Metrics
    5.4.3. Influencing Factors
  5.5. Conclusion

6. Connectivity Prediction for Mobile Ad Hoc Networks
  6.1. Mathematical Notations for the Network Model
  6.2. Sensory Capabilities and Network Context Awareness
  6.3. Predicting Connectivity from Node Locations
    6.3.1. Probabilistic Network Graph
    6.3.2. Prediction of Communication Link Probability
    6.3.3. Handling Uncertainty in Predicted Node Locations
    6.3.4. Constructing the Probabilistic Network Graph
  6.4. Connectivity Prediction Models
    6.4.1. Black-Box Models
    6.4.2. Cross-Layer Models
    6.4.3. Probabilistic Network Graph Models
  6.5. Online Supervised Learning of Second-level Adaptation Models
  6.6. Conclusion

7. Evaluation
  7.1. Method for Model Assessment
  7.2. Simulation Scenarios
  7.3. Prediction Model Cross-Validation
    7.3.1. Results
    7.3.2. Discussion
  7.4. Prediction Errors in Simulation Studies
    7.4.1. Results
    7.4.2. Discussion
  7.5. Conclusion

8. Conclusion
  8.1. Summary
  8.2. Critical Discussion
  8.3. Outlook

Bibliography

Appendix

A. Extended Concepts and Definitions
  A.1. Mobile Ad Hoc Networks
    A.1.1. Computer Networking Basics
    A.1.2. IEEE 802.11 Wireless Local Area Networks
    A.1.3. Routing in Mobile Ad Hoc Networks
  A.2. Data Analysis, Prediction, and Machine Learning
    A.2.1. Regression, Classification, and Measures of Error
    A.2.2. Time-Series
    A.2.3. Forecasting Methods for Time-Series

B. Mathematical Formulations and Computations
  B.1. Log-distance Path Loss Model
  B.2. Log-normal Shadowing Model

C. Software Packages
  C.1. Use of the Statistical Computing Environment R
  C.2. Use of the Discrete Event Simulator OMNeT++

List of Figures

1.1. From embedded system to cyber-physical systems
1.2. Structural model of the thesis’s contents

2.1. Complexity cube of interconnected systems
2.2. MANET with an application level data stream

3.1. Telemedicine scenario
3.2. Sensor assistance for autonomous vehicles

5.1. Parametrized simulation scenario
5.2. Data preparation and explanatory model fitting
5.3. Data collection and factor influence model fitting
5.4. Comparison of passive network metrics
5.5. Histogram of the factorial experiment simulations’ packet delivery ratio
5.6. The observed flow’s end-to-end communication delay
5.7. Observed distributions from the factorial experiment’s communication delay
5.8. Spread of observed communication delay
5.9. Streaming window width and duration
5.10. Observed streaming window width distributions
5.11. Spread of observed streaming window widths
5.12. Communication delay explanatory models’ goodness of fit
5.13. Streaming window width explanatory models’ goodness of fit
5.14. Packet delivery ratio explanatory models’ goodness of fit
5.15. Coefficients for communication delay explanatory models
5.16. Coefficients for streaming window width explanatory models
5.17. Coefficients for packet delivery ratio explanatory models
5.18. Partial autocorrelation for the communication delay time-series
5.19. Model coefficients for factor influence models

6.1. Sensory capabilities of network protocol layers
6.2. Estimates for KLD
6.3. Comparison of the path loss coefficient’s time-series
6.4. Evaluation of Data-link layer frame reception prediction
6.5. Evaluation of Data-link layer frame reception prediction with small training set sizes
6.6. Comparison of reference and approximated uncertain distance distributions
6.7. Estimated kernel densities of communication link probabilities
6.8. Distribution of errors of estimated communication link probability

7.1. Connectivity metrics computation using Application layer packet timings
7.2. Normalized differences in model score
7.3. Comparison of communication delay forecasting models
7.4. Comparison of packet delivery ratio forecasting models
7.5. Total packet delivery ratios
7.6. Median hop counts
7.7. Median of observed communication delay forecast errors
7.8. Maximum of observed communication delay forecast errors
7.9. Median of observed packet delivery ratio forecast errors
7.10. Maximum of observed packet delivery ratio forecast errors
7.11. Observed transmissions until next stream interruption prediction errors

A.1. Communication between two applications via intermediate hosts
A.2. Carrier-Sensing Multiple Access scheme with collision avoidance

List of Tables

2.1. Naming of protocol messages for each layer in an Internet protocol/IEEE 802.11 protocol suite

5.1. Configuration parameters used to analyse the factorial experiment study
5.2. Transmit power of IEEE 802.11 WLAN devices
5.3. Path loss exponents for various environments
5.4. Physical layer model parameters
5.5. Morphological field of simulation parameters
5.6. Node speed and simulation area scenario parameters
5.7. Definitions of node densities depending on average node speed
5.8. Traffic type and other traffic scenario parameters
5.9. Summary statistics for each simulation run in the factorial experiment study
5.10. Exact values of the communication delay order statistics distributions of figure 5.7
5.11. Exact values of the streaming window width order statistics distributions of figure 5.10
5.12. Spearman rank correlation coefficients of the factorial experiment’s configuration parameters to the connectivity metrics’ median

6.1. Overview of the black-box connectivity prediction models
6.2. Overview of the cross-layer connectivity prediction models
6.3. Overview of the probabilistic network graph connectivity prediction models

7.1. Usage of the adaptation models for cross-validation
7.2. Selection of models for further evaluation after cross-validation
7.3. Estimation of the forecast horizon’s influence on the prediction errors

C.1. Utilized R packages and their versions

List of Acronyms

AODV   Ad hoc On demand Distance Vector
ARIMA  Autoregressive Integrated Moving Average
CAN    Controller Area Network
CPS    Cyber-Physical System
CSMA   Carrier-Sensing Multiple Access
DCF    Distributed Coordination Function
DSR    Dynamic Source Routing
ECG    electrocardiogram
EDCA   Enhanced Distributed Channel Access
EIRP   Equivalent Isotropic Radiated Power
GNSS   Global Navigation Satellite System
GPS    Global Positioning System
HTTP   Hyper Text Transfer Protocol
ICMP   Internet Control Message Protocol
IoT    Internet of Things
IP     Internet Protocol
IQR    interquartile range
ISM    Industrial, Scientific, and Medical
ISO    International Organization for Standardization
LAN    Local Area Network
MAC    Medium Access Control
MANET  Mobile Ad hoc Network
MSSE   Mean Squared Scaled Error
MMC    Mobility Markov Chain
NCS    Networked Control System
OLSR   Optimized Link State Routing
OSI    Open Systems Interconnection
POI    Point Of Interest
QoS    Quality of Service
RSSI   Received Signal Strength Indicator
SNR    Signal to Noise Ratio
TCP    Transmission Control Protocol
UDP    User Datagram Protocol
VANET  Vehicular Ad hoc Network
WAVE   Wireless Access in Vehicular Environments
WLAN   Wireless Local Area Network
WSN    Wireless Sensor Network

1. Introduction

Ubiquitous computing has been a vision of computer scientists since the late 1980s: originating from the Xerox Palo Alto Research Center and pioneered by Mark Weiser, ubiquitous computing envisions “a physical world richly and invisibly interwoven with sensors, actuators, displays, and computational elements, embedded seamlessly in the everyday objects of our lives and connected through a continuous network” [WGB99, p. 694]. The currently more prevalent term Internet of Things (IoT) refers to the technological vision in which physical things that a user interacts with are connected to services in the Internet that enrich them with contextual information and provide a pervasive service experience [Zor+10]. In their comprehensive survey, Atzori, Iera, and Morabito [AIM10] emphasise the shift from IoT’s initial focus on uniquely identifiable physical objects, the things, to a more converged vision of information that is attached to things via Internet services. The information transfer via the Internet ensures interoperability and seamless accessibility, but is unable to provide the reliable, dependable communication that critical real-time applications require.

Combining computing capabilities and physical objects has typically been the domain of embedded systems: devices whose functionality is defined by their hardware as much as by their software, running on microprocessors or in electronic circuits. The term Cyber-Physical System (CPS) marks a change of perspective in the development of such embedded systems. E. A. Lee [Lee06] argues that the currently available level of computing power and networking capabilities for embedded systems caused this change away from regarding their development mostly as an optimization problem towards a focus on reliability and predictability, especially for safety-critical applications like avionics or medicine.

Following the vision of ubiquitous computing, embedded systems are now being designed for much closer interaction with their physical vicinity, while at the same time relying more than before on the interaction with other computing systems through networking [Sta+05]. Their tight, often closed-loop, coupling with the physical world inherently enforces time as a measure for correctness onto software and communication in these embedded systems [Lee06]. But unlike classical embedded systems, which are designed as closed systems and fully validated at design time, individual CPSs together form an open, dynamic system of systems, as figure 1.1 illustrates.

Figure 1.1.: Embedded systems typically were designed as single, closed systems; CPSs instead are embedded systems that connect to other systems to form an open, dynamic system of systems. (Panels: classical embedded systems; cyber-physical systems.)

The fundamental issue that arises when embedded systems are no longer encapsulated, closed systems but form open, interconnected systems of systems, is that established abstractions of system design begin to fail; namely, the aspects of time and availability of resources must no longer be hidden from application layer functions [Lee08]. Time becomes a coordinated measure and the ability of individual systems to keep their timing constraints or to offer a certain service might vary with time and the presence of other systems.

With this background, IoT and CPS are merely regarded as labels for formerly distinct, but strongly converging efforts to get closer to the old idea of ubiquitous computing. It is a view that, three decades later, underlines the visionary power at the Xerox Palo Alto Research Center. The distinction is that IoT is about devices, in the form of things, that present or gather contextual information, whereas CPS is about devices that actively manipulate the physical world. The former is all about semantic interoperability and connectivity to Internet services that store and process the information; the latter is inherently real-time and requires dependable systems.

From the numerous open research challenges that remain, this thesis addresses the prediction of local communication, an aspect that falls under the CPS label of ubiquitous computing when considering the above distinction. Lee et al. [Lee+12] name the rise of network-dependent functionality in embedded devices as one of the major issues that drive the complexity of system design for CPSs; connectivity between devices in CPSs is still a challenge and an open field of research, while at the same time being one of the core aspects of CPSs [Lee08; Lee+12].
An abundance of wireless networking technologies for communication between multiple computing devices exists today. Of these, infrastructure-based Wireless Local Area Networks (WLANs), often called WiFi networks, and cellular mobile networks, currently in their fourth generation, are ubiquitous around the world. Less common are Mobile Ad hoc Networks (MANETs), computer networks that connect devices without the necessity of dedicated infrastructure. The ability to create a communication network between dynamically changing participants without any additional infrastructure makes MANETs appear predestined for the role of providing local communication between the individual devices of the described system of systems. Yet, the technological challenges that arise from the lack of central coordination and general dynamics in the communication network have prevented widespread adoption of MANETs, a matter that Basagni et al. [Bas+13] discuss further. With the intention to contribute to some of the challenges that arise when using MANETs to connect CPSs, the thesis’s focus lies on this networking concept.

Principally, the thesis’s research objectives, which are laid out in more detail below, are independent of specific use cases. Nevertheless, two application scenarios, to which the investigated subject is of relevance, are introduced to provide a less abstract perspective on the matter and to help to derive concrete requirements for evaluation. The two application scenarios are motivated by previous work by the author and general research interest: telemedicine and cooperative driving.

Recently, Haupt et al. [Hau+14] have presented techniques to use MANETs for control applications that require real-time communication. In the field of transportation, MANETs are currently gaining importance as a method to provide future operation-critical communication between vehicles that shall enable cooperative driving; first steps to bring such vehicular communication systems to market are underway, e.g., with the European Cooperative ITS Corridor that will disseminate road work and obstacle warnings to drivers, as Ross [Ros15] reports. Other ideas are more visionary, such as Gerla et al. [Ger+14], who propose massive cooperation of autonomous vehicles in the form of Vehicular Clouds.

Besides transportation, network communication for disaster intervention is a field considered for the application of MANETs in order to be independent of possibly destroyed communication infrastructure. A possible application that such a network has to support is remote patient monitoring for real-time telemedicine. This introduces reliability and timing requirements into the communication systems, as discussed by Thelen et al. [The+15]. To highlight the importance of dependability in inter-device communication in medical systems, Lee and Sokolsky [LS10] have coined the term Medical Cyber-Physical Systems. However, Sneha and Varshney [SV13] have identified the predictability of communication to be an open research issue when using MANETs for remote patient monitoring.


1.1. Existing Research Gaps

In light of the background given above and in anticipation of the detailed discussion of previous and related work in chapter 4, the research gaps that this thesis addresses are presented here.

A very specific research gap has already been identified by others: there is a lack of understanding about the influence that node mobility exerts on end-to-end connectivity metrics for multi-hop MANETs. The two comprehensive, recent surveys on Quality of Service (QoS) for MANETs by Khoukhi et al. [Kho+13] and by Al-Anbagi, Erol-Kantarci, and Mouftah [AEM14] prominently highlight the lack thereof in current work. Furthermore, from the work on this thesis, it has been found that existing work not only neglects the effect of mobility, but only addresses the two connectivity metrics delay and packet delivery ratio, i.e., the ratio of successfully received to transmitted packets. There is generally no consideration of a metric that describes the time until the next packet is lost. Much of the existing work uses stochastic and queueing model frameworks to analyse connectivity metrics, but only a few of these studies validate their results with experimental methods. Research question 1, which is addressed in chapter 5, contributes toward closing this research gap.

Prediction of connectivity metrics for 1-hop communication between directly neighboring nodes is used in routing protocols to anticipate route breaks and establish new routes before communication is affected. What is missing, though, is any work on predicting the chance of finding a new route and thus the influence of an anticipated route break on the communication that uses the route. In delay tolerant networks, which realise information dissemination via the communication of moving and intermittently connected participants, predictions of end-to-end connectivity metrics from the participants’ mobility are used to improve the information dissemination strategy. But delay tolerant networks are a different category of networks than ones that shall support real-time communication, and the applied methods are hardly transferable. This leaves a research gap on the prediction of end-to-end connectivity in multi-hop MANETs for real-time communication that are affected by node mobility. Research questions 2 and 3, which are addressed in chapters 6 and 7, contribute methods and first prediction models to address this research gap.

Far broader open issues of research are to actually make applications, services, and control systems adaptable to predicted connectivity metrics and to achieve connectivity awareness for applications that use dynamic multi-hop MANETs to improve the dependability of such networks for real-time communication. Solving these issues will improve the versatility of MANETs for communication in highly dynamic CPSs.

The lack of work on the predictability of communication in MANETs may largely stem from the research domains’ focus: research from the computer science community primarily addresses the improvement of routing and other protocols in the network or QoS methods that reserve and guarantee network capacity until the guarantee can no longer be upheld and the participants affected have to reapply for their required capacities. The control systems community on the other hand has focused on research to improve a controlled system’s performance given predefined uncertainties that communication via a packet switched network induces. Recent work, such as Haupt et al. [Hau+14], rigorously addresses the interactions of network and control system in a joint design approach that is suitable for closed systems. So far no attempt has been published to predict the behaviour of an open MANET in order to increase its dependability for use by future dynamic and interconnected CPSs.

1.2. Objectives

The introductory discussion of CPSs, using both autonomous, cooperative driving and telemedicine as application examples, points to a prevalent dilemma: on the one hand, the need for dynamic and flexible communication between open collections of devices to benefit from their offered services and, on the other hand, the need for predictable and timely communication between a device and the services its operation depends upon. At a roundtable on the reliability of embedded systems, J. A. Stankovic stipulates that a system’s awareness of various operational aspects is key to increasing its robustness [BSS10, p. 32]:

“There are many, many examples of physical properties from the real world that cause the system to fail. To address this, we need what I’ve been calling star-aware software, where the star is the Kleene star and refers to such software as physically aware, security aware, privacy aware, and so on. Such an approach has the potential to make the system very robust. The level of robustness must increase, and hopefully we can make this into more of a scientific process, so that when we construct systems, they are robust enough against the vagaries of the real world.”

In this sense, the thesis’s objective is to contribute towards making systems network aware by predicting the future connectivity of an ongoing communication flow in a MANET. Connectivity here is used to unite three metrics: the delay that messages experience in the network on their way from origin to destination, the communication delay, and two metrics that are themselves summarized as communication interruption: the remaining time until streaming interruption, i.e., the time until the next Network layer message will be lost, and the packet delivery ratio, i.e., the ratio of successfully received to transmitted Network layer messages. When analysing the connectivity in hindsight, the streaming window duration is used instead of the time-varying remaining time until streaming interruption. Having these metrics available to a device’s applications shall then allow the applications to adapt and degrade gracefully under worsening conditions, because the change in conditions will be anticipated.

Communication delay and packet delivery ratio are common metrics for the evaluation of new protocols that are researched and proposed frequently by the computer science community. Still, they are usually not analyzed in and of themselves, nor with the intention to predict or forecast the metrics. Because of this lack of insight into the properties of connectivity in MANETs that are exploitable for prediction, these properties are investigated before potential prediction models are derived. The investigation of connectivity in MANETs is done by looking at the influence exerted by contextual factors in the form of scenario parameters, such as the number of participants in the network or their average movement speed. An important property of the contextual factors has to be that they define a setting that remains more or less stable. Thereby, a necessary baseline for the connectivity metrics is established and viable directions for realizing the metrics’ prediction models are obtained.

Explicitly formulated, the thesis addresses three research questions:

RQ 1 To what degree do contextual factors, in the form of scenario parameters, influence the connectivity metrics, communication delay and communication interruption, in a MANET?
RQ 2 By what method can the communication delay in a MANET be predicted to be suitable for real-time control applications?
RQ 3 By what method can the communication interruption in a MANET be predicted to be suitable for real-time control applications?

The primary tool to investigate the research questions is the use of discrete event simulations, as is commonplace for research in MANETs.
It is further assumed that the movement of the devices that form the MANET is predictable to a certain degree. In case of autonomous vehicles, Levinson et al. [Lev+11] suggest that future high precision movement trajectories are usually available at the planning level. For humans, Song et al. [Son+10] have found potential for 93% predictability of the locations visited, based on past locations. The relatively slow movement of a walking human is expected to further simplify the prediction of possible movement trajectories at a time horizon of a few minutes.
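
The connectivity metrics named in this section can be illustrated on a toy packet trace. The following Python sketch is not taken from the thesis: the trace format, the function names, and in particular the exact boundaries chosen for a streaming window are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical trace format: (send_time_s, receive_time_s or None if lost).
Trace = List[Tuple[float, Optional[float]]]


def packet_delivery_ratio(trace: Trace) -> float:
    """Ratio of successfully received to transmitted messages."""
    received = sum(1 for _, rx in trace if rx is not None)
    return received / len(trace)


def delays(trace: Trace) -> List[float]:
    """Communication delay of every message that arrived."""
    return [rx - tx for tx, rx in trace if rx is not None]


def streaming_windows(trace: Trace) -> List[float]:
    """Durations of uninterrupted runs of received messages.

    Assumption: a window starts at the send time of the first received
    message of a run and ends at the send time of the next lost message
    (or at the last send time if the trace ends without a loss).
    """
    windows: List[float] = []
    start: Optional[float] = None
    for tx, rx in trace:
        if rx is not None and start is None:
            start = tx
        elif rx is None and start is not None:
            windows.append(tx - start)
            start = None
    if start is not None:
        windows.append(trace[-1][0] - start)
    return windows


def remaining_time_until_interruption(trace: Trace, now: float) -> Optional[float]:
    """Time from `now` until the next message that gets lost is sent."""
    for tx, rx in trace:
        if tx >= now and rx is None:
            return tx - now
    return None  # no loss within the rest of the trace


# Five messages sent every 100 ms; the third one is lost.
trace: Trace = [(0.0, 0.02), (0.1, 0.13), (0.2, None), (0.3, 0.31), (0.4, 0.45)]
```

In hindsight, all four quantities can be computed from a recorded trace as above; the prediction problem addressed by the research questions is to estimate them before the corresponding messages have been sent.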

[Figure: block diagram that aligns the thesis’s contents with its chapters. Chapter 2: Fundamental Concepts and Definitions; Chapter 3: Application Scenarios; Chapter 4: Related and Previous Work; Chapter 5: Experimental Analysis of Connectivity in Mobile Ad Hoc Networks with Moving Nodes (RQ 1); Chapter 6: Candidate Prediction Models; Chapter 7: evaluation of the prediction models (RQ 2, RQ 3); Chapter 8: Conclusion and Outlook.]

Figure 1.2.: Structural model of the thesis’s contents with chapter alignments.

1.3. Methodology and Structure

The main methods that underlie the thesis’s contribution belong to the domains of statistical data analysis, predictive statistical modelling enriched with machine learning, and simulation of computer networks with discrete event simulations. All data analysis in chapter 5 as well as the implementation and evaluation of the predictive models in chapters 6 and 7 are carried out using the statistical computing environment R¹. The experiments that provide the data underlying the statistical analysis in chapter 5, as well as the experiments that are used for model synthesis in chapter 6 and evaluations in chapter 7, are realized with the discrete event simulator OMNeT++². Methodologically, the systematic, experimental analysis of contextual scenario parameters regarding their influence on end-to-end connectivity metrics, which is discussed in chapter 5, results in knowledge of the metrics’ variance and their dependence on contextual scenario parameters. This answers research question 1 and provides the necessary groundwork to derive the candidate predictive models in chapter 6. For each metric, candidate models from three categories are derived: pure forecasting models that treat the network as a black box, models based on metrics along the current route that become accessible via cross-layer design, and models that build a probabilistic network graph from predictions of the network participants’ future locations. Furthermore, a second level adaptation model for the predictions from the cross-layer and the probabilistic network graph models is considered that uses on-line supervised learning to continuously adapt the prediction models during their operation. To evaluate the prediction models in chapter 7, the complete set of candidate models is compared in a cross-validation step. A set consisting of the best model per connectivity metric and model category is then evaluated in more detail to answer research questions 2 and 3.
Figure 1.2 shows the thesis’s complete structure. Besides the central chapters 5, 6, and 7, which contain the thesis’s contribution to advancing the current state of the art and which have already been outlined in the presentation of the methodology above, the thesis’s structure is as follows: Chapter 2 introduces the terms, fundamental concepts, and methods that are used in the thesis. Chapter 3 introduces the two application scenarios that serve to connect the thesis’s theoretical work with practical applications. The first application scenario, cf. section 3.1, is motivated by previous work in telemedicine: the use of a MANET to perform synchronous telemedical consultations with real-time biomedical patient monitoring. The second application scenario, cf. section 3.3, is geared towards the development of intelligent transportation systems: the use of a MANET of vehicles and infrastructure to increase the sensor coverage of autonomous vehicles and enable cooperation. Both scenarios are used throughout the later chapters to justify parameter and design choices. Chapter 4 presents the relevant previous and related work, and chapter 8 concludes the thesis with a summary, a critical review, and an outline of the important open issues in the domain of dependable communication for CPSs via MANETs that future research may address.

¹ R is open source software that is available from http://www.r-project.org/. R version 3.1.1 was used for the work that is presented in the thesis; more details are given in appendix section C.1.
² OMNeT++ (OpenSim Ltd., Budapest, Hungary) is distributed under an Academic Public License that allows free use for academic, non-commercial purposes; the software is available from http://www.omnetpp.org/. OMNeT++ version 4.4.1 with the INET library version 2.4.0 was used for the work that is presented in the thesis; more details are given in appendix section C.2.

2. Fundamental Concepts and Definitions

2.1. Real-Time Systems and Communication

The term real-time emphasises the passing of time, as is immanent in the physical world. For a real-time computation or communication system, the correctness of a computation or a message transaction depends as much on the time of its completion as on the result or message content itself. Already in 1988, Stankovic raised awareness of a major misconception about real-time systems and argued [Sta88, p. 11]: “Predictability, not speed, is the foremost goal in real-time-systems design.” Twenty years later, in his discussion of design challenges for CPSs, E. A. Lee raised the same issue: most advancements in the performance of computing devices, e.g., various levels of caches, branch prediction, and pre-fetchers, have worsened the predictability of timing [Lee08]. Instead, aspects of time are well hidden by all layers of abstraction that form the basis of today’s computer programming methods, starting down at the level of machine instructions and going all the way up to higher-level programming languages and model driven development: systems are first designed and only thereafter validated against their timing requirements [Lee08].

Real-time requirements imposed on a system usually arise from the physical processes of the environment with which it is designed to interact. These requirements are defined in terms of deadlines before which computations or message transactions have to be completed. Depending on the criticality of missing a deadline, three levels of real-time requirements are distinguished [But11]:

hard real-time: Missing a deadline leads to a catastrophic system failure.
firm real-time: Missing a deadline renders the computation or message useless, but causes no further harm.
soft real-time: Missing a deadline reduces the system’s performance but does not invalidate a computation or message.
From a network perspective, hard real-time requirements are the domain of Networked Control Systems (NCSs) with a usually fixed network structure and carefully designed and validated scheduling [WY01]. For soft real-time requirements, two traffic categories are further distinguished [Pea+11]:

inelastic soft real-time: Communication with stringent delay constraints and usually fixed bandwidth requirements. System performance strongly correlates with the communication delay.
elastic soft real-time: Communication for which increased delay is merely inconvenient, like traffic from normal email transmission or web browsing; this traffic is usually greedy in bandwidth consumption [Li+11].

With WirelessHART, the industry has adopted a specialized network protocol stack for centrally managed mesh control networks on top of the IEEE 802.15.4 Physical layer [Che+14]. Nevertheless, models for control systems over wireless networks are stochastic; they do not allow deterministic guarantees for upper bounds of network-induced delays to be derived [JJ10]. Hence, hard real-time applications depend on a carefully engineered, static setup. With this background, it is clear that providing a predictable communication delay is a key issue of networking in real-time systems [WY01]. Given the definitions above, the thesis contributes to the prediction of connectivity metrics for inelastic soft real-time communication.

2.2. Networked Control Systems

A rigorous take on CPSs comes from the control systems research community, to which Lunze and Grüne [LG14] give a state of the art introduction. Understanding this research domain’s methods to incorporate communication properties into system design is important when proposing new concepts that aim at improving the communication’s dependability for such control systems. Research in systems and control theory has developed a wide range of methods to describe and analyze dynamic systems: from simple linear stateless through stateful to non-linear systems, operating with continuous or discrete time. But instead of being a single system of a plant and one or more independent controllers, CPSs usually have the form of multiple, interacting, dynamic systems. The complexity cube of interconnected systems, cf. figure 2.1, classifies such systems based on the three types of complexity that are relevant for system analysis and controller design: individual system complexity, link complexity, and topological complexity [Wie10]. Topological complexity in this sense refers to the topology of information exchange at the level of control between individual systems, independent of any communication technology or network. Properties of communication and networking technology are an issue of link complexity, which is the area of research to which this thesis contributes.

12

2.2. Networked Control Systems

topological complexity

system complexity x(t) ˙ = f (x(t), u(t)) x(t) ˙ = A · x(t) + B · u(t) x(t) ˙ = u(t) link complexity ideal link

comm. packet switched delay network

Figure 2.1.: Complexity cube of interconnected systems, adapted from [Wie10].

Networked Control Systems (NCSs) are control systems that close the control loop over a packet-switched network. A very recent, comprehensive book on the matter is offered by Lunze [Lun14]. Other valuable material is provided by Bemporad, Heemels, and Johansson [BHJ10] and Wang and Liu [WL08]. According to the complexity cube of figure 2.1, NCSs exhibit high link complexity. Nowadays, NCSs are commonplace and well understood, although the communication delay caused by the network must be bounded and taken into account at design time. Hence, NCSs typically use carefully designed real-time communication systems with special scheduling mechanisms that guarantee timely delivery of messages, often in the form of field bus technology such as a Controller Area Network (CAN) bus. As explained by Lunze and Grüne [LG14], control system design for CPSs with high complexity in all three dimensions of the complexity cube is approached as a consensus problem in multi-agent systems. Grüne et al. [Grü+14] suggest the use of model predictive control as a powerful method to control such systems, because it can naturally handle discretized controller inputs with a priori unknown, but measurable delay. Yet, this method depends on having good estimates of the future delay that the controller output will experience in order to align its plant model with the controlled plant, an issue that Grüne et al. [Grü+14] call prediction consistency.

Despite the advances in designing complex NCSs, their current, very static networking paradigm strongly contrasts with the envisioned dynamic environments of CPSs, which require far more flexibility regarding device connectivity. The surveys of Wu, Kao, and Tseng [WKT11] and Xu, He, and Li [XHL14] show how the open communication paradigms that underpin the Internet and current research in IoT are much more favorable in this regard, but still lack the predictability that is required to build industrial control systems.

2.3. Mobile Ad Hoc Networks

A Mobile Ad hoc Network (MANET) is a spontaneously formed computer network of possibly mobile computing devices that is established using wireless links and supports message exchanges over a sequence of links, i.e., over multiple hops. The most common approach to enable multi-hop communication in a MANET is for each node to include the functionality of a router on the Network layer [Per01]; a less common approach is to implement the multi-hop message forwarding in the Data-link layer [SKH11]. Basagni et al. [Bas+13] discuss the important research and the current state of the art of MANETs with a critical view of their actual industry impact. Besides an introduction to MANETs, written at a time when this technology was regarded as the future of mobile computing, Perkins [Per01] presents the most important approaches to the challenge of routing messages in a MANET. For an introduction to MANETs, Walke, Mangold, and Berlemann [WMB06], Tanenbaum and Wetherall [TW11], and Gast [Gas05] may be consulted as well.

This section introduces the general concepts and terms of MANETs and wireless computer networks that are important for understanding the research presented in the thesis. The presented technologies strongly influence the study design in chapter 5 and the formulation of prediction models in chapter 6. Appendix section A.1 offers further details concerning MANETs that are still important to the presented work but not critical for its understanding.

Node mobility and varying environmental conditions result in a dynamic network topology: over time, established links break and new links appear as nodes move into or out of their respective communication ranges. Figure 2.2 shows how this affects application level data streams between two nodes. In case the network does not offer a route between two distinct sets of nodes, the network is called partitioned.
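
The effect of node movement on routes and partitions can be made concrete with a small Python sketch. It is an illustration only and not the thesis’s simulation setup: the idealised unit-disk link model (two nodes share a link whenever their distance is at most a fixed radio range), the node coordinates, and the function names are all assumptions.

```python
from collections import deque
from math import hypot
from typing import Dict, List, Optional, Set, Tuple

Position = Tuple[float, float]


def links(positions: Dict[str, Position], radio_range: float) -> Dict[str, Set[str]]:
    """Adjacency sets under an idealised unit-disk link model (an assumption)."""
    adj: Dict[str, Set[str]] = {n: set() for n in positions}
    for a, (ax, ay) in positions.items():
        for b, (bx, by) in positions.items():
            if a != b and hypot(ax - bx, ay - by) <= radio_range:
                adj[a].add(b)
    return adj


def find_route(adj: Dict[str, Set[str]], src: str, dst: str) -> List[str]:
    """Breadth-first search: a minimum-hop route, or [] if src and dst
    lie in different partitions of the network."""
    parent: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            route: List[str] = []
            cur: Optional[str] = node
            while cur is not None:
                route.append(cur)
                cur = parent[cur]
            return route[::-1]
        for nb in sorted(adj[node]):
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return []


# Hypothetical coordinates in metres; node C leaves while node F approaches.
before = {"A": (0, 0), "B": (80, 0), "C": (160, 0), "D": (240, 0), "F": (160, 150)}
during = {**before, "C": (160, -200)}   # C has left, F has not yet arrived
after = {**during, "F": (160, 0)}       # F has taken over C's former position

for name, pos in (("before", before), ("during", during), ("after", after)):
    print(name, find_route(links(pos, radio_range=100.0), "A", "D"))
```

With these assumed positions, the flow from A to D first runs via B and C, finds no route while the network is partitioned, and is later carried via B and F.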
The most common applications of MANETs are Wireless Sensor Networks (WSNs), mission networks for military or rescue operations, mesh networks, Vehicular Ad hoc Networks (VANETs), and delay tolerant networks [Bas+13]. WSNs are networks of low powered devices that monitor their environment over an extended period of time; most networking related research in their domain focuses on reduced energy consumption. Mission networks have long been the primary argument for general research in MANETs; besides military deployments, this domain has resulted in very few actual applications because of its very specific scope. Basagni et al. [Bas+13] see commercial success mainly in the remaining three domains:

• Mesh networks combine MANETs with other communication infrastructure to either grant the MANET nodes access to other networks, like the Internet, or to connect the other networks via the MANET [AX05].
• VANETs, networks that connect road vehicles with each other and with road side infrastructure, gain traction with the ongoing attempt to improve safety, efficiency, and control automation of automobiles. Research on safety critical applications in VANETs mainly addresses single-hop communication [JD08].
• Research on delay tolerant networks addresses MANETs that are partitioned most of the time and in which node movement is used to exchange messages between disconnected parts of the network [Bas+13].

[Diagram: six nodes A to F connected by links; an active flow runs from source node A via nodes B and C to destination node D. Arrows mark the velocities vC and vF and the future movement trajectories ΘC and ΘF of nodes C and F. Legend: Θi: future movement trajectory of node i; vi: velocity of node i.]

Figure 2.2.: MANET with an application level data stream, a flow, from node A to node D. The movement of node C, depicted by the dark grey trajectory arrow, will cause its links to nodes B and D to break, but the movement of node F into node C’s current position allows a new route to be established to support the flow from node A to node D.

2.3.1. Computer Networking Basics

In order to focus on the aspects of computer networks that are important for this thesis, the following subsection gives a brief introduction to the most important concepts and terminology of computer networks that are used throughout the thesis. Appendix section A.1.1 provides more detail and background for readers who are not familiar with computer networks. For a more in-depth study of computer networks, please consult Tanenbaum’s classical work Computer Networks [TW11]. Table 2.1 contains the common names for the messages exchanged by protocol processes at each layer in the network suite. The term datagram is generally used for unreliably transmitted messages, and packet is used nearly as generically as message. Further networking terminology that is frequently used in the thesis is:

node: A participant in a computer network, typically a computing device. A node can transmit and receive messages via the computer network.
terminal: A radio signal transmitter or receiver; in a wireless computer network, a node is a terminal. Use of this term is usually limited to the Physical layer’s domain.
channel: The physical medium in the frequency range that the associated Physical layer protocol utilises.
link: A direct connection between two nodes. More specifically, a link is a connection on the Data-link layer.
flow: A sequence of Network layer datagrams sent from a source node to a destination node for a single Transport layer connection or stream. For the purpose of this thesis, the definition is fundamentally equal to the one for IPv6 flow labels [RFC6437].
delay: A time period by which something is late. In the sense of communication in a computer network, it is the time that an arbitrarily large message, on any layer in the protocol stack, needs to travel from its sender to its receiver. Some authors use the term latency synonymously with delay; others, like Tanenbaum and Wetherall, use the term latency only for the delay of an individual bit in a continuous stream of bits [TW11].
bitrate: The rate, measured in bit/second, at which the Physical layer transmits the bit-stream from the Data-link layer via the channel. The term is used in the sense of net bitrate, to which the Physical layer adds overhead. Hence, the signal modulated on the physical medium has to transport more bits per second than the bitrate. In computer networking, the term bandwidth is often used synonymously.¹

The channel bitrate, R, is the first obvious factor that influences communication delay. Given R and a message’s size, S, when it leaves the Data-link layer, the lower bound for the message’s communication delay can be calculated:

$$\tau_{\mathrm{low}} = \frac{S}{R} \tag{2.1}$$

When neglecting processing time in the Physical layer’s peer process, τlow is the delay that frames transmitted in the Data-link layer experience. Scheduling of these frame transmissions is the Data-link layer’s responsibility and adds more delay to messages coming from upper layers [WY01]. If a link is saturated, i.e., peer processes in the Data-link layer get more data per second to transmit from the Network layer than the bitrate permits, then congestion occurs, which in the Internet protocol suite is handled by the Transmission Control Protocol (TCP) in the Transport layer [RFC5681; WDM01].

¹ Bandwidth in the sense of computer networking is not to be confused with the term’s usage in a radio communication and signal processing context, where it refers to a frequency range, i.e., the width of a frequency band that a channel uses on its physical medium [Rap02].

Table 2.1.: Naming of protocol messages for each layer in an Internet protocol/IEEE 802.11 protocol suite.

Layer (Protocol)                  Message Name
Application                       packet/message
Transport (TCP)                   segment
Transport (UDP)                   datagram/packet
Network (IP)                      datagram/packet
Data-link (IEEE 802.11 WLAN)      frame
Physical (IEEE 802.11 WLAN)       symbol

The Network layer is responsible for global addressing of nodes in the network and for routing, i.e., finding a path through the network from source node to destination node, but does not offer a reliable end-to-end transmission of the messages. While the Physical and Data-link layers ensure a reliable transmission of messages between two adjacent nodes, the Data-link layer may give up retransmission of frames after too many failed attempts; likewise, messages may get dropped if queues, i.e., memory, at a node fill up due to congestion. In the Transport layer, TCP provides a reliable end-to-end transmission of messages. The User Datagram Protocol (UDP), the other common Transport layer protocol in the Internet protocol suite, offers neither congestion control nor reliable message transmission; it only adds a checksum and port numbers to Application layer messages to validate received messages and to hand them to the correct application.

For any kind of real-time communication that requires low communication delays, TCP is generally considered unsuited, because its congestion control and retransmission mechanisms add delay as well as uncertainty regarding the delay. The effect is especially severe in MANETs, because TCP is designed to treat any packet loss as a sign of network congestion, which is an invalid assumption for MANETs and severely impacts the protocol’s performance. Instead, UDP or adaptations thereof are used for real-time communication.
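
As a numerical illustration of the lower bound in equation (2.1), the following Python sketch evaluates S/R for one frame size and a few bitrates. The frame size of 1500 bytes and the bitrates of 1, 11, and 54 Mbit/s are assumed example values, not parameters taken from the thesis, and the optional hop count rests on the additional assumption of store-and-forward relaying over links with identical bitrate.

```python
def tau_low(size_bits: float, bitrate_bps: float, hops: int = 1) -> float:
    """Lower bound S / R of equation (2.1), in seconds.

    The `hops` factor is an added assumption, not part of equation (2.1):
    store-and-forward relaying over links that all offer the same bitrate.
    """
    if size_bits < 0 or bitrate_bps <= 0 or hops < 1:
        raise ValueError("size must be >= 0, bitrate > 0, and hops >= 1")
    return hops * size_bits / bitrate_bps


if __name__ == "__main__":
    frame_bits = 1500 * 8  # assumed frame size of 1500 bytes
    for rate_mbps in (1, 11, 54):  # assumed example bitrates in Mbit/s
        ms = tau_low(frame_bits, rate_mbps * 1e6) * 1e3
        print(f"R = {rate_mbps:>2} Mbit/s: tau_low = {ms:.3f} ms")
```

The bound covers only the transmission time of the bits themselves; scheduling in the Data-link layer, queueing, and retransmissions add to it, which is why measured delays can be considerably larger.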
The clearly defined abstractions and the separation of responsibilities that the layered architecture of the Open Systems Interconnection (OSI) model and of the Internet protocol suite offers are regarded as advantages for design, implementation, and adaptation of functionality in the network suite [KK05]. But, especially in the area of wireless networking, researchers and engineers design and test protocols that violate the layer abstractions; such protocol design is called cross-layer design [SM05]. Cross-layer design has been an important tool to address challenges in the optimization of MANETs: approaches like the service differentiation proposed by Ahn et al. [Ahn+02] use cross-layer design to improve real-time behaviour. Despite initial controversy and the widespread opinion that specialised protocols were necessary, recent research has shown that the Internet protocol suite can be efficiently used on low energy devices like sensor nodes [HC08]. Under the term 6LoWPAN, an adaptation layer has been created that optimizes the Internet protocol suite for this specific use case [RFC4919; RFC6282].

2.3.2. Wireless Networks

The most widespread protocols below the Network layer that are used to create local, wireless networks of computers are defined by the IEEE 802.11 WLAN standard [TW11]. IEEE 802.11 defines two operation modes: infrastructure and ad hoc mode. The infrastructure mode is used most of the time, whereby an access point connects the network participants with each other and often to a local fixed network and the Internet. In the ad hoc mode, participants communicate directly with each other, without relying on special infrastructure. For device connectivity, especially when it comes to small, low energy devices, other technologies such as Bluetooth or IEEE 802.15.4 based protocols, like ZigBee or 6LoWPAN, are usually preferred over IEEE 802.11 due to their reduced energy consumption [AIM10]. With IEEE 802.11p, an amendment to the original WLAN specification, industry and researchers are working together to create a common connectivity standard to connect road vehicles and infrastructure to form a VANET [JD08; Bas+13]. With smartphones, currently perhaps the most pervasive computing device, IEEE 802.11 WLAN enabled devices are commonplace.² In order to keep the necessary focus in this thesis, the research is limited to IEEE 802.11 protocols, a reasonable choice given their prevalence.

The Physical Layer in Wireless Networks

Rappaport [Rap02] covers the foundations of wireless communication and radio wave propagation in depth; Walke, Mangold, and Berlemann [WMB06] cover similar topics but with a direct relation to the IEEE 802 family of networking standards. Both books are written from an electrical engineering perspective and consequently have a stronger focus on the Physical layer aspects than more computer network oriented literature.

² A 2013 survey by the Pew Research Center found that 56% of U.S. American adults own a smartphone [Smi13].

Most of the details in the Physical layer are of no concern for this thesis; this section briefly covers the important aspects. The physical transmission medium of wireless networks is unguided radio wave propagation. The original IEEE 802.11 standard and its various amendments specify different frequency ranges that the radio waves are transmitted at, e.g., 2.401 GHz to 2.483 GHz (in the 2.4 GHz Industrial, Scientific, and Medical (ISM) band) divided into 13 channels in Europe [Man+06]. Likewise, the standard defines various coding schemes, modulation techniques, and forward error correction methods to transmit the bit stream via the radio signal. After emission from the transmitting terminal, various phenomena such as signal attenuation, reflection, diffraction, and interference affect the radio signal. In the simplest, ideal case, the radio signal’s amplitude, expressed as the signal’s power, decreases with the square of the Euclidean distance, d, to the transmitting terminal. The Friis free space equation describes this phenomenon [Rap02]:

$$P_r^{\mathrm{FS}}(d) = \frac{P_t G_t G_r}{L} \left( \frac{\lambda}{4 \pi d} \right)^2 \tag{2.2}$$

where Pt is the transmitted power, Gt is the transmitter antenna gain, Gr is the receiver antenna gain, L is the system loss factor (L ≥ 1), and λ is the signal’s wavelength. The wavelength, λ, is related to the signal’s frequency, f, by:

$$\lambda = \frac{c}{f} \tag{2.3}$$

where c is the speed of light.³ At a receiving terminal, the received signal strength, i.e., the signal’s power at the receiver, and the Signal to Noise Ratio (SNR), i.e., the quotient of the signal’s power to the power of interfering radio waves (the noise), are the main factors that affect the receiver’s ability to successfully decode the bit stream from a signal.
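
Equations (2.2) and (2.3) can be evaluated directly. The following Python sketch does so for assumed example values that are not taken from the thesis: a transmit power of 100 mW, a carrier frequency of 2.437 GHz inside the ISM band mentioned above, unit antenna gains, and no system loss (L = 1). The function names are hypothetical.

```python
from math import log10, pi

C0 = 299_792_458.0  # speed of light in vacuum in m/s


def wavelength(frequency_hz: float) -> float:
    """Equation (2.3): lambda = c / f, using the vacuum speed of light."""
    return C0 / frequency_hz


def friis_received_power(pt_w: float, d_m: float, frequency_hz: float,
                         gt: float = 1.0, gr: float = 1.0,
                         loss: float = 1.0) -> float:
    """Equation (2.2): free space received power in watts at distance d."""
    if d_m <= 0 or loss < 1.0:
        raise ValueError("distance must be > 0 and system loss factor >= 1")
    lam = wavelength(frequency_hz)
    return pt_w * gt * gr / loss * (lam / (4 * pi * d_m)) ** 2


def to_dbm(power_w: float) -> float:
    """Convert a power in watts to dBm (decibels relative to 1 mW)."""
    return 10 * log10(power_w / 1e-3)


if __name__ == "__main__":
    for d in (10.0, 100.0, 200.0):  # assumed distances in metres
        pr = friis_received_power(pt_w=0.1, d_m=d, frequency_hz=2.437e9)
        print(f"d = {d:5.0f} m: Pr = {to_dbm(pr):6.1f} dBm")
```

Doubling the distance reduces the received power to one quarter, i.e., by about 6 dB. The free space model ignores reflection, diffraction, and interference, so it serves only as the ideal-case reference described above.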

³ The speed of light in vacuum, c0 = 299 792 458 m s⁻¹, is a constant and the upper bound for the speed of light inside a medium [MTN08; Ein05].

2.3.3. Routing in Mobile Ad Hoc Networks

Routing protocols for MANETs have received much attention from the computer science networking community since research on MANETs started; new protocols as well as improvements are still proposed regularly. Without a routing protocol, communication in a MANET is restricted to directly neighboring nodes via a single hop. Hence, the network’s routing protocol is a major influencing factor for the network’s operational characteristics and performance. To prevent a bias from the use of a single routing protocol in the simulations that are applied later in the thesis, all analysis and evaluation is carried out with three different routing protocols: Ad hoc On demand Distance Vector (AODV), Dynamic Source Routing (DSR), and Optimized Link State Routing (OLSR). All three protocols were designed in the earlier days of research on MANETs; they are general purpose and are still the basis for research on the next generation of MANET routing protocols.

Two of the protocols, i.e., AODV and DSR, are reactive routing protocols. Reactive here means that they only discover new routes in the network out of necessity, because a packet has to be routed. Proactive protocols, such as OLSR, on the other hand, discover available nodes in the network and routes to them up front. Because they store routing tables for the complete network, the latter protocols are often referred to as table driven. Much of the MANET research that is introduced in chapter 4 references one of these protocols and either builds upon them, uses one or more of them during evaluation, or uses them as benchmarks. An in-depth discussion of these protocols does not serve the thesis content; rather, a short discussion of their primary characteristics and routing mechanics is given in appendix section A.1.3.

2.4. Data Analysis, Prediction, and Machine Learning

Data analysis unites methods from statistical analysis and inference, machine learning, and data visualization to gain knowledge from information, the latter of which is represented by data.4 Today, applying these methods to large and diverse data sets is often called data mining [ASW12]. Tan, Steinbach, and Kumar [TSK05] thoroughly introduce the broad method portfolio of this domain, whereas Azzalini, Scarpa, and Walton [ASW12] provide a more compressed and application oriented introduction. Pearson [Pea11] offers a practical and detailed handbook on how to use statistical tools and regression analysis to explore and describe properties of data sets. Box, Jenkins, and Reinsel [BJR94] is a classical book in the field of time-series analysis and forecasting.

In chapter 5, methods of statistical data analysis, especially linear regression models, are used to interpret the data gathered from the experimental study to identify the contextual factors that have the strongest influence on the connectivity metrics. In chapter 6, various time-series forecasting, regression, and classification models are considered as parts of or complete prediction models for the connectivity metrics. For the models’ evaluation in chapter 7, prediction error measures provide the basis to assess the prediction models’ performance. The following subsections introduce the important concepts and terms related to data analysis, prediction, and machine learning. Most mathematical formulations have been omitted here and are to be found in the appendix section A.2.

4 Rowley [Row07] offers workable definitions of the terms data, information, and knowledge, including context to the discussions revolving around them.

2.4.1. Regression, Classification, and Measures of Error

One core task of data analysis is to derive prediction models that allow predicting the value of an attribute, the response variable, based on the values of one or more other attributes, the predictor variables. In case of qualitative or categorical response variables, the prediction is a task of classification that uses a classification model. For quantitative response variables, a prediction model is called a regression model, which expresses the response variable, $y_i$, as a function of the predictor variables, $x_i$, model parameters $\theta \in \mathbb{R}^m$, and an error term, $e \in \mathbb{R}$:

\[
\begin{aligned}
f &: \mathbb{R}^n \to \mathbb{R} \\
y_i &= f_\theta(x_i) + e
\end{aligned}
\tag{2.4}
\]

Given a regression model function $f$ and a data set of $N \in \mathbb{N}$ known tuples of predictor and response values, $\{(x_i, y_i) \mid x_i \in \mathbb{R}^n, n \in \mathbb{N}, y_i \in \mathbb{R}, i = 1, \ldots, N\}$, the regression model is fitted to the data set by finding the model parameters, $\theta$, that minimise the model’s error terms.

In logistic regression, the left hand side of equation (2.4) is not set equal to the response variable but rather to the odds of the occurrence of state 1 for a two state random variable $A$ that has probability $p_1 = P(A = 1)$. The odds are expressed via the logit function.5 Logistic regression models allow using regression model functions, $f_\theta(x)$, which are typically but not necessarily linear, to define classification models.

5 For the logit function, cf. equation (A.2) in appendix section A.2.

When building regression models, an important issue is to validate their accuracy on known data. To do this, three general measures of error for actual versus predicted values can be utilised: the residual sum of squares, $SSE$, is the cost function that the least squares method tries to minimize in order to find the coefficients for a regression model; the total sum of squares, $SST$, is the prediction error when estimating the response variable by its arithmetic mean value, $\hat{y}$, and is used to compute scaled error measures that are independent of the underlying data [TSK05]. Using these measures, the coefficient of determination, $R^2$, is an indicator for a model’s goodness of fit to the data, i.e., how much of the variance that is found in the data is accounted for by the regression model [ASW12]:

\[
R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{\sum_i (y_i - f(x_i))^2}{\sum_i (y_i - \hat{y})^2} \tag{2.5}
\]

A drawback of the coefficient of determination is that it increases with the number of predictor variables used in the model [TSK05]. Instead, the adjusted coefficient of determination, $\bar{R}^2$, is used to compare regression models that use differing numbers of predictor variables [TSK05]:

\[
\bar{R}^2 = 1 - \left( \frac{N - 1}{N - d} \right) (1 - R^2) \tag{2.6}
\]

where $N$ is the number of observations and $d + 1$ is the number of predictor variables, the parameters, of the regression model. In chapter 5, mainly the adjusted coefficient of determination is used to assess the regression models’ goodness of fit in dependence of the contextual factors that they utilize as independent variables to describe the connectivity metrics. The primary reason to use this error measure over others is that its scale independence allows comparing model performance between prediction models from all three connectivity metrics. Because the metrics have different scales, this would otherwise not be possible. At the same time, $\bar{R}^2$ penalizes inclusion of parameters into the models that do not improve the models’ accuracy. If a regression model perfectly fits the data it is tested with, then $\bar{R}^2$ has a value of 1. With decreasing fit, the value of $\bar{R}^2$ lessens to 0, in which case the data’s arithmetic mean fits the data equally well as the model. Mathematically it is possible that $\bar{R}^2$ has a negative value; in that case the regression model does not suit the data at all and the data’s arithmetic mean is a better fit than the model.

The standard error of regression is an error measure that may be meaningfully compared with the data’s arithmetic mean or variance in order to judge the scale of a model’s predictive error [HA13]. Because of this property, it is the primary error measure that is used to assess prediction model performance per connectivity metric in the evaluation in chapter 7.

To evaluate the accuracy of a model to predict the response variable for new data, the model’s error measures from the data that were used to fit the model must not be used. It is important to use fresh data instead. The easiest way to achieve this is to use one part of a data set to fit a model and the other part to evaluate its prediction accuracy.
Likewise, when fitting multiple models to choose the best one, which again is considered as a model fitting process, the initial data set is split in three parts: one part, the training set, to fit all the models; one part, the cross validation set, to calculate the models’ prediction errors and choose the best performing one; and one part, the test set, to evaluate the final model’s prediction accuracy [ASW12].
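The error measures of equations (2.5) and (2.6) and the three-way split described above can be sketched in a few lines of Python. This is a minimal illustration and not the evaluation code of the thesis: the function names, the 60/20/20 split proportions, and the fixed random seed are assumptions made for the example.

```python
import random


def r_squared(actual, predicted):
    """Coefficient of determination, equation (2.5): 1 - SSE / SST."""
    mean = sum(actual) / len(actual)
    sse = sum((y - f) ** 2 for y, f in zip(actual, predicted))
    sst = sum((y - mean) ** 2 for y in actual)
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n_obs, d):
    """Adjusted coefficient of determination, equation (2.6).

    Uses the text's convention that the model has d + 1 parameters.
    """
    return 1.0 - (n_obs - 1) / (n_obs - d) * (1.0 - r2)


def three_way_split(data, train=0.6, validation=0.2, seed=0):
    """Shuffle and split into training, cross validation, and test sets.

    The default proportions are an assumption of this sketch.
    """
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```

A model would be fitted on the first part only, the candidate models compared via their error on the second part, and the selected model's accuracy reported from the third part. Note that predicting the arithmetic mean for every observation yields an $R^2$ of 0, which matches the interpretation given in the text.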

2.4.2. Time-Series

A time-series is a data set that represents observations that are sequential in time [cf. BJR94]. If the observations are taken at fixed, equidistant time intervals, the time-series is regular. A time-series is stationary, if the probability distribution of observed values does not depend on the time at which the series is observed [BJR94; HA13]. The prediction of a time-series’s future values using the time-series’s past data is called forecasting. Data in a time-series may exhibit seasonal correlations, i.e., patterns repeat after a fixed interval, like an hour, a week, a month, or a year. Another property that may be found in time-series data is a dependency between consecutive or temporally close observations, called autocorrelation.

Box, Jenkins, and Reinsel [BJR94] is a classical and influential work on the methods of time-series analysis and forecasting. Hyndman and Athanasopoulos [HA13] introduce forecasting methods and their application in the data analysis environment and programming language R.
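As a rough illustration of the autocorrelation property mentioned above, the following sketch computes the usual sample autocorrelation of a regular time-series at a given lag. The function name and the plain-list representation of the series are assumptions made for this example; the thesis itself refers to R for such analyses.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a regular time-series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    # Sum of squared deviations from the mean (lag 0 autocovariance times n).
    denom = sum((v - mean) * (v - mean) for v in series)
    # Sum of products of deviations that are `lag` steps apart.
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom
```

A value close to 1 at small lags indicates that temporally close observations are strongly dependent, whereas a series that alternates between two values has a negative autocorrelation at lag 1.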

2.4.3. Forecasting Methods for Time-Series

Consecutive observations of the communication delay form a time-series. Hence, forecasting methods are considered for predicting future communication delay. The Naïve forecast is the simplest forecasting method [HA13]: the last observed value is simply carried over as the forecast for all future values up to the forecasting horizon. As such, the Naïve forecast has no configuration parameters and does not need to be trained with actual observations before being usable.

More complex are the two most widely-used forecasting methods, exponential smoothing and Autoregressive Integrated Moving Average (ARIMA) models [HA13]. The various forms of exponential smoothing primarily address time-series with trend and seasonal behaviour, of which at least the latter is not expected to occur in the connectivity metrics. ARIMA models on the other hand are intended to handle nonstationary but homogeneous time-series with a mixed autoregressive and moving average model [BJR94]. Homogeneous in this case means that after differencing the time-series a finite number of times, it is stationary.

In their state space representation, exponential smoothing models with additive error are expressed as [HA13]:

\begin{align}
y_t &= l_{t-1} + \varepsilon_t \tag{2.7} \\
l_t &= l_{t-1} + \alpha \varepsilon_t \tag{2.8}
\end{align}

where $l_t \in \mathbb{R}$, $t \in \mathbb{N}$ is the time-series of actual, unobservable states; $y_t \in \mathbb{R}$ is the observation of the time-series at time-step $t$; $\varepsilon_t$ is a white noise time-series of normally and independently distributed errors, i.e., with zero mean and constant variance; $\alpha \in [0, 1]$ is the smoothing factor, the model parameter that controls the influence that old observations exert on the current state. To use the model in forecasting, the initial value $l_0$ and the model parameter $\alpha$ are fitted to the time-series’s previous observations by minimising the residual sum of squares. Given fitted values for the model parameters, the model’s innovation equation (2.8) is used to predict point forecasts for the expected values of the time-series’s future states. In this form, exponential smoothing models are very similar to simple Kalman filters and dynamic linear models [cf. PPC09].
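The Naïve forecast and the exponential smoothing recursion of equations (2.7) and (2.8) can be sketched as follows. The function names are hypothetical, and the fitting procedure is deliberately simplified: it searches the smoothing factor on a coarse grid and fixes the initial level to the first observation, whereas the text describes fitting both values by minimising the residual sum of squares.

```python
def naive_forecast(observations, horizon):
    """Carry the last observed value over to every step up to the horizon."""
    return [observations[-1]] * horizon


def ses_filter(observations, alpha, level0):
    """Run equations (2.7) and (2.8); return final level and residual SSE."""
    level = level0
    sse = 0.0
    for y in observations:
        error = y - level              # epsilon_t = y_t - l_{t-1}, from (2.7)
        sse += error * error
        level = level + alpha * error  # innovation equation (2.8)
    return level, sse


def fit_ses(observations, steps=100):
    """Pick alpha on a grid over [0, 1] by minimising the residual SSE.

    Assumption of this sketch: l_0 is fixed to the first observation
    instead of being fitted jointly with alpha.
    """
    level0 = observations[0]
    best_alpha, best_sse = 0.0, float("inf")
    for i in range(steps + 1):
        alpha = i / steps
        _, sse = ses_filter(observations, alpha, level0)
        if sse < best_sse:
            best_alpha, best_sse = alpha, sse
    return best_alpha, level0


def ses_forecast(observations, alpha, level0, horizon):
    """Point forecasts: the last level is the forecast for all future steps."""
    level, _ = ses_filter(observations, alpha, level0)
    return [level] * horizon
```

With a smoothing factor of 1, the level always equals the latest observation, so the exponential smoothing forecast coincides with the Naïve forecast.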

2.4.4. Statistical Machine Learning

Statistical machine learning addresses the matter of gaining knowledge in the form of mathematical models about given data that is then used for decision making. Russell, Norvig, and Davis [RND10] discuss the important methods of machine learning from the point of view of general artificial intelligence. Bishop [Bis06] offers a comprehensive and likewise broad introduction into the general field of machine learning and pattern recognition. James et al. [Jam+13] offer an introduction that is more focused on statistical aspects of machine learning, with direct references to the application of methods in the R statistical computation environment.

Applicable methods for machine learning vary depending on the kind of machine learning task. The two principal categories that are distinguished are supervised and unsupervised learning:

• In supervised machine learning, models are learned from a-priori known combinations of expected model input and output. From a statistical view, the act of learning in supervised machine learning, which is likewise referred to as training, is the same as fitting a model to given data.
• In unsupervised machine learning, the learning task is to find a-priori unknown structure in given data.

Gradient descent is a general purpose, iterative optimization method that is applicable to supervised machine learning. It is commonly used to train artificial neural networks, but may likewise be applied to iteratively fit any kind of regression model. Given the regression model function $h_\theta(x) = y$ with the model parameter vector $\theta \in \mathbb{R}^n$, the input vector $x \in \mathbb{R}^m$, and the output value $y \in \mathbb{R}$ with $n, m \in \mathbb{N}$, let $J(\theta)$ be the cost function that has to be minimized to learn the model and let $x^{(i)}$ and $y^{(i)}$ indicate the corresponding input and expected output values at index $i \in \{1, \ldots, k\}$ of the a-priori known training data:

\[
J(\theta) = \frac{1}{2k} \sum_{i=1}^{k} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \tag{2.9}
\]

Gradient descent works by iteratively updating the model parameters $\theta$ with the cost function’s gradient for the current parameter values, adapted by the learning rate $\alpha$:

\begin{align}
\theta &\leftarrow \theta - \alpha \nabla J(\theta) \tag{2.10} \\
\Rightarrow \theta_j &\leftarrow \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) \quad \text{with } \theta = (\theta_1, \ldots, \theta_n)^T \tag{2.11}
\end{align}

For simple linear models with $x = (x_1, \ldots, x_m)^T$, $m = n$, $x_1 = 1$, and $h_\theta(x) = \theta^T \cdot x$, equation (2.11) is simplified to:

\[
\theta_j \leftarrow \theta_j - \alpha \frac{1}{k} \sum_{i=1}^{k} \left( \theta^T \cdot x^{(i)} - y^{(i)} \right) \cdot x_j^{(i)} \tag{2.12}
\]

The iterative updates are carried out until a predefined convergence criterion is reached. A very powerful form of the gradient descent algorithm that has proven effective for handling very large data sets is stochastic gradient descent, which does not compute the full gradient but rather only uses a single data point or a small subset of data points per iteration step. Instead of applying the gradient descent algorithm to an a-priori available training data set, stochastic gradient descent allows using the algorithm for on-line learning. As new observations for combinations of input and expected output values become available over time, stochastic gradient descent may be executed once for each new observation or set of new observations.
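The batch update for linear models and the stochastic single-observation update described above can be sketched as follows. The function names, the learning rate, and the fixed iteration count (used here in place of a proper convergence criterion) are assumptions of this example.

```python
def predict(theta, x):
    """Linear model h_theta(x) = theta^T x; x[0] == 1 is the intercept term."""
    return sum(t * xj for t, xj in zip(theta, x))


def batch_step(theta, xs, ys, alpha):
    """One batch update of every theta_j over all k training examples."""
    k = len(xs)
    residuals = [predict(theta, x) - y for x, y in zip(xs, ys)]
    return [
        t - alpha * sum(r * x[j] for r, x in zip(residuals, xs)) / k
        for j, t in enumerate(theta)
    ]


def sgd_step(theta, x, y, alpha):
    """Stochastic update that uses a single observation (x, y)."""
    residual = predict(theta, x) - y
    return [t - alpha * residual * xj for t, xj in zip(theta, x)]


def fit_batch(xs, ys, alpha=0.1, iterations=5000):
    """Repeat the batch update; the iteration count stands in for a
    convergence criterion (an assumption of this sketch)."""
    theta = [0.0] * len(xs[0])
    for _ in range(iterations):
        theta = batch_step(theta, xs, ys, alpha)
    return theta
```

For on-line learning, `sgd_step` would be called once for each newly arriving observation, so the model keeps adapting without storing the full training set. The learning rate has to be small enough for the data at hand; too large a value makes the updates diverge.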


3. Application Scenarios

For a long time, the military domain has been the only field of MANET application, with disaster response oftentimes only cited as a possible application domain [CG13; Per01]. In recent years, the industry has discovered MANETs as a way to connect vehicles and infrastructure to enhance the capabilities of driver assistance systems and autonomous vehicles, all with the ultimate goal to create a safe, public traffic environment free of accidents [Fae+12]. Such vehicular networks usually go by the term VANET. Besides the definition of the thesis’s application scenarios in sections 3.1 and 3.3 below, sections 3.2 and 3.4 cover the current state of relevant MANET research in these two domains. This wider background provides the research context to the two scenarios.

3.1. Scenario 1: Telemedicine for Disaster Intervention

The possibility to use MANETs to support emergency responders in cases of severe disasters that have destroyed the regular communication infrastructure is an often quoted motivation for the research on MANETs [Per01; Bas+13]. For direct medical interventions in the field, paramedics may use the MANET to get support by a physician that supervises their treatment. Past research projects have successfully established technology for such a use case, although not in a disaster setting using MANETs but using cellular communication infrastructure to support emergency medical services [Ber+12; Büs+14; The+15]: A trained emergency physician engages with the paramedics on-scene by bidirectional real-time voice communication to support them in patient treatment.1 Continuously streamed biomedical vital signs from the patient, such as electrocardiogram (ECG) or pulse oximetry, provide the physician with additional information to engage in patient diagnosis and treatment decisions. This form of telemedicine is often referred to as real-time or synchronous telemedical consultation, which places constraints on acceptable communication delay and interruption.

1 Emergency medical services practitioners might be surprised by this strong involvement of a physician in pre-clinical activities: this model has its origin in the German, physician-led emergency medical service that regularly places physicians on-scene.

The recently launched research project AUDIME transforms this telemedicine model from regular emergency medical services to medical interventions in disaster scenes

that lack the fixed cellular infrastructure for mobile data connectivity. In this scenario, which serves as the first application scenario in this thesis and is depicted in figure 3.1, emergency physicians are engaged in the casualties treatment area to care for patients that have already been delivered from the incident site. Meanwhile, paramedics locate and triage injured persons on the incident site, who are then transported to the treatment area to be stabilized before transport to a hospital.2 All helpers that are involved in the disaster response carry communication devices that form a MANET. The triaging paramedics use patient monitoring devices that may stream the recorded vital signals via the MANET to one of the physicians in the treatment area. Using bidirectional voice communication and a patient’s streamed vital signals, this physician may assist the paramedics in their triage decision making and supervise early medical interventions already on the incident site.

Figure 3.1.: Synchronous telemedical consultation for medical interventions over a MANET in a disaster scene. (Figure labels: treatment area; incident site; victim & paramedic; other intervention staff moving in the area; emergency physician.)

2 Auf der Heide [Auf06] surveyed emergency response to large scale disasters in the United States and finds that it seldom follows the structured intervention plans; a reason is that oftentimes the disaster’s survivors carry out most early intervention. Nevertheless, these are organizational matters, far out of the thesis’s scope. Hence, the scenario revolves around a more structured procedure that presents a simpler context and clearer requirements for the communication technology.

Besides the diagnostic information, the physician must have information on the communication’s availability to securely support the treatment. Being able to quickly assess the expected availability of the communication during a synchronous telemedical consultation session is a critical requirement that the involved emergency physicians repeatedly requested in previous projects [Pro10; The+13]. In these previous projects, treatment was mostly carried out under stationary conditions so that users could assume that connectivity conditions did not vary much. Concerning a strict delay requirement, Alesanco and García [AG10] reported a
maximum acceptable signal delay of 3 s for critical operations. But because of the lack of more precise information, emergency physicians have successfully worked with an expected upper delay bound of 10 s in previous studies [The+15]. For more general patient surveillance or diagnostics, a larger delay may be acceptable. This indicates that a categorical scale with three classes, short (less than 3 s), medium (3 s to 10 s), and large delay (more than 10 s), may be sufficient to describe the signal delay when it is displayed to a user.

Contrary to a cellular network, a MANET in which humans carry the communication devices has a dynamic network topology, and a network partitioning that interrupts the vital signal stream may happen unanticipated by the users. Hence, the system should predict the time until the next stream interruption.

Mobility wise, the nodes in this scenario either move with regular to fast walking speed or remain at one position for an extended time for patient triage and treatment. It is assumed that both the source and the destination node of biomedical vital signal streams are stationary. Nodes move inside the scenario’s different areas or they move back and forth along certain paths between incident site and treatment area to transport patients to the treatment area.
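The three-class delay scale proposed in this scenario can be written down as a small mapping from a measured or predicted delay to its display class. The function and constant names are hypothetical; only the 3 s and 10 s thresholds come from the text.

```python
SHORT_LIMIT_S = 3.0    # maximum acceptable delay for critical operations [AG10]
MEDIUM_LIMIT_S = 10.0  # upper delay bound used in previous studies [The+15]


def classify_delay(delay_s):
    """Map a signal delay in seconds to the three-class display scale."""
    if delay_s < 0:
        raise ValueError("delay must not be negative")
    if delay_s < SHORT_LIMIT_S:
        return "short"     # less than 3 s
    if delay_s <= MEDIUM_LIMIT_S:
        return "medium"    # 3 s to 10 s
    return "large"         # more than 10 s
```

Presenting a class instead of a raw number reflects that the physician mainly needs to know whether the delay is compatible with the current task, not its exact value.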

3.2. MANETs for Telemedicine

Kim et al. [Kim+09] present a rudimentary, experimental study that uses a MANET with stationary nodes to support a synchronous telemedicine application for remote consultation of medical specialists from a disaster scene. In this study, they successfully transmit multiple biomedical signals (among others, electrocardiogram (ECG), oxygen saturation, and respiration, which combined result in a payload of 1413 byte, transmitted once per second), video with a bit rate under 50 kbit s−1, and images with a size of 200 KiB each simultaneously over 4 hops, using the AODV routing protocol and IEEE 802.11g WLAN technology. This study is merely a proof of concept which shows that MANETs may in principle be used to support synchronous telemedicine applications.

Husni et al. [Hus+06] report on evaluation results of an early stage telemedicine system that uses a mesh network leveraging mobile IPv6 and IEEE 802.11 WLAN based MANETs with an adapted AODV routing protocol to support telemedical applications in disaster areas that lack communication infrastructure. Despite the intended telemedicine application, the evaluation, carried out using a discrete event simulator and a static node setup, only covers real-time voice communication. Their results show a negative correlation of packet delivery ratio to the number of data streams and a positive correlation of the communication delay to the number of data streams.

Varshney and Sneha [VS06] propose to use a broadcast or multicast routing procedure
in combination with topology control3 schemes to improve the reliability of biomedical vital signal transmission to monitoring stations from sensors, attached to multiple patients’ bodies, that form a MANET. Their described WSN monitoring system is intended for long term patient monitoring, e.g., for elderly people or chronically ill patients. Ren et al. [Ren+10] propose the use of the same methods in their discussion of similar body sensor networks for mobile health care. Neither work presents any reasonable evaluation of its concepts.

3 Topology control is further discussed in section 4.5.

Sneha and Varshney [SV13] assess the use of WLAN based MANETs in the form of mesh networks together with cellular networks to improve the reliability of patient monitoring at distant sites. The covered use-case is, similar to [VS06] above, long term patient monitoring with the intention to generate alarms when the necessity for a medical intervention is detected. The authors propose the use of multicast routing combined with topology control in the form of transmit power regulation to prevent data loss while minimizing the nodes’ energy consumption. For the evaluation of their proposed methods, Sneha and Varshney introduce a theoretical framework that uses stochastic node densities in the presence of node mobility and node clustering to analyze the communication’s end-to-end reliability and an M/M/1 queueing model for the transmissions of a single node to analyze the communication’s end-to-end delay. According to this framework, the communication delay increases with the network load and, at low node densities, the delay increases with the density, whereas the delay mostly remains constant in regard to density variations at higher node densities. In all cases, the communication reliability increases with node density, whereby the change from very low to full reliability happens in very narrow node density intervals. The location of this interval depends on the nodes’ transmit power levels, with higher transmit power levels reducing the node density necessary to achieve full reliability. Unfortunately, the work neglects to validate the theoretical framework in any form, leaving the presented results and conclusions questionable.

Wu, Kao, and Tseng [WKT11] survey the use of MANETs and WSNs in projects that utilize the CPS concept in various application domains. All surveyed projects from the health-care domain target medical body sensors for ambient assisted living, none of which considers node mobility or real-time communication requirements to be of any concern.

Most other research that somehow utilizes MANETs in the context of telemedicine deals with aspects of WSNs: routing, computation, and localization on energy constrained, low power devices; the probably most extensive work in this direction was carried out under the long term project BlueCode by the Harvard Sensor Lab [Mal+04; Gao+08]; another example is the BigNurse project [Bad+06].

3.3. Scenario 2: External Sensor Assistance for Autonomous Vehicles

The motivation for application scenario 2 is a German magazine article about the autonomous Mercedes-Benz car Bertha that successfully drove from Mannheim (Germany) to Pforzheim (Germany) via the same route that Carl Benz’s wife Bertha drove over a hundred years ago to prove that the automobile actually was capable of doing so [Zie+14]. The car, according to the magazine article, handled all but two situations all by itself [Grü13]:

• It would wait indefinitely for some pedestrians that stood at a crosswalk, none of whom intended to actually cross the street and who instead waved at the driver to pass by.
• It would stay behind a garbage collection truck that blocked its lane and did not overtake.

Specifically that second hindrance showcases a situation where the car by itself will hardly be able to decide when it is safe to move onto the opposite lane in order to overtake the garbage truck that blocks the street, because the car’s sensor vision is severely impaired by that same garbage truck. Figure 3.2 depicts such a situation schematically: an autonomous car (labeled A in the figure) behind another vehicle (labeled B in the figure), the latter of which blocks the lane, needs to overtake but by itself does not have sufficient vision.

Figure 3.2.: Planning consensus and sensor assistance for networked autonomous vehicles to resolve situations with blocked sensor sight. (Figure labels: A; B (standing); ?)

Communication with other vehicles offers multiple possibilities to solve the situation without a driver’s intervention: one option would be to query other vehicles for a decision
on starting the overtaking maneuver. For the purpose of this thesis, another option is more interesting: the other vehicles stream parts of their world model in real-time to the vehicle in question. Additionally, other autonomous vehicles may continuously transmit their planned trajectory to further improve the planning capabilities for the overtaking maneuver. In such a case, a VANET increases the communication range and hence the number of assisting vehicles compared to a simple single-hop network. The directly associated matter of trust in this situation is not of further concern here, but having access to world models from more than one source is clearly an advantage in this regard and strengthens the case for a real-time VANET.

The second application scenario regards the VANET as part of a NCS: consider that the vehicle uses the received world model fragments as input to its control system, to again act in the physical world that constitutes these same world model fragments. Then this setup is a dynamic system with a feedback loop. To get an understanding of the time scale at which such a distributed control system for autonomous vehicles operates, the findings of Ioannou and Chien [IC93] are used, who show that a sampling rate of 10 Hz for a car’s front distance sensor is sufficient to ensure a stable control law for automatic cruise control. Zhang, Branicky, and Phillips [ZBP01] argue that keeping the network induced delay below the sampling period simplifies the system’s stability analysis compared to larger delays and that, by using a traditional one-step predicting state observer, the delay can be compensated for easily. For the case of dynamic cruise control, that would require a delay of less than 100 ms. Mobility wise, the scenario takes place in a city with a 50 km/h speed limit.
Accomplishing the above driving maneuver would, under these circumstances, only require a few seconds: even with an average speed of 30 km/h the vehicle passes 30 m in 3.6 s. Hence, the forecasting horizon does only need to be a few seconds. It is assumed that all vehicles that participate in the VANET provide access to their planned trajectory with a planning horizon at least equal to the forecasting horizon.
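The two time scales used in this scenario follow from simple arithmetic, which the sketch below makes explicit: a 10 Hz sampling rate corresponds to a 100 ms sampling period, and covering 30 m at 30 km/h takes 3.6 s. The function names are hypothetical and serve only this illustration.

```python
def sampling_period_ms(rate_hz):
    """Sampling period in milliseconds for a sampling rate given in Hz."""
    return 1000.0 / rate_hz


def travel_time_s(distance_m, speed_kmh):
    """Time in seconds needed to cover a distance at a constant speed."""
    speed_ms = speed_kmh / 3.6  # convert km/h to m/s
    return distance_m / speed_ms


def delay_within_sampling_period(delay_ms, rate_hz):
    """True if the network induced delay stays below one sampling period."""
    return delay_ms < sampling_period_ms(rate_hz)
```

Under these figures, a forecasting horizon of a few seconds spans several dozen sampling periods of the control loop, which is why both the delay bound and the horizon matter for the scenario.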

3.4. MANETs for Autonomous Vehicles and Vehicular Communication

The state of the art in autonomous driving has seen major advances during the last years, where the DARPA4 Grand Challenges (held 2004, 2005, and 2007) mark important technical milestones that have fostered much competition and progress by researchers [Urm+08; Cam+10]. Levinson et al. [Lev+11] discuss their subsequent shift in development strategy from a competition centric mode to a more research oriented operation that has helped to solve many still outstanding problems in autonomous driving under normal traffic conditions. When studying the current research in this field, it is evident that the developed systems usually are closed in the sense that they do not rely on communication with external systems to fulfil their control tasks. Nevertheless, Campbell et al. [Cam+10] have identified that stable and safe decision making for route planning at a high level in complex situations that involve other autonomous vehicles requires a consensus among the involved planning agents. This insight was one of the key findings that the teams reported from the last DARPA Grand Challenge, the Urban Challenge.

4 The Defense Advanced Research Projects Agency (DARPA) is the United States Department of Defense’s agency that is responsible for the research and development of new technologies.

In cooperative driving research on the other hand, communication between vehicles has a central role [cf. Kat+02; LW06]. Mainly driven by the goal of accident free traffic, industry and research efforts in the area of vehicular communication systems have led to the development of the Wireless Access in Vehicular Environments (WAVE) framework as a harmonized foundation to further develop vehicular communication systems [UA09]. The Physical layer and the Data-link layer of WAVE are defined in IEEE 802.11p and are hence an adaptation of regular WLAN with dedicated frequency bands; the upper layers, specified in the IEEE 1609 standards that still have draft status, are designed as two separate protocol stacks: one for safety critical real-time communication and one for non-critical best effort communication, where the latter is based on default Internet protocols [UA09]. The IEEE 802.11p amendment to IEEE 802.11 was first approved in 2010; the IEEE 1609 standards had been released as trial-use standards starting in 2006, with their first releases as approved standards happening from 2010 to 2013.
Research in multi-hop communication in VANETs is primarily focused on routing protocols for broadcast or multicast transmissions of safety relevant information to groups of vehicles, such as from intersections, accidents, or dangerous road surface conditions [Che+11]. In their survey on routing protocols for WAVE based VANETs, Chen et al. [Che+11] discuss different approaches to use geographic information or information on the network topology to efficiently transmit messages to groups of receiving nodes. From their perspective, multi-hop unicast communication between vehicles lacks demand from applications, which explains the sparsity of research in this specific field. Furthermore, they identify the dynamic adaptation of communication strategies based on the environmental conditions as a necessity for efficient inter-vehicular communication and the research on this issue as one of the key challenges for future work.

QoS, in the form of transmission prioritization, in WAVE is based upon Enhanced Distributed Channel Access (EDCA) from IEEE 802.11e [HL08]. Hence, this QoS mechanism only influences scheduling of single-hop transmissions on the Medium Access Control (MAC) sub-layer. Engelstad and Osterbo [EO06] use queueing theory to derive a probabilistic model that predicts the average transmission delay of a frame in a Data-link layer with EDCA; using the correct parametrization, their method is likewise applicable to a Data-link layer with Distributed Coordination Function (DCF).

Baber et al. [Bab+05] present their research on autonomous cooperative driving and the demonstration test bed that implements fully autonomous driving using an experimental vehicle platform. Among the supported autonomous driving tasks are two that require two or more vehicles to cooperate: cooperative overtaking and unsignalized intersection handling. Both these cooperative tasks are realized using communication between the involved autonomous vehicles. Despite the potential complexity of these tasks, the implemented strategies are rudimentary. For the cooperative overtaking, the slower vehicle that gets overtaken slows down further after the approaching vehicle signals its overtaking intent to enable the maneuver to complete faster. The unsignalized intersection is only entered by one vehicle at a time; a vehicle that enters the intersection signals that the intersection is blocked and signals that the intersection is free after it has left the intersection area; in case of multiple waiting vehicles, the remaining vehicles arbitrate who may pass the intersection next. Both these cooperative driving strategies only require the exchange of simple arbitration messages and no continuous streaming of critical data [Bab+05; VEK00]. The vehicles use dual-channel ultra high frequency radio communication, but the communication technology is not further discussed.

Nagel, Eichler, and Eberspacher [NEE07] discuss concepts and requirements to enable cooperation of autonomous vehicles by sharing sensor data and achieving consensus. Their design is based on multi-hop group communication and nodes being aware of the network topology in a certain vicinity.
The topology awareness includes location and velocity information of other nodes and incorporates three main functions: using beacon messages, a node detects its neighbors in the network and assesses the link properties; the beacon messages include location and current velocity information that each node uses to anticipate the change in communication properties over the respective link; based on information received from other nodes, a node anticipates the network topology at future locations by assuming that it will experience a topology similar to that of nodes currently at the anticipated location. Besides these general design principles, Nagel, Eichler, and Eberspacher discuss routing protocols (from which they select a proactive protocol) and the necessity of QoS mechanisms to ensure reliable and timely delivery of critical messages.

Jakubiak and Koucheryavy [JK08] investigate the industry and research initiatives on VANETs and conclude that initiatives in both the United States and Europe push the implementation of IEEE 802.11p, whereas in Japan various differing technologies have already been implemented. For multi-hop communication, they envision a focus on broadcast transmissions and routing protocols that support delay tolerant
communication to overcome the often fragmented state of vehicular networks, which the authors anticipate.

Kumar et al. [Kum+12] discuss how the exchange of sensor data between autonomous vehicles over a single hop improves their capability to react safely to pedestrians or other vehicles that move into a vehicle's path from an otherwise obstructed location. To realize this sensor data exchange efficiently, they propose that regions of sensor data, instead of individual nodes, should contend for channel access at the Data-link layer.

Qu, Wang, and Yang [QWY10] discuss how ubiquitous wireless communication technology may create intelligent transportation spaces in which different sensors, vehicles, and infrastructure are interconnected to work together. They examine the suitability of various wireless technologies (specifically Bluetooth, ZigBee, ultra wide band, and 60 GHz millimeter wave technology) for this purpose but conclude that these are not generally suitable and instead focus on WAVE as a solution. Further, they highlight research interests in routing protocols that use geographic information.

Gräfling, Mähönen, and Riihijärvi [GMR10] present their simulation study of the performance of the WAVE multichannel operation in combination with IEEE 802.11p, based on the standards' trial-use revisions, in which they assess transmission range, throughput, and average communication delay in urban and highway scenarios. On a flat surface without obstructions that interfere with the radio-wave propagation, and depending on the WAVE channel, transmission ranges of approximately 2500 m to 750 m were achieved at data rates of 3 Mbit/s; with increasing data rates the transmission range decreases, with data rates of 27 Mbit/s still resulting in transmission ranges of approximately 1000 m to 125 m, again dependent on the WAVE channel. The control channel shows the best results for transmission range and has a single-hop delay well below 100 ms as long as fewer than 1000 messages are transmitted per second; the other WAVE channels perform considerably worse.


4. Related and Previous Work

MANETs have received great attention from the computer networks research community over the last twenty years, but in more recent years the attention has declined. Despite the mainly academic research, MANETs have not gained much interest from industry during this time and have not found their way into today's products. Conti and Giordano argue that this situation has been caused by the community's sole focus on MANETs as a general-purpose networking technology to bring communication and the Internet to mobile participants everywhere, a domain where the rapid progress and industry-driven establishment of cellular networks has diminished the need for MANETs [CG13]. From a consumer perspective, there is little to counter this argument. The efforts of groups like the German Freifunk¹ community or the Wireless Networking in the Developing World² initiative are mere exceptions that step in either where the demand is too low for the industry to have a business case, as in the case of bringing computer networking to rural areas in the developing world, or where individuals want to create a network free from corporate or government control. For a long time, the military domain has been the only field of MANET application, with disaster response often only cited as a possible application domain [CG13; Per01]. In recent years, the industry has discovered MANETs as a way to connect vehicles and infrastructure to enhance the capabilities of driver assistance systems and autonomous vehicles, all with the ultimate goal of creating a safe public traffic environment free of accidents [Fae+12].

The related work on real-time communication in MANETs is covered both from the viewpoint of the computer networking community in section 4.1 and from that of the control systems community in section 4.2. Relevant work related to research question 1, which is concerned with the influence of connectivity metrics, is discussed as part of section 4.1. Section 4.3 discusses relevant work that concerns end-to-end communication delay prediction and hence provides valuable input for deriving the prediction models that address research question 2. Section 4.4 on context awareness in MANETs introduces methods to gain information on the current state of the network as experienced by individual nodes, input that may be used by the prediction models in chapter 6. Section 4.5 discusses methods to incorporate uncertain node mobility predictions into the prediction models.

¹ Cf. https://freifunk.net/.
² Cf. http://wndw.net/.

4.1. Related Research in the Computer Networking Community

4.1.1. Enhanced Routing Protocols for Mobile Ad Hoc Networks

Naturally, the computer networking community has focused most of its MANET research on routing algorithms and their performance. Besides the packet delivery ratio (i.e., the ratio of successfully delivered to lost packets) and protocol overhead, communication delay is their main benchmarking metric. The most common routing protocols for MANETs have been introduced in section 2.3.3. More advanced routing protocols exploit the predictability of node mobility to predict the stability of single links and to proactively establish new routes in the network in anticipation of a failing link. Though predicting the communication delay or the future existence of a route is not a concern of the research on routing protocols, the existing work provides a starting point for the use of node mobility to predict changes in the network that will influence both communication delay and route availability.

Use of Geographic Node Locations

Chen et al. [Che+11] survey routing protocols for VANETs and find that most such protocols address broadcast or multicast communication and use geographic node locations or knowledge of the network topology to improve route discovery. The presented protocols focus on reducing communication overhead and finding the optimal routes. Node locations are used to infer the best next hop by selecting the neighbor whose location is closest to the destination node. The node locations are disseminated in the network either via location services that allow querying the Global Positioning System (GPS) coordinates for a node identifier, such as the Internet Protocol (IP) address, or by embedding them into message headers. Network topology is queried via special control packets and used to construct trees that reflect currently available links. Inference of the network topology from given node locations is not addressed.
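
The next-hop selection described above can be illustrated with a minimal sketch. It is not taken from any of the surveyed protocols: the function name, the planar coordinates, and the rule of returning no hop when no neighbor is closer to the destination than the current node are assumptions made for illustration only.

```python
import math


def greedy_next_hop(current, neighbors, destination):
    """Pick the neighbor whose location is closest to the destination.

    current, destination: (x, y) coordinates in metres.
    neighbors: dict mapping a node identifier to its (x, y) location.
    Returns None if no neighbor is closer to the destination than the
    current node (an assumed rule, not part of the surveyed protocols).
    """
    best_id = None
    best_dist = math.dist(current, destination)
    for node_id, location in neighbors.items():
        dist = math.dist(location, destination)
        if dist < best_dist:
            best_id, best_dist = node_id, dist
    return best_id


# Hypothetical neighbor table, e.g. filled from a location service.
neighbors = {"a": (10.0, 0.0), "b": (40.0, 5.0), "c": (-20.0, 0.0)}
next_hop = greedy_next_hop((0.0, 0.0), neighbors, (100.0, 0.0))
```

In this hypothetical table, neighbor "b" is selected because it lies nearest to the destination at (100, 0).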
Prediction of Single Link Lifetimes

Linear Velocity Extrapolation

Su, Lee, and Gerla [SLG00] assume the availability of absolute node location information (e.g., via GPS), time synchronization (e.g., via Network Time Protocol or GPS), and the free space radio propagation model to linearly extrapolate node movement trajectories under the assumption of constant velocity. From the neighboring nodes' movement trajectories and the maximum communication radius, $r$ (a fixed value derived from the propagation model), they predict the link expiration time, $D_t$:

$$D_t = \frac{-(ab + cd) + \sqrt{(a^2 + c^2)\,r^2 - (ad - bc)^2}}{a^2 + c^2} \tag{4.1}$$

with

$$a = v_i \cos\theta_i - v_j \cos\theta_j, \quad b = x_i - x_j, \quad c = v_i \sin\theta_i - v_j \sin\theta_j, \quad d = y_i - y_j$$

For the nodes $i$ and $j$ in the above equation, the current locations are given as Cartesian coordinates $\{x_k, y_k\}$, whereas the velocities are given as speed, $v_k$, and direction, $\theta_k$, with $k \in \{i, j\}$.

For evaluation, Su, Lee, and Gerla use two different mobility models in discrete event simulations with 50 mobile hosts at random start locations in a square area with 1000 m side length. The first mobility model, used to compare different routing protocols, is parametrized with the node speed, which is then identical for all nodes and constant during the simulation. Nodes randomly select their movement direction; when reaching a border of the simulation area, they bounce back and continue to move. The second mobility model is the random way point model, adapted to select new way points only at a constant distance from the current way point. The simulation results, using the first mobility model to compare four different routing protocols, indicate a packet delivery ratio above 90% at movement speeds up to 70 km/h for the two proposed routing protocols, which both use the route expiration time; the protocol not using mobility information drops below a packet delivery ratio of 40% at about 5 km/h and below 20% at about 35 km/h. To compare the regular distance vector routing protocol with the distance vector routing protocol using mobility prediction, the authors use the adapted random way point mobility model. Their results indicate that the mobility prediction leads to a significantly higher packet delivery ratio for next way point distances of 70 m and longer.

Similarly, Jiang, He, and Rao [JHR05] assume constant node velocities to predict a time period, $T_p$, during which two nodes are able to continuously maintain their communication link, assuming a fixed maximum communication radius, $D$. Instead of using exact node velocities and locations, they use three successive node distance measurements,
possibly based on exact positioning information, to estimate $T_p$:

$$\alpha = \frac{(d_1^2 t_2 - d_2^2 t_1) - d_0^2 (t_2 - t_1)}{t_1 t_2 (t_1 - t_2)} \tag{4.2}$$

$$\beta = \frac{(d_1^2 t_2^2 - d_2^2 t_1^2) - d_0^2 (t_2^2 - t_1^2)}{t_1 t_2 (t_2 - t_1)} \tag{4.3}$$

$$\gamma = d_0^2 \tag{4.4}$$

$$T_p = \frac{\sqrt{\beta^2 + 4\alpha D^2 - 4\alpha\gamma} - \beta}{2\alpha} - t_2 \tag{4.5}$$

with $d_0$, $d_1$, and $d_2$ being the measured node distances at times $t_0$, $t_1$, and $t_2$, respectively. Here, $\alpha$, $\beta$, and $\gamma$ are the coefficients of the quadratic $d(t)^2 = \alpha t^2 + \beta t + \gamma$ fitted through the three measurements with $t_0 = 0$, and equation 4.5 solves $d(t)^2 = D^2$ for the time remaining after $t_2$. Based on $T_p$, they estimate the link availability, $L(T_p)$, by assuming that the durations for which nodes do not alter their velocity are independent and exponentially distributed with mean $\lambda^{-1}$. For the first time that one or both nodes change their velocity during $T_p$, the factor $p$ expresses the probability that this change causes the nodes to move closer together, with $p = 0.5$ in unconstrained environments. By joining the effects of all later velocity changes during $T_p$ into the error term, $\varepsilon$, they get:

$$L(T_p) \approx \frac{1}{2\lambda T_p} + \varepsilon + e^{-2\lambda T_p}\left(p\lambda T_p - \frac{1}{2\lambda T_p} - \varepsilon\right) \tag{4.6}$$

To derive $\varepsilon$, Jiang, He, and Rao propose to initialize it with zero and then repeatedly measure the estimation error to learn $\varepsilon$ so that it accounts for the situational node mobility and other influences. With estimates for both $T_p$ and $L(T_p)$, they calculate the link reliability $T_r$ as

$$T_r = T_p \cdot L(T_p) \tag{4.7}$$

which again is a time value.

Menouar, Lenardi, and Filali [MLF07] use a link stability metric to improve routing in VANETs. The link stability is calculated using next-hop node locations predicted 1 s into the future based on the current vehicle velocity and the exact current vehicle location. Node locations are propagated using a hierarchical location service.

Linear extrapolation of current node mobility offers a simple method to estimate remaining link lifetimes. It has been proposed by different researchers, who all use it to reduce packet loss by discovering new routes before a link in the current route breaks. In every case, knowledge of the maximum communication range is assumed to be given, and none of the work addresses the estimation of this parameter, which is crucial for the methods. Neither is the estimation of end-to-end connectivity metrics from the 1-hop link lifetimes of concern in any of the existing work. The adaptation mechanism that Jiang, He, and Rao [JHR05] propose simply adapts to the statistical properties of the
nodes' movements without the possibility to account for actual knowledge on future node movements.

Reinforcement Learning of Single Link Lifetimes

Araújo et al. [Ara+14] use a discrete-time Markov chain to predict the future link quality (SNR in this case) between two nodes in a MANET; each state in the Markov chain corresponds to a certain SNR range, but the authors do not elaborate on the necessary discretization, and the choice of states is arbitrary. To compute the Markov chain's transition matrix, $T$, they propose to use a genetic algorithm for reinforcement learning, terminating the learning process as soon as the best candidate transition matrix exceeds a predefined prediction accuracy threshold. Consequently, once past its learning phase, this scheme does not adapt further to changing conditions. Despite claiming the method's applicability to the Network layer, the authors neither demonstrate nor discuss this transfer in their work.
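
To make the single link lifetime predictions of this section more concrete, the following minimal sketch evaluates the link expiration time of equation 4.1 for two nodes. The function name, the argument layout, and the handling of the two edge cases (no relative motion, and trajectories that never come within range) are assumptions made for this illustration and are not taken from [SLG00].

```python
import math


def link_expiration_time(pos_i, speed_i, theta_i, pos_j, speed_j, theta_j, r):
    """Evaluate equation 4.1 for nodes i and j.

    pos_*: (x, y) in metres, speed_* in m/s, theta_* in radians,
    r: maximum communication radius in metres. Returns seconds.
    """
    a = speed_i * math.cos(theta_i) - speed_j * math.cos(theta_j)
    b = pos_i[0] - pos_j[0]
    c = speed_i * math.sin(theta_i) - speed_j * math.sin(theta_j)
    d = pos_i[1] - pos_j[1]
    rel_speed_sq = a * a + c * c
    if rel_speed_sq == 0.0:
        # Assumption (not from the source): without relative motion the
        # link never expires if the nodes are in range, else it is absent.
        return math.inf if math.hypot(b, d) <= r else 0.0
    discriminant = rel_speed_sq * r * r - (a * d - b * c) ** 2
    if discriminant < 0.0:
        # Assumption: the trajectories never come within range r.
        return 0.0
    # Assumption: a negative result (nodes already out of range) maps to 0.
    return max(0.0, (-(a * b + c * d) + math.sqrt(discriminant)) / rel_speed_sq)


# Node i drives at 10 m/s along the x-axis past a stationary node j.
expiration = link_expiration_time((0.0, 0.0), 10.0, 0.0, (50.0, 0.0), 0.0, 0.0, r=250.0)
```

In this example node i passes node j after 5 s and is 250 m beyond it after 30 s, which is the value that equation 4.1 yields.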

4.1.2. Quality of Service Mechanisms for Mobile Ad Hoc Networks

From the perspective of real-time communication applications, MANETs have seen only very limited attention; the majority of work has focused solely on infrastructure-based networks. QoS schemes for MANETs, usually researched to improve multimedia streaming, are a small but active research area in the computer networking community that addresses MANETs from the real-time application perspective. These schemes primarily revolve around the end-to-end communication delay that a flow will experience when admitted to the network and around how the network can prioritize messages of flows with stringent delay constraints over non-critical flows.

Admission Control

Khoukhi et al. [Kho+13] survey admission control schemes in MANETs. The idea of admission control is to ensure that the establishment of new real-time communication flows is only permitted if the network can sustain their QoS requirements. They find that only few such admission control methods anticipate node mobility and conclude that future research should attempt to predict future network topology and resource availability.

Service differentiation is a special form of admission control described by Ahn et al. [Ahn+02] that removes the uncertainty of transmission delay caused by congestion on a static route through the MANET. The QoS mechanism estimates the communication delay of new flows to enable admission control. Message prioritization at the Data-link layer improves communication delay and reduces jitter, i.e., the communication delay's
variance. Effects of node mobility are not handled proactively but are instead ignored until congestion at a node forces already active real-time streams to be readmitted. Readmission, of course, may fail and cause the interruption of a real-time stream without prior notice to the involved applications. The local admission control for new real-time streams does not offer an application any hint of degradation. Consequently, this form of admission control is only suited for uncritical communication, e.g., video or audio streams. Sensor readings crucial to a control task cannot rely on this form of real-time stream, because its availability is unpredictable in the form described.

Message Scheduling With Prioritization

Li, Li, and Zhao [LLZ14] propose a QoS-aware cross-layer scheduling mechanism for the Application and Network layers of a service-oriented IoT architecture. The QoS problem is modeled as a Markov decision process at the Application layer that is then optimized using different cost functions in the architecture's different layers to improve the use of currently available resources and connectivity metrics by message prioritization. Neither the mobility of devices nor the prediction of changes in resource usage or connectivity metrics is addressed.

Mangold et al. [Man+03] discuss the scheduling mechanisms for message prioritization that were introduced with the IEEE 802.11e amendment. By introducing multiple traffic classes and adapting the MAC sublayer protocol, the message scheduling can better adapt to application demand and is more efficient than previous scheduling mechanisms. The introduction of hybrid coordinators allows the use of coordinated channel access for delay-bounded real-time traffic in the WLAN ad hoc mode without fixed infrastructure. Their detailed analysis shows issues with multiple hybrid coordinators that may interfere with each other and cause unexpected communication delays. Prediction of connectivity metrics is not addressed by the amendment.

4.1.3. Connectivity Analysis in Wireless Sensor Networks

WSNs are a special type of MANET that consists of low-power devices used to gather environmental data as a distributed sensor array. Typically, a large number of WSN nodes are deployed in an area and then remain stationary. Due to their constrained energy capacity, the influence of contextual factors on the connectivity in a WSN is of interest to the computer networking research community. The previous research in this field is relevant to research question 1 and to the experimental study in chapter 5, whose analysis identifies the influencing contextual factors.

Al-Anbagi, Erol-Kantarci, and Mouftah [AEM14] have surveyed the state of the art of cross-layer QoS techniques applied to WSNs that make applications aware of
end-to-end communication delay and reliability. Of the 55 discussed references, 47 address end-to-end communication delay awareness and 24 address end-to-end reliability awareness. The focus of current work lies on layer interactions and the influence that the network traffic exerts on its connectivity metrics. None of the work surveyed includes node mobility in the QoS mechanisms for WSNs, but mobility is found to be important to cover upcoming intelligent transportation system implementations.

Stochastic Models of Communication Delay

Engelstad and Osterbo [EO06] use queueing models to analyse 1-hop communication delay in IEEE 802.11 networks with stationary nodes and to compare the classical DCF with the prioritized message scheduling of EDCA that was introduced with IEEE 802.11e. They find that the sum of the mean scheduling delay, $\tau_{\text{schedule}}$, and the mean queueing delay, $\tau_{\text{queue}}$, gives the average total delay that the Data-link layer causes for messages it receives from the Network layer, which they validate using discrete event simulations. The scheduling delay includes the time that a frame in the MAC sublayer has to wait until the channel is free for a transmission attempt and the time during which the frame is transmitted over the air. The proposed models describe the influence of network load on both the mean scheduling delay and the mean queueing delay. The relation is highly non-linear: the network load's influence on the mean queueing delay results in an exponential relation with asymptotes parallel to the network load and delay axes, which the delay follows closely, with only a sharp bend where the asymptotes meet. The access categories that IEEE 802.11e introduces only result in different network load levels at which the mean queueing delay rises to infinity. The influence on the scheduling delay differs only for the two higher-priority access categories, which are each bounded by a maximum scheduling delay, while the other two increase to infinity at a certain network load threshold. Other than showing that there is, in principle, an influence of network load on 1-hop delays, the lack of node mobility prevents application of the analysis results in the context of this thesis.

Wang, Vuran, and Goddard [WVG12] propose a probabilistic framework based on Markov chains that model transmissions at the MAC sublayer to assess the end-to-end communication delay distribution in a WSN of stationary nodes. Their results, which they validate with measurements from a field test, all show s-shaped empirical cumulative distribution functions that rise from nearly 0 to nearly 1 in the end-to-end delay range of 0.05 s to 0.12 s. Node mobility is not considered.

Despaux, Song, and Lahmadi [DSL12] combine analytical methods using a frequency domain framework to join multiple Laplace-transformed queueing models into a single stochastic model that estimates the end-to-end communication delay in a WSN of stationary
nodes. In their simulation-based evaluation, they find very good accuracy for their models, which produce delays similar to those of the models from Wang, Vuran, and Goddard [WVG12]. By using two different levels of network load, they find that an increased network load causes a higher spread in the end-to-end communication delay distribution, with higher delays occurring while the lower delay bound remains constant. No node mobility is considered.

Topology Control

Santi [San05] presents an extensive review of research in the field of topology control in MANETs and WSNs. The goal of topology control is to reduce the energy consumption of nodes by changing their transmission range, usually with the constraint of maintaining a certain level of global connectivity inside a network. Besides reduced energy consumption, a lower node transmission range reduces the chance of radio interference between nodes and thus reduces congestion on the shared wireless medium. The MANET is modelled in $d \in \{1, 2, 3\}$ dimensions with a pair $M_d = (N, \Theta)$, a range assignment, $RA$, and the communication graph, $G_t$, resulting from applying $RA$ to $M_d$ at time $t$, as expressed in equations 4.8 to 4.10.

$$\Theta : N \times T \to [0, l]^d \quad \text{for some } l > 0 \tag{4.8}$$

$$RA : N \to (0, r_{\max}] \tag{4.9}$$

$$G_t = (N, E(t)) \tag{4.10}$$

$N$, with $|N| = n$, is the set of nodes in the network. $\Theta$ is the placement function, which, for any time $t \in T$, assigns a location inside the $d$-dimensional cube with side length $l$ to each node in the network. $RA$ defines the transmission range for each node, with a maximum transmission range of $r_{\max} \in \mathbb{R}$. $G_t$, the communication graph, is a directed graph with nodes $N$ and edges $E(t)$, in which a directed edge $(i, j)$, with $i, j \in N$, exists if and only if $RA(i) \geq \lVert \Theta(i, t) - \Theta(j, t) \rVert$. If edge $(i, j)$ exists, then node $j$ is called a neighbor of node $i$. The communication graph, called point graph and unit disk graph in other literature, assumes that a node's radio coverage area is a perfect circle, which is unrealistic for every environment except flat open-air settings.

In homogeneous topology control, each node in the network is assigned the same transmission range. In nonhomogeneous topology control, each node in the network is individually assigned an optimal transmission range. For nonhomogeneous topology control, the problem of assigning an optimal, individual transmission range to each network node has been shown to be NP-hard if the nodes are located in a two- or three-dimensional space [CPS99]. When implementing nonhomogeneous topology control in nonstationary networks, a protocol on the network nodes executes the topology control.
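
The edge condition of the communication graph can be sketched in a few lines. This is only an illustration of equations 4.8 to 4.10 for one fixed time t: the function name, the dictionary-based representation of the placement and the range assignment, and the example values are assumptions.

```python
import math


def communication_graph(placement, range_assignment):
    """Build the directed edge set E(t) for one point in time.

    placement: dict node -> coordinate tuple, i.e. the placement function
        evaluated at a fixed time t.
    range_assignment: dict node -> transmission range RA(node).
    A directed edge (i, j) exists iff RA(i) >= distance between i and j.
    """
    edges = set()
    for i, location_i in placement.items():
        for j, location_j in placement.items():
            if i != j and range_assignment[i] >= math.dist(location_i, location_j):
                edges.add((i, j))
    return edges


# Hypothetical nonhomogeneous range assignment for three nodes.
placement = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (10.0, 0.0)}
ranges = {"a": 5.0, "b": 4.0, "c": 20.0}
edges = communication_graph(placement, ranges)
```

With these assumed values, node "a" reaches "b" but not the other way around, which shows why the graph has to be directed as soon as the transmission ranges differ between nodes.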


Santi aptly calls such a protocol a topology control protocol and declares that it should ideally “[...] be fully distributed, asynchronous, and localized.” The mathematical communication graph model provides the origin of the probabilistic network graph that is proposed in chapter 6. In the form in which the model is presented here, it is unable to handle uncertainties, both in the geographic node locations that are expressed via the placement function and in the probabilities of successful message transmissions along links in the graph.

Stochastic Models to Describe End-to-End Connectivity Metrics

Sneha and Varshney [SV13] propose a stochastic model for a theoretical analysis of the influence that node density exerts on the network's multi-hop reliability in terms of packet loss, and a queueing model to analyse the multi-hop delay. Node mobility is addressed by considering the probability that non-uniform node distributions occur that cause clusters of nodes. Actual movement of nodes is not addressed. Their results show that end-to-end reliability increases with growing node density, whereas an increasing hop-count has the opposite effect. The end-to-end communication delay is found to increase with rising network load and hop-count, from 0.1 s at low network load up to 0.3 s at a network load of 75% of the channel capacity. The results lack experimental validation, and the models that describe communication reliability partly produce values below 0% and above 100% reliability.

Dongol and Vaman [DV14], similar to [SV13], propose a queueing model to analyse the multi-hop delay for video streaming in a MANET. Under the assumptions that the number of hops that a packet has to travel on its path from source to destination varies and that the variance of delay at each node is random, they find that, for a reasonable path length, the end-to-end communication delay can be modeled as a Gaussian process. They propose to actively manage the receiving node's buffer to provide QoS in the form of interruption-free video streaming. Although the work offers a stochastic model for end-to-end communication delay, it remains unclear how this model may be applied to adapt the buffer in response to actual observations from the network.

4.2. Related Research in the Control Systems Community

In their research on Transport layer protocols for NCS, Blind and Allgöwer [BA13] show how varying loss statistics influence the stability and achievable controller performance for a dynamic system. From their work it is clear that a control system that communicates over a dynamic multi-hop network has to be network aware in order to decide whether information acquired over the network can be relied on for a control task: knowing
both the future delay and message loss for a system allows assessing overall system stability and optimizing the controller while in operation.

4.2.1. Using Real-Time Guarantees From the Network

Gräfling, Mähönen, and Riihijärvi [GMR10] evaluate the performance of real-time communication via IEEE 1609 WAVE and IEEE 802.11p in a simulated VANET scenario. For the control packets that are scheduled with priority, the communication delay remains below the defined upper bound of 0.1 s, independent of node mobility, as long as fewer than 1000 messages are transmitted per second. The simulated scenarios model the communication of road-side units with passing vehicles, and only 1-hop communication is considered. Hence, routing is completely ignored in this study.

Kumar et al. [Kum+12] propose a special data dissemination protocol for VANETs to support cooperative driving in which geographic regions, represented by nodes that have acquired knowledge from that region, contend for channel access in the MAC sublayer instead of individual nodes. The idea is to efficiently distribute environmental knowledge to all participating nodes by preventing transmissions of redundant data from different nodes. While this approach enables efficient use of scarce network resources, it is a highly specialized protocol that is incompatible with the Internet's protocol stack.

The WirelessHART specification, officially released in 2007, was the first complete open wireless communication standard for the field of industrial control automation [Son+08]. Song et al. [Son+08] give an overview of this specification and discuss the insights they gained from implementing an early technology demonstrator. The protocols follow a layered design that fully aligns with the simplified layer model of figure A.1. The specification defines a deterministic Data-link layer that uses a centrally coordinated, time-triggered medium access scheme atop the IEEE 802.15.4 Physical layer to support real-time communication in MANETs. A key element of every WirelessHART network is the Network Manager, a central coordination entity that has three responsibilities:

• configure the network
• build routing graphs for each node in the network that they use for their routing tasks at the Network layer
• build tables with control information for each node in the network that they use to schedule frame transmissions at the Data-link layer

As such, the wireless links support a certain node mobility and easy extension or replacement of nodes, but the central coordination prohibits node mobility that would cause frequent topology changes. Recent development has led to a convergence of the ZigBee specification and WirelessHART [Che+14].


Freitas Francisco and Rammig [FR05] present a reconfigurable communication system for distributed real-time systems that is specifically designed to handle hard real-time requirements with fault tolerance. The communication system has a non-layered design, as opposed to the OSI model, and is targeted at embedded systems, with many of the network functions realized in hardware. Before a new real-time flow is admitted, or to keep it admitted after topology changes, the flow's route (or multiple routes in case of redundancy) is probed to assess whether the flow's deadline can be met considering its data and arrival rate, i.e., bandwidth and message frequency. Each node uses an earliest deadline first scheduler to transmit pending messages; the scheduler allows interrupting and later resuming a transmission to give way to a higher-priority message. Hence, this system is event driven and allows for more flexibility in message scheduling than time-triggered systems. Despite being designed around a multi-hop communication system that relies on intermediate nodes acting as routers, the concept requires point-to-point communication links between nodes, which makes it unsuitable for wireless communication.
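
The ordering principle of an earliest deadline first scheduler can be illustrated with a small priority queue. The sketch below is not the design of [FR05]: the class name and message labels are hypothetical, and the interruption and later resumption of an ongoing transmission that the authors describe is deliberately not modelled.

```python
import heapq
import itertools


class EdfQueue:
    """Pending messages ordered by absolute deadline (earliest first)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so that messages with equal deadlines keep their
        # insertion order and message payloads are never compared.
        self._sequence = itertools.count()

    def push(self, deadline, message):
        heapq.heappush(self._heap, (deadline, next(self._sequence), message))

    def pop(self):
        """Remove and return (deadline, message) with the earliest deadline."""
        deadline, _, message = heapq.heappop(self._heap)
        return deadline, message

    def __len__(self):
        return len(self._heap)


# Hypothetical messages with deadlines in seconds from now.
queue = EdfQueue()
queue.push(0.30, "telemetry")
queue.push(0.05, "brake command")
queue.push(0.10, "steering command")
first_deadline, first_message = queue.pop()
```

Here the brake command is transmitted first because its deadline is the nearest, regardless of the order in which the messages arrived.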

4.2.2. Increased Robustness Towards Connectivity Issues

Nikolakopoulos, Panousopoulou, and Tzes [NPT08] present a method to design and analyse a controller using a time-discrete state space system model with communication delay between plant and controller. They show that, to maintain a stable system, the gain of an optimal output feedback controller has an upper bound that depends on the communication delay. Using this knowledge, they propose a hybrid system model that, for a certain time window, switches its controller gain based on the communication delay observed in a past time window. In this setup, the communication delay is measured as the round-trip time from controller to plant and back to the controller. For evaluation, the authors apply this method to the design of three NCSs that communicate via a mobile cellular network, an infrastructure WLAN, and a MANET, respectively; both of the latter use IEEE 802.11b networking technology. The controller gain switching for the MANET NCS does not directly depend on the communication delay but uses the data flow's hop-count from plant to controller instead: the hop-count is arbitrarily grouped into three categories ([0, 3], [3, 9], and [9, 15]), and for each of them an expected upper bound on the end-to-end communication delay is estimated by taking the maximum delay observed per category in simulations carried out with the network simulator ns-2; these delay bounds are then used to calculate the optimal controller gains that are switched between. UDP is used in the Transport layer, and the DSR routing protocol is used for routing.

Haupt et al. [Hau+14] discuss a design method for NCSs that use wireless networking
4. Related and Previous Work technology to connect sensors, actuators, and controllers. The method uses cross-layer optimization in the design phase to optimize the control performance: by using a hardware in the loop approach together with a simulation of the network, the controller’s gain is adapted to minimize a cost function, e.g., with a genetic optimization algorithm. To achieve deterministic communication delay in MANETs, the authors propose to use a time-triggered Data-link layer that uses a special synchronization scheme dubbed Black Burst Synchronization on top of an IEEE 802.15.4 Physical layer, together with a suitable, QoS aware routing protocol in the Network layer. A state estimator is used to compensate for lost packets or unexpectedly large delays. The authors point out that, despite their choice for a time-triggered medium access, modern contention based medium access such as Carrier-Sensing Multiple Access (CSMA) provides superior control performance.

4.3. End-to-End Communication Delay Prediction

4.3.1. Aggregating Single-Hop Communication Delay Predictions

The QoS scheme that Ahn et al. [Ahn+02] propose, cf. section 4.1.2 above, estimates the end-to-end communication delay by collecting 1-hop communication delay estimates from each node along the flow’s route, which is discovered by the same query, and adding them up. A node estimates the 1-hop communication delay that it causes in a route by keeping track of the network load registered by the flows it already administers and computing the delay from the network load that will result from accepting the new flow.

Sun et al. [Sun+04] propose a model-based resource prediction method to estimate the available throughput and the end-to-end communication delay that a flow would currently experience if admitted to the MANET; this estimate then serves as the criterion to admit a new flow if the network can meet its QoS demand and to reject it otherwise. The model is similar to the one proposed by [Ahn+02] but supports an arbitrary number of QoS priority classes. For their estimation method, the authors assume a stationary network topology while sampling current input values; as the sampling period is just a few seconds long, this assumption is valid for networks of nodes with low mobility, e.g., of pedestrians. They express the expected end-to-end delay as service delay $\tau_{\mathrm{service}}$:

\[ \tau_{\mathrm{service}} = \tau_{\mathrm{defer}} + \tau_{\mathrm{transmit}} \tag{4.11} \]

$\tau_{\mathrm{defer}}$ denotes the delay caused by packet waiting times at the MAC layer because of an occupied channel or back-off times after collisions; $\tau_{\mathrm{transmit}}$ denotes the time needed to actually transmit data over the channel and can be calculated from the message length $F$ and the channel capacity $C$:

\[ \tau_{\mathrm{transmit}} = \frac{F}{C} \tag{4.12} \]

Both methods are relevant building blocks for the prediction models that are proposed in chapter 6: adding communication delay estimates along a flow’s route is a simple method to estimate the end-to-end communication delay. Important aspects that are not addressed by any of the work are methods to predict future routes, methods to incorporate actual observations of the network, and methods to predict 1-hop communication delays further than the flow’s admission.
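The aggregation of single-hop estimates can be illustrated with a minimal sketch. It assumes that each hop on the route reports its own deferral delay estimate and channel capacity; the names `Hop`, `transmit_delay`, and `end_to_end_delay` are hypothetical and not taken from [Ahn+02] or [Sun+04].

```python
from dataclasses import dataclass


@dataclass
class Hop:
    """Per-hop inputs (hypothetical structure for illustration)."""
    defer_s: float       # estimated deferral delay tau_defer at this hop, in seconds
    capacity_bps: float  # channel capacity C at this hop, in bit/s


def transmit_delay(frame_bits: float, capacity_bps: float) -> float:
    """Equation (4.12): tau_transmit = F / C."""
    return frame_bits / capacity_bps


def end_to_end_delay(route: list, frame_bits: float) -> float:
    """Sum the per-hop service delays (4.11) along the route."""
    return sum(
        hop.defer_s + transmit_delay(frame_bits, hop.capacity_bps)
        for hop in route
    )


if __name__ == "__main__":
    # Three identical hops, 1 Mbit/s each, 2 ms deferral, 8000 bit frame.
    route = [Hop(defer_s=0.002, capacity_bps=1e6) for _ in range(3)]
    print(end_to_end_delay(route, frame_bits=8000))
```

The sketch only adds up what is known at admission time; it does not address future routes or later changes of the per-hop delays.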

4.3.2. Communication Delay Prediction in the Internet

Ramasubramanian et al. [Ram+08; Ram+09] propose to use a tree graph, called prediction tree, of network participants (the graph’s nodes) and communication links (the graph’s edges), which are labeled with the communication link’s delay; the graph is constructed from direct measurements of selected Internet links. To insert a new node, a few reference measurements are performed to guess the new node’s position in the tree graph and to add an edge connecting it with the existing graph. By inserting virtual nodes on existing edges, new sub-trees can be introduced into the graph without requiring full knowledge of the actual network topology. The end-to-end delay between two network nodes is estimated by calculating the sum of all delay values along the tree graph’s edges on the path connecting the nodes; measuring the delay between all network participants is not necessary.

Ping, Kit, and Karuppiah [PKK13] enhance the above prediction tree approach by addressing the selection of nodes used for reference measurement when adding a new node to the graph. First, a set of leaf nodes, L, from the existing prediction tree is selected for latency measurement. Second, from the non-leaf nodes on the paths connecting the leaf nodes in L, a set of candidate interior nodes, I, is selected. Third, using the nodes from L and I, candidate metric trees are generated by inserting a virtual node between two candidate nodes from L and I and connecting the new node to this virtual node. Fourth, with the known measurements, the optimal candidate metric tree is selected from the candidate metric trees. Their evaluation using PlanetLab³ claims to achieve a latency prediction accuracy of 18%.

Madhyastha et al. [Mad+06] use traceroute measurements from PlanetLab nodes to a single representative host per routable Border Gateway Protocol prefix to construct a structural model of the Internet, referred to as atlas. With a few traceroute measurements from a new client, this client can be positioned in the atlas. In order to predict the latency between two nodes, first the path along which the packets would be routed through the Internet is predicted, and then a latency estimate for this path is calculated. Both prediction steps make use of the constructed atlas. The evaluation shows that 54% of the predicted path latencies have an absolute error of 10 ms or less.

Sun et al. [Sun+13] use a weighted k-nearest neighbor algorithm to build a knowledge base by learning end-to-end link latencies for repeating time periods in peer-to-peer overlay networks operating on the Internet. Defining a link similarity metric and assuming that links have stable latencies, they estimate link latencies with information from this knowledge base. They experimentally validate their latency estimation method using PlanetLab. This concept demonstrates that data analysis algorithms can be applied to latency estimation; nevertheless, the concept is not applicable to MANETs due to their dynamic nature.

³ PlanetLab is a globally distributed service for Internet service and infrastructure testing, operated by a consortium of research institutions from universities and industry [PP07].
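The delay lookup in a prediction tree, as described for Ramasubramanian et al. at the beginning of this section, amounts to summing edge labels along the unique path between two nodes. The following is a minimal sketch of that lookup only; the class name, the method names, and the node labels are hypothetical, and the construction of the tree from reference measurements is not covered.

```python
from collections import defaultdict


class PredictionTree:
    """Undirected tree whose edges carry a delay label in milliseconds."""

    def __init__(self):
        self.adj = defaultdict(dict)

    def add_edge(self, a, b, delay_ms):
        self.adj[a][b] = delay_ms
        self.adj[b][a] = delay_ms

    def estimate_delay(self, src, dst):
        """Sum the edge delays on the unique tree path from src to dst.

        Returns None if dst cannot be reached from src.
        """
        stack = [(src, None, 0.0)]  # (node, parent, accumulated delay)
        while stack:
            node, parent, dist = stack.pop()
            if node == dst:
                return dist
            for neighbor, delay in self.adj[node].items():
                if neighbor != parent:  # never walk back along the same edge
                    stack.append((neighbor, node, dist + delay))
        return None


if __name__ == "__main__":
    tree = PredictionTree()
    # "v1" and "v2" stand for virtual interior nodes (hypothetical labels).
    tree.add_edge("A", "v1", 5.0)
    tree.add_edge("B", "v1", 7.0)
    tree.add_edge("v1", "v2", 10.0)
    tree.add_edge("C", "v2", 3.0)
    print(tree.estimate_delay("A", "C"))
```

Because the graph is a tree, excluding the parent during traversal is sufficient to avoid cycles.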

4.3.3. Forecasting of End-to-End Communication Delay Time-Series

Marques and Casimiro [MC13] use a probabilistic model to compute one step ahead forecasts for the end-to-end communication delay time-series in WSNs. The required forecast accuracy is configurable by the application in terms of the probability, $p_c$, called coverage, that the next observed communication delay is less than or equal to the forecast. The proposed model is a simple order statistic of the last n samples of the time-series. Their evaluation, using a static testbed setup and different simulation scenarios, measures the forecast’s deviation from the true percentile that reflects the requested coverage. Their results show the worst performance in case the nodes are non-stationary and concurrent flows exist in the network, with a deviation of just below ±6% at maximum. In this worst case, the median is slightly above 0% deviation and the quartiles are at ±2%. Further, they find that for stationary nodes, increasing the number of samples reduces the deviation from the configured coverage, while for moving nodes, the opposite is true. This finding follows intuition, because in the latter case, the network changes over time and older observations no longer reflect the network’s actual state.
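A one step ahead forecast based on an order statistic can be sketched as follows. The text does not state which order statistic Marques and Casimiro use, so the rank ceil(p_c · n) chosen here is an assumption, as are the class and method names.

```python
import math
from collections import deque


class OrderStatisticForecaster:
    """Forecasts the next delay as an order statistic of the last n samples."""

    def __init__(self, window: int, coverage: float):
        if window < 1 or not 0.0 < coverage <= 1.0:
            raise ValueError("window must be >= 1 and coverage in (0, 1]")
        self.coverage = coverage
        self.samples = deque(maxlen=window)  # oldest samples drop out automatically

    def observe(self, delay: float) -> None:
        self.samples.append(delay)

    def forecast(self):
        """Return the sample at rank ceil(coverage * n), or None without data."""
        if not self.samples:
            return None
        ordered = sorted(self.samples)
        rank = math.ceil(self.coverage * len(ordered))  # 1-based rank (assumption)
        rank = min(max(rank, 1), len(ordered))
        return ordered[rank - 1]


if __name__ == "__main__":
    forecaster = OrderStatisticForecaster(window=4, coverage=0.5)
    for delay_ms in (40, 10, 30, 20):
        forecaster.observe(delay_ms)
    print(forecaster.forecast())
```

The bounded window mirrors the trade-off reported above: a larger n helps for stationary nodes, whereas for moving nodes old samples describe a network state that no longer exists.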

4.4. Context Awareness in Mobile Ad Hoc Networks

Acquiring contextual information from the network is necessary in order to provide input to the connectivity prediction models that are presented in chapter 6. A method to efficiently sense various metrics using cross-layer design is discussed below.

Petz et al. [Pet+11] propose a passive approach to network context sensing that only eavesdrops on existing network traffic instead of actively probing the network. They use OMNeT++ for simulation and an implementation based on Click [cf. Koh06] for real world evaluation. Using this infrastructure, they define three passive metrics (network load, network density, and network dynamics), for which they compare the specificity with active measurement counterparts. Each metric uses sensed context information that it feeds into an estimator function at an interval of length ν. All estimator functions calculate a weighted sum of the last estimate and the new context information collected during the time window [t − ν, t] to get a running average; γ ∈ [0, 1] denotes the weight-factor, with γ = 1 neglecting the old estimate when calculating the new estimate.

The network dynamics metric uses link quality information to derive the network’s reliability surrounding the measuring node. Petz et al. propose to either directly observe the quality of received packets or to instead count the occurrence of packets with link-failure semantics. In the former case, a node i accesses Data-link layer information for packets it receives from neighboring node j to determine the normalized link quality $lq_{ij} \in [0, 1]$ for the communication link from node i to node j at time t:

\[ lq_{ij}(t) = \gamma \cdot lq_{ij}(t-\nu) + (1-\gamma)\, lq_{ij,\mathrm{avg}}(t-\nu) \tag{4.13} \]

where $lq_{ij,\mathrm{avg}}(t-\nu)$ is the average link quality observed during the time window [t − ν, t]. In the latter case, the node counts route error packets in the Network layer to derive a comparable link quality metric:

\[ lq_{ij}(t) = \gamma \cdot lq_{ij}(t-\nu) + (1-\gamma)\, nre^{j,m}_{i}(t-\nu) \tag{4.14} \]

where $nre^{j,m}_{i}(t-\nu)$ denotes the number of route error packets received at node i from node j during the time window [t − ν, t]. Network density is measured by counting the number, $nd^{m}_{i}(t-\nu)$, of distinct host identifiers, i.e., MAC addresses, found in the Data-link layer frames that node i encounters during the time window [t − ν, t]:

\[ nd_{i}(t) = \gamma \cdot nd_{i}(t-\nu) + (1-\gamma)\, nd^{m}_{i}(t-\nu) \tag{4.15} \]

As such, this metric actually measures the neighbor count of node i, as it is defined in section 5.1.2. Because of the medium’s broadcast property, the encountered frames do not necessarily have to be intended for node i as recipient. Rather, the frames’ signal strength at node i must be high enough to successfully decode the signal. Consequently, all identifiers that are counted in this way belong to nodes that lie within communication range of node i. It is important to note that IEEE 802.11 acknowledgement frames do not carry any sender identifier. Hence, a node that does not send any other frames during the sensed time window is omitted by the metric.

The network load metric accumulates the received frames’ size during the time window [t − ν, t] to calculate the average network load. Consequently, the metric does exactly measure the network load.

From their evaluation, Petz et al. conclude that the passive metrics have very different specificity, often strongly dependent on the chosen estimation interval ν. The larger the chosen value for ν, the more susceptible the metrics become to estimation errors, especially the network density metric. Correlating different metrics with each other proves to be an effective way of assessing a metric’s current validity; specifically, a high network load indicates a good network density estimate.
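The running-average estimators of equations (4.13) to (4.15) share one update rule, which can be sketched as below. The sketch follows the equations as printed, where γ weights the previous estimate; the class name, the helper `count_neighbors`, and the example frame data are hypothetical.

```python
class PassiveEstimator:
    """Running average: estimate(t) = gamma * estimate(t - nu) + (1 - gamma) * window_value."""

    def __init__(self, gamma: float, initial: float = 0.0):
        if not 0.0 <= gamma <= 1.0:
            raise ValueError("gamma must lie in [0, 1]")
        self.gamma = gamma
        self.estimate = initial

    def update(self, window_value: float) -> float:
        """Fold the value measured during the last window into the estimate."""
        self.estimate = self.gamma * self.estimate + (1.0 - self.gamma) * window_value
        return self.estimate


def count_neighbors(sender_ids) -> int:
    """Number of distinct sender identifiers overheard in one window (input to 4.15)."""
    return len(set(sender_ids))


if __name__ == "__main__":
    density = PassiveEstimator(gamma=0.5)
    # Sender MAC addresses of frames overheard in two consecutive windows.
    print(density.update(count_neighbors(["aa", "bb", "aa", "cc"])))
    print(density.update(count_neighbors(["aa"])))
```

The same class can serve the link quality metrics by passing the windowed average link quality or the route error count as `window_value`.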

4.5. Node Mobility and Localization

4.5.1. Node Mobility Prediction

Work on node mobility prediction is briefly discussed below to give a perspective on the prediction accuracy that is to be expected from such predictions. Recently, Elsner [Els15] has shown how to use machine learning techniques to improve the localization of humans in their daily routine and how to enable their localization when regular geolocation services are currently unavailable.

Prediction of Sparse Node Interactions

Bayir and Demirbas [BD14] propose a routing protocol for delay tolerant networks: a node learns the social network by storing connection encounters with other nodes in a local observation table, which is divided into time slices. The decision to forward a message to another node is based on two factors, the Observation Score and the Information Dissemination Score. The Observation Score describes the probability that a node connects with the message’s destination in the near future. The Information Dissemination Score describes how well a node can distribute the message to other nodes. The message’s hop count is then used as a metric for how far it has already spread; the higher the hop count, the smaller the Information Dissemination Score’s impact on the message forwarding decision. Data for training and validation is taken from Eagle and Pentland [EP06].

Prediction of a Node’s Next Location

Song et al. [Son+10] analyze cell tower associations from 45,000 mobile phone users over a three month period. Based on the entropy in the time-series describing an individual’s movement, they find that the predictability of a human’s location, based on the historic location trajectory, is 93% on average (lower bound is 80%).

Gambs, Killijian, and del Prado Cortez [GKd12] use n-Mobility Markov Chains (MMCs) to predict the next macro scale location, in the form of a Point Of Interest (POI), that an individual will visit, based on the individual’s current and n − 1 previously visited POIs. An MMC models an individual’s macroscopic mobility behavior as a probabilistic automaton, with states representing POIs and transitions representing movement from one POI to another POI [GKd11]. The transition matrix, which describes the n-MMC, is learnt from an individual’s recorded mobility traces. To learn the POIs, they remove all data points from a trace that have a movement speed above a predefined threshold to retain only locations at which the individual was stationary, and then remove redundant locations as well. The remaining set of stationary locations is then clustered with the Density-Joinable Cluster algorithm; the resulting clusters define the individual’s POIs.
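The idea of learning a transition matrix from a POI trace and predicting the next POI can be sketched for the simplest case n = 1. This is an illustration only and not the implementation of [GKd12]; the function names and the example trace are hypothetical, and the POI extraction by clustering is assumed to have happened already.

```python
from collections import Counter, defaultdict


def learn_transitions(poi_sequence):
    """Estimate first-order transition probabilities from a sequence of visited POIs."""
    counts = defaultdict(Counter)
    for current, following in zip(poi_sequence, poi_sequence[1:]):
        counts[current][following] += 1
    return {
        current: {poi: c / sum(row.values()) for poi, c in row.items()}
        for current, row in counts.items()
    }


def predict_next(transitions, current_poi):
    """Return the most probable next POI, or None for an unseen state."""
    row = transitions.get(current_poi)
    if not row:
        return None
    return max(row, key=row.get)


if __name__ == "__main__":
    trace = ["home", "work", "gym", "home", "work", "home", "work", "gym"]
    model = learn_transitions(trace)
    print(predict_next(model, "work"))
```

For n > 1, the state would be the tuple of the current and the n − 1 previous POIs instead of a single POI.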

4.5.2. Uncertainty in Node Localization

Using predictions of node locations in the prediction models that are introduced in chapter 6 to contribute to research questions 2 and 3 is one of the thesis’s objectives. The methods discussed below either are used as parts of the models in chapter 6 or provide background on the available methods to localize nodes geographically.

Computing Distances From Uncertain Node Locations

Xiao and Hung [XH07] propose an efficient method to calculate the expected distance between two objects by approximating the actual uncertainty in the coordinates of the objects’ locations via sampling from the actual distributions and approximating each location with a single normal distribution. Let $x_u \in \mathbb{R}^t$, $u \in \{1, \dots, n_i\}$, and $y_v \in \mathbb{R}^t$, $v \in \{1, \dots, n_j\}$, be the location samples for both objects, where $n_i$ and $n_j$ are the numbers of samples taken and $t \in \mathbb{N}$ is the dimensionality of the objects’ locations, e.g., $t = 2$ for nodes that move on a plane surface. Then the squared Euclidean distance’s expected value, $E\{d^2\}$, is:

\[ \mu = E\{d^2\} = \frac{1}{n_i}\frac{1}{n_j}\sum_{u=1}^{n_i}\sum_{v=1}^{n_j} \lVert x_u - y_v \rVert^2 = \sum_{w=1}^{t}\left( E\{x_w^2\} + E\{y_w^2\} - 2E\{x_w\}E\{y_w\} \right) \tag{4.16} \]
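Both forms of equation (4.16) can be compared numerically with a short sketch. The function names and the sample values are hypothetical; the pairwise form needs n_i · n_j distance evaluations, whereas the moment form only needs per-dimension means that can be precomputed for each object.

```python
def expected_sq_distance_pairwise(xs, ys):
    """Left-hand form of (4.16): average squared distance over all sample pairs."""
    total = 0.0
    for x in xs:
        for y in ys:
            total += sum((a - b) ** 2 for a, b in zip(x, y))
    return total / (len(xs) * len(ys))


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def expected_sq_distance_moments(xs, ys):
    """Right-hand form of (4.16): per-dimension sample moments of each object."""
    dims = len(xs[0])
    mu = 0.0
    for w in range(dims):
        ex = _mean(x[w] for x in xs)
        ex2 = _mean(x[w] ** 2 for x in xs)
        ey = _mean(y[w] for y in ys)
        ey2 = _mean(y[w] ** 2 for y in ys)
        mu += ex2 + ey2 - 2.0 * ex * ey
    return mu


if __name__ == "__main__":
    xs = [(0.0, 0.0), (2.0, 0.0)]  # location samples of the first object
    ys = [(1.0, 1.0), (1.0, 3.0)]  # location samples of the second object
    print(expected_sq_distance_pairwise(xs, ys))
    print(expected_sq_distance_moments(xs, ys))
```

Both calls return the same value because the samples of the two objects are combined independently.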

The expected values $E\{x_w\}$ and $E\{x_w^2\}$ are the mean and the mean of the squares, respectively, of the object’s samples in dimension $w$. Hung, Xiao, and Hung [HXH12] enhance this method by adding an efficient method to calculate the distance’s variance from the same set of samples as above, which they then use to approximate the distance’s probability density function to reflect the uncertainty in the result. This sampling based method is able to approximate unknown distributions of the object locations and was shown to achieve 90% or better accuracy for the distance’s probability density function with as few as 200 samples from each object’s location distribution. Their proposed Variance Approximated by Sample-based Statistics algorithm uses aggregations of the samples that can be precomputed to speed up the calculation:

\[
\begin{aligned}
\mathrm{var} = \mathrm{Var}\{d^2\} ={}& \sum_{w=1}^{t}\Big[ E\{x_w^4\} + E\{y_w^4\} + 6E\{x_w^2\}E\{y_w^2\} - 4E\{x_w^3\}E\{y_w\} - 4E\{y_w^3\}E\{x_w\} \Big] - \mu^2 \\
&+ 2\sum_{t_1=1}^{t-1}\,\sum_{t_2=t_1+1}^{t}\Big[ E\{x_{t_1}^2 x_{t_2}^2\} + E\{y_{t_1}^2 y_{t_2}^2\} + E\{x_{t_1}^2\}E\{y_{t_2}^2\} + E\{x_{t_2}^2\}E\{y_{t_1}^2\} \\
&\qquad - 2\big( E\{x_{t_1} x_{t_2}^2\}E\{y_{t_1}\} + E\{y_{t_1} y_{t_2}^2\}E\{x_{t_1}\} + E\{x_{t_1}^2 x_{t_2}\}E\{y_{t_2}\} + E\{y_{t_1}^2 y_{t_2}\}E\{x_{t_2}\} \big) \\
&\qquad + 4E\{x_{t_1} x_{t_2}\}E\{y_{t_1} y_{t_2}\} \Big]
\end{aligned}
\tag{4.17}
\]

To derive the approximation of the distance’s probability density function, Hung, Xiao, and Hung [HXH12] then propose a hybrid method that uses the expected value and variance from equations (4.16) and (4.17) either for a normal distribution, $d^2 \sim \mathcal{N}(\mu, \mathrm{var})$, in case of a large distance, or to compute the shape, $k$, and scale, $\theta$, parameters of a gamma distribution, $d^2 \sim \Gamma(k, \theta)$, according to:

\[ k = \mu^2 / \mathrm{var} \tag{4.18} \]

\[ \theta = \mathrm{var} / \mu \tag{4.19} \]

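The resulting hybrid probability density function for the squared Euclidean distance then is:

\[
p(x) =
\begin{cases}
\dfrac{1}{\Gamma(k)\,\theta^{k}}\; x^{k-1} \exp\!\left(-\dfrac{x}{\theta}\right) & k < k_0 \\[2ex]
\dfrac{1}{\sqrt{2\pi\,\mathrm{var}}}\, \exp\!\left[-\dfrac{1}{2}\,\dfrac{(x-\mu)^2}{\mathrm{var}}\right] & k \ge k_0
\end{cases}
\tag{4.20}
\]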
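A minimal sketch of the hybrid density may help to see how the threshold selects between both branches. The default value `k0=100.0` is a hypothetical choice and not taken from [HXH12], and the gamma branch is evaluated in log space, which is an implementation choice of this sketch to avoid overflow of Γ(k).

```python
import math


def gamma_params(mu: float, var: float):
    """Shape k and scale theta from equations (4.18) and (4.19)."""
    return mu * mu / var, var / mu


def hybrid_pdf(x: float, mu: float, var: float, k0: float = 100.0) -> float:
    """Density of the squared distance: gamma for k < k0, normal otherwise."""
    k, theta = gamma_params(mu, var)
    if k < k0:
        if x <= 0.0:
            return 0.0  # simplification: no density is assigned at or below zero
        # Log-space evaluation of x^(k-1) * exp(-x/theta) / (Gamma(k) * theta^k).
        log_p = (k - 1.0) * math.log(x) - x / theta - math.lgamma(k) - k * math.log(theta)
        return math.exp(log_p)
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


if __name__ == "__main__":
    print(hybrid_pdf(2.0, mu=2.0, var=4.0))      # k = 1, gamma branch
    print(hybrid_pdf(100.0, mu=100.0, var=1.0))  # k = 10000, normal branch
```

With µ = 2 and var = 4, the gamma branch reduces to an exponential density with mean 2, which makes the first result easy to check by hand.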

Large distance, for the purpose of this function, is defined via $k$, which is used as the gamma function’s parameter: for large $k$, the gamma function is not easy to compute or might even be unavailable, as is the case for the software package R. The threshold $k_0$ is used to change from the small distance gamma distribution to the large distance normal distribution.

Node Distance Estimation

Hara et al. [Har+05] propose a WSN that estimates a target’s location using the Received Signal Strength Indicator (RSSI) value from all sensor nodes that receive the target’s communication signal. The WSN uses IEEE 802.15.4 based communication; all sensor nodes have equal, fixed spacing with known positions. The estimation method is based on the radio signal’s fading characteristic; the presented measurements, which were obtained in different indoor locations, validate the underlying assumption that the received signal power, $P$, exponentially correlates with the signal origin’s distance, $r$, with a correlation coefficient of 0.92. The probability density function

\[ p(P \mid r) = \frac{1}{\Lambda(r)}\, e^{-\frac{P}{\Lambda(r)}} \tag{4.21} \]

\[ \Lambda(r) = \alpha r^{-\beta}, \qquad 1.90 \le \beta \le 4.75 \tag{4.22} \]

expresses this relation; β being the power decay factor and α being a constant that is not further specified. A single node gathers all sensor values, resulting in the joint probability density function

\[ p(P_1, \dots, P_N \mid r_1, \dots, r_N) = \prod_{i=1}^{N} p(P_i \mid r_i) \tag{4.23} \]

\[ r_i = \sqrt{(X - x_i)^2 + (Y - y_i)^2} \tag{4.24} \]

with $i = 1, \dots, N$ being the sensor node index, $(x_i, y_i)$ being node i’s known position, and $(X, Y)$ being the target’s unknown position. With the maximum likelihood estimation of $(X, Y)$ maximizing the logarithm of equation (4.23), they derive

\[ \left[ \frac{\partial}{\partial X} \sum_{i=1}^{N} \log p(P_i \mid r_i) \right]_{X=\hat{X}} = 0 \tag{4.25} \]

\[ \left[ \frac{\partial}{\partial Y} \sum_{i=1}^{N} \log p(P_i \mid r_i) \right]_{Y=\hat{Y}} = 0 \tag{4.26} \]

to get a 2-dimensional nonlinear equation, which they solve by applying the Newton-Raphson method. The equations (4.25) and (4.26) contain the power decay factor β, defined to be β = 2.35 by Hara et al., corresponding to their measurements. Using this technique, Hara et al. demonstrate an accurate estimation of the target’s location using a 5 × 5 grid of sensor nodes and 2 RSSI measurements.

Benkic et al. [Ben+08] present measurements that relate the RSSI in a receiver node to the node’s distance from the signal’s origin node, using ZigBee wireless communication. They use three different ZigBee radio chips; for each chip, they perform a measurement series inside a long corridor (4 m wide, 30 m long, and 5 m high) with node distances ranging from 0 m to 25 m and between 200 and 270 measurements per distance. They get a relative standard deviation of 2% median (range 0.5% to 7.8%) for the measured RSSI and conclude that the use of RSSI as distance indicator depends on “[. . . ] how precise the distance evaluation must be.” [Ben+08, p. 4]

In [Maz+09], Mazuelas et al. use a short succession of RSSI measurements from multiple access points to estimate the path fading coefficient for the current environment and, with known access point locations, the location of a mobile node. During indoor testing, the estimated locations had a mean error of 3.97 m (standard deviation 1.18 m) inside an area of more than 30 m side length.

Inertial Sensors for Node Positioning and Tracking

The accuracy and position update frequency of a Global Navigation Satellite System (GNSS) can be improved by combining the positioning information calculated from the tracked satellites with data from inertial sensors, such as accelerometer and gyroscope, using a Kalman filter [Bul+06]. This method is commonly found in positioning systems of land, air, and sea vehicles. The filter computations require fixed cycle times and depend on the sensor being in a fixed and known alignment with the object being tracked. These requirements can easily be fulfilled in an embedded system platform mounted inside a vehicle, but less so inside a smartphone or other device loosely carried by a pedestrian.

In [Li+12], Li et al. present a method to localize a pedestrian using only the inertial sensors of a smartphone. The method relies on step detection with dynamic time warping validation, personalized stride length estimation, and heading inference to trace a person’s movement; from this trace, they infer the person’s current location. In their evaluation, Li et al. achieve a mean positioning error of 1.5 m (95th percentile error at 4.3 m) for a smartphone carried in one hand while walking and a mean positioning error of 2 m (95th percentile error at 4.8 m) for a smartphone inside a pants pocket.


4.6. Conclusion

Very little research has been published so far that helps to verify the systematic influence that contextual factors at a scenario level (like node speed, node density, the kind of traffic in the network, or network load) exert on the communication delay and interruption connectivity metrics in the context of real-time communication in MANETs. The same observation holds for methods to predict these metrics given mobile nodes. Depending on the research community, either the computer networking/computer science community or the control systems community, very different viewpoints on network connectivity and approaches to handle it exist.

Previous research from the computer science community that in some form addresses the thesis’s topics of interest belongs to one of three categories: routing protocol research for MANETs, QoS in MANETs, or WSNs. In routing protocols, predictions of remaining single link life-times are used to find new routes before the existing one breaks due to a link failure; end-to-end predictions or extrapolation of single link effects to end-to-end effects are of no concern in this case and not considered in any form. The two major topics in QoS research are message prioritization, to reduce the effects of other traffic on the connectivity metrics of specifically marked flows, and estimation of achievable connectivity metrics before admitting such a marked flow to the network. The latter is a form of prediction, but current research only addresses estimation of the current network state; mobility is generally ignored in this regard, and changes to the network cause the QoS mechanisms to cancel existing flows to force them to request readmission to the network, which then might simply fail. The consequence is that current methods do not allow graceful degradation but instead fail hard. Relevant work for WSNs to date only addresses static positioning of nodes. Prevalent is the use of queueing models to predict communication delay and of stochastic models to predict the degree of connectedness, i.e., the existence of routes. The effect of node mobility has been generally ignored, as has lately been noticed and criticised by others.

Research domains of computer networking that do address connectivity prediction are, primarily, delay tolerant networks and, to a lesser degree, the Internet. Delay tolerant networks are a subgroup of MANETs that are explicitly not intended for real-time communication: information is primarily disseminated by the movement of nodes. Predicting the neighbors of a node at the time scale of hours or days is a major theme for efficient information propagation, but the methods are not applicable to real-time communication on a time scale of minutes, seconds, or less. Predicting communication delay in the Internet relies on its static network topology and on the seasonality of observable time-series; both are assumptions that do not hold for MANETs with moving nodes.

In the control systems community, the classical approach is still the most prevalent one: to use networking technology that guarantees upper bounds for communication delay by utilizing special message scheduling in the MAC sublayer or for routing that is mostly centrally coordinated. Newer research addresses the issue of unknown communication delay by developing control methods that either have increased robustness towards the unknown communication delay or that measure the experienced communication delay to adapt to the current connectivity. Incorporating connectivity predictions is of no concern.


5. Connectivity in Mobile Ad Hoc Networks with Moving Nodes

The formulation of prediction models for communication delay and interruption in MANETs requires knowledge about these two phenomena and the factors that influence them. In the literature, communication delay and interruption are rarely discussed for their own merits, but rather as means to evaluate routing algorithms, MAC protocols, and QoS mechanisms. While the end-to-end communication delay is a direct metric used in these evaluations, the communication interruption is commonly reported in the form of packet loss. Packet loss statistics, however, offer only limited insight into the temporal correlation of successful or interrupted packet transmissions.

Studying the mechanisms of communication in a MANET, which are introduced in section 2.3, suggests three aspects that, on a technical level, influence communication delay and interruption: channel utilization (including interference) on the Data-link and Physical layers; the hop-count on the Network layer, which acts as multiplier for the effects that influence single links; and node dynamics, which influence both the Data-link and the Network layer. Of these three aspects, the channel utilization’s influence on communication delay and interruption is well understood. Various analytical models have been proposed and validated to describe it: the physical models described in section 2.3.2 and below in section 5.1.1, and queueing models for the MAC sub-layer behaviour [SV13; DV14]. Interference from sources outside the network has been addressed by methods from time-series analysis [DV14]. The influence of node dynamics in the form of node speed has been investigated by others. Royer, Melliar-Smith, and Moser [RMM01] show how node speed and node density impact packet delivery statistics as well as network throughput. They do not investigate their influence on the communication delay.
The characteristics of the three technical aspects (channel utilization, hop count, and node dynamics) themselves, as they appear in a certain scenario or context, result from the configuration or expression of multiple other factors, hereafter called contextual factors. Introducing this second, contextual layer motivates research question 1: To which degree do contextual factors, in the form of scenario parameters, influence the connectivity metrics in a MANET?

For simulation studies of MANETs, the contextual factors that are mostly used to describe and parametrize the simulations are: average node speed, mobility model and node placement, simulation area, node count, number of flows in the network, and the type of network traffic, defined by packet size and transmission interval. Given the application scenarios defined in chapter 3, similar situations in the real world reveal the same set of contextual factors: a fixed set of nodes moves in a bounded area with some average speed; their network communication has certain bandwidth and transmission interval characteristics. From the literature and the technical communication mechanisms, the presented contextual factors are expected to influence the communication delay and the streaming window duration; yet it is unclear which of them actually has a measurable impact and how strong that influence is.

Before continuing, the node count as contextual factor has to be discussed further. While it probably is the simplest of all the enumerated factors, its influence on the connectivity depends on the area in which the nodes are spread. This assertion becomes obvious when considering that a node can only directly communicate with other nodes in an area that is defined by its average communication range. The range at which other nodes affect it via interference may be larger, but it is nevertheless bounded by a dependence on the average communication range. Consequently, as a contextual factor, the area dependent node count is replaced by the node density, i.e., the node count per unit of area, which has an expected impact on the connectivity that is independent of the simulation area:

\[ \text{node density} = \frac{\text{node count}}{\text{simulation area}} \tag{5.1} \]

5.1. Design of Experiment Based on these preliminary considerations, the basic simulation setup in figure 5.1 is used to configure a factorial experiment parameter study in the discrete event simulation framework OMNeT++ with the INET library to empirically investigate the influence of the contextual factors on the network connectivity. Table 5.1 contains further details about the simulation study parameters and their relation to the contextual factors. Simulation based studies using discrete event simulations are a wide spread and accepted method to research and evaluate networked communication systems in general and MANETs specifically [VOM13; HOG13]. Despite their popularity, surveys have repeatedly uncovered a prevalent lack of credibility in published results obtained from such simulation studies [KCC05; HOG13; SG14]; major points of critique are the lack of reproducible results, reliance on single data points, i.e., single simulation runs, ignorance of the influences of underlying random number generation, and simulation scenarios that

60

5.1. Design of Experiment

r¯com sending node

r¯com average communication range area for calculation of neighbor count

speed

receiving node

50 m

50 m

flow under observation

ysize

0.5 · ysize

other traffic

moving node stationary node flows (specified by traffic type and other traffic)

xsize Figure 5.1.: Parametrized simulation scenario used for the factorial experiments.

Table 5.1.: Configuration parameters used to analyse the factorial experiment study. Parameter

Symbol

Explanation

xsize ysize neighbor count speed traffic type

lx ly nconf nghb |v| tr

other traffic

nflow

simulation area’s horizontal (x) side length in m simulation area’s vertical (y) side length in m neighbor count used to configure the simulation node speed for the moving nodes transmission interval for all traffic in the simulation; because the payload per transmission is adapted to have a constant data rate per flow, only this single parameter is used number of other data streams, i.e., flows, in the network besides the flow under observation

Table 5.2.: Transmit power of IEEE 802.11 WLAN devices.

Source/Device                                  Transmit Power
typical configuration in laptops [Wik14]       15 dBm/32 mW
mobile workstation (Dell Precision, Ubuntu)    15 dBm/32 mW
RaspberryPi (Raspbian, ad hoc WLAN mode)       20 dBm/100 mW

do not suit the research question being investigated. Despite this continuing, pessimistic state of simulation study use, the referenced authors provide a set of recommendations that help to ensure scientifically sound, reproducible results. Besides the configuration parameters that account for the contextual factors, the nodes’ Physical layer properties have to be specified such that they reasonably represent the application scenarios previously defined in chapter 3.

5.1.1. Physical Layer Parameters

Central to the Physical layer’s parametrisation is the radio propagation model, which describes the radio signal’s attenuation, i.e., decline, with distance to its emitting terminal. Together with the node transmit power, the radio propagation model and its parameters specify the maximum geographic distance at which two nodes are able to communicate directly and hence influence the impact that a specific node mobility and simulation area have on the simulation’s outcome. For the 2.4 GHz ISM band most regulators in the world, including Europe and the USA, limit the Equivalent Isotropic Radiated Power (EIRP) to 100 mW [BW06]. Often, however, the transmit power is lower than this. Because scientific literature regarding WLAN radios’ transmit power in typical device configurations is lacking, the small survey summarized in table 5.2 is used instead; it hints at 32 mW for consumer devices. The Friis equation (2.2) that was introduced in section 2.3.2 is a basic radio propagation model, but suitable only for ideal environments. This equation results in a deterministic, quadratic path loss in relation to the Euclidean distance, d, between transmitter and receiver. It neglects other important, variable influences on the received signal strength such as the large-scale reflection, diffraction, scattering, or the small-scale fading [Rap02; Kun+08]. The Log-distance path loss model, which introduces the path loss coefficient, α, as the distance’s exponent into the Friis equation, enables an adaptation of the radio propagation model to different environmental characteristics [Wal+06]:

$$P_r^{LD}(d) = \frac{P_t G_t G_r}{L} \left(\frac{\lambda}{4\pi\,\mathrm{m}}\right)^{2} \left(\frac{\mathrm{m}}{d}\right)^{\alpha} \qquad (5.2)$$

Table 5.3.: Path loss exponents for various environments, taken from [Rap02].

Environment                       Path Loss Exponent (α)
free space                        2
urban area cellular radio         2.7 to 3.5
in building with line-of-sight    1.6 to 1.8
in building with obstruction      4 to 6
in factory with obstruction       2 to 3

where m is the unit meter, not a variable. Applicable values for the path loss coefficient range from 2 (free-space propagation, according to the Friis equation) to 5 (strong signal attenuation, e.g., in a dense city). Table 5.3 lists path loss exponents for different environments. Using a probabilistic radio propagation model introduces a variance to the received signal strength that better reflects the true, probabilistic nature of radio signal transmission under real conditions than a purely deterministic radio propagation model [Kun+08]. The Log-normal shadowing model accounts for the empirical observation that, in the logarithmic dB domain, the received signal strength follows a normal distribution centered around the expected value that is calculated using the aforementioned, deterministic Log-distance model [Rap02]. Following this model, the received signal strength in Watt is calculated as:

$$P_r^{LNS}(d) = P_r^{LD}(d) \cdot 10^{-\frac{X_\sigma}{10}} \qquad (5.3)$$

with the zero-mean, Gaussian distributed random variable $X_\sigma$ that has a standard deviation of σ. Table 5.4 summarizes the choices made for the simulation’s Physical layer parameters. The same parameters are used not only for the factorial experiment of this chapter but also for the simulations that are discussed throughout the thesis. These parameter choices ensure sufficiently realistic simulation results that reflect the chosen application scenarios. For more details on the parameters, models, and alternative available options, Rappaport [Rap02], Walke et al. [Wal+06], or Kuntz et al. [Kun+08] may be consulted.
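The two propagation models and the derivation of the average communication range can be illustrated with a short Python sketch. It is not taken from the simulation code. It assumes unit antenna gains and system loss (G_t = G_r = L = 1) and a carrier frequency of exactly 2.4 GHz, neither of which is stated in the text; all function names are hypothetical.

```python
import math
import random

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to Watt."""
    return 10 ** (p_dbm / 10) / 1000


def log_distance_rx_power(d, p_t, alpha, freq_hz, g_t=1.0, g_r=1.0, loss=1.0):
    """Received power in Watt at distance d (in m), following equation (5.2)."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    return p_t * g_t * g_r / loss * (wavelength / (4 * math.pi)) ** 2 * d ** (-alpha)


def communication_range(p_t, sensitivity, alpha, freq_hz):
    """Distance at which the Log-distance receive power equals the sensitivity."""
    p_at_1m = log_distance_rx_power(1.0, p_t, alpha, freq_hz)
    return (p_at_1m / sensitivity) ** (1 / alpha)


def log_normal_shadowing_rx_power(d, p_t, alpha, freq_hz, sigma, rng=random):
    """One random draw of the received power in Watt, following equation (5.3)."""
    x_sigma = rng.gauss(0.0, sigma)  # zero-mean Gaussian variable in dB
    return log_distance_rx_power(d, p_t, alpha, freq_hz) * 10 ** (-x_sigma / 10)


# Transmit power, sensitivity, and path loss exponent as chosen in the text;
# the 2.4 GHz carrier frequency and the unit gains are assumptions.
r_com = communication_range(
    p_t=0.032, sensitivity=dbm_to_watt(-85), alpha=2.7, freq_hz=2.4e9
)
print(f"average communication range: {r_com:.0f} m")
```

Under these assumptions the sketch yields a range of roughly 167 m, in line with the derived value reported for the simulation setup.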

5.1.2. Scenario Parameters

In addition to the Physical layer parameters that were discussed in the previous subsection, the simulations’ scenario parameters are specified as a parameter study. The parameter study setup allows a simple execution of all simulation runs that are required for the factorial experiment. The intention of the factorial experiment is to establish an understanding of the way that the contextual factors principally influence

Table 5.4.: Physical layer model parameters.

Parameter                                 Value
Physical layer specification              IEEE 802.11g (maximum available net bitrate: 54 Mbit/s)
Transmitter power (P_t)                   32 mW
Receiver sensitivity                      −85 dBm (default)
Thermal noise level                       −110 dBm (default)
Propagation model                         Log-normal Shadowing (σ = 1)
Path loss exponent (α)                    2.7
Average communication range (r̄com)*      167 m

* Derived value using equation 5.2 and the receiver sensitivity as specified in this table.

the communication connectivity; it shall not reflect a single, specific scenario. In each simulation run, only a single data flow is observed to record the connectivity related data. The flow’s end-point nodes are labeled sending node and receiving node, respectively, hereafter as well as in figure 5.1. Those two nodes are positioned at fixed locations in the simulation area (vertically centered and each at 50 m distance from one of the simulation area’s borders in horizontal direction) and remain stationary. The immobility and chosen locations of the observed flow’s end-points remove the randomness in the distance between the end-points and make it directly dependent on the simulation area’s horizontal side length, i.e., the xsize parameter. The remaining nodes are initialized with a random starting location and move according to the Random-way-point mobility model. While the Random-way-point mobility model is the most commonly used mobility model in MANET simulation studies, its major critique is that it is very artificial and does not model human movements well [HOG13]. The mobility model’s artificiality renders it less suitable to model realistic scenarios but is of no concern for the purpose of the factorial experiment’s simulation study. Likewise, the Random-way-point mobility model’s waiting-time parameter, which specifies the time a node waits after having reached its previous destination before moving towards a new random location, is chosen arbitrarily: it is drawn from a normal distribution with a mean of 10 s and a standard deviation of 5 s that is truncated at 0 in order to only include positive values. A single simulation run has a duration of 1800 s, after which the simulation’s event processing is simply stopped.
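As an illustration of the waiting-time distribution, the following Python sketch draws values from a normal distribution truncated at 0. Truncation by rejection (redrawing until the value is non-negative) is an assumption here; the text does not state how the simulation framework realises the truncation, and the function name is hypothetical.

```python
import random


def truncated_normal(mean, stddev, lower=0.0, rng=random):
    """Draw from a normal distribution, rejecting values below `lower`."""
    while True:
        value = rng.gauss(mean, stddev)
        if value >= lower:
            return value


rng = random.Random(42)  # fixed seed, only to make the illustration repeatable
wait_times = [truncated_normal(10.0, 5.0, rng=rng) for _ in range(1000)]
```

With a mean of 10 s and a standard deviation of 5 s, only a small share of the draws falls below 0 and is rejected, so the truncation changes the distribution only slightly.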
The influence of initialization effects, like empty buffers in the Data-link layer, at the start of each simulation run on the communication connectivity statistics is reduced by using this long simulation duration and by starting the observed flow at ts = 30 s. The morphological scenario field of table 5.5 specifies the set of contextual factors and categorical expressions thereof, to be used for factor variations in

Table 5.5.: Morphological field of simulation parameters to investigate the context factors’ influence on the communication connectivity.

Node Density   Area            Traffic Type   Other Traffic   Node Speed           Routing protocol
very low       small square    light          none            walking              AODV
low            medium square   medium         low             inner city driving   DSR
middle         large square    heavy          middle          highway driving      OLSR
high           corridor                       high

the factorial experiment setup. The remainder of this subsection discusses the specific choices for the values that express the various factor categories. For every combination of the contextual factors’ categorical expressions, as specified in table 5.5, two replications of the simulation are run. In total, this results in 3456 simulation runs. Every such simulation run uses a different set of seed values for the random number generators¹. The high number of simulation runs, the replication of each parameter set, and the distinct initialization of random numbers together ensure that the experiment carries very little risk of randomly introducing bias in the results due to the simulations’ stochastic nature. Below, the simulation configuration parameters that are necessary to run the simulations are derived from the contextual factors and their categories.

¹ The simulations use the default OMNeT++ pseudo random number generator, Mersenne Twister by Matsumoto and Nishimura, which has a period of 2^19937 − 1 before it repeats; it outputs uniformly distributed 32 bit numbers in up to 623 dimensions [MN98; VO14]. While there exist recent pseudo random number generators that improve over Mersenne Twister, e.g., WELL [PLM06] or SFMT [SM08], Mersenne Twister appears sufficient for the purpose of the factorial experiment and is readily available in the simulation framework.

Node Speed and Simulation Area

The choice of concrete node speeds, listed in table 5.6, is straightforward. A survey on adult walking speeds justifies the walking speed of 5 km/h [Boh97]. The driving speeds follow German traffic regulation with 50 km/h for inner city driving and the recommendation of 130 km/h for highway/Autobahn driving. The possible route length increases with the scenario area, and a stretched, corridor-like area increases the probability of routes sharing common nodes and influencing each other. Both the node speeds and the average communication range justify the concrete values chosen for the simulation area parameters, listed in table 5.6: The Physical layer

Table 5.6.: Node speed and simulation area scenario parameters.

Node Speed                                     Area
Category             Value                     Category        Value (x × y)
walking              1.4 m/s (= 5 km/h)        small square    500 m × 500 m
inner city driving   13.9 m/s (= 50 km/h)      medium square   1000 m × 1000 m
highway driving      36.1 m/s (= 130 km/h)     large square    2000 m × 2000 m
                                               corridor        2000 m × 500 m

parameters (cf. table 5.4) result in an average communication range of 167 m; at a side length of 500 m for the small square area, the communication along one side requires at least two hops via three nodes. At walking speed, a node may walk along one side in roughly 6 minutes. For each larger square area the side lengths are simply doubled to 1000 m and 2000 m respectively. Using the smallest and the largest of these side lengths defines the corridor size.

Node Density

Node density is directly related to a node’s average neighbor count, i.e., the number of nodes that one node is able to send messages to with only a single hop, via the expected transmission range, $\bar{r}_{com}$, and the resulting node coverage area:

$$\text{node density} = \frac{\text{neighbor count} + 1}{\pi\,\bar{r}_{com}^{2}} \qquad (5.4)$$

To configure the simulation, equation (5.1) is used to compute the simulation run’s total node count based on the factors node density and area. Royer, Melliar-Smith, and Moser [RMM01] have shown that the neighbor count’s influence in a MANET varies with node speed. Due to the complexity of this relationship, it is unsuitable to be implemented into the configuration of the factorial experiment’s parameter study. Instead, the larger set of categories for this factor ensures that the relevant neighbor counts for the different node speeds, according to [RMM01], are present. Table 5.7 lists all node density related parameter values. The value for a medium node density corresponds to the neighbor count with the maximum number of delivered packets for node speeds of 1 m/s (walking speed) from [RMM01, Fig. 6], which is the closest matching node speed present in their results. The remaining neighbor count values have been derived by doubling the medium value once to get to high and by halving the medium value once to get to low and twice to get to very low.
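The step from neighbor count to node count can be sketched in Python as follows. The sketch assumes that the total node count is the node density of equation (5.4) multiplied by the simulation area and rounded to the nearest integer; the rounding rule and the function names are assumptions, not taken from the thesis.

```python
import math

R_COM = 167.0  # average communication range in m (table 5.4)


def node_density(neighbor_count, r_com=R_COM):
    """Nodes per square metre for a targeted average neighbor count, equation (5.4)."""
    return (neighbor_count + 1) / (math.pi * r_com ** 2)


def node_count(neighbor_count, x_size, y_size, r_com=R_COM):
    """Total node count for an area of x_size by y_size metres (rounding assumed)."""
    return round(node_density(neighbor_count, r_com) * x_size * y_size)


DENSITIES = {"very low": 2.5, "low": 5, "medium": 10, "high": 20}
small_square = {name: node_count(n, 500, 500) for name, n in DENSITIES.items()}
medium_square = {name: node_count(n, 1000, 1000) for name, n in DENSITIES.items()}
```

For the small square area this gives 10, 17, 31, and 60 nodes, and for the medium square area 40, 68, 126, and 240 nodes, which agrees with the corresponding columns of table 5.7.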

Table 5.7.: Definitions of node densities depending on average node speed.

                                 Node Count (dependent on simulation area)
Node Density   Neighbor Count    small sq.   medium sq.   large sq.   corridor
very low       2.5               10          40           160         40
low            5                 17          68           274         68
medium         10                31          126          502         126
high           20                60          240          969         240

Traffic Type and Other Traffic

Network traffic in the simulation results from the data flows that are configured via application instances on the network nodes. Each data flow consists of a stream of UDP datagrams that are transmitted at a constant interval. The transmission interval and payload size per transmission depend on the traffic type simulation parameter and are the same for all flows in a single simulation run. While the observed flow is present in all simulation runs, the number of additional flows, hereafter called other flows, that compete for network resources with the observed flow depends on the other traffic simulation parameter. The precise number of other flows results from evenly distributing a targeted number, $n_{flow}$, of other flows among all but the observed flow’s sending and receiving nodes and rounding up to ensure all these other nodes host the same number of other flows:

$$n_{flow}^{node} = \left\lceil \frac{n_{flow}}{\text{node count} - 2} \right\rceil \qquad (5.5)$$

Each of the other flows starts transmitting at a random start time that is drawn from a uniform distribution over the interval [0 s, 2 s). This spreads the transmission attempts to create an even network load over time and prevents network load fluctuations with a repeating pattern, which would be caused by all nodes transmitting at the same time. The destination node for all the other flows is the observed flow’s receiving node. This setup guarantees that the observed flow is influenced by all other flows in the network, at least at the final hop. The combinations of parameter values for each of the traffic type categories, as shown in table 5.8, ensure that the configured network load per flow is approximately identical for all three categories. The traffic type’s payload sizes originate from the heavy category, which was chosen to have a payload size of 2048 Byte, the largest power of 2 Byte that fits into an IEEE 802.11 Data-link layer frame’s maximum payload of 2304 Byte (without encryption), when accounting for a 20 Byte IPv4 header² and an 8 Byte UDP header [cf.

² 20 Byte is the smallest possible IPv4 header size; depending on the use of optional fields in the IPv4 header, it may be larger than that.

Table 5.8.: Traffic type and other traffic scenario parameter.

Traffic Type
Category   Send Interval   Payload Size
light      0.1 s           128 Byte
medium     1 s             1024 Byte
heavy      2 s             2048 Byte

Other Traffic
Category   Capacity   Flow Count
none       0%         0
low        10%        130
middle     50%        650
high       90%        1170

Man+06; RFC791; RFC768]. The transmission interval choices for the three traffic type categories are more arbitrary than for the payload parameter. There is no general reason to guide their specification; thus, the values are chosen to be reasonable for the application scenarios of chapter 3: 10 transmissions per s for control data and 1 transmission per s as well as 1 transmission per 2 s for telemedical data. For a reference on the scale of the specified values, a two lead electrocardiogram stream, sampled at 360 Hz with 11 bit resolution as available from the MIT-BIH Arrhythmia database [Gol+00; MM01], transmitted with 512 samples in a single message results in a mean payload size of 102 Byte and a send interval of 1.42 s [AG10]. Hence, all traffic type parameter combinations in table 5.8 result in a higher network load for a single stream than a two lead electrocardiogram stream. The targeted number of flows for each of the other traffic categories, as listed in table 5.8, is specified as a configured network load percentage of the network’s capacity. Despite the theoretical net bitrate of IEEE 802.11g being 54 Mbit/s, the average throughput, i.e., the average achievable bitrate transmitted over one link, is about 22 Mbit/s [Fli07]. Communication over more than one hop in a MANET at least halves this throughput further, i.e., to at most about 11 Mbit/s, because incoming and outgoing Data-link layer frames on a node have to share the available throughput. The traffic types in table 5.8 define data flows with a bitrate of roughly 8192 × 10^−6 Mbit/s; saturating the 11 Mbit/s throughput with these flows requires roughly 1300 of them, when neglecting further overhead by the routing protocols.
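The arithmetic behind the traffic parameters can be retraced with a small Python sketch. It only counts application payload (protocol headers are ignored), uses the rounded figure of 1300 saturating flows from the text, and applies equation (5.5) to one example configuration; all names are hypothetical.

```python
import math

TRAFFIC_TYPES = {  # category: (send interval in s, payload in Byte), table 5.8
    "light": (0.1, 128),
    "medium": (1.0, 1024),
    "heavy": (2.0, 2048),
}
MULTI_HOP_THROUGHPUT = 11e6  # bit/s, half of the 22 Mbit/s single-link average
SATURATING_FLOWS_ROUNDED = 1300  # rounded figure used in the text


def payload_bitrate(interval_s, payload_byte):
    """Application payload bitrate of one flow in bit/s (headers ignored)."""
    return payload_byte * 8 / interval_s


def flows_per_node(target_flows, node_count):
    """Equation (5.5): other flows hosted per node, excluding the two end-points."""
    return math.ceil(target_flows / (node_count - 2))


bitrates = {name: payload_bitrate(*params) for name, params in TRAFFIC_TYPES.items()}
saturating_flows = MULTI_HOP_THROUGHPUT / bitrates["heavy"]
targets = {pct: SATURATING_FLOWS_ROUNDED * pct // 100 for pct in (0, 10, 50, 90)}
example = flows_per_node(target_flows=targets[10], node_count=10)
```

The medium and heavy categories work out to 8192 bit/s per flow and the light category to 10240 bit/s, which is consistent with the load being approximately identical across categories. About 1343 heavy flows would saturate 11 Mbit/s; the rounded value of 1300 reproduces the flow counts 130, 650, and 1170 of table 5.8. In the example, 130 targeted flows on a network of 10 nodes give 17 other flows on each of the 8 hosting nodes.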

5.2. Method for Statistical Experiment Analysis

Measuring the end-to-end communication delay in the simulations’ data is trivial: The observed flow’s constant transmission interval creates a regular time-series of message transmissions at the Application level; for each message the send time at the sending node and the time of reception at the receiving node are recorded in the nodes’ Application layers. Unique message identifiers are used to match both timestamps. From these

Table 5.9.: Summary statistics that are calculated for each simulation run in the factorial experiment study.

Observed variable         Statistics
communication delay       median, 75th percentile, 95th percentile, and maximum
streaming window width    25th percentile, median, 75th percentile, and maximum
packet delivery ratio     25th percentile, median, 75th percentile, and maximum

timestamps, the regular time-series of end-to-end communication delays is calculated, in which the delays are ordered according to their message’s send time. Dropped packets appear as missing delay values in the time-series, and in case an Application layer message was received more than once, all but the first reception are discarded. From the communication delay time-series, two metrics are calculated that describe the communication interruption:

• the streaming window widths, each of which is the number of non-missing values in the delay time-series between one missing value and the next missing value
• the packet delivery ratio, which is the quotient of the number of non-missing values in the time-series to the sum of missing and non-missing values

While the packet delivery ratio is an aggregated value per simulation run, the communication delay time-series as well as the sequence of streaming window durations are aggregated using the order statistics listed in table 5.9. Order statistics are used instead of, e.g., mean and multiples of the standard error or variance, because both data sets do not follow a Gaussian distribution. In case of the communication delay, the higher percentiles are of special interest because they provide upper bounds for a large amount of the data points. To estimate how accurately the simulated networks reflect a simulation’s configuration regarding node density and network load, passive network metrics according to Petz et al. [Pet+11], which were introduced in section 4.4, are used. Specifically, each node records the network density and network load passive metric with the time interval ν = 1 s and the weight factor γ = 0.2. The divergence of each metric’s median, per simulation run, from the configured node density or the expected network load, respectively, is then normalized by the expected value and reported as residual, relative value.
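The construction of the delay time-series and the two interruption metrics can be sketched in Python as follows. The data layout (a dictionary of send times and a list of receptions in order of arrival) and all names are hypothetical. The sketch also assumes that runs of received messages at the very start or end of the series count as streaming windows and that consecutive dropped messages do not produce windows of width 0.

```python
def delay_series(sent, received):
    """Delays ordered by send time; None marks a dropped message.

    sent: {message id: send time}
    received: iterable of (message id, reception time) in order of arrival
    """
    first_reception = {}
    for msg_id, t_rx in received:
        first_reception.setdefault(msg_id, t_rx)  # discard duplicate receptions
    ordered = sorted(sent.items(), key=lambda item: item[1])
    return [
        first_reception[msg_id] - t_tx if msg_id in first_reception else None
        for msg_id, t_tx in ordered
    ]


def streaming_window_widths(delays):
    """Lengths of all runs of received messages between dropped ones."""
    widths, current = [], 0
    for delay in delays:
        if delay is None:
            if current:
                widths.append(current)
            current = 0
        else:
            current += 1
    if current:
        widths.append(current)
    return widths


def packet_delivery_ratio(delays):
    """Share of non-missing values among all values of the series."""
    return sum(delay is not None for delay in delays) / len(delays)


delays = [0.004, 0.005, None, 0.012, None, None, 0.003, 0.004, 0.006]
widths = streaming_window_widths(delays)
ratio = packet_delivery_ratio(delays)
```

In this example, six of nine messages are received, and the streaming window widths are 2, 1, and 3.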
In evaluating the simulation runs’ data, it became apparent that many simulations had a very low packet delivery ratio, i.e., only very few Application layer packets of the observed flow were actually received by the receiving node. With too few data points available for a single simulation run, the sample size for the statistics that aggregate communication delay and streaming window duration gets very low and the risk for

biased results increases. As a safeguard, the data for communication delay and streaming window width statistics from simulations with a packet delivery ratio of less than 10% is removed before proceeding with further analysis. The packet delivery ratio itself does not suffer from this effect, because its sample size always equals the number of transmitted Application layer packets, independent of their successful reception. Besides the filtering step, all independent variables, i.e., each contextual factor’s values, are normalized to have a mean value of 0 and a variance of 1:

$$x_{normalized} = \frac{x - \bar{X}}{\sqrt{s_X^2}}, \quad x \in X \qquad (5.6)$$

where $\bar{X}$ is the sample mean of X, the set of the sampled values, and $s_X^2$ is its variance. From this data, using the median for communication delay and streaming window width, the Spearman rank correlation coefficient, ρS ∈ {x ∈ R | −1 ≤ x ≤ 1}, is calculated for each of the considered contextual factors. This correlation coefficient measures the monotone association between a dependent and a single independent variable, i.e., to which extent small values of the independent variable are associated with small values of the dependent variable and large values of the independent variable are associated with large values of the dependent variable [cf. Pea11]. The transformation from equation (5.6) is positive monotone and preserves the properties assessed by this rank correlation. As such, the Spearman rank correlation coefficient describes the tendency of the contextual factors’ influence on each of the communication connectivity metrics. Linear models are a simple, yet powerful tool to describe the relationship between one or more explanatory or independent variables (here the normalized contextual factors) and an explained or dependent variable (here each of the communication connectivity metrics). While it may be tempting to fit a linear model that uses all of the contextual factors as explanatory variables to the data and then simply decide which factors are most important based on the model coefficients and their statistical significance values, Hyndman and Athanasopoulos [HA13] advise against this approach because it creates problems with different scales for the data, and even non-significant variables in the model may affect the other variables. Instead, they recommend fitting a single linear model per combination of explanatory variables and then choosing the model with the best performance and its variables. In the following, this latter approach is taken.
The benefit is that, depending on the model performance measure, the results for all three connectivity metrics and the different statistics used to aggregate them are comparable. Because the models are intended for description purposes and not for prediction, the same data is used for model fitting and to calculate the models’ goodness of fit. The measure used for the goodness of fit is the adjusted coefficient of determination, $\bar{R}^2$,

that was introduced with equation (2.6) in section 2.4. This ensures that adding model parameters that do not improve the model’s goodness of fit is penalized and that all performance measures have the same scale and hence are directly comparable. Figure 5.2 visualises the described data preparation and model fitting process for a single connectivity metric.

Figure 5.2.: Data preparation and explanatory model fitting to assess the contextual factors’ influence on a single connectivity metric. [Flow diagram labels: data sets (per node, per simulation run); connectivity metrics; filtered statistics (per simulation run, packet delivery ratio ≥ 0.1); data; simulation configuration parameters; all combinations {|v|}, {tr}, {n_flow}, ..., {|v|, tr}, ..., {|v|, tr, ..., l_y}; explanatory variables; explanatory models.]

Because the resulting linear models explain a connectivity metric in terms of the contextual factors, these models are called explanatory models. A single explanatory model is fitted per connectivity metric per statistic per combination of contextual factors; the models are handled as one set of models per connectivity metric. From the three sets of explanatory models, it is not readily visible which of the contextual factors are the ones that contribute most to the goodness of fit of a single model, of the models from a single connectivity metric, or of the models from all three metrics. To find these contributions, a new data set is created with the explanatory models’ $\bar{R}^2$ as the dependent variable and one binary valued indicator column per contextual factor that has the value 1 when the column’s factor is used as independent variable in the explanatory model and the value 0 otherwise. With this data, or subsets thereof, four linear models are fitted: one for each subset of rows that belong to one of the three connectivity metrics’ set of explanatory models and one on the full data set that combines all explanatory models. The four models that result from this second stage model fitting are called factor influence models. Figure 5.3 visualises the process to get from the three sets of explanatory models to the factor influence models.
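The fitting of one explanatory model per combination of contextual factors can be sketched in Python as follows. The sketch assumes the common definition of the adjusted coefficient of determination, 1 − (1 − R²)(n − 1)/(n − p − 1) with n observations and p explanatory variables, which may differ in notation from equation (2.6). It solves ordinary least squares via the normal equations, and the data and names are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a·x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def adjusted_r2(columns, y):
    """Fit y ~ intercept + columns by least squares; return the adjusted R^2."""
    n, p = len(y), len(columns)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)


def fit_all_subsets(factors, y):
    """Return {tuple of factor names: adjusted R^2} for every non-empty subset."""
    names = sorted(factors)
    return {
        subset: adjusted_r2([factors[name] for name in subset], y)
        for size in range(1, len(names) + 1)
        for subset in combinations(names, size)
    }


# Hypothetical data: the metric depends on speed but not on area.
factors = {
    "speed": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "area": [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
}
noise = [0.1, -0.2, 0.05, 0.0, -0.1, 0.2, -0.05, 0.0]
metric = [2.0 * s + e for s, e in zip(factors["speed"], noise)]
results = fit_all_subsets(factors, metric)
best = max(results, key=results.get)
```

With six contextual factors, the same enumeration yields 63 explanatory models per connectivity metric and statistic.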

Figure 5.3.: Data collection and factor influence model fitting to assess the contextual factors’ influence on the explanatory models’ goodness of fit. [Flow diagram labels: simulation configuration parameters; all combinations {|v|}, {tr}, {n_flow}, ..., {|v|, tr}, ..., {|v|, tr, ..., l_y}; explanatory variables; explanatory models for communication delay, packet delivery ratio, and streaming window width; $\bar{R}^2$ per model; factor influence models for communication delay, streaming window width, packet delivery ratio, and combined.]

The factor influence models are fitted without the use of a constant intercept coefficient. Hence, the resulting model coefficients directly estimate the absolute value that the inclusion of their associated contextual factor adds to the explanatory models’ $\bar{R}^2$. Because the explanatory models’ $\bar{R}^2$ may be interpreted as measuring the fraction of the explained data’s variance that the model is able to describe, the factor influence models’ coefficients quantify the fraction of variance that their associated contextual factor is able to explain, either for a single connectivity metric or all three metrics combined.

In addition to the contextual factors’ influence, the communication delay’s time-series properties are considered by analysing the partial auto-correlation that the delay expresses over all simulation runs. Auto-correlation is a measure for correlation between values from a single variable in a time-series with identical lag [cf. BJR94]. In partial auto-correlation, when assessing the correlation for a certain amount of lag, only the information is considered that is not expressed by the values with less lag.
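The second stage fit of a factor influence model can be sketched in Python as follows. The adjusted R² values are hypothetical and constructed to be exactly additive, so that the no-intercept fit recovers the assumed per-factor contributions; only three factors are used for brevity, and the names are illustrative.

```python
from itertools import combinations

FACTORS = ("speed", "area", "traffic")  # illustrative subset of the factors

# Hypothetical adjusted R^2 per explanatory model, keyed by the factors it uses.
CONTRIBUTION = {"speed": 0.5, "area": 0.2, "traffic": 0.1}
explanatory_r2 = {
    subset: sum(CONTRIBUTION[f] for f in subset)
    for size in range(1, len(FACTORS) + 1)
    for subset in combinations(FACTORS, size)
}


def fit_without_intercept(rows, y):
    """Least-squares coefficients for y ~ rows·beta without a constant term."""
    k = len(rows[0])
    # Augmented normal equations [X^T X | X^T y].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(k):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][k] for i in range(k)]


# One binary indicator column per factor: 1 if the explanatory model uses it.
indicators = [[1.0 if f in subset else 0.0 for f in FACTORS] for subset in explanatory_r2]
coefficients = dict(
    zip(FACTORS, fit_without_intercept(indicators, list(explanatory_r2.values())))
)
```

Without an intercept, each coefficient can be read directly as the amount that including the factor adds to the explanatory models’ goodness of fit; in this constructed example the fit returns 0.5, 0.2, and 0.1.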

5.3. Results

The two histograms in figure 5.4 show the deviation of the median neighbor count and network load, as estimated by the passive network metrics per simulation run, from the respectively configured values. Overall, the estimated neighbor count is close to 50% higher than configured, as indicated by the grey, dashed line; most simulations lie in the range between the configured neighbor count and double the configured value. The spike at −1 and the lower bars slightly above that value show that there are simulations in which nodes did not encounter any neighbors at all. The network load histogram in figure 5.4 has been truncated to contain only relative deviations that are smaller than 2 to be legible in the primary range of interest between −1 and 1. Overall, the measured network load is slightly above 50% of the configured value. The histogram is highly skewed towards low values and has its maximum at the −1 bin, which indicates no network load. Not shown in the histogram are 142 data points that lie at or above double the configured network load, the highest having 59 times the configured network load. An overview of the simulations’ packet delivery ratio is given by the histogram in figure 5.5. Its count axis uses a square root scale to make the low count numbers of higher ratios visible. In 1436 (42%) of the simulations, all packets from the observed flow were dropped; 400 (12%) of the simulations have a packet delivery ratio of 0.1 or higher.
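To read the histograms' axes, it may help to retrace the relative residual described in section 5.2 with a minimal Python sketch; the sample values and the function name are hypothetical.

```python
from statistics import median


def relative_residual(samples, configured):
    """Deviation of the per-run median from the configured value, relative to it."""
    return (median(samples) - configured) / configured


no_neighbors = relative_residual([0, 0, 0], configured=10)
doubled = relative_residual([18, 20, 22], configured=10)
```

A value of −1 thus corresponds to a measured median of zero, 0 to an exact match, and 1 to double the configured value.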

Figure 5.4.: Comparison of the passive network metrics’ median per simulation to the configured value. The grey dashed line indicates the overall median. (a) Histogram of the relative deviation of measured to configured neighbor count (axes: relative residual neighbor count, count). (b) Histogram of the relative deviation of measured to configured network load (axes: relative residual load (truncated), count).

Figure 5.5.: Histogram of the factorial experiment simulations’ delivery ratio of Application layer UDP packets from the observed flow. The plot’s count axis uses a square root scale. (Axes: packet delivery ratio from 0.00 to 1.00, count.)

Figure 5.6.: The observed flow’s end-to-end communication delay distribution from all factorial experiment simulation runs. (a) Estimated kernel density of delay (using Gaussian kernel; axes: delay in s, density). (b) Estimated kernel density of delay at logarithmic scale (using Gaussian kernel; axes: delay in s, density).

5.3.1. Observed Communication Delay

End-to-end communication delay in the factorial experiment is heavily concentrated at the lower end, with most (74%) observed values being smaller than 0.01 s. Another 11% of the observed values lie in the interval (0.3 s, 3 s). The smallest observed value is 0.0009 s and the largest is 32 s. Figure 5.6 shows the communication delay’s distribution by using its estimated kernel density; the right panel shows the same distribution at logarithmic scale, in which the detailed distribution at lower values becomes visible. The estimated kernel density shows that the most common observed values sharply cluster at or around discrete values. The communication delay’s order statistics that were calculated per simulation run, cf. table 5.9, all four have similar ranges, with their maximum and minimum values respectively having the same order of magnitude (max: 10^1 s, min: 10^−3 s). The statistics’ distributions are multimodal: median and 95th percentile have one dominating mode (median: 0.013 s, 95th percentile: 2.7 s), maximum has two very close dominating modes (2.1 s and 4.7 s), and the 75th percentile has two nearly equally high modes at 0.023 s and 1.4 s. Figure 5.7 contains the estimated kernel densities and box plots to characterize each statistic; both panels have a logarithmically scaled delay axis, which makes the differences in the values’ orders of magnitude visible. Table 5.10 contains exact values that help

Figure 5.7.: Observed distributions from the factorial experiment’s communication delay order statistics per simulation run. (a) Estimated kernel densities (using Gaussian kernel; axes: delay in s at logarithmic scale, estimated kernel density; one curve each for median, 75th percentile, 95th percentile, and max). (b) Box plots (axes: delay in s at logarithmic scale, statistic).

Table 5.10.: Exact values for the communication delay order statistics’ distributions of figure 5.7.

Statistic     min. [s]      1st q. [s]   median [s]   3rd q. [s]   max. [s]
median        13 × 10^−4    0.0071       0.023        0.20         13
75th perc.    18 × 10^−4    0.017        0.10         1.1          17
95th perc.    44 × 10^−4    0.26         1.4          3.1          27
maximum       44 × 10^−4    2.0          3.3          5.5          32

Data subset

5.3. Results

5% extreme observations removed

all observations

0

5

10

20

30

Spread [s]

Figure 5.8.: Spread of observed communication delay in each of the factorial experiment’s simulation runs. to interpret the figure. The box plots’ notches—showing the median’s 95% confidence interval—strongly suggest that the statistics’ medians differ. The interquartile ranges of median, 75th percentile, and 95th percentile span multiple orders of magnitude and have mostly overlapping data that is not considered to be outliers, as indicated by the box plots’ whiskers that represent data lying inside 1.5 times the interquartile range above respectively below half of the data points3 . The observed communication delay’s spread in a simulation run is, besides outliers, smaller than the statistics’ spread across all simulations. Figure 5.8 shows box plots for the communication delay’s spread from the individual simulation runs, once when calculating the spread over all observations per simulation run and once when excluding the 5% extreme observations from a simulation run before calculating the spread. Not considering the 5% extreme observations reduces the spread’s median (1.8 s) significantly in comparison to all observations (3.1 s).
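The per-run order statistics and the two spread variants described in this section can be illustrated with a minimal Python sketch. It is not the analysis code used for the thesis: the function names, the linear-interpolation quantile rule, and the even split of the removed 5% between both tails are assumptions made for illustration, and the sample run is invented.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two closest ranks."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def delay_statistics(delays):
    """Order statistics of one simulation run's observed delays [s]."""
    ordered = sorted(delays)
    return {
        "median": quantile(ordered, 0.50),
        "p75": quantile(ordered, 0.75),
        "p95": quantile(ordered, 0.95),
        "max": ordered[-1],
    }


def spread(delays, trim=0.0):
    """Range (max - min) after dropping a share `trim` of extreme values.

    Assumption: the trimmed share is split evenly between both tails.
    """
    ordered = sorted(delays)
    drop = int(len(ordered) * trim / 2 + 1e-9)
    kept = ordered[drop:len(ordered) - drop]
    return kept[-1] - kept[0]


# Invented run: many fast deliveries, some slow ones, two extreme values.
run = [0.005] * 30 + [0.5] * 8 + [3.0, 30.0]
print(delay_statistics(run))
print(spread(run), spread(run, trim=0.05))
```

In this invented run a single extreme delay dominates the untrimmed spread, which mirrors the reason for reporting the spread both with and without the 5% extreme observations.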

Footnote 3: The whiskers’ ranges are calculated on the transformed (logarithmic) scale and are as such suited to compare orders of magnitude in the data.

5.3.2. Observed Streaming Window Width

Windows of end-to-end interruption-free packet transmission in the observed flow are mostly very small, with 46% consisting of only a single transmission and 18% consisting of two transmissions. The maximum observed streaming window has 1909 transmissions without any interruption. Figure 5.9 shows the full distribution as a histogram, with the count axis using a square root scale to magnify the small counts of the larger streaming window widths. The resulting streaming window durations are shown as an estimated kernel density in the same figure. This kernel density has to be interpreted carefully because of two properties of the underlying data:

• Due to fixed transmission intervals, the streaming window duration only takes discrete values that are integer multiples of the experiment’s transmission intervals of 0.1 s, 1 s, and 2 s.
• The number of transmission attempts is inversely proportional to the transmission interval, i.e., there are twice as many transmissions with a 1 s interval as with a 2 s interval and 10 times as many transmissions with a 0.1 s interval as with a 1 s interval.

The streaming window duration’s two highest modes are 0.1 s and 2 s, followed by 0.2 s and 1 s with equal height.

Figure 5.9.: Streaming window width and duration from all factorial experiment simulation runs. Panels: (a) histogram of streaming window width [transmissions] (count axis at square root scale); (b) estimated kernel density of streaming window duration [s] (using Gaussian kernel with reduced bandwidth, duration axis at logarithmic scale).

Similar to the general distribution of streaming window width, its order statistics, cf. table 5.9, are skewed towards small values, with all four statistics having their global maximum at 1 transmission. For the 25th percentile, streaming windows wider than that are considered outliers, while the median statistic still has 50% of its window widths at this count. Figure 5.10 contains a histogram of the statistics’ distributions. Its streaming window width axis is truncated at 30 transmissions to prevent the bars from being displayed too thin. The box plot in the same figure covers the streaming window width’s full range at a logarithmic scale. Fractions of transmissions that occur in the box plot are caused by even transmission counts in a simulation run, which result in the order statistics being calculated as the arithmetic mean of two count values. Table 5.11 contains the precise values that characterise the streaming window width’s order statistics.

Figure 5.10.: Observed distributions from the factorial experiment’s streaming window width order statistics per simulation run. Panels: (a) histogram with individual bins per statistic (25th percentile, median, 75th percentile, max), with the streaming window width axis truncated at 30 transmissions; (b) box plot per statistic with the streaming window width axis [transmissions] at logarithmic scale.

Table 5.11.: Exact values for the streaming window width order statistics’ distributions of figure 5.10.

Statistic          min.   1st quartile   median   3rd quartile   max.
25th percentile    1      1              1        1              9
median             1      1              1        2              17
75th percentile    1      1              2        4              29.75
maximum            1      2              6        18             1909

Figure 5.11.: Spread of observed streaming window widths in each of the factorial experiment’s simulation runs. Box plots per data subset (all observations; 5% extreme observations removed) over the spread of streaming window width (truncated at 50 transmissions).

The observed streaming window widths’ spread in a simulation run is, except for the maximum statistic, larger than the statistics’ spread across all simulations. Figure 5.11 shows box plots for the streaming window widths’ spread from the individual simulation runs, once when calculating the spread over all observations per simulation run and once when excluding the 5% extreme observations from a simulation run before calculating the spread. The spread axis is truncated at a width of 50 transmissions to keep the important, lower part of the distributions legible. Not considering the 5% extreme observations reduces the spread’s median (3 transmissions) significantly in comparison to all observations (5 transmissions) and reduces the maximum spread from 1908 transmissions to 526.5 transmissions.
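How streaming window widths, their durations, and fractional order statistics come about can be sketched in a few lines of Python. This is a hedged illustration only: the function names and the reception sequence are invented, and the duration is assumed to be the width multiplied by the transmission interval, which matches the discrete multiples described above but is not spelled out as a formula in the text.

```python
from itertools import groupby
from statistics import median


def streaming_window_widths(received):
    """Lengths of all runs of consecutive successful transmissions."""
    return [
        sum(1 for _ in run)
        for success, run in groupby(received)
        if success
    ]


def streaming_window_durations(widths, interval):
    """Assumption: a window of w transmissions lasts w * interval seconds."""
    return [round(width * interval, 9) for width in widths]


# Invented reception record of one flow (True = packet delivered).
received = [True, False, True, True, False, False, True, True, True, False, True]
widths = streaming_window_widths(received)
print(widths)
print(streaming_window_durations(widths, interval=0.1))
# An even number of windows makes the median the mean of two counts,
# which is how fractional widths can appear in the order statistics.
print(median(widths))
```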

5.3.3. Rank Correlations and Explanatory Linear Models

Table 5.12 presents the Spearman rank correlations between the factorial experiment’s configuration parameters, cf. table 5.1, and the connectivity metrics’ medians. The only strong rank correlation (0.81) is found between the transmission interval, $tr$, and the streaming window duration. The next strongest rank correlation (−0.44) is found between the number of data flows in the network, $n_{flow}$, and the streaming window width. Barring the streaming window duration, the simulation area’s x-side length, $l_x$, has a medium rank correlation with the other metrics: communication delay (0.39), streaming window width (−0.23), and packet delivery ratio (−0.31). Node speed, $|v|$, has a small to medium rank correlation with all four metrics: communication delay (0.18), streaming window duration (−0.12), streaming window width (−0.20), and packet delivery ratio (−0.23). In contrast, the neighbor count, $n^{conf}_{nghb}$, has only very small rank correlations, of which the strongest is with the streaming window width (−0.09). Apart from the transmission interval, all other configuration parameters have a stronger rank correlation with the streaming window width than with the streaming window duration.

Table 5.12.: Spearman rank correlation coefficients of the factorial experiment’s configuration parameters to the connectivity metrics’ median. The configuration parameter symbols are defined in table 5.1.

Metric                       $l_x$    $l_y$    $n^{conf}_{nghb}$   $|v|$    $tr$     $n_{flow}$
delay                        0.39     0.22     0.07                0.18     −0.14    −0.06
streaming window duration    −0.10    −0.05    −0.04               −0.12    0.81     0.08
streaming window width       −0.23    −0.10    −0.09               −0.20    −0.09    −0.44
packet delivery ratio        −0.31    −0.19    0.03                −0.23    0.23     −0.25

Explaining the observed communication delay statistics with a linear model that uses the factorial experiment’s configuration parameters, cf. table 5.1, as explanatory variables achieves its maximum goodness of fit for the maximum statistic ($\bar{R}^2 = 0.39$) using the parameters $l_x$, $n^{conf}_{nghb}$, $|v|$, $n_{flow}$, and the routing protocol; the runner-up model ($\bar{R}^2 = 0.38$) additionally uses the $tr$ parameter. The latter model’s parameters are the same that achieve the best goodness of fit for the 95th percentile ($\bar{R}^2 = 0.26$) and the 75th percentile ($\bar{R}^2 = 0.21$) statistics. The median statistic models have the worst fit to the data, with the best parameter combination ($l_x$, $l_y$, $|v|$, $tr$, $n_{flow}$, and routing) reaching $\bar{R}^2 = 0.10$. The maximum statistic is the only one that has steep increases in goodness of fit, the largest of which is caused by the inclusion of the routing protocol as model parameter. Figure 5.12 shows the goodness of fit for all communication delay statistics’ explanatory models.

Figure 5.12.: Communication delay explanatory models’ goodness of fit in ascending order per statistic. Axes: adjusted coefficient of determination over the delay regression models, individually ordered (ascending) per statistic (median, 75th percentile, 95th percentile, max).

For the communication interruption metrics, the linear models to describe the streaming window width achieve the best goodness of fit for the 75th percentile statistic ($\bar{R}^2 = 0.48$) with the model parameters $l_x$, $l_y$, $n^{conf}_{nghb}$, $|v|$, $tr$, $n_{flow}$, and routing protocol, closely followed ($\bar{R}^2 = 0.48$) by the same parameter combination excluding $tr$. The median statistic has a slightly lower goodness of fit ($\bar{R}^2 = 0.43$) with the same best fitting parameter combination as the 75th percentile statistic; here, the runner-up model ($\bar{R}^2 = 0.42$) uses the same parameter combination excluding $n^{conf}_{nghb}$. The best goodness of fit for the maximum statistic ($\bar{R}^2 = 0.34$) is achieved by the parameter combination $l_x$, $l_y$, $|v|$, $tr$, and routing; its runner-up parameter combination ($\bar{R}^2 = 0.34$) additionally includes $n^{conf}_{nghb}$. For the streaming window width, the 25th percentile statistic is explained with the worst goodness of fit compared to the other statistics. The best parameter combination ($\bar{R}^2 = 0.25$) is the same as the best combination of the 75th percentile and median statistics; the next best combination ($\bar{R}^2 = 0.25$) is identical to the median statistic’s second best. Figure 5.13 shows the goodness of fit for all streaming window width statistics’ explanatory models. The streaming window duration is not considered further; details are given below in the discussion section 5.4.

Figure 5.13.: Streaming window width explanatory models’ goodness of fit in ascending order per statistic. Axes: adjusted coefficient of determination over the streaming window width regression models, individually ordered (ascending) per statistic (25th percentile, median, 75th percentile, max).
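The two measures used throughout this section, Spearman’s rank correlation and the adjusted coefficient of determination, can be written down compactly. The following Python sketch uses the textbook definitions (Pearson correlation of average ranks, and the usual adjustment for a model with intercept); the function names are illustrative and the snippet is not taken from the thesis’ analysis code.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted coefficient of determination for a model with intercept."""
    return 1 - (1 - r_squared) * (n_observations - 1) / (n_observations - n_predictors - 1)
```

Because the adjustment penalises every additional predictor, two parameter combinations that differ by one parameter can end up with nearly identical adjusted values, as seen for several best and runner-up models above.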

Figure 5.14.: Packet delivery ratio explanatory models’ goodness of fit in ascending order per statistic. Axes: adjusted coefficient of determination over the packet delivery ratio regression models, ordered ascending.

The packet delivery ratio’s explanatory model goodness of fit has its maximum ($\bar{R}^2 = 0.50$) for the parameter combination $l_x$, $l_y$, $n^{conf}_{nghb}$, $|v|$, $tr$, $n_{flow}$, and routing protocol, which is identical to the streaming window width’s best performing model for the 25th percentile, median, and 75th percentile statistics. The second best goodness of fit ($\bar{R}^2 = 0.50$) is achieved by the same parameter combination excluding $n^{conf}_{nghb}$. Figure 5.14 shows the goodness of fit for all packet delivery ratio explanatory models. The last two steep steps are caused by inclusion of the parameters $|v|$ and $tr$, respectively, into the model.

Figures 5.15, 5.16, and 5.17 provide an aggregated overview of the estimated model coefficients from the explanatory models whose goodness of fit is at most 5% below that of a statistic’s best performing model per metric. Each plot aggregates the model coefficients for the same factor from all statistics per metric in a single box plot. In the following, only the factors with absolute coefficient medians larger than or equal to 1 are reported. While this value does not express any form of significance, it filters out the coefficients that are very close to 0 and as such have only a small influence on the explained values’ magnitudes.

Figure 5.15.: Estimated coefficients for the communication delay explanatory models.
Figure 5.16.: Estimated coefficients for the streaming window width explanatory models.
Figure 5.17.: Estimated coefficients for the packet delivery ratio explanatory models.
Each of the three figures shows one box plot of the coefficient per predictor: intercept, neighbor count, routing DSR, routing OLSR, speed, other traffic, traffic type, xsize, and ysize.

For the communication delay explanatory models, three factors have median coefficients with absolute values that are orders of magnitude larger than the remaining coefficients: $l_x$ (median: 691, IQR: 199 to 854), $n_{flow}$ (median: −405, IQR: −444 to −93), and $l_y$ (median: 117, IQR: 27 to 205). At the same time, these coefficients’ observed ranges of values are very large in comparison, as is evident from the interquartile ranges (IQRs). The other factors with an absolute median coefficient value larger than 1 are: $n^{conf}_{nghb}$ (median: −8.8, IQR: −9.2 to −3.8), $|v|$ (median: 13.4, IQR: 3.0 to 14.5), DSR routing protocol (median: 7.4, IQR: 2.1 to 8.2), and the intercept coefficient (median: 5.0, IQR: 1.0 to 5.3).

For the streaming window width explanatory models, only the DSR routing protocol parameter’s median coefficient has a single digit order of magnitude (median: −2, IQR: −4.0 to −0.5). The simulation area factors $l_x$ (median: −2123, IQR: −29 782 to −778) and $l_y$ (median: −1065, IQR: −19 139 to −385) have median coefficients whose absolute values are at least one order of magnitude larger than the remaining factors’ coefficients and whose spread is multiple orders of magnitude larger. One order of magnitude smaller are the absolute median coefficients of $|v|$ (median: −752, IQR: −801 to −45) and $n_{flow}$ (median: −138, IQR: −664 to 1439). Note that the $n_{flow}$ factor’s coefficient range spans positive and negative values. Of the remaining factors’ coefficients, $n^{conf}_{nghb}$ (median: −17, IQR: −26 to −2.2), $tr$ (median: −79, IQR: −80 to −0.2), OLSR routing protocol (median: −82, IQR: −88 to −4.9), and the intercept coefficient (median: 70, IQR: 3.8 to 73), only the intercept coefficient is strictly positive. The $tr$ coefficient range has its maximum at 0.2, making it span across 0.

For the packet delivery ratio explanatory models, four factors have coefficients with an absolute median value above 1: $l_x$ (median: −72, IQR: −75 to −68), $l_y$ (median: −9.7, IQR: −9.9 to −9.5), $|v|$ (median: −1.1, IQR: −1.1 to −1.0), and $n_{flow}$ (median: −40, IQR: −40 to −39). The coefficients’ spreads are small compared to the coefficients’ spreads found in the other metrics’ explanatory models.
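The selection and aggregation step behind figures 5.15 to 5.17 can be sketched as follows. The sketch rests on several assumptions that the text does not settle: the 5% tolerance is read as relative to the best model’s adjusted coefficient of determination, quartiles are computed with the inclusive method of Python’s `statistics.quantiles`, and the model records (`adj_r2`, `coefficients`) as well as all numbers in the example are invented.

```python
from statistics import median, quantiles


def near_best_models(models, tolerance=0.05):
    """Keep models whose fit is at most `tolerance` (relative) below the best."""
    best = max(model["adj_r2"] for model in models)
    return [m for m in models if m["adj_r2"] >= (1 - tolerance) * best]


def coefficient_summary(models, factor):
    """Median and interquartile range of one factor's coefficient."""
    values = [m["coefficients"][factor] for m in models if factor in m["coefficients"]]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"median": median(values), "iqr": (q1, q3)}


# Invented model records; not results from the thesis.
models = [
    {"adj_r2": 0.50, "coefficients": {"xsize": 1.0, "speed": 0.2}},
    {"adj_r2": 0.49, "coefficients": {"xsize": 2.0}},
    {"adj_r2": 0.48, "coefficients": {"xsize": 4.0, "speed": 0.4}},
    {"adj_r2": 0.30, "coefficients": {"xsize": 9.0}},
]
selected = near_best_models(models)
print(len(selected), coefficient_summary(selected, "xsize"))
```

Filtering first and aggregating afterwards keeps poorly fitting models, such as the last invented record, from distorting a factor’s coefficient distribution.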

5.3.4. Autocorrelation in Communication Delay Time-Series

The communication delay time-series’ partial autocorrelation is significant only for a lag of 1, which has a median autocorrelation of 0.27, while its first quartile (0.029) has a magnitude outside the significant range for autocorrelation (> 0.047, < −0.047); the third quartile is at 0.67, cf. figure 5.18. All remaining lags have a median autocorrelation of approximately 0 (the approximation’s largest error is 0.013). For a lag of 2, both the first (−0.13) and third quartile (0.14) have a magnitude inside the significant range, and with a lag of 3 this is still true for the third quartile (0.13). Some of the remaining lags have first or third quartiles that are significant but still small compared to the lags 1 to 3. Lags 4 to 30 all exhibit a similar distribution of partial autocorrelation: their first and third quartiles are close to the significance level, the part of the data represented by the box plot whiskers extends into the significant range up to ±0.25, and outliers exist that have full positive or negative correlation.

Figure 5.18.: Partial autocorrelation according to [BJR94], aggregated as box plots for the communication delay time-series from all simulation runs. The grey dashed lines indicate the significance limits. Axes: partial autocorrelation (−1.0 to 1.0) over lag (1 to 30).
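As a rough illustration of how partial autocorrelations and their significance limits can be obtained, the following Python sketch estimates the sample autocorrelation, applies the Durbin-Levinson recursion, and uses the common large-sample limit of 1.96 divided by the square root of the series length. Whether the thesis used exactly this estimator and this limit is an assumption; the function names are illustrative.

```python
from math import sqrt


def autocorrelations(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a time series."""
    n = len(series)
    mean = sum(series) / n
    centered = [value - mean for value in series]
    variance = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - lag] for t in range(lag, n)) / variance
        for lag in range(max_lag + 1)
    ]


def partial_autocorrelations(acf):
    """Durbin-Levinson recursion; returns the values for lags 1 .. len(acf) - 1."""
    pacf = []
    previous = []  # AR coefficients of the model of order k - 1
    for k in range(1, len(acf)):
        numerator = acf[k] - sum(previous[j] * acf[k - 1 - j] for j in range(k - 1))
        denominator = 1 - sum(previous[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = numerator / denominator
        current = [previous[j] - phi_kk * previous[k - 2 - j] for j in range(k - 1)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        previous = current
    return pacf


def significance_limit(n_observations, z=1.96):
    """Approximate 95% limit for a partial autocorrelation of white noise."""
    return z / sqrt(n_observations)


# A geometric autocorrelation function belongs to a first-order process,
# so only the lag-1 partial autocorrelation is non-zero.
print(partial_autocorrelations([1.0, 0.5, 0.25, 0.125]))
```

Under this approximation, a limit of ±0.047 would correspond to a series length of roughly 1700 observations; this is an inference from the formula and not a figure stated in the text.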

5.3.5. Factor Influence Models

For each configuration parameter, or rather the contextual factor it represents, the influence of its inclusion in a model on the model’s goodness of fit is expressed by the factor influence models. The factor influence models’ goodness of fit to the explanatory models’ performances varies for the individual metrics. The fit is best for the packet delivery ratio ($\bar{R}^2 = 0.98$), followed by the streaming window width ($\bar{R}^2 = 0.84$). The fit for the communication delay explanatory models is worst ($\bar{R}^2 = 0.75$), with the combined model being slightly better ($\bar{R}^2 = 0.77$).

Figure 5.19 contains the estimated model coefficients and their standard errors for all four factor influence models. The maximum coefficient standard error (0.0069) is an order of magnitude smaller than most coefficient values. In most cases, the factors have a stronger influence on a single explanatory model than on the combined model. The traffic type factor, $tr$, is the sole exception, with a coefficient of 0.08 for the combined factor influence model. Two outstanding factors are $l_x$ (xsize) and routing. The xsize factor has the strongest influence on a single explanatory model of all factors: its coefficient for the packet delivery ratio factor influence model is 0.14. The second strongest influence is that of the routing parameter on the communication delay explanatory model, with a coefficient of 0.13. Besides its strong influence on the communication delay explanatory model, the routing factor’s lowest coefficient (0.046), for its influence on the streaming window width explanatory model, is still higher than most other factors’ coefficients. The neighbor count coefficient for the communication delay (0.03 with $p = 1.32 \times 10^{-5}$) is the only coefficient of this factor that is significant at 95% confidence. The ysize factor’s coefficients for all factor influence models except that of the communication delay have similar values and significance. Apart from the two mentioned factors routing and neighbor count, the factors’ influence on the communication delay explanatory models’ goodness of fit is lower than on the other explanatory models. Only the speed factor has a low (0.03 with $p = 1.03 \times 10^{-7}$) but significant influence.

Figure 5.19.: Estimated model coefficients and their standard error for the factor influence models: separately for all three connectivity metrics’ explanatory models and combined for all explanatory models taken together. Axes: model coefficient (0.00 to 0.15) per explanatory variable (neighbor count, routing, speed, other traffic, traffic type, xsize, ysize), grouped by model subset (combined, delay, packet delivery ratio, streaming window width).

5.4. Discussion

5.4.1. Simulation Performance

The passive network metrics used in figure 5.4 have to be interpreted with care. While the network load actually accumulates all network traffic, Petz et al. [Pet+11] report specificity issues with the passive node density metric. It relies on the successful overhearing of frames that carry a sender’s identifier; this excludes all acknowledgment frames. Furthermore, a node that does not transmit any frame during the time interval ν = 1 s is not recognized. Especially the simulation runs without any other flow besides the observed one are susceptible to underestimating the node density due to this effect, because nodes that are not part of the observed flow’s route only communicate due to activity of their routing protocol. This explains the large spike of simulation runs with, in median, no encountered neighbors. In simulations with other traffic, every node hosts one or more data flows. Hence, the probability of a new packet originating at a node or moving through it on its route during a 1 s interval is close to 1. Consequently, for the majority of simulation runs, nodes experience an equal or higher neighbor count than configured. The median estimated neighbor count is 50% larger than the configured neighbor count, which suggests that the average communication range that is used to calculate the actually configured node counts from the intended node densities lies below the true expected value.

Regarding the network load, 25% of the factorial experiment’s simulation runs have the observed flow as the only contributor towards network load, besides the communication overhead caused by the routing protocol. Hence, in these runs, nodes only perceive network load if and while they are part of the observed flow’s route. If more than half of the simulation’s nodes are not part of that route at a given time, the use of the median as aggregating statistic causes the simulation run to appear with no network load or only overhead network load. Figure 5.4 shows that the routing protocols’ overhead is small compared to the configured network load.

When further interpreting the factorial experiment’s simulation results, it is important to consider that they result from an artificial scenario setup. Consequently, the results must not be applied to real networks without further reasoning. That being said, the models that underlie the simulation are state of the art. Measures have been taken to parametrize the simulation reasonably, and this type of simulation is a commonplace tool to research, test, and evaluate new networking-related techniques.

5.4.2. Connectivity Metrics

Communication Delay

Although 74% of the observed end-to-end transmissions have a delay below 0.01 s, the median’s and 75th percentile’s spreads, cf. figure 5.6, show that the observed delays in a considerable number of simulations are too high to directly support the real-time communication requirements of application scenario 2: the median statistic’s 3rd quartile violates the delay requirement of 0.1 s and half of the simulations have their communication delay’s 75th percentile above this mark. This strengthens the premise of the thesis that applications need to be aware of the network’s current and forthcoming connectivity to be able to adapt to it. The communication delay requirements of application scenario 1 are covered much better: the delay’s 95th percentile statistic has its 3rd quartile at the 3 s mark, indicating that for 75% of the simulations, 95% of the successful transmissions fall into the smallest delay category. The communication delay’s observed spread, cf. figure 5.8, creates a similar situation: in a single simulation run, i.e., in a specific context, the communication delay’s spread is too large for application scenario 2. But for application scenario 1, the spread that is observed in 75% of the simulations would only cause a change from the smallest delay category into the middle category and not beyond.

Communication Interruption

The observed flows’ packet delivery ratios are exceptionally low, cf. figure 5.5. This highlights the challenge of using MANETs in highly dynamic environments for dependable communication. Typical measures to counter this phenomenon are packet retransmissions on the Transport or Application layer, either with packet reception acknowledgements and retransmission timers or with upfront redundant transmissions of each packet. While such mechanisms are practically important, they are outside the direct scope of this work. To configure simulation scenarios with fewer interruptions, the rank correlations and explanatory model coefficients for the packet delivery ratio show that choosing a small simulation area size, a small number of concurrent flows in the network, or a low node speed are measures to improve the packet delivery ratio.

Similar to the low packet delivery ratio, the number of consecutive successful transmissions, expressed in the streaming window width metric, is very low: even for the maximum streaming window width statistic, cf. figure 5.10, 25% of the simulation runs do not achieve a width better than 2 transmissions. This prevalence of very small streaming window widths emerges in the streaming window duration’s estimated kernel density as well, cf. figure 5.9. The distribution’s dominant modes all correspond to 1 or 2 consecutive transmissions. Likewise, the high rank correlation between transmission interval and streaming window duration and the otherwise very low rank correlations of the metric support this finding, cf. table 5.12. Given this dependence, there is not enough evidence in this data for an influence of the factor time on the communication interruption metrics to further pursue its analysis. Hence, the streaming window duration is not considered further as a metric for the communication interruption. The remaining analysis is carried out using packet delivery ratio and streaming window width.

5.4.3. Influencing Factors

The explanatory models solely include main effect terms of the contextual factors and an intercept term as model parameters. Although possible in principle, no higher order terms, transformations other than normalization, or parameter interaction terms were included in the analysis. While these additional terms have the potential of providing deeper insight and better goodness of fit for the models, the resulting number of terms would cause a combinatorial explosion of the parameter combinations. The current set of 7 configuration parameters that represent the investigated contextual factors already incurs $2^{7} - 1 = 127$ parameter combinations that have to be considered. Every additional model parameter doubles the number of combinations; adding, e.g., second order terms would thus result in $2^{14} - 1 = 16383$ combinations, whereas adding an interaction term for each pair of configuration parameters would add $\binom{7}{2} = 21$ to the number of model parameters and cause $2^{28} - 1 = 268\,435\,455$ model parameter combinations.

The ordinary method to pick the factors to use for further modelling would be to choose all parameters from the best model in the cross-validation step that resulted in the explanatory models’ ranking according to their adjusted coefficient of determination. Simply combining the factors from all three metrics’ statistics then yields the set of parameters to pursue. But this approach has multiple drawbacks:

• In the present case, all parameters would be selected and the goal of reducing the set of factors would not be achieved.
• All factors would be treated as equal despite the fact that the change in the explanatory models’ goodness of fit shows that the factors have differing levels of importance to the models’ performances.
• No discrimination of factors due to their originating model’s goodness of fit is incorporated into the decision.
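The combination counts given in this section can be checked with a few lines of Python. The parameter labels below are illustrative stand-ins for the seven configuration parameters and are not identifiers from the simulation study.

```python
from itertools import combinations
from math import comb

# Illustrative labels for the seven configuration parameters.
PARAMETERS = ["lx", "ly", "n_nghb_conf", "v", "tr", "n_flow", "routing"]


def non_empty_subsets(terms):
    """Yield every non-empty combination of the given model terms."""
    for size in range(1, len(terms) + 1):
        yield from combinations(terms, size)


def subset_count(n_terms):
    """Number of non-empty subsets of n_terms candidate terms."""
    return 2 ** n_terms - 1


main_effects = len(PARAMETERS)
pairwise_interactions = comb(main_effects, 2)

print(sum(1 for _ in non_empty_subsets(PARAMETERS)))       # main effects only
print(subset_count(2 * main_effects))                      # plus second order terms
print(pairwise_interactions)                               # number of pairwise interactions
print(subset_count(main_effects + pairwise_interactions))  # main effects plus interactions
```

Enumerating the 127 main-effect subsets is cheap, whereas the last count makes plain why an exhaustive search over interaction terms was not pursued.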
Due to the mentioned drawbacks, a more creative approach to the selection of factors is taken: to identify the important influential factors, the information from the factor influence models, cf. figure 5.19, is considered together with the explanatory models’ estimated coefficients, cf. figures 5.15, 5.16, and 5.17. From this data it becomes apparent that the factors that have model coefficients with large absolute medians and at the same time a large spread ($l_x$, $l_y$, and $n_{flow}$ for communication delay and $l_x$ and $l_y$ for streaming window width) have very little influence on the explanatory models’ quality. The factors with large absolute median coefficients and relatively low spread ($l_x$ and $n_{flow}$ for the packet delivery ratio), on the other hand, have a strong influence according to the factor influence model.

Care has to be taken when interpreting the $l_y$ factor’s influence: three of the four simulation area categories are squares. Hence, in most configurations $l_x$ and $l_y$ do not vary independently and consequently their effects are susceptible to being confounded. The factor influence model suggests that the $l_y$ parameter has only low influence, which is not even significant in the case of the communication delay explanatory models. In this regard it makes more sense to interpret the $l_x$ parameter not as a part of an area contextual factor, but as the geographic distance between the communication’s end-point nodes.

According to the factor influence model, only the routing protocol has more than a minor influence on the communication delay. This finding is supported by the comparatively low goodness of fit of the communication delay’s explanatory models, of which even the by far best value of $\bar{R}^2 = 0.39$ for the maximum statistic is still low. Among the rank correlations, solely the $l_x$ parameter’s value suggests a stronger influence than is expressed by the factor influence model.

To describe the streaming window width, the factor influence model suggests the factors $|v|$, $n_{flow}$, and $tr$ and to a lesser extent $l_x$ and the routing protocol. The estimated explanatory model coefficients for these factors support the case for selecting $l_x$ and $|v|$. The other two parameters are discarded because their estimated model coefficients range very close to 0 or span both positive and negative values.

Choosing factors for the packet delivery ratio is more obvious than for the other metrics: the factor influence model suggests the factors $l_x$, $|v|$, $n_{flow}$, and routing protocol. The estimated explanatory model coefficients strongly support selecting $l_x$ and $n_{flow}$. Both $|v|$ and routing protocol have very small estimated coefficients, but the coefficient spread from maximum to minimum value is less than 0.015. The latter fact speaks against dismissing them, despite their small coefficients. This decision is in line with the factor influence model’s suggestion.

5.5. Conclusion

In this chapter, an original, systematic experimental design has been derived for a simulation study that uses the discrete event simulator OMNeT++ and considers the two application scenarios from chapter 3. The simulation study’s statistical analysis was used to gain insight into the influence that contextual factors have on the connectivity metrics. From the considerations in the discussion section above, the contextual factors $l_x$ (as the communication end-point nodes’ geographic distance), $|v|$ (node speed), and $n_{flow}$ (network load) remain as the influential contextual factors that are further considered for use in the prediction models in chapter 6. The routing protocol as technical configuration parameter shows significant influence that has to be accounted for when considering networks with different routing protocols, but it is not considered a contextual factor that is subject to scenario influences.

No prevailing influence from contextual factors on the communication delay was found. The communication interruption metrics, on the other hand, are influenced by contextual factors: the communication end-point nodes’ geographic distance and the node speed have a dominating influence on the communication streaming window width, and both these and the network load have a dominating influence on the packet delivery ratio. This answers research question 1.

While no contextual factors were found that have a considerable influence on the communication delay, it exhibits some extent of autocorrelation that may be exploited by time-series forecasting methods. This may already be sufficient to fulfill the requirements of the telemedicine application scenario. Further, it remains to be investigated how to exploit the influence of momentary contextual factors that cannot be considered as an aggregate over a complete simulation run, such as network structure or actual network load on the nodes that form the observed flow’s route.


6. Connectivity Prediction for Mobile Ad Hoc Networks

6.1. Mathematical Notations for the Network Model

The communication network created by a MANET is modeled as the directed graph $D$, expressed as the ordered set

\[
D = (N, A) \quad \text{with } (\nu_i, \nu_j) \in A \text{ and } \nu_i, \nu_j \in N \tag{6.1}
\]

with $N$ being the set of nodes in the network and $A$ being the set of arcs [cf. San05]. An arc $(\nu_i, \nu_j) \in A$ is an ordered pair of nodes; its existence in $A$ expresses the capability of node $\nu_j$ to receive Data-link layer frames that are transmitted by node $\nu_i$. This single arc is sufficient to transmit broadcast messages from node $\nu_i$ to node $\nu_j$, but regular unicast communication requires symmetric links, and thus both $(\nu_i, \nu_j) \in A$ and $(\nu_j, \nu_i) \in A$ must hold. With one of the two arcs missing, either the data frame is never received by $\nu_j$, or the subsequent acknowledgement frame is never received by $\nu_i$, or the data frame is not transmitted in the first place, because node $\nu_i$, never having received any transmission from node $\nu_j$, is not aware of its outgoing arc to that node. Although unicast communication requires symmetric edges between nodes, modelling them as two directed arcs helps to understand the local view that each node has of the network.

Data in the network is exchanged using messages, with $M$ denoting the set of all messages. In accordance with figure A.1, $M$ contains the subset $M^{\mathrm{dl}} \subseteq M$ of all messages belonging to the Data-link layer, called frames, and the subset $M^{\mathrm{net}} \subseteq M$ of all messages belonging to the Network layer, called packets; the other layers are not of concern at this point and are omitted for brevity.

Section 2.3 defines flow in accordance with the IPv6 Flow Label Specification. To use flow as a concept in the network model, let $\varphi$ denote a flow as the 3-tuple

\[
\varphi = (\nu_o, \nu_d, B) \quad \text{with } \nu_o, \nu_d \in N \text{ and } B = (b_1, \ldots, b_n),\; b_i \in M^{\mathrm{net}},\; i = 1, \ldots, n,\; n \in \mathbb{N} \tag{6.2}
\]

of origin node, $\nu_o$, destination node, $\nu_d$, and $B$, the sequence of Network layer packets that belong to the flow, in the order transmitted by $\nu_o$. A flow is active at time $t$ if at least one packet belonging to the flow has been transmitted by the origin node either before or at time $t$ and at least one packet belonging to the flow will be received by the destination node either at or after time $t$. The set $\Phi$ is the set of all flows, i.e., $\varphi \in \Phi$, with $\Phi(t) \subseteq \Phi$ being the set of flows that are active in the network at time $t$. Let the function

\[
\mathrm{route} : M^{\mathrm{net}} \to N \times \ldots \times N, \quad b \mapsto (\nu_i)_{i=1}^{n},\; \nu_i \in N,\; n \in \mathbb{N} \tag{6.3}
\]

assign to each Network layer packet $b \in M^{\mathrm{net}}$ the sequence of nodes that the packet travels from the origin node, $\nu_1$ (the first element in the sequence), to the destination node, $\nu_n$ (the last element in the sequence), as determined by the routing algorithm employed in the network. Here, $n$ is the number of nodes in the sequence, which is one larger than the route's hop-count. Let the function

\[
\mathrm{packets} : \Phi \to M^{\mathrm{net}} \tag{6.4}
\]

map a flow to the sequence of its packets. Further, let $R$ denote a set of routes which, when used without subscript or parameter, is the set of all routes in existence in a network during the time period under consideration; a subscript denotes all routes of a flow, and a time parameter, $t$, denotes the routes active at the indicated time or time period. With $\rho_b = \mathrm{route}(b)$ denoting the route of packet $b$:

\[
R_\varphi = \{ \rho_b \mid \rho_b \in R,\; b \in \mathrm{packets}(\varphi) \} \tag{6.5}
\]

\[
R_\varphi(t) = \{ \rho_b \mid \rho_b \in R(t),\; b \in \mathrm{packets}(\varphi) \} \tag{6.6}
\]

An active route at time $t$ is a route along which at least one packet is traveling at time $t$; then $R(t)$ is the set of all active routes at time $t$, $R_\varphi$ is the set of routes utilized by all packets of flow $\varphi$, and $R_\varphi(t)$ is the set of active routes at time $t$ utilized by the packets of flow $\varphi$. The set of routes interfering with flow $\varphi$ at time $t$, written as $R'_\varphi(t)$, is the set of active routes at time $t$ that have nodes in common with the routes of flow $\varphi$ active at the same time:

\[
R'_\varphi(t) = \{ \rho' \mid \rho' \in R(t) \setminus R_\varphi(t),\; \exists \rho \in R_\varphi(t) : \rho' \cap \rho \neq \emptyset \}. \tag{6.7}
\]
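The set definition in equation (6.7) can be illustrated with a minimal Python sketch. Representing a route as a tuple of node identifiers and the function name `interfering_routes` are assumptions made for this illustration only; they are not part of the network model above.

```python
from typing import Set, Tuple

Route = Tuple[str, ...]  # assumption: a route is a tuple of node identifiers


def interfering_routes(active: Set[Route], flow_active: Set[Route]) -> Set[Route]:
    """Return R'_phi(t) as in eq. (6.7).

    active      -- R(t), all routes active at time t
    flow_active -- R_phi(t), the active routes of the observed flow
    """
    # All nodes that appear on any active route of the observed flow.
    flow_nodes = set().union(*(set(route) for route in flow_active))
    # Keep routes outside the flow that share at least one node with it.
    return {route for route in active - flow_active if flow_nodes & set(route)}


R_t = {("a", "b", "c"), ("d", "b", "e"), ("f", "g")}
R_phi_t = {("a", "b", "c")}
print(interfering_routes(R_t, R_phi_t))  # the route via node "b" interferes
```

Collecting the flow's nodes once avoids testing every pair of routes, which is equivalent to the existential quantifier in equation (6.7).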

Besides the routes, in the sense of sequences of nodes, the links traversed by a packet while traveling along its route are another formulation of the same matter. Let the function

\[
\mathrm{links} : R \to ((N, N) \times \ldots \times (N, N)), \quad (\nu_1, \ldots, \nu_n) \mapsto (a_i)_{i=1}^{n-1},\; a_i = (\nu_i, \nu_{i+1}) \in A \tag{6.8}
\]

map a route (a sequence of nodes with length $n \in \mathbb{N}$) to the sequence of arcs, i.e., the links, connecting these nodes in the direction traversed by the Network layer packet to which the route belongs. Then the function

\[
\mathrm{links} : M^{\mathrm{net}} \to ((N, N) \times \ldots \times (N, N)), \quad b \mapsto (\mathrm{links} \circ \mathrm{route})(b) \tag{6.9}
\]

is a short-hand notation to get the sequence of links traversed by a packet on its route through the network. Considering equation (4.11) for the single-hop delay and given the individual delays $\tau_{\mathrm{defer},e}$ and $\tau_{\mathrm{transmit},e}$ for each link $e$, the time that elapses from the start of sending a Network layer packet $b \in M^{\mathrm{net}}$ at node $\nu_s$ until the end of reception of said packet at node $\nu_d$, i.e., the end-to-end communication delay of packet $b$, is computed as:

\[
\tau_b = \sum_{e \in \mathrm{links}(b)} \left( \tau_{\mathrm{defer},e} + \tau_{\mathrm{transmit},e} \right) \tag{6.10}
\]
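As a small illustration of equations (6.8) and (6.10), the following Python sketch maps a route to its links and sums the per-link delays. The node names, the dictionary-based delay tables, and the use of integer microseconds are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

Node = str
Arc = Tuple[Node, Node]


def links(route: List[Node]) -> List[Arc]:
    """Map a route (nu_1, ..., nu_n) to its n-1 arcs (nu_i, nu_{i+1}), cf. eq. (6.8)."""
    return list(zip(route, route[1:]))


def end_to_end_delay(route: List[Node],
                     defer: Dict[Arc, int],
                     transmit: Dict[Arc, int]) -> int:
    """Sum deferral and transmission delay over all links of the route, cf. eq. (6.10)."""
    return sum(defer[arc] + transmit[arc] for arc in links(route))


route = ["n1", "n2", "n3"]
defer_us = {("n1", "n2"): 2, ("n2", "n3"): 4}     # hypothetical delays in microseconds
transmit_us = {("n1", "n2"): 1, ("n2", "n3"): 1}  # hypothetical delays in microseconds
print(links(route))                                    # [('n1', 'n2'), ('n2', 'n3')]
print(end_to_end_delay(route, defer_us, transmit_us))  # 8
```

A route with three nodes has two links, in line with the remark that the number of nodes is one larger than the hop-count.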

6.2. Sensory Capabilities and Network Context Awareness

One consequence of the layered network architecture of current communication networks, which was discussed in section 2.3, is that each layer has access to different information about the network's current state: at the Physical and Data-link layers, only information from a node's physical vicinity is available; at the Network layer, information about the route of packets that traverse a node is accessible; and at the Application layer, the end-point nodes may enrich the functional data with additional contextual and meta information.

When the layered network architecture is strictly enforced, applications only have access to the information that is present at the Application layer. With timestamps present in the Application layer messages and synchronized clocks at both end-point nodes, the receiving node can calculate the communication delay of a received message. Determining a node's geographic location is a task carried out by an application on that node, e.g., by using a global navigation satellite system. Likewise, a node's velocity is information that is not related to the network stack but has to be acquired by an application on the node. In section 3.4, it was already discussed that predictions of a node's mobility, in terms of both geographic location and velocity, are present in the planning layer of autonomous vehicles. This planning layer is part of a node's applications, and thus the future movement trajectories are present in the Application layer without violating the layered network architecture. The same argument holds for prediction algorithms that predict future locations and movements of nodes that do not have an algorithmic planning layer, such as nodes that are associated with human actors in an emergency scene.

Figure 6.1.: Relevant sensory capabilities of the network protocol layers in a MANET node.
[Figure content by layer. Application: node mobility (location and velocity), neighbor nodes (via beaconing), end-to-end communication delay. Transport: no entries. Network: flows supported by node, route failures, routes of supported flows, hop-count. Data-link: MAC and queueing delay, link failure error rate, neighbors, local network load. Physical: SNR, received signal strength.]

Information that is accessible at a node's Application layer is not necessarily confined to this individual node. For nodes that are configured to join a single MANET and that are able to exchange data streams for networked control at the Application layer, there is no reason against deploying additional applications on the nodes that disseminate the contextual information accessible at a node's Application layer among the network's participating nodes. Such a contextual information dissemination service then enables a flow's receiving node to use information from other nodes' Application layers, either dedicated or aggregated, for connectivity prediction. Via the same mechanism or a dedicated, active beaconing service, in which a node announces its presence via a link-local broadcast, nodes are able to detect their neighboring nodes in the network.

As an alternative to the strictly layered architecture of network stacks, cross-layer design, e.g., as advocated by Aktas et al. [Akt+10], offers access to information from all layers in a node's network stack through a clean interface. Figure 6.1 summarizes the layers' sensory capabilities that become available when allowing for cross-layer design. Once available at a node's Application layer, the information gained via cross-layer design may be disseminated via the contextual information dissemination service for use by the receiving node's connectivity prediction.
Cross-layer design methods, although widespread for QoS and performance optimization in MANETs, have not remained uncriticised. For example, Kawadia and Kumar [KK05] warn that cross-layer design introduces layer dependencies that reduce reusability and architectural flexibility.

The information that may be acquired by cross-layer design methods and used as sensory data in the connectivity prediction models is discussed in the following. The Transport layer is omitted from the discussion, because the UDP protocol that is used does not carry any relevant information in addition to the layers below. The only exception is the pair of origin and destination ports, which may be used to tell apart different flows with the same pair of origin and destination identifiers.

Network layer

At the Network layer, an IP datagram carries identifiers for its end-point nodes. Combined with the Transport layer protocol identifier, flows can be grouped by type, i.e., by Transport layer protocol, for each combination of origin and destination node. Route failures in IP-based networks are communicated at the Network layer via Internet Control Message Protocol (ICMP) messages [cf. RFC3561]. Depending on the utilized routing protocol, each Network layer datagram carries its complete intended route as part of its header, e.g., in the DSR protocol. For other protocols, a node's Network layer is only aware of the route's previous node and next node. The time to live header field is a counter for the maximum number of hops that the datagram is allowed to travel before it expires. Given this field's value from a datagram and knowledge of the maximum configured value for the sending node (see footnote 1), the hop-count from the origin node to the current node can be calculated.
Data-link layer

At the Data-link layer, two phenomena are responsible for the communication's deferred delay, $\tau_{\mathrm{defer}}$, per node: the queueing delay, $\tau_{\mathrm{queue}}$, which is the time a frame waits in the Data-link layer's queue until it is the next frame for transmission over the channel, and the MAC delay, $\tau_{\mathrm{MAC}}$, which is the time during which the node contends for channel access until it starts the successful transmission attempt:

\[
\tau_{\mathrm{defer}} = \tau_{\mathrm{queue}} + \tau_{\mathrm{MAC}} \tag{6.11}
\]

Link failures, either due to frame collisions or due to insufficient received signal strength caused by too large a distance between nodes, are recognized at the Data-link layer. From the numbers of failed and successful transmissions, a failure rate is calculated as an aggregate statistic. The IEEE 802.11 MAC protocol causes a terminal to overhear all transmissions that are within its reception range and, under the assumed link symmetry, its transmission range. From the overheard transmissions, a node is able to identify its neighbors in the network without any dedicated communication. The passive neighbor count network metric discussed in section 4.4 uses this data. Likewise, by overhearing all communication on the network's channel, a node senses the network load, as already discussed in section 4.4.

Physical layer

The radio at the Physical layer senses the signal strength and aggregates it during the reception of each Data-link layer frame. At the same time, it calculates the SNR.

Footnote 1: [RFC1700] defines the configured maximum time to live value for the Internet to be 64. In contrast, the default OMNeT++ MANET IP host configuration that is used in the simulations uses a maximum time to live of 32.

6.3. Predicting Connectivity from Node Locations

Utilizing predictions of future node mobility for the communication connectivity prediction is achieved in multiple steps. Given the predicted node locations at time $t$, the following steps have to be performed:

1. For each pair of nodes, given their geographical locations, the probability of having a communication link between the two nodes, in the following called the communication link probability, is predicted.
2. Given the communication link probabilities, one or more routes for the observed flow are computed.
3. Along the computed route, predictions of per-node connectivity metrics and influencing factors are collected.
4. Given the connectivity metrics and influencing factors from the flow's route, its end-to-end connectivity metrics are predicted for time $t$.

The results from step 1 above have to be stored in an appropriate data structure that is then used in the remaining steps. In section 6.1, a directed graph was introduced to model a general MANET. In the case of IEEE 802.11, the reliable frame transmission with per-frame acknowledgements causes the network to have only symmetric links. Consequently, the network topology resembles an undirected graph. Section 6.3.1 below introduces the probabilistic network graph as a data structure to organize predicted link probabilities and other information for use by communication connectivity prediction models.

Given proper edge weights for a probabilistic network graph, the most probable route for the observed flow may be found in step 2 as the shortest path through the graph, e.g., with Dijkstra's algorithm [cf. Dij59] or the A* algorithm. Proper in this case means that the weights have to satisfy the graph search algorithm's requirements: they have to be non-negative, the combined weight of two edges must be the sum of both edges' weights, and smaller weights must indicate higher link probabilities.

6.3.1. Probabilistic Network Graph

The probabilistic network graph, $G_{\mathrm{PN}}$, is a weighted graph of the MANET's nodes at a single time instant in which the edges represent predicted Data-link layer links. Let the ordered set

\[
G_{\mathrm{PN}}(t) = (N, L_m(t)) \quad \text{with } \{\nu_i, \nu_j\} \in L_m(t) \text{ and } \nu_i, \nu_j \in N \tag{6.12}
\]

\[
\mathrm{weight} : L_m(t) \to \mathbb{R}^{+}, \quad e \mapsto w_e \tag{6.13}
\]

denote the probabilistic network graph at time $t$ with the set of nodes $N$ and the set $L_m(t)$ of Data-link layer links predicted to exist with a probability of at least $m$ at time $t$. The function weight provides the graph's edge weights. Due to the constraints set by the graph search algorithms, the communication link probability must not be used as edge weight directly.

To derive proper edge weights, consider that the communication link probability, $P_L(e)$, is the probability that a link, $e = \{\nu_1, \nu_2\}$, between the two nodes $\nu_1$ and $\nu_2$ exists; this is identical to the probability of a successful frame exchange between both nodes. Assuming that the communication link probabilities $P_L(a)$ and $P_L(b)$ of two edges, $a = \{\nu_1, \nu_2\}$ and $b = \{\nu_2, \nu_3\}$, of node $\nu_2$ are independent, the probability, $P_R(\rho)$, of a successful communication along the route $\rho = (\nu_1, \nu_2, \nu_3)$ is the product of all communication link probabilities along the route:

\[
P_R(\rho) = \prod_{e \in \mathrm{links}(\rho)} P_L(e) \tag{6.14}
\]

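Then the most probable route is the route, $\rho$, with the maximum $P_R(\rho)$. Maximizing $P_R(\rho)$ is equivalent to minimizing $P_R(\rho)^{-1}$, given $P_R(\rho) > 0$. Furthermore, by using the logarithm function, the product is transformed into a sum:

\begin{align}
(6.14) \;\Leftrightarrow\; \frac{1}{P_R(\rho)} &= \prod_{e \in \mathrm{links}(\rho)} \frac{1}{P_L(e)}, \qquad P_R(\rho) > 0,\; P_L(e) > 0 \tag{6.15} \\
\Leftrightarrow\; \ln\left(\frac{1}{P_R(\rho)}\right) &= \ln\left(\prod_{e \in \mathrm{links}(\rho)} \frac{1}{P_L(e)}\right) = \sum_{e \in \mathrm{links}(\rho)} \ln\left(\frac{1}{P_L(e)}\right) \tag{6.16} \\
\Leftrightarrow\; w_\rho &= \sum_{e \in \mathrm{links}(\rho)} w_e \quad \text{with } w_e = \ln\left(\frac{1}{P_L(e)}\right), \tag{6.17} \\
P_L(e) \le 1 \;&\Leftrightarrow\; w_e \ge 0 \tag{6.18}
\end{align}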

Given equations (6.14) to (6.18), when using inverse, log-transformed communication link probabilities as edge weights in the probabilistic network graph, the preconditions laid out above are fulfilled. Because the transformations are injective, the probability of successful communication along a path, i.e., a route, can be obtained directly from the cost, $w_\rho \in \mathbb{R}$, that is obtained for a route, $\rho$, in $G_{\mathrm{PN}}$:

\[
P_R(\rho) = \frac{1}{\exp(w_\rho)} \tag{6.19}
\]
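The weight transformation of equations (6.17) to (6.19) can be tried out with a minimal Python sketch of Dijkstra's algorithm on an undirected graph. The dictionary-based graph representation, the function names, and the example probabilities are assumptions for illustration; the sketch is not the implementation used in the evaluation.

```python
import heapq
import math
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]


def edge_weight(p_link: float) -> float:
    """w_e = ln(1 / P_L(e)), cf. eq. (6.17); requires 0 < P_L(e) <= 1."""
    if not 0.0 < p_link <= 1.0:
        raise ValueError("link probability must be in (0, 1]")
    return -math.log(p_link)


def most_probable_route(link_prob: Dict[Edge, float], origin: str,
                        destination: str) -> Optional[Tuple[List[str], float]]:
    """Dijkstra's algorithm on the undirected graph; returns (route, P_R(route))."""
    adjacency: Dict[str, List[Tuple[str, float]]] = {}
    for (u, v), p in link_prob.items():
        w = edge_weight(p)
        adjacency.setdefault(u, []).append((v, w))
        adjacency.setdefault(v, []).append((u, w))  # links are symmetric

    best = {origin: 0.0}           # lowest known cost w_rho per node
    previous: Dict[str, str] = {}  # predecessor on the cheapest route
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, math.inf):
            continue  # outdated queue entry
        for neighbor, w in adjacency.get(node, []):
            new_cost = cost + w
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))

    if destination not in best:
        return None  # no route exists in the graph
    route = [destination]
    while route[-1] != origin:
        route.append(previous[route[-1]])
    route.reverse()
    return route, math.exp(-best[destination])  # eq. (6.19)


# Hypothetical link probabilities: the two-hop route (0.9 * 0.9 = 0.81)
# is more probable than the direct link (0.7).
probabilities = {("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.7}
print(most_probable_route(probabilities, "a", "c"))
```

The example shows why the probabilities cannot serve as weights directly: a shortest-path search on the raw values would add 0.9 + 0.9 and prefer the less probable direct link.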

6.3.2. Prediction of Communication Link Probability

Communication link probabilities, $P_L$, are predicted in two steps: First, the geographic, Euclidean distance, $d$, between two nodes is used to predict the expected received signal strength, $E\{P_r(d)\}$, at the receiving node's terminal for signal transmissions between the two nodes. Second, $P_L$ is calculated as the probability of a successful frame exchange given $E\{P_r(d)\}$.

Step 1: Predicting Received Signal Strength from Node Locations

In section 4.5.2, the empirical study by Hara et al. [Har+05] was introduced. It shows that the received signal strength in a MANET follows an exponential probability distribution, cf. equation (4.21), and that the expected value, $E\{P_r(d)\} = \Lambda(d)$ in equation (4.22), obeys the Log-distance Shadowing model, cf. equation (5.2). The relationship between equation (4.22) and the Log-distance Shadowing model becomes obvious when simplifying equation (5.2) to

\[
\Lambda(d) = E\{P_r(d)\} = \frac{P_t G_t G_r \lambda^2}{L \cdot 16 \cdot \pi^2} \cdot d^{-\alpha} = K_{\mathrm{LD}} \cdot d^{-\alpha}
\quad \text{with } K_{\mathrm{LD}} = \frac{P_t G_t G_r \lambda^2}{L \cdot 16 \cdot \pi^2} \text{ and } \lambda, d \text{ given in meters.} \tag{6.20}
\]

Equation (6.20) describes a dependency between the expected received signal strength and the Euclidean distance, $d$, between the sending and receiving terminals and consequently the nodes. Hara et al. [Har+05] estimate the model's $\alpha$ parameter a priori in the same environment where they carry out their experiments and assume prior knowledge of $K_{\mathrm{LD}}$. According to equation (6.20), $K_{\mathrm{LD}}$ depends on the transmitting and receiving terminals' configuration properties $G_t$, $G_r$ (the antenna gains), and $L$ (the system loss factor), on the signal's carrier wavelength, $\lambda$, and on the transmission power, $P_t$. Assuming that no topology control techniques are applied to regulate $P_t$, $K_{\mathrm{LD}}$ is a constant that may be calculated given knowledge of the terminal configurations. Hence, assuming prior knowledge of $K_{\mathrm{LD}}$ is reasonable. Alternatively, a method to estimate $K_{\mathrm{LD}}$ during operation of the MANET is discussed below.

The path loss coefficient, $\alpha$, depends on the environmental conditions and may only be estimated a priori in a controlled, static environment. Given possibly changing conditions or an a priori unknown environment, the path loss coefficient has to be estimated continuously during operation. A method to accomplish this estimation of $\alpha$ is discussed below, after the discussion of estimating $K_{\mathrm{LD}}$.

Estimating $K_{\mathrm{LD}}$

Instead of pre-calculating $K_{\mathrm{LD}}$, estimating it during operation would reduce the system's dependency on correct configuration values. Having a constant value for $K_{\mathrm{LD}}$ is a precondition for the estimation of $\alpha$, as described below. Hence, $K_{\mathrm{LD}}$ must be estimated at the time of the MANET's deployment. Under the assumption that the MANET nodes' terminals have identical configurations, each node may estimate a single value for $K_{\mathrm{LD}}$. Otherwise, each node must calculate an individual value for $K_{\mathrm{LD}}$ for each of its neighbor nodes.
In this case, when a new neighbor node is encountered, its respective $K_{\mathrm{LD}}$ must be estimated and stored for later use. Regardless of whether $K_{\mathrm{LD}}$ has to be estimated for each neighbor individually or one value suffices for all nodes in the network, the estimation method is the same: Given measurements of past transmissions' received signal strength, $P_r$, and the Euclidean distance, $d$, between the sending and receiving terminals during the transmission, the exponential model described by equation (6.20) with the two model parameters $K_{\mathrm{LD}}$ and $\alpha$ can be fitted.

In contrast to fitting linear models, non-linear model fitting is sensitive to outliers and potentially unstable. To evaluate the stability of estimating $K_{\mathrm{LD}}$ via the non-linear model, it was repeatedly fitted to consecutive measurements with data set sizes ranging from 50 to 10 000 measurements, once using the regular Gauss-Newton method and once using the robust IRLS Gauss-Newton method. For each data set size, 1000 data sets were generated by taking consecutive samples with random starting points from a single node's received transmissions of evaluation scenario 1, which is described in chapter 7.

Figure 6.2.: Estimates for $K_{\mathrm{LD}}$ using varying numbers of sampled data points for non-linear model fitting, once using the robust IRLS Gauss-Newton method and once using the regular Gauss-Newton method. The dashed line indicates the actual value for $K_{\mathrm{LD}}$, computed using equation (6.20).
[Box plots per method (IRLS Gauss-Newton, Gauss-Newton); vertical axis: estimated K, logarithmic from 1e-05 to 1; horizontal axis: number of samples, 50 to 10 000.]

Figure 6.2 shows the box plots of the estimated value for $K_{\mathrm{LD}}$ for each data set size and model fitting method individually. It can be seen that the robust method spreads less to higher values than the regular Gauss-Newton method, but that its spread to smaller values is larger, which only improves with 500 and more samples. The robust model fitting generally achieves satisfactory results above this number of samples, although outliers still span one and a half orders of magnitude around the correct value. The regular Gauss-Newton algorithm, on the other hand, produces much more severe outliers even with very large numbers of samples.

For the path loss coefficient's estimation, which is discussed below, a stable and accurate estimation of $K_{\mathrm{LD}}$ is necessary. The non-linear estimation of $K_{\mathrm{LD}}$ as discussed here yields unsatisfactory results due to the outliers, which are only reduced, but not fully prevented, when using large sample sizes and the robust IRLS Gauss-Newton estimation method. Applying filters to a series of $K_{\mathrm{LD}}$ estimates might be an option to improve the estimation performance as required, but this approach is left for future research and not addressed further here, because knowledge of $K_{\mathrm{LD}}$ may still be assumed to be available from the configuration.


Estimating the Path Loss Coefficient

From equation (6.20), the path loss coefficient, $\alpha$, may be calculated using the measurement of a transmission's received signal strength, $P_r$, and assumed knowledge of the terminals' Euclidean distance, $d$, as:

\[
\alpha = -\frac{\ln(P_r) - \ln(K_{\mathrm{LD}})}{\ln(d)} \tag{6.21}
\]
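Equations (6.20) and (6.21) are inverse to each other for a fixed distance, which the following Python sketch checks numerically. The function names and the values chosen for $K_{\mathrm{LD}}$, $\alpha$, and $d$ are hypothetical and serve only as an illustration.

```python
import math


def expected_received_power(k_ld: float, alpha: float, d: float) -> float:
    """Lambda(d) = K_LD * d^(-alpha), cf. eq. (6.20); d in meters."""
    return k_ld * d ** (-alpha)


def path_loss_coefficient(p_r: float, k_ld: float, d: float) -> float:
    """alpha = -(ln(P_r) - ln(K_LD)) / ln(d), cf. eq. (6.21)."""
    if d <= 0.0 or d == 1.0:
        # ln(d) is undefined for d <= 0 and zero for d = 1 m,
        # so alpha cannot be determined from such a measurement.
        raise ValueError("distance must be positive and different from 1 m")
    return -(math.log(p_r) - math.log(k_ld)) / math.log(d)


k_ld = 1e-4   # hypothetical constant
alpha = 3.0   # hypothetical path loss coefficient
d = 50.0      # hypothetical distance in meters

p_r = expected_received_power(k_ld, alpha, d)
print(path_loss_coefficient(p_r, k_ld, d))  # recovers a value very close to 3.0
```

The guard for $d = 1$ m reflects that the denominator $\ln(d)$ in equation (6.21) vanishes there; measurements taken close to this distance also amplify errors in $P_r$.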

By repeating this measurement for every successfully received Data-link layer frame, an irregular time-series for the path loss coefficient is obtained. Functionally, the path loss coefficient characterises the signal attenuation due to environmental factors, cf. section 2.3.2. In addition to the mostly stable signal attenuation that is caused by the current average environment, random effects and dynamics influence the wireless signal's strength before reception at the receiving node's terminal. To predict the future path loss coefficient, which is necessary to calculate communication link probabilities for the probabilistic network graph from future node distances, the assumption is introduced that the average path loss coefficient remains stable at a given location. Notwithstanding this assumption, random but persistent changes of the path loss coefficient, due to persistent changes in the environment or due to a node moving to a new location with differing environmental influence, may occur and have to be recognized.

Carrying forward the current value as the prediction is the Naïve forecast method that was already discussed in section 2.4. Given the current average path loss coefficient, forecasting it with this method is trivial. An efficient way to find the current average path loss coefficient is to smooth the measurements by applying a filter to each new measurement of the time-series. A well-known and efficient method to filter such sensor measurement time-series is the Kalman filter; for an introduction cf., e.g., Welch and Bishop [WB06]. The Kalman filter is an estimator for dynamic linear systems that are modeled by a state space representation [PPC09]. For the path loss coefficient, with its assumed stable value and unknown persistent changes, a simple random walk plus noise model represents the system's dynamics:

\[
Y_t = \mu_t + v_t, \quad v_t \sim \mathcal{N}_m(0, V) \tag{6.22}
\]

\[
\mu_t = \mu_{t-1} + w_t, \quad w_t \sim \mathcal{N}_p(0, W) \tag{6.23}
\]
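For the scalar random walk plus noise model of equations (6.22) and (6.23), the Kalman filter reduces to a short recursion. The following Python sketch shows this recursion under stated assumptions: the function name, the initial state mean `m0`, and the initial state variance `c0` are chosen for illustration and are not taken from the thesis.

```python
from typing import Iterable, List


def kalman_local_level(observations: Iterable[float], v: float, w: float,
                       m0: float, c0: float) -> List[float]:
    """Filter observations Y_t of the model Y_t = mu_t + v_t, mu_t = mu_{t-1} + w_t.

    v  -- measurement variance V
    w  -- state update variance W
    m0 -- assumed initial state mean
    c0 -- assumed initial state variance
    """
    mean, variance = m0, c0
    filtered = []
    for y in observations:
        prior_variance = variance + w                   # prediction step
        gain = prior_variance / (prior_variance + v)    # Kalman gain
        mean = mean + gain * (y - mean)                 # correction with the innovation
        variance = (1.0 - gain) * prior_variance
        filtered.append(mean)
    return filtered


# Hypothetical measurements of the path loss coefficient with one sharp outlier.
measurements = [3.0, 3.0, 3.0, 30.0, 3.0]
print(kalman_local_level(measurements, v=9.0, w=2.5e-5, m0=3.0, c0=1.0))
```

With a large $V$ and a small $W$, the gain stays small, so a single outlier moves the estimate only slightly, while a persistent level change accumulates over many updates.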

The time-series of observations, $Y_t$, directly measures the system's state, here the path loss coefficient, affected by normally distributed, zero-mean measurement errors. The system state, $\mu_t$, may change randomly over time, for which $W$ describes the variance that governs the change. A Kalman filter delivers optimal estimates for linear systems with normally distributed measurement errors. In case of measurement errors with a non-normal distribution, the filter loses its optimality property but may still be applied, usually with very good results. Given the assumption of a stable path loss coefficient, the property of a linear system holds as modeled in equations (6.22) and (6.23). However, the measurement errors or noise that affect the observed path loss coefficient do not necessarily follow a normal distribution (see footnote 2). Due to the path loss coefficient's exponential influence on the received signal strength, its estimation precision has a large effect on the received signal strength prediction. By choosing a small variance for the change that may occur during the state update, $w_t$, and a large variance for the measurement error, $v_t$, randomness in the measured path loss coefficient is primarily attributed to noise, and only persistent changes are reflected in updates to the estimated path loss coefficient.

Figure 6.3 shows the evaluation of the estimation accuracy of the random walk filter with measurement variance $V = 9$ and update variance $W = 2.5 \times 10^{-5}$. The path loss coefficient time-series is taken from a simulation run of evaluation scenario 1 and has been altered in the interval 1000 s to 1400 s to include a ramp and a step change in level. The measured path loss coefficient's random variations at a stable level, including multiple very sharp outliers, are clearly seen in the grey curve, while the filtered, black curve remains stable and closely follows the artificially introduced, persistent level changes. The dashed line indicates the path loss coefficient's actually configured value, which is not applicable inside the interval of 1000 s to 1400 s.

Figure 6.3.: Comparison of the path loss coefficient's observed time-series with the filtered time-series that results from applying a Kalman filter based upon the state space model from equations (6.22) and (6.23) with $V = 9$ and $W = 2.5 \times 10^{-5}$. The observed time-series has been altered in the interval 1000 s to 1400 s to assess the filter configuration's dynamic properties.

Footnote 2: In the simulations, random errors are modeled via the Log-normal shadowing model, which adds a random value, drawn from a normal distribution with zero mean, to the signal's power in the logarithmic domain.

Step 2: Predicting Communication Link Probability from Received Signal Strength

Given the maximum number of transmission retries, $n_{\mathrm{ret}}$, that a node attempts at the Data-link layer before it gives up and drops a frame, the communication link probability, $P_L$, is computed as:

\begin{align}
P_L &= 1 - \left(1 - P(s \mid P_r)\right)^{n_{\mathrm{ret}}} \tag{6.24} \\
\Leftrightarrow\; P(s \mid P_r) &= 1 - (1 - P_L)^{\frac{1}{n_{\mathrm{ret}}}} \tag{6.25}
\end{align}

Here, $P(s \mid P_r)$ is the probability that a Data-link layer frame is successfully decoded by its receiver, given the received signal strength $P_r$. Models for $P(s \mid P_r)$ commonly involve non-linear relations and tables that derive a bit error rate from the SNR at the receiving terminal; the bit error rate follows the error function with a steep slope in the transition from high to low bit error rates. In addition to the SNR, the probability that a bit is decoded erroneously from the received signal depends on the utilized coding scheme: its forward error correction influences the possibility of recovering from such errors. Because the resulting error probability is computed for a single bit in a transmission, the frame's length in bits influences its probability of successful reception. Instead of using these complex models to describe the relation between received signal strength and successful Data-link layer frame reception, a linear logistic regression model that explicitly includes the received signal strength and noise level is fitted to the data at each node:

\begin{align}
\mathrm{logit}(P(s \mid P_r)) &= a_0 + a_1 \cdot P_r + a_2 \cdot l + a_3 \cdot P_n + a_4 \cdot l \cdot P_n + \epsilon \tag{6.26} \\
\text{with } \mathrm{logit}(x) &= \ln\left(\frac{x}{1 - x}\right) \tag{6.27}
\end{align}

In this model, $P(s \mid P_r)$ is used as above, $l$ is the frame length in bits, $P_n$ is the noise level in mW, and $\epsilon$ is the model's error term; the received signal strength, $P_r$, is likewise given in mW. The noise level may be calculated from the received signal strength and the SNR, the latter given in dB, as:

\[
P_n = \frac{P_r}{10^{\frac{\mathrm{SNR}}{10}}} \tag{6.28}
\]

Figure 6.4.: Evaluation of the logistic model from equation (6.26) to predict the successful Data-link layer frame reception.
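Equation (6.28) is a plain conversion of the SNR from the logarithmic to the linear domain. A minimal Python sketch, with a hypothetical function name and example values, reads:

```python
def noise_level_mw(p_r_mw: float, snr_db: float) -> float:
    """P_n = P_r / 10^(SNR / 10), cf. eq. (6.28); P_r in mW, SNR in dB."""
    return p_r_mw / 10.0 ** (snr_db / 10.0)


# Hypothetical values: a received signal strength of 1e-6 mW at an SNR of 20 dB
# corresponds to a noise level that is 100 times smaller.
print(noise_level_mw(1e-6, 20.0))
```

An SNR of 0 dB yields a noise level equal to the received signal strength, which is a quick plausibility check for the conversion.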

Evaluation of this classification model (see footnote 3) with 50% randomly sampled training data and the remaining data used as test set yields a classification precision of 0.87 and a recall of 0.93. Figure 6.4 shows the prediction for the full data set underlying this evaluation; the categories biterror and collision both indicate that the frame is lost; the colors reflect the model's prediction, grey being successful reception.

To assess the model's accuracy given smaller training set sizes that may be used for continuous adaptation of the model during operation, the evaluation was repeated with training set sizes of 50, 100, 200, 400, and 800 samples. Figure 6.5 shows the achieved accuracies from 100 repetitions per sample size. Independent of the training set's sample size, similar distributions for precision and recall are achieved, both with medians at the values obtained above using the large training set. Only very few outliers exist for both metrics, and only 4 of them drop below a value of 0.6. Hence, this model is suitable to estimate the current minimum necessary received signal strength, $P_{r,\min}$, using only small sample sizes.

Footnote 3: Using the data of node 0 of simulation run 0 from evaluation scenario 1.

1.0

Accuracy measure

value

0.8

precision 0.6

recall

0.4 50

100

200

400

800

Training set sample size

Figure 6.5.: Evaluation of the logistic model from equation (6.26) to predict the successful Data-link layer frame reception with small training set sizes. For each training set size, samples were drawn and fitted 100 times. Solving equation (6.26) for P(s|Pr ) yields: β = a0 + a1 · Pr + a2 · l + a3 · Pn + a4 · l · Pn P(s|Pr ) =

exp(β) 1 + exp(β)

(6.29)
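Equations (6.28) and (6.29) can be evaluated as in the following minimal Python sketch. The function names and the coefficient tuple `a` are assumptions made for illustration; the fitted values of a_0 to a_4 are not stated in this section and have to come from fitting equation (6.26).

```python
import math


def noise_level_mw(pr_mw: float, snr_db: float) -> float:
    """Noise level in mW from received signal strength (mW) and SNR (dB), eq. (6.28)."""
    return pr_mw / (10.0 ** (snr_db / 10.0))


def frame_reception_probability(a, pr_mw, frame_bits, pn_mw):
    """P(s|P_r) of the logistic model, eq. (6.29).

    a = (a0, a1, a2, a3, a4) are hypothetical fitted coefficients.
    """
    a0, a1, a2, a3, a4 = a
    beta = a0 + a1 * pr_mw + a2 * frame_bits + a3 * pn_mw + a4 * frame_bits * pn_mw
    # Evaluate exp(beta) / (1 + exp(beta)) without overflow for large |beta|.
    if beta >= 0.0:
        return 1.0 / (1.0 + math.exp(-beta))
    e = math.exp(beta)
    return e / (1.0 + e)
```

The two branches of the logistic function are algebraically identical; the split only avoids overflow in `math.exp` for large positive or negative β.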

Predicting the communication link probability, P_L, using this model requires, besides predictions of the received signal strength, P_r, forecasts of both the noise level and the frame length. While the frame length may vary due to changes in the headers at the Data-link or higher layers as well as varying compression or Application layer message size, in the following this value is assumed to be constant and is thus forecasted using the Naïve method. The noise level is influenced by various environmental factors. Additionally, all transmissions in a certain range of the receiving terminal create interference that adds to the noise level. The IEEE 802.11 medium access scheme that was discussed in section 2.3.2 is designed to minimize interference by nodes that belong to the same network. Hence, the noise level prediction is only covered via the exponential smoothing forecasting method in this thesis and not investigated in more detail.

Minimum Necessary Received Signal Strength and Maximum Transmission Distance

Given the model parameters a_0 to a_4 from equation (6.26) after fitting the logistic model; the threshold value P_{L,min} = 0.5, which defines the minimum probability for considering a transmission as successful, for insertion into equation (6.25); the noise level, P_n; and the message length, l, the equations (6.25) and (6.26) can be combined and solved for P_r to yield the minimum necessary received signal strength:

$$P_{r,\min} = \frac{\operatorname{logit}\left(1 - (1 - P_{L,\min})^{1/n_{\mathrm{ret}}}\right) - a_0 - a_2 \cdot l - a_3 \cdot P_n - a_4 \cdot l \cdot P_n}{a_1} \tag{6.30}$$

When using equation (6.30) to estimate the minimum necessary received signal strength, it has to be taken into account that this model may potentially yield 0 or negative values. A simple measure to prevent too low values, which is further used in conjunction with this model, is to employ a lower bound threshold value, P_{r,threshold}:

$$P^*_{r,\min} = \max(P_{r,\min}, P_{r,\mathrm{threshold}}) \tag{6.31}$$

Given P*_{r,min} from equation (6.31), equation (6.20) can be restated to compute the maximum distance at which the received signal strength is at least P_{r,min} and, consequently, the communication link probability is at least P_{L,min}:

$$d_{\max} = \left(\frac{P^*_{r,\min}}{K_{LD}}\right)^{-\frac{1}{\alpha}} \tag{6.32}$$
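The chain from equation (6.30) to equation (6.32) may be illustrated with the following Python sketch. All function and parameter names are assumptions chosen for this example; the coefficients a_0 to a_4, the number of retransmissions n_ret, the factor K_LD, and the path loss exponent α must be taken from the fitted models and are not specified here.

```python
import math


def logit(p: float) -> float:
    """Inverse of the logistic function, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def min_received_signal_strength(a, frame_bits, pn_mw, n_ret, pl_min=0.5):
    """Minimum necessary received signal strength, eq. (6.30).

    a = (a0, ..., a4) are hypothetical fitted coefficients of eq. (6.26).
    """
    a0, a1, a2, a3, a4 = a
    # Required per-frame success probability for the link probability pl_min.
    p_frame = 1.0 - (1.0 - pl_min) ** (1.0 / n_ret)
    numerator = (logit(p_frame) - a0 - a2 * frame_bits
                 - a3 * pn_mw - a4 * frame_bits * pn_mw)
    return numerator / a1


def max_distance(pr_min_mw, pr_threshold_mw, k_ld, alpha):
    """Maximum transmission distance, eqs. (6.31) and (6.32)."""
    pr_star = max(pr_min_mw, pr_threshold_mw)  # lower bound, eq. (6.31)
    return (pr_star / k_ld) ** (-1.0 / alpha)  # eq. (6.32)
```

Because the lower bound of equation (6.31) is applied inside `max_distance`, a zero or negative result of equation (6.30) never reaches the power function.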

6.3.3. Handling Uncertainty in Predicted Node Locations

When applied to predictions of future node locations, the above calculations of the communication link probability depend on knowledge about the Euclidean distance, d, between pairs of communication nodes. In equation (6.20), this distance is assumed to be precisely known. But predicted node locations from which the distances may be calculated are usually subject to uncertainty. Consequently, the distance cannot be calculated precisely. As discussed in section 4.5.2, using the hybrid method that Hung, Xiao, and Hung [HXH12] propose, the distance's uncertainty is approximated either with a normal distribution, N(µ, σ²), in case of large distances, or with a gamma distribution, Γ(k, θ), in case of small distances. The parameters for both distributions are calculated based on samples taken from the uncertain locations' distributions. It is important to note that the resulting probability density function, cf. equation (4.20), is defined for the squared Euclidean distance.

Given the probability density function, p(x), for a continuous random variable, X, and a function, f(X), that transforms the continuous random variable, the expected value after applying the transformation is defined as [cf. Pea11]:

$$E\{f(X)\} = \int_{-\infty}^{\infty} p(x) f(x)\,dx \tag{6.33}$$

Theoretically, this integral describes how to use the probability density function defined in equation (4.20) to calculate the expected received signal strength, E{P_r(d)}, from equation (6.20) with an uncertain distance as input. It is important to consider that the Log-distance shadowing model that underlies equation (6.20) is only defined for the far-field of an antenna, i.e., for distances that are large in comparison to the antenna. In particular, for a distance of 0, equation (6.20) yields ∞. To account for this limited scope, the integral's lower bound is set to a distance of 1 m, and if the uncertain distance has an expected value smaller than 1 m, it is set to 1 m. By changing the distance instead of directly assigning a certain, high communication link probability, e.g., of 1, the distance's variance is still taken into account and the knowledge about the terminal configurations that is included in the factor K_LD is honored as well.

In conclusion, the expected received signal strength for the uncertain, squared Euclidean distance X with probability density function p(x) according to equation (4.20) is calculated as:

$$P_r^*(x) = P_r(x^{1/2}) = K_{LD} \cdot x^{-\alpha/2} \tag{6.34}$$
$$E\{P_r^*(X)\} = \int_1^{\infty} p(x) P_r^*(x)\,dx = \int_1^{\infty} p(x) \cdot K_{LD} \cdot x^{-\alpha/2}\,dx \tag{6.35}$$
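A numeric evaluation of equation (6.35) might look like the following Python sketch. It is an illustration under stated assumptions, not the procedure used for the results of this thesis: the density of equation (4.20) is not reproduced in this section, so a plain gamma density serves as a stand-in for p(x); the infinite upper bound is truncated at a caller-chosen value `upper`; and composite Simpson's rule replaces an adaptive quadrature routine.

```python
import math


def gamma_pdf(x, k, theta):
    """Gamma density; a stand-in (assumption) for p(x) from eq. (4.20)."""
    if x <= 0.0:
        return 0.0
    log_p = (k - 1.0) * math.log(x) - x / theta - math.lgamma(k) - k * math.log(theta)
    return math.exp(log_p)


def expected_signal_strength(pdf, k_ld, alpha, upper, steps=20000):
    """Approximate eq. (6.35) with composite Simpson's rule on [1, upper].

    pdf is the density of the squared distance; upper truncates the
    integral and must lie far in the density's tail (assumption).
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals.
    h = (upper - 1.0) / steps

    def integrand(x):
        return pdf(x) * k_ld * x ** (-alpha / 2.0)

    total = integrand(1.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(1.0 + i * h)
    return total * h / 3.0
```

For a gamma density with shape k > 1, the case α = 2 has the closed form K_LD / (θ · (k − 1)), which offers a simple plausibility check for the chosen truncation and step count.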

The complexity of p(x) prevents an analytic integration of equation (6.35). Numeric integration methods are well known and readily available [cf. DR84]. The results presented in the following have been obtained by using the standard integrate function in the R software package⁴.

To evaluate the reasonableness of the estimated communication link probabilities that the above method computes, a reference distribution is computed using Monte Carlo simulations [cf. Lem09]: Given the number of samples, n_MC = 1 × 10^4, for the Monte Carlo simulation, the number of samples for the uncertain locations, n = 250, and one set of location samples for each of the two nodes, x_i ∈ R², y_j ∈ R², i, j = 1 … n; then the samples for the nodes' squared Euclidean distance, ‖x − y‖², are computed by drawing n_MC samples from each node's location samples with replacement and computing the distance d²_m = ‖x_{i(m)} − y_{j(m)}‖² for each draw m. Using equation (6.34), for each distance sample, d²_m, a received signal strength, P_{r,m}, is calculated. To compute the resulting reference communication link probability distribution, a second Monte Carlo simulation is used: for each P_{r,m}, draw n_MC samples from the resulting exponential distribution:

$$p(x) = \frac{1}{P_{r,m}} \exp\left(-\frac{x}{P_{r,m}}\right) \tag{6.36}$$
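The two nested Monte Carlo stages can be sketched in Python as follows. The function name, the default sample count, and the fixed random seed are assumptions for illustration. Clamping the squared distance to 1 m² mirrors the far-field restriction discussed for equation (6.35); whether the reference simulation applies the same clamp is an assumption of this sketch.

```python
import random


def reference_link_probability(xs, ys, pr_min, k_ld, alpha, n_mc=1000, rng=None):
    """Monte Carlo estimate of the probability that the received signal
    strength exceeds pr_min, given location samples xs and ys of two nodes.

    xs, ys: lists of (x, y) coordinate samples in metres.
    """
    rng = rng or random.Random(0)  # fixed seed only for reproducibility
    exceed = 0
    for _ in range(n_mc):
        # Stage 1: draw one location sample per node with replacement.
        x = rng.choice(xs)
        y = rng.choice(ys)
        d2 = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
        d2 = max(d2, 1.0)  # assumption: clamp to the 1 m far-field limit
        pr_m = k_ld * d2 ** (-alpha / 2.0)  # eq. (6.34)
        # Stage 2: draw from the exponential distribution with mean pr_m,
        # eq. (6.36), and count samples above the threshold.
        for _ in range(n_mc):
            if rng.expovariate(1.0 / pr_m) > pr_min:
                exceed += 1
    return exceed / (n_mc * n_mc)
```

With a single, certain location per node, the result should approach exp(−P_{r,min} / P_r), which gives a quick sanity check of the implementation.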

From the resulting n²_MC samples, the reference communication link probability according to equation (6.24) is computed by counting the fraction of samples that are larger than P_{r,min}.

⁴ Cf. https://stat.ethz.ch/R-manual/R-patched/library/stats/html/integrate.html.

Uncertain location pairs are generated by defining a set of 21 nominal distances (10, 20, …, 100, 120, …, 200, 250, …, 500) and 3 ranges for the location variances (small: [0.5, 10], large: [50, 100], and wide: [2, 80]). For each combination of nominal distance and variance range, 10 uncertain location pairs are generated by defining normal distributions for their coordinates: per pair, the 4 coordinate variances are independently sampled from the variance range; for the pair's first location, both coordinates get a mean value of 0; and for the second location, the Cartesian coordinates' mean values are computed from polar coordinates with the radius equal to the nominal distance and the angle drawn uniformly from the range [0, 2π], whereby the extreme values are almost certainly not drawn. The resulting 630 uncertain location pairs underlie the evaluation results discussed in the following.

Figure 6.6.: Comparison of the reference and the approximated uncertain distance distributions from all 630 uncertain location pairs: Estimated kernel density and box plot of the Kolmogorov-Smirnov tests' resulting distance measures. (Plot of density over the Kolmogorov-Smirnov distance from 0.00 to 0.12.)

The approximated distance distributions' goodness of fit to the uncertain distances' reference distributions is evaluated with the Kolmogorov-Smirnov test's distance measure that returns the maximum vertical difference between the estimated cumulative distribution functions based on samples from the two distributions [cf. AE11]. Figure 6.6 shows the distribution of the Kolmogorov-Smirnov distances that results from all 630 uncertain location pairs. The median at 0.02 with the narrow interquartile range and only outliers that are larger than 0.045 show that the approximation from equation (4.20) to the

reference distribution results in only minor deviations.

The communication link probability that is calculated from the approximated squared Euclidean distance's distribution via equation (6.35) shows strong bimodality, with values concentrating at both extremes of either no or maximum communication link probability. Figure 6.7 shows the corresponding estimated kernel density and contains, in addition to the reference communication link probability distribution from the Monte Carlo simulation, the result obtained when using the squared Euclidean distance's expected value of equation (4.16) together with equation (6.34) to directly calculate the received signal strength's expected value without honoring the distance's distribution. Judging from the estimated kernel densities, this latter, simplified method is much more similar to the reference distribution than the more complex approximation via the distance's distribution.

Figure 6.7.: Estimated kernel densities of communication link probabilities from all 630 uncertain location pairs. The approximated distribution corresponds to the approximating method described above, the expected values method only uses the expected value from the approximated distribution, and the Monte Carlo simulation offers the reference distribution. (Plot of density over the communication link probability from 0 to 1 for the three methods.)

Figure 6.8 shows both methods' error distributions and supports the previous finding. While both methods have their median at or very close to 0, using only the expected distance results in a much smaller error range that does not rise above 0.25. At the same time, these errors result from underestimating the probability, with only a few very small outliers overestimating the communication link probability. The complete approximation of the distance distribution has a much larger spread with outliers close to 1, and half of its errors represent overestimations that reach down to nearly −0.75.

Figure 6.8.: Distribution of errors of estimated communication link probability. Reference are the results from the Monte Carlo simulation; values larger than 0 indicate an underestimation of the probability. Panels, each showing density over the absolute difference: (a) Error when using approximated distance distribution. (b) Error when using expected distance only.

6.3.4. Constructing the Probabilistic Network Graph

In the previous subsections, the individual functions that are required to construct an instance of the probabilistic network graph have been discussed: From the predicted, uncertain node locations at a time t, the pairwise communication link probabilities are calculated. With the pairwise communication link probabilities, the probabilistic network graph, G^PN(t), for time t is constructed. An edge between two nodes in the graph exists if their communication link probability is larger than the threshold m ∈ R. Each edge, e, is assigned an edge weight, w_e, that is computed as the logarithm of the edge's reciprocal communication link probability, cf. equation (6.17).

The duration, T_v, of the time interval during which a probabilistic network graph instance, G^PN(t), that is constructed for time t remains a useful model of the actual network topology strongly depends on the node mobility during the time interval (t, t + T_v]: The faster the nodes move, the faster the distances that underlie the edge weight calculations of G^PN(t) change beyond a tolerable limit. This conclusion is supported by the results of chapter 5 that found a significant influence of average node speed, v̄, on the communication interruption metrics. The predictability of the nodes' mobility and the environmental factors do not influence T_v, but rather set a limit, T_{h,max}, to the forecasting horizon, T_h ≤ T_{h,max}, at which a probabilistic network graph that is constructed at time t_0, G^PN(t_0 + T_h), is a useful model of the actual network topology during the time interval (t_0 + T_h, t_0 + T_h + T_v]. In principle, T_v does not have to be a constant value; but, to limit the algorithm's complexity, it is assumed to be constant and used to discretize the prediction algorithm's execution in time.

Finding the maximum usable forecasting horizon, T_{h,max}, is of general interest for the connectivity prediction and will be further discussed in the evaluation in chapter 7. Maximising the value is a matter of improving the forecasting and prediction methods that are the thesis's subject. On the other hand, estimating the minimum necessary value of T_v to achieve a required prediction accuracy is a precondition to define the frequency at which new probabilistic network graph instances have to be computed. Maximising T_v is important to reduce computational demand, which is an issue when implementing the prediction algorithms on devices with narrow restrictions on available energy and computational power. For the laboratory setup of this thesis, the latter restrictions are not given at a level of concern and the matter of maximising T_v lies beyond the thesis's scope.
It is important to note, however, that there is no necessity to couple updates of metrics from the network that are attached to the nodes and edges of the probabilistic network graph to the update interval of the graph's structure and its edge weights. Only for simplicity, both update intervals are synchronized and treated as equal in the remainder of this work. As such, T_v is here defined as a tenth of the time that a node on average needs to move across the maximum communication distance, d_max:

$$T_v = \frac{d_{\max}}{10 \cdot \bar{v}} \tag{6.37}$$

Given T_v, the graph validity duration, and the forecasting horizon, T_h, the discretized forecasting horizon, h, is computed as the largest natural number whose multiple of T_v is not larger than T_h:

$$h = \left\lfloor \frac{T_h}{T_v} \right\rfloor \tag{6.38}$$

Reciprocal to the graph validity duration is the graph update frequency, f_G = 1/T_v, which is the rate for constructing a new set, G, of probabilistic network graph instances to cover the current forecasting horizon, starting with the current time, t_0:

$$T = (t_0, t_0 + T_v, t_0 + 2 \cdot T_v, \ldots, t_0 + h \cdot T_v) \tag{6.39}$$
$$G(T) = \{(t, G^{PN}) \mid t \in T, G^{PN} \text{ according to eq. (6.12)}\} \tag{6.40}$$

wherein the ordered pair (t, G^PN) is used to reference the time, t, for which G^PN has been constructed to model the network's topology; in equation (6.12), the notation G^PN(t) is used to express the same time reference. The sequence T is hereafter called the forecasting time sequence. The location predictions, X(T), are given as:

$$X(T) = \{X_{\nu,t} \in \mathbb{R}^{n \times 2} \mid \nu \in N, t \in T, n \in \mathbb{N}\} \tag{6.41}$$

for all nodes, N, at the time instances of the forecasting time sequence, T. Then X(T)_{ν,t} expresses an index into this set that retrieves the n samples for the 2-dimensional uncertain location of node ν at time t.

Summarizing the description in this section, algorithm 6.1 details how to construct the set of probabilistic network graphs for a given forecasting horizon. This algorithm has to be executed repeatedly with the graph update frequency, f_G.
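The forecasting time sequence of equations (6.37) to (6.39) and the construction procedure of algorithm 6.1 may be sketched in Python as follows. The data layout (a dictionary keyed by node and time, and a graph represented as a pair of node set and weighted edge dictionary) and the callable `com_link_prob` are assumptions made for this illustration; the actual link probability function is the one developed in section 6.3.

```python
import math
from itertools import combinations


def forecasting_time_sequence(t0, d_max, v_mean, t_h):
    """Forecasting time sequence T, eqs. (6.37) to (6.39)."""
    t_v = d_max / (10.0 * v_mean)  # graph validity duration, eq. (6.37)
    h = math.floor(t_h / t_v)      # discretized horizon, eq. (6.38)
    return [t0 + k * t_v for k in range(h + 1)]


def construct_graphs(times, locations, m, nodes, com_link_prob):
    """One probabilistic network graph per time step.

    locations[(node, t)] holds the location samples of a node at time t;
    com_link_prob(x, y, t) is a hypothetical callable returning P_L.
    Returns a list of (t, (nodes, edges)) with edges[(i, j)] = ln(1 / P_L).
    """
    graphs = []
    for t in times:
        edges = {}
        for i, j in combinations(sorted(nodes), 2):  # all unordered node pairs
            p_l = com_link_prob(locations[(i, t)], locations[(j, t)], t)
            if p_l >= m and p_l > 0.0:  # p_l > 0 guards the logarithm
                edges[(i, j)] = math.log(1.0 / p_l)
        graphs.append((t, (set(nodes), edges)))
    return graphs
```

With logarithmic edge weights, summing the weights along a path corresponds to multiplying the link probabilities, so a shortest-path search on these weights finds the path with the highest product of link probabilities.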

Algorithm 6.1 Constructing the probabilistic network graph.
Require: The forecasting time sequence, T, the location predictions covering T, X(T), the communication link probability threshold, m, and the set of nodes, N.
Ensure: A set that contains one probabilistic network graph instance per time step starting now up to the forecasting horizon.
function ConstructGraph(T, X, m, N)
    G ← ∅
    for all t ∈ T do
        C ← {c = {i, j} | i ∈ N, j ∈ N, i ≠ j}      ▷ All pairs of nodes.
        E ← ∅
        for all {i, j} ∈ C do
            x ← X(T)_{i,t}; y ← X(T)_{j,t}          ▷ Get uncertain locations.
            P_L ← ComLinkProb(x, y, t)              ▷ Communication link probability.
            if P_L ≥ m then
                w ← ln(1 / P_L)
                e ← {i, j}                          ▷ Create the edge e.
                E ← E ∪ {e}
                weight(e) ← w                       ▷ Assign weight to edge e.
            end if
        end for
        G^PN ← (N, E)                               ▷ Create graph.
        G ← G ∪ {(t, G^PN)}
    end for
    return G
end function

6.4. Connectivity Prediction Models

Three categories of connectivity prediction models differentiate the models that are defined in this section according to their dependency on sensor data from the network nodes, as discussed in section 6.2 above:

Black-box models reside purely in the communication end-point nodes' Application layers and function without mobility prediction or support from special services from other network nodes.

Cross-layer models are allowed to access information from all layers in the network stacks of the nodes along a flow's current route but have no access to the node mobility predictions.

Probabilistic network graph models extend the cross-layer models with access to the node mobility predictions and derived predictions that are represented in the probabilistic network graph. These models are not limited to information from the flow's current route.

The tables 6.1, 6.2, and 6.3 offer an overview of the three categories' prediction models, each of which is defined in detail in the following subsections. The models' labels from the three tables are used to identify the models hereafter.

Table 6.1.: Overview of the black-box connectivity prediction models.

| Label | Name | Metric |
|---|---|---|
| N | Naïve forecast | Q_PD, τ |
| E | exponential smoothing with initial estimation of model | Q_PD, τ |
| ER | exponential smoothing with continuous estimation of model | Q_PD, τ |
| A1 | ARIMA(0, 0, 1) | Q_PD, τ |
| A2 | ARIMA(0, 0, 2) | Q_PD, τ |
| A3 | ARIMA(0, 0, 3) | Q_PD, τ |
| A4 | ARIMA(1, 0, 1) | Q_PD, τ |
| A5 | ARIMA(2, 0, 2) | Q_PD, τ |
| A6 | ARIMA(3, 0, 3) | Q_PD, τ |
| A7 | ARIMA(1, 1, 1) | Q_PD, τ |
| A8 | ARIMA(2, 1, 2) | Q_PD, τ |
| A9 | ARIMA(1, 0, 0) | Q_PD, τ |
| A10 | ARIMA(2, 0, 0) | Q_PD, τ |
| A11 | ARIMA(3, 0, 0) | Q_PD, τ |
| F1 | simple failure rate model | E{W_t} |

6.4.1. Black-Box Models

With the intention to be as little intrusive to the network architecture as possible, the black-box prediction models honor the network stack's layered design without applying cross-layer interactions. The simplest form of these models treats the packet delivery ratio, Q_PD, and the end-to-end communication delay, τ, as regular time-series. For these time-series, common forecasting models are applied. The streaming window width does not lend itself to the time-series model and is instead predicted using failure rate models. The failure rate models are adaptations from the frameworks that underlie statistical analysis and inference in the field of reliability engineering, as discussed, e.g., by Finkelstein [Fin08].

Forecasting of Packet Delivery Ratio and End-to-end Communication Delay

Modelling τ as a regular time-series is straightforward and has been addressed in chapter 5. Q_PD, on the other hand, was only considered as a single data point per simulation run. To get a time-series, the ratio of successfully received Application layer packets at the destination node, ν_d, belonging to the observed flow, φ, in a sliding window of n_PD attempted transmissions is calculated:

$$Q_{PD}(t) = \frac{\left|\operatorname{route}\left(\operatorname{packets}(\phi, [t - n_{PD}, t])\right) \cap \{\nu_d\}\right|}{n_{PD}} \tag{6.42}$$
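A rolling computation of equation (6.42) could be organised as in the following Python sketch. The class name and the boolean per-packet interface are assumptions for illustration. As in equation (6.42), the count is divided by the fixed window width n_PD, so the ratio is underestimated until the window has filled once.

```python
from collections import deque


class RollingDeliveryRatio:
    """Packet delivery ratio over the last n_pd attempted transmissions."""

    def __init__(self, n_pd: int):
        # The deque drops the oldest entry once n_pd entries are stored.
        self.window = deque(maxlen=n_pd)

    def record(self, delivered: bool) -> float:
        """Register one attempted transmission and return the current Q_PD."""
        self.window.append(1 if delivered else 0)
        return sum(self.window) / self.window.maxlen
```

A usage example with a window of four transmissions:

```python
ratio = RollingDeliveryRatio(4)
series = [ratio.record(ok) for ok in (True, True, False, True, True)]
```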

To define the sliding window width, n_PD, the average node speed, v̄, the maximum communication range, d_max, from equation (6.32), and the observed flow's transmission rate, f_s, are taken into account. The intention is to divide the movement across the maximum communication range into up to 10 segments that reflect the sliding windows' width; to ensure that the sliding window contains a minimum number of samples, the smallest permissible value for n_PD is set to 50 samples:

$$n_{PD} = \max\left(50, \frac{d_{\max} \cdot f_s}{\bar{v} \cdot 10}\right) \tag{6.43}$$

Basic forecasting methods for time-series have already been introduced in section 2.4. The Naïve forecast model, i.e., N, offers a baseline reference that helps to assess the other models' prediction performances. Furthermore, different ARIMA models are considered: The data that was gathered in chapter 5 suggests using a lag of 1 for autocorrelation in τ, cf. figure 5.18; to have a reference, up to 3 autoregressive terms are considered. Because the data does not indicate that slopes or higher order differences exist, differencing up to once is considered. Because no information on the moving average process has been obtained in chapter 5, up to 3 terms are considered. To keep it simple, the same ARIMA parameter sets are evaluated for Q_PD and τ.

Simple Failure Rate Model for Streaming Window Width

From a failure rate, λ, the mean remaining lifetime at time t, E{T_t}, is inferred. In this model, the failure rate is simply a time-dependent formulation of the packet delivery ratio:

$$\lambda = f_s \cdot (1 - Q_{PD}) \tag{6.44}$$

wherein f_s is the send rate for the observed flow's Application layer packets. The lifetime is identical to the streaming window duration connectivity metric. In accordance with the findings from chapter 5, the discretized streaming window width is used instead of the continuous time model that the streaming window duration represents. Consequently, the mean remaining streaming window width, E{W_t}, given that the packet sent at time t was successfully received, is:

$$E\{W_t\} = f_s \cdot E\{T_t\} \tag{6.45}$$

For the simple failure rate black-box model, interruptions of end-to-end communication are assumed to occur with a constant failure rate, λ, in a memory-less Poisson process that, in accordance with Finkelstein [Fin08], has a negative exponential lifetime distribution:

$$F(t) = P(T \le t) = 1 - \exp(-\lambda t) \tag{6.46}$$

The resulting expected mean remaining lifetime, which is independent of t, is:

$$E\{T_t\} = \frac{1}{\lambda} \tag{6.47}$$

By substituting with the equations (6.44) and (6.45), the equation

$$E\{W_t\} = \frac{1}{1 - Q_{PD}} \tag{6.48}$$

is derived.
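The substitution that leads to equation (6.48) can be traced step by step in a short Python sketch. The function name is an assumption; returning infinity for Q_PD = 1 is a choice of this sketch for the case in which the failure rate becomes 0, which the section does not discuss.

```python
import math


def mean_remaining_window_width(q_pd: float, f_s: float) -> float:
    """E{W_t} of the simple failure rate model, eqs. (6.44) to (6.48)."""
    failure_rate = f_s * (1.0 - q_pd)     # eq. (6.44)
    if failure_rate == 0.0:
        return math.inf                   # no failures observed (assumption)
    mean_lifetime = 1.0 / failure_rate    # eq. (6.47)
    return f_s * mean_lifetime            # eq. (6.45), equals 1 / (1 - q_pd)
```

The send rate f_s cancels out, so a packet delivery ratio of 0.9 yields an expected remaining window width of about 10 transmissions regardless of the send rate.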

6.4.2. Cross-Layer Models

The cross-layer models have, at time t, access to information from all layers in the protocol stack of the nodes, ν_i, that are part of the route, ρ_t, that the last received packet of the observed flow, φ, traveled along. Mobility predictions are not available to these models, but via the received signal strength and the current velocities of the nodes on the last received packet's route, the current movement of nodes may be extrapolated to predict link breaks. Related methods to predict the time of a link break are discussed in section 4.5.

Table 6.2.: Overview of the cross-layer connectivity prediction models.

| Label | Name | Metric |
|---|---|---|
| PX | link failure ratio product on route | Q_PD |
| SX | shortest estimated link lifetime on route | E{W_t} |
| SX ref | shortest actual link lifetime on route | E{W_t} |
| NX | sum of Naïve forecasts per link on route | τ |
| EX | sum of exponential smoothing forecasts per link on route | τ |
| AX1 | sum of ARIMA(0, 0, 1) per link on route | τ |
| AX2 | sum of ARIMA(0, 0, 2) per link on route | τ |
| AX3 | sum of ARIMA(0, 0, 3) per link on route | τ |
| AX4 | sum of ARIMA(1, 0, 1) per link on route | τ |
| AX5 | sum of ARIMA(2, 0, 2) per link on route | τ |
| AX6 | sum of ARIMA(3, 0, 3) per link on route | τ |
| AX7 | sum of ARIMA(1, 1, 1) per link on route | τ |
| AX8 | sum of ARIMA(2, 1, 2) per link on route | τ |
| AX9 | sum of ARIMA(1, 0, 0) per link on route | τ |
| AX10 | sum of ARIMA(2, 0, 0) per link on route | τ |
| AX11 | sum of ARIMA(3, 0, 0) per link on route | τ |

Packet Delivery Ratio From Per Hop Failure Ratio

Each node, ν_i, individually computes the ratio of successful frame deliveries at the Data-link layer to the number of frames that were attempted to be transmitted at time t, Q_{FD,i}(t), via a sliding window of duration T_{FD,i} that includes all outgoing Data-link layer traffic at the node during the time interval [t − T_{FD,i}, t]. Retransmission attempts of frames are not considered in this ratio; only ultimately dropped and successfully transmitted frames are included, because frame retransmissions do not affect the ratio of successfully forwarded to dropped Network layer packets. From the past observed Q_{FD,i}, the node then forecasts the time-series for the forecasting horizon, h. For simplicity, T_{FD,i} is set equal to the observed flow's transmission period time, i.e., T_{FD,i} = 1/f_s. Under the assumption that the per node Q_{FD,i} are independently distributed, the end-to-end Q_PD is calculated as the product of all link failure ratios, Q_{FD,i}, along the current route:

$$Q_{PD}(t) = \prod_{i \in \rho_t} Q_{FD,i}(t) \tag{6.49}$$
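Equation (6.49) reduces to a product over the per-node forecasts, as the following Python sketch shows. The function name and the range check are assumptions added for illustration.

```python
import math


def end_to_end_delivery_ratio(per_hop_ratios):
    """End-to-end Q_PD as product of the per-node ratios Q_FD,i, eq. (6.49)."""
    ratios = list(per_hop_ratios)
    if any(not 0.0 <= q <= 1.0 for q in ratios):
        raise ValueError("each per-hop ratio must lie in [0, 1]")
    return math.prod(ratios)
```

Because every factor is at most 1, the end-to-end ratio can never exceed the ratio of the weakest hop, and it decreases with every additional lossy hop on the route.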

Streaming Window Width from Shortest Link Lifetime

The minimum remaining link lifetime at time t along the current route is used as the remaining lifetime for the current route, which is assumed to equal the remaining streaming window duration:

$$T_t = T_{\rho_t,t} = \min\{T_{e,t}\}, \quad e \in \operatorname{links}(\rho_t) \tag{6.50}$$

With the transmission rate, f_s, the remaining streaming window width is calculated as:

$$w_t = \lfloor f_s \cdot T_t \rfloor \tag{6.51}$$
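As a brief illustration of equations (6.50) and (6.51), the following Python sketch takes already predicted link lifetimes as input; how each lifetime is predicted is not part of the sketch, and the function name is an assumption.

```python
import math


def remaining_window_width(link_lifetimes_s, f_s):
    """Remaining streaming window width from per-link lifetimes in seconds."""
    route_lifetime = min(link_lifetimes_s)   # eq. (6.50)
    return math.floor(f_s * route_lifetime)  # eq. (6.51)
```

For example, lifetimes of 4.2 s, 1.5 s, and 9.0 s at a transmission rate of 10 packets per second are limited by the 1.5 s link.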

A link's remaining lifetime, T_{e,t}, is predicted individually by each link's receiving node, i.e., the latter node in the sequence ρ_t. To predict T_{e,t}, different applicable methods have been presented in section 4.5. Here, the linear velocity extrapolation based on actual geographic mobility information as proposed by Su, Lee, and Gerla [SLG00], cf. equation (4.1), is extended by the estimation of the maximum communication range given a minimum probability for successful communication via the link that was introduced in subsection 6.3.2 above, cf. equation (6.32).

End-to-end Communication Delay From Per Hop Forecasts

Similar to the cross-layer model for packet delivery ratio prediction, the end-to-end communication delay, τ, is calculated from the forecasts of each individual node along the flow's current route, ρ_t, according to equation (6.10):

$$\tau(t) = \sum_{e \in \operatorname{links}(\rho_t)} \left(\tau_{\mathrm{defer},e}(t) + \tau_{\mathrm{transmit},e}(t)\right) \tag{6.52}$$

Here, the transmission delay per hop, τ_{transmit,e}(t), is assumed to be constant in time and may be calculated using equation (2.1). Because the Network and Data-link layer headers have to be included in the frame size in equation (2.1), τ_{transmit,e}(t) may vary along the route and is calculated by each node based on actual measurements of the frame size, which is then forecasted with the Naïve forecast method. τ_{defer,e}(t) is predicted as a forecast from the time-series of average Network layer packet waiting times by each node individually: At time t, with a frequency identical to the observed flow's send rate, f_s, a node computes the mean packet waiting time, τ̄_Net(t), of all Network layer packets that had their last Data-link layer frame successfully transmitted during the sliding window with interval [t − 1/f_s, t] and uses forecasts from the resulting time-series for its τ_{defer,e}(t). For this purpose, the packet waiting time of a packet b, τ_{Net,b}, is measured starting with the entrance of b into the node's Network layer service process and ending when the first Data-link layer frame containing parts of b leaves the Data-link layer. In case no fragmentation of b is necessary, the mapping between b and the outgoing frame is one-to-one. This calculation of τ_{Net,b} neglects delays that occur due to frame retransmissions.
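The per-node averaging and the summation of equation (6.52) may be sketched in Python as follows. The tuple layouts and function names are assumptions for illustration; returning `None` for an empty window is a choice of this sketch, since the section does not state how windows without completed packets are handled.

```python
def mean_waiting_time(completed_packets, t, f_s):
    """Mean Network layer waiting time within the window [t - 1/f_s, t].

    completed_packets: iterable of (completion_time_s, waiting_time_s),
    where completion_time_s is when the packet's last frame was sent.
    """
    window_start = t - 1.0 / f_s
    waits = [w for done, w in completed_packets if window_start <= done <= t]
    return sum(waits) / len(waits) if waits else None


def end_to_end_delay(per_link_forecasts):
    """Sum of (defer, transmit) delay forecasts over all links, eq. (6.52)."""
    return sum(defer + transmit for defer, transmit in per_link_forecasts)
```

Each call to `mean_waiting_time` yields one point of a node's time-series; the forecasts derived from that series enter `end_to_end_delay` as the defer component of the respective link.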

Table 6.3.: Overview of the probabilistic network graph connectivity prediction models.

| Label | Name | Metric |
|---|---|---|
| PG | link probability product on shortest path | Q_PD |
| SG | time until next graph partitioning | E{W_t} |
| NG | sum of Naïve forecasts per link on shortest path | τ |
| EG | sum of exponential smoothing forecasts per link on shortest path | τ |
| AG1 | sum of ARIMA(0, 0, 1) per link on shortest path | τ |
| AG2 | sum of ARIMA(0, 0, 2) per link on shortest path | τ |
| AG3 | sum of ARIMA(0, 0, 3) per link on shortest path | τ |
| AG4 | sum of ARIMA(1, 0, 1) per link on shortest path | τ |
| AG5 | sum of ARIMA(2, 0, 2) per link on shortest path | τ |
| AG6 | sum of ARIMA(3, 0, 3) per link on shortest path | τ |
| AG7 | sum of ARIMA(1, 1, 1) per link on shortest path | τ |
| AG8 | sum of ARIMA(2, 1, 2) per link on shortest path | τ |
| AG9 | sum of ARIMA(1, 0, 0) per link on shortest path | τ |
| AG10 | sum of ARIMA(2, 0, 0) per link on shortest path | τ |
| AG11 | sum of ARIMA(3, 0, 0) per link on shortest path | τ |

6.4.3. Probabilistic Network Graph Models

Designed around the probabilistic network graph that is introduced in section 6.3 above, the following set of connectivity prediction models uses this data structure and associated parameter estimation methods to integrate precise or uncertain predictions of future node locations into the connectivity metric prediction. In section 6.3.4, the probabilistic network graph's construction is discussed, including the graph validity duration, T_v, and its reciprocal, the graph update frequency, f_G: with the frequency f_G, a new set of probabilistic network graphs is constructed, each covering an interval of duration T_v, to fill the complete forecasting horizon. Each graph in this set is then used to predict the connectivity metrics for the graph's covered interval. Consequently, the connectivity metric predictions themselves are updated with the frequency f_G as well.

Packet Delivery Ratio from Shortest Path

Given the shortest path between the observed flow's end-point nodes in the probabilistic network graph that is valid in the interval (t, t + T_v], this path's route probability, P_R, cf. equation (6.19), is directly interpreted as the observed flow's packet delivery ratio at time t, Q_PD(t).

Streaming Window Width from Next Graph Partition

Given the sequence of probabilistic network graphs that cover the upcoming forecasting horizon starting at time t, the next occurrence, at time t_p > t, of a graph partition that prevents a path between the observed flow's end-point nodes, is used as the flow's current mean remaining lifetime, E{T_t}. With equation (6.45), the mean remaining streaming window width is derived as E{W_t} = f_s · t_p.

End-to-end Communication Delay from per Hop Forecasts

Similar to the cross-layer prediction model for the end-to-end communication delay, per hop forecasts are accumulated to predict the end-to-end delay. But instead of using the forecasts from the current route, the forecasts are collected from the nodes along the probabilistic network graph's shortest path.

6.5. Online Supervised Learning of Second-level Adaptation Models

Compared to the black-box models that use actual observations of the time-series to forecast the communication delay and the rolling packet delivery ratio, the corresponding cross-layer and probabilistic network graph models have the principal disadvantage that they do not include actual observations at the receiving node in their predictions. This leaves the possibility for bias in their predictions from effects that are not represented in the models. To test for the existence of such bias and to simultaneously adapt the prediction result to it, a simple linear model is used as a second-level model. At every time-step, it transforms the first-level model's predicted value:

$$f_\theta : \mathbb{R} \to \mathbb{R} \tag{6.53}$$
$$\hat{y}_t = f_\theta(x_t) = \theta^T \cdot x_t \tag{6.54}$$
$$\text{with } \theta = (\theta_0, \theta_1)^T \in \mathbb{R}^2, \quad x_t = (1, \hat{y}_t^*)^T \in \mathbb{R}^2 \tag{6.55}$$

Here $\hat{y}_t^*$ is the predicted value from the first-level model at time $t$ and $\theta$ is the vector of model parameters. Its starting value, $(0, 1)^T$, passes $\hat{y}_t^*$ through without modification. All adaptation models are trained on-line during operation using the stochastic gradient descent method, without any pre-training. At each time-step at which a new prediction is performed, the algorithm uses the data from the last prediction horizon that has fully passed to compute the current gradient and adapts $\theta$ given the learning rate, $\alpha$, as described in section 2.4. Use of this second-level adaptation model is indicated by appending a slash and the learning rate to the first-level model's label, e.g., PG/0.1 to show that the probabilistic network graph model that predicts the packet delivery ratio is used with a second-level adaptation model that uses the learning rate $\alpha = 0.1$.

For the next stream interruption prediction models, the actual observation for a predicted value is not available at an a priori known instant and does not map to a single prediction. Rather, it becomes available with the occurrence of an interruption in the future that, at the time of its occurrence, needs to be associated with the previous predictions. This temporal dependency leads to a so-called credit assignment problem, which is targeted by reinforcement learning. Defining adaptation models based on reinforcement learning techniques is not covered here but left for future work.
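A minimal sketch of the second-level adaptation model from equations (6.53) to (6.55) follows. The squared-error loss, the averaging of the gradient over all pairs of the last fully passed horizon, and the class name `SecondLevelAdapter` are assumptions made for illustration; the exact update rule is the one described in section 2.4.

```python
class SecondLevelAdapter:
    """Linear correction y_hat = theta_0 + theta_1 * y_star, trained on-line."""

    def __init__(self, learning_rate):
        self.alpha = learning_rate
        # Starting value (0, 1)^T passes the first-level prediction through.
        self.theta = [0.0, 1.0]

    def predict(self, y_star):
        return self.theta[0] + self.theta[1] * y_star

    def update(self, pairs):
        """One gradient step on (first-level prediction, observation) pairs.

        Assumed loss: mean of 0.5 * (prediction - observation)^2 over the pairs.
        """
        n = len(pairs)
        residuals = [(self.predict(y_star) - y, y_star) for y_star, y in pairs]
        grad_0 = sum(r for r, _ in residuals) / n
        grad_1 = sum(r * y_star for r, y_star in residuals) / n
        self.theta[0] -= self.alpha * grad_0
        self.theta[1] -= self.alpha * grad_1


# Hypothetical use: a first-level model that overestimates by a constant 0.2.
adapter = SecondLevelAdapter(learning_rate=0.1)
last_horizon = [(0.4, 0.2), (0.7, 0.5), (1.0, 0.8)]
adapter.update(last_horizon)
corrected = adapter.predict(0.7)
```

With the starting value the adapter is the identity, so a first-level model without bias is left unchanged until observations suggest otherwise.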

6.6. Conclusion

In this chapter, three classes of prediction and forecasting models have been proposed for each of the connectivity metrics that are considered in this thesis. The proposed models are the candidate models that form the basis to answer research questions 2 and 3 in chapter 7 after selecting the best performing model per class and metric via cross-validation.

The proposed black-box models use plain forecasting methods to predict the time-series of communication delay and packet delivery ratio; they use an exponential failure rate model that is adapted from reliability engineering to predict the number of transmissions to the next stream interruption. The number of transmissions to the next stream interruption is the on-line representation of the streaming window width connectivity metric.

The cross-layer models predict the connectivity metrics by collecting and aggregating data and time-series forecasts from the nodes along the last received packet's route. The cross-layer packet delivery ratio model is derived from single link connectivity prediction methods that have already been proposed in the literature; the model's extension to end-to-end predictions, as well as the other black-box models, are original propositions from this thesis.

The probabilistic network graph models use potentially uncertain node location predictions to construct a graph of the network with edge weights that represent probabilities for successful communication between the nodes that are connected by that edge. While representing the network with a graph is well established, its specific use for predicting connectivity metrics, the use of link probabilities, and the model to predict link probabilities are original ideas proposed in this chapter.


7. Evaluation

Chapter 6 introduced three classes of prediction models: black-box models that predict the connectivity metrics purely by observing them at the receiving end-point node; cross-layer models that use information that is collected at nodes along the flow's current route; and probabilistic network graph models that use predicted node locations to predict the network's future topology and use this to predict the connectivity metrics. In each model class, a single prediction model is defined for the time to next stream interruption metric. For the remaining two metrics, packet delivery ratio and communication delay, models with varying forecasting methods and parametrizations of these methods are considered, cf. tables 6.1, 6.2, and 6.3.

The combined run-time of all proposed models for a single simulation run is too long to permit an extended evaluation of all models on a set of simulations that is large enough for statistically valid conclusions. Hence, a cross-validation step, in which all models are run on a small number of simulations, is used to reduce the number of models that are further considered in the evaluation. From the candidate models, one model and its parametrization is chosen per model class and metric by selecting the best performing one based on the cross-validation simulations. The 9 models that are selected by this method and, if not already included, the Naïve forecast models as reference, are then evaluated on the extended set of simulations to acquire results on the models' predictive performances.

7.1. Method for Model Assessment

The computation of the connectivity metrics for the prediction models' application and evaluation differs from the method that is used in chapter 5 for the connectivity interruption metrics. Figure 7.1 visualises the metrics' relation to each other, which is further described in the following:

• Instead of the overall scenario packet delivery ratio, a sliding window packet delivery ratio is calculated. The sliding window's width is set according to $n_{PD}$ in section 6.4. Depending on the use of the time-series, the sliding window is aligned differently. For the observed time-series, which is used as input to the forecasting models, the sliding window ranges from the last packet observed at the current time-step backwards. For the actual time-series that is used as the reference to calculate prediction errors, the window is centered on the packet at the current time-step in order to better reflect the actual packet delivery ratio.

• Instead of the full width of each interruption free window, i.e., the streaming window width that is used in chapter 5, the time varying equivalent of time or packets until next stream interruption is used, which continuously decreases during an interruption free window until it reaches 0 at the interruption's occurrence.

[Figure 7.1: timeline of packets 1 to 8 at the origin node (send times $t_{i,o}$) and at the destination node (receive times $t_{i,d}$), with the per packet delay $\tau_i = t_{i,d} - t_{i,o}$ ($\tau_4 = \tau_7 = \emptyset$ for lost packets), the packet delivery ratio window, and the next stream interruption count (3, 2, 1, 0, 2, 1, 0 for packets 1 to 7).]

Figure 7.1.: Connectivity metrics computation using Application layer packet timings from the discrete event simulations' data.

For cross-validation, scaled error measures are used as scores to make the models' performances easier to compare; the number of underlying observations varies for some of the models because of the way they handle missing data. In case of the communication delay and packet delivery ratio, which are both treated as ordinary time-series, the Mean Squared Scaled Error (MSSE) that is recommended by Hyndman and Athanasopoulos [HA13] is used. That means forecast errors are scaled with mean errors from the Naïve forecast. Given the forecast errors $e_j$ for $j = 1, \dots, T$ observations:

\[ \mathrm{MSSE} = \frac{1}{T} \sum_{j=1}^{T} \frac{e_j^2}{\frac{1}{T-1} \sum_{i=2}^{T} (y_i - y_{i-1})^2} \tag{7.1} \]
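The metric computations listed at the start of this section can be sketched as follows. The packet pattern reproduces the one in figure 7.1 (packets 4 and 7 are lost); the concrete time stamps, the function names, and the truncation of windows at the borders of the series are assumptions made for illustration.

```python
def delays(sent, received):
    """Per packet delay t_d - t_o; None marks a lost packet."""
    return [None if r is None else r - s for s, r in zip(sent, received)]


def transmissions_until_interruption(received):
    """Countdown that reaches 0 at every lost packet, as in figure 7.1."""
    counts = [None] * len(received)
    remaining = None  # unknown as long as no later loss has been seen
    for i in range(len(received) - 1, -1, -1):
        if received[i] is None:
            remaining = 0
        elif remaining is not None:
            remaining += 1
        counts[i] = remaining
    return counts


def backward_window_pdr(received, i, width):
    """Delivery ratio over the last `width` packets up to and including i."""
    window = received[max(0, i - width + 1): i + 1]
    return sum(r is not None for r in window) / len(window)


def centered_window_pdr(received, i, width):
    """Delivery ratio over a window of about `width` packets centered on i."""
    half = width // 2
    window = received[max(0, i - half): i + half + 1]
    return sum(r is not None for r in window) / len(window)


# Hypothetical time stamps in seconds for the eight packets of figure 7.1.
sent = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
received = [0.5, 1.5, 2.25, None, 4.5, 5.5, None, 7.25]
countdown = transmissions_until_interruption(received)
observed_pdr = backward_window_pdr(received, 4, 4)   # input to the forecasts
reference_pdr = centered_window_pdr(received, 3, 3)  # reference for the errors
```

The countdown stays undefined after the last observed loss, because the next interruption lies beyond the available data.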

Because the next stream interruption metric is not predictable using standard forecasting methods and consequently the Naïve forecast is not applicable, the cross-sectional total sum of squares, cf. section 2.4, is used for scaling instead:

\[ \mathrm{MSSE} = \frac{1}{T} \sum_{j=1}^{T} \frac{e_j^2}{\frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{7.2} \]

Table 7.1.: Usage of the adaptation models for cross-validation.

Metric   Base model   Included adaptation models
QPD      PX           PX/0.05, PX/0.1, PX/0.2
         PG           PG/0.05, PG/0.1, PG/0.2
τ        NX           NX/0.05, NX/0.1, NX/0.2
         AX1          AX1/0.1
         AX4          AX4/0.1
         AX7          AX7/0.1
         AX9          AX9/0.1
         NG           NG/0.05, NG/0.1, NG/0.2
         AG1          AX1/0.1
         AG4          AX4/0.1
         AG9          AX7/0.1

This scaling has a similar effect as the scaling of the mean squared error in equation (7.1), but here the data's mean is the reference instead of the Naïve forecast. In both cases all scores are positive, lower scores are better, and a value of 1 indicates equality with the reference.

To evaluate the effect that the use of the adaptation model has on the predictive performance, a set of models is run with the adaptation model in addition to the base models that are listed in tables 6.1, 6.2, and 6.3. The three learning rates 0.05, 0.1, and 0.2 are used as table 7.1 shows. For a single model per metric and model class, all three learning rates are run to be able to compare their individual effects. For a selected set of additional base models, the adaptation model is configured with a learning rate of 0.1 to have more data available to assess whether the use of the adaptation model has any general effect on the prediction errors. The adaptation model is not used for all base models to keep the overall run-time at a reasonable level. For all three metrics combined, 79 models are included in the cross-validation step.

To assess the adaptation model's influence in the cross-validation, the relative score difference compared to the base model is tested against 0. Given the adaptation model's score, $s_i$, and its base model's score, $s_i^*$, the relative score difference, $s_{\Delta,i}$, is calculated as:

\[ s_{\Delta,i} = \frac{s_i - s_i^*}{s_i^*} \tag{7.3} \]
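The two scaled scores from equations (7.1) and (7.2) and the relative score difference from equation (7.3) can be sketched as below. The function names and the example series are hypothetical; the sketch assumes that the errors and the observations used for scaling are passed as plain lists.

```python
def msse_naive(errors, observations):
    """Equation (7.1): squared errors scaled by the Naive forecast's mean squared error."""
    steps = [(b - a) ** 2 for a, b in zip(observations, observations[1:])]
    scale = sum(steps) / len(steps)  # 1 / (T - 1) * sum over i = 2..T
    return sum(e * e / scale for e in errors) / len(errors)


def msse_total(errors, observations):
    """Equation (7.2): squared errors scaled by the data's mean squared deviation."""
    mean = sum(observations) / len(observations)
    scale = sum((y - mean) ** 2 for y in observations) / len(observations)
    return sum(e * e / scale for e in errors) / len(errors)


def relative_score_difference(score, base_score):
    """Equation (7.3): negative values mean the adaptation model scores better."""
    return (score - base_score) / base_score


# Hypothetical series: a model whose errors equal the Naive forecast's errors.
series = [1.0, 2.0, 4.0, 7.0]
naive_errors = [b - a for a, b in zip(series, series[1:])]
score_against_naive = msse_naive(naive_errors, series)
```

A model that is exactly as good as the reference obtains a score of 1, which matches the interpretation given above.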

Given the relative score differences, the null hypothesis that $s_{\Delta,i}$ is larger than or equal to 0 is tested using the one-sided Wilcoxon signed rank test [cf. Pea11]. This non-parametric test is chosen instead of Student's t-test, because there is no reason to assume normally distributed relative score differences. Furthermore, the selection of only two simulation runs per configuration, which limits the run-times required for cross-validation, yields only a relatively small sample data-set that prevents the central limit theorem's application.

The primary error measure to evaluate the prediction errors in section 7.4 is the standard error of regression, cf. equation (A.3) in appendix section A.2, because it allows a comparison with the data's actual scale. For every simulation run and prediction model, the standard error of regression is computed for forecast horizons ranging from 10 to 60 discrete time-steps, i.e., transmissions, with a step size of 10. Per evaluated model and forecast horizon, the aggregated median and maximum standard error of regression from all simulation runs is reported; for both statistics, 95% confidence intervals are computed using bootstrapping with the adjusted percentile method that is recommended by Davison and Hinkley [DH97] for non-parametric, unknown distributions instead of the basic, Student, or percentile methods. An important advantage of the adjusted percentile method over the otherwise considered alternative Student method is that it may be used to compute the maximum statistic's asymmetric confidence interval.

The influence that the forecast horizon has on the prediction errors is formally assessed by fitting a linear regression model to describe the standard error of regression per metric and model using the forecasting horizon, $h$, as single independent variable:

\[ \text{std. error of regression} = a_0 + \text{slope} \cdot h + \varepsilon \tag{7.4} \]

The resulting slope parameter indicates how strongly and in what direction the forecast horizon influences the standard error of regression. The parameter's p-value, which is estimated while fitting the model, states the statistical significance of the slope, i.e., the probability of observing an estimate at least as extreme as the fitted one if the true slope were 0.
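The one-sided Wilcoxon signed rank test on the relative score differences, described earlier in this section, can be sketched with an exact enumeration of all sign assignments. This is an illustrative implementation for small samples only, not necessarily the procedure from [Pea11]; dropping zero differences and using average ranks for ties are assumptions, and the function name and sample values are hypothetical.

```python
from itertools import product


def wilcoxon_less(differences):
    """Exact p-value for H0: differences >= 0 against H1: differences < 0.

    Returns (W+, p), where W+ is the rank sum of the positive differences.
    All 2^n sign assignments are enumerated, so n must stay small.
    """
    d = [x for x in differences if x != 0]  # zeros carry no sign information
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank for tied values
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    at_most = sum(
        1
        for signs in product((0, 1), repeat=len(d))
        if sum(r for r, s in zip(ranks, signs) if s) <= w_plus
    )
    return w_plus, at_most / 2 ** len(d)


# Hypothetical relative score differences of six adaptation models.
improved = [-0.60, -0.70, -0.55, -0.80, -0.65, -0.90]
w_plus, p_value = wilcoxon_less(improved)
```

If every difference is positive, all sign assignments have a rank sum of at most W+, and the function returns a p-value of exactly 1.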

7.2. Simulation Scenarios

Two scenarios that use parameters from the parameter sets that were introduced in section 5.1.2, but with different mobility patterns, are used for cross-validation and subsequent evaluation. For each scenario, three configurations that differ only in the routing protocol are used. As for the experimental study in chapter 5, the routing protocols are AODV, DSR, and OLSR. For every scenario configuration, 20 replications are run with an individual simulated time of 1800 s, in order to minimize the potential for bias because of the discrete event simulation's stochastic nature. After its execution, every simulation run is checked for the inclusion criterion of a total packet delivery ratio larger than 4%. This value allows for large and frequent interruptions, but removes simulation runs with only a handful of successful transmissions. Every simulation run that does not fulfill the inclusion criterion is dropped without replacement, potentially reducing the number of simulation runs that are used for evaluation. From every set of simulation configuration parameters, i.e., for every combination of routing protocol and scenario, 2 simulation runs (total: 12) are used for the cross-validation step. The remaining, up to 18, simulation runs per configuration (total: up to 108) are used to evaluate the selected models' prediction errors.

The scenarios use the following parametrization:

Scenario 1: Simple average scenario: medium node density, medium square area, medium network traffic, medium amount of other traffic, random way-point mobility with walking speed, and both origin and destination are moving.

Scenario 2: Small emergency scenario: very low node density, small square area, medium network traffic, low amount of other traffic, random walk mobility with walking speed, and both origin and destination remain stationary.

Instead of the default OMNeT++ mobility models, these scenarios use the improved mobility models that are provided by the BonnMotion package [cf. Asc+10] (see footnote 1):

• While random way-point is the most common mobility model in simulation studies, its initial state does usually not reflect the steady-state that emerges during the simulation runs [HOG13]. To overcome this initialization issue, BonnMotion's steady-state random way-point model implementation is used here instead. It directly starts with the node distribution present in the regular model's steady-state [Asc+10].

• Hiranandani, Obraczka, and Garcia-Luna-Aceves [HOG13] recommend the self-similar least action walk from [Lee+09] to represent realistic human movement: the mobility of nodes is modeled as trips along points of interest. The model is intended to simulate human movement over the course of multiple days: every node begins and ends all its trips at the same location and may choose to add a new point of interest to its set of locations to be visited after one day, defined as a period of 12 hours, has passed. Munjal, Camp, and Navidi [MCN11] present a similar mobility model, called Smooth, that imposes fewer constraints on the represented time scales. Hence, the scenario uses the latter as mobility model to represent human walk.

Footnote 1: BonnMotion version v2.1a (09/08/2013), available from http://sys.cs.uos.de/bonnmotion/src/bonnmotion-2.1a.zip.

Keeping both a flow's origin and destination node stationary ensures a minimum route length and isolates their movement influence while reflecting a typical telemedical consultation scenario.

All moving nodes are initialized with random starting locations; their locations and velocities are updated in an interval of 0.1 s. Stationary nodes are not positioned randomly, but are instead positioned at special locations. If both origin and destination node are stationary, then their locations are set as follows:

• Both the origin and the destination nodes are centered vertically, i.e., along the y-axis, in the simulation area.

• The origin node's horizontal, i.e., x-axis, coordinate is 50 m, i.e., 50 m from the simulation area's left border.

• The destination node's horizontal coordinate is 50 m less than the simulation area's horizontal side length, i.e., 50 m from the simulation area's right border.

The vertically centered location ensures that the edge effect is reduced to the side that faces away from the transmission's destination or origin respectively. Edge effects appear when parts of a node's communication area lie outside the simulation area and hence are effectively unusable [KCC05]. The 50 m distance from the simulation area's border is a compromise between reducing the edge effect and reducing the simulation area that is mostly irrelevant for the observed flow between origin and destination node.
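The placement rule for stationary end-point nodes and the inclusion criterion from this section can be written down in a few lines. The function names and the example area of 1000 m by 600 m are hypothetical; only the 50 m margin, the vertical centering, and the 4% threshold are taken from the text.

```python
def stationary_end_points(width, height, margin=50.0):
    """Origin and destination coordinates in metres for a width x height area."""
    y = height / 2.0                       # both nodes are centered vertically
    origin = (margin, y)                   # 50 m from the left border
    destination = (width - margin, y)      # 50 m from the right border
    return origin, destination


def passes_inclusion(delivered, sent, threshold=0.04):
    """Inclusion criterion: total packet delivery ratio larger than 4%."""
    return sent > 0 and delivered / sent > threshold


origin, destination = stationary_end_points(1000.0, 600.0)
```

The criterion is a strict inequality, so a run with a total packet delivery ratio of exactly 4% is dropped.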

7.3. Prediction Model Cross-Validation

7.3.1. Results

For the communication delay, $\tau$, time-series, the Wilcoxon signed rank test does not reject the null hypothesis (p-value: 1) that using the adaptation model does not reduce the forecasting errors at the 5% significance level. In case of the packet delivery ratio, $Q_{PD}$, the test strongly rejects the same null hypothesis (p-value: $1.1 \times 10^{-13}$). Figure 7.2 shows box plots for the $s_{\Delta,i}$ of both metrics, grouped by model class and learning rate. Medians and the lower quartiles of $s_{\Delta,i}$ for the communication delay are larger than, but close to, 0; only outliers are larger than 0.5 and the upper quartiles are all smaller than 0.25. All $s_{\Delta,i}$ but three outliers from the cross-layer model are lower than −0.5 for the packet delivery ratio prediction. Here, larger learning rates show a small tendency of reducing $s_{\Delta,i}$, but this cannot be stated with significance.

[Figure 7.2: box plots in two panels (delay, packet delivery ratio); x-axis: model class (cross-layer, prob. netw. graph); y-axis: normalized difference in score; grouped by learning rate (0.05, 0.1, 0.2).]

Figure 7.2.: Normalized differences in model score, $s_{\Delta,i}$, for prediction models with and without 2nd-level adaptation models.

Figure 7.3 shows the cross-validation score for the 4 best communication delay forecasting models per model class. Of the black-box models, the Naïve forecast, N, yields the best result ($s_\Delta = 1$); the next best model is the ARIMA(0, 0, 1), A1, which has more than double the score ($s_\Delta = 2.23$). The best cross-layer model, AX1 ($s_\Delta = 0.89$), uses ARIMA(1, 0, 0) models for the per hop forecasts. The scores of the next three models are less than 0.001 larger. For the probabilistic network graph models, the 4 best models (AG2, AG3, AG4, and AG5) all achieve identical scores ($s_\Delta = 3.09$). The 4 models use ARIMA(1, 0, 0), ARIMA(2, 0, 0), ARIMA(1, 0, 1), and ARIMA(2, 0, 2) models respectively for the per hop forecasts.

Figure 7.4 shows the cross-validation score for the 4 best packet delivery ratio prediction models per model class. Exponential smoothing with re-estimation of its parameters, ER ($s_\Delta = 0.97$), is the best performing black-box forecasting model, closely followed by the E and the N model. The best cross-layer model, PX/0.2 ($s_\Delta = 6.94$), uses the adaptation model with a learning rate of 0.2. The top three models use the 2nd-level adaptation model; at the smallest learning rate the score increases to $s_\Delta = 7.65$. Out of the probabilistic network graph models, the one with the highest learning rate, PG/0.2 ($s_\Delta = 4.10$), performed best. At the smallest learning rate, the score increases to $s_\Delta = 4.39$.

[Figure 7.3: cross-validation scores in three panels (black-box: N, A1, A2, A3; cross-layer: AX1, AX2, AX3, AX4; prob. netw. graph: AG2, AG3, AG4, AG5); x-axis: model; y-axis: score.]

Figure 7.3.: Comparison of the 4 communication delay forecasting models per model class with lowest mean score.

[Figure 7.4: cross-validation scores in three panels (black-box: ER, E, N, A1; cross-layer: PX, PX/0.05, PX/0.1, PX/0.2; prob. netw. graph: PG, PG/0.05, PG/0.1, PG/0.2); x-axis: model; y-axis: score.]

Figure 7.4.: Comparison of the 4 packet delivery ratio forecasting models per model class with lowest mean score.

Table 7.2.: Selection of models for further evaluation after cross-validation.

Metric   Black-box   Cross-layer   Probabilistic network graph
τ        N           NX, AX1       NG, AG4
QPD      N, ER       PX/0.2        PG/0.2
Wt       F1          SX            SG

7.3.2. Discussion

A significant improvement of prediction accuracy when using the 2nd-level adaptation model is found for the packet delivery ratio models. Comparison of the models' mean scores shows a score reduction of between 1/3 and nearly 1/6 compared to the base model. The highest learning rate (0.2) yields the best results, but the improvement over the smallest learning rate (0.05) is barely noticeable. Hence, the packet delivery ratio models PX and PG will be used with the larger learning rate of 0.2.

Figure 7.2 strongly supports the result of the hypothesis test, which does not reject the null hypothesis for the communication delay. The figure even suggests that using the 2nd-level adaptation model actually increases the prediction error in most cases.

Based on the cross-validation results, table 7.2 lists the models that are further evaluated. In addition to the 9 best performing models, the Naïve forecasting models N, NX, and NG to forecast packet delivery ratio and communication delay respectively, are included. The Naïve model to forecast packet delivery ratio is required to scale forecasting errors to compute MSSE. NX and NG are included because their results in the cross-validation are not considerably worse than those of the respectively best performing models, while they are algorithmically considerably simpler.

7.4. Prediction Errors in Simulation Studies

7.4.1. Results

From the total of 108 simulation runs (54 per scenario) intended for evaluation of prediction performance, 59 fulfilled the inclusion criterion (scenario 1: 35, scenario 2: 24). Of these, the median total packet delivery ratio is 47.2% for scenario 1 and 46.4% for scenario 2. The total packet delivery ratios' distributions, shown in figure 7.5, are both nearly uniform. The median hop counts are 3 (range 1 to 6) and 4 (range 3 to 6) respectively. Figure 7.6 shows that the median hop count per run in scenario 1 is smaller than 2 in only 8.6% of the runs and equal to 2 in 34% of the runs; for scenario 2, only 4% of the runs have a median hop count smaller than 4 and 71% have a median hop count equal to 4. Only the simulation runs that fulfill the inclusion criterion are included in the previous statistics.

[Figure 7.5: empirical cumulative density curves for scenarios 1 and 2; x-axis: packet delivery ratio; y-axis: empirical cumulative density.]

Figure 7.5.: Empirical cumulative density function of total packet delivery ratios per simulation run, grouped by scenario.

[Figure 7.6: empirical cumulative density curves for scenarios 1 and 2; x-axis: median hop count; y-axis: empirical cumulative density.]

Figure 7.6.: Empirical cumulative density function of median hop counts per simulation run, grouped by scenario.

Table 7.3 shows the estimates for the forecast horizon's influence on the predictions' standard errors of regression, for the investigated horizons in the range from 10 to 60 transmissions ahead. In case of the models N and ER that forecast packet delivery ratio, a small, significant rise of the error with increasing forecast horizon (N: $1.04 \times 10^{-3}$ · horizon, ER: $1.25 \times 10^{-3}$ · horizon) is estimated. For all other models, the smallest p-value is $3.19 \times 10^{-2}$ (metric: $\tau$, model: AG4). In any case, the adjusted coefficient of determination for the linear models that were used to estimate the forecast horizon's influence is close to 0 (range: [−0.003, 0.011]).

Table 7.3.: Estimation of the forecast horizon's influence on the prediction errors via linear model with horizon and intercept as independent variables.

Metric   Model    Slope           p-value        Adjusted R²
τ        N         5.43 × 10^-3   1.03 × 10^-1    0.005
         NX        1.56 × 10^-5   9.94 × 10^-1   −0.003
         AX1      −8.92 × 10^-5   9.65 × 10^-1   −0.003
         NG       −1.13 × 10^-3   6.64 × 10^-1   −0.003
         AG4       1.06 × 10^8    3.19 × 10^-2    0.011
QPD      N         1.04 × 10^-3   2.42 × 10^-4    0.035
         ER        1.25 × 10^-3   8.28 × 10^-6    0.053
         PX/0.2    3.66 × 10^-5   9.11 × 10^-1   −0.003
         PG/0.2    3.60 × 10^-4   1.36 × 10^-1    0.004

Communication Delay

Figures 7.7 and 7.8 show the communication delay prediction models' median and maximum standard errors of regression per forecast horizon respectively; the bars indicate the statistics' 95% confidence intervals. The median standard error of regression for the Naïve forecast model (N) varies without visible trend in a small band between 0.31 s and 0.32 s, with confidence intervals in the range of 0.26 s to 0.41 s. The model's maximum error on the other hand steadily increases from 3.2 s at a forecast horizon of 10 up to 7.2 s at a forecast horizon of 60.

Both cross-layer models have a very similar median standard error of regression (NX: 0.22 s to 0.23 s, AX1: 0.21 s to 0.22 s) with fully overlapping confidence intervals (range: 0.16 s to 0.29 s) and no visible trend. Likewise, their maximum standard errors of regression lie in the range of 3.41 s to 3.93 s (NX) and 3.35 s to 4.10 s (AX1) with confidence intervals that reach down to 0.64 s. Starting at a forecast horizon of 30, both models' maximum standard error of regression shows a small but monotone upward slope.

The two probabilistic network graph models have very similar confidence intervals for their median standard errors of prediction (NG range: 0.20 s to 0.34 s and AG4 range: 0.20 s to 0.37 s). The model that uses intermediate Naïve forecasts has a marginally lower median standard error of prediction (range: 0.25 s to 0.27 s) for all forecast horizons than the alternative that uses an ARIMA model (range: 0.27 s to 0.31 s). The maximum standard errors of regression for the models, on the other hand, strongly differ (NG range: 5.00 s to 5.75 s and AG4 range: 5.00 s to $2.14 \times 10^{11}$ s), with the AG4 model having a very steep slope at higher forecast horizons.

Packet Delivery Ratio

Figures 7.9 and 7.10 show the packet delivery ratio prediction models' median and maximum standard errors of regression per forecast horizon respectively; the bars indicate the statistics' 95% confidence intervals. The two black-box models both show median standard errors of regression that increase monotonically with the forecasting horizon over an almost equal range (N range: 0.11 to 0.16, ER range: 0.11 to 0.17). Their confidence interval widths increase monotonically too (N: 0.03 to 0.05, ER: 0.03 to 0.06). The maximum standard errors of regression of the black-box models differ more than the errors' median statistics.
In case of the Naïve forecast (N), it increases monotonically with constant slope from 0.48 to 0.50; in case of the exponential smoothing forecast (ER), it increases monotonically from 0.48 to 0.57 with a sharp increase of the slope at a forecast horizon of 40 steps.

For the cross-layer model (PX/0.2), the median standard error of regression decreases monotonically with increasing forecast horizon from 0.35 to 0.34 at a nearly constant confidence interval of range 0.30 to 0.40. The cross-layer model's maximum standard error of regression increases from 0.68 to 0.72 and has confidence intervals that reach down to 0.50 (with 2 outliers at ±0.01), independent of forecast horizon.

The median standard error of regression for the probabilistic network graph model (PG/0.2) increases monotonically from 0.28 to 0.30. The median statistic's confidence intervals' lower bounds increase from 0.23 to 0.26 and the upper bounds increase from 0.30 to 0.32. The model's maximum standard error of regression, except for 2 outliers (at horizon 40: 0.43), decreases from 0.48 to 0.44. Its confidence intervals' lower bounds reach down to 0.40 with 2 exceptions (at horizon 10: 0.38, at horizon 20: 0.39).

[Figure 7.7: panels N, NX, AX1, NG, AG4; x-axis: forecast horizon [num. steps]; y-axis: median standard error of regression [s].]

Figure 7.7.: Median of observed communication delay forecasts' standard errors of regression with bootstrapped ($n = 10^4$) 95% confidence intervals.

[Figure 7.8: panels N, NX, AX1, NG, AG4; x-axis: forecast horizon [num. steps]; y-axis: max standard error of regression [s].]

Figure 7.8.: Maximum of observed communication delay forecasts' standard errors of regression with bootstrapped ($n = 10^4$) 95% confidence intervals.

Next Stream Interruption

Figure 7.11 shows the next stream interruption prediction models' distributions of their standard errors of regression. No results were obtained for the probabilistic network graph model (SG). The black-box model (F1) has a median standard error of regression of 14.1 transmissions (range: 0.5 to 111.8); the cross-layer model (SX) has a median standard error of regression of 94.2 (range: 1 to 1536.5).
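The bootstrapped confidence intervals reported in this section rely on the adjusted percentile method. A minimal sketch of that method in its common bias-corrected and accelerated form is given below. The jackknife estimate of the acceleration, the half weight for bootstrap replicates that tie with the sample statistic, the default of 2000 resamples (the figures state $n = 10^4$), and the function name `bca_interval` are assumptions made for illustration, not details taken from [DH97] or from the thesis' implementation.

```python
import random
from statistics import NormalDist, median


def bca_interval(data, statistic, n_boot=2000, level=0.95, seed=1):
    """Adjusted percentile (BCa) bootstrap interval for statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    theta = statistic(data)
    boot = sorted(
        statistic([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    norm = NormalDist()

    # Bias correction from the share of replicates below the sample statistic.
    below = sum(b < theta for b in boot) + 0.5 * sum(b == theta for b in boot)
    share = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = norm.inv_cdf(share)

    # Acceleration from leave-one-out (jackknife) values of the statistic.
    jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    mean_jack = sum(jack) / n
    num = sum((mean_jack - j) ** 3 for j in jack)
    den = 6 * sum((mean_jack - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def bound(alpha):
        z = z0 + norm.inv_cdf(alpha)
        # Assumes 1 - accel * z > 0, which holds for moderate accelerations.
        adjusted = norm.cdf(z0 + z / (1 - accel * z))
        index = min(n_boot - 1, max(0, int(adjusted * n_boot)))
        return boot[index]

    tail = (1 - level) / 2
    return bound(tail), bound(1 - tail)


# Hypothetical standard errors of regression from 20 simulation runs.
errors = [0.2 + 0.01 * i for i in range(20)]
median_ci = bca_interval(errors, median)
maximum_ci = bca_interval(errors, max)
```

For the maximum statistic no bootstrap replicate can exceed the sample maximum, so the resulting interval is necessarily asymmetric, which is the property highlighted in section 7.1.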

7.4.2. Discussion

The utilized simulation scenarios provide data to evaluate the models' prediction performances for more than 1 hop. While scenario 1 has a considerable number of 2-hop transmissions, more than 50% of its simulation runs have a median hop count of 3 or larger. Scenario 2 produced even better data in this regard, with only very few simulation runs that have a median hop count smaller than 4. The observed packet delivery ratios cover the full spectrum above the cutoff that was used as inclusion criterion.

Regarding the forecast horizon's influence, the estimated p-values are too large to reject the null hypothesis that states that the forecast horizon has no influence on the standard error of regression for all cases but the ones with a slope very close to 0. While these results suggest that the forecast horizon has no influence, these models' goodness of fit values show a very poor descriptive quality regarding the actual observations. From plots of median standard errors of regression, cf. figures 7.7 and 7.9, no evidence supports assuming a significant influence of the forecast horizon on the prediction error. The plots of maximum standard errors of regression, cf. figures 7.8 and 7.10, on the other hand suggest that at least the maximum prediction error mostly increases with a growing forecast horizon.

Both for the cross-layer and the probabilistic network graph delay prediction models, the ones that use Naïve intermediate forecasts (NX and NG) perform equally well to or even (in case of the maximum prediction error of the NG versus the AG4 model at larger forecast horizons) vastly better than their counterparts with more advanced intermediate forecasts. An important reason for this observation is that the intermediate forecasts are based on sliding window averages of the actually observed values, which effectively transforms the Naïve forecast into a moving average model.
For predicting the communication delay, all three model classes show prediction errors with the same order of magnitude. The cross-layer models perform best (NX and AX1), closely followed by the probabilistic network graph model with Naïve intermediate forecast (NG). The Naïve forecast black-box model shows the worst prediction performance. The cross-layer models' close but still better performances compared to the NG probabilistic network graph model indicate that the prediction of routes from the probabilistic network graph is good but has potential for improvement. Compared to the telemedicine application scenario's requirements of delays ideally less than 3 s and a hard upper bound of 10 s, the observed prediction errors allow predicting when the required delays will be violated; the maximum prediction errors still lie well below the required communication delay's hard upper bound. In contrast, the required time scale of 0.1 s communication delay in the cooperative vehicles application scenario lies below the minimum achieved median standard error of regression and hence, cannot be predicted reliably.

[Figure 7.9: panels N, ER, PX/0.2, PG/0.2; x-axis: forecast horizon [num. steps]; y-axis: median standard error of regression.]

Figure 7.9.: Median of observed packet delivery ratio forecasts' standard errors of regression with bootstrapped ($n = 10^4$) 95% confidence intervals.

[Figure 7.10: panels N, ER, PX/0.2, PG/0.2; x-axis: forecast horizon [num. steps]; y-axis: max standard error of regression.]

Figure 7.10.: Maximum of observed packet delivery ratio forecasts' standard errors of regression with bootstrapped ($n = 10^4$) 95% confidence intervals.

[Figure 7.11: panels F1 and SX; x-axis (logarithmic, 1 to 1000): standard error of regression [num. transmissions].]

Figure 7.11.: Observed transmissions until next stream interruption predictions' standard errors of regression (scatter plot with overlaid box plot).

The packet delivery ratio prediction is not subject to explicitly stated requirements. Rather, scheduling of proactive packet retransmissions that are used to increase reliability and decrease communication delay of real-time communication when experiencing packet losses may be optimized given predictions of future packet delivery ratios: the number of packet retransmissions can be adapted to achieve a required probability of successful end-to-end message transmission. Given such a proactive retransmission scheme, the expected prediction error has to be added to the predicted packet delivery ratio as safety margin before calculating the necessary number of proactive packet retransmissions.

A purely stochastic prediction of the time until next stream interruption that is provided by the best evaluated black-box model (F1) is most helpful to put the packet delivery ratio, from which it is derived, into perspective of actual future transmissions.
Conceptually, predicting the metric from the anticipated network state, as is the case with the cross-layer and probabilistic network graph models, allows adaptations to be performed based on the prediction. For this purpose, however, the SX model’s median prediction error of close to 100 time-steps is too high to be used reliably. The missing results for the probabilistic network graph model invalidate the assumption that, at the node densities used for evaluation, network partitioning is the primary cause of packet loss.
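To illustrate how a purely stochastic estimate of the time until next stream interruption can follow from a packet delivery ratio, consider the following minimal sketch. It assumes independent packet losses and a constant per-transmission failure rate of 1 - PDR. The function names and this mapping from packet delivery ratio to failure rate are assumptions made for illustration only; the F1 model as defined in chapter 6 may differ in detail.

```python
import math


def failure_rate_from_pdr(pdr: float) -> float:
    """Assumed per-transmission failure rate for a given packet delivery ratio."""
    if not 0.0 <= pdr <= 1.0:
        raise ValueError("pdr must lie in [0, 1]")
    return 1.0 - pdr


def expected_transmissions_until_interruption(pdr: float) -> float:
    """Mean of an exponential distribution with rate (1 - pdr)."""
    rate = failure_rate_from_pdr(pdr)
    return math.inf if rate == 0.0 else 1.0 / rate


def survival_probability(pdr: float, n: float) -> float:
    """Probability that the stream survives n further transmissions."""
    return math.exp(-failure_rate_from_pdr(pdr) * n)
```

Under these assumptions, a higher predicted packet delivery ratio translates directly into a longer expected time until the next interruption, which is why such a prediction adds little information beyond the packet delivery ratio itself.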

7.5. Conclusion

In the cross-validation step, the set of candidate models proposed in chapter 6 is reduced to one prediction model per model class and metric. Because the cross-layer and probabilistic network graph models that use intermediate Naïve forecasts perform similarly to the best model in their model class according to cross-validation, they were included in the further evaluation as well. The intermediate Naïve forecasts are the simplest models and are preferred over the more advanced forecast models in case of similar prediction errors. The model evaluation, based on carefully designed simulation studies using discrete event simulations, is presented and discussed in detail to finally answer research questions 2 and 3 with this conclusion.

The preferred prediction models per metric, based on the observed prediction performance, are: the cross-layer model with Naïve intermediate forecast model (NX) for communication delay, the black-box Naïve forecast model (N) for packet delivery ratio, and the black-box failure rate model (F1) for next stream interruption. While the probabilistic network graph models do not beat the best model per metric in terms of prediction errors, they nevertheless show good prediction performance and perform second best for communication delay and packet delivery ratio prediction. At the same time, the probabilistic network graph models offer the most potential for improvement by adapting the models contained therein to actual observations during operation. In the models’ current form, observing the actual route yields better results for communication delay prediction than the current route prediction model in the probabilistic network graph. However, the probabilistic network graph’s route probability prediction performs significantly better than the cross-layer models when predicting packet delivery ratios.
In all, the probabilistic network graph models’ performances show the concept’s potential and hint at the primary target for future improvement: the link probability prediction at the core of the probabilistic network graph model suffers from a bias that yields overly optimistic predictions. Incorporating feedback loops to allow on-line adaptation at the model’s link level is the best approach to improve its performance.

The achieved communication delay prediction errors, with a median value below 0.3 s at 95% confidence up to 60 time-steps ahead, fulfill the requirements of the telemedicine application scenario but are too large to be usable inside the control loop of the cooperative vehicles application scenario. The packet delivery ratio prediction errors, with a median value below 20% (absolute error) at 95% confidence up to 60 time-steps ahead, benefit both application scenarios because they allow the number of proactively scheduled packet retransmissions for real-time control communication to be adapted. The time to next stream interruption models in their proposed form do not offer additional value for system adaptation over the packet delivery ratio prediction models.
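The adaptation of proactive retransmissions to a predicted packet delivery ratio can be sketched as follows. The sketch assumes that the transmissions of one message are lost independently of each other, so that n transmissions succeed with probability 1 - (1 - p)^n. The function name and parameters are hypothetical, and the safety margin is applied conservatively here, i.e., it lowers the assumed delivery ratio so that more retransmissions are scheduled; this sign convention is an assumption of the sketch, not a statement about the evaluated models.

```python
import math


def proactive_retransmissions(predicted_pdr: float,
                              target_success: float,
                              margin: float = 0.0) -> int:
    """Retransmissions needed to reach target_success (assumes independent losses)."""
    if not 0.0 < target_success < 1.0:
        raise ValueError("target_success must lie in (0, 1)")
    # Conservative delivery ratio after applying the expected prediction error.
    p = predicted_pdr - margin
    if p <= 0.0:
        raise ValueError("no finite number of transmissions reaches the target")
    if p >= 1.0:
        return 0
    # Smallest n with 1 - (1 - p)**n >= target_success.
    transmissions = math.ceil(math.log(1.0 - target_success) / math.log(1.0 - p))
    return max(1, transmissions) - 1
```

For example, a predicted packet delivery ratio of 0.8 and a target success probability of 0.99 require three transmissions in total, i.e., two proactive retransmissions; with a margin of 0.2 the same target requires five retransmissions.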


8. Conclusion

8.1. Summary

In the context of real-time communication via MANETs for future generations of CPSs, the influence that contextual factors exert on the device connectivity of mobile network participants has been investigated. Furthermore, methods to predict the connectivity during operation have been proposed and evaluated. Whilst current state-of-the-art research on device connectivity in MANETs mostly neglects the mobility of the networks’ nodes, this factor has been considered here.

In chapter 5, an original, systematic, full factorial study design based on discrete event simulations has been developed. From a thorough statistical analysis of the study’s simulation outcomes, the contextual factors, in the form of scenario parameters, that have a statistically significant influence on the three connectivity metrics (end-to-end communication delay, packet delivery ratio, and streaming window width) were identified, answering research question 1 as follows:

End-to-end communication delay: No significant influence of the investigated contextual factors was found. Minor temporal correlation exists in this metric, suggesting the use of time-series forecasting methods. Additionally, local phenomena that are not represented at the scenario parameter level exist and may be exploited for prediction.

Packet delivery ratio: The three contextual factors geographic distance of communication end-point nodes, average node speed, and network load were found to have a significant influence on the packet delivery ratio. All three contextual factors have a negative correlation with the connectivity metric.

Streaming window width: The two contextual factors geographic distance of communication end-point nodes and average node speed were found to have a significant influence on the streaming window width. Both contextual factors have a negative correlation with the connectivity metric.
In chapter 6, three classes of prediction models per connectivity metric and a 2nd-level on-line adaptation model have been proposed. The best performing of these models are selected in a cross-validation step in chapter 7 and then evaluated in detail to contribute answers to research questions 2 and 3. The best performing prediction models per connectivity metric are:

End-to-end communication delay: The cross-layer model with Naïve intermediate forecast model achieves a median standard error below 0.3 s.

Packet delivery ratio: The black-box Naïve forecast model with a 2nd-level adaptation model that uses a learning rate of 0.2 achieves a median standard error below 20% (absolute error).

Next stream interruption: The black-box failure rate model achieves a median standard error of 14.1 time-steps.

The probabilistic network graph models that use uncertain node location predictions perform second best in predicting the end-to-end communication delay and packet delivery ratio, achieving only slightly worse results than the best performing models listed above. The 2nd-level adaptation models improve the prediction of packet delivery ratios for both the cross-layer and the probabilistic network graph models.

The proposed black-box models use plain forecasting methods to predict the time series of end-to-end communication delay and packet delivery ratio, and they use an exponential failure rate model adapted from reliability engineering to predict the time to next stream interruption. The principal methods used in the black-box models are not original, but their application to predicting the connectivity metrics as described has not been proposed before.

The cross-layer models predict the connectivity metrics by collecting and aggregating data and time-series forecasts from the nodes along the last received packet’s route. While their prediction of single link life-times is derived from existing research on routing protocols, the composition into end-to-end life-times and the models that predict communication delay and packet delivery ratio are original contributions of this thesis.
The probabilistic network graph models allow potentially uncertain node location predictions to be used to predict the connectivity metrics. While representing the network as a graph is well established, its specific use for predicting connectivity metrics, the use of link probabilities, and the domain model to predict link probabilities are original propositions of this thesis. As such, the work carried out to develop and evaluate these models further contributes towards increasing the dependability of communication via MANETs, enabling CPSs to better utilize local connectivity for critical decision and control tasks.
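How link probabilities in a network graph can be combined into a route probability is sketched below. The sketch assumes statistically independent links, so that a route’s probability is the product of its link probabilities, and it finds the most probable route with a shortest-path search over the negative logarithms of the link probabilities. The function name and the dictionary-based graph representation are hypothetical, and the sketch is not the route prediction model of this thesis.

```python
import heapq
import math


def most_probable_route(links, source, target):
    """Return (probability, route) of the most probable route.

    links maps each node to {neighbour: link_probability} with values in (0, 1].
    Returns (0.0, []) if the target cannot be reached.
    """
    best = {source: 0.0}          # lowest known cost, i.e. -log(route probability)
    previous = {}
    visited = set()
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbour, probability in links.get(node, {}).items():
            if probability <= 0.0 or neighbour in visited:
                continue
            new_cost = cost - math.log(probability)
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return 0.0, []
    route = [target]
    while route[-1] != source:
        route.append(previous[route[-1]])
    route.reverse()
    return math.exp(-best[target]), route
```

Working with negative logarithms turns the product of probabilities into a sum of non-negative edge costs, which is the form that a shortest-path search such as Dijkstra’s algorithm requires.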


8.2. Critical Discussion

All experimental results obtained for this thesis were generated using discrete event simulations. The lack of validation by field tests is an obvious point of criticism of this approach. To judge its impact on the applicability of the thesis results, the representation of uncertain physical processes in the simulation and their role in the simulation studies have to be considered. Based on the discussion underlying the design of the experiments and the existing work on the accuracy of the simulated IEEE 802.11 network stack, the results that depend on the technical processes may be expected to be accurate enough to be applicable to real-world applications. More critical than the simulated technical systems are the human behaviour that is expressed through the mobility models and the potential systematic influence of non-random factors on the radio signal propagation in an actual real-world scene. The former may seem severe, because the utilized mobility models cannot be assumed to accurately mimic actual human mobility. This argument would be a valid criticism when assessing the mobility prediction. For the work presented here, however, the mobility prediction is assumed to be given and is of no further concern. As discussed in section 4.5, others have already shown that general mobility and next place prediction with high accuracy is possible.

8.3. Outlook

Given the research presented and discussed in this thesis, three directions for future work are laid out below. Their common theme is the enhancement of the dependability of communication in CPSs by predicting the systems’ connectivity to their peers and dependent services. Each of the proposed research directions addresses this issue on a different level. Only the last proposition directly continues the research presented herein.

Middleware. Oriented towards research in the computer science community, a network middleware that efficiently distributes the information used by the cross-layer and probabilistic network graph models in the network is left as an open issue. The options of using Network layer or even Data-link layer headers for information dissemination, instead of a strictly layered approach of using dedicated Application layer messages, are to be addressed. Even a combination thereof, in which nodes locally disseminate only information on their mobility via Data-link layer headers, provides options that support the connectivity prediction.


Adaptation of Applications. Research in dynamic systems modeling and control theory has to be carried out that realizes the adaptation of applications to predicted connectivity metrics in order to actually improve a system’s capability to degrade gracefully. Model predictive control is, from the current point of view, the method of choice for incorporating the work presented in this thesis into actual control systems.

Refined Prediction Models. Finally, the prediction models themselves, as presented in this work, offer starting points for improving the prediction capabilities. The 2nd-level adaptation model has, despite its very simple form, shown the improvements that may be gained by including on-line learning for adaptation in the models. Primarily, the prediction of communication link probabilities has been shown to have potential for improvement. In its current form, the link probability is predicted using domain-specific models for technical parameters, which are then aggregated using additional domain-specific computational models. The current model does not incorporate a feedback loop that observes communication link probabilities and learns a model to adapt its communication link probability predictions. Improving the communication link probability prediction is thus considered central to further enhancing the prediction performance of the probabilistic network graph models in future work. Additionally, more advanced 2nd-level models, e.g., based on artificial neural networks, should be considered to improve on-line learning capabilities. These models may even aggregate predictions from several of the 1st-level prediction models. Neural networks would in any case need pre-training on a large set of simulation data.
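What a feedback loop at the link level could look like is sketched below in its simplest form: a running estimate of the prediction bias is updated from each observed link probability and subtracted from subsequent predictions. The class, its update rule, and its default learning rate are assumptions made for illustration; the sketch is neither the 2nd-level adaptation model of chapter 6 nor a proposal that has been evaluated in this work.

```python
class OnlineBiasCorrector:
    """Hypothetical on-line correction of a biased link probability prediction."""

    def __init__(self, learning_rate: float = 0.2):
        if not 0.0 < learning_rate <= 1.0:
            raise ValueError("learning_rate must lie in (0, 1]")
        self.learning_rate = learning_rate
        self.bias = 0.0  # running estimate of (predicted - observed)

    def correct(self, predicted: float) -> float:
        """Apply the current bias estimate and clip to a valid probability."""
        return min(1.0, max(0.0, predicted - self.bias))

    def update(self, predicted: float, observed: float) -> None:
        """Move the bias estimate towards the latest prediction error."""
        error = predicted - observed
        self.bias += self.learning_rate * (error - self.bias)
```

With a constant optimistic bias, the estimate converges geometrically towards that bias, so that corrected predictions approach the observed link probabilities. A learned model that conditions the correction on link features would be the natural next step.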


Bibliography

[AE11]

Taylor B. Arnold and John W. Emerson. “Nonparametric Goodness-of-Fit Tests for Discrete Null Distributions”. In: The R Journal 3.2 (2011), pp. 34–39.

[AEM14]

Irfan Al-Anbagi, Melike Erol-Kantarci, and Hussein T. Mouftah. “A Survey on Cross-layer Quality of Service Approaches in WSNs for Delay and Reliability Aware Applications”. In: IEEE Communications Surveys & Tutorials (2014). issn: 1553-877X. doi: 10.1109/COMST.2014.2363950.

[AG10]

Á. Alesanco and J. García. “Clinical Assessment of Wireless ECG Transmission in Real-Time Cardiac Telemonitoring”. In: IEEE Transactions on Information Technology in Biomedicine 14.5 (2010), pp. 1144–1152. issn: 10897771. doi: 10.1109/TITB.2010.2047650.

[Ahn+02]

Gahng-Seop Ahn et al. “Supporting service differentiation for real-time and best-effort traffic in stateless wireless ad hoc networks (SWAN)”. In: IEEE Transactions on Mobile Computing 1.3 (2002), pp. 192–207. issn: 1536-1233. doi: 10.1109/TMC.2002.1081755.

[AIM10]

Luigi Atzori, Antonio Iera, and Giacomo Morabito. “The Internet of Things: A survey”. In: Computer Networks 54.15 (2010), pp. 2787–2805. issn: 13891286. doi: 10.1016/j.comnet.2010.05.010.

[Akt+10]

Ismet Aktas et al. “Towards a Flexible and Versatile Cross-Layer-Coordination Architecture”. In: Computer Communications Workshops (INFOCOM), Conference on. Piscataway, NJ: IEEE, 2010. isbn: 978-1-4244-6739-6. doi: 10.1109/INFCOMW.2010.5466613.

[Ara+14]

Gustavo Medeiros de Araújo et al. “Genetic Machine Learning Approach for Link Quality Prediction in Mobile Wireless Sensor Networks”. In: Cooperative Robots and Sensor Networks. Ed. by Anis Koubâa and Abdelmajid Khelil. Vol. 507. Studies in Computational Intelligence. Berlin, Heidelberg: Springer, 2014, pp. 1–18. isbn: 978-3-642-39300-6. doi: 10.1007/978-3-642-39301-3_1.

[Asc+10]

Nils Aschenbruck et al. “BonnMotion: a mobility scenario generation and analysis tool”. In: Simulation Tools and Techniques (SIMUTools), 3rd International ICST Conference on. Brussels, Belgium: ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2010. isbn: 978-963-9799-87-5. doi: 10.4108/ICST.SIMUTOOLS2010.8684.

[ASW12]

Adelchi Azzalini, Bruno Scarpa, and Gabriel Walton. Data analysis and data mining: An Introduction. Oxford and New York: Oxford University Press, 2012. isbn: 978-0-19-976710-6.

[Auf06]

Erik Auf der Heide. “The importance of evidence-based disaster planning”. In: Annals of emergency medicine 47.1 (2006), pp. 34–49. issn: 1097-6760. doi: 10.1016/j.annemergmed.2005.05.009.

[AX05]

I. F. Akyildiz and Xudong Wang. “A survey on wireless mesh networks”. In: IEEE Communications Magazine 43.9 (2005), S23–S30. issn: 0163-6804. doi: 10.1109/MCOM.2005.1509968.

[BA13]

Rainer Blind and Frank Allgöwer. “On the Optimization of the Transport Layer for Networked Control Systems”. In: at - Automatisierungstechnik 61.7 (2013), pp. 495–505. issn: 0178-2312. doi: 10.1524/auto.2013.1028.

[Bab+05]

J. Baber et al. “Cooperative autonomous driving - Intelligent vehicles sharing city roads cooperative autonomous driving”. In: IEEE Robotics & Automation Magazine 12.1 (2005), pp. 44–49. issn: 10709932. doi: 10.1109/MRA.2005.1411418.

[Bad+06]

Roland Bader et al. “BigNurse: A Wireless Ad Hoc Network for Patient Monitoring”. In: Pervasive Health Conference and Workshops. Piscataway, NJ: IEEE, 2006, pp. 1–4. isbn: 1-4244-1085-1. doi: 10.1109/PCTHEALTH.2006.361691.

[Bas+13]

Stefano Basagni et al., eds. Mobile ad hoc networking: Cutting edge directions. Second edition. IEEE series on digital & mobile communication. Wiley, 2013. isbn: 978-1-11-808728-2.

[BD14]

Murat Ali Bayir and Murat Demirbas. “On the fly learning of mobility profiles for routing in pocket switched networks”. In: Ad Hoc Networks 16 (2014), pp. 13–27. doi: 10.1016/j.adhoc.2013.11.011.

[Ben+08]

K. Benkic et al. “Using RSSI value for distance estimation in wireless sensor networks based on ZigBee”. In: Systems, Signals and Image Processing (IWSSIP), 15th International Conference on. Ed. by Gregor Rozinaj. Bratislava: Slovak University of Technology in Publishing House STU, 2008, pp. 303–306. isbn: 978-80-227-2856-0. doi: 10.1109/IWSSIP.2008.4604427.

[Ber+12]

Sebastian Bergrath et al. “Feasibility of Prehospital Teleconsultation in Acute Stroke – A Pilot Study in Clinical Routine”. In: PloS one 7.5 (2012). issn: 1932-6203. doi: 10.1371/journal.pone.0036796.

[BHJ10]

Alberto Bemporad, Maurice Heemels, and Mikael Johansson, eds. Networked Control Systems. Vol. 406. Lecture notes in control and information sciences. London: Springer, 2010. isbn: 978-0-85729-032-8.

[Bia00]

G. Bianchi. “Performance analysis of the IEEE 802.11 distributed coordination function”. In: IEEE Journal on Selected Areas in Communications 18.3 (2000), pp. 535–547. issn: 07338716. doi: 10.1109/49.840210.

[Bis06]

Christopher M. Bishop. Pattern recognition and machine learning. New York: Springer, 2006. isbn: 978-0-387-31073-2.

[BJR94]

George E. P. Box, Gwilym M. Jenkins, and Gregory C. Reinsel. Time series analysis: Forecasting and control. 3rd ed. Englewood Cliffs, N.J.: Prentice Hall, 1994. isbn: 978-0-13-060774-4.

[Boh97]

Richard W. Bohannon. “Comfortable and maximum walking speed of adults aged 20–79 years: reference values and determinants”. In: Age and Ageing 26.1 (1997), pp. 15–19. issn: 0002-0729. doi: 10.1093/ageing/26.1.15.

[BSS10]

Sean Barnum, Shankar Sastry, and John A. Stankovic. “Roundtable: Reliability of Embedded and Cyber-Physical Systems”. In: IEEE Security & Privacy Magazine 8.5 (2010), pp. 27–32. issn: 1540-7993. doi: 10.1109/MSP.2010.162.

[Bul+06]

J. Blake Bullock et al. “Integration of GPS with Other Sensors and Network Assistance”. In: Understanding GPS. Ed. by Elliott D. Kaplan and Christopher J. Hegarty. Norwood: Artech House, 2006, pp. 459–558. isbn: 1-58053-894-0.

[Büs+14]

Christian Büscher et al. “The Telemedical Rescue Assistance System “TemRas” – development, first results, and impact”. In: Biomedical Engineering 59.2 (2014), pp. 113–123. doi: 10.1515/bmt-20130025.

[But11]

Giorgio C. Buttazzo. Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications. 3rd ed. Real-Time Systems Series. Boston, MA: Springer, 2011. isbn: 1461406765.

[BW06]

Lars Berlemann and Bernard H. Walke. “Radio Spectrum Regulation”. In: IEEE 802 wireless systems. Ed. by Bernard H. Walke, Stefan Mangold, and Lars Berlemann. Chichester and Hoboken, NJ: John Wiley & Sons, 2006, pp. 43–52. isbn: 978-0-470-01439-4.

[Cam+10]

Mark Campbell et al. “Autonomous driving in urban environments: approaches, lessons and challenges”. In: Philosophical transactions. Series A, Mathematical, physical, and engineering sciences 368.1928 (2010), pp. 4649–4672. issn: 1364-503X. doi: 10.1098/rsta.2010.0110.

[CG13]

Marco Conti and Silvia Giordano. “Multihop Ad hoc Networking: The Evolutionary Path”. In: Mobile ad hoc networking. Ed. by Stefano Basagni et al. IEEE series on digital & mobile communication. Wiley, 2013, pp. 3–33. isbn: 978-1-11-808728-2.

[Che+11]

Wai Chen et al. “A survey and challenges in routing and data dissemination in vehicular ad hoc networks”. In: Wireless Communications and Mobile Computing 11.7 (2011), pp. 787–795. issn: 15308669. doi: 10.1002/wcm.862.

[Che+14]

Deji Chen et al. “WirelessHART and IEEE 802.15.4e”. In: Industrial Technology (ICIT), International Conference on. IEEE, 2014, pp. 760–765. doi: 10.1109/ICIT.2014.6895027.

[CPS99]

Andrea E. F. Clementi, Paolo Penna, and Riccardo Silvestri. “Hardness Results for the Power Range Assignment Problem in Packet Radio Networks”. In: Randomization, Approximation, and Combinatorial Optimization. Algorithms and Techniques. Ed. by Dorit S. Hochbaum et al. Vol. 1671. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer, 1999, pp. 197–208. isbn: 978-3-540-66329-4. doi: 10.1007/978-3-540-48413-4_21.

[DH97]

A. C. Davison and D. V. Hinkley. Bootstrap methods and their application. Cambridge and New York, NY, USA: Cambridge University Press, 1997. isbn: 978-0-521-57391-7.

[Dij59]

E. W. Dijkstra. “A note on two problems in connexion with graphs”. In: Numerische Mathematik 1.1 (1959), pp. 269–271. issn: 0029-599X. doi: 10.1007/BF01386390.

[DR84]

Philip J. Davis and Philip Rabinowitz. Methods of numerical integration. 2nd ed. Orlando: Academic Press, 1984. isbn: 978-0-12-206360-2.

[DSL12]

Francois Despaux, Ye-Qiong Song, and Abdelkader Lahmadi. “Combining Analytical and Simulation Approaches for Estimating End-to-End Delay in Multi-hop Wireless Networks”. In: Distributed Computing in Sensor Systems (DCOSS), 8th International Conference on. IEEE, 2012, pp. 317–322. isbn: 978-1-4673-1693-4. doi: 10.1109/DCOSS.2012.31.

[DV14]

Prabesh Dongol and Dhadesugoor R. Vaman. “End to End Quality of Service Assurance for Multi-Service Provisioning in Mobile Ad Hoc Networks”. In: International Journal of Network Security & Its Applications (IJNSA) 6.4 (2014), pp. 1–12. doi: 10.5121/ijnsa.2014.6401.

[Ein05]

A. Einstein. “Zur Elektrodynamik bewegter Körper”. In: Annalen der Physik 322.10 (1905), pp. 891–921. issn: 0003-3804. doi: 10.1002/andp.19053221004.

[Els15]

Jesko Elsner. AI-Driven Volunteer Selection. 1st ed. Norderstedt: Books on Demand, 2015. isbn: 978-3-7347-6798-2.

[EO06]

Paal Engelstad and Olav Osterbo. “Analysis of the Total Delay of IEEE 802.11e EDCA and 802.11 DCF”. In: Communications (ICC), International Conference on. IEEE, 2006, pp. 552–559. isbn: 978-1-4244-0355-4. doi: 10.1109/ICC.2006.254853.

[EP06]

Nathan Eagle and Alex (Sandy) Pentland. “Reality mining: sensing complex social systems”. In: Pers Ubiquit Comput 10.4 (2006), pp. 255–268.

[Fae+12]

Miad Faezipour et al. “Progress and challenges in intelligent vehicle area networks”. In: Communications of the ACM 55.2 (2012), p. 90. issn: 00010782. doi: 10.1145/2076450.2076470.

[Fin08]

Maxim Finkelstein. Failure rate modelling for reliability and risk. London: Springer, 2008. isbn: 978-1-84800-985-1. doi: 10.1007/978-1-84800-986-8.

[Fli07]

Rob Flickenger, ed. Wireless Networking in the Developing World: A practical guide to planning and building low-cost telecommunications infrastructure. 2nd ed. Hacker Friendly LLC, 2007. url: http://wndw.net/pdf/wndw2-en/wndw2-ebook.pdf (visited on 10/28/2014).

[FR05]

André Luiz de Freitas Francisco and Franz J. Rammig. “Fault-Tolerant Hard-Real-Time Communication of Dynamically Reconfigurable, Distributed Embedded Systems”. In: Object-Oriented Real-Time Distributed Computing (ISORC), International Symposium on. IEEE, 2005, pp. 275–283. doi: 10.1109/ISORC.2005.27.

[Gao+08]

Tia Gao et al. “Wireless Medical Sensor Networks in Emergency Response: Implementation and Pilot Results”. In: Technologies for Homeland Security, Conference on. IEEE, 2008, pp. 187–192. isbn: 978-1-4244-1977-7. doi: 10.1109/THS.2008.4534447.

[Gas05]

Matthew Gast. 802.11 wireless networks: The definitive guide. 2nd ed. O’Reilly, 2005. isbn: 978-0-596-52264-3.

[Ger+14]

Mario Gerla et al. “Internet of vehicles: From intelligent grid to autonomous cars and vehicular clouds”. In: Internet of Things (WF-IoT), World Forum on. IEEE, 2014, pp. 241–246. doi: 10.1109/WF-IoT.2014.6803166.

[GKd11]

Sébastien Gambs, Marc-Olivier Killijian, and Miguel Núñez del Prado Cortez. “Show Me How You Move and I Will Tell You Who You Are”. In: Trans. Data Privacy 4.2 (2011), pp. 103–126.

[GKd12]

Sébastien Gambs, Marc-Olivier Killijian, and Miguel Núñez del Prado Cortez. “Next place prediction using mobility Markov chains”. In: Measurement, Privacy, and Mobility (MPM), First Workshop on. Ed. by Hamed Haddadi and Eiko Yoneki. New York, NY: ACM, 2012, 3:1–3:6. isbn: 978-1-4503-1163-2. doi: 10.1145/2181196.2181199.

[GMR10]

Sebastian Gräfling, Petri Mähönen, and Janne Riihijärvi. “Performance evaluation of IEEE 1609 WAVE and IEEE 802.11p for vehicular communications”. In: Ubiquitous and Future Networks (ICUFN), 2nd International Conference on. IEEE, 2010, pp. 344–348. isbn: 978-1-4244-8088-3. doi: 10.1109/ICUFN.2010.5547184.

[Gol+00]

A. L. Goldberger et al. “PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals”. In: Circulation 101.23 (2000), e215–e220. doi: 10.1161/01.CIR.101.23.e215.

[Grü+14]

L. Grüne et al. “Distributed and Networked Model Predictive Control”. In: Control theory of digitally networked dynamic systems. Ed. by Jan Lunze. Cham and New York: Springer, 2014, pp. 111–167. isbn: 978-3-319-01131-8. doi: 10.1007/978-3-319-01131-8_4.

[Grü13]

Tom Grünweg. Selbststeuernder Wagen: Ausfahrt mit Autopilot. 2013. url: http://www.spiegel.de/auto/aktuell/autonomes-fahren-unterwegs-mit-einer-s-klasse-auf-autopilot-a-920803.html (visited on 08/13/2014).

[HA13]

Rob J. Hyndman and George Athanasopoulos. Forecasting: Principles and practice. OTexts, 2013. isbn: 978-0-9875071-0-5. url: https://www.otexts.org/fpp.

[Har+05]

S. Hara et al. “Propagation Characteristics of IEEE 802.15.4 Radio Signal and Their Application for Location Estimation”. In: Vehicular Technology Conference. IEEE, 2005, pp. 97–101. isbn: 0-7803-8887-9. doi: 10.1109/VETECS.2005.1543257.

[Hau+14]

A. Haupt et al. “Wireless Networking for Control”. In: Control theory of digitally networked dynamic systems. Ed. by Jan Lunze. Cham and New York: Springer, 2014, pp. 325–362. isbn: 978-3-319-01131-8. doi: 10.1007/978-3-319-01131-8_7.

[HC08]

Jonathan W. Hui and David E. Culler. “IP is dead, long live IP for wireless sensor networks”. In: Embedded network sensor systems, Proceedings of the 6th ACM conference on. Ed. by Tarek Abdelzaher. New York, NY: ACM, 2008, pp. 15–28. isbn: 978-1-59593-990-6. doi: 10.1145/1460412.1460415.

[HL08]

H. Hartenstein and K. P. Laberteaux. “A tutorial survey on vehicular ad hoc networks”. In: IEEE Communications Magazine 46.6 (2008), pp. 164–171. issn: 0163-6804. doi: 10.1109/MCOM.2008.4539481.

[HOG13]

D. Hiranandani, K. Obraczka, and J. J. Garcia-Luna-Aceves. “MANET protocol simulations considered harmful: the case for benchmarking”. In: IEEE Wireless Communications 20.4 (2013), pp. 82–90. issn: 1536-1284. doi: 10.1109/MWC.2013.6590054.

[Hus+06]

E. M. Husni et al. “Mobile Ad Hoc Network and Mobile IP For Future Mobile Telemedicine System”. In: Wireless and Optical Communications Networks, International Conference on. IEEE, 2006. isbn: 1-4244-0340-5. doi: 10.1109/WOCN.2006.1666562.

[HXH12]

Edward Hung, Lurong Xiao, and Regant Y.S. Hung. “An efficient representation model of distance distribution between uncertain objects”. In: Computational Intelligence 28.3 (2012), pp. 373–397. issn: 08247935. doi: 10.1111/j.1467-8640.2012.00440.x.

[IC93]

P. A. Ioannou and C. C. Chien. “Autonomous intelligent cruise control”. In: IEEE Transactions on Vehicular Technology 42.4 (1993), pp. 657–672. issn: 0018-9545. doi: 10.1109/25.260745.

[ISO/IEC7498-1]

ISO/IEC. Information technology – Open Systems Interconnection – Basic reference model: The basic model. ISO/IEC 7498-1. 1994.

[Jac+01]

P. Jacquet et al. “Optimized link state routing protocol for ad hoc networks”. In: Technology for the 21st Century (INMIC), Multi Topic Conference. IEEE, 2001, pp. 62–68. isbn: 0-7803-7406-1. doi: 10.1109/INMIC.2001.995315.

[Jam+13]

Gareth James et al. An Introduction to Statistical Learning: With applications in R. Vol. 103. Springer texts in statistics. New York, NY: Springer, 2013. isbn: 978-1-4614-7137-0. doi: 10.1007/978-1-4614-7138-7.

[JD08]

Daniel Jiang and Luca Delgrossi. “IEEE 802.11p: Towards an International Standard for Wireless Access in Vehicular Environments”. In: Vehicular Technology Conference (VTC). IEEE, 2008, pp. 2036–2040. isbn: 978-1-4244-1644-8. doi: 10.1109/VETECS.2008.458.

[JHR05]

Shengming Jiang, Dajiang He, and Jianqiang Rao. “A prediction-based link availability estimation for routing metrics in MANETs”. In: IEEE/ACM Transactions on Networking 13.6 (2005), pp. 1302–1312. issn: 1063-6692. doi: 10.1109/TNET.2005.860094.

[JJ10]

Mikael Johansson and Riku Jäntti. “Wireless Networking for Control: Technologies and Models”. In: Networked Control Systems. Ed. by Alberto Bemporad, Maurice Heemels, and Mikael Johansson. Vol. 406. Lecture notes in control and information sciences. London: Springer, 2010, pp. 31–74. isbn: 978-0-85729-032-8. doi: 10.1007/978-0-85729-033-5_2.

[JK08]

Jakub Jakubiak and Yevgeni Koucheryavy. “State of the Art and Research Challenges for VANETs”. In: Consumer Communications and Networking Conference (CCNC). IEEE, 2008, pp. 912–916. isbn: 978-1-4244-1456-7. doi: 10.1109/ccnc08.2007.212.

[JMB01]

David B. Johnson, David A. Maltz, and Josh Broch. “The Dynamic Source Routing Protocol for Multihop Wireless Ad Hoc Networks”. In: Ad hoc networking. Ed. by Charles E. Perkins. Boston: Addison-Wesley, 2001, pp. 139–172. isbn: 978-0-321-57907-2.

[Kat+02]

S. Kato et al. “Vehicle control algorithms for cooperative driving with automated vehicles and intervehicle communications”. In: IEEE Transactions on Intelligent Transportation Systems 3.3 (2002), pp. 155–161. issn: 1524-9050. doi: 10.1109/TITS.2002.802929.

[KCC05]

Stuart Kurkowski, Tracy Camp, and Michael Colagrosso. “MANET simulation studies: the incredibles”. In: ACM SIGMOBILE Mobile Computing and Communications Review 9.4 (2005), pp. 50–61. issn: 15591662. doi: 10.1145/1096166.1096174.

[Kho+13]

Lyes Khoukhi et al. “Admission control in wireless ad hoc networks: a survey”. In: EURASIP Journal on Wireless Communications and Networking 2013.109 (2013). issn: 1687-1499. doi: 10.1186/1687-1499-2013-109.

[Kim+09]

J. C. Kim et al. “Implementation and performance evaluation of mobile ad hoc network for Emergency Telemedicine System in disaster areas”. In: Engineering in Medicine and Biology Society, Annual International Conference of the. IEEE, 2009, pp. 1663–1666. isbn: 978-1-4244-3296-7. doi: 10.1109/IEMBS.2009.5333889.

[KK05]

V. Kawadia and P. R. Kumar. “A cautionary perspective on cross-layer design”. In: IEEE Wireless Communications 12.1 (2005), pp. 3–11. issn: 1536-1284. doi: 10.1109/MWC.2005.1404568.

[Koh06]

Eddie Kohler. Click for Measurement. 2006. url: http://www.read.seas.harvard.edu/~kohler/pubs/kohler06click.pdf (visited on 05/25/2014).

[Kum+12]

Swarun Kumar et al. “CarSpeak: a content-centric network for autonomous driving”. In: ACM SIGCOMM Computer Communication Review 42.4 (2012), pp. 259–270. issn: 01464833. doi: 10.1145/2377677.2377724.

[Kun+08]

A. Kuntz et al. “Introducing Probabilistic Radio Propagation Models in OMNeT++ Mobility Framework and Cross Validation Check with NS-2”. In: Simulation tools and techniques for communications, networks and systems (Simutools), Proceedings of the 1st International Conference on. ICST, 2008, 72:1–72:7. isbn: 978-963-9799-20-2.

[Lee+09]

K. Lee et al. “SLAW: A New Mobility Model for Human Walks”. In: Computer Communications (INFOCOM), Conference on. IEEE, 2009, pp. 855–863. isbn: 978-1-4244-3512-8. doi: 10.1109/INFCOM.2009.5061995.

[Lee+12]

Insup Lee et al. “Challenges and Research Directions in Medical Cyber-Physical Systems”. In: Proceedings of the IEEE 100.1 (2012), pp. 75–90. issn: 0018-9219. doi: 10.1109/JPROC.2011.2165270.

[Lee06]

Edward A. Lee. Cyber-Physical Systems – Are Computing Foundations Adequate? 2006. url: http://ptolemy.eecs.berkeley.edu/publications/papers/06/CPSPositionPaper/Lee_CPS_PositionPaper.pdf (visited on 09/28/2014).

[Lee08]

Edward A. Lee. “Cyber Physical Systems: Design Challenges”. In: Object and Component-Oriented Real-Time Distributed Computing (ISORC), 11th International Symposium on. IEEE Computer Society, 2008, pp. 363–369. isbn: 978-0-7695-3132-8. doi: 10.1109/ISORC.2008.25.

[Lem09]

Christiane Lemieux. Monte Carlo and quasi-Monte Carlo sampling. Springer Series in Statistics. New York: Springer, 2009. isbn: 978-0-387-78165-5.

[Lev+11]

Jesse Levinson et al. “Towards fully autonomous driving: Systems and algorithms”. In: Intelligent Vehicles Symposium (IV). IEEE, 2011, pp. 163–168. isbn: 978-1-4577-0890-9. doi: 10.1109/IVS.2011.5940562.

[LG14]

J. Lunze and L. Grüne. “Introduction to Networked Control Systems”. In: Control theory of digitally networked dynamic systems. Ed. by Jan Lunze. Cham and New York: Springer, 2014, pp. 1–30. isbn: 978-3-319-01131-8. doi: 10.1007/978-3-319-01131-8_1.

[Li+11]

Ruogu Li et al. “A Unified Approach to Optimizing Performance in Networks Serving Heterogeneous Flows”. In: IEEE/ACM Transactions on Networking 19.1 (2011), pp. 223–236. issn: 1063-6692. doi: 10.1109/TNET.2010.2059038.

[Li+12]

Fan Li et al. “A reliable and accurate indoor localization method using phone inertial sensors”. In: Ubiquitous Computing, Conference on. Ed. by Anind K. Dey, Hao-Hua Chu, and Gillian Hayes. ACM, 2012, pp. 421–430. isbn: 978-1-4503-1224-0. doi: 10.1145/2370216.2370280.

[LLZ14]

Ling Li, Shancang Li, and Shanshan Zhao. “QoS-Aware Scheduling of Services-Oriented Internet of Things”. In: IEEE Transactions on Industrial Informatics 10.2 (2014), pp. 1497–1505. issn: 1551-3203. doi: 10.1109/TII.2014.2306782.

[LS10]

Insup Lee and Oleg Sokolsky. “Medical cyber physical systems”. In: Design Automation Conference, Proceedings of the 47th. IEEE, 2010, pp. 743–748. isbn: 978-1-4244-6677-1. doi: 10.1145/1837274.1837463.

[Lun14]

Jan Lunze, ed. Control theory of digitally networked dynamic systems. Cham and New York: Springer, 2014. isbn: 978-3-319-01131-8. doi: 10.1007/978-3-319-01131-8.

[LW06]

Li Li and Fei-Yue Wang. “Cooperative Driving at Blind Crossings Using Intervehicle Communication”. In: IEEE Transactions on Vehicular Technology 55.6 (2006), pp. 1712–1724. issn: 0018-9545. doi: 10.1109/TVT.2006.878730.

[Mad+06]

Harsha V. Madhyastha et al. “A structural approach to latency prediction”. In: Internet Measurement Conference, Proceedings of the ACM SIGCOMM. Ed. by Jussara Almeida, Virgilio Almeida, and Paul Barford. New York, NY: ACM Press, 2006, p. 99. isbn: 1-59593-561-4. doi: 10.1145/1177080.1177092.

[Mal+04]

David Malan et al. “CodeBlue: An Ad Hoc Sensor Network Infrastructure for Emergency Medical Care”. In: Applications of Mobile Embedded Systems (WAMES), MobiSys Workshop on. ACM, 2004.

[Man+03]

S. Mangold et al. “Analysis of IEEE 802.11e for QoS support in wireless LANs”. In: IEEE Wireless Communications 10.6 (2003), pp. 40–50. issn: 1536-1284. doi: 10.1109/MWC.2003.1265851.

[Man+06]

Stefan Mangold et al. “IEEE 802.11 Wireless Local Area Networks”. In: IEEE 802 wireless systems. Ed. by Bernard H. Walke, Stefan Mangold, and Lars Berlemann. Chichester and Hoboken, NJ: John Wiley & Sons, 2006, pp. 77–117. isbn: 978-0-470-01439-4.

[Maz+09]

Santiago Mazuelas et al. “Robust Indoor Positioning Provided by Real-Time RSSI Values in Unmodified WLAN Networks”. In: IEEE Journal of Selected Topics in Signal Processing 3.5 (2009), pp. 821–831. issn: 1932-4553. doi: 10.1109/JSTSP.2009.2029191.

[MC13]

Luis Marques and Antonio Casimiro. “Fighting Uncertainty in Highly Dynamic Wireless Sensor Networks with Probabilistic Models”. In: Reliable Distributed Systems (SRDS), 32nd International Symposium on. IEEE, 2013, pp. 31–40. isbn: 978-0-7695-5115-9. doi: 10.1109/SRDS.2013.12.

[MCN11]

Aarti Munjal, Tracy Camp, and William C. Navidi. “SMOOTH: a simple way to model human mobility”. In: Modeling, analysis and simulation of wireless and mobile systems, Proceedings of the 14th ACM international conference on. New York, NY: ACM, 2011, pp. 351–360. isbn: 978-1-4503-0898-4. doi: 10.1145/2068897.2068957.

[MLF07]

Hamid Menouar, Massimiliano Lenardi, and Fethi Filali. “Movement Prediction-Based Routing (MOPR) Concept for Position-Based Routing in Vehicular Networks”. In: Vehicular Technology Conference (VTC), 66th. IEEE, 2007, pp. 2101–2105. isbn: 978-1-4244-0263-2. doi: 10.1109/VETECF.2007.441.

[MM01]

G. B. Moody and R. G. Mark. “The impact of the MIT-BIH Arrhythmia Database”. In: IEEE Engineering in Medicine and Biology Magazine 20.3 (2001), pp. 45–50. issn: 07395175. doi: 10.1109/51.932724.

[MN98]

Makoto Matsumoto and Takuji Nishimura. “Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator”. In: ACM Transactions on Modeling and Computer Simulation 8.1 (1998), pp. 3–30. issn: 10493301. doi: 10.1145/272991.272995.

[MTN08]

Peter J. Mohr, Barry N. Taylor, and David B. Newell. “CODATA recommended values of the fundamental physical constants: 2006”. In: Journal of Physical and Chemical Reference Data 37.3 (2008), p. 1187. issn: 00472689. doi: 10.1063/1.2844785.

[NEE07]

Robert Nagel, Stephan Eichler, and Jorg Eberspacher. “Intelligent Wireless Communication for Future Autonomous and Cognitive Automobiles”. In: Intelligent Vehicles Symposium. IEEE, 2007, pp. 716–721. isbn: 1-4244-1067-3. doi: 10.1109/IVS.2007.4290201.

[NPT08]

George Nikolakopoulos, Athanasia Panousopoulou, and Anthony Tzes. “Switched Feedback Control for Wireless Networked Systems”. In: Networked Control Systems. Ed. by Fei-Yue Wang and Derong Liu. London: Springer, 2008, pp. 153–195. isbn: 978-1-84800-214-2.

[Pea+11]

Sarogini Grace Pease et al. “Cross-layer signalling and middleware: A survey for inelastic soft real-time applications in MANETs”. In: Journal of Network and Computer Applications 34.6 (2011), pp. 1928–1941. issn: 10848045. doi: 10.1016/j.jnca.2011.07.005.

[Pea11]

Ronald K. Pearson. Exploring data in engineering, the sciences, and medicine. New York: Oxford University Press, 2011. isbn: 978-0-19-508965-3.

[Per01]

Charles E. Perkins, ed. Ad hoc networking. Boston: Addison-Wesley, 2001. isbn: 978-0-321-57907-2.

[Pet+11]

Agoston Petz et al. “Passive Network-Awareness for Dynamic Resource-Constrained Networks”. In: Distributed Applications and Interoperable Systems. Ed. by David Hutchison et al. Vol. 6723. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer, 2011, pp. 106–121. isbn: 978-3-642-21386-1. doi: 10.1007/978-3-642-21387-8_9.

[PKK13]

Lim Boon Ping, Chong Poh Kit, and Ettikan K. Karuppiah. “Network latency prediction using high accuracy prediction tree”. In: Ubiquitous Information Management and Communication, Proceedings of the 7th International Conference on. ACM, 2013, 42:1–42:8. isbn: 978-1-4503-1958-4. doi: 10.1145/2448556.2448598.

[PLM06]

François Panneton, Pierre L’Ecuyer, and Makoto Matsumoto. “Improved long-period generators based on linear recurrences modulo 2”. In: ACM Transactions on Mathematical Software 32.1 (2006), pp. 1–16. issn: 00983500. doi: 10.1145/1132973.1132974.

[PP07]

Larry Peterson and Vivek S. Pai. “Experience-driven experimental systems research”. In: Communications of the ACM 50.11 (2007), pp. 38–44. issn: 00010782. doi: 10.1145/1297797.1297820.

[PPC09]

Giovanni Petris, Sonia Petrone, and Patrizia Campagnoli. Dynamic linear models with R. Dordrecht and New York: Springer, 2009. isbn: 978-0-387-77237-0.

[PR99]

Charles E. Perkins and Elizabeth M. Royer. “Ad-hoc on-demand distance vector routing”. In: Mobile Computing Systems and Applications, Second IEEE Workshop on. IEEE, 1999, pp. 90–100. isbn: 0-7695-0025-0. doi: 10.1109/MCSA.1999.749281.

[Pro10]

Michael Protogerakis. Systemarchitektur eines telematischen Assistenzsystems in der präklinischen Notfallversorgung. 1st ed. Norderstedt: Books on Demand, 2010. isbn: 978-3-8391-3554-9.

[QWY10]

Fengzhong Qu, Fei-Yue Wang, and Liuqing Yang. “Intelligent transportation spaces: vehicles, traffic, communications, and beyond”. In: IEEE Communications Magazine 48.11 (2010), pp. 136–142. issn: 0163-6804. doi: 10.1109/MCOM.2010.5621980.

[Ram+08]

Venugopalan Saraswati Ramasubramanian et al. “Internet Latencies Through Prediction Trees”. US 2008/0304421 A1. December 11, 2008.

[Ram+09]

Venugopalan Ramasubramanian et al. “On the treeness of internet latency and bandwidth”. In: Measurement and modeling of computer systems, Proceedings of the eleventh international joint conference on. New York: ACM, 2009, pp. 61–72. isbn: 978-1-60558-511-6. doi: 10.1145/1555349.1555357.

[Rap02]

Theodore S. Rappaport. Wireless communications: Principles and practice. 2nd ed. Prentice Hall communications engineering and emerging technologies series. Upper Saddle River, NJ: Prentice Hall PTR, 2002. isbn: 978-0-13-042232-3.

[Ren+10]

Yonglin Ren et al. “Monitoring patients via a secure and mobile healthcare system”. In: IEEE Wireless Communications 17.1 (2010), pp. 59–65. issn: 1536-1284. doi: 10.1109/MWC.2010.5416351.

[RFC1700]

J. Reynolds and J. Postel. Assigned Numbers. Ed. by RFC Editor. October 1994. url: http://www.rfc-editor.org/rfc/rfc1700.txt.

[RFC3561]

C. Perkins, E. Belding-Royer, and S. Das. Ad hoc On-Demand Distance Vector (AODV) Routing. Ed. by RFC Editor. July 2003. url: http://www.rfc-editor.org/rfc/rfc3561.txt.

[RFC3626]

T. Clausen and P. Jacquet. Optimized Link State Routing Protocol (OLSR). Ed. by RFC Editor. October 2003. url: http://www.rfc-editor.org/rfc/rfc3626.txt.

[RFC4728]

D. Johnson, Y. Hu, and D. Maltz. The Dynamic Source Routing Protocol (DSR) for Mobile Ad Hoc Networks for IPv4. Ed. by RFC Editor. February 2007. url: http://www.rfc-editor.org/rfc/rfc4728.txt.

[RFC4919]

N. Kushalnagar, G. Montenegro, and C. Schumacher. IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and Goals. Ed. by RFC Editor. August 2007. url: http://www.rfc-editor.org/rfc/rfc4919.txt.

[RFC5681]

M. Allman, V. Paxson, and E. Blanton. TCP Congestion Control. Ed. by RFC Editor. September 2009. url: http://www.rfc-editor.org/rfc/rfc5681.txt.

[RFC6282]

J. Hui and P. Thubert. Compression Format for IPv6 Datagrams over IEEE 802.15.4-Based Networks. Ed. by RFC Editor. September 2011. url: http://www.rfc-editor.org/rfc/rfc6282.txt.

[RFC6437]

S. Amante et al. IPv6 Flow Label Specification. Ed. by RFC Editor. November 2011. url: http://www.rfc-editor.org/rfc/rfc6437.txt.

[RFC768]

J. Postel. User Datagram Protocol. Ed. by RFC Editor. August 1980. url: http://www.rfc-editor.org/rfc/rfc768.txt.

[RFC791]

J. Postel. Internet Protocol. Ed. by RFC Editor. September 1981. url: http://www.rfc-editor.org/rfc/rfc791.txt.

[RMM01]

E. M. Royer, P. M. Melliar-Smith, and L. E. Moser. “An analysis of the optimum node density for ad hoc mobile networks”. In: Communications (ICC), International Conference on. IEEE, 2001, pp. 857–861. doi: 10.1109/ICC.2001.937360.

[RND10]

Stuart J. Russell, Peter Norvig, and Ernest Davis. Artificial intelligence: A modern approach. 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2010. isbn: 978-0-13-207148-2.

[Ros15]

Philip Ross. “Thus spoke the autobahn”. In: IEEE Spectrum 52.1 (2015), pp. 52–55. issn: 0018-9235. doi: 10.1109/MSPEC.2015.6995635.

[Row07]

J. Rowley. “The wisdom hierarchy: representations of the DIKW hierarchy”. In: Journal of Information Science 33.2 (2007), pp. 163–180. issn: 0165-5515. doi: 10.1177/0165551506070706.

[San05]

Paolo Santi. “Topology control in wireless ad hoc and sensor networks”. In: ACM Computing Surveys 37.2 (2005), pp. 164–194. issn: 03600300. doi: 10.1145/1089733.1089736.

[SDD10]

Christoph Sommer, Isabel Dietrich, and Falko Dressler. “Simulation of Ad Hoc Routing Protocols using OMNeT++”. In: Mobile Networks and Applications 15.6 (2010), pp. 786–801. issn: 1383-469X. doi: 10.1007/s11036-009-0174-5.

[SG14]

Nurul I. Sarkar and Jairo A. Gutiérrez. “Revisiting the issue of the credibility of simulation studies in telecommunication networks: highlighting the results of a comprehensive survey of IEEE publications”. In: IEEE Communications Magazine 52.5 (2014), pp. 218–224. issn: 0163-6804. doi: 10.1109/MCOM.2014.6815915.

[SKH11]

Daniel Seither, André König, and Matthias Hollick. “Routing performance of Wireless Mesh Networks: A practical evaluation of BATMAN advanced”. In: Local Computer Networks (LCN), 36th Conference on. Ed. by Tom Pfeifer, Anura Jayasumana, and Nils Aschenbruck. IEEE, 2011, pp. 897–904. isbn: 978-1-61284-926-3. doi: 10.1109/LCN.2011.6115569.

[SLG00]

W. Su, S.-J. Lee, and M. Gerla. “Mobility prediction in wireless networks”. In: 21st Century Military Communications Conference (MILCOM). IEEE, 2000, pp. 491–495. isbn: 0-7803-6521-6. doi: 10.1109/MILCOM.2000.905001.

[SM05]

V. Srivastava and M. Motani. “Cross-layer design: a survey and the road ahead”. In: IEEE Communications Magazine 43.12 (2005), pp. 112–119. issn: 0163-6804. doi: 10.1109/MCOM.2005.1561928.

[SM08]

Mutsuo Saito and Makoto Matsumoto. “SIMD-Oriented Fast Mersenne Twister: a 128-bit Pseudorandom Number Generator”. In: Monte Carlo and Quasi-Monte Carlo Methods 2006. Ed. by Alexander Keller, Stefan Heinrich, and Harald Niederreiter. Berlin, Heidelberg: Springer, 2008, pp. 607–622. isbn: 978-3-540-74495-5. doi: 10.1007/978-3-540-74496-2_36.

[Smi13]

Aaron Smith. Smartphone Ownership 2013. 2013. url: http://www.pewinternet.org/2013/06/05/smartphone-ownership-2013/ (visited on 09/29/2014).

[Son+08]

Jianping Song et al. “WirelessHART: Applying Wireless Technology in Real-Time Industrial Process Control”. In: 14th Real-Time and Embedded Technology and Applications Symposium (RTAS), Proceedings of the. Los Alamitos, Calif.: IEEE Computer Society Press, 2008, pp. 377–386. isbn: 978-0-7695-3146-5. doi: 10.1109/RTAS.2008.15.

[Son+10]

C. Song et al. “Limits of Predictability in Human Mobility”. In: Science 327.5968 (2010), pp. 1018–1021. issn: 0036-8075. doi: 10.1126/science.1177170.

[Sta+05]

J. A. Stankovic et al. “Opportunities and obligations for physical computing systems”. In: Computer 38.11 (2005), pp. 23–31. issn: 0018-9162. doi: 10.1109/MC.2005.386.

[Sta88]

J. A. Stankovic. “Misconceptions about real-time computing: A serious problem for next-generation systems”. In: Computer 21.10 (1988), pp. 10–19. issn: 0018-9162. doi: 10.1109/2.7053.

[Sun+04]

Yuan Sun et al. “Model-based resource prediction for multi-hop wireless networks”. In: Mobile Ad-hoc and Sensor Systems, International Conference on. IEEE, 2004, pp. 114–123. isbn: 0-7803-8815-1. doi: 10.1109/MAHSS.2004.1392086.

[Sun+13]

Weihua Sun et al. “A Method for Overlay Network Latency Estimation from Previous Observation”. In: Networks (ICN), The 12th International Conference on. IARIA XPS Press, 2013, pp. 95–100. isbn: 978-1-61208-245-5.

[SV13]

Sweta Sneha and Upkar Varshney. “A framework for enabling patient monitoring via mobile ad hoc network”. In: Decision Support Systems 55.1 (2013), pp. 218–234. issn: 01679236. doi: 10.1016/j.dss.2013.01.024.

[The+13]

Sebastian Thelen et al. “A Multifunctional Telemedicine System for Pre-hospital Emergency Medical Services”. In: eTELEMED 2013, The Fifth International Conference on eHealth, Telemedicine, and Social Medicine. Ed. by Lisette Van Gemert-Pijnen and Hans C. Ossebaard. IARIA XPS Press, 2013, pp. 53–58. isbn: 978-1-61208-252-3.

[The+15]

Sebastian Thelen et al. “Using off-the-shelf medical devices for biomedical signal monitoring in a telemedicine system for emergency medical services”. In: IEEE journal of biomedical and health informatics 19.1 (2015), pp. 117–123. issn: 2168-2208. doi: 10.1109/JBHI.2014.2361775.

[TSK05]

Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. Introduction to data mining. 1st ed. Boston: Pearson Addison Wesley, 2005. isbn: 978-0-321-32136-7.

[TW11]

Andrew S. Tanenbaum and D. Wetherall. Computer networks. 5th ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2011. isbn: 0-13-212695-8.

[UA09]

R. Uzcategui and G. Acosta-Marum. “Wave: A tutorial”. In: IEEE Communications Magazine 47.5 (2009), pp. 126–133. issn: 0163-6804. doi: 10.1109/MCOM.2009.4939288.

[Urm+08]

Chris Urmson et al. “Autonomous driving in urban environments: Boss and the Urban Challenge”. In: Journal of Field Robotics 25.8 (2008), pp. 425–466. issn: 15564959. doi: 10.1002/rob.20255.

[VEK00]

Ljubo Vlacic, Anthony Engwirda, and Makoto Kajitani. “Cooperative Behavior of Intelligent Agents: Theory and Practice”. In: Soft computing and intelligent systems. Ed. by Naresh K. Sinha and Madan M. Gupta. San Diego, Calif.: Academic Press, 2000, pp. 279–307. isbn: 978-0-12-646490-0. doi: 10.1016/B978-012646490-0/50015-9.

[VH08]

András Varga and Rudolf Hornig. “An overview of the OMNeT++ simulation environment”. In: Simulation tools and techniques for communications, networks and systems (Simutools), Proceedings of the 1st International Conference on. ICST, 2008. isbn: 978-963-9799-20-2.

[VO14]

András Varga and OpenSim Ltd. OMNeT++ User Manual Version 4.4.1. 2014. url: http://www.omnetpp.org/doc/omnetpp/Manual.pdf (visited on 06/05/2014).

[VOM13]

Carlo Vallati, Victor Omwando, and Prasant Mohapatra. “Experimental Work Versus Simulation in the Study of Mobile Ad hoc Networks”. In: Mobile ad hoc networking. Ed. by Stefano Basagni et al. IEEE series on digital & mobile communication. Wiley, 2013, pp. 191–238. isbn: 978-1-11-808728-2.

[VS06]

U. Varshney and S. Sneha. “Patient monitoring using ad hoc wireless networks: reliability and power management”. In: IEEE Communications Magazine 44.4 (2006), pp. 49–55. issn: 0163-6804. doi: 10.1109/MCOM.2006.1632649.

[Wal+06]

Bernard H. Walke et al. “Wireless Communication – Basics”. In: IEEE 802 wireless systems. Ed. by Bernard H. Walke, Stefan Mangold, and Lars Berlemann. Chichester and Hoboken, NJ: John Wiley & Sons, 2006, pp. 7–41. isbn: 978-0-470-01439-4.

[WB06]

Greg Welch and Gary Bishop. An Introduction to the Kalman Filter. 2006. url: http://www.cs.unc.edu/~welch/media/pdf/kalman_intro.pdf (visited on 03/04/2015).

[WDM01]

J. Widmer, R. Denda, and M. Mauve. “A survey on TCP-friendly congestion control”. In: IEEE Network 15.3 (2001), pp. 28–37. issn: 08908044. doi: 10.1109/65.923938.

[WGB99]

M. Weiser, R. Gold, and J. S. Brown. “The origins of ubiquitous computing research at PARC in the late 1980s”. In: IBM Systems Journal 38.4 (1999), pp. 693–696. issn: 0018-8670. doi: 10.1147/sj.384.0693.

[Wie10]

Peter Wieland. From static to dynamic couplings in consensus and synchronization among identical and non-identical systems. Berlin, Germany: Logos, 2010. isbn: 978-3-8325-2638-2.

[Wik14]

Wikipedia contributors. DBm — Wikipedia, The Free Encyclopedia. 2014. url: http://en.wikipedia.org/w/index.php?title=DBm&oldid=619985086 (visited on 08/08/2014).

[WKT11]

Fang-Jing Wu, Yu-Fen Kao, and Yu-Chee Tseng. “From wireless sensor networks towards cyber physical systems”. In: Pervasive and Mobile Computing 7.4 (2011), pp. 397–413. issn: 15741192. doi: 10.1016/j.pmcj.2011.03.003.

[WL08]

Fei-Yue Wang and Derong Liu, eds. Networked Control Systems: Theory and Applications. London: Springer, 2008. isbn: 978-1-84800-214-2.

[WMB06]

Bernard H. Walke, Stefan Mangold, and Lars Berlemann, eds. IEEE 802 wireless systems: Protocols, multi-hop mesh/relaying, performance and spectrum coexistence. Chichester and Hoboken, NJ: John Wiley & Sons, 2006. isbn: 978-0-470-01439-4.

[WVG12]

Yunbo Wang, Mehmet C. Vuran, and Steve Goddard. “Cross-Layer Analysis of the End-to-End Delay Distribution in Wireless Sensor Networks”. In: IEEE/ACM Transactions on Networking 20.1 (2012), pp. 305–318. issn: 1063-6692. doi: 10.1109/TNET.2011.2159845.

[WY01]

G. C. Walsh and Hong Ye. “Scheduling of networked control systems”. In: IEEE Control Systems Magazine 21.1 (2001), pp. 57–65. issn: 02721708. doi: 10.1109/37.898792.

[XH07]

Lurong Xiao and Edward Hung. “An Efficient Distance Calculation Method for Uncertain Objects”. In: Computational Intelligence and Data Mining (CIDM), Symposium on. IEEE, 2007, pp. 10–17. isbn: 978-1-4244-0705-7. doi: 10.1109/CIDM.2007.368846.

[XHL14]

Lida Xu, Wu He, and Shancang Li. “Internet of Things in Industries: A Survey”. In: IEEE Transactions on Industrial Informatics (2014), pp. 2233–2243. issn: 1551-3203. doi: 10.1109/TII.2014.2300753.

[ZBP01]

Wei Zhang, M. S. Branicky, and S. M. Phillips. “Stability of networked control systems”. In: IEEE Control Systems Magazine 21.1 (2001), pp. 84–99. issn: 02721708. doi: 10.1109/37.898794.

[Zie+14]

Julius Ziegler et al. “Making Bertha Drive—An Autonomous Journey on a Historic Route”. In: IEEE Intelligent Transportation Systems Magazine 6.2 (2014), pp. 8–20. issn: 1939-1390. doi: 10.1109/MITS.2014.2306552.

[Zim80]

H. Zimmermann. “OSI Reference Model – The ISO Model of Architecture for Open Systems Interconnection”. In: IEEE Transactions on Communications 28.4 (1980), pp. 425–432. issn: 00906778. doi: 10.1109/TCOM.1980.1094702.

[Zor+10]

Michele Zorzi et al. “From today’s INTRAnet of things to a future INTERnet of things: a wireless- and mobility-related view”. In: IEEE Wireless Communications 17.6 (2010), pp. 44–51. issn: 1536-1284. doi: 10.1109/MWC.2010.5675777.


Appendix


A. Extended Concepts and Definitions

A.1. Mobile Ad Hoc Networks

A.1.1. Computer Networking Basics

The most common reference model for describing the architecture of computer networks is the International Organization for Standardization (ISO)/Open Systems Interconnection (OSI) basic reference model [TW11]. This model defines an architecture of seven separate layers (Application, Presentation, Session, Transport, Network, Data-link, and Physical), each with a specific responsibility, which simplifies specifications and future changes of behaviour in each layer [ISO/IEC7498-1; Zim80]. The Presentation and Session layers no longer have any relevance for most current network applications [TW11]. The same holds for this thesis, so Tanenbaum’s simplified five-layer model is used instead.

Figure A.1 depicts the five remaining layers in the reference model for a communication between two applications on the computer hosts A and D, which has to be routed over the intermediate network hosts B and C. On the computer hosts A and D, each layer has an active instance participating in the communication, called a peer process. The intermediate network hosts B and C only have participating instances for the Network layer and below. Physical signals only travel via the physical medium below the Physical layer, but conceptually a peer process on one host only communicates with its counterpart on another host. The specification detailing this communication between two peer processes on separate hosts is called a protocol. A protocol suite is a set of protocols that defines one protocol for each layer in the network architecture. On a single computer host, the implementation of this protocol suite is called the protocol stack. Within a protocol stack, each layer process also has to communicate with the processes of its adjacent layers.
This communication takes place through a service interface that a layer process offers to the layer process directly above it.

Today’s Internet protocol suite aligns with the simplified reference model in figure A.1. The Internet Protocol (IP) is the Internet’s Network layer protocol, and TCP is the typical Transport layer protocol on which application-specific protocols, such as the Hypertext Transfer Protocol (HTTP), depend. An alternative Transport layer protocol for the Internet is UDP. The Internet protocol suite does not prescribe a particular Data-link layer protocol; rather, this may be chosen depending on the context and the underlying medium. Common lower layer protocols are IEEE 802.3 (Ethernet) for wired and IEEE 802.11 (WLAN) for wireless networking of personal computers, servers, and other computing devices like smartphones or tablets.

[Figure A.1: layer stacks for A (host), B (router), C (router), and D (host) with the layers Application, Transport, Network, Data-link, and Physical; labelled elements: layer name, peer process, protocol, service interface.]

Figure A.1.: Communication between an application on host A and an application on host D via intermediate (network) hosts B and C according to the layered network reference model, adapted from [TW11].

Besides the primary network related terminology that is introduced in chapter 2, the following terms are used in the thesis:

host: A node that is running applications to offer Application layer services to a user or other nodes.
router: A node that forwards messages between at least two other nodes in the Network layer.
unicast: A message/transmission intended for reception by a single node.
multicast: A message/transmission intended for reception by a group of nodes.
broadcast: A message/transmission intended for reception by all nodes, where the meaning of “all” depends on the layer to which the message belongs.

A.1.2. IEEE 802.11 Wireless Local Area Networks

Gast [Gas05] provides an in-depth reference to the IEEE 802.11 WLAN standard from a computer network perspective. The standard addresses both the Data-link and the Physical layer for wireless communication networks. Its various amendments use different modulation schemes and frequency bands, which result in varying maximum data rates. In the Data-link layer, IEEE 802.11 specifies the MAC sublayer and refers to the Logical Link Control of IEEE 802.2 for the remaining Data-link layer responsibilities.

Reliable frame transmission

A receiver acknowledges every correct unicast data frame that it receives. If the original sender does not receive the acknowledgement, it assumes that the data frame was lost or corrupted and initiates a retransmission, up to a maximum number of retransmission attempts. Each frame contains a checksum that the receiver uses to verify its integrity.

Shared Channel Access

The standard defines three coordination functions that regulate a node’s access to the shared physical medium: the Distributed Coordination Function (DCF), the point coordination function, and the hybrid coordination function. The hybrid coordination function, defined in the IEEE 802.11e amendment to improve QoS capabilities, consists of the two channel access methods EDCA and controlled channel access. Both the point coordination function and the hybrid coordination function’s controlled channel access require a central arbiter, such as an access point, to poll registered nodes for transmission and thus cannot be used in ad hoc mode. The polling is used to guarantee QoS constraints and effectively assigns fixed transmission intervals and durations to a node.

With both DCF and EDCA, nodes use a Carrier-Sensing Multiple Access (CSMA) scheme with collision avoidance to access the communication medium: Before transmitting a frame, the node senses the channel for at least the duration of a distributed interframe space (34 µs in 802.11a) to determine whether another transmission is already ongoing; if it senses the channel to be clear, it backs off and senses the channel for an additional time span of a random multiple of a slot time (9 µs in 802.11a), which is the collision avoidance part. This random multiple is picked from the interval between 0 and the node’s contention window. Only if the channel remains clear during this extended time does the node begin transmitting.
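The timing of the channel sensing described above can be illustrated with a minimal sketch. It only uses the two 802.11a values quoted in the text (34 µs interframe space, 9 µs slot time); the function names and the contention window of 15 slots in the usage lines are assumptions made for illustration and are not taken from the thesis.

```python
import random

# Timing values quoted in the text for IEEE 802.11a, in microseconds.
DIFS_US = 34
SLOT_TIME_US = 9


def draw_backoff_slots(contention_window, rng):
    """Pick the random number of back off slots from [0, contention_window]."""
    return rng.randint(0, contention_window)


def min_clear_time_us(backoff_slots):
    """Shortest time the channel must stay clear before the node may transmit:
    one distributed interframe space plus the drawn number of slot times."""
    return DIFS_US + backoff_slots * SLOT_TIME_US


rng = random.Random(42)   # fixed seed, only to make the sketch repeatable
contention_window = 15    # assumed value for illustration
slots = draw_backoff_slots(contention_window, rng)
print(f"{slots} slots -> channel must stay clear for {min_clear_time_us(slots)} us")
```

With these values, the required clear time lies between 34 µs (zero slots drawn) and 34 + 15 · 9 = 169 µs (largest possible draw).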
If the node encounters an ongoing transmission during its back off time, it interrupts its transmission attempt until it has again sensed the channel to be clear for the duration of a distributed interframe space and then resumes its previous transmission attempt by continuing with the back off timer. Figure A.2 illustrates the contention for channel access of two nodes that follow the IEEE 802.11 CSMA scheme with collision avoidance. If a node’s transmission fails, i.e., it does not receive an acknowledgement for the frame, it doubles the value of its contention window, up to a certain maximum; after a successful transmission, the contention window is reset to its predefined minimum value. The EDCA defines multiple access categories, which reflect differing transmission priorities, to support distributed QoS functionality; each access category has its own minimum value for the contention window, effectively assigning each category an individual transmission probability. For further details, Mangold et al. [Man+03] and Bianchi [Bia00] offer in-depth analyses of the functionality and performance of the IEEE 802.11 EDCA and DCF medium access schemes, respectively.

To increase a node’s chance of sensing a free channel when trying to transmit, nodes use a network allocation vector to arbitrate access to the shared medium. With every transmission, a node may reserve access to the medium by specifying a duration in the transmitted frame’s header. Other nodes that overhear the frame’s transmission register this duration and only attempt their next transmission afterwards. Using this mechanism, a sending node can prevent other nodes from transmitting on the medium before the receiving node has fully received and then acknowledged the transmission.

CSMA suffers from the hidden node problem: A sending node can only sense the channel at its own location.
But another node that is too far away from the sending node to sense its transmission may still cause interference at the intended receiving node that is strong enough to result in a collision. To counter the hidden node problem, the DCF allows the use of request to send/clear to send messages: A node that wants to send a data frame to a receiver first sends a request to send message that the receiver answers with a clear to send message if the channel is clear. In combination with the network allocation vector mechanism, this message pair ensures that other nodes that might cause collisions at the receiving node do not try to send on the channel until the full message exchange is done.

[Figure A.2: timeline (axis: time) with the labelled phases ongoing transmission (channel busy), distributed interframe space, slot duration, back off timer node A, back off timer node B, interruption of back off timer node B, node A transmits (channel busy), resume back off timer node B, and acknowledgment to node A's transmission (channel busy).]

Figure A.2.: Carrier-Sensing Multiple Access scheme with collision avoidance of two nodes, A and B, contending for channel access.

Ad Hoc Mode Specifics

In ad hoc mode, IEEE 802.11 WLAN nodes form an independent basic service set, identified by an up to 32 byte long field that is often referred to as the network name and exists in the same form in the infrastructure mode. This network name is the label that users regularly use to identify a network. To identify an ad hoc network on the MAC sublayer, the frames instead carry the network’s independent basic service set identifier in one of their address fields, not the network name. This identifier is a 48 bit number with the first bit in network order set to 0 (the individual/group bit, here indicating individual), the second bit set to 1 (the universal/local assignment bit, here indicating local assignment), and the remaining 46 bits chosen at random. All nodes that form an ad hoc network have to use the same independent basic service set identifier. When joining an existing ad hoc network, this is not a problem because the identifier is announced via the network’s beacon frames. When multiple nodes join to form an ad hoc network that they have pre-configured, they must agree on the independent basic service set identifier to use, a mechanism that might be unstable.
Consequently, some operating systems allow to directly configure the independent basic service set identifier. The IEEE 802.11p amendment for VANETs allows nodes to transmit data frames to the wildcard basic service set identifier, which effectively is a broadcast transmission on the Data-link layer, that may otherwise only be used to probe for access points or nodes belonging to any basic service set, either independent or infrastructure [JD08]. This local broadcast enables vehicles to rapidly exchange security related data without the need for any delaying handshake procedure.
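The bit layout of the independent basic service set identifier described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the mapping of "first bit in network order" to the least significant bit of the first byte is an assumption based on the usual IEEE 802 address convention; the sketch is not taken from any operating system's implementation.

```python
import secrets


def random_ibss_bssid() -> bytes:
    """Return a 6-byte identifier with I/G = 0, U/L = 1 and 46 random bits."""
    raw = bytearray(secrets.token_bytes(6))
    # Assumption: the first bit on the wire is the least significant bit
    # of the first byte, the second bit is the next higher one.
    raw[0] &= 0b11111110  # individual/group bit = 0 (individual)
    raw[0] |= 0b00000010  # universal/local bit = 1 (locally assigned)
    return bytes(raw)


def format_mac(addr: bytes) -> str:
    """Render the identifier in the usual colon-separated hex notation."""
    return ":".join(f"{b:02x}" for b in addr)


if __name__ == "__main__":
    print(format_mac(random_ibss_bssid()))
```

Two nodes that each run such a routine independently will almost certainly end up with different identifiers, which is one way to see why pre-configured ad hoc networks need an agreement step or a manually configured identifier.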


Access points of infrastructure WLANs send so-called Beacon frames at regular intervals to announce their network and provide timer synchronization for network related scheduling of attached clients. The timing synchronization function uses a 1 MHz clock and thus has microsecond precision. In ad hoc mode, at the time when a new Beacon frame has to be sent, all nodes in the network halt other transmissions and schedule a random backoff timer. When the timer fires on a node, it transmits the Beacon frame; every node that receives the Beacon frame cancels its own timer. For timer synchronization in the ad hoc mode, a node only updates its local clock to the Beacon’s time stamp if the latter is ahead of the local time. Hence, the nodes synchronize to the fastest running clock.

The Logarithmic Decibel Scale to Measure Power

A power, P, is typically measured in watts (W), as is the transmit power of radio waves. But to better handle the large range of values for signal power that is typical in radio propagation, the relative, logarithmic unit dBm with a 1 mW reference is often used instead [Rap02]:

$$\frac{P}{\mathrm{dBm}} = 10 \cdot \log_{10}\left(\frac{P}{0.001\,\mathrm{W}}\right) \tag{A.1}$$
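As a small illustration of equation (A.1), the following Python sketch converts between watts and dBm. The function names are chosen here for illustration only.

```python
import math


def watt_to_dbm(power_w: float) -> float:
    """Equation (A.1): power in dBm relative to a 1 mW reference."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return 10.0 * math.log10(power_w / 0.001)


def dbm_to_watt(power_dbm: float) -> float:
    """Inverse of equation (A.1)."""
    return 0.001 * 10.0 ** (power_dbm / 10.0)
```

For example, a transmit power of 100 mW corresponds to about 20 dBm, and every reduction by a factor of ten lowers the value by 10 dB.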

A.1.3. Routing in Mobile Ad Hoc Networks

Ad hoc On-demand Distance Vector Routing

AODV is a reactive routing protocol that is used, e.g., in Zigbee networks. Its current specification is published as [RFC3561]; this discussion of AODV is based upon [PR99], a paper by two of the specification’s authors. Sommer, Dietrich, and Dressler [SDD10] describe it as “[...] probably the best known protocol in the ad hoc networking community [...]”.

For the principal routing mechanism, each node along a route stores forward and backward pointers to the next and previous hop of the route, respectively, in its local route cache; packets are passed along the route purely with this local knowledge. A node that initiates the transmission of a Transport layer packet but has no route information cached for the intended destination node triggers a route discovery by sending a link-local broadcast route request to its direct neighbors. The route request is re-broadcast by intermediate nodes until it reaches either the intended destination node or an intermediate node that has cached a recent route to the intended destination node. On reception of a route request, each node updates its backward pointer for the source node if the request has a newer source sequence number than its last backward pointer; if the source sequence numbers are the same, the pointer is updated if the current request’s hop count is smaller than that of the previous one. The node at which the request ends answers with a route reply. The route reply is then passed along to the node indicated by the backward pointer; upon reception of the route reply, a node stores the sender in the route’s forward pointer.

Additional functions for route and local connectivity maintenance ensure that the protocol performs efficiently:

• An intermediate node that receives multiple route requests with the same source/destination combination and source sequence number usually drops all but the first it received. Subsequent receptions of the request are only re-broadcast if their remaining time-to-live field indicates a shorter route than the previous one. This prevents routing loops and helps to find the shortest route.
• To prevent stale route information, a node removes cached route information if it does not receive any packets for the route during a route caching timeout period.
• Every node maintains a set of known neighbor nodes that it updates whenever it receives a broadcast packet from a neighbor. To support this function, a node that has not sent a packet to its predecessor nodes of active routes during a hello interval broadcasts a hello packet. A node is removed from the set of neighbors if no packet from it is received during a certain number of consecutive hello intervals. Perkins and Royer [PR99] recommend a value of 2 intervals.
• A node that detects a link break along a cached route towards the route’s destination, either because it fails to pass a packet via that link or because it removes the link’s target node from its set of neighbors, sends an updated route reply to all predecessor nodes of affected routes to invalidate the routes.
• Upon reception of a route reply that invalidates a route, the route’s source node directly triggers the route discovery if it still uses the route. This function helps to reestablish active but broken routes before the next packet has to be transmitted and thus before the application performance is affected.

Dynamic Source Routing

DSR is a reactive routing protocol for MANETs, developed in the context of the long term research project Monarch at Carnegie Mellon University, with its current specification published as [RFC4728]. This discussion of DSR is based on the text book chapter [JMB01]. The principal routing mechanism has the source node inserting the complete route for a packet into the packet’s header.
Intermediate nodes then use this route information to forward that packet along the route. The protocol uses two distinct phases, route discovery and route maintenance, both of which are triggered purely on demand. The source node invokes route discovery when it starts sending IP packets to a node to which it does not know any route and may, by doing so, learn one or more routes to its intended destination. Route maintenance is the mechanism by which the source node discovers that a currently utilized route is broken. To recover from a broken route, it either switches to a previously discovered alternative route or enters route discovery to find new routes.

Additional procedures for route maintenance increase the protocol’s resilience to link failures and node mobility:

• An intermediate node on a route that experiences a link failure sends a route error message to the source node. If the intermediate node has cached another route to the destination node, it salvages the packet by sending it along this cached route.
• A node that overhears a packet transmission and finds itself in a later part of the yet unused portion of the included route information knows the route can be shortened. To inform the source node about the shorter route, the intermediate node sends a route response packet with the shortened route back to the source node.
• A source node that receives a route error packet piggybacks this error packet onto the subsequent route request packet that it uses to find a new route to the destination node. The error packet’s propagation ensures that other nodes have a chance to prune outdated routes from their caches and prevents them from answering the route request with an invalid route.

Optimized Link State Routing

OLSR is a proactive routing protocol that has been prominently used, e.g., by the German Freifunk community¹ to build free wireless mesh networks; the protocol’s specification is published in [RFC3626]. This discussion of OLSR is based upon [Jac+01], a paper to which the specification’s authors have contributed.

The protocol’s routing mechanism decides the next hop for a packet per node with the help of the node’s routing table that reflects the complete, known state of the network at the given time. Each node builds its routing table from two different, periodic announcement messages that each node broadcasts:

• With periodic, link-local broadcast hello messages, a node announces its presence together with a list of its known neighbours to its link-local neighbours. From these messages, every node learns about the nodes that it may reach with a distance of two hops. With this knowledge, a node selects a subset of its direct, i.e., 1-hop, neighbours in such a way that all its known 2-hop neighbours are still reachable via two hops. The nodes in this subset are called the node’s multipoint relays.
• With periodic topology control messages that are broadcast to the complete network, a node announces its set of multipoint relays. From the most recent topology control message per node, each node constructs the routing table that it uses to forward packets.
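The multipoint relay selection described in the first bullet can be read as a set cover problem. The text does not say which heuristic is used, so the following Python sketch uses a plain greedy cover as an assumption; it is not the selection procedure of the OLSR specification. The function name and the input format (a mapping from each 1-hop neighbour to the 2-hop neighbours reachable through it) are hypothetical.

```python
from typing import Dict, Set


def select_multipoint_relays(two_hop_via: Dict[str, Set[str]]) -> Set[str]:
    """Greedy cover of all known 2-hop neighbours by 1-hop neighbours.

    two_hop_via maps each 1-hop neighbour to the set of 2-hop
    neighbours that were learned from its hello messages.
    """
    uncovered: Set[str] = set().union(*two_hop_via.values())
    relays: Set[str] = set()
    while uncovered:
        # Pick the neighbour that covers the most still-uncovered nodes;
        # sorting the names first makes tie-breaking deterministic.
        best = max(sorted(two_hop_via),
                   key=lambda n: len(two_hop_via[n] & uncovered))
        relays.add(best)
        uncovered -= two_hop_via[best]
    return relays


if __name__ == "__main__":
    hello_info = {"A": {"X", "Y"}, "B": {"Y"}, "C": {"Z"}}
    print(sorted(select_multipoint_relays(hello_info)))
```

In the small example, neighbour B is not selected because everything it reaches is already covered by A, which shows how the relay set keeps the number of re-broadcasting nodes low.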
A node that experiences a link failure removes the intended recipient node from its set of known neighbours. The update is then reflected in its next hello message, but no other, direct action is initiated.
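For comparison with the on-demand protocols at the beginning of this section, the AODV rule for updating a backward pointer on reception of a route request can be sketched as follows. This is a minimal Python sketch with hypothetical names; it assumes that sequence numbers simply grow and ignores wrap-around as well as all timers.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BackwardPointer:
    previous_hop: str   # neighbour from which the request was received
    source_seq: int     # source sequence number of that request
    hop_count: int      # hops travelled by that request so far


def handle_route_request(table: Dict[str, BackwardPointer], source: str,
                         sender: str, source_seq: int, hop_count: int) -> bool:
    """Update the backward pointer for `source`; return True if it changed."""
    current = table.get(source)
    # A request is accepted if no pointer exists yet or it is newer.
    newer = current is None or source_seq > current.source_seq
    # With equal sequence numbers, only a smaller hop count wins.
    shorter = (current is not None
               and source_seq == current.source_seq
               and hop_count < current.hop_count)
    if newer or shorter:
        table[source] = BackwardPointer(sender, source_seq, hop_count)
        return True
    return False
```

A request with an older sequence number never replaces the stored pointer, regardless of its hop count, which is the property that keeps stale route information from overwriting fresh information.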

¹ Cf. http://freifunk.net/en/.

A.2. Data Analysis, Prediction, and Machine Learning

A.2.1. Regression, Classification, and Measures of Error

In logistic regression, the left hand side of equation (2.4) is not set equal to the response variable but rather to the logarithmic odds of the occurrence of state 1 for a two state random variable A that has probability $p_1 = P(A = 1)$. The logarithmic odds are expressed via the logit function such that:

$$\operatorname{logit}(p_1) = \ln\left(\frac{p_1}{1 - p_1}\right) = f_\theta(x) + e \tag{A.2}$$

Equation (A.2) makes it possible to use regression model functions, $f_\theta(x)$, which are typically but not necessarily linear, to define classification models.

The standard error of regression is an error measure that may be meaningfully compared with the data’s arithmetic mean or variance in order to judge the scale of a model’s predictive error [HA13]. Mathematically, the standard error of regression is the standard deviation of the prediction’s residuals [HA13]:

$$s_e = \sqrt{\frac{1}{N - p} \sum_{i=1}^{N} \left(y_i - f(x_i)\right)^2} \tag{A.3}$$

where p is the number of parameters in the prediction model.
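Equations (A.2) and (A.3) translate directly into a few lines of code. The following Python sketch is only an illustration with hypothetical function names; it is not the implementation used for the analyses in the thesis.

```python
import math
from typing import Callable, Sequence


def logit(p: float) -> float:
    """Logarithmic odds of a probability p with 0 < p < 1, cf. (A.2)."""
    return math.log(p / (1.0 - p))


def standard_error_of_regression(xs: Sequence[float], ys: Sequence[float],
                                 model: Callable[[float], float],
                                 n_params: int) -> float:
    """Equation (A.3): standard deviation of the residuals of `model`."""
    n = len(ys)
    if n <= n_params:
        raise ValueError("need more observations than model parameters")
    sse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - n_params))
```

With the observations (0, 1), (1, 3), (2, 5), (3, 8) and the two-parameter model f(x) = 2x + 1, only the last residual is non-zero (it equals 1), so the standard error is the square root of 1/(4 − 2).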

A.2.2. Time-Series

The backward shift operator, B, is a helpful device to describe mathematical models in the context of regular time-series [cf. BJR94]. Given $z_t \in \mathbb{R}$ is the time-series’s observed value at time $t \in \mathbb{N}$, then $B z_t = z_{t-1} \in \mathbb{R}$ is the previous observation and $B^m z_t = z_{t-m} \in \mathbb{R}$, $m \in \mathbb{N}$, is the observation m time steps before t. Likewise helpful is the backward difference operator, $\nabla$:

$$\nabla z_t = z_t - z_{t-1} = (1 - B) z_t \tag{A.4}$$
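Both operators are easy to state on a finite list of observations. The following Python sketch uses hypothetical function names and treats the list index as the time t.

```python
from typing import List, Sequence


def backshift(z: Sequence[float], t: int, m: int = 1) -> float:
    """Return B^m z_t = z_{t-m} for a series stored as a sequence."""
    if t - m < 0:
        raise IndexError("no observation that far in the past")
    return z[t - m]


def difference(z: Sequence[float], d: int = 1) -> List[float]:
    """Apply the backward difference operator of (A.4) d times."""
    out = list(z)
    for _ in range(d):
        # Each pass shortens the series by one element.
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out
```

For the series 1, 4, 9, 16, 25, one difference gives 3, 5, 7, 9 and a second difference gives the constant series 2, 2, 2, which illustrates how differencing removes a trend.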

A.2.3. Forecasting Methods for Time-Series

The naïve forecast is the simplest forecasting method [HA13]: the last observed value is simply carried over as the forecast for all future values up to the forecast horizon, h:

$$B^{-m} z_{t_0} = z_{t_0} \quad \forall m \in \{x \in \mathbb{N} \mid x > 0,\ x \le h \in \mathbb{N}\} \tag{A.5}$$

Box, Jenkins, and Reinsel [BJR94] describe ARIMA models as follows: Given the autoregressive operator

$$\phi(B) = 1 - \phi_1 B^1 - \phi_2 B^2 - \ldots - \phi_p B^p \tag{A.6}$$

of degree $p \in \mathbb{N}$ with the weights $\phi_1, \ldots, \phi_p \in \mathbb{R}$ and the moving average operator

$$\theta(B) = 1 - \theta_1 B^1 - \theta_2 B^2 - \ldots - \theta_q B^q \tag{A.7}$$

of degree $q \in \mathbb{N}$ with the weights $\theta_1, \ldots, \theta_q \in \mathbb{R}$, then

$$\phi(B) \nabla^d z_t = \theta(B) a_t \tag{A.8}$$

is an ARIMA(p, d, q) model. Here, $a_t$ is the value at time t of a white noise process, i.e., a time-series of independent random values, drawn from a Normal distribution with 0 mean. The three parameters p, d, and q have to be defined before fitting the model to a time-series; they control the model’s behavior: p specifies the number of previous observations that are taken into account for autoregression, d specifies the order of differencing of the time-series that is necessary to reach stationarity, and q specifies the number of previous forecasting errors to be used in the moving average term.
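To make the operator notation of equation (A.8) concrete, the following worked expansion writes out the case p = d = q = 1. The choice of this particular order is only an example.

```latex
\begin{align*}
% ARIMA(1,1,1): insert p = d = q = 1 into (A.8), using \nabla = 1 - B
(1 - \phi_1 B)(1 - B)\, z_t &= (1 - \theta_1 B)\, a_t \\
% multiply out the operators on the left-hand side
\bigl(1 - (1 + \phi_1) B + \phi_1 B^2\bigr)\, z_t &= a_t - \theta_1 a_{t-1} \\
% apply B^m z_t = z_{t-m} and solve for z_t
z_t &= (1 + \phi_1)\, z_{t-1} - \phi_1\, z_{t-2} + a_t - \theta_1\, a_{t-1}
\end{align*}
```

The current value thus depends on the two previous observations and on the current and previous white noise values, which matches the roles of p, d, and q described above.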


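B. Mathematical Formulations and Computations

B.1. Log-distance Path Loss Model

The Log-distance path loss model’s power equation (5.2) in section 5.1.1, taken from [Wal+06], is not obviously equivalent to the equation used by [Rap02],

$$\frac{L^{\mathrm{LD}}(d)}{\mathrm{dB}} = \frac{L(d_0)}{\mathrm{dB}} + 10 \cdot \alpha \cdot \log_{10}\left(\frac{d}{d_0}\right) \tag{B.1}$$

for which the path loss exponents in table 5.3 are given. Following [Rap02], we may use the free space path loss formula from equation (2.2) with a reference distance $d_0 = 1\,\mathrm{m}$ in case we are dealing with microcellular systems. The latter requirement holds because the typical transmission range of WLAN radios is at most a few hundred meters. The path loss is defined as the ratio of transmitted power to received power in dB [Rap02]:

$$\frac{L(d)}{\mathrm{dB}} = 10 \cdot \log_{10}\left(\frac{P_t}{P_r(d)}\right) \tag{B.2}$$

Using the Friis free space equation (2.2) to substitute the powers’ quotient yields the free space path loss:

$$\Rightarrow \frac{L^{\mathrm{FS}}(d)}{\mathrm{dB}} = -10 \cdot \log_{10}\left[\frac{G_t G_r}{L}\left(\frac{\lambda}{4\pi d}\right)^2\right] \tag{B.3}$$

Substituting the reference path loss in equation (B.1) with the free space path loss from equation (B.3) yields

$$\begin{aligned}
\frac{L^{\mathrm{LD}}(d)}{\mathrm{dB}} &= \frac{L^{\mathrm{FS}}(d_0)}{\mathrm{dB}} + 10 \cdot \alpha \cdot \log_{10}\left(\frac{d}{d_0}\right) \\
&= -10 \cdot \log_{10}\left[\frac{G_t G_r}{L}\left(\frac{\lambda}{4\pi d_0}\right)^2\right] - 10 \cdot \log_{10}\left[\left(\frac{d_0}{d}\right)^{\alpha}\right]
\end{aligned} \tag{B.4}$$

where we can substitute $d_0 = 1\,\mathrm{m}$ and join the two logarithmic terms to

$$\frac{L^{\mathrm{LD}}(d)}{\mathrm{dB}} = -10 \cdot \log_{10}\left[\frac{G_t G_r}{L}\left(\frac{\lambda}{4\pi\,\mathrm{m}}\right)^2\left(\frac{\mathrm{m}}{d}\right)^{\alpha}\right] \tag{B.5}$$

with m being the unit meter, not a variable. Subtracting the path loss from the transmitted power in dB produces the received signal strength in dB:

$$\begin{aligned}
\frac{P_r^{\mathrm{LD}}(d)}{\mathrm{dB}} &= 10 \cdot \log_{10}\left(\frac{P_t}{\mathrm{W}}\right) - \frac{L^{\mathrm{LD}}(d)}{\mathrm{dB}} \\
&= 10 \cdot \log_{10}\left(\frac{P_t}{\mathrm{W}}\right) + 10 \cdot \log_{10}\left[\frac{G_t G_r}{L}\left(\frac{\lambda}{4\pi\,\mathrm{m}}\right)^2\left(\frac{\mathrm{m}}{d}\right)^{\alpha}\right]
\end{aligned} \tag{B.6}$$

with dB, W, and m being the respective units, not variables. When joining the logarithmic terms and converting from dB to the original SI units we get

$$\frac{P_r^{\mathrm{LD}}(d)}{\mathrm{dB}} = 10 \cdot \log_{10}\left[\frac{P_t}{\mathrm{W}} \cdot \frac{G_t G_r}{L}\left(\frac{\lambda}{4\pi\,\mathrm{m}}\right)^2\left(\frac{\mathrm{m}}{d}\right)^{\alpha}\right] \tag{B.7}$$

$$\Rightarrow P_r^{\mathrm{LD}}(d) = \frac{P_t G_t G_r}{L}\left(\frac{\lambda}{4\pi\,\mathrm{m}}\right)^2\left(\frac{\mathrm{m}}{d}\right)^{\alpha} \tag{B.8}$$

which equals equation (5.2). Thus, the path loss exponents from table 5.3 are applicable to this formulation of the Log-distance path loss model.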
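The equivalence derived above can also be checked numerically. The following Python sketch implements equations (B.5) and (B.8) with hypothetical function names. The values in the usage part (2.4 GHz carrier, 100 mW transmit power, path loss exponent 3) are illustrative assumptions and are not taken from table 5.3.

```python
import math


def received_power_log_distance(pt_w: float, gt: float, gr: float,
                                wavelength_m: float, distance_m: float,
                                alpha: float, system_loss: float = 1.0) -> float:
    """Equation (B.8): received power in W, reference distance 1 m."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    at_reference = (wavelength_m / (4.0 * math.pi)) ** 2
    return (pt_w * gt * gr / system_loss) * at_reference * (1.0 / distance_m) ** alpha


def path_loss_db(gt: float, gr: float, wavelength_m: float, distance_m: float,
                 alpha: float, system_loss: float = 1.0) -> float:
    """Equation (B.5): Log-distance path loss in dB, reference distance 1 m."""
    ratio = (gt * gr / system_loss) * (wavelength_m / (4.0 * math.pi)) ** 2 \
        * (1.0 / distance_m) ** alpha
    return -10.0 * math.log10(ratio)


if __name__ == "__main__":
    wavelength = 299_792_458.0 / 2.4e9  # assumed 2.4 GHz carrier
    pr = received_power_log_distance(0.1, 1.0, 1.0, wavelength, 100.0, 3.0)
    loss = path_loss_db(1.0, 1.0, wavelength, 100.0, 3.0)
    # Both routes of the derivation give the same value in dB.
    print(10.0 * math.log10(pr), 10.0 * math.log10(0.1) - loss)
```

For a path loss exponent of 2, the expression reduces to the Friis free space equation, which is a convenient sanity check for the implementation.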

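B.2. Log-normal Shadowing Model

Equation (5.3) in section 5.1.1 for the calculation of received signal strength in watts according to the Log-normal Shadowing propagation model is, in this form, not present in the referenced literature. It can be derived from the Log-normal Shadowing model of [Rap02] in the same way as the Log-distance path loss equation above. With

$$\frac{L^{\mathrm{LNS}}(d)}{\mathrm{dB}} = \frac{L^{\mathrm{LD}}(d)}{\mathrm{dB}} + X_\sigma \tag{B.9}$$

the received signal strength in dB is calculated to be

$$\begin{aligned}
\frac{P_r^{\mathrm{LNS}}(d)}{\mathrm{dB}} &= 10 \cdot \log_{10}\left(\frac{P_t}{\mathrm{W}}\right) - \frac{L^{\mathrm{LNS}}(d)}{\mathrm{dB}} \\
&= 10 \cdot \log_{10}\left(\frac{P_t}{\mathrm{W}}\right) - \frac{L^{\mathrm{LD}}(d)}{\mathrm{dB}} - X_\sigma \\
&= 10 \cdot \log_{10}\left(\frac{P_t}{\mathrm{W}}\right) + 10 \cdot \log_{10}\left[\frac{G_t G_r}{L}\left(\frac{\lambda}{4\pi\,\mathrm{m}}\right)^2\left(\frac{\mathrm{m}}{d}\right)^{\alpha}\right] - X_\sigma \\
&= 10 \cdot \log_{10}\left[\frac{P_t}{\mathrm{W}} \cdot \frac{G_t G_r}{L}\left(\frac{\lambda}{4\pi\,\mathrm{m}}\right)^2\left(\frac{\mathrm{m}}{d}\right)^{\alpha}\right] - X_\sigma \\
&= \frac{P_r^{\mathrm{LD}}(d)}{\mathrm{dB}} - X_\sigma \quad \text{(with substitution from (B.7))}
\end{aligned} \tag{B.10}$$

which can be transformed from the logarithmic domain in dB to the original SI units to yield the Log-normal Shadowing model according to equation (5.3).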
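The final transformation step can be written out as follows. Equation (5.3) itself is not reproduced in this appendix, so the result below is only what the conversion implies under the convention of (B.7) that a power in dB is ten times the base-10 logarithm of the power in watts; $X_\sigma$ is taken to be a value in dB.

```latex
\begin{align*}
% invert the dB definition used in (B.7) for the last line of (B.10)
\frac{P_r^{\mathrm{LNS}}(d)}{\mathrm{W}}
  &= 10^{\frac{1}{10}\left(\frac{P_r^{\mathrm{LD}}(d)}{\mathrm{dB}} - X_\sigma\right)} \\
% split the exponent into the Log-distance part and the shadowing part
  &= 10^{\frac{1}{10}\frac{P_r^{\mathrm{LD}}(d)}{\mathrm{dB}}} \cdot 10^{-X_\sigma / 10} \\
% the first factor is the Log-distance received power from (B.8)
  &= \frac{P_r^{\mathrm{LD}}(d)}{\mathrm{W}} \cdot 10^{-X_\sigma / 10}
\end{align*}
```

In linear units, the shadowing term therefore acts as a multiplicative factor on the Log-distance received power: a value of $X_\sigma = 10$ reduces the received power to one tenth.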


C. Software Packages

C.1. Use of the Statistical Computing Environment R

All statistical data analysis and the prediction model implementation presented in the thesis have been carried out using the R statistical software package version 3.1.1, which is available from http://www.r-project.org/.

Table C.1.: Utilized R packages and their versions.

Package     Version    Package     Version    Package      Version
dlm         1.1-4      gtable      0.1.2      reshape2     1.4
doSNOW      1.0.12     igraph      0.7.1      robustbase   0.92-3
flexmix     2.3-11     infotheo    1.2.0      scales       0.2.4
foreach     1.4.2      iterators   1.0.7      snow         0.3-13
forecast    5.5        lattice     0.20-29    timeDate     3010.98
fpc         2.1-7      MASS        7.3-34     tseries      0.10-32
ggplot2     1.0.0      mclust      4.3        zoo          1.7-11
gridExtra   0.9.1      plyr        1.8.1

C.2. Use of the Discrete Event Simulator OMNeT++

The simulation studies conducted for this thesis all use the OMNeT++ discrete event simulation framework version 4.4.1, which is available from http://www.omnetpp.org/omnetpp/doc_details/2272-omnet-441-source--ide-tgz, with the INET library version 2.4.0, which is available from http://omnetpp.org/download/contrib/models/inet-2.4.0-src.tgz [cf. VH08]. The simulations are built and run on an Ubuntu 14.04 x86_64 GNU/Linux workstation with Linux kernel 3.13.0-27-generic #50-Ubuntu SMP. The compiler used to build the discrete event simulator and the simulations is Clang version 3.4-1ubuntu3, based on LLVM 3.4, in C++11 standard mode. All data referred to in this thesis was produced using the release build, compiled with the -O3 optimization flag. The GNU C Library version Ubuntu EGLIBC 2.19-0ubuntu6 was used as provided by the distribution (compiled with GNU CC version 4.8.2). The simulations use the OMNeT++ 4.4 default random number generator, Mersenne Twister, which has a period of 2^19937 − 1 and 623-dimensional equidistribution when producing output with up to 32-bit accuracy [VO14; MN98].
