SEMIDEFINITE PROGRAMMING APPROACHES TO DISTANCE GEOMETRY PROBLEMS
A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
Pratik Biswas
June 2007
© Copyright by Pratik Biswas 2007
All Rights Reserved
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
(Yinyu Ye)
Principal Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
(Stephen Boyd)
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
(Leonidas Guibas, Computer Science)
Approved for the University Committee on Graduate Studies.
Abstract

Given a subset of all the pair-wise distances between a set of points in a fixed dimension, and possibly the positions of a few of the points (called anchors), can we estimate the (relative) positions of all the unknown points (in the given dimension) accurately? This problem is known as the Euclidean Distance Geometry or Graph Realization problem, and generally involves solving a hard non-convex optimization problem. In this thesis, we introduce a polynomial-time solvable Semidefinite Programming (SDP) relaxation of the original formulation. The SDP yields higher-dimensional solutions when the given distances are noisy. To alleviate this problem, we adopt ideas from dimensionality reduction and use local refinement to improve the estimation accuracy of the original relaxation. We apply this algorithm to Wireless Sensor Network Localization, i.e., estimating the positions of nodes in a wireless network using pair-wise distance information (acquired from communications between the nodes within 'hearing' range of each other). Using this 'cooperative' approach, we can estimate network structures with very few anchors and low communication range, offering a distinct advantage over schemes like GPS triangulation. As the number of unknown points increases, the problem starts becoming computationally intractable. We describe an iterative and distributed approach to realize the structure of a graph, even in the absence of anchor points. This algorithm is implemented for Molecule Conformation, i.e., finding the 3-D structure of a molecule, given just ranges on a few interatomic distances (acquired by Nuclear Magnetic Resonance measurements). We are able to derive accurate structures of molecules with thousands of atoms from sparse and noisy interatomic distance data. We also extend these formulations to solve the problem of position estimation using angle information.
Once again, we demonstrate the working of the algorithm for real-life systems, in this case, image-based sensor networks.
Acknowledgements

I hope to express my gratitude towards at least some of the individuals and institutions that, through their kindness and encouragement, have led me to the point where I can claim to be a certified 'Doctor of Philosophy'. First and foremost, I would like to thank my adviser, Professor Yinyu Ye. He has done a lot for me in the last 5 years, ever since I signed up with him for an independent study in my first year as a masters student, curious to learn more about convex optimization. With his endless supply of fresh ideas and openness to looking at new problems in different areas, his guidance has proved indispensable to my research. On a personal level, I will always remember the enthusiasm and friendliness with which he received anyone who would stop by his office. I hope to imbibe his sense of humility and good humor into my life. A great debt is also due to Professor Stephen Boyd, for his patient reading of this thesis, and his courses on linear algebra and convex optimization. It was his unique style of teaching that initially helped shape my research interests. Thanks to Professor Leonidas Guibas, who took time out of his busy schedule to be on my reading committee. The suggestions offered by Professors Boyd and Guibas greatly helped improve the quality and presentation of this thesis. Professor Fabian Pease was very kind in agreeing to chair my orals committee, and I am very grateful to him for it. I would like to acknowledge the contributions of my co-authors and collaborators: Professor Kim-Chuan Toh from NUS, for the work on noise handling and molecule conformation; Tzu-Chen Liang and Ta-Chung Wang, for their work on gradient descent; Anthony So, for his theoretical work on localizability and tensegrity theory; and Siddharth Joshi, for his advice with the truncated Newton methods.
The members of Professor Ye's research group (Arjun, Ben, Dongdong, John, Mark, Shipra, Zhisu and others) also gave invaluable comments and suggestions during the course of this research.
I am very grateful to Dr. Hamid Aghajan, Director of the Wireless Sensor Networks Laboratory at Stanford, for his help with the work on using angle information, and for the opportunity to be his teaching assistant in one of his courses. I also appreciate the help and guidance offered to me by Dr. Renate Fruchter, Director of the Project Based Learning Laboratory, during my first couple of years at Stanford. I would like to express my gratitude towards the Automatic Control Laboratory at ETH Zurich, the Sensor Networks Operation at Intel Research and Amaranth Advisors, for inviting me to summer internships and giving me interesting problems to work on. My deepest thanks to Stanford University, which may be considered a slightly amorphous entity in this context. For all the wonderful people I have met here, and all the opportunities I was allowed to learn and appreciate the good things in life. I feel incredibly lucky to have been a part of this great institution, and I will cherish the memories of my time here. My family has been a strong presence in my life. Apart from their unquestioning love and support, they have given me the gift of a deep sense of wonder for all creation and an open mind with which to explore it. Whatever pursuits I have undertaken in my life, I have always been inspired by their words and actions, and I will be forever indebted to them for it.
Contents

Abstract
Acknowledgements

1 Introduction
  1.1 Motivation
    1.1.1 Wireless sensor network localization
    1.1.2 Molecule conformation
  1.2 Distance geometry background
  1.3 Contributions of the thesis

2 SDP relaxation of the distance geometry problem
  2.1 The distance geometry problem
  2.2 Semidefinite programming relaxation
  2.3 SDP model analyses
    2.3.1 Uniqueness
    2.3.2 Analysis of dual and exactness of relaxation
    2.3.3 Computational complexity
    2.3.4 Examples

3 Dealing with noise: The sensor network localization problem
  3.1 Introduction
  3.2 Related work
  3.3 The SDP model with noise
    3.3.1 Error minimization
    3.3.2 Choice of cost functions and weights
  3.4 Effect of noisy distance data
    3.4.1 The high rank property of SDP relaxations
  3.5 Regularization
    3.5.1 Choice of regularization parameter
  3.6 Local refinement
    3.6.1 A gradient descent method
    3.6.2 Other local refinement methods
  3.7 Distributed localization
    3.7.1 Iterative algorithm
    3.7.2 Examples
    3.7.3 Extensions
  3.8 Experimental results
    3.8.1 Effect of varying radio range and noise factor
    3.8.2 Effect of number of anchors
    3.8.3 Computational effort
    3.8.4 Distributed method
    3.8.5 Comparison with existing methods
  3.9 Conclusion

4 Anchor free distributed graph realization: molecule structure prediction
  4.1 Introduction
  4.2 Related work
  4.3 The inequality based SDP model
  4.4 Theory of anchor-free graph realization
  4.5 Regularization term
  4.6 Clustering
  4.7 Stitching
  4.8 Distributed algorithm
  4.9 Experimental results
  4.10 Conclusion and work in progress

5 Using angle information
  5.1 Related work
  5.2 Integrating angle information
    5.2.1 Scenario
    5.2.2 Distance and angle information
    5.2.3 Pure angle information
  5.3 Practical considerations
    5.3.1 Noisy measurements
    5.3.2 Computational costs
  5.4 Simulation results
  5.5 Conclusions

6 Conclusions and future research
  6.1 Theoretical aspects
  6.2 Sensor network localization
  6.3 Molecule structure determination
  6.4 Extensions to angle information

Bibliography
List of Tables

4.1 Results for 100% of distances below 6 Å and 10% noise on upper and lower bounds
4.2 Results with 70% of distances below 6 Å and 5% noise on upper and lower bounds, and with 50% of distances below 6 Å and 1% noise on upper and lower bounds
List of Figures

2.1 Position estimations with 3 anchors, accurate distance measures and varying R.
2.2 Diamond: the offset distance between estimated and true positions; Box: the square root of the individual trace $\bar{Y}_{jj} - \|\bar{x}_j\|^2$.
2.3 Position estimations by Doherty et al., R = 0.3, accurate distance measures and various numbers of anchors.
3.1 Effect of intelligent weighting in the multiplicative noise model, 50 points, 4 anchors, 20% multiplicative noise.
3.2 Ratio between $E[v^*]$ and the upper and lower bounds in Proposition 3.1; and the ratio between average RMSD and $\sqrt{\tau}$ obtained from the SDP relaxation of (3.1).
3.3 60 sensors, 4 anchors, radio range = 0.3, 20% normal multiplicative noise. The positions are estimated from the solution of the SDP (3.3).
3.4 RMSD error of the estimated positions obtained from SDP with regularization parameter $\lambda$ for the examples in Figure 3.3.
3.5 Same as Figure 3.3, but for SDP with regularization. The regularization parameter $\lambda$ is chosen to be the value in (3.16), which equals 0.551 and 0.365, respectively.
3.6 Refinement through the gradient descent method, for the example in Figure 3.3(b).
3.7 Refinement through the gradient method, for the case in Figure 3.3(a).
3.8 Comparison of truncated Newton and gradient descent for a network of 100 points and 5 anchors, radio range = 0.175, 20% multiplicative noise.
3.9 Convergence of local refinement methods.
3.10 Position estimations for the 2,000-node sensor network, 200 anchors, no noise, radio range = 0.06, and the number of clusters = 49.
3.11 Position estimations for the 2,000-node sensor network, 200 anchors, 10% multiplicative noise, radio range = 0.05, and the number of clusters = 36.
3.12 Irregular networks with geodesic distances.
3.13 Variation of estimation error (RMSD) with measurement noise, 6 anchors randomly placed.
3.14 Variation of estimation error (RMSD) with number of anchors (randomly placed) and varying radio range, 20% multiplicative noise.
3.15 Variation of estimation error (RMSD) with number of anchors (randomly placed) and varying measurement noise, R = 0.3.
3.16 Number of constraints vs. number of points.
3.17 Solution time vs. number of points.
3.18 Distributed method simulations.
3.19 MDS results, estimation error for 200-point networks with varying radio range, 5% multiplicative noise.
3.20 Comparison with MDS, estimation error for 200-point networks with varying radio range, 5% multiplicative noise.
4.1 Schematic diagram of the quasi-block-diagonal structure considered in the distributed SDP algorithm.
4.2 Comparison of actual and estimated molecules.
4.3 1I7W (8629 atoms) with 100% of distances below 6 Å and 10% noise on upper and lower bounds, RMSD = 1.3842 Å.
4.4 1BPM (3672 atoms) with 50% of distances below 6 Å and 5% noise on upper and lower bounds, RMSD = 2.4360 Å.
4.5 Comparison between the original (left) and reconstructed (right) configurations for various protein molecules using Swiss PDB viewer.
4.6 CPU time taken to localize a molecule with n atoms versus n.
5.1 Sensor, axis orientation and field of view.
5.2 3 points, the circumscribed circle and related angles, $\angle AOB = 2\angle ACB$.
5.3 Variation of estimation error with measurement noise.
5.4 Variation of absolute estimation error with measurement noise.
5.5 Variation of estimation error with radio range.
5.6 Comparisons in different scenarios.
5.7 Variation of estimation error with more anchors.
5.8 Effect of field of view: 40-node network, 4 anchors, 10% noise.
5.9 Example of effect of field of view on the pure angle approach: 40-node network, 4 anchors, R = 0.5, no noise.
5.10 Solution times.
Chapter 1
Introduction

1.1 Motivation
The study of distances to infer structure is an important one, and is gaining increasing applicability in fields as varied as wireless network position estimation [11, 25, 77], molecule structure prediction [12, 27, 59], data visualization [34, 66], internet tomography [24] and map reconstruction [28]. More recently, the theory is also being applied to manifold learning and dimensionality reduction, particularly in the computer graphics and image processing domains. Examples include face recognition [49] and image segmentation [92]. One essential question in this domain is: given a noisy and incomplete set of Euclidean distances between a network of points (in a given dimension), can we design robust, efficient and scalable algorithms to recover their (relative) positions? We will focus on two particular applications of this problem which have received great attention from researchers and are the examples considered in detail in this thesis.
1.1.1 Wireless sensor network localization
Advances in micro-electro-mechanical systems (MEMS) and wireless communications have made the sensor network a highly efficient, low cost and robust technology. A typical sensor network consists of a large number of sensors which are densely deployed in a geographical area. Sensors collect local environmental information, such as temperature or humidity, and can communicate with each other to relay the gathered data to a decision making authority. Based on the gathered data, inferences can be made about the status of the environment being monitored and appropriate action taken. Examples of areas which are
attracting large research efforts in this direction, and in which prototype networks have been developed, include the habitat monitoring system in the Great Duck Island (GDI) experiment [57], environment monitoring for detecting volcano eruptions [3], the Code-Blue project for emergency medical care [55], industrial control in semiconductor manufacturing plants and oil tankers [51], structural health monitoring [23, 99], military and civilian surveillance and moving object tracking [38], and asset location [32]. The information collected through a sensor network can be interpreted and relayed far more effectively if we know where it is coming from and where it needs to be sent, i.e., if there is some spatial information corresponding to each sensor. For example, in an environmental monitoring application, just having access to the raw data does not yield much information about the variation of the environmental factors within the space considered. In contrast, having a geographical map and environmental information corresponding to points in the map is much more insightful, in terms of providing a clearer picture of how the environmental parameters change across the observed region. In fact, some sensor network applications may simply be required to perform the task of target tracking or asset location. The positions of sensor nodes may also be used for intelligent geographical routing purposes, thus reducing the burden on the network's communication and power resources. For densely deployed networks spread over a large area, it is often very expensive and infeasible to manually administer the setup of the network, especially when the setup is ad hoc in nature. The initialization of such networks should be done automatically by the network itself. Furthermore, the sensor devices are designed to be low power and low bandwidth, in order to minimize operating costs and maximize operating lifetime.
Therefore, it would be too expensive, even infeasible, to use solutions such as a Global Positioning System (GPS) [46] device on every sensor, compounded by the fact that GPS has difficulty in indoor scenarios. It might be possible, however, to have at least a few anchor nodes whose positions are known either by GPS or by manual placement. For the other sensor nodes, relative position information like distances and angles between sensors may be available. A wireless sensor can have the capability to communicate with other sensors which are within a communication radius of it. The communication radius depends mainly on the transmission and measurement schemes used and their respective power and communication constraints. Using different range and angle measurement schemes, such as received signal strength (RSS), time of arrival (ToA), and angle of arrival (AoA), relative distance and angle estimates can be made about the neighboring sensors.
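To make one of these ranging schemes concrete, the sketch below inverts the standard log-distance path-loss model to turn an RSS reading into a range estimate. The model and every parameter value (reference power `p0_dbm`, reference distance `d0`, path-loss exponent `n`) are illustrative assumptions for this example, not measurements or methods from this thesis.

```python
import math

def rss_to_distance(p_rx_dbm, p0_dbm=-40.0, d0=1.0, n=2.0):
    """Invert the log-distance path-loss model
        P(d) = P0 - 10 * n * log10(d / d0)
    to obtain d = d0 * 10 ** ((P0 - P) / (10 * n)).
    p0_dbm is the power received at reference distance d0 (meters) and
    n is the path-loss exponent; all values here are assumed.
    """
    return d0 * 10.0 ** ((p0_dbm - p_rx_dbm) / (10.0 * n))

# With these assumed parameters, a -60 dBm reading maps to a 10 m range.
assert abs(rss_to_distance(-60.0) - 10.0) < 1e-9
```

In practice the exponent `n` varies with the environment, which is one source of the multiplicative range noise modeled in Chapter 3.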
The problem of inferring global position information for all sensor nodes from this pair-wise distance information is referred to as the localization problem. Some excellent surveys on both the hardware and algorithmic aspects of the localization problem include [40], [60], [72]. The different strategies pursued vary in the kind of measurements (distance, angle, hop information) and the hardware and sensing mechanisms (RF, acoustic, camera based) used. Many current localization solutions use triangulation with multiple anchors to find unknown sensor positions. This approach, however, calls for high anchor density and high communication radius, both of which are costly propositions. Using global distance information between all sensors, not just anchor-to-unknown but unknown-to-unknown as well, would allow the entire network to 'cooperatively' compute positions for all its nodes in a far more cost-effective manner. The challenge is to develop schemes that are scalable to large networks, in terms of cost and power, while not compromising on estimation accuracy in spite of the noise and incompleteness in the distance measurements.
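The anchor-based triangulation baseline described above can be sketched as a linearized least-squares problem: subtracting the first anchor's range equation from the others cancels the quadratic term in the unknown position. This is a hypothetical illustration of that baseline, not the SDP method developed in this thesis.

```python
import numpy as np

def trilaterate(anchors, dists):
    """Estimate a 2-D position from anchor positions and range measurements.
    From ||x - a_i||^2 = d_i^2, subtracting the i = 0 equation gives the
    linear system  2*(a_i - a_0) @ x = ||a_i||^2 - ||a_0||^2 + d_0^2 - d_i^2,
    which is solved in the least-squares sense (needs >= 3 anchors in 2-D).
    """
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (anchors[1:] ** 2).sum(axis=1) - (a0 ** 2).sum() + d0 ** 2 - dists[1:] ** 2
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Demo: three anchors and exact ranges to a point at (0.3, 0.4).
anchors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
p = np.array([0.3, 0.4])
d = np.linalg.norm(anchors - p, axis=1)
assert np.allclose(trilaterate(anchors, d), p, atol=1e-9)
```

With noisy ranges or poor anchor geometry this estimate degrades quickly, which is exactly the cost/accuracy trade-off that motivates the cooperative SDP approach.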
1.1.2 Molecule conformation
Significant research efforts are being directed into the field of structural genomics, i.e., the study of mapping the 3-D structure of molecules, proteins in particular [17, 91]. Creating a database of protein structures, as in the case of the Protein Data Bank [8], provides the opportunity to recognize homology in different proteins that may not be detectable by merely comparing sequences. This is quite likely because protein structure is often well conserved over evolutionary time. By developing basic 3-D structures of proteins, we can build up a family of proteins with similar characteristics. In fact, such structure is also crucial in determining the molecular mechanism and function of the protein in question, and can be used in directing efforts in drug design. Combined with other molecular modeling tools, molecule structure determination (or conformation) can prove to be an indispensable component in the study of understanding proteins. Kurt Wüthrich and his co-researchers started a revolution in this field by introducing the use of Nuclear Magnetic Resonance (NMR) as a convenient method by which to measure interatomic distances for proteins in solution [97]. The nuclei of some atoms resonate at particular frequencies when placed in a magnetic field of a particular intensity. However, depending on the local chemical environment (influenced by neighboring atoms and electron density), the frequency shifts slightly. By performing a series of experiments on the molecule,
an NMR spectrum can be constructed. By observing the peaks in the NMR spectrum, distance ranges can be computed between the nuclei corresponding to a particular peak. Having recovered distance ranges between some of the atoms, valid configurations of the molecule that satisfy these distance constraints need to be found, a task in which distance geometry techniques play a huge role. The basic statement of the problem is as follows: given upper and lower bounds on a sparse subset of all interatomic distances, can we find configurations of the entire molecule that satisfy the given distance constraints? The book by Crippen and Havel [27] provides a comprehensive background to the links between molecule conformation and distance geometry. The problem sizes in this space (often thousands of atoms) call for parallel or distributed approaches, without which the problem may be intractable.
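As a minimal illustration of what "satisfying the given distance constraints" means computationally, the hypothetical helper below scores a candidate 3-D configuration against NMR-style upper and lower bounds. It is illustrative only; the thesis's actual approach (Chapter 4) solves an SDP rather than evaluating candidate configurations.

```python
import numpy as np

def bound_violation(X, constraints):
    """Total amount by which a candidate configuration X (an n x 3 array of
    atomic coordinates) violates a sparse set of distance bounds, given as
    (i, j, lower, upper) tuples. Zero means all bounds are satisfied.
    """
    total = 0.0
    for i, j, lo, hi in constraints:
        d = np.linalg.norm(X[i] - X[j])
        total += max(lo - d, 0.0) + max(d - hi, 0.0)
    return total

# A 3-atom toy configuration that satisfies all three (made-up) bounds.
X = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 0.0]])
cons = [(0, 1, 1.0, 2.0), (0, 2, 1.0, 2.0), (1, 2, 2.0, 3.0)]
assert bound_violation(X, cons) == 0.0
```

Any configuration with zero violation is a valid realization of the given bounds; the difficulty, and the subject of Chapter 4, is finding one among exponentially many candidates.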
1.2 Distance geometry background
The distance geometry problem has been studied extensively, and for a long time, in the framework of Euclidean distance matrix (EDM) completion. Schoenberg [74] and Young and Householder [103] established some basic properties of distance matrices which have set the stage for future research in distance geometry.

In particular, they considered the case when all pair-wise distances between a set of $n$ points in $\mathbb{R}^d$ are known. Let $X = [x_1, x_2, \ldots, x_n]$ be the $d \times n$ matrix of points, and let $D$ be the distance matrix, where $D_{ij} = \|x_i - x_j\|^2$. The Gram matrix or inner product matrix is defined as $Y = X^T X$. Note that

$$D_{ij} = Y_{ii} + Y_{jj} - 2Y_{ij}.$$

By assuming $x_n = 0$ without loss of generality (since this corresponds to a simple translation), the Gram matrix can also be expressed in terms of the distance matrix as

$$Y_{ij} = \frac{1}{2}(D_{in} + D_{jn} - D_{ij}).$$

It was shown that the distance matrix $D$ corresponds to a realization in dimension $d$ if and only if the corresponding Gram matrix $Y$ is positive semidefinite (psd) and has rank equal to $d$. Therefore, in the case of incomplete distance matrices, the task would be to complete
the distance (or inner product) matrix in a manner such that a realization in dimension $d$ is possible. It is also worth noting that in a rank-$d$ factorization of the completed Gram matrix, $Y = \tilde{X}^T \tilde{X}$, $\tilde{X}$ corresponds to a realization of the points in dimension $d$; and if there are enough distances in the incomplete matrix to ensure a unique realization to begin with, this realization is simply a translated and rotated version of the original set of points. These ideas form the basis of a class of algorithms known as multidimensional scaling (MDS) [26, 85]. Algorithms based on MDS sometimes use objective functions (such as the STRESS criterion) [87, 88] that encourage low-dimensional solutions for the given incomplete distance matrix. The problem with these techniques is the possibility of getting stuck in local minima due to the nonconvex nature of the problem, and there is no guarantee of finding a desired realization in polynomial time. With this in mind, researchers have also attempted to pin down the exact computational complexity of this problem. Saxe [73] proved that the EDM problem is NP-hard in general. Aspnes et al. [4] proved a similar result in the context of sensor localization. Moré and Wu [58], who used global optimization techniques for distance geometry problems in the molecule conformation space, established that finding an $\epsilon$-optimal solution, when $\epsilon$ is small, is also NP-hard. Other general global optimization approaches, which employ techniques like pattern search [65], face similar problems. Chapter 8 of the book by Boyd and Vandenberghe [16] investigates the use of convex optimization for a wide range of geometric problems. The book by Dattorro [28] explores more specifically the theoretical links between convex optimization and Euclidean distance geometry.
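The Gram-matrix relations in this section translate directly into the classical MDS procedure. The sketch below uses the equivalent double-centering form of the Gram matrix (rather than the $x_n = 0$ convention used above, which differs only by a translation) to recover coordinates, up to a rigid motion, from a complete noiseless matrix of squared distances. It is illustrative only and does not handle the incomplete or noisy case that motivates the SDP relaxation.

```python
import numpy as np

def classical_mds(D, d):
    """Recover an n x d realization from a complete matrix of squared
    pairwise distances D via the top-d eigenpairs of the Gram matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    Y = -0.5 * J @ D @ J                      # Gram matrix of centered points
    w, V = np.linalg.eigh(Y)                  # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]             # indices of the d largest
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Demo: exact squared distances from random planar points are reproduced.
rng = np.random.default_rng(0)
X = rng.random((8, 2))
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
Xhat = classical_mds(D, 2)
Dhat = ((Xhat[:, None, :] - Xhat[None, :, :]) ** 2).sum(axis=-1)
assert np.allclose(D, Dhat, atol=1e-8)
```

Because the realization is determined only up to translation, rotation and reflection, the check compares pairwise distances rather than positions.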
Semidefinite programming (SDP) relaxation techniques have been exploited for a large variety of NP-hard problems, many of which concern Euclidean distances, such as data compression, metric-space embedding, covering and packing, chain folding [22, 42, 54, 101], manifold learning and nonlinear dimensionality reduction [93]. Not surprisingly, SDP also finds application in Euclidean distance matrix completion. Alfakih [1] and Laurent [52] have previously discussed the use of SDP relaxations in EDM completion, focusing on issues such as realizability and interior point algorithms for these SDPs. Barvinok [5] has used SDP theory to study this problem in the context of quadratic maps and to analyze the dimensionality of the solutions.
1.3 Contributions of the thesis
While there has been a lot of theoretical work on using SDP for distance geometry, its application to real-life problems has so far been limited. This thesis presents a novel SDP relaxation for the distance geometry problem, and analyzes its theoretical properties in the context of the uniqueness and realizability of solutions in a fixed dimension. It then addresses actual scenarios where the SDP relaxations for Euclidean distance geometry may be applied, and in doing so, attempts to tackle the issues of achieving good estimation accuracy with noisy measurements and scalability for large-scale implementation.

Chapter 2 introduces the basic distance geometry problem with exact distance information and its SDP relaxation. The properties of this relaxation, such as its tightness and computational complexity, are discussed. The material in this chapter is based on the papers:

Pratik Biswas and Yinyu Ye. Semidefinite programming for ad hoc wireless sensor network localization. In Proceedings of the Third International Symposium on Information Processing in Sensor Networks, pages 46–54. ACM Press, 2004,

and

Pratik Biswas, Tzu-Chen Liang, Ta-Chung Wang, and Yinyu Ye. Semidefinite programming based algorithms for sensor network localization. ACM Transactions on Sensor Networks, 2(2):188–220, 2006.

The duality analysis in Section 2.3.2 is borrowed from the paper:

Anthony Man-Cho So and Yinyu Ye. Theory of semidefinite programming relaxation for sensor network localization. In Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 405–414, 2005, and in Mathematical Programming, Series B, 109:367–384, 2007.

The basic SDP model is extended to deal with noisy distance information, and its application to sensor network localization is discussed in Chapter 3. In particular, methods are developed to compute low-dimensional solutions for the SDP relaxation, such as adding a regularization term to the objective function of the optimization problem and using a post-processing gradient descent for local refinement. The material in this chapter is based on the paper:
Pratik Biswas, Tzu-Chen Liang, Kim-Chuan Toh, Ta-Chung Wang, and Yinyu Ye. Semidefinite programming approaches for sensor network localization with noisy distance measurements. IEEE Transactions on Automation Science and Engineering, Special Issue on Distributed Sensing, 3(4):360–371, October 2006.

The distributed algorithm with anchor nodes is based on the paper:

Pratik Biswas and Yinyu Ye. A distributed method for solving semidefinite programs arising from ad hoc wireless sensor network localization. Technical report, Dept. of Management Science and Engineering, Stanford University, October 2003; appeared in Multiscale Optimization Methods and Applications, Springer.

Chapter 4 applies the SDP relaxation to molecule structure prediction. The challenges here are that the molecules are very large, consisting of thousands of atoms, and that there are no nodes whose positions are previously known. Therefore, a distributed algorithm, which solves smaller clusters of points and stitches them together using common points, is introduced. This material is based on the paper:

Pratik Biswas, Kim-Chuan Toh, and Yinyu Ye. A distributed SDP approach for large-scale noisy anchor-free 3-D graph realization (with applications to molecular conformation). Technical report, Dept. of Management Science and Engineering, Stanford University; submitted to SIAM Journal on Scientific Computing, February 2007.

Finally, Chapter 5 deals with localization using angle information. Formulations that maintain the Euclidean distance geometry flavor of the problem are developed, and issues relating to computational efficiency are resolved. This material is based on the papers:

Pratik Biswas, Hamid Aghajan, and Yinyu Ye. Semidefinite programming algorithms for sensor network localization using angle of arrival information. To appear in Proceedings of the Thirty-Ninth Annual Asilomar Conference on Signals, Systems, and Computers, October 2005,

and

Pratik Biswas, Hamid Aghajan, and Yinyu Ye. Integration of angle of arrival information for multimodal sensor network localization using semidefinite programming. Technical report, Wireless Sensor Networks Lab, Stanford University, May 2005.

The thesis states its conclusions and lists directions for future research in Chapter 6.
Chapter 2

SDP relaxation of the distance geometry problem

2.1 The distance geometry problem
We now describe the distance geometry problem in its most basic form, with exact distance information. Consider a set of points in R^d with m anchors and n unknown points. The points whose locations are yet to be determined are denoted by x_i \in R^d, i = 1, ..., n. An anchor is a point whose location a_k \in R^d, k = n+1, ..., n+m, is already known. For a pair of unknown points x_i and x_j, the exact Euclidean distance between them is denoted \hat{d}_{ij}. Similarly, for an unknown point x_i and an anchor a_k, the exact Euclidean distance between them is denoted \hat{d}_{ik}. In general, not all pairs of unknown/unknown and unknown/anchor distances are known, so the pairs of unknown/unknown points and unknown/anchor points for which distances are known are denoted as (i,j) \in N_u and (i,k) \in N_a, respectively.

Mathematically, the distance geometry problem in R^d can be stated as follows: given m anchor locations a_k \in R^d, k = n+1, ..., n+m, and some distance measurements \hat{d}_{ij}, (i,j) \in N_u \cup N_a, find the locations of the n unknown points x_i \in R^d, i = 1, ..., n.

We can also look at this as a graph realization problem. Let V := {1, ..., n} and A := {a_{n+1}, ..., a_{n+m}}. The realization problem in R^d for the graph (V, A; \hat{D}) is to determine the coordinates of the unknown points x_1, ..., x_n from the partial distance data \hat{D} = { \hat{d}_{ij} : (i,j) \in N_u \cup N_a }. Since the distance information is exact, the realization x_1, ..., x_n must satisfy

\|x_i - x_j\|^2 = (\hat{d}_{ij})^2, \forall (i,j) \in N_u,
\|x_i - a_k\|^2 = (\hat{d}_{ik})^2, \forall (i,k) \in N_a.    (2.1)
In general, the problem of determining a valid realization is a non-convex optimization problem. One closely related approach is described in [29], wherein the proximity constraints between points that are within a 'cutoff distance' of each other are modeled as convex constraints, and a feasibility problem is then solved by efficient convex programming techniques. Suppose two nodes x_1 and x_2 are within a distance R of each other; the proximity constraint can be represented as a convex second-order cone constraint of the form \|x_1 - x_2\| \le R. This can be formulated as a matrix linear inequality [15]:

\begin{pmatrix} R I_2 & x_1 - x_2 \\ (x_1 - x_2)^T & R \end{pmatrix} \succeq 0.    (2.2)

Alternatively, if the exact distance \hat{d}_{12} \le R is known, we could set the constraint \|x_1 - x_2\| \le \hat{d}_{12}. The second-order cone method for solving Euclidean metric problems can also be found in Xue and Ye [100], where its superior polynomial complexity is presented. However, this technique yields good results only if the anchors are placed on the outer boundary, since the estimated positions in this convex optimization model all lie within the convex hull of the anchors. So if the anchors are placed in the interior of the network, the position estimates of the unknown points will also tend to the interior, yielding highly inaccurate results.

One may also ask why, if \hat{d}_{12} is known, another 'bounding away' constraint of the type \|x_1 - x_2\| \ge \hat{d}_{12} cannot be added. These constraints would make the problem much tighter and would yield more accurate results. The issue is that this type of constraint is not convex. Therefore,
the efficient convex optimization techniques cannot be applied. In the next section, we introduce a relaxation that is able to incorporate tighter convex constraints.
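The Schur-complement reformulation in (2.2) can be checked numerically. The sketch below (our illustration, not code from [29] or [15]) builds the 3x3 matrix of the LMI for two sample points in R^2 and tests positive semidefiniteness via its smallest eigenvalue; the point coordinates and radii are arbitrary choices.

```python
import numpy as np

def proximity_lmi(x1, x2, R):
    """Matrix of the LMI [[R*I_2, x1-x2], [(x1-x2)^T, R]] encoding ||x1-x2|| <= R."""
    diff = x1 - x2
    M = np.zeros((3, 3))
    M[:2, :2] = R * np.eye(2)
    M[:2, 2] = diff
    M[2, :2] = diff
    M[2, 2] = R
    return M

def is_psd(M, tol=1e-9):
    # A symmetric matrix is PSD iff its smallest eigenvalue is nonnegative.
    return np.linalg.eigvalsh(M).min() >= -tol

x1 = np.array([0.1, 0.2])
x2 = np.array([0.4, 0.6])                          # ||x1 - x2|| = 0.5
within_range = is_psd(proximity_lmi(x1, x2, 0.6))  # R >= ||x1 - x2||
out_of_range = is_psd(proximity_lmi(x1, x2, 0.4))  # R <  ||x1 - x2||
```

By the Schur complement, the matrix is PSD exactly when R >= \|x_1 - x_2\|, so `within_range` is true and `out_of_range` is false.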
2.2 Semidefinite programming relaxation
We will use the following notation. For two symmetric matrices A and B, A \succeq B means A - B \succeq 0, i.e., A - B is a positive semidefinite matrix. The inner product between two m x n matrices A and B, defined as

\sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} B_{ij},

is denoted by A \bullet B. We use I_d and 0 to denote the d x d identity matrix and the vector of all zeros, whose dimensions will be clear from the context. The 2-norm of a vector x is denoted \|x\|. We use the notation [A ; B] to denote the matrix obtained by appending B to the last row of A.
Let X = [x_1 x_2 ... x_n] \in R^{d \times n} be the position matrix that needs to be determined. It is readily shown that

\|x_i - x_j\|^2 = e_{ij}^T X^T X e_{ij},
\|x_i - a_k\|^2 = (e_i; -a_k)^T [X \; I_d]^T [X \; I_d] (e_i; -a_k),

where e_i is the i-th unit vector in R^n and e_{ij} = e_i - e_j. Thus, the equations in (2.1) can be represented in matrix form as finding a symmetric matrix Y \in R^{n \times n} and a matrix X \in R^{d \times n} such that

e_{ij}^T Y e_{ij} = (\hat{d}_{ij})^2, \forall (i,j) \in N_u,
(e_i; -a_k)^T \begin{pmatrix} Y & X^T \\ X & I_d \end{pmatrix} (e_i; -a_k) = (\hat{d}_{ik})^2, \forall (i,k) \in N_a,
Y = X^T X.    (2.3)
Our method is to relax problem (2.3) to a semidefinite program by relaxing the constraint Y = X^T X to Y \succeq X^T X. From [15], we know that the last matrix inequality is equivalent to the condition that

Z = \begin{pmatrix} Y & X^T \\ X & I_d \end{pmatrix} \succeq 0.    (2.4)

Let K = \{ Z \in R^{(n+d) \times (n+d)} : Z = \begin{pmatrix} Y & X^T \\ X & I_d \end{pmatrix} \succeq 0 \}. The SDP relaxation can now be written as the following feasibility problem:

Find        Z
subject to  (e_{ij}; 0)(e_{ij}; 0)^T \bullet Z = (\hat{d}_{ij})^2, \forall (i,j) \in N_u,
            (e_i; -a_k)(e_i; -a_k)^T \bullet Z = (\hat{d}_{ik})^2, \forall (i,k) \in N_a,
            Z_{(n+1:n+d, n+1:n+d)} = I_d,
            Z \succeq 0.    (2.5)

Note that, according to the context, 0 is used to represent the zero vector or matrix of appropriate dimension.
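A quick numerical check (ours, not from the thesis) of the lifting behind (2.5): if Z is assembled from true positions with Y = X^T X, then the linear functionals in (2.5) reproduce the squared distances exactly. The random positions and the single anchor below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 2, 4
X = rng.random((d, n))                 # true positions of the unknown points
a = rng.random(d)                      # one anchor position

# Lifted matrix Z = [Y X^T; X I_d] with Y = X^T X
Z = np.block([[X.T @ X, X.T], [X, np.eye(d)]])

def unit(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

# (e_ij; 0)(e_ij; 0)^T . Z  for the pair (0, 1)
u = np.concatenate([unit(0, n) - unit(1, n), np.zeros(d)])
uu_value = u @ Z @ u                   # equals ||x_0 - x_1||^2

# (e_i; -a)(e_i; -a)^T . Z  for point 0 and the anchor
v = np.concatenate([unit(0, n), -a])
ua_value = v @ Z @ v                   # equals ||x_0 - a||^2
```

This is exactly why the relaxation is tight whenever the PSD constraint forces Y = X^T X.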
2.3 SDP model analyses
In this section, we probe deeper into the SDP model to answer questions like: When will the solution be unique? Is the SDP relaxation exact? What is the computational complexity of the algorithm, and how does it improve over the weaker relaxation?
2.3.1 Uniqueness
The matrix Z of the SDP (2.5) has nd + n(n+1)/2 unknown variables. Consider the case that, among the pairs {i, j, k}, there are nd + n(n+1)/2 pairs in N_u and N_a. Then we have at least nd + n(n+1)/2 equality constraints. Moreover, if these equalities are linearly independent, then Z has a unique solution.

Proposition 2.1. If there are nd + n(n+1)/2 distance pairs, each of which has an accurate distance measure, and the SDP (2.5) has a unique feasible solution

\bar{Z} = \begin{pmatrix} \bar{Y} & \bar{X}^T \\ \bar{X} & I_d \end{pmatrix},

then we must have \bar{Y} = \bar{X}^T \bar{X} and \bar{X} equal to the true positions of the unknown sensors, i.e., the SDP relaxation solves the original problem exactly.
Proof. Let X^* be the true locations of the n points, and

Z^* = \begin{pmatrix} (X^*)^T X^* & (X^*)^T \\ X^* & I_d \end{pmatrix}.

Then Z^* is a feasible solution for (2.5). On the other hand, since \bar{Z} is the unique solution satisfying the nd + n(n+1)/2 equalities, we must have \bar{Z} = Z^*, so that \bar{Y} = (X^*)^T X^* = \bar{X}^T \bar{X}.
We present a simple case to show what it means for the system to have a unique solution. Consider d = 2, n = 1 and m = 3. The accurate distance measures from the unknown point b_1 to the known anchors a_1, a_2 and a_3 are d_{11}, d_{21} and d_{31}, respectively. Therefore, the three linear equations are

y - 2x^T a_1 = (d_{11})^2 - \|a_1\|^2,
y - 2x^T a_2 = (d_{21})^2 - \|a_2\|^2,
y - 2x^T a_3 = (d_{31})^2 - \|a_3\|^2.

This system has a unique solution if it has a solution and the matrix

\begin{pmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \end{pmatrix}

is nonsingular. This essentially means that the three points a_1, a_2 and a_3 are not on the same line, and then \bar{x} = b_1 can be uniquely determined. Here, the SDP method reduces to the so-called triangular method. Proposition 2.1 and the example show that the SDP relaxation method has the advantage of the triangular method in solving the original problem.
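The three-anchor example can be worked through numerically. The sketch below (the anchor coordinates are our arbitrary, non-collinear choice) solves the 3x3 linear system in the unknowns (y, x) and recovers the unknown point exactly:

```python
import numpy as np

anchors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # not collinear
x_true = np.array([0.3, 0.4])
dists = np.linalg.norm(anchors - x_true, axis=1)          # exact measures

# Row k encodes  y - 2 a_k . x = (d_k)^2 - ||a_k||^2  in the unknowns (y, x1, x2).
A = np.column_stack([np.ones(3), -2 * anchors])
b = dists**2 - np.sum(anchors**2, axis=1)
y_est, x1_est, x2_est = np.linalg.solve(A, b)
```

Since the anchors are not collinear, the 3x3 matrix is nonsingular, the recovered position matches the true point, and y_est = \|x\|^2 as the SDP variable y requires.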
2.3.2 Analysis of dual and exactness of relaxation
The dual of the SDP relaxation is as follows:

Minimize    I_d \bullet V + \sum_{(i,j) \in N_u} y_{ij} (\hat{d}_{ij})^2 + \sum_{(i,k) \in N_a} w_{ik} (\hat{d}_{ik})^2
subject to  \begin{pmatrix} 0 & 0 \\ 0 & V \end{pmatrix} + \sum_{(i,j) \in N_u} y_{ij} (e_{ij}; 0)(e_{ij}; 0)^T + \sum_{(i,k) \in N_a} w_{ik} (e_i; -a_k)(e_i; -a_k)^T \succeq 0.    (2.6)

Let U be the (n+d)-dimensional dual slack matrix, i.e.,

U = \begin{pmatrix} 0 & 0 \\ 0 & V \end{pmatrix} + \sum_{(i,j) \in N_u} y_{ij} (e_{ij}; 0)(e_{ij}; 0)^T + \sum_{(i,k) \in N_a} w_{ik} (e_i; -a_k)(e_i; -a_k)^T.

We will employ duality theory to investigate when the SDP (2.5) is an exact relaxation of problem (2.1). First, observe that the dual problem (2.6) is always feasible, since V = 0, y_{ij} = 0 \forall (i,j) \in N_u, w_{ik} = 0 \forall (i,k) \in N_a is a feasible solution. Now suppose that the primal problem (2.5) is also feasible. This will actually be the case for the exact distance values considered so far. Then, from duality theory for SDP [2], we can conclude that there is no duality gap and the optimal value of the dual must be 0. Furthermore, we can state the following theorem.

Theorem 2.1. Let \bar{Z} be a feasible solution to the primal (2.5), and \bar{U} an optimal slack matrix for the dual (2.6). Then

1. Complementary slackness holds, i.e., \bar{Z} \bullet \bar{U} = 0, or equivalently \bar{Z}\bar{U} = 0.
2. Rank(\bar{Z}) + Rank(\bar{U}) \le d + n.
3. Rank(\bar{Z}) \ge d and Rank(\bar{U}) \le n.

This leads to the following corollary.

Corollary 2.1. If an optimal dual slack matrix has rank n, then every feasible solution of the primal (2.5) must have rank d. In this case, problems (2.1) and (2.5) are equivalent and the relaxation is exact.

We also state the following technical result, to be used in later proofs:
Proposition 2.2. If every point is connected, directly or indirectly, to an anchor in problem (2.1), then any solution to problem (2.5) must be bounded.

Proof. If point x_i is connected to an anchor a_k, then

\|x_i\|^2 + \|a_k\|^2 - 2a_k^T x_i \le Y_{ii} + \|a_k\|^2 - 2a_k^T x_i = (\hat{d}_{ik})^2,

which implies that Y_{ii} \le (\hat{d}_{ik})^2 - \|a_k\|^2 + 2\|a_k\|\|x_i\|. Since \|x_i\| is bounded by the triangle inequality, Y_{ii} is bounded too. Now if x_j is connected to x_i and Y_{ii} is bounded, then

Y_{ii} + Y_{jj} - 2\sqrt{Y_{ii} Y_{jj}} \le Y_{ii} + Y_{jj} - 2Y_{ij} = (\hat{d}_{ij})^2.

Again, by a simple application of the triangle inequality, we conclude that Y_{jj} is also bounded.

The discussion about rank is particularly important because the interior-point algorithms used to solve the SDP compute max-rank solutions for both the primal and the dual [36]. A primal (dual) max-rank solution is one that has the highest rank among all solutions for the primal (dual). Keeping this in mind, we move to the following definition:

Definition 2.1. The problem (2.1) is said to be uniquely localizable if there is a unique realization \bar{X} \in R^{d \times n} and there is no x_i \in R^h, i = 1, ..., n, where h > d, such that

\|x_i - x_j\|^2 = (\hat{d}_{ij})^2, \forall (i,j) \in N_u,
\|x_i - (a_k; 0)\|^2 = (\hat{d}_{ik})^2, \forall (i,k) \in N_a,
x_i \ne (\bar{x}_i; 0) for some i \in \{1, ..., n\}.

This definition basically says that there should exist no non-trivial realization in a higher-dimensional space (the trivial realization corresponds to setting x_i = (\bar{x}_i; 0) for i = 1, ..., n). The following theorem links the idea of unique localizability to the SDP relaxation.

Theorem 2.2. Suppose that there are enough distance measures between the network of points that the underlying graph is connected. Then, the following statements are equivalent:
1. The problem is uniquely localizable.
2. The max-rank solution matrix of the corresponding SDP (2.5) has rank d.
3. The solution matrix of the SDP (2.5) satisfies Y = X^T X.

Proof. The equivalence between (2) and (3) is straightforward. In proving that (2) implies (1), we note that any rank-d solution of (2.5) is also a solution to (2.1). However, we need to prove that if the max-rank solution has rank d, it must be unique. If not, suppose that there exist two rank-d feasible solutions,

Z_1 = \begin{pmatrix} X_1^T X_1 & X_1^T \\ X_1 & I_d \end{pmatrix}  and  Z_2 = \begin{pmatrix} X_2^T X_2 & X_2^T \\ X_2 & I_d \end{pmatrix}.

Then the matrix Z = \alpha Z_1 + \beta Z_2, where \alpha + \beta = 1, \alpha, \beta > 0, is also a feasible rank-d solution, since all feasible solutions must have rank at least d, but the max-rank is assumed to be d. Therefore,

Z = \begin{pmatrix} \alpha X_1^T X_1 + \beta X_2^T X_2 & \alpha X_1^T + \beta X_2^T \\ \alpha X_1 + \beta X_2 & I_d \end{pmatrix} = \begin{pmatrix} B^T B & B^T \\ B & I_d \end{pmatrix},

where B = \alpha X_1 + \beta X_2. This implies that (X_1 - X_2)^T (X_1 - X_2) = 0, or \|X_1 - X_2\| = 0, or Z_1 = Z_2, which is a contradiction.

To prove the opposite direction, we must show that the max-rank solution of (2.5) has rank d. Suppose there is a feasible solution Z with rank higher than d. Then we must have Y \succeq X^T X with Y \ne X^T X. In fact, we can write Y = X^T X + (X')^T X', where X' = [x'_1, ..., x'_n] \in R^{r \times n} and r is the rank of Y - X^T X. Now consider the points

\tilde{x}_i = \begin{pmatrix} x_i \\ x'_i \end{pmatrix}, i = 1, ..., n.

We have \|\tilde{x}_i\|^2 = Y_{ii} and \tilde{x}_i^T \tilde{x}_j = Y_{ij} for all i, j. We also know from Proposition 2.2 that Y_{ii} and Y_{ij} are bounded for all i, j. Therefore,

\|\tilde{x}_i - \tilde{x}_j\|^2 = (\hat{d}_{ij})^2, \forall (i,j) \in N_u,
\|\tilde{x}_i - (a_k; 0)\|^2 = (\hat{d}_{ik})^2, \forall (i,k) \in N_a,
which is a contradiction of (1). Therefore, (1) implies (2).

The importance of this theorem is that it states that if the problem is uniquely localizable, then the relaxation (2.5) solves problem (2.1) exactly. Telling whether a problem instance is uniquely localizable before solving it is not easy. But once we solve the SDP relaxation and observe whether Y = X^T X in the solution, we immediately know whether the problem is uniquely localizable. In fact, in general, the problem of computing a realization of points is NP-complete even when the realization is unique. But this shows that there exists a family of graphs for which a realization can be computed in polynomial time. It should be kept in mind, however, that the above analyses assume that the distance measures are exact. We will deal with noisy data in the following chapter.

We now extend this analysis to introduce a new notion called strong localizability.

Definition 2.2. We say that a problem instance is strongly localizable if the dual of its SDP relaxation (2.6) has an optimal dual slack matrix with rank n.

Note that if a network is strongly localizable, then it is uniquely localizable by Theorems 2.1 and 2.2, since the rank of every feasible solution of the primal is d. Using this definition, we develop the following theorem:

Theorem 2.3. If a problem (graph) contains a subproblem (subgraph) that is strongly localizable, then the submatrix solution corresponding to the subproblem in the SDP solution has rank d, i.e., the SDP relaxation computes a solution that localizes all possibly localizable unknown points.
Proof. Let the subproblem have n_s unknown points, indexed 1, ..., n_s. Since it is strongly localizable, an optimal dual slack matrix U_s of the SDP relaxation for the subproblem has rank n_s. Then, in the dual problem of the SDP relaxation for the whole problem, we set V and the multipliers associated with the subproblem to the values giving the optimal slack matrix U_s, and set all other multipliers to 0. Then the slack matrix

U = \begin{pmatrix} 0 & 0 \\ 0 & U_s \end{pmatrix} \succeq 0

must be optimal for the dual of the (whole-problem) SDP relaxation, and it is complementary to any primal feasible solution of the (whole-problem) SDP relaxation

Z = \begin{pmatrix} * & * \\ * & Z_s \end{pmatrix} \succeq 0, where Z_s = \begin{pmatrix} Y_s & X_s^T \\ X_s & I_d \end{pmatrix}.

However, we have 0 = Z \bullet U = Z_s \bullet U_s with Z_s, U_s \succeq 0. Since the rank of U_s is n_s, the rank of Z_s is exactly d, i.e., Y_s = (X_s)^T X_s. So X_s is the unique realization of the subproblem.

The above theorems indicate that the SDP relaxation is able to estimate exactly all the localizable points. For those points that are not localizable, the SDP solution yields observable error measures. In particular, each individual trace

Y_{jj} - \|\bar{x}_j\|^2    (2.7)

helps us to detect the errors in estimation and isolate exactly those points which are incorrectly estimated given the incomplete distance information.
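The individual trace measure (2.7) can be illustrated synthetically (our construction, not thesis code): build Y = X^T X + (X')^T X' where only one point has a nonzero component X' in an extra dimension, mimicking a point that is not localizable in the plane. Its trace error is exactly \|x'_j\|^2, while fully localized points have trace error zero.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 2, 5
X = rng.random((d, n))                 # planar position estimates
Xp = np.zeros((1, n))
Xp[0, 3] = 0.7                         # only point 3 "escapes" into a 3rd dimension

Y = X.T @ X + Xp.T @ Xp                # higher-rank SDP-style solution
trace_err = np.diag(Y) - np.sum(X**2, axis=0)   # Y_jj - ||x_j||^2 for each point
```

Here `trace_err` is zero everywhere except at point 3, where it equals 0.7^2 = 0.49, so the measure singles out exactly the badly estimated point.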
2.3.3 Computational complexity
A worst-case complexity result for solving the SDP relaxation can be derived from employing interior-point algorithms, e.g., [7].

Theorem 2.4. Let k = 3 + |N_u| + |N_a| be the number of constraints. Then the worst-case number of total arithmetic operations to compute an \epsilon-solution of (2.5), meaning a solution whose objective value is at most \epsilon (> 0) above the minimal one, is bounded by O(\sqrt{n+k} (n^3 + n^2 k + k^3) \log(1/\epsilon)), in which \sqrt{n+k} \log(1/\epsilon) represents the bound on the worst-case number of interior-point algorithm iterations.

Practically, the number of interior-point iterations needed to compute a fairly accurate solution is essentially constant, 20-30, for semidefinite programming, and k is bounded by O(n^2). Thus, the worst-case complexity is bounded by O(n^6). However, in practice, as the number of points increases, the required radio range and the number of constraints needed to solve for all positions scale more typically as O(n). Therefore, the number of operations is typically bounded by O(n^3) in solving a localization problem with n sensors.
2.3.4 Examples
For the purpose of our examples, we generated random networks of points uniformly distributed in a square area of size 1 x 1. Distances were computed between points that are within a radius R of each other. (The radius indicates that the distance values between any two nodes are known to the solver if they are below the range; otherwise they are unknown.) The original and the estimated point positions were plotted. The (blue) diamond nodes refer to the positions of the anchors; the (black) circle nodes to the original locations of the unknown points; and the (blue) asterisk nodes to their estimated positions from \bar{X}. The discrepancies in the positions can be estimated by the offsets between the original and the estimated points, as indicated by the solid lines. Unless stated otherwise, the same setup and notation will be used in all examples of this thesis.

Example 2.1. A network of 50 points and 3 anchors was considered. The effect of varying R, and as a result connectivity, is observed in Figure 2.1, where R was varied from 0.2 to 0.35. For Figures 2.1(a) and 2.1(b), there are some points for which there is not enough distance information, and therefore they have larger errors in estimation. Their individual trace errors (2.7) are also high, and they accurately correlate with the real estimation errors. See Figure 2.2 for the correlation between the individual estimation error and trace error for each unknown sensor for the cases in Figures 2.1(a) and 2.1(b). The (black) diamonds indicate the (normalized) actual estimation error and the (blue) squares indicate the (normalized) trace error. In comparison, for the same case as Figure 2.1, we computed the results from the method of Doherty et al. [29], but with more points as anchors (10 and 25). Figure 2.3 shows the estimation results. As expected, the estimated positions all lie within the convex hull of the anchors.
Figure 2.1: Position estimations with 3 anchors, accurate distance measures and varying R. Panels: (a) R=0.2, (b) R=0.25, (c) R=0.3, (d) R=0.35.
Figure 2.2: (a) Error and trace correlation in Figure 2.1(a); (b) error and trace correlation in Figure 2.1(b). Diamond: the offset distance between estimated and true positions; Box: the square root of the individual trace Y_{jj} - \|\bar{x}_j\|^2.
Figure 2.3: Position estimations by Doherty et al., R=0.3, accurate distance measures and various numbers of anchors. Panels: (a) 10 anchors, (b) 25 anchors.
Chapter 3

Dealing with noise: The sensor network localization problem

3.1 Introduction
The sensor network localization problem is: assuming the accurate positions of some nodes (called anchors) are known, how do we use them, together with partial pair-wise distance or angle measurements, to locate the positions of all sensor nodes in the network? In this chapter, we deal only with distance measurements. The difficulty of this problem arises from two factors. First, the distance measurements always have some noise or uncertainty. Depending on the accuracy of these measurements and on the processor, power and memory constraints at each of the nodes, there is some degree of error in the distance information. Since the effect of the measurement uncertainty usually also depends on the geometrical relationship between sensors and is not known a priori, an unbiased model is not easy to build. Second, even if the distance measurements are perfectly accurate, a sufficient condition for the sensor network to be localizable cannot be identified easily; see [31, 43].

In this chapter, we extend the general graph realization problem and SDP model described in Chapter 2 to formulate the sensor network localization problem with noisy measurements, and derive upper and lower bounds on the SDP objective function. The SDP objective function gives us an idea of the accuracy of the technique and also provides a 'proof of quality' against which we can measure any estimation. In the SDP approach, noisy distance measurements typically yield higher dimensional solutions, and the projection of these solutions into R^2 generally does not produce satisfactory results. We have developed two new techniques to obtain better estimation accuracy. One approach is to add a regularization term to the objective function to force the SDP solution to lie close to a lower dimensional space, by penalizing folding between the points and maximizing the spacing between them. The other approach is to improve the SDP estimated solution using a local refinement method (such as gradient descent). One should note that these methods generally do not deliver a globally optimal solution when the problem is non-convex. However, our numerical results show that the SDP estimated solution generally gives a good starting point from which the local refinement method converges to the global optimum. It is demonstrated that the improved solution is very close to optimal by comparing the minimal objective value to the SDP lower bound.
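The post-processing idea can be sketched in a few lines. This toy example is entirely ours: synthetic anchors, exact distances, a squared-residual objective chosen for differentiability (not the thesis's exact formulation), and a starting point taken as a perturbation of the truth where the real pipeline would use the SDP estimate. Plain fixed-step gradient descent then converges to the true positions:

```python
import numpy as np

rng = np.random.default_rng(2)
anchors = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
X_true = np.array([[0.4, 0.3], [0.6, 0.7]])               # two unknown sensors
d_ua = np.linalg.norm(X_true[:, None, :] - anchors[None, :, :], axis=2)
d_uu = np.linalg.norm(X_true[0] - X_true[1])

def grad(X):
    """Gradient of f(X) = sum over measured pairs of (||x_i - p||^2 - d^2)^2."""
    g = np.zeros_like(X)
    for i in range(2):
        for k in range(3):
            diff = X[i] - anchors[k]
            g[i] += 4 * (diff @ diff - d_ua[i, k]**2) * diff
    diff = X[0] - X[1]
    shared = 4 * (diff @ diff - d_uu**2) * diff
    g[0] += shared
    g[1] -= shared
    return g

X = X_true + 0.1 * rng.standard_normal(X_true.shape)      # rough initial estimate
for _ in range(5000):
    X -= 0.02 * grad(X)                                   # fixed-step descent
```

With a poor starting point this descent could stall in a local minimum; the whole point of the two-stage scheme is that the SDP solution lands close enough to the global optimum for the refinement to finish the job.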
Unfortunately, existing SDP solvers scale very poorly, and the resulting SDP is often intractable for large networks with hundreds or thousands of points. An iterative distributed SDP computation scheme, first demonstrated in [13], has been developed to overcome this difficulty. Our distributed approach is designed to exploit the advantages of both the centralized and distributed paradigms in localization, by not being completely distributed and by using an optimal amount of global information.
Section 3.2 discusses previous work in this domain, particularly at the intersection of sensor network localization and distance geometry. Section 3.3 describes the problem setup and a variety of formulations for the sensor network localization problem, using the SDP relaxation model. In Section 3.4, we derive upper and lower bounds for the SDP objective function. The estimation accuracy of SDP based algorithms for noisy distance data is also discussed. Section 3.5 introduces the use of regularization terms in the SDP objective function in order to obtain lower dimensional solutions. Section 3.6 deals with the use of post processing local refinement methods using the SDP estimated solution as a starting point. In Section 3.7, we present the iterative distributed SDP method for solving localization problems that arise from networks consisting of a large number of sensors. Finally, experimental results that show that these methods significantly improve the performance of SDP based localization algorithms are presented in Section 3.8.
3.2 Related work
A great deal of research has been done on the topic of position estimation in ad-hoc networks; see [40, 72] for surveys. Most techniques use distance or angle measurements from a fixed set of reference or anchor nodes [29, 61, 69, 70, 71, 76], or employ a grid of beacon nodes with known positions [19, 41].

Niculescu and Nath [61] proposed the "DV-Hop" and related "DV-Distance" and Euclidean approaches, which are quite effective in dense and regular topologies. The anchor nodes flood their position information to all the nodes in the network, and each node then estimates its own position by performing a triangulation using this information. For more irregular topologies, however, the accuracy can deteriorate to the radio range. The authors also investigate the use of angle information ("DV-Bearing") for localization [62] using the information forwarding techniques described above. Savarese et al. [69] present a two-phase algorithm in which the start-up phase involves finding the rough positions of the nodes using a technique similar to the "DV-Hop" approach; the refinement phase then improves the accuracy of the estimated positions by having each node perform least squares triangulations using its own estimates and those of the nodes in its neighborhood. This method can estimate points to within one third of the radio range. When the number of anchor nodes is high, the "iterative multilateration" technique proposed by Savvides et al. [70] yields good results: nodes that are connected to 3 or more anchors compute their position by triangulation and upgrade themselves to anchor nodes, so that their position information can also be used by the other unknown nodes in the next iteration. Howard et al. [41] and Priyantha et al. [67] have discussed the use of spring-based relaxations that initially try to find a graph embedding resembling the actual configuration and then modify the embedding to approach the actual one, using a mass-spring based optimization to correct and balance errors.

While the above methods can be run in a distributed fashion, there also exist centralized methods that offer more precise location determination by using global information. For centralized techniques, the available distance information between all the nodes must be present on a single computer. The distributed approach has the advantage that the techniques can be executed on the sensor nodes themselves, removing the need to relay
all the information to a central computer. This choice also affects the scalability and accuracy of the problems under consideration: distributed techniques can handle much larger networks, whereas centralized approaches yield higher accuracy by using global information.

Shang et al. [77] demonstrate the use of multidimensional scaling (MDS) in estimating positions of unknown nodes. First, using basic connectivity or distance information, a rough estimate of relative node distances is made. Then classical MDS (which basically involves a singular value decomposition) is used to obtain a relative map of the node positions. Finally, an absolute map is obtained by using the known node positions. This technique works well with few anchors and reasonably high connectivity; for instance, for a connectivity level of 12 and 2% anchors, the error is about half of the radio range. The centralized version of this technique does not work well when the network topology is irregular, so an alternative distributed MDS and patching technique is explored in [75, 76], in which local clusters with regular topologies are solved separately and then stitched together.

Some other methods are based on minimizing global error functions, which can differ as the model of uncertainty changes. Depending on the kind of optimization model being formulated, the characteristics and computational complexity vary. For example, the maximum likelihood estimation for the sensor network localization problem is a non-convex optimization problem, and existing algorithms have limited success even for small problem sizes; see [59, 58]. On the other hand, if we relax the distance constraints, the problem can be formulated as a second-order cone program (SOCP), and cheap and scalable algorithms can be applied [29, 89]. However, the solution is acceptable only when the anchors are densely deployed on the boundary of the sensor network.

The use of convex optimization for sensor localization was first described by Doherty et al. [29]. However, as mentioned in the previous chapter, this technique yields good results only if the anchor nodes are placed on the outer boundary, since the estimated positions in their convex optimization model all lie within the convex hull of the anchor nodes. If the anchor nodes are placed in the interior of the network, the position estimates of the unknown nodes will also tend to the interior, yielding highly inaccurate results. For example, with just 5 anchors in a random 200-node network, the estimation error is almost twice the radio range.
3.3 The SDP model with noise
Consider a sensor network in R^2 with m anchors and n sensors. An anchor is a node whose location a_k \in R^2, k = n+1, ..., n+m, is known, and a sensor is a node whose location x_i \in R^2, i = 1, ..., n, is yet to be determined. In general, not all pairs of sensor/sensor and sensor/anchor distances can be measured, so the sets of sensor/sensor and sensor/anchor pairs for which distances are known are denoted as (i,j) \in N_u and (i,k) \in N_a, respectively. For a pair of sensors (i,j) \in N_u, the distance estimate is denoted d_{ij}. Similarly, for a sensor/anchor pair (i,k) \in N_a, the measured distance is denoted d_{ik}.

Mathematically, the localization problem can be stated as follows: given m anchor locations a_k \in R^2, k = n+1, ..., n+m, and some inaccurate distance measurements d_{ij}, (i,j) \in N_u, and d_{ik}, (i,k) \in N_a, estimate as accurately as possible the locations of the n sensors x_i \in R^2, i = 1, ..., n. One should note that this problem is in fact a special case of the general distance geometry and graph realization problems in R^d with d = 2 (as discussed in Chapter 2), although in this case we will have to deal with noisy distance data. We will discuss the general noisy graph realization problem in R^d and illustrate its applicability to the sensor network localization problem.
3.3.1 Error minimization
A reasonable solution for the noisy graph realization problem should be to try to minimize the discrepancy in the estimated point locations to fit the given distance data. We will illustrate how the SDP relaxation can be extended to the following error minimization model: Minimizex1 ,...,xn ∈Rd
X
(i,j)∈Nu
γij kxi − xj k2 − d2ij +
X
(i,k)∈Na
γik kxi − ak k2 − d2ik ,
(3.1)
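As a concrete illustration of the objective in (3.1), here is a minimal NumPy sketch (function and variable names are ours, not from the thesis) that evaluates the weighted absolute error at candidate positions:

```python
import numpy as np

def localization_error(X, anchors, Nu, Na, d_u, d_a, gamma=1.0):
    """Evaluate the weighted absolute-error objective of (3.1) at candidate
    positions X (n x d); d_u[(i, j)] and d_a[(i, k)] are measured distances.
    A single scalar weight gamma stands in for the per-pair weights."""
    err = 0.0
    for (i, j) in Nu:
        err += gamma * abs(np.sum((X[i] - X[j]) ** 2) - d_u[(i, j)] ** 2)
    for (i, k) in Na:
        err += gamma * abs(np.sum((X[i] - anchors[k]) ** 2) - d_a[(i, k)] ** 2)
    return err
```

With exact distances the objective is zero at the true configuration; any perturbation of the measurements makes it positive.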
where γij > 0 are given weights. By adding different weights to the distance constraints, we are essentially giving extra emphasis to those distance measurements for which we have higher confidence and vice versa. This would be particularly useful in cases where the distance measures are noisy but there is some prior information about which of those are more reliable. In such cases, we can assign higher weights to the distance constraints corresponding to the more reliable measures. It should be borne in mind that the SDP approach is extensible to a larger class of
CHAPTER 3. SENSOR NETWORK LOCALIZATION
distance measures, formulations, and cost functions. Different types of cost functions have been described in the weighted multidimensional scaling literature; see [26]. In fact, weighted multidimensional scaling has been applied to the sensor network localization problem in [25], and the cost function described above is a special case of the general cost function described therein. The key lies in how to use SDP relaxations to solve this general class of (nonconvex) problems involving Euclidean distances. This particular cost minimization problem also admits an SDP relaxation, which we will use to illustrate our general relaxation technique. The effect of the choice of cost function and weights is discussed in more detail in the next subsection.

For convenience, we let $g_{ij} = [e_i; -a_j]$ for $(i,j) \in N_a$ and $g_{ij} = [e_{ij}; 0]$ for $(i,j) \in N_u$, where $e_i$ denotes the $i$-th unit vector in $\mathbb{R}^n$ and $e_{ij} = e_i - e_j$. We further let $E = N_u \cup N_a$. Then (3.1) can be rewritten in matrix form as:
\[
\begin{array}{ll}
\min & \displaystyle\sum_{(i,j)\in E} \gamma_{ij} \left|\, g_{ij}^T \begin{pmatrix} Y & X^T \\ X & I_d \end{pmatrix} g_{ij} - d_{ij}^2 \,\right| \\[2mm]
\text{s.t.} & Y = X^T X.
\end{array} \tag{3.2}
\]
Once again, problem (3.2) is a non-convex optimization problem, so our method is to relax it to a semidefinite program by relaxing the constraint $Y = X^T X$ to $Y \succeq X^T X$. From [15], we know that the latter matrix inequality is equivalent to the condition
\[
Z = \begin{pmatrix} Y & X^T \\ X & I_d \end{pmatrix} \succeq 0.
\]
Let $K$ be the set of matrices $Z \in \mathbb{R}^{(n+d)\times(n+d)}$ of this form with $Z \succeq 0$. Then the SDP relaxation of problem (3.2) can be written as the following SDP:
\[
\begin{array}{ll}
\min & g(Z; D) := \displaystyle\sum_{(i,j)\in E} \gamma_{ij}\,\bigl|\, g_{ij}^T Z g_{ij} - d_{ij}^2 \,\bigr| \\[2mm]
\text{s.t.} & Z \in K.
\end{array} \tag{3.3}
\]
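The equivalence between $Y \succeq X^T X$ and positive semidefiniteness of the block matrix $Z$ is the Schur complement lemma cited from [15]; a quick numerical sanity check in NumPy (illustrative only, not part of the algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 6, 2
X = rng.standard_normal((d, n))
S = rng.standard_normal((n, n))
Y = X.T @ X + S @ S.T                 # Y - X^T X is PSD by construction

# Schur complement: Y >= X^T X  iff  Z = [[Y, X^T], [X, I_d]] >= 0.
Z = np.block([[Y, X.T], [X, np.eye(d)]])
assert np.linalg.eigvalsh(Z).min() > -1e-9

# Violating the inequality makes Z indefinite:
Z_bad = np.block([[X.T @ X - np.eye(n), X.T], [X, np.eye(d)]])
assert np.linalg.eigvalsh(Z_bad).min() < 0
```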
3.3.2 Choice of cost functions and weights
It is important to recognize that the choice of the cost/penalty function is crucial in providing the statistically ‘best’ estimate of the point positions. Chapter 7 of the book [16] provides a discussion of statistical estimation in the context of convex optimization. In a maximum likelihood framework, the cost function would be dictated by the prior noise distribution that is assumed for the given distance measurements. In other words, we try to find the point positions that minimize the log-likelihood function given the distances,
and the log-likelihood function is determined by the kind of noise model we assume. For example, in the formulation (3.1), the noise model being considered would be additive Laplacian noise on the squares of the distances:
\[
d_{ij}^2 = \|x_i - x_j\|^2 + \epsilon_{ij}, \qquad p(\epsilon_{ij}) \propto e^{-\gamma_{ij}|\epsilon_{ij}|}, \tag{3.4}
\]
where $p(z)$ denotes the probability density function of the random variable $z$.
The noise model used mostly in this chapter is multiplicative normal noise:
\[
d_{ij} = \|x_i - x_j\|\;|1 + \epsilon_{ij}|, \qquad \epsilon_{ij} \sim N(0, \sigma_{ij}^2). \tag{3.5}
\]
For $1 + \epsilon_{ij} \ge 0$, which is usually the case, the noise model can be expressed as additive Gaussian noise:
\[
d_{ij} = \|x_i - x_j\| + \epsilon_{ij}^m, \qquad \epsilon_{ij}^m \sim N\bigl(0,\, \|x_i - x_j\|^2 \sigma_{ij}^2\bigr). \tag{3.6}
\]
The MLE problem in this case would be
\[
\min_{x_1,\dots,x_n \in \mathbb{R}^d} \; \sum_{(i,j)\in N_u} \frac{1}{\|x_i - x_j\|^2 \sigma_{ij}^2}\bigl(\|x_i - x_j\| - d_{ij}\bigr)^2 \;+\; \sum_{(i,k)\in N_a} \frac{1}{\|x_i - a_k\|^2 \sigma_{ik}^2}\bigl(\|x_i - a_k\| - d_{ik}\bigr)^2. \tag{3.7}
\]
Since the exact distances (or even the noise variances) may not be known, suitable approximations of them may be used in choosing the weights. For example, in the above problem, we put weights $\gamma_{ij} = \frac{1}{\sigma_{ij}^2 d_{ij}^2}$ to approximate the actual weights applied to the different terms of the cost function.
In fact, by using these approximate weighting schemes, we can also develop convex approximations of log-likelihood functions for more sophisticated noise models. Consider
the case of a log-normal noise model, where
\[
d_{ij} = \|x_i - x_j\|\, e^{\epsilon_{ij}}, \qquad \epsilon_{ij} \sim N(0, \sigma_{ij}^2). \tag{3.8}
\]
The MLE problem corresponding to this noise model is
\[
\min_{x_1,\dots,x_n \in \mathbb{R}^d} \; \sum_{(i,j)\in N_u} \frac{1}{\sigma_{ij}^2}\bigl(\log\|x_i - x_j\| - \log d_{ij}\bigr)^2 \;+\; \sum_{(i,k)\in N_a} \frac{1}{\sigma_{ik}^2}\bigl(\log\|x_i - a_k\| - \log d_{ik}\bigr)^2. \tag{3.9}
\]
At first glance, this is again a complicated nonconvex function, but we can approximate it by using its first-order Taylor expansion. The term $\log\|x_i - x_j\| - \log d_{ij}$ can be expressed as $\log\bigl(1 + \frac{\|x_i - x_j\| - d_{ij}}{d_{ij}}\bigr)$, which is approximately $\frac{\|x_i - x_j\| - d_{ij}}{d_{ij}}$ when the error $\|x_i - x_j\| - d_{ij}$ is small relative to $d_{ij}$.

The condition should exclude the trivial case in which we set $\bar{x}_j = 0$ for $j = 1, \dots, n$. The d-localizability indicates that the distances cannot be embedded by a non-trivial realization in a higher-dimensional space, and cannot be 'flattened' to a lower-dimensional space either. We now develop the following theorem.

Theorem 4.1. If problem (4.4) is d-localizable, then the solution matrix $\bar{Y}$ of (4.5) is unique and its rank equals $d$. Furthermore, if $\bar{Y} = \bar{X}^T \bar{X}$, then $\bar{X} = (\bar{x}_1, \dots, \bar{x}_n) \in \mathbb{R}^{d\times n}$ is the unique minimum-norm localization of the graph with $\sum_{j=1}^n \bar{x}_j = 0$ (subject only to rotation and reflection).
Proof. Since problem (4.4) is d-localizable, by definition the only feasible solution matrices $Y$ of (4.5) have rank $d$. Thus, there is a rank-$d$ matrix $\bar{V} \in \mathbb{R}^{d\times n}$ such that any feasible solution matrix $Y$ can be written as $Y = \bar{V}^T P \bar{V}$, where $P$ is a $d\times d$ symmetric positive definite matrix. We show that the solution is unique by contradiction. Suppose that there are two feasible solutions
\[
Y^1 = \bar{V}^T P^1 \bar{V} \quad\text{and}\quad Y^2 = \bar{V}^T P^2 \bar{V},
\]
where $P^1 \neq P^2$ and, without loss of generality, $P^2 - P^1$ has at least one negative eigenvalue (otherwise, if $P^1 - P^2$ has at least one negative eigenvalue, we can interchange the roles of $P^1$ and $P^2$; the only remaining case is when all the eigenvalues of $P^2 - P^1$ are zero, which is impossible since it would imply $P^1 = P^2$). Let
\[
Y(\alpha) = \bar{V}^T \bigl(P^1 + \alpha(P^2 - P^1)\bigr)\bar{V}.
\]
Clearly, $Y(\alpha)$ satisfies all the linear constraints of (4.5), and it has rank $d$ for $0 \le \alpha \le 1$. But there is an $\alpha_0 > 1$ such that $P^1 + \alpha_0(P^2 - P^1)$ is positive semidefinite but not positive definite, i.e., one of the eigenvalues of $P^1 + \alpha_0(P^2 - P^1)$ becomes zero. Thus, $Y(\alpha_0)$ has rank less than $d$ but is feasible for (4.5), which contradicts the fact that the graph cannot be 'flattened' to a lower-dimensional space.

Let $\bar{U}$ be an optimal dual matrix of (4.6). Then any optimal solution matrix $\bar{Y}$ satisfies the complementarity condition $\bar{Y}\bar{U} = 0$. Note that $\bar{U}e = e$, where $e$ is the vector of all ones. Thus, we have $\bar{X}^T \bar{X} e = \bar{Y} e = \bar{Y}\bar{U} e = 0$, which further implies that $\bar{X} e = 0$.
Theorem 4.1 has an important implication for a distributed graph localization algorithm. It suggests that if a subgraph is d-localizable, then its sub-configuration is the same (up to translation, rotation and reflection) as the corresponding portion of the global configuration. Thus, one may attempt to localize a large graph by finding a sequence of d-localizable subgraphs. Theorem 4.1 also says that if the graph $G$ is d-localizable, then the optimal solution of the SDP is given by $\bar{Y} = \bar{X}^T \bar{X}$ for some $\bar{X} = [\bar{x}_1, \dots, \bar{x}_n] \in \mathbb{R}^{d\times n}$ such that $\bar{X}e = 0$. It is now clear that when $G$ is d-localizable, we have $\bar{Y} = \bar{X}^T \bar{X}$, and hence $\bar{Y}_{jj} = \|\bar{x}_j\|^2$ for $j = 1, \dots, n$. But in general, when the given distances are not exact but corrupted with noise, we only have $\bar{Y} - \bar{X}^T \bar{X} \succeq 0$. This inequality, however, may give a measure of the quality of the estimated positions. For example, the individual trace
\[
T_j := \bar{Y}_{jj} - \|\bar{x}_j\|^2, \tag{4.7}
\]
may give an indication of the quality of the estimated position $\bar{x}_j$, where a smaller trace indicates a more accurate estimate. This is the same idea as was discussed for the case with anchors in (2.7).
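The individual trace (4.7) is easy to compute from the SDP output; a NumPy sketch (names are ours): it vanishes when $\bar{Y} = \bar{X}^T\bar{X}$ and is positive when $\bar{Y}$ strictly dominates $\bar{X}^T\bar{X}$:

```python
import numpy as np

def trace_errors(Y, X):
    """Individual traces (4.7): T_j = Y_jj - ||x_j||^2 for X of shape d x n."""
    return np.diag(Y) - np.sum(X ** 2, axis=0)

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 5))
```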
4.5 Regularization term

As mentioned in Section 3.5, we add additional terms to the objective function that force the SDP solution to lie in a low dimension, in order to counter the tendency of interior-point algorithms to compute high-rank solutions. Our strategy is to convert the feasibility problem (4.1) into a maximization problem using the following regularization term as the objective function:
\[
\sum_{i=1}^n \sum_{j=1}^n \|x_i - x_j\|^2. \tag{4.8}
\]
CHAPTER 4. MOLECULE CONFORMATION
The new optimization problem is
\[
\begin{array}{ll}
\max & \displaystyle\sum_{i=1}^n \sum_{j=1}^n \|x_i - x_j\|^2 \\[2mm]
\text{s.t.} & \underline{d}_{ij}^2 \le \|x_i - x_j\|^2 \le \bar{d}_{ij}^2 \quad \forall\,(i,j) \in N_u, \\[1mm]
& \displaystyle\sum_{i=1}^n x_i = 0.
\end{array} \tag{4.9}
\]
Note that the regularization term can also be expressed, up to the constant factor $2n$ (which does not affect the maximizer), as $(I - ee^T/n) \bullet Y$, where $e$ is the vector of all ones. The constraint $\sum_{i=1}^n x_i = 0$, which is added to remove the translational invariance of the configuration of points by putting the center of gravity at the origin, can also be expressed as $Ye = 0$. The objective function then reduces to $\mathrm{Trace}(Y) = I \bullet Y$. Finally, the SDP relaxation is
\[
\begin{array}{ll}
\max & I \bullet Y \\[1mm]
\text{s.t.} & \underline{d}_{ij}^2 \le e_{ij}^T Y e_{ij} \le \bar{d}_{ij}^2 \quad \forall\,(i,j) \in N_u, \\[1mm]
& Ye = 0, \quad Y \succeq 0.
\end{array} \tag{4.10}
\]
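The two algebraic identities used above — that the pairwise-distance sum equals $2n \cdot (I - ee^T/n) \bullet Y$, and that this inner product reduces to $\mathrm{Trace}(Y)$ when $Ye = 0$ — can be checked numerically (illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 8, 3
X = rng.standard_normal((d, n))
X -= X.mean(axis=1, keepdims=True)   # enforce sum_i x_i = 0, i.e. Ye = 0
Y = X.T @ X

e = np.ones((n, 1))
pair_sum = sum(np.sum((X[:, i] - X[:, j]) ** 2)
               for i in range(n) for j in range(n))
reg = np.trace((np.eye(n) - e @ e.T / n) @ Y)

assert abs(pair_sum - 2 * n * reg) < 1e-8   # regularizer up to the factor 2n
assert abs(reg - np.trace(Y)) < 1e-8        # with Ye = 0 it reduces to Trace(Y)
```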
The regularization idea works even better in the inequality setting (4.10) than in the model considered previously in (3.15). There is no error minimization term in the objective function, so we no longer need to worry about finding the right balance in weighting the different terms of the objective. One important point to be careful about, however, is that with very sparse distance data there may often be two or more disjoint blocks within the given distance matrix, i.e., the graph represented by the distance matrix may not be connected and may have more than one component. In that case, using the regularization term leads to an unbounded objective function, since the disconnected components can be pulled arbitrarily far apart. Therefore, care must be taken to identify the disconnected components before applying model (4.10) to the individual components.
4.6 Clustering
The SDP (4.10) is computationally intractable when there are several hundred points. Therefore we divide the entire molecule into smaller clusters of points and solve a separate SDP for each cluster. The clusters need to be chosen so that there is enough distance information among the points in a cluster for it to be localized accurately, but not so much that the cluster SDP becomes expensive to solve. To this end, we make use of matrix permutations that reorder the points in such a way that points sharing the most distance information are grouped together.

In the problem described in this chapter, we have an upper and a lower bound distance matrix, but for simplicity we will describe the operations in this section on just the partial distance matrix D. In the actual implementation, the operations described are performed on both the upper and lower bound distance matrices. This does not make a difference because the operations performed here exploit information only about the connectivity graph of the set of points.

We perform a symmetric permutation of the partial distance matrix D to aggregate the non-zero elements towards the main diagonal. Let D̃ be the permuted matrix. In our implementation, we used the function symrcm in Matlab to perform the symmetric reverse Cuthill-McKee permutation [33] on D. The matrix D̃ is then partitioned into a quasi-block-diagonal matrix with variable block sizes. Let the blocks be denoted by D1, ..., DL. A schematic diagram of the quasi-block-diagonal structure is shown in Figure 4.1. The size of each block (except the last) is determined as follows. Starting with a minimum block size, say 50, we extend the block size incrementally until the number of non-zero elements in the block exceeds a certain threshold, say 1000. We start the process of determining the size of each block from the upper-left corner of D̃ and proceed sequentially to the lower-right corner. Geometrically, the above partitioning process splits the job of determining the global configuration defined by D̃ into L smaller jobs, each of which tries to determine the sub-configuration defined by Dk; the sub-configurations are then assembled sequentially from k = 1 to L to reconstruct the global configuration.

Observe that there are overlapping sub-blocks between adjacent blocks. For example, the second block overlaps with the first at its upper-left corner, and overlaps with the third at its lower-right corner. The overlapping sub-blocks serve an important purpose. For convenience, consider the overlapping sub-block between the second and third blocks. This sub-block corresponds to points that are common to the configurations defined by their respective distance matrices, D2 and D3. If the third block determines a localizable configuration X3, then the common points in the overlapping sub-block can be used to stitch the localized configuration X3 to the current global configuration determined by the first two blocks. In general, if the kth block is localizable, then the overlapping sub-block between the (k-1)st and kth blocks will be used to stitch the kth localized configuration to the global configuration determined by the aggregation of all the previous blocks.
Figure 4.1: Schematic diagram of the quasi-block-diagonal structure considered in the distributed SDP algorithm.
As the overlapping sub-blocks are crucial in stitching a sub-configuration to the global configuration, they should have as high connectivity as possible. In our implementation, we found the following strategy to be reasonably effective. After the blocks D1, ..., DL are determined, starting with D2, we perform a symmetric reverse Cuthill-McKee permutation on the (2,2) sub-block of D2 that does not overlap D1, and repeat the same process sequentially for all subsequent blocks. To avoid introducing excessive notation, we still use Dk to denote the permuted kth block. It is also worth noting that in the case of highly noisy or very sparse data, the size of the overlapping sub-blocks needs to be set higher for the stitching phase to succeed. The more common points there are between two blocks, the more robust the stitching between them. This also holds as the number of sub-blocks to be stitched grows large (i.e., as the number of atoms in the molecule grows). However, increasing the number of common points also has an impact on the runtime, and therefore we must choose the overlapping sub-block sizes judiciously. Experiments indicated that a sub-block size of 15-20 was sufficient for molecules with fewer than 3000 atoms, while sub-block sizes of 25-30 were more suitable for larger molecules.
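The greedy block-sizing rule described above can be sketched as follows (Python; the thesis implementation uses Matlab's symrcm, and for brevity this sketch assumes the matrix is already RCM-reordered and ignores the overlap between adjacent blocks):

```python
import numpy as np

def partition_blocks(A, min_size=50, nnz_threshold=1000):
    """Greedy block sizing along the diagonal of a (reordered) 0/1 sparsity
    pattern A: grow each block from min_size until its non-zero count
    exceeds nnz_threshold, then start the next block."""
    n = A.shape[0]
    blocks, start = [], 0
    while start < n:
        end = min(start + min_size, n)
        while end < n and np.count_nonzero(A[start:end, start:end]) <= nnz_threshold:
            end += 1
        blocks.append((start, end))
        start = end
    return blocks
```

Denser sparsity patterns thus yield smaller blocks, keeping the per-block SDP size roughly uniform.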
4.7 Stitching
After all the individual localization problems corresponding to D1, ..., DL have been solved, we have L sub-configurations that need to be assembled together to form the global configuration associated with D̃. Suppose the current configuration determined by the blocks Di, i = 1, ..., k-1, is given by the matrix $X^{(k-1)} = [U^{(k-1)}, V^{(k-1)}]$. Suppose also that $F^{(k-1)}$ records the global indices of the points that are currently labeled as localized in the current global configuration. Let $X_k = [V_k, W_k]$ be the points in the sub-configuration determined by Dk. (For k = 1, $V_k$ is the null matrix.) Here $V^{(k-1)}$ and $V_k$ denote the positions of the points corresponding to the overlapping sub-block between Dk-1 and Dk, respectively. Let $I_k$ be the global indices of the points in $W_k$. Note that the global indices of the unlocalized points for the blocks D1, ..., Dk-1 are given by $J^{(k-1)} = \bigcup_{i=1}^{k-1} I_i \setminus F^{(k-1)}$.
We will now concentrate on stitching the sub-configuration determined by Dk with the global index set $I_k$. The points in this sub-configuration, obtained by solving the SDP on Dk, will most likely contain some points that have been correctly estimated and others that have been estimated less accurately. It is essential that we isolate the badly estimated points, both to ensure that their errors are not propagated when estimating subsequent blocks and so that we may recalculate their positions once more points have been correctly estimated.

To detect the badly estimated points, we use a combination of two error measures. Let $x_j$ be the position estimated for point $j$, and set $\hat{F} \leftarrow F^{(k-1)}$. We use the trace error $T_j$ from Equation (4.7) and the local error
\[
\hat{E}_j = \frac{\sum_{i \in \hat{N}^u_j} \bigl(\|x_i - x_j\| - d_{ij}\bigr)^2}{|\hat{N}^u_j|}, \tag{4.11}
\]
where $\hat{N}^u_j = \{i \in \hat{F} : i < j,\ \tilde{D}_{ij} \neq 0\}$. We require $\max\{T_j, \hat{E}_j\} \le T$ as a necessary condition for the point to be correctly estimated. In the case when we are just provided with upper and lower bounds on distances, we use the mean distance in the local error measure calculations, i.e.,
\[
d_{ij} = (\bar{d}_{ij} + \underline{d}_{ij})/2. \tag{4.12}
\]
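A sketch of the local error measure (4.11), using the mean of the bounds as in (4.12) (Python; the data-structure choices are ours, not from the thesis):

```python
import numpy as np

def local_error(j, X_hat, localized, D_bounds):
    """Local error measure (4.11) for point j against already-localized
    earlier neighbors; D_bounds[(i, j)] = (lower, upper), and the mean
    distance (4.12) stands in for the unknown true distance."""
    terms = []
    for i in localized:
        if i < j and (i, j) in D_bounds:
            lo, up = D_bounds[(i, j)]
            d_ij = 0.5 * (lo + up)
            terms.append((np.linalg.norm(X_hat[:, i] - X_hat[:, j]) - d_ij) ** 2)
    return np.mean(terms) if terms else 0.0
```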
The use of multiple error measures is crucial, especially when the distance information provided is noisy. In the noise-free case, it is much easier to isolate badly estimated points using only the trace error $T_j$, but in the noisy case there are no fully reliable error measures. By using multiple error measures, we hope to identify bad points that might escape detection when only a single measure is used. Setting the tolerance $T$ for the error is also an important parameter selection issue. For very sparse distance data and highly noisy cases, where even accurately estimated points may have significant error measure values, there is a tradeoff between the tolerance and the number of points flagged as badly estimated: if the tolerance is too tight, we might end up discarding too many points that are actually quite accurately estimated. The design of informative error measures is still an open issue and there is room for improvement. As our results will show, the stitching phase of the algorithm is the one most susceptible to noise and inaccurate estimation, and better error measures are needed to make it more robust.

We now have two cases to consider in the stitching process:

1. Suppose that there are enough points in the overlapping sub-blocks $V_k$ and $V^{(k-1)}$ that are well estimated/localized. Then we can stitch the kth sub-configuration directly to the current global configuration $X^{(k-1)}$ by finding the affine mapping that matches the points in $V^{(k-1)}$ and $V_k$ as closely as possible. Mathematically, we solve the following linear least squares problem:
\[
\min\;\bigl\{\, \|B(V_k - \alpha) - (V^{(k-1)} - \beta)\|_F \;:\; B \in \mathbb{R}^{d\times d} \,\bigr\}, \tag{4.13}
\]
where $\alpha$ and $\beta$ are the centroids of $V_k$ and $V^{(k-1)}$, respectively. Once an optimal $B$ is found, set
\[
\hat{X} = [\,U^{(k-1)},\; V^{(k-1)},\; \beta + B(W_k - \alpha)\,], \qquad \hat{F} \leftarrow F^{(k-1)} \cup I_k.
\]
We should mention that in the stitching process it is very important to exclude points in $V^{(k-1)}$ and $V_k$ that are badly estimated/unlocalized when solving (4.13), to avoid destroying the current global configuration. It should also be noted that there may be points in $W_k$ that were incorrectly estimated in the SDP step. Performing the affine transformation on these points is useless, because they are in the wrong position in the local configuration to begin with. To deal with these points, we re-estimate their positions using the correctly estimated points as anchors. This procedure is exactly the same as what is described in case 2.
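The least-squares stitching step (4.13) reduces to an ordinary linear least-squares solve after centering; a NumPy sketch (names hypothetical), which then maps the new points $W_k$ into the global frame:

```python
import numpy as np

def stitch_affine(V_local, V_global, W_local):
    """Solve min_B ||B(Vk - alpha) - (V^(k-1) - beta)||_F  (4.13) by linear
    least squares, then map the new points Wk into the global frame."""
    alpha = V_local.mean(axis=1, keepdims=True)
    beta = V_global.mean(axis=1, keepdims=True)
    P = V_local - alpha            # d x m centered local overlap points
    Q = V_global - beta            # d x m centered global overlap points
    # min_B ||B P - Q||_F  <=>  least squares in B^T: P^T B^T ~ Q^T
    Bt, *_ = np.linalg.lstsq(P.T, Q.T, rcond=None)
    return beta + Bt.T @ (W_local - alpha)
```

When the overlap points are exact up to a rigid motion, the recovered $B$ is the corresponding rotation/reflection and the mapped points land exactly in the global frame.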
2. If there are not enough common points in the overlapping sub-blocks, then the stitching process described in case 1 cannot be carried out successfully. In this case, the solution obtained from the SDP step for Dk is discarded, i.e., the positions in $W_k$ are discarded, and they are to be determined from the current global configuration $X^{(k-1)}$ point by point as follows. Set $\hat{X} \leftarrow X^{(k-1)}$ and $\hat{F} \leftarrow F^{(k-1)}$. Let $T = 10^{-4}$. For each $j \in J^{(k-1)} \cup I_k$:

(a) Formulate a new SDP with $N_u = \emptyset$ and $N_a = \{(i,j) : i \in \hat{F},\ \tilde{D}_{ij} \neq 0\}$, where the anchor points are given by $\{\hat{X}(:,i) : (i,j) \in N_a\}$.

(b) Let $x_j$ and $T_j$ be the newly estimated position and trace error from the previous step. Compute the local error measure $\hat{E}_j$.

(c) If $j \in J^{(k-1)}$, set $\hat{X}(:,j) = x_j$; else, set $\hat{X} = [\hat{X}, x_j]$. If $\min\{T_j, \hat{E}_j\} \le T$, then set $\hat{F} \leftarrow \hat{F} \cup \{j\}$.
Notice that in attempting to localize the points corresponding to Wk , we also attempt
to estimate the positions of those previously unlocalized points, whose indices are recorded in J (k−1) . Furthermore, we use previously estimated points as anchors to estimate new points. This not only helps in stitching new points into the current global configuration, but also increases the chances of correcting the positions of previously badly estimated points (since more anchor information is available when more points are correctly estimated).
4.8 A distributed SDP algorithm for anchor-free graph realization

We will now describe the complete distributed algorithm for solving a large-scale anchor-free graph realization problem. To facilitate its description, we first describe the centralized algorithm for solving model (4.1). It is important to note here that the terms localization and realization are used interchangeably.

Centralized graph localization (CGL) algorithm.
Input: $(\bar{D}, \underline{D}, N_u; \{a_1, \dots, a_m\}, N_a)$.
Output: Estimated positions $[\bar{x}_1, \dots, \bar{x}_n] \in \mathbb{R}^{d\times n}$ and corresponding accuracy measures: trace errors $T_1, \dots, T_n$ and local error measures $\hat{E}_1, \dots, \hat{E}_n$.

1. Formulate the optimization problem (4.9) and solve the resulting SDP. Let $X = [x_1, \dots, x_n]$ be the estimated positions obtained from the SDP solution.
2. Perform the local refinement algorithm from Section 3.6 using the SDP solution as the starting iterate to get refined estimated positions $[\bar{x}_1, \dots, \bar{x}_n]$.
3. For each $j = 1, \dots, n$, label point $j$ as localized or unlocalized based on the error measures $T_j$ and $\hat{E}_j$.
Distributed anchor-free graph localization (DAFGL) algorithm.
Input: Upper and lower bounds on a subset of the pairwise distances in a molecule.
Output: A configuration of all the atoms in the molecule that is closest (in terms of the RMSD error described in Section 4.9) to the actual molecule (from which the measurements were taken).

1. Divide the entire point set into sub-blocks using the clustering algorithm described in Section 4.6 on the sparse distance matrices.
2. Apply the CGL algorithm to each sub-block.
3. Stitch the sub-blocks together using the procedure described in Section 4.7. After each stitching phase, refine the point positions again using the gradient-descent-based local refinement method described in Section 3.6, and update their error measures.

Some remarks are in order for the above algorithm. In Step 2, we can solve each cluster individually, and the computation is highly distributable. In using the centralized CGL algorithm to solve each cluster, the computational cost is dominated by the solution of the SDP problem (the SDP cost is in turn determined by the number of given distances). For a graph with $n$ nodes and $m$ given pairwise distances, the computational complexity of solving the SDP is roughly $O(m^3) + O(n^3)$, provided sparsity in the SDP data is fully exploited. For a graph with 200 nodes in which the number of given distances is 10% of the total number of pairwise distances, the SDP would have roughly 2000 equality constraints and matrix variables of dimension 200. Such an SDP can be solved on a Pentium IV 3.0 GHz PC with 2GB RAM in about 36 and 93 seconds using the general-purpose SDP software
SDPT3-3.1 [90] and SeDuMi-1.05 [82], respectively. The computational efficiency of the CGL algorithm can certainly be improved in various ways. First, the SDP problem need not be solved to high accuracy; a low-accuracy SDP solution suffices if it is only used as a starting iterate for the local refinement algorithm. There are various highly efficient methods (such as iterative-solver-based interior-point methods [83] or the SDPLR method of Burer and Monteiro [20]) for obtaining a low-accuracy SDP solution. Second, a dedicated solver based on a dual scaling algorithm can also speed up the SDP computation. Substantial speedup can be expected if the computation exploits the low-rank structure present in the constraint matrices. However, as our focus in this chapter is not on improving the computational efficiency of the CGL algorithm, we shall not discuss this issue further. In the numerical experiments conducted in Section 4.9, we use the software packages SDPT3-3.1 and SeDuMi to solve the SDP in the CGL algorithm. An alternative is to use DSDP-5.8 [7], which is expected to be more efficient than SDPT3-3.1 or SeDuMi.

The stitching process in Step 3 is sequential in nature. But this does not imply that the distributed DAFGL algorithm is redundant and that the centralized CGL algorithm is sufficient for computational purposes. For a graph with 10000 nodes in which the number of given distances is 1% of the total number of pairwise distances, the SDP problem that needs to be solved by the CGL algorithm would have 500000 constraints and matrix variables of dimension 10000. Such a large-scale SDP is well beyond the range that can be solved routinely on a standard workstation available today. By considering smaller blocks, the distributed algorithm does not suffer from the scalability limitation faced by the CGL algorithm.

If there are multiple computer processors available, say $p$ of them, the distributed algorithm can also take advantage of the extra computing power. The strategy is to divide the graph into $p$ large blocks using Step 1 of the algorithm and apply the DAFGL algorithm to localize one large block on each processor. In our implementation, we use the gradient refinement step quite extensively, both after the SDP step for a single block and after each stitching phase between two blocks. In the single-block case, the calculation does not involve any anchor points, but when used after stitching, we fix the previously correctly estimated points as anchors; the formula used in the gradient calculations changes accordingly. It should be noted that the more sophisticated refinement methods described in Subsection 3.6.2 can also be used, and are expected to provide even better accuracy. Tuning the truncated Newton method for the molecule conformation problem is part of our future research.
We end this section with two observations on the DAFGL algorithm. First, we observed that the outlying points with low connectivity are usually not well estimated in the initial stages of the method. As the number of well-estimated points grows, more and more of these 'loose' points are estimated by the gradient descent algorithm. As the molecules get larger, the frequency of non-localizable sub-configurations in Step 3 also increases, so the point-by-point stitching procedure described in Section 4.7 is visited more and more often. Second, for large molecules, the sizes of the overlapping blocks need to be larger for the stitching algorithm in Section 4.7 to be robust (more common points generally lead to more accurate stitching). But to accommodate larger overlapping blocks, each subgraph in the DAFGL algorithm must correspondingly be larger, and that in turn increases the problem size of the SDP relaxation. In our implementation, we apply the idea of dropping redundant constraints to reduce the computational effort when selecting large sub-block sizes of 100-150. This strategy works because many of the distance constraints belong to the densest parts of the sub-block, and the points in these dense sections can be estimated quite well with only a fraction of those distance constraints. Therefore, in the SDP step we limit the number of distance constraints for each point to below six; any additional distance constraints are not included in the SDP step. This allows us to choose large overlapping-block sizes while the corresponding SDPs for the larger clusters can still be solved without too much additional computational effort.
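The constraint-dropping heuristic can be sketched as a simple greedy filter (Python; our own illustrative rendering with the cap as a parameter, not the thesis code):

```python
def prune_constraints(edges, max_per_point=6):
    """Greedily keep at most max_per_point distance constraints incident to
    each point, dropping the rest as redundant."""
    count, kept = {}, []
    for (i, j) in edges:
        if count.get(i, 0) < max_per_point and count.get(j, 0) < max_per_point:
            kept.append((i, j))
            count[i] = count.get(i, 0) + 1
            count[j] = count.get(j, 0) + 1
    return kept
```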
4.9 Experimental results
To evaluate our DAFGL algorithm, numerical experiments were performed on protein molecules with the number of atoms in each molecule ranging from a few hundred to a few thousand. We conducted our numerical experiments in Matlab on a single Pentium IV 3.0 GHz PC with 2GB of RAM. The known 3D coordinates of the atoms were taken from the Protein Data Bank (PDB) [8] and were used to generate the true distances between the atoms. Our goal in the experiments is to reconstruct as closely as possible the known molecular configuration of each molecule, using only distance information generated from a sparse subset of all the pairwise distances. This information was in the form of upper and lower bounds on the actual pairwise distances. For each molecule, we generated the partial distance matrix as follows. If the distance between two atoms was less than a given cutoff radius R, the distance is kept; otherwise,
no distance information is known about the pair. The cutoff radius R is chosen to be 6 Å (1 Å = 10^-8 cm), which is roughly the maximum distance that NMR techniques can measure between two atoms. Therefore, in this case, $N_u = \{(i,j) : \|x_i - x_j\| \le 6\,\text{Å}\}$. We then perturb the distances to generate upper and lower bounds on the given distances in the following manner. Assuming that $\hat{d}_{ij}$ is the true distance between atom $i$ and atom $j$, we set
\[
\bar{d}_{ij} = \hat{d}_{ij}\,(1 + |\bar{\epsilon}_{ij}|), \qquad \underline{d}_{ij} = \hat{d}_{ij}\,\max(0,\, 1 - |\underline{\epsilon}_{ij}|),
\]
where $\bar{\epsilon}_{ij}, \underline{\epsilon}_{ij} \sim N(0, \sigma_{ij}^2)$. By varying $\sigma_{ij}$ (which we keep the same for all pairwise distances), we control the noise in the data. This is a multiplicative noise model, in which a larger distance value means more uncertainty in its measurement. For all future reference, we will refer to $\sigma_{ij}$ in percentage values; for example, $\sigma_{ij} = 0.1$ will be referred to as 10% noise on the upper and lower bounds. Typically, not all the distances below 6 Å are known from NMR experiments. Therefore, we will also present results for the DAFGL algorithm when only a fraction of all the distances below 6 Å, chosen randomly, are used in the calculation. Let $\mathcal{Q}$ be the set of orthogonal matrices in $\mathbb{R}^{d\times d}$ ($d = 3$). We measure the accuracy
performance of our algorithm by the following criteria:
\[
\mathrm{RMSD} = \frac{1}{\sqrt{n}}\,\min\Bigl\{\, \bigl\|X^{\mathrm{true}} - QX - h\,e^T\bigr\|_F \;:\; Q \in \mathcal{Q},\ h \in \mathbb{R}^d \,\Bigr\}, \tag{4.14}
\]
\[
\mathrm{LDME} = \biggl( \frac{1}{|N_u|} \sum_{(i,j)\in N_u} \bigl( \|x_i - x_j\| - d_{ij} \bigr)^2 \biggr)^{1/2}. \tag{4.15}
\]
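Both accuracy criteria are straightforward to compute; below is a NumPy sketch (names ours) that realizes (4.14) with the classical SVD-based orthogonal Procrustes alignment — note $Q$ is allowed to be any orthogonal matrix, including reflections — and (4.15) directly:

```python
import numpy as np

def rmsd(X_true, X_est):
    """RMSD (4.14): best orthogonal Q (Procrustes, via SVD) and best
    translation (centering), then the scaled Frobenius misfit."""
    n = X_true.shape[1]
    A = X_true - X_true.mean(axis=1, keepdims=True)
    B = X_est - X_est.mean(axis=1, keepdims=True)
    U, _, Vt = np.linalg.svd(A @ B.T)
    Q = U @ Vt                       # orthogonal; reflections allowed here
    return np.linalg.norm(A - Q @ B) / np.sqrt(n)

def ldme(X_est, Nu, d):
    """LDME (4.15): RMS deviation of realized from given distances."""
    terms = [(np.linalg.norm(X_est[:, i] - X_est[:, j]) - d[(i, j)]) ** 2
             for (i, j) in Nu]
    return np.sqrt(np.mean(terms))
```

A configuration that differs from the truth only by a rigid motion has RMSD (numerically) zero, while the LDME can be computed without knowing the true configuration at all.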
The first criterion requires knowledge of the true configuration, whereas the second does not. The second criterion is thus more practical, but it is also less reliable in evaluating the true accuracy of the constructed configuration: the LDME gives lower values than the RMSD, and as the noise increases it becomes less dependable. When 90% or more of the distances below 6 Å were considered and were not corrupted by noise, the molecules considered here were estimated very precisely, with RMSD = 10^-4 - 10^-6 Å. This shows that the algorithm performs remarkably well when there is enough exact distance data. In this regard, our algorithm is competitive with the geometric build-up algorithm in [96], which is designed specifically for graph localization with exact distance data. With exact distance data, we have solved molecules that are much larger and with much sparser distance data than those considered in [96]. However, our focus in this work is on molecular conformation with noisy and sparse distance data, and we will only present results for such cases.

In Figure 4.2, the original true and the estimated atom positions are plotted for some of the molecules, with varying amounts of distance data and noise. Once again, as in the 2-D case, the open (black) circles correspond to the true positions and the solid (blue) dots to their estimated positions from our computation. The error offset between the true and estimated positions of an individual atom is depicted by a solid (black) line. The solid lines, however, are not visible for most of the atoms in the plots because we are able to estimate the positions very accurately. The plots show that even for very sparse data (as in the case of 1PTQ), the estimation for most of the atoms is accurate. The atoms that are badly estimated are the ones with too little distance information to be localized. The algorithm also performs well for very noisy data and for large molecules, as demonstrated for 1AX8 and 1TOA. The largest molecule we tried the algorithm on is 1I7W, with 8629 atoms. Figure 4.3 shows that even when the distance data is highly noisy (10% error on upper and lower bounds), the estimation is close to the original, with an RMSD error of 1.3842 Å.
Our measures using trace error (4.7) and local error (4.11) are able to isolate the majority of the points that are badly estimated, but do not always succeed when many of the points are badly estimated. In fact, for such molecules, a lot of the computation time is spent in the point-by-point stitching phase, where we attempt to repeatedly solve for better estimations of the badly estimated points; if that fails repeatedly, the estimations continue to be poor. If the number of badly estimated points is very high, it may affect the stitching of the subsequent clusters as well. In such cases, the algorithm more or less fails to find a global configuration. Examples of such cases are the 1NF7 molecule (5666 atoms) and the 1HMV molecule (7398 atoms) solved with 10% noise. While they are estimated correctly when there is no noise, the 1NF7 molecule estimation returns an RMSD of 25.1061 Å and the 1HMV molecule returns 28.3369 Å. A more moderate case of stitching failure can
4.9. EXPERIMENTAL RESULTS
(a) 1PTQ (402 atoms) with 30% of distances below 6 Å and 1% noise on upper and lower bounds, RMSD = 0.9858 Å
(b) 1AX8 (1003 atoms) with 70% of distances below 6 Å and 5% noise on upper and lower bounds, RMSD = 0.8184 Å
(c) 1TOA (4292 atoms) with 100% of distances below 6 Å and 10% noise on upper and lower bounds, RMSD = 1.5058 Å
Figure 4.2: Comparison of actual and estimated molecules.
Figure 4.3: 1I7W (8629 atoms) with 100% of distances below 6 Å and 10% noise on upper and lower bounds, RMSD = 1.3842 Å.
be seen in Figure 4.4 for the molecule 1BPM with 50% of the distances below 6 Å and 5% error on upper and lower bounds; the problem lies in particular clusters (encircled in the figure). Although these clusters have correct local geometry, their positions with respect to the entire molecule do not. This indicates that the stitching procedure has failed: some of the common points are not estimated correctly, and are then used in the stitching process, thus destroying the entire local configuration. So far, this is the weakest part of the algorithm, and future work is heavily focused on developing better error measures to isolate the badly estimated points and on improving the robustness of the stitching process. In Figure 4.5, we plot the 3-dimensional configurations (via Swiss PDB viewer [35]) of some of the molecules on the left and their estimated counterparts (with different distance data inputs) on the right. As can be seen clearly, the estimated counterparts closely resemble the original molecules. The results shown in the following tables are a good representation of the performance of the algorithm on different-size molecules with different types of distance data sets. The numerical results presented in Table 4.1 are for the case when all distances below 6 Å are used and perturbed with 10% noise on lower and upper bounds. Table 4.2 contains the results for the case when only 70% of distances below 6 Å are used and perturbed with 5%
Figure 4.4: 1BPM (3672 atoms) with 50% of distances below 6 Å and 5% noise on upper and lower bounds, RMSD = 2.4360 Å.
noise, and also for the case when only 50% of distances below 6 Å are used and perturbed with 1% noise. The results for 50% distances and 1% noise are representative of cases with sparse distance information and low noise; 100% distances and 10% noise represent relatively denser but highly noisy distance information; and 70% distances and 5% noise is a middle ground between the two extreme cases. We can see from the values in Table 4.1 that LDME is not a very good measure of the actual estimation error given by RMSD, since the former does not correlate well with the latter. Therefore we do not report the LDME values in Table 4.2. From the tables, it can be observed that for relatively dense distance data (100% of all distances below 6 Å), the estimation error stays below 2 Å even when the upper and lower bounds are very loose. The algorithm is thus quite robust to high noise when there is enough distance information. The estimation error is also quite low for most molecules when the distance information is very sparse but much more precise. In the sparse distance cases, it is the molecules with more irregular geometries that suffer the most from the lack of distance data and exhibit high estimation errors. The combination of sparsity and noise has a detrimental impact on the algorithm's performance, as can be seen in the results with 70% distances and 5% noise.
(a) 1HOE (558 atoms) with 40% of distances below 6 Å and 1% noise on upper and lower bounds, RMSD = 0.2154 Å
(b) 1PHT (814 atoms) with 50% of distances below 6 Å and 5% noise on upper and lower bounds, RMSD = 1.2014 Å
(c) 1RHJ (3740 atoms) with 70% of distances below 6 Å and 5% noise on upper and lower bounds, RMSD = 0.9535 Å
(d) 1F39 (1534 atoms) with 85% of distances below 6 Å and 10% noise on upper and lower bounds, RMSD = 0.9852 Å
Figure 4.5: Comparison between the original (left) and reconstructed (right) configurations for various protein molecules using Swiss PDB viewer.
Table 4.1: Results for 100% of distances below 6 Å and 10% noise on upper and lower bounds

PDB ID   No. of atoms   % of total pairwise   RMSD (Å)   LDME (Å)   CPU time (secs)
                        distances used
1PTQ          402              8.79            0.1936     0.2941        107.7
1HOE          558              6.55            0.2167     0.2914        108.7
1LFB          641              5.57            0.2635     0.1992        129.1
1PHT          814              5.35            1.2624     0.2594        223.9
1POA          914              4.07            0.4678     0.2465        333.1
1AX8         1003              3.74            0.6408     0.2649        280.1
1F39         1534              2.43            0.7338     0.2137        358.0
1RGS         2015              1.87            1.6887     0.1800        665.9
1KDH         2923              1.34            1.1035     0.2874        959.1
1BPM         3672              1.12            1.1965     0.2064       1234.7
1RHJ         3740              1.10            1.8365     0.1945       1584.4
1HQQ         3944              1.00            1.9700     0.2548       1571.8
1TOA         4292              0.94            1.5058     0.2251        979.5
1MQQ         5681              0.75            1.4906     0.2317       1461.1
In Figure 4.6, we plot the CPU times required by our DAFGL algorithm to localize a molecule with n atoms versus n (with different types of distance inputs). As the distance data becomes sparser, the number of constraints in the SDP also decreases, so the SDPs generally take less time to solve. However, many points may be incorrectly estimated in the SDP phase, and so the stitching phase usually takes longer in these cases; this behavior is exacerbated by the presence of higher noise. Our algorithm is reasonably efficient in solving large problems. The spikes that we see in the graphs for some of the larger molecules also correspond to cases with high RMSD error, in which the algorithm fails to find a valid configuration of points, either due to very noisy or very sparse data. A lot of time is spent in recomputing points in the stitching phase, and many of these points are repeatedly estimated incorrectly. The number of badly estimated points grows at each iteration of the stitching process, which becomes more and more time consuming. Given the general computational performance, however, we expect our algorithm to be able to handle even larger molecules if the number of badly estimated points can be kept low. On the other hand, we are also investigating methods that discard repeatedly badly estimated
Table 4.2: Results with 70% of distances below 6 Å and 5% noise on upper and lower bounds, and with 50% of distances below 6 Å and 1% noise on upper and lower bounds

                        70% distances, 5% noise      50% distances, 1% noise
PDB ID   No. of atoms   RMSD (Å)   CPU time (secs)   RMSD (Å)   CPU time (secs)
1PTQ          402        0.2794          93.8         0.7560          22.1
1HOE          558        0.2712         129.6         0.0085          32.5
1LFB          641        0.4392         132.5         0.2736          41.6
1PHT          814        0.4701         129.4         0.6639          53.6
1POA          914        0.4325         174.7         0.0843          54.1
1AX8         1003        0.8184         251.9         0.0314          71.8
1F39         1534        1.1271         353.1         0.2809         113.4
1RGS         2015        4.6540         613.3         3.5416         308.2
1KDH         2923        2.5693        1641.0         2.8222         488.4
1BPM         3672        2.4360        1467.1         1.0502         384.8
1RHJ         3740        0.9535        1286.1         0.1158         361.5
1HQQ         3944        8.9106        2133.5         1.6610         418.4
1TOA         4292        9.8351        2653.6         1.5856         372.6
1MQQ         5681        3.1570        1683.4         2.3108        1466.2
points from future calculations.
4.10 Conclusion and work in progress
An SDP based distributed method to solve the distance geometry problem in 3-D with incomplete and noisy distance data, and without anchors, has been described. The entire problem is broken down into subproblems by intelligent clustering methods. An SDP relaxation problem is formulated and solved for each cluster. Matrix decomposition is used to find local configurations of the clusters, and a least-squares based stitching method is applied to find a global configuration. Gradient descent methods are also employed in intermediate steps to refine the quality of the solution. The performance of the algorithm is evaluated by using it to find the configurations of large protein molecules with a few thousand atoms. The distributed SDP approach can solve, with good accuracy and speed, large problems with favorable geometries when 50-70% of the distances below 6 Å are given and they are contaminated
Figure 4.6: CPU time taken to localize a molecule with n atoms versus n, for three types of distance inputs: 100% distances with 10% error, 70% distances with 5% error, and 50% distances with 1% error.
only with moderate levels of noise (0-5%) in both the lower and upper bounds. The current DAFGL algorithm needs to be improved for it to work on very sparse (30-50% of all distances below 6 Å) and highly noisy data (10-20% noise), which is often the case for actual NMR data used to perform molecular conformation. For the rest of this section, we outline some possible improvements to our current algorithm. One of the main difficulties we encounter in the current distributed algorithm is the propagation of position estimation errors in a cluster to other clusters during the stitching process when the given distances are noisy. Even though we have had some success in overcoming this difficulty, it is not completely alleviated in the current work. The difficulties faced by our current algorithm with very noisy or very sparse data are particularly noticeable for very large molecules (which correspond to a large number of sub-blocks that need stitching). Usually, the estimation for many of the points in some of the sub-blocks from the CGL step is not accurate enough in these cases. This is especially problematic when there are too few common points that can be used for stitching with the previous block, or when the error measures used to identify the bad points are unable to filter out some of the badly estimated common points. This usually leads to the error propagating to subsequent blocks as well. Therefore, the bad point detection and stitching phases need to be made more robust. To reduce the effect of inadvertently using a badly estimated point for stitching, we can
increase the number of common points used in the stitching process, and at the same time use more sophisticated stitching algorithms that not only stitch correctly estimated points but also isolate the badly estimated ones. As far as stitching is concerned, we currently use the coordinates of the common points between two blocks to find the best affine mapping that brings the two blocks into the same coordinate system. Another idea is to fix the values in the inner product matrix Y in (4.10) that correspond to the common points, based on the values that were obtained for them in solving the SDP for the previous block. By fixing the values, we indirectly use the common points to anchor the new block to the previous one. The dual of the SDP relaxation also merits further investigation, for possible reductions in the computational effort required and for more robust stitching results. A third method worth exploring is the sophisticated stitching algorithm introduced recently in [105] for nonlinear dimensionality reduction. With regard to error measures, it would be useful to study whether the local error measure (4.11) can be made more sophisticated by checking distances against a larger set of points, as opposed to just the immediate neighborhood. In some cases, the distances are satisfied within local clusters, but the entire cluster itself is badly estimated (and the local error measure fails to filter out many of the badly estimated points in this scenario). We also need to investigate the careful selection of the tolerance Tε in deciding which points have been estimated correctly and can be used for stitching. In this chapter, we have only used the gradient descent method for local refinement. In subsection 3.6.2, we introduced the use of truncated Newton methods, which can provide more accurate estimations in fewer iterations. The use of these methods for local refinement should further improve the accuracy of the overall algorithm.
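The affine stitching step described above can be sketched in a few lines: the common points, expressed in both the previous block's frame and the new block's local frame, determine a least-squares affine map that is then applied to the whole new block. This is a hedged sketch (the function name is ours, and a real implementation would first filter out badly estimated common points):

```python
import numpy as np

def stitch_block(common_prev, common_new, block_new):
    """Least-squares affine stitching: find T(x) = A x + b mapping the
    common points from the new block's local frame (common_new) onto
    their coordinates in the previous block's frame (common_prev),
    then apply T to every point of the new block.
    common_prev, common_new: (m, d) matched coordinates, m >= d + 1;
    block_new: (n, d) all points of the new block in its local frame."""
    m = len(common_new)
    # Homogeneous coordinates absorb the translation b into one matrix
    H = np.hstack([common_new, np.ones((m, 1))])
    M, *_ = np.linalg.lstsq(H, common_prev, rcond=None)  # shape (d+1, d)
    return np.hstack([block_new, np.ones((len(block_new), 1))]) @ M
```

With exact common-point coordinates, the block is mapped into the previous frame exactly; with noisy coordinates, the map is the least-squares best fit, which is why badly estimated common points can corrupt the whole block.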
As has been noted before, our current algorithm does not use the VDW minimum separation distance constraints of the form ‖x_i − x_j‖ ≥ L_ij described in [68] and [95]. The VDW constraints played an essential role in these previous works in reducing the search space of valid configurations, especially in the case of sparse data where there are many possible configurations fitting the sparse distance constraints. As mentioned in Section 3, VDW lower bound constraints can be added to the SDP model, but we would need to keep track of the type of atom each point corresponds to. However, one must be mindful that the VDW constraints should only be added when necessary, so as not to introduce too many redundant constraints. If the VDW constraints between all pairs of atoms are added to the SDP model, then the number of lower bound constraints is of the order n², where n is the
number of points. Thus even if n is below 100, the total number of constraints could be in the range of thousands. However, many of the VDW constraints are between two very remote points, and they are often inactive or redundant at the optimal solution. Therefore, we can adopt an iterative active constraint generation approach. We first solve the SDP problem, completely ignoring the VDW lower bound constraints, to obtain a solution. Then we check the VDW lower bound constraints at the current solution for all the points and add the violated ones to the model. We repeat this process until all the VDW constraints are satisfied. Typically, we would expect only O(n) VDW constraints to be active at the final solution. We are optimistic that by combining the ideas presented in previous work on molecular conformation (especially incorporating domain knowledge such as the minimum separation distances derived from VDW interactions) with the distributed SDP based algorithm in this work, the improved distributed algorithm would be able to calculate the conformation of large protein molecules with satisfactory accuracy and efficiency in the future. Based on the performance of the current DAFGL algorithm, and the promising improvements we have outlined, a realistic target for the future is to correctly calculate the conformation of a large molecule (with 5000 atoms or more) given only about 50% of all pairwise distances less than 6 Å, corrupted by 10-20% noise.
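The active constraint generation loop just described might be sketched as follows, with a caller-supplied `solve` standing in for the actual SDP solve (this is a schematic of the control flow only; the violation check is the one concrete piece):

```python
import numpy as np

def violated_vdw(X, lower_bounds, tol=1e-9):
    """Return the pairs (i, j) whose current estimated distance violates
    the VDW minimum-separation bound, i.e. ||x_i - x_j|| < L_ij.
    lower_bounds maps a pair (i, j) to its bound L_ij."""
    return [(i, j) for (i, j), L in lower_bounds.items()
            if np.linalg.norm(X[i] - X[j]) < L - tol]

def solve_with_active_vdw(solve, lower_bounds, max_rounds=20):
    """Active constraint generation: solve ignoring all VDW bounds,
    then re-solve with only the violated bounds enforced, until no
    bound is violated.  `solve(active)` is a stand-in for the SDP
    solve; it must return positions X satisfying the bounds in `active`."""
    active = []
    X = solve(active)
    for _ in range(max_rounds):
        new = [p for p in violated_vdw(X, lower_bounds) if p not in active]
        if not new:
            break  # all VDW bounds hold; typically only O(n) end up active
        active.extend(new)
        X = solve(active)
    return X, active
```

The loop terminates with far fewer than n² enforced constraints whenever most VDW bounds are inactive at the solution, which is the situation described above.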
Chapter 5

Using angle information

While the application of SDP relaxations to pure distance information has been discussed in previous chapters, no framework exists for applying this technique to angle information. Solving this problem is particularly useful in scenarios where the sensors can also detect mutual angles, as with image sensor nodes. There is a move towards creating designs and applications of large scale heterogeneous sensor networks using multi-modal sensing technologies [81, 104]. There is therefore also a need to develop localization techniques that can use, in an integrated fashion, different types of ranging and proximity information acquired through different ranging mechanisms (received signal strength, time of flight, ultrasonic, light and image sensors). This chapter takes a step in this direction by extending the SDP model to solve the localization problem using angle information in combination with distance information, or independently, using just angle information. Section 5.1 discusses some other approaches to using angle information for determining sensor positions. Section 5.2.1 describes the scenario envisioned for the problem and explains the nature of the angle information available. In Section 5.2.2, we present the extension of the SDP model to scenarios where both distance and angle information are present: using trigonometric relations, we arrive at an optimization problem with a set of quadratic constraints that can then be relaxed into an SDP. A more subtle idea is used in Section 5.2.3 to again obtain an optimization problem with quadratic constraints and subsequently relax it to an SDP. Practical considerations such as the computational effort and accuracy of the algorithm, and their dependence on factors such as communication range and field of view of the sensors, are discussed in Section 5.3. Extensive experimental results are presented in Section 5.4. Finally, we conclude with
the current limitations of the technique and ideas for further research in Section 5.5.
5.1 Related work
The use of angle information for sensor network localization has been investigated by Niculescu and Nath [62]. It is assumed that the network has a few landmark or anchor points whose positions are already known. The algorithm uses information between the anchors and unknown points to set up triangulation equations in order to determine the positions of unknown points. The calculated positions are then forwarded to other unknown points and used in further triangulation. Priyantha et al. [67] describe the Cricket Compass system, which uses ultrasonic measurements and fixed beacons to obtain robust estimates of the orientation of mobile devices. This is done by using the positions of the fixed beacons and differential distance estimates received from ultrasonic sensors. Recently, Gao et al. have introduced techniques to use local noisy angle information [18] and a combination of noisy angle and distance information [6] for localization. The basic idea behind this work is to find bounded regions of feasible positions for all the points, obtained by solving a set of linear bounding constraints at each point. Similar bounding constraints have also been investigated by Doherty et al. [29]. We hope to use more global information in our approach, as opposed to a purely distributed point-by-point approach, which suffers from the drawback of using only local information. The use of orientation information for mobile robot localization has been dealt with extensively by Betke and Gurvits [9]. A variant of this situation is considered in the context of camera sensor networks by Skraba et al. [78]: a moving beacon transmits its position information to sensor nodes that can also measure the angle of arrival of the signal. The sensors can determine their position and camera orientation based on information from the beacon node signals transmitted at different times. This is a purely distributed method, with each node being able to configure itself independently.
We consider a slightly different situation where a moving beacon node is no longer available. In fact, some or perhaps all of the nodes may not have any information relative to a fixed position. In this scenario it becomes essential to consider relative information between the sensors. In other words, there needs to be a collaborative method that exploits all the mutual information between the nodes. Techniques based on bounding using linear
hyperplanes may be too loose to provide a meaningful solution. Our method also attempts to solve the position estimation problem in one step, using global information. For very large networks of nodes, such global information may be the collective information shared between local clusters of nodes. This also avoids the issues of forwarding and error propagation encountered in [62]. Furthermore, by using global information, it is hoped that the solutions are more accurate.
5.2 Integrating angle information

5.2.1 Scenario
The scenario for which we will describe our algorithm is very similar to that described in [62]. Consider n unknown points x_i ∈ R², i = 1, ..., n. All angle measurements are made at each sensor relative to its own axis. The orientation θ_i of the axis with respect to a global coordinate system, after a random deployment of the network, is not known. There is also a set of m anchor nodes a_k ∈ R², k = 1, ..., m, i.e., nodes whose positions are known a priori. The orientations of their axes may or may not be known.
Every sensor can detect neighboring sensors that are within its field of view and within a specified communication range. The field of view is limited by the maximum angle (on either side of the axis) that the sensor can detect with respect to its own axis. Depending on the type of sensing technology used, these parameters can be set accordingly. For example, the field of view for a typical image sensor is about 20-25 degrees on either side; on the other hand, for omnidirectional sensors with multiple cameras or antennas, the field of view can be made 180 degrees on either side. Since every measurement is with respect to the axis of a sensor node, we also have angle measurements corresponding to the angle between two other sensors (either anchors or unknown) as seen from the sensor, given that the two sensors are within communication range of it and within its field of view. The problem setup is described for a set of three points in Figure 5.1. The dark arrows correspond to axis orientations. The measurements given to us are the angles, relative to a sensor's axis, at which the other sensors are seen. The angle between sensors B and C, as seen from sensor A, can be obtained by subtracting the angle between sensor B and the axis of A from the angle between sensor C and the axis of A. The objective is to find the positions of the unknown nodes, given many such measurements from all the sensors. From the position information, we can also derive the orientation information.
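The subtraction of axis-relative bearings described above can be sketched numerically (synthetic positions; note that the sensor's unknown axis orientation cancels in the difference):

```python
import math

def bearing(frm, to, axis):
    """Axis-relative bearing of point `to` as seen from sensor `frm`,
    whose axis points at (unknown) absolute angle `axis`, in radians."""
    return (math.atan2(to[1] - frm[1], to[0] - frm[0]) - axis) % (2 * math.pi)

def angle_between(frm, p, q, axis):
    """Angle subtended at `frm` by p and q, from axis-relative bearings."""
    d = abs(bearing(frm, p, axis) - bearing(frm, q, axis))
    return min(d, 2 * math.pi - d)

A, B, C = (0.0, 0.0), (1.0, 1.0), (1.0, -1.0)
# The subtended angle does not depend on A's axis orientation:
assert abs(angle_between(A, B, C, 0.3) - angle_between(A, B, C, 0.0)) < 1e-12
assert abs(angle_between(A, B, C, 0.3) - math.pi / 2) < 1e-12
```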
Figure 5.1: Sensor, Axis orientation and field of view.
5.2.2 Distance and angle information
The first case we consider is when we have both angle and distance information, and we assume that every anchor node is already aware of its axis orientation with respect to the global coordinate system. Then for every anchor, we have an absolute angle measurement (i.e., the angle with respect to the global coordinate system) corresponding to an unknown sensor, if the two are within a communication range R. Let the set of all such pairs of points be denoted by N_a. For a pair of points in N_a, we have an absolute angle θ_kj between anchor a_k and unknown sensor x_j. Therefore, the angle information at the anchors can be expressed as a simple linear constraint:

(a_k(2) − x_j(2)) / (a_k(1) − x_j(1)) = tan θ_kj,  ∀ (k, j) ∈ N_a,

or equivalently

a_k(2) − x_j(2) = tan θ_kj (a_k(1) − x_j(1)),

where θ_kj is the angle at which sensor j is detected by anchor k with respect to the global coordinate system, a_k(1) and a_k(2) are the x and y coordinates of the point a_k, and x_j(1) and x_j(2) are the x and y coordinates of the point x_j. Since tan θ_kj is a known constant,
the equality can be written as

A_kj X = tan θ_kj a_k(1) − a_k(2),     (5.1)

where A_kj is a 2 × n matrix with A_kj(1, j) equal to tan θ_kj, A_kj(2, j) equal to −1, and zero everywhere else, and X = [x_1 x_2 ... x_n] is the 2 × n matrix (corresponding to the point positions) that needs to be determined.
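A quick numeric sanity check of this linear constraint, with a hypothetical anchor-sensor pair (here we synthesize θ_kj from known positions, the reverse of the measurement process; note the constraint degenerates when θ_kj is near ±90°, where the tangent blows up):

```python
import math

# Hypothetical 2-D positions: anchor a_k and unknown sensor x_j
ak = (4.0, 1.0)   # (a_k(1), a_k(2))
xj = (1.0, 3.0)   # (x_j(1), x_j(2))

# Absolute angle at which anchor k sees sensor j
theta_kj = math.atan2(ak[1] - xj[1], ak[0] - xj[0])

# The measurement yields one linear equation in x_j:
#   a_k(2) - x_j(2) = tan(theta_kj) * (a_k(1) - x_j(1))
lhs = ak[1] - xj[1]
rhs = math.tan(theta_kj) * (ak[0] - xj[0])
assert abs(lhs - rhs) < 1e-12
```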
At every unknown sensor, measurements of the angle and distance of every sensor within its communication range and field of view are also available. Note that all the angle measurements are relative to the sensor's axis. By subtracting the axis-relative measurements at a sensor at x_j, we can find the angle between two other sensors at x_i and x_l as seen from the sensor at x_j. Let the set of all such triplets be denoted by N_b. For a set of three points in N_b, we have the angle θ_kji between a_k and x_i as seen from x_j, as well as the distances d_jk and d_ji; or the angle θ_ijl between x_i and x_l as seen from x_j, as well as the distances d_ji and d_jl. The angle information between unknown sensors i and l as seen from sensor j can be expressed as

(x_j − x_i)ᵀ(x_j − x_l) / (‖x_j − x_i‖ ‖x_j − x_l‖) = cos θ_ijl,  for (i, j, l) ∈ N_b, or
(x_j − x_i)ᵀ(x_j − x_l) / (d_ji d_jl) = cos θ_ijl, or
‖x_l − x_i‖² = d_ji² + d_jl² − 2 d_ji d_jl cos θ_ijl,     (5.2)

where θ_ijl is the angle between sensor i and sensor l as seen from sensor j, d_ji is the distance between sensors j and i, and d_jl is the distance between sensors j and l. The angle information between anchor k and unknown sensor i as seen from sensor j can be expressed as

(x_j − a_k)ᵀ(x_j − x_i) / (‖x_j − a_k‖ ‖x_j − x_i‖) = cos θ_kji,  for (k, j, i) ∈ N_b, or
(x_j − a_k)ᵀ(x_j − x_i) / (d_jk d_ji) = cos θ_kji, or
‖x_i − a_k‖² = d_jk² + d_ji² − 2 d_jk d_ji cos θ_kji,     (5.3)

where θ_kji is the angle between anchor k and sensor i as seen from sensor j, d_ji is the distance between sensors j and i, and d_jk is the distance between anchor k and sensor
j. Now, by applying the SDP relaxation described in previous chapters, i.e., using the substitution Y = XᵀX and relaxing this constraint to Y ⪰ XᵀX, the problem consisting of quadratic constraints of the type described above can once again be reduced to an SDP. Note that the relaxation used is exactly the same as the one introduced in Chapter 2.
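Constraint (5.2) is simply the cosine law, which can be verified numerically on synthetic positions (in practice the angle and the distances are measured rather than computed from coordinates):

```python
import math

# Hypothetical 2-D positions for sensors i, j, l
xi, xj, xl = (0.0, 0.0), (2.0, 1.0), (3.0, -1.0)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

d_ji, d_jl = dist(xj, xi), dist(xj, xl)

# Angle between i and l as seen from j
dot = (xj[0] - xi[0]) * (xj[0] - xl[0]) + (xj[1] - xi[1]) * (xj[1] - xl[1])
theta_ijl = math.acos(dot / (d_ji * d_jl))

# Cosine law (5.2): ||x_l - x_i||^2 = d_ji^2 + d_jl^2 - 2 d_ji d_jl cos(theta)
lhs = dist(xl, xi) ** 2
rhs = d_ji ** 2 + d_jl ** 2 - 2 * d_ji * d_jl * math.cos(theta_ijl)
assert abs(lhs - rhs) < 1e-9
```

Constraint (5.3) is the same identity with the anchor a_k playing the role of one endpoint.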
5.2.3 Pure angle information
Next, we consider the localization problem with pure angle information. The problem can be formulated in a variety of ways, depending on how we choose to exploit the angle constraints in our equations. The idea is to maintain the Euclidean distance geometry 'flavor' of our approach, where the quadratic term in the formulation is simply relaxed into an SDP. With this aim in mind, the following solution is presented. The same situation (with regard to angle measurements) as considered in the case of combined distance and angle information is used here as well; the only difference is that absolutely no distance information between sensors is available.
Figure 5.2: Three points, the circumscribed circle and related angles; ∠AOB = 2∠ACB.
Consider three points x_i, x_j and x_k that are not located on the same line. One basic property of circles is that the angle subtended by two of these points, say x_i and x_k, at the
center of the circle that circumscribes the three points is twice the angle subtended by the two points at the third point x_j; see Figure 5.2. Let the radius of the circumscribing circle be r_ijk, which is also unknown. Then, by a simple application of the cosine law (similar to the one described in the previous section) to the triangle formed by the points x_i, x_k and the center of the circle, we get

‖x_i − x_k‖² = r_ijk² + r_ijk² − 2 r_ijk r_ijk cos 2φ_ijk,
‖x_i − x_k‖² = 2 r_ijk² (1 − cos 2φ_ijk).     (5.4)

We can create a similar set of equations in the unknowns x_i, x_j, x_k and r_ijk² using the angles between all such sets of points. These sets of equations can be generated for all sets of points that have mutual angle information with each other. For each set of three points, a new unknown r_ijk² is introduced.
It can be observed that the constraints are quadratic in the matrix X = [x_1 x_2 ... x_n]. By applying the SDP relaxation, i.e., using the substitution Y = XᵀX and relaxing this constraint to Y ⪰ XᵀX, the problem is reduced once again to an SDP with one matrix inequality and some linear constraints, in the unknowns Y, X and r_ijk². Furthermore, we can also include the linear constraints introduced by angle measurements at the anchors, as described in Equation (5.1).

It is pertinent to ask whether the introduction of the new unknowns r_ijk² will cause the problem to be under-determined. However, given enough angle relations, there will be fewer unknowns than constraints, and the network can be localized accurately. A brief analysis is presented to demonstrate that the number of constraints is larger than the number of unknowns, and that therefore a unique feasible solution must exist. The matrix Y has n(n + 1)/2 unknowns and X has 2n unknowns. Suppose the number of triplets that have mutual angle information amongst each other is k, and the angle measurements considered are exact, i.e., they do not have any error. Then k more unknowns, corresponding to the radii of the circumscribing circles, are introduced. At the same time, we have three constraints for each circle (one for the angle subtended at each point of the triplet), so a total of 3k constraints exist. Hence, as long as 3k > k + n(n + 1)/2 + 2n, the set of equations will have a unique feasible solution. The matrix X* = [x̂_1 x̂_2 ... x̂_n] and Y* = (X*)ᵀX*, where x̂_i is the true position of sensor i, will be the solution satisfying these equations. Therefore, if k is large enough, the unique solution will exist. The maximum possible value of k is (n choose 3). If the connectivity is high, then the size of the triplet set k is also large
enough for the set of equations to have a unique solution.
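The counting argument can be illustrated in a few lines (a sketch that only compares counts of unknowns and constraints; it says nothing about the independence of the equations):

```python
from math import comb

def min_triplets_for_unique_solution(n):
    """Smallest k with 3k > k + n(n+1)/2 + 2n, i.e. 2k > n(n+1)/2 + 2n,
    per the counting argument above."""
    unknowns_without_radii = n * (n + 1) // 2 + 2 * n   # entries of Y and X
    return unknowns_without_radii // 2 + 1

n = 40
k_needed = min_triplets_for_unique_solution(n)
k_max = comb(n, 3)   # at most (n choose 3) triplets exist
assert 3 * k_needed > k_needed + n * (n + 1) // 2 + 2 * n
assert k_needed <= k_max   # for n = 40 there are easily enough triplets
```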
5.3 Practical considerations

5.3.1 Noisy measurements
In realistic scenarios, the angle information is not entirely accurate. In our model, we assume that all measurements are made relative to the axes of the sensors, and these measurements will be noisy. The noise model is described in Section 5.4, where the simulations are presented. Softer SDP models that minimize the errors in the constraints can be developed from the constraints discussed in the previous section by introducing slack variables. Note that our formulation allows the flexibility of choosing any combination of exact or inexact distance and angle constraints, depending on the information provided. For example, consider again a constraint of the type (5.4):

‖x_i − x_k‖² = 2 r_ijk² (1 − cos 2φ_ijk).

With noisy angle information, it can be modified to

‖x_i − x_k‖² = 2 r_ijk² (1 − cos 2φ_ijk) + α_ijk,     (5.5)
where α_ijk is the error in the equation. In the resulting SDP, we can then choose to minimize an objective function given by the sum of the absolute values of these errors. We can also choose alternative objective functions that penalize the errors in the equations differently; in the results presented here, the above-mentioned objective function is used. An interesting alternative objective is the sum of the squares of the α's, in which case the SDP formulation becomes a relaxation of a nonlinear least squares problem. The choice of a suitable objective function that gives the best performance for a particular noise model while remaining computationally tractable is a topic for further research. The SDP solution can also serve as a good starting point for local refinement through a local optimization method. This idea was explored for distance-based information in Section 3.6; extending local refinement ideas to angle information is being pursued. It is also interesting to observe that when no anchors are used in the pure angle approach, the size of the network can be scaled by any constant α and the angle relations will still hold. But in the absence of any distance information, we do not know what the
scaling constant should be. The solution will also be a rotated and translated version of the original network. We still obtain a relative map, which may be of tremendous use in applications like routing. However, these problems only present themselves when there are no anchors or distance information at all. Having well placed anchors in the network helps 'ground' the network by providing reference information about a global coordinate system. If enough anchor information is provided, we arrive at the exact position estimates without any rotation or translation. In the case of noisy angle information, the placement of anchors assumes a critical role for both approaches, as was also the case when just distance information was used in Chapter 3. If the anchors are placed in the interior of the network, the estimates for points outside the convex hull of the anchor points also tend to be pulled towards the interior of the network and are therefore quite erroneous. This problem does not present itself when the anchors are placed on the perimeter of the network. Once again, the addition of a regularization term in the objective function (to pull the points far apart and obtain a low-dimensional realization), as described in Section 3.4, should help remove the sensitivity to anchor placement, and is a part of future research.
5.3.2
Computational costs
Because we consider triplets of points, the number of angle constraints of the type (5.2) and (5.4) grows substantially, as O(n³), as the network size n and radio range R increase. It is very inefficient to include all the angle constraints when only a few of them suffice to obtain the solution: the problem becomes strongly over-determined when in fact most of the constraints are redundant and the problem can be solved using only a subset of them. For example, even for a network of 40 points, the number of constraints for a radio range of 0.3 and omnidirectional angle measurements is about 1100. Clearly, the problem starts becoming intractable even for such small cases. We therefore propose an intelligent constraint generation methodology that keeps the number of selected constraints small enough for the problem to remain tractable. In selecting the constraints, we aim to ensure that they are well distributed over the points they correspond to, i.e., there are not too many or too few constraints arising from angle relations for any single point in the network. This can be done by limiting the number of constraints arising from angle relations for a particular point to below a given threshold. Furthermore, the constraints are between
pairs of points, so the constraint selection should not concentrate only on angle constraints corresponding to a particular pair of points as seen from other points. While this can be done by keeping a counter of the number of constraints on every possible pair, this strategy clearly adds a lot of overhead to the actual processing. To minimize the cost of constraint selection, we use a randomized technique in which each constraint is added to the problem with a certain probability. This probability can be made to depend on the number of desired constraints as a fraction of the total number of possible constraints. The randomized selection ensures that the constraints take into account the global information of the network as opposed to smaller local clusters of points. The scheme could be enhanced further by an active constraint generation methodology in which we solve the SDP once with a subset of constraints and then solve it again after adding in the constraints that were violated in the initial pass; it may however be expensive to check for all the violated constraints, and we have not explored this strategy further in this thesis. While the above methods work well for networks of up to 70-80 points, severe scalability issues arise for very large networks of, say, thousands of points. A distributed method for angle information, similar in spirit to that described in Section 3.7 for pure distance information, is a direction for future research: the entire network would be divided into clusters using anchor information, and a smaller SDP solved for each cluster. The cluster idea still takes advantage of the collective information among the nodes within a cluster, while for points in well separated clusters there is little or no mutual information to begin with, so the problem can be solved in a parallel and ‘decoupled’ fashion in each of the separate clusters.
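As an illustration (not part of the thesis code), the randomized, per-point-capped selection described above might be sketched as follows; the function name and parameters are hypothetical:

```python
import random
from collections import defaultdict

def select_constraints(triples, target, per_point_cap, seed=0):
    """Randomly subsample angle constraints.

    triples: list of (i, j, k) index triples, each a candidate constraint.
    target: desired number of constraints overall (soft target).
    per_point_cap: maximum number of selected constraints any single
        point may participate in, keeping the selection well distributed.
    """
    rng = random.Random(seed)
    # acceptance probability = desired fraction of all candidates
    p = min(1.0, target / max(1, len(triples)))
    count = defaultdict(int)  # constraints selected per point so far
    selected = []
    for t in rng.sample(triples, len(triples)):  # visit in random order
        if rng.random() > p:
            continue
        if any(count[v] >= per_point_cap for v in t):
            continue  # would over-represent some point; skip
        selected.append(t)
        for v in t:
            count[v] += 1
    return selected
```

The cap plays the role of the per-point threshold mentioned in the text, while the acceptance probability implements the desired-fraction rule; both are tuning knobs rather than values prescribed by the thesis.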
The computations can be made iterative, so that points that were well estimated in one iteration can be used in subsequent iterations to provide estimates for points that were poorly estimated earlier. Information about points on the boundary of two clusters can even be exchanged between clusters in order to assist in localizing other points within them.
5.4
Simulation results
Simulations were performed on networks of 40 nodes and 4 anchors in a square region of size 1 × 1. Each node was given a random axis orientation in [−π, π], and the distances between the nodes were calculated. If the distance between two nodes i and j was less than
a given radio range R, the angle between the axis of i and the direction pointing to node j was computed. If this angle was within the field of view, specified by maxang (the maximum angle on either side of the axis at which a node can see other nodes), the angle measurement was included; otherwise it was ignored. The angle measurement mechanism is assumed to be omnidirectional (maxang = π) in our simulations unless otherwise stated. The measured angles were modeled as noisy by setting

θ = θ̂ + nf · N(0, 1),

where θ is the measured angle, θ̂ is the true value of the angle, nf is a specified constant, and N(0, 1) is a normally distributed random variable. The noise in the angle measurements is therefore additive and can be varied by changing nf. To find the relative angle between two points as seen from a third point, the measured angles corresponding to the two points are subtracted; the angle data used in the problem formulation therefore has higher noise variance than the measured angle data. The noisy distance measurements are modeled in the same way as in the previous chapters, using a multiplicative noise model:

d = d̂(1 + nf · N(0, 1)),

where d is the measured distance and d̂ is the actual distance. The average effect of varying radio range (R), field of view (maxang), noise (nf) and number of anchors on the estimation error was studied by performing 25 independent trials for each configuration. Figure 5.3 shows the variation of average estimation error (normalized by the radio range R) as the noise in the angle measurements increases, for different radio ranges. The maxang is set to π, i.e., the angle measurement mechanism is omnidirectional. Four anchors are placed near the corners of the square network, with no additional anchors in the interior.
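The measurement model just described can be sketched in a few lines; this is an illustrative reconstruction of the simulation setup, not the thesis's Matlab code, and the function name and defaults are assumptions:

```python
import math, random

def simulate_measurements(n=40, R=0.5, maxang=math.pi, nf=0.05, seed=1):
    """Generate noisy angle/distance measurements as in the simulations.

    Each node gets a random position in the unit square and a random
    axis orientation in [-pi, pi]. Node i measures node j only if j is
    within radio range R and within i's field of view (at most maxang
    on either side of its axis).
    """
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    axis = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    meas = {}  # (i, j) -> (noisy distance, noisy relative angle)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
            d = math.hypot(dx, dy)
            if d >= R:
                continue  # out of radio range
            # bearing of j relative to i's axis, wrapped to [-pi, pi]
            theta = math.atan2(dy, dx) - axis[i]
            theta = math.atan2(math.sin(theta), math.cos(theta))
            if abs(theta) > maxang:
                continue  # outside field of view
            d_noisy = d * (1 + nf * rng.gauss(0, 1))    # multiplicative noise
            theta_noisy = theta + nf * rng.gauss(0, 1)  # additive noise
            meas[(i, j)] = (d_noisy, theta_noisy)
    return pos, axis, meas
```

Note that the measurement graph is directed: with maxang < π, node i may see node j while j does not see i, which is exactly the asymmetry that makes the limited field-of-view cases hard.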
Comparing the combined distance-angle approach of Section 5.2.2 to the pure angle approach of Section 5.2.3, for smaller radio ranges the error is much higher in the pure angle case at a given radio range (R) and measurement noise (specified by nf). This behavior is expected, since the distance-angle approach employs more information to solve for the network than the pure angle approach. However, as the measurement noise increases, the error in the distance-angle case increases more rapidly, because both the distance and the angle measurements are deteriorated by the noise, as opposed
to just the angle measurements for the pure angle case. In fact, for higher radio ranges and high measurement noise, the pure angle approach starts outperforming the distance-angle approach. For example, for a radio range R = 0.6 and noise factor nf = 0.25, the pure angle case has an error of about 13% R and the distance-angle case about 17% R.

Figure 5.3: Variation of estimation error with measurement noise. (Left panel: 4 anchors, pure angle; right panel: 4 anchors, distance and angle; error (%R) vs. nf for R = 0.3–0.7.)
Another observation is that the advantage of having a higher radio range diminishes beyond a particular value. This can be explained by the fact that beyond a certain radio range there is enough distance and angle information for the network to be solved, and the extra information obtained at a higher radio range is redundant. We also disregard some of the redundant information through the random constraint sampling technique described in Section 5.3.2, in order to keep the number of constraints small enough for the problem to remain tractable. In fact, for the proposed relaxation, including the extra information may actually deteriorate the solution quality in the distance-angle case: under the multiplicative noise model, increasing the radio range means that the distance measurements (corresponding to far away points) have higher noise. This is demonstrated more effectively in Figure 5.4, where we consider average absolute estimation error. For the pure angle case, for a radio range of 0.5 and higher, the error stays more or less the same (about 7-12% R) for varying measurement noises. For the distance-angle case, as the radio range increases, the error goes down as a fraction of the radio range, but the absolute error increases. Tuning the radio range such that optimum
performance is achieved in terms of computational complexity and accuracy is desirable. It is an interesting research question to analyze what such an optimum radio range is for localizing a network, and subsequently to derive heuristics for finding it.

Figure 5.4: Variation of absolute estimation error with measurement noise. (Left panel: 4 anchors, pure angle; right panel: 4 anchors, distance and angle; absolute error vs. nf for R = 0.3–0.7.)
Figure 5.5 shows the variation of estimation error as the radio range is increased, for different measurement noises. With lower radio range, the pure angle case has high error (almost 35-45% R), but with the inclusion of more constraints as the radio range increases, the error diminishes quite rapidly (to 5-15% R). In contrast, the distance-angle case has lower error at small radio ranges to begin with, and the drop in error with increasing radio range is not as sharp (about 5% R). The wider spacing between the lines for different nf in the distance-angle case suggests, as before, that the distance-angle case is more sensitive to changes in measurement noise.

Figure 5.5: Variation of estimation error with radio range. (Left panel: 4 anchors, pure angle; right panel: 4 anchors, distance and angle; error (%R) vs. R for nf = 0.05–0.25.)

Overall, the results suggest that when the radio range and the measurement noise are low, i.e., we have few measurements but they are reliable, it is preferable to use the distance-angle approach; if there are many measurements but they have higher uncertainty, it is better to use the pure angle approach. An actual example of this behavior is presented in Figure 5.6. In Figures 5.6(a) and 5.6(b), corresponding to low radio range (R = 0.35) and low noise (nf = 0.05), the pure angle approach hardly has enough information to solve the network and performs poorly, while the distance-angle approach provides low error estimates for all the points. For the second case, in Figures 5.6(c) and 5.6(d), when the radio range is high (R = 0.7) and the noise is high (nf = 0.25), the estimates for quite a few of the points are understandably erroneous owing to the high noise, but the pure angle approach performs slightly better.

Figure 5.7 shows the variation of estimation error with increasing radio range while varying the number of anchors from 4 to 8. Four anchors were placed at the perimeter and the rest in the interior, with nf fixed at 0.1. The effect of fewer anchors is felt more strongly at low radio ranges and becomes irrelevant at higher radio ranges for the pure angle case. Overall, the effect of more anchors is not very significant; for the distance-angle case, the improvement in accuracy from increasing the number of anchors is small (only about 1-2% R). The effect of a smaller field of view is presented in Figure 5.8 for varying radio range. For low radio range (R = 0.3) and small field of view (maxang = π/4 to π/3), the accuracy is quite poor for both approaches: greater than 100% R for the pure angle case, and 50-80% R for the distance-angle case. Since distance information is also used in the latter case, its accuracy is better. The lack of angle information due to the limited field of view clearly hurts the accuracy in the pure angle case. As the radio range increases, the effect of less angle information diminishes, but less dramatically for the pure angle case than for the distance-angle case: the error is still about 45% R at R = 0.7 for the pure angle case, whereas enough information is available in the distance-angle case for the error to almost vanish.
Figure 5.6: Comparisons in different scenarios. (a) Pure angle, low radio range and noise; (b) distance and angle, low radio range and noise; (c) pure angle, high radio range and noise; (d) distance and angle, high radio range and noise.
Figure 5.7: Variation of estimation error with more anchors. (Left panel: nf = 0.1, pure angle; right panel: nf = 0.1, distance and angle; error (%R) vs. R for m = 4–8 anchors.)
The effect of the field of view is also illustrated by means of an actual example, depicted in Figure 5.9. Even with noiseless measurements, some points are estimated very poorly when maxang is set to π/3. Note that the axis orientations of the points are random. With a limited field of view and random axis orientations, it is hard to establish with certainty whether enough information is available for all the points to be estimated correctly, especially in the pure angle case. Even when points are within radio range of each other, they cannot detect each other if they are not in each other's field of view. As a result, there is very little information for some of the points, and they are estimated badly. With a higher radio range this problem can be reduced, but the performance remains very sensitive because of the random axis orientations. This also suggests that the random constraint selection may need to be fine-tuned to include more constraints for points that have very little information to begin with.

Figure 5.8: Effect of field of view: 40 node network, 4 anchors, 10% noise. (Left panel: pure angle; right panel: distance and angle; error (%R) vs. R for maxang = π/4, π/3, π/2, π.)

The number of constraints and the solution time were also analyzed. Our simulation program is implemented in Matlab and uses SeDuMi [82] as the SDP solver. The simulations were performed on a PC with a 2.0 GHz Pentium IV and 512 MB of RAM. Figures 5.10(a) and 5.10(b) illustrate how the solution time increases as the number of points in the network increases. Even with random constraint selection, the solution time appears to increase at a superlinear rate with the number of points, indicating the necessity of the scalable distributed method mentioned in Section 5.3.2. It is interesting that, due to the random constraint selection strategy, the running time is almost independent of the radio range. For example, for a network of 40 points, the number of constraints is usually around 480-520. This is illustrated in Figure 5.10(c), which shows how the solution time changes with radio range for a fixed network size of 40 points and 4 anchors. The number of constraints does not change substantially with increasing radio range, and therefore the solution time does not change much either. The pure angle approach uses more constraints and usually takes longer to solve.
5.5
Conclusions
In this chapter, the use of SDP relaxations for solving the distance geometry problem pertaining to sensor network localization using angle of arrival information was described. Formulations that use both distance and angle information, as well as just angle information, were discussed. A random constraint selection method was also described in order to reduce the computational effort. Experimental results show that the method can provide accurate localization: the distance-angle approach performs well in low radio range and low noise scenarios, and the pure angle approach performs better when the radio range and measurement noise are high. The performance becomes more unpredictable with a limited field of view; to rectify this, the random constraint selection method needs to be refined in order to include more constraints for points with very little information.

Figure 5.9: Example of effect of field of view on pure angle approach: 40 node network, 4 anchors, R = 0.5, no noise. (a) maxang = π/2; (b) maxang = π/3.

Due to the random constraint selection, the solution time is more or less independent of the radio range. However, the number of constraints and the solution time grow substantially as the number of points in the network increases. A distributed method that solves the localization problem as a set of decomposed clusters is being developed in order to make this method tractable for large networks. For noisy measurements, more accurate estimates would be obtained if the anchor nodes were placed optimally, or if a regularization term were used to ensure low dimensional solutions; these strategies need more thorough investigation. Furthermore, for dealing with high noise cases, gradient-based local optimization methods similar to those described for pure distance information in Section 3.6 must be developed.
Figure 5.10: Solution times. (a) Pure angle: solution time (sec) vs. number of points for R = 0.3–0.5; (b) distance and angle: solution time (sec) vs. number of points for R = 0.3–0.5; (c) solution time vs. radio range, 40 node network.
Chapter 6

Conclusions and future research

In this chapter, the contributions of this thesis are restated and broad research directions are discussed. The future research ideas are also discussed in more detail in the corresponding chapters.
6.1
Theoretical aspects
This thesis introduces an SDP relaxation based technique to solve the Euclidean distance geometry problem. By using the global distance information about a graph, we are able to simultaneously estimate the positions of all its vertices accurately. While aspects such as the exactness, complexity and sensitivity of this relaxation have been studied, there are still many possibilities for very interesting theoretical work in this domain. In particular, tensegrity theory has opened up avenues to further improve the SDP relaxation to achieve realizability of graphs in lower dimensions. The analysis of the algorithm under noise can also be developed further, for example by obtaining strong bounds on the estimation error of all the points and, more importantly, reliable error measures to detect badly estimated points. The key to resolving many of these issues lies in a closer inspection of the dual of the SDP relaxation. The analysis of the dual may even uncover new relaxations that are computationally more efficient and provide accurate low dimensional embeddings. Indications that such an approach is promising can be found in [94], where the graph Laplacian is used to obtain a much lower dimensional SDP formulation. Efforts are also being made to understand the complementarity between the dense distance matrix based methods and sparse Laplacian based methods for dimensionality reduction [98].
There are still significant gaps in understanding the interplay between these approaches, and a deeper investigation can yield many theoretical and practical insights not only in distance geometry but in spectral graph theory as well. Another important theoretical consideration is to characterize the types of ambiguities that exist in the point positions when the solution is not unique. By determining whether the ambiguities are discontinuous and discrete, we may be able to significantly narrow down the number of different configurations that a non-unique solution can take. The SDP objective function is crucial in deciding which of the non-unique solutions the SDP will converge to, and developing a parameterized objective function for the SDP could allow such an analysis to be carried out further. There is also scope to move to relaxations tighter than the SDP: the use of Sum of Squares techniques for polynomial global optimization has already been applied in the sensor localization context in [63] and is often able to deliver solutions that are very close to the global optimum of the non-convex problem.
6.2
Sensor network localization
The application of the SDP relaxation to sensor network localization with noisy distance measurements has been demonstrated effectively and has received significant attention from other researchers in the sensor network community. The use of global distance information, combined with the regularization term in the SDP and local refinement, provides very accurate position estimates even for networks with very noisy distance measurements. Newer formulations with penalty functions that provide the best statistical estimates for a particular distance noise model also need to be investigated. The algorithm must also be tested using actual measured data from real sensor networks. In such scenarios, the noise models could be different, e.g., additive, multiplicative or lognormal; in each case, a different penalty function provides the maximum likelihood estimate, and tailoring the objective function of the SDP accordingly could provide significantly better estimates. The presence of anchors can greatly improve the estimation accuracy. However, the infrastructure cost involved in setting up too many anchors can be overwhelming; as a result, intelligent heuristics or algorithms for effective placement of a limited number of anchors are worth looking into. A deeper analysis of the balance between the regularization and error
minimization terms in the noisy distance case is also important. In particular, if we are able to correctly balance the two terms based on just the given network size, anchor positions and incomplete distance matrix, the estimates can be much more accurate. Closer investigation of tensegrity theory would also allow the development of better regularization terms, in order to obtain low dimensional solutions even with very noisy distance measurements and limited anchors. A distributed iterative algorithm based on the SDP relaxation was also described, to counter the intractability faced in large scale sensor networks. This provides the best opportunity for an actual deployment of such a localization algorithm in a real sensor network, but more research needs to be done in deciding the cluster sizes and in developing a protocol that minimizes the cluster setup and communication costs.
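As a hedged illustration of the point above about noise models and penalty functions (standard maximum-likelihood reasoning, not a result from this thesis): if the distance noise is additive Gaussian, d_ij = d̂_ij + ε_ij, the maximum likelihood estimate minimizes plain squared residuals, whereas under the multiplicative model d_ij = d̂_ij(1 + ε_ij) used in the simulations, the residuals should be normalized by the estimated distances:

```latex
\min_{x}\ \sum_{(i,j)} \big( d_{ij} - \|x_i - x_j\| \big)^{2}
\qquad\text{vs.}\qquad
\min_{x}\ \sum_{(i,j)} \left( \frac{d_{ij} - \|x_i - x_j\|}{\|x_i - x_j\|} \right)^{2} .
```

Tailoring the SDP objective toward the second form when the noise is multiplicative is one concrete instance of the penalty-function tuning discussed above.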
6.3
Molecule structure determination
The large scale distributed implementation of the SDP relaxation technique was also demonstrated for molecules with thousands of atoms. We proposed intelligent clustering and stitching algorithms for solving the kind of large scale problems encountered in this space, and using these ideas we were able to solve for molecules with thousands of atoms. The algorithm still needs improvement for the combination of very noisy and sparse distance data. In this regard, there is particular scope for improvement in finding reliable error measures for discarding points, and in more robust cluster stitching algorithms: the failure cases of the algorithm indicate that it is in these two steps that the error goes out of control, and with more reliable techniques for performing them we could solve even larger molecules with irregular geometries and with much looser distance constraints. The size of the problems considered also highlights the need for faster and more scalable SDP solvers. In fact, significant improvements are noticed even when more sophisticated local refinement methods, such as truncated or inexact Newton methods, are used instead of a naive gradient descent algorithm. The algorithm also needs to be tested with actual NMR data and other data, such as the van der Waals (VDW) constraints. Hopefully, proteins whose structures have so far not been determined can be solved using this algorithm.
6.4
Extensions to angle information
The basic distance geometry model was also extended to handle angle information in R². The efficacy of these techniques was demonstrated by simulations on sensor networks (image based sensor networks, for example) that use a mixture of distance and angle information, or pure angle information, for localization. However, this approach must be made more computationally efficient and robust to noise and sparsity. A randomized constraint selection method was used to control the problem size for large networks, but it is currently still a strongly heuristic method; better constraint selection would help improve the accuracy and computational performance, especially when the angle measurement mechanism is very noisy and the sensors have a very limited field of view. The addition of regularization terms and the development of a distributed method and a local refinement method, much like the ones described for pure distance information in Sections 3.5, 3.7 and 3.6 respectively, are also required. Deeper theoretical analysis of the uniqueness and exactness of the SDP solution (as was done for distances in Section 2.3), and of the sensitivity to noise and limited fields of view, could provide insights into how the above practical issues may best be dealt with. NMR measurements are also able to measure a few orientation constraints in the form of dihedral angles between the planes defined by two sets of three points each. Such information may further constrain the solution space, which may be very useful in the case of sparse distance information. Preliminary extensions to account for dihedral angle constraints in molecule structure determination have been developed [28], but they suffer from overwhelming computational costs (due to the larger number of variables and far too many constraints). Intelligent schemes that incorporate only the required dihedral angle constraints while maintaining a manageable problem size need to be developed.
Bibliography

[1] A. Y. Alfakih, A. Khandani, and H. Wolkowicz. Solving Euclidean distance matrix completion problems via semidefinite programming. Comput. Optim. Appl., 12(1-3):13–30, 1999.

[2] F. Alizadeh. Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM Journal on Optimization, 5(1):13–51, 1995.

[3] G. W. Allen, K. Lorincz, M. Welsh, O. Marcillo, J. Johnson, M. Ruiz, and J. Lees. Deploying a wireless sensor network on an active volcano. IEEE Internet Computing, 10(2):18–25, 2006.

[4] J. Aspnes, D. Goldenberg, and Y. Yang. On the computational complexity of sensor network localization. In Proceedings of the First International Workshop on Algorithmic Aspects of Wireless Sensor Networks, 2004.

[5] A. I. Barvinok. Problems of distance geometry and convex properties of quadratic maps. Discrete & Computational Geometry, 13:189–202, 1995.

[6] A. Basu, J. Gao, J. S. B. Mitchell, and G. Sabhnani. Distributed localization using noisy distance and angle information. In MobiHoc ’06: Proceedings of the Seventh ACM International Symposium on Mobile Ad-hoc Networking and Computing, pages 262–273, New York, NY, USA, 2006. ACM Press.

[7] S. J. Benson, Y. Ye, and X. Zhang. Solving large-scale sparse semidefinite programs for combinatorial optimization. SIAM Journal on Optimization, 10(2):443–461, 2000.

[8] H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. The Protein Data Bank. Nucleic Acids Research, 28:235–242, 2000.
[9] M. Betke and L. Gurvits. Mobile robot localization using landmarks. IEEE Transactions on Robotics and Automation, 13(2), April 1997.

[10] P. Biswas, T. C. Liang, K. C. Toh, T. C. Wang, and Y. Ye. Semidefinite programming approaches for sensor network localization with noisy distance measurements. IEEE Transactions on Automation Science and Engineering, Special Issue on Distributed Sensing, 3(4):360–371, October 2006.

[11] P. Biswas, T. C. Liang, T. C. Wang, and Y. Ye. Semidefinite programming based algorithms for sensor network localization. ACM Transactions on Sensor Networks, 2(2):188–220, 2006.

[12] P. Biswas, K. C. Toh, and Y. Ye. A distributed SDP approach for large-scale noisy anchor-free 3D graph realization (with applications to molecular conformation). Technical report, Dept. of Management Science and Engineering, Stanford University; submitted to SIAM Journal on Scientific Computing, February 2007.

[13] P. Biswas and Y. Ye. A distributed method for solving semidefinite programs arising from ad hoc wireless sensor network localization. Technical report, Dept. of Management Science and Engineering, Stanford University; to appear in Multiscale Optimization Methods and Applications, October 2003.

[14] P. Biswas and Y. Ye. Semidefinite programming for ad hoc wireless sensor network localization. In Proceedings of the Third International Symposium on Information Processing in Sensor Networks, pages 46–54. ACM Press, 2004.

[15] S. Boyd, L. E. Ghaoui, E. Feron, and V. Balakrishnan. Linear Matrix Inequalities in System and Control Theory. SIAM, 1994.

[16] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, March 2004.

[17] S. E. Brenner. A tour of structural genomics. Nature Reviews Genetics, 2(10):801–809, October 2001.

[18] J. Bruck, J. Gao, and A. Jiang. Localization and routing in sensor networks by local angle information. In MobiHoc ’05: Proceedings of the Sixth ACM International
Symposium on Mobile Ad-hoc Networking and Computing, pages 181–192, New York, NY, USA, 2005. ACM Press.

[19] N. Bulusu, J. Heidemann, and D. Estrin. GPS-less low cost outdoor localization for very small devices. Technical report, Computer Science Department, University of Southern California, April 2000.

[20] S. Burer and R. D. C. Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, 2003.

[21] M. Carter, H. H. Jin, M. A. Saunders, and Y. Ye. SpaseLoc: An adaptable subproblem algorithm for scalable wireless network localization. Submitted to the SIAM Journal on Optimization, January 2005.

[22] B. Chazelle, C. Kingsford, and M. Singh. The side-chain positioning problem: a semidefinite programming formulation with new rounding schemes. In PCK50: Proceedings of the Paris C. Kanellakis Memorial Workshop on Principles of Computing & Knowledge, pages 86–94. ACM Press, 2003.

[23] K. Chintalapudi, J. Paek, O. Gnawali, T. S. Fu, K. Dantu, J. Caffrey, R. Govindan, E. Johnson, and S. Masri. Structural damage detection and localization using NETSHM. In IPSN ’06: Proceedings of the Fifth International Conference on Information Processing in Sensor Networks, pages 475–482, New York, NY, USA, 2006. ACM Press.

[24] F. Chung, M. Garrett, R. Graham, and D. Shallcross. Distance realization problems with applications to internet tomography. Journal of Computer and System Sciences, 63:432–448, November 2001.

[25] J. A. Costa, N. Patwari, and A. O. Hero III. Distributed weighted-multidimensional scaling for node localization in sensor networks. ACM Transactions on Sensor Networks, 2(1):39–64, 2006.

[26] T. Cox and M. A. A. Cox. Multidimensional Scaling. Chapman & Hall/CRC, London, 2001.
[27] G. Crippen and T. Havel. Distance Geometry and Molecular Conformation. Wiley, 1988.
[28] J. Dattorro. Convex Optimization and Euclidean Distance Geometry. Meboo Publishing, 2006.
[29] L. Doherty, L. E. Ghaoui, and K. S. J. Pister. Convex position estimation in wireless sensor networks. In Proceedings of IEEE Infocom, pages 1655–1663, Anchorage, Alaska, April 2001.
[30] Q. Dong and Z. Wu. A geometric build-up algorithm for solving the molecular distance geometry problem with sparse distance data. Journal of Global Optimization, 26(3):321–333, 2003.
[31] T. Eren, D. Goldenberg, W. Whiteley, Y. R. Yang, A. S. Morse, B. D. O. Anderson, and P. N. Belhumeur. Rigidity, computation, and randomization in network localization. In Proceedings of IEEE Infocom, pages 1655–1663, 2004.
[32] R. J. Fontana, E. Richley, and J. Barney. Commercialization of an ultra wideband precision asset location system. In Proceedings of the 2003 IEEE Conference on Ultra Wideband Systems and Technologies, pages 369–373, 2003.
[33] A. George and J. W. Liu. Computer Solution of Large Sparse Positive Definite Systems. Prentice Hall, 1981.
[34] D. Gleich, L. Zhukov, M. Rasmussen, and K. Lang. The world of music: SDP layout of high dimensional data. In InfoVis 2005.
[35] N. Guex and M. C. Peitsch. SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling. Electrophoresis, 18:2714–2723, 1997.
[36] O. Güler and Y. Ye. Convergence behavior of interior point algorithms. Mathematical Programming, 60:215–228, 1993.
[37] T. F. Havel and K. Wüthrich. An evaluation of the combined use of nuclear magnetic resonance and distance geometry for the determination of protein conformation in solution. Journal of Molecular Biology, 182:281–294, 1985.
[38] T. He, S. Krishnamurthy, J. A. Stankovic, T. Abdelzaher, L. Luo, R. Stoleru, T. Yan, L. Gu, J. Hui, and B. Krogh. Energy-efficient surveillance system using wireless sensor networks. In MobiSys '04: Proceedings of the Second International Conference on Mobile Systems, Applications, and Services, pages 270–283, New York, NY, USA, 2004. ACM Press.
[39] B. A. Hendrickson. The Molecular Problem: Determining Conformation from Pairwise Distances. PhD thesis, Cornell University, 1991.
[40] J. Hightower and G. Borriello. Location systems for ubiquitous computing. Computer, 34(8):57–66, 2001.
[41] A. Howard, M. Matarić, and G. Sukhatme. Relaxation on a mesh: a formalism for generalized localization. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1055–1060, Wailea, Hawaii, October 2001.
[42] G. Iyengar, D. Phillips, and C. Stein. Approximation algorithms for semidefinite packing problems with applications to Maxcut and graph coloring. In Eleventh Conference on Integer Programming and Combinatorial Optimization, Berlin, 2005.
[43] B. Jackson and T. Jordan. Connected rigidity matroids and unique realizations of graphs, 2003.
[44] H. H. Jin. Scalable Sensor Localization Algorithms for Wireless Sensor Networks. PhD thesis, Department of Mechanical and Industrial Engineering, University of Toronto, 2005.
[45] S. Joshi and S. Boyd. An efficient method for large-scale gate sizing. Technical report, Dept. of Electrical Engineering, Stanford University; submitted to IEEE Transactions on Circuits and Systems, 2006.
[46] E. Kaplan. Understanding GPS: Principles and Applications. Artech House, 1996.
[47] C. T. Kelley. Iterative Methods for Linear and Nonlinear Equations. SIAM, 1995.
[48] S. J. Kim, K. Koh, M. Lustig, S. Boyd, and D. Gorinevsky. A method for large-scale l1-regularized least squares problems with applications in signal processing and statistics. Technical report, Dept. of Electrical Engineering, Stanford University, 2007.
[49] R. Kimmel. Numerical Geometry of Images: Theory, Algorithms, and Applications. Springer-Verlag, 2003.
[50] K. Koh, S. J. Kim, and S. Boyd. An interior-point method for large-scale l1-regularized logistic regression. Technical report, Dept. of Electrical Engineering, Stanford University; to appear in Journal of Machine Learning Research, 2007.
[51] L. Krishnamurthy, R. Adler, P. Buonadonna, J. Chhabra, M. Flanigan, N. Kushalnagar, L. Nachman, and M. Yarvis. Design and deployment of industrial sensor networks: experiences from a semiconductor plant and the North Sea. In SenSys '05: Proceedings of the Third International Conference on Embedded Networked Sensor Systems, pages 64–75, New York, NY, USA, 2005. ACM Press.
[52] M. Laurent. Matrix completion problems. The Encyclopedia of Optimization, 3:221–229, 2001.
[53] T. C. Liang, T. C. Wang, and Y. Ye. A gradient search method to round the SDP relaxation solution for ad hoc wireless sensor network localization. Technical report, Dept. of Management Science and Engineering, Stanford University, August 2004.
[54] N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15:215–245, 1995.
[55] K. Lorincz, D. J. Malan, T. R. F. Fulford-Jones, A. Nawoj, A. Clavel, V. Shnayder, G. Mainland, M. Welsh, and S. Moulton. Sensor networks for emergency response: Challenges and opportunities. IEEE Pervasive Computing, 3(4):16–23, 2004.
[56] D. G. Luenberger. Linear and Nonlinear Programming. Addison-Wesley, Reading, 1986.
[57] A. Mainwaring, J. Polastre, R. Szewczyk, D. Culler, and J. Anderson. Wireless sensor networks for habitat monitoring. In ACM International Workshop on Wireless Sensor Networks and Applications (WSNA '02), Atlanta, GA, September 2002.
[58] J. Moré and Z. Wu. ε-optimal solutions to distance geometry problems via global continuation. Preprint MCS-P520-0595, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Ill., May 1994.
[59] J. Moré and Z. Wu. Global continuation for distance geometry problems. SIAM Journal on Optimization, 7:814–836, 1997.
[60] D. Niculescu. Positioning in ad hoc sensor networks. IEEE Network Magazine, 2004.
[61] D. Niculescu and B. Nath. Ad hoc positioning system (APS). In GLOBECOM (1), pages 2926–2931, 2001.
[62] D. Niculescu and B. Nath. Ad hoc positioning system (APS) using AoA. In INFOCOM, San Francisco, CA, 2003.
[63] J. Nie. Sum of squares method for sensor network localization, 2006.
[64] J. Nocedal and S. J. Wright. Numerical Optimization. Springer-Verlag, 1999.
[65] P. M. Pardalos and X. Liu. A tabu based pattern search method for the distance geometry problem. In New Trends in Mathematical Programming, pages 223–234. Kluwer Academic Publishers, Boston, MA, 1998.
[66] N. Patwari, A. O. Hero III, and A. Pacholski. Manifold learning visualization of network traffic data. In MineNet '05: Proceedings of the 2005 ACM SIGCOMM Workshop on Mining Network Data, pages 191–196, New York, NY, USA, 2005. ACM Press.
[67] N. B. Priyantha, A. K. L. Miu, H. Balakrishnan, and S. Teller. The Cricket compass for context-aware mobile applications. In Mobile Computing and Networking, pages 1–14, 2001.
[68] R. Reams, G. Chatham, W. K. Glunt, D. McDonald, and T. L. Hayden. Determining protein structure using the distance geometry program APA. Computers & Chemistry, 23(2):153–163, 1999.
[69] C. Savarese, J. M. Rabaey, and K. Langendoen. Robust positioning algorithms for distributed ad-hoc wireless sensor networks. In Proceedings of the General Track: 2002 USENIX Annual Technical Conference, pages 317–327. USENIX Association, 2002.
[70] A. Savvides, C. C. Han, and M. B. Srivastava. Dynamic fine-grained localization in ad-hoc networks of sensors. In Mobile Computing and Networking, pages 166–179, 2001.
[71] A. Savvides, H. Park, and M. B. Srivastava. The bits and flops of the n-hop multilateration primitive for node localization problems. In Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications, pages 112–121. ACM Press, 2002.
[72] A. Savvides, M. Srivastava, L. Girod, and D. Estrin. Localization in sensor networks. Wireless Sensor Networks, pages 327–349, 2004.
[73] J. B. Saxe. Embeddability of weighted graphs in k-space is strongly NP-hard. In Proceedings of the 17th Allerton Conference on Communication, Control, and Computing, pages 480–489, 1979.
[74] I. J. Schoenberg. Remarks to Maurice Fréchet's article "Sur la définition axiomatique d'une classe d'espaces vectoriels distanciés applicables vectoriellement sur l'espace de Hilbert". Annals of Mathematics, 36:724–732, 1935.
[75] Y. Shang and W. Ruml. Improved MDS-based localization. In Proceedings of the 23rd Conference of the IEEE Communications Society (Infocom 2004), 2004.
[76] Y. Shang, W. Ruml, Y. Zhang, and M. P. J. Fromherz. Localization from connectivity in sensor networks. IEEE Transactions on Parallel and Distributed Systems, 15(11):961–974, 2004.
[77] Y. Shang, W. Ruml, Y. Zhang, and M. P. J. Fromherz. Localization from mere connectivity. In Proceedings of the Fourth ACM International Symposium on Mobile Ad Hoc Networking and Computing, pages 201–212. ACM Press, 2003.
[78] P. Skraba, L. Savidge, and H. Aghajan. Decentralized node location for wireless image sensor networks. Technical report, Wireless Sensor Networks Lab, Stanford University.
[79] A. M. C. So and Y. Ye. Theory of semidefinite programming for sensor network localization. In SODA '05: Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 405–414, Philadelphia, PA, USA, 2005. Society for Industrial and Applied Mathematics.
[80] A. M. C. So and Y. Ye. A semidefinite programming approach to tensegrity theory and realizability of graphs. In SODA '06: Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 766–775, New York, NY, USA, 2006. ACM Press.
[81] S. Stillman and I. Essa. Towards reliable multimodal sensing in aware environments, 2001.
[82] J. F. Sturm. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optimization Methods and Software, 11 & 12:625–633, 1999.
[83] K. C. Toh. Solving large scale semidefinite programs via an iterative solver on the augmented systems. SIAM Journal on Optimization, 14(3):670–698, 2003.
[84] K. C. Toh, M. J. Todd, and R. H. Tütüncü. SDPT3 — a Matlab software package for semidefinite programming. Optimization Methods and Software, 11:545–581, 1999.
[85] W. S. Torgerson. Multidimensional scaling: I. Theory and method. Psychometrika, 17:401–409, 1952.
[86] M. W. Trosset. Applications of multidimensional scaling to molecular conformation. Computing Science and Statistics, 29:148–152, 1998.
[87] M. W. Trosset. Distance matrix completion by numerical optimization. Computational Optimization and Applications, 17(1):11–22, 2000.
[88] M. W. Trosset. Extensions of classical multidimensional scaling via variable reduction. Computational Statistics, 17(2):147–162, 2002.
[89] P. Tseng. Second-order cone programming relaxation of sensor network localization. Submitted to SIAM Journal on Optimization, August 2005.
[90] R. H. Tütüncü, K. C. Toh, and M. J. Todd. Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming, Series B, 95:189–217, 2003.
[91] D. Vitkup, E. Melamud, J. Moult, and C. Sander. Completeness in structural genomics. Nature Structural Biology, 8(6):559–566, June 2001.
[92] K. Q. Weinberger and L. K. Saul. Unsupervised learning of image manifolds by semidefinite programming. International Journal of Computer Vision, 70(1):77–90, 2006.
[93] K. Q. Weinberger, F. Sha, and L. K. Saul. Learning a kernel matrix for nonlinear dimensionality reduction. In ICML '04: Proceedings of the Twenty-First International Conference on Machine Learning, page 106, New York, NY, USA, 2004. ACM Press.
[94] K. Q. Weinberger, F. Sha, Q. Zhu, and L. Saul. Graph Laplacian methods for large-scale semidefinite programming, with an application to sensor localization. In B. Schölkopf, J. Platt, and T. Hofmann, editors, Advances in Neural Information Processing Systems 19. MIT Press, Cambridge, MA, 2007.
[95] G. A. Williams, J. M. Dugan, and R. B. Altman. Constrained global optimization for estimating molecular structure from atomic distances. Journal of Computational Biology, 8(5):523–547, 2001.
[96] D. Wu and Z. Wu. An updated geometric build-up algorithm for molecular distance geometry problems with sparse distance data. Submitted, 2003.
[97] K. Wüthrich. NMR of Proteins and Nucleic Acids. John Wiley & Sons, New York, 1986.
[98] L. Xiao, J. Sun, and S. Boyd. A duality view of spectral methods for dimensionality reduction. In ICML '06: Proceedings of the 23rd International Conference on Machine Learning, pages 1041–1048, 2006.
[99] N. Xu, S. Rangwala, K. Chintalapudi, D. Ganesan, A. Broad, R. Govindan, and D. Estrin. A wireless sensor network for structural monitoring, 2004.
[100] G. Xue and Y. Ye. An efficient algorithm for minimizing a sum of Euclidean norms with applications. SIAM Journal on Optimization, 7(4):1017–1036, 1997.
[101] Y. Ye and J. Zhang. An improved algorithm for approximating the radii of point sets. In RANDOM-APPROX, pages 178–187, 2003.
[102] J. M. Yoon, Y. Gad, and Z. Wu. Mathematical modeling of protein structure using distance geometry. Technical report, Department of Computational & Applied Mathematics, Rice University, 2000.
[103] G. Young and A. S. Householder. Discussion of a set of points in terms of their mutual distances. Psychometrika, 3:19–22, 1938.
[104] L. Yuan, C. Gui, C. Chuah, and P. Mohapatra. Applications and design of heterogeneous and/or broadband sensor networks. In Broadnets, San Jose, CA, 2004.
[105] Z. Zhang and H. Zha. Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM Journal on Scientific Computing, 26(1):313–338, 2005.