2006:285 CIV
MASTER'S THESIS
Solving Linear Optimization Problems using a Simplex Like Boundary Point Method in Dual Space
Per Bergström
Luleå University of Technology
MSc Programmes in Engineering, Engineering Physics
Department of Mathematics
2006:285 CIV - ISSN: 1402-1617 - ISRN: LTU-EX--06/285--SE
Solving Linear Optimization Problems using a Simplex like boundary point method in dual space

by Per Bergström

Department of Mathematics
Luleå University of Technology
SE-971 87 Luleå, Sweden

October 1, 2006
Abstract
This thesis treats an algorithm that solves linear optimization problems. The algorithm is based on an idea similar to the simplex method, but the starting value may also be an inner point or a boundary point of the feasible region; it does not have to be the corner point that the simplex algorithm demands. The algorithm solves problems in standard form, that is, with equality constraints and nonnegative variables. It operates in the dual space of the original problem. During the progress of the algorithm the iteration points lie on the boundary of the feasible region of the dual problem and finally end up in a corner. From there the iterates go from corner to corner until the optimum is reached, just like in the simplex algorithm. The expected time to solve linear optimization problems with this algorithm appears to be polynomial in the size of the problem, though the worst case behavior has not been analyzed. If the last iterate is only an approximate solution to the dual problem, the algorithm transfers it to a reasonably good approximation of the primal solution. Much of the development in this thesis is a continuation of a similar algorithm from one year ago. In the introduction and in the second chapter, different forms of linear optimization problems are described. The algorithm is implemented in Matlab® and the code can be found in the appendices of this thesis. There are also different versions, which solve different types of problems: one for general problems and one for network flow problems.
Acknowledgments
First of all I would like to thank my supervisor, associate professor Ove Edlund. He took his doctor's degree in scientific computing, and a major part of his doctoral thesis deals with linear programming. The interested reader can find it on the web; there is a link to Ove's thesis in the references, reference [1]. It was perfect to have a supervisor like Ove when experimenting with and writing about linear programming; it is always good to have a supervisor experienced in the subject when doing research. This thesis has been time consuming; at least it took a long while from start to finish. Before I finished my master thesis I was offered an interesting trainee job at the Swedish company Tomlab¹, the worldwide leading company in optimization solvers for Matlab® and LabVIEW™. The company has offices in Västerås (Sweden) and San Diego (USA, CA), and I was placed in San Diego. It was very exciting to live for a while in another part of the world than northern Sweden, where I was born and usually live. The trainee job at Tomlab had no direct connection with this thesis, but it gave me an understanding of how extremely important optimization is for society today. It also gave me lots of experience in computer science and, last but not least, it taught me what it is like to work in industry. It was not a hard decision to accept the job offer from Tomlab, even if I was forced to finish this thesis during the lovely Swedish summer.

Luleå, October 1, 2006
Per Bergström
¹ http://www.tomlab.biz/
Table of Contents

List of Tables

1. Introduction
2. Linear Optimization and Sparse Matrices
   2.1. Example 1. A Transportation Problem
   2.2. Example 2. Relay swimming race
   2.3. Example of network flow problem
3. Simplex method
4. Simplex boundary method
   4.1. An improvement
   4.2. Dual problem
5. Test result
   5.1. Dense matrices
   5.2. Network flow problems
   5.3. Netlib test problems
6. Conclusion
7. Other methods
Appendix A. Sketch of a convergence proof
Appendix B. lpmin.m
Appendix C. lpmin_flow.m
References
List of Tables

Table 2.1. The yearly capacities and requirements in suitable units.
Table 2.2. Unit costs for supplying each customer from each supplier.
Table 2.3. The swimmers' personal bests expressed in seconds.
Table 5.1. Comparison between two different algorithms. m is the number of equality constraints and t is measured in seconds.
Table 5.2. Comparison between two different algorithms. m is the number of equality constraints and t is measured in seconds.
Table 5.3. Results from comparison with problems from simsys_sparse. m is the number of equality constraints and t is measured in seconds.
Table 5.4. Results from comparison with problems from simsys_sparse2. m is the number of equality constraints and t is measured in seconds.
Table 5.5. Results from comparison with problems from Netlib. t is measured in seconds.
1. Introduction
This master thesis is a sequel to a D-level thesis that was written one year ago. The results from that work were good enough to make a continuation on the same theme possible. The algorithm in the D-level thesis solves problems in one particular mathematical form, which is not the usual form of applied problems. The goal in this thesis is to write code suited for applied problems, and in the beginning of this thesis some applied problems will be described. Two different terms are used, linear programming and linear optimization, but they describe the same thing.

First of all, something should be said about linear programming and its history. For readers who are not familiar with the subject, it all started about 60 years ago, during World War II; at least, many things in this area took shape at that time. Many people were involved in developing and improving methods for solving linear programming problems, but one person has had a special influence and has been acknowledged as the founder of the first efficient solution algorithm, the simplex method: George Dantzig. Dantzig worked with military planning at the Pentagon for the U.S. Air Force during World War II. He was an expert in solving planning problems with a desk calculator, and that is where it all started. Developing mathematical models was not the only thing he did; he also developed an efficient solver for the problems, the simplex method. Since that time mathematical planning has spread to many more applications than military planning. In the beginning of the book "Linear programming and extensions", reference [2], George Dantzig gives examples of linear programming problems and also the origins of linear programming models. Today it is an important tool in many industrial applications and in financial and economic science.

The development of models has led to larger and larger problems. Problems with millions of variables are not unrealistic; see for example reference [3], where the authors describe and solve a problem with millions of variables using a linear programming solver. Such large problems take a very long time to solve, so for problems of that size the choice of solution method is very important. It may also be a good idea to make use of a supercomputer, in which case it is good to have an algorithm that is efficient on a parallel computer.

Until the middle of the 80's the simplex method was without doubt the leading solver. Then a new generation of solvers appeared, called inner point methods. The difference is that in the simplex method all the iteration points are corners of the feasible region, while in an inner point method all the iteration points are inside the feasible region. For very large problems the simplex method can be very time consuming, because the number of corners of the feasible region grows exponentially with the size of the problem. On the other hand, inner point methods can be unstable because of the numerical troubles that may occur. So different problems call for different
solvers. For one type of problem one solver is the best, and for another type another solver is the best; which solver is best depends on many things. One thing is clear: when solving large linear programming problems it is an advantage to have many solvers to choose from. In the following text linear programming will be abbreviated as LP.
2. Linear Optimization and Sparse Matrices
In applied linear programming the matrix is often huge and sparse. The sparsity pattern arises because of weak relations between the decision variables: every variable has a direct connection with only a couple of the perhaps thousands of other variables. A detailed example of this follows in section 2.3. In this chapter, examples of linear programming problems from the real world are described. There is a lot of literature with examples of applied linear programming; see for example [4], [5] or [6]. One of many types of problems is the transportation problem. In [4], on page 73, Williams gives the following example.

2.1. Example 1. A Transportation Problem

Three suppliers (S1, S2, S3) are used to provide four customers (T1, T2, T3, T4) with their requirements for a particular commodity over a year. The yearly capacities of the suppliers and requirements of the customers are given in table 2.1 (in suitable units).

Table 2.1. The yearly capacities and requirements in suitable units.

  Suppliers                 S1    S2    S3
  Capacities (per year)    135    56    93

  Customers                 T1    T2    T3    T4
  Requirements (per year)   62    83    39    91

The unit costs for supplying each customer from each supplier are given in table 2.2 (in £/unit).

Table 2.2. Unit costs for supplying each customer from each supplier.

                  T1    T2    T3    T4
  Suppliers S1   132     -    97   103
            S2    85    91     -     -
            S3   106    89   100    98

A dash indicates that a certain supplier cannot supply a certain customer. The total capacity of the suppliers is greater than the requirement of the customers. To get an easier structure in the problem you can introduce a new customer (called a dummy customer) with a requirement of 9 units per year. With
the new dummy customer the total capacity of the suppliers is equal to the requirement of the customers. The problem can be viewed as a graph; see figure 2.1.

Figure 2.1. Graph for problem in example 1.
This problem can be formulated as an LP model by introducing variables xij to represent the quantity of the commodity sent from Si to Tj in a year. The resulting model is (2.1).
    min  132x11 + 97x13 + 103x14 + 85x21 + 91x22 + 106x31 + 89x32 + 100x33 + 98x34

    s.t.  −x11 − x13 − x14 − x15             = −135
          −x21 − x22 − x25                   =  −56
          −x31 − x32 − x33 − x34 − x35       =  −93
           x11 + x21 + x31                   =   62
           x22 + x32                         =   83
           x13 + x33                         =   39
           x14 + x34                         =   91
           x15 + x25 + x35                   =    9
           xij ≥ 0,  for all i, j                   (2.1)
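As a concrete cross-check, model (2.1) can be written out directly in Matlab and solved with linprog, the solver later used for comparisons in chapter 5. This is only a sketch; the variable ordering x11, x13, x14, x15, x21, x22, x25, x31, ..., x35 is one possible choice.

c   = [132 97 103 0  85 91 0  106 89 100 98 0]';
Aeq = [-1 -1 -1 -1   0  0  0   0  0  0  0  0    % supplier S1
        0  0  0  0  -1 -1 -1   0  0  0  0  0    % supplier S2
        0  0  0  0   0  0  0  -1 -1 -1 -1 -1    % supplier S3
        1  0  0  0   1  0  0   1  0  0  0  0    % customer T1
        0  0  0  0   0  1  0   0  1  0  0  0    % customer T2
        0  1  0  0   0  0  0   0  0  1  0  0    % customer T3
        0  0  1  0   0  0  0   0  0  0  1  0    % customer T4
        0  0  0  1   0  0  1   0  0  0  0  1];  % dummy customer T5
beq = [-135 -56 -93 62 83 39 91 9]';
x   = linprog(c,[],[],Aeq,beq,zeros(12,1),[])   % nonnegative variables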
Had the dummy customer T5 not been introduced, the first three equalities would have been inequalities. As you can see, the model has a very special structure. Another type of problem is the assignment problem. In [7] the authors give an example of an assignment problem.
2.2. Example 2. Relay swimming race

Before every Olympic Games or big competition, the national swimming team manager is commissioned to select the best team for the 4x100 meter medley relay. The medley consists of the sections backstroke, butterfly, breaststroke and freestyle. At the Olympic Games in Sydney the national women's team manager had to choose from among six swimmers. The swimmers' personal bests are given in table 2.3, expressed in seconds. A dash in the table means that the person could not or did not want to swim that section. The task is to determine which team gives the lowest total time.

Table 2.3. The swimmers' personal bests expressed in seconds.

  Name        Freestyle  Backstroke  Breaststroke  Butterfly
  Emma          57.78        -          70.05        60.60
  Therese       54.41      62.42          -          58.93
  Johanna       56.62      67.40          -          59.47
  Anna-Karin    57.66        -            -          63.10
  Louise K      57.89      66.09        70.73          -
  Louise J      55.55        -            -          62.87
Since there are six swimmers and only four sections, it is convenient to introduce two dummy sections, which will be assigned to the swimmers who are not selected for the team. The two dummy sections have a cost of 0. The graph of the problem can be seen in figure 2.2.
Figure 2.2. Graph for problem in example 2.
The problem is to find the team that gives the lowest total time. This problem can be expressed as a linear optimization problem, similarly to the previous example.
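As a concrete sketch, the selection problem can be set up and solved with linprog. The placement of the dashes follows the reconstruction of table 2.3 above (an assumption), a large constant stands in for the impossible assignments, and the LP relaxation of an assignment problem has an integral optimal corner, so a 0/1 solution can be expected.

T = [57.78   Inf 70.05 60.60     % Emma
     54.41 62.42   Inf 58.93     % Therese
     56.62 67.40   Inf 59.47     % Johanna
     57.66   Inf   Inf 63.10     % Anna-Karin
     57.89 66.09 70.73   Inf     % Louise K
     55.55   Inf   Inf 62.87];   % Louise J
C = [T zeros(6,2)];              % two dummy sections with cost 0
C(isinf(C)) = 1e6;               % big-M instead of Inf in the objective
m = 6;                           % six swimmers, six sections incl. dummies
c = C(:);                        % x(i,j) stored column by column
Aeq = [kron(ones(1,m),speye(m)); % every swimmer gets exactly one section
       kron(speye(m),ones(1,m))];% every section gets exactly one swimmer
beq = ones(2*m,1);
x = linprog(c,[],[],Aeq,beq,zeros(m^2,1),[]);
X = reshape(x,m,m)               % X(i,j) = 1 when swimmer i swims section j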
2.3. Example of network flow problem

In the two previous sections two examples more or less taken from the real world were described. Both problems have a similar structure, and a more general problem of this kind is the network flow problem. A Matlab program that generates this kind of problem has been written as part of this master thesis. When testing LP algorithms it is an advantage to have problems to solve. Of course you can take examples from the literature, but it takes a lot of time to write them down and those problems are also quite small. In real life the problems are much larger, with perhaps hundreds of thousands of variables. Problems of that size cannot be copied from the literature: no literature has examples of that size, and even if it had, it would take too much time to transfer the problems correctly. One way to get around this is to let a computer program generate test problems.

Network flow problems can be shown as a graph. A graph consists of nodes and arcs. Each node is connected with at least one other node by an arc (or several arcs), and the graph is connected, which means that there is a connection between all the nodes in one way or another. In the network there are production, consumption and a flow. The production and consumption are at the nodes, and the flow is in the arcs. One node is viewed in figure 2.3.

Figure 2.3. Node with net production, inflow and outflow.
At each node there is a total inflow, a total outflow and a net production. Nothing of the flow disappears, and therefore equation (2.2) must hold at each node.

    total outflow − total inflow = net production        (2.2)
Since the graph is connected, there is a connection between all the nodes. In the type of problem generated here the graph is undirected, which means that the flow can go in both directions between the nodes. See figure 2.4.
Figure 2.4. Undirected graph.
In the continuation one undirected arc will be used instead of two directed arcs between the nodes. For every node, equation (2.2) must hold. The flow from node i to node j is denoted by xi,j. Naturally, the flow must be nonnegative, and the flow might also have an upper limit (unlike the example problems earlier in this chapter). Each node has a number of inflows and a number of outflows, and some nodes have a net production different from zero. It is the cost of the total flow through the network that is to be minimized. This kind of problem is an LP problem of the form (2.3).

    min  c^T x
    s.t. Ax = b                                          (2.3)
         x ≥ 0
Each row of A contains the information from the corresponding node, and the matrix has a distinctly sparse pattern: every column of A consists of one 1 and one −1, and the remaining elements are zero. The number of columns of A is the same as the number of arcs in the graph, which is the same as the number of variables in the problem. A quite large problem is shown in figure 2.5. This flow problem has 1024 nodes and 5052 arcs.
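To make the structure concrete, here is a minimal sketch that builds the node-arc incidence matrix of a small toy network (not one of the generated problems; the network and its costs are made up for illustration) and solves (2.3) with linprog.

n    = 4;                                 % number of nodes
arcs = [1 2; 1 3; 2 3; 2 4; 3 4];         % [tail head] of each directed arc
na   = size(arcs,1);
% +1 at the tail and -1 at the head of each arc, so that
% A*x = total outflow - total inflow at every node, as in (2.2)
A = sparse([arcs(:,1); arcs(:,2)], [1:na 1:na]', ...
           [ones(na,1); -ones(na,1)], n, na);
b = [3; 0; 0; -3];                        % node 1 is the source, node 4 the sink
c = [1; 2; 1; 3; 1];                      % cost per unit flow on each arc
x = linprog(c,[],[],A,b,zeros(na,1),[])   % minimum cost flow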
Figure 2.5. Graph with optimal solution.
The node that has a positive net production (the source) is marked with red color, and the amount of the net production is marked in the graph. In this example there are 10 nodes with a negative net production (the sinks). The sinks are marked with blue color, and the amounts of the (negative) net production are marked as well. The nodes that have zero net production are smaller and black. In this case the cost is proportional to the distance between the nodes. The optimal solution is marked with dark red color. The structure plot of the matrix A for this problem is shown in figure 2.6.

Figure 2.6. Structure plot of A.
The sparsity pattern appears distinctly. As mentioned before, the number of non-zeros in each column is two. Since there are 1024 rows, only about 0.002 (0.2%) of the elements are nonzero, and for larger problems this share will be even smaller. A second version of the test problem generator has also been written. It generates the same kind of problems, but the graph has a slightly different form. The graph and the structure plot can be seen in figure 2.7 and figure 2.8.
Figure 2.7. Graph with optimal solution.
Figure 2.8. Structure plot of A.
The graph of this problem has another structure, and the structure plot of the matrix shows that the pattern of the matrix is different too. This is because of the difficulty of numbering the nodes. The number of nodes is the same as in the previous example, 1024, but the number of arcs is larger, in fact 5714, so this problem has 5714 decision variables. As in the previous problem, the number of non-zeros in each column is two, so the share of non-zeros is the same.

As mentioned in the beginning of this chapter, the reason for making a program that generates test problems is to have problems prepared to solve with some method. This type of problem is quite representative of general LP problems from the real world: the matrix that belongs to the problem is huge and sparse. One way to solve this kind of problem is to test all the possible combinations. For a problem of the same size as the last one presented, the number of different combinations, that is the number of ways to choose which 1023 of the 5714 columns are basic, is about 10^1164. For every combination you have to solve a sparse linear equation system of size 1023 × 1023. That time is small but not negligible. Say it takes about 10^−6 seconds for a (very) fast computer to test each combination; then it takes 10^1158 seconds to solve the problem, which is about 10^1140 times the age of the universe. There is no chance to solve this kind of problem by testing all the combinations; you have to use a much more clever method.

The programs that generate the two different test problems can be found on the Internet at Mathworks file exchange central¹. The program that generates test problems like the one in figure 2.5 is named simsys_sparse, and the program that generates test problems like the one in figure 2.7 is named simsys_sparse2. Athanasios Migdalas has written a book on network problems; it can be found in the references, [8], and it contains lots of network related problems.

A realistic and complex special case of the minimum cost network flow problem is when there is one sink and one source and the cost is proportional to the distance between the nodes. Then the optimal solution can be interpreted as the shortest path in a road network. A modern application of this is trip planning with GPS. A GPS device is not as fast as a stationary computer, so there it is very important to solve the problem efficiently. See figure 2.9 for a trip planning example, where the shortest path between Luleå University of Technology and Luleå Airport is marked with a thicker line.

¹ http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8845&objectType=file
Figure 2.9. The shortest path between Luleå University of Technology and Luleå Airport. (Source: www.eniro.se)
3. Simplex method
An old faithful solver of LP problems is the simplex method. George Dantzig developed it in the middle of the 20th century, during World War II, when he was working at the Pentagon with military planning for the U.S. Air Force. He developed mathematical models and solved them with a desk calculator, but it was not just mathematical models he developed; he also developed an efficient solver for this kind of problem, the simplex method.

The simplex method can be described in two parts, usually named phase-I and phase-II.

• In simplex phase-I the algorithm searches for a feasible basic solution.
• In simplex phase-II the algorithm iterates toward the optimum from vertex to vertex.

Here follows a short description of the simplex method. In simplex phase-I slack variables are added to the problem. That must be done to find a value of the decision variables where all the constraints are fulfilled. When such a solution is found, in other words a feasible basic solution, the search for the optimum can start. That is what is done in simplex phase-II: the algorithm searches from vertex to vertex to find the optimal solution. The next vertex is chosen such that the search direction is the steepest feasible direction, and by doing so the optimum is finally reached. Countless books have been written about the simplex method; George Dantzig himself wrote some of them, see for example reference [2]. Another example of literature describing the simplex method is reference [9], written by Luenberger. In those references the simplex method is described in detail, and therefore no further explanation is given here.
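For concreteness only, a generic textbook sketch of the phase-II iteration in Matlab follows. It is not the implementation used in this thesis; it assumes that a feasible starting basis B is already known (the task of phase-I) and it omits anti-cycling safeguards, so it may fail on degenerate problems.

function [x,B] = simplex_phase2(A,b,c,B)
% Revised simplex, phase-II: min c'*x s.t. A*x = b, x >= 0,
% starting from the feasible basis given by the index vector B.
n = size(A,2);
while true
    xB = A(:,B)\b;                  % current basic solution
    y  = A(:,B)'\c(B);              % simplex multipliers
    r  = c(:) - A'*y;               % reduced costs
    [rmin,j] = min(r);              % entering variable (Dantzig's rule)
    if rmin >= -1e-10, break; end   % no improving direction: optimal
    d = A(:,B)\A(:,j);              % change of the basic variables
    pos = find(d > 1e-10);
    if isempty(pos), error('Problem is unbounded'); end
    [t,k] = min(xB(pos)./d(pos));   % ratio test picks the leaving variable
    B(pos(k)) = j;                  % exchange the basis indices
end
x = zeros(n,1);
x(B) = A(:,B)\b;                    % expand to a full solution vector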
4. Simplex boundary method
In the previous paper [10], a D-level thesis in mathematics, a slightly different method from the simplex method was analyzed. It has many properties in common with the simplex method and can be seen as a simplex phase-I method with another search method (no slack variables). The D-level thesis is written in Swedish; here follows a short summary of the method. The method solves an LP problem of the form given in (4.1).

    min  c^T x
    s.t. Ax ≥ b                                          (4.1)
The method is quite similar to the simplex method, but the iteration point need not be a corner point in the beginning. It starts at the origin, and then it iterates towards the feasible region by "jumping" in the direction of the normal of non-fulfilled constraints. If constraint i is not fulfilled, the new x is set by equation (4.2).

    x := x + ( (bi − Ai x) / (Ai Ai^T) ) Ai^T            (4.2)
In equation (4.2), Ai is the i:th row of A and bi is the i:th element of b. The update is an orthogonal projection onto the hyperplane defined by the i:th constraint, Ai x = bi. A sketch of a convergence proof can be found in Appendix A. Instead of just using (4.2) you can restart from different start values based on the first converged x. You then have several values of x that converge towards the feasible region, and since the feasible region is convex you can use, for example, the mean value of all converged points to get a point much closer to the feasible region. See figure 4.1 for an illustration.
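A few lines of Matlab are enough to sketch this projection phase for A*x >= b. The tolerance and the stopping rule are assumptions; the sketch also has no safeguard against infeasible problems, in which case it loops forever.

function x = project_to_feasible(A,b,x)
tol = 1e-10;
while any(A*x < b - tol)
    for i = find(A*x < b - tol)'    % loop over the violated constraints
        Ai = A(i,:);
        x = x + ((b(i) - Ai*x)/(Ai*Ai'))*Ai';  % projection step (4.2)
    end
end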
Figure 4.1. Illustration of convergence.
When a feasible point is reached, the search for the optimum begins. The search direction is the steepest feasible direction: if the iteration point is an inner point the search direction is the negative gradient, and if the iteration point is a boundary point the search direction is the negative gradient projected onto the constraints that are fulfilled with equality. The length of every iteration step is adjusted so that the step is as large as possible while the new iterate stays on the boundary of the feasible region. After some iterations the iteration point is at a corner, and then simplex phase-II starts.

4.1. An improvement

One problem with the method presented in the D-level thesis [10] is that even if the function value at the first corner point is pretty close to the optimal value, the number of simplex steps can be huge, especially for large problems. One way to improve on this is not to start searching for a feasible corner immediately once a feasible point is found. Instead you first try to get a more centralized point in the feasible region. Then you take a small step in the negative gradient direction (for a minimization problem), then a step perpendicular to the gradient to get a more centralized point again, then another small step in the negative gradient direction, and so on. After about m steps (the number of variables) you will be closer to the optimum and still quite centralized. The hard part is to centralize the point; it is also the most expensive part, and how well it works depends on the geometry of the feasible region.

4.2. Dual problem

The example problems in chapter 2 were in another form than (4.1), but there is a relation between them: problems (2.3) and (4.1) are dual to each other, with the exception that the dual problem is a maximization problem. That means it is equivalent to solve problem (4.3) and problem (4.4).
    max  b^T y
    s.t. A^T y ≤ c                                       (4.3)

    min  c^T x
    s.t. Ax = b                                          (4.4)
         x ≥ 0
At the optimum c^T x = b^T y, and the constraints in (4.3) that hold with strict inequality correspond to elements of x that are zero in (4.4). For all feasible x and y, c^T x ≥ b^T y. That causes a problem: if you only have an approximate solution to problem (4.3) you cannot use it directly in (4.4). One way to handle this is, when a corner point that is a good approximation of the optimum has been found, to transfer it to the primal problem in the same way as is done at the exact optimum. Unfortunately, such a point has a dual function value less than the optimal value, which means that some of the decision variables xi will be less than zero and hence outside the feasible set. To overcome this, a method similar to the one used for reaching the feasible region in the dual problem can be used. First, all xi < 0 are set to zero. Then (4.5), which is an easy problem, is solved.

    min  ||x − xp||^2
    s.t. Ax = b                                          (4.5)
In (4.5), xp is the previous x and x is the new one. Some of the new elements xi may again be less than zero; those are set to zero, (4.5) is solved again, and so on. With this method x converges to the feasible set, though unfortunately not to the optimum, so it may be better to take a few simplex steps too many than too few.

One more thing has to be mentioned: when solving, for example, network flow problems you can make use of the characteristics of that problem type. All the elements of c are greater than or equal to zero, which means that no time has to be spent searching for a feasible point (the origin is feasible in the dual when c ≥ 0). The part of the algorithm that tries to find a centralized close-to-optimum point can also be skipped; the special structure makes it unnecessary, because when a feasible corner is reached with the first part of the algorithm, not so many simplex steps are required even from an uncentralized point. Therefore, for this kind of problem a reduced algorithm can be used that directly finds a feasible corner point and then starts the simplex method.
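The transfer back to the primal can be sketched as an alternation between clipping and the projection (4.5). The closed form below uses the normal equations of (4.5) and assumes that A has full row rank; the fixed iteration count is also an assumption.

function x = recover_primal(A,b,x,niter)
for k = 1:niter
    x(x < 0) = 0;                % put the negative variables to zero
    lambda = (A*A')\(b - A*x);   % multipliers from the normal equations
    x = x + A'*lambda;           % projection onto the set {x : A*x = b}
end
% x now satisfies A*x = b, but small negative components may remain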
5. Test result
In this chapter the test results are presented. The algorithm is implemented in Matlab under the name lpmin. Comparisons have been made with Matlab's own solver, linprog, which is included in Matlab's optimization toolbox. linprog uses different methods for medium-scale and large-scale optimization; for more information about linprog and its solver methods, see the documentation of linprog on the Mathworks homepage¹.

5.1. Dense matrices

For the D-level thesis [10] a program generating a certain kind of test problem was written. Here exactly the same principle is used, but the problems are in the format (2.3). The matrices are dense; this is not the network flow problem described in section 2.3. First a quite easy problem, where c > 0. The result is shown in table 5.1.

Table 5.1. Comparison between two different algorithms. m is the number of equality constraints and t is measured in seconds.

    m    lpmin (t1)   linprog (t2)   t2/t1
   50        0.71          1.87       2.63
  100        2.09         10.82       5.18
  150        6.31         33.78       5.35
  200       17.79         70.69       3.97
  250       45.31        159.78       3.53
  300       76.34        297.04       3.89

The last column gives the time ratio between the two solvers. Here all the ratios are greater than one, so lpmin is clearly faster in this case. A comparison where the elements of c have both positive and negative values has also been done; this makes it harder to find a feasible solution to the dual problem. The result can be found in table 5.2.
¹ http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/linprog.html
Table 5.2. Comparison between two different algorithms. m is the number of equality constraints and t is measured in seconds.

    m    lpmin (t1)   linprog (t2)   t2/t1
   50        0.55          1.48       2.69
  100        2.69         10.33       3.84
  150        9.28         39.11       4.21
  200       19.67         86.01       4.37
  250       46.96        161.38       3.44
  300       99.47        244.04       2.45

Table 5.2 shows the result of the second comparison. Here too the ratios are consistently greater than one, so lpmin is faster in this case as well. However, these problems are quite easy: they do not have any degenerate corners, and the linear equation systems that have to be solved in each iteration are quite well conditioned, which is not the case for problems from the real world, where lots of trouble can occur.

5.2. Network flow problems

In chapter 2 the network flow problems were described, together with the generation of network flow test problems. The test problem programs return realistic optimization problems that could have been taken from the real world. The network flow problems have a very characteristic structure that can be exploited when solving them; the algorithm that solves this special kind of problem is named lpmin_flow. The two network test problem generators described in chapter 2 are named simsys_sparse and simsys_sparse2. The results for problems from simsys_sparse are presented in table 5.3 and the results for problems from simsys_sparse2 in table 5.4.
Table 5.3. Results from comparison with problems from simsys_sparse. m is the number of equality constraints and t is measured in seconds.

    m    lpmin_flow (t1)   linprog (t2)   t2/t1
  100          0.17             0.11       0.65
  195          0.60             0.17       0.28
  306          1.48             0.28       0.19
  400          2.69             0.44       0.16
  506          3.95             0.50       0.13
  600          5.61             0.60       0.11
  702          5.99             0.59       0.10
  812          9.23             0.99       0.11
  900         12.41             0.94       0.08
  992         13.89             1.21       0.09
Table 5.4. Results from comparison with problems from simsys_sparse2. m is the number of equality constraints and t is measured in seconds.

    m    lpmin_flow (t1)   linprog (t2)   t2/t1
  100          0.22             0.61       2.77
  200          0.77             0.22       0.29
  300          1.32             0.44       0.33
  400          2.69             0.44       0.16
  500          3.45             0.46       0.13
  600          5.93             0.77       0.13
  700          8.30             0.82       0.099
  800         12.85             0.99       0.077
  900         15.44             1.21       0.078
 1000         17.02             1.43       0.084
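A sketch of how such a timing comparison can be set up is given below. The call signature of simsys_sparse is an assumption and not taken from the thesis text; linprog is called with lower bounds zero to match (2.3).

[c,Aeq,beq] = simsys_sparse(1024);          % assumed generator interface
tic; x1 = lpmin_flow(c,Aeq,beq); t1 = toc;  % the solver of this thesis
lb = zeros(size(c));
tic; x2 = linprog(c,[],[],Aeq,beq,lb,[]); t2 = toc;
fprintf('t1 = %.2f s, t2 = %.2f s, t2/t1 = %.2f\n', t1, t2, t2/t1);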
5.3. Netlib test problems

The last comparison is for LP test problems from Netlib. The Netlib test problems and their descriptions can be found on the Internet² at the link given in the footnote. Not all the Netlib problems are included in this test. One reason is that many of the LP problems have upper bounds on their decision variables and hence cannot be solved by lpmin. Another, more serious, reason is that round-off errors made lpmin useless in some cases, where lpmin failed to find the optimum. The results for some of the problems can be found in table 5.5.

Table 5.5. Results from comparison with problems from Netlib. t is measured in seconds.

  Problem      lpmin (t1)   linprog (t2)   t2/t1
  afiro            0.07          0.06       0.86
  bandm*           5.79          0.39       0.068
  blend*           0.15          0.09       0.60
  israel*          1.15          1.23       1.07
  sc50a*           0.14          0.04       0.28
  sc50b*           0.12          0.03       0.25
  sc105*           0.34          0.061      0.18
  sc205*           0.99          0.11       0.11
  scagr7*          0.54          0.07       0.13
  scagr25*        11.0           0.31       0.028
  sctap1*          4.00          0.25       0.062
  share1b*         0.70          0.20       0.29
  share2b*         0.45          0.09       0.20
  stocfor1*        0.34          0.10       0.29
* Slightly wrong result.
The test results in table 5.5 show that lpmin is not the best choice of solver for this kind of problem. lpmin solved only one of the Netlib problems correctly; for some problems it returned a rather bad solution, and there are also problems where lpmin did not return a solution at all because of difficulties with round-off errors.
² http://www.netlib.org/lp/data/
6. Conclusion
As chapter 5 shows, the results differ a lot between the different kinds of test problems. For the problems with dense matrices lpmin was faster than linprog, as follows from tables 5.1 and 5.2; in other words, lpmin is a better choice of solver for these kinds of problems than linprog. In the D-level thesis [10], where the problem was in the form (4.1), the result was even more in favour of lpmin compared with linprog, although the problems were generated in the same way as the dense matrix problems in this thesis. With the format (4.1), the time ratio t2/t1 grew at the same rate as the number of variables.

For the network flow problems linprog was faster, as follows from tables 5.3 and 5.4. One possible reason is that linprog has more efficient sparse matrix handling than lpmin; another is that the structure of the problem makes it more efficient to solve with the method used in linprog. These two things make linprog the better choice for this kind of problem. The results for the Netlib problems were not in lpmin's favour either, as follows from table 5.5: for some of the Netlib problems lpmin failed because of round-off errors, and for some it did not find the true optimum. This shows that a lot of work on the algorithm in lpmin remains.

It follows that for one kind of problem one solver method may be the best, and for another kind another method may be the best; a method that is perfect for problems in one application can fail totally in another. Consequently, the results for lpmin were mixed. Still, the algorithm definitely turns out to be a promising first attempt. The results for the first kind of test problems, the ones with dense matrices, were excellent for lpmin: the comparison was done against a commercial solver, and lpmin solved them much faster. On the other hand, lpmin could have solved the network flow problems and the Netlib problems faster and more accurately. A Swedish proverb fits here: Rome was not built in one day. It takes a lot of time to construct a complicated work, and it is never really finished. It is the same with algorithm development: a lot of time has been spent developing this algorithm, lpmin, but much more work remains to make it complete. One way to make lpmin a better solver is to translate the Matlab code to, for example, C code; that would speed up the calculations and make lpmin much faster. Another is to improve lpmin's sparse matrix handling, which would make it more competitive with commercial solvers.
7. Other methods
There are several different methods for solving linear optimization problems. The simplex method is the best known, but it is not the only one. As mentioned in the introduction there is also a type of method called inner point methods, which work in a quite different way. The first distinct difference is that in an inner point method the iterates are inside the feasible region, not on the boundary as in the simplex method; that explains the name. There are many different kinds of inner point methods. One method is explained in references [11] and [12]; these reports can be found on the Internet, on the page Higher Order Primal Dual Method¹, a site where lots of optimization material can be found. The method described in [11] and [12] solves the primal and dual problems at the same time. The lower and upper bounds of the decision variables are replaced by a special function which is put into the objective function. This bound function can take many different forms; in the reports [11] and [12] the authors use a logarithmic barrier function. It is called a barrier function because it is not defined outside the feasible region. Another type of function that can be used to replace the bounds is a penalty function, which is defined outside the feasible region, but whose values there are such that the iterates are forced into the feasible region. An analysis of this is given in Article III in [1].

One advantage of inner point methods is that they are quite fast for large scale problems. The simplex method can be very slow for large scale problems, because the number of corners grows exponentially with the size of the problem: every iteration step goes quickly, but the number of iterations becomes very large. Another advantage of inner point methods is that you can stop whenever you want, and if the method has converged well you get a good approximation of the optimum. The big disadvantage is that in every iteration an ill conditioned equation system has to be solved. It is very hard to solve an ill conditioned equation system correctly with numerical methods; the solution you get might be completely wrong, and that can cause trouble when using inner point methods.
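For orientation, the barrier subproblem can be written down explicitly. This is the standard logarithmic barrier formulation of (4.4), not a formula copied from [11]:

    \min_{x > 0} \; c^T x - \mu \sum_{i=1}^{n} \ln x_i \qquad \text{s.t. } Ax = b, \quad \mu > 0

The barrier parameter \mu is successively driven towards zero, and as \mu \to 0 the minimizers approach the optimum of (4.4).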
¹ http://www.maths.ed.ac.uk/~gondzio/software/hopdm.html
Appendix A. Sketch of a convergence proof
Here follows a sketch of a convergence proof for the method used in chapter 4.

Figure Appendix A.1. Illustration of convergence.
Proof. Choose a local orthogonal coordinate system such that the x1-axis is parallel to the normal of the active constraint, so that the hyperplane of the constraint is x1 = 0. The other coordinate axes can be chosen arbitrarily as long as they are mutually orthogonal. Then x^(j) and x^(j+1) have the coordinates (x1, x2, ..., xn) and (0, x2, ..., xn) respectively, and the point P has the coordinates (P1, P2, ..., Pn), all expressed in the local coordinate system. The distances between P and x^(j), and between P and x^(j+1), can be written as

    dist(P, x^(j))   = sqrt((P1 − x1)^2 + (P2 − x2)^2 + ... + (Pn − xn)^2)
    dist(P, x^(j+1)) = sqrt(P1^2 + (P2 − x2)^2 + ... + (Pn − xn)^2)

This gives:

    P1 ≥ 0 and x1 < 0  ⇒  dist(P, x^(j)) > dist(P, x^(j+1))

The conclusion is that x^(j+1) is closer to the feasible region than x^(j). Assume that x^(j) converges to a distance d > 0 from the feasible region. Sooner or later you then come to a constraint which is tangent to the feasible region, so a step of positive length must be taken, and hence the only possibility is that x^(j) converges to the feasible region.
Appendix B. lpmin.m
function x=lpmin(c,Aeq,beq)
% min c'*x
% Aeq*x=beq
% 0 <= x
%
% ... (remainder of the listing omitted)

Appendix C. lpmin_flow.m

% ... (beginning of the listing omitted)

% Find x
x(ind)=Aeq(:,ind)\beq;

function [ind,mii,maa]=misor(vec,p)
% index with the p smallest values
ind=logical(zeros(size(vec)));
[so,ind2]=sort(vec);
ind(ind2(1:p))=1;
maa=max(vec(ind));
mii=min(vec(ind));
References
[1] Ove Edlund. Solution of Linear Programming and Non-Linear Regression Problems Using Linear M-Estimation Methods. PhD thesis, Luleå University of Technology, 1999. ISSN 1402-1544; 1999:17. http://epubl.luth.se/1402-1544/1999/17/

[2] George B. Dantzig. Linear Programming and Extensions. Princeton University Press, Princeton, N.J., 1963. ISBN 0-691-08000-3.

[3] J. Gondzio and R. Kouwenberg. High Performance Computing for Asset Liability Management. Technical report, Department of Mathematics & Statistics, The University of Edinburgh, and Econometric Institute, Erasmus University Rotterdam, 1999. http://www.maths.ed.ac.uk/~gondzio/software/wrecord.ps

[4] H. Paul Williams. Model Building in Mathematical Programming. Wiley, Chichester, 1999. ISBN 0-471-99788-9.

[5] Frederick S. Hillier and Gerald Lieberman. Introduction to Operations Research. McGraw-Hill, Boston, 2005. ISBN 0-07-123828-X.

[6] Tito A. Ciriani and Robert C. Leachman. Optimization in Industry: Mathematical Programming and Modelling Techniques in Practice. Wiley, Chichester, 1993. ISBN 0-471-93492-5.

[7] Jan Lundgren, Mikael Rönnqvist, and Peter Värbrand. Introduction to Operations Research. Studentlitteratur, Lund, 2003. ISBN 91-44-03104-1.

[8] Athanasios Migdalas. Mathematical Programming Techniques for Analysis and Design of Communication and Transportation Networks. Linköping University, Linköping, 1988. ISBN 91-7870-302-6.

[9] David G. Luenberger. Linear and Nonlinear Programming. Addison-Wesley, Reading, Mass., 1984. ISBN 0-201-15794-2.

[10] Per Bergström. En algoritm för linjära optimeringsproblem (An algorithm for linear optimization problems). D-level thesis, Luleå University of Technology, 2005. http://epubl.ltu.se/1402-1552/2005/04/

[11] J. Gondzio and T. Terlaky. A Computational View of Interior Point Methods for Linear Programming. Technical report, Logilab, HEC Geneva, University of Geneva, and Faculty of Technical Mathematics and Informatics, Delft University of Technology, 1994. http://www.maths.ed.ac.uk/~gondzio/software/oxford.ps
[12] E. D. Andersen, J. Gondzio, C. Meszaros, and X. Xu. Implementation of Interior Point Methods for Large Scale Linear Programming. Technical report, Logilab, HEC Geneva, Section of Management Studies, University of Geneva, 1996. http://www.maths.ed.ac.uk/~gondzio/software/kluwer.ps