Graph Problems and Vector-Matrix Multiplications in Haskell

Nikita Danilenko

Institut für Informatik, Christian-Albrechts-Universität Kiel
Olshausenstraße 40, D-24098 Kiel
[email protected]
Abstract. It is a known fact that many graph problems can be restated in terms of an algebraic setting. One particular operation that arises in such a setting is the multiplication of a vector with a matrix. Depending on the underlying algebra the multiplication can carry different reachability semantics (e.g. no additional information, single path prolongation, all path prolongation). We propose an abstraction of this operation that is easily implemented in the functional programming language Haskell, explore the associated framework and provide several applications.
1 Introduction
Many tasks in graph theory require the computation of successors of a given set of vertices while possibly collecting some additional information in the process. Consider for example the graph G from Figure 1 and in that graph the set of vertices X := { 1, 2, 6 }.
Fig. 1. Example graph G

Fig. 2. Example graph G′ (structurally identical to G, but with labelled edges)
We can ask for the set of successors of X in G and obtain the set { 0, 2, 3, 6 } or compute the number of times any successor is reached, which yields the set { (0, 2), (2, 1), (3, 1), (6, 2) } where the first component of a pair is the vertex and the second component is the number of times it has been reached. It is well known that both problems can be solved in the very same fashion – first the graph is translated into an adjacency matrix and the set of vertices is translated
into a vector¹. In our example we get the following matrix and vector:

\[
A_G = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0\\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}, \qquad
v_X = (0, 1, 1, 0, 0, 0, 1, 0, 0).
\]

¹ v ∈ { 0, 1 }ⁿ represents the set Mᵥ := { i ∈ ℕ₀ | vᵢ = 1 }.
Then a multiplication of the vector with the matrix is performed, where the necessary addition and multiplication of matrix elements are interpreted in the required fashion. To compute the successors we interpret + and · as the logical disjunction and conjunction respectively, 0 as false and 1 as true, and obtain the result vX ∗ AG = (1, 0, 1, 1, 0, 0, 1, 0, 0), which, translated back to set notation, corresponds to the set { 0, 2, 3, 6 }, as we have seen before. When we interpret +, ·, 0, 1 as the usual arithmetic operations and constants, the same multiplication yields vX ∗ AG = (2, 0, 1, 1, 0, 0, 2, 0, 0), which again corresponds to the set of pairs we have computed before. In both cases we required a notion of +, ·, 0, 1 on the elements of the vector and the matrix; such notions are usually provided in the context of semirings or Kleene algebras (cf. [12]), which are well understood and come with many results that have graph-theoretic interpretations.

In many applications the edges of a graph carry additional information. For instance, the graph G′ from Figure 2 is structurally identical to G. Following the semiring strategy it is simple to translate the graph into an adjacency matrix by writing the label of an edge in the matrix instead of 1. However, to compute the successors of X in G′ it is only necessary to distinguish between existing and non-existing edges; the edge values themselves are irrelevant. Instead of either finding a suitable semiring structure for the values or mapping the graph into a suitable semiring, we propose to consider the actual computation that is to be performed and to parametrise it in a fashion that is still close to an algebraic setting, but focusses on the necessary calculation. Our contribution is a simple, but very flexible abstraction of such computations and a prototypical implementation thereof, complete with several examples, including non-trivial ones.

This article's structure is as follows.

• We discuss the matrix representation of graphs in Section 2.
• In Section 3 we discuss the structure of vector-matrix multiplication, isolate its functional components and implement them in a generic fashion.
• Applications of the functions from Section 3 are given in Section 4.
• We proceed to implement a simple reachability algorithm and show how it can be used to solve more sophisticated problems in Sections 5 and 6.

To the best of our knowledge our approach has not been studied in the context of functional programming. There is some similarity with [5], but both the aims and the abstractions are different, which we discuss at the corresponding places. All presented code is written in Haskell [15].
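Returning to the introductory example, the following is a minimal sketch of such a parametrised multiplication on dense list-based vectors. The name mulWith and the representation are ours, for illustration only; the association-list-based framework the paper actually uses is developed in Section 3.

-- a parametrised vector-matrix product: the caller supplies the
-- "addition", the "multiplication" and the zero element
mulWith :: (c → c → c) → (a → b → c) → c → [a] → [[b]] → [c]
mulWith plus times zero v rows =
  foldr (zipWith plus) (map (const zero) (head rows))
        [ map (x ‘times‘) row | (x, row) ← zip v rows ]

Instantiating plus, times and zero with (∨), (∧) and False reproduces the boolean result from above, while (+), (·) and 0 yield the counting one. The sketch assumes a non-empty matrix whose rows all have the same length.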
ghci> (v ∼ g) ∼ g
Vec [(0, ⟨1, 2⟩), (2, ⟨1, 3⟩), (4, ⟨1, 3⟩), (6, ⟨1, 6⟩), (7, ⟨1, 3⟩)]
Prolonging all paths instead of just one works in a similar fashion. While this can be solved in the so-called Kleene algebra of paths, we will rather provide a more hands-on implementation¹⁰. The problem specification is very similar to the one from above, but this time the vertices are labelled with a list of paths that lead to this vertex. We want to label the result with a list of paths, too, such that every path is a previously existing path prolonged by exactly one step.

( ≈ ) :: Vec [Path ] → Mat α → Vec [Path ]
( ≈ ) = vecMatMult allSum pathsMul
If ps, qs :: [Path ] are lists of paths that lead to a given vertex, so is their concatenation, which leads to the definition:

allSum :: [Vec [α]] → Vec [α]
allSum = bigUnionWith (++)
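The combinators unionWith and bigUnionWith stem from the generic framework of Section 3, which is not reproduced here. A plausible sketch, assuming that Vec wraps an association list with strictly ascending vertices (the actual library code may differ):

unionWith :: (α → α → α) → Vec α → Vec α → Vec α
unionWith f (Vec xs) (Vec ys) = Vec (go xs ys)
  where go [ ] bs = bs
        go as [ ] = as
        go as@((i, x) : at) bs@((j, y) : bt)
          | i < j     = (i, x)     : go at bs   -- key only in the left vector
          | i > j     = (j, y)     : go as bt   -- key only in the right vector
          | otherwise = (i, f x y) : go at bt   -- key in both: combine the values

bigUnionWith :: (α → α → α) → [Vec α] → Vec α
bigUnionWith f = foldr (unionWith f) (Vec [ ])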
As for the actual prolongation we reason precisely as before – if v :: Vertex is a vertex and ps :: [Path ] is the list of all paths that lead to this vertex, then for any successor w of v we obtain all necessary paths to w by computing map (.v ) ps.

pathsMul :: Vertex → [Path ] → Vec α → Vec [Path ]
pathsMul = sMultWith (λv ps → map (.v ) ps)
Consider the adjusted example with v = toVecWith [⟨⟩] [1, 2, 6] :: Vec [Path ].

ghci> v ≈ g
Vec [(0, [⟨2⟩, ⟨6⟩]), (2, [⟨1⟩]), (3, [⟨1⟩]), (6, [⟨1⟩, ⟨6⟩])]
ghci> (v ≈ g) ≈ g
Vec [(0, [⟨1, 2⟩, ⟨1, 6⟩, ⟨6, 6⟩]), (2, [⟨1, 3⟩]), (4, [⟨1, 3⟩]), (6, [⟨1, 6⟩, ⟨6, 6⟩]), (7, [⟨1, 3⟩])]
4.5 Outgoing Values
So far all of the functions we have implemented ignore the values in the matrix. This is useful when calculating discretely in a more enriched context, e.g. if it is necessary to find a discrete path in a labelled graph. Our computation scheme can be used in other cases as well. Let us consider the following problem: given a vector labelled with [Arc a ] we wish to compute the successors of the vertices in the vector such that every successor is labelled with a new list which contains the old list and all of the vertex-value pairs that lead to this vertex. If xs, ys :: [Arc a ] are such vertex-value lists, so is their concatenation. This yields:

( out ) :: Vec [Arc α] → Mat α → Vec [Arc α]
( out ) = vecMatMult allSum outMult

⁹ We use the "⟨. . .⟩" notation to denote a path.
¹⁰ The algebraic version needs a semiring structure on the type [Path ] and an additional transformation. These steps are more bulky than what is necessary in this example.
As for outMult we simply follow the specification.

outMult :: Vertex → [Arc α] → Vec α → Vec [Arc α]
outMult = sMultWith (λi ovs a → (i, a) : ovs)
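A tiny made-up example: take the matrix m = [(0, [(1, 'x')]), (1, [(0, 'y'), (1, 'z')])] (wrappers omitted, values invented for illustration) and start from both vertices with empty lists. Assuming the ascending merge sketched earlier, a session might look like this:

ghci> toVecWith [ ] [0, 1] out m
Vec [(0, [(1, 'y')]), (1, [(0, 'x'), (1, 'z')])]

Every reached vertex is now labelled with the (predecessor, edge value) pairs along which it was reached.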
There is one rather curious application of this multiplication – it can be used to implement the transposition of square matrices¹¹.

transpose :: Mat α → Mat α
transpose mat = fmap Vec ((vertices out mat) ‘lcup‘ vertices)
  where vertices = verticesWith [ ] mat
        lcup     = unionWith const

verticesWith :: a → Mat b → Vec a
verticesWith x = fmap (const x )
How does this work? The fmap Vec is not essential, it merely restores the matrix structure. To illustrate the technique we omit the Vec wrapper. Consider the example matrix from Equation (∗). Its representation is

a :: Mat Int
a = [(0, [(1, 1), (2, 1)]), (1, [(2, 1)]), (2, [(1, 2)])]
and vertices = [(0, [ ]), (1, [ ]), (2, [ ])]. Then the following is computed:

vertices out a
  = bigUnionWith (++) (intersectionWithKey outMult vertices a)
  = bigUnionWith (++) [ [(1, [(0, 1)]), (2, [(0, 1)])], [(2, [(1, 1)])], [(1, [(2, 2)])] ]
  = [(1, [(0, 1)]), (2, [(0, 1)])] ∪(++) [(2, [(1, 1)])] ∪(++) [(1, [(2, 2)])]
  = [(1, [(0, 1), (2, 2)]), (2, [(0, 1), (1, 1)])]

This is almost the transposed matrix, but the adjacency list for 0 is missing. The function lcup is the left-biased union – it takes the leftmost occurrence of a value. In the above example we find that

(vertices out a) ‘lcup‘ vertices = [(0, [ ]), (1, [(0, 1), (2, 2)]), (2, [(0, 1), (1, 1)])]
which is in fact the representation of the matrix Aᵀ. In essence the scalar multiplication maps the entries of the matrix to a special notation. Then the sum, which traverses the rows from top to bottom (i.e. column-wise), adds the values by simply appending them to each other. This is correct since the indices increase from top to bottom, so that the required order is obtained. Clearly, these are only illustrating arguments and it is somewhat technical to prove the correctness of the above function. In contrast, implementing transposition by hand is rather technical, since the missing positions have to be considered.
¹¹ Transposition of a non-square matrix is possible, too, but is slightly more technical.
5 Successive Computations
Most of our multiplications have a type of the shape Vec σ → Mat τ → Vec σ. This allows repeated applications – if we have m :: Mat τ and a v :: Vec σ, we can multiply v with m and then reuse the result in a further multiplication with m. Repeated multiplication can be used to compute reachability, which we structure as follows. We collect some information with the vector-matrix multiplication, the reachability starts in an initial vector and traverses a list of graphs¹² (left-to-right). The result is a list of vectors, such that the i-th vector represents the i-th reachability step. One possible implementation is the following.¹³

reachWith :: (Vec α → Mat β → Vec α) → Vec α → [Mat β ] → [Vec α]
reachWith _   r [ ] = [r ]
reachWith mul r gs  = go r (verticesWith () (head gs))
  where go (Vec [ ]) _ = [ ]
        go v w         = v : go v′ w′
          where w′ = w \ v
                v′ = foldl mul v gs ∩ w′

(∩) :: Vec α → Vec β → Vec α
(∩) = intersectionWithKey (λi x _ → (i, x ))
The function (\) :: Vec α → Vec β → Vec α denotes "set difference" and is simple to implement in a similar way as unionWith (a sketch follows after the next paragraph). After the current step has been computed, reachWith removes the newly reached vertices from the vector of not yet visited vertices and multiplies the current step with all matrices in the given list to obtain the next step, which is then intersected with the unvisited vertices. The above implementation resembles a breadth-first search (BFS), save for the vertex order in the layers. A practical application is the following stencil.

shortestWith :: (Vec a → Mat b → Vec a) → Vec a → Vec b → [Mat b ] → Vec a
shortestWith m s e gs =
  head (dropWhile (null ◦ unVec) (map (∩ e) (reachWith m s gs)) ++ [Vec [ ]])
The call shortestWith mul start end gs finds the subset of end that is reachable from start by going along shortest paths through gs and collecting the information created by mul . This is achieved by computing the reachability layers, then intersecting every layer with the target and dropping the result as long as it is empty. If the remaining list is empty, so is the result, otherwise the function returns the first non-empty intersection of end with a reachability layer. Many graph algorithms that are phrased in the usual imperative way contain references to schemes like BFS that are modified to fit specific purposes. Our implementation allows precisely that through a simple parametric approach instead of rewriting. One modification of BFS is to use paths as labels to compute a path to every reachable vertex: any newly visited vertex w that is reached from some vertex v which is labelled with a path p is then labelled as visited by the path p . v . This is essentially what our example from Section 4.4 does.
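Returning to the set difference used in reachWith, it admits the same merge-based sketch as unionWith; we write it (\\) here to obtain a valid Haskell operator (again an assumption, not the paper's code):

(\\) :: Vec α → Vec β → Vec α
Vec xs \\ Vec ys = Vec (go xs ys)
  where go [ ] _  = [ ]
        go as [ ] = as
        go as@((i, x) : at) bs@((j, _) : bt)
          | i < j     = (i, x) : go at bs   -- i is not in the second vector: keep it
          | i == j    = go at bt            -- i occurs in both vectors: drop it
          | otherwise = go as bt            -- advance the second vector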
¹² This can be used to get paths that alternate between several graphs. Using a singleton list yields the usual reachability. Note that all graphs are traversed in one step.
¹³ We have presented a simplified version in [3].
Given a multiplication that computes the product of v and m in O(size v · dim m) steps, the above reachability function is quadratic in dim m, where dim m is the dimension of a square matrix. This is different from the approach in [5], where all reachability-based results are cubic in dim m. Interestingly, traversing a list of matrices in (cyclic) sequence merely increases the constants in the term O((dim m)²) in our approach. In the semiring scheme of [5] this traversal requires a matrix multiplication, thus increasing the constants in the term O((dim m)³). Structurally this is the difference between computing v · (AB)∗ directly and computing (AB)∗ followed by a multiplication with v. Matrix multiplications can be defined row-wise and thus with our vector-matrix multiplication stencil, too. These can collect or propagate information in a similar fashion as above. A further advantage of our approach is that the closure operation is strict, while a row-based multiplication benefits from Haskell's non-strictness: since (AB)ᵢ = Aᵢ · B, partial information can be obtained without computing the full product.
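The row-wise matrix multiplication alluded to above can be sketched as follows, under the assumption (suggested by the transpose function) that Mat α is essentially Vec (Vec α); the name matMultWith is ours:

-- lifts any vector-matrix multiplication to a matrix multiplication;
-- forcing row i of the result evaluates only the product of row i with
-- m2, so partial results are available without the full product
matMultWith :: (Vec α → Mat β → Vec γ) → Mat α → Mat β → Mat γ
matMultWith mul m1 m2 = fmap (‘mul‘ m2) m1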
6 Disjoint Paths Computation
We now show how to combine our presented multiplication scheme with the pruning scheme of [11] (which relies on non-strictness) to solve a more complex problem in graph theory, namely the computation of a maximal set of shortest pairwise disjoint paths¹⁴ between two vertex sets. The solution to this problem as presented in [9] can be split into two parts: a BFS on the graph to determine the vertices reachable from the first vertex set and a DFS that finds paths between the two sets and removes all vertices on these paths until no paths remain. We use a slightly different approach that is described in [4]. Suppose we are given a reachability forest such that every vertex that is contained in the forest lies on a shortest path between the given vertex sets. Then all we need to do to find a maximal set of pairwise disjoint paths is to perform a depth-first search that collects the paths along the way. How can we obtain this forest? We use the notion of trees and forests from Data.Tree.

type Forest α = [Tree α]
data Tree α = Node α (Forest α)
This data structure is particularly well-suited to represent unevaluated computations. Now we create a multiplication that combines forests into larger forests.

( d ) :: Vec (Forest Vertex ) → Mat α → Vec (Forest Vertex )
( d ) = vecMatMult allSum fMult

fMult :: Vertex → Forest Vertex → Vec α → Vec (Forest Vertex )
fMult = sMultWith (λv forest → [Node v forest ])
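For intuition, a single step from { 1, 5 } in the example graph G could display as follows (a hypothetical session; the trees are pretty-printed and the order inside the lists depends on the union):

ghci> toVecWith [ ] [1, 5] d g
Vec [(2, [Node 1 [ ]]), (3, [Node 1 [ ], Node 5 [ ]]), (6, [Node 1 [ ]])]

Each successor is labelled with the one-layer forest of its predecessors, which further steps extend into the reachability forests of Figure 3.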
We plug this new multiplication into our shortestWith function and obtain

reachForest :: Vec α → Vec β → [Mat γ ] → Vec (Forest Vertex )
reachForest start end = shortestWith ( d ) (fmap (const [ ]) start) end
Figure 3 shows the visualised result of the reachability forests from { 1, 5 } to { 0, 4, 7 } in the example graph G from Figure 1.
¹⁴ Paths p, q are called disjoint iff their respective vertex sets are.
Fig. 3. Reachability forests in the example graph (Steps 0–2).
Note that each of the resulting forests may contain several occurrences of the same element. We observe that every vertex that is contained in a forest in the result vector is located on a shortest path. Now we need to prune the resulting forests to our needs. To do that we can apply a technique very similar to the one presented in [11]. We use a monadic set interface¹⁵ SetM that provides the functions

include  :: Vertex → SetM ()      -- adds a vertex to the monadic set
contains :: Vertex → SetM Bool    -- checks whether a vertex is contained in the set
runNew   :: Int → SetM α → α      -- creates a new set and computes its effect
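One possible implementation of SetM, given here for concreteness only, is a state monad over an IntSet; the variant in Data.Graph uses mutable arrays instead, and in this pure variant the size hint of runNew is simply ignored.

import Control.Monad.State (State, evalState, gets, modify )
import qualified Data.IntSet as IS

type Vertex = Int          -- as assumed throughout the paper
type SetM = State IS.IntSet

include :: Vertex → SetM ()
include v = modify (IS.insert v )     -- mark v as visited

contains :: Vertex → SetM Bool
contains v = gets (IS.member v )      -- has v been visited before?

runNew :: Int → SetM α → α
runNew _ m = evalState m IS.empty     -- the size hint is unused here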
Let (i , f ) be an element in the result vector of the reachForest function. Then f may contain at most one path that is of interest, because every further path has the same final vertex i . This said, the result type of a pruning operation could be SetM (Maybe Path). Instead, we use the monad transformer (cf. [14]) MaybeT , which allows a less convoluted solution.

chop :: Forest Vertex → MaybeT SetM Path
chop [ ] = mzero
chop (Node v ts : fs) = do
  b ← lift (contains v )
  if b then chop fs
       else do lift (include v )
               fmap (.v ) candidate ‘mplus‘ chop fs
  where candidate | null ts   = return ⟨⟩
                  | otherwise = chop ts
The strategy works in the following fashion – if there is no tree left, there is no path left in the forest. If on the other hand there is a leftmost tree, we check whether its root node has been visited. If it has, we continue with the remainder of the forest. If it has not, we visit this vertex. Next we compute a path candidate. In case the candidate is indeed a path, we can add the vertex to its end and obtain a path in the graph. Otherwise we continue searching in the remaining forest. The candidate is the empty path in case ts is empty, since this means we have reached the bottom of the forest, and otherwise it is the path found by the recursive call chop ts.

Finally, we apply the above chop function to the single-tree forest [Node i f ] for every (i , f ) in the result of reachForest and leave MaybeT . Afterwards we sequence the results¹⁶, exit the SetM monad and finally apply catMaybes to the result list; catMaybes removes all Nothing values and maps every Just x value to x .

disjointPaths :: Int → Vec a → Vec b → [Mat c ] → [Path ]
disjointPaths n start end gs = catMaybes (findPaths (reachForest start end gs))
  where findPaths = runNew n ◦ mapM (runMaybeT ◦ chop ◦ toForest) ◦ unVec
        toForest (i, f ) = [Node i f ]
¹⁵ The implementation of SetM is interchangeable, cf. Data.Graph.
¹⁶ By definition sequence ◦ map f = mapM f .
Clearly, the functions chop and disjointPaths are more complicated than the “one-liners” from the previous sections, but compared to the complexity of the actual problem, this solution is still reasonably simple.
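As a usage sketch on the running example, the following hypothetical session computes a maximal set of pairwise disjoint shortest paths from { 1, 5 } to { 0, 4, 7 } in G. The concrete paths depend on the order in which chop traverses the forests, so the output shown is only one possible result.

ghci> disjointPaths 9 (toVecWith () [1, 5]) (toVecWith () [0, 4, 7]) [g ]
[⟨1, 2, 0⟩, ⟨5, 3, 4⟩]

Both paths have length two and are vertex-disjoint; since there are only two start vertices, no larger set is possible.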
7 Discussion and Related Work
Our approach to vector-matrix multiplications and graph algorithms is convenient in the sense that we focus on the required parts of the computation. Proving desired properties requires a precise specification of these components, as we have hinted at before (informally). Focussing on the essential parts of a computation can both reveal its inner structure and simplify it, since the set of required axioms may be smaller than in a purely algebraic setting. While prototypical, our framework is easily modularised over the employed data structures. For instance, when vectors need heavy random access, it is (probably) better to use Data.IntMap instead of a simple association list. We already hinted at the fact that addition may be more efficient when using an intermediate IntMap that is discarded later on. Similarly, it can be useful to replace an intersection of the type C a → C b → C a with one of the type C a → FastQuery b → C a that is no longer based on merging, but on traversing the left structure and querying the second one. Since the calculations are independent of the data structures, such optimisations are straightforward.

There are various works that deal with graph algorithms or matrix operations in Haskell. The graph library fgl, which is based upon the seminal work [7], treats graphs in an abstract fashion, without explicit matrix algorithms. The work [11] does contain implicitly algebraic reasoning, but without providing the abstract context that we need. Certain graph algorithms are specified in [10] through the creation of lazy monolithic arrays, and [13] provides an interface for specifying graph algorithms via a monadic EDSL. Arrays are used for fast vector-matrix multiplications in [2]. In [6] the author deals with a view on vector-matrix multiplication that is similar to ours, and in [1] star algorithms in a relational context are considered in different implementations.

The main difference between our approach and those above, in particular the one in [5], is that we deal with a scheme that is suited to produce multiplications for reachability algorithms rather than the definition of a semiring instance and the application of a closure operation. This approach provides a versatile tool and is at the same time more convenient and efficient than the closure operation. An example of both is our disjoint path computation from the previous section – we ignore edge labels (which would require a transformation in the semiring setting), because the values we operate with are forests of paths and not the values in the graph. The resulting function reachForest is in fact quadratic in the number of vertices in the graph and thus the computation of disjoint paths is just as asymptotically complex as the computation of a single path. To summarise the differences to the semiring approach: we do not require cubic closure operations,
our functions are more abstract and need not be homogeneous in the values (i.e. of type Vec σ → Mat σ → Vec σ), and finally, the vector-matrix multiplication of a semiring is easily implemented with our functions (cf. Section 4.1).

Acknowledgements. I thank Jan Christiansen and Rudolf Berghammer for comments on a draft of this paper, and Insa Stucke for general discussions. I am grateful for the very insightful remarks provided by the author of the student paper feedback and the comments of the reviewers.
References

1. Berghammer, R. A Functional, Successor List Based Version of Warshall's Algorithm with Applications. In RAMICS (2011), H. C. M. de Swart, Ed., vol. 6663 of LNCS, Springer, pp. 109–124.
2. Chakravarty, M. M. T., and Keller, G. An Approach to Fast Arrays in Haskell. In AFP (2002), J. Jeuring and S. P. Jones, Eds., vol. 2638 of LNCS, Springer, pp. 27–58.
3. Danilenko, N. Using Relations to Develop a Haskell Program for Computing Maximum Bipartite Matchings. In RAMICS (2012), W. Kahl and T. G. Griffin, Eds., vol. 7560 of LNCS, Springer, pp. 130–145.
4. Dinitz, Y. Dinitz' Algorithm: The Original Version and Even's Version. In Essays in Memory of Shimon Even (2006), O. Goldreich, A. Rosenberg, and A. Selman, Eds., vol. 3895 of LNCS, Springer, pp. 218–240.
5. Dolan, S. Fun with Semirings: A Functional Pearl on the Abuse of Linear Algebra. In ICFP '13 (2013), G. Morrisett and T. Uustalu, Eds., ACM, pp. 101–110.
6. Elliott, C. Reimagining Matrices. www.conal.net/blog/posts/reimagining-matrices.
7. Erwig, M. Inductive Graphs and Functional Graph Algorithms. J. Funct. Program. 11, 5 (2001), 467–492.
8. Hinze, R., and Paterson, R. Finger Trees: A Simple General-Purpose Data Structure. J. Funct. Program. 16, 2 (2006), 197–217.
9. Hopcroft, J. E., and Karp, R. M. An n^{5/2} Algorithm for Maximum Matchings in Bipartite Graphs. SIAM J. Comput. 2, 4 (1973), 225–231.
10. Johnsson, T. Efficient Graph Algorithms Using Lazy Monolithic Arrays. J. Funct. Program. 8, 4 (1998), 323–333.
11. King, D. J., and Launchbury, J. Structuring Depth-First Search Algorithms in Haskell. In POPL (1995), R. K. Cytron and P. Lee, Eds., ACM, pp. 344–354.
12. Kozen, D. On Kleene Algebras and Closed Semirings. In MFCS (1990), B. Rovan, Ed., vol. 452 of LNCS, Springer, pp. 26–47.
13. Lesniak, M. Palovca: Describing and Executing Graph Algorithms in Haskell. In PADL (2012), C. V. Russo and N.-F. Zhou, Eds., vol. 7149 of LNCS, Springer, pp. 153–167.
14. Liang, S., Hudak, P., and Jones, M. Monad Transformers and Modular Interpreters. In POPL (1995), R. K. Cytron and P. Lee, Eds., ACM, pp. 333–343.
15. Marlow, S. The Haskell Report. www.haskell.org/onlinereport/haskell2010.
16. Rabhi, F., and Lapalme, G. Algorithms – A Functional Programming Approach. Addison-Wesley, 1999.