Labeled Radial Drawing of Data Structures

M. Bernard 1, S. Mohammed 2
The University of the West Indies, Trinidad & Tobago
1 [email protected]; 2 [email protected]

Abstract
This paper describes a radial layout method for displaying B+-tree data structures. We present an algorithmic framework for computing node positions that result in a planar drawing. Layout issues related to displaying the internal structure of the nodes are addressed. Each field value and associated pointer that makes up the internal structure of a node is treated as a subnode. The drawing technique uses different polygonal shapes for the subnodes, which allows curvature in the design. We discuss a layout of text labels for the fields of the nodes that provides good readability and preserves the semantics of the data structure. The edge positioning shows the association of the pointers with their corresponding field labels. The radial drawing of the B+-tree makes better use of the display space than the traditional hierarchical drawing.

1. Introduction
This paper describes a radial layout method for displaying B+-tree data structures, which are commonly used index structures for text and relational databases. Radial layout is a well-established category of tree drawings [1,6,7]. In a radial tree drawing, the root or focus node is placed at the origin and other vertices are placed on concentric circles centered at the origin according to their distances from the root. Most layout algorithms in the literature were developed to lay out graphs in which nodes are points. With data structures, however, the nodes are not simply points but compound nodes with an internal structure consisting of fields and pointers that must be visible in the drawing. The main contribution of our method is the layout design, including the geometry of the nodes, the labeling of the nodes with their field values, and the specification of ports, the specific locations where the edges connect to the nodes.

Index data structures are almost always represented as hierarchical drawings, providing a top-down view. However, in the hierarchical view the nodes on the lower levels become crowded very quickly, particularly for B+-trees, which are short and wide; it leaves a lot of unused space in the upper part of the drawing and the display quickly becomes cluttered. A radial drawing provides a more even distribution of the nodes. It spreads the larger number of nodes over a larger area as the levels increase and can accommodate a larger number of leaf nodes, as these are drawn on the circumference of a circle rather than horizontally. The radial drawing makes better use of the display space. Nodes can be allocated more display space so that the node content, which is of importance to the viewer, can be displayed. In the radial drawing, the hierarchical structure is still discernible, and the central role that the root plays is magnified.

There are several variations of methods in the literature for radial tree drawings. A well-known algorithm [7] for the radial view recursively positions children of a sub-tree into circular wedges. The angular width of a node’s subtree is proportional to the number of leaves of the subtree. Nodes are drawn as small circles of the same size. [21] provides an animation technique for supporting interactive exploration of radial graphs; it augments the previous method to allow nodes to be drawn with varying sizes corresponding to the size of the node’s sub-tree. [14] considers the issue of optimizing display space in layout design and uses a radial-type drawing coupled with techniques from tree maps [10] to develop a space-optimized tree visualization. Several graph visualization systems include radial layout algorithms [9,19]. The radial drawing algorithms presented in the literature have taken an ‘in-out’ approach in which the positions of child nodes are computed relative to their parents. The radial drawing method used in this paper takes advantage of the balanced nature of B+-trees and uses an ‘out-in’ approach in which the positions of the leaf nodes are determined first and the positions of parent internal nodes are computed relative to their children.

Previous work on displaying data structures [5,16,20] has explored Orthogonal Drawings and Hierarchical Drawings. Emphasis has been on the architectural features and user interface facilities in tools for displaying data structures. Waddle [18] discusses layout issues for hierarchical and non-hierarchical, leveled drawings of data structures in which the node’s internal structure is considered. His emphasis was on avoiding edge crossings inside the nodes of the data structure when the edges extend to ports inside the nodes.

In recent years, the graph labeling problem has been receiving increasing attention in the graph drawing community. Much of the research has focused on the edge labeling problem (ELP) for graphs with fixed geometry. Kakoulis and Tollis [11] present an algorithm for the ELP that can be applied to hierarchical drawings. Several authors [3,12] have studied the problem of computing labeled orthogonal drawings. Nakano et al. [13] give an algorithm for labeling a set of points in a plane with axis-parallel rectangles where the labels are approximated by their bounding rectangles. The graph labeling problem has also been addressed for specific application domains involving schemas with textual content such as UML Class diagrams [15] and Statecharts [4]. To our knowledge, no work has been published that deals specifically with the geometry of node shapes and the placement of text labels for radial drawings.

The drawing techniques presented in this paper describe the geometry of the nodes, which use different polygonal shapes to allow more curvature in the design. We discuss a layout of text labels for the nodes that provides good readability and preserves the semantics of the data structure. Pointers are represented as edges. The positioning of these edges shows the association of the pointers with their corresponding key field values.

The rest of this paper is organized as follows. In Section 2 we present Algorithm 1, which is the framework for the drawing system. In Section 2.1 we discuss the geometry and layout of the compound nodes of the data structure. In Section 2.2 we discuss text labeling of the nodes and, in Section 2.3, the layout of the edges. Section 3 gives some implementation details and the direction of future work. We conclude with some comments in Section 4.

2. Radial Drawing Method
The drawing algorithm presented in this paper generates a radial drawing of the B+-tree. B+-trees are non-binary tree structures in which all the leaves are on the same level. The drawing is a visual representation of a set of data values that are the key field values of some file or data set. All nodes contain several field values that are arranged in ascending order within the nodes. Internal nodes also contain pointers to nodes on the next level. The levels of the B+-tree are drawn on concentric circles, with the leaves radiating outward from the outermost circle, the internal nodes on circles of decreasing radii, and the root at the centre.

In Algorithm 1 given below, the positions of the leaf nodes are calculated first. Leaves are equally spaced along the circumference of the outermost circle of radius r. The first leaf node is placed at some selected position (denoted by the start angle) and the other leaf nodes are added clockwise at equal arc distances. The positions of parent internal nodes are computed relative to their children. This is an ‘out-in’ approach to drawing. The radial drawing algorithms presented in the literature have taken an ‘in-out’ approach in which the positions of child nodes are computed relative to their parents, and node positions are calculated in sectors so as to avoid edge crossings. However, since the B+-tree is a balanced tree with all leaves on the same level, an ‘out-in’ approach simplifies calculations and the question of edge crossings does not arise. The position of a parent node (except the root) is chosen so that it is midway between its first and last child. This ensures that the graph drawing is planar, with no edges crossing. Because of the balanced nature of the tree, there are no problems with the sector sizes. The algorithm can be used in a wider context, beyond data structures, for general tree structures in which all leaves are at the same depth in the tree.

In the B+-tree the nodes are not simply points but more complex structures. They are multivalent compound nodes with several text labels and several edges associated with each node. The text ordering and edge ordering are important to the semantics of the data structure, so this internal structure of the nodes must be clearly displayed in the drawing. Several issues related to the geometry of the nodes, the layout of text labels within the nodes, and the positioning of the connecting lines must be considered. Algorithm 1 gives the general framework for the tree drawing:

Algorithm 1 DrawTree(LNodes, INodes, RNode)
  // Draw Leaf Nodes
  Repeat for i = 1 to numLNodes {
    LNode.Angle = startAngle + 2π(i - 1)/numLNodes;
    LNode.Distance = r;
    InsertLeafText(LNode.Angle, LNode.Distance, RectHeight, RectWidth, LNode.DataValues);
  }
  // Draw Internal Nodes
  Repeat for levelNum = 1 to numILevels {
    For k = 1 To numINodes[levelNum] {
      INode.Angle = 0.5(FirstChild.Angle + LastChild.Angle);
      INode.Distance = r - (levelNum * δ);
      DrawInternal(INode.Angle, INode.Distance, ShapeHeight, ShapeWidth);
      InsertInternalText(INode.Angle, INode.Distance, ShapeHeight, ShapeWidth, INode.DataValues);
      If levelNum = 1
        DrawLeafLines(INode.Angle, INode.Distance, ShapeHeight, ShapeWidth);
      Else
        DrawInternalLines(INode.Angle, INode.Distance, ShapeHeight, ShapeWidth, levelNum, k);
    }
  }
  // Draw Root Node
  DrawRoot(Height, Width);
  InsertRootText(Height, Width, RNode.DataValues);
  DrawRootLines(Height, Width);
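As an illustration of the ‘out-in’ placement used by Algorithm 1, the following Java sketch computes only the polar positions of leaves and of one internal level; it does not draw anything. The class name OutInLayout, the Node type and the method names are hypothetical and are not taken from the authors' implementation. The sketch assumes that leaf angles increase monotonically from the start angle and that the children of a parent are contiguous, so the midpoint of the first and last child angles never wraps around the circle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the 'out-in' placement; names are illustrative only. */
final class OutInLayout {

    /** A tree node with its polar position (angle in radians, distance from the centre). */
    static final class Node {
        final List<Node> children = new ArrayList<>();
        double angle;
        double distance;
    }

    /** Places the leaves at equal angular steps on the outermost circle of radius r. */
    static void placeLeaves(List<Node> leaves, double startAngle, double r) {
        int n = leaves.size();
        for (int i = 0; i < n; i++) {          // i here plays the role of (i - 1) in Algorithm 1
            Node leaf = leaves.get(i);
            leaf.angle = startAngle + 2 * Math.PI * i / n;
            leaf.distance = r;
        }
    }

    /** Places one internal level: each node sits midway between its first and last child. */
    static void placeInternalLevel(List<Node> level, int levelNum, double r, double delta) {
        for (Node node : level) {
            Node first = node.children.get(0);
            Node last = node.children.get(node.children.size() - 1);
            node.angle = 0.5 * (first.angle + last.angle);
            node.distance = r - levelNum * delta;   // circles shrink by delta per level
        }
    }

    public static void main(String[] args) {
        // Four leaves shared by two internal nodes (two children each).
        List<Node> leaves = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            leaves.add(new Node());
        }
        Node left = new Node();
        left.children.addAll(leaves.subList(0, 2));
        Node right = new Node();
        right.children.addAll(leaves.subList(2, 4));

        placeLeaves(leaves, 0.0, 100.0);
        placeInternalLevel(Arrays.asList(left, right), 1, 100.0, 30.0);
        System.out.printf("left:  angle=%.3f distance=%.1f%n", left.angle, left.distance);
        System.out.printf("right: angle=%.3f distance=%.1f%n", right.angle, right.distance);
    }
}
```

Whether increasing angles appear clockwise or anticlockwise on screen depends on the coordinate system of the drawing surface; the sketch leaves that choice to the caller.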


2.1 Geometry and layout of Tree nodes
B+-tree nodes consist of arrays of field values (character strings) and pointers associated with those field values. Data structures can be modelled in this way: nodes consist of a sequence of ‘subnodes’, which have an ordering that is important to the semantics of the data structure. Each subnode has a label (the record’s key field value). In addition, subnodes have one or more edges incident on them (except for subnodes of leaf nodes). The radial drawing technique addresses the layout of the subnodes that comprise the nodes, the layout of the text labels within the subnodes, and the positioning of the edges associated with each subnode. Subnodes are represented as polygons, and the subnodes of a given node are drawn in contact with each other so as to give the appearance of a single node.

Leaf nodes are drawn on the outermost concentric circle. All leaf nodes are the same size and contain the same number of subnodes, say b (referred to as the leaf bucket factor of the B+-tree). Each leaf node is drawn as b rectangles adjacent to each other and in contact along their sides, as shown in Figure 1 for b=3. The leaf node can be oriented in different directions on the circumference of the outermost circle. Several approaches were tried: horizontally across the circle, vertically, and even longitudinally along the circle (similar to disk blocks on a magnetic disk track). The preferred design was to orient the leaf node rectangles perpendicular to the tangent to the outermost circle at the point of contact, u, as shown in Figure 1. This design accommodates a large number of leaf nodes on a plane surface and avoids the problem of leaves overlapping.

The internal nodes on a given level are drawn along the circumference of a circle as shown in Figure 1. There are far fewer nodes on the circles at the internal levels than at the leaf level; hence, space was not as critical.
A parallelogram shape was selected to represent each subnode within an internal node, as this allowed more curvature and hence gave a more radial effect along the circle than could be accomplished with a rectangle. The parallelograms comprising one node are sheared and rotated to orient them into the circle. An initial gradient is selected for the first parallelogram; to give a circular effect, the gradient is then incremented by some factor each time a parallelogram is drawn. The root node is drawn at the centre of the circles. Rectangles are used to represent the root subnodes. The rectangles are positioned vertically, stacked one below the other, so that they give the appearance of a single node.
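The paper does not give the exact construction of the internal-node parallelograms, so the following Java sketch shows only one plausible reading of the incremented-gradient idea. It assumes that the gradient is the slope of the top and bottom edges, that the sides shared by neighbouring subnodes stay vertical in the node's local coordinates, and that each parallelogram starts where the previous one ended, so the subnodes remain in contact. The class name SubnodeChain and all parameter names are hypothetical; rotating the finished chain into position on the circle is left out.

```java
import java.awt.geom.Path2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a chain of touching parallelograms whose gradient grows step by step. */
final class SubnodeChain {

    /**
     * Builds 'count' parallelograms of the given width and height in local coordinates.
     * The slope of the top and bottom edges starts at 'gradient' and is increased by
     * 'step' after each parallelogram, which bends the chain into a curve.
     */
    static List<Path2D> chain(int count, double width, double height,
                              double gradient, double step) {
        List<Path2D> shapes = new ArrayList<>();
        double x = 0.0;
        double y = 0.0;
        for (int k = 0; k < count; k++) {
            double rise = gradient * width;       // vertical offset across one subnode
            Path2D.Double p = new Path2D.Double();
            p.moveTo(x, y);                       // corner shared with the previous subnode
            p.lineTo(x + width, y + rise);
            p.lineTo(x + width, y + rise + height);
            p.lineTo(x, y + height);
            p.closePath();
            shapes.add(p);
            x += width;                           // next subnode starts on the shared side
            y += rise;
            gradient += step;                     // increment the gradient for curvature
        }
        return shapes;
    }

    public static void main(String[] args) {
        for (Path2D shape : chain(3, 10.0, 4.0, 0.0, 0.1)) {
            System.out.println(shape.getBounds2D());
        }
    }
}
```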

Figure 1 Orientation of Nodes in the Radial Drawing (diagram labels: Root, Internal level, Leaf level, point of contact u)

2.2 Labeling Nodes
Displaying the text labels in the radial drawing of the B+-tree is a problem of labeling the nodes of the graph. Graph drawing algorithms for radial graphs have not specifically addressed the problem of labeling nodes. Where labels are required, they are simply written horizontally in the vicinity of the node, or over the node and allowed to extend beyond it [21]. In the label layout presented here for the B+-tree, the labels fit within the node shape and the labels themselves are drawn radially. This involves design decisions relating to the direction of the text.

In this radial drawing, the labels are properties of the subnodes and are drawn within the polygon that represents the subnode. All labels are the same length (the B+-tree uses fixed-length record key values). For leaf nodes, the text labels are bounded by the rectangles that represent the subnodes. The label is drawn with its base along the width of the rectangle, using a character font size that does not exceed the height of the rectangle. To maintain the semantics of the data structure, the text labels must be read in ascending order within the leaf nodes and the sequence must continue from one leaf node to the next. For example, in Figure 2 the first string of the first leaf is AAE; this is followed by AAP and ABC, and reading then proceeds to the next leaf in an anticlockwise manner.

English text, in its normal horizontal usage, is read and written from left to right. In the radial layout, labels in the left semicircle are drawn starting from the outermost edge of the leaf so that key values in higher lexicographical order are closer to the centre (an ‘out-in’ direction). This gives the normal left-to-right reading direction. In the right semicircle a left-to-right direction requires that labels be drawn starting from the innermost edge of the leaf (an ‘in-out’ direction). This can cause the viewer some mental disorientation at the 90° and 270° positions where the changes would be effected. Alternatively, labels in the right semicircle can continue to be drawn ‘out-in’, as is shown in the complete drawing in Figure 3. Reading must now be right to left in the right semicircle, but there is no disorientation at the 90° and 270° positions. This latter design actually reinforces the radial character of the drawing, as viewers always read ‘out-in’ towards the centre.

The text labels of the internal nodes are drawn along the circumference of the circle for that level, within the parallelograms. Labels are placed along the longer diagonal of the parallelogram. Text labels in the lower semi-circle are inverted so that the characters do not appear upside down. For the root node, the text labels are simply drawn horizontally within each subnode rectangle.
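The inversion of internal-node labels in the lower semi-circle can be expressed independently of the coordinate system: text rotated by an angle φ appears upside down exactly when cos φ < 0, and adding or subtracting π turns it upright again. The Java sketch below illustrates that rule only. The class LabelOrientation and its methods are hypothetical names, and the sketch assumes a label laid along the tangent of the circle rather than along the parallelogram diagonal that the paper uses.

```java
/** Hypothetical sketch of the "never upside down" rule for labels drawn along a circle. */
final class LabelOrientation {

    /** Rotation of a label laid along the tangent of the circle at polar angle theta. */
    static double tangentRotation(double theta) {
        return theta + Math.PI / 2;
    }

    /**
     * Normalises a text rotation to the range [-π/2, π/2] by flipping it through π
     * whenever the text would otherwise be drawn upside down (cos(rotation) < 0).
     */
    static double upright(double rotation) {
        double a = Math.IEEEremainder(rotation, 2 * Math.PI);   // now in [-π, π]
        if (a > Math.PI / 2) {
            a -= Math.PI;
        } else if (a < -Math.PI / 2) {
            a += Math.PI;
        }
        return a;
    }

    public static void main(String[] args) {
        for (int deg = 0; deg < 360; deg += 45) {
            double theta = Math.toRadians(deg);
            double rot = upright(tangentRotation(theta));
            System.out.printf("node at %3d deg -> label rotation %.1f deg%n",
                    deg, Math.toDegrees(rot));
        }
    }
}
```

The resulting angle could be passed to Graphics2D.rotate before drawing the string; the leaf labels, which the paper draws ‘out-in’ in both semicircles, deliberately do not follow this rule.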

Figure 2: A portion of a radial tree drawing showing 1 internal node and 4 leaf nodes

2.3 Drawing Edges
Edges are drawn as straight lines. The challenge here is specifying the locations where the edges connect to the nodes, referred to as ports [8]. From the properties of a B+-tree, an internal node with n index records will point to n+1 child nodes. Hence, each internal node will have several edges connecting to nodes on the lower level. These edges do not all begin at the same point, as would be the case if the node were simply a point or small circle. With the data structure, the edges must be incident on the subnodes of the internal nodes. Internal nodes (excluding the root) have ‘incoming’ and ‘outgoing’ ports for each node [17]. An internal node with n subnodes will have n+1 outgoing ports. These are set at the ‘outer’ corners of the parallelograms representing the subnodes. Figure 2 shows an internal node with 3 subnodes and 4 edges anchored at outgoing ports. The incoming port of an internal node is located at the midpoint of the inner arc of the node. Each leaf node has exactly one port, an incoming port, which is located at the midpoint of the side of the innermost rectangle. The root node has only outgoing ports. The edges from the root node to internal nodes are incident on the root node as though the edges end at the root node centre. The edges stop at the borders of the rectangles of the root.

3. Implementation and Future Work
The drawing algorithm was implemented using Java. The methods for DrawLeaf, DrawInternal and DrawRoot from Algorithm 1 require manipulation of rotation and translation operations. Affine Transformations were used to transform one coordinate system to another. Affine Transformations can be constructed using sequences of translations, scales, rotations, and shears, and these can be concatenated, for example, a translation followed by a rotation. Each transformation takes place in the coordinate system created by the previous transformation. An Affine Transformation is used in the drawing of the shapes as it preserves the straightness of lines and allows concatenation of transformations. For example, in the method for DrawLeaf given in A1.1 below, an Affine Transformation is used to concatenate a translation to the center of the canvas followed by a rotation through the angle calculated for the leaf node. The Affine Transformation is performed on the rectangle, which is then drawn using the Graphics2D’s draw method.

A1.1 DrawLeaf (α, R, height, width)
  Repeat for p = 1 to numLeafRec {
    a = new AffineTransform();
    a.translate(xcenter, ycenter);
    a.rotate(α);
    rectRadius = R + (numLeafRec - p) * width;
    rectangle = new Rectangle2D.Double(0, rectRadius, height, width);
    graph.draw(a.createTransformedShape(rectangle));
  }

For the interested reader, implementation details of each of the methods in Algorithm 1 can be found at [2].

This radial drawing software was developed in a teaching context. Most Computer Science academic programs at university level include courses on Data Structures. The Graph Drawing software was used in the teaching of Index Structures in such a course. Without this automated support, the instructor will generally use very small examples to illustrate concepts. This drawing software can be used to illustrate larger examples; a data file of up to 200 short key values can be displayed on a screen with good visual impact.

The user interface was designed to give the user control over the size of the drawing (the sizes of the radii of the circles can be varied), the start position of the leaf nodes (the entire drawing can be rotated), and the position of the root (the entire drawing can be repositioned). Some simple functionality has been added to the drawing to illustrate random search paths from the root to leaf nodes.
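The DrawLeaf routine A1.1 of Section 3 can be turned into a self-contained Java sketch along the following lines. To keep the example testable without a display, it returns the transformed shapes instead of drawing them; that choice, the class name LeafShapes, and the explicit numLeafRec, xCenter and yCenter parameters are assumptions made for illustration and are not part of the authors' code. The argument order of the rectangle follows A1.1 as printed.

```java
import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of A1.1: the rectangles of one leaf node, rotated into place. */
final class LeafShapes {

    /**
     * Returns the numLeafRec subnode rectangles of a leaf at angle alpha (radians).
     * The transform first moves the origin to the canvas centre and then rotates,
     * so each rectangle is specified in the leaf's own local coordinate system.
     */
    static List<Shape> leafRectangles(double alpha, double radius, double height, double width,
                                      int numLeafRec, double xCenter, double yCenter) {
        AffineTransform a = new AffineTransform();
        a.translate(xCenter, yCenter);   // translation to the centre of the canvas ...
        a.rotate(alpha);                 // ... followed by rotation through the leaf angle

        List<Shape> shapes = new ArrayList<>();
        for (int p = 1; p <= numLeafRec; p++) {
            // Rectangle p = numLeafRec is innermost; earlier ones are stacked outward.
            double rectRadius = radius + (numLeafRec - p) * width;
            // Same argument order as A1.1: x = 0, y = rectRadius, then height and width.
            Rectangle2D rectangle = new Rectangle2D.Double(0, rectRadius, height, width);
            shapes.add(a.createTransformedShape(rectangle));
        }
        return shapes;
    }

    public static void main(String[] args) {
        List<Shape> leaf = leafRectangles(Math.PI / 2, 100.0, 10.0, 20.0, 3, 200.0, 200.0);
        for (Shape s : leaf) {
            System.out.println(s.getBounds2D());
        }
    }
}
```

In a Swing or AWT paint method, each returned shape could be passed to Graphics2D.draw, which is the call A1.1 makes inside its loop. Building the transform once outside the loop is equivalent to rebuilding it per rectangle, since it does not depend on p.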


Figure 3: Radial drawing of a B+-tree

In the future, we intend to add other types of searches such as range queries (a path consisting of a sequence of arcs around the outer circle). In addition, we will extend the drawing algorithm to represent the changes that occur on insertion and deletion of key values; this involves showing the splitting and merging of nodes and possibly even an increase in the tree height. Further investigations are also being carried out into techniques for displaying larger graphs. Common techniques such as zooming, fisheye projections, or clustering cannot be applied in a straightforward manner. Also, in the current version of the drawing software the input data are stored in vector format. We have already implemented a new test version in which graph storage uses XML. Further experimentation with font types and colors will be carried out to improve the aesthetics of the drawing. Graph Visualization software systems can be incorporated into intelligent tutorial systems for the teaching of Computer Science concepts.

The drawing algorithm presented in this paper for B+-trees can be modified for several other popular tree data structures. In particular, the algorithm can be adapted in a straightforward manner for height-balanced trees such as AVL trees and B-trees. The radial drawing may also give reasonable representations for nearly balanced binary search trees and multi-way tries. Several other types of applications in which nodes have important content information, and which traditionally have been displayed as hierarchical drawings, can benefit from a radial view. For example, radial Organization Charts would provide a ‘top-centric’ view of the organization rather than a top-down view.

4. Conclusion
In this paper, we have presented a radial tree drawing method that focuses on the geometry of the nodes, the layout of node labels, and the specification of node ports for connecting edges. Previous papers on radial layout have not specifically examined these aspects of the drawings. They are important for application domains where the nodes themselves contain structural information and the interrelationships among the elements of the node need to be clearly distinguished. The radial drawing method was used for displaying B+-tree data structures, where the nodes consist of arrays of key values and pointers associated with those values. The radial layout of the B+-tree made better use of the display space than the traditional hierarchical layout. Design issues relating to the placement of the nodes and the node shapes, as well as the layout of text in a radial fashion, were addressed.

REFERENCES
1. Bernard M., “On the Automated Drawing of Graphs”, Proc. 3rd Caribbean Conference on Combinatorics and Computing, The University of the West Indies, 1981.
2. Bernard M., Mohammed S., “A Radial Drawing Algorithm for Data Structures with Compact Placement of Node Shapes and Node Labels”, Technical Report 68U05-01-03, The University of the West Indies, 2003.
3. Binucci C., Didimo W., Liotta G., Nonato M., “Computing Labeled Orthogonal Drawings”, Proc. Graph Drawing ’02, LNCS Vol. 2528, Springer-Verlag, 2002, pp. 66-73.
4. Castello R., Mili R., Tollis I., “An Algorithmic Framework for Visualizing Statecharts”, Proc. Graph Drawing ’00, LNCS Vol. 1984, Springer-Verlag, 2001, pp. 139-149.
5. Crescenzi P., Piperno A., “Optimal-Area Upward Drawings of AVL Trees”, Proc. Graph Drawing ’94, LNCS Vol. 894, Springer-Verlag, 1995, pp. 307-317.
6. Di Battista G., Eades P., Tamassia R., Tollis I., Graph Drawing: Algorithms for the Visualization of Graphs, Prentice Hall, New Jersey, 1999.
7. Eades P., “Drawing Free Trees”, Bulletin of the Institute of Combinatorics and its Applications, Vol. 5, 1992, pp. 10-36.
8. Gansner E., Koutsofios S., North S., Vo K., “A Technique for Drawing Directed Graphs”, IEEE Transactions on Software Engineering, Vol. 19, No. 3, 1993, pp. 214-230.
9. Herman I., Melancon G., deRuiter M., Delest M., “Latour – A Tree Visualization System”, Proc. Graph Drawing ’99, LNCS Vol. 1731, Springer-Verlag, 1999, pp. 392-399.
10. Johnson B., Shneiderman B., “Tree Maps: A Space-filling Approach to the Visualization of Hierarchical Information Structures”, Proc. 1991 IEEE Visualization Conference, IEEE CS Press, 1991, pp. 284-291.
11. Kakoulis K., Tollis I., “An Algorithm for Labeling Edges of Hierarchical Drawings”, Proc. Graph Drawing ’97, LNCS Vol. 1353, Springer-Verlag, 1998, pp. 169-180.
12. Klau G., Mutzel P., “Combining Graph Labeling and Compaction”, Proc. Graph Drawing ’99, LNCS Vol. 1731, Springer-Verlag, 2000, pp. 27-37.
13. Nakano S., Nishizeki T., Tokuyama T., Watanabe S., “Labeling Points with Rectangles of Various Shapes”, Proc. Graph Drawing ’00, LNCS Vol. 1984, Springer-Verlag, 2001, pp. 91-102.
14. Nguyen Q., Huang M., “A Space-Optimized Tree Visualization”, Proc. IEEE Symposium on Information Visualization, InfoVis2002, IEEE CS Press, 2002, pp. 85-92.
15. Seeman J., “Extending the Sugiyama Algorithm for Drawing UML Class Diagrams: Towards Automatic Layout of Object-oriented Software Diagrams”, Proc. Graph Drawing ’97, LNCS Vol. 1353, Springer-Verlag, 1997, pp. 415-424.
16. Shimomura T., Isoda S., “Linked-List Visualization for Debugging”, IEEE Software, Vol. 8, No. 3, 1991, pp. 44-51.
17. Tom Sawyer Software, Graph Layout Toolkit, www.tomsawyer.com/glt/ports.html
18. Waddle V., “Graph Layout for Displaying Data Structures”, Proc. Graph Drawing ’00, LNCS Vol. 1984, Springer-Verlag, 2001, pp. 241-252.
19. Wills G., “NicheWorks – Interactive Visualization of Very Large Graphs”, Journal of Computational and Graphical Statistics, Vol. 8, No. 2, 1999, pp. 190-212.
20. Yang J., Shaffer C., Heath L., “SWAN: A Data Structure Visualization System”, Proc. Graph Drawing ’95, LNCS Vol. 1027, Springer-Verlag, 1995, pp. 520-523.
21. Yee K., Fisher D., Dhamija R., Hearst M., “Animated Exploration of Dynamic Graphs with Radial Layout”, Proc. IEEE Symposium on Information Visualization, InfoVis2001, IEEE CS Press, 2001, pp. 43-50.
