International Journal of Computer Science and Information Security (IJCSIS), Vol. 15, No. 3, March 2017
Binary Search Optimization Implementation and Amortized Analysis for Splitting the Binary Tree
Dhaval Kadia
Student, Department of Computer Science & Engineering, Faculty of Technology and Engineering, The Maharaja Sayajirao University of Baroda, Vadodara, India
[email protected]
As insertion proceeds, the skewness of each sub tree is checked and the tree is balanced accordingly. It is also checked whether a sub binary tree is becoming too dense. The binary tree must not become skewed, and the methods are implemented on that basis. The implementation is very similar to an AVL tree. Trees can be balanced recursively with proper control.
Abstract—In this paper, I describe how a binary tree can be manipulated for efficient access to its elements. The binary tree is a well-known data structure because of its geometry. Despite its well-organized properties, it has drawbacks in some cases. As the number of elements in a binary tree increases, the many elements placed in the bottommost levels take the longest to access. One solution is to split the main binary tree into multiple small binary trees, which amounts to making a bucket for each of them. This solution raises a question: how dense should the small (sub) binary trees extracted from the main binary tree be? The answer should show that, after splitting the main binary tree, the overall searching time complexity decreases or at least does not increase. The changes should also prevent a sub tree from behaving like a dense tree. The density of a nascent sub tree should lie in a range that decreases the searching time complexity. This paper describes the relation among the density (depth) of the main binary tree, the number of splits, and the density (depth) of the sub binary trees. The code that I implemented in the C language is explained in the paper. The implementation manipulates sub binary trees that are replicas of one massive binary tree. The manipulation consists of balancing a binary tree, merging lightweight binary trees, and further splitting dense sub trees.
II. DEFINITIONS AND NOTATIONS
Let the depth of the main binary tree Tp be t, so Tp has t levels. After the split, each sub binary tree Tc has d levels. The number of sub trees extracted from the main tree and connected consecutively is l. For calculation, l = 2^k is assumed. The data structure used for storing the binary tree is a linked list. Each sub binary tree is assumed to be linked with the next sub tree. Amortized complexity is abbreviated as A.C.
III. ANALYSIS
The analysis is based on the condition that a sub tree can be acquired in constant time.
Keywords—Data Structure; Binary Tree; Binary Search Tree; Tree Balancing; AVL Tree; Amortized Analysis; Optimization
In general, the amortized complexity of searching in a binary tree T of depth n is calculated as below:
I. INTRODUCTION
The motive for modifying the binary tree is to obtain a well-structured binary tree; to that end, the main binary tree is split according to a proper ratio between the depths of the main binary tree and the sub binary trees. The problem that motivates the modification is the large groups of elements placed in and just above the bottommost layer. These elements have a higher searching time complexity than elements in the upper levels, and they increase the amortized complexity of searching. If a solution is found for the bottommost elements, the amortized time complexity will be reduced. Two cases are shown to demonstrate how proper changes can be introduced without degrading the original retrieval performance of the main binary tree.
A. Initialization
To search for an element in the whole set, only the particular Tc that can contain it is searched. Finding which Tc among the set of sub binary trees to search is assumed to take constant time. This assumption rests on the static range division of Tp: each Tc is allotted a particular range of values that it accepts for insertion. Once that particular Tc is reached, a separate binary search can be performed on it. The benefit of this procedure is that a tree of smaller depth is searched. For binary search in binary trees, nodes at different depths have different access times.
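The two-step lookup described in this subsection can be illustrated with a minimal sketch. It assumes equal-width static ranges over a known key interval; the names Node, NUM_BUCKETS, KEY_MIN, KEY_MAX, bucket_index, bst_search and set_search are hypothetical and are not taken from the paper's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout; the names are illustrative only. */
typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

#define NUM_BUCKETS 4   /* l = 2^k with k = 2 (assumed) */
#define KEY_MIN 0       /* assumed static key range [KEY_MIN, KEY_MAX) */
#define KEY_MAX 64

/* Map a key to its bucket in constant time using equal-width ranges. */
size_t bucket_index(int key)
{
    assert(key >= KEY_MIN && key < KEY_MAX);
    int width = (KEY_MAX - KEY_MIN) / NUM_BUCKETS;
    return (size_t)((key - KEY_MIN) / width);
}

/* Ordinary iterative binary-search-tree lookup inside one sub tree Tc. */
const Node *bst_search(const Node *root, int key)
{
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root;
}

/* Search the whole set: pick the bucket, then search only that Tc. */
const Node *set_search(Node *const buckets[NUM_BUCKETS], int key)
{
    return bst_search(buckets[bucket_index(key)], key);
}
```

With these assumed constants each bucket covers 16 consecutive keys, so a lookup descends at most the depth of one Tc rather than the depth of Tp.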
Secondly, methods are shown for organizing the individual sub binary trees. Sub trees are put in buckets, each having a dynamic range of minimum and maximum values that are permitted to be inserted. On insertion, each element is inserted into the bucket whose range admits its value.
a. In Tp and Tc, p represents parent and c represents child: Tp is the main (parent) tree and Tc is a sub (child) tree.
https://sites.google.com/site/ijcsis/ ISSN 1947-5500
The number of elements at each level and their searching complexity are shown in Table I. Level denotes depth, N(i) is the number of nodes at level i, the Search-Time of a node at level i is i + 1, and TST(i) is the Total-Search-Time of all nodes at level i.

TABLE I. SEARCHING TIME COMPLEXITY STATISTICS

| Level | N(i) | Search-Time | TST(i) |
|---|---|---|---|
| 0 | 2^0 | 1 | 2^0 * 1 |
| 1 | 2^1 | 2 | 2^1 * 2 |
| 2 | 2^2 | 3 | 2^2 * 3 |
| 3 | 2^3 | 4 | 2^3 * 4 |
| ... | ... | ... | ... |
| n | 2^n | n + 1 | 2^n * (n + 1) |

Ratio of the main case over the split case (should be > 1):

$$\text{Ratio} = \frac{A.C._p}{A.C._b} = \frac{\dfrac{\sum_{i=0}^{t} 2^i (i+1)}{2^t}}{\dfrac{l \sum_{i=0}^{d} 2^i (i+1)}{2^t}} = \frac{\sum_{i=0}^{t} 2^i (i+1)}{l \sum_{i=0}^{d} 2^i (i+1)}$$

For d + k = t, taking t = m * k:

d + k = m * k
d = (m - 1) * k
This expresses, for a particular value of m, what t should be relative to k. Later, it leads to the result: for any value of k, how many elements should be allowed in each bucket, i.e., in each Tc.
For different values of m, the numbers of nodes in Tp and Tc are determined, and for those values of m the Optimization Ratio decides whether searching becomes faster after splitting. If the Ratio is greater than 1, then A.C.b is less than A.C.p, which means that the bucket phase decreases the average searching time. The Ratio is plotted versus m. Since the Ratio is greater than 1, the average searching time is reduced compared with the previous case.

For a tree whose maximum level is n (levels start from 0):

$$A.C. = \frac{\text{Sum of TST at every level}}{\text{Total number of nodes per binary tree}} = \frac{\sum_{i=0}^{n} 2^i (i+1)}{2^n}$$
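The A.C. expression for a single tree can be evaluated directly. The sketch below is only an illustration: the function names total_search_time and amortized_cost are hypothetical, and the denominator 2^n follows the convention used in the text. The summation also has the closed form n * 2^(n+1) + 1, which can serve as a sanity check.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of TST(i) = 2^i * (i + 1) over levels 0..n. */
uint64_t total_search_time(unsigned n)
{
    assert(n < 56);  /* keeps the running sum within 64 bits */
    uint64_t sum = 0;
    for (unsigned i = 0; i <= n; i++)
        sum += ((uint64_t)1 << i) * (i + 1);
    return sum;
}

/* A.C. as defined in the text: total search time divided by 2^n. */
double amortized_cost(unsigned n)
{
    return (double)total_search_time(n) / (double)((uint64_t)1 << n);
}
```

For example, a tree with levels 0 to 3 has a total search time of 1 + 4 + 12 + 32 = 49, so amortized_cost(3) evaluates to 49 / 8.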
B. Derivation
A.C. for the main tree having t levels:

$$A.C._p = \frac{\sum_{i=0}^{t} 2^i (i+1)}{2^t}$$
Now Tp, with t levels, is split into l identical sub trees Tc, each having d levels, so

2^d * l = 2^t

Suppose l = 2^k. Then

2^d * 2^k = 2^t
d + k = t

where k means that there are 2^k buckets, each holding a binary tree. The A.C. of searching for an element over the whole set of l buckets is

$$A.C._b = \frac{l \cdot \text{Sum of TST at every level of a sub binary tree}}{\text{Total number of nodes in the whole set}} = \frac{l \sum_{i=0}^{d} 2^i (i+1)}{2^t}$$

where 2^t is the total number of nodes in all buckets, which equals the number of elements in the main binary tree because all buckets are extracted from it. The next step is to find the proper proportion between the total numbers of elements contained in Tp and Tc by comparing the two cases through the ratio of A.C.p to A.C.b.

Fig. 1. Ratio versus m. (See note b.)

b. Only values of m greater than or equal to 3 (m ≥ 3) are meaningful.
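The comparison of the main case with the split case can be checked numerically. The following sketch assumes t = m * k, d = (m - 1) * k and l = 2^k as defined in the analysis; the function names tst_sum and optimization_ratio are hypothetical. Because the summation over levels 0..n equals n * 2^(n+1) + 1, the ratio stays close to t / d = m / (m - 1), that is, about 1.5 for m = 3 and about 1.33 for m = 4.

```c
#include <assert.h>
#include <stdint.h>

/* Sum over levels 0..n of 2^i * (i + 1). */
uint64_t tst_sum(unsigned n)
{
    assert(n < 56);  /* keeps the running sum within 64 bits */
    uint64_t sum = 0;
    for (unsigned i = 0; i <= n; i++)
        sum += ((uint64_t)1 << i) * (i + 1);
    return sum;
}

/* Ratio A.C.p / A.C.b for t = m*k, d = (m-1)*k and l = 2^k.
   The common denominator 2^t cancels, as in the derivation. */
double optimization_ratio(unsigned m, unsigned k)
{
    assert(m >= 2 && k >= 1);
    unsigned t = m * k;
    unsigned d = (m - 1) * k;
    uint64_t l = (uint64_t)1 << k;
    return (double)tst_sum(t) / ((double)l * (double)tst_sum(d));
}
```

For m = 3 and k = 2 the sums are 769 for the main tree and 129 for each of the 4 sub trees, giving 769 / 516, slightly below 1.5; the gap shrinks as k grows.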
IV. PROGRAM IMPLEMENTATION
The current work is programmed in the C language using linked lists. The main binary tree is split into buckets, each holding a sub binary tree. If the number of nodes inserted into any bucket grows beyond a previously defined limit, that tree is split again; conversely, buckets that become too light are merged. If a tree is unbalanced, it is balanced using left rotation or right rotation. The balance of the tree is checked at each insertion of an element. To reduce the amount of checking, the check can instead be performed at certain intervals.
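One way to check balance only at certain intervals is sketched below. This is not the paper's code: CHECK_INTERVAL, the height-difference threshold of 1 (borrowed from the AVL convention), and the names height, skew, bst_insert and insert_and_check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

#define CHECK_INTERVAL 8  /* assumed: inspect balance every 8 insertions */

/* Number of levels in the tree rooted at n (0 for an empty tree). */
int height(const Node *n)
{
    if (n == NULL)
        return 0;
    int hl = height(n->left), hr = height(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Positive when the tree is left-heavy, negative when right-heavy. */
int skew(const Node *root)
{
    return root ? height(root->left) - height(root->right) : 0;
}

/* Plain BST insertion; duplicates are ignored. Returns the root. */
Node *bst_insert(Node *root, int key)
{
    if (root == NULL) {
        Node *n = malloc(sizeof *n);
        assert(n != NULL);
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key)
        root->left = bst_insert(root->left, key);
    else if (key > root->key)
        root->right = bst_insert(root->right, key);
    return root;
}

/* Insert a key; on every CHECK_INTERVAL-th insertion, report whether
   the tree needs rebalancing (1) or not (0). Other insertions return 0. */
int insert_and_check(Node **root, int key, unsigned *count)
{
    *root = bst_insert(*root, key);
    *count += 1;
    if (*count % CHECK_INTERVAL != 0)
        return 0;
    int s = skew(*root);
    return s > 1 || s < -1;
}
```

The caller would apply a left or right rotation when insert_and_check reports 1. A larger interval lowers the checking cost but lets the tree stay skewed for longer between checks.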
Fig. 2. Ratio versus m. (See note c.)

The binary tree is shown in Fig. 3. L is the rightmost node at the bottommost level in the left part of the tree, and L_up is the parent node of L. R is the leftmost node at the bottommost level in the right part of the tree, and R_up is the parent node of R.
C. Result
The numbers of nodes in Tp and Tc are 2^t and 2^d respectively. As before, t = m * k and d = (m - 1) * k. The number of buckets extracted from the main tree is l = 2^k. For m = 3, the Ratio is 1.5, which means that searching becomes 50% faster. This is shown in Table II.

TABLE II. RELATIVE NUMBER OF ELEMENTS IN MAIN AND SUB BINARY TREE WITH M = 3

| k | Number of Buckets (l = 2^k) | Number of nodes in Tp (2^t) | Number of nodes in Tc (2^d) |
|---|---|---|---|
| 2 | 4 | 2^6 | 2^4 |
| 4 | 16 | 2^12 | 2^8 |
| 6 | 64 | 2^18 | 2^12 |
| 8 | 256 | 2^24 | 2^16 |
| 10 | 1024 | 2^30 | 2^20 |
Fig. 3. Binary tree
While balancing a particular binary tree, the topmost node of the tree is passed as the parameter of the function shift_L() or shift_R() to rotate the tree.
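The bodies of shift_L() and shift_R() are not reproduced in the text. As an illustration only, the sketch below shows the standard left and right rotations of a binary search tree, which take the topmost node and return the new topmost node. The names rotate_left and rotate_right and the Node layout are hypothetical and may differ from the paper's functions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

/* Left rotation: the right child becomes the new topmost node. */
Node *rotate_left(Node *top)
{
    assert(top != NULL && top->right != NULL);
    Node *new_top = top->right;
    top->right = new_top->left;  /* middle subtree changes parent */
    new_top->left = top;
    return new_top;
}

/* Right rotation: the left child becomes the new topmost node. */
Node *rotate_right(Node *top)
{
    assert(top != NULL && top->left != NULL);
    Node *new_top = top->left;
    top->left = new_top->right;  /* middle subtree changes parent */
    new_top->right = top;
    return new_top;
}
```

Both rotations preserve the in-order sequence of keys, so the result is still a valid binary search tree; the caller is expected to store the returned pointer as the new root of the bucket.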
For m = 4, the Ratio is 1.3, which means that searching becomes 30% faster. This is shown in Table III.

TABLE III. RELATIVE NUMBER OF ELEMENTS IN MAIN AND SUB BINARY TREE WITH M = 4

| k | Number of Buckets (l = 2^k) | Number of nodes in Tp (2^t) | Number of nodes in each Tc (2^d) |
|---|---|---|---|
| 2 | 4 | 2^8 | 2^6 |
| 4 | 16 | 2^16 | 2^12 |
| 6 | 64 | 2^24 | 2^18 |
| 8 | 256 | 2^32 | 2^24 |
| 10 | 1024 | 2^40 | 2^30 |
Thus, if this proportion is maintained, the average searching time will be reduced.
c. Only values of m greater than or equal to 3 (m ≥ 3) are meaningful.
Fig. 5. Function Set 2 (Linked List using Language C)
Dhaval Kadia was born in Vadodara City in 1995. He is pursuing a Bachelor of Engineering in Computer Science & Engineering at the Faculty of Technology and Engineering, The Maharaja Sayajirao University of Baroda, Gujarat, India. He was a Research Intern at the Defence Research and Development Organisation (Robotics Research Center), Ministry of Defence (INDIA). His research interests include the design and analysis of algorithms, applications of image processing, computer vision, robotics, artificial intelligence, mathematical modelling, simulations, computer graphics, mixed reality, and the development of autonomous (self-driving) vehicles.
Fig. 4. Function Set 1 (Linked List using Language C)