Minimal Mergesort 1 Introduction - University of Canterbury

Minimal Mergesort

Tadao TAKAOKA
Department of Computer Science, University of Canterbury
Christchurch, New Zealand
December, 1996

Abstract

We present a new adaptive sorting algorithm, called minimal mergesort, which merges the ascending runs in the input list from shorter to longer, that is, it merges the two shortest lists each time. We show that this algorithm is optimal with respect to a new measure of presortedness, called entropy.

Keywords: adaptive sort, minimal mergesort, ascending runs, entropy

1 Introduction

Adaptive sorting is to sort a list of numbers into increasing order as efficiently as possible by utilizing the structure of the list which reflects some presortedness. See Estivill-Castro and Wood [1] for a general survey on adaptive sorting. There are many measures of presortedness. The simplest one is the number of ascending runs in the list. Let the given list $X = (a_1, a_2, \ldots, a_n)$ be divided into ascending runs $X_i$ ($i = 1, \ldots, k$), that is, $X = (X_1, X_2, \ldots, X_k)$ where $X_i = (a^{(i)}_1, \ldots, a^{(i)}_{n_i})$ and $a^{(i)}_1$ is the $(|X_1| + \cdots + |X_{i-1}| + 1)$-th element in $X$. We denote the length of a list $X$ by $|X|$. Note that $a^{(i)}_1 \le \cdots \le a^{(i)}_{n_i}$ for each $i$, and $a^{(i)}_{n_i} > a^{(i+1)}_1$ if $X_i$ is not the last list.

The sort algorithm called natural merge sort [2] sorts $X$ by merging adjacent pairs of lists in each phase, halving the number of ascending runs after each phase, so that sorting is completed in $O(n \log k)$ time. Mannila [3] proved that this method is optimal under the measure of the number of ascending runs.

In this paper we generalize the measure $RUNS(X)$ of the number of ascending runs into that of the entropy of ascending runs in $X$, denoted by $H_{RUNS}(X)$ and sometimes by $H(X)$ for simplicity. Then we invent a sorting algorithm, called minimal mergesort, that sorts $X$ by merging two minimal length runs successively until we have the sorted list. We show that the time for this method is $O(nH(X))$ and is optimal under the measure of $H(X)$. Logarithms are taken with base 2 throughout the paper.
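As an illustration of the notions of ascending runs and natural merge sort described in the introduction, the following Python sketch may help. It is not taken from the paper: the function names `ascending_runs`, `merge` and `natural_mergesort` are hypothetical, and Python lists stand in for whatever list representation an implementation would actually use.

```python
def ascending_runs(xs):
    """Split xs into maximal non-decreasing runs X_1, ..., X_k."""
    runs = []
    for x in xs:
        if runs and runs[-1][-1] <= x:
            runs[-1].append(x)   # x extends the current run
        else:
            runs.append([x])     # a descent starts a new run
    return runs


def merge(a, b):
    """Standard two-way merge of two ascending lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def natural_mergesort(xs):
    """Merge adjacent runs pairwise; each phase roughly halves the run count."""
    runs = ascending_runs(xs)
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0] if runs else []


print(ascending_runs([3, 5, 8, 2, 4, 1, 9]))     # [[3, 5, 8], [2, 4], [1, 9]]
print(natural_mergesort([3, 5, 8, 2, 4, 1, 9]))  # [1, 2, 3, 4, 5, 8, 9]
```

With $k$ runs there are about $\log k$ phases, each touching all $n$ elements, which is where the $O(n \log k)$ bound comes from.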

2 Entropy of ascending runs

Let $n_i = |X_i|$ and $p_i = n_i/n$. Note that $\sum_{i=1}^{k} p_i = 1$. We define the entropy of ascending runs in $X$, $H_{RUNS}(X)$ or $H(X)$ for simplicity, by

$$H_{RUNS}(X) = -\sum_{i=1}^{k} p_i \log p_i.$$
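The definition can be evaluated directly from the run lengths. The sketch below is only illustrative; the name `run_entropy` is hypothetical, and the sum is written in the equivalent form $\sum_i p_i \log(1/p_i)$.

```python
from math import log2


def run_entropy(lengths):
    """Entropy H(X) of ascending runs, given the run lengths n_1, ..., n_k."""
    n = sum(lengths)
    # p_i = n_i / n; the sum of p_i * log2(1 / p_i) equals -sum of p_i * log2(p_i)
    return sum((m / n) * log2(n / m) for m in lengths)


print(run_entropy([4, 4, 4, 4]))  # 2.0, i.e. log2(k) for k = 4 equal runs
print(run_entropy([8]))           # 0.0 for an already sorted list
```

Equal run lengths give the largest value for a given number of runs, while a single run gives zero.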

Since $p_i$ ($i = 1, \ldots, k$) can be regarded as a probability measure, we have $0 \le H(X) \le \log k$, and the maximum is attained when $|X_i| = n/k$ ($i = 1, \ldots, k$).

Lemma 2.1 Any sorting algorithm takes at least $\Omega(nH(X))$ time when the entropy of ascending runs in $X$ is $H(X)$ and $|X_i| \ge 2$ for $i = 1, \ldots, k$.

Proof. Sorting $X$ into $X' = (a'_1, \ldots, a'_n)$ where $a'_1 \le \cdots \le a'_n$ means that $X'$ is a permutation of $X$. Let $X_i = (a^{(i)}_1, \ldots, a^{(i)}_{n_i})$. Let $a^{(i)}_1$ ($i = 1, \ldots, k$) occupy the first $k$ positions in $X'$. Then there are $\binom{n-k}{n_1-1}$ possibilities of $X_1$ being scattered in $X'$. Since we have a constraint of $a^{(1)}_{n_1} > a^{(2)}_1$, we put $a^{(2)}_1$ among the first $k$ positions. Then we have $\binom{n-k-n_1+1}{n_2-1}$ possibilities of $X_2$ being scattered in $X'$. Repeating this calculation yields the number of possibilities $N$ as

$$N = \frac{(n-k)!}{(n_1-1)!\,(n-k-n_1+1)!} \cdot \frac{(n-k-n_1+1)!}{(n_2-1)!\,(n-k-n_1-n_2+2)!} \cdots \frac{(n_k-1)!}{(n_k-1)!\,0!} = \frac{(n-k)!}{(n_1-1)! \cdots (n_k-1)!} = \frac{n!}{n_1! \cdots n_k!} \cdot \frac{n_1 \cdots n_k}{n(n-1)\cdots(n-k+1)}.$$

Since the number of possible permutations is not fewer than this, we have the lower bound $T$ on the computing time based on the binary decision tree model approximated by

$$T = \log N \approx n \log n - \sum_{i=1}^{k} n_i \log n_i + \sum_{i=1}^{k} \bigl(\log n_i - \log(n-i+1)\bigr) = \sum_{i=1}^{k} n_i \log\frac{n}{n_i} + \sum_{i=1}^{k} \bigl(\log n_i - \log(n-i+1)\bigr),$$

where we use Stirling's formula. The first term equals $nH(X)$. The second term is evaluated by

$$\sum_{i=1}^{k} \bigl(\log n_i - \log(n-i+1)\bigr) \ge -k \log n + (k-1) + \log(n-2k+2),$$

since $\sum_{i=1}^{k} \log n_i$ is minimum when $n_1 = \cdots = n_{k-1} = 2$ and $n_k = n - 2(k-1)$, and $\log(n-i+1) \le \log n$. Now noting that the first term is minimum with the same condition, we have

$$2T - nH(X) \ge \sum_{i=1}^{k} n_i \log\frac{n}{n_i} - 2k \log n + 2(k-1) + 2\log(n-2k+2)$$
$$\ge 2(k-1)\log\frac{n}{2} + (n-2k+2)\log\frac{n}{n-2k+2} - 2k \log n + 2(k-1) + 2\log(n-2k+2)$$
$$= (n-2k)\log\frac{n}{n-2k+2} \ge 0,$$

since $1 \le k \le n/2$. Thus we have

$$T \ge \frac{nH(X)}{2} = \Omega(nH(X)).$$
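The count $N$ used in the proof can be checked numerically for small run lengths. The following sketch is only an illustration under the assumption that $N = (n-k)!/\prod_i (n_i-1)!$, as the telescoping product suggests; the function names are hypothetical. It evaluates $N$ in both closed forms and prints $\log N$ next to $nH(X)$. For small $n$ the Stirling approximation is loose, so the two values should be compared only in order of magnitude.

```python
from math import factorial, log2, prod


def count_direct(lengths):
    """N = (n - k)! / ((n_1 - 1)! ... (n_k - 1)!)."""
    n, k = sum(lengths), len(lengths)
    return factorial(n - k) // prod(factorial(m - 1) for m in lengths)


def count_via_multinomial(lengths):
    """N = n! / (n_1! ... n_k!) * (n_1 ... n_k) / (n (n-1) ... (n-k+1))."""
    n, k = sum(lengths), len(lengths)
    multinomial = factorial(n) // prod(factorial(m) for m in lengths)
    falling = prod(range(n - k + 1, n + 1))  # n (n-1) ... (n-k+1)
    return multinomial * prod(lengths) // falling


def n_times_entropy(lengths):
    """n * H(X) = sum of n_i * log2(n / n_i)."""
    n = sum(lengths)
    return sum(m * log2(n / m) for m in lengths)


lengths = [2, 3, 5]
print(count_direct(lengths), count_via_multinomial(lengths))  # 105 105
print(log2(count_direct(lengths)), n_times_entropy(lengths))
```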

3 Minimal mergesort

All lists are maintained in linked list structures in this section. Let $X = (X_1, \ldots, X_k)$ be the given input list such that each $X_i$ is sorted in ascending order. Rearrange $X$ into $X' = (X_{i_1}, \ldots, X_{i_k})$ in such a way that $|X_{i_j}| \le |X_{i_{j+1}}|$ ($j = 1, \ldots, k-1$), that is, $(X_1, \ldots, X_k)$ is sorted with $|X_i|$ as key. We call this "meta-sort." Since each $|X_i|$ is an integer we can obtain $X'$ in $O(n)$ time by radix sort. Now we sort $X'$ by merging the two shortest lists repeatedly. Formally we have the following.

Let $L$ and $M$ be lists of lists, whereas $W_i$ ($i = 1, 2$) and $W$ are ordinary lists. By the operation $M \Leftarrow L$, the leftmost list in $L$ is moved to the rightmost part of $M$. By the operation $W_i \Leftarrow M$ ($i = 1, 2$) the leftmost list of $M$ is moved to $W_i$. By the operation $M \Leftarrow W$, $W$ is moved to the rightmost part of $M$. $first(L)$ is the first list in $L$.

Algorithm 3.1 (Minimal mergesort)
1 Meta-sort $X$ into $X'$ by length of $X_i$;
2 Let $L = X'$;
3 $M := \emptyset$;
4 $M \Leftarrow L$;
5 if $L \ne \emptyset$ then $M \Leftarrow L$;
6 for $i := 1$ to $k - 1$ do begin
7 $W_1 \Leftarrow M$;
8 $W_2 \Leftarrow M$;
9 $W := merge(W_1, W_2)$;
10 while $L \ne \emptyset$ and $|W| > |first(L)|$ do $M \Leftarrow L$;
11 $M \Leftarrow W$
12 end {$W$ is the sorted list}.
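The idea of repeatedly merging the two shortest lists can be sketched in Python as follows. This is not a line-by-line transcription of Algorithm 3.1. It assumes a two-queue arrangement (one queue of original runs ordered by length, one queue of merged lists in order of creation) in place of the $M$, $L$ bookkeeping, uses Python's built-in `sorted` where the text uses a linear-time radix sort, and uses Python lists rather than linked lists. All function names are hypothetical.

```python
from collections import deque


def ascending_runs(xs):
    """Split xs into maximal non-decreasing runs."""
    runs = []
    for x in xs:
        if runs and runs[-1][-1] <= x:
            runs[-1].append(x)
        else:
            runs.append([x])
    return runs


def merge(a, b):
    """Standard two-way merge of two ascending lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def minimal_mergesort(xs):
    """Sort xs by always merging the two currently shortest lists."""
    runs = ascending_runs(xs)
    if not runs:
        return []
    originals = deque(sorted(runs, key=len))  # meta-sort (assumption: not radix sort)
    merged = deque()                          # merged lists, lengths non-decreasing

    def pop_shortest():
        # The shortest remaining list is at the front of one of the two queues.
        if not merged or (originals and len(originals[0]) <= len(merged[0])):
            return originals.popleft()
        return merged.popleft()

    while len(originals) + len(merged) > 1:
        w1 = pop_shortest()
        w2 = pop_shortest()
        merged.append(merge(w1, w2))
    return originals[0] if originals else merged[0]


print(minimal_mergesort([3, 5, 8, 2, 4, 1, 9]))  # [1, 2, 3, 4, 5, 8, 9]
```

The two-queue form relies on the fact that merged lists are produced in non-decreasing order of length, so the front of each queue is always its shortest element.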

Theorem 3.1 The algorithm minimal mergesort sorts $X = (X_1, \ldots, X_k)$, where each $X_i$ is an ascending sequence, in $O(nH(X))$ time, where $H(X)$ is the entropy of $X$.

Proof. Consider the moment when $W_1$ and $W_2$ are merged at line 9. Note that $M$ and $L$ are meta-sorted throughout the computation. From Lemma 3.1 we have $|W_2| \le \frac{2}{3}|W|$ if $W_2$ is a merged list. Since $|W_1| \le |W_2|$, $W_1$ and $W_2$ go into a wider list at least $3/2$ times as large if they are not original $X_i$'s. Therefore each element in $X_i$ goes into wider lists at most $\lceil \log_{3/2}(n/|X_i|) \rceil + 1$ times. Since at most $|W_1| + |W_2| - 1$ comparisons are performed at each merge, we can charge one comparison to each element in $X_i$ whenever $X_i$ is in one of the merged lists. Thus, apart from an additive term linear in $n$ that comes from the ceiling and the $+1$, the total number of comparisons can be bounded by

$$\sum_{i=1}^{k} n_i \log_{3/2}\frac{n}{n_i} = O(nH(X)).$$
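The charging argument can be tried out on small inputs by counting comparisons and comparing the total with $\sum_i n_i (\lceil \log_{3/2}(n/n_i) \rceil + 1)$. The sketch below is illustrative only: it assumes the same two-queue arrangement as a stand-in for Algorithm 3.1, and the names `merge_counting`, `count_comparisons` and `charging_bound` are hypothetical.

```python
from collections import deque
from math import ceil, log


def ascending_runs(xs):
    """Split xs into maximal non-decreasing runs."""
    runs = []
    for x in xs:
        if runs and runs[-1][-1] <= x:
            runs[-1].append(x)
        else:
            runs.append([x])
    return runs


def merge_counting(a, b):
    """Two-way merge that also returns the number of key comparisons."""
    out, i, j, comparisons = [], 0, 0, 0
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out, comparisons


def count_comparisons(xs):
    """Return (total comparisons, run lengths) for shortest-two merging."""
    runs = ascending_runs(xs)
    originals = deque(sorted(runs, key=len))
    merged = deque()
    total = 0

    def pop_shortest():
        if not merged or (originals and len(originals[0]) <= len(merged[0])):
            return originals.popleft()
        return merged.popleft()

    while len(originals) + len(merged) > 1:
        w, c = merge_counting(pop_shortest(), pop_shortest())
        total += c
        merged.append(w)
    return total, [len(r) for r in runs]


def charging_bound(lengths):
    """Sum over runs of n_i * (ceil(log_{3/2}(n / n_i)) + 1)."""
    n = sum(lengths)
    return sum(m * (ceil(log(n / m, 1.5)) + 1) for m in lengths)


total, lengths = count_comparisons([3, 5, 8, 2, 4, 1, 9])
print(total, charging_bound(lengths))  # 9 32
```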

Lemma 3.1 If $W_2$ is a merged list at line 9, it holds that $|W_2| \le \frac{2}{3}|W|$.

Proof. Suppose to the contrary that $|W_2| > 2|W_1|$, which is equivalent to $|W_2| > \frac{2}{3}|W|$ since $|W| = |W_1| + |W_2|$. Then for the previously merged lists $V_1$ and $V_2$, that is, $W_2 = merge(V_1, V_2)$, we have $|V_1| > |W_1|$ or $|V_2| > |W_1|$. Thus $V_2$ or $V_1$ must have been merged with $W_1$ or a shorter list, a contradiction.

Example. Let $|X_1| = 2$ and $|X_i| = 2^i$ ($i = 2, \ldots, k-1$), so that $n = 2^k - 2$. Then minimal mergesort sorts $X$ in $O(n)$ time, since $H(X)$ is bounded by a constant, whereas natural mergesort takes $O(n \log\log n)$ time to sort $X$.
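The behaviour claimed in the example can be observed numerically. The sketch below assumes run lengths $2, 4, 8, \ldots, 2^{k-1}$, which is one reading of the example, and uses the hypothetical names `run_entropy` and `doubling_lengths`. It prints the entropy next to $\log$ of the number of runs, the factor that governs natural merge sort.

```python
from math import log2


def run_entropy(lengths):
    """H(X) = sum of p_i * log2(1 / p_i) with p_i = n_i / n."""
    n = sum(lengths)
    return sum((m / n) * log2(n / m) for m in lengths)


def doubling_lengths(k):
    """Run lengths 2, 4, ..., 2^(k-1) (assumed reading of the example)."""
    return [2 ** i for i in range(1, k)]


for k in (4, 8, 16, 24):
    lengths = doubling_lengths(k)
    # The entropy stays below 2 while log2 of the run count keeps growing.
    print(k, round(run_entropy(lengths), 4), round(log2(len(lengths)), 4))
```

Under this assumption the entropy approaches 2 from below as $k$ grows, so $nH(X)$ stays linear in $n$.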

4 Concluding remarks

When we scan $X$, we can identify ascending runs and descending runs alternatingly. By reversing descending runs, we can satisfy the condition of $|X_i| \ge 2$ in Lemma 2.1. This modified version of Algorithm 3.1 with this prescanning is thus optimal with respect to the entropy measure. We showed that the entropy measure covers the measure of $RUNS$. It remains to be seen whether the entropy measure can cover other measures of presortedness.
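The prescanning step can be sketched as follows. This is an assumption about how the scan might be organised, not the paper's own procedure: descending runs are taken to be strictly descending, and the name `runs_with_reversal` is hypothetical. In this sketch every run except possibly the final one has length at least 2.

```python
def runs_with_reversal(xs):
    """Split xs into runs, reversing each strictly descending run in place."""
    runs, i, n = [], 0, len(xs)
    while i < n:
        j = i + 1
        if j < n and xs[j] < xs[i]:
            # Strictly descending run: extend it, then store it reversed.
            while j < n and xs[j] < xs[j - 1]:
                j += 1
            runs.append(xs[i:j][::-1])
        else:
            # Non-decreasing run (also covers a single trailing element).
            while j < n and xs[j] >= xs[j - 1]:
                j += 1
            runs.append(xs[i:j])
        i = j
    return runs


print(runs_with_reversal([5, 4, 3, 1, 2, 6, 9, 8]))
# [[1, 3, 4, 5], [2, 6, 9], [8]]
```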

References

[1] V. Estivill-Castro and D. Wood, A survey of adaptive sorting algorithms, ACM Computing Surveys 24 (1992), 441-476.
[2] D. E. Knuth, "The Art of Computer Programming, Vol. 3, Sorting and Searching," Addison-Wesley, Reading, Mass., 1974.
[3] H. Mannila, Measures of presortedness and optimal sorting algorithms, IEEE Trans. Comput. C-34 (1985), 318-325.
