Parallelized QuickSort with Optimal Speedup 1

Recommend Documents

large-scale data, the development of faster processing algorithms with ... Ensemble classifiers are a powerful set of learners in data ... Large data sets, that can not fit into ...... [23] G. Wu, H. Li, X. Hu, Y. Bi, J. Zhang, and X. Wu, âMrec4.

Supercritical Speedup

calculated from the real time it takes to exe cute each event on a particular ... We show. (again) that no conservative mechanism can beat the critical path bound, but we also show that at least four .... appear that if e

Implementing Quicksort Programs

Robert Sedgewick. Brown University. This paper is a practical study of how to implement the Quicksort sorting algorithm and its best variants on real computers ...

Parallel Quicksort Algorithm using OpenMP

Moby Dick.txt 1.18 MB) with different number of threads. The fundamental idea of the proposed algorithm is to creating many additional temporary sub-arrays ...

Scheduling Parallel Jobs with Monotone Speedup - CiteSeerX

Consider a scheduling problem where n jobs j â V , with integral processing times Â¯pj, ... of the machines, and if a job is scheduled on machine i, using s of the k ...

Parallelized STED fluorescence nanoscopy

Nov 7, 2011 - focused excitation beam with a doughnut-shaped STED beam. Although both beams are limited by diffraction, resolution beyond the diffraction ...

THE PARALLELIZED POLLARD KANGAROO

Oct 4, 2001 - We show how to use the parallelized kangaroo method for com- ... generalized to solve the discrete logarithm problem in any finite ... baby steps and giant steps that make use of the ideal arithmetic (see .... same paths once they have

1 with optimal autocorrelation using - YONSEI University

Jong-Seon No, Member, IEEE, Habong Chung, Member, IEEE,. Hong-Yeop ... For p = 2, No, Chung, and Yun [10] studied char- ...... Chicago, IL: Markham, 1967.

OPTIMAL CONTROL PROBLEMS WITH MIXED CONSTRAINTS 1

in which the state and control are subject to mixed constraints. We unify, subsume ... 1. Introduction. We study in this article an optimal control problem with stan-.

BlockQuicksort: How Branch Mispredictions don't affect Quicksort

Jun 23, 2016 - several arrays have been sorted and the total elapsed time measured. .... the same Python script as in [11, 16], which first measures the number ...

OPTIMAL CONTROL PROBLEMS WITH MIXED CONSTRAINTS 1

Key words. optimal control, necessary conditions, mixed constraints, nonsmooth analysis ... can be viewed as the basic building block of the proximal theory of ...

Quicksort with median of medians is considered practical

Aug 17, 2016 - The linear pivot selection algorithm, known as median-of-medians, makes the worst case complexity of quicksort be O(nlnn). Nevertheless ...

1 a preliminary dsm speedup comparison: jiajia x ...

Nautilus has an excellent speedup. This happens because of the its data distribution improving the data locality: multiplicand and result matrices are local giving ...

1 Orders-of-magnitude speedup in atmospheric chemistry

aUniversity of Washington Department of Civil and Environmental Engineering, Box ... the physical sciences, computationally intensive numerical solutions to ...

The throughput of data switches with and without speedup - CiteSeerX

Abstractâ In this paper we use fluid model techniques to establish two results concerning the throughput of data switches. For an input-queued switch (with no ...

Quantum Speedup by Quantum Annealing

Feb 28, 2012 - (Dated: February 29, 2012). We study the glued-trees problem of Childs, et. al. [1] in the adiabatic model of quantum com- puting and provide ...

Filtered overlap: speedup, locality, kernel non-normality and Z_A~ 1

Dirac Clifford algebra with links restricted to the hypercube. ..... [11] S. DÃ¼rr, C. Hoelbling and U. Wenger, Phys. Rev. D 70 ... [18] S. DÃ¼rr, hep-lat/0409141.

PARALLELIZED PRECONDITIONED BICGSTAB SOLUTION OF ...

The latter is a measure for how fast any iterative ..... easiest one to parallelize since no special loop structure connected with any dedicated LDP- ... run time reduction measurements were pursued on a dual processor 12-core computer server.

Parallelized Three-Dimensional Unstructured Euler

A cell-centered nite volume scheme is used. The temporal discretization involves an implicit time-integration scheme based on backward-Euler time differencing.

Parallelized Network Security Protocols - CiteSeerX

distribute a private key, and some form of digital signature algorithm to authenticate the parties, such as RSA [32]. Th

A Permanent Algorithm With (exp ]) Expected Speedup for ... - CiteSeerX

Dec 12, 1997 - Eric Bax and Joel Frankliny ... ifornia, 91125 ([email protected]). ...... 9] N. Karmarkar, R. Karp, R. Lipton, L. Lov asz, and M. Luby, A Monte.

Parallelized Collision Avoidance Architectures for

above follow Arithmetic Logic Unit (ALU) based architecture for arithmetic and logical operations which involve a lot of instructions and hence clock cycles [3].

Parallelized Network Security Protocols - CiteSeerX

affects the achievable throughput of an application. Fur- thermore, there are .... also the default mes- sage digest alg

Optimization of Lighting Systems with the use of the Parallelized ...

Abstract. The paper presents possible parallelization of the optimization process of complex lighting systems with the use of the genetic algorithm. The features ...

Parallelized QuickSort with Optimal Speedup 1

Download PDF

4 downloads 0 Views 47KB Size Report

Comment

processor, time complexity of O(log n) exhibited using a CRCW PRAM model. .... Associated with each pivot is its immediate parent , which partition of the parent it is in, .... communication cost between processors, or processors and memory.

Parallelized QuickSort with Optimal Speedup David M. W. Powers1 Fachbereich Informatik Universität Kaiserslautern 6750 KAISERSLAUTERN WEST GERMANY

1

Overview This paper introduces a parallel sorting algorithm based on QuickSort and having an n-input, n-

processor, time complexity of O(log n) exhibited using a CRCW PRAM model. Although existing algorithms of similar complexity are known, this approach leads to a family of algorithms with a considerably lower constant. It is also significant in its close relationship to a standard sequential algorithm.

1.1 Sorting Knuth (1973,pp2-3) notes that sorting is estimated to take up 25% of the world’s computer time. With the advent of the microcomputer this may well have changed, but it is nonetheless a both practically and theoretically interesting task. Sorting, in the sense of bringing together related things, has now been subsumed by.the more specific task of ordering, and has spawned an enormous number of serial sorting algorithms. Whilst for specific cases, faster algorithms are known, in general sorting requires O(n log n) comparisons, and hence time. The algorithms meeting this expected complexity are based on ideas of either partitioning or merging, usually with the aid of explicit or implicit list and/or tree data structures.

These are prototypically represented by the

algorithms QuickSort and MergeSort (Knuth,1973). Logically these algorithms require two phases: placement into the tree, and extraction from the tree. In some cases, one or other of these phases can be left implicit. As the tree has logarithmic depth, and each element needs to be placed and/or extracted, the O(n log n) complexity follows immediately.

1The

work reported was undertaken while the author was at Macquarie University NSW 2009,

AUSTRALIA, and was supported by IMPACT Ltd, PETERSHAM NSW 2049, AUSTRALIA. The author is currently supported under ESPRIT BRA 3012: COMPULOG.

-2-

1.2 Parallel Sorting Algorithms In view of this sequential result, we would hope for an optimal linear speedup to O(log n) when we move to parallel processing on n processors. In practise, however, this is not easily achieved. The infamous AKS result (Ajtai et al, 1983), which was the first to demonstrate the theoretical feasibility of O(log n) time on O(n) processing elements, is based on a sorting network entailing a fixed sequence of comparisons - but with a rather impractical constant of 6100! This is for achievable datasets (