Fast Algorithms For The Calculation Of Kendall's Tau

David Christensen

EMB, Link House, Great Shelford, Cambridge, England. CB2 5LT

Summary

Traditional algorithms for the calculation of Kendall’s Tau between two datasets of n samples have a calculation time of O(n²). This paper presents a suite of algorithms with an expected calculation time of O(n log n) or better, using a combination of sorting and balanced tree data structures. The literature, e.g. (Dwork et al., 2001), has alluded to the existence of O(n log n) algorithms without any analysis: this paper gives explicit descriptions of such algorithms for general use, both for data with and without duplicate values. Execution times for sample data are reduced from 3.8 hours to around 1–2 seconds for one million data pairs.

Keywords: Kendall’s Tau, Algorithm, O(n log n)

1 Introduction

Kendall’s Tau is a non-parametric estimator of correlation between two sets of random variables, X and Y, defined as

$$\tau(X, Y) \equiv P\{(X - X^*)(Y - Y^*) > 0\} - P\{(X - X^*)(Y - Y^*) < 0\} \qquad (1)$$

where (X*,Y*) is an independent copy of (X,Y). The traditional algorithm (Press, Flannery, Teukolsky, Vetterling, 1993, and other sources) to calculate this considers all pairings of X, Y and classifies each of them as one of:

Concordant (c): $(X_i - X_j)(Y_i - Y_j) > 0$

Discordant (d): $(X_i - X_j)(Y_i - Y_j) < 0$

ExtraX ($e_X$): $X_i \ne X_j,\; Y_i = Y_j$

ExtraY ($e_Y$): $X_i = X_j,\; Y_i \ne Y_j$

Spare: $X_i = X_j,\; Y_i = Y_j$

The value of τ is then calculated from

$$\tau = \frac{c - d}{\sqrt{(c + d + e_X)(c + d + e_Y)}} \qquad (2)$$

When each dataset is of size n, the calculation requires the comparison of $^{n}C_{2} = n(n-1)/2$ pairs of points, and as such is an O(n²) process. For sample sizes in excess of a few thousand, this rapidly becomes an unattractive proposition.
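For reference, the traditional pair-by-pair count can be sketched as follows. This is a minimal illustration in Python, not code from the paper; the function name and argument layout are the author of this sketch's own, but the classification and equation (2) follow the definitions above.

```python
from math import sqrt

def kendall_tau_naive(xs, ys):
    """Traditional O(n^2) Kendall's Tau: classify every pair of points as
    concordant, discordant, ExtraX, ExtraY or Spare, then apply equation (2)."""
    n = len(xs)
    c = d = ex = ey = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx * dy > 0:
                c += 1        # concordant
            elif dx * dy < 0:
                d += 1        # discordant
            elif dx != 0:
                ex += 1       # ExtraX: tied in Y only
            elif dy != 0:
                ey += 1       # ExtraY: tied in X only
            # else: Spare (tied in both X and Y), not counted
    return (c - d) / sqrt((c + d + ex) * (c + d + ey))
```

The double loop performs the $n(n-1)/2$ comparisons quoted above, roughly $5 \times 10^{11}$ of them for one million data pairs, which is the source of the multi-hour running times discussed in the next paragraph.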

Recently, however, Kendall’s τ has been gaining use with large datasets. Areas of application include the analysis of results from different Internet search engines (Dwork, Kumar, Naor, Sivakumar 2001) and, in the author’s own area of work, the manipulation of stochastic data using elliptical distributions and their copulas, as also addressed by (Lindskog, McNeil, Schmock 2001). When considering stochastic simulations with a million datapoints, the standard O(n²) algorithm takes hours on even a very fast PC.

The algorithms described in this paper rely on the following pair of observations (the first is checked numerically after the list):

• the value of τ is unaltered if both datasets are reordered using the same set of transpositions; and

• the calculation of τ becomes considerably simpler if one of the datasets is known to be in ascending order.
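The first observation can be checked numerically. The snippet below is a quick sanity check only; it assumes NumPy and SciPy are available and uses scipy.stats.kendalltau purely as a reference implementation, not as one of the algorithms of this paper.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)        # correlated sample data

perm = rng.permutation(len(x))             # one common set of transpositions
tau_before, _ = kendalltau(x, y)
tau_after, _ = kendalltau(x[perm], y[perm])

assert np.isclose(tau_before, tau_after)   # same reordering of both => same tau
```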

A further consideration is that many datasets may be known in advance to lack duplicates – for example sets of rankings, or samples from continuous distributions. We consider the simpler case of known uniqueness (Xs unique and, independently, Ys unique) in the data samples first.

2 Algorithm NDTau for no-duplicate datasets

If there are no duplicates, then the pairs divide neatly into concordant and discordant, so (2) simplifies to (c − d)/(c + d). However, since c + d = n(n − 1)/2, we can substitute d = n(n − 1)/2 − c and simplify further, giving:

$$\tau = \frac{4c}{n(n-1)} - 1 \qquad (3)$$

ND1) Sort the pairs $(X_i, Y_i)$ such that if $i < j$ then $X_i < X_j$.

ND2)

ND3)

ND4)
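The remaining steps of NDTau are not reproduced above. As a rough illustration of the overall idea the summary describes (sort on one variable, then use a tree structure to count pairs in O(n log n)), the following sketch sorts the pairs so X is ascending and counts discordant pairs as inversions in the resulting Y sequence. It substitutes a Fenwick (binary indexed) tree over the ranks of Y for the paper's balanced tree; both achieve the O(n log n) bound. All names and details are illustrative, not taken from the paper.

```python
def kendall_tau_nodup(xs, ys):
    """O(n log n) Kendall's Tau for data with no duplicate X or Y values."""
    n = len(xs)
    # Sort pairs so that X is ascending; keep the matching Y order (step ND1).
    ys_sorted = [y for _, y in sorted(zip(xs, ys))]
    # Replace Y values by their ranks 1..n so they can index the tree.
    rank = {y: r for r, y in enumerate(sorted(ys_sorted), start=1)}

    tree = [0] * (n + 1)              # Fenwick tree of counts of seen Y ranks

    def add(i):                       # record one Y rank
        while i <= n:
            tree[i] += 1
            i += i & (-i)

    def count_le(i):                  # how many recorded ranks are <= i
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & (-i)
        return s

    d = 0                             # discordant pairs = inversions in Y
    for seen, y in enumerate(ys_sorted):
        r = rank[y]
        d += seen - count_le(r)       # earlier Ys larger than this one
        add(r)

    c = n * (n - 1) // 2 - d          # no ties, so every pair is c or d
    return 4.0 * c / (n * (n - 1)) - 1.0  # equation (3)
```

Before trusting such an implementation on million-point datasets, it is easy to cross-check it against the O(n²) count on small random samples, where the two must agree exactly.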
