Efficient Data Structures for Storing Partitions of ... - Semantic Scholar

Efficient Data Structures for Storing the Partitions of Integers

Rung-Bin Lin
Computer Science and Engineering, Yuan Ze University, Chung-Li, 320 Taiwan
[email protected]

Abstract

Algorithms for enumerating the partitions of a positive integer n have long been known. However, data structures for storing the partitions have not received due attention. In this paper, several data structures, ranging from the most intuitive to the most efficient, are proposed. The space and time complexity for creating the most efficient data structure is O(n^2). The space complexity is low enough to make it possible to store all the partitions of an integer up to several tens of thousands. This data structure can be used to enumerate the partitions of any integer smaller than or equal to n.

1 Introduction

Partitioning an integer n means dividing it into constituent parts, all of which are positive integers. Algorithms for enumerating all the partitions of an integer, or only the partitions that satisfy a restriction, have long been known [1,2]. However, data structures for storing the partitions have not received due attention. In this paper, we investigate four data structures for storing all the partitions of an integer. By a data structure we mean here a structure that can be used to enumerate the partitions without performing any arithmetic operation other than computing an index into an array. The enumeration can be done either with or without a restriction. The most efficient data structure need only store 0.75n^2 + 3n + 3 integers if n is even or 0.75n^2 + 3n + 2.25 integers if n is odd. It is created without exhaustively enumerating all the partitions. The time complexity for creating this data structure is the same as its space complexity. The complexity is low enough to make it possible to store all the partitions of an integer up to several tens of thousands.

The rest of this paper is organized as follows. In Section 2 we describe how the four data structures are created and derive the amount of memory and time needed to create each of these data structures. In Section 3 we discuss how the proposed data structure can be used to enumerate the partitions of an integer. In Section 4, we present some experimental data about the amount of memory needed for these data structures. The last section draws some conclusions.

2 Four Data Structures

Let the set of partitions of a positive integer n be denoted by Ω. Any element w ∈ Ω is denoted as w = 〈y_1, y_2, ..., y_k〉, where $\sum_{i=1}^{k} y_i = n$ for $y_i \in I$, $y_i > 0$, $i = 1, \ldots, k \le n$,

and y_i is called a part of partition w. The parts of a partition are not necessarily distinct, nor do they have a fixed order. However, to ease the representation, we assume y_1 ≤ y_2 ≤ ... ≤ y_k. Note that the value of k can vary from one partition to another. For example, the partitions of 6 are 〈1,1,1,1,1,1〉, 〈1,1,1,1,2〉, 〈1,1,1,3〉, 〈1,1,2,2〉, 〈1,1,4〉, 〈1,2,3〉, 〈1,5〉, 〈2,2,2〉, 〈2,4〉, 〈3,3〉, and 〈6〉. In the following we discuss how the partitions of an integer can be stored in a computer. Four data structures are investigated. They are called the direct linear, multiplicity linear, tree, and diagram structures. The first two store data in a linear array, so they are called linear structures.

2.1 Linear structures

Given the set of partitions of an integer n, say Ω = {w_1, w_2, ..., w_p}, the partitions can be stored in a one-dimensional array in the form |w_1| w_1 |w_2| w_2 ... |w_p| w_p, where |w_i| denotes the number of parts in w_i. For example, the partitions of 6 can be stored as 6,1,1,1,1,1,1,5,1,1,1,1,2,4,1,1,1,3,4,1,1,2,2,3,1,1,4,3,1,2,3,2,1,5,3,2,2,2,2,2,4,2,3,3,1,6. In total, 46 integers are needed to store all the partitions. This data structure is called direct linear in this paper. It can be created using an algorithm that enumerates the partitions in lexicographic order. Two partitions 〈x_1, x_2, ..., x_k〉 and 〈y_1, y_2, ..., y_l〉 are said to be in lexicographic order if there exists a d ≤ min(k, l)

such that x_i = y_i for i < d and x_d < y_d. Once the data structure is created, it can be used to enumerate the partitions one by one by first reading the number of parts in a partition and then retrieving the parts in sequence. This data structure is not amenable to enumeration with a restriction, such as enumerating only the partitions whose smallest part is larger than a given number.

A partition of an integer can also be represented by the distinct parts it contains and their multiplicities. For example, the partition 〈1,1,1,3〉 can be denoted by (1,3)(3,1), where (1,3) means that part 1 occurs three times in this partition and (3,1) means that part 3 occurs only once, i.e., the multiplicity of 1 is 3 and the multiplicity of 3 is 1. Thus, all the partitions of an integer can be stored in an array as |w'_1| (w'_1 : m_1) |w'_2| (w'_2 : m_2) ... |w'_p| (w'_p : m_p), where w'_i is the set of distinct parts in partition w_i and |w'_i| denotes the number of distinct parts; (w'_i : m_i) represents all the (part, multiplicity) pairs in partition w_i. For example, the partitions of 6 can be stored as 1,1,6,2,1,4,2,1,2,1,3,3,1,2,1,2,2,2,2,1,2,4,1,3,1,1,2,1,3,1,2,1,1,5,1,1,2,3,2,2,1,4,1,1,3,2,1,6,1. Here 49 integers are needed to store all the partitions of 6. This data structure is called multiplicity linear in this paper. Like direct linear, it is not amenable to enumeration with certain restrictions. The space and time complexity for creating and storing a linear data structure is proportional to the number of partitions of an integer.

2.2 Tree structure

Here we propose a tree structure to store all the partitions of an integer. The basic idea comes from the observation that two partitions of an integer may differ in only a few parts. For example, 〈1,1,1,1,1,1〉 and 〈1,1,1,1,2〉 differ in only two parts. In this situation, a sequence of branches in a tree can be used to store the parts common to the two partitions. For example, a tree that stores all the partitions of 6 is shown in Figure 1. Here, a tree node is denoted by (y,Y), where y is a part of a partition and Y is the number that remains to be divided into parts that are at least as large as y. For the root node, y is not a part; it simply denotes the smallest part into which Y may be divided. Before discussing how this tree is constructed, let us see how it can be used to enumerate all the partitions of 6. Starting from the root, we traverse the tree depth-first. When a leaf node is visited, we print out the parts encountered along the path from the root to the leaf. These parts, except the one stored in the root, form a partition of 6, and the path length is equal to the number of parts in the partition. The path length from the root to a leaf is defined as the number of edges traversed from the root to the leaf. For example, the path (1,6)(1,5)(1,4)(2,2)(2,0)

represents the partition 〈1,1,2,2〉, and the number of parts in this partition is 4. The root is denoted by (1,n) and a leaf is denoted by (y,0) for y ≤ n. A general partition tree of integer n is presented in Figure 2, where f(•) is the floor function. The pseudocode for creating such a tree is presented in Figure 3. Detailed explanations of the algorithm are given along with the proofs of the lemmas below. In the following we give some definitions and lemmas that are used to prove a theorem on the number of nodes in a partition tree.

Definition 1: A node without any child is called a leaf node and is denoted by (y,0).

Definition 2: A node with at least one child is called an internal node and is denoted by (y,Y) with Y > 0.

Figure 1. A partition tree for integer 6.
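The depth-first enumeration described above can be sketched in C as follows. This is only an illustration under a stated assumption: the children of a node (y,Y) are taken to be (j, Y − j) for j = y, ..., ⌊Y/2⌋ plus the single leaf (Y,0), which is what Figure 1 shows for n = 6. The names emit, walk, direct_linear, and MAX_N are hypothetical. The tree is walked implicitly rather than stored, and each root-to-leaf path is written out in the direct linear layout of Section 2.1, so the result for n = 6 can be compared with the 46-integer array given there.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_N 64  /* illustrative bound on n, not taken from the paper */

/* Append one partition (its number of parts, then the parts) to out. */
static size_t emit(int *out, size_t pos, const int *path, int len)
{
    out[pos++] = len;
    for (int i = 0; i < len; i++)
        out[pos++] = path[i];
    return pos;
}

/* Visit node (y, Y) depth-first. path[0..depth-1] holds the parts met on the
 * way down from the root (the root's own y is not a part). Returns the new
 * write position in out. Child rule is an assumption inferred from Figure 1. */
static size_t walk(int y, int Y, int *path, int depth, int *out, size_t pos)
{
    if (Y == 0)                          /* leaf: the path is a full partition */
        return emit(out, pos, path, depth);
    for (int j = y; j <= Y / 2; j++) {   /* internal children (j, Y - j) */
        path[depth] = j;
        pos = walk(j, Y - j, path, depth + 1, out, pos);
    }
    path[depth] = Y;                     /* the single leaf child (Y, 0) */
    return walk(Y, 0, path, depth + 1, out, pos);
}

/* Fill out with the direct linear layout of all partitions of n and return
 * the number of integers written. The caller must supply a large enough
 * buffer (46 integers for n = 6). */
static size_t direct_linear(int n, int *out)
{
    int path[MAX_N];
    assert(n >= 1 && n <= MAX_N);
    return walk(1, n, path, 0, out, 0);
}
```

Calling direct_linear(6, out) with a buffer of at least 46 integers visits the leaves in lexicographic order of the partitions, so the first entries are 6,1,1,1,1,1,1 and the last two are 1,6.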

Figure 2. A partition tree of integer n.

void integer_partition_tree (int n){
(1) create a node labeled with (1,n);
(2) put the node into a queue Q;
(3) while (Q is not empty) {
(4)     remove a node (y,Y) from Q;
(5)     for (j=y; j ...

... 〈y_1, y_2, ..., y_k〉 with y_1 ≤ y_2 ≤ ... ≤ y_k is a partition of integer n if and only if (y_0,Y_0)(y_1,Y_1)(y_2,Y_2)...(y_k,Y_k) is a path from the root to a leaf.

Proof: Suppose we have a path (y_0,Y_0)(y_1,Y_1)(y_2,Y_2)...(y_k,Y_k). By Lemma 2, we have y_1 ≤ y_2 ≤ ... ≤ y_k. Since (y_{i+1},Y_{i+1}) is a child node of (y_i,Y_i), we have Y_i = y_{i+1} + Y_{i+1} (see line 6 in Figure 3). Using this relation, and Y_k = 0 at the leaf, we can easily derive that

$n = Y_0 = y_1 + Y_1 = y_1 + y_2 + Y_2 = \cdots = \sum_{i=1}^{k} y_i$

and thus 〈y_1, y_2, ..., y_k〉 is a partition of n. Conversely, suppose 〈y_1, y_2, ..., y_k〉 with y_1 ≤ y_2 ≤ ... ≤ y_k is a

partition of n. Then we can find a transition from (y_0,Y_0) = (1,n) to (y_1, Y_0 − y_1), where y_1 ∈ {1, 2, 3, ..., ⌊n/2⌋, n}. In general, for any two consecutive parts y_i and y_{i+1} we can find a transition from (y_i,Y_i) to (y_{i+1},Y_{i+1}) = (y_{i+1}, Y_i − y_{i+1}) based on the tasks done in lines 4, 5, 6, 7, and 8 in Figure 3. Applying this repeatedly, we obtain a path (y_0,Y_0)(y_1,Y_1)(y_2,Y_2)...(y_k,Y_k). □

Lemma 4: The total number of partitions of an integer is equal to the number of leaf nodes in its partition tree.

Proof: This is an obvious consequence of Lemma 3. □

Theorem 1: The total number of nodes needed to store all the partitions of an integer is twice the number of leaf nodes in its partition tree.

Proof: It is sufficient to show that every leaf node has exactly one parent, which is an internal node, and that every internal node has exactly one child that is a leaf node. From the proof of Lemma 1, we know that an internal node has only one child that is a leaf node. Since the data structure obtained by the algorithm in Figure 3 is a tree, a leaf node has only one parent node, which is an internal node. As a consequence, the number of internal nodes is the same as the number of leaf nodes. Using Lemma 4, we complete the proof of this theorem. □

It is clear that the space and time complexity for creating and storing a partition tree is proportional to the number of partitions. In our implementation, each node in a partition tree contains the three fields shown below.

struct tree_node{
    int part;                // a part in a partition
    int num_of_children;     // the number of child nodes
    struct tree_node *next;  // points to the beginning of the child nodes
};

Here part stores a part. All the child nodes of an internal node are stored in an array in ascending order of their parts; the field num_of_children records the number of children an internal node has, and next is a pointer to the starting address of the array that holds the child nodes. num_of_children is used during enumeration to decide whether the end of the array has been reached. The number of children of an internal node (y,Y) can be computed as follows:
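A minimal sketch of such a count, together with a tree built from it, is given below. The rule used here is an assumption inferred from Figure 1 and from the set of first parts {1, 2, ..., ⌊n/2⌋, n} in the proof above, not a formula quoted from the paper: a node (y,Y) with Y > 0 is taken to have the children (j, Y − j) for j = y, ..., ⌊Y/2⌋ plus one leaf (Y,0). The helper names num_children, build, and destroy are hypothetical. The struct is the one shown above; since it does not store Y, the remainder is passed down as a parameter while building.

```c
#include <assert.h>
#include <stdlib.h>

struct tree_node {
    int part;                /* a part in a partition */
    int num_of_children;     /* the number of child nodes */
    struct tree_node *next;  /* points to the beginning of the child nodes */
};

/* Number of children of node (y, Y) under the assumed rule: one internal
 * child per j in y..floor(Y/2), plus the single leaf child (Y, 0). */
static int num_children(int y, int Y)
{
    if (Y == 0)
        return 0;                        /* leaf node */
    int half = Y / 2;
    return (y <= half ? half - y + 1 : 0) + 1;
}

/* Fill in node as (y, Y) and build its subtree, counting nodes and leaves. */
static void build(struct tree_node *node, int y, int Y,
                  long *nodes, long *leaves)
{
    node->part = y;
    node->num_of_children = num_children(y, Y);
    node->next = NULL;
    (*nodes)++;
    if (Y == 0) {
        (*leaves)++;
        return;
    }
    node->next = malloc((size_t)node->num_of_children * sizeof *node->next);
    assert(node->next != NULL);
    int c = 0;
    for (int j = y; j <= Y / 2; j++)     /* children in ascending order of part */
        build(&node->next[c++], j, Y - j, nodes, leaves);
    build(&node->next[c++], Y, 0, nodes, leaves);
}

/* Release the child arrays below node (the node itself is owned by the caller). */
static void destroy(struct tree_node *node)
{
    for (int c = 0; c < node->num_of_children; c++)
        destroy(&node->next[c]);
    free(node->next);
}
```

Building from a root (1,6) with this sketch gives 22 nodes, 11 of which are leaves, which agrees with Theorem 1 and with the 11 partitions of 6 listed in Section 2.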

Definition 3: A node (y,Y) that has the largest Y is called an anchored node in a partition diagram.

if (y ... 1, we immediately know that (y,Y) shares the data structure belonging to node (1,Y). However, we do not yet know where the sharing begins, i.e., which children of (1,Y) are also children of (y,Y). This information can be derived from (y,Y) quite easily. If y ≤ f(Y/2), data structure sharing starts from the y-th child of (1,Y); otherwise, it starts from the (f(Y/2)+1)-th child, i.e., the last child of (1,Y). For example, given a node (2,4), data structure sharing starts from the second child of (1,4) because 2 ≤ f(4/2), whereas given a node (2,3), data structure sharing starts from the second child, i.e., the last child, of (1,3) because 2 > f(3/2).
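The rule for locating the start of sharing can be written as a small helper. The following is a sketch only; the name share_start and the 1-based child index are choices made here for illustration, and f(Y/2) is realized by C integer division on a positive Y.

```c
#include <assert.h>

/* 1-based position, among the children of (1, Y), at which the children of
 * (y, Y) begin: the y-th child if y <= floor(Y/2), otherwise the last child,
 * which is child number floor(Y/2) + 1. */
static int share_start(int y, int Y)
{
    assert(y >= 1 && Y >= 1);
    int half = Y / 2;                /* f(Y/2) in the text */
    return (y <= half) ? y : half + 1;
}
```

With the two examples from the text, share_start(2, 4) and share_start(2, 3) both evaluate to 2: in the first case because 2 ≤ f(4/2), in the second because the last child of (1,3) is its second child.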


void integer_partition_tree (int n){ (1) for (i=1; i
