Naive Algorithm and Randomly Generated Sets

Sathaporn Hu (998866516)
November 16, 2012

Abstract

The Naive Algorithm's premise is simple: to decide whether a set satisfies a property, one picks a few elements from the set and compares them against the property. If those elements satisfy the property, a conclusion about the whole set is drawn regardless of the remaining elements. Although the algorithm should be extremely inaccurate because of its shallow nature, it may be useful for something else. In this case, the author would like to propose a conjecture: the Naive Algorithm, if extended, may be used to detect a randomly generated set.

Abbreviations

Since some terms become long when repeated throughout the article, the following abbreviations are used. The meaning of each term is explained later on.

• Naive Algorithm: NA
• Naive Algorithm with conviction value x: NAc=x
• Probability of correctness: P(C)
• Probability of correctness of the Naive Algorithm with conviction value x: P(C of NAc=x)
• Conviction value: c-value or c

Introduction

NA is an algorithm that checks whether a set satisfies a desired property by pulling out only a few elements from the set and comparing them. While this is extremely fast, it can also be extremely inaccurate. The aim of this project is to measure the accuracy of NA, to see whether the speed is worth the sacrificed accuracy, and to see whether the algorithm has its own special application. As it turns out, the algorithm may be useful for detecting a randomly generated set. Before getting to that point, however, it is necessary to perform some accuracy tests, because the conjecture that NA can detect a randomly generated set relies on the results of those tests.

Given a finite set, create an algorithm that checks whether the elements of the set are sorted in ascending order. An obvious way to do this is to look at the first element and then at the next one; if the next element is less than the current one, the set is clearly not in ascending order. As a procedural list of steps on the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}:

1. Pick the first element: 1.
2. Pick the next element and compare it with the first one: 2.
3. In this case, the second element is not less than the first, so continue.
4. Repeat the steps above with the next pair of elements until the end of the set.

For this set, the set is obviously sorted.
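The pairwise check above can be written directly in R. This is a minimal sketch for illustration (the function name is hypothetical; the report's own implementation appears in Appendix I):

# A minimal sketch of the pairwise check described above.
first_algorithm = function(S) {
  for (i in 1:(length(S) - 1)) {
    # If any element is greater than its successor, the set is not ascending.
    if (S[i] > S[i + 1]) {
      return(FALSE)
    }
  }
  return(TRUE)
}

first_algorithm(1:10)                               # TRUE
first_algorithm(c(1, 5, 3, 4, 5, 6, 7, 13, 9, 10))  # FALSE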


It is evident that this method is extremely accurate at determining whether a set is sorted or not. However, the algorithm is quite slow: its worst-case running time is O(n). The author calls this algorithm "the First Algorithm," because he could not find a name for it in a scholarly source. The author also conjectures that, when confronted with a problem like this, a person will select this algorithm over the other ones.

To improve the running time, a divide-and-conquer strategy can be introduced. [Bently et al.] The name 'divide-and-conquer' is very straightforward: one simply splits the set into partitions and analyzes each partition before merging the results together. [Bently et al.] The list below demonstrates the divide-and-conquer algorithm as procedures, again on {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}:

1. Split the set into the smallest possible parts: {1, 2}, {3, 4}, {5, 6}, {7, 8}, {9, 10}.
2. Analyze each part: {True}, {True}, {True}, {True}, {True}.
3. Combine the results from each part; in this case the AND operation is used: True · True · True · True · True.
4. From this, it is known that the set is sorted.

The divide-and-conquer strategy's running time is O(n log n) [Bently et al.], which is significantly faster than the previous method. Still, there can be a faster algorithm, and that algorithm is NA. To use NA, one simply picks a small number of elements from the set, compares them against the property, and then concludes. The other, unchecked elements are ignored even if they do not follow the given property. The procedures are as follows:

1. Pick a few random elements from {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; in this case, 3 elements: 1, 3, and 9.
2. Since 1 ≤ 3 ≤ 9, it is concluded that the list is sorted in ascending order.

The running time of this algorithm is O(1); however, this complexity can change depending on how much conviction one has. If one does not have much conviction, he or she will pick more elements, and if the conviction is low enough, he or she may pick every single element in the set, at which point NA simply collapses into the First Algorithm. Since conviction and the number of picked elements are related, it is a good idea to create a formula relating them: ∀n ∈ N, c = 1/n, where n is the number of picked elements and c is the conviction value (abbreviated as c-value). As one can see, the smaller the conviction value, the more elements are picked. This also implies that c ≤ 1/2, because the minimum number of picked elements is 2: if c > 0.5, fewer than 2 elements would be picked and the algorithm would not work. In other words, c > 0.5 means there is too much conviction for any check with NA. On the other hand, if the c-value is too low, the algorithm might end up analyzing the whole set; when that happens, NA collapses into the First Algorithm. A sketch of the two remaining strategies is given below, followed by a proof of this collapse.
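The following is a minimal illustrative sketch, not necessarily the implementation used for the experiments: the function names are hypothetical, the c-value is assumed to convert into a sample size via n = round(1/c), and the boundary between the two halves is checked explicitly (a detail the procedural list above glosses over).

divide_and_conquer = function(S) {
  # Base case: a part with at most two elements is checked directly.
  if (length(S) <= 2) {
    return(all(diff(S) >= 0))
  }
  mid = floor(length(S) / 2)
  # Check both halves and the boundary between them, then AND the results.
  return(divide_and_conquer(S[1:mid]) &
         divide_and_conquer(S[(mid + 1):length(S)]) &
         (S[mid] <= S[mid + 1]))
}

naive_algorithm = function(S, c = 0.5) {
  # Convert the conviction value into the number of elements to pick.
  n = round(1 / c)
  # Pick n random positions, kept in ascending order of position.
  picked = sort(sample(length(S), n))
  # Conclude from the picked elements alone.
  return(all(diff(S[picked]) >= 0))
}

divide_and_conquer(1:10)        # TRUE
naive_algorithm(1:10, c = 0.5)  # TRUE (picks only 2 elements)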
Proof #1: For any NA N that takes a finite countable set S, N becomes the First Algorithm if all elements of S are picked.
If |S| = 3, then the possible number of picked elements is 2 or 3. If 3 is chosen, then every element is picked and two checks are performed: the first check is S1 ≤ S2 and the second check is S2 ≤ S3. This is exactly the First Algorithm, so NA has become the First Algorithm for |S| = 3. The inductive hypothesis is that for any N whose number of picked elements n equals the size of its input set S, N is the First Algorithm. If NA's input set size and number of picked elements are n + 1, then let m = n + 1, so N's input set size and number of picked elements are now m. By the inductive hypothesis, NA has become the First Algorithm for this input size. □

From this proof, it can be said that c ∈ (1/|S|, 1/2]. c must also be greater than 0, because if c = 0, then there is not enough conviction that the algorithm will work and it will be abandoned. The key to keeping the Naive Algorithm's running time at O(1) is to have a relatively firm conviction that the algorithm will work, that is, to pick a larger conviction value.

Although the Naive Algorithm may have only a shaky advantage over the First Algorithm and the divide-and-conquer strategy in terms of running-time complexity, it does have a definitive edge when performed on a set of infinite size. Unless implemented on a Zeno Machine (fundamentally, a machine that can perform infinitely many steps in a finite time [Potgieter]), the first two algorithms would take an eternity to complete. [Potgieter] And it seems that no one will be able to acquire a Zeno Machine at any time in the near future. [Potgieter] So there is only one choice: the Naive Algorithm, which naively ignores an infinite number of elements. To demonstrate this, the following proofs show why the First Algorithm and the divide-and-conquer strategy would need a Zeno Machine:


Proof #2: For any function f that takes a countable set S as input and computes on its elements, f does not require a Zeno Machine if and only if there exists a positive integer b such that |S| < b.
Assume f requires a Zeno Machine; then f must perform an infinite number of steps. Since f computes on its input, S, the set of valid inputs, must be of infinite size. Therefore no b can be greater than |S|, because |S| = ∞, and hence there is no b with |S| < b. Conversely, assume |S| < b; then |S| is bounded by a positive integer. Hence |S| is finite, and therefore f does not need a Zeno Machine in order to be implemented. □

Proof #3: For any function f(S) that computes on its elements, where S = [a, b] with a, b ∈ R, [a, b] is continuous if and only if f needs a Zeno Machine.
If S is continuous, there are infinitely many numbers between a and b. Then there are infinitely many inputs for f, and f requires a Zeno Machine. Conversely, if f requires a Zeno Machine, then its input size must be infinite. If S lies within [a, b], the size of [a, b] must be infinite, or f would not need a Zeno Machine; to have infinite size while a, b ∈ R, [a, b] must be continuous. □

Proof #4: Any function f whose input is R itself requires a Zeno Machine to compute.
Since R = (−∞, ∞), |R| = ∞, so there does not exist a number that can bound the set. By Proof #2, f needs a Zeno Machine. □

By the three proofs above, both algorithms need a Zeno Machine to work on such inputs and are hence impractical there. Now one must check that NA does not, in fact, also require a Zeno Machine. This is not entirely simple, because by Proof #1 any NA can be converted into the First Algorithm if enough elements are picked. Nevertheless, NA does not require a Zeno Machine, because by default it is incapable of analysing an infinite input:

Proof #5: For any NA N(S), if |S| is infinite, there is no c > 0, c ∈ R, that can cause N to collapse into the First Algorithm, and hence N will never need a Zeno Machine.
In order for N to become the First Algorithm, N must pick enough elements. Since |S| = ∞, N must take ∞ inputs to become the First Algorithm. Then c = 1/∞ = 0 and the algorithm is simply abandoned. The consequence is that N must be run with finite resources, which implies that N does not need a Zeno Machine. □

Combining the results of all the proofs given, it can be said that of the three algorithms, only NA can work practically on a set with infinitely many elements. Despite NA's perks, many people may still be skeptical about the algorithm, and they should be: the algorithm is extremely unreliable. For the graphical steps outlined above, the algorithm will fail if one alters the other numbers and leaves only 1, 3, and 9 alone. On the other hand, NA is also extremely nimble. But is NA's agility really worth the sacrificed accuracy? The tests will find out.

Methodology

In order to test the algorithms, R version 2.15.1 is used. In each test, each algorithm is run 1000 times on a set. Each time, the algorithm must decide whether the set is arranged in ascending order or not. If it answers correctly, 1 is added to a temporary variable that stores the number of correct checks. The probability of correctness (P(C)) is then calculated with this formula: P(C) = (number of correct checks) / 1000.

The tests are separated into two main sections: Experiments with Finite Sets and Experiments with Infinite Sets. The separation comes from the proofs above, which have demonstrated that only the Naive Algorithm will work with infinite sets. In Experiments with Finite Sets, NAc=1/2 and NAc=1/4 are pitted against the First Algorithm and the divide-and-conquer strategy. A set with 10 elements is used for each test, and it comes in a variety of arrangements.
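Below is a minimal sketch of this test harness, assuming the hypothetical naive_algorithm helper sketched earlier and a known correct answer for the set; the report's actual test code is given in Appendix I.

estimate_p_correct = function(S, correct_answer, c = 0.5, trials = 1000) {
  correct = 0
  for (t in 1:trials) {
    # One run of NA on the set; compare its verdict with the known answer.
    if (naive_algorithm(S, c) == correct_answer) {
      correct = correct + 1
    }
  }
  return(correct / trials)
}

# Example: the unsorted set used as S2 later in the report; its correct answer is FALSE.
estimate_p_correct(c(1, 5, 3, 4, 5, 6, 7, 13, 9, 10), FALSE, c = 0.5)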

In Experiments with Infinite Sets, different types of infinite sets are used. The first type is a countable set. The second type is an uncountable set bounded within a certain range. Since the sets are infinite, it is impossible to define every element by any manual means; the sets are therefore described as the ranges of generating functions. As the proofs have shown that only NA works with infinite sets, the non-NA algorithms are not tested here. Rather, NAc=1/2, NAc=1/10, NAc=1/25, and NAc=1/50 are tested and compared.
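As an illustration of how NA can be applied to such a generated set, the following sketch assumes the set is represented by its generating function together with an explicit domain (e.g. the integers 0 to 1000 for the countable experiments); for the uncountable experiments, the domain points would instead be drawn with runif from the bounded interval. The helper name and argument layout are hypothetical.

naive_algorithm_generated = function(list_generator, domain, n_picked) {
  # Pick n_picked points from the domain, kept in ascending order,
  # and evaluate the generating function at those points only.
  xs = sort(sample(domain, n_picked))
  values = sapply(xs, list_generator)
  # Conclude about the whole generated set from the sampled values alone.
  return(all(diff(values) >= 0))
}

# Example: the generator for S7, which negates even inputs.
list_generator = function(n) {
  if (n %% 2 == 0) { return(-n) } else { return(n) }
}
naive_algorithm_generated(list_generator, 0:1000, n_picked = 2)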

Experiments with Finite Sets

S1 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Table 1: P(C) over 1000 trials
  First Algorithm:    1
  Divide and Conquer: 1
  NAc=1/2:            1
  NAc=1/4:            1

Set Generating Code:
# Question: Is this set sorted in an ascending order?
# Correct Answer: Yes
# The set is {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
S = 1:10

Refer to Appendix I Part I for calculation and output code.

S2 = {1, 5, 3, 4, 5, 6, 7, 13, 9, 10}

Table 2: P(C) over 1000 trials
  First Algorithm:    1
  Divide and Conquer: 1
  NAc=1/2:            0.146
  NAc=1/4:            0.446

Set Generating Code:
# Question: Is this set sorted in an ascending order?
# Correct Answer: No
# The set is {1, 5, 3, 4, 5, 6, 7, 13, 9, 10}
S = c(1, 5, 3, 4, 5, 6, 7, 13, 9, 10)

Refer to Appendix I Part I for calculation and output code.

S3 = {0, 1, 0, 2, 0, 3, 0, 4, 0, 5}

Table 3: P(C) over 1000 trials
  First Algorithm:    1
  Divide and Conquer: 1
  NAc=1/2:            0.216
  NAc=1/4:            0.721

Set Generating Code:
# Question: Is this set sorted in an ascending order?
# Correct Answer: No
# The set is {0, 1, 0, 2, 0, 3, 0, 4, 0, 5}
S = c(0, 1, 0, 2, 0, 3, 0, 4, 0, 5)

Refer to Appendix I Part I for calculation and output code.


S4 = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1}

Table 4: P(C) over 1000 trials
  First Algorithm:    1
  Divide and Conquer: 1
  NAc=1/2:            1
  NAc=1/4:            1

Set Generating Code:
# Question: Is this set sorted in an ascending order?
# Correct Answer: No
# The set is {10, 9, 8, 7, 6, 5, 4, 3, 2, 1}
S = c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

Refer to Appendix I Part I for calculation and output code.

S5 = {a, b, c, d, e, f, g, h, i, j}, where a, ..., j are random integers from [0, 400000]

Table 5: P(C) over 1000 trials
  First Algorithm:    1
  Divide and Conquer: 1
  NAc=1/2:            0.458
  NAc=1/4:            0.937

Set Generating Code:
# Question: Is this set sorted in an ascending order?
# Correct Answer: ??
# - What is given is actually randomly generated numbers.
# - Each number is from the range of [0, 400000].
# - So there is no way to determine the correct answer!
S = rep(0, 10)
for (i in 1:10) {
  S[i] = sample(400000, 1)
}

Refer to Appendix I Part I for calculation and output code.

Experiments with Infinite Sets

Countable Infinite Sets

S6 = {x ∈ [0, 1000] | f(x) = x}

Table 6: P(C) over 1000 trials
  NAc=1/2:  1
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: Yes
list_generator = function(n) {
  return(n)
}

Refer to Appendix I Part II for calculation and output code.


S7 = {x ∈ [0, 1000] | f(x) = x if x is odd, −x if x is even}

Table 7: P(C) over 1000 trials
  NAc=1/2:  0.509
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: No
list_generator = function(n) {
  if (n %% 2 == 0) {
    return(-n)
  } else {
    return(n)
  }
}

Refer to Appendix I Part II for calculation and output code.

S8 = {x ∈ [0, 1000] | f(x) = 0 if x is odd, x if x is even}

Table 8: P(C) over 1000 trials
  NAc=1/2:  0.243
  NAc=1/10: 0.991
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: No
list_generator = function(n) {
  if (n %% 2 == 0) {
    return(n)
  } else {
    return(0)
  }
}

Refer to Appendix I Part II for calculation and output code.

S9 = {x ∈ [0, 1000] | f(x) = (x − 1)(x − 200)}

Table 9: P(C) over 1000 trials
  NAc=1/2:  0.992
  NAc=1/10: 0.997
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: No
list_generator = function(n) {
  return((n - 1) * (n - 200))
}

Refer to Appendix I Part II for calculation and output code.


S10 = {x ∈ [0, 1000] | f(x) = (x − 1)(x − 200)(x − 4000)}

Table 10: P(C) over 1000 trials
  NAc=1/2:  0.347
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: No
list_generator = function(n) {
  return((n - 1) * (n - 200) * (n - 4000))
}

Refer to Appendix I Part II for calculation and output code.

S11 = {x ∈ [0, 1000] | f(x) = (x − 1)(x − 200)(x − 3)}

Table 11: P(C) over 1000 trials
  NAc=1/2:  0.981
  NAc=1/10: 0.936
  NAc=1/25: 0.873
  NAc=1/50: 0.787

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: No
list_generator = function(n) {
  return((n - 1) * (n - 2) * (n - 3))
}

Refer to Appendix I Part II for calculation and output code.

S12 = {x ∈ [0, 1000] | f(x) = a random integer from (0, 2000)}

Table 12: P(C) over 1000 trials
  NAc=1/2:  0.503
  NAc=1/10: 0.998
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: ???
list_generator = function(n) {
  return(sample(2000, 1))
}

Refer to Appendix I Part II for calculation and output code.

Uncountable, Bounded Infinite Sets

S13 = {x ∈ [0, 1] | f(x) = sin(x)}

Table 13: P(C) over 1000 trials
  NAc=1/2:  1
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: Yes (would be No if the domain exceeded pi/2)
list_generator = function(n) {
  return(sin(n))
}


Refer to Appendix I Part II for calculation and output code.

S14 = {x ∈ [0, 1] | f(x) = sin(10x)}

Table 14: P(C) over 1000 trials
  NAc=1/2:  0.445
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: No
list_generator = function(n) {
  return(sin(n * 10))
}

Refer to Appendix I Part II for calculation and output code.

S15 = {x ∈ [0, 1] | f(x) = sin(100x)}

Table 15: P(C) over 1000 trials
  NAc=1/2:  0.483
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: No
list_generator = function(n) {
  return(sin(n * 100))
}

Refer to Appendix I Part II for calculation and output code.

S16 = {x ∈ [0, 1] | f(x) = x}

Table 16: P(C) over 1000 trials
  NAc=1/2:  1
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: Yes
list_generator = function(n) {
  return(n)
}

Refer to Appendix I Part II for calculation and output code.

S17 = {x ∈ [0, 1] | f(x) = 2x}
* The results in this section are to be discarded. Please read Inconsistencies in S17, S18 for more information.

Table 17: P(C) over 1000 trials
  NAc=1/2:  0.526
  NAc=1/10: 0.479
  NAc=1/25: 0.5
  NAc=1/50: 0.486

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: Yes
list_generator = function(n) {
  return(2 * n)
}

Refer to Appendix I Part II for calculation and output code.

S18 = {x ∈ [0, 1] | f(x) = 10x}
* The results in this section are to be discarded. Please read Inconsistencies in S17, S18 for more information.

Table 18: P(C) over 1000 trials
  NAc=1/2:  0.098
  NAc=1/10: 0.101
  NAc=1/25: 0.093
  NAc=1/50: 0.850

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: Yes
list_generator = function(n) {
  return(10 * n)
}

Refer to Appendix I Part II for calculation and output code.

S19 = {x ∈ [0, 1] | f(x) = (1/2)x}

Table 19: P(C) over 1000 trials
  NAc=1/2:  1
  NAc=1/10: 1
  NAc=1/25: 1
  NAc=1/50: 1

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: Yes
list_generator = function(n) {
  return(0.5 * n)
}

Refer to Appendix I Part II for calculation and output code.

S20 = {x ∈ [0, 1] | f(x) = a random integer from (0, 1000)}

Table 20: P(C) over 1000 trials
  NAc=1/2:  0.501
  NAc=1/10: 0.005
  NAc=1/25: 0
  NAc=1/50: 0

Set Generating Code:
# Question: Is the set sorted in an ascending order or not?
# Answer: ???
list_generator = function(n) {
  return(sample(1000, 1))
}

Refer to Appendix I Part II for calculation and output code.

Discussion

For the most part, everything seems predictable. NA with a high c-value is less accurate than NA with a low c-value, and the results of the experiments also seem to confirm Proof #1, as the Naive Algorithm with a lower c-value tends to be more accurate than the one with a higher value. To make things worse, the ideal running time of O(1) for NA also seems too idealistic, because the c-value can have a major impact on the running time. The other

algorithms do not suffer from this vague running time because they do not depend on a c-value. In the end, NA by itself is not all that useful as a checking algorithm; the best strategy for checking a set appears to be the divide-and-conquer strategy. However, the use of NA is sometimes unavoidable, especially in cases where infinity is involved.

From the tests, there seem to be two main strategies. The first strategy is to reduce the c-value as much as possible. With a reduced c-value, the algorithm behaves more like the First Algorithm and hence becomes more accurate. How small must the c-value be? There is no way to know in general; it depends on the problem. If the problem only requires a rough check, then a high c-value should be fine.

Another, more complex strategy is to calculate P(C of NAc=0.5). In this test, P(C of NAc=0.5) is found by dividing the number of correct guesses by the total number of guesses. One can then use this probability to see how well the set obeys the property; for instance, P(C) = 1 means that the set is entirely arranged in ascending order. Admittedly, to achieve total accuracy, infinitely many trials must be performed, but if total accuracy is not required, a finite number of trials can still be used. This strategy is also of limited use because some party must already know whether the set is sorted beforehand. Despite this, the strategy seems to yield an interesting result when analysing a randomly generated set.

Something strange happens when the algorithms have to analyze a randomly generated set. P(C) for each algorithm should be unobtainable, because there is no way to know in advance whether the set is sorted or unsorted. Nevertheless, the algorithms that are not NA seem to have P(C) = 1. This anomaly is very likely caused by the fact that the non-naive algorithms are forced to be correct. This is merely a conjecture, though, and more analysis is required to prove it:

Conjecture #1: Non-Naive Algorithms (like the First Algorithm and the divide-and-conquer strategy) are forced to be correct or, at least, to give a binary answer.

Interestingly, NA with a low c-value also has P(C) much closer to 1 than NA with a high c-value, and this may stem from the consequence of Proof #1. Regardless of how interesting this conjecture is, it will not be justified in this report and is left for future analysis; a formal proof is likely to be too complex for this report.

In the context of a randomly generated set, P(C) for NA with the highest c-value seems to hover around 0.5. So the author would like to make this conjecture:

Conjecture #2: P(C of NAc=0.5) = 0.5 if NA is analysing a randomly generated set.

Now, the author would like to propose some theories about this conjecture. Before moving on to the theories, it should first be explained why only c = 0.5 will work: by Proof #1, the lower the c-value, the more NA becomes like the First Algorithm, and by Conjecture #1, the First Algorithm does not fare well with randomly generated sets. This report will not attempt to prove this conjecture either, since the proof would be extremely complex and worthy of a report of its own. However, the author would like to offer an insight into why P(C of NAc=0.5) = 0.5 for any randomly generated set. The insight comes from entropy and the Monte Carlo method.
In a simple demonstration of entropy due to Gibbs, imagine a room filled with two different types of gas. [Lebowitz] At first, the gases are separated by a barrier; when the barrier is removed, the gases mix with each other. The 'mixing' energy is the entropy. [Lebowitz] In order to reach maximum entropy, the quantities of the two gases must be the same. [Lebowitz] Although entropy is a concept from physics, it can be used as an analogy; in fact, Claude Shannon created the concept of information entropy by using physical entropy as an analogy. [Shannon]

To link entropy with NA, one must also look at stochastic processes and the Monte Carlo method. An astute reader will have noticed that NA is closely related to Monte Carlo algorithms. There is, however, a major difference between NA and a Monte Carlo algorithm: NA seeks to answer a question, so once an answer is found, the sampled numbers are abandoned, whereas a Monte Carlo algorithm tries to build a model from the numbers. [Santos] Although NA is not necessarily a Monte Carlo algorithm, the process used to find P(C) in this experiment is in fact a Monte Carlo method.

When the process to find P(C of NAc=0.5) is running, it creates stochastic processes of size 2. Each such process is either a counting-up process or a counting-down process, and the probability of correctness can be treated as the proportion of one of the two types, depending on the question. For instance, if the question asks whether the set is sorted in ascending order, then P(C) represents the fraction of counting-up processes and P(∼C) represents the fraction of counting-down processes. Now, imagine all of the counting-up and counting-down processes floating in a room like gas, separated by a barrier.

When the barrier is removed, the processes start to mix together. The entropy in this case represents the randomness of the set. In order to achieve maximum randomness/entropy, the numbers of counting-up processes and counting-down processes must be the same. Since P(C of NAc=0.5) represents the proportion of one type of stochastic process, the ideal value of P(C of NAc=0.5) for maximum randomness is 0.5. From this, the formula for the randomness of the set is:

−P(C) log2 P(C) − (1 − P(C)) log2 (1 − P(C))

For this function, the maximum randomness is 1 and the minimum is 0. The function is based on Shannon's entropy formula in A Mathematical Theory of Communication. [Shannon] However, as tempting as it is to apply this formula in real-world applications, one should remember that this is all just a conjecture. Until a formal proof can be written, it can only be used as an insight into why P(C of NAc=0.5) = 0.5 for a randomly generated set.
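As a minimal illustration of this insight (not a proof), the sketch below assumes the hypothetical naive_algorithm and estimate_p_correct helpers from earlier, treats a freshly generated random set as "not sorted" for scoring purposes, and plugs the estimated P(C of NAc=0.5) into the randomness formula above.

randomness = function(p) {
  # Shannon-style entropy of a binary outcome; taken as 0 at p = 0 or p = 1.
  if (p == 0 || p == 1) {
    return(0)
  }
  return(-p * log2(p) - (1 - p) * log2(1 - p))
}

# A randomly generated set of 10 integers; it is almost surely not sorted,
# so FALSE is assumed to be the correct answer for the estimate.
S_random = sample(400000, 10)
p_hat = estimate_p_correct(S_random, FALSE, c = 0.5, trials = 1000)

p_hat              # expected to hover around 0.5 by Conjecture #2
randomness(p_hat)  # close to 1, i.e. close to maximum randomness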

* Inconsistencies in S17, S18

Something is clearly wrong with the analysis of S17 and S18. Each set generating function is simply a linear function, so P(C) should be 1; however, as observed, P(C) ≠ 1. The author would like to argue that this may have something to do with a limitation of R: if an operation that increases the number of digits (such as multiplication or addition) is applied to the numbers, the result is rounded because the new numbers exceed the maximum number of digits R can represent. When the numbers are rounded, they lose some of their information, and that can alter the results of the experiments.

There is also something interesting in the tests on S17 and S18. The probability of correctness for S17 is about 0.5, while its set generating function is f(x) = 2x; the probability of correctness for S18 is about 0.1, while its set generating function is f(x) = 10x. It seems that the coefficients of the set generating functions have something to do with P(C). However, this topic is outside the scope of this report. For the purposes of this report, the results from these sets should be ignored.
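As a small, generic illustration of the kind of rounding the author suspects (a general property of floating-point arithmetic in R, not a confirmed explanation of these particular results):

# 0.1 has no exact binary floating-point representation, so scaling it
# introduces a tiny rounding error that ordinary comparisons can expose.
x = 0.1 * 3
y = 0.3
x == y                     # FALSE
print(x - y, digits = 17)  # a difference on the order of 1e-17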

Appendix I

Part I

trial_num = 1000

########## TESTING THE FIRST ALGORITHM ##########
# The number of correct answers:
a1_correct = 0
# Now, the normal algorithm:
for (t in 1:trial_num) {

  is_sorted = T
  for (i in 1:9) {
    is_sorted = is_sorted & (S[i] <= S[i + 1])
