But Is it Random?
Mr. King, MDM4U, 2010
The Runs Test for Randomness
While the sequence of outcomes of repeated dice rolls can be assumed random, not everything is random. The digits obtained when dividing 8/17 seem random until the 16th decimal place, after which they are seen to repeat:
\[ 8/17 = 0.47058823529411764705882352941176\ldots = 0.\overline{4705882352941176} \]
The famous approximation of Pi, 22/7, is nothing more than an approximation, and its digits are also seen to repeat:
\[ 22/7 = 3.1428571428571428571428571428571\ldots = 3.\overline{142857} \]
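The repetition in both fractions can be checked mechanically. The following is a minimal Python sketch of long division that stops as soon as a remainder recurs, since from that point the digits must cycle. The function name `repeating_decimal` is made up for this illustration.

```python
def repeating_decimal(numerator, denominator):
    """Return (integer part, non-repeating digits, repeating digits) as strings."""
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position in `digits` where it first appeared
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if remainder == 0:
        # The division terminated, so nothing repeats.
        return str(integer_part), "".join(digits), ""
    start = seen[remainder]  # the cycle begins where this remainder first occurred
    return str(integer_part), "".join(digits[:start]), "".join(digits[start:])


print(repeating_decimal(8, 17))   # ('0', '', '4705882352941176')
print(repeating_decimal(22, 7))   # ('3', '', '142857')
```

A fraction with denominator q can have at most q - 1 distinct nonzero remainders, which is why 8/17 cannot hold out for more than 16 digits before repeating.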
Sequences generated by a computer are called “pseudorandom” sequences, because they come from a formula applied to an input number called a “seed”. Each time a new number is generated, that number becomes the new seed. This means that identical sequences can be generated by starting from the same seed. A seed can be supplied by the user or taken from the computer’s system clock. Most computer games and simulations rely on this kind of pseudorandomness.
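One simple formula of this kind is a linear congruential generator. The sketch below is an illustration only: the function name `lcg` is hypothetical, and the constants are a widely quoted set (from Numerical Recipes), not something taken from this handout. Real games and simulations typically use more elaborate generators.

```python
def lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Generate `count` pseudorandom integers in [0, m) from `seed`.

    The constants a, c, m are an assumed, commonly quoted choice.
    """
    values = []
    state = seed
    for _ in range(count):
        state = (a * state + c) % m  # the new number becomes the next seed
        values.append(state)
    return values


first = lcg(seed=2010, count=5)
second = lcg(seed=2010, count=5)
other = lcg(seed=2011, count=5)

print(first == second)  # same seed, same sequence
print(first == other)   # different seed, different sequence
```

Restarting the generator from any value it has already produced continues the sequence from that point, which is exactly the "number becomes the new seed" behaviour described above.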
To test for randomness, keeping track of 10 different digits, or of 26 letters of the alphabet, is not straightforward. Instead, the outcomes are usually divided into two groups. Digits can be grouped into even and odd, for example.
Are the digits in the number π (Pi) random? To judge this, we need at least 20 digits for a statistically significant sample. The Windows Calculator on MS-Windows 7 gives Pi to 32 digits: 3.1415926535897932384626433832795
Let’s look for randomness by dividing the digits into odd (1 3 5 7 9) and even (0 2 4 6 8), writing “e” for even and “d” for odd. We can agree that an outcome such as “eeeeeeeeee” constitutes a pattern, and thus is not random. We can also agree that “dededededede” and “deedeedeedeed” are other patterns.
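The labelling, and the idea of a "run" (a maximal block of identical consecutive labels), can be sketched in a few lines of Python. The helper names `parity_labels` and `count_runs` are invented for this illustration.

```python
def parity_labels(text):
    """Label each digit 'e' (even) or 'd' (odd); other characters are skipped."""
    return "".join("e" if int(ch) % 2 == 0 else "d" for ch in text if ch.isdigit())


def count_runs(labels):
    """Count maximal blocks of identical consecutive labels."""
    if not labels:
        return 0
    runs = 1
    for previous, current in zip(labels, labels[1:]):
        if current != previous:  # a change of label starts a new run
            runs += 1
    return runs


pi_text = "3.1415926535897932384626433832795"  # the 32 digits quoted above
labels = parity_labels(pi_text)
print(labels)
print(count_runs(labels))
print(count_runs("eeeeeeeeee"), count_runs("dededededede"))
```

The two obvious patterns sit at the extremes: a constant sequence has a single run, while a strictly alternating one has as many runs as symbols. A random-looking sequence should land somewhere in between.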
3.1415926535897932384626433832795
d.dedddeedddeddddedeeeeeeddededdd
Each letter is meant to line up with the digit above it. We count 15 runs of odd and even digits in total; 8 of these are runs of odd digits and 7 are runs of even digits. Of the 32 digits, 19 are odd, while the remaining 13 are even. The fact that 19 and 13 are both greater than 10 means we can assume a normal distribution. We need to define how many runs of odd and even digits there must be for the sequence to be considered random. For this, it is the number of odd digits, n1, and the number of even digits, n2, that are important:
\[ \mu = \frac{2 n_1 n_2}{n_1 + n_2} = \frac{2(19)(13)}{19 + 13} = \frac{494}{32} = 15.4375 \]
This is the mean number of runs.
The number 15.4375 means that our total number of runs has to be an integer close to 15.4375 to be considered random. We counted 8 + 7 = 15 runs in total, so that does seem close. But this is not hard proof. Can we come up with a less subjective definition of “close to the mean”? One way is to show that the z-score of the sequence is close to 0. We can set the cutoff point such that |z| < 0.5. This means that the observed number of runs must be at most half a standard deviation away from the mean to be considered random. For a z-score, we need the mean and a way to compute the standard deviation.
The standard deviation is given by:
\[ \sigma = \sqrt{\frac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)}} = \sqrt{\frac{2(19)(13)\,(2 \cdot 19 \cdot 13 - 19 - 13)}{(19 + 13)^2 \,(19 + 13 - 1)}} \approx 2.681 \]
Now we have what is needed to compute a z-score, where \bar{X} = 15 is the observed number of runs:
\[ z = \frac{\mu - \bar{X}}{\sigma} = \frac{15.4375 - 15}{2.681} \approx 0.163 \]
This is well under the cutoff of 0.5.
It is also possible to generate a confidence interval centered around µ, to show that “15” falls inside it.
H0: \mu = \bar{X}, at a 90% level of confidence (\alpha = 0.1).
H1: \mu \neq \bar{X}.
At 90% confidence, we obtain:
\[ CI: \mu \pm z_{0.05} \times \frac{\sigma}{\sqrt{n}} = 15.4375 \pm 1.645 \times \frac{2.681}{\sqrt{32}} = 15.4375 \pm 0.780 \approx 14.66 \text{ to } 16.22 \]
Seen this way, our experimental value of 15 falls inside the confidence interval, so we do not reject H0 at the 90% level. This might still seem a little shaky, since the sample is only 32 digits long, and a second look at the sequence reveals some peculiarities:
1. The digit “0” never occurs in the sequence. Thus, there are only 4 even digits used against 5 odd digits.
2. While the average number of times a digit should occur in a 32-digit sequence is 32/10 = 3.2 times, the number “3”
occurs 7 times, more than twice the average.
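Both peculiarities are easy to confirm by tallying the digits. This is a small sketch using Python's standard `collections.Counter`; the variable names are chosen for illustration.

```python
from collections import Counter

pi_text = "3.1415926535897932384626433832795"  # the 32 digits used above
digits = [ch for ch in pi_text if ch.isdigit()]

counts = Counter(digits)
expected = len(digits) / 10  # average count per digit if all ten were equally common

for d in "0123456789":
    ratio = counts[d] / expected
    print(f"digit {d}: {counts[d]} times ({ratio:.2f} x the average)")
```

A `Counter` returns 0 for a digit that never appears, so the missing "0" shows up as an explicit row rather than being silently skipped.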
The question then arises: is this just due to randomness? Pi has been computed to billions of digits, and its sequence has never been found to repeat as of this writing. So, ultimately, the answer has to be yes, but we need to show it to some level of confidence.
With more digits, it is also more likely for our observed number of runs to fall closer to the middle of the normal distribution. The software Mathematica is able to generate Pi to as many digits as one may have the patience for. For example, a command such as N[Pi, 500] will generate Pi to 500 digits. If the sequence is indeed random, we should expect the proportions of zeroes and threes to even themselves out at some point.
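For readers without Mathematica, the same experiment can be sketched in plain Python. The digits here come from Machin's formula, π = 4(4 arctan(1/5) − arctan(1/239)), evaluated with integer arithmetic; this is a substitute technique, not what Mathematica does internally. The function names and the `guard` parameter are hypothetical. The runs-test quantities follow the mean, standard deviation and z-score formulas as written in this handout; note that many references state the expected number of runs with an additional + 1, which this sketch does not apply.

```python
import math
from itertools import groupby


def arctan_inverse(x, unity):
    """arctan(1/x) scaled by `unity`, from the alternating Taylor series."""
    term = unity // x
    total = term
    x_squared = x * x
    n = 1
    sign = -1
    while term:
        term //= x_squared
        n += 2
        total += sign * (term // n)
        sign = -sign
    return total


def pi_digits(count, guard=10):
    """First `count` digits of Pi (leading 3 included), as a string.

    `guard` extra digits (an assumed safety margin) absorb rounding error.
    """
    unity = 10 ** (count + guard)
    pi_scaled = 4 * (4 * arctan_inverse(5, unity) - arctan_inverse(239, unity))
    return str(pi_scaled)[:count]


def runs_test(digits):
    """Runs-test quantities for an odd/even split, using the handout's formulas."""
    labels = ["e" if int(ch) % 2 == 0 else "d" for ch in digits]
    runs = sum(1 for _ in groupby(labels))  # one group per run
    n1 = labels.count("d")  # odd digits
    n2 = labels.count("e")  # even digits
    mean = 2 * n1 * n2 / (n1 + n2)  # as written in the text (no "+ 1")
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    sigma = math.sqrt(variance)
    z = (mean - runs) / sigma
    return {"runs": runs, "odd": n1, "even": n2,
            "mean": mean, "sigma": sigma, "z": z}


for count in (32, 500):
    digits = pi_digits(count)
    print(count, runs_test(digits))
    print("  share of 0s:", digits.count("0") / count,
          " share of 3s:", digits.count("3") / count)
```

Running the loop for both 32 and 500 digits puts the small-sample figures next to the larger-sample ones, so the effect of adding digits can be read off directly.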