Time Series Student Project
Session: Fall 2011
Name: xxx xxx
Time Series on Lottery Numbers
Introduction
Lotteries are a state-approved form of gambling found all around the world. In China, a welfare
lottery called “Two Color Balls” is very popular. There are 33 red balls and 16 blue balls. The
rule is to choose 6 out of the 33 red balls and 1 out of the 16 blue balls. This project
focuses on the time series of the sum of the 6 chosen red ball numbers.
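As a reference point for the analysis, the mean and standard deviation of the sum can be derived from the rule alone. The sketch below assumes that the red balls are numbered 1 to 33 and that every set of 6 balls is equally likely; neither assumption is stated by the data source, and the function name is only illustrative.

```python
import math

def sum_moments(n_balls: int, n_drawn: int) -> tuple[float, float]:
    """Mean and standard deviation of the sum of n_drawn balls drawn
    without replacement from balls numbered 1..n_balls (assumed fair draw)."""
    ball_mean = (n_balls + 1) / 2           # mean of the numbers 1..N
    ball_var = (n_balls ** 2 - 1) / 12      # variance of one uniform pick from 1..N
    # Finite-population correction for sampling without replacement.
    correction = (n_balls - n_drawn) / (n_balls - 1)
    mean = n_drawn * ball_mean
    std = math.sqrt(n_drawn * ball_var * correction)
    return mean, std

mean, std = sum_moments(33, 6)
print(f"theoretical mean = {mean:.2f}, theoretical std = {std:.2f}")
```

Under these assumptions the mean is 102 and the standard deviation is about 21.42, close to the sample values of 101.47 and 21.00 reported in the Q-Q plot section.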
Data Source
All the data (100 values) were collected from the following webpage (in Chinese):
http://www.cqcp.net/trend/ssq/trendchart_red.aspx
Part of the data is shown below:
Time      Sum
2011109   77
2011110   134
2011111   81
2011112   89
2011113   117
2011114   108
2011115   86
2011116   75
2011117   121
2011118   77
Analysis
1) Stationarity
Figure 1 shows no obvious trend in this time series, so it is reasonable to assume
stationarity.
2) Sample Autocorrelation
Figure 2 displays the sample autocorrelation function from lag 1 to lag 99. All the points in
Figure 2 are within the critical bounds $\pm 0.2 = \pm 2/\sqrt{100}$.
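The check described above can be sketched as follows. This is a minimal illustration, not the code used for Figure 2: it uses the standard sample autocorrelation estimator and only the ten values listed in the Data Source section, and the function names are hypothetical.

```python
import math

def sample_acf(series: list[float], lag: int) -> float:
    """Sample autocorrelation at the given lag (standard estimator that
    divides by the total sum of squared deviations)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom

def critical_bound(n: int) -> float:
    """Approximate white-noise bound 2 / sqrt(n) for a series of length n."""
    return 2 / math.sqrt(n)

# The ten values listed in the Data Source section, not the full 100-value series.
sums = [77, 134, 81, 89, 117, 108, 86, 75, 121, 77]
bound = critical_bound(len(sums))
for lag in range(1, 4):
    r = sample_acf(sums, lag)
    print(f"lag {lag}: r = {r:+.3f}, inside bounds: {abs(r) <= bound}")

# Bound for the full series of 100 values, as used in Figure 2.
print(critical_bound(100))
```

With only ten values the bound is much wider than 0.2, so the loop is meant to show the mechanics rather than reproduce the figure.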
3) Partial Sample Autocorrelation
Figure 3 graphs the sample partial autocorrelation function from lag 1 to lag 99. All of its
points also lie within the same critical bounds as above.
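One common way to obtain partial autocorrelations from autocorrelations is the Durbin-Levinson recursion. The sketch below is an assumption about method, since the project does not say how Figure 3 was computed; the function name and the example input are illustrative and are not the lottery data.

```python
def pacf_from_acf(acf: list[float]) -> list[float]:
    """Partial autocorrelations phi_kk for k = 1..K from autocorrelations
    acf[0..K] (acf[0] must be 1), using the Durbin-Levinson recursion."""
    max_lag = len(acf) - 1
    pacf = []
    prev = []  # prev[j - 1] holds phi_{k-1, j}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = acf[1]
            current = [phi_kk]
        else:
            numer = acf[k] - sum(prev[j - 1] * acf[k - j] for j in range(1, k))
            denom = 1 - sum(prev[j - 1] * acf[j] for j in range(1, k))
            phi_kk = numer / denom
            # phi_{k, j} = phi_{k-1, j} - phi_kk * phi_{k-1, k-j}
            current = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
            current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf

# Illustrative input: an AR(1)-shaped ACF with r_k = 0.5 ** k.
example_acf = [0.5 ** k for k in range(5)]
print(pacf_from_acf(example_acf))
```

For this example input only the lag-1 partial autocorrelation is nonzero, which is the AR(1) signature. For white noise, every value would be expected to stay inside the critical bounds.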
Figure 1: Sum versus Time
Figure 2: Autocorrelation versus Lag
Figure 3: Partial Autocorrelation versus Lag
4) Y(t) versus Y(t-1)
Figure 4 plots Y(t) against Y(t-1) and shows no apparent upward or downward trend, that is, no
visible relationship between consecutive values.
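The points of such a lag plot, and a numerical summary of it, can be sketched as follows. The helper names are hypothetical, and the input is only the ten values listed in the Data Source section, so the result is illustrative rather than a reproduction of Figure 4.

```python
import math

def lag1_pairs(series: list[float]) -> list[tuple[float, float]]:
    """Pairs (Y(t-1), Y(t)) as plotted in a lag-1 scatter plot."""
    return list(zip(series[:-1], series[1:]))

def pearson(pairs: list[tuple[float, float]]) -> float:
    """Pearson correlation coefficient of the paired values."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# The ten values listed in the Data Source section, not the full series.
sums = [77, 134, 81, 89, 117, 108, 86, 75, 121, 77]
pairs = lag1_pairs(sums)
print(pairs[:3])
print(f"lag-1 correlation: {pearson(pairs):+.3f}")
```

A correlation near zero on the full series would agree with the absence of a trend in the scatter plot.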
Figure 4: Y(t) versus Y(t-1)
5) Q-Q Plot
The sample mean of the sum is 101.47 and the sample standard deviation is 21.00. The Q-Q plot
in Figure 5 strongly suggests that $(Y_t - 101.47)/21$ approximately follows a standard normal
distribution.
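The points of a Q-Q plot like this can be computed as in the sketch below. The plotting positions (i - 0.5)/n are one common convention and an assumption here, since the project does not state which was used. The input is again only the ten values listed in the Data Source section.

```python
from statistics import NormalDist

def qq_points(series, mean, std):
    """Pairs (standard normal quantile, standardized sample quantile)."""
    n = len(series)
    standardized = sorted((y - mean) / std for y in series)
    # Plotting positions (i - 0.5) / n: an assumed convention.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, standardized))

# The ten values listed in the Data Source section, not the full series.
sums = [77, 134, 81, 89, 117, 108, 86, 75, 121, 77]
for q_theory, q_sample in qq_points(sums, mean=101.47, std=21.0):
    print(f"{q_theory:+.2f}  {q_sample:+.2f}")
```

If the standardized sums are approximately standard normal, the printed pairs lie close to the line where both values are equal.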
Figure 5: Q-Q plot, Quantiles of Input Sample versus Standard Normal Quantiles
Conclusion
From the previous analysis, the sum of red ball numbers is white noise with mean 101.47
and standard deviation 21: $Y_t = 101.47 + \varepsilon_t$, where $\varepsilon_t$ is a random
error with mean 0 and standard deviation 21.
Figure 1 shows no need to difference or log-transform the original data. Figures 2 and 3 show
that no AR(p) or MA(q) model is appropriate. Figure 4 reveals no direct connection between Y(t)
and Y(t-1), which leads us to consider a white noise process. Finally, Figure 5 supports this
hypothesis.
Comment on conclusion
It is no surprise that the sum of red ball numbers is a white noise process. Predicting lottery
numbers is really challenging. But one thing should be borne in mind: keep the sum within
one standard deviation of the mean, since extremely large or small sums are unlikely to occur.
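The one-standard-deviation band can be quantified with a short calculation. The sketch below assumes the normal approximation from the Q-Q plot section holds exactly, which is a simplification because the actual sums are integers.

```python
from statistics import NormalDist

mean, std = 101.47, 21.0        # sample values reported in the Q-Q plot section
model = NormalDist(mean, std)   # normal approximation, an assumption

low, high = mean - std, mean + std
prob_within = model.cdf(high) - model.cdf(low)
print(f"P({low:.2f} <= sum <= {high:.2f}) is about {prob_within:.3f}")
```

Under this approximation roughly 68% of draws fall inside the band from 80.47 to 122.47, so about one draw in three still falls outside it.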