NZ-SIMSS
2008 Parallel NCEA Scholarship Statistics and
Modelling Exam Answers
All questions are based on activities associated with a company called PlayGames.
(1) (a) PlayGames needs to estimate within one minute, with 95% confidence, the mean
time per player to finish the game OBE, where players have to stop a monster called OBE
from enslaving a country. It has been noted that the standard deviation of any group of
players is about 3 minutes. Find the minimum sample size for this estimate.
This condition will be fulfilled if the margin of error of a 95% confidence interval
for the mean time is less than 1. ………1
Hence
1.96 x 3 ÷ √n < 1 ………2
Solving gives √n > 5.88, so n > 34.57 and the minimum sample size is n = 35 ………3
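The inequality can also be checked numerically. The following is a minimal sketch, not part of the marking schedule; the function name is made up for this illustration and 1.96 is the usual z-value for 95% confidence.

```python
import math

def min_sample_size(sigma, margin, z=1.96):
    """Smallest whole n for which z * sigma / sqrt(n) does not exceed the margin."""
    return math.ceil((z * sigma / margin) ** 2)

n = min_sample_size(sigma=3, margin=1)
print(n)  # 35
```

With n = 35 the margin of error is just under 1 minute, while n = 34 would leave it slightly above 1.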
(b) PlayGames operates a parlour which has a maximum number of 50 players. Usually
if all of these players are playing OBE the mean time to play the game till its conclusion
is 6.8 minutes. A sample of the size calculated in (a) had a mean of 5.7 minutes. Is this
what might be expected? Give reasons.
The 95% confidence interval has end points 5.7 ± 1.96 x 3 ÷ √35, giving the interval
(4.71, 6.69). ………1
As 6.8 does not belong to this interval we can be 95% confident that the sample of 35
players was more skillful than average ………2
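A short sketch of the same interval calculation, assuming the standard deviation of 3 minutes from part (a); the helper name is hypothetical.

```python
import math

def mean_ci(sample_mean, sigma, n, z=1.96):
    """Confidence interval for a mean when the standard deviation is taken as known."""
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

low, high = mean_ci(sample_mean=5.7, sigma=3, n=35)
print(round(low, 2), round(high, 2))  # 4.71 6.69
print(low <= 6.8 <= high)             # False: 6.8 lies outside the interval
```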
(c) PlayGames wants to compare mean times of customers who come in on week day
evenings and customers who come in over the weekend. Explain how this might be done.
Samples of size more than 30 are randomly selected from the customers who
come during the week and from the customers who come at the weekend. ………1
Let these sample sizes be n1 and n2 respectively. For each sample the sample means
m1 and m2 are calculated, as are the sample standard deviations s1 and s2.
From these statistics a 95% confidence interval is constructed whose end points
are m1 – m2 ± 1.96√(s1²/n1 + s2²/n2). ………2
If 0 is in the interval then there is no significant difference between the mean times;
otherwise there is. ………3
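The procedure can be sketched as follows. The summary statistics used here are invented purely for illustration, since the question gives no data, and the function name is hypothetical.

```python
import math

def diff_means_ci(m1, s1, n1, m2, s2, n2, z=1.96):
    """Large-sample confidence interval for the difference of two means."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = m1 - m2
    return diff - z * standard_error, diff + z * standard_error

# Hypothetical weekday and weekend summary statistics (not from the exam).
low, high = diff_means_ci(m1=6.5, s1=3.0, n1=40, m2=5.9, s2=2.8, n2=45)
significant = not (low <= 0 <= high)
print(round(low, 2), round(high, 2), significant)
```

For these made-up figures 0 lies inside the interval, so no significant difference would be claimed.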
© Rory Barrett 2008
(2) (a) A customer satisfaction survey using a cluster sample was carried out. Discuss
this and give an advantage and a disadvantage.
The customers are divided into groups which are presumed to have the same
characteristics as the population ………1
Then one or more of these groups are chosen randomly, and sampling is done
entirely within the chosen group or groups ………2
An advantage is that it is often more convenient to sample within only one or a few
groups rather than the population as a whole………………………………..3
A disadvantage is that the presumption of the groups having the same
characteristics may be wrong and subsequent sampling will be biased…….4
(b) A survey of 50 customers gave 80% satisfaction rate with the games offered. Use this
to find a 99% confidence interval for the proportion of all customers who are satisfied
with the games offered.
The confidence interval will have end points 0.8 ± 2.576 x √(0.8 x 0.2 ÷ 50) which is
0.6543 < π < 0.9457, where π is the population proportion, or
(0.6543,0.9457)……………….1
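As a check on the arithmetic, a minimal sketch of the interval for a proportion; the function name is made up for this example and 2.576 is the z-value for 99% confidence used above.

```python
import math

def proportion_ci(p_hat, n, z):
    """Large-sample confidence interval for a population proportion."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = proportion_ci(p_hat=0.8, n=50, z=2.576)
print(round(low, 4), round(high, 4))  # 0.6543 0.9457
```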
(c) Many of the customers in the survey hate the game SBA where a giant spider eats
teenagers, as they are frightened of spiders. If 1000 customers were asked whether they
hated SBA what is the probability more than 940 would hate SBA if the probability
customers hated SBA was the maximum probability suggested by the 99% confidence
interval based on the survey in (2)(b) above?
The number of people in 1000 who hate SBA has a B(1000, 0.9457) distribution ………1
Let X be the number of people in 1000 who hate SBA. We need to find P(X > 940), that is P(X ≥ 941).
Attempting to find this using a CFX9750 leads to a MA Error.
μX = 945.7, σX = √(1000 x 0.9457 x (1 - 0.9457)) = 7.166 ………2
Let X* be a normal approximation to X with μX = 945.7, σX = 7.166.
Using this we need to find P(X* > 940.5), with a correction for continuity,
= 0.766 ………3
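The normal approximation can be compared with the exact binomial sum, which Python can evaluate directly. This is a sketch that takes p as the upper limit 0.9457 of the interval in (2)(b); the helper name is hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p = 1000, 0.9457
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Normal approximation with the continuity correction: P(X* > 940.5).
approx = 1 - normal_cdf(940.5, mu, sigma)

# Exact binomial probability P(X >= 941).
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
            for k in range(941, n + 1))

print(round(approx, 3))  # about 0.766
print(round(exact, 3))
```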
(3) PlayGames records the number of customers over a thirty day period. The table
below shows some of this data.
Day   Number of customers   Deseasonalised number of customers
1     57
2     43
3     42
4     31                    87.8
5     73                    84.0
6     183                   81.1
7     105                   34.5
8     64                    87.8
9     52                    89.7
10    49                    89.8
11    29                    85.8
12    80                    91.0
13    188                   86.1
14    191                   120.5
15    63                    86.8
16    44                    81.7
17    41                    81.8
18    30                    86.8
19    81                    92.0
20    195                   93.1
21    181                   110.5
22    66                    89.8
23    56                    93.7
24    55                    95.8
25    24                    80.8
26    78                    89.0
27    201                   99.1
28    178
29    70
30    60
There is also a graph below which shows the actual data and the deseasonalised data.
[Graph: Customers. Original and deseasonalised numbers of customers (vertical axis, 0 to 250) against days (horizontal axis, 0 to 35)]
(a) Write a short essay describing the customer numbers over this period. In this essay
explain how the deseasonalised values were obtained.
There is a slight increasing trend over the period surveyed, particularly at
weekends. ………1
Customer numbers peak over the weekends and are least on Thursdays. The
deseasonalised data shows this as well ………2
The deseasonalised data is obtained by first finding the centred moving
means (CMM) of order 7. The ‘Rough’ values, the differences between the raw data and the
CMM, are then averaged for each day of the week to find the ‘Average Seasonal Effects’ ………3
The deseasonalised data is found by subtracting the ‘Average Seasonal Effects’ from
the raw data ………4
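The steps described above can be sketched as follows, using the 30 daily customer counts from the table. This is an illustration only: the variable names are made up, and the results may differ slightly from the tabulated values, depending on rounding and on exactly which days are included in each average.

```python
customers = [57, 43, 42, 31, 73, 183, 105, 64, 52, 49, 29, 80, 188, 191, 63,
             44, 41, 30, 81, 195, 181, 66, 56, 55, 24, 78, 201, 178, 70, 60]

ORDER = 7
HALF = ORDER // 2

# Centred moving mean of order 7: only defined where a full 7-day window
# exists, i.e. for days 4 to 27. Keys are 1-based day numbers.
cmm = {}
for i in range(HALF, len(customers) - HALF):
    window = customers[i - HALF:i + HALF + 1]
    cmm[i + 1] = sum(window) / ORDER

# 'Rough' = raw value minus CMM, grouped by day of the week (1 to 7).
rough_by_weekday = {d: [] for d in range(1, 8)}
for day, moving_mean in cmm.items():
    weekday = (day - 1) % 7 + 1
    rough_by_weekday[weekday].append(customers[day - 1] - moving_mean)

# Average seasonal effect for each day of the week.
seasonal_effect = {d: sum(v) / len(v) for d, v in rough_by_weekday.items()}

# Deseasonalised value = raw value minus the average seasonal effect.
deseasonalised = {day: customers[day - 1] - seasonal_effect[(day - 1) % 7 + 1]
                  for day in cmm}

for d in range(1, 8):
    print(d, round(seasonal_effect[d], 1))
```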
(b) A trend line is fitted to the deseasonalised data whose equation is y = 0.7x + 77.
The table below shows the average ‘seasonal effects’ for each day of the week.
Day of week   Average ‘seasonal effect’
1             -24
2             -38
3             -41
4             -57
5             -11
6             102
7             71
Use these to predict the customer numbers on day 47 if the features and trend shown
continue.
Day 47 is ‘Day 5’ of the week (47 = 6 x 7 + 5), so the average seasonal effect of -11 is used.
The prediction is 0.7(47) + 77 – 11 = 98.9, or about 99 ………2
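The prediction combines the trend line with the seasonal effect for the matching day of the week. A minimal sketch, with function names chosen for this example only:

```python
# Average seasonal effects from the table, keyed by day of the week.
SEASONAL_EFFECT = {1: -24, 2: -38, 3: -41, 4: -57, 5: -11, 6: 102, 7: 71}

def trend(day):
    """Trend line fitted to the deseasonalised data."""
    return 0.7 * day + 77

def predict(day):
    """Trend value plus the seasonal effect for that day of the week."""
    weekday = (day - 1) % 7 + 1
    return trend(day) + SEASONAL_EFFECT[weekday]

print(round(predict(47)))  # 99
```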
(c) PlayGames will lease new premises if the average daily number of customers exceeds
150. When is this likely to occur and what would be the prediction of the actual number
of customers on that day?
The average daily number of customers is represented by the trend equation,
thus 0.7x + 77 = 150 gives x = 104.3, so the average first exceeds 150 on day 105 ………1
That day is a Sunday, so the ‘average seasonal effect’ is 71 and the prediction
is 0.7 x 105 + 77 + 71 = 221.5; hence expect 221 or 222 people ………2
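The same working as a short sketch; the variable names are illustrative, and the seasonal effect of 71 is the tabulated value for day 7 of the week.

```python
import math

slope, intercept, target = 0.7, 77, 150

x = (target - intercept) / slope   # about 104.3
day = math.floor(x) + 1            # first whole day on which the trend exceeds 150
weekday = (day - 1) % 7 + 1        # 7 corresponds to Sunday
prediction = slope * day + intercept + 71

print(day, weekday, round(prediction, 1))  # 105 7 221.5
```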
(4) The manager of PlayGames is interested in how long it took experienced players of
SBA and OBE to win points in these games. He got a group of 20 very experienced
players and recorded how many points the members of the group won each minute. No
player needed more than ten minutes to complete either game. The data he acquired is
shown in the table below. Each game has 10 points which have to be won in order to
finish.
          Points won in this minute
Minute    OBE    SBA
1         3      13
2         24     25
3         41     19
4         61     23
5         33     30
6         20     18
7         11     22
8         5      18
9         2      20
10        0      12
(a) He summarized this by finding the mean number of minutes for each point for each
game and the standard deviation of number of minutes for each point for each game. Find
the parameters he found and write down what they show.
The required parameters are shown in the table below ………1
Parameter                   OBE            SBA
Sample Mean                 4.19 minutes   5.34 minutes
Sample Standard Deviation   1.58 minutes   2.65 minutes
The values obtained in the table indicate that for experienced players SBA is a more
difficult game than OBE. The reason for this is that it takes longer to score points in
SBA than it does for OBE as shown by the mean………………………………….2
SBA has a higher standard deviation than OBE indicating there is more variation in
the time required to win points in SBA than in OBE………………………….3
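These values can be reproduced by treating the table of points won as a frequency table, with the minute as the value and the points won as the frequency. A sketch with a hypothetical helper name, using the sample (n - 1) form of the standard deviation:

```python
import math

minutes = range(1, 11)
obe = [3, 24, 41, 61, 33, 20, 11, 5, 2, 0]
sba = [13, 25, 19, 23, 30, 18, 22, 18, 20, 12]

def mean_and_sd(values, freqs):
    """Mean and sample standard deviation of data given as a frequency table."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    sum_sq = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs))
    return mean, math.sqrt(sum_sq / (n - 1))

obe_mean, obe_sd = mean_and_sd(minutes, obe)
sba_mean, sba_sd = mean_and_sd(minutes, sba)
print(round(obe_mean, 2), round(obe_sd, 2))  # 4.19 1.58
print(round(sba_mean, 2), round(sba_sd, 2))  # 5.34 2.65
```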
(b) He also drew scatter plots of the number of points against minutes for each game. These
are shown below, together with the fitted regression curves, their equations and the associated
coefficients of determination (R²). Write an essay about what these diagrams show.
[Graph: OBE, points won against minute, with fitted curve y = -1.5303x² + 13.658x + 3.8, R² = 0.5705]
[Graph: SBA, points won against minute, with fitted curve y = -0.4697x² + 4.7788x + 11.8, R² = 0.4958]
The diagrams show scatter plots of the number of points won in the first, second, third, ...
minute against the minute number. The greatest number of points was
won in the fourth minute in OBE and in the fifth minute in SBA.
The number of points won was smaller at the ends of the time interval 1 ≤ time ≤ 10
than in the middle of the interval for both games ………1
The data for both games has been given a trend line which is a quadratic function.
The fit is measured by the coefficient of determination R². It is a better fit for OBE
(R² = 0.5705) than it is for SBA (R² = 0.4958) ………2
The proposed model for OBE is unsatisfactory for two reasons. In the first four
minutes of playing OBE there seems to be a linear relationship if we plot points won
per minute against minute and thereafter a decay ………3
In addition the proposed model gives a negative value for 10 minutes (about -12.7), which is
impossible.
Although there is considerable scatter about the proposed model for SBA it is not
obvious from looking at the graph that the model is flawed ………4
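Both observations can be checked from the fitted equations and the data table. The sketch below recomputes R² from the quoted (rounded) coefficients and evaluates each model at 10 minutes; the function names are made up for this example.

```python
obe = [3, 24, 41, 61, 33, 20, 11, 5, 2, 0]
sba = [13, 25, 19, 23, 30, 18, 22, 18, 20, 12]

def obe_model(x):
    return -1.5303 * x ** 2 + 13.658 * x + 3.8

def sba_model(x):
    return -0.4697 * x ** 2 + 4.7788 * x + 11.8

def r_squared(model, ys):
    """Coefficient of determination for a model evaluated at minutes 1, 2, ..."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - model(x)) ** 2 for x, y in enumerate(ys, start=1))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

print(round(r_squared(obe_model, obe), 3), round(r_squared(sba_model, sba), 3))
print(round(obe_model(10), 2), round(sba_model(10), 2))  # OBE value is negative
```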
(c) How many points would you expect a group of 50 experienced players of OBE to win
in the third minute of play?
If the original group of 20 players were playing then the number of points won in the
third minute would be about 40.
As there are 50 players rather than 20 we would expect 50 x 40 ÷ 20 = 100 on
average ………1
(5) Playgames introduces a new game called CIE. CIE provides far more challenge.
Most players are keen on it. The table below shows some information about the games
Playgames offers.
Game   Percentage of customers   Probability of a win within
       playing this game         ten minutes of play starting
OBE    28%                       2p
SBA    7%                        4p
CIE    65%                       p
(a) Calculate the probability at least 14 of the next 30 customers play SBA.
The number of players in 30 who play SBA has a B(30, 0.07) distribution ………1
We require P(X ≥ 14) where X is the number of players in 30 who play SBA.
P(X ≥ 14) = 1 - P(X ≤ 13), which is less than 0.00001 (about 3 x 10⁻⁹) ………2
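The tail probability is too small for a calculator display but can be summed exactly. A minimal sketch with hypothetical helper names:

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X distributed B(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_at_least(k, n, p):
    """P(X >= k) for X distributed B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

prob = binom_at_least(14, 30, 0.07)
print(prob)  # far smaller than 0.00001
```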
(b) If the probability of a randomly selected player winning in ten minutes irrespective of
the game played is 0.35 then find p;
The probability that a randomly selected player wins in ten minutes is 0.28 x 2p +
0.07 x 4p + 0.65 x p = 1.49p ………1
This is equal to 0.35 ………2
Solving this equation gives p = 0.35 ÷ 1.49 = 0.2349 ………3
(c) Find the probability that if a customer wins in ten minutes that they were playing
OBE.
The required probability is P(OBE|win within 10 minutes) ………1
= 0.28 x 2 x 0.2349 ÷ 0.35 ………2
= 0.3758 ………3
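Parts (b) and (c) can be sketched together: the total probability of a win gives a linear equation in p, and the conditional probability then follows from Bayes' rule. The dictionary names are made up for this illustration.

```python
# Share of customers playing each game, and win probability as a multiple of p.
share = {"OBE": 0.28, "SBA": 0.07, "CIE": 0.65}
win_multiple = {"OBE": 2, "SBA": 4, "CIE": 1}
total_win = 0.35

# Total probability: sum of share x multiple x p over the games equals 0.35.
coefficient = sum(share[g] * win_multiple[g] for g in share)
p = total_win / coefficient

# Bayes' rule: P(OBE | win) = P(OBE) x P(win | OBE) / P(win).
posterior_obe = share["OBE"] * win_multiple["OBE"] * p / total_win

print(round(coefficient, 2), round(p, 4), round(posterior_obe, 4))
```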
(6) In a new games parlour Playgames only had two games machines, X and Y.
Machines were either one or the other. An X machine cost $1000, a Y machine cost
$800. The maximum number of machines was 50. Playgames allocated $48000 for the
purchase of machines. Let x and y be the numbers of X and Y machines. For 0 ≤ x ≤ 20 then y ≤ 40. For 20 < x ≤ 40 then y ≤ 20.
(a) If the ratio of revenue between an X machine and a Y machine is 12:10 then find the
number of each type which should be purchased for maximum revenue;
Constraints are:
1000x + 800y ≤ 48000
x + y ≤ 50 ………1
For 0 ≤ x ≤ 20 then y ≤ 40
For 20 < x ≤ 40 then y ≤ 20
The diagram shows the feasible region……………………………….2
For maximum revenue x has to be as large as possible at a corner and so we need 40
X and 10 Y………………………………………………………………3
(b) Due to changing economic conditions the amount of money allocated for machine
purchase was reduced; however, all other constraints remained the same, as did the
revenue ratios for X and Y. By how much was the budget for machine purchase reduced
if the number of X machines for maximum revenue is now 20, with the revenue being as
large as possible?
In this case we now have 20 X and 30 Y. ………1
Thus the machine budget is 1000(20) + 800(30) = $44000.
The budget has been reduced by $4000 ………2
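The answers to (a) and (b) can be checked by listing every whole-number combination of machines. This sketch assumes the constraints are non-strict (for example x + y ≤ 50), that x is at most 40, and that revenue is proportional to 12x + 10y; the function names are hypothetical.

```python
def feasible(x, y, budget):
    """True if x machines of type X and y of type Y satisfy all the constraints."""
    if 1000 * x + 800 * y > budget:
        return False
    if x + y > 50:
        return False
    if x <= 20:
        return y <= 40
    return x <= 40 and y <= 20

def best_purchase(budget):
    """Feasible whole-number point with the largest revenue 12x + 10y."""
    points = [(x, y) for x in range(41) for y in range(41)
              if feasible(x, y, budget)]
    return max(points, key=lambda pt: 12 * pt[0] + 10 * pt[1])

print(best_purchase(48000))  # (40, 10)
print(best_purchase(44000))  # (20, 30)
```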
(c) As a result of still worsening economic conditions cheaper premises have to be leased
and in the new premises the total number of machines is determined by floor space. Each
X machine uses 1 m² of space and each Y machine uses 2 m². A total of 70 m² is available
for machines. If the ratio of revenue between an X machine and a Y machine is now 0.5:1
then find how many of type X there should be and of type Y.
The new constraint which replaces x + y ≤ 50 is x + 2y ≤ 70 ………1
The feasible region is shown below ………2
The revenue 0.5x + y is proportional to x + 2y, so the revenue line is parallel to this constraint.
This means that any combination of X and Y machines satisfying x + 2y = 70
with 0 < x < 30 will be satisfactory for the given ratio between X and Y ………3