Download Exam 3 - Stetson University

DS280 – INTRODUCTION TO STATISTICS
FALL SEMESTER 2003
“Big Quiz” #3
Answer the following questions in the space provided. Show your work as appropriate. Relative
problem weights are given in brackets; these total 100 points. The word “Pledged” in front of
your signature on this “big quiz” indicates your ongoing commitment to the Stetson Honor
System.
Question 1 [12 points; 2 points apiece]:
Indicate whether each of the following statements is TRUE or FALSE.
a) Income data are typically skewed right.
TRUE
FALSE
b) If two events are mutually exclusive, they are also independent.
TRUE
FALSE
c) It is theoretically possible for the covariance to be negative.
TRUE
FALSE
d) When the Florida Lottery Commission says your odds against winning the
lottery are 23 million to one, they are giving a classical probability.
TRUE
FALSE
Question 2 [3 points]:
For a normal distribution, about ________ of the data are within one standard deviation of
the mean, about ________ are within two standard deviations of the mean, and about ________
are within three standard deviations of the mean.
Question 3 [3 points]:
Evaluate $C^{3456}_{2}$.
Question 4 [12 points]:
Find the sample mean, median, variance, and standard deviation for the following data:
6   2   9   0   3
Question 5 [8 points]:
The table below gives the percentage of forested land in each of the twenty-one provinces
of the Kingdom of Boravia.
Province        % forest
Athabasca        3.9%
Snohomish        5.4%
Murgatroyd       5.7%
Mugwump          6.4%
Thalia           7.5%
Ondibrox         7.9%
Morquitize       8.2%
Paranoia         9.1%
BamaAla          9.8%
Hoozit          10.5%
Prandagloid     12.2%
Mortimer        15.3%
Nermal          18.5%
Lollapalooza    19.7%
Jambalaya       24.7%
BoraBora        28.6%
Kali            32.9%
Lucretia        35.9%
Prunella        39.5%
Otho            42.0%
Percival        48.9%
Sketch an appropriate graph to illustrate these data. Label it appropriately. State, in a sentence
or two, what the graph tells about the data.
Question 6 [18 points, 6 each part]:
Daily returns on the Boravian stock market are approximately normally distributed, with
a mean of 0.05% and a standard deviation of 1.12%.
a) What percentage of the time is the return on the Boravian stock market positive (i.e., greater
than 0%)?
b) What percentage of the time does the Boravian stock market gain more than 10% in a single
day?
c) On the worst 1% of the days on the Boravian stock market, what returns occur?
Question 7 [16 points, divided as indicated]:
Bettors in the Boravian State Lottery pick three numbers out of ten. They win if all three
numbers are correct.
a) [8] If you buy one ticket in the Boravian State Lottery, what is your probability of winning?
b) [8] Lottery tickets cost one Boravian dollar. The lottery pays “100-to-1.” That is, if you win,
you receive $100 (your $1 back, plus $99 more). Otherwise, you lose your dollar. What is the
expected value of your net winnings? The variance?
Question 8 [16 points]:
The data below give the number of hours of sleep the night before a “big quiz”, and the
score on the big quiz, for four students in Recreational Statistics class. Find the slope and
intercept of the regression model for these data. Interpret these numbers, in context.
Sleep:    0    8    6    2
Score:   34   98   98   66
Question 9 [4 points]:
Several research studies on the extent to which genius is inherited have shown that
children of Nobel Prize winners, while generally of above average intelligence, tend not to be as
smart as their parents. This is an example of …
______ Simpson’s paradox
______ Chevalier de Mere’s fallacy
______ regression to the mean
______ false positive rate
Question 10 [4 points]:
Clorinda Cragdingle owns 42 shares of stock in the Sirius Cybernetics Corporation
(SCC). She wants to invest more money in the stock market. Statistically speaking, which of the
following investments will reduce her risk (variance) the most?
______ buying more stock in the Sirius Cybernetics Corporation
______ buying stock in the Antares Cybernetics Corporation, a company in the same industry
whose stock returns have correlation with SCC of .9.
______ buying stock in Amalgamated Fratostat, a heavy manufacturing company whose stock
returns have correlation with SCC of .2.
______ buying stock in Repos-R-Us, an automobile repossession company whose stock returns
have correlation with SCC of -.5.
Question 11 [4 points]:
Does increased television violence cause increased crime? Researchers at the University
of Southern North Dakota have obtained data, for the past twenty years, on number of violent
acts per hour of television programming. They have also obtained data on the per capita crime
rate, for the same period. The correlation between the two is 0.68. Does this indicate that
television violence causes increased crime? Explain.
Question 12 [4 points]:
What is a saturation model in regression? Give an example to illustrate.
DS280 – FALL 2003 – “BIG QUIZ” #3 - SOLUTIONS
1a) True
1b) False
1c) True
1d) True
2) two-thirds, 95%, virtually all
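The rounded figures in answer 2 can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `within_k_sd` is only illustrative and not part of the quiz.

```python
from statistics import NormalDist

def within_k_sd(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd: {within_k_sd(k):.4f}")
```

The three values come out near 68%, 95%, and 99.7%, which matches "two-thirds, 95%, virtually all."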
3) $C^{3456}_{2} = \frac{3456!}{2!\,3454!} = \frac{3456 \cdot 3455 \cdot 3454 \cdots 2 \cdot 1}{(2 \cdot 1)(3454 \cdot 3453 \cdots 2 \cdot 1)} = \frac{3456 \cdot 3455}{2} = 5\,970\,240$
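As a quick check of answer 3, the combination can be evaluated directly. This is a minimal sketch using `math.comb` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
import math

# Number of ways to choose 2 items from 3456, computed two ways.
n_pairs = math.comb(3456, 2)   # library routine
shortcut = 3456 * 3455 // 2    # after cancelling 3454! from top and bottom

print(n_pairs, shortcut)
```

Both expressions give 5,970,240, which is why the cancellation is worth doing by hand: the full factorials are far too large to work with directly.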
4) mean = (6 + 2 + 9 + 0 + 3)/5 = 20/5 = 4
median: 0 2 3 6 9 (put the data in order first); the middle value is 3
variance: either compute
$s^2 = \frac{\sum (X - \bar{X})^2}{n-1}$ or $s^2 = \frac{\sum X^2 - \frac{1}{n}\left(\sum X\right)^2}{n-1}$
By the first method: $s^2 = \frac{(6-4)^2 + (2-4)^2 + \dots + (3-4)^2}{4} = \frac{4 + 4 + 25 + 16 + 1}{4} = 12.5$
By the second method:
X         X^2
6         36
2          4
9         81
0          0
3          9
Totals: 20   130
So variance = $\frac{130 - \frac{1}{5}(20)^2}{4} = \frac{130 - 80}{4} = 12.5$
standard deviation is the square root of the variance: $\sqrt{12.5} = 3.54$
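The hand computations in answer 4 can be verified with the Python standard library. This is only a sketch; `statistics.variance` and `statistics.stdev` use the n-1 (sample) divisor, which is the same convention as the key.

```python
import statistics

data = [6, 2, 9, 0, 3]

mean = statistics.mean(data)
median = statistics.median(data)
variance = statistics.variance(data)   # sample variance, divisor n - 1
std_dev = statistics.stdev(data)       # square root of the sample variance

# Second (shortcut) method from the key: (sum of X^2 - (sum of X)^2 / n) / (n - 1)
n = len(data)
shortcut_variance = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

print(mean, median, variance, round(std_dev, 2), shortcut_variance)
```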
5) Do a histogram of the data. Any reasonable horizontal axis is OK. One possible graph is
given below. (It was done in Excel, which does wretched histograms.) You should also
comment on interpretation of the graph (most concentrated in 0-20%; skewed right; outliers in
the 40s, etc.)
[Figure: histogram titled "Forested Land Percentages in Boravia"; horizontal axis: Percentage of Forested Land (0 to 50); vertical axis: Frequency (0 to 10)]
6a) z = (0 - .05)/1.12 = -.04. This is a z-score; looking up its probability gives .0160. So the
overall probability is .5 + .0160 = .5160
6b) z = (10 - .05)/1.12 = 8.88. That is, nearly 9 standard deviations from the mean. So the
probability is approximately 0.
6c) We want the bottom 1% of the curve. We look up .5 - .01 = .49 in the table (in the center of
the table, with the probabilities), and read off a z-score of 2.33. In other words, the cutoff
point is 2.33 standard deviations below the mean. So: .05 – (2.33)*(1.12) = -2.56%.
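The three parts of Question 6 can also be checked without a printed table. The sketch below uses `statistics.NormalDist` with the mean and standard deviation given in the question; the variable names are illustrative. Because it does not round z to two decimals, part (a) comes out slightly higher than the table-based .5160.

```python
from statistics import NormalDist

# Daily returns in percent: mean 0.05, standard deviation 1.12 (from the question).
returns = NormalDist(mu=0.05, sigma=1.12)

p_positive = 1 - returns.cdf(0)     # 6a: P(return > 0%)
p_gain_10 = 1 - returns.cdf(10)     # 6b: P(return > 10%)
worst_1pct = returns.inv_cdf(0.01)  # 6c: cutoff below which the worst 1% of days fall

print(f"6a: {p_positive:.4f}")
print(f"6b: {p_gain_10:.2e}")
print(f"6c: {worst_1pct:.2f}%")
```

The answers agree with the key to within table rounding: a bit under 52% for (a), essentially zero for (b), and a cutoff of about -2.56% for (c).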
7a) P(all numbers correct) = P(1st correct AND 2nd AND 3rd) = $\frac{3}{10} \cdot \frac{2}{9} \cdot \frac{1}{8} = 1/120$
OR: There are $C^{10}_{3} = \frac{10!}{3!\,7!} = 120$ possible tickets, so the probability of winning is 1/120.
7b)
Net winnings | Probability
     99      | 1/120
     -1      | 119/120
E(X) = (99)(1/120) + (-1)(119/120) = -.167
V(X) = [(99)^2(1/120) + (-1)^2(119/120)] - (-.167)^2 = 82.64
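The lottery arithmetic in Question 7 can be reproduced exactly with fractions. This is a minimal sketch using the standard-library `fractions` and `math` modules; the dictionary `outcomes` is just an illustrative way to hold the net-winnings table.

```python
import math
from fractions import Fraction

# 7a: one winning ticket out of all ways to choose 3 numbers from 10.
n_tickets = math.comb(10, 3)
p_win = Fraction(1, n_tickets)

# 7b: net winnings and their probabilities.
outcomes = {99: p_win, -1: 1 - p_win}

expected = sum(x * p for x, p in outcomes.items())
variance = sum(x**2 * p for x, p in outcomes.items()) - expected**2

print(n_tickets, p_win)
print(expected, float(expected))
print(variance, float(variance))
```

The exact expected value is -1/6 of a dollar per ticket, and the exact variance is 2975/36, which rounds to the 82.64 shown in the key.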
8) First compute the covariance.
First method:
X     Y     X-Xbar   Y-Ybar   product
0     34      -4      -40       160
8     98       4       24        96
6     98       2       24        48
2     66      -2       -8        16
Total:                          320
Covariance = $\frac{320}{3}$ = 106.67
Second method:
X     Y     X*Y
0     34      0
8     98    784
6     98    588
2     66    132
Totals: 16   296   1504
Covariance = $\frac{1504 - \frac{1}{4}(16)(296)}{3}$ = 106.67
Then: slope = rise/run = Covar/Var(X) = 106.67/13.33 = 8. For each additional hour of
sleep, grade increases 8 points, on average.
To get the intercept, plug in the average X and Y values: Y = mX + b → 74 = (8)*(4) + b
→ b = 42. Folk who get 0 sleep score a 42, on average.
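The slope and intercept in answer 8 can be reproduced step by step. The sketch below follows the same route as the key (covariance divided by the variance of X, then the line through the point of means); the variable names are illustrative.

```python
sleep = [0, 8, 6, 2]       # hours of sleep (X)
score = [34, 98, 98, 66]   # quiz score (Y)

n = len(sleep)
x_bar = sum(sleep) / n
y_bar = sum(score) / n

# Sample covariance and sample variance of X, both with divisor n - 1.
covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(sleep, score)) / (n - 1)
variance_x = sum((x - x_bar) ** 2 for x in sleep) / (n - 1)

slope = covariance / variance_x
intercept = y_bar - slope * x_bar   # the fitted line passes through (x_bar, y_bar)

print(round(covariance, 2), round(variance_x, 2), slope, intercept)
```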
9) regression to the mean
10) buying stock in Repos-R-Us … correlation with SCC of -.5.
NOTE: Since Var(X+Y) = Var(X) + Var(Y) + 2*Cov(X,Y), we can reduce the overall
variance the most by having the second stock have a negative covariance with the first.
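To see the effect of the covariance term, the sketch below compares the four choices under a simplifying assumption that is not in the question: every stock has the same variance of 1, and Clorinda adds one unit of the new stock to one unit of SCC. Buying more SCC is treated as correlation 1. The function name `combined_variance` is hypothetical.

```python
def combined_variance(var_x: float, var_y: float, correlation: float) -> float:
    """Var(X + Y) = Var(X) + Var(Y) + 2*Cov(X, Y), with Cov = correlation * sd_x * sd_y."""
    covariance = correlation * (var_x ** 0.5) * (var_y ** 0.5)
    return var_x + var_y + 2 * covariance

# Correlations with SCC from the question; equal variances of 1 are an assumption.
choices = {
    "more SCC": 1.0,
    "Antares Cybernetics": 0.9,
    "Amalgamated Fratostat": 0.2,
    "Repos-R-Us": -0.5,
}

for name, rho in choices.items():
    print(f"{name}: {combined_variance(1.0, 1.0, rho):.1f}")

lowest_risk = min(choices, key=lambda name: combined_variance(1.0, 1.0, choices[name]))
print("lowest combined variance:", lowest_risk)
```

Under these assumptions the combined variance falls from 4.0 (more SCC) to 1.0 (Repos-R-Us), so the negatively correlated stock reduces risk the most.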
11) No. Correlation does not imply causality. All we’ve shown is that the two have tended to
happen in tandem.
12) A “filling up” process. After a point, diminishing returns set in and the rate of change
declines. The product life cycle is the most common business example – after a while,
everyone who plans to buy a DVD player has done so and sales level off.