Final Exam Review: Statistics

MTH 153 Final Exam Review
20 points
Semester grade is calculated by: _____ (1st Sem.) + _____ (2nd Sem.) + _____ (Final) = Sem. grade
You need to know the following for the written portion of the final exam:
A. Normal Curve and Applications using the normalcdf and invNorm functions (or charts)
B. Central Limit Theorem: state what it means and use it in an application, including the creation of
confidence intervals.
C. Hypothesis Testing and Confidence Intervals: all of the types we have studied this semester,
especially the types on this review.
YOU MAY WANT TO USE A SEPARATE SHEET OF PAPER TO COMPLETE YOUR ANSWERS.
A. Normal Curve
Given that the resting heart rates of the population of high school seniors are normally distributed with
a mean of 80 bpm and a standard deviation of 10 bpm (except during finals week!) :)
Answer the following:
1. Draw and label the normal curve that corresponds to the above population parameters:
2. If a student is chosen at random from this population:
(use the curve and/or normalcdf and invNorm)
a. P(the student has a heart rate greater than 90 bpm): ________
b. P(the student has a heart rate between 65 bpm and 95 bpm): ______
c. P(the student has a heart rate less than 60 bpm): ______
d. If a fitness instructor wants to identify the lowest 20% of the population’s heart rates, what heart rate
would this cutoff correspond to?
e. What interval contains the middle 50% of all students’
heart rates? ___________
f. If you select 5 students at random from this population, what is the probability that all 5 of
them have heart rates greater than 90 bpm?
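The normalcdf and invNorm steps for parts a through f can be sketched in Python. This is a minimal sketch, not the calculator procedure taught in class: it assumes the standard-library `statistics.NormalDist` as a stand-in for the calculator functions, and part f assumes the five students are chosen independently. The variable names are illustrative.

```python
from statistics import NormalDist

# Population model from the problem: mean 80 bpm, standard deviation 10 bpm.
heart_rate = NormalDist(mu=80, sigma=10)

# Parts a-c: areas under the curve (the normalcdf step).
p_over_90 = 1 - heart_rate.cdf(90)
p_65_to_95 = heart_rate.cdf(95) - heart_rate.cdf(65)
p_under_60 = heart_rate.cdf(60)

# Part d: heart rate that cuts off the lowest 20% (the invNorm step).
lowest_20_cutoff = heart_rate.inv_cdf(0.20)

# Part e: the middle 50% lies between the 25th and 75th percentiles.
middle_50 = (heart_rate.inv_cdf(0.25), heart_rate.inv_cdf(0.75))

# Part f: all five independent draws exceed 90 bpm (assumes independence).
p_all_five_over_90 = p_over_90 ** 5

print(round(p_over_90, 4), round(p_65_to_95, 4), round(p_under_60, 4))
print(round(lowest_20_cutoff, 2), [round(x, 2) for x in middle_50])
print(p_all_five_over_90)
```

The results should agree with a z-table to within rounding, which makes this a convenient check on hand-drawn curves.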
B. CLT and Confidence Intervals:
1. State the central limit theorem and interpret what it means:
2. If the Idaho SAT math mean score in 2015 was 515 and the standard deviation was 115
for all students:
a. What is the probability that a sample of 30 would have a mean of 560 or higher? ________
b. Would this be considered a rare event? (Explain)
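One way to set up the SAT question with the central limit theorem is sketched here. It rests on two assumptions that are not stated in the worksheet: "a mean of 560" is read as "a mean of 560 or more" (the probability of any exact value is zero for a continuous model), and a sample of 30 is treated as large enough for the sample mean to be approximately normal.

```python
from math import sqrt
from statistics import NormalDist

# Population values from the problem: Idaho SAT math, 2015.
mu, sigma, n = 515, 115, 30

# Central limit theorem: the sample mean is roughly normal with
# mean mu and standard deviation sigma / sqrt(n) (the standard error).
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu, standard_error)

# Assumed reading of the question: P(sample mean >= 560).
p_mean_at_least_560 = 1 - sampling_dist.cdf(560)
z_score = (560 - mu) / standard_error

print(round(standard_error, 2), round(z_score, 2), round(p_mean_at_least_560, 4))
```

A common classroom convention calls an outcome rare when its probability is below 5%. Whether that threshold applies here is a judgment call to explain in part b.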
PSAT questions:
c. The following is a random sample of RHS PSAT math scores:
451 677 531 503 487 584 690 570 465 486 551 560 512 621 554 589 308 554 585
589 537 495 557 518 509 551 521 613 518 583
1. Give the point estimates for the mean and standard deviation of the data set.
a. Mean: _________   Standard deviation: ______
b. Give a 95% CI for the RHS population mean, μ:
_______ < μ < _______
2. a. Using the sample, give a point estimate 𝑝̂ for the percentage (proportion) of scores
that were over 580: _______
(There were 325 test takers at RHS)
b. Give a 95% CI for the proportion over 580:
_______ < p < _______
c. What sample size (if it were possible) would I need for a 95% CI with a 5%
margin of error for the proportion of scores over 585? _________
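The PSAT calculations can be sketched as follows, using only the 30 scores listed above. Several choices here are assumptions, not part of the worksheet: the t critical value 2.045 (df = 29) is copied from a standard t table, the proportion parts all use the 580 cutoff from part 2a, the sample-size formula n = z² p(1 − p) / E² ignores the finite population of 325 test takers, and all variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, stdev

scores = [451, 677, 531, 503, 487, 584, 690, 570, 465, 486,
          551, 560, 512, 621, 554, 589, 308, 554, 585, 589,
          537, 495, 557, 518, 509, 551, 521, 613, 518, 583]
n = len(scores)

# Question 1: point estimates and a 95% t-interval for the mean.
sample_mean = mean(scores)
sample_sd = stdev(scores)          # sample standard deviation (n - 1)
t_star = 2.045                     # assumed: t table value, df = 29, 95%
mean_margin = t_star * sample_sd / sqrt(n)
mean_ci = (sample_mean - mean_margin, sample_mean + mean_margin)

# Question 2a-b: sample proportion over 580 and a 95% z-interval.
p_hat = sum(score > 580 for score in scores) / n
z_star = NormalDist().inv_cdf(0.975)
prop_margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
prop_ci = (p_hat - prop_margin, p_hat + prop_margin)

# Question 2c: sample size for a 5% margin of error at 95% confidence.
target_margin = 0.05
n_needed = ceil(z_star ** 2 * p_hat * (1 - p_hat) / target_margin ** 2)
n_conservative = ceil(z_star ** 2 * 0.25 / target_margin ** 2)  # p = 0.5

print(sample_mean, round(sample_sd, 2), [round(x, 1) for x in mean_ci])
print(p_hat, [round(x, 3) for x in prop_ci], n_needed, n_conservative)
```

Two sample sizes are computed because textbooks differ: one plugs in the sample proportion, the other uses the conservative p = 0.5 when no prior estimate is trusted.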
C. Hypothesis Testing
1. Be prepared to state the null hypothesis, H0, and the alternative hypothesis, Ha, that would be
used for a hypothesis test related to each of the following statements, and identify the type of test
that would be used for each: (these are in random order relative to how we learned them)
a. The mean age of the students enrolled in evening classes at a certain college is greater than
26.
b. A blood donation facility knows the population proportion of each of the blood types for
Native Americans. A group of 60 volunteers recruited from a Native American convention is
recruited to donate blood and the sample distribution of their blood types is compared to the
nationally known proportions.
c. There is little to no similarity in comparing 6 treatments for senioritis to one another.
d. The mean weight of packages shipped on Air Europe during the past month was less than
36.7 lb; the weights for the new month are greater, and the distribution of weights for these
packages is known to be normally distributed.
e. A taste test between “Cheese Its” and “Cheese Nips” reveals that “Cheese Its” are
significantly better in taste than “Cheese Nips” using a rating scale.
f. The mean life of fluorescent light bulbs is at least 1600 hr. The new improved light bulbs have a
life span that is greater.
g. There is a strong association between the gross revenue that a movie brings in and the weeks
it is released for, the number of theaters it opens at, and the amount spent on the making of
the movie.
2. You are given pre-season and post-season weights for a sample of players from a football team. We
think that there is going to be a decrease in the weights due to working out daily.
a. State the null hypothesis and the alternative hypothesis.
b. Find the mean, standard deviation, and sample size (n) for both sets of data.
c. Use the stats from b to find the t-score and p-value.
d. Use the t-score and p-value to state your conclusion.
Pre-season   Post-season   Difference
295          265           ______
279          266           ______
250          245           ______
235          240           ______
255          230           ______
290          230           ______
310          235           ______
260          235           ______
275          250           ______
280          265           ______
Null Hypothesis: ___________
Alternative Hypothesis: __________
Mean X: ________ n: _______
Mean Y: ________ n:________
Mean difference (d bar):
Standard Deviation of x: _______
Standard Deviation of y: ________
T-score: ________
P-Value: ________
Discuss conditions being met or not for this test:
Conclusion Statements regarding the null hypothesis and conditions:
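A paired (matched-pairs) t-test is one reasonable reading of this problem, and the sketch below follows it. It assumes that the pre-season and post-season values are listed in the same player order, so the k-th pre-season weight pairs with the k-th post-season weight. The worksheet does not state that pairing explicitly. The p-value helper integrates the t density numerically with Simpson's rule and is a hypothetical helper, not a library function.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

pre = [295, 279, 250, 235, 255, 290, 310, 260, 275, 280]
post = [265, 266, 245, 240, 230, 230, 235, 235, 250, 265]

# Assumed pairing: same player order in both lists.
diffs = [before - after for before, after in zip(pre, post)]
n = len(diffs)
d_bar = mean(diffs)               # mean difference (d bar)
s_d = stdev(diffs)                # sample standard deviation of differences
t_stat = d_bar / (s_d / sqrt(n))  # tests H0: mean difference = 0
df = n - 1


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps is even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# One-sided p-value for Ha: weights decreased (mean of pre - post > 0).
p_value = t_upper_tail(t_stat, df)

print(diffs)
print(d_bar, round(s_d, 2), round(t_stat, 3), round(p_value, 4))
```

The same t-score and p-value should come out of a calculator's T-Test run on the list of differences.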
Hypothesis Testing, continued
3. A study was done to determine whether or not the crime rate in a large city is greater
during an essentially full moon compared to nights without a full moon.
The following statistics were given from a random sample for the rates of violent crimes
during full moons in a three four-year span and the rates of violent crimes not during a
full moon for the same span: Data is in mean crime rates/day. (assume same cities were
used both times)
(use a 95% confidence level)
                     Full Moon    Not Full Moon
Mean                 X = 16.1     Y = 15.2
Standard deviation   SX = 2       SY = 1.5
Sample size          n = 36       n = 62
Null Hypothesis: ___________
Alternative Hypothesis: __________
Discuss conditions being met or not for this test:
t-score: ________
P-Value: ________
Conclusions:
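The t-score for the full-moon comparison can be computed directly from the summary statistics. This sketch assumes a two-sample t-test with unpooled variances (Welch's form, which is what most calculators run by default), and it reads the second standard deviation, printed as "S X = 1.5", as belonging to the Not Full Moon group.

```python
from math import sqrt

# Summary statistics from the problem (mean crime rates per day).
x_bar, s_x, n_x = 16.1, 2.0, 36   # full moon
y_bar, s_y, n_y = 15.2, 1.5, 62   # not full moon (SY assumed to be 1.5)

# Each group's contribution to the variance of the difference in means.
var_x = s_x ** 2 / n_x
var_y = s_y ** 2 / n_y

standard_error = sqrt(var_x + var_y)
t_stat = (x_bar - y_bar) / standard_error

# Welch-Satterthwaite approximation for the degrees of freedom.
df_welch = (var_x + var_y) ** 2 / (
    var_x ** 2 / (n_x - 1) + var_y ** 2 / (n_y - 1)
)

print(round(standard_error, 4), round(t_stat, 3), round(df_welch, 1))
```

The one-sided p-value for "greater during a full moon" then comes from a t table or tcdf with these degrees of freedom.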
Use the spreadsheet for Movie data from 2005 and 2015 for the following hypothesis tests. The
movie data is a list of the top twenty-five U.S. grossing movies from 2005 and 2015:
4. We want to determine whether the count of movies in each genre is statistically the
same in 2015 as it was in 2005 for top movies. Use a two-way table and a chi-squared test (with a
95% confidence level) to test this hypothesis.
          Comedy   Action   Drama   Adventure   Totals
2005      ____     ____     ____    ____        ____
2015      ____     ____     ____    ____        ____
Totals    ____     ____     ____    ____        ____
Null Hypothesis: ___________
Alternative Hypothesis: __________
Discuss conditions being met or not for this test:
𝜒2 -score: ________
P-Value: ________
Conclusions:
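The mechanics of the chi-squared test on a two-way table can be sketched as below. The genre counts live in the class spreadsheet and are not reproduced in this review, so the counts used here are HYPOTHETICAL placeholders chosen only so that each year sums to 25. The function name is illustrative as well.

```python
def chi_square_statistic(table):
    """Return (chi2, df, expected) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count = (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# HYPOTHETICAL counts (Comedy, Action, Drama, Adventure), not the real data.
placeholder_counts = [
    [8, 7, 5, 5],   # 2005
    [6, 9, 4, 6],   # 2015
]
chi2, df, expected = chi_square_statistic(placeholder_counts)
print(round(chi2, 4), df, expected)
```

Printing the expected counts is useful for the conditions discussion, since a common rule of thumb asks for every expected count to be at least 5.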
5. Here are the results of a multi-variable regression analysis on the 25 movies for 2015 that relates
gross sales to cost, score, and run time.
Regression Statistics
Multiple R           0.593697
R Square             0.352476
Adjusted R Square    0.259973
Standard Error       160.4014
Observations         25

ANOVA
              df    SS             MS           F           Significance F
Regression     3    294109.6457    98036.549    3.810409    0.025189
Residual      21    540300.9143    25728.615
Total         24    834410.56

                     Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept            -18.09         294.10           -0.06    0.95      -629.70     593.51      -629.70       593.51
Cost (in millions)     1.60           0.61            2.62    0.02         0.33       2.87         0.33         2.87
Score                 14.84          40.16            0.37    0.72       -68.67      98.35       -68.67        98.35
Run Time (in mins)    -0.20           1.71           -0.12    0.91        -3.77       3.36        -3.77         3.36
a. Using the results of the test, which variable(s) should be eliminated from the analysis
and why?
b. What other factors could impact the gross sales, and what percent of the variation in gross sales
do these other factors account for?
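The printed regression output can be checked with a few lines of arithmetic, which also shows where each number comes from. The sketch reads the two sums of squares, whose digits are split across lines in the output, as 294109.6457 and 540300.9143. That reading is an assumption, supported by the fact that they add up to the printed total of 834410.56. The dictionary keys simply mirror the row labels.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 294109.6457, 3
ss_residual, df_residual = 540300.9143, 21
ss_total = ss_regression + ss_residual

# Mean squares and the overall F statistic.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# R Square, Adjusted R Square, and the share left unexplained.
r_squared = ss_regression / ss_total
n_obs = df_regression + df_residual + 1
adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
unexplained_share = 1 - r_squared

# Each t Stat is the coefficient divided by its standard error.
coefficients = {
    "Intercept": (-18.09, 294.10),
    "Cost (in millions)": (1.60, 0.61),
    "Score": (14.84, 40.16),
    "Run Time (in mins)": (-0.20, 1.71),
}
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

print(round(f_stat, 4), round(r_squared, 6), round(adjusted_r_squared, 6))
print(round(unexplained_share, 4), {k: round(v, 2) for k, v in t_stats.items()})
```

Comparing each predictor's P-value in the table with 0.05 is the usual basis for answering part a, and the unexplained share relates to part b.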
6. Using an ANOVA test on the top 4 movies in each genre from 2005 (i.e., Comedy, Action, Drama, and
Adventure), with a 95% confidence level, determine whether the mean run time differs among the
genres:
(Type the 4 run times for each genre into 4 separate lists, and run the ANOVA):
           Comedy   Action   Drama   Adventure
Movie 1    ____     ____     ____    ____
Movie 2    ____     ____     ____    ____
Movie 3    ____     ____     ____    ____
Movie 4    ____     ____     ____    ____
a. State the Null and Alternative Hypothesis:
b. Show the F-statistic and p-value:
F = _______   p-value: ______
c. Discuss the conditions for ANOVA being met or not:
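The F-statistic that a calculator's ANOVA command reports can be reproduced by hand, as sketched below. The run times in this example are HYPOTHETICAL placeholders, since the real values come from the class spreadsheet, and the function name is illustrative.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of sample lists."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: spread of values around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL run times in minutes, four movies per genre.
placeholder_run_times = [
    [95, 100, 105, 100],    # Comedy
    [120, 125, 130, 125],   # Action
    [110, 115, 120, 115],   # Drama
    [130, 135, 140, 135],   # Adventure
]
f_stat, df_between, df_within = one_way_anova(placeholder_run_times)
print(f_stat, df_between, df_within)
```

A large F means the genre means differ by more than the within-genre spread would suggest. The p-value then comes from the F distribution with these two degrees of freedom.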
7. Using the top 25 2015 movies, calculate the point estimate for the mean run-time, a
90% confidence interval for the mean run-time, and determine the margin of error for
this CI, for all U.S. released movies in 2015, and interpret the result:
Point Estimate for the mean run-time: ______   CI: [ ______ , ______ ]
Margin of Error: _______
What conditions or problems would you see as a statistician in using this Confidence
Interval to estimate the true population mean run-time of all movies released in the U.S.?
8. Given the following statistics for a complete random sample of the voting choice of
male voters and female voters in an Idaho primary election survey, apply a “two proportion
z interval” and a “two proportion z test” with 95% confidence on the difference in the
percentage of males who are in favor of voting for Donald Trump and females in favor of
voting for Donald Trump.
           Yes   Total sampled
Males      105   228
Females     72   240
Z Interval for Difference: [ ______ , ______ ]
Margin of Error: _____
Null Hypothesis: ___________
Alternative Hypothesis: __________
Discuss conditions being met or not for this test:
z-score: ________
P-Value: ________
Conclusions:
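The two-proportion z interval and z test can be sketched from the counts in the table. The sketch assumes the difference is taken as males minus females and that the test is two-sided (H0: the proportions are equal). Following the usual textbook convention, the interval uses the unpooled standard error and the test uses the pooled one.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the survey table.
yes_m, n_m = 105, 228   # males
yes_f, n_f = 72, 240    # females

p_m = yes_m / n_m
p_f = yes_f / n_f
diff = p_m - p_f        # assumed direction: males minus females

# 95% z interval: unpooled standard error.
z_star = NormalDist().inv_cdf(0.975)
se_unpooled = sqrt(p_m * (1 - p_m) / n_m + p_f * (1 - p_f) / n_f)
margin = z_star * se_unpooled
interval = (diff - margin, diff + margin)

# z test of H0: p_m = p_f, using the pooled proportion.
p_pooled = (yes_m + yes_f) / (n_m + n_f)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_f))
z_stat = diff / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))   # two-sided (assumed)

print(round(diff, 4), round(margin, 4), [round(x, 4) for x in interval])
print(round(z_stat, 3), p_value)
```

If the interval excludes 0, the test at the matching significance level will normally reject H0 as well, so the two results give a consistency check on each other.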
Your Name:________________________________
20 points if it changes grade