MATH 3090 – Spring 2014 – Test 3 Version A
Multiple Choice: (Questions 1 – 20) Answer the following questions on the scantron provided. Give the
response that best answers the question. Each multiple choice correct response is worth 3 points. Please
use a #2 pencil. For your record, also circle your choice on your exam since the scantron will not be
returned to you. Only the responses recorded on your scantron sheet will be graded.
1.
A statewide survey concerning the 2015 race for governor was conducted. It was found that among 1200
registered voters, 640 would vote for the Democratic candidate. Does this sample provide evidence that a
majority of the population would vote for the Democratic candidate? The sample produces the following test
statistic:
A. t = 2.31
B. z = 2.31
C. z = 0.53
D. z = 0.47
2.
Consider the following hypothesis test output provided by DDXL testing about the variability of the weights of
men living in a certain seaside town based on a sample of 7 individuals (assuming their actual weights are
normally distributed):
According to the above output, which of the following is a true statement: (α = 0.05)
A. Conclude we have sufficient evidence to claim the standard deviation of weights is below 28 pounds.
B. Reject the null hypothesis.
C. A decision about the null hypothesis cannot be determined from the given information.
D. We fail to reject the null hypothesis in favor of the alternative.
3.
True or false: When our test statistic does not fall into the rejection region for a two-tail hypothesis test, we can
accept the null hypothesis as true (population parameter was proven to be on-target).
A. False
B. True
C. Inconclusive
D. Not enough information
4.
Which of the following are required conditions of a valid test for non-zero slope in a simple linear regression
analysis?
A. For a given value of x, the response variable y is normally distributed with a mean of 0 and a constant
variance σ².
B. For a given value of x, the random error ε is normally distributed with a mean of 0 and a constant
variance σ².
C. For a given value of x, the response variable y is normally distributed with a mean of 0 and a variance of 1.
D. For a given value of x, the random error ε is normally distributed with a mean of 0 and a variance of 1.
5.
A quality manager recently tested 40 tires to see if the process average tread depth has deviated from the target
of 9mm (9 millimeters) – in either direction. A computerized caliper measures the depths, transmits the data to
a computer, and gives a read-out. Use a significance level of 5%. The p-value on the screen is 0.15, with no
explanation. What should she report to her supervisor?
A. Because 0.15 is greater than 0.05, there’s no evidence to indicate that our tread-depth population mean is
off-target.
B. Because 0.15 is greater than 0.05, our test statistic obviously falls in the rejection region, so we need to
take action.
C. Because 0.15 is greater than 0.05, there is evidence to indicate that our tread depth population mean is
off-target.
D. We are obviously right-on-target today with tread depth.
6.
Scotty is certain that at least half of the TD’s waitresses are in love with him. Sedat does not believe him and
collects a random sample of 35 waitresses and counts the number of girls who say they are in love with him. 20
girls say they love Scotty, and 15 say that they do not. We want to test whether Scotty’s statement is true at the
.01 significance level. What is the target parameter for this hypothesis test?
A. Population Mean
B. Population Proportion
C. Population Median
D. Population Variance
7.
A local restaurant claims the standard deviation of waiting times on a Thursday evening from ordering your food
to being served that food is 15.4 minutes. After observing the waiting times of 23 randomly selected individuals, you
calculate a standard deviation of 17.1 minutes. In testing a hypothesis to check if the true standard deviation of
waiting times is greater than 15.4, which of the following represents the appropriate test statistic?
A. $\dfrac{22(17.1^2)}{15.4^2}$
B. $\dfrac{23(17.1)}{15.4}$
C. $\dfrac{22(15.4^2)}{17.1^2}$
D. $\dfrac{22(17.1)}{15.4}$
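As a rough numerical check, the sketch below evaluates the usual chi-square statistic for a test about a standard deviation, (n − 1)s²/σ₀², using the values given in the problem. The variable names are illustrative assumptions and do not come from the exam.

```python
# Chi-square statistic for a test about a population standard deviation:
# (n - 1) * s^2 / sigma0^2, with n - 1 degrees of freedom.
n = 23          # number of waiting times observed
s = 17.1        # sample standard deviation (minutes)
sigma0 = 15.4   # standard deviation claimed by the restaurant (minutes)

df = n - 1
chi_sq = df * s**2 / sigma0**2
print(f"df = {df}, chi-square = {chi_sq:.3f}")
```

Note that the statistic uses the squared standard deviations (variances) and n − 1 rather than n.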
8.
Consider the following conditions:
I. A random sample is selected from the target population.
II. The population from which the sample is selected has an exponential distribution.
III. The sample mean, X̄, is normally distributed.
Which of these conditions is required for a valid hypothesis test for the population mean, µ?
A. I and II
B. I and III
C. I, II, and III
D. none of the above
9.
A study was done to predict ice cream sales (in dollars) from temperature (in ℃). Using the following DDXL
output, determine the predicted ice cream sales on a day when the temperature is 24℃.
A. $881.58
B. $595.73
C. $544.58
D. $562.64
10.
To determine how the mileage of a used car is related to its selling price, a used car dealer took a random sample
of 100 used cars of the same make and model sold by dealerships during the past month. Each car was 3 years
old, in good condition, and equipped with all the features that come standard with this car. The dealer recorded
the price (in $1,000) and number of miles (thousands) on the odometer. DDXL output from the regression
analysis is below.
Which of the following gives the equation of the least squares regression line?
A. The regression equation is ŷ = 17.142 − 0.065x where x is selling price and y is the odometer reading.
B. The regression equation is ŷ = −0.065 + 17.142x where x is selling price and y is the odometer reading.
C. The regression equation is ŷ = 17.142 − 0.065x where x is the odometer reading and y is the selling price.
D. The regression equation is ŷ = −0.065 + 17.142x where x is the odometer reading and y is the selling price.
11.
Big Bob claims to be an amazing golfer. He claims that the average number of strokes for him to get on the
green is less than 1.8. However, his friends are doubtful. Help them test Big Bob’s claim by setting up the null
and alternative hypotheses for a hypothesis test.
A. H0: µ = 1.8 vs. Ha: µ < 1.8
B. H0: µ < 1.8 vs. Ha: µ > 1.8
C. H0: µ = 1.8 vs. Ha: µ ≠ 1.8
D. H0: p = 1.8 vs. Ha: p < 1.8
12.
In a two-tailed test of hypothesis about the population proportion, a random sample yields a test statistic of
1.546. At the 10% level of significance, make a conclusion for this test.
A. The test statistic falls into the rejection region and therefore I reject H0 and accept Ha.
B. The test statistic is smaller than the critical value and therefore I reject H0.
C. The test statistic does not fall into the rejection region and therefore I do not reject H0.
D. The test statistic is bigger than the critical value and therefore I do not reject H0.
13.
An Electrical Engineering student has developed a new chip. She claimed that it can run on average for 300
minutes without a built-in fan. Suppose a random sample of 25 chips is tested. The chips run for an average of
295 minutes, with a standard deviation of 20 minutes. The run times for the population of chips are normally
distributed. If we want to test the claim of the student using this sample, what is the value of the test
statistic?
A. -0.25
B. 1.25
C. -1.25
D. 2.25
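The arithmetic behind a one-sample t statistic, (x̄ − µ₀)/(s/√n), can be sketched as follows with the numbers from the chip problem. This is only an illustration; the variable names are assumptions introduced here.

```python
import math

# One-sample t statistic: (sample mean - hypothesized mean) / (s / sqrt(n)).
n = 25        # number of chips tested
x_bar = 295   # sample mean run time (minutes)
mu0 = 300     # mean run time claimed by the student (minutes)
s = 20        # sample standard deviation (minutes)

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu0) / standard_error
print(f"t = {t_stat:.2f} with {n - 1} degrees of freedom")
```

A t statistic (rather than z) is used because the population standard deviation is unknown and is estimated by s.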
14.
An inspector inspects large truckloads of potatoes to determine the proportion of potatoes that have major
defects and cannot be used for making potato chips. Unless there is evidence that this proportion is less than
10%, she will reject the shipment. She selects a random sample of 200 potatoes from more than 3000 potatoes on
the truck and finds that only 12 potatoes have major defects. Which of the following assumptions for inference
about a population proportion using a hypothesis test is violated?
A. A random sample is selected from the population of interest.
B. np0 > 15.
C. There appears to be no violation.
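A quick way to check the sample-size condition is to compute the expected counts under the hypothesized proportion. The sketch below does this for the potato problem; checking n(1 − p0) alongside np0 is an assumption based on the common textbook form of the condition, since the exam text lists only np0 > 15.

```python
# Expected counts under the null hypothesis p0 = 0.10.
n = 200          # potatoes sampled
p0 = 0.10        # hypothesized proportion with major defects
defective = 12   # defective potatoes found in the sample

expected_defective = n * p0        # expected number with defects if p = p0
expected_sound = n * (1 - p0)      # expected number without defects if p = p0
p_hat = defective / n              # observed sample proportion

# Assumed form of the condition: both expected counts exceed 15.
large_enough = expected_defective > 15 and expected_sound > 15
print(expected_defective, expected_sound, p_hat, large_enough)
```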
15.
From the output above does it appear that this model is statistically useful?
A. Yes – because the p-value for the hypothesis test for slope is ≤ 0.0001
B. No – because the p-value for the hypothesis test for slope is 0.4289
C. No – because the coefficient for Salary is negative
D. No – because there are 65520 cases missing
E. Yes – because r-squared is small
16.
An eye glass manufacturing company designed a glass that weighs 1 oz. Though actual weights of glasses made
in the company vary from the designed weight, a standard deviation larger than 2 ozs is considered unacceptable.
Hence, the quality control unit of the company needs to sample the production line periodically and test the
variation in the weights of the eye glasses produced. To run such a test, which of the following conditions is not
required?
A. The samples taken are random.
B. The weights of eye glasses are approximately normally distributed.
C. The sample sizes must be large.
D. All of them are required.
17.
The table below gives the DDXL output from a simple linear regression analysis relating the outside
temperature and the weekly energy consumption of homes in a large city for a random sample of 10 homes.
Which of the following gives the best interpretation of the coefficient of determination?
A. About 97.6% of the variation in the sample of energy consumption can be explained by using the
outside temperature in our model.
B. About 95% of the observed values of energy consumption are expected to fall within .2734 of the
predicted values of energy consumption when using this regression model.
C. About 97.9% of the variation in the sample of energy consumption can be explained by using the
outside temperature in our model.
D. Energy consumption is expected to decrease by 0.119 for each degree of temperature increase.
18.
The human resources manager of a telemarketing firm is concerned by the fact that many of the firm’s
telemarketers do not work very long before quitting. The manager suspects age as one of the reasons for this
high turnover, and performed a regression analysis using the work history for a random sample of workers that
quit in the last year. His estimated regression equation is ŷ = 53.18 − 0.95x, where x is the age of each worker
when originally hired and y is the number of weeks on the job before quitting. The estimated standard error of
this regression model is s = 3.67. Which of the following statements gives the best interpretation of the value of
s?
A. For a given age, we expect most all of the telemarketers’ employment periods to be within 3.67 weeks of
the value predicted by the least squares regression equation.
B. For a given age, we expect most of the telemarketers’ employment periods to be within 2(3.67) = 7.34
weeks of the value predicted by the least squares regression equation.
C. For a given length of employment, we expect most of the telemarketers’ ages to be within 3.67 weeks of
the value predicted by the least squares regression equation.
D. For a given length of employment, we expect most of the telemarketers’ ages to be within 2(3.67) = 7.34
weeks of the value predicted by the least squares regression equation.
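To make the ± 2s reading of the standard error concrete, the sketch below computes the predicted value and a rough band of two standard errors around it. The hiring age of 30 is a hypothetical value chosen only for illustration; it does not appear in the problem.

```python
def predicted_weeks(age):
    """Estimated regression equation from the problem: y-hat = 53.18 - 0.95x."""
    return 53.18 - 0.95 * age

s = 3.67   # estimated standard error of the regression model (weeks)
age = 30   # hypothetical hiring age, assumed for illustration only

y_hat = predicted_weeks(age)
lower, upper = y_hat - 2 * s, y_hat + 2 * s
print(f"predicted weeks = {y_hat:.2f}, rough band = ({lower:.2f}, {upper:.2f})")
```

The band is expressed in weeks on the job, the units of the response variable y, and it applies for a given age x.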
19.
In a simple linear regression analysis in order to determine if a model is practically useful it is ideal to have…
A. A large value of r² and a large value of s
B. A small value of r² and a large value of s
C. A large value of r² and a small value of s
D. A small value of r² and a small value of s
20.
Jim and Carol got a p-value of 0.027 for a two-tailed hypothesis test for the mean at the 0.05 level of significance. Jim
thought they should reject H0 while Carol thought they should not reject H0. Who do you think is right?
A. Jim
B. Carol
C. Neither of them
D. Both of them
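The p-value decision rule (reject H0 when the p-value is below the significance level) can be written as a small helper. The function name `decide` is an assumption made for this sketch.

```python
def decide(p_value, alpha):
    """Return the test decision under the rule: reject H0 when p-value < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Values from the problem: a two-tailed p-value of 0.027 at the 0.05 level.
decision = decide(0.027, 0.05)
print(decision)
```

A reported two-tailed p-value is compared with the significance level directly, with no further halving or doubling.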
Free Response: The Free Response questions will count 39% of your total grade. Read each question
carefully. In order to receive full credit you must show legible and logical (relevant) justification which
supports your final answers. You MUST show your work. Answers with no justification will receive no
credit.
1. (15 pts) An automotive part must be machined to close tolerances to be acceptable to customers. Production
specifications call for a maximum variance in the lengths of the parts of .0004 cm. Suppose the sample variance
for 30 randomly selected parts turns out to be .0005. Assume the lengths of the automotive parts are normally
distributed. At the 5% significance level, is there sufficient evidence that the population variance specification of
the automotive part is being violated? Use the rejection region approach. [Must show all appropriate steps for
the hypothesis test to earn full credit, must show work].
H0: σ² = 0.0004
HA: σ² > 0.0004
Assumptions: 1. Random Sample from the Population – stated in problem
2. Population Normally Distributed – stated in problem
$\chi^2 = \dfrac{(n-1)s^2}{\sigma^2} = \dfrac{(29)(0.0005)}{0.0004} = 36.25$
Rejection Region:
Conclusion:
At the 5% significance level, my test statistic does not fall into the rejection region; therefore I do not reject my null
hypothesis. There is insufficient evidence to suggest the population variance specification of the automotive part is
greater than 0.0004 cm.
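The rejection-region comparison can be sketched numerically as follows. The critical value 42.557 is the upper-tail chi-square table value for α = 0.05 with 29 degrees of freedom; it is hard-coded here as an assumption of this sketch rather than computed, and the variable names are illustrative.

```python
# Upper-tail chi-square test for a variance, rejection region approach.
n = 30               # number of parts sampled
s_sq = 0.0005        # sample variance
sigma0_sq = 0.0004   # variance under the null hypothesis

chi_sq = (n - 1) * s_sq / sigma0_sq

# Table value for alpha = 0.05, df = 29 (hard-coded, not computed here).
critical_value = 42.557
in_rejection_region = chi_sq > critical_value
print(f"chi-square = {chi_sq:.2f}, reject H0: {in_rejection_region}")
```

Because the alternative is σ² > 0.0004, only large values of the statistic count as evidence against H0, so the rejection region lies entirely in the upper tail.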
Hypotheses: symbol (1); signs (1); value (1)
Assumptions: (1) each (do not have to check)
Testing: chi-square label(1); value (1); critical value (1);
Summary: alpha (1), TS not in RR (1), therefore dnr null (1); population (1); variance (1); insufficient evidence
(1); context (1)
- Must be consistent with response
- Do not take off twice for same mistake
- If do p-value approach (minus 2 for critical value and RR in conclusion)
2. Suppose a new production method will be implemented if a hypothesis test supports the conclusion that the new
method reduces the mean operating cost per hour.
a. (2 pts) State the appropriate null and alternative hypotheses if the mean cost for the current production
method is $220 per hour.
H0: µ = $220
HA: µ < $220
b. (2 pts) Describe a Type I error in this situation.
Conclude population mean is less than $220 when it is actually $220.
c. (2 pts) What are the consequences (in context of this problem) of making a Type I error?
Conclude new production method reduces operating costs when it does not – spend time/money on new
method that doesn’t change anything.
d. (2 pts) Describe a Type II error in this situation.
Conclude population mean is $220 when it is actually less
e. (2 pts) What are the consequences (in context of this problem) of making a Type II error?
Not implementing new method that would reduce operating costs
Essentially right or wrong
Can find parts b in part a and vice versa
If reverses Type I and Type II error but gets everything else correct then – 4
If reverses Type I and Type II error but is not consistent then potentially - 8
3. The number of megapixels in a digital camera is one of the most important factors in determining picture quality.
But, do digital cameras with more megapixels cost more? The following data show the number of megapixels
and the price ($) for 10 digital cameras (Consumer Reports, March 2009).
x:    8   10    7    8   15    8   10   12   10    7
y:  180  200  230  120  470  140  180  310  250  110
Below is the DDXL output for the Simple Linear Regression Analysis for this problem.
a. (2 pts) In this problem, which variable is the explanatory variable and which is the response variable?
Y = price (response)
X = # of megapixels (explanatory)
b. (2 pts) Give the correct equation for the simple linear regression line from the DDXL output.
ŷ = −145.027 + 38.3186x
-1 if forget hat on y
c. (2 pts) What is your interpretation of the value of slope for this analysis (in context of this problem)?
For every additional megapixel, the price increases on average by $38.32.
Can leave out “on average”
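The coefficients read from the DDXL output can be cross-checked by fitting the least-squares line directly to the ten data pairs. The sketch below uses only the standard formulas slope = Sxy/Sxx and intercept = ȳ − slope·x̄; the variable names are assumptions made for illustration.

```python
# Least-squares fit computed directly from the ten (megapixels, price) pairs.
megapixels = [8, 10, 7, 8, 15, 8, 10, 12, 10, 7]
price = [180, 200, 230, 120, 470, 140, 180, 310, 250, 110]

n = len(megapixels)
x_bar = sum(megapixels) / n
y_bar = sum(price) / n

# Sum of squares of x and sum of cross-products, both about the means.
s_xx = sum((x - x_bar) ** 2 for x in megapixels)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(megapixels, price))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
print(f"y-hat = {intercept:.3f} + {slope:.4f}x")
```

The fitted slope and intercept should agree with the equation given in part b, up to rounding.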
d. (3 pts) Give the value of the correlation coefficient (r) and interpret this value in the context of the
problem.
r = 0.8944 (0.5 pts)
There is a strong positive linear relationship between # of megapixels and price of camera.
0.5 each (strong, positive, linear, context)
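The correlation coefficient can also be computed from the raw data as r = Sxy/√(Sxx·Syy). The sketch below is a standalone check with illustrative variable names; it gives a value of about 0.894, in line with the reported r.

```python
import math

# Pearson correlation between megapixels and price, from the ten data pairs.
megapixels = [8, 10, 7, 8, 15, 8, 10, 12, 10, 7]
price = [180, 200, 230, 120, 470, 140, 180, 310, 250, 110]

n = len(megapixels)
x_bar = sum(megapixels) / n
y_bar = sum(price) / n

s_xx = sum((x - x_bar) ** 2 for x in megapixels)
s_yy = sum((y - y_bar) ** 2 for y in price)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(megapixels, price))

r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2
print(f"r = {r:.3f}, r-squared = {r_squared:.3f}")
```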
e. (3 pts) Give the p-value and conclusions for a two-sided test for slope for this analysis.
p-value = 0.0005
Since p-value is small (less than any reasonable significance level) we conclude the population slope
is different from 0.
1 point for p-value
2 points for conclusion (-0.5 for not using population)
f. (2 pts) The model given above was used for prediction where x = 20. A 99% confidence interval
created around the prediction value was found to be (376.71, 376.711). Is it appropriate to use the
model in this way? Explain why or why not.
No – x (megapixels) = 20 is outside the scope of the x values used to create this equation. This is
known as extrapolation and should not be done. Only values close to the sample of x values used
should be used for prediction, values outside the scope of this problem could have a different
representative equation.
Right or wrong – should mention extrapolation either by name or concept – do not have to go into
much detail about why extrapolation is wrong. If the student argues that x = 20 is a reasonable value since x = 15
is in the scope of the problem, possibly consider correct depending on the rest of the wording.
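Whether a prediction is an extrapolation can be checked by comparing the new x value with the range of x values in the sample. This is a minimal sketch with assumed variable names.

```python
# Compare the requested x value with the range of megapixels in the sample.
megapixels = [8, 10, 7, 8, 15, 8, 10, 12, 10, 7]
x_new = 20  # value at which the model was used for prediction

x_min, x_max = min(megapixels), max(megapixels)
is_extrapolation = not (x_min <= x_new <= x_max)
print(f"sample range: [{x_min}, {x_max}], extrapolation: {is_extrapolation}")
```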
Did you correctly fill in your scantron? (1pt)
• Are your bubbles dark enough?
• Did you use your XID with the C bubbled as a 0?
• Is your section number listed?
• Is your instructor filled out?
• Is your test Version bubbled in correctly?