Take Home Test 1 Key, Spr 2010

Quality Take Home Test 1 – Spr 2010: Name ______________________________
1. The amount of active ingredient in a blood pressure pill is known to be
approximately normally distributed with mean 20 mg. and standard deviation 0.3
mg.
a. Draw a picture of this normal distribution [be sure to label the X-axis].
b. Doctors discover that patients develop a rash if the blood pressure pill has more
than 20.6 mg. of active ingredient. What is the probability that the pill you take
today has more than 20.6 mg. and you develop a rash?
Z = (20.6 – 20)/0.3 = 2.0
P(X > 20.6) = 0.0228
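The tail area in part b can be checked numerically. This is only a sketch using Python's standard-library statistics.NormalDist; the variable names are illustrative and not part of the original key.

```python
from statistics import NormalDist

# Active ingredient per pill in mg (mean and std. dev. from the problem)
pill = NormalDist(mu=20.0, sigma=0.3)

z = (20.6 - pill.mean) / pill.stdev   # standardized rash threshold
p_rash = 1.0 - pill.cdf(20.6)         # upper-tail area beyond 20.6 mg

print(f"Z = {z:.2f}, P(X > 20.6) = {p_rash:.4f}")
```

The printed values should agree with the hand calculation to the precision of a Z table.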
c. If 100 people in Arlington took this pill today, what is the probability that none
develop a rash? How many of these 100 people would you have expected to
develop a rash? Think binomial distribution.
X (# who develop rash) ~ Binomial [n = 100, p = 0.0228]
P(X = 0) = (1 – 0.0228)^100 ≈ 0.0996
Excel: “ =BINOMDIST(0,100,0.0228,FALSE)”
Mean of Binomial = n * p = E(# developing rash) = 100 * 0.0228 = 2.28
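As a cross-check on the binomial step, here is a small sketch that evaluates P(X = 0) and the mean directly. The helper binom_pmf is a hypothetical name introduced for illustration. The last lines show how sensitive the answer is to the value of p entered: 0.0028 in place of 0.0228 moves the result from roughly 0.10 to roughly 0.755.

```python
from math import comb

n, p = 100, 0.0228   # 100 patients; p = P(X > 20.6) from part b

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_none = binom_pmf(0, n, p)   # probability that nobody develops a rash
expected = n * p              # mean of the binomial

print(f"P(X = 0) = {p_none:.4f}")
print(f"E(X) = {expected:.2f}")

# Sensitivity check: a one-digit slip in p changes the answer a lot
p_none_slip = binom_pmf(0, n, 0.0028)
print(f"P(X = 0) with p = 0.0028: {p_none_slip:.3f}")
```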
d. The pills are known to be ineffective if they have less than 18 mg. of active
ingredient, but are quite effective as long as they have at least 18 mg. What is the
probability that a randomly selected pill has less than 18 mg. of active ingredient?
Z = (18 – 20)/0.3 = - 6.67
P(X < 18) ~ 0.0000
e. With so many patients developing rashes from taking these pills, the company puts
you in charge of fixing this problem. After forming a process improvement team, the
first suggestion is to change the process and target a mean less than the current 20
mg. The entire team is aware that the cost of the active ingredient is very high and
that manufacturing more pills [for a given volume of active ingredient] means more
profit for the company. What would you suggest for a new target mean? Draw a
picture to show what you are recommending here.
Target 4 std. deviations above the minimum effective level of the pills (18).
Depending on the problem, you could target more or less than 4 std. deviations.
Target Mean = 18 + 4 * (0.3) = 19.2
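To see what the recommended target buys, the sketch below evaluates both failure modes (ineffective pill, rash) under the new mean. The 4-standard-deviation margin is the choice made above; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

sigma = 0.3
min_effective, rash_limit = 18.0, 20.6
k = 4                                  # std. deviations of margin (a choice)

target = min_effective + k * sigma     # new target mean, 19.2 mg
new_process = NormalDist(mu=target, sigma=sigma)

p_ineffective = new_process.cdf(min_effective)   # P(X < 18)
p_rash = 1.0 - new_process.cdf(rash_limit)       # P(X > 20.6)

print(f"target mean     = {target:.1f} mg")
print(f"P(ineffective)  = {p_ineffective:.2e}")
print(f"P(rash)         = {p_rash:.2e}")
```

Both probabilities come out very small, which is the point of the picture requested in the question: the curve sits well inside the 18 to 20.6 mg window.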
2. A new drug can cure your serious medical problem provided your specially made
batch of pills (based on your age, weight, etc.) contains an average of 6 mg. The drug
company has informed you that if the average is 6.1 mg. or larger it will cause heart
failure and if the average is 5.9 mg. or smaller it will not cure your problem. You ask
the drug company how they know the pills have an average of 6 mg. and they respond
by saying “trust us”. Remembering something you studied in college called statistics,
you decide to test the hypothesis (5% level of significance) that the average is 6 mg.
against the alternative hypothesis that the average is not 6 mg. Since the pills cost
$20 each, management instructs you to test 36 pills (pretty large sample), resulting in a
sample mean of 6.08 mg. and sample standard deviation of 0.3 mg.
a. Conduct the two tailed hypothesis test and provide me with the results and
whether or not you would now take the pills.
Ho: μ = 6.0
Ha: μ ≠ 6.0
α = 0.05
n = 36
X-bar =6.08
s = 0.3
X-bar ~ Normal
μ(X-bar) = μ = 6.0
σ(X-bar) = s/SQRT(n) = 0.3/6 = 0.05
Z0.025 = 1.96 [Used Z here although technically we should use t]
UCV = 6.0 + 1.96 * 0.05 = 6.098
LCV = 6.0 – 1.96 * 0.05 = 5.902
Since LCV(5.902) < X-bar(6.08) < UCV(6.098) we fail to reject Ho: μ = 6.0 at a 5%
level of significance.
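The critical-value calculation can be reproduced as follows. This is a sketch that, like the key, uses Z rather than t; names such as lcv and ucv simply mirror the LCV/UCV labels above.

```python
from math import sqrt
from statistics import NormalDist

mu0, alpha = 6.0, 0.05          # hypothesized mean, significance level
n, xbar, s = 36, 6.08, 0.3      # sample size, sample mean, sample std. dev.

se = s / sqrt(n)                               # std. error of X-bar, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

lcv = mu0 - z_crit * se         # lower critical value
ucv = mu0 + z_crit * se         # upper critical value
reject = not (lcv < xbar < ucv)

print(f"LCV = {lcv:.3f}, UCV = {ucv:.3f}, reject Ho: {reject}")
```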
b. Sketch the power curve for the test in part a above.
c. Based on the power curve above, would you be worried about your
conclusion in part a above? Explain your logic.
If the true mean were 5.9 or 6.1 you have approximately a .55 probability
of rejecting the null hypothesis and throwing those “unacceptable” pills
away, which means you have a .45 probability of not rejecting the null
hypothesis and shipping those pills to the consumer. If you do ship this
quality pill to consumers, you have committed a Type II error.
Management has to decide if they are willing to live with this risk. The
answer normally lies with other factors such as the extent of other process
controls which would prevent the mean from ever degrading to these
unacceptable levels.
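The power figure quoted above is read off a sketched curve. A direct normal calculation, shown below as a sketch with a hypothetical helper named power, gives about 0.52 at a true mean of 5.9 or 6.1, which is in the same neighborhood and leads to the same conclusion.

```python
from math import sqrt
from statistics import NormalDist

def power(true_mean, mu0=6.0, s=0.3, n=36, alpha=0.05):
    """P(reject Ho) for the two-tailed Z test when the mean is true_mean."""
    se = s / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lcv, ucv = mu0 - z_crit * se, mu0 + z_crit * se
    sampling = NormalDist(mu=true_mean, sigma=se)   # distribution of X-bar
    return sampling.cdf(lcv) + (1.0 - sampling.cdf(ucv))

# A few points on the power curve
for m in (5.8, 5.9, 6.0, 6.1, 6.2):
    print(f"true mean {m:.1f}: power = {power(m):.3f}")
```

At the hypothesized mean the power equals alpha (0.05), and it climbs toward 1 as the true mean moves away from 6.0.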
d. If the above power curve is unacceptable, what would you like the power
curve to look like?
We would want a sample size sufficiently large to reject pills whose mean is
unacceptable [less than 5.9 or larger than 6.1]
e. How large a sample would you need to take in order to achieve the power
curve you sketched above?
SINGLE MEAN HYP TEST LECTURE: REQUIRED SAMPLE SIZES
THE NECESSARY SAMPLE SIZE TO RESULT IN A GIVEN "SHAPE" TO THE POWER CURVE CAN
BE FOUND AS FOLLOWS:
Can change these cells to study effects:
HYPOTHESIZED MEAN YOU WOULD LIKE TO ACCEPT = 6.00
ALPHA = 0.05
MEAN YOU WANT TO REJECT (1-β)100% OF THE TIME = 5.90
BETA = 0.05
WHAT IS AN ESTIMATE FOR THE STD. DEV. OF THE DATA = 0.30
REQUIRED SAMPLE SIZE = 117 (Answer)
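The worksheet does not show its formula. The sketch below assumes the usual normal-approximation formula n = ((z_alpha/2 + z_beta) * sigma / delta)^2, which reproduces the 117 reported above; the function name required_n is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def required_n(mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Unrounded sample size for a two-tailed Z test (assumed formula)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96
    z_beta = std_normal.inv_cdf(1 - beta)         # about 1.645
    delta = abs(mu1 - mu0)                        # shift we want to detect
    return ((z_alpha + z_beta) * sigma / delta) ** 2

n_raw = required_n(mu0=6.0, mu1=5.9, sigma=0.3)
print(f"unrounded n = {n_raw:.2f}, rounded up = {ceil(n_raw)}")
```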
3. Random Number Simulation Homework:
a. Generate 100 random numbers from a normal distribution with mean 20 and
standard deviation 0.5 . Plot a histogram of the data. Select your bin ranges
(cell boundaries) such that you have a pretty histogram.
Go to Tools - Data Analysis – Random Number Generator
μ = 20 σ = 0.5
Normally Distributed Data:
19.31 19.91 20.79 20.46 19.67 20.52 20.67 19.95 19.05 19.98
20.38 20.99 20.09 20.16 19.66 20.41 19.96 19.92 20.45 20.34
19.47 19.79 20.12 19.51 19.73 19.88 20.74 19.99 19.30 20.01
19.65 20.34 20.20 19.70 20.27 20.45 19.39 19.48 19.78 21.15
19.58 18.93 19.07 20.18 19.89 20.02 19.88 20.64 19.50 19.97
19.92 19.65 20.32 19.83 19.49 19.13 18.83 19.87 19.59 19.56
19.75 19.94 19.07 19.82 20.31 20.75 19.75 19.82 19.49 19.24
20.34 20.38 20.32 19.67 21.03 19.83 20.52 20.08 20.01 19.66
19.86 20.10 20.59 19.59 20.20 20.19 21.10 20.41 20.77 19.56
19.78 20.49 20.03 20.18 19.93 20.16 19.91 19.43 19.56 20.33
Bin    Frequency
18.4   0
18.8   0
19.2   6
19.6   17
20.0   33
20.4   25
20.8   15
21.2   4
21.6   0
More   0
[Figure: Histogram of the data above; x-axis: Data (bins 18.4 to 21.6, More); y-axis: Frequency (0 to 40)]
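For readers without Excel's Data Analysis add-in, the same exercise can be sketched in Python. The seed is an arbitrary assumption, so the generated values will not match the data listed above. The binning follows the Excel Histogram convention, where a value is counted in the first bin whose upper boundary it does not exceed.

```python
import random
from bisect import bisect_left

random.seed(2010)  # arbitrary seed (an assumption), only so reruns match
data = [random.gauss(20.0, 0.5) for _ in range(100)]

# Upper bin boundaries 18.4, 18.8, ..., 21.6, the spacing used in the key
edges = [round(18.4 + 0.4 * i, 1) for i in range(9)]

# counts[i] = number of values in (edges[i-1], edges[i]]; last slot is "More"
counts = [0] * (len(edges) + 1)
for x in data:
    counts[bisect_left(edges, x)] += 1

# Text histogram: one star per observation
labels = [f"{e:.1f}" for e in edges] + ["More"]
for label, c in zip(labels, counts):
    print(f"{label:>5} {c:3d} {'*' * c}")
```

Changing the mean to 23 and shifting the bin boundaries accordingly gives the part b version of the same plot.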
b. Generate 100 random numbers from a normal distribution with mean 23 and
standard deviation 0.5 . Plot a histogram of the data. Select your bin ranges
(cell boundaries) such that you have a pretty histogram.
μ = 23 σ = 0.5
Normally Distributed Data:
23.13 22.87 23.40 22.49 23.23 23.22 23.93 23.63 23.16 22.80
22.63 23.45 23.33 22.15 23.53 23.11 23.45 23.59 23.32 22.56
22.23 22.23 22.51 22.28 22.43 22.87 23.24 22.83 23.85 22.72
22.84 22.15 23.00 22.70 23.34 23.32 23.64 24.09 23.44 22.68
23.13 23.32 23.52 22.30 23.08 23.04 22.39 22.59 23.27 22.96
22.41 23.34 23.48 22.33 22.94 23.53 23.02 23.49 22.73 22.66
22.90 23.43 23.77 22.79 22.29 23.49 23.66 22.04 22.63 23.79
22.27 23.25 22.91 23.16 22.13 22.76 22.70 23.48 22.49 22.99
23.13 22.97 22.36 23.10 24.12 22.63 21.60 24.11 22.65 22.39
22.81 23.10 23.15 22.99 22.22 22.48 22.45 22.85 22.64 23.57
Bin    Frequency
21.2   0
21.6   1
22.0   0
22.4   15
22.8   23
23.2   26
23.6   25
24.0   7
24.4   3
24.8   0
More   0
[Figure: Histogram of the data above; x-axis: Data (bins 21.2 to 24.8, More); y-axis: Frequency (0 to 30)]
c. Now combine all the data (200 observations) in parts a and b above and
plot a histogram.
μ = ?  σ = ?
Distribution Unknown
Combined data: the 100 observations listed in part a together with the 100
observations listed in part b (200 values in total).
Bin    Freq.
18.4   0
18.8   0
19.2   6
19.6   17
20.0   33
20.4   25
20.8   15
21.2   4
21.6   1
22.0   0
22.4   15
22.8   23
23.2   26
23.6   25
24.0   7
24.4   3
24.8   0
More   0
[Figure: Histogram of the combined data; x-axis: Data (bins 18.4 to 24.8, More); y-axis: Frequency (0 to 35)]
d. Put all 3 histograms on one sheet and comment on what you see.
[Figure: Histogram of the μ = 20 data; x-axis: Data (bins 18.4 to 21.6, More); y-axis: Frequency (0 to 40)]
Data appears to be fairly well behaved, mounded, and most likely will pass a
hypothesis test for normality.
[Figure: Histogram of the μ = 23 data; x-axis: Data (bins 21.2 to 24.8, More); y-axis: Frequency (0 to 30)]
Data appears to be fairly well behaved, mounded, and most likely will pass a
hypothesis test for normality.
[Figure: Histogram of the combined data; x-axis: Data (bins 18.4 to 24.8, More); y-axis: Frequency (0 to 35)]
Data appears to have two modes and may well reflect a mixture of two different
distributions, both of which appear to be approximately normally distributed. Data
could also reflect time series data where the mean shifted half way through data
collection.
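The "μ = ? σ = ?" question for the combined data can be explored with a short simulation. This is a sketch under the assumption of an equal-weight mixture of the two normal populations; the seed is arbitrary and the simulated values are not the ones in the key.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2010)  # arbitrary seed (an assumption)
first = [random.gauss(20.0, 0.5) for _ in range(100)]
second = [random.gauss(23.0, 0.5) for _ in range(100)]
combined = first + second

# Equal-weight mixture of N(20, 0.5) and N(23, 0.5):
# mean = average of the two means
# variance = within-group variance + between-group variance
mix_mean = (20.0 + 23.0) / 2
mix_sd = sqrt(0.5**2 + ((23.0 - 20.0) / 2) ** 2)

print(f"theory: mean = {mix_mean:.2f}, sd = {mix_sd:.2f}")
print(f"sample: mean = {mean(combined):.2f}, sd = {stdev(combined):.2f}")
```

The combined standard deviation (about 1.58) is far larger than either group's 0.5, another sign that the data come from two populations rather than one.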
4. The time it takes a Nascar crew to change each tire during a race is known to be
normally distributed with a mean of 4 seconds and standard deviation of 0.2 seconds.
As you may know, they change one tire at a time until they complete changing all 4
tires.
a. Determine the mean and standard deviation of the time it takes to change all 4
tires.
X = time to change one tire [μ = 4, σ = 0.2, σ² = 0.04]
T = time to change 4 tires = X1 + X2 + X3 + X4
μT = μ + μ + μ + μ = 4 + 4 + 4 + 4 = 16
σT² = σ² + σ² + σ² + σ² = 0.04 + 0.04 + 0.04 + 0.04 = 0.16 [variances add, assuming the four tire changes are independent]
σT = SQRT(0.16) = 0.4
b. What is the probability they take longer than 16.6 seconds to change all 4 tires?
Z = (16.6 – 16)/0.4 = 1.5
P(T > 16.6) = 0.0668
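Parts a and b can be checked with a few lines of Python. This is a sketch that assumes, as the calculation above does, that the four tire-change times are independent.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, tires = 4.0, 0.2, 4

# Means add; variances (not standard deviations) add for independent times
mu_total = tires * mu                   # 16 seconds
sigma_total = sqrt(tires) * sigma       # sqrt(4 * 0.04) = 0.4 seconds
total = NormalDist(mu=mu_total, sigma=sigma_total)

p_slow = 1.0 - total.cdf(16.6)          # P(T > 16.6)
print(f"mean = {mu_total}, sd = {sigma_total:.1f}, P(T > 16.6) = {p_slow:.4f}")
```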
c. If they are currently leading the race by 15.6 seconds over the second place car
and the crew chief decides to have a pit stop and change all 4 tires, what is the
probability they are still leading the race after the pit stop?
Z = (15.6 – 16) / 0.4 = -1.0
P(T < 15.6) = 0.1587
d. If the crew chief wants to reduce the mean time to change one tire such that they
have a 97.3% probability of changing all 4 tires in less than 16 seconds, what
would their target value for the mean time to change one tire be?
Z = (16 – μT) / 0.4 = 1.923 [required Z score to get 0.973 area to left of 16]
Solving for μT
μT = 16 - 0.4 * ( 1.923 ) = 15.231 [mean time to change 4 tires]
Since μT = μ + μ + μ + μ
μ = ( μT ) / 4 = 15.231 / 4 = 3.81 [mean time to change 1 tire]
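The same target can be found with an inverse-normal lookup instead of a Z table. In this sketch the software quantile for 0.973 comes out near 1.927 rather than the table value used above; the difference disappears once the per-tire mean is rounded to 3.81.

```python
from statistics import NormalDist

sigma_total = 0.4               # std. dev. of the 4-tire time (from part a)
deadline, prob = 16.0, 0.973    # want P(T < 16) = 0.973

z = NormalDist().inv_cdf(prob)            # Z with 0.973 area to its left
mu_total = deadline - z * sigma_total     # required mean for all 4 tires
mu_per_tire = mu_total / 4                # required mean for one tire

# Verify: with this mean, P(T < 16) should come back as 0.973
check = NormalDist(mu=mu_total, sigma=sigma_total).cdf(deadline)

print(f"z = {z:.3f}, mean for 4 tires = {mu_total:.3f}")
print(f"mean per tire = {mu_per_tire:.2f}, check P(T < 16) = {check:.3f}")
```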
5. Have you ever had trouble taking the cap off of a new bottle of eye drops (or any new
product)? During the manufacturing process, the equipment is initially set up, before
each production run, to apply a specified torque to the caps. This is important because if
the cap is too loose, leaks might occur, causing a loss of sterility, and if
the cap is too tight, consumers who are old like your professor would not be able to remove the
cap and would refuse to buy your product again. In order to ensure that the equipment is set up
properly each morning, you decide to randomly sample 100 bottles during the first 30
minutes of production and check to see if the average torque is on target. The target for
the application torque is 50 ft-lbs. Consumer research has found that old professors can
still remove the cap as long as the torque is less than 90 ft-lbs and bottles will not leak as
long as the torque is greater than 10 ft-lbs. The sample of 100 bottles resulted in the
following:
BOTTLE   1    2     ………   100   SAMPLE MEAN   SAMPLE STANDARD DEVIATION
TORQUE   89   111   ………   93    60            9
a. Test the hypothesis that the mean torque is equal to 50 against the alternative
hypothesis that the mean torque is not equal to 50 at a 5% level of significance.
You may assume that torque data is approximately normally distributed.
Ho: μ = 50
Ha: μ ≠ 50
α = 0.05
n = 100
X-bar =60
s=9
X-bar ~ Normal
μ(X-bar) = μ = 50
σ(X-bar) = s/SQRT(n) = 9/10 = 0.9
Z0.025 = 1.96 [Used Z here although technically we should use t]
UCV = 50 + 1.96 * 0.9 = 51.764
LCV = 50 – 1.96 * 0.9 = 48.236
Since X-bar(60) > UCV(51.764) we reject Ho: μ = 50 at a 5% level of significance.
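An equivalent way to run this test is to compute the Z statistic and a two-sided p-value rather than critical values for X-bar. The sketch below does that, again using Z in place of t as the key does; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, alpha = 50.0, 0.05          # target torque, significance level
n, xbar, s = 100, 60.0, 9.0      # sample size, sample mean, sample std. dev.

se = s / sqrt(n)                                 # 0.9
z_stat = (xbar - mu0) / se                       # about 11.1
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96
p_value = 2 * (1.0 - NormalDist().cdf(abs(z_stat)))
reject = abs(z_stat) > z_crit

print(f"Z = {z_stat:.2f}, p-value = {p_value:.2e}, reject Ho: {reject}")
```

A Z statistic this far beyond 1.96 gives a p-value that is zero to machine precision, so the decision matches the critical-value approach.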
b. What will you tell your manager about today’s torque set-up?
It appears that you missed the target torque of 50 ft-lbs.
c. For the previous hypothesis test, sketch the power curve
d. Your boss is charged with recommending process improvements that result in cost
savings while continuing to produce quality products. In your opinion,
what is the largest average torque you would consider acceptable? Give your reasoning.
Since torques above 90 are difficult for old professors to open, the mean will be
targeted 4 std. deviations from 90. Therefore,
μx = 90 - 4 * σ = 90 - 4 * 9 = 54
Other distances from 90 could also be targeted depending on circumstances.
e. A similar product, packaged in the same bottles and using the same equipment as in
the previous problem, has an FDA requirement that the average torque be 50 ft-lbs, but
no larger than 55 ft-lbs and no smaller than 45 ft-lbs. If it is larger than 55 or smaller
than 45, you get in a lot of trouble. How large a sample would you recommend we
take to test the hypothesis “and” stay out of trouble? Make whatever assumptions you
wish to answer this problem (just write them down).
SINGLE MEAN HYP TEST LECTURE: REQUIRED SAMPLE SIZES
THE NECESSARY SAMPLE SIZE TO RESULT IN A GIVEN "SHAPE" TO THE POWER CURVE CAN
BE FOUND AS FOLLOWS:
Can change these cells to study effects:
HYPOTHESIZED MEAN YOU WOULD LIKE TO ACCEPT = 50.00
ALPHA = 0.05
MEAN YOU WANT TO REJECT (1-β)100% OF THE TIME = 55.00
BETA = 0.05
WHAT IS AN ESTIMATE FOR THE STD. DEV. OF THE DATA = 9.00
REQUIRED SAMPLE SIZE = 42 (Answer)
Assumptions: Assume sample size of 42 is sufficiently large for the central limit
theorem to apply or population itself is normally distributed.
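The worksheet's formula is not shown, so this sketch assumes the standard normal-approximation formula n = ((z_alpha/2 + z_beta) * sigma / delta)^2; required_n is a hypothetical helper name. The unrounded result is about 42.1, which matches the 42 reported above when rounded to the nearest integer; rounding up, the more conservative convention, would give 43.

```python
from math import ceil
from statistics import NormalDist

def required_n(mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Unrounded sample size for a two-tailed Z test (assumed formula)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96
    z_beta = std_normal.inv_cdf(1 - beta)         # about 1.645
    delta = abs(mu1 - mu0)                        # shift we want to detect
    return ((z_alpha + z_beta) * sigma / delta) ** 2

n_raw = required_n(mu0=50.0, mu1=55.0, sigma=9.0)
print(f"unrounded n = {n_raw:.2f}")
print(f"nearest integer = {round(n_raw)}, rounded up = {ceil(n_raw)}")
```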