Susan Kolakowski
Design of Experiments – EQAS 770
Homework #1
March 22, 2006
Problem 1
Photoresist is a light-sensitive material applied to semiconductor wafers so that the
circuit pattern can be imaged onto the wafer. After application, the coated wafers are
baked to remove the solvent in the photoresist mixture and to harden the resist. Here
are the measurements of photoresist thickness (in kA) for eight wafers baked at each of
two different temperatures. Assume that the runs were made in random order and that they
are independent. (Problem statements copied from assignment)
Temp.    Photoresist Thickness (in kA)
95°C     11.176   7.089   8.097  11.739  11.291  10.759   6.467   8.315
100°C     5.263   6.748   7.461   7.015   8.133   7.418   3.772   8.963
a) Preliminary Analysis
For the preliminary analysis, the descriptive statistics were calculated and three plots
were produced: boxplot, dotplot and histogram of data.
The results of the descriptive statistics calculations were as follows (where N
represents the number of samples for each temperature and the mean, standard
deviation, minimum, median and maximum are in units of kA):
Temperature   N   Mean    Standard Deviation   Minimum   Median   Maximum
95°C          8   9.367   2.100                6.467     9.537    11.739
100°C         8   6.847   1.640                3.772     7.217     8.963
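The descriptive statistics above can be reproduced outside Minitab. The following is a minimal sketch using only the Python standard library; the dictionary keys and the helper name `describe` are illustrative choices, not part of the original analysis.

```python
import statistics

# Thickness readings (kA) as listed in the problem statement.
thickness = {
    "95C": [11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315],
    "100C": [5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963],
}


def describe(values):
    """Return N, mean, sample standard deviation, minimum, median and maximum."""
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),  # sample SD (n - 1 in the denominator)
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


summary = {temp: describe(vals) for temp, vals in thickness.items()}
for temp, stats in summary.items():
    print(temp, {name: round(value, 3) for name, value in stats.items()})
```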
Looking at these statistics, the photoresist thickness appears to depend on the
temperature at which the wafers are baked. At this stage this is only a hypothesis,
since it rests on the sample means of the 8 wafers baked at each temperature being
different. However, the mean for 100°C lies more than one standard deviation below
the mean for 95°C, which supports the hypothesis.
Another observation is that the maximum thickness for 100°C is less than the
mean for 95°C, which also makes it appear that baking temperature affects
photoresist thickness.
Here we have a boxplot of the data illustrating the spread of the samples at each
temperature. You can see from this plot that every wafer baked at 100°C has
a lower thickness than the median of the wafers baked at 95°C. This again makes
it appear that the baking temperature has a significant effect on photoresist
thickness.
The dotplot is another illustration of the data, but instead of displaying summary
statistics it displays where each individual observation falls. In my opinion it is
harder to judge the significance of temperature to photoresist thickness using this plot,
although you can see that two wafers baked at 100°C were measured to have
thicknesses lower than the minimum thickness achieved when baking at 95°C and
that four wafers baked at 95°C exceeded the maximum thickness achieved
when baking at 100°C.
The histogram of the two data sets is overlaid with the continuous Normal
distributions defined by the sample statistics of the 8 wafers at each
temperature. In this plot, you can again see that the mean for the 8 wafers baked at
95°C is greater than the mean for the 8 wafers baked at 100°C, although this plot
does show a fair amount of overlap between the two distributions.
Based on only the descriptive statistics and the three plots produced, I would say that
there may be a significant difference between the photoresist thicknesses of
wafers baked at the two temperatures and that it is worthwhile to go forward
with this data to see if there is enough evidence to support this difference.
b) Check all assumptions needed to perform the analysis:
1. Samples are from Normal distributions.
2. Variance for each temperature is equal.
3. Runs were made in random order and are independent.
1. A probability plot was produced in Minitab to test if the data could be assumed to
be Normal:
Since the p-values are greater than α=0.05 for both temperatures, there is not
enough evidence to say that these two data sets are not Normally distributed.
Therefore the Normality assumption is treated as met.
2. A test to determine if the variances for each temperature could be assumed to be
equal was run in Minitab. This test produced the following plot:
Since the p-values from both tests (F-test and Levene’s test) are greater than
α=0.05, there is not enough evidence to reject the assumption that the variances
are equal, so equal variances are assumed.
3. It was given in the problem statement that runs were made in random order and
are independent.
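The Minitab plot and its p-values are not reproduced in this transcript. As a rough cross-check of the equal-variance assumption in item 2, the F ratio of the two sample variances can be computed directly. This is a sketch of the classical F-test only (not Levene's test), and the critical value `F_CRIT_7_7` is an assumed value taken from a standard F table (upper 2.5% point with 7 and 7 degrees of freedom, for a two-sided test at α=0.05).

```python
import statistics

t95 = [11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315]
t100 = [5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963]

var_95 = statistics.variance(t95)    # sample variance, n - 1 denominator
var_100 = statistics.variance(t100)

# Put the larger variance in the numerator so the ratio is at least 1.
f_ratio = max(var_95, var_100) / min(var_95, var_100)

# Assumed table value: upper 2.5% point of F(7, 7).
F_CRIT_7_7 = 4.99

equal_variances_plausible = f_ratio < F_CRIT_7_7
print(f"F ratio = {f_ratio:.3f}, below critical value: {equal_variances_plausible}")
```

A ratio well below the critical value is consistent with the conclusion in item 2 that the equal-variance assumption cannot be rejected.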
c) A two-sample t-test assuming equal variances was performed to determine if there was
enough evidence to support the claim that there is a difference in the mean photoresist
thickness of wafers baked at 95°C versus 100°C. The assumptions required to perform
this test were met as described in part b of this problem. For this test, an α-value of
0.05 was used.
The results of the test were produced by Minitab as follows:
Two-sample T for Data

Labels   N   Mean   StDev   SE Mean
T=100    8   6.85    1.64      0.58
T=95     8   9.37    2.10      0.74

Difference = mu (T=100) - mu (T=95)
Estimate for difference:  -2.52000
95% CI for difference:  (-4.54043, -0.49957)
T-Test of difference = 0 (vs not =):  T-Value = -2.68  P-Value = 0.018  DF = 14
Both use Pooled StDev = 1.8840
Since the p-value produced by this test is less than α=0.05, there is enough evidence
to say that the means are not equal.
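The pooled t-statistic in the Minitab output can be checked by hand. The sketch below uses only the standard library; the function name `pooled_t` is illustrative, and the critical value `T_CRIT_14` is an assumed table value (two-sided, α=0.05, 14 degrees of freedom) used in place of an exact p-value.

```python
import math
import statistics

t95 = [11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315]
t100 = [5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963]


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic for mean(a) - mean(b) with a pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * statistics.variance(sample_a)
        + (n_b - 1) * statistics.variance(sample_b)
    ) / df
    pooled_sd = math.sqrt(pooled_var)
    std_err = pooled_sd * math.sqrt(1 / n_a + 1 / n_b)
    t_stat = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / std_err
    return t_stat, df, pooled_sd


# Same direction as the Minitab output: mu(T=100) - mu(T=95).
t_value, dof, pooled_sd = pooled_t(t100, t95)

# Assumed table value: t(0.025, 14).
T_CRIT_14 = 2.145
reject_equal_means = abs(t_value) > T_CRIT_14

print(f"t = {t_value:.2f}, df = {dof}, pooled SD = {pooled_sd:.4f}")
```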
d) The 95% confidence interval for the difference in the means was calculated during
the 2-sample t-test performed for part c: (-4.54043, -0.49957)
Since the value of 0 does not fall into this confidence interval, we can say with 95%
confidence that the difference between the population means is not zero (that is,
that there is a difference between the population means).
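The interval follows from the usual formula: estimate ± t × pooled SD × sqrt(1/n1 + 1/n2). The sketch below plugs in the summary values reported by Minitab; the critical value `T_CRIT_14` is an assumed table value for t(0.025, 14), so the result agrees with Minitab only to a few decimal places.

```python
import math

# Summary values taken from the Minitab output in part c.
estimate = -2.52       # mean(T=100) - mean(T=95), in kA
pooled_sd = 1.8840     # pooled standard deviation, in kA
n_per_group = 8

# Assumed table value: t(0.025, 14).
T_CRIT_14 = 2.1448

std_err = pooled_sd * math.sqrt(1 / n_per_group + 1 / n_per_group)
margin = T_CRIT_14 * std_err
ci_low, ci_high = estimate - margin, estimate + margin

contains_zero = ci_low <= 0 <= ci_high
print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f}), contains 0: {contains_zero}")
```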
e) The sample size necessary to detect an actual difference in mean thickness of 1.5 kA
with a power of 0.9 (or β-risk of 0.1) was determined in Minitab using a process
standard deviation of 1.8 kA and an α-value of 0.05.
The results from determining this sample size were:
2-Sample t Test
Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 1.8

             Sample   Target
Difference     Size    Power   Actual Power
       1.5       32      0.9       0.906801

The sample size is for each group.
These results tell us that to detect a difference of 1.5 kA between the means for the two
temperatures, a sample size of 32 wafers baked at each temperature is
necessary. This value was determined under the assumption that the process standard
deviation is 1.8 kA, allowing the maximum β-risk to be 0.1 and using an α-value of 0.05.
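Minitab's exact method is not shown here, but the result can be sanity-checked with the common normal-approximation formula n ≈ 2((z_α/2 + z_β)σ/δ)² per group. This is a sketch of that approximation only, not of Minitab's calculation; the function name `n_per_group_normal_approx` is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group_normal_approx(delta, sigma, alpha=0.05, power=0.9):
    """Approximate per-group sample size for a two-sided two-sample test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


n_approx = n_per_group_normal_approx(delta=1.5, sigma=1.8)
print(n_approx)
```

The approximation gives 31 per group, one fewer than the 32 reported by Minitab. A small gap in this direction is expected, because the normal approximation ignores the extra uncertainty from estimating the standard deviation that a t-based calculation accounts for.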
Problem 2 - ANOVA
The tensile strength of Portland cement is being studied. Four different mixing techniques
can be used economically. A completely randomized experiment was conducted and the
following data was collected. Assume that the data are independent.
Mixing
Technique    Tensile Strength (lb/in²)
1            3129   3000   2865   2890
2            3200   3300   2975   3150
3            2800   2900   2985   3050
4            2600   2700   2600   2765
a) Preliminary Analysis
For the preliminary analysis, the descriptive statistics were calculated and three plots
were produced: boxplot, dotplot and histogram of data.
The results of the descriptive statistics calculations were as follows (where N
represents the number of samples for each mixing technique and the mean, standard
deviation, minimum, median and maximum are in units of lb/in²):
Mixing
Technique   N   Mean     Standard Deviation   Minimum   Median   Maximum
1           4   2971.0   120.6                2865.0    2945.0   3129.0
2           4   3156.3   136.0                2975.0    3175.0   3300.0
3           4   2933.8   108.3                2800.0    2942.5   3050.0
4           4   2666.3    81.0                2600.0    2650.0   2765.0
It is hard to make observations based on just the descriptive statistics when there are
more than two levels. It does appear from these statistics that mixing techniques 1
and 3 produce similar results, while mixing technique 2 appears to produce the highest
tensile strengths and mixing technique 4 appears to produce the lowest
tensile strengths.
Here we have a boxplot of the data which makes it easier to visualize some of the
statistics which were calculated. This plot shows that the tensile strengths of all the
samples produced by mixing technique 4 actually fall below the minimums of the
other three techniques.
The following dot plot allows us to visualize where each individual sample falls. With
this plot, you can see that not only do the samples produced by technique 4 fall below
the samples produced by the other techniques, but that there are actually two samples
produced by technique 4 that share the minimum tensile strength of all the data.
The final plot produced for the preliminary analysis was the histogram plot showing
the Normal distribution assumed for each technique given the sample statistics. Here
again we can see how much less the mixing technique 4 distribution overlaps the
other three techniques and how the distributions for mixing techniques 2 and 4 hardly
overlap at all, making it very hard to deny that tensile strength depends on
whether you use technique 2 or 4.
Based on the preliminary analysis I would conclude that some mixing techniques may
produce similar results (techniques 1 and 3) but the tensile strength would most likely
be greater if you use mixing technique 2 rather than mixing technique 4, and therefore
the mixing technique probably does have an effect on the tensile strength.
b) The hypothesis that the mixing technique affects the tensile strength of Portland
cement was tested by hand and is attached. The result of this test was that the mixing
technique does have an effect on the tensile strength.
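The attached hand calculation is not part of this transcript. As an illustration, a one-way ANOVA of this kind is usually computed from the treatment and error sums of squares, as in the minimal sketch below. The function name `one_way_anova` is illustrative, and `F_CRIT_3_12` is an assumed table value for F(0.05; 3, 12).

```python
from statistics import fmean

# Tensile strength (lb/in^2) by mixing technique, from the problem statement.
strength = {
    1: [3129, 3000, 2865, 2890],
    2: [3200, 3300, 2975, 3150],
    3: [2800, 2900, 2985, 3050],
    4: [2600, 2700, 2600, 2765],
}


def one_way_anova(groups):
    """Return (SS_treatments, SS_error, df_treatments, df_error, F)."""
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    # Between-group variation: group means around the grand mean.
    ss_treat = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    ss_error = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_treat = len(groups) - 1
    df_error = len(all_values) - len(groups)
    f_stat = (ss_treat / df_treat) / (ss_error / df_error)
    return ss_treat, ss_error, df_treat, df_error, f_stat


ss_treat, ss_error, df_treat, df_error, f_stat = one_way_anova(list(strength.values()))

# Assumed table value: F(0.05; 3, 12).
F_CRIT_3_12 = 3.49
reject_equal_means = f_stat > F_CRIT_3_12

print(f"SS_treat = {ss_treat:.0f}, SS_error = {ss_error:.0f}, F = {f_stat:.2f}")
```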
c&d) Check all assumptions needed to perform the analysis:
1. Samples are from Normal distributions.
2. Variance for each technique is equal.
3. Runs were made in random order and are independent.
4. Each technique has n observations in it.
1. A probability plot was produced in Minitab to test if the data could be assumed to
be Normal:
Since the p-values for all four techniques are greater than α=0.05, there is not enough
evidence to say that any of the four data sets are not from Normal distributions.
2. A test to determine if the variances for each technique could be assumed to be
equal was run in Minitab. This test produced the following plot:
The p-values for Bartlett’s and Levene’s tests were both very high, so it is
reasonable to assume that the variances of the four data sets are equal.
There is not enough evidence to reject this hypothesis.
3. It was given in the problem statement that runs were made in random order and
are independent.
4. Each mixing technique contained 4 observations (n=4).
All the assumptions were met for the ANOVA test, which was performed by hand.
The test was also run in Minitab as a check. The Minitab
results matched the results determined by hand and were as follows:
Source   DF       SS       MS       F       P
Labels    3   489740   163247   12.73   0.000
Error    12   153908    12826
Total    15   643648

S = 113.3   R-Sq = 76.09%   R-Sq(adj) = 70.11%

Level   N     Mean   StDev
MT1     4   2971.0   120.6
MT2     4   3156.3   136.0
MT3     4   2933.8   108.3
MT4     4   2666.3    81.0

Individual 95% CIs For Mean Based on Pooled StDev
(character plot of the four intervals not recoverable; axis ticks at 2600, 2800, 3000, 3200)

Pooled StDev = 113.3
When performing this test by hand, the F-statistic was used to determine the result, but
in the Minitab results we can see that the p-value was extremely low (less than
α=0.05). It is displayed as 0.000 because of rounding; it is not actually equal to zero,
since a probability value can never be equal to zero in these test cases. This p-value
confirms the result obtained by hand that the null hypothesis can be rejected, and we
can state that the mixing technique does have an effect on the tensile strength of
Portland cement.
Problem 3
This problem is based on the experiment in problem 2.
a) The 95% confidence interval for the mean tensile strength of the Portland cement
produced by each of the 4 mixing techniques was determined in Minitab and the
following plot was produced:
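The plot is not reproduced in this transcript, and it is not stated whether it used the pooled or the individual standard deviations. Assuming the pooled estimate (as in the "Individual 95% CIs For Mean Based on Pooled StDev" output of Problem 2), each interval is mean ± t × sqrt(MSE/n). The sketch below uses the means and MSE from the Problem 2 output; `T_CRIT_12` is an assumed table value for t(0.025, 12).

```python
import math

# Values taken from the Problem 2 Minitab output.
means = {"MT1": 2971.0, "MT2": 3156.3, "MT3": 2933.8, "MT4": 2666.3}
mse = 12826            # error mean square, 12 degrees of freedom
n_per_technique = 4

# Assumed table value: t(0.025, 12).
T_CRIT_12 = 2.179

# Same half-width for every technique because n and the pooled SD are shared.
margin = T_CRIT_12 * math.sqrt(mse / n_per_technique)

intervals = {name: (m - margin, m + margin) for name, m in means.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: ({low:.1f}, {high:.1f})")
```

Under this assumption the interval for technique 4 lies entirely below the interval for technique 2, which is consistent with the ANOVA result in Problem 2.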
b) A 95% confidence interval for the difference in means for techniques 1 and 3 was
produced using a 2-sample t-test in Minitab and the following results were obtained:
Two-sample T for MT1 vs MT3

       N   Mean   StDev   SE Mean
MT1    4   2971     121        60
MT3    4   2934     108        54

Difference = mu (MT1) - mu (MT3)
Estimate for difference:  37.2500
95% CI for difference:  (-160.9986, 235.4986)
T-Test of difference = 0 (vs not =):  T-Value = 0.46  P-Value = 0.662  DF = 6
Both use Pooled StDev = 114.5795
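Since the 95% confidence interval contains zero (and the p-value of 0.662 is much greater
than α=0.05), there is not enough evidence to say that the means of techniques 1 and 3
differ. This does not prove that the means are equal, only that the data are consistent
with a difference of zero.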
c) The result from part b shows that if we were testing only techniques 1 and 3, we might
say that the mixing technique does not have an effect on the tensile strength, because
the tensile strengths produced by these two techniques are very similar. The
results presented in problem 2 stated that the mixing technique does have an effect
because 4 techniques were being tested and only one technique needed to have a
different mean in order for the null hypothesis (that all means were equal) to be
rejected. As long as one technique produces significantly different tensile strengths,
you cannot say that the mixing technique has no effect on tensile strength.
Problem 4
A rental car company wants to investigate whether the type of car rented affects the
length of the rental period. An experiment is run for one week at a particular location,
and 10 rental contracts are selected at random for each car type. The results are shown
in the table below.
Type of Car    Observations
Sub-Compact    3   5   3   7   6   5   3   2   1   6
Compact        1   3   4   7   5   6   3   2   1   7
Midsize        4   1   3   5   7   1   2   4   2   7
Full Size      3   5   7   5  10   3   4   7   2   7
a) Preliminary analysis
For the preliminary analysis, the descriptive statistics were calculated and three plots
were produced: boxplot, dotplot and histogram of data.
The results of the descriptive statistics calculations were as follows (where N
represents the number of samples for each type of car and the mean, standard
deviation, minimum, median and maximum are (assumed to be) in terms of number
of days the car was rented for):
Type of Car   N    Mean    Standard Deviation   Minimum   Median   Maximum
Compact       10   3.900   2.283                1.000     3.500     7.000
Full Size     10   5.300   2.452                2.000     5.000    10.000
Midsize       10   3.600   2.221                1.000     3.500     7.000
SubCompact    10   4.100   1.969                1.000     4.000     7.000
From these statistics we can see that randomly chosen contracts for compact and midsize cars produced the exact same minimum, median and maximum results. We can
also see that sub-compact cars were very close to these results. Full size cars, on the
other hand, did produce higher statistics but it is hard to say based solely on these
values that the difference between full size cars and others is significant.
Below is a boxplot of the data. From this box plot it appears that the results for the
four cars were quite similar and that the higher statistics for full size cars may just be
the result of a single outlier as the two middle quartiles for the full size cars are very
similar to the other car types.
The dotplot can help us verify this assumption that one outlier may have affected the
statistics for the full size cars as it shows the individual observations for each car
type. Here we see that there is one obvious outlier in the full size data set and that the
rest of the observations on the full size car contracts are very well aligned with the
observations of the other car types. This outlier therefore is most likely the result of
some other factor that may have nothing to do with the car type or may just be an
observation not characteristic of the data set. Although this outlier may be the only
reason the full size car type had higher statistics, it can be seen in this dotplot that the
full size car type had no contracts with a count of 1 while the other three car types had
a total of five 1 counts (and compact and midsize cars both had two contracts of
length 1). Also the full size car did have the most 7 day contracts observed.
The histograms of the four data sets, shown below, also show that the full size car
distribution is not very different from the others. The four distributions on this plot
overlap very much, making it seem that the car type really does not have an effect on
the length of the contract.
Based on the preliminary analysis, I would conclude that car type probably does not
have a significant effect on the length of the contract. At first it looked as if the full
size car might be different from the others, which would mean that, with just one
difference, car type does have a significant effect on the contract, but after looking at
the dotplot and histograms it appears as though all the car types had very similar
observations and there was just one outlier uncharacteristic of the rest of the data. We
still can’t say this for sure, and looking at the minimum values in the dotplot as well as
the maximum values even without the outlier, the full size car may still have an effect
on the contract length.
b) Using α=0.1, is there evidence to support the claim that the car type does have an
effect on the length of the rental contract?
The results of the ANOVA test performed in Minitab were:
Source   DF       SS     MS      F       P
Labels    3    16.68   5.56   1.11   0.358
Error    36   180.30   5.01
Total    39   196.98

S = 2.238   R-Sq = 8.47%   R-Sq(adj) = 0.84%

Level          N    Mean    StDev
Compact       10   3.900    2.283
Full Size     10   5.300    2.452
Midsize       10   3.600    2.221
Sub-Compact   10   4.100    1.969

Individual 95% CIs For Mean Based on Pooled StDev
(character plot of the four intervals not recoverable; axis ticks at 2.4, 3.6, 4.8, 6.0)

Pooled StDev = 2.238
These results show that the p-value is greater than α=0.1 and therefore there is not
enough evidence to say that the means are not equal. In other words, there is not
enough evidence to state that the type of car affects the length of the rental contract.
The assumptions made to perform this test (the data are Normal and independent,
randomly collected, with equal variances and equal sample sizes) were checked. It
was given in the problem statement that the data was randomly collected and it is
assumed that they were independent. The plot to test the normality of the data, shown
below, resulted in p-values greater than α=0.1 for every car type and therefore the
data can be assumed to be normal.
Finally, the equal variances assumption was tested and also shown to be valid due to
the very high p-values:
c) Analyze the residuals and comment on model adequacy
The following plots were produced for the residuals:
The top left plot shows how well the residuals fit a Normal distribution. The residuals
appear to fit this line well although there does seem to be some curving away from
the fit towards the edges. For this reason the Normal distribution was fitted to the
histogram in the bottom left. This also shows that the residuals do not seem to fit the
Normal distribution that well.
The two plots on the right of the first residuals figure do both appear to show random
patterns, which is good, although there is a large gap in the middle of the top right plot.
In terms of the model adequacy, all assumptions were met (although the assumption
that the data was independent could not be verified). The test for normality of the
four data sets was passed, but the residuals plots do seem to contradict this. If the
residuals are not normally distributed, the Normality assumption for the data within
each car type is not met either.
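The residual plots are not reproduced in this transcript. For reference, the residuals analyzed in part c are each observation minus the mean of its own car type, and their sum of squares should match the Error row of the ANOVA table. The following is a minimal sketch; the variable names are illustrative.

```python
from statistics import fmean

# Rental lengths by car type, from the problem statement.
rental_days = {
    "Sub-Compact": [3, 5, 3, 7, 6, 5, 3, 2, 1, 6],
    "Compact": [1, 3, 4, 7, 5, 6, 3, 2, 1, 7],
    "Midsize": [4, 1, 3, 5, 7, 1, 2, 4, 2, 7],
    "Full Size": [3, 5, 7, 5, 10, 3, 4, 7, 2, 7],
}

# Residual = observation minus the mean of its own group.
residuals = {
    car: [x - fmean(values) for x in values]
    for car, values in rental_days.items()
}

# Sum of squared residuals; should equal the Error SS in the ANOVA table.
ss_error = sum(r ** 2 for group in residuals.values() for r in group)

# Largest residual in absolute value, with the car type it belongs to.
largest_abs, largest_car = max(
    (abs(r), car) for car, group in residuals.items() for r in group
)

print(f"SS_error = {ss_error:.2f}; largest |residual| = {largest_abs:.1f} ({largest_car})")
```

The largest residual belongs to the 10-day full size contract, the same observation identified as an outlier in the dotplot.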
d) The response variable is a count. Count variables are typically modeled with discrete
distributions (such as the Poisson or Binomial), which would mean that the Normal
distribution assumption may not be met and that this ANOVA test may not be appropriate.
If the Normality assumption is not met, then the results of the ANOVA would be unreliable.
I performed a normality test on the saved residuals in Minitab to further test this
assumption. The following plot was produced:
The p-value for this test was low (0.181) but still not less than α=0.1, so there is not
enough evidence to reject the hypothesis that the residuals, and data, are distributed
normally. However, the points do seem to oscillate around the fit line in a pattern and
some points come extremely close to the 95% confidence boundary, so I would say that
these residuals still may not be from a Normal distribution.
There is not enough evidence to say that these test results cannot be accepted. Under
these test results, there is not enough evidence to reject the null hypothesis that the car
type does not affect the length of the rental contract.