Council of Student Organizations
De La Salle University– Manila
Rm. 402 Bro. Connon Hall, De La Salle University
2401 Taft Avenue, Manila
ENGSTAT Quiz 3 Reviewer
Prepared by: Ma. Elizabeth Ann L. Uy
Sampling Distribution of the Sample Mean
EXAMPLE 1: Heights of Starting Players
Player    Height
A         76
B         78
C         79
D         81
E         86
μ = 80
Obtain the sampling distribution of the sample mean for samples of size 2 and for samples of
size 4. Compare.
Solution:
For size 2:
Sample    Heights    Mean x̄    (x̄ − μ)    (x̄ − μ)²
A, B      76, 78     77.0       −3.0        9.00
A, C      76, 79     77.5       −2.5        6.25
A, D      76, 81     78.5       −1.5        2.25
A, E      76, 86     81.0        1.0        1.00
B, C      78, 79     78.5       −1.5        2.25
B, D      78, 81     79.5       −0.5        0.25
B, E      78, 86     82.0        2.0        4.00
C, D      79, 81     80.0        0.0        0.00
C, E      79, 86     82.5        2.5        6.25
D, E      81, 86     83.5        3.5       12.25
Σ(x̄ − μ)² = 43.5
$\sigma = \sqrt{\dfrac{\sum(\bar{x}-\mu)^2}{n}} = \sqrt{\dfrac{43.5}{10}} = 2.09$, where n = 10 is the number of possible samples of size 2.
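For size 4:
Sample        Heights           Mean x̄    (x̄ − μ)    (x̄ − μ)²
A, B, C, D    76, 78, 79, 81    78.50      −1.50       2.2500
A, B, C, E    76, 78, 79, 86    79.75      −0.25       0.0625
A, B, D, E    76, 78, 81, 86    80.25       0.25       0.0625
A, C, D, E    76, 79, 81, 86    80.50       0.50       0.2500
B, C, D, E    78, 79, 81, 86    81.00       1.00       1.0000
Σ(x̄ − μ)² = 3.625
$\sigma = \sqrt{\dfrac{\sum(\bar{x}-\mu)^2}{n}} = \sqrt{\dfrac{3.625}{5}} = 0.85$, where n = 5 is the number of possible samples of size 4.
The larger the sample size, the more accurate the sample mean is as an estimate of μ, because σ (the
standard deviation of the sample means) becomes smaller: 0.85 for samples of size 4 versus 2.09 for
samples of size 2.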
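The two tables in Example 1 can be checked with a short script. This is only a sketch: the function name `sd_of_sample_means` is made up for illustration, and it divides by the number of possible samples, as the solution does.

```python
from itertools import combinations
from math import sqrt

# Heights of the five starting players from Example 1
heights = {"A": 76, "B": 78, "C": 79, "D": 81, "E": 86}
mu = sum(heights.values()) / len(heights)  # population mean, 80

def sd_of_sample_means(size):
    """Standard deviation of the sample mean over all possible samples of `size`."""
    # Hypothetical helper: enumerate every sample without replacement
    means = [sum(s) / size for s in combinations(heights.values(), size)]
    # Divide by the number of possible samples, as in the worked solution
    return sqrt(sum((m - mu) ** 2 for m in means) / len(means))

sd2 = sd_of_sample_means(2)
sd4 = sd_of_sample_means(4)
print(round(sd2, 2), round(sd4, 2))  # 2.09 0.85
```

Changing the argument to 3 or 5 shows the same pattern: the spread of the sample means shrinks as the sample size grows.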
EXAMPLE 2: Prices of New Mobile Homes
Prices of new mobile homes (in thousand pesos)
53.8   53.1   35.9   42.0   54.4   59.4
42.5   62.7   45.2   42.9   49.9   48.2
41.6   49.7   43.7   52.7   47.7   41.5
57.2   45.1   50.3   50.0   41.9   62.8
46.6   60.3   43.9   56.4   58.9   35.3
37.3   49.8   48.6   58.9   39.7   63.9
a) Determine the point estimate: x̄ = 49.27
b) Identify the distribution of the variable x̄: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{7.2}{\sqrt{36}} = 1.2$
c) Show that 95.44% of all samples have the property that the interval from x̄ − 2.4 to x̄ + 2.4
contains μ.
$\bar{x} - 2\sigma_{\bar{x}}$ to $\bar{x} + 2\sigma_{\bar{x}}$
$49.27 - 2(1.2)$ to $49.27 + 2(1.2)$
Php 46,870 to Php 51,670
The “95.44” part of the 68.26-95.44-99.74 rule states that, for a normally distributed variable,
95.44% of all possible observations lie within two standard deviations to either side of the mean.
Applying the rule to the variable x̄, we see that 95.44% of all samples of 36 new mobile homes have
mean prices within 2.4 of μ or, equivalently, 95.44% of all samples of 36 new mobile homes have
the property that the interval from x̄ − 2.4 to x̄ + 2.4 contains μ.
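The arithmetic in Example 2 can be reproduced as follows. This is a sketch that takes σ = 7.2 (thousand pesos) from the worked solution; the variable names are illustrative.

```python
from math import sqrt

# The 36 prices from Example 2, in thousand pesos
prices = [
    53.8, 53.1, 35.9, 42.0, 54.4, 59.4, 42.5, 62.7, 45.2,
    42.9, 49.9, 48.2, 41.6, 49.7, 43.7, 52.7, 47.7, 41.5,
    57.2, 45.1, 50.3, 50.0, 41.9, 62.8, 46.6, 60.3, 43.9,
    56.4, 58.9, 35.3, 37.3, 49.8, 48.6, 58.9, 39.7, 63.9,
]
sigma = 7.2               # population standard deviation used in the solution
n = len(prices)           # sample size, 36
x_bar = sum(prices) / n   # point estimate of the mean
se = sigma / sqrt(n)      # standard deviation of the sample mean, 1.2

# Interval reaching two standard deviations to either side of the sample mean
lower, upper = x_bar - 2 * se, x_bar + 2 * se
print(f"mean = {x_bar:.2f}, interval = {lower:.2f} to {upper:.2f}")
```

The printed interval is in thousand pesos, so 46.87 to 51.67 corresponds to Php 46,870 to Php 51,670.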
EXAMPLE 3: Age of the civilian labor force
Find a 95% confidence interval for the mean age, μ, of all people in the civilian labor force. Assume
σ = 12.1 years, n = 50.
22   32   33   43   60   51   27   28   42   35
58   34   16   37   41   37   31   39   40   29
40   45   49   19   28   65   33   43   31   33
42   38   29   21   35   37   24   26   34   32
43   19   30   62   37   26   34   38   38   33
Step 1: Confidence level = 95%
$\alpha = 1 - 0.95 = 0.05$
$z_{\alpha/2} = z_{0.025} = 1.96$
Refer to the table for this part*
Step 2: Confidence Interval
$\bar{x} - z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}$
$35.98 - (1.96)\left(\dfrac{12.1}{\sqrt{50}}\right)$ to $35.98 + (1.96)\left(\dfrac{12.1}{\sqrt{50}}\right)$
32.626 to 39.334
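The two steps of Example 3 can be sketched in code. Here the critical value is computed with `statistics.NormalDist` from the Python standard library (3.8 or later) instead of being read from a table, so it differs from 1.96 only in later decimals; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

# The 50 ages from Example 3
ages = [
    22, 32, 33, 43, 60, 51, 27, 28, 42, 35,
    58, 34, 16, 37, 41, 37, 31, 39, 40, 29,
    40, 45, 49, 19, 28, 65, 33, 43, 31, 33,
    42, 38, 29, 21, 35, 37, 24, 26, 34, 32,
    43, 19, 30, 62, 37, 26, 34, 38, 38, 33,
]
sigma = 12.1        # population standard deviation given in the problem
confidence = 0.95

# Step 1: alpha and the critical value z_{alpha/2}
alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# Step 2: x_bar plus or minus z * sigma / sqrt(n)
x_bar = mean(ages)
margin = z * sigma / sqrt(len(ages))
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.3f} to {upper:.3f}")
```

Changing `confidence` to 0.90 or 0.99 gives the corresponding narrower or wider interval from the same data.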
HYPOTHESIS TESTING
EXAMPLE 4: Quality Assurance (taken from the book)
A company that produces snack foods uses a machine to package 454 g bags of pretzels. We
assume that the net weights are normally distributed and that the population standard deviation of all
such weights is 7.8 g. A simple random sample of 25 bags of pretzels has the net weights, in grams,
displayed in Table 1. Do the data provide sufficient evidence to conclude that the packaging machine
is not working properly? We use the following steps to answer the question.
Table 1. Weights, in grams, of 25 randomly selected bags of pretzels
465   449   468   446   447
456   442   433   447   456
438   449   454   456   456
454   446   463   452   435
447   447   450   444   450
a) State the null and alternative hypotheses for the hypothesis test.
b) Discuss the logic behind carrying out the hypothesis test.
c) Identify the distribution of the variable x̄, that is, the sampling distribution of the sample mean
for sample size 25.
d) Obtain a precise criterion for deciding whether to reject the null hypothesis in favor of the
alternative hypothesis.
e) Apply the criterion in part d to the sample data and state the conclusion.
Solution:
a) The null and alternative hypotheses for the hypothesis test are:
a. $H_0: \mu = 454$ g (the packaging machine is working properly)
b. $H_a: \mu \neq 454$ g (the packaging machine is not working properly)
b) Basically, the logic behind carrying out the hypothesis test is this: If the null hypothesis is true,
that is, if μ = 454 g, the mean weight, x̄, of the sample of 25 bags of pretzels should approximately
equal 454 g. We say “approximately equal” because we cannot expect a sample mean to equal
exactly the population mean; some sampling error is to be anticipated. However, if the sample
mean weight differs “too much” from 454 g, we would be inclined to reject the null hypothesis
and conclude that the alternative hypothesis is true. As we show in part d, we can use our
knowledge of the sampling distribution of the sample mean to decide how much difference is
“too much”.
c) Because n = 25, σ = 7.8, and the weights are normally distributed:
a. $\mu_{\bar{x}} = \mu$ (which we don't know)
b. $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{7.8}{\sqrt{25}} = 1.56$
c. x̄ is normally distributed
In other words, for samples of size 25, the variable x̄ is normally distributed with mean μ
and standard deviation 1.56 g.
d) The “95.44” part of the 68.26-95.44-99.74 rule states that, for a normally distributed variable,
95.44% of all possible observations lie within two standard deviations to either side of the mean.
Applying this part of the rule to the variable x̄ and referring to part c, we see that 95.44% of all
samples of 25 bags of pretzels have mean weights within 2(1.56) = 3.12 g of μ. Or equivalently, only
4.56% of all samples of 25 bags of pretzels have mean weights that are not within 3.12 g of μ, as
illustrated in Figure 1.
Figure 1. 95.44% of all samples of 25 bags of pretzels have mean weights within two standard deviations (3.12 g) of μ
Thus, if the mean weight, x̄, of the 25 bags of pretzels sampled is not within two
standard deviations (3.12 g) of 454 g, we have evidence against the null hypothesis. Why? Because
observing such a sample mean would occur by chance only 4.56% of the time if the null
hypothesis, μ = 454 g, is true.
In summary, we have obtained the following precise criterion for deciding whether to
reject the null hypothesis. The criterion is portrayed graphically in Figure 2a.
Figure 2. a) criterion for deciding whether to reject the null hypothesis; b) normal curve associated with x if the null hypothesis
is true, superimposed on the decision criterion
If the mean weight, x̄, of the 25 bags of pretzels sampled is more than two standard
deviations (3.12 g) from 454 g, reject the null hypothesis, μ = 454 g, and conclude that the alternative
hypothesis, μ ≠ 454 g, is true. Otherwise, do not reject the null hypothesis.
If the null hypothesis is true, the normal curve associated with x̄ is the one with
parameters 454 and 1.56; that normal curve is superimposed on Figure 2a in Figure 2b.
e) The mean weight, x̄, of the sample of 25 bags of pretzels whose weights are given in Table 1 is
450 g. Therefore,
$z = \dfrac{\bar{x} - 454}{1.56} = \dfrac{450 - 454}{1.56} = -2.56$
That is, the sample mean of 450 g is 2.56 standard deviations below the null hypothesis
population mean of 454 g, as shown in Figure 3. Because the mean weight of the 25 bags of
pretzels sampled is more than two standard deviations from 454 g, we reject the null hypothesis,
μ = 454 g, and conclude that the alternative hypothesis, μ ≠ 454 g, is true.
Figure 3. Graph showing the number of standard deviations that the sample mean of 450 g is from the null hypothesis
population mean of 454g.
Interpretation: The data provided sufficient evidence to conclude that the packaging machine is
not working properly.
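The decision rule from parts c to e can be sketched in code. The two-standard-deviation cutoff is the criterion used in this example; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean

# Table 1: net weights, in grams, of the 25 sampled bags
weights = [
    465, 449, 468, 446, 447,
    456, 442, 433, 447, 456,
    438, 449, 454, 456, 456,
    454, 446, 463, 452, 435,
    447, 447, 450, 444, 450,
]
mu0 = 454      # mean weight under the null hypothesis
sigma = 7.8    # population standard deviation given in the problem

x_bar = mean(weights)              # sample mean, 450 g
se = sigma / sqrt(len(weights))    # standard deviation of the sample mean, 1.56 g
z = (x_bar - mu0) / se             # standard deviations between x_bar and mu0

# Criterion from part d: reject if x_bar is more than two standard deviations from 454 g
reject = abs(z) > 2
print(f"z = {z:.2f}, reject null hypothesis: {reject}")
```

With the Table 1 data the script reports z = -2.56 and rejects the null hypothesis, matching the conclusion in part e.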
Reference:
Scheaffer, Mulekar, and McClave (2010). Probability and Statistics for Engineers, 5th Edition.