Chapter 2 Basic Statistical Methods
許湘伶
Design and Analysis of Experiments
(Douglas C. Montgomery)
hsuhl (NUK)
DAE Chap. 2
1 / 30
Compare two conditions
Experiments to compare two conditions (also called treatments)
run: each of the observations, indexed by j
Basic Statistical Concepts
Graphical description of variability: useful for summarizing the information in a sample of data
dot diagram:
- small data (about 20 observations)
- location or central tendency
- spread or variability
Basic Statistical Concepts (cont.)
histogram: a large-sample tool
Basic Statistical Concepts (cont.)
box plot (box-and-whisker plot): a useful way to display data
Figure: box plot (image source: Wikipedia, "Box plot")
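As a rough illustration of the three displays (dot diagram, histogram, box plot), the following R sketch draws each one for a simulated sample. The vector `y`, the seed, and the distribution parameters are assumptions made for illustration; they are not data from the slides.

```r
# Hypothetical sample: 20 simulated observations (not data from the slides)
set.seed(1)
y <- rnorm(20, mean = 17, sd = 0.3)

# Dot diagram: suited to small samples; shows location and spread
stripchart(y, method = "stack", pch = 16, main = "Dot diagram")

# Histogram: more informative when the sample is large
hist(y, main = "Histogram")

# Box plot: box drawn from the hinges and the median
boxplot(y, horizontal = TRUE, main = "Box plot")

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(y)
print(five)
```

With only 20 observations the histogram is coarse, which is why the dot diagram is the usual choice for small samples.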
hsuhl (NUK)
DAE Chap. 2
5 / 30
Probability distributions
to describe the observations more completely
properties of probability distributions
Probability distributions (cont.)
mean (expected value): $\mu = E(y)$
variance: $\sigma^2 = V(y) = E[(y - \mu)^2]$
some properties of expected value and variance, e.g.:
- $E(y_1 + y_2) = E(y_1) + E(y_2)$
- $\mathrm{Cov}(y_1, y_2) = E[(y_1 - \mu_1)(y_2 - \mu_2)]$
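A further standard property, added here as a supplement to the two listed, links the covariance to the variance of a sum. Expanding $E[((y_1 - \mu_1) + (y_2 - \mu_2))^2]$ term by term gives:

```latex
\begin{aligned}
V(y_1 + y_2)
  &= E\big[(y_1 - \mu_1)^2\big] + E\big[(y_2 - \mu_2)^2\big]
     + 2\,E\big[(y_1 - \mu_1)(y_2 - \mu_2)\big] \\
  &= V(y_1) + V(y_2) + 2\,\mathrm{Cov}(y_1, y_2)
\end{aligned}
```

When $y_1$ and $y_2$ are independent, $\mathrm{Cov}(y_1, y_2) = 0$ and the variance of the sum is the sum of the variances.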
Sampling and Sampling Distributions
to draw conclusions about a population using a sample from that
population
sample mean: $\bar{y} = \dfrac{\sum_{i=1}^{n} y_i}{n}$
sample variance: $S^2 = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1} = \dfrac{SS}{n-1}$; SS: corrected sum of squares of $y_i$
sample standard deviation: $S = \sqrt{S^2}$
point estimator:
- unbiased: e.g. $E(\bar{y}) = \mu$
- minimum variance: among unbiased estimators, the one with the smallest variance is preferred
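The sample mean, corrected sum of squares, sample variance, and sample standard deviation can be computed directly from their definitions. The sketch below does so in R and compares the results with the built-in `mean()`, `var()`, and `sd()`. The data vector `y` is made up for illustration.

```r
# Hypothetical observations (illustrative only, not from the slides)
y <- c(4.1, 5.3, 4.8, 5.9, 5.0)
n <- length(y)

ybar <- sum(y) / n          # sample mean
SS   <- sum((y - ybar)^2)   # corrected sum of squares
S2   <- SS / (n - 1)        # sample variance (divisor n - 1)
S    <- sqrt(S2)            # sample standard deviation

# The built-in functions use the same definitions
print(c(ybar = ybar, mean = mean(y)))
print(c(S2 = S2, var = var(y)))
print(c(S = S, sd = sd(y)))
```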
Sampling and Sampling Distributions (cont.)
sampling distribution: $Y \sim N(\mu, \sigma^2)$
Sampling and Sampling Distributions (cont.)
The Central Limit Theorem
If $y_1, \ldots, y_n$ is a sequence of $n$ independent and identically distributed r.v.s with $E(y_i) = \mu$ and $V(y_i) = \sigma^2$ (both finite) and $x = \sum_{i=1}^{n} y_i$, then the limiting form of the distribution of
$$z_n = \frac{x - n\mu}{\sqrt{n\sigma^2}}$$
as $n \to \infty$ is the standard normal distribution.
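The theorem can be checked informally by simulation. The R sketch below sums `n` exponential variables (a clearly non-normal distribution), standardizes the sums as in the definition of $z_n$, and inspects the result. The choice of the exponential distribution, `n = 50`, the number of replications, and the seed are all assumptions made for illustration.

```r
set.seed(42)
n      <- 50      # number of terms in each sum (assumed)
reps   <- 10000   # number of simulated sums (assumed)
mu     <- 1       # mean of Exponential(rate = 1)
sigma2 <- 1       # variance of Exponential(rate = 1)

# x = y_1 + ... + y_n for each replication
x <- replicate(reps, sum(rexp(n, rate = 1)))

# z_n = (x - n * mu) / sqrt(n * sigma^2)
z <- (x - n * mu) / sqrt(n * sigma2)

# Mean and standard deviation should be close to 0 and 1
print(c(mean = mean(z), sd = sd(z)))

# Compare the histogram of z with the standard normal density
hist(z, freq = FALSE, breaks = 40, main = "Standardized sums")
curve(dnorm(x), add = TRUE)
```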
Sampling and Sampling Distributions (cont.)
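chi-square or $\chi^2$ distribution
$z_1, \ldots, z_k$ independent $N(0, 1)$ $\Rightarrow$ $x = \sum_{i=1}^{k} z_i^2 \sim \chi^2_k$
$\mu = k$; $\sigma^2 = 2k$
$$\frac{SS}{\sigma^2} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{\sigma^2} \sim \chi^2_{n-1}$$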
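One consequence worth spelling out, assuming a random sample from a normal population: because a $\chi^2_{n-1}$ variable has mean $n - 1$, the sample variance is an unbiased estimator of $\sigma^2$.

```latex
E(S^2) = E\!\left(\frac{SS}{n-1}\right)
       = \frac{\sigma^2}{n-1}\, E\!\left(\frac{SS}{\sigma^2}\right)
       = \frac{\sigma^2}{n-1}\,(n-1)
       = \sigma^2
```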
Sampling and Sampling Distributions (cont.)
t distribution
$z \sim N(0, 1)$ and $\chi^2_k$ a chi-square r.v., independent of each other:
$$t_k = \frac{z}{\sqrt{\chi^2_k / k}} \sim t \text{ distribution with } k \text{ d.f.}$$
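As a worked instance of this definition, consider a random sample of size $n$ from $N(\mu, \sigma^2)$. Taking $z = (\bar{y} - \mu)/(\sigma/\sqrt{n})$ and the chi-square variable $(n-1)S^2/\sigma^2$ with $k = n - 1$ d.f. (the two are independent under normal sampling), the unknown $\sigma$ cancels:

```latex
t = \frac{\dfrac{\bar{y} - \mu}{\sigma/\sqrt{n}}}
         {\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}}
  = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} \cdot \frac{\sigma}{S}
  = \frac{\bar{y} - \mu}{S/\sqrt{n}} \sim t_{n-1}
```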
Sampling and Sampling Distributions (cont.)
F distribution
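$\chi^2_u, \chi^2_\nu$: two independent chi-square r.v.s with d.f. $u, \nu$
$$F_{u,\nu} = \frac{\chi^2_u / u}{\chi^2_\nu / \nu} \sim F \text{ distribution with } (u, \nu) \text{ d.f.}$$
$y_{i1}, \ldots, y_{in_i}$, $i = 1, 2$: two independent random samples of $n_i$ observations from different normal populations with a common variance $\sigma^2$
$$\frac{S_1^2}{S_2^2} \sim F_{n_1-1,\, n_2-1}$$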
Hypothesis Testing
Model: two levels
$y_{ij} = \mu_i + \varepsilon_{ij}$, $i = 1, 2$, $j = 1, 2, \ldots, n_i$
Random error: $\varepsilon_{ij} \sim \mathrm{NID}(0, \sigma_i^2) \Rightarrow y_{ij} \sim \mathrm{NID}(\mu_i, \sigma_i^2)$
Hypothesis Testing (cont.)
Statistical Hypothesis:
$H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$
Probability of Type I error: $\alpha$ (significance level of the test)
$\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$
Probability of Type II error: $\beta$
$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$
Power:
$\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})$
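To make the relation between $\alpha$, $\beta$, and power concrete, the sketch below uses `power.t.test()` from R's `stats` package for a two-sample, two-sided t-test. The sample size, the true difference `delta`, and the standard deviation `sd` are hypothetical values chosen only for illustration.

```r
# Hypothetical design values (assumptions, not from the slides)
res <- power.t.test(n = 10,            # observations per group
                    delta = 0.25,      # assumed true difference mu1 - mu2
                    sd = 0.25,         # assumed common standard deviation
                    sig.level = 0.05,  # alpha: P(reject H0 | H0 true)
                    type = "two.sample",
                    alternative = "two.sided")

power <- res$power   # P(reject H0 | H0 false), for the assumed delta
beta  <- 1 - power   # P(fail to reject H0 | H0 false)
print(c(power = power, beta = beta))
```

Power depends on the assumed `delta`: a larger true difference, a larger `n`, or a smaller `sd` each raise it.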
Hypothesis Testing (cont.)
Two-sample t-test
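test statistic:
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}$$
Estimate of $\sigma_1^2 = \sigma_2^2 = \sigma^2$:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
$100(1 - \alpha)$ percent C.I.:
$$\bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$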
Hypothesis Testing (cont.)
Figure: The t distribution with 18 d.f., with the critical region $\pm t_{0.025,18} = \pm 2.101$
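The critical value for a two-sided test at $\alpha = 0.05$ with 18 d.f. is the upper $\alpha/2 = 0.025$ point of the t distribution, which can be looked up with `qt()`. The sketch below also applies the decision rule to the statistic t = -2.1869 reported in the R output for the Portland cement data.

```r
alpha <- 0.05
df    <- 18

# Upper alpha/2 percentage point of t with 18 d.f.
t_crit <- qt(1 - alpha / 2, df)
print(round(t_crit, 3))   # 2.101

# Decision rule: reject H0 when |t0| exceeds the critical value
t0 <- -2.1869             # value from the t.test output in the slides
reject <- abs(t0) > t_crit
print(reject)
```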
Hypothesis Testing (cont.)
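the length of the C.I. is determined by
$$t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Consider $n_1 = n_2 = n$: the quantity becomes $t_{\alpha/2,\, 2n-2}\, S_p \sqrt{2/n}$, which, for a given $S_p$, decreases as $n$ increases.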
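With equal sample sizes $n_1 = n_2 = n$, the term that sets the interval length is $t_{\alpha/2,\,2n-2}\, S_p \sqrt{2/n}$. The R sketch below tabulates it for a few values of `n`. The helper `half_width()` is a hypothetical name introduced here, and `Sp = 0.284` is an assumed illustrative value for the pooled standard deviation.

```r
# Half-width of the C.I. for mu1 - mu2 when n1 = n2 = n
# (half_width is a hypothetical helper, not part of any package)
half_width <- function(n, Sp, alpha = 0.05) {
  qt(1 - alpha / 2, df = 2 * n - 2) * Sp * sqrt(2 / n)
}

n_values <- c(5, 10, 20, 40)
hw <- sapply(n_values, half_width, Sp = 0.284)  # Sp is an assumed value

print(data.frame(n = n_values, half_width = round(hw, 3)))
```

Both factors shrink as `n` grows: the t quantile approaches the normal quantile and $\sqrt{2/n}$ goes to zero.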
Hypothesis Testing (cont.)
Portland cement data
                   Modified Mortar   Unmodified Mortar
ȳ_i (kgf/cm²)      16.764            17.042
S_i²               0.100             0.061
S_i                0.316             0.248
n_i                10                10
> t.test(y1,y2,var.equal=TRUE)
Two Sample t-test
data: y1 and y2
t = -2.1869, df = 18, p-value = 0.0422
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.54507339 -0.01092661
sample estimates:
mean of x mean of y
   16.764    17.042
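The same test can be reproduced approximately from the summary statistics in the Portland cement table, without the raw vectors `y1` and `y2`. The sketch below applies the pooled two-sample formulas; because the table values are rounded, the results agree with the `t.test` output only to about two decimal places.

```r
# Summary statistics from the Portland cement table
ybar1 <- 16.764; S2_1 <- 0.100; n1 <- 10   # modified mortar
ybar2 <- 17.042; S2_2 <- 0.061; n2 <- 10   # unmodified mortar

df <- n1 + n2 - 2
Sp <- sqrt(((n1 - 1) * S2_1 + (n2 - 1) * S2_2) / df)  # pooled std. deviation
se <- Sp * sqrt(1 / n1 + 1 / n2)                      # std. error of ybar1 - ybar2

t0      <- (ybar1 - ybar2) / se                       # test statistic
p_value <- 2 * pt(-abs(t0), df)                       # two-sided p-value
ci      <- (ybar1 - ybar2) + c(-1, 1) * qt(0.975, df) * se  # 95% C.I.

print(round(c(t0 = t0, p_value = p_value), 4))
print(round(ci, 4))
```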
Hypothesis Testing (cont.)
Normal probability plot: checking the normality assumption
- sample data: $y_1, \ldots, y_n$
- arrange the data in increasing order: $y_{(1)}, \ldots, y_{(n)}$
- plot $y_{(j)}$ against the observed cumulative frequency $(j - 0.5)/n$
If the hypothesized distribution adequately describes the data, the plotted points will fall approximately along a straight line.
Judging whether the points lie on a straight line is subjective.
When assumptions are badly violated, the performance of the t-test will be affected.
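The construction can be sketched in a few lines of R. The data are simulated (an assumption for illustration), and the cumulative frequencies $(j - 0.5)/n$ are converted to standard normal quantiles with `qnorm()` so that normal data plot as a roughly straight line on ordinary axes.

```r
# Hypothetical sample (simulated, not from the slides)
set.seed(7)
y <- rnorm(15, mean = 17, sd = 0.3)
n <- length(y)

y_sorted <- sort(y)               # y_(1) <= ... <= y_(n)
p <- (seq_len(n) - 0.5) / n       # observed cumulative frequency (j - 0.5)/n
z <- qnorm(p)                     # matching standard normal quantiles

# Points close to a straight line support the normality assumption
plot(y_sorted, z,
     xlab = "Ordered observations y_(j)",
     ylab = "Standard normal quantile of (j - 0.5)/n",
     main = "Normal probability plot")
```

The built-in `qqnorm(y)` followed by `qqline(y)` produces a comparable plot with the axes swapped.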
Hypothesis Testing (cont.)
$\sigma_1^2 \neq \sigma_2^2$
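The slide gives only the heading for this case. A commonly used approach when the variances cannot be assumed equal, stated here as a supplement rather than as the slide's own content, is the Welch statistic with Satterthwaite's approximate degrees of freedom:

```latex
t_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}},
\qquad
\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^{2}}
           {\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}
```

Under $H_0$, $t_0$ is approximately distributed as $t_\nu$. In R this is what `t.test` computes when `var.equal = FALSE`, which is its default.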
Paired comparison problem
t.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, var.equal = FALSE,
conf.level = 0.95)
Paired comparison problem (cont.)
$d_j = y_{1j} - y_{2j}$, $j = 1, \ldots, n$
$H_0: \mu_d = \mu_1 - \mu_2 = 0$
test statistic:
$$t_0 = \frac{\bar{d}}{S_d / \sqrt{n}} \overset{H_0}{\sim} t_{n-1}, \qquad S_d = \left[\frac{\sum_{j=1}^{n} (d_j - \bar{d})^2}{n-1}\right]^{1/2}$$
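The paired statistic can be computed by hand from the differences and checked against `t.test` with `paired = TRUE`. The paired measurements `y1` and `y2` below are made up for illustration.

```r
# Hypothetical paired measurements (illustrative only)
y1 <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.3)
y2 <- c(11.8, 11.6, 12.5, 12.2, 11.7, 12.4)

d    <- y1 - y2                           # differences d_j
n    <- length(d)
dbar <- mean(d)
Sd   <- sqrt(sum((d - dbar)^2) / (n - 1)) # std. deviation of the differences
t0   <- dbar / (Sd / sqrt(n))             # paired t statistic, n - 1 d.f.

# Built-in paired test for comparison
res <- t.test(y1, y2, paired = TRUE)
print(c(by_hand = t0, t.test = unname(res$statistic)))
```

The paired test is equivalent to a one-sample t-test of the differences against zero, so `t.test(d, mu = 0)` gives the same statistic.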
Variance of Normal Distribution
$H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$
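A sketch of how this test could be carried out with the summary statistics from the Portland cement table, assuming both samples come from normal populations: the ratio $F_0 = S_1^2 / S_2^2$ is compared with the lower and upper $\alpha/2$ points of the $F_{n_1-1,\,n_2-1}$ distribution.

```r
# Sample variances and sizes from the Portland cement table
S2_1 <- 0.100; n1 <- 10   # modified mortar
S2_2 <- 0.061; n2 <- 10   # unmodified mortar
alpha <- 0.05

F0    <- S2_1 / S2_2                          # test statistic
upper <- qf(1 - alpha / 2, n1 - 1, n2 - 1)    # upper alpha/2 point
lower <- qf(alpha / 2, n1 - 1, n2 - 1)        # lower alpha/2 point

# Two-sided rule: reject H0 if F0 falls outside [lower, upper]
reject <- (F0 > upper) || (F0 < lower)
print(round(c(F0 = F0, lower = lower, upper = upper), 3))
print(reject)
```

With raw data available, `var.test(y1, y2)` performs the same comparison directly.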