
Math 4030 – 9b
Comparing Two Means
• Dependent and independent samples
• Comparing two means
Choice of test depends on:
• Independent samples design or matched pairs design (Sec. 8.1)
• Size of samples
• Equal variances
• Normality
• Matched pairs sample: one sample of differences (Sec. 8.4)
• Independent samples: large samples (≥ 30)?
    • Yes: Z test (Sec. 8.2)
    • No: normality?
        • No: nonparametric tests (Ch. 14)
        • Yes: equal variance?
            • Yes: t test with df = n1 + n2 - 2, using the pooled estimator for the common variance (Sec. 8.3)
            • No: t test with estimated degrees of freedom (Sec. 8.3)
Data format for independent samples:
Population 1 (may or may not be normally distributed), with mean $\mu_1$ (to be estimated and compared) and variance $\sigma_1^2$ (may or may not be known).
Sample of size $n_1$: $X_1, X_2, \ldots, X_{n_1}$, with sample mean $\bar{X}$ and sample variance $S_1^2$.
Population 2 (may or may not be normally distributed), with mean $\mu_2$ (to be estimated and compared) and variance $\sigma_2^2$ (may or may not be known).
Sample of size $n_2$: $Y_1, Y_2, \ldots, Y_{n_2}$, with sample mean $\bar{Y}$ and sample variance $S_2^2$.
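Sampling distribution of $\bar{X} - \bar{Y}$:
$$E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2$$
$$\mathrm{Var}(\bar{X} - \bar{Y}) = \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y}) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \approx \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}$$
Distribution? Does the CLT still apply?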
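Case 1: both samples are large ($n_1 \ge 30$, $n_2 \ge 30$) (Sec. 8.2)
$$\bar{X} - \bar{Y} \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right),$$
or
$$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1), \qquad \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$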
Example 1:
It is believed that the resistance of a certain electric wire can be reduced by 0.05 ohm by alloying. (Assume the standard deviation of the resistance of any wire is 0.035 ohm.)
Samples of 32 standard wires and 32 alloyed wires are taken.
Question 1: Find the probability that the average resistance of the 32 standard wires is at least 0.03 ohm higher than that of the 32 alloyed wires.
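One way to check Question 1 numerically is the R sketch below. It assumes that alloying shifts the mean by exactly 0.05 ohm, that both wire types share the stated standard deviation of 0.035 ohm, and that the difference of sample means is approximately normal (Case 1). The variable names are illustrative and not from the slides.

```r
# Example 1, Question 1 (sketch): P(Xbar - Ybar >= 0.03)
# Assumptions: mu1 - mu2 = 0.05, sigma = 0.035 for both wire types, n1 = n2 = 32.
delta <- 0.05
sigma <- 0.035
n1 <- 32
n2 <- 32

se <- sqrt(sigma^2 / n1 + sigma^2 / n2)  # standard deviation of Xbar - Ybar
z <- (0.03 - delta) / se                 # standardized cutoff
prob <- 1 - pnorm(z)                     # upper-tail probability

print(c(se = se, z = z, prob = prob))
```

Under these assumptions the standard error is 0.00875 and the probability comes out near 0.989.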
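Confidence interval for $\mu_1 - \mu_2$:
$$(\bar{x} - \bar{y}) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Test statistic for $H_0: \mu_1 - \mu_2 = \delta_0$:
$$Z = \frac{(\bar{X} - \bar{Y}) - \delta_0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$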
Example 1:
It is believed that the resistance of a certain electric wire can be reduced by alloying. To verify this, a sample of 32 standard wires gives a sample mean of 0.136 ohm and a sample sd of 0.034 ohm, and a sample of 32 alloyed wires gives a sample mean of 0.083 ohm and a sample sd of 0.036 ohm.
Question 2: Construct a 95% confidence interval for the mean resistance reduction due to alloying.
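A possible way to carry out Question 2 in R, using the large-sample interval $(\bar{x} - \bar{y}) \pm z_{\alpha/2} \sqrt{s_1^2/n_1 + s_2^2/n_2}$, is sketched below. The variable names are illustrative; only the summary statistics come from the example.

```r
# Example 1, Question 2 (sketch): large-sample 95% CI for mu1 - mu2
xbar <- 0.136; s1 <- 0.034; n1 <- 32   # standard wires
ybar <- 0.083; s2 <- 0.036; n2 <- 32   # alloyed wires
conf <- 0.95

z_crit <- qnorm(1 - (1 - conf) / 2)        # z_{alpha/2}
se <- sqrt(s1^2 / n1 + s2^2 / n2)          # estimated standard error
ci <- (xbar - ybar) + c(-1, 1) * z_crit * se

print(round(ci, 4))
```

With these numbers the interval is roughly (0.036, 0.070) ohm. It lies entirely above 0, which is consistent with alloying reducing the mean resistance.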
Example 1:
It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 32 standard wires gives a sample mean of 0.136 ohm and a sample sd of 0.004 ohm, and a sample of 32 alloyed wires gives a sample mean of 0.083 ohm and a sample sd of 0.005 ohm.
Question 3: Can we support the claim at the $\alpha = 0.05$ level?
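Question 3 can be approached as a one-sided large-sample Z test. The sketch below assumes the hypotheses are $H_0: \mu_1 - \mu_2 = 0.05$ against $H_1: \mu_1 - \mu_2 > 0.05$, which is one natural reading of the claim. The variable names are illustrative.

```r
# Example 1, Question 3 (sketch): large-sample Z test
# Assumed hypotheses: H0: mu1 - mu2 = 0.05  vs  H1: mu1 - mu2 > 0.05
xbar <- 0.136; s1 <- 0.004; n1 <- 32   # standard wires
ybar <- 0.083; s2 <- 0.005; n2 <- 32   # alloyed wires
delta0 <- 0.05
alpha <- 0.05

z <- ((xbar - ybar) - delta0) / sqrt(s1^2 / n1 + s2^2 / n2)
p_value <- 1 - pnorm(z)           # upper-tail p-value
reject <- z > qnorm(1 - alpha)    # compare with z_alpha (about 1.645)

print(c(z = z, p_value = p_value))
print(reject)
```

Here z is about 2.65, which exceeds the critical value, so under this reading the data support the claim at the 0.05 level.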
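Case 2.1: Small sample(s), normal populations with known equal variance $\sigma^2$ (Sec. 8.3)
$$\bar{X} - \bar{Y} \sim N\left(\mu_1 - \mu_2,\ \sigma^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right),$$
or
$$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim N(0, 1)$$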
Example 1′:
It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 15 standard wires gives a sample mean of 0.136 ohm, and a sample of 15 alloyed wires gives a sample mean of 0.083 ohm. (Assume that the resistance has a normal distribution with standard deviation 0.0049 ohm for any type of wire.)
Question: Can we support the claim at the $\alpha = 0.05$ level?
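Because the common standard deviation is assumed known here, the Case 2.1 statistic applies even with small samples. The sketch below again assumes $H_0: \mu_1 - \mu_2 = 0.05$ against $H_1: \mu_1 - \mu_2 > 0.05$. The variable names are illustrative.

```r
# Example 1' (sketch): Z test with known common sigma, small samples
# Assumed hypotheses: H0: mu1 - mu2 = 0.05  vs  H1: mu1 - mu2 > 0.05
xbar <- 0.136; n1 <- 15   # standard wires
ybar <- 0.083; n2 <- 15   # alloyed wires
sigma <- 0.0049           # known sd, same for both wire types
delta0 <- 0.05
alpha <- 0.05

z <- ((xbar - ybar) - delta0) / (sigma * sqrt(1 / n1 + 1 / n2))
p_value <- 1 - pnorm(z)
reject <- z > qnorm(1 - alpha)

print(c(z = z, p_value = p_value))
print(reject)
```

The statistic is about 1.68, only slightly above the critical value of about 1.645, so the claim is supported at the 0.05 level but not by a wide margin.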
Case 2.2: Small sample(s), normal populations with unknown equal variance (Sec. 8.3)
$$\sigma^2 \approx S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$
$$t = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$$
Example 1′′:
It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 15 standard wires gives a sample mean of 0.136 ohm and a sample sd of 0.0049 ohm, and a sample of 15 alloyed wires gives a sample mean of 0.083 ohm and a sample sd of 0.0052 ohm. (Assume that the resistance has a normal distribution with the same variance for both types of wire.)
Question: Can we support the claim at the $\alpha = 0.05$ level?
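With the variances unknown but assumed equal, the pooled t test of Case 2.2 applies. The sketch below assumes $H_0: \mu_1 - \mu_2 = 0.05$ against $H_1: \mu_1 - \mu_2 > 0.05$. The variable names are illustrative.

```r
# Example 1'' (sketch): pooled two-sample t test from summary statistics
# Assumed hypotheses: H0: mu1 - mu2 = 0.05  vs  H1: mu1 - mu2 > 0.05
xbar <- 0.136; s1 <- 0.0049; n1 <- 15   # standard wires
ybar <- 0.083; s2 <- 0.0052; n2 <- 15   # alloyed wires
delta0 <- 0.05
alpha <- 0.05

df <- n1 + n2 - 2
sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)   # pooled sd
t_stat <- ((xbar - ybar) - delta0) / (sp * sqrt(1 / n1 + 1 / n2))
t_crit <- qt(1 - alpha, df)                            # about 1.701 for df = 28
reject <- t_stat > t_crit

print(c(t = t_stat, df = df, t_crit = t_crit))
print(reject)
```

Here t is about 1.63, below the critical value, so under this reading the claim is not supported at the 0.05 level. The difference in means is the same as in Example 1′, but estimating the variance changes the conclusion.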
Case 2.3: Small sample(s), normal populations with unequal variance (Sec. 8.3)
$$t' = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
has approximately a t distribution with estimated degrees of freedom
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}$$
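The estimated degrees of freedom are tedious to compute by hand, so a small helper can be useful. The function `welch_df` below is a hypothetical name introduced for illustration and is not part of base R. It is applied to the summary statistics of Example 1′′ as if the equal-variance assumption were dropped.

```r
# Estimated degrees of freedom for the unequal-variance t test (sketch).
# welch_df is an illustrative helper, not a base R function.
welch_df <- function(s1, n1, s2, n2) {
  a <- s1^2 / n1
  b <- s2^2 / n2
  (a + b)^2 / (a^2 / (n1 - 1) + b^2 / (n2 - 1))
}

# Summary statistics from Example 1'' (sd 0.0049 and 0.0052, n = 15 each)
df_est <- welch_df(0.0049, 15, 0.0052, 15)
print(df_est)

# With equal sds and equal sample sizes the formula reduces to n1 + n2 - 2.
print(welch_df(1, 10, 1, 10))
```

For Example 1′′ the estimate is about 27.9, only slightly below the pooled value of 28, because the two sample standard deviations are close.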
Matched Pairs Samples (Sec. 8.4)
Only one population, and one sample of size $n$, but two measurements:
$X_1, X_2, \ldots, X_n$
$Y_1, Y_2, \ldots, Y_n$
Since we are interested in the differences, this is really a one-sample problem:
$D_1, D_2, \ldots, D_n$, where $D_i = X_i - Y_i$.
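Sampling distribution of $\bar{D}$:
$$E(\bar{D}) = E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2 = \mu_D$$
$$\mathrm{Var}(\bar{D}) \ne \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y})$$
in general, because the two measurements in a pair are dependent. The variance of the differences is estimated directly:
$$S_D^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (D_i - \bar{D})^2$$
Test the hypothesis $\mu_D = 0$ vs. check whether the confidence interval contains 0.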
Example 2:
It is claimed that the resistance of a certain electric wire can be reduced by more than 0.05 ohm by alloying. To verify this, a sample of 15 wires is tested before the alloying and again after the alloying. We find a mean reduction of 0.063 ohm, and the sd of the reductions is 0.025 ohm. (Assume that the resistance has a normal distribution.)
Question: Can we support the claim at the $\alpha = 0.05$ level?
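Since the same 15 wires are measured twice, this is a one-sample t test on the differences. The sketch below assumes $H_0: \mu_D = 0.05$ against $H_1: \mu_D > 0.05$. The variable names are illustrative.

```r
# Example 2 (sketch): matched pairs t test from summary statistics
# Assumed hypotheses: H0: mu_D = 0.05  vs  H1: mu_D > 0.05
dbar <- 0.063    # mean of the 15 reductions
s_d <- 0.025     # sd of the 15 reductions
n <- 15
delta0 <- 0.05
alpha <- 0.05

t_stat <- (dbar - delta0) / (s_d / sqrt(n))
t_crit <- qt(1 - alpha, df = n - 1)   # about 1.761 for df = 14
reject <- t_stat > t_crit

print(c(t = t_stat, t_crit = t_crit))
print(reject)
```

Here t is about 2.01, above the critical value, so under this reading the claim is supported at the 0.05 level.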
Use R to compare two means:
t.test(X, Y, …) can be used to compare the means from two samples, where X and Y are vectors of data values of samples from two populations.
Other parameters:
• Format of alternative hypothesis
• Assumed mean in null hypothesis
• Dependent or independent samples
• Equal or unequal variances
• Level of significance $\alpha$
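The options listed above correspond to the `t.test` arguments `alternative`, `mu`, `paired`, `var.equal` and `conf.level`. There is no argument for $\alpha$ itself; the confidence level $1 - \alpha$ is passed instead. The sketch below uses made-up vectors `x` and `y`, which are hypothetical data chosen only to show the calls.

```r
# Hypothetical data, made up purely to illustrate the arguments.
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4)
y <- c(4.8, 4.7, 5.5, 5.2, 5.9, 5.1)

# Independent samples, pooled variance, two-sided, 95% confidence level.
res_pooled <- t.test(x, y, alternative = "two.sided", mu = 0,
                     paired = FALSE, var.equal = TRUE, conf.level = 0.95)

# Independent samples, unequal variances (the default when var.equal is omitted).
res_welch <- t.test(x, y, var.equal = FALSE)

# Matched pairs, one-sided alternative: mean difference greater than 0.
res_paired <- t.test(x, y, alternative = "greater", mu = 0, paired = TRUE)

# Degrees of freedom used by each test.
print(c(pooled = unname(res_pooled$parameter),
        welch = unname(res_welch$parameter),
        paired = unname(res_paired$parameter)))
```

Note that `var.equal = FALSE` is the default, so a pooled test has to be requested explicitly.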
Example 3:
Textbook Data: 8-11.TXT
Calif: 59, 68, 44, 71, 63, 46, 69, 54, 48
Org: 50, 36, 62, 52, 70, 41
> Calif=c(59,68,44,71,63,46,69,54,48)
> Org=c(50,36,62,52,70,41)
> t.test(Calif, Org, alternative = "two.sided",
mu = 0, var.equal = TRUE, conf.level = 0.95)
Two Sample t-test
data: Calif and Org
t = 1.0302, df = 13, p-value = 0.3217
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-6.764863 19.098196
sample estimates:
mean of x mean of y
58.00000 51.83333
Conclusion?
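The reported df = 13 equals $n_1 + n_2 - 2$, which is what the pooled-variance test of Case 2.2 gives. As a check, the statistic, p-value and interval can be rebuilt from the formulas. The sketch below uses only the data listed in Example 3; the intermediate variable names are illustrative.

```r
# Example 3 (sketch): pooled two-sample t test computed from the formulas
Calif <- c(59, 68, 44, 71, 63, 46, 69, 54, 48)
Org <- c(50, 36, 62, 52, 70, 41)
n1 <- length(Calif)
n2 <- length(Org)
df <- n1 + n2 - 2

# Pooled variance and standard error of the difference in means
sp2 <- ((n1 - 1) * var(Calif) + (n2 - 1) * var(Org)) / df
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

diff_means <- mean(Calif) - mean(Org)
t_stat <- diff_means / se
p_value <- 2 * (1 - pt(abs(t_stat), df))          # two-sided p-value
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se  # 95% CI

print(c(t = t_stat, df = df, p_value = p_value))
print(ci)
```

Since the p-value of about 0.32 is well above 0.05 and the interval contains 0, these data give no evidence of a difference between the two population means.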
Example 4:
Textbook Data: 8-16.TXT
wghtI    wghtII
11.23    11.27
14.36    14.41
8.33     8.35
10.5     10.52
23.42    23.41
9.15     9.17
13.47    13.52
6.47     6.46
12.4     12.45
19.38    19.35
> X=read.table(file.choose(),header=TRUE)
> t.test(X$wghtI,X$wghtII,alternative = "less",
mu = 0, paired = TRUE, conf.level = 0.95)
Paired t-test
data: X$wghtI and X$wghtII
t = -2.2056, df = 9, p-value = 0.02742
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -0.003377979
sample estimates:
mean of the differences
-0.02
Conclusion?
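The paired output can be reproduced as a one-sample t test on the differences, following the matched pairs approach of Sec. 8.4. The sketch below types in the data from Example 4 directly instead of reading the file; the intermediate variable names are illustrative.

```r
# Example 4 (sketch): paired t test computed from the differences
wghtI <- c(11.23, 14.36, 8.33, 10.5, 23.42, 9.15, 13.47, 6.47, 12.4, 19.38)
wghtII <- c(11.27, 14.41, 8.35, 10.52, 23.41, 9.17, 13.52, 6.46, 12.45, 19.35)

d <- wghtI - wghtII            # one sample of differences
n <- length(d)
se <- sd(d) / sqrt(n)          # standard error of the mean difference

t_stat <- mean(d) / se
p_value <- pt(t_stat, df = n - 1)              # lower tail: alternative "less"
upper <- mean(d) + qt(0.95, df = n - 1) * se   # one-sided 95% upper bound

print(c(mean_d = mean(d), t = t_stat, p_value = p_value, upper = upper))
```

Because the p-value of about 0.027 is below 0.05 and the upper confidence bound is negative, the data support the alternative that the mean of wghtI is less than the mean of wghtII.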