Probability and Statistics
Lecture 11
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
2013
President University
Erwin Sitompul
PBST 11/1
Chapter 8.6
Sampling Distribution of S²
• If S² is the variance of a random sample of size n taken from a
normal population having the variance σ², then the statistic
\[ \chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} \]
has a chi-squared distribution with v = n – 1 degrees of freedom.
• Table A.5 gives values of χ²_α for various values of α and v.
• The column headings are the areas α.
• The left column shows the degrees of freedom v.
• The table entries are the χ²_α values.
Table A.5 Chi-Squared Distribution
A manufacturer of car batteries guarantees that his batteries will
last, on the average, 3 years with a standard deviation of 1 year. If
five of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5, and 4.2
years, should the manufacturer still be convinced that his batteries
have a standard deviation of 1 year? Assume that the battery lifetime
follows a normal distribution.
\[ s^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n(n-1)} = \frac{(5)(48.26) - (15)^2}{(5)(4)} = 0.815 \]
\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(4)(0.815)}{(1)} = 3.26 \]
• From the table, 95% of the χ² values with 4 degrees of freedom fall
between 0.484 and 11.143.
• The computed value with σ² = 1 is reasonable.
• The manufacturer has no reason to doubt the current standard
deviation.
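The battery calculation can be reproduced in a few lines. This is a minimal sketch using only Python's standard library. The function name `chi_squared_statistic` is an illustrative choice, and the bounds 0.484 and 11.143 are the Table A.5 values quoted in the example, not values computed by the code.

```python
import statistics


def chi_squared_statistic(sample, sigma_squared):
    """Return (n - 1) * s^2 / sigma^2 for a sample and a hypothesized variance."""
    n = len(sample)
    s_squared = statistics.variance(sample)  # sample variance, divisor n - 1
    return (n - 1) * s_squared / sigma_squared


lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]  # battery lifetimes in years
chi2 = chi_squared_statistic(lifetimes, sigma_squared=1.0)

# Table A.5 values for v = 4 that enclose the central 95% of the distribution.
lower, upper = 0.484, 11.143
is_reasonable = lower <= chi2 <= upper

print(f"chi-squared = {chi2:.2f}, within [{lower}, {upper}]: {is_reasonable}")
```

A statistic outside the interval would suggest that the assumed value of σ² deserves a second look.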
Chapter 8.7
t-Distribution
• In the previous section we discussed the utility of the Central Limit
Theorem to infer a population mean or the difference between two
population means.
• This utility is based on the assumption that the population
standard deviation is known.
• However, in many experimental scenarios, knowledge of σ is no more
reasonable than knowledge of the population mean μ.
• Often, an estimate of σ must be supplied by the same sample
information that produced the sample average x̄.
• As a result, a natural statistic to consider to deal with inferences
on μ is
\[ T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \]
• If the sample size is large enough, say n ≥ 30, the distribution of T
does not differ considerably from the standard normal.
• However, for n < 30, the values of S² fluctuate considerably from
sample to sample and the distribution of T deviates appreciably
from that of a standard normal distribution.
• When the sample size is small, it is useful to deal with the
exact distribution of T.
• In developing the sampling distribution of T, we shall assume that
the random sample was selected from a normal population. We can
then write
\[ T = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{V/(n-1)}} \]
where
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \qquad V = \frac{(n-1)S^2}{\sigma^2} \]
• Let Z be a standard normal random variable and V a chi-squared
random variable with v degrees of freedom. If Z and V are
independent, then the distribution of the random variable T, where
\[ T = \frac{Z}{\sqrt{V/v}} \]
is given by the density function
\[ h(t) = \frac{\Gamma[(v+1)/2]}{\Gamma(v/2)\sqrt{\pi v}} \left( 1 + \frac{t^2}{v} \right)^{-(v+1)/2}, \qquad -\infty < t < \infty \]
This is known as the t-distribution with v degrees of freedom.
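The density h(t) can be evaluated directly from its formula. The sketch below is one possible way to do so with Python's standard library. The names `t_pdf` and `standard_normal_pdf` are illustrative, and the use of `math.lgamma` instead of the gamma function itself is an implementation choice made here to avoid overflow for large v.

```python
import math


def t_pdf(t, v):
    """Density h(t) of the t-distribution with v degrees of freedom."""
    # log of Gamma[(v+1)/2] / (Gamma(v/2) * sqrt(pi * v))
    log_coeff = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                 - 0.5 * math.log(math.pi * v))
    # log of (1 + t^2 / v) raised to the power -(v+1)/2
    log_kernel = -(v + 1) / 2 * math.log1p(t * t / v)
    return math.exp(log_coeff + log_kernel)


def standard_normal_pdf(z):
    """Density of the standard normal distribution, for comparison."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


# Height of the curve at t = 0 for increasing degrees of freedom.
for v in (2, 5, 30, 1000):
    print(f"v = {v:4d}: h(0) = {t_pdf(0.0, v):.4f}")
print(f"normal  : n(0) = {standard_normal_pdf(0.0):.4f}")
```

As v grows, the printed heights move toward the standard normal value, which is the behaviour expected of the t-distribution for large samples.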
• Let X_1, X_2, ..., X_n be independent random variables that are all
normal with mean μ and standard deviation σ. Let
\[ \bar{X} = \sum_{i=1}^{n} \frac{X_i}{n} \qquad \text{and} \qquad S^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n-1} \]
Then the random variable
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a t-distribution with v = n – 1 degrees of freedom.
• [Figure: the shape of t-distribution curves for v = 2, 5, and ∞]
Table A.4 t-Distribution
• It is customary to let t_α represent the t-value above which we find
an area equal to α.
• The t-distribution is symmetric about a mean of zero, that is,
t_{1–α} = –t_α.
• A t-value that falls below –t_{0.025} or above t_{0.025} would tend to
make us believe that either a very rare event has taken place or
perhaps our assumption about μ is in error.
• Should this happen, we shall make the latter decision and claim that
our assumed value of μ is in error.
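The t-value with v = 14 degrees of freedom that leaves an area of
0.025 to the left, and therefore an area of 0.975 to the right, is
\[ t_{0.975} = -t_{0.025} = -2.145 \]
Find P(–t_{0.025} < T < t_{0.05}).
Since t_{0.05} leaves an area of 0.05 to the right and –t_{0.025} leaves
an area of 0.025 to the left,
\[ \text{Area} = 1 - 0.05 - 0.025 = 0.925 \]
\[ P(-t_{0.025} < T < t_{0.05}) = 0.925 \]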
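The area 0.925 can be checked numerically by integrating the t-density between the two table values. The sketch below assumes v = 14 and uses the tabulated values 2.145 and 1.761. The helper names `t_pdf` and `simpson` are illustrative, and Simpson's rule is simply one convenient integration method, not something prescribed by the lecture.

```python
import math


def t_pdf(t, v):
    """Density of the t-distribution with v degrees of freedom."""
    log_coeff = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                 - 0.5 * math.log(math.pi * v))
    return math.exp(log_coeff - (v + 1) / 2 * math.log1p(t * t / v))


def simpson(func, a, b, intervals=2000):
    """Composite Simpson's rule on [a, b]; intervals must be even."""
    h = (b - a) / intervals
    total = func(a) + func(b)
    for i in range(1, intervals):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


v = 14
t_025, t_05 = 2.145, 1.761  # table values t_{0.025} and t_{0.05} for v = 14
area = simpson(lambda t: t_pdf(t, v), -t_025, t_05)

print(f"P(-t_0.025 < T < t_0.05) is approximately {area:.3f}")
```

The small remaining difference from 0.925 comes from the table values being rounded to three decimals.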
Find k such that P(k < T < –1.761) = 0.045, for a random sample of
size 15 selected from a normal distribution and
\[ T = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
From the t-distribution table, the value 1.761 corresponds to t_{0.05}
for v = 14, so –t_{0.05} = –1.761. Let k = –t_α. Then
\[ P(-t_\alpha < T < -t_{0.05}) = 0.045 = 0.05 - \alpha \quad \Rightarrow \quad \alpha = 0.005 \]
\[ k = -t_{0.005} = -2.977 \]
\[ \Rightarrow P(-2.977 < T < -1.761) = 0.045 \]
A chemical engineer claims that the population mean yield of a
certain batch process is 500 grams per milliliter of raw material. To
check this claim he samples 25 batches each month. If the computed
t-value falls between –t_{0.05} and t_{0.05}, he is satisfied with his claim.
What conclusion should he draw from a sample that has a mean x̄ =
518 g/ml and a sample standard deviation s = 40 g? Assume the
distribution of yields to be approximately normal.
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{518 - 500}{40/\sqrt{25}} = 2.25 \]
\[ P(-t_{0.05} < T < t_{0.05}) = P(-1.711 < T < 1.711) \quad \text{for } v = 24 \]
• The computed value 2.25 lies above t_{0.05} = 1.711, so the engineer
is not satisfied with the claim.
• From the t-distribution table, the value 2.25 corresponds to α
between 0.02 and 0.015.
• This means that, if the population mean is 500 g/ml, the probability
of obtaining a t-value of 2.25 or larger (that is, a sample mean of
518 g/ml or more) is only approximately 2%.
• It is more reasonable to assume that μ > 500.
• Hence, the engineer is likely to conclude that the process
produces a better product than he thought.
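The decision rule in this example amounts to computing one number and comparing it with a table value. A minimal sketch follows. The function name `t_value` is illustrative, and 1.711 is the table value t_{0.05} for v = 24 quoted in the example, not something the code derives.

```python
import math


def t_value(sample_mean, mu, sample_std, n):
    """Return t = (x_bar - mu) / (s / sqrt(n))."""
    return (sample_mean - mu) / (sample_std / math.sqrt(n))


t = t_value(sample_mean=518, mu=500, sample_std=40, n=25)

t_05 = 1.711  # table value t_{0.05} for v = n - 1 = 24
claim_accepted = -t_05 < t < t_05

print(f"t = {t:.2f}, inside (-{t_05}, {t_05}): {claim_accepted}")
```

The same function can be reused each month by passing in that month's sample mean and sample standard deviation.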
Chapter 8.8
F-Distribution
• While the t-distribution is motivated by the comparison between two
sample means, the F-distribution finds enormous application in
comparing sample variances.
• The statistic F is defined to be the ratio of two independent
chi-squared random variables, each divided by its number of degrees
of freedom. Hence, we can write
\[ F = \frac{U/v_1}{V/v_2} \]
where U and V are independent random variables having chi-squared
distributions with v_1 and v_2 degrees of freedom, respectively.
• Let U and V be two independent random variables having chi-squared
distributions with v_1 and v_2 degrees of freedom, respectively. Then
the distribution of the random variable
\[ F = \frac{U/v_1}{V/v_2} \]
is given by the density
\[ h(f) = \begin{cases} \dfrac{\Gamma[(v_1+v_2)/2]\,(v_1/v_2)^{v_1/2}}{\Gamma(v_1/2)\,\Gamma(v_2/2)} \, \dfrac{f^{v_1/2 - 1}}{(1 + v_1 f/v_2)^{(v_1+v_2)/2}}, & 0 < f < \infty \\ 0, & \text{elsewhere} \end{cases} \]
This is known as the F-distribution with v_1 and v_2 degrees of
freedom.
• Table A.6 in the reference gives values of f_α for α = 0.05 and α =
0.01 for various combinations of the degrees of freedom v_1 and v_2.
• Table A.6 can also be used to find values of f_{0.95} and f_{0.99}. The
theorem that allows this is presented later.
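The F-density can be coded from its formula and used to check a tabulated value. The sketch below is a hedged illustration. The names `f_pdf` and `simpson` are made up for this example, and the value f_{0.05}(10, 6) = 4.06 is assumed to be the corresponding entry of Table A.6, since the table itself is not reproduced in this transcript.

```python
import math


def f_pdf(f, v1, v2):
    """Density h(f) of the F-distribution with v1 and v2 degrees of freedom."""
    if f <= 0:
        return 0.0  # the density is zero outside 0 < f < infinity
    log_coeff = (math.lgamma((v1 + v2) / 2) - math.lgamma(v1 / 2)
                 - math.lgamma(v2 / 2) + (v1 / 2) * math.log(v1 / v2))
    log_kernel = ((v1 / 2 - 1) * math.log(f)
                  - (v1 + v2) / 2 * math.log1p(v1 * f / v2))
    return math.exp(log_coeff + log_kernel)


def simpson(func, a, b, intervals=2000):
    """Composite Simpson's rule on [a, b]; intervals must be even."""
    h = (b - a) / intervals
    total = func(a) + func(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


v1, v2 = 10, 6
f_05 = 4.06  # assumed Table A.6 entry f_{0.05}(10, 6)
area_below = simpson(lambda f: f_pdf(f, v1, v2), 0.0, f_05)

print(f"P(F < {f_05}) is approximately {area_below:.3f}")
```

An area of about 0.95 below f_{0.05} leaves about 0.05 above it, which matches the meaning of the subscript α.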
Table A.6 F-Distribution
• α = 0.05
• α = 0.01
• [Figure: typical F-distribution]
• [Figure: tabulated values of the F-distribution]
• Writing f_α(v_1, v_2) for f_α with v_1 and v_2 degrees of freedom, we
obtain
\[ f_{1-\alpha}(v_1, v_2) = \frac{1}{f_\alpha(v_2, v_1)} \]
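As a worked instance of this relation, take v_1 = 6 and v_2 = 10. The value f_{0.05}(10, 6) = 4.06 is assumed here to be the α = 0.05 entry of Table A.6, which is not reproduced in this transcript.

```latex
\[ f_{0.95}(6, 10) = \frac{1}{f_{0.05}(10, 6)} = \frac{1}{4.06} \approx 0.246 \]
```

Note that the degrees of freedom swap places when the reciprocal is taken.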
F-Distribution with Two Sample Variances
• If S_1² and S_2² are the variances of independent random samples of
size n_1 and n_2 taken from normal populations with variances σ_1²
and σ_2², respectively, then
\[ F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2} \]
has an F-distribution with v_1 = n_1 – 1 and v_2 = n_2 – 1 degrees of
freedom.
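The ratio can be computed from two samples as sketched below. Both samples are hypothetical and chosen only so that the arithmetic is easy to follow. The function name `f_statistic` and the assumption of equal population variances are likewise illustrative and do not come from the lecture.

```python
import statistics


def f_statistic(sample_1, sample_2, sigma1_squared, sigma2_squared):
    """Return F = (S1^2 / sigma1^2) / (S2^2 / sigma2^2) with v1 and v2."""
    s1_squared = statistics.variance(sample_1)  # divisor n1 - 1
    s2_squared = statistics.variance(sample_2)  # divisor n2 - 1
    f = (s1_squared / sigma1_squared) / (s2_squared / sigma2_squared)
    return f, len(sample_1) - 1, len(sample_2) - 1


# Hypothetical data, not taken from the lecture.
sample_1 = [2.0, 4.0, 6.0]            # sample variance 4.0
sample_2 = [1.0, 2.0, 3.0, 4.0, 5.0]  # sample variance 2.5

# Assumption: both populations have the same variance.
f, v1, v2 = f_statistic(sample_1, sample_2,
                        sigma1_squared=1.0, sigma2_squared=1.0)

print(f"F = {f:.2f} with v1 = {v1} and v2 = {v2}")
```

When the two population variances are assumed equal, they cancel and F reduces to the ratio of the sample variances.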
Probability and Statistics
Homework 10A
1. A maker of famous chocolate candies claims that their average calorie
content is 5 cal/g with a standard deviation of 1.2 cal/g. In a random
sample of 8 candies of this famous brand, the calorie content was found
to be 6, 7, 7, 3, 4, 5, 4, and 2 cal/g. Would you agree with the claim?
Use the chi-squared distribution and assume a normal distribution.
(Wal8.852 ep.283)
2. A small cleaning service company obtains a contract proposal
from a customer owning an office tower with 100 rooms. The
company has only 5 workers. It needs to determine its profit
margin by first finding out the time required by the workers to
finish cleaning 100 rooms. The first estimation is that the
workers would need 5.5 hours to clean the 100 rooms.
The company starts a probation period of two weeks, while
collecting data so that it can later charge the customer
correctly. The data collected by the company can be seen in the
following table.
After collecting this data, the company wants to determine if
the first estimation of 5.5 hours to finish cleaning 100 rooms
was reasonable. If the computed t-value falls between –t_{0.025}
and t_{0.025}, the company would be satisfied and will stay with its
first estimation. What is your opinion?
(Int.Rndvz ep.283)
Time to clean 100 rooms (hours)
5.5
7
6.4
4.5
3.9
7.1
5.6
5.8
7.8
4.6
4.5
5.5