x̄ = sample mean
Linear Transformation
Constants a and b: if y_i = a x_i + b for all i = 1, …, n,
then ȳ = a x̄ + b. Ex: temperature in F to C — compute the
average in F, then convert that average to C.
s² = sample variance = (1/(n−1)) Σ_{i=1}^n (x_i − x̄)²
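In code, the rule says that averaging first and converting once gives the same result as converting each reading and then averaging. A quick Python sketch (the readings are made-up illustration values):

```python
# Converting F to C is affine: c = a*f + b with a = 5/9, b = -160/9.
temps_f = [68.0, 77.0, 59.0, 86.0]   # hypothetical readings in Fahrenheit
a, b = 5.0 / 9.0, -160.0 / 9.0       # C = (F - 32) * 5/9 = a*F + b

mean_f = sum(temps_f) / len(temps_f)
mean_c_direct = a * mean_f + b       # convert the average
mean_c_percase = sum(a * f + b for f in temps_f) / len(temps_f)  # average the conversions
# Both routes agree, since ybar = a*xbar + b.
```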
Elements of Probability
Sample space S: set of all possible outcomes
Event E: any subset of the sample space
Union: E ∪ F, E || F
Intersection: E ∩ F, E && F, EF
Null event: ∅, has no outcomes
Complement: E^c, all outcomes in S but not in E
E^c and E are mutually exclusive; S^c = ∅
Containment: E ⊂ F (or F ⊃ E),
all outcomes in E are also in F
Example: E = {(H,H)} ⊂ F = {(H,H), (H,T), (T,H)}
Occurrence of E implies that of F
Example: getting two heads implies getting at least one head
Equality
E = F, if E ⊂ F and E ⊃ F
Commutative law: E ∪ F = F ∪ E, EF = FE
Associative law: (E ∪ F) ∪ G = E ∪ (F ∪ G), (EF)G = E(FG)
Distributive law: (E ∪ F)G = EG ∪ FG, EF ∪ G = (E ∪ G)(F ∪ G)
De Morgan's law: (E ∪ F)^c = E^c F^c, (EF)^c = E^c ∪ F^c
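The set identities above can be spot-checked with Python's built-in sets on a small sample space:

```python
# Verify De Morgan's laws on a small sample space using Python sets.
S = set(range(10))            # sample space
E = {0, 1, 2, 3}
F = {2, 3, 4, 5}

def comp(A):
    return S - A              # complement within S

law1 = comp(E | F) == comp(E) & comp(F)   # (E ∪ F)^c = E^c ∩ F^c
law2 = comp(E & F) == comp(E) | comp(F)   # (E ∩ F)^c = E^c ∪ F^c
```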
Axioms of Probability
Probability of event E = proportion of times E occurs
(outcome is in E) in repeated experiments
Axiomatic definition: For each event E of an experiment
with sample space S, assign a number P(E) satisfying
the following three axioms:
0 ≤ P(E) ≤ 1; P(S) = 1;
for any sequence of mutually exclusive events E_1, E_2, …,
P(⋃_{i=1}^n E_i) = Σ_{i=1}^n P(E_i), n = 1, 2, …, ∞.
We call P(E) the probability of event E.
Example: P(E^c) = 1 − P(E)
Proof
E^c and E are mutually exclusive, so P(E^c) + P(E) =
P(E^c ∪ E) = P(S) = 1
Example: P(E ∪ F) = P(E) + P(F) − P(EF)
Proof: use a Venn diagram
Odds of event E: P(E)/P(E^c) = P(E)/(1 − P(E))
P = 1/2, odds = 1: equally likely to occur as not
P = 3/4, odds = 3: three times as likely to occur as not
Conditional Probability
P(E|F) = P(EF)/P(F), so P(EF) = P(E)P(F|E)
Law of Total Probability
P(F) = Σ_{i=1}^n P(E_i)P(F|E_i)
P(F) = P(F|E)P(E) + P(F|E^c)P(E^c)
Proof: P(F) = P(FS) = P(F(E ∪ E^c))
F(E ∪ E^c) = FE ∪ FE^c, so P(F(E ∪ E^c)) = P(FE ∪ FE^c)
FE and FE^c are mutually exclusive, so
P(FE ∪ FE^c) = P(FE) + P(FE^c)
Bayes' Formula
P(E|F) = P(E)P(F|E)/P(F)
       = P(E)P(F|E) / [P(E)P(F|E) + P(E^c)P(F|E^c)]
P(E_i|F) = P(E_i)P(F|E_i) / Σ_{j=1}^n P(E_j)P(F|E_j)
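A minimal numeric sketch of Bayes' formula with the two-event partition {E, E^c}; the three input probabilities below are hypothetical illustration values:

```python
# Bayes' formula via the law of total probability.
# All three inputs are hypothetical illustration values.
p_e = 0.01            # P(E)
p_f_given_e = 0.95    # P(F|E)
p_f_given_ec = 0.10   # P(F|E^c)

# Law of total probability: P(F) = P(F|E)P(E) + P(F|E^c)P(E^c)
p_f = p_f_given_e * p_e + p_f_given_ec * (1 - p_e)

# Bayes: P(E|F) = P(E)P(F|E)/P(F)
p_e_given_f = p_e * p_f_given_e / p_f
```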
Independent Events
E ⊥ F, if P(EF) = P(E)P(F)
Then P(E|F) = P(EF)/P(F) = P(E)
Discrete r.v.
Pmf: p(x) = P{X = x} = P{e ∈ S: X(e) = x}
Cdf: F(x) = P{X ≤ x} = Σ_{y≤x} P{X = y} = Σ_{y≤x} p(y)
Continuous r.v.
Pdf f: P{X ∈ B} = ∫_B f(x)dx
Cdf: F(a) = ∫_{−∞}^a f(x)dx
Independent r.v.: X ⊥ Y, if for any two sets of real
numbers A and B, P{X ∈ A, Y ∈ B} = P{X ∈ A}P{Y ∈ B}
If X ⊥ Y: F(a, b) = F_X(a)F_Y(b)
p(x, y) = p_X(x)p_Y(y), for discrete RVs
f(x, y) = f_X(x)f_Y(y), for continuous RVs
Conditional Distribution
Discrete: p_{X|Y}(x|y) = P{X = x|Y = y} = P{X = x, Y = y}/P{Y = y} = p(x, y)/p_Y(y)
Continuous: f_{X|Y}(x|y) = f(x, y)/f_Y(y)
Expectation
Discrete RV: E[X] = Σ_i x_i p(x_i)
Continuous RV: E[X] = ∫_{−∞}^{∞} x f(x)dx
E[X + Y] = E[X] + E[Y]
E[Σ_{i=1}^n X_i] = Σ_{i=1}^n E[X_i]
If X and Y are independent, E[XY] = E[X]E[Y]
To solve optimization problems:
choose a quantity c to represent r.v. X
MSE = Mean Squared Error, MSE(c) = E[(X − c)²]
c = E[X] = μ minimizes the MSE: MSE(μ) ≤ MSE(c) for all c
Law of the Lazy Statistician: E[g(X)] = ∫_{−∞}^{∞} g(x)f(x)dx
(the distribution of g(X) need not be derived; knowing f(x) is enough)
E[aX + b] = aE[X] + b
E[g(X)] ≠ g(E[X]) except in the linear case
Variance: Var(X) = E[(X − μ)²] = E[X²] − μ²
always nonnegative; 0 iff X is constant
Var(aX + b) = a² Var(X)
Covariance
Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[XY] − μ_X μ_Y
Cov(X, Y) = Cov(Y, X),
Cov(X, X) = Var(X) ≥ 0
If X ⊥ Y, Cov(X, Y) = 0, but the converse may be false
Cov(aX + b, Y) = a Cov(X, Y)
Cov(X + Z, Y) = Cov(X, Y) + Cov(Z, Y)
Variance and Covariance
Variance of a sum of RVs:
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
If X ⊥ Y, Var(X + Y) = Var(X) + Var(Y)
For X_1, …, X_n, if X_i ⊥ X_j for all i ≠ j (mutually independent),
Var(Σ_{i=1}^n X_i) = Σ_{i=1}^n Var(X_i)
Correlation: Corr(X, Y) = Cov(X, Y)/√(Var(X)Var(Y)) = Cov(X, Y)/(σ(X)σ(Y))
−1 ≤ Corr(X, Y) ≤ 1; zero correlation does not imply independence
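A short check of the two equivalent covariance formulas and the correlation bound, on hypothetical paired data:

```python
import math

# Population covariance/correlation for paired data (hypothetical
# values), computed two ways: E[(X-μx)(Y-μy)] and E[XY] - μx·μy.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]
n = len(xs)

mx = sum(xs) / n
my = sum(ys) / n

cov1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
cov2 = sum(x * y for x, y in zip(xs, ys)) / n - mx * my

sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
corr = cov1 / (sx * sy)   # always lands in [-1, 1]
```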
Ex. Acceptance Sampling
Given an Operating Characteristic curve, we want the
probability of acceptance/rejection.
Split into 4 regions: accept an acceptable lot; reject a
nonacceptable lot (hit); reject an acceptable lot (supplier's
risk, false alarm/false positive, type I error); accept a
nonacceptable lot (inspector's risk, miss, type II error).
As the number of products inspected goes up, the curve gets
steeper; ideally it is a step function.
Poisson
Number of event occurrences during a time period
Example: arrival of customers
We observe that on average λ customers come to the
store in 1 hour
Divide 1 hour into n intervals (e.g. 3600 seconds)
On average, λ out of the n intervals have a customer arrive
When n is large, no interval has more than 1 arrival in it
Each interval has an arrival with probability λ/n
Number of arrivals in one hour: binomial distribution
p(k) = C(n, k) (λ/n)^k (1 − λ/n)^{n−k}
As n → ∞, p(k) → e^{−λ} λ^k/k!
Poisson replaces the binomial when n is large and p is small
μ = E[Bin(n, λ/n)] = λ, σ² = λ(1 − λ/n) ≈ λ
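The binomial-to-Poisson limit can be seen numerically: with n = 3600 intervals and λ = 2, the two pmfs already agree to about three decimal places:

```python
import math

# Compare the Binomial(n, λ/n) pmf with its Poisson(λ) limit.
lam, n = 2.0, 3600            # e.g. 3600 one-second intervals per hour
p = lam / n

def binom_pmf(k):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def pois_pmf(k):
    return math.exp(-lam) * lam**k / math.factorial(k)

# Largest pointwise gap over the first few counts.
max_gap = max(abs(binom_pmf(k) - pois_pmf(k)) for k in range(10))
```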
Uniform
If we want the range to be [α, β], consider Y = α + (β − α)X,
where X is uniform on [0, 1]
CDF: X ≤ x ⇔ Y ≤ α + (β − α)x, so Y ≤ y ⇔ X ≤ (y − α)/(β − α)
F_Y(y) = F_X((y − α)/(β − α)) = (y − α)/(β − α), if y ∈ [α, β]
PDF: f_Y(y) = (d/dy)F_Y(y) = 1/(β − α)
Mean: E[Y] = α + (β − α)E[X] = (α + β)/2
Variance: Var(Y) = (β − α)² Var(X) = (β − α)²/12
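The mean and variance formulas can be sanity-checked by numerically integrating y·f(y) and (y − mean)²·f(y) with the midpoint rule:

```python
# Check the Uniform[α, β] mean and variance formulas by numeric
# integration against f(y) = 1/(β - α) on [α, β].
alpha, beta = 2.0, 5.0
width = beta - alpha

N = 100_000
dy = width / N
ys = [alpha + (i + 0.5) * dy for i in range(N)]   # midpoint rule

mean = sum(y * (1 / width) for y in ys) * dy       # expect (α+β)/2 = 3.5
var = sum((y - mean) ** 2 * (1 / width) for y in ys) * dy  # expect (β-α)²/12 = 0.75
```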
Exponential has PDF f(x) = λe^{−λx}, for x ≥ 0
CDF: F(x) = ∫_0^x λe^{−λy} dy = −e^{−λy}|_0^x = 1 − e^{−λx}
For a non-negative random variable X with tail
distribution F̄(x),
E[X] = ∫_0^∞ F̄(x)dx
For the exponential:
E[X] = ∫_0^∞ e^{−λx} dx = −(1/λ)e^{−λx}|_0^∞ = 1/λ
Var(X) = 1/λ²
Markov's Inequality
X is a nonnegative random variable;
then for any a > 0, P{X ≥ a} ≤ E[X]/a
Proof: E[X] = ∫_0^∞ x f(x)dx = ∫_0^a x f(x)dx + ∫_a^∞ x f(x)dx
∫_0^a x f(x)dx ≥ 0 ⇒ E[X] ≥ ∫_a^∞ x f(x)dx
∫_a^∞ x f(x)dx ≥ ∫_a^∞ a f(x)dx = a P{X ≥ a}
Chebyshev's Inequality
X is a random variable with mean μ and variance σ²;
then P{|X − μ| ≥ k} ≤ σ²/k²
Proof: Consider the random variable (X − μ)², which is
nonnegative. Apply Markov's inequality with a = k²:
P{(X − μ)² ≥ k²} ≤ E[(X − μ)²]/k² = σ²/k²
Normal
PDF: f(x) = (1/(√(2π)σ)) e^{−(x−μ)²/(2σ²)}
If X ∼ N(μ, σ²) and Y = a + bX, then Y ∼ N(a + bμ, b²σ²)
Z = (X − μ)/σ, then Z ∼ N(0, 1), the standard normal dist
CDF: Φ(x)
F_X(x) = P{X ≤ x} = P{Z ≤ (x − μ)/σ} = Φ((x − μ)/σ)
Φ(−x) = P{Z ≤ −x} = P{Z ≥ x} = 1 − Φ(x)
z_α: 1 − Φ(z_α) = Φ̄(z_α) = α
The (1 − α)100-th percentile of Z: z_α = Φ^{−1}(1 − α)
Sum of independent normal RVs is still normal:
X_1 ∼ N(μ_1, σ_1²), X_2 ∼ N(μ_2, σ_2²), X_1 ⊥ X_2
⇒ X_1 + X_2 ∼ N(μ_1 + μ_2, σ_1² + σ_2²)
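Both inequalities can be checked empirically; here on simulated exponential data (rate 1, so mean and variance are both 1):

```python
import random

# Empirical check of Markov's and Chebyshev's inequalities on an
# exponential sample (rate λ = 1: mean 1, variance 1).
random.seed(0)
n = 200_000
xs = [random.expovariate(1.0) for _ in range(n)]

mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

a, k = 3.0, 2.0
p_markov = sum(x >= a for x in xs) / n             # P{X >= a}
p_cheby = sum(abs(x - mean) >= k for x in xs) / n  # P{|X - μ| >= k}
# p_markov should sit below E[X]/a and p_cheby below σ²/k².
```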
What about X_1 − X_2? Since −X_2 ∼ N(−μ_2, σ_2²),
X_1 − X_2 ∼ N(μ_1 − μ_2, σ_1² + σ_2²)
Weak Law of Large Numbers
Let X_1, X_2, … be a sequence of independent and
identically distributed (i.i.d.) random variables, each
having mean E[X_i] = μ. Then for any ε > 0,
P{|(X_1 + X_2 + ⋯ + X_n)/n − μ| > ε} → 0, as n → ∞.
The average of i.i.d. RVs converges to their mean.
Samples
Sample mean: X̄ = (X_1 + X_2 + ⋯ + X_n)/n
E[X̄] = E[(1/n)(X_1 + ⋯ + X_n)] = (1/n)(E[X_1] + ⋯ + E[X_n]) = μ
Var(X̄) = Var((1/n)(X_1 + ⋯ + X_n))
       = (1/n²)(Var(X_1) + ⋯ + Var(X_n)) = σ²/n
Sample variance: S² = Σ_{i=1}^n (X_i − X̄)²/(n − 1)
Proof of the WLLN (assuming Var(X_i) = σ² exists):
E[(X_1 + ⋯ + X_n)/n] = μ, Var((X_1 + ⋯ + X_n)/n) = σ²/n
Apply Chebyshev's inequality:
P{|(X_1 + ⋯ + X_n)/n − μ| > ε} ≤ σ²/(nε²) → 0, as n → ∞
Joint r.v.
Joint CDF: F(x, y) = P{X ≤ x, Y ≤ y}
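A small simulation showing the deviation probability in the WLLN shrinking as n grows (Uniform[0,1] draws, μ = 1/2; the trial counts are arbitrary choices):

```python
import random

# Estimate P{|X̄ - μ| > ε} for small and large n (Uniform[0,1], μ = 1/2).
random.seed(1)
mu, eps = 0.5, 0.05

def p_deviation(n, trials=2000):
    bad = 0
    for _ in range(trials):
        xbar = sum(random.random() for _ in range(n)) / n
        bad += abs(xbar - mu) > eps
    return bad / trials

p_small = p_deviation(10)     # deviations are common for n = 10
p_large = p_deviation(1000)   # and essentially vanish for n = 1000
```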
Discrete RVs, joint PMF: p(x_i, y_j) = P{X = x_i, Y = y_j}
Marginal CDF: F_X(x) = F(x, ∞), i.e. Y can take any value
Marginal PMF: p_X(x) = Σ_{j=1}^∞ p(x, y_j)
Continuous joint PDF: P{(X, Y) ∈ C} = ∬_{(x,y)∈C} f(x, y)dx dy
Joint CDF: F(a, b) = ∫_{−∞}^a ∫_{−∞}^b f(x, y)dy dx
Marginal PDF: f_X(x) = ∫_{−∞}^∞ f(x, y)dy
Bernoulli: μ = p, σ² = p(1 − p)
Binomial: sum of n i.i.d. Bernoulli r.v.
P{X = k} = C(n, k)p^k(1 − p)^{n−k}, for k = 0, 1, …, n
Σ_{k=0}^n p(k) = Σ_{k=0}^n C(n, k)p^k(1 − p)^{n−k} = (p + 1 − p)^n = 1
μ = E[Y_1 + Y_2 + ⋯] = np
σ² = Var(Y_1 + Y_2 + ⋯) = np(1 − p)
For an arbitrary distribution with mean μ and var σ²:
for the sample mean X̄: E[X̄], Var(X̄);
approximate distribution when n is large
(Central Limit Theorem);
for the sample variance S²: E[S²].
For a normal distribution N(μ, σ²): exact distribution of
X̄ and S².
Central Limit Theorem
Let X_1, X_2, …, X_n be a sequence of i.i.d. random
variables with mean μ and variance σ²;
then the distribution of √n(X̄ − μ)/σ converges to the
standard normal as n → ∞:
P{√n(X̄ − μ)/σ ≤ x} → Φ(x) ≈ P{Z < x}
√n(X̄ − μ)/σ → N(0, 1)
Generally works as long as n ≥ 30
Question: approximate distribution of the sum?
X_1 + X_2 + ⋯ + X_n ≈ N(nμ, nσ²)
Question: approximate distribution of the sample mean?
X̄ ≈ N(μ, σ²/n)
Confidence intervals for the normal mean (known σ):
recall (X̄ − μ)/(σ/√n) ∼ Z
P{−z_{α/2} < √n(X̄ − μ)/σ < z_{α/2}} = 1 − α
P{X̄ − z_{α/2} σ/√n < μ < X̄ + z_{α/2} σ/√n} = 1 − α
Two-sided: (x̄ − z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n)
One-sided upper: (x̄ − z_α σ/√n, ∞)
One-sided lower: (−∞, x̄ + z_α σ/√n)
t-distribution: if Z ⊥ χ_n², then T_n = Z/√(χ_n²/n) is t
with n degrees of freedom
(n − 1)S²/σ² ∼ χ²_{n−1}
(X̄ − μ)/(S/√n) = [√n(X̄ − μ)/σ] / √{[(n − 1)S²/σ²]/(n − 1)} ∼ t_{n−1}
Estimating Difference in Means
X̄ − Ȳ ∼ N(μ_1 − μ_2, σ_1²/n + σ_2²/m)
Ex. The College of Engineering decided to admit 450
students; there is a 0.3 probability that an admitted
student will actually enroll. What is the probability that
the incoming class has more than 150 students?
Solution: X_i ∼ Bernoulli(0.3),
S = X_1 + X_2 + ⋯ + X_450 ∼ Binomial(450, 0.3)
P{S > 150} = P{S = 151} + P{S = 152} + ⋯ + P{S = 450}
X_1 + ⋯ + X_450 is approximately normal with mean
450 × 0.3 = 135 and standard deviation
√(450 × 0.3 × 0.7) = √94.5
Approximate the discrete r.v. (S = X_1 + ⋯ + X_450)
with a continuous r.v. (Y ∼ N(135, 94.5))
Question: P{S = k} ≈ P{Y ∈ ?}
P{S = k} ≈ P{k − 0.5 ≤ Y ≤ k + 0.5}
(continuity correction)
P{X_1 + ⋯ + X_450 > 150} = Σ_{k=151}^{450} P{X_1 + ⋯ + X_450 = k}
≈ Σ_{k=151}^{450} P{k − 0.5 ≤ Y ≤ k + 0.5} = P{Y ≥ 150.5}
P{Y ≥ 150.5} = P{Z ≥ (150.5 − 135)/√94.5} = 1 − Φ((150.5 − 135)/√94.5)
≈ 1 − Φ(1.59) ≈ 0.06
Continuity Correction
When using a continuous distribution to approximate a
discrete one, e.g. normal approx. to binomial:
If P(X = n), use P(n − 0.5 < X < n + 0.5)
If P(X > n), use P(X > n + 0.5)
If P(X ≤ n), use P(X < n + 0.5)
If P(X < n), use P(X < n − 0.5)
If P(X ≥ n), use P(X > n − 0.5)
Parameter Estimation
A population has a certain distribution with unknown
parameter θ. Estimate θ based on X_1, …, X_n.
Θ̂(X_1, …, X_n), a function of the sample, is a r.v.;
it is also a point estimator.
Moment Generating Function
φ(t) = E[e^{tX}] = Σ_x e^{tx} p(x) = ∫ e^{tx} f(x)dx
The n-th moment is the n-th derivative (at t = 0)
Method of Moments
Moments can be expressed as functions of the parameters:
E[X^m] = g_m(θ_1, θ_2, …, θ_k)
If there are k unknown parameters, set the sample moments
X̄^(m) = (X_1^m + ⋯ + X_n^m)/n equal to the model moments:
g_1(θ_1, θ_2, …, θ_k) = X̄^(1)
g_2(θ_1, θ_2, …, θ_k) = X̄^(2)
⋮
g_k(θ_1, θ_2, …, θ_k) = X̄^(k)
and solve for θ_1, …, θ_k.
Maximum Likelihood Estimators
Given parameter θ, for each observation x_1, x_2, …, x_n we
can calculate the joint density function f(x_1, x_2, …, x_n; θ)
x_1, x_2, …, x_n are independent, so
f(x_1, x_2, …, x_n; θ) = f_{X_1}(x_1; θ)f_{X_2}(x_2; θ) ⋯ f_{X_n}(x_n; θ)
Given observations x_1, x_2, …, x_n, for each θ define the
likelihood L(θ) = f(x_1, x_2, …, x_n; θ)
The maximum likelihood estimator (MLE) maximizes L(θ)
For the MLE, use the pdf of the distribution, with θ in
place of the parameters as needed, and multiply together
all n samples in this pdf.
It's easier to maximize log L(θ) ≜ l(θ)
C.I. for the Difference in Means
Normalize: (X̄ − Ȳ − (μ_1 − μ_2))/√(σ_1²/n + σ_2²/m) ∼ N(0, 1)
P{−z_{α/2} ≤ (X̄ − Ȳ − (μ_1 − μ_2))/√(σ_1²/n + σ_2²/m) ≤ z_{α/2}} = 1 − α
P{X̄ − Ȳ − z_{α/2}√(σ_1²/n + σ_2²/m) ≤ μ_1 − μ_2 ≤ X̄ − Ȳ + z_{α/2}√(σ_1²/n + σ_2²/m)} = 1 − α
(x̄ − ȳ − z_{α/2}√(σ_1²/n + σ_2²/m), x̄ − ȳ + z_{α/2}√(σ_1²/n + σ_2²/m))
is a 1 − α C.I. for the difference in the means of two normal
populations when the variances are known.
When the variances are unknown, assume them to be equal and
use the Pooled Sample Variance
S_p² = [(n − 1)S_1² + (m − 1)S_2²]/[(n − 1) + (m − 1)]
(X̄ − Ȳ − (μ_1 − μ_2))/√(S_p²(1/n + 1/m)) ∼ t_{n+m−2}
(x̄ − ȳ − t_{α/2,n+m−2} s_p √(1/n + 1/m), x̄ − ȳ + t_{α/2,n+m−2} s_p √(1/n + 1/m))
is a 1 − α C.I. for the difference in the means of two normal
populations when the variances are unknown but equal.
Approximate Confidence Interval
If the population is not normal but the sample size is
relatively large, use the results for the normal population
mean with known variance to obtain an approximate 1 − α c.i.:
(x̄ − z_{α/2} s/√n, x̄ + z_{α/2} s/√n)
Sample Variance
S² = (1/(n − 1)) Σ_{i=1}^n (X_i − X̄)²
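The College of Engineering example can be reproduced directly, using `math.erf` for Φ and comparing against the exact binomial tail:

```python
import math

# S ~ Binomial(450, 0.3), approximated by Y ~ N(135, 94.5)
# with a continuity correction.
def phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

n, p = 450, 0.3
mu = n * p                        # 135
sd = math.sqrt(n * p * (1 - p))   # sqrt(94.5)

approx = 1 - phi((150.5 - mu) / sd)   # P{S > 150} ≈ P{Y ≥ 150.5}

# Exact binomial tail for comparison.
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(151, n + 1))
```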
๐‘› (๐‘‹
๐‘›
2
2
2
โˆ‘๐‘–=1 ๐‘– โˆ’ ๐‘‹ฬ…) = โˆ‘๐‘–=1 ๐‘‹๐‘– โˆ’ ๐‘›๐‘‹ฬ…
MLE of exponential dist is the sample mean
๐‘ 
๐‘ 
๐‘›
๐‘›
2
๐‘›
2
2
2
ฬ…
ฬ…
(๐‘ฅฬ… โˆ’ ๐‘ง๐›ผโ„2 , ๐‘ฅฬ… + ๐‘ง๐›ผโ„2 )
๐ธ[โˆ‘๐‘–=1(๐‘‹๐‘– โˆ’ ๐‘‹) ] = ๐ธ[โˆ‘๐‘–=1 ๐‘‹๐‘– โˆ’ ๐‘›๐‘‹ ] = ๐ธ[โˆ‘๐‘–=1 ๐‘‹๐‘– ] โˆ’
Example: Bernoulli distribution
โˆš๐‘›
โˆš๐‘›
2
2
2
๐ธ[๐‘›๐‘‹ฬ… ] = ๐‘›๐ธ[๐‘‹1 ] โˆ’ ๐‘›๐ธ[๐‘‹ฬ… ]
Ex. Monte Carlo Simul.
๐‘ƒ{๐‘‹ = 1} = ๐œƒ = ๐œƒ 1
๐‘
Var(๐‘‹) = ๐ธ[๐‘‹ 2 ] โˆ’ ๐ธ[๐‘‹]2 โ‡’ ๐ธ[๐‘‹ 2 ] = Var(๐‘‹) + ๐ธ[๐‘‹]2
๐‘ƒ{๐‘‹ = 0} = 1 โˆ’ ๐œƒ = (1 โˆ’ ๐œƒ)1โˆ’0
Evaluate a definite integral ๐œƒ = โˆซ๐‘Ž ๐‘”(๐‘ฅ)๐‘‘๐‘ฅ
๐ธ[๐‘‹12 ] = Var(๐‘‹1 ) + ๐ธ[๐‘‹1 ]2 = ๐œŽ 2 + ๐œ‡ 2
๐‘๐‘‹ (๐‘ฅ; ๐œƒ) = ๐œƒ ๐‘ฅ (1 โˆ’ ๐œƒ)1โˆ’๐‘ฅ
Consider an uniform random variable ๐‘ˆ in [๐‘Ž, ๐‘]
๐ธ[๐‘‹ฬ… 2 ] = Var(๐‘‹ฬ…) + ๐ธ[๐‘‹ฬ…]2 = ๐œŽ 2 โ„๐‘› + ๐œ‡ 2
log ๐‘๐‘‹ (๐‘ฅ; ๐œƒ) = ๐‘ฅ log ๐œƒ + (1 โˆ’ ๐‘ฅ) log(1 โˆ’ ๐œƒ)
Law of lazy statistician
๐‘› (๐‘‹
2]
2
2
2
2
2
๐‘›
๐‘›
ฬ…
)
(๐‘›
๐ธ[โˆ‘๐‘–=1 ๐‘– โˆ’ ๐‘‹ = ๐‘›๐œŽ + ๐‘›๐œ‡ โˆ’ ๐œŽ โˆ’ ๐‘›๐œ‡ = โˆ’ 1)๐œŽ ๐‘™(๐œƒ) = โˆ‘๐‘–=1 ๐‘ฅ๐‘– log ๐œƒ + (๐‘› โˆ’ โˆ‘๐‘–=1 ๐‘ฅ๐‘– ) log(1 โˆ’ ๐œƒ)
๐‘
๐‘
1
๐œƒ
2]
2
๐‘›
๐‘›
๐ธ[๐‘”(๐‘ˆ)] = โˆซ๐‘Ž ๐‘”(๐‘ฅ)๐‘“๐‘ˆ (๐‘ข)๐‘‘๐‘ข =
โˆซ ๐‘”(๐‘ข)๐‘‘๐‘ข = ๐‘โˆ’๐‘Ž
โˆ‘
๐‘‘
๐‘ฅ
๐‘›โˆ’โˆ‘๐‘–=1 ๐‘ฅ๐‘–
๐ธ[๐‘† = ๐œŽ , so unbiased
๐‘โˆ’๐‘Ž ๐‘Ž
๐‘™(๐œƒ) = ๐‘–=1 ๐‘– โˆ’
๐‘‘๐œƒ
๐œƒ
1โˆ’๐œƒ
Estimate ๐œƒ
Sampling from a Normal Population
(1 โˆ’ ๐œƒ) โˆ‘๐‘›๐‘–=1 ๐‘ฅ๐‘– = ๐‘›๐œƒ โˆ’ ๐œƒ โˆ‘๐‘›๐‘–=1 ๐‘ฅ๐‘–
Simulate uniform random variables ๐‘ˆ1 , โ€ฆ , ๐‘ˆ๐‘›
๐‘‹ฬ… โˆผ ๐’ฉ(๐œ‡, ๐œŽ 2 โ„๐‘›)
๐‘›
โˆ‘
๐‘ฅ
๐‘›
โˆ‘๐‘› ๐‘”(๐‘ˆ )
Chi-square distribution ๐œ’๐‘›2 = ๐‘12 + ๐‘22 + โ‹ฏ + ๐‘๐‘›2 with ๐‘›
๐œƒฬ‚(๐‘‹1 , โ€ฆ , ๐‘‹๐‘› ) = ๐‘‹ฬ… = ๐‘–=1 ๐‘– = 1
๐œƒฬ‚ = (๐‘ โˆ’ ๐‘Ž) ๐‘–=1 ๐‘–
๐‘›
๐‘›
๐‘›
degrees of freedom, sum of indep chi-squar is chi-square Normal: ๐œ‡ฬ‚ (๐‘‹ , โ€ฆ , ๐‘‹ ) = ๐‘‹ฬ…
1
๐‘›
Let ๐œƒฬ‚ and ๐‘  be the observed sample mean and standard
1
โˆ‘๐‘›๐‘–=1(๐‘‹๐‘– โˆ’ ๐‘‹ฬ…)2
๐‘†2 =
deviation of the ๐‘”(๐‘ข๐‘– )โ€™s
๐‘›โˆ’1
๐œŽฬ‚(๐‘‹1 , โ€ฆ , ๐‘‹๐‘› ) = โˆšโˆ‘๐‘›๐‘–=1(๐‘‹๐‘– โˆ’ ๐‘‹ฬ…)2 โ„๐‘› ,๐œŽฬ‚ 2 = (๐‘› โˆ’ 1)๐‘† 2 /๐‘›
2
๐‘ 
๐‘ 
โˆ‘๐‘›๐‘–=1(๐‘‹๐‘– โˆ’ ๐‘‹ฬ…)2 = โˆ‘๐‘›๐‘–=1((๐‘‹๐‘– โˆ’ ๐œ‡) โˆ’ (๐‘‹ฬ… โˆ’ ๐œ‡))
Approximate C.I. (๐œƒฬ‚ โˆ’ ๐‘ง๐›ผโ„2 , ๐œƒฬ‚ + ๐‘ง๐›ผโ„2 )
โˆš๐‘›
โˆš๐‘›
Uniform on [0,ฮ˜], ๐œƒฬ‚(๐‘‹1 , โ€ฆ , ๐‘‹๐‘› ) = max{๐‘‹1 , โ€ฆ , ๐‘‹๐‘› }
๐‘Œ๐‘– = ๐‘‹๐‘– โˆ’ ๐œ‡, ๐‘Œฬ… = ๐‘‹ฬ… โˆ’ ๐œ‡
2
Poisson: ๐œ†ฬ‚(๐‘‹1 , โ€ฆ , ๐‘‹๐‘› ) = ๐‘‹ฬ…
โˆ‘๐‘›๐‘–=1((๐‘‹๐‘– โˆ’ ๐œ‡) โˆ’ (๐‘‹ฬ… โˆ’ ๐œ‡)) = โˆ‘๐‘›๐‘–=1(๐‘Œ๐‘– โˆ’ ๐‘Œฬ…)2 = โˆ‘๐‘›๐‘–=1 ๐‘Œ๐‘–2 โˆ’
Point Estimators
๐‘›๐‘Œฬ… 2
Biased/unbiased and precise/imprecise
๐‘› (๐‘‹
๐‘›
2
2
2
โˆ‘๐‘–=1 ๐‘– โˆ’ ๐‘‹ฬ…) = โˆ‘๐‘–=1(๐‘‹๐‘– โˆ’ ๐œ‡)๐‘– โˆ’ ๐‘›(๐‘‹ฬ… โˆ’ ๐œ‡)
Bias of estimator ๐œƒฬ‚(๐‘‹1 , โ€ฆ , ๐‘‹๐‘› ) for parameter ๐œƒ
Rearrange the terms
ฬ‚) = ๐ธ[๐œƒฬ‚(๐‘‹1 , โ€ฆ , ๐‘‹๐‘› )] โˆ’ ๐œƒ
๐‘(๐œƒ
2
๐‘›
๐‘›
2
2
โˆ‘๐‘–=1(๐‘‹๐‘– โˆ’ ๐œ‡) = โˆ‘๐‘–=1(๐‘‹๐‘– โˆ’ ๐‘‹ฬ…) + ๐‘›(๐‘‹ฬ… โˆ’ ๐œ‡)
Confidence Interval
2-sided, 1-sided upper/lower
Ex. normal 2-sided 95% confidence:
P{−1.96 < √n(X̄ − μ)/σ < 1.96} = 2Φ(1.96) − 1 = 0.95
⇔ P{−1.96 σ/√n < X̄ − μ < 1.96 σ/√n} = 0.95
⇔ P{−1.96 σ/√n < μ − X̄ < 1.96 σ/√n} = 0.95
⇔ P{X̄ − 1.96 σ/√n < μ < X̄ + 1.96 σ/√n} = 0.95
If σ is given and we have observed sample mean x̄,
(x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n) is a 95% confidence interval
estimate of μ.
Sampling from a normal population, continued:
divide the decomposition of the sum of squares by σ²:
Σ_{i=1}^n (X_i − μ)²/σ² = Σ_{i=1}^n (X_i − X̄)²/σ² + (X̄ − μ)²/(σ²/n)
Z_i = (X_i − μ)/σ; Z_1, Z_2, …, Z_n are independent standard
normal, so Σ_{i=1}^n (X_i − μ)²/σ² is chi-square with n degrees
of freedom; the last term (X̄ − μ)²/(σ²/n) is χ_1²:
χ_n² = Σ_{i=1}^n (X_i − X̄)²/σ² + χ_1²
If sampling from a normal population, then X̄ ⊥ S²:
the sample mean and sample variance are independent r.v.
X̄ ∼ N(μ, σ²/n)
(n − 1)S²/σ² = Σ_{i=1}^n (X_i − X̄)²/σ² ∼ χ²_{n−1}
Once observed, x̄ is a constant and the interval is fixed;
95% is not a probability (it is the coverage of the procedure).
Two-sided Confidence Interval (1 − α) for the normal mean
with known variance: (x̄ − z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n)
Ex. CPU time for a type of job is normally distributed with
mean 20 seconds and standard deviation 3 seconds.
15 such jobs are tested; what is the probability that the
sample variance exceeds 12?
Solution: (15 − 1)S²/3² ∼ χ²_14
P{S² > 12} = P{(14/9)S² > (14/9) × 12} = P{χ²_14 > 18.67} ≈ 0.1779
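For even degrees of freedom the chi-square tail has a closed form (χ²_{2m} is a Gamma(m, rate 1/2), i.e. Erlang), so the CPU-time answer can be verified exactly in a few lines:

```python
import math

# Tail P{χ²_k > x} for even k: χ²_{2m} is Erlang(m, rate 1/2), so
# P{χ²_{2m} > x} = e^{-x/2} · Σ_{j=0}^{m-1} (x/2)^j / j!
def chi2_tail_even(k, x):
    m = k // 2
    h = x / 2.0
    return math.exp(-h) * sum(h**j / math.factorial(j) for j in range(m))

# CPU-time example: (15-1)S²/3² ~ χ²_14, P{S² > 12} = P{χ²_14 > 14·12/9}
p = chi2_tail_even(14, 14 * 12 / 9)   # ≈ 0.178
```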