x̄ = sample mean
Linear Transformation
Constants a and b: if yᵢ = a xᵢ + b for i = 1, …, n,
then ȳ = a x̄ + b
Ex: temperature in °F to °C; compute the average in °F,
then convert that average to °C
s² = sample variance = (1/(n − 1)) Σᵢ₌₁ⁿ (xᵢ − x̄)²
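A quick numeric check of the two rules above, as a minimal Python sketch (the temperature readings are made up for illustration):

```python
import statistics

# Sample mean, sample variance (n - 1 denominator), and the
# linear-transformation rule y_i = a*x_i + b  =>  ybar = a*xbar + b.
# The temperature readings are hypothetical.
temps_f = [68.0, 71.6, 75.2, 78.8]
mean_f = statistics.mean(temps_f)
var_f = statistics.variance(temps_f)  # sample variance, divides by n - 1

def f_to_c(f):
    # linear transformation with a = 5/9, b = -160/9
    return (f - 32) * 5 / 9

mean_c = f_to_c(mean_f)                                      # convert the average
mean_c_direct = statistics.mean(f_to_c(t) for t in temps_f)  # average the conversions
print(mean_f, var_f, mean_c, mean_c_direct)
```

Converting the average gives the same result as averaging the conversions, because °F to °C is a linear transformation.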
Elements of Probability
Sample space S: set of all possible outcomes
Event E: any subset of the sample space
Union: E ∪ F, E || F
Intersect: E ∩ F, E && F, EF
Null event: ∅ has no outcomes
Complement: Eᶜ, all outcomes in S but not in E
Eᶜ and E are mutually exclusive: EEᶜ = ∅
Containment: E ⊂ F or F ⊃ E,
all outcomes in E are also in F
Example: E = {(H,H)} ⊂ F = {(H,H), (H,T), (T,H)}
Occurrence of E implies that of F
Example: getting two heads implies getting a head at all
Equality: E = F, if E ⊂ F and F ⊂ E
Commutative law: E ∪ F = F ∪ E, EF = FE
Associative law: (E ∪ F) ∪ G = E ∪ (F ∪ G), (EF)G = E(FG)
Distributive law: (E ∪ F)G = EG ∪ FG, EF ∪ G = (E ∪ G)(F ∪ G)
DeMorgan's law: (E ∪ F)ᶜ = EᶜFᶜ, (EF)ᶜ = Eᶜ ∪ Fᶜ
Axioms of Probability
Probability of event E = proportion of times E occurs
(outcome is in E) in repeated experiments
Axiomatic definition: for each event E of an experiment
with sample space S, assign a number P(E), where P(E)
satisfies the following three axioms:
0 ≤ P(E) ≤ 1
P(S) = 1
For any sequence of mutually exclusive events E₁, E₂, …,
P(∪ᵢ₌₁ⁿ Eᵢ) = Σᵢ₌₁ⁿ P(Eᵢ), n = 1, 2, …, ∞
We call P(E) the probability of event E
Example: P(Eᶜ) = 1 − P(E)
Proof: Eᶜ and E are mutually exclusive, so P(Eᶜ) + P(E) =
P(Eᶜ ∪ E) = P(S) = 1
Example: P(E ∪ F) = P(E) + P(F) − P(EF)
Proof: use a Venn diagram
Odds of event E: P(E)/P(Eᶜ) = P(E)/(1 − P(E))
p = 1/2: odds = 1, it is equally likely to occur as not
p = 3/4: odds = 3, it is 3 times as likely to occur as not
Conditional Probability
P(E|F) = P(EF)/P(F),  P(EF) = P(E)P(F|E)
Law of Total Probability
P(F) = Σᵢ₌₁ⁿ P(Eᵢ)P(F|Eᵢ)
P(F) = P(F|E)P(E) + P(F|Eᶜ)P(Eᶜ)
Proof: P(F) = P(FS) = P(F(E ∪ Eᶜ))
F(E ∪ Eᶜ) = FE ∪ FEᶜ, so P(F(E ∪ Eᶜ)) = P(FE ∪ FEᶜ)
FE and FEᶜ are mutually exclusive, so
P(FE ∪ FEᶜ) = P(FE) + P(FEᶜ)
Bayes' Formula
P(E|F) = P(E)P(F|E)/P(F)
= P(E)P(F|E) / (P(E)P(F|E) + P(Eᶜ)P(F|Eᶜ))
P(Eᵢ|F) = P(Eᵢ)P(F|Eᵢ) / Σⱼ₌₁ⁿ P(Eⱼ)P(F|Eⱼ)
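Bayes' formula can be sketched numerically; the probabilities below are made-up values chosen only to illustrate the computation:

```python
# Bayes' formula with hypothetical numbers: prior P(E) = 0.01,
# P(F|E) = 0.95, P(F|E^c) = 0.05.
p_e = 0.01
p_f_given_e = 0.95
p_f_given_ec = 0.05

# denominator via the law of total probability
p_f = p_e * p_f_given_e + (1 - p_e) * p_f_given_ec
p_e_given_f = p_e * p_f_given_e / p_f
print(p_e_given_f)
```

Even with a highly accurate "test" (0.95 vs 0.05), the posterior stays small because the prior P(E) is small.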
Independent Events
E ⊥ F, if P(EF) = P(E)P(F)
Then P(E|F) = P(EF)/P(F) = P(E)
Discrete r.v.
PMF: p(x) = P{X = x} = P{s ∈ S : X(s) = x}
CDF: F(x) = P{X ≤ x} = Σ_{y≤x} P{X = y} = Σ_{y≤x} p(y)
Continuous r.v.
PDF f: P{X ∈ B} = ∫_B f(x)dx
Independent r.v.: X ⊥ Y, if for any two sets of real
numbers A and B, P{X ∈ A, Y ∈ B} = P{X ∈ A}P{Y ∈ B}
If X ⊥ Y: F(a, b) = F_X(a)F_Y(b)
p(x, y) = p_X(x)p_Y(y), for discrete RVs
f(x, y) = f_X(x)f_Y(y), for continuous RVs
Conditional Distribution
Discrete: p_{X|Y}(x|y) = P{X = x | Y = y}
= P{X = x, Y = y}/P{Y = y} = p(x, y)/p_Y(y)
Continuous: f_{X|Y}(x|y) = f(x, y)/f_Y(y)
Expectation, discrete RV: E[X] = Σᵢ xᵢ p(xᵢ)
Continuous RV: E[X] = ∫_{−∞}^∞ x f(x)dx
E[X + Y] = E[X] + E[Y]
E[Σᵢ₌₁ⁿ Xᵢ] = Σᵢ₌₁ⁿ E[Xᵢ]
If independent, E[XY] = E[X]E[Y]
To solve optimization problems:
- choose a constant c to represent the r.v. X
MSE = Mean Squared Error, MSE(c) = E[(X − c)²]
c = E[X] = μ minimizes the MSE: MSE(μ) ≤ MSE(c) for all c
Law of the Lazy Statistician: E[g(X)] = ∫_{−∞}^∞ g(x)f(x)dx
(the distribution of g(X) is unknown, but that of X is known)
E[aX + b] = aE[X] + b
E[g(X)] ≠ g(E[X]) except in the linear case
Variance: Var(X) = E[(X − μ)²] = E[X²] − μ²
- always nonnegative, 0 iff X is constant
Var(aX + b) = a² Var(X)
Covariance
Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[XY] − μ_X μ_Y
Cov(X, Y) = Cov(Y, X)
Cov(X, X) = Var(X) ≥ 0
If X ⊥ Y, Cov(X, Y) = 0, but the converse may be false
Cov(aX + b, Y) = a Cov(X, Y)
Cov(X + Z, Y) = Cov(X, Y) + Cov(Z, Y)
Variance and Covariance
Variance of a sum of RVs:
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
If X ⊥ Y, Var(X + Y) = Var(X) + Var(Y)
For X₁, …, Xₙ, if Xᵢ ⊥ Xⱼ for all i ≠ j (mutually
independent), Var(Σᵢ₌₁ⁿ Xᵢ) = Σᵢ₌₁ⁿ Var(Xᵢ)
Correlation: Corr(X, Y) = Cov(X, Y)/√(Var(X)Var(Y))
Given an Operating Characteristic curve, we want to know the
probability of acceptance/rejection.
Split into 4 regions: accept acceptable (correct), reject
nonacceptable (hit), reject acceptable (supplier's risk,
false alarm/false positive, type I error), accept nonacceptable
(inspector's risk, miss, type II error)
As the number of products inspected goes up, the curve gets
steeper; ideally it is a step function
Poisson
Number of event occurrences during a time period
Example: arrival of customers
We observe that on average λ customers come to the
store in 1 hour
Divide 1 hour into n intervals (e.g. 3600 seconds)
On average, λ out of the n intervals have a customer arrive
When n is large, no interval has more than 1 arrival in it
Each interval has an arrival with probability λ/n
Number of arrivals in one hour: binomial distribution
p(k) = C(n, k) (λ/n)ᵏ (1 − λ/n)ⁿ⁻ᵏ
As n → ∞, p(k) → e^{−λ} λᵏ/k!
The Poisson replaces the binomial when n is large and p is small
μ = E[Bin(n, λ/n)] = λ,  σ² = λ(1 − λ/n) → λ
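The limit above can be checked numerically; this sketch compares the Binomial(n, λ/n) pmf with its Poisson(λ) limit for one arbitrary choice of λ, n and k:

```python
import math

# Binomial(n, lam/n) pmf compared with its Poisson(lam) limit;
# lam = 3 arrivals/hour, n = 3600 one-second intervals, k = 5 arrivals.
lam, n, k = 3.0, 3600, 5
p = lam / n
binom_pk = math.comb(n, k) * p**k * (1 - p)**(n - k)
poisson_pk = math.exp(-lam) * lam**k / math.factorial(k)
print(binom_pk, poisson_pk)  # nearly equal for large n
```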
Uniform: if we want the range to be [α, β],
consider Y = α + (β − α)X, with X uniform on [0, 1]
CDF: X ≤ x ⟺ Y ≤ α + (β − α)x;  Y ≤ y ⟺ X ≤ (y − α)/(β − α)
F_Y(y) = F_X((y − α)/(β − α)) = (y − α)/(β − α), if y ∈ [α, β]
PDF: f_Y(y) = (d/dy) F_Y(y) = 1/(β − α)
Mean: E[Y] = α + (β − α)E[X] = (α + β)/2
Variance: Var(Y) = (β − α)² Var(X) = (β − α)²/12
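A simulation sketch of the transformation (α = 2, β = 10 are arbitrary): the sample mean and variance of Y should approach (α + β)/2 and (β − α)²/12:

```python
import random

# Y = alpha + (beta - alpha) * X maps X ~ Uniform[0, 1] onto [alpha, beta].
random.seed(0)
alpha, beta = 2.0, 10.0
ys = [alpha + (beta - alpha) * random.random() for _ in range(100_000)]
mean = sum(ys) / len(ys)
var = sum((y - mean) ** 2 for y in ys) / (len(ys) - 1)
print(mean, var)  # near (2+10)/2 = 6 and 8**2/12 ≈ 5.33
```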
Exponential has PDF f(x) = λe^{−λx}, for x ≥ 0
CDF: F(x) = ∫₀ˣ λe^{−λy} dy = −e^{−λy}|₀ˣ = 1 − e^{−λx}
For a nonnegative random variable X with tail distribution
F̄(x) = 1 − F(x):  E[X] = ∫₀^∞ F̄(x)dx
E[X] = ∫₀^∞ e^{−λx} dx = −(1/λ)e^{−λx}|₀^∞ = 1/λ
Var(X) = 1/λ²
−1 ≤ Corr(X, Y) ≤ 1; 0 correlation does not mean independence
Markov's Inequality: X is a nonnegative random variable;
then for any a > 0, P{X ≥ a} ≤ E[X]/a
Proof: E[X] = ∫₀^∞ xf(x)dx = ∫₀ᵃ xf(x)dx + ∫ₐ^∞ xf(x)dx
∫₀ᵃ xf(x)dx ≥ 0 and ∫ₐ^∞ xf(x)dx ≥ ∫ₐ^∞ af(x)dx = aP{X ≥ a},
so E[X] ≥ aP{X ≥ a}
Normal
PDF: f(x) = (1/(√(2π)σ)) e^{−(x−μ)²/(2σ²)}
If X ∼ N(μ, σ²) and Y = a + bX, then Y ∼ N(a + bμ, b²σ²)
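One way to check the exponential CDF derivation is inverse-transform sampling: solving u = F(x) = 1 − e^{−λx} gives x = −ln(1 − u)/λ, and the resulting samples should show E[X] = 1/λ and Var(X) = 1/λ². A minimal sketch (λ = 2 is arbitrary):

```python
import math
import random

# Inverse-transform sampling for Exponential(lam).
random.seed(1)
lam = 2.0
xs = [-math.log(1 - random.random()) / lam for _ in range(200_000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
print(mean, var)  # near 1/lam = 0.5 and 1/lam**2 = 0.25
```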
Z = (X − μ)/σ, then Z ∼ N(0, 1), the standard normal dist
CDF: Φ(x)
F_X(x) = P{X ≤ x} = P{Z ≤ (x − μ)/σ} = Φ((x − μ)/σ)
Φ(−x) = P{Z ≤ −x} = P{Z ≥ x} = 1 − Φ(x)
Chebyshev's Inequality: X is a random variable with
mean μ and variance σ²; then P{|X − μ| ≥ a} ≤ σ²/a²
Proof: consider the random variable (X − μ)², which is
nonnegative. Apply Markov's inequality with a replaced by a²:
P{(X − μ)² ≥ a²} ≤ E[(X − μ)²]/a² = σ²/a²
z_α: 1 − Φ(z_α) = Φ̄(z_α) = α
z_α is the (1 − α)·100-th percentile of Z: z_α = Φ⁻¹(1 − α)
Sum of independent normal RVs is still normal:
X₁ ∼ N(μ₁, σ₁²), X₂ ∼ N(μ₂, σ₂²), X₁ ⊥ X₂
⟹ X₁ + X₂ ∼ N(μ₁ + μ₂, σ₁² + σ₂²)
What about X₁ − X₂?
−X₂ ∼ N(−μ₂, σ₂²), so X₁ − X₂ ∼ N(μ₁ − μ₂, σ₁² + σ₂²)
Weak Law of Large Numbers
Let X₁, X₂, … be a sequence of independent and
identically distributed (i.i.d.) random variables, each
having mean E[Xᵢ] = μ. Then for any ε > 0,
P{|(X₁ + X₂ + ⋯ + Xₙ)/n − μ| > ε} → 0, as n → ∞.
The average of i.i.d. RVs converges to their mean.
Proof (assuming Var(Xᵢ) = σ² exists):
E[(X₁ + X₂ + ⋯ + Xₙ)/n] = μ
Var((X₁ + X₂ + ⋯ + Xₙ)/n) = (1/n²)Var(X₁ + ⋯ + Xₙ) = σ²/n
Apply Chebyshev's inequality:
P{|(X₁ + X₂ + ⋯ + Xₙ)/n − μ| > ε} ≤ σ²/(nε²) → 0, as n → ∞
Samples: sample mean X̄ = (X₁ + X₂ + ⋯ + Xₙ)/n
E[X̄] = E[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n)(E[X₁] + ⋯ + E[Xₙ]) = μ
Var(X̄) = Var((1/n)(X₁ + ⋯ + Xₙ))
= (1/n²)(Var(X₁) + ⋯ + Var(Xₙ)) = σ²/n
Sample variance: S² = (1/(n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²
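The WLLN can be checked by simulation; this sketch uses Bernoulli(0.3) draws (an arbitrary choice) and prints how far the sample mean sits from μ = 0.3 as n grows:

```python
import random

# Weak law of large numbers: the sample mean of i.i.d. Bernoulli(0.3)
# draws concentrates around mu = 0.3 as n grows.
random.seed(2)
mu = 0.3
for n in (100, 10_000, 1_000_000):
    xbar = sum(1 if random.random() < mu else 0 for _ in range(n)) / n
    print(n, xbar, abs(xbar - mu))
```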
Continuous r.v. CDF: F(a) = ∫_{−∞}^a f(x)dx
Joint r.v.
Joint CDF: F(x, y) = P{X ≤ x, Y ≤ y}
Discrete RVs, joint PMF: p(xᵢ, yⱼ) = P{X = xᵢ, Y = yⱼ}
Marginal CDF: F_X(x) = F(x, ∞), i.e. Y can take any value
Marginal PMF: p_X(x) = Σⱼ₌₁^∞ p(x, yⱼ)
Continuous RVs, joint PDF: P{(X, Y) ∈ C} = ∬_{(x,y)∈C} f(x, y)dxdy
Joint CDF: F(a, b) = ∫_{−∞}^a ∫_{−∞}^b f(x, y)dxdy
Marginal PDF: f_X(x) = ∫_{−∞}^∞ f(x, y)dy
Bernoulli: μ = p, σ² = p(1 − p)
Binomial: sum of n i.i.d. Bernoulli r.v.s
P{X = k} = C(n, k) pᵏ(1 − p)ⁿ⁻ᵏ, for k = 0, 1, …, n
Σₖ₌₀ⁿ p(k) = Σₖ₌₀ⁿ C(n, k) pᵏ(1 − p)ⁿ⁻ᵏ = (p + 1 − p)ⁿ = 1
μ = E[Y₁ + Y₂ + ⋯ + Yₙ] = np
σ² = Var(Y₁ + Y₂ + ⋯ + Yₙ) = np(1 − p)
Ex. Acceptance Sampling
For an arbitrary distribution with mean μ and var σ², we study:
the sample mean X̄ (E[X̄], Var(X̄));
its approximate distribution when n is large
(Central Limit Theorem);
the sample variance S² (E[S²]).
For a normal distribution N(μ, σ²): the exact distributions
of X̄ and S²
Central Limit Theorem: let X₁, X₂, …, Xₙ be a sequence of
i.i.d. random variables with mean μ and variance σ²;
then the distribution of √n(X̄ − μ)/σ converges to the
standard normal as n → ∞:
P{√n(X̄ − μ)/σ ≤ x} → Φ(x) ≈ P{Z < x}
Generally works as long as n ≥ 30
Question: approximate distribution of the sum?
X₁ + X₂ + ⋯ + Xₙ ≈ N(nμ, nσ²)
Question: approximate distribution of the sample mean?
X̄ ≈ N(μ, σ²/n)
t-distribution: if Z ⊥ χ²ₙ, then Tₙ = Z/√(χ²ₙ/n) is
t-distributed with n degrees of freedom
Recall: √n(X̄ − μ)/σ ∼ N(0, 1) and (n − 1)S²/σ² ∼ χ²ₙ₋₁, so
(X̄ − μ)/(S/√n) = [√n(X̄ − μ)/σ] / √[((n − 1)S²/σ²)/(n − 1)] ∼ tₙ₋₁
Ex (sample variance of a normal sample, n = 15, σ = 3):
P{S² > 12} = P{(14/9)S² > (14/9) · 12} = P{χ²₁₄ > 18.67} = 0.1779
Confidence intervals for a normal mean with known variance:
P{−z_{α/2} < √n(X̄ − μ)/σ < z_{α/2}} = 1 − α
P{X̄ − z_{α/2} σ/√n < μ < X̄ + z_{α/2} σ/√n} = 1 − α
Two-sided: (x̄ − z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n)
One-sided upper: (x̄ − z_α σ/√n, ∞)
One-sided lower: (−∞, x̄ + z_α σ/√n)
Estimating Difference in Means
X̄ − Ȳ ∼ N(μ₁ − μ₂, σ₁²/n + σ₂²/m)
Ex: The College of Engineering decided to admit 450 students;
there is a 0.3 probability that an admitted student will
actually enroll. What is the probability that the incoming
class has more than 150 students?
Solution: Xᵢ ∼ Bernoulli(0.3), X = X₁ + X₂ + ⋯ + X₄₅₀ ∼
Binomial(450, 0.3)
P{X > 150} = P{X = 151} + P{X = 152} + ⋯ + P{X = 450}
X₁ + ⋯ + X₄₅₀ is approximately normal with mean
450 × 0.3 = 135 and standard deviation
√(450 × 0.3 × 0.7) = √94.5
Approximate the discrete r.v. (X = X₁ + ⋯ + X₄₅₀) with a
continuous r.v. (Y ∼ N(135, 94.5)); see continuity correction
Parameter Estimation: a population has a certain distribution
with unknown parameter θ
Estimate θ based on X₁, …, Xₙ
θ̂(X₁, …, Xₙ), a function of the sample, is a r.v.; it is also
a point estimator
Moment Generating Function
φ(t) = E[e^{tX}] = ∫ e^{tx} f(x)dx
The k-th moment is the k-th derivative of φ at t = 0
Estimating Difference in Means (known variances): normalize:
(X̄ − Ȳ − (μ₁ − μ₂))/√(σ₁²/n + σ₂²/m) ∼ N(0, 1)
P{−z_{α/2} ≤ (X̄ − Ȳ − (μ₁ − μ₂))/√(σ₁²/n + σ₂²/m) ≤ z_{α/2}} = 1 − α
P{X̄ − Ȳ − z_{α/2}√(σ₁²/n + σ₂²/m) ≤ μ₁ − μ₂
≤ X̄ − Ȳ + z_{α/2}√(σ₁²/n + σ₂²/m)} = 1 − α
(x̄ − ȳ − z_{α/2}√(σ₁²/n + σ₂²/m), x̄ − ȳ + z_{α/2}√(σ₁²/n + σ₂²/m))
is a 1 − α C.I. for the difference in the means of two normal
populations when the variances are known
Method of Moments
Moments can be expressed as functions of the parameters:
E[Xᵏ] = g_k(θ₁, θ₂, …, θ_m)
If there are m unknown parameters θ₁, …, θ_m, set the sample
moments equal to their theoretical values and solve:
g₁(θ₁, θ₂, …, θ_m) = μ̂₁
g₂(θ₁, θ₂, …, θ_m) = μ̂₂
⋮
g_m(θ₁, θ₂, …, θ_m) = μ̂_m
When the variances are unknown, assume them to be equal and
use the Pooled Sample Variance
S_p² = [(n − 1)S₁² + (m − 1)S₂²] / [(n − 1) + (m − 1)]
(X̄ − Ȳ − (μ₁ − μ₂)) / (S_p √(1/n + 1/m)) ∼ t_{n+m−2}
(x̄ − ȳ − t_{α/2,n+m−2} s_p √(1/n + 1/m),
x̄ − ȳ + t_{α/2,n+m−2} s_p √(1/n + 1/m)) is a 1 − α C.I. for the
difference in the means of two normal populations when the
variances are unknown but equal
Maximum Likelihood Estimators
Given parameter θ, for each observation x₁, x₂, …, xₙ we can
calculate the joint density function f(x₁, x₂, …, xₙ; θ)
x₁, x₂, …, xₙ are independent, so
f(x₁, x₂, …, xₙ; θ) = f_{X₁}(x₁; θ) f_{X₂}(x₂; θ) ⋯ f_{Xₙ}(xₙ; θ)
Given observations x₁, x₂, …, xₙ, for each θ define the
likelihood L(θ) = f(x₁, x₂, …, xₙ; θ)
The Maximum Likelihood Estimator (MLE) maximizes L(θ)
For the MLE, use the pdf of the distribution with θ replacing
the parameter as needed, and multiply together all n samples
in this new pdf; it's easier to maximize log L(θ) ≡ l(θ)
Approximate Confidence Interval: if the population is not
normal but the sample size is relatively large, use the results
for a normal population mean with known variance to obtain an
approximate 1 − α c.i.
Sample Variance S² = (1/(n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²
Continuity correction: approximate the discrete r.v. with a
continuous r.v. (e.g. Y ∼ N(135, 94.5))
Question: P{X = i} ≈ P{Y ∈ ?}
P{X = i} ≈ P{i − 0.5 ≤ Y ≤ i + 0.5}
P{X₁ + ⋯ + X₄₅₀ > 150} = Σ_{i≥151} P{X₁ + ⋯ + X₄₅₀ = i}
≈ Σ_{i≥151} P{i − 0.5 ≤ Y ≤ i + 0.5} = P{Y ≥ 150.5}
P{Y ≥ 150.5} = P{Z ≥ (150.5 − 135)/√94.5}
= 1 − Φ((150.5 − 135)/√94.5) ≈ 1 − Φ(1.59) ≈ 0.06
Continuity Correction rules, when using a continuous function
to approximate a discrete one (e.g. normal approx. of binomial):
If P(X = n), use P(n − 0.5 < X < n + 0.5)
If P(X > n), use P(X > n + 0.5)
If P(X ≤ n), use P(X < n + 0.5)
If P(X < n), use P(X < n − 0.5)
If P(X ≥ n), use P(X > n − 0.5)
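A numeric sketch of the enrollment example, comparing the continuity-corrected normal approximation with the exact binomial tail:

```python
import math

# X ~ Binomial(450, 0.3): P{X > 150} = P{X >= 150.5} under the
# N(135, 94.5) approximation, vs the exact binomial tail.
n, p = 450, 0.3
mu = n * p                        # 135
sd = math.sqrt(n * p * (1 - p))   # sqrt(94.5)

def phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

approx = 1 - phi((150.5 - mu) / sd)
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(151, n + 1))
print(approx, exact)
```

Both values come out near the 0.06 quoted in the notes.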
Sample variance S² is unbiased for σ²:
Σᵢ₌₁ⁿ (Xᵢ − X̄)² = Σᵢ₌₁ⁿ Xᵢ² − nX̄²
E[Σᵢ₌₁ⁿ (Xᵢ − X̄)²] = E[Σᵢ₌₁ⁿ Xᵢ²] − nE[X̄²] = nE[X₁²] − nE[X̄²]
Var(X) = E[X²] − E[X]², so E[X²] = Var(X) + E[X]²
E[X₁²] = Var(X₁) + E[X₁]² = σ² + μ²
E[X̄²] = Var(X̄) + E[X̄]² = σ²/n + μ²
E[Σᵢ₌₁ⁿ (Xᵢ − X̄)²] = nσ² + nμ² − σ² − nμ² = (n − 1)σ²
E[S²] = σ², so S² is unbiased
The MLE of the mean of the exponential distribution is the
sample mean
Approximate C.I.: (x̄ − z_{α/2} s/√n, x̄ + z_{α/2} s/√n)
Ex. Monte Carlo Simulation
Evaluate a definite integral θ = ∫ₐᵇ g(x)dx
Consider a uniform random variable X on [a, b]
Law of the lazy statistician:
E[g(X)] = ∫ₐᵇ g(u)f_X(u)du = (1/(b − a)) ∫ₐᵇ g(u)du = θ/(b − a)
Estimate θ:
simulate uniform random variables U₁, …, U_k
θ̂ = (b − a) (Σᵢ₌₁ᵏ g(Uᵢ))/k
Let ḡ and s be the observed sample mean and standard
deviation of the g(uᵢ)'s
Approximate C.I. for E[g(X)]: (ḡ − z_{α/2} s/√k, ḡ + z_{α/2} s/√k)
MLE example: Bernoulli distribution
P{X = 1} = p = p¹(1 − p)⁰
P{X = 0} = 1 − p = p⁰(1 − p)¹
p_X(x; p) = pˣ(1 − p)¹⁻ˣ
log p_X(x; p) = x log p + (1 − x) log(1 − p)
l(p) = Σᵢ₌₁ⁿ xᵢ log p + (n − Σᵢ₌₁ⁿ xᵢ) log(1 − p)
l′(p) = (Σᵢ₌₁ⁿ xᵢ)/p − (n − Σᵢ₌₁ⁿ xᵢ)/(1 − p) = 0
(1 − p) Σᵢ₌₁ⁿ xᵢ = p(n − Σᵢ₌₁ⁿ xᵢ) = np − p Σᵢ₌₁ⁿ xᵢ
p̂(X₁, …, Xₙ) = X̄ = (Σᵢ₌₁ⁿ Xᵢ)/n
Other MLEs:
Normal: μ̂(X₁, …, Xₙ) = X̄,
σ̂(X₁, …, Xₙ) = √(Σᵢ₌₁ⁿ (Xᵢ − X̄)²/n), i.e. σ̂² = (n − 1)S²/n
Uniform on [0, θ]: θ̂(X₁, …, Xₙ) = max{X₁, …, Xₙ}
Poisson: λ̂(X₁, …, Xₙ) = X̄
Sampling from a Normal Population
X̄ ∼ N(μ, σ²/n)
Sample variance S² = (1/(n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²
Chi-square distribution: χ²ₙ = Z₁² + Z₂² + ⋯ + Zₙ² with n
degrees of freedom; a sum of independent chi-squares is
chi-square
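The Monte Carlo estimator θ̂ = (b − a)·Σg(Uᵢ)/k can be sketched on a known integral; sin on [0, π] is an arbitrary test integrand with ∫₀^π sin(x)dx = 2:

```python
import math
import random

# Monte Carlo integration: theta = integral of g over [a, b],
# estimated by (b - a) times the average of g(U_i), U_i ~ Uniform[a, b].
random.seed(4)
a, b = 0.0, math.pi
k = 100_000
us = [random.uniform(a, b) for _ in range(k)]
theta_hat = (b - a) * sum(math.sin(u) for u in us) / k
print(theta_hat)  # near 2
```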
Decomposition: let Yᵢ = Xᵢ − μ, so Ȳ = X̄ − μ; then
Σᵢ₌₁ⁿ (Xᵢ − X̄)² = Σᵢ₌₁ⁿ ((Xᵢ − μ) − (X̄ − μ))²
= Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² = Σᵢ₌₁ⁿ Yᵢ² − nȲ²
= Σᵢ₌₁ⁿ (Xᵢ − μ)² − n(X̄ − μ)²
Rearrange the terms:
Σᵢ₌₁ⁿ (Xᵢ − μ)² = Σᵢ₌₁ⁿ (Xᵢ − X̄)² + n(X̄ − μ)²
Decomposition of the sum of squares: divide by σ²:
Σᵢ₌₁ⁿ (Xᵢ − μ)²/σ² = Σᵢ₌₁ⁿ (Xᵢ − X̄)²/σ² + n(X̄ − μ)²/σ²
(χ² with n d.o.f.) = (n − 1)S²/σ² + (χ² with 1 d.o.f.)
Point Estimators
Biased/unbiased and precise/imprecise
Bias of estimator θ̂(X₁, …, Xₙ) for parameter θ:
b(θ̂) = E[θ̂(X₁, …, Xₙ)] − θ
Confidence Interval
2-sided, 1-sided upper/lower
Ex. normal 2-sided 95% confidence:
P{−1.96 < √n(X̄ − μ)/σ < 1.96} = 2Φ(1.96) − 1 = 0.95
⟹ P{−1.96 σ/√n < X̄ − μ < 1.96 σ/√n} = 0.95
⟹ P{X̄ − 1.96 σ/√n < μ < X̄ + 1.96 σ/√n} = 0.95
If Zᵢ = (Xᵢ − μ)/σ, then Z₁, Z₂, …, Zₙ are independent
standard normal, and Σᵢ₌₁ⁿ Zᵢ² = Σᵢ₌₁ⁿ (Xᵢ − μ)²/σ² is
chi-square with n degrees of freedom
If sampling from a normal population, then X̄ ⊥ S²:
the sample mean and variance are independent r.v.s
X̄ ∼ N(μ, σ²/n)
(n − 1)S²/σ² = Σᵢ₌₁ⁿ (Xᵢ − X̄)²/σ² ∼ χ²ₙ₋₁
If σ is given and we have observed sample mean x̄,
(x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n) is a 95% confidence interval
estimate of μ.
Once observed, x̄ is a constant, and the interval is fixed;
95% is not a probability.
Ex: CPU time for a type of job is normally distributed with
mean 20 seconds and standard deviation 3 seconds.
15 such jobs are tested; what is the probability that the
sample variance exceeds 12?
Solution: (15 − 1)S²/3² ∼ χ²₁₄
Two-sided Confidence Interval 1 − α for normal mean
with known variance:
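A minimal sketch of the two-sided known-variance interval; the job times below are hypothetical, with σ = 3 taken as known per the CPU-time example:

```python
import math
import statistics

# Two-sided 1 - alpha CI for a normal mean with known sigma:
# (xbar - z_{alpha/2}*sigma/sqrt(n), xbar + z_{alpha/2}*sigma/sqrt(n)).
sigma = 3.0
data = [18.2, 21.5, 19.8, 22.1, 20.4, 17.9, 20.0, 21.2]  # hypothetical times
n = len(data)
xbar = statistics.mean(data)
z = statistics.NormalDist().inv_cdf(1 - 0.05 / 2)  # z_{0.025} ≈ 1.96
half = z * sigma / math.sqrt(n)
print((xbar - half, xbar + half))
```

Once the data are observed, the printed interval is fixed; the 95% refers to the procedure, not to this particular interval.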