x̄ = sample mean.
Linear transformation: for constants a and b, if y_i = a·x_i + b for all i = 1, …, n, then ȳ = a·x̄ + b. Example: converting temperatures from °F to °C: compute the average in °F, then convert that average to °C.
s² = sample variance = (1/(n−1)) Σ_{i=1}^n (x_i − x̄)².
Elements of Probability
Sample space S: set of all possible outcomes. Event E: any subset of the sample space. Union: E ∪ F ("E or F"). Intersection: E ∩ F ("E and F"), also written EF. Null event ∅: has no outcomes. Complement E^c: all outcomes in S but not in E; E^c and E are mutually exclusive, E^c ∩ E = ∅. Containment: E ⊂ F means all outcomes in E are also in F, so occurrence of E implies that of F. Example: E = {(H,H)} ⊂ F = {(H,H), (H,T), (T,H)}; getting two heads implies getting a head at all. Equality: E = F if E ⊂ F and F ⊂ E.
Commutative law: E ∪ F = F ∪ E, EF = FE. Associative law: (E ∪ F) ∪ G = E ∪ (F ∪ G), (EF)G = E(FG). Distributive law: (E ∪ F)G = EG ∪ FG, EF ∪ G = (E ∪ G)(F ∪ G). De Morgan's laws: (E ∪ F)^c = E^c F^c, (EF)^c = E^c ∪ F^c.
Axioms of Probability
Probability of event E = proportion of times E occurs (the outcome is in E) in repeated experiments. Axiomatic definition: for each event E of an experiment with sample space S, assign a number P(E).
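A small sketch (not from the source) checking the event-algebra laws above with Python sets, using the two-coin-toss sample space from the containment example; the event G is an arbitrary illustrative choice.

```python
# Two-coin-toss sample space and events; sets stand in for events.
S = {("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")}
E = {("H", "H")}                          # two heads
F = {("H", "H"), ("H", "T"), ("T", "H")}  # at least one head
G = {("H", "T"), ("T", "T")}              # arbitrary third event (assumption)

def comp(A):
    """Complement of A relative to the sample space S."""
    return S - A

assert E <= F                              # containment: E implies F
assert (E | F) == (F | E)                  # commutative law
assert ((E | F) | G) == (E | (F | G))      # associative law
assert ((E | F) & G) == (E & G) | (F & G)  # distributive law
assert comp(E | F) == comp(E) & comp(F)    # De Morgan: (E∪F)^c = E^c F^c
assert comp(E & F) == comp(E) | comp(F)    # De Morgan: (EF)^c = E^c ∪ F^c
print("all event-algebra identities hold")
```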
If P(E) satisfies the following three axioms, we call P(E) the probability of event E: (1) 0 ≤ P(E) ≤ 1; (2) P(S) = 1; (3) for any sequence of mutually exclusive events E_1, E_2, …, P(∪_{i=1}^n E_i) = Σ_{i=1}^n P(E_i) for n = 1, 2, …, ∞.
Example: P(E^c) = 1 − P(E). Proof: E^c and E are mutually exclusive, so P(E^c) + P(E) = P(E^c ∪ E) = P(S) = 1.
Example: P(E ∪ F) = P(E) + P(F) − P(EF). Proof: use a Venn diagram.
Odds of event E: P(E)/P(E^c) = P(E)/(1 − P(E)). P = 1/2 gives odds 1: equally likely. P = 3/4 gives odds 3: 3 times as likely to occur as not.
Conditional Probability: P(E|F) = P(EF)/P(F), so P(EF) = P(E)P(F|E).
Law of Total Probability: P(F) = Σ_{i=1}^n P(E_i)P(F|E_i); in particular P(F) = P(F|E)P(E) + P(F|E^c)P(E^c). Proof: P(F) = P(FS) = P(F(E ∪ E^c)); F(E ∪ E^c) = FE ∪ FE^c, so P(F(E ∪ E^c)) = P(FE ∪ FE^c); FE and FE^c are mutually exclusive, so P(FE ∪ FE^c) = P(FE) + P(FE^c).
Bayes' Formula: P(E|F) = P(E)P(F|E)/P(F) = P(E)P(F|E)/[P(E)P(F|E) + P(E^c)P(F|E^c)]; in general, P(E_j|F) = P(E_j)P(F|E_j) / Σ_{i=1}^n P(E_i)P(F|E_i).
Independent Events: E ⊥ F if P(EF) = P(E)P(F); then P(E|F) = P(EF)/P(F) = P(E).
Discrete r.v.: pmf p(x) = P{X = x} = P{s ∈ S : X(s) = x}; cdf F(x) = P{X ≤ x} = Σ_{y≤x} P{X = y} = Σ_{y≤x} p(y).
Continuous r.v.: pdf f satisfies P{X ∈ B} = ∫_B f(x)dx.
Independent r.v.s: X ⊥ Y if for any two sets of real numbers A and B, P{X ∈ A, Y ∈ B} = P{X ∈ A}·P{Y ∈ B}. If X ⊥ Y: F(a, b) = F_X(a)F_Y(b); p(x, y) = p_X(x)p_Y(y) for discrete r.v.s; f(x, y) = f_X(x)f_Y(y) for continuous r.v.s.
Conditional Distribution: discrete, p_{X|Y}(x|y) = P{X = x | Y = y} = P{X = x, Y = y}/P{Y = y} = p(x, y)/p_Y(y); continuous, f_{X|Y}(x|y) = f(x, y)/f_Y(y).
Expectation: discrete r.v., E[X] = Σ_i x_i p(x_i); continuous r.v., E[X] = ∫_{−∞}^{∞} x f(x)dx. E[X + Y] = E[X] + E[Y]; E[Σ_{i=1}^n X_i] = Σ_{i=1}^n E[X_i]. Under independence, E[XY] = E[X]E[Y].
To solve optimization problems: choose a quantity c to represent the r.v.
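A minimal sketch of the law of total probability and Bayes' formula as stated above. The numbers (prior 1%, P(F|E) = 0.95, P(F|E^c) = 0.05) are made-up illustrations, not from the source.

```python
def total_probability(p_e, p_f_given_e, p_f_given_ec):
    """P(F) = P(E)P(F|E) + P(E^c)P(F|E^c)."""
    return p_e * p_f_given_e + (1.0 - p_e) * p_f_given_ec

def bayes(p_e, p_f_given_e, p_f_given_ec):
    """P(E|F) = P(E)P(F|E) / P(F)."""
    return p_e * p_f_given_e / total_probability(p_e, p_f_given_e, p_f_given_ec)

# Illustrative diagnostic-test numbers (assumptions): rare condition,
# sensitive test; the posterior stays modest because the prior is small.
posterior = bayes(0.01, 0.95, 0.05)
print(round(posterior, 4))
```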
MSE = mean squared error: MSE(c) = E[(X − c)²]. The choice c = E[X] = μ minimizes the MSE: MSE(μ) ≤ MSE(c) for all c.
Law of the Lazy Statistician: E[g(X)] = ∫_{−∞}^{∞} g(x)f(x)dx; useful when the distribution of g(X) is unknown but that of X is known. E[aX + b] = aE[X] + b; in general E[g(X)] ≠ g(E[X]) except in the linear case.
Variance: Var(X) = E[(X − μ)²] = E[X²] − μ². Always nonnegative; 0 iff X is constant. Var(aX + b) = a²Var(X).
Covariance: Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[XY] − μ_X μ_Y. Cov(X, Y) = Cov(Y, X); Cov(X, X) = Var(X) ≥ 0. If X ⊥ Y then Cov(X, Y) = 0, but the converse may be false. Cov(aX + b, Y) = a·Cov(X, Y); Cov(X + Z, Y) = Cov(X, Y) + Cov(Z, Y).
Variance of a sum of r.v.s: Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y); if X ⊥ Y, Var(X + Y) = Var(X) + Var(Y). For mutually independent X_1, …, X_n (X_i ⊥ X_j for all i ≠ j), Var(Σ_{i=1}^n X_i) = Σ_{i=1}^n Var(X_i).
Correlation: Corr(X, Y) = Cov(X, Y)/√(Var(X)Var(Y)).
Operating Characteristic curve: given an OC curve, we want the probability of acceptance/rejection. Split into 4 regions: accept acceptable and reject nonacceptable (hits); reject acceptable (supplier's risk, false alarm/false positive, type I error); accept nonacceptable (inspector's risk, miss, type II error). As the number of products inspected goes up, the curve gets steeper; ideally it is a step function.
Poisson: number of event occurrences during a time period. Example: arrival of customers. On average λ customers come to the store in 1 hour. Divide the hour into n intervals (e.g. 3600 seconds); on average λ of the n intervals have a customer arrive. When n is large, no interval has more than 1 arrival, and each interval has an arrival with probability λ/n. The number of arrivals in one hour is then binomial: p(k) = C(n, k)(λ/n)^k (1 − λ/n)^{n−k}, and as n → ∞, p(k) → e^{−λ}λ^k/k!.
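A numeric check of the limit derived above: the Binomial(n, λ/n) pmf approaches the Poisson(λ) pmf as n grows. The values λ = 4 and k = 2 are illustrative choices, not from the source.

```python
import math

def binom_pmf(n, p, k):
    # C(n, k) p^k (1 - p)^(n - k)
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(lam, k):
    # e^(-λ) λ^k / k!
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 4.0, 2
for n in (10, 100, 3600):
    print(n, round(binom_pmf(n, lam / n, k), 6))
print("poisson", round(poisson_pmf(lam, k), 6))
```

With n = 3600 (one interval per second, as in the note) the binomial pmf already agrees with the Poisson pmf to about three decimal places.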
The Poisson distribution replaces the binomial when n is large and p is small: μ = E[Bin(n, λ/n)] = λ, σ² = λ(1 − λ/n) → λ.
Uniform: if we want the range to be [α, β], consider Y = α + (β − α)X with X uniform on [0, 1]. CDF: Y ≤ y ⟺ X ≤ (y − α)/(β − α), so F_Y(y) = F_X((y − α)/(β − α)) = (y − α)/(β − α) for y ∈ [α, β]. PDF: f_Y(y) = dF_Y(y)/dy = 1/(β − α). Mean: E[Y] = α + (β − α)E[X] = (α + β)/2. Variance: Var(Y) = (β − α)²Var(X) = (β − α)²/12.
Exponential: PDF f(x) = λe^{−λx} for x ≥ 0. CDF F(x) = ∫_0^x λe^{−λy} dy = −e^{−λy}|_0^x = 1 − e^{−λx}. For a nonnegative random variable X with tail distribution F̄(x), E[X] = ∫_0^∞ F̄(x)dx; here E[X] = ∫_0^∞ e^{−λx} dx = −(1/λ)e^{−λx}|_0^∞ = 1/λ, and Var(X) = 1/λ².
−1 ≤ Corr(X, Y) ≤ 1; correlation 0 does not imply independence.
Markov's Inequality: if X is a nonnegative random variable, then for any a > 0, P{X ≥ a} ≤ E[X]/a. Proof: E[X] = ∫_0^∞ x f(x)dx = ∫_0^a x f(x)dx + ∫_a^∞ x f(x)dx ≥ ∫_a^∞ x f(x)dx ≥ ∫_a^∞ a f(x)dx = aP{X ≥ a}.
Normal: PDF f(x) = (1/(√(2π)σ)) e^{−(x−μ)²/(2σ²)}. If X ∼ N(μ, σ²) and Y = a + bX, then Y ∼ N(a + bμ, b²σ²). Z = (X − μ)/σ, then Z ∼ N(0, 1), the standard normal distribution. CDF Φ(x): F_X(x) = P{X ≤ x} = P{Z ≤ (x − μ)/σ} = Φ((x − μ)/σ). Φ(−x) = P{Z ≤ −x} = P{Z ≥ x} = 1 − Φ(x). z_α: 1 − Φ(z_α) = Φ̄(z_α) = α, i.e. z_α = Φ^{−1}(1 − α), the (1 − α)100-th percentile of Z.
Chebyshev's Inequality: if X is a random variable with mean μ and variance σ², then P{|X − μ| ≥ a} ≤ σ²/a². Proof: consider the random variable (X − μ)², which is nonnegative; apply Markov's inequality with a replaced by a²: P{(X − μ)² ≥ a²} ≤ E[(X − μ)²]/a² = σ²/a².
A sum of independent normal r.v.s is still normal: if X_1 ∼ N(μ_1, σ_1²), X_2 ∼ N(μ_2, σ_2²), X_1 ⊥ X_2, then X_1 + X_2 ∼ N(μ_1 + μ_2, σ_1² + σ_2²). What about X_1 − X_2? Since −X_2 ∼ N(−μ_2, σ_2²), X_1 − X_2 ∼ N(μ_1 − μ_2, σ_1² + σ_2²).
Weak Law of Large Numbers: let X_1, X_2, … be a sequence of independent and identically distributed (i.i.d.) random variables, each having mean E[X_i] = μ.
Then for any ε > 0, P{|(X_1 + X_2 + ⋯ + X_n)/n − μ| > ε} → 0 as n → ∞: the average of i.i.d. r.v.s converges to their mean.
Samples: sample mean X̄ = (X_1 + ⋯ + X_n)/n. E[X̄] = E[(X_1 + ⋯ + X_n)/n] = (E[X_1] + ⋯ + E[X_n])/n = μ. Var(X̄) = Var((X_1 + ⋯ + X_n)/n) = (1/n²)(Var(X_1) + ⋯ + Var(X_n)) = σ²/n.
Proof of the WLLN (assuming Var(X_i) = σ² exists): E[(X_1 + X_2 + ⋯ + X_n)/n] = μ and Var((X_1 + X_2 + ⋯ + X_n)/n) = σ²/n; apply Chebyshev's inequality: P{|(X_1 + X_2 + ⋯ + X_n)/n − μ| > ε} ≤ σ²/(nε²) → 0 as n → ∞.
Sample variance: S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)².
Continuous r.v.: cdf F(a) = ∫_{−∞}^a f(x)dx.
Joint r.v.s: joint CDF F(x, y) = P{X ≤ x, Y ≤ y}. Discrete r.v.s: joint PMF p(x_i, y_j) = P{X = x_i, Y = y_j}; marginal CDF F_X(x) = F(x, ∞), i.e. Y can take any value; marginal PMF p_X(x) = Σ_{j=1}^∞ p(x, y_j). Continuous r.v.s: joint PDF f with P{(X, Y) ∈ C} = ∬_{(x,y)∈C} f(x, y)dx dy; joint CDF F(a, b) = ∫_{−∞}^b ∫_{−∞}^a f(x, y)dx dy; marginal PDF f_X(x) = ∫_{−∞}^∞ f(x, y)dy.
Bernoulli: μ = p, σ² = p(1 − p).
Binomial: sum of n i.i.d. Bernoulli r.v.s. P{X = k} = C(n, k) p^k (1 − p)^{n−k} for k = 0, 1, …, n; Σ_{k=0}^n p(k) = Σ_{k=0}^n C(n, k) p^k (1 − p)^{n−k} = (p + 1 − p)^n = 1. μ = E[Y_1 + Y_2 + ⋯ + Y_n] = np; σ² = Var(Y_1 + Y_2 + ⋯ + Y_n) = np(1 − p). Example: acceptance sampling.
For an arbitrary distribution with mean μ and variance σ²: for the sample mean X̄ we know E[X̄] and Var(X̄), plus the approximate distribution when n is large (central limit theorem); for the sample variance S² we know E[S²]. For a normal distribution N(μ, σ²) we know the exact distributions of X̄ and S².
Central Limit Theorem: let X_1, X_2, …, X_n be a sequence of i.i.d.
random variables with mean μ and variance σ²; then the distribution of √n(X̄ − μ)/σ converges to the standard normal as n → ∞: P{√n(X̄ − μ)/σ ≤ x} → Φ(x) = P{Z ≤ x}, i.e. √n(X̄ − μ)/σ ≈ N(0, 1). This generally works well as long as n ≥ 30.
Question: approximate distribution of the sum? X_1 + X_2 + ⋯ + X_n ≈ N(nμ, nσ²). Question: approximate distribution of the sample mean? X̄ ≈ N(μ, σ²/n).
Solution to the CPU-time example (normal population, σ = 3, n = 15): P{S² > 12} = P{(14/9)S² > (14/9) × 12} = P{χ²₁₄ > 18.67} = 0.1779.
Confidence interval for a normal mean with known variance: P{−z_{α/2} < √n(X̄ − μ)/σ < z_{α/2}} = 1 − α, so P{X̄ − z_{α/2} σ/√n < μ < X̄ + z_{α/2} σ/√n} = 1 − α, and the two-sided interval is (x̄ − z_{α/2} σ/√n, x̄ + z_{α/2} σ/√n). One-sided upper: (x̄ − z_α σ/√n, ∞). One-sided lower: (−∞, x̄ + z_α σ/√n).
t-distribution: if Z ⊥ χ²_n, then T_n = Z/√(χ²_n/n) has the t-distribution with n degrees of freedom. Recall (X̄ − μ)/(σ/√n) ∼ N(0, 1) and (n − 1)S²/σ² ∼ χ²_{n−1}, so (X̄ − μ)/(S/√n) = [(X̄ − μ)/(σ/√n)] / √((n − 1)S²/σ² / (n − 1)) ∼ t_{n−1}.
Estimating Difference in Means: X̄ − Ȳ ∼ N(μ_1 − μ_2, σ_1²/n + σ_2²/m). Normalize: (X̄ − Ȳ − (μ_1 − μ_2))/√(σ_1²/n + σ_2²/m) ∼ N(0, 1).
Example: the College of Engineering decided to admit 450 students; an admitted student actually enrolls with probability 0.3. What is the probability that the incoming class has more than 150 students?
Parameter Estimation: a population has a certain distribution with unknown parameter θ; estimate θ based on X_1, …, X_n. The estimator θ̂(X_1, …, X_n), a function of the sample, is a r.v., and is also a point estimator.
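A simulation sketch of the two-sided interval above: x̄ ± z_{α/2}·σ/√n should cover μ in about (1 − α) of repeated samples. The values of μ, σ, n, and the repetition count are illustrative assumptions.

```python
import math, random, statistics

random.seed(0)
mu, sigma, n = 10.0, 2.0, 25
z = 1.96                          # z_{α/2} for α = 0.05
half = z * sigma / math.sqrt(n)   # half-width of the interval

reps = 10_000
hits = 0
for _ in range(reps):
    # Draw a fresh sample, form the interval, check whether it covers μ.
    xbar = statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    hits += xbar - half <= mu <= xbar + half
coverage = hits / reps
print(round(coverage, 3))         # close to 0.95
```

Once a particular x̄ is observed the interval either contains μ or it does not; the 95% refers to the procedure over repetitions, which is what the simulation measures.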
(x̄ − ȳ − z_{α/2}√(σ_1²/n + σ_2²/m), x̄ − ȳ + z_{α/2}√(σ_1²/n + σ_2²/m)) is a 1 − α C.I. for the difference in the means of two normal populations when the variances are known, since P{−z_{α/2} ≤ (X̄ − Ȳ − (μ_1 − μ_2))/√(σ_1²/n + σ_2²/m) ≤ z_{α/2}} = 1 − α, i.e. P{X̄ − Ȳ − z_{α/2}√(σ_1²/n + σ_2²/m) ≤ μ_1 − μ_2 ≤ X̄ − Ȳ + z_{α/2}√(σ_1²/n + σ_2²/m)} = 1 − α.
When the variances are unknown, assume them to be equal and use the pooled sample variance S_p² = [(n − 1)S_1² + (m − 1)S_2²]/[(n − 1) + (m − 1)]; then (X̄ − Ȳ − (μ_1 − μ_2))/(S_p √(1/n + 1/m)) ∼ t_{n+m−2}.
Moment Generating Function: φ(t) = E[e^{tX}] = ∫ e^{tx} f(x)dx; the k-th moment is the k-th derivative of φ at t = 0.
Method of Moments: moments can be expressed as functions of the parameters, E[X^k] = g_k(θ_1, θ_2, …, θ_m). With m unknown parameters, set g_1(θ_1, …, θ_m) = m̂_1, g_2(θ_1, …, θ_m) = m̂_2, …, g_m(θ_1, …, θ_m) = m̂_m, where m̂_k is the k-th sample moment, and solve for θ_1, …, θ_m.
Maximum Likelihood Estimators: given the parameter θ, for each observation x_1, x_2, …, x_n we can calculate the joint density function f(x_1, x_2, …, x_n; θ), and x_1, x_2, …, x_n are independent.
Solution to the enrollment example: X_i ∼ Bernoulli(0.3), X = X_1 + X_2 + ⋯ + X_450 ∼ Binomial(450, 0.3). P{X > 150} = P{X = 151} + P{X = 152} + ⋯ + P{X = 450}. X_1 + ⋯ + X_450 is approximately normal with mean 450 × 0.3 = 135 and standard deviation √(450 × 0.3 × 0.7) = √94.5, so we approximate the discrete r.v. with a continuous one: X ≈ N(135, 94.5). Question: P{X = i} ≈ P{X ∈ ?}. Use P{X = i} ≈ P{i − 0.5 ≤ X ≤ i + 0.5}, the continuity correction, needed whenever a continuous distribution approximates a discrete one (e.g. the normal approximation to the binomial). Then P{X_1 + ⋯ + X_450 > 150} = Σ_{i=151}^{450} P{X_1 + ⋯ + X_450 = i} ≈ Σ_{i=151}^{450} P{i − 0.5 ≤ X ≤ i + 0.5} = P{X ≥ 150.5} = P{Z ≥ (150.5 − 135)/√94.5} = 1 − Φ((150.5 − 135)/√94.5) ≈ 1 − Φ(1.59) ≈ 0.06.
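The enrollment example above, computed directly: X = X_1 + ⋯ + X_450 with X_i ∼ Bernoulli(0.3), approximated by N(135, 94.5) with a continuity correction. Only the standard library is used; Phi is the standard normal CDF built from math.erf.

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, p = 450, 0.3
mu = n * p                          # 135
sigma = math.sqrt(n * p * (1 - p))  # sqrt(94.5)
z = (150.5 - mu) / sigma            # continuity-corrected z-score, ≈ 1.59
print(round(z, 2), round(1.0 - Phi(z), 3))  # tail probability ≈ 0.06
```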
Since the observations are independent, the joint density factorizes: f(x_1, x_2, …, x_n; θ) = f_{X_1}(x_1; θ) f_{X_2}(x_2; θ) ⋯ f_{X_n}(x_n; θ). Given observations x_1, x_2, …, x_n, for each θ define the likelihood L(θ) = f(x_1, x_2, …, x_n; θ). The Maximum Likelihood Estimator (MLE) maximizes L(θ): use the pdf of the distribution with θ in place of the unknown parameter and multiply together the densities of all n samples. It is easier to maximize the log-likelihood l(θ) = log L(θ). The MLE of the mean of the exponential distribution is the sample mean.
Continuity correction rules: if P(X = n), use P(n − 0.5 < X < n + 0.5); if P(X > n), use P(X > n + 0.5); if P(X ≤ n), use P(X < n + 0.5); if P(X < n), use P(X < n − 0.5); if P(X ≥ n), use P(X > n − 0.5).
(x̄ − ȳ − t_{α/2,n+m−2} s_p √(1/n + 1/m), x̄ − ȳ + t_{α/2,n+m−2} s_p √(1/n + 1/m)) is a 1 − α C.I. for the difference in the means of two normal populations when the variances are unknown but equal.
Approximate Confidence Interval: if the population is not normal but the sample size is relatively large, use the results for a normal population mean with known variance to obtain an approximate 1 − α C.I.: (x̄ − z_{α/2} s/√n, x̄ + z_{α/2} s/√n).
Sample variance: S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)², and Σ_{i=1}^n (X_i − X̄)² = Σ_{i=1}^n X_i² − nX̄². So E[Σ_{i=1}^n (X_i − X̄)²] = E[Σ_{i=1}^n X_i² − nX̄²] = nE[X_1²] − nE[X̄²]; since E[X_1²] = Var(X_1) + E[X_1]² = σ² + μ² and E[X̄²] = Var(X̄) + E[X̄]² = σ²/n + μ², this equals nσ² + nμ² − σ² − nμ² = (n − 1)σ². Hence E[S²] = σ²: the sample variance is unbiased.
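A numeric sketch of the claim above that the MLE of the exponential mean is the sample mean: writing the pdf as (1/θ)e^{−x/θ} with mean parameter θ, a grid search over the log-likelihood peaks at θ = x̄. The data values and grid are made up for illustration.

```python
import math

xs = [0.8, 2.1, 0.3, 1.7, 1.1]   # illustrative sample (assumption)
xbar = sum(xs) / len(xs)          # sample mean = 1.2

def log_lik(theta):
    """log L(θ) = Σ (−log θ − x_i/θ) for pdf (1/θ)e^{−x/θ}."""
    return sum(-math.log(theta) - x / theta for x in xs)

grid = [0.2 + 0.01 * i for i in range(300)]   # θ in [0.2, 3.19]
best = max(grid, key=log_lik)                  # grid argmax of log L
print(round(xbar, 2), round(best, 2))          # the two agree
```

Setting d log L/dθ = −n/θ + Σx_i/θ² = 0 gives θ̂ = x̄ analytically, which is what the grid search recovers.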
MLE example (Bernoulli): P{X = 1} = p = p¹(1 − p)⁰ and P{X = 0} = 1 − p = p⁰(1 − p)¹, so p_X(x; p) = p^x(1 − p)^{1−x} and log p_X(x; p) = x log p + (1 − x) log(1 − p). Then l(p) = Σ_{i=1}^n x_i log p + (n − Σ_{i=1}^n x_i) log(1 − p), and setting l′(p) = Σ_{i=1}^n x_i/p − (n − Σ_{i=1}^n x_i)/(1 − p) = 0 gives (1 − p) Σ_{i=1}^n x_i = p(n − Σ_{i=1}^n x_i), i.e. Σ_{i=1}^n x_i = pn. So p̂(X_1, …, X_n) = Σ_{i=1}^n X_i/n = X̄.
Other MLEs: Normal: μ̂(X_1, …, X_n) = X̄, σ̂(X_1, …, X_n) = √(Σ_{i=1}^n (X_i − X̄)²/n). Uniform on [0, θ]: θ̂(X_1, …, X_n) = max{X_1, …, X_n}. Poisson: λ̂(X_1, …, X_n) = X̄.
Point Estimators: biased/unbiased and precise/imprecise. Bias of estimator θ̂(X_1, …, X_n) for parameter θ: b(θ̂) = E[θ̂(X_1, …, X_n)] − θ.
Decomposition of the sum of squares: let Y_i = X_i − μ and Ȳ = X̄ − μ; then Σ_{i=1}^n (X_i − X̄)² = Σ_{i=1}^n ((X_i − μ) − (X̄ − μ))² = Σ_{i=1}^n Y_i² − nȲ², i.e. Σ_{i=1}^n (X_i − X̄)² = Σ_{i=1}^n (X_i − μ)² − n(X̄ − μ)². Rearranging the terms: Σ_{i=1}^n (X_i − μ)² = Σ_{i=1}^n (X_i − X̄)² + n(X̄ − μ)².
Confidence intervals can be 2-sided or 1-sided (upper/lower).
Sampling from a Normal Population: X̄ ∼ N(μ, σ²/n) and (n − 1)S²/σ² ∼ χ²_{n−1}, where S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)².
Chi-square distribution: χ²_n = Z_1² + Z_2² + ⋯ + Z_n² with n degrees of freedom, where the Z_i are independent standard normal; a sum of independent chi-squares is chi-square.
Monte Carlo Simulation: evaluate a definite integral θ = ∫_a^b g(x)dx. Consider a uniform random variable U on [a, b]. By the law of the lazy statistician, E[g(U)] = ∫_a^b g(u) f_U(u)du = (1/(b − a)) ∫_a^b g(u)du = θ/(b − a). To estimate θ, simulate uniform random variables U_1, …, U_k and take θ̂ = (b − a)·(1/k) Σ_{i=1}^k g(U_i). Let ḡ and s be the observed sample mean and standard deviation of the g(u_i)'s; an approximate 1 − α C.I. for E[g(U)] is (ḡ − z_{α/2} s/√k, ḡ + z_{α/2} s/√k).
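A sketch of the Monte Carlo estimator above: θ = ∫_a^b g(x)dx estimated by (b − a) times the mean of g(U_i) for uniform U_i on [a, b]. The choice g(x) = x² on [0, 1] (true value 1/3) is an illustrative assumption.

```python
import random, statistics

random.seed(3)
a, b = 0.0, 1.0

def g(x):
    return x * x          # integrand; ∫_0^1 x² dx = 1/3

k = 200_000
vals = [(b - a) * g(random.uniform(a, b)) for _ in range(k)]
theta_hat = statistics.mean(vals)
se = statistics.stdev(vals) / k**0.5   # standard error of the estimate
print(round(theta_hat, 4), round(se, 5))
# Approximate 95% C.I. as in the notes: theta_hat ± 1.96·se
print(round(theta_hat - 1.96 * se, 4), round(theta_hat + 1.96 * se, 4))
```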
Example: normal 2-sided 95% confidence interval. P{−1.96 < √n(X̄ − μ)/σ < 1.96} = 2Φ(1.96) − 1 = 0.95, so P{−1.96 σ/√n < X̄ − μ < 1.96 σ/√n} = 0.95, hence P{−1.96 σ/√n < μ − X̄ < 1.96 σ/√n} = 0.95 and P{X̄ − 1.96 σ/√n < μ < X̄ + 1.96 σ/√n} = 0.95. If σ is given and we have observed sample mean x̄, then (x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n) is a 95% confidence interval estimate of μ. Once observed, x̄ is a constant and the interval is fixed: 95% is not a probability.
Dividing the decomposition Σ_{i=1}^n (X_i − μ)² = Σ_{i=1}^n (X_i − X̄)² + n(X̄ − μ)² by σ²: Σ_{i=1}^n (X_i − μ)²/σ² = Σ_{i=1}^n (X_i − X̄)²/σ² + n(X̄ − μ)²/σ², i.e. χ²_n = ??? + χ²_1, since with Z_i = (X_i − μ)/σ independent standard normal, Σ_{i=1}^n (X_i − μ)²/σ² = Σ_{i=1}^n Z_i² is chi-square with n degrees of freedom and √n(X̄ − μ)/σ is standard normal. If sampling from a normal population, then X̄ ⊥ S² (the sample mean and sample variance are independent r.v.s), X̄ ∼ N(μ, σ²/n), and the middle term (n − 1)S²/σ² = Σ_{i=1}^n (X_i − X̄)²/σ² ∼ χ²_{n−1}.
Example: CPU time for a type of job is normally distributed with mean 20 seconds and standard deviation 3 seconds. 15 such jobs are tested; what is the probability that the sample variance exceeds 12? Solution: (15 − 1)S²/3² ∼ χ²₁₄.
Two-sided Confidence Interval 1 − α for normal mean with known variance: