(Quick & Dirty) STAT 311 REVIEW

Chapter 1 - Overview and Descriptive Statistics
Chapter 2 - Probability
Chapter 3 - Discrete Random Variables and Probability Distributions
Chapter 4 - Continuous Random Variables and Probability Distributions
Chapter 5 - Joint Probability Distributions and Random Samples
Chapter 6 - Point Estimation

Random Variable X
X assigns a number to each population unit.
• Numerical: Continuous or Discrete
• Categorical: k = 2 categories (Binary, 0/1) or k > 2 categories

Continuous case
Population Distribution of X = Right foot length (mm), with density $f(x)$, the probability density function (pdf).

[Figure: density curve $f(x)$ of right foot length, skewed to the right]

Properties?
• $f(x) \ge 0$
• $\int_{-\infty}^{\infty} f(x)\,dx = 1$

The probability density curve pictured above is "skewed to the right" (or "positively skewed"). But many other possibilities exist, such as:
• skewed to the left (i.e., negatively skewed)
• symmetric (no skew)
• unimodal (one peak)
• bimodal (two peaks)
• uniform (i.e., flat) over some finite interval
• normal (i.e., the "bell curve")

Parameters
• mean: $\mu = E[X] = \int_{-\infty}^{\infty} x\, f(x)\,dx$
• variance: $\sigma^2 = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$
• equivalently: $\sigma^2 = E[X^2] - \mu^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$
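As a worked illustration of the pdf properties and parameters, consider the density $f(x) = 2(1-x)$ on $0 \le x \le 1$. This density is an assumed example chosen because it integrates by hand; it is not the foot-length density from the slides. It is skewed to the right, and each quantity follows by direct integration:

```latex
\begin{align*}
f(x) &= 2(1-x) \ge 0 \quad \text{for } 0 \le x \le 1, \qquad f(x) = 0 \text{ otherwise} \\
\int_0^1 2(1-x)\,dx &= \left[\,2x - x^2\,\right]_0^1 = 1 \\
\mu = E[X] &= \int_0^1 x \cdot 2(1-x)\,dx = \left[\,x^2 - \tfrac{2}{3}x^3\,\right]_0^1 = \tfrac{1}{3} \\
E[X^2] &= \int_0^1 x^2 \cdot 2(1-x)\,dx = \left[\,\tfrac{2}{3}x^3 - \tfrac{1}{2}x^4\,\right]_0^1 = \tfrac{1}{6} \\
\sigma^2 &= E[X^2] - \mu^2 = \tfrac{1}{6} - \tfrac{1}{9} = \tfrac{1}{18}
\end{align*}
```

The shortcut form $E[X^2] - \mu^2$ is usually the quicker route to the variance, since it avoids expanding $(x-\mu)^2$ inside the integral.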
Interval probabilities and the cdf (continuous case)
• cumulative distribution function (cdf): $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
• $P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a)$

Discrete case
Population Distribution of X = Shoe size (…, 9, 9½, 10, 10½, 11, 11½, …), with $p(x)$, the probability mass function (pmf).

  x_i      p(x_i)
  x_1      p(x_1)
  x_2      p(x_2)
  x_3      p(x_3)
  ⋮        ⋮
  Total    1
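A pmf table like the one above can be held as two parallel vectors. The sketch below uses invented shoe-size probabilities, truncated to six sizes for simplicity (the slide's list continues in both directions); the names `sizes`, `probs`, and `pmf_table` are illustrative only.

```r
# Hypothetical pmf for shoe size: sizes and probabilities are invented for illustration.
sizes <- c(9, 9.5, 10, 10.5, 11, 11.5)
probs <- c(0.10, 0.20, 0.30, 0.20, 0.15, 0.05)

# A valid pmf table has nonnegative entries whose total is 1.
all(probs >= 0)
isTRUE(all.equal(sum(probs), 1))

# The cdf F(x) = P(X <= x) is the running total of the pmf column.
cdf <- cumsum(probs)
pmf_table <- data.frame(x = sizes, p = probs, cdf = cdf)
print(pmf_table)

# P(X <= 10) read off the cdf: 0.10 + 0.20 + 0.30
p_le_10 <- cdf[sizes == 10]
```

Using `all.equal` rather than `==` for the total avoids a spurious failure from floating-point rounding.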
Properties of the probability mass function (pmf) $p(x)$?
• $p(x) \ge 0$
• $\sum_x p(x) = 1$

Parameters
• mean: $\mu = E[X] = \sum_x x\, p(x)$
• variance: $\sigma^2 = E[(X-\mu)^2] = \sum_x (x-\mu)^2 p(x)$
• equivalently: $\sigma^2 = E[X^2] - \mu^2 = \sum_x x^2 p(x) - \mu^2$

cumulative distribution function (cdf)
• $F(x) = P(X \le x)$
• $P(a \le X \le b) = \sum_{a \le x \le b} p(x) = F(b) - F(a^-)$, where $F(a^-) = P(X < a)$. Unlike the continuous case, the endpoint $a$ carries probability $p(a)$, so $F(b) - F(a)$ alone would give $P(a < X \le b)$.

Parameter Estimation
Population Distribution of $X \sim \text{Dist}(\mu, \sigma)$, with pdf $f(x)$ or pmf $p(x)$. Random sample of size n: $X_1, \ldots, X_n$.
• How do we obtain a random sample-based estimator $\hat{\mu}$ of the population mean $\mu$?
  $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
• How do we obtain a random sample-based estimator $\hat{\sigma}^2$ of the population variance $\sigma^2$?
  $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
Moreover, $E[\bar{X}] = \mu$ and $E[S^2] = \sigma^2$. That is, $\bar{X}$ is an unbiased estimator of $\mu$, and $S^2$ is an unbiased estimator of $\sigma^2$.

In general…
Population Distribution of $X \sim \text{Dist}(\theta)$. How do we obtain a random sample-based estimator $\hat{\theta}$ of a population parameter $\theta$?
• $\hat{\theta} = \hat{\theta}(X_1, X_2, \ldots, X_n)$
• Method of Moments, MLE,… (Stat 311)
• Properties (e.g., bias)? Improvement?

Repeated sampling
• Sample 1, size n: $\bar{X}_1$
• Sample 2, size n: $\bar{X}_2$
• Sample 3, size n: $\bar{X}_3$
• Sample 4, size n: $\bar{X}_4$
• … etc…
How are these random $\bar{X}$ values distributed?
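The two unbiasedness claims and the repeated-sampling picture can be explored by simulation. The sketch below assumes an Exponential(rate = 1) population, chosen only because its mean and variance are both 1; the seed, sample size, and number of repetitions are arbitrary choices, not values from the slides.

```r
# Assumed population for illustration only: Exponential(rate = 1),
# so mu = 1 and sigma^2 = 1.
set.seed(311)                      # arbitrary seed, for reproducibility
n    <- 10                         # size of each random sample
reps <- 20000                      # number of repeated samples

# Each column of `samples` is one random sample X_1, ..., X_n.
samples <- matrix(rexp(n * reps, rate = 1), nrow = n)

xbar <- colMeans(samples)          # one sample mean per sample
s2   <- apply(samples, 2, var)     # var() uses the n - 1 denominator

mean(xbar)   # should be close to mu = 1
mean(s2)     # should be close to sigma^2 = 1

# Dividing by n instead of n - 1 underestimates sigma^2 by the factor (n - 1)/n.
mean(s2 * (n - 1) / n)   # should be close to 0.9
```

A call such as `hist(xbar)` then gives a first look at how the sample means are distributed across repeated samples.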
Sampling Distribution of $\bar{X}$
Population Distribution of X, with pdf $f(x)$ or pmf $p(x)$. Throughout, $N(\mu, \sigma)$ denotes the normal distribution with mean $\mu$ and standard deviation $\sigma$.

IMPORTANT FACT!
As long as $\mu$ and $\sigma$ exist (are finite), $\bar{X} \approx N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ for "large" values of n (> 30). The quantity $\sigma/\sqrt{n}$ is the "standard error" of the mean (SEM).

Suppose X follows a normal distribution, $X \sim N(\mu, \sigma)$. Then $\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ exactly, for ALL values of n, not only "large" ones.

Normal Distribution $N(\mu, \sigma)$
In general… What symmetric interval about the mean $\mu$ contains $100(1-\alpha)\%$ of the population values?
• The central area is $1 - \alpha$, leaving an area of $\alpha/2$ in each tail.
• The interval is $\mu \pm z_{\alpha/2}\,\sigma$, where $\pm z_{\alpha/2}$ are the "$\alpha/2$ critical values" of the standard normal distribution.

Example: $\alpha = .05$
• The central area is .95, with area .025 in each tail.
• The ".025 critical values" are $-z_{.025} = -1.96$ and $+z_{.025} = +1.96$.
• "Approximately 95% of any normally-distributed population lies within 2 standard deviations of the mean."
Use the included table or R, with cumulative areas .025 and .975:
    > qnorm(c(.025, .975))
    [1] -1.959964  1.959964

To summarize… (END REVIEW)
Random Variable X, continuous or discrete, with population distribution $\text{Dist}(\mu, \sigma)$ and pdf $f(x)$ or pmf $p(x)$. Suppose we wish to estimate the mean $\mu$ from a particular random sample $x_1, x_2, \ldots, x_n$ of size n, using $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. The sampling distribution of $\bar{X}$ is $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$. We can now use known properties of the "bell curve" to improve our estimate.
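To make the critical values and the standard error concrete, the sketch below computes both intervals for one set of made-up numbers. The values of `mu`, `sigma`, `n`, and `alpha` are hypothetical and not taken from the slides; only `qnorm` and `pnorm` from base R are used.

```r
# Hypothetical population values, chosen only to illustrate the formulas.
mu    <- 250    # population mean (mm)
sigma <- 12     # population standard deviation (mm)
n     <- 36     # sample size
alpha <- 0.05

z   <- qnorm(1 - alpha / 2)        # z_{alpha/2} critical value, about 1.96
sem <- sigma / sqrt(n)             # standard error of the mean: 12 / 6 = 2

# Symmetric interval containing 100(1 - alpha)% of individual population values
pop_interval  <- mu + c(-1, 1) * z * sigma
# Symmetric interval containing 100(1 - alpha)% of sample means for samples of size n
xbar_interval <- mu + c(-1, 1) * z * sem

# Check the coverage of the population interval with the normal cdf
coverage <- pnorm(pop_interval[2], mu, sigma) - pnorm(pop_interval[1], mu, sigma)
coverage   # 1 - alpha
```

The interval for sample means is narrower than the interval for individual values by the factor $\sqrt{n}$, which is the sense in which averaging sharpens an estimate of $\mu$.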