Lecture 9 – Random Samples, Statistics, and Central Limit Theorem

Independent random variables X_1, X_2, ..., X_n with the same distribution are called a random sample. We use statistics to describe a sample; examples of statistics include the sample mean, the sample standard deviation, etc. It is important to realize that any sample statistic is a function of the random variables in the random sample. For example, consider an experiment in which I select ten students and measure their heights. If I repeat the experiment, there is no guarantee that the sample mean will be the same in both trials. Therefore, the sample mean and sample variance are random variables, not constants. The probability distribution of a statistic is called its sampling distribution.

Statistics of the Sample Mean

Consider the mean of a random sample in which each X_i has mean \mu and variance \sigma^2:

\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}

Using the rules derived for functions of random variables, the expected value of the sample mean is

E(\bar{X}) = E\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{\mu + \mu + \cdots + \mu \ (n \text{ times})}{n} = \mu

Similarly, the variance of the sample mean is

V(\bar{X}) = \frac{\sigma^2}{n}

Central Limit Theorem

The central limit theorem states that, for large n, the probability distribution of the sum of n independent and identically distributed random variables can be approximated by the normal distribution. In particular, the sample mean is such a sum scaled by 1/n, so for large sample sizes

\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)

On a related note, the sum of the n random variables is also approximately normal:

n\bar{X} \sim N(n\mu, n\sigma^2)

If the X_i are Normal

If the X_i are themselves normal, then the sample mean is exactly normal irrespective of the sample size.

Normal Distribution

Many observations in real life follow the normal distribution, because many real-life quantities are sums of multiple units (for example, the thickness of a book or a box of gloves). The normal probability density function is

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-0.5\left(\frac{x-\mu}{\sigma}\right)^2} \quad \text{for } -\infty < x < \infty

The mean and variance of a normal variable are

E(X) = \mu, \qquad V(X) = \sigma^2

If a variable follows the normal distribution with parameters \mu and \sigma^2, it is represented as X \sim N(\mu, \sigma^2).

Standard Normal Distribution

There is no analytical solution for the integral of the normal probability density function, so we use tables. But we cannot build a separate table of probabilities for every \mu and \sigma^2. Therefore, we standardize the normal variable by subtracting the mean and dividing by the standard deviation. The resulting standardized variable has mean zero and standard deviation one:

Z = \frac{X - \mu}{\sigma}, \qquad Z \sim N(0, 1)

Now we can compute probabilities for X from the standard normal table by standardizing:

f(z) = \frac{1}{\sqrt{2\pi}} e^{-0.5 z^2} \quad \text{for } -\infty < z < \infty
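To make these results concrete, here is a minimal simulation sketch (not part of the lecture) in Python/NumPy. It draws many random samples from an exponential distribution, a deliberately non-normal choice, and checks that the sample mean has expectation \mu, variance \sigma^2/n, and an approximately standard normal distribution after standardizing. The sample size n = 30, the number of trials, and the exponential parameters are illustrative assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 2.0, 2.0        # an exponential with scale 2 has mean 2 and standard deviation 2
n = 30                      # size of each random sample (assumed for illustration)
trials = 100_000            # number of repeated samples

# Draw many random samples and compute the sample mean of each one.
samples = rng.exponential(scale=mu, size=(trials, n))
xbar = samples.mean(axis=1)

# The sampling distribution of the mean should have E(Xbar) = mu and V(Xbar) = sigma^2 / n.
print("E(Xbar) ~", xbar.mean(), " (theory:", mu, ")")
print("V(Xbar) ~", xbar.var(), " (theory:", sigma**2 / n, ")")

# Standardize the sample means; by the central limit theorem the result is roughly N(0, 1).
z = (xbar - mu) / (sigma / np.sqrt(n))
print("P(Z <= 1.96) ~", (z <= 1.96).mean(), " (standard normal table: about 0.975)")
```

Running this should show the empirical mean and variance of the sample means close to the theoretical values, and the standardized probability close to the table value, even though the underlying data are skewed rather than normal.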