Lecture 9 – Random Samples, Statistics, and Central Limit Theorem
Independent random variables X1, X2, …, Xn with the same distribution are called a random
sample. We use statistics to describe a sample. Examples of statistics include the sample mean
and the sample standard deviation.
It is important to realize that any sample statistic is a function of the random variables in a
random sample. For example, consider an experiment in which I select ten students and
measure their heights. If I repeat the experiment there is no guarantee that the mean of the
sample will be the same for both trials.
Therefore, the sample mean and sample variance are random variables and not constants. The
probability distribution of a statistic is called the sampling distribution.
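The height experiment above can be mimicked with a short simulation. This is only a sketch: the population mean of 170 cm, the standard deviation of 8 cm, and the function name `sample_mean_height` are assumptions chosen for illustration and do not come from the lecture.

```python
import random
import statistics

def sample_mean_height(rng, n=10, mu=170.0, sigma=8.0):
    """Draw n hypothetical heights (cm) and return their sample mean.

    mu and sigma are assumed population values, used only for illustration.
    """
    heights = [rng.gauss(mu, sigma) for _ in range(n)]
    return statistics.fmean(heights)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Two repetitions of the same experiment: select ten students, average their heights.
trial_1 = sample_mean_height(rng)
trial_2 = sample_mean_height(rng)

print(f"trial 1 sample mean: {trial_1:.2f} cm")
print(f"trial 2 sample mean: {trial_2:.2f} cm")
```

The two printed means differ even though both samples come from the same population, which is why the sample mean is treated as a random variable.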
Statistics of the Sample Mean
Consider the mean of a random sample where each Xi has mean μ and variance σ²:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Using the rules derived for functions of random variables, the expected value of the
sample mean is
$$E(\bar{X}) = E\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = \frac{\mu + \mu + \cdots + \mu \ (n \text{ times})}{n} = \mu$$
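Similarly, because the Xi are independent, their variances add, and the variance of the
sample mean is given by
$$V(\bar{X}) = \frac{\sigma^2 + \sigma^2 + \cdots + \sigma^2 \ (n \text{ times})}{n^2} = \frac{\sigma^2}{n}$$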
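The two results for the sample mean can be checked empirically. The sketch below assumes normally distributed Xi with μ = 5 and σ = 2 and a sample size of n = 25. These values, and the helper name `simulate_sample_means`, are illustrative assumptions only.

```python
import random
import statistics

def simulate_sample_means(n, trials, rng, mu=5.0, sigma=2.0):
    """Return `trials` sample means, each computed from n independent draws."""
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 25
means = simulate_sample_means(n, trials=20_000, rng=rng)

mean_of_means = statistics.fmean(means)    # should be close to mu = 5
var_of_means = statistics.variance(means)  # should be close to sigma^2 / n = 4 / 25

print(f"average of sample means:  {mean_of_means:.3f}")
print(f"variance of sample means: {var_of_means:.4f} (theory: {2.0**2 / n:.4f})")
```

With these settings the simulated variance should land near σ²/n = 0.16, well below the variance of a single observation (σ² = 4).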
Central Limit Theorem
The central limit theorem states that, for large n, the probability distribution of the sum
of n independent and identically distributed random variables with finite variance can be
approximated by the normal distribution. More specifically, the sample mean is the sum of n
independent and identically distributed random variables divided by n. Therefore, for
large sample sizes, approximately,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
On a related note, the sum of the n random variables is also approximately normally distributed,
$$n\bar{X} \sim N(n\mu, n\sigma^2)$$
If Xi are normal
If the Xi are normally distributed, then the sample mean is exactly normal irrespective of the sample size.
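One rough way to see the central limit theorem at work is to average clearly non-normal data and check how often the sample mean falls inside the band μ ± 1.96σ/√n, which holds about 95% of the probability under the normal approximation. The sketch below assumes exponential Xi with rate 1 (so μ = 1 and σ = 1). That distribution, the sample size of 50, and the function name are illustrative choices, not part of the lecture.

```python
import math
import random
import statistics

def fraction_within_normal_band(n, trials, rng, z=1.96):
    """Fraction of Exp(1) sample means that fall within mu +/- z * sigma / sqrt(n)."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of an Exp(rate=1) variable
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        if abs(xbar - mu) <= half_width:
            hits += 1
    return hits / trials

rng = random.Random(1)  # fixed seed so the run is repeatable
coverage = fraction_within_normal_band(n=50, trials=10_000, rng=rng)
print(f"fraction of sample means inside the band: {coverage:.3f}")
```

A fraction near 0.95 is consistent with the normal approximation for the sample mean, even though the individual observations are strongly skewed.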
Normal Distribution
Many observations in real life approximately follow the normal distribution. This is because
many measured quantities are sums of multiple units. Examples: the thickness of a book (a sum
of page thicknesses) or a box of gloves.
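The probability density function of the normal distribution is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-0.5\left(\frac{x-\mu}{\sigma}\right)^2} \quad \text{for } -\infty < x < \infty$$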
The mean and variance of a normal variable are given by
$$E(X) = \mu, \qquad V(X) = \sigma^2$$
If a variable follows the normal distribution with parameters μ and σ², then it is
represented as $X \sim N(\mu, \sigma^2)$.
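The density formula translates directly into code. The sketch below writes it out by hand and compares it with `NormalDist.pdf` from Python's standard `statistics` module. The function name `normal_pdf` and the parameter values μ = 10, σ = 3 are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2 * pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 10.0, 3.0  # illustrative parameters, not from the lecture
x = 12.5

by_hand = normal_pdf(x, mu, sigma)
library = NormalDist(mu, sigma).pdf(x)
print(f"hand-written formula: {by_hand:.6f}")
print(f"statistics.NormalDist: {library:.6f}")

# The density peaks at x = mu, where it equals 1 / (sigma * sqrt(2 * pi)).
peak = normal_pdf(mu, mu, sigma)
```

Both printed values should agree, which is a quick check that the exponent and the normalizing constant were transcribed correctly.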
Standard Normal Distribution
The integral of the normal probability density function has no closed-form expression in
terms of elementary functions, so we use tables. But we cannot build separate tables of
probability for each μ and σ². Therefore, we standardize the normal variable by subtracting
the mean and dividing by the standard deviation. The resulting standardized normal variable
has a mean of zero and a standard deviation of 1.
$$Z = \frac{X - \mu}{\sigma}$$
Now we can use the tables to compute probabilities by standardizing X.
$$Z \sim N(0, 1)$$
$$f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-0.5 z^2} \quad \text{for } -\infty < z < \infty$$
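As a worked illustration of standardizing, suppose X ~ N(100, 15²) and we want P(X ≤ 130). These numbers are hypothetical and chosen only so that z comes out to a round value. In the sketch below the error function `math.erf` stands in for the printed table, using the identity Φ(z) = 0.5 (1 + erf(z/√2)); the helper names are assumptions.

```python
import math

def standardize(x, mu, sigma):
    """Convert x to a z-value by subtracting the mean and dividing by sigma."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1); plays the role of the standard normal table."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 100.0, 15.0  # hypothetical parameters for X
z = standardize(130.0, mu, sigma)
p = standard_normal_cdf(z)

print(f"z = {z:.2f}")
print(f"P(X <= 130) = P(Z <= {z:.2f}) = {p:.4f}")  # about 0.9772
```

The same two steps apply to any normal variable: compute z first, then read the probability for that z from the single standard normal table.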