Chapter 15: Sampling Distributions
Statistics has two fields.
Descriptive statistics: consists of procedures used to summarize and describe the important
characteristics of a set of measurements.
Inferential statistics: consists of procedures used to make inferences or generalizations about
population characteristics from information contained in a sample drawn from this population.
The large group we are studying ⇒ Population
Population is the complete collection of all individuals.
Parameter: A numerical measurement describing some characteristic of a population.
Mean ($\mu$), Variance ($\sigma^2$), Standard Deviation ($\sigma$)
The small part of the large group whose information we are going to use ⇒ Sample
A sample is a subcollection of members selected from the population.
Statistic: A numerical measurement describing some characteristic of a sample.
Mean ($\bar{x}$), Variance ($s^2$), Standard Deviation ($s$)
We need to estimate the population parameters from the sample data, which are drawn from the population.
Population distribution of a variable:
The distribution of values of the variable among all individuals in the population.
Law of Large Numbers: Draw observations at random from any population with finite mean $\mu$.
As the number of observations drawn increases, the mean $\bar{x}$ of the observed values tends to get
closer and closer to the mean $\mu$ of the population.
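The Law of Large Numbers can be watched in a small simulation. The sketch below is only an illustration: the population (rolls of a fair six-sided die, so $\mu = 3.5$), the seed, and the checkpoint sizes are all assumptions chosen for the example, not part of the notes.

```python
import random

# Assumed population for illustration: rolls of a fair six-sided die,
# whose population mean is (1 + 2 + ... + 6) / 6 = 3.5.
POPULATION_MEAN = 3.5


def running_means(num_draws, checkpoints, seed=0):
    """Return {n: mean of the first n draws} for every n in checkpoints."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = {}
    for n in range(1, num_draws + 1):
        total += rng.randint(1, 6)  # draw one observation at random
        if n in checkpoints:
            means[n] = total / n
    return means


checkpoints = {10, 100, 1_000, 10_000, 100_000}
lln_means = running_means(100_000, checkpoints)

for n in sorted(lln_means):
    gap = abs(lln_means[n] - POPULATION_MEAN)
    print(f"n = {n:>6}: sample mean = {lln_means[n]:.4f}, distance from mu = {gap:.4f}")
```

The distance from $\mu$ does not have to shrink at every single step, but it tends to be much smaller for the large values of n than for the small ones.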
Sampling distribution of a statistic:
The distribution of values taken by the statistic in all possible samples of the same size (n) from
the same population.
If $\bar{x}$ is the mean of an SRS of size $n$ drawn from a large population with mean $\mu$ and standard
deviation $\sigma$, then
1) The sampling distribution of $\bar{x}$ is centered at the population mean: its mean is $\mu$.
2) The standard deviation of the sampling distribution of $\bar{x}$ is $\sigma/\sqrt{n}$.
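Both facts can be checked by brute force on a tiny population. The sketch below is an illustration under stated assumptions: the four-value population and the sample size n = 2 are made up, and samples are drawn with replacement, which stands in for an SRS from a large population. It lists every possible sample, computes each sample mean, and compares the mean and standard deviation of those means with $\mu$ and $\sigma/\sqrt{n}$.

```python
import itertools
import math
import statistics

# Hypothetical tiny population and sample size, chosen only for illustration.
population = [2.0, 4.0, 6.0, 8.0]
n = 2

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

# Every possible ordered sample of size n, drawn with replacement.
all_samples = list(itertools.product(population, repeat=n))
sample_means = [statistics.fmean(sample) for sample in all_samples]

# Mean and standard deviation of the sampling distribution of the sample mean.
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(f"number of possible samples: {len(all_samples)}")
print(f"mu = {mu}, mean of sample means = {mean_of_means}")
print(f"sigma / sqrt(n) = {sigma / math.sqrt(n):.4f}, sd of sample means = {sd_of_means:.4f}")
```

Because every possible sample is listed, the agreement here is exact rather than approximate.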
Sampling distribution of a sample mean
Individual observations: $N(\mu, \sigma)$
The sample mean $\bar{x}$ of an SRS of size $n$: $N(\mu, \sigma/\sqrt{n})$
Sampling distribution of the mean: $\bar{x}$
The distribution of sample means.
Sampling distribution of the variance: $s^2$
The distribution of sample variances.
Sampling distribution of the proportion: p̂
The distribution of sample proportions.
p̂ = sample proportion
Note: All samples have the same size n. They are taken from the same population.
So we need a good estimator!
Estimator: A statistic used to infer (estimate) the value of a population parameter.
E (a good estimator) = Population Parameter
Unbiased estimators (= good estimators):
Sample means, sample variances, and sample proportions are unbiased estimators.
Biased estimators:
Sample medians, sample ranges, and sample standard deviations are biased estimators.
Note: The standard deviation’s bias is relatively small in a large sample, so $s$ is often used
to estimate $\sigma$.
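The difference between an unbiased and a biased estimator can be seen by averaging the estimator over every possible sample. The sketch below is an illustration only: the four-value population and the sample size n = 2 are hypothetical, samples are drawn with replacement, and the sample variance uses the usual n - 1 divisor. It compares the average of $s^2$ with $\sigma^2$ and the average of $s$ with $\sigma$.

```python
import itertools
import math
import statistics

# Hypothetical tiny population and sample size, chosen only for illustration.
population = [2.0, 4.0, 6.0, 8.0]
n = 2

sigma2 = statistics.pvariance(population)  # population variance
sigma = math.sqrt(sigma2)                  # population standard deviation

# Every possible ordered sample of size n, drawn with replacement.
samples = list(itertools.product(population, repeat=n))

# statistics.variance and statistics.stdev use the n - 1 divisor.
expected_s2 = statistics.fmean([statistics.variance(s) for s in samples])
expected_s = statistics.fmean([statistics.stdev(s) for s in samples])

print(f"sigma^2 = {sigma2:.4f}, average of s^2 over all samples = {expected_s2:.4f}")
print(f"sigma   = {sigma:.4f}, average of s   over all samples = {expected_s:.4f}")
```

In this setup the average of $s^2$ matches $\sigma^2$, while the average of $s$ falls below $\sigma$, which is what the bias of the sample standard deviation means.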
If the sample size increases, what kind of results do we have?
The Central Limit Theorem
The Central Limit Theorem tells us that as the sample size increases,
1) the sampling distribution of sample means tends to approach a normal distribution.
2) the mean of all sample means is equal to the mean of the original population.
3) the standard deviation of the sample means becomes smaller.
$N(\mu, \sigma) \Rightarrow N(\mu, \sigma/\sqrt{n})$
Notation
The mean of the sample means: $\mu_{\bar{x}} = \mu$
The standard deviation of the sample means: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
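The Central Limit Theorem is easiest to appreciate when the population is clearly not normal. The sketch below is a simulation under assumed settings: an exponential population with mean 1 and standard deviation 1, samples of size n = 40, 5,000 repeated samples, and a fixed seed. It checks that the sample means cluster around $\mu$ with spread close to $\sigma/\sqrt{n}$, and that roughly 68% of them fall within one $\sigma/\sqrt{n}$ of $\mu$, as a normal distribution would suggest.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, which is strongly right-skewed
# and has mean 1 and standard deviation 1.
MU, SIGMA = 1.0, 1.0
n = 40               # size of each sample (assumption)
num_samples = 5_000  # number of repeated samples (assumption)

# Draw many samples of size n and record the mean of each one.
clt_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(clt_means)
sd_of_means = statistics.stdev(clt_means)
theoretical_sd = SIGMA / math.sqrt(n)

# Share of sample means within one theoretical standard deviation of mu.
within_one_sd = sum(abs(m - MU) <= theoretical_sd for m in clt_means) / num_samples

print(f"mean of sample means: {mean_of_means:.4f} (mu = {MU})")
print(f"sd of sample means:   {sd_of_means:.4f} (sigma / sqrt(n) = {theoretical_sd:.4f})")
print(f"share within one sd:  {within_one_sd:.3f} (about 0.68 for a normal distribution)")
```

Rerunning with a smaller n, such as 2 or 5, should show the skew of the population still visible in the sample means, so the normal approximation is poorer there.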
Applying the Central Limit Theorem
Individual value:
When working with an individual value from a normally distributed population:
$z = \frac{x - \mu}{\sigma}$
Sample of values:
When working with a mean for some sample (or group):
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma/\sqrt{n}}$
Ex.: The average cholesterol content of a certain brand of eggs is 215 milligrams and
the standard deviation is 15 milligrams.
Assume the variable is normally distributed.
a. If a single egg is selected, find the probability that the cholesterol content will be more
than 220 milligrams.
b. If a sample of 25 eggs is selected, find the probability that the mean of the sample
will be larger than 220 milligrams.
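One way to work through both parts is to compute the z-score for each case and read the upper-tail probability from the standard normal distribution. The sketch below uses Python's `statistics.NormalDist` in place of a printed z-table. The variable names are illustrative, and the numbers come straight from the example.

```python
import math
from statistics import NormalDist

mu = 215.0         # population mean cholesterol (mg)
sigma = 15.0       # population standard deviation (mg)
threshold = 220.0  # value of interest (mg)
n = 25             # sample size in part b

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Part a: a single egg, so z = (x - mu) / sigma.
z_single = (threshold - mu) / sigma
p_single = 1 - standard_normal.cdf(z_single)

# Part b: the mean of 25 eggs, so z = (xbar - mu) / (sigma / sqrt(n)).
sigma_xbar = sigma / math.sqrt(n)
z_mean = (threshold - mu) / sigma_xbar
p_mean = 1 - standard_normal.cdf(z_mean)

print(f"a. z = {z_single:.2f}, P(x > 220) = {p_single:.4f}")
print(f"b. z = {z_mean:.2f}, P(xbar > 220) = {p_mean:.4f}")
```

The z-scores come out to about 0.33 and 1.67, giving probabilities of roughly 0.37 for a single egg and roughly 0.05 for the mean of 25 eggs. The sample mean is far less likely to exceed 220 mg because its standard deviation is 15/5 = 3 mg rather than 15 mg.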