Chapter 5 Sampling Distributions

The Distribution of a Sample Statistic


Examples

Take a random sample of students and compute the average GPA in the sample.
Ask 20 people whether or not they will vote NPA. Get the number who say yes.
Get the proportion of people in an SRS who favor the Olympics.

These statistics (the sample average, the sample count, and the sample proportion) are random variables: the results vary from sample to sample.

The probability distribution of a statistic is called its sampling distribution.

Some important sampling distributions

Distribution of:
Sample average
Sample count (number of successes in sample)
Sample proportion (proportion of successes in sample)
Many others

Binomial setting for sample counts and proportions (Section 5.1)

Fixed number, n, of observations.
Only two outcomes for each observation: call them Success or Failure (S/F).
Observations are all independent.
Probability of success, p, is the same for each trial.
_____________________
In the above setting we can look at the probability distribution of the sample count of successes and of the sample proportion of successes.


Binomial Distribution

Let X be the number of successes in a binomial setting.

The probability distribution of X is called the binomial distribution with parameters n and p.

Binomial Formula

Loaded coin: each toss has probability 0.6 of coming up heads.

Toss 3 times.

Toss 10 times.
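
The formula itself did not survive on this slide. The standard binomial formula is P(X = k) = C(n, k) p^k (1-p)^(n-k), where C(n, k) is the number of ways to choose k successes out of n trials. A minimal Python sketch applying it to the loaded coin follows; the helper name binom_pmf is an illustrative choice, not something from the course.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial count with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_heads = 0.6  # loaded coin from the slide

# Distribution of the number of heads in 3 tosses
dist_3 = [binom_pmf(k, 3, p_heads) for k in range(4)]
for k, prob in enumerate(dist_3):
    print(f"3 tosses:  P(X = {k}) = {prob:.3f}")

# Distribution of the number of heads in 10 tosses
dist_10 = [binom_pmf(k, 10, p_heads) for k in range(11)]
for k, prob in enumerate(dist_10):
    print(f"10 tosses: P(X = {k}) = {prob:.4f}")
```

For 3 tosses the four probabilities can also be checked by hand, e.g. P(X = 2) = 3 × 0.6² × 0.4 = 0.432.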

Binomial Example

Six customers enter a restaurant. With probability 0.75, any given customer purchases a meal. Purchases are independent of one another.
Analyze the number of meals purchased.
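
One way to carry out this analysis is sketched below in Python. The particular quantities computed (mean, standard deviation, the chance that all six buy, the chance that at least four buy) are illustrative choices, since the slide does not say which ones were worked in class.

```python
from math import comb, sqrt

n, p = 6, 0.75  # six customers, each buys a meal with probability 0.75

# Binomial probabilities for X = number of meals purchased
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

mean = n * p                       # expected number of meals
std_dev = sqrt(n * p * (1 - p))    # standard deviation of the count
p_all_six = pmf[6]                 # everyone buys a meal
p_at_least_four = sum(pmf[k] for k in range(4, n + 1))

for k, prob in pmf.items():
    print(f"P(X = {k}) = {prob:.4f}")
print(f"mean = {mean}, std dev = {std_dev:.4f}")
print(f"P(X = 6) = {p_all_six:.4f}, P(X >= 4) = {p_at_least_four:.4f}")
```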

Tables and Excel for Computing Binomial Probabilities

Table C (not in my text, but should be there) has binomial probabilities.
Gives the probability of exactly k successes in n trials, for a given p.

Excel

BINOMDIST(6,20,0.4,FALSE)
Probability of exactly 6 successes in 20 trials with probability of success 0.4 on each trial.

BINOMDIST(6,20,0.4,TRUE)
Probability of 6 or fewer successes in 20 trials with probability of success 0.4 on each trial.
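
For readers without Excel, the two calls above can be reproduced with a few lines of Python. This is only a sketch: the function binomdist below is a hypothetical helper written to mirror Excel's argument order, not a library function.

```python
from math import comb


def binomdist(k: int, n: int, p: float, cumulative: bool) -> float:
    """Mirror of Excel's BINOMDIST(k, n, p, cumulative) argument order."""
    def pmf(j: int) -> float:
        return comb(n, j) * p**j * (1 - p) ** (n - j)

    if cumulative:
        # Probability of k or fewer successes
        return sum(pmf(j) for j in range(k + 1))
    # Probability of exactly k successes
    return pmf(k)


exactly_6 = binomdist(6, 20, 0.4, False)
at_most_6 = binomdist(6, 20, 0.4, True)
print(f"P(X = 6)  = {exactly_6:.4f}")
print(f"P(X <= 6) = {at_most_6:.4f}")
```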
Midterms – use binomial formula and Excel formula.
HW – formula, table or Excel.

Example: 20% of govt. employees are dissatisfied with their wage. I take an SRS of 40 employees. What are the expected value and the standard deviation of the number of dissatisfied employees in my sample?
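
Treating the count as binomial with n = 40 and p = 0.2 (reasonable if the population of employees is much larger than the sample), the mean is np and the standard deviation is sqrt[np(1-p)]. A short Python check of the arithmetic:

```python
from math import sqrt

n, p = 40, 0.20  # sample size and proportion dissatisfied, from the example

expected_count = n * p                  # np
std_dev_count = sqrt(n * p * (1 - p))   # sqrt[np(1-p)] = sqrt(6.4)

print(f"expected number dissatisfied = {expected_count:.1f}")
print(f"std. deviation of the count  = {std_dev_count:.2f}")
```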

Sample Proportion

What about the proportion of dissatisfied employees in my sample?
Called the sample proportion.
Its probability distribution can be computed via the binomial distribution.
Also, the sample proportion has
Mean = p
Std. Dev = SQRT[p(1-p)/n].
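
These two facts follow from the binomial mean and standard deviation, because the sample proportion is just the count X divided by n. A short derivation, with the employee example (p = 0.2, n = 40) plugged in at the end; the symbol \hat{p} for the sample proportion is notation assumed here, not taken from the slide.

```latex
\hat{p} = \frac{X}{n}, \qquad
E[\hat{p}] = \frac{E[X]}{n} = \frac{np}{n} = p, \qquad
\mathrm{SD}(\hat{p}) = \frac{\mathrm{SD}(X)}{n}
  = \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{p(1-p)}{n}}

% Employee example: p = 0.2, n = 40
\mathrm{SD}(\hat{p}) = \sqrt{\frac{0.2 \times 0.8}{40}} = \sqrt{0.004} \approx 0.063
```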

Normal Approximation for Counts and Proportions

Computing binomial probabilities when n is large.

E.g. Suppose 90% of Canadians favor the Liberal party! A Gallup poll takes an SRS of 1600 Canadians. How likely is it that at least 1460 of those sampled favor the Liberals?
Approximately binomial, but we could spend forever computing this.
What to do?
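
Normal approximation for counts and proportions

When n is large, the distribution of the sample count X is approximately normal with
mean = np and std. dev = sqrt[np(1-p)]

Same thing for sample proportion.

Distribution of sample proportion is approximately normal with
mean = p and std. dev = sqrt[p(1-p)/n]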

Example of normal approximation to binomial

Consider our Gallup poll. We have p = 0.9, n = 1600. Recall that we want the probability that at least 1460 of the people in our sample favor the Liberals.

We’ll also approximate the probability that at least 85% of the people in our sample favor the Liberals.
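
Both probabilities can be approximated with a normal distribution, using mean np and std. dev sqrt[np(1-p)] for the count, and mean p and std. dev sqrt[p(1-p)/n] for the proportion. The Python sketch below uses the standard library's statistics.NormalDist and applies no continuity correction.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1600, 0.9  # Gallup poll example

# Sample count: approximately normal with mean 1440 and std. dev 12
count_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
p_count = 1 - count_dist.cdf(1460)   # P(X >= 1460), z is about 1.67

# Sample proportion: approximately normal with mean 0.9 and std. dev 0.0075
prop_dist = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))
p_prop = 1 - prop_dist.cdf(0.85)     # P(p-hat >= 0.85), z is about -6.67

print(f"P(at least 1460 favor Liberals) is roughly {p_count:.4f}")
print(f"P(at least 85% favor Liberals)  is roughly {p_prop:.6f}")
```

The second probability is essentially 1, because 85% is more than six standard deviations below the mean of 90%.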

The continuity correction (optional)

Improves the accuracy of the normal approximation to the binomial.

Can lead to substantially better approximations, especially for smaller values of n.
Here it is, for a count X and whole numbers a and b:

Adjust P(a ≤ X ≤ b) by using P(a - 0.5 < X < b + 0.5)

Adjust P(X ≥ a) by using P(X > a - 0.5)

Adjust P(X ≤ b) by using P(X < b + 0.5)

Try it for our Gallup poll example.

Try it for the situation when p = 0.5, n = 25.
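
For the p = 0.5, n = 25 case, the effect of the correction can be seen by comparing against the exact binomial answer. The sketch below does this in Python for P(X ≥ 15); the cutoff 15 is an illustrative choice, since the slide does not say which probability to try.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.5
a = 15  # illustrative cutoff (an assumption, not from the slides)

# Exact binomial probability P(X >= a)
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a, n + 1))

# Normal approximation with mean np and std. dev sqrt[np(1-p)]
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
plain = 1 - approx.cdf(a)            # no continuity correction
corrected = 1 - approx.cdf(a - 0.5)  # P(X >= a) adjusted to P(X > a - 0.5)

print(f"exact binomial        : {exact:.4f}")
print(f"normal, no correction : {plain:.4f}")
print(f"normal, with correction: {corrected:.4f}")
```

With n this small, the corrected value lands much closer to the exact probability than the uncorrected one.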

Review question on binomial (you do)

70% of all mortgage applicants are granted a loan.
5 people apply for a mortgage today. What is the probability that exactly 3 get a loan?
Suppose 100 people apply for a mortgage each month.
  What are the mean and std. dev. of the number who are granted loans?
  How likely is it that at least 75 of the 100 people are granted loans?
  Find the 95th percentile of the number who are granted loans.

Hints
For the second part, use the normal approx. to the binomial.
Mu = np
Sigma = sqrt[n*p*(1-p)]
Then it’s just a normal distribution question.
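
After attempting the question by hand, the answers can be checked with a sketch like the following. It follows the hints: the binomial formula for the first part, and the normal approximation (without continuity correction) for the rest.

```python
from math import comb, sqrt
from statistics import NormalDist

p = 0.7  # probability an applicant is granted a loan

# Part 1: exactly 3 of 5 applicants get a loan (binomial formula)
p_exactly_3 = comb(5, 3) * p**3 * (1 - p) ** 2

# Part 2: 100 applicants, normal approximation to the binomial
n = 100
mu = n * p                      # Mu = np
sigma = sqrt(n * p * (1 - p))   # Sigma = sqrt[n*p*(1-p)]
approx = NormalDist(mu, sigma)

p_at_least_75 = 1 - approx.cdf(75)   # P(X >= 75)
pct_95 = approx.inv_cdf(0.95)        # 95th percentile of the count

print(f"P(exactly 3 of 5)  = {p_exactly_3:.4f}")
print(f"mean = {mu:.0f}, std dev = {sigma:.3f}")
print(f"P(at least 75)     is roughly {p_at_least_75:.4f}")
print(f"95th percentile    is roughly {pct_95:.2f}")
```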

Sampling Distribution of a Sample Mean (Section 5.2)

Sample mean as a random variable

Suppose I want to estimate the mean annual return on TSE stocks over the past year.
I take a random sample of 20 stocks and get the average return for the past year in the sample.
That’s my estimate of the true mean return for the TSE.
But different samples would give different estimates (results vary from sample to sample).
Thus the sample average (also known as the sample mean) is a random variable.

Distribution of Sample Means

Sample averages are less variable than individual observations.

Averages from big samples should be less variable than averages taken from small samples.

It turns out that sample averages tend to follow a normal distribution (more on this later).

Example:

I take an SRS of n = 36 people from a large population with mean 50 and std. deviation 24. What are the mean and standard deviation of the sampling distribution of xbar?
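
The worked answer did not survive in this transcript. Using the standard result that the sample mean of an SRS of size n has mean equal to the population mean and standard deviation equal to the population standard deviation divided by sqrt(n), the arithmetic is:

```latex
\mu_{\bar{x}} = \mu = 50, \qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{24}{\sqrt{36}} = \frac{24}{6} = 4
```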

Example: Sampling Dist. of Sample Mean

The fills from a vending machine are normally distributed with a mean of 250 ml and a std. dev. of 15 ml. I take a sample of n = 100 fills from the machine. How likely is it that the average fill from the 100 cups is between 247 and 253 ml?
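
Because the individual fills are normal, the average of 100 fills is normal with mean 250 and std. dev 15/sqrt(100) = 1.5, so the question asks for the probability of landing within two standard deviations of the mean. A short Python check:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 250, 15, 100  # ml, ml, number of fills

# Sampling distribution of the average fill: N(250, 1.5)
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_between = xbar_dist.cdf(253) - xbar_dist.cdf(247)
print(f"P(247 < xbar < 253) = {p_between:.4f}")
```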

Example: An Application of the Central Limit Theorem

The distribution of the time required to complete a certain task is highly skewed to the right. However, we know that the mean time required is 60 minutes, with a standard deviation of 30 minutes.

Consider an SRS of 225 people who complete the task. Estimate the probability that the average time required for these people to complete the task lies between 57 and 63 minutes.
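
By the central limit theorem, the average of 225 times is approximately normal with mean 60 and std. dev 30/sqrt(225) = 2, even though individual times are skewed. The sketch below computes that approximation and then checks it by simulation. The simulation needs a concrete skewed distribution, so it assumes a gamma distribution with shape 4 and scale 15 (mean 60, std. dev 30); that choice is purely illustrative and is not given in the example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

mu, sigma, n = 60, 30, 225  # minutes, minutes, sample size

# CLT approximation: xbar is roughly N(60, 2)
clt = NormalDist(mu, sigma / sqrt(n))
p_clt = clt.cdf(63) - clt.cdf(57)

# Simulation check. ASSUMPTION: task times follow gamma(shape=4, scale=15),
# a right-skewed distribution with mean 4*15 = 60 and std. dev sqrt(4)*15 = 30.
rng = random.Random(0)
reps = 2000
hits = 0
for _ in range(reps):
    xbar = fmean(rng.gammavariate(4, 15) for _ in range(n))
    if 57 < xbar < 63:
        hits += 1
p_sim = hits / reps

print(f"CLT approximation : {p_clt:.4f}")
print(f"simulated fraction: {p_sim:.4f}")
```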

Weighted sums and differences of independent normal variables are normally distributed

Accountants' incomes are normally distributed with a mean of $100K and a standard deviation of $20K. Lawyers' incomes are normally distributed with a mean of $80K and a standard deviation of $40K. Incomes of lawyers and accountants are independent.
I select an accountant and a lawyer at random. How likely is it that their combined income exceeds $220K? How likely is it that the accountant makes more than the lawyer?
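
For independent normal variables, means add (or subtract) and variances always add, so both the sum and the difference here have std. dev sqrt(20² + 40²). A Python sketch of the two calculations, working in thousands of dollars:

```python
from math import sqrt
from statistics import NormalDist

acct_mu, acct_sd = 100, 20  # accountant income, in $K
law_mu, law_sd = 80, 40     # lawyer income, in $K

# Variances add for both the sum and the difference of independent variables
combined_sd = sqrt(acct_sd**2 + law_sd**2)

# Combined income A + L: normal with mean 180
total = NormalDist(acct_mu + law_mu, combined_sd)
p_total_over_220 = 1 - total.cdf(220)

# Difference A - L: normal with mean 20; accountant earns more when A - L > 0
diff = NormalDist(acct_mu - law_mu, combined_sd)
p_acct_more = 1 - diff.cdf(0)

print(f"P(combined income > $220K) is roughly {p_total_over_220:.4f}")
print(f"P(accountant > lawyer)     is roughly {p_acct_more:.4f}")
```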