Stat 350 Lab Session
GSI: Yizao Wang
Section 016 Mon 2:30-4pm MH 444-D
Section 043 Wed 2:30-4pm MH 444-B
Outline
• Binomial and normal distribution
• Sampling distribution and CLT (Module 4)
• Confidence intervals (Module 5, Actv.1)
• Permission to post forms
• Today’s Qwizdom questions are anonymous.
Binomial Distribution
Example of B(n,p): coin flipping
Flip a coin n times. The probability of getting heads on each flip is p.
The number of heads we get in n flips is a r.v. distributed as B(n,p)
Conditions to verify for binomial r.v. | Flipping a coin n times
1- n trials. n fixed in advance. | Flipping n times.
2- 2 possible outcomes each trial. | Heads or tails.
3- Independent outcomes between trials. | The result of any flip won’t change the others.
4- Probability of success is p, fixed for all trials. (Identical distribution) | Decided by the (same) coin.
Another classical example is giving a survey of one ‘yes/no’ question to n
randomly selected people.
Normal Approximation of Binomial Distribution
X ~ B(n,p)
• P(X = k) is decided by the parameters…
but
• When n is large, very difficult to calculate!
• Approximation by normal distribution
Approximately X ~ N(np, sqrt(np(1-p))), where the second parameter is the standard deviation
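To see how good the approximation can be, here is a minimal sketch comparing an exact binomial probability with its normal approximation. The values n = 100 and p = 0.5 are illustrative and not from the lab, and the continuity correction (using 55.5 instead of 55) is an added refinement that the slide does not mention.

```python
import math

def binom_pmf(k, n, p):
    """Exact P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma), where sigma is the standard deviation."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p = 100, 0.5                      # illustrative values (assumption)
mu = n * p                           # mean of the approximating normal
sigma = math.sqrt(n * p * (1 - p))   # s.d. of the approximating normal

# Exact P(X <= 55): sum the binomial probabilities for k = 0..55.
exact = sum(binom_pmf(k, n, p) for k in range(0, 56))

# Normal approximation, without and with the continuity correction.
approx_plain = normal_cdf(55, mu, sigma)
approx_corrected = normal_cdf(55.5, mu, sigma)

print(f"exact            = {exact:.4f}")
print(f"normal (plain)   = {approx_plain:.4f}")
print(f"normal (cc 55.5) = {approx_corrected:.4f}")
```

The exact sum needs 56 terms here; the approximation needs one evaluation of the normal curve, which is the point of the slide.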
Normal Distribution
• An exactly normal distribution is very rare in the real world, but the
normal is often a very good approximation, with some nice mathematical
properties.
• Written as X ~ N(\mu,\sigma)
• Z-score (z-statistic) is the standardized X, given by
Z = (X-\mu)/\sigma
• Z ~ N(0,1) (why do we want to standardize X?)
• What do the normal distributions look like?
How to relate the shape with the two parameters?
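One reason to standardize is that every normal probability then reduces to a single N(0,1) calculation. A minimal sketch, where the mean of 68 and s.d. of 3 (heights in inches) are hypothetical numbers chosen only for illustration:

```python
import math

def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 68.0, 3.0         # hypothetical population mean and s.d.
z = z_score(71.0, mu, sigma)  # 71 is one s.d. above the mean
prob = std_normal_cdf(z)      # P(X <= 71) = P(Z <= z)

print(f"z = {z:.2f}, P(X <= 71) = {prob:.4f}")
```

Changing mu shifts the curve left or right; changing sigma makes it narrower or wider. The z-score removes both effects.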
Normal Distribution
• 10-minute in-lab review (8 questions)
CTools\Lab Info\Lab review: Normal Distribution
Population vs. Sample
                 | Population                            | Sample
Definition       | Collection of items you want to study | Small collection of population items
Size             | Too large                             | Small
Example          | Heights of all UM students            | Heights of students in a certain lab
Random or fixed? | Fixed                                 | Random (why?)
Parameters vs. Statistics
                     | Parameters                 | Statistics
Where are they from? | Population                 | Sample
Example              | Mean height of UM students | Mean height of students in a certain lab
Known or not?        | No                         | Calculable from sample
Random or fixed?     | Fixed                      | Random
Examples of parameters and corresponding (why?) statistics
                   | Parameter             | Statistic
Mean               | Population mean       | Sample mean
Standard deviation | Population s.d.       | Sample s.d.
Proportion         | Population proportion | Sample proportion
Statistics are random variables. Parameters are constants.
Statistical Inference
• Population parameters are unknown constants.
• Statistics are random variables obtained through sampling.
• Statistical inference: using statistics to estimate
parameters.
• Statistics are also called estimators (of parameters).
Example: X-bar is an estimator of μ
• We need to study the distribution of statistics.
(Random variables have fixed distributions.)
Sampling Distribution
• The probability distribution of a sample statistic
is called its sampling distribution.
The X in the pictures is not a random variable… Consider it as X-bar.
Statistical Inference
What kind of estimators do we prefer?
• Unbiased: the mean of the estimator equals the parameter.
• Small variation: the estimator has a small standard deviation.
Module 4
• Objectives: study the influence of the sample
size and the distribution of parent population on
the sampling distribution.
• Sampling Distribution Applet (CTools/lab info)
Summary
• The shape of the sampling distribution will
depend on the distribution of original parent
population as well as the sample size.
• The sampling distribution is approximately
normal when…
4(a) Sampling Dist. of the Sample Mean
If the parent popul. is a normal dist. with a
mean μ and a stand. dev. σ, then for any
sample size, the sample mean will have a
__________ dist. with a mean of _____
and a stand. dev. of _____.
4(b) Central Limit Theorem
If the parent popul. is NOT a normal dist. but
with a mean μ and a stand. dev. σ, then for
a large sample size, the sample mean will
have a
__________ dist. with a mean of _____
and a stand. dev. of _____.
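One way to check your answers to the blanks in 4(a) and 4(b) is by simulation, much like the Sampling Distribution Applet. The sketch below is an illustration under stated assumptions, not the applet itself: the parent population is exponential with mean 1 and s.d. 1 (clearly not normal), and the sample size n = 40, the 5000 repetitions, and the seed are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0     # exponential(1) population: mean 1, s.d. 1
n = 40                   # sample size (illustrative)
reps = 5000              # number of repeated samples (illustrative)

# Draw many samples of size n and record the sample mean of each one.
sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theory_sd = sigma / math.sqrt(n)   # what the theory says the s.d. of x-bar is

print(f"mean of sample means = {mean_of_means:.3f} (population mean = {mu})")
print(f"s.d. of sample means = {sd_of_means:.3f} (sigma/sqrt(n) = {theory_sd:.3f})")
```

A histogram of sample_means would look roughly bell-shaped even though the parent population is strongly skewed. Try a small n such as 2 to see the skew come back.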
What is the distinction between 4(a) and 4(b)?
Choose all that apply...
A) Shape of parent popul.
B) Shape of dist. of sample mean
C) Standard deviation of sample mean
D) Sample size
True or False
• If n is large, the sample data will always
have a normal distribution.
Confidence Interval
Recall the parameter-statistic comparison…
• We never know the true population parameter
value.
• We use a statistic from one sample (with several observations)
to estimate it.
• A sample statistic may not be exactly equal to the
corresponding parameter value.
(why confidence interval?)
Confidence Interval
Example: we are 95% confident that the true parameter value
lies inside the confidence interval [a, b].
A confidence interval provides a method of stating:
• What the interval tells:
How close the value of a statistic is likely to be to the value
of a parameter
• What the confidence tells:
How sure we are that it is that close
Confidence Interval
Basic structure for any confidence interval:
estimate ± multiplier × standard error
• estimate: the sample statistic, such as p-hat or x-bar.
• multiplier × standard error: the margin of error. The bigger the
margin of error, the wider the CI (why?)
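The structure above can be sketched for a sample proportion. The survey counts (220 'yes' answers out of 400) are made-up numbers for illustration. The multiplier 1.96 is the usual z multiplier for 95% confidence, and sqrt(p-hat(1 - p-hat)/n) is the standard large-sample standard error of p-hat; neither is stated on the slide, so treat both as assumptions.

```python
import math

n = 400            # hypothetical number of people surveyed
yes = 220          # hypothetical number answering 'yes'

p_hat = yes / n                                  # estimate
multiplier = 1.96                                # z multiplier for 95% confidence
std_error = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
margin = multiplier * std_error                  # margin of error

lower = p_hat - margin
upper = p_hat + margin

print(f"p-hat = {p_hat:.3f}, margin of error = {margin:.3f}")
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

The interval's width is twice the margin of error, so anything that increases the multiplier or the standard error widens the CI.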
Confidence Interval
Two interpretations:
1. A 95% Confidence Interval: We are 95% confident
that the true parameter value lies inside the
confidence interval. The interval provides a range of
reasonable values for the population parameter.
2. The 95% Confidence Level: If the procedure were
repeated many times (that is, if we repeatedly took a
random sample of the same size and computed the
95% confidence interval for each sample), we would
expect 95% of the resulting confidence intervals to
contain the true population parameter.
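The second interpretation can be checked by simulation, in the spirit of the Confidence Interval for Mean Applet. This is a sketch under assumptions, not the applet: the population is normal with mean 50 and s.d. 10, the s.d. is treated as known so the multiplier 1.96 applies, and n = 25, 4000 repetitions, and the seed are arbitrary choices.

```python
import math
import random

rng = random.Random(1)    # fixed seed so the run is repeatable

mu, sigma = 50.0, 10.0    # hypothetical population mean and (known) s.d.
n = 25                    # sample size (illustrative)
reps = 4000               # number of repeated samples (illustrative)
multiplier = 1.96         # z multiplier for 95% confidence
std_error = sigma / math.sqrt(n)

covered = 0
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    lower = x_bar - multiplier * std_error
    upper = x_bar + multiplier * std_error
    # Each computed interval either contains mu or it does not.
    if lower <= mu <= upper:
        covered += 1

coverage = covered / reps
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The fraction should land close to 0.95. Note that the 95% describes the procedure over many samples, while any single interval in the loop simply contains mu or misses it.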
Confidence Interval
Principles for using CIs to guide decision making:
• Principle 1: A value not in a CI can be rejected as a possible
value of the population parameter.
A value in a CI is an “acceptable” or “reasonable”
possibility for the value of a population parameter.
• Principle 2: When the CIs for parameters for two different
populations do not overlap, it is reasonable to conclude
that the parameters for the two populations are different.
Confidence Interval
• The probability that the true parameter
lies in a particular, already computed,
confidence interval is either 0 or 1. The
interval is now fixed and the parameter is
not random, so the parameter is either in
that particular interval or it is not.
Module 5 Activity1
• Good summary on p26
• Confidence Interval for Mean Applet
(CTools/Lab Info)
# 4: Interpret the (95%) confidence level in terms
of a popul. mean.
A) We are 95% confident that the popul.
mean will be in the computed confidence
interval.
B) The computed confidence interval will
contain the popul. mean 95% of the time.
C) 95% of all confidence intervals created
with this method are expected to contain
the popul. mean.
Before we finish today…