Point Estimation and Confidence Intervals

Point and interval estimation
of the parameters of a normally
distributed variable. The concept of
statistical estimation.
Estimation of Population Parameters
• Statistical inference refers to making inferences about a
population parameter through the use of sample information
• The sample statistics summarize sample information and can be
used to make inferences about the population parameters
• Two approaches to estimate population parameters
– Point estimation: Obtain a value estimate for the population
parameter
– Interval estimation: Construct an interval within which the
population parameter will lie with a certain probability
Point Estimation
• In attempting to obtain point estimates of population parameters,
the following questions arise
– What is a point estimate of the population mean?
– How good of an estimate do we obtain through the methodology
that we follow?
• Example: What is a point estimate of the average yield on ten-year Treasury bonds?
• To answer this question, we use a formula that takes sample
information and produces a number
Point Estimation
• A formula that uses sample information to produce an estimate
of a population parameter is called an estimator
• A specific value of an estimator obtained from information of a
specific sample is called an estimate
• Example: We said that the sample mean is a good estimate of
the population mean
– The sample mean is an estimator
– A particular value of the sample mean is an estimate
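The distinction between an estimator and an estimate can be sketched in a few lines of Python. The function name `sample_mean` and the yield figures are invented for illustration; they are not real Treasury data.

```python
from statistics import fmean

def sample_mean(sample):
    """Estimator: a rule that turns any sample into a single number."""
    return fmean(sample)

# Hypothetical ten-year yields in percent (made-up numbers).
sample_a = [4.1, 3.9, 4.3, 4.0, 4.2]
sample_b = [3.8, 4.4, 4.1, 4.5, 4.2]

# Estimates: the values the estimator takes on two particular samples.
estimate_a = sample_mean(sample_a)
estimate_b = sample_mean(sample_b)
print(estimate_a, estimate_b)
```

The same estimator gives a different estimate for each sample, which is why the estimator itself is treated as a random variable.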
Point Estimation
• Note: An estimator is a random variable that takes many
possible values (estimates)
• Question: Is there a unique estimator for a population
parameter? For example, is there only one estimator for the
population mean?
• The answer is that there may be many possible estimators
• Those estimators must be ranked in terms of some desirable
properties that they should exhibit
Properties of Point Estimators
• The choice of point estimator is based on the following criteria
– Unbiasedness
– Efficiency
– Consistency
• A point estimator $\hat{\theta}$ is said to be an unbiased estimator of the
population parameter $\theta$ if its expected value (the mean of its
sampling distribution) is equal to the population parameter it is
trying to estimate
$$E(\hat{\theta}) = \theta$$
Properties of Point Estimators
• Interesting Results on Unbiased Estimators
– The sample mean, the sample variance (computed with divisor n – 1) and the
sample proportion are unbiased estimators of the corresponding population parameters
– Generally speaking, the sample standard deviation is not an
unbiased estimator of the population standard deviation
• We can also define the bias of an estimator as follows
$$\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$$
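The bias of an estimator can be approximated by simulation. The sketch below assumes a normal population with mean 5 and standard deviation 2 and samples of size 10 (all arbitrary choices). It compares the sample mean, the sample variance with divisor n – 1, and the variance with divisor n.

```python
import random
from statistics import fmean

random.seed(0)
# Assumed population and design, chosen only for illustration.
MU, SIGMA, N, REPS = 5.0, 2.0, 10, 20000

means, var_unbiased, var_biased = [], [], []
for _ in range(REPS):
    x = [random.gauss(MU, SIGMA) for _ in range(N)]
    m = fmean(x)
    ss = sum((xi - m) ** 2 for xi in x)   # sum of squared deviations
    means.append(m)
    var_unbiased.append(ss / (N - 1))     # divisor n - 1
    var_biased.append(ss / N)             # divisor n

# Simulated bias = average estimate minus the true parameter.
bias_mean = fmean(means) - MU
bias_var_unbiased = fmean(var_unbiased) - SIGMA ** 2
bias_var_biased = fmean(var_biased) - SIGMA ** 2
print(bias_mean, bias_var_unbiased, bias_var_biased)
```

The first two simulated biases should be close to zero, while the divisor-n variance should come out near $-\sigma^2/n = -0.4$ under these assumptions.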
Properties of Point Estimators
• It is usually the case that, if there is an unbiased estimator of a
population parameter, there exist several others, as well
• To select the “best unbiased” estimator, we use the criterion of
efficiency
• An unbiased estimator is efficient if no other unbiased estimator
of the particular population parameter has a lower sampling
distribution variance
Properties of Point Estimators
• If $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators of the population
parameter $\theta$, then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if
$$V(\hat{\theta}_1) < V(\hat{\theta}_2)$$
• The unbiased estimator of a population parameter with the
lowest variance out of all unbiased estimators is called the most
efficient or minimum variance unbiased estimator
• In some cases, we may be interested in the properties of an
estimator in large samples, which may not be present in the
case of small samples
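As an illustration of relative efficiency, the sketch below compares two estimators of the centre of a normal population: the sample mean and the sample median. Both are unbiased here because the distribution is symmetric. The population (standard normal), sample size and number of replications are assumptions made for the example.

```python
import random
from statistics import fmean, median, pvariance

random.seed(1)
N, REPS = 25, 5000   # assumed sample size and number of simulated samples

means, medians = [], []
for _ in range(REPS):
    x = [random.gauss(0.0, 1.0) for _ in range(N)]
    means.append(fmean(x))
    medians.append(median(x))

# Variance of each estimator's simulated sampling distribution.
var_mean = pvariance(means)
var_median = pvariance(medians)
print(var_mean, var_median)
```

Under these assumptions the sample mean should show the smaller sampling variance (about $\sigma^2/n = 0.04$), so it is the more efficient of the two.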
Properties of Point Estimators
• We say that an estimator is consistent if the probability of
obtaining estimates close to the population parameter increases
as the sample size increases
• The problem of selecting the most appropriate estimator for a
population parameter is quite complicated
• On some occasions, we may accept some bias in the
estimator in exchange for increased efficiency
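Consistency can be illustrated by estimating, for several sample sizes, the probability that the sample mean falls within a fixed distance of the population mean. The standard normal population, the tolerance of 0.2 and the sample sizes are assumptions for this sketch.

```python
import random
from statistics import fmean

random.seed(2)
# Assumed population, tolerance and number of replications.
MU, SIGMA, EPS, REPS = 0.0, 1.0, 0.2, 2000

def prob_close(n):
    """Share of simulated samples of size n with |sample mean - MU| < EPS."""
    hits = 0
    for _ in range(REPS):
        xbar = fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        if abs(xbar - MU) < EPS:
            hits += 1
    return hits / REPS

probs = {n: prob_close(n) for n in (5, 25, 100, 400)}
print(probs)
```

The estimated probabilities should rise toward 1 as n grows, which is the behaviour the definition of consistency describes.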
Properties of Point Estimators
• One measure of the expected closeness of an estimator to the
population parameter is its mean squared error


$$MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right]$$
Interval Estimation
• Point estimates of population parameters are prone to sampling
error and are not likely to equal the population parameter in any
given sample
• Moreover, it is often the case that we are interested not in a
point estimate, but in a range within which the population
parameter will lie
• An interval estimator for a population parameter is a formula that
determines, based on sample information, a range within which
the population parameter lies with certain probability
Interval Estimation
• The estimate is called an interval estimate
• The probability that the population parameter will lie within a
confidence interval is called the level of confidence and is given
by $1 - \alpha$
• Two interpretations of confidence intervals
– Probabilistic interpretation
– Practical interpretation
Interval Estimation
• In the probabilistic interpretation, we say that
– A 95% confidence interval for a population parameter means that,
in repeated sampling, 95% of such confidence intervals will include
the population parameter
• In the practical interpretation, we say that
– We are 95% confident that the 95% confidence interval will include
the population parameter
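The repeated-sampling interpretation can be checked by simulation. The sketch below assumes a normal population with mean 10 and known standard deviation 3, draws many samples of size 30, builds a 95% interval from each, and counts how often the interval contains the true mean. All numerical settings are arbitrary choices for the example.

```python
import random
from statistics import NormalDist, fmean

random.seed(3)
# Assumed population and design.
MU, SIGMA, N, REPS = 10.0, 3.0, 30, 5000

z = NormalDist().inv_cdf(0.975)        # reliability factor for 95%
half_width = z * SIGMA / N ** 0.5      # reliability factor x standard error

covered = 0
for _ in range(REPS):
    xbar = fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    if xbar - half_width < MU < xbar + half_width:
        covered += 1

coverage = covered / REPS
print(coverage)
```

The share of intervals that include the population mean should be close to 0.95. Any single interval either contains the parameter or does not.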
Constructing Confidence Intervals
• Confidence intervals have similar structures
Point Estimate ± Reliability Factor × Standard Error
– Reliability factor is a number based on the assumed distribution of
the point estimate and the level of confidence
– Standard error of the sample statistic providing the point estimate
Confidence Interval for Mean of a Normal
Distribution with Known Variance
• Suppose we take a random sample from a normal distribution
with unknown mean, but known variance
• We are interested in obtaining a confidence interval that
will contain the population mean 90% of the time
• The sample mean will follow a normal distribution and the
corresponding standardized variable will follow a standard
normal distribution
Confidence Interval for Mean of a Normal
Distribution with Known Variance
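• If $\bar{X}$ is the sample mean, then we are interested in the
confidence interval, such that the following probability is .9
$$
\begin{aligned}
0.90 &= P(-1.645 < Z < 1.645) \\
     &= P\left(-1.645 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.645\right) \\
     &= P\left(-\frac{1.645\,\sigma}{\sqrt{n}} < \bar{X} - \mu < \frac{1.645\,\sigma}{\sqrt{n}}\right) \\
     &= P\left(\bar{X} - \frac{1.645\,\sigma}{\sqrt{n}} < \mu < \bar{X} + \frac{1.645\,\sigma}{\sqrt{n}}\right)
\end{aligned}
$$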
Confidence Interval for Mean of a Normal
Distribution with Known Variance
• Following the above expression for the structure of a confidence
interval, we rewrite the confidence interval as follows
$$\bar{X} \pm 1.645\,\frac{\sigma}{\sqrt{n}}$$
• Note that from the standard normal distribution (rounding 1.645 to 1.65)
$$P(Z < 1.65) = F_Z(1.65) = 0.95$$
$$P(Z < -1.65) = F_Z(-1.65) = 0.05$$
Confidence Interval for Mean of a Normal
Distribution with Known Variance
• Given this result and that the level of confidence for this interval
$(1-\alpha)$ is .90, we conclude that
– The area under the standard normal to the left of –1.65 is 0.05
– The area under the standard normal to the right of 1.65 is 0.05
• Thus, the two reliability factors represent the cutoffs $-z_{\alpha/2}$ and
$z_{\alpha/2}$ for the standard normal
Confidence Interval for Mean of a Normal
Distribution with Known Variance
• In general, a $100(1-\alpha)\%$ confidence interval for the population
mean $\mu$ when we draw samples from a normal distribution with
known variance $\sigma^2$ is given by
$$\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2}$ is the number for which
$$P(Z > z_{\alpha/2}) = \frac{\alpha}{2}$$
Confidence Interval for Mean of a Normal
Distribution with Known Variance
• Note: We typically use the following reliability factors when
constructing confidence intervals based on the standard normal
distribution
– 90% interval: $z_{0.05} = 1.645$ (often rounded to 1.65)
– 95% interval: $z_{0.025} = 1.96$
– 99% interval: $z_{0.005} = 2.58$
• Implication: As the degree of confidence increases, the interval
becomes wider
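The reliability factors can be computed rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `reliability_factor` is an illustrative choice.

```python
from statistics import NormalDist

def reliability_factor(confidence):
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves probability alpha/2 in the upper tail.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

factors = {c: round(reliability_factor(c), 3) for c in (0.90, 0.95, 0.99)}
print(factors)
```

To three decimals this should reproduce the tabulated values 1.645, 1.960 and 2.576, and it shows directly that a higher confidence level requires a larger factor and hence a wider interval.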
Confidence Interval for Mean of a Normal
Distribution with Known Variance
• Example: Suppose we draw a sample of 100 observations of
returns on the Nikkei index, assumed to be normally distributed,
with sample mean 4% and standard deviation 6% (treated here as the known population value)
• What is the 95% confidence interval for the population mean?
• The standard error is $0.06/\sqrt{100} = 0.006$
• The confidence interval is $0.04 \pm 1.96(0.006)$, that is, from about 0.0282 to 0.0518
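The Nikkei example can be reproduced with a short function. This is a minimal sketch assuming the standard deviation is known; the name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for a normal mean when sigma is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)  # reliability factor
    se = sigma / sqrt(n)                                      # standard error
    return xbar - z * se, xbar + z * se

# Figures from the example: mean 4%, standard deviation 6%, n = 100.
lower, upper = z_interval(0.04, 0.06, 100, 0.95)
print(round(lower, 4), round(upper, 4))
```

The bounds should agree with 0.04 ± 1.96(0.006), roughly 2.8% to 5.2%.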
Confidence Interval for Mean of a Normal
Distribution with Unknown Variance
• In a more typical scenario, the population variance is unknown
• Note that, if the sample size is large, the previous results can be
modified as follows
– The population distribution need not be normal
– The population variance need not be known
– The sample standard deviation will be a sufficiently good estimator
of the population standard deviation
• Thus, the confidence interval for the population mean derived
above can be used by substituting s for $\sigma$
Confidence Interval for Mean of a Normal
Distribution with Unknown Variance
• However, if the sample size is small and the population variance
is unknown, we cannot use the standard normal distribution
• If we replace the unknown $\sigma$ with the sample standard deviation s, the
following quantity
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
follows Student’s t distribution with (n – 1) degrees of freedom
Confidence Interval for Mean of a Normal
Distribution with Unknown Variance
• The t-distribution has mean 0 and (n – 1) degrees of freedom
• As degrees of freedom increase, the t-distribution approaches
the standard normal distribution
• Also, t-distributions have fatter tails than the standard normal, but as the degrees
of freedom increase (df = 8 or more) the tails become thinner and resemble
those of a normal distribution
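The fatter tails can be seen by evaluating the t density at a point far from the centre and comparing it with the standard normal density. The sketch below writes out the textbook t density using only the standard library; the evaluation point 3.0 and the degrees of freedom shown are arbitrary choices.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_coef - (df + 1) / 2 * log(1 + x * x / df))

X = 3.0                                   # a point in the upper tail
normal_tail = NormalDist().pdf(X)
t_tail = {df: t_pdf(X, df) for df in (2, 8, 30, 100)}
print(normal_tail, t_tail)
```

At this point every t density should exceed the normal density, and the gap should shrink as the degrees of freedom increase.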
Confidence Interval for Mean of a Normal
Distribution with Unknown Variance
• In general, a $100(1-\alpha)\%$ confidence interval for the population
mean $\mu$ when we draw small samples from a normal distribution
with an unknown variance $\sigma^2$ is given by
$$\bar{X} \pm t_{n-1,\alpha/2}\,\frac{s}{\sqrt{n}}$$
where $t_{n-1,\alpha/2}$ is the number for which
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
Confidence Interval for Mean of a Normal
Distribution with Unknown Variance
• Example: Suppose we want to estimate a 95% confidence
interval for the average quarterly returns of all fixed-income
funds in the US
• We assume that those returns are normally distributed with an
unknown variance
• We draw a sample of 150 observations and calculate the
sample mean to be .05 and the standard deviation .03
Confidence Interval for Mean of a Normal
Distribution with Unknown Variance
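• To find the confidence interval, we need $t_{n-1,\alpha/2} = t_{149,0.025}$
• From the tables of the t-distribution, this is approximately 1.98; because the sample is large, it is close to the standard normal value of 1.96, which is used here
• The confidence interval is
$$0.05 \pm 1.96\,\frac{0.03}{\sqrt{150}}$$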
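The arithmetic of this example can be sketched as follows. The Python standard library has no t quantile function, so the critical value is passed in as read from a table. The value 1.96 is the one used in the slide, and the name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for a mean, given a critical value taken from a t table."""
    se = s / sqrt(n)                      # estimated standard error
    return xbar - t_crit * se, xbar + t_crit * se

# Figures from the example: mean 0.05, standard deviation 0.03, n = 150.
lower, upper = t_interval(0.05, 0.03, 150, 1.96)
print(round(lower, 4), round(upper, 4))
```

The interval should run from about 0.0452 to 0.0548. A slightly larger critical value for 149 degrees of freedom would widen it only marginally.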
Confidence Interval for the Population
Variance of a Normal Population
• Suppose we have obtained a random sample of n observations
from a normal population with variance $\sigma^2$ and that the sample
variance is $s^2$. A $100(1 - \alpha)\%$ confidence interval for the
population variance is
$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$
Confidence Interval for the Population
Variance of a Normal Population
• The values $\chi^2_{n-1,\alpha/2}$ and $\chi^2_{n-1,1-\alpha/2}$ of the chi-squared
distribution are determined as follows
$$P(\chi^2_{n-1} < \chi^2_{n-1,1-\alpha/2}) = \frac{\alpha}{2}$$
$$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
where $\chi^2_{n-1}$ follows the chi-squared distribution with (n-1)
degrees of freedom
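A worked instance of the variance interval is sketched below. The sample (n = 10, $s^2$ = 4.0) is hypothetical. The two chi-squared cutoffs for 9 degrees of freedom, 19.023 and 2.700, are the usual table values for a 95% interval and are passed in by hand because the standard library has no chi-squared quantile function.

```python
def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Confidence interval for a normal population variance.

    chi2_upper is the cutoff with alpha/2 in the upper tail,
    chi2_lower is the cutoff with alpha/2 in the lower tail.
    """
    lower = (n - 1) * s2 / chi2_upper     # larger cutoff gives the lower bound
    upper = (n - 1) * s2 / chi2_lower     # smaller cutoff gives the upper bound
    return lower, upper

# Hypothetical sample: n = 10, s^2 = 4.0; table cutoffs for 9 df at 95%.
lower, upper = variance_interval(4.0, 10, 19.023, 2.700)
print(round(lower, 3), round(upper, 3))
```

The interval is not symmetric around $s^2$, because the chi-squared distribution is skewed to the right.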
Good luck