MATH 2441
Probability and Statistics for Biological Sciences
Point Estimators
As mentioned several times already in these notes, there are two broad types of statistical inferences that are commonly made about population parameters:

- we can estimate the value of the population parameter (estimation)
- we can decide whether we have adequate evidence to support a statement about the value of that population parameter (hypothesis testing)

Both of these approaches make use of the data we have from a random sample.
There are two different approaches to estimation:

- A point estimator is a formula or expression producing a single-value estimate of the population parameter. Some people refer to point estimates as a "best guess" value. The term "guess" is a bit pessimistic, but it does give you the sense that there is a degree of uncertainty in relying on point estimates.
- An interval estimator (giving a confidence interval estimate) is a formula or expression that produces a numerical interval which has a specified probability of capturing the true value of the population parameter. Again, this element of uncertainty appears, but in the case of confidence interval estimates, the degree of uncertainty is displayed more explicitly. (A brief sketch contrasting the two kinds of estimate follows this list.)
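To make the contrast concrete, here is a minimal Python sketch (not part of the original notes; the seed-germination data and the normal-approximation interval are assumptions made purely for illustration) computing both kinds of estimate for a population proportion:

```python
import numpy as np
from scipy import stats

# Hypothetical data: 1 = seed germinated, 0 = did not (n = 50 seeds)
rng = np.random.default_rng(seed=1)
sample = rng.binomial(1, 0.7, size=50)   # "true" proportion 0.7, unknown in practice

n = len(sample)
p_hat = sample.mean()                    # point estimate: a single number

# Interval estimate: 95% normal-approximation confidence interval
z = stats.norm.ppf(0.975)
half_width = z * np.sqrt(p_hat * (1 - p_hat) / n)
print(f"point estimate:    p = {p_hat:.3f}")
print(f"interval estimate: ({p_hat - half_width:.3f}, {p_hat + half_width:.3f})")
```

The point estimate is a single number that carries no visible uncertainty; the interval estimate makes the uncertainty explicit through its width and its stated confidence level.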
In practice, confidence interval estimates are used far more commonly than point estimates.
Nevertheless, since point estimates are used in certain important ways in statistics, and carry with them
some important concepts and terms, we need to look at them briefly in this course.
The term estimator refers to the formula or expression used to calculate the estimate, which is the actual numerical value obtained for the population parameter in a particular problem. When we speak generically, it is conventional to represent the population parameter being estimated by the symbol θ, the Greek letter 'theta', and to represent the estimator by the symbol θ̂, the same Greek letter, but with a caret on top.
You've already seen a number of potential point estimators:
population parameter (θ)                       point estimator (θ̂)
-------------------------------------------------------------------
mean, μ                                        x̄, x̃
variance, σ²                                   s²
standard deviation, σ                          s
proportion, π                                  p
correlation coefficient, ρ                     r
linear regression coefficients, β₀, β₁         b₀, b₁
difference between two means, μ₂ - μ₁          x̄₂ - x̄₁
difference between two proportions, π₂ - π₁    p₂ - p₁
(You haven't seen the last two entries in this table yet -- they would be used when we want to compare two
populations, a very important type of problem in statistics. The other entries should be quite familiar to you
by now.)
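To make the table concrete, here is a minimal Python sketch (the paired measurements are invented purely for illustration) computing several of these point estimators from a single sample:

```python
import numpy as np

# Hypothetical paired measurements (e.g., dose x and response y)
x = np.array([1.2, 2.4, 3.1, 4.0, 5.3, 6.1, 7.2, 8.0])
y = np.array([2.1, 4.3, 6.2, 7.9, 10.4, 12.1, 14.3, 15.8])

x_bar = np.mean(x)            # estimates the population mean, μ
x_med = np.median(x)          # also estimates μ (for symmetric populations)
s2 = np.var(x, ddof=1)        # estimates the population variance, σ²
s = np.std(x, ddof=1)         # estimates the population standard deviation, σ
r = np.corrcoef(x, y)[0, 1]   # estimates the population correlation, ρ
b1, b0 = np.polyfit(x, y, 1)  # estimate the regression coefficients β₁, β₀

print(x_bar, x_med, s2, s, r, b0, b1)
```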
Notice that all of the point estimators, θ̂, listed in this table are necessarily random variables, because they are statistics computed from random samples. Since such random variables have distributions with some width (that is, different random samples will generally give different values of θ̂), we realize that the determination of a
value of θ̂ does not guarantee us that we have the exact value of the corresponding population parameter, θ. In fact, in general, we can write
θ = θ̂ + error        (PE - 1)
where 'error' stands for the difference between the observed value of θ̂ and the actual "true" value of θ. This emphasizes one very serious danger in reporting point estimates of population parameters -- readers may forget that there is an unstated and perhaps large error, and so may mistakenly attribute greater accuracy to the estimate than is really warranted. The consequences of unwittingly using erroneous results can often be worse than not having any result to use at all! One reason interval estimates are favored much more than point estimates in statistical inference is that they make the presence of potential estimation error much more explicit.
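The error term in (PE - 1) is easy to see in a simulation. Here is a minimal Python sketch (the population values and sample size are my choices for illustration) drawing repeated samples and showing that the point estimate x̄ misses the true mean by a different amount each time:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
mu, sigma, n = 50.0, 8.0, 25        # "true" population values, normally unknown

for i in range(5):
    sample = rng.normal(mu, sigma, size=n)
    x_bar = sample.mean()           # the point estimate, θ̂
    error = mu - x_bar              # the error term in (PE - 1)
    print(f"sample {i + 1}: x_bar = {x_bar:6.2f}, error = {error:+6.2f}")
```

Each sample yields a different estimate, and nothing in the single reported number reveals how large its error happens to be.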
Nevertheless, there are some instances in which the convenience of point estimates outweighs their deficiencies. Two examples are:

(i) Interval estimates of certain fundamental physical constants would be very difficult or inconvenient to work with in calculations. Thus, although quantities such as the gravitational acceleration, g; Avogadro's number, N; and so forth are numbers which are experimentally determined and thus subject to sampling errors of one sort or another, we normally use point estimates of them rather than interval estimates. Of course, many of these fundamental constants have been estimated with high precision, so that errors in their estimates are not significant for many applications. Also, as we saw last term in MATH 1441, there are ways to "estimate" the effect of uncertainties in such numbers on calculations after the fact.
(ii) As you'll see shortly, when we carry out various procedures of statistical inference focussing on one population parameter of greatest interest, the formulas that result may involve the values of other population parameters. In such situations, we can usually obtain adequately accurate results by using point estimates for the parameters of secondary interest in order to derive formulas for an interval estimate of the parameter of greatest interest.
For example, in deriving formulas for interval estimates of the population mean, μ, we require the value of the population standard deviation, σ. Since μ is unknown, it is very unlikely that we'll know the value of σ (though in some instances we might). Rather than backing up one more step and determining an interval estimate for σ, it is more usual to use the available value of s as a point estimate of σ in the formula for the interval estimate of μ.
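Here is a minimal Python sketch of this plug-in idea (the data are invented, and the large-sample normal-based interval form is an assumption made for illustration; the course develops the exact formulas later):

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 40 measurements, large enough for a normal-based interval
rng = np.random.default_rng(seed=3)
data = rng.normal(120.0, 15.0, size=40)

n = len(data)
x_bar = data.mean()
s = data.std(ddof=1)             # point estimate of σ, plugged into the formula

z = stats.norm.ppf(0.975)        # 95% confidence
half_width = z * s / np.sqrt(n)  # s stands in for the unknown σ
print(f"95% CI for mu: ({x_bar - half_width:.2f}, {x_bar + half_width:.2f})")
```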
There may be many potential point estimators for a given population parameter. Obviously we'd like to use
the one which has the smallest error term in (PE - 1). Unfortunately, we have no way of calculating what the
error is for a particular estimator (if we could calculate the error exactly, we could compensate for it exactly
and there would be no further need for statistical analysis!), and so there is no direct way to decide which is
the best estimator to use in a specific situation. However, statisticians have developed some criteria which
they find useful in deciding which estimator might be the more advantageous in specific circumstances.
Very briefly, three of these are:
1. An unbiased estimator is generally considered superior to one which is not unbiased. An estimator, θ̂, is unbiased if its mean value (or expected value) equals the actual value of the parameter being estimated:

E[θ̂] = θ        (PE - 2)
Although particular values of θ̂ obtained from actual random samples are unlikely to be exactly equal to θ, the thought here is that for unbiased estimators, θ̂ will not systematically underestimate or systematically overestimate θ.

The sample mean, x̄, is an unbiased estimator of the population mean, μ, and the sample variance, s², is an unbiased estimator of the population variance, σ². The denominator 'n - 1' in the formula for s² is chosen to make this true. Although the sample standard deviation, s, is not an unbiased
estimator of the population standard deviation, σ, the degree of bias is generally considered to be small enough that it doesn't override the other advantages of using s. Under fairly general conditions, the sample proportion is an unbiased estimator of the population proportion. When the population distribution is continuous and symmetric (for example, populations which are approximately normally distributed), the sample median and symmetrically trimmed sample means are also unbiased estimators of the population mean. (A short simulation illustrating unbiasedness follows.)
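A quick simulation makes the role of the 'n - 1' denominator visible. This Python sketch (population values chosen arbitrarily) averages s² over many samples computed with both the n - 1 and n denominators:

```python
import numpy as np

rng = np.random.default_rng(seed=4)
mu, sigma, n, reps = 10.0, 3.0, 8, 100_000   # true variance σ² = 9.0

samples = rng.normal(mu, sigma, size=(reps, n))
s2_unbiased = samples.var(axis=1, ddof=1)    # 'n - 1' denominator
s2_biased = samples.var(axis=1, ddof=0)      # 'n' denominator

print(f"true variance        : {sigma**2:.3f}")
print(f"mean of s² with n - 1: {s2_unbiased.mean():.3f}")  # close to 9.0
print(f"mean of s² with n    : {s2_biased.mean():.3f}")    # close to 9.0 * 7/8, i.e. biased low
```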
2. Estimators whose sampling distributions have smaller variances are considered superior. The idea here
is that different random samples will give different values of the estimator, consistent with the
sampling distribution. The variance of the sampling distribution is a measure of how spread out these
values may be. A smaller variance means that any given sample is more likely to give a value nearer
to the actual value of the population parameter.
For example, σx̄² is smaller than σx̃², indicating that the sample mean is preferred over the sample median as an estimator of the population mean. Some references speak of the sample mean being a more efficient estimator than the sample median in this case. In fact, for a normally distributed population, it can be proven mathematically that the sampling distribution of the mean has the smallest possible variance, making the sample mean the most efficient possible estimator of the population mean. (A simulation comparing the two appears below.)

Incidentally, because it is a measure of the scale of error in a point estimator, the standard deviation of a sampling distribution is often referred to as the standard error of the estimate.
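This difference in efficiency is easy to check by simulation. The following Python sketch (normal population, parameters chosen arbitrarily) compares the sampling variances of the mean and the median:

```python
import numpy as np

rng = np.random.default_rng(seed=5)
mu, sigma, n, reps = 0.0, 1.0, 25, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

print(f"variance of sample means  : {means.var():.5f}")    # about σ²/n = 0.04
print(f"variance of sample medians: {medians.var():.5f}")  # larger, by a factor near π/2
```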
3. An estimator is said to be consistent if the variance of its sampling distribution decreases with increasing
sample size. This is a good property because it means that if you make the effort to collect data from
a larger random sample, you should end up with a more accurate estimate of the population
parameter.
For example, the sample mean is a consistent estimator of the population mean because

σx̄² = σ²/n

that is, the variance of the sampling distribution is inversely proportional to the sample size.
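A brief Python sketch of this 1/n behaviour (the sample sizes and population values are chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=6)
mu, sigma, reps = 0.0, 2.0, 20_000

for n in (10, 40, 160):
    means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    # the observed variance should track σ²/n: 0.4, 0.1, 0.025
    print(f"n = {n:3d}: var(x̄) = {means.var():.4f}   σ²/n = {sigma**2 / n:.4f}")
```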
How well a specific estimator satisfies each of these criteria depends very much on the details of the
population distribution. Evaluating specific estimators in relation to these criteria often involves
mathematical techniques that are beyond the scope of this course.