Measuring Center
Lecture 15
Sections 5.1 – 5.2
Mon, Feb 11, 2008
Measuring the Center
Often, we would like to have one number
that is “representative” of a population
or sample.
 It seems reasonable to choose a number
that is near the “center” of the distribution
rather than in the left or right extremes.
 But there is no single “correct” way to do
this.

Measuring the Center
Mean – the simple average of a set of
numbers.
 Median – the value that divides the set of
numbers into a lower half and an upper
half.
 Mode – the most frequently occurring
value in the set of numbers.
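The three definitions above can be tried out with Python's standard `statistics` module. This is only a minimal sketch: the list `values` is a hypothetical sample made up for illustration, not data from the lecture.

```python
from statistics import mean, median, mode

# Hypothetical sample, chosen only to illustrate the three definitions.
values = [2, 3, 3, 5, 7, 10]

center_mean = mean(values)      # sum of the values divided by their count
center_median = median(values)  # n is even, so the average of the two middle values
center_mode = mode(values)      # the most frequently occurring value

print("mean:", center_mean)
print("median:", center_median)
print("mode:", center_mode)
```

For this sample the three measures come out as 5, 4, and 3, so even a small data set can give three different "centers".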

Measuring the Center
In a unimodal, symmetric distribution,
these values will all be near the center.
 In skewed distributions, they will be spread
out.

Mean, Median, and Mode

If a distribution is symmetric and unimodal, then the
mean, median, and mode are all the same
and are all at the center of the distribution.
Mean, Median, and Mode

However, if the distribution is skewed, then
the mean, median, and mode are all
different.
 The mode is at the peak.
 The mean is shifted in the direction of
skewing.
 The median is (typically) between the mode
and the mean.
[Figure: skewed distribution curve with the positions labeled, left to right: Mode, Median, Mean.]
The Median vs. The Mean
If the data are strongly skewed, then the
median generally gives a more
representative value.
 If the data are not skewed, then the mean
is usually preferred.

The Mean
Why is the average usually a good
measure of the center?
 If we have only two numbers, the average
is half way between them.
 What if we have more than two numbers?
 The mean balances the “deviations” on the
left with the “deviations” on the right.
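The balancing idea can be checked numerically. The sketch below uses a hypothetical data set (the values in `data` are an assumption chosen for illustration) and shows that the deviations below the mean and the deviations above it cancel.

```python
# Hypothetical data on a 1-to-10 scale, chosen only for illustration.
data = [1, 4, 7, 8, 10]

average = sum(data) / len(data)

# Deviation of each value from the mean: negative on the left, positive on the right.
deviations = [x - average for x in data]

left = sum(d for d in deviations if d < 0)    # total deviation below the mean
right = sum(d for d in deviations if d > 0)   # total deviation above the mean

print("average:", average)
print("deviations:", deviations)
print("left total:", left, "right total:", right)
```

Here the mean is 6, the left deviations total -7, and the right deviations total +7, so the two sides balance exactly.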

The Mean
[Figure: number line from 1 to 10 with the average marked. The deviations to the left of the average (-5 and -2) balance the deviations to the right (+1, +2, and +4).]
The Median
[Figure: number line from 1 to 10 with the median marked. Deviations of -6 and -3 lie to the left of the median, and deviations of +1 and +3 lie to the right.]
The Mean
We use the letter x to denote a value from
the sample or population.
 The symbol $\Sigma$ means “add them all up.”
 So, $\sum x$ means add up all the values in the
population or sample (depending on the
context).
 Then the sample mean is $\frac{\sum x}{n}$.

The Mean
We denote the mean of a sample by the
symbol $\bar{x}$, pronounced “x bar”.
 We denote the mean of a population by
$\mu$, pronounced “mu” (myoo).
 Therefore,
$$\bar{x} = \frac{\sum x}{n}, \qquad \mu = \frac{\sum x}{N}$$
TI-83 – The Mean
Enter the data into a list, say L1.
 Press STAT > CALC > 1-Var Stats.
 Press ENTER. “1-Var-Stats” appears.
 Type L1 and press ENTER.
 A list of statistics appears. The first one is
the mean.

Case Study 8

2007 Small Arms Study (p. 47)
 Find the average number of guns per country
for India, China, Germany, France, and
Pakistan.
 Is the value representative of the group?
 Then include the U.S. and compute the
average for the six countries.
 Is the value representative of the group?
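As a rough check of this exercise, the sketch below applies the "add them all up, divide by n" rule. It assumes the figures listed on the TI-83 median slide of this lecture (46, 40, 25, 19, 18, and 270 for the U.S.) are the gun counts in millions; the helper name `sample_mean` is hypothetical.

```python
def sample_mean(values):
    """Return the sum of the values divided by how many there are."""
    return sum(values) / len(values)

# Assumed gun counts in millions, taken from the lists shown later in the lecture.
five_countries = [46, 40, 25, 19, 18]
six_countries = five_countries + [270]   # the same five plus the U.S.

mean_five = sample_mean(five_countries)
mean_six = sample_mean(six_countries)

print("five countries:", mean_five)
print("six countries:", round(mean_six, 1))
```

Under that assumption the five-country mean is 29.6 million, while adding the single value 270 pulls the six-country mean up to about 69.7 million, which is larger than five of the six values.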
The Median
Median – The middle value, or the
average of the middle two values, of a
sample or population, when the values are
arranged from smallest to largest.
 The median, by definition, is at the 50th
percentile.
 It separates the lower 50% of the sample from
the upper 50%.
The Median
When n is odd, the median is the middle
number, which is in position (n + 1)/2.
 When n is even, the median is the average
of the middle two numbers, which are in
positions n/2 and n/2 + 1.
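The position rule above translates directly into code. The following is a minimal sketch; the function name `median_by_position` is hypothetical. The main thing to watch is the shift from the 1-based positions used in the rule to Python's 0-based list indices.

```python
def median_by_position(values):
    """Median found by the position rule: (n + 1)/2 for odd n, n/2 and n/2 + 1 for even n."""
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the middle number is in position (n + 1)/2 (1-based).
        position = (n + 1) // 2
        return ordered[position - 1]  # subtract 1 for 0-based indexing
    # Even n: average the numbers in positions n/2 and n/2 + 1 (1-based).
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median_by_position([5, 1, 3]))       # odd n
print(median_by_position([4, 1, 3, 2]))    # even n
```

Sorting first matters: the rule refers to positions in the ordered list, not in the order the data happened to be entered.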

Case Study 8

2007 Small Arms Study (p. 47)
 Find the median number of guns per country
for India, China, Germany, France, and
Pakistan.
 Is the value representative of the group?
 Then include the U.S. and compute the
median for the six countries.
 Is the value representative of the group?
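To see how differently the median and the mean react to one extreme value, the sketch below compares both for the five-country and six-country lists. It assumes the figures shown on the TI-83 median slide (46, 40, 25, 19, 18, plus 270 for the U.S.) are the gun counts in millions.

```python
from statistics import mean, median

# Assumed gun counts in millions, taken from the lists on the TI-83 slide.
five = [46, 40, 25, 19, 18]
six = five + [270]                # the same five plus the U.S.

median_five = median(five)        # middle of the five sorted values
median_six = median(six)          # average of the two middle sorted values

# How far each measure moves when the sixth value is added.
median_shift = median_six - median_five
mean_shift = mean(six) - mean(five)

print("medians:", median_five, median_six)
print("median shift:", median_shift)
print("mean shift:", round(mean_shift, 1))
```

Under that assumption the median moves from 25 to 32.5, a shift of 7.5, while the mean moves by about 40, which illustrates why the median is usually the more representative value for strongly skewed data.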
TI-83 – The Median
Follow the same procedure that was used
to find the mean.
 When the list of statistics appears, scroll
down to the one labeled “Med.” It is the
median.

TI-83 – The Median

Use the TI-83 to find the median number
of guns.
 46, 40, 25, 19, 18.
 46, 40, 25, 19, 18, 270.
The Mode
Mode – The value in the sample or
population that occurs most frequently.
 The mode is a good indicator of the
distribution’s central peak, if it has one.

Mode
The problem is that many distributions do
not have a peak or they have several
peaks.
 In other words, the mode does not
necessarily exist or there may be several
modes.
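The three situations (one mode, several modes, no mode) can be told apart by counting how often each value occurs. This is a sketch under one assumed convention: when every distinct value is tied for most frequent, the helper reports no mode. The function name `modes` and the sample lists are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the most frequent values; empty if there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumed convention: if all distinct values are tied, report no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 2, 3]))       # a single mode
print(modes([1, 2, 2, 3, 3]))    # two modes
print(modes([1, 2, 3]))          # every value occurs once: no mode
```

Other conventions exist; for example, Python's `statistics.multimode` returns every value when all are tied rather than an empty list.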

Weighted Means
For the countries India, China, Germany,
France, and Pakistan, the average number
of guns per country is 29.6 million.
 For Mexico and Brazil, the average is 15.4
million.
 What is the average for all seven
countries?

Weighted Means

The averages are
 India, et al: $\bar{x}_1 = 29.6$.
 Mexico & Brazil: $\bar{x}_2 = 15.4$.
 How could we combine the two averages
to get the average for all seven countries?
Weighted Means

Compute the weighted average:
$$\bar{x} = \frac{5(29.6) + 2(15.4)}{7}$$
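The weighted average can be sketched as follows: each group mean is multiplied by its group size, and the total is divided by the combined size. The helper name `weighted_mean` is hypothetical; the group means (29.6 and 15.4 million) and group sizes (5 and 2) come from the slides.

```python
def weighted_mean(means, sizes):
    """Combine group means, weighting each by the number of values in its group."""
    if len(means) != len(sizes):
        raise ValueError("means and sizes must have the same length")
    total = sum(m * n for m, n in zip(means, sizes))   # recovers the overall sum
    return total / sum(sizes)

# Group means in millions of guns, with the number of countries in each group.
combined = weighted_mean([29.6, 15.4], [5, 2])

# For comparison: simply averaging the two means ignores the group sizes.
unweighted = (29.6 + 15.4) / 2

print("weighted:", round(combined, 2))
print("unweighted:", round(unweighted, 2))
```

The weighted result is about 25.54 million, closer to 29.6 than the unweighted 22.5, because five of the seven countries belong to the first group.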