CHAPTER 2 – DESCRIBING DISTRIBUTIONS WITH NUMBERS - DESCRIPTIVE STATISTICS
TOPICS COVERED - Sections shown with numbers as in e-book
Any topic listed on this document and not covered in class must be studied “On Your Own” (OYO)
Section 2.1 – MEASURING THE CENTER – THE MEAN (pg. 40)
• Mean
Section 2.2 – MEASURING THE CENTER – THE MEDIAN (pg. 41)
• Median
o Odd number of observations (n)
o Even number of observations (n)
• Resistant measure
Section 2.3 – COMPARING THE MEAN AND THE MEDIAN (pg. 42)
• Comparing the mean and the median – shape of distributions
• Which measure is more appropriate to use? The mean, or the median?

From STUDENT RESOURCES, select “STATISTICAL APPLETS”, click GO, then select
Statistical Applets: Mean and Median
Section 2.4 – MEASURING SPREAD – THE QUARTILES (pg. 43)
• The quartiles
• Finding quartiles
o Odd n
o Even n
Section 2.5 – THE FIVE-NUMBER SUMMARY (pg. 45)
• The five-number summary
• Boxplots (most of the time, these will be constructed with technology)
• Boxplots and shapes of distributions
o What does a longer “section” mean?
• Using boxplots to compare data sets
Section 2.6 – SPOTTING SUSPECTED OUTLIERS (pg. 47)
• Outliers
• Range
• Interquartile range (IQR)
• Using the 1.5 × IQR rule to determine outliers
Section 2.7 – MEASURING SPREAD – THE STANDARD DEVIATION (pg. 49)
• The standard deviation
• Degrees of freedom
• Resistant or not?
Section 2.8 – CHOOSING MEASURES OF CENTER AND SPREAD (pg. 52)
• When to use the five-number summary
• When to use the mean and standard deviation
Section 2.9 – USING TECHNOLOGY (pg. 53)
Section 2.10 – ORGANIZING A STATISTICAL PROBLEM (pg. 55)
Demonstrate the use of the ONE-VARIABLE STATISTICAL APPLET
Get into the E-BOOK
Go to STUDENT RESOURCES
Select STATISTICAL APPLETS, ALL CHAPTERS and click GO
Select Statistical Applets: One-Variable Statistical Calculator
From the DATA SETS, select TABLE 1.1
Click on DATA to explore the data about “Percent of state population born outside the US”
Click on HISTOGRAM and describe the shape. THINK: which will be larger, the mean or the median?
Click on STATISTICS and record them on paper (do the answers agree with what you predicted in the previous step?)
Construct a BOX PLOT by hand
Use the calculator to find the 1-Variable Statistics of a data set
Press STAT
Select EDIT
Enter the data in L1
Press STAT
Arrow right to CALCULATE
Select 1:1-Var Stats by pressing ENTER
Select L1 by pressing 2nd 1 [L1]
Press ENTER
Scroll down to observe the five-number summary
Use the calculator to construct a box plot
Enter data in one list, let’s say L1 (STAT, EDIT)
Press 2nd Y= to access STAT PLOT
Select one plot and press ENTER to turn it ON
Go down and right and select the box plot with outliers by pressing ENTER
Press ZOOM and select 9:ZoomStat to see the graph
Press TRACE and arrow right and left to read the FIVE-NUMBER SUMMARY

A numerical summary of a distribution should report at least its center and its spread or variability.

The mean x̄ and the median M describe the center of a distribution in different ways. The mean is the
arithmetic average of the observations, and the median is the midpoint of the values.
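
As a rough illustration of the two definitions above, the following Python sketch computes both measures by hand, covering the odd-n and even-n cases for the median. The function names and the data values are made up for this example and do not come from the e-book.

```python
def mean(values):
    """Arithmetic average: sum of the observations divided by their count."""
    return sum(values) / len(values)


def median(values):
    """Midpoint of the observations once they are sorted."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle observation.
        return ordered[mid]
    # Even n: the average of the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data, invented for illustration only.
odd_data = [5, 3, 8, 1, 9]
even_data = [5, 3, 8, 1, 9, 40]

print(mean(odd_data), median(odd_data))
print(mean(even_data), median(even_data))
```

In the second data set the single large value 40 pulls the mean well above the median, which is the kind of difference the histogram exercise asks you to predict.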

When you use the median to indicate the center of the distribution, describe its spread by giving the
quartiles. The first quartile Q1 has one-fourth of the observations below it, and the third quartile Q3 has
three-fourths of the observations below it.
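
One way to find the quartiles by hand is sketched below in Python. It assumes the convention that Q1 is the median of the observations to the left of the overall median's position and Q3 is the median of those to its right, leaving out the middle observation when n is odd. Statistical software may use other conventions and give slightly different values. The function name and data are hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two observations.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    # Observations to the left of the median's position.
    lower = ordered[:half]
    # Observations to the right; skip the middle one when n is odd.
    upper = ordered[half + (n % 2):]
    return median(lower), median(upper)


# Hypothetical data, invented for illustration only.
odd_data = [1, 3, 5, 8, 9, 12, 15]   # n = 7, median 8 is left out of both halves
even_data = [1, 3, 5, 8, 9, 12]      # n = 6, data split into two halves of 3

print(quartiles(odd_data))   # (3, 12)
print(quartiles(even_data))  # (3, 9)
```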

The five-number summary consisting of the median, the quartiles, and the smallest and largest
individual observations provides a quick overall description of a distribution. The median describes the center,
and the quartiles and extremes show the spread.
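
A minimal Python sketch of the five-number summary follows. It assumes the quartiles are found as medians of the lower and upper halves of the sorted data, so results may differ slightly from software that uses another quartile convention. The function name and the data are invented for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    # Skip the middle observation when n is odd.
    q3 = median(ordered[half + (n % 2):])
    return ordered[0], q1, median(ordered), q3, ordered[-1]


# Hypothetical data, invented for illustration only.
data = [13, 2, 7, 4, 9, 21, 5, 11]

print(five_number_summary(data))  # (2, 4.5, 8.0, 12.0, 21)
```

The five values are returned in increasing order, which is also the order in which they are read off a boxplot from one end to the other.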

Boxplots based on the five-number summary are useful for comparing several distributions. The box
spans the quartiles and shows the spread of the central half of the distribution. The median is marked within
the box. Lines extend from the box to the extremes and show the full spread of the data.
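
The outline for Section 2.6 lists the 1.5 × IQR rule for spotting suspected outliers. The sketch below shows one way to apply it: an observation is flagged when it falls more than 1.5 × IQR below Q1 or above Q3. The quartile convention (medians of the lower and upper halves), the function name, and the data are assumptions made for this example.

```python
from statistics import median


def suspected_outliers(values):
    """Return the observations flagged by the 1.5 x IQR rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + (n % 2):])
    iqr = q3 - q1                    # interquartile range
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in ordered if x < low_fence or x > high_fence]


# Hypothetical data, invented for illustration only.
data = [2, 4, 5, 7, 9, 11, 13, 40]

# Here Q1 = 4.5, Q3 = 12.0, IQR = 7.5, so the fences are -6.75 and 23.25.
print(suspected_outliers(data))  # [40]
```

A flagged value is only a suspected outlier. The rule points to observations worth a closer look; it does not decide whether they should be removed.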

The variance s² and especially its square root, the standard deviation s, are common measures of spread
about the mean as center. The standard deviation s is zero when there is no spread and gets larger as the
spread increases.
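
The calculation can be sketched in Python as follows, assuming the usual sample formula in which the sum of squared deviations from the mean is divided by n − 1 (the degrees of freedom). The function name and the data are made up for this example. The result is compared with `statistics.stdev` from the Python standard library, which uses the same n − 1 divisor.

```python
import math
import statistics


def sample_std_dev(values):
    """Standard deviation s with the n - 1 divisor."""
    n = len(values)
    x_bar = sum(values) / n
    # Squared deviation of each observation from the mean.
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    variance = sum(squared_deviations) / (n - 1)   # this is s squared
    return math.sqrt(variance)


# Hypothetical data, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
no_spread = [6, 6, 6, 6]

print(sample_std_dev(data), statistics.stdev(data))
print(sample_std_dev(no_spread))  # 0.0, since every observation equals the mean
```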

A resistant measure of any aspect of a distribution is relatively unaffected by changes in the numerical
value of a small proportion of the total number of observations, no matter how large these changes are. The
median and quartiles are resistant, but the mean and the standard deviation are not.
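
A small experiment makes the idea of resistance concrete. The Python sketch below changes one observation in a made-up data set to an extreme value and recomputes the summaries with the standard library's `statistics` module. The data values are hypothetical.

```python
import statistics

# Hypothetical data, invented for illustration only.
original = [10, 12, 13, 15, 16, 18, 20]
# Same data with the largest observation replaced by an extreme value.
altered = original[:-1] + [200]

for label, data in (("original", original), ("altered", altered)):
    print(
        label,
        "mean =", round(statistics.mean(data), 2),
        "median =", statistics.median(data),
        "s =", round(statistics.stdev(data), 2),
    )
```

The median is 15 in both cases, while the mean and the standard deviation both increase sharply after the change.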

The mean and standard deviation are good descriptions for symmetric distributions without outliers.
They are most useful for the Normal distributions introduced in the next chapter. The five-number summary
is a better description for skewed distributions.


Numerical summaries do not fully describe the shape of a distribution. Always plot your data.

A statistical problem has a real-world setting. You can organize many problems using the four steps: state,
plan, solve, and conclude.