```Organizing Data
AP Stats Chapter 1
Organizing Data

Categorical
 • Dotplot (also used for quantitative)
 • Bar graph
 • Pie chart

Quantitative
 • Stemplots
    • Unreasonable with large data sets
 • Histogram
    • Frequency/relative frequency
Describing Distributions
Remember “SECS-C”
 • S – Shape
 • E – Extreme values (outliers)
 • C – Center
 • S – Spread
 • C – Context
 • **Make meaningful descriptions and comparisons. Don’t just list numbers.**

Shape

Symmetric
 • Values smaller and larger than the midpoint are mirror images.
Skewed
 • The tail on one end is much longer than the other tail.
[Figure: example of a symmetric distribution]
[Figure: examples of skewed distributions]
Ways to Measure Center

Mean
 \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i
 • The mean is not a resistant measure of center (it is sensitive to outliers).
 • Used mostly with symmetric distributions.

Ways to Measure Center

Median
 • Midpoint of a distribution
 • Median is a resistant measure of center.
 • Used with symmetric or skewed distributions.


Ways to Measure Spread

1) Range
 • Highest value – lowest value
 • Problem: could be based on outliers
2) Quartiles (for use with median)
 • pth percentile – value such that p percent of the observations fall at or below it
 • Q1 (quartile 1): 25th percentile
    • Median of the lower half of the ordered data
 • Q3 (quartile 3): 75th percentile
    • Median of the upper half of the ordered data

5 Number Summary
 • Minimum, Q1, median, Q3, maximum
 • The 5-number summary for a distribution can be illustrated in a boxplot.

1.5 x IQR Rule for Outliers
 • IQR = Q3 – Q1 (interquartile range)
 • Rule: If an observation falls more than 1.5 x IQR above Q3 or below Q1, then we consider it an outlier.
 • The 5 Number Summary can be used for distributions which are skewed, or which have strong outliers.

Standard deviation (for use with the mean)
 • Std Dev tells you, on average, how far each observation is from the mean.

 variance: s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
 std dev:  s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
Properties of Standard Deviation

 • s gets larger as the data become more spread out.
 • Only use mean and std dev for reasonably symmetric distributions which are free of outliers.
Linear Transformation of Data

 • x_new = a + bx
 • The shape of the distribution does not change.
 • Multiplying each observation by a positive number, b, multiplies both measures of center and measures of spread by b.
 • Adding the same number, a, to each observation adds a to measures of center and to quartiles, but does not change measures of spread.
```
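The summary measures in the slides can be tried out on a small data set. The following is a minimal sketch in plain Python, not the course's official method. The data values and all function names are made up for illustration. The quartile convention is also an assumption: when n is odd, the overall median is left out of both halves. Software packages often use other conventions and may give slightly different quartiles.

```python
import math

def mean(xs):
    # Arithmetic mean: sum of the observations divided by n.
    return sum(xs) / len(xs)

def median(xs):
    # Midpoint of the ordered data; average the two middle values when n is even.
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

def quartiles(xs):
    # Q1 and Q3 as medians of the lower and upper halves of the ordered data.
    # Assumption: for odd n, the overall median is excluded from both halves.
    s = sorted(xs)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half + 1:] if len(s) % 2 else s[half:]
    return median(lower), median(upper)

def five_number_summary(xs):
    q1, q3 = quartiles(xs)
    return (min(xs), q1, median(xs), q3, max(xs))

def outliers(xs):
    # 1.5 x IQR rule: flag values more than 1.5 x IQR below Q1 or above Q3.
    q1, q3 = quartiles(xs)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in xs if x < low_fence or x > high_fence]

def sample_std(xs):
    # Sample standard deviation with the n - 1 divisor.
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

# Made-up data with one unusually large value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

print(five_number_summary(data))  # (2, 4.0, 6, 8.5, 30)
print(outliers(data))             # [30]

# Resistance: dropping the outlier moves the mean much more than the median.
trimmed = [x for x in data if x != 30]
print(mean(data), mean(trimmed))
print(median(data), median(trimmed))

# Linear transformation x_new = a + b*x with a = 10, b = 2 (arbitrary choices).
a, b = 10, 2
shifted = [a + b * x for x in data]
print(median(shifted), sample_std(shifted), sample_std(data))
```

With a = 10 and b = 2, the median of the transformed data is 10 + 2 × 6 = 22. The standard deviation and IQR are doubled, because adding a has no effect on spread.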