STATISTICS: THE SCIENCE OF DATA
METHODS FOR COLLECTING, SUMMARIZING, ANALYZING, AND INTERPRETING DATA
WHY STUDY STATISTICS?
To understand information involving "chance": polls, advertising, sports, etc.;
To read and produce research results: tables, graphs, reports;
To develop analytic and critical thinking skills.

STATISTICAL PROCESS

COLLECT DATA (usually a set of numbers, often messy)

SUMMARIZE DATA (graphically or numerically)

ANALYZE DATA (use statistical methods)

DRAW CONCLUSIONS: make inferences or decisions about the population after observing only a subset of it, a sample (use more statistical methods)

POPULATION, SAMPLE AND INFERENCE
POPULATION – all the data one can collect on a topic of interest
Examples

1. Want to know the chances of a STAT152 student getting an A.
Population: all STAT152 students.

2. Want to know the "average family income" in the US.
Population: incomes of all US families (over 100 million).

3. Want to know if a coin is fair, i.e. if the chances of it coming up heads (H) or tails (T) are 50% each. To figure that out we need to keep tossing the coin and recording the results.
Population: an infinite number of results.
SAMPLE

Problem: Populations are often difficult or impossible to deal with or observe.
Solution: Use a representative subset of the population for analysis: a sample!
Representative: select units randomly. For example, in a simple random sample every element of the population has the same chance of being selected.

Advantages of random sampling:
Reduced cost
Possibility of measuring the precision of sample estimates
Good accuracy
Sometimes sampling is the only way to get information about the population.
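
A simple random sample can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 numbered units, the sample size of 50, and the helper name simple_random_sample are all made up for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every unit has the same chance of being selected."""
    rng = random.Random(seed)          # seeded generator so the draw is repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: 1,000 units identified by the numbers 1..1000.
population = list(range(1, 1001))
sample = simple_random_sample(population, n=50, seed=1)

print(len(sample))   # number of units drawn
print(sample[:5])    # first few selected unit IDs
```

Because the selection is left to a random number generator, no unit is favored over any other, which is what makes the sample representative.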
POPULATION PARAMETERS and SAMPLE STATISTICS

Population characteristics such as center or spread: parameters.
Sample characteristics (computed from sample values): statistics.
Example: population mean μ, sample mean x̄.
We use sample statistics as estimates of the population parameters.
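
The difference between a parameter and a statistic can be illustrated with a small sketch. The population here is made-up data (10,000 simulated values), chosen only so that the population mean can be computed directly and compared with the mean of a random sample; in practice the population mean is usually unknown.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 10,000 simulated values (not real data).
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Parameter: a characteristic of the whole population.
mu = statistics.mean(population)

# Statistic: the same characteristic computed from a random sample of 100 values.
sample = rng.sample(population, 100)
x_bar = statistics.mean(sample)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The sample mean will usually differ a little from the population mean, and a different random sample would give a different value.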
INFERENCE
Typical statistical inference:
Making statements about population parameters using sample statistics (estimation, hypothesis testing): parametric inference.
Inference not involving parameters: nonparametric inference (not included in this class).
[Diagram: Population, Sample, Parameter, Statistic, Inference]
EXAMPLE

Is a coin fair?
p = probability that the coin comes up H.
Fair coin → p = 0.5

Collect data: Toss the coin 100 times.
DATA: H, H, T, H, T, T, T, ….
Summarize data: 20 H and 80 T.

Analyze data: Compute the sample proportion of H:
ps = 20/100 = 0.2

Inference/conclusion: Looks like the coin favors T, so the coin is not fair.
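
The arithmetic in this example can be checked with a short sketch. The sample proportion comes straight from the slide; the second calculation, the chance that a fair coin gives 20 or fewer heads in 100 tosses, is an added illustration of why the result "looks" unfair, not a method stated in the slides. The function name prob_at_most is hypothetical.

```python
from math import comb

n = 100       # number of tosses
heads = 20    # observed number of H

# Sample proportion of heads: the statistic that estimates p.
p_s = heads / n
print(p_s)  # 0.2

def prob_at_most(k, n, p=0.5):
    """Probability of k or fewer heads in n independent tosses with P(H) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# If the coin were fair (p = 0.5), how likely is a result this extreme?
p_tail = prob_at_most(heads, n)
print(p_tail)  # a very small number
```

A fair coin would almost never produce so few heads in 100 tosses, which supports the conclusion that the coin favors T.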