CHAPTER 10
Data Analysis and Probability
Chapter Focus
In this chapter you will learn how to organize and analyze categorical data
by using relative frequencies. You will analyze numerical data by calculating
statistics that locate the center and measure the spread of the data. Then you
will see how displaying numerical data in various ways helps you make sense of
it, especially when the amount of data is substantial. You will also learn about
the probability of events and distinguish between dependent and independent
events in order to calculate their probabilities.
Chapter at a Glance
Lesson                                    Standards for Mathematical Content
10-1  Organizing and Displaying Data      CC.9-12.S.ID.5
10-2  Frequency and Histograms            CC.9-12.S.ID.1, CC.9-12.S.ID.2
10-3  Data Distributions                  CC.9-12.S.ID.1, CC.9-12.S.ID.2
10-4  Misleading Graphs and Statistics    CC.9-12.S.ID.1, CC.9-12.S.ID.2, CC.9-12.S.ID.3
10-5  Experimental Probability            CC.9-12.S.CP.1
10-6  Theoretical Probability             CC.9-12.S.CP.1
10-7  Independent and Dependent Events    CC.9-12.S.CP.2, CC.9-12.S.CP.3, CC.9-12.S.CP.8(+)
Performance Tasks
Assessment Readiness
© Houghton Mifflin Harcourt Publishing Company
Unpacking the Standards
Understanding the standards and the vocabulary terms in the standards will
help you know exactly what you are expected to learn in this chapter.
CC.9-12.S.ID.1
Represent data with plots on the real number line (dot plots, histograms, and
box plots).
What It Means For You
Lessons 10-2, 10-3, 10-4
Displaying numerical data on the real number line gives you an instant
visual image of how the data are distributed, and helps you draw
conclusions about the center and spread of the data.
EXAMPLE
Histogram
A histogram gives you an overall picture of how data are distributed, but
does not indicate any particular values or statistics.
[Figure: histogram titled "Golf Tournament Scores". Horizontal axis: Scores, in
intervals 68–70, 71–73, 74–76, 77–79, 80–82. Vertical axis: Frequency, 0 to 6.]
EXAMPLE
Box-and-whisker plot
A box-and-whisker plot includes five statistical values.
[Figure: box-and-whisker plot over a number line from 0 to 14, labeled Minimum,
First quartile, Median, Third quartile, Maximum.]
CC.9-12.S.ID.3
Interpret differences in shape, center, and spread in the context of the data
sets, accounting for possible effects of extreme data points (outliers).
Key Vocabulary
outlier (valor extremo) A data value that is far removed from the rest of
the data.
What It Means For You
Lesson 10-4
Always examine the displays and statistics for a data set in its own
particular context so that you can draw valid conclusions.
EXAMPLE
Outliers
[Figure: data display labeled "Most of data", "Mean", "Median", and "Much
different value".]
The outlier pulls the mean to the right. The
median better represents a “typical” value.
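The effect of an outlier on the mean and the median can be checked numerically. The sketch below is only an illustration: the data values are hypothetical (the original figure's data are not given), and it relies on Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical data: most values cluster near 10 (assumed for illustration).
typical = [8, 9, 10, 10, 11, 12]
# The same data with one much larger value added as an outlier.
with_outlier = typical + [45]

mean_before, median_before = mean(typical), median(typical)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean moves toward the outlier; the median stays at a typical value.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

With these assumed values, the single outlier moves the mean well away from where most of the data lie, while the median does not change.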
© Houghton Mifflin Harcourt Publishing Company; Photo credit: © Corbis
Key Vocabulary (CC.9-12.S.ID.1)
histogram (histograma) A bar graph used to display data grouped in intervals.
box-and-whisker plot (gráfica de mediana y rango) A method of showing how data
are distributed by using the median, quartiles, and minimum and maximum values;
also called a box plot.
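The bars of a histogram come from counting how many data values fall in each interval. The following sketch shows that counting step for the five score intervals in the golf example. The individual scores are invented for illustration, since the data behind the original figure are not given.

```python
# Hypothetical golf scores; only the interval boundaries come from the example.
scores = [68, 70, 71, 72, 72, 73, 74, 75, 75, 76, 78, 79, 81]
intervals = [(68, 70), (71, 73), (74, 76), (77, 79), (80, 82)]

# Frequency of each interval: the number of scores between low and high, inclusive.
frequencies = {
    f"{low}-{high}": sum(low <= s <= high for s in scores)
    for low, high in intervals
}

# Print a simple text histogram, one row per interval.
for label, count in frequencies.items():
    print(f"{label:>5} | {'#' * count}")
```

Every score lands in exactly one interval, so the frequencies add up to the number of data values.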
CC.9-12.S.ID.5
Summarize categorical data for two
categories in two-way frequency tables.
Interpret relative frequencies in the
context of the data (including joint,
marginal, and conditional relative
frequencies). Recognize possible
associations and trends in the data.
Key Vocabulary
frequency table (tabla de frecuencia)
A table that lists the number of times,
or frequency, that each data value
occurs.
joint relative frequency (frecuencia
relativa conjunta) The ratio of the
frequency in a particular category
to the total number of data
values.
marginal relative frequency (frecuencia relativa marginal) The
sum of the joint relative frequencies
in a row or column of a two-way
table.
conditional relative frequency (frecuencia relativa condicional) The
ratio of a joint relative frequency to a
related marginal relative frequency in
a two-way table.
CC.9-12.S.CP.2
Understand that two events A and B are
independent if the probability of A and B
occurring together is the product of their
probabilities, and use this characterization
to determine if they are independent.
Key Vocabulary
independent events (sucesos
independientes) Events for which the
occurrence or non-occurrence of one
event does not affect the probability
of the other event.
probability (probabilidad) A number
from 0 to 1 (or 0% to 100%) that is the
measure of how likely an event is to
occur.
What It Means For You
Lesson 10-1
Two-way frequency tables give you a visual way to organize data
categorized by two different variables so that you can more easily
identify relationships.
EXAMPLE
A two-way relative frequency table
The table shows the portions of households in a study that own a dog,
a cat, both, or neither.
                      Owns a cat
Owns a dog      Yes       No        Total
Yes             0.15      0.24      0.39
No              0.18      0.43      0.61
Total           0.33      0.67      1

Joint relative frequencies: the four Yes/No entries inside the table.
Marginal relative frequencies: the entries in the Total row and Total column.

Here are a few conclusions you can draw from the table:
• 39% own a dog, and 33% own a cat.
• 15% own a dog and a cat, and 43% own neither.
• Of dog owners, 15/39 ≈ 38% also own a cat.
• Of cat owners, 15/33 ≈ 45% also own a dog.
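The conclusions above can be reproduced from the four joint relative frequencies alone. The sketch below is one possible way to do it; the dictionary layout and variable names are illustrative choices, not part of the original text.

```python
# Joint relative frequencies from the table: (owns a dog, owns a cat) -> share.
joint = {
    ("yes", "yes"): 0.15,
    ("yes", "no"): 0.24,
    ("no", "yes"): 0.18,
    ("no", "no"): 0.43,
}

# Marginal relative frequencies: sum the joint values across a row or column.
dog_yes = sum(p for (dog, cat), p in joint.items() if dog == "yes")
cat_yes = sum(p for (dog, cat), p in joint.items() if cat == "yes")

# Conditional relative frequencies: joint value divided by the related marginal.
cat_given_dog = joint[("yes", "yes")] / dog_yes
dog_given_cat = joint[("yes", "yes")] / cat_yes

print(f"own a dog: {dog_yes:.2f}, own a cat: {cat_yes:.2f}")
print(f"dog owners who also own a cat: {cat_given_dog:.0%}")
print(f"cat owners who also own a dog: {dog_given_cat:.0%}")
```

Dividing 0.15 by 0.39 gives the same ratio as 15/39, so working with decimals or percents leads to the same conditional relative frequencies.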
What It Means For You
Lessons 10-6, 10-7
Two events are independent if the occurrence of one event does not
affect the occurrence of the other. When two events A and B are
independent, the probability of both events occurring together is
P(A and B) = P(A) · P(B).
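The product rule can also be used as a check for independence. As a rough sketch, the code below applies it to the dog and cat relative frequencies from the Lesson 10-1 example, treating them as probabilities. The rounding tolerance of 0.005 is an assumption made here, chosen because the table values are given to two decimal places.

```python
import math

p_dog = 0.39          # P(owns a dog), marginal relative frequency from the table
p_cat = 0.33          # P(owns a cat), marginal relative frequency from the table
p_dog_and_cat = 0.15  # P(owns a dog and a cat), joint relative frequency

# If the events were independent, P(A and B) would equal P(A) * P(B).
product = p_dog * p_cat

# Assumed tolerance: allow for values rounded to two decimal places.
independent = math.isclose(p_dog_and_cat, product, abs_tol=0.005)

print(f"P(dog) * P(cat) = {product:.4f}, P(dog and cat) = {p_dog_and_cat}")
print("independent under this check:", independent)
```

Under this simple check the product is noticeably smaller than the joint value, which suggests that owning a dog and owning a cat are not independent in this study.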
EXAMPLE
Independent Events
You choose a marble without looking and
put it back. Then you choose a second marble.
P(blue and then black) = 1/2 · 1/2 = 1/4
NON-EXAMPLE
Dependent Events
You choose a marble without looking and don’t put it back. Then you
choose a second marble.
P(blue and then black) = 1/2 · 2/3 = 1/3
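Both marble results can be verified by listing every equally likely pair of draws. The sketch below assumes a bag of 2 blue and 2 black marbles. The text does not state the contents of the bag, but this choice is consistent with the probabilities 1/2, 1/2, and 2/3 used above. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

# Assumed bag contents (not stated in the text): 2 blue and 2 black marbles.
bag = ["blue", "blue", "black", "black"]

def p_blue_then_black(draws):
    """Fraction of equally likely ordered draws that are (blue, black)."""
    draws = list(draws)
    hits = sum(1 for pair in draws if pair == ("blue", "black"))
    return Fraction(hits, len(draws))

# With replacement: the second draw sees the full bag again (independent events).
with_replacement = p_blue_then_black(product(bag, repeat=2))

# Without replacement: the first marble cannot be drawn again (dependent events).
without_replacement = p_blue_then_black(permutations(bag, 2))

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

Putting the first marble back leaves the second draw unchanged, while keeping it out changes which marbles remain, and that is why the two probabilities differ.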
Key Vocabulary
box-and-whisker plot (gráfica de mediana y rango) A method of showing how data are
distributed by using the median, quartiles, and minimum and maximum values; also called a box plot.
complement of an event (complemento de un suceso) All outcomes in the sample space that are
not in an event E, denoted Ē.
conditional probability (probabilidad condicional) The probability of event B, given that event A
has already occurred or is certain to occur, denoted P(B | A); used to find probability of dependent events.
dependent events (sucesos dependientes) Events for which the occurrence or nonoccurrence of one
event affects the probability of the other event.
first quartile (primer cuartil) The median of the lower half of a data set, denoted Q1. Also called
lower quartile.
histogram (histograma) A bar graph used to display data grouped in intervals.
independent events (sucesos independientes) Events for which the occurrence or non-occurrence
of one event does not affect the probability of the other event.
interquartile range (IQR) (rango entre cuartiles) The difference of the third (upper) and first (lower)
quartiles in a data set, representing the middle half of the data.
intersection (intersección de conjuntos) The intersection of two sets is the set of all elements that
are common to both sets, denoted by ∩.
mean (media) The sum of all the values in a data set divided by the number of data values. Also called the
average.
median (mediana) For an ordered data set with an odd number of values, the median is the middle value.
For an ordered data set with an even number of values, the median is the average of the two middle values.
outlier (valor extremo) A data value that is far removed from the rest of the data.
probability (probabilidad) A number from 0 to 1 (or 0% to 100%) that is the measure of how likely an
event is to occur.
range of a data set (rango de un conjunto de datos) The difference of the greatest and least
values in the data set.
standard deviation (desviación estándar) A measure of dispersion of a data set. The standard
deviation σ is the square root of the variance.
third quartile (tercer cuartil) The median of the upper half of a data set. Also called upper quartile.
union (unión) The union of two sets is the set of all elements that are in either set, denoted by ∪.
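Several of the statistics defined in this vocabulary list can be computed together. The sketch below follows the definitions given here: quartiles as medians of the lower and upper halves, IQR as their difference, and standard deviation as the square root of the variance. Two points are assumptions made for this example: the middle value is left out of both halves when the number of values is odd, and the population form of the standard deviation is used. The data set and the function name are hypothetical.

```python
from statistics import median, pstdev

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]              # lower half (middle value excluded if odd)
    upper = data[len(data) - half:]  # upper half (middle value excluded if odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1                  # interquartile range: Q3 minus Q1
data_range = maximum - minimum # range: greatest value minus least value
sigma = pstdev(data)           # population standard deviation

print("five-number summary:", minimum, q1, med, q3, maximum)
print("IQR:", iqr, "range:", data_range, "standard deviation:", sigma)
```

The five values returned by the function are the same five values that a box-and-whisker plot displays.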