Chapter 6: Random Variables and the Normal
Distribution

6.1 Discrete Random Variables

6.2 Binomial Probability Distribution

6.3 Continuous Random Variables and the
Normal Probability Distribution
6.1 Discrete Random Variables
Objectives:
By the end of this section, I will be able to…
1) Identify random variables.
2) Explain what a discrete probability distribution is and construct probability distribution tables and graphs.
3) Calculate the mean, variance, and standard deviation of a discrete random variable.
Random Variables

A variable whose values are determined by
chance

Chance in the definition of a random variable
is crucial
Example 6.2 - Notation
for random variables
Suppose our experiment is to toss a single fair
die, and we are interested in the number
rolled. We define our random variable X to be
the outcome of a single die roll.
a. Why is the variable X a random variable?
b. What are the possible values that the
random variable X can take?
c. What is the notation used for rolling a 5?
d. Use random variable notation to express
the probability of rolling a 5.
Example 6.2 continued
Solution
a) We don’t know the value of X before we toss the die, which introduces an element of chance into the experiment.
b) Possible values for X: 1, 2, 3, 4, 5, and 6.
c) When a 5 is rolled, then X equals the outcome 5, or X = 5.
d) Probability of rolling a 5 for a fair die is 1/6, thus P(X = 5) = 1/6.
Types of Random Variables

Discrete random variable - either a finite
number of values or countable number of
values, where “countable” refers to the fact
that there might be infinitely many values,
but they result from a counting process

Continuous random variable - infinitely
many values, and those values can be
associated with measurements on a
continuous scale without gaps or
interruptions
Example
Identify each as a discrete or continuous
random variable.
(a) Total amount in ounces of soft drinks you consumed in the past year.
(b) The number of cans of soft drinks that you consumed in the past year.
Example
ANSWER:
(a) continuous
(b) discrete
Example
Identify each as a discrete or continuous
random variable.
(a) The number of movies currently playing
in U.S. theaters.
(b) The running time of a randomly selected
movie
(c) The cost of making a randomly selected
movie.
Example
ANSWER
(a) discrete
(b) continuous
(c) continuous
Discrete Probability Distributions

Provides all the possible values that the
random variable can assume

Together with the probability associated with
each value

Can take the form of a table, graph, or
formula

Describe populations, not samples
Example
Table 6.2 in your textbook
The probability distribution table of
the number of heads observed when
tossing a fair coin twice
Probability Distribution of a
Discrete Random Variable

The sum of the probabilities of all the
possible values of a discrete random variable
must equal 1.

That is, ΣP(X) = 1.

The probability of each value of X must be
between 0 and 1, inclusive.

That is, 0 ≤ P(X ) ≤ 1.
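
The two requirements above can be checked mechanically. The following is a minimal sketch, assuming a distribution is stored as a Python dictionary mapping each value of X to its probability; the function name `is_valid_distribution` and the tolerance are illustrative choices, not part of the slides.

```python
import math

def is_valid_distribution(dist, tol=1e-9):
    """Return True if dist (value -> probability) meets both requirements."""
    probs = list(dist.values())
    # Requirement: every probability lies between 0 and 1, inclusive.
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Requirement: the probabilities sum to 1 (allowing for rounding error).
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

# Fair die from Example 6.2: each face has probability 1/6.
fair_die = {face: 1 / 6 for face in range(1, 7)}
print(is_valid_distribution(fair_die))          # True
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # False: sums to 1.1
```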
Example
Let the random variable x represent the
number of girls in a family of four
children. Construct a table describing
the probability distribution.
Example
Determine the outcomes with a tree diagram:
Example

Total number of outcomes is 16

Total number of ways to have 0 girls is 1
P(0 girls) = 1/16 = 0.0625

Total number of ways to have 1 girl is 4
P(1 girl) = 4/16 = 0.2500

Total number of ways to have 2 girls is 6
P(2 girls) = 6/16 = 0.3750
Example

Total number of ways to have 3 girls is 4
P(3 girls) = 4/16 = 0.2500

Total number of ways to have 4 girls is 1
P(4 girls) = 1/16 = 0.0625
Example
Distribution is:

x    P(x)
0    0.0625
1    0.2500
2    0.3750
3    0.2500
4    0.0625

NOTE: ΣP(x) = 1
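
The counting behind this distribution can be reproduced by listing all 16 birth sequences. This is a small sketch, assuming (as the example implicitly does) that each child is equally likely to be a girl or a boy and that births are independent; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: G (girl) and B (boy) are equally likely and births are
# independent, so all 2**4 = 16 sequences are equally likely.
outcomes = list(product("GB", repeat=4))

# Count how many sequences contain 0, 1, 2, 3, or 4 girls.
counts = Counter(seq.count("G") for seq in outcomes)

# Probability of x girls = (number of sequences with x girls) / 16.
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(x, counts[x], float(p))
```

The counts come out as 1, 4, 6, 4, 1, matching the tree diagram totals.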
Mean of a Discrete Random
Variable

The mean μ of a discrete random variable
represents the mean result when the
experiment is repeated an indefinitely large
number of times

Also called the expected value or
expectation of the random variable X.

Denoted as E(X )

Holds for discrete and continuous random
variables
Finding the Mean of a Discrete
Random Variable

Multiply each possible value of X by its
probability.

Add the resulting products.
μ = Σ[X · P(X)]
Variability of a Discrete
Random Variable
Formulas for the Variance and Standard
Deviation of a Discrete Random Variable
Definition Formulas
σ² = Σ[(X − μ)² · P(X)]
σ = √(Σ[(X − μ)² · P(X)])
Computational Formulas
σ² = Σ[X² · P(X)] − μ²
σ = √(Σ[X² · P(X)] − μ²)
Example
x    P(x)     x·P(x)   x²    x²·P(x)
0    0.0625   0        0     0
1    0.2500   0.25     1     0.2500
2    0.3750   0.75     4     1.5000
3    0.2500   0.75     9     2.2500
4    0.0625   0.25     16    1.0000

μ = Σ[x · P(x)] = 2.0
Σ[x² · P(x)] = 5.0000
σ² = Σ[x² · P(x)] − μ² = 5.0000 − 4.0000 = 1.0000
σ = √1.0000 = 1.0
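
As a check on the arithmetic, here is a short sketch that computes the mean, then the variance by both the definition formula and the computational formula, for the number-of-girls distribution. The helper name `mean_var_sd` is an illustrative choice.

```python
import math

def mean_var_sd(dist):
    """dist maps each value x to P(x). Returns (mu, var_def, var_comp, sd)."""
    mu = sum(x * p for x, p in dist.items())
    # Definition formula: sum of (x - mu)^2 * P(x)
    var_def = sum((x - mu) ** 2 * p for x, p in dist.items())
    # Computational formula: sum of x^2 * P(x), minus mu^2
    var_comp = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2
    return mu, var_def, var_comp, math.sqrt(var_def)

girls = {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}
mu, var_def, var_comp, sd = mean_var_sd(girls)
print(mu, var_def, var_comp, sd)  # 2.0 1.0 1.0 1.0
```

Both variance formulas give the same value, which is a useful sanity check on hand calculations.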
Discrete Probability Distribution
as a Graph

Graphs show all the information contained in
probability distribution tables

Identify patterns more quickly
FIGURE 6.1 Graph of probability distribution
for Kristin’s financial gain.
Example

Page 270
Example


Probability distribution (table)

x    P(x)
0    0.25
1    0.35
2    0.25
3    0.15
Omit graph
Example

Page 270
Example

ANSWER
X = number of goals scored
(a) Probability X is fewer than 3
P(X < 3) = P(X = 0 or X = 1 or X = 2)
= P(X = 0) + P(X = 1) + P(X = 2)
= 0.25 + 0.35 + 0.25
= 0.85
Example
ANSWER
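(b) The most likely number of goals is the value of X with the highest probability. Here that is X = 1 (P = 0.35), which is also the expected value (or mean) of X rounded to the nearest whole number:

x    P(x)    x·P(x)
0    0.25    0
1    0.35    0.35
2    0.25    0.50
3    0.15    0.45

μ = Σ[x · P(x)] = 1.3 ≈ 1
She will most likely score one goal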
Example
ANSWER
(c) Probability X is at least one

P(X ≥ 1) = P(X = 1 or X = 2 or X = 3)
= P(X = 1) + P(X = 2) + P(X = 3)
= 0.35 + 0.25 + 0.15
= 0.75
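
The three answers for the goals example can be reproduced from the probability table. This is a minimal sketch; the dictionary `goals` simply restates the table, and the variable names are illustrative.

```python
# Probability distribution for X = number of goals scored (from the table).
goals = {0: 0.25, 1: 0.35, 2: 0.25, 3: 0.15}

# (a) P(X < 3): add the probabilities of the values 0, 1, and 2.
p_fewer_than_3 = sum(p for x, p in goals.items() if x < 3)

# (b) Expected value, and the single value with the highest probability.
expected = sum(x * p for x, p in goals.items())
most_likely = max(goals, key=goals.get)

# (c) P(X >= 1): add the probabilities of the values 1, 2, and 3.
p_at_least_1 = sum(p for x, p in goals.items() if x >= 1)

print(round(p_fewer_than_3, 2), round(expected, 2), most_likely, round(p_at_least_1, 2))
```

Part (c) could equally be computed as 1 − P(X = 0) = 1 − 0.25.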
Summary

Section 6.1 introduces the idea of random
variables, a crucial concept that we will use
to assess the behavior of variable processes
for the remainder of the text.

Random variables are variables whose value
is determined at least partly by chance.

Discrete random variables take values that
are either finite or countable and may be put
in a list.

Continuous random variables take an infinite
number of possible values, represented by
an interval on the number line.
Summary

Discrete random variables can be described
using a probability distribution, which
specifies the probability of observing each
value of the random variable.

Such a distribution can take the form of a
table, graph or formula.

Probability distributions describe
populations, not samples.

We can find the mean μ, standard deviation
σ, and variance σ² of a discrete random
variable using formulas.

6.2 Binomial Probability
Distribution
Objectives:
By the end of this section, I will be able to…
1) Explain what constitutes a binomial experiment.
2) Compute probabilities using the binomial probability formula.
3) Find probabilities using the binomial tables.
4) Calculate and interpret the mean, variance, and standard deviation of the binomial random variable.
Factorial symbol

For any integer n ≥ 0, the factorial symbol
n! is defined as follows:

0! = 1

1! = 1

n! = n(n - 1)(n - 2) · · · 3 · 2 · 1
Example
Find each of the following
1. 4!
2. 7!
Example
ANSWER
1. 4! = 4 · 3 · 2 · 1 = 24
2. 7! = 7 · 6 · 5 · 4 · 3 · 2 · 1 = 5040
Factorial on Calculator
Calculator: enter 7, press MATH, select PRB, then choose 4:! which is 7!
Enter gives the result 5040
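
Outside the calculator, the same definition can be written as a short loop. This is a sketch of the definition above; the function name `factorial` is our own, and Python's built-in `math.factorial` is used only to confirm the result.

```python
import math

def factorial(n):
    """n! for an integer n >= 0, with 0! = 1 and 1! = 1."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    result = 1
    # Multiply 2 * 3 * ... * n; the loop is empty when n is 0 or 1.
    for k in range(2, n + 1):
        result *= k
    return result

print(factorial(4))                       # 24
print(factorial(7))                       # 5040
print(factorial(7) == math.factorial(7))  # True
```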
Combinations
An arrangement of items in which
• r items are chosen from n distinct items.
• repetition of items is not allowed (each item is distinct).
• the order of the items is not important.
Example of a Combination
The number of different possible 5 card
poker hands. Verify this is a combination
by checking each of the three properties.
Identify r and n.
Example

Five cards are drawn at random from a
deck of 52 cards.
Example
An arrangement of items in which
• 5 cards are chosen from 52 distinct items.
• repetition of cards is not allowed (each card is distinct).
• the order of the cards is not important.
Combination Formula
The number of combinations of r items chosen
from n different items is denoted as nCr and
given by the formula:
nCr = n! / (r! (n − r)!)
Example
Find the value of ₇C₄
Example
ANSWER:
₇C₄ = 7! / (4! (7 − 4)!)
= 7! / (4! · 3!)
= (7 · 6 · 5 · 4 · 3 · 2 · 1) / ((4 · 3 · 2 · 1) · (3 · 2 · 1))
= (7 · 6 · 5) / (3 · 2 · 1)
= 7 · 5
= 35
Combinations on Calculator
Calculator: enter 7, press MATH, select PRB, choose 3:nCr, then enter 4
To get: 7 nCr 4
Then Enter gives 35
Example of a Combination
Determine the number of different
possible 5 card poker hands.
Example
ANSWER:
₅₂C₅ = 2,598,960
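
Both combination results can be confirmed in a few lines. The sketch below applies the nCr formula directly and compares it with Python's built-in `math.comb` (available in Python 3.8 and later); the helper name `n_choose_r` is illustrative.

```python
from math import comb, factorial

def n_choose_r(n, r):
    """Number of combinations: n! / (r! * (n - r)!)."""
    # Integer division is exact here because the quotient is always whole.
    return factorial(n) // (factorial(r) * factorial(n - r))

print(n_choose_r(7, 4), comb(7, 4))  # 35 35
print(n_choose_r(52, 5))             # 2598960 five-card poker hands
```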
Motivational Example
Genetics
•
In mice an allele A for agouti (gray-brown,
grizzled fur) is dominant over the allele a,
which determines a non-agouti color.
Suppose each parent has the genotype Aa
and 4 offspring are produced. What is the
probability that exactly 3 of these have
agouti fur?
Motivational Example
• A single offspring has genotypes:

     A    a
A    AA   Aa
a    aA   aa

Sample Space: {AA, Aa, aA, aa}
Motivational Example
• Agouti genotype is dominant
• Event that offspring is agouti: {AA, Aa, aA}
• Therefore, for any one birth:
P(agouti genotype) = 3/4
P(not agouti genotype) = 1/4
Motivational Example
• Let G represent the event of an agouti offspring and N represent the event of a non-agouti offspring
• Exactly three agouti offspring may occur in four different ways (in order of birth):
NGGG, GNGG, GGNG, GGGN
Motivational Example
• Consecutive events (births of mice) are independent, so by the multiplication rule:
P(N and G and G and G) = P(N) P(G) P(G) P(G) = (1/4)(3/4)(3/4)(3/4) = 27/256
P(G and N and G and G) = P(G) P(N) P(G) P(G) = (3/4)(1/4)(3/4)(3/4) = 27/256
Motivational Example
P(G and G and N and G) = P(G) P(G) P(N) P(G) = (3/4)(3/4)(1/4)(3/4) = 27/256
P(G and G and G and N) = P(G) P(G) P(G) P(N) = (3/4)(3/4)(3/4)(1/4) = 27/256
Motivational Example
• P(exactly 3 offspring have agouti fur)
= P(only first birth N OR only second birth N OR only third birth N OR only fourth birth N)
= (1/4)(3/4)(3/4)(3/4) + (3/4)(1/4)(3/4)(3/4) + (3/4)(3/4)(1/4)(3/4) + (3/4)(3/4)(3/4)(1/4)
= 4 · (3/4)^3 · (1/4)
= 108/256 ≈ 0.422
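
The same result can be obtained by brute force: list every birth order, keep those with exactly three G's, and add their probabilities. This is a sketch using exact fractions; the names `P`, `sequence_probability`, and `three_agouti` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Per-birth probabilities from the genotype table: G = agouti, N = non-agouti.
P = {"G": Fraction(3, 4), "N": Fraction(1, 4)}

def sequence_probability(seq):
    """Multiplication rule for independent births."""
    prob = Fraction(1)
    for birth in seq:
        prob *= P[birth]
    return prob

# All birth orders of four offspring with exactly three agouti.
three_agouti = ["".join(s) for s in product("GN", repeat=4) if s.count("G") == 3]

# The orders are mutually exclusive, so their probabilities add.
total = sum(sequence_probability(s) for s in three_agouti)

print(three_agouti)         # four orders, each with probability 27/256
print(total, float(total))  # 27/64 0.421875
```

Note that `Fraction` reduces 108/256 to 27/64 automatically; the decimal value is the same.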
Binomial Experiment
Agouti fur example may be
considered a binomial
experiment
Binomial Experiment
Four Requirements:
1) Each trial of the experiment has only two possible outcomes (success or failure)
2) Fixed number of trials
3) Experimental outcomes are independent of each other
4) Probability of observing a success remains the same from trial to trial
Binomial Experiment
Agouti fur example may be considered a binomial
experiment
1) Each trial of the experiment has only two possible outcomes (success=agouti fur or failure=non-agouti fur)
2) Fixed number of trials (4 births)
3) Experimental outcomes are independent of each other
4) Probability of observing a success remains the same from trial to trial (¾)
Binomial Probability Distribution
When a binomial experiment is performed,
the set of all possible numbers of
successful outcomes of the experiment,
together with their associated probabilities,
makes a binomial probability
distribution.
Binomial Probability
Distribution Formula
For a binomial experiment, the probability of
observing exactly X successes in n trials,
where the probability of success for any one
trial is p, is
P(X) = nCX · p^X · (1 − p)^(n − X)
where
nCX = n! / (X! (n − X)!)
Binomial Probability
Distribution Formula
Let q = 1 − p
P(X) = nCX · p^X · q^(n − X)
Rationale for the Binomial
Probability Formula
P(x) = [n! / ((n − x)! x!)] · p^x · q^(n − x)

• n! / ((n − x)! x!) is the number of outcomes with exactly x successes among n trials
• p^x · q^(n − x) is the probability of x successes among n trials for any one particular order
Agouti Fur Genotype Example
X = number of offspring with agouti fur
P(X = 3) = [4! / ((4 − 3)! 3!)] · (3/4)^3 · (1/4)^(4 − 3)
= 4 · (3/4)^3 · (1/4)^1
= 4 · (27/64) · (1/4)
= 108/256 ≈ 0.422
Binomial Probability Distribution
Formula: Calculator
2ND VARS
A:binompdf(4, .75, 3), where the arguments are n, p, x
Enter gives the result 0.421875
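
For readers without a TI calculator, the binomial probability formula is easy to code directly. This is a minimal sketch; `binom_pmf` is our own function name, chosen to mirror the calculator's `binompdf(n, p, x)` argument order, and is not a library function.

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for a binomial experiment: nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Agouti example: n = 4 births, p = 3/4, exactly x = 3 agouti offspring.
print(binom_pmf(4, 0.75, 3))  # 0.421875

# The probabilities over x = 0, 1, ..., n should sum to 1.
print(sum(binom_pmf(4, 0.75, x) for x in range(5)))
```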
Binomial Distribution Tables



n is the number of trials
X is the number of successes
p is the probability of observing a success
See Example 6.16 on page 278 for more information
FIGURE 6.7 Excerpt from the binomial tables.
Example
Page 284
Example
ANSWER
X = number of heads
P(X = 5) = ₂₀C₅ · (0.5)^5 · (0.5)^(20 − 5)
= 15,504 · (0.5)^5 · (0.5)^15
= 0.0148
Binomial Mean, Variance, and
Standard Deviation

Mean (or expected value): μ = n · p

Variance: σ² = np(1 − p)
Use q = 1 − p, then σ² = npq

Standard deviation: σ = √(np(1 − p)) = √(npq)
Example
20 coin tosses
The expected number of heads:
μ = np = (20)(0.50) = 10
Variance and standard deviation:
σ² = npq = (20)(0.50)(0.50) = 5.0
σ = √5 ≈ 2.24
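
The shortcut formulas can be compared against the general mean formula for a discrete random variable. The sketch below computes np, npq, and √(npq) for the 20 coin tosses, then recomputes the mean as Σ[x · P(x)] from the binomial probabilities; the function name `binomial_summary` is illustrative.

```python
import math
from math import comb

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a binomial variable."""
    q = 1 - p
    mean = n * p
    variance = n * p * q
    return mean, variance, math.sqrt(variance)

mean, variance, sd = binomial_summary(20, 0.50)
print(mean, variance, round(sd, 2))  # 10.0 5.0 2.24

# Cross-check: the general formula sum of x * P(x) gives the same mean.
mean_from_pmf = sum(x * comb(20, x) * 0.5 ** 20 for x in range(21))
print(mean_from_pmf)
```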
Example
Page 284
Is this a Binomial Distribution?
Four Requirements:
1) Each trial of the experiment has only two possible outcomes (makes the basket or does not make the basket)
2) Fixed number of trials (50)
3) Experimental outcomes are independent of each other
4) Probability of observing a success remains the same from trial to trial (assumed to be 58.4% = 0.584)
Example
ANSWER
(a) X = number of baskets
P(X = 25) = ₅₀C₂₅ · (0.584)^25 · (0.416)^25
= 0.0549
Example
ANSWER
(b) The expected value of X
μ = np = (50)(0.584) = 29.2
The most likely number of baskets is 29
Example
ANSWER
(c) In a random sample of 50 of
O’Neal’s shots he is expected
to make 29.2 of them.
Example
Page 285
Is this a Binomial Distribution?
Four Requirements:
1) Each trial of the experiment has only two possible outcomes (contracted AIDS through injected drug use or did not)
2) Fixed number of trials (120)
3) Experimental outcomes are independent of each other
4) Probability of observing a success remains the same from trial to trial (assumed to be 11% = 0.11)
Example
ANSWER
(a) X = number of white males who contracted AIDS through injected drug use
P(X = 10) = ₁₂₀C₁₀ · (0.11)^10 · (0.89)^110
= 0.0816
Example
ANSWER
(b) At most 3 men is the same as less than or equal to 3 men:
P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
Why do probabilities add?
Example
Use TI-83+ calculator 2ND VARS to get A:binompdf(n, p, X)
P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
= binompdf(120, .11, 0) + binompdf(120, .11, 1) + binompdf(120, .11, 2) + binompdf(120, .11, 3)
= 5.535172385 E−4 ≈ 0.000554
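
The "at most 3" probability is a sum of individual binomial probabilities, which can be sketched as a small cumulative function. The names `binom_pmf` and `binom_cdf` are our own and are defined here so the example is self-contained.

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(n, p, x):
    """P(X <= x): add P(X = 0), P(X = 1), ..., P(X = x)."""
    return sum(binom_pmf(n, p, k) for k in range(x + 1))

# n = 120 men, p = 0.11, at most 3 successes.
p_at_most_3 = binom_cdf(120, 0.11, 3)
print(round(p_at_most_3, 6))
```

The probabilities add because the events X = 0, X = 1, X = 2, and X = 3 are mutually exclusive.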
Example
ANSWER
(c) Most likely number of white males is the expected value of X
μ = np = (120)(0.11) = 13.2
Example
ANSWER
(d) In a random sample of 120
white males with AIDS, it is
expected that approximately 13
of them will have contracted
AIDS by injected drug use
Example
Page 286
Example
ANSWER
(a)
σ² = npq = (120)(0.11)(0.89) = 11.748
σ = √11.748 ≈ 3.43
RECALL: Outliers and z Scores
Data values are not unusual if
−2 ≤ z-score ≤ 2
Otherwise, they are moderately
unusual or an outlier (see page 131)
Z score Formulas
Sample
z = (x − x̄) / s
Population
z = (x − μ) / σ
Example
Z-score for 20 white males who
contracted AIDS through injected drug
use:
z = (20 − 13.2) / 3.43 = 1.98
It would not be unusual to find 20 white
males who contracted AIDS through injected
drug use in a random sample of 120 white
males with AIDS.
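
The z-score check can be scripted end to end from n and p. This is a sketch assuming the "not unusual if −2 ≤ z ≤ 2" rule recalled above; the variable names are illustrative.

```python
import math

n, p = 120, 0.11
mu = n * p                           # expected value, 13.2
sigma = math.sqrt(n * p * (1 - p))   # standard deviation, about 3.43

x = 20                               # observed number of men
z = (x - mu) / sigma                 # population z-score formula

# Rule from the slides: not unusual when -2 <= z <= 2.
is_unusual = abs(z) > 2

print(round(mu, 1), round(sigma, 2), round(z, 2), is_unusual)
```

Because the code does not round σ before dividing, its unrounded z differs slightly from the hand calculation, but both round to 1.98.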
Summary

The most important discrete distribution is
the binomial distribution, where there are
n independent trials, each with two possible
outcomes and the same probability of success p.

The probability of observing a particular
number of successes can be calculated using
the binomial probability distribution formula.
Summary

Binomial probabilities can also be found
using the binomial tables or using
technology.

There are formulas for finding the mean,
variance, and standard deviation of a
binomial random variable.
6.3 Continuous Random
Variables and the Normal
Probability Distribution
Objectives:
By the end of this section, I will be
able to…
1) Identify a continuous probability distribution.
2) Explain the properties of the normal probability distribution.
FIGURE 6.15 - Histograms
(a) Relatively small sample (n = 100) with large class widths (0.5 lb).
(b) Large sample (n = 200) with smaller class widths (0.2 lb).
Figure 6.15 continued
(c) Very large sample (n = 400) with very small class widths (0.1 lb).
(d) Eventually, theoretical histogram of entire population becomes smooth curve with class widths arbitrarily small.
Continuous Probability Distributions

A graph that indicates on the horizontal axis
the range of values that the continuous
random variable X can take

Density curve is drawn above the horizontal
axis

Must follow the Requirements for the
Probability Distribution of a Continuous
Random Variable
Requirements for Probability
Distribution of a Continuous
Random Variable
1) The total area under the density curve must equal 1 (this is the Law of Total Probability for Continuous Random Variables).
2) The vertical height of the density curve can never be negative. That is, the density curve never goes below the horizontal axis.
Probability for a Continuous
Random Variable

Probability for Continuous Distributions
is represented by area under the curve
above an interval.
The Normal Probability Distribution

Most important probability distribution in the
world

A population is said to be normally distributed
when its data values follow a normal probability
distribution

population mean is μ

population standard deviation is σ

μ and σ are parameters of the normal
distribution
FIGURE 6.19

The normal distribution is symmetric about
its mean μ (bell-shaped).
Properties of the Normal
Density Curve (Normal Curve)
1) It is symmetric about the mean μ.
2) The highest point occurs at X = μ, because symmetry implies that the mean equals the median, which equals the mode of the distribution.
3) It has inflection points at μ−σ and μ+σ.
4) The total area under the curve equals 1.
Properties of the Normal
Density Curve (Normal Curve)
continued
5) Symmetry also implies that the area under the curve to the left of μ and the area under the curve to the right of μ are both equal to 0.5 (Figure 6.19).
6) The normal distribution is defined for values of X extending indefinitely in both the positive and negative directions. As X moves farther from the mean, the density curve approaches but never quite touches the horizontal axis.
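
Several of these properties can be checked numerically. The sketch below uses the standard formula for the normal density, which the slides do not state, and a simple trapezoid rule for area; the values μ = 100 and σ = 15 are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Standard normal density formula (not given in the slides)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area(a, b, mu, sigma, steps=20_000):
    """Trapezoid-rule approximation of the area under the curve on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

mu, sigma = 100.0, 15.0  # hypothetical parameters

# Property 1: symmetry about the mean.
print(math.isclose(normal_pdf(mu - 7, mu, sigma), normal_pdf(mu + 7, mu, sigma)))

# Property 2: the highest point is at X = mu.
print(normal_pdf(mu, mu, sigma) > normal_pdf(mu + 1, mu, sigma))

# Properties 4 and 5: total area is 1, and half of it lies left of mu.
# Ten standard deviations each way stands in for "indefinitely far".
total_area = area(mu - 10 * sigma, mu + 10 * sigma, mu, sigma)
left_area = area(mu - 10 * sigma, mu, mu, sigma)
print(round(total_area, 6), round(left_area, 6))
```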
The Empirical Rule
For data sets having a distribution that is
approximately bell shaped, the following
properties apply:
• About 68% of all values fall within 1 standard deviation of the mean.
• About 95% of all values fall within 2 standard deviations of the mean.
• About 99.7% of all values fall within 3 standard deviations of the mean.
FIGURE 6.23 The Empirical Rule.
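
For an exactly normal distribution, the Empirical Rule percentages can be computed rather than memorized. The sketch below relies on a standard identity not stated in the slides: the probability of falling within k standard deviations of the mean is erf(k / √2), where `math.erf` is Python's built-in error function. The function name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normal random variable X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The results round to about 68%, 95%, and 99.7%, which is where the rule's "about" comes from.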
Drawing a Graph to Solve
Normal Probability Problems
1) Draw a “generic” bell-shaped curve, with a horizontal number line under it that is labeled as the random variable X. Insert the mean μ in the center of the number line.
Steps in Drawing a Graph to
Help You Solve Normal
Probability Problems
2) Mark on the number line the value of X indicated in the problem. Shade in the desired area under the normal curve. This part will depend on what values of X the problem is asking about.
3) Proceed to find the desired area or probability using the empirical rule.
Example
Page 295
Example
ANSWER
Example
Page 296
Example
ANSWER
Example
Page 296
Example
ANSWER
Summary

Continuous random variables assume
infinitely many possible values, with no gap
between the values.

Probability for continuous random variables
consists of area above an interval on the
number line and under the distribution
curve.
Summary

The normal distribution is the most
important continuous probability
distribution.

It is symmetric about its mean μ and has
standard deviation σ.

One should always sketch a picture of a
normal probability problem to help solve it.