Discovering Statistics
2nd Edition Daniel T. Larose
Chapter 6:
Probability Distributions
Lecture Powerpoint
+ Chapter 6 Overview

6.1 Discrete Random Variables

6.2 Binomial Probability Distribution

6.3 Poisson Probability Distribution

6.4 Continuous Random Variables and
the Normal Probability Distribution

6.5 Standard Normal Distribution

6.6 Applications of the Normal
Distribution

6.7 Normal Approximation to the
Binomial Probability Distribution
+ The Big Picture
Where we are coming from and where we are headed…
 In Chapter 5, we learned about probability, which allows us to
quantify the uncertainty involved in performing statistical inference.
However, we need a new set of tools in our probability toolbox:
random variables and probability distributions. Here in Chapter 6, we
learn these new tools, including the binomial distribution and the
normal distribution.

Chapter 7, “Sampling Distributions,” is a pivotal chapter where we
learn that statistics have predictable behavior, which allows us to
perform the statistical inference we learn in the remainder of the
book.

+ 6.1: Discrete Random Variables
Objectives:

Identify random variables.
Explain what a discrete probability distribution is and
construct probability distribution tables and graphs.

Calculate the mean, variance, and standard deviation of
a discrete random variable.

Random Variables
In this chapter, we develop an approach that analyzes probability
problems more efficiently than we did in Chapter 5. Recall from
Chapter 1 that a variable is a characteristic that can assume
different values.
A random variable is a variable that takes on quantitative values
representing the results of a probability experiment, and thus its
values are determined by chance. We denote random variables
using capital letters such as X, Y, or Z.
Example
Let X = outcome of a single roll of a die
X could be 1, 2, 3, 4, 5, 6
P(X = 5) = 1/6
Discrete and Continuous
Random Variables
There are two main types of random variables. The difference
between the two types relates to the possible values that each type
of random variable can assume.
Discrete and Continuous Random Variables
• A discrete random variable can take either a finite or a countable
number of values. The values can be graphed as separate points on
a number line.
• A continuous random variable can take on infinitely many values.
The values form an interval on the number line.
Discrete Probability Distributions
For every random variable, there is a probability distribution that
allows us to view all possible values of the random variable at a
glance.
A probability distribution of a discrete random variable provides all the
possible values that the random variable can assume, together with the
probability associated with each value. The probability distribution can take
the form of a table, graph, or formula. Probability distributions describe
populations, not samples.
When constructing a probability distribution
of a discrete random variable,
• Note each possible value of X
• Note the probability associated with each value of X
Mean of a Discrete Random
Variable
We can calculate the mean and standard deviation of a discrete
random variable X, just as we can calculate the mean and standard
deviation of quantitative data.
Finding the Mean of a Discrete Random Variable X
The mean µ of a discrete random variable X is found as follows:
1. Multiply each possible value of X by its probability
2. Add the resulting products
µ = Σ [X · P(X)]

Example: X = age
X     P(X)
15    0.07
16    0.17
17    0.29
18    0.47

µ = 15(0.07) + 16(0.17) + 17(0.29) + 18(0.47) = 17.16
The mean µ of a discrete random variable is also
called the expected value or the expectation of
the random variable X, denoted as E(X).
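
The two-step recipe above (multiply each value by its probability, then add) can be sketched in a few lines of Python. This is only an illustrative sketch; the names `distribution` and `expected_value` are made up for this example and do not come from the textbook.

```python
# Probability distribution from the age example: value of X -> P(X)
distribution = {15: 0.07, 16: 0.17, 17: 0.29, 18: 0.47}


def expected_value(dist):
    """Return E(X): the sum of each value of X times its probability."""
    return sum(x * p for x, p in dist.items())


# The probabilities of a valid discrete distribution add up to 1.
total_probability = sum(distribution.values())
mean_age = expected_value(distribution)

print("Total probability:", round(total_probability, 10))
print("Mean (expected value):", round(mean_age, 2))
```

Checking that the probabilities sum to 1 before computing the mean is a cheap way to catch a mistyped table.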

Variability of a Discrete Random
Variable
Since a discrete random variable takes on quantitative values, we use
the variance or standard deviation of a random variable X to help
us determine whether a particular value is unusual.
Formulas for the Variance and Standard Deviation of a
Discrete Random Variable
Definition Formulas
σ² = Σ (X − µ)² · P(X)
σ = √[ Σ (X − µ)² · P(X) ]

Computational Formulas
σ² = [ Σ X² · P(X) ] − µ²
σ = √( [ Σ X² · P(X) ] − µ² )

Example: X = age
X     P(X)
15    0.07
16    0.17
17    0.29
18    0.47

µ = 17.16
σ² = (15 − 17.16)²(0.07) + ... + (18 − 17.16)²(0.47) = 0.8944
σ = √0.8944 ≈ 0.9457
+ 6.2: Binomial Probability
Distribution
Objectives:

Explain what constitutes a binomial experiment.
Compute probabilities using the binomial probability
formula.


Find probabilities using the binomial tables.
Calculate the mean, variance, and standard deviation of
the binomial random variable and find the mode of the
distribution.

Binomial Experiments
Many situations involve only two possible outcomes to a process.
Methods have been developed to make it more convenient to
analyze them. These methods begin with the definition of a
binomial experiment.
Binomial Experiment
A probability experiment that satisfies the following four
requirements is said to be a binomial experiment:
1. Each trial of the experiment has only two possible outcomes.
One outcome is denoted a success and the other a failure.
2. There is a fixed number of trials, known in advance.
3. The experimental outcomes are independent of each other.
4. The probability of observing a success is the same from trial to
trial.
The outcomes of a binomial experiment, together with their
probabilities, generate a special discrete probability distribution
called the binomial probability distribution.
Notation for Binomial Experiments and Binomial Distribution
Symbol                        Meaning
S                             The outcome denoted as a success
F                             The outcome denoted as a failure
P(Success) = P(S) = p         The probability of observing a success
P(Failure) = P(F) = 1 − p     The probability of observing a failure
n                             The number of trials
Binomial Probability Distribution
We can use the binomial probability distribution formula to find
probabilities for the number of successes for any binomial
experiment.
Binomial Probability Distribution Formula
The probability of observing exactly X successes in n trials of a
binomial experiment is
P(X) = (nCX) · p^X · (1 − p)^(n − X)
We often call this the binomial probability formula.
In other words, the binomial probability formula is
P(X) = (nCX) · P(Success)^(number of successes) · P(Failure)^(number of failures)
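
A minimal sketch of the binomial probability formula in Python follows. It uses `math.comb` for nCX; the function name `binomial_pmf` and the coin-toss numbers are illustrative assumptions, not part of the slides.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    # comb(n, x) is nCx, the number of ways to place x successes in n trials.
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Hypothetical example: probability of exactly 3 heads in 5 fair coin tosses.
p_three_heads = binomial_pmf(3, 5, 0.5)

# The probabilities over all possible values 0..n should add up to 1.
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))

print("P(X = 3):", p_three_heads)
print("Sum over all X:", total)
```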
Binomial Mean, Variance,
Standard Deviation, and Mode
Since the binomial random variable X is discrete, it also has a mean,
variance, and standard deviation.
Mean, Variance, and Standard Deviation of a Binomial Random
Variable X
• Mean (or expected value): µ = n · p
• Variance: σ² = n · p · (1 − p)
• Standard deviation: σ = √(n · p · (1 − p))
The mode of a binomial distribution is the most likely outcome of the
binomial experiment for the given values of n and p, that is, the value
of X with the largest probability.
Binomial Example
Suppose we know the population proportion p of left-handed
students is 0.10, and we have a random sample of 100 students.
Calculate the expected number of left-handed students.
µ = np = 100(0.10) = 10
Calculate the variance and standard deviation of the number of left-handed students.
σ² = np(1 − p) = 100(0.10)(0.90) = 9
σ = √9 = 3
Would 22 left-handed students out of 100 be considered an outlier?
P(X = 22) = (100C22)(0.1)^22 (0.9)^78 ≈ 0.0002
This is highly unlikely: 22 lies (22 − 10)/3 = 4 standard deviations
above the mean.
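
The left-handed example can be reproduced with a short script. This is a sketch under the stated values n = 100 and p = 0.10; the variable names, and the brute-force search for the mode, are assumptions made for illustration.

```python
from math import comb, sqrt

n, p = 100, 0.10  # sample size and proportion of left-handed students


def binomial_pmf(x):
    """P(X = x) for the binomial distribution with the n and p above."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


mu = n * p                     # expected number of left-handed students
variance = n * p * (1 - p)
sigma = sqrt(variance)

p_22 = binomial_pmf(22)        # probability of exactly 22 left-handed students
z_22 = (22 - mu) / sigma       # distance of 22 from the mean, in standard deviations

# The mode is the value of X with the largest probability.
mode = max(range(n + 1), key=binomial_pmf)

print("mean:", mu, "variance:", variance, "sd:", sigma)
print("P(X = 22):", p_22)
print("standard deviations above the mean:", z_22)
print("mode:", mode)
```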
+ 6.3: Poisson Probability
Distribution
Objectives:
Explain the requirements for the Poisson probability
distribution.


Compute probabilities for a Poisson random variable.
Calculate the mean, variance, and standard deviation of
a Poisson random variable.

Use the Poisson distribution to approximate the binomial
distribution.

Requirements for the Poisson
Distribution
The Poisson distribution, like the binomial distribution, is a discrete
probability distribution. The Poisson probability distribution is used
when we wish to find the probability of observing a certain number of
occurrences of an event within a fixed interval of space or time.
Poisson Probability Distribution
The random variable X represents the number of occurrences of an
event in an interval.
1. The occurrences must be random.
2. Each occurrence must be independent.
3. The occurrences must be uniformly distributed over the given
interval.
Computing Probabilities for a
Poisson Random Variable
Poisson Probability Distribution Formula
If the requirements are met, the probability that a particular event
occurs X times within a given interval is
P(X) = (µ^X · e^(−µ)) / X!
where µ is the mean of the Poisson probability distribution, e is a
constant approximately equal to 2.718281828, and X is the number
of occurrences of the event within the interval.

Example
A study of the number of cardiac arrests to
occur per week in a hospital found a Poisson
distribution with mean µ=1.09. Find the
probability of 2 cardiac arrests in a given week.
P(2) = (1.09² · e^(−1.09)) / 2! ≈ 0.1997
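
The cardiac arrest calculation can be checked with a small Python sketch of the Poisson formula. The function name `poisson_pmf` is an assumption made for this example.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)


mu = 1.09  # mean number of cardiac arrests per week, from the example
p_two = poisson_pmf(2, mu)

# Probabilities for 0, 1, 2, ... occurrences; the terms shrink quickly,
# so the first 30 already add up to essentially 1.
total = sum(poisson_pmf(x, mu) for x in range(30))

print("P(2):", round(p_two, 4))
print("Sum of P(0)..P(29):", round(total, 6))
```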
Mean, Variance, and Standard Deviation
for a Poisson Random Variable
Parameters of the Poisson Distribution
Mean = µ
Variance = σ² = µ
Standard Deviation = σ = √µ

Example
A study of the number of cardiac arrests to
occur per week in a hospital found a Poisson
distribution with mean µ=1.09. Find the mean,
variance, and standard deviation of X=the
number of cardiac arrests in a given week.
Mean = µ = 1.09
Variance = σ² = µ = 1.09
Standard Deviation = σ = √1.09 ≈ 1.044
Using the Poisson Distribution to
Approximate the Binomial Distribution
We can use the Poisson distribution to approximate the binomial
distribution when the number of trials n is large and the probability of
success p is small, as measured by the following requirements.
Requirements for Using the Poisson Distribution to
Approximate the Binomial Distribution
n ≥ 100
and
np ≤ 10
where n is the number of trials and p is the probability of success
for the binomial distribution.
If the requirements are met, then the mean of the Poisson
distribution used to approximate the binomial distribution is given as
µ = np
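
To see how close the approximation can be, the sketch below compares binomial and Poisson probabilities side by side. The values n = 200 and p = 0.02 are hypothetical, chosen only because they satisfy n ≥ 100 and np ≤ 10; the function names are likewise assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """Poisson probability of x occurrences with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)


def poisson_approximation_ok(n, p):
    """Requirements from the slide: n >= 100 and np <= 10."""
    return n >= 100 and n * p <= 10


n, p = 200, 0.02          # hypothetical binomial experiment
mu = n * p                # mean of the approximating Poisson distribution

print("Requirements met:", poisson_approximation_ok(n, p))
for x in range(9):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, mu), 4))

# Largest gap between the two probabilities over X = 0..20
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(21))
print("Largest difference:", round(max_gap, 4))
```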
+ 6.4: Continuous Random
Variables and the Normal
Probability Distribution
Objectives:
Identify a continuous probability distribution and state the
requirements.

Calculate probabilities for the uniform probability
distributions.

Explain the properties of the normal probability
distribution.

Continuous Probability
Distributions
Continuous random variables assume infinitely many possible
values, with no gap between the values. For a given continuous
random variable X, we are not interested in whether X equals any
particular value. Rather we are interested in whether X is greater or
less than a particular value or between two values.
Continuous Probability Distribution
A continuous probability distribution is a graph that indicates the
range of values that the continuous random variable X can take and,
above which, is drawn a density curve.
1. The total area under the density curve must equal 1 (this is the
Law of Total Probability for Continuous Random Variables).
2. The vertical height of the density curve can never be negative.
That is, the density curve never goes below the horizontal axis.
Calculating Probabilities for the
Uniform Probability Distribution
The uniform probability distribution is a continuous distribution
whose density is constant from left endpoint a to right endpoint b.
Its density curve is a flat, horizontal line, so that the shape of the
distribution is a rectangle.
Probability for a Continuous Distribution
The probability that a continuous random variable X takes a value in
an interval is equal to the area under the density curve above that
interval.
Example: X = waiting time for the campus shuttle bus
Find the probability you will wait between
2 and 4 minutes for the bus.
P(2 ≤ X ≤ 4) = width × height = (4 − 2)(0.1) = 0.2
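
For a uniform distribution the area is just a rectangle, which the sketch below computes. The height of 0.1 used on the slide implies an interval of width 10; the endpoints a = 0 and b = 10 are an assumption here, since the original figure is not available, and `uniform_probability` is a made-up helper name.

```python
def uniform_probability(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: rectangle width times height."""
    height = 1 / (b - a)                 # constant density so total area is 1
    left, right = max(lo, a), min(hi, b)  # clip the interval to [a, b]
    width = max(0.0, right - left)
    return width * height


a, b = 0, 10  # assumed endpoints: waiting time between 0 and 10 minutes
p_wait = uniform_probability(2, 4, a, b)

print("P(2 <= X <= 4):", p_wait)
print("Total area:", uniform_probability(a, b, a, b))
```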
Normal Probability Distribution
We now turn to what is considered to
be the most important probability
distribution in statistics:
the normal probability distribution.
Properties of the Normal Probability Distribution
1. It is symmetric about the mean µ.
2. The highest point occurs at X = µ.
3. The total area under the curve = 1.
4. The areas under the curve to the left of µ and to the right of µ are
both equal to 0.5.
5. The normal distribution is defined for values of X extending
indefinitely in both the positive and negative directions.
6. Values of X are always found on the horizontal axis. Probabilities
are represented by areas under the curve.
The Empirical Rule
+ 6.5: Standard Normal
Distribution
Objectives:
Find areas under the standard normal curve, given a Z-value.


Find the standard normal Z-value, given an area.
Standard Normal Distribution
There is one very special normal distribution called the standard
normal distribution. The mean and standard deviation of the
standard normal distribution make it unique.
The standard normal distribution is a
normal distribution with
• mean µ = 0 and
• standard deviation σ = 1.
Finding Areas Under the
Standard Normal Curve
Finding Areas Under the
Standard Normal Curve: Case 1
Find the area to the left of Z = 0.57.
1. Draw the standard normal curve and label Z = 0.57.
2. Shade to the left of 0.57.
3. Look at the intersection of row 0.5 and
column 0.07. This is the area to the left of
Z = 0.57.
Area = 0.7157.
Finding Areas Under the
Standard Normal Curve: Case 2
Find the area to the right of Z = -1.25.
1. Draw the standard normal curve and label Z = -1.25.
2. Shade to the right of -1.25.
3. Look at the intersection of row -1.2 and
column 0.05. This is the area to the left of
Z = -1.25. The area to the right is then
Area = 1 – 0.1056 = 0.8944.
Finding Areas Under the
Standard Normal Curve: Case 3
Find the area between Z = -1 and Z = 1.
1. Draw the standard normal curve and label Z = -1 and Z = 1.
2. Shade the area between -1 and 1.
3. Find the area to the left of Z = -1 and the area to the left of Z = 1.
Subtract the smaller area from the larger area to find the area in between.
Area = 0.8413 – 0.1587 = 0.6826.
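
The three cases can also be worked with technology instead of the Z table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative. Software carries more decimal places than the table, so the Case 3 result comes out near 0.6827 rather than the table's 0.6826.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Case 1: area to the left of Z = 0.57
area_left = Z.cdf(0.57)

# Case 2: area to the right of Z = -1.25 is 1 minus the area to its left
area_right = 1 - Z.cdf(-1.25)

# Case 3: area between Z = -1 and Z = 1 is the larger left area minus the smaller
area_between = Z.cdf(1) - Z.cdf(-1)

print("Case 1:", round(area_left, 4))
print("Case 2:", round(area_right, 4))
print("Case 3:", round(area_between, 4))
```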
Finding Z-Values for a Given Area
Find the Z-value with area 0.90 to its left.
1. Draw the standard normal curve and label Z1
2. Shade the area to the left of Z1 and label
with the given area of 0.90.
3. Find the value closest to 0.90 in the
body of the Z table. This should be
0.8997. Move to the left to find the
value 1.2, then move up from 0.8997
to find the value 0.08. Putting these
values together, we get
Z1 = 1.2 + 0.08 = 1.28
Finding Z-Values for a Given Area
Find the Z-value with area 0.03 to its right.
1. Draw the standard normal curve and label Z1
2. Shade the area to the right of Z1 and label
with the given area of 0.03. Label the area to
the left as 0.97
3. Find the value closest to 0.97 in the
body of the Z table. This should be
0.9699. Move to the left to find the
value 1.8, then move up from 0.9699
to find the value 0.08. Putting these
values together, we get
Z1 = 1.8 + 0.08 = 1.88
Finding Z-Values for a Given Area
Find the Z-values that mark the boundaries of the middle 95% of the
area under the standard normal curve.
1. Draw the standard normal curve and label Z1 and Z2
2. Shade the area in between and label the
area as 0.95. By symmetry, there is area
0.05 ÷ 2 = 0.025 in each tail.
3. Find the value closest to 0.025 in
the body of the Z table. This
corresponds to a Z-value of -1.96. By
symmetry, the other Z-value is 1.96.
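
Working backward from an area to a Z-value can be sketched with the inverse cumulative distribution function, `NormalDist.inv_cdf`, from Python's standard `statistics` module. The variable names are assumptions for this example, and the results match the table values once rounded to two decimal places.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Z-value with area 0.90 to its left
z_left_90 = Z.inv_cdf(0.90)

# Z-value with area 0.03 to its right, that is, area 0.97 to its left
z_right_03 = Z.inv_cdf(1 - 0.03)

# Boundaries of the middle 95%: area 0.025 in each tail
z_low = Z.inv_cdf(0.025)
z_high = Z.inv_cdf(0.975)

print(round(z_left_90, 2))
print(round(z_right_03, 2))
print(round(z_low, 2), round(z_high, 2))
```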
+ 6.6: Applications of the Normal
Distribution
Objectives:
Compute probabilities for a given value of any normal
random variable.

Find the appropriate value of any normal random
variable, given an area or probability.

Standardizing a Normal Random
Variable
To standardize a normal random variable X, we transform that normal
random variable X into the standard normal random variable Z.
Standardizing a Normal Random Variable
Any normal random variable X can be transformed into the standard
normal random variable Z by standardizing X using the formula
Z = (X − µ) / σ
Probabilities for Any Normal
Distribution
Finding Probabilities for Any Normal Distribution
1. Determine the random variable X, the mean µ, and the standard
deviation σ. Draw the normal curve for X and shade the desired area.
2. Standardize by using the formula Z = (X − µ)/σ to find the values of Z.
3. Draw the standard normal curve and shade the area corresponding to
the shaded area in the graph of X.
4. Find the area under the standard normal curve using either the Z table
or technology. This area is equal to the area under the normal curve for X
drawn in step 1.
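
The four steps can be sketched in Python as follows. The numbers (µ = 70, σ = 10, and the interval from 60 to 85) are hypothetical values chosen for illustration, and `normal_probability` is a made-up helper name.

```python
from statistics import NormalDist


def normal_probability(lo, hi, mu, sigma):
    """P(lo <= X <= hi) for a normal X, found by standardizing to Z."""
    z_lo = (lo - mu) / sigma          # step 2: standardize each endpoint
    z_hi = (hi - mu) / sigma
    standard_normal = NormalDist()    # step 4: area under the standard normal curve
    return standard_normal.cdf(z_hi) - standard_normal.cdf(z_lo)


# Hypothetical example: X is normal with mean 70 and standard deviation 10
p = normal_probability(60, 85, mu=70, sigma=10)

# The same area computed directly on the distribution of X, without standardizing
X = NormalDist(mu=70, sigma=10)
direct = X.cdf(85) - X.cdf(60)

print("Via Z:", round(p, 4))
print("Directly:", round(direct, 4))
```

The two results agree, which is the point of step 4: the area under the standard normal curve equals the area under the original normal curve for X.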
Normal Data Values for a Given
Area or Probability
Finding Normal Data Values for a Given Area or Probability
1. Determine X, µ, and σ, and draw the normal curve for X. Shade the
desired area. Mark the position of the unknown value X1.
2. Find the Z-value corresponding to the desired area.
3. Transform this value of Z into a value of X using the formula
X1 = Zσ + µ
Example
Edmunds.com reported that the average amount that
people were paying for a 2007 Toyota Camry XLE was
$23,400. Let X = price, and assume that price follows a
normal distribution with μ = $23,400 and σ = $1000.
Find the prices that separate the middle 95% of 2007
Toyota Camry XLE prices from the bottom 2.5% and
the top 2.5%.
Example
Find the Z-values corresponding to the desired area.
The area to the left of X1 equals 0.025, and the area to
the left of X2 equals 0.975.
Looking up area 0.025 on the inside of the Z table gives
us Z1 = –1.96. By symmetry, Z2 = 1.96.
X1 = Z1σ + μ =(–1.96)(1000) + 23,400 = 21,440
X2 = Z2σ + μ =(1.96)(1000) + 23,400 = 25,360
The prices that separate the middle
95% of 2007 Toyota Camry XLE
prices from the bottom 2.5% of
prices and the top 2.5% of prices are
$21,440 and $25,360.
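
The Camry example can be checked with a few lines of Python. This sketch uses `NormalDist.inv_cdf` in place of the Z table, so the Z-values carry more decimals than ±1.96 and the prices differ from the slide by a few cents; the variable names are assumptions.

```python
from statistics import NormalDist

mu, sigma = 23400, 1000  # mean and standard deviation of price, in dollars

# Step 2: Z-values with area 0.025 and 0.975 to their left
z1 = NormalDist().inv_cdf(0.025)
z2 = NormalDist().inv_cdf(0.975)

# Step 3: transform each Z-value back to a price with X = Z * sigma + mu
x1 = z1 * sigma + mu
x2 = z2 * sigma + mu

print("Z-values:", round(z1, 2), round(z2, 2))
print("Prices:", round(x1), round(x2))
```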
+ 6.7: Normal Approximation to the
Binomial Probability Distribution
Objectives:
Use the normal distribution to approximate probabilities of
the binomial distribution.

Normal Approximation to the
Binomial Distribution
Recall the binomial random variable X represents the number of
successes in n trials and depends on the sample size n and
probability of success p. For a given probability of success p, if the
sample size n gets large enough, the binomial distribution begins to
resemble the normal distribution.
[Figure: binomial distributions for n = 4, p = 0.2 and for n = 64, p = 0.2]
The Normal Approximation to the Binomial Probability Distribution
Consider the binomial random variable X with probability of success p
and number of trials n. If np ≥ 5 and n(1 − p) ≥ 5, the binomial distribution
may be approximated by a normal distribution with mean np and
standard deviation √(np(1 − p)).
Example 6.44
The binomial distribution with n = 64
and p = 0.2 can be approximated by
a normal distribution since np =
12.8 ≥ 5 and n(1 − p) = 51.2 ≥ 5.
The mean of the normal distribution
is np = 12.8 and the standard
deviation is √(np(1 − p)) = √10.24 = 3.2.
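
A short sketch can check the two conditions and compare an exact binomial probability with its normal approximation for n = 64 and p = 0.2. The choice of P(X ≤ 12) and the continuity correction (evaluating the normal curve at 12.5) are assumptions added for illustration and do not appear on the slide; the function name is also made up.

```python
from math import comb, sqrt
from statistics import NormalDist


def normal_approximation_ok(n, p):
    """Conditions from the slide: np >= 5 and n(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


n, p = 64, 0.2
mu = n * p                       # mean of the approximating normal
sigma = sqrt(n * p * (1 - p))    # standard deviation of the approximating normal

# Exact binomial probability P(X <= 12)
exact = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(13))

# Normal approximation, evaluated at 12.5 (continuity correction, an assumption)
approx = NormalDist(mu, sigma).cdf(12.5)

print("Conditions met:", normal_approximation_ok(n, p))
print("mean:", round(mu, 1), "sd:", round(sigma, 1))
print("Exact:", round(exact, 4), "Approximate:", round(approx, 4))
```

With n = 4 and p = 0.2 the same check fails, since np = 0.8 is below 5, which matches the skewed shape of that distribution.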