Chapter 5 ~ Probability Distributions
(Discrete Variables)
[Figure: "Express Checkout" histogram of P(x) against x, the number of items purchased (x = 0 to 8)]
Chapter Goals
• Combine the ideas of frequency distributions and
probability to form probability distributions
• Investigate discrete probability distributions and
study measures of central tendency and dispersion
• Study the binomial random variable
5.1 ~ Random Variables
• Bridge between experimental outcomes and
statistical analysis
• Each outcome in an experiment is assigned to a
number
• This suggests the idea of a function
Random Variable
Random Variable: A variable that assumes a unique numerical
value for each of the outcomes in the sample space of a probability
experiment
Notes:
• Used to denote the outcomes of a probability experiment
• Each outcome in a probability experiment is assigned to a unique
value
• Illustration:
[Figure: each outcome in the sample space S is mapped to a value of the random variable (0, 1, or 2)]
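The idea of a random variable as a function can be sketched in a few lines of Python. The experiment here (tossing a coin twice and counting heads) and the name `number_of_heads` are assumptions chosen for illustration; they are not taken from the slides.

```python
from itertools import product

# Sample space for a hypothetical experiment: toss a coin twice.
sample_space = ["".join(t) for t in product("HT", repeat=2)]

def number_of_heads(outcome: str) -> int:
    """Random variable: maps an outcome to the count of heads in it."""
    return outcome.count("H")

# Each outcome is assigned exactly one numerical value.
values = {outcome: number_of_heads(outcome) for outcome in sample_space}
print(values)
```

Several outcomes may share a value (HT and TH both map to 1), but no outcome is assigned more than one value.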
Examples of Random Variable
1. Let the number of computers sold per day by a local merchant be
a random variable. Integer values ranging from zero to about 50
are possible values.
2. Let the number of pages in a mystery novel at a bookstore be a
random variable. The smallest number of pages is 125 while the
largest number of pages is 547.
3. Let the time it takes an employee to get to work be a random
variable. Possible values are 15 minutes to over 2 hours.
4. Let the volume of water used by a household during a month be a
random variable. Amounts range up to several thousand gallons.
5. Let the number of defective components in a shipment of 1000 be
a random variable. Values range from 0 to 1000.
Discrete & Continuous Random Variables
Discrete Random Variable: A quantitative random variable that can
assume a countable number of values
• Intuitively, a discrete random variable can assume values
corresponding to isolated points along a line interval. That is,
there is a gap between any two values.
Note: Usually associated with counting
Continuous Random Variable: A quantitative random variable that
can assume an uncountable number of values
• Intuitively, a continuous random variable can assume any value
along a line interval, including every possible value between any
two values
Note: Usually associated with a measurement
Example
Example: Determine whether the following random variables are
discrete or continuous
1. The barometric pressure at 12:00 PM
2. The length of time it takes to complete a statistics exam
3. The number of items in the shopping cart of the person in
front of you at the checkout line
4. The weight of a homegrown zucchini
5. The number of tickets issued by the PA State Police during a
24-hour period
6. The number of cans of soda pop dispensed by a machine
placed in the Mathematics building on campus
7. The number of cavities the dentist discovers during your
next visit
5.2 ~ Probability Distributions
of a Discrete Random Variable
• Need a complete description of a discrete random
variable
• This includes all the values the random variable may
assume and all of the associated probabilities
• This information may be presented in a variety of
ways
Probability Distribution & Function
Probability Distribution: A distribution of the probabilities
associated with each of the values of a random variable. The
probability distribution is a theoretical distribution; it is used to
represent populations.
Notes:
• The probability distribution tells you everything you need to know
about the random variable.
• The probability distribution may be presented in the form of a
table, chart, function, etc.
Probability Function: A rule that assigns probabilities to the values
of the random variable
Example
Example: The number of people staying in a randomly selected
room at a local hotel is a random variable ranging in
value from 0 to 4. The probability distribution is
known and is given in various forms below:

x       0      1      2      3      4
P(x)    2/15   4/15   5/15   3/15   1/15

Notes:
• This chart implies the only values x takes on are 0, 1, 2, 3,
and 4
• P(the random variable x equals 2) = P(2) = 5/15
A Line Representation
[Figure: line representation of the Hotel Room Probability Distribution, P(x) from 0.0 to 0.4 against x = 0, 1, 2, 3, 4]
Histogram
A histogram may be used to present a probability distribution:
[Figure: histogram of the Hotel Room Probability Distribution, P(x) from 0.0 to 0.4 against x = 0, 1, 2, 3, 4]
Notes
• The histogram of a probability distribution uses the
physical area of each bar to represent its assigned
probability
• In the Hotel Room probability distribution, the width of
each bar is 1, so the height of each bar is equal to the
assigned probability, which is also the area of each bar
• The idea of area representing probability is important in
the study of continuous random variables
Reminder!
• Every probability function must satisfy the two basic
properties of probability:
1. The probability assigned to each value of the random
variable must be between 0 and 1, inclusive:
$0 \le P(x) \le 1$
2. The sum of the probabilities assigned to all the values
of the random variable must equal 1:
$\sum_{\text{all } x} P(x) = 1$
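The two properties can be checked mechanically. The sketch below applies them to the Hotel Room distribution from the earlier example; the helper name `is_probability_function` is hypothetical, and exact fractions are used only to avoid rounding in the sum.

```python
from fractions import Fraction

# Hotel Room distribution from the earlier example, stored exactly.
hotel = {0: Fraction(2, 15), 1: Fraction(4, 15), 2: Fraction(5, 15),
         3: Fraction(3, 15), 4: Fraction(1, 15)}

def is_probability_function(dist) -> bool:
    """Check both properties: each P(x) lies in [0, 1] and the total is 1."""
    each_in_range = all(0 <= p <= 1 for p in dist.values())
    sums_to_one = sum(dist.values()) == 1
    return each_in_range and sums_to_one

print(is_probability_function(hotel))
```

A table whose probabilities add to less than 1, such as {0: 1/2, 1: 1/3}, would fail the second property.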
5.3 ~ Mean and Variance of a
Discrete Probability Distribution
• Describe the center and spread of a population
• $\mu$, $\sigma$, $\sigma^2$: population parameters
• Population parameters are usually unknown values
(we would like to estimate)
Important Notes
1. $\bar{x}$ is the mean of the sample
2. $s^2$ and $s$ are the variance and standard deviation of the sample
3. $\bar{x}$, $s^2$, and $s$ are called sample statistics
4. $\mu$ (lowercase Greek letter “mu”) is the mean of the population
5. $\sigma^2$ (“sigma squared”) is the variance of the population
6. $\sigma$ (lowercase Greek letter “sigma”) is the standard deviation of
the population
7. $\mu$, $\sigma^2$, and $\sigma$ are called population parameters. (A parameter is a
constant. $\mu$, $\sigma^2$, and $\sigma$ are typically unknown values.)
Mean of a Discrete Random Variable
• The mean, $\mu$, of a discrete random variable x is found by
multiplying each possible value of x by its own
probability and then adding all the products together:
$\mu = \sum [x P(x)]$
Notes:
• The mean is the average value of the random variable, what
happens on average
• The mean is not necessarily a value of the random variable
Discrete Random Variables
Variance of a Discrete Random Variable: Variance, $\sigma^2$, of a
discrete random variable x is found by multiplying each
possible value of the squared deviation from the mean, $(x - \mu)^2$,
by its own probability and then adding all the products
together:
$\sigma^2 = \sum [(x - \mu)^2 P(x)]$
$= \sum [x^2 P(x)] - \{\sum [x P(x)]\}^2$
$= \sum [x^2 P(x)] - \mu^2$
Standard Deviation of a Discrete Random Variable: The
positive square root of the variance:
$\sigma = \sqrt{\sigma^2}$
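The step from the definition of the variance to the shortcut form is not spelled out on the slide. One way to fill it in, using only $\mu = \sum [x P(x)]$ and $\sum P(x) = 1$, is the following sketch:

```latex
\begin{aligned}
\sigma^2 &= \sum \left[(x - \mu)^2 P(x)\right] \\
         &= \sum \left[x^2 P(x)\right] - 2\mu \sum \left[x P(x)\right] + \mu^2 \sum P(x) \\
         &= \sum \left[x^2 P(x)\right] - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
         &= \sum \left[x^2 P(x)\right] - \mu^2
\end{aligned}
```

The shortcut form is usually easier by hand because it needs only the two column totals of an extensions table.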
Example
Example: The number of standby passengers who get seats on a
daily commuter flight from Boston to New York is a
random variable, x, with probability distribution given
below (in an extensions table). Find the mean,
variance, and standard deviation:

x        P(x)    xP(x)    x^2    x^2 P(x)
0        0.30    0.00     0      0.00
1        0.25    0.25     1      0.25
2        0.20    0.40     4      0.80
3        0.15    0.45     9      1.35
4        0.05    0.20     16     0.80
5        0.05    0.25     25     1.25
Totals   1.00    1.55            4.45

Column totals: $\sum P(x) = 1.00$ (check), $\sum [x P(x)] = 1.55$, $\sum [x^2 P(x)] = 4.45$
Solution
• Using the formulas for mean, variance, and standard
deviation:
$\mu = \sum [x P(x)] = 1.55$
Note: 1.55 is not a value of the random variable (in this case). It is
only what happens on average.
$\sigma^2 = \sum [x^2 P(x)] - \{\sum [x P(x)]\}^2$
$= 4.45 - (1.55)^2 = 4.45 - 2.4025 = 2.0475$
$\sigma = \sqrt{\sigma^2} = \sqrt{2.0475} \approx 1.43$
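The extensions-table arithmetic for the standby-passenger example can be reproduced with a short script. This is a minimal sketch; the function name `mean_variance_sd` is hypothetical and the distribution is typed in from the table.

```python
import math

# Standby-passenger distribution from the extensions table.
standby = {0: 0.30, 1: 0.25, 2: 0.20, 3: 0.15, 4: 0.05, 5: 0.05}

def mean_variance_sd(dist):
    """Return (mu, sigma^2, sigma) using mu = sum[x P(x)] and
    sigma^2 = sum[x^2 P(x)] - mu^2."""
    mu = sum(x * p for x, p in dist.items())
    second_moment = sum(x**2 * p for x, p in dist.items())
    variance = second_moment - mu**2
    return mu, variance, math.sqrt(variance)

mu, variance, sd = mean_variance_sd(standby)
print(round(mu, 4), round(variance, 4), round(sd, 2))
```

The rounded results should agree with the column totals and with the values 1.55, 2.0475, and 1.43 found by hand.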
Example
Example: The probability distribution for a random variable x is
given by the probability function:
$P(x) = \frac{8 - x}{15}$ for x = 3, 4, 5, 6, 7
Find the mean, variance, and standard deviation
Solutions: Find the probability associated with each value by using
the probability function:
$P(3) = \frac{8 - 3}{15} = \frac{5}{15}$
$P(4) = \frac{8 - 4}{15} = \frac{4}{15}$
$P(5) = \frac{8 - 5}{15} = \frac{3}{15}$
$P(6) = \frac{8 - 6}{15} = \frac{2}{15}$
$P(7) = \frac{8 - 7}{15} = \frac{1}{15}$
Finding the Population Parameters
• Use an extensions table to find the population parameters:

x        P(x)     xP(x)    x^2    x^2 P(x)
3        5/15     15/15    9      45/15
4        4/15     16/15    16     64/15
5        3/15     15/15    25     75/15
6        2/15     12/15    36     72/15
7        1/15     7/15     49     49/15
Totals   15/15    65/15           305/15

Column totals: $\sum P(x) = 15/15$, $\sum [x P(x)] = 65/15$, $\sum [x^2 P(x)] = 305/15$
$\mu = \sum [x P(x)] = \frac{65}{15} \approx 4.33$
$\sigma^2 = \sum [x^2 P(x)] - \{\sum [x P(x)]\}^2 = \frac{305}{15} - \left(\frac{65}{15}\right)^2 \approx 1.56$
$\sigma = \sqrt{\sigma^2} = \sqrt{1.56} \approx 1.25$
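When the distribution is given by a rule rather than a table, the rule can be coded directly. The sketch below uses exact fractions so that the totals 15/15, 65/15, and 305/15 can be compared without rounding; the names `P` and `support` are illustrative choices.

```python
from fractions import Fraction
import math

def P(x: int) -> Fraction:
    """Probability function from the example: P(x) = (8 - x)/15."""
    return Fraction(8 - x, 15)

support = range(3, 8)  # x = 3, 4, 5, 6, 7

total = sum(P(x) for x in support)                      # should be 1
mu = sum(x * P(x) for x in support)                     # 65/15 in lowest terms
variance = sum(x**2 * P(x) for x in support) - mu**2    # 305/15 - (65/15)^2
sd = math.sqrt(variance)
print(total, mu, variance, round(sd, 2))
```

Python reduces the fractions automatically, so 65/15 is displayed as 13/3.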
5.4 ~ The Binomial Probability Distribution
• One of the most important discrete distributions
• Based on a series of repeated trials whose outcomes
can be classified in one of two categories: success or
failure
• Distribution based on a binomial probability
experiment
Binomial Probability Experiment
Binomial Probability Experiment: An experiment that is
made up of repeated trials that possess the following properties:
1. There are n repeated independent trials
2. Each trial has two possible outcomes (success, failure)
3. P(success) = p, P(failure) = q, and p + q = 1
4. The binomial random variable x is the count of the number
of successful trials that occur; x may take on any integer
value from zero to n
Notes
• Properties 1 and 2 are the two basic properties of any
binomial experiment
• Property 3 concerns the algebraic notation for each trial
• Property 4 concerns the algebraic notation for the
complete experiment
• Both x and p must be associated with “success”
• Independent trials mean that the result of one trial does
not affect the probability of success of any other trial in
the experiment. The probability of “success” remains
constant throughout the entire experiment.
Example
Example: It is known that 40% of all graduating seniors on the
campus of a very large university have taken a statistics
class. Five seniors are selected at random and asked if
they have taken a statistics class. This approximates a
binomial experiment:
1. A trial is asking one student, repeated 5 times. The trials are independent
since the probability of taking a statistics class for any one student is not
affected by the results from any other student.
2. Two outcomes on each trial: taken a statistics class (success), not taken a
statistics class (failure)
3. p = P(taken a statistics class) = 0.40
q = P(not taken a statistics class) = 0.60
4. x = number of students who have taken a statistics class
Binomial Probability Function
• For a binomial experiment, let p represent the probability of a
“success” and q represent the probability of a “failure” on a single
trial; then P(x), the probability that there will be exactly x
successes in n trials, is:
$P(x) = \binom{n}{x} p^x q^{n-x}$, for x = 0, 1, 2, ... , or n
Notes:
• The number of ways that exactly x successes can occur in n trials: $\binom{n}{x}$
• The probability of exactly x successes: $p^x$
• The probability that failure will occur on the remaining
(n - x) trials: $q^{n-x}$
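The binomial probability function translates almost symbol for symbol into code. The sketch below assumes Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is hypothetical, and the values n = 5, p = 0.40 are borrowed from the statistics-class example.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(x) = C(n, x) * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Statistics-class example: n = 5 seniors, p = 0.40.
probs = [binomial_pmf(x, 5, 0.40) for x in range(6)]
for x, prob in enumerate(probs):
    print(x, round(prob, 4))
```

Because x runs over every value from 0 to n, the six probabilities should add to 1, which is a quick check on the function.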
Notes
• The number of ways that exactly x successes can occur in a set of
n trials is represented by the symbol $\binom{n}{x}$
1. Must always be a positive integer
2. Called the binomial coefficient
3. Found by using the formula: $\binom{n}{x} = \frac{n!}{x!(n - x)!}$
• n! is an abbreviation for n factorial: $n! = n(n - 1)(n - 2) \cdots (3)(2)(1)$
$6! = 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 720$
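The factorial formula for the binomial coefficient can be checked against Python's built-in `math.comb` (available from Python 3.8). The helper `binomial_coefficient` below is a hypothetical name, written only to mirror the formula.

```python
from math import comb, factorial

def binomial_coefficient(n: int, x: int) -> int:
    """n! / (x! (n - x)!), computed exactly with integer arithmetic."""
    return factorial(n) // (factorial(x) * factorial(n - x))

print(factorial(6))                                   # 720
print(binomial_coefficient(4, 2))                     # 6
print(binomial_coefficient(15, 12) == comb(15, 12))   # True
```

Integer division is safe here because the quotient is always a whole number.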
Factorial Calculations with TI
• Example: 6! = 6 x 5 x 4 x 3 x 2 x 1 = 720
• Enter the number 6 and then use MATH > PRB > 4 to enter
the !, then press ENTER. The display reads 720
• Try these values next: 10!, 0!, 12!, 20!
• Answers: 10! = 3,628,800; 0! = 1; 12! = 479,001,600; 20! ≈ 2.43E18
As you can see, factorials grow very rapidly
Example
Example: According to a recent study, 65% of all homes in a
certain county have high levels of radon gas leaking
into their basements. Four homes are selected at random
and tested for radon. The random variable x is the number
of homes with high levels of radon (out of the four).
Properties:
1. There are 4 repeated trials: n = 4. The trials are independent.
2. Each test for radon is a trial, and each test has two outcomes: radon or
no radon
3. p = P(radon) = 0.65, q = P(no radon) = 0.35, p + q = 1
4. x is the number of homes with high levels of radon, possible values:
0, 1, 2, 3, 4
Binomial Probability Function
$P(x) = \binom{4}{x} (0.65)^x (0.35)^{4-x}$, for x = 0, 1, 2, 3, 4
$P(0) = \binom{4}{0} (0.65)^0 (0.35)^4 = (1)(1)(0.0150) = 0.0150$
$P(1) = \binom{4}{1} (0.65)^1 (0.35)^3 = (4)(0.65)(0.0429) = 0.1115$
$P(2) = \binom{4}{2} (0.65)^2 (0.35)^2 = (6)(0.4225)(0.1225) = 0.3105$
$P(3) = \binom{4}{3} (0.65)^3 (0.35)^1 = (4)(0.2746)(0.35) = 0.3845$
$P(4) = \binom{4}{4} (0.65)^4 (0.35)^0 = (1)(0.1785)(1) = 0.1785$
Example
Example: In a certain automobile dealership, 70% of all customers
purchase an extended warranty with their new car. For 15
customers selected at random:
1) Find the probability that exactly 12 will purchase an
extended warranty
2) Find the probability that at most 13 will purchase an extended
warranty
Solutions:
• Let x be the number of customers who purchase an extended warranty. x
is a binomial random variable.
• The probability function associated with this experiment:
$P(x) = \binom{15}{x} (0.7)^x (0.3)^{15-x}$, for x = 0, 1, 2, ... , 15
Solutions Continued
1) Probability exactly 12 purchase an extended warranty:
$P(12) = \binom{15}{12} (0.7)^{12} (0.3)^3 = 0.1700$
2) Probability at most 13 purchase an extended warranty:
$P(x \le 13) = P(0) + P(1) + \cdots + P(13)$
$= 1 - [P(14) + P(15)]$
$= 1 - \left[\binom{15}{14} (0.7)^{14} (0.3)^1 + \binom{15}{15} (0.7)^{15} (0.3)^0\right]$
$= 1 - [0.0305 + 0.0047]$
$= 1 - 0.0352 = 0.9648$
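The complement shortcut used in part 2 can be checked against the direct sum P(0) + P(1) + ... + P(13). This is a sketch under the same assumptions as the example (n = 15, p = 0.7); the helper name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability function: C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 15, 0.7

exactly_12 = binomial_pmf(12, n, p)

# "At most 13" via the complement: 1 - [P(14) + P(15)].
at_most_13 = 1 - (binomial_pmf(14, n, p) + binomial_pmf(15, n, p))

# The same probability by direct summation of P(0) through P(13).
direct_sum = sum(binomial_pmf(x, n, p) for x in range(14))

print(round(exactly_12, 4), round(at_most_13, 4))
```

At full precision the second answer rounds to 0.9647. The hand calculation gives 0.9648 because P(14) and P(15) are rounded to four decimal places before subtracting.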
Notes
• Many graphing calculators also have built-in functions for computing
binomial probabilities and cumulative probabilities
• TI Calculators:
Choose 2nd > DISTR > 0 or A:binompdf(
Enter: n, p)
where n is the number of independent trials
and p is the probability of success
5.5 ~ Mean & Standard Deviation
of the Binomial Distribution
• Population parameters of the binomial distribution help
to describe the distribution
• Mean and standard deviation indicate where the
distribution is centered and the spread of the distribution
Mean & Standard Deviation
• The mean and standard deviation of a theoretical
binomial distribution can be found by using the
following two formulas:
$\mu = np$
$\sigma = \sqrt{npq}$
Notes:
• Mean is intuitive: number of trials multiplied by the
probability of a success
• The variance of a binomial probability distribution is:
$\sigma^2 = (\sqrt{npq})^2 = npq$
Example
Example: Find the mean and standard deviation of the binomial
distribution when n = 18 and p = 0.75
Solutions:
1) n = 18, p = 0.75, q = 1 - 0.75 = 0.25
$\mu = np = (18)(0.75) = 13.5$
$\sigma = \sqrt{npq} = \sqrt{(18)(0.75)(0.25)} = \sqrt{3.375} \approx 1.8371$
2) The probability function is:
$P(x) = \binom{18}{x} (0.75)^x (0.25)^{18-x}$ for x = 0, 1, 2, ... , 18
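The shortcut formulas can be compared with the general definitions from Section 5.3 applied to the same distribution. The sketch below does this for n = 18 and p = 0.75; the variable names are illustrative only.

```python
from math import comb, sqrt

n, p = 18, 0.75
q = 1 - p

# Shortcut formulas for the binomial distribution.
mu_shortcut = n * p
sigma_shortcut = sqrt(n * p * q)

# General definitions applied to the binomial probability function.
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
mu_general = sum(x * px for x, px in enumerate(pmf))
var_general = sum(x**2 * px for x, px in enumerate(pmf)) - mu_general**2

print(mu_shortcut, round(sigma_shortcut, 4))
print(round(mu_general, 4), round(sqrt(var_general), 4))
```

Both lines should show the same mean and standard deviation, up to floating-point rounding.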
Table of Values & Probabilities

x     P(x)
0     0.0000
1     0.0000
2     0.0000
3     0.0000
4     0.0000
5     0.0000
6     0.0002
7     0.0010
8     0.0042
9     0.0139
10    0.0376
11    0.0820
12    0.1436
13    0.1988
14    0.2130
15    0.1704
16    0.0958
17    0.0338
18    0.0056
Histogram
[Figure: histogram of the binomial distribution with n = 18 and p = 0.75, P(x) from 0.00 to 0.22 against x = 0 to 18, with the mean $\mu$ and standard deviation $\sigma$ marked]