Section 5.1 – Probability Distributions
A random variable is a variable (typically represented by x) that has a numeric value, determined by chance,
for each possible outcome of an experiment.
Examples:
The number of students passing a certain class
The average height of the students in a class
The number of girls in a family of 5 children
The sum on the faces of two rolled dice
The number of defective parts in a sample of 20
The average daily temperature
A word about randomness
The word randomness suggests unpredictability.
Randomness and uncertainty are vague concepts that deal with variation.
A simple example of randomness involves a coin toss. The outcome of the toss is uncertain. Since the coin
tossing experiment is unpredictable, the outcome is said to exhibit randomness.
Even though individual flips of a coin are unpredictable, if we flip the coin a large number of times, a pattern
will emerge. Roughly half of the flips will be heads and half will be tails.
This long-run regularity of a random event is described with probability. Our discussions of randomness will be
limited to phenomena that in the short run are not exactly predictable but do exhibit long-run regularity.
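The long-run regularity described above can be illustrated with a short simulation. This is only a sketch: the function name proportion_of_heads and the seed value are illustrative choices, and the coin is assumed to be fair.

```python
import random

def proportion_of_heads(num_flips, seed=None):
    """Simulate num_flips tosses of a fair coin and return the fraction of heads."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

# The fraction of heads is erratic for few flips and settles near 0.5 for many flips.
for n in (10, 100, 1_000, 100_000):
    print(n, proportion_of_heads(n, seed=1))
```

Individual flips stay unpredictable; only the proportion over many flips becomes stable.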
A discrete random variable has either a finite or a countable number of values. This chapter deals with
discrete random variables.
A continuous random variable has infinitely many values, and those values can be associated with
measurements on a continuous scale in such a way that there are no gaps or interruptions.
A probability distribution is a graph, table, or formula that gives the probability for each possible value of the
random variable.
(Notice: similar to relative frequency tables, histograms)
A probability histogram is a way to graph a probability distribution.
The vertical scale shows probabilities instead of relative frequencies.
Note that, when each rectangle has width 1, its area equals the corresponding probability.
M116 – NOTES – CH 5
Requirements for a Probability Distribution
o 0 ≤ P(X = x) ≤ 1
o The sum of the probabilities of a discrete random variable is 1.
  Σ P(X = x) = 1
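The two requirements can be checked mechanically. The sketch below assumes the probabilities are given as a plain list; the function name and the rounding tolerance are illustrative choices, not part of the notes.

```python
import math

def is_probability_distribution(probabilities, tol=1e-9):
    """Return True if every probability is between 0 and 1 and they sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    # Allow a tiny tolerance because decimal fractions are not stored exactly.
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

print(is_probability_distribution([0.25, 0.5, 0.25]))  # both requirements hold
print(is_probability_distribution([0.5, 0.6]))         # sum is 1.1, so this fails
```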
To evaluate the mean and standard deviation of a probability distribution using the calculator
Enter x into L1
Enter the probabilities into L2
Press STAT
Arrow right to CALC
Select 1: 1-Var Stats L1,L2
Press ENTER
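For readers without the calculator, the same two numbers can be computed directly from the definitions: the mean is the sum of x · P(x), and the standard deviation is the square root of the sum of (x − mean)² · P(x). The sketch below uses a made-up distribution (number of heads in two tosses of a fair coin); the function name is an illustrative choice.

```python
import math

def mean_and_sd(x_values, probabilities):
    """Mean and standard deviation of a discrete probability distribution."""
    mean = sum(x * p for x, p in zip(x_values, probabilities))
    variance = sum((x - mean) ** 2 * p for x, p in zip(x_values, probabilities))
    return mean, math.sqrt(variance)

# Illustrative data: number of heads in two tosses of a fair coin.
mean, sd = mean_and_sd([0, 1, 2], [0.25, 0.5, 0.25])
print(mean, sd)
```

On the calculator, the standard deviation computed this way should correspond to the σx entry of the 1-Var Stats output rather than Sx.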
Identifying Unusual Results with the Range Rule of Thumb (section 6.2)
The range rule of thumb is based on the principle that for many data sets (symmetrical, bell shaped),
the vast majority (such as 95%) of sample values lie within two standard deviations of the mean. Less
common values are more than two standard deviations from the mean.

Minimum “usual” value ~ mean – 2 * standard deviation = µ – 2σ
Maximum “usual” value ~ mean + 2 * standard deviation = µ + 2σ
Identifying Unusual Results with Probabilities
Unusually high: x successes among n trials is unusually high if P(x or more) is very small (such
as less than 0.05)
Unusually low: x successes among n trials is unusually low if P(x or fewer) is very small (such as
less than 0.05)
M116 – TI 83/84 CALCULATOR – CH 5
Using the TI-83/84 calculator to find the Mean and Standard Deviation of Probability Distributions
Example 1) When randomly selecting jail inmates convicted of DWI (driving while intoxicated), the
probability distribution for the number x of prior DWI sentences is as described in the accompanying table
(based on data from the U.S. Department of Justice).
x    P(x)
0    0.512
1    0.301
2    0.138
3    0.049
a) What is the population and the success attribute?
b) Describe in words the random variable. (What are we counting?)
c) What are the possible values of the random variable?
d) Verify that the given table is a probability distribution.
e) Use the calculator to find the mean and standard deviation of this distribution.
f) Which values are usual and which are unusual, according to
(i) The probability rule?
(ii) The range rule of thumb?
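As a way to check the calculator work for parts (d) through (f), the computations can be sketched in a few lines. The variable names are illustrative; the data are the table values from Example 1, and the 0.05 cutoff and the two-standard-deviation range are the ones stated in the notes.

```python
import math

x_values = [0, 1, 2, 3]
probabilities = [0.512, 0.301, 0.138, 0.049]

# (d) Requirements: each probability is between 0 and 1, and they sum to 1.
is_distribution = (all(0 <= p <= 1 for p in probabilities)
                   and math.isclose(sum(probabilities), 1.0))

# (e) Mean and standard deviation from the definitions.
mean = sum(x * p for x, p in zip(x_values, probabilities))
sd = math.sqrt(sum((x - mean) ** 2 * p for x, p in zip(x_values, probabilities)))

# (f)(i) Probability rule: unusual if P(x or more) or P(x or fewer) is below 0.05.
def prob_at_least(x):
    return sum(p for v, p in zip(x_values, probabilities) if v >= x)

def prob_at_most(x):
    return sum(p for v, p in zip(x_values, probabilities) if v <= x)

unusual_by_probability = [x for x in x_values
                          if prob_at_least(x) < 0.05 or prob_at_most(x) < 0.05]

# (f)(ii) Range rule of thumb: unusual if outside mean -/+ 2 standard deviations.
low, high = mean - 2 * sd, mean + 2 * sd
unusual_by_range = [x for x in x_values if x < low or x > high]

print(is_distribution, round(mean, 3), round(sd, 3))
print(unusual_by_probability, unusual_by_range)
```

With these table values the mean is about 0.724 and the standard deviation about 0.877, and both rules flag only x = 3 as unusual.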
Section 5.2 & 5.3 – Binomial Experiments
Features of a binomial experiment (5.2)
1) The experiment has a fixed number of trials (n).
2) The trials must be independent.
3) Each trial has 2 possible outcomes: success (S) and failure (F).
4) Probabilities remain constant for each trial.
p is the probability of success, and q is the probability of failure.
When sampling without replacement, the events can be treated as if they were independent
if the sample size is no more than 5% of the population size. (That is, n ≤ 0.05N.)
Find binomial probabilities with a shortcut feature of the calculator
To find individual probabilities: Use binompdf(n,p,x)
Press 2nd VARS
Select 0:binompdf(
Type n,p,x)
Press ENTER
To calculate cumulative probabilities from 0 to x, use binomcdf(n,p,x)
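The two calculator commands can be mirrored in Python to see what they compute. The functions below are stand-ins named after the calculator commands, not a library API; they assume Python 3.8 or later for math.comb.

```python
from math import comb

def binompdf(n, p, x):
    """P(exactly x successes in n trials), each with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomcdf(n, p, x):
    """P(0 to x successes in n trials): the individual probabilities added up."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

print(binompdf(2, 0.5, 1))  # exactly 1 head in 2 tosses of a fair coin
print(binomcdf(2, 0.5, 1))  # at most 1 head in 2 tosses
```

A probability of "x or more" successes follows from the complement: 1 - binomcdf(n, p, x - 1).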
Mean, Variance, and Standard Deviation for the Binomial Distribution (5.3)
If we have the probability distribution in the editor of the calculator we can use the calculator by
doing STAT – CALC, 1-VarStat L1, L2
Otherwise we can use these formulas for binomial distributions.
µ = np
σ = √(npq)
Remember that the variance is the square of the standard deviation:
Variance = σ² = (√(npq))² = npq
Unusual values (5.3)
For a binomial distribution, it is unusual for the number of successes to be more than 2.5σ from µ.
Minimum “usual” value ~ µ – 2.5σ
Maximum “usual” value ~ µ + 2.5σ
Binomial Distributions and Simulations (Chapter 5)
Example 2) – Booking tickets:
Air America has a policy of booking as many as 15 persons on an airplane that can seat only 14. Past studies
have revealed that only 85% of the booked passengers actually arrive for the flight. Find the probability that if
Air America books 15 persons, not enough seats will be available.
a) Describe the random variable and success attribute. Give the possible values of the random variable.
Give the number of trials and the probability of success.
b) Use the calculator to find the probability that if Air America books 15 persons, not enough seats will be
available.
c) Is it unusual to find that there are not enough seats available? Should overbooking be a concern for
passengers?
d) SIMULATION
Now we are going to simulate this situation by repeating the experiment 20 times.
Use MATH PRB 7:randBin(n,p) and press ENTER 20 times.
Record results in a table, and then use your table to answer the question to the problem.
e) Use class results and answer the question again.
f) OPTIONAL (OYO)
Here we have another simulation technique. Use the calculator to generate 50 numbers that come from a
binomial distribution with n = 15 and p = 0.85
(We’ll clear List 1, generate the numbers and store them into List 1, we’ll sort the list and then
explore the editor)
STAT 4:ClrList L1 :
MATH PRB 7:randBin(n,p,50) STO L1 :
STAT 3:SortA(L1)
Go to the editor, explore the list and count how many times we had 15 passengers showing up. Then
estimate the probability, and compare with the theoretical result from part (b). Comment on the law
of large numbers.
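Example 2 can also be explored off the calculator. The sketch below assumes that "not enough seats" means all 15 booked passengers arrive, and that arrivals are independent with probability 0.85 each. The helper rand_bin is a hypothetical stand-in for the calculator's randBin command, and the seed is an arbitrary choice.

```python
import random

N_BOOKED, P_SHOW = 15, 0.85

# Exact probability that all 15 booked passengers arrive.
exact = P_SHOW ** N_BOOKED

def rand_bin(n, p, rng):
    """One simulated flight: count how many of n passengers arrive."""
    return sum(1 for _ in range(n) if rng.random() < p)

def simulate(num_flights, seed=None):
    """Fraction of simulated flights on which all booked passengers arrive."""
    rng = random.Random(seed)
    arrivals = [rand_bin(N_BOOKED, P_SHOW, rng) for _ in range(num_flights)]
    return sum(1 for a in arrivals if a == N_BOOKED) / num_flights

print(round(exact, 4))
for flights in (20, 50, 20_000):
    print(flights, simulate(flights, seed=0))
```

The exact probability is about 0.087. Estimates from 20 or 50 simulated flights can be far from that value, while estimates from many flights tend to land close to it, which is the law of large numbers at work.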
Binomial Distributions – Usual and Unusual Outcomes – Why do we care about them?
1) Experiment: Counting the number of girls born to 100 women.
Use the range rule of thumb with n = 100 and p = .5 to find the usual range of the distribution of x: the number
of girls among 100 babies.
Application: Gender Selection: ProCare Industries, LTD., once provided a product called “Gender
Choice”, which, according to advertising claims, allowed couples to “increase your chances of having a
boy up to 85%, and a girl up to 80%”.
Gender Choice was available in blue packages for couples wanting a baby boy and pink packages for
couples wanting a baby girl.
Suppose we conduct an experiment with 100 couples who want to have baby girls, and they all follow the
Gender Choice “easy-to-use in-home system” described in the pink package.
In the box, they show the CLAIM:
Gender Choice increases your chances of having a girl (or boy).
So here we have two conflicting hypotheses:
Gender Choice has no effect
Claim: Gender Choice increases your chances of having a girl/boy
What would you conclude about the effectiveness of Gender Choice if 100 couples using the pink package have
100 babies consisting of
a) 52 girls? (52%)
There is sufficient evidence to support the claim that the gender selection method is effective
There is not sufficient evidence to support the claim that the gender selection method is effective
b) 77 girls? (77%)
There is sufficient evidence to support the claim that the gender selection method is effective
There is not sufficient evidence to support the claim that the gender selection method is effective
Rare Event Rule for Inferential Statistics
If, under a given assumption, the probability of a particular observed event is exceptionally small (less
than 0.05), we conclude that the assumption is probably not correct.
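One way to apply the rare event rule to the Gender Choice question is to assume the product has no effect (so each baby is a girl with probability 0.5) and compute how likely it is to see that many girls or more. This is a sketch under that assumption; the function name is an illustrative choice.

```python
from math import comb

def prob_at_least(n, p, x):
    """P(x or more successes in n trials), each with success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

N_BABIES, P_GIRL = 100, 0.5  # assumption being tested: Gender Choice has no effect

for girls in (52, 77):
    tail = prob_at_least(N_BABIES, P_GIRL, girls)
    verdict = "unusually high" if tail < 0.05 else "not unusual"
    print(girls, tail, verdict)
```

Under the no-effect assumption, 52 or more girls is a common outcome, while 77 or more is extremely rare, so only the second result counts as evidence against that assumption.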
2) The rate of Lyme disease cases in Clinton County is 2%.
In groups of 1000, what is the usual range of the distribution of x, the number of people in the county who have
Lyme disease out of 1000?
Here is the rest of the story:
A new vaccine has been developed to avoid getting Lyme disease. We would like to know whether the vaccine
is effective.
There are two conflicting hypotheses:
The vaccine is not effective
Claim: The vaccine is effective
Case 1: When 1000 people from that county are given the new vaccine, it is found that 19 of them contract
Lyme disease
We support the claim that the vaccine is effective
We don’t have enough evidence to support the claim that the vaccine is effective
Case 2: When 1000 people from that county are given the new vaccine, it is found that 7 of them contract Lyme
disease
We support the claim that the vaccine is effective
We don’t have enough evidence to support the claim that the vaccine is effective
Rare Event Rule for Inferential Statistics
If, under a given assumption, the probability of a particular observed event is exceptionally small (less
than 0.05), we conclude that the assumption is probably not correct.
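For the vaccine question the interesting tail is the low one: assuming the vaccine does nothing (so the rate stays at 2%), how likely is it to see that few cases or fewer among 1000 people? The sketch below works under that assumption; the function name is an illustrative choice.

```python
from math import comb

def prob_at_most(n, p, x):
    """P(x or fewer successes in n trials), each with success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

N_PEOPLE, P_LYME = 1000, 0.02  # assumption being tested: the vaccine is not effective

for cases in (19, 7):
    tail = prob_at_most(N_PEOPLE, P_LYME, cases)
    verdict = "unusually low" if tail < 0.05 else "not unusual"
    print(cases, tail, verdict)
```

Seeing 19 or fewer cases is quite likely without any vaccine effect, whereas 7 or fewer would be a rare event under that assumption.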
Sections 5.2 and 5.3 – How does this material help us in inferential statistics?
3) There are two conflicting hypotheses:
The coin is fair
Claim: The coin is not fair
Case 1: Heads turns up 17 times in 30 tosses
We support the claim that the coin is NOT fair
We don’t have enough evidence to support the claim that the coin is NOT fair
Case 2: Heads turns up 27 times in 30 tosses
We support the claim that the coin is NOT fair
We don’t have enough evidence to support the claim that the coin is NOT fair
4) There are two conflicting hypotheses:
The die is fair
Claim: The die is not fair
Case 1: The outcome of 1 occurs 9 times in 60 rolls
We support the claim that the die is NOT fair
We don’t have enough evidence to support the claim that the die is NOT fair
Case 2: The outcome of 1 occurs 52 times in 60 rolls
We support the claim that the die is NOT fair
We don’t have enough evidence to support the claim that the die is NOT fair
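Because "not fair" could mean too many or too few successes, the coin and die cases can be screened with the usual range around the mean. The sketch below assumes a fair coin (p = 1/2) and a fair die (p = 1/6 for the outcome 1) and uses the two-standard-deviation range rule; the function names are illustrative.

```python
from math import sqrt

def usual_range(n, p, k=2):
    """Usual range for the number of successes: mean -/+ k standard deviations."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return mu - k * sigma, mu + k * sigma

def is_unusual(n, p, observed, k=2):
    """True if the observed count falls outside the usual range."""
    low, high = usual_range(n, p, k)
    return observed < low or observed > high

cases = [
    ("coin: 17 heads in 30 tosses", 30, 1 / 2, 17),
    ("coin: 27 heads in 30 tosses", 30, 1 / 2, 27),
    ("die: outcome 1 occurs 9 times in 60 rolls", 60, 1 / 6, 9),
    ("die: outcome 1 occurs 52 times in 60 rolls", 60, 1 / 6, 52),
]
for label, n, p, observed in cases:
    print(label, usual_range(n, p), is_unusual(n, p, observed))
```

With either cutoff mentioned in these notes (k=2 or k=2.5), Case 1 of each problem falls inside the usual range and Case 2 falls well outside it.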