Module III – A Bridge to Inference
Unit 6: Randomness, Probability, and
Sampling Distributions
Randomness
Parameter: a number that describes
the population. We do not generally
know the values of parameters.
Statistic: a number we calculate
from the sample data. We use a
statistic to estimate the value of an
unknown parameter.
Example: A car shop has a box full
of ball bearings with mean diameter
2.5003 cm. This is within the
specification for acceptance of the lot
by the purchaser. By chance, an
inspector chooses 100 bearings from
the lot that have mean diameter
2.5009 cm. Because this is outside
the specified limits, the lot is
mistakenly rejected.
To distinguish between a sample and
a population recall the following
notation:
Mean of a population: μ (a parameter)
Mean of a sample: x̄ (a statistic)
The value of a statistic changes from
sample to sample. In repeated
random sampling, the value of the
statistic varies; this is called sampling
variability. The statistic varies
unpredictably in the short run, but it has a
regular and predictable pattern in the
long run.
The concept of probability
Something is random if individual
outcomes are uncertain but there is a
regular distribution of outcomes in a
large number of repetitions.
The probability of an outcome of a
random phenomenon is the
proportion of times the outcome
would occur in a very long series of
repetitions.
Trials are independent when the
outcome of one trial does not
influence the outcome of another.
Probability Models:
Now that we know what randomness
is, we want to describe the patterns
that random phenomena exhibit. To
describe the pattern, list all the
possible outcomes and their
probabilities.
Example: Tossing 2 coins.
All possible outcomes:
HH, HT, TH, TT, where each outcome has a
¼ chance of happening.
The sample space, S, of a random
phenomenon is the set of all possible
outcomes.
An event is any outcome (or set of
outcomes) of a random
phenomenon. An event is a subset
of the sample space.
Example: Referring back to our last
example.
S = {HH, HT, TH, TT}
[Figure: an event marked as a subset of S]
A probability model is a mathematical
description of a random phenomenon
consisting of a sample space S and a
way of assigning probabilities to
events.
Probability Rules
If A is any event, the probability of A
is written as P(A).
Example: P(H) or P(T)
PROBABILITY RULES:
1. Let A be any event (a subset of the sample space S). Then
0 ≤ P(A) ≤ 1
2. P(S) = 1
3. P(A does not occur) = 1 – P(A)
4. Two events A and B are disjoint (or
mutually exclusive) if they have no
outcomes in common. If
A and B are disjoint, then P(A or B) =
P(A) + P(B).
This is the addition rule for disjoint events.
Assigning Probabilities – Finite Number of
outcomes
Assign a probability to each
individual outcome. Each probability
must be a number between 0 and 1,
and their sum must be 1.
The probability of any event is the
sum of the probabilities of the
outcomes making up the event.
Definition: Two events A and B are
exhaustive of the sample space if,
combined, they cover all possible
outcomes in the sample space.
Example: Rolling a die
S = {1, 2, 3, 4, 5, 6}
Outcome       1    2    3    4    5    6
Probability   1/6  1/6  1/6  1/6  1/6  1/6
P(5) = 1/6
P(5 or 6) = P(5) + P(6)   (the outcomes are disjoint)
= 1/6 + 1/6
= 2/6 = 1/3
P(not (5 or 6)) = 1 – P(5 or 6)
= 1 – 1/3
= 2/3
P(even no.) = P(2) + P(4) + P(6)
= 1/6 + 1/6 + 1/6
= 3/6 = ½
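The die calculations can be checked with a short script. This is only a sketch: the dictionary `model` and the helper `prob` are names chosen here for illustration, and exact fractions are used so the results match the hand calculation.

```python
from fractions import Fraction

# Probability model for one roll of a fair die: each outcome has probability 1/6.
model = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Probability of an event (a set of outcomes): add the outcome probabilities."""
    return sum(model[outcome] for outcome in event)

p_5_or_6 = prob({5, 6})        # addition rule for disjoint outcomes
p_not_5_or_6 = 1 - p_5_or_6    # complement rule
p_even = prob({2, 4, 6})

print(p_5_or_6, p_not_5_or_6, p_even)
```

The probabilities in `model` lie between 0 and 1 and add to 1, which is what a legitimate assignment of probabilities requires.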
Example: 4.21
All human blood can be typed as one
of O, A, B, or AB, but the distribution
of the types varies a bit with race.
Here is the probability model for the
blood type of a randomly chosen
African American:
Blood type    O     A     B     AB
Probability   0.49  0.27  0.20  ?
What is the probability of type AB
blood?
P(AB) = 1 – (0.49 + 0.27 + 0.20)
= 1 – 0.96
= 0.04
Maria has type B blood. She can
safely receive blood transfusions
from people with blood types O and
B. What is the probability that a
randomly chosen African American
can donate blood to Maria?
P(O or B) = P(O) + P(B)
= 0.49 + 0.20
= 0.69
Multiplication Rule for Independent Events
Two events A and B are independent if knowing that
one occurs does not change the probability that the
other occurs.
If A and B are independent:
P(A and B) = P(A)P(B)
Note: The multiplication rule holds only if A and B are
independent.
Example: dice
Roll a die twice: P(5 on the first roll and 6 on the second) = (1/6)(1/6) = 1/36
This is an example of independence.
Roll a die once: P(5 or 6) = 1/6 + 1/6 = 1/3
This is an example of disjoint events.
Using a Venn diagram we can also show disjoint
events.
[Venn diagram: sample space S containing two non-overlapping regions A and B]
Note: the multiplication rule
P(A and B) = P(A)P(B) holds if A and B are independent but not
otherwise.
The addition rule
P(A or B) = P(A) + P(B) holds if A and B are disjoint but
not otherwise.
If A and B are disjoint, then the fact that A occurs
tells us that B cannot occur. Disjoint events (with nonzero probabilities) are therefore not independent.
The general addition rule for any
two events:
We know that if A and B are
disjoint events, then
P(A or B) = P(A) + P(B). The
addition rule extends to more than
two events that are disjoint in the
sense that no two have any
outcomes in common.
Here is the addition rule for any
two events, disjoint or not:
P(A or B) = P(A) + P(B) – P(A and B)
[Venn diagram: two overlapping regions A and B; the overlap is the event "A and B"]
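A quick check of the general addition rule on one roll of a fair die. The events A ("even") and B ("at least 4") are hypothetical choices made here for illustration; they overlap, so simply adding P(A) and P(B) would count the shared outcomes twice.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die

def prob(event):
    """Equally likely outcomes: P(event) = (outcomes in event) / (outcomes in S)."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # even number (illustrative event)
B = {4, 5, 6}   # at least 4 (illustrative event)

p_a_or_b = prob(A | B)                               # computed directly from the union
by_general_rule = prob(A) + prob(B) - prob(A & B)    # general addition rule
naive_sum = prob(A) + prob(B)                        # only valid for disjoint events

print(p_a_or_b, by_general_rule, naive_sum)
```

Here the naive sum is larger than the true probability because the outcomes 4 and 6 belong to both events.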
If we toss a coin four times, we can
record the outcome as a string of
heads and tails. We are interested in the number of heads:
• Let X be the number of heads.
• The possible values of X are 0, 1, 2, 3, 4.
X is a random variable because
its value can vary when the coin
is tossed repeatedly. We can
define a random variable as:
A random variable is a variable
whose value is a numerical outcome
of a random phenomenon.
There are two main ways of
assigning probabilities to the values
of a random variable.
1. Discrete random variable – x
has a finite number of possible
values.
• The probability distribution of x
lists the values and their
probabilities, i.e.
Value of x:    x1  x2  …  xk
Probability:   p1  p2  …  pk
The probabilities must be between 0
and 1, and the sum of the
probabilities is 1.
Note: we can find the probability of
any event by adding the probabilities
that make up the event.
The mean of a discrete random
variable X is defined to be:
$\mu_X = \sum_{\text{all } x} x\,p(x)$
and its variance is:
$\sigma_X^2 = \sum_{\text{all } x} (x - \mu_X)^2\,p(x)$
Example
Consider the distribution of the
amount X (in $) won on a spin of a
slot machine:
x      -1     5      10      50
P(x)   0.93   0.05   0.015   0.005
$\mu_X = -1(0.93) + 5(0.05) + 10(0.015) + 50(0.005) = -0.28$
$\sigma_X^2 = (-0.72)^2(0.93) + (5.28)^2(0.05) + (10.28)^2(0.015) + (50.28)^2(0.005) = 16.1016$
P(X ≥ 10) = 0.015 + 0.005 = 0.02
P(X < 5) = 0.93
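The slot machine calculation can be reproduced directly from the definitions of the mean and variance of a discrete random variable. A minimal sketch, with variable names chosen here for illustration:

```python
# Distribution of the amount X (in $) won on one spin, from the table above.
values = [-1, 5, 10, 50]
probs = [0.93, 0.05, 0.015, 0.005]

# Mean: sum of x * p(x) over all values.
mean = sum(x * p for x, p in zip(values, probs))

# Variance: sum of (x - mean)^2 * p(x) over all values.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Probability of an event: add the probabilities of the values in the event.
p_at_least_10 = sum(p for x, p in zip(values, probs) if x >= 10)

print(round(mean, 4), round(variance, 4), round(p_at_least_10, 4))
```

The negative mean says that, in the long run, a player loses about 28 cents per spin.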
2. Continuous random variables
• There is an infinite set of possible
values, and we have no way of
listing or counting the individual
values.
Ex. Thermometer – an infinite
number of values lie along the
thermometer and we cannot count
them all.
We define a continuous random
variable x as one that takes all values in an
interval of numbers.
• The probability distribution of x is
described by a density curve. The
probability of any event is the area
under the density curve and above
the values of x that make up the
event. The probability model for a
continuous random variable
assigns probabilities to an interval
of outcomes rather than to
individual outcomes.
Note: For a continuous random variable, every individual
outcome is assigned a probability
of 0.
Normal distributions are probability
distributions; a normal density curve idealizes
the distribution of a very large data set.
Recall N(μ,σ): if x has the
N(μ,σ) distribution, then the
standardized variable
$z = \frac{x - \mu}{\sigma}$
is a standard normal random variable
having the distribution N(0,1).
Example: The heights of women
appear to be normally distributed
with mean μ = 64.5 inches and standard deviation σ = 2.5
inches. We choose one woman at
random from the population and
observe her height x. Upon repeated
sampling, we observe that the
distribution of values of x is the same
normal distribution.
$P(63 \le x \le 65) = P\left(\frac{63 - 64.5}{2.5} \le \frac{x - 64.5}{2.5} \le \frac{65 - 64.5}{2.5}\right)$
= P(-0.6 ≤ Z ≤ 0.2)
= 0.5793 – 0.2743 = 0.3050
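The same probability can be computed without a table. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# Heights of women: N(64.5, 2.5), in inches.
heights = NormalDist(mu=64.5, sigma=2.5)

# P(63 <= x <= 65) is the area under the density curve between 63 and 65.
p_direct = heights.cdf(65) - heights.cdf(63)

# Same calculation after standardizing: z = (x - mu) / sigma.
z_low = (63 - 64.5) / 2.5
z_high = (65 - 64.5) / 2.5
standard_normal = NormalDist()   # N(0, 1)
p_standardized = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(p_direct, 4))
```

Both routes give the same area, which is the point of standardizing: any normal probability reduces to a standard normal one.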
Example: The normal distribution
with mean 6.84 (rounded to 6.8 below) and standard
deviation 1.6 is a good description
of the Iowa test vocabulary scores
of seventh graders in Gary, Indiana.
Let the random variable x be the
Iowa test score of one Gary seventh
grader chosen at random.
Write the event "the student
chosen has a score of 10 or higher"
in terms of x, and find the probability of this event.
Scores: N(6.8, 1.6)
The event is x ≥ 10, so we want
P(x ≥ 10)
$= P\left(\frac{x - 6.8}{1.6} \ge \frac{10 - 6.8}{1.6}\right)$
$= P(z \ge 2)$
$= 1 - 0.9772$
$= 0.0228$
Sampling Distribution
We use the statistic x̄ (known) to
estimate the unknown parameter μ,
and s to estimate the unknown
parameter σ.
An SRS should represent the
population fairly well, so the mean of
the sample should be somewhere
near the population mean μ.
But x̄ changes, depending on the
sample.
Law of Large Numbers: Draw
observations at random from any
population with finite mean μ.
As the number of observations drawn
increases, the mean x̄ of the
observed values gets closer and
closer to the mean μ of the
population.
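The law of large numbers can be seen in a small simulation. The sketch below assumes a population of fair die rolls (mean μ = 3.5), which is a choice made here for illustration, and uses a fixed seed so the run is repeatable.

```python
import random

rng = random.Random(0)   # fixed seed, an arbitrary choice for repeatability
MU = 3.5                 # mean of one roll of a fair die

def mean_of_rolls(n):
    """Mean of n simulated rolls of a fair die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# The sample mean wanders for small n and settles near MU for large n.
for n in (10, 100, 10_000):
    print(n, mean_of_rolls(n))

large_sample_mean = mean_of_rolls(100_000)
print(100_000, large_sample_mean)
```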
The sampling distribution of a
statistic is the distribution of values
taken by the statistic in all possible
samples of the same size from the
same population.
Mean and Standard Deviation of x̄
Suppose x̄ is the mean of an SRS of
size n drawn from a large population
with mean μ and standard deviation
σ.
Then the mean of the sampling
distribution of x̄ is μ and the standard
deviation is σ/√n.
Notes:
• The sampling distribution has the
same mean but smaller spread
than the distribution of the
observations in the population.
Example:
X ~ N(μ, σ) where σ = 30 and n = 9. We will estimate μ with
x̄. The standard deviation of x̄ becomes
σ/√n = 30/√9 = 10
• Averages are less variable than
individual observations.
• For fixed σ, as the sample size n
increases, σ/√n gets smaller.
The results of large samples are
less variable than the results of
smaller samples.
Note: To cut the standard deviation of x̄ in half,
the sample size has to increase by a
factor of 4.
Example: n = 36, σ/√n = 30/√36 = 5
Now we will look at:
The Central Limit Theorem:
The shape of the sampling
distribution of x̄ depends on the shape
of the population distribution. If the
population distribution is normal, then
so is the distribution of the sample
mean.
Sampling Distribution of a Sample
Mean
If a population is distributed N(μ, σ),
then the sample mean x̄ of n
independent observations has the
N(μ, σ/√n) distribution.
If the population is not normal, the
distribution of x̄ changes shape as n
increases.
The shape looks more and more like
the normal distribution (as long as
the population has finite σ).
When the shape of the population
distribution is far from normal, a
larger sample is needed for the
distribution of x̄ to be close to
normal.
In other words, the Central Limit
Theorem says: Draw an SRS
of size n from any population with mean
μ and finite standard deviation σ.
When n is large, the sampling
distribution of x̄ is approximately
normal: x̄ is approximately
N(μ, σ/√n).
Notes:
• The CLT allows us to use normal
probability calculations to answer
questions about sample means (for n
large enough) even when the
population is not normal.
• More general versions of the CLT say that the distribution
of a sum or average of many small
random quantities is close to
normal.
  – This is true even if the quantities are
not independent (as long as they are not too highly correlated).
  – This is true even if they have different
distributions (as long as no single quantity dominates the others).
• How large is large? n ≥ 20 is the
rule of thumb used here, but it really
depends on how non-normal the
population distribution is.
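The Central Limit Theorem can be illustrated by simulation. The sketch below assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), a sample size of 30, and 5000 repeated samples; all three are choices made here for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed, an arbitrary choice for repeatability
N = 30                   # size of each sample
REPS = 5000              # number of samples drawn

def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1 and sd 1."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])

means = [sample_mean(N) for _ in range(REPS)]

mean_of_means = statistics.fmean(means)   # should be close to mu = 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
predicted_sd = 1 / math.sqrt(N)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(predicted_sd, 3))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is far from normal.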
Example
NASA is producing a batch of nuts
for their new space shuttle. The
distribution of the diameters of the
nuts is unknown, but it is known from
past experience that the mean of
the distribution is 1.8 mm and the
standard deviation is 0.3 mm.
What is the probability of observing a
sample of 100 nuts with mean diameter less
than 1.75 mm?
NOTE: We use the Central Limit
Theorem to find this probability. The
sample size we have is sufficiently
large (n ≥ 20).
$P(\bar{x} < 1.75) = P\left(Z < \frac{1.75 - \mu}{\sigma/\sqrt{n}}\right)$
$= P\left(Z < \frac{1.75 - 1.8}{0.3/\sqrt{100}}\right)$
$= P\left(Z < \frac{-0.05}{0.03}\right)$
$= P(Z < -1.67) = 0.0475$
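The calculation for the nuts can be sketched in code, assuming (as the note says) that the Central Limit Theorem makes x̄ approximately normal. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1.8, 0.3, 100   # population mean, population sd, sample size

# Standard deviation of the sample mean: sigma / sqrt(n).
sd_xbar = sigma / math.sqrt(n)

# Standardize the cutoff 1.75 and look up the standard normal probability.
z = (1.75 - mu) / sd_xbar
p = NormalDist().cdf(z)

print(round(sd_xbar, 3), round(z, 2), round(p, 4))
```

The software value differs slightly in the fourth decimal place from the table value because the table lookup rounds z to -1.67 first.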
Example: An educational researcher
selects a random sample of 400
students' scores from the population
of scores on a national exam. The
population mean is 485 points, and
its standard deviation is 80 points.
What is the probability that the
mean score of the sample is
greater than 500?
$P(\bar{x} > 500) = P\left(Z > \frac{500 - \mu}{\sigma/\sqrt{n}}\right)$
$= P\left(Z > \frac{500 - 485}{80/\sqrt{400}}\right)$
$= P\left(Z > \frac{15}{4}\right)$
$= P(Z > 3.75) \approx 0.0001$, which is 0.00 to two decimal places
Statistical Process Control
Statistical process control is a
collection of tools that, when
used together, can result in
process stability and reduced
variability.
A process is a chain of activities that
turns inputs into outputs.
The goal is to make a process stable
over time.
A control chart is a statistical tool that
monitors data over time.
A typical control chart:
A control chart has:
• A horizontal line called the centre line
(μ) around which the data vary.
• Two horizontal lines called control
limits: an upper control limit (UCL)
and a lower control limit (LCL) at
μ ± 3σ/√n.
• Any x̄ that does not fall between the
control limits gives evidence that the
process is out of control.
To make a control chart:
1. Take samples of size n from the
process at regular intervals. Plot
the means x̄ of these samples
against the order in which the
samples were taken.
2. We know the sampling distribution
of x̄ under the process-monitoring
conditions is normal with mean μ
and standard deviation σ/√n.
Draw a centre line at μ.
3. The 99.7 part of the 68-95-99.7 rule for normal distributions
says that, as long as the process
remains in control, 99.7% of the
values of x̄ will fall between
μ ± 3(σ/√n). Draw the control limits at these values.
Example: In a study of voter turnout,
15 people of voting age are randomly
selected. The mean numbers who
actually voted are listed below.
Construct a control chart and
determine whether the process is
within statistical control. It is known
that the mean number of voters is
468.73 and the standard deviation is
83.
Mean numbers who voted (in order):
608, 466, 552, 382, 536, 372, 526, 398,
531, 364, 501, 365, 551, 388, 491
LCL: μ – 3σ/√n = 468.73 – 3(83/√15) = 404.44
UCL: μ + 3σ/√n = 468.73 + 3(83/√15) = 533.02
CL = μ = 468.73
CONTROL CHART
[Figure: control chart of the 15 values plotted in order (horizontal axis 1 to 15, vertical axis 350 to 650)]
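The control limits and the out-of-control points can be worked out in code. This sketch evaluates μ ± 3σ/√n with the values stated in the example (μ = 468.73, σ = 83, n = 15); the variable names are chosen here for illustration.

```python
import math

# Values from the example, in the order they were recorded.
values = [608, 466, 552, 382, 536, 372, 526, 398,
          531, 364, 501, 365, 551, 388, 491]
mu, sigma = 468.73, 83
n = len(values)

# Control limits: centre line plus or minus 3 * sigma / sqrt(n).
half_width = 3 * sigma / math.sqrt(n)
lcl = mu - half_width
ucl = mu + half_width

# Flag every point that falls outside the control limits.
out_of_control = [(i, v) for i, v in enumerate(values, start=1)
                  if not lcl <= v <= ucl]

print(round(lcl, 2), round(ucl, 2))
print(out_of_control)
```

Any flagged point is evidence that the process is out of control, so a non-empty list means the process is not within statistical control.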
Binomial Distribution
We want a probability model for a
count of successful outcomes.
The binomial distribution is a
common model.
Binomial Setting
1. There is a fixed number n of
observations.
2. The n observations are all
independent: knowing the result of one tells
nothing about another.
3. Each observation falls into
one of only two categories –
success or failure.
4. The probability of success, p,
is the same for each
observation.
Example: Toss a coin 4 times and
count the number of heads.
X = # of heads
X is a random variable
The distribution of the count x of
successes in the binomial setting is
the binomial distribution with
parameters n and p.
n is the number of observations
p is the probability of a success on
any one observation (trial).
The count x can take on values from
0 to n.
Warning – not all counts have a
binomial distribution. All of conditions 1–4
must be satisfied.
Example: A couple has 4 kids. What
is the probability that the first kid
(x = 1) is a boy?
P(boy) = 0.5
Binomial Probabilities
Binomial Coefficient
The number of ways of arranging x
successes among n observations is
given by the binomial coefficient
$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$
• for x = 0, 1, …, n
• $\binom{n}{x}$ is called "n choose x"
• n! indicates a factorial
For any positive whole number n, its
factorial (n!) is:
n! = n*(n-1)*(n-2)*…*3*2*1
Example:
2! = (2)*(1) = 2
4! = (4)*(3)*(2)*(1) = 24
Note: 0! = 1
• If doing this by hand, many factors
will cancel out.
Example:
$\binom{6}{2} = \frac{6!}{2!\,(6-2)!} = \frac{6!}{2!\,4!} = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(2 \cdot 1)(4 \cdot 3 \cdot 2 \cdot 1)} = 15$
Also note that $\binom{n}{x}$ is not the fraction n/x.
$\binom{n}{x}$ counts the number of different
ways that x successes can be arranged
among n observations.
$\binom{5}{4} = \frac{5!}{4!\,1!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(4 \cdot 3 \cdot 2 \cdot 1) \cdot 1} = 5$
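The binomial coefficient is easy to check in code. The sketch below writes out the factorial formula in a helper called `choose` (a name chosen here) and compares it with Python's built-in `math.comb` (Python 3.8 or later).

```python
import math

def choose(n, x):
    """Binomial coefficient n! / (x! (n - x)!), computed from factorials."""
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))

six_choose_two = choose(6, 2)
five_choose_four = choose(5, 4)

print(six_choose_two, five_choose_four)
print(math.comb(6, 2), math.comb(5, 4))   # built-in equivalent
```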
Binomial Probability
If x has the binomial distribution with n observations and
probability p of success on each observation, the
possible values of x are 0, 1, 2, …, n. For any one of
those values,
$P(x) = \binom{n}{x} p^x (1-p)^{n-x}$
Example: with n = 5 and p = 0.6,
$P(x = 4) = \binom{5}{4} (0.6)^4 (1 - 0.6)^{5-4} = 0.2592$
Example 5.22: A factory employs
several thousand workers, of
whom 30% are Hispanic. If the 15
members of the union executive
committee were chosen from the
workers at random, the number of
Hispanics on the committee would
have the binomial distribution with
n=15 and p=0.3.
a. What is the probability that
exactly 3 members of the
committee are Hispanic?
b. What is the probability that 3
or fewer members of the
committee are Hispanic?
Answer:
a. P(x = 3), where p = 0.3 and n = 15
$P(x) = \binom{n}{x} p^x (1-p)^{n-x}$
$P(x = 3) = \binom{15}{3} (0.3)^3 (1 - 0.3)^{15-3}$
$\binom{15}{3} = \frac{15!}{3!\,12!} = \frac{15 \cdot 14 \cdot 13 \cdot 12!}{(3 \cdot 2 \cdot 1) \cdot 12!} = 455$
P(x = 3) = 455 * 0.027 * 0.014 = 0.17
Therefore the probability that exactly 3
members of the committee are
Hispanic is 0.17, or 17%.
b. $P(x \le 3) = P(x = 3) + P(x = 2) + P(x = 1) + P(x = 0)$
$= \binom{15}{3}(0.3)^3(1-0.3)^{15-3} + \dots + \binom{15}{0}(0.3)^0(1-0.3)^{15-0}$
= 0.0047 + 0.0305 + 0.0916 + 0.17
= 0.2968
Therefore the probability that three or fewer members of
the committee are Hispanic is 0.2968, or about 30%.
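The binomial probabilities for the committee example can be computed directly from the formula. A minimal sketch; `binom_pmf` and `binom_cdf` are helper names chosen here, not functions from a library.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for a binomial count: C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x): add the probabilities of 0, 1, ..., x successes."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

p_exactly_3 = binom_pmf(3, 15, 0.3)   # part a
p_at_most_3 = binom_cdf(3, 15, 0.3)   # part b

print(round(p_exactly_3, 4), round(p_at_most_3, 4))
```

Small differences in the last decimal place from the hand calculation come from rounding the intermediate terms.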
BINOMIAL MEAN AND STANDARD
DEVIATION
If a count x has the binomial
distribution with n observations and
probability of success p, the mean
and standard deviation of x are:
$\mu = np$
$\sigma = \sqrt{np(1-p)}$
Note: X has a binomial distribution
with parameters n and p.
Example: Previous example where
n = 15, p = 0.3
So, we have
$\mu = 15 \times 0.3 = 4.5$
$\sigma = \sqrt{15 \times 0.3 \times (1 - 0.3)} = \sqrt{3.15} \approx 1.77$
Note: these formulas only work for the
binomial distribution!
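A short sketch that evaluates the binomial mean and standard deviation formulas for n = 15 and p = 0.3. The function name is chosen here for illustration.

```python
import math

def binomial_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*(1-p)) of a binomial count."""
    return n * p, math.sqrt(n * p * (1 - p))

mean, sd = binomial_mean_sd(15, 0.3)
print(round(mean, 2), round(sd, 2))
```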
Sample Proportions
In statistical sampling we often want to
estimate the proportion p of "successes"
in a population. Our estimator is the
sample proportion of successes:
$\hat{p} = \frac{\text{count of successes in the sample}}{\text{count of observations in the sample}} = \frac{X}{n}$
• Be careful to distinguish between the
proportion p̂ and the count x. The
count takes whole-number values
between 0 and n, but a proportion is
always a number between 0 and 1.
• In the binomial setting, the count x
has a binomial distribution. The
proportion p̂ does not have a
binomial distribution. We can,
however, do probability calculations
about p̂ by restating them in terms of
the count x and using binomial
methods.
The mean and standard deviation of
p̂ are:
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
Note: we will use this formula for the standard deviation when the population
is at least 20 times as large as the
sample.
• p̂ in an SRS is an unbiased estimator
of the population proportion p.
• The variability of p̂ about its mean
decreases as the sample size
increases. Therefore a sample
proportion from a large sample will
usually be quite close to the
population proportion p.
• The √n in the denominator means
the sample size must be multiplied
by 4 if we wish to divide the standard
deviation in half.
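The effect of sample size on the spread of p̂ can be sketched numerically. The values p = 0.3 and n = 15 are borrowed from the committee example, and n = 60 is four times that sample size; the function name is an illustrative choice.

```python
import math

def proportion_mean_sd(p, n):
    """Mean p and standard deviation sqrt(p(1-p)/n) of the sample proportion."""
    return p, math.sqrt(p * (1 - p) / n)

mean_15, sd_15 = proportion_mean_sd(0.3, 15)
mean_60, sd_60 = proportion_mean_sd(0.3, 60)   # sample size multiplied by 4

print(round(sd_15, 4), round(sd_60, 4), round(sd_15 / sd_60, 4))
```

The mean stays at p for both sample sizes, while the standard deviation is cut in half when n is multiplied by 4.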
A continuous distribution that we have
but do not discuss much
in this course is the uniform distribution.
The uniform distribution looks like:
• The density curve has a constant height over a
given interval (height 1 on the interval from 0 to 1) and a height of 0 elsewhere.
• The area under the density curve is 1: on the interval from 0 to 1 it is
the area of a square with base 1 and
height 1. The probability of any event
is the area under the density curve
and above the event in question.
Example: Many random number
generators allow users to specify the
range of the random numbers to be
produced. Suppose that you specify
that the range is to be all numbers
between 0 and 2. Call the random
number generated Y. Then the density
curve of the random variable Y has a
constant height between 0 and 2 and
a height of 0 elsewhere.
A. What is the height of the density
curve between 0 and 2?
Height = 1 / (2 – 0)
= ½
B. What is P(Y ≤ 1)?
P(Y ≤ 1) = (1 – 0) * ½
= ½
C. What is P(Y ≥ 0.8)?
P(Y ≥ 0.8) = (2 – 0.8) * ½
= 0.6
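For a uniform distribution, a probability is just the area of a rectangle: width times height. The sketch below encodes that idea; `uniform_prob` is a helper name chosen here for illustration.

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= Y <= hi) for Y uniform on [a, b]: overlap width times density height."""
    height = 1 / (b - a)                               # constant height so total area is 1
    overlap = max(0.0, min(hi, b) - max(lo, a))        # part of [lo, hi] inside [a, b]
    return overlap * height

height = 1 / (2 - 0)                   # part A
p_le_1 = uniform_prob(0, 1, 0, 2)      # part B: P(Y <= 1)
p_ge_08 = uniform_prob(0.8, 2, 0, 2)   # part C: P(Y >= 0.8)

print(height, p_le_1, round(p_ge_08, 4))
```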