Probability Plots
A simple way to assess the fit of a particular probability distribution to a data set is to superimpose the
p.d.f. of the distribution on a relative frequency histogram of the data.
A better method uses a graph which plots quantiles of the proposed distribution against the
corresponding quantiles of the data set.
Defn: The pth quantile of a data set is the smallest number such that the fraction of the data values less
than that number is p.
Defn: The pth quantile of the distribution of a continuous r.v. X is the smallest number x such that F(x)
= p.
Defn: For a random sample of size n, consisting of observed values x1, x2, …, xn, the ith order statistic
is the ith data value when the data values are ordered from smallest to largest.
i  0.5
The cumulative relative frequency associated with the ith order statistic is
.
n
The general procedure for constructing a probability plot (or a quantile-quantile plot) is as follows:
1) Sort the data in ascending order.
2) For the sample size n, calculate the cumulative relative frequencies.
3) Invert the assumed distribution function to find the quantiles associated with the cumulative
relative frequencies.
4) Do a scatterplot of the order statistics of the data v. the quantiles of the distribution.
Constructing a normal probability plot
If we have a set of data consisting of observed values x1, x2, …, xn, and we want to decide whether it is
reasonable to assume that the data were sampled from a normal distribution, we proceed as follows:
1) Sort the data from smallest to largest, yielding the order statistics x(1), x(2), …, x(n).
2) Calculate the standardized normal scores z(i) = Φ⁻¹((i − 0.5)/n), for each i = 1, 2, …, n, using Table 1 in Appendix A, where Φ⁻¹ is the inverse standard normal c.d.f.
3) Plot the order statistics of the data set against the corresponding standardized normal scores on regular graph paper.
If the plotted points lie near a straight line, then it is reasonable to assume that the data were sampled
from a normal distribution.
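The steps above can be sketched in a few lines of Python using only the standard library; the data values here are made up for illustration.

```python
# Minimal sketch of a normal probability plot calculation (hypothetical data).
from statistics import NormalDist

data = [3.1, 4.7, 2.2, 5.0, 3.8, 4.1, 2.9, 4.4]   # hypothetical sample

x_sorted = sorted(data)                            # order statistics x(1), ..., x(n)
n = len(x_sorted)
probs = [(i - 0.5) / n for i in range(1, n + 1)]   # cumulative relative frequencies
z = [NormalDist().inv_cdf(p) for p in probs]       # standardized normal scores

# Plotting x_sorted against z (on graph paper or with any plotting tool)
# and checking for linearity completes the probability plot.
for zi, xi in zip(z, x_sorted):
    print(f"z = {zi:6.3f}   x = {xi:4.1f}")
```

`NormalDist().inv_cdf` plays the role of Table 1 in Appendix A: it returns the standard normal quantile for a given cumulative probability.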
Example: p. 79, 3-63
A) Exponential Distribution
If X is a continuous r.v. which has an exponential distribution with mean μ, then the c.d.f. of the distribution is F(x) = 1 − exp(−x/μ), for x > 0, and F(x) = 0, for x ≤ 0.
Then the pth quantile of the distribution (for 0 < p < 1) is given by
p = 1 − exp(−x/μ), implying x = −μ ln(1 − p).
The calculated quantiles for the distribution would be found from
Q(i) = −x̄ ln(1 − (i − 0.5)/n), for i = 1, 2, …, n.
Example: The lifetimes (in years) of 10 cell phones are given as 4.23, 1.89, 10.52, 6.46, 8.32, 8.60, 0.41, 0.91, 2.66, 35.71. Is it reasonable to assume that X, the lifetime of a randomly selected cell phone, has an exponential distribution? The order statistics and quantiles are given in the table below; the sample mean is 7.971 years:
 i    x(i)    (i − 0.5)/n    Q(i)
 1     0.41       0.05        0.41
 2     0.91       0.15        1.30
 3     1.89       0.25        2.29
 4     2.66       0.35        3.43
 5     4.23       0.45        4.76
 6     6.46       0.55        6.36
 7     8.32       0.65        8.37
 8     8.60       0.75       11.05
 9    10.52       0.85       15.12
10    35.71       0.95       23.88
We do a scatterplot of the order statistics v. the quantiles of the distribution:
[Figure: Exponential Probability Plot for Cell Phone Data; ordered lifetimes (years) v. exponential quantiles]
Note that the one large observation lies far from the line.
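The exponential quantiles in the table can be reproduced with a short standard-library sketch; the sample mean of 7.971 years is taken from the example and used as the estimate of μ.

```python
# Sketch: exponential quantiles Q(i) = -xbar * ln(1 - (i - 0.5)/n)
# for the cell phone lifetime example (n = 10, xbar = 7.971 years).
import math

xbar = 7.971   # sample mean lifetime in years (estimate of the mean mu)
n = 10

Q = [-xbar * math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]
print([round(q, 2) for q in Q])   # first value 0.41, last value 23.88
```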
B) Lognormal Distribution
To construct a q-q plot for a data set believed to be sampled from a lognormal distribution, we recall
that a continuous r.v. X has a lognormal distribution if W = ln(X) has a normal distribution. Hence we
can simply take the logs of the data values and do a normal probability plot to assess the fit of the data.
Example: We found that the cell phone lifetime data did not quite fit an exponential distribution.
Perhaps we could try a lognormal distribution instead. The following table has the order statistics for
the logs of the data values, and the quantiles of a standard normal distribution.
 i    w(i)    (i − 0.5)/n    Q(i)
 1   -0.89       0.05       -1.64
 2   -0.09       0.15       -1.04
 3    0.64       0.25       -0.67
 4    0.98       0.35       -0.39
 5    1.44       0.45       -0.13
 6    1.87       0.55        0.13
 7    2.12       0.65        0.39
 8    2.15       0.75        0.67
 9    2.35       0.85        1.04
10    3.58       0.95        1.64
The scatterplot of the order statistics v. the quantiles of the standard normal distribution is shown
below. Since the data points all fall near a straight line, we may say that it is reasonable to assume that
the data were sampled from a lognormal distribution, or that the r.v. X (= lifetime of a randomly
selected cell phone) has a lognormal distribution.
[Figure: Lognormal Q-Q Plot of Cell Phone Data; log of ordered lifetime v. quantile of standard normal distribution]
C) Weibull Distribution
The c.d.f. for a Weibull(β, δ) distribution (shape β, scale δ) is
F(x) = 1 − exp[−(x/δ)^β], for x > 0, and F(x) = 0, for x ≤ 0. Then the pth quantile for the distribution is given by
p = 1 − exp[−(Q(p)/δ)^β], giving β ln Q(p) − β ln δ = ln[−ln(1 − p)].
Of course, we don’t know the values of the parameters, but it doesn’t matter. There should be a straight-line relationship between ln Q(p) and ln[−ln(1 − p)]. Hence if the data come from a Weibull distribution, then the graph of the log of the order statistics v. ln[−ln(1 − p)] should be a straight line.
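The plotting positions ln[−ln(1 − p)] depend only on the sample size, so they can be computed once per n; a sketch for the n = 14 case used in the example that follows:

```python
# Sketch: Weibull plotting positions ln(-ln(1 - p)) with p = (i - 0.5)/n.
import math

n = 14
p = [(i - 0.5) / n for i in range(1, n + 1)]
horiz = [math.log(-math.log(1 - pi)) for pi in p]
print([round(h, 2) for h in horiz])   # runs from -3.31 up to 1.20
```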
Example:
14 randomly selected identical coupons of steel were stress tested, and their breaking strengths were recorded (the order statistics are given in the table below). Also given in the table are the values of ln[−ln(1 − p)], with p = (i − 0.5)/14.
 i    ln[x(i)]    (i − 0.5)/14    ln[−ln(1 − (i − 0.5)/14)]
 1      3.53          0.036              -3.31
 2      3.97          0.107              -2.18
 3      4.43          0.179              -1.63
 4      4.67          0.250              -1.25
 5      4.92          0.321              -0.95
 6      5.05          0.393              -0.70
 7      5.17          0.464              -0.47
 8      5.33          0.536              -0.26
 9      5.42          0.607              -0.07
10      5.54          0.679               0.13
11      5.62          0.750               0.33
12      5.69          0.821               0.54
13      5.80          0.893               0.80
14      5.83          0.964               1.20
The scatterplot of the second column in the table v. the last column is shown on the next page. All of
the points lie near a straight line; hence it is reasonable to assume that the random variable X
(= breaking stress for randomly selected steel coupon) has a Weibull distribution.
[Figure: Weibull Q-Q Plot for Steel Breaking Stress; log of order statistic v. ln(−ln(1 − p))]
Distributions of Discrete Random Variables
For a discrete r.v. X, there is a finite or at most countably infinite number of possible values.
Defn: The probability distribution for a discrete r.v. X is a set of ordered pairs of numbers. In each
pair, the first number is a possible value x of X, and the second number is the probability that the r.v. X
assumes the value x.
Example: Our random experiment is to flip a fair coin twice. The r.v. X is defined to be the number of
heads that occur. The sample space of the experiment is S = { HH, HT, TH, TT }. Since the coin is
fair, each of these outcomes occurs with probability 0.25. The distribution of the r.v. X is given in the
table below:
x          0      1      2
P(X = x)   0.25   0.50   0.25
Defn: For a discrete r.v. X with possible values x1, x2, …, xn, the probability mass function (or p.m.f.) is f(xi) = P(X = xi), for i = 1, 2, …, n. The p.m.f. must satisfy two conditions which follow from Kolmogorov’s axioms:
1) f(x) ≥ 0 for all x, and
2) Σ f(xi) = 1, summing over i = 1, 2, …, n.
Mean and Variance of a Discrete Distribution
Defn: The mean, or expectation, or expected value, of a discrete r.v. X is given by μ = Σ xi f(xi), summing over i = 1, 2, …, n.
Defn: The variance of a discrete r.v. X is given by σ² = Σ (xi − μ)² f(xi). The standard deviation of X is just the square root of the variance.
Note: It is generally easier to calculate the variance using the equivalent formula σ² = Σ xi² f(xi) − μ².
Bernoulli Distribution
The simplest type of discrete distribution is one for which the r.v. has two possible values.
Defn: A discrete r.v. X is said to have a Bernoulli distribution with parameter p (X ~ Bernoulli(p)) if
there are exactly two possible values 0 and 1 of X, such that P(X = 1) = p, and P(X = 0) = 1 – p.
The mean of a Bernoulli distribution is given by
μ = Σ xi f(xi) = (1)(p) + (0)(1 − p) = p.
The variance of a Bernoulli distribution is given by
σ² = Σ xi² f(xi) − μ² = (1²)(p) + (0²)(1 − p) − p² = p(1 − p).
Example: Our random experiment is to flip a fair coin once. We define X = number of heads. Then X ~ Bernoulli(0.5). Then P(X = 1) = 0.5, and P(X = 0) = 0.5. The mean of this distribution is μ = 0.5. The standard deviation is σ = 0.5.
Binomial Distribution
Assume that instead of flipping the fair coin once, we flip it 10 times, and we define our discrete r.v. X
to be the number of heads. We want to be able to calculate probabilities associated with X, as well as
the mean and standard deviation.
Assume that we have n independent, identically distributed (i.i.d.) r.v.’s Y1, Y2, …, Yn, each of which
has a Bernoulli distribution.
Let X = Y1 + Y2 + … + Yn. Then X is said to have a binomial distribution.
Defn: A discrete r.v. X is said to have a binomial distribution with parameters n and p if
P(X = x) = C(n, x) p^x (1 − p)^(n − x), for x = 0, 1, …, n, where C(n, x) denotes the binomial coefficient.
Derivation of the Binomial Distribution:
A binomial experiment is a random experiment which satisfies the following conditions:
1) The experiment consists of a fixed number, n, of trials.
2) The trials are identical to each other, in that they are performed the same way.
3) The trials are independent of each other, meaning that the outcome of one trial does not give any
information about the outcome of any other trial.
4) Each trial has two possible outcomes, which we will call Success and Failure.
5) The probability of Success is the same, p, for each of the trials.
We let X = # of Successes in the n trials. The possible values of X are 0, 1, 2, 3, …, n. For a given x in {0, 1, 2, …, n}, what is P(X = x)?
One way that we can have exactly x Successes out of n trials is for the first x trials to result in Success and the remaining n − x trials to result in Failure. If the trials are independent (the outcome of one trial is unrelated to the outcome of any other trial), then
P(x Successes followed by n − x Failures) = p^x (1 − p)^(n − x).
Any other ordering of x Successes and n − x Failures will have the same probability of occurring. How many such orderings are there?
Defn: Given a set of n objects, the number of ways to choose a subset of x of the objects is given by the binomial coefficient:
C(n, x) = n!/(x!(n − x)!).
The number of different orderings of x Successes and n – x Failures is the same as the number of ways
of choosing x of the n trials to be Successes.
Hence, the probability that there will be exactly x Successes in n Bernoulli trials is given by:
P(X = x) = C(n, x) p^x (1 − p)^(n − x), for x = 0, 1, …, n.
The mean and variance of a binomial r.v. X may be found as follows: Since X is the sum of n independent, identical Bernoulli r.v.’s, the mean of X is just n times the mean of the Bernoulli distribution, i.e., μ = np. The variance of X is also just n times the variance of the Bernoulli distribution, i.e., σ² = np(1 − p).
Example: Let’s go back to our random experiment of flipping a fair coin 10 times. Let X = number of
heads that occur. Does this satisfy the conditions of being a binomial experiment?
The mean is μ = np = (10)(0.5) = 5. The variance is σ² = np(1 − p) = (10)(0.5)(0.5) = 2.5, and the standard deviation is σ = 1.5811.
We have P(X = 5) = C(10, 5)(0.5)^5(0.5)^5 = 0.24609375.
What about P(X ≤ 5)? P(X ≤ 5) = Σ C(10, x)(0.5)^x(0.5)^(10 − x), summing over x = 0, 1, …, 5, which equals 0.6230. Clearly the calculations can become tedious.
To find binomial probabilities using Excel: If X ~ binomial(n, p), and we want to find P(X ≤ x), then in the cell of the worksheet, enter
=BINOMDIST(x, n, p, TRUE).
In our example, we want to find P(X ≤ 5). In cell A1, we enter
=BINOMDIST(5,10,0.5,TRUE)
We get 0.6230.
If we want P(X = 5), we enter
=BINOMDIST(5,10,0.5,TRUE) – BINOMDIST(4,10,0.5,TRUE) We get 0.2461.
If we want P(X > 5), we enter
=1-BINOMDIST(5,10,0.5,TRUE)
We get 0.3770.
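The same three probabilities can be computed directly in Python with math.comb, as an alternative to the Excel calls; a minimal sketch:

```python
# Sketch: binomial probabilities for X ~ binomial(10, 0.5),
# mirroring the Excel BINOMDIST calculations.
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x), the analogue of BINOMDIST(x, n, p, TRUE)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

print(f"P(X <= 5) = {binom_cdf(5, 10, 0.5):.4f}")      # 0.6230
print(f"P(X = 5)  = {binom_pmf(5, 10, 0.5):.4f}")      # 0.2461
print(f"P(X > 5)  = {1 - binom_cdf(5, 10, 0.5):.4f}")  # 0.3770
```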
Examples of Binomial Experiments:
1) Assume that the date is October 1, 2008. We want to predict the outcome of the Presidential election. We will assume, for simplicity, that there are only two candidates, Joe Asinus and Judy Olifaunt. We select a simple random sample of 1000 voters, and ask each voter in the sample, “Do you intend to vote for Asinus or Olifaunt?” Let X = number of voters in the sample who plan to vote for Asinus.
2) A worn machine tool produces 1% defective parts. We select a simple random sample of 25 parts produced by this machine, and let X = number of defective parts in the sample.
3) I give a pop quiz to the class consisting of 10 multiple choice questions, each with four possible responses, only one of which is the correct response. A student has been goofing off all semester, and comes to class totally unprepared for the quiz. He decides to randomly guess the answer to each question. Let X = his score on the quiz.
4) It is known that of the entire population of adults in Florida, 5% have a certain blood type. We select a random sample of Florida adults and obtain blood samples to test. Let X = number of people in the sample who have the blood type.
5) On p. 88, there is a graph illustrating the random experiment of flipping a fair coin 20 times, counting the number of heads.
Poisson Distribution
This distribution provides a model for the occurrence of rare events over an interval of time, distance, or some other dimension.
Examples:
1) X = number of cars driving through an intersection in an hour.
2) X = number of accidents occurring at an intersection in a year.
3) X = number of alpha particles emitted by a sample of U-238 over a period of time.
The common characteristics of Poisson processes are these: We divide the interval of time (distance, etc.) into a large number of equal subintervals. Then:
1) The probability of more than one count occurring in a small subinterval is negligible (approximately 0);
2) The probability of exactly one count in a small subinterval is the same for all equal subintervals, and is proportional to the length of the subinterval;
3) The count in each small subinterval is independent of the counts in the other subintervals.
We let X = count of occurrences in the entire interval.
Defn: A discrete r.v. X is said to have a Poisson distribution with mean λ if the p.m.f. of the distribution is
f(x) = e^(−λ) λ^x / x!, for x = 0, 1, 2, 3, ….
The mean and variance of the distribution are
μ = E(X) = λ and V(X) = λ.
Note: We may derive the Poisson distribution as a limiting case of the binomial distribution with the
number of trials going to infinity and the probability of success on each trial going to 0 in such a way
that the mean of the distribution remains constant.
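The limiting-case note can be illustrated numerically: for a large n and p = λ/n, binomial probabilities are close to the corresponding Poisson probabilities. A minimal sketch (the choice λ = 2 and n = 10000 is arbitrary):

```python
# Sketch: Poisson p.m.f., plus a numerical check that binomial(n, lam/n)
# probabilities approach Poisson(lam) probabilities as n grows.
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2.0
n = 10_000
p = lam / n
binom_p2 = math.comb(n, 2) * p ** 2 * (1 - p) ** (n - 2)   # binomial P(X = 2)

print(f"Poisson P(X = 2)  = {poisson_pmf(2, lam):.4f}")
print(f"Binomial P(X = 2) = {binom_p2:.4f}")
```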
Example: p. 95, Exercise 3-95