Probability and Statistics
Lecture 7
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com
President University
Erwin Sitompul
PBST 7/1
Chapter 6
Some Continuous Probability Distributions
Chapter 6.1
Continuous Uniform Distribution
|Uniform Distribution| The density function of the continuous
uniform random variable X on the interval [A, B] is

$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A}, & A \le x \le B \\ 0, & \text{elsewhere} \end{cases}$$

The mean and variance of the uniform distribution are

$$\mu = \frac{A + B}{2} \quad \text{and} \quad \sigma^2 = \frac{(B - A)^2}{12}$$

[Figure: the uniform density function for a random variable on the interval [1, 3]]
Suppose that a large conference room for a certain company can be
reserved for no more than 4 hours. However, the use of the
conference room is such that both long and short conferences occur
quite often. In fact, it can be assumed that the length X of a conference
has a uniform distribution on the interval [0, 4].
(a) What is the probability density function?
(b) What is the probability that any given conference lasts at least 3
hours?

(a)
$$f(x) = \begin{cases} \dfrac{1}{4}, & 0 \le x \le 4 \\ 0, & \text{elsewhere} \end{cases}$$

(b)
$$P[X \ge 3] = \int_3^4 \frac{1}{4}\,dx = \frac{1}{4}$$
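The conference-room example can be checked numerically. The following is a minimal sketch, not part of the original slides; the helper names `uniform_pdf` and `uniform_cdf` are introduced here only for illustration.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """P(X <= x) for the continuous uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# Conference length X is uniform on [0, 4] hours.
a, b = 0.0, 4.0
density_inside = uniform_pdf(2.0, a, b)         # 1/4 anywhere inside [0, 4]
p_at_least_3 = 1.0 - uniform_cdf(3.0, a, b)     # P(X >= 3)
mean = (a + b) / 2                              # (A + B) / 2
variance = (b - a) ** 2 / 12                    # (B - A)^2 / 12

print(density_inside, p_at_least_3, mean, variance)
```

The probability comes out as 1/4, which agrees with the integral in part (b).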
Chapter 6.2
Normal Distribution
 Normal distribution is the most important continuous probability
distribution in the entire field of statistics.
 Its graph, called the normal curve, is the bell-shaped curve which
describes approximately many phenomena that occur in nature,
industry, and research.
 The normal distribution is often referred to as the Gaussian
distribution, in honor of Karl Friedrich Gauss, who also derived its
equation from a study of errors in repeated measurements of the
same quantity.
 The normal curve
A continuous random variable X having the bell-shaped distribution
as shown in the figure is called a normal random variable.
The density function of the normal random variable X, with mean μ
and variance σ², is

$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$

where π = 3.14159... and e = 2.71828...
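The density formula can be evaluated directly. The sketch below is an illustration added to the transcript: the function name `normal_pdf`, the parameter values, and the integration range of μ ± 8σ are assumptions chosen here. It compares the formula with Python's `statistics.NormalDist` and checks numerically that the total area is close to 1.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density n(x; mu, sigma) of a normal random variable."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Illustrative parameters (not taken from the slides).
mu, sigma = 0.0, 1.0

peak = normal_pdf(mu, mu, sigma)            # value at the mode x = mu
reference = NormalDist(mu, sigma).pdf(mu)   # standard-library value

# Midpoint-rule approximation of the area under the curve over mu +/- 8 sigma.
n_steps = 16000
lo, hi = mu - 8 * sigma, mu + 8 * sigma
h = (hi - lo) / n_steps
area = sum(normal_pdf(lo + (i + 0.5) * h, mu, sigma) for i in range(n_steps)) * h

print(peak, reference, area)
```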
Normal Curve
 μ1 < μ2, σ1 = σ2
 μ1 = μ2, σ1 < σ2
 μ1 < μ2, σ1 < σ2
[Figure: the normal curve f(x) plotted against x, annotated with the following properties]
The mode, the point where the curve is at its maximum, occurs at x = μ.
The curve is symmetric about a vertical axis through the mean μ.
The points of inflection lie at a distance σ on either side of μ; the curve is concave downward between them and concave upward outside them.
The curve approaches zero asymptotically in either direction away from the mean.
The total area under the curve and above the horizontal axis is equal to 1.
Chapter 6.3
Areas Under the Normal Curve
Area Under the Normal Curve
 The area under the curve bounded by two ordinates x = x1 and
x = x2 equals the probability that the random variable X assumes
a value between x = x1 and x = x2.

$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx$$

 As seen previously, the normal curve is dependent on the mean μ
and the standard deviation σ of the distribution under
investigation.
 The same interval of a random variable can deliver different
probabilities if μ or σ is different.
 Same interval, but different probabilities
for two different normal curves
 The difficulty encountered in solving integrals of normal density
functions necessitates the tabulation of normal curve area for
quick reference.
 Fortunately, we are able to transform all the observations of any
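normal random variable X to a new set of observations of a normal
random variable Z with mean 0 and variance 1, by means of the transformation

$$Z = \frac{X - \mu}{\sigma}$$

Whenever X lies between x_1 and x_2, Z lies between z_1 = (x_1 - μ)/σ and z_2 = (x_2 - μ)/σ, so that

$$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{z^2}{2}}\,dz = \int_{z_1}^{z_2} n(z; 0, 1)\,dz = P(z_1 < Z < z_2)$$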
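The equality between the area under the original curve and the area under the standard normal curve can be confirmed numerically. This is a small sketch using Python's `statistics.NormalDist`; the values μ = 50, σ = 10 and the interval (45, 62) are picked here for illustration, and `standardize` is a hypothetical helper name.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Transform an observation x of X into the corresponding z value."""
    return (x - mu) / sigma


# Illustrative parameters and interval.
mu, sigma = 50.0, 10.0
x1, x2 = 45.0, 62.0

X = NormalDist(mu, sigma)   # the original normal random variable
Z = NormalDist(0.0, 1.0)    # the standard normal random variable

# Area computed directly on X.
p_direct = X.cdf(x2) - X.cdf(x1)

# Area computed on Z after the transformation.
z1 = standardize(x1, mu, sigma)
z2 = standardize(x2, mu, sigma)
p_standard = Z.cdf(z2) - Z.cdf(z1)

print(z1, z2, p_direct, p_standard)
```

Both routes give the same probability, which is why a single table for Z is enough for any normal random variable.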
 The distribution of a normal random variable with mean 0 and
variance 1 is called a standard normal distribution.
Table A.3 Normal Probability Table
Interpolation
 Interpolation is a method of constructing new data points within
the range of a discrete set of known data points.
 Examine the following graph. Two data points are known, which
are (a,f(a)) and (b,f(b)).
 If a value of c is given, with a < c < b, then the value of f(c) can be
estimated.
 If a value of f(c) is given, with f(a) < f(c) < f(b), then the value of c
can be estimated.
The corresponding linear interpolation formulas are

$$f(c) = f(a) + \frac{c - a}{b - a}\,\bigl[f(b) - f(a)\bigr]$$

$$c = a + \frac{f(c) - f(a)}{f(b) - f(a)}\,(b - a)$$

[Figure: straight line through the known points (a, f(a)) and (b, f(b)), with the unknown f(c) or c marked between them]
 P(Z < 1.172)?
 P(Z < z) = 0.8700, z = ?
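Answers:
P(Z < 1.172) = 0.8790 + (0.002/0.01)(0.8810 - 0.8790) = 0.8794, using the table entries P(Z < 1.17) = 0.8790 and P(Z < 1.18) = 0.8810.
z = 1.12 + [(0.8700 - 0.8686)/(0.8708 - 0.8686)](0.01) = 1.126, using the table entries P(Z < 1.12) = 0.8686 and P(Z < 1.13) = 0.8708.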
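Both interpolation questions can be reproduced with a few lines of code. The sketch below is an added illustration: the function names are hypothetical, and the four table entries are the standard normal table values for z = 1.12, 1.13, 1.17 and 1.18 rounded to four decimals. The last line compares the interpolated area with the value computed by `statistics.NormalDist`.

```python
from statistics import NormalDist


def interpolate(a, fa, b, fb, c):
    """Estimate f(c) from the known points (a, f(a)) and (b, f(b))."""
    return fa + (c - a) / (b - a) * (fb - fa)


def inverse_interpolate(a, fa, b, fb, fc):
    """Estimate c from a given f(c), using the same two known points."""
    return a + (fc - fa) / (fb - fa) * (b - a)


# P(Z < 1.172), between the table entries for z = 1.17 and z = 1.18.
p = interpolate(1.17, 0.8790, 1.18, 0.8810, 1.172)

# z such that P(Z < z) = 0.8700, between the entries for z = 1.12 and z = 1.13.
z = inverse_interpolate(1.12, 0.8686, 1.13, 0.8708, 0.8700)

# Value computed without the table, for comparison.
p_computed = NormalDist().cdf(1.172)

print(round(p, 4), round(z, 3), p_computed)
```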
Given a standard normal distribution, find the area under the curve
that lies (a) to the right of z = 1.84 and (b) between z = –1.97 and
z = 0.86.

(a)
$$P(Z > 1.84) = 1 - P(Z < 1.84) = 1 - 0.9671 = 0.0329$$

(b)
$$P(-1.97 < Z < 0.86) = P(Z < 0.86) - P(Z < -1.97) = 0.8051 - 0.0244 = 0.7807$$
Given a standard normal distribution, find the value of k such that
(a) P(Z > k) = 0.3015, and (b) P(k < Z < –0.18) = 0.4197.

(a)
$$P(Z > k) = 1 - P(Z < k)$$
$$P(Z < k) = 1 - P(Z > k) = 1 - 0.3015 = 0.6985$$
$$k = 0.52$$

(b)
$$P(k < Z < -0.18) = P(Z < -0.18) - P(Z < k)$$
$$P(Z < k) = P(Z < -0.18) - P(k < Z < -0.18) = 0.4286 - 0.4197 = 0.0089$$
$$k = -2.37$$
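Table lookups of this kind, in both directions, can be cross-checked in software. The following sketch assumes Python 3.8 or later, where `statistics.NormalDist` provides `cdf` for areas and `inv_cdf` for the inverse lookup. It is an added illustration, and small differences from the table values are due to four-decimal rounding in the table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Areas for given z values.
area_right = 1 - Z.cdf(1.84)                 # area to the right of z = 1.84
area_between = Z.cdf(0.86) - Z.cdf(-1.97)    # area between z = -1.97 and z = 0.86

# z values for given areas.
k_a = Z.inv_cdf(1 - 0.3015)                  # P(Z > k) = 0.3015
k_b = Z.inv_cdf(Z.cdf(-0.18) - 0.4197)       # P(k < Z < -0.18) = 0.4197

print(round(area_right, 4), round(area_between, 4), round(k_a, 2), round(k_b, 2))
```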
Given a random variable X having a normal distribution with μ = 50
and σ = 10, find the probability that X assumes a value between 45
and 62.

$$z_1 = \frac{x_1 - \mu}{\sigma} = \frac{45 - 50}{10} = -0.5$$
$$z_2 = \frac{x_2 - \mu}{\sigma} = \frac{62 - 50}{10} = 1.2$$
$$P(45 < X < 62) = P(-0.5 < Z < 1.2) = P(Z < 1.2) - P(Z < -0.5) = 0.8849 - 0.3085 = 0.5764$$
Given that X has a normal distribution with μ = 300 and σ = 50, find
the probability that X assumes a value greater than 362.

$$z = \frac{x - \mu}{\sigma} = \frac{362 - 300}{50} = 1.24$$
$$P(X > 362) = P(Z > 1.24) = 1 - P(Z < 1.24) = 1 - 0.8925 = 0.1075$$
Given a normal distribution with μ = 40 and σ = 6, find the value of x
that has (a) 45% of the area to the left, and (b) 14% of the area to
the right.

(a)
$$P(Z < z) = 0.45$$
$$z = -0.13 + \frac{0.45 - 0.4483}{0.4522 - 0.4483}\,\bigl[-0.12 - (-0.13)\bigr] = -0.1256$$
$$x = \mu + z\sigma = 40 + (-0.1256)(6) = 39.2464$$

[Figure: interpolation between the table entries P(Z < –0.13) = 0.4483 and P(Z < –0.12) = 0.4522 to locate the z value with area 0.45 to its left]

(b)
$$P(Z > z) = 0.14 = 1 - P(Z < z)$$
$$P(Z < z) = 1 - 0.14 = 0.86 \;\Rightarrow\; z = 1.08$$
$$x = \mu + z\sigma = 40 + (1.08)(6) = 46.48$$
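The same "find x from an area" problem can be solved without a table. This is an added sketch that relies on `statistics.NormalDist.inv_cdf`; the results differ from the slide values only in the third decimal place because the slides use rounded table entries.

```python
from statistics import NormalDist

X = NormalDist(mu=40, sigma=6)

# (a) 45% of the area lies to the left of x.
x_a = X.inv_cdf(0.45)

# (b) 14% of the area lies to the right of x, so 86% lies to the left.
x_b = X.inv_cdf(1 - 0.14)

# The same result through the standard normal variable: x = mu + z * sigma.
z_a = NormalDist().inv_cdf(0.45)
x_a_via_z = 40 + z_a * 6

print(round(x_a, 2), round(x_b, 2))
```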
Chapter 6.4
Applications of the Normal Distribution
A certain type of storage battery lasts, on average, 3.0 years, with a
standard deviation of 0.5 year. Assuming that the battery lives are
normally distributed, find the probability that a given battery will last
less than 2.3 years.

$$z = \frac{x - \mu}{\sigma} = \frac{2.3 - 3.0}{0.5} = -1.4$$
$$P(X < 2.3) = P(Z < -1.4) = 0.0808 = 8.08\%$$
In an industrial process the diameter of a ball bearing is an
important component part. The buyer sets specifications on the
diameter to be 3.0 ± 0.01 cm. All parts falling outside these
specifications will be rejected.
It is known that in the process the diameter of a ball bearing has a
normal distribution with mean 3.0 and standard deviation 0.005.
On the average, how many manufactured ball bearings will be
scrapped?

$$z_1 = \frac{x_1 - \mu}{\sigma} = \frac{2.99 - 3.0}{0.005} = -2$$
$$z_2 = \frac{x_2 - \mu}{\sigma} = \frac{3.01 - 3.0}{0.005} = 2$$
$$P(2.99 < X < 3.01) = P(-2 < Z < 2) = P(Z < 2) - P(Z < -2) = 0.9772 - 0.0228 = 0.9544$$

Thus 95.44% of the ball bearings are accepted and, on average, 4.56% are rejected.
A certain machine makes electrical resistors having a mean
resistance of 40 Ω and a standard deviation of 2 Ω. It is assumed
that the resistance follows a normal distribution.
What percentage of resistors will have a resistance exceeding 43 Ω
if:
(a) the resistance can be measured to any degree of accuracy.
(b) the resistance can be measured to the nearest ohm only.

(a)
$$z = \frac{43 - 40}{2} = 1.5$$
$$P(X > 43) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.9332 = 0.0668 = 6.68\%$$

(b) When the resistance is measured to the nearest ohm, a reading exceeds 43 Ω only if the true resistance is above 43.5 Ω.
$$z = \frac{43.5 - 40}{2} = 1.75$$
$$P(X > 43.5) = P(Z > 1.75) = 1 - P(Z < 1.75) = 1 - 0.9599 = 0.0401 = 4.01\%$$

As many as 6.68% - 4.01% = 2.67% of the resistors will be accepted although their resistance is greater than 43 Ω, because of the measurement limitation.
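The effect of the measurement resolution can be reproduced in a few lines. This is an added sketch using `statistics.NormalDist`, with variable names chosen here for illustration.

```python
from statistics import NormalDist

R = NormalDist(mu=40, sigma=2)   # resistance in ohms

# (a) Resistance measured to any degree of accuracy.
p_exact = 1 - R.cdf(43)

# (b) Resistance measured to the nearest ohm: a reading above 43
#     requires a true resistance above 43.5.
p_nearest_ohm = 1 - R.cdf(43.5)

# Share of resistors above 43 ohms that still read as 43 or less.
p_missed = p_exact - p_nearest_ohm

print(f"{p_exact:.2%} {p_nearest_ohm:.2%} {p_missed:.2%}")
```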
The average grade for an exam is 74, and the standard deviation is
7. If 12% of the class are given A’s, and the grades are curved to
follow a normal distribution, what is the lowest possible A and the
highest possible B?

$$P(Z > z) = 0.12$$
$$P(Z < z) = 1 - P(Z > z) = 1 - 0.12 = 0.88 \;\Rightarrow\; z = 1.175$$
$$x = \mu + z\sigma = 74 + (1.175)(7) = 82.225$$

The lowest possible A is 83 and the highest possible B is 82.
Chapter 6.5
Normal Approximation to the Binomial
 The probabilities associated with binomial experiments are readily
obtainable from the formula b(x;n, p) of the binomial distribution
or from the table when n is small.
 For large n, making the distribution table is not practical anymore.
 Nevertheless, the binomial distribution can be nicely approximated
by the normal distribution under certain circumstances.
 If X is a binomial random variable with mean μ = np and variance
σ² = npq, then the limiting form of the distribution of

$$Z = \frac{X - np}{\sqrt{npq}}$$

as n → ∞, is the standard normal distribution n(z; 0, 1).
[Figure: normal approximation of b(x; 15, 0.4)]
Each value of b(x; 15, 0.4) is approximated by P(x – 0.5 < X < x + 0.5).
Exact binomial probability:
$$P(X = 4) = b(4; 15, 0.4) = {}_{15}C_4\,(0.4)^4 (0.6)^{11} = 0.1268$$

Normal approximation, with
$$\mu = np = (15)(0.4) = 6, \qquad \sigma = \sqrt{npq} = \sqrt{(15)(0.4)(0.6)} = 1.897$$
$$P(X = 4) \approx P(3.5 < X < 4.5) = P(-1.32 < Z < -0.79) = 0.1214$$

Exact binomial probability:
$$P(7 \le X \le 9) = \sum_{x=7}^{9} b(x; 15, 0.4) = 0.3564$$

Normal approximation:
$$P(7 \le X \le 9) \approx P(6.5 < X < 9.5) = P(0.26 < Z < 1.85) = 0.3652$$

[Figure: normal approximation of b(4; 15, 0.4) and of the sum of b(x; 15, 0.4) for x = 7 to 9]
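The comparison between exact binomial probabilities and their normal approximations can be repeated in code. The sketch below is an added illustration; `binom_pmf` and `normal_approx` are hypothetical helper names, and the approximation applies the ±0.5 continuity correction described above. Because the code does not round z to two decimals, its approximations differ slightly from the slide values in the third decimal place.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def normal_approx(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) for binomial X with continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    Z = NormalDist()
    return Z.cdf((hi + 0.5 - mu) / sigma) - Z.cdf((lo - 0.5 - mu) / sigma)


n, p = 15, 0.4

exact_4 = binom_pmf(4, n, p)
approx_4 = normal_approx(4, 4, n, p)

exact_7_to_9 = sum(binom_pmf(x, n, p) for x in range(7, 10))
approx_7_to_9 = normal_approx(7, 9, n, p)

print(round(exact_4, 4), round(approx_4, 4))
print(round(exact_7_to_9, 4), round(approx_7_to_9, 4))
```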
 The degree of accuracy, that is how well the normal curve fits the
binomial histogram, will increase as n increases.
 If the value of n is small and p is not very close to 1/2, normal
curve will not fit the histogram well, as shown in the figures.
[Figures: histograms of b(x; 6, 0.2) and b(x; 15, 0.2) with the approximating normal curves]
The approximation using the normal curve will be excellent when n is
large, or when n is small with p reasonably close to 1/2.
As a rule of thumb, if both np and nq are greater than or equal to 5,
the approximation will be good.
 Let X be a binomial random variable with parameters n and p. For
large n, X has approximately a normal distribution with μ = np and
σ² = npq = np(1 – p), and

$$P(X \le x) = \sum_{k=0}^{x} b(k; n, p)$$
$$\approx \text{area under the normal curve to the left of } x + 0.5$$
$$= P(X \le x + 0.5) = P\!\left(Z \le \frac{(x + 0.5) - \mu}{\sigma}\right)$$

and the approximation will be good if np and nq = n(1 – p) are
greater than or equal to 5.
The probability that a patient recovers from a rare blood disease is
0.4. If 100 people are known to have contracted this disease, what is
the probability that fewer than 30 survive?

$$n = 100, \quad p = 0.4$$
$$\mu = np = (100)(0.4) = 40, \qquad \sigma = \sqrt{npq} = \sqrt{(100)(0.4)(0.6)} = 4.899$$
$$P(X < 30) = \sum_{x=0}^{29} b(x; 100, 0.4) \approx P(X < 29.5)$$
$$z = \frac{29.5 - 40}{4.899} = -2.143$$
$$P(X < 30) \approx P(Z < -2.143) = 0.01608 = 1.608\% \quad \text{(after interpolation)}$$

Can you calculate the exact solution?
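One possible way to approach the exact solution is to sum the binomial probabilities directly, which is easy in software even for n = 100. The following is an added sketch, not the slide author's solution; it also recomputes the normal approximation for comparison.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.4

# Exact: P(X < 30) = sum of b(x; 100, 0.4) for x = 0, 1, ..., 29.
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(30))

# Normal approximation with continuity correction: P(X < 29.5).
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist().cdf((29.5 - mu) / sigma)

print(f"exact = {exact:.5f}, normal approximation = {approx:.5f}")
```

Printing both values side by side shows how close the approximation is for this choice of n and p.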
A multiple-choice quiz has 200 questions, each with 4 possible
answers of which only 1 is the correct answer. What is the probability
that sheer guesswork yields from 25 to 30 correct answers for 80 of
the 200 problems about which the student has no knowledge?

$$n = 80, \quad p = \tfrac{1}{4}$$
$$\mu = np = (80)(\tfrac{1}{4}) = 20, \qquad \sigma = \sqrt{npq} = \sqrt{(80)(\tfrac{1}{4})(\tfrac{3}{4})} = 3.873$$
$$z_1 = \frac{24.5 - 20}{3.873} = 1.162, \qquad z_2 = \frac{30.5 - 20}{3.873} = 2.711$$
$$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, \tfrac{1}{4}) \approx P(24.5 < X < 30.5) = P(1.162 < Z < 2.711)$$
$$= P(Z < 2.711) - P(Z < 1.162) = 0.9966 - 0.8774 = 0.1192$$
The PU Physics entrance exam consists of 30 multiple-choice questions,
each with 4 possible answers of which only 1 is the correct answer.
What is the probability that a prospective student will obtain a
scholarship by correctly answering at least 80% of the questions just
by guessing?

$$n = 30, \quad p = \tfrac{1}{4}$$
$$\mu = np = (30)(\tfrac{1}{4}) = 7.5, \qquad \sigma = \sqrt{npq} = \sqrt{(30)(\tfrac{1}{4})(\tfrac{3}{4})} = 2.372$$
$$P(X \ge 24) = \sum_{x=24}^{30} b(x; 30, \tfrac{1}{4}) \approx 1 - P(X < 23.5)$$
$$z = \frac{23.5 - 7.5}{2.372} = 6.745$$
$$P(X \ge 24) \approx 1 - P(Z < 6.745) \approx 0$$

It is practically impossible to get a scholarship just by pure luck in the entrance exam.
Chapter 6.6
Gamma and Exponential Distributions
 There are still numerous situations that the normal distribution
cannot cover. For such situations, different types of density
functions are required.
 Two such density functions are the gamma and exponential
distributions.
 Both distributions find applications in queuing theory and reliability
problems.
The gamma function is defined by

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx \quad \text{for } \alpha > 0.$$
|Gamma Distribution| The continuous random variable X has a
gamma distribution, with parameters α and β, if its density
function is given by

$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where α > 0 and β > 0.

|Exponential Distribution| The continuous random variable X
has an exponential distribution, with parameter β, if its density
function is given by

$$f(x) = \begin{cases} \dfrac{1}{\beta}\, e^{-x/\beta}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where β > 0.
 Gamma distributions for certain values of
the parameters α and β
 The gamma distribution with α = 1 is called
the exponential distribution
The mean and variance of the gamma distribution are

$$\mu = \alpha\beta \quad \text{and} \quad \sigma^2 = \alpha\beta^2$$

The mean and variance of the exponential distribution are

$$\mu = \beta \quad \text{and} \quad \sigma^2 = \beta^2$$
Chapter 6.7
Applications of the Gamma and Exponential Distributions
Applications
Suppose that a system contains a certain type of component whose
time in years to failure is given by T. The random variable T is
modeled nicely by the exponential distribution with mean time to
failure β = 5.
If 5 of these components are installed in different systems, what is
the probability that at least 2 are still functioning at the end of 8
years?

The probability that a given component is still functioning at the end of 8 years is

$$P(T > 8) = \frac{1}{5}\int_8^{\infty} e^{-t/5}\,dt = e^{-8/5} \approx 0.2$$

Let X be the number of components still functioning at the end of 8 years. The probability that at least 2 out of 5 such components are still functioning at the end of 8 years is

$$P(X \ge 2) = \sum_{x=2}^{5} b(x; 5, 0.2) = 1 - \sum_{x=0}^{1} b(x; 5, 0.2) = 1 - 0.7373 = 0.2627$$
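The two-stage calculation, an exponential survival probability fed into a binomial count, can be sketched as follows. This is an added illustration with hypothetical variable names; it follows the slide in rounding the survival probability to 0.2, and also shows the result when the unrounded value is carried through.

```python
from math import comb, exp


def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


beta = 5.0   # mean time to failure, in years

# Stage 1: probability that one component survives beyond 8 years.
p_survive = exp(-8.0 / beta)
p_rounded = round(p_survive, 1)   # rounded to one decimal, as on the slide

# Stage 2: probability that at least 2 of 5 components survive.
p_at_least_2 = 1 - sum(binom_pmf(x, 5, p_rounded) for x in range(2))

# Same calculation without rounding the survival probability.
p_at_least_2_unrounded = 1 - sum(binom_pmf(x, 5, p_survive) for x in range(2))

print(round(p_survive, 4), round(p_at_least_2, 4), round(p_at_least_2_unrounded, 4))
```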
Suppose that telephone calls arriving at a particular switchboard
follow a Poisson process with an average of 5 calls coming per
minute.
What is the probability that up to a minute will elapse until 2 calls
have come in to the switchboard?

The time X until 2 calls have come in follows a gamma distribution with

$$\beta = \tfrac{1}{5}, \quad \alpha = 2$$

where β is the mean time between calls and α is the number of calls being waited for.

$$P(X \le x) = \int_0^x \frac{1}{\beta^2}\, t\, e^{-t/\beta}\,dt$$
$$P(X \le 1) = 25\int_0^1 t\, e^{-5t}\,dt = 1 - e^{-5(1)}(1 + 5) = 0.96$$
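The gamma probability can be verified by integrating the density numerically and comparing with the closed-form result. The sketch below is an added illustration: `gamma_pdf` and `gamma_cdf` are hypothetical helper names, and the midpoint rule with 20000 steps is simply one convenient choice of integration scheme.

```python
from math import exp, gamma


def gamma_pdf(x, alpha, beta):
    """Density of the gamma distribution with parameters alpha and beta."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * exp(-x / beta) / (beta**alpha * gamma(alpha))


def gamma_cdf(x, alpha, beta, steps=20000):
    """P(X <= x), approximated with the midpoint rule on [0, x]."""
    h = x / steps
    return sum(gamma_pdf((i + 0.5) * h, alpha, beta) for i in range(steps)) * h


alpha, beta = 2, 1 / 5   # waiting for 2 calls, mean time between calls 1/5 minute

p_numeric = gamma_cdf(1.0, alpha, beta)
p_closed_form = 1 - exp(-5) * (1 + 5)

print(round(p_numeric, 4), round(p_closed_form, 4))
```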
Based on extensive testing, it is determined that the average time
Y before a washing machine requires a major repair is 4 years. This
time is known to be modeled well by the exponential
distribution. The machine is considered a bargain if it is unlikely to
require a major repair before the sixth year.
(a) Determine the probability that it survives more than 6 years
without a major repair.
(b) What is the probability that a major repair occurs in the first
year?

(a)
$$P(Y > 6) = \frac{1}{4}\int_6^{\infty} e^{-t/4}\,dt = e^{-6/4} = 0.223$$
Only 22.3% survive more than 6 years without a major repair.

(b)
$$P(Y < 1) = \frac{1}{4}\int_0^1 e^{-t/4}\,dt = 1 - \frac{1}{4}\int_1^{\infty} e^{-t/4}\,dt = 1 - e^{-1/4} = 0.221$$
22.1% will need a major repair within the first year of use.
Chapter 6.8
Chi-Squared Distribution
 Another very important special case of the gamma distribution is
obtained by letting α = v/2 and β = 2, where v is a positive
integer.
 The result is called the chi-squared distribution, with a single
parameter v called the degrees of freedom.
 The chi-squared distribution plays a vital role in statistical
inference. It has considerable application in both methodology and
theory.
 Many chapters ahead of us will contain important applications of
this distribution.
 |Chi-Squared Distribution| The continuous random variable X
has a chi-squared distribution, with v degrees of freedom, if its
density function is given by

$$f(x) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{v/2 - 1} e^{-x/2}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where v is a positive integer.
The mean and variance of the chi-squared distribution are

$$\mu = v \quad \text{and} \quad \sigma^2 = 2v$$
Chapter 6.9
Lognormal Distribution
 The lognormal distribution is used for a wide variety of
applications.
 The distribution applies in cases where a natural log
transformation results in a normal distribution.
 |Lognormal Distribution| The continuous random variable X has
a lognormal distribution if the random variable Y = ln(X) has a
normal distribution with mean μ and standard deviation σ. The
resulting density function of X is

$$f(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-[\ln(x) - \mu]^2 / (2\sigma^2)}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

The mean and variance of the lognormal distribution are

$$E(X) = e^{\mu + \sigma^2/2} \quad \text{and} \quad \mathrm{Var}(X) = e^{2\mu + \sigma^2}\,\bigl(e^{\sigma^2} - 1\bigr)$$
Concentration of pollutants produced by chemical plants historically
are known to exhibit behavior that resembles a log normal
distribution. This is important when one considers issues regarding
compliance to government regulations.
Suppose it is assumed that the concentration of a certain pollutant,
in parts per million, has a lognormal distribution with parameters μ =
3.2 and σ = 1.
What is the probability that the concentration exceeds 8 parts per
million?

$$P(X > 8) = 1 - P(X \le 8)$$
$$P(X \le 8) = F\!\left(\frac{\ln(8) - 3.2}{1}\right) = F(-1.12) = 0.1314$$
$$P(X > 8) = 1 - 0.1314 = 0.8686$$

Here F denotes the cumulative distribution function of the standard normal distribution, that is, the area under the normal curve.
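The pollutant example reduces to a standard normal lookup after taking the natural logarithm, which can be sketched in code as follows. This is an added illustration using `statistics.NormalDist`; the mean and variance lines apply the lognormal formulas E(X) = e^(μ + σ²/2) and Var(X) = e^(2μ + σ²)(e^(σ²) - 1) to the same parameters.

```python
from math import exp, log
from statistics import NormalDist

mu, sigma = 3.2, 1.0   # parameters of Y = ln(X)

# P(X > 8) = 1 - F((ln 8 - mu) / sigma), with F the standard normal CDF.
z = (log(8) - mu) / sigma
p_exceeds_8 = 1 - NormalDist().cdf(z)

# Mean and variance of the lognormal random variable X.
mean_x = exp(mu + sigma**2 / 2)
var_x = exp(2 * mu + sigma**2) * (exp(sigma**2) - 1)

print(round(z, 2), round(p_exceeds_8, 4))
```

The small difference from the slide value arises because the slide rounds z to -1.12 before using the table.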
Probability and Statistics
Homework 7
1. Suppose the current measurements in a strip of wire are assumed to
follow a normal distribution with a mean of 10 milliamperes and a
variance of 4 milliamperes². (a) What is the probability that a
measurement will exceed 13 milliamperes? (b) Determine the value for
which the probability that a current measurement is below this value is
98%.
(Mo.E4.13-14 p.113)
2. A lawyer commutes daily from his suburban home to midtown office. The
average time for a one-way trip is 24 minutes, with a standard deviation
of 3.8 minutes. Assume the distribution of trip times to be normally
distributed. (a) If the office opens at 9:00 A.M. and the lawyer leaves his
house at 8:45 A.M. daily, what percentage of the time is he late for work?
(b) Find the probability that 2 of the next 3 trips will take at least 1/2
hour.
(Wa.6.15 s.186)
3. (a) Suppose that a sample of 1600 tires of the same type are obtained at
random from an ongoing production process in which 8% of all such tires
produced are defective. What is the probability that in such sample 150
or fewer tires will be defective?
(Sou18. CD6-13)
(b) If 10% of men are bald, what is the probability that more than 100 in
a random sample of 818 men are bald?