Continuous Distributions
Week 6
Objectives
On completion of this module you should be able to:
• calculate areas under the standard normal curve,
• solve and interpret problems involving the normal distribution,
• check assumptions of normality,
• calculate probabilities using the uniform distribution,
• solve and interpret problems involving the uniform distribution,
• calculate probabilities using the exponential distribution and
• solve and interpret problems involving the exponential distribution.
Three continuous distributions
• Normal Distribution
• Uniform Distribution
• Exponential Distribution
The Normal distribution
Characteristics of the normal distribution:
• it is bell-shaped (symmetrical) and unimodal (one mode),
• the mean, median and mode are identical,
• most of the data (about 82%) falls within ±1.33 standard deviations of the mean,
• the variable (X) has an infinite domain,
• as X → ±∞, f(X) → 0 and
• the total area under the curve is 1.
Normal probability density function
$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$$
where
e = 2.71828…
π = 3.14159…
μ = population mean
σ = population standard deviation
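As a rough check on the density formula, the sketch below evaluates f(X) directly and compares it with Python's standard-library `statistics.NormalDist`. The function name `normal_pdf` and the test values are illustrative assumptions, not part of the lecture.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Evaluate the normal density f(X) directly from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

# Illustrative check on the standard normal curve (mu = 0, sigma = 1).
peak = normal_pdf(0, 0, 1)            # height of the curve at the mean
left = normal_pdf(-1.5, 0, 1)         # symmetric points have equal height
right = normal_pdf(1.5, 0, 1)
library_value = NormalDist(0, 1).pdf(1.5)

print(f"f(0)    = {peak:.5f}")
print(f"f(-1.5) = {left:.5f}, f(1.5) = {right:.5f}")
print(f"library f(1.5) = {library_value:.5f}")
```

The equal heights at −1.5 and 1.5 illustrate the symmetry of the curve about the mean.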
Mean and standard deviation
• The mean determines location…
• The standard deviation determines the spread…
Standardised Normal distribution
$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^{2}} \quad \text{where} \quad Z = \frac{X-\mu}{\sigma}$$
• The standard normal distribution always has a mean of 0 (is centred on zero) and a standard deviation of 1.
Example 6-1
A final exam for a particular accountancy course
is known by students to be a difficult one.
In the past, the mean mark was 62% and the
standard deviation was 11%.
What proportion of students have received a
mark of:
(a) At least 65%
Solution 6-1
• We are told that μ = 62 and σ = 11.
[Figure: normal distribution with μ = 62 and σ = 11, with X = 65 marked]
• Using the transformation formula:
$$Z = \frac{X-\mu}{\sigma} = \frac{65-62}{11} = 0.27 \text{ (to 2 dec. pl.)}$$
[Figure: standard normal distribution with μ = 0 and σ = 1, with Z = 0.27 marked]
• This means that P(X ≥ 65) = P(Z ≥ 0.27).
• We can now look the standardised value up in the table (Table E.2).
• The table gives probabilities of less than specified Z values.
• We know the total area under the curve is 1, so we can subtract the tabulated value from 1 to get our desired probability.
Using Table E.2 from the text
Table E.2 gives P(Z < 0.27). We want 1 − P(Z < 0.27) = P(Z > 0.27).
$$P(X \ge 65) = P(Z \ge 0.27) = 1 - P(Z < 0.27) = 1 - 0.6064 = 0.3936$$
• So 39.36% of students will receive at least 65% on the exam.
• Always sketch the required area when solving normal distribution problems, as this helps you find the correct probability area!
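The table lookup for part (a) can be reproduced in software. The sketch below uses Python's standard-library `statistics.NormalDist` in place of Table E.2; the variable names are illustrative. Rounding Z to two decimal places mimics the table, while the unrounded calculation gives a slightly different answer.

```python
from statistics import NormalDist

mu, sigma = 62, 11          # mean and standard deviation from Example 6-1

# Standardise X = 65 and round to 2 decimal places, as when using a Z table.
z = round((65 - mu) / sigma, 2)

# P(Z >= z) = 1 - P(Z < z), using the standard normal CDF.
p_table_style = 1 - NormalDist().cdf(z)

# The same probability without rounding Z first.
p_unrounded = 1 - NormalDist(mu, sigma).cdf(65)

print(f"Z = {z:.2f}")
print(f"P(X >= 65) with rounded Z:   {p_table_style:.4f}")
print(f"P(X >= 65) without rounding: {p_unrounded:.4f}")
```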
What proportion of students have received a mark of:
(b) at least 50%
$$Z = \frac{X-\mu}{\sigma} = \frac{50-62}{11} = -1.09$$
$$P(X \ge 50) = P(Z \ge -1.09) = 1 - P(Z < -1.09) = 1 - 0.1379 = 0.8621$$
[Figure: standard normal curve with Z = −1.09 (50%) marked]
So 86.21% of students can be expected to receive at least 50% on the exam.
What proportion of students have received a mark of:
(c) less than 40%
$$Z = \frac{X-\mu}{\sigma} = \frac{40-62}{11} = -2$$
$$P(X < 40) = P(Z < -2) = 0.0228$$
[Figure: standard normal curve with Z = −2 (40%) marked]
So 2.28% of students can be expected to receive less than 40% on the exam.
What proportion of students have received a mark of:
(d) between 70% and 100%
$$Z_1 = \frac{X-\mu}{\sigma} = \frac{70-62}{11} = 0.73$$
$$Z_2 = \frac{X-\mu}{\sigma} = \frac{100-62}{11} = 3.45$$
[Figure: standard normal curve with Z = 0.73 (70%) and Z = 3.45 (100%) marked]
$$P(70 \le X \le 100) = P(0.73 \le Z \le 3.45) = 0.99972 - 0.7673 = 0.23242$$
So 23.24% of students will receive between 70% and 100% on the exam.
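Parts (b) to (d) follow the same pattern, so they can be checked together. This is a sketch using the standard-library `statistics.NormalDist` as a stand-in for Table E.2; the helper `z_score` is a hypothetical name, and it rounds to two decimal places only to match the table-based working.

```python
from statistics import NormalDist

mu, sigma = 62, 11
std_normal = NormalDist()   # mean 0, standard deviation 1

def z_score(x):
    """Standardise x and round to 2 d.p., mimicking a table lookup."""
    return round((x - mu) / sigma, 2)

# (b) P(X >= 50) = 1 - P(Z < -1.09)
p_b = 1 - std_normal.cdf(z_score(50))

# (c) P(X < 40) = P(Z < -2)
p_c = std_normal.cdf(z_score(40))

# (d) P(70 <= X <= 100) = P(Z < 3.45) - P(Z < 0.73)
p_d = std_normal.cdf(z_score(100)) - std_normal.cdf(z_score(70))

print(f"(b) {p_b:.4f}")
print(f"(c) {p_c:.4f}")
print(f"(d) {p_d:.4f}")
```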
• Note that we could logically have assumed that no student could get more than 100% on the exam.
• This would mean that P(X ≤ 100) = 1 and so
$$P(70 \le X \le 100) = P(X \le 100) - P(X < 70) = 1 - P(Z < 0.73) = 1 - 0.7673 = 0.2327$$
(e) Between what two marks symmetrically distributed around the mean will 95% of the students' marks fall?
$$P(Z_1 < Z < Z_2) = 0.95$$
[Figure: standard normal curve with a central area of 95% (0.475 on each side of the mean) and 0.025 in each tail]
• We look for 0.025 in Table E.2.
• It corresponds to a Z value of −1.96.
• Therefore Z1 = −1.96 and Z2 = 1.96 (since the distribution is symmetrical).
[Figure: standard normal curve with 0.025 in each tail, 0.475 on each side of the mean, and Z = −1.96 and Z = +1.96 marked]
• Using Z = (X − μ)/σ we discover that
$$-1.96 = \frac{X-62}{11} \;\Rightarrow\; -1.96 \times 11 = X - 62 \;\Rightarrow\; X = 62 - 1.96 \times 11 = 40.44$$
$$\text{and} \quad 1.96 = \frac{X-62}{11} \;\Rightarrow\; 1.96 \times 11 = X - 62 \;\Rightarrow\; X = 62 + 1.96 \times 11 = 83.56$$
So 95% of students can be expected to receive between 40.44% and 83.56%.
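Working backwards from a probability to a mark is an inverse lookup. The sketch below assumes the standard-library `NormalDist.inv_cdf` method as a substitute for reading Table E.2 in reverse; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 62, 11
marks = NormalDist(mu, sigma)

# The middle 95% leaves 2.5% in each tail.
z_lower = NormalDist().inv_cdf(0.025)   # Z value with 0.025 below it
lower = marks.inv_cdf(0.025)            # mark with 2.5% of students below it
upper = marks.inv_cdf(0.975)            # mark with 2.5% of students above it

print(f"Z1 = {z_lower:.2f}")
print(f"95% of marks lie between {lower:.2f} and {upper:.2f}")
```

Because the distribution is symmetrical, the two limits sit the same distance either side of the mean.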
Evaluating the normality assumption
Recall that the normal distribution:
• is bell-shaped
• has an IQR equal to 1.33 standard deviations
• is continuous with an infinite range
We can begin checking for normality using:
• a box-and-whisker plot
• a stem-and-leaf display (for small data sets) or a histogram (for larger data sets)
We can also examine summary statistics and check that:
• the mean, median and mode are all similar
• the IQR is approximately equal to 1.33 standard deviations
• the range is approximated by 6 standard deviations
• about 2/3 of observations lie within ±1 standard deviation of the mean, 4/5 within ±1.28 standard deviations and 19 out of 20 within ±2 standard deviations.
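The rule-of-thumb proportions in this checklist can be compared with values computed from the standard normal distribution. This is a minimal sketch using the standard-library `statistics.NormalDist`; the helper `prob_within` is a hypothetical name. The IQR it prints comes from the computed quartiles, so it may differ slightly from the rounded figure quoted in the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()

def prob_within(k):
    """P(-k < Z < k) for a standard normal variable."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 1.28, 2, 3):
    print(f"P(|Z| < {k}) = {prob_within(k):.4f}")

# Width of the interquartile range, measured in standard deviations.
iqr_in_sds = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)
print(f"IQR = {iqr_in_sds:.3f} standard deviations")
```

The value for k = 3 shows why the range of a normally distributed data set is roughly 6 standard deviations: almost all observations fall within 3 standard deviations either side of the mean.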
Example 6-2
Last term, a group of 21 students enrolled in an
accounting course on a particular campus.
Their scores on the final exam are recorded
below.
Determine whether or not these marks are
normally distributed by evaluating the actual
versus theoretical properties and by constructing
a normal probability plot.
59  64  48  49  75  76  51
74  51  53  48  58  67  71
43  44  43  72  63  62  64
Solution 6-2
• We begin with a five-number summary:

Five-number Summary
Minimum          43
First Quartile   48.5
Median           59
Third Quartile   69
Maximum          76

and the mean and standard deviation:

Mean             58.80952
Std. Deviation   11.10234
• The mean (58.80952) is very similar to the median (59).
• The mode is not very helpful with such a small data set.
• IQR = 69 − 48.5 = 20.5, compared with 1.33 standard deviations: 1.33 × 11.10234 = 14.77 (2 dec. pl.)
• Range = 76 − 43 = 33, compared with 6 std. dev. = 6 × 11.10234 = 66.61 (2 dec. pl.)
• The stem-and-leaf diagram given by PHStat2 is not particularly useful (try this yourself to see why).
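The summary statistics for Example 6-2 can be reproduced without PHStat2. The sketch below uses Python's standard-library `statistics` module as a substitute; note that quartile conventions vary between packages, and the default "exclusive" method of `statistics.quantiles` happens to match the quartiles quoted here.

```python
import statistics

# Exam marks from Example 6-2.
marks = [59, 64, 48, 49, 75, 76, 51, 74, 51, 53, 48,
         58, 67, 71, 43, 44, 43, 72, 63, 62, 64]

mean = statistics.mean(marks)
sd = statistics.stdev(marks)                        # sample standard deviation
q1, median, q3 = statistics.quantiles(marks, n=4)   # default 'exclusive' method

iqr = q3 - q1
data_range = max(marks) - min(marks)

# Share of observations within one standard deviation of the mean.
within_1_sd = sum(abs(m - mean) <= sd for m in marks) / len(marks)

print(f"Five-number summary: {min(marks)}, {q1}, {median}, {q3}, {max(marks)}")
print(f"Mean = {mean:.5f}, Std. Deviation = {sd:.5f}")
print(f"IQR = {iqr} versus 1.33 sd = {1.33 * sd:.2f}")
print(f"Range = {data_range} versus 6 sd = {6 * sd:.2f}")
print(f"Proportion within 1 sd of the mean = {within_1_sd:.3f}")
```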
[Figure: box plot of exam marks (axis: Marks, 40 to 80)]
• The data appears to be roughly symmetrical.
• The upper and lower quartiles may be a little too large and too small respectively to fit the normal distribution.
• The data on the normal probability plot (produced using PHStat2) approximately follows a straight line.
• This is a small data set so accurate conclusions are difficult to draw, but the data is probably approximately normally distributed.
Uniform distribution
$$f(X) = \frac{1}{b-a} \quad \text{if } a \le X \le b$$
where
a = minimum value of X
b = maximum value of X
$$\mu = \frac{a+b}{2} \qquad \sigma = \sqrt{\frac{(b-a)^{2}}{12}}$$
Example 6-3
A surfer knows that the time between wipe-outs
(falling off his surfboard) is uniformly distributed
between two minutes and nine minutes in
particularly large surf.
What is the probability that the time between
wipe-outs is:
(a) less than five minutes?
[Figure: uniform density over the interval from 2 to 9; x axis from 1 to 10, y axis marked at 0.1 and 0.2]
The total area under the rectangle is 1, so since the length is b − a = 9 − 2 = 7, the height must be
$$\frac{1}{b-a} = \frac{1}{9-2} = \frac{1}{7} = 0.1428\ldots$$
Solution 6-3
$$P(X < 5) = P(2 < X < 5) = (5-2)\left(\frac{1}{9-2}\right) = \frac{3}{7} \quad (\text{Base} \times \text{Height})$$
What is the probability that the time between wipe-outs is:
(b) between three and four minutes?
$$P(3 < X < 4) = (4-3)\left(\frac{1}{9-2}\right) = \frac{1}{7}$$
What is the probability that the time between wipe-outs is:
(c) more than six minutes?
$$P(X > 6) = (9-6)\left(\frac{1}{9-2}\right) = \frac{3}{7}$$
(d) What is the expected value of the time between wipe-outs?
$$\mu = \frac{a+b}{2} = \frac{2+9}{2} = 5.5$$
(e) What is the standard deviation of the time between wipe-outs?
$$\sigma = \sqrt{\frac{(b-a)^{2}}{12}} = \sqrt{\frac{(9-2)^{2}}{12}} = 2.0207 \text{ (to 4 dec. pl.)}$$
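All five parts of Example 6-3 reduce to base times height plus the two summary formulas, which the following sketch collects in one place. The helper `uniform_prob` is a hypothetical name; it assumes integer endpoints so that exact fractions can be shown.

```python
import math
from fractions import Fraction

a, b = 2, 9   # minimum and maximum time between wipe-outs, in minutes

def uniform_prob(lo, hi):
    """P(lo < X < hi) for X uniform on [a, b]: base times height."""
    lo, hi = max(lo, a), min(hi, b)      # clip the interval to [a, b]
    if hi <= lo:
        return Fraction(0)
    return Fraction(hi - lo, b - a)      # base (hi - lo) times height 1/(b - a)

p_a = uniform_prob(a, 5)   # (a) less than five minutes
p_b = uniform_prob(3, 4)   # (b) between three and four minutes
p_c = uniform_prob(6, b)   # (c) more than six minutes

mean = (a + b) / 2                   # (d) expected value
sd = math.sqrt((b - a) ** 2 / 12)    # (e) standard deviation

print(p_a, p_b, p_c)
print(f"mean = {mean}, sd = {sd:.4f}")
```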
Exponential distribution
$$P(\text{arrival time} < X) = 1 - e^{-\lambda X}$$
where
e = 2.71828…
λ = the population mean number of arrivals per unit
X = any value of the continuous variable where 0 < X < ∞
Example 6-4
People are known to arrive at a particular
vending machine at a mean rate of 27 per hour.
Assuming that these arrival times follow an
exponential distribution, find the probability that
the next person will arrive:
(a) within one minute?
We are given λ = 27 arrivals per hour, so with X measured in hours
$$P(\text{arrival time} < X) = 1 - e^{-\lambda X} = 1 - e^{-27X}$$
Solution 6-4
(a)
$$P(\text{arrival time} < 1 \text{ minute}) = 1 - e^{-27\left(\frac{1}{60}\right)} = 0.3624 \text{ (to 4 dec. pl.)}$$
Note that we converted the units from minutes to fractions of an hour since the arrival rate is expressed per hour.
(b) within five minutes?
$$P(\text{arrival time} < 5 \text{ minutes}) = 1 - e^{-27\left(\frac{5}{60}\right)} = 0.8946$$
(c) in more than five minutes?
$$P(\text{arrival time} > 5 \text{ minutes}) = 1 - P(\text{arrival time} < 5 \text{ minutes}) = 1 - 0.8946 = 0.1054$$
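The three exponential probabilities in Example 6-4 can be checked with a few lines of code. This is a sketch under the example's own assumption of 27 arrivals per hour; the function name `prob_arrival_within` is hypothetical, and the unit conversion from minutes to hours is made explicit.

```python
import math

rate_per_hour = 27   # mean number of arrivals per hour (lambda)

def prob_arrival_within(minutes):
    """P(arrival time < given minutes) = 1 - e^(-lambda * X), X in hours."""
    hours = minutes / 60
    return 1 - math.exp(-rate_per_hour * hours)

p_a = prob_arrival_within(1)   # (a) within one minute
p_b = prob_arrival_within(5)   # (b) within five minutes
p_c = 1 - p_b                  # (c) in more than five minutes

print(f"(a) {p_a:.4f}")
print(f"(b) {p_b:.4f}")
print(f"(c) {p_c:.4f}")
```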
After the lecture each week…
• Review the lecture material
• Complete all readings
• Complete all of the recommended problems (listed in the SG) from the textbook
• Complete at least some of the additional problems
• Consider (briefly) the discussion points prior to tutorials