Download S2.2 Continuous random variables

Document related concepts

Statistics wikipedia , lookup

History of statistics wikipedia , lookup

Probability wikipedia , lookup

Probability interpretations wikipedia , lookup

Transcript
A2-Level Maths:
Statistics 2
for Edexcel
S2.2 Continuous
random variables
This icon indicates the slide contains activities created in Flash. These activities are not editable.
For more detailed instructions, see the Getting Started presentation.
1 of 61
© Boardworks Ltd 2006
Relative frequency histograms
Relative frequency histograms
Contents
Probability density functions
Mode
Cumulative distribution functions
Median and quartiles
Expectation
Variance
2 of 61
© Boardworks Ltd 2006
Relative frequency histograms
A histogram has the important property that the area of each
bar is in proportion to the corresponding frequency.
A histogram with unequal widths can be drawn by plotting the
frequency density on the vertical axis, where:
frequency density =
frequency
class width
A relative frequency histogram is one in which the area of
a bar corresponds to the proportion of the data falling into
the corresponding interval.
The relative frequency is defined as:
relative frequency density =
3 of 61
frequency density
total frequency
© Boardworks Ltd 2006
Relative frequency histograms
Example: The relative frequency histogram below
shows the distribution of ages of the UK population.
Relative frequency
density
Relative frequency histogram to show the age
distribution in the UK
0.018
0.016
0.014
0.012
0.01
0.008
0.006
0.004
0.002
0
0
10
20
30
40
50
60
70
80
90
100
Age
The area of each bar corresponds to the proportion of
the population with ages in that interval.
The total area of all the bars is 1.
4 of 61
© Boardworks Ltd 2006
Relative frequency histograms
Relative frequency histogram to show the age
distribution in the UK
Relative frequency
density
0.02
0.015
0.01
0.005
0
0
10
20
30
40
50
60
70
80
90 100
Age
The distribution of the ages can be modelled by a curve.
This curve is called a probability density function.
5 of 61
© Boardworks Ltd 2006
Probability density functions
Relative frequency histograms
Contents
Probability density functions
Mode
Cumulative distribution functions
Median and quartiles
Expectation
Variance
6 of 61
© Boardworks Ltd 2006
Probability density functions
Relative frequency density
A probability density function (or p.d.f.) is a curve that models
the shape of the distribution corresponding to a continuous
random variable.
0.02
0.015
0.01
0.005
0
0
10
20
30
40
50
60
70
80
90 100
Age
A probability density function has several important properties.
7 of 61
© Boardworks Ltd 2006
Probability density functions
If f(x) is the p.d.f corresponding to a continuous random
variable X and if f(x) is defined for a ≤ x ≤ b then the following
properties must hold:
0.02
1.
b
 f ( x)dx  1
a
0.015
0.01
0.005
0
0
10 20 30 40 50 60 70 80 90 100
i.e. the total area under a p.d.f. is 1.
8 of 61
© Boardworks Ltd 2006
Probability density functions
If f(x) is the p.d.f. corresponding to a continuous random
variable X and if f(x) is defined for a ≤ x ≤ b then the following
properties must hold:
2.
f ( x)  0
for a  x  b
0.02
0.015
0.01
0.005
0
0
10 20 30 40 50 60 70 80 90 100
i.e. the graph of the p.d.f. never
dips below the x-axis.
9 of 61
© Boardworks Ltd 2006
Probability density functions
If f(x) is the p.d.f. corresponding to a continuous random
variable X and if f(x) is defined for a ≤ x ≤ b then the following
properties must hold:
x2
3.
P( x1  X  x2 )   f ( x)dx
x1
i.e. probabilities correspond to
areas under the curve.
10 of 61
© Boardworks Ltd 2006
Probability density functions
Example: Sketch the graph of each of the following functions.
Decide in each case whether it could be the equation of a
probability density function:


1.
1
24 x  3 x 2  36

f ( x)   32

0
2.
1
 3
f ( x)   x
0

3.
 x2  4 x  3 0  x  4 
f ( x)  

0
otherwise 

11 of 61

2 x6 

otherwise 

x  1

x  1
© Boardworks Ltd 2006
Probability density functions

1
24 x  3 x 2  36

1. f ( x)   32

0


2 x6 

otherwise 
The function is non-negative
everywhere.
If f represents a p.d.f., then
 f ( x)dx  1
all x
6
1
16
2
3
2
12 x  x  36 x  2
 f ( x)dx 
 (24 x  3 x  36)dx 
32
all x
32 2
 1  (432 216  216)  (48 8 72)  1

32 
So f(x) could represent a probability density function.
12 of 61
© Boardworks Ltd 2006
Probability density functions
1

2. f ( x)   x3
0


x  1

x  1
The function is non-negative
everywhere.
For f to represent a p.d.f. we need to check that

 f ( x)dx  1
all x

1
1 2 
1 1


3
But:  3 dx   x dx 
x   0  

1 x
1
 2 2
2
1

So f(x) could not represent a probability density function.
13 of 61
© Boardworks Ltd 2006
Probability density functions
3.
 x2  4 x  3 0  x  4 
f ( x)  

0
otherwise 

The function is clearly negative for some values of x.
Consequently f(x) cannot represent a probability density
function.
14 of 61
© Boardworks Ltd 2006
Probability density functions
15 of 61
© Boardworks Ltd 2006
Question 1
Question 1: A continuous random variable X is defined by the
probability density function
k (5  x) 0  x  5
f ( x)  
otherwise
 0
a) Sketch the probability density function.
b) Find the value of the constant k.
c) Find P(1 ≤ X ≤ 3).
16 of 61
© Boardworks Ltd 2006
Question 1
k (5  x) 0  x  5
a) f ( x)  
otherwise
 0
The diagram shows the
probability density function.
b) To find k, we can use the property that
 f ( x)dx  1
all x
5
1 

k (5  x)dx  k 5 x  x2   k  (25  12.5)  0  12.5k
0
2 0


5
2
Therefore, 12.5k = 1  k 
25
17 of 61
© Boardworks Ltd 2006
Question 1
c) P(1 ≤ X ≤ 3)
3
2 3
2
2 
x
(5  x)dx 
5 x  
25
25 
2 
1

1

2
 (15  4.5)  (5  0.5) 
25
= 0.48
18 of 61
© Boardworks Ltd 2006
Examination-style question 1
Examination-style question 1: A continuous random
variable X is defined by the probability density function
1 x  3
 k ( x  1)

f ( x)  k (5  x)( x  2) 3  x  5

0
otherwise

a) Sketch the probability density function.
b) Find the value of the constant k.
c) Find P(X > 2).
19 of 61
© Boardworks Ltd 2006
Examination-style question 1
a)
1 x  3
 k ( x  1)

f ( x)  k (5  x)( x  2) 3  x  5

0
otherwise

20 of 61
© Boardworks Ltd 2006
Examination-style question 1
 f ( x)dx  1
b) To find k, we can use the property that
Note that (5 – x)(x – 2) = 7x – 10 –
So,
3
1
k ( x  1)dx 
3 
5
all x
x2

k 7 x  10  x2 dx  1
3
5
1 3
1 2

7 2
 k  x  x   k  x  10 x  x   1
3 3
2
1
2
1
 1 1
 1
 k  1    k  4  7   1
2
 2 2
 6
1
Therefore 5 k  1
3
21 of 61
i.e. k 
3
16
© Boardworks Ltd 2006
Examination-style question 1
5
c) P(X > 2) =
 f ( x)dx
2
3

( x  1)dx 
2 16

3


3
7 x  10  x 2 dx
3 16

5
3
5
3 1 2
3 7 2
1 3


x  x 
x  10 x  x 


16  2
3 3
 2 16  2
22 of 61

3  1
1
 3  1
1

0


4

7




16  2
16
6
2




29
32
An alternative method would be
to utilise P(X > 2) = 1 – P(X ≤ 2)
© Boardworks Ltd 2006
Examination-style question 2
Examination-style question 2: The life, T hours, of an
electrical component is modelled by the probability density
function
ke0.001t
f (t )  
 0
t  1000
otherwise
a) Sketch the probability density function.
b) Find the value of the constant k.
c) Find P(1500 ≤ T ≤ 2000).
23 of 61
© Boardworks Ltd 2006
Examination-style question 2
Solution:
a)
24 of 61
© Boardworks Ltd 2006
Examination-style question 2
ke0.001t
f (t )  
 0
t  1000
otherwise
 f (t )dt  1
b) To find k, we use the fact that


So:
all t
ke0.001t dt  1
1000
 k  1000e

0.001t  
1000

1

 k 0  (1000e1)  1
Therefore
25 of 61
k
1
1000e1

e
 0.00272
1000
© Boardworks Ltd 2006
Examination-style question 2
ke0.001t
f (t )  
 0
t  1000
otherwise
2000
c) P(1500 ≤ T ≤ 2000) =

2000
f (t )dt
1500
k

e0.001t dt
1500
 k  1000e

0.001t  2000

1500
e

1000e2  1000e1.5
1000

= 0.239 (3 s.f.)
26 of 61
© Boardworks Ltd 2006
Mode
Relative frequency histograms
Contents
Probability density functions
Mode
Cumulative distribution functions
Median and quartiles
Expectation
Variance
27 of 61
© Boardworks Ltd 2006
Mode
Suppose that a random variable X is defined by the
probability density function f(x) for a ≤ x ≤ b.
The mode of X is the value of x that produces the largest
value for f(x) in the interval a ≤ x ≤ b.
A sketch of the probability density function can be very
helpful when determining the mode.
Differentiation
could be used
to find the
mode here.
28 of 61
© Boardworks Ltd 2006
Mode
Example: A random variable X has p.d.f. f(x), where
 x 2 (2  x ) 0  x  2
f ( x)  
0
otherwise

Find the mode.
Sketch of f(x):
29 of 61
© Boardworks Ltd 2006
Mode
The mode can be found using differentiation:
f ( x)  2x2  x3  f ( x)  4x  3x2
To find a turning point, we solve f ( x)  0  4 x  3 x2  0
Factorize:
x(4  3 x)  0 
x  0 or x  4 3
Check that x  4 3 gives the maximum value:
f ( x)  4  6 x  f ( 4 3)  4  8  4  0
So the mode is x  4 3 .
30 of 61
© Boardworks Ltd 2006
Cumulative distribution functions
Relative frequency histograms
Contents
Probability density functions
Mode
Cumulative distribution functions
Median and quartiles
Expectation
Variance
31 of 61
© Boardworks Ltd 2006
Cumulative distribution functions
The cumulative distribution function (c.d.f.) F(x) for a
continuous random variable X is defined as
F(x) = P(X ≤ x).
Therefore, the c.d.f. is found by integrating the p.d.f..
Example: A random variable X has p.d.f. f(x), where


1 3
 x 1
f ( x)   6

0
0 x2
otherwise
Find the c.d.f. and find P(X < 1).
32 of 61
© Boardworks Ltd 2006
Cumulative distribution functions
Solution: The c.d.f., F(x) is given by:
 1 3 1
F( x)  f ( x)dx   x   dx
6
6
1 4 1

x  xc
24
6


To find the constant c we can use the fact that P(X ≤ 0) = 0
(because the random
variable X is only nonzero between 0 and 2)
Therefore F(0) = 0, i.e. c = 0.
33 of 61
© Boardworks Ltd 2006
Cumulative distribution functions
So the c.d.f. is
0
x0

1 4 1
F( x)   24
x 6 x 0 x2

1
x2

1 1 5
 
P(X < 1) = F(1) =
24 6 24
34 of 61
© Boardworks Ltd 2006
Median and quartiles
Relative frequency histograms
Contents
Probability density functions
Mode
Cumulative distribution functions
Median and quartiles
Expectation
Variance
35 of 61
© Boardworks Ltd 2006
Median and quartiles
The median, m, of a random variable X is defined to be the
value such that
F(m) = P(X ≤ m) = 0.5
where F is the cumulative distribution function of X.
Likewise the lower quartile is the solution to the equation
F(x) = 0.25
and the upper quartile is the solution to
F(x) = 0.75.
36 of 61
© Boardworks Ltd 2006
Median and quartiles
37 of 61
© Boardworks Ltd 2006
Median and quartiles
Example: A random variable X is defined by the cumulative
distribution function:
0

1 2
F( x)   24 x  x  6

1



x2
2 x5
x5
a) Calculate and sketch the probability density function.
b) Find the median value.
c) Work out P(3 ≤ X ≤ 4).
38 of 61
© Boardworks Ltd 2006
Median and quartiles
a) We can get the p.d.f. by differentiating the c.d.f.
1 x 1
f ( x)  F( x)  12
24
So the p.d.f. is
1 x 1
 12
2 x5
24
f ( x)  
otherwise
 0
Sketch of f(x):
39 of 61
© Boardworks Ltd 2006
Median and quartiles
b) The median, m, satisfies F(m) = 0.5.
Therefore
1
24


m2  m  6  0.5
 m2  m  6  12
 m2  m  18  0
1  1  4  1 (18)
m 
2
 m  4.77 or m  3.77
The median must be 3.77 (as the p.d.f. is only
non-zero for values in the interval [2, 5]).
40 of 61
© Boardworks Ltd 2006
Median and quartiles
c) P(3 ≤ X ≤ 4) = F(4) – F(3)



 1 42  4  6  1 32  3  6
24
24

 7 1
12
4
1
3
41 of 61
© Boardworks Ltd 2006
Median and quartiles
Example: A random variable X has p.d.f. f(x), where
 3 x2  3 x  3 1  x  2
2
4
4
f ( x)   3  3 x
2 x4
2
8

 0
otherwise
a) Calculate the cumulative distribution function
and verify that the lower quartile is at x = 2.
b) Work out the median value of X.
42 of 61
© Boardworks Ltd 2006
Median and quartiles
a) For f ( x)  34 x2  32 x  34 , F( x)  41 x3  34 x2  34 x  c
We know that P(X ≤ 1) = 0, i.e., that F(1) = 0.
So,
1
4
 34  34  c  0  c   41
Therefore F( x)  41 x3  34 x2  34 x  41
3 x2  c
For f ( x)  32  38 x, F( x)  32 x  16
We know that P(X ≤ 4) = 1, i.e. that F(4) = 1.
3  42  c  1  c  2
So 32  4  16
3 x2  2
Therefore F( x)  32 x  16
43 of 61
© Boardworks Ltd 2006
Median and quartiles
0
x 1

 1 x3  3 x 2  3 x  1 1  x  2

4
F( x)   4 3 4 3 2 4
2 x4
 2 x  16 x  2

1
x4
So
To verify that the lower quartile is 2, we simply need to
check that F(2) = 0.25:
1 3 3 2 3
1
F(2)   2   2   2   0.25
4
4
4
4
Therefore the lower quartile is 2.
44 of 61
© Boardworks Ltd 2006
Median and quartiles
b) The median, m, must lie in the interval [2, 4] because
F(2) = 0.25.
To find the median we must solve F(m) = 0.5:
3
3 2
1
i.e. F(m)  m  m  2 
2
16
2
This can be rearranged to give the quadratic equation:
3m2  24m  40  0
Using the quadratic formula, m 
24  576  4  3  40
6
m = 5.63 or m = 2.37
As 5.63 does not lie in the interval [2, 4],
the median must be 2.37.
45 of 61
© Boardworks Ltd 2006
Expectation
Relative frequency histograms
Contents
Probability density functions
Mode
Cumulative distribution functions
Median and quartiles
Expectation
Variance
46 of 61
© Boardworks Ltd 2006
Expectation
If X is a continuous random variable defined by the probability
density function f(x) over the domain a ≤ x ≤ b, then the mean or
expectation of X is given by
b
E[ X ]   xf ( x)dx
a
E[X] is the value you would expect to get, on average.
This mean value of X is also sometimes denoted μ.
[Note: if the p.d.f. is symmetrical, then the expected value of
X will be the value corresponding to the line of symmetry].
We can also find the expected value of g(X), i.e. any function
of X:
b
E[g( X )]   g( x)f ( x) dx
a
47 of 61
© Boardworks Ltd 2006
Expectation
Example: A random variable X is defined by the probability
density function
2
x 1
 3
f ( x)   x
 0 otherwise
Calculate the value of E[X] and E[1/X].


2
2
E[ X ]   xf ( x)dx   x. 3 dx   2 dx
x
x
all x
1
1

 2x
2
1 
dx   2 x   (0)  (2)  2
1
1
48 of 61
© Boardworks Ltd 2006
Expectation

1
1 2
1


E

f ( x)dx   . 3 dx
 X  x
x x
all x
1

  2x 4 dx
1

 2 3 
  x 
 3
1
 2
 (0)    
 3
2

3
2
1
So, E[ X ] 
3
49 of 61
© Boardworks Ltd 2006
Variance
Relative frequency histograms
Contents
Probability density functions
Mode
Cumulative distribution functions
Median and quartiles
Expectation
Variance
50 of 61
© Boardworks Ltd 2006
Variance
If X is a continuous random variable defined by the probability
density function f(x) over the domain a ≤ x ≤ b, then the
variance of X is given by
Var[ X ]  E  X 2   E[ X ]
2
b
or
Var[ X ]   x 2f ( x)dx  μ 2
a
The standard deviation of X is the square root of the variance.
The standard deviation is sometimes denoted by the symbol σ.
51 of 61
© Boardworks Ltd 2006
Variance
Example: A continuous random variable Y has a
probability density function f(y) where
3
y (4  y ) 0  y  4
 32
f ( y)  
0
otherwise

Calculate the value of Var[Y].
Sketch of f(y):
The p.d.f. is symmetrical. Therefore E[Y] = 2.
52 of 61
© Boardworks Ltd 2006
Variance
4
4
3
E[Y ]   y f ( y )dy   y .
y ( 4  y )dy
32
0
0
2
2
2
4
3
3
4

(
4
y

y
)dy

32 0
4

3  4 1 5
y  y 

32 
5 0
3  4 1 5


(
4


4
)

0


32 
5

4
4
5
Therefore Var[Y] = 4
53 of 61
4
4
 22 
5
5
© Boardworks Ltd 2006
Variance
Example: A continuous random variable x has a
probability density function f(x) where
 41
0 x2
3 x
f ( x)   8  16 2  x  6

otherwise
 0
Calculate
a) the mean value, μ.
b) the standard deviation, σ.
54 of 61
© Boardworks Ltd 2006
Variance
 41
0 x2

x
f ( x)   38  16
2 x6
 0
otherwise

2
6
 3 x x2 
1
x
3 x 
 dx
a) E[ X ]   x. dx   x.    dx   dx   
4
4
8 16 
 8 16 
0
2
0
2
2
6
2
6
 x   3x
x 
  
 
 8  0  16 48  2
2
2
3
4
  1 7 
   0  2  
8
  4 12 
2
55 of 61
1
6
© Boardworks Ltd 2006
Variance
 41
0 x2

x
f ( x)   38  16
2 x6
 0
otherwise

6
2 2
6
2
3


1
3
x
x
3
x
x


2
2
2
 dx
b) E[ X ]   x . dx   x .    dx   dx   
4
4
8 16 
 8 16 
0
2
0
2
2
2
6
x  x
x 
    
12  0  8 64  2
3
3
4
2
  3 3
   0  6  
3
  4 4
2
6
3
56 of 61
2
2  1
35
So, Var[ X ]  6   2   1
3  6
36
Therefore σ =
71
36
= 1.40 (3 s.f.)
© Boardworks Ltd 2006
Examination-style question
Examination-style question: The mass, X kg, of luggage
taken on board an aircraft by a passenger can be modelled by
the probability density function
kx3 (30  x) 0  x  30
f ( x)  
0
otherwise

a) Sketch the probability density function and find the value
of k.
b) Verify that the median weight of luggage is about 20.586 kg.
c) Find the mean and the variance of X.
57 of 61
© Boardworks Ltd 2006
Examination-style question
a)
kx3 (30  x) 0  x  30
f ( x)  
0
otherwise

30
To find k we use
3
kx
 (30  x)dx  1
0
30
 k  (30 x 3  x 4 )dx  1
0
30
 30 4 1 5 
 k  x  x  1
5 0
4
 k 1215000  0   1
58 of 61
k 
1
1215000
© Boardworks Ltd 2006
Examination-style question
b)
kx3 (30  x) 0  x  30
f ( x)  
0
otherwise

To verify that the median is about 20.586, we need to check
that P(X ≤ 20.586) = 0.5
20.586
P(X ≤ 20.586) =

kx3 (30  x)dx
0
20.586
1
 30 4 1 5 

x  x 

1215000  4
5 0
1

 607525  0 
1215000
 0.500
59 of 61
© Boardworks Ltd 2006
Examination-style question
c)
kx3 (30  x) 0  x  30
f ( x)  
0
otherwise

30
E[ X ] 

0
30
x.kx3 (30  x)dx  k  (30 x 4  x 5 )dx
0
30
1
 5 1 6

6x  x 

1215000 
6 0
1

 24300000  0 
1215000
 20
60 of 61
© Boardworks Ltd 2006
Examination-style question
c)
kx3 (30  x) 0  x  30
f ( x)  
0
otherwise

E[ X 2 ] 
30
30
0
0
2
3
5
6
x
.
kx
(
30

x
)
dx

k
(
30
x

x
)dx


30
1
 6 1 7

5x  x 

1215000 
7 0
 428.5714
Therefore, Var[ X ]  428.5714  202  28.57 (to 4 s.f.).
61 of 61
© Boardworks Ltd 2006