Download Lecture Unit 5 - NCSU Statistics

Chapter 15 Random Variables
Streamlining Probability:
Probability Distribution,
Expected Value and Standard
Deviation of Random Variable
Graphically and
Numerically Summarize a
Random Experiment
Principal vehicle by which we do this:
random variables
Random Variables
Definition:
A random variable is a numerical-valued
variable whose value is based on the outcome of
a random event.
Denoted by upper-case letters X, Y, etc.
Examples
1. X = payout by insurance company on an
iPhone6 damage protection policy
Possible values of X are x=$0, $250, $500
2. Y=score on 13th hole (par 5) at Augusta
National golf course for a randomly
selected golfer on day 1 of 2015 Masters
y=3, 4, 5, 6, 7
Random Variables and
Probability Distributions
A probability distribution lists the possible values of a
random variable and the probability that each value will
occur.
Random variables are
unknown chance
outcomes.
Probability distributions
tell us what is likely
to happen.
Data variables are
known outcomes.
Data distributions
tell us what happened.
Probability Distribution Of Payout by
Insurance Company on an iPhone6
Damage Protection Policy
Policy payouts based on estimates of
damaged/ruined cellphones.
x       0      250    500
p(x)    0.67   0.13   0.20
[Figure: probability histogram of iPhone6 insurance policy payouts; bar heights 0.67, 0.13 and 0.20 at x = 0, 250 and 500]
Probability Distribution Of Score on
13th hole (par 5) at Augusta
National Golf Course on Day 1 of
2015 Masters
y       3       4       5       6       7
p(y)    0.040   0.414   0.465   0.051   0.030
[Figure: probability histogram of score on 13th hole; bar heights 0.040, 0.414, 0.465, 0.051 and 0.030 at y = 3, 4, 5, 6 and 7]
Probability distributions:
requirements
1. $0 \le p(x) \le 1$ for all values x of X
2. $\sum_{\text{all } x} p(x) = 1$
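The two requirements can be checked mechanically. The sketch below is only an illustration: the helper name `is_valid_distribution` and the tolerance are assumptions, and the two dictionaries simply restate the iPhone6 payout and 13th-hole score tables from the slides.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Distributions copied from the slides: value -> probability
payout = {0: 0.67, 250: 0.13, 500: 0.20}
score = {3: 0.040, 4: 0.414, 5: 0.465, 6: 0.051, 7: 0.030}

print(is_valid_distribution(payout))
print(is_valid_distribution(score))
# A table whose probabilities sum to 1.1 fails requirement 2
print(is_valid_distribution({0: 0.5, 1: 0.6}))
```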
Expected Value of a
Random Variable
A measure of the “middle”
of the values of a random
variable
[Figure: probability histograms of the iPhone6 insurance policy payouts and of the score on the 13th hole]
The mean of the probability distribution is
the expected value of X, denoted E(X)
E(X) is also denoted by the Greek letter µ
(mu)
Mean or
Expected
Value
x       0      250    500
p(x)    0.67   0.13   0.20

y       3       4       5       6       7
p(y)    0.040   0.414   0.465   0.051   0.030

k = the number of possible values of the random variable
\[ E(X) = \mu = \sum_{i=1}^{k} x_i \, P(X = x_i) \]
E(X) = µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... + xk·p(xk)
Weighted mean
Sample Mean
\[ \bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}
   = \frac{1}{n}x_1 + \frac{1}{n}x_2 + \frac{1}{n}x_3 + \dots + \frac{1}{n}x_n \]
Mean or Expected Value
k = the number of outcomes
\[ E(X) = \mu = \sum_{i=1}^{k} x_i \, P(X = x_i) \]
µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... +
xk·p(xk)
Weighted mean
Each outcome is weighted by its probability
Other Weighted Means
GPA A=4, B=3, C=2, D=1, F=0
Five 3-hour courses: 2 A's (6 hrs), 1 B (3 hrs), 2 C's (6 hrs)
GPA:
\[ \frac{4 \cdot 6 + 3 \cdot 3 + 2 \cdot 6}{15} = \frac{45}{15} = 3.0 \]
Course grade: tests 40%, final exam 25%,
quizzes 25%, homework 10%
Your scores: tests - 83, final exam - 75, quizzes - 90, homework - 100
\[ \text{Course grade} = (83 \times 0.40) + (75 \times 0.25) + (90 \times 0.25) + (100 \times 0.10)
   = 33.2 + 18.75 + 22.5 + 10 = 84.45 \]
"Average" ticket prices
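The GPA and course-grade calculations are both weighted means, so one small function covers them. This is a minimal sketch; the function name `weighted_mean` is an assumption, and the numbers are the ones on the slide.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# GPA: grade points weighted by credit hours (2 A's, 1 B, 2 C's)
gpa = weighted_mean([4, 3, 2], [6, 3, 6])

# Course grade: scores weighted by their share of the grade
course_grade = weighted_mean([83, 75, 90, 100], [0.40, 0.25, 0.25, 0.10])

print(gpa)
print(round(course_grade, 2))
```

An expected value is the same calculation with the probabilities as weights; because probabilities sum to 1, the division by the total weight changes nothing.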
Mean or
Expected
Value
x       0      250    500
p(x)    0.67   0.13   0.20

y       3       4       5       6       7
p(y)    0.040   0.414   0.465   0.051   0.030

\[ E(X) = \mu = \sum_{i=1}^{k} x_i \, P(X = x_i) \]
E(X)= µ =0(0.67)+250(0.13)+500(0.20)
=32.5 + 100 = 132.5
E(Y)= µ=3(.04)+4(0.414)+5(0.465)+6(0.051)+7(0.03)
=4.617 strokes
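The two calculations above can be reproduced in a few lines. The sketch below assumes a distribution is stored as a dictionary mapping each value to its probability; the function name `expected_value` is illustrative only.

```python
def expected_value(dist):
    """E(X) = sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in dist.items())

# Distributions from the slides: value -> probability
payout = {0: 0.67, 250: 0.13, 500: 0.20}
score = {3: 0.040, 4: 0.414, 5: 0.465, 6: 0.051, 7: 0.030}

mu_x = expected_value(payout)
mu_y = expected_value(score)

print(round(mu_x, 2))   # expected payout in dollars
print(round(mu_y, 3))   # expected score in strokes
```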
[Figure: probability histogram of the score on the 13th hole, with the mean µ = E(Y) = 4.617 strokes marked]
Interpretation
E(x) is not the value of the random
variable x that you “expect” to observe if
you perform the experiment once
Interpretation of E(X)
E(X) is a “long run” average.
The expected value of a random variable
is equal to the average value of the
random variable if the chance process were
repeated an infinite number of times. In
reality, if the chance process is continually
repeated, the sample mean x̄ will get closer
to E(X) as you observe more and more values
of the random variable X.
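The "long run" interpretation can be illustrated by simulation. The sketch below draws payouts from the iPhone6 distribution and prints the sample mean for increasing sample sizes; the seed, the sample sizes and the helper name `sample_mean` are arbitrary choices made for this illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

values = [0, 250, 500]
probs = [0.67, 0.13, 0.20]   # E(X) = 132.5 for this distribution

def sample_mean(n):
    """Average of n simulated payouts drawn from the distribution."""
    draws = random.choices(values, weights=probs, k=n)
    return sum(draws) / n

# Small samples can land far from 132.5; large ones settle near it
for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))

big_sample_mean = sample_mean(200_000)
print(big_sample_mean)
```

No single draw ever equals 132.5 (the only possible values are 0, 250 and 500); it is the average of many draws that approaches E(X).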
Example: Green Mountain
Lottery
State of Vermont
choose 3 digits from 0 through 9; repeats
allowed
win $500
x       $0      $500
p(x)    .999    .001
E(x) = $0(.999) + $500(.001) = $.50
Example (cont.)
E(x)=$.50
On average, each ticket wins $.50.
Important for Vermont to know
E(x) is not necessarily a possible value of
the random variable (values of x are $0
and $500)
Example
Let X = number of heads in 3 tosses of a
fair coin.
So the probability distribution of X is:
x       0      1      2      3
p(x)    1/8    3/8    3/8    1/8
E(x) (or μ) is
\[ E(x) = \sum_{i=1}^{4} x_i \, p(x_i)
   = (0 \cdot \tfrac{1}{8}) + (1 \cdot \tfrac{3}{8}) + (2 \cdot \tfrac{3}{8}) + (3 \cdot \tfrac{1}{8})
   = \tfrac{12}{8} = 1.5 \]
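The coin-toss distribution can be derived by listing all 8 equally likely outcomes and counting heads. The following is a sketch of that enumeration using exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2 * 2 * 2 = 8 equally likely sequences of three tosses
outcomes = list(product("HT", repeat=3))

# Count how many sequences give 0, 1, 2, 3 heads
counts = Counter(seq.count("H") for seq in outcomes)

# Probability of each number of heads
dist = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# Expected number of heads
mean = sum(x * p for x, p in dist.items())

for x, p in dist.items():
    print(x, p)
print(mean)
```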
US Roulette Wheel
and Table
• The roulette wheel has alternating black and red slots numbered 1 through 36.
• There are also 2 green slots numbered 0 and 00.
• A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is . . .
• If you bet $1 on the winning number, you receive $36, so your winnings are $35.
American Roulette 0 - 00
(The European version has only one 0.)
US Roulette Wheel: Expected Value of a
$1 bet on a single number
Let x be your winnings resulting from a $1 bet
on a single number; x has 2 possible values
x       -1       35
p(x)    37/38    1/38
E(x) = -1(37/38) + 35(1/38) = -2/38 ≈ -.05
So on average the house wins 5 cents on every
such bet. A “fair” game would have E(x)=0.
The roulette wheels are spinning 24/7, winning
big $$ for the house, resulting in …
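The expected winnings on a single-number bet can be computed exactly. This is a sketch under the slide's assumptions (38 slots, 35:1 payout, $1 stake); the function name `single_number_ev` and its parameters are illustrative.

```python
from fractions import Fraction

def single_number_ev(slots, payout=35, stake=1):
    """Expected winnings of a bet on one number among `slots` equally likely slots."""
    p_win = Fraction(1, slots)
    return payout * p_win - stake * (1 - p_win)

us_ev = single_number_ev(38)   # US wheel: 1-36, 0 and 00

print(us_ev)          # exact value as a fraction
print(float(us_ev))   # about -0.053 dollars per $1 bet
```

With the same 35:1 payout, a game would be "fair" (E(x) = 0) only on a wheel with 36 slots, which is why the extra green slots matter to the house.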
Standard Deviation of a
Random Variable
First center (expected value)
Now - spread
Standard Deviation of a
Random Variable
Measures how “spread out”
the random variable is
Summarizing data and
probability
Data
• Histogram
• measure of the center: sample mean x̄
• measure of spread: sample standard deviation s
Random variable
• Probability histogram
• measure of the center: population mean µ
• measure of spread: population standard deviation σ
Example
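x       0      100
p(x)    1/2    1/2
E(x) = 0(1/2) + 100(1/2) = 50
y       49     51
p(y)    1/2    1/2
E(y) = 49(1/2) + 51(1/2) = 50
Both random variables have mean 50, but the values of x are far more spread out than the values of y.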
Variance – measure of spread
Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{1805.703}{34} = 53.1089 \]
The deviations of the outcomes from the
mean of the probability distribution:
x_i - µ (for data, the deviations are X_i - X̄)
σ² (sigma squared) is the variance of the
probability distribution
[the variance is also denoted Var(X)]
Variance – measure of spread
Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{1805.703}{34} = 53.1089 \]
Variance of random variable X:
\[ \sigma^2 \ [\text{or } \mathrm{Var}(X)] = \sum_{i=1}^{k} (x_i - \mu)^2 \, P(X = x_i) \]
Variance Var(X)
x       0      250    500
p(x)    0.67   0.13   0.20
\[ \mathrm{Var}(X) = \sum_{i=1}^{k} (x_i - \mu)^2 \, P(X = x_i) \]
Recall: µ = E(X) = 132.5
Example
\[ \mathrm{Var}(X) = (x_1 - \mu)^2 \, P(X = x_1) + (x_2 - \mu)^2 \, P(X = x_2) + (x_3 - \mu)^2 \, P(X = x_3) \]
\[ = (0 - 132.5)^2 (0.67) + (250 - 132.5)^2 (0.13) + (500 - 132.5)^2 (0.20) = 40{,}568.75 \]
P. 207, Handout 4.1, P. 4
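The variance calculation for the payout distribution can be checked with a short script. This sketch again assumes a value-to-probability dictionary; the function names are illustrative.

```python
def expected_value(dist):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = sum of (x - mu)^2 * P(X = x), with mu = E(X)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

payout = {0: 0.67, 250: 0.13, 500: 0.20}

# Show each term (x - mu)^2 * p(x) before adding them up
mu = expected_value(payout)
for x, p in payout.items():
    print(x, round((x - mu) ** 2 * p, 4))

var_x = variance(payout)
print(round(var_x, 2))
```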
Standard Deviation: of
More Interest than the
Variance
The population standard deviation, denoted by σ or SD(X),
is the positive square root of the population variance Var(X):
\[ \sigma = \sqrt{\sigma^2} \quad \text{or} \quad \mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} \]
σ, or SD(X), is the standard deviation of the
random variable X
Example: σ² = 40,568.75, so
\[ \sigma \ [\text{or } \mathrm{SD}(X)] = \sqrt{40{,}568.75} \approx 201.42 \]
Expected Value of a Random Variable
Example: The probability model for a particular life insurance
policy is shown. Find the expected annual payout on a policy.
We expect that the insurance company will pay out $200 per policy
per year.
© 2010 Pearson Education
Standard Deviation of a Random Variable
Example: The probability model for a particular life insurance
policy is shown. Find the standard deviation of the annual payout.
Rules for E(X), Var(X) and SD(X):
adding a constant a
If X is a rv and a is a constant:
• E(X+a) = E(X)+a
Example: a = -1
• E(X+a)=E(X-1)=E(X)-1
Rules for E(X), Var(X) and SD(X):
adding constant a (cont.)
• Var(X+a) = Var(X)
• SD(X+a) = SD(X)
Example: a = -1
• Var(X+a)=Var(X-1)=Var(X)
• SD(X+a)=SD(X-1)=SD(X)
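The slides state these rules without proof. A short derivation from the definitions of E(X) and Var(X), using only the fact that the probabilities sum to 1, might run as follows:

```latex
\begin{align*}
E(X+a) &= \sum_{i=1}^{k} (x_i + a)\, P(X = x_i)
        = \sum_{i=1}^{k} x_i \, P(X = x_i) + a \sum_{i=1}^{k} P(X = x_i)
        = E(X) + a \\
\mathrm{Var}(X+a) &= \sum_{i=1}^{k} \bigl[(x_i + a) - (\mu + a)\bigr]^2 P(X = x_i)
        = \sum_{i=1}^{k} (x_i - \mu)^2 \, P(X = x_i)
        = \mathrm{Var}(X) \\
\mathrm{SD}(X+a) &= \sqrt{\mathrm{Var}(X+a)} = \sqrt{\mathrm{Var}(X)} = \mathrm{SD}(X)
\end{align*}
```

Adding a constant shifts every value and the mean by the same amount, so the deviations from the mean, and therefore the spread, are unchanged.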
Carolina Panthers Next Season’s Profit
Wins      Profit X ($ Millions)   Probability
≥ 12      10                      0.20
10, 11    5                       0.40
8, 9      1                       0.25
7         -4                      0.15
E(X)=10(0.20) + 5(0.40) + 1(0.25) – 4(0.15)
=3.65
SD(X)=4.4
Wins      Profit X ($ Millions)   Profit X+2 ($ Millions)   Probability
≥ 12      x1 = 10                 x1+2 = 10+2               0.20
10, 11    x2 = 5                  x2+2 = 5+2                0.40
8, 9      x3 = 1                  x3+2 = 1+2                0.25
7         x4 = -4                 x4+2 = -4+2               0.15
E(X + a) = E(X) + a; SD(X + a)=SD(X); let a = 2
[Figure: probability histograms of Profit X (mean 3.65, SD 4.40) and Profit X+2 (mean 5.65, SD 4.40); the second histogram is the first shifted 2 units to the right]
New Expected Value
Long (UNC-CH) way:
E(X+2)=12(.20)+7(.40)+3(.25)+(-2)(.15)
= 5.65
Smart (NCSU) way:
a =2; E(X+2) =E(X) + 2 = 3.65 + 2 = 5.65
New Variance and SD
Long (UNC-CH) way: (compute from
“scratch”)
Var(X+2)=(12-5.65)²(0.20)+…
+(-2-5.65)²(0.15) = 19.3275
SD(X+2) = √19.3275 = 4.40
Smart (NCSU) way:
Var(X+2) = Var(X) = 19.3275
SD(X+2) = SD(X) = 4.40
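Both the "long" and the "smart" routes can be verified numerically. The sketch below recomputes the mean, variance and standard deviation from scratch for X and for X+2, using the Panthers profit table; the helper names are illustrative.

```python
import math

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Profit X in $ millions -> probability (from the slide)
profit = {10: 0.20, 5: 0.40, 1: 0.25, -4: 0.15}

# Shift every profit by a = 2; the probabilities stay the same
shifted = {x + 2: p for x, p in profit.items()}

for name, dist in (("X", profit), ("X+2", shifted)):
    var = variance(dist)
    print(name, round(mean(dist), 2), round(var, 4), round(math.sqrt(var), 2))
```

The mean moves from 3.65 to 5.65 while the variance and standard deviation are identical in both rows, as the rules predict.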
Rules for E(X), Var(X) and SD(X):
multiplying by constant b
• E(bX) = bE(X)
• Var(bX) = b²Var(X)
• SD(bX) = |b|SD(X), where |b| denotes the absolute value of b
Example: b = -1
• E(bX) = E(-X) = -E(X)
• Var(bX) = Var(-1X) = (-1)²Var(X) = Var(X)
• SD(bX) = SD(-1X) = |-1|SD(X) = SD(X)
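As with adding a constant, these rules follow directly from the definitions. One way to write the derivation, with µ = E(X):

```latex
\begin{align*}
E(bX) &= \sum_{i=1}^{k} (b\,x_i)\, P(X = x_i) = b \sum_{i=1}^{k} x_i \, P(X = x_i) = b\,E(X) \\
\mathrm{Var}(bX) &= \sum_{i=1}^{k} (b\,x_i - b\,\mu)^2 \, P(X = x_i)
        = b^2 \sum_{i=1}^{k} (x_i - \mu)^2 \, P(X = x_i) = b^2\,\mathrm{Var}(X) \\
\mathrm{SD}(bX) &= \sqrt{b^2\,\mathrm{Var}(X)} = |b|\,\mathrm{SD}(X)
\end{align*}
```

The absolute value appears because the square root of b² is |b|, which keeps the standard deviation non-negative even when b is negative.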
Expected Value and SD of Linear
Transformation a + bx
Let the random variable X= season field goal shooting percentage
for NCSU women’s bb team. Suppose E(X)= 45.31 and SD(X)=1.67
The relationship between X and points scored per game for a college
women’s team can be described by 14.49 + 1.35X.
What are the mean and standard deviation of the points scored per
game?
Points per game (ppg) = 14.49 + 1.35X
E(ppg) = E(14.49+1.35X) = 14.49+1.35E(X) = 14.49+1.35*45.31
= 14.49+61.1685 = 75.6585
SD(ppg) = SD(14.49+1.35X) = SD(1.35X) = 1.35*SD(X) = 1.35*1.67
= 2.2545
Note that the shift of 14.49 does NOT affect the
standard deviation.
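The two rules combine into a single helper for any linear transformation a + bX. This is a sketch; the function name `linear_transform_summary` is an assumption, and the inputs are the figures given in the example.

```python
def linear_transform_summary(a, b, mean_x, sd_x):
    """Mean and SD of a + bX, given E(X) and SD(X).

    The shift a moves the mean only; the scale b multiplies the mean
    and multiplies the SD by |b|.
    """
    return a + b * mean_x, abs(b) * sd_x

mean_ppg, sd_ppg = linear_transform_summary(14.49, 1.35, 45.31, 1.67)

print(round(mean_ppg, 4))
print(round(sd_ppg, 4))
```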
Addition and Subtraction Rules
for Random Variables
• E(X+Y) = E(X) + E(Y)
• E(X-Y) = E(X) - E(Y)
• When X and Y are independent random variables:
1. Var(X+Y) = Var(X) + Var(Y)
2. $\mathrm{SD}(X+Y) = \sqrt{\mathrm{Var}(X) + \mathrm{Var}(Y)}$
   SD’s do not add: SD(X+Y) ≠ SD(X)+SD(Y)
3. Var(X−Y) = Var(X) + Var(Y)
4. $\mathrm{SD}(X-Y) = \sqrt{\mathrm{Var}(X) + \mathrm{Var}(Y)}$
   SD’s do not subtract: SD(X−Y) ≠ SD(X)−SD(Y)
   and SD(X−Y) ≠ SD(X)+SD(Y)
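These rules can be checked on a small case by building the exact distribution of X+Y and X−Y. The sketch below treats the two-point variables from the earlier spread example (X = 0 or 100, Y = 49 or 51, each with probability 1/2) as independent; that independence, and the helper name `combine`, are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y) when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        z = op(x, y)
        out[z] = out.get(z, 0) + px * py   # independence: P(x and y) = P(x) * P(y)
    return out

half = Fraction(1, 2)
X = {0: half, 100: half}
Y = {49: half, 51: half}

total = combine(X, Y, lambda x, y: x + y)
diff = combine(X, Y, lambda x, y: x - y)

print(variance(X), variance(Y))
print(variance(total), variance(diff))
```

Both the sum and the difference have variance Var(X) + Var(Y), while the standard deviations (50 and 1) clearly do not add to the standard deviation of the sum.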
Motivation for
Var(X-Y)=Var(X)+Var(Y)
• Let X = amount automatic dispensing machine puts into your 16 oz drink (say at McD’s)
• A thirsty, broke friend shows up. Let Y = amount you pour into friend’s 8 oz cup
• Let Z = amount left in your cup; Z = ?
• Z = X-Y
• Z has 2 components of variability (the machine's fill X and your pour Y), so
  Var(Z) = Var(X-Y) = Var(X) + Var(Y)
Example: rv’s NOT independent
• X = number of hours a randomly selected student from our class slept between noon yesterday and noon today.
• Y = number of hours the same randomly selected student from our class was awake between noon yesterday and noon today. Y = 24 – X.
• What are the expected value and variance of the total hours that a student is asleep and awake between noon yesterday and noon today?
• Total hours that a student is asleep and awake between noon yesterday and noon today = X+Y
• E(X+Y) = E(X+24-X) = E(24) = 24
• Var(X+Y) = Var(X+24-X) = Var(24) = 0.
• We don't add Var(X) and Var(Y) since X and Y are not independent.
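A small numerical case makes the failure of the addition rule concrete. The slides give no distribution for hours slept, so the one below is purely hypothetical; only the relationship Y = 24 − X comes from the example.

```python
from fractions import Fraction

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Hypothetical distribution of hours slept (not from the slides)
sleep = {6: Fraction(1, 4), 7: Fraction(1, 2), 8: Fraction(1, 4)}

# Hours awake are fully determined by hours slept: Y = 24 - X
awake = {24 - x: p for x, p in sleep.items()}

# Distribution of the total X + Y: every outcome gives x + (24 - x) = 24
total = {}
for x, p in sleep.items():
    t = x + (24 - x)
    total[t] = total.get(t, 0) + p

print(variance(sleep) + variance(awake))   # what the independence rule would give
print(variance(total))                     # the actual variance of X + Y
```

The two printed values differ, which is exactly why the rule Var(X+Y) = Var(X) + Var(Y) is restricted to independent random variables.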
Pythagorean Theorem of Statistics
for Independent X and Y
[Figure: right triangle with legs a = SD(X) and b = SD(Y) and hypotenuse c = SD(X+Y)]
a² + b² = c²
a² = Var(X), b² = Var(Y), c² = Var(X+Y)
Var(X) + Var(Y) = Var(X+Y)
a + b ≠ c
SD(X) + SD(Y) ≠ SD(X+Y)
Pythagorean Theorem of Statistics
for Independent X and Y
[Figure: right triangle with legs SD(X) = 3 and SD(Y) = 4 and hypotenuse SD(X+Y) = 5]
3² + 4² = 5²
Var(X) = 9, Var(Y) = 16, Var(X+Y) = 25
Var(X) + Var(Y) = Var(X+Y): 9 + 16 = 25
3 + 4 ≠ 5
SD(X) + SD(Y) ≠ SD(X+Y)
Example: meal plans
Regular plan: X = daily amount spent
E(X) = $13.50, SD(X) = $7
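Expected value and standard deviation of total spent in
2 consecutive days? (X1 = amount spent on day 1, X2 = amount spent on day 2, treated as independent)
E(X1 + X2) = E(X1) + E(X2) = $13.50 + $13.50 = $27
SD(X1 + X2) ≠ SD(X1) + SD(X2) = $7 + $7 = $14
\[ \mathrm{SD}(X_1 + X_2) = \sqrt{\mathrm{Var}(X_1 + X_2)} = \sqrt{\mathrm{Var}(X_1) + \mathrm{Var}(X_2)}
   = \sqrt{(\$7)^2 + (\$7)^2} = \sqrt{49 + 49} = \sqrt{98} \approx \$9.90 \]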
Example: meal plans (cont.)
Jumbo plan for football players Y=daily
amount spent
E(Y) = $24.75, SD(Y) = $9.50
Amount by which football player’s spending
exceeds regular student spending is Y-X
E(Y-X)=E(Y)–E(X)=$24.75-$13.50=$11.25
SD(Y − X) ≠ SD(Y) − SD(X) = $9.50 − $7 = $2.50
\[ \mathrm{SD}(Y - X) = \sqrt{\mathrm{Var}(Y - X)} = \sqrt{\mathrm{Var}(Y) + \mathrm{Var}(X)}
   = \sqrt{(\$9.50)^2 + (\$7)^2} = \sqrt{90.25 + 49} = \sqrt{139.25} \approx \$11.80 \]
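Both meal-plan calculations use the same step: square the standard deviations, add, and take the square root. A minimal sketch, assuming independence as the rules require; the function name is illustrative.

```python
import math

def sd_of_sum_or_difference(sd_a, sd_b):
    """SD of A + B or A - B for independent A and B: variances add in both cases."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2)

# Regular plan, total over 2 consecutive days
two_day_mean = 13.50 + 13.50
two_day_sd = sd_of_sum_or_difference(7, 7)

# Jumbo plan minus regular plan
excess_mean = 24.75 - 13.50
excess_sd = sd_of_sum_or_difference(9.50, 7)

print(two_day_mean, round(two_day_sd, 2))
print(excess_mean, round(excess_sd, 2))
```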
For random variables, X+X≠2X
• Let X be the annual payout on a life insurance policy. From mortality tables E(X)=$200 and SD(X)=$3,867.
1) If the payout amounts are doubled, what are the new expected value and standard deviation?
• Double payout is 2X. E(2X)=2E(X)=2*$200=$400
• SD(2X)=2SD(X)=2*$3,867=$7,734
2) Suppose insurance policies are sold to 2 people. The annual payouts are X1 and X2. Assume the 2 people behave independently. What are the expected value and standard deviation of the total payout?
• E(X1 + X2)=E(X1) + E(X2) = $200 + $200 = $400
\[ \mathrm{SD}(X_1 + X_2) = \sqrt{\mathrm{Var}(X_1 + X_2)} = \sqrt{\mathrm{Var}(X_1) + \mathrm{Var}(X_2)}
   = \sqrt{(3867)^2 + (3867)^2} = \sqrt{14{,}953{,}689 + 14{,}953{,}689}
   = \sqrt{29{,}907{,}378} \approx \$5{,}468.76 \]
The risk to the insurance co. when doubling the payout (2X) is not the same as the risk when selling policies to 2 people.
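The contrast between doubling one policy and selling two independent policies can be laid out side by side. This sketch uses only the figures from the example; the variable names are illustrative.

```python
import math

mean_x, sd_x = 200, 3867   # E(X) and SD(X) from the mortality tables

# 1) One policy with doubled payout: 2X
doubled_mean = 2 * mean_x
doubled_sd = 2 * sd_x

# 2) Two independent policies: X1 + X2 (variances add, SDs do not)
two_policy_mean = mean_x + mean_x
two_policy_sd = math.sqrt(sd_x ** 2 + sd_x ** 2)

print(doubled_mean, doubled_sd)
print(two_policy_mean, round(two_policy_sd, 2))
```

The expected payouts agree, but the standard deviation for two independent policies is smaller by a factor of √2, which is the sense in which X+X ≠ 2X.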