Statistics 510: Notes 9
Reading: Sections 4.3-4.5
Next week’s homework will be e-mailed and posted on the
web site www-stat-wharton.upenn.edu/~dsmall/stat510-f05
by tonight.
I. Review
A random variable is a real-valued function whose domain
is the sample space S.
In Chapter 4, we focus on discrete random variables,
random variables that can take on a finite or at most
countably infinite number of values.
Associated with each discrete random variable X is a
probability mass function (pmf) p ( a ) that gives the
probability that X equals a:
$p(a) = P\{X = a\} = P(\{s \in S \mid X(s) = a\}).$
Example 1: Suppose two fair dice are tossed. Let X be the
random variable that is the sum of the two upturned faces.
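The pmf of X can be tabulated by enumerating the 36 equally likely outcomes; here is a minimal Python sketch of that count (the variable names are mine, not from the notes):

from fractions import Fraction
from collections import Counter

# Count how many of the 36 equally likely (die 1, die 2) outcomes give each sum.
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))

# p(a) = P{X = a} = (number of outcomes with sum a) / 36
pmf = {a: Fraction(n, 36) for a, n in sorted(counts.items())}
print(pmf)  # p(2) = 1/36, p(3) = 2/36, ..., p(7) = 6/36, ..., p(12) = 1/36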
Example 2: Three balls are to be randomly selected without
replacement from an urn containing balls numbered 1
through 20. Let X denote the largest number selected.
X is a random variable taking on values 3, 4, ..., 20. Since
we select the balls randomly, each of the
 20 
  combinations of the balls is equally likely to be
3 
chosen. The probability mass function is
 i  1


2 

P{ X  i} 
, i  3, , 20
20
 
 
3 
This equation follows because the number of selections that
result in the event $\{X = i\}$ is just the number of selections
that result in the ball numbered i and two of the balls
numbered 1 through i-1 being chosen.
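As a check, the formula agrees with a brute-force enumeration of all $\binom{20}{3}$ selections (a Python sketch, using only the standard library):

from itertools import combinations
from math import comb
from fractions import Fraction

total = comb(20, 3)  # 1140 equally likely selections
for i in range(3, 21):
    # Selections whose largest ball is i: ball i plus 2 of the balls 1..i-1.
    count = sum(1 for c in combinations(range(1, 21), 3) if max(c) == i)
    assert Fraction(count, total) == Fraction(comb(i - 1, 2), total)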
II. Cumulative Distribution Function
For a random variable X, the function F defined by
$$F(x) = P\{X \le x\}, \qquad -\infty < x < \infty,$$
is called the cumulative distribution function (CDF).
The pmf can be determined uniquely from the cdf and vice
versa.
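To illustrate the correspondence, the following sketch builds the cdf of Example 1 from its pmf and then recovers the pmf as the sizes of the jumps of F (my own helper code, not from the notes):

from fractions import Fraction
from itertools import accumulate

# pmf of Example 1 (sum of two fair dice): p(a) = (6 - |a - 7|)/36 for a = 2..12
pmf = {a: Fraction(6 - abs(a - 7), 36) for a in range(2, 13)}

values = sorted(pmf)
cdf = dict(zip(values, accumulate(pmf[a] for a in values)))  # F(a) = P{X <= a}

# Recover the pmf from the cdf: p(a) is the jump of F at a.
recovered = {a: cdf[a] - cdf.get(a - 1, Fraction(0)) for a in values}
assert recovered == pmf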
III. Expected Value
Probability mass functions provide a global overview of a
random variable’s behavior. That much detail, though, is
not always necessary, or even helpful. Often we want to
condense the information contained in the pmf by
summarizing certain of its features with single numbers.
The first feature of a pmf that we will examine is central
tendency, a term referring to the “average” value of a
random variable.
The most frequently used measure for describing central
tendency is the expected value. For a discrete random
variable, the expected value of a random variable X is a
weighted average of the possible values X can take on, each
value being weighted by the probability that X assumes it:
$$E[X] = \sum_{x:\, p(x) > 0} x\, p(x).$$
Example 1 continued: The expected value of the random
variable X is
$E[X] = 2(1/36) + 3(2/36) + 4(3/36) + 5(4/36) + 6(5/36) + 7(6/36)$
$\quad + 8(5/36) + 9(4/36) + 10(3/36) + 11(2/36) + 12(1/36) = 7.$
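The same weighted average in code, as a quick check (a sketch):

from fractions import Fraction

pmf = {a: Fraction(6 - abs(a - 7), 36) for a in range(2, 13)}  # dice-sum pmf
print(sum(a * p for a, p in pmf.items()))  # E[X] = 7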
Example 2 continued: The expected value of the random
variable X is
$$E(X) = \sum_{i=3}^{20} i \binom{i-1}{2} \bigg/ \binom{20}{3} = 15.75.$$
Frequency motivation for expected value:
Another motivation for the definition of the expected value
is provided by the frequency interpretation of probabilities.
The frequency interpretation assumes that if an infinite
sequence of independent replications of an experiment is
performed, then for any event E, the proportion of times E
occurs will be P(E). Now consider a random variable X
that takes on values $x_1, \ldots, x_n$ with probabilities
$p(x_1), \ldots, p(x_n)$. Over many repetitions of the
experiment, the value $x_i$ will occur in approximately a
proportion $p(x_i)$ of them, so the mean value of X will be
approximately
$$\sum_{i=1}^{n} x_i\, p(x_i) = E[X].$$
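A small simulation illustrates the frequency interpretation for Example 1 (a sketch; the number of replications, 10**6, is an arbitrary choice of mine):

import random

random.seed(1)  # for reproducibility
n = 10**6
# Average the sum of two dice over n independent replications.
mean = sum(random.randint(1, 6) + random.randint(1, 6) for _ in range(n)) / n
print(mean)  # close to E[X] = 7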
IV. Expectation of a Function of a Random Variable
(Section 4.4)
Suppose we are given a discrete random variable X along
with its pmf and that we want to compute the expected
value of some function of X, say g(X).
One approach is to directly determine the pmf of g(X).
Example 3: Let X denote a random variable that takes on
the values -1, 0, 1 with respective probabilities
P{X=-1}=.2, P{X=0}=.5, P{X=1}=.3
Compute $E(X^2)$.
Although the procedure we used in Example 3 will always
enable us to compute the expected value of g(X) from
knowledge of the pmf of X, there is another way of thinking
about $E[g(X)]$. Noting that g(X) will equal g(x) whenever
X is equal to x, it seems reasonable that $E[g(X)]$ should
just be a weighted average of the values g(x) with g(x)
being weighted by the probability that X is equal to x.
Proposition 4.1: If X is a discrete random variable that
takes on one of the values $x_i$, $i \ge 1$, with respective
probabilities $p(x_i)$, then for any real-valued function g,
$$E[g(X)] = \sum_i g(x_i)\, p(x_i).$$
Applying the proposition to Example 3,
$E(X^2) = (-1)^2(.2) + 0^2(.5) + 1^2(.3) = .5$
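Both routes give the same answer, as a short sketch confirms (the intermediate pmf of $Y = X^2$ is the one the direct approach of Example 3 would produce):

from fractions import Fraction

pmf = {-1: Fraction(1, 5), 0: Fraction(1, 2), 1: Fraction(3, 10)}

# Route 1: first find the pmf of Y = X^2, then take its expectation.
pmf_y = {}
for x, p in pmf.items():
    pmf_y[x**2] = pmf_y.get(x**2, Fraction(0)) + p  # P{Y=1} = .2 + .3, P{Y=0} = .5
e1 = sum(y * p for y, p in pmf_y.items())

# Route 2: Proposition 4.1, without finding the pmf of Y first.
e2 = sum(x**2 * p for x, p in pmf.items())
print(e1, e2)  # both 1/2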
Proof:
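A sketch of the standard argument: group the terms of the sum according to the value of g,
$$\sum_i g(x_i)\, p(x_i) = \sum_y \; \sum_{i:\, g(x_i) = y} y\, p(x_i) = \sum_y y \sum_{i:\, g(x_i) = y} p(x_i) = \sum_y y\, P\{g(X) = y\} = E[g(X)].$$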
A corollary of Proposition 4.1 is:
Corollary 4.1: If a and b are constants, then
$$E[aX + b] = aE[X] + b.$$
Proof:
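A sketch, applying Proposition 4.1 with $g(x) = ax + b$ and using the fact that the probabilities sum to 1:
$$E[aX + b] = \sum_x (ax + b)\, p(x) = a \sum_x x\, p(x) + b \sum_x p(x) = aE[X] + b.$$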
V. Variance (Section 4.5)
Another useful summary of a random variable’s probability
mass function besides its central tendency is its “spread.”
This is a very important concept in finance, where investors
want not only investments with good returns (a high expected
return) but also investments that are not too risky (a low
spread).
Example 4: The following three random variables have
expected value 0 but very different spreads:
W=0 with probability 1
Y=-1 with probability .5, 1 with probability .5
Z=-100 with probability .5, 100 with probability .5
A commonly used measure of spread is the variance of a
random variable, which is the expected squared deviation
of the random variable from its expected value.
The variance of a random variable X is defined by
$$\mathrm{Var}(X) = E[(X - E[X])^2].$$
An alternative formula for variance is
$$\mathrm{Var}(X) = E(X^2) - (E[X])^2.$$
Proof:
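A sketch, writing $\mu = E[X]$ and expanding the square via Proposition 4.1:
$$\mathrm{Var}(X) = \sum_x (x - \mu)^2 p(x) = \sum_x x^2 p(x) - 2\mu \sum_x x\, p(x) + \mu^2 \sum_x p(x) = E[X^2] - 2\mu^2 + \mu^2 = E(X^2) - (E[X])^2.$$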
Example 4 continued: Compute $\mathrm{Var}(W)$, $\mathrm{Var}(Y)$, $\mathrm{Var}(Z)$.
Example 1 continued: Compute $\mathrm{Var}(X)$.
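The computations fit in a few lines of Python (a sketch; E and var are helper names of my own choosing):

from fractions import Fraction

def E(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    mu = E(pmf)
    return sum((x - mu)**2 * p for x, p in pmf.items())

W = {0: Fraction(1)}
Y = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
Z = {-100: Fraction(1, 2), 100: Fraction(1, 2)}
dice = {a: Fraction(6 - abs(a - 7), 36) for a in range(2, 13)}  # Example 1

print(var(W), var(Y), var(Z))  # 0, 1, 10000
print(var(dice))               # 35/6, about 5.83

Note how W, Y, and Z share the expected value 0 but have very different variances, which captures the difference in risk described above.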
Notes on Variance:
(1) A useful identity is that for any constants a and b,
$$\mathrm{Var}(aX + b) = a^2\, \mathrm{Var}(X).$$
Proof:
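A sketch, writing $\mu = E[X]$ and using Corollary 4.1 to get $E[aX + b] = a\mu + b$:
$$\mathrm{Var}(aX + b) = E[(aX + b - (a\mu + b))^2] = E[a^2 (X - \mu)^2] = a^2 E[(X - \mu)^2] = a^2\, \mathrm{Var}(X).$$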
(2) The square root of $\mathrm{Var}(X)$ is called the standard
deviation of X, and we denote it by $\mathrm{SD}(X)$.
(3) Another measure of spread is:
Mean Absolute Deviation$(X) = E[\,|X - E(X)|\,]$
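For comparison, a sketch computing both spread measures for Y and Z from Example 4 (helper names are mine):

from fractions import Fraction
from math import sqrt

def E(pmf):
    return sum(x * p for x, p in pmf.items())

def sd(pmf):
    mu = E(pmf)
    return sqrt(sum((x - mu)**2 * p for x, p in pmf.items()))

def mad(pmf):
    mu = E(pmf)
    return sum(abs(x - mu) * p for x, p in pmf.items())

Y = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
Z = {-100: Fraction(1, 2), 100: Fraction(1, 2)}
print(sd(Y), mad(Y))  # 1.0 and 1
print(sd(Z), mad(Z))  # 100.0 and 100 (the two measures coincide for these symmetric two-point pmfs)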