Chap 5
Sums of Random Variables and Long-Term Averages
• Many problems involve counting the number of occurrences of events or computing arithmetic averages over a series of measurements.
• These problems can be reduced to the problem of finding the distribution of a random variable that is the sum of n i.i.d. random variables.
5.1 Sums of Random Variables
Let $X_1, X_2, \ldots, X_n$ be a sequence of random variables, and let
$$S_n = X_1 + X_2 + \cdots + X_n.$$
$$E[S_n] = E[X_1] + E[X_2] + \cdots + E[X_n],$$
regardless of statistical dependence.
$$\mathrm{VAR}(S_n) = E\big[(S_n - E[S_n])^2\big] = E\Big[\Big(\sum_{i=1}^{n} (X_i - E[X_i])\Big)^2\Big]$$
$$= E\Big[\sum_{j=1}^{n} (X_j - E[X_j]) \sum_{k=1}^{n} (X_k - E[X_k])\Big] = \sum_{j=1}^{n} \sum_{k=1}^{n} E\big[(X_j - E[X_j])(X_k - E[X_k])\big]$$
$$= \sum_{k=1}^{n} \mathrm{VAR}(X_k) + \sum_{j=1}^{n} \sum_{k=1,\, k \neq j}^{n} \mathrm{COV}(X_j, X_k)$$
In general, $\mathrm{COV}(X_j, X_k) \neq 0$, so $\mathrm{VAR}\big(\sum_i X_i\big) \neq \sum_i \mathrm{VAR}(X_i)$.
If $X_1, X_2, \ldots, X_n$ are indep., $\mathrm{COV}(X_j, X_k) = 0$ for $j \neq k$, and
$$\mathrm{VAR}(S_n) = \mathrm{VAR}\Big(\sum_{i=1}^{n} X_i\Big) = \sum_{i=1}^{n} \mathrm{VAR}(X_i).$$
Ex. 5.2 Find the mean and variance of the sum of n independent, identically distributed (iid) random variables, each with mean $\mu$ and variance $\sigma^2$.
$$E[S_n] = n\mu$$
$$\mathrm{VAR}(S_n) = n\sigma^2$$
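As a quick check of Ex. 5.2, the sketch below builds the exact pmf of a sum of iid integer-valued random variables by repeated convolution and compares its mean and variance with $n\mu$ and $n\sigma^2$. The fair six-sided die and the helper names (`convolve`, `sum_pmf`) are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction

def convolve(p, q):
    """pmf of X + Y for independent integer-valued X ~ p and Y ~ q."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, Fraction(0)) + px * qy
    return out

def sum_pmf(pmf, n):
    """pmf of S_n = X_1 + ... + X_n for n iid copies of X ~ pmf."""
    result = {0: Fraction(1)}  # pmf of the empty sum
    for _ in range(n):
        result = convolve(result, pmf)
    return result

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

# Hypothetical example: X is a fair die, n = 3.
die = {k: Fraction(1, 6) for k in range(1, 7)}
n = 3
s3 = sum_pmf(die, n)
print(mean(s3), n * mean(die))          # both equal n * mu
print(variance(s3), n * variance(die))  # both equal n * sigma^2
```

Exact fractions are used so that the two sides can be compared without rounding error.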
The characteristic function of Sn
Let $X_1, X_2, \ldots, X_n$ be independent random variables, and
$$S_n = X_1 + X_2 + \cdots + X_n.$$
$$\Phi_{S_n}(\omega) = E[e^{j\omega S_n}] = E[e^{j\omega (X_1 + X_2 + \cdots + X_n)}] = E[e^{j\omega X_1}] \cdots E[e^{j\omega X_n}] = \Phi_{X_1}(\omega) \cdots \Phi_{X_n}(\omega)$$
The pdf of $S_n$ can be found by
$$f_{S_n}(s) = \mathcal{F}^{-1}\{\Phi_{X_1}(\omega) \cdots \Phi_{X_n}(\omega)\}.$$
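Ex. 5.3 Sum of n independent Gaussian r.v.s with parameters $m_i$ and $\sigma_i^2$.
$$\Phi_{X_i}(\omega) = e^{j\omega m_i - \omega^2 \sigma_i^2 / 2}$$
$$\Phi_{S_n}(\omega) = \prod_{i=1}^{n} \Phi_{X_i}(\omega) = e^{j\omega (m_1 + \cdots + m_n) - \omega^2 (\sigma_1^2 + \cdots + \sigma_n^2)/2}$$
so $S_n$ is Gaussian with mean $m_1 + \cdots + m_n$ and variance $\sigma_1^2 + \cdots + \sigma_n^2$.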
Ex. 5.5 Sum of n iid exponential r.v.s with parameter $\alpha$.
$$\Phi_X(\omega) = \frac{\alpha}{\alpha - j\omega}$$
$$\Phi_{S_n}(\omega) = \left(\frac{\alpha}{\alpha - j\omega}\right)^n$$
p.102 n-Erlang
If the $X_i$'s are integer-valued r.v.s, it is preferable to use the probability generating function (z-transform):
$$G_N(z) = E[z^N]$$
If $N = X_1 + \cdots + X_n$, with $X_1, \ldots, X_n$ independent,
$$G_N(z) = E[z^{X_1 + \cdots + X_n}] = E[z^{X_1}] \cdots E[z^{X_n}] = G_{X_1}(z) \cdots G_{X_n}(z)$$
Ex. Find the generating function for a sum of n iid geometrically distributed r.v.s.
$$G_X(z) = \frac{pz}{1 - qz}$$
$$G_N(z) = \left(\frac{pz}{1 - qz}\right)^n$$
p.100, negative binomial
Sum of a random number of random variables
$$S_N = \sum_{k=1}^{N} X_k$$
The $X_k$ are i.i.d.; N is a r.v., independent of the $X_k$'s.
$$E[S_N] = E\big[E[S_N \mid N]\big] = E\big[N E[X]\big] = E[N]\,E[X]$$
(since $E[S_N \mid N = n] = E[X_1 + \cdots + X_n] = nE[X]$)
$$E[e^{j\omega S_N} \mid N = n] = E[e^{j\omega (X_1 + \cdots + X_n)}] = \Phi_X(\omega)^n, \quad \text{so} \quad E[e^{j\omega S_N} \mid N] = \Phi_X(\omega)^N$$
$$\Phi_{S_N}(\omega) = E\big[E[e^{j\omega S_N} \mid N]\big] = E[\Phi_X(\omega)^N] = E[z^N]\big|_{z = \Phi_X(\omega)} = G_N(\Phi_X(\omega))$$
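The identity $E[S_N] = E[N]E[X]$ can be checked exactly for a small discrete case. The sketch below mixes the pmfs of $S_n$ over the distribution of N; the pmf chosen for N, the fair-die $X_k$, and the helper names are hypothetical choices made only for illustration.

```python
from fractions import Fraction

def convolve(p, q):
    """pmf of X + Y for independent integer-valued X ~ p and Y ~ q."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, Fraction(0)) + px * qy
    return out

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def random_sum_pmf(pmf_n, pmf_x):
    """pmf of S_N = X_1 + ... + X_N with N ~ pmf_n independent of the X_k."""
    out = {}
    s_n = {0: Fraction(1)}  # pmf of S_0 (empty sum)
    for n in range(max(pmf_n) + 1):
        weight = pmf_n.get(n, Fraction(0))  # P[N = n]
        for s, p in s_n.items():
            out[s] = out.get(s, Fraction(0)) + weight * p
        s_n = convolve(s_n, pmf_x)  # advance to the pmf of S_{n+1}
    return out

# Assumed example: N takes values 1, 2, 3; each X_k is a fair die.
pmf_n = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
die = {k: Fraction(1, 6) for k in range(1, 7)}
s = random_sum_pmf(pmf_n, die)
print(mean(s), mean(pmf_n) * mean(die))  # the two values agree
```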
Ex. 5.7 The number of jobs N submitted to a computer in an hour is a geometric random variable with parameter p, and the job execution times are independent exponentially distributed random variables with mean $1/\alpha$. Find the pdf of the sum of the execution times of the jobs submitted in an hour.
$G_N(z) =$
$\Phi_X(\omega) =$
$\Phi_{S_N}(\omega) =$
$f_{S_N}(x) =$
5.2 Sample Mean and Laws of Large Numbers
Let X be a random variable for which the mean, $E[X] = \mu$, is unknown.
Let $X_1, X_2, \ldots, X_n$ denote independent, repeated measurements of X, i.e., the $X_j$'s are iid random variables.
The sample mean of the sequence,
$$M_n = \frac{1}{n}(X_1 + X_2 + \cdots + X_n),$$
can be used to estimate $E[X]$.
$M_n$ itself is a r.v.
$$E[M_n] = E\Big[\frac{1}{n} \sum_{j=1}^{n} X_j\Big] = \frac{1}{n} \sum_{j=1}^{n} E[X_j] = \mu$$
$M_n$ is an unbiased estimator for $\mu$.
E ( M n   ) 2   E ( M n  E  M n ) 2 

mean square error of M n
1
Since M n  S n
n

Variance of M n
1
n 2  2
VAR( M n )  2 VAR( Sn )  2 
n
n
n
( 0 as
n  )
Using Chebyshev inequality
P[ M n  E  M n    ] 
2
P[ M n     ] 
n 2
or
VAR ( M n )
2
2
P[ M n     ]  1 
n 2
10
Ex. 5.9 Voltage measurement $X_j = v + N_j$, where v is the desired voltage and $N_j$ is the noise voltage with mean zero and standard deviation 1 μV. Assume that the noise voltages are independent random variables. How many measurements are required so that the probability that $M_n$ is within $\varepsilon = 1$ μV of the true mean is at least 0.99?
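One way to work Ex. 5.9 is to solve the Chebyshev bound $1 - \sigma^2/(n\varepsilon^2) \geq 1 - \alpha$ for n, which gives $n \geq \sigma^2/(\varepsilon^2 \alpha)$. The sketch below is a minimal illustration of that step; the function name `chebyshev_sample_size` is hypothetical, and exact fractions are used only to avoid rounding surprises at the boundary.

```python
import math
from fractions import Fraction

def chebyshev_sample_size(variance, eps, confidence):
    """Smallest n with 1 - variance / (n * eps^2) >= confidence (Chebyshev bound)."""
    # Convert through str() so that decimal inputs such as 0.99 become exact fractions.
    var, e, c = (Fraction(str(v)) for v in (variance, eps, confidence))
    return math.ceil(var / (e * e * (1 - c)))

# Ex. 5.9: sigma = 1 uV (variance 1), eps = 1 uV, confidence 0.99.
n_required = chebyshev_sample_size(1, 1, 0.99)
print(n_required)
```

The Chebyshev bound holds for any distribution with finite variance, so the resulting n is typically conservative.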
Weak Law of Large Numbers
Let $X_1, X_2, \ldots$ be a sequence of iid random variables with finite mean $E[X] = \mu$; then for $\varepsilon > 0$,
$$\lim_{n \to \infty} P[|M_n - \mu| < \varepsilon] = 1$$
Fig. 5.1
The sample mean will be close to the true mean with high probability.
Strong Law of Large Numbers
Let $X_1, X_2, \ldots$ be a sequence of iid random variables with finite mean $E[X] = \mu$ and finite variance; then
$$P\big[\lim_{n \to \infty} M_n = \mu\big] = 1$$
With probability 1, every sequence of sample mean calculations will eventually approach and stay close to $E[X]$.
Ex. 5.10 In order to estimate the probability of an event A, a sequence of Bernoulli trials is carried out and the relative frequency of A is observed. How large should n be in order to have a 0.95 probability that the relative frequency is within 0.01 of $p = P[A]$?
5.3 The Central Limit Theorem
Let $X_1, X_2, \ldots$ be a sequence of iid random variables with finite mean $\mu$ and finite variance $\sigma^2$, and let
$$S_n = X_1 + X_2 + \cdots + X_n.$$
In Sec. 5.1, we learned how to find the exact pdf of $S_n$.
CLT: as n becomes large, the cdf of the properly normalized $S_n$ approaches that of a Gaussian.
Let $S_n$ be the sum of n iid r.v.s with finite mean $E[X] = \mu$ and finite variance $\sigma^2$.
Let $Z_n$ be the zero-mean, unit-variance r.v. defined by
$$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}$$
then
$$\lim_{n \to \infty} P[Z_n \leq z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-x^2/2}\, dx$$
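pf:
$$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}} = \frac{1}{\sigma\sqrt{n}} \sum_{k=1}^{n} (X_k - \mu)$$
$$\Phi_{Z_n}(\omega) = E[e^{j\omega Z_n}] = E\Big[\exp\Big\{\frac{j\omega}{\sigma\sqrt{n}} \sum_{k=1}^{n} (X_k - \mu)\Big\}\Big] = E\Big[\prod_{k=1}^{n} e^{j\omega (X_k - \mu)/(\sigma\sqrt{n})}\Big]$$
$$= \prod_{k=1}^{n} E\big[e^{j\omega (X_k - \mu)/(\sigma\sqrt{n})}\big] = \Big(E\big[e^{j\omega (X - \mu)/(\sigma\sqrt{n})}\big]\Big)^n$$
Expanding the exponential:
$$E\big[e^{j\omega (X - \mu)/(\sigma\sqrt{n})}\big] = E\Big[1 + \frac{j\omega}{\sigma\sqrt{n}}(X - \mu) + \frac{(j\omega)^2}{2!\, n\sigma^2}(X - \mu)^2 + R(\omega, X)\Big]$$
$$= 1 + \frac{j\omega}{\sigma\sqrt{n}} E[X - \mu] + \frac{(j\omega)^2}{2!\, n\sigma^2} E[(X - \mu)^2] + E[R(\omega, X)]$$
$$= 1 - \frac{\omega^2}{2n} + E[R(\omega, X)]$$
As $n \to \infty$, $E[R(\omega, X)]$ can be neglected relative to $\omega^2/(2n)$, so
$$\lim_{n \to \infty} \Phi_{Z_n}(\omega) = \lim_{n \to \infty} \Big(1 - \frac{\omega^2}{2n}\Big)^n = e^{-\omega^2/2},$$
the characteristic function of a zero-mean, unit-variance Gaussian r.v.
Fig. 5.2-5.4 show the approximation.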
Ex. 5.11 Suppose that orders at a restaurant are iid random variables with mean $\mu = \$8$ and standard deviation $\sigma = \$2$. Estimate the probability that the first 100 customers spend a total of more than $840.
Using Gaussian approximation:
Ex. 5.12 In Ex. 5.11, after how many orders can we be 90% sure that the total spending by all customers is more than $1000?
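A possible numerical treatment of Ex. 5.11 and Ex. 5.12 treats $S_n$ as Gaussian with mean $n\mu$ and standard deviation $\sigma\sqrt{n}$, as the CLT suggests. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `prob_total_exceeds` and the brute-force search over n are illustrative choices, not the slides' method.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 8.0, 2.0  # mean and standard deviation of one order, in dollars

def prob_total_exceeds(n, total):
    """Gaussian (CLT) approximation to P[S_n > total]."""
    s_n = NormalDist(mu=n * MU, sigma=SIGMA * sqrt(n))
    return 1.0 - s_n.cdf(total)

# Ex. 5.11: P[S_100 > 840] = Q((840 - 800) / 20) = Q(2)
p_840 = prob_total_exceeds(100, 840.0)

# Ex. 5.12: smallest n with P[S_n > 1000] >= 0.90
n_orders = 1
while prob_total_exceeds(n_orders, 1000.0) < 0.90:
    n_orders += 1

print(round(p_840, 4), n_orders)
```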
Ex. 5.14 In order to estimate the probability of an event A, a sequence of Bernoulli trials is carried out and the relative frequency of A is observed. How large should n be in order to have a 0.95 probability that the relative frequency is within 0.01 of $p = P[A]$? (Using the Gaussian approximation for the binomial.)
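Under the Gaussian approximation, $P[|f_n - p| < \varepsilon] \approx 1 - 2Q\big(\varepsilon\sqrt{n}/\sqrt{p(1-p)}\big)$, and since $p(1-p) \leq 1/4$ it suffices to take $2\varepsilon\sqrt{n} \geq z_{\alpha/2}$. The sketch below computes the resulting n for Ex. 5.14; the function name `bernoulli_sample_size` is hypothetical, and the worst-case bound $p(1-p) \leq 1/4$ is an assumption made because p is unknown.

```python
import math
from statistics import NormalDist

def bernoulli_sample_size(eps, confidence):
    """Smallest n with 2 * eps * sqrt(n) >= z_{alpha/2}, using p(1 - p) <= 1/4."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}, about 1.96 for 95%
    return math.ceil((z / (2.0 * eps)) ** 2)

n_gauss = bernoulli_sample_size(0.01, 0.95)
print(n_gauss)
```

For comparison, the Chebyshev bound applied to the same problem (Ex. 5.10) requires $n \geq 1/(4\varepsilon^2\alpha) = 50{,}000$, so the Gaussian approximation gives a much smaller sample size.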
5.4 Confidence Intervals
The sample mean estimator $M_n$ provides a single numerical value for the estimate of $E[X] = \mu$:
$$M_n = \frac{1}{n} \sum_{j=1}^{n} X_j$$
In order to know how good the estimate provided by $M_n$ is, we can compute the sample variance, which is the average dispersion about $M_n$:
$$V_n^2 = \frac{1}{n-1} \sum_{j=1}^{n} (X_j - M_n)^2$$
$$E[V_n^2] = \sigma^2$$
If $V_n^2$ is small, the $X_j$'s are tightly clustered about $M_n$, and we can be confident that $M_n$ is close to $E[X]$.
Another way of specifying the accuracy and confidence of an estimate:
Find an interval $[l(\mathbf{X}), u(\mathbf{X})]$ such that
$$P[l(\mathbf{X}) \leq \mu \leq u(\mathbf{X})] = 1 - \alpha$$
Such an interval is a $(1-\alpha) \times 100\%$ confidence interval.
$1 - \alpha$ is called the confidence level.
The probability $1 - \alpha$ is a measure of the degree of confidence.
The width of the confidence interval is a measure of accuracy.
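Case 1. $X_j$'s Gaussian with unknown mean $\mu$ and known variance $\sigma^2$.
$M_n$ is Gaussian with mean $\mu$ and variance $\sigma^2/n$.
$$P\Big[-z \leq \frac{M_n - \mu}{\sigma/\sqrt{n}} \leq z\Big] = 1 - 2Q(z)$$
$$P\Big[M_n - \frac{z\sigma}{\sqrt{n}} \leq \mu \leq M_n + \frac{z\sigma}{\sqrt{n}}\Big] = 1 - 2Q(z)$$
Here $l(\mathbf{X}) = M_n - z\sigma/\sqrt{n}$ and $u(\mathbf{X}) = M_n + z\sigma/\sqrt{n}$.
Choose a $z = z_{\alpha/2}$ such that $\alpha = 2Q(z_{\alpha/2})$; then
$$\Big(M_n - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\; M_n + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\Big)$$
is a $(1-\alpha) \times 100\%$ confidence interval for $\mu$.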
Table 5.1
$1-\alpha$    $z_{\alpha/2}$
0.90          1.645
0.95          1.960
0.99          2.576
Ex. 5.15 A voltage X is given by $X = v + N$, where v is an unknown constant voltage and N is a random noise voltage that has a Gaussian pdf with zero mean and variance 1 V$^2$. Find the 95% confidence interval for v if the voltage X is measured 100 independent times and the sample mean is found to be 5.25 V.
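Case 2: $X_j$'s Gaussian; mean and variance unknown.
Use the sample variance as a replacement for the variance; the confidence interval becomes
$$\Big(M_n - \frac{zV_n}{\sqrt{n}},\; M_n + \frac{zV_n}{\sqrt{n}}\Big)$$
$$P\Big[-z \leq \frac{M_n - \mu}{V_n/\sqrt{n}} \leq z\Big] = P\Big[M_n - \frac{zV_n}{\sqrt{n}} \leq \mu \leq M_n + \frac{zV_n}{\sqrt{n}}\Big]$$
$$W = \frac{M_n - \mu}{V_n/\sqrt{n}} = \frac{\sqrt{n}(M_n - \mu)}{V_n} = \frac{(M_n - \mu)/(\sigma/\sqrt{n})}{\big\{[(n-1)V_n^2/\sigma^2]/(n-1)\big\}^{1/2}}$$
Numerator $(M_n - \mu)/(\sigma/\sqrt{n})$: zero-mean, unit-variance Gaussian.
$(n-1)V_n^2/\sigma^2$ in the denominator: chi-square r.v. with n-1 degrees of freedom.
The two are independent.
W has a Student's t-distribution with n-1 degrees of freedom.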
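(Ex. 4.38)
$$f_{n-1}(y) = \frac{\Gamma(n/2)}{\Gamma((n-1)/2)\sqrt{\pi(n-1)}} \Big(1 + \frac{y^2}{n-1}\Big)^{-n/2}$$
$$P\Big[M_n - \frac{zV_n}{\sqrt{n}} \leq \mu \leq M_n + \frac{zV_n}{\sqrt{n}}\Big] = \int_{-z}^{z} f_{n-1}(y)\, dy = 1 - 2F_{n-1}(-z)$$
where $F_{n-1}$ is the cdf of W.
Choose a $z = z_{\alpha/2,\,n-1}$ such that $\alpha = 2F_{n-1}(-z_{\alpha/2,\,n-1})$; then
$$\Big(M_n - z_{\alpha/2,\,n-1} \frac{V_n}{\sqrt{n}},\; M_n + z_{\alpha/2,\,n-1} \frac{V_n}{\sqrt{n}}\Big)$$
is a $(1-\alpha) \times 100\%$ confidence interval for $\mu$.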
Table 5.2  Values of $z_{\alpha/2,\,n-1}$
n-1    $1-\alpha$ = .90    .95       .99
1      6.314               12.706    63.657
2      2.920               4.303     9.925
3      2.353               3.182     5.841
4      2.132               2.776     4.604
5      2.015               2.571     4.032
6      1.943               2.447     3.707
7      1.895               2.365     3.499
Ex. 5.16 The lifetime of a certain device is assumed to have a Gaussian distribution. Eight devices are tested and the sample mean and sample variance for the lifetime obtained are 10 days and 4 days$^2$. Find the 99% confidence interval for the mean lifetime.
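Ex. 5.16 falls under Case 2, so the interval is $M_n \pm z_{\alpha/2,\,n-1} V_n/\sqrt{n}$ with the t-value read from Table 5.2. The sketch below hard-codes that table as a dictionary; the names `T_TABLE` and `student_t_ci` are hypothetical, and only the degrees of freedom listed in Table 5.2 are supported.

```python
from math import sqrt

# Table 5.2: z_{alpha/2, n-1} for n-1 = 1..7, indexed by confidence level.
T_TABLE = {
    0.90: [6.314, 2.920, 2.353, 2.132, 2.015, 1.943, 1.895],
    0.95: [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365],
    0.99: [63.657, 9.925, 5.841, 4.604, 4.032, 3.707, 3.499],
}

def student_t_ci(sample_mean, sample_var, n, confidence):
    """Confidence interval for the mean when the variance is estimated by V_n^2."""
    z = T_TABLE[confidence][n - 2]  # row for n-1 degrees of freedom
    half_width = z * sqrt(sample_var / n)
    return sample_mean - half_width, sample_mean + half_width

# Ex. 5.16: n = 8 devices, sample mean 10 days, sample variance 4 days^2, 99% confidence.
lo, hi = student_t_ci(10.0, 4.0, 8, 0.99)
print(f"({lo:.2f} days, {hi:.2f} days)")
```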
Case 3: $X_j$'s non-Gaussian; mean and variance unknown.
Use the method of batch means:
perform a series of M independent experiments, in each of which the sample mean (from a large number of observations) is computed.
Ex. 5.17 A computer simulation program generates exponentially distributed random variables of unknown mean. Two hundred samples of these random variables are generated and grouped into 10 batches of 20 samples each. The sample means of the 10 batches are:
1.04 0.64 0.80 0.75 1.12
1.30 0.98 0.64 1.39 1.26
Find the 90% confidence interval for the mean of the r.v.
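For Ex. 5.17, the batch means can be treated as approximately Gaussian samples and the Case 2 interval applied to them with M - 1 = 9 degrees of freedom. The sketch below does this with the standard-library `statistics` module. The t-value 1.833 (90% level, 9 degrees of freedom) is taken from a standard t-table and is an assumption here, since Table 5.2 as excerpted stops at 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Batch means from Ex. 5.17 (10 batches of 20 samples each).
batch_means = [1.04, 0.64, 0.80, 0.75, 1.12, 1.30, 0.98, 0.64, 1.39, 1.26]

# Assumed value: z_{alpha/2, 9} for a 90% interval, from a standard t-table.
T_90_9DOF = 1.833

m = mean(batch_means)        # sample mean of the batch means
v2 = variance(batch_means)   # sample variance with the 1/(M-1) normalization
half_width = T_90_9DOF * sqrt(v2 / len(batch_means))
lo, hi = m - half_width, m + half_width
print(f"mean = {m:.3f}, 90% CI = ({lo:.3f}, {hi:.3f})")
```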
5.5 Convergence of Sequences of Random Variables
In Section 5.2, we discussed the convergence of the sequence of arithmetic averages $M_n$ of iid random variables to the expected value $\mu$:
$$M_n \to \mu \quad \text{as } n \to \infty.$$
In this section we consider the more general situation where a sequence of random variables (usually not iid) $X_1, X_2, \ldots$ converges to some random variable X:
$$X_n \to X \quad \text{as } n \to \infty.$$
A sequence of random variables $\mathbf{X}$ is a function that assigns a countably infinite number of real values to each outcome $\zeta$ from some sample space S:
$$\mathbf{X}(\zeta) = (X_1(\zeta), X_2(\zeta), \ldots, X_n(\zeta), \ldots),$$
that is, a sequence of functions of $\zeta$.
- We sometimes use $\{X_n(\zeta)\}$ or $\{X_n\}$ to denote $\mathbf{X}(\zeta)$.
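Ex. 5.18 $V_n(\zeta) = \zeta\big(1 - \frac{1}{n}\big)$, $\zeta$ in $S = [0, 1]$.
[Figure: $V_n(\zeta)$ plotted versus $\zeta$, a sequence of functions of $\zeta$, e.g. $V_2(\zeta) = \frac{1}{2}\zeta$ and $V_3(\zeta) = \frac{2}{3}\zeta$; and $V_n(\zeta)$ plotted versus n for a given $\zeta$, a sequence of real numbers $\frac{1}{2}\zeta, \frac{2}{3}\zeta, \frac{3}{4}\zeta, \frac{4}{5}\zeta, \ldots$]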
The sequence $x_n$ converges to x if, given any $\varepsilon > 0$, we can specify an integer N such that for all values of n beyond N we can guarantee that $|x_n - x| < \varepsilon$.
[Figure: $x_n$ plotted versus n, staying within a band of width $2\varepsilon$ about x for n beyond N.]
If the limit x is unknown, we can use the Cauchy criterion:
The sequence $x_n$ converges if and only if, given $\varepsilon > 0$, we can specify an integer N' such that for m, n greater than N', $|x_n - x_m| < \varepsilon$.
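Sure Convergence: The sequence of random variables $\{X_n(\zeta)\}$ converges surely to the random variable $X(\zeta)$ if the sequence of functions $X_n(\zeta)$ converges to the function $X(\zeta)$ as $n \to \infty$ for all $\zeta$ in S:
$$X_n(\zeta) \to X(\zeta) \quad \text{as } n \to \infty \text{ for all } \zeta \text{ in } S.$$
Almost-Sure Convergence:
$$X_n(\zeta) \to X(\zeta) \quad \text{as } n \to \infty$$
for all $\zeta$ in S, except possibly on a set of probability zero; that is,
$$P[\zeta : X_n(\zeta) \to X(\zeta) \text{ as } n \to \infty] = 1.$$
[Figure: $x_n$ plotted versus n, eventually staying within a band of width $2\varepsilon$ about x.]
Ex: Strong Law of Large Numbers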
Ex. 5.20 Let $\zeta$ be selected at random from the interval $S = [0, 1]$, where we assume that the probability that $\zeta$ is in a subinterval of S is equal to the length of the subinterval. Define the following five sequences of random variables:
$$U_n(\zeta) = \frac{\zeta}{n}$$
$$V_n(\zeta) = \zeta\Big(1 - \frac{1}{n}\Big)$$
$$W_n(\zeta) = \zeta e^{n}$$
$$Y_n(\zeta) = \cos 2\pi n \zeta$$
$$Z_n(\zeta) = e^{-n(n\zeta - 1)}$$
Which of these sequences converge surely? Almost surely?
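Tabulating the five sequences of Ex. 5.20 at a few sample points can help build intuition before arguing convergence analytically. The sketch below is only a numerical illustration, not a proof; the sample values of $\zeta$ and n are arbitrary choices.

```python
import math

# The five sequences of Ex. 5.20 as functions of (n, zeta).
def U(n, zeta): return zeta / n
def V(n, zeta): return zeta * (1 - 1 / n)
def W(n, zeta): return zeta * math.exp(n)
def Y(n, zeta): return math.cos(2 * math.pi * n * zeta)
def Z(n, zeta): return math.exp(-n * (n * zeta - 1))

for zeta in (0.0, 0.3, 1.0):  # arbitrary sample outcomes in S = [0, 1]
    print(f"zeta = {zeta}")
    for n in (1, 5, 25):
        values = [f(n, zeta) for f in (U, V, W, Y, Z)]
        print("  n =", n, " ".join(f"{v: .4g}" for v in values))
```

The output is consistent with $U_n \to 0$ and $V_n \to \zeta$ for every $\zeta$, with $W_n$ growing without bound for $\zeta > 0$, and with $Z_n \to 0$ for $\zeta > 0$ while $Z_n(0) = e^{n}$ grows, so the single outcome $\zeta = 0$ separates sure from almost-sure convergence for $Z_n$.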
Ex. 5.21 Let the sequence of random variables $X_n(\zeta)$ consist of independent equiprobable Bernoulli random variables,
$$P[X_n(\zeta) = 0] = \frac{1}{2} = P[X_n(\zeta) = 1].$$
Does this sequence of random variables converge?
Ex. 5.22 An urn contains 2 black balls and 2 white balls. At time n a ball is selected at random from the urn, and the color is noted. If the number of balls of this color is greater than the number of balls of the other color, then the ball is put back in the urn; otherwise, the ball is left out. Let $X_n(\zeta)$ be the number of black balls in the urn after the nth draw. Does this sequence of random variables converge?
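The urn of Ex. 5.22 is easy to simulate, which gives a feel for the behavior of sample paths before reasoning about convergence. The sketch below is a hypothetical simulation under the rules stated in the exercise; the function name `simulate_urn`, the number of steps, and the seeds are arbitrary.

```python
import random

def simulate_urn(steps, rng):
    """Return the number of black balls after each draw, following Ex. 5.22."""
    black, white = 2, 2
    history = []
    for _ in range(steps):
        drew_black = rng.random() < black / (black + white)
        if drew_black:
            if black <= white:  # black is not in the majority: ball is left out
                black -= 1
        else:
            if white <= black:  # white is not in the majority: ball is left out
                white -= 1
        history.append(black)
    return history

# Final value of X_n on 200 independent sample paths.
finals = [simulate_urn(500, random.Random(seed))[-1] for seed in range(200)]
print(sorted(set(finals)), finals.count(0), finals.count(2))
```

In this simulation each path settles at either 0 or 2 black balls and then stays there, which suggests that each sample path converges but that the limit is itself random.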
Mean-Square Convergence
$$E\big[(X_n(\zeta) - X(\zeta))^2\big] \to 0 \quad \text{as } n \to \infty$$
Ex. 5.23 Does the sequence $V_n(\zeta)$ converge in the mean square sense?
$$V_n(\zeta) = \zeta\Big(1 - \frac{1}{n}\Big)$$
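A short worked check for Ex. 5.23, assuming $\zeta$ is uniformly distributed on $[0, 1]$ (as in Ex. 5.20) and taking $V(\zeta) = \zeta$ as the candidate limit:

```latex
E\big[(V_n(\zeta) - \zeta)^2\big]
  = E\Big[\Big(\frac{\zeta}{n}\Big)^2\Big]
  = \frac{1}{n^2} \int_0^1 \zeta^2 \, d\zeta
  = \frac{1}{3n^2} \to 0 \quad \text{as } n \to \infty
```

Under these assumptions the mean square error vanishes, so $V_n$ converges to $\zeta$ in the mean square sense.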
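Convergence in Probability
For any $\varepsilon > 0$,
$$P\big[|X_n(\zeta) - X(\zeta)| > \varepsilon\big] \to 0 \quad \text{as } n \to \infty$$
[Figure: $x_n$ plotted versus n with a band of width $2\varepsilon$ about x; $n_0$ is marked on the n axis.]
Ex: weak law of large numbers.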
Ex. 5.24 Does $Z_n(\zeta)$ converge in the mean square sense?
$$Z_n(\zeta) = e^{-n(n\zeta - 1)}$$
Convergence in Distribution: The sequence of random variables $\{X_n\}$ with cumulative distribution functions $\{F_n(x)\}$ converges in distribution to the random variable X with cumulative distribution function $F(x)$ if
$$F_n(x) \to F(x) \quad \text{as } n \to \infty$$
for all x at which $F(x)$ is continuous.
Ex. Central limit theorem
Ex. 5.21: Bernoulli iid sequence
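[Figure: diagram relating the modes of convergence: sure (s), almost sure (a.s.), mean square (m.s.), in probability (prob), and in distribution (dist).]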