Review
ELEC 2811
1. Elementary Probability
Definitions:
o Random experiment: each experiment ends in an outcome that cannot be
determined with certainty before the performance of the experiment.
o Sample space S: a collection of every possible outcome from a random
experiment.
o Event: a subset of the sample space S, i.e. a collection of outcomes from a
random experiment.
o Probability: a function P that assigns each event A in a sample space S a
number P(A), called the probability of the event A, such that the following
properties are satisfied:
i) $P(A) \ge 0$;
ii) $P(S) = 1$;
iii) If $A_1, A_2, \ldots, A_k$ are mutually exclusive events, i.e. $A_i \cap A_j = \emptyset$, $i \ne j$, then
$P(A_1 \cup A_2 \cup \cdots \cup A_k) = P(A_1) + P(A_2) + \cdots + P(A_k)$
for any positive integer k.
Properties:
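For each event A, $P(A) = 1 - P(A')$, where $A'$ is the complement of A.
$P(\emptyset) = 0$.
If $A \subset B$, then $P(A) \le P(B)$.
For each event A, $P(A) \le 1$.
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(A \cap C) + P(A \cap B \cap C)$.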
1.2 Counting Sample Points
Multiplication Rule (with replacement): If an operation can be performed in $n_1$
ways, and if for each of these a second operation can be performed in $n_2$ ways, then
the two operations can be performed together in $n_1 n_2$ ways.
Without replacement (ordered): If only r positions are to be filled with objects
selected from n different objects, $r \le n$, the number of possible ordered arrangements
is called a permutation:
${}_nP_r = n(n-1)(n-2) \cdots (n-r+1) = \frac{n!}{(n-r)!}$.
Without replacement (unordered): The number of ways in which r objects can be
selected without replacement from n objects, when the order of selection is
disregarded, is
${}_nC_r = \frac{n!}{r!\,(n-r)!}$,
a combination of the n objects taken r at a time.
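The two counting formulas can be checked numerically. The following is a minimal Python sketch; the values n = 5 and r = 3 are illustrative assumptions, not taken from the notes.

```python
import math

n, r = 5, 3  # illustrative values (assumption, not from the notes)

# Ordered selection without replacement: nPr = n! / (n - r)!
n_p_r = math.factorial(n) // math.factorial(n - r)

# Unordered selection without replacement: nCr = n! / (r! (n - r)!)
n_c_r = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(n_p_r, n_c_r)  # 60 10
```

Each unordered selection of r objects corresponds to r! ordered arrangements, which is why nPr = r! nCr.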
1.3 Conditional Probability
The conditional probability of an event A, given that event B has occurred, is
defined by
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, provided that $P(B) > 0$.
Therefore, the probability that two events, A and B, both occur is given by the
multiplication rule
$P(A \cap B) = P(B \mid A)\,P(A) = P(A \mid B)\,P(B)$.
Two events A and B are independent if and only if
$P(A \cap B) = P(A)\,P(B)$;
otherwise, A and B are called dependent events.
Theorem of Total Probability
Let A be an event, and let $B_1, B_2, \ldots, B_n$ be mutually exclusive events of nonzero
probability whose union is the sample space S. Then
$P(A) = P(B_1)P(A \mid B_1) + P(B_2)P(A \mid B_2) + \cdots + P(B_n)P(A \mid B_n)$.
Now if we are interested in determining the value of $P(B_k \mid A)$, then, provided $P(A) > 0$,
$P(B_k \mid A) = \frac{P(B_k \cap A)}{P(A)} = \frac{P(B_k)P(A \mid B_k)}{\sum_{i=1}^{n} P(B_i)P(A \mid B_i)}$, $k = 1, 2, \ldots, n$.
This is the famous Bayes' Formula or Bayes' Theorem.
Example:
Bowl B1 contains two red and four white balls; bowl B2 contains one
red and two white; and bowl B3 contains five red and four white. Suppose that the
probabilities for selecting the bowls are given by P(B1)=1/3, P(B2)=1/6, P(B3)=1/2.
The experiment consists of selecting a bowl at random and then drawing a ball at
random from that bowl. Find the probability of drawing from B1 given that the ball
drawn is a red ball, i.e. P(B1|R).
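The notes leave this example unsolved. One way to work it, sketched in Python with exact fractions, is to apply the theorem of total probability to get P(R) and then Bayes' formula. The dictionary names are illustrative, and the conditional probabilities P(R | B_i) are read off from the ball counts, assuming each ball in a bowl is equally likely to be drawn.

```python
from fractions import Fraction as Fr

# Prior probabilities of selecting each bowl (given in the example).
prior = {"B1": Fr(1, 3), "B2": Fr(1, 6), "B3": Fr(1, 2)}

# P(R | bowl) = red balls / total balls in that bowl.
p_red_given = {"B1": Fr(2, 6), "B2": Fr(1, 3), "B3": Fr(5, 9)}

# Theorem of total probability: P(R) = sum_i P(B_i) P(R | B_i).
p_red = sum(prior[b] * p_red_given[b] for b in prior)

# Bayes' formula: P(B1 | R) = P(B1) P(R | B1) / P(R).
posterior_b1 = prior["B1"] * p_red_given["B1"] / p_red

print(p_red, posterior_b1)  # 4/9 1/4
```

Under these assumptions P(R) = 1/9 + 1/18 + 5/18 = 4/9, so P(B1|R) = (1/9)/(4/9) = 1/4, which is smaller than the prior P(B1) = 1/3 because B1 has a lower share of red balls than B3.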
2. Random Variables & Their Distributions
2.1 Random Variable
Definition: Given a random experiment with sample space S, a function X that
assigns to each element s in S one and only one real number X(s) = x is called a
random variable (r.v.).
Example:
Let sample space S = {F, M}. Let X be a function defined on S such
that X(F) = 0 and X(M)=1. Thus X is a random variable.
For notational purposes we shall denote the event $\{s : s \in S \text{ and } X(s) = a\}$ by $\{X = a\}$,
and we write:
$P(X = a) = P(\{s : s \in S \text{ and } X(s) = a\})$
$P(a < X < b) = P(\{s : s \in S \text{ and } a < X(s) < b\})$
2.2 Random variable of the discrete type
Let X denote a random variable with space R, a subset of the real numbers. Suppose that
the space R contains a countable number of points. Such a set R is called a discrete
sample space. The random variable X is then called a random variable of the discrete type,
and X is said to have a distribution of the discrete type.
For a random variable X of the discrete type, the function f(x) = P(X = x) is called the
probability density function (p.d.f.), or the probability mass function.
Since $f(x) = P(X = x)$, $x \in R$, f(x) satisfies the properties:
(a) $f(x) > 0$, $x \in R$; and $f(x) = 0$ when $x \notin R$;
(b) $\sum_{x \in R} f(x) = 1$;
(c) $P(X \in A) = \sum_{x \in A} f(x)$, where $A \subset R$.
The cumulative probability density function $F(x) = P(X \le x)$ is called the distribution
function of the discrete-type r.v. X.
Properties of a distribution function F(x):
(a) $0 \le F(x) \le 1$.
(b) F(x) is a non-decreasing function of x.
If X is a random variable of the discrete type, then F(x) is a step function, and the
height of the step at x, $x \in R$, equals the probability $P(X = x)$.
Example:
Find the probability density function of the number of heads when a
fair coin is tossed three times.
Let X equal the number of heads.
Then the space of X is R = {0, 1, 2, 3}, and the p.d.f. of X is given by:
x      0     1     2     3
f(x)   1/8   3/8   3/8   1/8
and hence the distribution function F(x) of X is:
x      0     1     2     3
F(x)   1/8   4/8   7/8   1
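The two tables can be reproduced by enumerating the sample space. The sketch below assumes a fair coin, so that all eight outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

# All 2^3 equally likely outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# X(s) = number of heads in outcome s; count how many outcomes give each x.
counts = Counter(s.count("H") for s in outcomes)

xs = sorted(counts)                                    # space R = [0, 1, 2, 3]
f = [Fraction(counts[x], len(outcomes)) for x in xs]   # p.d.f. f(x) = P(X = x)
F = list(accumulate(f))                                # F(x) = P(X <= x)

for x, fx, Fx in zip(xs, f, F):
    print(x, fx, Fx)
```

The running sum makes the step-function property visible: F jumps by exactly f(x) at each point x of the space R.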
2.3 Random variable of the continuous type
Let X denote a random variable with space R, an interval or union of intervals. Such a
set R is called a continuous sample space. The random variable X is called a r.v. of the
continuous type, and X is said to have a distribution of the continuous type.
The probability density function (p.d.f.) of a r.v. X of the continuous type is an
integrable function f(x), $x \in R$, that satisfies the properties:
(a) $f(x) \ge 0$, $x \in R$;
(b) $\int_R f(x)\,dx = 1$;
(c) The probability of the event $X \in A$ is
$P(X \in A) = \int_A f(x)\,dx$, where $A \subset R$.
Example:
Let the r.v. X be the distance in feet between bad records on a used
computer tape. Suppose that a reasonable probability model for X is given by the
p.d.f.
$f(x) = \frac{1}{40} e^{-x/40}$, $0 \le x < \infty$.
Then,
$\int_R f(x)\,dx = \int_0^{\infty} \frac{1}{40} e^{-x/40}\,dx = 1$.
And,
$P(X > 40) = \int_{40}^{\infty} f(x)\,dx = \int_{40}^{\infty} \frac{1}{40} e^{-x/40}\,dx = e^{-1} = 0.368$.
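Both integrals in this example can be checked numerically. The sketch below uses a simple midpoint rule; the helper `midpoint_integral` and the truncation point `upper` are illustrative assumptions (the infinite upper limit is replaced by a large finite value where the remaining tail is negligible).

```python
import math

def f(x, theta=40.0):
    """p.d.f. from the example: (1/40) e^{-x/40} for x >= 0."""
    return math.exp(-x / theta) / theta

def midpoint_integral(func, a, b, steps=100_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

upper = 2000.0  # assumption: stands in for infinity; the tail beyond is ~e^{-50}

total = midpoint_integral(f, 0.0, upper)   # should be close to 1
tail = midpoint_integral(f, 40.0, upper)   # should be close to e^{-1}

print(round(total, 4), round(tail, 4))
```

The closed form is simpler here, since P(X > x) = e^{-x/40}, but the numerical version applies equally to densities with no elementary antiderivative.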
The cumulative probability density function
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
is called the distribution function of the continuous-type r.v. X.
The same two properties of a distribution function F(x) hold as in the discrete case:
(1) $0 \le F(x) \le 1$, because F(x) is a probability.
(2) F(x) is a non-decreasing function of x.
3. Mathematical Expectation
3.1 Expected Value
Definitions:
o If f(x) is the p.d.f. of the r.v. X of the discrete type with space R and if
$\sum_{x \in R} u(x) f(x)$
exists, then the sum is called the expected value of the function u(X), and is
denoted by E[u(X)].
o If f(x) is the p.d.f. of the r.v. X of the continuous type with space R and if
$\int_R u(x) f(x)\,dx$
exists, then the integral is called the expected value of the function u(X), and is
denoted by E[u(X)].
Example:
Assume X has the p.d.f. f(x) = x/10, x = 1, 2, 3, 4. Then
$E(1) = \sum_{x=1}^{4} (1)\left(\frac{x}{10}\right) = (1)\left(\frac{1}{10}\right) + (1)\left(\frac{2}{10}\right) + (1)\left(\frac{3}{10}\right) + (1)\left(\frac{4}{10}\right) = 1$,
$E(X) = \sum_{x=1}^{4} x\left(\frac{x}{10}\right) = (1)\left(\frac{1}{10}\right) + (2)\left(\frac{2}{10}\right) + (3)\left(\frac{3}{10}\right) + (4)\left(\frac{4}{10}\right) = 3$,
$E(X^2) = \sum_{x=1}^{4} x^2\left(\frac{x}{10}\right) = (1)^2\left(\frac{1}{10}\right) + (2)^2\left(\frac{2}{10}\right) + (3)^2\left(\frac{3}{10}\right) + (4)^2\left(\frac{4}{10}\right) = 10$.
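The three expectations in this example all follow the same pattern, a sum of u(x) f(x) over the space R. A minimal Python sketch of that pattern, with an illustrative helper named `expect`, is:

```python
from fractions import Fraction

space = [1, 2, 3, 4]  # space R of X

def f(x):
    """p.d.f. from the example: f(x) = x/10."""
    return Fraction(x, 10)

def expect(u):
    """E[u(X)] = sum over x in R of u(x) f(x)."""
    return sum(u(x) * f(x) for x in space)

e_one = expect(lambda x: 1)      # E(1)
e_x = expect(lambda x: x)        # E(X)
e_x2 = expect(lambda x: x ** 2)  # E(X^2)

print(e_one, e_x, e_x2)  # 1 3 10
```

E(1) = 1 is simply the statement that the p.d.f. sums to one over R.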
Properties:
a) E(c) = c.
b) E [ c u(X) ] = cE [ u(X) ].
c) E [ c1u1(X)+ c2 u2(X)] = c1 E [u1(X)] + c2 E [u2(X)].
3.2 The Mean and the Variance
Definitions:
o If X is a r.v. with p.d.f. f(x) of the discrete type and space R, then
$\mu = E(X) = \sum_{R} x f(x)$
is the weighted average of the numbers belonging to R, where the weights are
given by the p.d.f. f(x). E(X) is called the mean of X and is denoted by $\mu$.
o If X is a r.v. with p.d.f. f(x) of the continuous type and with space R, then the
mean of X is defined by
$\mu = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$.
o If $u(x) = (x - \mu)^2$ and $E[(X - \mu)^2]$ exists, the variance of a r.v. X is defined by
$\sigma^2 = E[(X - \mu)^2]$, and is denoted by $\sigma^2$ or Var(X).
o If X is of discrete type,
$\sigma^2 = E[(X - \mu)^2] = \sum_{R} (x - \mu)^2 f(x)$.
o If X is of continuous type,
$\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$.
o The positive square root of the variance is called the standard deviation of
X and is denoted by
$\sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{E[(X - \mu)^2]}$.
Example:
Let X have the p.d.f.:
f(x) = 0.125, when x = 0, 3,
f(x) = 0.375, when x = 1, 2.
Then the mean, variance and standard deviation are:
$\mu = E(X) = 0(0.125) + 1(0.375) + 2(0.375) + 3(0.125) = 1.5$,
$\sigma^2 = E[(X - \mu)^2] = (-1.5)^2(0.125) + (-0.5)^2(0.375) + (0.5)^2(0.375) + (1.5)^2(0.125) = 0.75$,
$\sigma = \sqrt{E[(X - \mu)^2]} = \sqrt{0.75} \approx 0.866$.
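The arithmetic in this example is easy to get wrong by hand, so it is worth recomputing directly from the definitions. The following Python sketch evaluates the mean, variance and standard deviation from the p.d.f. It also checks the variance against the identity Var(X) = E(X^2) - mu^2, which is a standard result not stated in the notes.

```python
import math

# p.d.f. from the example, stored as {x: f(x)}.
pdf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

mean = sum(x * p for x, p in pdf.items())
variance = sum((x - mean) ** 2 * p for x, p in pdf.items())
std_dev = math.sqrt(variance)

# Cross-check: Var(X) = E(X^2) - mu^2.
second_moment = sum(x ** 2 * p for x, p in pdf.items())
variance_shortcut = second_moment - mean ** 2

print(mean, variance, round(std_dev, 3))  # 1.5 0.75 0.866
```

This p.d.f. is that of the number of heads in three tosses of a fair coin, so the result also agrees with the binomial formulas np = 1.5 and np(1-p) = 0.75.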

4. Discrete Probability Distributions
4.1 Bernoulli distribution
Consider a random experiment, the outcome of which can be classified in but one of
two possible ways, say, success or failure (e.g. female or male, life or death, defective
or non-defective). Let X be a r.v. such that
X(success) = 1 and X(failure) = 0.
Let P(X = 1) = p and P(X = 0) = 1 - p.
That is, the p.d.f. of X is
$f(x) = p^x (1-p)^{1-x}$, x = 0, 1.
Then X has a Bernoulli distribution.
4.2 Binomial Distributions
Example:
A fair die is cast four independent times. Call the outcome a success if
a six is rolled, all other outcomes being considered failures. Find the probability of
having (0,0,1,0) as the outcome.
In a sequence of Bernoulli trials, we are often interested in the total number of
successes and not in the order of their occurrence. Assume that the probabilities of
success and failure on each trial are, respectively, p and q = 1 - p. Since the trials are
independent, the probability of any one particular sequence containing k successes and
n - k failures is $p^k (1-p)^{n-k}$, and there are ${}_nC_k$ such sequences.
Let the r.v. Y denote the number of successes in the n trials. Then the p.d.f. of Y is:
$g(k) = P(Y = k) = {}_nC_k\, p^k (1-p)^{n-k}$, k = 0, 1, 2, 3, …, n.
These probabilities are called binomial probabilities, and the r.v. Y has a binomial
distribution, denoted by b(n, p).
Properties:
(1) $\mu = E(Y) = np$,
(2) $\sigma^2 = \mathrm{Var}(Y) = np(1-p) = npq$.
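As a hedged illustration using the die example from this section (n = 4 casts, success probability p = 1/6), the Python sketch below computes the probability of the single ordered outcome (0,0,1,0), the binomial probability of exactly one six, and the mean and variance obtained by summing over the p.d.f. The function name `binomial_pdf` is illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pdf(k, n, p):
    """P(Y = k) for Y ~ b(n, p): nCk p^k (1-p)^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 4, Fraction(1, 6)

# One particular ordering with a single success, e.g. (0, 0, 1, 0).
p_sequence = (1 - p) ** 3 * p

# Exactly one success in any position: 4C1 = 4 such orderings.
p_one_six = binomial_pdf(1, n, p)

# Mean and variance computed from the p.d.f. itself.
mean = sum(k * binomial_pdf(k, n, p) for k in range(n + 1))
variance = sum((k - mean) ** 2 * binomial_pdf(k, n, p) for k in range(n + 1))

print(p_sequence, p_one_six, mean, variance)  # 125/1296 125/324 2/3 5/9
```

The probability of exactly one six is four times that of the specific sequence, and the summed mean and variance match np = 2/3 and npq = 5/9.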
4.2 Negative Binomial and Geometric Distributions
Suppose now that we do not fix the number of Bernoulli trials in advance but instead
continue to observe the sequence of Bernoulli trials until k successes occur. The
random variable of interest, X, is the number of trials needed to observe the kth success.
X then has a Negative Binomial distribution, and its p.d.f. is of the form
$f(x) = P(X = x) = {}_{x-1}C_{k-1}\, p^k (1-p)^{x-k}$, x = k, k+1, ...
However, if we are interested in a random variable X which denotes the number of the
trial on which the first success occurs, then X has a Geometric distribution and its
p.d.f. is of the form
$f(k) = P(X = k) = (1-p)^{k-1} p$, k = 1, 2, 3, ...
Example:
Some biology students were checking the eye color for a large number
of fruit flies. For the individual fly, suppose that the probability of white eyes is 1/4
and the probability of red eyes is 3/4, and that we may treat these flies as independent
Bernoulli trials. The probability that at least four flies have to be checked to observe a
white-eyed fly is
$P(X \ge 4) = P(X > 3) = (1-p)^3 = (3/4)^3 = 0.422$.
The probability that exactly four flies must be observed to see one white eye is
$P(X = 4) = (1-p)^{4-1} p = (3/4)^3 (1/4) = 0.105$.
Properties (geometric distribution):
(1) $\mu = E(X) = 1/p$, and
(2) $\sigma^2 = \mathrm{Var}(X) = \frac{1-p}{p^2} = \frac{q}{p^2}$.
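The fruit-fly probabilities can be reproduced from the p.d.f.s of this section. The Python sketch below uses illustrative function names and p = 1/4 from the example. It also shows that the negative binomial p.d.f. with k = 1 reduces to the geometric p.d.f.

```python
from math import comb

def negative_binomial_pdf(x, k, p):
    """P(the kth success occurs on trial x), x = k, k+1, ..."""
    return comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)

def geometric_pdf(x, p):
    """P(the first success occurs on trial x), x = 1, 2, 3, ..."""
    return (1 - p) ** (x - 1) * p

p = 0.25  # probability of white eyes, from the example

# P(X >= 4) = 1 - P(X <= 3)
p_at_least_4 = 1 - sum(geometric_pdf(x, p) for x in range(1, 4))
p_exactly_4 = geometric_pdf(4, p)

print(round(p_at_least_4, 3), round(p_exactly_4, 3))  # 0.422 0.105
```

Summing the p.d.f. gives the same value as the shortcut (1-p)^3, since needing at least four trials means the first three were all failures.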
4.3 Poisson Distributions
Some experiments result in counting the number of times particular events occur in
given times, e.g., the number of phone calls arriving at a switchboard between 9 and
10 a.m., the number of customers that arrive at a ticket window between 12 noon and
2 p.m., or the number of flaws in 100 feet of wire.
If the following are satisfied for parameter $\lambda > 0$:
(1) The numbers of changes occurring in non-overlapping intervals are independent.
(2) The probability of exactly one change in a sufficiently short interval of length h
is approximately $\lambda h$.
(3) The probability of two or more changes in a sufficiently short interval is
essentially zero.
then the r.v. X, the number of changes in an interval of unit length, has a Poisson
distribution and its p.d.f. is of the form
$f(x) = \frac{\lambda^x e^{-\lambda}}{x!}$, x = 0, 1, 2, ...
Properties:
(1) $\mu = E(X) = \lambda$, and
(2) $\sigma^2 = \mathrm{Var}(X) = \lambda$.
Example:
Telephone calls enter a college switchboard on the average of two
every 3 minutes. If one assumes an approximate Poisson process, what is the
probability of five or more calls arriving in a 9-minute period?
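The notes pose this question without a solution. One way to work it, sketched in Python, is to note that an average of two calls every 3 minutes corresponds to a mean of 6 calls in 9 minutes, and then to use the complement P(X >= 5) = 1 - P(X <= 4). The function name `poisson_pdf` is illustrative.

```python
import math

def poisson_pdf(x, lam):
    """P(X = x) for a Poisson r.v. with mean lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Two calls per 3 minutes scaled to a 9-minute period.
lam = 2 * (9 / 3)

# P(X >= 5) = 1 - [P(0) + P(1) + P(2) + P(3) + P(4)]
p_five_or_more = 1 - sum(poisson_pdf(x, lam) for x in range(5))

print(round(p_five_or_more, 3))  # 0.715
```

Written out, the sum is e^{-6}(1 + 6 + 18 + 36 + 54) = 115 e^{-6}, so the answer is 1 - 115 e^{-6}, about 0.715.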
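In fact, the binomial distribution b(n, p) can be approximated by a Poisson distribution
with $\lambda = np$:
$\frac{(np)^x e^{-np}}{x!} \approx {}_nC_x\, p^x (1-p)^{n-x}$.
This approximation is reasonably good if n is large and p is very close to 0 (or very
close to 1, in which case it is applied to the number of failures).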
i.e.
5. Continuous Probability Distributions
5.1 Uniform Distribution
The r.v. X has a Uniform distribution if its p.d.f. is equal to a constant on the
interval [a, b],
$f(x) = \frac{1}{b-a}$, $a \le x \le b$,
and is denoted by the symbol U(a, b).
Properties:
(1) $\mu = E(X) = (a+b)/2$,
(2) $\sigma^2 = \mathrm{Var}(X) = \frac{(b-a)^2}{12}$.
5.2 Exponential Distribution
The r.v. X has an Exponential distribution if its p.d.f. is defined by
$f(x) = \frac{1}{\theta} e^{-x/\theta}$, $0 \le x < \infty$,
where the parameter $\theta > 0$.
Properties:
1. $\mu = E(X) = \theta$,
2. $\sigma^2 = \mathrm{Var}(X) = \theta^2$.
Example:
Suppose that the life of a certain type of component has an exponential
distribution with a mean life of 500 hours. Suppose that the component has been in
operation for 300 hours. Find the probability that it will last for another 600 hours.
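The notes do not work this example. A possible solution, sketched in Python, reads "last for another 600 hours" as the conditional probability P(X > 900 | X > 300) and uses P(X > x) = e^{-x/500} for an exponential distribution with mean 500. The function name `survival` is illustrative.

```python
import math

theta = 500.0  # mean life in hours, from the example

def survival(x, theta):
    """P(X > x) for an exponential r.v. with mean theta."""
    return math.exp(-x / theta)

# P(X > 900 | X > 300) = P(X > 900) / P(X > 300)
p_conditional = survival(900.0, theta) / survival(300.0, theta)

# For comparison: probability that a new component lasts 600 hours.
p_fresh = survival(600.0, theta)

print(round(p_conditional, 4), round(p_fresh, 4))  # 0.3012 0.3012
```

The two values coincide at e^{-1.2}, about 0.301. This is the memoryless property of the exponential distribution: the 300 hours already survived do not change the distribution of the remaining life.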
5.3 Normal distribution
The r.v. X has a Normal distribution if its p.d.f. is defined by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$, $-\infty < x < \infty$,
where $\mu$ and $\sigma$ are parameters satisfying $-\infty < \mu < \infty$, $0 < \sigma < \infty$, and is denoted by
$N(\mu, \sigma^2)$.
For the Normal distribution, we have:
1. $E(X) = \mu$,
2. $\mathrm{Var}(X) = \sigma^2$, and
3. If X is $N(\mu, \sigma^2)$, then $Z = (X - \mu)/\sigma$ is N(0,1).
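Property 3 is what allows any normal probability to be read from a standard normal table. The Python sketch below illustrates it with `statistics.NormalDist` from the standard library (Python 3.8 or later); the values of mu, sigma and x are illustrative assumptions, not from the notes.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # illustrative parameters (assumption)
x = 120.0                # illustrative point (assumption)

# Standardize: Z = (X - mu) / sigma is N(0, 1).
z = (x - mu) / sigma

# P(X <= x) computed directly from N(mu, sigma^2) ...
p_direct = NormalDist(mu, sigma).cdf(x)

# ... and via the standard normal distribution function at z.
p_standard = NormalDist().cdf(z)

print(round(z, 3), round(p_direct, 4), round(p_standard, 4))
```

Both routes give the same probability, because the event X <= x is the same as the event Z <= (x - mu)/sigma.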
6. Sample Mean and Sample Variance
A random sample, $X_1, X_2, \ldots, X_n$, is drawn from a population. The sample mean is
$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$,
and the sample variance is defined as
$S^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n-1}$.
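A short Python sketch of these two formulas follows, compared against `statistics.mean` and `statistics.variance` from the standard library (the latter also divides by n - 1). The sample values are illustrative assumptions, not data from the notes.

```python
import statistics

sample = [2.0, 4.0, 4.0, 5.0, 7.0]  # illustrative data (assumption)
n = len(sample)

# Sample mean: (X1 + X2 + ... + Xn) / n
x_bar = sum(sample) / n

# Sample variance: sum of squared deviations from x_bar, divided by n - 1
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

print(round(x_bar, 4), round(s2, 4))  # 4.4 3.3
```

Dividing by n - 1 rather than n makes the sample variance an unbiased estimator of the population variance, which is a standard result not derived in these notes.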