CPSC 531: Probability & Statistics:
Review II
Instructor: Anirban Mahanti
Office: ICT 745
Email: [email protected]
Class Location: TRB 101
Lectures: TR 15:30 – 16:45 hours
Class web page:
http://pages.cpsc.ucalgary.ca/~mahanti/teaching/F05/CPSC531
Notes derived from “Probability and Statistics” by
M. DeGroot and M. Schervish, Third edition,
Addison Wesley, 2002, and
“Discrete-event System Simulation” by Banks,
Carson, Nelson, and Nicol, Prentice Hall, 2005.
CPSC 531: Probability Review
Objective and Outline
• The world the model-builder sees is probabilistic rather than deterministic.
  - Some statistical model might well describe the variations.
• An appropriate model can be developed by sampling the phenomenon of interest:
  - Select a known distribution through educated guesses
  - Estimate the parameters
  - Test for goodness of fit
• Goal is to review:
  - Random variables
  - Discrete and continuous random variables
  - Cumulative distribution functions
  - Expectation, variance, etc.
Random Variables
• A random variable is a real-valued function defined on a sample space.
• Suppose that X is a random variable defined on the sample space S. Then X assigns a real number X(s) to each possible outcome s ∈ S.
• Typically, X, Y, Z, etc. denote random variables; x, y, z, etc. denote values attained by random variables.
• Example: Rolling a pair of dice. Let X be the random variable corresponding to the sum of the dice on a roll. If we think of the sample points as pairs (i, j), where i = value rolled by the first die and j = value rolled by the second die, we have:
  X(s) = i + j
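The dice example can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the names `sample_space`, `X`, and `pmf` are chosen here, and the dice are assumed to be fair so that all 36 outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: all ordered pairs (i, j) for two six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

def X(s):
    """Random variable: maps an outcome s = (i, j) to the sum i + j."""
    i, j = s
    return i + j

# Assumption: fair dice, so P(X = x) is the fraction of outcomes mapped to x.
counts = Counter(X(s) for s in sample_space)
pmf = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
```

The mapping X is defined on outcomes, while the probabilities attach to the values X takes; the dictionary `pmf` makes that distinction explicit.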
Discrete Random Variables
• A random variable X is said to be discrete if the number of possible values of X is finite or, at most, countably infinite (an infinite sequence of different values).
• Example: Consider jobs arriving at a job shop.
  - Let X be the number of jobs arriving each week at a job shop.
  - S = possible values of X (range space of X) = {0, 1, 2, …}
  - $p(x_i)$ = probability the random variable is $x_i$ = $P(X = x_i)$
• $p(x_i)$, $i = 1, 2, \ldots$, must satisfy:
  1. $p(x_i) \ge 0$, for all $i$
  2. $\sum_{i=1}^{\infty} p(x_i) = 1$
• The collection of pairs $[x_i, p(x_i)]$, $i = 1, 2, \ldots$, is called the probability distribution of X, and $p(x_i)$ is called the probability mass function (pmf) of X.
• The pmf is referred to as the “probability function” in some texts.
Discrete Random Variables
• Consider a random variable X that takes on values 1, 2, 3, and 4 with probabilities 1/6, 1/3, 1/3, and 1/6, respectively.
[Figure: bar chart of the pmf p(x) versus x for x = 1, 2, 3, 4; the p(x) axis runs from 0.00 to 0.35]
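As a quick check, the two pmf conditions (non-negativity and summing to 1) can be verified for this example. The helper name `is_valid_pmf` is hypothetical; exact fractions are used so the sum is not affected by floating-point rounding.

```python
from fractions import Fraction

# pmf from the example: values 1..4 with probabilities 1/6, 1/3, 1/3, 1/6.
pmf = {
    1: Fraction(1, 6),
    2: Fraction(1, 3),
    3: Fraction(1, 3),
    4: Fraction(1, 6),
}

def is_valid_pmf(p):
    """Check the two pmf conditions: every p(x_i) >= 0 and the p(x_i) sum to 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

print(is_valid_pmf(pmf))
```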
Continuous Random Variables
• X is a continuous random variable if there exists a non-negative function f(x) such that for any set of real numbers $A \subseteq S$:
  $$P(X \in A) = \int_A f(x)\,dx$$
• The probability that X lies in the interval [a, b] is given by:
  $$P(a \le X \le b) = \int_a^b f(x)\,dx$$
• f(x), called the probability density function (pdf) of X, satisfies:
  1. $f(x) \ge 0$, for all $x$ in $S$
  2. $\int_S f(x)\,dx = 1$
  3. $f(x) = 0$, if $x$ is not in $S$
• Properties
  1. $P(X = x_0) = 0$, because $\int_{x_0}^{x_0} f(x)\,dx = 0$
  2. $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$
Continuous Random Variables
• Example: Life of an inspection device is given by X, a continuous random variable with pdf:
  $$f(x) = \begin{cases} \frac{1}{2} e^{-x/2}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
  - X has an exponential distribution with mean 2 years.
  - The probability that the device’s life is between 2 and 3 years is:
  $$P(2 \le X \le 3) = \frac{1}{2} \int_2^3 e^{-x/2}\,dx \approx 0.14$$
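The integral in this example can be checked numerically. The sketch below uses composite Simpson's rule, which is a choice made here for illustration and is not a method from the notes; the function names `pdf` and `integrate` are likewise hypothetical.

```python
import math

def pdf(x):
    """pdf of the device life: (1/2) e^{-x/2} for x >= 0, and 0 otherwise."""
    return 0.5 * math.exp(-x / 2) if x >= 0 else 0.0

def integrate(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

prob = integrate(pdf, 2, 3)
print(round(prob, 3))  # 0.145
```

The numerical value agrees with the closed form e^{-1} - e^{-3/2}, which is about 0.1447.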
Cumulative Distribution Function
• The cumulative distribution function (cdf) of a random variable X is a function F(x), defined for each real number x:
  $$F(x) = P(X \le x), \quad -\infty < x < \infty$$
  - If X is discrete, then
    $$F(x) = \sum_{\text{all } x_i \le x} p(x_i)$$
  - If X is continuous, then
    $$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
• Properties
  1. F is a nondecreasing function. If $a < b$, then $F(a) \le F(b)$
  2. $\lim_{x \to \infty} F(x) = 1$
  3. $\lim_{x \to -\infty} F(x) = 0$
• All probability questions about X can be answered in terms of the cdf, e.g.:
  $$P(a < X \le b) = F(b) - F(a), \quad \text{for all } a < b$$
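For a discrete random variable, the cdf is a running sum of the pmf. A minimal sketch, reusing the four-value pmf from the earlier discrete example (the function name `cdf` is chosen here for illustration):

```python
from fractions import Fraction

# pmf of the earlier discrete example: values 1..4.
pmf = {
    1: Fraction(1, 6),
    2: Fraction(1, 3),
    3: Fraction(1, 3),
    4: Fraction(1, 6),
}

def cdf(x):
    """F(x) = P(X <= x): sum of p(x_i) over all x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))

# P(a < X <= b) = F(b) - F(a), here with a = 1 and b = 3.
print(cdf(3) - cdf(1))  # 2/3
```

Note that F is defined for every real x, not only at the values X can take: `cdf(2.5)` returns the same value as `cdf(2)`.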
Cumulative Distribution Function
• Example: An inspection device has cdf:
  $$F(x) = \frac{1}{2} \int_0^x e^{-t/2}\,dt = 1 - e^{-x/2}$$
  - The probability that the device lasts for less than 2 years:
    $$P(0 \le X \le 2) = F(2) - F(0) = F(2) = 1 - e^{-1} = 0.632$$
  - The probability that it lasts between 2 and 3 years:
    $$P(2 \le X \le 3) = F(3) - F(2) = (1 - e^{-3/2}) - (1 - e^{-1}) = 0.145$$
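Both probabilities follow directly from the closed-form cdf. A short sketch, where the function name `F` and its `mean` parameter are illustrative choices:

```python
import math

def F(x, mean=2.0):
    """cdf of an exponential random variable with the given mean."""
    return 1.0 - math.exp(-x / mean) if x >= 0 else 0.0

p_less_than_2 = F(2) - F(0)
p_between_2_and_3 = F(3) - F(2)
print(round(p_less_than_2, 3), round(p_between_2_and_3, 3))  # 0.632 0.145
```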
Expectation
• The expected value of X is denoted by E(X)
  - If X is discrete:
    $$E(X) = \sum_{\text{all } x} x\,p(x)$$
  - If X is continuous:
    $$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
  - The mean, μ, is the 1st moment of X
  - A measure of central tendency
• Properties:
  - E(cX) = cE(X), where c is a constant
  - E(Y) = aE(X) + b, where Y = aX + b and a, b are constants
  - E(X + Y) = E(X) + E(Y), regardless of whether X and Y are independent
  - E(XY) = E(X)E(Y) if X and Y are independent
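The discrete definition and the linearity property can be checked on the four-value pmf from the earlier discrete example. The helper `expectation` and the constants `a` and `b` are hypothetical, picked only for this sketch.

```python
from fractions import Fraction

# pmf of the earlier discrete example: values 1..4.
pmf = {
    1: Fraction(1, 6),
    2: Fraction(1, 3),
    3: Fraction(1, 3),
    4: Fraction(1, 6),
}

def expectation(g=lambda x: x):
    """E[g(X)]: sum of g(x) * p(x) over all x."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation()
print(mean)  # 5/2

# Linearity: E(aX + b) = a E(X) + b, with arbitrary constants a and b.
a, b = 3, 5
print(expectation(lambda x: a * x + b) == a * mean + b)  # True
```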
Variance
• The variance of X is denoted by V(X), var(X), or σ²
  - Definition: $V(X) = E[(X - E[X])^2]$
  - Also, $V(X) = E(X^2) - [E(X)]^2$
• The variance is a measure of the dispersion or spread of a random variable about its mean
• The standard deviation of X is denoted by σ
  - Definition: square root of V(X)
  - Expressed in the same units as the mean
• Properties:
  - $V(cX) = c^2 V(X)$
  - $V(X + Y) = V(X) + V(Y)$ if X, Y are independent
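The two variance formulas give the same result, which can be confirmed on the four-value pmf from the earlier discrete example. Variable names such as `var_def` and `var_short` are illustrative only.

```python
import math
from fractions import Fraction

# pmf of the earlier discrete example: values 1..4.
pmf = {
    1: Fraction(1, 6),
    2: Fraction(1, 3),
    3: Fraction(1, 3),
    4: Fraction(1, 6),
}

mean = sum(x * p for x, p in pmf.items())

# Definition: V(X) = E[(X - E[X])^2]
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Shortcut: V(X) = E(X^2) - [E(X)]^2
var_short = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

std_dev = math.sqrt(var_def)
print(var_def, var_short)  # 11/12 11/12

# Scaling property: V(cX) = c^2 V(X), with an arbitrary constant c.
c = 3
var_scaled = sum((c * x - c * mean) ** 2 * p for x, p in pmf.items())
print(var_scaled == c ** 2 * var_def)  # True
```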
Small vs. Large Variance
[Figure: two density curves centered at the mean µ, one with large σ² and one with small σ²]
Density functions for continuous random variables with large and small variances (Source: LK00, Fig. 4.6)
Expectations and Variance (example)
• Example: The mean life of the previous inspection device is:
  $$E(X) = \frac{1}{2} \int_0^{\infty} x e^{-x/2}\,dx = \left[ -x e^{-x/2} \right]_0^{\infty} + \int_0^{\infty} e^{-x/2}\,dx = 2$$
• To compute the variance of X, we first compute $E(X^2)$:
  $$E(X^2) = \frac{1}{2} \int_0^{\infty} x^2 e^{-x/2}\,dx = \left[ -x^2 e^{-x/2} \right]_0^{\infty} + 2 \int_0^{\infty} x e^{-x/2}\,dx = 8$$
• Hence, the variance and standard deviation of the device’s life are:
  $$V(X) = 8 - 2^2 = 4$$
  $$\sigma = \sqrt{V(X)} = 2$$
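The integration-by-parts results can be cross-checked numerically. This sketch uses composite Simpson's rule and truncates the infinite upper limit at 100, where the integrand is negligible; both choices, and the names `pdf`, `integrate`, and `UPPER`, are assumptions made here rather than anything from the notes.

```python
import math

def pdf(x):
    """pdf of the device life: (1/2) e^{-x/2} for x >= 0."""
    return 0.5 * math.exp(-x / 2)

def integrate(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

UPPER = 100.0  # truncation point standing in for infinity (assumption)

mean = integrate(lambda x: x * pdf(x), 0.0, UPPER)
second_moment = integrate(lambda x: x * x * pdf(x), 0.0, UPPER)
variance = second_moment - mean ** 2
std_dev = math.sqrt(variance)

print(round(mean, 4), round(second_moment, 4), round(variance, 4), round(std_dev, 4))
```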
Joint Distributions
• Let X and Y each have a discrete distribution. Then X and Y have a discrete joint distribution if there exists a function p(x, y) such that:
  $$p(x, y) = P[X = x \text{ and } Y = y]$$
• Random variables X and Y are jointly continuous if there exists a non-negative function f(x, y), called the joint probability density function of X and Y, such that for all sets of real numbers A and B:
  $$P(X \in A, Y \in B) = \int_B \int_A f(x, y)\,dx\,dy$$
Covariance
• The covariance between the random variables X and Y, denoted by Cov(X, Y), is defined by
  Cov(X, Y) = E{[X - E(X)][Y - E(Y)]}
            = E(XY) - E(X)E(Y)
• The covariance is a measure of the dependence between X and Y. Note that Cov(X, X) = V(X).
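The two covariance expressions can be compared on a small joint pmf. The joint distribution below is made up purely for illustration (it does not come from the notes), and the helper `E` is a hypothetical name.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y) for two 0/1-valued random variables.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

def E(g):
    """E[g(X, Y)]: sum of g(x, y) * p(x, y) over all pairs."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = E(lambda x, y: x)
ey = E(lambda x, y: y)

# Definition: Cov(X, Y) = E{[X - E(X)][Y - E(Y)]}
cov_def = E(lambda x, y: (x - ex) * (y - ey))

# Shortcut: Cov(X, Y) = E(XY) - E(X)E(Y)
cov_short = E(lambda x, y: x * y) - ex * ey

# Cov(X, X) reduces to V(X).
var_x = E(lambda x, y: (x - ex) ** 2)
cov_xx = E(lambda x, y: x * x) - ex * ex

print(cov_def, cov_short)  # 3/20 3/20
print(cov_xx == var_x)     # True
```

In this made-up distribution the covariance is positive because X and Y tend to take the same value.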
Covariance
Cov(X, Y) = 0: X and Y are uncorrelated
Cov(X, Y) > 0: X and Y are positively correlated
Cov(X, Y) < 0: X and Y are negatively correlated

Independent random variables are also uncorrelated.
Statistical Models
• Application areas where statistical models find widespread use:
  - Queueing systems
  - Inventory and supply-chain systems
  - Reliability and maintainability
  - Limited data
Queueing Systems
• In a queueing system, interarrival and service-time patterns can be probabilistic (e.g., our M/M/1 example).
• Sample statistical models for interarrival or service-time distributions:
  - Exponential distribution: if service times are completely random
  - Normal distribution: fairly constant but with some random variability (either positive or negative)
  - Truncated normal distribution: similar to the normal distribution but with values restricted to a range
  - Gamma and Weibull distributions: more general than exponential (involving location of the modes of the pdfs and the shapes of the tails)
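Service times from these distributions can be drawn with Python's standard `random` module. The sketch below is illustrative only: every parameter value (mean service time, standard deviation, truncation range, gamma and Weibull parameters) and the seed are assumptions, and the truncated normal is sampled by simple rejection, which is one of several possible approaches.

```python
import random

rng = random.Random(531)   # arbitrary seed, so the run is repeatable
n = 50_000
mean_service = 2.0         # hypothetical mean service time

# Exponential: expovariate takes the rate, i.e. 1 / mean.
exp_times = [rng.expovariate(1.0 / mean_service) for _ in range(n)]

# Normal: fairly constant, with symmetric variability around the mean.
norm_times = [rng.gauss(mean_service, 0.25) for _ in range(n)]

def truncated_normal(mu, sigma, low, high):
    """Rejection sampling: redraw until the value falls inside [low, high]."""
    while True:
        x = rng.gauss(mu, sigma)
        if low <= x <= high:
            return x

trunc_times = [truncated_normal(mean_service, 0.25, 1.5, 2.5) for _ in range(n)]

# Gamma with shape 2 and scale 1 has mean shape * scale = 2.
gamma_times = [rng.gammavariate(2.0, 1.0) for _ in range(n)]

# Weibull: weibullvariate(alpha, beta) takes scale alpha and shape beta.
weibull_times = [rng.weibullvariate(2.0, 1.5) for _ in range(n)]

def sample_mean(xs):
    return sum(xs) / len(xs)

print(round(sample_mean(exp_times), 2), round(sample_mean(gamma_times), 2))
```

Comparing sample means against the intended means is a quick sanity check on the parameterization, since libraries differ on whether the exponential takes a rate or a mean.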
Inventory and supply chain
• In realistic inventory and supply-chain systems, there are at least three random variables:
  - The number of units demanded per order or per time period
  - The time between demands
  - The lead time
• Sample statistical models for lead time distribution:
  - Gamma
• Sample statistical models for demand distribution:
  - Poisson: simple and extensively tabulated.
  - Negative binomial distribution: longer tail than Poisson (more large demands).
  - Geometric: special case of negative binomial given at least one demand has occurred.
Reliability and maintainability
• Time to failure (TTF)
  - Exponential: failures are random
  - Gamma: for standby redundancy where each component has an exponential TTF
  - Weibull: failure is due to the most serious of a large number of defects in a system of components
  - Normal: failures are due to wear
Our next stop
• Discrete distributions, such as:
  - Bernoulli trials and Bernoulli distribution
  - Binomial distribution
  - Geometric and negative binomial distribution
  - Poisson distribution
• Continuous distributions, such as:
  - Uniform
  - Exponential
  - Normal
  - Weibull
  - Lognormal