CPSC 531: Probability & Statistics: Review II

Instructor: Anirban Mahanti
Office: ICT 745
Email: [email protected]
Class Location: TRB 101
Lectures: TR 15:30 – 16:45 hours
Class web page: http://pages.cpsc.ucalgary.ca/~mahanti/teaching/F05/CPSC531

Notes derived from “Probability and Statistics” by M. DeGroot and M. Schervish, Third edition, Addison Wesley, 2002, and “Discrete-event System Simulation” by Banks, Carson, Nelson, and Nicol, Prentice Hall, 2005.

CPSC 531: Probability Review

Objective and Outline

The world the model-builder sees is probabilistic rather than deterministic.
• Some statistical model might well describe the variations.

An appropriate model can be developed by sampling the phenomenon of interest:
• Select a known distribution through educated guesses
• Make estimates of the parameters
• Test for goodness of fit

Goal is to review:
• Random variables
• Discrete and continuous random variables
• Cumulative distribution functions
• Expectation, variance, etc.

Random Variables

A random variable is a real-valued mapping defined on a sample space. Suppose that X is a random variable defined on a sample space S; then X assigns a real number X(s) to each possible outcome s ∈ S.

Typically, X, Y, Z, etc. denote random variables; x, y, z, etc. denote values attained by random variables.

Example: Rolling a pair of dice. Let X be the random variable corresponding to the sum of the dice on a roll. If we think of each sample point as a pair s = (i, j), where i is the value rolled by the first die and j is the value rolled by the second die, we have:
X(s) = i + j

Discrete Random Variables

A random variable X is said to be discrete if the number of possible values of X is finite or, at most, countably infinite (an infinite sequence of distinct values).

Example: Consider jobs arriving at a job shop.
• Let X be the number of jobs arriving each week at a job shop.
• S = possible values of X (range space of X) = {0, 1, 2, …}
• $p(x_i)$ = probability that the random variable is $x_i$ = $P(X = x_i)$

$p(x_i)$, i = 1, 2, …, must satisfy:
1. $p(x_i) \ge 0$, for all $i$
2. $\sum_{i=1}^{\infty} p(x_i) = 1$

The collection of pairs $[x_i, p(x_i)]$, i = 1, 2, …, is called the probability distribution of X, and $p(x_i)$ is called the probability mass function (pmf) of X.
• The pmf is referred to as the “probability function” in some texts.

Discrete Random Variables

Consider a random variable X that takes on values 1, 2, 3, and 4 with probabilities 1/6, 1/3, 1/3, and 1/6, respectively.

[Figure: bar chart of the pmf p(x) against x = 1, 2, 3, 4]

Continuous Random Variables

X is a continuous random variable if there exists a non-negative function f(x) such that for any set of real numbers A ⊆ S:
$$P(X \in A) = \int_A f(x)\,dx$$

The probability that X lies in the interval [a, b] is given by:
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

f(x), called the probability density function (pdf) of X, satisfies:
1. $f(x) \ge 0$, for all x in S
2. $\int_S f(x)\,dx = 1$
3. $f(x) = 0$, if x is not in S

Properties:
1. $P(X = x_0) = 0$, because $\int_{x_0}^{x_0} f(x)\,dx = 0$
2. $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$

Continuous Random Variables

Example: The life of an inspection device is given by X, a continuous random variable with pdf:
$$f(x) = \begin{cases} \frac{1}{2} e^{-x/2}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
• X has an exponential distribution with a mean of 2 years.
• The probability that the device’s life is between 2 and 3 years is:
$$P(2 \le X \le 3) = \frac{1}{2}\int_2^3 e^{-x/2}\,dx = e^{-1} - e^{-3/2} \approx 0.145$$

Cumulative Distribution Function

The cumulative distribution function (cdf) of a random variable X is a function F(x), defined for each real number x:
$$F(x) = P(X \le x), \quad -\infty < x < \infty$$

If X is discrete, then
$$F(x) = \sum_{\text{all } x_i \le x} p(x_i)$$

If X is continuous, then
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

Properties:
1. F is a nondecreasing function: if $a < b$, then $F(a) \le F(b)$
2. $\lim_{x \to \infty} F(x) = 1$
3. $\lim_{x \to -\infty} F(x) = 0$

All probability questions about X can be answered in terms of the cdf, e.g.:
$$P(a < X \le b) = F(b) - F(a), \quad \text{for all } a < b$$

Cumulative Distribution Function

Example: The inspection device has cdf:
$$F(x) = \frac{1}{2}\int_0^x e^{-t/2}\,dt = 1 - e^{-x/2}, \quad x \ge 0$$
• The probability that the device lasts for less than 2 years:
$$P(0 \le X \le 2) = F(2) - F(0) = F(2) = 1 - e^{-1} \approx 0.632$$
• The probability that it lasts between 2 and 3 years:
$$P(2 \le X \le 3) = F(3) - F(2) = (1 - e^{-3/2}) - (1 - e^{-1}) \approx 0.145$$

Expectation

The expected value of X is denoted by E(X).
• If X is discrete: $E(X) = \sum_{\text{all } x} x\,p(x)$
• If X is continuous: $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
• The mean, μ, is the first moment of X
• A measure of the central tendency

Properties:
• E(cX) = cE(X), where c is a constant
• E(Y) = aE(X) + b, where Y = aX + b and a, b are constants
• E(X + Y) = E(X) + E(Y), regardless of whether X and Y are independent
• E(XY) = E(X)E(Y) if X and Y are independent

Variance

The variance of X is denoted by V(X), var(X), or σ².
• Definition: $V(X) = E[(X - E[X])^2]$
• Also, $V(X) = E(X^2) - [E(X)]^2$
• The variance is a measure of the dispersion or spread of a random variable about its mean

The standard deviation of X is denoted by σ.
• Definition: the square root of V(X)
• Expressed in the same units as the mean

Properties:
• $V(cX) = c^2 V(X)$
• V(X + Y) = V(X) + V(Y) if X and Y are independent

Small vs. Large Variance

[Figure: density functions for continuous random variables with large and small variances σ², each centered at the mean µ (Source: LK00, Fig. 4.6)]

Expectations and Variance (example)

Example: The mean life of the previous inspection device is:
$$E(X) = \frac{1}{2}\int_0^\infty x e^{-x/2}\,dx = \left[-x e^{-x/2}\right]_0^\infty + \int_0^\infty e^{-x/2}\,dx = 2$$

To compute the variance of X, we first compute $E(X^2)$:
$$E(X^2) = \frac{1}{2}\int_0^\infty x^2 e^{-x/2}\,dx = \left[-x^2 e^{-x/2}\right]_0^\infty + 2\int_0^\infty x e^{-x/2}\,dx = 8$$

Hence, the variance and standard deviation of the device’s life are:
$$V(X) = 8 - 2^2 = 4$$
$$\sigma = \sqrt{V(X)} = 2$$

Joint Distributions

Let X and Y each have a discrete distribution. Then X and Y have a discrete joint distribution if there exists a function p(x, y) such that:
p(x, y) = P[X = x and Y = y]

Random variables X and Y are jointly continuous if there exists a non-negative function f(x, y), called the joint probability density function of X and Y, such that for all sets of real numbers A and B:
$$P(X \in A, Y \in B) = \int_B \int_A f(x, y)\,dx\,dy$$

Covariance

The covariance between the random variables X and Y, denoted by Cov(X, Y), is defined by
Cov(X, Y) = E{[X - E(X)][Y - E(Y)]} = E(XY) - E(X)E(Y)

The covariance is a measure of the (linear) dependence between X and Y. Note that Cov(X, X) = V(X).

Covariance

• Cov(X, Y) = 0: X and Y are uncorrelated
• Cov(X, Y) > 0: X and Y are positively correlated
• Cov(X, Y) < 0: X and Y are negatively correlated

Independent random variables are also uncorrelated (the converse does not hold in general).

Statistical Models

Application areas where statistical models find widespread use:
• Queueing systems
• Inventory and supply-chain systems
• Reliability and maintainability
• Limited data

Queueing Systems

In a queueing system, interarrival and service-time patterns can be probabilistic (e.g., our M/M/1 example).
Sample statistical models for interarrival or service-time distributions:
• Exponential distribution: if service times are completely random
• Normal distribution: fairly constant but with some random variability (either positive or negative)
• Truncated normal distribution: similar to the normal distribution but with restricted values
• Gamma and Weibull distributions: more general than exponential (involving location of the modes of pdfs and the shapes of tails)

Inventory and Supply Chain

In realistic inventory and supply-chain systems, there are at least three random variables:
• The number of units demanded per order or per time period
• The time between demands
• The lead time

Sample statistical models for lead time distribution:
• Gamma

Sample statistical models for demand distribution:
• Poisson: simple and extensively tabulated
• Negative binomial distribution: longer tail than Poisson (more large demands)
• Geometric: special case of negative binomial given at least one demand has occurred

Reliability and Maintainability

Time to failure (TTF):
• Exponential: failures are random
• Gamma: for standby redundancy where each component has an exponential TTF
• Weibull: failure is due to the most serious of a large number of defects in a system of components
• Normal: failures are due to wear

Our Next Stop

Discrete distributions, such as:
• Bernoulli trials and Bernoulli distribution
• Binomial distribution
• Geometric and negative binomial distribution
• Poisson distribution

Continuous distributions, such as:
• Uniform
• Exponential
• Normal
• Weibull
• Lognormal
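
As a numerical check on the inspection-device example used throughout these notes (exponential lifetime with mean 2 years), the following Python sketch recomputes the cdf values, the probability of a life between 2 and 3 years, and the mean and variance. It is only an illustration: the helper names (pdf, cdf, integrate), the midpoint-rule integrator, and the truncation point UPPER are assumptions introduced here, not part of the original notes.

```python
import math

RATE = 0.5  # f(x) = RATE * exp(-RATE * x), so the mean is 1 / RATE = 2 years


def pdf(x: float) -> float:
    """Exponential pdf of the device lifetime (zero for x < 0)."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x: float) -> float:
    """Closed-form cdf F(x) = 1 - exp(-x/2) for x >= 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def integrate(g, a: float, b: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


# Assumed truncation point for the improper integrals; the probability
# mass beyond 100 years is about exp(-50) and can be ignored here.
UPPER = 100.0

# P(2 <= X <= 3), once from the cdf and once by integrating the pdf.
p_2_to_3 = cdf(3) - cdf(2)
p_2_to_3_numeric = integrate(pdf, 2, 3)

# E(X), E(X^2), and V(X) = E(X^2) - [E(X)]^2 by numerical integration.
mean = integrate(lambda x: x * pdf(x), 0, UPPER)
second_moment = integrate(lambda x: x * x * pdf(x), 0, UPPER)
variance = second_moment - mean ** 2

print(f"P(X <= 2)          = {cdf(2):.3f}")
print(f"P(2 <= X <= 3)     = {p_2_to_3:.3f} (cdf), {p_2_to_3_numeric:.3f} (integral)")
print(f"E(X)               = {mean:.3f}")
print(f"V(X)               = {variance:.3f}")
print(f"standard deviation = {math.sqrt(variance):.3f}")
```

Run as a script, this should print values close to the ones derived in the notes: about 0.632 and 0.145 for the two probabilities, 2 for the mean, 4 for the variance, and 2 for the standard deviation.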