CIS 2033, based on:
Dekking et al., A Modern Introduction to Probability and Statistics, 2007
B: Michael Baron, Probability and Statistics for Computer Scientists, CRC 2006
Instructor: Longin Jan Latecki
Ch. 6 Simulations
What is a simulation?
One uses a model to create specific situations in order to study the response of
the model to them, and then interprets this in terms of what would happen to the
system “in the real world”.
Models for such systems involve random variables, and we speak of probabilistic
or stochastic models; simulating them is stochastic simulation.
Stochastic simulation of a system means generating values for all the random
variables in the model, according to their specified distributions, and recording
and analyzing what happens.
We refer to the generated values as realizations of the random variables.
6.2 Generating realizations: Bernoulli
Simulations are almost always done by computer, which usually has one or more so-called (pseudo) random number generators, which mimic the U(0,1) distribution.
Ex: How do you simulate a coin toss with a die?
Bernoulli Random Variables
Suppose U has a U(0,1) distribution. To construct a Ber(p) random variable for some
0 < p < 1, define
X = 1 if U < p, and X = 0 if U ≥ p,
so that
P(X = 1) = P(U < p) = p and P(X = 0) = P(U ≥ p) = 1 − p.
A Matlab code:
p = 0.5;   % success probability (example value)
U = rand;  % U ~ U(0,1)
X = (U<p)
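The same construction can be sketched in Python (this mirrors the Matlab snippet above; the value p = 0.68 and the sample size are illustrative choices, not from the text):

```python
import random

def bernoulli(p):
    """Return 1 with probability p and 0 otherwise, from one U(0,1) draw."""
    u = random.random()       # u ~ U(0,1)
    return 1 if u < p else 0  # P(u < p) = p

# Empirical check: the fraction of ones should be close to p.
random.seed(0)
p = 0.68
samples = [bernoulli(p) for _ in range(100000)]
print(sum(samples) / len(samples))  # close to p
```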
Binomial Random Variables
We can obtain a binomial RV as a sum of n independent Bernoulli RVs.
For this we start with n uniform RVs, for example:
A Matlab code:
n = 20; p = 0.68;  % number of trials and success probability
U = rand(n,1);     % n independent U(0,1) draws
X = sum(U<p)       % number of successes
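A Python sketch of the same sum-of-Bernoullis construction (n = 20 and p = 0.68 are taken from the Matlab example; the sample size is an illustrative choice):

```python
import random

def binomial(n, p):
    """Sum of n independent Bernoulli(p) draws, one per U(0,1) sample."""
    return sum(1 for _ in range(n) if random.random() < p)

# Empirical check: the sample mean should be close to n * p = 13.6.
random.seed(0)
n, p = 20, 0.68
samples = [binomial(n, p) for _ in range(100000)]
print(sum(samples) / len(samples))  # close to n * p
```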
Geometric Random Variables
A while loop of Bernoulli trials generates a geometric RV.
We run the loop until the first success occurs.
Variable X in the while loop counts the number of trials up to and including the first success.
A Matlab code:
p = 0.68;          % success probability (example value)
X = 1;             % at least one trial
while rand > p     % continue while there are failures
    X = X + 1;
end;               % stop at the first success
X
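The same while loop in Python (a sketch of the Matlab code above; p = 0.3 and the sample size are illustrative values chosen here, not from the text):

```python
import random

def geometric(p):
    """Count Bernoulli(p) trials up to and including the first success."""
    x = 1                       # at least one trial
    while random.random() > p:  # continue while there are failures
        x += 1
    return x                    # stop at the first success

# Empirical check: the sample mean should be close to 1 / p.
random.seed(0)
p = 0.3
samples = [geometric(p) for _ in range(100000)]
print(sum(samples) / len(samples))  # close to 1 / p
```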
Arbitrary discrete distribution
Arbitrary Discrete Random Variable X
Suppose X takes values x0, x1, … with probabilities pi = P(X = xi) such that p0 + p1 + … = 1.
Algorithm
1. Divide interval [0, 1] into subintervals of lengths pi
A0 = [0, p0)
A1 = [p0, p0+p1)
A2 = [p0+p1, p0+p1+p2),
…
2. Obtain U from Uniform(0, 1), the standard uniform distribution
3. Set X = xi if U belongs to interval Ai
X has the desired distribution, since
P(X = xi) = P(U ∈ Ai) = pi
Continuous Random Variables
A simple yet surprising fact:
Regardless of the initial distribution of the RV X, the cdf applied to X itself, FX(X), has a uniform distribution:
Let X be a continuous RV with a strictly increasing cdf FX(x). Define an RV as U = FX(X).
Then the distribution of U is Uniform(0, 1).
Proof: Since FX(x) is a cdf, 0 ≤ FX(x) ≤ 1 for all x. Hence the values of U lie in [0, 1].
We show that the cdf of U is FU(u) = u, hence U is Uniform(0, 1):
FU(u) = P(U ≤ u)
= P(FX(X) ≤ u)
= P(X ≤ FinvX(u))   since FX is strictly increasing
= FX(FinvX(u))      by definition of cdf
= u
Generating Continuous Random Variables
In order to generate a continuous RV X with cdf FX, let us invert the formula U = FX(X).
Then X can be obtained from the standard uniform variable U as
X = FinvX(U)
Proof:
P(X ≤ x) = P(FinvX(U) ≤ x) = P(FX(FinvX(U)) ≤ FX(x))
= P(U ≤ FX(x))
= FX(x)   since U is Uniform(0, 1)
Algorithm
1. Obtain a sample u from the standard uniform RV U
2. Compute a sample from X as
x  F inv X (u )
Exponential random variables
Applying this method to the exponential distribution:
F(x) = 1 – e^(−λx)
F(x) = u ↔ x = −(1/λ) ln(1 − u) = Finv(u)
If U has a U(0,1) distribution, then the random variable X defined by
X = Finv(U) = −(1/λ) ln(1 − U)
has an Exp(λ) distribution. Since 1 − U also has a U(0,1) distribution, we may just as well use
X = −(1/λ) ln(U)
So we can simulate a RV with Exp(λ) distribution using a Uniform(0,1) RV.
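A Python sketch of this inverse transform for Exp(λ) (λ = 0.5 and the sample size are illustrative; 1 − random.random() is used so the argument of the log stays in (0, 1]):

```python
import math
import random

def exponential(lam):
    """Inverse transform for Exp(lam): X = -ln(1 - U) / lam with U ~ U(0,1)."""
    return -math.log(1.0 - random.random()) / lam

# Empirical check: the sample mean should be close to 1 / lam.
random.seed(0)
lam = 0.5
samples = [exponential(lam) for _ in range(100000)]
print(sum(samples) / len(samples))  # close to 1 / lam = 2
```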
6.4 A single server queue
There is a well, and people want to determine how powerful a pump they
should buy. More power means shorter wait times, but also increased cost.
Let Ti represent the time between consecutive customers, called interarrival time,
e.g., the time between customer 1 and 2 is T2.
Let Si be the length of time that customer i needs to use the pump
The pump capacity v (liters per minute) is a model parameter that we wish to
determine.
If customer i requires Ri liters of water, then Si = Ri / v
To complete the model description, we need to specify the distribution of T and R
Interarrival times: every Ti has an Exp(0.5) distribution (minutes)
Service requirement: every Ri has U(2,5) distribution (liters)
6.4 cont
Wi denotes the waiting time of customer i;
for customer 1, W1 = 0 since he doesn’t have to wait.
The following waiting times depend on customer i−1:
Wi = max{Wi−1 + Si−1 − Ti, 0}   See pg. 81
We are interested in the average waiting time:
W̄n = (W1 + W2 + … + Wn) / n
This is used for Fig. 6.7, which is the plot of (n, W̄n)
for n = 1, 2, …
The average wait time for v = 2 turns out to be around 2 minutes
The average wait time for v = 3 turns out to be around 0.5 minutes
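A minimal Python sketch of this single-server queue, using the recursion Wi = max{Wi−1 + Si−1 − Ti, 0} with Exp(0.5) interarrival times and U(2,5) service requirements as specified above. The number of customers and the use of the same seed for both pump speeds (common random numbers) are illustrative choices, not from the text:

```python
import math
import random

def simulate_waits(n, v, lam=0.5):
    """Waiting times of n customers at pump capacity v (liters per minute).

    Interarrival times T_i ~ Exp(lam), required volumes R_i ~ U(2, 5),
    service times S_i = R_i / v, and W_i = max(W_{i-1} + S_{i-1} - T_i, 0).
    """
    waits = [0.0]                      # customer 1 does not have to wait
    s_prev = random.uniform(2, 5) / v  # service time of customer 1
    for _ in range(2, n + 1):
        t = -math.log(1.0 - random.random()) / lam  # interarrival time
        waits.append(max(waits[-1] + s_prev - t, 0.0))
        s_prev = random.uniform(2, 5) / v
    return waits

results = {}
for v in (2, 3):
    random.seed(0)  # common random numbers: same T_i and R_i for both speeds
    w = simulate_waits(10000, v)
    results[v] = sum(w) / len(w)
    print(v, results[v])  # average wait for each pump capacity
```

With common random numbers the faster pump serves the very same customers more quickly, so its waiting times can never exceed those of the slower pump, which makes the two runs directly comparable.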
6.4 Work in system
One approach to determining how busy the pump is at any one time is to
record at every moment how much work there is in the system.
For example, if I am halfway through filling my 4-liter container and 3 people are
waiting who require 2, 3, and 5 liters, then there are 12 liters to go; at v = 2
there are 6 minutes of work in the system, and at v = 3 there are just 4.
The amount of work in the system just before a customer arrives equals the
waiting time of that customer; this is also called the virtual waiting time.
Figures 6.8 and 6.9 illustrate the work in the system for v = 2 and v = 3