Distributions I

ICU-KIEV. ADVANCED MARKETING.
Vladimir V. Bulatov ([email protected])
Major distributions used in marketing modeling.
UNIFORM DISTRIBUTION
The uniform distribution (UD) can be either discrete or continuous. In the continuous case its density is
$$f(x) = \frac{1}{B - A}, \quad A \le x \le B$$
where A is the location parameter and (B - A) is the scale parameter. The case where A = 0 and B = 1 is called the standard
uniform distribution.
Important related notions: probability density function (PDF), cumulative distribution function (CDF).
Examples: lotteries, dice throwing results, unconditional consumer choices (e.g. each candy bar has the same chance to be
chosen).
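The two notions can be illustrated with a minimal sketch for the continuous uniform distribution. The function names and the values A = 2, B = 6 are chosen here for illustration only and do not come from the lecture.

```python
def uniform_pdf(x: float, A: float = 0.0, B: float = 1.0) -> float:
    """Density of the continuous uniform distribution on [A, B]."""
    return 1.0 / (B - A) if A <= x <= B else 0.0


def uniform_cdf(x: float, A: float = 0.0, B: float = 1.0) -> float:
    """Probability that a uniform variable on [A, B] is at most x."""
    if x < A:
        return 0.0
    if x > B:
        return 1.0
    return (x - A) / (B - A)


# Illustrative values (an assumption): location A = 2, scale B - A = 4.
print(uniform_pdf(3.0, A=2.0, B=6.0))  # constant density 1 / (B - A)
print(uniform_cdf(3.0, A=2.0, B=6.0))  # share of [A, B] lying at or below x
```

With the defaults A = 0 and B = 1 the functions describe the standard uniform distribution.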
BERNOULLI DISTRIBUTION
BD is a discrete probability distribution, which takes value 1 with success probability p and value 0 with failure probability
q=1-p.
1  p for n  0
P(n)  
 p for n  1
or
P(n)  p n(1  p)1n
Examples: whether a product is purchased during a purchase occasion, whether a person enters a supermarket on the way home, etc.
BINOMIAL DISTRIBUTION
The binomial distribution gives the discrete probability distribution Pp(n|N) of obtaining exactly n successes in N Bernoulli trials:
$$P_p(n \mid N) = \binom{N}{n} p^{n} q^{N - n}$$
where q = 1 - p.
The plot (not reproduced here) shows Bin(n|N) with p = 0.5 and N = 20.
Example: if the probability of entering a supermarket on any working evening is 0.5, then the probability distribution of making from 0 to 20 visits during 20 consecutive working days looks like that plot (the number of visits is on the horizontal axis). The most likely result is 10 visits; other outcomes are also possible, though less likely.
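The supermarket example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name binomial_pmf is chosen here for illustration.

```python
from math import comb


def binomial_pmf(n: int, N: int, p: float) -> float:
    """Probability of exactly n successes in N independent Bernoulli trials."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)


N, p = 20, 0.5  # 20 working evenings, probability 0.5 of a visit on each
pmf = [binomial_pmf(n, N, p) for n in range(N + 1)]

# The outcome with the highest probability (the mode of the distribution).
most_likely = max(range(N + 1), key=lambda n: pmf[n])
print(most_likely)        # 10
print(round(pmf[10], 4))  # 0.1762
```

Even the most likely outcome, 10 visits, has a probability of only about 0.18, which is why the neighbouring outcomes remain quite plausible.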
Major distributions used in social sciences
GEOMETRIC DISTRIBUTION
If independent Bernoulli trials are made until a "success" occurs, then the total number of trials required is a geometric
random variable. The geometric distribution is defined as:
$$P(n) = p (1 - p)^{n - 1}$$
where n = 1, 2, 3, … and p is the probability that a particular event (e.g., success) will occur.
Suppose an ordinary die is thrown repeatedly until the first time a “1” appears. The probability distribution of the number of throws is supported on the infinite set {1, 2, 3, …} and is a geometric distribution with p = 1/6.
[Figure not reproduced: graph of the geometric distribution with p = 1/4.]
A more interesting aspect of the geometric distribution is its CDF. It gives the probability of obtaining the first “success” on or before the x-th Bernoulli trial:
$$F(x) = \sum_{i=1}^{x} p (1 - p)^{i - 1} = 1 - (1 - p)^{x}$$
Example: in searching for a specific type of candy in a mix of 4 candy types, the cumulative chance of having found the desired one increases with each consecutive try.
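The candy example can be sketched numerically. The sketch assumes that the four candy types are equally common and that draws are independent, so that p = 1/4 on every try; neither assumption is stated in the lecture, and the function names are illustrative.

```python
def geometric_pmf(n: int, p: float) -> float:
    """Probability that the first success occurs exactly on trial n (n >= 1)."""
    return p * (1 - p) ** (n - 1)


def geometric_cdf(x: int, p: float) -> float:
    """Probability that the first success occurs on or before trial x."""
    return sum(geometric_pmf(i, p) for i in range(1, x + 1))


p = 0.25  # assumed: 4 equally common candy types, independent draws
for x in range(1, 6):
    # The cumulative chance grows with every additional try.
    print(x, round(geometric_cdf(x, p), 4))
```

The summed form agrees with the closed form 1 - (1 - p)^x, so after three tries the chance of having found the desired candy is 1 - 0.75^3, about 0.58.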
HYPERGEOMETRIC DISTRIBUTION
$$P(x = i) = \frac{[\#\text{ways for } i \text{ successes}]\,[\#\text{ways for } N - i \text{ failures}]}{[\text{total number of ways to select}]}$$
$$P(x = i) = \frac{\binom{n}{i} \binom{m}{N - i}}{\binom{m + n}{N}}$$
A classical application of this distribution is the so-called “urn problem”: finding the probability that i out of N balls drawn without replacement are “good”, when the urn contains n “good” balls and m “bad” balls.
It also describes the probability of obtaining exactly i correct balls in a pick-N lottery from a reservoir of r balls (of which
n=N are “good” and m=r-N are “bad”).
For example, for N=6 and r=36, the probabilities of obtaining i correct balls are the following:
Number correct   Probability     Odds
0                0.3048          2.280:1
1                0.4390          1.278:1
2                0.2110          3.738:1
3                0.04169         22.99:1
4                0.003350        297.5:1
5                9.241*10^-5     10820:1
6                5.134*10^-7     1.948*10^6:1
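The lottery probabilities can be recomputed from the urn formula. The following is a minimal sketch using the standard library; the function name hypergeom_pmf is illustrative, and the odds are computed as (1 - P) : P, which is an inference from the tabulated values rather than a definition given in the lecture.

```python
from math import comb


def hypergeom_pmf(i: int, N: int, n: int, m: int) -> float:
    """P(x = i): i good balls among N drawn from n good and m bad balls."""
    return comb(n, i) * comb(m, N - i) / comb(n + m, N)


N, r = 6, 36  # pick-6 lottery from a reservoir of 36 balls
for i in range(N + 1):
    prob = hypergeom_pmf(i, N, n=N, m=r - N)
    odds = (1 - prob) / prob  # odds against, assumed to be (1 - P) : P
    print(f"{i}  {prob:.4g}  {odds:.4g}:1")
```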
A typical example is illustrated by the contingency table below: there is a shipment of N objects of which D are defective. The hypergeometric distribution describes the probability that in a sample of n distinct objects drawn from the shipment exactly k objects are defective. Note the change of notation: here N is the size of the shipment and n is the size of the sample.
             defective   nondefective   total
drawn        k           n-k            n
not drawn    D-k         N+k-n-D        N-n
total        D           N-D            N

$$f(k; N, D, n) = \frac{\binom{D}{k} \binom{N - D}{n - k}}{\binom{N}{n}}$$
POISSON DISTRIBUTION
The Poisson distribution expresses the probability of a given number of events occurring in a fixed time period if these events occur with a known average rate and independently of the time since the last event.
Formulas are given for a time period of unit length and for a non-unit period (t units long).
$$P_p(X = x \mid \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!}$$
For example, it gives the probability of a given number of purchases in a period of unit length, where λ is the purchase rate (number of units per time period).
[Figure not reproduced: PDF and CDF of several Poisson-distributed variables with different values of λ.]
λ is a positive real number (not necessarily an integer), equal to the expected number of occurrences during the given interval. For instance, if the events occur on average every 4 minutes, and you are interested in the number of events that may occur in a 10-minute interval, you would use as a model a Poisson distribution with λ = 10/4 = 2.5.
Example: if a smoker smokes on average 2.56 cigarettes per day (λ), the number of cigarettes smoked on a given day (0, 1, 2, 3, 4, …) is Poisson distributed. Note that for large λ the Poisson distribution is approximately Normal.
Example: the average number of false fire alarms in Kiev is 2.1 per day. The number of false fire alarms on a given day is Poisson distributed, e.g. P(X = 4 | λ = 2.1) = (2.1^4 e^(-2.1)) / 4! = 0.0992.
For a non-unit time period with λ being a rate of event occurrence, the distribution becomes:
$$P_p(X(t) = x \mid \lambda) = \frac{e^{-\lambda t} (\lambda t)^{x}}{x!}$$
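Both forms of the formula can be covered by one function in which the unit period is the special case t = 1. The following is a minimal sketch using the standard library; the function name poisson_pmf and its parameter names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float, t: float = 1.0) -> float:
    """Probability of exactly x events in a period of length t at rate lam."""
    mean = lam * t  # expected number of events in the period
    return exp(-mean) * mean**x / factorial(x)


# Unit period: false fire alarms at an average of 2.1 per day.
print(round(poisson_pmf(4, 2.1), 4))  # 0.0992

# Non-unit period: one event every 4 minutes is a rate of 0.25 per minute,
# so a 10-minute interval has an expected count of 0.25 * 10 = 2.5.
print(round(poisson_pmf(2, 0.25, t=10), 4))  # 0.2565
```

Scaling the rate by t gives the same result as using λ = 2.5 directly in the unit-period formula.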
EXPONENTIAL DISTRIBUTION
If the likelihood of purchase during any period of short duration is constant and independent of when the last purchase was made, the time until the next purchase is exponentially distributed, where λ > 0 is the rate parameter:
$$f(t \mid \lambda) = \lambda e^{-\lambda t}, \quad t \ge 0$$
The CDF of the exponential distribution is the following:
$$D(t) = 1 - e^{-\lambda t}$$
The CDF shows the probability of the event's occurrence in the time period [0; t].
[Figures not reproduced: PDF and CDF of the exponential distribution.]
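The waiting-time interpretation can be illustrated with a minimal sketch. The rate of 0.5 purchases per week is a hypothetical value chosen for illustration and does not come from the lecture; the function names are likewise illustrative.

```python
from math import exp


def exponential_pdf(t: float, lam: float) -> float:
    """Density of the waiting time until the next event at rate lam."""
    return lam * exp(-lam * t) if t >= 0 else 0.0


def exponential_cdf(t: float, lam: float) -> float:
    """Probability that the next event occurs within the period [0; t]."""
    return 1 - exp(-lam * t) if t >= 0 else 0.0


lam = 0.5  # hypothetical purchase rate: 0.5 purchases per week
for t in (1, 2, 4):
    # Chance that the next purchase happens within t weeks.
    print(t, round(exponential_cdf(t, lam), 4))
```

The complement 1 - D(t) = e^(-λt) equals the Poisson probability of zero events in a period of length t, which ties the two distributions together.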