Chapter 5 Discrete Random Variables and Distributions
A. Continuous vs. Discrete Distributions
1. Random Variable (r.v.) – a variable whose value is a numerical outcome of a random
phenomenon. One and only one numerical value is assigned to each outcome. The r.v. can be
either discrete or continuous.
a. Discrete random variable – a random variable that takes on only a countable set of values,
such as whole numbers (integers).
b. Continuous random variable – a random variable that can take on any value in an interval,
not just whole numbers.
2. Discrete Probability Model – a probability model whose sample space consists of a finite
number of values or a countably infinite sequence of values.
3. Continuous Probability Model – a probability model that has an infinite (uncountable) sample
space. We cannot assign a probability to an outcome at a single point in this case; the reasoning is
that the probability of any single point under a continuous distribution is zero, since the area would be 0.
-probability in this case must be measured over a range of values and is the area under the curve
over that range.
Ex: the uniform and normal distributions are good examples.
4. Discrete Probability Distribution - p(x) – a table, formula, or a graph showing how
probability is attached to each possible value the random variable may assume.
Properties:
i. 0 ≤ p(x) ≤ 1; so each probability must be at least 0 and at most 1.
ii. Σ p(x) = 1; so the sum of all the probabilities must add to 1.
Example: Consider some r.v. x = # of car sales in an hour

x       0      1      2      3
p(x)    0.40   0.30   0.20   0.10

We can see that the sum of all the probabilities is 0.40 + 0.30 + 0.20 + 0.10 = 1.
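As a quick check of the two properties above, here is a minimal Python sketch (the dictionary name car_sales and the specific code are just illustrative, not from the notes):

# discrete distribution for x = # of car sales in an hour
car_sales = {0: 0.40, 1: 0.30, 2: 0.20, 3: 0.10}
# property i: every probability lies between 0 and 1
assert all(0 <= p <= 1 for p in car_sales.values())
# property ii: the probabilities add to 1
assert abs(sum(car_sales.values()) - 1.0) < 1e-9
print("both properties hold")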
5. Mean, Variance, & Standard Deviation of a discrete r.v.
a. mean – this is the average. It is also called the expected value (i.e., what you expect to occur on
average) and is denoted E(x).
E(x) = μ = Σ x·p(x)
b. variance – the interpretation is the same as before. It is the probability-weighted average of the
squared deviations of each value from the mean.
Var(x) = σ² = Σ (x − μ)²·p(x)
c. Standard deviation – the average deviation of each data value from the mean (as before)
Std(x) = σ = √Var(x)
Example: Let’s use the values above to calculate each of the measures in 5.
E(x) = μ = 0(0.4) + 1(0.3) + 2(0.2) + 3(0.1) = 1.0
So what this says is on average we expect 1 car to be sold per hour.
Var(x) = σ² = (0 − 1)²(0.4) + (1 − 1)²(0.3) + (2 − 1)²(0.2) + (3 − 1)²(0.1) = 1
Std(x) = σ = √1 = 1
Note: the fact that the mean, variance, and standard deviation are all the same is only a property
of these values. You should not expect this to occur often at all.
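The same calculations in a short Python sketch (reusing the illustrative car_sales dictionary from the sketch above):

import math
car_sales = {0: 0.40, 1: 0.30, 2: 0.20, 3: 0.10}
# E(x) = Σ x·p(x)
mean = sum(x * p for x, p in car_sales.items())
# Var(x) = Σ (x − μ)²·p(x)
variance = sum((x - mean) ** 2 * p for x, p in car_sales.items())
std = math.sqrt(variance)
print(mean, variance, std)  # 1.0 1.0 1.0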
B. Binomial Distributions
To have a binomial distribution we must meet certain criteria. We first must have a binomial
experiment.
1. Binomial experiment – an experiment that meets the following conditions:
a. consists of a sequence of n identical trials
b. two outcomes are possible on each trial: success or failure; each has its own probability
c. The probability of success is called p, while the probability of failure is (1-p). This does not
change from trial to trial.
d. The trials are independent of each other. So no one trial affects the probability or the outcome
of another trial.
-what we are interested in is the number of successes (or number of times an event occurs)
given that we run n trials. So if we let X denote the number of successes, we note that X ∈ {0, 1, …, n}. Since the
number of possible values is finite, the binomial distribution is a discrete distribution (i.e. not
continuous).
2. Binomial Distribution - the distribution that is associated with this type of experiment.
Ex: tossing a coin 5 times follows a binomial distribution. We can let success = head and failure
= tail, but it could just as easily be the other way around.
Facts: n = 5, p = 0.5, (1-p) = 0.5, and each toss is independent of the other.
3. Binomial Coefficient – the number of experimental outcomes that gives us exactly x
successes in n trials.
Mathematically:
(n choose x) = n! / [x!(n − x)!]
Where (n − x)! = (n − x)(n − x − 1)(n − x − 2)…(3)(2)(1)
n! = (n)(n − 1)(n − 2)…(3)(2)(1)
0! = 1
x ≤ n
** This is the same concept as a combination in the Chapter 4 notes.
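A quick Python check of the coefficient formula (math.comb is the standard-library binomial coefficient; the specific numbers are just for illustration):

import math
n, x = 10, 4
# built-in "n choose x"
print(math.comb(n, x))  # 210
# same value from the factorial formula above
print(math.factorial(n) // (math.factorial(x) * math.factorial(n - x)))  # 210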
4. Binomial Probability Function – the function that gives the probability of an event
occurring with x successes out of n trials.
Mathematically: f(x) = P(X = x) = (n choose x) · p^x · (1 − p)^(n − x)
Where p = probability of success, (1-p) = probability of failure, & n = # of trials
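As a sketch, the same function written out in Python (binom_pmf is a hypothetical helper name, not something defined in these notes):

import math
def binom_pmf(x, n, p):
    # P(X = x) for a binomial r.v. with n trials and success probability p
    return math.comb(n, x) * p**x * (1 - p)**(n - x)
# example: 2 heads in 5 fair coin tosses (the coin-toss experiment above)
print(binom_pmf(2, 5, 0.5))  # 0.3125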
5. Mean, Variance, and Standard Deviation
a. mean – E(x) = μ = n·p
b. variance – Var(x) = σ² = n·p·(1 − p)
c. standard deviation – Std(x) = σ = √(n·p·(1 − p))
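As a quick numerical check, these shortcut formulas agree with the general Σ x·p(x) and Σ (x − μ)²·p(x) formulas from part A (a sketch only; the names and values are illustrative):

import math
n, p = 5, 0.5
pmf = {x: math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}
mean = sum(x * q for x, q in pmf.items())                    # Σ x·p(x)
variance = sum((x - mean) ** 2 * q for x, q in pmf.items())  # Σ (x − μ)²·p(x)
print(mean, n * p)                # 2.5 2.5
print(variance, n * p * (1 - p))  # 1.25 1.25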
6. Normal Approximation to the Binomial – under certain circumstances we can use the
normal distribution to approximate what happens with the binomial. If we have a sufficiently large
number of observations this is a valid technique. So we get a distribution that is approximately
N(n·p, √(n·p·(1 − p))).
Condition: We will assume that it is normal when it meets the following two conditions.
(a) n*p ≥10
(b) n (1-p) ≥10
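This rule of thumb is easy to express as a small function (check_normal_approx is a hypothetical name used only for this sketch):

def check_normal_approx(n, p):
    # True when both np >= 10 and n(1-p) >= 10
    return n * p >= 10 and n * (1 - p) >= 10
print(check_normal_approx(100, 0.5))  # True:  np = 50, n(1-p) = 50
print(check_normal_approx(10, 0.65))  # False: np = 6.5, n(1-p) = 3.5 (the example below)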
7. Example: Suppose we have a binomial that we are looking at where we have n=10, x=4, and
the probability of success p=0.65. What is the probability that we get 4 successes in 10 tries
then?
So we use our formula for the binomial probability:
P(X = 4) = (10 choose 4) · (0.65)^4 · (1 − 0.65)^(10 − 4)
= [10! / (4!·(10 − 4)!)] · [(0.65)^4 · (0.35)^6]
= [(10·9·8·7)/(4·3·2·1)] · [0.00033] = 210·(0.00033) ≈ 0.0689
So we find that the probability of 4 successes is around 6.9% given all these conditions. Note
that we can also look at this as being the probability of 6 failures (i.e. if we have 4 successes in
10, then it is the same as saying it has 6 failures in 10).
Normal Approximation: Note that we have n·p = 6.5 and n·(1 − p) = 3.5. This does not satisfy our
conditions for using the normal approximation, but we will do it anyway to show how it can be used to
approximate the probability calculations, and it should do so fairly well in this case since it is
close to meeting the required conditions. So if we look at
P(X = 4) and convert it to a normal approximation: P(X < 4) = P(Z < (4 − 6.5) / 1.508)
= P(Z < −1.66) = 0.0485, where 1.508 = √(n·p·(1 − p)) = √2.275.
We see that, since the conditions were not met, the approximation is different from the actual
calculation. In this case it is about 30% different, but we would expect that as we get closer and
closer to meeting the conditions the two values would converge to one another.
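Both calculations in one short Python sketch (statistics.NormalDist is part of the standard library; the rest of the naming is just illustrative):

import math
from statistics import NormalDist
n, p, x = 10, 0.65, 4
# exact binomial probability
exact = math.comb(n, x) * p**x * (1 - p)**(n - x)
# normal approximation used above: P(X < 4) with mean np and sd sqrt(np(1-p))
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma).cdf(x)
print(round(exact, 4))   # ~0.0689
print(round(approx, 4))  # ~0.0487 (the 0.0485 above comes from rounding z to -1.66)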
C. Poisson Process – Optional
-this distribution is used when you consider how many times an event occurs in a given period of
time or space (for example, queuing or waiting times)
1. Assumptions:
i- the probability of the event occurring must be the same (constant) across all time/space intervals of
equal length
ii- the occurrence of an event in one interval is independent of every other interval
iii- probability distribution – p(x) = (μ^x · e^(−μ)) / x!, where μ is the mean number of occurrences in the interval
2. Mean, Variance and Standard Deviation
a. Mean = E(x) = μ
b. Variance = σ² = μ
c. Standard Deviation = σ = √μ
*so an interesting characteristic of this distribution is that the variance and mean are the same.
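A quick numeric illustration of that property, truncating the infinite sum at a large x (all names and the cutoff of 100 are just for this sketch):

import math
mu = 2.0
# truncate the Poisson pmf at x = 100; the remaining tail is negligible for mu = 2
pmf = {x: mu**x * math.exp(-mu) / math.factorial(x) for x in range(101)}
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
print(round(mean, 6), round(variance, 6))  # both ≈ 2.0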
3. Example: Suppose that we are told that the number of calls a call center gets per 15 minute
time interval is 2 calls on average. Find the probability of receiving 6 calls in a 1 hour time
period.
So we know that μ = 2 for 15 min. So the average for 1 hour would be μ = 2·4 = 8, because there are 4
such 15-minute intervals in an hour.
So p(x) = (8^x · e^(−8)) / x!, and therefore
P(6) = (8^6 · e^(−8)) / 6! = (262144 · 0.000335) / 720
≈ 0.122, or about 12.2%
Note that if we wanted to get the probability of having 6 or fewer calls we would have to find P
(X ≤ 6 calls) = P(6) + P(5) + …+ P(1) +P(0)
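The same example in a short Python sketch (poisson_pmf is a hypothetical helper, not part of the notes):

import math
def poisson_pmf(x, mu):
    # P(X = x) for a Poisson r.v. with mean mu
    return mu**x * math.exp(-mu) / math.factorial(x)
mu = 2 * 4  # 2 calls per 15 minutes, scaled up to a 1-hour interval
print(round(poisson_pmf(6, mu), 3))                         # ~0.122
print(round(sum(poisson_pmf(k, mu) for k in range(7)), 3))  # P(X <= 6) ≈ 0.313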