Some Special Distributions
Discrete Probability Distribution
Bernoulli Trials
Introduction
The Bernoulli trials process, named after James Bernoulli, is one of the simplest yet most important
random processes in probability. Essentially, the process is the mathematical abstraction of coin
tossing, but because of its wide applicability, it is usually stated in terms of a sequence of generic
trials that satisfy the following assumptions:
1. Each trial has two possible outcomes, generically called success and failure.
2. The trials are independent. Intuitively, the outcome of one trial has no influence over the
outcome of another trial.
3. On each trial, the probability of success is p and the probability of failure is 1 - p.
In essence, a Bernoulli trial is an "either you make it or you don't" situation. It happens very often in real life that an event of interest has only two outcomes that matter. A Bernoulli trial is an experiment whose outcome is random and can be one of two possible results, called "success" and "failure." In practice it refers to a single event which can have one of two possible outcomes, and these events can be phrased as "yes or no" questions.
For example:
- Will the coin land heads?
- Was the newborn child a girl?
- Are a person's eyes green?
- Did a mosquito die after the area was sprayed with insecticide?
- Did a potential customer decide to buy my product?
- Did a citizen vote for a specific candidate?
- Is this employee going to vote pro-union?
- Has this person been abducted by aliens before?
Therefore 'success' and 'failure' are labels for outcomes, and should not be construed literally.
Examples of Bernoulli trials include:
- Flipping a coin. In this context, the obverse ("heads") conventionally denotes success and the reverse ("tails") denotes failure. A fair coin has probability of success 0.5 by definition.
- Rolling a die, where for example we designate a six as "success" and everything else as "failure".
- In conducting a political opinion poll, choosing a voter at random to ascertain whether that voter will vote "yes" in an upcoming referendum.
- Calling the birth of a baby of one sex "success" and of the other sex "failure." (Take your pick.)
Mathematically, such a trial is modeled by a random variable which can take only two values, 0 and 1, with 1 being thought of as "success". If p is the probability of success, then the expected value of such a random variable is p and its standard deviation is sqrt(p(1 - p)).
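As a quick sanity check of these two facts, the following minimal Python sketch (the function name bernoulli_mean_sd and the value p = 0.3 are illustrative choices, not from the text) computes the mean and standard deviation directly from the two-point distribution and compares them with p and sqrt(p(1 - p)):

import math

def bernoulli_mean_sd(p):
    # Mean and standard deviation of a Bernoulli(p) variable,
    # computed directly from its two-point distribution {0: 1 - p, 1: p}.
    outcomes = [(0, 1 - p), (1, p)]                      # (value, probability)
    mean = sum(x * prob for x, prob in outcomes)
    var = sum((x - mean) ** 2 * prob for x, prob in outcomes)
    return mean, math.sqrt(var)

p = 0.3
mean, sd = bernoulli_mean_sd(p)
print(mean, p)                                           # both 0.3
print(sd, math.sqrt(p * (1 - p)))                        # both about 0.4583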
Mathematically, we can describe the Bernoulli trials process with a sequence of indicator random
variables:
I1, I2, I3, ...
An indicator variable is a random variable that takes only the values 1 and 0, which in this setting
denote success and failure, respectively. The j'th indicator variable simply records the outcome of
trial j. Thus, the indicator variables are independent and have the same density function:
P(Ij = 1) = p, P(Ij = 0) = (1 - p)
Thus, the Bernoulli trials process is characterized by a single parameter p.
As we noted earlier, the most obvious example of Bernoulli trials is coin tossing, where success
means heads and failure means tails. The parameter p is the probability of heads (so in general, the
coin is biased).
1. In the basic coin experiment, set n = 20 and p = 0.1. Run the experiment and observe the outcomes. Repeat with p = 0.3, 0.5, 0.7, 0.9. (A simulation sketch follows these exercises.)
2. Use the basic assumptions to show that
P(I1 = i1, I2 = i2, ..., In = in) = p^k (1 - p)^(n - k), where k = i1 + i2 + ··· + in.
3. Suppose that I1, I2, I3, ... is a Bernoulli trials process with parameter p. Show that 1 - I1, 1 - I2, 1 - I3, ... is a Bernoulli trials process with parameter 1 - p.
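The "basic coin experiment" in exercise 1 refers to an interactive applet that is not reproduced here; as a rough stand-in, this minimal Python sketch (variable names are my own) simulates n = 20 Bernoulli trials for each of the suggested values of p and reports the observed number of successes:

import random

def bernoulli_trials(n, p):
    # A sequence of n indicator variables I1, ..., In with P(Ij = 1) = p.
    return [1 if random.random() < p else 0 for _ in range(n)]

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    outcomes = bernoulli_trials(20, p)
    print("p =", p, "outcomes:", outcomes, "successes:", sum(outcomes))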
Binomial Distribution
If an experiment consists of n repeated trials, each trial has two possible outcomes which may be labeled success or failure, the repeated trials are independent, and the probability of a success remains constant from trial to trial, then the experiment is called a binomial experiment.
The number of successes in n independent trials is called a binomial random variable. The probability distribution of this discrete random variable is called the binomial distribution.
If a binomial trial can result in a success with probability p and a failure with probability q = 1 - p, then the probability distribution of the binomial random variable x, the number of successes in n independent trials, is
f(x) = b(x; n, p) = C(n, x) p^x q^(n - x),   x = 0, 1, ..., n,
where C(n, x) = n!/(x!(n - x)!) is the number of ways to choose x successes from n trials.
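To make the formula concrete, here is a short Python sketch (the name binomial_pmf is an illustrative label, not from the text) that evaluates b(x; n, p) exactly as written above, using math.comb for C(n, x):

import math

def binomial_pmf(x, n, p):
    # b(x; n, p): probability of exactly x successes in n independent trials.
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# Example: exactly 2 heads in 4 tosses of a fair coin.
print(binomial_pmf(2, 4, 0.5))   # 0.375, i.e. 6/16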
In many cases, it is appropriate to summarize a group of independent observations by the number of observations in the group that represent one of two outcomes. For example, the proportion of individuals in a random sample who support one of two political candidates fits this description. In this case, the statistic p̂ = X/n is the count X of voters who support the candidate divided by the total number of individuals in the group, n. This provides an estimate of the parameter p, the proportion of individuals who support the candidate in the entire population.
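As an illustration (the sample size and the true support level below are made-up numbers, not from the text), this sketch simulates polling n voters whose true probability of support is p and reports the estimate p̂ = X/n:

import random

def estimate_support(n, true_p):
    # Simulate n independent voters; X counts supporters, p_hat = X / n.
    x = sum(1 for _ in range(n) if random.random() < true_p)
    return x / n

print(estimate_support(1000, 0.52))   # typically close to the true value 0.52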
The binomial distribution describes the behavior of a count variable X if the following conditions
apply:
1: The number of observations n is fixed.
2: Each observation is independent.
3: Each observation represents one of two outcomes ("success" or "failure").
4: The probability of "success" p is the same for each observation.
If these conditions are met, then X has a binomial distribution with parameters n and p, abbreviated
B(n,p).
Examples
1. Suppose individuals with a certain gene have a 0.70 probability of eventually contracting a
certain disease. If 100 individuals with the gene participate in a lifetime study, then the distribution
of the random variable describing the number of individuals who will contract the disease is
distributed B(100,0.7).
To find probabilities from a binomial distribution, one may either calculate them directly, use a binomial table, or use a computer. The number of sixes rolled by a single die in 20 rolls has a B(20, 1/6) distribution. The probability of rolling more than 2 sixes in 20 rolls, P(X > 2), is equal to 1 - P(X <= 2) = 1 - (P(X = 0) + P(X = 1) + P(X = 2)). Using the MINITAB command "cdf" with subcommand "binomial n=20 p=0.166667" gives the cumulative distribution function as follows:
Binomial with n = 20 and p = 0.166667

 x     P(X <= x)
 0     0.0261
 1     0.1304
 2     0.3287
 3     0.5665
 4     0.7687
 5     0.8982
 6     0.9629
 7     0.9887
 8     0.9972
 9     0.9994
[Graphs: probability density function and cumulative distribution function of the B(20, 1/6) distribution.]
Since the probability of 2 or fewer sixes is equal to 0.3287, the probability of rolling more than 2
sixes = 1 - 0.3287 = 0.6713.
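The same cumulative probabilities, and the final answer 0.6713, can be reproduced without MINITAB. Here is a self-contained Python sketch (the helper name binomial_cdf is my own) that sums the binomial probabilities term by term:

import math

def binomial_cdf(k, n, p):
    # P(X <= k) for X ~ B(n, p), summing the probability mass function.
    return sum(math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, p = 20, 1/6
for x in range(10):
    print(x, round(binomial_cdf(x, n, p), 4))   # matches the table above

print(1 - binomial_cdf(2, n, p))                # P(X > 2) is about 0.6713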
The probability that a random variable X with binomial distribution B(n, p) is equal to the value k, where k = 0, 1, ..., n, is given by
P(X = k) = C(n, k) p^k (1 - p)^(n - k), where
C(n, k) = n!/(k!(n - k)!).
The latter expression is known as the binomial coefficient, stated as "n choose k," or the number of
possible ways to choose k "successes" from n observations. For example, the number of ways to
achieve 2 heads in a set of four tosses is "4 choose 2", or 4!/2!2! = (4*3)/(2*1) = 6. The possibilities
are {HHTT, HTHT, HTTH, TTHH, THHT, THTH}, where "H" represents a head and "T" represents
a tail. The binomial coefficient multiplies the probability of one of these possibilities (which is
(1/2)²(1/2)² = 1/16 for a fair coin) by the number of ways the outcome may be achieved, for a total
probability of 6/16.
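The counting argument can be checked by brute force. This short sketch (illustrative only) enumerates all 2^4 sequences of four tosses and confirms that 6 of them contain exactly 2 heads, giving the probability 6/16 for a fair coin:

from itertools import product

sequences = list(product("HT", repeat=4))              # all 16 equally likely outcomes
two_heads = [s for s in sequences if s.count("H") == 2]
print(len(two_heads))                                  # 6, i.e. "4 choose 2"
print(len(two_heads) / len(sequences))                 # 6/16 = 0.375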
Summary
The Bernoulli distribution is the distribution of a discrete random variable taking two values, usually 0 and 1. An experiment or trial that has exactly two possible results, often classified as 'success' or 'failure', is called a Bernoulli trial. If the probability of a success is p and the number of successes in a single experiment is the random variable X, then X is a Bernoulli variable (also called a binary variable) and is said to have a Bernoulli distribution with parameter p. A binomial distribution function routine, such as the MINITAB "cdf" command used above, finds the distribution function for a given binomial distribution.