Module II – Probability and Random Variables
Normal distribution
[Figure: chart of the height frequencies tabulated below, with two plotted series labeled Series1 and Series2]

Heights (inches)    Frequency (f)    Rel. Freq
56 - 57                     3         0.0009
57 - 58                     6         0.0018
58 - 59                    26         0.008
59 - 60                    74         0.0227
60 - 61                   147         0.045
61 - 62                   247         0.0757
62 - 63                   382         0.117
63 - 64                   483         0.148
64 - 65                   559         0.1713
65 - 66                   514         0.1575
66 - 67                   359         0.11
67 - 68                   240         0.0735
68 - 69                   122         0.0374
69 - 70                    65         0.0199
70 - 71                    24         0.0074
71 - 72                     7         0.0021
72 - 73                     5         0.0015
73 - 74                     1         0.0003
Total                    3264         1
The normal curve associated with a normal distribution with mean $\mu$ and standard deviation $\sigma$ is
• Bell-shaped
• Centered at $\mu$
• Close to the horizontal axis outside the range from $\mu - 3\sigma$ to $\mu + 3\sigma$
[Figure: a normal curve]
Normally distributed variable
A variable is said to be a normally distributed variable or to have a normal distribution if
its distribution has the shape of a normal curve.
For a normally distributed variable, the percentage of all possible observations that lie
within any specified range equals the corresponding area under its associated normal
curve, expressed as a percentage.
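The area rule above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `percent_between` and the values of `mu` and `sigma` are assumptions chosen for illustration and are not taken from the heights data.

```python
from statistics import NormalDist

def percent_between(mu, sigma, low, high):
    """Percentage of observations of a normal variable that lie in [low, high]."""
    dist = NormalDist(mu, sigma)
    # Area under the normal curve between low and high, as a percentage.
    return 100 * (dist.cdf(high) - dist.cdf(low))

# Hypothetical mean and standard deviation (illustration only).
mu, sigma = 64.0, 2.5

within_3_sd = percent_between(mu, sigma, mu - 3 * sigma, mu + 3 * sigma)
print(round(within_3_sd, 2))  # about 99.73, whatever mu and sigma are
```

Almost all of the area lies within three standard deviations of the mean, which is why the curve is close to the horizontal axis outside that range.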
Probability
• Sample space and events
• Probability and some rules of probability
1. The Sample Space (S) associated with any experiment is the set of all possible
outcomes that can occur as a result of the experiment. So naturally, we will call each
element of the sample space an outcome.
EXAMPLE 1: Consider the experiment of rolling a pair of fair dice. The figure below
gives a representation of all the 36 equally likely outcomes of the sample space
associated with this experiment.
(1,1)  (1,2)  (1,3)  (1,4)  (1,5)  (1,6)
(2,1)  (2,2)  (2,3)  (2,4)  (2,5)  (2,6)
(3,1)  (3,2)  (3,3)  (3,4)  (3,5)  (3,6)
(4,1)  (4,2)  (4,3)  (4,4)  (4,5)  (4,6)
(5,1)  (5,2)  (5,3)  (5,4)  (5,5)  (5,6)
(6,1)  (6,2)  (6,3)  (6,4)  (6,5)  (6,6)
2. An event (E) is any subset of the sample space.
3. The probability of an event E (written as P(E)) in a sample space S with equally
likely outcomes is given by
\[ P(E) = \frac{\text{number of outcomes in } E}{\text{number of outcomes in } S} \]
EXAMPLE 2: For the sample space in Example 1, consider the event
E = {the sum of the faces is 7 or 3}.
Then
E = { (1,6), (2,5), (3,4), (4,3), (5,2), (6,1), (1,2), (2,1) }
Thus
\[ P(E) = \frac{\text{number of outcomes in } E}{\text{number of outcomes in } S} = \frac{8}{36} = \frac{2}{9} \]
Alternatively: If we let
A = {sum of faces is 7} and B = {sum of faces is 3}
then $E = A \cup B$, and
\[ P(E) = P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{6}{36} + \frac{2}{36} - 0 = \frac{2}{9} \quad \text{[Additive Rule]} \]
Observe: Let F = {the sum of faces is neither 7 nor 3}, i.e., F is the complement of E.
Then,
\[ P(F) = 1 - P(E) = 1 - \frac{2}{9} = \frac{7}{9} \quad \text{[Complement Rule]} \]
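The counts in Examples 1 and 2 can be reproduced by enumerating the sample space. This is a small sketch using exact fractions; the helper name `prob` is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for a pair of fair dice: 36 equally likely ordered pairs.
S = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) for equally likely outcomes: |event| / |S|."""
    return Fraction(len(event), len(S))

A = {o for o in S if sum(o) == 7}   # sum of faces is 7
B = {o for o in S if sum(o) == 3}   # sum of faces is 3
E = A | B                           # sum is 7 or 3
F = S - E                           # complement of E

print(prob(E))                           # 2/9
print(prob(A) + prob(B) - prob(A & B))   # 2/9, by the additive rule
print(prob(F))                           # 7/9, by the complement rule
```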
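Properties:
1. $0 \le P(E) \le 1$;
2. $P(\emptyset) = 0$ and $P(S) = 1$;
3. Odds for an event E $= \dfrac{\text{number of elements in } E}{\text{number of elements in } \overline{E}} = \dfrac{P(E)}{P(\overline{E})}$
4. Odds against E $= \dfrac{\text{number of elements in } \overline{E}}{\text{number of elements in } E} = \dfrac{P(\overline{E})}{P(E)}$
Here $\overline{E}$ denotes the complement of E.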
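As an illustration of the odds properties, the sketch below applies them to the event of Example 2, for which P(E) = 2/9. The function names `odds_for` and `odds_against` are assumptions, and the probability of the complement is taken as 1 - P(E).

```python
from fractions import Fraction

def odds_for(p):
    """Odds for an event with probability p: P(E) / P(complement of E)."""
    return p / (1 - p)

def odds_against(p):
    """Odds against an event with probability p: P(complement of E) / P(E)."""
    return (1 - p) / p

p_E = Fraction(2, 9)       # P(sum of faces is 7 or 3), from Example 2
print(odds_for(p_E))       # 2/7, i.e. odds of 2 to 7
print(odds_against(p_E))   # 7/2, i.e. odds of 7 to 2
```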
Conditional Probability
EXAMPLE 3: A regular deck of playing cards consists of 52 cards:
13 clubs (black), 13 diamonds (red), 13 spades (black), 13 hearts (red).
The 13 cards of each suit are labeled: Ace (A), 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack (J), Queen (Q), King (K).
Consider the experiment of drawing a single card from the deck. The sample space
associated with the experiment has 52 equally likely outcomes. Consider the event
E = {a black ace is drawn}.
Then we have,
P(E) = 2/52.
i.e.,
the probability of drawing a black ace is 1/26.
However, suppose a card is drawn and we are informed that it is a club. The question
then becomes, 'what is the probability of drawing a black ace, given the information that
the card drawn is a club?' If F = {a club is drawn}, the question can be rephrased as
'what is the probability of E given F?' This is symbolically written as: Find
P(E | F)
i.e., P(E | F) represents the probability of the event E given the condition F.
Clearly, the given condition reduces the size of the event E to 1 outcome, since there is
only one black ace that is a club; the given condition also reduces the size of the sample
space to 13 outcomes since there are 13 clubs.
Thus,
P(E | F) = 1/13
Using the Formula:
\[ P(E \mid F) = \frac{P(E \cap F)}{P(F)}, \quad P(F) \ne 0 \]
\[ P(E \mid F) = \frac{1/52}{13/52} = \frac{1}{13} \]
Note:
\[ P(E \cap F) = P(E \mid F) \cdot P(F), \quad P(F) \ne 0 \quad \text{[Product Rule]} \]
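Example 3 can be verified by building the deck and applying the conditional probability formula directly. This is only a sketch; the rank and suit labels and the helper `prob` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = {"clubs": "black", "diamonds": "red", "spades": "black", "hearts": "red"}

# 52 equally likely outcomes: one (rank, suit) pair per card.
deck = set(product(ranks, suits))

def prob(event):
    """P(event) for a single card drawn from the full deck."""
    return Fraction(len(event), len(deck))

E = {(r, s) for (r, s) in deck if r == "A" and suits[s] == "black"}  # black ace
F = {(r, s) for (r, s) in deck if s == "clubs"}                      # a club

p_E_given_F = prob(E & F) / prob(F)   # conditional probability formula
print(prob(E))        # 1/26
print(p_E_given_F)    # 1/13
```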
Independent Events
Definition: Let E and F be two events of a sample space S with P(F) > 0.
The event E is independent of the event F iff
P(E | F) = P(E).
Theorem: Let E, F be events for which P(E) > 0 and P(F) > 0. If E is independent of F,
then F is independent of E.
Test for Independence: Two events E and F of a sample space S are independent iff
\[ P(E \cap F) = P(E) \cdot P(F), \quad P(F) \ne 0 \]
EXAMPLE 4:
A fair coin is tossed twice. Define the events E and F to be
E: A head turns up on the first throw of a fair coin;
F: A tail turns up on the second throw of a fair coin.
Show that E and F are independent.
Solution:
E = {HH, HT}, and F = {HT, TT}.
$E \cap F = \{HT\}$, therefore $P(E \cap F) = 1/4$.
Also, $P(E) \cdot P(F) = (2/4) \cdot (2/4) = 1/4$.
Thus, events E and F are independent.
Warning!: Mutually exclusive events are generally not independent.
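The test for independence in Example 4, and the warning about mutually exclusive events, can both be checked by enumeration. In this sketch the event G (a tail on the first throw) is an extra event introduced only to illustrate the warning; it is not part of the original example.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes of two tosses: HH, HT, TH, TT.
S = {"".join(t) for t in product("HT", repeat=2)}

def prob(event):
    return Fraction(len(event), len(S))

E = {o for o in S if o[0] == "H"}   # head on the first throw
F = {o for o in S if o[1] == "T"}   # tail on the second throw
G = {o for o in S if o[0] == "T"}   # tail on the first throw (hypothetical extra event)

# E and F pass the product test, so they are independent.
e_f_independent = prob(E & F) == prob(E) * prob(F)

# E and G are mutually exclusive, and they fail the product test.
e_g_independent = prob(E & G) == prob(E) * prob(G)

print(e_f_independent, e_g_independent)  # True False
```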
Discrete random variables and probability distributions
Random Variables: Suppose a pair of dice is rolled. The value of the sum of the
numbers on the dice depends on chance. The ‘sum of the numbers on the dice’ is
therefore called a random variable. A random variable is a quantitative variable whose
value depends on chance. Another example is the number of siblings each student has in
a class.
Discrete variable: A discrete variable is a variable whose possible values form a finite
set or a countably infinite set of numbers. The variable ‘sum of the numbers on the dice’
is a discrete variable. What are its possible values?
Discrete random variables:
A discrete random variable is a random variable whose possible values form a finite or
countably infinite set.
Note: We usually use uppercase letters to denote random variables.
Probability distribution: A listing of all the possible values of a discrete random variable and
their corresponding probabilities is called a probability distribution. This may be
considered an extension of the notion of a relative-frequency distribution.
Probability Histogram:
Example: A fair dime is tossed three times. The 8 equally likely outcomes are:
TTT   TTH   THT   THH   HTT   HHT   HTH   HHH
Let X denote the random variable, the number of tails obtained in three tosses of a fair
dime. A probability distribution of the random variable is shown in the table below:
Number of tails (x)    Probability P(X = x)
0                      1/8
1                      3/8
2                      3/8
3                      1/8
Total                  1.00
Probability Histogram:
[Figure: probability histogram of X, the number of tails in three tosses]
Interpretation of probability distributions
The probability of an event is approximately the proportion of times the event occurs in a
large number of independent repetitions of the experiment. Equivalently, the
probability histogram of a random variable approximates the histogram of the observed
proportions of its values.
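This interpretation can be illustrated with the three-toss example: the exact distribution of the number of tails is obtained by enumeration and then compared with the proportions from a simulation. The seed and the number of repetitions below are arbitrary assumptions, so the simulated proportions will only be close to, not equal to, the exact probabilities.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import product

# Exact distribution: count tails in each of the 8 equally likely outcomes.
outcomes = list(product("HT", repeat=3))
counts = Counter(o.count("T") for o in outcomes)
exact = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# Simulation: repeat the three-toss experiment many times (assumed settings).
rng = random.Random(0)
trials = 100_000
sim = Counter(sum(rng.random() < 0.5 for _ in range(3)) for _ in range(trials))
proportions = {x: sim[x] / trials for x in exact}

for x in exact:
    print(x, exact[x], round(proportions[x], 3))
```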
Mean and standard deviation of a discrete random variable
Computing the mean of a Discrete Random Variable:
Consider the ages of 10 students:
18   19   21   20   21   21   19   18   20   20
Let X denote the age of a randomly selected student (random variable).
The probability distribution of the random variable X is shown in the table below:
Age x    P(X = x)
18       2/10
19       2/10
20       3/10
21       3/10
The mean age of the 10 students is
\[
\begin{aligned}
& \frac{18 + 18 + 19 + 19 + 20 + 20 + 20 + 21 + 21 + 21}{10} \\
&= 18 \cdot \frac{2}{10} + 19 \cdot \frac{2}{10} + 20 \cdot \frac{3}{10} + 21 \cdot \frac{3}{10} \\
&= 18 \cdot P(X = 18) + 19 \cdot P(X = 19) + 20 \cdot P(X = 20) + 21 \cdot P(X = 21)
\end{aligned}
\]
Formula for the mean of a discrete random variable:
\[ \mu = \sum x \cdot P(X = x) \]
Interpretation of the Mean of a Random Variable:
In a large number of observations of a random variable, the average value of the
observations is approximately equal to the mean $\mu$.
Formula for the Standard deviation of a Discrete Random Variable:
\[ \sigma = \sqrt{\sum (x - \mu)^2 \, P(X = x)} \]
or
\[ \sigma = \sqrt{\sum x^2 \, P(X = x) - \mu^2} \]
Note: $\sigma^2$ is called the variance of the random variable.
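Applied to the age distribution above, the mean and standard deviation formulas can be evaluated as follows. This is a short sketch with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Probability distribution of the age X of a randomly selected student.
dist = {18: Fraction(2, 10), 19: Fraction(2, 10),
        20: Fraction(3, 10), 21: Fraction(3, 10)}

# Mean: sum of x * P(X = x).
mu = sum(x * p for x, p in dist.items())

# Variance, computed with both forms of the formula.
var_definition = sum((x - mu) ** 2 * p for x, p in dist.items())
var_shortcut = sum(x * x * p for x, p in dist.items()) - mu ** 2

sigma = sqrt(var_definition)

print(float(mu))               # 19.7
print(float(var_definition))   # 1.21
print(round(sigma, 4))         # 1.1
```

Both forms of the variance give the same value, which is a useful check when doing the computation by hand.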
Binomial distribution
Binomial Probability
Bernoulli Trial: Random experiments are called Bernoulli trials if
a. the same experiment is repeated several times
b. there are only two possible outcomes (success and failure) on each trial
c. the repeated trials are independent
d. the probability of each outcome remains the same for each trial
Bernoulli trials can always be represented by a tree diagram. Let the outcome success be
denoted by S and the outcome failure by F. If P(S) = p and P(F) = q, then p + q = 1.
The tree diagram for the experiment repeated twice is:
Start
 |-- p -- S --+-- p -- S
 |            +-- q -- F
 |-- q -- F --+-- p -- S
              +-- q -- F
EXAMPLE 6: A marksman hits a target with a probability 4/5. Assuming independence
for successive firings, find the probability of getting two misses and one hit.
Let S represent 'hit' and F represent 'miss'. Then P(S) = 4/5 = p, and P(F) = 1/5 = q.
Then by the binomial probability formula, the probability of getting two misses and one
hit (k = 1 and n = 3) is given by:
\[ b(n, k; p) = \frac{n!}{k!\,(n - k)!} \, p^k q^{n-k} \]
\[ b(3, 1; 4/5) = \frac{3!}{1!\,(3 - 1)!} \, (4/5)^1 (1/5)^{3-1} = 0.096 \]
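The calculation in Example 6 can be written as a small function. The sketch below uses `math.comb` for the factorial ratio; the function name `binomial_pmf` is an assumption, and exact fractions are used so the result can be compared with 0.096 without rounding error.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, k, p):
    """b(n, k; p): probability of exactly k successes in n Bernoulli trials."""
    q = 1 - p
    return comb(n, k) * p ** k * q ** (n - k)

# One hit (k = 1) in three shots (n = 3) with hit probability 4/5.
one_hit = binomial_pmf(3, 1, Fraction(4, 5))
print(one_hit, float(one_hit))  # 12/125 0.096
```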
Binomial Distribution
According to the U.S. National Center for Health Statistics, there is an 80% chance that a
person aged 20 will be alive at age 65. Consider the experiment of selecting at random
three people aged 20. Observing whether a person currently aged 20 is alive at age 65 has
two possible outcomes: dead (d) or alive (a). Each person observed is a Bernoulli Trial.
The trials are independent. The 8 possible outcomes of the three Bernoulli Trials are given
below (they can be easily obtained from a tree diagram):
aaa   aad   ada   add   daa   dad   dda   ddd
We have
P(a) = 0.8 and P(d) = 0.2
Since the trials are independent, the probability of each three-trial outcome is the product
of the probabilities of the individual trial outcomes, for example,
P(aad) = (0.8)(0.8)(0.2) = 0.128
Note:
P(Exactly two will be dead) = P(add) + P(dad) + P(dda)
Probability Distribution:
Let X denote the random variable, the number of people out of the three that are alive at
age 65. The probability distribution of the random variable X is given in the table below:
Number of people alive (x)    P(X = x)
0                             0.008
1                             0.096
2                             0.384
3                             0.512
Probability Distribution Histogram:
[Figure: probability histogram of X, the number of people out of the three alive at age 65]
Note: Generally, the binomial distribution is right skewed if p < 0.5, is symmetric if
p = 0.5, and is left skewed if p > 0.5.
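The probability distribution of the number of people alive at age 65 can be rebuilt by listing the eight outcomes and multiplying the trial probabilities, as described above. This is a minimal sketch; the dictionary names are illustrative.

```python
from itertools import product

p = {"a": 0.8, "d": 0.2}   # P(alive), P(dead) for one person

# Accumulate P(X = x), where X is the number of people alive.
dist = {x: 0.0 for x in range(4)}
for outcome in product("ad", repeat=3):      # aaa, aad, ..., ddd
    prob = 1.0
    for person in outcome:
        prob *= p[person]                    # independent trials: multiply
    dist[outcome.count("a")] += prob

for x, px in dist.items():
    print(x, round(px, 3))
```

With p = 0.8 most of the probability sits at x = 2 and x = 3, which matches the left-skew noted for p > 0.5.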
Binomial Probability Formula:
Assume
• n identical trials are performed
• For each trial, there are two outcomes, success or failure
• Each trial is independent
• The probability of success, p, remains the same from trial to trial
Then, the binomial probability formula for the number of successes, X, is
\[ P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x} \]
From the example above:
If x = 2, $P(X = 2) = \binom{3}{2} (0.8)^2 (1 - 0.8)^{3 - 2} = 0.384$
Bayes' Formula
Consider the partition of the sample space U into three subsets A, B, and C.
Let E be any event in U so that P(E) > 0 (see figure below).
[Figure: the sample space U partitioned into A, B, and C, with the event E]
P(A), P(B), and P(C) are referred to as a priori probabilities, and
P(A | E), P(B | E), and P(C | E) are called a posteriori probabilities, and are given by
Bayes' Formula, for example
\[ P(A \mid E) = \frac{P(A) \cdot P(E \mid A)}{P(A) \cdot P(E \mid A) + P(B) \cdot P(E \mid B) + P(C) \cdot P(E \mid C)} \]
EXAMPLE 5
A computer manufacturer has three assembly plants. Records show that 2% of the sets
shipped from plant A turn out to be defective, as compared to 3% of those that come from
plant B and 4% of those that come from plant C. In all, 30% of the manufacturer's total
production comes from plant A, 50% from plant B, and 20% from plant C. If a customer
finds that her computer is defective, what is the probability it came from plant B?
Solution:
First recognize that the problem is solvable using Bayes' formula based on a
partitioning of the sample space (in this example, plants A, B, and C). Note that every
problem solvable by Bayes' formula has an associated probability tree diagram. If we let
D denote 'defective' and $\overline{D}$ denote 'non-defective', then the tree diagram
associated with our example is:
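Start
 |-- .3 -- A --+-- .02 -- D (defective)
 |             +--------- $\overline{D}$ (non-defective)
 |-- .5 -- B --+-- .03 -- D (defective)
 |             +--------- $\overline{D}$ (non-defective)
 |-- .2 -- C --+-- .04 -- D (defective)
               +--------- $\overline{D}$ (non-defective)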
Using Bayes' Formula
\[ P(B \mid D) = \frac{P(B) \cdot P(D \mid B)}{P(A) \cdot P(D \mid A) + P(B) \cdot P(D \mid B) + P(C) \cdot P(D \mid C)} \]
\[ P(B \mid D) = \frac{(.5) \cdot (.03)}{(.3) \cdot (.02) + (.5) \cdot (.03) + (.2) \cdot (.04)} = .51724 \]
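The same computation can be sketched in code, which also makes it easy to obtain the a posteriori probabilities for the other plants. The function name `posterior` and the dictionary layout are assumptions made for illustration.

```python
def posterior(priors, likelihoods, target):
    """Bayes' formula: P(target | evidence) over a partition of the sample space."""
    # Denominator: total probability of the evidence across the partition.
    total = sum(priors[k] * likelihoods[k] for k in priors)
    return priors[target] * likelihoods[target] / total

priors = {"A": 0.30, "B": 0.50, "C": 0.20}         # share of production per plant
defect_rates = {"A": 0.02, "B": 0.03, "C": 0.04}   # P(D | plant)

p_B_given_D = posterior(priors, defect_rates, "B")
print(round(p_B_given_D, 5))  # 0.51724

# The three a posteriori probabilities sum to 1.
all_posteriors = {k: posterior(priors, defect_rates, k) for k in priors}
```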