Download probability

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Central limit theorem wikipedia , lookup

Transcript
Introduction to Probability
Elementary properties of probability
• Definition: Probability is the likelihood or chance of an event occurring.
• Probability of an event is usually referred to as relative frequency
• Probability statements used frequently in biostatistics – e.g., we say that we are
90% probably sure that an observed treatment effect in a study is real; the
success probability of this surgery is only 10%; the probability of tumor to
develop is 50%......
THE RELATIVE FREQUENCY INTERPRETATION OF PROBABILITY
• We are interested in learning about the probability of some event to
occur in a certain process based on the number of actual occurrence
of that event relative to number of times that process repeated.
• For example, our process could be rolling a dice, and we are
interested in the probability in the event that the number on the dice
is equal to 6.
• I did this dice experiment 20 times. Each time I recorded the top face
number: results are below:
1 3 2 2 4 6 4 5 3 1 3 5 6 4 2 1 2 5
6 2
Experimental Probability Vs Theoretical
Probability
• Experimental Probability = 3/20
• Theoretical probability = 1/6
• The theoretical probability is what you expect to happen, but it isn't
always what actually happens.
• When n increases experimental probability approaches theoretical
probability. When n
∞, experimental probability = theoretical
probability.
In general, the probability of an event can be approximated by the
relative frequency , or proportion of times that the event occurs.
PROBABILITY (EVENT) is approximately = # of times event occurs/# of
experiments
THE SUBJECTIVE INTERPRETATION OF
PROBABILITY
• The relative frequency notion of probability is useful when the
process of interest, say tossing a coin, can be repeated many times
under similar conditions. But we wish to deal with uncertainty of
events from processes that will occur a single time. e.g. probability of
a success of a surgery, probability to get A in class.
• A subjective probability reflects a person's opinion about the
likelihood of an event. Different between people.
Elementary properties of probability
Given some process (or experiment) with n mutually exclusive
outcomes (called events), E1, E2, E3, …, En, the probability of any
event Ei is assigned a nonnegative number. That is:
P( Ei )  0
The sum of probabilities of all mutually exclusive outcomes is equal to
1 (exhaustiveness):
P( E1 )  P( E2 )  P( E3 )  ...............P( En )  1
Probability as a Numerical Measure
of the Likelihood of Occurrence
Increasing Likelihood of Occurrence
Probability:
0
The event
is very
unlikely
to occur.
.5
The occurrence
of the event is
just as likely as
it is unlikely.
1
The event
Is almost
certain
to occur.
Some Basic Relationships of
Probability
There are some basic probability relationships that
can be used to compute the probability of an event
without knowledge of all the sample point probabilities.
Complement of an Event
Union of Two Events
Intersection of Two Events
Mutually Exclusive Events
Elementary properties of probability
Mutual Exclusiveness: Two or more events are said to be
mutually exclusive if the occurrence of any one of them
means the others will not occur (That is, we cannot have 2
events occurring at the same time)
Mutually exclusive events
H
Non-Mutually exclusive events
A
T
Example: Tossing a coin: the
outcome of head or tail is mutual
exclusive events. No way in a
certain toss you will have head
and tail at the same time. It is
either H or T
Example:
A
B
B
Calculating the probability of an event
• Sample of 75 men and 36 women were selected to study the cocaine
addiction as function of gender. The subjects are representative sample of
typical adult who were neither in treatment nor in jail.
Frequency of cocaine use by gender among adult cocaine users
Lifetime frequency of
cocaine use
Male (M)
Female (F)
Total
1-19 times (A)
32
7
39
20-99 times (B)
18
20
38
100+ times (c)
25
9
34
Total
75
36
111
Suppose we pick a person at random from this sample, what is the probability that this person is a male?
• 111 subjects are our population.
• Male and female are mutually exclusive categories
• The likelihood of selecting any one person is equal to the likelihood of selecting any other
• The desired probability is the number of subjects with the characteristic of interest (Male)
divided by the total number of subjects
P(M) = Number of males / Total number of subjects = 75/111 = 0.6757
Dr. Alkilany 2012
Conditional probability, P(B|A)
• Conditional probability of an event B is the probability that the
event will occur given the knowledge that an event A has
already occurred. This probability is written P(B|A)
• Suppose we pick a subject at random from the 111 subjects
and find that he is a male (M), what is the probability that this
male will be one who has used cocaine 100 times or more
during his lifetime?
• What is the probability that a subject has used cocaine 100
times or more given he is a male?
• P(C100ΙM) = 25/75 = 0.33
Joint probability, P(A∩B)
• Joint probability is defined as the probability of both A and B taking place
together (at same time)
• The joint probability is given the symbolic notation: P(A∩B) in which the
symbol “∩” is read either as “intersection” or “and”.
• What is the probability that a person picked at random from the 111
subjects will be male and be a person who had used cocaine 100 times or
more?
• The statement M∩C100 indicates the joint occurrence of conditions M and
C100.
P(M∩C100) = 25/111 = 0.2252
The Addition Rule, P(AUB)
for mutual exclusive events
• In the previous example, if we picked a person at random
from the 111 subject sample, what is the probability that
this person will be a male (M) or female (F)?
• We state this probability as P(M∪F) were the symbol ∪ is
read either “union” or “or”.
• Since the two genders are mutually exclusive and
variables:
P(M∪F) = P(M) + P(F) =(75/111)+(36/111
=0.6757+0.3243=1
A
B
P(A  B)  P(A)  P(B)
Calculating the probability of an event
• Complimentary events:
• The probability of an event A is equal to 1 minus the
probability of its compliment which is written as Ā, and
P(Ā)=1-P(A)
1.
Ā
A
2.
Dr. Alkilany 2012
If the probability of having
smokers in this class is 5%, so
what is the probability of
having non-smokers in the
same class?
If you toss a dice, what is the
probability not to have 2
“face up”?
Biostatistics
Lecture 6
Probability Distributions
Dr. Alkilany 2012
Probability Distributions
• The probability distribution of random variable is a graph,
table, or formula used to specify all possible values of a
random variable along with the respective probabilities.
formula
graph
table
Dr. Alkilany 2012
Probability Distributions
“General Example”
• The following table shows the prevalence of
prescription and nonprescription drugs use in
pregnancy among a population of women
who were delivered of infants.
Variable Probability
Variable
Probability Distributions Table
Variable Probability
• We wish to construct the probability
distribution of the variable X = number of
prescription and nonprescription drugs used
by the study subjects.
Dr. Alkilany 2012
Probability Distributions
Curve
Variable (# Drugs)
Note: 0≤P(X=x)≤1; ΣP(X=x)=1
Probability distributions “General Example”
Q1: What is the probability that a randomly
selected woman will be one who used exactly
three prescription and nonprescription drugs?
P(X=3)=0.0832
Q2: What is the probability that a randomly
selected woman used wither one or two
drugs?
We use the addition rule for mutually exclusive
events
P(1∪2)=P(1)+P(2)=0.3228+0.1895=0.5123
Dr. Alkilany 2012
Probability distributions “General Example”
• Sometimes we may want to work with a
cumulative probability distribution of a
random variable.
• The cumulative probability distribution
is obtained by successively adding the
probabilities, P(X=x) given in the last
column of the previous table.
• Q3: What is the probability that a
woman picked at random will be one
who used two or fewer drugs? P=0.8528
• Q4: What is the probability that a
randomly selected woman will be one
who used fewer than two drugs? P=
0.6633
Dr. Alkilany 2012
Probability distributions “General Example”
Q5: What is the probability that a
randomly selected woman is one who
used between three and five drugs,
inclusive?
P(x≤5)=0.9872 is the probability that a
woman used between zero and five drugs,
inclusive.
P(X≤2)= 0.8528 is the probability that a
woman used between zero and two drugs,
inclusive.
P(3≤x≤5)=P(x≤5)-P(X≤2)
= 0.9872- 0.8528=0.1344
“BE CARFULL HERE!”
Dr. Alkilany 2012
Probability Distributions (Types)
Binomial
Distribution
Discrete
Poisson
Distribution
Probability
Distribution
Continues
Dr. Alkilany 2012
Normal
Distribution
Continues distribution
•
A continuous variable is one that can assume any value within a specified
interval of values.
•
Consequently, between any two values assumed by a continuous variable,
there exist an infinite number of values.
•
In Normal distributions, as the number of possible outcomes
(observations, n) approaches infinity, and the width of the class intervals
approaches zero, the frequency polygon approaches a smooth curve.
•
Dr Alkilany 2012
Normal distribution
•
The most important of continuous probability distributions is the normal
distribution (some times referred to as Gaussian distribution, Carl
Friedrich Gauss, 1777-1855).
-∞<x<∞
•
•
The parameters of distribution are:
μ: mean and σ: the standard deviation.
π=3.14159, e=2.71828
Dr Alkilany 2012
Characteristics of the normal distribution
1. is considered the most prominent probability distribution in statistics
2. arises as the outcome of the central limit theorem, which states that
under mild conditions the sum of a large number of random variables
is distributed approximately normally
3. It is symmetrical about its mean, µ
4. The curve on either side of µ is a mirror image of the other side
5. The mean, the median, and the mode are all equal
6. The total area under the curve above the x-axis is one square unit.
This characteristic follows from the fact that the normal distribution is
a probability distribution
μ
σ
Mode
-x
x
Dr Alkilany 2012
Characteristics of the normal distribution
7 . The relative frequency of occurrence of values between any two points
(a, b) on the x-axis is equal to the total area bounded by the curve
(gray area).
Dr Alkilany 2012
Characteristics of the normal distribution
8.
If we erect perpendiculars a distance of 1 standard deviation from the
mean in both directions, the area enclosed by these perpendiculars,
the x-axis, and the curve will be approximately 68 percent of the total
area.
9. If we extend these lateral boundaries a distance of 2 standard
deviations on either side of the mean,
approximately 95
percent of the area will be
enclosed.
10. If we extend these lateral boundaries a distance of 3 standard
deviations will cause approximately 99.7 percent of the total area to
be enclosed.
Dr Alkilany 2012
Characteristics of the normal distribution
11. The normal distribution is completely determined by the parameters µ and .
Different values of µ, shift the graph of the distribution along the x-axis.
Different values of  determine the degree of flatness or peakedness of the
graph of the distribution.
3 Normal distributions with
different µ values and equal
 values
Summary
µ: location
: shape
3 Normal distributions with
equal µ values and different
 values
1<2<3
Dr Alkilany 2012
Biostatistics
Lecture 9
Probability Distributions
Dr Alkilany 2012
Normal distribution
•
The most important of continuous probability distributions is the normal
distribution (some times referred to as Gaussian distribution, Carl
Friedrich Gauss, 1777-1855).
-∞<x<∞
•
•
The parameters of distribution are:
μ: mean and σ: the standard deviation.
π=3.14159, e=2.71828
Dr Alkilany 2012
Characteristics of the normal distribution
1. is considered the most prominent probability distribution in statistics
2. arises as the outcome of the central limit theorem, which states that
under mild conditions the sum of a large number of random variables
is distributed approximately normally
3. It is symmetrical about its mean, µ
4. The curve on either side of µ is a mirror image of the other side
5. The mean, the median, and the mode are all equal
6. The total area under the curve above the x-axis is one square unit.
This characteristic follows from the fact that the normal distribution is
a probability distribution
μ
σ
Mode
-x
x
Dr Alkilany 2012
Characteristics of the normal distribution
7 . The relative frequency of occurrence of values between any two points
(a, b) on the x-axis is equal to the total area bounded by the curve
(gray area).
Dr Alkilany 2012
Characteristics of the normal distribution
8.
If we erect perpendiculars a distance of 1 standard deviation from the
mean in both directions, the area enclosed by these perpendiculars,
the x-axis, and the curve will be approximately 68 percent of the total
area.
9. If we extend these lateral boundaries a distance of 2 standard
deviations on either side of the mean,
approximately 95
percent of the area will be
enclosed.
10. If we extend these lateral boundaries a distance of 3 standard
deviations will cause approximately 99.7 percent of the total area to
be enclosed.
Dr Alkilany 2012
Characteristics of the normal distribution
11. The normal distribution is completely determined by the parameters µ and .
Different values of µ, shift the graph of the distribution along the x-axis.
Different values of  determine the degree of flatness or peakedness of the
graph of the distribution.
3 Normal distributions with
different µ values and equal
 values
Summary
µ: location
: shape
3 Normal distributions with
equal µ values and different
 values
1<2<3
Dr Alkilany 2012
The Standard Normal Distribution
•
A standard normal distribution is a normal distribution which
has a mean (µ)=0 and a standard deviation ()=1.
Z (standard score, z-score)
= (x — µ)/
f (z) 
1 z 2 / 2
e
2
Normal Distribution
Standard Normal Distribution
Standardization
-∞<x<∞
Dr Alkilany 2012
-∞<z<∞
What is z-score??
•
Z-scores are scores converted into the number of standard deviations that the values (x) are from the mean (μ):
•
(Important note: for z-scores:
Z-scoring is used to standardize scores in order to provide a way of comparing
them
(compare
of 3.0 in IELTS
mean
(µ)=0
and score
a standard
with a 580 in the old TOEFL exam??)
deviation ()=1)
z

• A Z-score= +2.0 means that the original
score (x) was 2 standard deviations above
the mean. A Z-score= –3.0 means that the
original score was three standard deviations
below the mean
Z=-3.0
Z=+2.0
Do it @ home (NOT a homework)
for values
0, 5, 10, 15, 20 calculate z-score for every value and then
calculate the mean and standard deviation for all z-scores
Dr Alkilany 2012
x

The Standard Normal Distribution
To find the probability that z takes on a value between any two points on the zaxis (z0 and z1), we must find the area bounded by the perpendiculars erected at
these points, the curve and the horizontal z-axis.
•
The area can be found by integrating the f(z) function between two values of the
variable. In the standard normal distribution, the integral (area) is given by the
equation:
f(z)
•
z0
z1

Dr Alkilany 2012
z1
z0
1 z 2 / 2
e
dz
2
The Standard Normal Distribution
•
There are CUMULATIVE TABLES that provide the results of all such
integrations.
•
These tables provide the areas under the curve between - and z
(cumulative relative frequency again!!).
•
These tables are used to find the probability that a statistic is observed
below, above, or between values on the standard normal distribution
Dr Alkilany 2012
-3.8
+3.8
Dr Alkilany 2012
P(Z ≤ 1.0) = 0.8413
P(Z ≤ 1.57) = 0.9418
P(Z ≤ 1.96) = 0.9750
P(2.4 ≤ Z ≤ 2.5) =
0.9938-0.9918=0.002
P(Z > 3.0) =1-P(Z≤3)=
1-0.9987=0.0013
Dr Alkilany 2012
Example I
• As a part of a study of Alzheimer’s disease, data reported were compatible
with the hypothesis that brain weights of victims of the disease are
normally distributed. From the reported data we may compute a mean of
1076.80 g and a standard deviation of 105.76 g. Find the probability that a
randomly selected patient will have a brain that weighs less than or equal
to 800 grams.
• Given from the question:
1. Brain weight mean (µ)=1076.80 g
2. Brain weight standard deviation (σ)=105.76 g
3. Brain weight measurements are normally distributed
• The question asks for the probability randomly selected patients to have a
brain weights less than or equal to 800 g……..> P(x≤800)
What do we need to do first??
Dr Alkilany 2012
Application of Normal Distribution
First, we need to calculate the z-score for x=800 g
800  1076.80
z
 2.62
105.76
Now we can use the standard normal distribution table:
What is the probability that randomly selected patients to have a brain weights less than or
equal to Z=-2.62……..> P(z≤ -2.62)=0.0044
Dr Alkilany 2012
Example II
• Suppose it is known that the Dissolution time (Dt) of a certain
pharmaceutical tablets are approximately normally distributed with a mean
of 70 min and a standard deviation of 3 min. What is the probability that a
tablet picked at random from this batch will have Dt between 65 and 74
min?
(x  )
65  70
 1.67

3
(x  ) 74  70
z74 

 1.33

3
P(65  x  74)  P(1.67  z  1.33)  P(  z  1.33)  P(  z  1.67)  0.9082  0.0475  0.8607
z65 


Dr Alkilany 2012
Example III
• In a batch of 10,000 tablets described in the previous example, how many
tablets would you expect to have Dt at least 77min?
77  70
z77 
 2.33
3
P(x  77)  P(z  2.33)  1  0.9901  0.0099
• Out of 10,000 tablets, we would expect 10,000*0.0099=99 to have
dissolution time at least 77 min

Z=+2.33
Dr Alkilany 2012