Probability and Probability Distributions
● Experiment
– It is a process that results in an observation (often called an outcome or
sample point) which cannot be determined with certainty in advance of
the experiment
● Sample Space (S)
– S is the set of all possible outcomes of the experiment.
● Event (A, B, C, etc.)
– An event is a subset of the sample space S
● Probability of an Event A:
– P(A) = sum of the probabilities of all outcomes that are in the event A
● Examples:
– Experiment: Throw a die once
• Sample space: S = { 1, 2, 3, 4, 5, 6 }
• Event A: even number; A = { 2, 4, 6 }
– Experiment: Throw a die twice
• Sample space: S = { (1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
(2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
(3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
(5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
(6,1), (6,2), (6,3), (6,4), (6,5), (6,6) }
• Event A: sum is 10; A = { (4,6), (5,5), (6,4) }
• Event B: sum < 4; B = { (1,1), (1,2), (2,1) }
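The two-dice sample space above can be enumerated directly. A short Python sketch (standard library only) that recovers events A and B by counting outcomes:

```python
from itertools import product

# Sample space for throwing a die twice: 36 equally likely ordered pairs.
S = list(product(range(1, 7), repeat=2))
print(len(S))  # 36

A = [s for s in S if sum(s) == 10]  # event A: sum is 10
print(sorted(A))        # [(4, 6), (5, 5), (6, 4)]
print(len(A) / len(S))  # p(A) = 3/36

B = [s for s in S if sum(s) < 4]    # event B: sum < 4
print(sorted(B))        # [(1, 1), (1, 2), (2, 1)]
```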
Probability of an Event: p(A)
● Classical Definition:
– If an experiment results in N equally likely outcomes, then p(A) = N_A/N, where N_A is the number of outcomes in the event A. (Note: random selection of k units from N distinct units implies every possible group of k units is equally likely.)
● Relative Frequency or Empirical Definition:
– If an event occurs n_A times in n repetitions of an experiment, then p(A) = n_A/n whenever n is sufficiently large.
– Example: When a fair coin is tossed a large number of times, you expect to observe 50% heads and 50% tails. We use this proportion (i.e., the relative frequency) as p(Head) in a single toss: p(H) = ½.
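The empirical definition can be illustrated with a quick simulation; the seed and number of tosses below are arbitrary choices for reproducibility:

```python
import random

random.seed(1)  # arbitrary seed so the run is reproducible
n = 100_000     # a "sufficiently large" number of tosses
heads = sum(random.random() < 0.5 for _ in range(n))
print(heads / n)  # relative frequency, close to p(H) = 0.5
```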
● Axiomatic Approach to Probability
– This approach builds up probability theory from a number of
assumptions and will not be discussed here.
Examples of Empirical Probabilities
● Proportions and percentages found in samples, journal
articles, newspapers, polls, etc., are used as empirical
probabilities
● Examples:
– “Five percent of all items produced in a factory are defective” means
• P(A randomly selected item is defective) = 0.05
– “NBA player Mr. W makes 80% of his free throws” means
• P(Mr. W will make his next free throw) = 0.80
– “5% of ICU patients die within 15 days of hospital admission” means
• P(A randomly selected ICU patient will die within 15 days) = 0.05
Laws of Probability
● Addition Law
• p(A or B) = p(A) + p(B) - p(A and B)
● Complementary Law
• p(A) + p(not(A)) = 1
● Conditional Probability
• Probability of an event A given that B has already occurred
• p(A|B) = p(A and B)/p(B)
● Multiplicative Law
• p(A ∩ B) = p(A) p(B|A) = p(B) p(A|B)
● Independent Events
● Two events A and B are independent if and only if p(A and B) = p(A) p(B)
● Remarks
• If A and B are disjoint or mutually exclusive, then p(A ∩ B) = 0
• If S is the sample space of an experiment, then p(S) = 1
• If A is any event of S, then A ∪ (not A) = S
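These laws can be checked exhaustively on the two-dice sample space. The events A, B, and C below are illustrative choices, not from the slides:

```python
from itertools import product
from fractions import Fraction

S = list(product(range(1, 7), repeat=2))  # two-dice sample space, 36 outcomes

def p(event):
    # classical probability: favorable outcomes over equally likely outcomes
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] == 6}      # illustrative event: first die shows 6
B = {s for s in S if sum(s) >= 10}   # illustrative event: sum is at least 10
C = {s for s in S if s[1] % 2 == 0}  # illustrative event: second die is even

# Addition law: p(A or B) = p(A) + p(B) - p(A and B)
assert p(A | B) == p(A) + p(B) - p(A & B)

# Conditional probability: p(B|A) = p(A and B) / p(A)
print(p(A & B) / p(A))  # 1/2: given a first 6, the sum reaches 10 iff the second die is >= 4

# Independence: A and C satisfy p(A and C) = p(A) p(C)
assert p(A & C) == p(A) * p(C)
```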
Example: Understanding Classical probability

Employment status by class rank (1000 students):

Employment Status    Fr    So    Jr    Sr   Total
Full-Time           100    20    50    70    240
Part-time           200    80    75    40    395
No Job               10    50    25   280    365
Total               310   150   150   390   1000
Select a student at random from these 1000 students. What is the probability that the selected student is
• Sr? 390/1000
• Unemployed? 365/1000
• Sr and unemployed? 280/1000
• Sr or unemployed? (390 + 365 - 280)/1000 = 475/1000
A student selected at random is found to be Sr. What is the probability that the student is unemployed?
Ans. p(unemployed | Sr) = (280/1000)/(390/1000) = 280/390
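The table calculations can be reproduced in a few lines; the dictionary below just transcribes the joint counts from the table:

```python
# Joint counts transcribed from the class-rank / employment table (n = 1000).
counts = {
    ("Full-Time", "Fr"): 100, ("Full-Time", "So"): 20,
    ("Full-Time", "Jr"): 50,  ("Full-Time", "Sr"): 70,
    ("Part-time", "Fr"): 200, ("Part-time", "So"): 80,
    ("Part-time", "Jr"): 75,  ("Part-time", "Sr"): 40,
    ("No Job", "Fr"): 10,     ("No Job", "So"): 50,
    ("No Job", "Jr"): 25,     ("No Job", "Sr"): 280,
}
n = sum(counts.values())  # 1000

p_sr = sum(c for (job, rank), c in counts.items() if rank == "Sr") / n        # 390/1000
p_unemp = sum(c for (job, rank), c in counts.items() if job == "No Job") / n  # 365/1000
p_both = counts[("No Job", "Sr")] / n                                         # 280/1000

print(p_sr + p_unemp - p_both)  # addition law: ~0.475
print(p_both / p_sr)            # conditional p(unemployed | Sr): 280/390 ~ 0.718
```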
Example: Understanding Relative frequency or empirical probability
● An insurance company divides its policy holders into three
categories: low risk, moderate risk, and high risk. The low-risk
policy holders account for 60% of the total number of people
insured by the company. The moderate-risk policy holders
account for 30%, and the high-risk policy holders account for
10%. The probabilities that a low-risk, moderate-risk, and highrisk policy holder will file a claim within a given year are
respectively .01, .10, and .50.
– If a policy holder is selected at random, what is the probability that a low
risk policy holder will be selected? (need the proportion of low risk policy
holder)
– If a policy holder is selected at random, what is the probability of
selecting a policy holder who will file a claim? (need the proportion of all
policy holders who will file a claim)
– Given that a policy holder files a claim, what is the probability that the
person is a high-risk policy holder? (need the proportion of high risk
policy holders only among those who filed a claim)
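A sketch of the three answers, using the law of total probability and Bayes' rule; the numbers are exactly those given in the example:

```python
# Risk-category proportions and per-category claim probabilities from the example.
prior = {"low": 0.60, "moderate": 0.30, "high": 0.10}
claim_given = {"low": 0.01, "moderate": 0.10, "high": 0.50}

# (1) P(low-risk policy holder selected) is just the low-risk proportion.
print(prior["low"])  # 0.6

# (2) Law of total probability: P(claim) = sum of P(category) * P(claim | category).
p_claim = sum(prior[c] * claim_given[c] for c in prior)
print(p_claim)       # ~0.086

# (3) Bayes' rule: P(high risk | claim).
print(prior["high"] * claim_given["high"] / p_claim)  # ~0.581
```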
Random Variables and Probability Distributions
● A discrete random variable (X, say) can assume only a countable
number of values.
● Probability distribution of X
• The values of X and the corresponding probabilities together form the
probability distribution of X.
● A continuous random variable X can assume any numerical value
within some interval or intervals.
● The graph of the probability distribution is a smooth curve often
called a probability density function ( f ) which satisfies
– (i) f(x) ≥ 0 for all x, and (ii) Total area under the graph is 1
● The probability that a randomly selected value of X falls between a and b is defined
as the area between a and b under the graph of f.
– p(a < X < b) = shaded area under f between a and b
Binomial Probability Distribution (a discrete distribution)
● Binomial Experiment
• consists of n ≥ 1 independent and identical trials, where each trial
has two possible outcomes, S (Success) and F (Failure), such that
P(S) = p is the same for each of the n trials.
● Binomial random variable : X = number of successes in n trials
● Probability distribution of X is the binomial probability distribution:
• p(x) = C(n, x) p^x (1-p)^(n-x), x = 0, 1, ..., n, where
• n = number of trials, x = number of successes, p = probability of success in each trial
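A minimal sketch of the binomial pmf; the free-throw numbers n = 5, p = 0.8 are illustrative, borrowed from the earlier Mr. W example:

```python
from math import comb

def binom_pmf(x, n, p):
    # P(X = x) for X ~ Binomial(n, p): C(n, x) * p**x * (1-p)**(n-x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.8  # five free throws by an 80% shooter (illustrative numbers)
print(binom_pmf(5, n, p))                                 # 0.8**5 ~ 0.328
print(sum(binom_pmf(x, n, p) for x in range(n + 1)))      # ~1.0: the pmf sums to one
print(sum(x * binom_pmf(x, n, p) for x in range(n + 1)))  # ~4.0: mean is n*p
```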
Geometric Probability Distribution (a discrete distribution)
● Recall that the binomial random variable is the number of
successes in n independent Bernoulli trials
● Suppose now that we do not fix the number of Bernoulli trials n
in advance but instead continue to observe the sequence of
Bernoulli trials until we observe a success. The random variable of
interest here is X = the number of failures before the first success
(equivalently, the number of trials preceding it).
● The probability distribution of X is called geometric distribution
and is given by
• p(x) = p(1-p)^x, x = 0, 1, …, where p is the probability of success
● Mean and variance of this distribution are
• μ = (1-p)/p and σ² = (1-p)/p²
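The pmf, mean, and variance can be checked numerically; p = 0.25 and the truncation point 1000 are arbitrary illustration choices:

```python
def geom_pmf(x, p):
    # P(X = x) failures before the first success: p * (1-p)**x
    return p * (1 - p)**x

p = 0.25  # illustrative success probability
# Truncate the infinite sums; (1-p)**1000 is negligible.
mean = sum(x * geom_pmf(x, p) for x in range(1000))
var = sum(x**2 * geom_pmf(x, p) for x in range(1000)) - mean**2
print(mean)  # ~3.0  = (1 - p)/p
print(var)   # ~12.0 = (1 - p)/p**2
```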
Poisson Probability Distribution ( a discrete distribution)
● The Poisson random variable X is the observed number of
rare events in a unit of measurement (e.g., time, area,
volume, weight, distance, etc.) and its probability distribution is
given by
• p(x) = (e^(-λ) λ^x)/x!, x = 0, 1, ..., where
• λ is the expected number of events during the given unit of measurement.
● For this distribution, both mean and variance are equal to λ
(i.e., μ = σ² = λ)
● Examples of some events and units
• Number of accidents (event) per month (unit)
• Number of cancer deaths (event) per year (unit)
• Number of diseased trees (event) per acre (unit)
• Number of airline fatalities (event) per month (unit)
• Number of hurricanes (event) per season (unit)
• Number of misprints (event) per page (unit) of a book
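A quick numeric check that the Poisson mean and variance both equal λ; λ = 2.5 is an arbitrary illustration (e.g., expected accidents per month):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # P(X = x) for X ~ Poisson(lam): e**(-lam) * lam**x / x!
    return exp(-lam) * lam**x / factorial(x)

lam = 2.5  # illustrative rate, e.g. expected accidents per month
probs = [poisson_pmf(x, lam) for x in range(100)]  # tail beyond 100 is negligible
mean = sum(x * q for x, q in enumerate(probs))
var = sum(x**2 * q for x, q in enumerate(probs)) - mean**2
print(mean, var)  # both ~2.5: mu = sigma**2 = lambda
```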
Normal Distribution (continuous)
● Bell-shaped symmetric distribution with mean μ and standard deviation σ
● p(a < X < b) = area under the curve between a and b
Standard Normal Distribution
● Bell-shaped symmetric distribution with mean μ = 0 and standard deviation σ = 1
● Often called a Z-distribution
● X = μ + Zσ, equivalently Z = (X - μ)/σ
● p(c < Z < d) = area under the curve between c and d
● p(a < X < b) = p( (a-μ)/σ < Z < (b-μ)/σ )
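The standardization X = μ + Zσ can be sketched with Python's statistics.NormalDist; μ = 100, σ = 15, and the interval (85, 130) are illustrative choices:

```python
from statistics import NormalDist

mu, sigma = 100, 15  # hypothetical X ~ Normal(100, 15)
a, b = 85, 130       # find P(85 < X < 130)

Z = NormalDist()     # standard normal: mean 0, standard deviation 1
p = Z.cdf((b - mu) / sigma) - Z.cdf((a - mu) / sigma)
print(p)             # ~0.8186, i.e. P(-1 < Z < 2)

# Same probability without standardizing:
X = NormalDist(mu, sigma)
print(X.cdf(b) - X.cdf(a))
```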
Sampling distribution of the sample mean
As the sample size n increases, the distribution of the sample mean gets
closer to a normal distribution (the Central Limit Theorem).
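A small simulation illustrating this; the die rolls and sample size n = 30 are arbitrary choices. The parent distribution is uniform on {1, …, 6}, yet the sample means cluster around 3.5 with spread σ/√n:

```python
import random
import statistics

random.seed(0)  # arbitrary seed for reproducibility

def sample_mean(n):
    # Mean of n rolls of a fair die (the parent distribution is uniform, not normal).
    return statistics.fmean(random.randint(1, 6) for _ in range(n))

means = [sample_mean(30) for _ in range(10_000)]
print(statistics.fmean(means))  # ~3.5, the die's mean
print(statistics.stdev(means))  # ~1.708/30**0.5 ~ 0.31, i.e. sigma/sqrt(n)
```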
Standard Normal and t-distributions
● Standard Normal (mean = 0, var = 1) and t-distribution with df = η (mean = 0, var = η/(η - 2))
F-distribution
● F-distribution depends on two degrees of freedom (df), d1 and d2
Chi-squared (χ2 ) distribution
● χ2-distribution for some values of degrees of freedom (k)
Identifying probability distributions for observed data
● Compare summary results of observed data and properties of
distributions
– Calculate mean, median, variance, percentiles, etc., of your observed
data and identify the properties and/or relationships of these summary
results
– A probability distribution that has properties similar to that found in the
observed data is a good candidate to represent the observed data.
● Examples:
– Count data of rare events with mean equal to variance is likely a good fit
with Poisson distribution
– Quantitative measurement data with independent mean and variance that
satisfy the empirical rule percentages (68%, 95%, and 99.7%) may fit well
with a normal distribution
– Quantitative measurement data for which mean is equal to standard
deviation may fit well with exponential distribution
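A sketch of the mean-versus-variance check on hypothetical count data; the counts and the 25% tolerance are illustrative choices, not from the slides:

```python
import statistics

# Hypothetical count data of a rare event (e.g., defects per panel).
counts = [1, 0, 2, 1, 3, 0, 1, 2, 1, 0, 4, 1, 2, 1, 1]

m = statistics.fmean(counts)     # sample mean
v = statistics.variance(counts)  # sample variance
print(m, v)

# Mean close to variance -> try Poisson; mean close to standard deviation -> try exponential.
if abs(m - v) / m < 0.25:        # 25% tolerance is an arbitrary rule of thumb
    print("mean ~ variance: Poisson is a reasonable candidate")
```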
Identifying probability distributions for observed data
● Three commonly used methods
● Histogram with overlaid probability density curve
– Construct histogram of your data and display the desired probability
density curve. Visually check if the density curve fits the histogram.
● Probability Plots
– Q-Q plot: construct a scatterplot of the ranked data values on one axis
and the corresponding expected theoretical distribution scores (often
standardized scores are used) on the other axis.
– P-P plot: construct a scatterplot of the observed cumulative proportions
on one axis and the corresponding expected theoretical cumulative
proportions on the other axis.
● For the Q-Q and P-P plots, scatterplot points are expected to be close
to a straight line if the theoretical distribution fits well to the observed
data
Example: Reading Scores and Normal Distribution
● Histogram and Normal Curve
● Normal Quantile-Quantile Plot (Q-Q Plot)
● Normal Probability Plot (P-P Plot)
For Q-Q and P-P plots, points close to the solid line indicate that the
data fit well to the theoretical distribution.