Download X - People Server at UNCW

• A random variable is a variable whose values
are numerical outcomes of a random
experiment. That is, we consider all the
outcomes in a sample space S and then
associate a number with each outcome.
• Example: Toss a fair coin 4 times and let
X = the number of Heads in the 4 tosses.
We write the so-called probability distribution of
X as a list of the values X takes on along with
the corresponding probabilities that X takes on
those values.
• The figure below (Fig. 4.6) and Example 4.23
show how to get the probability distribution of X.
Each outcome has prob=1/16 (HINT: use the
“and” rule to show this). Then use the “or”
rule to show that P(X=1) = P(TTTH or TTHT or
THTT or HTTT) = 4/16, etc.
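The “and” and “or” rule computations above can be checked by brute force. The following is only an illustrative sketch: it lists all 16 outcomes of 4 tosses and adds up their probabilities for each value of X. The names `outcomes`, `p_outcome` and `distribution` are made up for this example and do not come from the textbook.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 16 equally likely outcomes of 4 tosses of a fair coin.
outcomes = ["".join(t) for t in product("HT", repeat=4)]

# "And" rule: each outcome has probability (1/2)^4 = 1/16.
p_outcome = Fraction(1, 2) ** 4

# "Or" rule: add the probabilities of the outcomes that give each value of X.
distribution = {}
for outcome in outcomes:
    x = outcome.count("H")  # X = number of Heads in this outcome
    distribution[x] = distribution.get(x, 0) + p_outcome

for x in sorted(distribution):
    print(f"P(X={x}) = {distribution[x]}")
```

Fractions are printed in lowest terms, so 4/16 appears as 1/4 and 6/16 as 3/8.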
• There are two types of r.v.s: discrete and
continuous. A r.v. X is discrete if the number of
values X takes on is finite (or countably infinite).
In the case of any discrete X, its probability
distribution is simply a list of its values along with
the corresponding probabilities X takes on those
values.
Values of X:  x1  x2  …  xk
P(X):         p1  p2  …  pk
NOTE: each value of p is between 0 and 1 and all
the values of p sum to 1. We display probability
distributions for discrete r.v.s with so-called
probability histograms. The next slide shows the
probability histogram for X=# of Hs in 4 tosses of
a fair coin.
The next slide gives a similar example...
• The probability distribution of a
random variable X lists the values
and their probabilities.
• The probabilities p_i must add up to 1.
• A basketball player shoots three free throws. The random
variable X is the number of baskets successfully made.
Suppose he is a 50% free throw shooter...
[Figure: tree diagram of the three shots, branching into H or M at each shot (HHH, HHM, HMH, HMM, …)]

Value of X:   0     1             2             3
Probability:  1/8   3/8           3/8           1/8
Outcomes:     MMM   HMM MHM MMH   HHM HMH MHH   HHH
• The probability of any event is the sum of the probabilities
p_i of the values of X that make up the event.
• A basketball player shoots three free throws. The random
variable X is the number of baskets successfully made.
Suppose he is a 50% free throw shooter.

Value of X:   0     1             2             3
Probability:  1/8   3/8           3/8           1/8
Outcomes:     MMM   HMM MHM MMH   HHM HMH MHH   HHH

What is the probability that the player successfully makes at least two
baskets (“at least two” means “two or more”)? USE THE “OR” RULE!
P(X≥2) = P(X=2) + P(X=3) = 3/8 + 1/8 = 1/2
What is the probability that the player successfully makes fewer than three
baskets? USE THE “OR” RULE HERE TOO...!
P(X<3) = P(X=0) + P(X=1) + P(X=2) = 1/8 + 3/8 + 3/8 = 7/8 or
P(X<3) = 1 – P(X=3) = 1 – 1/8 = 7/8 (THIS IS THE “NOT” RULE)
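As a quick check of the “or” and “not” rule calculations above, here is a small sketch that sums table probabilities over the values of X that make up an event. The helper name `prob` and the dictionary `dist` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

# Distribution of X = number of baskets made in three free throws (from the table).
dist = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def prob(event):
    """Sum P(X = x) over the values x for which event(x) is true ("or" rule)."""
    return sum((p for x, p in dist.items() if event(x)), Fraction(0))

at_least_two = prob(lambda x: x >= 2)       # P(X >= 2) = P(X=2) + P(X=3)
fewer_than_three = prob(lambda x: x < 3)    # P(X < 3) by the "or" rule
not_rule = 1 - prob(lambda x: x == 3)       # P(X < 3) by the "not" rule

print(at_least_two, fewer_than_three, not_rule)  # 1/2 7/8 7/8
```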
• A continuous r.v. X takes its values in an interval
of real numbers. The probability distribution of a
continuous X is described by a density curve,
which lies on or above the horizontal axis and
has total area 1 underneath it. Probabilities
about X correspond to areas under the curve.
• The first example is the random variable which randomly
chooses a number between 0 and 1 (perhaps using the
spinner on page 253 – go over Example 4.25). This r.v.
is called the uniform random variable and has a density
curve that is completely flat! Probabilities correspond to
areas under the curve... see next slide for the
computations...
A continuous random variable X takes all values in an interval.
Example: There are infinitely many numbers between 0 and 1 (e.g., 0.001, 0.4, 0.0063876).
How do we assign probabilities to events in an infinite sample space?
• We use density curves and compute probabilities for intervals.
• The probability of any event is the area under the density curve
for the values of X that make up the event.
[Figure: uniform density curve for the variable X on the interval from 0 to 1, height = 1]
This is a uniform density curve for the variable X.
The probability that X falls between 0.3 and 0.7 is
the area under the density curve for that interval
(base x height for this density):
P(0.3 ≤ X ≤ 0.7) = (0.7 – 0.3)*1 = 0.4
The probability of a single point is zero for a
continuous random variable, since there is no area
above a point! Only intervals can have a
non-zero probability, represented by the area under the
density curve for that interval. This makes
the following statement true:
The probability of an interval is the same whether
boundary values are included or excluded:
P(0 ≤ X ≤ 0.5) = (0.5 – 0)*1 = 0.5
P(0 < X < 0.5) = (0.5 – 0)*1 = 0.5
P(0 ≤ X < 0.5) = (0.5 – 0)*1 = 0.5
P(X < 0.5 or X > 0.8) = P(X < 0.5) + P(X > 0.8) = 1 – P(0.5 < X < 0.8) = 0.7
(You may use either the “OR” Rule or the “NOT” Rule...)
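The base x height calculations for the uniform density can be sketched in a few lines. This is only an illustration: the function name `uniform_prob` and its parameters are invented here, and the defaults assume the uniform r.v. on the interval from 0 to 1 used in these slides.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """P(a <= X <= b) for X uniform on [low, high]: base times height."""
    a, b = max(a, low), min(b, high)   # clip the interval to where the density is non-zero
    if b <= a:
        return 0.0                     # a single point (or empty interval) has no area
    height = 1.0 / (high - low)        # flat density whose total area is 1
    return (b - a) * height

p_middle = uniform_prob(0.3, 0.7)                        # area between 0.3 and 0.7
p_or = uniform_prob(0.0, 0.5) + uniform_prob(0.8, 1.0)   # "or" rule
p_not = 1 - uniform_prob(0.5, 0.8)                       # "not" rule
p_point = uniform_prob(0.5, 0.5)                         # probability of a single point

# Rounded to hide floating-point noise in the subtraction.
print(round(p_middle, 10), round(p_or, 10), round(p_not, 10), p_point)
```

The “or” rule and the “not” rule give the same answer, and the single point gets probability zero, which is why including or excluding the endpoints makes no difference.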
• The other example of a continuous r.v. that
we’ve already seen is the normal random
variable. See the next slide for a reminder of
how we’ve used the normal and how it relates to
probabilities under the normal curve...
• Go over Example 4.26 in detail! We saw earlier
that p-hat had a sampling distribution which was
approximately normal (for large n). Thus p-hat can
be treated as a normal random variable… we have
shown that the mean of p-hat is p and the standard
deviation of p-hat is sqrt(p(1-p)/n). Now use this
information to do Ex. 4.26…
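To make the p-hat calculation concrete, here is a hedged sketch of how a normal probability for p-hat could be computed. The values p = 0.5 and n = 100 are hypothetical and are NOT the numbers from Example 4.26. The helper `normal_cdf` is written for this illustration using the standard library's error function.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Area under the normal curve with the given mean and sd to the left of x."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

# Hypothetical values, chosen only for illustration (not from Example 4.26).
p, n = 0.5, 100

mean_phat = p                       # mean of p-hat is p
sd_phat = sqrt(p * (1 - p) / n)     # standard deviation of p-hat is sqrt(p(1-p)/n)

# P(0.45 <= p-hat <= 0.55) as an area under the normal curve.
prob = normal_cdf(0.55, mean_phat, sd_phat) - normal_cdf(0.45, mean_phat, sd_phat)
print(round(prob, 4))
```

With these assumed values the interval runs from one standard deviation below the mean to one above, so the result agrees with the 68 part of the 68-95-99.7 rule.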
Continuous random variable and population distribution
The shaded area under a
density curve shows the
proportion, or %, of individuals
in a population with values of X
between x1 and x2.
Because the probability of
drawing one individual at
random depends on the
frequency of this type of
individual in the population, the
probability is also the shaded
area under the curve.
[Figure: density curve; the shaded area is the % of individuals with X such that x1 < X < x2]
Mean of a random variable
• The mean x-bar of a set of observations is their arithmetic average.
• The mean µ of a random variable X is a weighted average of the
possible values of X, reflecting the fact that all outcomes might not be
equally likely.
A basketball player shoots three free throws. The random variable X is
the number of baskets successfully made (“H”).

Value of X:   0     1             2             3
Probability:  1/8   3/8           3/8           1/8
Outcomes:     MMM   HMM MHM MMH   HHM HMH MHH   HHH

The mean of a random variable X is also called the expected value of X.
What is the expected number of baskets made? Do the computations...
• We’ve already discussed the mean of a density
curve as being the “balance point” of the curve…
to establish this mathematically requires some
higher level math… So we’ll think of the mean of
a continuous r.v. in this way. For a discrete r.v.,
we’ll compute the mean (or expected value) as a
weighted average of the values of X, the weights
being the corresponding probabilities. E.g., the
mean # of Hs in 4 tosses of a fair coin is
computed as: (1/16)*0 + (4/16)*1 + (6/16)*2 +
(4/16)*3 + (1/16)*4 = (32/16) = 2.
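The weighted-average computation above can be written as a short function. This is a sketch only. The function name `mean` and the dictionaries `heads` and `baskets` are made up for the illustration, and the probabilities are the ones given in these slides.

```python
from fractions import Fraction

def mean(dist):
    """Mean (expected value) of a discrete r.v.: sum of value * probability."""
    return sum(x * p for x, p in dist.items())

# X = number of Heads in 4 tosses of a fair coin.
heads = {0: Fraction(1, 16), 1: Fraction(4, 16), 2: Fraction(6, 16),
         3: Fraction(4, 16), 4: Fraction(1, 16)}

# X = number of baskets made in three free throws by a 50% shooter.
baskets = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

print(mean(heads))    # 2
print(mean(baskets))  # 3/2
```

The same function works for any finite discrete distribution whose probabilities sum to 1.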
• In either case (discrete or continuous), the
interpretation of the mean is as the long-run
average value of X (in a large number of
repetitions of the experiment giving rise to X).
• Look at Example 4.27 on page 260… a simple “lottery” (pick
3), like the old numbers game…you pay $1 to play (pick a 3
digit number), and if your number comes up, you win $500;
otherwise, the bookie keeps your $1. Note that in the long
run, your average winnings per play are
$500*(1/1000) + $0*(999/1000) = $0.50,
which is less than the $1 it costs to play.
• Law of Large Numbers: Essentially states that if you
sample from a population with mean µ, then the sample
mean (x-bar) will approximate µ for large sample sizes. Or,
put another way, µ is the expected value of many independent
observations on the variable. CAREFULLY READ PAGES
273ff ON THE LAW OF LARGE NUMBERS AND ITS
CONSEQUENCES!
• Stop Chapter 4 at the bottom of page 266 ("Rules for means").
HW: Read sections 4.3 & 4.4. Do
# 4.53-4.58, 4.61-4.63, 4.66, 4.74-4.76
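The Law of Large Numbers can be seen in a small simulation. The sketch below repeats the “4 tosses of a fair coin” experiment many times and prints the sample mean of X = number of Heads, which should settle near the mean of 2 computed earlier. The function name `sample_mean_heads`, the seed and the sample sizes are arbitrary choices for this illustration.

```python
import random

def sample_mean_heads(n_repetitions, rng):
    """Average number of Heads in 4 fair-coin tosses over n_repetitions experiments."""
    total = 0
    for _ in range(n_repetitions):
        # One experiment: 4 tosses, each a Head with probability 1/2.
        total += sum(rng.random() < 0.5 for _ in range(4))
    return total / n_repetitions

rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible
for n in (10, 1000, 100000):
    print(n, sample_mean_heads(n, rng))
```

Small samples can land noticeably away from 2. The large ones should be very close to it.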