Week 4, 9/10/12 - 9/14/12, Notes:
Random Variables, PMFs, Expectation, and Variance
Monday, 9/10/12, General Random Variables
A variable denotes a characteristic that varies from one person or thing to another. Examples include height, weight, marital status, gender, etc. Variables can be either quantitative (numerical)
or qualitative (categorical).
Example 4.1 The following is a chart describing the number of siblings each student in a particular class has. Note there are 40 students total.
Siblings (x)   Frequency of Students   Relative Frequency   Percentage of Students
0              8                       .200                 20.0
1              17                      .425                 42.5
2              11                      .275                 27.5
3              3                       .075                 7.5
4              1                       .025                 2.5
We use many terms when describing variables, including frequency and relative frequency. These
terms mean "count" and "proportion of the total count, written as a decimal," respectively.
A random variable is a real-valued function whose domain is the sample space of a random experiment. In other words, a random variable is a function X : Ω −→ R where Ω is the sample space
of the random experiment under consideration and R represents the set of all real numbers. (You
can think of a random variable as a way of assigning a number to each outcome of an experiment.)
From the above example, the event that a student randomly drawn from the class has 2 siblings
can be expressed in several ways. Way 1 is {ω ∈ Ω : X(ω) = 2}. Or the shorthand way is to say
{X = 2}. The probability of this event is 11/40, or .275. We could define the event A as the event that
a student has 2 or more siblings. What is P(X ∈ A)?
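One way to answer this question (a quick sketch, using the frequency table from Example 4.1) is to sum the frequencies for 2, 3, and 4 siblings and divide by the class size:

```python
# Sketch: P(X >= 2) for Example 4.1, computed from the frequency table.
freq = {0: 8, 1: 17, 2: 11, 3: 3, 4: 1}   # siblings -> number of students
total = sum(freq.values())                 # 40 students

# Sum the frequencies for the event A = {X >= 2} and divide by the total.
p_A = sum(f for x, f in freq.items() if x >= 2) / total
print(p_A)
```

This gives (11 + 3 + 1)/40 = .375.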
There are two main types of quantitative random variables: discrete and continuous. A discrete
random variable often involves a count of something. Examples may include number of cars per
household, number of hours spent studying for a test, number of hours spent watching t.v. per
day, etc.
A random variable X is called a discrete random variable if the outcome of the random variable
is limited to a countable set of real numbers (meaning the r.v. can take on only countably many
real values). Mathematically, we have a countable set K (of real numbers) s.t. P(X ∈ K) = 1.
Another key word for r.v.s is support. The word support means the possible values a r.v. can
take. Any r.v. with a countable support – that is whose possible values form a finite or countably
infinite set – is a discrete r.v. Another way of stating this is to say that all of the probability for a
discrete r.v. occurs at particular points. These points (or numbers) could be 1, 100, .5, -22, or -1/11.
There is no stipulation that a r.v.'s support must be positive or an integer. Random variables
(depending on the context) can take on really any value from R.
Let X be a discrete r.v. Then the probability mass function (pmf) of X is the real-valued function
defined on R by pX (x) = P(X=x). An important note is that capital letters, like X, are used to
denote r.v.s. Lowercase letters, like x, are used to denote possible values of the random variable.
This distinction will be used throughout this course as well as in most Statistics courses. The
subscript in the pX (x) notation is used to denote that this is the pmf of the r.v. X. We could use
Y, Z, etc. If it is obvious what variable we are referencing, the subscript is often dropped. The x
in parentheses refers to the value of the r.v. that we are interested in.
Example 4.2 A: Flip a fair coin 3 times. Let X denote the number of heads tossed in the 3 flips.
Create a pmf for X.
Example 4.2 B: Redo example 4.2, but suppose now that P(heads on 1 flip) is .7. Let Y be the
number of tails. After completing the pmf for X, find the pmf for Y here.
Example 4.2 C: I used 3 flips so that you would be able to easily write down the sample space
and physically count the options available. Suppose I instead said 10 flips of the coin in example
4.2 B. First, how many possible outcomes are there? Secondly, what is the probability of 7 heads?
What is the probability you get at least 1 head?
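As a check on Example 4.2 C, these quantities can be computed with a short sketch (using the biased coin of 4.2 B, where P(heads on 1 flip) = .7):

```python
from math import comb

# Example 4.2 C sketch: 10 flips of a coin with P(heads) = .7.
n, p = 10, 0.7
outcomes = 2 ** n                           # number of possible flip sequences
p_7_heads = comb(n, 7) * p**7 * (1 - p)**3  # exactly 7 heads (binomial form)
p_at_least_1 = 1 - (1 - p) ** n             # complement of "no heads at all"
print(outcomes, p_7_heads, p_at_least_1)
```

Note the complement trick for "at least 1 head": it avoids summing ten separate terms.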
Example 4.3: This is problem 7 from the Fall 2010 Stat 225 Exam 2. There are 3 guys and 2
girls sitting in a row of 5 seats at the Wabash Landing 9. Let G be the number of girls sitting
at the ends [of the row]. First, find the pmf of G. Secondly, suppose the following information
is true. A person will only order popcorn during the movie if they are sitting at the end of the
row. A guy will order 2 boxes, but a girl will order 1 box. Let C denote the number of boxes of
popcorn the 5 friends will order. Find the pmf of C.
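Since the sample space here is small, the pmf of G can be found by brute force; the sketch below enumerates the distinct seatings (the labels M and F are just placeholders):

```python
from itertools import permutations
from collections import Counter

# Example 4.3 sketch: 3 guys (M) and 2 girls (F) in a row of 5 seats.
people = ('M', 'M', 'M', 'F', 'F')
seatings = set(permutations(people))        # 10 distinct arrangements
counts = Counter()
for s in seatings:
    g = (s[0] == 'F') + (s[-1] == 'F')      # girls sitting at the two ends
    counts[g] += 1

pmf_g = {g: c / len(seatings) for g, c in counts.items()}
print(pmf_g)
```

For the second part, the two end seats always hold 2 people, so guys at the ends = 2 − G and C = 2(2 − G) + G = 4 − G; the pmf of C is then just a relabeling of the pmf of G.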
Example 4.4: Refer to example 4.1. Let M be the amount of money that parents spend on
college. Let M = 30,000X + 2,000. Find the pmf for M.
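Because M = 30,000X + 2,000 is one-to-one in X, the pmf of M is just the Example 4.1 pmf with relabeled values, as this sketch shows:

```python
# Example 4.4 sketch: relabel the Example 4.1 pmf via M = 30,000*X + 2,000.
pmf_x = {0: .200, 1: .425, 2: .275, 3: .075, 4: .025}
pmf_m = {30_000 * x + 2_000: p for x, p in pmf_x.items()}
print(pmf_m)
```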
Basic Properties of a PMF:
• pX(x) ≥ 0 ∀x ∈ R. That is to say, a pmf is a nonnegative function.
• {x ∈ R : pX(x) ≠ 0} is countable. That is, the set of real numbers for which a pmf is
nonzero is countable.
• Σx pX(x) = 1. The sum of the values of a pmf equals 1. This is just another way to say
P(Ω) = 1.
Suppose that X is a discrete r.v. Then, for any subset A of real numbers, P(X ∈ A) = Σx∈A pX(x).
This states that the probability a discrete r.v. takes a value from a specified subset of real numbers is just the sum of the pmf of the r.v. over that subset of real numbers.
Interpretation of a pmf: In a large number of independent observations of a discrete r.v. X,
the proportion of times that each possible value occurs will approximate the pmf at that value.
This is the frequentist viewpoint.
Example 4.5 Let X be a random variable with pmf defined as follows. pX(x) = k · (5 − x) for x
= 0, 1, 2, 3, and 4. However, pX(x) = 0 for all other values of x.
• Find the value of k that makes pX (x) a legitimate pmf.
• What is the probability that X is between 1 and 3 inclusive?
• If X is not 0, what is the probability that X is less than 3?
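A sketch of Example 4.5, using exact fractions so the pmf properties above can be checked directly:

```python
from fractions import Fraction

# pmf p(x) = k*(5 - x) for x = 0..4; choose k so the probabilities sum to 1.
weights = {x: 5 - x for x in range(5)}             # 5 + 4 + 3 + 2 + 1 = 15
k = Fraction(1, sum(weights.values()))             # k = 1/15
pmf = {x: k * w for x, w in weights.items()}
assert sum(pmf.values()) == 1                      # legitimate pmf

p_1_to_3 = sum(pmf[x] for x in (1, 2, 3))                # P(1 <= X <= 3)
p_lt_3_given_not_0 = (pmf[1] + pmf[2]) / (1 - pmf[0])    # P(X < 3 | X != 0)
print(k, p_1_to_3, p_lt_3_given_not_0)
```

The conditional probability is computed from the definition: P(X < 3 and X ≠ 0) divided by P(X ≠ 0).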
Wednesday, 9/12/12, EX and Variance of X
Interpretation of an expected value: Classic probability asserts that the expected value of a
r.v. is the long-run average value of the r.v. in independent observations.
The expected value of a discrete r.v. X, denoted by E[X], is defined by
E[X] = Σx x · pX(x).
In other words, the expected value of a discrete r.v. is a weighted average of its possible values, and
the weight used is its probability. Sometimes we refer to the expected value as the expectation,
the mean, or the first moment. Sometimes it is denoted by μX. For any function, say g(x), we
can also find an expectation of that function. It is
E[g(X)] = Σx g(x) · pX(x).
An expectation we are often interested in is E[X²]. So, using the above formula, how could we
write this?
Expectation of r.v.s has some nice properties that can be quite useful computationally. Let X and
Y be discrete r.v.s defined on the same sample space and having finite expectation
(meaning < ∞); note that independence is not required for these properties. Let a and b be real
numbers. Then the following hold:
• The r.v. X + Y has finite expectation and E[X + Y] = E[X] + E[Y].
• The r.v. aX + b has finite expectation and E[aX + b] = a*E[X] + b.
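These properties are easy to sanity-check numerically; here is a sketch on a small made-up pmf (the values are hypothetical, not from the notes):

```python
# Check E[aX + b] = a*E[X] + b on a hypothetical pmf.
pmf = {1: 0.5, 2: 0.3, 3: 0.2}
a, b = 4, -1

ex = sum(x * p for x, p in pmf.items())               # E[X]
lhs = sum((a * x + b) * p for x, p in pmf.items())    # E[aX + b] computed directly
rhs = a * ex + b                                      # via the linearity property
print(lhs, rhs)
```

Both routes give the same number, but the property saves recomputing a whole weighted sum.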
The variance of a r.v. is a measure of the spread, or variability, in the r.v. The conceptual
definition of variance is Var(X) = E[(X − μX)²]. Basically, this states that variance is the expected
squared deviation of a r.v. from its mean. You could combine this with the E[g(X)] formula
to calculate variance. However, there is another way. We can also define variance by Var(X) =
E[X²] − (E[X])². This is typically more useful, mainly because we are often interested in E[X], so
finding Var(X) then requires just one more calculation.
There are 2 useful properties of variance of a r.v. Let X be a r.v. and let c be a constant.
• Var(cX) = c²·Var(X)
• Var(X + c) = Var(X)
Examples: Refer to examples 4.1, 4.2, and 4.3. Calculate the expectation and variance of those
variables. Please note the properties of expectation and variance. These could save you some
time. As a check, E[X] and Var(X) for example 4.1 are 1.3 and .91, respectively.
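The stated check for example 4.1 can be reproduced with this short sketch:

```python
# Example 4.1: E[X] and Var(X) from the relative-frequency column.
pmf = {0: .200, 1: .425, 2: .275, 3: .075, 4: .025}

ex = sum(x * p for x, p in pmf.items())          # E[X]
ex2 = sum(x**2 * p for x, p in pmf.items())      # E[X^2]
var = ex2 - ex**2                                # Var(X) = E[X^2] - (E[X])^2
print(round(ex, 4), round(var, 4))               # should match 1.3 and .91
```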
Friday, 9/14/12: Example Day
For many of these you can focus on calculating E[X] and Var(X). Sometimes, creating a new pmf
from a given pmf is required or helpful.
Example 4.6 How many licks does it take to get to the center of a tootsie roll pop? You have
the following distribution representing your population:
animal         licks   probability
owl            3       .001
thing 1        100     .55
thing 2        200     .448999
silly person   427     .000001
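A sketch for the first part of Example 4.6 (the expected number of licks; the cost question is left open):

```python
# Example 4.6: expected number of licks under the given distribution.
pmf = {3: .001, 100: .55, 200: .448999, 427: .000001}

e_licks = sum(licks * p for licks, p in pmf.items())
print(round(e_licks, 6))
```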
A fun/challenging question: Suppose that it costs $ .40 for a tootsie roll pop. You want to make
sure that you get 500 licks of a tootsie roll pop. What would your expected cost be?
Example 4.7 How much wood could a woodchuck chuck if a woodchuck could chuck wood? We
have the following distribution measured in butt cords.
family member     amount of wood   probability
younger brother   153              .15
older sister      272              .2
mom               573              .23
dad               1245             .42
Example 4.8 Peter Piper picked a peck of pickled peppers. If Peter Piper could pick the following
number of pecks of peppers in a day, what is the expected value and variance of the number of
pecks of pickled peppers that Peter Piper could pick in a day?
# of Pecks   probability
20           .01
50           .25
120          .35
175          .2
200          .19
Every week Peter goes to the market on Saturday and sells all of his pecks of peppers. He does
not pick peppers on Saturday. If he gets $ .35 for a peck of peppers, what is the expected value
and variance of the amount of money he will earn?
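One reading of the weekly question is sketched below; it assumes the six picking days (no picking on Saturday) are independent draws from the same daily pmf, which the problem does not state explicitly:

```python
# Example 4.8: daily pecks, then weekly earnings at $.35 per peck.
pmf = {20: .01, 50: .25, 120: .35, 175: .2, 200: .19}

e_day = sum(x * p for x, p in pmf.items())                   # E[X] per day
var_day = sum(x**2 * p for x, p in pmf.items()) - e_day**2   # Var(X) per day

days, price = 6, 0.35                  # assumption: 6 independent picking days
e_money = price * days * e_day         # E[aW] = a*E[W]; expectations add
var_money = price**2 * days * var_day  # Var(aW) = a^2 Var(W); independent variances add
print(round(e_money, 2), round(var_money, 2))
```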
Example 4.9 Sally sells seashells by the seashore. Suppose on a given day she sells 1-5 shells
with respective probabilities .25, .15, .3, .2, and .1. If each shell sells for $ 2, how much money
can Sally expect to earn in a day?
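A quick sketch of Example 4.9 using linearity (each shell sells for $2, so E[2X] = 2·E[X]):

```python
# Example 4.9: expected daily earnings for Sally.
pmf = {1: .25, 2: .15, 3: .3, 4: .2, 5: .1}

e_shells = sum(x * p for x, p in pmf.items())   # expected shells sold per day
e_money = 2 * e_shells                          # $2 per shell, by linearity
print(e_shells, e_money)
```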
Example 4.10 The pmf of a discrete r.v. X is described below.
x        -2     -1     0      1      2      3
pX(x)    .22    .29    .04    .19    .11    .15
• What is the probability that X is between -.8 and 2.2?
• Given X is at least 0, what is the probability that it is at least 1?
• Find E[X] and Var(X).
• Let Y = 2X - 1. Find the pmf of Y.
• Let Z = X 2 . Find the pmf of Z.
• What is special about Y compared to Z that makes finding the pmf of Y easier than finding
the pmf of Z? Does the linearity property of expectation hold for both Y and Z?
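The last two bullets can be explored with a sketch: Y = 2X − 1 is one-to-one (each x maps to a distinct y), while Z = X² sends x = −2 and x = 2 to the same value, so their probabilities must be merged:

```python
# Example 4.10: transform the pmf of X into pmfs for Y = 2X - 1 and Z = X^2.
pmf_x = {-2: .22, -1: .29, 0: .04, 1: .19, 2: .11, 3: .15}

pmf_y = {2 * x - 1: p for x, p in pmf_x.items()}   # one-to-one: just relabel
pmf_z = {}
for x, p in pmf_x.items():
    pmf_z[x * x] = pmf_z.get(x * x, 0) + p         # merge x = -2 and x = 2 at z = 4

e_x = sum(x * p for x, p in pmf_x.items())
e_y = sum(y * p for y, p in pmf_y.items())         # equals 2*E[X] - 1 by linearity
e_z = sum(z * p for z, p in pmf_z.items())         # this is E[X^2], NOT (E[X])^2
print(e_x, e_y, e_z)
```

Linearity gives E[Y] from E[X] directly; no such shortcut exists for the nonlinear Z.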