CHAPTER 8
1. Probability Density Functions
For some types of data, we want to consider the distribution throughout the population.
For example, age distribution in the US gives the percentage of the US population that is in
a given age range. One way to represent data of this sort is with a special type of histogram.
Recall that a histogram is a graph showing the frequency of some quantity over an interval
using rectangles. The interval is divided into sections, called bins. Frequency is on the
y-axis, while the interval is on the x-axis. Histograms are not bar graphs: bar graphs show
frequencies for categorical data, while histograms are used for continuous data.
For data showing the distribution, we will create a histogram so that the area of each
rectangle represents the percentage of the population in that bin. Consider the following
data:
Age Group    % of total pop.    Width    Height
0–20         29%
20–40        29%
40–60        26%
60–80        13%
80–100       3%
Example 1. Using the histogram, what percentage of the population is between 20 and 60
years old? What percentage of the population is older than 90?
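Because the area of each rectangle equals the percentage in its bin, the height of a rectangle is its percentage divided by its width. The following is a minimal Python sketch of that arithmetic, not part of the original notes. The names `bins`, `heights`, and `percent_between` are illustrative, and the estimate for ages over 90 assumes people are spread evenly within the 80–100 bin, which the table itself does not tell us.

```python
# Age bins from the table: (lower edge, upper edge, percent of population).
bins = [(0, 20, 29), (20, 40, 29), (40, 60, 26), (60, 80, 13), (80, 100, 3)]

def heights(bins):
    """Height of each rectangle in percent per year: area / width."""
    return [pct / (hi - lo) for lo, hi, pct in bins]

def percent_between(bins, a, b):
    """Area of the histogram between ages a and b.

    Assumption: the population is spread evenly inside each bin,
    so a partial bin contributes height * (overlap width).
    """
    total = 0.0
    for lo, hi, pct in bins:
        overlap = max(0, min(b, hi) - max(a, lo))
        total += pct / (hi - lo) * overlap
    return total

print(heights(bins))                    # rectangle heights, percent per year
print(percent_between(bins, 20, 60))    # two whole bins: 29 + 26
print(percent_between(bins, 90, 100))   # half of the 80-100 bin
```

The 20-to-60 answer uses whole bins and needs no extra assumption. The over-90 answer is only as good as the even-spread assumption, which is one reason smaller bins give better information.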
Notice that in order to find the percentage of the population, we consider the area
underneath the rectangles. We could get better information about the age distribution if
we had smaller bins. As the bins get very small, the top of the histogram begins to look like
a smooth continuous function. If we wanted to determine the percentage of the population
between 0 and 15 years old, we still look at the area underneath the distribution. As bin
size goes to zero, however, this no longer involves the areas of rectangles. Instead,
we use the integral.
Suppose the age density function, the function created by the top of the histogram as the
bin size goes to 0, is given by p(t). The units of p(t) are percent per year; we rarely use
the density function itself, since we are usually far more interested in the percentage of
the population. The fraction of the population that is between 0 and 15 years old is the
area under p(t) between t = 0 and t = 15, or $\int_0^{15} p(t)\,dt$. In general, the fraction of the
population that is between a and b years old is $\int_a^b p(t)\,dt$.
One of the most important aspects of the density function is the area under the curve
between the lowest and highest values of t. The integral
$$\int_0^{100} p(t)\,dt$$
represents the fraction of the population between 0 and 100. (Our data is slightly flawed
because it doesn’t account for people over 100, but they make up a minuscule percentage
of the population.) So, assuming that every person is between 0 and 100,
$$\int_0^{100} p(t)\,dt = 1.$$
Definition 1. A function p(x) is a density function if
• The fraction of the population for which x is between a and b is $\int_a^b p(x)\,dx$.
• $\int_{-\infty}^{\infty} p(x)\,dx = 1$.
• p(x) ≥ 0 for all x.
Notice that in the second criterion, the bounds of the integral are ±∞. Some intervals,
like age, are finite, so the improper integral will be the same as $\int_0^{100} p(t)\,dt$. Other intervals
can be infinite. When you are checking $\int_{-\infty}^{\infty} p(x)\,dx = 1$ for a density function on a finite
interval, you only need to integrate between the smallest and largest possible values.
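To make the criteria in Definition 1 concrete, here is a minimal Python sketch that checks them numerically. The density p(t) = (100 − t)/5000 on 0 ≤ t ≤ 100 is a made-up example chosen so the arithmetic is easy; it is not the real US age density. The helper `riemann_area` is also illustrative: it adds up thin rectangles, just like a histogram with very small bins.

```python
def p(t):
    """Hypothetical density on 0 <= t <= 100 (NOT the real age density)."""
    return (100 - t) / 5000 if 0 <= t <= 100 else 0.0

def riemann_area(f, a, b, n=100_000):
    """Midpoint Riemann sum: n thin rectangles, each of width (b - a) / n."""
    dt = (b - a) / n
    return sum(f(a + (i + 0.5) * dt) for i in range(n)) * dt

# Criterion 2: total area is 1 (integrating only over the finite interval).
total_area = riemann_area(p, 0, 100)

# Criterion 3: p is never negative (spot check on a grid of whole ages).
is_nonnegative = all(p(t) >= 0 for t in range(0, 101))

# Criterion 1 in use: fraction of the population between 0 and 15.
fraction_0_to_15 = riemann_area(p, 0, 15)

print(total_area)        # very close to 1
print(is_nonnegative)
print(fraction_0_to_15)  # exact value of the integral is 0.2775
```

The grid check for p(t) ≥ 0 is only a spot check; for this linear example nonnegativity on the whole interval is clear from the formula.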
2. Cumulative Distribution Functions
Given a density function, p(x), we can define a cumulative distribution function, P(x).
Cumulative distribution functions are really just a shorthand way to talk about finding the percentage
of the population between x = a and x = b.
Definition 2. The cumulative distribution function P(t) of a density function p(x) is defined by
$$P(t) = \int_{-\infty}^{t} p(x)\,dx.$$
So, P(t) is the fraction of the population having values of x below t. Note that as before,
if x can’t be less than some number a, it is equivalent to compute $\int_a^t p(x)\,dx$.
Notice that by the Second Fundamental Theorem of Calculus, P′(t) = p(t). P(t) has
the following properties:
• P is nondecreasing: it never decreases, though it can stay flat wherever p(x) = 0.
• $\lim_{t\to\infty} P(t) = 1$ and $\lim_{t\to-\infty} P(t) = 0$.
• The fraction of the population having values of x between a and b is
$$\int_a^b p(x)\,dx = P(b) - P(a).$$
Usually, we use cumulative distribution functions to talk about probability. Since density
functions are defined on intervals, cumulative distribution functions are used to describe continuous
variables.
Remember that in probability, a continuous variable is one that can take on any value
in an interval. For example, age is a continuous variable because any age between 0 and
100 is possible, like 10.333 or 56.791. Variables that aren’t continuous are discrete; the
color of a randomly chosen M&M is a discrete variable, because there are only 6 options.
If P(t) is the cumulative distribution function of some random quantity x, then P(t)
is the probability that x < t. So, if p(x) is the age density function we discussed before,
$P(t) = \int_0^t p(x)\,dx$ is the probability that a randomly chosen person is younger than t years
old. This lines up with our old interpretation of $\int_0^t p(x)\,dx$ as the fraction of the population
younger than t.
Example 2. p(x) = 0.2 − 0.02x, 0 ≤ x ≤ 10 is a density function describing the amount
of time x, in minutes, a person waits while on hold with Verizon. Find and graph the
cumulative distribution function and find the following:
• P(0)
• The probability of waiting less than 2 minutes
• The probability of waiting between 4 and 6 minutes
• The probability of waiting more than 5 minutes
• The probability of waiting more than 10 minutes
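One way to check answers to Example 2 is to integrate p(x) by hand and then evaluate the result. For 0 ≤ t ≤ 10, integrating 0.2 − 0.02x from 0 to t gives P(t) = 0.2t − 0.01t². The Python sketch below encodes that formula; the function name `P` and the choice to return 0 below the interval and 1 above it are our own conventions for illustration, and the graph is left to the reader.

```python
def P(t):
    """Cumulative function for p(x) = 0.2 - 0.02x on 0 <= x <= 10.

    For 0 <= t <= 10, integrating p from 0 to t gives 0.2t - 0.01t^2.
    Outside the interval nobody waits, so P is 0 below 0 and 1 above 10.
    """
    if t <= 0:
        return 0.0
    if t >= 10:
        return 1.0
    return 0.2 * t - 0.01 * t**2

print(P(0))          # P(0)
print(P(2))          # waiting less than 2 minutes
print(P(6) - P(4))   # waiting between 4 and 6 minutes
print(1 - P(5))      # waiting more than 5 minutes
print(1 - P(10))     # waiting more than 10 minutes
```

Up to floating-point rounding, these evaluate to 0, 0.36, 0.20, 0.25, and 0. The last value is 0 because the density is zero beyond 10 minutes.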
CAUTION: P(a) is NOT the probability that x = a. It is the probability that x < a.
For continuous variables, the probability that x is exactly a is actually zero: there
are infinitely many possible values for x, and the area under p(x) above a single point is zero.
Remember that the probability of any event is at most 1. Something always has to
happen, so the probability of a given event not happening is 1 minus the probability of
that event happening. Since P(a) is the probability that x < a, and the probability that
x = a is zero, the probability that x > a is 1 − P(a).
3. Median and Mean
We can define the median and the mean of a distribution, just like we do for other data.
3.1. Median. Given an ordered set of discrete numbers, like 36, 42, 47, 47, 59, 70, 73,
89, 157, the median is the middle value. In this case, the median is 59. When we have a
quantity x distributed through a population, the median is the value T so that half the
population has values of x less than or equal to T and half the population has values of x
greater than or equal to T . If p(x) is the density function, this means that half the area
under the graph of p(x) lies to the left of T , or
$$\int_{-\infty}^{T} p(x)\,dx = 0.5$$
or, equivalently, P(T) = 0.5.
Example 3. Given the density function p(x) = 0.2 − 0.02x, 0 ≤ x ≤ 10, find the median.
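As a check on Example 3, the median T solves P(T) = 0.5, where integrating the density from 0 to T gives P(T) = 0.2T − 0.01T². Multiplying 0.2T − 0.01T² = 0.5 by 100 and rearranging gives T² − 20T + 50 = 0. The Python sketch below finds T numerically by bisection and compares it with the root of that quadratic lying in [0, 10]. The helper `median` and its tolerance are illustrative choices, not part of the original notes.

```python
import math

def P(t):
    """Cumulative function for p(x) = 0.2 - 0.02x on 0 <= x <= 10."""
    t = min(max(t, 0.0), 10.0)   # clamp t to the interval [0, 10]
    return 0.2 * t - 0.01 * t**2

def median(cdf, lo, hi, tol=1e-10):
    """Bisection: shrink [lo, hi] around the point where cdf crosses 0.5.

    Works because a cumulative function never decreases.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

T = median(P, 0, 10)
print(T)                    # numerical median
print(10 - math.sqrt(50))   # root of T^2 - 20T + 50 = 0 that lies in [0, 10]
```

Both values are about 2.93 minutes. The other root of the quadratic, 10 + √50, lies outside the interval and is discarded.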
3.2. Mean. To understand the formula for the mean, let’s return to the age density function, p(x). If we want to find the average age of people in the US, intuitively we should add
together the ages of everyone in the US and divide by the total number of people living in
the US. This, of course, is impossible. Instead, we can try the following:
• First, let N be the total number of people living in the US. Then the number of
people who are t years old is N times the percentage of people who are t years old.
• We can’t use p(x) to find the percent of people who are exactly t years old. Instead,
we can find the percent of people whose age is between t and t + ∆t, where ∆t
is some small number. The percentage of the population with age between t and
t + ∆t is p(t) ∗ ∆t.
• Then the number of people with age between t and t + ∆t is N ∗ p(t) ∗ ∆t. The
last piece of information we need is the sum of all of their ages. All of the ages are
very close to t, so the sum of the ages is approximately t ∗ N ∗ p(t)∆t.
• Now, we need to add up these estimates over all the intervals, giving us an estimate
of the sum of ages of all people in the US:
$$\sum t\,p(t)\,\Delta t\,N$$
This estimate gets better as ∆t gets smaller. As ∆t goes to 0, the sum becomes
$$\int_0^{100} N\,t\,p(t)\,dt = N \int_0^{100} t\,p(t)\,dt$$
• The mean is the sum of all ages divided by the number of people in the US, N.
Thus, the mean is
$$\frac{N \int_0^{100} t\,p(t)\,dt}{N} = \int_0^{100} t\,p(t)\,dt$$
In general, if a quantity has density function p(x), the mean value of the quantity is
$$\int_{-\infty}^{\infty} x\,p(x)\,dx$$
Example 4. Given the density function p(x) = 0.2 − 0.02x, 0 ≤ x ≤ 10, find the mean.
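The derivation of the mean builds it as a sum of terms of the form (value) × (density) × (small width), so a numerical check for Example 4 can follow the same recipe. The Python sketch below does that with many thin slices. The function name `mean_of_density` and the slice count are illustrative assumptions. For comparison, integrating x(0.2 − 0.02x) from 0 to 10 by hand gives 10 − 20/3 = 10/3.

```python
def p(x):
    """Density from Example 4, valid for 0 <= x <= 10."""
    return 0.2 - 0.02 * x

def mean_of_density(density, a, b, n=100_000):
    """Approximate the integral of x * density(x) over [a, b].

    Each slice of width dx contributes (value) * (fraction in slice),
    mirroring the t * p(t) * dt terms in the derivation (N cancels).
    """
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx   # midpoint of the i-th slice
        total += x * density(x) * dx
    return total

m = mean_of_density(p, 0, 10)
print(m)        # numerical estimate of the mean
print(10 / 3)   # value of the integral computed by hand
```

The mean (about 3.33 minutes) is larger than the median of this density (about 2.93 minutes), because the long waits pull the average up.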
3.3. The Normal Distribution. The normal density function is used for many different
quantities, from height to SAT scores. We describe normal distributions by the mean µ and
the standard deviation σ. The standard deviation measures how spread out the normal
distribution is. The normal distribution has a density function of the form
$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$
Example 5. The number of eggs in a Bufo americanus egg mass can be modeled by a
normal distribution with mean 8,000 and standard deviation 1,000. Write down the density
function and find the probability that an egg mass contains fewer than 1,500 eggs.
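The normal density has no elementary antiderivative, so probabilities like the one in Example 5 are found numerically. The Python sketch below is one way to do it, assuming the standard identity that expresses the normal cumulative function through the complementary error function, which Python's standard library provides as `math.erfc`. That identity is not stated in these notes, and the names `normal_pdf`, `normal_cdf`, `MU`, and `SIGMA` are our own.

```python
import math

MU = 8000.0      # mean number of eggs, from Example 5
SIGMA = 1000.0   # standard deviation, from Example 5

def normal_pdf(x, mu=MU, sigma=SIGMA):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(t, mu=MU, sigma=SIGMA):
    """Probability that x < t, i.e. the integral of normal_pdf up to t.

    Uses the identity P(t) = 0.5 * erfc((mu - t) / (sigma * sqrt(2))).
    """
    return 0.5 * math.erfc((mu - t) / (sigma * math.sqrt(2)))

print(normal_pdf(8000))                       # height of the density at the mean
print(normal_cdf(1500))                       # fewer than 1,500 eggs
print(normal_cdf(8000))                       # half the area lies below the mean
print(normal_cdf(9000) - normal_cdf(7000))    # within one standard deviation
```

Since 1,500 is 6.5 standard deviations below the mean, the probability of fewer than 1,500 eggs is extremely small, far below one in a billion.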