CHAPTER 4
Probability Distributions
Probability Distribution (PD) of a Random variable (RV) – what values occur
and how often. A PD may be expressed by a table, graph, or formula.
Definition. The probability distribution of a discrete random variable is
a table, graph, formula, or other device used to specify all possible values of a
discrete random variable along with their respective probabilities.
Example.
 x      frequency   P(X = x)   P(X ≤ x)
−1         45         .15        .15
 0         90         .30        .45
 1         60         .20        .65
 2         45         .15        .80
 3         30         .10        .90
 4         30         .10       1.00
Total     300        1.00
This is a table based on experimental probability. For classical (theoretical)
probability, we would not need the second column. The third column here
gives the relative frequency of occurrence of the corresponding value of X.
Based on this third column, a relative frequency histogram follows:
[Figure: relative frequency histogram of P(X = x) from the table above.]
Note.
(1) 0 ≤ P(X = x) ≤ 1
(2) Σ_{all x} P(X = x) = 1
A graph of the cumulative probability distribution, taken from column 4 of
the table, called an ogive, follows:
[Figure: ogive of the cumulative distribution P(X ≤ x).]
(a) Find P(1 ≤ X ≤ 3).
P(1 ≤ X ≤ 3) = P(X = 1) + P(X = 2) + P(X = 3) = .20 + .15 + .10 = .45
or
P(1 ≤ X ≤ 3) = P(X ≤ 3) − P(X ≤ 0) = .90 − .45 = .45
(b) Find P(X > 2).
P(X > 2) = 1 − P(X ≤ 2) = 1 − .80 = .20
Mean and Variance of Discrete Probability Distributions
µ = Σ x p(x)
σ² = Σ (x − µ)² p(x) = Σ x² p(x) − µ²
Example.
µ = (−1)(.15) + 0(.30) + 1(.20) + 2(.15) + 3(.1) + 4(.1) = 1.05
σ² = 1(.15) + 0(.30) + 1(.2) + 4(.15) + 9(.10) + 16(.10) − 1.05² = 2.3475
σ = √2.3475 = 1.532
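A short Python sketch (not part of the original notes) that recomputes these values directly from the definitions:

# Mean, variance, and standard deviation of the discrete distribution above.
xs = [-1, 0, 1, 2, 3, 4]
ps = [.15, .30, .20, .15, .10, .10]

mu = sum(x * p for x, p in zip(xs, ps))              # sum of x*p(x)
var = sum(x**2 * p for x, p in zip(xs, ps)) - mu**2  # sum of x^2*p(x) - mu^2
sd = var ** 0.5

print(mu, var, sd)   # approximately 1.05, 2.3475, 1.532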
The Binomial Distribution – based on a process called the Bernoulli process.
Definition. A sequence of Bernoulli trials forms a Bernoulli process under
the following conditions:
(1) Each trial results in one of two possible, mutually exclusive outcomes.
One of the possible outcomes is denoted (arbitrarily) as a success, and
the other is denoted a failure.
(2) The probability of a success, denoted by p, remains constant from trial
to trial. The probability of a failure, 1 − p, is denoted by q.
(3) The trials are independent; that is, the outcome of any particular trial is
not affected by the outcome of any other trial.
Often, we let 1 = success and 0 = failure.
Example. Find P (two ones) in 4 rolls of a fair die.
P(1 0 0 1) = (1/6)(5/6)(5/6)(1/6) = (1/6)²(5/6)² = 25/1296
P(1 0 1 0) = (1/6)(5/6)(1/6)(5/6) = (1/6)²(5/6)² = 25/1296
In how many ways can we get our two successes? This is similar to picking a
committee of 2 from a group of 4.
Definition. A combination of n objects taken x at a time is an unordered
subset of x of the objects.
The number of combinations of n objects that can be formed by taking x of
them at a time is nCx (read n choose x) where
nCx = n! / (x!(n − x)!)
Example.
4C2 = 4!/(2!2!) = (4 · 3 · 2 · 1)/((2 · 1)(2 · 1)) = 6
20C15 = 20!/(15!5!) = (20 · 19 · 18 · 17 · 16 · 15!)/((5 · 4 · 3 · 2 · 1)(15!)) = 15,504
Also,
20C5 = 15,504
Why, in general, is nC(n−x) = nCx?
What about picking a representative and an alternate from our group of 4?
There are 4 · 3 = 12 ways. Order comes into play here.
Definition. A permutation of n objects taken x at a time is an ordered
subset of x of the n objects.
The number of permutations of n objects taken x at a time is
nPx = n(n − 1)(n − 2) · · · (n − x + 1) = n!/(n − x)!
Example.
4P2 = 4 · 3 = 4!/2! = 12
20P5 = 20 · 19 · 18 · 17 · 16 = 20!/15! = 1,860,480
Note.
nCx = nPx/x!
The order is divided out.
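These counts can be checked in Python; a small sketch using the standard-library functions math.comb and math.perm (Python 3.8 or later):

import math

# Combinations: order does not matter.
print(math.comb(4, 2))     # 6
print(math.comb(20, 15))   # 15504, the same as math.comb(20, 5)

# Permutations: order matters.
print(math.perm(4, 2))     # 12
print(math.perm(20, 5))    # 1860480

# The note above: nCx = nPx / x!  (the order is divided out)
print(math.perm(20, 5) // math.factorial(5) == math.comb(20, 5))   # True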
Binomial Distribution (formally)
f(x) = P(X = x) = nCx p^x q^(n−x) for x = 0, 1, 2, . . . , n, and 0 elsewhere.
Example.
f(2) = 4C2 p²q² = 6 (1/6)²(5/6)² = 25/216
We make a table for all the cases of n = 4 and p = 1/6.
Notation.
P(X = x | n, p) = P(X = x | 4, 1/6)

x   f(x) = P(X = x)                      P(X ≤ x)
0   4C0 (1/6)^0 (5/6)^4 = 625/1296       625/1296
1   4C1 (1/6)^1 (5/6)^3 = 125/324        125/144
2   4C2 (1/6)^2 (5/6)^2 = 25/216         425/432
3   4C3 (1/6)^3 (5/6)^1 = 5/324          1295/1296
4   4C4 (1/6)^4 (5/6)^0 = 1/1296         1
Note.
P(X = 2) = P(X ≤ 2) − P(X ≤ 1) = 425/432 − 125/144 = 25/216
P(1 ≤ X ≤ 3) = P(X ≤ 3) − P(X ≤ 0) = 1295/1296 − 625/1296 = 335/648
P(X > 2) = 1 − P(X ≤ 2) = 1 − 425/432 = 7/432
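A short Python sketch (an illustration, not part of the notes) that reproduces the table above, using Fraction so the probabilities stay in exact fraction form:

from math import comb
from fractions import Fraction

n, p = 4, Fraction(1, 6)
q = 1 - p

cum = Fraction(0)
for x in range(n + 1):
    f = comb(n, x) * p**x * q**(n - x)   # f(x) = nCx p^x q^(n-x)
    cum += f                             # running total gives P(X <= x)
    print(x, f, cum)
# 0 625/1296 625/1296
# 1 125/324 125/144
# 2 25/216 425/432
# 3 5/324 1295/1296
# 4 1/1296 1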
In the next problem, we use Table B in the Appendix and the following equations:
P(X = x | n, p > .5) = P(X = n − x | n, 1 − p)
P(X ≤ x | n, p > .5) = P(X ≥ n − x | n, 1 − p)
P(X ≥ x | n, p > .5) = P(X ≤ n − x | n, 1 − p)
Problem (4.3.8). n = 15, p = .75
(a)
P(X = 6 | 15, .75) =
P(X = 9 | 15, .25) =
P(X ≤ 9 | 15, .25) − P(X ≤ 8 | 15, .25) =
.9992 − .9958 = .0034
(b)
P(X ≥ 7 | 15, .75) =
1 − P(X ≤ 6 | 15, .75) =
1 − P(X ≥ 9 | 15, .25) =
P(X ≤ 8 | 15, .25) = .9958
(c)
P(X ≤ 5 | 15, .75) =
P(X ≥ 10 | 15, .25) =
1 − P(X ≤ 9 | 15, .25) =
1 − .9992 = .0008
(d)
P(6 ≤ X ≤ 9 | 15, .75) =
P(X ≤ 9 | 15, .75) − P(X ≤ 5 | 15, .75) =
P(X ≥ 6 | 15, .25) − P(X ≥ 10 | 15, .25) =
[1 − P(X ≤ 5 | 15, .25)] − [1 − P(X ≤ 9 | 15, .25)] =
P(X ≤ 9 | 15, .25) − P(X ≤ 5 | 15, .25) =
.9992 − .8516 = .1476
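When no table is at hand, the same four answers can be computed directly from the binomial formula; a sketch that also verifies one of the symmetry identities numerically:

from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 15, .75
print(binom_pmf(6, n, p))                       # about .0034   (a)
print(1 - binom_cdf(6, n, p))                   # about .9958   (b), P(X >= 7)
print(binom_cdf(5, n, p))                       # about .0008   (c)
print(binom_cdf(9, n, p) - binom_cdf(5, n, p))  # about .1476   (d)

# Symmetry: P(X = 6 | 15, .75) = P(X = 9 | 15, .25)
print(abs(binom_pmf(6, 15, .75) - binom_pmf(9, 15, .25)) < 1e-12)   # True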
Properties of the Binomial Distribution
(1) Completely determined by n and p.
(2) The mean is µ = np.
Example. µ = 4(1/6) = 2/3
(3) The variance is σ² = np(1 − p) = npq.
Example. σ² = 4(1/6)(5/6) = 5/9 (a numerical check of (2) and (3) follows this list)
(4) Can be used for sampling from:
(a) infinite populations.
(b) finite populations with replacement (if n is small relative to N, say
N ≥ 10n, we can probably do without replacement).
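A quick numerical check of properties (2) and (3) for the n = 4, p = 1/6 example, summing over the probability distribution rather than quoting the formulas (a sketch, not part of the notes):

from math import comb

n, p = 4, 1/6
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

mu = sum(x * f for x, f in enumerate(pmf))
var = sum(x**2 * f for x, f in enumerate(pmf)) - mu**2

print(mu, n * p)              # both about 0.6667 (= 2/3)
print(var, n * p * (1 - p))   # both about 0.5556 (= 5/9)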
The Poisson Distribution – used extensively as a probability model in biology
and medicine.
Let x = the number of occurrences of some random event in an interval of time
or space (or some volume of matter).
Let λ = the average number of occurrences of the random event in the interval
(or volume). λ is called the parameter of the distribution.
Then
f(x) = P(X = x) = (λ^x e^(−λ))/x!, x = 0, 1, 2, 3, . . . .
The Poisson Process – the process underlying the Poisson distribution.
(1) The occurrences of the events are independent. The occurrence of an event
in an interval of space or time has no e↵ect on the probability of a second
occurrence of the event in the same, or any other, interval.
(2) Theoretically, an infinite number of occurrences of the event must be possible
in the interval.
(3) The probability of the single occurrence of the event in a given interval is
proportional to the length of the interval.
(4) In any infinitesimally small portion of the interval, the probability of more
than one occurrence of the event is infinitesimal.
Note.
f(x) ≥ 0 for all x and Σ_{all x} f(x) = 1 (an infinite series)
mean: µ = λ
variance: σ² = λ
Problem (4.4.4). We are given λ = .5 and we see Table C.
(a) P(X = 1 | .5) = P(X ≤ 1 | .5) − P(X ≤ 0 | .5) = .910 − .607 = .303
(b) P(X = 0 | .5) = .607
(c) P(X = 4 | .5) = P(X ≤ 4 | .5) − P(X ≤ 3 | .5) = 1.000 − .998 = .002
(d) P(X ≥ 1 | .5) = 1 − P(X = 0 | .5) = 1 − .607 = .393
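The Table C values used above can be reproduced directly from the Poisson formula; a short sketch (not part of the notes):

from math import exp, factorial

def pois_pmf(x, lam):
    return lam**x * exp(-lam) / factorial(x)

def pois_cdf(x, lam):
    return sum(pois_pmf(k, lam) for k in range(x + 1))

lam = .5
print(pois_cdf(1, lam) - pois_cdf(0, lam))   # about .303   (a)
print(pois_pmf(0, lam))                      # about .607   (b)
print(pois_cdf(4, lam) - pois_cdf(3, lam))   # about .002   (c)
print(1 - pois_pmf(0, lam))                  # about .393   (d), P(X >= 1)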
Reese’s Pieces Simulation – includes looking ahead a bit.
Reese’s Pieces come in 3 colors – yellow, orange, and brown. The proportions of
these colors is evidently a trade secret. Suppose the actual proportion of orange
pieces is .45. The following javascript simulation takes same-sized random
samples of Reese’s pieces, counts the number of orange ones, and then plots
the proportion on a number line. The applet is located at
http://www.rossmanchance.com/applets/OneProp/OneProp.htm?candy=1
This is one of several applets at
http://www.rossmanchance.com/applets/
designed by Allan Rossman and Beth Chance, two prominent statistics educators.
Fill out the screen as seen below, then click Draw Samples 10 times.
Now click on Count.
We see that, since the standard deviation for the distribution here is roughly
.1, 86.83% of the samples are greater than .35. Change the .35 to .55, again
click Count, and you get 18.98% of the samples are greater than .55. Thus
86.83%-18.98%=64.85% of the samples have proportions within approximately
1 standard deviation of the mean. Now use .25 and .65. We have that 95.83%
of our samples have proportions within approximately two standard deviations
of the mean. Finally, choose .15 and .75. We have that 99.80% of our samples
have proportions within approximately three standard deviations of the mean.
Notice how closely these values of
64.85 — 95.83 — 99.80
match
68.3 — 95.4 — 99.7,
the corresponding percents for the normal curve.
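The applet's experiment can be imitated with a short Python sketch. The sample size is an assumption here (the screenshots are not reproduced); 25 pieces per sample is consistent with the quoted standard deviation of roughly .1, since √(.45 · .55/25) ≈ .0995.

import random

# Repeatedly draw samples of candies, record the sample proportion of orange,
# and see how the sample proportions spread out around .45.
p_orange = .45      # assumed true proportion of orange, as in the text
n_pieces = 25       # assumed sample size (not shown in the missing screenshots)
n_samples = 10000

props = []
for _ in range(n_samples):
    orange = sum(1 for _ in range(n_pieces) if random.random() < p_orange)
    props.append(orange / n_pieces)

mean_prop = sum(props) / n_samples
within_1sd = sum(1 for ph in props if .35 < ph < .55) / n_samples

print(mean_prop)    # close to .45
print(within_1sd)   # roughly two-thirds, as with the 64.85% observed above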
Continuous Probability Distribution (or Probability Density Function) – a function f(x) such that
(1) f(x) ≥ 0 for all x.
(2) ∫_{−∞}^{∞} f(x) dx = 1.
Example. Consider the function
f(x) = 1/(5π(1 + (x − 1)²/25))
where X is a random variable.
P(−10 ≤ X ≤ 20) = ∫_{−10}^{20} f(x) dx.
In general,
P(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx.
Also,
∫_{−∞}^{∞} f(x) dx ≈ ∫_{−10⁶}^{10⁶} f(x) dx = 1.
Thus f (x) meets the requirements of being a probability density function.
Finally, for every real number a,
P(X = a) = ∫_{a}^{a} f(x) dx = 0.
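This f(x) is in fact a Cauchy density with location 1 and scale 5, so the integrals above can be checked both in closed form (its antiderivative is an arctangent) and numerically; a sketch:

from math import atan, pi

def f(x):
    # The density from the example: f(x) = 1 / (5*pi*(1 + (x - 1)**2 / 25))
    return 1 / (5 * pi * (1 + (x - 1)**2 / 25))

def F(x):
    # An antiderivative of f: F(x) = (1/pi) * arctan((x - 1)/5)
    return atan((x - 1) / 5) / pi

# P(-10 <= X <= 20), in closed form and by a crude Riemann sum.
print(F(20) - F(-10))                                   # about .78
dx = 0.001
print(sum(f(-10 + k * dx) * dx for k in range(30000)))  # about .78 as well

# The total area, approximated on [-10**6, 10**6], is essentially 1.
print(F(10**6) - F(-10**6))                             # about .999997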
We have been using normal curves. Let’s take a closer look at them.
Definition. A normal curve with mean µ and standard deviation σ is the graph of the function
f(x) = (1/(σ√(2π))) e^(−(x − µ)²/(2σ²))
Thus normal curves are completely determined by their mean and standard
deviation.
For the three normal curves above, one has a mean of 70 and a standard
deviation of 5, another has a mean of 70 and a standard deviation of 10, and
the third has a mean of 50 and a standard deviation of 10. Which is which?
Every normal curve extends from −∞ to ∞ on the horizontal axis with the
area under the curve always equal to one.
Consider the normal curve below with mean µ and standard deviation σ, or, in
the case of the standard normal curve, with mean 0 and standard deviation 1.
Moving from the peak to the right, every normal curve changes from concave
down to concave up exactly 1 standard deviation from the mean.
From the above, we can also see that every normal curve with mean µ and
standard deviation σ and variable x can be transformed into a standard normal
curve with variable z by the formula
z = (x − µ)/σ.
Similarly, every standard normal curve with variable z can be transformed into
a normal curve with mean µ and standard deviation σ with variable x by the
formula
x = µ + σz.
As an example, the graph of the normal curve with mean 20 and standard
deviation 5 and the corresponding standard normal curve follow below.
Characteristics of a Normal Distribution
(1) It is symmetric about its mean µ.
(2) Mean = median = mode = µ.
(3) The total area between the curve and the x-axis is 1, with 50% of the
area on each side of the mean.
(4) The 68–95–99.7 rule says that 68% of the area is within one SD of the
mean, 95% within two, and 99.7% within three.
(5) The graph has a maximum at µ and points of inflection at µ ± σ.
(6) The normal distribution is completely determined by µ and σ. µ is often
called a location parameter, while σ, which determines the peakedness
or flatness of the graph, is called a shape parameter.
Example (Using the Standard Normal Distribution with Table D).
(1) P(−1.3 ≤ z ≤ 2.53) = P(z ≤ 2.53) − P(z < −1.3) = .9943 − .0968 = .8975
(2) P(z > .7) = 1 − P(z ≤ .7) = 1 − .7580 = .2420
(3) P(−1.24 ≤ z ≤ 1.24) = 1 − 2P(z < −1.24) = 1 − 2(.1075) = .785
(4) If P(z ≤ z1) = .9265, what is z1? z1 = 1.45
(5) If P(z > z1) = .2611, what is z1? P(z ≤ z1) = 1 − .2611 = .7389 ⟹ z1 = .64
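These Table D lookups can be checked without a table: the standard normal CDF is Φ(z) = (1/2)(1 + erf(z/√2)), and the standard library also provides an inverse CDF through statistics.NormalDist. A sketch:

from math import erf, sqrt
from statistics import NormalDist

def Phi(z):
    # Standard normal cumulative distribution function.
    return 0.5 * (1 + erf(z / sqrt(2)))

print(Phi(2.53) - Phi(-1.3))          # about .8975   (1)
print(1 - Phi(.7))                    # about .2420   (2)
print(1 - 2 * Phi(-1.24))             # about .785    (3)
print(NormalDist().inv_cdf(.9265))    # about 1.45    (4)
print(NormalDist().inv_cdf(.7389))    # about .64     (5)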
Many measurable characteristics in nature are at least approximately normal, so
we can use the normal distribution to model the distribution of these variables.
Problem (4.7.2). µ = 140, σ = 50
(a) P(X ≥ 200) = P(z ≥ (200 − 140)/50) = P(z ≥ 1.2) =
1 − P(z ≤ 1.2) = 1 − .8849 = .1151
(b) P(X < 100) = P(z < (100 − 140)/50) = P(z < −.8) = .2119
(c) P(100 ≤ X ≤ 200) = P((100 − 140)/50 ≤ z ≤ (200 − 140)/50) = P(−.8 ≤ z ≤ 1.2) =
P(z ≤ 1.2) − P(z < −.8) = .8849 − .2119 = .673
(d) P(200 ≤ X ≤ 250) = P((200 − 140)/50 ≤ z ≤ (250 − 140)/50) = P(1.2 ≤ z ≤ 2.2) =
P(z ≤ 2.2) − P(z < 1.2) = .9861 − .8849 = .1012
(e) From (a), P(X > 200) = .1151 ⟹ .1151(10,000) = 1151 have a ridge
count of 200 or more.
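A sketch that checks Problem 4.7.2 with the standard library's NormalDist, which performs the z-transformation internally (the 10,000 in part (e) is the group size given in the problem):

from statistics import NormalDist

X = NormalDist(mu=140, sigma=50)

print(1 - X.cdf(200))                    # about .1151   (a)
print(X.cdf(100))                        # about .2119   (b)
print(X.cdf(200) - X.cdf(100))           # about .6730   (c)
print(X.cdf(250) - X.cdf(200))           # about .1012   (d)
print(round((1 - X.cdf(200)) * 10000))   # about 1151 people   (e)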