Probability 3.
Two sorts of random variables are of interest:
DISCRETE: the number of outcomes is countable
CONTINUOUS: the number of outcomes is infinite
(not countable).
Random variables are often described with probability
distribution functions. These are graphs, tables, or
formulas that allow for the computation of
probabilities.
A few common Discrete probability distributions are:
Uniform: P(Y=y) = 1/number of outcomes
(all outcomes are equally likely)
Binomial: $P(Y=y) = {}_nC_y \, \pi^{y} (1-\pi)^{n-y}$
where: ${}_nC_y$ is the number of combinations of n things
taken y at a time
n is the number of trials
y is the number of successes
$\pi$ is the probability of succeeding in
one trial
in each trial, the only outcomes are
success and failure (0,1).
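The binomial formula can be evaluated directly. The following is a minimal sketch, not part of the original notes: the function name `binomial_pmf`, the argument name `pi` (the per-trial success probability), and the example values of 10 trials with success probability 0.5 are all assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(y: int, n: int, pi: float) -> float:
    """P(Y = y): probability of y successes in n independent trials,
    each with success probability pi."""
    return comb(n, y) * pi**y * (1 - pi) ** (n - y)


# Hypothetical example: 10 trials, success probability 0.5 per trial.
n, pi = 10, 0.5
probs = [binomial_pmf(y, n, pi) for y in range(n + 1)]

print(binomial_pmf(3, n, pi))  # 0.1171875
print(sum(probs))              # probabilities over y = 0..n total 1 (up to rounding)
```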
Poisson: $P(Y=y) = \dfrac{\mu^{y} e^{-\mu}}{y!}$;
y = 0, 1, 2, … (for example, the number of people
waiting in line at a teller)
$\mu$ = the population mean of Y.
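As a sketch of how Poisson probabilities might be computed, the snippet below evaluates the formula for the first few counts. The function name `poisson_pmf` and the mean of 2 people waiting in line are assumptions made up for this illustration.

```python
from math import exp, factorial


def poisson_pmf(y: int, mu: float) -> float:
    """P(Y = y) for a Poisson random variable with population mean mu."""
    return mu**y * exp(-mu) / factorial(y)


# Hypothetical example: on average 2 people are waiting in line.
mu = 2.0
for y in range(5):
    print(y, round(poisson_pmf(y, mu), 4))
```

Unlike the binomial, y has no upper limit here, so the probabilities only add up to 1 when summed over all counts 0, 1, 2, and so on.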
These are a few of many discrete distributions and
are used when there are only a few possible outcomes:
number of defects on a circuit board, number of
tumors in a mouse, pregnant or not, dead or alive,
number of accidents at an intersection and so on.
A useful Continuous probability distribution is the
Normal. There are several others which we will use
more (Student's t, F, $\chi^2$), but they are all based upon
the Normal.
The Normal (N) is the ‘bell shaped curve’. If you
identify the mean ($\mu$) of y (the random variable)
along a number line, then N will be highest at $\mu$ and
symmetric about $\mu$ (the mean, median, and mode are
all the same).
The height of N at any value of y can be computed from the
following function:
$f(y) = \dfrac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(y-\mu)^2}{2\sigma^2}}$
Where: $\mu$ is the population mean of y
$\sigma$ is the standard deviation
(in this formula $\pi$ is the constant 3.14159…).
So, N can have any mean and variance. A probability cannot
be computed for a single value of y, but for an interval (say
y between a and b) by integrating f(y) over the interval (a, b).
Probabilities, then, are defined as the area under the N
curve between a and b.
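To make "area under the curve" concrete, here is a minimal sketch that approximates the integral of f(y) with a simple midpoint rule and compares it with the Normal CDF from Python's standard library. The function names, the step count, and the choice of mean 0, standard deviation 1, and interval (-1, 1) are assumptions for illustration only.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(y: float, mu: float, sigma: float) -> float:
    """Height of the Normal curve at y."""
    return exp(-((y - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))


def area_under_normal(a: float, b: float, mu: float, sigma: float,
                      steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the area under the curve from a to b."""
    width = (b - a) / steps
    heights = (normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return sum(heights) * width


# Hypothetical example: mean 0, standard deviation 1, interval (-1, 1).
approx = area_under_normal(-1.0, 1.0, 0.0, 1.0)
exact = NormalDist(0.0, 1.0).cdf(1.0) - NormalDist(0.0, 1.0).cdf(-1.0)
print(round(approx, 4), round(exact, 4))  # 0.6827 0.6827
```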
For example: the 15-second pulse rate has a mean of 20 and a
standard deviation of 2. It is Normally distributed. What
is the probability of selecting someone at random with a
pulse rate greater than 22? (Or, what percent of people
have a pulse rate greater than 22?) Easy: integrate f(y)
from 22 to $\infty$! It’s easy with a computer, but by hand it is
a bit awkward.
The process for computing this probability is to change
from the pulse rate distribution to the standard Normal
and look it up in a table.
The standard Normal is called Z and has a mean of 0 and
standard deviation of 1. In shorthand notation:
Z is N(0,1) while
pulse rate is N(20,2)
(here the second number is the standard deviation).
The solution to the problem above involves 5 steps.
Pulse rate probability example: 5 steps:
1. Draw a picture of the problem:
[Figure: Normal curve centered at 20, with the area of interest shaded to the right of 22.]
2. Express the problem in the following form (y represents pulse
rate)
P(22 < y) = p
3. Subtract the mean from each component inside the ( ).
Where it’s a number, subtract the number, where a symbol,
subtract the symbol.
4. Divide each component by the standard deviation, likewise.
$P\left(\dfrac{22-20}{2} < \dfrac{y-\mu}{\sigma}\right) = P(1 < z) = p$
5. Look up the value for z in the table in Appendix 2….
P( 1 < z ) = .1587
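The table lookup in step 5 can be cross-checked by computer. This is a sketch assuming Python's standard-library `statistics.NormalDist` stands in for the Appendix table; the variable names are made up for this example.

```python
from statistics import NormalDist

mean, sd = 20.0, 2.0   # pulse rate is N(20, 2)
cutoff = 22.0

# Steps 3 and 4: subtract the mean, then divide by the standard deviation.
z = (cutoff - mean) / sd

# Step 5: upper-tail area of the standard Normal beyond z.
p = 1.0 - NormalDist().cdf(z)

print(z, round(p, 4))  # 1.0 0.1587
```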
It isn’t always that easy… practice the following:
1. P( y < 18 ) = ? (.1587)
2. P( y < 22 ) = ? (.8413)
3. P( 18 < y < 22 ) = ? (.6826)
4. P( y < 20 ) = ? (.5000)
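One way to check the practice answers is sketched below, again assuming `statistics.NormalDist` in place of the table. The dictionary name `answers` is just for this illustration.

```python
from statistics import NormalDist

pulse = NormalDist(mu=20.0, sigma=2.0)  # pulse rate is N(20, 2)

answers = {
    "P(y < 18)": pulse.cdf(18),
    "P(y < 22)": pulse.cdf(22),
    "P(18 < y < 22)": pulse.cdf(22) - pulse.cdf(18),
    "P(y < 20)": pulse.cdf(20),
}

for label, value in answers.items():
    print(label, "=", round(value, 4))
```

The third answer prints as .6827 rather than .6826. The table value comes from subtracting two already-rounded entries (.8413 - .1587), so the last digit can differ by one.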
Pulse rate probability example: another:
Another interesting aspect of the problem is this: If pulse
rate is N(20,2), above what value will 1/3 (33%) of people fall ?
1. Draw a picture:
[Figure: Normal curve centered at 20, with an area of interest of .33 shaded to the right of the unknown value ?.]
2. Express in the form (we need to solve for ?):
P( y > ? ) = .33
3&4. Subtract mean and divide by standard deviation gives:
$P\left(\dfrac{y-\mu}{\sigma} > \dfrac{?-20}{2}\right) = P\left(z > \dfrac{?-20}{2}\right) = .33$
5. From Appendix table 2, we also know that:
$P(z > .44) = .33$, so:
$P\left(z > \dfrac{?-20}{2}\right) = .33 = P(z > .44)$
which implies that: $\dfrac{?-20}{2} = .44$
so: $? = 2(.44) + 20 = 20.88$
So, 33% of people have a pulse rate greater than 20.88.
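Going from a probability back to a pulse rate reverses the table lookup. Here is a sketch assuming `statistics.NormalDist.inv_cdf` replaces reading the table backwards; the variable names are chosen for this example.

```python
from statistics import NormalDist

mean, sd = 20.0, 2.0   # pulse rate is N(20, 2)
upper_tail = 0.33      # area of interest to the right of the unknown value

# inv_cdf works with area to the LEFT, so convert the upper tail first.
z = NormalDist().inv_cdf(1.0 - upper_tail)

# Undo the standardization: multiply by the standard deviation, add the mean.
cutoff = mean + sd * z

print(round(z, 2), round(cutoff, 2))  # 0.44 20.88
```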
There is much more marvelous information about N in the text
and the notes.
I recommend you look over these resources, work some
problems, and call if you have a question.
All this raises the question: why so much interest in the Normal
distribution? That leads to the next discussion, the sampling
distribution for the mean and the Central Limit Theorem.
Consider drawing repeated samples from the same population.
All the samples are the same size and the mean is computed
for each. Obviously, the means will not all be the same.
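Repeated sampling is easy to try on a computer. The sketch below is an illustration only: it assumes the population is the N(20, 2) pulse-rate distribution from earlier, and the sample size of 25, the 1000 samples, and the fixed seed are arbitrary choices.

```python
import random
from statistics import mean, stdev


def sample_means(num_samples: int, sample_size: int,
                 mu: float = 20.0, sigma: float = 2.0, seed: int = 0) -> list:
    """Draw num_samples samples of equal size and return the mean of each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        mean([rng.gauss(mu, sigma) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


means = sample_means(num_samples=1000, sample_size=25)

print([round(m, 2) for m in means[:5]])               # the first few sample means differ
print(round(mean(means), 2), round(stdev(means), 2))  # center and spread of the means
```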
END Probability 3.