Pepperdine Psych 626: Probability, Normal Distributions, & Sampling Distributions -- Dr. Mascolo
Probability
Sir Francis Galton’s “Quincunx Machine”:
 A ball tumbles down the machine, hitting a pin at each level (an “event”) and has
a 50/50 chance of bouncing to the right (one ”outcome”) or to the left (another
“outcome”) – so each event leads to 2 random outcomes – they have equal
probabilities (50/50).
 Therefore, the ball’s path through the machine is also random because it is made up of a series
of events with random outcomes.
 At the end of its path, the ball ends up in one of the bins; any single ball's bin is chancy, but the pattern of bins filled by many balls is not random – in fact, it is highly predictable!
 2 principles are at work:
1. Law of large numbers – For events with predictable outcomes (i.e., their probabilities
can be specified), the more times the event is conducted, the closer the actual outcomes
approach the predicted outcomes; e.g., sending more and more balls into the machine
OR having more and more rows of pins for the balls to fall through.
2. Central Limit Theorem – together with the Law of Large Numbers, it allows the parameters
and shape of the distribution of outcomes to be derived.
 These principles and their importance in science may seem esoteric and indecipherable, but we
will see how they impact our daily lives.
 Here’s a Quincunx Machine with not 5 but 10 levels; in lecture I’ll point out details, but for
now, what can you see in the differences across these figures?
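The Quincunx can be sketched in a few lines of Python (an illustrative simulation, not part of the original handout; the function name `quincunx` is my own). Each ball makes a 50/50 left/right bounce at every pin, and its bin is simply the number of rightward bounces:

```python
import random

def quincunx(n_balls, n_levels, seed=0):
    """Simulate Galton's Quincunx: each ball bounces left or right with
    probability 1/2 at each of n_levels pins; its final bin is the number
    of rightward bounces.  Returns the count of balls in each bin."""
    rng = random.Random(seed)
    bins = [0] * (n_levels + 1)
    for _ in range(n_balls):
        rights = sum(rng.random() < 0.5 for _ in range(n_levels))
        bins[rights] += 1
    return bins

counts = quincunx(10000, 10)
print(counts)  # the middle bins collect far more balls than the edges
```

Running this with more balls or more levels shows the Law of Large Numbers at work: the pile of counts settles into the familiar bell shape.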

Now here’s a Quincunx Machine with not 10 but 30 levels; in lecture I’ll point out details, but
for now, what can you see in the differences across these figures?
Probability applied to games of chance -- cards & dice:
DECK OF CARDS

                  BLACK                    RED
           Clubs      Spades      Diamonds     Hearts
             K          K            K           K
             Q          Q            Q           Q
             J          J            J           J
            10         10           10          10
             9          9            9           9
             8          8            8           8
             7          7            7           7
             6          6            6           6
             5          5            5           5
             4          4            4           4
             3          3            3           3
             2          2            2           2
             A          A            A           A
________________________________________________________
Total       13         13           13          13    = 52
OUTCOMES OF ROLLING 2 DICE

Sum   Outcomes
12    (6,6)
11    (6,5) (5,6)
10    (6,4) (4,6) (5,5)
 9    (5,4) (4,5) (6,3) (3,6)
 8    (6,2) (2,6) (5,3) (3,5) (4,4)
 7    (6,1) (1,6) (5,2) (2,5) (3,4) (4,3)
 6    (5,1) (1,5) (4,2) (2,4) (3,3)
 5    (4,1) (1,4) (3,2) (2,3)
 4    (3,1) (1,3) (2,2)
 3    (2,1) (1,2)
 2    (1,1)
Probability Concepts:
 Event -- an action with a specific set of outcomes (e.g., rolling 2 dice, flipping a coin, picking a
card from a deck)
 Outcome -- the result of an event (e.g., rolling a 7, flipping a Heads, drawing a King of Hearts).
 Mutually Exclusive Outcomes -- cannot co-occur (i.e., cannot both happen at once).
 Mathematical definition of Mutually Exclusive: P(A and B) = 0
 Examples of Mutually Exclusive Outcomes:
 On a roll of 2 dice: a 6 or a 7
 On the flip of a coin: Heads or Tails
 On the draw of a card from a deck: King or Queen
 Example of not mutually exclusive:
 On the draw of a card from a deck: a King or a Heart (the King of Hearts
satisfies both outcomes)
 Exhaustive -- no other possible outcomes. (e.g., on the flip of a coin: Heads or Tails, no 3rd
possibility)
 Mutually Exclusive & Exhaustive: If outcomes are both Mutually Exclusive & Exhaustive,
then:
P(A) + P(B) = 1.00, so P(B) = 1.00 – P(A) and P(A) = 1.00 - P(B)
 Independent -- the outcome of 1 event does not change the probability of the outcome for a
2nd event, i.e.:
P(B/A) = P(B)
 Gambler’s Fallacy -- belief that independent outcomes are actually dependent.
Basic Calculation of Probability:

P(A) = (# of outcomes that "favor" A) / (Total # of possible outcomes)

(e.g., Outcome A: rolling 2 dice totaling 10 or higher; P(A) = 6/36 = 1/6)
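That count can be checked by brute force in Python (a quick sketch, not part of the course materials): enumerate all 36 equally likely outcomes from the dice table above and count the ones that favor A.

```python
from itertools import product
from fractions import Fraction

# Enumerate all 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

# Outcome A: the two dice total 10 or higher (sums of 10, 11, or 12).
favor = [o for o in outcomes if sum(o) >= 10]

p_a = Fraction(len(favor), len(outcomes))
print(p_a)  # 1/6
```

Using Fraction keeps the answer exact (6/36 reduces to 1/6) instead of a rounded decimal.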
Mathematical expressions of probability
“Area Under the Curve” = “Proportion” = “Percentage” => all synonymous
Probabilities range from 0.00 (= 0%, "Impossible") to 1.00 (= 100%, "Sure thing").
Basic Probability Calculations
Addition Rule:
P(A or B) = P(A) + P(B) (if mutually exclusive)
P(A or B) = P(A) + P(B) – P(A and B) (if not mutually exclusive)
Multiplication Rule: P(A and B) = P(A) x P(B) (if independent)
P(A and B) = P(A) x P(B/A) (if not independent)
Working examples of probability calculations:
Addition Rule:
Ex 1: (outcomes are mutually exclusive) Pick a card from a randomly shuffled deck. What is the
probability the card is either a King or a Jack?
That is: Outcome A = King, Outcome B = Jack
Ex 1 asks for the probability of one outcome OR another, so Addition Rule applies:
P(A or B) = P(K or J) = P(K) + P(J) = 4/52 + 4/52 = 8/52
Ex 2: (outcomes are not mutually exclusive) Pick a card from a randomly shuffled deck. What is the
probability the card is either a King or a Heart?
That is: Outcome A = King, Outcome B = Heart
Once again, Addition Rule applies:
P(A or B) = P(K or H) = P(K) + P(H) = 4/52 + 13/52 = 17/52 (??)
No -- when outcomes are not mutually exclusive the Addition Rule adds a "correction term" (see above),
so:
P(K or H) = P(K) + P(H) - P(K and H) = 4/52 + 13/52 - 1/52 = 16/52
Application of Addition Rule: In poker it is considered bad strategy to draw to (try to fill) an inside
straight:
e.g., dealt: 3, 4, 6, 7, K -- discard the King, hope for a 5: P(5) = 4/47 = 0.0851
Compare to:
Dealt: 3, 4, 5, 6, K -- discard the King, hope for a 2 or a 7: P(2 or 7) = 4/47 + 4/47 = 8/47 = 0.1702
(double the probability – a much better bet)
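Both Addition Rule examples can be verified by counting cards directly (an illustrative sketch; the helper `p` is my own). Counting each card in the deck once automatically applies the correction term when outcomes overlap:

```python
from fractions import Fraction

# Build the 52-card deck from the table above as (rank, suit) pairs.
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['Clubs', 'Spades', 'Diamonds', 'Hearts']
deck = [(r, s) for r in ranks for s in suits]

def p(event):
    """P(event) = (# of cards that favor the event) / (total # of cards)."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

# Ex 1 -- mutually exclusive: King or Jack = 4/52 + 4/52
p_k_or_j = p(lambda c: c[0] in ('K', 'J'))

# Ex 2 -- not mutually exclusive: counting each card once automatically
# applies the correction, giving 4/52 + 13/52 - 1/52
p_k_or_h = p(lambda c: c[0] == 'K' or c[1] == 'Hearts')

print(p_k_or_j)  # 2/13 (i.e., 8/52)
print(p_k_or_h)  # 4/13 (i.e., 16/52)
```

Notice that the King of Hearts satisfies both conditions of Ex 2 but is only counted once, which is exactly what subtracting P(K and H) accomplishes.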
Multiplication Rule:
Ex 1: (outcomes are independent) Pick a card, examine it, return it to the deck, shuffle thoroughly,
pick a card again, examine it. What is the probability that the 1st card is a King and that the 2nd card is
also a King?
That is: Outcome A = King, Outcome B = King
Ex 1 asks for the probability of one outcome AND another, so Multiplication Rule applies. Also,
because Ex 1 includes sampling with replacement, the 2 outcomes are independent, so this version of
the Multiplication Rule is used: P(A and B) = P(A) x P(B)
And so:
P(A and B) = P(K and K) = P(K) x P(K) = 4/52 x 4/52 = 0.00592
Ex 2: (outcomes are not independent) Pick a card, examine it, and now pick a second card. What is
the probability that the 1st card is a King and that the 2nd card is also a King?
That is: Outcome A = King, Outcome B = King
Ex 2 asks for the probability of one outcome AND another, so Multiplication Rule applies. However,
in this example the first card is not returned to the deck and then the deck shuffled, so the first outcome
does change the probability of the second outcome – that is, they are not independent, so this version
of the Multiplication Rule is used: P(A and B) = P(A) x P(B/A)
So: P(K and K) = P(A) x P(B/A) = 4/52 x 3/51 = 0.0045
Note: P(B/A), i.e., the probability of B given that A has occurred, = 3/51 because the first outcome
was a King, so now there are only 3 Kings left in the deck, which now includes only 51 cards.
Application of Multiplication Rule: California Lottery – In lecture I will explain this: go ahead and
play, but you won’t win!
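Both versions of the Multiplication Rule are easy to compute exactly (a quick sketch, not part of the original handout):

```python
from fractions import Fraction

# Ex 1 -- sampling WITH replacement, so the two draws are independent:
# P(K and K) = P(K) x P(K)
p_independent = Fraction(4, 52) * Fraction(4, 52)
print(float(p_independent))   # ~0.00592

# Ex 2 -- sampling WITHOUT replacement, so the draws are not independent:
# P(K and K) = P(K) x P(K given a King is already gone) = 4/52 x 3/51
p_dependent = Fraction(4, 52) * Fraction(3, 51)
print(float(p_dependent))     # ~0.0045
```

Removing the first King shrinks both the numerator (3 Kings left) and the denominator (51 cards left), which is why the dependent probability comes out smaller.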
Sir Francis Galton demonstrated that random events lead to surprisingly orderly outcomes.
Many variables in nature -- including those of living organisms – can be described by the
Normal Distribution (i.e., “Bell-Shaped Curve”). Just like Vegas Casinos, we gain confidence
in this phenomenon as the number of trials increases (i.e., we benefit from the Law of Large
Numbers).
This is important because Normal Distributions have specifiable probabilities -- no matter
what response variable is being collected. So, the peak of a Normal Distribution is also where
the mean can be found, and there are known percentages of scores that deviate both above
and below the mean. We could express these deviations in terms of raw scores (the actual
response variable), but we instead express them in terms of standard deviations -- that way we
don't have to worry about the particular response variable score that was collected in the study --
we just use a measure of how much a raw score deviates from the mean -- that is, the standard
deviation we learned to calculate earlier in the semester. In this way we contradict the idea that
"you can't compare apples to oranges" -- when the raw scores of 2 or more variables are
expressed in terms of how many standard deviations they differ from their means, we can
absolutely compare them.
A quick example may help clarify this abstract point. If you took some aptitude tests in high
school, you may have been told you should consider a career in art instead of one in
physics. This advice would be based on your test scores -- for example showing that you
scored 3 standard deviations above the mean in art but 2 standard deviations below the
mean in physics. In other words, you compare much more favorably with your peers in art
than you do in physics -- in particular, you are in the “gifted” range in art but are well below
average in physics.
We can transform individual raw scores into “standard scores” (“z scores”) by subtracting the
mean from the raw score and then dividing by the standard deviation:
z = (Y - μ) / σ

(when comparing an individual's score with a population distribution)
So here’s how this all turns out for a Normally Distributed variable -- in this case it’s
intelligence as measured on the WAIS-IV, which is normed so that the population mean =
100 and the standard deviation = 15. This figure depicts this distribution:
[Figure: the Normal Distribution of WAIS-IV scores, with raw scores marked at 40, 55, 70, 85,
100, 115, 130, 145, and 160 and the corresponding z scores beneath them]
Here are some examples:
A student who scores 115 on the WAIS-IV has achieved a standard (z) score of +1
(z = (115 - 100)/15 = 15/15 = 1). That is, his raw score is 1 standard deviation above the mean.
A student who scores a 70 on the WAIS-IV has achieved a standard (z) score of -2
(z = (70 - 100)/15 = -2). That is, his raw score is 2 standard deviations below the mean. You might
know that an IQ score of 70 is one of the criteria for the DSM-IV diagnosis of mental retardation --
this score is not completely arbitrary; 2 standard deviations below the mean is a significant
deficit compared to the rest of the population. What does significant mean? Here's where the
probabilistic characteristics of the Normal Distribution come into play.
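The z formula above is one line of code; here it is applied to both WAIS-IV examples (an illustrative sketch; the function name `z_score` is my own):

```python
def z_score(y, mu, sigma):
    """z = (Y - mu) / sigma: how many SDs the raw score lies from the mean."""
    return (y - mu) / sigma

# The WAIS-IV is normed so that mu = 100 and sigma = 15.
print(z_score(115, 100, 15))  # 1.0
print(z_score(70, 100, 15))   # -2.0
```

The same function works for any normally distributed variable; only the mean and standard deviation change, which is exactly the apples-to-oranges point above.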
The percentage of the population scoring between the
mean and 1 standard deviation above the mean is about
34% (with rounding), and because the distribution is
symmetrical, the percentage of the population scoring
between the mean and 1 standard deviation below the
mean is also about 34%.
You get another 13% of the population who score
between 1 and 2 standard deviations above the mean
(and, of course, another 13% who score between 1 and 2
standard deviations below the mean).
Finally, only about 2% of the population score between 2
and 3 standard deviations above the mean; another 2%
are on the opposite side of the distribution -- between 2
and 3 standard deviations below the mean.
Returning to the student who scored 70 on the WAIS-IV, we can now see that scoring 2
standard deviations below the mean puts him in the 2nd percentile intellectually: 50% of the
population are above the mean, another 34% are between the mean and 1 standard
deviation below the mean, and another 13% are between 1 and 2 standard deviations below
the mean -- add it all up and you get about 98% above his score (there's some rounding error
involved here) -- so his score puts him in the 2nd percentile.
Another way to think about the probabilities associated with the Normal Distribution is in
terms of the percentage of the population who fall on either side of the mean. This "68 - 95 -
99" rule breaks down this way: 68% of a population falls within +/- 1 standard deviation of the
mean; 95% of the population falls within +/- 2 standard deviations of the mean; and 99% fall
within +/- 3 standard deviations of the mean. It's a very rare person who scores more than 3
standard deviations from the mean -- again, no matter what variable is being measured!
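These percentages can be checked straight from Python's standard library (an illustrative sketch; note that the more precise figure for +/- 3 SD is 99.7%, which the rule rounds down to 99):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
# Area under the curve between -k and +k standard deviations:
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, prob in within.items():
    print(f"within +/- {k} SD: {prob * 100:.1f}%")
# prints 68.3%, 95.4%, and 99.7% for 1, 2, and 3 SDs
```

Because "area under the curve," "proportion," and "percentage" are all synonymous, subtracting the two cumulative probabilities gives the percentage of the population between the two z scores.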
Group Averages
Psychology studies do not usually involve individuals -- at the very least studies involve a
group – a one-group study. So, this is the point where we are introduced to a distribution that
links the Sample (our data) to the Population (our interest). Called a “Sampling Distribution”,
it differs from the Population and Sample Distributions in that it is not a distribution of
individual scores – instead it is built on means of scores.
                      Population          Sampling            Sample
                      Distribution        Distribution        Distribution
Conceptual Status     Theoretical         Theoretical         Empirical
Data Unit             Individual Score    Group Statistic     Individual Score
Mean                  μ                   μ_Ȳ                 Ȳ
Variance              σ²                  σ²_Ȳ                s²
Standard Deviation    σ                   σ_Ȳ                 s
Basis                 Infinite            Infinite            Sample (N)
So, for example, if we want to calculate the standard score using our previous z formula, then

z = (Y - μ) / σ    becomes    z = (Ȳ - μ_Ȳ) / σ_Ȳ

That is, a standard score is always calculated as a deviation between a score and its mean,
divided by its standard deviation. This is true both when comparing an individual score with the
mean of its population and when comparing the mean of a bunch of individual scores with the
mean of its sampling distribution:

z = (Y - μ) / σ    and    z = (Ȳ - μ_Ȳ) / σ_Ȳ
Central Limit Theorem:
Combined with the Law of Large Numbers, this theorem is the link between our sample data and
the population distribution we wish to estimate (and make decisions about). If sample data are
collected (the study conducted) over and over again and with larger and larger sample sizes (N),
then:
1. μ_Ȳ = μ
2. σ²_Ȳ = σ² / N
3. σ_Ȳ = σ / √N
4. the resulting sampling distribution will be normally distributed (even if the parent
population distribution is not)