Pepperdine Psych 626: Probability, Normal Distributions, & Sampling Distributions -- Dr. Mascolo

Probability

Sir Francis Galton's "Quincunx Machine": A ball tumbles down the machine, hitting a pin at each level (an "event") and has a 50/50 chance of bouncing to the right (one "outcome") or to the left (another "outcome") -- so each event leads to 2 random outcomes with equal probabilities (50/50). Therefore, the ball's path through the machine is also random, because it is made up of a series of events with random outcomes. At the end of its fall, the ball ends up in one of the bins, but its final resting place is not random -- in fact, it is highly predictable! 2 principles are at work:

1. Law of Large Numbers -- For events with predictable outcomes (i.e., their probabilities can be specified), the more times the event is conducted, the closer the actual outcomes approach the predicted outcomes; e.g., sending more and more balls into the machine OR having more and more rows of pins for the balls to fall through.

2. Central Limit Theorem -- together with the Law of Large Numbers, the parameters and shape of the outcome probability distribution can be derived.

These principles and their importance in science may seem esoteric and indecipherable, but we will see how they impact our daily lives.

Here's a Quincunx Machine with not 5 but 10 levels; in lecture I'll point out details, but for now, what can you see in the differences across these figures? Now here's a Quincunx Machine with not 10 but 30 levels; again, what can you see in the differences across these figures?
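The quincunx is easy to simulate, and doing so shows the Law of Large Numbers at work. This is a minimal sketch (the function name and parameters are mine, not part of the lecture): each pin flip is a 50/50 outcome, and a ball's bin is just the count of its rightward bounces.

```python
import random

def quincunx(levels, n_balls, seed=0):
    """Drop n_balls through a quincunx with the given number of pin
    levels; each pin sends the ball right with probability 0.5.
    Returns bin counts (bin index = number of rightward bounces)."""
    rng = random.Random(seed)
    bins = [0] * (levels + 1)
    for _ in range(n_balls):
        rights = sum(rng.random() < 0.5 for _ in range(levels))
        bins[rights] += 1
    return bins

counts = quincunx(levels=10, n_balls=100_000)
# The middle bins collect far more balls than the edge bins,
# approximating the familiar bell shape.
print(counts)
```

Try smaller values of `n_balls` to see how ragged the shape is with few trials, and how it smooths out as trials increase.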
Probability applied to games of chance -- cards & dice:

DECK OF CARDS

           BLACK                RED
           Clubs    Spades      Diamonds   Hearts
           K        K           K          K
           Q        Q           Q          Q
           J        J           J          J
           10       10          10         10
           9        9           9          9
           8        8           8          8
           7        7           7          7
           6        6           6          6
           5        5           5          5
           4        4           4          4
           3        3           3          3
           2        2           2          2
           A        A           A          A
           -------------------------------------
  Total:   13       13          13         13     = 52

OUTCOMES OF ROLLING 2 DICE

  Sum   Combinations                              # of ways
  12    (6,6)                                         1
  11    (6,5) (5,6)                                   2
  10    (6,4) (4,6) (5,5)                             3
   9    (6,3) (3,6) (5,4) (4,5)                       4
   8    (6,2) (2,6) (5,3) (3,5) (4,4)                 5
   7    (6,1) (1,6) (5,2) (2,5) (3,4) (4,3)           6
   6    (5,1) (1,5) (4,2) (2,4) (3,3)                 5
   5    (4,1) (1,4) (3,2) (2,3)                       4
   4    (3,1) (1,3) (2,2)                             3
   3    (2,1) (1,2)                                   2
   2    (1,1)                                         1

Probability Concepts:

Event -- an action with a specific set of outcomes (e.g., rolling 2 dice, flipping a coin, picking a card from a deck)

Outcome -- the result of an event (e.g., rolling a 7, flipping a Heads, drawing a King of Hearts)

Mutually Exclusive Outcomes -- cannot co-occur (i.e., cannot both happen at once). Mathematical definition of Mutually Exclusive: P(A and B) = 0
Examples of Mutually Exclusive Outcomes:
  On a roll of 2 dice: 6 or 7
  On the flip of a coin: Heads or Tails
  On the draw of a card from a deck: King or Queen
Example of not mutually exclusive:
  On the draw of a card from a deck: a King or a Heart (the King of Hearts satisfies both outcomes)

Exhaustive -- no other possible outcomes (e.g., on the flip of a coin: Heads or Tails, no 3rd possibility)

Mutually Exclusive & Exhaustive: If outcomes are both Mutually Exclusive & Exhaustive, then:
  P(A) + P(B) = 1.00, so P(B) = 1.00 - P(A) and P(A) = 1.00 - P(B)

Independent -- the outcome of 1 event does not change the probability of the outcome for a 2nd event, i.e.: P(B/A) = P(B)

Gambler's Fallacy -- belief that independent outcomes are actually dependent.
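The 2-dice table above can be generated by brute-force enumeration of the 36 equally likely outcomes; this short sketch reproduces the "# of ways" column:

```python
from collections import Counter
from itertools import product

# Enumerate all 36 equally likely outcomes of rolling 2 dice
# and count how many ways each sum can occur.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(f"Sum {total:2d}: {ways[total]} way(s), "
          f"P = {ways[total]}/36 = {ways[total] / 36:.4f}")
```

Note that 7, with 6 ways, is the most probable sum, and the counts fall off symmetrically toward 2 and 12.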
Basic Calculation of Probability:

  P(A) = (# of outcomes that "favor" A) / (Total # of possible outcomes)

  (e.g., Outcome A: rolling 2 dice totaling 10 or higher; P(A) = 6/36 = 1/6)

Mathematical expressions of probability: "Area Under the Curve" = "Proportion" = "Percentage" -- all synonymous. Probabilities range from 0.00 (0%, "Impossible") to 1.00 (100%, "Sure thing").

Basic Probability Calculations

Addition Rule:
  P(A or B) = P(A) + P(B)                  (if mutually exclusive)
  P(A or B) = P(A) + P(B) - P(A and B)     (if not mutually exclusive)

Multiplication Rule:
  P(A and B) = P(A) x P(B)                 (if independent)
  P(A and B) = P(A) x P(B/A)               (if not independent)

Working examples of probability calculations:

Addition Rule:

Ex 1: (outcomes are mutually exclusive) Pick a card from a randomly shuffled deck. What is the probability the card is either a King or a Jack? That is: Outcome A = King, Outcome B = Jack. Ex 1 asks for the probability of one outcome OR another, so the Addition Rule applies:

  P(A or B) = P(K or J) = P(K) + P(J) = 4/52 + 4/52 = 8/52

Ex 2: (outcomes are not mutually exclusive) Pick a card from a randomly shuffled deck. What is the probability the card is either a King or a Heart? That is: Outcome A = King, Outcome B = Heart. Once again, the Addition Rule applies:

  P(A or B) = P(K or H) = P(K) + P(H) = 4/52 + 13/52 = 17/52 (??)
No -- when outcomes are not mutually exclusive, the Addition Rule adds a "correction term" (see above), so:

  P(K or H) = P(K) + P(H) - P(K and H) = 4/52 + 13/52 - 1/52 = 16/52

Application of Addition Rule: In poker it is considered bad strategy to draw to (try to fill) an inside straight. E.g., dealt 3, 4, 6, 7, K -- discard the King, hope for a 5:

  P(5) = 4/47 = 0.0851

Compare to: dealt 3, 4, 5, 6, K -- discard the King, hope for a 2 or a 7:

  P(2 or 7) = 4/47 + 4/47 = 8/47 = 0.1702 (double the odds -- a much better bet)

Multiplication Rule:

Ex 1: (outcomes are independent) Pick a card, examine it, return it to the deck, shuffle thoroughly, pick a card again, examine it. What is the probability that the 1st card is a King and that the 2nd card is also a King? That is: Outcome A = King, Outcome B = King. Ex 1 asks for the probability of one outcome AND another, so the Multiplication Rule applies. Also, because Ex 1 includes sampling with replacement, the 2 outcomes are independent, so this version of the Multiplication Rule is used:

  P(A and B) = P(A) x P(B)

And so: P(A and B) = P(K and K) = P(K) x P(K) = 4/52 x 4/52 = 0.00592

Ex 2: (outcomes are not independent) Pick a card, examine it, and now pick a second card. What is the probability that the 1st card is a King and that the 2nd card is also a King? That is: Outcome A = King, Outcome B = King. Ex 2 asks for the probability of one outcome AND another, so the Multiplication Rule applies.
However, in this example the first card is not returned to the deck and the deck reshuffled, so the first outcome does change the probability of the second outcome -- that is, they are not independent, so this version of the Multiplication Rule is used:

  P(A and B) = P(A) x P(B/A)

So: P(K and K) = P(A) x P(B/A) = 4/52 x 3/51 = 0.0045

Note: P(B/A), i.e., the probability of B given that A has occurred, = 3/51 because the first outcome was a King, so now there are only 3 Kings left in the deck, which now includes only 51 cards.

Application of Multiplication Rule: California Lottery -- In lecture I will explain this: go ahead and play, but you won't win!

Sir Francis Galton demonstrated that random events lead to surprisingly orderly outcomes. Many variables in nature -- including those of living organisms -- can be described by the Normal Distribution (i.e., the "Bell-Shaped Curve"). Just like Vegas casinos, we gain confidence in this phenomenon as the number of trials increases (i.e., we benefit from the Law of Large Numbers). This is important because Normal Distributions have specifiable probabilities -- no matter what response variable is being collected. So, the peak of a Normal Distribution is also where the mean can be found, and there are known percentages of scores that deviate both above and below the mean. We could express these deviations in terms of raw scores (the actual response variable), but we instead express these deviations in terms of standard deviations -- that way we don't have to worry about the particular response variable score that was collected in the study; we just use a measure of how much the raw score deviates from the mean -- that is, the standard deviation we learned to calculate earlier in the semester.
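The without-replacement Multiplication Rule calculation above (4/52 x 3/51) can be checked by simulation; this is a minimal sketch using only Python's standard library (the function name and trial count are my own choices):

```python
import random

def draw_two_kings(trials=200_000, seed=1):
    """Estimate P(1st card is a King AND 2nd card is a King) when
    drawing 2 cards without replacement from a 52-card deck."""
    rng = random.Random(seed)
    deck = ["K"] * 4 + ["x"] * 48   # only the rank "King" matters here
    hits = 0
    for _ in range(trials):
        a, b = rng.sample(deck, 2)  # sample WITHOUT replacement
        hits += (a == "K" and b == "K")
    return hits / trials

estimate = draw_two_kings()
exact = (4 / 52) * (3 / 51)         # Multiplication Rule, dependent case
print(f"simulated {estimate:.4f} vs exact {exact:.4f}")
```

With enough trials the simulated proportion converges on 0.0045 -- the Law of Large Numbers again.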
In this way we contradict the idea that "you can't compare apples to oranges" -- when the raw scores of 2 or more variables are expressed in terms of how many standard deviations they differ from their means, we can absolutely compare them.

A quick example may help clarify this abstract point. If you took some aptitude tests in high school, you may have been told you should consider a career in art instead of one in physics. This advice would be based on your test scores -- for example, showing that you scored 3 standard deviations above the mean in art but 2 standard deviations below the mean in physics. In other words, you compare much more favorably with your peers in art than you do in physics -- in particular, you are in the "gifted" range in art but are well below average in physics.

We can transform individual raw scores into "standard scores" ("z scores") by subtracting the mean from the raw score and then dividing by the standard deviation:

  z = (Y - µ) / σ    (when comparing an individual's score with a population distribution)

So here's how this all turns out for a Normally Distributed variable -- in this case it's intelligence as measured on the WAIS-IV, which is normed so that the population mean = 100 and the standard deviation = 15. This figure depicts this distribution:

[Figure: Normal curve of WAIS-IV scores; raw scores 40 to 160 marked at 15-point intervals, with the corresponding z scores beneath them.]

Here are some examples:

A student who scores 115 on the WAIS-IV has achieved a standard (z) score of +1 (z = (115 - 100)/15 = 15/15 = +1). That is, his raw score is 1 standard deviation above the mean.

A student who scores a 70 on the WAIS-IV has achieved a standard (z) score of -2 (z = (70 - 100)/15 = -30/15 = -2). That is, his raw score is 2 standard deviations below the mean.
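The z transformation is one line of arithmetic; this sketch reproduces the two WAIS-IV examples above:

```python
def z_score(raw, mean, sd):
    """Standard (z) score: how many standard deviations a raw
    score falls above (+) or below (-) the mean."""
    return (raw - mean) / sd

# WAIS-IV is normed with mean = 100 and standard deviation = 15
print(z_score(115, 100, 15))   # -> 1.0
print(z_score(70, 100, 15))    # -> -2.0
```

Because z scores are unit-free, the same function works for any response variable -- this is exactly the "apples to oranges" point above.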
You might know that an IQ score of 70 is one of the criteria for the DSM-IV diagnosis of mental retardation -- this score is not completely arbitrary; 2 standard deviations below the mean is a significant deficit compared to the rest of the population.

What does significant mean? Here's where the probabilistic characteristics of the Normal Distribution come into play. The percentage of the population scoring between the mean and 1 standard deviation above the mean is about 34% (with rounding), and because the distribution is symmetrical, the percentage of the population scoring between the mean and 1 standard deviation below the mean is also about 34%. You get another 13% of the population who score between 1 and 2 standard deviations above the mean (and, of course, another 13% who score between 1 and 2 standard deviations below the mean). Finally, only about 2% of the population score between 2 and 3 standard deviations above the mean; another 2% are on the opposite side of the distribution -- between 2 and 3 standard deviations below the mean.

Returning to the student who scored 70 on the WAIS-IV, we can now see that scoring 2 standard deviations below the mean puts him in the 2nd percentile intellectually: 50% of the population are above the mean, another 34% are between the mean and 1 standard deviation below the mean, and another 13% are between 1 and 2 standard deviations below the mean -- add it all up and you get about 98% above his score (there's some rounding error involved here) -- so his score puts him in the 2nd percentile.

Another way to think about the probabilities associated with the Normal Distribution is in terms of the percentage of the population who fall within a given number of standard deviations of the mean.
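The rounded percentages above come from the standard normal cumulative distribution, which can be written with the error function from Python's math module; this sketch checks the 34% band and the 2nd-percentile claim:

```python
from math import erf, sqrt

def percentile_below(z):
    """Proportion of a standard normal population falling below z
    (the cumulative distribution, expressed via the error function)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# About 34% of the population lies between the mean (z = 0) and z = +1:
print(round(100 * (percentile_below(1) - percentile_below(0)), 1))
# A WAIS-IV score of 70 (z = -2) sits near the 2nd percentile:
print(round(100 * percentile_below(-2), 1))
```

The exact values (34.1% and 2.3%) show where the "rounding error" mentioned above comes from.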
This "68-95-99" rule breaks down this way: 68% of a population falls within +/- 1 standard deviation of the mean; 95% of the population falls within +/- 2 standard deviations of the mean; and 99% fall within +/- 3 standard deviations of the mean. It's a very rare person who scores more than 3 standard deviations from the mean -- again, no matter what variable is being measured!

Group Averages

Psychology studies do not usually involve individuals -- at the very least, studies involve a group -- a one-group study. So this is the point where we are introduced to a distribution that links the Sample (our data) to the Population (our interest). Called a "Sampling Distribution," it differs from the Population and Sample Distributions in that it is not a distribution of individual scores -- instead it is built on means of scores.

                        Population          Sampling           Sample
                        Distribution        Distribution       Distribution
  Conceptual Status     Theoretical         Theoretical        Empirical
  Data Unit             Individual Score    Group Statistic    Individual Score
  Mean                  µ                   µ_Ȳ                Ȳ
  Variance              σ²                  σ²_Ȳ               s²
  Standard Deviation    σ                   σ_Ȳ                s
  Basis (N)             Infinite            Infinite           Sample (N)

So, for example, if we want to calculate the standard score using our previous z formula, then

  z = (Y - µ) / σ    becomes    z = (Ȳ - µ_Ȳ) / σ_Ȳ

That is, a standard score is always calculated as a deviation between a score and its mean, divided by its standard deviation. This is true when comparing an individual score with the mean of its population and when comparing the mean of a bunch of individual scores with the mean of its sampling distribution:

  z = (Y - µ) / σ    and    z = (Ȳ - µ_Ȳ) / σ_Ȳ

Central Limit Theorem: Combined with the Law of Large Numbers, this is the link between our sample data and the population distribution we wish to estimate (and make decisions about). If sample data are collected (the study conducted) over and over again and with larger and larger sample sizes (N), then:

1. µ_Ȳ = µ
2. σ²_Ȳ = σ²/N
3. σ_Ȳ = σ/√N
4. the resulting sampling distribution will be normally distributed (even if the parent population distribution is not).
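All four claims can be illustrated by simulation. This sketch (my own example, not from the lecture) uses a fair 6-sided die as a decidedly non-normal parent population (uniform, with µ = 3.5 and σ² = 35/12) and builds the sampling distribution of the mean for N = 30:

```python
import random
import statistics

def sampling_distribution_of_mean(n, n_samples=20_000, seed=2):
    """Draw many samples of size n from a non-normal parent
    population (a fair 6-sided die) and return the sample means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.randint(1, 6) for _ in range(n))
            for _ in range(n_samples)]

means = sampling_distribution_of_mean(n=30)
print(round(statistics.mean(means), 2))        # close to µ = 3.5
print(round(statistics.variance(means), 3))    # close to σ²/N = (35/12)/30
```

The mean of the sample means lands on µ, their variance lands on σ²/N, and a histogram of `means` is bell-shaped even though the die itself is flat -- the Central Limit Theorem at work.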