The Normal Curve
June 20, 2006
Bryan T. Karazsia, M.A.

Overview
• Hand-in homework
• Why are distributions so important (particularly the normal distribution)?
• What is the normal distribution?
• Z-scores
• Using z-scores
• Setting probable limits

Distributions
What's so important about distributions?
• They link the shape of the distribution for a given behavior to the probability of that behavior occurring
• We can translate area under the curve into a probability
• Doing so helps us describe behavior better. Ex.: What's the probability of _____?
• We can find the area under the curve for any portion of the distribution and convert it to a probability

Distributions (cont.)
What's so important about distributions?
• Areas under a curve are additive, so we can combine 2+ areas into one single, larger area
• We can combine any areas under the curve with addition
• In the figure, the areas shaded in BLUE add together to 68.2%

Normal Distribution
Characteristics of a normal distribution:
• Symmetric
• Unimodal
• Tails extend to infinity in both directions
• Mathematically exact: we can find the percentage of area between any 2 points on the curve (it has actually already been done for us: Table E.10)
• 50-34-14 Rule: roughly 50% of scores fall on each side of the mean, 34% between the mean and 1 SD, and 14% between 1 and 2 SDs

Normal Distribution
Why so important?
• We usually assume that our DVs are normally distributed
• Many phenomena are normally distributed (or close to it) in large populations
• Many common statistical tests assume normality
• Relatively easy to work with: we can look up probabilities easily (in tables!)
• Note: probabilities can also be found for non-normal distributions, but that would require more extensive mathematics (calculus)

Normal Distribution
• When we calculate probabilities for any region under the curve, our results depend on the curve's mean and SD
• Why?
→ Because the area under the curve changes when the mean or SD changes
• So, any time these attributes change, we need to calculate new probabilities…

Standard Normal Distribution
• To make life easier, we base everything on one single normal distribution with predetermined attributes: µ = 0, σ = 1
• Real world: we very rarely (if ever) have a real-world distribution that matches this standard normal distribution exactly
• BUT we can mathematically manipulate, or transform, our data…

Enter math… but remember why!
• Key = concepts… the math will fall into place
• What are we doing again???

Z-scores
How do we transform data?
• Convert our sample distribution such that mean = 0 and SD = 1:
  1. Subtract the mean of the sample from EACH data point
  2. Divide each resulting difference by the SD of the sample
• When we do this, we are converting raw scores to standardized scores (a.k.a. z-scores)

Z-scores
How do we transform data?
Formula: z-score for any particular observation (score, x)

    z = (x − µ) / σ

• x = individual score
• µ = population mean
• σ = population SD
• z = standardized score

Z-scores
Example data, with µx = 3.20 and σx = 1.79:

Part       1       2       3       4       5       Σ
X          2       1       3       5       5      16
Y          1       4       8       9       7      29
(X − µ)   −1.2    −2.2    −0.2     1.8     1.8
z         −0.67   −1.23   −0.11    1.01    1.01

Worked example for the first X score:

    z1 = (x − µ) / σ
    z1 = (2 − 3.20) / 1.79
    z1 = −1.2 / 1.79
    z1 = −.67

Z-scores
World Cup Data (hypothetical)
[Figure: distribution of the ORIGINAL DATA next to the distribution of the Z-SCORES]

Z-scores
Why are the distributions the same???
• The z-score transformation is a linear transformation: add, subtract, multiply, and/or divide by a constant
• The relationship among values does NOT change (the shape of the distribution remains unaffected)
• Common example: the distance from KSU to Panama City, FL is 985 miles or 62,409,600 inches

Z-scores
Common misconception: converting raw scores to z-scores will normalize the data (will give us a normal curve)
• FALSE
• The shape of the distribution remains unaffected because the relationships among values do not change

Quick Review
• What have we accomplished by converting raw scores to z-scores???
• What good is it?
(GRE example)
• If you still don't think they are very helpful, just hang on…

Normal Distributions (cont.)
Link to probabilities (example):
• Suppose we want to know the probability that a random person will score more than 1 standard deviation above the mean (above 600) on the GRE Verbal
• Stated differently: what is the area under the normal curve that lies more than 1 SD above the mean?
• Notes:
  • Standard normal curve: total area = 1.0 (100% of observations)
  • We are assuming GRE scores are normally distributed
• We already know that z-scores represent SDs from the mean, so we want to find the area above z = 1
• Here is where we use Table E.10
  • Note: only the positive half of the normal distribution is in the table (z > 0)
  • We want z = 1

Normal Distributions (cont.)
Table E.10 (Normal Distribution; z). For z = 1 we see…
• "Mean to z" = .3413 (area between the mean and z = 1)
• "Larger portion" = .8413 (area below z = 1)
• "Smaller portion" = .1587 (area above z = 1)
• So the probability of scoring more than 1 SD above the mean is .1587

Normal Distributions (cont.)
• What if we wanted to know whether the random person will score > 1 SD from the mean in either direction? Add the two tails: .1587 + .1587 = .3174
• We can now also calculate the area (probability) between z = −1 and z = +1: .3413 + .3413 = .6826

Normal Distributions (cont.)
Let's go one step further…
• What is the probability that a person will fall between 1.5 (650) and 2.5 (750) SDs above the mean on the GRE Verbal?
• When doing such calculations, it is usually wise to draw the diagram (it will eliminate many errors)

Normal Distributions (cont.)
From Table E.10, for the area between 1.5 (650) and 2.5 (750) SDs above the mean…
• z = 1.5 → "mean to z" = .4332
• z = 2.5 → "mean to z" = .4938
• We want the difference between these: .4938 − .4332 = .0606 (.06)
• The probability of falling between z = 1.5 and z = 2.5 is .06

Normal Distributions (cont.)
Setting probable limits on observations:
• "If I draw an adult at random, 95% of the time his/her score will lie between ____ & ____"
• To answer this question, we need to find the corresponding z-scores…
• Remember: the table is only ½ of the distribution, so we look for the z whose "mean to z" area is .4750 (half of .95), which is z = 1.96
• So, 95% of the time, any random individual's score will fall between −1.96 and +1.96 SDs from the mean
• ???

Normal Distributions (cont.)
Setting probable limits on observations:
• To make these limits meaningful, we need to convert the z-scores back to raw scores (here with µ = 500 and σ = 100)…

    x = µ ± (z × σ)
    x = 500 ± (1.96 × 100)
    x = 500 ± 196
    x = 304 and 696

Normal Distributions (cont.)
Is anything we did today useful???
• We see standard scores all the time: IQ, GREs, SATs, etc.
• They all use standard scores and then perform another linear transformation to derive the scores we hear so often (IQ = 102, SAT = 1600, GRE Verbal = 520)
• All easy, too: just convert raw scores to z-scores, then you can convert to whatever system you like…

Review
• What is variability?
• Measures of variability
  • Range
  • Variance
  • Standard deviation
• Definitional vs. computational formulae
• Practice before next class!!!
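
The calculations in this lecture (standardizing scores, finding areas under the standard normal curve, and setting probable limits) can be cross-checked in code. What follows is a minimal Python sketch, not part of the original slides. It assumes Python 3.8+ for `statistics.NormalDist`, and the helper names `z_scores`, `area_between`, and `probable_limits` are hypothetical. It also assumes that the SD of 1.79 in the z-score example is the sample SD (N − 1 in the denominator), since that is the version that reproduces 1.79 for the X scores.

```python
from statistics import NormalDist, mean, stdev

# Standard normal distribution: mean 0, SD 1.
STANDARD_NORMAL = NormalDist(mu=0, sigma=1)


def z_scores(scores):
    """Standardize raw scores: subtract the mean, then divide by the SD."""
    m = mean(scores)
    s = stdev(scores)  # assumption: sample SD (N - 1 divisor), giving 1.79 for X
    return [(x - m) / s for x in scores]


def area_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    return STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)


def probable_limits(mu, sigma, coverage=0.95):
    """Raw-score limits that contain the middle `coverage` of a normal distribution."""
    z = STANDARD_NORMAL.inv_cdf(0.5 + coverage / 2)  # about 1.96 for 95%
    return mu - z * sigma, mu + z * sigma


if __name__ == "__main__":
    x = [2, 1, 3, 5, 5]
    print([round(z, 2) for z in z_scores(x)])      # [-0.67, -1.23, -0.11, 1.01, 1.01]

    # "Smaller portion" for z = 1 (area above z = 1).
    print(round(1 - STANDARD_NORMAL.cdf(1.0), 4))  # 0.1587

    # Area between z = 1.5 and z = 2.5.
    print(round(area_between(1.5, 2.5), 4))        # 0.0606

    # 95% probable limits with mean 500 and SD 100.
    low, high = probable_limits(500, 100)
    print(round(low), round(high))                 # 304 696
```

Because the code works with unrounded areas, some results can differ from table arithmetic in the last digit. For example, `area_between(-1.0, 1.0)` rounds to 0.6827, while doubling the table's rounded "mean to z" value gives .6826.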