The Normal Curve

Theoretical
Symmetrical
Known areas for each standard deviation (s) or Z-score

FOR EACH SIDE:
34.13% of scores in the distribution are between the mean and 1 s from the mean
13.59% of scores are between 1 and 2 s from the mean
2.28% of scores are more than 2 s from the mean

Z SCORE FORMULA

Z = (Xi - X̄) / s

where Xi is an individual score, X̄ is the mean, and s is the standard deviation.

Xi = 120, X̄ = 100, s = 10: Z = (120 - 100) / 10 = +2.00
Xi = 80, s = 10: Z = (80 - 100) / 10 = -2.00
Xi = 112, s = 10: Z = (112 - 100) / 10 = +1.20

The point is to convert your particular metric (e.g., height, IQ scores) into the metric of the normal curve (Z-scores). If all of your values are converted to Z-scores, the resulting distribution has a mean of zero and a standard deviation of one.

4 More Sample Problems

For a sample of 150 U.S. cities, the mean poverty rate (per 100) is 12.5 with a standard deviation of 4.0. The distribution is approximately normal. Based on the above information:

1. What percent of cities had a poverty rate of more than 8.5 per 100?
2. What percent of cities had a rate between 13.0 and 16.5?
3. What percent of cities had a rate between 10.5 and 14.3?
4. What percent of cities had a rate between 8.5 and 10.5?

First Two Answers

What percent of cities had a poverty rate of more than 8.5 per 100?
Z = (8.5 - 12.5) / 4 = -1.0
.3413 + .5 = .8413 = 84.13%

What percent of cities had a rate between 13.0 and 16.5?
Z = (13.0 - 12.5) / 4 = .125 (table area for Z = .12 is .0478)
Z = (16.5 - 12.5) / 4 = 1.0 (table area is .3413)
.3413 - .0478 = .2935 = 29.35%

The Rest of the Answers

What percent of cities had a rate between 10.5 and 14.3?
Z = (10.5 - 12.5) / 4 = -0.5
Z = (14.3 - 12.5) / 4 = .45
.1915 + .1736 = .3651 = 36.51%

What percent of cities had a rate between 8.5 and 10.5?
Z = (10.5 - 12.5) / 4 = -0.5
Z = (8.5 - 12.5) / 4 = -1.0
Column C: .3085 - .1587 = .1498 = 14.98%
...OR...
Column B: .3413 - .1915 = .1498 = 14.98%

Probability & the Normal Distribution

THE NORMAL DISTRIBUTION as a PROBABILITY Distribution
We can use the normal curve to estimate the probability of randomly selecting a case between 2 scores.
Probability distribution: the theoretical distribution of all events in a population of events, with the relative frequency of each event.

[Figure: Normal Curve, Mean = .5, SD = .7]

PROBABILITY & THE NORMAL DISTRIBUTION

The probability of a particular outcome is the proportion of times that outcome would occur in a long run of repeated observations.

[Figure: Normal Curve, Mean = .5, SD = .7]

PROBABILITY & THE NORMAL DISTRIBUTION

p(next male being 66-74" tall) = (# that tall) / (# who approach) = 68 / 100 = 0.68

Probability & the Normal Distribution

Another example: Suppose the mean score on a test is 80, with a standard deviation of 7. If we randomly sample one score from the population, what is the probability that it will be 89 or higher?
Z for 89 = (89 - 80) / 7 = 9 / 7 = 1.29
Area in tail for Z of 1.29 = .0985
P(X ≥ 89) = .0985 or 9.85%

Probability & the Normal Distribution

Bottom line:
The normal distribution can also be thought of as a probability distribution.
Probabilities always range from 0 to 1.

Probability

What is the probability of picking a red marble out of a bowl with 2 red and 8 green?
There are 2 outcomes that are red.
There are 10 possible outcomes.
p(red) = 2 divided by 10
p(red) = .20

Frequencies and Probability

The probability of picking a color relates to the frequency of each color in the bowl.
8 green marbles, 2 red marbles, 10 total
p(Green) = .8
p(Red) = .2

Frequencies & Probability

What is the probability of randomly selecting an individual who is extremely liberal from this sample?
p(extremely liberal) = 32 / 1,319 = .024 (or 2.4%)

THINK OF SELF AS LIBERAL OR CONSERVATIVE

                                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1 EXTREMELY LIBERAL             32       2.3             2.4                  2.4
         2 LIBERAL                      171      12.3            13.0                 15.4
         3 SLIGHTLY LIBERAL             186      13.4            14.1                 29.5
         4 MODERATE                     486      35.0            36.8                 66.3
         5 SLGHTLY CONSERVATIVE         205      14.8            15.5                 81.9
         6 CONSERVATIVE                 198      14.3            15.0                 96.9
         7 EXTRMLY CONSERVATIVE          41       3.0             3.1                100.0
         Total                         1319      95.1           100.0
Missing  8 DK                            62       4.5
         9 NA                             6        .4
         Total                           68       4.9
Total                                  1387     100.0

Inferential Statistics (intro)

Inferential statistics are used to generalize from a sample to a population.
We seek knowledge about a whole class of similar individuals, objects or events (called a POPULATION).
We observe some of these (called a SAMPLE).
We extend (generalize) our findings to the entire class.

WHY SAMPLE?

It's often not possible to collect information on all individuals you wish to study.
Even if possible, it might not be feasible (e.g., because of time, $, size of group).

WHY USE PROBABILITY SAMPLING?

Representative sample: one that, in the aggregate, closely approximates the population from which it is drawn.

PROBABILITY SAMPLING

Samples selected in accord with probability theory, typically involving some random selection mechanism.
If everyone in the population has an equal chance of being selected, it is likely that those who are selected will be representative of the whole group.
EPSEM: Equal Probability of SElection Method
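
EPSEM in code

As a rough illustration of equal-probability selection, the sketch below draws a simple random sample without replacement from a made-up population and compares the sample mean with the population mean. The helper name epsem_sample, the simulated population (10,000 scores with mean 80 and standard deviation 7, borrowed from the test-score example), the sample size of 500, and the seeds are all assumptions made for illustration; none of them come from the slides.

```python
import random
import statistics


def epsem_sample(population, n, seed=None):
    """Draw a simple random sample of size n without replacement.

    Every member of the population has the same chance of being selected
    (hypothetical helper, not from the slides).
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: 10,000 simulated test scores, roughly normal
# with mean 80 and standard deviation 7 (assumed values for illustration).
_rng = random.Random(0)
population = [_rng.gauss(80, 7) for _ in range(10_000)]

# Draw a sample of 500 cases, each equally likely to be picked.
sample = epsem_sample(population, 500, seed=1)

# With equal-probability selection the sample mean should land close to
# the population mean, which is what "representative" means in practice.
print("Population mean:", round(statistics.mean(population), 2))
print("Sample mean:    ", round(statistics.mean(sample), 2))
```

Rerunning with different seeds gives slightly different sample means that cluster around the population mean, which is the sense in which an EPSEM sample is likely to be representative.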