Statistics lecture 3
Bell-Shaped Curves and Other Shapes

Goals for lecture 3
- Realize many measurements in nature follow a bell-shaped (“normal”) curve
- Understand and learn to compute a standardized score
- Learn to find the proportion of the population that falls into a given range
- Memorize the Empirical Rule

Histogram

Bell-Shaped “Normal” Curve

Remember?
- Mean (average): Sum of the values divided by the number of values
- Standard deviation: A measure of how spread out the values are. Think of it as the “average distance” of all values from the mean.

Some Characteristics of a Normal Distribution
- Symmetrical (not skewed)
- One peak in the middle, at the mean
- The wider the curve, the greater the standard deviation
- Area under the curve is 1 (or 100%)

Why it looks like that
With many things in nature, most individuals fall near the average. The farther you move above or below the average, the fewer individuals there are with those extreme values.
Examples: Height, weight, IQ, pulse rate

Bell-shaped wear

Not all curves are “normal”

Normal Curve
If you know these two things:
- The mean
- The standard deviation
...you can figure these things:
- The proportion of individuals who fall into any range of values
- The percentile of any given value
- The value of any given percentile

Percentiles
Your percentile for a particular measure (like height or IQ) is the percentage of the population that falls below you.
In one of my recent classes:
- My height (183 cm): 89th percentile
- My weight (104 kg): 99th percentile
- My age (62): 99th percentile

Standardized Scores
A standardized score (also called the z-score) is simply the number of standard deviations a particular value is above or below the mean.
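
Percentiles and standardized scores both rest on the three normal-curve lookups listed earlier: the proportion in a range, the percentile of a value, and the value at a percentile. The sketch below shows one possible way to compute them with `NormalDist` from Python's standard `statistics` module. The mean of 100 and standard deviation of 15 are illustrative IQ-style values assumed for this example; they do not come from the lecture. The helper function names are likewise hypothetical.

```python
from statistics import NormalDist

# Assumed illustrative parameters (not from the lecture): IQ-style scores
# with mean 100 and standard deviation 15.
scores = NormalDist(mu=100, sigma=15)


def proportion_between(dist, low, high):
    """Proportion of the population with values between low and high."""
    return dist.cdf(high) - dist.cdf(low)


def percentile_of(dist, value):
    """Percentage of the population that falls below value."""
    return 100 * dist.cdf(value)


def value_at_percentile(dist, pct):
    """Value below which pct percent of the population falls."""
    return dist.inv_cdf(pct / 100)


# Proportion within one standard deviation of the mean (85 to 115).
print(round(proportion_between(scores, 85, 115), 3))
# The mean itself sits at the 50th percentile.
print(round(percentile_of(scores, 100), 1))
# Going the other way: which value marks the 90th percentile?
print(round(value_at_percentile(scores, 90), 1))
```

The same three functions work for any measurement that is roughly bell-shaped, as long as its mean and standard deviation are known.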
The standardized score is:
- Positive if above the mean
- Negative if below the mean

Standardized Score Examples
Class height: Mean 170 cm, StdDev 10 cm.
What is the z-score of someone who is:
- 160 cm
- 180 cm
- 175 cm
- 150 cm
- 170 cm
- 145 cm

Calculate z-score for a Particular Value
z-score = (Value - mean) / StdDev
185 cm: (185 - 170) / 10 = 15 / 10 = +1.5
165 cm: (165 - 170) / 10 = -5 / 10 = -0.5
180 cm: (180 - 170) / 10 = 10 / 10 = +1.0

What’s the Point?
- With a z-score or percentile, you can compare unlike things. For instance, I am heavier (99th percentile) than I am tall (89th percentile).
- With a z-score, you can look up the percentile in a table or an online calculator.

The Empirical Rule
For any normal curve, approximately:
- 68% of values fall within one StdDev of the mean
- 95% of values fall within two StdDevs of the mean
- 99.7% of values fall within three StdDevs of the mean

Outlier
A value that is more than three standard deviations above or below the mean.

Apply Empirical Rule to Class Height
Class height: Mean 170 cm, StdDev 10 cm.
- About 68% of the class is between what heights? 160 cm and 180 cm (+/- 10 cm)
- About 95% of the class is between what heights? 150 cm and 190 cm (+/- 20 cm)

Data visualization goals
- See different ways of graphically displaying data.
- Learn the features of a good statistical picture.
- Be able to identify common problems with graphs and plots.
- Learn to read graphs comprehensively.

Why do we turn data into graphics?
- Easier to understand
- Easier to see the trends
- A good graphic will convey the same message you would get if you really studied the data
“Graphics reveal data.” -- Edward Tufte

Two kinds of variables
- Categorical: Data that can be counted in categories, such as gender or race
- Measurement: Data that can be recorded as a number and then put into order, such as IQ, weight, cigarettes smoked per day, etc.
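
Returning to the class height example (mean 170 cm, standard deviation 10 cm), the z-score formula and the Empirical Rule can be sketched in Python as follows. This is a minimal illustration rather than part of the original lecture. The function names `z_score` and `empirical_range` are made up for this example, and `NormalDist` from the standard `statistics` module is used only to compare the rule's rounded percentages with the exact areas under a normal curve.

```python
from statistics import NormalDist

MEAN_CM = 170.0     # class mean height from the lecture example
STD_DEV_CM = 10.0   # class standard deviation from the lecture example


def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


def empirical_range(mean, std_dev, k):
    """Interval of values within k standard deviations of the mean."""
    return (mean - k * std_dev, mean + k * std_dev)


# z-scores for the heights listed on the "Standardized Score Examples" slide.
heights = [160, 180, 175, 150, 170, 145]
z_by_height = {h: z_score(h, MEAN_CM, STD_DEV_CM) for h in heights}
for h, z in z_by_height.items():
    print(f"{h} cm -> z = {z:+.1f}")

# Compare the Empirical Rule's rounded figures with the exact normal areas.
standard_normal = NormalDist()  # mean 0, standard deviation 1
for k in (1, 2, 3):
    low, high = empirical_range(MEAN_CM, STD_DEV_CM, k)
    share = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD: {low:.0f} to {high:.0f} cm, about {share:.1%}")
```

The exact areas come out slightly different from the rule's figures (for example, a little over 95% within two standard deviations), which is why the rule is stated as an approximation.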

Pictures of Categorical Data
Three common types of graphics for categorical data:
- Pie charts
- Bar graphs
- Pictograms

Pie Charts
[Figure: pie chart, Women 37%, Men 63%]
- Good for showing one categorical variable, like gender
- Show the percentage that falls into each category

Bar Graphs
Can show two or more categorical variables simultaneously (for example, height and gender)
[Figure: bar graph of number of students by height in inches (48 to 82), with separate bars for F and M]

Pictograms
[Figure: pictogram of grades A, B, C, D, F]
Height of pictures is used like bars

Pictograms can be misleading
- We tend to focus on the area, rather than just the height
- To be fair, you should keep the width of pictograms the same

Pictures of Measurement Data
Lots of ways to illustrate measurement variables:
- Stemplots and histograms (lecture 2)
- Line graphs (also called fever charts)
- Scatter plots
- Others: Area, radar, doughnut, high-low-close, surface plots, maps, et al.

Stemplots
[Figure: stemplot with stems 19 through 25]

Line Graph (Fever Chart)

Scatter Plot
Good for displaying the relationship between two measurement variables
[Figure: scatter plot of height vs. weight, inches (60 to 80) against pounds (100 to 350), with the label "Doig"]

Difficulties and Disasters
Most common problems:
- No labeling on one or more axes
- Not starting at zero
- Changes in labeling on axes
- Misleading units
- Graphs based on poor information

Checklist for Statistical Pictures
1. Does the message clearly stand out?
2. Is the purpose or title evident?
3. Is a source given for the data?
4. Did the data come from a reliable, believable source?
5. Is everything labeled clearly and unambiguously?
6. Do the axes start at zero?
7. Do the axes maintain a constant scale?
8. Are there breaks in the numbers on the axes that may be easy to miss?
9. Have financial numbers been adjusted for inflation?
10. Is there extraneous information cluttering the picture or misleading the eye?

Questions?