Chapter 3
Statistics for Describing, Exploring, and Comparing Data
3-1 Overview
3-2 Measures of Center
3-3 Measures of Variation
3-4 Measures of Relative Standing
3-5 Exploratory Data Analysis (EDA)

Slide 1

Section 3-1
Overview

Slide 2

Overview
Descriptive Statistics: summarize or describe the important characteristics of a known set of data
Inferential Statistics: use sample data to make inferences (or generalizations) about a population

Slide 3

Section 3-2
Measures of Center

Slide 4

Key Concept
When describing, exploring, and comparing data sets, these characteristics are usually extremely important: center, variation, distribution, outliers, and changes over time.

Slide 5

Definition
Measure of Center: the value at the center or middle of a data set

Slide 6

Definition
Arithmetic Mean (Mean): the measure of center obtained by adding the values and dividing the total by the number of values

Slide 7

Notation
$\sum$ denotes the sum of a set of values.
$x$ is the variable usually used to represent the individual data values.
$n$ represents the number of values in a sample.
$N$ represents the number of values in a population.

Slide 8

Notation
$\bar{x}$ is pronounced ‘x-bar’ and denotes the mean of a set of sample values: $\bar{x} = \dfrac{\sum x}{n}$
$\mu$ is pronounced ‘mu’ and denotes the mean of all values in a population: $\mu = \dfrac{\sum x}{N}$

Slide 9

Definitions
Median: the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude
* often denoted by $\tilde{x}$ (pronounced ‘x-tilde’)
* is not affected by an extreme value

Slide 10

Finding the Median
If the number of values is odd, the median is the number located in the exact middle of the list.
If the number of values is even, the median is found by computing the mean of the two middle numbers.
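
The definitions of the mean and the median can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `sample_mean` and `sample_median` and the data values are made up for this example and do not come from the slides.

```python
def sample_mean(values):
    """Mean: add the values and divide the total by the number of values."""
    return sum(values) / len(values)


def sample_median(values):
    """Median: the middle value of the data arranged in increasing order."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the exact middle of the list.
        return ordered[mid]
    # Even number of values: the mean of the two middle numbers.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data with one extreme value (40).
data = [2, 4, 4, 5, 40]
print(sample_mean(data))    # 11.0 (pulled upward by the extreme value)
print(sample_median(data))  # 4 (not affected by the extreme value)
```

The extreme value changes the mean noticeably while leaving the median where it was, which is the property noted in the definition of the median.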

Slide 11

5.40 1.10 0.42 0.73 0.48 1.10
0.42 0.48 0.73 1.10 1.10 5.40
(in order; even number of values; no exact middle, the middle is shared by two numbers)
(0.73 + 1.10) / 2 = 0.915
MEDIAN is 0.915

5.40 1.10 0.42 0.73 0.48 1.10 0.66
0.42 0.48 0.66 0.73 1.10 1.10 5.40
(in order; odd number of values; exact middle)
MEDIAN is 0.73

Slide 12

Definitions
Mode: the value that occurs most frequently
Mode is not always unique. A data set may be:
* Bimodal
* Multimodal
* No Mode
Mode is the only measure of central tendency that can be used with nominal data

Slide 13

Mode - Examples
a. 5.40 1.10 0.42 0.73 0.48 1.10 (Mode is 1.10)
b. 27 27 27 55 55 55 88 88 99 (Bimodal: 27 & 55)
c. 1 2 3 6 7 8 9 10 (No Mode)

Slide 14

Definition
Midrange: the value midway between the maximum and minimum values in the original data set
$\text{Midrange} = \dfrac{\text{maximum value} + \text{minimum value}}{2}$

Slide 15

Round-off Rule for Measures of Center
Carry one more decimal place than is present in the original set of values.

Slide 16

You do!
Here are the volumes (in ounces) of randomly selected cans of Coke:
12.3 12.1 12.2 12.3 12.2
Find the mean, median, mode, and midrange.

Slide 17

Mean from a Frequency Distribution
Assume that in each class, all sample values are equal to the class midpoint.

Slide 18

Mean from a Frequency Distribution
Use the class midpoint of each class for the variable x.

Slide 19

Find the Mean from a Frequency Distribution

Age of Actress   Frequency (f)   Class Midpoint (x)   f*x
21-30            28              25.5                  714
31-40            30              35.5                 1065
41-50            12              45.5                  546
51-60             2              55.5                  111
61-70             2              65.5                  131
71-80             2              75.5                  151
Totals           76                                   2718

Slide 20

Weighted Mean
In some cases, values vary in their degree of importance, so they are weighted accordingly.
$\bar{x} = \dfrac{\sum (w \cdot x)}{\sum w}$

Slide 21

Example
* We have three test scores: 85, 90, 75
* The first test counts for 20%, the second for 30%, and the third for 50% of the final grade.
* Find the weighted mean.
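
The mode, midrange, frequency-table mean, and weighted mean can be sketched in Python as follows. The function names are hypothetical, chosen for this illustration. The frequency-table mean assumes the usual formula, the sum of f*x divided by the sum of f, which matches the totals row of the table but is not written out on the slide. The "no mode" rule used here (no value repeats) is likewise an assumption based on example c.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency (empty list if no value repeats)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # assumption: "no mode" means no value occurs more than once
    return sorted(v for v, c in counts.items() if c == top)


def midrange(values):
    """Value midway between the maximum and minimum values."""
    return (max(values) + min(values)) / 2


def mean_from_frequency_table(midpoints, frequencies):
    """Treat every value in a class as equal to the class midpoint."""
    total = sum(f * x for x, f in zip(midpoints, frequencies))
    return total / sum(frequencies)


def weighted_mean(values, weights):
    """Sum of (w * x) divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Mode examples a, b, and c from the slides.
print(modes([5.40, 1.10, 0.42, 0.73, 0.48, 1.10]))
print(modes([27, 27, 27, 55, 55, 55, 88, 88, 99]))
print(modes([1, 2, 3, 6, 7, 8, 9, 10]))

# Midrange of the data in example a.
print(midrange([5.40, 1.10, 0.42, 0.73, 0.48, 1.10]))

# Actress ages: class midpoints and frequencies from the table.
midpoints = [25.5, 35.5, 45.5, 55.5, 65.5, 75.5]
frequencies = [28, 30, 12, 2, 2, 2]
print(round(mean_from_frequency_table(midpoints, frequencies), 1))

# Test scores weighted 20%, 30%, and 50% (weights given as percentages).
print(weighted_mean([85, 90, 75], [20, 30, 50]))
```

Passing the weights as percentages rather than decimal fractions keeps the weighted-mean arithmetic in whole numbers until the final division.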

Slide 22

Best Measure of Center

Slide 23

Definitions
Symmetric: a distribution of data is symmetric if the left half of its histogram is roughly a mirror image of its right half
Skewed: a distribution of data is skewed if it is not symmetric and if it extends more to one side than the other

Slide 24

Skewness

Slide 25

Heart Rate Activity
Is there a difference between male and female heart rates?
Male: 60 67 59 64 80 55 72 84 59 67 69
Female: 83 56 57 63 60 69 70 86 70 57 67 75 72 75 57 76 69 79 84 75 56
Compare:
1) The sample sizes
2) The means and the medians (centers)
3) The IQR, the 5-number summaries, and the standard deviations (spreads)
4) Look at the shape: symmetric, skewed, or irregular?

Slide 26
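
One possible way to set up the heart-rate comparison is with Python's standard `statistics` module, sketched below. The helper name `summarize` is hypothetical. The quartiles come from `statistics.quantiles` with its default "exclusive" method, which is an assumption: other quartile conventions, including the one a textbook or calculator uses, can give slightly different values for Q1, Q3, and the IQR.

```python
from statistics import mean, median, quantiles, stdev

male = [60, 67, 59, 64, 80, 55, 72, 84, 59, 67, 69]
female = [83, 56, 57, 63, 60, 69, 70, 86, 70, 57, 67,
          75, 72, 75, 57, 76, 69, 79, 84, 75, 56]


def summarize(values):
    """Collect the sample size, center, and spread measures for one data set."""
    # Quartile convention is an assumption (default "exclusive" method).
    q1, _, q3 = quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "five_number": (min(values), q1, median(values), q3, max(values)),
        "iqr": q3 - q1,
        "stdev": stdev(values),  # sample standard deviation
    }


male_summary = summarize(male)
female_summary = summarize(female)

for label, summary in (("Male", male_summary), ("Female", female_summary)):
    print(label)
    for name, value in summary.items():
        print(f"  {name}: {value}")
```

The shape question (symmetric, skewed, or irregular) still needs a histogram or boxplot; comparing each group's mean with its median gives only a first hint.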