Advancing with Quantitative Data

Symmetrical
Refers to data in which both sides are (more or less) the same when the graph is folded vertically down the middle.
Bell-shaped is a special type of symmetrical distribution: it has a center mound with two sloping tails.

Uniform
Refers to data in which every class has equal or approximately equal frequency.

Skewed (left or right)
Refers to data in which one side (tail) is longer than the other side.
The direction of skewness is on the side of the longer tail.

Skewness
The following picture is an example of what kind of distribution?
[Picture of a distribution]
a. right-skewed
b. left-skewed
c. symmetric

Bimodal (multi-modal)
Refers to data in which two (or more) classes have the largest frequency and are separated by at least one other class.

Approximately Normal
A phrase used to describe a bell-shaped curve (unimodal with minimal skewness).

Outlier
When examining data, look for outliers!
Outliers are observations that lie outside the overall pattern of a distribution.
Formula: an observation is considered an outlier if it falls more than 1.5 × IQR below Q1 or above Q3, where Q1 and Q3 are the first and third quartiles and IQR = Q3 − Q1.

Mean
$\bar{x} = \frac{1}{n}\sum x_i$
The mean is a measure of center in a distribution.
The mean is nonresistant to very large or very small observations: a single extreme value can pull it noticeably.

Median
Steps to finding the median of a distribution:
1. Rearrange the observations from smallest to largest.
2. If the number of observations is odd, the median is the center observation.
3. If the number of observations is even, the median is the average of the two observations in the center.
The median is resistant to very large or very small observations.

Standard Deviation (s)
s measures the spread about the mean and should only be used when the mean is chosen as the measure of center.
s = 0 only when there is no spread. This happens only when all observations have the same value.
As observations become more spread out about their mean, s gets larger.
s is strongly influenced by outliers.

Variance (s²)
Variance is the average of the squares of the deviations of the observations from their mean.
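The outlier rule and the measures of center and spread described above can be sketched in Python. This is a minimal illustration, not part of the original slides: the data set is hypothetical, the helper names `quartiles` and `outliers` are made up for this example, and the quartiles are assumed to be the medians of the lower and upper halves of the sorted data (other quartile conventions can give slightly different cutoffs). The standard library functions `statistics.variance` and `statistics.stdev` divide by n - 1.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of observations the middle
    observation is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    return statistics.median(data[:half]), statistics.median(data[-half:])


def outliers(values):
    """Return observations more than 1.5 x IQR below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high]


# Hypothetical observations, chosen only to illustrate the definitions.
data = [3, 5, 5, 6, 7, 8, 9, 40]

print(outliers(data))            # [40]
print(statistics.mean(data))     # 10.375
print(statistics.median(data))   # 6.5

# Dropping the extreme value moves the mean a lot and the median a little,
# which is what "nonresistant" and "resistant" mean.
trimmed = [x for x in data if x != 40]
print(round(statistics.mean(trimmed), 2), statistics.median(trimmed))

# Sample variance and standard deviation (n - 1 divisor).
s2 = statistics.variance(data)
s = statistics.stdev(data)
print(s2, s)

# s is zero only when every observation has the same value.
print(statistics.stdev([5, 5, 5, 5]))
```

Running the same functions on a data set with no extreme value should return an empty list from `outliers`, and the mean and median should then sit close together.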
$s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$
Standard deviation is the square root of variance: $s = \sqrt{s^2}$.

Purpose of a graph
A graph helps you understand the data.
Look for an overall pattern and for striking deviations from the pattern.
An outlier in any graph of data is an individual observation that falls outside the overall pattern of the graph.

Overall Pattern of a Distribution
To describe the overall pattern of a distribution:
Give the center and spread.
See if the distribution has a simple shape that you can describe in a few words.
Midpoint is the value with half the observations taking smaller values and half taking larger values.
Spread is measured from the smallest to the largest value (ignoring outliers).

Key Terms
When describing a distribution, you should always include at minimum:
Center
Shape
Spread

Examples
Which of the following numbers are outliers?
4 7 7 7 8 9 9 15

In which of the following situations does standard deviation equal zero?
a. When there is an outlier
b. When all the observations are less than zero
c. When all the numbers are greater than zero
d. When there is no spread (all numbers are the same value)

Examples Continued:
A numerical summary should report which of the following?
a. center, spread, variability
b. mean, median, mode
c. standard deviation
d. IQR

What is the relationship between standard deviation and variance?
a. Standard deviation is the square root of variance.
b. Standard deviation is 2 times the variance.
c. They are the same value.
d. They are both equal to zero.

Examples Continued:
What strikes you as the most distinctive difference among the distributions of exam scores in classes A, B, and C?
[Graphs of exam scores for classes A, B, and C]

The distribution of a set of data describes how the data are spread out. Two distributions can be compared using one of the three averages (mean, median, or mode) and the range. For example, the number of cars sold by two salesmen each day for a week is shown below.

Matt 5 7 6 5 7 8 6
Jamie 3 6 4 8 12 9 8

Who is the better salesman?
To decide which salesman is better, let's compare the mean number of cars sold by each one.

Matt: $\text{Mean} = \frac{5 + 7 + 6 + 5 + 7 + 8 + 6}{7} = \frac{44}{7} \approx 6.3$ (to 1 d.p.)
Jamie: $\text{Mean} = \frac{3 + 6 + 4 + 8 + 12 + 9 + 8}{7} = \frac{50}{7} \approx 7.1$ (to 1 d.p.)

This tells us that, on average, Jamie sold more cars each day.

Now let's compare the range for each salesman.

Matt: Range = 8 − 5 = 3
Jamie: Range = 12 − 3 = 9

The range for the number of cars sold each day is smaller for Matt. This means that he is a more consistent or reliable salesman.

We could argue that Jamie is better because he sells more on average, or that Matt is better because he is more consistent.
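The comparison above can be reproduced with a short Python sketch. The daily sales figures are the ones given in the example; the dictionary name `sales` and the helper `summarize` are illustrative names, not part of the original material.

```python
from statistics import mean

# Cars sold each day for a week, as given in the example.
sales = {
    "Matt": [5, 7, 6, 5, 7, 8, 6],
    "Jamie": [3, 6, 4, 8, 12, 9, 8],
}


def summarize(values):
    """Return (mean rounded to 1 d.p., range) for a list of daily sales."""
    return round(mean(values), 1), max(values) - min(values)


for name, cars in sales.items():
    average, spread = summarize(cars)
    print(f"{name}: mean = {average}, range = {spread}")
# Matt: mean = 6.3, range = 3
# Jamie: mean = 7.1, range = 9
```

The mean is used here as the measure of center and the range as the measure of spread, matching the worked example. The range depends only on the two most extreme days, so a single unusual day can change it a great deal.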