Exploration of Mean & Median
- Go to the website of “Introduction to the Practice of Statistics”.
- Click on the link to “Statistical Applets”.
- Select the “Mean and Median” applet.
- Perform exercises 1.51 and 1.52 on p. 58 of your textbook.

Percentiles
- Median: 50th percentile, the boundary value separating the bottom and top halves of the population.
- The xth percentile separates the bottom x% from the top (100 - x)%.
- First quartile (Q1): 25th percentile (marks the boundary for the lower fourth of the data).
- Third quartile (Q3): 75th percentile (marks the boundary for the upper fourth of the data).

Inter-quartile range (IQR)
- IQR = Q3 - Q1
- Answers the question, “How far is the median of the top half of the data from the median of the bottom half?”
- IQR is a resistant measure: it isn’t as affected by outliers as the variance or the range.
- Useful for identifying outliers: a data point is flagged as an outlier if it lies more than 1.5 x IQR above Q3 or more than 1.5 x IQR below Q1.
- This method is called the 1.5 x IQR criterion.

Five-number summary
- Consists of the minimum, Q1, median, Q3, and maximum.
- These numbers give you an idea of center and spread, though of course they can’t give the full picture.

Boxplot
- Represents the five-number summary graphically.
- A box spans the IQR (from Q1 to Q3), with a line in the middle marking the median.
- Lines leave the box from both sides, going out to the minimum and maximum.

Modified Boxplot (what Minitab creates)
- Lines only extend to the smallest and largest observations that are not outliers.
- Outliers are marked separately beyond the endpoints of these lines.

[Figure: Boxplots of Time to Detection of Cancer, Simulation Study. Groups: Exponential, Gompertzian. Axis: Years, 0 to 20.]

Sample variance (s²)
- Is almost the average squared deviation from the mean, but we divide by n - 1 instead of n:
  $$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
- Standard deviation (s or SD) is the square root of the variance.
- Very sensitive to outliers, because of:
  - Squaring of the deviations.
  - Use of the mean.
- s² and s can’t be negative!

Exercise 1.65 from book
This is a standard deviation contest.
You must choose four numbers from the whole numbers 0 to 10, with repeats allowed.
a) Choose four numbers that have the smallest possible standard deviation.
b) Choose four numbers that have the largest possible standard deviation.
c) Is more than one choice possible in either (a) or (b)? Explain.

Linear transformations
- What does it mean to say that a variable Y is a linear transformation of a variable X?
- It means that you can write Y as a function of X in this format:
  Y = a + bX (a and b are constants)

Effects of linear transformations
- Let’s say we know the mean and s of a variable X.
- We perform a linear transformation of the variable; each of the old X values changes according to a + bX.
- How are the mean and s of Y related to those of X?

Average Monthly Temperature, NYC
Month  J   F   M   A   M   J   J   A   S   O   N   D
°F     32  33  41  52  62  72  77  75  68  58  47  35

Effects of linear transformations (cont.)
- $$\bar{Y} = a + b\bar{X}$$
- This is also true for the median and the percentiles (for b > 0; when b < 0 the order of the data reverses, so the pth percentile of X maps to the (100 - p)th percentile of Y).
- SD(Y) = |b| SD(X)
- This is also true for IQR.
- The basic shape of the distribution doesn’t change due to the linear transformation.
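
The rules above can be checked numerically on the NYC temperature table. The sketch below is only an illustration and rests on two assumptions that are not stated in the slides: the linear transformation is taken to be the Fahrenheit-to-Celsius conversion (a = -160/9, b = 5/9), and the quartiles are computed as the medians of the lower and upper halves of the sorted data, which matches the "median of the top half" description of the IQR. Software such as Minitab may use a slightly different quartile rule. The helper names `quartiles` and `iqr` are made up for this example.

```python
from statistics import mean, median, stdev

# Average monthly temperatures for NYC in °F, January to December (from the table).
temps_f = [32, 33, 41, 52, 62, 72, 77, 75, 68, 58, 47, 35]

# Assumed linear transformation Y = a + bX: Fahrenheit to Celsius,
# C = (F - 32) * 5/9, so a = -160/9 and b = 5/9.
a, b = -160 / 9, 5 / 9
temps_c = [a + b * x for x in temps_f]


def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    When n is odd, the middle observation is left out of both halves.
    """
    s = sorted(data)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])


def iqr(data):
    """Inter-quartile range, IQR = Q3 - Q1."""
    q1, q3 = quartiles(data)
    return q3 - q1


# Center: compare each statistic of Y with a + b * (the same statistic of X).
print("mean  :", mean(temps_c), "vs", a + b * mean(temps_f))
print("median:", median(temps_c), "vs", a + b * median(temps_f))

# Spread: compare each statistic of Y with |b| * (the same statistic of X).
print("SD    :", stdev(temps_c), "vs", abs(b) * stdev(temps_f))
print("IQR   :", iqr(temps_c), "vs", abs(b) * iqr(temps_f))
```

Each printed pair should agree up to floating-point rounding. Note that the constant a shifts the measures of center but drops out of the measures of spread.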