SPEED REVIEW CHAPTER 1

A distribution of a variable tells which values the variable takes and how often it takes them.

Use a dotplot, stemplot, histogram, or boxplot to display quantitative data. Use a bar chart or pie chart to display categorical data. (Note: a boxplot does not give information about individual observations.)

Describe the distribution of a quantitative variable by describing its shape, outliers, center, and spread (SOCS).

Describe the center and spread of a symmetric distribution using the mean and standard deviation. Use the five-number summary, with the median and IQR as the measures of center and spread (respectively), if the data are strongly skewed or have outliers.

The mean and standard deviation are NOT resistant: in a skewed distribution the mean is pulled toward the long tail. The median and quartiles are resistant.

The standard deviation is never negative, and it equals zero only when all observations are identical.

Five-number summary: Min, Q1, Median, Q3, Max. Q1 is the 25th percentile, which means that 25% of observations are at or below that value. Q3 is the 75th percentile, which means that 75% of observations are at or below that value. The median is the 50th percentile.

A frequency histogram has the values of the quantitative variable on one axis and frequency on the other axis. A relative frequency histogram has the values of the quantitative variable on one axis and the proportion or percent of observations on the other axis. A cumulative frequency histogram, or ogive, gives the percent or frequency of observations at or below each value. A cumulative relative frequency histogram, or ogive, displays percentiles on one axis.

Outliers may be identified using the 1.5 × IQR rule (an observation is flagged if it lies more than 1.5 × IQR below Q1 or above Q3), or by using a modified boxplot on the calculator.

Extremely important for success on the AP Statistics Exam: if you are given a numerical data set, always display the shape of the distribution. Using the calculator, this can be done very easily with a histogram or a boxplot.
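The five-number summary and the 1.5 × IQR rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set `scores` is made up, the helper names are hypothetical, and the quartiles are taken as the medians of the lower and upper halves of the sorted data (excluding the overall median when the count is odd). Other quartile conventions exist and can give slightly different values.

```python
from statistics import mean, median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, leaving out the middle value when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


def outliers_by_iqr_rule(values):
    """Return the observations more than 1.5 * IQR outside the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


# Hypothetical data: mostly small values with one large observation.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]

summary = five_number_summary(scores)
flagged = outliers_by_iqr_rule(scores)

print("Five-number summary:", summary)
print("Outliers by the 1.5 x IQR rule:", flagged)
print("Mean:", mean(scores), "Median:", median(scores))
```

With this data the single large value is flagged as an outlier, and the mean ends up above the median because the mean is not resistant while the median is.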
Practice problems

1. Which of the following statements are true?
I. All variables can be classified as quantitative or categorical variables.
II. Categorical variables can be continuous variables.
III. Quantitative variables can be discrete variables.
(A) I only
(B) II only
(C) III only
(D) I and II
(E) I and III

2. A population consists of four observations: {1, 3, 5, 7}. What is the variance?
(A) 2
(B) 4
(C) 5
(D) 6
(E) None of the above

3. Which of the following is likely to have a mean that is smaller than the median?
(A) The number of candy bars sold in a movie theater on a randomly selected day.
(B) The scores of students (out of 100 points) on a very easy exam on which most score perfectly, but a few do very poorly.
(C) The prices of homes in a large city.
(D) The scores of students (out of 100 points) on a very difficult exam on which most score poorly, but a few do very well.
(E) The salaries of all National Football League players.

Solutions:

1. The correct answer is (E). All variables can be classified as quantitative or categorical variables. Discrete variables are indeed a category of quantitative variables. Categorical variables, however, are not numeric. Therefore, they cannot be classified as continuous variables.

2. The correct answer is (C). First, we need to compute the population mean:

$\mu = (1 + 3 + 5 + 7) / 4 = 4$

Then we plug all of the known values into the formula for the variance of a population:

$\sigma^2 = \sum (X_i - \mu)^2 / N$
$\sigma^2 = [(1 - 4)^2 + (3 - 4)^2 + (5 - 4)^2 + (7 - 4)^2] / 4$
$\sigma^2 = [(-3)^2 + (-1)^2 + (1)^2 + (3)^2] / 4 = (9 + 1 + 1 + 9) / 4 = 20 / 4 = 5$

3. The correct answer is (B). The distribution will be skewed left: most scores sit at or near 100, and the few very low scores form a long left tail that pulls the mean below the median.
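The arithmetic in Solution 2 and the claim in Solution 3 can be checked with a short Python sketch. The `easy_exam` scores are invented purely for illustration and are not part of the original problem. `statistics.pvariance` is the standard-library population variance, which divides by N as in the formula above.

```python
from statistics import mean, median, pvariance

# Problem 2: population variance of {1, 3, 5, 7}, dividing by N.
population = [1, 3, 5, 7]
mu = mean(population)
sigma_sq = sum((x - mu) ** 2 for x in population) / len(population)

print("mu =", mu)
print("sigma^2 (by hand) =", sigma_sq)
print("sigma^2 (pvariance) =", pvariance(population))

# Problem 3: hypothetical scores on a very easy exam.
# Most students score at or near 100; a few score very low.
easy_exam = [100, 100, 100, 100, 95, 95, 90, 60, 40, 20]

print("mean =", mean(easy_exam), "median =", median(easy_exam))
```

For the made-up exam scores the mean comes out below the median, which is the pattern expected from a left-skewed distribution.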