ATM Lesson 1-1 Collecting Data

In statistics, a variable is:
The population is:
A sample is:
A random sample is:
If a sample is not random, it is said to be ______________________.

Examples: Identify the population, sample, and variable:
1) A doctor takes a biopsy of a suspicious growth to check for malignancy.
2) A company wants to know the educational background of all its employees. It asks all the department managers what their level of education is.
3) Is the sample taken in #2 a random sample? Or is it biased?

Capture-Recapture
4) From several locations on an island, a naturalist captures 96 squirrels, then tags and releases them. Ten days later 120 squirrels are caught and 36 of them have tags. Use this information to estimate the number of squirrels on the island.

ATM 1-3 Other Displays

Time-Series Data: displays changes in a variable over time. Label axes appropriately and give titles. Axes must always have equal increments.

Other methods of displaying data include:
Scatter Plot
Line Graph
Bar Graphs (Histograms)

[Graph: Population of NE cities, in thousands]

For each of these, you can determine the Average Rate of Change (over time). This is also known as _______________.
Stated as “Increase/Decrease of (quantity) over (period of time).”
Ex: What was the average rate of change in the U.S. minimum wage from 1950 to 1990?

Stemplots (also known as Stem and Leaf Plots) show the minimum, maximum, range and outliers of data.

[Stemplot: Scores on Chapter 8 Test (out of 130)]

What is the minimum value? The maximum? The range? Outliers?

Back to Back Stem Plots
Use the data from our class heights to create a back-to-back stemplot.
Boys | Girls

ATM 1-4 Measures of Center

To describe typical values in a data set, use measures of center, or measures of central tendency:
Mean:          Notation: $\bar{x}$ (“x bar”)
Median:
Mode:

QUARTILES:
1st Quartile (lower quartile):
3rd Quartile (upper quartile):
They will help us measure the spread of the data.
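As a rough illustration of the three measures of center, the sketch below uses Python's standard statistics module on a small made-up data set. The numbers are hypothetical, chosen only for illustration, and are not from the lesson.

```python
from statistics import mean, median, mode

# Hypothetical data set, chosen only for illustration (not from the lesson).
data = [2, 4, 4, 5, 7, 9, 11]

center_mean = mean(data)      # sum of the values divided by how many there are
center_median = median(data)  # middle value once the data are sorted
center_mode = mode(data)      # value that occurs most often

print("mean:  ", center_mean)
print("median:", center_median)
print("mode:  ", center_mode)
```

Changing one value to something extreme (for example, replacing 11 with 110) is a quick way to see which of the three measures reacts to an outlier.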
Example 1: Consider the heights of a high school basketball team:

height (in)     64   69   70   71   72   74   79
# of players     1    3    3    2    2    1    1

a) How many players are on the team?
b) Find the mean height.
c) Find the median height.
d) Which # describes the “average” height?
e) Do there appear to be any outliers?
f) Find the 1st quartile.
g) Find the 3rd quartile.

SIGMA NOTATION (or Summation Notation)

Let $X_i$ = the test score for the ith person in a class.
(So, $X_1$ = 1st person’s score, $X_2$ = 2nd person’s score, etc.)

$X_1 + X_2 + X_3 + X_4 + X_5$ = ___________________________
___________________________ = the mean of those 5 test scores

In general:
Using sigma notation: $\bar{X}$ =

Example 2: Let $X_1 = 2$, $X_2 = 4$, $X_3 = 5$, $X_4 = 7$
a) Write an expression in sigma notation for the total of the four numbers above.
b) Find $\sum_{i=2}^{4} x_i$
c) Find $\frac{\sum_{i=1}^{4} x_i}{4}$

1-5 Quartiles, Percentiles, Box Plots

Consider the test scores on a math exam. Identify the median and quartiles.
43, 52, 65, 67, 70, 70, 71, 74, 75, 78, 80, 82, 85, 87, 88, 90, 92, 94, 98, 98

1st quartile:
Median:
3rd quartile:

So, quartiles divide the data into 4 sections, each containing 25% of the data.

Interquartile Range (IQR): 3rd quartile - 1st quartile. (A measure of spread, along with the range of the data.)
Give the IQR of the data.

Percentile: The pth percentile is the value in a set such that p percent of the numbers are less than or equal to that value.
What is the percentile rank of 67?

Determining outliers: Find 1.5 × (IQR). Add this to the 3rd quartile; any value greater than the result is an outlier. Also, subtract it from the 1st quartile; any value smaller than the result is an outlier.

Examples: Using the test scores given, answer the following questions.
1. ¾ of the class scored above a ___________ on the exam.
2. What is the percentile rank of 85?
3. Which score has a percentile rank of 80th?
4. Are there any outliers? (Show your work!)
5. Make a box-and-whisker plot of the scores.
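The quartile, IQR, percentile-rank, and outlier steps described above can be checked with a short Python sketch. It assumes the "median of each half" convention for quartiles, which the lesson does not state explicitly; other conventions can give slightly different quartiles. The function names are made up for this illustration.

```python
from statistics import median

# Test scores from the lesson (20 values).
scores = [43, 52, 65, 67, 70, 70, 71, 74, 75, 78,
          80, 82, 85, 87, 88, 90, 92, 94, 98, 98]

def quartiles(data):
    """Return (Q1, median, Q3), taking Q1 and Q3 as medians of each half.

    Assumed convention: for an odd count, the middle value is left out
    of both halves.
    """
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]            # half of the data below the median position
    upper = s[len(s) - half:]   # half of the data above the median position
    return median(lower), median(s), median(upper)

def percentile_rank(data, value):
    """Percent of the data that is less than or equal to value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

q1, med, q3 = quartiles(scores)
iqr = q3 - q1

# Outlier fences: 1.5 x IQR beyond each quartile.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in scores if x < low_fence or x > high_fence]

print("Q1, median, Q3:", q1, med, q3)
print("IQR:", iqr)
print("fences:", low_fence, high_fence)
print("outliers:", outliers)
print("percentile rank of 67:", percentile_rank(scores, 67))
```

The five numbers needed for a box-and-whisker plot are min(scores), q1, med, q3, and max(scores).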
1-6 Histograms

Frequencies:
Relative Frequencies:
Intervals:
Frequency table:
Histogram:
Frequency distribution:

43, 52, 65, 66, 67, 68, 70, 70, 71, 72, 73, 74, 75, 75, 76, 78, 78, 78, 78, 79, 80, 82, 85, 87, 87, 88, 89, 90, 90, 90, 92, 93, 94, 94, 98

1. Make a frequency table for the test scores. Hint: 1st determine reasonable intervals.

   Test Scores | Frequency | Relative Frequency

2. Make a histogram for the test scores using frequency for the vertical axis.
3. Make a histogram using relative frequency for the vertical axis.
4. Make a box-and-whisker plot to display the test scores. Use it to answer the following questions.
   a. Which quartile has the most spread? What is the interval for that quartile?
   b. Which quartile has the least spread? What is the interval for that quartile?
   c. Which interval in your histogram has the highest frequency? (tallest bar)
   d. Which interval in your histogram has the lowest frequency? (shortest bar)
   e. Determine a relationship between these intervals in the box-and-whisker plot and the histogram intervals.
5. Use the histogram at the right, showing the percent of students reporting how much they paid for their last haircut, to answer the following questions.
   a. What percent of students said they paid between $10 and $20 for a haircut?
   b. What percent of students said they paid $25 for a haircut?
   c. How many students got haircuts?
   d. In what interval is the median price paid?
   e. What is the median price paid?
6. How can you determine what interval the median of a data set is in by looking at a histogram for the data?
7. Consider the following relative frequency table for test scores.
   a. Why is it not possible to tell how many students are in the class?
   b. If there are 30 students in the class, how many students received a score in the C range?
   c. About what percent of the students got A’s?
   d. About what percent of the students passed?
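For problem 1 above, a frequency table can be tallied with a short Python sketch. The interval width of 10 (40-49, 50-59, and so on) is an assumption made here for illustration; the worksheet leaves the choice of intervals to you.

```python
from collections import Counter

# Test scores from the lesson (35 values).
scores = [43, 52, 65, 66, 67, 68, 70, 70, 71, 72, 73, 74, 75, 75, 76,
          78, 78, 78, 78, 79, 80, 82, 85, 87, 87, 88, 89, 90, 90, 90,
          92, 93, 94, 94, 98]

WIDTH = 10  # assumed interval width; not specified by the worksheet

# Map each score to the lower end of its interval, e.g. 78 -> 70.
freq = Counter((x // WIDTH) * WIDTH for x in scores)
n = len(scores)

print("Interval  Frequency  Relative Frequency")
for start in sorted(freq):
    rel = freq[start] / n  # relative frequency = frequency / total count
    print(f"{start}-{start + WIDTH - 1}     {freq[start]:>5}      {rel:.3f}")
```

The frequencies are the bar heights for the histogram in problem 2, and the relative frequencies are the bar heights for problem 3; the relative frequencies always add up to 1.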
ATM Lesson 1-7 Using the Graphing Calculator
Name ________________________   Period _________

Box Plots

Yearly Average Daily Temperature for Selected US Cities (in degrees F)

 1. Mobile, AL              67.5      9. Philadelphia, PA     54.3
 2. Juneau, AK              40.0     10. Atlantic City, NJ    53.1
 3. Phoenix, AZ             71.2     11. Nashville, TN        59.2
 4. San Francisco, CA       50.3     12. Burlington, VT       44.1
 5. Miami, FL               75.6     13. Cincinnati, OH       53.4
 6. Chicago, IL             49.2     14. Buffalo, NY          47.6
 7. Portland, OR            53.0     15. Detroit, MI          48.6
 8. Dallas-Ft. Worth, TX    66.0     16. Boston, MA           51.5

1. Use your calculator to enter the data in a list and find the mean. ________________
2. Give a five-number summary of the temperatures. Identify any outliers using the 1.5 × IQR method.
3. What is the range of the values? ______________
   Indicate the values you entered for your WINDOW or RANGE.
   Xmin = _____________   Ymin = ________________
   Xmax = _____________   Ymax = ________________
   Xscl = ______________   Yscl = _________________
   (Do the values entered for “y” in this situation make a difference? __________)
4. Construct a box plot on your calculator. Sketch the box plot below. Label the number line; indicate any outliers with an “x”.
   (Don’t erase your list! You’ll need it again.)

Histograms

Use the same temperature data (Yearly Average Daily Temperature for Selected US Cities, in degrees F) listed under Box Plots.

5. What is the range of the values? ______________
   Indicate the values you entered for your WINDOW or RANGE.
   Xmin = _____________   Ymin = ________________
   Xmax = _____________   Ymax = ________________
   Xscl = ______________   Yscl = _________________
   (Do the values entered for “y” in this situation make a difference? __________)
6. Use your calculator to produce a histogram with intervals of 5. Sketch the histogram.
7. Use your calculator to produce a histogram with intervals of 10. Sketch the histogram.

Variance and Standard Deviation (1-8)

Standard deviation:
Variance:

Method and Formulas:
1.
2.
3.
4. This is the variance.   $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
5. This is the standard deviation.   $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Example: Find the variance and the standard deviation “by hand” for the data given.
{ 6, 9, 10, 13, 17 }

   $x$ | $x - \bar{x}$ | $(x - \bar{x})^2$

   $\bar{x}$ =          n =

Things to keep in mind:
Measures of center:
Measures of spread:
Standard deviation =
(Standard deviation)$^2$ =
For example:
   if var. = 25, std. dev. = ?
   if std. dev. = 9, var. = ?
   if var. = .1, std. dev. = ?

In general, groups with most data close to the mean have _______________ standard deviations than do groups with most data far from the mean.

Variance and standard deviation are never negative. Why?

Symbols - The ones we use are:
   $s^2$ = variance of a sample
   $s$ = std. dev. of a sample (divide by n - 1)
Also on your calculator are:
   $\sigma^2$ = variance of a population
   $\sigma$ = std. dev. of a population (divide by n)
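The “by hand” method for the example data { 6, 9, 10, 13, 17 } can be mirrored step by step in a short Python sketch. The variable names are made up for this illustration; the final lines compare the results against Python's standard statistics module as a check.

```python
from math import sqrt
import statistics

data = [6, 9, 10, 13, 17]
n = len(data)

# Step 1: find the mean.
x_bar = sum(data) / n

# Step 2: find each deviation from the mean, x - x_bar.
deviations = [x - x_bar for x in data]

# Step 3: square each deviation and add them up.
squared = [d ** 2 for d in deviations]
total = sum(squared)

# Step 4: sample variance divides by n - 1.
s2 = total / (n - 1)

# Step 5: sample standard deviation is the square root of the variance.
s = sqrt(s2)

# Population versions divide by n instead of n - 1.
sigma2 = total / n
sigma = sqrt(sigma2)

print("x      x - x_bar   (x - x_bar)^2")
for x, d, sq in zip(data, deviations, squared):
    print(f"{x:<6} {d:>9.1f}   {sq:>13.1f}")

print("sample variance:", s2, " sample std. dev.:", round(s, 4))
print("population variance:", sigma2, " population std. dev.:", round(sigma, 4))

# Cross-check against the standard library.
print(statistics.variance(data), statistics.stdev(data))
print(statistics.pvariance(data), statistics.pstdev(data))
```

Because every squared deviation is zero or larger, the total in step 3 can never be negative, which is why neither the variance nor the standard deviation can be.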