Welcome to MM150 – Unit 9 Seminar
• Instructor: Larry Musolino
  – Email: [email protected]
• Some Administrative Items
  – Reminder that the Final Project is due by Tuesday, Mar 6th, 2012, 11:59pm ET. Post it to the Dropbox:
    • Click on the DROPBOX link at the top of the page
    • Under Basket, select Unit 9 Final Project
  – The Unit 9 Seminar is the final seminar for the course (there is no Unit 10 seminar).
  – The last day of the course is Tues March 13, 2012.

Copyright © 2009 Pearson Education, Inc.

Reminder on Final Project
• For more details on the Final Project, see the Unit 6 or Unit 7 topic “Final Project”.
  – You will select a topic in the course and discuss a potential application for this concept in your chosen profession. Do not be afraid to "think outside the box" when discussing an application in your profession. This could also be an example of how the concept learned is fundamental to understanding a more complex concept.
• Check out the example Final Projects, posted as example PowerPoint and MS Word documents.

Reminder on Final Project (cont’d)
Final Project Instructions:
• Choose either Microsoft Word or Microsoft PowerPoint in which to create your project. This course does not teach PowerPoint, so if you choose this format, do so only because you are already comfortable with it or know someone who can help you learn to use it.
• Create 5 slides or pages.
• On slides or pages 1 and 2, provide your name, the project title, and the course and section number; introduce your chosen profession; and give a brief overview of the concept you will apply to the profession.
• On slides or pages 3 and 4, describe how the concept can apply to your chosen profession. You will not need to “do the math”; simply describe how you would use it and provide examples of situations in which you would use the concept you have chosen.
• On slide or page 5, provide any resources you have used to give credit to others’ ideas and information.
  YOU MUST HAVE A REFERENCE PAGE!
• Check spelling and grammar, and visit the Writing Center if needed.
• Submit your final project to the Unit 9 dropbox for grading. You will have an opportunity to share with your classmates at the Math Fair in Unit 10.

Seminar Schedule

MM150 Unit 9 Seminar
Statistics – Part II
• 9.1 – Measures of Central Tendency
• 9.2 – Measures of Dispersion
• 9.3 – The Normal Distribution
• 9.4 – Linear Regression and Correlation

9.1 Measures of Central Tendency
• Mean
• Median
• Mode
• Midrange

Definitions
• An average is a number that is representative of a group of data.
• The arithmetic mean, or simply the mean, is symbolized by \(\bar{x}\) when it is computed for a sample of a population, or by the Greek letter mu, \(\mu\), when it is computed for the entire population.

Mean
• The mean is the sum of the data divided by the number of pieces of data. The formula for calculating the mean is
  \[ \bar{x} = \frac{\Sigma x}{n} \]
• where \(\Sigma x\) represents the sum of all the data and \(n\) represents the number of pieces of data.

Example: find the mean
• Find the mean amount of money parents spent on new school supplies and clothes if 5 parents randomly surveyed replied as follows:
  $327 $465 $672 $150 $230

Solution
  \[ \bar{x} = \frac{327 + 465 + 672 + 150 + 230}{5} = \frac{1844}{5} = 368.8 \]

Median
• The median is the value in the middle of a set of ranked (ordered) data.
• Example: Determine the median of $327 $465 $672 $150 $230.

Solution
• Rank the data from smallest to largest:
  $150 $230 $327 $465 $672
• The middle value, $327, is the median.
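
The two worked examples above can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic; the variable names are illustrative and are not part of the course materials.

```python
from statistics import median

# Amounts (in dollars) from the school-supplies example above.
amounts = [327, 465, 672, 150, 230]

# Mean: the sum of the data divided by the number of pieces of data.
mean_amount = sum(amounts) / len(amounts)

# Median: the middle value once the data are ranked.
ranked = sorted(amounts)
median_amount = median(amounts)

print(ranked)         # [150, 230, 327, 465, 672]
print(mean_amount)    # 368.8
print(median_amount)  # 327
```
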

Example: Median (even data)
• Determine the median of the following set of data: 8, 15, 9, 3, 4, 7, 11, 12, 6, 4.

Solution
• Rank the data: 3 4 4 6 7 8 9 11 12 15
• There are 10 pieces of data, so the median lies halfway between the two middle pieces, 7 and 8.
• The median is (7 + 8)/2 = 7.5.

Summary: Finding the Median of a Dataset
• If the number of datapoints is odd:
  – The median is the middle value in the ordered dataset.
• If the number of datapoints is even:
  – The median is the mean of the two middle values in the ordered dataset.

You Try It #1
• (A) Find the median of the following dataset: 10, 3, -4, 75, 420, 39, 6, 8
• (B) Find the median of the following dataset: 28, 19, 13, 19, 17, 1, 279

You Try It #1 - Solution
• (A) Find the median of the following dataset: 10, 3, -4, 75, 420, 39, 6, 8
  Solution: Note there are eight datapoints.
  Order the data: -4, 3, 6, 8, 10, 39, 75, 420
  Median = (8 + 10)/2 = 18/2 = 9
• (B) Find the median of the following dataset: 28, 19, 13, 19, 17, 1, 279
  Solution: Note there are seven datapoints.
  Order the data: 1, 13, 17, 19, 19, 28, 279
  Median = 19

Mode
• The mode is the piece of data that occurs most frequently.
• Example: Determine the mode of the data set: 3, 4, 4, 6, 7, 8, 9, 11, 12, 15.
• Solution: The mode is 4, since it occurs twice and the other values occur only once.

More on Mode
• If each piece of data occurs only once, then there is no mode.
  3, 16, 4, 6, 7, 8, 9, 11, 12, 15
• If two values occur in a data set more often than the others, then we say there are two modes (the data set is called bimodal).
  – (Note: some books refer to this situation as “no mode”.)
  3, 16, 4, 6, 7, 7, 9, 12, 12, 15
  Our text indicates there are two modes: 7 and 12.

You Try It #2
• (A) Find the mode of the following dataset: 4, 7, 9, 11, 3, 7, 11, 9, 4, -2, 13, 4
• (B) Find the mode of the following dataset: 8, 11, 2, 13, 17, 9

You Try It #2 - Solution
• (A) Find the mode of the following dataset: 4, 7, 9, 11, 3, 7, 11, 9, 4, -2, 13, 4
  Solution: The datavalue “4” occurs most often (three times in the dataset), so the mode is 4.
• (B) Find the mode of the following dataset: 8, 11, 2, 13, 17, 9
  Solution: No datavalue occurs more often than the others, so there is no mode.

Midrange
• The midrange is the value halfway between the lowest (L) and highest (H) values in a set of data.
  \[ \text{Midrange} = \frac{\text{lowest value} + \text{highest value}}{2} \]

Example
• Find the midrange of the data set $327, $465, $672, $150, $230.
  \[ \text{Midrange} = \frac{150 + 672}{2} = \frac{822}{2} = 411 \]

Example
• The weights of eight Labrador retrievers, rounded to the nearest pound, are 85, 92, 88, 75, 94, 88, 84, and 101. Determine the
  a) mean
  b) median
  c) mode
  d) midrange

Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101
a. Mean
  \[ \bar{x} = \frac{85 + 92 + 88 + 75 + 94 + 88 + 84 + 101}{8} = \frac{707}{8} = 88.375 \]
b. Median: rank the data
  75, 84, 85, 88, 88, 92, 94, 101
  The two middle values are both 88, so the median is 88.
c. Mode: the number that occurs most frequently. The mode is 88.
d. Midrange = (L + H)/2 = (75 + 101)/2 = 88
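
The mode and midrange rules above can be sketched in Python as follows. The helper names `modes` and `midrange` are hypothetical, and the sketch assumes the text's convention that a dataset in which every value occurs only once has no mode (reported here as an empty list).

```python
from collections import Counter

def modes(data):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:  # every value occurs only once
        return []
    return sorted(value for value, count in counts.items() if count == top)

def midrange(data):
    """Value halfway between the lowest and highest values."""
    return (min(data) + max(data)) / 2

dog_weights = [85, 92, 88, 75, 94, 88, 84, 101]
print(modes(dog_weights))                         # [88]
print(midrange(dog_weights))                      # 88.0
print(modes([3, 16, 4, 6, 7, 7, 9, 12, 12, 15]))  # [7, 12]
print(modes([8, 11, 2, 13, 17, 9]))               # []
```
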

You Try It #3
• The salaries of employees of a company are provided below (in thousands of dollars): 32, 44, 35, 39, 125
  (A) Find the mean of this dataset.
  (B) Find the median of this dataset.
  (C) Which measurement do you think is the better measure of central tendency for this dataset?

You Try It #3 - Solution
• The salaries of employees of a company are provided below (in thousands of dollars): 32, 44, 35, 39, 125
  (A) Find the mean of this dataset.
  Solution: Mean = (32 + 44 + 35 + 39 + 125) / 5 = 275 / 5 = 55
  (B) Find the median of this dataset.
  Solution: Rank order the dataset: 32, 35, 39, 44, 125
  Median = 39
  (C) Which measurement do you think is the better measure of central tendency for this dataset?
  Solution: In this case, the median is the better measure of central tendency because it is not affected by an outlier. Here 125 is an outlier: it is significantly different from the other datavalues, and it pulls the mean (55) above four of the five salaries.

Measures of Position
• Measures of position are often used to make comparisons.
• Two measures of position are percentiles and quartiles.

To Find the Quartiles of a Set of Data
1. Order the data from smallest to largest.
2. Find the median, or 2nd quartile (Q2), of the set of data.
   – If there is an odd number of datapoints, the median is the middle value.
   – If there is an even number of datapoints, the median is the mean of the two middle pieces of data.
3. The first quartile, Q1, is the median of the lower half of the data; that is, Q1 is the median of the data less than Q2.
4. The third quartile, Q3, is the median of the upper half of the data; that is, Q3 is the median of the data greater than Q2.
5. Note: The quartiles divide the dataset into four parts.

Example: Quartiles
• The weekly grocery bills for 23 families are as follows. Determine Q1, Q2, and Q3.
  170 330 225 75 95 210 80 225 160 172 270 170
  215 130 190 270 240 310 74 280 270 50 81

Example: Quartiles continued
• Order the data:
  50 74 75 80 81 95 ← Q1
  130 160 170 170 172 190 ← Median (Q2)
  210 215 225 225 240 270 ← Q3
  270 270 280 310 330
• Since there are 23 datapoints, the median is the middle datapoint (the 12th datapoint).

Example: Quartiles continued
• Q2 is the median of the entire data set, which is 190.
• Q1 is the median of the numbers from 50 to 172, which is 95.
• Q3 is the median of the numbers from 210 to 330, which is 270.

You Try It #4
• Find the quartiles Q1, Q2, and Q3 of the following dataset: 11, 7, 5, 9, 42, 27, 35, 19, 39, 36, 23

You Try It #4 - Solution
• Find the quartiles Q1, Q2, and Q3 of the following dataset: 11, 7, 5, 9, 42, 27, 35, 19, 39, 36, 23
  Solution: First, rank order the data and find the median (this is Q2):
  5, 7, 9, 11, 19, 23, 27, 35, 36, 39, 42
  Thus the median, Q2, is 23.
  Now find the median of the lower set of data: 5, 7, 9, 11, 19
  Now find the median of the upper set of data: 27, 35, 36, 39, 42
  Thus, Q1 = 9, Q2 = 23, Q3 = 36

9.2 Measures of Dispersion

Measures of Dispersion
• Measures of dispersion are used to indicate the spread of the data.
• The range is the difference between the highest and lowest values; it indicates the total spread of the data.
  Range = highest value – lowest value
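
The quartile procedure from the Measures of Position slides can be sketched in Python as below. The function name `quartiles` is hypothetical, and the sketch assumes the halves are split by position in the ranked data (the median itself is left out when the number of datapoints is odd). Library quantile routines often use other conventions and may give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ranked = sorted(data)
    n = len(ranked)
    q2 = median(ranked)
    lower = ranked[: n // 2]        # values below the median position
    upper = ranked[(n + 1) // 2 :]  # values above the median position
    return median(lower), q2, median(upper)

grocery_bills = [170, 330, 225, 75, 95, 210, 80, 225, 160, 172, 270, 170,
                 215, 130, 190, 270, 240, 310, 74, 280, 270, 50, 81]
print(quartiles(grocery_bills))  # (95, 190, 270)
print(quartiles([11, 7, 5, 9, 42, 27, 35, 19, 39, 36, 23]))  # (9, 23, 36)
```
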

Example: Range
• Nine different employees were selected and the amount of their salary was recorded. Find the range of the salaries.
  $24,000 $32,000 $26,500
  $56,000 $48,000 $27,000
  $28,500 $34,500 $56,750

Solution
• Highest value in dataset = $56,750
• Lowest value in dataset = $24,000
• Range = $56,750 − $24,000 = $32,750

Standard Deviation
• The standard deviation measures how much the data differ from the mean. It is symbolized by “s” when it is calculated for a sample, and by \(\sigma\) (Greek letter sigma) when it is calculated for a population.
  \[ s = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}} \]

To Find the Standard Deviation of a Set of Data
1. Find the mean of the set of data.
2. Make a chart having three columns:
   DataValue | DataValue − Mean | (DataValue − Mean)²
3. List the data vertically under the column marked DataValue.
4. Subtract the mean from each piece of data and place the difference in the DataValue − Mean column.
5. Square the values obtained in the DataValue − Mean column and record these values in the (DataValue − Mean)² column.
6. Determine the sum of the values in the (DataValue − Mean)² column.
7. Divide the sum obtained in step 6 by n − 1, where n is the number of datapoints.
8. Determine the square root of the number obtained in step 7. This number is the standard deviation of the set of data.

Example
• Find the standard deviation of the following prices of selected washing machines:
  $280, $217, $665, $684, $939, $299
• Find the mean.
  \[ \bar{x} = \frac{280 + 217 + 665 + 684 + 939 + 299}{6} = \frac{3084}{6} = 514 \]

Example continued, mean = 514

  Data     Data − Mean     (Data − Mean)²
  217         −297         (−297)² = 88,209
  280         −234                   54,756
  299         −215                   46,225
  665          151                   22,801
  684          170                   28,900
  939          425                  180,625
  Sum            0                  421,516

Example continued, mean = 514
  \[ s = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{421{,}516}{5}} = \sqrt{84{,}303.2} \approx 290.35 \]
• The standard deviation is $290.35.

9.3 The Normal Curve

Types of Distributions
• Rectangular distribution
• J-shaped distribution
• Bimodal
• Skewed to right
• Skewed to left
• Normal

Properties of a Normal Distribution
• The graph of a normal distribution is called the normal curve.
• The normal curve is bell shaped and symmetric about the mean.
• In a normal distribution, the mean, median, and mode all have the same value and all occur at the center of the distribution.

Empirical Rule
• Approximately 68% of all the data lie within one standard deviation of the mean (in both directions).
• Approximately 95% of all the data lie within two standard deviations of the mean (in both directions).
• Approximately 99.7% of all the data lie within three standard deviations of the mean (in both directions).

Normal Distribution

z-Scores
• z-scores determine how far, in terms of standard deviations, a given score is from the mean of the distribution.
  \[ z = \frac{\text{value of the piece of data} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma} \]

Example: z-scores
• A normal distribution has a mean of 50 and a standard deviation of 5. Find z-scores for the following values.
  a) 55   b) 60   c) 43
• a) \( z = \frac{55 - 50}{5} = \frac{5}{5} = 1 \)
  A score of 55 is one standard deviation above the mean.

Example: z-scores continued
• b) \( z = \frac{60 - 50}{5} = \frac{10}{5} = 2 \)
  A score of 60 is 2 standard deviations above the mean.
• c) \( z = \frac{43 - 50}{5} = \frac{-7}{5} = -1.4 \)
  A score of 43 is 1.4 standard deviations below the mean.

To Find the Percent of Data Between any Two Values
1. Draw a diagram of the normal curve, indicating the area or percent to be determined.
2. Use the formula to convert the given values to z-scores. Indicate these z-scores on the diagram.
3. Look up the percent that corresponds to each z-score on page 387-388.
4. a) When finding the percent of data between two z-scores on opposite sides of the mean (when one z-score is positive and the other is negative), find the sum of the individual percents.
   b) When finding the percent of data between two z-scores on the same side of the mean (when both z-scores are positive or both are negative), subtract the smaller percent from the larger percent.
   c) When finding the percent of data to the right of a positive z-score or to the left of a negative z-score, subtract the percent of data between 0 and z from 50%.
   d) When finding the percent of data to the left of a positive z-score or to the right of a negative z-score, add the percent of data between 0 and z to 50%.

Example
Assume that the waiting times for customers at a popular restaurant before being seated for lunch are normally distributed with a mean of 12 minutes and a standard deviation of 3 minutes.
a) Find the percent of customers who wait for at least 12 minutes before being seated.
b) Find the percent of customers who wait between 9 and 18 minutes before being seated.
c) Find the percent of customers who wait at least 17 minutes before being seated.
d) Find the percent of customers who wait less than 8 minutes before being seated.

Solution
a. Wait for at least 12 minutes
   Since 12 minutes is the mean, half, or 50%, of customers wait at least 12 minutes before being seated.
b. Between 9 and 18 minutes
   \[ z_{9} = \frac{9 - 12}{3} = \frac{-3}{3} = -1 \qquad z_{18} = \frac{18 - 12}{3} = \frac{6}{3} = 2 \]
   Use Table 13.7, page 801. The two z-scores are on opposite sides of the mean, so add the percents:
   34.1% + 47.7% = 81.8% (0.341 + 0.477 = 0.818)

Solution continued
c. At least 17 minutes
   \( z = \frac{17 - 12}{3} \approx 1.67 \)
   Use Table 13.7, page 801. 45.3% of the data is between the mean and z = 1.67.
   50% − 45.3% = 4.7%
   Thus, 4.7% of customers wait at least 17 minutes.
d. Less than 8 minutes
   \( z = \frac{8 - 12}{3} \approx -1.33 \)
   Use Table 13.7, page 801. 40.8% of the data is between the mean and z = −1.33.
   50% − 40.8% = 9.2%
   Thus, 9.2% of customers wait less than 8 minutes.

9.4 Linear Correlation and Regression

Linear Correlation
• Linear correlation is used to determine whether there is a relationship between two quantities and, if so, how strong the relationship is.
• The linear correlation coefficient, r, is a unitless measure that describes the strength of the linear relationship between two variables.
  – If the value is positive, as one variable increases, the other increases.
  – If the value is negative, as one variable increases, the other decreases.
  – The value of r is always between –1 and 1 inclusive.

Scatter Diagrams
• A visual aid used with correlation is the scatter diagram, a plot of points (bivariate data).
  – The independent variable, x, generally is a quantity that can be controlled.
  – The dependent variable, y, is the other variable.
• The value of r is a measure of how far a set of points varies from a straight line.
  – The greater the spread, the weaker the correlation and the closer the r value is to 0.
  – The smaller the spread, the stronger the correlation and the closer the r value is to 1 or –1.

Correlation

Linear Correlation Coefficient
• The formula to calculate the correlation coefficient (r) is as follows:
  \[ r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{n(\Sigma x^2) - (\Sigma x)^2}\,\sqrt{n(\Sigma y^2) - (\Sigma y)^2}} \]

Example: Words Per Minute versus Mistakes
There are five applicants applying for a job as a medical transcriptionist. The following shows the results of the applicants when asked to type a chart. Determine the correlation coefficient between the words per minute typed and the number of mistakes.

  Applicant   Words per Minute   Mistakes
  Ellen              24              8
  George             67             11
  Phillip            53             12
  Kendra             41             10
  Nancy              34              9

Solution
• We will call the words typed per minute x, and the mistakes y.
• List the values of x and y and calculate the necessary sums.

  WPM (x)   Mistakes (y)     x²       y²      xy
     24           8          576      64     192
     67          11         4489     121     737
     53          12         2809     144     636
     41          10         1681     100     410
     34           9         1156      81     306
  Σx = 219    Σy = 50   Σx² = 10,711   Σy² = 510   Σxy = 2,281

Solution continued
• The n in the formula represents the number of pieces of data. Here n = 5.
  \[ r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{n(\Sigma x^2) - (\Sigma x)^2}\,\sqrt{n(\Sigma y^2) - (\Sigma y)^2}} = \frac{5(2281) - (219)(50)}{\sqrt{5(10{,}711) - (219)^2}\,\sqrt{5(510) - (50)^2}} \]

Solution continued
  \[ r = \frac{11{,}405 - 10{,}950}{\sqrt{53{,}555 - 47{,}961}\,\sqrt{2550 - 2500}} = \frac{455}{\sqrt{5594}\,\sqrt{50}} \approx 0.86 \]

Solution continued
• Since 0.86 is fairly close to 1, there is a fairly strong positive correlation.
• This result suggests that applicants who type more words per minute tend to make more mistakes.
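
The correlation computation above can be reproduced with a short Python sketch. It applies the same formula for r to the five applicants' data; the variable names are illustrative only.

```python
from math import sqrt

wpm = [24, 67, 53, 41, 34]     # x: words per minute
mistakes = [8, 11, 12, 10, 9]  # y: number of mistakes

n = len(wpm)
sum_x = sum(wpm)
sum_y = sum(mistakes)
sum_xy = sum(x * y for x, y in zip(wpm, mistakes))
sum_x2 = sum(x * x for x in wpm)
sum_y2 = sum(y * y for y in mistakes)

# r = [n(Σxy) − (Σx)(Σy)] / [sqrt(n(Σx²) − (Σx)²) · sqrt(n(Σy²) − (Σy)²)]
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 219 50 2281 10711 510
print(round(r, 2))                           # 0.86
```
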

Linear Regression
• Linear regression is the process of determining the linear relationship between two variables.
• The line of best fit (regression line or least squares line) is the line such that the sum of the squares of the vertical distances from the line to the data points (on a scatter diagram) is a minimum.

The Line of Best Fit
• Equation: \( y = mx + b \), where
  \[ m = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{n(\Sigma x^2) - (\Sigma x)^2} \quad \text{and} \quad b = \frac{\Sigma y - m(\Sigma x)}{n} \]

Example
• Use the data in the previous example to find the equation of the line that relates the number of words per minute and the number of mistakes made while typing a chart.
• Graph the equation of the line of best fit on a scatter diagram that illustrates the set of bivariate points.

Solution
• From the previous results, we know that
  \[ m = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{n(\Sigma x^2) - (\Sigma x)^2} = \frac{5(2{,}281) - (219)(50)}{5(10{,}711) - (219)^2} = \frac{455}{5594} \approx 0.081 \]

Solution
• Now we find the y-intercept, b.
  \[ b = \frac{\Sigma y - m(\Sigma x)}{n} = \frac{50 - 0.081(219)}{5} = \frac{32.261}{5} \approx 6.452 \]
• Therefore the line of best fit is y = 0.081x + 6.452.

Solution continued
• To graph y = 0.081x + 6.452, plot at least two points and draw the graph.

   x      y
  10    7.262
  20    8.072
  30    8.882
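
The slope and intercept above can be checked with the following Python sketch, which reuses the sums from the words-per-minute example. Note one assumption made explicit here: the slides round m to 0.081 before computing b, which gives 6.452; carrying the unrounded slope through gives an intercept of about 6.437 instead. Either line is a reasonable answer at this level of rounding.

```python
# Sums taken from the words-per-minute example above.
n = 5
sum_x, sum_y = 219, 50
sum_xy, sum_x2 = 2281, 10711

# Slope: m = [n(Σxy) − (Σx)(Σy)] / [n(Σx²) − (Σx)²]
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Intercept: b = [Σy − m(Σx)] / n
b = (sum_y - m * sum_x) / n                    # using the unrounded slope
b_rounded = (sum_y - round(m, 3) * sum_x) / n  # slope rounded to 0.081 first, as in the slides

print(round(m, 3))          # 0.081
print(round(b_rounded, 3))  # 6.452
print(round(b, 3))          # 6.437

# Points on the slides' line y = 0.081x + 6.452
for x in (10, 20, 30):
    print(x, round(0.081 * x + 6.452, 3))
```
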