M116 – TI 83/84 CALCULATOR – CH 1 Section 1.2 – Random number generator

1) Select 3 students at random from your Statistics class.

a) There are 28 students in your class. Select 3 students at random. Use the TI-83 calculator to generate 3 random integers from 1 to 28.
   The instruction in the home screen of your calculator should read: randInt(1,28,3)
   Here are the steps to accomplish this:
      Press MATH, arrow right to PRB, and select 5:randInt(
      Type 1,28,3)
      Notice: the “,” is the black key above the key for the number 7
      Press ENTER
b) List the three numbers obtained. If some of the numbers are repeated, press ENTER again and select as many numbers as necessary to complete the list of three different integers.
c) Check with a classmate. Are his/her numbers the same as yours? Explain.
d) Check with the class roster shown on the transparency to name the students selected.
e) Comment on the importance of random selection.
f) You have 5 minutes to get to know the 3 students in your list.
g) How can we do this using the random number table from our book?

Page 1

M116 – NOTES – CH 1 Section 1.1 – Exploring Sampling Techniques

Simple random sample

Systematic random sampling:
   Step 1: Divide the population size by the sample size, and round the result down to the nearest whole number. Call it m.
   Step 2: Select at random a number between 1 and m, and call it k.
   Step 3: Select for the sample those members of the population that are numbered k, k + m, k + 2m, etc.

Cluster sampling:
   Step 1: Divide the population into groups (clusters).
   Step 2: Obtain a simple random sample of the clusters.
   Step 3: Use all the members of the clusters obtained in Step 2 as the sample.

Stratified random sampling with proportional allocation:
   Step 1: Divide the population into subpopulations (strata).
   Step 2: From each stratum, obtain a simple random sample of size proportional to the size of the stratum.
   Step 3: Use all the members obtained in Step 2 as the sample.
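For readers who want to check the sampling steps above away from the calculator, here is a minimal Python sketch. It is only an illustration, not part of the handout: the function names and the stratum sizes in the usage lines are hypothetical. Note that random.sample draws without repetition, which stands in for pressing ENTER again until the integers are all different.

```python
import random


def simple_random_sample(population_size, sample_size, rng):
    """Pick distinct member numbers from 1..population_size."""
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


def systematic_sample(population_size, sample_size, rng):
    """Follow the three systematic-sampling steps from the notes."""
    m = population_size // sample_size   # Step 1: divide and round down
    k = rng.randint(1, m)                # Step 2: random start between 1 and m
    # Step 3: members numbered k, k + m, k + 2m, ...
    return [k + i * m for i in range(sample_size)]


def proportional_allocation(strata_sizes, sample_size):
    """Sample size per stratum, proportional to the stratum size.

    Rounded sizes may not always add up exactly to sample_size;
    adjust by hand in that case.
    """
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}


rng = random.Random()  # pass a seed, e.g. random.Random(1), for repeatable results
print(simple_random_sample(28, 3, rng))
print(systematic_sample(28, 3, rng))
# Hypothetical class sizes, used only to illustrate the allocation.
print(proportional_allocation({"A": 40, "B": 35, "C": 30}, 35))
```

In the systematic sample, the largest selected number is k + (n - 1)m, which never exceeds the population size because k is at most m.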
Convenience sampling

2) Exploring sampling techniques

Consider the following population: students who are enrolled in Professor Aronne’s 3 Statistics classes during the Spring semester of 2007.

a) Discuss how you would select a simple random sample of 35 students from this population.
   i) Without a random number generator.
   ii) With a random number generator.
b) Discuss how you would select 35 students by using the systematic method.
c) Discuss how you would select a sample from this population by using each of the following methods:
   i) Convenience method
   ii) Stratified method
   iii) Cluster method

Page 2

M116 – TI 83/84 CALCULATOR – CH 2 Section 2.2 – Using the calculator to Create a New List, Sort, and construct a Frequency Distribution

3) To get ready for this activity, create a new list labeled GLUCO.
   Here are the steps to accomplish this:
      Press STAT
      Select 1:Edit
      Arrow right and up until the cursor is on the name of the last list of your editor (the name has to be highlighted)
      Arrow right and type the name of the new list: GLUCO
      Press ENTER
      Enter the data from problem 2 page 67. Press ENTER after each entry. All the numbers should go into the same list.

4) Construct a frequency distribution of 6 classes for the GLUCO data.

a) Calculate the class width:
      class width = (largest value - lowest value) / (number of classes)   (rounded up)
b) Use the smallest number as the lower limit of the first class. Obtain all other lower limits by adding the class width. Then write the upper limits.

      Classes | Frequency

c) In order to determine the frequencies we are going to SORT the list GLUCO, and then explore the list to count how many values are in each of the classes.
   To SORT the list:
      Press STAT, select 2:SortA(
      Press 2nd STAT [LIST] to select the list GLUCO
      Press ENTER
   Then, get into the editor by pressing STAT, 1:Edit and scroll down to determine the frequencies. Count how many numbers are in each class and record them in the table from part (b).
Page 3

d) Using your results from part (b), complete the following table:

      Class limits | Class boundaries | Class midpoint | Frequency | Relative frequency | Cumulative frequency

e) Sketch the corresponding histogram and label it.
f) Use the same graph to sketch the corresponding frequency polygon for the data.
g) Sketch the corresponding ogive.
h) Sketch a Stem and Leaf plot for the GLUCO data (from #2 on page 67).
i) Sketch a Dot Plot for the GLUCO data. Dot plots are explained in problem #17 on pages 73 and 74.

Page 4

M116 – TI 83/84 CALCULATOR – CH 2 Section 2.2 – Using the calculator to Sketch Histograms for Raw Data

5) Use the calculator to sketch a histogram for the data stored in GLUCO.
Here are the steps to accomplish this:

1st: Set up the histogram
   Press 2nd Y= [STAT PLOT]
   Select 1:Plot1… (or any other plot)
   Turn the plot ON by pressing ENTER
   Arrow down and to the right to select the histogram
   Indicate GLUCO for the location of the data in Xlist
   To select GLUCO press 2nd STAT [LIST], scroll down and press ENTER to select
   Indicate 1 for Freq (Notice: Press ALPHA 1)

2nd: Set up the WINDOW
   To sketch a histogram with a specific class width, we need to set up the window values according to the specifications given below. You will need some numbers from the classes produced on the previous page.
   Press WINDOW
   Use the following values:
      Xmin = lower class limit of the first class
      Xmax = lower class limit of the next class beyond the data (Xmin + (number of classes)*(class width))
      Xscl = class width
      Ymin = -5
      Ymax = a number larger than the largest frequency (try any number, then adjust if necessary)
      Yscl = 1
      Xres = 1
   Press GRAPH

3rd: Read the frequencies
   Press TRACE and arrow to the right to read the classes and frequencies. Make sure the classes agree with the ones obtained on the previous page.

Sketch the histogram here.
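The class construction from item 4 and the WINDOW values from item 5 can be cross-checked with a short Python sketch. This is only an illustration under stated assumptions: the data are whole numbers, the sample values are made up (they are not the GLUCO data), and an exact quotient is bumped up to the next whole number so that the largest value still fits in the last class, a common convention that the handout does not spell out. The function names are hypothetical.

```python
import math


def class_width(data, num_classes):
    """(largest - lowest) / number of classes, rounded up.

    Assumption: an exact quotient also goes up to the next whole
    number so the largest value falls inside the last class.
    """
    return math.floor((max(data) - min(data)) / num_classes) + 1


def frequency_table(data, num_classes):
    """Return the class width and a list of (lower, upper, frequency)."""
    width = class_width(data, num_classes)
    lowest = min(data)
    rows = []
    for i in range(num_classes):
        lower = lowest + i * width
        upper = lower + width - 1          # whole-number data assumed
        freq = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, freq))
    return width, rows


def histogram_window(rows, width):
    """WINDOW values following the rules listed in item 5."""
    xmin = rows[0][0]
    return {
        "Xmin": xmin,
        "Xmax": xmin + len(rows) * width,
        "Xscl": width,
        # Any number above the largest frequency works; +1 is arbitrary.
        "Ymax": max(freq for _, _, freq in rows) + 1,
    }


# Made-up values for illustration only.
sample = [45, 52, 58, 61, 63, 67, 70, 71, 74, 78,
          80, 83, 85, 88, 90, 94, 99, 103, 107, 109]
width, rows = frequency_table(sample, 6)
for lower, upper, freq in rows:
    print(f"{lower}-{upper}: {freq}")
print(histogram_window(rows, width))
```

With Xscl equal to the class width and Xmin equal to the first lower limit, the bars drawn by the calculator line up with the hand-built classes, which is why TRACE should report the same frequencies.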
Page 5

M116 – TI 83/84 CALCULATOR – CH 2

6) Use the calculator to sketch a histogram for the grouped data from part 4-d (use L1, L2).
   Enter midpoints into L1 and frequencies into L2.
   In the STAT PLOT window, when you select the histogram, indicate L1 for Xlist and L2 for Freq.
   If you still have the same WINDOW selections as indicated on the previous page, press GRAPH and TRACE to check on the class limits and frequencies.

7) Explore the feature ZOOM 9:ZoomStat.
   Press TRACE, arrow to the right and observe the frequencies. Are they the same as the ones obtained before? What is the class width? What are the class limits of the first and second class?

Page 6

M116 – TI 83/84 CALCULATOR – CH 3 Sections 3.1-3.4 – Using the calculator to Find the Mean, Median, Standard Deviation, and 5-number Summary

8) Use the data from problem 2, page 67, which you have stored into the list GLUCO, to find the mean, standard deviation and the 5-number summary.

Raw Data (list of all 70 numbers listed on page 67)
   Instructions in the home screen should read: 1-Var Stats GLUCO
      Press STAT
      Arrow to CALC
      Select 1:1-Var Stats
      Select the list GLUCO from the 2nd STAT [LIST] menu
      Press ENTER

Grouped Data (use midpoints and frequencies. See page 4)
   Instructions in the home screen should read: 1-Var Stats L1,L2
      Enter midpoints into a list (L1)
      Enter frequencies into another list (L2)
      Press STAT
      Arrow over to CALC
      Press 1:1-Var Stats
      Select L1, L2
      Press ENTER

Observe the values obtained for the raw data and for the grouped data. Are they the same? If not, why is that? Which answers are exact?

Page 7

M116 – TI 83/84 CALCULATOR – CH 3 Section 3.4 – Using the calculator to construct Box-and-Whisker Plots, and TRACE to find the 5-number summary

9) Use the data from problem 2, page 67 (which is stored in the list GLUCO), to construct a box plot.
Here are the steps to accomplish this:
   Press 2nd Y= [STAT PLOT]
   Turn one Plot ON, and make sure all others are OFF.
   Arrow down and right to select the box plot that shows the outliers
   Select GLUCO for Xlist (from the 2nd STAT [LIST] menu)
   Select 1 for Freq
   Press ZOOM 9 (this automatically opens an appropriate window)
   Press TRACE and use the left-right arrows to obtain the 5-number summary

_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_

10) Constructing the Box Plot and Histogram for the same data
Here are the steps to accomplish this:
   Turn ON a second plot
   Select a histogram for the data stored in list GLUCO
   Press GRAPH
   If necessary, press the WINDOW key and select a larger number for Ymax to provide enough space to graph the histogram and the box plot.

_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_

Page 8

M116 – NOTES – CH 3 Section 3.2 - Chebyshev’s Theorem

For any set of data (either population or sample) and for any constant k greater than 1, the proportion of the data that must lie within k standard deviations on either side of the mean is at least 1 - 1/k^2.

For any set of data:
   At least 75% of the data fall within the interval from µ - 2σ to µ + 2σ (within 2 standard deviations of the mean)
   At least 89% of the data fall within the interval from µ - 3σ to µ + 3σ (within 3 standard deviations of the mean)
   At least 93.8% of the data fall within the interval from µ - 4σ to µ + 4σ (within 4 standard deviations of the mean)

11) Use the GLUCO data (from #2, page 67) to determine a Chebyshev interval about the mean in which
a) At least 75% of the data fall
b) At least 89% of the data fall
c) At least 93.8% of the data fall
d) Explore the SORTED data in the GLUCO list and determine the actual percentage of values that lie
   i) Within two standard deviations of the mean
   ii) Within three standard deviations of the mean
   iii) Within four standard deviations of the mean

Page 9

M116 – NOTES Empirical Rule and Range Rule of Thumb

Empirical Rule (section 6.1)
For a distribution that is symmetrical and bell-shaped (normal distribution):
   About 68% of the data fall within the interval from µ - σ to µ + σ (within 1 standard deviation of the mean)
   About 95% of the data fall within the interval from µ - 2σ to µ + 2σ (within 2 standard deviations of the mean)
   About 99.7% of the data fall within the interval from µ - 3σ to µ + 3σ (within 3 standard deviations of the mean)

Range rule of thumb (section 6.2)
The range rule of thumb is based on the principle that for many data sets (symmetrical, bell shaped), the vast majority (such as 95%) of sample values lie within two standard deviations of the mean.
   To roughly estimate the standard deviation, use:
      s ~ (highest value – lowest value)/4
   To roughly estimate the minimum and maximum “usual” sample values, use:
      Minimum “usual” value ~ mean – 2 * standard deviation
      Maximum “usual” value ~ mean + 2 * standard deviation

11-e) Do the percentages obtained in part (d) on the previous page suggest that the GLUCO distribution is bell shaped?
11-f) Which values of the GLUCO data are usual, and which ones are unusual?

Page 10

M116 – TI 83/84 CALCULATOR – CH 2-3 Loading Data Sets into your calculator

Here are the data sets that will be loaded into your calculator. Come to my office to get them.
   CRGVL = Regular Coke Volume (oz)
   PRGVL = Regular Pepsi Volume (oz)
   CDTWT = Diet Coke Weight (lb)
   CRGWT = Regular Coke Weight (lb)
   FHED = Head circumferences of Two-Month-Old Baby-Girls (cm)
   MHED = Head circumferences of Two-Month-Old Baby-Boys (cm)
   FEMAL = ages of Females who finished a recent New York City Marathon
   MALE = ages of Males who finished a recent New York City Marathon

Uploading a list from the memory into the editor of the calculator
Upload the data from the CRGVL list into the editor.
   Press STAT
   Select 1:Edit
   Arrow up and to the right until we get into a list that has NO NAME
   Press 2nd STAT [LIST]
   Arrow down and select the list CRGVL and press ENTER twice.

Page 11

M116 – TI 83/84 CALCULATOR – CH 2-3

12) What does the distribution of Volumes of Regular Coke cans look like?
Before constructing any graphs, think about the following:
a) Think about selecting a sample of regular Coke cans, recording their volumes, and using the calculator to sketch a histogram. What do you think the histogram will look like? What shape will this distribution have?
b) Now let’s look at the data that we have in CRGVL. Is your prediction correct?
c) Now use the calculator to sketch a histogram for the data set CRGVL. Is the histogram what you predicted? Comment on the results. Also, press TRACE and write the classes and frequencies obtained.

13) Let’s observe two graphs together for the same data set.
   Set up a second STAT PLOT with a box plot for the data CRGVL.
   Press ZOOM 9:ZoomStat. You may need to press the WINDOW key of the calculator and change Ymax to fit both graphs.
   Write the five-number summary for the data.

Page 12

M116 – TI 83/84 CALCULATOR – CH 2-3 Comparing Data Sets

14) Do you think the distribution of volumes for regular Pepsi will look the same as the one for regular Coke?
   CRGVL = Regular Coke Volume (oz)
   PRGVL = Regular Pepsi Volume (oz)

a) Now let’s look at box plots for both distributions CRGVL and PRGVL. Turn both plots ON and press ZOOM 9:ZoomStat to select a window. Is it what you predicted? Comment on your results. Use the scale provided below as a guide to sketch the box plots.
b) Record the 5-number summary and the outliers for each of the distributions.
c) Also mention the smallest and largest numbers of each distribution that are not outliers.

_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_

d) Which of the two data sets has
   (i) A larger standard deviation?
   (ii) A larger mean?
   (iii) A larger median?
   (iv) A larger IQR?
   (v) A larger range?
e) The 75th percentile of the CRGVL distribution is about __________ which is the same as the _________th percentile of the PRGVL distribution.
f) For each distribution, give the range of the middle 50% of the data.
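The quantities asked for in item 14 (five-number summary, IQR, outliers, and the smallest and largest values that are not outliers) can be sketched in Python as follows. This is a hedged illustration, not the calculator's documented algorithm. It assumes that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd, and that outliers are flagged with the usual 1.5 × IQR rule, which the handout does not state. The data values and function names are hypothetical.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper
    halves, excluding the middle value when the count is odd.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half + 1:] if len(xs) % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


def box_plot_details(data):
    """IQR, outliers, and whisker ends for a box plot with outliers."""
    low, q1, med, q3, high = five_number_summary(data)
    iqr = q3 - q1                      # spread of the middle 50%
    lower_fence = q1 - 1.5 * iqr       # assumed 1.5 * IQR rule
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in sorted(data)
                if x < lower_fence or x > upper_fence]
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "five_number_summary": (low, q1, med, q3, high),
        "iqr": iqr,
        "outliers": outliers,
        # Smallest and largest values that are not outliers.
        "whisker_ends": (min(inside), max(inside)),
    }


# Made-up values for illustration only (not one of the loaded lists).
sample = [2, 4, 5, 7, 8, 9, 10, 12, 30]
details = box_plot_details(sample)
print(details)
```

The middle 50% of the data runs from Q1 to Q3, so the answer to part (f) can be read directly from the second and fourth entries of the five-number summary.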
Page 13

M116 – TI 83/84 CALCULATOR – CH 2-3

15) Comparing Weights of Diet Coke and Regular Coke by using Box Plots
   CDTWT = Diet Coke Weight (lb)
   CRGWT = Regular Coke Weight (lb)

Before constructing any graphs, think about both box plots.
a) Do you think they will have the same length (range)? Will they have the same minimum and maximum, or will one of the plots be farther to the right than the other? If so, which will be to the right?
b) Construct a box plot for each of the distributions of the weights of regular and diet Coke. Display both plots in the same window. Is it what you predicted? Compare the graphs and determine whether there appears to be a significant difference between the two distributions. If so, provide a possible explanation for the difference.
   Use the scale provided below as a guide to sketch the box plots. Record the 5-number summary and the outliers for each of the distributions. Also mention the smallest and largest numbers of each distribution that are not outliers.

_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_____|_

16) Go back and look at each of the graphs obtained throughout this activity. Think about the measures of center and variation for the data set (Chapter 3). Do you think the mean and median are the same? If not, which one will be larger? Think about the standard deviation. Will it be a large number or a small number? Try to tie concepts from chapters 2 and 3 together.

Page 14

M116 – NOTES Choosing an appropriate number to describe the data

Measuring the center of a distribution
   The mean cannot resist the influence of extreme observations. It is not a resistant measure of the center.
   The median is a resistant measure of the center.
   If the distribution is symmetric, the mean and median are the same.
   If the distribution is close to symmetric, the mean and median are very close in value.
   In a skewed distribution, the mean is farther out in the long tail than is the median.

Measuring the spread of a distribution – Box Plots and the 5-number summary
   The minimum and maximum values show the full spread of the data (but they may be outliers).
   The interquartile range marks the spread of the middle half of the data.
   In a symmetric distribution, the first and third quartiles are equally distant from the median.
   In most distributions that are skewed to the right, the third quartile will be farther above the median than the first quartile is below it.
   The standard deviation measures spread by looking at how far the observations are from their mean.

Choosing measures of center and spread
   The five-number summary is usually better than the mean and standard deviation for describing a skewed distribution or a distribution with strong outliers.
   Use the mean and standard deviation only for reasonably symmetric distributions that are free of outliers.

Example 1: Distributions of incomes are usually skewed to the right. Which measure of the center is more appropriate? Why?
   Reports about incomes and other strongly skewed distributions usually give the median rather than the mean.

Example 2: The mean and median selling price of existing single-family homes sold in June 2002 were $163,900 and $210,900. Which of these numbers is the mean and which is the median? Explain how you know.

Page 15
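The idea of a resistant measure from the notes above can be illustrated with a small Python sketch. The income figures are invented for the illustration (in thousands of dollars) and are not taken from the handout; the helper name is hypothetical.

```python
from statistics import mean, median


def center_summary(values):
    """Mean and median of a list of numbers."""
    return {"mean": mean(values), "median": median(values)}


# Invented incomes, in thousands of dollars.
incomes = [30, 35, 40, 45, 50]
# Same data with one extreme observation added to the right tail.
with_extreme = incomes + [400]

before = center_summary(incomes)
after = center_summary(with_extreme)
print("symmetric data:    ", before)
print("one extreme value: ", after)
```

In the symmetric list the mean and the median agree. Adding a single extreme value pulls the mean far toward the long tail while the median barely moves, which is the sense in which the median is resistant.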