Minitab Demonstrations
by Bruce E. Trumbo
Department of Statistics, CSU Hayward

Part 1: IQ Scores

Setup

In this demonstration you will use the Minitab worksheet MTBDEM.MTW. You need to open this worksheet from disk so that it is ready for use within Minitab. Start Minitab. In the menus at the top of the screen use FILE ➯ Open Worksheet; then select the file name from its disk location. (Notice that there are things other than worksheets that can be opened.)

This worksheet contains the data for Parts 1-5 of these notes. Ordinarily, the data for each part would be put into its own separate worksheet, but we put them all together here to minimize file handling in these introductory sessions.

If you do not see the Minitab prompt MTB > in the Session window when you start Minitab, then do the following. Click anywhere on the Session window so that it is "active," as indicated by a colored bar at the top. Then in the menus use EDITOR ➯ Enable Commands. (Be sure to use EDITOR, not EDIT.)

The Data

In this part you will use the following columns. Their exact contents will be explained as we go along.

    c12   Origl IQ
    c13   Final IQ

The data for Part 1 are IQ scores of 250 high school students in the San Francisco Bay Area, collected for a master's thesis in Educational Psychology at CSU Hayward.

Exploration of the Data

We use several graphical and numerical methods to explore the IQ data.

Dotplots

The dotplot is one of the simplest graphical devices. Each observation is represented by a dot appropriately placed along a horizontal axis. If several observations have the same (or nearly the same) value, they are stacked vertically.

In Minitab you might make a dotplot in either of two ways:

You may type the command DOTPlot, followed by the column identifier (here 'Origl IQ' or c12).
Use the apostrophe (on most keyboards near the RETURN key) for both the beginning and ending single quotes; in Minitab, never use the left single quote (on most keyboards at the left of the top row of keys, near the 1). Minitab does not distinguish between capital and small letters in commands. We capitalize the first four letters here to signify that they are the only ones required. (If a command name has more than four letters, you need to type only the first four. But you may type the entire command name if you like.)

Alternatively, you may follow the menu path GRAPH ➯ Character ➯ Dotplot, and then select c12 (Origl IQ).

In these notes the menu path for Windows is shown at the beginning of each display on a gray background, followed by the corresponding command in courier typeface.

Copyright © 1993, 1996, 2003 by Bruce E. Trumbo. All rights reserved. Intended for use in the Statistics Department at California State University, Hayward. Please request permission for other uses.

Notes on commands and menus: Minitab is command-based software. The menus are just a way to generate commands. For the most basic procedures, if you know the command name, it is probably easier to type it after the program-generated prompt MTB >. However, if you don't know the name of the command you need for a procedure, the menus are laid out logically enough that they may help you to find the procedure you want. Also, for complex procedures (especially graphical ones with many possibilities for annotations and options) it is often easier to use menus.

    GRAPH ➯ Character ➯ Dotplot, select 'Origl IQ'

    MTB > dotp 'Origl IQ'

    [Character dotplot of Origl IQ. Horizontal axis labeled 60, 90, 120, 150, 180, 210.]

From this dotplot of the data, we see that most of the IQ scores lie between 70 and 130, with a few outside this interval on both sides. However, the striking thing is the extreme IQ score of almost 200. From what we know about IQ scores we suppose this is an error.

Boxplots

The boxplot of a dataset is based on the "five-number summary" of the observations. From smallest to largest these five numbers are:

    The minimum
    The lower quartile (lower end of box)
    The median (symbol within box)
    The upper quartile (upper end of box), and
    The maximum.

Notice that the "middle half" of the observations fall within the box of the boxplot.

An outlier is a value that falls relatively far away from the rest of the values in a dataset.

Minitab can make two styles of boxplot: one using character (standard) graphics and one using pixel (professional) graphics. We begin with character graphics. A Minitab character graphics boxplot signals "probable" outliers with the symbol O and "possible" ones with *.

    GRAPH ➯ Character ➯ Boxplot, select 'Origl IQ'

    MTB > GSTD
    MTB > boxp 'Origl IQ'

    [Character boxplot of Origl IQ: a box drawn as I + I with whiskers, followed on the right by one value marked * and, farther out, one value marked O. Horizontal axis labeled 60, 90, 120, 150, 180, 210.]

This boxplot explicitly highlights the extreme value we noticed in the dotplot, and labels it as a probable outlier. (It turns out that the observation marked * is not an error, but indicates an exceptionally bright student.)

A professional graphic boxplot runs vertically instead of horizontally, does not distinguish between probable and possible outliers, and can be embellished in various ways (most conveniently using menus) that we do not discuss here.
The procedure and the resulting graphic are shown below.

    GRAPH ➯ Boxplot, y = 'Origl IQ', x unspecified (left blank).

    MTB > GPRO
    MTB > boxp 'Origl IQ'

    [Professional graphics boxplot of Origl IQ, drawn vertically. Vertical axis labeled 50, 100, 150, 200.]

Note: (a) When Minitab starts up, it is in professional graphics mode. If you make only professional graphs, then you do not need to use the commands GPRO and GSTD to switch between modes. Even though character graphs are somewhat less precise, they are often just as effective, often take up less room on the page, can be easily labeled using a word processor, and take up less file space. (b) When cutting a character graph from Minitab and pasting it into a word processor, always cut one blank line above and below the graph you want to move, and be sure to format the graph using a monospace font such as Courier in order to preserve the spacing.

Numerical Descriptive Statistics

Minitab makes it easy to compute a number of numerical descriptive statistics for a dataset. (Your output may look a little different, depending on the release of Minitab used.)

    STAT ➯ Basic ➯ Descriptive, select 'Origl IQ'

    MTB > desc 'Origl IQ'

                Origl IQ
    N             250
    MEAN          100.32
    MEDIAN        100.00
    TRMEAN        100.21

                Origl IQ
    MIN            58.00
    MAX           196.00
    Q1             90.00
    Q3            112.00
    STDEV          16.52
    SEMEAN          1.04

The crucial information here is the maximum value MAX = 196. This is the exact numerical value of the outlier seen in the dotplot and the boxplot above.

Notes on the other descriptive statistics shown above (check your textbook for the definitions):

N = 250, the sample size. (Minitab uses N here, but most texts use n for sample size and N for population size.)

The sample MEAN = 100.32. (Most texts use X̄ or Ȳ for the sample mean.)

The sample MEDIAN = 100.00.

The sample standard deviation STDEV = 16.52.
TRMEAN stands for the trimmed mean of the sample, computed by ignoring the highest 5% and lowest 5% of the data and averaging the middle 90%; this quantity is not as sensitive to erratic extreme values as is the mean.

Q1 and Q3 are the lower and upper quartiles of the sample.

SEMEAN is the (estimated) standard error of the mean, equal to the sample standard deviation divided by the square root of the sample size; this quantity is used in statistical inference.

Data Cleaning

In the actual situation upon which these data are based, the researcher rechecked the original list of IQ scores and found that the value 196 resulted from a data input error; the correct value is 96. The data in c13 ('Final IQ') are identical to those in c12 ('Origl IQ') except that this error has been corrected. Now we repeat our work, using the corrected data.

Notice that the dotplot below uses a different scale, appropriate to the span of the corrected data. (For variety we have designated the column with the corrected data as c13 instead of 'Final IQ' in the command. Column names are often easier to remember, but column numbers are easier to type. If a column has a name, Minitab will always use it in the output, no matter whether you used its name or its number in the command.)

    GRAPH ➯ Character ➯ Dotplot, select 'Final IQ'

    MTB > GSTD
    MTB > dotp c13

    [Character dotplot of Final IQ. Horizontal axis labeled 60, 80, 100, 120, 140, 160.]

Here is a comparison of the numerical descriptive statistics for the incorrect and corrected IQ data.
(Notice that descriptive statistics can be computed for more than one column at a time.)

    STAT ➯ Basic ➯ Descriptive, select both 'Origl IQ' and 'Final IQ'

    MTB > desc 'Origl IQ' 'Final IQ'

                Origl IQ    Final IQ
    N             250         250
    MEAN          100.32       99.920
    MEDIAN        100.00      100.000
    TRMEAN        100.21      100.076

                Origl IQ    Final IQ
    MIN            58.00       58.000
    MAX           196.00      150.000
    Q1             90.00       90.000
    Q3            112.00      112.000
    STDEV          16.52       15.367
    SEMEAN          1.04        0.972

The incorrect observation changed the mean by 0.4 of an IQ point (giving 100.3 compared with a correct mean of 99.9), the trimmed mean by about 0.1 of an IQ point, and the median not at all. The largest discrepancy is in the standard deviation, which drops from 16.52 to 15.37 once the error is corrected.

Comments

Unlike "textbook" examples, real data almost always contain some errors. In beginning to study a dataset it is well to use a number of graphical and numerical devices to screen the data for unreasonable and inconsistent values. Using a computer with statistical software such as Minitab, we find it easy to take such a critical look at a dataset before we try to draw conclusions from it, even if the sample size is fairly large, as in the present case. Consider for a moment how much work would be required to duplicate the work shown in Part 1 if we had to do it using pencil, graph paper, and a hand calculator.
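Several figures quoted in Part 1 can be rechecked from the reported summary statistics alone, without the worksheet. The Python sketch below is not part of the original Minitab session; it is one possible way to do the arithmetic. It recomputes SEMEAN as STDEV divided by the square root of N, and it confirms that changing one score from 196 to 96 lowers the mean of 250 scores by 100/250 = 0.4. It then classifies the two maxima using fences at 1.5 and 3 interquartile ranges beyond the quartiles. That fence rule is an assumption: it is the conventional boxplot rule, not something stated in these notes, and Minitab's character boxplot may compute its box ends slightly differently from Q1 and Q3. The function names are hypothetical.

```python
import math

# Summary statistics copied from the Minitab output shown in Part 1.
N = 250
STDEV_ORIGL, STDEV_FINAL = 16.52, 15.367
MEAN_ORIGL = 100.32
Q1, Q3 = 90.0, 112.0
MAX_ORIGL, MAX_FINAL = 196.0, 150.0


def se_mean(stdev, n):
    """Estimated standard error of the mean: stdev / sqrt(n)."""
    return stdev / math.sqrt(n)


def mean_after_correction(old_mean, n, old_value, new_value):
    """Mean after one observation changes from old_value to new_value."""
    return old_mean + (new_value - old_value) / n


def classify(value, q1, q3):
    """Label a value using 1.5*IQR and 3*IQR fences.

    ASSUMPTION: conventional fence rule, not taken from the notes.
    """
    iqr = q3 - q1
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    if value < outer_lo or value > outer_hi:
        return "probable outlier"
    if value < inner_lo or value > inner_hi:
        return "possible outlier"
    return "not flagged"


if __name__ == "__main__":
    print(round(se_mean(STDEV_ORIGL, N), 2))   # 1.04, as reported for Origl IQ
    print(round(se_mean(STDEV_FINAL, N), 3))   # 0.972, as reported for Final IQ
    print(round(mean_after_correction(MEAN_ORIGL, N, 196, 96), 2))  # 99.92
    print(classify(MAX_ORIGL, Q1, Q3))         # probable outlier
    print(classify(MAX_FINAL, Q1, Q3))         # possible outlier
```

Under the assumed rule the fences are 145 and 178, so 196 lies beyond the outer fence while the corrected maximum of 150 lies between the two. This is consistent with the O and * symbols in the character boxplot, although the notes do not say which score the * represents.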
