Statistics 312 – Uebersax
http://www.john-uebersax.com/stat312/
06 Measures of Central Tendency & Dispersion

Old Business
  - Picking up pace
  - Mac issues

Topics to Cover
  - Frequency distributions/histograms in Mac Excel
  - Review last homework
  - Central tendency: mean, median, mode
  - Percentiles, quartiles
  - Descriptive statistics in JMP
  - Measures of dispersion: the variance
  - Homework assignment

1. Frequency Distributions/Histograms with Mac

Mac only: Status of Excel Data Analysis ToolPak
  - Not available for Mac as of 2008
  - Alternative: StatPlus:MacLE
  - Download and install: http://www.analystsoft.com/en/products/statplusmacle/
  - Might not make frequency distributions (so see below)

Frequency Distributions/Histograms via Excel FREQUENCY() function
  - Place data in one column (e.g., a1:a10)
  - Place bins in another column (e.g., b1:b4)
  - In another column, select a vertical range of blank cells containing one more cell than the bin array (e.g., c1:c5)
  - Type the formula =FREQUENCY(a1:a10, b1:b4), then press COMMAND+ENTER (Mac) or CONTROL+SHIFT+ENTER (PC)
  - Note: double-check the bin range address; the formula editor may obscure the first cell.
  - Explained here: http://support.microsoft.com/kb/100122

Class demonstration: Female Dover sole lengths

2. Review Last Homework

1. For males and females separately, make a distribution table:
   - In separate columns show range, bin, frequency, percentage, cumulative percentage.
   - Label columns, include units of mg/dL.
2. For females only, make a histogram:
   - Resize histogram to look nice
   - Place legend on bottom (not side)
   - Label x-axis: Upper Limit of Range (Cholesterol, mg/dL)
   - Add chart title: Distribution of Cholesterol for Females (mg/dL)
   - If necessary, fix secondary y-axis to range from 0 to 100%
3. Make a final report comparing male and female distributions:
   - Col 1: Range
   - Col 2: Male frequency
   - Col 3: Male percentage
   - Col 4: Female frequency
   - Col 5: Female percentage
   Remember to save your worksheet.
4. Based on the results, what conclusions can you reach concerning differences between male and female patients?

Place the results of 1. (male and female), the histogram (female), the final comparison table, and the answer to the question into a Word document.

Using JMP: Enter data for females into a Data Table; produce histogram & basic statistics; cut-and-paste results into the same Word document as above.

3. Measures of Central Tendency: Mean, Median, Mode

Review: Watch Khan Academy video on Average, Median, Mode
http://www.youtube.com/watch?v=uhxtUt_-GyM

The Arithmetic Mean

$\mu = \frac{\sum x}{N}$ (for a population)

$\bar{x} = \frac{\sum x}{n}$ (for a sample)

Ex: The data represent the number of textbooks purchased by a sample of seven students: 10 4 7 5 7 8 9

$\bar{x} = \frac{10 + 4 + 7 + 5 + 7 + 8 + 9}{7} = \frac{50}{7} = 7.14$

Excel AVERAGE() function

The mean is affected by outliers and skew. Because the mean is nonresistant, there are alternative measures that are more resistant to outliers and skew.

The Median

The median is a resistant measure of central tendency that occupies the middle position of the data placed in order of magnitude.
  - If n is odd, the median is the middle number of the data placed in order of magnitude. It occupies the $\frac{n+1}{2}$th position.
  - If n is even, the median is the average of the middle two numbers of the data placed in order of magnitude. It is the average of the numbers in the $\frac{n}{2}$th and $\frac{n+2}{2}$th positions.

Ex: Reordering the sample of books: 4 5 7 7 8 9 10. The median is 7. If there were an eighth person who purchased 12 books, the median would be 7.5.
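The mean and median of the textbook example can also be checked outside Excel. The following is a minimal sketch in Python, using only the standard-library statistics module; the variable names are illustrative and are not part of the course materials.

```python
import statistics

# Number of textbooks purchased by the sample of seven students
books = [10, 4, 7, 5, 7, 8, 9]

mean_books = statistics.mean(books)      # sum / n = 50 / 7
median_books = statistics.median(books)  # middle value of the ordered data (n is odd)

# Hypothetical eighth student who purchased 12 books (n is now even)
books_plus_one = books + [12]
median_eight = statistics.median(books_plus_one)  # average of the two middle values

print(round(mean_books, 2))  # 7.14
print(median_books)          # 7
print(median_eight)          # 7.5
```

Sorting is handled inside statistics.median, so the data can be entered in any order.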
Excel MEDIAN() function

The Mode

The mode, by definition, is the most frequently occurring value in a series.
  - There can be more than one mode
  - There can be no mode

Excel MODE() function

4. Percentiles and Quartiles

The kth percentile, $P_k$, is such that no more than k percent of the data are less than $P_k$ and no more than (100 - k) percent are greater than $P_k$. Usually used with large data sets.

The first quartile (Q1) is the point that separates the lower 25 percent of the values from the upper 75 percent = value corresponding to the $\frac{n+1}{4}$th ordered observation.

The third quartile (Q3) is the point that separates the upper 25 percent of the values from the lower 75 percent = value corresponding to the $\frac{3(n+1)}{4}$th ordered observation.

Ex: Books: 4 5 7 7 8 9 10.
$\frac{n+1}{4} = 2$, so Q1 = 5; $\frac{3(n+1)}{4} = 6$, so Q3 = 9.

(If position = #.5, average the two nearest values; else, if not an integer, round.)

5. Descriptive Statistics in JMP

Method 1: Distribution Function
  - Enter data into a Data Table (Important: do not mix character and numerical values in a column!)
  - Highlight column (takes some practice; hint: to refresh selection: Rows > Clear Row States)
  - Analyze > Distribution > OK
  - More statistics available by clicking the red arrow beside Summary Statistics

Method 2: Summary Function
  - Tables > Summary

[Figure: JMP Summary Statistics Menu]

  - From the Statistics drop-down menu (see above), select statistics one at a time. Selected statistics will then appear in the box to the right.
(Note: drop-down menu does not appear in picture below)
  - For Q1 and Q3, choose the Quantile statistic twice, specifying 25% and 75% in this box:
  - Click: OK

For more info: http://www.jmp.com/support/help/Summarize_Columns.shtml

6. Measures of Dispersion: the Variance

Range

Range = Maximum - Minimum

Ex: Books: 4 5 7 7 8 9 10
Range = 10 - 4 = 6

Interquartile Range

IQR = Q3 - Q1

Ex: The sample of books: Q1 = 5, Q3 = 9, IQR = 9 - 5 = 4

Variance (Population and Sample)

The variance is the average squared distance of observations from the mean.

Population variance formula:

$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

The square root of the variance is the standard deviation.

Spreadsheet calculation of population variance:

Ex: Books: 4 5 7 7 8 9 10

Variance = Average[(X - mu)^2] = 26.857/7 = 3.84

Video: Variance of a Population
http://www.youtube.com/watch?v=6JFzI1DDyyk

7. Homework

Read pp. 104-117, Prob 3.1, 3.2a [skip (4), (6), (10)], 3.2b
Data for 3.b (bolts.xls) on course website
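As a check on the worked examples in section 6, the range, interquartile range, and population variance of the books sample can be computed in a few lines. This is a sketch in Python, not part of the course materials. The helper name value_at_position is hypothetical; it follows the position rule from section 4 (average the two nearest values at #.5, otherwise round) and assumes at least three observations. Other software may use a different interpolation rule and report slightly different quartiles.

```python
import math

books = sorted([10, 4, 7, 5, 7, 8, 9])
n = len(books)

def value_at_position(ordered, position):
    """Return the value at a 1-based position in ordered data.

    Handout rule: if the position ends in .5, average the two
    neighboring values; otherwise round to the nearest position.
    """
    if position == int(position):
        return ordered[int(position) - 1]
    if position - math.floor(position) == 0.5:
        lower = ordered[math.floor(position) - 1]
        upper = ordered[math.floor(position)]
        return (lower + upper) / 2
    return ordered[math.floor(position + 0.5) - 1]

q1 = value_at_position(books, (n + 1) / 4)      # position 2
q3 = value_at_position(books, 3 * (n + 1) / 4)  # position 6

data_range = max(books) - min(books)            # maximum - minimum
iqr = q3 - q1

mu = sum(books) / n                             # population mean
sum_sq = sum((x - mu) ** 2 for x in books)      # sum of squared deviations
pop_variance = sum_sq / n                       # average squared deviation
pop_sd = math.sqrt(pop_variance)                # standard deviation

print(data_range, iqr)         # 6 4
print(round(sum_sq, 3))        # 26.857
print(round(pop_variance, 2))  # 3.84
print(round(pop_sd, 2))        # 1.96
```

Dividing by n gives the population variance shown in the spreadsheet example; a sample variance would divide by n - 1 instead.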