Statistics: The Baaaasics

“For most biologists, statistics is just a useful tool, like a microscope, and knowing the detailed mathematical basis of a statistical test is as unimportant to most biologists as knowing which kinds of glass were used to make a microscope lens.”
–John McDonald, Biological Data Analysis Professor, University of Delaware

So Why Stats?
- Useful for drawing conclusions about an entire population by sampling a smaller group of acceptable size.
- Useful for converting raw data into tables or graphs that are quick and easy to decipher.
- Understanding how data is used can help you determine if the source is reliable and trustworthy (e.g., “Studies show...” Can we trust what this says? And yet we conform so much of our lives to these “studies.”).

What Do I Need To Know About Stats?
- Descriptive Statistics: “the analysis of data that helps describe, show or summarize data in a meaningful way such that, for example, patterns might emerge from the data.”
- It is NOT used to draw conclusions beyond the data collected (i.e., it cannot be extrapolated to make claims about the whole population if the whole population was not measured).
- Inferential Statistics: makes inferences about a population using data drawn from a smaller sample of that population.

Descriptive Statistics
- Mean
- Median
- Mode
- Range
- Standard Deviation
- Measures of Confidence: Standard Error & 95% Confidence Intervals

Descriptive Statistics in a Nutshell
- Collect Data (e.g., conduct a survey)
- Summarize Data (mean, median, mode, standard deviation, error bars, etc.)
- Present Data (table or graph)

Raw Data Table / Graphical Representation
Graphs usually plot the mean or median of the raw data set. This makes it much easier to identify patterns or to see whether the population is normally distributed.

3Ms: MEAN, Median, Mode
- Mean (aka average).
- In biology, we want to know the mean of the entire population (μ), but we use the sample's mean (x̄) as an estimate of μ.
- Most frequently used of the 3Ms.
- Use it when you have a typical data set (no outliers).
- How to calculate the sample mean (x̄):
  1. Find the sum of all the data values.
  2. Count the number of data values.
  3. Divide the sum by the number of data values.

3Ms: Mean, MEDIAN, Mode
- Median: when the data are ordered from smallest to largest, the median is the midpoint of the data. The median is great to use when your data contain significant outliers that can skew the mean.
- For example: 27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163
- What would be the average?
  x̄ = (27 + 27 + 28 + 28 + 29 + 31 + 32 + 33 + 33 + 33 + 34 + 35 + 163) / 13 = 533 / 13 = 41
- The outlier 163 pulls the mean up to 41, while the median (the 7th of the 13 ordered values) is 32.

3Ms: Mean, Median, MODE
- Mode is the value that appears most often. It is not frequently used, but it can be useful for categorical data where the user wants to know the most common category.
- 27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163
  Mode = 33

Range
- The span from the smallest measurement to the largest.
- The larger the range, the greater the variability.
- However, extremely small or large values (outliers) can make the variability appear really high.
- The standard deviation is a more accurate representation of the spread of the data, which will be discussed further when the semester begins. Make sure you have watched the Bozeman video on this!

WARNING!
You will have a short quiz on this information, and/or it will be on your first exam. Make sure you know the difference between the 3Ms and the max vs. min (i.e., range). We will take some measurements on the 1st or 2nd day of the semester and you will apply this information, along with graphing.

Additional Slides (need to know)
- Population vs. Sample Size
- Qualitative vs. Quantitative
- Discrete vs. Continuous

Population vs. Sample Size

Qualitative vs. Quantitative Data
- Qualitative: data are measurements that each fall into one of several categories (hair color, ethnic groups, and other attributes of the population).
- Qualitative data are generally described by words or letters.
- They are not as widely used as quantitative data because many numerical techniques do not apply to qualitative data.
- Quantitative: data are observations that are measured on a numerical scale (distance traveled to college, number of children in a family, etc.).
- Quantitative data are always numbers and are the result of counting or measuring attributes of a population.
- Quantitative data can be separated into two subgroups: discrete & continuous.

Discrete vs. Continuous Data
- Discrete: data that are the result of counting (e.g., the number of students of a given ethnic group in a class, the number of books on a shelf, ...)
- Continuous: data that are the result of measuring (e.g., distance traveled, weight of luggage, …)

Discrete Data
- This is called discrete data because the units of measurement (for example, CDs) cannot be split up; there is nothing between 1 CD and 2 CDs.
- Shoe sizes are a classic example of discrete data, because sizes 39 and 40 mean something, but size 39.2, for example, does not.

Continuous Data
- The data set shows a group of continuous data.
- This data is called continuous because the scale of measurement (distance) has meaning at all points between the numbers given; e.g., we can travel a distance of 1.2, 1.85, or even 1.632 miles.
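
Worked example (not part of the original slides)
As a quick check of the 3Ms and Range slides, the example data set from the median slide can be summarized in a few lines of Python. This is only a sketch: the variable names are illustrative assumptions, and it relies on Python's standard `statistics` module rather than any tool named in the course.

```python
from statistics import mean, median, mode

# Example data set from the median slide; 163 is the outlier.
data = [27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163]

# Mean: sum of the values divided by the number of values.
sample_mean = mean(data)

# Median: the middle value once the data are ordered.
sample_median = median(data)

# Mode: the value that appears most often.
sample_mode = mode(data)

# Range: largest measurement minus smallest measurement.
data_range = max(data) - min(data)

# Mean with the outlier removed, to show how much 163 skews it.
mean_without_outlier = mean(data[:-1])

print("mean:", sample_mean)
print("median:", sample_median)
print("mode:", sample_mode)
print("range:", data_range)
print("mean without outlier:", round(mean_without_outlier, 2))
```

For this data set the mean is 41, the median is 32, the mode is 33, and the range is 136. Dropping the outlier moves the mean much closer to the median, which is why the median is the safer summary when outliers are present.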