Nature of Science
Science 1101

Scientific methods

1. Formulation of a hypothesis
2. Survey of the literature and archives
3. Design of experiments, considering all variables (independent, dependent, and controlled)
4. Collection of data
5. Analysis of data (statistical methods)
6. Interpretation of results
7. Conclusion
8. Publication

Descriptive Statistics

Measures of Central Tendency

When looking at a data set, we are often interested in knowing the "center" of the group of values we're examining. If you are evaluating your exam score, you want to know the "average" score for the class.

Student   Score (pts.)
a         93
b         88
c         77
d         85
e         74
f         69
g         90
h         66
i         82
j         88
k         59
l         83
m         75
n         97
o         71

There are three such measures of central tendency: the mode, the median, and the mean. To examine measures of central tendency, it is helpful to arrange the values in descending order.

97, 93, 90, 88, 88, 85, 83, 82, 77, 75, 74, 71, 69, 66, 59

Mode

The mode is the most common value in a data set. It does not need to be near the center of the data set; it simply has to occur most often. Modes are useful in that they tell you the most common value in the data set, and can shed some light on the data set's tendencies. In the exam scores above, the mode is 88, the only score that occurs twice.

Median

The median is the value that occurs in the middle of the ordered data set. If the data set contains an odd number of values, the median will be the middle value. For example, in an ordered data set of 19 values, value #10 would be the median, as there would be nine values below it and nine values above it. In a data set of 20 values, the median would be the value halfway between the tenth and eleventh values. In the 15 exam scores above, the median is the eighth value, 82.
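The mode and median described above can be checked with a few lines of Python. This is a minimal sketch using the standard library's statistics module on the 15 exam scores from the table. The variable names and the small even-sized list are assumptions made for illustration, not part of the original slides.

```python
import statistics

# Exam scores for students a through o, as listed in the table above.
scores = [93, 88, 77, 85, 74, 69, 90, 66, 82, 88, 59, 83, 75, 97, 71]

# Arrange the values in descending order to make the center easy to see.
ordered = sorted(scores, reverse=True)

# Mode: the most common value in the data set.
mode = statistics.mode(scores)

# Median: the middle value of the ordered data (the 8th of 15 values here).
median = statistics.median(scores)

# Hypothetical even-sized data set: the median falls halfway between
# the two middle values (20 and 30).
even_median = statistics.median([10, 20, 30, 40])

print("ordered:", ordered)
print("mode:", mode)
print("median:", median)
print("median of even-sized set:", even_median)
```

With an odd number of values the median is one of the scores. With an even number it may be a value that does not appear in the data at all.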
Mean

To calculate the mean, you sum all of the values and divide by the number of values. For the 15 exam scores above, the sum is 1197, so the mean is 1197 / 15 = 79.8.

Range

The range describes the highest and lowest values in a data set. The range is commonly used in weather reports, where the daily high and low temperatures are reported.

Variance and Standard Deviation

The variance describes how far the values are from the mean. To compute it, subtract the mean from each value. Some of these deviations are positive (+), some are negative (-), and some are 0, so each deviation is squared before the deviations are combined:

Variance = sum of (x - avg(x))^2 / (n - 1)

While the variance gives you an indication of the deviation of each value from the mean, the variance is in "squared" units.

To get a measure of data dispersion in the same units as the mean, take the square root of the variance. This value is known as the standard deviation (SD). The standard deviation can be thought of as a typical distance between a value and the mean, and it often can help you find the story behind the data. To understand this concept, it can help to learn about what statisticians call the normal distribution of data.

If you looked at normally distributed data on a graph, it would look like a bell-shaped curve. The x-axis (the horizontal one) is the value in question: calories consumed, for example. The y-axis (the vertical one) is the number of data points for each value on the x-axis: in other words, the number of people who eat x calories.

Now, not all sets of data will have graphs that look this perfect.
Some will have relatively flat curves, others will be pretty steep. Sometimes the mean will lean a little bit to one side or the other. But all normally distributed data will have something like this same "bell curve" shape.

Figure 1. Normal distribution with standard deviations

Standard Deviation

Computing the value of a standard deviation is complicated. But let us see graphically what a standard deviation represents. One standard deviation away from the mean in either direction on the horizontal axis (the red area on the above graph) accounts for somewhere around 68 percent of the people in this group. Two standard deviations away from the mean (the red and green areas) account for roughly 95 percent of the people. And three standard deviations (the red, green and blue areas) account for about 99.7 percent of the people.

Homework

Example 1: Brain teaser

Here is one formula for computing the standard deviation. Terms you'll need to know:

x = one value in your set of data
avg (x) = the mean (average) of all values x in your set of data
n = the number of values x in your set of data

For each value x, subtract the overall avg (x) from x, then multiply that result by itself (otherwise known as determining the square of that value). Sum up all those squared values. Then divide that result by (n-1). Got it? Then, there's one more step: find the square root of that last number. That's the standard deviation of your set of data.
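The steps above translate almost line for line into code. The following is a minimal Python sketch, not a definitive implementation. The function name sample_standard_deviation is an assumption made for illustration, and the data are the 15 exam scores from the earlier table.

```python
import math
import statistics


def sample_standard_deviation(values):
    """Square each deviation from the mean, sum, divide by (n - 1), take the root."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to divide by (n - 1)")
    avg = sum(values) / n                                   # avg(x)
    squared_deviations = [(x - avg) ** 2 for x in values]   # (x - avg(x))^2
    variance = sum(squared_deviations) / (n - 1)            # in "squared" units
    return math.sqrt(variance)                              # same units as the data


# Exam scores for students a through o from the table of scores.
scores = [93, 88, 77, 85, 74, 69, 90, 66, 82, 88, 59, 83, 75, 97, 71]
sd = sample_standard_deviation(scores)
print("standard deviation:", round(sd, 2))

# Cross-check against the standard library's sample standard deviation,
# which also divides by (n - 1).
print("matches statistics.stdev:", math.isclose(sd, statistics.stdev(scores)))
```

Dividing by (n-1) rather than n gives the sample standard deviation, which is the version the steps above describe.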
Standard Deviation Formula

The formula for the standard deviation is very simple: it is the square root of the variance. It is the most commonly used measure of spread. The variance describes how far each value is from the mean.

Standard Deviation = Square root(sum of (x - avg(x))^2 / (n - 1))

Example 2: Consider the observations 8, 25, 7, 5, 8, 3, 10, 12, 9.

First, calculate the mean and determine N. Remember, the mean is the sum of scores divided by N, where N is the number of scores. Therefore, the mean = (8+25+7+5+8+3+10+12+9) / 9 = 87 / 9, or 9.67.

Then, calculate the standard deviation as illustrated below.

Standard Deviation = Square root(sum of squared deviations / (N-1))
= Square root(320/(9-1))
= Square root(40)
= 6.32

Graphs

Data may be presented in the form of tables, charts, and graphs. The type of figure you will use most frequently is a graph. Graphs can be an effective way to present information, but it is important that they are properly constructed. Let's look at a sample graph to demonstrate the basic characteristics of a graph.

Note that the independent variable is graphed on the x-axis and the dependent variable on the y-axis, and that both the x-axis (horizontal) and the y-axis (vertical) are labeled. If "time" is a factor in your graph, it should be graphed on the x-axis. Graphs should also be of high quality and appropriate size.

[Figures: sample bar graph; sample line graph]

When Bar Graphs are Useful

Bar graphs are useful for graphing non-continuous data, such as data from different experimental groups. Note that in each case, the dependent variable is on the y-axis and the independent variable on the x-axis, and all of the formatting issues listed above are addressed.

[Figures: two sample bar graphs]

When Line Graphs are Useful

If your data are continuous (each point is directly related to the next and can be connected by an infinite number of intermediate points), then a line graph is the way to go.
Line graphs are commonly used in scientific studies to present data when time (a continuous variable) is one of the variables involved. Note that the dependent variable is on the y-axis and "time" is on the x-axis.

[Figures: two sample line graphs]

Scatter Plots

While bar and line graphs are used commonly, there is another type of graph worth mentioning: the scatter plot. In some cases you may need to graph two variables against one another to determine their relationship. You graph one variable on the x-axis and the other on the y-axis, and then graph your points accordingly.

By doing this, you can see whether one variable increases as the other increases (positive relationship), whether one increases as the other decreases (negative relationship), or whether the two variables show no relationship to one another.

[Figure: sample scatter plot]
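The direction you read off a scatter plot can also be checked numerically. The sketch below uses the sample covariance, a quantity the slides do not cover and which is brought in here only as an assumption for illustration: it tends to be positive when the two variables rise together and negative when one falls as the other rises. The function name and all data values are hypothetical.

```python
def sample_covariance(xs, ys):
    """Sum of the products of paired deviations from each mean, divided by (n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical illustrative data, not taken from the original slides.
x_values = [1, 2, 3, 4, 5]
rising = [52, 58, 61, 70, 74]    # y tends to increase as x increases
falling = [90, 81, 77, 66, 60]   # y tends to decrease as x increases

cov_rising = sample_covariance(x_values, rising)
cov_falling = sample_covariance(x_values, falling)

print("positive relationship:", cov_rising > 0)
print("negative relationship:", cov_falling < 0)
```

A covariance near zero suggests no linear relationship, but how near is "near" depends on the units of the data, so the scatter plot itself remains the first thing to look at.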