Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Statistics and Data Analysis (1B, 1D, 1E, 1F, 2C, 3A, 3D, 4C, 5C, 6A, 6C, 7D, 8D, 9A) Data: information, especially numerical information that is organized. Discrete data is exact data that can be counted. (Ex. Acres, bushels, livestock) Scatter plots, bar graphs and circle graphs show this type of data well. Continuous data is measurements that are taken over a period of time. (Ex. Temperatures, gallons of fuel consumed, weight gain in livestock) Line graphs show this type of data well. Statistics: numbers used to describe a set of data. Three different statistics are used to measure the central tendency of a distribution. A measure that summarizes a whole set of measurements by representing the approximate center of a distribution is called a measure of central tendency. The mean of a distribution is the arithmetic average of the scores. The mean is the measure found by adding together all of the scores of a distribution and dividing by the total number of scores. The median of a distribution is the middle number after the scores are arranged in numerical order. Scores may be arranged in either ascending or descending order. If there is an even number of scores, the median is determined by the average of the two middle scores. The mode of a distribution is the score that occurs most frequently. It is possible that there may not be a mode. It is also possible to have more than one mode. Correlation coefficient is statistic that is used to determine how closely the points in a scatter plot cluster about a line. It is used to predict future outcomes. The correlation coefficient measures the strength of the linear relationship between two variables. A high correlation does not necessarily imply a cause-and-effect relationship. Measures of dispersion: values that indicate how a distribution is scattered. The range is the difference between the largest and smallest numbers in a set. The deviation is the difference between a single score and the mean of the set of scores in a distribution. The variance is the mean of the squared deviations in a distribution. The further it is from the mean, the more pronounced its effect on the size of the variance. Variance gives a better measure of scatter than does the range because variance takes into account how much each score deviates from the mean. If most scores cluster near the mean, the variance will be small. But if many scores deviate widely from the mean, the variance will be large. The standard deviation is the square root of the variance: that is, the square root of the mean of the sum of the squared deviations in a distribution. Standard deviation is much more commonly used than variance as a measure of scatter. This is because it is expressed in the same unit as the scores, where the variance is expressed in square units. In a normal distribution, approximately 68% (about 2/3rds) of all the measurements lie within one standard deviation on either side of the mean, approximately 95% of all measurements lie with two standard deviations on either side of the mean, and approximately 99% of all the measurements lie within three standard deviations of the mean. Standard deviation gives a good indication of how widely the scores in a distribution are scattered. A standard deviation of 0 means every one has the same score. A large standard deviation means the scores are spread out. A small standard deviation indicates a much narrower distribution of the scores. The normal distribution is a symmetrical frequency distribution in which most scores or measures occur in the intervals closest to the center of the distribution and fewer scores are found in the intervals farther from the center. In a true normal distribution, the data is continuous – the highest frequency occurs in the center of the distribution and the lowest frequencies are found at the extremes. When such a distribution is graphed, the result is a “bell-shaped curve” called the normal curve. (Normal curve serves as a mathematical model.)