Data and Statistics Terms (answer key)
Name:____________

This is the website where you can find the definitions. Also use other searches to complete this vocabulary list. Give some examples.
http://www.animatedsoftware.com/elearning/Statistics%20Explained/glossary/se_glossary.html

Random Sample- A set of items drawn from a population in such a way that, each time an item is selected, every item in the population has an equal chance of appearing in the sample. Ex. When numbered pieces of cardboard are drawn from a hat, it is important that they be thoroughly mixed, that they be identical in every respect except for the number printed on them, and that the person selecting them be well blindfolded.

Population- A population can be any size. The items need not be uniform, but they must share at least one measurable feature. Ex. A population of 9 people will share at least one measurable feature: each person has a weight, a height, a hat size, a shoe size, etc.

Validity of data- Data validity is the correctness and reasonableness of data. The data should fall within an expected range and contain accurate numbers. Outliers are values that lie far outside the rest of the data.

Measures of Central Tendency- Measures such as the mean, median, and mode that describe where the population is centered.

Mean- (Average) The mean (mu) is the average of the scores in the population. Numerically, it equals the sum of the scores divided by the number of scores.

Median- (The middle of the data) The median is the point that divides the distribution of scores in half. Numerically, half of the scores in a population have values equal to or larger than the median, and half have values equal to or smaller than the median. If the middle falls between two data items, add the two numbers and divide the sum by two.

Mode- (Most frequently occurring value) It is the score in the population that occurs most frequently.
If two or more different scores are tied for the highest frequency, all of them are reported and the data is said to be multi-modal.

Range- The range describes the spread (dispersion) of the data. The range is the distance between the highest and the lowest scores. Numerically, the range is the highest score minus the lowest score.

Histogram- A graph showing the distribution of the data. Each measure is represented by a single block that is placed over the midpoint of the interval into which it falls.

Pie chart- A pie chart (or circle graph) is a circular chart divided into sectors or parts, illustrating proportion. Pie charts can be an effective way of displaying information in some cases, in particular if the intent is to compare the size of a slice with the whole pie rather than to compare the slices with one another. Pie charts work particularly well when the slices represent 25 to 50% of the data, but in general, other plots such as the bar chart or line graph, or non-graphical methods such as tables, may be better suited to representing certain information. A pie chart also shows the frequency within certain groups of information.
[Figure: Pie chart of populations of English native speakers]

Normal Distribution- Also known as the bell-shaped curve or Gaussian distribution. The bell-shaped curve has several properties. The curve is concentrated in the center and decreases on either side. This means that the data has less of a tendency to produce unusually extreme values, compared to some other distributions. The bell-shaped curve is symmetric. This tells you that the probability of deviations from the mean is comparable in either direction. When you want to describe probability for a continuous variable, you do so by describing a certain area under the curve. A large area implies a large probability and a small area implies a small probability. Some people don't like this, because it forces them to remember a bit of geometry (or, in more complex situations, calculus).
But the relationship between probability and area is also useful, because it provides a visual interpretation of probability.

Standard Deviation 1,2,3- The standard deviation is one of several indices of variability that statisticians use to characterize the dispersion among the measures in a given population.

Extrapolation- The process of constructing new data points outside a discrete set of known data points. It is similar to the process of interpolation, which constructs new points between known points. (Ex. A driver extrapolates road conditions beyond his sight while driving.)
[Figure: Example illustration of the extrapolation problem, consisting of assigning a meaningful value at the blue box, at x = 7, given the red data points.]

Forecast- To calculate or predict some future event or condition.

Prediction- To state, tell about, or make known ahead of time, especially on the basis of special knowledge. A prediction is a guess made using known factors.
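
Ex. The numerical terms in this list (mean, median, mode, range, standard deviation, and extrapolation) can be tried out on a small set of scores. The following is only a minimal sketch in Python using its standard statistics module. The scores, the two known points, and the helper names describe and linear_extrapolate are made up for illustration and do not come from the glossary website. The extrapolation here assumes the data follows a straight line, which is just one of many possible ways to extrapolate.

```python
import statistics


def describe(scores):
    """Summarize a population of scores with the measures from the vocabulary list."""
    return {
        # Mean: sum of the scores divided by the number of scores.
        "mean": statistics.mean(scores),
        # Median: middle score; averages the two middle scores if the count is even.
        "median": statistics.median(scores),
        # Mode(s): every score tied for the highest frequency (multi-modal if more than one).
        "modes": statistics.multimode(scores),
        # Range: highest score minus lowest score.
        "range": max(scores) - min(scores),
        # Standard deviation of the whole population (not of a sample).
        "std_dev": statistics.pstdev(scores),
    }


def linear_extrapolate(p1, p2, x):
    """Estimate y at x by extending the straight line through two known points.

    Assumption: the relationship is linear. x may lie outside the known points.
    """
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return y1 + slope * (x - x1)


# Hypothetical population of eight scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)
print(summary)

# Hypothetical known points (1, 2) and (3, 6); estimate the value at x = 7.
print(linear_extrapolate((1, 2), (3, 6), 7))
```

For these made-up scores, the middle falls between 4 and 5, so the median is 4.5; the score 4 occurs three times, so it is the only mode; and the range is 9 minus 2, which is 7. A list such as [1, 1, 2, 2, 3] would report two modes, 1 and 2.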