Data and Statistics Terms (answer key)
Name:____________
This is the website where you find definitions. Also use other
searches to complete this vocabulary list. Give some examples.
http://www.animatedsoftware.com/elearning/Statistics%20Explained/glossary/se_glossary.html
Random Sample- A set of items that have been drawn from a population in such a way that each time
an item was selected, every item in the population had an equal chance to appear in the sample.
Ex. When numbered pieces of cardboard are drawn from a hat, it is important that they be thoroughly mixed,
that they be identical in every respect except for the number imprinted on them, and that the person
selecting them be well blindfolded.
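The hat example can be imitated in a few lines of Python. This is only a sketch: the 20 numbered cards, the sample size of 5, and the fixed seed are assumptions made for illustration. `random.sample` draws without replacement, just as a card taken from the hat is not put back.

```python
import random

# Hypothetical population: 20 numbered cards in a hat.
population = list(range(1, 21))

# A fixed seed (an assumption for this sketch) makes the draw repeatable.
rng = random.Random(42)

# Draw 5 distinct cards; on each draw every remaining card is equally likely.
sample = rng.sample(population, k=5)

print(sample)
```

Running the sketch with a different seed gives a different sample, but every card always has the same chance of being chosen.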
Population- A population can be any size. The items need not be uniform, but they must share at least
one measurable feature.
Ex. A population of 9 people will share at least one measurable feature. Each person has a weight,
height, a hat size, a shoe size, etc.
Validity of data- Data validity is the correctness and reasonableness of data. The data should fall within
a certain range and have accurate numbers. Outliers are numbers that fall outside the usual spread of the data.
Measures of Central Tendency- Measures like mean, median, and mode that measure where the
population is centered.
Mean – (Average) The mean (mu) measures the average of the scores in the population.
Numerically, it equals the sum of the scores divided by the number of scores.
Median- (The middle of the data) The median is the point that divides the distribution of scores
in half. Numerically, half of the scores in a population will have values that are equal to or larger
than the median and half are equal to or smaller than the median. If the middle falls between
two data items, take the sum of the two numbers and divide it by two.
Mode- (Most frequent value) It is the score in the population that occurs most frequently. If
two or more different scores are tied for the highest frequency, all of them are reported and the data
is said to be multi-modal.
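The three measures of central tendency can be checked with Python's standard `statistics` module. The score lists below are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import statistics

# Hypothetical scores, already sorted for readability.
scores = [2, 3, 3, 5, 7, 10]

# Mean: sum of the scores divided by the number of scores, 30 / 6 = 5.
mean = statistics.mean(scores)

# Median: six scores, so the middle falls between 3 and 5, (3 + 5) / 2 = 4.0.
median = statistics.median(scores)

# Mode: 3 occurs twice, more often than any other score.
modes = statistics.multimode(scores)

# A multi-modal example: 1 and 2 are tied for the highest frequency.
tied_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, modes, tied_modes)
```

`statistics.multimode` (Python 3.8 and later) returns a list, so a tie is reported as several modes instead of raising an error or picking one.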
Range- The range describes the spread, or dispersion, of the data. The range is the distance between the
highest and the lowest scores. Numerically, the range is the highest score minus the lowest score.
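As a quick illustration with hypothetical scores, the range is one subtraction:

```python
# Hypothetical scores.
scores = [2, 3, 3, 5, 7, 10]

# Range: highest score minus lowest score, 10 - 2 = 8.
data_range = max(scores) - min(scores)

print(data_range)
```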
Histogram – It is a graph showing the distribution of the data. Each measure is represented by a single
block that is placed over the midpoint of the interval into which it falls.
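A rough text-only sketch of how a histogram groups data: each score is assigned to the interval it falls into, and one block (`#`) is stacked over that interval for each score. The scores and the interval width of 3 are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical scores and interval width.
scores = [1, 2, 2, 3, 5, 6, 6, 6, 9]
width = 3

# Map each score to the start of the interval [start, start + width) containing it.
counts = Counter((s // width) * width for s in scores)

for start in sorted(counts):
    midpoint = start + width / 2
    # One '#' block per score that falls into this interval.
    print(f"{start} to {start + width} (midpoint {midpoint}): " + "#" * counts[start])
```

The taller a row of blocks, the more scores fall in that interval, which is what the height of a histogram bar shows.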
Pie chart- A pie chart (or a circle graph) is a circular chart divided into sectors or parts, illustrating
proportion.
Pie charts can be an effective way of displaying information in some cases, in particular if the intent is to
compare the size of a slice with the whole pie, rather than comparing the slices with one another. Pie charts
work particularly well when the slices represent 25 to 50% of the data, but in general, other plots such
as the bar chart or line graph, or non-graphical methods such as tables, may be better suited for
representing certain information. A pie chart also shows the frequency within certain groups of information.
[Figure: Pie chart of populations of English native speakers]
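To see how a pie chart turns counts into slices, each group's share of the total can be converted to an angle out of 360 degrees. The group names and counts below are hypothetical and are not taken from the figure.

```python
# Hypothetical counts for three groups.
counts = {"A": 50, "B": 30, "C": 20}
total = sum(counts.values())

# Each slice's angle is its share of the whole circle (360 degrees).
angles = {label: n * 360 / total for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label}: {n / total:.0%} of the pie, {angles[label]:.0f} degrees")
```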
Normal Distribution- Also known as the bell shaped curve or Gaussian distribution. The bell shaped curve
has several properties:
The curve is concentrated in the center and decreases on either side. This means that
the data has less of a tendency to produce unusually extreme values, compared to
some other distributions.
The bell shaped curve is symmetric. This tells you that the probability of
deviations from the mean is comparable in either direction.
When you want to describe probability for a continuous variable, you do so by describing a
certain area. A large area implies a large probability and a small area implies a small probability.
Some people don't like this, because it forces them to remember a bit of geometry (or in more
complex situations, calculus). But the relationship between probability and area is also useful,
because it provides a visual interpretation for probability.
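The link between area and probability can be checked numerically. The sketch below assumes a normal distribution with mean 0 and standard deviation 1 and uses the standard formula for the area to the left of a point, written with `math.erf`. The helper names `normal_cdf` and `area_between` are made up for this example.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def area_between(a, b, mean=0.0, sd=1.0):
    """Probability that a value falls between a and b."""
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)

# Area within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(area_between(-k, k), 4))

# Symmetry: the area just below the mean matches the area just above it.
left = area_between(-1, 0)
right = area_between(0, 1)
```

The three printed areas are about 0.68, 0.95, and 0.997: a large area near the center means a large probability, and the small area left in the tails means extreme values are rare.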
Standard Deviation 1,2,3 – The standard deviation is one of several indices of variability that
statisticians use to characterize the dispersion among the measures in a given population.
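A small worked example, using hypothetical scores, shows one common way to compute the standard deviation of a whole population: average the squared distances from the mean, then take the square root. The result is compared with `statistics.pstdev` from the standard library.

```python
import math
import statistics

# Hypothetical population of scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean is 40 / 8 = 5.
mean = sum(scores) / len(scores)

# Step 2: the squared distances from the mean add up to 32; their average is 4.
variance = sum((s - mean) ** 2 for s in scores) / len(scores)

# Step 3: the standard deviation is the square root of the variance, 2.0.
std_dev = math.sqrt(variance)

# The library function for a population standard deviation gives the same value.
library_std_dev = statistics.pstdev(scores)

print(std_dev, library_std_dev)
```

When the scores are only a sample from a larger population, `statistics.stdev` (which divides by one less than the number of scores) is the usual choice instead.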
Extrapolation- The process of constructing new data points outside a discrete set of known data points.
It is similar to the process of interpolation, which constructs new points between known points.
(Ex. A driver extrapolates road conditions beyond his sight while driving.)
[Figure: Example illustration of the extrapolation problem, consisting of assigning a meaningful value at the blue box, at x = 7, given the red data points.]
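One simple way to extrapolate is to extend the straight line through the last two known points. The sketch below does that for x = 7. The data points are hypothetical, since the values in the figure are not available, and `linear_extrapolate` is a name made up for this example.

```python
def linear_extrapolate(p1, p2, x):
    """Extend the straight line through p1 and p2 to the given x."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return y2 + slope * (x - x2)

# Hypothetical known points (x, y); x = 7 lies beyond the last one.
known = [(1, 2.0), (3, 3.0), (5, 4.0)]

# Use the last two points: slope = (4 - 3) / (5 - 3) = 0.5, so y = 4 + 0.5 * 2 = 5.0.
estimate = linear_extrapolate(known[-2], known[-1], 7)

print(estimate)
```

The estimate is only as good as the assumption that the trend continues unchanged past the known data, which is why extrapolated values are less certain than interpolated ones.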
Forecast- To calculate or predict some future event or condition.
Prediction- To state, tell about, or make known ahead of time, especially on the basis of special
knowledge. A prediction is making a guess using known factors.