Glossary of Statistical Terms
Arithmetic Mean
the total value of a set of observations divided by the number of
observations ($\mu$ or $\bar{X}$)
Central Tendency
the “middle” of a data set
Coefficient of Variation
the standard deviation divided by the arithmetic mean ($\sigma/\mu$ or $s/\bar{x}$); used
to compare the relative variation in two or more data sets
Continuous Variable
data can assume any value in a given range (can be fractions or decimals)
Discrete Variable
data can assume only separate, countable values (typically whole numbers)
Frequency Distribution
a table that indicates the number of observations that fall into classes of
values
Frequency Polygon
a single line graph of a frequency distribution
Histogram
a bar chart of a frequency distribution
Median
the “middle” value when the data are arranged in ascending order
Mode
the value that occurs with the greatest frequency
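As a rough illustration of the arithmetic mean, median, and mode defined above, the sketch below uses Python's standard `statistics` module. The data values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics

# Hypothetical observations, invented for illustration only.
data = [2, 3, 3, 5, 7, 10]

# Arithmetic mean: total of the observations divided by their number.
mean_value = statistics.mean(data)

# Median: middle value of the sorted data; with an even count,
# the two middle values are averaged.
median_value = statistics.median(data)

# Mode: the value that occurs with the greatest frequency.
mode_value = statistics.mode(data)

print(mean_value, median_value, mode_value)
```

Here the mean is pulled upward by the largest observation, while the median and mode are not, which is one reason all three measures are reported.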
Modal Class
the class in a frequency distribution that contains the most observations
Cumulative Frequency Distribution
a frequency distribution that presents the number of observations that fall
below certain values
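The frequency distribution, modal class, and cumulative frequency distribution entries can be sketched together as follows. The observations and the class limits are assumptions made for this example, not values from the glossary.

```python
from itertools import accumulate

# Hypothetical observations and class limits, invented for illustration.
data = [3, 7, 8, 12, 15, 15, 18, 21, 24, 29]
classes = [(0, 10), (10, 20), (20, 30)]   # each class holds lower <= x < upper

# Frequency distribution: number of observations in each class.
frequencies = [sum(lower <= x < upper for x in data) for lower, upper in classes]

# Cumulative frequency distribution: observations below each upper class limit.
cumulative = list(accumulate(frequencies))

# Modal class: the class that contains the most observations.
modal_class = classes[frequencies.index(max(frequencies))]

for (lower, upper), f, c in zip(classes, frequencies, cumulative):
    print(f"{lower:>2} to under {upper:<2}  frequency={f}  cumulative={c}")
```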
Population
all of the observations on a given phenomenon (N = population size)
Sample
part of the population (n = sample size)
Random Sample
a sample in which every observation in the population has the same
probability of being included in the sample
Systematic Sampling
a sampling plan in which every kth item in the population is included in the
sample after a random start
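A minimal sketch of systematic sampling, assuming the population is held in a Python list. The function name `systematic_sample`, the interval `k=10`, and the numbered population are hypothetical choices for illustration.

```python
import random

def systematic_sample(population, k, seed=None):
    """Return every k-th item of `population` after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(k)        # random start among the first k positions
    return population[start::k]     # then every k-th item

# Hypothetical population of 100 numbered items.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=42)
print(sample)
```

Passing a seed only makes the run repeatable; in practice the start would be left random.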
Stratified Sampling
a sampling plan in which the population is divided into strata (each of
which has some common characteristic) and probability samples are
drawn from each stratum
Cluster Sampling
a sampling plan in which the population is divided into clusters (each of
which has the characteristics of the population), a probability sample of
clusters is selected, and observations are drawn from the selected clusters
Parameter
a characteristic of a population (signified by Greek letters)
Statistic
a characteristic of a sample (signified by English letters)
Qualitative Variable
values that represent classes or attributes (non-numerical)
Quantitative Variable
values that are numerical
Quartiles
values that divide the data into quarters
Variance
the average of the squared deviations around the mean ($\sigma^2$ or $s^2$)
Standard Deviation
the square root of the variance ($\sigma$ or $s$)
Z Score
the number of standard deviations an observation lies above or below
the mean ($Z = (x - \mu)/\sigma$); used to compare the relative positions
of individual observations in two or more data sets
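The variance, standard deviation, coefficient of variation, and Z score entries can be worked through on a small data set. The sketch below treats a hypothetical list of eight values as a whole population; the sample variance, which divides by n - 1 rather than N, is shown only for comparison.

```python
import statistics

# Hypothetical population data, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = statistics.mean(data)               # population mean
var_pop = statistics.pvariance(data)     # average squared deviation (divides by N)
sigma = statistics.pstdev(data)          # square root of the population variance
var_sample = statistics.variance(data)   # sample variance (divides by n - 1)

cv = sigma / mu                          # coefficient of variation
z = (9 - mu) / sigma                     # Z score of the observation x = 9

print(mu, var_pop, sigma, cv, z)
```

With these numbers the observation 9 lies two standard deviations above the mean.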
Statistical Experiment
any process of observation
Outcome
a particular result of an experiment
Event
a subset of the sample space
Compound Event
an event that consists of more than one outcome
Complement of an Event
all of the outcomes in the sample space EXCEPT those that make up
the event
Intersection of Events
the outcomes in the sample space that are common to two events
simultaneously
Union of Events
the outcomes in the sample space that lie in either (or both)
of the events
Mutually Exclusive Events
two events that cannot occur simultaneously
Collectively Exhaustive Events
events that collectively comprise the entire sample space
Probability
the proportion of the time that an event is expected to occur, expressed as a
number between 0 and 1 (or as a percent)
Classical Probability
the number of outcomes in the sample space that represent an event
divided by the total number of outcomes in the sample space
Historical Probability
the proportion of the time that an event occurs in the long run under
stable conditions
Subjective Probability
the degree of belief or confidence placed in the occurrence of an event
P(A complement)
$= 1 - P(A)$, the probability that event A will not occur
$P(A \cup B)$
$= P(A) + P(B) - P(A \cap B)$, the probability that either event (or both) occurs
$P(A \cap B)$
$= P(A) \cdot P(B \mid A)$, the probability that both events occur
$P(B \mid A)$
$= P(A \cap B) / P(A)$, the probability that event B will occur given that
event A has occurred
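The four probability rules above can be checked on a small sample space. The sketch below assumes one roll of a fair six-sided die and two hypothetical events, and uses classical probability (favourable outcomes divided by total outcomes) with exact fractions.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (hypothetical example).
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # event A: the roll is even
B = {4, 5, 6}      # event B: the roll is greater than 3

def prob(event):
    """Classical probability: outcomes in the event / outcomes in the sample space."""
    return Fraction(len(event & sample_space), len(sample_space))

p_not_a = 1 - prob(A)                          # complement rule
p_a_or_b = prob(A) + prob(B) - prob(A & B)     # addition rule (union)
p_b_given_a = prob(A & B) / prob(A)            # conditional probability
p_a_and_b = prob(A) * p_b_given_a              # multiplication rule (intersection)

print(p_not_a, p_a_or_b, p_b_given_a, p_a_and_b)
```

Counting the outcomes of the union and intersection directly gives the same values, which is a quick way to confirm the rules.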
Independent Events
events are independent when the occurrence of one does not affect the
probability of the occurrence of the other
Prior Probability
a probability assigned before the observation of empirical information
(before the outcome of the experiment is known)
Posterior Probability (Bayesian Probability)
a probability assigned after the observation of empirical information
(after the outcome of the experiment is known)
Probability Distribution
a table, equation or graph that shows the relationship between all of the
possible outcomes of an experiment and the corresponding probabilities
Binomial Distribution
the probability distribution that is appropriate when (1) the experiment
consists of repeated independent trials, (2) each trial has only two
possible outcomes (success and failure) and (3) the probability of success is
the same for all of the trials
Expected Value
the average outcome for an experiment that is repeated a very large
number of times ($E(x) = \sum [x \cdot P(x)]$)
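To connect the binomial distribution and expected value entries, the sketch below builds a binomial probability distribution and sums each outcome times its probability. The helper name `binomial_pmf` and the choice of four tosses of a fair coin are assumptions for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, 0.5   # hypothetical: four tosses of a fair coin

# Probability distribution: every possible outcome with its probability.
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

# Expected value: sum of each outcome times its probability.
expected = sum(k * prob for k, prob in distribution.items())

print(distribution)
print(expected)
```

For a binomial distribution the expected value works out to n times p, which the sum reproduces.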
Random Variable
the numerical description of the outcomes of an experiment
Normal Distribution
a continuous probability distribution, characterized by a mean and
standard deviation, that is bell-shaped
Standard Normal Distribution
the normal distribution that has been transformed so that (1) the mean is 0,
(2) the standard deviation is 1, and (3) the entire area under the curve is
equal to 1
Sampling Distribution
a probability distribution of a statistic (e.g., mean, proportion)
Sampling Distribution of the Mean
a probability distribution that shows all of the possible values that
the sample mean can assume for a given sample size and the corresponding
probabilities
Sampling Distribution of the Proportion
a probability distribution that shows all of the possible values that
the sample proportion can assume for a given sample size and the
corresponding probabilities
Central Limit Theorem
as the sample size increases, the sampling distribution of the mean
approaches the normal distribution regardless of the shape of the
distribution of the population
Standard Error
the standard deviation of a sampling distribution
Standard Error of the Mean
the standard deviation of the sampling distribution of the mean
($\sigma/\sqrt{n}$)
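The sampling distribution of the mean and its standard error can be approximated by simulation. The sketch below is one possible illustration, assuming a uniform population on [0, 1) (flat rather than bell-shaped), a sample size of 30, and 5000 repeated samples; all three choices are arbitrary.

```python
import random
import statistics

rng = random.Random(0)     # fixed seed so the run is repeatable
n = 30                     # sample size (hypothetical choice)
num_samples = 5000         # number of repeated samples

# Population: uniform on [0, 1). Its mean is 0.5 and its
# standard deviation is sqrt(1/12).
sigma = (1 / 12) ** 0.5

# Approximate the sampling distribution of the mean by drawing many samples.
sample_means = [
    statistics.fmean(rng.random() for _ in range(n))
    for _ in range(num_samples)
]

observed_se = statistics.pstdev(sample_means)   # spread of the sample means
theoretical_se = sigma / n ** 0.5               # sigma divided by sqrt(n)

print(round(observed_se, 4), round(theoretical_se, 4))
```

The two printed values should be close to each other, and a histogram of `sample_means` would look roughly bell-shaped even though the population is flat.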
Standard Error of the Proportion
the standard deviation of the sampling distribution of the proportion
($\sqrt{\pi(1 - \pi)/n}$)
Confidence Level
the degree of certainty with which an estimation is made ($1 - \alpha$)
Confidence Interval
a range within which you would expect to find the value of a parameter a
certain percent of the time
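As a sketch of how a confidence interval is built from a point estimate, a confidence level, and a standard error, the example below computes a z-based interval for a population mean. It assumes the population standard deviation is known; the function name and the numbers are hypothetical.

```python
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """z-based interval for a population mean, assuming sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value of the standard normal
    standard_error = sigma / n ** 0.5         # standard error of the mean
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical numbers, for illustration only.
low, high = mean_confidence_interval(sample_mean=50.0, sigma=10.0, n=100)
print(round(low, 2), round(high, 2))
```

When the population standard deviation is unknown and the sample is small, the critical value would normally come from the t-distribution instead.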
Point Estimate
a single number (usually the corresponding statistic) used to estimate a
parameter
t-Distribution
a bell-shaped distribution that approaches the standard normal distribution
as the sample size increases
Finite Population Correction
a correction factor applied to the standard error when sampling from a finite
population
Unbiased Estimator
an estimator (e.g., the sample mean) whose expected value is equal to the
corresponding parameter (the population mean)
Hypothesis Testing
a decision rule that specifies the value(s) of a sample statistic for which the
null hypothesis will be rejected
Null Hypothesis
the basic hypothesis that is tested for possible rejection
Alternative Hypothesis
the alternative to the Null Hypothesis so that rejection of the Null
Hypothesis constitutes acceptance of the Alternative Hypothesis
One Sample Tests
tests of hypotheses based on data contained in a single sample
Two Sample Tests
tests of hypotheses based on data contained in two samples
Type I Error
rejection of the null hypothesis when it is true
Type II Error
non-rejection of the null hypothesis when it is false
Level of Significance
the probability of making a Type I error ($\alpha$)
P Value
the probability, when the Null hypothesis is true, that the statistic will
differ from the parameter being tested by at least as much as was observed
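A P value can be compared with the level of significance to decide whether to reject the Null hypothesis. The sketch below does this for a one-sample z test of a mean, assuming a known population standard deviation; the function name and all numbers are hypothetical.

```python
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n, two_tailed=True):
    """P value for a one-sample z test of H0: the population mean equals mu0."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    # Area beyond |z| in one tail of the standard normal distribution.
    upper_tail = 1 - NormalDist().cdf(abs(z))
    # The one-tailed value assumes the observed deviation is in the
    # direction stated by the alternative hypothesis.
    return 2 * upper_tail if two_tailed else upper_tail

alpha = 0.05   # level of significance (probability of a Type I error)
p_value = z_test_p_value(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
reject_null = p_value < alpha

print(round(p_value, 4), reject_null)
```

Here the sample mean lies two standard errors above the hypothesized value, so the two-tailed P value falls just under 0.05.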
Rejection Region
values of the sample statistic for which the Null hypothesis is rejected
One-Tailed Test
a test in which the Null hypothesis is rejected by observing a statistic that
falls in the single tail (rejection region) of the appropriate sampling distribution
Two-Tailed Test
a test in which the Null Hypothesis is rejected by observing a statistic that
falls in either of the two tails (rejection regions) of the appropriate sampling
distribution
Regression Analysis
the estimation of values of a dependent variable from the values of one or
more independent variables
Dependent Variable
the variable (Y) whose values are to be determined
Independent Variable
the variable (X) from which estimates of the dependent variable (Y) are
made
Simple Regression
regression analysis where there is only one independent variable
Multiple Regression
regression analysis when there are two or more independent variables
Regression Equation and Line
the equation and line that describe the relationship between the dependent
variable and the independent variable
Scatter Diagram
a graph on which each plotted point represents an observed pair of values
of the dependent and independent variables
Least Squares
a curve fitting technique that minimizes the sum of the squared deviations
around the regression line (equation)
Coefficient of Determination
the percentage of the variation in the dependent variable that is explained
by variation in the independent variable(s)
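The least squares and coefficient of determination entries can be illustrated with a simple regression on five hypothetical data points. The function names below are assumptions for this sketch; the formulas are the usual closed-form ones for one independent variable.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the line minimising the sum of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                       # regression coefficient
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total

# Hypothetical observed pairs of (independent, dependent) values.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = least_squares_fit(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)
print(intercept, slope, r2)
```

For these points the fitted line is roughly y = 2.2 + 0.6x and explains about 60 percent of the variation in y.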
Correlation Coefficient
a measure of the strength of the relationship between the dependent
variable and the independent variables
Extrapolation
prediction of the value of the dependent variable based on a value of the
independent variable outside the range of the observed data
Standard Error of the Estimate
a measure of the scatter of the observed values of the dependent variable
around the values of the dependent variable estimated from the regression
equation
Regression Coefficient
the slope of the regression equation; the estimated value of the change in
the dependent variable per unit change in the independent variable
Standard Error of the Regression Coefficients
the standard deviation(s) of the sampling distribution(s) of the regression
coefficient(s)
Prediction Interval
interval estimate for predicting the individual value of the dependent
variable
Analysis of Variance
a test of the null hypothesis of no relationship between the dependent and
all of the independent variables considered collectively
Dummy Variable
a qualitative variable included in regression analysis
Multicollinearity
correlation among independent variables
Net Regression Coefficient
a slope coefficient in multiple regression analysis
Time Series
a set of statistical observations arranged in chronological order
Trend
a smooth upward or downward movement of a time series over a long
period of time
Seasonal Variation
repetitive cycles that complete themselves within a one-year period
Cyclical Variation
recurrent upward and downward movements around trend levels with
durations of two to fifteen years
Irregular Variation
erratic, unsystematic fluctuations due to random variation or unforeseen
events
Forecast
the process of predicting a future value of a dependent variable
Moving Average
a method of smoothing time series data by taking averages of successive
values over time
Exponential Smoothing
a form of a weighted moving average
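The two smoothing methods can be compared on a short time series. The sketch below is a minimal illustration: the series is hypothetical, the smoothing constant is called `alpha` here by assumption, and starting the exponentially smoothed series at the first observation is one common convention among several.

```python
def moving_average(series, window):
    """Average each run of `window` successive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

def exponential_smoothing(series, alpha):
    """Weighted moving average whose weights shrink geometrically with age.

    `alpha` (0 < alpha <= 1) is the weight given to the newest observation.
    """
    smoothed = [series[0]]   # assumed convention: start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical time series in chronological order.
series = [10, 12, 14, 13, 15, 17]

ma3 = moving_average(series, window=3)
es = exponential_smoothing(series, alpha=0.5)
print(ma3)
print(es)
```

A larger `alpha` makes the smoothed series follow recent values more closely; a smaller one smooths more heavily.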