Glossary of Statistical Terms

Arithmetic Mean: the total value of a set of observations divided by the number of observations ($\mu$ or $\bar{x}$)
Central Tendency: the “middle” of a data set
Coefficient of Variation: the standard deviation divided by the arithmetic mean ($\sigma/\mu$ or $s/\bar{x}$); used to compare the RELATIVE VARIATION in two or more data sets
Continuous Variable: data can assume any value in a given range (can be fractions or decimals)
Discrete Variable: data can only assume values that are whole numbers
Frequency Distribution: a table that indicates the number of observations that fall into classes of values
Frequency Polygon: a single line graph of a frequency distribution
Histogram: a bar chart of a frequency distribution
Median: the “middle” value when the data are arranged in ascending order
Mode: the value that occurs with the greatest frequency
Modal Class: the class in a frequency distribution that contains the most observations
Cumulative Frequency Distribution: a frequency distribution that presents the number of observations that fall below certain values
Population: all of the observations on a given phenomenon ($N$ = population size)
Sample: part of the population ($n$ = sample size)
Random Sample: a sample in which every observation in the population has the same probability of being included in the sample
Systematic Sampling: a sampling plan in which every kth item in the population is included in the sample after a random start
Stratified Sampling: a sampling plan in which the population is divided into strata (each of which has some common characteristic) and probability samples are drawn from each stratum
Cluster Sampling: a sampling plan in which the population is divided into clusters (each of which has the characteristics of the population) and probability samples are drawn from each cluster
Parameter: a characteristic of a population (signified by Greek letters)
Statistic: a characteristic of a sample (signified by English letters)
Qualitative Variable: values that represent classes or attributes (non-numerical)
Quantitative Variable: values that are numerical
Quartiles: values that divide the data into quarters
Variance: the average of the squared deviations around the mean ($\sigma^2$ or $s^2$)
Standard Deviation: the square root of the variance ($\sigma$ or $s$)
Z Score: the number of standard deviations an observation lies above or below the mean ($Z = (x - \mu)/\sigma$); used to compare the RELATIVE POSITIONS of individual observations in two or more data sets
Statistical Experiment: any process of observation
Outcome: a particular result of an experiment
Event: a subset of the sample space
Compound Event: an event that consists of more than one outcome
Complement of an Event: all of the outcomes in the sample space EXCEPT those that make up the event
Intersection of Events: the outcomes in the sample space that are common to two events simultaneously
Union of Events: the outcomes in the sample space that lie in either (or both) of the events
Mutually Exclusive Events: two events that cannot occur simultaneously
Collectively Exhaustive Events: events that collectively comprise the entire sample space
Probability: the proportion of the time that an event occurs
Classical Probability: the number of outcomes in the sample space that represent an event divided by the total number of outcomes in the sample space
Historical Probability: the proportion of the time that an event occurs in the long run under stable conditions
Subjective Probability: the degree of belief or confidence placed in the occurrence of an event
$P(A^c) = 1 - P(A)$: the probability that event A will not occur
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$: the probability that either event occurs
$P(A \cap B) = P(A) \, P(B \mid A)$: the probability that both events occur
$P(B \mid A) = P(A \cap B) / P(A)$: the probability that event B will occur given that event A has occurred
Independent Events: events are independent when the occurrence of one does not affect the probability of the occurrence of the other
Prior Probability: a probability assigned before the observation of empirical information (before the outcome of the experiment is known)
Posterior Probability (Bayesian Probability): a probability assigned after the observation of empirical information (after the outcome of the experiment is known)
Probability Distribution: a table, equation or graph that shows the relationship between all of the possible outcomes of an experiment and the corresponding probabilities
Binomial Distribution: the probability distribution that is appropriate when (1) the experiment consists of repeated independent trials, (2) each trial has only two possible outcomes (success and failure) and (3) the probability of success is the same for all of the trials
Expected Value: the average outcome for an experiment that is repeated a very large number of times ($E(x) = \sum [x \, P(x)]$)
Random Variable: the numerical description of the outcomes of an experiment
Normal Distribution: a continuous, bell-shaped probability distribution characterized by a mean and standard deviation
Standard Normal Distribution: the normal distribution that has been transformed so that (1) the mean is 0, (2) the standard deviation is 1, and (3) the entire area under the curve is equal to 1
Sampling Distribution: a probability distribution of a statistic (e.g. mean, proportion)
Sampling Distribution of the Mean: a probability distribution that shows all of the possible values that the sample mean can assume for a given sample size and the corresponding probabilities
Sampling Distribution of the Proportion: a probability distribution that shows all possible values that the sample proportion can assume for a given sample size and the corresponding probabilities
Central Limit Theorem: as the sample size increases, the sampling distribution of the mean approaches the normal distribution regardless of the shape of the distribution of the population
Standard Error: the standard deviation of a sampling distribution
Standard Error of the Mean: the standard deviation of the sampling distribution of the mean ($\sigma/\sqrt{n}$)
Standard Error of the Proportion: the standard deviation of the sampling distribution of the proportion ($\sqrt{\pi(1 - \pi)/n}$)
Confidence Level: the degree of certainty with which an estimation is made ($1 - \alpha$)
Confidence Interval: a range within which you would expect to find the value of a parameter a certain percent of the time
Point Estimate: a single number (usually the corresponding statistic) used to estimate a parameter
t-Distribution: a bell-shaped distribution that approaches the standard normal distribution as the sample size increases
Finite Population Correction: a correction factor applied to the standard error when sampling from a finite population
Unbiased Estimator: an estimator whose expected value (e.g. that of the sample mean) is equal to the corresponding parameter (the population mean)
Hypothesis Testing: a decision rule that specifies the value(s) of a sample statistic for which the null hypothesis will be rejected
Null Hypothesis: the basic hypothesis that is tested for possible rejection
Alternative Hypothesis: the alternative to the null hypothesis, so that rejection of the null hypothesis constitutes acceptance of the alternative hypothesis
One Sample Tests: tests of hypotheses based on data contained in a single sample
Two Sample Tests: tests of hypotheses based on data contained in two samples
Type I Error: rejection of the null hypothesis when it is true
Type II Error: non-rejection of the null hypothesis when it is false
Level of Significance: the probability of making a Type I error ($\alpha$)
P Value: the probability, when the null hypothesis is true, that the statistic will differ from the parameter being tested by at least as much as was observed
Rejection Region: values of the sample statistic for which the null hypothesis is rejected
One-Tailed Test: a test in which the null hypothesis is rejected by observing a statistic that falls in the single tail (rejection region) of the appropriate sampling distribution
Two-Tailed Test: a test in which the null hypothesis is rejected by observing a statistic that falls in either of the two tails (rejection regions) of the appropriate sampling distribution
Regression Analysis: the estimation of values of a dependent variable from the values of one or more independent variables
Dependent Variable: the variable (Y) whose values are to be determined
Independent Variable: the variable (X) from which estimates of the dependent variable (Y) are made
Simple Regression: regression analysis where there is only one independent variable
Multiple Regression: regression analysis where there are two or more independent variables
Regression Equation and Line: the equation and line that describe the relationship between the dependent variable and the independent variable
Scatter Diagram: a graph on which each plotted point represents an observed pair of values of the dependent and independent variables
Least Squares: a curve-fitting technique that minimizes the sum of the squared deviations around the regression line (equation)
Coefficient of Determination: the percentage of the variation in the dependent variable that is explained by variation in the independent variable(s)
Correlation Coefficient: a measure of the strength of the relationship between the dependent variable and the independent variables
Extrapolation: prediction of the value of the dependent variable based on a value of the independent variable outside the range of the observed data
Standard Error of the Estimate: a measure of the scatter of the observed values of the dependent variable around the values of the dependent variable estimated from the regression equation
Regression Coefficient: the slope of the regression equation; the estimated value of the change in the dependent variable per unit change in the independent variable
Standard Error of the Regression Coefficients: the standard deviation(s) of the sampling distribution(s) of the regression coefficient(s)
Prediction Interval: an interval estimate for predicting an individual value of the dependent variable
Analysis of Variance: a test of the null hypothesis of no relationship between the dependent variable and all of the independent variables considered collectively
Dummy Variable: a qualitative variable included in regression analysis
Multicollinearity: correlation among independent variables
Net Regression Coefficient: a slope coefficient in multiple regression analysis
Time Series: a set of statistical observations arranged in chronological order
Trend: a smooth upward or downward movement of a time series over a long period of time
Seasonal Variation: repetitive cycles that complete themselves within a one-year period
Cyclical Variation: recurrent upward and downward movements around trend levels, lasting from two to fifteen years
Irregular Variation: erratic, unsystematic fluctuations due to random variation or unforeseen events
Forecast: the process of predicting a future value of a dependent variable
Moving Average: a method of smoothing time series data by taking averages of successive values over time
Exponential Smoothing: a form of a weighted moving average
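
A few of the measures defined in this glossary can be tried out in a short Python sketch. The function names and the data are hypothetical and were invented for illustration. The exponential smoothing recursion shown here (weight alpha on the newest value, seeded with the first observation) is one common form and an assumption, since the glossary only describes it as a weighted moving average. The spread is computed with the sample formula, which divides by n - 1.

```python
import math
import statistics


def coefficient_of_variation(data):
    """Sample standard deviation divided by the arithmetic mean."""
    return statistics.stdev(data) / statistics.mean(data)


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / std_dev


def standard_error_of_mean(std_dev, n):
    """Standard deviation of the sampling distribution of the mean."""
    return std_dev / math.sqrt(n)


def moving_average(series, window):
    """Average each run of `window` successive values in the series."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def exponential_smoothing(series, alpha):
    """Weight alpha on the newest value, 1 - alpha on the previous smoothed value."""
    smoothed = [series[0]]  # assumption: seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical data, invented only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
x_bar = statistics.mean(sample)   # arithmetic mean
s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)

print("mean:", x_bar)
print("median:", statistics.median(sample))
print("mode:", statistics.mode(sample))
print("coefficient of variation:", coefficient_of_variation(sample))
print("z score of 9:", z_score(9, x_bar, s))
print("standard error of the mean:", standard_error_of_mean(s, len(sample)))
print("moving average:", moving_average([10, 12, 14, 16, 18], window=3))
print("exponential smoothing:", exponential_smoothing([10, 12, 14, 16, 18], alpha=0.5))
```

A larger `window` or a smaller `alpha` gives a smoother series that reacts more slowly to recent changes. To treat the data as a full population instead, `statistics.pstdev` could be used in place of `statistics.stdev`.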