Blair Lusby
Holes and Goals
Statistics Glossary

Alpha: the significance level; the likelihood that the population parameter lies outside of
the confidence interval (1 minus the confidence level)
Alternative Hypothesis: the hypothesis that your observations are influenced by a nonrandom cause
Back-to-back Stemplots: a graphic way to compare data from two populations; there is a
center column called the stem with a vertical line on either side of the column, there are
leaves on the left and right side of the column each side representing a different
population
Bar Chart: made up of columns or rows on a graph; the length of the column or row
represents the size of the group defined by the column or row label
Bernoulli Distribution: the probability distribution of a random variable with exactly two
possible outcomes (success or failure) on a single trial; a binomial distribution with one trial
Bias: the tendency to overestimate or underestimate the population parameter
Biased Estimator: a statistic whose sampling distribution has a mean that is not equal to
the population parameter it estimates
Bimodal Distribution: a distribution of data with two clear peaks
Bivariate Data: data from a study of two variables
Blinding: when you do not tell subjects whether they are getting a placebo or the actual
treatment
Boxplot: used for quantitative data; a box is drawn from the first quartile (Q1) to the third
quartile (Q3) with a line inside the box at the median (Q2); there are two “whiskers”, one that
goes from Q1 to the smallest non-outlier and one that goes from Q3 to the largest non-outlier
Central Limit Theorem: if the sample size is large enough, then the sampling distribution of
the sample mean will be normal or nearly normal, regardless of the shape of the population
distribution
Clusters: naturally occurring, separate groups into which a population is divided
Cluster Sampling: a sampling method in which a random sample of clusters is selected and
the research is conducted on those clusters
Combination: a selection of all or part of a set of objects, without regard to the order
Complement: the event that a given event does not occur
Completely Randomized Design: an experimental design in which the subjects are randomly
assigned to treatments, such as the placebo or the experimental treatment
Conditional Probability: the probability that event B occurs given that event A occurred
Confidence Interval: a range of values, computed from a sample statistic, used to express
one's uncertainty about a population parameter
Confidence Level: the percentage of confidence intervals, over repeated samples, that would
be expected to contain the population parameter
Confounding: occurs when the experimental controls do not allow the experimenter to
reasonably eliminate plausible alternative explanations for an observed relationship
between independent and dependent variables
Continuous Variable: a variable that can take on any value between its minimum and
maximum value
Control Group: the group in a study that is the baseline and does not receive the treatment
Convenience Sample: a sample of people who were easy to reach
Correlation: the strength of association between two variables
Critical Value: used to compute the margin of error
Decision Rule: a rule used by researchers to decide whether or not to reject the null hypothesis
Dependent Variable: the “effect” of the independent variable
Deviation Score: the difference between a raw score and the mean
Discrete Variable: a variable that can take on only a countable set of distinct values
Dotplot: a graphic display used to compare frequency counts within categories or groups
Double Bar Chart: has two pieces of information for each category instead of one like
normal bar charts
Double Blinding: when neither the subjects nor the analysts know who is in the control
group and who is in the treatment group
Effect Size: the difference between the true value and the value specified in the null
hypothesis
Estimation: the process of making inferences about a population, based on information
obtained from a sample
Estimator: a statistic used to estimate a population parameter
Event Multiple: a grouping of two or more independent events
Expected Value: the mean of a discrete random variable
Experiment: a controlled study in which the researcher attempts to understand
cause-and-effect relationships
Experimental Design: a plan for assigning subjects to treatment conditions
Frequency Count: a measure of the number of times an event occurs
Geometric Distribution: a negative binomial distribution where the number of successes
is equal to one
Geometric Probability: the probability that a negative binomial experiment will result in
only one success
Histogram: a graph of quantitative data made up of columns; the height of each column
shows the number of observations that fall within an interval
Hypothesis Test: a procedure in which statisticians use sample data to determine whether to
reject a null hypothesis
Independent: when the occurrence of one event does not affect the probability of the
other event occurring
Independent Variable (Factor): the variable that is manipulated by the experimenter to
determine its relationship to an observed phenomenon
Influential Point: an outlier that greatly affects the slope of the regression line
Interquartile Range (IQR): a measure of variability, based on dividing a data set into
quartiles (Q1, Q2, Q3); equal to the difference between the third and first quartiles (Q3 - Q1)
Interval Estimate: two numbers, between which a population parameter is said to lie
Joint Frequency: the entries in the body of a two-way table, not in the total rows or total
columns
Lurking Variable: an extraneous variable that is not included in the study but may affect
the relationship between the variables being studied
Margin of Error: the maximum expected difference between the true population parameter
and a sample estimate of that parameter
Marginal Distribution (Marginal Frequency): in a table these are the entries that go in the
total rows or total columns
Mean: an average score; the sample mean is denoted by x̄ (x with a bar over it)
Measurement Scales: used to categorize and/or quantify variables
Median: a measure of central tendency; to find it you arrange the values from smallest to
largest and choose the middle value, or the average of the two middle values if the number
of values is even
Mode: the most frequent value in a sample
Mutually Exclusive: when two events have no sample points in common
Nominal Scale: a measurement scale where values are assigned to variables which
represent a descriptive category, but have no inherent numerical value with respect to
magnitude
Non-probability Sampling: a sampling method in which the probability that each element
will be included in the sample cannot be specified
Normal Distribution: a symmetric, bell-shaped probability distribution of a continuous
random variable X, defined by its mean and standard deviation
Null Hypothesis: the hypothesis that your observations occur by chance
Null Set: a set that has no elements in it
Observational Study: a study in which the researcher tries to understand cause-and-effect
relationships but cannot control how subjects are assigned to groups or which treatments
each group receives
One-Sample t-Test: used to test if a population mean is significantly different from some
hypothesized value
One-Sample z-Test: used to test if a population parameter is significantly different from
some hypothesized value
One-Tailed Test: a statistical hypothesis test where the region of rejection is on only one
side of the sampling distribution
One-Way Table: when a table only has data for one categorical variable
Outlier: a data point that diverges greatly from the overall pattern of data
Parameter: a measurable characteristic of a population; examples: mean or standard
deviation
Pearson Product-Moment Correlation: measures the strength of the linear association
between variables
Percentage: a way to express a proportion
Percentile: if a data set is rank ordered from smallest to largest, then the values that
divide the rank-ordered set of elements into 100 equal parts are percentiles
Permutation: an arrangement of all or part of a set of objects, with regard to the order
Placebo: a neutral treatment that has no “real” effect
Point Estimate: a single value used to estimate the population parameter
Population: the total set of observations that can be made
Precision: how close estimates from different samples are to each other
Probability: a measure of the likelihood that the event will occur
Probability Distribution: a table or an equation that links each outcome of a statistical
experiment with its probability of occurrence
Probability Sampling: a sampling method in which every element of the population has a
known probability of being included in the sample
Proportion: the fraction of the total
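P-Value: the probability of obtaining a result at least as extreme as the one observed,
assuming the null hypothesis is true; a small P-value is evidence against the null hypothesis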
Qualitative Data: categorical data
Quantitative Data: numerical data
Quartiles: the values that divide a rank-ordered data set into four equal parts
Random Number Table: a list of numbers that are arranged in no predictable order
Random Number: a number determined totally by chance
Random Sampling: a procedure for sampling from a population in which the selection of
a sample unit is based on chance and every element of the population has a known, nonzero
probability of being selected
Random Variable: when the value of a variable is the outcome of a statistical experiment
Randomization: the practice of using chance methods to assign subjects to treatments
Range: a simple measure of variation; the difference between the largest and smallest
values in a data set
Ratio Scale: a type of measurement scale that is characterized by equal intervals between
scale units and an absolute zero
Region of Acceptance: the range of values that leads the researcher not to reject the null
hypothesis
Region of Rejection: the range of values that leads the researcher to reject the null
hypothesis
Replication: the practice of assigning each treatment to many experimental subjects
Representative: describes a good sample, one whose characteristics accurately reflect those
of the population
Residual: the difference between the observed value of the dependent variable and the
predicted value
Residual Plot: a graph that shows the residuals on the vertical axis and the independent
variable on the horizontal axis
Response Bias: the bias that results from problems in the measurement process
Sample: a set of observations drawn from a population
Sample Point: an element of a sample space
Sample Space: a set of elements that represents all possible outcomes of a statistical
experiment
Sample Survey: a study that obtains data from a subset of a population
Sampling: the process of choosing a sample of elements from a total population of
elements
Sampling Error: the error that results from taking one sample rather than taking a census
of the entire population
Sampling Fraction: the proportion of a population to be included in a sample
Sampling Method: a procedure for selecting sample members from a population
Scatterplot: a graphic tool used to display the relationship between two quantitative
variables
Selection Bias: the bias that results from an unrepresentative sample
Set: a well defined collection of objects
Significance Level: the probability of committing a Type I error, which occurs when the
researcher rejects a null hypothesis when it was true
Skewness: a lack of symmetry in a distribution; when graphed, there are more observations
on one side of the graph than the other
Slope: a measure of the steepness of a line
Standard Deviation: a numerical value used to indicate how widely individuals in a group
vary; the square root of the variance
Standard Error: a measure of the variability of a statistic
Standard Score (z score): indicates how many standard deviations an element is from the
mean
Statistic: a characteristic of a sample; used to estimate the value of a population
parameter
Statistical Hypothesis: an assumption about a population parameter
Stemplot: used to display quantitative data; a vertical line with numbers on the left being
stems and the numbers on the right being leaves
Symmetry: used to describe the shape of a data distribution that is able to be cut in half
and the two pieces are mirror images of each other
t-Test: any hypothesis test in which the test statistic follows Student’s t distribution if the
null hypothesis is true
Two Sample t-Test: used to test the difference between two population means
Two Tailed Test: the region of rejection is on both sides of the sampling distribution
Type I Error: occurs when a researcher rejects a null hypothesis that was true
Type II Error: occurs when a researcher fails to reject a null hypothesis that is false
Unbiased Estimator: a statistic whose sampling distribution has a mean that is equal to
the population parameter it estimates
Undercoverage: a type of selection bias; occurs when some members of the population
are inadequately represented in the sample
Uniform Distribution: a probability distribution for which all of the values that a random
variable can take on occur with equal probability
Unimodal Distribution: a distribution of data with one peak
Univariate Data: when a study is done with one variable
Variable: an attribute that describes a person, place, thing, or idea
Variance: a numerical value used to indicate how widely individuals in a group vary; the
average of the squared deviation scores
Voluntary Response Bias: the bias that occurs when sample members are self-selected
volunteers
Voluntary Sample: made up of people who self-select into the survey
Y Intercept: the value of y when x equals zero on the Cartesian coordinate system
z Score (Standard Score): indicates how many standard deviations an element is from the
mean
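
Several of the descriptive terms above (Mean, Median, Range, Variance, Standard Deviation, Interquartile Range, and z Score) can be illustrated with a short calculation. The following is only a minimal sketch using Python's standard statistics module. The list of scores is made-up data, and the helper name z_score is an assumption introduced for this example. The variance and standard deviation here are the sample versions, which divide by n - 1, and the quartiles use the module's default "exclusive" method; other textbooks and software may compute quartiles slightly differently.

```python
import statistics

# Hypothetical sample of eight scores (illustrative data only).
scores = [62, 70, 71, 75, 78, 80, 84, 96]

mean = statistics.mean(scores)          # average score
median = statistics.median(scores)      # average of the two middle values (n is even)
data_range = max(scores) - min(scores)  # largest value minus smallest value
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance

# Quartiles Q1, Q2, Q3 using the module's default "exclusive" method.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                           # interquartile range


def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


if __name__ == "__main__":
    print("mean:", mean)
    print("median:", median)
    print("range:", data_range)
    print("variance:", round(variance, 3))
    print("standard deviation:", round(std_dev, 3))
    print("IQR:", iqr)
    print("z score of 96:", round(z_score(96, mean, std_dev), 3))
```

A z score of zero means the value equals the mean; positive values lie above the mean and negative values lie below it.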