EDUC5504 Midterm Study Guide

Chapter One
Closed Book Information:




What is the purpose of obtaining descriptive statistics?
Descriptive statistics are used to organize and describe the characteristics of a
collection of data.
What is the purpose of obtaining inferential statistics?
Inferential statistics are used to make inferences from a smaller group of data (a
sample) to a possibly larger one (a population).
Chapter Two
Closed Book Information:
1. Definition of:
• Mean is the simplest type of average: sum all values and divide by the number of values
provided.
• Median is the midpoint in a set of scores: 50% fall above, 50% below.
• Mode is the value that occurs most frequently within a set of values.
• Bimodal is the term applied when two values are tied as the most frequent.
2. The difference between N and n: the small “n” is the size of the sample (the number of values
provided), and the big “N” is the size of the population.
3. The symbol for mean is sometimes the capital “M” and sometimes the X-bar (capital “X” with a
line over it).
4. When to use the following (i.e. with what type of data)
• Mean is best used when the data do not include extreme scores and are not categorical.
• Median is used when there are extreme scores and one chooses not to let them distort the
average (as they would if it were computed as the mean).
• Mode is used when the data are categorical in nature and values can fit into only one class,
such as hair color. The categories are mutually exclusive.
Open Book Information
1. Ability to calculate the following, by hand or with a statistics program:



Mean: M = (sum of values)/(number of values) (20)
Median: Order all values from lowest to highest and eliminate values from both ends until the
center point is found; with an even number of values, average the two middle values (24)
Mode: Use the ordering above and count how often each value occurs. Greatest frequency = Mode
(27)
2. Application of key understandings about mean, median and mode to specific examples (similar
to those presented in questions 4-7 in the end-of-chapter exercises)
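The three calculations above can be sketched with Python's standard `statistics` module. The scores are made up for illustration, and the second list simply adds one extreme score to show why the median is preferred when extreme scores are present.

```python
from statistics import mean, median, multimode

# Hypothetical scores, invented for this illustration
scores = [3, 7, 7, 2, 9, 7, 4, 2, 2, 5]

m = mean(scores)                    # sum of values / number of values
md = median(scores)                 # midpoint of the ordered values
modes = sorted(multimode(scores))   # every value tied for most frequent

print("mean:", m)
print("median:", md)
print("mode(s):", modes, "(bimodal)" if len(modes) == 2 else "")

# One extreme score pulls the mean much more than the median
with_extreme = scores + [100]
print("mean with extreme score:", round(mean(with_extreme), 2))
print("median with extreme score:", median(with_extreme))
```

Here the mean moves a long way when the extreme score is added, while the median barely changes.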
Chapter Three
Closed Book Information:
1. Definition and symbol of:
• Range depicts the distance between the highest and lowest scores, found by
subtracting the lowest from the highest.
• Standard Deviation (s) represents the average amount of variability in the scores;
that is, the average distance of the scores from the mean.
• Variance (s²) is the square of the standard deviation.
2. Formula for computing the range is highest value minus lowest value.
a. Inclusive range adds one (1) to that difference: highest minus lowest, plus 1.
b. Exclusive range is as defined above: the high value minus the low value.
3. How to compute the variance if you know the standard deviation: simply
square the standard deviation.
4. The difference between the biased and the unbiased estimate of the standard
deviation: the unbiased estimate divides by n - 1 instead of n, which makes it
slightly larger and therefore more conservative.
a. Under what circumstances the biased estimate can be appropriately used: only to
describe the characteristics of the sample.
b. Under what circumstances the unbiased estimate should be used: all others.
c. Why the unbiased estimate is preferred by researchers: good scientists tend to be
conservative in their estimates. The unbiased estimate allows for that
conservativeness.
• What the size of the standard deviation tells you about the spread of scores,
i.e.:
  • A large standard deviation means scores are more spread out or more different
  • A small standard deviation means scores are more similar
  • If s = 0 or nearly 0, the scores are without variability; essentially identical.
• Another term for variability would be dispersion or spread.
• Know the meaning of each of the symbols in the standard deviation formula:
s is the standard deviation
Σ is sigma, the sum of what follows
X is each individual score
X-bar is the mean of all the scores
n is the sample size
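The formula itself is not reproduced in this guide. Assuming it is the usual sample form with n - 1 in the denominator (the unbiased estimate), the symbols listed above fit together as:

```latex
s = \sqrt{\frac{\sum \left( X - \bar{X} \right)^{2}}{n - 1}}
```

Read from the inside out: subtract the mean from each score, square each difference, sum the squares, divide by n - 1, and take the square root.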
Open Book Information: (38-39)
• Calculate the standard deviation using a statistical program
• Given questions similar to #3 and #4 in the end-of-chapter questions, be able
to explain why there would be more or less variability in a set of scores
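As one possible "statistical program", Python's standard `statistics` module can compute the Chapter Three measures. This is a minimal sketch with made-up scores: `stdev` divides by n - 1 (unbiased estimate) and `pstdev` divides by n (biased estimate).

```python
from statistics import pstdev, stdev, variance

# Hypothetical scores, invented for this illustration
scores = [5, 8, 5, 4, 6, 7, 8, 8, 3, 6]

exclusive_range = max(scores) - min(scores)        # highest minus lowest
inclusive_range = max(scores) - min(scores) + 1    # highest minus lowest, plus 1

s_unbiased = stdev(scores)      # denominator n - 1
s_biased = pstdev(scores)       # denominator n
var_unbiased = variance(scores) # equals s_unbiased squared

print("exclusive range:", exclusive_range)
print("inclusive range:", inclusive_range)
print("unbiased s:", round(s_unbiased, 3))
print("biased s:", round(s_biased, 3))
print("variance (s squared):", round(var_unbiased, 3))

# Identical scores have no variability, so s is 0
no_spread = pstdev([4, 4, 4, 4])
print("s for identical scores:", no_spread)
```

The unbiased value comes out slightly larger than the biased one, which is the "more conservative" estimate described in the closed book section.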
Chapter Four
Closed Book Information:
• What is a frequency distribution? A tally of the number of times each score (or each
class interval of scores) occurs.
• What is a class interval within a frequency distribution? A range of scores that are
grouped together and counted as one category.
• What is a histogram? A graphic representation of a frequency distribution, with a bar
for each class interval whose height shows its frequency.
• What is a simple definition of a graph or chart? A visual representation of
numerical data.
• When would you want to use each of the following graphical
representations? Be able to identify which one would be the best visual
representation for a specific set of data.
  • Pie chart: best for showing the proportion (percentage) of the whole that falls in each group
  • Bar chart: best for comparing values across separate categories
  • Line chart: best for showing a trend in data at equal intervals
• Know the difference and the relationship between a histogram and a
frequency polygon. A frequency polygon shows a continuous line following the
midpoints of each bar on the histogram.
• What does skewness represent in a distribution? Lack of symmetry;
lopsidedness.
• When the “tail” of a distribution is longer to the left, the distribution is
negatively (positively, negatively) skewed.
• When the “tail” of a distribution is longer to the right, the distribution is
positively (positively, negatively) skewed.
• If a distribution is negatively skewed, that means there are more high (high,
low) scores in the distribution.
• If a distribution is positively skewed, that means there are more low (high,
low) scores in the distribution.
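As a rough check on the direction of skew, one common rule of thumb compares the mean with the median, because extreme scores in the tail pull the mean toward them. The sketch below assumes that rule of thumb and uses made-up scores; `skew_direction` is a hypothetical helper name, not something from the text.

```python
from statistics import mean, median

def skew_direction(scores):
    """Rule of thumb only: compare the mean with the median."""
    m, md = mean(scores), median(scores)
    if m > md:
        return "positive"   # long tail toward the high scores (right)
    if m < md:
        return "negative"   # long tail toward the low scores (left)
    return "none"

# Hypothetical data: mostly low scores with one very high score
right_tailed = [1, 2, 2, 3, 3, 3, 4, 15]
# Hypothetical data: mostly high scores with one very low score
left_tailed = [1, 12, 13, 13, 14, 14, 14, 15]

print(skew_direction(right_tailed))
print(skew_direction(left_tailed))
print(skew_direction([1, 2, 3, 4, 5]))
```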
Open Book Information:
• Be able to create a frequency distribution using the guidelines on page 51
• Be able to create a histogram (on the computer; she prefers WolframAlpha for
this) and interpret whether the histogram appears to be a normal
distribution or indicates that the distribution may be positively or negatively
skewed (52-53)
• Given a picture of a distribution that is “flat” (platykurtic) along with one that
is “peaked” (leptokurtic), be able to explain which distribution has more
variability and why (59-60)
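A frequency distribution with class intervals can be sketched in a few lines of Python. The scores and the interval width of 10 are assumptions made for illustration; the textbook's own guidelines for choosing intervals are not reproduced here. The row of asterisks is a crude text histogram.

```python
from collections import Counter

# Hypothetical test scores, invented for this illustration
scores = [47, 52, 55, 58, 61, 63, 64, 66, 68, 69,
          71, 72, 74, 75, 77, 83, 88, 91]
width = 10  # assumed class interval width

# Map each score to the lower bound of its class interval, then count
counts = Counter((score // width) * width for score in scores)

# Highest interval first, one row per class interval
for lower in sorted(counts, reverse=True):
    upper = lower + width - 1
    frequency = counts[lower]
    print(f"{lower}-{upper}: {frequency:2d} {'*' * frequency}")
```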
Chapter Five
Closed Book Information
1. Be able to identify the definition of:
a. Correlation coefficient
b. Direct or positive correlation between variables
i. When X increases, Y increases
ii. When X decreases, Y decreases
c. Indirect or negative correlation between variables
i. When X increases, Y decreases
ii. When X decreases, Y increases
2. The Pearson product-moment correlation is used to examine the relationship
between what types of variables?
a. What are “continuous” variables?
3. What is the overall numerical range of correlation coefficients? -1.0 to +1.0
a. What is the range for positive correlations? 0 to +1.0
b. What is the range for negative correlations? -1.0 to 0
4. How is the strength of a correlation reflected, by the positive/negative sign
or by the absolute value of the correlation? By the absolute value: what matters is
how close it is to an absolute value of 1.
5. Does “negative correlation” imply a value judgment that a correlation is not good
and “positive correlation” imply a value judgment that a correlation is good? NO!
6. What letter is used to represent the Pearson product-moment correlation? r
7. When you are interested in establishing a relationship between two variables,
you are more likely to get a significant correlation with data that vary
significantly (vary significantly/don’t vary significantly).
a. If one variable does not change in value (e.g., age when all the individuals
in the sample are the same age), there is no variability to share: the points on the
scatterplot fall on a straight horizontal or vertical line and the correlation is
zero (strictly speaking, it cannot be computed).
b. If one set of data is constrained (e.g., scores of all gifted students on a
reading comprehension test), the correlation will be smaller
(larger/smaller) than if the data were not constrained (all students
instead of just gifted).
8. Given a picture of a scatterplot with data points identified, be able to determine:
a. The direction of the relationship (direct or indirect)
b. The strength of the correlation (relatively strong or relatively weak)
9. Be able to identify the definition for the coefficient of determination:
the percentage of variance in one variable that is accounted for by the variance in
the other variable (the correlation coefficient squared).
10. Given a specific correlation,
a. be able to calculate the coefficient of determination
b. be able to draw a picture to represent an approximation of how much of
the variance in one variable is accounted for by the variance in the other
variable
11. Given a scenario that describes a correlation between two variables, be able to
recognize when a correlation may be spurious (i.e., correlation exists, but
association cannot be considered causal)
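The Pearson correlation and the coefficient of determination can be computed by hand as sketched below. The function name `pearson_r` and the data are made up for illustration; the calculation is the standard one (sum of cross-products divided by the square root of the product of the two sums of squares).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and quiz score for five students
hours = [1, 2, 3, 4, 5]
quiz = [2, 4, 5, 4, 5]

r = pearson_r(hours, quiz)
r_squared = r ** 2   # coefficient of determination

print(f"r = {r:.2f}")
print(f"r squared = {r_squared:.2f} (share of variance accounted for)")

# A perfectly indirect (negative) relationship
r_negative = pearson_r(hours, [5, 4, 3, 2, 1])
print(f"r = {r_negative:.2f}")
```

A negative r of the same size is just as strong as a positive one; squaring it gives the same coefficient of determination.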
Open Book/Open Notes Information
1. Given two sets of scores and using a statistical software program, be able to
a. compute the Pearson product-moment correlation
b. print a scatterplot of the data
c. identify the slope of the correlation (positive or negative)
d. identify the direction of the correlation (direct or indirect)
e. explain what the direction tells you about the relationship between the
variables
f. identify the strength of the correlation
2. Given a correlation matrix, be able to:
a. identify why some scores are perfectly correlated at 1.0
b. identify whether correlations are direct (positive) or indirect (negative)
c. identify the strength of the correlation
d. explain or identify what the correlation between the two variables means
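A small correlation matrix can be built the same way, which also shows why the diagonal is always 1.0: each variable is being correlated with itself. This is a sketch with made-up variables and data; `pearson_r` is a hypothetical helper defined here, not a function from any particular statistics package.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical variables measured on the same five students
data = {
    "reading": [2, 4, 5, 4, 5],
    "math": [1, 2, 3, 4, 5],
    "absences": [5, 4, 3, 2, 1],
}
names = list(data)

# Every variable paired with every variable, including itself
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

print(" " * 10 + " ".join(name.rjust(8) for name in names))
for a in names:
    print(a.ljust(10) + " ".join(f"{matrix[a][b]:+8.2f}" for b in names))
```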
Chapter Six
Closed Book Information:
1. What is reliability? The quality of a test such that it is consistent.
2. What is validity? The quality of a test such that it tests what it says it does.
3. Why is each of the above terms important? Because it is important that all tests be both
reliable and valid.
4. Can an assessment instrument be reliable without being valid? Yes. A test may provide a
consistent measure of something without measuring what it is supposed to measure.
5. Can an assessment instrument be valid without being reliable? No. A test must measure
consistently before it can be said to measure what it claims to measure.
6. Why is it important for an assessment instrument to be reliable? If students run a
50-meter test on Tuesday in gentle sunshine and on Wednesday in pouring rain on a mucky
track, the consistency of the test is in jeopardy.
7. When considering scores from an assessment,
a. What is the observed score? The score that is actually recorded or observed.
b. What is the true score? The unobservable part of an observed score that reflects
the actual ability or behavior; the true, 100% reflection of what is really known.
c. What is the error score? The part of a test score that is random and contributes to
the unreliability of a test.
8. What is the formula for the true score? Observed Score - Error Score = True Score.
9. What is the relationship between error and reliability? The less error, the greater the
reliability.
10. Understand the difference between the two types of variables:
a. Dependent or outcome variable: the outcome variable or the predicted variable
in a regression equation.
b. Independent (also called predictor or treatment variable): the treatment
variable that is manipulated or the predictor variable in a regression equation.
11. When calculating correlation coefficients for the purpose of reliability:
a. Should the coefficients be positive, negative or both? Positive. A negative
coefficient would mean that high scores on one administration go with low scores on
the other, which is the opposite of consistency.
b. How large should they be? (somewhat of a trick question) As large as possible,
that is, as close to +1.0 as possible.
Open Book Information:
12. Be able to identify the level of measurement that an assessment score belongs to:
a. Nominal: the most gross level of measurement, where variables can only be placed in
categories.
b. Ordinal: a level of measurement that is characterized by things being ordered.
c. Interval: a level of measurement that is defined by equal spacing between
points along the scale.
d. Ratio: a level of measurement defined as having an absolute zero.
13. Using the above information, be able to identify the type of correlation coefficient you
would use to identify the level of relationship between two variables (this hearkens
back to Chapter Five).
14. Given a scenario about a researcher trying to determine reliability, be able to identify
which type of reliability the researcher needs to use, which type of correlation
coefficient would be used, and how you would interpret the reliability coefficient:
a. Test-retest: a type of reliability that examines consistency over time.
b. Parallel forms (sometimes called equivalent forms): a type of reliability that
examines the consistency across different forms of the same test.
c. Internal consistency: a type of reliability that examines the one-dimensional
nature of an assessment tool.
d. Interrater reliability: a type of reliability that examines the consistency of raters.
15. Given a scenario about a researcher trying to determine validity, be able to identify
which type of validity is being assessed.
a. Content
b. Criterion-related
i. Predictive
ii. Concurrent
c. Construct
16. Be able to compute a simple correlation coefficient and scatterplot using Hyperstat.
Chapter Seven
Closed Book Information:
1. What is the most important role of a hypothesis?
2. What is a population? All the possible subjects or cases of interest.
3. What is a sample? A subset of a population.
4. What is sampling error? The difference between sample and population values.
5. When samples are collected from populations, what is the goal? To select a sample that
represents the population as closely as possible.
6. What is the null hypothesis? A statement of equality between sets of variables.
7. What is the research hypothesis? A statement of inequality between two variables.
a. What is the purpose of the research hypothesis? To give boundaries to the study.
8. The null hypothesis always refers to the _____________ (sample/population). The research
hypothesis always refers to the __________ (sample/population).
Open Book Information:
1. Given an example of a hypothesis, be able to:
a. Identify whether it is a research hypothesis or a null hypothesis
b. Identify whether a research hypothesis is directional or nondirectional
c. Write an equation to represent the hypothesis and be able to explain what each
symbol in the equation means
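As an illustration of item c, hypotheses comparing two groups are commonly written as below. The notation is an assumption based on a common textbook convention, not taken from this guide: H_0 is the null hypothesis, H_1 the research hypothesis, \mu a population mean, \bar{X} a sample mean, and the subscripts 1 and 2 label the two groups.

```latex
H_0 : \mu_1 = \mu_2                 % null hypothesis: the two groups are equal
H_1 : \bar{X}_1 \neq \bar{X}_2      % nondirectional research hypothesis: the groups differ
H_1 : \bar{X}_1 > \bar{X}_2         % directional research hypothesis: group 1 scores higher
```

A nondirectional hypothesis only says the groups differ; a directional one also says which group is expected to be higher.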
Chapter Eight
Closed Book Information:
9. The study of probability is the basis for the normal curve and the foundation of
inferential statistics. The normal curve provides us with a basis for understanding the
probability associated with any possible outcome.
10. The tools of probability and the study of the normal curve allow a researcher to
determine the mathematical likelihood that a difference found between two research
outcomes is not due to chance alone.
11. Be able to identify the three characteristics that are associated with the normal curve.
a. unskewed
b. symmetrical about the mean
c. asymptotic
12. Be able to identify the relative probability of events that occur at various standard
deviations of the normal curve. For example, if the standard deviation is 10 points, each
standard deviation marks off 10 points in either direction from the mean.
13. In a normal distribution almost 100% of all scores occur between -3 standard
deviations and +3 standard deviations.
14. Standard scores (such as z scores) allow researchers to do something that they cannot
do with raw scores. What is that? Compare scores that come from different distributions
and would otherwise not be comparable.
15. Standard scores are comparable because they are standardized in units of standard
deviations.
16. Z scores that are below the mean are negative and z scores that are above the mean are
positive.
17. Positive z scores always fall to the _______ of the mean and are in the ______ _______ of the
distribution. Negative z scores always fall to the _______ of the mean and are in the ______
________ of the distribution.
18. A z score is simply the number of standard deviations from the mean.
19. When raw scores are represented as standard scores such as z scores, they are __________.
20. In hypothesis testing, if an event (score, outcome) seems to occur only ______ times (or
less) out of 100, (_____%), we will deem that event to be rather unlikely relative to all
other events that could occur.
21. Applying the concept in #20 to research, we know that if you are comparing two
outcomes in a research study (e.g., traditional approach to teaching reading vs. new
method to teaching reading) and the new reading approach yields a higher score than
the traditional approach at a probability level of .05 or less, we can assume that the
higher scores are not due to __________.
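The percentages under the normal curve can be checked with `NormalDist` from Python's standard `statistics` module. This is a minimal sketch; `area_between` is a hypothetical helper name, and the standard normal curve (mean 0, standard deviation 1) is assumed.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_between(lo, hi):
    """Proportion of scores between two points, in standard deviation units."""
    return std_normal.cdf(hi) - std_normal.cdf(lo)

# Area in each band on one side of the mean (the other side mirrors it)
for lo, hi in [(0, 1), (1, 2), (2, 3)]:
    print(f"{lo} to {hi} SD from the mean: {area_between(lo, hi):.2%}")

within_3 = area_between(-3, 3)
print(f"between -3 and +3 SD: {within_3:.2%}")
```

The last figure is why "almost 100%" of scores are said to fall within three standard deviations of the mean.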
Open Book Information:
2. Given a picture of the normal curve, be able to
a. Write in a rounded percentage figure for each area of the distribution between
standard deviations
b. Identify where the mean is located
c. Identify the approximate percentile rank for each standard deviation
3. Given a set of raw scores, be able to convert the scores to z scores.
4. Using Table B.1 in the appendix of your text and raw scores that you have converted to z
scores, be able to determine the probability of a score falling
a. above or below a particular score.
b. between two particular scores
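Converting raw scores to z scores and finding the related probabilities can be sketched as follows. The raw scores are made up, the sample standard deviation (n - 1) is assumed, and the normal curve function from Python's standard library stands in for a lookup in Table B.1, which is not reproduced here. The helper names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical raw scores, invented for this illustration
raw_scores = [85, 90, 95, 100, 105, 110, 115]

m = mean(raw_scores)
s = stdev(raw_scores)  # sample standard deviation (n - 1)

# z = (raw score - mean) / standard deviation
z_scores = [(x - m) / s for x in raw_scores]
for x, z in zip(raw_scores, z_scores):
    print(f"raw {x:3d} -> z {z:+.2f}")

std_normal = NormalDist()  # mean 0, standard deviation 1

def p_below(z):
    return std_normal.cdf(z)

def p_above(z):
    return 1 - std_normal.cdf(z)

def p_between(z_lo, z_hi):
    return std_normal.cdf(z_hi) - std_normal.cdf(z_lo)

print(f"P(score below z = 1.0):  {p_below(1.0):.4f}")
print(f"P(score above z = 1.0):  {p_above(1.0):.4f}")
print(f"P(between z = -1 and 1): {p_between(-1.0, 1.0):.4f}")
```

Scores below the mean come out with negative z scores and scores above the mean with positive ones, and a score exactly at the mean has z = 0.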