PSC 211 Midterm Study Guide
You should be able to define and give the significance of the following. Where applicable, you
should know the symbol that represents the concept and be familiar with the formula for deriving
the statistic.
Closed-book section of the exam:
Inductive reasoning
Deductive reasoning
Subjective vs. objective reasoning
Descriptive statistics
Inferential statistics
Randomness
Data
Case
Unit of analysis
Ecological fallacy
Sample
Population
Nominal, ordinal, interval, and ratio level variables
Discrete versus continuous variables
Frequency distributions
Percentage distributions
Cumulative distributions
Unimodal distribution
Sum of squares
Variance
Standard deviation
Bimodal distribution
Mean
Median
Mode
Skewness
N
Normal distribution
z-scores
sampling distribution
distribution of a sample
population distribution
standard error
standardized variables
95% confidence interval
99% confidence interval
Positive relationship
Negative relationship
Curvilinear relationship
Central limit theorem
Proportion
Open Book Section of Exam
There will be an open-book, open-notes section of the exam. In this portion, you will need to:
• Calculate the mean, median, and mode.
• Calculate frequency distributions, percentage distributions, and cumulative percentage
distributions.
• Calculate the variance, standard deviation, z-scores, standard error, and confidence
intervals for data sets.
• Draw pie charts and bar graphs to describe data.
NOTE: Be certain that you can provide a substantive interpretation of the statistics. In other
words, how would you explain the significance of a statistic to your grandmother? The ability
to provide a meaningful translation is essential to understanding and conducting political science
research.
FOR THE EXAM:
Bring a calculator, pencils, and erasers. I will supply all paper and a stapler.
Closed-book section:
- inductive reasoning: reasoning from detailed facts to general principles
o for example, using a set of observations to make generalizations about data, like
“people with pets live longer.” It cannot be proved absolutely that owning a pet
makes you live longer.
▪ Statistical inference is a type of inductive reasoning; i.e., assuming that
something is true of a population because it is true of a representative
sample. The inference is trustworthy only when the sample is representative and large enough.
- deductive reasoning: reasoning from the general to the particular
o for example, the classic syllogism:
▪ All men are mortal
▪ Socrates is a man
▪ Therefore, Socrates is mortal
- subjective v. objective reasoning:
- descriptive statistics: methods for summarizing information so that it is more intelligible,
more useful, or can be communicated more effectively
o e.g., calculating averages, graphing techniques (baseball stats)
o helps us understand what the data actually mean
- inferential statistics: procedures used to generalize from a sample to the larger
population and to assess the confidence we have in such generalizing
o for example, opinion polls using representative samples, with a margin of error
o 95% and 99% confidence intervals
o relevant because it helps us understand things about large groups that we wouldn’t be
able to measure completely
- randomness: we may see patterns in the world and society that aren’t there, but also fail to
notice patterns that do exist. Statistics can help us see through the seeming randomness
of phenomena
- data: the unsummarized records of observations that statistics makes more manageable
o ex: the record of what happened in every at-bat
- unit of analysis: the person, object, or event that a researcher is studying
o ex: individuals, groups, editorials, elections
- case: the specific unit from which data are collected
o ex: the person being interviewed, college students
- ecological fallacy: the logical error of inferring characteristics of individuals from
aggregate data
o ex: since people who have dogs tend to live longer, and I own a dog, I will live
longer
- aggregate data: data in which the cases are larger units of analysis
- sample: a part of the population that, when chosen randomly, can be generalized to the
population with a known degree of confidence
o statistic: a characteristic of a sample
- population: all or almost all cases to which a researcher wants to generalize
o ex: research on Kentucky: population = all the people living in KY
o often too large, expensive, time-consuming, or rapidly changing to collect data
from the entire population
o parameter: a characteristic of a population
- averages:
o mode: (or Mo) the most frequently occurring score on a variable (for example:
female is the modal gender in the US)
▪ unimodal distribution: distribution in which one score occurs
considerably more often than other scores (i.e., it has only one mode); there
will be only one “hump”
▪ bimodal distribution: bar graph or histogram shows two scores that are
obviously the most common (it has 2 modes); may resemble a camel’s
two humps
o median: (or Md) the value that divides an ordered set of scores in half
▪ calculating the median for:
• an odd number of scores: put the scores in order from lowest to
highest, then find the middle score
• an even number of scores: put the scores in order from lowest to
highest, find the two middle scores, and average these two scores by
adding them and dividing by 2
o mean: (or $\bar{X}$) the arithmetic average, found by dividing the sum of all scores
by the number of scores (or N): $\bar{X} = \frac{\sum X_i}{N}$
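The three averages can be checked by hand against a short script. The following is only a sketch in Python; the function names and the data set are made up for illustration and are not from the class notes.

```python
from collections import Counter

def mean(scores):
    """Arithmetic mean: the sum of all scores divided by N."""
    return sum(scores) / len(scores)

def median(scores):
    """Middle score of the ordered set; with an even N, average the two middle scores."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def modes(scores):
    """Every score tied for the highest frequency (one if unimodal, two if bimodal)."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(score for score, count in counts.items() if count == top)

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data, N = 6
print(mean(scores))    # 5.0
print(median(scores))  # 4.0 (average of the two middle scores, 3 and 5)
print(modes(scores))   # [3]
```

Returning a list from `modes` is a design choice made here so that a bimodal set such as [1, 1, 2, 2, 3] reports both modes.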
Levels of measurement:
1. nominal variable: one whose attributes are simply different from one another, not ordered along
some underlying continuum (like high to low)
a. ex: male and female; red, white, and blue
2. dichotomous variable: has exactly two values
a. ex: yes and no, male and female
b. nominal and ordinal variables can be dichotomous
3. ordinal variable: one whose values can be rank-ordered, but nothing else
a. ex: none, a little, a lot, always / social class
4. interval variable: has values that can be rank-ordered and that are measured in a standard unit of
measurement (ex: dollars, pounds, inches)
5. ratio variable: like an interval variable, but has a non-arbitrary zero point
representing the absence of the characteristic being measured (ex: number of
years in school, number of hours spent watching TV, temperature in Kelvin)
6. interval-ratio variables: since there aren’t many interval variables in the social
sciences, and interval and ratio variables can usually be handled the same way, the two are grouped
together here
- continuous variable: can take on any value in a range of possible values (ex: age
measured to the second, attitudinal variables)
- discrete variable: can have only certain values within its range (ex: family size = 1, 2, …)
o a nominal variable is always discrete, but an interval-ratio variable can be discrete or continuous
- frequency distribution: summarizes data by counting the number of cases with each
score
- percentage distribution: expresses each frequency as a percentage of all cases; makes summaries easier to
understand and compare, especially with large numbers of observations
- cumulative distributions: tell us things like what % of respondents score at or below (or above) x; useful only for ordinal or interval-ratio variables
o cumulative percentage: the percentage of all scores that have a given value or
less (calculated as $\frac{F}{N} \times 100$, where F is the cumulative frequency and N is the number of cases)
o cumulative frequency: the sum of all frequencies of a given or lesser value
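To see how the frequency, percentage, and cumulative columns relate, here is a small Python sketch. The function name `distributions` and the family-size data are hypothetical, chosen only to illustrate the definitions above.

```python
from collections import Counter

def distributions(scores):
    """Return one row per value:
    (value, frequency, percentage, cumulative frequency, cumulative percentage)."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cum_freq = 0
    for value in sorted(counts):      # ordering only makes sense for ordinal or interval-ratio data
        freq = counts[value]
        cum_freq += freq              # sum of frequencies of this value or less
        rows.append((value, freq, 100 * freq / n, cum_freq, 100 * cum_freq / n))
    return rows

family_sizes = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # hypothetical data, N = 10
for row in distributions(family_sizes):
    print(row)
# (1, 1, 10.0, 1, 10.0)
# (2, 2, 20.0, 3, 30.0)
# (3, 3, 30.0, 6, 60.0)
# (4, 2, 20.0, 8, 80.0)
# (5, 2, 20.0, 10, 100.0)
```

The last cumulative percentage should always be 100, which is a quick check on hand calculations.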
- sum of squares: the sum of squared deviations from the mean (calculated as
$\sum (X_i - \bar{X})^2$; in other words, subtract the mean from each score, square each
difference, and add all these together)
- measures of variation: summarize how close together or spread out scores are
o variance: (represented as $s^2$) the average squared deviation from the mean, calculated as
$s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1}$
for a sample, and the same for a population except with “N” in place of “N-1”, where
N is the number of cases, $\bar{X}$ is the mean, and $X_i$ is a score
o standard deviation: (represented as s for a sample, σ for a population) roughly the typical deviation from the
mean; calculated as the square root of the variance: $s = \sqrt{s^2}$
o z-score (standard score): the number of standard deviations that a score is from
the mean; gives us a standard measure of variation that can be used to compare
scores from distributions with different means and standard deviations
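The chain from sum of squares to variance to standard deviation to z-scores can be sketched in Python as follows. The function names, the `sample` flag, and the data are assumptions made for illustration; the z-score is computed as (score minus mean) divided by the standard deviation, which follows from the definition above.

```python
import math

def sum_of_squares(scores):
    """Sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def variance(scores, sample=True):
    """Divide the sum of squares by N-1 for a sample, or by N for a population."""
    n = len(scores)
    return sum_of_squares(scores) / (n - 1 if sample else n)

def standard_deviation(scores, sample=True):
    """Square root of the variance."""
    return math.sqrt(variance(scores, sample))

def z_scores(scores, sample=True):
    """Number of standard deviations each score lies from the mean."""
    mean = sum(scores) / len(scores)
    sd = standard_deviation(scores, sample)
    return [(x - mean) / sd for x in scores]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5
print(sum_of_squares(scores))                    # 32.0
print(variance(scores, sample=False))            # 4.0  (32 / 8)
print(standard_deviation(scores, sample=False))  # 2.0
print(z_scores(scores, sample=False))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Treated as a sample instead, the same data give a variance of 32 / 7, about 4.57, which shows how much the N-1 divisor matters when N is small.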
- The frequency distribution table we did in class:
Title = [dependent variable] by [independent variable] (in [freq. or percent.])
Columns: categories of the independent variable, in ascending order (low to high)
Rows: categories of the dependent variable, in descending order

Civil Disobedience by Education (in frequencies)
              < High School   High School   College   Total
Conscience          4             13           12       29
Obey Law            6             11            4       21
Total              10             24           16       50
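The title template allows the same table to be reported in percentages instead of frequencies. A common convention, assumed here rather than taken from the class notes, is to percentage within each category of the independent variable (each education column). A minimal Python sketch using the frequencies from the class table, with a hypothetical helper name:

```python
# Frequencies from the class table, keyed by education level (independent variable).
table = {
    "< High School": {"Conscience": 4, "Obey Law": 6},
    "High School": {"Conscience": 13, "Obey Law": 11},
    "College": {"Conscience": 12, "Obey Law": 4},
}

def column_percentages(table):
    """Convert each column's frequencies to percentages of that column's total."""
    result = {}
    for column, cells in table.items():
        total = sum(cells.values())
        result[column] = {
            row: round(100 * count / total, 1) for row, count in cells.items()
        }
    return result

for column, cells in column_percentages(table).items():
    print(column, cells)
# < High School {'Conscience': 40.0, 'Obey Law': 60.0}
# High School {'Conscience': 54.2, 'Obey Law': 45.8}
# College {'Conscience': 75.0, 'Obey Law': 25.0}
```

Read this way, the share choosing "Conscience" rises from 40% among those with less than high school to 75% among the college educated, which suggests a positive relationship between education and that response.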