PSYC60 Review
—  Descriptive statistics
—  Summarization and organization of data
—  Numbers, charts, tables, graphs, etc.
—  Inferential statistics
—  Use observations (data) to predict another thing
—  Reach conclusions that extend beyond the immediate data alone
Descriptive vs. Inferential
Population = All UCSD students
Sample = UCSD students who are art majors
—  Confounding variable
—  Extraneous variable that has an effect on the dependent variable
—  Random sampling
—  Choose the entire group of participants in your sample randomly from a given population
—  Random assignment
—  Randomly assign those participants to either control or experimental groups
Basic concepts
—  Qualitative
—  Nominal measurement
—  Different categories/kinds
—  Ranked
—  Ordinal measurement (order of rank)
—  Different ranking/standing within a group
—  Quantitative
—  Interval/ratio measurement (only ratio scales have a true zero)
—  Different amounts
Three types of data
—  Mean/Median/Mode
—  Population mean: µ (mu)
—  Sample mean: x̄ (x bar)
—  Outliers
—  A very extreme score
Measure of Central Tendency
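A minimal sketch of the three measures, using Python's standard `statistics` module. The scores are made up for illustration; the value 40 plays the role of an outlier and shows how an extreme score pulls the mean but not the median or mode.

```python
import statistics

# Hypothetical scores; 40 is a very extreme score (outlier).
scores = [2, 3, 3, 4, 5, 40]

mean = statistics.mean(scores)      # sum of the scores divided by how many there are
median = statistics.median(scores)  # middle of the sorted scores
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```

The outlier drags the mean well above the median, which is why the median is often preferred for skewed data.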
—  In a skewed distribution, the mean is pulled toward the tail
—  Positively skewed distribution
—  Negatively skewed distribution
Skew
—  Range
—  Difference between smallest and largest values in a data set
—  Variance
—  Mean of all squared deviation scores
—  Population variance vs. sample variance
Variability
—  Standard deviation
—  How much variation exists around the mean or expected value
—  Population vs. sample
Variability
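A short sketch of range, variance, and standard deviation with made-up data. It assumes the usual convention that the population versions divide the sum of squares by N and the sample versions divide by n - 1.

```python
import statistics

# Hypothetical data set with mean 5 and sum of squared deviations 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(scores) - min(scores)   # largest value minus smallest value
pop_var = statistics.pvariance(scores)   # SS / N
samp_var = statistics.variance(scores)   # SS / (n - 1)
pop_sd = statistics.pstdev(scores)       # square root of the population variance
samp_sd = statistics.stdev(scores)       # square root of the sample variance

print(data_range, pop_var, samp_var, pop_sd, samp_sd)
```

The sample versions come out slightly larger because they divide by n - 1 rather than N.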
—  Unit free
—  Standardized score that indicates how many standard deviations a score is above or below the mean of its distribution
—  + or – sign: above (+) or below (–) the mean
—  A number
—  Formula changes a raw score into a standardized z-score!
Z scores
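The formula itself did not survive in this transcript. A sketch assuming the standard definition, z = (raw score - mean) / standard deviation, with hypothetical numbers (mean 100, SD 15):

```python
def z_score(raw, mean, sd):
    """Number of standard deviations that `raw` lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

# Hypothetical distribution with mean 100 and standard deviation 15.
above = z_score(130, mean=100, sd=15)   # two SDs above the mean
below = z_score(85, mean=100, sd=15)    # one SD below the mean

print(above, below)
```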
—  3-step process:
1.  Calculate z-score using formula
2.  Draw a visual depiction
3.  Look at the table (percentage)
—  Even if you get a negative value for your z-score, you look for the same value but positive on the table
—  Finding the raw score is the same process but reversed
Find the area beyond z
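As a rough stand-in for the printed table, the area beyond z can be computed from the standard normal curve. This sketch uses `statistics.NormalDist` (Python 3.8+); the function names are illustrative, not from the slides.

```python
from statistics import NormalDist

def area_beyond_z(z):
    """Proportion of the standard normal curve in the tail beyond |z|."""
    # The curve is symmetric, so a negative z uses the same value as its positive twin.
    return 1 - NormalDist().cdf(abs(z))

def raw_from_z(z, mean, sd):
    """Reverse process: convert a z-score back into a raw score."""
    return mean + z * sd

print(area_beyond_z(1.96))           # about 0.025
print(area_beyond_z(-1.96))          # same area, by symmetry
print(raw_from_z(-1.0, 100, 15))     # hypothetical mean 100, SD 15
```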
—  Use the z-test when the population standard deviation is known
—  Use the t-test when the population standard deviation is unknown
z-test or t-test?
—  Measures the strength of the relationship between two variables
—  Positive correlation
◦  2 variables move in the same direction
◦  Lower left to upper right
—  Negative correlation
◦  2 variables move in opposite directions
◦  Upper left to lower right
—  No correlation: irregular pattern
Correlation
—  Perfect linear relationship
Perfect correlation
—  Describes strength and direction of the relationship between the two variables
—  Pearson’s r
◦  A number b/t -1 and +1
◦  Sign indicates direction
◦  Number indicates strength
◦  0 indicates no linear relationship
—  0.1 = weak
—  0.3 = moderate
—  0.5 and above = strong
Correlation coefficient
—  Correlation does NOT imply causation
—  One variable may predict the other, but that does not mean it causes it to happen
Pearson’s correlation
—  Sum of Squares (SS)
◦  Calculate the difference between each score & the mean (the difference score)
◦  Square each difference score
◦  Add up the squared difference scores
—  Sum of Products (SPxy)
◦  Calculate the difference scores for x and for y
◦  MULTIPLY each pair of x and y difference scores together
◦  Add everything up
Pearson’s correlation
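The steps above can be sketched as follows. The final formula is not shown in the transcript; this assumes the common form r = SPxy / sqrt(SSx * SSy). The data are made up.

```python
import math

def sum_of_squares(values):
    """SS: add up the squared difference scores (score minus mean)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def sum_of_products(xs, ys):
    """SPxy: multiply paired x and y difference scores, then add everything up."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

def pearson_r(xs, ys):
    """Assumed formula: r = SPxy / sqrt(SSx * SSy)."""
    return sum_of_products(xs, ys) / math.sqrt(sum_of_squares(xs) * sum_of_squares(ys))

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(r)   # positive, so the two variables move in the same direction
```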
—  Way of predicting values of one variable from another
—  Fit a line through the data points to find the best line that predicts Y from X
—  Least squares regression
—  “y-hat” = the predicted value of y or the dependent variable
—  X = the independent variable
—  b = slope
—  a = y-intercept
Regression
—  SS = sum of squares
—  r = Pearson’s r
Least Squares Regression
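The slope and intercept formulas are missing from this transcript. A sketch assuming the usual textbook forms, b = r * sqrt(SSy / SSx) and a = mean of Y - b * mean of X, with the prediction y-hat = bX + a. The function name and data are illustrative only.

```python
import math

def least_squares_line(xs, ys):
    """Return (a, b) for the line y-hat = b * X + a, using SS and Pearson's r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sp_xy / math.sqrt(ss_x * ss_y)
    b = r * math.sqrt(ss_y / ss_x)   # slope (assumed formula)
    a = mean_y - b * mean_x          # y-intercept (assumed formula)
    return a, b

# Hypothetical paired scores.
a, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
y_hat = b * 6 + a   # predicted Y for X = 6
print(a, b, y_hat)
```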
—  A rough measure of the average amount by which known Y values deviate from their predicted Y values
—  As r gets stronger (closer to -1 or +1), this decreases
—  s sub y, given x
—  …give or take _____ units
Standard error of estimate
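The formula is not in the transcript. One common textbook form, assumed here, is s(y|x) = sqrt(SSy * (1 - r^2) / (n - 2)). The numbers are hypothetical and only show that a stronger r gives a smaller standard error of estimate.

```python
import math

def standard_error_of_estimate(ss_y, r, n):
    """Assumed textbook form: sqrt(SS_y * (1 - r^2) / (n - 2))."""
    return math.sqrt(ss_y * (1 - r ** 2) / (n - 2))

# Hypothetical: SS_y = 6 and n = 5 pairs, with a weaker and a stronger correlation.
weak = standard_error_of_estimate(ss_y=6, r=math.sqrt(0.2), n=5)
strong = standard_error_of_estimate(ss_y=6, r=math.sqrt(0.6), n=5)
print(weak, strong)   # predictions are off by roughly this many units, give or take
```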
—  “Goodness of fit” of a regression
—  Overall measure of the accuracy of the regression
—  The higher the coefficient of determination, the larger the proportion of variance in the dependent variable that is explained by the independent variable
—  What proportion of the variance in personal happiness can be explained by # of candy bars eaten?
Coefficient of determination
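For a simple regression with one predictor, the coefficient of determination is the square of Pearson's r. A tiny sketch with a made-up correlation between candy bars eaten and happiness:

```python
def coefficient_of_determination(r):
    """r squared: proportion of variance in Y explained by X."""
    return r ** 2

# Hypothetical correlation between # of candy bars eaten and personal happiness.
r = 0.5
r_squared = coefficient_of_determination(r)
print(f"{r_squared:.0%} of the variance is explained")
```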
—  Mean of all sample means
◦  Always equal to the value of the population mean
—  Standard error of the mean
◦  Rough measure of the average amount by which sample means deviate from the mean of the sampling distribution
◦  Will decrease with larger sample sizes
◦  The larger the sample size, the more precise the statistics
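A sketch of the standard error of the mean, assuming the usual formula σ / sqrt(n). The values are hypothetical and show the error shrinking as the sample size grows.

```python
import math

def standard_error_of_mean(sigma, n):
    """Population standard deviation divided by the square root of the sample size."""
    return sigma / math.sqrt(n)

# Hypothetical population SD of 15 with two sample sizes.
small_sample = standard_error_of_mean(15, 25)
large_sample = standard_error_of_mean(15, 100)
print(small_sample, large_sample)
```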
—  Define your question
—  Identify the hypotheses
◦  Null vs. Alternative
◦  One tailed vs. two tailed
—  Specify decision rule
◦  Critical value
—  Calculate observed z-value
—  Make a decision and interpret
—  Null hypothesis: No real effect; nothing special is happening
—  Alternative hypothesis: there is an effect
Hypothesis Test
—  In a quality control situation the mean weight of objects produced is supposed to be 16 ounces with a standard deviation of 0.4 ounces.
—  A random sample of 70 objects yields a mean weight of 15.8 ounces. Is it reasonable to assume that the production standards are being maintained?
—  H0: The production standards are being maintained, µ = 16
—  H1: The production standards are not being maintained, µ ≠ 16
—  Always assume alpha level = 0.05 if not specified
—  If the absolute value of the calculated z-value > the z-critical, you reject the null (reject H0)
—  If the absolute value of the calculated z-value < the z-critical, you retain the null (fail to reject H0)
—  The same goes for t-tests
Decision
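The quality control example can be worked through as a two-tailed z-test. This sketch assumes z = (sample mean - µ) / (σ / sqrt(n)) and takes the critical value from the standard normal curve instead of a printed table; the function name is illustrative.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Two-tailed z-test: returns (observed z, critical z, reject H0?)."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return z, z_critical, abs(z) > z_critical

# Numbers from the example: µ = 16, σ = 0.4, n = 70, sample mean = 15.8.
z, z_critical, reject = z_test(15.8, 16, 0.4, 70)
print(round(z, 2), round(z_critical, 2), reject)
```

Here |z| is far larger than the critical value, so H0 is rejected: it is not reasonable to assume the production standards are being maintained.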
—  Two tailed
—  Does the amount of candy that statistics students eat differ from that of other students here?
—  One tailed
—  Do statistics students eat more candy than other students here? (upper)
—  Do statistics students eat less candy than other students here? (lower)
One tailed vs. Two tailed
Confidence Interval
—  Compare a single sample to a population with a:
◦  Known mean
◦  Unknown standard deviation
◦  Using the estimated standard error
The one sample t-test
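A sketch of the one sample t-test, assuming t = (sample mean - µ) / (s / sqrt(n)), where s / sqrt(n) is the estimated standard error and df = n - 1. The sample and the population mean of 6 are made up.

```python
import math
import statistics

def one_sample_t(sample, pop_mean):
    """Return (t, df) using the estimated standard error s / sqrt(n)."""
    n = len(sample)
    estimated_se = statistics.stdev(sample) / math.sqrt(n)   # s uses n - 1
    t = (statistics.mean(sample) - pop_mean) / estimated_se
    return t, n - 1

# Hypothetical sample compared against a known population mean of 6.
t, df = one_sample_t([5, 7, 8, 9, 11], pop_mean=6)
print(t, df)
# With df = 4 and alpha = 0.05 (two tailed), the t-table critical value is 2.776,
# so an observed t of 2.0 would retain the null.
```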
—  Compares the mean scores of two different samples of subjects
—  Two independent samples from two populations
—  Difference b/t the two means is the EFFECT
Independent samples t-test
—  The most accurate estimate of population variance, based on combining the two samples' sums of squares and their df
—  You can use this if both groups have similar σ/variances
Pooled variance estimate
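A sketch of the pooled variance estimate and the independent samples t-test that uses it. It assumes pooled variance = (SS1 + SS2) / (df1 + df2) and a standard error of sqrt(pooled / n1 + pooled / n2); the two groups are made up.

```python
import math

def sum_of_squares(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def pooled_variance(group1, group2):
    """Combine both sums of squares and both df into one variance estimate."""
    df1, df2 = len(group1) - 1, len(group2) - 1
    return (sum_of_squares(group1) + sum_of_squares(group2)) / (df1 + df2)

def independent_t(group1, group2):
    """Return (t, df) for two independent samples, assuming similar variances."""
    n1, n2 = len(group1), len(group2)
    effect = sum(group1) / n1 - sum(group2) / n2   # difference b/t the two means
    sp2 = pooled_variance(group1, group2)
    standard_error = math.sqrt(sp2 / n1 + sp2 / n2)
    return effect / standard_error, n1 + n2 - 2

# Hypothetical scores for two independent groups.
t, df = independent_t([4, 6, 8], [1, 3, 5])
print(t, df)
```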
—  One sample measured twice
—  A pair of scores
—  Used when:
1.  You are measuring the same subjects on a dependent variable at two different times
2.  You have two separate groups of subjects that have been matched based on some characteristic
Dependent samples t-test
—  Calculate the difference score for each pair of scores
—  Find the mean of all difference scores
—  df = n - 1 (n = number of pairs)
—  Find the sum of squares of the difference scores (using the mean of all difference scores)
Dependent samples t-test
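The steps above can be sketched as follows, assuming the null hypothesis is a mean difference of zero and t = mean difference / (s of the differences / sqrt(n)). The before and after scores are hypothetical.

```python
import math

def dependent_t(first, second):
    """Return (t, df) for paired scores; n is the number of pairs."""
    diffs = [b - a for a, b in zip(first, second)]    # difference score per pair
    n = len(diffs)
    mean_diff = sum(diffs) / n
    ss_diff = sum((d - mean_diff) ** 2 for d in diffs)
    sd_diff = math.sqrt(ss_diff / (n - 1))
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1

# Hypothetical scores for four subjects measured before and after training.
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]
t, df = dependent_t(before, after)
print(t, df)
```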
—  How many separate sample groups?
◦  One: how many scores for each subject?
–  One: is σ known? Yes: z-test. No: single sample t-test
–  Two: dependent samples t-test
◦  Two: related (matched) or independent groups?
–  Matched: dependent samples t-test
–  Independent: independent samples t-test
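The flowchart above can be read as a small decision function. This is only a sketch of that chart; the function and parameter names are made up for illustration.

```python
def choose_test(groups, scores_per_subject=1, sigma_known=False, matched=False):
    """Pick a test by following the flowchart's questions in order."""
    if groups == 1:
        if scores_per_subject == 2:
            return "dependent samples t-test"
        return "z-test" if sigma_known else "single sample t-test"
    if groups == 2:
        return "dependent samples t-test" if matched else "independent samples t-test"
    raise ValueError("the flowchart only covers one or two sample groups")

print(choose_test(groups=1, sigma_known=True))
print(choose_test(groups=1, scores_per_subject=2))
print(choose_test(groups=2, matched=False))
```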
—  A study was conducted to examine differences b/t older and younger adults on perceived life satisfaction.
◦  INDEPENDENT
—  Each basketball player was asked to shoot 20 consecutive free throws and the number of successful attempts was recorded. The players were then trained to use a special technique and asked to shoot another round of 20 free throws (recorded).
◦  DEPENDENT
—  Average weight loss for someone on a diet is 15 pounds, with an SD of 4 pounds. Is the sample taken representative of this population?
◦  Z-TEST
Which test to use?
—  Used for comparing two or more means
—  Want to test the difference b/t three or more means
ANOVA
—  Grand mean = sum of all data values divided by the total sample size
—  Mean square = estimate of variance between or within groups
ANOVA
ANOVA
—  Post-hoc tests
◦  Pair-wise comparisons after a significant F value is obtained
◦  Used to find out which means are actually different when there are more than two groups
—  How big is the effect?
◦  Proportion of variance that is explained by group differences
ANOVA
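A sketch of a one-way ANOVA built from the pieces named in these slides: grand mean, mean squares, the F value, and the proportion of variance explained. It assumes F = MS between / MS within and an effect size of SS between / SS total (often called eta squared). The three groups are made up.

```python
def one_way_anova(groups):
    """Return F, both df values, and the proportion of variance explained."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between   # variance estimate between groups
    ms_within = ss_within / df_within      # variance estimate within groups

    return {
        "F": ms_between / ms_within,
        "df_between": df_between,
        "df_within": df_within,
        "eta_squared": ss_between / (ss_between + ss_within),
    }

# Hypothetical scores for three groups.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```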
—  Qualitative data (nominal scale)
—  Observations must be independent
—  Sample size must be large enough
—  Two types:
◦  Goodness of fit
–  1 variable
–  H0: frequencies are given by chance
–  H1: frequencies are not given by chance
◦  Test of independence
–  2 variables
–  H0: no association b/t the two variables
–  H1: there is an association b/t the two variables
Chi-square
1.  Calculate the expected frequencies
2.  Compute chi-squared
3.  Compare to critical value
—  df = (number of categories) – 1
—  Expected value = total sample size (n) / number of categories (c), when H0 says all categories are equally likely
•  If X^2 observed > X^2 critical, then you reject H0
•  If X^2 observed < X^2 critical, then you retain H0
Goodness of fit
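A sketch of the goodness of fit steps, assuming equal expected frequencies (n / c) and the usual statistic: the sum of (observed - expected)^2 / expected. The observed counts are made up, and the critical value 5.991 is the table value for df = 2 at alpha = 0.05.

```python
def chi_square_goodness_of_fit(observed):
    """Return (chi-squared, df), assuming every category is equally likely under H0."""
    n = sum(observed)
    c = len(observed)
    expected = n / c                       # same expected frequency in each category
    chi_sq = sum((o - expected) ** 2 / expected for o in observed)
    return chi_sq, c - 1

# Hypothetical observed frequencies for three categories.
chi_sq, df = chi_square_goodness_of_fit([30, 20, 10])
critical = 5.991                           # from a chi-square table: df = 2, alpha = 0.05
reject_h0 = chi_sq > critical
print(chi_sq, df, reject_h0)
```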
1.  Calculate row total and column total
2.  Calculate the expected frequency of each cell
3.  Compute X^2
4.  Compare to critical value
—  df = (# of rows – 1) x (# of columns – 1)
—  Are the two variables related or independent?
Test of independence
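A sketch of the test of independence steps. It assumes the usual expected frequency for each cell, (row total x column total) / grand total. The 2 x 2 table is made up, and 3.841 is the table critical value for df = 1 at alpha = 0.05.

```python
def chi_square_independence(table):
    """Return (chi-squared, df) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df

# Hypothetical 2 x 2 table of observed frequencies.
chi_sq, df = chi_square_independence([[30, 10], [20, 40]])
critical = 3.841                           # from a chi-square table: df = 1, alpha = 0.05
related = chi_sq > critical                # True means reject H0 (variables are associated)
print(chi_sq, df, related)
```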
—  Two variables: what is the primary interest?
◦  Linear regression
◦  Degree of relationship: Pearson correlation coefficient
—  Scale of measurement of the variables: nominal/categorical data
◦  One variable: Chi square goodness of fit
◦  Two variables: Chi square test of independence