PSY 3010 Study Guide 1

PSY 3010
•
ZW
How are the mean and standard deviation affected by linear transformations? How do we know if the
transformation is linear or non-linear? Why would we want to perform linear or non-linear
transformations?
-In a linear transformation [applied uniformly to every score, so the shape of the distribution does not (dn) change]:
the mean always changes by the constant (add/sub or mul/div);
the std deviation dn change under add/sub [SSx dn change, N dn change], but changes by the absolute value of the
constant under mul/div [SSx changes by the square of the constant].
Std dev = (SSx/N)^(1/2)
-We can identify the transformation by examining the shape of the distribution. If linear, the original shape is
retained because every score changes uniformly. We can also look at how the mean & std dev changed.
-Non-linear transformations are used in advanced statistics when raw scores are severely skewed. Both types are
used for comparison and relativity (they provide context).
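A quick check of the linear-transformation rules above, as a minimal sketch. The raw scores and the constants (+10, ×3) are made up for illustration; `pstdev` is used because it divides by N, matching Std dev = (SSx/N)^(1/2).

```python
from statistics import mean, pstdev

# Made-up raw scores, for illustration only.
scores = [2, 4, 4, 6, 9]

def sum_of_squares(xs):
    """SSx: sum of squared deviations from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

shifted = [x + 10 for x in scores]   # add a constant
scaled = [x * 3 for x in scores]     # multiply by a constant

# Adding a constant: mean shifts by the constant, std dev and SSx unchanged.
print(mean(scores), mean(shifted))       # mean moves by +10
print(pstdev(scores), pstdev(shifted))   # same std dev

# Multiplying by a constant: mean and std dev scale by the constant,
# SSx scales by the constant squared (3 ** 2 = 9).
print(mean(scaled), pstdev(scaled), sum_of_squares(scaled))
```

Note that SSx grows by the square of the multiplier, which is why the std dev (a square root of SSx/N) grows by the multiplier itself.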
•
Z-scores – what are they, what is their purpose, what is the mean and standard deviation of a z-score
distribution, how do we compute a z-score, how can we translate a z-score into a percentile ranking, how
can we determine the proportion of people who score between two z-scores, how can we transform a z-score into a T-score?
-Z-score: uses the std dev as the unit of measurement, to see how far a score is from the mean (a measure of
relative, not absolute, performance).
Sample: $Z = \dfrac{X - \bar{X}}{S_X}$
Pop’n: $Z = \dfrac{X - \mu_X}{\sigma_X}$
Provides meaning to an indiv. raw score
Pos Z: raw > mean; Neg Z: raw < mean
-Mean Z score = 0, Variance & Std Dev = 1
- Converting observed scores (X scores) to Z scores dn change the shape of the distribution:
− If the distribution of X scores is NOT normal, distribution of Z scores will NOT be normal
− If the distribution of X scores is normal, distribution of Z scores will be normal
-The Z-score distribution that is normally distributed is called the Standard Normal Distribution.
-When the distribution is normal, we can translate a Z score into a percentile rank by using the statistical
table & looking up the area that corresponds to the Z score obtained. The proportion of people scoring between
two Z scores is the difference between the two areas.
-Transform to T-score (standard score that’s always a pos. whole #):
$T = \bar{T} + s_T \cdot Z$ [$\bar{T}$ (mean) & $s_T$ (std dev) are set by the researcher]
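The z-score steps above can be sketched in a few lines. This is an illustration under assumptions, not the course's method: the raw scores are made up, `NormalDist().cdf` from the standard library stands in for the printed statistical table, and the T-score mean of 50 and std dev of 10 are one common choice (the notes only say the researcher sets them).

```python
from statistics import NormalDist, mean, pstdev

def z_scores(xs):
    """Convert raw scores to z-scores using the mean and the N-based std dev."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

def percentile_rank(z):
    """Percent of a normal distribution falling below z (replaces the table lookup)."""
    return 100 * NormalDist().cdf(z)

def proportion_between(z_low, z_high):
    """Proportion of a normal distribution between two z-scores."""
    std_normal = NormalDist()
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

def t_score(z, t_mean=50, t_sd=10):
    """T = T-mean + T-sd * Z, rounded to a whole number (50 and 10 are assumed defaults)."""
    return round(t_mean + t_sd * z)

scores = [2, 4, 4, 6, 9]          # made-up raw scores
zs = z_scores(scores)
print(mean(zs), pstdev(zs))       # approximately 0 and 1
print(percentile_rank(1.0))       # percentile rank of z = +1
print(proportion_between(-1, 1))  # area within 1 std dev of the mean
print(t_score(1.5), t_score(-2))  # T-scores for z = 1.5 and z = -2
```

The percentile and proportion functions are only meaningful when the raw scores are normally distributed, as the notes point out.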
•
In a standard normal distribution, what is the mean and standard deviation? What percentage of people
or proportion of scores fall within 1 sx, within 2 sx, within 3 sx?
-Mean = 0, Std Dev = 1, Skew = 0 (Symmetric), Kurtosis = 0 (Mesokurtic)
-Area under curve:
-1 to +1 standard deviations: 68.26%
-2 to +2 standard deviations: 95.44%
-3 to +3 standard deviations: 99.74%
-Area in each band, on either side of the mean:
0 to 1 standard deviation: 34.13%
1 to 2 standard deviations: 13.59%
2 to 3 standard deviations: 2.15%
•
What can the standard deviation tell you about precision?
In a normally distributed pop’n, the mean is the best single predictor. Yet the dispersion determines how precise
that predictor is: the more compactly scores cluster around the mean (smaller std dev), the smaller the error in prediction.
•
Why would you want to compute a T-score?
Allows direct comparison while eliminating decimals & neg values; each score is evaluated relative to its own grp.
•
What are the different types of correlation coefficients and with what type of data should each be used?
See p167 (ch8)
-All have a correlation range of r = -1 to +1; none by itself establishes a causal link btw variables
-Nominal level: ① rphi: dichotomous vars ① rb ① rt
-Ordinal: ① rs (Spearman r): both measures must be at least ordinal, expressed as ranks prior to calculation
① Kendall’s tau: ranked data
-Interval or ratio: ① Pearson r: both scales interval or ratio
① Multiple R
•
What values can the Pearson product moment correlation coefficient assume?
Ranges from -1 to +1. Represents the extent to which events occupy the same relative position in 2 distributions;
dn show causation. Linear correlations are directionless (rxy = ryx). It assumes the relationship is linear: if the
relationship departs from linearity, then r ↓ (approaches 0).
•
What 3 factors can be used to interpret a correlation coefficient and what do the factors mean?
-Correlation coef (linear, rxy): used to estimate the population linear correlation coefficient (ρXY).
$$r_{XY} = \frac{S_{XY}}{S_X S_Y} = \frac{SS_{XY}}{\sqrt{(SS_X)(SS_Y)}} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{N}\right]}}$$
-Set up table with columns: X | X^2 | Y | Y^2 | XY
3 interpretations:
Sign: Positive correlation coefficient = direct relationship
Negative correlation coefficient = inverse relationship
Correlation coef: ranges from -1 to +1
Magnitude: determined by the absolute value of the coefficient
Strong linear relationship = close to 1 (or –1)
Weak linear relationship = close to 0
No linear relationship = 0
Note: rxy = 0 does not mean no relationship at all
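The computational formula for rxy can be followed column by column, as in the table setup (X, X², Y, Y², XY). The sketch below is illustrative only: the data are made up and `pearson_r` is a hypothetical helper name, not something from the course materials.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the column sums of the X, X^2, Y, Y^2, XY table."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    ss_xy = sum_xy - sum_x * sum_y / n   # SSxy
    ss_x = sum_x2 - sum_x ** 2 / n       # SSx
    ss_y = sum_y2 - sum_y ** 2 / n       # SSy
    return ss_xy / sqrt(ss_x * ss_y)

# Made-up scores, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(pearson_r(x, y))   # positive sign: direct relationship
print(pearson_r(y, x))   # same value: rxy = ryx

# A perfectly curved (non-linear) relationship can still give r = 0.
u = [-2, -1, 0, 1, 2]
v = [value ** 2 for value in u]
print(pearson_r(u, v))
```

The last call illustrates the note above: v is completely determined by u, yet the linear correlation is zero.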
•
What is the purpose of computing a correlation coefficient?
Determine the linear relationship (direction & strength) btw 2 vars
•
What are the assumptions of the Pearson correlation coefficient? What is the effect on rxy if we violate
an assumption?
Linearity is the most important assumption (if violated: low r, values cancel out). 2 vars. It is not necessary that
r be calculated only btw normally distributed vars: so long as the distributions are unimodal & relatively
symmetrical, Pearson r may legitimately be computed. Other violations:
Truncated range: restricted range of values for the var(s). Gives a low r
- Happens when a homogeneous grp is selected; the remedy is in the design of the study (diverse sample)
Outliers: an extreme score throws the whole thing off; corr. is based on the mean, which is sensitive to outliers.
Can result in an inflated estimate of the correlation.
Multiple pop’n: subgrps
* Sample size & strength are indep.; dn assume a small sample will give a small r
Correlation dn mean causation
•
Know how to play with the formulae for determining the Pearson r and the components of the regression
equation.
-Correlation describes the relation (plot dots); regression allows prediction (draw the line of best fit, called the
Least-Squares Regression Line because it is the line that minimizes the squared error in prediction) & tests the accuracy.
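$Y' = a_Y + b_Y X$
Y’ = Predicted Y
by = Slope: change in Y per unit change in X
ay = Intercept: intersection of regression line and Y axis
$$b_Y = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sum X^2 - \frac{(\sum X)^2}{N}} = \frac{S_{XY}}{S_X^2} = \frac{SS_{XY}/N}{SS_X/N} = \frac{SS_{XY}}{SS_X}$$
Need to get the mean of x & y to compute the intercept: $a_Y = \bar{Y} - b_Y \bar{X}$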
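The slope and intercept formulas can be tried out on a small data set. This is a minimal sketch with made-up scores; `fit_line` and `predict` are hypothetical helper names, and the intercept uses the standard least-squares identity ay = mean(Y) − by · mean(X).

```python
def fit_line(xs, ys):
    """Return (a_y, b_y) for the least-squares line Y' = a_y + b_y * X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    b_y = ss_xy / ss_x             # slope: SSxy / SSx
    a_y = mean_y - b_y * mean_x    # intercept: needs the means of X and Y
    return a_y, b_y

def predict(a_y, b_y, x):
    """Predicted Y (Y') for a given X."""
    return a_y + b_y * x

# Made-up scores, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a_y, b_y = fit_line(x, y)
print(a_y, b_y)               # intercept and slope
print(predict(a_y, b_y, 6))   # predicted Y for X = 6
```

For these numbers SSxy = 6 and SSx = 10, so the slope is 0.6, and with means of 3 and 4 the intercept is 2.2.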
•
What are the goals of regression?
-Obtain an equation for a line that can be used to predict scores on one variable (criterion) based on scores
from another variable (predictor).
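o Coefficient of determination
$$r^2_{XY} = \frac{\text{explained variation in } Y}{\text{total variation in } Y} = \frac{SS_{reg}}{SS_{tot}} = \frac{\sum (Y' - \bar{Y})^2}{\sum (Y - \bar{Y})^2} = \frac{24.57}{40.10} = .61$$
o Coefficient of nondetermination
$$1 - r^2_{XY} = \frac{\text{unexplained variation in } Y}{\text{total variation in } Y} = \frac{SS_{error}}{SS_{tot}} = \frac{\sum (Y - Y')^2}{\sum (Y - \bar{Y})^2} = \frac{15.46}{40.10} = .39$$
o Standard error of the estimate
$$S_{Y.X} = S_Y \sqrt{1 - r^2_{XY}} = \sqrt{\frac{\sum (Y - Y')^2}{N}} = \sqrt{1.546} = 1.24$$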
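The worked numbers in this section (24.57, 15.46, 40.10) can be re-checked with a few lines of arithmetic. One assumption is labeled in the code: N = 10 is inferred from 15.46 / N = 1.546 and is not stated outright in the notes.

```python
from math import sqrt

# Sums of squares taken from the worked example in the notes.
ss_reg = 24.57     # explained variation in Y
ss_error = 15.46   # unexplained variation in Y
ss_tot = 40.10     # total variation in Y
n = 10             # assumption: inferred from 15.46 / N = 1.546

r_squared = ss_reg / ss_tot               # coefficient of determination
nondetermination = ss_error / ss_tot      # coefficient of nondetermination
std_error_estimate = sqrt(ss_error / n)   # standard error of the estimate

print(round(r_squared, 2))
print(round(nondetermination, 2))
print(round(std_error_estimate, 2))
```

The source figures are rounded, so SSreg + SSerror (40.03) does not exactly equal SStot (40.10), and the two coefficients computed this way sum to slightly less than 1.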