PSYC60 Review

Descriptive vs. Inferential
• Descriptive statistics
  ◦ Summarization and organization of data
  ◦ Numbers, charts, tables, graphs, etc.
• Inferential statistics
  ◦ Use observations (data) to predict something else
  ◦ Reach conclusions that extend beyond the immediate data alone
• Population = all UCSD students
• Sample = UCSD students who are art majors

Basic concepts
• Confounding variable
  ◦ Extraneous variable that has an effect on the dependent variable
• Random sampling
  ◦ Choose the entire group of participants in your sample randomly from a given population
• Random assignment
  ◦ Randomly assign those participants to either the control or the experimental group

Three types of data
• Qualitative
  ◦ Nominal measurement
  ◦ Different categories/kinds
• Ranked
  ◦ Ordinal measurement (order of rank)
  ◦ Different ranking/standing within a group
• Quantitative
  ◦ Interval/ratio measurement (only ratio scales have a true zero)
  ◦ Different amounts

Measure of Central Tendency
• Mean / Median / Mode
• Population mean: µ (mu)
• Sample mean: x̄ (x-bar)
• Outlier: a very extreme score

Skew
• In a skewed distribution, the mean is pulled toward the tail
• Positively skewed distribution
• Negatively skewed distribution

Variability
• Range
  ◦ Difference between the smallest and largest values in a data set
• Variance
  ◦ Mean of all squared deviation scores
  ◦ Population variance vs. sample variance
• Standard deviation
  ◦ How much variation exists from the mean or expected value
  ◦ Population vs. sample

Z scores
• Unit free
• Standardized score that indicates how many standard deviations a score is above or below the mean of its distribution
  ◦ A + or - sign: above (+) or below (-) the mean
  ◦ A number
• The formula changes a raw score into a standardized z-score: z = (X - µ) / σ
• 3-step process:
  1. Calculate the z-score using the formula
  2. Draw a visual depiction
  3. Look at the table (percentage)
• Even if you get a negative z-score, look up the same value as a positive number in the table
• Finding the raw score is the same process, reversed
• Find the area beyond z

z-test or t-test?
• Use the z-test when the population standard deviation is known
• Use the t-test when the population standard deviation is unknown

Correlation
• Measures the strength of the relationship between two variables
• Positive correlation
  ◦ The 2 variables move in the same direction
  ◦ Lower left to upper right
• Negative correlation
  ◦ The 2 variables move in opposite directions
  ◦ Upper left to lower right
• No correlation: irregular pattern

Perfect correlation
• Perfect linear relationship

Correlation coefficient
• Describes the strength and direction of the relationship between the two variables
• Pearson's r
  ◦ A number between -1 and +1
  ◦ Sign indicates direction
  ◦ Number indicates strength
  ◦ 0 indicates no relationship
• 0.1 = weak, 0.3 = moderate, 0.5 and above = strong

Pearson's correlation
• Correlation does NOT imply causation
• One variable may predict the other, but it does not cause it to happen
• Sum of Squares (SS)
  ◦ Calculate the difference between each score and the mean (the difference score)
  ◦ Square each difference score
  ◦ Add up the squared difference scores
• Sum of Products (SPxy)
  ◦ Calculate the difference scores for x and for y
  ◦ MULTIPLY each pair of x and y difference scores together
  ◦ Add everything up
• r = SPxy / √(SSx · SSy)

Regression
• A way of predicting values of one variable from another
• Fit a line through the data points to find the best line that predicts Y from X
• Least squares regression
• "y-hat" (ŷ) = the predicted value of y, the dependent variable
• X = the independent variable
• b = slope
• a = y-intercept

Least Squares Regression
• SS = sum of squares
• r = Pearson's r

Standard error of estimate
• A rough measure of the average amount by which known Y values deviate from their predicted Y values
• As the strength of r increases, this decreases
• Written "s sub y, given x"
• "...give or take _____ units"

Coefficient of determination
• "Goodness of fit" of a regression
• Overall measure of the accuracy of the regression
• The higher the coefficient of determination, the larger the proportion of variance in the dependent variable that is explained by the independent variable
• Example: what proportion of the variance in personal happiness can be explained by the number of candy bars eaten?

Sampling distribution of the mean
• Mean of all sample means
  ◦ Always equal to the value of the population mean
• Standard error of the mean
  ◦ Rough measure of the average amount by which sample means deviate from the mean of the sampling distribution
  ◦ Decreases with larger sample sizes
  ◦ The larger the sample size, the more precise the statistic

Hypothesis Test
• Steps:
  1. Define your question
  2. Identify the hypotheses
  3. Specify the decision rule
  4. Calculate the observed z-value
  5. Make a decision and interpret
• Null hypothesis: no real effect; nothing special is happening
• Alternative hypothesis: there is an effect
  ◦ Null vs. alternative
  ◦ One tailed vs. two tailed
  ◦ Critical value
• Example: In a quality control situation, the mean weight of objects produced is supposed to be 16 ounces with a standard deviation of 0.4 ounces. A random sample of 70 objects yields a mean weight of 15.8 ounces. Is it reasonable to assume that the production standards are being maintained?
  ◦ H0: The production standards are being maintained (µ = 16)
  ◦ H1: The production standards are not being maintained (µ ≠ 16)

Decision
• Always assume an alpha level of 0.05 if not specified
• If the absolute value of the calculated z-value > the critical z, reject the null (reject H0)
• If the absolute value of the calculated z-value < the critical z, retain the null (fail to reject H0)
• The same goes for t-tests

One tailed vs. Two tailed
• Two tailed
  ◦ Does the amount of candy that statistics students eat differ from that of other students here?
• One tailed
  ◦ Do statistics students eat more candy than other students here? (upper)
  ◦ Do statistics students eat less candy than other students here? (lower)

Confidence Interval

The one sample t-test
• Compares a single sample to a population
  ◦ Known population mean
  ◦ Unknown population standard deviation
  ◦ Uses the estimated standard error

Independent samples t-test
• Compares the mean scores of two different samples of subjects
• Two independent samples from two populations
• The difference between the two means is the EFFECT

Pooled variance estimate
• The most accurate estimate of the population variance, based on the combination of the two sample sums of squares and their df
• You can use this if both groups have similar variances (σ²)

Dependent samples t-test
• One sample measured twice
• A pair of scores
• Used when:
  1. You are measuring the same subjects on a dependent variable at two different times
  2. You have two separate groups of subjects that have been matched on some characteristic
• Calculate the difference score for each pair of scores
• Find the mean of all difference scores
• df = n - 1
• Find the sum of squares of the difference scores (using the mean of all difference scores)

Which test to use?
• How many separate sample groups?
  ◦ One: how many scores for each subject?
    - One: is σ known?
      Yes: z-test
      No: single sample t-test
    - Two: dependent samples t-test
  ◦ Two: related (matched) or independent groups?
    - Matched: dependent samples t-test
    - Independent: independent samples t-test
• A study was conducted to examine differences between older and younger adults on perceived life satisfaction.
  ◦ INDEPENDENT
• Each basketball player was asked to shoot 20 consecutive free throws, and the number of successful attempts was recorded. The players were then trained to use a special technique and asked to shoot another round of 20 free throws (also recorded).
  ◦ DEPENDENT
• The average weight loss for someone on a diet is 15 pounds, with an SD of 4 pounds. Is the sample taken representative of this population?
  ◦ Z-TEST
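The quality-control example from the Hypothesis Test slides can be worked through numerically. The sketch below assumes a two-tailed z-test at alpha = 0.05 (the default the review says to use when none is given) and uses only the numbers stated in the example; the variable names are illustrative, and the critical value is computed with Python's standard-library `statistics.NormalDist` rather than read from a printed table.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the quality-control example in the review
mu_0 = 16.0          # hypothesized population mean (ounces)
sigma = 0.4          # known population standard deviation (ounces)
n = 70               # sample size
sample_mean = 15.8   # observed sample mean (ounces)
alpha = 0.05         # assumed alpha level (two-tailed)

# Standard error of the mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# Observed z: how many standard errors the sample mean is from mu_0
z_observed = (sample_mean - mu_0) / standard_error

# Two-tailed critical value: alpha / 2 of the area in each tail
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Decision rule: reject H0 when |z observed| exceeds z critical
reject_null = abs(z_observed) > z_critical

print(f"standard error = {standard_error:.4f}")
print(f"z observed     = {z_observed:.2f}")
print(f"z critical     = +/-{z_critical:.2f}")
print("Decision:", "reject H0" if reject_null else "retain H0")
```

Under these assumptions the observed z is about -4.18, which lies well beyond the critical value of about ±1.96, so H0 is rejected: the sample does not support the claim that the production standards are being maintained.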
ANOVA
• Used for comparing two or more means
• Typically used to test the difference between three or more means
• Grand mean = sum of all data values divided by the total sample size
• Mean square = estimate of variance between or within groups
• Post-hoc tests
  ◦ Pair-wise comparisons after a significant F value is obtained
  ◦ Used to find out which means are actually different when there are more than two groups
• How big is the effect?
  ◦ Proportion of variance that is explained by group differences

Chi-square
• Qualitative data (nominal scale)
• Observations must be independent
• Sample size must be large enough
• Two types:
  ◦ Goodness of fit (1 variable)
    - H0: frequencies are given by chance
    - H1: frequencies are not given by chance
  ◦ Test of independence (2 variables)
    - H0: no association between the two variables
    - H1: there is an association between the two variables

Goodness of fit
  1. Calculate the expected frequencies
  2. Compute chi-squared
  3. Compare to the critical value
• df = (number of categories) - 1
• Expected value = total sample size (n) / number of categories (c)
• If χ² observed > χ² critical, reject H0
• If χ² observed < χ² critical, retain H0

Test of independence
• Are the two variables related or independent?
  1. Calculate the row totals and column totals
  2. Calculate the expected frequency of each cell
  3. Compute χ²
  4. Compare to the critical value
• df = (# of rows - 1) × (# of columns - 1)

Summary chart: scale of measurement of the variables
• Nominal/categorical data: number of variables?
  ◦ One: chi-square goodness of fit
  ◦ Two: chi-square test of independence
• Two variables: primary interest?
  ◦ Degree of relationship: Pearson correlation coefficient
  ◦ Predicting one variable from the other: linear regression
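The goodness-of-fit steps above can be sketched in a few lines. The observed counts here are hypothetical (they are not from the review), H0 is assumed to be equal frequencies across categories so that the expected value is n / c, and the critical value 7.815 is the standard chi-square table entry for df = 3 at alpha = 0.05.

```python
# Hypothetical observed counts for four categories (illustrative only)
observed = [30, 20, 25, 25]

n = sum(observed)   # total sample size
c = len(observed)   # number of categories

# Step 1: expected frequency per category under H0 (equal frequencies)
expected = n / c

# Step 2: chi-square = sum over categories of (observed - expected)^2 / expected
chi_square = sum((o - expected) ** 2 / expected for o in observed)

# Step 3: compare to the critical value, with df = (number of categories) - 1
df = c - 1
chi_critical = 7.815  # chi-square table value for df = 3, alpha = 0.05

decision = "reject H0" if chi_square > chi_critical else "retain H0"

print(f"expected per category = {expected}")
print(f"chi-square observed   = {chi_square:.3f} (df = {df})")
print(f"chi-square critical   = {chi_critical}")
print("Decision:", decision)
```

With these made-up counts the observed chi-square is 2.0, below the critical value, so H0 is retained. For a different number of categories the critical value has to be looked up again for the new df.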