Download File - Tera Letzring

Psyc 6627 – Statistics and Research Methods I
Lab 3: Descriptive Statistics and Standard Scores
Use schooldays data in the HSAUR2 package
1) Numerical summaries
a) Statistics, Summaries, Numerical summaries… (select the variable absent)
i) For mean, standard deviation, interquartile range, coefficient of variation (standard
deviation/mean), quantiles, n
ii) Summarize by groups: to get output for levels of a categorical variable
(1) only possible to select one grouping variable
b) Statistics, Summaries, Table of statistics…
i) Summarize by one or more groups, but only for one statistic at a time (mean, median,
standard deviation, interquartile range)
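The coefficient of variation listed above has no dedicated function in base R, so it is usually computed by hand. The following is a minimal sketch, assuming a small invented data frame named toy in place of schooldays; the helper name cv is also hypothetical.

```r
# Hypothetical toy data standing in for schooldays (values are invented)
toy <- data.frame(
  gender = factor(c("female", "female", "female", "male", "male", "male")),
  absent = c(2, 4, 6, 1, 5, 9)
)

# Coefficient of variation: standard deviation divided by the mean
cv <- function(x) sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# Overall coefficient of variation
cv.all <- cv(toy$absent)

# The same statistic for each level of one grouping variable
cv.by.gender <- tapply(toy$absent, toy$gender, cv)
```

Passing cv to tapply mirrors the "Summarize by groups" option: one statistic, split by the levels of one categorical variable.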
2) Script
a) mean(variable.name)
b) median(variable.name)
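c) mode(variable.name) #returns the storage mode (e.g., "numeric"), not the most frequent value; see k) for the statistical mode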
d) standard deviation: sd(variable.name)
e) variance: var(variable.name)
f) range(variable.name)
g) all quantiles: 0, 25, 50, 75, 100: quantile(variable.name)
h) IQR(variable.name) #interquartile range
i) summary(data.set) #min, 1st quartile, median, mean, 3rd quartile, and max for numeric
variables; frequency counts for factor variables
j) item name, item number, n (valid cases), mean, sd, median, trimmed mean, mad (median
absolute deviation), min, max, range, skew, kurtosis, standard error
i) load the psych package: library(psych)
ii) describe(data.set)
k) calculate mode
i) create a frequency table of the unique values (sorted): temp <- table(as.vector(variable.name))
ii) return the value(s) with the highest count in temp (more than one if there is a tie):
names(temp)[temp == max(temp)]
l) describe data by one grouping variable with the psych package
i) describeBy(variable.name, grouping.variable, mat=TRUE)
m) describe data by two grouping variables with the psych package
i) describeBy(variable.name, list(grouping.variable1,grouping.variable2),
mat=TRUE)
n) calculate means for all combinations of levels of 3 factor variables
i) tapply(variable.name, list(output.label1=factor1, output.label2=factor2,
output.label3=factor3), mean, na.rm=TRUE)
o) create a dataset with the means and output to .csv file
i) new.means <- tapply(data.set$variable.name, list(output.label1=factor1,
output.label2=factor2, output.label3=factor3), mean, na.rm=TRUE)
ii) write.csv(new.means, "C:\\folder\\new.means.csv", row.names=TRUE) #the path must end in a file name; new.means.csv is only an example
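The mode calculation, the grouped means, and the .csv export from this section can be sketched together as follows. This is only an illustration under stated assumptions: the data frame toy and its values are invented, two factors are used instead of three to keep the output small, and tempfile() stands in for a real folder path.

```r
# Hypothetical toy data (invented values, not the schooldays data)
toy <- data.frame(
  gender  = factor(c("f", "f", "m", "m", "f", "m", "f", "m")),
  learner = factor(c("slow", "average", "slow", "average",
                     "slow", "average", "average", "slow")),
  absent  = c(3, 5, 3, 7, 3, 5, NA, 10)
)

# Mode: tabulate the values, then keep the value(s) with the highest count
counts <- table(toy$absent)   # table() ignores NA by default
modes  <- as.numeric(names(counts)[counts == max(counts)])

# Means for every combination of the two factors; na.rm = TRUE skips the NA
cell.means <- tapply(toy$absent,
                     list(gender = toy$gender, learner = toy$learner),
                     mean, na.rm = TRUE)

# Write the table of means to a temporary .csv file
out.file <- tempfile(fileext = ".csv")
write.csv(cell.means, out.file, row.names = TRUE)
```

Because names() returns character strings, as.numeric() is used to turn the mode back into a number; drop it when the variable is not numeric.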
3) Calculating z-scores and T scores in RCommander
a) Z-scores: Data → Manage variables in active data set → Standardize variables…
b) T scores: Data → Manage variables in active data set → Compute new variable…
i) Expression to compute: 50 + (10*Z.variable.name)
4) Calculating z-scores and T scores with script
a) Zscores <- scale(absent, center = TRUE, scale = TRUE)
i) center = TRUE subtracts the mean
ii) scale = TRUE divides by standard deviation
b) Calculating T-scores: Tscores <- 50 + (10*Zscores)
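As a quick check of what scale() does, the sketch below compares its result with the z-score formula computed by hand and then converts to T scores. The vector x holds invented values, and z.by.hand is a hypothetical name used only for the comparison.

```r
# Invented values for illustration
x <- c(2, 4, 6, 8, 10)

# z-scores with scale(): subtract the mean, divide by the standard deviation
Zscores <- scale(x, center = TRUE, scale = TRUE)

# The same calculation written out by hand
z.by.hand <- (x - mean(x)) / sd(x)

# T scores: rescale so the mean is 50 and the standard deviation is 10
Tscores <- 50 + (10 * Zscores)

# scale() returns a one-column matrix; as.numeric() gives a plain vector
z.vector <- as.numeric(Zscores)
```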
5) Merging standard scores with original data set
a) use the long form of the variable name (data.set$variable.name) to add the variable to the existing data frame
b) data.set$Zvariable.name <- scale(data.set$variable.name, center = TRUE, scale = TRUE)
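A minimal sketch of the merge step, assuming an invented data frame named toy. Wrapping scale() in as.numeric() is an optional addition not in the original instructions; it stores the new column as a plain numeric vector rather than a one-column matrix.

```r
# Hypothetical data frame (invented values)
toy <- data.frame(id = 1:5, absent = c(2, 4, 6, 8, 10))

# Add z-scores as a new column using the long form data.set$variable.name
toy$Zabsent <- as.numeric(scale(toy$absent, center = TRUE, scale = TRUE))

# Add T scores computed from the new z-score column
toy$Tabsent <- 50 + (10 * toy$Zabsent)
```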