Exploring R and the Group Assignment

Analyzing the dataset ChickWeight: Students will form groups of 4 or 5, and each group will be given a list of 10-15 R functions drawn from the list below, possibly with a few similar functions added. Each group will have a separate set of functions. Groups will explore their functions and share their results at the end.

Data Organization Functions: Try these commands.

str(ChickWeight) [Gives the structure of the dataset ChickWeight]
head(ChickWeight) [Shows the first six observations with all the variables]
summary(ChickWeight) [Summarizes each variable, such as the min, max, and mean of the numeric columns]
ChickWeight$weight [Shows the column weight]
table(ChickWeight$weight) [Tabulates how often each value of the column occurs]
nrow(ChickWeight) [Gives the number of rows in the dataset]
ChickWeight[["Time"]] = NULL [Removes the Time variable from the data; run data(ChickWeight) to restore it before fitting the regression model below, which uses Time]
Diet1 = subset(ChickWeight, Diet == "1") [Subsets the dataset to all the chicks on diet level 1]
Diet1 [Shows the subset thus formed]
Overweight = subset(Diet1, weight >= 90) [Subsets the chicks on diet level 1 that weigh 90 or more; note that 90 is a number, not the string "90"]
Overweight [Displays the subset stored in Overweight]

Data Summarization Functions:

which.min(ChickWeight$weight) [Ans: 196. The minimum is the 196th entry]
ChickWeight$weight[196] [Ans: 35. The 196th entry is 35, the minimum weight]
mean(ChickWeight$weight) [Gives the mean of the weight variable. Ans: 121.8183]
sd(ChickWeight$weight) [Gives the standard deviation of the weight variable. Ans: 71.07196]
tapply(ChickWeight$weight, ChickWeight$Chick, mean) [Gives the mean weight of each chick]
tapply(ChickWeight$weight, ChickWeight$Chick, min) [Gives the minimum weight of each chick]

Drawing Graphs:

plot(ChickWeight$weight, ChickWeight$Diet) [Draws a scatterplot of weight versus Diet]
hist(ChickWeight$weight) [Plots a histogram of the numerical data column weight]
boxplot(ChickWeight$weight ~ ChickWeight$Diet, xlab = "Diet", ylab = "weight", main = "Weight of Chicks by Diet") [Draws one boxplot per diet with the given title and x & y labels]

A Linear Regression Model in R:

LinReg = lm(weight ~ Time + Chick + Diet, data = ChickWeight) [Fits a linear regression model]
summary(LinReg) [Displays a summary of the model]
LinReg$residuals [Displays the residuals]
SSE = sum(LinReg$residuals^2) [Computes the sum of squared errors]
SSE [Displays the SSE]

For the assigned R Programming lab:

A Logistic Regression Model: [A logistic regression is used when the outcome variable is categorical, here with two levels]

Students will build a logistic regression model using the data set "framingham". In this case we use the glm function as follows. [Note: glm stands for generalized linear model. In the commands below, replace "outcome variable" with the name of the outcome column.]

LogLin = glm(outcome variable ~ ., family = binomial, data = framingham) [Note: The . after the ~ symbol includes all the other variables in the data set as independent variables]
summary(LogLin) [Summary of the model]
predLogLin = predict(LogLin, newdata = framingham, type = "response") [This is the prediction function; with type = "response" it returns the predicted probability of the outcome for each row]
SSE = sum((predLogLin - framingham$outcome variable)^2) [Computes SSE; the square goes inside the sum]
SSE [Displays SSE]
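
The regression steps above can be tried end to end with the short sketch below. It is only an illustration under stated assumptions: the framingham data set does not ship with R, so the built-in mtcars data set (with its 0/1 column am as the outcome) stands in for it, and the names cw, lin_reg, log_reg, pred_prob, sse_lin, and sse_log are hypothetical names chosen for this example.

```r
# Reload the built-in data so earlier edits (such as removing Time) do not interfere.
data(ChickWeight)
cw <- as.data.frame(ChickWeight)  # hypothetical working copy as a plain data frame

# Subset with a numeric comparison (90, not the string "90").
diet1 <- subset(cw, Diet == "1")
overweight <- subset(diet1, weight >= 90)

# Linear regression with the handout's formula, then its sum of squared errors.
# Each chick stays on one diet, so R may report NA for the Diet coefficients.
lin_reg <- lm(weight ~ Time + Chick + Diet, data = cw)
sse_lin <- sum(residuals(lin_reg)^2)

# Logistic regression on a stand-in data set (assumption: mtcars replaces framingham).
log_reg <- glm(am ~ wt + hp, family = binomial, data = mtcars)

# type = "response" returns probabilities between 0 and 1 instead of log-odds.
pred_prob <- predict(log_reg, newdata = mtcars, type = "response")

# Square each error first, then sum.
sse_log <- sum((pred_prob - mtcars$am)^2)

print(nrow(overweight))
print(sse_lin)
print(sse_log)
```

With the real lab data, the same pattern should apply after replacing mtcars with framingham and am with the name of the outcome column.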