Introduction to R
A. Di Bucchianico

Types of statistical software
• command-line software
  – requires knowledge of syntax of commands
  – reproducible results through scripts
  – detailed analyses possible
• GUI-based software
  – does not require knowledge of commands
  – actions are not reproducible
• hybrid types (both command-line and GUI)

Well-known statistical software
• SAS
• SPSS
• Minitab
• Statgraphics
• S-Plus
• R
• …

R
• free
• language almost the same as S
• maintained by top quality experts
• available on all platforms
• continuous improvement
Available through www.r-project.org

Contents
• Basic operations
• Data creation + I/O
• Component extraction
• Plots
• Basic statistics
• Libraries
• Regression analysis
• Survival analysis

Basic operations
• assignment operation: a <- 2+sqrt(5)
• help function:
  – help(pnorm)
  – help.search("normal distribution")
• probability functions:
  – d (density): dgamma(x,n,)
  – p (probability=cdf): pweibull(x,3,2)
  – q (quantile): qnorm(0.95)
  – r (random numbers): rexp(10,)

Data creation + I/O
• create
  – vectors: c(1,2,3)
  – matrices: matrix(c(1,2,3,4,5,6),2,3,byrow=TRUE) (2=#rows)
  – list
• patterns:
  – ":" (1,2,3) = 1:3
  – seq (1,2,3) = seq(1,3,by=1)
• working directories and files:
  – setwd
  – getwd
  – attach
• read data
  – from file: read.table("file.txt",header=TRUE)
  – from web: read.table also accepts a URL in place of a file name

Component extraction
• d[r,]: rth row of object d
• d[,c]: cth column of object d
• d[r,c]: entry in row r and column c of object d
• length(d): length of d
• d[d<20]: extract all elements of d that are smaller than 20
• d["age"]: extract column "age" from object d

Plots
• plot: both 1D and 2D plots
• hist: histogram
• qqnorm: normal probability plot ("quantile-quantile" plot)
Save graphics by choosing File -> Save as

Basic statistics
• summary
• mean
• sd (standard deviation; called stdev in S-Plus)
• t.test
• boxplot
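The commands listed on the slides so far can be tried out in a short script. The following is only a sketch: the object names (a, x, v, m, d), the toy data and the distribution parameters are assumptions chosen for illustration and do not come from the slides.

```r
# Illustrative sketch; all names and numbers below are made up for the example.
if (!interactive()) pdf(NULL)   # when run as a script, discard plots instead of needing a display

# --- Basic operations ---
a <- 2 + sqrt(5)                # assignment
# help(pnorm)                   # opens the help page (uncomment in an interactive session)

dgamma(1, shape = 2, rate = 3)  # d: density of a gamma distribution at 1
pweibull(1, 3, 2)               # p: cdf of a Weibull(shape 3, scale 2) at 1
qnorm(0.95)                     # q: 0.95 quantile of the standard normal
set.seed(1)                     # fix the seed so the random numbers are reproducible
x <- rexp(10, rate = 2)         # r: 10 exponential random numbers

# --- Data creation ---
v <- c(1, 2, 3)                                    # vector
m <- matrix(c(1, 2, 3, 4, 5, 6), 2, 3, byrow = TRUE)  # 2 rows, 3 columns, filled row by row
same_pattern <- all(1:3 == seq(1, 3, by = 1))      # ":" and seq give the same values

# A small data frame standing in for data read with read.table
d <- data.frame(age = c(12, 25, 18, 40), weight = c(40, 70, 65, 80))

# --- Component extraction ---
first_row  <- m[1, ]            # first row of m
second_col <- m[, 2]            # second column of m
entry      <- m[2, 3]           # entry in row 2, column 3
young      <- d$age[d$age < 20] # ages smaller than 20
age_column <- d["age"]          # column "age", returned as a one-column data frame

# --- Plots ---
hist(x)                         # histogram
qqnorm(x)                       # normal probability plot

# --- Basic statistics ---
summary(d$age)
mean_age <- mean(d$age)
sd_age   <- sd(d$age)
t.test(d$age, mu = 20)          # one-sample t-test of H0: mean age = 20
boxplot(d$age)
```

Running the script line by line in an interactive session shows each result directly; the pdf(NULL) line only takes effect when the file is run non-interactively.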
Packages
• specialized functions available through packages (libraries)
• in Windows interface choose Packages -> Load Packages
• examples of packages:
  – qcc (quality control)
  – survival

Functions
Analyses that have to be performed often can be put in the form of functions.
Example:
  simple <- function(data,mean=0,alpha=0.05)
  {hist(data); t.test(data,conf.level=1-alpha,mu=mean,alternative="two.sided")}
simple(data,4) uses the default value alpha=0.05 and tests the null hypothesis mu=4.

Regression analysis
• general command: lm (linear model)
• requires data to be available in the form of a data frame
  – more general than a matrix because columns need not have the same type (they must still have the same length)
  – use command data.frame for conversion
• other types of regression also possible (see also dedicated packages)

Survival analysis
• through function Surv of package survival
• Cox proportional hazards: coxph

Useful web sites
• www.r-project.org
• http://cran.r-project.org/doc/contrib/Short-refcard.pdf
• http://www.unimuenster.de/ZIV/Mitarbeiter/BennoSueselbeck/shtml/shelp.html
• http://www.maths.lth.se/help/R/
• http://www.mas.ncl.ac.uk/~ndjw1/teaching/sim/Rintro.html
• http://stats.math.uni-augsburg.de/JGR/
• http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/index.html
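As a rough illustration of the function, regression and survival slides, the sketch below defines a runnable version of the function simple, fits a linear model with lm, and fits a Cox model with coxph. The simulated sample, the dose/response numbers and the choice of the lung data set (shipped with the survival package) are assumptions made for the example; the confidence level is taken to be 1 - alpha, which is how alpha is usually interpreted.

```r
# Illustrative sketch; data and variable names are assumptions, not from the slides.
if (!interactive()) pdf(NULL)   # when run as a script, discard plots instead of needing a display

# --- A user-defined function: histogram plus two-sided t-test ---
simple <- function(data, mean = 0, alpha = 0.05) {
  hist(data)
  # the last expression is the return value of the function
  t.test(data, conf.level = 1 - alpha, mu = mean, alternative = "two.sided")
}

set.seed(42)                        # reproducible simulated sample
x <- rnorm(30, mean = 4, sd = 1)
res <- simple(x, 4)                 # alpha keeps its default 0.05; tests H0: mu = 4
res$p.value
res$conf.int                        # confidence interval at level 1 - alpha

# --- Regression with lm on a data frame ---
d <- data.frame(dose     = c(1, 2, 3, 4, 5),
                response = c(2.1, 3.9, 6.2, 7.8, 10.1))
fit <- lm(response ~ dose, data = d)
coef(fit)                           # intercept and slope
summary(fit)                        # standard errors, t-values, R-squared

# --- Survival analysis (only if the survival package is installed) ---
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  # Surv() combines follow-up time and event status into a survival object
  cox_fit <- coxph(Surv(time, status) ~ age, data = lung)
  print(summary(cox_fit))
}
```

Passing the arguments by name, for example simple(x, mean = 4, alpha = 0.01), makes the call easier to read than relying on argument order.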