Stochastic Models
Introduction to R

Walt Pohl
Universität Zürich, Department of Business Administration
February 28, 2013

What is R?

R is a freely available, general-purpose statistical package developed by a team of volunteers on the Internet.
It is widely used among statisticians, and new statistical techniques are frequently implemented in R first.
It is less widely used by economists, who tend to prefer commercial statistical packages or Matlab.

Walt Pohl (UZH QBA), Stochastic Models, February 28, 2013

R versus Excel

R has many more probability and statistical functions, either built in or available in free packages.
R is command-driven: you enter a sequence of commands to manipulate your data.
While everything in Excel is expressed in terms of cells, R has several different data types (vectors, arrays, objects), and you can define your own.
Normally you create a ".R" command file that is separate from your data.
Note: Excel also has a separate command language, VBA.

R versus Matlab

The real target audience for Matlab is engineers. Matlab has many features that are useful for engineers but not for us.
The target application for R is statistics. R has many more statistical functions than Matlab.
Matlab started as a package for manipulating matrices and added other features later, so operations that are not matrix-based are awkward.
R was designed for general-purpose programming from the beginning.

R versus Other Statistics Programs

R is free.
R is more command-driven and less GUI-driven.
R is very close to S-Plus.
R supports as broad an array of operations as any other statistics program.
R's programming language is better designed than most of its competitors.
Since different packages are written by different volunteers, R is not as uniform as some other systems.

Important URLs

R home page – http://www.r-project.org/
Closest R mirror site – http://stat.ethz.ch/CRAN/
R tutorial – http://cran.r-project.org/doc/manuals/R-intro.html

Monte Carlo Simulation in R

R has many built-in probability distributions. For each supported distribution XXX, R comes with four functions:

dXXX – density function
pXXX – cumulative distribution function
qXXX – quantile function (inverse of the CDF)
rXXX – random draw

Here XXX = unif, norm, chisq, t, etc.
Example: For the normal distribution, we have dnorm, pnorm, qnorm, rnorm.

Vectors in R

For us, the basic R data type is a vector of numbers. The c command creates vectors.
Example: If you type c(1, 3, 4.5), R returns the vector (1, 3, 4.5).
You can assign vectors to variables using the <- operator:

    x <- c(1, 3, 4.5)

Vectors in R, cont'd

You can get the value of an individual entry by using the [] operator. x[3] will return 4.5.
You can also get subvectors by using ranges. x[1:2] will return the vector (1, 3).
The length function allows you to refer to the end of a vector in a range: x[2:length(x)] will return the vector (3, 4.5).

Operations on Vectors

Where possible, any operation on vectors is applied elementwise.
So if x and y are two vectors of the same length, then z = x * y is the vector with z[i] = x[i] * y[i].
Likewise, log(x) is the vector whose i-th entry is log(x[i]), etc.

Sample Statistics

R has built-in functions for the usual sample statistics:

mean(x) – mean of vector x
var(x) – variance of vector x
sd(x) – standard deviation of vector x
quantile(x, q) – the q-quantile of vector x, where q is a probability between 0 and 1
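
The distribution functions, vectors, and sample statistics above can be combined into a small Monte Carlo experiment. The following is a minimal sketch, not taken from the slides: the seed, the sample size, and the variable names are illustrative assumptions. It draws from the standard normal with rnorm and compares simulated quantities with the exact values from qnorm and pnorm.

```r
# Monte Carlo sketch: estimate properties of the standard normal by simulation.
set.seed(42)                  # arbitrary seed, only for reproducibility
n <- 100000                   # number of draws (illustrative choice)
draws <- rnorm(n)             # rXXX: random draws from N(0, 1)

# Sample statistics of the simulated vector
m   <- mean(draws)            # should be close to 0
s   <- sd(draws)              # should be close to 1
q95 <- quantile(draws, 0.95)  # empirical 95% quantile

# Exact counterparts from the q and p functions
exact_q95 <- qnorm(0.95)      # quantile function
p_check   <- pnorm(exact_q95) # the CDF undoes the quantile function

# Monte Carlo estimate of P(X > 1) versus the exact value.
# draws > 1 is a TRUE/FALSE vector; its mean is the fraction of TRUE entries.
mc_tail    <- mean(draws > 1)
exact_tail <- 1 - pnorm(1)

print(c(mean = m, sd = s))
print(c(simulated = unname(q95), exact = exact_q95))
print(c(simulated = mc_tail, exact = exact_tail))
```

The simulated values should approach the exact ones as n grows; rerunning with a different seed gives slightly different estimates.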

Reading Data

The easiest way to import data into R is through CSV files. Excel can export files in this format.
The function read.csv reads a CSV file.
Example:

    apple <- read.csv("apple.csv")

imports the file named "apple.csv" into the variable apple.
The data is returned in the form of a data frame.

Data frames

A data frame is a named list of vectors. In the case of "apple.csv", we get four entries on the list:

DATE – end date of month
RET – monthly return on Apple stock
VWRETD – monthly return on CRSP value-weighted index
rf – monthly risk-free rate

You access a vector by using $. Example: apple$RET.

Regression

R has a very easy-to-use interface for regression: the lm function.
For example, to fit the CAPM for Apple, we would use

    lm(RET ~ VWRETD, data=apple)

The first argument uses the tilde operator to indicate that we want to regress RET on VWRETD.
The second argument indicates that the data comes from the apple data frame.

Regression, cont'd

Printing the result of lm by itself shows only the coefficients. To get more detail, including t stats, use

    summary(lm(RET ~ VWRETD, data=apple))

Built-In Mathematical Functions

R has various built-in mathematical functions:

exp(x) – e^x
log(x) – natural logarithm of x (use log(x, b) for the logarithm of x to base b)
x^y – x raised to the power y
sqrt(x) – square root of x

Note that these all work on vectors. exp(c(1, 2)) gives you c(2.718282, 7.389056).

Special Mathematical Values

Floating point supports some special values:

1/0 = Inf
-1/0 = -Inf
0/0 = NaN

Mathematical operations are defined for these special values. For example, Inf + Inf = Inf, and Inf - Inf = NaN.
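
The data frame and regression slides above can be tried without access to apple.csv. The sketch below builds a simulated stand-in: the data frame name apple_sim, the sample size, and the "true" intercept and slope are all made-up assumptions for illustration, not values from the course data. Only the column names RET and VWRETD follow the slides.

```r
# Simulated stand-in for apple.csv (all numbers below are made up).
set.seed(1)
n <- 120                                      # 120 hypothetical months
VWRETD <- rnorm(n, mean = 0.008, sd = 0.045)  # simulated market return
true_alpha <- 0.01                            # assumed intercept
true_beta  <- 1.3                             # assumed slope
RET <- true_alpha + true_beta * VWRETD + rnorm(n, sd = 0.02)

# A data frame is a named list of equal-length vectors.
apple_sim <- data.frame(RET = RET, VWRETD = VWRETD)
print(head(apple_sim$RET))                    # access a column with $

# Regress RET on VWRETD, as on the regression slide.
fit <- lm(RET ~ VWRETD, data = apple_sim)
print(coef(fit))                              # intercept and slope only
print(summary(fit)$coefficients)              # estimates, std. errors, t stats, p-values
```

Because the data are generated with a known slope, the estimated coefficient on VWRETD should land near true_beta, which is a convenient check that the formula is specified the right way round.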

Defining Your Own Functions

You can define a function by using R's function command:

    f <- function(x) x^2

This creates a function that squares its argument and assigns it to the variable f. Calling f(2) in R will return 4.
Functions can take vector arguments, so f(c(1, 2)) will return c(1, 4).

Matrices

R also supports matrices. Use matrix(0, nrow=m, ncol=n) to create an m-by-n matrix of zeros. For example:

    g = matrix(0, nrow = 3, ncol = 4)

To access the element in the i-th row and j-th column, use [] with two numbers. For example,

    g[1,2] <- 3

assigns 3 to the element in row 1, column 2 of g.

Logical Operations

R has the following basic logical operations:

== : equal
!= : not equal
<, > : less than, greater than
<=, >= : less than or equal, greater than or equal

They evaluate to TRUE or FALSE.

Logical Operations on Vectors

Logical operations work on vector arguments and return a vector of TRUE or FALSE values. Example: 1:10 > 5.
You can use the functions any or all to see if any or all of the entries in the vector are TRUE.

Control Structures

R supports the standard control structures found in most programming languages:

Branching: if
Definite iteration: for
Indefinite iteration: while

Control Structures: If

A statement like "if (test) code1 else code2" executes code1 if the test is true, and code2 if the test is false. (The "else code2" part can be omitted, which means do nothing when the test is false.)
Example:

    if (0 == 0) print("is zero") else print("is not zero")
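
The pieces from the last few slides (user-defined functions, matrices, logical vectors, and if) fit together as in the following minimal sketch. The variable names squares, big, and msg are illustrative choices, not from the slides.

```r
# A user-defined function applied to a vector
f <- function(x) x^2
squares <- f(1:5)          # elementwise: 1 4 9 16 25

# Logical operations on vectors return TRUE/FALSE vectors
big <- squares > 5         # FALSE FALSE TRUE TRUE TRUE
any_big <- any(big)        # TRUE: at least one entry exceeds 5
all_big <- all(big)        # FALSE: not every entry exceeds 5

# Branching needs a single TRUE/FALSE value, hence any() and all()
if (all_big) {
  msg <- "every square exceeds 5"
} else if (any_big) {
  msg <- "some squares exceed 5"
} else {
  msg <- "no square exceeds 5"
}
print(msg)

# A 3-by-4 matrix of zeros with one entry changed
g <- matrix(0, nrow = 3, ncol = 4)
g[1, 2] <- 3               # row 1, column 2
print(dim(g))              # number of rows, then number of columns
print(g)
```

Note that if expects a single logical value, which is why the vector comparison is reduced with any or all before it is used as a test.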

Control Structures: For

For allows you to do something a fixed number of times.
Example:

    for (i in 1:10) print(i)

Control Structures: While

While allows you to repeat something as long as a condition remains TRUE, that is, until it becomes FALSE. (It may take forever.)
Example:

    i = 10
    while (i > 0) {
      print(i)
      i = i - 1
    }

(Notice the use of braces here. This is because the body of the while loop contains multiple statements.)

Writing Fast R Code

R is faster for vector operations than for loops.
Example:

    x <- (1:1000)^2

is faster than

    x <- rep(0, 1000)   # create a vector of all zeros
    for (i in 1:1000) {
      x[i] <- i^2
    }
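
The speed claim above can be checked with R's system.time function. This is a rough sketch rather than a careful benchmark: the vector length n is an assumption, chosen much larger than the slide's 1000 so that the difference is measurable, and the timings depend on the machine and the R version.

```r
n <- 1000000   # assumed size, larger than the slide's 1000

# Vectorized version: one operation on the whole vector
t_vec <- system.time(x_vec <- (1:n)^2)

# Loop version, with the result vector preallocated
t_loop <- system.time({
  x_loop <- rep(0, n)
  for (i in 1:n) {
    x_loop[i] <- i^2
  }
})

print(t_vec["elapsed"])       # seconds for the vectorized version
print(t_loop["elapsed"])      # seconds for the loop version
print(all(x_vec == x_loop))   # both approaches produce the same numbers
```

The loop here already preallocates x_loop with rep; growing a vector inside the loop instead would typically be slower still.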