Introduction to Python for Quantitative Economics∗
Konso Wawa†
December 27, 2016
LAREQ One Pager, Vol. 11, No. 001, 89–95
Abstract
Python is an interpreted, object-oriented programming language, suitable for rapid application development and scripting. This paper outlines what Python actually is, with a view to helping researchers
and aspiring programmers in quantitative economics decide whether this programming language is the
right choice for their research.
Keywords: Python, object-oriented programming
Résumé
Python is an object-oriented programming language, suited to rapid application development
and scripting. This article offers an introduction to the Python programming language, with the aim
of helping researchers and aspiring modelers in economics decide whether this software is the right
choice for their research.
Keywords: Python, object-oriented programming
1 Introduction
This paper outlines what Python actually is, with a view to helping researchers and aspiring programmers
in quantitative economics decide whether this programming language is the right choice for their
research. Python is a high-level, general-purpose, dynamic programming language which emphasizes
code readability and allows programmers to express concepts in fewer lines of code than would be
possible in languages such as C++ or Java. Python was conceived in the late 1980s by Guido van Rossum
and is distributed under an OSI-approved open source license.1 Python’s license is administered by the
Python Software Foundation (Lutz, 2001).
Python is used in many applications such as communications, web development, games, multimedia,
data processing, security, etc. The language offers great features, which will be discussed in the
following sections with illustrations: rapid development, support from many libraries, elegant syntax, etc.
Documentation about Python is available from the official website,2 and other sources. However, the
main aim of this paper is to help readers understand what Python is and how it can be beneficial in
quantitative economics. Some recent developments have demonstrated Python’s range of applicability
to econometrics, statistics and general numerical analysis.3
∗ I thank Jean-Paul K. Tsasa for useful comments.
† M.Sc. student in Robotics and Automation at the University of Siena, Italy, and aspirant researcher at LAREQ.
E-mail: [email protected]; [email protected].
1 Making it freely usable and distributable, even for commercial use.
2 Python Software Foundation: https://www.python.org. See also Green (2010), which discusses installed documentation.
While Python is a programming language, as are C, Fortran and others, it has some specific features,
in particular the following (Burns et al., 2016): (i) it is an interpreted (as opposed to compiled)
language: contrary to e.g. C or Fortran, one does not compile Python code before executing it; (ii) it is
free software released under an open-source license: Python can be used and distributed free of charge,
even for building commercial software; (iii) it is a multi-platform language, available for all major
operating systems (Windows, Linux/Unix, MacOS X, most likely your mobile phone OS, etc.); (iv) it is a
very readable language with a clear, non-verbose syntax; (v) a large variety of high-quality packages
is available for it, covering applications from web frameworks to scientific computing; (vi) it is very
easy to interface with other languages, in particular C and C++; (vii) it is an object-oriented language
with dynamic typing, where the same variable can contain objects of different types during the course
of a program.4
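To make the idea of dynamic typing concrete, the following minimal sketch rebinds one variable to objects of three different types. The variable names are chosen here purely for illustration and do not come from the cited sources.

```python
# One name, rebound to objects of different types during the program.
value = 10                          # an integer
first_type = type(value).__name__

value = "ten"                       # now a string
second_type = type(value).__name__

value = [1, 0]                      # now a list
third_type = type(value).__name__

print(first_type, second_type, third_type)  # int str list
```

No type declaration is needed at any point; Python looks at the object currently bound to the name.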
Moreover, most languages offer the possibility of calling code written in other languages, but in Python
this is a particularly simple and smooth process. These features give novice programmers, especially
economists, an advantage compared with using other languages.5
The remainder of the paper is organized as follows. Section 2 introduces Python jargon. Section 3 and
Section 4 present the Python environment and an introductory example, respectively. Finally, Section 5
concludes.
2 Python Jargon
This section gives basic definitions of common programming terms (Ebrahimi, 1994; Brusilovsky
et al., 1997; Gemmell, 2002; Green, 2010; Pedregosa et al., 2011).
Argument
An argument is an expression that occurs within the parentheses of a function or method call, for example the expression x + y in print(x + y).
Class
A class is an extensible program-code template for creating objects and implementing their
behavior. In Python, a class is created at runtime, when its class statement is executed. Examples:
a class of persons, students, animals, etc.
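As a rough illustration of this definition, the sketch below defines a small class and creates two objects from it. The class name Person follows the example in the text; the attribute name and the method greet are hypothetical choices made only for this sketch.

```python
class Person:
    """A template for creating person objects."""

    def __init__(self, name):
        # Each object stores its own data (here, a name).
        self.name = name

    def greet(self):
        # Behavior shared by every object created from the class.
        return "Hello, my name is " + self.name


student = Person("Ada")
teacher = Person("Guido")
print(student.greet())
print(teacher.greet())
```

Both objects share the same behavior but hold different data, which is what makes the class a reusable template.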
Assignment statement
An assignment statement assigns a new value to an existing variable; if the variable does not yet
exist, the statement creates it and assigns the value to it. Its form is <variable> = <value>.
Variable
Knuth (1997) defines a variable as a quantity that may possess different values as a program is being
executed. We prefer a simpler definition: a variable is a container that holds information, with
the sole purpose of labeling and storing data in memory.
Data Type
A data type is a classification that determines the possible values of the data (variable), the meaning
of the data, and the way values of that type can be stored.
3 See e.g. Sargent and Stachurski (2004); Hart (2009); Millman and Aivazis (2011).
4 See also Langtangen (2008a,b); Pedregosa et al. (2011); Rossum and Drake (2011) for more information about
distinguishing features of Python.
5 Many economists are familiar with Stata, R, MatLab or Mathematica. But as we will see here, Python is
much simpler and suitable for rapid application development and scripting.
Cast
A cast is an operation that converts data from one type to another, for example from float to integer.
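A short sketch of casting with Python's built-in conversion functions follows. The variable names and values are illustrative assumptions.

```python
price = 3.7                # a float
whole = int(price)         # cast float -> integer (truncates toward zero)
text = str(price)          # cast float -> string
back = float("2.5")        # cast string -> float

print(whole, text, back)   # 3 3.7 2.5
```

Note that int() drops the fractional part rather than rounding to the nearest integer.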
Function
A function is a block of organized, reusable code that performs a single, related action. Other
terms are used for the same or closely related concepts: methods, sub-routines,
procedures, etc.
Library
In programming, a library is a collection of precompiled functions or routines.
Debugging
Because code is written by human beings, it sometimes contains errors. Programming errors are called bugs, and the process of detecting and correcting them is called debugging.
Errors can be grouped into three types: syntax errors, runtime errors and semantic
errors.
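The three kinds of error can be sketched as follows. This is only an illustration: the function names average_wrong and average_right are hypothetical, and the built-in compile function is used here simply so that the syntax error can be shown without stopping the script.

```python
# 1. Syntax error: the code breaks the grammar rules and cannot run at all.
syntax_error_found = False
try:
    compile("print('hello'", "<example>", "exec")  # missing closing parenthesis
except SyntaxError:
    syntax_error_found = True

# 2. Runtime error: the code is well formed but fails while running.
runtime_error_found = False
try:
    ratio = 1 / 0
except ZeroDivisionError:
    runtime_error_found = True


# 3. Semantic error: the code runs without complaint but computes the wrong thing.
def average_wrong(a, b):
    return a + b / 2        # bug: only b is divided by 2


def average_right(a, b):
    return (a + b) / 2


print(syntax_error_found, runtime_error_found)   # True True
print(average_wrong(4, 6), average_right(4, 6))  # 7.0 5.0
```

Semantic errors are usually the hardest to find, because Python reports nothing: only checking the result against a known answer reveals them.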
Module
A simple definition of a module is “a file containing Python code, which can define functions, classes and variables”.
Import statement
An import statement lets a Python program call or use code from a source file (a module) which is
not part of the current file. It consists of the keyword import followed by the module name.
import module1
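Since module1 above is only a placeholder name, here is a runnable sketch using two modules from Python's standard library, math and statistics. The variable names are illustrative.

```python
import math                    # make the whole module available as math.<name>
from statistics import mean    # import a single name from a module

root = math.sqrt(16)           # 4.0
avg = mean([1, 2, 3, 4])       # 2.5
print(root, avg)
```

The first form keeps the module name as a prefix, which makes it clear where each function comes from; the second form imports one name directly.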
3 Setting up your Python Environment
A programming environment is software which gives us a platform to write our code (computer programs) and to compile and execute it. To ease the move of economists into computer
science, we will use Anaconda.
Anaconda is a distribution of Python, the conda package and environment manager, and many software
packages for data analytics, data science, and scientific computing.
Installing Anaconda
• Install the latest version (Python 3.5 at the time of writing) from this link: https://www.continuum.io/downloads
• Click yes to make Anaconda your default installation.
Package Management
To keep Anaconda packages up to date, we use conda, a tool that lets us regularly update
the whole Anaconda distribution.
Here are the steps:
• Open up a terminal:
– For Mac users, see this link: http://guides.macrumors.com/Terminal
– For Windows, search for the cmd application or see this link: http://www.computerhope.com/issues/chdos.html
• Type “conda update anaconda”
Jupyter
Jupyter notebooks are one of the many possible ways to interact with Python, through a browser-based
interface. We choose Jupyter because it lets us write and execute Python commands directly in
the browser, and format text and mathematical expressions in cells placed between the code cells.
How do we start Jupyter?
To start Jupyter, we open a terminal (cmd on Windows, Terminal on Linux) and type
this command: jupyter notebook
This command runs and automatically opens the default browser. If the default browser does not open,
open your preferred browser and type in the address http://localhost:8888. To start
writing code, click the “New” button and select a Python notebook; a new notebook opens with an
empty cell in which to write your code.
QuantEcon
QuantEcon is a package to support quantitative economic modelling. It has to be
added to Anaconda as a library before users can use it for modelling.
To install QuantEcon, type pip install quantecon in the terminal. Some examples will
be added in the appendix to show how this library works.
4 Introductory Example
Syntax and Basic Data Structures
We believe that many economists are familiar with Stata, but Python is much simpler than Stata.
Remember that in Stata we use “&” and “|” to express the logical “and” and “or”; in Python we
write them as in English, “and” and “or”.
Certain rules need attention when programming in Python, such as code indentation and
capitalization.
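The sketch below illustrates the logical operators, indentation and capitalization in one place. The variable names and threshold values are hypothetical and chosen only for illustration.

```python
age = 30
income = 1200

# Logical operators are written as English words.
is_target = age >= 18 and income < 2000      # True
is_excluded = age < 18 or income > 5000      # False

# Indentation defines which lines belong to a block.
if is_target and not is_excluded:
    label = "included"      # indented: runs only when the condition holds
else:
    label = "dropped"

# Names are case-sensitive: Label and label are different variables.
Label = "another value"
print(label, Label)         # included another value
```

Unlike Stata, Python has no braces or end keywords to close a block: the indentation alone marks where the block starts and stops.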
Variables: What Stata Calls Macros
Variables were defined earlier in this paper. As in most programming languages, variables play an
important role in Python. Python has global and local variables but in this paper,
we will focus on local variables.
Examples: Assigning a value to a variable
1. myNumber = 10 ; assigns the value 10 to the variable myNumber.
2. myString = "Hello world" ; assigns the value "Hello world" to the variable myString.
A string requires double or single quotes when defined.
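The difference between global and local variables can be sketched as follows. The names rate, interest and interest_due are hypothetical and used only for this illustration.

```python
rate = 0.05                          # global: defined at the top level of the file


def interest(amount):
    interest_due = amount * rate     # interest_due is local to the function
    return interest_due


payment = interest(100)
print(payment)

# The local name does not exist outside the function.
try:
    interest_due
    local_is_hidden = False
except NameError:
    local_is_hidden = True
print(local_is_hidden)               # True
```

A function can read a global variable such as rate, but names created inside the function disappear when it returns.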
List
A list is a variable which holds a sequence of items, possibly of different data types. Items in the list
are separated by commas.
    myList = [1, 2, 3, 4]   # defines a new list with items 1, 2, 3, 4
    myList.append(5)
    myList = myList + [6]
    myList   # items appear in the order they were added: [1, 2, 3, 4, 5, 6]
This example demonstrates how to define a list and two ways of adding an item to it: appending and concatenating.
Function
For Stata users, functions are equivalent to programs. Python requires the keyword def when defining a
function. A function may or may not take parameters; any parameters it needs must be named in the
function definition.
    def printName(name):
        print("My name is " + name)
    printName("Lareq")   # My name is Lareq
Here is an example of a function which adds two numbers and returns the result of the operation.
    def AddNumbers(num1, num2):
        return num1 + num2
    result = AddNumbers(5, 6)   # store 5 + 6 in result
    print(result)   # displays the value of result: 11
Statements
if / elif / else. It may be tricky to decide when to use this statement, but remember that
Python reads much like English: we use this statement whenever we need to make a choice
between options.
    marks = 46
    if marks >= 75:
        status = "distinction"
    elif marks < 75 and marks >= 50:
        status = "succeed"
    else:
        status = "fail"
    print(status)   # displays the result based on the marks: fail
Stata users are already familiar with if/else statements, for loops and while loops. Python has
the same statements but with a different syntax, as the examples in this section show.
while. A while statement repeats a block of code for as long as its condition remains true.
An example is a timer: as long as the time is still greater than 0, we decrease it by 1.
    Time = 10   # starting value, assumed here for illustration
    while Time > 0:
        Time = Time - 1
for. Unlike while, a for statement repeats a block of code a specified number of times. The
example below repeats the block over the range from 1 to 5, excluding 5, meaning 4 times.
    for num in range(1, 5):
        print('Number:', num)
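To see why a range from 1 to 5 yields four repetitions, the sketch below lists the values produced by range and uses a for loop to add them up. The variable names are illustrative assumptions.

```python
numbers = list(range(1, 5))   # the stop value 5 is excluded
total = 0
for num in numbers:
    total = total + num       # accumulate the running sum

print(numbers, total)         # [1, 2, 3, 4] 10
```

To include 5 as well, one would write range(1, 6).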
5 Conclusion
This paper gave an overview of Python and introduced economists to the Python world through practical
examples that can help in the transition from Stata to Python. Programming may be new to some readers
and not to others, but understanding the concepts is beneficial to all. Python offers a great platform
for data handling and manipulation, especially cleaning and reformatting. It is arguably more capable
at data set construction than R, Mathematica or MatLab.
References
Brusilovsky, P., Calabrese, E., Hvorecky, J., Kouchnirenko, A., Miller, P., 1997. Mini-languages: A way to learn programming principles. Education and Information Technologies 2, 65–83. URL: http://link.springer.com/article/10.1023/A:1018636507883.
Burns, C., Combelles, C., Gouillart, E., Varoquaux, G., 2016. One document to learn numerics, science, and data with Python. Technical Report. SciPy lecture. URL: http://www.scipy-lectures.org/index.html.
Ebrahimi, A., 1994. Novice programmer errors: language constructs and plan composition. International Journal of Human-Computer Studies 41, 457–480. URL: http://dx.doi.org/10.1006/ijhc.1994.1069.
Gemmell, M., 2002. Introduction to Programming. Technical Report. Scotland Software. URL: http://www.deansdirectortutorials.com/Lingo/IntroductionToProgramming.pdf.
Green, R.D., 2010. Beginner’s Guide to Python. Technical Report. Python Software Foundation. URL: https://www.python.org.
Hart, W.E., 2009. Python optimization modeling objects (pyomo), in: Chinneck J.W., Kristjansson B., Saltzman M.J. (eds), Operations Research and Cyber-Infrastructure. Operations Research/Computer Science Interfaces. Springer, Boston, MA. volume 47, pp. 3–19. URL: http://link.springer.com/chapter/10.1007/978-0-387-88843-9_1.
Knuth, D.E., 1997. Art of Computer Programming, The: Volume 1: Fundamental Algorithms. Addison-Wesley Professional. URL: https://www.pearsonhighered.com/program/Knuth-Art-of-Computer-Programming-The-Volume-1-Fundamental-Algorithms-3rd-Edition/PGM173687.html.
Langtangen, H.P., 2008a. Combining Python with Fortran, C, and C++, in: Python Scripting for Computational Science. Springer Berlin Heidelberg, pp. 189–226. URL: http://link.springer.com/chapter/10.1007/978-3-540-73916-6_5.
Langtangen, H.P., 2008b. Python scripting for computational science, in: Texts in Computational Science and Engineering. Springer Berlin Heidelberg. volume 3, p. 756. URL: http://link.springer.com/book/10.1007/978-3-540-73916-6?no-access=true.
Lutz, M., 2001. Programming Python. Foreword for "Programming Python" (2nd ed.) by Guido van Rossum, O’Reilly Media. URL: http://shop.oreilly.com/product/9780596000851.do.
Millman, K.J., Aivazis, M., 2011. Python for scientists and engineers. Computing in Science and Engineering 13, 9–12. URL: http://doi.ieeecomputersociety.org/10.1109/MCSE.2011.36.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, É., 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830. URL: http://www.jmlr.org/papers/volume12/pedregosa11a/pedregosa11a.pdf.
Rossum, G.v., Drake, F.L., 2011. The Python Language Reference Manual. Network Theory Ltd. URL:
http://dl.acm.org/citation.cfm?id=2011965.
Sargent, T.J., Stachurski, J., 2004. Programming in Python. Technical Report. Quantitative Economic
Modelling. URL: http://lectures.quantecon.org/py/index.html.