Introduction to Python for Quantitative Economics∗

Konso Wawa†

December 27, 2016

LAREQ One Pager, Vol. 11, No. 001, 89–95

Abstract

Python is an interpreted, object-oriented programming language, suitable for rapid application development and scripting. This paper outlines what Python actually is, with a view to helping researchers and aspiring programmers in quantitative economics decide whether this programming language is the right choice for their research.

Keywords: Python, object-oriented programming

Résumé

Python est un langage de programmation orienté objet, adapté au développement rapide d’applications et aux scripts. Cet article propose une introduction au langage de programmation Python, dans le but d’aider les chercheurs et aspirants modélisateurs en économie à décider s’ils peuvent faire le bon choix d’utiliser ce logiciel pour leurs recherches.

Mots-clés : Python, programmation orientée-objet

1 Introduction

This paper outlines what Python actually is, with a view to helping researchers and aspiring programmers in quantitative economics decide whether this programming language is the right choice for their research. Python is a high-level, general-purpose, dynamic programming language which emphasizes code readability and allows programmers to express concepts in fewer lines of code than is possible in languages such as C++ or Java. Python was conceived in the late 1980s by Guido van Rossum and is released under an OSI-approved open source license.1 Python’s license is administered by the Python Software Foundation (Lutz, 2001). Python is used in many application areas, such as communications, web development, games, multimedia, data processing and security. The language offers useful features, which are discussed with illustrations in the coming sections: rapid development, support from many libraries, an elegant syntax, etc. Documentation about Python is available from the official website2 and other sources.
∗ I thank Jean-Paul K. Tsasa for useful comments.
† M.Sc. student in Robotics and Automation at University of Siena, Italy, and Aspirant researcher at LAREQ. E-mail: [email protected]; [email protected].
1 Making it freely usable and distributable, even for commercial use.
2 Python Software F.: https://www.python.org. See also Green (2010), which discusses installed documentation.

However, the main aim of this paper is to help readers understand what Python is and how it can be beneficial in quantitative economics. Some recent developments have demonstrated Python’s range of applicability to econometrics, statistics and general numerical analysis.3 While Python is a programming language, as are C, Fortran and others, it has some specific features, in particular the following characteristics (Burns et al., 2016):

(i) an interpreted (as opposed to compiled) language. Contrary to e.g. C or Fortran, one does not compile Python code before executing it;
(ii) free software released under an open-source license: Python can be used and distributed free of charge, even for building commercial software;
(iii) a multi-platform programming language. Python is available for all major operating systems: Windows, Linux/Unix, MacOS X, most likely your mobile phone OS, etc.;
(iv) a very readable language with a clear, non-verbose syntax;
(v) a language for which a large variety of high-quality packages are available for various applications, from web frameworks to scientific computing;
(vi) a language that is very easy to interface with other languages, in particular C and C++;
(vii) an object-oriented language with dynamic typing, where the same variable can contain objects of different types during the course of a program.4

Moreover, most languages offer the possibility of calling code written in other languages, but in Python this is a particularly simple and smooth process.
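As a minimal sketch of the dynamic typing mentioned above, the snippet below binds the same name to an integer, then to a string, then to a list. The variable name x and the values used are purely illustrative assumptions and do not come from the sources cited.

```python
# Illustrative sketch of dynamic typing; the name `x` and its values are hypothetical.
x = 10                      # x is bound to an integer
first_type = type(x).__name__

x = "ten"                   # the same name is now bound to a string
second_type = type(x).__name__

x = [10, "ten", 10.0]       # and now to a list holding items of mixed types
third_type = type(x).__name__

print(first_type, second_type, third_type)
```

No type declaration is needed at any point: the type belongs to the object that the name currently refers to, not to the name itself.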
These features give novice programmers, especially economists, an advantage compared to other languages.5

The remainder of the paper is organized as follows. Section 2 introduces Python jargon. Section 3 and Section 4 present the Python environment and an introductory example, respectively. Finally, Section 5 concludes.

3 See e.g. Sargent and Stachurski (2004); Hart (2009); Millman and Aivazis (2011).
4 See also Langtangen (2008a,b); Pedregosa et al. (2011); Rossum and Drake (2011) for more information about distinguishing features of Python.
5 Many economists are familiar with Stata, R, MatLab or Mathematica. But as we will see here, Python is much simpler and is suitable for rapid application development and scripting.

2 Python Jargon

This section presents some basic definitions of programming jargon (Ebrahimi, 1994; Brusilovsky et al., 1997; Gemmell, 2002; Green, 2010; Pedregosa et al., 2011).

Argument. An argument is an expression that occurs within the parentheses of a function or method call, such as x and y in f(x, y).

Class. A class is an extensible program-code template for creating objects and implementing their behavior. In Python, a class is itself an object that is created at runtime, when its class statement is executed. Examples: a class of persons, students, animals, etc.

Assignment statement. An assignment statement gives a new value to an existing variable; if the variable does not exist, the statement creates it and assigns the value to it. Its general form is <variable> = <value>.

Variable. Knuth (1997) defines a variable as a quantity that may possess different values as a program is being executed. We prefer a simpler definition: a variable is a container that holds information, with the sole purpose of labeling and storing data in memory.

Data Type. A data type is a classification that determines the possible values of the data (variable), the meaning of the data, and the way values of that type can be stored.

Cast. A cast is an operation used to convert data from one type to another.
For example, from float to integer.

Function. A function is a block of organized, reusable code that is used to perform a single, related action. Other names are used for the same concept: methods, subroutines, procedures, etc.

Library. In programming, a library is a collection of precompiled functions or routines.

Debugging. Because code is written by human beings, it sometimes contains errors. Programming errors are called bugs, and the process of detecting and correcting them is called debugging. Errors can be grouped into three types: syntax errors, runtime errors and semantic errors.

Module. A simple definition of a module is “a file consisting of Python code which can define functions, classes and variables”.

Import statement. Python uses the import statement to load and use a source file (a module) that is not part of the current code:

    import module1

3 Setting up your Python Environment

A programming environment is software that gives us a platform to write our code (computer programs) and to compile and execute it. To ease the migration of economists into computer science, we will use Anaconda. Anaconda is a distribution of Python, the conda package and environment manager, and many software packages for data analytics, data science, and scientific computing.

Installing Anaconda

• Install the latest version, Python 3.5, from this link: https://www.continuum.io/downloads
• Click yes to make Anaconda your default installation.

Package Management

To keep Anaconda packages up to date, we use conda, a tool that lets us regularly update the whole Anaconda distribution.
Here are the steps:

• Open up a terminal:
  – For Mac users, see: http://guides.macrumors.com/Terminal
  – For Windows, search for the cmd application or see: http://www.computerhope.com/issues/chdos.html
• Type “conda update anaconda”

Jupyter

Jupyter notebooks are one of the many possible ways to interact with Python, through a browser-based interface. We choose Jupyter because it lets you write and execute Python commands directly in your browser, and format text and mathematical expressions between code cells.

How do we start Jupyter? Open a terminal (“cmd” on Windows, “terminal” on Linux) and type this command:

    jupyter notebook

This command runs and automatically opens the default browser. If the default browser does not open, open your preferred browser and type in the address http://localhost:8888. To start writing your code, click on the button “New” and create a notebook; a new cell will open in which you can write your code.

QuantEcon

QuantEcon is a package that supports all forms of quantitative economic modelling. It has to be added to Anaconda as a library before users can perform the modelling. To install QuantEcon, type

    pip install quantecon

in the terminal. Some examples will be added in the appendix to show how this library works.

4 Introductory Example

Syntax and Basic Data Structures

We believe that many economists are familiar with Stata, but Python is much simpler than Stata. In Stata, we use “&” and “|” to express the logical “and” and “or”; in Python, we write them as in English: “and” and “or”. Certain rules need attention when programming in Python, such as code indentation and capitalization.

Variables: What Stata calls Macros

Variables were defined earlier in this paper. As in most programming languages, variables play an important role in Python.
Python has global and local variables, but in this paper we will focus on local variables.

Examples: assigning a value to a variable

1. myNumber = 10 ; assigns the value 10 to the variable myNumber.
2. myString = "Hello world" ; assigns the value "Hello world" to the variable myString. A string requires double or single quotes when defined.

List

A list is a variable which holds a sequence of items, possibly of different data types. Items in the list are separated by commas.

    myList = [1, 2, 3, 4]    # defines a new list with items 1, 2, 3, 4
    myList.append(5)         # appends 5 to the end of the list
    myList = myList + [6]    # concatenates a second list holding 6
    print(myList)            # items appear in the order they were added
    # [1, 2, 3, 4, 5, 6]

This example demonstrates the use of a list and two ways to add an item to it: appending and concatenating.

Function

For Stata users, functions are equivalent to programs. Python requires the keyword def when defining a function. A function may or may not take parameters; if parameters are needed, each of them must be named in the function definition.

    def printName(name):
        print("My name is " + name)

    printName("Lareq")    # My name is Lareq

Here is an example of a function which adds two numbers and returns the result of the operation.

    def AddNumbers(num1, num2):
        return num1 + num2

    result = AddNumbers(5, 6)    # store 5 + 6 in result
    print(result)                # display the value of result
    # 11

Statements: if / elif / else

It may be tricky to decide when to use this statement. But remember that Python reads much like English; therefore we can use this statement whenever we need to make a choice between options.
    marks = 46
    if marks >= 75:
        status = "distinction"
    elif marks < 75 and marks >= 50:
        status = "succeed"
    else:
        status = "fail"
    print(status)    # prints the result based on the marks: fail

Stata users are already familiar with if/else statements, for loops and while loops. Python uses the same statements but with a different syntax, which is shown in the examples.

While

A while statement is used when we intend to repeat a section of code for as long as a condition holds. An example of a while loop is a timer: as long as the time is still greater than 0, we decrease it by 1.

    Time = 10    # illustrative starting value
    while Time > 0:
        Time = Time - 1

for

Unlike while, a for statement repeats a section of code a specified number of times. The example below repeats the loop body for each number in the range from 1 up to, but not including, 5, meaning 4 times.

    for num in range(1, 5):
        print('Number: ', num)

5 Conclusion

This paper gave an overview of Python and introduced economists to the Python world through practical examples, which can be helpful in the transition from Stata to Python. Programming may be new to some readers and familiar to others, but understanding the concepts is beneficial to all. Python offers a great platform for data handling and manipulation, especially cleaning and reformatting. In our view, it is more capable at data set construction than either R, Mathematica or MatLab.

References

Brusilovsky, P., Calabrese, E., Hvorecky, J., Kouchnirenko, A., Miller, P., 1997. Mini-languages: A way to learn programming principles. Education and Information Technologies 2, 65–83. URL: http://link.springer.com/article/10.1023/A:1018636507883.

Burns, C., Combelles, C., Gouillart, E., Varoquaux, G., 2016. One document to learn numerics, science, and data with Python. Technical Report. SciPy lecture. URL: http://www.scipy-lectures.org/index.html.

Ebrahimi, A., 1994. Novice programmer errors: language constructs and plan composition.
International Journal of Human-Computer Studies 41, 457–480. URL: http://dx.doi.org/10.1006/ijhc.1994.1069.

Gemmell, M., 2002. Introduction to Programming. Technical Report. Scotland Software. URL: http://www.deansdirectortutorials.com/Lingo/IntroductionToProgramming.pdf.

Green, R.D., 2010. Beginner’s Guide to Python. Technical Report. Python Software Foundation. URL: https://www.python.org.

Hart, W.E., 2009. Python optimization modeling objects (pyomo), in: Chinneck, J.W., Kristjansson, B., Saltzman, M.J. (eds), Operations Research and Cyber-Infrastructure. Operations Research/Computer Science Interfaces, volume 47. Springer, Boston, MA, pp. 3–19. URL: http://link.springer.com/chapter/10.1007/978-0-387-88843-9_1.

Knuth, D.E., 1997. Art of Computer Programming, The: Volume 1: Fundamental Algorithms. Addison-Wesley Professional. URL: https://www.pearsonhighered.com/program/Knuth-Art-of-Computer-Programming-The-Volume-1-Fundamental-Algorithms-3rd-Edition/PGM173687.html.

Langtangen, H.P., 2008a. Combining Python with Fortran, C, and C++, in: Python Scripting for Computational Science. Springer Berlin Heidelberg, pp. 189–226. URL: http://link.springer.com/chapter/10.1007/978-3-540-73916-6_5.

Langtangen, H.P., 2008b. Python scripting for computational science, in: Texts in Computational Science and Engineering. Springer Berlin Heidelberg. volume 3, p. 756. URL: http://link.springer.com/book/10.1007/978-3-540-73916-6?no-access=true.

Lutz, M., 2001. Programming Python. Foreword for “Programming Python” (2nd ed.) by Guido van Rossum, O’Reilly Media. URL: http://shop.oreilly.com/product/9780596000851.do.

Millman, K.J., Aivazis, M., 2011. Python for scientists and engineers. Computing in Science and Engineering 13, 9–12. URL: http://doi.ieeecomputersociety.org/10.1109/MCSE.2011.36.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, É., 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830. URL: http://www.jmlr.org/papers/volume12/pedregosa11a/pedregosa11a.pdf.

Rossum, G.v., Drake, F.L., 2011. The Python Language Reference Manual. Network Theory Ltd. URL: http://dl.acm.org/citation.cfm?id=2011965.

Sargent, T.J., Stachurski, J., 2004. Programming in Python. Technical Report. Quantitative Economic Modelling. URL: http://lectures.quantecon.org/py/index.html.