The Python Language

Petr Přikryl
Socrates IP, Brno, Czech Republic
15th June 2004, the morning slot

Abstract

If you do not know Python, this text should give you an overview. If you know only that Python exists, it should make you curious to look closer at the language. If you already know Python, you may find some arguments here to convince others to use Python, too. And if you are a Python expert, share your knowledge with the audience at this workshop.

Contents

1 What is Python?
    1.1 Brief summary
    1.2 What is Python good for?
    1.3 In what projects was Python used with success?
    1.4 Interview with Guido van Rossum
2 Syntax and semantics (examples)
    2.1 “Hello world!”
    2.2 Variables are names bound to objects
    2.3 Types are related to objects, not to variables
    2.4 Built-in types
    2.5 Objects are garbage collected
    2.6 Simple statements and built-in functions
    2.7 Block-of-commands, indentation, readability
    2.8 Programming constructions
    2.9 Functions and generators
3 Features of the language
    3.1 Interpreted but semicompiled language
    3.2 Python is extensible
    3.3 Ready (also) for object oriented programming
    3.4 Object oriented from inside
    3.5 Python is a multiparadigm language
    3.6 Using exceptions for consistent error handling
    3.7 Productivity versus Performance
4 More syntax, semantics, and examples
    4.1 Classes and instances
    4.2 Modules and packages
5 Experience of others
    5.1 Eric Raymond
    5.2 Bruce Eckel
6 Conclusion

Version 27 from 14th June, 2004.

1 What is Python?

Python is a programming language. On Python’s home page, http://www.python.org/, you will find all the information that you need. Pay attention to the set of Frequently Asked Questions. You will also find information about newsgroups (comp.lang.python) and mailing lists for communicating with others from the Python community.

1.1 Brief summary

On the page http://www.python.org/doc/Summary.html, named What is Python?, you can read:

    Python is an interpreted, interactive, object-oriented programming language. It is often compared to Tcl, Perl, Scheme or Java. Python combines remarkable power with very clear syntax. It has modules, classes, exceptions, very high level dynamic data types, and dynamic typing. There are interfaces to many system calls and libraries, as well as to various windowing systems (X11, Motif, Tk, Mac, MFC).
    New built-in modules are easily written in C or C++. Python is also usable as an extension language for applications that need a programmable interface. The Python implementation is portable: it runs on many brands of UNIX, on Windows, OS/2, Mac, Amiga, and many other platforms. [...] The Python implementation is copyrighted but freely usable and distributable, even for commercial use.

Guido van Rossum, the main author of the language, adds:

    Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together. Python’s simple, easy to learn syntax emphasizes readability and therefore reduces the cost of program maintenance. (http://www.python.org/doc/essays/blurb.html)

Let’s look at some details. Let’s start to learn Python.

1.2 What is Python good for?

Replacement for script/batch processors. Once you learn Python, you will find it so simple to use that you will probably stop writing shell scripts (Unix world: sh, bash, etc.) or batch files (Windows world: command.com, cmd.exe). Python scripts are no bigger than shell scripts or batch files. Moreover, it is easy to start with Python. It is even easier to create a Python script than to write a shell script or a batch file. And the same script can be used in both Unix and Windows environments. It is easy to call existing (command line) applications and utilities to glue them together. If you are used to the Unix-like style of combining standard utilities, you can do the same. But Python also comes with ready-to-use standard modules [1] that offer the same functionality. This means that you depend less on the existence of external utilities.

Writing simple personal utilities and small applications. When using shell scripts or batch files, it is often the case that the script becomes too large to be developed further.
One has to rewrite the script in some other, more capable language. With Python, there is no such limit. Simple scripts look simple; more complex utilities are, well, more complex. Rather than changing the language, one may only need to change the attitude to the further development of the utility (i.e. start commenting well, use a versioning system like CVS, etc.).

Writing full-size applications. Again, the border between simple utilities and full applications is fuzzy. One will probably add a graphical user interface (GUI), add more checking of the user input (verification), and add better error diagnostics. When you accept the modular nature of Python applications, you will create modules that can be used the same way in simple scripts and in full applications.

[1] Python is said to be delivered with batteries included...

Let’s emphasize what was just said. Python is sometimes said to be a scripting language. But it would be better to say that it is a full-size language that also allows you to write simple programs that look as simple as, for example, shell scripts.

Writing large applications. When writing large applications, one is faced with many problems. The size of the application itself is one such problem. The bigger the application is, the more one has to focus on the design of the application, on maintainability, on testing, etc. When many users are expected to work with the application, performance is often the issue. All of the mentioned problems are language independent. But the language features may make solving the problems easier, more difficult, or even impossible. Using a language that gives you more power and control may seem like overkill at the beginning of the project (when the project is small and not expected to grow).
On the other hand, when choosing a simpler language (which is usually more productive), one may reach a limit of size or complexity of the project that makes further development very difficult. Try to imagine writing a DTP program as a Windows batch file.

The right tool for the job. It is often said that one should use the right tool for the job. (Almost) nobody would use scissors to cut wood. When building software, it may not be so clear whether the application will finally be small or whether it will grow in size and complexity. If you choose Python at the beginning, there is a good chance that you have chosen the right tool both for very simple applications and for quite complex ones.

1.3 In what projects was Python used with success?

If you are very new to Python, you may ask: Is it really worth learning? Are there some examples where Python proved its usefulness? Can someone competent compare the language with the alternative languages? This subsection presents some of the well-known companies and centers that have used Python successfully. If the language is good for them, it may also be good for you. The experience of people who started to use Python and who are well known in the community of programmers is summarised in section 5.

Let’s start with some references presented on the page http://www.python.org/Quotes.html. After the verbatim introductory citation, only a shortened summary is shown for selected cases:

    Python is used successfully in thousands of real-world business applications around the world, including many large and mission critical systems. Here are some quotes from happy Python users:

    Industrial Light & Magic: “Python plays a key role in our production pipeline. Without it a project the size of Star Wars: Episode II would have been very difficult to pull off...”
    Google: “Python has been an important part of Google since the beginning, and remains so as the system grows and evolves.”

    NASA: “NASA is using Python to implement a CAD/CAE/PDM repository and model management, integration, and transformation system which will be the core infrastructure for its next generation collaborative engineering environment. We chose Python because it provides maximum productivity, code that’s clear and easy to maintain, strong and extensive (and growing!) libraries, and excellent capabilities for integration with other applications on any platform. [...] Python has met or exceeded every requirement we’ve had.”

    EVE Online: “Python enabled us to create EVE Online, a massive multiplayer game with a scale never before seen in the industry, in record time. EVE Online server cluster, serves close to 10.000 simultaneous players in a shared space simulation, most of which is created in Python.”

    Thawte Consulting: “Python makes us extremely productive, and makes maintaining a large and rapidly evolving codebase relatively simple...”

    University of Maryland: “I have the students learn Python in our undergraduate and graduate Semantic Web courses. Why? Because basically there’s nothing else with the flexibility and as many web libraries.”

1.4 Interview with Guido van Rossum

Documentation and books about a programming language rarely give much background or context. They usually do not answer questions like What were the impulses for creating the language? or How did others react to, accept, or refuse some features?

Bill Venners, the man behind http://www.artima.com/, asks Guido van Rossum (alias GvR), the author of the language, various questions related to Python. In the six-part interview, Van Rossum describes the history of Python and gives insights into Python’s design goals, the source of Python programmer productivity, the implications of weak typing, etc. [2]
I highly recommend reading it:

Part I: The Making of Python (http://www.artima.com/intv/pythonP.html). GvR describes Python’s history, major influences, and design goals.

Part II: Python’s Design Goals (http://www.artima.com/intv/pyscaleP.html). GvR talks about Python’s original design goals, how he originally intended Python “to bridge the gap between the shell and C,” and how it eventually became used on large-scale applications.

Part III: Programming at Python Speed (http://www.artima.com/intv/speedP.html). GvR discusses the source of Python’s famed programmer productivity and the joys of exploring new territory with code.

Part IV: Contracts in Python (http://www.artima.com/intv/pycontractP.html). GvR discusses the nature of contracts in a runtime typed programming language such as Python.

Part V: Strong versus Weak Typing (http://www.artima.com/intv/strongweakP.html). GvR discusses the robustness of systems built with strongly and weakly typed languages, the value of testing, and whether he’d fly on an all-Python plane.

Part VI: Designing with the Python Community (http://www.artima.com/intv/pycommP.html, final installment). GvR discusses the importance of pythonic API design, the usefulness of intuiting performance, the value of experience and community feedback in design decisions, and the process of deciding how to evolve Python’s standard library.

2 Syntax and semantics (examples)

This section presents the very basic commands and constructions of the language. After reading it, you can do your first experiments with Python and write your first simple scripts. To try things quickly, you can run Python in the interactive mode: just type python without arguments on your command line [3]. You will see something like this:

    D:\SocratesIP>python
    Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

Then you write the commands just after the ‘>>> ’ prompt.
Lines not preceded by this sequence show the output of the previous command. We will see examples later.

It is usual to store Python scripts in files with the extension .py. Then you can execute them by just typing ‘python MyScript.py’. Depending on the operating system that you use, you can also make the script executable or you can associate the .py extension with the Python interpreter.

[2] The summaries are taken from the last part.
[3] Or, an even better idea is to start the Python Shell by running the IDLE application, which is a part of the Python distribution.

2.1 “Hello world!”

As is usual for scripting languages, simple things (from the script-writing point of view) should be done simply and easily. Displaying (or printing) text is one such simple thing. So the “Hello world!” program in Python looks like this:

    print "Hello world!"

The print command sends the string representation of its arguments to the standard output and appends the end-of-line automatically. A print without arguments produces just a new, empty line. You can also write several arguments separated by commas. Each comma produces one separating space in the output. The same behaviour as in the above case can be observed when we modify the command in the following way (no space in the substrings):

    print "Hello", "world!"

A comma added at the very end suppresses the appended end-of-line. This means that the following example also produces the same result:

    print "Hello",
    print "world!"

The print command is designed to be simple to use. You will discover that the things that may look unusual at the beginning are well designed to easily produce readable output. You may think that the described behaviour is very limiting in some cases, namely when you want to control the form of the output. However, print is only one of the built-in commands that happens to use the standard output capabilities. You can do the output differently and format the result up to the last byte.
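As a small illustration of doing the output differently, the following sketch writes to the standard output object directly instead of using print. The helper name write_exact is made up for this text; sys.stdout.write() is a standard call that writes exactly the characters it is given, with no separating spaces and no automatic end-of-line.

```python
import sys

def write_exact(parts, out=sys.stdout):
    """Write every part verbatim and return the number of characters written."""
    written = 0
    for part in parts:
        out.write(part)        # no separating space, no automatic end-of-line
        written += len(part)
    return written

# Three pieces, no separators added; the end-of-line is written explicitly.
write_exact(["Hello", " ", "world!", "\n"])

# A number formatted to a field of eight characters, two of them after the decimal point.
write_exact(["%8.2f" % 3.14159, "\n"])
```

The out parameter is only an assumption of this sketch; it lets the same helper write to any file-like object, not just to the standard output.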
Another possibility is to control the formatting of the string that is used as the argument. In Python, you can concatenate strings using the ‘+’ (plus) operator (notice the space at the end of the first substring):

    print "Hello " + "world!"

And you can also use the ‘%’ operator to produce a formatted string from a format string and a tuple of arguments. The format and the behaviour are very similar to the sprintf() function in the C language. The following example shows how it would look in the interactive mode:

    >>> print "Let's try addition: %i + %i = %i" % (2, 3, 2+3)
    Let's try addition: 2 + 3 = 5
    >>> "Let's try addition: %i + %i = %i" % (2, 3, 2+3)
    "Let's try addition: 2 + 3 = 5"

Notice the second version, which does not use the print command. When you simply use a value without context in the interactive mode, the Python interpreter displays the string representation of the value, here enclosed in quotes to emphasize that the value is of the string type. This is very handy when you do some experiments: there is no need to type print all the time.

To get the formal string representation of the value, Python in the interactive mode calls the built-in function repr(). The formal string representation is expected to look such that, if it were copied into the source code, it would be a correct expression for creating the value. That is why we see the quotes around the string. The print command, on the other hand, uses the built-in function str() to convert the argument values into an informal textual representation (no quotes around the string):

    >>> 'xyz'
    'xyz'
    >>> str('xyz')
    'xyz'
    >>> repr('xyz')
    "'xyz'"
    >>> print 'xyz'
    xyz
    >>> 5
    5
    >>> str(5)
    '5'
    >>> repr(5)
    '5'
    >>> print 5
    5
    >>> '5'        # Note: This is a string that looks like a number
    '5'
    >>> int('5')   # You can explicitly convert the string to a number.
    5

The ‘#’ (hash) character, used in the last two commands, starts a comment that continues until the end of the line.
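The claim that the formal representation can be pasted back into source code can be checked with a few lines. This is only a sketch: the helper name round_trips is made up for this text, the code is written in Python 3 syntax (print is a function there, unlike in the Python 2.3 sessions above), and the round trip is only expected to hold for simple built-in values such as the ones listed.

```python
def round_trips(value):
    """Return True if evaluating repr(value) recreates an equal value."""
    formal = repr(value)            # formal representation, e.g. "'xyz'" for the string xyz
    # eval() is acceptable here only because we produced the text ourselves.
    return eval(formal) == value

for sample in ['xyz', 5, 2.5, ('a', 1), [1, 2, 3]]:
    # Show the formal form, the informal form, and whether the round trip works.
    print(repr(sample), '|', str(sample), '|', round_trips(sample))
```

For the string sample, the formal form carries the quotes and the informal form does not, which is exactly the difference between typing a value at the prompt and printing it.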
Strings can be enclosed in double quotes or in single quotes. The following two commands produce the same result:

    print "Hello world!"
    print 'Hello world!'

2.2 Variables are names bound to objects

Defining and working with variables is a very basic feature of all programming languages. Python is no exception. It is a very normal language, nothing extremely exotic.

In compiled languages, variables are usually thought of as cells in memory that can contain data of some type (and of the related size). The names are lost during compilation. They are replaced by more machine-related means inside the generated code: by memory addresses and references to registers.

In Python, we should think of variables as names bound to objects. In other words, objects are the things that matter. The names are used only to identify them from outside. The objects do not know their names, but they can determine their unique identity and other properties. Even numeric constants are treated as objects in Python. You can use the built-in function id() to determine the unique identification of each object.

In Python, one can easily list the names available in a given part of the application. This is a rather typical feature of interpreted languages [4]. It is possible to take the name of a variable (or of a function) as a string and then use that string to get the value of the variable with that name (or to call the function with that name). You can even ask the user for the name of a variable and then display the value of the variable, or display a message that the variable does not exist.

When assigning to a variable, we only bind the name to the object. The following fragment illustrates an interactive Python session:

    >>> x = 1
    >>> y = 1
    >>> print id(1), id(x), id(y)
    7952424 7952424 7952424
    >>> y = 2
    >>> print id(1), id(x), id(y)
    7952424 7952424 7952412

The results after the third line show the same identifications.
The names x and y are bound to the same object, the integer constant 1. When y is assigned 2, the different identification shows that it is now bound to another object.

2.3 Types are related to objects, not to variables

In some compiled languages (like C++), the closest thing to Python variables would be references. One can think of them as automatically dereferenced pointer variables that must never be assigned the empty pointer value (NULL). When using a reference variable, it looks as if you were working with the object itself, as if the same object had more than one name.

However, there is one important difference when compared with Python. Python variable names represent no type. The type is bound to the objects, not to the variable names. While the difference may look subtle at first, it has important consequences for building data structures and for passing arguments to functions.

[4] But it does not make compiled languages worse. They are simply different, possibly more suitable for some tasks.

We can think of Python references as references to objects of any type. You can insert a reference into a Python container regardless of the type of the object. You can pass an argument of any type to a function. This gives Python an interesting and very useful flexibility that cannot be imitated so easily in compiled languages.

Still, Python should be considered a typed language. For example, you cannot use an integer where a string is expected or vice versa. There are no automatic conversions of that kind built into the language (compare it with Basic). The object type can be determined at runtime using the built-in function type().
This way your functions may decide how to process the passed arguments depending on their type:

    >>> x = 1
    >>> y = 1.0
    >>> z = 2+3j
    >>> print type(x), type(y), type(z)
    <type 'int'> <type 'float'> <type 'complex'>
    >>> y
    1.0
    >>> y = 'some string'
    >>> y
    'some string'
    >>> type(y)
    <type 'str'>
    >>> print type(x), type(y), type(z)
    <type 'int'> <type 'str'> <type 'complex'>

Here you can see that the existing variable y was assigned a value of a different type without problems. This fits the definition that y is only a name without a type. The name was simply bound to another object.

2.4 Built-in types

As announced earlier, all values, regardless of whether they are of built-in or other types, take the form of objects. Some objects may be simpler (like an integer constant), some may be more complex (like strings, lists, dictionaries). Once an object is created, it either can or cannot change its state, depending on its type. Using Python terminology, we say that some types (or better, objects of the type) are mutable and others are immutable. For example, an integer constant can never be changed, because 1 has to remain 1, 2 has to remain 2, etc. But strings are also immutable in Python, while they usually can be modified in some other languages. When modifying a string in Python, we actually get another string object that is filled with the modified content of the original.

2.4.1 Simple types

Objects of simple types are... well, simple. We usually do not think of them as having some recognizable internal structure (from the abstract point of view).

None represents one of the simplest values. If the pass [5] simple statement represents no action, then None represents no data. That type has only a single value and there is only one object of that type in the system. It can be accessed through the built-in name None.

Numbers can be: plain integers (at least 32 bits), long integers (unlimited precision), floating point numbers (double of the C language), and complex numbers (two floating coordinates).
Booleans are treated as a subtype of plain integers [6].

No character type. There is no type for a single character. Instead, strings of length one are used for the purpose (see below).

[5] See later in the text...
[6] You can try print 1 + True; you should get 2. The reasons for treating booleans this way are probably partly technical and partly historical. The bool type was introduced just recently, in Python 2.3. Treatment of integers in the boolean context (e.g. in conditional expressions) is influenced by the C language.

2.4.2 Sequence types

When we have a (structured) value of a sequence type, we can think of it as a container of elements. The elements can be accessed directly, or we can iterate through the container; this produces a sequence of the values of the elements in a well-defined order, hence sequence types.

Some of the sequence types are immutable, which means that none of their elements can be changed. However, we can easily create a modified copy of an immutable sequence. We can even convert the content of an immutable type to a mutable type [7].

Immutable sequence types are string, unicode, and tuple. We can think of a string as a string of bytes, but it is often used for text strings. The unicode type is really related to texts. Tuples contain zero or more elements that cannot be changed later.
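The remark about modified copies and conversion to a mutable type can be sketched in a few lines. The variable names are only illustrative and the code uses Python 3 syntax for print; the calls themselves (the replace() method of strings and the built-in functions list() and tuple()) are standard.

```python
s = 'this is a string'
s2 = s.replace('string', 'text')   # returns a new string; s itself is untouched

t = ('some string', 5)
lst = list(t)                      # mutable list with the same elements in the same positions
lst[1] = 6                         # allowed: lists support item assignment
t2 = tuple(lst)                    # back to an immutable tuple, now holding the changed element

print(s, '|', s2)
print(t, '|', t2)
```

In both cases the original object keeps its value; the change is only visible in the newly created object.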
We can determine the length of a sequence (the number of elements inside) using the built-in function len():

    >>> s = 'this is a string'
    >>> s
    'this is a string'
    >>> len(s)
    16
    >>> type(s)
    <type 'str'>
    >>> print s
    this is a string
    >>> us = u'this is a unicode string'
    >>> us
    u'this is a unicode string'
    >>> len(us)
    24
    >>> type(us)
    <type 'unicode'>
    >>> print us
    this is a unicode string
    >>> t = ('some string', 5, ('nested tuple', 'xxx'))
    >>> t
    ('some string', 5, ('nested tuple', 'xxx'))
    >>> len(t)
    3
    >>> type(t)
    <type 'tuple'>
    >>> print t
    ('some string', 5, ('nested tuple', 'xxx'))

The elements of the sequence types can be accessed directly via zero-based indices. Sequences behave as arrays:

    >>> s[2]
    'i'
    >>> us[0]
    u't'
    >>> t[2]
    ('nested tuple', 'xxx')
    >>> t[2][1]
    'xxx'

The immutable types do not allow assignment to the elements:

    >>> s[2] = 'x'
    Traceback (most recent call last):
      File "<pyshell#12>", line 1, in -toplevel-
        s[2] = 'x'
    TypeError: object doesn't support item assignment

[7] For example, we can convert an immutable tuple to a mutable list with the same element values in the same positions.

We can use negative indices for indexing from the end of a sequence. For example, s[-1] returns the last character of the string s.

There is also a mechanism for accessing larger parts of sequence objects. It is called slicing. We give the range of indices that defines the part. The first index says where the subsequence starts; the second index (separated by a colon) belongs to the first element that will not be included in the subsequence (the interval is open on the right). A third value can also be used (again separated by a colon); it sets the step. Any of the indices can be written as negative. If an index or the step value is omitted, it takes the implicit value.
The implicit value for the first index is zero, the implicit value for the second index points just behind the last element, and the implicit value for the step is one:

    >>> s
    'this is a string'
    >>> s[5:-3]
    'is a str'
    >>> s[5:]
    'is a string'
    >>> s[:-3]
    'this is a str'
    >>> s[-1::-1]      # the reversed string
    'gnirts a si siht'
    >>> s[::2]         # every second letter from the beginning
    'ti sasrn'

Mutable sequence types. The only built-in mutable sequence type in Python is the list. It allows indexing, but it also defines methods that allow a list to be used as a stack or a queue:

    >>> lst = []
    >>> lst.append(5)
    >>> lst
    [5]
    >>> lst.append(3)
    >>> lst
    [5, 3]
    >>> lst.append(6)
    >>> lst
    [5, 3, 6]
    >>> lst.pop()
    6
    >>> lst.pop()
    3
    >>> lst.pop()
    5
    >>> lst
    []

An object of the list type is created using a special constructor syntax. The square brackets are to the list what the single or double quotes are to the string type. An empty list was created in the previous example, but we can also type in the sequence of elements that will form the initial list content.

The append() and pop() are said to be methods of the lst object. You can think of a method as a function that takes the name in front of the dot as its first argument and what is enclosed in the parentheses as the other arguments.

Looking at the example, you may complain: But I am used to the push() operation instead of append()! In section 4.1, we will create our own stack class and object that will use the list as the base class and that will add some operations used for stacks.

2.4.3 Mapping types

If the well-defined order of elements (the sequence) is typical for the sequence types, then a mapping type probably has different features. (Not so difficult to deduce, is it?) Mapping types represent another kind of container, where the order of elements is not important. The main feature of such a container is the use of a key for fast searching of the associated value.
Because of this purpose, we often call the container a search table or an associative array. When implementation details are taken into account, it is sometimes called a hash table.

There is currently only one mapping type in Python. It is called... Dictionary. It is implemented as a hash table. This means that the keys are transformed to special signatures that are used as a kind of index into special data structures. The technical goal of the transformation is to find the place where the key/value pair is stored as fast as possible. The logical purpose of a dictionary is to find out as fast as possible whether the key/value pair is present in the dictionary, or to get the value associated with the key.

A dictionary object is written as a comma-separated list of items enclosed in curly braces. A key and its value are separated by a colon:

    >>> d = { 'John': 4568, 'Max': 5065, 'Petr': 8989 }
    >>> d
    {'Max': 5065, 'John': 4568, 'Petr': 8989}
    >>> type(d)
    <type 'dict'>
    >>> len(d)
    3

The dictionary type, like other Python containers, also supports iteration. We can easily go through all items in the dictionary. We can also get the list of all keys, the list of all values, or the list of all items (i.e. key/value pairs):

    >>> d.keys()
    ['Max', 'John', 'Petr']
    >>> d.values()
    [5065, 4568, 8989]
    >>> d.items()
    [('Max', 5065), ('John', 4568), ('Petr', 8989)]
    >>> 'Jane' in d
    False
    >>> 'Max' in d
    True
    >>> d.has_key('Max')
    True
    >>> d.has_key('Jane')
    False
    >>> d['Petr']
    8989
    >>> for name in d:     # to be explained later
            print name, 'has phone no.', d[name]

    Max has phone no. 5065
    John has phone no. 4568
    Petr has phone no. 8989

As you may have noticed, we can use the key as if it were an index. If the key is present, we get the associated value. A dictionary is a mutable type.
We can modify the existing value for a key, and even add new items, by simply assigning as if we indexed an array element by the key and assigned it the value:

    >>> d['Jane'] = 3333
    >>> d
    {'Jane': 3333, 'Max': 5065, 'John': 4568, 'Petr': 8989}

If we are not sure that the item is present inside the dictionary, we can use the get() method. It works the same way as indexing by the key, but we can use the second argument: the default value that will be returned when the key is not present in the dictionary:

    >>> d.get('Marry', 'no number assigned')
    'no number assigned'

We can also remove an item using the del command:

    >>> d
    {'Jane': 3333, 'Max': 5065, 'John': 4568, 'Petr': 8989}
    >>> del d['John']
    >>> d
    {'Jane': 3333, 'Max': 5065, 'Petr': 8989}

Note: Python’s dictionary type is heavily used by the Python core itself. Because of that, the authors tuned the performance of the dictionary type very carefully.

2.5 Objects are garbage collected

An object lives as long as we want to know about it. Once we remove the last binding to the object (i.e. we forget the last reference), the object can never be accessed again and will be removed when the system needs its space. We say that Python uses automatic memory management.

And how do we make Python forget a reference? We can just bind the name to another object. A very suitable object for doing that is created internally by Python and is bound to the built-in name None:

    >>> s = 'this is a string'
    >>> s
    'this is a string'
    >>> type(s)
    <type 'str'>
    >>> id(s)
    10142648
    >>> None
    >>> type(None)
    <type 'NoneType'>
    >>> id(None)
    504131400
    >>> s = None
    >>> s
    >>> type(s)
    <type 'NoneType'>
    >>> id(s)
    504131400

Another way to forget the object is to use the built-in command del. It not only removes one reference from the bound object, it also removes the name from Python’s collection of names.
After that, using the name produces an error:

    >>> del s
    >>> s

    Traceback (most recent call last):
      File "<pyshell#32>", line 1, in -toplevel-
        s
    NameError: name 's' is not defined

In both cases above, the string object with the value 'this is a string' is no longer bound to any name after the operation. Because of that it can be garbage-collected (i.e. removed) from the space where objects are kept. The interpreter decides when this happens.

2.6 Simple statements and built-in function

I will be rather informal when describing the statements and how they behave. If you want to know more details, read the Python Tutorial or consult the Python Language Reference. You can access them from the Python home page, but they are also an integral part of the Python distribution. (So you will have them on your computer.)

Simple statements, simply said, are the active parts of the code that can be thought of as steps (actions) and that do not take the form of function calls. Either they use some keyword (like return, raise, assert), or they use some special syntax (like expressions). The simple statements are built into the language. Some of them were used in the examples above: print, assignment, and del.

A programmer cannot introduce new features into the language that look syntactically like simple statements. For example, you cannot write a myPrint that would not require parentheses around its arguments.

Let us show only one of the simple statements here, probably the simplest one to understand: pass. It simply does nothing. So why is it there? Occasionally you need to type some command into your source text because otherwise the interpreter would complain. For example, you know that you will create some function, and you want to use it immediately, but its body will be written later. When defining a function, Python forces you to type in some body.
So, you just type pass instead of the body, and both you and Python will be happy. The command could have been named 'ignoreIt', but the authors wanted to save your fingers: pass is shorter to type ;) Your empty function may then look like this:

    def myFunction(x): pass

You will find more info about function definitions in section 4 (p. 20).

Notice that the above example defines the function on one line. It is possible but not usual. Usually you want to write more statements that form the body of the function. We say that you want to write a block of commands or a block of statements. Here Python is a bit different from other languages; see the next topic.

Built-in functions are functions that are predefined by the authors of the language. They are called like any other function. In the above examples, we have used id() and type(). A programmer can write new functionality that can be used the same way as a built-in function: she simply defines a function (see 2.9, p. 15).

2.7 Block-of-commands, indentation, readability

Python uses an unusual way of marking a block of commands: the commands at the same level use the same indentation (from the left) in the source code [8]. It would not have surprised programmers in the times when Fortran was The King, but it has been unusual since the Algol language was born. Programmers got very used to free formatting with block delimiters like begin/end or {/}. When they see Python for the first time, they often perceive this feature as bad or ugly, or they find it strange. This was also my case, but I got used to it very quickly. That seems to be quite general; see also Eric Raymond's experience in section 5.1 (page 24).

The bias against not having block braces is probably not based on their necessity. When writing source code in other languages (like Pascal, C++, etc.), authors want to make it readable. Because of that, they indent the blocks anyway; everyone would agree.
But the existence of block braces may lead to different opinions on how the braces should be placed with respect to the block of commands. Different people use different styles. Consequently, the sources may look a bit different from case to case. One has to get used to the style to read the source fluently.

The experience with Python shows the opposite. At a Python conference, Bruce Eckel summarized it when talking with Guido van Rossum with the words "Life is better without braces," which became the slogan of the conference. The good consequence is that all Python programs look similar and are easy to read.

[8] It is recommended to use 4 spaces for each indentation level, but it is not enforced. If you use bad indentation (when a line should be aligned with the previous one but is not), Python will report it.

2.8 Programming constructions

In this subsection, we will focus on conditional branching and loops. When explaining the for loop, the basic form of the construction for capturing exceptions will also be presented. Let us start with conditional branching, which belongs to the key inventions related to computer programming. Nobody can remember the time when it was discovered, as it was about 200 years ago.

2.8.1 Conditional branching: if-elif-else

In almost all languages, the syntax for conditional branching starts with the if keyword. Here is how the construction is written in Python:

    if condition:
        myFunction(1)       # the example of block of statements
        myFunction(2)
    elif condition2:        # equivalent to else-if, but simpler indentation
        statement(s)        # here the block of statements can be placed...
    else:
        statement(s)        # ... and also here.

The elif and else branches can be omitted. While the if and else keywords are usual in other languages too, elif is somewhat more Python-specific. In other languages you can type 'else if', because you do not need indentation to express what is the block of commands and what is its parent line.
Python also does not have an equivalent of the case or switch command. If you want a control structure that tests for one of many cases, you have to use a series of if-elif-else branches [9]:

    if x == 1:
        print 'This is the first case.'
    elif x == 2:
        pass
    elif x == 3:
        print 'This is the third case.'
    elif x == 5:
        print 'This is the case numbered as 5.'
    else:
        print 'Some other case:', x

Without elif, the program source would 'run to the right', which is not what we really want:

    if x == 1:
        print 'This is the first case.'
    else:
        if x == 2:
            pass
        else:
            if x == 3:
                print 'This is the third case.'
            else:
                if x == 5:
                    print 'This is the case numbered as 5.'
                else:
                    print 'Some other case:', x

Sure, it expresses the order of testing the conditions, but it does not emphasise the abstract idea that the cases may sometimes be equally important. We have the choice to use the form that seems better for expressing our goal.

[9] If you really need more efficient searching for the cases, you can associate the case values with the code that solves them through Python's built-in hash tables, also known as dictionaries, to be discussed later in the text.

2.8.2 The loop structure: while

The while command is present in probably all procedural languages. Python is no exception. Without more words:

    i = 0
    while i < 10:
        print '%i ** 2 = %2i' % (i, i ** 2)    # equal to i * i
        i += 1                                 # equal to i = i + 1

The following output is produced:

    0 ** 2 =  0
    1 ** 2 =  1
    2 ** 2 =  4
    3 ** 2 =  9
    4 ** 2 = 16
    5 ** 2 = 25
    6 ** 2 = 36
    7 ** 2 = 49
    8 ** 2 = 64
    9 ** 2 = 81

2.8.3 Iteration through all items: for

In classic languages like Pascal, the for construction is used for counted loops where the range is known in advance. It can also be used for enumerated types, but basically there is always a variable that is set to the next integer value in the range. The for-loop variable is often used for indexing an array.
In C++, which always wanted to be backward compatible with the much older C, the for-loop is just a construction for writing the while-loop more conveniently. It is more low-level in comparison with Pascal.

Python took a different, more modern approach that came into use when object-oriented languages appeared. They introduced the concepts of containers and iteration through them to a wider audience [10]. The key idea is that we mostly want to loop through all elements contained in some bigger structure. From that point of view, an array with elements indexed by a continuous sequence of integer values is only a special case. An example of Python's for construction follows:

    for elem in container:           # 'for' and 'in' are keywords
        process(elem)                # example of a statement
        someOtherStatements(elem)    # example of a statement

The variable elem, which follows just after the keyword for, acts as a window that we use to look inside the container. During loop processing, the variable is assigned each value from the container in turn. You can think of for as for each.

The for construction can be used with all built-in Python containers, that is, all sequence types (strings, lists, tuples, and others) and mapping types (dictionary), because they support iterators. If you write your own objects that support iteration, you can also use them as containers in the for loop.

For cases when you really need integer indices, you can use one of the built-in functions range() or xrange(). The range() function returns a list filled with the integers (i.e. it returns a container that can be used in the for construction). The xrange() function takes the same arguments [11]. The only difference is that xrange() does not create a list filled with the values. It returns a special object that also supports iterators. Through that iterator it behaves as if it were a sequence of the numbers, but the object has the same (small) size regardless of the range.
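The remark that one for construction serves every kind of container, including your own objects, can be tried out with a small sketch. The helper collect() and the class Countdown are made up for this illustration. Countdown supports iteration by defining __iter__() with yield, a generator technique that section 2.9 explains. Here print is given a single parenthesized argument so that the sketch also runs on Python 3.

```python
class Countdown(object):
    """Hypothetical container: iterating over it gives n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        i = self.n
        while i > 0:
            yield i
            i -= 1


def collect(container):
    """Gather everything that a for loop sees in the given container."""
    seen = []
    for elem in container:
        seen.append(elem)
    return seen


print(collect('abc'))            # string: one character at a time
print(collect([10, 20, 30]))     # list
print(collect((1.5, 2.5)))       # tuple
print(collect({'Max': 5065}))    # dictionary: the loop sees the keys
print(collect(Countdown(3)))     # user-defined object
```

The loop inside collect() never asks what kind of container it was given; it only relies on the container supporting iteration.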
The following examples show the usage of range() and xrange() for creating counted loops:

    >>> for i in range(5):
            print i

    0
    1
    2
    3
    4
    >>> for i in xrange(1, 6):
            print i

    1
    2
    3
    4
    5
    >>> for i in xrange(0, -5, -2):
            print i

    0
    -2
    -4

The first example shows the classical usage of range(), where a single argument defines the number of steps, starting from zero. The second example uses xrange() with two arguments: the start value and the value one past the last. The third example uses all three possible arguments: start, one past the last, and step.

[10] The concepts were probably designed much earlier, but the existence of object-oriented languages made their usefulness apparent.
[11] Only the first argument is obligatory.

2.9 Functions and generators

Functions, as in other languages, capture parts of algorithms that can be parametrized, but they are not bound to internal data structures that persist between two function calls [12]. Simply said, they contain the code to be interpreted, but do not contain data. If they work on data structures, then the data structures are external. Functions manipulate data that are determined by arguments.

Python functions can return values, which may be ignored (as in the C language). In other words, they replace both functions and procedures as defined by the Pascal language. Function definitions in Python start with the keyword def:

    def myFunction(a, b, c = 1):
        return a + b + c      # just an example of using the values of arguments

The body of the function is indented. The return statement is used to return the value. The third argument in the example has the default value 1. Because of that, we can call the function with only two arguments; for example, myFunction(1, 2) returns 4.

Python functions are not extremely special, but remember how values are assigned to variables. The arguments are local variables, and as such their type is not defined. They are just bound to the objects that are passed as arguments.
This gives Python functions rather more flexibility than functions in compiled languages. The flexibility of Python functions can be roughly compared with templates in C++. The following example of an interactive Python session illustrates the case:

    >>> def f(a, b):
            return a + b

    >>> f(1, 2)
    3
    >>> f('one string-', '-second string')
    'one string--second string'

Because the arguments can be of any type and because the operator '+' is defined both for integers and strings, the same function body can produce different results for different types of arguments.

[12] Compare this with class methods, which usually work with the member data structures of the object.

Generators are special functions that return an iterator object. Each time the iterator is activated by calling its method next(), it returns the next value from the generator. So the generator behaves as a sequence of values. The generator object saves the state of its local variables when returning a value. When called the next time, it resumes just after the command that was used to return the value in the previous call.

Generators are defined the same way as functions. The only syntactic difference is that they use the yield command instead of return. With generators supported in the language, it is easy to implement behaviour similar to (for example) xrange(). The following example shows how to write a PascalRange() function that can be used to simulate the for-loop in Pascal (closed range; both boundary values of the interval are generated). Here is a snapshot of the interactive session:

    >>> def PascalRange(low, high):
            i = low
            while i <= high:
                yield i          # notice the yield instead of return
                i += 1

    >>> for x in PascalRange(2, 7):
            print x

    2
    3
    4
    5
    6
    7

It looks simple when used, doesn't it? It is a good time to illustrate how iterators work.
Our PascalRange() generator defined above creates an object that stores the state (low, high, i, and the spot in the code where it will continue when accessed the next time through the iterator). It also creates the iterator object that can move through the object. Each iterator must define the next() method [13]. When the next() method is called, the next expected value is returned. If there are no more values to be returned, the standard exception named StopIteration is raised [14]. See the example of using our generator during an interactive Python session:

    >>> myIterator = PascalRange(2, 7)
    >>> myIterator.next()
    2
    >>> myIterator.next()
    3
    >>> myIterator.next()
    4
    >>> myIterator.next()
    5
    >>> myIterator.next()
    6
    >>> myIterator.next()
    7
    >>> myIterator.next()

    Traceback (most recent call last):
      File "<pyshell#11>", line 1, in -toplevel-
        myIterator.next()
    StopIteration

What about calling it in an infinite loop [15]?

    >>> myIterator = PascalRange(2, 7)
    >>> while True:
            x = myIterator.next()
            print x

    2
    3
    4
    5
    6
    7

    Traceback (most recent call last):
      File "<pyshell#38>", line 2, in -toplevel-
        x = myIterator.next()
    StopIteration

[13] See section 4.1, p. 20 for an explanation of what a method is.
[14] Since Python 2.3, you can call the next() method even after the first StopIteration exception. You should get the StopIteration exception again, and again. Otherwise, the implementation of the iterator is considered broken.
[15] Here True is a boolean value. When the while loop is given such a condition, it will loop forever, unless explicitly stopped by the break statement or by an unhandled exception.

Well, it behaves almost like our for! The infinite loop is not infinite, because it caused the exception and the loop was stopped. We only need to get rid of the exception message that says that we cannot iterate further through the object. The time has come to show how the raised exception can be captured by the surrounding code. We simply wrap the code in a try-except construction.
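Before applying it to the iterator, the bare shape of the construction can be sketched on a simpler case. The dictionary phones and the function lookup() are made up for this illustration; KeyError is the built-in exception raised when a dictionary is indexed by a key that is not present.

```python
phones = {'Max': 5065, 'Petr': 8989}


def lookup(name):
    """Return the phone number, or None when the name is unknown."""
    try:
        number = phones[name]     # raises KeyError for a missing key
    except KeyError:
        return None               # the exception is handled here and goes no further
    return number


print(lookup('Max'))
print(lookup('Jane'))
```

The statements after try are executed normally. Only when one of them raises the named exception does control jump to the except branch.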
Knowing that, we can rewrite our for construction using Python's while:

    >>> myIterator = PascalRange(2, 7)
    >>> while True:
            try:
                x = myIterator.next()
                print x
            except StopIteration:
                break

    2
    3
    4
    5
    6
    7

So the loop does the same as what we did manually in the previous example. The only difference is that the StopIteration exception is not displayed on the screen; it is used as the signal that the iteration through the object has finished. In our case, it is not treated as an error. We just do not allow the exception to bubble up to the runtime code, and process it immediately (so the information about the exception disappears from the system).

Generally, we do not know how many calls of next() will return a value. Waiting for the exception is the way to discover that we have looped through all values.

Well, the problem was rather simplified. But basically, the magic of the for loop has been revealed. Still, the for construction is easier to use, read, and think about.

This article should not replace tutorials or the Language Reference. Some more details related to syntax and semantics are covered in section 4 (p. 20). Now, some general features of Python will be presented.

3 Features of the language

With the knowledge from the previous text, you should be able to create your first simple scripts. Let us focus on something other than the code. A language is not only syntax and semantics.

3.1 Interpreted but semicompiled language

While Python belongs to the interpreted languages, the source code is precompiled to byte-code. When Python is required to run code from a text source file, the source text is converted into byte-code that can be stored in the related precompiled file (with the same name and with the .pyc extension), or is simply kept in memory. The byte-code can be interpreted much faster than was usual in the old interpreted-BASIC days, when each line of the source was read and parsed many times.
Python is one of the quite fast interpreted languages, which increases its usability for everyday tasks.

3.2 Python is extensible

One of the major programming units in Python is the module. It can be written in Python itself. If some special functionality is required, the module can be written in the C or C++ language. This helps when a performance bottleneck is found in a project: the computationally intensive part can simply be rewritten in the compiled language. Still, the interface of such a module, when used from a Python program, looks exactly the same as if it were implemented in Python.

The idea of extensibility via Python modules may seem natural. The extensibility via modules written in C/C++ turned out to be a very good idea. There are a lot of useful libraries written in C or C++. One can simply create a Python wrapper around the existing code and use it as if it had always been implemented in Python. Here performance may not be the most important goal. It is simply more effective to wrap the existing code than to rewrite it in another language (with new errors included).

The extensibility via modules written in C/C++ can be viewed as one way of gluing existing code together. Another way of gluing existing code together is using existing utilities from inside Python programs.

3.3 Ready (also) for object oriented programming

One can use Python for projects ranging from complexity near to none (e.g. the simplest scripts that contain only a sequence of a few commands) to very complex projects where objects cooperate to do the task. From that point of view, it is important to emphasise that object-oriented programming (OOP) in Python is very natural [16], but it is not forced. When writing your program, you can still think in terms of variables and functions, no problem.
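As a rough sketch of what "natural but not forced" can mean in practice, the same small task is written below twice: once with a plain function working on external data, and once with a class that keeps the data inside the object. The names sum_prices and Basket are made up for this illustration, and class definitions themselves are explained in section 4.1.

```python
# Plain variables and functions: the data lives outside the function.
def sum_prices(prices):
    result = 0
    for p in prices:
        result += p
    return result


# The same task with an object that keeps the data inside itself.
class Basket(object):
    def __init__(self):
        self.prices = []

    def add(self, price):
        self.prices.append(price)

    def total(self):
        return sum_prices(self.prices)    # the method reuses the plain function


prices = [12, 30, 8]
print(sum_prices(prices))

basket = Basket()
for p in prices:
    basket.add(p)
print(basket.total())
```

Both styles can live in the same program, and the class here simply builds on the function instead of replacing it.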
3.4 Object oriented from inside

The possibility of writing in the object-oriented way in Python is not very surprising once you discover (in the standard documentation) that everything inside Python is represented as objects. Even numeric constants take the internal form of objects. Functions are compiled into objects that contain a code object (the byte-code without context) and other objects that store the context of the function, etc.

The internal structure of Python was probably strongly influenced by the fact that Python was designed (in the late 1980s) as a scripting language for the Amoeba project, a distributed operating system. The usage of local memory on remote computers in such systems makes things very complex. Thinking in terms of objects is then very natural. The identity of objects may be expressed differently than in terms of memory addresses, which makes it much easier to handle migration of objects. The cooperating objects also separate their internal code and data from the rest of the distributed application. They may even be autonomous and run in parallel. Objects are simply natural to distributed systems and to loosely coupled parallel systems.

3.5 Python is a multiparadigm language

I would like to avoid some misunderstanding and to emphasise one important thing related to classes on one side and functions on the other. You may find discussions on the web and in newsgroups claiming that some language is worse than another... because it is not purely object-oriented. In Python, everything is internally kept as objects. Python allows you to implement your problem using the object-oriented approach. But Python does not prefer classes to functions! Python supports what is called multiparadigm programming. You can decide to implement one part in the object-oriented way.
You can decide to use the plain old structured-programming approach (separated functions and data [17]). From that point of view, Python is more similar to C++ than to Java, for example.

[16] To compare: if you know Perl, writing a class and creating an object there is not very natural.

In my opinion, the multiparadigm support is what makes Python (and C++, and other languages) a great language. The brevity and clarity (in comparison with C++) make Python attractive as a teaching language and as a language for solving practical problems. Another source of "having fun" when working with Python is the consequence of 'no compromise', one of its key design concepts. The authors always aim to help the programmer, not the machine.

Experience (also in the Java world) shows that using functions to solve a problem is sometimes more natural than using classes and their methods to do the same. Yes, functions were used in the older structured-programming approach to manipulate external data. But does that make them less useful for the more modern object-oriented programming?

From the mathematical point of view, functions are used to transform input arguments into a result. They are an abstraction for the transformation of something external. From that point of view, functions can be nicely used to manipulate even objects. You can simply pass whole objects to a function, not only some data structures. The standard algorithms library of the C++ language is a real, industrial-strength example of such an approach. It had to pass serious discussion and reasoning before being accepted into the standard. The C++ language is known not to accept anything that can be simply and effectively replaced by other means.

Classes and objects appeared in the world of computer programming to simulate real systems, to define the internal states of the subsystems and their interactions. They are also very useful for building software that is not written for simulations of that kind.
In some sense, we try to simulate abstract systems. Because of that, using a pure object-oriented language to solve a problem can be better than using a pure structured-programming approach. However, does that make plain functions obsolete?

In pure object-oriented languages, where free-standing functions are not allowed, one is forced to create classes with no data and with static methods only, to make up for the lack of functions. Technically, it is correct. But it is not natural. Programming languages are here to ease the transformation of ideas from inside our heads into a solution that can be interpreted by the machine. Programmers are not here to compete in how cleverly they can express indirectly an idea that could be expressed directly. The language should make expressing the idea as easy as possible.

So, my conclusion (or my view) is that a modern object-oriented language should also support plain functions. Modern languages should support multiparadigm programming. Some people complain that Python should not be called a pure object-oriented language. Then the question should be reversed: are pure object-oriented languages the best tool for solving real problems?

3.6 Using exceptions for consistent error handling

Error handling is one of the problematic parts of many older programming languages. The problem is that error handling was often not built into the language. Instead, some rules of correct implementation were defined by programmers to cope with errors. But programmers were not forced to handle the errors. For example, many functions from the standard libraries of the C language return a value that says whether the function succeeded or not. But the return value can be ignored, and it often is. Because of that, some programs may fail in some special situations, and it is difficult to find out why. It may be very difficult to reproduce the situation in which the erroneous behaviour is observed.
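The difference between the two styles can be sketched in Python itself. Both functions below are made up for this illustration: parse_port_status() imitates the C habit of returning a success flag next to the value, while parse_port() signals the problem with the built-in ValueError exception, the mechanism that the following paragraphs describe.

```python
def parse_port_status(text):
    """C-like style: return (ok, value); the caller is free to ignore 'ok'."""
    if text.isdigit():
        return True, int(text)
    return False, 0


def parse_port(text):
    """Exception style: a bad input cannot pass unnoticed."""
    if not text.isdigit():
        raise ValueError('not a port number: %r' % text)
    return int(text)


# The status flag is never checked, so the bogus value 0 is silently used.
ok, port = parse_port_status('eighty')
print(port)

# Here the error must be handled explicitly, otherwise the program stops.
try:
    port = parse_port('eighty')
except ValueError:
    port = None
print(port)
```

In the first call nothing forces the caller to look at ok, which is exactly how an error gets lost. In the second call, leaving out the try-except would end the program with a traceback pointing at the failing line.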
Modern languages define a mechanism that can be used to signal an error in such a way that it cannot simply be ignored. The mechanism is based on exceptions. Once an exception is raised, it must be processed by the calling code, or it will be processed by the runtime code and displayed as an error. If the programmer wants to mask the error, she must do it explicitly. There is no chance of overlooking the error by mistake.

Python belongs to the languages that use the exception mechanism for signalling and handling errors. It uses the so-called termination model described above: when the exception is not handled by the surrounding code, a specific error message appears, and the application is terminated. Proper usage of exceptions helps to write more robust programs.

Python defines a rich set of built-in exceptions that can also be used by a programmer. We can also create new, user-defined exceptions that behave the same way as the standard ones. There is no internal magic hidden in Python.

[17] You do not need to know that the data is internally implemented as objects.

3.7 Productivity versus Performance

This is one of the key design principles of Python. The authors believe that it is more valuable to focus on the programmer than on details related to the generated code. They argue that a few percent of performance can be sacrificed for being kinder to the programmer. Hardware is getting more and more powerful; people remain the same.

Backward compatibility is another point that is not considered a limiting factor when some new feature (good for the programmer) is to be introduced. Because of that, Python is cleaner than other languages. Practice shows that language changes do not have a bad impact on existing code. It is easy to adopt the changes [18].

As a result of these principles, Python programmers say they are about 3 to 5 times more productive than when they use Java, or 5 to 10 times more productive than when programming in C++.
There are several reasons for the better productivity. One of them is that the syntax and semantics of Python are very simple. Also, the built-in data types can be used for most tasks that one is going to solve. Moreover, Python comes with batteries included, which means that the set of standard libraries is very rich. Their interfaces are usually very simple, often much simpler than similar ones in other languages. It is apparent that the libraries, too, were designed to be programmer-friendly. It is then easy to remember the language features and the most used standard modules. One can easily keep everything in one's head; the need to search the documentation is reduced.

Python code is also very compact and self-explanatory. A few lines of code can hide rich functionality. When looking at your old code or at code written by someone else, you will understand it very soon. Let us have a look at the following code:

    import webbrowser
    webbrowser.open('http://www.google.com/')

The import command brings in the named module (see subsection 4.2, p. 21). The function open() from inside the module will open your default browser and display the page at the given URL. That is everything you have to do. You can often almost guess how a task should be done, and it often works that way. Python is full of such programmer-friendly things.

4 More syntax, semantics, and examples

The first part of this section presents how classes are defined and how their instances (i.e. objects) can be created. A simple example shows how we can take advantage of building one class on top of another (deriving the class from a base class). The second part of the section presents modules as the means of organizing the code of large applications. Modules also separate application namespaces to avoid collisions between names.

4.1 Classes and instances

A class definition starts with the keyword class.
After giving it a name, we can add the list of base classes in parentheses [19], or leave the parentheses out altogether.

You can think of a method as a function that is associated with the class instance (i.e. with the object). When calling an object's method, we write it syntactically as if we wanted to access a part of the object which simply happens to be callable (parentheses, with or without arguments, follow the method name). We can also think of calling a method as calling a normal function that is given not only the arguments from inside the parentheses, but also a first argument which is a reference to the object (the name in front of the method). Technically, this is how methods are really implemented internally in many object-oriented languages.

Python class definitions make that even more visible. When defining a method of a class, we use the same syntax as for defining a function. The difference is that we always add a first argument named self [20]. When the method is called, self is filled in automatically with the reference to the object.

[18] For example, when designing the C++ language, the priorities were rather the opposite. Performance and backward compatibility with C are two of its main goals.
[19] Then we say that the class is derived from the base class(es).
[20] Naming the first argument of methods self is only a convention. You could give it a different name, but you should never do so unless you are only experimenting. Some external utilities may assume that you follow the convention.

When showing how Python lists can be used as a stack (see p. 8), I promised to show how we can write our own class that defines the more familiar method push() instead of append().
Here is part of the interactive session:

    >>> class myStack(list):
            def push(self, element):
                self.append(element)

    >>> stack = myStack()
    >>> stack.push(1)
    >>> stack.push(2)
    >>> stack.push(10)
    >>> stack
    [1, 2, 10]
    >>> type(stack)
    <class '__main__.myStack'>
    >>> stack.pop()
    10
    >>> stack
    [1, 2]
    >>> stack.pop()
    2
    >>> stack.pop()
    1
    >>> stack
    []
    >>> stack.pop()

    Traceback (most recent call last):
      File "<pyshell#65>", line 1, in -toplevel-
        stack.pop()
    IndexError: pop from empty list

As you can see, we have derived the class from the built-in list. The method push() internally uses the reference to the object, which is the same as the reference to the base object, and implements the action by calling the append() method of the base class. When the name of the class followed by parentheses is used on the right side of the assignment operator, a new object of that class (or of that type) is created.

Notice that very few empty lines are used in the interactive session. When writing code into a separate text file, you will probably use more of them to make the text more readable. The reason for not using empty lines is that interactive Python interprets an empty line as the signal for finishing the definition, the block of code, the syntactic construction, etc. Once more, not using empty lines is not a feature of the Python language; it is a convention for working in the interactive mode.

4.2 Modules and packages

Generally, modules are used to put related functionality together. For some people, the concept of modules is more understandable when we use the term package. However, Python uses both terms, module and package. A Python module gives the packed content a name; it also acts as a namespace for the internal names. When a Python module is implemented in the Python language, the name of the module is also the name of the file through which the functionality is accessible. We can think of packages as hierarchically organized modules.
Usage of packages does not differ much from usage of modules. The main syntactic difference for the beginner is that package names contain dots that separate the names in the hierarchy. Technically, packages are stored as subdirectories with files. Some of the filenames are reserved. We will not cover packages in this text. See http://www.python.org/doc/essays/packages.html for details.

Modules are used as abstractions. They emphasize putting related things together. For example, the standard Python module shutil implements, as the name suggests, basic shell functionality: copying and moving files and directory subtrees, copying permission and status information related to files and directories, and removing subdirectories. It is clear that the functions are somehow related.

Modules are used for organizing large amounts of code. Another reason for splitting a large amount of source code into several modules is apparent: it is always easier to manage several less complex things than a single big one. When splitting one big bunch of code, one has to at least break the interdependencies. Different standard Python modules usually have completely different purposes; probably nobody would like to create one big source file with everything inside. But even our own application may be large enough to be separated into modules. Then each module has a special purpose, can be developed separately, can be tested separately, can be replaced later by a better implementation, and can also be reused in different projects.

Modules are related to technical implementation details. The abstract things, related to the design process and to thinking about the functionality, are eventually replaced by a real implementation. The code must be written and must be stored somehow in the system. In Python, it was decided that each separate source file can be used as a module.
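As a minimal sketch of that decision (written for Python 3, whereas the sessions in this text come from Python 2.3), the following code writes a tiny source file and then imports it as a module. The module name greetings_demo, its contents, and the helper import_from_fresh_file() are hypothetical, made up only for this illustration; tempfile, sys.path and importlib are standard library facilities.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name and contents, chosen only for this illustration.
MODULE_NAME = "greetings_demo"
MODULE_SOURCE = '''
DEFAULT_NAME = "world"

def greet(name=DEFAULT_NAME):
    return "Hello, %s!" % name
'''


def import_from_fresh_file():
    """Write MODULE_SOURCE to a new .py file and import it as a module."""
    directory = tempfile.mkdtemp()
    Path(directory, MODULE_NAME + ".py").write_text(MODULE_SOURCE)
    sys.path.insert(0, directory)    # make the new directory searchable
    importlib.invalidate_caches()    # the file was created after startup
    return importlib.import_module(MODULE_NAME)


greetings = import_from_fresh_file()
message = greetings.greet("Brno")
print(message)
print(greetings.__name__)
```

The file name without the .py extension becomes the module name, and the names defined in the file are reached through it (greetings.greet, greetings.DEFAULT_NAME).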
The same source file can also be used as a script file.

4.2.1 What can a Python module contain?

Generally, a Python module can contain anything that can have a name and that can be reused.

Types. A module does not need to contain any reusable code or values. The standard module types is an example. Have a look at Lib/types.py in your Python directory. It contains only the names of built-in types.

Constants. For example, the math module contains the constants pi (3.1415926535897931) and e (2.7182818284590451) [21].

Functions. This is easy to understand. Classic languages (like C) usually provide whatever is not implemented as a syntactic construct of the core language in the form of functions. Some functions have become so standard that users wish to have them ready to use. They are offered either as built-in (i.e. predefined) functions or in standard libraries. If you compare the C language with Python, you can observe that Python always uses a module wrapper around the functions. As the module creates a namespace for the names inside, it helps to avoid function name collisions in larger applications. Functions relate to modules much as methods relate to classes.

Class definitions. A module can also contain class definitions. As we said earlier, classes are nothing more than functions bound to data through a reference (the name self). In Python, this is even visible when the methods of the class (i.e. the functions that usually work with the data of the object) are defined.

Executable fragments. If the module contains statements that are not placed inside a function definition or a class method definition, then those statements are executed only once, when the module is used for the first time. They can be used to initialize some internals of the module. The module can also be called from the command line; its file then acts as a script file.
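The last point can be sketched as follows (Python 3 syntax; the module name temperature and both functions are hypothetical). The check of __name__ is a common Python idiom that this text does not spell out: Python sets __name__ to "__main__" only when the file is run as a script, so the guarded block is skipped when the file is imported.

```python
"""A hypothetical module, temperature.py, that can also be run as a script."""


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def self_test():
    """Check a few known values; return the number of checks performed."""
    known = [(0.0, 32.0), (100.0, 212.0), (-40.0, -40.0)]
    for celsius, fahrenheit in known:
        assert celsius_to_fahrenheit(celsius) == fahrenheit
    return len(known)


# A top-level statement: executed once, when the module is first imported
# (and also when the file is run as a script). It initializes an internal.
FREEZING_POINT_F = celsius_to_fahrenheit(0.0)

# The body of the script: executed only when the file is run directly,
# for example with "python temperature.py", and skipped on import.
if __name__ == "__main__":
    print("%d checks passed" % self_test())
```

With import temperature, only the definitions and the initialization of FREEZING_POINT_F take effect. Run from the command line, the same file also executes the guarded block.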
Then the executable fragments represent the body of the script. When writing modules, it is usual to add such a body for testing the functionality of the module. In other words, if the module is used as a module, it offers you the functions and the other implemented internals. If the module file is used as a script, then the module often tests itself.

[21] You will not find the math.py file in the Lib subdirectory. The module is implemented as a wrapper around C functions.

4.2.2 How to use a module?

Whenever you want to use a module, you must import it first. The import brings the name of the module into your code and, through it, the names from inside:

    >>> import calendar
    >>> calendar.prmonth(2004, 6)
         June 2004
    Mo Tu We Th Fr Sa Su
        1  2  3  4  5  6
     7  8  9 10 11 12 13
    14 15 16 17 18 19 20
    21 22 23 24 25 26 27
    28 29 30
    >>> prmonth(2004, 6)
    Traceback (most recent call last):
      File "<pyshell#16>", line 1, in -toplevel-
        prmonth(2004, 6)
    NameError: name 'prmonth' is not defined

Here calendar is the name of the module and prmonth() is the name of the function that prints a matrix representation of the calendar for the given month of the given year. Notice that prmonth() is not visible as a name that can be accessed directly. One has to go through the name of the module. This way, calendar acts as the namespace for the names inside.

If you really need to bring a function into your local namespace, you can use the following form of the import:

    >>> from calendar import prmonth
    >>> prmonth(2004, 6)
         June 2004
    Mo Tu We Th Fr Sa Su
        1  2  3  4  5  6
     7  8  9 10 11 12 13
    14 15 16 17 18 19 20
    21 22 23 24 25 26 27
    28 29 30

You can also think a bit about the names and about the features of Python. It was said at the beginning of this article that names are bound to objects and that everything in Python is an object (see 2.2, p. 6). Does this also hold for modules and functions?
    >>> import calendar
    >>> id(calendar)
    10409168
    >>> type(calendar)
    <type 'module'>
    >>> id(calendar.prmonth)
    10476720
    >>> type(calendar.prmonth)
    <type 'function'>

Well, both the module and the function inside it have an object identification and a type; so, they are objects. Can we bind those objects to different names? Let's try shorter and simpler names (to bypass the namespace prefix):

    >>> c = calendar
    >>> id(c)
    10409168
    >>> type(c)
    <type 'module'>
    >>> c.__name__
    'calendar'
    >>> c.__file__
    'C:\\Python23\\lib\\calendar.py'
    >>> m = c.prmonth
    >>> id(m)
    10476720
    >>> type(m)
    <type 'function'>
    >>> m.__name__
    'prmonth'
    >>> m.__module__
    'calendar'

Notice that id() returns the same identification for both the shorter name and the full name. We can use the built-in attribute __name__ to get the real name of the module or of the function. The __file__ attribute of the module stores the full filename of the module, if present. Built-in modules (e.g. sys) do not define that attribute. We can also use the __module__ attribute of the function to find out in which module it was defined.

Let's check whether the shortened name of the function from the module really works:

    >>> m(2004, 6)
         June 2004
    Mo Tu We Th Fr Sa Su
        1  2  3  4  5  6
     7  8  9 10 11 12 13
    14 15 16 17 18 19 20
    21 22 23 24 25 26 27
    28 29 30

4.2.3 How to create modules?

It is very simple. Just store the source code in a file named after the module, with the extension .py, in the same directory where your script is placed. Modules are always searched for in the working directory. But they are also searched for in other directories. We can display or even modify the sys.path list of searched directories. See the documentation of the built-in module sys for details.

5 Experience of others

When I wrote this text, I asked myself: Am I as good at Python as those recognized experts? Even if I were that good, would I be able to convince more people (like you) to use Python?
I have decided to collect here rather long citations from other people [22]. My contribution is the selection of the citations that I feel are appropriate for the purpose of this lecture. If you are interested, I encourage you to read the original texts. Let us start with the experience of a good programmer who just happened to see Python for the first time...

[22] Another reason to use the long citations is that I was not given a page limit for this text ;-)

5.1 Eric Raymond

Eric Raymond is a Linux advocate and the author of The Cathedral & The Bazaar. From Why Python? (1st May 2000), I quote below only the things that could convince you to give Python a try. In fact, the article indirectly compares Perl with Python, based on Eric's own experience. I have chosen it because it closely resembles my own observations, even though my Perl applications were probably not so complicated. You should also read Eric's whole article, which you will find at http://www.linuxjournal.com/print.php?sid=3882:

    My first look at Python was an accident, and I didn't much like what I saw at the time. It was early 1997, and Mark Lutz's book Programming Python from O'Reilly & Associates had recently come out. [...] I dived into Programming Python with one question uppermost in my mind: what has this got that Perl does not? [...] At that time, I had used Perl for a number of small projects. I'd found it quite powerful, even if the syntax and some other aspects of the language seemed rather ad hoc and prone to bite one if not used with care. I immediately tripped over the first odd feature of Python that everyone notices: the fact that whitespace (indentation) is actually significant in the language syntax. [...] I recoiled in reflexive disgust. [...] It's hard to blame anyone, on seeing this Python feature, for initially reacting as though they had unexpectedly stepped in a steaming pile of dinosaur dung. [...] I didn't believe what I'd seen would ever compete effectively with Perl.

He then describes his struggle with writing rather complex programs in Perl.

    Writing these programs left me progressively less satisfied with Perl. Larger project size seemed to magnify some of Perl's annoyances into serious, continuing problems. The syntax that had seemed merely eccentric at a hundred lines began to seem like a nigh-impenetrable hedge of thorns at a thousand. [...] These problems combined to make large volumes of Perl code seem unreasonably difficult to read and grasp as a whole after only a few days' absence. [...] A language that makes it hard to write elegant code makes it hard to write good code. [...]

His next encounter with Python was related to a growing problem with his fetchmail utility. He decided to use Python because of its GUI capabilities and because he was curious about Python.

    Of course, this brought me face to face once again with Python's [...] significance of whitespace. [...] Oddly enough, Python's use of whitespace stopped feeling unnatural after about twenty minutes. I just indented code, pretty much as I would have done in a C program anyway, and it worked. That was my first surprise. My second came a couple of hours into the project, when I noticed (allowing for pauses needed to look up new features in Programming Python) I was generating working code nearly as fast as I could type. When I realized this, I was quite startled. [...] When you're writing working code nearly as fast as you can type and your misstep rate is near zero, it generally means you've achieved mastery of the language. But that didn't make sense, because it was still day one and I was regularly pausing to look up new language and library features! This was my first clue that, in Python, I was actually dealing with an exceptionally good design.
    Most languages have so much friction and awkwardness built into their design that you learn most of their feature set long before your misstep rate drops anywhere near zero. Python was the first general-purpose language I'd ever used that reversed this process.

Then he describes how he reached the point where metaclass programming would be nice to use. Eric describes such activity as deep black magic even in languages that support it. He succeeded:

    [...] this code only took me about ninety minutes to write--and it worked correctly the first time I ran it. To say I was astonished would have been positively wallowing in understatement. It's remarkable enough when implementations of simple techniques work exactly as expected the first time; but my first metaclass hack in a new language, six days from a cold standing start? Even if we stipulate that I am a fairly talented hacker, this is an amazing testament to Python's clarity and elegance of design. There was simply no way I could have pulled off a coup like this in Perl, even with my vastly greater experience level in that language. It was at this point I realized I was probably leaving Perl behind. [...] Perl still has its uses. For tiny projects (100 lines or fewer) that involve a lot of text pattern matching, I am still more likely to tinker up a Perl-regexp-based solution than to reach for Python. [...] For anything larger or more complex, I have come to prefer the subtle virtues of Python--and I think you will, too.

... and you should also read the full version of the article. You may find ideas to learn from.

5.2 Bruce Eckel

Bruce Eckel is the well-recognized author of Thinking in C++ and Thinking in Java, and a co-author of Thinking in C#. You can find the electronic versions of the books at http://www.mindview.net/. Bruce builds on his personal experience with the languages, both from working on projects and from his teaching activities.
When reading the books, you become convinced that he really knows what the languages are about. In my opinion, he is someone who really can compare the languages. What about Bruce Eckel and Python? He says:

    Python is what I use the most to solve my own problems.

At the 9th International Python Conference (2001), Eckel pointed out ten reasons for loving Python. You can download the slides [23] Why I Love Python from http://64.78.49.204/pub/eckel/LovePython.zip.

Bruce Eckel is also a person who thinks in design patterns. You can download the preliminary version of his future book Thinking in Python from http://mindview.net/Books/TIPython/. It does not describe the language itself. It focuses on rather more complex things related to design patterns.

His experience with Python is also summarized in the interview with Bill Venners (www.artima.com). The following summaries are taken from the page of the last part:

Part I: Python and the Programmer (http://www.artima.com/intv/aboutmeP.html). Bruce Eckel explains why he feels Python is “about him,” how minimizing clutter improves productivity, and the relationship between backwards compatibility and programmer pain.

Part II: The Zen of Python (http://www.artima.com/intv/prodperfP.html). Bruce Eckel explains why he prefers Python's valuing programmer productivity over program performance, Python's you-want-it-you-can-have-it attitude, and Python's zen-like learning curve.

    One of the things I find that's remarkable about Python is that it has a very even learning curve. Maybe it's not even a curve. It's kind of a straight line. Learning Python has a zen-like quality, because Python doesn't try to make the world something else. The designers of Java wanted to make the entire world look like a Java virtual machine and the Java libraries. In addition, Java's designers decided that the C++ approach of allowing functions and global variables in addition to classes is bad. So everything in Java has to be declared in a class.
    For that reason, Utah Valley State College stopped using Java as an introductory language. They actually teach C++ as a first language, because they found it a lot easier. Python would make an even better first language to teach programming. It's such a gentle learning curve. You can start with scripts, and of course some people dismiss Python as a scripting language, because you can script with it. You start teaching scripts. You can teach functions. Then later you can add classes. Then you can go onto things like metaclasses. Python has many more of these powerful constructs that you can learn when you're ready. And I think that's very impressive, because it doesn't say you should only be an object-oriented programmer.

Part III: Type Checking and Techie Control (http://www.artima.com/intv/typingP.html). Bruce Eckel explains why he prefers Python's latent type checking and techie control of language evolution.

    The Python venture is basically controlled by the techies. We make decisions based on what's going to make the life of the programmer easier. Even with C++, which was a standards committee, I remember early decisions being based on worries about the existence of a body of code which was a drop in the bucket relative to what we have now. But they were saying, “We can't make this change in the language because we would break all that existing code,” which was basically trivial. We should have made those changes at the time. That was sort of a marketing decision because many of the people on the committee were representing companies who had vested interests in C++ in some way or another.

Part IV: Python and the Tipping Point (http://www.artima.com/intv/tippingP.html, final installment). Bruce Eckel talks about how Python's minimal finger typing allows programmers to focus on the task, not the tool, generating a productivity that makes more projects feasible.

    [At another Python conference...]
    I suggested to Guido a slogan that I think somebody else probably said first, “Python. It fits your brain.” That's what I was talking about when I said, “My guesses are usually right.” Python allows you to get into this uninterrupted flow, and just go with that without having to think too hard, even if I have to look up the way a library works.

    One of my first real productive experiences with Python, beyond just playing around with the language, involved image processing. I wanted to resize some GIF files. Given my experience with other languages, I figured this task might take me half a day if I were lucky. Even if there were an existing image processing library in Python, I figured the library would be complicated and take significant time to understand. I discovered a Python library that did graphics manipulation, and to my surprise, resizing GIFs was as simple as you can imagine it could be. You create an object, call reformat, pass in some arguments, and you're done. In C++, and even in Java, the ease of understanding a library is not really part of the culture. In Python it really is. Instead of taking a half a day, which was my best hope, after a half an hour, I couldn't think of any more features to add to my program. And I was just stunned. I thought, oh, that's what people mean when they talk about Python's incredible productivity.

[23] Created in Microsoft PowerPoint.

6 Conclusion

I will cite Bruce Eckel again (see the end of the fourth part of the interview). It should be read in the context of the claim that Python is said to be 5 to 10 times more productive than Java. It is simply an excellent summary to think about:

    [...] I believe it was definitely worth moving from C to C++, and from C++ to Java. So there was progress there. For something as advanced as Python is over those languages, and as different, there will be some hesitation. However, it seems that in all these cases economics wins out.
    When you say we can have one Python programmer that's as productive as ten Java programmers, and that's not taking into account the communication issues of ten programmers, at some point somebody is going to look at that and say, “Wow. I can make a lot of money. I have a lot of leverage over my competition. I can get my product to market quicker. Gee, there are all these things that produce very significant financial differences. Can I afford not to look at this?”

Python programmers probably do not like Python because of higher salaries. Not everything can be measured in money. On the other hand, managers do that (and have to do that). Because of that, Python may get their support in the future. Anyway, you can ignore it... Or not?

References

[Gau] Alan Gauld. Learning to program (http://www.freenetpages.co.uk/hp/alan.gauld/index.htm). On-line tutorial.

[Při] Petr Přikryl. Jak se naučit programovat (http://www.freenetpages.co.uk/hp/alan.gauld/czech2/index.html). Czech translation of [Gau]. On-line tutorial.

[Ray] Eric S. Raymond. The Cathedral and the Bazaar. O'Reilly. Evolving book. The electronic version is available at http://www.catb.org/~esr/writings/cathedral-bazaar/.

[Ray00] Eric S. Raymond. Why Python? http://www.linuxjournal.com/article.php?sid=3882. Linux Journal, May 2000.

[VE03a] Bill Venners and Bruce Eckel. Python and the Programmer: A Conversation with Bruce Eckel, Part I (http://www.artima.com/intv/aboutmeP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), June 2003.

[VE03b] Bill Venners and Bruce Eckel. The Zen of Python: A Conversation with Bruce Eckel, Part II (http://www.artima.com/intv/prodperfP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), June 2003.

[VE03c] Bill Venners and Bruce Eckel. Type Checking and Techie Control: A Conversation with Bruce Eckel, Part III (http://www.artima.com/intv/typingP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), July 2003.

[VE03d] Bill Venners and Bruce Eckel. Python and the Tipping Point: A Conversation with Bruce Eckel, Part IV (http://www.artima.com/intv/tippingP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), July 2003.

[VvR03a] Bill Venners and Guido van Rossum. The Making of Python: A Conversation with Guido van Rossum, Part I (http://www.artima.com/intv/pythonP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), January 2003.

[VvR03b] Bill Venners and Guido van Rossum. Python's Design Goals: A Conversation with Guido van Rossum, Part II (http://www.artima.com/intv/pyscaleP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), January 2003.

[VvR03c] Bill Venners and Guido van Rossum. Programming at Python Speed: A Conversation with Guido van Rossum, Part III (http://www.artima.com/intv/speedP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), January 2003.

[VvR03d] Bill Venners and Guido van Rossum. Contracts in Python: A Conversation with Guido van Rossum, Part IV (http://www.artima.com/intv/pycontractP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), February 2003.

[VvR03e] Bill Venners and Guido van Rossum. Strong versus Weak Typing: A Conversation with Guido van Rossum, Part V (http://www.artima.com/intv/strongweakP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), February 2003.

[VvR03f] Bill Venners and Guido van Rossum. Designing with the Python Community: A Conversation with Guido van Rossum, Part VI (http://www.artima.com/intv/pycommP.html). Artima Articles About Interviews (http://www.artima.com/articles/index.jsp?topic=intv), February 2003.