
5 Built-ins and Base Modules

Introduction to Python for Physicists
Course at the FSU Jena, Nils Becker, version from October 6, 2016

“I wrote 20 short programs in Python yesterday. It was wonderful. Perl, I’m leaving you.”
Randall Munroe (author of xkcd)

As already mentioned at the beginning of the course, Python has batteries included. First, there are many powerful built-in functions that you can use without importing additional modules. Second, many modules are shipped with Python itself, the so-called standard library. They cover almost every standard task you will encounter. In this chapter we give a brief overview of some of this functionality. You are encouraged to read the official standard library documentation [PythonStd2016] for detailed information.

Task 5.1
1. One expression of the extensiveness of Python’s standard library is the following easter egg: open the IPython console and type in import antigravity.

1 Built-in Functions

There are many built-in functions which are part of the language itself rather than part of some module. We already learned about len(), which gives the length of any sequence type. Another example is the range() function. You can find a full listing in [PythonBuiltIn2016]. The following is just an extract:

Function
    Description

abs(x)
    Return the absolute value of a number. The argument may be an integer or a floating point number. If the argument is a complex number, its magnitude is returned.

any(iterable)
    Return True if any element of the iterable is true. If the iterable is empty, return False.

complex(real, imag)
    Create a complex number with the value real + imag*j or convert a string or number to a complex number. If the first parameter is a string, it will be interpreted as a complex number and the function must be called without a second parameter. The second parameter can never be a string. Each argument may be any numeric type (including complex). If imag is omitted, it defaults to zero and the function serves as a numeric conversion function like int() and float(). If both arguments are omitted, returns 0j.

eval(expression, globals=None, locals=None)
    The arguments are a string and optional globals and locals. If provided, globals must be a dictionary. If provided, locals can be any mapping object. The expression argument is parsed and evaluated as a Python expression (technically speaking, a condition list) using the globals and locals dictionaries as global and local namespace. The return value is the result of the evaluated expression.

float([x])
    Convert a string or a number to floating point.

int(x=0)
    Convert a number or string x to an integer, or return 0 if no arguments are given. For floating point numbers, this truncates towards zero.

len(s)
    Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set).

max(), min()
    Return the largest or smallest item in an iterable, or the largest or smallest of two or more arguments.

print(*objects, sep=' ', end='\n')
    Print objects separated by sep and followed by end. sep and end, if present, must be given as keyword arguments.

round(number[, ndigits])
    Return number rounded to ndigits digits after the decimal point. If ndigits is omitted, the nearest integer is returned.

sorted(iterable)
    Return a new sorted list from the items in iterable.

str(object='')
    Return a string version of object.

sum(iterable)
    Return the sum of all elements in iterable.

2 Reading and Writing Files

Reading and writing files can be achieved by using the open() built-in function. Let’s assume that a text file named myfile.txt is stored in the current working directory. It can be opened with

    f = open('myfile.txt', 'r')

The first argument is the file name or the full path to the file. The second argument is a string that specifies how the file is opened. It can be built from the following letters:

Specifier   Description
'r'         open for reading (default)
'w'         open for writing, truncating the file first
'x'         open for exclusive creation, failing if the file already exists
'a'         open for writing, appending to the end of the file if it exists
'b'         binary mode
't'         text mode (default)
'+'         open a disk file for updating (reading and writing)

The update specifier + and the mode (t or b) are combined with one of the opening modes. For example, 'rb' will open a file in binary mode for reading. 'r+b' will additionally allow writing to the file. The specifier 'w+b' allows reading and writing but will create a new file or truncate an existing one.

Files are read starting from a certain position (normally the beginning), the file pointer. When some bytes are read from the file, the file pointer is automatically moved along:

    f.read(28)
    > 'These are the first xx chara'
    f.read(28)
    > 'cters. \n'

read(n) returns up to n characters starting from the current file pointer. After that the file pointer is moved by the number of characters read. If you do not provide an argument, the rest of the file from the file pointer onwards is returned.

    f.readline()
    > 'This is the first line of the file.\n'
    f.readline()
    > 'Second line of the file\n'
    f.readline()
    > ''

readline() returns the content of the file line by line. An empty string signals that the end of the file has been reached. You can also iterate over the lines in a file by using a for loop:

    for line in f:
        print(line, end='')
    > This is the first line of the file.
    > Second line of the file

f.write(string) writes the contents of string to the file, returning the number of characters written.
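The mode letters and the moving file pointer can be tried out in a small round trip. The following is only a minimal sketch: the file name example.txt and the temporary directory are assumptions made for illustration, and the with statement is used so that each file is closed at the end of its block.

```python
import os
import tempfile

# Work in a temporary directory so that no existing file is touched.
# 'example.txt' is a hypothetical file name chosen for this sketch.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.txt')

    # 'w' creates the file, or truncates it if it already exists
    with open(path, 'w', encoding='utf-8') as f:
        written = f.write('first line\nsecond line\n')

    # 'a' keeps the existing content and appends at the end
    with open(path, 'a', encoding='utf-8') as f:
        f.write('third line\n')

    # 'r' reads; every call continues at the current file pointer
    with open(path, 'r', encoding='utf-8') as f:
        head = f.read(5)                  # first five characters
        rest = f.readline()               # remainder of the first line
        remaining = [line for line in f]  # all lines that are left
        at_end = f.read()                 # nothing left to read

print(written)    # 23
print(head)       # first
print(remaining)  # ['second line\n', 'third line\n']
```

Because read(), readline() and the for loop all share one file pointer, each of them continues where the previous call stopped.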
To write something other than a string, it needs to be converted to a string first:

    the_answer = 42
    s = str(the_answer)
    f.write(s)

f.tell() returns an integer giving the current file pointer. In binary mode this is the number of bytes from the beginning of the file; in text mode it is an opaque number that need not equal the number of characters read.

File access is normally buffered. To ensure that your changes are actually saved, you can force a write to disk with f.flush(). In any case the file should always be closed with f.close() after you are done working with it. This flushes any changes and frees up system resources. There is a handy way of ensuring that you always close your files: the with statement.

    with open('myfile.txt', 'r') as f:
        f.read()
    # file is closed after this block

The file is opened at the beginning of the with-block and automatically closed afterwards.

When you open a file in text mode, it is important to select the correct encoding. The encoding can be set by using the encoding argument:

    f = open('myfile.txt', 'r', encoding='utf-8')

The correct encoding for reading is the encoding the file was saved in. For writing you will very probably want to select utf-8.

Task 5.2
1. Write a program that reads an arbitrary text file and outputs the relative frequency (in percent) of each character. Do not be case-sensitive and only consider A-Z. Hint: There is a test file on the course webpage.

2.1 CSV Files

There is a very common file format for exchanging tabular data: the comma-separated values file or CSV file. It can be read and exported by Excel, for example. The lines in such a file look like this:

    1.0, 2.0, 3, 'Hello you', '1,0'

Commonly you can choose the separator (one of , ; : or a tab or whitespace) and the quote character (" or '). To make data extraction from such files easy, there is the module csv.
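The sample line above uses single quotes and a space after each comma. As a minimal sketch, assuming exactly these conventions, the reader of the csv module can split it once it is told about the quote character. The option skipinitialspace is a standard csv setting that is not discussed in the text, and io.StringIO merely stands in for an opened file.

```python
import csv
import io

# The sample line from the text; io.StringIO stands in for an open file.
line = "1.0, 2.0, 3, 'Hello you', '1,0'"
fake_file = io.StringIO(line)

# quotechar="'" keeps the comma inside '1,0' from splitting the field;
# skipinitialspace=True ignores the blank that follows each delimiter.
reader = csv.reader(fake_file, delimiter=',', quotechar="'",
                    skipinitialspace=True)
row = next(reader)
print(row)      # ['1.0', '2.0', '3', 'Hello you', '1,0']

# Every field arrives as a string and has to be converted explicitly.
numbers = [float(field) for field in row[:3]]
print(numbers)  # [1.0, 2.0, 3.0]
```

The quoted field '1,0' survives as a single entry, which a plain line.split(',') would not achieve.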
The following code illustrates how to write a file with the csv module:

    import csv

    with open('mydata.csv', 'w', newline='') as f:
        fwriter = csv.writer(f, delimiter=',', quotechar='"')
        for i in range(10):
            data = [i, "a"*(i % 5 + 1), i**2]
            fwriter.writerow(data)

This generates the following content in the file mydata.csv:

    0,a,0
    1,aa,1
    2,aaa,4
    3,aaaa,9
    4,aaaaa,16
    5,a,25
    6,aa,36
    7,aaa,49
    8,aaaa,64
    9,aaaaa,81

To read the content of the file you can use the following:

    with open('mydata.csv', 'r', newline='') as f:
        freader = csv.reader(f, delimiter=',', quotechar='"')
        data = []
        for row in freader:
            data.append(row)

You will have a list of lists in data and can access the fields with data[row][column]. Note that every field is read as a string.

Task 5.3
1. Write a program that saves the first 1000 primes in a .csv file. The first column should contain the number of the prime, the second the prime itself.

3 Pickling objects

The term pickling stands for saving (almost) arbitrary Python variables to a file. Reading them from the file again is called “unpickling”. This functionality is provided by the pickle module. The following code illustrates the usage:

    import pickle

    a = [1, 2, 3]
    with open('test', 'wb') as f:
        pickle.dump(a, f)

    with open('test', 'rb') as f:
        b = pickle.load(f)
    print(b)
    > [1, 2, 3]

Notice that you need to open the file in binary mode. The pickled list will be saved in an internal binary format. This format may change between Python versions, so pickling should not be used for data exchange. Rather, it provides a convenient way to quickly store Python variables.

4 Mathematics

There are two modules which provide basic mathematical functionality: math provides functions for real numbers, cmath the same for complex numbers.
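The split between the two modules shows up as soon as a result leaves the real numbers. The following minimal sketch uses the square root of -1; the variable names are chosen only for illustration.

```python
import math
import cmath

# math works on real numbers only: a negative argument raises ValueError
try:
    math.sqrt(-1)
    real_sqrt_failed = False
except ValueError:
    real_sqrt_failed = True

# cmath returns a complex result for the same argument
complex_sqrt = cmath.sqrt(-1)

print(real_sqrt_failed)  # True
print(complex_sqrt)      # 1j
```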
The modules include functions like exp, log, log2, log10 and pow. The square root can be calculated with sqrt. The smallest integer greater than or equal to x is returned by ceil(x) and the largest integer less than or equal to x is returned by floor(x). The absolute value of x is returned by fabs(x). math also defines trigonometric functions like sin, cos and tan, together with their inverse and hyperbolic counterparts. All of these functions accept or return angles in radians. Furthermore, math defines the constants e and pi.

cmath also implements most of the above functions for complex variables. Additionally it provides functions like phase, polar and rect, which make it easy to work with different representations of complex numbers. A quick example:

    import math
    import cmath

    x = math.sin(2 * math.pi)
    > -2.4492935982947064e-16
    y = cmath.exp(-1.0j * math.pi)
    > (-1-1.2246467991473532e-16j)

5 File Handling

The os module provides several functions to interact with the operating system. For example:

    import os

    os.getcwd()                         # Return the current working directory
    > '/home/user/'
    os.chdir('/server/accesslogs')      # Change current working directory
    os.system('mkdir today')            # Run the command mkdir in the system shell
    > 0

For common file and directory management tasks, the shutil module provides a higher level interface that is easier to use and does not depend on the operating system:

    import shutil

    shutil.copyfile('data.db', 'archive.db')
    > 'archive.db'
    shutil.move('/build/executables', 'installdir')
    > 'installdir'

The glob module provides a function for making file lists from directory wildcard searches:

    import glob

    glob.glob('*.py')
    > ['primes.py', 'random.py', 'quote.py']

Common utility scripts often need to process command line arguments. These arguments are stored in the sys module’s argv attribute as a list.
For instance, the following output results from running python demo.py one two three at the command line:

    import sys

    print(sys.argv)
    > ['demo.py', 'one', 'two', 'three']

6 Dates and Times

The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation. The module also supports objects that are timezone aware.

    from datetime import date

    now = date.today()
    now
    > datetime.date(2003, 12, 2)
    now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
    > '12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'

    birthday = date(1964, 7, 31)
    age = now - birthday
    age.days
    > 14368

7 Time and Performance Measurement

It can be advantageous to know which solution to a problem runs faster. For that Python provides the timeit module. You can measure the execution time of a piece of code that you pass as a string.

    from timeit import Timer

    Timer('t=a; a=b; b=t', 'a=1; b=2').timeit(10000000)
    > 0.34298061200024677
    Timer('a,b = b,a', 'a=1; b=2').timeit(10000000)
    > 0.3068590229995607

The above code measures the performance of two different approaches to swapping the content of two variables. The first argument to Timer() is the code that is timed and the second is the code that is executed once before the measurement starts. The argument to timeit() is the number of times the code is executed; the returned value is the total time in seconds for all executions. You see that the second approach is marginally faster.

Another way is to use time.perf_counter(), which returns the value of a high-resolution clock in seconds. If you need processor time instead, use time.process_time(). The older time.clock() was removed in Python 3.8.

    import time

    start = time.perf_counter()
    # here I can test code
    end = time.perf_counter()
    print('elapsed time {:.3f}s'.format(end - start))
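Single timings fluctuate with whatever else the computer is doing. A common remedy, shown here only as a minimal sketch, is to repeat the measurement several times with timeit.repeat(), keep the fastest run and divide by the number of executions to obtain the time per execution. The helper per_call_time and the chosen counts are assumptions made for this illustration.

```python
from timeit import repeat

def per_call_time(stmt, setup, number=100000, repeats=5):
    """Return the best observed time in seconds for one execution of stmt.

    per_call_time is a hypothetical helper written for this sketch.
    """
    # repeat() returns one total time per run of `number` executions
    totals = repeat(stmt, setup, repeat=repeats, number=number)
    # the fastest run is least disturbed by other processes
    return min(totals) / number

swap_tmp = per_call_time('t=a; a=b; b=t', 'a=1; b=2')
swap_tuple = per_call_time('a,b = b,a', 'a=1; b=2')

print('temporary variable: {:.1f} ns per swap'.format(swap_tmp * 1e9))
print('tuple assignment:   {:.1f} ns per swap'.format(swap_tuple * 1e9))
```

The absolute numbers depend on the machine and the Python version, so only the comparison between the two statements is meaningful.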
