5 Built-ins and Base Modules
I wrote 20 short programs in Python yesterday.
It was wonderful. Perl, I’m leaving you.
Randall Munroe (author of xkcd)
This was already mentioned at the beginning of the course: Python has batteries
included. First, there are many powerful built-in functions that you can use
without importing additional modules. Second, many modules are shipped
with Python itself, the so-called standard library. They cover almost every standard
task you will encounter. In this chapter we give a brief overview of some of
this functionality. You are encouraged to read the official standard library documentation [PythonStd2016] for detailed information.
Task 5.1
1. One expression of the extensiveness of Python’s standard library is the following easter-egg: open up the IPython console and type in import antigravity.
1 Built-in Functions
There are a lot of built-in functions which are part of the language itself rather than
part of some module. We already learned about len() which gives the length of any
sequenced data type. Another example would be the range() function. You can find a
full listing in [PythonBuiltIn2016]. The following is just an extract:
Function                      Description
abs(x)                        Return the absolute value of a number. The argument may be an
                              integer or a floating point number. If the argument is a complex
                              number, its magnitude is returned.
any(iterable)                 Return True if any element of the iterable is true. If the
                              iterable is empty, return False.
complex(real, imag)           Create a complex number with the value real + imag*j or convert a
                              string or number to a complex number. If the first parameter is a
                              string, it will be interpreted as a complex number and the
                              function must be called without a second parameter. The second
                              parameter can never be a string. Each argument may be any numeric
                              type (including complex). If imag is omitted, it defaults to zero
                              and the function serves as a numeric conversion function like
                              int() and float(). If both arguments are omitted, returns 0j.
eval(expression,              The arguments are a string and optional globals and locals. If
     globals=None,            provided, globals must be a dictionary. If provided, locals can be
     locals=None)             any mapping object. The expression argument is parsed and
                              evaluated as a Python expression (technically speaking, a
                              condition list) using the globals and locals dictionaries as
                              global and local namespace. The return value is the result of the
                              evaluated expression.
float([x])                    Convert a string or a number to floating point.
int(x=0)                      Convert a number or string x to an integer, or return 0 if no
                              arguments are given. For floating point numbers, this truncates
                              towards zero.
len(s)                        Return the length (the number of items) of an object. The argument
                              may be a sequence (such as a string, bytes, tuple, list, or range)
                              or a collection (such as a dictionary, set, or frozen set).
max(), min()                  Return the largest (max) or smallest (min) item in an iterable, or
                              the largest or smallest of two or more arguments.
print(*objects, sep=' ',      Print objects separated by sep and followed by end. sep and end,
      end='\n')               if present, must be given as keyword arguments.
round(number[, ndigits])      Return number rounded to ndigits digits after the decimal point.
                              If ndigits is omitted, the nearest integer is returned.
sorted(iterable)              Return a new sorted list from the items in iterable.
str(object='')                Return a string version of object.
sum(iterable)                 Return the sum of all elements in iterable.
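To get a feeling for the functions in the table, here is a small sketch that calls several of them. The sample list and all variable names are made up for illustration and are not part of the course material.

```python
# Illustrative values only; the names below are hypothetical.
values = [3, -7, 2.5, 0]

magnitudes = [abs(v) for v in values]         # [3, 7, 2.5, 0]
has_truthy = any(values)                      # True: at least one non-zero element
z = complex("1+2j")                           # string form: no second argument allowed
w = complex(1, 2)                             # numeric form: real and imaginary part
truncated = int(-2.7)                         # -2: truncation towards zero
rounded = round(3.14159, 2)                   # 3.14
ordered = sorted(values)                      # new list; values itself is unchanged
total = sum(values)                           # -1.5
smallest, largest = min(values), max(values)  # -7 and 3

print(magnitudes, has_truthy, z == w, truncated, rounded)
print(ordered, total, smallest, largest)
```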
2 Reading and Writing Files
Reading and writing files can be achieved by using the open() built-in function. Let’s
assume that a text file named myfile.txt is stored in the current working directory. It
can be opened with
f = open('myfile.txt', 'r')
The first argument is the file name or the full path to the file. The second argument is
a string that specifies how the file is opened. It can be built from the following letters:
Introduction to Python for Physicists
Course at the FSU Jena
Nils Becker
version from October 6, 2016

Specifier   Description
'r'         open for reading (default)
'w'         open for writing, truncating the file first
'x'         open for exclusive creation, failing if the file already exists
'a'         open for writing, appending to the end of the file if it exists
'b'         binary mode
't'         text mode (default)
'+'         open a disk file for updating (reading and writing)
The update specifier + and the mode (t or b) are combined with one of the opening
modes. For example, 'rb' opens a file in binary mode for reading, and 'r+b'
additionally allows writing to the file. The specifier 'w+b' also allows reading and writing,
but creates a new file or truncates an existing one.
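The following sketch tries out some of these mode strings. It works in a temporary directory created with the tempfile module so that no real files are touched. The file name is hypothetical, and the use of tempfile is an assumption of this example, not something the course prescribes.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")   # hypothetical file name

    with open(path, "w") as f:        # create (or truncate) and write
        f.write("first\n")

    with open(path, "a") as f:        # append to the existing content
        f.write("second\n")

    with open(path, "r+") as f:       # read and write without truncating
        content_before = f.read()
        f.write("third\n")            # the pointer is at the end after read()

    try:
        open(path, "x")               # exclusive creation fails: the file exists
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    with open(path, "rb") as f:       # binary mode returns bytes, not str
        raw = f.read()

print(repr(content_before))
print(exclusive_failed, type(raw).__name__)
```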
A file is read starting from the file pointer, which normally starts at the beginning
of the file. When some bytes are read from the file, the file pointer is automatically
moved along:
f.read(28)
> 'These are the first xx chara'
f.read(28)
> 'cters. \n'
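read(n) returns up to n characters starting from the current file pointer (fewer if the
end of the file is reached). After that the file pointer is moved by the number of characters
read. If you do not provide an argument, the rest of the file from the current file pointer
onwards is returned.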
f.readline()
> 'This is the first line of the file.\n'
f.readline()
> 'Second line of the file\n'
f.readline()
> ''
readline() returns the content of the file line by line. You can also iterate over the
lines in a file by using a for loop:
for line in f:
    print(line, end='')
> This is the first line of the file.
> Second line of the file
f.write(string) writes the contents of string to the file, returning the number of
characters written. To write something other than a string, it needs to be converted to
a string first:
the_answer = 42
s = str(the_answer)
f.write(s)
f.tell() returns an integer giving the current position of the file pointer. In binary
mode this is the number of bytes from the beginning of the file. In text mode it is an
opaque number that should not be interpreted as a character count.
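The interplay of read() and tell() is easiest to see in binary mode, where positions are plain byte counts. The sketch below writes ten bytes to a temporary file (hypothetical name) and follows the file pointer. It also calls f.seek(), which is not discussed in this chapter. It is a standard file method that moves the pointer to a given position.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pointer_demo.bin")  # hypothetical file name

    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        start = f.tell()          # 0: nothing has been read yet
        chunk = f.read(4)         # b'0123'
        after_chunk = f.tell()    # 4: the pointer moved by four bytes
        rest = f.read()           # b'456789': everything up to the end
        at_end = f.tell()         # 10: the pointer is at the end of the file
        f.seek(0)                 # move the pointer back to the beginning
        again = f.read(2)         # b'01'

print(start, chunk, after_chunk, rest, at_end, again)
```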
File access is normally buffered. To ensure that your changes are actually saved you
can force a write to disk with f.flush(). In any case the file should always be closed
with
f.close()
after you are done working with it. This flushes any changes and frees up system
resources. There is a handy way of ensuring that you always close your files if you use
a with statement:
with open('myfile.txt', 'r') as f:
    f.read()
# file is closed after this block
The file is opened at the beginning of the with-block and automatically closed afterwards.
When you open a file in text mode it is important to select the correct encoding. The
encoding can be set by using the encoding argument:
f = open('myfile.txt', 'r', encoding='utf-8')
The correct encoding depends on the encoding the file was saved in. For writing you
will very probably want to select utf-8.
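Why the encoding matters can be seen in a short experiment. The sketch below writes a string with two non-ASCII characters as utf-8 and reads it back twice, once with the correct encoding and once with latin-1. The sample text and the file name are made up for illustration.

```python
import os
import tempfile

# "Schroedinger psi" written with o-umlaut and the Greek letter psi (escaped).
text = "Schr\u00f6dinger \u03c8"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "encoding_demo.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    with open(path, "r", encoding="utf-8") as f:
        same_encoding = f.read()          # the round trip succeeds

    with open(path, "r", encoding="latin-1") as f:
        wrong_encoding = f.read()         # no error, but garbled characters

    with open(path, "rb") as f:
        n_bytes = len(f.read())           # bytes on disk, not characters

print(same_encoding == text, wrong_encoding == text)   # True False
print(len(text), n_bytes)                               # 13 15
```

The file holds 15 bytes for 13 characters because utf-8 stores each of the two non-ASCII characters in two bytes.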
Task 5.2
1. Write a program that reads an arbitrary text file and outputs the relative
frequency (in percent) of each character. Do not be case-sensitive and only
consider A-Z. Hint: There is a test file on the course webpage.
2.1 CSV Files
There is a very common file format for exchanging tabular data: the comma-separated-value file or CSV file. It can be read and exported by Excel, for example. The lines in
such a file look like this:
1.0, 2.0, 3, 'Hello you', '1,0'
Commonly you can choose the separator (, ; : or tabs or spaces) and the
quote character (", ‘ or ’). To make data extraction from such files easy, there is the module csv.
The following code illustrates its usage:
import csv
with open('mydata.csv', 'w', newline='') as f:
    fwriter = csv.writer(f, delimiter=',', quotechar='"')
    for i in range(10):
        data = [i, "a"*(i % 5 + 1), i**2]
        fwriter.writerow(data)
It generates the following output in the file mydata.csv:
0,a,0
1,aa,1
2,aaa,4
3,aaaa,9
4,aaaaa,16
5,a,25
6,aa,36
7,aaa,49
8,aaaa,64
9,aaaaa,81
To read the content of the file you can use the following:
with open('mydata.csv', 'r', newline='') as f:
    freader = csv.reader(f, delimiter=',', quotechar='"')
    data = []
    for row in freader:
        data.append(row)
You will have a list of lists in data and can access the fields with data[row][column]. Note that csv.reader returns every field as a string.
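Because every field arrives as a string, numeric columns usually have to be converted by hand. The sketch below shows one way to do this. It reads from an in-memory io.StringIO object instead of a real file, which is an assumption made here only so that the example runs without touching the disk.

```python
import csv
import io

# In-memory stand-in for the first lines of a file such as mydata.csv.
raw = "0,a,0\n1,aa,1\n2,aaa,4\n"

reader = csv.reader(io.StringIO(raw, newline=""), delimiter=",", quotechar='"')
data = [row for row in reader]

first_field = data[1][0]    # '1': row 1, column 0, still a string
numbers = [(int(row[0]), int(row[2])) for row in data]   # explicit conversion

print(data)       # [['0', 'a', '0'], ['1', 'aa', '1'], ['2', 'aaa', '4']]
print(numbers)    # [(0, 0), (1, 1), (2, 4)]
```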
Task 5.3
1. Write a program that saves the first 1000 primes in a .csv-file. In the first
column there should be the number of the prime, in the second the prime
itself.
3 Pickling objects
The term pickling stands for saving (almost) arbitrary Python variables to a file. Reading them from the file again would be “unpickling”. This functionality is provided by
the pickle module. The following code illustrates the usage:
import pickle
a = [1, 2, 3]
with open('test', 'wb') as f:
    pickle.dump(a, f)
with open('test', 'rb') as f:
    b = pickle.load(f)
print(b)
> [1, 2, 3]
Notice that you need to open the file in binary mode. The pickled list will be saved
in an internal binary format. This format may change between Python versions so
pickling should not be used for data exchange. Rather it provides a convenient way to
quickly store Python variables.
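Pickling is not limited to flat lists. As a small sketch, the following example round-trips a nested dictionary. It uses pickle.dumps() and pickle.loads(), the in-memory counterparts of dump() and load(), so that no file is needed. The record itself is a made-up example.

```python
import pickle

# Hypothetical measurement record, used only for illustration.
record = {"name": "run 7", "samples": [0.1, 0.2, 0.4], "shape": (3, 1)}

blob = pickle.dumps(record)        # a bytes object instead of a file
restored = pickle.loads(blob)

print(type(blob).__name__)         # bytes
print(restored == record)          # True: same content
print(restored is record)          # False: a new, independent object
```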
4 Mathematics
There are two modules which provide basic mathematical functionality: math provides
functions for real numbers, cmath the same for complex numbers. The modules include
functions like exp, log, log2, log10 and pow. The square root can be calculated with
sqrt. The smallest integer greater than or equal to x is returned by ceil(x) and the largest
integer less than or equal to x is returned by floor(x). The absolute value of x is returned by fabs(x).
math also defines trigonometric functions like sin, cos and tan as well as their inverse and
hyperbolic counterparts. All of these functions accept or return angles in radians.
Furthermore, math defines the constants e and pi.
cmath also implements most of the above functions for complex variables. Additionally
it provides functions like phase, polar and rect which make it easy to work with
different representations of complex numbers.
A quick example:
import math
import cmath
x = math.sin(2 * math.pi)
> -2.4492935982947064e-16
y = cmath.exp(-1.0j * math.pi)
> (-1-1.2246467991473532e-16j)
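A second sketch illustrates the rounding functions and the conversion between Cartesian and polar form. The numbers are arbitrary. math.isclose() and cmath.isclose() are used for the comparisons because floating point results are rarely exact.

```python
import cmath
import math

x = 2.3
print(math.ceil(x), math.floor(x))      # 3 2
print(math.ceil(-x), math.floor(-x))    # -2 -3
print(math.fabs(-x))                    # 2.3

z = 1 + 1j
r, phi = cmath.polar(z)                 # modulus and phase (in radians)
back = cmath.rect(r, phi)               # from polar form back to a complex number

print(math.isclose(r, math.sqrt(2)))    # True
print(math.isclose(phi, math.pi / 4))   # True
print(cmath.isclose(back, z))           # True
```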
5 File Handling
The os module provides several functions to interact with the operating system. For
example:
import os
os.getcwd()                       # Return the current working directory
> '/home/user/'
os.chdir('/server/accesslogs')    # Change current working directory
os.system('mkdir today')          # Run the command mkdir in the system shell
> 0
For common file and directory management tasks, the shutil module provides a higher
level interface that is easier to use and does not depend on the operating system:
import shutil
shutil.copyfile('data.db', 'archive.db')
> 'archive.db'
shutil.move('/build/executables', 'installdir')
> 'installdir'
The glob module provides a function for making file lists from directory wildcard
searches:
import glob
glob.glob('*.py')
> ['primes.py', 'random.py', 'quote.py']
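The three modules are often used together. The following sketch creates a file in a temporary directory, copies it into a subdirectory and then collects all .txt files with a recursive wildcard search. All file and directory names are hypothetical, and os.mkdir() is used here as a portable alternative to calling mkdir through the system shell.

```python
import glob
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "data.txt")          # hypothetical file names
    with open(source, "w") as f:
        f.write("42\n")

    backup_dir = os.path.join(tmp, "backup")
    os.mkdir(backup_dir)                            # create a subdirectory
    copied = shutil.copyfile(source, os.path.join(backup_dir, "data_copy.txt"))

    # '**' together with recursive=True also matches files in subdirectories.
    pattern = os.path.join(tmp, "**", "*.txt")
    found = sorted(glob.glob(pattern, recursive=True))
    names = [os.path.relpath(p, tmp) for p in found]

print(names)
```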
Common utility scripts often need to process command line arguments. These arguments are stored in the sys module’s argv attribute as a list. For instance the following
output results from running python demo.py one two three at the command line:
import sys
print(sys.argv)
> ['demo.py', 'one', 'two', 'three']
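Since sys.argv is an ordinary list of strings, the arguments have to be converted before you can calculate with them. The sketch below shows this with a hypothetical helper function sum_arguments(). It is called with a hand-written list so that the example does not depend on how the script is started.

```python
import sys


def sum_arguments(argv):
    """Sum all command line arguments after the script name as floats."""
    return sum(float(arg) for arg in argv[1:])


# Equivalent to running: python add.py 1.5 2 3   (add.py is a made-up name)
example_argv = ["add.py", "1.5", "2", "3"]
print(sum_arguments(example_argv))      # 6.5

# In a real script the list comes from the interpreter:
print(len(sys.argv) - 1, "argument(s) were passed to this script")
```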
6 Dates and Times
The datetime module supplies classes for manipulating dates and times in both simple
and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation.
The module also supports objects that are timezone aware.
from datetime import date
now = date.today()
now
> datetime.date(2003, 12, 2)
now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
> '12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'
birthday = date(1964, 7, 31)
age = now - birthday
age.days
> 14368
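Date arithmetic is done with timedelta objects, which represent a duration. The following sketch uses fixed dates so that the results do not depend on the day it is run. The chosen dates are arbitrary.

```python
from datetime import date, datetime, timedelta

start = date(2016, 10, 6)
deadline = start + timedelta(weeks=2)     # add a duration to a date
print(deadline)                           # 2016-10-20
print((deadline - start).days)            # 14
print(start.weekday())                    # 3, i.e. Thursday (Monday is 0)

moment = datetime(2016, 10, 6, 14, 30)    # date and time of day combined
later = moment + timedelta(hours=10)
print(later.isoformat())                  # 2016-10-07T00:30:00
```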
7 Time and Performance Measurement
It can be advantageous to know which solution to a problem runs faster. For that
Python provides the timeit module. You can measure the execution time of a code
part that you pass as a string.
from timeit import Timer
Timer('t=a; a=b; b=t', 'a=1; b=2').timeit(10000000)
> 0.34298061200024677
Timer('a,b = b,a', 'a=1; b=2').timeit(10000000)
> 0.3068590229995607
The above code measures the performance of two different approaches to swapping
the contents of two variables. The first argument to Timer() is the code that is timed
and the second is setup code that is executed once before the measurement starts. The
argument to timeit() is the number of times the code is executed. The return value is
the total time in seconds for all executions. You see
that the second approach is marginally faster. Another way to do it is to use
time.perf_counter(), which returns the value of a clock with the highest available resolution in seconds
(time.clock(), which older Python versions offered for this purpose, was removed in Python 3.8):
import time
start = time.perf_counter()
# here I can test code
end = time.perf_counter()
print('elapsed time {:.3f}s'.format(end-start))
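Instead of a string, timeit also accepts a callable. The sketch below compares the two swapping approaches wrapped in functions and then times a single piece of code by hand. The function names are hypothetical, and the number of repetitions is deliberately small to keep the run short. The manual measurement assumes a current Python 3 interpreter that provides time.perf_counter().

```python
import time
import timeit


def swap_with_temp(a, b):
    """Swap two values using a temporary variable."""
    t = a
    a = b
    b = t
    return a, b


def swap_with_tuple(a, b):
    """Swap two values using tuple unpacking."""
    a, b = b, a
    return a, b


n = 100000   # number of executions per measurement
t_temp = timeit.timeit(lambda: swap_with_temp(1, 2), number=n)
t_tuple = timeit.timeit(lambda: swap_with_tuple(1, 2), number=n)

# timeit returns the total time, so divide by n to get the time per call.
print("per call: {:.1f} ns vs {:.1f} ns".format(1e9 * t_temp / n, 1e9 * t_tuple / n))

start = time.perf_counter()
total = sum(i * i for i in range(n))     # the code to be timed
elapsed = time.perf_counter() - start
print("elapsed time {:.6f}s".format(elapsed))
```

The measured times depend on the machine and vary from run to run, so only compare numbers obtained under the same conditions.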