An introduction to scientific programming with Python
Session 5: Extreme Python
Steven Bamford
• Managing your environment
• Efficiently handling large datasets
• Optimising your code
• Squeezing out extra speed
• Writing robust code
• Graphical interfaces
Managing your environment
• Some good things about Python
  • lots of modules from many sources
  • ongoing development of Python and modules
• Some bad things about Python
  • lots of modules from many sources
  • ongoing development of Python and modules
A solution
• Maintain (or have the option to create) separate environments (or manifests) for different projects
• virtualenv
  • general Python solution – http://virtualenv.pypa.io
  • modules are installed with pip – https://pip.pypa.io

$ pip install virtualenv          # install virtualenv
$ virtualenv ENV1                 # create a new environment ENV1
$ source ENV1/bin/activate        # set PATH to our environment
(ENV1)$ pip install emcee         # install modules into ENV1
(ENV1)$ pip install numpy==1.8.2  # install a specific version
(ENV1)$ python                    # use our custom environment
(ENV1)$ deactivate                # return our PATH to normal
• virtualenv
  • can record the current state of modules to a 'requirements' file
  • using that file, can always recreate the same environment

(ENV1)$ pip freeze > requirements.txt
$ cat requirements.txt
emcee==2.1.0
numpy==1.8.2
$ deactivate
$ virtualenv ENV2
$ source ENV2/bin/activate
(ENV2)$ pip install -r requirements.txt
• conda – http://conda.pydata.org
  • specific to the Anaconda Python distribution
  • similar to 'pip', but can install binaries (not just Python)
  • avoid using pip within a conda environment (although possible)

$ conda create -n ENV1                           # create a new environment ENV1
$ source activate ENV1                           # set PATH to our environment
$ conda install numpy                            # install modules into ENV1
$ conda install -c thebamf emcee                 # install from binstar
$ source deactivate                              # return our PATH to normal
$ conda list -n ENV1 -e > requirements.txt       # record state of ENV1
$ conda create -n ENV2 --file requirements.txt   # clone ENV1 to ENV2
• Updating packages

$ conda update --all
$ conda update scipy emcee

OR

$ pip install --upgrade scipy emcee
Efficiently handling large datasets
• Python has tools for accessing most (all?) databases
  • e.g. MySQL, SQLite, MongoDB, Postgres, … (see the sqlite3 sketch below)
• Allow one to work with huge datasets
• Data can be at remote locations
• However, most databases are designed for webserver use
  • not optimised for data analysis
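As a minimal illustration of database access from Python, here is a sketch using the standard-library sqlite3 module (the file name, table, and values are invented for the example):

import sqlite3

# create (or open) a database file and insert some rows
conn = sqlite3.connect('mydata.db')   # hypothetical filename
c = conn.cursor()
c.execute('CREATE TABLE IF NOT EXISTS stars (name TEXT, mag REAL)')
c.executemany('INSERT INTO stars VALUES (?, ?)',
              [('Vega', 0.03), ('Deneb', 1.25)])
conn.commit()

# query with ordinary SQL
for name, mag in c.execute('SELECT * FROM stars WHERE mag < 1.0'):
    print(name, mag)
conn.close()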
• PyTables – http://pytables.github.io
• For creating, storing and analysing datasets
  • from simple, small tables to complex, huge datasets
  • standard HDF5 file format
  • incredibly fast – even faster with indexing
  • uses on-the-fly block compression
  • designed for modern systems
    • fast multi-core CPU; large, slow memory
    • "in-kernel" – data and algorithm are sent to CPU in optimal way
    • "out-of-core" – avoids loading whole dataset into memory
• Can store many things in one HDF5 file (like FITS)
• Tree structure
  • everything in a group (starting with the root group, '/')
  • data stored in leaves
• Arrays (e.g. n-dimensional images)

>>> from tables import *
>>> from numpy import arange, sqrt
>>> h5file = openFile("test.h5", mode="w")
>>> x = h5file.createArray("/", "x", arange(1000))
>>> y = h5file.createArray("/", "y", sqrt(arange(1000)))
>>> h5file.close()
• Tables (columns with different formats)
  • described by a class
  • accessed by a row iterator

>>> h5file = openFile("test.h5", mode="a")   # reopen to add more nodes
>>> class MyTable(IsDescription):
...     z = Float32Col()
>>> table = h5file.createTable("/", "mytable", MyTable)
>>> row = table.row
>>> for i in xrange(1000):
...     row["z"] = i**(3.0/2.0)
...     row.append()
>>> table.flush()
>>> z = table.cols.z
• Expr enables in-kernel & out-of-core operations

>>> import numpy as np
>>> x, y = h5file.root.x, h5file.root.y   # re-grab the arrays after reopening
>>> r = h5file.createArray("/", "r", np.zeros(1000))
>>> xyz = Expr("x*y*z")
>>> xyz.setOutput(r)
>>> xyz.eval()
/r (Array(1000,)) ''
  atom := Float64Atom(shape=(), dflt=0.0)
  maindim := 0
  flavor := 'numpy'
  byteorder := 'little'
  chunkshape := None
>>> r.read(0, 10)
array([   0.        ,    1.        ,    7.99999986,   26.9999989 ,
         64.        ,  124.99999917,  216.00000085,  343.00001259,
        511.99999124,  729.        ])
• where enables in-kernel selections

>>> r_bigish = [ row['z'] for row in
...              table.where('(z > 1000) & (z <= 2000)') ]
>>> for big in table.where('z > 10000'):
...     print('A big z is {}'.format(big['z']))

• There is also a where in Expr
• Pandas – Python Data Analysis Library – http://pandas.pydata.org
• Easy-to-use data structures
  • DataFrame (more friendly recarray)
  • handles missing data (more friendly masked array)
  • read and write various data formats
  • data-alignment
• Easy to combine data tables
• Surprisingly fast!
• Tries to be helpful, though not always intuitive
1D Series, 2D DataFrame, 3D Panel Various ways of creating these structures
>>> import pandas as pd
>>> t = np.arange(0.0, 1.0, 0.1)
>>> s = pd.Series(t)
>>> x = t + np.random.normal(scale=0.1, size=t.shape)
>>> y = x**2 + np.random.normal(scale=0.5, size=t.shape)
>>> df1 = pd.DataFrame({ 'x' : x, 'y' : y}, index = t)
>>> df1.plot()
>>> df1.plot('x', 'y', kind='scatter')
• Always indexed
• Indices used when joining tables

>>> t2 = t[::2]
>>> z = t2**3 + np.random.normal(scale=1.0, size=t2.shape)
>>> df2 = pd.DataFrame({'z': z, 'z2': z**2}, index=t2)
>>> df3 = df1.join(df2)
>>> df3.sort('y')
• Handling data is fairly intuitive, similar to numpy slices (see the indexing sketch below)
• Powerful and flexible
• Good documentation
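A quick sketch of how DataFrame indexing feels in practice (df3 and its column names come from the join example above; the particular selections are illustrative):

>>> df3['z']              # select a column by name
>>> df3[df3.y > 0.5]      # boolean selection, like numpy
>>> df3.loc[0.2:0.6]      # slice rows by index label
>>> df3.iloc[2:5]         # slice rows by integer position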
• A DataFrame can be created directly from a recarray
• Pandas data structures can be efficiently stored on disk
  • based on PyTables

import pyfits
import pandas as pd

data = pyfits.getdata('myfunkydata.fits')
df = pd.DataFrame(data)

# store DataFrame to HDF5
store = pd.HDFStore('myfunkydata.h5', mode='w',
                    complib='blosc', complevel=5)
store['funkydata'] = df

# some time later…
store = pd.HDFStore('myfunkydata.h5', mode='r')
df = store['funkydata']
Graphical interfaces
• Several modules to construct GUIs in Python
• wxPython is one of the most popular – http://www.wxpython.org
  • e.g., https://github.com/bamford/control/ (a minimal sketch below)
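A minimal wxPython sketch, just to show the basic pattern (the frame title and size are arbitrary):

import wx

# every wxPython program needs exactly one App object
app = wx.App()

# a Frame is a top-level window
frame = wx.Frame(None, title='My scientific code', size=(300, 200))
frame.Show()

# hand control to the event loop
app.MainLoop()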
• Django – "a high-level Python Web framework that encourages rapid development and clean, pragmatic design"
  • and many others, e.g. Zope (massive), web2py (light), …
• Give your scientific code a friendly face!
  • easy configuration
  • monitor progress
  • particularly for public code, cloud computing, HPC
• An (unscientific) example, followed by another one
Optimising your code
• timeit – use in interpreter, script or command line

python -m timeit [-n N] [-r N] [-s S] [statement ...]

Options:
  -s S, --setup=S    statement to be executed once initially (default: pass)
  -n N, --number=N   how many times to execute 'statement' (default: take ~0.2 sec total)
  -r N, --repeat=N   how many times to repeat the timer (default: 3)

• IPython magic versions:

%timeit   # one line
%%timeit  # whole notebook cell

# fastest way to calculate x**5?
$ python -m timeit -s 'from math import pow; x = 1.23' 'x*x*x*x*x'
10000000 loops, best of 3: 0.161 usec per loop
$ python -m timeit -s 'from math import pow; x = 1.23' 'x**5'
10000000 loops, best of 3: 0.111 usec per loop
$ python -m timeit -s 'from math import pow; x = 1.23' 'pow(x, 5)'
1000000 loops, best of 3: 0.184 usec per loop
• cProfile: understand which parts of your code limit its execution time
  • print summary to screen, or save file for detailed analysis

From the shell:
$ python -m cProfile -o program.prof my_program.py

From IPython:
%prun -D program.prof my_function()
%%prun   # profile an entire notebook cell

Lots of functionality… see docs (and the in-script sketch below)
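Profiling can also be driven from inside a script; a minimal sketch (my_function is a stand-in for your own code):

import cProfile
import pstats

def my_function():
    # stand-in workload for something worth profiling
    return sum(i**2 for i in range(100000))

# run the call under the profiler and save the stats to a file
cProfile.run('my_function()', 'program.prof')

# load the stats and print the 10 most expensive calls
stats = pstats.Stats('program.prof')
stats.sort_stats('cumulative').print_stats(10)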
• Nice visualisation with snakeviz – http://jiffyclub.github.io/snakeviz/

$ conda install -c thebamf snakeviz
OR
$ pip install snakeviz

In IPython:
%load_ext snakeviz
%snakeviz my_function()
%%snakeviz   # profile entire cell
Writing robust code
• Several testing frameworks
  • unittest is the main Python module (a minimal sketch below)
  • nose is a third-party module that nicely automates testing
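A minimal unittest sketch (the hypotenuse function and test values are invented for the example):

import unittest

def hypotenuse(a, b):
    """Length of the hypotenuse of a right triangle."""
    return (a**2 + b**2) ** 0.5

class TestHypotenuse(unittest.TestCase):
    def test_3_4_5_triangle(self):
        self.assertAlmostEqual(hypotenuse(3.0, 4.0), 5.0)

    def test_zero(self):
        self.assertEqual(hypotenuse(0.0, 0.0), 0.0)

if __name__ == '__main__':
    unittest.main()   # run all TestCase classes in this file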
Squeezing out extra speed
• Python includes modules for writing "parallel" programs:
  • threading – limited by the Global Interpreter Lock
  • multiprocessing – generally more useful

from multiprocessing import Pool

def f(x):
    return x*x

pool = Pool(processes=4)   # start 4 worker processes
z = range(10)
print(pool.map(f, z))      # apply f to each element of z in parallel
from multiprocessing import Process
from time import sleep

def f(name):
    print('Hello {}, I am going to sleep now'.format(name))
    sleep(3)
    print('OK, finished sleeping')

if __name__ == '__main__':
    p = Process(target=f, args=('Steven',))
    p.start()                 # start additional process
    sleep(1)                  # carry on doing stuff
    print('Wow, how lazy is that function!')
    p.join()                  # wait for process to complete

$ python thinking.py
Hello Steven, I am going to sleep now
Wow, how lazy is that function!
OK, finished sleeping

(Really, should use a lock to avoid writing output to screen at the same time; a sketch follows below.)
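A sketch of the same example with a multiprocessing.Lock guarding the print calls, so the two processes cannot interleave their output (the structure follows the example above):

from multiprocessing import Process, Lock
from time import sleep

def f(lock, name):
    with lock:   # only one process may print at a time
        print('Hello {}, I am going to sleep now'.format(name))
    sleep(3)
    with lock:
        print('OK, finished sleeping')

if __name__ == '__main__':
    lock = Lock()
    p = Process(target=f, args=(lock, 'Steven'))
    p.start()
    sleep(1)
    with lock:
        print('Wow, how lazy is that function!')
    p.join()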
• Cython is used for compiling Python-like code to machine code
  • supports a big subset of the Python language
  • conditions and loops run 2-8x faster, overall 30% faster for plain Python code
  • add types for speedups (hundreds of times)
  • easily use native libraries (C/C++/Fortran) directly
• Cython code is turned into C code
  • uses the CPython API and runtime
• Coding in Cython is like coding in Python and C at the same time!

Some material borrowed from Dag Sverre Seljebotn (University of Oslo), EuroSciPy 2010 presentation
Use cases:
• Performance-critical code
  • which does not translate to an array-based approach (numpy / pytables)
  • existing Python code → optimise critical parts
• Wrapping existing C/C++ libraries
  • particularly higher-level Pythonised wrappers
  • for one-to-one wrapping other tools might be better suited
Cython code must be compiled. Two stages:
• A .pyx file is compiled by Cython to a .c file, containing the code of a Python extension module
• The .c file is compiled by a C compiler
  • generated C code can be built without Cython installed
  • Cython is a developer dependency, not a build-time dependency
  • generated C code works with Python 2.3+
• The result is a .so file (or .pyd on Windows) which can be imported directly into a Python session
Ways of building Cython code:
• Run the cython command-line utility and compile the resulting C file
  • use your favourite build tool
  • for cross-system operation you need to query Python for the C build options to use
• Use pyximport to import Cython .pyx files as if they were .py files, building on the fly (recommended to start)
  • things get complicated if you must link to native libraries
• Write a distutils setup.py (a minimal sketch below)
  • the standard way of distributing, building and installing Python modules
  • larger projects tend to need a build phase anyway
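A minimal distutils setup.py sketch for the cprimes.pyx example used later (the module name is the only assumption):

# setup.py
from distutils.core import setup
from Cython.Build import cythonize

# cythonize() translates the .pyx file to C and builds an extension module
setup(ext_modules=cythonize("cprimes.pyx"))

Build in place with:
$ python setup.py build_ext --inplace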
• Cython supports most of normal Python
• Most standard Python code can be used directly with Cython
  • typical speedups of (very roughly) a factor of two
  • should not ever slow down code – safe to try
  • name file .pyx or use pyimport=True

>>> import pyximport
>>> pyximport.install()
>>> import mypyxmodule   # converts and compiles on the fly
>>> pyximport.install(pyimport=True)
>>> import mypymodule    # converts and compiles on the fly
                         # should fall back to Python if it fails
• Big speedup from defining types of key variables
  • use native C types (int, double, char *, etc.)
  • use Python C types (Py_int_t, Py_float_t, etc.)
• Use cdef to declare variable types
  • also use cdef to declare C-only functions (with return type)
  • can also use cpdef to declare functions which are automatically treated as C or Python depending on usage
  • don't forget function arguments (but note cdef is not used for them)
• An efficient algorithm to find the first N prime numbers:
primes.py:

def primes(kmax):
    p = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            k = k + 1
            p.append(n)
        n = n + 1
    return p

$ python -m timeit -s 'import primes as p' 'p.primes(100)'
1000 loops, best of 3: 1.35 msec per loop
cprimes.pyx:

def primes(kmax):
    p = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            k = k + 1
            p.append(n)
        n = n + 1
    return p

$ python -m timeit -s 'import pyximport;
pyximport.install(); import cprimes as p' 'p.primes(100)'
1000 loops, best of 3: 731 usec per loop

1.8x speedup
xprimes.pyx (the inner loops now contain only C code):

def primes(int kmax):          # declare types of parameters
    cdef int n, k, i           # declare types of variables
    cdef int p[1000]           # including arrays
    result = []                # can still use normal Python types
    if kmax > 1000:            # in this case need to hardcode limit
        kmax = 1000
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
            result.append(n)
        n = n + 1
    return result              # return Python object

40.8 usec per loop
33x speedup
• Cython provides a way to quickly access numpy arrays with specified types and dimensionality
  → for implementing fast specific algorithms (a sketch below)
• Also provides a way to create generalized ufuncs
• Can be useful, but often using functions provided by numpy, scipy, numexpr or pytables will be easier and faster
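A sketch of typed access to a numpy array in Cython, using the typed-memoryview syntax (the file and function names are invented for the example):

# fastsum.pyx
def sum1d(double[:] a):        # typed memoryview: 1D array of doubles
    cdef int i
    cdef double total = 0.0
    for i in range(a.shape[0]):
        total += a[i]          # element access compiles to plain C indexing
    return total

After building (e.g. via pyximport), calling sum1d(np.random.random(1000)) runs the loop at C speed.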
• Numba – JIT: just-in-time compilation of Python functions
• Compilation for both CPU and GPU hardware

import numpy as np
from numba import jit

@jit
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i, j]
    return result

a = np.arange(10000).reshape(1000, 10)

%timeit sum2d(a)   # with @jit
best of 3: 2.15 µs per loop
%timeit sum2d(a)   # without @jit
best of 3: 334 µs per loop
An introduction to scientific programming with Python
The End