An introduction to scientific programming with Python
Session 5: Extreme Python
Steven Bamford
• Managing your environment
• Efficiently handling large datasets
• Optimising your code
• Squeezing out extra speed
• Writing robust code
• Graphical interfaces
Managing your environment
• Some good things about Python
  • lots of modules from many sources
  • ongoing development of Python and modules
• Some bad things about Python
  • lots of modules from many sources
  • ongoing development of Python and modules
A solution
• Maintain (or have the option to create) separate environments (or manifests) for different projects
• virtualenv
  • general Python solution – http://virtualenv.pypa.io
  • modules are installed with pip – https://pip.pypa.io

$ pip install virtualenv          # install virtualenv
$ virtualenv ENV1                 # create a new environment ENV1
$ source ENV1/bin/activate        # set PATH to our environment
(ENV1)$ pip install emcee         # install modules into ENV1
(ENV1)$ pip install numpy==1.8.2  # install a specific version
(ENV1)$ python                    # use our custom environment
(ENV1)$ deactivate                # return our PATH to normal
• virtualenv
  • can record the current state of modules to a 'requirements' file
  • using that file, can always recreate the same environment

(ENV1)$ pip freeze > requirements.txt
$ cat requirements.txt
emcee==2.1.0
numpy==1.8.2
$ deactivate
$ virtualenv ENV2
$ source ENV2/bin/activate
(ENV2)$ pip install -r requirements.txt
• conda – http://conda.pydata.org
  • specific to the Anaconda Python distribution
  • similar to 'pip', but can install binaries (not just Python)
  • avoid using pip within a conda environment (although possible)

$ conda create -n ENV1                           # create a new environment ENV1
$ source activate ENV1                           # set PATH to our environment
$ conda install numpy                            # install modules into ENV1
$ conda install -c thebamf emcee                 # install from binstar
$ source deactivate                              # return our PATH to normal
$ conda list -n ENV1 -e > requirements.txt       # record state of ENV1
$ conda create -n ENV2 --file requirements.txt   # clone ENV1 to ENV2
• Updating packages

$ conda update --all
$ conda update scipy emcee

OR

$ pip install --upgrade scipy emcee
Efficiently handling large datasets
• Python has tools for accessing most (all?) databases
  • e.g. MySQL, SQLite, MongoDB, Postgres, … (see the sqlite3 sketch below)
• Allow one to work with huge datasets
• Data can be at remote locations
• However, most databases are designed for webserver use
  • not optimised for data analysis
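As a minimal illustration of database access from Python, here is a sketch using the standard-library sqlite3 module (the file name, table, and values are invented for the example):

import sqlite3

# create (or open) a database file and insert some rows
conn = sqlite3.connect('mydata.db')   # hypothetical filename
c = conn.cursor()
c.execute('CREATE TABLE IF NOT EXISTS stars (name TEXT, mag REAL)')
c.executemany('INSERT INTO stars VALUES (?, ?)',
              [('Vega', 0.03), ('Deneb', 1.25)])
conn.commit()

# query with ordinary SQL
for name, mag in c.execute('SELECT * FROM stars WHERE mag < 1.0'):
    print(name, mag)
conn.close()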
• PyTables – http://pytables.github.io
• For creating, storing and analysing datasets
  • from simple, small tables to complex, huge datasets
  • standard HDF5 file format
  • incredibly fast – even faster with indexing
  • uses on-the-fly block compression
  • designed for modern systems
    • fast multi-core CPU; large, slow memory
    • "in-kernel" – data and algorithm are sent to CPU in optimal way
    • "out-of-core" – avoids loading whole dataset into memory
• Can store many things in one HDF5 file (like FITS)
• Tree structure
  • everything in a group (starting with the root group, '/')
  • data stored in leaves
• Arrays (e.g. n-dimensional images)

>>> from tables import *
>>> from numpy import arange, sqrt
>>> h5file = openFile("test.h5", mode="w")
>>> x = h5file.createArray("/", "x", arange(1000))
>>> y = h5file.createArray("/", "y", sqrt(arange(1000)))
>>> h5file.close()
• Tables (columns with different formats)
  • described by a class
  • accessed by a row iterator

>>> h5file = openFile("test.h5", mode="a")   # reopen to add more nodes
>>> class MyTable(IsDescription):
...     z = Float32Col()
>>> table = h5file.createTable("/", "mytable", MyTable)
>>> row = table.row
>>> for i in xrange(1000):
...     row["z"] = i**(3.0/2.0)
...     row.append()
>>> table.flush()
>>> z = table.cols.z
• Expr enables in-kernel & out-of-core operations

>>> import numpy as np
>>> x, y = h5file.root.x, h5file.root.y   # re-grab the arrays after reopening
>>> r = h5file.createArray("/", "r", np.zeros(1000))
>>> xyz = Expr("x*y*z")
>>> xyz.setOutput(r)
>>> xyz.eval()
/r (Array(1000,)) ''
  atom := Float64Atom(shape=(), dflt=0.0)
  maindim := 0
  flavor := 'numpy'
  byteorder := 'little'
  chunkshape := None
>>> r.read(0, 10)
array([   0.        ,    1.        ,    7.99999986,   26.9999989 ,
         64.        ,  124.99999917,  216.00000085,  343.00001259,
        511.99999124,  729.        ])
• where enables in-kernel selections

>>> r_bigish = [ row['z'] for row in
...              table.where('(z > 1000) & (z <= 2000)') ]
>>> for big in table.where('z > 10000'):
...     print('A big z is {}'.format(big['z']))

• There is also a where in Expr
• Pandas – Python Data Analysis Library – http://pandas.pydata.org
• Easy-to-use data structures
  • DataFrame (more friendly recarray)
  • handles missing data (more friendly masked array)
  • read and write various data formats
  • data-alignment
• Easy to combine data tables
• Surprisingly fast!
• Tries to be helpful, though not always intuitive
1D Series, 2D DataFrame, 3D Panel Various ways of creating these structures
>>> import pandas as pd
>>> t = np.arange(0.0, 1.0, 0.1)
>>> s = pd.Series(t)
>>> x = t + np.random.normal(scale=0.1, size=t.shape)
>>> y = x**2 + np.random.normal(scale=0.5, size=t.shape)
>>> df1 = pd.DataFrame({ 'x' : x, 'y' : y}, index = t)
>>> df1.plot()
>>> df1.plot('x', 'y', kind='scatter')
• Always indexed
• Indices used when joining tables

>>> t2 = t[::2]
>>> z = t2**3 + np.random.normal(scale=1.0, size=t2.shape)
>>> df2 = pd.DataFrame({'z': z, 'z2': z**2}, index=t2)
>>> df3 = df1.join(df2)
>>> df3.sort('y')
• Handling data is fairly intuitive, similar to numpy slices (see the indexing sketch below)
• Powerful and flexible
• Good documentation
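A quick sketch of how DataFrame indexing feels in practice (df3 and its column names come from the join example above; the particular selections are illustrative):

>>> df3['z']              # select a column by name
>>> df3[df3.y > 0.5]      # boolean selection, like numpy
>>> df3.loc[0.2:0.6]      # slice rows by index label
>>> df3.iloc[2:5]         # slice rows by integer position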
• A DataFrame can be created directly from a recarray
• Pandas data structures can be efficiently stored on disk
  • based on PyTables

import pyfits
import pandas as pd

data = pyfits.getdata('myfunkydata.fits')
df = pd.DataFrame(data)

# store DataFrame to HDF5
store = pd.HDFStore('myfunkydata.h5', mode='w',
                    complib='blosc', complevel=5)
store['funkydata'] = df

# some time later…
store = pd.HDFStore('myfunkydata.h5', mode='r')
df = store['funkydata']
Graphical interfaces
• Several modules to construct GUIs in Python
• wxPython is one of the most popular – http://www.wxpython.org
  • e.g., https://github.com/bamford/control/ (a minimal sketch below)
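A minimal wxPython sketch, just to show the basic pattern (the frame title and size are arbitrary):

import wx

# every wxPython program needs exactly one App object
app = wx.App()

# a Frame is a top-level window
frame = wx.Frame(None, title='My scientific code', size=(300, 200))
frame.Show()

# hand control to the event loop
app.MainLoop()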
• Django – "a high-level Python Web framework that encourages rapid development and clean, pragmatic design"
  • and many others, e.g. Zope (massive), web2py (light), …
• Give your scientific code a friendly face!
  • easy configuration
  • monitor progress
  • particularly for public code, cloud computing, HPC
• An (unscientific) example, followed by another one
Optimising your code
• timeit – use in interpreter, script or command line

python -m timeit [-n N] [-r N] [-s S] [statement ...]

Options:
  -s S, --setup=S    statement to be executed once initially (default: pass)
  -n N, --number=N   how many times to execute 'statement' (default: take ~0.2 sec total)
  -r N, --repeat=N   how many times to repeat the timer (default: 3)

• IPython magic versions:

%timeit   # one line
%%timeit  # whole notebook cell

# fastest way to calculate x**5?
$ python -m timeit -s 'from math import pow; x = 1.23' 'x*x*x*x*x'
10000000 loops, best of 3: 0.161 usec per loop
$ python -m timeit -s 'from math import pow; x = 1.23' 'x**5'
10000000 loops, best of 3: 0.111 usec per loop
$ python -m timeit -s 'from math import pow; x = 1.23' 'pow(x, 5)'
1000000 loops, best of 3: 0.184 usec per loop
• cProfile: understand which parts of your code limit its execution time
  • print summary to screen, or save file for detailed analysis

From the shell:
$ python -m cProfile -o program.prof my_program.py

From IPython:
%prun -D program.prof my_function()
%%prun   # profile an entire notebook cell

Lots of functionality… see docs (and the in-script sketch below)
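Profiling can also be driven from inside a script; a minimal sketch (my_function is a stand-in for your own code):

import cProfile
import pstats

def my_function():
    # stand-in workload for something worth profiling
    return sum(i**2 for i in range(100000))

# run the call under the profiler and save the stats to a file
cProfile.run('my_function()', 'program.prof')

# load the stats and print the 10 most expensive calls
stats = pstats.Stats('program.prof')
stats.sort_stats('cumulative').print_stats(10)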
• Nice visualisation with snakeviz – http://jiffyclub.github.io/snakeviz/

$ conda install -c thebamf snakeviz
OR
$ pip install snakeviz

In IPython:
%load_ext snakeviz
%snakeviz my_function()
%%snakeviz   # profile entire cell
Writing robust code
• Several testing frameworks
  • unittest is the main Python module (a minimal sketch below)
  • nose is a third-party module that nicely automates testing
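A minimal unittest sketch (the hypotenuse function and test values are invented for the example):

import unittest

def hypotenuse(a, b):
    """Length of the hypotenuse of a right triangle."""
    return (a**2 + b**2) ** 0.5

class TestHypotenuse(unittest.TestCase):
    def test_3_4_5_triangle(self):
        self.assertAlmostEqual(hypotenuse(3.0, 4.0), 5.0)

    def test_zero(self):
        self.assertEqual(hypotenuse(0.0, 0.0), 0.0)

if __name__ == '__main__':
    unittest.main()   # run all TestCase classes in this file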
Squeezing out extra speed
• Python includes modules for writing "parallel" programs:
  • threading – limited by the Global Interpreter Lock
  • multiprocessing – generally more useful

from multiprocessing import Pool

def f(x):
    return x*x

pool = Pool(processes=4)   # start 4 worker processes
z = range(10)
print(pool.map(f, z))      # apply f to each element of z in parallel
from multiprocessing import Process
from time import sleep

def f(name):
    print('Hello {}, I am going to sleep now'.format(name))
    sleep(3)
    print('OK, finished sleeping')

if __name__ == '__main__':
    p = Process(target=f, args=('Steven',))
    p.start()                 # start additional process
    sleep(1)                  # carry on doing stuff
    print('Wow, how lazy is that function!')
    p.join()                  # wait for process to complete

$ python thinking.py
Hello Steven, I am going to sleep now
Wow, how lazy is that function!
OK, finished sleeping

(Really, should use a lock to avoid writing output to screen at the same time; a sketch follows below.)
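A sketch of the same example with a multiprocessing.Lock guarding the print calls, so the two processes cannot interleave their output (the structure follows the example above):

from multiprocessing import Process, Lock
from time import sleep

def f(lock, name):
    with lock:   # only one process may print at a time
        print('Hello {}, I am going to sleep now'.format(name))
    sleep(3)
    with lock:
        print('OK, finished sleeping')

if __name__ == '__main__':
    lock = Lock()
    p = Process(target=f, args=(lock, 'Steven'))
    p.start()
    sleep(1)
    with lock:
        print('Wow, how lazy is that function!')
    p.join()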
• Cython is used for compiling Python-like code to machine code
  • supports a big subset of the Python language
  • conditions and loops run 2-8x faster, overall 30% faster for plain Python code
  • add types for speedups (hundreds of times)
  • easily use native libraries (C/C++/Fortran) directly
• Cython code is turned into C code
  • uses the CPython API and runtime
• Coding in Cython is like coding in Python and C at the same time!

Some material borrowed from Dag Sverre Seljebotn (University of Oslo), EuroSciPy 2010 presentation
Use cases:
• Performance-critical code
  • which does not translate to an array-based approach (numpy / pytables)
  • existing Python code → optimise critical parts
• Wrapping existing C/C++ libraries
  • particularly higher-level Pythonised wrappers
  • for one-to-one wrapping other tools might be better suited
Cython code must be compiled. Two stages:
• A .pyx file is compiled by Cython to a .c file, containing the code of a Python extension module
• The .c file is compiled by a C compiler
  • generated C code can be built without Cython installed
  • Cython is a developer dependency, not a build-time dependency
  • generated C code works with Python 2.3+
• The result is a .so file (or .pyd on Windows) which can be imported directly into a Python session
Ways of building Cython code:
• Run the cython command-line utility and compile the resulting C file
  • use your favourite build tool
  • for cross-system operation you need to query Python for the C build options to use
• Use pyximport to import Cython .pyx files as if they were .py files, building on the fly (recommended to start)
  • things get complicated if you must link to native libraries
• Write a distutils setup.py (a minimal sketch below)
  • the standard way of distributing, building and installing Python modules
  • larger projects tend to need a build phase anyway
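A minimal distutils setup.py sketch for the cprimes.pyx example used later (the module name is the only assumption):

# setup.py
from distutils.core import setup
from Cython.Build import cythonize

# cythonize() translates the .pyx file to C and builds an extension module
setup(ext_modules=cythonize("cprimes.pyx"))

Build in place with:
$ python setup.py build_ext --inplace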
• Cython supports most of normal Python
• Most standard Python code can be used directly with Cython
  • typical speedups of (very roughly) a factor of two
  • should not ever slow down code – safe to try
  • name file .pyx or use pyimport=True

>>> import pyximport
>>> pyximport.install()
>>> import mypyxmodule   # converts and compiles on the fly
>>> pyximport.install(pyimport=True)
>>> import mypymodule    # converts and compiles on the fly
                         # should fall back to Python if it fails
• Big speedup from defining types of key variables
  • use native C types (int, double, char *, etc.)
  • use Python C types (Py_int_t, Py_float_t, etc.)
• Use cdef to declare variable types
  • also use cdef to declare C-only functions (with return type)
  • can also use cpdef to declare functions which are automatically treated as C or Python depending on usage
  • don't forget function arguments (but note cdef is not used for them)
• An efficient algorithm to find the first N prime numbers:
primes.py:

def primes(kmax):
    p = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            k = k + 1
            p.append(n)
        n = n + 1
    return p

$ python -m timeit -s 'import primes as p' 'p.primes(100)'
1000 loops, best of 3: 1.35 msec per loop
cprimes.pyx:

def primes(kmax):
    p = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            k = k + 1
            p.append(n)
        n = n + 1
    return p

$ python -m timeit -s 'import pyximport;
pyximport.install(); import cprimes as p' 'p.primes(100)'
1000 loops, best of 3: 731 usec per loop

1.8x speedup
xprimes.pyx (the inner loops now contain only C code):

def primes(int kmax):          # declare types of parameters
    cdef int n, k, i           # declare types of variables
    cdef int p[1000]           # including arrays
    result = []                # can still use normal Python types
    if kmax > 1000:            # in this case need to hardcode limit
        kmax = 1000
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
            result.append(n)
        n = n + 1
    return result              # return Python object

40.8 usec per loop
33x speedup
• Cython provides a way to quickly access numpy arrays with specified types and dimensionality
  → for implementing fast specific algorithms (a sketch below)
• Also provides a way to create generalized ufuncs
• Can be useful, but often using functions provided by numpy, scipy, numexpr or pytables will be easier and faster
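A sketch of typed access to a numpy array in Cython, using the typed-memoryview syntax (the file and function names are invented for the example):

# fastsum.pyx
def sum1d(double[:] a):        # typed memoryview: 1D array of doubles
    cdef int i
    cdef double total = 0.0
    for i in range(a.shape[0]):
        total += a[i]          # element access compiles to plain C indexing
    return total

After building (e.g. via pyximport), calling sum1d(np.random.random(1000)) runs the loop at C speed.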
• Numba – JIT: just-in-time compilation of Python functions
• Compilation for both CPU and GPU hardware

import numpy as np
from numba import jit

@jit
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i, j]
    return result

a = np.arange(10000).reshape(1000, 10)

%timeit sum2d(a)   # with @jit
best of 3: 2.15 µs per loop
%timeit sum2d(a)   # without @jit
best of 3: 334 µs per loop
An introduction to scientific programming with Python
The End