Python for High Productivity Computing
July 2009 Tutorial

Tutorial Outline
• Basic Python
• IPython : Interactive Python
• Advanced Python
• NumPy : High performance arrays for Python
• Matplotlib : Basic plotting tools for Python
• MPI4py : Parallel programming with Python
• F2py and SWIG : Language interoperability
• Extra Credit
  – SciPy and SAGE : mathematical and scientific computing
  – Traits : Typing system for Python
  – Dune : A Python CCA-Compliant component framework
  – SportsRatingSystem : Linear algebra example

Tutorial Goals
• This tutorial is intended to introduce Python as a tool for high productivity scientific software development.
• Today you should leave here with a better understanding of…
  – The basics of Python, particularly for scientific and numerical computing.
  – Toolkits and packages relevant to specific numerical tasks.
  – How Python is similar to tools like MATLAB or GNU Octave.
  – How Python might be used as a component architecture.
• …And most importantly,
  – Python makes scientific programming easy, quick, and fairly painless, leaving more time to think about science and not programming.

SECTION 1
INTRODUCTION

What Is Python?
Python is an interpreted language that allows you to accomplish what you would with a compiled language, but without the complexity.
• Interpreted and interactive
• Easy to learn and use
• Fun
• Free and portable
• Automatic garbage collection
• Object-oriented and Functional
• Truly Modular
  – NumPy
  – PySparse
  – FFTW
  – Plotting
  – MPI4py
  – Co-Array Python

Running Python
    $$ ipython
    Python 2.5.1 (… Feb 6 2009 …)
    IPython 0.9.1 – An enhanced …

    # the IPython prompt
    In [1]:

    # the Python prompt, when native
    # python interpreter is run
    >>>

    # import a module
    >>> import math

    # what is math
    >>> type(math)
    <type 'module'>

    # what is in math
    >>> dir(math)
    ['__doc__',…, 'cos',…, 'pi', …]

    >>> cos(pi)
    NameError: name 'cos' is not defined

    # import into global namespace
    >>> from math import *
    >>> cos(pi)
    -1.0

    '''a comment line …'''
    # another comment style

Interactive Calculator
    # adding two values
    >>> 3 + 4
    7
    # setting a variable
    >>> a = 3
    >>> a
    3
    # checking a variable's type
    >>> type(a)
    <type 'int'>
    # an arbitrarily long integer
    >>> a = 1204386828483L
    >>> type(a)
    <type 'long'>
    # real numbers
    >>> b = 2.4/2
    >>> print b
    1.2
    >>> type(b)
    <type 'float'>
    # complex numbers
    >>> c = 2 + 1.5j
    >>> c
    (2+1.5j)
    # multiplication
    >>> a = 3
    >>> a*c
    (6+4.5j)

Online Python Documentation
    # command line documentation
    $$ pydoc math
    Help on module math:

    >>> dir(math)
    ['__doc__',
    >>> math.__doc__
    …mathematical functions defined…
    >>> help(math)
    Help on module math:
    >>> type(math)
    <type 'module'>

    # ipython documentation
    In[3]: math.<TAB>
    …math.pi math.sin math.sqrt
    In[4]: math?
    Type: module
    Base Class: <type 'module'>
    In[5]: import numpy
    In[6]: numpy??
    Source:===
    \
    NumPy
    =========

Labs!
Lab: Explore and Calculate

Strings
    # creating strings
    >>> s1 = "Hello "
    >>> s2 = 'world!'
    # string operations
    >>> s = s1 + s2
    >>> print s
    Hello world!
    >>> 3*s1
    'Hello Hello Hello '
    >>> len(s)
    12
    # the string module
    >>> import string
    # split space delimited words
    >>> word_list = string.split(s)
    >>> print word_list
    ['Hello', 'world!']
    >>> string.join(word_list)
    'Hello world!'
    >>> string.replace(s,'world', 'class')
    'Hello class!'

Labs!
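The string operations above use the Python 2 `string` module. In Python 3 the `string.split`, `string.join`, and `string.replace` functions are gone, and the equivalent `str` methods work in both versions. A minimal sketch of the same session with string methods, assuming a Python 3 interpreter (the slides themselves use Python 2); only the example strings come from the slides:

```python
# Same operations as the Strings slide, written with str methods (Python 3 syntax).
s1 = "Hello "
s2 = "world!"

s = s1 + s2                        # concatenation
repeated = 3 * s1                  # repetition
word_list = s.split()              # split on whitespace
rejoined = " ".join(word_list)     # join the words with a single space
replaced = s.replace("world", "class")

print(s)
print(len(s))
print(word_list)
print(rejoined)
print(replaced)
```

Methods keep the operation next to the object it acts on, which is why they became the preferred spelling.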
Lab: Strings

Tuples and Lists: sequence objects
    # a tuple is a collection of obj
    >>> t = (44,)     # length of one
    >>> t = (1,2,3)
    >>> print t
    (1,2,3)
    # accessing elements
    >>> t[0]
    1
    >>> t[1] = 22
    TypeError: 'tuple' object does not support item assignment

    # a list is a mutable collection
    >>> l = [1,22,3,3,4,5]
    >>> l
    [1,22,3,3,4,5]
    >>> l[1] = 2
    >>> l
    [1,2,3,3,4,5]
    >>> del l[2]
    >>> l
    [1,2,3,4,5]
    >>> len(l)
    5
    # in or not in
    >>> 4 in l
    True
    >>> 4 not in l
    False

More on Lists
    # negative indices count
    # backward from the end of
    # the list
    >>> l
    [1,2,3,4,5]
    >>> l[-1]
    5
    >>> l[-2]
    4
    >>> dir(list)
    ['__add__', 'append', 'count', 'extend', 'index', 'insert',
     'pop', 'remove', 'reverse', 'sort']
    # what does count do?
    >>> list.count
    <method 'count' of 'list'…>
    >>> help(list.count)
    'L.count(value) -> integer -- return number of occurrences of value'

Slicing
    var[lower:upper]
Slices extract a portion of a sequence (e.g., a list or a NumPy array). Mathematically the range is [lower, upper).
    >>> print l
    [1,2,3,4,5]
    # some ways to return entire
    # portion of the sequence
    >>> l[0:5]
    >>> l[0:]
    >>> l[:5]
    >>> l[:]
    [1,2,3,4,5]
    # middle three elements
    >>> l[1:4]
    >>> l[1:-1]
    >>> l[-4:-1]
    [2,3,4]
    # last two elements
    >>> l[3:]
    >>> l[-2:]
    [4,5]

Dictionaries: key/value pairs
Dictionaries store key/value pairs. Indexing a dictionary by a key returns the value associated with it.
    # create data
    >>> pos = [1.0, 2.0, 3.0, 4.0, 5.0]
    >>> T = [9.9, 8.8, 7.7, 6.6, 5.5]
    # store data in a dictionary
    >>> data_dict = {'position': pos, 'temperature': T}
    # access elements
    >>> data_dict['position']
    [1.0, 2.0, 3.0, 4.0, 5.0]

Labs!
Lab: Sequence Objects

If Statements and Loops
    # if/elif/else example
    >>> print l
    [1,2,3,4,5]
    >>> if 3 in l:
    …       print 'yes'
    … elif 3 not in l:
    …       print 'no'
    … else:
    …       print 'impossible!'
    … < hit return >
    yes

    # for loop examples
    >>> for i in range(1,3): print i
    … < hit return >
    1
    2
    >>> for x in l: print x
    … < hit return >
    1
    …

    # while loop example
    >>> i = 1
    >>> while i < 3: print i; i += 1
    … < hit return >
    1
    2

Functions
    # create a function in funcs.py
    def Celcius_to_F(T_C):
        T_F = (9./5.)*T_C + 32.
        return T_F
    '''
    Note: indentation is used for
    scoping, no braces {}
    '''

    # run from command line and
    # start up with created file
    $ python -i funcs.py
    >>> dir()
    ['Celcius_to_F', '__builtins__', …]
    # bind a correctly spelled name to the same function object
    >>> Celsius_to_F = Celcius_to_F
    >>> Celsius_to_F
    <function Celcius_to_F at …>
    >>> Celsius_to_F(0)
    32.0
    >>> C = 100.
    >>> F = Celsius_to_F(C)
    >>> print F
    212.0

Labs!
Lab: Functions

Classes
    # create a class in Complex.py
    class Complex:
        '''A simple Complex class'''
        def __init__(self, real, imag):
            '''Create and initialize'''
            self.real = real
            self.imag = imag
        def norm(self):
            '''Return the L2 Norm'''
            import math
            d = math.hypot(self.real,self.imag)
            return d
    #end class Complex

    # run from command line
    $ python -i Complex.py
    # help will display comments
    >>> help(Complex)
    Help on class Complex in module …
    # create a Complex object
    >>> c = Complex(3.0, -4.0)
    # print Complex attributes
    >>> c.real
    3.0
    >>> c.imag
    -4.0
    # execute a Complex method
    >>> c.norm()
    5.0

Labs!
Lab: Classes

SECTION 2
Interactive Python
IPython

IPython Summary
• An enhanced interactive Python shell
• An architecture for interactive parallel computing
• IPython contains
  – Object introspection
  – System shell access
  – Special interactive commands
  – Efficient environment for Python code development
• Embeddable interpreter for your own programs
• Inspired by Matlab
• Interactive testing of threaded graphical toolkits

Running IPython
    $$ ipython -pylab
    IPython 0.9.1 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object'. ?object also works, ??
    prints more.

    # %fun_name are magic commands
    # get function info
    In [1]: %history?
    Print input history (_i<n> variables), with most recent last.
    In [2]: %history
    1: #?%history
    2: _ip.magic("history ")

More IPython Commands
    # some shell commands are available
    In [27]: ls
    01-Lab-Explore.ppt* 04-Lab-Functions.ppt*

    # TAB completion for more information about objects
    In [28]: %<TAB>
    %alias      %autocall  %autoindent  %automagic
    %bg         %bookmark  %cd          %clear
    %color_info %colors    %cpaste      %debug
    %dhist      %dirs      %doctest_mode

    # retrieve Out[] values
    In [29]: 4/2
    Out[29]: 2
    In [30]: b = Out[29]
    In [31]: print b
    2

More IPython Commands
    # %run runs a Python script and loads its data into interactive
    # namespace; useful for programming
    In [32]: %run hello_script
    Hello

    # ! gives access to shell commands
    In [33]: !date
    Tue Jul 7 23:04:37 MDT 2009

    # look at logfile (see %logstart and %logstop)
    In [34]: !cat ipython_log.py
    #log# Automatic Logger file. *** THIS MUST BE THE FIRST LINE ***
    #log# DO NOT CHANGE THIS LINE OR THE TWO BELOW
    #log# opts = Struct({'__allownew': True, 'logfile': 'ipython_log.py'})
    #log# args = []
    #log# It is safe to make manual edits below here.
    #log#-----------------------------------------------------------------------
    _ip.magic("run hello")

Interactive Shell Recap
  – Object introspection (? and ??)
  – Searching in the local namespace ( TAB )
  – Numbered input/output prompts with command history
  – User-extensible magic commands ( % )
  – Alias facility for defining your own system aliases
  – Complete system shell access
  – Background execution of Python commands in a separate thread
  – Expand python variables when calling the system shell
  – Filesystem navigation via a magic ( %cd ) command
    – Bookmark with ( %bookmark )
  – A lightweight persistence framework via the ( %store ) command
  – Automatic indentation (optional)
  – Macro system for quickly re-executing multiple lines of previous input
  – Session logging and restoring
  – Auto-parentheses ( sin 3 )
  – Easy debugger access (%run -d)
  – Profiler support (%prun and %run -p)

Labs!
Lab: IPython
Try out ipython commands as time allows

SECTION 3
Advanced Python

Regular Expressions
    # The re module provides regular expression tools for advanced
    # string processing.
    >>> import re
    # Get a refresher on regular expressions
    >>> help(re)
    >>> help(re.findall)
    >>> help(re.sub)
    >>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
    ['foot', 'fell', 'fastest']
    >>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
    'cat in the hat'

Labs!
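The two regular-expression calls above can also be run from a script. A minimal sketch, assuming Python 3 syntax; the compiled-pattern variant and the variable names are illustrative additions, not part of the original slides:

```python
import re

text = "which foot or hand fell fastest"

# \b anchors at a word boundary, so only words that start with 'f' match.
f_words = re.findall(r"\bf[a-z]*", text)

# (\b[a-z]+) \1 matches a word followed by a space and the same word again;
# replacing the match with \1 collapses the doubled word.
deduped = re.sub(r"(\b[a-z]+) \1", r"\1", "cat in the the hat")

# Compiling once is convenient when the same pattern is reused many times.
pattern = re.compile(r"\bf[a-z]*")
same_words = pattern.findall(text)

print(f_words)
print(deduped)
print(same_words == f_words)
```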
Lab: Regular Expressions
Try out the re module as time allows

Fun With Functions
    # a filter returns those items
    # for which the given function returns True
    >>> def f(x): return x < 3
    >>> filter(f, [0,1,2,3,4,5,6,7])
    [0, 1, 2]

    # map applies the given function to each item in a sequence
    >>> def square(x): return x*x
    >>> map(square, range(7))
    [0, 1, 4, 9, 16, 25, 36]

    # lambda functions are small functions with no name (anonymous)
    >>> map(lambda x: x*x, range(7))
    [0, 1, 4, 9, 16, 25, 36]

More Fun With Functions
    # reduce returns a single value by applying a binary function
    >>> reduce(lambda x,y: x+y, [0,1,2,3])
    6

    # list comprehensions provide an easy way to create lists
    # [an expression followed by for then zero or more for or if]
    >>> vec = [2, 4, 6]
    >>> [3*x for x in vec]
    [6, 12, 18]
    >>> [3*x for x in vec if x > 3]
    [12, 18]
    >>> [x*y for x in vec for y in [3, 2, -1]]
    [6, 4, -2, 12, 8, -4, 18, 12, -6]

Labs!
Lab: Fun with Functions

Input/Output
    # dir(str) shows methods on str object
    # a string representation of a number
    >>> x = 3.25
    >>> 'number is' + repr(x)
    'number is3.25'
    # pad with zeros
    >>> '12'.zfill(5)
    '00012'
    # explicit formatting (Python 2.6)
    >>> import math
    >>> 'The value of {0} is approximately {1:.3f}.'.format('PI', math.pi)
    'The value of PI is approximately 3.142.'

File I/O
    # file objects need to be opened
    # some modes - 'w' (write), 'r' (read), 'a' (append)
    #            - 'r+' (read+write), 'rb' (read binary)
    >>> f = open('/tmp/workfile', 'w')
    >>> print f
    <open file '/tmp/workfile', mode 'w' at 80a0960>
    >>> help(f)
    >>> f.write('I want my binky!')
    >>> f.close()
    >>> f = open('/tmp/workfile', 'r+')
    >>> f.readline()
    'I want my binky!'

Search and Replace
    # file substitute.py
    import re
    fin = open('fadd.f90', 'r')
    p = re.compile('(subroutine)')
    try:
        while True:
            s = fin.readline()
            if s == "": break
            sout = p.sub('SUBROUTINE', s)
            print sout.replace('\n', "")   # sys.stdout.write simpler
    except:
        print "Finished reading, file"     # is this line reached?
    fin.close()

Iterators over Containers
Iterators require two methods: next() and __iter__()
Fibonacci: f[n] = f[n-1] + f[n-2]; with f[0] = f[1] = 1
    class fibnum:
        def __init__(self):
            self.fn1 = 1    # f [n-1]
            self.fn2 = 1    # f [n-2]
        def next(self):     # next() is the heart of any iterator
            oldfn2 = self.fn2
            self.fn2 = self.fn1
            self.fn1 = self.fn1 + oldfn2
            return oldfn2
        def __iter__(self):
            return self

Iterators…
    # use Fibonacci iterator class
    >>> from fibnum import *
    # construct a member of the class
    >>> f = fibnum()
    >>> l = []
    >>> for i in f:
    …       l.append(i)
    …       if i > 20: break
    >>> l
    [1, 1, 2, 3, 5, 8, 13, 21]
    # thanks to (and for more information on iterators):
    # http://heather.cs.ucdavis.edu/~matloff/Python/PyIterGen.pdf

Binary I/O
Anticipating the next module NumPy (numerical arrays), you may want to look at the file PVReadBin.py to see how binary I/O is done in a practical application.

Labs!
Lab: Input/Output
Try out file I/O as time allows

SECTION 4
NUMERICAL PYTHON

NumPy
• Offers Matlab-like capabilities within Python
• Information
  – http://numpy.scipy.org/
• Download
  – http://sourceforge.net/projects/numpy/files/
• Numeric developers (initial coding Jim Hugunin)
  – Paul Dubois
  – Travis Oliphant
  – Konrad Hinsen
  – Charles Waldman

Creating Array: Basics
    >>> from numpy import *
    >>> a = array([1.1, 2.2, 3.3])
    >>> print a
    [ 1.1 2.2 3.3]
    # two-dimension array
    >>> b = array(([1,2,3],[4,5,6]))
    >>> print b
    [[1 2 3]
     [4 5 6]]
    >>> b.shape
    (2,3)
    >>> print ones((2,3), float)
    [[1. 1. 1.]
     [1. 1. 1.]]
    >>> print resize(b,(2,6))
    [[1 2 3 4 5 6]
     [1 2 3 4 5 6]]
    >>> print reshape(b,(3,2))
    [[1 2]
     [3 4]
     [5 6]]

Creating Arrays: Strategies
    # use reshape with range
    >>> a = reshape(range(12),(2,6))
    >>> print a
    [[0 1 2 3 4 5]
     [6 7 8 9 10 11]]
    # set an entire row (or column)
    >>> a[0,:] = range(1,12,2)
    >>> print a
    [[1 3 5 7 9 11]
     [6 7 8 9 10 11]]
    >>> a = zeros([50,100])
    # loop to set individual values
    >>> for i in range(50):
    …       for j in range(100):
    …           a[i,j] = i + j
    # call user function set(x,y)
    >>> shape = (50,100)
    >>> a = fromfunction(set, shape)
    # use scipy.io module to read
    # values from a file into an
    # array

Simple Array Operations
    >>> a = arange(1,4); print a
    [1 2 3]
    # addition (element wise)
    >>> print 3 + a
    [4 5 6]
    # multiplication (element wise)
    >>> print 3*a
    [3 6 9]
    # it really is element wise
    >>> print a*a
    [1 4 9]
    # power: a**b -> power(a,b)
    >>> print a**a
    [1 4 27]
    # functions: sin(x), log(x), …
    >>> print sqrt(a*a)
    [1. 2. 3.]
    # comparison: ==, >, and, …
    >>> print a < a
    [False False False]
    # reductions
    >>> add.reduce(a)
    6

Slicing Arrays
    >>> a = reshape(range(9),(3,3))
    >>> print a
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
    # second column
    >>> print a[:,1]
    [1 4 7]
    # last row
    >>> print a[-1,:]
    [6 7 8]
    # slices are references to
    # original memory, true for
    # all array/sequence assignment
    # work on the first row of a
    >>> b = a[0,:]
    >>> b[0] = 99 ; print b
    [99 1 2]
    # what is a[0,:] now?
    >>> print a[0,:]
    [99 1 2]

Array Temporaries and ufuncs
    >>> a = arange(10)
    >>> b = arange(10,20)
    # What will the following do?
    >>> a = a + b
    # Is the following different?
    >>> c = a + b
    >>> a = c
    # Does a reference old or new memory?
    # Answer, new memory!
    # Watch out for array
    # temporaries with large arrays!

    # Universal functions, ufuncs
    >>> type(add)
    <type 'numpy.ufunc'>
    # add is a binary operator
    >>> a = add(a,b)
    # in place operation
    >>> add(a,b,a)

Array Functions
    >>> a = arange(1,11); print a
    [1 2 3 4 5 6 7 8 9 10]
    # create an index array
    >>> ind = [0, 5, 8]
    # take values from the array
    >>> print take(a,ind)
    >>> print a[ind]
    [1 6 9]
    # put values to the array
    >>> put(a,ind,[0,0,0]); print a
    >>> a[ind] = (0,0,0); print a
    [0 2 3 4 5 0 7 8 0 10]

    >>> a = reshape(range(9),(3,3))
    >>> b = transpose(a); print b
    [[0 3 6]
     [1 4 7]
     [2 5 8]]
    >>> print diagonal(b)
    [0 4 8]
    >>> print trace(b)
    12
    >>> print where(b >= 3, 9, 0)
    [[0 9 9]
     [0 9 9]
     [0 9 9]]

Labs!
Lab: NumPy Basics

Linear Algebra
    >>> import numpy.linalg as la
    >>> dir(la)
    ['Heigenvalues', 'Heigenvectors', 'LinAlgError', 'ScipyTest',
     '__builtins__', '__doc__', '__file__', '__name__', '__path__',
     'cholesky', 'cholesky_decomposition', 'det', 'determinant', 'eig',
     'eigenvalues', 'eigenvectors', 'eigh', 'eigvals', 'eigvalsh',
     'generalized_inverse', 'info', 'inv', 'inverse', 'lapack_lite',
     'linalg', 'linear_least_squares', 'lstsq', 'pinv',
     'singular_value_decomposition', 'solve', 'solve_linear_equations',
     'svd', 'test']

Linear Algebra: Eigenvalues
    # assume a exists already
    >>> print a
    [[ 1.   0.   0.   0.  ]
     [ 0.   2.   0.   0.01]
     [ 0.   0.   5.   0.  ]
     [ 0.   0.01 0.   2.5 ]]
    >>> la.determinant(a)
    24.999500000000001
    # a multiple-valued function
    >>> val,vec = la.eigenvectors(a)
    # eigenvalues
    >>> print val
    [2.50019992 1.99980008 1. 5. ]
    # eigenvectors
    >>> print vec
    [[0. 0.01998801 0.  0.99980022]
     [0. 0.99980022 0. -0.01998801]
     [1. 0.         0.  0.        ]
     [0. 0.         1.  0.        ]]

Linear Algebra: solve linear equations
    # assume a, q exists already
    >>> print a
    [[ 1.   0.   0.   0.  ]
     [ 0.   2.   0.   0.01]
     [ 0.   0.   5.   0.  ]
     [ 0.   0.01 0.   2.5 ]]
    >>> print q
    [1. 4.04 15. 10.02]
    # a variable can ref. a function
    >>> solv = la.solve_linear_equations
    # solve linear system, a*b = q
    >>> b = solv(a,q)
    >>> print b
    [1. 2. 3. 4.]
    >>> q_new = matrixmultiply(a,b)
    >>> print q_new
    [1. 4.04 15. 10.02]
    >>> print q_new == q
    [True True True True]

Jacobi Iteration
    T = zeros((50,100), float)
    # set top boundary condition
    T[0,:] = 1
    # iterate 10 times
    for t in range(10):
        T[1:-1,1:-1] = ( T[0:-2,1:-1] + T[2:,1:-1] +
                         T[1:-1,0:-2] + T[1:-1,2:] ) / 4
    # dump binary output to file (Numarray only)
    T.tofile('jacobi.out')

Labs!
Lab: Linear Algebra

SECTION 5
Visualization and Imaging with Python

Section Overview
• In this section we will cover two related topics: image processing and basic visualization.
• Image processing tasks include loading, creating, and manipulating images.
• Basic visualization will cover everyday plotting activities, both 2D and 3D.

Plotting tools
• Many plotting packages available
  – Python Computer Graphics Kit (RenderMan)
  – Tkinter
  – Tk
  – Turtle graphics
  – Stand-alone GNUplot interface available
  – Python bindings to VTK, OpenGL, etc…
• In this tutorial, we focus on the Matplotlib package
• Unlike some of the other packages available, Matplotlib is available for nearly every platform.
  – Comes with http://www.scipy.org/ (Enthought)
• http://matplotlib.sourceforge.net/

Getting started
• A simple example
    # easiest to run ipython with -pylab option
    $$ ipython -pylab
    In [1]: plot([1,2,3])
    In [2]: ylabel('some numbers')
    In [3]: show()    # not needed with interactive
                      # output

Getting Started
Matplotlib with numpy
• The matplotlib package is compatible with numpy arrays.
    # create data using numpy
    t = arange(0.0, 2.0, 0.01)
    s = sin(2*pi*t)
    # create the plot
    plot(t, s, linewidth=1.0)
    # decorate the plot
    xlabel('time (s)')
    ylabel('voltage (mV)')
    title('About as simple as it gets, folks')
    grid(True)
    show()

[Figure: Simple Plot]

Improving the axis settings
    # get axis settings
    >>> axis()
    (0.0, 2.0, -1.0, 1.0)
    # changes should show up immediately
    >>> axis([0.0, 2.0, -1.5, 1.5])
    # a plot can be saved from the menu bar

[Figure: Better axes]

Colorful background
    subplot(111, axisbg='darkslategray')
    t = arange(0.0, 2.0, 0.01)
    # first plot
    plot(t, sin(2*pi*t), 'y')
    # second plot
    t = arange(0.0, 2.0, 0.05)
    plot(t, sin(pi*t), 'ro')

[Figure: Colorful background]

Fill demo
    # data
    t = arange(0.0, 1.01, 0.01)
    s = sin(2*2*np.pi*t)
    # graph
    fill(t, s*np.exp(-5*t), 'r')
    grid(True)

[Figure: Fill demo]

Subplot demo
    def f(t):
        s1 = cos(2*pi*t); e1 = exp(-t)
        return multiply(s1,e1)

    t1 = arange(0.0, 5.0, 0.1)
    t2 = arange(0.0, 5.0, 0.02)
    t3 = arange(0.0, 2.0, 0.01)

    subplot(211)
    plot(t1, f(t1), 'bo', t2, f(t2), 'k--', markerfacecolor='green')
    grid(True)
    title('A tale of 2 subplots')
    ylabel('Damped oscillation')

    subplot(212)
    plot(t3, cos(2*pi*t3), 'r.')
    grid(True)
    xlabel('time (s)')
    ylabel('Undamped')

[Figure: Subplot demo]

A basic 3D plot example
• Matplotlib can do polar plots, contours, …, and can even plot mathematical symbols using LaTeX
• 3D graphics?
  – not so great
• Matplotlib has simple 3D graphics but is limited relative to packages based on OpenGL like VTK.
• Note: mplot3d module may not be loaded on your system.

3D example
    from mpl_toolkits.mplot3d import Axes3D
    from matplotlib import cm
    import random

    fig = figure()
    ax = Axes3D(fig)
    X = arange(-5, 5, 0.25)
    Y = arange(-5, 5, 0.25)
    X, Y = meshgrid(X, Y)
    R = sqrt(X**2 + Y**2)
    Z = sin(R)
    ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.jet)

[Figure: 3D example]

More visualization tools
• Matplotlib is pretty good for simple plots.
  There are other tools out there that are quite nice:
  – MayaVI : http://mayavi.sourceforge.net/
  – VTK : http://www.vtk.org/
  – SciPy/plt : http://www.scipy.org/
  – Python Computer Graphics Kit based on Pixar's RenderMan: http://cgkit.sourceforge.net/

Image Processing
• A commonly used package for image processing in Python is the Python Imaging Library (PIL).
• http://www.pythonware.com/products/pil/

Getting started
• How to load the package
  – import Image, ImageOps, …
• Image module contains main class to load and represent images.
• PIL comes with many additional modules for specialized operations

Additional PIL Modules
• ImageDraw : Basic 2D graphics for Image objects
• ImageEnhance : Image enhancement operations
• ImageFile : File operations, including parser
• ImageFilter : A set of pre-defined filter operations
• ImageOps : A set of pre-defined common operations
• ImagePath : Express vector graphics, usable with ImageDraw
• ImageSequence : Implements iterator for image sequences or frames.
• ImageStat : Various statistical operations for Images

Loading an image
• Loading an image is simple, no need to explicitly specify format.
    import Image
    im = Image.open("image.jpg")

Supported Image Formats
• Most image formats people wish to use are available.
  – JPEG
  – GIF
  – BMP
  – TGA, TIFF
  – PNG
  – XBM, XPM
  – PDF, EPS
  – And many other formats that aren't as commonly used
    – CUR, DCX, FLI, FLC, FPX, GBR, GD, ICO, IM, IMT, MIC, MCIDAS, PCD, PCX, PPM, PSD, SGI, SUN
• Not all are fully read/write capable - check the latest docs for status.

Image representation
• Images are represented with the PIL Image class.
• Often we will want to write algorithms that treat the image as a NumPy array of grayscale or RGB values.
• It is simple to convert images to and from Image objects and numpy arrays.

Converting the image to a NumPy array
    def PIL2NUMARRAY(im):
        if im.mode not in ("L", "F"):
            raise ValueError, "image must be single-layer."
        ar = array(im.getdata())
        # im.size is (width, height); getdata() is row-major,
        # so the array shape is (height, width)
        ar.shape = im.size[1], im.size[0]
        return ar
Note: This works for mode "L", or monochrome, images. RGB would require more work - similar concept though.

Converting a NumPy array back to an Image
    def NUMARRAY2PIL(ar,size):
        im = Image.new("L",size)
        im.putdata(reshape(ar,(size[0]*size[1],)))
        return im
Notice that we need to flatten the 2D array into a 1D array for the PIL structure. Size need not be explicitly passed in - one can query ar for the shape and size.

Saving an image
• Much like reading, writing images is also very simple.
• Many formats available.
  – Either explicitly specify output format, or let PIL infer it from the filename extension.
    outfname = "somefile.jpg"
    imgout = NUMARRAY2PIL(workarray,size)
    imgout.save(outfname,"JPEG")

Labs!
Lab: Graphics

SECTION 6
Parallel programming with Python:
MPI4Py and Co-Array Python

IPython Parallelism
• IPython supports many styles of parallelism
  – Single program, multiple data (SPMD) parallelism
  – Multiple program, multiple data (MPMD) parallelism
  – Message passing using MPI
• Getting Started with Parallel IPython
  – Starting ipcluster
  – Using FURLS
  – Using a Multi-Engine Client (MEC)
  – %px
• First we look at using MPI with mpi4py

Parallel Computing with mpi4py
mpi4py is primarily run from a script
    # file par_hello.py

    from mpi4py import MPI

    # communication in MPI is through a communicator
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    print "Hello, rank", rank, "of", size

Running an MPI Script
mpiexec runs python on multiple processors concurrently
    $$ python par_hello.py
    Hello, rank 0 of 1
    $$ mpiexec -n 4 python par_hello.py
    Hello, rank 2 of 4
    Hello, rank 3 of 4
    Hello, rank 1 of 4
    Hello, rank 0 of 4
    # notice that execution by rank is not ordered

Passing Information in a Ring
    # file ring.py
    from mpi4py import MPI
    import numpy as np

    # Create message buffers
    message_in = np.zeros(3, dtype=np.int)
    message_out = np.zeros(3, dtype=np.int)
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Calc the rank of the previous and next process in the ring
    next = (rank + 1) % size;
    prev = (rank + size - 1) % size;

More ring.py
    # Let message be (prev,rank,next)
    message_out[:] = (prev,rank,next)

    # Must break symmetry by one sending and others receiving
    if rank == 0:
        comm.Send([message_out, MPI.INT], dest=next, tag=11)
    else:
        comm.Recv([message_in, MPI.INT], source=prev, tag=11)

    # Reverse order
    if rank == 0:
        comm.Recv([message_in, MPI.INT], source=prev, tag=11)
    else:
        comm.Send([message_out, MPI.INT], dest=next, tag=11)

    print rank, ':', message_in

Running ring.py
    $$ python ring.py
    0 : [0 0 0]
    $$ mpiexec -n 4 python ring.py
    1 : [3 0 1]
    2 : [0 1 2]
    3 : [1 2 3]
    0 : [2 3 0]

Interactive Parallel Computing
First start server processes on remote (or local) cluster:
    $$ ipcluster -n 2 &
    Starting controller: Controller PID: 5351
    Starting engines: Engines PIDs: [5353, 5354]
    Log files: /home/rasmussn/.ipython/log/ipcluster-5351-*

    Your cluster is up and running.

    For interactive use, you can make a MultiEngineClient with:

    from IPython.kernel import client
    mec = client.MultiEngineClient()

    You can then cleanly stop the cluster from IPython using:

    mec.kill(controller=True)

    You can also hit Ctrl-C to stop it, or use from the cmd line:

    kill -INT 5350

Local IPython Client
On local client:
    In [1]: from IPython.kernel import client
    In [2]: mec = client.MultiEngineClient()
    In [3]: mec.get_ids()
    Out[3]: [0,1,2,3]
    In [4]: %px?
    Executes the given python command on the active IPython Controller.
    To activate a Controller in IPython, first create it and then call
    the activate() method.
    In [5]: mec.activate()

More Parallel IPython
    In [6]: %px a=3
    Parallel execution on engines: all
    Out[6]:
    <Results List>
    [0] In [1]: a=3
    [1] In [1]: a=3

    In [7]: %px print a
    Parallel execution on engines: all
    Out[7]:
    <Results List>
    [0] In [2]: print a
    [0] Out[2]: 3
    [1] In [2]: print a
    [1] Out[2]: 3

Result method
    >>> %result?
    Print the result of command i on all engines of the active controller
    >>> result 1
    <Results List>
    [0] In [1]: a=3
    [1] In [1]: a=3

What Can I Do in Parallel?
• What can you imagine doing with multiple Python engines?
  – Execute code?
    – mec.execute   # execute a function on a set of nodes
    – mec.map       # map a function and distribute data to nodes
    – mec.run       # run code from a file on engines
  – Exchange data?
    – mec.scatter   # distribute a sequence to nodes
    – mec.gather    # gather a sequence from nodes
    – mec.push      # push python objects to nodes
• Targets parameter in many of the mec methods selects the particular set of engines

Labs!
Lab: Parallel IPython
Try out parallel ipython as time permits

Why Co-Array Python
• Scientists like Python
  – Powerful scripting language
  – Numerous extension modules
    – NumPy, PySparse, …
  – Gives an environment like MatLab
• But, scientists often need parallel computers
• MPI4Py (and others) was developed
• But let's try something besides explicit message passing
• Co-Array Python borrows from Co-Array Fortran

Co-Array Programming Model
• SPMD model
• All processors run Python interpreter via PyMPI
• Local view of array data
  – local, not global indexing
• Adds another array dimension for remote memory access
  – the co-dimension
• Uses ARMCI for communication
  – portable Cray shmem library

Co-Array Python Syntax
    #
    # put to remote processor number 1
    #
    T(1)[3,3] = T[3,3]
    #
    # get from remote processor number 8
    #
    T[4,5] = T(8)[4,5]

Co-Array Python Example
• Jacobi problem on 2 dimensional grid
• Dirichlet boundary conditions
• Average of four nearest neighbors

Computational Domain
[Figure: computational domain, showing the up, me, and dn processors separated by ghost boundary cells]
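The row decomposition with ghost boundary cells can be emulated in a single process with plain NumPy. The sketch below is not Co-Array Python: it assumes only NumPy and Python 3, and the function names (`jacobi_step`, `jacobi_global`, `jacobi_blocks`) are hypothetical. Each "processor" is a list entry holding its strip of rows plus one ghost row above and below, and the exchange step copies boundary rows between neighbors.

```python
import numpy as np

def jacobi_step(T):
    """One Jacobi sweep: each interior cell becomes the mean of its 4 neighbors."""
    # The right-hand side is evaluated into a temporary before assignment,
    # so every cell is computed from the old values.
    T[1:-1, 1:-1] = (T[:-2, 1:-1] + T[2:, 1:-1] +
                     T[1:-1, :-2] + T[1:-1, 2:]) / 4.0

def jacobi_global(G, iters):
    """Reference version: iterate on the whole grid at once."""
    G = G.copy()
    for _ in range(iters):
        jacobi_step(G)
    return G

def jacobi_blocks(G, n_blocks, iters):
    """Emulate n_blocks processors, each owning a strip of rows plus ghost rows."""
    interior = G.shape[0] - 2
    assert interior % n_blocks == 0, "interior rows must divide evenly"
    n = interior // n_blocks
    # Block k holds global rows k*n .. (k+1)*n+1: n owned rows plus one
    # ghost (or physical boundary) row above and one below.
    blocks = [G[k * n:(k + 1) * n + 2].copy() for k in range(n_blocks)]
    for _ in range(iters):
        for T in blocks:                      # local update, no communication
            jacobi_step(T)
        for me, T in enumerate(blocks):       # ghost-row exchange
            if me > 0:
                blocks[me - 1][-1, :] = T[1, :]   # first owned row -> up's ghost row
            if me < n_blocks - 1:
                blocks[me + 1][0, :] = T[-2, :]   # last owned row -> dn's ghost row
    owned = [T[1:-1] for T in blocks]
    return np.vstack([G[:1]] + owned + [G[-1:]])

G = np.zeros((10, 12))
G[0, :] = 1.0                                 # top boundary condition
same = np.allclose(jacobi_global(G, 25), jacobi_blocks(G, 4, 25))
print(same)
```

Comparing the decomposed result against the whole-grid result is a quick way to check that a ghost-row exchange is wired up correctly before moving to a real parallel run.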
Initialization
    from CoArray import *

    nProcs = mpi.size
    me = mpi.rank

    M = 200; N = M/nProcs

    T = coarray((N+2, M+2), Numeric.Float)

    up = me - 1
    dn = me + 1
    if me == 0:
        up = None
    if me == nProcs - 1:
        dn = None

Jacobi Update (inner loop): I
    #
    # update interior values (no communication)
    #
    T[1:-1,1:-1] = ( T[0:-2,1:-1] + T[2:,1:-1] +
                     T[1:-1,0:-2] + T[1:-1,2:] ) / 4.0

Jacobi Update (inner loop): II
[Figure: boundary rows exchanged between the up, me, and dn processors]
    #
    # exchange boundary conditions
    #
    mpi.barrier()
    if up != None: T(up)[-1:,:] = T[ 1,:]
    if dn != None: T(dn)[ 0:,:] = T[-2,:]
    mpi.barrier()

Timing Data
    Size        CoPcomm  CoPtotal  PyMPIcomm  PyMPItotal  Ccomm  Ctotal
    128x128     0.017    0.33      0.07       0.38        0.013  0.05
    256x256     0.023    1.28      0.13       1.41        0.015  0.14
    512x512     0.041    6.28      0.28       6.47        0.020  0.55
    1024x1024   0.068    28.4      0.52       28.78       0.032  2.49
    2048x2048   0.089    113.5     –          –           0.047  10.13
Table 1. Timing data for Co-Array Python (CoP), MPI (PyMPI) and C MPI (C) versions
• Most of time spent in computation (Python 1/10 C performance)
• Co-Array Python communication rivals C (Python 1/2 C performance)
  – Co-Array Python communication much faster than PyMPI
  – better data marshalling
  – ARMCI

Conclusions
• Co-Arrays allow direct addressing of remote memory
  – e.g. T(remote)[local]
• Explicit parallelism
• Parallel programming made easy
• Fun
• Explore new programming models (Co-Arrays)
• Looking at Chapel
  – implicit parallelism
  – global view of memory (for indexing)

Status
• Not entirely finished
  – reason a research note, not a full paper
  – but available to play with
  – [email protected]
• Hope to finish soon and put on Scientific Python web site
  – http://www.scipy.org/

SECTION 7
Language Interoperability

Language Interoperability
• Python features many tools to make binding Python to languages like C/C++ and Fortran 77/95 easy.
• We will cover:
  – F2py: Fortran to Python wrapper generator
  – SWIG: The Simple Wrapper Interface Generator
• For Fortran, we also consider:
  – Fortran interoperability standard
  – Fortran Transformational Tools (FTT) project

Fortran Example: fadd.f90
• Consider the following simple Fortran subroutine to add two arrays
    subroutine fadd(A, B, C, N)
       real, dimension(N) :: A, B, C
       integer :: N

       ! do j = 1, N
       !    C(j) = A(j) + B(j)
       ! end do

       C = A + B

    end subroutine fadd

Annotate for F2py
• F2py works better if you let it know what the variables are doing (intents)
    ! file fadd.f90
    !
    subroutine fadd(A, B, C, N)
       real, dimension(N) :: A, B, C
       integer :: N

       !F2PY intent(out) :: C
       !F2PY intent(hide) :: N
       !F2PY real, dimension(N) :: A, B, C

       C = A + B
    end subroutine fadd

Running F2py
• Once you have annotated the source file, run f2py to generate the Python bindings
    $$ f2py -c -m fadd fadd.f90
    $$ ls
    fadd.f90 fadd.so

Try out the new module
• Run the new fadd module from ipython
    In [1]: from fadd import *
    In [2]: fadd?
    Docstring:
    fadd - Function signature:
      c = fadd(a,b)
    Required arguments:
      a : input rank-1 array('f') with bounds (n)
      b : input rank-1 array('f') with bounds (n)
    Return objects:
      c : rank-1 array('f') with bounds (n)
    In [3]: fadd([1,2,3,4,5], [5,4,3,2,1])
    Out[3]: array([ 6., 6., 6., 6., 6.], dtype=float32)

Fortran Interoperability Standard
• Fortran 2003 provides a standard mechanism for interoperability with C
  – This could be used to reduce the need for annotations
  – But improved tools support needed
    interface

       subroutine fadd(A, B, C, N) BIND(C, name="fadd")
          use, intrinsic :: ISO_C_BINDING
          real(C_FLOAT), intent(in), dimension(N) :: A, B
          real(C_FLOAT), intent(out), dimension(N) :: C
          integer(C_INT), value :: N
       end subroutine fadd

    end interface
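When trying out a wrapped routine such as `fadd`, it can help to have a pure NumPy reference to compare against. A minimal sketch, assuming only NumPy and Python 3; `fadd_reference` is a hypothetical helper name, and the conversion to single precision mirrors the `array('f')` types reported in the generated docstring:

```python
import numpy as np

def fadd_reference(a, b):
    """Element-wise sum in single precision, mirroring c = fadd(a, b)."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("a and b must be rank-1 arrays of equal length")
    return a + b

c = fadd_reference([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(c, c.dtype)
```

Once the compiled module is importable, something like `np.allclose(fadd(a, b), fadd_reference(a, b))` gives a quick sanity check of the wrapper.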
SWIG: example.c
    /* File : example.c */

    double My_variable = 3.0;

    /* Compute factorial of n */
    int fact(int n) {
        if (n <= 1) return 1;
        else return n*fact(n-1);
    }

    /* Compute n mod m */
    int my_mod(int n, int m) {
        return(n % m);
    }

SWIG: example.i
    /* File : example.i */
    %module example
    %{
    /* Put headers and other declarations here */
    %}
    extern double My_variable;
    extern int fact(int);
    extern int my_mod(int n, int m);

Data Dictionary
• Share Fortran arrays with Python by name
• Fortran
    subroutine get_arrays(dict)
       integer :: dict
       integer, save :: A(3,4)
       integer :: rank = 2, type = INTEGER_TYPE
       integer :: shape = (/3,4/)

       call put_array(dict, "A", A, rank, shape, type)

    end subroutine
• Python
    A = dict["A"]

Running SWIG
• Once you have created the .i file, run swig to generate the Python bindings
    unix > swig -python example.i
    unix > ls
    example.c example.i example.py example_wrap.c

SWIG: build module
• Build the example module
  – create setup.py
  – execute setup.py
    unix > cat setup.py
    from distutils.core import setup, Extension
    setup(name="_example", version="1.0",
          ext_modules=[
              Extension(
                  "_example",
                  ["example.c", "example_wrap.c"],
              ),
          ])
    unix > python setup.py config
    unix > python setup.py build

SWIG: build module
• Run the code
  – where is _example.so (set path)
    >>> from _example import *

    >>> # try factorial function
    >>> fact(5)
    120

    >>> # try mod function
    >>> my_mod(3,4)
    3
    >>> 3 % 4
    3

NumPy and Fortran Arrays
• Chasm provides a bridge between Fortran and Python arrays
• The only way to use Fortran assumed-shape arguments with Python
• Call the following routine from Python
    subroutine F90_multiply(a, b, c)
       integer, pointer :: a(:,:), b(:,:), c(:,:)
       c = MatMul(a,b)   ! Fortran intrinsic
    end subroutine F90_multiply

Labs!
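The SWIG session earlier compares `my_mod(3,4)` with Python's `3 % 4`, and the two agree for non-negative operands. For negative operands they can differ: the C99 `%` operator truncates toward zero, so the result takes the sign of the dividend, while Python's `%` takes the sign of the divisor. A small pure-Python sketch, assuming Python 3; `fact` and `c_style_mod` here are illustrative stand-ins, not the wrapped C functions:

```python
def fact(n):
    """Pure-Python counterpart of the C fact() in example.c."""
    return 1 if n <= 1 else n * fact(n - 1)

def c_style_mod(n, m):
    """Remainder with C99 semantics: the result has the sign of n."""
    r = abs(n) % abs(m)
    return -r if n < 0 else r

print(fact(5))
print(3 % 4, c_style_mod(3, 4))     # same for non-negative operands
print(-7 % 3, c_style_mod(-7, 3))   # Python floors, C truncates
```

This matters when checking a wrapped C routine against a Python expression: a mismatch on negative inputs may be a difference in language semantics rather than a wrapping bug.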
Lab: Language Interoperability
Try out f2py and swig as time allows

Extra Credit: SciPy and SAGE

SciPy
• Open-source software for mathematics, science, and engineering
• Information
  – http://docs.scipy.org/
• Download
  – http://scipy.org/Download

scipy

    >>> import scipy; help(scipy)

    odr                          --- Orthogonal Distance Regression
    sparse.linalg.eigen.arpack   --- Eigenvalue solver using iterative
    fftpack                      --- Discrete Fourier Transform
    sparse.linalg.eigen.lobpcg   --- Locally Optimal Block Preconditioned
    lib.blas                     --- Wrappers to BLAS library
    sparse.linalg.eigen          --- Sparse Eigenvalue Solvers
    stats                        --- Statistical Functions
    lib.lapack                   --- Wrappers to LAPACK library
    maxentropy                   --- Routines for fitting maximum entropy
    integrate                    --- Integration routines
    linalg                       --- Linear algebra routines
    interpolate                  --- Interpolation Tools
    optimize                     --- Optimization Tools
    cluster                      --- Vector Quantization / Kmeans
    signal                       --- Signal Processing Tools
    sparse                       --- Sparse Matrices

FFT Example

    >>> from scipy import *
    # create input values
    >>> v = zeros(1000)
    >>> v[:100] = 1
    # take FFT
    >>> y = fft(v)
    # plot results (rearranged so zero frequency is at center)
    # plot() is provided by pylab (matplotlib)
    >>> x = arange(-500,500,1)
    >>> plot(x, abs(concatenate((y[500:],y[:500]))))

[Figure: FFT results]
[Figure: FFT results, expanded (zoomed)]

Optimization Example

    >>> from scipy import optimize as op
    # create function
    >>> def square(x): return x*x
    >>> op.fmin(square, -5)
    Optimization terminated successfully.
             Current function value: 0.000000
             Iterations: 20
             Function evaluations: 40
    array([ 0.])
    >>> op.anneal(square, -5)
    Warning: Cooled to 4.977261 at 2.23097753984 but this is not the smallest point found.
    (-0.068887616435477916, 5)

SAGE Functionality
• http://showmedo.com/videotutorials/ (search for sage)

Labs!
Lab: SciPy
Try out scipy as time allows

Extra Credit: Traits

What are traits?
• Traits add typing-like facilities to Python.
  – Python by default has no explicit typing.
• Traits are bound to fields of classes.
• Traits allow classes to dictate the types for their fields.
• Furthermore, they can specify ranges!
• Traits also can be inherited.
Thanks to scipy.org for the original Traits slides.

An example

    class Person(HasTraits):
        name = Str    # String value, default is ''
        age = Trait(35, TraitRange(1,120))
        weight = Trait(160.0, TraitRange(75.0,500.0))

    # Create someone; default age is 35, default weight is 160.0 lbs
    >>> someone = Person()
    >>> someone.name = 'Bill'
    >>> print '%s: %s' % (someone.name, someone.age)
    Bill: 35
    >>> someone.age = 75         # OK
    >>> someone.weight = 'fat'   # Error, not a number.

Another example: Enumerated traits

    class InventoryItem(HasTraits):
        name = Str    # String value, default is ''
        stock = Trait(None, 0, 1, 2, 3, 'many')   # Enumerated list, default value is 'None'

    >>> hats = InventoryItem()
    >>> hats.name = 'Stetson'
    >>> print '%s: %s' % (hats.name, hats.stock)
    Stetson: None
    >>> hats.stock = 2        # OK
    >>> hats.stock = 'many'   # OK
    >>> hats.stock = 4        # Error, value is not in permitted list
    >>> hats.stock = None     # Error, value is not in permitted list

Why traits? Validation
• It's nice to let the author of a class enforce checking not only of types, but also of values

    class Amplifier(HasTraits):
        volume = Range(0.0, 11.0, default=5.0)

    # This one goes to eleven...
    >>> spinal_tap = Amplifier()
    >>> spinal_tap.volume
    5.0
    >>> spinal_tap.volume = 11.0   # OK
    >>> spinal_tap.volume = 12.0   # Error, value is out of range

Notification (Events)
• You can also use notification to trigger actions when traits change.

    class Amplifier(HasTraits):
        volume = Range(0.0, 11.0, default=5.0)
        def _volume_changed(self, old, new):
            if new == 11.0:
                print 'This one goes to eleven'

    # This one goes to eleven...
    >>> spinal_tap = Amplifier()
    >>> spinal_tap.volume = 11.0
    This one goes to eleven

Notification (Events)
• You can even set up notification for classes with traits later, from the caller or class instantiator.

    class Amplifier(HasTraits):
        volume = Range(0.0, 11.0, default=5.0)

    # This one goes to eleven...
    >>> def volume_changed(self, old, new):
    ...     if new == 11.0:
    ...         print 'This one goes to eleven'
    >>> spinal_tap = Amplifier()
    >>> spinal_tap.on_trait_change(volume_changed, 'volume')
    >>> spinal_tap.volume = 11.0
    This one goes to eleven

Delegation model
• Traits can be delegated

    class Company(HasTraits):
        address = Str

    class Employee(HasTraits):
        __traits__ = {
            'name': '',
            'employer': Company,
            'address': TraitDelegate('employer')
        }

• By default, an employee has the same address as their employer.
• However, you can assign a new address to the employee if a different address is necessary.

More about Traits
• Traits originally came from the GUI world
  – A trait may be the range for a slider widget, for example.
• Clever use of traits can enforce correct units in computations.
  – You can check traits when two classes interact to ensure that their units match!
  – NASA lost a spacecraft to this sort of issue, so it's definitely important!
[Figure: NASA Mars Climate Orbiter, a victim of mismatched units]

Dune
A Python-CCA, Rapid Prototyping Framework
Craig E Rasmussen, Matthew J. Sottile, Christopher D. Rickett, Sung-Eun Choi

Scientific Software Life Cycle: A need for two software environments (Research and Production)
[Figure: life-cycle diagram with the labels Concept, Exploration, Research, Porting, Production, Maintenance and Refinement, Reuse]
The challenge is to mix a rapid-prototyping environment with a production environment

Rapid Prototyping Framework: An Advection-Diffusion-Reaction component-application example
[Figure: component diagram with the labels Driver (main), Time Integrator, Multiphysics, Advection, Diffusion, Reaction]
Dune Python-CCA Framework for Component Assembly and Language Interoperability

A Python Research Component
[Figure: component labeled "Python, Fortran, or C/C++" with a "Python" layer]
• A Research Component can be:
  – A pure Python component for rapid prototyping
  – Or a Fortran or C/C++ module, wrapped for reuse of production components

A Production Component
[Figure: component labeled "Fortran or C++" with a "Python" layer]
• Remove the Python cap and the Fortran or C++ component can be linked and run in a traditional scientific application.
Minimal Code to be a Python-CCA Component
• The requirement to be a Python-CCA component is minimal (five lines of Python code)

    # ----------------------------------------------------------
    # Register ports with a framework services object.
    #
    def setServices(self, services):
        self.services = services
        ''' Provide an integrator port '''
        services.addProvidesPort(self, "integrator", "adr.integrator")
        ''' Register uses ports '''
        services.registerUsesPort("multiphysics", "adr.multiphysics")

Conclusions
• Stable, well-designed interfaces are key to supporting the two modes of scientific computing, Research and Production, and to the sharing of components between the two environments.
[Figure: component labeled "Fortran or C++" with a "Python" layer]

Overview of packages
• Python : http://www.python.org/
• SciPy : http://www.scipy.org/
• NumPy :
• FFTW : http://www.fftw.org/
• MPI4py :
• PySparse :
• SAGE : http://www.sagemath.org/
• Traits :

Thanks To
• Eric Jones, … – Enthought
• Also many others for ideas
  – python.org
  – scipy.org
  – Unpingco
  – https://www.osc.edu/cms/sip/
  – http://showmedo.com/videotutorials/ipython

Labs!
Lab: Explore http://www.scipy.org/

Labs!
Lab: Explore and Calculate
Lab Instructions
• Explore the Python web site
  – http://python.org/
  – Browse the Documentation
  – Check out Topic Guides
• Try the math package
  – Convert Celsius to Fahrenheit (F = 9/5 C + 32)
  – What does math.hypot do?
  – How is math.pi different from math.sqrt?
  – Remember import, dir, and help

Labs!
Lab: Strings
Lab Instructions
• Explore the string module
  – import string
  – dir(string)
  – help(string)
• Try some of the string functions
  – string.find
  – …

Labs!
Lab: Sequence Objects
Lab Instructions
• Become familiar with lists []
  – Create a list of integers and assign it to variable l
  – Try various slices of your list
  – Assign the list to another variable (ll = l)
  – Change an element of l
  – Print ll, what happened?
  – Try list methods such as append; see dir(list)
• Try creating a dictionary, d = {}
  – Print a dictionary element using []
  – Try methods, d.keys() and d.values()

Labs!
Lab: Functions
Lab Instructions
• In an editor, create file funcs.py
• Create a function, mean(), that returns the mean of the elements in a list object
  – You will need to use the len function
  – Use for i in range():
• Test your function in Python
• Modify mean()
  – Use for x in list:
• Retest mean()

Labs!
Lab: Classes
Lab Instructions
• Create SimpleStat class in SimpleStat.py
  – Create a constructor that takes a list object
  – Add an attribute, list_obj, to contain the list object
  – Create method, mean()
    – Returns the mean of the contained list object
  – Create method, greater_than_mean()
    – Returns the number of elements greater than the mean
  – Test your class from the Python interpreter
  – What does type(SimpleStat) return?
  – Did you use import SimpleStat or from SimpleStat import *?

Labs!
Lab: Numerical Array Basics
Lab Instructions
• Import numpy
  – Try dir(numpy)
  – Browse the documentation, help(numpy)
  – Create and initialize arrays in different ways
  – How is arange() different from range()?
  – Try ones(), resize() and reshape()
  – Become friendly with slices
  – Try addition and multiplication with arrays
  – Try sum, add, diagonal, trace, transpose

Labs!
Lab: Linear Algebra
Lab Instructions
• Goal: Investigate a college basketball rating system
  – Can be applied to any sport
  – Multivariate linear regression to find team ratings
• Copy ratings.py and games.py from disk
• $$ python -i games.py
• >>> ratings = numpy.linalg.solve(ah, bh)
  – print team_names, ratings
  – sort ratings
  – ask instructor about the arrays ah and bh
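As a rough illustration of how a linear-regression rating system can end in a call to numpy.linalg.solve, the sketch below fits ratings to point margins by least squares. Everything in it is an assumption made for illustration: the team names, the game results, and the way the system is built are made up, and the arrays ah and bh in games.py may well be constructed differently.

```python
import numpy as np

# Hypothetical data: (first team, second team, point margin for first team)
teams = ["A", "B", "C"]
games = [("A", "B", 10.0), ("B", "C", 5.0), ("A", "C", 15.0)]

# One row per game: +1 for the first team, -1 for the second,
# so that x . ratings approximates the observed margin.
index = dict((name, i) for i, name in enumerate(teams))
x = np.zeros((len(games), len(teams)))
y = np.zeros(len(games))
for row, (first, second, margin) in enumerate(games):
    x[row, index[first]] = 1.0
    x[row, index[second]] = -1.0
    y[row] = margin

# Normal equations of the least-squares problem: (x^T x) r = x^T y
a = np.dot(x.T, x)
b = np.dot(x.T, y)

# Ratings are only determined up to an additive constant, so x^T x is
# singular. Replace the last equation with "ratings sum to zero".
a[-1, :] = 1.0
b[-1] = 0.0

ratings = np.linalg.solve(a, b)

# Print teams from highest to lowest rating
for i in np.argsort(ratings)[::-1]:
    print("%s %6.2f" % (teams[i], ratings[i]))
```

The sum-to-zero constraint is one common way to remove the ambiguity; fixing one team's rating at zero would work equally well and only shifts all ratings by a constant.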