Pyrex
Or How to Write C Without Learning C

Evan Broder
[email protected]

What is Pyrex?

Pyrex is a Python-like language for writing Python extensions

What does that mean? Well, you
- write code in something that looks like Python
- compile it into C
- build that into a Python module

Why Use Pyrex?

- Because you want your code to be fast
- Because you want to use C libraries in Python

Sure, you can write things in other languages and shell out, or write a real C extension. I don’t know many other languages. I certainly can’t write secure C with good performance. Also good if you’re looking for incremental improvements.

Pyrex is Fast and Simple

Let’s compare

Python:

    def py_fib(n):
        if n <= 1:
            return 1
        else:
            return py_fib(n - 1) \
                + py_fib(n - 2)

Pure C:

    #include <Python.h>

    long fib(long n) {
        if (n <= 1)
            return 1;
        else
            return fib(n - 1) + fib(n - 2);
    }

    static PyObject *
    c_fib(PyObject *self, PyObject *args) {
        long n, result;
        if (!PyArg_ParseTuple(args, "l", &n))
            return NULL;
        result = fib(n);
        return Py_BuildValue("l", result);
    }

    static PyMethodDef CFibMethods[] = {
        {"c_fib", c_fib, METH_VARARGS, NULL},
        {NULL, NULL, 0, NULL}
    };

    PyMODINIT_FUNC initc_fib(void) {
        (void) Py_InitModule("c_fib", CFibMethods);
    }

For our comparison, we’re going to use a simple, stupid, exponential-branch-factor Fibonacci calculator. First, let’s look at the two traditional ways to do this.

Python:

    def py_fib(n):
        if n <= 1:
            return 1
        else:
            return py_fib(n - 1) \
                + py_fib(n - 2)

Pyrex:

    cdef _fib(long n):
        if n <= 1:
            return 1
        else:
            return _fib(n - 1) \
                + _fib(n - 2)

    def pyx_fib(n):
        return _fib(n)

Also a good example of how simple Pyrex can be. Don’t forget to hit enter to start test. C is faster, but if you care about performance and just performance, you probably don’t want to be writing in Python anyway.

Not Convinced?
Apparently theoretical computer scientists don’t trust any demo involving exponential time. So let’s try again.

Python:

    def py_isprime(n):
        for i in xrange(2, n):
            if n % i == 0:
                return False
        return True

    def py_primes(n):
        return [i for i in xrange(n) \
                if py_isprime(i)]

Pyrex:

    cdef _isprime(long n):
        cdef long i
        for i from 2 <= i < n:
            if n % i == 0:
                return False
        return True

    cdef _primes(long n):
        cdef object results
        cdef long i
        results = []
        for i from 0 <= i < n:
            if _isprime(i):
                results.append(i)
        return results

    def pyx_primes(n):
        return _primes(n)

Hooking Libraries

- I mainly use Pyrex for pulling C libraries into Python
- You can call both C and Python functions
- And transparently convert between C and Python types

Quick Example

libreadline

    cdef extern from "readline/readline.h":
        char * readline(char *)

    cdef extern from "stdlib.h":
        void free(void *)

    def rl(prompt):
        cdef char * c_result
        cdef object result
        c_result = readline(prompt)
        result = c_result
        free(c_result)
        return result

I’ll explain the syntactic details later.
What’s important here is that, in 13 lines of code, I’ve exposed readline(3) to Python.

Language Basics

Two Types of Functions

- def: “Python” function
- cdef: “C” function

Two Types of Functions

Arguments/return types
- def: Python objects
- cdef: Python objects or C types

Only def functions can be called from Python

def functions get compiled to functions exposed through the Python API. cdef functions get compiled to actual C functions.

Types in cdef Functions

    cdef long _fib(long n):
        if n <= 1:
            return 1
        else:
            return _fib(n - 1) \
                + _fib(n - 2)

- explicit return type: the long before _fib
- explicit argument type: long n

If you don’t specify, Pyrex defaults to “object”, which is the explicit specifier for a Python object. You can declare explicit argument types for both def functions and cdef functions. With def functions, the argument is passed in as an object, then immediately converted to the C type.

Types in def functions

- def functions can have C type arguments, too
- They get passed in as objects and then converted
- Can only convert to numeric types and char * (cdef functions can take any type)

Declaring C Variables

You can explicitly declare variables in Pyrex

    cdef int i

- At either function or module level
- Within a function, either def or cdef

Type Conversions

C-typed and Python-typed variables will automatically be converted based on context

    From Python        C type                                    To Python
    int, long          char, short, int, long                    int
    int, long          unsigned int, unsigned long, long long    long
    int, long, float   float, double, long double                float
    str                char *                                    str

Basically, anything that looks like an int gets converted to an int ...long... ...floats...
And the only thing that looks like a string is a string.

Type Conversions

Be careful with str -> char *:
- Pyrex just uses the internal char * of the str
- The resulting char * is only good as long as the str object exists

Something that won’t work:

    cdef char *s
    s = pystring1 + pystring2

Borrowed Syntax from C

Python syntax doesn’t get you everywhere

- &p works like C
- p.x instead of p->x
- p[0] instead of *p
- <type>var to cast var to type
- None gets converted to NULL
- Python’s order of operations
- But don’t try to typecast Python <=> C types
- Use foo is None instead of foo == None

C Type Declarations

structs, unions, enums, typedefs:

    cdef struct Grail:
        int age
        float volume

    cdef enum CheeseState:
        hard = 1
        soft = 2
        runny = 3

    ctypedef unsigned long ULong

C Type Declarations

Constants: use anonymous enum:

    cdef enum:
        tons_of_spam = 3

Only use struct, union, or enum when defining type, not referring, e.g.

    cdef Grail *gp

This collapses the struct namespace onto others, which can suck.

Extension Classes

i.e. C classes

Just use cdef class Foo:

Extension Classes

    cdef class Shrubbery:
        cdef int width, height

        def __init__(self, w, h):
            self.width = w
            self.height = h

        def describe(self):
            print "This shrubbery is", self.width, \
                "by", self.height, "cubits."
Shrubbery.width and Shrubbery.height are not normally accessible from Python.
- For read and write, cdef public int width
- For read only, cdef readonly int width
- Can only do public and readonly with types Pyrex knows how to convert

Extension Classes

Using an extension class as argument:

    def widen_shrubbery(Shrubbery sh, width):
    def widen_shrubbery(Shrubbery sh not None, width):

Can use cdef methods as well

Use not None if you’re going to access C attributes of extension types.

Writing Fast Pyrex

- Don’t use for i in xrange(n):
- Special syntax: for i from 0 <= i < n:
- Can still use break, continue, else
- Operations on builtin types (dict, list, object) automatically get translated into API calls

xrange (or range) is a Python function, so there is conversion overhead. for i from 0 <= i < n gets compiled directly into a C for loop. Can do n > i >= 0 for descending.

Exceptions with cdefs

By default, an exception raised by a cdef function is ignored

Declare specially:

    cdef int spam() except -1
    cdef int spam() except? -1
    cdef int spam() except *

- First one means a return of -1 definitely means an exception was raised
- Second one means a return of -1 might mean an exception was raised
- Third one means always check for an exception when the function returns

Differences from Python

- Can use import, and from import, but not import *
- Can’t define functions within functions
- No generators
- No list comprehensions
- No conditional expressions (foo if bar else baz)

How to hook external code

So that’s Pyrex the language...
Including Code

    cdef extern from "string.h":
        int strcmp(char * s1, char * s2)

- Causes file to be included
- Also uses prototype as hinting to Pyrex for types

The prototype doesn’t need to be exact; a lot of the time it shouldn’t be.
- Leave out const
- Don’t need all struct members, just the ones you’ll need
- Import macros using function defs
- Can use this to pull in Python/C API functions if needed - just use "Python.h"
- size_t - just typedef it to int - it’s close enough

Including Code

If you want a C function to be exposed to Pyrex under a different name:

    cdef extern from "spam.h":
        void c_eggs "eggs" (int count)

- Works with structs, unions, vars, etc
- Refer to eggs as c_eggs within Pyrex
- Useful if you have collisions between struct names and function names

Including Code

    cdef extern from *: size_t void *

extern from * is for when you need something that isn’t included from a specific file (i.e. included by an include). If you use a size_t, you should just typedef it to int - it’s close enough.

C Callbacks

- Most functions that use a callback pass a void * to the callback function
- That could be a Python function cast to a void *
- Instead of casting, just declare function to take object instead of void *
- Then write cdef function that calls the void * argument

C Callbacks

In C:

    int mr_query(char *name, int argc, char **argv,
                 int (*callproc)(int, char **, void *),
                 void *callarg);

In Pyrex:

    cdef extern from "moira/moira.h":
        int mr_query(char *, int, char **,
                     int (*callback)(int, char **, object),
                     object)

Here’s an example from Moira.

Cython

Cython

- Fork of Pyrex
- Primarily maintained by Sage

Cython Differences

- List comprehensions
  - Including [i for i from 0 <= i < 10]
- Conditional expressions
- cdef inline
- Assignment on declaration (e.g. cdef int i = 4)
- Automatic conversion of for i in range(...)
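
As a rough illustration of what the first two items in that list save you, here is a plain-Python sketch of the same computation written both ways. Python is used so that it runs without a compiler, and the function names are made up for this example. The first function uses a list comprehension and a conditional expression. The second uses the longhand loop and if/else that plain Pyrex would need, going by the "Differences from Python" slide.

```python
def signed_comprehension(n):
    # Compact form: a list comprehension with a conditional expression.
    # Per the slides, Cython accepts both constructs and Pyrex accepts neither.
    return [i if i % 2 == 0 else -i for i in range(n)]


def signed_longhand(n):
    # Longhand form: an explicit loop and an if/else statement,
    # using only constructs the slides say Pyrex supports.
    results = []
    for i in range(n):
        if i % 2 == 0:
            value = i
        else:
            value = -i
        results.append(value)
    return results


if __name__ == "__main__":
    # Both spellings should produce the same list.
    print(signed_comprehension(5))
    print(signed_longhand(5))
```

Both functions return the same list, so moving code between the two styles is a mechanical rewrite.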