Pyrex
Or How to Write C
Without Learning C
Evan Broder
[email protected]
What is Pyrex?
Pyrex is a Python-like language for writing Python extensions
What does that mean?
Well, you
- write code in something that looks like Python
- compile it into C
- build that into a Python module
Why Use Pyrex?
Because you want your code to be fast
Because you want to use C libraries in Python
Sure, you can write things in other languages and shell out, or write a real C extension
I don’t know many other languages. I certainly can’t write secure C with good performance
Also good if you’re looking for incremental improvements
Pyrex is Fast and Simple
Let’s compare
Python
    def py_fib(n):
        if n <= 1:
            return 1
        else:
            return py_fib(n - 1) \
                   + py_fib(n - 2)
Pure C
    #include <Python.h>

    long fib(long n) {
        if (n <= 1)
            return 1;
        else
            return fib(n - 1) + fib(n - 2);
    }

    static PyObject *
    c_fib(PyObject *self, PyObject *args) {
        long n, result;
        if (!PyArg_ParseTuple(args, "l", &n))
            return NULL;
        result = fib(n);
        return Py_BuildValue("l", result);
    }

    static PyMethodDef CFibMethods[] = {
        {"c_fib", c_fib, METH_VARARGS, NULL},
        {NULL, NULL, 0, NULL}
    };

    PyMODINIT_FUNC initc_fib(void) {
        (void) Py_InitModule("c_fib", CFibMethods);
    }
For our comparison, we’re going to use a simple, stupid, exponential-branch-factor Fibonacci calculator
First, let’s look at the two traditional ways to do this
Python
    def py_fib(n):
        if n <= 1:
            return 1
        else:
            return py_fib(n - 1) \
                   + py_fib(n - 2)
Pyrex
    cdef _fib(long n):
        if n <= 1:
            return 1
        else:
            return _fib(n - 1) \
                   + _fib(n - 2)

    def pyx_fib(n):
        return _fib(n)
Also a good example of how simple Pyrex can be
Don’t forget to hit enter to start test
C is faster, but if you care about performance and just performance, you probably don’t want to be writing in Python anyway
Not Convinced?
Apparently theoretical computer scientists don’t trust any demo involving exponential time
So let’s try again
Python
    def py_isprime(n):
        for i in xrange(2, n):
            if n % i == 0:
                return False
        return True

    def py_primes(n):
        return [i for i in xrange(n) \
                if py_isprime(i)]
Pyrex
    cdef _isprime(long n):
        cdef long i
        for i from 2 <= i < n:
            if n % i == 0:
                return False
        return True

    cdef _primes(long n):
        cdef object results
        cdef long i
        results = []
        for i from 0 <= i < n:
            if _isprime(i):
                results.append(i)
        return results

    def pyx_primes(n):
        return _primes(n)
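The pure-Python baselines above can be timed with the standard library's timeit module. This is only a sketch, ported to Python 3 (range in place of xrange); the helper name best_time and the input sizes are assumptions. Once a Pyrex module is built, a compiled function such as pyx_fib could be passed to the same helper.

```python
import timeit


def py_fib(n):
    """Naive exponential-time Fibonacci, as on the slides."""
    if n <= 1:
        return 1
    return py_fib(n - 1) + py_fib(n - 2)


def py_isprime(n):
    """Trial division, as on the slides (0 and 1 also pass this test)."""
    for i in range(2, n):
        if n % i == 0:
            return False
    return True


def py_primes(n):
    return [i for i in range(n) if py_isprime(i)]


def best_time(func, arg, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` single calls."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=1))


if __name__ == "__main__":
    print("py_fib(20):      %.4f s" % best_time(py_fib, 20))
    print("py_primes(2000): %.4f s" % best_time(py_primes, 2000))
```

Taking the minimum over a few repeats reduces noise from other processes; absolute numbers will vary by machine.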
Hooking Libraries
I mainly use Pyrex for pulling C libraries into Python
You can call both C and Python functions
And transparently convert between C and Python types
Quick Example
libreadline
    cdef extern from "readline/readline.h":
        char * readline(char *)

    cdef extern from "stdlib.h":
        void free(void *)

    def rl(prompt):
        cdef char * c_result
        cdef object result
        c_result = readline(prompt)
        result = c_result
        free(c_result)
        return result
I’ll explain the syntactic details later. What’s important here is that, in 13 lines of code, I’ve exposed readline(3) to Python
Language Basics
Two Types of Functions
def: “Python” function
cdef: “C” function
Two Types of Functions
Arguments/return types
def: Python objects
cdef: Python objects or C types
Only def functions can be called from Python
def functions get compiled to functions exposed through the Python API
cdef functions get compiled to actual C functions
Types in cdef Functions
    cdef long _fib(long n):
        if n <= 1:
            return 1
        else:
            return _fib(n - 1) \
                   + _fib(n - 2)
The long before _fib is an explicit return type
The long before n is an explicit argument type
If you don’t specify, Pyrex defaults to “object”, which is the explicit specifier for a Python object
You can declare explicit argument types for both def functions and cdef functions
With def functions, the argument is passed in as an object, then immediately converted to the C type
Types in def functions
def functions can have C type arguments, too
They get passed in as objects and then converted
Can only convert to numeric types and char *
(cdef functions can take any type)
Declaring C Variables
You can explicitly declare variables in Pyrex
cdef int i
At either function or module level
Within a function, either def or cdef
Type Conversions
C-typed and Python-typed variables will automatically be converted based on context
From Python         C type                                    To Python
int, long           char, short, int, long                    int
int, long           unsigned int, unsigned long, long long    long
int, long, float    float, double, long double                float
str                 char *                                    str
Basically, anything that looks like an int, gets converted to an int
...long...
...floats...
And the only thing that looks like a string is a string
Type Conversions
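Be careful with str -> char *: Pyrex just uses the internal char * of the str
The resulting char * is only good as long as the str object exists
Something that won’t work:
    cdef char *s
    s = pystring1 + pystring2
The concatenation creates a temporary str that is freed as soon as the assignment finishes, leaving s dangling; store the result in a Python variable first and convert that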
Borrowed Syntax from C
Python syntax doesn’t get you everywhere
&p works like C
p.x instead of p->x
p[0] instead of *p
<type>var to cast var to type
None gets converted to NULL
Python’s order of operations
But don’t try to typecast Python <=> C types
Use foo is None instead of foo == None
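One reason to prefer is None, which holds in plain Python as well: == can be overridden by the object being compared, while is only compares identity. A small sketch, with the class name AlwaysEqual invented for illustration:

```python
class AlwaysEqual:
    """An object whose == comparison claims equality with anything."""

    def __eq__(self, other):
        return True


foo = AlwaysEqual()

# The equality test is answered by foo's own __eq__, so it is misleading here.
looks_like_none = (foo == None)  # noqa: E711

# The identity test cannot be overridden and gives the right answer.
really_none = (foo is None)
```

The identity test is also the cheaper of the two, since it never has to call a comparison method.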
C Type Declarations
structs, unions, enums, typedefs:
    cdef struct Grail:
        int age
        float volume

    cdef enum CheeseState:
        hard = 1
        soft = 2
        runny = 3

    ctypedef unsigned long ULong
C Type Declarations
Constants: use an anonymous enum:
    cdef enum:
        tons_of_spam = 3
Only use struct, union, or enum when defining a type, not when referring to it, e.g.
    cdef Grail *gp
This collapses the struct namespace onto
others, which can suck
Extension Classes
i.e. C classes
Just use cdef class Foo:
Extension Classes
    cdef class Shrubbery:
        cdef int width, height

        def __init__(self, w, h):
            self.width = w
            self.height = h

        def describe(self):
            print "This shrubbery is", self.width, \
                  "by", self.height, "cubits."
Shrubbery.width and Shrubbery.height are not normally accessible from Python
For read and write, cdef public int width
For read only, cdef readonly int width
Can only do public and readonly with types Pyrex knows how to convert
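As a rough pure-Python analogy (not what Pyrex generates), the visibility levels behave like a private attribute, a plain attribute, and a read-only property. The layout below is an assumption chosen for illustration: width is treated as readonly and height as public.

```python
class Shrubbery:
    """Pure-Python stand-in for the cdef class shown above."""

    def __init__(self, w, h):
        self._width = w   # plays the role of the C attribute
        self.height = h   # behaves like "cdef public int height"

    @property
    def width(self):
        # Behaves like "cdef readonly int width": readable, not assignable.
        return self._width

    def describe(self):
        return "This shrubbery is %d by %d cubits." % (self._width, self.height)


shrub = Shrubbery(3, 4)
shrub.height = 5          # allowed: public attribute
try:
    shrub.width = 9       # rejected: read-only attribute
    width_assignable = True
except AttributeError:
    width_assignable = False
```

In the real extension type the attributes are stored as C ints in the object struct, which is why Python code cannot see them unless they are declared public or readonly.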
Extension Classes
Using an extension class as argument:
    def widen_shrubbery(Shrubbery sh, width):
    def widen_shrubbery(Shrubbery sh not None, width):
Can use cdef methods as well
Use not None if you’re going to access C attributes of extension types
Writing Fast Pyrex
Don’t use for i in xrange(n):
Special syntax: for i from 0 <= i < n:
Can still use break, continue, else
Operations on builtin types (dict, list, object)
automatically get translated into API calls
xrange (or range) is a Python function, so calling it adds conversion overhead
for i from 0 <= i < n gets compiled directly into a C for loop
Can do for i from n > i >= 0 for descending
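In terms of the values visited, the integer for-loop corresponds to an ordinary range. The plain-Python sketch below spells out that correspondence; the helper names are made up for illustration, and it shows only which values i takes, not the speed difference.

```python
def ascending(lo, hi):
    """Values visited by an ascending integer loop with bounds lo <= i < hi."""
    return list(range(lo, hi))


def descending(hi, lo):
    """Values visited by a descending integer loop with bounds hi > i >= lo."""
    # The strict upper bound means the loop starts one below hi.
    return list(range(hi - 1, lo - 1, -1))
```

For example, ascending(0, 4) visits 0, 1, 2, 3 and descending(4, 0) visits the same values in reverse.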
Exceptions with cdefs
By default, an exception raised by a cdef
function is ignored
Declare specially:
    cdef int spam() except -1
    cdef int spam() except? -1
    cdef int spam() except *
First one: a return value of -1 always means an exception was raised
Second one: a return value of -1 might mean an exception was raised, so check
Third one: always check for an exception when the function returns
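The three declarations differ in what the caller checks after the call. The sketch below simulates that logic in plain Python, with a module-level variable standing in for the interpreter's error indicator. Every name in it is hypothetical, and it models the convention described above rather than the C code Pyrex emits.

```python
_error = None  # stand-in for the interpreter's pending-exception slot


def set_error(exc):
    """Record an exception the way a C function would before returning."""
    global _error
    _error = exc


def _propagate():
    """Clear the pending exception and raise it in the caller."""
    global _error
    exc, _error = _error, None
    raise exc


def call_except(func, error_value):
    """Like `except -1`: the error value alone signals an exception."""
    result = func()
    if result == error_value:
        _propagate()
    return result


def call_except_maybe(func, error_value):
    """Like `except? -1`: the error value is a hint, confirmed before raising."""
    result = func()
    if result == error_value and _error is not None:
        _propagate()
    return result


def call_except_star(func):
    """Like `except *`: look for a pending exception after every call."""
    result = func()
    if _error is not None:
        _propagate()
    return result
```

The trade-off is cost versus freedom: the first form needs one comparison but reserves a return value, the second allows that value as a legitimate result, and the third reserves nothing but checks on every call.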
Differences from Python
Can use import, and from import, but not import *
Can’t define functions within functions
No generators
No list comprehensions
No conditional expressions (foo if bar else baz)
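Code that leans on the missing features can usually be rewritten with plain loops and statements. A sketch in ordinary Python, with function names invented for illustration: the first function uses a list comprehension and a conditional expression, and the second computes the same result without either.

```python
def labels_modern(values):
    """Uses a list comprehension and a conditional expression."""
    return ["even" if v % 2 == 0 else "odd" for v in values]


def labels_plain(values):
    """Same result using only an explicit loop and an if statement."""
    results = []
    for v in values:
        if v % 2 == 0:
            label = "even"
        else:
            label = "odd"
        results.append(label)
    return results
```

The same idea covers the other restrictions: a generator can become a function that returns a list, and a nested function can move to module level.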
How to hook external code
So that’s Pyrex the language...
Including Code
    cdef extern from "string.h":
        int strcmp(char * s1, char * s2)
Causes file to be included
Also uses prototype as hinting to Pyrex for types
Doesn’t need to be exact
Often it shouldn’t be
Leave out const
Don’t need all struct members, just the ones you’ll need
Import macros using function defs
Can use this to pull in Python/C API functions if needed - just use “Python.h”
size_t - just typedef it to int - it’s close enough
Including Code
If you want a C function to be exposed to Pyrex under a different name:
    cdef extern from "spam.h":
        void c_eggs "eggs" (int count)
Works with structs, unions, vars, etc
Refer to eggs as c_eggs within Pyrex
Useful if you have collisions between struct names and function names
Including Code
cdef extern from *:
size_t
void *
extern from * is for when you need something that isn’t included from a specific file (i.e. included by an include)
If you use a size_t, you should just typedef it to int - it’s close enough
C Callbacks
Most functions that use a callback pass a void * to the callback function
That could be a Python function cast to a void *
Instead of casting, just declare the function to take object instead of void *
Then write a cdef function that calls the void * argument
C Callbacks
In C:
    int mr_query(char *name, int argc, char **argv,
                 int (*callproc)(int, char **, void *),
                 void *callarg);
In Pyrex:
    cdef extern from "moira/moira.h":
        int mr_query(char *, int, char **,
                     int (*callback)(int, char **, object),
                     object)
Here’s an example from Moira
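The pattern is easier to see with the C library replaced by a stand-in. In the plain-Python sketch below, fake_mr_query is a hypothetical substitute for mr_query that invokes its callback once per result row, and trampoline plays the role of the cdef function: it receives the opaque callarg and calls it as a Python function. The sample rows and the return code of 0 are assumptions.

```python
def fake_mr_query(name, argv, callproc, callarg):
    """Stand-in for the C function: hands callarg back to callproc untouched."""
    rows = [[name, "row1"], [name, "row2"]]  # made-up results
    for row in rows:
        status = callproc(len(row), row, callarg)
        if status != 0:
            return status
    return 0


def trampoline(argc, argv, callarg):
    """Role of the cdef callback: treat callarg as the Python callable."""
    callarg(tuple(argv[:argc]))
    return 0


def query(name, *args):
    """Python-facing wrapper that collects every row into a list."""
    results = []
    fake_mr_query(name, list(args), trampoline, results.append)
    return results
```

Calling query("get_user") returns one tuple per row. The library never needs to know that the user-data argument is a Python object, which is the point of declaring it as object instead of void *.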
Cython
Cython
Fork of Pyrex
Primarily maintained by Sage
Cython Differences
List comprehensions
Including [i for i in 0 <= i < 10]
Conditional expressions
cdef inline
Assignment on declaration
(i.e. cdef int i = 4)
Automatic conversion of for i in range(...)