10. The PyPy translation tool chain
Toon Verwaest
Thanks to Carl Friedrich Bolz for his kind permission to reuse and
adapt his notes.
The PyPy tool chain
Roadmap
> What is PyPy?
> The PyPy interpreter
> The PyPy translation tool chain
© Toon Verwaest
What is PyPy?
> Reimplementation of Python in Python
> Framework for building interpreters and VMs
> L * O * P configurations
— L dynamic languages
— O optimizations
— P platforms
The PyPy Interpreter
> Python: imperative, object-oriented dynamic language
> Stack-based bytecode interpreter (like JVM, Smalltalk)

    def f(x):
        return x + 1

    >>> import dis
    >>> dis.dis(f)
      2           0 LOAD_FAST                0 (x)
                  3 LOAD_CONST               1 (1)
                  6 BINARY_ADD
                  7 RETURN_VALUE
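To make the stack-based execution model concrete, here is a minimal sketch of an interpreter loop for the four instructions listed above. The (opcode, argument) tuple encoding and the name `run` are illustrative assumptions, not PyPy's actual bytecode format or API.

```python
# Toy stack machine for the bytecode of f(x) = x + 1.
# The (opcode, argument) tuples are a simplified, hypothetical encoding.
CODE = [
    ("LOAD_FAST", "x"),      # push local variable x
    ("LOAD_CONST", 1),       # push the constant 1
    ("BINARY_ADD", None),    # pop two values, push their sum
    ("RETURN_VALUE", None),  # pop the result and return it
]

def run(code, local_vars):
    stack = []
    for opcode, arg in code:
        if opcode == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif opcode == "LOAD_CONST":
            stack.append(arg)
        elif opcode == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opcode)

result = run(CODE, {"x": 41})
print(result)
```

All intermediate values live on the operand stack, which is why the instructions themselves need at most one argument.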
The PyPy Bytecode Compiler
> Written in Python
> .py to .pyc
> Standard, flexible compiler
— Lexer
— Parser
— AST builder
— Bytecode generator
> You only have to build this once
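The four compiler stages can be observed from plain Python. The sketch below uses the standard library modules `tokenize` and `ast` and the built-in `compile` as stand-ins; it is an assumption that these mirror the stages closely enough for illustration, and it does not use PyPy's own compiler code.

```python
import ast
import io
import tokenize

SOURCE = "def f(x):\n    return x + 1\n"

# Lexer: split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(SOURCE).readline))
names = [tok.string for tok in tokens if tok.type == tokenize.NAME]

# Parser + AST builder: ast.parse performs both steps in one call.
tree = ast.parse(SOURCE)
func_def = tree.body[0]

# Bytecode generator: compile the AST to a code object
# (the kind of object a .pyc file stores).
module_code = compile(tree, "<example>", "exec")

namespace = {}
exec(module_code, namespace)
print(names)
print(type(func_def).__name__)
print(namespace["f"](1))
```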
Bytecode interpreter
> Focuses on language semantics. No low-level details!
> Written in RPython
— Run untranslated on top of CPython, it is very slow: about 2000x slower than CPython
> PyPy's Python bytecode compiler and interpreter are not the hot topic of the PyPy project!
The PyPy Translation Tool Chain
> Model-driven interpreter (VM) development
— Focus on language model rather than implementation details
— Executable models (meta-circular Python)
> Translate models to low-level (LL) back-ends
— Considerably lower than Python
— Weave in implementation details (GC, JIT)
— Allow compilation to different back-ends (OO, procedural)
Inside the Translation Tool Chain
PyPy “Parser”
> Tool chain starts from loaded Python bytecode
> Translator shares Python environment with the target
> Relies on Python's reflective capabilities
> Allows meta-programming (runtime initialization)

    def a_decorator(an_f):
        def g(b):
            an_f(b + 10)
        return g

    @a_decorator
    def f(a):
        print(a)

    f(4)  # prints 14
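Because the tool chain starts from loaded function objects rather than source text, everything that ran at import time has already happened when analysis begins. The sketch below shows what reflection sees after the decorator has been applied. It is a variant of the example above that returns values instead of printing, and the `SQUARES` table is a hypothetical addition to illustrate load-time initialization.

```python
def a_decorator(an_f):
    def g(b):
        return an_f(b + 10)
    return g

@a_decorator
def f(a):
    return a

# Load-time meta-programming: a table computed once, before any analysis.
SQUARES = [i * i for i in range(5)]

# After loading, the name "f" is bound to the function g built by the decorator.
code = f.__code__
wrapped = f.__closure__[0].cell_contents  # the original, undecorated f

print(code.co_name)              # the function that was actually loaded
print(code.co_varnames)          # its local variables
print(wrapped.__code__.co_name)  # the function captured in the closure
print(f(4), SQUARES)
```

A tool that reads `f.__code__` never sees the decorator syntax, only its result.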
PyPy Control-Flow Graph
> Consists of Blocks and Links
> Starting from entry_point
> “Static Single Information” form

    def f(n):
        return 3*n+2

    Block(v1):  # input argument
        v2 = mul(Constant(3), v1)
        v3 = add(v2, Constant(2))
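A graph of Blocks and Links can be modeled with a handful of small classes. The following is a minimal sketch under stated assumptions: the class layout, the `execute` helper, and the convention that a Link with target `None` means "return" are all simplifications made for illustration, not the tool chain's real data model.

```python
class Variable(object):
    def __init__(self, name):
        self.name = name

class Constant(object):
    def __init__(self, value):
        self.value = value

class Operation(object):
    def __init__(self, opname, args, result):
        self.opname, self.args, self.result = opname, args, result

class Block(object):
    def __init__(self, inputargs):
        self.inputargs = inputargs
        self.operations = []
        self.exits = []  # outgoing Links

class Link(object):
    def __init__(self, args, target):
        self.args = args      # values passed to the target's inputargs
        self.target = target  # a Block, or None meaning "return args[0]"

# Graph for: def f(n): return 3*n+2
v1, v2, v3 = Variable("v1"), Variable("v2"), Variable("v3")
start = Block([v1])
start.operations.append(Operation("mul", [Constant(3), v1], v2))
start.operations.append(Operation("add", [v2, Constant(2)], v3))
start.exits.append(Link([v3], None))

OPS = {"mul": lambda a, b: a * b, "add": lambda a, b: a + b}

def execute(block, values):
    """Interpret a graph whose blocks each have exactly one exit."""
    env = dict(zip(block.inputargs, values))
    def get(arg):
        return arg.value if isinstance(arg, Constant) else env[arg]
    for op in block.operations:
        env[op.result] = OPS[op.opname](*[get(a) for a in op.args])
    link = block.exits[0]
    out = [get(a) for a in link.args]
    return out[0] if link.target is None else execute(link.target, out)

print(execute(start, [5]))
```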
PyPy CFG: “Static Single Information”
> Remember SSA: PHIs at dominance frontiers
> SSI: “PHIs” for all used variables
— Blocks as “functions without branches”

    def test(a):
        if a > 0:
            if a > 5:
                return 10
            return 4
        if a < -10:
            return 3
        return 10
Type Inference
Why type inference?
> Python is dynamically typed
> We want to translate to statically typed code
— For efficiency reasons
What do we need to infer?
> Type for every variable
> Messages sent to an object must be defined in the compile-time type or a supertype
How to infer types?
> Starting from entry_point
— Can reach the whole program
— We know the types of its arguments and return value
> Forward propagation
— Iteratively, until all links in the CFG have been followed at least once
— Results in a large dictionary mapping variables to types
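The forward propagation described above can be sketched as a small worklist algorithm. This is a toy under stated assumptions: the graph encoding, the names `annotate`, `type_of` and `result_type`, and the single typing rule are hypothetical, constants are limited to integers, and conflicting types are not merged the way a real annotator would generalize them.

```python
# Operations are (result, opname, args); args are variable names or int constants.
# Each block maps to (input variables, operations, [(target block, link args)]).
GRAPH = {
    "entry": (["n"],
              [("v2", "mul", [3, "n"]), ("v3", "add", ["v2", 2])],
              [("exit", ["v3"])]),
    "exit": (["result"], [], []),
}

def type_of(arg, bindings):
    # Strings name variables; anything else is a constant carrying its own type.
    return bindings[arg] if isinstance(arg, str) else type(arg)

def result_type(opname, arg_types):
    # Toy rule: arithmetic on two ints gives an int; anything else is rejected.
    if opname in ("add", "mul") and arg_types == [int, int]:
        return int
    raise TypeError("cannot type %s%r" % (opname, arg_types))

def annotate(graph, entry, entry_types):
    bindings = {}                     # variable name -> inferred type
    pending = [(entry, entry_types)]  # blocks still to be visited
    while pending:
        name, in_types = pending.pop()
        inputs, operations, exits = graph[name]
        changed = False
        for var, new_type in zip(inputs, in_types):
            if bindings.get(var) is not new_type:
                bindings[var] = new_type
                changed = True
        if not changed:
            continue                  # nothing new flows into this block
        for result, opname, args in operations:
            arg_types = [type_of(a, bindings) for a in args]
            bindings[result] = result_type(opname, arg_types)
        for target, link_args in exits:
            pending.append((target, [type_of(a, bindings) for a in link_args]))
    return bindings

print(annotate(GRAPH, "entry", [int]))
```

Starting the same graph with a string argument makes `result_type` raise, which is the toy counterpart of a program being rejected as not type inferable.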
Implications of applying type inference
Applying type inference restricts the kind of input programs that can be translated
RPython: Demo
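    def plus(a, b):
        return a + b

    def entry_point(argv=None):
        print(plus(20, 22))    # plus is called with integers here ...
        print(plus("4", "2"))  # ... and with strings here, which conflicts during type inference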
RPython: Demo
    # Specialize on the type of argument 0: one copy of plus per argument type
    @objectmodel.specialize.argtype(0)
    def plus(a, b):
        return a + b

    def entry_point(argv=None):
        print(plus(20, 22))
        print(plus("4", "2"))
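The effect of specializing on an argument type can be imitated in plain Python. The decorator below is a hypothetical run-time stand-in named `specialize_argtype`; it is not the `objectmodel` API, and the real specialization happens during translation rather than at call time. It only shows the idea of keeping one copy of the function per type of the chosen argument.

```python
import types

def specialize_argtype(index):
    """Hypothetical stand-in: keep one copy of the function per argument type."""
    def decorator(func):
        copies = {}
        def dispatcher(*args):
            key = type(args[index])
            if key not in copies:
                # A fresh function object sharing the same code, one per type.
                copies[key] = types.FunctionType(
                    func.__code__, func.__globals__,
                    "%s__%s" % (func.__name__, key.__name__))
            return copies[key](*args)
        dispatcher.copies = copies
        return dispatcher
    return decorator

@specialize_argtype(0)
def plus(a, b):
    return a + b

print(plus(20, 22))
print(plus("4", "2"))
print(sorted(copy.__name__ for copy in plus.copies.values()))
```

Each copy is only ever called with one argument type, so each copy on its own is easy to type.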
RPython is Zen
> Subset of Python
> Informally: The subset of Python which is type inferable
> Actually: type inferable stabilized bytecode
— Allows load-time meta-programming (see parser)
— Messages sent to an object must be defined in the compile-time type or supertype
RTyper
> Bridge between annotator and low-level code generators
> Different low-level models for different target groups
— LLTypeSystem: C-style (structures, pointers and arrays)
— OOTypeSystem: JVM, CLI, Squeak (trade-off: single inheritance)
> Does not need to iterate until a fixpoint is reached
> Replaces all operations by low-level ones
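Replacing operations by low-level ones can be pictured as a single table-driven pass over already annotated operations. The sketch below is an assumption-laden illustration: the tuple encoding, the `rtype` function, and the lookup table are hypothetical, and the low-level names are chosen in the style of a C-like type system.

```python
# High-level operations with annotations: (result, opname, args, argument types).
HIGH_LEVEL = [
    ("v2", "mul", [3, "v1"], (int, int)),
    ("v3", "add", ["v2", 2], (int, int)),
    ("v5", "add", ["v4", 1.5], (float, float)),
]

# Hypothetical table from (operation, argument types) to a low-level operation.
LOW_LEVEL_NAMES = {
    ("add", (int, int)): "int_add",
    ("mul", (int, int)): "int_mul",
    ("add", (float, float)): "float_add",
}

def rtype(operations):
    """Replace each operation in one pass; no fixpoint iteration is needed."""
    return [(result, LOW_LEVEL_NAMES[(opname, arg_types)], args)
            for result, opname, args, arg_types in operations]

for low_level_op in rtype(HIGH_LEVEL):
    print(low_level_op)
```

One pass suffices here because every type is already known; the pass only looks things up.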
Back-end Optimizations
> Some general optimizations
— Inlining
— Constant folding
— Escape analysis (allocating objects on the stack)
> Partly assumes code generation for an optimizing back-end
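As an example of one of the general optimizations listed above, here is a minimal sketch of constant folding over a list of low-level operations. The (result, opname, args) encoding and the `constant_fold` helper are assumptions made for illustration, not the tool chain's implementation.

```python
import operator

FOLDABLE = {"int_add": operator.add, "int_mul": operator.mul}

def constant_fold(operations):
    """Evaluate operations whose arguments are all known integer constants."""
    known = {}  # variable name -> constant value computed at "compile time"
    kept = []
    for result, opname, args in operations:
        # Substitute variables that were already folded to constants.
        args = [known.get(a, a) if isinstance(a, str) else a for a in args]
        if opname in FOLDABLE and all(isinstance(a, int) for a in args):
            known[result] = FOLDABLE[opname](*args)
        else:
            kept.append((result, opname, args))
    return kept

ops = [
    ("v1", "int_mul", [6, 7]),       # foldable: both arguments are constants
    ("v2", "int_add", ["v1", "n"]),  # not foldable: n is unknown
]
print(constant_fold(ops))
```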
Back-end Optimizations: “Object Explosion”
> OO: lots of helper objects
> Allocating objects is expensive
> Replace unneeded objects with direct calls
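The idea of getting rid of short-lived helper objects can be shown by hand on a small function. This before/after pair is a hypothetical illustration written manually; the tool chain performs the transformation on flow graphs, not on Python source.

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

def distance_squared(x, y):
    # Before: a short-lived helper object that never leaves this function.
    p = Point(x, y)
    return p.x * p.x + p.y * p.y

def distance_squared_exploded(x, y):
    # After: the object's fields become plain local variables,
    # so no allocation is needed.
    p_x = x
    p_y = y
    return p_x * p_x + p_y * p_y

print(distance_squared(3, 4), distance_squared_exploded(3, 4))
```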
Preparation for Source Generation
Exception Handling and Memory Management
> C has no support for:
— automatic memory management
— exception handling
> Translate explicit exception handling to flags and if/else
> Memory management in PyPy spirit:
— not language specific
— weave garbage collector in during translation
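Translating exception handling to flags and if/else can be imitated in Python by never raising and checking a global flag after every call. The names `ExcState`, `ll_div`, `ll_average_ratio` and `call_and_catch` are hypothetical; the sketch only mirrors the shape such C code could take.

```python
class ExcState(object):
    """Hypothetical global flag standing in for state kept by generated C code."""
    pending = None

def ll_div(a, b):
    # Instead of raising, record the error and return a dummy value.
    if b == 0:
        ExcState.pending = "ZeroDivisionError"
        return 0
    return a // b

def ll_average_ratio(a, b, c):
    first = ll_div(a, c)
    if ExcState.pending is not None:  # check the flag after every call
        return 0                      # propagate: leave the flag set
    second = ll_div(b, c)
    if ExcState.pending is not None:
        return 0
    return (first + second) // 2

def call_and_catch(a, b, c):
    # Plays the role of an "except" clause: test the flag, then clear it.
    result = ll_average_ratio(a, b, c)
    if ExcState.pending is not None:
        ExcState.pending = None
        return "error"
    return result

print(call_and_catch(10, 20, 5))
print(call_and_catch(10, 20, 0))
```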
JIT Compiler
> Makes VMs fast
— Dynamic information is key
> Is an implementation detail
— Weave in while translating to low-level!
> Still under development
> “As you surely know, the key idea of PyPy is that we are too lazy to write a JIT of our own: so, instead of passing nights writing a JIT, we pass years coding a JIT generator that writes the JIT for us :-)”
Code Generation
> One C-function per Control-Flow Graph
> All low-level statements can be translated directly
> Gets compiled to binary format with C compiler
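Since every low-level statement maps directly to C, a generator for a single-block graph can be very small. The sketch below is illustrative only: `generate_c`, the operator table, and the choice of `long` for every variable are assumptions, and real generated code handles many more cases.

```python
C_OPERATORS = {"int_add": "+", "int_mul": "*"}

def generate_c(name, inputargs, operations, return_var):
    """Emit one C function for one (single-block) control-flow graph."""
    params = ", ".join("long %s" % arg for arg in inputargs)
    lines = ["long %s(%s) {" % (name, params)]
    for result, opname, (left, right) in operations:
        lines.append("    long %s = %s %s %s;"
                     % (result, left, C_OPERATORS[opname], right))
    lines.append("    return %s;" % return_var)
    lines.append("}")
    return "\n".join(lines)

source = generate_c(
    "f", ["v1"],
    [("v2", "int_mul", (3, "v1")), ("v3", "int_add", ("v2", 2))],
    "v3")
print(source)
```

The resulting text could then be handed to a C compiler.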
Translation Demo
PyPy Performance
> Translator
— Slow
— Uses quite some memory
— Produces lots of source code (200 kloc for 5 kloc source)
— But: our models are executable (2000x slower than CPython)
> Resulting Interpreter
— Currently: two times slower to two times faster than CPython
— First experiments with JIT: up to 500x faster for special cases
— But most importantly: very adaptable!
More PyPy & Getting Involved
> http://codespeak.net/pypy
> http://morepypy.blogspot.com
> irc://irc.freenode.org/pypy
> PyPy sprints
Summary
> PyPy project has two main parts
— Language interpreter models
— PyPy translation tool chain
> PyPy translation tool chain
— Has no typical parser
— Uses SSI
— Applies type inference
   – Limits input from Python to RPython
— Compiles to low-level and object-oriented back-ends
— Weaves in implementation details
What you should know!
> What is the goal of the PyPy project?
> What are the main steps of the PyPy toolchain?
> When is a program RPython?
Can you answer these questions?
> Why do we want to keep the language model separated from implementation details?
> Why wouldn't we want to keep those details separated?
> Why is it not really a problem that the tool chain can only compile RPython code?
License
> http://creativecommons.org/licenses/by-sa/2.5/
Attribution-ShareAlike 2.5
You are free:
• to copy, distribute, display, and perform the work
• to make derivative works
• to make commercial use of the work
Under the following conditions:
Attribution. You must attribute the work in the manner specified by the author or licensor.
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting
work only under a license identical to this one.
• For any reuse or distribution, you must make clear to others the license terms of this work.
• Any of these conditions can be waived if you get permission from the copyright holder.
Your fair use and other rights are in no way affected by the above.