2 comments

The Düsseldorf sprint is currently in full progress and this post will try to summarize what progress has been made in the last days. We are (again) sprinting at the STUPS group of the Düsseldorf University. You can find the sprint announcement and the daily planning file.

Holger and Samuele put quite some effort over several days into setting up and improving PyPy's testing infrastructure. PyPy has a variety of tests. On the one hand, there are of course our own tests. But then we also have the CPython tests that should be run on top of pypy-c. Up to now we used a custom-made pile of hacks, held together by lots of duct-tape. It consisted of a variety of different machines running different things with different reporting solutions. Some of the old test-results can still be found on wyvern. Now we are moving to a buildbot based solution together with a custom reporter to have a view similar to the old one. Some details are not quite finished yet, but most of the things are already working rather well (currently all the results displayed are from the 2.5-merge branch).

Another large (and ongoing) topic of work is the 2.5 branch. It contains the work done by our Summer-of-Code student, Bruno Gola, of adding CPython 2.5 features to PyPy's Python interpreter. While Bruno implemented most language features and imported the 2.5 stdlib into PyPy, a lot of details were still missing. In the last days nearly everybody worked on fixing small issues and failing stdlib tests. While doing that we tried to categorize some CPython tests as implementation dependant so that we can skip them when running on PyPy.

Memory Improvements

One goal of the sprint is to measure and to reduce the memory behaviour of our Python interpreter. The idea is to make pypy-c a realistic option for use on embedded devices. By memory behaviour we mean both the dynamic memory usage (how much bytes does a dict or an instance take) as well as the size of the executable and details of the GC strategy.

Alexander, Carl Friedrich and Antonio did some work on analyzing the static data that a pypy-c executable contains. Our executables have the tendency to be rather large, both due to a lot of code and due to a large amount of static data. The analysis didn't give any really surprising results, the problem is mostly that we have a lot of static data originating from a bit everywhere in our program. Two big offenders are the unicodedata-module with about 750 KB of static data and the multimethod-tables with about 150 KB of data.

Armin, Iko, Anto and Maciek worked on a new approach to malloc-removal. This is (for PyPy) a crucial optimization of the translation toolchain that performs escape analysis to find out which objects don't outlive the frame they were allocated in. Since RPython is garbage-collected we usually have a lot of allocations, so it is important to statically get rid of many of them. To successfully do that, some inlining is needed to give the analysis more context. This leads to the fact that we have rather aggressive inlining-settings to allow as much malloc-removal as possible. The new approach tries to inline functions only if this actually leads to the successful removal of a malloc operation. The code is not finished quite yet, so it remains to be seen how successful it will be.

Before the sprint Maciek had started to work on a mark-compact GC for PyPy. The idea is that it is better for memory-constrained-environments because it does not double the memory-requirements during collections. During the sprint Armin and Maciek worked on cleaning up the code a bit and then merging the branch. An interesting property of the mark-compact GC is that after a collection all the memory that is not currently used by the program is returned to the operating system. Right now the GC is not as fast as our more mature ones, but it probably will be the basis for future tweaking.

A small thing that was done by Alexander and Carl Friedrich to make objects smaller is to enable shared instance dictionaries also for instances of old-style classes. Before it worked only for instances of new-style classes. Shared instance dictionaries are a way to reduce the memory-usage of instances. In the optimal case, it gives the same memory-savings that __slots__ are giving, but without any behavioural changes. Conceptually it is very similar e.g. to the notion of "map" in the Self project, or the hidden classes that Google Chrome's V8 is using (click on the link, there are nice graphics). The difference is that for them it is mostly a way to get faster attribute access, and PyPy is so far only using it form memory savings (but that might change in the future).

In parallel to all the other work, John Witulski worked tirelessly on advancing the AMD64-JIT-backend. John has the implementation of this backend as the topic of his Bachelor's thesis. He is progressing quite well (especially also considering that this is his first sizeable Python project ever), just sometimes being impaired by such annoyances as errors in the official Intel documentation. By now the backend is supporting many integer operations and control flow.

Düsseldorf Sprint Report Days 1-3

Memory Improvements

The PyPy blogposts

Recent Posts

Archives

Memory Improvements

Comments

The PyPy blogposts

Recent Posts

Archives

Tags