[PEAK] Fwd: Adding C generation to bytecode assembler

Tue Aug 29 20:53:07 EDT 2006

At 05:08 PM 8/29/2006 -0700, Michel Pelletier wrote:
>On 8/21/06, Phillip J. Eby <pje at telecommunity.com> wrote:
>> >I think my loop with a few math ops is sufficiently contrived that
>> >it can't be considered a very reliable benchmark of what's to come or much
>> >inspiration for awe. ;)
>>
>>Fair enough.
>
>I certainly made an overly bold statement to begin with.  Performance is 
>actually not my highest priority; if it's 10-20%, so be it.

Huh?  What on earth could possibly justify translating bytecode to C 
*except* performance?  Surely not just obfuscation?

>>In general, simply taking a Python program and compiling it with Pyrex is
>>very likely to produce code that is *slower* than the original Python
>>program -- even after you add some type declarations.
>
>Here:
>
>http://svn.eby-sarna.com/RuleDispatch/src/dispatch/_d_speedups.pyx?rev=2135&view=log
>
>in the last log msg you allude to some pleasing amount of performance 
>increase.

That message doesn't indicate how much tweaking I did before the checkin.  :)

>Was there some thought you put toward writing code specifically well 
>suited to Pyrex?

Well, of course attribute lookups.  But primarily you will see that the 
code does a lot of *type-specific* API calls.  It doesn't use x[y] when 
PyList_GET_ITEM can be used, for example.

Pretty much, I find that to be the key to actually speeding up Pyrex 
code.  That is, when working with Python objects, don't use Pyrex to do 
anything that you can do by using a C API call instead.  :)

This is basically because:

1. Pyrex generates a lot of extra incref/decref and other housekeeping 
that a human can see is unnecessary.

2. Type-specific APIs are faster than generic ones

3. Sequence operations work faster if you're not converting between Python 
ints and C ints, and go to a specific API

4. Dictionary lookups are a lot faster if simple lookup failures don't 
result in exceptions being raised

Plus probably other things I haven't thought of.
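
To illustrate point 4 at the Python level (a rough analogue only, not Pyrex 
and not the RuleDispatch code), here is a minimal sketch comparing a failed 
lookup that raises KeyError with one that reports the miss through a return 
value.  The names `table`, `lookup_with_exception` and 
`lookup_without_exception` are made up for this example.  At the C level the 
corresponding choice would be something like PyDict_GetItem, which returns 
NULL for a missing key without setting an exception.

```python
import timeit

# Hypothetical lookup table, used only for this illustration.
table = {"a": 1, "b": 2}


def lookup_with_exception(key):
    # A miss raises KeyError, which has to be created, raised and caught.
    try:
        return table[key]
    except KeyError:
        return None


def lookup_without_exception(key):
    # dict.get() reports a miss by returning the default; nothing is raised.
    return table.get(key)


if __name__ == "__main__":
    # Time only failed lookups, since that is the case point 4 is about.
    for fn in (lookup_with_exception, lookup_without_exception):
        seconds = timeit.timeit(lambda: fn("missing"), number=200_000)
        print(f"{fn.__name__}: {seconds:.3f}s for 200,000 failed lookups")
```

Both functions return the same results; only the cost of a miss differs, and 
the actual numbers will depend on the interpreter version and machine.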

>>Heh.  If you want better performance than the CPython interpreter, I think
>>you're going to end up with code that's far less readable than Pyrex, but
>>that's just a guess.
>
>Well I guess by readable in this case I mean highly commented, including 
>the source Python line and comments, stack depth information, block based 
>indentation, block information, and experimenting with more readable 
>symbols that are mangled in a later step.  My goal is for the reader (me) 
>to be able to scan down the generated code and see the progression of 
>program flow from one instruction to the next and easily see what's going 
>on, and work forward from there.

Okay, but that's not "readable" like Python is "readable", which is what I 
meant.  :)

>>The big
>>speedups are really all in one of two places:
>>
>>1. Specialization (via type declarations, inference, or JIT specialization
>>ala Psyco)
>
>I'll keep my eye on PyPy for this one.  I caught the second half of 
>Samuele's presentation at the Vancouver Python workshop.  I'm still 
>kicking myself for missing the first half.  Inference has always scared me 
>a bit, but I did get to catch him explain the 
>inference algorithm and it was convincing.

For RPython, one simply avoids doing anything that will confuse the 
inferencer.  :)

>>2. Dropping compatibility with frames, function objects, trace/profile
>>hooks, GIL releasing, etc.
>
>This is where I'm willing to experiment a bit with removing per-opcode 
>overhead through run time check reduction and stack movement reduction, 
>minus touching the GIL.  I'm not going near that.

If you *don't* release the GIL regularly, other Python threads will 
freeze.  I was referring to the overhead of keeping track of *how often* to 
release the GIL.
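
As a small sketch of that point, assuming a current CPython rather than the 
2006 interpreter: the main thread below runs a pure-Python busy loop that 
never blocks, yet the background thread still makes progress, because the 
interpreter itself decides when to hand over the GIL.  In the Python of this 
thread's era that bookkeeping was a count of bytecode instructions 
(sys.setcheckinterval); current CPython uses a time interval instead 
(sys.getswitchinterval).  The function name `count_while_busy` is made up for 
this example.

```python
import sys
import threading
import time


def count_while_busy(busy_seconds=0.2):
    """Busy-loop in the calling thread and return how many iterations a
    background thread completed in the meantime."""
    progress = [0]
    stop = threading.Event()

    def worker():
        # Pure-Python loop; it only advances when it gets to run.
        while not stop.is_set():
            progress[0] += 1

    thread = threading.Thread(target=worker)
    thread.start()

    # Pure-Python work in the main thread: no sleeping, no blocking I/O.
    deadline = time.monotonic() + busy_seconds
    spins = 0
    while time.monotonic() < deadline:
        spins += 1

    stop.set()
    thread.join()
    return progress[0]


background_iterations = count_while_busy()
switch_interval = sys.getswitchinterval()

if __name__ == "__main__":
    print("background iterations:", background_iterations)
    print("switch interval (seconds):", switch_interval)
```

Generated C code that never goes back through the interpreter loop would have 
to do the equivalent hand-off itself, or other threads would sit idle for as 
long as it runs.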

Pyrex does not release the GIL, nor does it support profile/trace hooks, 
but then it's not intended to be a replacement for normal Python functions, 
either.