[PEAK] Fwd: Adding C generation to bytecode assembler

Tue Aug 29 20:08:35 EDT 2006

On 8/21/06, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> >I think my loop with a few math ops is sufficiently contrived that it
> >can't be considered a very reliable benchmark of what's to come or much
> >inspiration for awe. ;)
>
> Fair enough.

I certainly made an overly bold statement to begin with.  Performance is
actually not my highest priority; if the gain is only 10-20%, so be it.

> >I actually expect a speedup benefit to happen only in certain cases; I
> >suspect most functions would not benefit enough, or at all, to even bother.
>
> This is especially going to be the case where function calls are concerned,
> but probably also integer math and certain other operations where the
> interpreter has type-specific speedup tricks that can't be obtained by
> straight C API translation.

I agree, and I'm hedging my bets on whatever "p3yk" might land vis-a-vis type
information before run time.  For now I'm focusing on interpreter
interoperability.

> > But I'm surprised you don't think the inner interpreter or the stack
> > movement has much overhead; that seems to be pretty low-hanging fruit to
> > me, but the future could prove me dead wrong. I remember one or two
> > threads specifically on this subject on python-dev, but I don't remember
> > much actual measurement being conclusively shown at the time to prove it
> > one way or another.
>
> Well, it's said that p2c (or was it py2c?) achieved only 10-15% speedup
> using these techniques.

I wish I could find that code, but Google produced nothing in that regard
(perhaps I should do more digging for py2c), so it's hard for me to
research what techniques were used or what the benefit was (or would be now
that things have changed so much).

> >I've not studied a lot of Pyrex's output, only some through a few
> >trials.  I would be surprised by, and suspicious of, my own results if
> >Pyrex didn't generate faster code than I have here,
>
> Hm.  Well, Pyrex generates code that does a lot of setting variables to
> None and inc/decrefing all over the place, and plus which it doesn't take
> nearly as much advantage of the type information it has available to it as
> you'd expect.  And until relatively recent versions it used C strings to do
> attribute access, which is horrifically slow.

I'll avoid that then. ;)

> In general, simply taking a Python program and compiling it with Pyrex is
> very likely to produce code that is *slower* than the original Python
> program -- even after you add some type declarations.

Here:

http://svn.eby-sarna.com/RuleDispatch/src/dispatch/_d_speedups.pyx?rev=2135&view=log

in the last log message you allude to a pleasing performance increase.  Did
you put some thought into writing code specifically well suited to Pyrex?

> >although I don't think Pyrex is as readable,
>
> Heh.  If you want better performance than the CPython interpreter, I think
> you're going to end up with code that's far less readable than Pyrex, but
> that's just a guess.

Well, I guess by readable in this case I mean highly commented: including the
source Python line and comments, stack depth information, block-based
indentation, block information, and experimenting with more readable symbols
that are mangled in a later step.  My goal is for the reader (me) to be able
to scan down the generated code, see the progression of program flow from
one instruction to the next, easily see what's going on, and work forward
from there.
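
As a rough illustration of that kind of listing, here is a minimal Python
sketch.  It assumes a made-up instruction stream of (source line, opcode,
argument, stack effect) tuples; the tuples, the SOURCE table and the
annotate() helper are all hypothetical and are not the BytecodeAssembler API
or the actual generator.  Block-based indentation is left out to keep it
short.

```python
# Hypothetical toy instruction stream: (source_line, opname, arg, stack_effect).
# These tuples are written by hand for illustration; they are not read from a
# real code object.
SOURCE = {1: "def f(a, b):", 2: "    return a + b"}

INSTRUCTIONS = [
    (2, "LOAD_FAST", "a", +1),
    (2, "LOAD_FAST", "b", +1),
    (2, "BINARY_ADD", None, -1),
    (2, "RETURN_VALUE", None, -1),
]


def annotate(instructions, source):
    """Return C comment lines showing source line, opcode and stack depth."""
    lines = []
    depth = 0
    last_line = None
    for lineno, opname, arg, effect in instructions:
        if lineno != last_line:
            # Emit the originating Python source line once per line number.
            lines.append("/* line %d: %s */" % (lineno, source[lineno].strip()))
            last_line = lineno
        depth += effect
        operand = "" if arg is None else " " + str(arg)
        lines.append(
            "/*   %s%s  (stack depth after: %d) */" % (opname, operand, depth)
        )
    return lines


listing = annotate(INSTRUCTIONS, SOURCE)
print("\n".join(listing))
```

In a real generator each comment would sit directly above the C statements
emitted for that instruction, so the reader can follow the Python line, the
opcode and the stack depth together.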

> >it's certainly more mature and probably the closest project in terms of
> >goals.  I think my experiment is much simpler than Pyrex though, not
> >having to maintain a language parser or complex code generator or deal
> >with some of the restrictions Pyrex has.  One of my main goals is to
> >maintain 100% interpreter compatibility.
>
> Unfortunately, I don't think that you can do that *and* get more than say
> a 10-15% speedup -- i.e., one probably not worth all the effort.

If performance were the sole goal.

> The big speedups are really all in one of two places:
>
> 1. Specialization (via type declarations, inference, or JIT specialization
> ala Psyco)

I'll keep my eye on PyPy for this one.  I caught the second half of
Samuele's presentation at the Vancouver Python workshop.  I'm still kicking
myself for missing the first half.  Inference has always scared me a bit,
but I did get to catch him explaining the inference algorithm, and it was
convincing.

> 2. Dropping compatibility with frames, function objects, trace/profile
> hooks, GIL releasing, etc.

This is where I'm willing to experiment a bit with removing per-opcode
overhead through run time check reduction and stack movement reduction,
minus touching the GIL.  I'm not going near that.  I think tracing and
debugging can be achieved to even greater effect by producing different
builds with added runtime checks based on a profile and production builds
with none.
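
A minimal sketch of the stack-movement idea, in Python: the value stack is
simulated at generation time, so the emitted C works on named temporaries and
never pushes or pops at run time.  The (opcode, argument) tuples, the
translate() helper and the TRACE_OP macro are hypothetical names invented for
this example; PyNumber_Add is the real C API call.  Only three opcodes are
handled, and reference counting of intermediate temporaries is deliberately
omitted.

```python
def translate(instructions, checked=False):
    """Turn a straight-line stack program into C statements using temporaries.

    `checked` selects the instrumented build: it adds a call to a
    hypothetical TRACE_OP macro before each operation.
    """
    stack = []    # generation-time stack of C expression names
    out = []      # generated C statements
    counter = 0
    for opname, arg in instructions:
        if opname == "LOAD_FAST":
            # No runtime push: just remember which C variable holds the value.
            stack.append(arg)
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            tmp = "t%d" % counter
            counter += 1
            if checked:
                out.append('TRACE_OP("BINARY_ADD");  /* hypothetical debug hook */')
            out.append("PyObject *%s = PyNumber_Add(%s, %s);" % (tmp, left, right))
            # Error propagation is needed in every build, not only checked ones.
            out.append("if (%s == NULL) return NULL;" % tmp)
            stack.append(tmp)
        elif opname == "RETURN_VALUE":
            out.append("return %s;" % stack.pop())
        else:
            raise ValueError("unsupported opcode: %s" % opname)
    return out


program = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

production = translate(program)
instrumented = translate(program, checked=True)
print("\n".join(production))
```

The two LOAD_FAST instructions emit no C at all here, which is the whole
saving; the instrumented build differs from the production one only by the
extra hook line.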

Again, I don't think it would ever make sense to apply this to a whole
Python system or even most of it.  The existing bytecode interpreter and the
Python runtime API are things I mean to leverage, not remove.  Bytecode is
such an enormously compact format compared to the binary equivalent that I
think in most situations you would accept the cost of interpretation in
exchange for the compactness.
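
The bytecode side of that comparison is easy to measure.  A small sketch,
assuming a current Python 3 (where the attribute is spelled __code__ rather
than func_code) and an arbitrary example function; the size of the equivalent
compiled C depends on the compiler and platform and is not measured here.

```python
def axpy(a, x, y):
    # Arbitrary example function chosen for illustration.
    return a * x + y


code = axpy.__code__        # the function's code object
size = len(code.co_code)    # length of the raw bytecode string, in bytes
print("axpy: %d bytes of bytecode" % size)
```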

> And of the two, my guess is that specialization is the only way to get
> triple-digit performance improvements.  I could be wrong, of course, but my
> assessment of the odds is a significant reason why I haven't tried such
> experiments myself.
>
> Meanwhile, it seems to me that PyPy's new extension builder system is
> actually the closest thing to your goals, in that it operates on bytecode
> and produces C code using the CPython API.

Interesting, perhaps that's the first part I missed.

> I don't know whether it
> successfully produces any performance improvements, or whether its primary
> goal is just to make it easy to create wrappers for C libraries (ala
> Pyrex).  However, it seems worth looking into.

I will.

> Also, is there an SVN repository somewhere of what you're working on?  I'd
> love to have a look.

I don't have a version that works with BytecodeAssembler just yet; I only
get about an hour a week to tinker with this.  Let me see what I can do this
week to show some actual code.

-Michel