On 8/21/06, <b class="gmail_sendername">Phillip J. Eby</b> &lt;<a href="mailto:pje@telecommunity.com">pje@telecommunity.com</a>&gt; wrote:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt;I think my loop with a few math ops is sufficiently contrived that<br>&gt;it can't be considered a very reliable benchmark of what's to come or much<br>&gt;inspiration for awe. ;)<br><br>Fair enough.</blockquote>

<div><br>I certainly made an overly bold statement to begin with.&nbsp; Performance is actually not my highest priority; if the speedup is only 10-20%, so be it.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&gt;&nbsp;&nbsp; I actually expect a speedup benefit to happen only in certain cases, I<br>&gt; suspect most functions would not benefit enough or at all to even bother.<br><br>This is especially going to be the case where function calls are concerned,

<br>but probably also integer math and certain other operations where the<br>interpreter has type-specific speedup tricks that can't be obtained by<br>straight C API translation.</blockquote><div><br>I agree, and I'm hedging my bets on whatever &quot;p3yk&quot; might land vis-a-vis type information before run time.&nbsp; For now I'm focusing on interpreter interoperability.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt;&nbsp;&nbsp; But I'm surprised you don't think the inner interpreter or the stack<br>

&gt; movement has much overhead, that seems to be pretty low hanging fruit to<br>&gt; me but the future could prove me dead wrong. I remember one or two<br>&gt; threads specifically on this subject on python-dev, but I don't remember

<br>&gt; much actual measurement being conclusively shown at the time to prove it<br>&gt; one way or another.<br><br>Well, it's said that p2c (or was it py2c?) achieved only 10-15% speedup<br>using these techniques.</blockquote>

<div><br>I wish I could find that code, but Google turned up nothing (perhaps I should dig more under the name py2c), so it's hard for me to research what techniques were used or what the benefit was (or would be now that things have changed so much).

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt;I've not studied much of Pyrex's output, only some through a few<br>&gt;trials.&nbsp;&nbsp;I would be surprised and suspect of my own results if Pyrex didn't

<br>&gt;generate faster code than I have here,<br><br>Hm.&nbsp;&nbsp;Well, Pyrex generates code that does a lot of setting variables to<br>None and inc/decrefing all over the place, and plus which it doesn't take<br>nearly as much advantage of the type information it has available to it as

<br>you'd expect.&nbsp;&nbsp;And until relatively recent versions it used C strings to do<br>attribute access, which is horrifically slow.</blockquote><div><br>I'll avoid that then. ;) <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

In general, simply taking a Python program and compiling it with Pyrex is<br>very likely to produce code that is *slower* than the original Python<br>program -- even after you add some type declarations.</blockquote><div>

<br>Here:<br><br><a href="http://svn.eby-sarna.com/RuleDispatch/src/dispatch/_d_speedups.pyx?rev=2135&amp;view=log">http://svn.eby-sarna.com/RuleDispatch/src/dispatch/_d_speedups.pyx?rev=2135&amp;view=log</a><br><br>in the last log msg you allude to some pleasing amount of performance increase.&nbsp; Was there some thought you put toward writing code specifically well suited to Pyrex?

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt;&nbsp;&nbsp;although I don't think Pyrex is as readable,<br><br>Heh.&nbsp;&nbsp;If you want better performance than the CPython interpreter, I think

<br>you're going to end up with code that's far less readable than Pyrex, but<br>that's just a guess.</blockquote><div><br>Well I guess by readable in this case I mean highly commented, including the source Python line and comments, stack depth information, block based indentation, block information, and experimenting with more readable symbols that are mangled in a later step.&nbsp; My goal is for the reader (me) to be able to scan down the generated code and see the progression of program flow from one instruction to the next and easily see what's going on, and work forward from there.
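<br><br>A minimal sketch of that kind of annotated listing, assuming Python 3's standard dis module rather than the generator discussed in this thread. The function scale and the helper annotate are hypothetical names chosen for illustration. The running stack depth is a plain sum of dis.stack_effect values, so it only holds for straight-line code without jumps.<br>
<pre>
```python
import dis

# Hypothetical function to annotate; any small straight-line function works.
SOURCE = '''\
def scale(x, y):
    total = x * y
    return total + 1
'''


def annotate(source, func_name):
    """Return one text line per bytecode instruction, preceded by the
    source line it came from and followed by a running stack depth."""
    namespace = {}
    exec(source, namespace)
    code = namespace[func_name].__code__
    source_lines = source.splitlines()
    # Map bytecode offset to source line number (some entries may be None).
    line_starts = dict(dis.findlinestarts(code))
    valid_lines = range(1, len(source_lines) + 1)

    out = []
    depth = 0
    for instr in dis.get_instructions(code):
        lineno = line_starts.get(instr.offset)
        if lineno in valid_lines:
            text = source_lines[lineno - 1].strip()
            out.append("# line %d: %s" % (lineno, text))
        # Linear accumulation: an assumption that breaks once jumps appear.
        depth += dis.stack_effect(instr.opcode, instr.arg)
        out.append("    %-20s %-10s depth=%d"
                   % (instr.opname, instr.argrepr, depth))
    return out


if __name__ == "__main__":
    print("\n".join(annotate(SOURCE, "scale")))
```
</pre>
The exact opcodes printed depend on the CPython version, so the listing is a reading aid and not a stable format.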

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&gt;it's certainly more mature and probably the closest project in terms of<br>

&gt;goals.&nbsp;&nbsp;I think my experiment is much simpler than Pyrex though, not<br>&gt;having to maintain a language parser or complex code generator or deal<br>&gt;with some of the restrictions Pyrex has.&nbsp;&nbsp;One of my main goals is to

<br>&gt;maintain 100% interpreter compatibility.<br><br>Unfortunately, I don't think that you can do that *and* get more than say a<br>10-15% speedup -- i.e., one probably not worth all the effort.&nbsp;&nbsp;</blockquote><div><br>

If performance were the sole goal.&nbsp;</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">The big<br>speedups are really all in one of two places:

<br><br>1. Specialization (via type declarations, inference, or JIT specialization<br>ala Psyco)</blockquote><div><br>I'll keep my eye on PyPy for this one.&nbsp; I caught the second half of Samuele's presentation at the Vancouver Python workshop.&nbsp; I'm still kicking myself for missing the first half.&nbsp; Inference has always scared me a bit, but I did catch his explanation of the inference algorithm, and it was convincing.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">2. Dropping compatibility with frames, function objects, trace/profile<br>hooks, GIL releasing, etc.

</blockquote><div><br>This is where I'm willing to experiment a bit with removing per-opcode overhead through run time check reduction and stack movement reduction, minus touching the GIL.&nbsp; I'm not going near that.&nbsp; I think tracing and debugging can be achieved to even greater effect by producing different builds with added runtime checks based on a profile and production builds with none.&nbsp; 

<br><br>Again, I don't think it would ever make sense to apply this to a whole Python system or even most of it.&nbsp; I don't mean to remove the existing bytecode interpreter and the Python runtime API but to leverage them, and bytecode is such an enormously compact format compared to the binary equivalent that I think in most situations you would accept the cost of interpretation in exchange for the compactness.
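<br><br>For a rough sense of how compact bytecode is, here is a minimal sketch that counts the bytes in a function's raw bytecode. The names dot and bytecode_size are hypothetical. The count varies by CPython version (newer releases include inline cache entries in co_code), and the sketch makes no attempt to measure the size of an equivalent compiled binary.<br>
<pre>
```python
def dot(xs, ys):
    """Hypothetical example function: a small loop with a few math ops."""
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def bytecode_size(func):
    """Number of bytes in the function's raw bytecode string."""
    return len(func.__code__.co_code)


if __name__ == "__main__":
    print("dot() bytecode bytes:", bytecode_size(dot))
```
</pre>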

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">And of the two, my guess is that specialization is the only way to get<br>triple-digit performance improvements.&nbsp;&nbsp;I could be wrong, of course, but my

<br>assessment of the odds is a significant reason why I haven't tried such<br>experiments myself.<br><br>Meanwhile, it seems to me that PyPy's new extension builder system is<br>actually the closest thing to your goals, in that it operates on bytecode

<br>and produces C code using the CPython API.&nbsp;</blockquote><div><br>Interesting, perhaps that's the first part I missed.&nbsp;</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

&nbsp;I don't know whether it<br>successfully produces any performance improvements, or whether its primary<br>goal is just to make it easy to create wrappers for C libraries (ala<br>Pyrex).&nbsp;&nbsp;However, it seems worth looking into.

</blockquote><div><br>I will. <br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Also, is there an SVN repository somewhere of what you're working on?&nbsp;&nbsp;I'd

<br>love to have a look.</blockquote><div><br>I don't have a version that works with BytecodeAssembler just yet; I only get about an hour a week to tinker with this.&nbsp; Let me see what I can do this week to show some actual code.

<br><br>-Michel</div></div>