[PEAK] Compiling lvalues, iterators, and comprehensions

Phillip J. Eby pje at telecommunity.com
Mon Jul 28 12:59:18 EDT 2008


In order to implement the SQL mapping stuff described in my June 19th 
post, we're going to need to be able to compile lvalues, iterators, 
and full comprehensions.  The peak.rules.ast_builder module supports 
comprehension syntax, but peak.util.assembler and peak.rules.codegen 
don't have any node types or bytecode generation for them as 
yet.  This is a bit of a problem, since in the default case of 
compiling a query (i.e. executing it as-is in Python) we'll need to be 
able to generate bytecode for the looping.

My current vision for how SQL translation will work is that there 
will be a QueryBuilder that accepts only listcomps or genexps; 
anything else as a top-level expression would be verboten.  There 
will also need to be an LValueBuilder for compiling the 'for' clauses.

Essentially, this is needed because the expressions used in 'for' 
clauses are assignment targets ("lvalues"), which can combine tuples, 
attribute and item assignment (setattr/setitem), etc.  For example:

    [... for c in qz for a.b[c], d in abcd if ...]

This is a syntactically valid (if rather baroque) 
expression.  Personally, I think that we don't actually need to 
support this full generality for purposes of query translation.  It 
should be sufficient to support local variables and possibly-nested 
sequences thereof.  However, even if we don't support that syntax for 
compilation, the LValueBuilder will still need to recognize the other 
forms, if only to explicitly reject them.
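
To make the "recognize, then reject" idea concrete, here is a rough sketch 
using the standard library's ast module rather than peak.rules.ast_builder.  
The names classify_target and for_targets are hypothetical, and the 
three-way classification is only an assumption about what an LValueBuilder 
might distinguish:

```python
import ast

def classify_target(node):
    """Classify a 'for' target as 'local', 'sequence', or 'unsupported'."""
    if isinstance(node, ast.Name):
        return "local"                      # plain local variable
    if isinstance(node, (ast.Tuple, ast.List)):
        kinds = [classify_target(elt) for elt in node.elts]
        # A sequence is only acceptable if everything nested in it is.
        return "unsupported" if "unsupported" in kinds else "sequence"
    return "unsupported"                    # a.b, a[c], *rest, etc.

def for_targets(source):
    """Return the classification of each 'for' clause in a comprehension."""
    comp = ast.parse(source, mode="eval").body
    return [classify_target(gen.target) for gen in comp.generators]

print(for_targets("[x for c in qz for a.b[c], d in abcd]"))
print(for_targets("[x for a, (b, c) in rows]"))
```

The first call reports the baroque target as unsupported, while the 
nested-tuple target in the second call is accepted.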

So, we'll need an UnpackSequence node type and a LocalAssign node 
type.  UnpackSequence will generate an UNPACK_SEQUENCE of the length 
of its argument, and then compile the items in its argument.  These 
might be LocalAssign nodes, or nested UnpackSequence nodes.  On the 
SQL translation side, we'll detect LocalAssign nodes to assign type 
information to local variables.  (We'll really need to do the same 
for UnpackSequence, because we might be looping over a nested query 
returned by a method or function.  But I might not bother with this 
in the prototype version.)
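
As a rough illustration of the recursion involved, the sketch below emits 
a flat list of pseudo-instructions for a nested target.  It is not the 
BytecodeAssembler API: compile_target is a hypothetical helper, targets 
are modeled as plain strings and tuples, and mapping LocalAssign to 
STORE_FAST is an assumption:

```python
def compile_target(target, code):
    """Append pseudo-instructions that store the top-of-stack value into target.

    target is either a str (a local variable name) or a tuple/list of
    nested targets.  Instructions are (opname, argument) pairs.
    """
    if isinstance(target, str):
        code.append(("STORE_FAST", target))            # LocalAssign (assumed opcode)
    else:
        code.append(("UNPACK_SEQUENCE", len(target)))  # UnpackSequence
        # UNPACK_SEQUENCE leaves the first item on top of the stack,
        # so the items are compiled left to right.
        for item in target:
            compile_target(item, code)
    return code

for instruction in compile_target(("a", ("b", "c")), []):
    print(instruction)
```

A SQL translator could walk the same structure and record type 
information at each local instead of emitting a store.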

The loops themselves also need to be compiled, so that a query can be 
executed by looping in Python.  We'll need a new node type, something like 
"For(iterable, assign, body)", that compiles to:

        iterable
        GET_ITER
    L1: FOR_ITER L2
        assign
        body
        JUMP_ABSOLUTE L1
    L2: ...
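
For readers less used to reading bytecode, the layout above corresponds 
roughly to the following pure-Python loop.  This is only a sketch of the 
intended behavior, not generated code; run_for is a hypothetical name, and 
assign and body are modeled as plain callables:

```python
def run_for(iterable, assign, body):
    """Mimic the For(iterable, assign, body) bytecode layout step by step."""
    iterator = iter(iterable)          # iterable; GET_ITER
    while True:                        # L1:
        try:
            item = next(iterator)      # FOR_ITER: push the next item...
        except StopIteration:
            break                      # ...or drop the iterator and jump to L2
        assign(item)                   # assign (consumes the pushed item)
        body()                         # body
        # JUMP_ABSOLUTE L1 is the implicit return to the top of the loop
    # L2: execution continues here

scope = {}
results = []
run_for([1, 2, 3],
        lambda item: scope.update(x=item),
        lambda: results.append(scope["x"] * 10))
print(results)
```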

Currently, BytecodeAssembler doesn't correctly support the FOR_ITER 
opcode, which has a branch-dependent stack effect.  Specifically, it 
expects the iterator on top of the stack, and either pushes one item (the 
next value, on fall-through) or pops one (the exhausted iterator, when it 
jumps to its target).  The modification probably won't 
be too difficult, though; mainly just an extra "if" in the 
"Code.jump()" method.  There will need to be some tests, too, of course.
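
The bookkeeping this implies can be sketched as follows.  This is not the 
actual Code.jump() code; for_iter_depths is a hypothetical helper that 
only models the two stack depths described above, on the assumption that 
the assembler tracks one depth for the fall-through path and one for the 
jump target:

```python
def for_iter_depths(depth_before):
    """Return (fall_through_depth, jump_target_depth) for a FOR_ITER.

    depth_before is the stack depth when FOR_ITER executes, counting
    the iterator itself.
    """
    if depth_before < 1:
        raise ValueError("FOR_ITER needs an iterator on the stack")
    fall_through = depth_before + 1   # iterator stays, next item is pushed
    jump_target = depth_before - 1    # iterator is popped once exhausted
    return fall_through, jump_target

# With only the iterator on the stack, the loop body starts at depth 2
# and the code after the loop (L2) starts at depth 0.
print(for_iter_depths(1))
```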

So, the next thing to implement should be to just add For(), 
LocalAssign(), and UnpackSequence() node types to BytecodeAssembler 
(with doc & tests, of course), and release a new version.



