[PEAK] Reactor-driven microthreads

Phillip J. Eby pje at telecommunity.com
Wed Dec 31 13:12:37 EST 2003


At 12:12 PM 12/31/03 -0500, Bob Ippolito wrote:
>On Dec 31, 2003, at 12:22 AM, Phillip J. Eby wrote:
>
>>Hm.  This seems like the first really good use case I've seen for
>>having a macro facility in Python, since it would allow us to spell
>>the yield+errorcheck combination with less boilerplate.  Ah well.
>
>That, and "flattening" generators
>
>def someGenerator():
>     for something in someOtherGenerator():
>         yield something
>
>This gets really ugly after a while, but it happens a lot when you are
>doing "microthreads".

Actually, the Thread class in my second post allows you to do:

def someGenerator():
     yield someOtherGenerator(); resume()

Thread objects maintain an external "stack" of nested generators, so you 
don't need to do this kind of passthru.
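
The mechanism is simple enough to sketch (simplified here: plain yields are just collected into a list, and the resume()/error-passback machinery is left out):

```python
import types

class Thread:
    """Simplified sketch: a Thread keeps an external stack of nested
    generators, so yielding a generator "calls into" it without any
    'for x in sub(): yield x' passthru boilerplate."""

    def __init__(self):
        self.stack = []

    def run(self, gen):
        out = []                      # collect plain yields, for demo purposes
        self.stack.append(gen)
        while self.stack:
            try:
                value = next(self.stack[-1])
            except StopIteration:
                self.stack.pop()      # generator finished; pop back out
                continue
            if isinstance(value, types.GeneratorType):
                self.stack.append(value)   # descend into the nested generator
            else:
                out.append(value)
        return out

def someOtherGenerator():
    yield 1
    yield 2

def someGenerator():
    yield 0
    yield someOtherGenerator()        # no passthru loop needed
    yield 3

# Thread().run(someGenerator()) collects 0, 1, 2, 3 in order
```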



>By the way, the microthread thing in Twisted has been talked about
>before (
>http://twistedmatrix.com/pipermail/twisted-python/2003-February/002808.html
>) and is implemented to some extent as twisted.flow
>(entirely different from the flow module in that discussion).

I've looked at twisted.flow.  To me, it's way overcomplicated in both 
implementation and concepts.  The approach I've just sketched involves 
only two interfaces/concepts: threads and schedulers.  (Three, if you 
count "reactor".)  By contrast, twisted.flow seems to have stages, 
wrappers, blocks, and controllers, just for starters.  (Again, not counting 
"reactor" as a concept, and ignoring that the package also depends on 
Deferreds.)  Last, but not least, twisted.flow doesn't support the kind of 
error-passback handling that I've devised, unless I'm misunderstanding 
something about how it works.
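
To make the comparison concrete, here's a sketch of the error-passback idea (deliberately simplified to a single global "current thread"; the real version would track the running thread properly):

```python
_current = None    # the thread currently being stepped (sketch-only global)

class MicroThread:
    """Simplified sketch of yield-plus-resume() error passback."""

    def __init__(self, gen):
        self.gen = gen
        self.error = None

    def step(self, error=None):
        # advance the generator one step, optionally delivering an error
        global _current
        self.error = error
        _current = self
        try:
            next(self.gen)
        except StopIteration:
            return False
        return True

def resume():
    # called by app code right after each yield: re-raise any error
    # that occurred while the generator was suspended
    err, _current.error = _current.error, None
    if err is not None:
        raise err

caught = []

def worker():
    try:
        yield "wait-for-io"; resume()
        caught.append("ok")
    except OSError as e:
        caught.append("caught %s" % e)

t = MicroThread(worker())
t.step()                   # runs up to the yield
t.step(OSError("boom"))    # resume() re-raises inside the generator
# caught == ["caught boom"]
```

The point is that errors from the awaited operation surface as ordinary exceptions at the yield point, so app code uses plain try/except instead of errback chains.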

I do understand that twisted.flow allows you to chain iterable "flows" 
while remaining co-operative.  However, it's trivial to do that kind 
of chaining in this framework, too, if you're using a generator that's 
designed for this:

     lines = sock.readlines()
     yield lines; resume()

     for line in lines:
         print line
         yield lines; resume()

You would implement sock.readlines() as something like:

     def readlines(self):
         queue = Queue()
         def genLines():
             # ... Loop to accumulate a line
             queue.put(line)
         Thread(self).run(genLines())
         return queue

For clarity, I've omitted the part where the generator in readlines() 
accumulates the data, pushes back partial lines, etc.  As you can see, the 
only new concept we need is a "queue", or perhaps we should call it a 
"pipe".  Actually, the concept is probably used often enough that it would 
be easier to do:

     def readlines(self):
         def genLines():
             # ... Loop to accumulate a line
             queue.put(line)
         return Queue(self, run=genLines())
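
A minimal sketch of what such a Queue might look like (hypothetical names; the co-operative scheduling is reduced to an explicit step() loop, and the socket is replaced by a list of chunks, for illustration):

```python
from collections import deque

class Queue:
    """Sketch of the 'queue'/'pipe' concept: a producer generator fills
    it one co-operative step at a time; consumers drain what's ready."""

    def __init__(self, producer=None):
        self.items = deque()
        self.producer = producer      # generator that calls put()

    def put(self, item):
        self.items.append(item)

    def step(self):
        # advance the producer one co-operative step;
        # returns False once it's exhausted
        try:
            next(self.producer)
            return True
        except StopIteration:
            return False

    def __iter__(self):
        while self.items:
            yield self.items.popleft()

def readlines(chunks):
    queue = Queue()
    def genLines():
        buf = ""
        for chunk in chunks:          # stand-in for reading the socket
            buf += chunk
            while "\n" in buf:
                line, buf = buf.split("\n", 1)
                queue.put(line)       # push each complete line
            yield                     # co-operative yield point per chunk
    queue.producer = genLines()
    return queue

q = readlines(["he", "llo\nwor", "ld\n"])
while q.step():
    pass
# list(q) → ["hello", "world"]
```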

This is roughly equivalent to twisted.flow's notion of a 'Stage', but is a 
bit more concrete/explicit.  A critical difference between my proposed 
framework and twisted.flow is that twisted.flow tries to intermingle 
control instructions and data in the values yielded from an iterator.  I 
think that's a bad idea, because it makes the implementation hard to 
explain.  (Cf. "The Zen of Python": "If the implementation is hard to 
explain, it's a bad idea.")

By contrast, having two distinct concepts of "thread" and "queue" makes 
both sides of the code (app and tools) relatively easy to follow.  I find 
twisted.flow much harder to follow on the "tool" side, which makes it 
harder to see how I would create new data sources, generators, or 
"instructions".



