[TransWarp] Re: CIS and Threaded Apps

Phillip J. Eby pje at telecommunity.com
Mon Jun 10 09:25:18 EDT 2002


At 10:13 AM 6/10/02 +0200, Ulrich Eck wrote:

> > Hm.  This strikes me as problematic to use with CIS bindings, unless the
> > pool is a proxy of some kind, and then it needs to also know when to
> > allocate/release an object.  My architectural assumption about threads and
> > pooling is that you'll create separate instances of the App-level object
> > and pool *those*, rather than individual components within the
> > application.  This works well in Zope, and it localizes the lifecycle
> > issue, while avoiding proxies and pretty much all possible thread
> > synchronization.
>
>I think this is too "Zope-centric". In my opinion, pooling data access is a
>must, but do you really think that process objects (things that implement
>mostly logic) should be pooled as a whole?

Yes.  :)


>What about sharing data between threads? This is one of the most important
>matters for me in using threads; otherwise I could use separate processes as
>well.

That's right, you could.  And you might get better performance by doing 
so.  Python's global interpreter lock means that it's very difficult to 
bring multiple CPUs to bear on your code when using threads.  Thus we tend 
to prefer multiprocessing to multithreading.

As for sharing data between threads, the only things that you could easily 
share would be caches of relatively static data.  Trying to manage 
transacted data that is shared between threads will make you tear your hair 
out pretty quickly.  My experience has been, if the data participates in 
transactions, don't share it between threads!  That pretty much means code 
is about the only thing you can share.  One of my early application server 
efforts, ASDF, was designed to support multithreading by allowing static 
data such as DTML and SQL methods to be shared even though dynamic data 
like SQL connections would not be.  The *parameters* of the SQL connection 
would be shared, and a per-thread context wrapper kept the actual connection.
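
To make the "shared parameters, per-thread connection" idea concrete, here is a minimal sketch using the standard library's threading.local. It is not the ASDF code; PerThreadConnection, FakeConnection and the "db://example" string are hypothetical names made up for illustration.

```python
import threading

class FakeConnection:
    """Hypothetical stand-in for a real database connection."""
    def __init__(self, dsn):
        self.dsn = dsn

class PerThreadConnection:
    """Shares the connection parameters, keeps one connection per thread."""
    def __init__(self, dsn, factory=FakeConnection):
        self.dsn = dsn                  # shared, read-only parameters
        self.factory = factory
        self._local = threading.local() # per-thread storage

    def get(self):
        # Each thread sees its own 'conn' attribute on the local object.
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = self._local.conn = self.factory(self.dsn)
        return conn

shared = PerThreadConnection("db://example")
seen = {}

def worker(name):
    first = shared.get()
    second = shared.get()
    # Record the connection and whether repeated calls reuse it.
    seen[name] = (first, first is second)

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread reuses its own connection on repeated calls, while the two threads never see each other's connection object. Only the parameters are shared.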

The end result, however, was that the code was ridiculously complicated, 
difficult to maintain, and didn't really gain that much over the more naive 
pooling techniques of Zope.  And for a little more cost in memory, 
multiprocessing gave a significant performance boost - even on a single CPU 
machine!


>I think that it will be a major performance problem if every call from
>outside (e.g. get me these attributes from that object out of the LDAP/SQL)
>leads to instantiating a complete app instance (even though we are using
>SEF.Service mostly), creating a new connection, authenticating it, and
>initializing the database and caches. Using persistent connections and a
>certain pooling algorithm that can handle the things you talked about above,
>e.g.:

I think you misunderstand me.  On each call from outside, you retrieve an 
App instance from your pool, and return it to the pool when done.  The 
connections will be persistent, any caches will be persistent, and so 
on.  Likewise bindings and autocreatables that have been instantiated will 
stay instantiated.  I'm saying to cache the *whole application*, not just 
the DB connections.
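
The retrieve-and-return cycle described above can be sketched roughly as follows, to show that the expensive setup happens once per pooled instance rather than once per call. PooledApp, handle_request and the upper-casing "lookup" are hypothetical placeholders, not TransWarp classes.

```python
class PooledApp:
    """Hypothetical app object; expensive setup runs once per instance."""
    instances_created = 0

    def __init__(self):
        PooledApp.instances_created += 1
        self.connection = object()  # stands in for an authenticated connection
        self.cache = {}             # survives between calls
        self.calls = 0

    def handle(self, key):
        self.calls += 1
        # Pretend lookup: the result is cached on this instance.
        return self.cache.setdefault(key, key.upper())

app_pool = []

def handle_request(key):
    # Take an existing instance if there is one, else build a new one.
    try:
        app = app_pool.pop()
    except IndexError:
        app = PooledApp()
    try:
        return app.handle(key)
    finally:
        # Always hand the instance back, even if the call failed.
        app_pool.append(app)

results = [handle_request(k) for k in ("ldap", "sql", "ldap")]
```

With sequential calls as in this sketch, all three requests are served by the same instance, so its connection and cache are built only once.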



>Within my TransactionAwarePool there are 3 Classes:
>
>1. InstancePoolService: Transparently switches between the objects it holds
>     for each incoming call, and can be used to serve instances without
>     modifying code that works without pooling. It is a mixture of Once and
>     AutoCreateable, and therefore it's not too much overhead (it depends only
>     on the algorithm that is needed to select/manage instances for calls). It
>     does not explicitly depend on threads to switch; e.g. the implementation
>     of it in Transactions uses the TransactionID to determine whether a thread
>     already has an instance assigned or not. If not, it tries to return a free
>     instance; if none is available, the call is blocked until an instance
>     gets free again. With this approach I'm able to share transactions and
>     the instances between threads that belong together, which is what we
>     wanted to be able to do.
>
>2. PooledObjectWrapper: This class is a wrapper for a pooled instance, which
>     may be a service or something else. It has two methods, use(client) and
>     free(), which are called when an instance is used or not needed anymore.
>     It has an ObjectTransactionNotifier.
>
>3. ObjectTransactionNotifier: Similar to your
>     Database.Transactions.TransactionNotifier. It helps the IPS/POW to find
>     the transaction boundaries and frees instances with forget(=tpc_finish).
>     Similar classes could recognize users that use an instance for the first
>     time and register a callback when they log out, or just define an idle
>     timeout.
>
>In all "non-database-relevant" parts there are not too many things to worry
>about with threads in our case, I think.
>
>Do you still think this approach won't work in most cases?

No, I just think you're making things more complicated than they need to 
be.  I think you could replace the whole thing with a list of App 
instances, from which threads .pop() one when they need it, and .append() 
to it when they're done with it.  No muss, no fuss, no locking...  like this:

AppPool = []

try:
     app = AppPool.pop()
except IndexError:
     # the pool is empty, so create a fresh instance
     app = MyAppClass()

try:
     pass  # do stuff with app or parts thereof
finally:
     AppPool.append(app)



Voila.  This approach will use more memory than yours, but it will pool 
*everything* and won't require "invasive" management of individual objects 
in the application.  The default is for things *not* to be shared, but 
instead pooled.  If you have individual objects you *want* to be shared, 
you can write their autocreate descriptors to work as a singleton or to 
share some underlying data structure and handle locking.  But as I said, 
this is probably only useful for caches of large static data, and it'd have 
to be pretty large to be worth taking the time to write the management of it.
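
For the "want it shared" case, one possible shape is a descriptor that builds a value once and hands the same object to every instance, with a lock around the one-time construction. This is a plain Python descriptor written for illustration, not the actual autocreate descriptor API; SharedOnce, load_static_table and ReportApp are hypothetical names.

```python
import threading

class SharedOnce:
    """Descriptor: build one value, share it across all instances and threads."""
    def __init__(self, factory):
        self.factory = factory
        self._lock = threading.Lock()
        self._value = None
        self._built = False

    def __get__(self, obj, objtype=None):
        if not self._built:
            # Only the first builders contend for the lock; re-check inside
            # it so the factory runs exactly once.
            with self._lock:
                if not self._built:
                    self._value = self.factory()
                    self._built = True
        return self._value

build_count = []

def load_static_table():
    # Stands in for loading a large, read-only data set.
    build_count.append(1)
    return {"us": "United States", "de": "Germany"}

class ReportApp:
    countries = SharedOnce(load_static_table)

a, b = ReportApp(), ReportApp()
same_object = a.countries is b.countries
```

Both instances end up with the same dictionary, and the loader runs once. This only stays safe if the shared data is treated as read-only after it is built; anything that mutates it needs its own locking.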



>P.S. Sorry for the inconvenience that you have had with my postings recently,
>but we have a central mailbox (lists at net-labs.de) that is subscribed to all
>our mailing lists to minimize the amount of traffic on our connection. Is it
>possible to allow ueck at net-labs.de to post without receiving any
>messages?

I'm pretty sure that you can turn off your message receiving from the list 
management page, at http://www.eby-sarna.com/.  I think it will still let 
you post with message receipt turned off.




