[PEAK] A pattern for subscribable queries in the Trellis

Sergey Schetinin maluke at gmail.com
Tue May 13 04:00:35 EDT 2008


From what I understood, this seems like a very versatile approach --
great. I guess we need to see some initial implementation to start
having questions about it, but if I can be of any help implementing,
especially the wx binding, I'd gladly participate.

On Tue, May 13, 2008 at 2:18 AM, Phillip J. Eby <pje at telecommunity.com> wrote:
> So, this week I've been working a bit on the persistence/ORM stuff for the
> Trellis, specifically on bidirectional links in collections.
>
>  And, in order to have set/list/dict attributes on either side of a two-way
> link update correctly, it's necessary to have a "relationship manager" or
> switchboard component that handles updating both sides of the connection.
>
>  While working on the design for that component, I've noticed some
> similarities between what it needs to do and what the Time service does.
> First, both have a method that needs to be called to inquire about a
> condition...  and then have the calling rule be recalculated if the
> condition changes.  Second, that subscription process needs to happen in
> such a way that it doesn't create a dependency cycle.
>
>  Currently, if you use a time-based rule to trigger an action that causes a
> new time-based rule to be set up, this can create what appears to be a
> cyclical dependency.  (This is because there's a rule that depends on the
> "schedule" that fires callback conditions, but if one of those called-back
> rules inquires about a new future time, it will cause the schedule to be
> modified.)
>
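>  For illustration, the problematic shape is roughly the following rule
> (just a sketch: 'Time[10]' is the activity module's "condition that
> becomes true ten seconds from now", and the other names are placeholders):
>
>     from peak.events import trellis
>     from peak.events.activity import Time
>
>     class Poller(trellis.Component):
>         trellis.attrs(timeout = None)
>
>         @trellis.maintain
>         def poll(self):
>             if self.timeout is None or self.timeout:
>                 # ... do the periodic work here ...
>                 # asking for a new future time modifies the schedule
>                 # that this rule's recalculation depends on
>                 self.timeout = Time[10]
>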
>  In a similar way, a "relationship manager" needs to be able to iterate over
> received events regarding a subset of collection memberships, and then save
> the dependency for future use.
>
>  So, for future reference, here's what I've figured out about how to
> create this sort of object in a way that avoids circular calculations.
>
>  To create a subscribable-query object, you will need a weak-value
> dictionary keying query parameters to cells, and some type of index
> structure, stored in a cell.  You'll then have a "maintain" rule that
> updates indexed cells when conditions change, and a "modifier" that adds
> cells to the weak dictionary.  The query function will then access the
> stored cells to perform its job.
>
>  Something roughly like this, in other words:
>
>     class QueryableThing(trellis.Component):
>         _index   = trellis.make(SomeNonTrellisType, writable=True)
>         _queries = trellis.make(weakref.WeakValueDictionary)
>
>         def query(self, *args):
>             # This could also preprocess the args or postprocess the values
>             return self._query_cell(args).value
>
>         @trellis.modifier
>         def _query_cell(self, key):
>             c = self._queries.get(key)
>             if c is None:
>                 c = self._queries[key] = ... #some cell
>                 trellis.on_undo(self._queries.pop, key, None)
>                 ... # add cell to self._index, with undo info
>                 trellis.on_commit(trellis.changed,
>                                   trellis.Cells(self)['_index'])
>             return c
>
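>  To make the elided bits concrete, here's one way they might be filled in.
> This is purely a sketch: the query cells are plain writable Cells, the
> index is an ordinary dict, and the 'data' attribute and '_update' rule
> are placeholders for whatever the queries would actually compute:
>
>     import weakref
>     from peak.events import trellis
>
>     class QueryableThing(trellis.Component):
>         _index   = trellis.make(dict, writable=True)  # key -> cell
>         _queries = trellis.make(weakref.WeakValueDictionary)
>         data     = trellis.make(trellis.Dict)  # the data being queried
>
>         def query(self, *args):
>             return self._query_cell(args).value
>
>         @trellis.modifier
>         def _query_cell(self, key):
>             c = self._queries.get(key)
>             if c is None:
>                 c = self._queries[key] = trellis.Cell(value=None)
>                 trellis.on_undo(self._queries.pop, key, None)
>                 self._index[key] = c  # note: keeps the cell alive
>                 trellis.on_undo(self._index.pop, key, None)
>                 trellis.on_commit(trellis.changed,
>                                   trellis.Cells(self)['_index'])
>             return c
>
>         @trellis.maintain
>         def _update(self):
>             # reading _index and data makes this rule depend on both;
>             # push current data into the indexed query cells
>             # (assumes one-argument queries)
>             for key, cell in self._index.items():
>                 cell.value = self.data.get(key[0])
>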
>  The 'Time' service already follows this pattern, more or less, except that
> '_index' is named '_schedule', and '_queries' is named '_events'.
>
>  This general pattern should also be usable for the "relationship manager".
> In fact, it's probably a general enough pattern to be worth embodying as a
> component or wrapper of some sort.  For example, we could perhaps have a
> decorator that worked something like this:
>
>         @collections.cell_cache
>         def _queries(self, key):
>             return ...  # some cell
>
>         @trellis.maintain(make=SomeTypeForTheIndex)
>         def _index(self):
>             for key in self._queries.added:
>                 # add key to index, with undo logging if index isn't
>                 # a trellis-ized type
>
>         def query(self, *args):
>             # This could also preprocess the args or postprocess the values
>             return self._queries[args].value
>
>         @trellis.maintain
>         def some_rule(self):
>             # code that looks at _index and some other input data
>             # to update any cells for those keys that are still in
>             # self._queries
>
>  In other words, the "cell_cache" decorator would turn _queries into a
> make() descriptor for a weak-value cache whose __getitem__ always returns a
> cell, and whose 'added' attribute is an observable sequence of the keys that
> were added to the cache during the previous recalc.
>
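>  As a rough non-Trellis approximation of that interface (the class and
> attribute names here are just placeholders to pin down the idea):
>
>     import weakref
>
>     class CellCache(object):
>         """Weak-value cache whose __getitem__ always returns a cell"""
>
>         def __init__(self, factory):
>             self.factory = factory  # function mapping key -> new cell
>             self.cells = weakref.WeakValueDictionary()
>             self.added = []  # the real thing would make this observable
>                              # and reset it after each recalc
>
>         def __getitem__(self, key):
>             cell = self.cells.get(key)
>             if cell is None:
>                 cell = self.cells[key] = self.factory(key)
>                 self.added.append(key)
>             return cell
>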
>  Neither of these examples handles one tricky bit, however: garbage
> collection of index contents.  In the Time service, this isn't important
> because the constant advance of the clock allows the index to be cleaned up
> automatically.  However, for a relationship manager, a cleanup-on-demand
> approach could lead to it being a long time before a relevant key is
> encountered by chance and cleaned up.
>
>  One possible way to handle cleanup would be to customize the weak reference
> callbacks used by the cache.  However, since the GC calls can occur at an
> arbitrary time, this could be somewhat awkward.  A more controlled approach
> might be to use the cells' connect/disconnect capability -- but then there's
> even more overhead added, because connect/disconnect functions can't modify
> trellis data structures.  They have to schedule these as future operations,
> using on_commit (which has still other caveats, in that there's no direct
> control flow link).
>
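>  The weakref-callback variant might look like the sketch below, where a
> dead cell's key is merely recorded so that something else can clean up
> the index at a controlled time ('dead_keys' is an invention here):
>
>     import weakref
>
>     class CleanupCache(object):
>         """Weak cache recording dead keys for deferred index cleanup"""
>
>         def __init__(self):
>             self.refs = {}        # key -> weak reference to a cell
>             self.dead_keys = []   # processed later, at a safe time
>
>         def add(self, key, cell):
>             def callback(ref, key=key):
>                 # GC runs at an arbitrary time, so just take notes
>                 del self.refs[key]
>                 self.dead_keys.append(key)
>             self.refs[key] = weakref.ref(cell, callback)
>
>         def get(self, key):
>             ref = self.refs.get(key)
>             return ref and ref()
>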
>  There's certainly an advantage, however, to making this functionality part
> of the core library, in that I expect it to be repeated for things like
> low-level select() I/O, signal handling, GUI events, and other circumstances
> where you need to have some sort of match pattern.  While it's certainly
> *possible* to implement these directly with connect/disconnect functions, it
> seems to me that the common case is more of a cache of cells.
>
>  Unfortunately, it's becoming clear that my current method of dealing with
> both connections and the Time service is a bit of a hack.  Instead of
> running as "perform" rules and on_commit (respectively), it seems to me that
> both should run in the on_commit phase.
>
>  That is to say, connect/disconnect processing should run after a recalc is
> finished, rather than during the "perform" phase.  In that way,
> connect/disconnect operations can be free to change other trellis data
> structures, but are run at the beginning of a new recalc, rather than in the
> recalc during which the subscription status change occurred.
>
>  Whew!  Okay, that makes more sense.  Making this a general characteristic
> of connect/disconnect simplifies the implementation of a query cache, since
> one can implement the indexing as connect/disconnect operations of the
> cell's connector.
>
>  I like this, because it gives connections a sensible logic in the overall
> scheme of things.  True, it requires changes to the undo logic and overall
> transaction mechanism, but these should be straightforward to make.
>
>  Okay, so how do we make this work with query caches?  Well, what if we had
> a descriptor that made its attribute a weak-value collection of cells, but
> was otherwise very much like a regular trellis attribute, in terms of the
> options you could specify?  E.g.:
>
>         @collections.cellcache  # should also allow initially=, resetting, etc.
>         def _queries(self, key):
>             return ...  # default value for key
>
>         @_queries.connector
>         def _add_to_index(self, key):
>             # add key to index, with undo logging if index isn't trellisized
>
>         @_queries.disconnector
>         def _remove_from_index(self, key):
>             # remove key from index, with undo logging if index isn't
>             # trellisized
>
>         def query(self, *args):
>             # This could also preprocess the args or postprocess the values
>             return self._queries[args].value
>
>         @trellis.maintain
>         def some_rule(self):
>             # code that looks at the index and some other input data
>             # to update any cells for those keys that are in self._queries
>             # and have listeners
>
>  To be a bit more specific, a cellcache's .connector, .disconnector, and
> rule would be passed the key of the corresponding cell, unlike regular
> connect/disconnect methods.  (Which currently get passed a cell, and
> possibly a memento.)
>
>  This pattern looks positively ideal for most of the use cases where we want
> dynamic subscribability.  In fact, it's almost better than the ability we
> have right now to set up connect/disconnect for single cells.  (Which is
> only used to implement laziness, currently.)
>
>  For example, to use this approach to monitor wx events, one could do
> something like this in a widget component base class:
>
>         def get_event(self, event_type):
>             return self._events[event_type].value
>
>         @collections.cellcache(resetting_to=None)
>         def _events(self, key):
>             """Cache of cells holding events of a given type"""
>
>         @_events.connector
>         def _bind(self, key):
>             receive = self._events[key].receive
>             self.bind(key, receive)
>             trellis.on_undo(self.unbind, key, receive)
>
>         @_events.disconnector
>         def _unbind(self, key):
>             receive = self._events[key].receive
>             self.unbind(key, receive)
>             trellis.on_undo(self.bind, key, receive)
>
>  Then, you could "poll" events using self.get_event(event_type), with the
> necessary bind/unbind operations happening dynamically, as needed.  The
> method would return an event or None, so you could call skip() or whatever
> on that object if needed.
>
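>  A rule consuming it might look like this (using wx.EVT_LEFT_DOWN as the
> key and a made-up 'clicked_at' attribute, purely for illustration):
>
>         trellis.attrs(clicked_at = None)
>
>         @trellis.maintain
>         def _watch_clicks(self):
>             evt = self.get_event(wx.EVT_LEFT_DOWN)  # reruns per event
>             if evt is not None:
>                 self.clicked_at = evt.GetPosition()
>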
>  (In practice, something like this probably needs to be a bit smarter about
> the event keys used, because you may want to monitor events by ID, and there
> is more than one kind of event you might use to monitor e.g. mouse position,
> control key status, etc.  But that's a topic for another post.)
>
>  Anyway, this rather looks like the preferred way to use connectors and
> disconnectors, so I'll add this to my implementation to-do list.  Once it's
> done, it should then be (relatively) straightforward to implement a
> "relationship manager" and bidirectional links between collections.  In
> fact, it won't be so much a relationship manager per se, as a sort of
> value-based dynamic dispatch or "pub-sub" tool, somewhat like a
> callback-free version of PyDispatcher.
>
>  Hm, maybe I should call it a PubSubHub...  or just a Hub.  :)  Anyway, you
> would do something like:
>
>     some_hub.put(SomeRelationship, 'link', ob1, ob2)
>
>  To publicize a link of type SomeRelationship being created between ob1 and
> ob2, and then you'd have a rule like this to "notice" such operations:
>
>     for rel, op, ob1, ob2 in some_hub.get(None, None, None, self):
>         # ob1 will be an object that was linked or unlinked from me
>         # rel will be the relationship
>         # op will specify what happened
>
>  That is, you'll be able to "query" a hub to see those events that equal
> your supplied values (with None being a wildcard).  Your rule will then be
> recalculated when matching events occur, without you needing to explicitly
> register or unregister callbacks as in PyDispatcher or Spike.
>
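>  Conceptually, the matching is just tuple comparison with None as a
> wildcard -- a toy illustration, not the hub's actual machinery:
>
>     def matches(pattern, event):
>         for p, e in zip(pattern, event):
>             if p is not None and p != e:
>                 return False
>         return True
>
>     # matches((None, None, None, self),
>     #         (SomeRelationship, 'link', ob1, self))  returns True
>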
>  Whew!  That was a long and winding road, but I believe it gets us where we
> want to go next.  This pub-sub-hub thing will probably also come very much
> in handy later, when we're doing record-based stuff.  Probably, you won't
> use just one giant hub, but have more specialized hubs so that rules can
> copy and transform messages between them.  (If you had a rule that read
> values from one hub, and then wrote back into the *same* hub, it would
> create a circular dependency between the hub's internal rules and your
> external ones.)
>
>  Anyway...  onward and upward.
>



-- 
Best Regards,
Sergey Schetinin

http://s3bk.com/ -- S3 Backup
http://word-to-html.com/ -- Word to HTML Converter


