[PEAK] Use cases for the priority feature

P.J. Eby pje at telecommunity.com
Wed Aug 18 19:33:02 EDT 2010


At 08:40 PM 8/18/2010 +0200, Christoph Zwerschke wrote:
>Am 18.08.2010 01:56 schrieb P.J. Eby:
> > At 12:37 AM 8/18/2010 +0200, Christoph Zwerschke wrote:
>>>Granted, that should work. But it looks a bit overly complicated
>>>to me, because you have to work with functions getters here,
>>>whereas with the priority parameter you could use the actual
>>>functions and spare some of the overhead.
> >
> > Here's Yet Another Way To Do It, btw:
> >
> > ...
> > def call_highest_priority(results):
> >     for quality, method in sorted(results, key=operator.itemgetter(0)):
> >         return method()
>
>That's good, but it gives the method with the minimum quality. Why not:
>
>def call_highest_priority(results):
>     return max(results, key=itemgetter(0))[1]()

Right - you get the idea though.
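For reference, here's the max-based combiner as a self-contained sketch, with plain lambdas standing in for the methods (no PEAK-Rules machinery involved):

```python
from operator import itemgetter

def call_highest_priority(results):
    # results: iterable of (quality, method) pairs; call the method
    # paired with the highest quality value.
    return max(results, key=itemgetter(0))[1]()

# Toy demonstration:
results = [(1, lambda: "low"), (3, lambda: "high"), (2, lambda: "mid")]
print(call_highest_priority(results))  # prints: high
```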


>I liked the approach with the "quality < x" condition better. The 
>only disadvantage I see is that it requires all methods to receive 
>that parameter even though the methods themselves don't actually need it.

There's actually an @expand_as workaround to avoid the extra 
argument, but it's somewhat evil.  I'm also still torn about making 
priorities too easy in the general case -- which tends to make that 
disadvantage an *advantage* in my mind, because then you have to 
decide *ahead of time* that you are going to be using priorities in a 
particular scheme!
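To illustrate the "quality < x" idea without any PEAK-Rules machinery, here's a plain-Python toy (the `rules` list and `convert()` function are mine, just simulating dispatch): because every rule's condition tests the explicit quality argument, the thresholds keep applicable rules disjoint and no ambiguity can arise:

```python
# Each rule pairs a condition on (ob, quality) with a converter.
rules = [
    (lambda ob, quality: hasattr(ob, '__json__') and quality > 20,
     lambda ob: ob.__json__()),
    (lambda ob, quality: isinstance(ob, dict) and quality <= 20,
     lambda ob: dict(ob)),
]

def convert(ob, quality):
    # Exactly one rule should apply for any (ob, quality) combination.
    applicable = [conv for cond, conv in rules if cond(ob, quality)]
    assert len(applicable) == 1, "ambiguous or missing rule"
    return applicable[0](ob)

class Thing:
    def __json__(self):
        return {"kind": "thing"}

print(convert(Thing(), quality=30))  # prints: {'kind': 'thing'}
```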

One thought that's occurred to me is that if AmbiguousMethod errors 
included a friendly text explanation along the lines of, "hey, you 
forgot to define what happens if X and Y are both true; maybe you 
need a method with this rule: ...", and spat out what you'd actually 
need to put in your rule to make it more specific, that would be more 
immediately helpful than making somebody figure it out for 
themselves, as is necessary now.

I guess I'd rather people unthinkingly added more-specific rules, 
than unthinkingly added priorities.  ;-)

IOW, if an ambiguous method error actually *said*: you need a rule 
for the "isinstance(ob, Foo) and hasattr(ob, '__json__')" case, 
that'd be a lot more actionable than, "hey these two things are 
unclear...  you should do something about that." ;-)
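The suggested condition could be built by just and-ing together the conflicting rules' predicate strings. A rough sketch of the idea (this error class and its attributes are made up for illustration, not current PEAK-Rules internals):

```python
class AmbiguousMethods(Exception):
    def __init__(self, conditions):
        # conditions: the textual predicates of the conflicting rules
        self.conditions = list(conditions)
        suggestion = " and ".join("(%s)" % c for c in self.conditions)
        super().__init__(
            "Ambiguous methods; you may need a rule for the case: %s"
            % suggestion
        )

err = AmbiguousMethods(["isinstance(ob, Foo)", "hasattr(ob, '__json__')"])
print(err)  # message ends with the combined condition
```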

That's a whole different ball of wax from knowing in the first place 
that your use case is "select amongst applicable prioritized things, 
and I don't care if equal-priority ones are chosen 
unpredictably".  For that, you can just use the argument approach or 
an @expand_as hack, or even define a MethodList subclass that uses 
priorities (instead of serial numbers) as a disambiguator in its sorting.

For that matter, here's Yet Another Priority Hack:

    @struct
    def prio(pri, pred):
        return pri, pred

    @when(parse_rule, (object, prio, context, cls))
    def parse_prioritized(engine, predicate, context, cls):
        rule = parse_rule(engine, predicate.pred, context, cls)
        return Rule(rule.body, rule.predicate, rule.actiontype, predicate.pri)

Then calling when(somefunc, prio(3, "blah")) makes 3 the sorting 
disambiguator...  though of course you'd then have to have *all* your 
methods prioritized for that to work (the default sequence numbers 
would be quite high).  But then, that would do just fine for 
something you knew in advance you wanted to prioritize.
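Outside of PEAK-Rules, the disambiguator idea boils down to sorting rules by priority before falling back on serial numbers. A minimal stand-alone sketch of that sort key (the `make_rule` helper is hypothetical, just for illustration):

```python
from itertools import count

_serial = count()

def make_rule(func, priority=0):
    # Lower sort key wins, so negate the priority; the serial number
    # breaks remaining ties, which is why unprioritized rules
    # (priority=0) fall back to registration order.
    return (-priority, next(_serial)), func

rules = [make_rule(lambda: "default"),
         make_rule(lambda: "urgent", priority=3),
         make_rule(lambda: "soon", priority=1)]

winner = min(rules, key=lambda r: r[0])[1]
print(winner())  # prints: urgent
```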

I guess in summary, my feeling at this point is: there are use cases 
for priorities, but IMO they should be use-case specific, and the 
documentation should propose a couple different ways to do it for 
different sorts of use cases.  But I don't think PEAK-Rules should 
actually *provide* a built-in priority mechanism, since there are so 
many ways to implement a custom one in only a few lines -- heck, the 
argument-based one doesn't require anything except adding an argument 
to your function and methods...  which is just costly enough to make 
you consider whether you should be using it or not.  ;-)

I think, though, that we're basically in agreement that TurboJSON is 
*not* a valid use case for priorities in the first place: at worst, 
it's a use case for being able to replace or delete rules.

(By the way, I added gf.clear() to the current RuleDispatch 
emulation, but the PEAK-Rules equivalent is just 
"rules_for(f).clear()"...  so if someone really wants to start over, they can.)

ISTM that ambiguity in rules comes from three possible sources:

1. The use case itself involves ambiguity (e.g. "conversion quality")

2. The user has overlooked rules that can overlap (e.g. that 
hasattr('__json__') can occur at the same time as something else)

3. The function will be extended by users, who sometimes want to 
replace or override default rules

We can improve on #2 via better error messages and verification 
tools.  #3 can be fixed with a new 'default' method decorator, and/or 
a straightforward way to say you mean to monkeypatch an existing method.

#1 should really be explicitly declared, perhaps by some sort of 
explicit extension to the @combine_using decorator, or a decorator of 
its own.  (Perhaps a @select_using decorator that operates on the 
level of iterating uncalled methods+metadata, rather than over the 
iteration of results?)

Anyway, at this point I'm strongly leaning towards taking priority() 
out of the PEAK-Rules distribution, and replacing it with:

1. A cleaner way to remove or replace existing method(s)

2. An explanation of how to use the "quality>20" approach in the 
documentation, as part of a general explanation of GF design best practices

3. Better error messages from AmbiguousMethods, that explain exactly 
what case is not covered (and therefore, what condition you should 
use on a disambiguating method)

4. (Maybe) a default() method type for rules that should be ignored 
in case of ambiguity (thereby reducing the need for #1 to be a 
super-easy/wonderful API)

5. (Maybe) Other things such as coverage checker/verifiers.

6. (Maybe) a @select_using mechanism, if we can come up with a 
satisfactory API.  Perhaps something where you define a type for the 
metadata, and then *args and **kw from the when() decorators are 
passed into your metadata type constructor, and then your iterators 
receive (metadata, callable) pairs, e.g.:

def my_metadata_constructor(priority=0):
    return priority

@combine_using(...)
@select_using(my_metadata_constructor, partial(max, key=itemgetter(0)))
...

@when("condition", 20)   # or @when("condition", priority=20)

This still needs a lot of thought, preferably directed at some 
clearer use cases.
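To make the proposed shape a bit more concrete, here's a stand-alone simulation of the @select_using idea. The metadata constructor and selector are as sketched above, but the registry and dispatch below are a toy of my own, not anything PEAK-Rules actually does:

```python
from functools import partial
from operator import itemgetter

def my_metadata_constructor(priority=0):
    return priority

# Selector: pick the (metadata, callable) pair with the largest metadata.
selector = partial(max, key=itemgetter(0))

# Simulated registry: each entry is (metadata, callable), where the
# metadata came from passing the when() extras through the constructor.
registry = [
    (my_metadata_constructor(priority=20), lambda: "high"),
    (my_metadata_constructor(), lambda: "default"),
]

print(selector(registry)[1]())  # prints: high
```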


