[PEAK] Variable binding and pattern matching

Phillip J. Eby pje at telecommunity.com
Mon Jan 3 12:17:01 EST 2005


So, the two other features I want to add to generic functions that I don't 
have a solid syntax for are variable binding, and pattern matching.

Variable binding is just the idea that you have expressions in a rule that 
you want to be kept for the benefit of the method, so it doesn't have to 
recompute them.

Pattern matching is actually the same thing, but the variable needs to be 
associated with a *condition* or test to be applied to the expression, not 
to the expression itself.  This is because in pattern matching, the actual 
expressions are abstracted, e.g. in this expression::

     expr in Call.match( [isinstance], match(inst=object), match(cls=Const) )

The idea is that 'inst' and 'cls' get bound to the first and second 
parameters of the Call object matched by 'expr in Call.match(...)'.  That 
is, the above expression tests that 'expr' is an instance of 'Call' whose 
function is 'isinstance', taking two arguments, the first being any object 
and the second being a 'Const'.  (This is sort of like named groups in a 
regular expression.)  If 'cls' is *not* a 'Const' or the function of the 
'Call' isn't 'isinstance', the outer test 'expr in Call.match(...)' 
*fails*, so the variables don't get bound and the overall rule doesn't 
match.  (Which means the associated method isn't called, which is good 
because it was expecting to get 'inst' and 'cls' parameters.)
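To make the all-or-nothing binding behaviour concrete, here is a minimal sketch in plain Python. It is not PEAK's implementation: the 'Call' and 'Const' classes, the 'bind()' helper and the 'match_call()' function are all hypothetical stand-ins invented for illustration. It only shows the idea that a match either yields every named variable or yields nothing.

```python
class Call:
    """Hypothetical stand-in for an expression node: a function plus arguments."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args


class Const:
    """Hypothetical stand-in for a constant expression node."""
    def __init__(self, value):
        self.value = value


def bind(**kw):
    """Describe one argument slot, e.g. bind(cls=Const): a name and a required type."""
    (name, typ), = kw.items()   # exactly one keyword is expected per slot
    return name, typ


def match_call(expr, func, *slots):
    """Return a dict of bound variables, or None if 'expr' does not match."""
    if not isinstance(expr, Call) or expr.func is not func:
        return None
    if len(expr.args) != len(slots):
        return None
    bound = {}
    for arg, (name, typ) in zip(expr.args, slots):
        if not isinstance(arg, typ):
            return None         # one failed slot fails the whole match
        bound[name] = arg
    return bound


good = Call(isinstance, "x", Const(int))
bad = Call(isinstance, "x", "not a Const")

matched = match_call(good, isinstance, bind(inst=object), bind(cls=Const))
unmatched = match_call(bad, isinstance, bind(inst=object), bind(cls=Const))
```

Here 'matched' carries both 'inst' and 'cls', while 'unmatched' is None, so a method expecting those two variables would never be called with only one of them.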

Anyway, so that's one possible syntax for pattern matching.  Another is:

     expr in Call.match( [isinstance], vars.inst, vars.cls[Const] )

which is more compact, but perhaps less obvious.  Other alternatives might 
use different names::

     expr in Call.match( [isinstance], has(inst=object), has(cls=Const) )
     expr in Call.match( [isinstance], bind(inst=object), bind(cls=Const) )
     expr in Call.match( [isinstance], as(inst=object), as(cls=Const) )
     expr in Call.match( [isinstance], get(inst=object), get(cls=Const) )

or how about one of these:

     expr in Call.match( [isinstance], vars.inst, vars.cls in Const)

     expr in Call.match( [isinstance], vars.inst, vars.cls % Const)

     expr in Call.match( [isinstance], vars.inst, vars.cls & Const)

     expr in Call.match( [isinstance], vars.inst, vars.cls | Const)

     expr in Call.match( [isinstance], vars.inst, vars.cls << Const)

     expr in Call.match( [isinstance], vars.inst, Const >> vars.cls)


One problem, however, with most of these syntaxes is that they likely 
require parentheses around the test portion of the pattern match, e.g.

     expr in Call.match( [isinstance], vars.inst, vars.cls | (Const))

at least when the test is anything more complex than the bare name shown here.
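The parenthesization issue can be checked with the standard 'ast' module. This sketch assumes the rule strings follow Python's normal operator precedence. The helper 'top_level_operator()' and the names 'A' and 'B' are made up for the example. It shows that with a compound test, the binding operator ends up grouped with only part of the test unless parentheses are added.

```python
import ast


def top_level_operator(source):
    """Return the class name of the outermost binary operator in 'source'."""
    node = ast.parse(source, mode="eval").body
    return type(node.op).__name__


# '&' binds tighter than '|', so this parses as (vars.cls & A) | B:
# the variable is attached to 'A' alone, not to the whole test.
unparenthesized = top_level_operator("vars.cls & A | B")

# With parentheses, '&' is outermost and its right side is the full test.
parenthesized = top_level_operator("vars.cls & (A | B)")
```

The same check can be run for '%', '<<' and the other candidates. Which ones need parentheses depends on where each operator sits relative to the operators used inside the test.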

(Also, we could use other names besides 'vars.foo', such as 'match.foo' or 
'bind.foo'.)


Anyway, variable binding has similar issues, but is both more and less 
complex, because the associated value is an expression, not a test.  E.g.:

     let(sum_sq=sum(x*x for x in foo)) in (sum_sq < bar*bar)

Of course, this is yet another overloaded meaning for "in", which already 
has two meanings in generic functions: the normal "sequence contains" 
meaning, and the "member of the set described by this test" meaning.

Other possible binding syntaxes:

     let(sum_sq=sum(x*x for x in foo)) and sum_sq < bar*bar

     vars.sum_sq(sum(x*x for x in foo)) < bar*bar

Ugh.  The 'vars' thing just doesn't work well for this, because one of the 
uses of variable binding is to make complex expressions easier to read, and 
the 'vars' approach here seems to make it more complicated.

Using 'and' instead of 'in' has some advantages from a parsing point of 
view, but it's less clear from a reading point of view.  So, it looks to me 
like 'let(**kw) in (conditions)' is the winner from a legibility standpoint.

Of course, the other alternative is to allow 'when()' to take keyword 
arguments, and use that approach instead.

Either way, the idea is that any variable defined this way can be passed to 
the method as an extra argument, e.g.:

     @something.when("let(sum_sq=sum(x*x for x in foo)) in (sum_sq < bar*bar)")
     def do_something((sum_sq,),self,foo,bar):
         # etc.

So that the computed 'sum_sq' variable can then be used in the function 
body without recomputing it.  Alternate spelling:

     @something.when("sum_sq < bar*bar", sum_sq="sum(x*x for x in foo)")
     def do_something((sum_sq,),self,foo,bar):
         # etc.
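As a rough illustration of the keyword-argument spelling, here is a toy 'when()' decorator. It is a sketch under several assumptions, not the generic function machinery itself: it handles a single rule with no dispatch between methods, it evaluates the strings with 'eval()', and it takes the method's arguments as keywords. The bound values are passed as a leading tuple, and because current Python no longer accepts tuple parameters in a 'def', the method unpacks that tuple in its body.

```python
def when(condition, **bindings):
    """Toy decorator: compute each binding once, test the condition, and
    pass the bound values to the method as a leading tuple."""
    def decorate(method):
        def rule(**args):
            env = dict(args)
            for name, source in bindings.items():
                env[name] = eval(source, env)    # each binding computed once
            if not eval(condition, env):
                return None                      # rule does not apply
            bound = tuple(env[name] for name in bindings)
            return method(bound, **args)
        return rule
    return decorate


@when("sum_sq < bar*bar", sum_sq="sum(x*x for x in foo)")
def do_something(bound, foo, bar):
    (sum_sq,) = bound    # reuse the value computed for the test
    return sum_sq


applied = do_something(foo=[1, 2], bar=3)    # 1 + 4 = 5, and 5 < 9
skipped = do_something(foo=[3, 4], bar=3)    # 9 + 16 = 25, not < 9
```

In the first call the method receives the 'sum_sq' value that was computed for the test. In the second call the condition fails and the method is not called at all.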

Similarly, variables bound by pattern matching should also be usable as 
extra arguments to a method:

     @compileExpr.when(
         "expr in Call.match( [isinstance], vars.inst, vars.cls & Const)"
     )
     def compile_isinstance( (inst,cls), expr, test ):
         # etc.

...thus saving the body from having to re-extract the individual pieces of a 
data structure that were already extracted in order to compute the test 
expression.

So, does anybody have any thoughts on what syntax looks better for each of 
these features?  What syntax is more "obvious"?




