[PEAK] Traversal, views, and templates = PWT, Reloaded

Sun Aug 22 02:51:03 EDT 2004

As I thought further about the plan in my last post, I discovered some 
interesting possibilities for simplifying typical DOMlet usage in 
peak.web.templates.

This is a fairly long post, but important to read if you're currently using 
peak.web.templates (.pwt's), or would like to in the future.  It describes 
a significant overhaul for PWT to take advantage of the coming "view 
registration" and "traversal namespace" mechanisms in peak.web.  It also 
deals briefly with some likely directions for PWT widget development, and 
how the "view registration" mechanism will be implemented.

Moving to Views in PWT
----------------------

Currently, we have 'text', 'text.notag', 'xml', and 'xml.notag' DOMlets for 
displaying text.  They work by doing the equivalent of 
'str(current_object)', so they don't really help for e.g. embedded 
templates.  But, now that we have "alternative" traversals available, there 
doesn't seem to be a reason to distinguish between text and XML, or between 
text and some sort of embedded view or the attributes of an object.

Let me be more specific.  Suppose we have a DOMlet called "show", and what 
"show" does is traverse to the specified data item, adapt it to 
IDOMletNode, and invoke its 'renderFor()' method.  The immediate 
consequence of this is that any view or attribute that represents a DOMlet 
or template, is automatically embedded at the point of contact.  And, if we 
define default DOMlet adapters for strings and Unicode objects, then any 
string attribute of an object is trivially renderable.

But what about non-strings?  What about numbers and dates?  Ah, that's 
where the view mechanism comes in.  Let's say we register a view named 
'short_date' for type 'datetime.datetime', that uses the request's locale 
to look up a short date format, and formats the given date as a 
string.  Poof!  We now have an instant date formatting widget, which we 
could use in a template as 'pwt:domlet="show:due_date/@@short_date"'.  That 
is, "apply the 'show' DOMlet to the 'short_date' view of the current 
object's 'due_date' attribute."
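
As a rough illustration only (plain Python 3, not peak.web code), the 
traverse-then-render idea might be sketched like this.  The 'VIEWS' 
registry, 'register_view', 'traverse' and the 'show' function are all 
made-up names, and a fixed date format stands in for the locale lookup:

```python
import datetime

# Hypothetical view registry: maps (type, view name) to a formatting function.
VIEWS = {}

def register_view(for_type, name, factory):
    VIEWS[(for_type, name)] = factory

def find_view(ob, name):
    # Walk the MRO so a view registered for a base class covers subclasses too.
    for klass in type(ob).__mro__:
        if (klass, name) in VIEWS:
            return VIEWS[(klass, name)]
    raise LookupError(name)

def traverse(ob, path):
    """Follow 'a/b/@@view': plain names are attributes, '@@' names are views."""
    for step in path.split("/"):
        if step.startswith("@@"):
            ob = find_view(ob, step[2:])(ob)
        else:
            ob = getattr(ob, step)
    return ob

def show(ob, path):
    """Stand-in for the 'show' DOMlet: traverse, then render strings as text."""
    target = traverse(ob, path)
    if not isinstance(target, str):
        raise TypeError("no default rendering for %r" % type(target))
    return target

# A fixed format stands in for the request-locale lookup described above.
register_view(datetime.datetime, "short_date", lambda d: d.strftime("%Y-%m-%d"))

class Task:
    due_date = datetime.datetime(2004, 8, 22, 2, 51)

print(show(Task(), "due_date/@@short_date"))
```

A date reaches 'show' only after a view has turned it into a string, which 
is the point of the design: formatting lives in views, not in the DOMlet.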

Not bad, not bad at all.  But what happens if there's no 'due_date' 
attribute, or no 'short_date' view?  Or a permission problem?  And, do we 
need a 'show.notag' variant?

Hm.  It's almost enough to make you want more 'pwt' attributes besides 
'domlet' and 'define'.  For example, we could have 'pwt:insert' and 
'pwt:replace', which either insert something within the current element, or 
replace the entire element (a la the '.notag' DOMlets).  Then, we could 
allow the default DOMlet to be "show", so normally you would only be doing 
things like 'pwt:replace="due_date/@@short_date"' instead of 
'pwt:domlet="show.notag:due_date/@@short_date"'.

The basic idea here is that pwt:replace would work like pwt:domlet does 
now, but pwt:insert would create the DOMlet as a nested element within the 
element where the pwt:insert attribute appears.  The factory for this 
nested DOMlet would have to support a new 'INestedDOMletFactory' interface, 
or some such, to indicate that it can be used in this fashion.  Then, all 
children of the element with the 'pwt:insert' attribute will be placed 
inside the nested DOMlet, and the nested DOMlet will be placed inside the 
element with the 'pwt:insert'.

I'm not sure if it will make practical sense to allow both 'insert' and 
'replace' on the same element, although in principle it could make sense to 
do so.

By adding this third attribute (fourth, if we allow 'domlet' for backward 
compatibility), we can now simplify the common case of template 
interpolation to 'pwt:xxx="some_attr/@@some_view"'.  Of course, actual 
widgets like the current 'list' and the contemplated 'tree', 'menu', etc. 
widgets will still need to be specified by prefixes, such as 
'pwt:replace="list:some_attr"'.

Hmm...  wait a minute.  Suppose we just made the 'pwt:' attribute namespace 
into a property lookup for a new kind of DOMlet factory?  Then, it would 
make sense to say something like 'pwt:list="some_attr"'.  And even 
'pwt:define' could be implemented by this mechanism.  As far as I can tell, 
the XML spec allows for '.' in attribute names, so 'pwt:myapp.something' 
should be a valid attribute.

Interesting.  At this point, PWT is no longer a minimalistic two-attribute 
syntax, but is more convenient to type, while remaining far more extensible 
than other attribute-based syntaxes.  We'll have to have some sort of 
prioritization mechanism, though, so that multiple attributes on the same 
element are ordered sensibly.  In principle, XML attributes *should* remain 
in the order you specify them, but the XML specification explicitly says 
that attribute order is not significant, so if you run a PWT file through 
editors or processing of various kinds your order might get munged.  Having 
a priority mechanism between transformers will keep the mechanism straight.
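
One way such a priority scheme could look, as a sketch in plain Python 3. 
The 'PRIORITY' table and its numbers are invented for illustration; the 
post does not fix any actual values:

```python
# Hypothetical priorities: lower numbers run first.  The names mirror the
# attributes discussed in this post, but the numbers are made up.
PRIORITY = {"define": 10, "on-error": 20, "list": 50, "replace": 50, "insert": 60}

def ordered_handlers(attrs):
    """Return (name, value) pairs sorted by handler priority, then by name,
    so the result does not depend on the order the parser delivered them in."""
    unknown = [name for name in attrs if name not in PRIORITY]
    if unknown:
        raise KeyError("no handler registered for: %s" % ", ".join(sorted(unknown)))
    return sorted(attrs.items(), key=lambda item: (PRIORITY[item[0]], item[0]))

# The same two attributes, delivered in either order, give the same plan.
a = ordered_handlers({"replace": "item_name", "define": "listItem"})
b = ordered_handlers({"define": "listItem", "replace": "item_name"})
print(a == b)
```

Breaking ties by name is just one deterministic choice; any stable rule 
would do, as long as it ignores document order.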

PWT vs. ZPT?
------------

You know what's *really* weird?  Once PWT "dissolves" in this fashion, 
there's nothing stopping you from implementing Zope's entire "TAL" 
attribute language, if that's what you really want.  Indeed, ZPT already 
has 'tal:content' and 'tal:replace' that are similar to my hypothetical 
'pwt:insert' and 'pwt:replace'.  The difference is that ZPT requires you to 
specify that a replacement or content is "structure" or "text".  For me, I 
would just as soon use views to make that distinction, treating strings and 
Unicode as text by default.

Although I like a lot of the features in ZPT's TAL and TALES languages, I'm 
worried that many of their features stray into scripting-land, while other 
features seem to lack important functionality.  Specifically, I like PWT's 
ability to use chunks of XML as parameters much better than ZPT's tendency 
towards string manipulation.  For example, in ZPT, 'tal:on-error' is an 
expression to use in place of the current content if there is a failure 
anywhere within the current XML element.  If I implemented a 
'pwt:on-error', it would work in almost exactly the opposite way: 
'pwt:on-error="NotFound,NotAllowed"' would indicate that the marked-up 
element would be used in case of the corresponding errors occurring while 
executing the nearest *surrounding* insert/replace operation.

So, although the idea of implementing ZPT's TAL/TALES/METAL in PWT is 
somewhat attractive from a documentation point of view (i.e., not having to 
write docs!), I think I'm going to leave it to be a "community-contributed" 
extension if anybody really wants it, and focus on making an excellent PWT 
system.  The key is that PWT-style attributes should generally act like 
function invocation, taking a data path and a collection of 
specially-marked child XML elements as parameters.

So, it sounds like the best course of action is to do another draft of the 
DOMlet factory architecture, such that individual XML attributes can 
perform transformations on an XML element, and the handlers for those 
attributes have some kind of priority scheme to determine evaluation 
order.  There then need to be various architectural affordances to allow 
implementation of things like error handlers, parameters, and so forth.

I expect that it will become relatively uncommon to create custom DOMlets, 
and when you do create them, you'll have to put them in some XML namespace 
other than the 'pwt' one.  Still, once we have basic UI widgets, they can 
be made to do lots of different things.  I believe I've mentioned 
previously how a simple "selected items" widget can implement such diverse 
things as menus, dropdowns, tabbed interfaces, and such, all by using 
different HTML snippets as parameters to the widget.  So, there will 
probably be half a dozen or so commonly used widgets for lists, trees, data 
tables, form layouts, tabbed pages, and so forth.  Once we have these, 
there'll be little call for DOMlets as such.  More likely, you'll create 
much simpler "view" components to do text formatting, or create .pwt 
templates to define layout macros or XML views.

Another Idea: Insert vs. Replace
--------------------------------

A common theme throughout this has been the need for choice between 
applying something to an element, and applying to its contents.  I wonder 
if we could use XML namespaces for this?  Perhaps 'this' and 'content' used 
as prefixes would make for nice spelling.  An example:

     <ul content:list="some_attr">
     <li this:is="listItem" content:replace="item_name">Text will go here</li>
     </ul>

In other words, "the content of this UL element will be replaced with a 
'list' rendering of the data  found at 'some_attr', and the list will 
receive a 'listItem' parameter that is the '<li>' element, which will have 
its contents replaced by the 'item_name' attribute of each item in the 
list."  When this is rendered, it'll look something like:

     <ul>
     <li>foo</li>
     <li>bar</li>
     </ul>

The idea here is that using a DOMlet from the 'content' namespace applies 
it to the collection of child elements of the element where it appears, 
while using one from the 'this' namespace applies it to the element 
itself.  This would be orthogonal to the definition of DOMlets themselves, 
which would still be pulled from a single property namespace.  In other 
words, the same attributes would be available under both 'content' and 
'this', and have the same meaning either way, except for what object they 
apply to.  (Except for some DOMlets that might not be able to function on a 
collection of child nodes.)

Let's try an example that generates text differently...

     <div this:list="some_attr">
      <span content:is="listItem">This is the item: <span this:replace="item_name" /><br /></span>
     </div>

When this is rendered, it'll look something like:

     This is the item: foo<br />
     This is the item: bar<br />

etc.  The 'div' goes away because the 'list' is replacing it, rather than 
replacing its contents.  The outer span doesn't appear, because its 
*content* is the "listItem" parameter.  And of course the inner span is 
replaced entirely by each list item's name.  As you can see, we used the 
exact *same* three DOMlets ('is', 'list', and 'replace'), to produce an 
entirely different effect.
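
To make the "same DOMlet, different target" rule concrete, here is a small 
sketch using the standard library's ElementTree.  The namespace URIs and 
the 'domlet_attributes' helper are assumptions for illustration; it only 
classifies attributes and does not render anything:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the post does not define any.
THIS = "urn:example:pwt-this"
CONTENT = "urn:example:pwt-content"

def domlet_attributes(element):
    """Return sorted (name, path, target) triples for an element's
    'this:' and 'content:' attributes."""
    found = []
    for qname, value in element.attrib.items():
        if not qname.startswith("{"):
            continue                      # ordinary, un-namespaced attribute
        uri, local = qname[1:].split("}", 1)
        if uri == THIS:
            found.append((local, value, "element"))
        elif uri == CONTENT:
            found.append((local, value, "children"))
    return sorted(found)

template = ET.fromstring(
    '<ul xmlns:this="%s" xmlns:content="%s" content:list="some_attr">'
    '<li this:is="listItem" content:replace="item_name">Text will go here</li>'
    '</ul>' % (THIS, CONTENT)
)
ul_plan = domlet_attributes(template)
li_plan = domlet_attributes(template[0])
print(ul_plan)
print(li_plan)
```

The lookup of 'list', 'is' and 'replace' would be identical in both 
namespaces; only the third element of each triple changes.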

It's not a *perfect* syntax, though.  For example, 'this:list' doesn't 
really convey very well that the element will be replaced.  'replace:list' 
would be more meaningful there, but using 'replace:' as a namespace for 
parameter definition doesn't make much sense.  The other hangup is that 
you'll have to define two XML namespaces in each template document, unless 
we have some way to do it by default.

How well does it work for error handling?  Using 
'content:on-error="NotFound"' would mean that the content would be used as 
an error handler for 'NotFound' errors occurring in an enclosing block, 
possibly the same element.  And 'this:on-error="NotFound"' would mean that 
the entire item is used as an error handler for some enclosing block, but 
*not* the same element.  So this:

     <span this:replace="some_attr" content:on-error="NotFound">No such attribute</span>

means that if accessing 'some_attr' causes a 'NotFound' error, the "No such 
attribute" message will appear instead of the attribute value.  Or, you 
could do this:

     <span this:replace="some_attr">
     <em this:on-error="NotFound">No such attribute!</em>
     </span>

Which registers the '<em>' block as the 'NotFound' error handler for the 
enclosing 'this:replace'.  In practice, you may want to use an 
application-specific error handler component, so you might say instead:

     <span this:replace="some_attr">
     <span this:on-error="NotFound,NotAllowed" this:my.app.handler="whatever">
     Some text that's maybe used by the handler...  maybe even a
     <span content:is="someParam">parameter</span> or two for the handler.
     </span>
     </span>

In other words, "replace the outermost 'span' element with the value of 
'some_attr', but if 'NotFound' or 'NotAllowed' are raised, invoke the 
'my.app.handler' DOMlet that was constructed using the next inner span and 
its contents."

Whew.  That last example was a bit squirrelly, but mainly because there's a 
lot going on.  Individually, though, each attribute seems to make a fair 
amount of sense.

This also gives us a higher-level organization to use for prioritizing 
attributes: we can divide attributes into "registration" attributes that 
just package up the element or its contents and register it in some way 
with a containing element, and "replacement" attributes that replace the 
element or its contents.  An element can only have one "replacement" 
attribute, but any number of "registration" attributes.  The registration 
attributes get invoked after the replacement attribute, in no particular 
order (since the registrations are independent of one another).  It should 
be an error to specify more than one replacement on an element, even if one 
is for the content and the other is for the element itself.  (I can't think 
of any use cases for that; can you?)
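
The rule can be sketched as a small validation step.  This is plain 
Python 3 with assumed names ('plan_element', 'TemplateError') and an 
assumed split of attribute names into the two kinds:

```python
REPLACEMENT = {"replace", "list", "optional"}   # assumed classification
REGISTRATION = {"is", "on-error"}

class TemplateError(Exception):
    pass

def plan_element(attrs):
    """attrs is a list of (namespace, name, value) triples, where namespace
    is 'this' or 'content'.  Returns (replacement, registrations)."""
    replacements, registrations = [], []
    for attr in attrs:
        name = attr[1]
        if name in REPLACEMENT:
            replacements.append(attr)
        elif name in REGISTRATION:
            registrations.append(attr)
        else:
            raise TemplateError("unknown attribute: %s:%s" % tuple(attr[:2]))
    if len(replacements) > 1:
        # Applies even when one targets the element and the other its content.
        raise TemplateError("more than one replacement attribute on an element")
    replacement = replacements[0] if replacements else None
    return replacement, registrations

print(plan_element([("content", "on-error", "NotFound"),
                    ("this", "replace", "some_attr")]))
```

The caller would invoke the returned replacement first and then the 
registrations, in whatever order they happen to be listed.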

Attributes, Layout, and Parameters
----------------------------------

We've come a long way, so I'll go back to recap the concept of the new 
template system:

  * There will be two XML namespaces, colloquially referred to as 'this' 
and 'content'.

  * There will be a single property namespace used to look up the 
implementations of attributes in these two XML namespaces.

  * Attribute implementations will be either "registration" or 
"replacement" attributes.  At most one "replacement" attribute is permitted 
per XML element.

  * The "replacement" attribute, if any, will be used to create the DOMlet 
for that element.  If the 'content' namespace is used, then the created 
DOMlet will actually be nested just inside the element where the attribute 
appears, and all of the child nodes that would have been added to the 
element are instead added to the "replacement" DOMlet.

  * The "registration" attributes, if any, will be supplied with the 
element where they appear, unless they are in the 'content' namespace, in 
which case they will be supplied an element corresponding to the entire 
contents of the element where they appear.  (If there was a "replacement" 
attribute, this "contents" element will be the "replacement" DOMlet, 
otherwise an extra "tagless element" node will be created.)

  * An 'is' "registration" attribute will replace the current 'pwt:define' 
attribute, and a 'replace' attribute will replace most uses of the existing 
'pwt:domlet' attribute.  Other new "replacement" attributes will be used 
instead of the values previously specified in 'pwt:domlet' 
attributes.  E.g. 'content:list="foo"' will do the same thing that 
'pwt:domlet="list:foo"' used to do, and 'this:replace="bar"' will take the 
place of 'pwt:domlet="text.notag:bar"'.

  * New "registration" attributes ('this:on-error' and 'content:on-error') 
will be created for handling errors in the execution of the nearest 
enclosing "replacement" attribute.  They will basically register their 
target node as an error handler for the appropriate types, with the nearest 
enclosing DOMlet that accepts error handler registrations.  We'll probably 
also have a "replacement" attribute called 'optional', to refer to a 
replacement that should simply be omitted if it isn't found or isn't 
allowed for the current user.  I expect that handling such 
missing/not-allowed items would be such a common use case for 'on-error' 
handlers that having a shortcut available is a good idea.
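
A sketch of how the error-handler registration and the 'optional' 
shortcut might fit together, in plain Python 3.  The 'Scope' class and the 
local 'NotFound'/'NotAllowed' exceptions are stand-ins, not peak.web 
classes, and fallback content is reduced to a plain string:

```python
class NotFound(Exception):
    pass

class NotAllowed(Exception):
    pass

class Scope:
    """One "replacement" operation plus the error handlers registered with it."""

    def __init__(self, render):
        self.render = render      # zero-argument callable returning text
        self.handlers = {}        # exception class -> fallback text

    def on_error(self, error_types, fallback):
        # What a 'this:on-error' / 'content:on-error' attribute would do.
        for exc_type in error_types:
            self.handlers[exc_type] = fallback

    def __call__(self):
        try:
            return self.render()
        except (NotFound, NotAllowed) as exc:
            for exc_type, fallback in self.handlers.items():
                if isinstance(exc, exc_type):
                    return fallback
            raise                 # no handler here; let an outer scope try

def optional(render):
    """Shortcut: omit the output entirely if the item is missing or forbidden."""
    scope = Scope(render)
    scope.on_error([NotFound, NotAllowed], "")
    return scope

def missing():
    raise NotFound("some_attr")

guarded = Scope(missing)
guarded.on_error([NotFound], "No such attribute")
print(guarded())
print(repr(optional(missing)()))
```

Because an unhandled error is simply re-raised, nesting one 'Scope' inside 
another's 'render' gives the outer handlers a chance without extra code.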

The above items don't cover layouts/macros, which we'll also need a way to 
handle.  A simple solution is to have the 'replace' attribute supply the 
invoked DOMlet with parameters for all of the 'is'-tagged items within the 
'replace' attribute's scope.  It could do this by creating an object whose 
attributes were the parameters.  Then, normal 'replace="attrname"' 
attributes can be used in the "macro" or "layout" template to insert the 
supplied arguments.

It would be nice if we could also pass in arbitrary paths as named 
parameters.  One possible way of doing so would be to add a 'with:' 
namespace to define parameters, such that 'with:foo="bar"' means "supply 
the enclosing replacement with the current value of 'bar' as a parameter 
named 'foo'."  For example:

    <div this:replace="@@standard_layout" with:title="summary">
        <div this:is="menu" ...>
        </div>
        <div this:is="body" ...>
        </div>
    </div>

Here, we've said that the entire thing is going to be replaced with the 
'standard_layout' view of the current object.  The 'standard_layout' view 
will be supplied with three parameters: 'title' (which will be the 
'summary' attribute of the current object), and 'menu' and 'body' which 
will be DOMlets representing the corresponding sections of the page as 
defined.  In the standard layout template, we may have something like:

     <head><title content:replace="title">Title goes here</title></head>
     <body><h1 content:replace="title">Title goes here</h1>
     ...
     <div this:replace="menu" ...> ...  </div>
     ...
     <div this:replace="body" ...> ...  </div>

and so on.  If the template needs access to the object that was "current" 
at the time it was invoked, it can use '..' as a path to move above the 
"parameters" object.  We can also make the parameters lazy, in that 
path-expression parameters don't need to be evaluated until or unless 
they're actually used.  (DOMlet parameters are implicitly lazy, since they 
have to be invoked for anything to happen.)  [Open issues: how do we deal 
with multi-valued parameters?  Missing ones?  What happens when parameters 
are passed back to a DOMlet that was passed in as a parameter?  And so on...]
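
The lazy "parameters" object could be sketched roughly as follows (plain 
Python 3).  The 'Parameters' class is an assumption, and its 'parent' 
attribute plays the role of the '..' path step:

```python
class Parameters:
    """Hypothetical parameters object: attributes are the supplied arguments,
    path-expression arguments are evaluated on first use, and 'parent' leads
    back to the object that was current when the template was invoked."""

    def __init__(self, parent, **thunks):
        self.parent = parent
        self._thunks = thunks     # name -> zero-argument callable
        self._cache = {}

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._cache:
            try:
                thunk = self._thunks[name]
            except KeyError:
                raise AttributeError(name) from None
            self._cache[name] = thunk()
        return self._cache[name]

class Page:
    summary = "Weekly summary"

calls = []
page = Page()

def title():
    calls.append("title")         # record each real evaluation
    return page.summary

params = Parameters(page, title=title)
first = params.title              # evaluated here, on first use
second = params.title             # served from the cache
print(first, calls)
```

A missing parameter surfaces as an ordinary AttributeError in this sketch; 
how it should behave in templates is one of the open issues above.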

Hm.  This is looking pretty good, though.  So far the syntax has been very 
natural, just rolling off my head and onto my fingers.  :)  I find my 
thought process goes, "Do I want to replace this or its contents?" and then 
I type 'this' or 'content', followed by the thing I want to do.  Then I go 
back to look at DOMlet parameters and tag them with 'this:is' or 
'content:is', and finally fill in the 'with:paramname="data"' attributes 
for any object arguments the replacement operation needs.

What I haven't seen yet is how annoying it may or may not be to specify the 
XML namespaces -- all *three* of them!  I imagine that I'll probably just 
cut and paste them from previous templates, but I think maybe I should set 
up the parsing machinery so the namespaces default to existing with the 
correct definitions.  If you're not using tools that care about the 
namespaces being defined, it's a waste of time to define them.  But, if you 
do use tools that need them to be present and correct, you can easily add 
them in.

Browsers will be unaffected, of course, because "replacement" and 
"registration" attributes are all removed from the document model during 
parsing.  This means that PWT isn't truly round-trippable the way ZPT 
is.  Actually, even if we left the attributes in, and added an equivalent 
to ZPT's "define-macro" to mark the insertion of other templates, and wrote 
a program to split such a page back into its component templates, we'd 
*still* be missing any elements that were "replaced" via a 'this:' 
attribute.  Anyway, the main use case I'm aware of for round-tripping 
templates is to feed back actual sample data into your visual design, and 
in the worst case that can be done with careful copy and paste.  The main 
goal of PWT's design is to make it so darn easy to mark up an existing 
XHTML document and turn it into a layout that you don't mind doing it 
repeatedly if necessary.  :)

[Open issue: we'll probably want i18n "replacement" attributes to support 
translation, too.]

Widgets and Models
------------------

Here are some ideas for where PWT widgets are headed.  Widgets will 
basically be parameterized "replacement" attributes that perform rendering 
of what we might call "data shapes".  By data shapes, I mean that some data 
is shaped like a tree, other data is shaped like lists or forms or 
tables.  This is rather similar to MS Research's concept of "triangles, 
circles, and rectangles", where trees are triangles, objects are circles, 
and data rows are rectangles, except that those "shapes" are a bit vague 
compared to what I have in mind.

In the sections above, I've pretty much laid out what's needed for basic 
"circles" and for lists thereof.  But we don't really have "rectangles" 
because we don't have, for example, an interface that defines a collection 
of rows with metadata about columns.  And we don't have "triangles" because 
we don't have an interface for tree browsing.  We also don't have "form" or 
"menu" interfaces.

It seems to me that a lot of applications can get by with just a few basic 
shapes, and relatively few widgets per shape.  For example, I've mentioned 
before that a "menu" widget, using a "shape" that consists of a list of 
items and some way to tell which items (if any) are selected, can be used 
to implement list boxes, dropdowns, navigation menus, tabbed interfaces, 
and so on.  "Form", "Grid" and "Tree" widgets can do quite a bit as 
well.  The really nice thing about widgets implemented as DOMlets is that 
they can be parameterized with arbitrary dynamic content, so the visual 
rendering of your data is completely under your control.  Anyway, here are 
my thoughts so far on what "shapes" will make sense, with the abstractions 
that they will need:

   * Tree -- "children", "expanded/unexpanded", "selected/unselected", 
"depth", ...?

   * Tabular Data -- "columns", "sorting", "items per page", "summaries", ...?

   * Menus -- "items", "selected items"

   * Forms -- "field names", "fields", "help", "errors", "validation"...?

   * "Decks" (alternate content, displayed conditionally, e.g. "wizards" 
and certain kinds of "tabbed" interfaces) -- "items", "selected item"
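
As a sketch of the "menu" shape, here is one widget function producing 
both a dropdown and a tab strip from the same data, with only the 
snippets changing.  It is plain Python 3; 'Menu' and 'render_menu' are 
invented names, and format strings stand in for the XML snippet 
parameters a real DOMlet would receive:

```python
from html import escape

class Menu:
    """Hypothetical "menu" shape: a list of items plus the selected ones."""

    def __init__(self, items, selected=()):
        self.items = list(items)
        self.selected = set(selected)

def render_menu(menu, item_snippet, selected_snippet):
    """Render each item with one of two caller-supplied snippets."""
    parts = []
    for item in menu.items:
        snippet = selected_snippet if item in menu.selected else item_snippet
        parts.append(snippet.format(item=escape(item)))
    return "".join(parts)

menu = Menu(["Home", "News", "About"], selected=["News"])

dropdown = render_menu(
    menu, "<option>{item}</option>", "<option selected>{item}</option>")
tabs = render_menu(
    menu, "<li>{item}</li>", '<li class="active">{item}</li>')

print(dropdown)
print(tabs)
```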

As you can see, this is all still fairly vague at the moment.  It would be 
nice if these "model" interfaces could be defined in a way that allowed 
them to be extended to GUI interfaces in the future, by adding notification 
(from the models to their GUI views), although the notification part is out 
of scope for peak.web itself.  But widgets *will* need a way to receive 
"user events", in the sense that they will have data they need from the 
HTTP request (e.g. query string and form variables) in order to do their magic.

If anybody has any experience with GUI "models" that's relevant here, 
suggestions on how best to structure these interfaces might be helpful.

View Registration
-----------------

The view registration mechanism is going to be based on contextual 
protocols.  First, we'll add a new API, 'config.registeredProtocol(ob, key, 
baseProtocol=None)'.  The idea is that if you want to register a view named 
"foo" for instances of TargetType, you'll do something like:

    p = config.registeredProtocol(context, 'peak.web.views.foo')
    protocols.declareAdapter(
        lambda ob: MyViewClass, [p], forTypes=[TargetType]
    )

Well, that's not exactly true.  You probably won't do this in code, because 
it'll be a pain.  Notice that you have to have a context object (the web 
application, basically) before you can register; you can't just declare 
that MyViewClass is an adapter, because the protocol to adapt *to* is 
defined at runtime.
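
To illustrate why a context is needed, here is a toy model of per-context 
protocols in plain Python 3.  'ViewProtocol' and 'registered_protocol' are 
stand-ins invented for this sketch (the proposed 'baseProtocol' argument is 
left out), and none of it uses the real protocols package:

```python
import weakref

class ViewProtocol:
    """Stand-in for a protocol object; it only records adapters by type."""

    def __init__(self, key):
        self.key = key
        self.adapters = {}

    def declare(self, for_type, factory):
        self.adapters[for_type] = factory

    def adapt(self, ob):
        for klass in type(ob).__mro__:
            if klass in self.adapters:
                return self.adapters[klass]
        return None

_protocols = weakref.WeakKeyDictionary()   # context -> {key: ViewProtocol}

def registered_protocol(context, key):
    """Return the protocol for 'key' in this context, creating it on demand."""
    per_context = _protocols.setdefault(context, {})
    if key not in per_context:
        per_context[key] = ViewProtocol(key)
    return per_context[key]

class App:
    pass

class TargetType:
    pass

class MyViewClass:
    pass

site_a, site_b = App(), App()
registered_protocol(site_a, "peak.web.views.foo").declare(TargetType, MyViewClass)

in_a = registered_protocol(site_a, "peak.web.views.foo").adapt(TargetType())
in_b = registered_protocol(site_b, "peak.web.views.foo").adapt(TargetType())
print(in_a, in_b)
```

The view registered for one application object is invisible to the other, 
and in both cases nothing can be registered until the application object 
exists, which is what makes doing this by hand in code awkward.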

So, instead you'll use one of two configuration formats.  Either .ini:

     [Web views for some_module.TargetType]
     foo = some_module.MyViewClass

Or XML:

     <view name="foo" for="some_module.TargetType"
           class="some_module.MyViewClass" />

I've skipped a bit far ahead here.  The result of adapting to our named 
protocol needs to be an 'IViewFactory': an interface with one method: 
'__call__(environ, ob, name)'.  The result of calling that method is 
actually used by the traversal machinery as the target of a '++view++foo' 
or '@@foo' traversal.  (Or of a 'foo' traversal, if the object has no 'foo' 
attribute or item.)

So, your view definition can be either a class or a function.  If it's a 
class, the view will be an instance, otherwise it'll be the return value of 
the function.  Here's a view that formats a floating point number to two 
places:

    def floating_format(environ, ob, name):
        return "%.2f" % ob

It should be able to be defined using e.g.:

    [Web views for __builtin__.float]
    two_places = some_module.floating_format
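
The lookup order described above ('++view++foo', '@@foo', then a plain 
'foo' that falls back to a view) might be sketched like this in plain 
Python 3.  'VIEW_FACTORIES', 'get_view' and 'traverse_step' are invented 
for the sketch, and a dictionary stands in for the protocol machinery:

```python
class NotFound(Exception):
    pass

VIEW_FACTORIES = {}   # (type, view name) -> callable(environ, ob, name)

def get_view(environ, ob, name):
    for klass in type(ob).__mro__:
        factory = VIEW_FACTORIES.get((klass, name))
        if factory is not None:
            # A class here yields an instance; a function yields its result.
            return factory(environ, ob, name)
    raise NotFound(name)

def traverse_step(environ, ob, step):
    if step.startswith("++view++"):
        return get_view(environ, ob, step[len("++view++"):])
    if step.startswith("@@"):
        return get_view(environ, ob, step[2:])
    # Plain name: attribute first, then item, then fall back to a view.
    if hasattr(ob, step):
        return getattr(ob, step)
    try:
        return ob[step]
    except (TypeError, KeyError, IndexError):
        return get_view(environ, ob, step)

def floating_format(environ, ob, name):
    return "%.2f" % ob

VIEW_FACTORIES[(float, "two_places")] = floating_format

print(traverse_step({}, 3.14159, "@@two_places"))
print(traverse_step({}, 3.14159, "two_places"))
```

An explicit '@@' or '++view++' prefix never looks at attributes, so it is 
the safe spelling when a view name might collide with an attribute name.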

Of course, you'll probably want to use dotted names for views in order to 
avoid name collisions between different components that are assembled into 
one application.  Now let's look at a fancier way of specifying views:

    def resource_view(permission, resource_path):

        def get_view(environ, ob, name):

            allowed = web.allows(environ, ob, permissionNeeded=permission)

            if not allowed:
                raise web.NotAllowed(
                    environ,
                    getattr(allowed,'message',"Permission denied")
                )

            return web.getResource(environ, resource_path)

        return get_view

and its use:

    [Web views for my_app.SomeComponent]
    index.html = resource_view(security.Anybody,
        'my_app/SomeComponent_mainView.pwt')

This simple arrangement lets you map page (view) names to template (or 
other) resources.  Notice, by the way, that view factories are responsible 
for doing security checks!  We can get away without them for views that 
e.g. format a date or a number, but other types of views will typically 
require permission checks.

Anyway, that's the basic idea.  Neither the .ini nor XML formats for 
configuration are finalized yet; they're only tentative at the moment.  I'd 
like the XML format to follow ZCML where possible, but in some cases that's 
just not going to happen, because the ZCML names imply implementation 
details that differ.  For example, ZCML expects a view to have a "class", 
but our views can be defined by functions.  I'm going to need to review 
this more before I finalize the XML format.

Amusingly enough, I was doing exactly that review when I stumbled upon the 
traversal namespaces stuff that started me on writing my earlier 
post.  But, I won't be going back to it just yet, as it's after 2 AM and I 
need some sleep first.  :)

Finally, as a reward for reading this far, I give you one last idea: what 
if you could build a static site publishing system from a peak.web 
application?  That is, you run a script to build a static version of the 
site.  The tricky bit would be knowing what pages need rendering, though; 
you don't really want to parse HTML and follow links.  So...  you define a 
view name to be used, maybe 'static_site_children'.  And then you define 
its implementation for each of the different classes and locations in your 
application, so that you need only request that view on each location in 
the site.  Intriguing, isn't it?  Somebody could create a dynamic blogging 
peak.web application, but then you could just add a static view definition 
to it and then run a "staticize" tool over the application object whenever 
you needed to republish.
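
A sketch of what such a "staticize" walk might look like, in plain 
Python 3.  Everything here is hypothetical: 'children_of' stands in for the 
'static_site_children' view, 'render' for each location's page view, and 
the result is a dictionary rather than files on disk:

```python
def staticize(root, children_of, render):
    """Return {path: rendered page} for root and everything reachable from it.

    children_of(ob) must return (name, child) pairs; render(ob) returns the
    page text for one location.
    """
    pages = {}
    stack = [("", root)]
    seen = set()
    while stack:
        path, ob = stack.pop()
        if id(ob) in seen:        # guard against cycles in the object graph
            continue
        seen.add(id(ob))
        pages[path or "/"] = render(ob)
        for name, child in children_of(ob):
            stack.append((path + "/" + name, child))
    return pages

class Entry:
    def __init__(self, title):
        self.title = title

class Blog:
    def __init__(self, entries):
        self.entries = entries

def children_of(ob):
    if isinstance(ob, Blog):
        return [(e.title.lower() + ".html", e) for e in ob.entries]
    return []

def render(ob):
    if isinstance(ob, Blog):
        return "<h1>Blog</h1>"
    return "<h1>%s</h1>" % ob.title

site = staticize(Blog([Entry("Hello"), Entry("World")]), children_of, render)
for path in sorted(site):
    print(path, site[path])
```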

Okay, that's more than enough for now.  I look forward to your comments, 
questions, and feedback.