[PEAK] A simple schema language

Fri Oct 8 19:41:55 EDT 2004

As I was trying to flesh out the implementation plan today for peak.web's 
configuration files, I considered at one point using peak.model to define 
the schema for processing the various <location>/<container>/<content> 
elements, because it fits so well with peak.model's structure.

However, within a few minutes I became frustrated with trying to use 
peak.model as a schema *design* tool.  It's much more useful as an 
implementation tool, which probably shouldn't be too surprising since it 
was originally intended to use models generated from UML rather than 
written by hand.  So I began fooling around with trying to develop a 
simpler way of specifying a schema.  Here are the 60+ lines of the 
"bulletins" example schema, condensed to about 20:

     User = schema.Element(loginID=str, fullName=str)

     Bulletin = schema.Element(
         id = int,
         category = 'Category.bulletins',
         fullText = str,
         postedBy = User,
         postedOn = datetime,
         editedBy = User,
         editedOn = datetime,
         hidden   = (model.Boolean, False),
     )

     Category = schema.Element(
         pathName        = PathPart,
         title           = str,
         sortPosn        = schema.Attr(int, default=0),
         sortBulletinsBy = schema.Attr(SortBy,
                                       default=SortBy.MOST_RECENTLY_EDITED),
         postingTemplate = schema.Attr(str, default=''),
         editingTemplate = schema.Attr(str, default=''),
         bulletins       = [Bulletin.category],
     )

The idea being shown here is that a function call with keyword arguments 
defines the features of a schema.  If the argument is a type or the name of 
a type, then the feature is of that type.  If it's a feature of a type, or 
the name of a feature, then it's a bidirectional link.  If it's a list 
containing one of the above, then it's a collection of items.  If it's an 
explicit invocation of 'Attr', you can define other properties like default 
value, etc.
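
To make those rules concrete, here is a minimal sketch of how such keyword 
arguments might be classified.  Nothing here exists in PEAK: 'Element', 
'Attr', 'FeatureRef' and 'classify' are hypothetical stand-ins written for 
illustration, and treating a dotted string as the name of a feature (and an 
undotted one as the name of a type) is my own assumption.

```python
# Hypothetical sketch only; none of these names are part of PEAK.

class Attr:
    """Explicit feature declaration that can carry extra properties."""
    def __init__(self, type_, default=None):
        self.type = type_
        self.default = default


class FeatureRef:
    """Reference to one feature of an element, e.g. Bulletin.category."""
    def __init__(self, element, name):
        self.element = element
        self.name = name


class Element:
    """Abstract schema element whose features are keyword arguments."""
    def __init__(self, **features):
        self.features = features

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, so
        # Bulletin.category yields a reference to that feature.
        if name in self.__dict__.get('features', {}):
            return FeatureRef(self, name)
        raise AttributeError(name)


def classify(spec):
    """Describe one feature spec as a dict of kind/target/many/default."""
    if isinstance(spec, list):
        # A list wrapping a spec marks a collection of that spec.
        info = classify(spec[0])
        info['many'] = True
        return info
    if isinstance(spec, Attr):
        # Explicit Attr: classify the wrapped type, then add the default.
        info = classify(spec.type)
        info['default'] = spec.default
        return info
    if isinstance(spec, FeatureRef):
        kind = 'link'   # a feature of a type: bidirectional link
    elif isinstance(spec, str):
        # Assumption: 'Type.feature' names a feature, 'Type' names a type.
        kind = 'link' if '.' in spec else 'type'
    elif isinstance(spec, (type, Element)):
        kind = 'type'   # a plain type or another schema element
    else:
        raise TypeError('unrecognized feature spec: %r' % (spec,))
    return {'kind': kind, 'target': spec, 'many': False, 'default': None}


User = Element(loginID=str, fullName=str)
Bulletin = Element(id=int, category='Category.bulletins', postedBy=User)
Category = Element(
    title=str,
    sortPosn=Attr(int, default=0),
    bulletins=[Bulletin.category],
)

for name, spec in Category.features.items():
    print(name, classify(spec)['kind'])
```

A real implementation would also have to resolve the string names once all 
the elements exist, and check that both ends of a link agree.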

An approach like this offers some additional, interesting possibilities, 
like adding constraint expressions for validation.  The above could be 
considered an "abstract schema", which in itself is purely 
informational.  For example, the idea of 'datetime' or 'str' as a type need 
not be implemented by those exact types.  Instead, a concrete schema could 
declare that 'datetime' is actually implemented by a database timestamp 
type, for example, and also declare a custom parsing/formatting 
syntax.  Also, a concrete schema could be used to generate today's 
'peak.model' classes and objects, to provide an implementation, perhaps 
automatically mixing in domain methods defined in a separate module.  Or 
perhaps you'll do something like this:

     class Category(model.Element):

         schema.Features(
             pathName        = PathPart,
             title           = str,
             sortPosn        = schema.Attr(int, default=0),
             sortBulletinsBy = schema.Attr(SortBy,
                                           default=SortBy.MOST_RECENTLY_EDITED),
             postingTemplate = schema.Attr(str, default=''),
             editingTemplate = schema.Attr(str, default=''),
             bulletins       = [Bulletin.category],
         )

         # methods go here...

         def post(self, user, text, timestamp=None):
             pass  # ...

in order to define the domain methods.  Actually, we could probably do a 
bit better, using a slight modification to today's model.Attribute:

     f = model.Attribute

     class Category(model.Element):

         pathName        = f(PathPart)
         title           = f(str)
         sortPosn        = f(int, defaultValue=0)
         sortBulletinsBy = f(SortBy, defaultValue=SortBy.MOST_RECENTLY_EDITED)
         postingTemplate = f(str, defaultValue='')
         editingTemplate = f(str, defaultValue='')
         bulletins       = f([Bulletin.category])

         # methods go here...

         def post(self, user, text, timestamp=None):
             pass  # ...

Or substitute whatever you like for 'f'.  It's not quite as clear and is 
more verbose, but it could probably be implemented relatively quickly.
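
As a rough illustration of why the shorthand would be quick to build, here 
is a self-contained sketch of an 'f' that takes the type positionally and 
an optional 'defaultValue'.  It is a plain Python descriptor written as a 
stand-in, not peak.model's actual Attribute; the type check on assignment 
and the cut-down 'Category' class are assumptions made for the example.

```python
# Stand-in sketch; this is not peak.model.Attribute.

class f:
    """Feature descriptor: a type plus an optional default value."""
    def __init__(self, referencedType, defaultValue=None):
        self.referencedType = referencedType
        self.defaultValue = defaultValue

    def __set_name__(self, owner, name):
        # Remember the attribute name the class body assigned us to.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self   # accessed on the class: return the descriptor
        return obj.__dict__.get(self.name, self.defaultValue)

    def __set__(self, obj, value):
        # Assumed behavior: reject values of the wrong type.
        if not isinstance(value, self.referencedType):
            raise TypeError('%s must be %s' % (
                self.name, self.referencedType.__name__))
        obj.__dict__[self.name] = value


class Category:
    title           = f(str)
    sortPosn        = f(int, defaultValue=0)
    postingTemplate = f(str, defaultValue='')

    def __init__(self, title):
        self.title = title


news = Category('News')
print(news.title, news.sortPosn)   # sortPosn falls back to its default
news.sortPosn = 5                  # accepted: an int
```

Collections and bidirectional links such as f([Bulletin.category]) would 
need more machinery than this, which is where the existing peak.model 
features would come in.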

Anyway, I haven't yet decided whether I'll implement it, even in this 
simpler "shorthand" sense, but the big picture is interesting to keep in 
mind for when I come back around to working on peak.storage and peak.query.