[PEAK] The basics of peak.schema

Phillip J. Eby pje at telecommunity.com
Tue Dec 7 00:30:55 EST 2004


So, what will peak.schema look like to start with?  I previously posted 
about constraints, but for now I'm going to punt on the larger constraint 
framework by simply ignoring constraints altogether.

As I see it, a schema class will simply be a regular Python class, with 
attribute descriptors, not unlike those of peak.model.  Well, actually, 
they'll be completely unlike peak.model; more like:

class Something(schema.Entity):

     foo = schema.Link(OtherType, inverse="otherEnd")
     bar = schema.Sequence(int)   # sequence of integers
     baz = schema.Field(str)  # string, unique key

     spam = ping = schema.Field(int)  # two integer fields

     schema.Unique("spam","ping")  # declare a unique key (spam+ping)

So, really, they're more like binding.Attributes than 
model.Attributes.  Unlike the latter, they also won't be method 
exporters.  In order to trap the equivalent of link/unlink events, you'll 
have to use custom sequence or mapping types.
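
Purely to illustrate what I mean (none of these names are proposed API, just the shape of the thing), such a custom sequence type might look roughly like:

class NotifyingList(list):

    """A list that reports link/unlink-style events to callbacks"""

    def __init__(self, items=(), on_link=None, on_unlink=None):
        list.__init__(self, items)
        self.on_link = on_link or (lambda item: None)
        self.on_unlink = on_unlink or (lambda item: None)

    def append(self, item):
        list.append(self, item)
        self.on_link(item)          # a "link" happened

    def remove(self, item):
        list.remove(self, item)
        self.on_unlink(item)        # an "unlink" happened

(A real one would also have to cover extend, __setitem__, __delitem__, slicing, and so on.)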

The descriptors will essentially delegate all their behavior to a workspace 
object, which will be accessible as an attribute on individual objects 
(with a default "null workspace" for instances or classes not bound to a 
workspace).

There are really only three behaviors anyway: get an unknown attribute 
value, verify a change before it's made, and notify after a successful change.  All of these operations will be delegated to the
workspace, which will implement them using generic functions.

The default implementations of these functions will do semi-useful things, 
like raise an error to indicate that the attribute is missing, supply the 
default value, etc.
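
In rough sketch form, a field descriptor might do little more than the following.  Every name here ('_workspace', 'getMissing', 'verifyChange', 'notifyChanged', 'attrName') is invented for the example, plain methods stand in for what would really be generic functions, and I'm assuming 'attrName' gets filled in by the class machinery, the way binding attributes learn their names today:

NOT_FOUND = object()    # sentinel: no value stored, no default given

class Field(object):

    attrName = None     # assumed to be set when the class is built

    def __init__(self, type, default=NOT_FOUND):
        self.type = type
        self.default = default

    def __get__(self, ob, typ=None):
        if ob is None:
            return self
        value = ob.__dict__.get(self.attrName, NOT_FOUND)
        if value is NOT_FOUND:
            # behavior 1: "get an unknown attribute value" -- ask the
            # workspace; the default implementation raises an error or
            # supplies self.default
            value = ob._workspace.getMissing(ob, self)
            ob.__dict__[self.attrName] = value
        return value

    def __set__(self, ob, value):
        ws = ob._workspace
        ws.verifyChange(ob, self, value)    # behavior 2: may veto the change
        ob.__dict__[self.attrName] = value
        ws.notifyChanged(ob, self, value)   # behavior 3: only after success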

This is an extremely narrow definition and set of features, but really this 
is nearly all of what I currently use in peak.model, if you ignore the 
crufty syntax-management parts, and the ZODB-based 
persistence.  peak.schema will use a different persistence mechanism, which 
will not use ghosts.  Instead, objects will only be loaded if they exist in 
the underlying DB, and only when you attempt to reach them via an attribute 
of a loaded object.  This approach does away with lots of annoying bits in 
the current implementation of DMs, where you have to implement 'get()' or '__contains__' on the DM to ensure that the object you're retrieving actually exists.
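
As a toy illustration of "loaded only if it exists, only when you reach for it" (a dict stands in for the database, the hook names are the same invented ones as above, and only the loading side is shown):

class ToyWorkspace(object):

    def __init__(self, rows):
        self.rows = rows    # {(type, key): state dict} -- our "database"

    def getMissing(self, ob, feature):
        key = ob.__dict__[feature.attrName + 'ID']   # toy foreign-key rule
        try:
            state = self.rows[feature.type, key]
        except KeyError:
            # the target doesn't exist, so no object is ever created;
            # there's no ghost sitting around to blow up later
            raise AttributeError(feature.attrName)
        target = feature.type.__new__(feature.type)  # built only now,
        target.__dict__.update(state)                # only because you
        target._workspace = self                     # reached for it
        return target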

The initial workspace implementation will be supremely trivial, as it will 
essentially just emulate the normal state of Python objects.  That is, when 
you set attributes they'll be set, and when you read them you'll find 
them.  However, over time we'll beef it up to support a flexible 
constraint-checking facility, after we first figure out how to implement 
the simpler constraints.
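
Concretely, the trivial version might amount to no more than this (again with the invented names, and reusing the NOT_FOUND sentinel from the descriptor sketch above).  In combination with the descriptors it just gives you ordinary Python attribute behavior: sets stick, reads find what was set, and missing reads give you the declared default or an error:

class NullWorkspace(object):

    def getMissing(self, ob, feature):
        # nothing stored: supply the declared default, or complain
        if feature.default is not NOT_FOUND:
            return feature.default
        raise AttributeError(feature.attrName)

    def verifyChange(self, ob, feature, value):
        pass    # no constraints yet, so every change is allowed

    def notifyChanged(self, ob, feature, value):
        pass    # and nobody to tell about it afterward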

Initially, MOF alignment/compliance won't be a goal for peak.schema; the 
main idea is just to get a dumb prototype up and running that can be used 
with workspaces and metadata.  Our first workspace implementations will 
probably be a simple in-memory test workspace, and a *simple* SQL mapper 
(with fixed-per-backend mapping patterns; no custom/legacy DB support).

Part of the prototype layer will be introspection APIs to let you examine 
the metadata for a type -- the equivalents of e.g. mdl_featureNames and 
such in peak.model.  The difference is that these APIs will be external 
functions, rather than built into the classes or descriptors, so there will be none of that awkward 'someThing.__class__.someFeature.doSomething()' stuff.  Also, it means there's no need for junk like 'model.Integer' and 'model.String', which
were there for the benefit of MOF and CORBA metadata that will now be 
available via functions *even for non-PEAK types*.  So, with peak.schema 
you'll be able to declare a feature as a 'datetime.datetime' (for example) 
without having to wrap it in a bunch of typecode fluff.
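
For instance, the equivalent of mdl_featureNames could be nothing fancier than a function that walks the class.  This is invented purely for illustration; the real thing would need to recognize all the descriptor kinds (not just the Field of the earlier sketch) and might well be a generic function:

def feature_names(cls):
    """Names of the schema features declared on 'cls' and its bases"""
    names = []
    for klass in cls.__mro__:
        for name, ob in klass.__dict__.items():
            if isinstance(ob, Field) and name not in names:
                names.append(name)
    names.sort()
    return names

# i.e. you'd write feature_names(Something), not
# Something.mdl_featureNames or someThing.__class__.someFeature...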

Indeed, peak.schema probably won't need equivalents of the model.Immutable 
or model.Primitive base types, because if you have a type that's one of 
these, it's probably a non-PEAK type anyway.  You'll just declare metadata 
for the non-PEAK type and be done with it, assuming you need any metadata 
at all.  So, really, you'll just have schema.Entity and schema.Value, 
corresponding to model.Element and model.Struct, respectively, with the 
only real difference being that schema.Value objects will be immutable, and will disallow both inverse relationships and fields with non-Value types.
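
In other words, a value type declaration might look something like this (illustrative only):

class Address(schema.Value):        # immutable once created

    street = schema.Field(str)
    city   = schema.Field(str)

    # not allowed on a Value: inverse relationships, or fields whose
    # types aren't themselves value types
    # residents = schema.Link(Person, inverse="address")

class Person(schema.Entity):

    name     = schema.Field(str)
    address  = schema.Field(Address)            # an Entity holding a Value
    modified = schema.Field(datetime.datetime)  # plain non-PEAK type, no
                                                # typecode fluff needed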

Anyway, at least to begin with, peak.schema will do a *lot* less validation 
on its attributes than peak.model does, and less management of persistence 
issues like in-place modification of mutable sequences or mappings.  A lot 
of these capabilities will come back later as it becomes clearer how to 
layer in validation.  Right now, with so many things up in the air, I think 
it's better to solidify the foundation structure before moving up to more 
advanced features.

Questions, anyone?  Issues?
