[PEAK] fact orientation
Phillip J. Eby
pje at telecommunity.com
Sat Oct 15 19:49:58 EDT 2005
At 11:31 AM 10/15/2005 -0700, Jay Parlar wrote:
> > It's a big part of it. Dynamic variables, fact orientation, and monkey
> > typing are too...
>Interesting. I'm going to have to look into fact orientation, having
>never seen that term before. Do you have any good links to it? All the
>ones I find on Google seem pretty heavy with "market speak", I can't
>figure out what they really mean.
If you don't mind wading through some fairly deep reading, try this:
It's an elaboration on the fundamental principles of FCO-IM: Fully
Communication-Oriented Information Modelling.
In brief, it states that the purpose of data modelling is to
represent *human communication about the world*, NOT to model the world itself.
This is actually quite a radical and revolutionary idea with respect to
software design as it is mostly practiced today. If you think through the
ramifications, it means that object orientation as we currently conceive it
is broken, because it's trying to solve the wrong problem. Computers are
useful for performing computation and queries regarding *communicable facts
about objects*, which is not the same thing as representing *actual objects*.
In the FCO-IM view of the world, then, there exist only:
1. lexical types (symbolic values like numbers, strings, dates, etc.)
2. non-lexical "nominalized" types (conceptual entities, like a Person)
3. dimensional types (unit values like feet, seconds, mass, etc.)
What's more, the "nominalized" types do not exist as values. You can't
just refer to a Person, you have to say in effect, "The Person named Jay",
or "The Person with SS#123-45-6789".
That doesn't mean you can't refer to a Person object in code, mind you, by
abstracting out what that reference mode or key is. I'm just pointing out
that fact orientation doesn't try to model the *implementation of a
Person*. It models *facts about them* -- just like a relational
database. Indeed, I'd say the reason that relational databases are still
with us, and object databases mostly haven't panned out, is for precisely
the reason that relational databases are fact-oriented, and therefore more
flexible and extensible in this way.
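To make the "facts, not objects" idea concrete, here is a minimal sketch in plain Python. It is not FCO-IM notation and not any PEAK or Chandler API; the FactStore class, the fact-type names, and the tuple used as a reference mode are all hypothetical, chosen only to illustrate the point above.

```python
# Hypothetical sketch: none of these names come from FCO-IM, PEAK, or Chandler.
from collections import defaultdict


class FactStore:
    """Holds facts as (subject_key, value) pairs grouped by fact type."""

    def __init__(self):
        self._facts = defaultdict(set)  # fact_type -> {(subject_key, value)}

    def assert_fact(self, fact_type, subject_key, value):
        self._facts[fact_type].add((subject_key, value))

    def values(self, fact_type, subject_key):
        """Return every value recorded for this subject under fact_type."""
        return sorted(v for k, v in self._facts[fact_type] if k == subject_key)


store = FactStore()

# A Person is never a value in itself; we refer to one through a reference
# mode, here "the Person with SS# 123-45-6789".
jay = ("Person", "ssn", "123-45-6789")

store.assert_fact("has_name", jay, "Jay")
# A new kind of fact can be added later without touching any Person class.
store.assert_fact("plays_instrument", jay, "guitar")

names = store.values("has_name", jay)
instruments = store.values("plays_instrument", jay)
unknown = store.values("owns_car", jay)  # no such facts recorded
```

Note that there is no Person class anywhere: the key only identifies which facts belong together, much as a primary key does in a relational table.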
So, fact orientation is far more flexible because you can always add
more kinds of facts about a person, but in the OO paradigm a class is
closed, with a fixed set of behaviors and characteristics. If you combine
generic functions (which can extend classes with new behaviors) with fact
orientation (which can extend concepts with new kinds of facts), you have a
completely open-ended system with regard to extensibility.
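As a rough illustration of extending a class with new behavior from the outside, the sketch below uses functools.singledispatch from today's standard library as a stand-in for a generic function. That is a substitution on my part (it dispatches only on the type of the first argument), and the Person, Company, and get_label names are hypothetical.

```python
from functools import singledispatch


class Person:
    def __init__(self, name):
        self.name = name


class Company:
    def __init__(self, title):
        self.title = title


# The generic function lives outside both classes, so neither class has to
# be edited (or even be editable) to gain the new behavior.
@singledispatch
def get_label(obj):
    raise TypeError("no get_label rule for %s" % type(obj).__name__)


@get_label.register(Person)
def _label_person(obj):
    return "Person: " + obj.name


@get_label.register(Company)
def _label_company(obj):
    return "Company: " + obj.title


person_label = get_label(Person("Jay"))
company_label = get_label(Company("OSAF"))
```

A third party could register a rule for yet another type later on, without touching this module.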
In short, present-day OO techniques are far less flexible than functional
decomposition and relational logic are, despite the fact that OO is
supposed to be their successor. Nice, eh? :)
But what OO *does* offer is programmer convenience, a hierarchical
shorthand for expressing commonalities, and a more brief notation for
obtaining or manipulating certain kinds of facts. The problem with both
fact orientation and generic functions in their "raw" form is that you end
up with a giant global namespace and the need to use function syntax (a la
Lisp) to get at anything. Instead of saying 'somePerson.foo', you would
have to say 'get_foo(somePerson)'. And then, six libraries might have a
'foo', so you really need 'somelibrary1.get_foo(somePerson)'. Ugh.
So, this is where my "monkey typing" concept comes into play: mapping these
more flexible models back into the syntax and patterns we know and
love. We simply use ISomeLibrary(somePerson).foo for setting, getting,
deleting, and method calls. These adapters are of course just a collection
of descriptors that return bound methods wrapping the appropriate generic
functions, or perhaps the results of calling them.
Ideally, monkey typing would be able to piggyback on Guido's proposal for
implementing type declarations, so that it wouldn't be necessary to deal
with these matters in-line most of the time. Monkeytyping adapters are by
nature safe for re-adaptation, in that switching to another interface just
unwraps and re-wraps the underlying object.
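Here is a minimal sketch of what such an adapter could look like. It is not the PEAK or Chandler implementation: the Adapter base class, the GenericAttribute descriptor, and IOtherLibrary are hypothetical, functools.singledispatch stands in for a real generic function, and only attribute reads are covered (no setting, deleting, or method calls).

```python
from functools import singledispatch


@singledispatch
def get_foo(subject):
    raise TypeError("no get_foo rule for %r" % (subject,))


@get_foo.register(str)
def _get_foo_for_str(subject):
    return "foo of " + subject


class GenericAttribute:
    """Descriptor that forwards attribute reads to a function of the subject."""

    def __init__(self, func):
        self.func = func

    def __get__(self, adapter, owner=None):
        if adapter is None:
            return self  # accessed on the class, not on an instance
        return self.func(adapter.subject)


class Adapter:
    """Base for interface adapters; re-adapting unwraps, then re-wraps."""

    def __init__(self, subject):
        if isinstance(subject, Adapter):
            subject = subject.subject  # never stack adapters on adapters
        self.subject = subject


class ISomeLibrary(Adapter):
    foo = GenericAttribute(get_foo)


class IOtherLibrary(Adapter):
    # A different library's notion of "foo"; the two names never clash.
    foo = GenericAttribute(lambda subject: subject.upper())


some_person = "jay"  # any underlying object or key would do here
first = ISomeLibrary(some_person).foo
second = IOtherLibrary(ISomeLibrary(some_person)).foo
```

The attribute syntax is back, and the namespace question is settled by which interface you adapt to rather than by a library prefix on every call.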
Anyway, with my recent implementation of the schema.Annotation class for
Chandler, I've realized how to do monkey-typing for fact-oriented systems
using the same approach. In Chandler, you can now do things like the
following, which is a snippet from the schema API doctest:
>>> class Teacher(schema.Annotation):
...     schema.kindInfo(annotates=Person)   # annotate the "Person" type
...     certifications = schema.Sequence()
...     supervisor = schema.One(Person)
>>> class TeachingCertificate(schema.Item):
...     subject = schema.One(schema.Text)
...     certified_teachers = schema.Sequence(
...         Teacher, inverse=Teacher.certifications
...     )
>>> ProfMary = Teacher(Mary) # Adapt Person to Teacher
>>> gym = TeachingCertificate("gym", subject=u"Physical Education")
>>> ProfMary.certifications = [gym]
[<TeachingCertificate ... gym ...>]
[Mary Quite Contrary]
The extra state isn't stored in the adapters, though; it's part of the
underlying database. Which means you can adapt as many times as you like
and still get the same data:
[<TeachingCertificate ... gym ...>]
Thus, the data model for "Person" is *open ended*. Any number of Chandler
plugins can define their own additional data to be kept, and those
additional attribute names don't clash with those defined by "Person"
itself. This gets Chandler out of the OO rut in a way that plain OODBs
can't handle. (Of course, I'm sort of faking it because Chandler is
actually built on an OODB, not a relational one. So in truth you could
pull the same trick on top of other Python OODBs as well.)
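The essence of the trick can be sketched in a few lines of plain Python. This is not Chandler's schema.Annotation; the Item and Annotation classes below are hypothetical stand-ins, and the key-prefixing scheme is an assumption about one way to keep names from clashing.

```python
class Item:
    """Stand-in for a stored object; all attribute values live in `values`."""

    def __init__(self, name):
        self.name = name
        self.values = {}


class Annotation:
    """Adapter that keeps no state of its own.

    Reads and writes go to the wrapped item, under a key prefixed with the
    annotation class name, so two annotations can both define an attribute
    called `certifications` without colliding.
    """

    def __init__(self, item):
        # Bypass our own __setattr__ for the one real instance attribute.
        object.__setattr__(self, "_item", item)

    def _key(self, attr):
        return "%s.%s" % (type(self).__qualname__, attr)

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. for annotated attributes.
        try:
            return self._item.values[self._key(attr)]
        except KeyError:
            raise AttributeError(attr) from None

    def __setattr__(self, attr, value):
        self._item.values[self._key(attr)] = value


class Teacher(Annotation):
    pass


class Musician(Annotation):
    pass


mary = Item("Mary")
Teacher(mary).certifications = ["gym"]
Musician(mary).certifications = ["grade 8 piano"]

# Adapting again yields a brand new adapter that sees the same data.
again = Teacher(mary).certifications
```

Because every adapter is just a view, throwing one away loses nothing.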
Up until now, I'd always had this idea as a general concept of what I
wanted to do with the "SOAR" project (Simple Objects Accessed
Relationally). But one of the conceptual stumbling blocks for me with SOAR
was that I always got down to the problem of how to determine what a
database object's "type" was, and how to determine whether it implemented a
particular "data interface". (If you look at the old TransWarp code for
"records", you'll see a lot of this stuff there.)
But what I've realized from working with Chandler is that you don't *need*
objects to have a "type", in the sense that they can have one and only one
type. If you're stuck in the OO paradigm, it seems this way, because how
else can you determine what method implementations will be used to respond
to a particular message? But in a facts+functions world, this is
silly. You just define things' behavior with generic functions, which are
perfectly capable of determining behavior based on *whatever facts you'd
like*.
So, when you view this in the context of "modelling human communication
about things", you quickly realize that determining a thing's "type" is
just a kludge. Business applications are usually all about enforcing rules
based on facts, anyway. The "is-a" relationship is just another kind of
fact, and doesn't require you to boil every object down to just one
type. In a sense, you can have multiple-inheritance on a *per instance*
basis, if you like.
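A toy version of dispatching on facts rather than on a single type might look like the sketch below. It is not RuleDispatch or any PEAK API: the FactFunction class, the "is_a" fact name, and the discount rules are hypothetical, and where a real predicate-dispatch system would pick the most specific applicable rule, this sketch simply lets the most recently registered matching rule win.

```python
class FactFunction:
    """A generic function whose rules are selected by predicates over facts."""

    def __init__(self, name):
        self.name = name
        self._rules = []  # (predicate, implementation), in registration order

    def when(self, predicate):
        """Decorator: register an implementation guarded by a predicate."""
        def decorator(impl):
            self._rules.append((predicate, impl))
            return impl
        return decorator

    def __call__(self, facts):
        # Simplification: the latest matching rule wins.
        for predicate, impl in reversed(self._rules):
            if predicate(facts):
                return impl(facts)
        raise LookupError("no applicable rule for %s" % self.name)


discount = FactFunction("discount")


@discount.when(lambda facts: True)
def _default(facts):
    return 0


@discount.when(lambda facts: "Teacher" in facts["is_a"])
def _teacher(facts):
    return 10


@discount.when(lambda facts: {"Teacher", "Veteran"} <= facts["is_a"])
def _teacher_and_veteran(facts):
    return 25


# "is-a" is just another fact, and one thing may have several at once.
plain = discount({"is_a": {"Person"}})
teacher = discount({"is_a": {"Person", "Teacher"}})
both = discount({"is_a": {"Person", "Teacher", "Veteran"}})
```

Nothing here ever asks what the one true class of the thing is; each rule looks only at the facts it cares about.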
The problem this was causing with trying to implement polymorphic database
schemas in PEAK and TransWarp was that we always had the implicit
assumption that an object was always of just one type, and thus we needed
to know what class to make the ghost be when we loaded an item.
But a fully fact-oriented model using generic functions doesn't have to
care, because *the object itself has neither behavior nor data*. The
object is simply a key to access a collection of underlying facts. Once
you've realized that, then there's no "problem" to solve any more, except
that you can't really go around using isinstance() on things unless you
generate classes on-the-fly to match an individual object's behavioral
signature. (Which we could possibly do, but it seems easier to me to just
deal with things in monkeytyping terms, with no "real" object.)
In Chandler, then, an individual object really *does* have a single type,
but we use annotations to widen selected types with "third-party"
attributes. But in the monkey/facts/functions (MFF?) model, this won't be
the case. There will be no "real" objects at all, in the old sense, or if
there are for efficiency's sake, it's just a coincidence. All you'll ever
see are "interface instances", never the "real object".
Doing this stuff in Chandler would be too much of a rework for no immediate
gains, because Chandler doesn't have generic functions and the underlying
storage mechanism is still married to the O-O model. So I don't have any
plans to push for implementing the full MFF model there.
For my own stuff, though, I'd like to be able to avoid there ever being
"real" objects, but that may not exactly be how it works out. There are
still places where it's useful to have traditional, non-queriable objects
in an application, but these are usually also the same objects for which a
schema isn't really necessary or useful to begin with.
Which brings us to an interesting point: the primary usefulness of
single-type, message-passing, closed class O-O is in creating
*solution-domain* abstractions like GUI toolkits, event frameworks,
service components, etc. That is, OO as we know it today is really a
low-level toolkit useful mainly for solving programmers' problems, not
users' problems. :)