[PEAK] Untangling the running.IExecutable mess

Tue Nov 16 15:03:50 EST 2004

After considerable study, I think I've found the broken bits of the current 
command-line app framework.

First, running.ICmdLineAppFactory shouldn't subclass 
binding.IComponentFactory, despite the fact that it is a compatible 
extension of the interface.  The problem is that any callable object can be 
an ICmdLineAppFactory, but that doesn't make it a component factory!

Second, there needs to be a way for interpreters to handle errors better in 
their constructors, in connection with their parent command.  Specifically, 
right now interpreters try to immediately parse their arguments and return 
a target object in place of themselves.  However, if there's an error, they 
simply return themselves, hoping they'll be re-run, and they can output the 
real error.  This doesn't really work.  They should instead immediately 
output the command's help message, and raise an error.

The tricky part is handling that error.  What should the error be?  It 
seems to me that it should be a SystemExit, since this will cause a nice 
clean exit, whenver you attempt to use a broken command.  The downside may 
be that SystemExit gets trapped somewhere, and the output may be written 
somewhere strange, but I don't know if there are really any alternatives, 
there.

Finally, ICmdLineAppFactory should be redefined such that it is not 
guaranteed to return an ICmdLineApp; and its users should be required to 
adapt to ICmdLineApp if that is what they want.  In essence, an 
ICmdLineAppFactory is characterized more by the arguments it may receive, 
than the precise nature of what it creates.

I am also sorely tempted to remove the adaptation from arbitrary callables 
to ICmdLineAppFactory.  It's only used to enable functions to be run as 
applications, and could be replaced with e.g. a 'peak call' 
interpreter.  The only real downside to this is that the nice IntroToPeak 
"hello world" function would no longer work.

On the other hand, we could narrow that behavior to only work with 
functions.  That would eliminate the current "silent failure" mode, where 
specifying a class that has no __init__-time behavior, produces no 
result.  This would still support backward-compatibility with typical 
'main()' functions.  I think I'll take this approach unless somebody is 
using the current behavior that supports classes, class methods, callable 
instances, and instance methods as well.

There is actually another issue, wherein a Bootstrap interpreter always 
expects an ICmdLineAppFactory, meaning that Bootstrap should not be used 
when such an item is not expected.  However, there is no standard for 
initializing other kinds of objects as the target of a command/import 
operation.  For example, if the current CGIInterpreter command finds its 
target is an IComponentFactory, it will invoke it to create a new instance.

It seems silly, though, to create a series of alternative interfaces like 
ICGIFactory, ICmdLineAppFactory, and so forth, just to say, "hey, if you 
find certain kinds of objects, give them an opportunity to instantiate 
in-context."

So, what are some alternatives?

* We could create a new kind of 'cmdFactory()' protocol, akin to 
'sequenceOf()', taking the protocol we expect the instantiated object to 
provide.  Then, each context like Bootstrap, CGIInterpreter, and so on, 
would adapt to e.g. 'cmdFactory(ICmdLineApp)', 
'cmdFactory(IRerunnableCGI)', etc.

* We could move the creation to the import URL.  E.g., we could allow '()' 
at the end of an import URL, meaning, "call and create an instance", so 
that 'import:foo.Bar()' returns a new instance of the 'foo.Bar' class.

* We could have a reference type like 'ref:instance at import:foo.Bar'.  This 
would have the additional advantage of being applicable to any URL type, 
not just imports, and implementation would consist only of adding one 
class, and an entry to peak.ini.

* We could allow reference URLs to omit the '@url' part, so that 
'ref:foo.Bar' is a valid URL.  (Currently, you have to at least have 
'ref:foo.Bar at x').  Using this approach, any to-be-instantiated class must 
implement naming.IObjectFactory or naming.IState.  But, you can configure 
alternative implementations for a given dotted name, by defining properties 
in 'peak.naming.factories'.  Also, these "bare reference" URLs would be 
shorter even than the 'import:foo.Bar()' syntax (compare against 'ref:foo.Bar'.

Of these, I think I lean towards the 'import:foo.Bar()' solution, as easy 
to explain.  On the other hand, it may have to be quoted in some shells.

There may be a better solution, though.  Currently, the lookup process for 
most commands already calls through the naming system.  The naming system, 
by default, will check whether the object being returned is an IState or 
IObjectFactory, and if so, invoke them with the naming context being 
used.  So, there's already a perfectly good way to ensure that you get an 
instance of something: just make sure that the thing you're referencing is 
an IState or IObjectFactory.

The downside?  Well, now you have no way to reference the object, instead 
of a new instance of it.  But so far, I've never had a use case for that in 
PEAK.  Sure, there are places where we use import strings (or 
'importString()') to access a factory of some kind, but not 'import:' 
URLs.  As further proof, consider the fact that storage.ManagedConnection 
subclasses all implement IObjectFactory, so you can't reference one of 
those classes with any kind of URL, and not end up with an instance instead 
of the class.  The fact that this is not a problem in practice suggests 
that this is a good idea.

But, I just tested making all Component classes do this, and it produces a 
whopping 115 errors in the test suite.  The "flaw in the ointment" is that 
.ini files also use the object restoration protocols 
(IState/IObjectFactory), so making every class one means that they try to 
instantiate in that context.

I could fix *that*, by making component classes also be ISmartProperty 
instances that return themselves.  However, that just seems too convoluted 
and kludgy.

I think that this is basically an area where, in the face of ambiguity, one 
should refuse to guess.  Therefore, whatever we do should be explicit, and 
we should get rid of all the "implicit" command restoration techniques.

The downside to this approach in the command framework is that it has been 
leading towards really elaborate commands like 'FastCGI fd.socket:stdin 
WSGI import:foo.bar'.  This is explicit, but very verbose.  Also, the 
CGI/WSGI runners *really* want to instantiate a target component 
class.  Even though 'peak.web' is moving towards sitemap references as the 
way to run an application, the 'trivial_web' and 'trivial_cgi' example 
applications, along with the DDT web runner, currently all need to 
instantiate from a class.

Sigh.  I think I'm going to go with my gut here, and implement the 
'import:foo.Bar()' facility.  Here's my reasoning:

* No use case has yet arisen for declaring instantiation, for anything 
other than an import.

* Although it may cause shell quoting problems, this is a minor problem, 
easily worked around, compared to some of the other issues I've run into here.

* Explicit is way better than implicit.

So, the plan here is to:

1. Add the '()' syntax to import: URLs.
2. Change CGIInterpreter and its descendants (e.g. 'peak launch') not to do 
implicit instantiation
3. Fix all the current CGIInterpreter-run commands to use the '()' syntax 
so they work again.
4. Drop support for treating arbitrary callables as ICmdLineAppFactory, 
restricting this behavior to function objects only.

Any objections?

(In the meantime, I'm going to go ahead and check in changes for the items 
mentioned in the first four paragraphs of this message.)