[PEAK] Re: Proposal for another Wiki "tutorial"

Wed Jul 14 18:02:34 EDT 2004

At 04:59 PM 7/14/04 -0400, Phillip J. Eby wrote:
>Hm.  Anyway, I better stop now, because at this point I'm halfway to 
>making your framework into a generalized enterprise management reporting 
>system that could just as easily report on people or departments and 
>products as it could on systems...  :)

Heh.  Too late.  This idea has now thoroughly infected my brain and taken 
over...  I pulled out my copy of Fowler's "Analysis Patterns", which in 
chapters 3 and 4 explores how to design domain frameworks for measurements 
and observations, both in general and specifically applied to corporate 
finance.  :)

Anyway, it's well worth reading, as the concepts apply just as well to 
system monitoring.  For example, he describes the mapping of ranged 
measurements to "category observations", like "system is slow", "system is 
normal", "up", "down", etc.
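
As a rough sketch of that kind of mapping, a ranged measurement such as a
response time could be bucketed as shown here. The thresholds, the labels,
and the name `categorize` are invented for illustration and are not taken
from Fowler:

```python
from bisect import bisect_right

# Hypothetical upper bounds (seconds of response time) and category labels.
BOUNDS = [0.5, 2.0]                    # assumed thresholds, sorted ascending
LABELS = ["fast", "normal", "slow"]    # always one more label than bounds

def categorize(value, bounds=BOUNDS, labels=LABELS):
    """Map a ranged measurement onto a category observation."""
    # bisect_right counts how many bounds are <= value, which is
    # exactly the index of the bucket the value falls into.
    return labels[bisect_right(bounds, value)]
```

With these assumed bounds, anything below 0.5 lands in "fast", values from
0.5 up to (but not including) 2.0 in "normal", and the rest in "slow".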

For my part, I'm having some very interesting thoughts about how generic 
functions would help in all this, for example in computing one metric from 
another, or translating measurements into category observations.  I can see 
having a generic function 
'get_current_measurement(subject,metric,max_staleness)' that then has 
methods like:

     [around("subject in IAnnotatable")]
     def get_current_measurement(subject,metric,max_staleness):

         measurement = IAnnotatable(subject).get(metric,None)

         if measurement is None or time()-measurement.taken > max_staleness:
             measurement = next_method(subject,metric,max_staleness)
             IAnnotatable(subject)[metric] = measurement

         return measurement

which is a caching strategy: it avoids recomputing a measurement that is 
already cached and no older than the desired 'max_staleness', and it updates 
the cache whenever a new measurement is taken.
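
For anyone who wants to try the caching logic without the generic function
machinery, here is a minimal standalone sketch. It assumes a plain dict in
place of IAnnotatable, a hypothetical `Measurement` record, and a `take`
callable standing in for next_method(); none of these names come from PEAK:

```python
from collections import namedtuple
from time import time

# Hypothetical measurement record: a value plus the time it was taken.
Measurement = namedtuple("Measurement", "value taken")

# Stand-in for IAnnotatable(subject): (subject, metric) -> Measurement
_cache = {}

def get_current_measurement(subject, metric, max_staleness, take, now=time):
    """Return the cached measurement unless it is missing or too old.

    `take` is an assumed callable that actually performs the measurement;
    it plays the role of next_method() in the generic function version.
    `now` is injectable so the clock can be faked in tests.
    """
    measurement = _cache.get((subject, metric))
    if measurement is None or now() - measurement.taken > max_staleness:
        # Cache miss or stale entry: measure again and remember the result.
        measurement = Measurement(take(subject, metric), now())
        _cache[(subject, metric)] = measurement
    return measurement
```

Calling it twice within 'max_staleness' seconds invokes `take` only once;
a call made after the entry has aged out takes a fresh measurement and
replaces the cached one.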

Other methods would e.g. define how to derive the measurement from another 
measurement, or would include code to actually take the measurement.

I'm particularly intrigued by the possibility of then fitting this into an 
exception-reporting system, using things like exponential averages, MAD, 
and trend detection.  Interestingly, those techniques are *also* useful for 
both system monitoring and enterprise reporting.  :)
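
To make the exponential-average idea concrete, here is a small sketch of one
way an exception detector could work. The class name, the `alpha` and
`tolerance` defaults, and the choice of an exponentially averaged absolute
deviation as the spread measure are all assumptions made for illustration,
not a description of any existing PEAK code:

```python
class ExceptionDetector:
    """Flag values that stray too far from an exponential average."""

    def __init__(self, alpha=0.2, tolerance=3.0):
        self.alpha = alpha          # smoothing factor, 0 < alpha <= 1 (assumed)
        self.tolerance = tolerance  # allowed multiples of the average deviation
        self.mean = None            # exponential average of the values seen
        self.deviation = 0.0        # exponential average of |value - mean|

    def observe(self, value):
        """Return True if `value` looks exceptional, then update the averages."""
        if self.mean is None:
            # The first value only seeds the average; nothing to compare yet.
            self.mean = float(value)
            return False
        error = abs(value - self.mean)
        exceptional = (self.deviation > 0
                       and error > self.tolerance * self.deviation)
        # Move both running averages a fraction `alpha` toward the new data.
        self.mean += self.alpha * (value - self.mean)
        self.deviation += self.alpha * (error - self.deviation)
        return exceptional
```

The deviation estimate needs a few observations to settle, so early flags
are unreliable; a real system would probably want a warm-up period before
reporting anything.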

On a related note, is anybody using peak.running.timers?  I'd like to yank 
that module out: just after I wrote it, I realized that 1) I had designed it 
all wrong, and 2) I had to move on to another project and didn't have time to 
do anything about it.