[PLUG] Open source object trace

Fri Jan 9 19:49:05 UTC 2015

Last week I was watching a shitstorm of HTTP GET requests from a
few ranges of IP addresses (in China):

  180.76.5.0/255.255.255.0
  180.76.6.0/255.255.255.0
  220.181.108.0/255.255.255.0

(and others).  These drove my outside server to a load average
of 50, saturating Apache.  I blocked the main offenders with
iptables, and added permanent redirects to 127.0.0.1 for some
URL patterns in Apache, and things have calmed down for now.
These are kluges, not the clean fixes I still need to create.
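
A minimal sketch of how client addresses could be checked against
the ranges listed above, assuming Python 3 and its standard
ipaddress module.  The function names matching_range and
count_by_range are made up for illustration; this is not what I
ran on the server, just one way to tally offenders per range
before deciding what to block.

```python
import ipaddress

# Ranges quoted in this message, in the same address/netmask form.
BLOCKED_RANGES = [
    ipaddress.ip_network("180.76.5.0/255.255.255.0"),
    ipaddress.ip_network("180.76.6.0/255.255.255.0"),
    ipaddress.ip_network("220.181.108.0/255.255.255.0"),
]


def matching_range(address):
    """Return the listed network that contains address, or None."""
    ip = ipaddress.ip_address(address)
    for net in BLOCKED_RANGES:
        if ip in net:
            return net
    return None


def count_by_range(addresses):
    """Count how many of the given addresses fall in each listed range."""
    counts = {}
    for address in addresses:
        net = matching_range(address)
        if net is not None:
            key = str(net)  # rendered in prefix form, e.g. a /24
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Feeding count_by_range the client address column of an access
log would show which ranges dominate the traffic.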

Frighteningly, these GET requests created thousands of "stub
directories" in the MoinMoin page directory, though a user
should have to be logged in to cause such changes.  There may be
no easy way to trace data objects through the OS, and Apache,
and MoinMoin, to files on the disk.

THE QUESTION:  Is there an easy way to trace data objects through
a software system?  SHOULD THERE BE an easy way?

With sources, and high level tools, I should be able to create a
high level debug model, then simulate the flow of objects
through the components of that model.  I don't want to bother
with loops and subroutines and lines of code inside those
components, unless I intentionally drill down.

The model might take days of computing to construct, with distro
updates requiring more hours of partial reconstruction.  But if 
the process is automated, it could be done in parallel on many
machines running the same distro.  Using the model on a few
objects would require few resources.

For my particular problem, I could watch (graphically) a data
object move through programs represented as named boxes ("httpd",
"mod-wsgi") connected with named lines, and watch the object
transform.  I could open up the program or library boxes and look
at the specific code modules that are activated, abstracted at a
lower level.  I could call up module man pages for functional
descriptions.  I would only drill down into lines of code if
something looks wonky - actually, I would contact the author of
the module with a transportable example and ask "is it really
supposed to do this?"
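
A toy sketch of the idea, assuming Python 3: programs are named
boxes, and a data object is passed through them while every
intermediate form is recorded.  The box names "httpd" and
"mod-wsgi" come from the paragraph above; the "moinmoin" box, the
Box and Model classes, and all three transform functions are
hypothetical stand-ins, not the behavior of the real programs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Box:
    """A named component that transforms a data object (a dict here)."""
    name: str
    transform: Callable[[dict], dict]


@dataclass
class Model:
    """An ordered chain of boxes; the connections are implied by order."""
    boxes: List[Box] = field(default_factory=list)

    def trace(self, obj):
        """Pass obj through each box, recording every intermediate state."""
        history = [("input", dict(obj))]
        for box in self.boxes:
            obj = box.transform(dict(obj))
            history.append((box.name, dict(obj)))
        return history


# Hypothetical transforms, invented purely for illustration.
def httpd(obj):
    method, path = obj["raw"].split()[:2]
    return {**obj, "method": method, "path": path}


def mod_wsgi(obj):
    environ = {"REQUEST_METHOD": obj["method"], "PATH_INFO": obj["path"]}
    return {**obj, "environ": environ}


def moinmoin(obj):
    return {**obj, "page": obj["environ"]["PATH_INFO"].strip("/")}


model = Model([Box("httpd", httpd),
               Box("mod-wsgi", mod_wsgi),
               Box("moinmoin", moinmoin)])

if __name__ == "__main__":
    for name, state in model.trace({"raw": "GET /SomePage HTTP/1.1"}):
        print(name, sorted(state))
```

A real model would be generated from the sources rather than
written by hand, and each box could be opened to expose the
modules inside it.  The point here is only the shape: observe
the object at each interface, and drill down only where the
transformation looks wrong.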

I am handwaving, analogizing to the processes I use to debug chip
designs.  Huge pieces of complex test and measurement gear (some
as big as a room), tiny chips, huge models representing those
chips at multiple levels of abstraction.  Chips as components in
larger systems, much larger models, with lots of observability at
model interfaces.  Simultaneously, the ability to drill down to
transistors, materials, or fundamental physics if needed.  Such
detail is rarely necessary; high level abstractions (one or two
layers below software) are usually sufficient to understand
processes and locate anomalies.  Focus on the essence of the
problem, filter out the peripheral details.

Efficient problem solving is the rapid and reliable elimination
of inconsequential detail.

Keith

-- 
Keith Lofstrom          keithl at keithl.com