   I do a lot of environmental data munging/wrangling/ETL. The data come to me as
.xml spreadsheets or the equivalent of line printer output sent as PDF files
(from federal resource agencies). I have found that emacs and awk, with the
occasional use of sed, do the job. Now and then I hit a new requirement
(such as reformatting a date from MM/DD/YY to YYYY-MM-DD) and my awk book
and web searches quickly find a working solution.
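
   For illustration, a date fix of that kind might look something like the
sketch below. It is only a sketch under stated assumptions, not a definitive
script: the function name fix_dates is hypothetical, the input is assumed to
be comma-separated with the date in the first column by default, and a
two-digit year below 70 is assumed to mean 20YY (otherwise 19YY). The source
data do not say which century is meant, so adjust that pivot to suit.

```shell
# Hypothetical helper: rewrite MM/DD/YY as YYYY-MM-DD in one CSV column.
# Assumptions (not from the original post): comma-separated input, date in
# column 1 unless another column number is given, and YY < 70 means 20YY.
fix_dates() {
    awk -F, -v OFS=, -v col="${1:-1}" '
    {
        # Split the chosen field on "/" into month, day, year.
        n = split($col, d, "/")
        # Only touch fields that look like MM/DD/YY; headers pass through.
        if (n == 3 && length(d[3]) == 2) {
            yy = d[3] + 0
            century = (yy < 70) ? 2000 : 1900   # assumed pivot year
            $col = sprintf("%04d-%02d-%02d", century + yy, d[1], d[2])
        }
        print
    }'
}
```

   Used as a filter, `printf '03/07/98,12.5\n' | fix_dates` prints
`1998-03-07,12.5`. Lines whose chosen column is not in MM/DD/YY form are
passed through unchanged.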

   I suspected that awk had flags, but the few web pages I found (including web
fora) did not use them the way I needed them to work. I've acquired a nice
collection of awk scripts that transform spreadsheet exports so the data can
be used in R, postgres, and GRASS.

