[PLUG] sorting by filename, when there's stuff before and after

Scott Bigelow epheph at gmail.com
Mon Jan 21 23:47:43 UTC 2013


I like David's solution, and that's the general strategy I would try, but I
would use grep. Its "-o" option prints only the part of each line that matches
the pattern, which is shorthand for what he's trying to do:

egrep -o '[^/"]+\.jsp'
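
To show how that fits the stated goal (a sorted list to compare against
another one), here is a rough sketch run against the sample lines from the
original question. The variable names "sample" and "jsp_names" are made up for
illustration, and the "sort -u" step is my assumption about what the sorting
stage would look like; only the grep pattern comes from above.

```shell
# Sample lines copied from the original question.
sample='"/path/to/file/abc.jsp";
some java declarative VARIABLE = "something.jsp";
* other/path/whatever.jsp
= "/path/to/file/other.jsp";
"/messy/path/abTestRest.jsp?metasomething=" +'

# grep -E is the same as egrep; -o prints only the matched text,
# i.e. the run of characters without "/" or a double quote that ends in ".jsp".
# sort -u orders the names and drops duplicates; LC_ALL=C pins the
# ordering to plain byte order so it does not vary with the locale.
jsp_names=$(printf '%s\n' "$sample" | grep -Eo '[^/"]+\.jsp' | LC_ALL=C sort -u)
printf '%s\n' "$jsp_names"
```

One caveat: the pattern only stops at "/" or a double quote, so a name that
follows a space with neither in front of it would drag the preceding text on
that line into the match.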





On Mon, Jan 21, 2013 at 3:20 PM, Rich Shepard <rshepard at appl-ecosys.com> wrote:

> On Mon, 21 Jan 2013, MJang wrote:
>
> > My objective is to set up a file with just the *.jsp filenames -- all the
> > stuff at least before the file name should be deleted on each line. (I can
> > then sort and compare with a different list of *.jsp filenames)
> >
> > Examples in my target list:
> >
> > "/path/to/file/abc.jsp";
> > some java declarative VARIABLE = "something.jsp";
> > * other/path/whatever.jsp
> > = "/path/to/file/other.jsp";
> > "/messy/path/abTestRest.jsp?metasomething=" +
>
>    I would write a short awk script that might look something like this:
>
> BEGIN {FS="/"} {print $4}
>
> and run it:
>
>    gawk -f script_name.awk > out.txt
>
>    This assumes each file name is at the same subdirectory depth. 'gawk' is
> the GNU version of 'awk' and either name will work just fine.
>
>    You might need to pre- or post-process your file to fix the location
> variations.
>
>    Or, write a more complex awk script that uses conditionals; here's an
> example of one of mine:
>
> #! /usr/bin/gawk -f
>
> # Replace zeros in data with the reporting level.
> #  If two reporting levels, the higher one is used.
>
> # Fields: site, sampdate, param, quant
>
> BEGIN { FS = "|"; OFS = "|"}
>
> {
>      if ($3 ~ /Ag/ && $4 ~ /^0.000$/) { print $1, $2, $3, "-0.005|"; }
>      else if ($3 ~ /Acid/ && $4 ~ /^0.000$/) { print $1, $2, $3, "-0.001|"; }
>      else if ($3 ~ /Alk_Tot/ && $4 ~ /^0.000$/) { print $1, $2, $3, "-1.000|"; }
>      else if ($3 ~ /Alk_OH/ && $4 ~ /^0.000$/) { print $1, $2, $3, "-1.000|"; }
>      else if ($3 ~ /As/ && $4 ~ /^0.000$/) { print $1, $2, $3, "-0.010|"; }
>    ...
>      else { print $1, $2, $3, $4, "|" }
> }
>
> HTH,
>
> Rich


