[PLUG] gawk: modify field contents

Larry Brigman larry.brigman at gmail.com
Tue Jul 7 23:15:37 UTC 2015


Messed that one up.... Sample data needed/wanted to verify proper operation.

On Tue, Jul 7, 2015 at 4:14 PM, Larry Brigman <larry.brigman at gmail.com>
wrote:

> Output same of the data would make it easier to actually test our results.
>
> On Tue, Jul 7, 2015 at 4:09 PM, Larry Brigman <larry.brigman at gmail.com>
> wrote:
>
>> You mean something like this?
>> awk '{sub(/^\./, "", $1); printf(".%s", $1)}'
>>
>> This takes the first field ($1), looks for a literal . at its start,
>> removes it, and prints the field with a '.' prepended.
>> The print happens whether or not a substitution was made.  In this example
>> the input was the output of 'tar tf', being converted to an absolute
>> location.
>>
>> I haven't used sub without a target recently, so off to the man page...
>> Without a target it operates on $0, which is the whole line, and it
>> replaces only the leftmost, longest match, once per call (here, once per
>> line).
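
To make the target/no-target difference concrete, here is a minimal sketch. The function names and the sample paths are made up for illustration; it uses only plain awk features, so gawk is not strictly required.

```shell
# Hypothetical helper names, for illustration only.

# No target: sub() edits $0 and replaces only the leftmost match.
first_dot_in_line() { awk '{ sub(/\./, ""); print }'; }

# Target $2: only that field is edited; awk then rebuilds $0 using OFS.
leading_dot_in_field2() { awk '{ sub(/^\./, "", $2); print }'; }

# gsub() replaces every match in its target (here $0).
all_dots_in_line() { awk '{ gsub(/\./, ""); print }'; }
```

For example, `printf '%s\n' './a ./b' | leading_dot_in_field2` should touch only the second field and leave the first one alone.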
>>
>> On Tue, Jul 7, 2015 at 3:39 PM, Rich Shepard <rshepard at appl-ecosys.com>
>> wrote:
>>
>>>    I understand sub(), gsub(), and substr() but have difficulty figuring out
>>> how to use any or all to modify a field's contents.
>>>
>>>    Context: data files have variable numbers of fields per record; most of
>>> these fields represent measured values as integers or (more often) as
>>> floating point numbers. When a value is below the laboratory's method
>>> detection limit they report the value with '<', '< ', or '-' preceding the
>>> floating point number. I want to strip off the leading symbol (with any
>>> following white space) and retain the floating point number as the field's
>>> content.
>>>
>>>    It seems that sub() should do the job when I understand how to present
>>> the arguments to the function. The syntax is
>>> sub(regex, replacement [, target]).
>>> Not knowing if I can have if/elseif/elseif/else in the regex position my
>>> initial approach is to use four calls to the function; e.g.,
>>>
>>>         sub(/\</,"") or sub(/\< /,"")
>>>
>>> Would these leave the remainder of the field's contents intact? gsub()
>>> probably adds no more capabilities to solving this problem than does sub().
>>> Not sure if substr() is really appropriate here.
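
One possible way to do the stripping with a single sub() call per field is sketched below. It rests on assumptions the thread does not confirm: the records are taken to be comma-separated (with whitespace-separated fields, '< 0.10' would split into two fields before sub() ever saw it), and the function name and sample record are invented. Note also that in gawk `\<` is a word-boundary operator rather than a literal '<', so the sketch leaves '<' unescaped inside a bracket expression.

```shell
# Sketch only: assumes comma-separated records; the function name is hypothetical.
strip_below_limit_marks() {
    awk 'BEGIN { FS = OFS = "," }
    {
        for (i = 1; i <= NF; i++)
            # One regex covers "<", "< " and "-": a leading < or -, then optional blanks.
            sub(/^[<-][ \t]*/, "", $i)
        print
    }'
}
```

Something like `printf '%s\n' 'site1,<0.05,< 0.10,-0.2,3.5' | strip_below_limit_marks` would exercise all three markers. The rest of each field is left intact because sub() replaces only the matched prefix. A genuinely negative value would lose its sign too, so this fits only if '-' always marks a below-limit value, as described above.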
>>>
>>>    I've read the two post-"The AWK Book" books I have and understand the
>>> syntax but not how to apply the functions to achieve what needs to be done.
>>>
>>>    This is just one issue that I need to grok while developing this generic
>>> data cleaning program.
>>>
>>> Rich


