[PLUG] Translating ^M to \n [WORKING]

Robert Citek robert.citek at gmail.com
Wed Aug 14 06:17:36 UTC 2019


On Tue, Aug 13, 2019 at 10:23 PM Rodney W. Grimes <freebsd at gndrsh.dnsmgr.net>
wrote:

> If you had followed the thread you would know that byte 1
> of the file is a 0xA, aka LF, and the dd was to rip that
> byte off the file, but the command got morphed cause I
> used a BSD iseek=1 syntax, and gnu dd does not understand
> that.
>

Yes, dd is a fine tool: pretty cool for seeking into files given an index
of offsets, imaging devices, creating sparse files, etc.

But personally, I'd use tail or grep or sed to skip over that first
character, which in this case is the same as a blank line.

$ tail -n +2
$ tail -c +2
$ grep .
$ sed -ne '/./p'
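
A quick way to check that the four commands agree is to run them on a
small file whose first byte is an LF. This is only a sketch: the sample
data and variable names are made up for illustration. Note that grep and
sed drop every empty line, not just the first, so they match the tail
variants only when the file has no other blank lines.

```shell
# Hypothetical sample: a leading LF followed by two data lines.
sample=$(mktemp)
printf '\nid,count\n1,42\n' > "$sample"

by_line=$(tail -n +2 "$sample")      # start output at line 2
by_byte=$(tail -c +2 "$sample")      # start output at byte 2
by_grep=$(grep . "$sample")          # keep lines with at least one character
by_sed=$(sed -ne '/./p' "$sample")   # same filter, written in sed

printf '%s\n' "$by_line"
rm -f "$sample"
```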

That sed could be extended to remove the space after commas and the
trailing comma on each line.  So in the end, you'd have just two commands:
tr and sed.

$ time -p < hatchery_returns-2019-08-12.csv \
> tr '\r' '\n' |
> sed -ne 's/, /,/g;s/,$//;/./p' |
> md5
a02aa3be8cbe68e1b76debbd0b1586e7
real 1.31
user 1.96
sys 0.03
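
To see what each stage of that pipeline does, here is a minimal sketch on
a tiny made-up input. The sample rows are hypothetical and only mimic the
shape described in the thread: a leading LF, CR-terminated records, a
space after each comma, and a trailing comma.

```shell
# Hypothetical sample: leading LF, CR line endings, ", " separators,
# and a trailing comma on each record.
sample=$(mktemp)
printf '\nid, name, count,\r1, coho, 42,\r' > "$sample"

# tr turns each CR into LF; sed removes the space after commas,
# strips the trailing comma, and prints only non-empty lines.
cleaned=$(tr '\r' '\n' < "$sample" | sed -ne 's/, /,/g;s/,$//;/./p')

printf '%s\n' "$cleaned"
rm -f "$sample"
```

The leading LF becomes an empty line after tr, and the /./p at the end of
the sed script is what drops it.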

Of course, dd can be added to the mix:

$ time -p < hatchery_returns-2019-08-12.csv \
> dd bs=1 iseek=1 |
> tr '\r' '\n' |
> sed -ne 's/, /,/g;s/,$//;/./p' |
> md5
12746088+0 records in
12746088+0 records out
12746088 bytes transferred in 38.344862 secs (332407 bytes/sec)
a02aa3be8cbe68e1b76debbd0b1586e7
real 38.35
user 22.58
sys 40.10

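As the md5 shows, you get the same results, though with bs=1 the dd
version took about 38 seconds against 1.3, since it copies one byte at a
time.  Beyond that, it's a matter of personal preference.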
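
For anyone whose dd does not understand iseek=1, as noted above for GNU
dd, skip=1 is the POSIX spelling of the same operand. A minimal sketch,
using a made-up sample file rather than the real CSV:

```shell
# Hypothetical sample: leading LF, then CR-terminated records.
sample=$(mktemp)
printf '\nid,count\r1,42\r' > "$sample"

# With bs=1, skip=1 skips one 1-byte input block: the leading LF.
with_dd=$(dd bs=1 skip=1 2>/dev/null < "$sample" | tr '\r' '\n')

# tail -c +2 starts output at the second byte, so it skips the same byte.
with_tail=$(tail -c +2 "$sample" | tr '\r' '\n')

[ "$with_dd" = "$with_tail" ] && echo "same output"
rm -f "$sample"
```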

Regards,
-  Robert


