[PLUG] 'sort' to find duplicate rows

Rich Shepard rshepard at appl-ecosys.com
Fri Jul 13 22:59:26 UTC 2012


   The text file has more than 120k rows; each row has 8 columns. There are
duplicate rows that I want to eliminate. My reading of the sort man page and
various Web pages with examples tells me that the sort --key option is
limited to a single contiguous range, from a starting field to an ending
field. What I need is to sort on fields 1, 2, and 4.

   If 'sort' won't do this, what tool will? I don't see how awk, sed, or grep
can, yet a combination of these perhaps might.

Rich
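
A minimal sketch of one possible approach to the question above, not a
definitive answer. It assumes whitespace-separated columns and uses a small
made-up sample file in place of the real data; the variable names are
hypothetical. It relies on two things: sort accepts more than one -k option,
each naming a single field as start,end, and awk can index an array by
several fields at once.

```shell
# Hypothetical sample data: 8 whitespace-separated columns per row.
f=$(mktemp)
cat > "$f" <<'EOF'
a b x d 1 2 3 4
a b y d 5 6 7 8
a c x d 1 2 3 4
a b x e 1 2 3 4
a b x d 1 2 3 4
EOF

# Whole-row duplicates: sort and keep one copy of each identical row.
whole=$(sort -u "$f")

# Duplicates judged on fields 1, 2, and 4 only: one -k per field,
# and -u keeps a single row for each distinct combination of keys.
by_key=$(sort -u -k1,1 -k2,2 -k4,4 "$f")

# Same idea in awk: print a row only the first time its
# (field 1, field 2, field 4) combination is seen, in original order.
first_seen=$(awk '!seen[$1, $2, $4]++' "$f")

printf '%s\n' "$first_seen"
rm -f "$f"
```

With sort -u and keys, which of the rows sharing a key survives is up to
sort; the awk form always keeps the first one encountered and leaves the
remaining rows in their original order. If the real file uses a different
delimiter, sort's -t option and awk's -F option would need to be set to
match.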



