[PLUG] 'sort' to find duplicate rows
John Sechrest
sechrest at gmail.com
Fri Jul 13 23:23:09 UTC 2012
If you are using a straight text comparison, the -u option to sort gives
you a unique list (no duplicates).
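
A minimal sketch of that option follows. The file name data.txt and the sample rows are made up for illustration, and fields are assumed to be separated by single spaces. Without keys, sort -u compares whole lines. The -k option can also be given more than once, so a comparison on fields 1, 2, and 4 only is possible:

```shell
# Hypothetical sample data; data.txt is an assumed file name.
printf '%s\n' \
  'a b c d e' \
  'a b x d y' \
  'a b c d e' \
  'q r s t u' > data.txt

# Whole-line comparison: drops only exact duplicate lines.
sort -u data.txt > unique_lines.txt

# Key-based comparison: -k may be repeated, so fields 1, 2, and 4
# are compared and the other fields are ignored. Do not rely on
# which row of each duplicate group is the one that is kept.
sort -u -k1,1 -k2,2 -k4,4 data.txt > unique_keys.txt
```

If the original row order matters, awk '!seen[$1,$2,$4]++' data.txt is a common alternative that keeps the first row seen for each key without sorting.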
However, if the rows are semantically identical but syntactically different,
this will not work.
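
One way around that is to normalize the rows before comparing them. The sketch below assumes "syntactically different" means differences in letter case and spacing only; the thread does not say what the real differences are, and raw.txt and its rows are invented for the example:

```shell
# Hypothetical sample: the first two rows differ only in case and spacing.
printf '%s\n' \
  'Portland  OR 97201' \
  'portland OR   97201' \
  'Salem OR 97301' > raw.txt

# Lowercase everything, let awk rebuild each row with single spaces
# (assigning $1 to itself forces the rebuild), then drop exact duplicates.
tr '[:upper:]' '[:lower:]' < raw.txt \
  | awk '{ $1 = $1; print }' \
  | sort -u > normalized.txt
```

Other kinds of variation (abbreviations, date formats, and so on) would need their own normalization step ahead of the sort.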
On Fri, Jul 13, 2012 at 3:59 PM, Rich Shepard <rshepard at appl-ecosys.com> wrote:
> The text file has > 120k rows; each row has 8 columns. There are
> duplicate rows that I want to eliminate. My reading of the sort man page
> and various Web pages with examples tells me that the sort --key option
> is limited to a sequential starting field and ending field. What I need
> is to sort on fields 1, 2, and 4.
>
> If 'sort' won't do this, what tool will? I don't see how awk, sed, or
> grep can, yet a combination of these perhaps might.
>
> Rich
--
John Sechrest
sechrest at gmail.com
@sechrest <http://www.twitter.com/sechrest>
http://www.oomaat.com