[PLUG] Removing Duplicate Rows from SQL Dump [SOLVED]
Rich Shepard
rshepard at appl-ecosys.com
Tue Aug 16 15:28:38 UTC 2011
On Tue, 16 Aug 2011, Hal Pomeranz wrote:
> sort -u -k1,4 inputfile >inputfile.de-duped
Wow! I have learned so much today about built-in tools that solve major
headaches in data cleaning.
Running the results of uniq through sort (as above, but specifying -k2,4)
reduced the file from 8605 rows to 5540. And this is down from 12,500+
originally.
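For anyone finding this thread later, here is a minimal sketch of the two-step cleanup on a small made-up file. The file names and sample rows are hypothetical, not my actual data, and it assumes whitespace-separated fields where field 1 is a row id and fields 2 through 4 are what make a row a duplicate.

```shell
# Hypothetical sample data: field 1 is a row id, fields 2-4 hold the values.
cat > sample.txt <<'EOF'
1 alpha 2011-08-01 3.5
2 alpha 2011-08-01 3.5
3 beta 2011-08-02 1.0
3 beta 2011-08-02 1.0
EOF

# Step 1: uniq drops adjacent lines that are identical in every field.
# Only the repeated "3 beta ..." line goes away here.
uniq sample.txt > sample.uniq

# Step 2: sort -u with -k2,4 compares only fields 2 through 4, so rows
# that differ just in the id (field 1) collapse to a single row.
sort -u -k2,4 sample.uniq > sample.de-duped

# Row counts before and after each step.
wc -l sample.txt sample.uniq sample.de-duped
```

Note that uniq only catches duplicates that sit next to each other, which is why the sort step still finds more to remove, and that sort -u keeps just one row from each group of rows with matching keys.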
My grateful thanks to all of you. Not only have I learned valuable uses of
common tools but you've saved me days of work.
Rich