[PLUG] Removing Duplicate Rows from SQL Dump [SOLVED]

Rich Shepard rshepard at appl-ecosys.com
Tue Aug 16 15:28:38 UTC 2011


On Tue, 16 Aug 2011, Hal Pomeranz wrote:

> 	sort -u -k1,4 inputfile >inputfile.de-duped

   Wow! I have learned so much today about built-in tools that solve major
headaches in data cleaning.

   Running the results of uniq through sort (as above, but specifying -k2,4)
reduced the file from 8,605 rows to 5,540. And this is down from 12,500+
originally.
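
   For the archive, a minimal sketch of what the keyed sort does, assuming
whitespace-separated fields. The file names and rows are made up for
illustration and are not the actual dump; LC_ALL=C is an added assumption to
keep the comparison byte-wise and locale-independent.

```shell
# Hypothetical sample: field 1 is a row id, fields 2-4 hold the real data.
printf '%s\n' \
  '1 siteA 2011-08-01 4.2' \
  '2 siteB 2011-08-01 3.9' \
  '3 siteA 2011-08-01 4.2' \
  '4 siteC 2011-08-02 5.0' > sample.txt

# Keep one line per distinct combination of fields 2 through 4.
# Rows 1 and 3 differ only in field 1, so only one of them survives.
LC_ALL=C sort -u -k2,4 sample.txt > sample.de-duped

wc -l < sample.txt        # row count before
wc -l < sample.de-duped   # row count after
```

   Because the key starts at field 2, rows that differ only in the leading
id column compare as equal, which a plain uniq would not catch.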

   My grateful thanks to all of you. Not only have I learned valuable uses of
common tools but you've saved me days of work.

Rich


