[PLUG] Removing Duplicate Rows from SQL Dump
Denis Heidtmann
denis.heidtmann at gmail.com
Tue Aug 16 15:38:14 UTC 2011
On Tue, Aug 16, 2011 at 8:20 AM, Rich Shepard <rshepard at appl-ecosys.com> wrote:
> On Tue, 16 Aug 2011, Roderick A. Anderson wrote:
>
> > Can we see another snapshot of the data? And (did I miss it?) which three
> > columns.
>
> Rod,
>
> Yep, and yep.
>
> Data:
>
> \N CVS 1994-01-20 Conductance, Specific 460 uS/cm t
> \N \N \N
> \N CVS 1994-01-20 Conductance, Specific 522 uS/cm t
> \N \N \N
>
> (Fred: I think that pg_dump does use tabs as column separators, and there
> are spaces within a column as the above demonstrates. These data were
> extracted from Excel spreadsheets.)
>
> The three columns are the second, third, and fourth, named loc_name,
> sample_date, and param.
>
> The current client staff can't figure out either how there could be two
> different values for specific conductance at the same location on the same
> date when both were supposedly checked for quality (the 't' in the seventh
> column).
>
> Rich
>
I have no idea how to actually do it, but how is this as a strategy?
1. Add a unique column (a record number) to every row.
2. Remove column 2.
3. Remove duplicate entries from the result.
4. Note which record numbers were removed, and remove those records from the original.
5. Repeat for columns 3 and 4.
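
For what it is worth, here is a rough Python sketch of a related approach, not a tested solution. It tags each row with a record number as suggested above, but instead of dropping one column at a time it builds a key from columns 2, 3, and 4 together (loc_name, sample_date, param) and reports every later row that repeats a key. It assumes the dump has one tab-separated row per line with at least four columns; the names find_duplicates, key_cols, and sample_rows are made up for illustration.

```python
def find_duplicates(lines, key_cols=(1, 2, 3)):
    """Split rows into first occurrences and later repeats of the same key.

    lines    -- iterable of tab-separated rows (assumed: one row per line)
    key_cols -- zero-based column indexes forming the key; the default
                (1, 2, 3) is columns 2-4: loc_name, sample_date, param
    Returns (kept, dropped), each a list of (record_no, fields) tuples.
    """
    seen = set()
    kept, dropped = [], []
    # record_no plays the role of the added unique column
    for record_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        key = tuple(fields[i] for i in key_cols)
        if key in seen:
            dropped.append((record_no, fields))
        else:
            seen.add(key)
            kept.append((record_no, fields))
    return kept, dropped


# Hypothetical input: the two rows from Rich's message, with tabs restored
sample_rows = [
    "\\N\tCVS\t1994-01-20\tConductance, Specific\t460\tuS/cm\tt\t\\N\t\\N\t\\N",
    "\\N\tCVS\t1994-01-20\tConductance, Specific\t522\tuS/cm\tt\t\\N\t\\N\t\\N",
]

kept, dropped = find_duplicates(sample_rows)
for record_no, fields in dropped:
    print(record_no, "\t".join(fields))
```

Note that this only keeps the first row it sees for each key and lists the rest. Where the repeated rows carry different values, as with 460 and 522 above, someone still has to decide which one is right, so the dropped list is probably best treated as a review list rather than deleted outright.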
-Denis