[PLUG] Correcting duplicate strings in files
Rich Shepard
rshepard at appl-ecosys.com
Tue Jun 19 23:24:38 UTC 2018
On Tue, 19 Jun 2018, Carl Karsten wrote:
> Python will be the easiest to understand.
> Is it always 16:00, or should the second line's hour be bumped any
> time the whole line is duplicated?
Carl,
The values may differ by hour. It's only the second 16:00 hour each day that
is incorrect.
> Also, if you have one line for every hour of the year, how about
> looping over all those datetimes, paired up with your data, and replacing
> all the datetimes (both good and flawed) with the calculated datetime?
I have everything correct but for the duplicated 4pms.
> Here is 1/2 of it:
>
> from datetime import datetime, timedelta
>
> for h in range(8760):
>     timestamp = datetime(2012,1,1) + timedelta(hours=h)
>     data_line = "{},{}".format(
>         timestamp.strftime("%Y-%m-%d,%H:%M"),
>         "123.456")
>     print(data_line)
Here's my test file (test.dat):
2012-10-01,14:00,90.7999
2012-10-01,15:00,90.8121
2012-10-01,16:00,90.8121
2012-10-01,16:00,90.8121
2012-10-01,18:00,90.8091
2012-10-01,19:00,90.8030
I know it can be done in awk with a flag, but I don't know how to do this
correctly. :-)
Thanks,
Rich
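
For reference, the flag idea described above could be sketched in Python
roughly as follows. This is only a minimal sketch, not a tested solution for
the real data: the function name fix_duplicate_hours is made up for
illustration, and it assumes every line has the form date,HH:MM,value, that a
duplicated timestamp appears exactly twice in a row, and that the second copy
should be moved one hour later.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y-%m-%d,%H:%M"

def fix_duplicate_hours(lines):
    """Move the second of two identical consecutive timestamps one hour later.

    Hypothetical helper: assumes 'date,HH:MM,value' lines and that a
    duplicate occurs at most twice in a row.
    """
    fixed = []
    prev_stamp = None  # timestamp of the previous line, as originally read
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # Pass blank lines through untouched.
            fixed.append(line)
            continue
        date, time, value = line.split(",")
        stamp = datetime.strptime(date + "," + time, STAMP_FORMAT)
        if stamp == prev_stamp:
            # Same timestamp as the line before: bump this one by an hour.
            bumped = stamp + timedelta(hours=1)
            line = "{},{}".format(bumped.strftime(STAMP_FORMAT), value)
        prev_stamp = stamp
        fixed.append(line)
    return fixed

# The test data from the message above.
sample = [
    "2012-10-01,14:00,90.7999",
    "2012-10-01,15:00,90.8121",
    "2012-10-01,16:00,90.8121",
    "2012-10-01,16:00,90.8121",
    "2012-10-01,18:00,90.8091",
    "2012-10-01,19:00,90.8030",
]

for row in fix_duplicate_hours(sample):
    print(row)
```

Because the function accepts any iterable of lines, an open file object for
test.dat could be passed in place of the sample list. The comparison is against
the previous line's timestamp as read, so only the second of a repeated pair
changes; a run of three or more identical timestamps would need different
handling.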