[PLUG] Correcting duplicate strings in files

Rich Shepard rshepard at appl-ecosys.com
Tue Jun 19 23:24:38 UTC 2018


On Tue, 19 Jun 2018, Carl Karsten wrote:

> Python will be the easiest to understand.
> Is it always 16:00, or should the second line's hour be bumped any
> time a whole line is duplicated?

Carl,

   The values may differ by hour. It's only the second 16:00 hour each day that
is incorrect.

> Also, if you have one line for every hour of the year, how about
> looping over all those datetimes, paired up with your data, and replacing
> all the datetimes (both good and flawed) with the calculated datetime?

   I have everything correct but for the duplicated 4pms.

> Here is 1/2 of it:
>
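> from datetime import datetime, timedelta
>
> # 2012 is a leap year: 366 days * 24 hours = 8784 hourly rows
> for h in range(8784):
>     timestamp = datetime(2012, 1, 1) + timedelta(hours=h)
>     # "123.456" is a placeholder for the real data value
>     data_line = "{},{}".format(
>         timestamp.strftime("%Y-%m-%d,%H:%M"),
>         "123.456")
>     print(data_line)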
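
   The other half of that idea, pairing each calculated timestamp with the
value already in the file, might look roughly like the sketch below. It
assumes the file has exactly one line per hour with no gaps, that the fields
are date,time,value, and that the first timestamp is known; the function name
renumber and the start argument are made up for illustration.

```python
from datetime import datetime, timedelta

def renumber(lines, start):
    """Replace every date,time pair with start + n hours; keep the value."""
    out = []
    for n, line in enumerate(lines):
        # Keep only the third field (the data value) from the original line.
        value = line.strip().split(",", 2)[2]
        timestamp = start + timedelta(hours=n)
        out.append("{},{}".format(timestamp.strftime("%Y-%m-%d,%H:%M"), value))
    return out

sample = [
    "2012-10-01,14:00,90.7999",
    "2012-10-01,15:00,90.8121",
    "2012-10-01,16:00,90.8121",
    "2012-10-01,16:00,90.8121",
    "2012-10-01,18:00,90.8091",
    "2012-10-01,19:00,90.8030",
]
for row in renumber(sample, datetime(2012, 10, 1, 14)):
    print(row)
```

   This only works if no hours are missing from the file; a single dropped
line would shift every timestamp after it.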

   Here's my test file (test.dat):

2012-10-01,14:00,90.7999
2012-10-01,15:00,90.8121
2012-10-01,16:00,90.8121
2012-10-01,16:00,90.8121
2012-10-01,18:00,90.8091
2012-10-01,19:00,90.8030

   I know it can be done in awk with a flag, but I don't know how to do this
correctly. :-)
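
   For illustration, the flag idea could be sketched in Python (the language
suggested above) along these lines. This is only a sketch under assumptions:
every line has exactly the three fields date,time,value, and the second 16:00
on a given date should become 17:00. The function name fix_duplicate_hours
and its dup/fixed parameters are hypothetical.

```python
def fix_duplicate_hours(lines, dup="16:00", fixed="17:00"):
    """Rewrite a repeated `dup` time on the same date to `fixed`."""
    seen_date = None  # the flag: date on which `dup` was last seen
    out = []
    for line in lines:
        # Assumes exactly three comma-separated fields per line.
        date, time, value = line.rstrip("\n").split(",")
        if time == dup:
            if seen_date == date:
                time = fixed      # repeat on the same date: bump the hour
            else:
                seen_date = date  # first occurrence on this date: set the flag
        out.append(",".join((date, time, value)))
    return out

sample = [
    "2012-10-01,14:00,90.7999",
    "2012-10-01,15:00,90.8121",
    "2012-10-01,16:00,90.8121",
    "2012-10-01,16:00,90.8121",
    "2012-10-01,18:00,90.8091",
    "2012-10-01,19:00,90.8030",
]
for row in fix_duplicate_hours(sample):
    print(row)
```

   Keeping the date in the flag rather than a plain true/false means it
resets by itself when the next day starts.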

Thanks,

Rich




