Generation Publish

The publish step starts by reading the transformed generation table:

import json
import os

import polars as pl
from pyprojroot.here import here

# here() resolves paths relative to the project root
table = pl.read_parquet(here("data/02-transform/generation.parquet"))

JavaScript does not (yet) have a timezone database available; you can work in the browser's time zone, or you can work in UTC. The idea here is to use a “fake UTC”: we project the date-times from their original time zone (Europe/Paris) to UTC while preserving the wall-clock time. This simplifies date-time math in JavaScript, at the price of introducing a gap at the spring daylight-saving transition (a wall-clock hour is skipped) and a duplication at the autumn transition (a wall-clock hour occurs twice).

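# replace_time_zone() keeps the wall-clock time and changes only the time zone;
# it is available directly on expressions, so no map() with a lambda is needed
table_fake_utc = table.with_columns(
    pl.col(["interval_start", "interval_end"]).dt.replace_time_zone(time_zone="UTC")
)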
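
To make the gap and the duplication concrete, here is a minimal sketch that uses only the Python standard library rather than polars. The helper names `fake_utc` and `hourly` are made up for this illustration, and the dates are the 2023 transitions for Europe/Paris; it assumes time zone data is available to `zoneinfo`.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")


def fake_utc(instant: datetime) -> datetime:
    """Keep the Europe/Paris wall-clock reading, but label it as UTC."""
    return instant.astimezone(PARIS).replace(tzinfo=timezone.utc)


def hourly(start: datetime, n: int) -> list[datetime]:
    """Return n consecutive hourly instants beginning at start."""
    return [start + timedelta(hours=i) for i in range(n)]


# Autumn 2023: clocks in Paris go back one hour on 29 October at 01:00 UTC.
autumn = [fake_utc(t) for t in hourly(datetime(2023, 10, 28, 23, tzinfo=timezone.utc), 4)]

# Spring 2023: clocks in Paris go forward one hour on 26 March at 01:00 UTC.
spring = [fake_utc(t) for t in hourly(datetime(2023, 3, 25, 23, tzinfo=timezone.utc), 4)]

print([t.hour for t in autumn])  # wall-clock hours across the autumn transition
print([t.hour for t in spring])  # wall-clock hours across the spring transition
```

In the autumn list, two distinct instants map to the same fake-UTC value (02:00), so they can no longer be told apart; in the spring list, 02:00 never appears. Any chart or calculation built on the fake-UTC table inherits these two artifacts.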

We publish both the standard and fake-UTC tables:

path_standard = here("data/99-publish/standard")
os.makedirs(path_standard, exist_ok=True)
table.write_parquet(f"{path_standard}/generation.parquet")
path_fake_utc = here("data/99-publish/fake-utc")
os.makedirs(path_fake_utc, exist_ok=True)
table_fake_utc.write_parquet(f"{path_fake_utc}/generation.parquet")

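We also calculate some metadata: the latest interval end for which every generation type has data, i.e. the minimum, over types, of each type's latest interval_end: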

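# latest interval_end per type, then the earliest of those
# (group_by is the current spelling; older polars versions used groupby)
interval_end = (
    table.group_by("type")
    .agg(pl.col("interval_end").max())
    .get_column("interval_end")
    .min()
)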
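
The "minimum of the per-type maxima" logic can be sketched on toy data without polars. The rows below are invented for illustration and are not taken from the real table; `latest_by_type` and `common_end` are hypothetical names.

```python
from datetime import datetime

# Hypothetical rows: (type, interval_end); not taken from the real data.
rows = [
    ("solar", datetime(2023, 1, 1, 10)),
    ("solar", datetime(2023, 1, 1, 11)),
    ("wind", datetime(2023, 1, 1, 10)),
    ("wind", datetime(2023, 1, 1, 12)),
    ("nuclear", datetime(2023, 1, 1, 11)),
]

# Step 1: latest interval_end seen for each type (the group-by + max).
latest_by_type: dict[str, datetime] = {}
for type_, end in rows:
    if type_ not in latest_by_type or end > latest_by_type[type_]:
        latest_by_type[type_] = end

# Step 2: the earliest of those maxima (the final min).
common_end = min(latest_by_type.values())
print(common_end.isoformat())
```

Here wind has data up to 12:00, but solar and nuclear stop at 11:00, so 11:00 is the last interval end that all three types have reached.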

We publish this to a metadata file:

# avoid naming the variable `dict`, which would shadow the built-in type
meta = {"interval_end": interval_end.isoformat()}

with open(here("data/99-publish/generation-meta.json"), "w") as file:
    json.dump(meta, file)
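
As a rough check of what ends up in the file, the sketch below writes a stand-in value to a temporary directory and reads it back. The date-time and its UTC+01:00 offset are assumptions for illustration, not the value computed from the real data.

```python
import json
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Hypothetical value standing in for the computed interval_end (UTC+01:00).
interval_end = datetime(2023, 1, 1, 11, 0, tzinfo=timezone(timedelta(hours=1)))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "generation-meta.json"
    path.write_text(json.dumps({"interval_end": interval_end.isoformat()}))

    # Read the file back and parse the ISO 8601 string into a datetime again.
    text = path.read_text()
    restored = datetime.fromisoformat(json.loads(text)["interval_end"])

print(text)
```

Because `isoformat()` includes the UTC offset for time-zone-aware values, the string identifies an unambiguous instant, which is what a consumer of the metadata file would need.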