import polars as pl
from pyprojroot.here import here
France Grid
The goal of this project is to execute and document a data pipeline for aspects of the French electrical grid. These are data made available publicly via APIs from RTE France.
I understand RTE France’s terms and conditions allow for republication, so long as the data are credited to RTE France, and are not distorted.
When finished, this site will publish tables as parquet files:
- generation sources, e.g. wind, solar
- exchanges with other countries, e.g. England, Belgium
- outdoor temperature in Paris (I know there’s more to France than Paris), not yet implemented
This will also be an opportunity for me to develop my skills with:
- Polars: alternative to Pandas
- DVC: pipeline orchestration and remote data management
- Quarto: technical publishing (this website)
- GitHub Actions: run the pipeline on a schedule
I am also developing an Observable notebook to be a consumer of this pipeline.
There are two sections you can access from the menu bar: the pipeline section contains the files in the pipeline; the about section has a little more material on how this pipeline was put together.
Data
In this section, we summarize the published data, as of the last run of the pipeline.
Each API call to fetch data from RTE France contains, at most, two weeks of data. The pipeline runs on a daily schedule, so it will take a number of days before the pipeline “catches up” to the present day.
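As a rough sketch of what that implies (not the pipeline's actual code; the function name and dates are made up for the example), stepping through an archive under a two-week-per-call limit might look like this:

```python
from datetime import date, timedelta

# maximum span of a single RTE France API call, per the note above
WINDOW = timedelta(days=14)

def windows(start: date, end: date):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + WINDOW, end)
        yield lo, hi
        lo = hi

# for example, the windows needed to cover the first two months of 2017
list(windows(date(2017, 1, 1), date(2017, 3, 1)))
```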
Generation
generation = pl.read_parquet(here("data/99-publish/standard/generation.parquet"))
Two parquet files are published, each with the same information: a standard version and a fake-UTC version.
Because of JavaScript’s current timezone limitations (soon to be solved), I am writing a version of the data where the date-times are projected into UTC, preserving the wall-clock time; these are the fake-UTC data.
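As a sketch of that projection (assuming Polars’ `dt.replace_time_zone`, which re-labels the time zone while keeping the wall-clock time), the fake-UTC table could be derived from the standard one like this:

```python
# re-label the Europe/Paris date-times as UTC, preserving the wall-clock time
fake_utc = generation.with_columns(
    pl.col("interval_start").dt.replace_time_zone("UTC"),
    pl.col("interval_end").dt.replace_time_zone("UTC"),
)
```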
The last few observations:
generation.tail()
type | interval_start | interval_end | generation |
---|---|---|---|
str | datetime[ms, Europe/Paris] | datetime[ms, Europe/Paris] | i64 |
"HYDRO" | 2023-08-09 23:45:00 CEST | 2023-08-10 00:00:00 CEST | 1314 |
"NUCLEAR" | 2023-08-09 23:45:00 CEST | 2023-08-10 00:00:00 CEST | 31675 |
"PUMPING" | 2023-08-09 23:45:00 CEST | 2023-08-10 00:00:00 CEST | -1 |
"SOLAR" | 2023-08-09 23:45:00 CEST | 2023-08-10 00:00:00 CEST | 208 |
"WIND" | 2023-08-09 23:45:00 CEST | 2023-08-10 00:00:00 CEST | 2348 |
`type`
: type of generation

`interval_start`, `interval_end`
: date-times describing the interval

`generation`
: average (?) of generation during this interval (MW)
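If `generation` really is an average power over each 15-minute interval (an assumption, given the question mark above), a daily-energy estimate per source could be sketched as:

```python
# sketch: 15-minute average power (MW) × 0.25 h ≈ energy (MWh), summed per day
daily_energy = (
    generation.with_columns(
        (pl.col("generation") * 0.25).alias("energy_mwh"),
        pl.col("interval_start").dt.date().alias("date"),
    )
    .groupby(["date", "type"])
    .agg(pl.col("energy_mwh").sum())
)
```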
We count the number of observations and null values for the generation data (the counts will be the same for both files):
"type")).agg(
generation.groupby(pl.col("interval_start").min(),
pl.col("interval_end").max(),
pl.col("generation").count().alias("n_observations"),
pl.col("generation").null_count().alias("n_value_null"),
pl.col( )
type | interval_start | interval_end | n_observations | n_value_null |
---|---|---|---|---|
str | datetime[ms, Europe/Paris] | datetime[ms, Europe/Paris] | u32 | u32 |
"NUCLEAR" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
"EXCHANGE" | 2017-01-01 00:00:00 CET | 2023-07-27 18:30:00 CEST | 230254 | 0 |
"FOSSIL_GAS" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
"FOSSIL_OIL" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
"PUMPING" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
"FOSSIL_HARD_CO… | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
"SOLAR" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
"WIND" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
"HYDRO" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231523 | 0 |
"BIOENERGY" | 2017-01-01 00:00:00 CET | 2023-08-10 00:00:00 CEST | 231524 | 0 |
I don’t know, right now, why "HYDRO" has one fewer observation than the others; I’ll try to find out!
"interval_start")).agg(
generation.groupby(pl.col("generation").count().alias("n_observations")
pl.col("n_observations")).head(2) ).sort(pl.col(
interval_start | n_observations |
---|---|
datetime[ms, Europe/Paris] | u32 |
2023-07-29 14:30:00 CEST | 9 |
2023-07-28 07:00:00 CEST | 9 |
I don’t think this was at the end of an API call…
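One way I might track these gaps down is an anti-join between the full grid of (interval, type) pairs and the observations actually present; a sketch (not part of the pipeline):

```python
# sketch: list the (interval_start, type) pairs that are expected but absent
expected = (
    generation.select("interval_start").unique()
    .join(generation.select("type").unique(), how="cross")
)
expected.join(generation, on=["interval_start", "type"], how="anti")
```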
Flow
flow = pl.read_parquet(here("data/99-publish/standard/flow.parquet"))
Two parquet files are published (a standard version and a fake-UTC version), each with the same information.
The last few observations:
flow.tail()
partner | interval_start | interval_end | flow_net |
---|---|---|---|
str | datetime[ms, Europe/Paris] | datetime[ms, Europe/Paris] | i64 |
"Belgium" | 2017-07-10 23:00:00 CEST | 2017-07-11 00:00:00 CEST | 1154 |
"England-IFA" | 2017-07-10 23:00:00 CEST | 2017-07-11 00:00:00 CEST | -2023 |
"Germany" | 2017-07-10 23:00:00 CEST | 2017-07-11 00:00:00 CEST | -662 |
"Italy" | 2017-07-10 23:00:00 CEST | 2017-07-11 00:00:00 CEST | -1150 |
"Switzerland" | 2017-07-10 23:00:00 CEST | 2017-07-11 00:00:00 CEST | 527 |
`partner`
: interchange, usually a country

`interval_start`, `interval_end`
: date-times describing the interval

`flow_net`
: average (?) of power flow during this interval (MW); positive means France received power
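Given that sign convention, a quick, informal summary of the average net flow with each partner (not part of the published pipeline) could be:

```python
# average net flow per partner (MW); positive means France received power
flow.groupby("partner").agg(
    pl.col("flow_net").mean().alias("mean_flow_net_mw")
).sort("mean_flow_net_mw")
```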
We count the number of observations and null values for the flow data (the counts will be the same for both files):
"partner")).agg(
flow.groupby(pl.col("interval_start").min(),
pl.col("interval_end").max(),
pl.col("flow_net").count().alias("n_observations"),
pl.col("flow_net").null_count().alias("n_value_null"),
pl.col( )
partner | interval_start | interval_end | n_observations | n_value_null |
---|---|---|---|---|
str | datetime[ms, Europe/Paris] | datetime[ms, Europe/Paris] | u32 | u32 |
"Switzerland" | 2017-01-01 00:00:00 CET | 2017-07-11 00:00:00 CEST | 4574 | 0 |
"Belgium" | 2017-01-01 00:00:00 CET | 2017-07-11 00:00:00 CEST | 4574 | 0 |
"Germany" | 2017-01-01 00:00:00 CET | 2017-07-11 00:00:00 CEST | 4574 | 0 |
"Italy" | 2017-01-01 00:00:00 CET | 2017-07-11 00:00:00 CEST | 4574 | 0 |
"Spain" | 2017-01-01 00:00:00 CET | 2017-06-27 16:00:00 CEST | 4254 | 0 |
"England-IFA" | 2017-01-01 00:00:00 CET | 2017-07-11 00:00:00 CEST | 4574 | 0 |
Secrets
To interact with the APIs and data storage, the code in this report expects certain environment variables to be set (a sketch of how they might be read follows this list):
`AWS_ACCESS_KEY_ID`, `AWS_ACCESS_KEY_SECRET`
: can also be set using the `aws` CLI; if you clone this repo, you will likely need to configure your own remote storage.

`RTE_FRANCE_BASE64`
: base-64 encoding, available from the RTE application page
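A minimal sketch (not the repository’s actual code) of how these variables might be read at runtime; treating `RTE_FRANCE_BASE64` as the credential string for a Basic authorization header is an assumption on my part:

```python
import os

# credentials for the remote data storage
aws_key = os.environ["AWS_ACCESS_KEY_ID"]
aws_secret = os.environ["AWS_ACCESS_KEY_SECRET"]

# base-64 credential string copied from the RTE application page
rte_auth = os.environ["RTE_FRANCE_BASE64"]

# assumption: the RTE France token request uses this string in a Basic header
headers = {"Authorization": f"Basic {rte_auth}"}
```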
These allow you access to an application (that you will have to configure on your RTE France account); this application will need access to these APIs: