GraphCast: Learning skillful medium-range global weather forecasting

This package contains example code to run and train GraphCast. It also provides three pretrained models:

GraphCast, the high-resolution model used in the GraphCast paper (0.25 degree resolution, 37 pressure levels), trained on ERA5 data from 1979 to 2017,
GraphCast_small, a smaller, low-resolution version of GraphCast (1 degree resolution, 13 pressure levels, and a smaller mesh), trained on ERA5 data from 1979 to 2015, useful for running a model under tighter memory and compute constraints,
GraphCast_operational, a high-resolution model (0.25 degree resolution, 13 pressure levels) pre-trained on ERA5 data from 1979 to 2017 and fine-tuned on HRES data from 2016 to 2021. This model can be initialized from HRES data (does not require precipitation inputs).
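To get a feel for what these resolutions mean in terms of data size, the arithmetic can be sketched as follows. The sketch assumes a regular latitude-longitude grid that includes both poles, as in ERA5's 0.25 degree grid. The helper names grid_shape and values_per_variable are hypothetical and are not part of this package.

```python
def grid_shape(resolution_deg):
    """Return (n_lat, n_lon) for a regular lat-lon grid including both poles."""
    n_lat = round(180 / resolution_deg) + 1  # -90 and +90 are both grid rows
    n_lon = round(360 / resolution_deg)      # longitude wraps, so no +1
    return n_lat, n_lon


def values_per_variable(resolution_deg, pressure_levels):
    """Number of values in one atmospheric variable at a single time step."""
    n_lat, n_lon = grid_shape(resolution_deg)
    return n_lat * n_lon * pressure_levels


if __name__ == "__main__":
    models = [
        ("GraphCast", 0.25, 37),
        ("GraphCast_small", 1.0, 13),
        ("GraphCast_operational", 0.25, 13),
    ]
    for name, resolution, levels in models:
        print(name, grid_shape(resolution), values_per_variable(resolution, levels))
```

Under these assumptions, one atmospheric variable in GraphCast_small holds roughly 45 times fewer values per time step than in the full GraphCast model, which is where most of the memory saving comes from.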

The model weights, normalization statistics, and example inputs are available in a Google Cloud Bucket.

Full model training requires downloading the ERA5 dataset, available from ECMWF. It is best accessed as Zarr from WeatherBench2's ERA5 data (see the 6h downsampled versions).

Overview of files

The best starting point is to open graphcast_demo.ipynb in Colaboratory, which gives an example of loading data, generating random weights or loading a pre-trained snapshot, generating predictions, computing the loss, and computing gradients. The one-step implementation of the GraphCast architecture is provided in graphcast.py.

Brief description of library files:

autoregressive.py: Wrapper used to run (and train) the one-step GraphCast
to produce a sequence of predictions by auto-regressively feeding the
outputs back as inputs at each step, in a way that is differentiable in JAX.
casting.py: Wrapper used around GraphCast to make it work using
BFloat16 precision.
checkpoint.py: Utils to serialize and deserialize trees.
data_utils.py: Utils for data preprocessing.
deep_typed_graph_net.py: General purpose deep graph neural network (GNN)
that operates on TypedGraph's where both inputs and outputs are flat
vectors of features for each of the nodes and edges. graphcast.py uses
three of these for the Grid2Mesh GNN, the Multi-mesh GNN and the Mesh2Grid
GNN, respectively.
graphcast.py: The main GraphCast model architecture for one step of
prediction.
grid_mesh_connectivity.py: Tools for converting between regular grids on a
sphere and triangular meshes.
icosahedral_mesh.py: Definition of an icosahedral multi-mesh.
losses.py: Loss computations, including latitude-weighting.
model_utils.py: Utilities to produce flat node and edge vector features
from input grid data, and to convert the node output vectors back
into multilevel grid data.
normalization.py: Wrapper for the one-step GraphCast used to normalize
inputs according to historical values, and targets according to historical
time differences.
predictor_base.py: Defines the interface of the predictor, which GraphCast
and all of the wrappers implement.
rollout.py: Similar to autoregressive.py but used only at inference time,
using a Python loop to produce longer, but non-differentiable, trajectories.
solar_radiation.py: Computes Top-Of-the-Atmosphere (TOA) incident solar
radiation compatible with ERA5. This is used as a forcing variable and thus
needs to be computed for target lead times in an operational setting.
typed_graph.py: Definition of TypedGraph's.
typed_graph_net.py: Implementation of simple graph neural network
building blocks defined over TypedGraph's that can be combined to build
deeper models.
xarray_jax.py: A wrapper to let JAX work with xarrays.
xarray_tree.py: An implementation of tree.map_structure that works with
xarrays.
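
To make the ideas behind rollout.py and losses.py more concrete, here is a minimal plain-Python sketch. It is not the package's implementation. It assumes a toy predictor that maps the two most recent states to the next one, and loss weights proportional to cos(latitude), normalized to have mean 1. The names rollout, extrapolate, latitude_weights and latitude_weighted_mse are hypothetical and do not come from the library.

```python
import math


def rollout(one_step, previous, current, num_steps):
    """Roll a one-step predictor forward with a plain Python loop.

    Each prediction is fed back as the newest input for the next step.
    """
    predictions = []
    for _ in range(num_steps):
        nxt = one_step(previous, current)
        predictions.append(nxt)
        previous, current = current, nxt  # slide the input window forward
    return predictions


def extrapolate(previous, current):
    """Toy stand-in for a learned model: linear extrapolation per grid cell."""
    return [
        [2.0 * c - p for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(previous, current)
    ]


def latitude_weights(latitudes_deg):
    """Weights proportional to cos(latitude), normalized to have mean 1."""
    raw = [math.cos(math.radians(lat)) for lat in latitudes_deg]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def latitude_weighted_mse(prediction, target, latitudes_deg):
    """Mean squared error over a [lat][lon] grid, weighted per latitude row."""
    weights = latitude_weights(latitudes_deg)
    total = 0.0
    count = 0
    for w, pred_row, target_row in zip(weights, prediction, target):
        for p, t in zip(pred_row, target_row):
            total += w * (p - t) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    lats = [-60.0, 0.0, 60.0]
    previous = [[0.0, 0.0] for _ in lats]
    current = [[1.0, 1.0] for _ in lats]
    target = [[2.0, 2.0] for _ in lats]
    trajectory = rollout(extrapolate, previous, current, num_steps=3)
    for step, state in enumerate(trajectory, start=1):
        print(step, latitude_weighted_mse(state, target, lats))
```

The weighting down-weights rows near the poles, where cells of a regular latitude-longitude grid cover less area. A Python loop like this one cannot be differentiated through as a single JAX computation, which matches the distinction drawn above between rollout.py and autoregressive.py.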

Dependencies

Chex, Dask, Haiku, JAX, JAXline, Jraph, Numpy, Pandas, Python, SciPy, Tree, Trimesh and XArray.

License and attribution

The Colab notebook and the associated code are licensed under the Apache License, Version 2.0. You may obtain a copy of the License at: https://www.apache.org/licenses/LICENSE-2.0.

The model weights are made available for use under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). You may obtain a copy of the License at: https://creativecommons.org/licenses/by-nc-sa/4.0/.

The weights were trained on ECMWF's ERA5 and HRES data. The Colab includes a few examples of ERA5 and HRES data that can be used as inputs to the models. ECMWF data products are subject to the following terms:

Copyright statement: Copyright "© 2023 European Centre for Medium-Range Weather Forecasts (ECMWF)".
Source: www.ecmwf.int
Licence Statement: ECMWF data is published under a Creative Commons Attribution 4.0 International (CC BY 4.0). https://creativecommons.org/licenses/by/4.0/
Disclaimer: ECMWF does not accept any liability whatsoever for any error or omission in the data, their availability, or for any loss or damage arising from their use.

Disclaimer

This is not an officially supported Google product.

Citation

If you use this work, consider citing our paper (blog post, Science, arXiv):

@article{lam2023learning,
  title={Learning skillful medium-range global weather forecasting},
  author={Lam, Remi and Sanchez-Gonzalez, Alvaro and Willson, Matthew and Wirnsberger, Peter and Fortunato, Meire and Alet, Ferran and Ravuri, Suman and Ewalds, Timo and Eaton-Rosen, Zach and Hu, Weihua and others},
  journal={Science},
  volume={382},
  number={6677},
  pages={1416--1421},
  year={2023},
  publisher={American Association for the Advancement of Science}
}
