Radiant MLHub

CSU Synthetic Attribution Benchmark Dataset


This synthetic dataset is intended for users interested in benchmarking explainable artificial intelligence (XAI) methods for geoscientific applications. It is inspired by a climate forecasting setting (seasonal timescales), where the task is to predict regional climate variability given global climate information lagged in time. The dataset consists of a synthetic input X (a series of 2D arrays of random fields drawn from a multivariate normal distribution) and a synthetic output Y (a scalar series) generated by a nonlinear function F: R^d -> R.
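The construction described above can be sketched in a few lines of NumPy. This is only an illustration of the general idea, not the procedure used to build the dataset: the sample count, feature count, covariance kernel, and the additively separable piecewise-linear form chosen for F are all assumptions made here so that the example stays small and the attribution is easy to write down.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real dataset uses flattened global 2D fields.
n_samples, d = 1000, 20

# Assumed covariance: correlation decays exponentially with index distance.
idx = np.arange(d)
cov = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 3.0)

# Synthetic input X: independent draws from a multivariate normal.
X = rng.multivariate_normal(np.zeros(d), cov, size=n_samples)

# Assumed form of F: a sum of local piecewise-linear terms C_i(x_i),
# with a different slope on each side of zero (hence nonlinear).
slope_pos = rng.normal(size=d)
slope_neg = rng.normal(size=d)

def local_terms(x):
    """Contribution C_i(x_i) of every feature, same shape as x."""
    return np.where(x >= 0, slope_pos * x, slope_neg * x)

# For an additively separable F, the contribution of feature i to the
# output is simply C_i(x_i), so the attribution is known by construction.
attribution = local_terms(X)   # shape (n_samples, d)
Y = attribution.sum(axis=1)    # synthetic scalar output series
```

Under this assumed form of F, each row of `attribution` sums exactly to the corresponding value of `Y`, which is what makes it usable as a reference when checking an XAI method.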

The synthetic input aims to represent temporally independent realizations of anomalous global fields of sea surface temperature; the synthetic output series represents some type of regional climate variability of interest (temperature, precipitation totals, etc.); and the function F is a simplification of the climate system.

Because the nonlinear function F used to generate the output from the input is known, we also derive and provide the attribution of each output value to the corresponding input features. With this synthetic dataset, users can train any AI model to predict Y given X and then apply XAI methods to interpret it. The user can then assess the faithfulness of any XAI method against the “ground truth” attribution of F.
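One simple way to compare an XAI attribution map with the ground truth is a per-sample Pearson correlation across input features. The sketch below is an assumption about how such a comparison could be set up, not a metric prescribed by the dataset; the function name `attribution_correlation` and the toy arrays are hypothetical stand-ins for the provided attributions and for the output of an XAI method.

```python
import numpy as np

def attribution_correlation(truth, estimate):
    """Pearson correlation between two attribution maps, per sample.

    Both inputs have shape (n_samples, n_features); the result has
    shape (n_samples,). Values near 1 indicate close agreement.
    """
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    # Center each sample's attribution vector across features.
    t = truth - truth.mean(axis=1, keepdims=True)
    e = estimate - estimate.mean(axis=1, keepdims=True)
    denom = np.sqrt((t ** 2).sum(axis=1) * (e ** 2).sum(axis=1))
    return (t * e).sum(axis=1) / denom

# Toy stand-ins: a "ground truth" map and a slightly noisy estimate of it.
rng = np.random.default_rng(0)
truth = rng.normal(size=(5, 50))
noisy = truth + 0.1 * rng.normal(size=truth.shape)

scores = attribution_correlation(truth, noisy)
print(scores.round(3))
```

A correlation ignores the overall scale of the attributions, which is convenient when XAI methods differ in units, but it also means that a method with the right pattern and the wrong magnitude still scores well.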

NOTE: the spatial configuration of the observations in the NetCDF database file conforms to the planetocentric coordinate system (89.5N - 89.5S, 0.5E - 359.5E), where longitude is measured positive eastward from the prime meridian.
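Many plotting and regridding tools expect longitudes in the range -180 to 180, so the 0.5E - 359.5E convention noted above may need converting. The sketch below builds coordinate vectors for the stated bounds and shows the conversion. The 1-degree spacing is an assumption inferred from the half-degree cell centers, and the variable names are hypothetical; neither is read from the NetCDF file.

```python
import numpy as np

# Cell-center coordinates for the stated bounds, assuming 1-degree spacing.
lat = np.arange(89.5, -90.0, -1.0)   # 89.5N down to 89.5S
lon = np.arange(0.5, 360.0, 1.0)     # 0.5E to 359.5E, positive eastward

# Wrap longitudes from [0, 360) into [-180, 180).
lon_wrapped = ((lon + 180.0) % 360.0) - 180.0

# Column order that makes the wrapped longitudes increase monotonically;
# apply the same order to the longitude axis of any data array.
order = np.argsort(lon_wrapped)
lon_sorted = lon_wrapped[order]

# Example with a dummy field of shape (lat, lon).
field = np.zeros((lat.size, lon.size))
field_sorted = field[:, order]
```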

Dataset ID





Colorado State University (CSU), Cooperative Institute for Research in the Atmosphere (CIRA)





Mamalakis, A., Ebert-Uphoff, I. & Barnes, E. (2022) "CSU Synthetic Attribution Benchmark Dataset", Version 1.0, Radiant MLHub [Date Accessed] https://doi.org/10.34911/rdnt.8snx6c

Python Client example

from radiant_mlhub import Dataset

# Fetch the dataset metadata by its ID, then list the collections it contains
ds = Dataset.fetch('csu_synthetic_attribution')
for c in ds.collections:
    print(c.id)


Labels Collection


CSU Synthetic Attribution Benchmark Dataset - NetCDF Collection



Collection ID