WP 2 Toolkit development


Since computing analogues over large datasets is computationally heavy, the code will include a parallel scheme in order to take advantage of modern multi-core computer architectures (including laptops). Given the nature of the computer programs and the format of the datasets (generally NetCDF), a Linux interface will be favoured. The core of the toolkit will allow the use of various metrics to compute analogues.
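
As an illustration only, the analogue search with interchangeable metrics and a chunked parallel scheme could be sketched as follows. The toolkit itself is planned in R and shell scripts; Python is used here purely for the example. All names (rms_distance, correlation_distance, best_analogues, the date labels) are hypothetical, fields are assumed to be flattened to plain lists of floats, and a thread pool stands in for whatever parallel backend is eventually chosen (threads keep the sketch portable but would not speed up CPU-bound work in Python).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rms_distance(a, b):
    """Root-mean-square difference between two flattened fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def correlation_distance(a, b):
    """One minus the Pearson correlation (assumes non-constant fields)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(va * vb)


# Hypothetical registry: the metric is selected by name.
METRICS = {"rms": rms_distance, "correlation": correlation_distance}


def best_analogues(target, archive, metric="rms", k=2, workers=2):
    """Return the k closest (distance, date) pairs, best first.

    `archive` maps a date label to a flattened field (list of floats).
    """
    dist = METRICS[metric]
    items = list(archive.items())
    # Split the archive into one interleaved chunk per worker.
    chunks = [items[i::workers] for i in range(workers)]

    def score(chunk):
        return [(dist(target, field), date) for date, field in chunk]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = [pair for part in pool.map(score, chunks) for pair in part]
    return sorted(scored)[:k]


archive = {
    "1990-01-01": [1.0, 2.0, 3.0],
    "1991-01-01": [1.1, 2.1, 3.1],
    "1992-01-01": [3.0, 2.0, 1.0],
    "1993-01-01": [5.0, 5.0, 9.0],
}
target = [1.0, 2.0, 3.2]
for distance, date in best_analogues(target, archive, metric="rms", k=2):
    print(date, round(distance, 3))
```

Keeping the metric behind a name lookup means that adding a new distance only requires registering one more function; the chunking and ranking logic is unchanged.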


The toolkit will output the best flow analogues (and their statistical characteristics) together with simple composite diagnostics of other variables, such as temperature, wind speed and precipitation. It will also provide objective diagnostics of the quality of the analogues, based on the results of WP1, and a statement on the relevance of the analogue scores.
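
A minimal sketch of such outputs, under stated assumptions: the composite is taken as simple summary statistics of a scalar variable over the analogue dates, and the quality score is a placeholder (the rank of the mean analogue distance within a reference sample of distances), since the objective diagnostics themselves are to come from WP1. The function names and sample values are hypothetical, and Python is used for illustration only.

```python
import statistics


def composite(analogue_dates, variable):
    """Summary statistics of `variable` over the analogue dates.

    `variable` maps a date label to a scalar, e.g. a regional-mean
    temperature; gridded fields would be handled point by point.
    """
    values = [variable[d] for d in analogue_dates]
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def distance_percentile(analogue_distances, reference_distances):
    """Fraction of reference distances below the mean analogue distance.

    Placeholder score (an assumption): values near 0 mean the analogues
    are much closer to the target than typical pairs of days.
    """
    mean_d = statistics.fmean(analogue_distances)
    below = sum(1 for r in reference_distances if r < mean_d)
    return below / len(reference_distances)


# Made-up values for demonstration only.
temperature = {"a": 10.0, "b": 12.0, "c": 14.0, "d": 30.0}
print(composite(["a", "b", "c"], temperature))
print(distance_percentile([0.1, 0.3], [0.05, 0.25, 0.5, 1.0]))
```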


We will develop a pseudo real-time interface to retrieve reanalysis data automatically from the appropriate open databases, over selected regions of interest. This module will enable a pseudo real-time determination of the best circulation analogues and of composite temperature and precipitation for a region of choice. An early version of such a module allowed the analysis of extreme events in Europe during 2011 [Cattiaux and Yiou, 2012]. Such an analysis will be carried out routinely for Europe, North America, the Arctic region and East Asia, using the NCEP reanalysis [Kalnay et al., 1996].


The toolkit will also offer the possibility of using long general circulation model (GCM) simulations as the reference database. An interface to CMIP5, PMIP3 and CORDEX data will be designed. Such model simulations offer data with either a longer time span (a century or a millennium) or a finer spatial resolution (0.11° or 0.44°, i.e. roughly 12 km or 50 km) than reanalysis data.


The ensemble simulation database interface will be tested in the application that evaluates the deformation of the climate attractor. The core of the analogue computation will be programmed in R, a widely used language, together with shell scripts. It will be packaged as a standalone application. A prototype web application (running on the LSCE computing server) will be implemented for a pseudo real-time analysis (once a week) of the atmospheric flow over several regions of the globe.


This software engineering step is crucial to facilitate the transfer of knowledge from mathematical theory to climate applications. The platform will also serve as a basis for further scientific or innovation projects.


We will create a database of available model simulation data (CMIP5, PMIP3, CORDEX), reanalyses (NCEP: [Kalnay et al., 1996]) and observations (ECA&D: [Klein-Tank et al., 2002]), with common format specifications, quality checks and bias removal [Michelangeli et al., 2009]. The database will cover the identified regions of analysis. We will focus on sea-level pressure, geopotential heights at various levels (1000, 850, 500 and 300 hPa), near-surface temperature and precipitation. The reanalysis and observational datasets will be updated on a regular basis. The analogue flow platform will then allow users to switch between datasets.
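
To make the bias removal step concrete, here is a minimal sketch of a plain empirical quantile mapping. It is a deliberately simpler stand-in and not the method of [Michelangeli et al., 2009]. It assumes a model sample and an observed sample over a common reference period; values outside the reference range are clamped to the nearest observed extreme. The function names and sample values are hypothetical, and Python is used for illustration only.

```python
import bisect


def interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, clamped at the ends.

    `xs` must be sorted in increasing order and contain at least two points.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x) - 1  # guarantees xs[i] <= x < xs[i + 1]
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + w * (ys[i + 1] - ys[i])


def quantile_map(model_ref, obs_ref, model_values):
    """Map model values onto the observed distribution.

    Each value is converted to its non-exceedance probability in the model
    reference sample, then to the observed quantile at that probability.
    """
    m, o = sorted(model_ref), sorted(obs_ref)
    p_m = [i / (len(m) - 1) for i in range(len(m))]
    p_o = [j / (len(o) - 1) for j in range(len(o))]
    return [interp(interp(x, m, p_m), p_o, o) for x in model_values]


# Made-up example: the model overestimates the observations by a factor of 2.
model_ref = [2.0, 4.0, 6.0, 8.0, 10.0]
obs_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
print(quantile_map(model_ref, obs_ref, [2.0, 5.0, 10.0, 12.0]))
```

Clamping is the simplest choice for values outside the reference range; a production correction would need an explicit extrapolation rule for extremes.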


The original model simulation and observational datasets will not be redistributed, although the open platform will make use of them. We might distribute corrected datasets after bias removal. We will issue documentation on the quality checks and bias removal, in order to guarantee the reproducibility of the results, and provide a “best practice” guide.