EC-Earth-3 Post-Processing Tools

ECE3-POSTPROC is a suite of post-processing tools for EC-Earth 3. It includes HIRESCLIM2, ECMEAN and AMWG. The last two require the output from the first one. A REPRODUCIBILITY TEST is also provided. Other tools are available but have not been clean up or tested.

The code has been ported on cca (ECMWF), rhino (KNMI), marconi (CNR), and marenostrum4 (BSC). See the Porting instructions below to include other machines.

INSTALLATION

You first need to get the code:

git clone https://github.com/plesager/ece3-postproc.git

The general idea is that you pass the experiment name (EXP) and the years to process (typically a range, say, YEAR1 YEAR2) as input to the scripts on the command line.

For this to work as intended, locations of the post-processing code, of the EC-Earth model data, and your platform, must be known. They are unlikely to change and set as environment variables, so in your shell rc file (bash.rc or similar):

export ECE3_POSTPROC_TOPDIR=<dir where this file is>
export ECE3_POSTPROC_DATADIR=<dir where your ecearth init data (not run output) are located>
export ECE3_POSTPROC_MACHINE=<name of your (HPC) machine>

The ”name of your (HPC) machine” is used to retrieve the machine configuration. The list of available platforms is:

ls ./conf/

If yours is not present, you need to port the code. See the Porting section below.

Optionally, you can also set your HPC account:

export ECE3_POSTPROC_ACCOUNT=<HPC account>

If not set, your default account is used to submit job to HPC. At ECMWF, this is the 1st one in the list you get with the command (on ecgate only): “account -l $USER”

If you want to temporarily change the HPC account, you can also use the command line option when calling each tool.

Finally, the code relies on $SCRATCH and $USER being defined. If not, define one. A lot of temporary data files, job scripts and their log are being written on the $SCRATCH. The $USER is used in few job manager commands (SLURM or PBS), and should already be defined.

USAGE

To use a job manager or not

If you use a job scheduler (SLURM, PBSpro, …), there is a set of wrappers that let you submit jobs in parallel. If not, you can still run the wrapped scripts. In all cases, the calls are made in the script sub-directory of the package [FIXME?]. The wrappers calls are described here [TODO: add description for case without wrappers].

Tools settings

Each tool has its specifics settings. If the code has already been ported to your machine, you will have to change very little there, as described hereafter. They define the needed executable/lib (cdo, nco, netcdf,…), location of auxiliary data, of run output, and where to save the results. For reference, the machine configurations are:

./conf/<your-machine>/conf_<tool>_<your-machine>.sh

First, run HIRESCLIM2 to extract and average variables of interest:

cd ${ECE3_POSTPROC_TOPDIR}/script
./hc.sh [-a account] [-r rundir] [-m months_per_leg] [-c] EXP YEAR1 YEAR2 YREF

This will create a set of netcdf files with monthly (default) averages of variables of interest. The files are found in the ${ECE3_POSTPROC_POSTDIR}/post directory.

For more information about the script options, just call

./hc.sh -h

If you specify an alternate “rundir” to find the model output, the default EC-Earth3 output structure is assumed and must be readable:

rundir/EXP/output/{ifs,nemo}/leg_number

Upon success, ${ECE3_POSTPROC_POSTDIR}/postcheck_EXP_YYYY.txt files are created with some basic information. By repeating the command with the -c option, these files are printed. In case of problem the location of the log is printed.

In “./conf/<your-machine>/conf_hiresclim_<your-machine>.sh”, you can set some options (if you want daily or 6h output on top of the monthly one, or the nemo_extra output for example). However the most important settings that you have to change are the templates for the model output and for the results location. For example:

export IFSRESULTS0='/scratch/ms/nl/${USER}/ECEARTH-RUNS/${EXPID}/output/ifs/${LEGNB}'
export NEMORESULTS0='/scratch/ms/nl/${USER}/ECEARTH-RUNS/${EXPID}/output/nemo/${LEGNB}'
export ECE3_POSTPROC_POSTDIR='/scratch/ms/nl/${USER}/ECEARTH-RUNS/${EXPID}/post'

They must include at least either the ${year} or the ${LEGNB} to find the correct data, and must be single-quoted. The ${EXPID} will be expanded to the EXP argument passed to the script.

The NEMO variable names expected by HIRESCLIM2 may differ from those found in EC-Earth3 output. If needed, you can change the variables name from EC-Earth in conf_hiresclim_<your-machine>.sh.

Then, you can compute the global mean fluxes with EC-MEAN:

./ecm.sh [-a account] [-r rundir] [-c] [-y] [-p] EXP YEAR1 YEAR2

The options are the same as for hiresclim2. For details, call

./ecm.sh -h

Output tables with Performance Indices and mean global fluxes are found in:

${ECE3_POSTPROC_DIAGDIR}/table/${EXPID}

and one line summary is found:

${ECE3_POSTPROC_DIAGDIR}/table/globtable.txt
${ECE3_POSTPROC_DIAGDIR}/table/gregory.txt

If the option -y was used, you also get yearly global means available in:

${ECE3_POSTPROC_DIAGDIR}/table/yearly_fldmean_${exp}.txt

and its subset

${ECE3_POSTPROC_DIAGDIR}/table/gregory_${exp}.txt

which has only the three variables needed for a Gregory plot.

The default output directory ${ECE3_POSTPROC_DIAGDIR} is set in the

$ECE3_POSTPROC_TOPDIR/conf/${ECE3_POSTPROC_MACHINE}/conf_ecmean_${ECE3_POSTPROC_MACHINE}.sh

config file.

You can quickly check for success by executing the command again with -c option. It will print the summary line from globtable.txt and gregory.txt files, if they exist. For more insight, have a look at the submitted scripts and logs, which are in $SCRATCH/tmp_ece3_ecmean.

EC-Mean creates a climatology from the experiment to derive the performance indices. The climatology is by default in the same directory as the HIRESCLIM2 output:

${ECE3_POSTPROC_POSTDIR}/clim-${YEAR1}-${YEAR2}

and not removed, since it can be use for other purposes (notably the reproducibility test).

or/and produce the AMWG diagnostics:

TODO

REPRODUCIBILITY TEST

Overview

The acceptance/reproducibility test consists in 4 steps:

run an ensemble of 5 members
running EC-mean to get the climatology and the Reichler & Kim (R&K) performance indices of each run
cast the R&K indices into a format suitable for the next step
repeat for another ensemble and compare

Requirements

The acceptance/reproducibility test relies on a set of scripts written in R. Few R packages are needed: s2dverification, ncdf4, RColorBrewer. If you do not control your environment and R and/or the packages are missing, it may be easier to work on another machine where you can easy installed the packages. For example:

# define a personal R library location,
mkdir /usr/people/sager/Rlib
# and make sure that R is aware of it (put that one in your ~/.bashrc): 
export R_LIBS=/usr/people/sager/Rlib/

# within R, install:
install.packages("s2dverification", lib="/usr/people/sager/Rlib/")
install.packages("ncdf4", lib="/usr/people/sager/Rlib/")
install.packages("RColorBrewer", lib="/usr/people/sager/Rlib/")

Experiment design

You must run 5 experiments for 20 years with perturbed initial conditions. Your experiments name should be made of 3 characters (the stem) followed by a number from 1-to-5. For example: cca1, cca2, cca3, cca4, cca5. The stem uniquely defines your ensemble. If you do not follow this format, collecting the R&K indices in a format suitable for the comparison scripts will be slightly more complicated but still feasible (see below). Your runs will differ by their initial conditions, which require some setup.

For AMIP runs

you can create these initial conditions on the fly, by adding a call to the perturbation script in your classic/ece-*.sh.tmpl, i.e. by replacing:

ln -s \
${ini_data_dir}/ifs/${ifs_grid}/${leg_start_date_yyyymmdd}/ICMSHECE3INIT \
                                                    ICMSH${exp_name}INIT

with

# apply AMIP perturbation to 3D temperature
${ECE3_POSTPROC_TOPDIR}/reproducibility/perturb_ifs_ic.py -s t \
    ${ini_data_dir}/ifs/${ifs_grid}/${leg_start_date_yyyymmdd}/ICMSHECE3INIT \
                                                        ICMSH${exp_name}INIT

For CMIP runs

A perturbation script is also available for ocean restart but has not been tested yet. But you can used perturbed ocean restarts already prepared beforehand. For example, with the following 1950 initial conditions provided by BSC, which are available through ftp, see https://dev.ec-earth.org/issues/447#note-1, and look like this once unpacked:

 ic
 ├── atmos
 │   ├── ICMGGa0raINIT
 │   ├── ICMGGa0raINIUA
 │   └── ICMSHa0raINIT
 ├── ice
 │   └── a0ra_fc0_19491231_restart_ice.nc
 └── ocean
     ├── a0ra_fc0_19491231_restart.nc
     ├── a0ra_fc1_19491231_restart.nc
     ├── a0ra_fc2_19491231_restart.nc
     ├── a0ra_fc3_19491231_restart.nc
     └── a0ra_fc4_19491231_restart.nc

You just need to submit 5 runs that start from these different restarts. What follows is some tips to help you streamline the process. Start by reorganizing the initial conditions so you can use the same script template in all your runtime dirs. For example, you can:

cd ic/ocean/
mkdir 0{1..5}
for k in {1..5}; do cd 0$k; ln -s ../a0ra_fc$((k-1))_19491231_restart.nc restart_oce.nc ; cd - ; done
for k in {1..5}; do cd 0$k; ln -s ../../ice/a0ra_fc0_19491231_restart_ice.nc restart_ice.nc ; cd - ; done

which gives you:

[2041] >>> tree ic
ic
├── atmos
│   ├── ICMGGa0raINIT
│   ├── ICMGGa0raINIUA
│   └── ICMSHa0raINIT
├── ice
│   └── a0ra_fc0_19491231_restart_ice.nc
└── ocean
    ├── 01
    │   ├── restart_ice.nc -> ../../ice/a0ra_fc0_19491231_restart_ice.nc
    │   └── restart_oce.nc -> ../a0ra_fc0_19491231_restart.nc
    ├── 02
    │   ├── restart_ice.nc -> ../../ice/a0ra_fc0_19491231_restart_ice.nc
    │   └── restart_oce.nc -> ../a0ra_fc1_19491231_restart.nc
    ├── 03
    │   ├── restart_ice.nc -> ../../ice/a0ra_fc0_19491231_restart_ice.nc
    │   └── restart_oce.nc -> ../a0ra_fc2_19491231_restart.nc
    ├── 04
    │   ├── restart_ice.nc -> ../../ice/a0ra_fc0_19491231_restart_ice.nc
    │   └── restart_oce.nc -> ../a0ra_fc3_19491231_restart.nc
    ├── 05
    │   ├── restart_ice.nc -> ../../ice/a0ra_fc0_19491231_restart_ice.nc
    │   └── restart_oce.nc -> ../a0ra_fc4_19491231_restart.nc
    ├── a0ra_fc0_19491231_restart.nc
    ├── a0ra_fc1_19491231_restart.nc
    ├── a0ra_fc2_19491231_restart.nc
    ├── a0ra_fc3_19491231_restart.nc
    └── a0ra_fc4_19491231_restart.nc

Then you modify your ece-esm.sh.tmpl template script to account for that data tree as follow (just 5 lines to change):

Index: ece-esm.sh.tmpl
===================================================================
--- ece-esm.sh.tmpl      (revision 5029)
+++ ece-esm.sh.tmpl      (working copy)
@@ -25,7 +25,7 @@
 #     config="ifs nemo lim3 rnfmapper xios:detached oasis lpjg:fdbck"           # "Veg"     : GCM+LPJ-Guess
 #     config="ifs nemo lim3 rnfmapper xios:detached oasis tm5:chem,o3,ch4,aero" # "AerChem" : GCM+TM5
 
-config="ifs nemo lim3 rnfmapper xios:detached oasis lpjg:fdbck tm5:co2"
+config="ifs nemo:start_from_restart lim3 rnfmapper xios:detached oasis"
 
 # minimum sanity
 has_config amip nemo && error "Cannot have both nemo and amip in config!!"
@@ -189,7 +189,7 @@
 
 # This is only needed if the experiment is started from an existing set of NEMO
 # restart files
-nem_restart_file_path=${start_dir}/nemo-rst
+nem_restart_file_path="<full-path-to-your-ic-dir>/ocean/0${exp_name:3}"
 
 nem_restart_offset=0
 
@@ -450,13 +450,13 @@
 
         # Initial data
         ln -s \
-        ${ini_data_dir}/ifs/${ifs_grid}/${leg_start_date_yyyymmdd}/ICMGGECE3INIUA \
+        <full-path-to-your-ic-dir>/atmos/ICMGGa0raINIUA \
                                                             ICMGG${exp_name}INIUA
         ln -s \
-        ${ini_data_dir}/ifs/${ifs_grid}/${leg_start_date_yyyymmdd}/ICMSHECE3INIT \
+        <full-path-to-your-ic-dir>/atmos/ICMSHa0raINIT \
                                                             ICMSH${exp_name}INIT
         rm -f ICMGG${exp_name}INIT
-        cp ${ini_data_dir}/ifs/${ifs_grid}/${leg_start_date_yyyymmdd}/ICMGGECE3INIT \
+        cp <full-path-to-your-ic-dir>/atmos/ICMGGa0raINIT \
                                                             ICMGG${exp_name}INIT
 
         # add bare_soil_albedo to ICMGG*INIT

Then, using your favorite method, run 5 experiments with a name that ends with 1,…,5.

Postprocessing steps

For each of your 5 experiments, you need to run hireclim2 followed by EC-mean to get their resulting climatology and their Reichler-Kim performance indices. For example, assuming your experiment runs from 1990-2009:

# Get monthly means
cd ${ECE3_POSTPROC_TOPDIR}/script
for k in {1..5}; do ./hc.sh cca${k} 1990 2009 1990; done

# Once the /hc.sh/ jobs are finished, get climatology and PI
for k in {1..5}; do ./ecm.sh cca${k} 1990 2009; done

Then you need to gather the PI results into a format suitable for the R scripts:

cd  ${ECE3_POSTPROC_TOPDIR}/reproducibility/
./collect_ens.sh [-t] STEM  NB_MEMBER  YEAR1  YEAR2

The -t option let you collect both the PI indices and the climatology from each run into a tar file in your $SCRATCH. This is *useful for sharing and then being able to compare with other ensemble results*.

If your run names and/or EC-mean output do not follow the default settings, you can still collect the data without too much work. Indeed the collect_ens.sh is essentially one line of code that is easy to hack and run at the command line or an ad hoc script:

var2d="t2m msl qnet tp ewss nsss SST SSS SICE T U V Q"

for var in ${var2d}
do
  for rname in your-list-of-run-names
  do
      cat ${path-to-rk-tables}/PI2_RK08_${rname}_${year1}_${year2}.txt | grep "^${var} " | \
          tail -1  | \
          awk {'print $2'} >> ${EnsembleName}_${year1}_${year2}_${var}.txt
  done
done

Comparing

Once you have two ensembles processed, you can compare them. Both ensembles output collected in the previous step should be gathered in a DATADIR, where:

# For run ${nb} of ensemble ${stem}, climatological data are expected in:
$DATADIR/${stem}${nb}/post/clim-${year1}-${year2}/
# For one ensemble, ${stem}, tables are expected in:
$DATADIR/${stem}/

If you use the -t option to collect all these data in a tar file (see previous step), DATADIR is just the directory where you unpack the archive. If not, it should not be difficult to re-organize your output with few mkdir and mv calls.

With the data in place, the statistics package can be run:

./compare.sh -d $DATADIR stem1 stem2 start_year end_year nb_member

A PDF file with all generated plots is created in DATADIR/plots. That default location can be overwritten at the command line with the -p option.

PORTING

Get the data. Available at:

ec:/nm6/EC-EARTH/ECEARTH3.2b/INPUT/ece-post-proc.tar.gz

To port to a new machine, you need to:

add platform templates in a conf/<your_platform_name> directory (adapt existing ones to your job scheduler)
```
conf/<your-machine>/hc_<your-machine>.tmpl
conf/<your-machine>/header_<your-machine>.tmpl
    
```
The job scheduler command to submit job is set in the configuration scripts.
add a configuration script for each tools:
```
conf/<your-machine>/conf_hiresclim_<your-machine>.sh
conf/<your-machine>/conf_timeseries_<your-machine>.sh
conf/<your-machine>/conf_ecmean_<your-machine>.sh
conf/<your-machine>/conf_amwg_<your-machine>.sh
    
```
TODO: combine those into two config files: one USER oriented (i.e anything that changes with the experiment to process), and one for the machine (i.e. setup that should not changed with the experiment/user).

Requirements

You must install nco, netcdf, python, cdo, and cdftools if missing.
For CDFTOOLS you cannot use the light one that ships with barakuda.
If the netCDF4 python module is not available, you cannot build the 3D relative humidity. Set in your ./conf/<your-machine>/conf_hiresclim_<your-machine>.sh:
```
rh_build=0
    
```
Some EC-Earth experiments put the water flux output from NEMO in the SBC files instead of the grid_T files. Then you need
```
export use_SBC=1
    
```
in your ./conf/conf_hiresclim_<your-machine>.sh config.

Build rebuild_nemo from EC-Earth source code:

This is needed only if the output files of NEMO are per processes. In which case you need to do something along these lines:

cd <EC-EARTH-DIR>/sources/nemo-3.6/TOOLS/REBUILD_NEMO/
<F90-COMPILER> rebuild_nemo.f90  -o ../rebuild_nemo.exe -I<PATH-TO-NETCDF-INSTALLATION>/include -L<PATH-TO-NETCDF-INSTALLATION>/lib -lnetcdf -lnetcdff

HISTORY

Copied from a suite of post-processing tools from Jost (it/ccjh) on Monday, March 27, 2017. This project is a quick attempt at cleaning up the tools suite and making it easier to port. Added and adapted (Jan 2018) the code for the reproducibility test developed by Martin Ménégoz and Francois Massonnet.

Modified to work with default ecearth-3 output tree. Removed the possibility to run somebody else code (just clone it!) but can still processed output from another user.

Improved the performance of HIRECLIM2 with parallelization over the years. Can process monthly legged runs. Catch all errors with “set -e” everywhere. Try to be smart in dealing with and cleaning up temporary dirs, by using mktemp, …

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EC-Earth-3 Post-Processing Tools

INSTALLATION

USAGE

To use a job manager or not

Tools settings

First, run HIRESCLIM2 to extract and average variables of interest:

Then, you can compute the global mean fluxes with EC-MEAN:

or/and produce the AMWG diagnostics:

REPRODUCIBILITY TEST

Overview

Requirements

Experiment design

For AMIP runs

For CMIP runs

Postprocessing steps

Comparing

PORTING

Get the data. Available at:

To port to a new machine, you need to:

Requirements

Build rebuild_nemo from EC-Earth source code:

HISTORY

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
ECmean		ECmean
amwg		amwg
conf		conf
ece2cmor3_support		ece2cmor3_support
hiresclim2		hiresclim2
reproducibility		reproducibility
script		script
timeseries		timeseries
.gitignore		.gitignore
README.org		README.org
functions.sh		functions.sh

Folders and files

Latest commit

History

Repository files navigation

EC-Earth-3 Post-Processing Tools

INSTALLATION

USAGE

To use a job manager or not

Tools settings

First, run HIRESCLIM2 to extract and average variables of interest:

Then, you can compute the global mean fluxes with EC-MEAN:

or/and produce the AMWG diagnostics:

REPRODUCIBILITY TEST

Overview

Requirements

Experiment design

For AMIP runs

For CMIP runs

Postprocessing steps

Comparing

PORTING

Get the data. Available at:

To port to a new machine, you need to:

Requirements

Build rebuild_nemo from EC-Earth source code:

HISTORY

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages