ESM-Tools Variables

The esm_parser is used to read the multiple types of YAML files contained in ESM-Tools (model and coupling configuration files, machine configurations, runscripts, etc.). Each of these YAML files can contain two types of YAML elements/variables:

Tool-specific elements: YAML-scalars, lists or dictionaries that include instructions and information used by ESM-Tools. These elements are predefined inside the esm_parser or other packages inside ESM-Tools and are used to control the ESM-Tools functionality.

Setup/model elements: YAML-scalars, lists or dictionaries that contain information defined in the model/setup config files (e.g. awicm.yaml, fesom.yaml). This information is model/setup-specific and has no effect unless it is combined with the tool-specific elements. For example, fesom.yaml for FESOM-1.0 defines the variable asforcing, but on its own this variable means nothing to ESM-Tools. It is used in namelist_changes (a tool-specific element) to state the type of forcing to be used, and this is what actually makes a difference to the simulation. The advantage of having this variable already defined in fesom.yaml and referenced in namelist_changes is that the front-end user can change the forcing type simply by changing the value of asforcing, without having to use namelist_changes.
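As a minimal sketch of this pattern, the fragment below shows how a setup/model variable could be wired into namelist_changes. Only `asforcing` and `namelist_changes` come from the text above; the namelist file name, the namelist group and the value are placeholders chosen for illustration and are not taken from the actual fesom.yaml.

```yaml
# Hypothetical excerpt of a component configuration file (e.g. fesom.yaml).

# Setup/model element: means nothing to ESM-Tools on its own.
asforcing: "<forcing_type>"            # placeholder value

# Tool-specific element: passes the value on to the namelist.
namelist_changes:
    namelist.forcing:                  # assumed namelist file name
        forcing_group:                 # assumed namelist group name
            asforcing: "${asforcing}"  # refers to the variable defined above
```

With a layout like this, a front-end user would only need to set `asforcing` to switch the forcing type and would never touch `namelist_changes` directly.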

The following subsection lists and describes the Tool-specific elements used to operate ESM-Tools.

Note

Most of the Tool-specific elements can be defined in any file (e.g. configuration file, runscript, …) and, if present in two files used by ESM-Tools at the same time, the value is chosen depending on the ESM-Tools file priority/read order (YAML File Hierarchy). Ideally, you declare as many elements as possible inside the configuration files, to be used by default, and change them in the runscripts when necessary. However, it is ultimately up to the user where to set up the Tool-specific elements.
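The default-then-override approach described in this note can be sketched as two YAML documents. This assumes that the runscript takes priority over the configuration file, as the note suggests; the component name `mymodel` and both values are hypothetical.

```yaml
# Document 1: component configuration file (provides the default).
# Assumed location: <PATH>/esm_tools/configs/components/mymodel/mymodel.yaml
time_step: 1800          # illustrative default, in seconds
---
# Document 2: runscript (changes the default for one experiment).
mymodel:
    time_step: 900       # illustrative override, in seconds
```

Which value wins in a real setup depends on the file priority/read order described in YAML File Hierarchy.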

Tool-Specific Elements/Variables

The following keys should/can be provided inside configuration files for models (<PATH>/esm_tools/configs/components/<name>/<name>.yaml), coupled setups (<PATH>/esm_tools/configs/setups/<name>/<name>.yaml) and runscripts. You can find runscript templates in esm_tools/runscripts/templates/.

Compile time variables

Key	Section	Description
execution_mode	general	Takes the value `compile` during compile time. Can be used in `choose_` blocks with `choose_general.execution_mode`.
model	general	Name of the model/setup as listed in the config files (`esm_tools/configs/components` for models and `esm_tools/configs/setups` for setups).
setup_name	general	Name of the coupled setup.
version	general	Version of the model/setup (one of the available options in the `available_versions` list).
available_versions	<component>	List of supported versions of the component or coupled setup.
git-repository	<component>	Address of the model’s git repository.
branch	<component>	Branch from where to clone.
destination	<component>	Name of the folder where the model is downloaded and compiled, in a coupled setup.
comp_command	<component>	Command used to compile the component.
install_bins	<component>	Path inside the component folder, where the component is compiled by default. This path is necessary because, after compilation, ESM-Tools needs to copy the binary from this path to the `<component/setup_path>/bin` folder.
source_code_permissions	<component>	Sets the file permissions for the source code using `chmod <source_code_permissions> -R <source_code_folder>`.
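To make the table above more concrete, here is a minimal sketch of how the `<component>` keys might look in a component configuration file. The component name `mymodel`, the repository URL, the branch, the build command and the paths are all hypothetical; `some_option` is a made-up variable used only to show the shape of a `choose_` block.

```yaml
# Hypothetical component configuration file for a component called "mymodel".
available_versions:
    - "1.0"
    - "2.0"

git-repository: "https://example.org/mymodel.git"   # placeholder URL
branch: "main"                                      # placeholder branch
destination: "mymodel-1.0"                          # download/compile folder
comp_command: "mkdir -p build; cd build; cmake ..; make"   # illustrative
install_bins: "build/bin/mymodel"                   # assumed path of the binary
source_code_permissions: "g+rX"     # -> chmod g+rX -R <source_code_folder>

# Select values depending on whether ESM-Tools is compiling or running.
choose_general.execution_mode:
    compile:
        some_option: "value used at compile time"   # hypothetical variable
    run:
        some_option: "value used at run time"       # hypothetical variable
```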

Run-time variables

Key	Section	Description
account	general	User account of the HPC system to be used to run the experiment.
base_dir	general	Path to the directory that will contain the experiment folder (where the experiment will be run and data will be stored).
compute_time	general	Estimated computing time for a run, used for submitting a job with the job scheduler.
create_folders	<component>	List of absolute paths of the folders to be created. See Create empty folders.
esm_configs_dir	general	Absolute path to the ESM-Tools configs directory (`configs/`). Set automatically by `esm_parser` at startup. Use as `${general.esm_configs_dir}/...` in YAML files to reference scripts and files under the configs tree.
esm_couplings_dir	general	Absolute path to the ESM-Tools couplings directory (`couplings/`). Set automatically by `esm_parser` at startup. Use as `${general.esm_couplings_dir}/...` in YAML files to reference coupling configurations.
esm_namelist_dir	general	Absolute path to the ESM-Tools namelists directory (`namelists/`). Set automatically by `esm_parser` at startup. Use as `${general.esm_namelist_dir}/...` in YAML files to reference namelist templates.
esm_runscript_dir	general	Absolute path to the ESM-Tools runscripts directory (`runscripts/`). Set automatically by `esm_parser` at startup. Use as `${general.esm_runscript_dir}/...` in YAML files to reference runscripts or `further_readings`.
executable	<component>	Name of the component executable file, as it appears in the `<component/setup_path>/bin` folder after compilation.
execution_command	<component>	Command for executing the component, including `${executable}` and the necessary flags.
execution_mode	general	Takes the value `run` during run time. Can be used in `choose_` blocks with `choose_general.execution_mode`.
expid	general	ID of the experiment. This variable can also be defined when calling `esm_runscripts` with the `-e` flag.
File Dictionaries	<component>	YAML dictionaries used to handle input, output, forcing, logging, binary and restart files (see File Dictionaries).
force_overwrite_in_file_movements	general	A boolean to indicate whether the file movements should overwrite existing files or not. If `False` (default), the file movements will not overwrite existing files. Only set to `True` if you know why you would want to do that (e.g. to overwrite files in a failed tidy task).
heterogeneous_parallelization	computer	A boolean that controls whether the simulation needs to be run with or without heterogeneous parallelization. When `false`, OpenMP is not used for any component, independently of the value of `omp_num_threads` defined in the components. When `true`, `omp_num_threads` needs to be specified for each component using OpenMP. The `heterogeneous_parallelization` variable needs to be defined inside the `computer` section of the runscript. See Heterogeneous Parallelization Run (MPI/OpenMP) for examples.
ini_restart_dir	<component>	Path of the restarted experiment in case the current experiment runs in a different directory. For this variable to have an effect `lresume` needs to be `true` (i.e. the experiment is a restart).
ini_restart_exp_id	<component>	ID of the restarted experiment in case the current experiment has a different `expid`. For this variable to have an effect `lresume` needs to be `true` (i.e. the experiment is a restart).
install_missing_plugins	general	A boolean to indicate whether `esm_runscripts` needs to install missing plugins (`True`, default) or not (`False`). Implemented to solve a problem with the `esm_tests` CI in GitHub where we might not have access to some repositories.
lresume	<component>	Boolean to indicate whether the run is an initial run or a restart.
mail_type	general/computer	Value for the SBATCH flag `--mail-type` (see https://slurm.schedmd.com/sbatch.html#OPT_mail-type)
mail_user	general/computer	Value for the SBATCH flag `--mail-user` (see https://slurm.schedmd.com/sbatch.html#OPT_mail-user)
model_dir	general/<component>	Absolute path of the model directory (where it was installed by esm_master).
namelists	<component>	List of namelist files required for the model.
namelist_changes	<component>	Functionality to handle changes in the namelists from the yaml files (see Changing Namelists).
nproc	<component>	Number of processors to use for the model.
nproca/nprocb	<component>	Number of processors for different MPI tasks/ranks. Incompatible with `nproc`.
nnodes_envvar	computer	Name of the environment variable holding the number of allocated nodes (e.g. `SLURM_JOB_NUM_NODES`).
omp_num_threads	<component>	A variable to control the number of OpenMP threads used by a component during a heterogeneous parallelization run. This variable has to be defined inside the section of the components for which OpenMP needs to be used. This variable will be ignored if `computer.heterogeneous_parallelization` is not set to `true`.
parallel_file_movements	general	Controls how file movements are parallelized. `"dask"` (default) distributes I/O across all compute nodes via a Dask cluster, `"threads"` uses local threads on a single node, `False` runs sequentially. See Parallel File Movements.
pool_dir	general	Path to the pool directory to read in mesh data, forcing files, inputs, etc.
post_processing	<component>	Boolean to indicate whether to run postprocessing or not.
post_run_commands	computer	Shell commands appended to the job script after the model execution and before resubmission. Can be a `string` or a `list` of `strings`.
pre_recipe.exclude_job_types	general	List of job types that skip `pre_recipe.steps` (default: `["prepare", "prepexp", "observe"]`).
pre_recipe.steps	general	List of recipe step names injected before the main recipe (e.g. `["initialize_dask_cluster"]`). Steps listed here run for all job types except those in `pre_recipe.exclude_job_types`.
save_batch_env_patterns	computer	List of grep patterns used to capture and restore batch system environment variables across job script stages (e.g. `["SLURM"]` or `["PBS"]`).
setup_dir	general	Absolute path of the setup directory (where it was installed by esm_master).
system_components	general	List of non-model config sections included in file-list iteration loops (default: `["general", "dask"]`).
time_step	<component>	Time step of the component in seconds.
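As a rough illustration of how the run-time keys above fit together, the following runscript sketch places each key in the section given in the table. It is not a complete runscript: the component name `mymodel`, all paths, IDs and numbers are placeholders, and the `compute_time` format is assumed to be a scheduler-style wall-clock string.

```yaml
# Hypothetical runscript excerpt; every value is a placeholder.
general:
    account: "<hpc_account>"
    base_dir: "/path/to/experiments"
    expid: "test001"
    compute_time: "00:30:00"                  # assumed HH:MM:SS format
    pool_dir: "/path/to/pool"
    parallel_file_movements: "threads"        # "dask" (default), "threads" or False
    force_overwrite_in_file_movements: False  # default

computer:
    heterogeneous_parallelization: true       # enables omp_num_threads below
    mail_type: "FAIL"                         # passed to --mail-type
    mail_user: "<user@example.org>"           # passed to --mail-user
    post_run_commands:
        - "echo 'model execution finished'"

mymodel:
    model_dir: "/path/to/model_codes/mymodel-1.0"
    lresume: false                            # initial run, not a restart
    nproc: 128
    omp_num_threads: 4                        # ignored unless heterogeneous_parallelization is true
    time_step: 900                            # seconds
    post_processing: false
```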

Dask variables

Variables in the dask section control the Dask cluster used for parallel file movements. See Parallel File Movements for usage details.

Key	Section	Description
actions	dask	List of actions that trigger Dask cluster initialization (default: `["parallel_file_movements"]`).
client_timeout	dask	Timeout in seconds when probing the Dask scheduler status (default: `0.05`).
init_scheduler_cmd	dask	Shell command to start the Dask scheduler. Defined per batch system (e.g. in `slurm.yaml`).
init_workers_cmd	dask	Shell command to start the Dask workers. Defined per batch system (e.g. in `slurm.yaml`).
parallel_file_movements	general	Controls how file movements are parallelized. `"dask"` (default) distributes I/O across all compute nodes via a Dask cluster, `"threads"` uses local threads on a single node, `False` runs sequentially. See Parallel File Movements.
poll_interval	dask	Polling interval in seconds for Dask cluster readiness checks (default: `0.5`).
scheduler_json	dask	Full path to the Dask scheduler JSON file used for client connections (default: `${general.thisrun_work_dir}/dask_scheduler.json`).
workers_timeout	dask	Maximum time in seconds to wait for Dask workers to become available (default: `5`).
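For orientation, the sketch below spells out the defaults listed in the table above as they might appear in a YAML file. Writing them out explicitly should not be necessary, since they are defaults; the point is only to show where each key lives. `init_scheduler_cmd` and `init_workers_cmd` are left out because they are defined per batch system.

```yaml
general:
    parallel_file_movements: "dask"      # default; triggers the Dask cluster

dask:
    actions:
        - "parallel_file_movements"      # default list of triggering actions
    client_timeout: 0.05                 # seconds, default
    poll_interval: 0.5                   # seconds, default
    workers_timeout: 5                   # seconds, default
    scheduler_json: "${general.thisrun_work_dir}/dask_scheduler.json"   # default
```

A user who finds that workers are slow to start on a busy system could presumably raise `workers_timeout` in the runscript, although suitable values depend on the machine.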

Calendar variables

Key	Description
initial_date	Date of the beginning of the simulation in the format YYYY-MM-DD. If the simulation is a restart, `initial_date` marks the beginning of the restart.
final_date	Date of the end of the simulation in the format YYYY-MM-DD.
start_date	Date of the beginning of the current run.
end_date	Date of the end of the current run.
current_date	Current date of the run.
next_date	Next run initial date.
nyear, nmonth, nday, nhour, nminute	Number of time units per run. They can be combined (e.g. `nyear: 1` and `nmonth: 2` imply that each run will be 1 year and 2 months long).
parent_date	Ending date of the previous run.
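A short sketch of how the calendar keys could be combined in a runscript follows. The table above does not state a section, so placing them under `general` is an assumption, and the dates are arbitrary.

```yaml
# Hypothetical runscript excerpt (section placement is assumed).
general:
    initial_date: "2000-01-01"   # YYYY-MM-DD, start of the simulation
    final_date: "2004-08-31"     # YYYY-MM-DD, end of the simulation
    nyear: 1                     # combined with nmonth below:
    nmonth: 2                    # each run is 1 year and 2 months long
```

With these values the simulation spans 56 months, so it would be covered by four back-to-back runs of 14 months each, assuming every run starts where the previous one ended.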

Coupling variables

Key	Description
grids	List of grids and their parameters (e.g. `name`, `nx`, `ny`).
coupling_fields	List of coupling field dictionaries containing coupling field variables.
nx	When using oasis3mct, used inside `grids` to define the first dimension of the grid.
ny	When using oasis3mct, used inside `grids` to define the second dimension of the grid.
coupling_methods	List of coupling methods and their parameters (e.g. `time_transformation`, `remapping`).
time_transformation	Time transformation used by oasis3mct, defined inside `coupling_methods`.
remapping	Remappings and their parameters, used by oasis3mct, defined inside `coupling_methods`.
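The nesting described in the table above might look roughly like the sketch below. The exact layout is an assumption: the grid name, the method name, the grid sizes, the chosen transformation and remapping names, and the remapping parameter are placeholders, and the enclosing section is omitted because the table does not specify one.

```yaml
# Assumed layout; all names and numbers are placeholders.
grids:
    grid_a:
        name: "grda"
        nx: 192                 # first dimension of the grid
        ny: 96                  # second dimension of the grid

coupling_methods:
    method_a:
        time_transformation: "instant"       # illustrative value
        remapping:
            bilinear:                        # illustrative remapping name
                "<parameter>": "<value>"     # remapping parameters go here
```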

Environment variables

Key	Section	Description
general_actions	computer	List of general shell actions to be included in the compilation and run scripts. These are added directly to the script without any prefix.
module_actions	computer	List of module actions to be included in the compilation and run scripts. Each entry will be prefixed with `module` in the generated script.
spack_actions	computer	List of Spack actions to be included in the compilation and run scripts. Each entry will be prefixed with `spack` in the generated script.
export_vars	computer	Dictionary of environment variables to be exported in the script. Each key-value pair will generate an `export KEY=VALUE` line.
unset_vars	computer	List of environment variables to be unset in the script. Each entry will generate an `unset VARIABLE` line.
include_env_from_component_files	computer/<component>	Boolean that controls whether environment variables from component files should be included. Can be set globally in the computer section or per-component. Default: `True`.
merge_component_envs	computer	Dictionary with `compile` and `run` keys that controls whether environments from all components should be merged. For `compile` the default is false (each component maintains its own environment), for `run` the default is true (environments are merged).
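The following sketch shows the environment keys from the table above in a `computer` section, with comments indicating the script lines the table says each entry generates. The sourced file, the module and Spack package names, and the variable names are placeholders.

```yaml
computer:
    general_actions:
        - "source /path/to/env.sh"      # added to the script as-is, no prefix
    module_actions:
        - "purge"                       # -> module purge
        - "load netcdf"                 # -> module load netcdf
    spack_actions:
        - "load cdo"                    # -> spack load cdo
    export_vars:
        MY_VARIABLE: "some_value"       # -> export MY_VARIABLE=some_value
    unset_vars:
        - "MY_OLD_VARIABLE"             # -> unset MY_OLD_VARIABLE
    include_env_from_component_files: True   # default
    merge_component_envs:
        compile: false                  # default: one environment per component
        run: true                       # default: environments are merged
```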

Note

For more detailed information on all environment configuration options, including attribute-based selection, coupled setup environment control, and advanced environment management features, please refer to the ESM Environment documentation.

Other variables

Key	Description
metadata	List to include descriptive information about the model (e.g. Authors, Institute, Publications) used to produce the content of Supported Models. This information should be organized in nested keys followed by the corresponding description. Nested keys do not receive any special treatment, meaning that you can include here any kind of information about the model. Only the Publications key is treated in a particular way: it can consist of a single element or a list, in which each element contains a link to the publication inside <> (e.g. - Title, Authors, Journal, Year. <https://doi.org/...>).
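A minimal sketch of a metadata block, following the description above, could look like this. All values are placeholders, and apart from Authors, Institute and Publications the nested key names are free-form; `Description` is just one hypothetical example.

```yaml
metadata:
    Institute: "<institute name>"
    Authors: "<author names>"
    Description: "<free-form text about the model>"   # hypothetical extra key
    Publications:
        # Each element ends with a link to the publication inside <>.
        - "Title, Authors, Journal, Year. <https://doi.org/...>"
```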