ESM Environment

The esm_environment package generates the environments for the different HPC systems supported by ESM-Tools. It does so through the EnvironmentInfos class, which is used inside the different ESM-Tools packages.

To define an HPC environment correctly, a YAML file for that system must be included in the esm_tools package, inside the configs/machines/ folder (e.g. levante.yaml). This file should contain all the required preset variables for that system and, optionally, the environment variables general_actions, module_actions, spack_actions, export_vars, and unset_vars. These environment variables are defined in the computer section (key) of the configuration.

Hint

To get started easily, you can use the esm_tools command to create machine files, component files, or setup files, which include examples of the environment variables in the computer section:

# Create a new machine configuration
esm_tools create-new-config -t machine NAME

# Create a new component configuration
esm_tools create-new-config -t component NAME

# Create a new setup configuration
esm_tools create-new-config -t setup NAME

Environment variables

Environment variables must be defined inside the computer section of your YAML configuration files. These can be machine files, component files, setup files, or runscripts. All environment-related keys should be placed inside this section.

computer:
    # Environment variables go here
    module_actions: [...]
    export_vars: {...}

The following environment variables are supported:

general_actions (list)

A list of general actions to be included in the compilation and run scripts. These are added directly to the script without any prefix.

computer:
    general_actions:
        - "echo 'Starting environment setup'"

module_actions (list)

A list of module actions to be included in the compilation and run scripts generated by esm_master and esm_runscripts respectively, such as module load netcdf, module unload netcdf, module purge, etc. Each entry uses the same syntax as the command you would normally type in the shell, but without the leading module word, for example:

computer:
    module_actions:
        - "purge"
        - "load netcdf"

This variable also allows for sourcing files by adding a member to the list such as source <file_to_be_sourced>. You can combine module loads with sourcing files in the same list, as shown in the following example:

computer:
    module_actions:
        - "source /path/to/module/conf.sh"
        - "load netcdf/4.7.4"

spack_actions (list)

A list of spack actions to be included in the compilation and run scripts. Similar to module_actions, but for Spack package manager commands.

computer:
    spack_actions:
        - "load python"
        - "load netcdf-c"

export_vars (dict)

A dictionary containing all the variables (and their values) to be exported. The syntax is as follows:

computer:
    export_vars:
        A_VAR_TO_BE_EXPORTED: the_value

The previous example will result in the following export in the script produced by esm_master or esm_runscripts:

export A_VAR_TO_BE_EXPORTED=the_value

Because export_vars is a dictionary, it cannot contain repeated keys. This can be a problem when an environment needs to redefine a variable at different points of the script. To overcome this limitation, the same variable may be repeated if its key is followed by an integer index written as [(int)], for example [(1)]:

computer:
    export_vars:
        A_VAR_TO_BE_EXPORTED: the_value
        "A_VAR_TO_BE_EXPORTED[(1)]": $A_VAR_TO_BE_EXPORTED:another_value

The resulting script will contain the following exports:

export A_VAR_TO_BE_EXPORTED=the_value
export A_VAR_TO_BE_EXPORTED=$A_VAR_TO_BE_EXPORTED:another_value

Note that the index is removed once the exports are transferred into the script.

unset_vars (list)

A list of variables to be unset in the script.

computer:
    unset_vars:
        - "OLD_VAR"
        - "ANOTHER_OLD_VAR"

The resulting script will contain:

unset OLD_VAR
unset ANOTHER_OLD_VAR
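
The keys described above can be combined in a single computer section. The following is a minimal sketch, not taken from a real machine file: the module name, the path, and the variable names are placeholders chosen for illustration.

```yaml
computer:
    general_actions:
        - "echo 'Setting up the environment'"
    module_actions:
        - "purge"
        - "load netcdf"                  # placeholder module name
    export_vars:
        NETCDF_ROOT: "/path/to/netcdf"   # placeholder path
    unset_vars:
        - "OLD_VAR"                      # placeholder variable name
```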

Using choose blocks with general.execution_mode

ESM-Tools automatically sets general.execution_mode to either compile or run, depending on whether a compilation script or a run script is being generated. You can use this to create execution-mode-specific environments with choose_ blocks. This common pattern lets you define different environment variables for the compilation and run scripts:

computer:
    choose_general.execution_mode:
        compile:
            # These environment variables are only included during compilation
            module_actions:
                - "load intel/19.1.3"
                - "load netcdf-fortran/4.5.3"
            export_vars:
                NETCDF_ROOT: "/path/to/netcdf"
                FORTRAN_COMPILER: "ifort"
        run:
            # These environment variables are only included during runtime
            module_actions:
                - "load intel/19.1.3"
                - "load hdf5/1.12.1"
            export_vars:
                OMP_NUM_THREADS: "4"
                IO_MODE: "async"

This approach allows you to maintain separate environments for compilation and runtime, which is often necessary as the requirements may differ between these phases.

Note

You can also nest choose_ blocks for more granular control. For example, you could combine choose_general.execution_mode with choose_computer.name to have different compilation and runtime environments for different machines.
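
As a hedged illustration of such nesting, the sketch below places a choose_general.execution_mode block inside each branch of choose_computer.name. The machine names are the ones used elsewhere in this document; the module names and versions are placeholders, and the add_ prefix (described in the next section) is used on the assumption that the block lives in a component or setup file.

```yaml
computer:
    choose_computer.name:
        levante:
            choose_general.execution_mode:
                compile:
                    add_module_actions:
                        - "load intel/<version>"    # placeholder module
                run:
                    add_export_vars:
                        OMP_NUM_THREADS: "4"
        ollie:
            choose_general.execution_mode:
                compile:
                    add_module_actions:
                        - "load gcc/<version>"      # placeholder module
```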

Modification of the environment through the model/setup files

As previously mentioned, the default environment for an HPC system is defined inside its machine file (in configs/machines/<machine_name>.yaml). However, it is possible to modify this environment directly through the computer section of the component and/or setup files (or even inside the runscript) to adjust it to the component/setup requirements.

Note

The variables environment_changes, compiletime_environment_changes, and runtime_environment_changes are now deprecated. Instead, define environment variables directly in the computer section using the add_ prefix.

To add to or modify the environment, use the following variables in the computer section:

add_general_actions (list)

Adds actions to the general_actions list.

add_module_actions (list)

Adds actions to the module_actions list.

add_spack_actions (list)

Adds actions to the spack_actions list.

add_export_vars (dict)

Adds variables to the export_vars dictionary.

add_unset_vars (list)

Adds variables to the unset_vars list.

The syntax for these environment variables is the same as their counterparts without the add_ prefix. These variables can be nested inside choose_ blocks:

computer:
    choose_computer.name:
        ollie:
            add_export_vars:
                COMPUTER_VAR: 'ollie'
        juwels:
            add_export_vars:
                COMPUTER_VAR: 'juwels'
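
The list-valued keys are used in the same way. As a sketch, a component file could extend the machine defaults as follows; the module name and the variable names are hypothetical.

```yaml
computer:
    add_module_actions:
        - "load cdo"                 # hypothetical extra module
    add_export_vars:
        COMPONENT_DEBUG: "1"         # hypothetical variable
    add_unset_vars:
        - "CONFLICTING_VAR"          # hypothetical variable
```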

Note

These changes are model-specific for compilation by default, meaning that the changes will only occur in the compilation script of the model containing those changes. For runtime, the environments of all components are added together into the same .run script. Please refer to Coupled setup environment control for an explanation of how to control environments for a whole setup.

Order of environment variables in configuration files

The placement of a choose_ block in relation to other environment variables (within the same file) controls the order in which they appear in the generated script. Variables defined above a choose_ block will appear before those defined inside the block, while variables defined below the block will appear after them. This allows you to control the sequence of environment operations, which can be important when certain variables depend on others.

general:
    execution_mode: run
computer:
    name: levante
    # These will be processed FIRST
    choose_computer.name:
        levante:
            add_export_vars:
                VAR_A: 1

    # These will be processed NEXT
    add_export_vars:
        VAR_B: 2

    # These will be processed LAST
    choose_general.execution_mode:
        run:
            add_export_vars:
                VAR_C: 3

With this example, the generated script would contain these exports in the following order:

export VAR_A=1
export VAR_B=2
export VAR_C=3

This is because the refactored environment processing system preserves the order from the original configuration file, while resolving the appropriate choices based on the context.

Note

The ordering described above only applies to environment variables defined within the same file. The order of environment variables across different files (e.g., machine, component, and setup files) is determined by the configuration hierarchy and cannot be directly controlled. Variables from higher priority files (according to the YAML File Hierarchy) will appear in the script before variables from lower priority files.

Coupled setup environment control

Note

The features described in this section are advanced features that most users will not need to use. They are primarily intended for ESM-Tools developers or experienced users who need to manage complex environment configurations in coupled setups. They provide greater flexibility and control over how environments are managed across components.

Removing environment variables from component files

When working with coupled setups, you may want to exclude environment variables that are defined in component configuration files. There are two ways to do this:

  1. Global setting: Set include_env_from_component_files: false in the main computer section of your setup configuration to exclude environment variables from all component files.

  2. Component-specific setting: Set include_env_from_component_files: false in a specific component section of your setup configuration to exclude environment variables only from that component’s files.

This feature is particularly useful when you want complete control over the environment in a coupled setup, overriding any component-specific environment settings that might conflict with your setup requirements.

Example: Disabling component environment variables

# In your coupled setup configuration (e.g., awicm.yaml)
general:
    setup_name: awicm

computer:
    # Define setup-wide environment instead
    add_export_vars:
        SETUP_VAR: "setup_value"

echam:
    # Component-specific setting: Only exclude environment variables from echam component file
    include_env_from_component_files: false
    add_export_vars:
        ECHAM_VAR: "echam_value"

In this example:

  1. The setup file defines general environment variables in the main computer section that apply to all components

  2. For the echam component specifically, we set include_env_from_component_files: false to exclude any environment variables defined in the echam component file

  3. The environment variables defined directly in the echam section of the setup file (ECHAM_VAR) are still included
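
The example above uses the component-specific setting. For the global setting, a sketch under the same assumptions (illustrative variable name and value) would place the key in the main computer section instead:

```yaml
# In your coupled setup configuration (e.g., awicm.yaml)
general:
    setup_name: awicm

computer:
    # Global setting: exclude environment variables from all component files
    include_env_from_component_files: false
    # Define the setup-wide environment here instead
    add_export_vars:
        SETUP_VAR: "setup_value"
```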

Environment merging behavior

The merging behavior for environments in coupled setups is controlled by the merge_component_envs setting in the computer section:

computer:
    merge_component_envs:
        compile: false  # Component-specific for compilation
        run: true       # Merged for runtime (default)

When set to true (default for runtime), environments from all components are merged into a single environment. When set to false (default for compilation), each component maintains its own environment.
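
For instance, assuming you want a single shared environment for compilation as well, the compile default could be overridden as sketched below. Whether this is appropriate depends on the components involved, since their compilation requirements may conflict.

```yaml
computer:
    merge_component_envs:
        compile: true   # Override the default: merge component environments for compilation
        run: true       # Keep the runtime default
```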

Advanced control with attributes

In a setup, you might want fine-grained control over the environment variables that you add to the computer section, for example to indicate that they belong to a specific component or that they should only be added during compilation or runtime. For this purpose, the following attributes can be specified inside an environment variable, instead of only its value:

  • _value: The actual value of the environment variable

  • _execution_mode: The execution mode for which this variable applies (“compile” or “run”)

  • _component: The component for which this variable applies

If the current execution mode or component doesn’t match the specified attributes, the variable will take the value defined in _old_value (if it exists) or be excluded altogether.

Example 1: Variable that only applies during FESOM runtime:

computer:
    export_vars:
        MODEL_SPECIFIC_VAR:
            _value: "some_value"
            _component: "fesom"     # Only applies to FESOM
            _execution_mode: "run"  # Only during runtime

With this configuration, when running FESOM (and only in this case), the variable would appear in the run script as:

export MODEL_SPECIFIC_VAR="some_value"

In all other cases (compiling FESOM, compiling ECHAM, or running ECHAM), this variable would not appear in the generated scripts because the component and/or execution mode attributes don’t match.

Example 2: Variable that only applies during OpenIFS compilation:

computer:
    export_vars:
        OIFS_OASIS_BASE:
            _value: /path/to/oasis
            _execution_mode: compile
            _component: oifs

This variable would only be included in the compilation script for OpenIFS, and would be excluded in all other cases.