esm_parser package

Top-level package for ESM Parser.

Submodules

esm_parser.esm_parser module

YAML Parser for Earth System Models

One core element of the esm-tools is the description of model configurations and experiments with the aid of YAML files. Beyond the standard features of YAML, several specific conventions have been implemented to ease the description of your simulations. These conventions are described below, and the functions which implement them are documented with minimal examples. Internally, after parsing, the YAML files are converted into a single Python dictionary.

Parsing takes place by initializing objects which represent either an entire setup, ConfigSetup, or a specific component, ConfigComponent. Both of these objects derive from GeneralConfig, which is a dictionary subclass that performs specific parsing steps during the object’s creation. The parsing steps are presented in the order in which they are resolved:

When initializing a ConfigSetup or ConfigComponent, the name of the desired setup or component must be given, e.g. "awicm" or "echam". This configuration is immediately loaded along with any further configs listed in the section “further_reading”. Note that this means that any configuration listed in “further_reading” must not contain any variables.

Following this step, a method called _config_init is run for all classes derived from GeneralConfig. For components, any entries listed under "include_submodels" are attached and registered under a new keyword "submodels".

For setups, the next step is to determine the computing host and load the appropriate configuration files. Setups divide their configuration into 3 specific parts:

  1. Setup information, contained under config['setup']. This includes, e.g. information regarding a standalone setup, possible coupling, etc.
  2. Model Information, under config['model']. This contains specific information for all models and submodels, such as resolution, input file names, namelists, etc.
  3. User information, under config['model']. Here the user can override any of the defaults with their own choices.

In the next step, all keys starting with "choose_" are determined, along with any information they set. This is done first for the setup, and then for the models. These are filtered to determine an independent choice, and if cyclic dependencies occur, an error is raised. All choices are then resolved until nothing is left.
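The resolution of a single choice can be sketched as follows. This is only an illustration under the assumption that a "choose_" block maps possible values of a variable to the entries they set; the helper resolve_simple_choose and the sample keys and values are hypothetical and are not taken from esm_parser.

```python
def resolve_simple_choose(config):
    """Resolve every top-level ``choose_<variable>`` block of a flat config.

    Hypothetical helper for illustration only; not part of esm_parser.
    """
    resolved = dict(config)
    for key in list(config):
        if not key.startswith("choose_"):
            continue
        variable = key[len("choose_"):]
        cases = resolved.pop(key)          # the choose_ block itself disappears
        choice = resolved.get(variable)    # current value of the chosen variable
        # Merge in whatever the matching case sets; other cases are dropped.
        resolved.update(cases.get(choice, {}))
    return resolved


# Made-up keys and values, used only to show the mechanism.
config = {
    "resolution": "low",
    "choose_resolution": {
        "low": {"timestep": 600},
        "high": {"timestep": 300},
    },
}
result = resolve_simple_choose(config)
```

In this sketch the block for "high" is discarded, and the entries of the "low" case end up on the same level as the variable that selected them.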


Specific documentation for classes and functions is given below:

class esm_parser.esm_parser.ConfigSetup(model, version, user_config)[source]

Bases: esm_parser.esm_parser.GeneralConfig

Config Class for Setups

finalize()[source]
run_recursive_functions(config, isblacklist=True)[source]
exception esm_parser.esm_parser.EsmParserError[source]

Bases: Exception

Raise this error when the parser has problems

class esm_parser.esm_parser.GeneralConfig(model, version, user_config)[source]

Bases: dict

All configs do this!

esm_parser.esm_parser.actually_find_variable(tree, rhs, full_config)[source]
esm_parser.esm_parser.add_entries_from_chapter(config, add_chapter, add_entries)[source]
esm_parser.esm_parser.add_entries_to_chapter_in_config(model_config, valid_model_names, setup_config, valid_setup_names)[source]
esm_parser.esm_parser.add_entry_to_chapter(add_chapter, add_entries, model_to_add_to, model_with_add_statement, model_config, setup_config)[source]
esm_parser.esm_parser.add_more_important_tasks(choose_keyword, all_set_variables, task_list)[source]

Determines dependencies of a choose keyword.

Parameters:
  • choose_keyword (str) – The keyword, starting with choose, which is looked through to check if there are any dependencies that must be resolved first to correctly resolve this one.
  • all_set_variables (dict) – All variables that can be set
  • task_list (list) – A list in the order in which tasks must be resolved for choose_keyword to make sense.
Returns:

A list of choices which must be made in order for choose_keyword to make sense.

Return type:

task_list

esm_parser.esm_parser.attach_single_config(config, path, attach_value)[source]
esm_parser.esm_parser.attach_to_config_and_reduce_keyword(config_to_read_from, config_to_write_to, full_keyword, reduced_keyword='included_files', level_to_write_to=None)[source]

Attaches a new dictionary to the config, and registers it as the value of reduced_keyword.

Parameters:
  • config_to_read_from (dict) – The configuration dictionary from which information is read. The keyword from which additional YAML files are read should be on the top level of this dictionary.
  • config_to_write_to (dict) – The dictionary into which the contents of config_to_read_from[full_keyword] are written.
  • full_keyword – The keyword where contents are extracted from
  • reduced_keyword – The keyword where the contents of config_to_read_from[full_keyword] are written to
  • level_to_write_to – If this is specified, the attached entries are written here instead of in the top level of config_to_write_to. Note that only one level down is currently supported.

The purpose behind this is to have a chapter in config such as “include_submodels” = [“echam”, “fesom”], which would then find the “echam.yaml” and “fesom.yaml” configs and attach them to “config” under config[submodels]. The entire config for, e.g., echam would then show up in config[echam].

Since config_to_read_from and config_to_write_to are dict objects, they are modified in place. Note also that the entry config_to_read_from[full_keyword] is deleted at the end of the routine.

If the entry in config_to_read_from[full_keyword] is a list, each item in that list is split into two parts: model and model_part. For example:

>>> # Assuming: config_to_read_from[full_keyword] = ['echam.datasets', 'echam.restart.streams']
>>> model, model_part = 'echam', 'datasets' # first part
>>> model, model_part = 'echam', 'restart.streams' # second part

The first part, echam in the example, is used to determine where to look for new YAML files. Then the YAML file with the corresponding name, here echam.datasets.yaml, is loaded and attached to the config.
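The split described above can be reproduced with a short sketch. The helper split_model_and_part is hypothetical and only mirrors the behaviour described in the text (split at the first dot); it is not the package's own code.

```python
def split_model_and_part(item):
    """Split 'model.part' at the first dot only (hypothetical helper)."""
    model, _, model_part = item.partition(".")
    return model, model_part


entries = ["echam.datasets", "echam.restart.streams"]
pairs = [split_model_and_part(entry) for entry in entries]

# File names that would be looked up for each entry, following the text above.
filenames = [f"{model}.{part}.yaml" for model, part in pairs]
```

Splitting only at the first dot keeps multi-part names such as restart.streams intact.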

Warning

Both config_to_read_from and config_to_write_to are modified in place!

esm_parser.esm_parser.attach_to_config_and_remove(config, attach_key)[source]

Attaches extra dict to this one and removes the chapter

Updates the dictionary on config with values from any file found under a listing specified by attach_key.

Parameters:
  • config (dict) – The configuration to update
  • attach_key (str) – A key whose value points to a list of various yaml files to update config with.

Warning

The config is modified in place!

esm_parser.esm_parser.basic_add_entries_to_chapter_in_config(config)[source]
esm_parser.esm_parser.basic_add_more_important_tasks(choose_keyword, all_set_variables, task_list)[source]

Determines dependencies of a choose keyword.

Parameters:
  • choose_keyword (str) – The keyword, starting with choose, which is looked through to check if there are any dependencies that must be resolved first to correctly resolve this one.
  • all_set_variables (dict) – All variables that can be set
  • task_list (list) – A list in the order in which tasks must be resolved for choose_keyword to make sense.
Returns:

A list of choices which must be made in order for choose_keyword to make sense.

Return type:

task_list

esm_parser.esm_parser.basic_choose_blocks(config_to_resolve, config_to_search, isblacklist=True)[source]
esm_parser.esm_parser.basic_determine_set_variables_in_choose_block(config)[source]
esm_parser.esm_parser.basic_find_add_entries_in_config(mapping)[source]
esm_parser.esm_parser.basic_find_one_independent_choose(all_set_variables)[source]

Given a dictionary of all_set_variables, which comes out of the function determine_set_variables_in_choose_block, gives a list of task/variable dependencies to resolve in order to figure out the variable.

Parameters:all_set_variables (dict) –
Returns:task_list – A list of tuples comprising (model_name, var_name) in order to resolve one choose_ block. This list is built in such a way that the beginning of the list provides dependencies for later on in the list.
Return type:list
esm_parser.esm_parser.basic_find_remove_entries_in_config(mapping)[source]
esm_parser.esm_parser.basic_list_all_keys_starting_with_choose(mapping, ignore_list, isblacklist)[source]
esm_parser.esm_parser.basic_remove_entries_from_chapter_in_config(config)[source]
esm_parser.esm_parser.choose_blocks(config, blackdict={}, isblacklist=True)[source]
esm_parser.esm_parser.complete_config(user_config)[source]
esm_parser.esm_parser.convert(value)[source]
esm_parser.esm_parser.could_be_bool(value)[source]
esm_parser.esm_parser.could_be_complex(value)[source]
esm_parser.esm_parser.could_be_float(value)[source]
esm_parser.esm_parser.could_be_int(value)[source]
esm_parser.esm_parser.deep_update(chapter, entries, config, blackdict={})[source]
esm_parser.esm_parser.del_value_for_nested_key(config, key)[source]

In a dict of dicts, delete a key/value pair.

Parameters:
  • config (dict) – The dict to delete in.
  • key (str) – The key to delete.

Warning

The config is modified in place!

esm_parser.esm_parser.determine_computer_from_hostname()[source]

Determines which yaml config file is needed for this computer

Notes

The supercomputer must be registered in the all_machines.yaml file in order to be found.

Returns:A string for the path of the computer specific yaml file.
Return type:str
esm_parser.esm_parser.determine_regex_list_match(test_str, regex_list)[source]
esm_parser.esm_parser.determine_set_variables_in_choose_block(config, valid_model_names, model_name=[])[source]

Given a config, figures out which variables are resolved in a choose block.

In order to avoid cyclic dependencies, it is necessary to figure out which variables are set in which choose block. This function recurses over all key/value pairs of a configuration, and for any key which is a model name, it determines which variables are set in its choose_ blocks. Tuples of (model_name, var_name) are appended to a list, which is returned with all its duplicates removed.

Parameters:
  • config (dict) –
  • valid_model_names (list) –
  • model_name (list) –
Returns:

set_variables – A list of tuples of model_name and corresponding variable that are determined in config

Return type:

list

esm_parser.esm_parser.dict_merge(dct, merge_dct)[source]

Recursive dict merge. Inspired by dict.update(); instead of updating only top-level keys, dict_merge recurses down into dicts nested to an arbitrary depth, updating keys. The merge_dct is merged into dct.

Parameters:
  • dct – dict onto which the merge is executed
  • merge_dct – dct merged into dct
Returns:None
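A minimal sketch of such a recursive merge, written from the description above rather than copied from the package; the name dict_merge_sketch and the sample values are illustrative.

```python
def dict_merge_sketch(dct, merge_dct):
    """Recursively merge ``merge_dct`` into ``dct`` in place; returns None."""
    for key, value in merge_dct.items():
        if isinstance(dct.get(key), dict) and isinstance(value, dict):
            # Both sides are dicts: descend instead of overwriting.
            dict_merge_sketch(dct[key], value)
        else:
            dct[key] = value


base = {"echam": {"resolution": "T63", "levels": 47}, "account": "abc"}
dict_merge_sketch(base, {"echam": {"levels": 95}})
```

A plain base.update(...) would have replaced the whole "echam" entry and lost "resolution"; the recursive version only touches "levels".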

esm_parser.esm_parser.do_math_in_entry(tree, rhs, config)[source]
esm_parser.esm_parser.find_add_entries_in_config(mapping, model_name)[source]
esm_parser.esm_parser.find_key(d_search, k_search, exc_strings='', level='', paths2finds=[], sep='.')[source]

Searches for a key inside a nested dictionary. It can search for an integer or a piece of a string. A list of strings can be given as input to search for keys containing all of them. An additional list of strings can be specified so that keys containing them are excluded from the findings. This is a recursive function.

Note

Always define paths2finds, to avoid expansion of this list with consecutive calls.

Parameters:
  • d_search (dict) – The dictionary to be explored recursively.
  • k_search (list, str, int) – String, integer or list of strings to be searched for in d_search.
  • exc_strings (list, str) – String or list of strings; keys containing them are excluded from the finds.
  • level (string) – String specifying the full path to the currently evaluated dictionary. Each dictionary level in these strings is separated by a ".".
  • paths2finds (list) – List of strings specifying the full path to the found keys in d_search. Each dictionary level in these strings is separated by the string specified in sep (default is ".").
  • sep (string) – String separator used in between each path component in paths2finds.
Returns:

paths2finds – List of strings specifying the full path to the found keys in d_search. Each dictionary level in these strings is separated by a ".".

Return type:

list
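The advice to always pass paths2finds follows from how Python treats mutable default arguments: the default list is created once and shared between calls. The toy function collect below is hypothetical and only demonstrates that general behaviour; it is not find_key itself.

```python
def collect(item, found=[]):  # mutable default: created once, shared by all calls
    found.append(item)
    return found


first = list(collect("a"))       # copy of the result after the first call
second = list(collect("b"))      # the shared default list still holds "a"

fresh = collect("b", found=[])   # passing a new list avoids the carry-over
```

Passing an explicit empty list on every call, as in the last line, keeps results from earlier searches out of the new one.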

esm_parser.esm_parser.find_one_independent_choose(all_set_variables)[source]

Given a dictionary of all_set_variables, which comes out of the function determine_set_variables_in_choose_block, gives a list of task/variable dependencies to resolve in order to figure out the variable.

Parameters:all_set_variables (dict) –
Returns:task_list – A list of tuples comprising (model_name, var_name) in order to resolve one choose_ block. This list is built in such a way that the beginning of the list provides dependencies for later on in the list.
Return type:list
esm_parser.esm_parser.find_remove_entries_in_config(mapping, model_name)[source]
esm_parser.esm_parser.find_value_for_nested_key(mapping, key_of_interest, tree=[])[source]

In a dict of dicts, find a value for a given key

Parameters:
  • mapping (dict) – The nested dictionary to search through
  • key_of_interest (str) – The key to search for.
  • tree (list) – Where to start searching
Returns:

The value of key anywhere in the nested dict.

Return type:

value

Note

Behaviour of what happens when a key appears twice anywhere on different levels of the nested dict is unclear. The uppermost one is taken, but if the key appears in more than one item, I’d guess something ambiguous occurs…
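The ambiguity mentioned in the note can be made concrete with a small sketch. The function find_nested_value is a hypothetical stand-in that prefers upper levels and otherwise returns the first match it meets; the real function may behave differently.

```python
def find_nested_value(mapping, key_of_interest):
    """Return the first value found for key_of_interest, upper levels first."""
    if key_of_interest in mapping:
        return mapping[key_of_interest]
    for value in mapping.values():
        if isinstance(value, dict):
            found = find_nested_value(value, key_of_interest)
            if found is not None:
                return found
    return None


config = {"general": {"account": "abc"}, "echam": {"resolution": "T63"}}
resolution = find_nested_value(config, "resolution")

# The same key in two sibling branches: which one wins depends on dict order.
ambiguous = {"a": {"x": 1}, "b": {"x": 2}}
winner = find_nested_value(ambiguous, "x")
```

In this sketch the sibling that was inserted first wins, which is exactly the kind of implicit behaviour the note warns about.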

esm_parser.esm_parser.find_variable(tree, rhs, full_config, white_or_black_list, isblacklist)[source]
esm_parser.esm_parser.finish_priority_merge(config)[source]
esm_parser.esm_parser.initialize_from_shell_script(filepath)[source]
esm_parser.esm_parser.initialize_from_yaml(filepath)[source]
esm_parser.esm_parser.list_all_keys_starting_with_choose(mapping, model_name, ignore_list, isblacklist)[source]

Given a mapping (e.g. a dict-type object), list all keys that start with "choose_" on any level of the nested dictionary.

Parameters:
  • mapping (dict) – The dictionary to search through for keys starting with "choose_"
  • model_name (str) –
  • ignore_list (list) –
Returns:

all_chooses – A list of tuples for …. A dictionary containing all key, value pairs starting with "choose_".

Return type:

list

esm_parser.esm_parser.list_all_keys_with_priority_marker(config)[source]
esm_parser.esm_parser.list_to_multikey(tree, rhs, config_to_search, ignore_list, isblacklist)[source]

A recursive_run_function conforming func which puts any list based key to a multikey elsewhere. Sorry, that sounds confusing even to me, and I wrote the function.

Parameters:
  • tree (list) –
  • rhs (str) –
  • config_to_search (dict) –

Notes

Internal variable definitions in this function; based upon the example: prefix_[[streams-->STREAM]]_postfix

  • ok_part: prefix_
  • actual_list: streams-->STREAM
  • key_in_list: streams
  • value_in_list: STREAM
  • entries_of_key: list of actual chapter streams, e.g. [accw, echam6, e6hrsp, ...]
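How the example key breaks down into these parts can be sketched with a regular expression. The pattern and the helper split_list_key are assumptions made for illustration; the names of the returned parts follow the list above, while "rest" is a hypothetical name for the trailing part.

```python
import re

# Assumed pattern: anything, then [[key-->VALUE]], then the remainder.
PATTERN = re.compile(r"^(?P<ok_part>.*?)\[\[(?P<actual_list>.+?)\]\](?P<rest>.*)$")


def split_list_key(key):
    """Split a key such as 'prefix_[[streams-->STREAM]]_postfix' into parts."""
    match = PATTERN.match(key)
    if match is None:
        return None  # no [[...]] section present
    actual_list = match.group("actual_list")
    key_in_list, _, value_in_list = actual_list.partition("-->")
    return {
        "ok_part": match.group("ok_part"),
        "actual_list": actual_list,
        "key_in_list": key_in_list,
        "value_in_list": value_in_list,
        "rest": match.group("rest"),
    }


parts = split_list_key("prefix_[[streams-->STREAM]]_postfix")
```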
esm_parser.esm_parser.look_for_file(model, item)[source]
esm_parser.esm_parser.mark_dates(tree, rhs, config)[source]

Adds the DATE_MARKER to any entry whose key ends with "date"

esm_parser.esm_parser.marked_date_to_date_object(tree, rhs, config)[source]

Transforms a marked date string into a Date object

esm_parser.esm_parser.merge_dicts(*dict_args)[source]

Given any number of dicts, shallow copy and merge into a new dict. Precedence goes to key-value pairs in later dicts.

Note that this function only merges the first level. For deeper merging, use priority_merge_dicts.

Parameters:*dict_args – Any number of dictionaries to merge together
Returns:A merged dictionary (shallow)
Return type:dict
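A sketch of what a shallow merge means in practice, written from the description above; merge_dicts_sketch and the sample dictionaries are illustrative and not the package's implementation.

```python
def merge_dicts_sketch(*dict_args):
    """Shallow-merge any number of dicts into a new one; later dicts win."""
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result


a = {"model": "echam", "opts": {"x": 1}}
b = {"opts": {"y": 2}}
merged = merge_dicts_sketch(a, b)
```

Because only the first level is merged, the nested "opts" entry from b replaces the one from a entirely instead of being combined with it.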
esm_parser.esm_parser.perform_actions(tree, rhs, config)[source]
esm_parser.esm_parser.pprint_config(config)[source]

Prints the dictionary given to the stdout in a nicely formatted YAML style.

Parameters:config (dict) – The configuration to print
Returns:
Return type:None
esm_parser.esm_parser.priority_merge_dicts(first_config, second_config, priority='first')[source]

Given two dictionaries, merge them together preserving either first or last entries.

Parameters:
  • first_config (dict) –
  • second_config (dict) –
  • priority (str) – One of “first” or “second”. Specifies which dictionary should be given priority when merging.
Returns:

merged – A dictionary containing all keys, with duplicate entries reverting to the dictionary given in “priority”. The merge occurs across all levels.

Return type:

dict
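The effect of the priority argument can be sketched as below. This is an assumption-based illustration of a deep merge with a priority side; priority_merge_sketch is a hypothetical name and the real function may differ in details such as copying.

```python
import copy


def priority_merge_sketch(first_config, second_config, priority="first"):
    """Deep-merge two dicts into a new one; the priority side wins on conflicts."""
    if priority == "second":
        first_config, second_config = second_config, first_config
    # From here on, first_config is the side with priority.
    merged = copy.deepcopy(second_config)
    for key, value in first_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = priority_merge_sketch(value, merged[key])
        else:
            merged[key] = copy.deepcopy(value)
    return merged


first = {"a": 1, "nested": {"x": 1}}
second = {"a": 2, "b": 3, "nested": {"x": 9, "y": 2}}
first_wins = priority_merge_sketch(first, second, priority="first")
second_wins = priority_merge_sketch(first, second, priority="second")
```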

esm_parser.esm_parser.purify_booleans(tree, rhs, config)[source]
esm_parser.esm_parser.recursive_get(config_to_search, config_elements)[source]

Recursively gets entries in a nested dictionary in the form outer_key.middle_key.inner_key = value

Given a list of config elements in the form above (e.g. the result of "outer_key.middle_key.inner_key".split("."), which splits the string on the dot), the value “value” of the innermost nest is returned.

Parameters:
  • config_to_search (dict) – The dictionary to search through
  • config_elements (list) – Each part of the next level of the dictionary to search, as a list.
Returns:

The value associated with the nested dictionary specified by config_elements.

Note

This is actually just a wrapper around the function actually_recursive_get, which is needed to pop off standalone model configurations.
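The lookup itself can be sketched in a few lines. The helper nested_get is hypothetical and ignores the special handling of standalone model configurations mentioned in the note.

```python
from functools import reduce


def nested_get(config_to_search, config_elements):
    """Follow the keys in config_elements down into a nested dict."""
    return reduce(lambda level, key: level[key], config_elements, config_to_search)


config = {"outer_key": {"middle_key": {"inner_key": "value"}}}
elements = "outer_key.middle_key.inner_key".split(".")
result = nested_get(config, elements)
```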

esm_parser.esm_parser.recursive_run_function(tree, right, level, func, *args, **kwargs)[source]

Recursively runs func on all nested dicts.

Tree is a list starting at the top of the config dictionary, where it will be labeled “top”

Parameters:
  • tree (list) – Where in the dictionary you are
  • right – The value of the last key in tree
  • level (str, one of "mappings", "atomic", "always") – When to perform func
  • func (callable) – A function to perform on all levels where the type of right is in level. See the Note for how this function’s call signature should look.
  • *args – Passed to func
  • **kwargs – Passed to func
Returns:

Return type:

right

Note

The func argument must be a callable (i.e. a function) and must have a call signature of the following form:

def func(tree, right, *args, **kwargs)
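
A function with this call signature can be exercised with a simplified driver. The sketch below is an assumption: run_on_leaves only covers values that are neither dicts nor lists (roughly what the "atomic" level suggests) and is not the package's recursive_run_function; shout is a made-up conforming function.

```python
def run_on_leaves(tree, right, func, *args, **kwargs):
    """Apply func to every value that is neither a dict nor a list."""
    if isinstance(right, dict):
        return {
            key: run_on_leaves(tree + [key], value, func, *args, **kwargs)
            for key, value in right.items()
        }
    if isinstance(right, list):
        return [
            run_on_leaves(tree + [None], item, func, *args, **kwargs)
            for item in right
        ]
    return func(tree, right, *args, **kwargs)


def shout(tree, right, suffix="!"):
    """Conforming func: receives the path (tree) and the value (right)."""
    if isinstance(right, str):
        return right.upper() + suffix
    return right


config = {"a": "x", "b": {"c": "y", "d": 3}}
result = run_on_leaves(["top"], config, shout)
```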
esm_parser.esm_parser.remove_entries_from_chapter(config, remove_chapter, remove_entries)[source]
esm_parser.esm_parser.remove_entries_from_chapter_in_config(model_config, valid_model_names, setup_config, valid_setup_names)[source]
esm_parser.esm_parser.remove_entry_from_chapter(remove_chapter, remove_entries, model_to_remove_from, model_with_remove_statement, model_config, setup_config)[source]
esm_parser.esm_parser.resolve_basic_choose(config, config_to_replace_in, choose_key, blackdict={})[source]
esm_parser.esm_parser.resolve_choose(model_with_choose, choose_key, config)[source]
esm_parser.esm_parser.shell_file_to_dict(filepath)[source]

Generates a ConfigSetup from an old shell script.

See also ShellscriptToUserConfig

Parameters:filepath (str) – The file to load
Returns:The parsed config.
Return type:ConfigSetup
esm_parser.esm_parser.to_boolean(value)[source]
esm_parser.esm_parser.unmark_dates(tree, rhs, config)[source]

Removes the DATE_MARKER from any entry whose value contains the DATE_MARKER.

esm_parser.esm_parser.user_error(error_type, error_text)[source]

User-friendly error using sys.exit() instead of an Exception.

Parameters:
  • error_type (str) – Error type used for the error heading.
  • error_text (str) – Text clarifying the error.

esm_parser.shell_to_dict module

Backwards compatibility for old runscripts

esm_parser.shell_to_dict.ShellscriptToUserConfig(runscript_path)[source]

Generates a User Config from an old Shellscript

esm_parser.shell_to_dict.mini_recursive_run_func(config, func)[source]
esm_parser.shell_to_dict.purify_cases(config)[source]
esm_parser.shell_to_dict.remap_old_new_keys(config)[source]

esm_parser.yaml_to_dict module

exception esm_parser.yaml_to_dict.EsmConfigFileError(fpath, yaml_error)[source]

Bases: Exception

Exception for yaml file containing tabs or other syntax issues.

An exception used when yaml.load() throws a yaml.scanner.ScannerError. This error occurs mainly when there are tabs inside a yaml file or when the syntax is incorrect. If tabs are found, this exception returns a user-friendly message indicating where the tabs are located in the yaml file.

Parameters:fpath (str) – Path to the yaml file
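
Locating tabs for such a message can be sketched with the standard library alone. The helper find_tabs and the sample text are hypothetical; how the exception itself reports the positions is not shown here.

```python
def find_tabs(text):
    """Return 1-based (line_number, column) pairs for every tab in text."""
    positions = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if char == "\t":
                positions.append((line_number, column))
    return positions


sample = "general:\n\tmodel: echam\n  version: 6\n"
tabs = find_tabs(sample)
```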
esm_parser.yaml_to_dict.check_changes_duplicates(yamldict_all, fpath)[source]

Finds variables containing _changes (but excluding add_) and checks whether they are compatible with the same _changes inside the same file. If they are not compatible, an error is returned in which the conflicting variable paths are specified. More than one _changes of a given type is allowed in a file, but they need to be part of the same choose_ and must not be accessible simultaneously in any situation.

Parameters:
  • yamldict_all (dict) – Dictionary read from the yaml file
  • fpath (str) – Path to the yaml file
esm_parser.yaml_to_dict.check_duplicates(src)[source]

Checks that there are no duplicates in a yaml file, and if there are returns an error stating which key is repeated and in which file the duplication occurs.

Parameters:src (object) – Source file object
Raises:ConstructorError – Raised if duplicated keys are found.
esm_parser.yaml_to_dict.find_last_choose(var_path)[source]

Locates the last choose_ on a string containing the path to a variable separated by “,”, and returns the path to the choose_ (also separated by “,”) and the case that follows the choose_.

Parameters:var_path (str) – String containing the path to the last choose_ separated by “,”.
Returns:
  • path2choose (str) – Path to the last choose_.
  • case (str) – Case after the choose.
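
A sketch of this lookup, assuming that path elements are separated by "," and that the relevant elements start with choose_. The helper find_last_choose_sketch and the sample path are illustrative only.

```python
def find_last_choose_sketch(var_path):
    """Return (path up to the last choose_ element, the case that follows it)."""
    parts = var_path.split(",")
    indices = [i for i, part in enumerate(parts) if part.startswith("choose_")]
    if not indices:
        return None, None  # no choose_ element in the path
    last = indices[-1]
    path2choose = ",".join(parts[: last + 1])
    case = parts[last + 1] if last + 1 < len(parts) else None
    return path2choose, case


# Made-up variable path for illustration.
path = "echam,choose_resolution,T63,choose_scenario,PI-CTRL,namelist_changes"
path2choose, case = find_last_choose_sketch(path)
```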
esm_parser.yaml_to_dict.yaml_file_to_dict(filepath)[source]

Given a yaml file, returns a corresponding dictionary.

If you do not give an extension, tries again after appending one. It raises an EsmConfigFileError exception if yaml files contain tabs.

Parameters:

filepath (str) – Where to get the YAML file from

Returns:

A dictionary representation of the yaml file.

Return type:

dict

Raises:
  • EsmConfigFileError – Raised when YAML file contains tabs or other syntax issues.
  • FileNotFoundError – Raised when the YAML file cannot be found and all extensions have been tried.
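
The retry with appended extensions can be sketched as follows. The extensions ".yaml" and ".yml", the helper resolve_config_path, and the injectable exists check are assumptions for illustration; which extensions the real function tries is not stated here.

```python
import os


def resolve_config_path(filepath, extensions=(".yaml", ".yml"), exists=os.path.isfile):
    """Return the first existing candidate: the path itself, then path + extension."""
    candidates = [filepath] + [filepath + ext for ext in extensions]
    for candidate in candidates:
        if exists(candidate):
            return candidate
    raise FileNotFoundError(f"No file found for {filepath!r}; tried {candidates}")


# A set stands in for the file system so the example runs anywhere.
known_files = {"configs/echam.yaml"}
found = resolve_config_path("configs/echam", exists=known_files.__contains__)
```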