esm_parser package¶
Top-level package for ESM Parser.
Submodules¶
esm_parser.esm_parser module¶
YAML Parser for Earth System Models¶
One core element of the esm-tools is the description of model configurations and experiments with the aid of YAML files. Beyond the standard features of YAML, several specific conventions have been implemented to ease the description of your simulations. These conventions are described below, and the functions which implement them are documented with minimal examples. Internally, after parsing, the YAML files are converted into a single Python dictionary.
Parsing takes place by initializing objects which represent either an entire setup, ConfigSetup, or a specific component, ConfigComponent. Both of these objects are based on GeneralConfig, which is a dictionary subclass performing specific parsing steps during the object’s creation. The parsing steps are presented in the order that they are resolved:
When initializing a ConfigSetup or ConfigComponent, the name of the desired setup or component must be given, e.g. "awicm" or "echam". This configuration is immediately loaded along with any further configs listed in the section “further_reading”. Note that this means that any configuration listed in “further_reading” must not contain any variables.
Following this step, a method called _config_init is run for all classes based on GeneralConfig. For components, any entries listed under "include_submodels" are attached and registered under a new keyword "submodels".
For setups, the next step is to determine the computing host and load the appropriate configuration files. Setups divide their configuration into three specific parts:
- Setup information, contained under config['setup']. This includes, e.g., information regarding a standalone setup, possible coupling, etc.
- Model information, under config['model']. This contains specific information for all models and submodels, such as resolution, input file names, namelists, etc.
- User information, under config['model']. Here the user can override any of the defaults with their own choices.
In the next step, all keys starting with "choose_" are determined, along with any information they set. This is done first for the setup, and then for the models. These are filtered to determine an independent choice, and if cyclic dependencies occur, an error is raised. All choices are then resolved until nothing is left.
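The "pick an independent choice, resolve it, repeat until nothing is left" loop can be sketched as follows. This is a minimal illustration and not the parser's own code: the function name `resolution_order`, the shape of the `dependencies` mapping, and the use of `ValueError` are all assumptions made for the example.

```python
def resolution_order(dependencies):
    """Return an order in which every choice comes after its dependencies.

    `dependencies` is a hypothetical mapping: choice name -> set of choice
    names that must be resolved first.
    """
    remaining = {name: set(deps) for name, deps in dependencies.items()}
    order = []
    while remaining:
        # A choice is independent once none of its dependencies is pending.
        independent = sorted(
            name for name, deps in remaining.items()
            if deps.isdisjoint(remaining)
        )
        if not independent:
            # Every pending choice waits on another pending one: a cycle.
            raise ValueError(f"Cyclic dependency among: {sorted(remaining)}")
        for name in independent:
            order.append(name)
            del remaining[name]
    return order


example = {
    "choose_a": {"choose_b"},
    "choose_b": set(),
    "choose_c": {"choose_a"},
}
order = resolution_order(example)
```

With the example mapping, `choose_b` is resolved first because nothing blocks it, then `choose_a`, then `choose_c`. Two choices that depend on each other leave no independent candidate, which is where the error is raised.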
Specific documentation for classes and functions is given below:
-
class
esm_parser.esm_parser.
ConfigSetup
(model, version, user_config)[source]¶ Bases:
esm_parser.esm_parser.GeneralConfig
Config Class for Setups
-
exception
esm_parser.esm_parser.
EsmParserError
[source]¶ Bases:
Exception
Raise this error when the parser has problems
-
class
esm_parser.esm_parser.
GeneralConfig
(model, version, user_config)[source]¶ Bases:
dict
All configs do this!
-
esm_parser.esm_parser.
add_entries_to_chapter_in_config
(model_config, valid_model_names, setup_config, valid_setup_names)[source]¶
-
esm_parser.esm_parser.
add_entry_to_chapter
(add_chapter, add_entries, model_to_add_to, model_with_add_statement, model_config, setup_config)[source]¶
-
esm_parser.esm_parser.
add_more_important_tasks
(choose_keyword, all_set_variables, task_list)[source]¶ Determines dependencies of a choose keyword.
- Parameters
choose_keyword (str) – The keyword, starting with choose, which is looked through to check if there are any dependencies that must be resolved first to correctly resolve this one.
all_set_variables (dict) – All variables that can be set
task_list (list) – A list in the order in which tasks must be resolved for choose_keyword to make sense.
- Returns
A list of choices which must be made in order for choose_keyword to make sense.
- Return type
task_list
-
esm_parser.esm_parser.
attach_to_config_and_reduce_keyword
(config_to_read_from, config_to_write_to, full_keyword, reduced_keyword='included_files', level_to_write_to=None)[source]¶ Attaches a new dictionary to the config, and registers it as the value of reduced_keyword.
- Parameters
config_to_read_from (dict) – The configuration dictionary from which information is read. The keyword from which additional YAML files are read should be on the top level of this dictionary.
config_to_write_to (dict) – The dictionary into which the contents of config_to_read_from[full_keyword] are written.
full_keyword – The keyword from which contents are extracted.
reduced_keyword – The keyword under which the contents of config_to_read_from[full_keyword] are written.
level_to_write_to – If this is specified, the attached entries are written here instead of in the top level of config_to_write_to. Note that only one level down is currently supported.
The purpose behind this is to have a chapter in config “include_submodels” = [“echam”, “fesom”], which would then find the “echam.yaml” and “fesom.yaml” configs, and attach them to “config” under config[submodels], and the entire config for e.g. echam would show up in config[echam].
Since config_to_read_from and config_to_write_to are dict objects, they are modified in place. Note also that the entry config_to_read_from[full_keyword] is deleted at the end of the routine.
If the entry in config_to_read_from[full_keyword] is a list, each item in that list is split into two parts: model and model_part. For example:
>>> # Assuming: config_to_read_from[full_keyword] = ['echam.datasets', 'echam.restart.streams']
>>> model, model_part = 'echam', 'datasets' # first part
>>> model, model_part = 'echam', 'restart.streams' # second part
The first part, in the example echam, is used to determine where to look for new YAML files. Then, a YAML file called echam.datasets.yaml is loaded and attached to the config.
Warning
Both config_to_read_from and config_to_write_to are modified in place!
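The split into model and model_part described above happens at the first dot only, so that 'echam.restart.streams' keeps 'restart.streams' together. A small sketch of that step, assuming nothing about the real implementation (the helper name `split_entry` is hypothetical):

```python
def split_entry(entry):
    """Split 'model.model_part' at the first dot only."""
    model, _, model_part = entry.partition(".")
    return model, model_part


entries = ["echam.datasets", "echam.restart.streams"]
pairs = [split_entry(entry) for entry in entries]
```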
-
esm_parser.esm_parser.
attach_to_config_and_remove
(config, attach_key)[source]¶ Attaches extra dict to this one and removes the chapter
Updates the dictionary config with values from any file found under a listing specified by attach_key.
- Parameters
config (dict) – The configuration to update
attach_key (str) – A key whose value points to a list of various yaml files to update config with.
Warning
The config is modified in place!
-
esm_parser.esm_parser.
basic_add_more_important_tasks
(choose_keyword, all_set_variables, task_list)[source]¶ Determines dependencies of a choose keyword.
- Parameters
choose_keyword (str) – The keyword, starting with choose, which is looked through to check if there are any dependencies that must be resolved first to correctly resolve this one.
all_set_variables (dict) – All variables that can be set
task_list (list) – A list in the order in which tasks must be resolved for choose_keyword to make sense.
- Returns
A list of choices which must be made in order for choose_keyword to make sense.
- Return type
task_list
-
esm_parser.esm_parser.
basic_choose_blocks
(config_to_resolve, config_to_search, isblacklist=True)[source]¶
-
esm_parser.esm_parser.
basic_find_one_independent_choose
(all_set_variables)[source]¶ Given a dictionary of all_set_variables, which comes out of the function determine_set_variables_in_choose_block, gives a list of task/variable dependencies to resolve in order to figure out the variable.
- Parameters
all_set_variables (dict) –
- Returns
task_list – A list of tuples comprising (model_name, var_name) in order to resolve one choose_ block. This list is built in such a way that the beginning of the list provides dependencies for later on in the list.
- Return type
list
-
esm_parser.esm_parser.
basic_list_all_keys_starting_with_choose
(mapping, ignore_list, isblacklist)[source]¶
-
esm_parser.esm_parser.
del_value_for_nested_key
(config, key)[source]¶ In a dict of dicts, delete a key/value pair.
- Parameters
config (dict) – The dict to delete in.
key (str) – The key to delete.
Warning
The config is modified in place!
-
esm_parser.esm_parser.
determine_computer_from_hostname
()[source]¶ Determines which yaml config file is needed for this computer
Notes
The supercomputer must be registered in the all_machines.yaml file in order to be found.
- Returns
A string for the path of the computer specific yaml file.
- Return type
str
-
esm_parser.esm_parser.
determine_set_variables_in_choose_block
(config, valid_model_names, model_name=[])[source]¶ Given a config, figures out which variables are resolved in a choose block.
In order to avoid cyclic dependencies, it is necessary to figure out which variables are set in which choose block. This function recurses over all key/value pairs of a configuration, and for any key which is a model name, it determines which variables are set in its choose_ blocks. Tuples of (model_name, var_name) are appended to a list, which is returned with all its duplicates removed.
- Parameters
config (dict) –
valid_model_names (list) –
model_name (list) –
- Returns
set_variables – A list of tuples of model_name and corresponding variable that are determined in config
- Return type
list
-
esm_parser.esm_parser.
dict_merge
(dct, merge_dct)[source]¶ Recursive dict merge. Inspired by dict.update(); instead of updating only top-level keys, dict_merge recurses down into dicts nested to an arbitrary depth, updating keys. The merge_dct is merged into dct.
- Parameters
dct – dict onto which the merge is executed
merge_dct – dict merged into dct
- Returns
None
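A recursive merge of this kind is a common recipe. The sketch below shows the general idea only; it is not the package's implementation, and the name `recursive_merge` and the example values are made up for illustration.

```python
def recursive_merge(dct, merge_dct):
    """Merge merge_dct into dct in place, descending into nested dicts."""
    for key, value in merge_dct.items():
        if isinstance(dct.get(key), dict) and isinstance(value, dict):
            # Both sides are dicts: merge one level deeper.
            recursive_merge(dct[key], value)
        else:
            # Otherwise the incoming value replaces or adds the key.
            dct[key] = value


base = {"echam": {"resolution": "T63", "levels": 47}}
recursive_merge(base, {"echam": {"resolution": "T127"}, "fesom": {"mesh": "core"}})
```

A plain `base.update(...)` would have replaced the whole "echam" entry and lost "levels". The recursive version only overwrites "resolution".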
-
esm_parser.esm_parser.
find_key
(d_search, k_search, exc_strings='', level='', paths2finds=[], sep='.')[source]¶ Searches for a key inside a nested dictionary. It can search for an integer or a piece of a string. A list of strings can be given as an input to search for keys containing all of them. An additional list of strings can be specified so that keys containing them are excluded from the findings. This is a recursive function.
Note
Always define paths2finds, to avoid expansion of this list with consecutive calls.
- Parameters
d_search (dict) – The dictionary to be explored recursively.
k_search (list, str, int) – String, integer or list of strings to be searched for in d_search.
exc_strings (list, str) – String or list of strings for keys containing them to be excluded from the finds. When set to an empty string, nothing is excluded.
level (string) – String specifying the full path to the currently evaluated dictionary. Each dictionary level in these strings is separated by a ".".
paths2finds (list) – List of strings specifying the full path to the found keys in d_search. Each dictionary level in these strings is separated by the string specified in sep (default is ".").
sep (string) – String separator used in between each path component in paths2finds.
- Returns
paths2finds – List of strings specifying the full path to the found keys in d_search. Each dictionary level in these strings is separated by a ".".
- Return type
list
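The note about always passing paths2finds follows from how Python treats a mutable default such as `paths2finds=[]`. The default list is created once, when the function is defined, and is then shared by every call that omits the argument. The toy function below (not part of esm_parser) shows the effect:

```python
def collect(item, found=[]):      # the default list is created only once
    found.append(item)
    return found


first = collect("a")
second = collect("b")             # reuses the list from the first call
fresh = collect("c", found=[])    # an explicit list starts clean
```

Here `first` and `second` are the same list object, which by now holds both items. The call that passes its own list is unaffected, which is why supplying `paths2finds=[]` explicitly is the safe pattern.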
-
esm_parser.esm_parser.
find_one_independent_choose
(all_set_variables)[source]¶ Given a dictionary of all_set_variables, which comes out of the function determine_set_variables_in_choose_block, gives a list of task/variable dependencies to resolve in order to figure out the variable.
- Parameters
all_set_variables (dict) –
- Returns
task_list – A list of tuples comprising (model_name, var_name) in order to resolve one choose_ block. This list is built in such a way that the beginning of the list provides dependencies for later on in the list.
- Return type
list
-
esm_parser.esm_parser.
find_value_for_nested_key
(mapping, key_of_interest, tree=[])[source]¶ In a dict of dicts, find a value for a given key
- Parameters
mapping (dict) – The nested dictionary to search through
key_of_interest (str) – The key to search for.
tree (list) – Where to start searching
- Returns
The value of key anywhere in the nested dict.
- Return type
value
Note
Behaviour of what happens when a key appears twice anywhere on different levels of the nested dict is unclear. The uppermost one is taken, but if the key appears in more than one item, I’d guess something ambiguous occurs…
-
esm_parser.esm_parser.
find_variable
(tree, rhs, full_config, white_or_black_list, isblacklist)[source]¶
-
esm_parser.esm_parser.
list_all_keys_starting_with_choose
(mapping, model_name, ignore_list, isblacklist)[source]¶ Given a mapping (e.g. a dict-type object), list all keys that start with "choose_" on any level of the nested dictionary.
- Parameters
mapping (dict) – The dictionary to search through for keys starting with "choose_"
model_name (str) –
ignore_list (list) –
- Returns
all_chooses – A list of tuples for …. A dictionary containing all key, value pairs starting with "choose_".
- Return type
list
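Listing keys "on any level of the nested dictionary" amounts to a recursive walk. The following is a simplified sketch that returns bare key names. It leaves out the model_name, ignore_list and isblacklist arguments of the documented function, and the example configuration is invented.

```python
def keys_starting_with_choose(mapping):
    """Collect every key beginning with "choose_" at any nesting depth."""
    found = []
    for key, value in mapping.items():
        if isinstance(key, str) and key.startswith("choose_"):
            found.append(key)
        if isinstance(value, dict):
            # Descend into nested dictionaries, including choose_ blocks.
            found.extend(keys_starting_with_choose(value))
    return found


config = {
    "echam": {
        "choose_resolution": {"T63": {"choose_levels": {}}},
        "scenario": "PI",
    },
    "choose_partition": {},
}
chooses = keys_starting_with_choose(config)
```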
-
esm_parser.esm_parser.
list_to_multikey
(tree, rhs, config_to_search, ignore_list, isblacklist)[source]¶ A recursive_run_function conforming func which puts any list based key to a multikey elsewhere. Sorry, that sounds confusing even to me, and I wrote the function.
- Parameters
tree (list) –
rhs (str) –
config_to_search (dict) –
Notes
Internal variable definitions in this function, based upon the example prefix_[[streams-->STREAM]]_postfix:
ok_part: prefix_
actual_list: streams-->STREAM
key_in_list: streams
value_in_list: STREAM
entries_of_key: list of actual chapter streams, e.g. [accw, echam6, e6hrsp, ...]
-
esm_parser.esm_parser.
mark_dates
(tree, rhs, config)[source]¶ Adds the DATE_MARKER to any entry whose key ends with "date"
-
esm_parser.esm_parser.
marked_date_to_date_object
(tree, rhs, config)[source]¶ Transforms a marked date string into a Date object
-
esm_parser.esm_parser.
merge_dicts
(*dict_args)[source]¶ Given any number of dicts, shallow copy and merge into a new dict; precedence goes to key/value pairs in later dicts.
Note that this function only merges the first level. For deeper merging, use priority_merge_dicts.
- Parameters
*dict_args – Any number of dictionaries to merge together
- Returns
A merged dictionary (shallow)
- Return type
dict
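The difference between this shallow merge and a deep one is easiest to see on nested values. The sketch below is an illustration of the described behaviour, not the package source; `shallow_merge` and the sample dictionaries are hypothetical.

```python
def shallow_merge(*dict_args):
    """Merge dicts left to right into a new dict; later dicts win."""
    merged = {}
    for dictionary in dict_args:
        merged.update(dictionary)
    return merged


defaults = {"model": "echam", "options": {"a": 1}}
overrides = {"options": {"b": 2}}
merged = shallow_merge(defaults, overrides)
```

Because only the first level is merged, the nested "options" dictionary from `overrides` replaces the one from `defaults` wholesale rather than being combined with it. The inputs themselves are left unchanged.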
-
esm_parser.esm_parser.
new_deep_update
(receiving_dict, dict_to_be_included, winner='receiving', blackdict={})[source]¶
-
esm_parser.esm_parser.
new_dict_merge
(dct, merge_dct, winner='to_be_included')[source]¶ Recursive dict merge. Inspired by dict.update(); instead of updating only top-level keys, dict_merge recurses down into dicts nested to an arbitrary depth, updating keys. The merge_dct is merged into dct.
- Parameters
dct – dict onto which the merge is executed
merge_dct – dict merged into dct
winner – should be either receiving or to_be_included (the default, per the signature)
- Returns
None
-
esm_parser.esm_parser.
pprint_config
(config)[source]¶ Prints the dictionary given to the stdout in a nicely formatted YAML style.
- Parameters
config (dict) – The configuration to print
- Returns
- Return type
None
-
esm_parser.esm_parser.
priority_merge_dicts
(first_config, second_config, priority='first')[source]¶ Given two dictionaries, merge them together preserving either first or last entries.
- Parameters
first_config (dict) –
second_config (dict) –
priority (str) – One of “first” or “second”. Specifies which dictionary should be given priority when merging.
- Returns
merged – A dictionary containing all keys, with duplicate entries reverting to the dictionary given in “priority”. The merge occurs across all levels.
- Return type
dict
-
esm_parser.esm_parser.
recursive_get
(config_to_search, config_elements)[source]¶ Recursively gets entries in a nested dictionary in the form outer_key.middle_key.inner_key = value
Given a list of config elements in the form above (e.g. the result of "outer_key.middle_key.inner_key".split(".")), the value “value” of the innermost nest is returned.
- Parameters
config_to_search (dict) – The dictionary to search through
config_elements (list) – Each part of the next level of the dictionary to search, as a list.
- Returns
The value associated with the nested dictionary specified by config_elements.
Note
This is actually just a wrapper around the function actually_recursive_get, which is needed to pop off standalone model configurations.
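The dotted-path lookup can be sketched in a few lines. This is a bare illustration of the idea and omits whatever the real function does for standalone model configurations. The name `get_nested` is an assumption.

```python
def get_nested(config_to_search, config_elements):
    """Follow config_elements one level at a time and return the final value."""
    if not config_elements:
        return config_to_search
    first, rest = config_elements[0], config_elements[1:]
    return get_nested(config_to_search[first], rest)


config = {"outer_key": {"middle_key": {"inner_key": "value"}}}
result = get_nested(config, "outer_key.middle_key.inner_key".split("."))
```

A missing level raises `KeyError` in this sketch. How the real function reports a missing key is not stated in this reference.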
-
esm_parser.esm_parser.
recursive_run_function
(tree, right, level, func, *args, **kwargs)[source]¶ Recursively runs func on all nested dicts.
Tree is a list starting at the top of the config dictionary, where it will be labeled “top”
- Parameters
tree (list) – Where in the dictionary you are
right – The value of the last key in tree
level (str, one of "mappings", "atomic", "always") – When to perform func
func (callable) – A function to perform on all levels where the type of right is in level. See the Notes for how this function’s call signature should look.
*args – Passed to func
**kwargs – Passed to func
- Returns
- Return type
right
Note
The func argument must be a callable (i.e. a function) and must have a call signature of the following form:
def func(tree, right, *args, **kwargs)
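To make the required call signature concrete, here is a much reduced walker that applies a conforming function to every non-dict value. It is a sketch only: it has no `level` argument, it treats lists as plain values, and the names `walk` and `shout` are invented for the example.

```python
def walk(tree, right, func, *args, **kwargs):
    """Apply func to every non-dict value, rebuilding the nested structure."""
    if isinstance(right, dict):
        return {
            key: walk(tree + [key], value, func, *args, **kwargs)
            for key, value in right.items()
        }
    return func(tree, right, *args, **kwargs)


def shout(tree, right, suffix="!"):
    """A conforming func: receives the path so far and the value found there."""
    return f"{right}{suffix}" if isinstance(right, str) else right


config = {"echam": {"scenario": "PI", "levels": 47}}
result = walk(["top"], config, shout)
```

When `shout` is called for the "scenario" entry, `tree` is `["top", "echam", "scenario"]` and `right` is `"PI"`.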
-
esm_parser.esm_parser.
remove_entries_from_chapter_in_config
(model_config, valid_model_names, setup_config, valid_setup_names)[source]¶
-
esm_parser.esm_parser.
remove_entry_from_chapter
(remove_chapter, remove_entries, model_to_remove_from, model_with_remove_statement, model_config, setup_config)[source]¶ Deletes the entries that the user specifies with the remove_<chapter> command from the chapter, which can be either a list or a dictionary. After the removals, the remove_<chapter> command is cleaned up from the config.
- Parameters
remove_chapter (str) – A string specifying the path inside the config to reach the chapter where the entries to be removed are. The string is composed of remove_ followed by the path, where each nested chapter is separated by a ".".
remove_entries (list) – The list of entries to be removed from the chapter.
model_to_remove_from (str) – Indicates the main chapter inside config where removes need to take place (i.e. computer, general, <model>, …).
model_with_remove_statement (str) – Indicates the main chapter where the remove command is defined.
model_config (dict) – Component-specific general configuration.
setup_config (dict) – Setup-specific general configuration.
-
esm_parser.esm_parser.
resolve_basic_choose
(config, config_to_replace_in, choose_key, blackdict={})[source]¶
-
esm_parser.esm_parser.
resolve_choose_with_var
(var, config, user_config={}, model_config={}, setup_config={})[source]¶ Searches for a choose_ block inside a model configuration config, in which var is defined, and then resolves ONLY the var (the other variables in the choose_ remain untouched). Needed, for example, for being able to use include_models from a choose_ before the general choose-resolve takes place (i.e. include xios component from oifs.yaml using a choose_).
- Parameters
var (str) – Name of the variable to be searched inside choose_ blocks.
config (dict) – Model configuration to be changed if the var is resolved by the choose_.
user_config (dict) – User configuration, used to search for the selected case of the choose_.
model_config (dict) – Component configuration, used to search for the selected case of the choose_.
setup_config (dict) – Setup configuration, used to search for the selected case of the choose_.
-
esm_parser.esm_parser.
shell_file_to_dict
(filepath)[source]¶ Generates a ConfigSetup from an old shell script.
See also ShellscriptToUserConfig
- Parameters
filepath (str) – The file to load
- Returns
The parsed config.
- Return type
-
esm_parser.esm_parser.
unmark_dates
(tree, rhs, config)[source]¶ Removes the DATE_MARKER from any entry that contains the DATE_MARKER.
-
esm_parser.esm_parser.
user_error
(error_type, error_text, exit_code=1)[source]¶ User-friendly error using sys.exit() instead of an Exception.
- Parameters
error_type (str) – Error type used for the error heading.
error_text (str) – Text clarifying the error.
exit_code (int) – The exit code to send back to the parent process (defaults to 1)
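The pattern of reporting a problem with sys.exit() rather than a traceback can be sketched as below. The message layout and the name `friendly_error` are assumptions for illustration. The real function's output format is not specified here.

```python
import sys


def friendly_error(error_type, error_text, exit_code=1):
    """Print a short heading and explanation, then stop with exit_code."""
    # The heading format is hypothetical; only the sys.exit() call is the point.
    print(f"{error_type.upper()} ERROR: {error_text}", file=sys.stderr)
    sys.exit(exit_code)
```

`sys.exit()` raises `SystemExit`, so a caller that needs to clean up can still catch it, while an uncaught call ends the process with the given exit code and no traceback.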
esm_parser.shell_to_dict module¶
Backwards compatibility for old runscripts
esm_parser.yaml_to_dict module¶
-
exception
esm_parser.yaml_to_dict.
EsmConfigFileError
(fpath, yaml_error)[source]¶ Bases:
Exception
Exception for yaml file containing tabs or other syntax issues.
An exception used when yaml.load() throws a yaml.scanner.ScannerError. This error occurs mainly when there are tabs inside a yaml file or when the syntax is incorrect. If tabs are found, this exception returns a user-friendly message indicating where the tabs are located in the yaml file.
- Parameters
fpath (str) – Path to the yaml file
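Locating tabs so that a message can point at them needs only the standard library. The helper below is a hypothetical sketch of that step and not the exception's actual code. It reports 1-based line and column numbers.

```python
def find_tabs(text):
    """Return (line_number, column) pairs, 1-based, for each tab in text."""
    return [
        (line_number, column)
        for line_number, line in enumerate(text.splitlines(), start=1)
        for column, char in enumerate(line, start=1)
        if char == "\t"
    ]


sample = "general:\n\tmodel: echam\n  version: 6.3\n"
tab_positions = find_tabs(sample)
```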
-
esm_parser.yaml_to_dict.
check_changes_duplicates
(yamldict_all, fpath)[source]¶ Checks for duplicates and conflicting _changes and add_:
- Finds variables containing _changes (but excluding add_) and checks if they are compatible with the same _changes inside the same file. If they are not compatible, returns an error where the conflicting variable paths are specified. More than one _changes type in a file is allowed, but they need to be part of the same _choose and not be accessible simultaneously in any situation.
- Checks if there is any variable containing add_ in the main sections of a file and labels it as incompatible if the same variable is found inside a choose_ block. add_<variable>s are compatible as long as they are inside choose_ blocks, but if you want to include something as a default, please just do it inside the <variable>.
Warning
add_<variable>s are not checked for incompatibility when they are included inside choose_ blocks. Merging of these add_<variable>s is done using deep_update, meaning that the merge is arbitrary (i.e. if two choose_ blocks are modifying the same variable using add_, the final value would be decided arbitrarily). It is up to the developer/user to make good use of add_s inside choose_ blocks.
- Parameters
yamldict_all (dict) – Dictionary read from the yaml file
fpath (str) – Path to the yaml file
-
esm_parser.yaml_to_dict.
check_duplicates
(src)[source]¶ Checks that there are no duplicates in a yaml file, and if there are returns an error stating which key is repeated and in which file the duplication occurs.
- Parameters
src (object) – Source file object
- Raises
ConstructorError – If duplicated keys are found.
-
esm_parser.yaml_to_dict.
create_env_loader
(tag='!ENV', loader=<class 'yaml.loader.SafeLoader'>)[source]¶
-
esm_parser.yaml_to_dict.
find_last_choose
(var_path)[source]¶ Locates the last choose_ on a string containing the path to a variable separated by “,”, and returns the path to the choose_ (also separated by “,”) and the case that follows the choose_.
- Parameters
var_path (str) – String containing the path to the last choose_ separated by “,”.
- Returns
path2choose (str) – Path to the last choose_.
case (str) – Case after the choose.
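The behaviour described above can be illustrated with plain string handling. This sketch assumes that path components are separated by "," with no surrounding spaces and that at least one component starts with "choose_". The function name `last_choose` and the sample path are made up.

```python
def last_choose(var_path, sep=","):
    """Return (path up to and including the last choose_, case after it)."""
    parts = var_path.split(sep)
    # Index of the last component that starts with "choose_";
    # max() raises ValueError if the path contains no choose_ at all.
    index = max(i for i, part in enumerate(parts) if part.startswith("choose_"))
    path2choose = sep.join(parts[: index + 1])
    case = parts[index + 1] if index + 1 < len(parts) else None
    return path2choose, case


path2choose, case = last_choose("echam,choose_resolution,T63,choose_levels,L47,nlev")
```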
-
esm_parser.yaml_to_dict.
yaml_file_to_dict
(filepath)[source]¶ Given a yaml file, returns a corresponding dictionary.
If you do not give an extension, tries again after appending one. It raises an EsmConfigFileError exception if yaml files contain tabs.
- Parameters
filepath (str) – Where to get the YAML file from
- Returns
A dictionary representation of the yaml file.
- Return type
dict
- Raises
EsmConfigFileError – Raised when YAML file contains tabs or other syntax issues.
FileNotFoundError – Raised when the YAML file cannot be found and all extensions have been tried.