Developer Curation API
curate - Functions for helping to curate data
Functions for helping curate BSE basis set data
- basis_set_exchange.curate.add_basis(bs_file, data_dir, subdir, file_base, name, family, role, description, version, revision_description, data_source, refs=None, file_fmt=None)
Add a basis set to this library
This takes in a single file containing the basis set in some format, parses it, and creates the component, element, and table basis set files in the given data_dir (and subdir). The metadata file for the basis is created if it doesn’t exist, and the main metadata file is also updated.
- Parameters:
bs_file (str) – Path to the file with formatted basis set information
data_dir (str) – Path to the data directory to add the data to
subdir (str) – Subdirectory of the data directory to add the basis set to
file_base (str) – Base name for new files
name (str) – Name of the basis set
family (str) – Family to which this basis set belongs
role (str) – Role of the basis set (orbital, etc)
description (str) – Description of the basis set
version (str) – Version of the basis set
revision_description (str) – Description of this version of the basis set
data_source (str) – Description of where this data came from
refs (dict, list, or str) –
Mapping of references to elements. This can be a dictionary with a compressed string of elements as keys and a list of reference strings as values. For example, {'H,Li-B,Kr': ['kumar2018a']}
If a list or string is passed, then those reference(s) will be used for all elements.
Elements that exist in the file but do not have a reference are given the usual ‘noref’ extension and the references entry is empty.
file_fmt (str) – Format of the input basis data (None = autodetect)
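The three accepted forms of refs can be illustrated with a small stand-alone sketch. It does not call the library: expand_elements and refs_per_element are hypothetical helpers written for this example, and the assumption that a range such as 'Li-B' runs in order of atomic number is inferred from the example string above.

```python
# First 36 element symbols, indexed by atomic number - 1 (enough for this example).
SYMBOLS = ("H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca "
           "Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr").split()


def expand_elements(compressed):
    """Expand a string such as 'H,Li-B,Kr' into a list of atomic numbers."""
    numbers = []
    for part in compressed.split(","):
        if "-" in part:
            first, last = part.split("-")
            start, stop = SYMBOLS.index(first) + 1, SYMBOLS.index(last) + 1
            numbers.extend(range(start, stop + 1))
        else:
            numbers.append(SYMBOLS.index(part) + 1)
    return numbers


def refs_per_element(refs, elements_in_file):
    """Hypothetical helper: map each atomic number to its list of reference keys."""
    if refs is None:
        refs = {}
    if isinstance(refs, str):
        refs = [refs]
    if isinstance(refs, list):
        # A list or string applies to every element in the file.
        return {z: list(refs) for z in elements_in_file}
    # Dictionary form: elements not covered by any key keep an empty list.
    result = {z: [] for z in elements_in_file}
    for compressed, keys in refs.items():
        for z in expand_elements(compressed):
            if z in result:
                result[z] = list(keys)
    return result


# Helium is in the (imaginary) file but has no reference, so its list stays empty.
mapping = refs_per_element({"H,Li-B,Kr": ["kumar2018a"]}, [1, 2, 3, 4, 5, 36])
```

Passing the plain string 'kumar2018a' instead of the dictionary would assign that reference to all six elements.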
- basis_set_exchange.curate.add_basis_from_dict(bs_data, data_dir, subdir, file_base, name, family, role, description, version, revision_description, data_source, refs=None)
Add a basis set to this library
This takes in a basis set dictionary and creates the component, element, and table basis set files in the given data_dir (and subdir). The metadata file for the basis is created if it doesn’t exist, and the main metadata file is also updated.
- Parameters:
bs_data (dict) – Basis set dictionary
data_dir (str) – Path to the data directory to add the data to
subdir (str) – Subdirectory of the data directory to add the basis set to
file_base (str) – Base name for new files
name (str) – Name of the basis set
family (str) – Family to which this basis set belongs
role (str) – Role of the basis set (orbital, etc)
description (str) – Description of the basis set
version (str) – Version of the basis set
revision_description (str) – Description of this version of the basis set
data_source (str) – Description of where this data came from
refs (dict, list, or str) –
Mapping of references to elements. This can be a dictionary with a compressed string of elements as keys and a list of reference strings as values. For example, {'H,Li-B,Kr': ['kumar2018a']}
If a list or string is passed, then those reference(s) will be used for all elements.
Elements that exist in the basis set dictionary but do not have a reference are given the usual ‘noref’ extension and the references entry is empty.
- basis_set_exchange.curate.add_from_components(component_files, data_dir, subdir, file_base, name, family, role, description, version, revision_description)
Add a basis set to this library that is a combination of component files
This takes in a list of component basis files and creates a new basis set for the intersection of all the elements contained in those files. This creates the element and table basis set files in the given data_dir (and subdir). The metadata file for the basis is created if it doesn’t exist, and the main metadata file is also updated.
- Parameters:
component_files (list of str) – Paths to component JSON files (already in BSE format)
data_dir (str) – Path to the data directory to add the data to
subdir (str) – Subdirectory of the data directory to add the basis set to
file_base (str) – Base name for new files
name (str) – Name of the basis set
family (str) – Family to which this basis set belongs
role (str) – Role of the basis set (orbital, etc)
description (str) – Description of the basis set
version (str) – Version of the basis set
revision_description (str) – Description of this version of the basis set
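The "intersection of all the elements" can be pictured with a short sketch. It assumes that each component dictionary stores its data under an "elements" key whose keys are atomic numbers written as strings; common_elements is a hypothetical helper, not part of the library, and the component contents below are made up.

```python
def common_elements(components):
    """Return the element keys present in every component, sorted by atomic number."""
    element_sets = [set(component["elements"]) for component in components]
    shared = set.intersection(*element_sets)
    return sorted(shared, key=int)


# Made-up components; the per-element shell data is omitted for brevity.
valence = {"elements": {"1": {}, "6": {}, "8": {}}}
polarization = {"elements": {"6": {}, "8": {}, "16": {}}}

# Only carbon and oxygen appear in both components.
shared = common_elements([valence, polarization])
```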
- basis_set_exchange.curate.basis_comparison_report(bs1, bs2, uncontract_general=False)
Compares two basis set dictionaries and prints a report about their differences
- basis_set_exchange.curate.compare_basis(bs1, bs2, compare_electron_shells_meta=False, compare_ecp_pots_meta=False, compare_elements_meta=False, compare_meta=False, rel_tol=0.0)
Determine if two basis set dictionaries are the same
- Parameters:
bs1 (dict) – Full basis information
bs2 (dict) – Full basis information
compare_electron_shells_meta (bool) – Compare the metadata of electron shells
compare_ecp_pots_meta (bool) – Compare the metadata of ECP potentials
compare_elements_meta (bool) – Compare the overall element metadata
compare_meta (bool) – Compare the metadata for the basis set (name, description, etc)
rel_tol (float) – Maximum relative error that is considered equal
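A possible way to call compare_basis is sketched below. It assumes that the package's top-level get_basis function is used to load the two full basis dictionaries; basis_sets_match is a hypothetical wrapper name, and running it requires basis_set_exchange to be installed.

```python
def basis_sets_match(name_1, name_2, rel_tol=0.0):
    """Return True if two named basis sets hold the same data within rel_tol."""
    # Imported inside the function so that defining it does not require the package.
    import basis_set_exchange as bse
    from basis_set_exchange import curate

    bs1 = bse.get_basis(name_1)
    bs2 = bse.get_basis(name_2)
    # Metadata comparisons are left at their defaults (False), so only the
    # numerical basis data is compared.
    return curate.compare_basis(bs1, bs2, rel_tol=rel_tol)
```

With the default rel_tol of 0.0, no relative error is allowed, so a small nonzero tolerance may be more useful when the data has passed through a format conversion.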
- basis_set_exchange.curate.compare_basis_against_file(basis_name, src_filepath, file_type=None, version=None, uncontract_general=False, data_dir=None)
Compare a basis set in the BSE against a reference file
- basis_set_exchange.curate.compare_basis_files(file_path_1, file_path_2, file_type_1=None, file_type_2=None, uncontract_general=False)
Compare two files containing formatted basis sets
- basis_set_exchange.curate.compare_basis_sets(basis_name_1, basis_name_2, version_1=None, version_2=None, uncontract_general=False, data_dir_1=None, data_dir_2=None)
Compare two basis sets in the BSE, specified by name
- basis_set_exchange.curate.compare_ecp_pots(potential1, potential2, compare_meta=False, rel_tol=0.0)
Compare two ECP potentials for approximate equality (exponents/coefficients are within a tolerance)
If compare_meta is True, the metadata is also compared for exact equality.
- basis_set_exchange.curate.compare_electron_shells(shell1, shell2, compare_meta=False, rel_tol=0.0)
Compare two electron shells for approximate equality (exponents/coefficients are within a tolerance)
If compare_meta is True, the metadata is also compared for exact equality.
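The idea of comparing exponents and coefficients within a relative tolerance can be sketched without the library. Everything here is an assumption made for illustration: the shell layout (angular_momentum, exponents, coefficients, with numbers stored as strings), the definition of relative difference, and the helper names rel_diff and shells_close. The library's own formula may differ.

```python
def rel_diff(a, b):
    """Relative difference of two numbers, scaled by the smaller magnitude."""
    a, b = float(a), float(b)
    if a == b:
        return 0.0
    if a == 0.0 or b == 0.0:
        return float("inf")
    return abs(a - b) / min(abs(a), abs(b))


def shells_close(shell1, shell2, rel_tol=0.0):
    """Hypothetical check: same shape and all numbers within rel_tol."""
    if shell1["angular_momentum"] != shell2["angular_momentum"]:
        return False
    if len(shell1["exponents"]) != len(shell2["exponents"]):
        return False
    if len(shell1["coefficients"]) != len(shell2["coefficients"]):
        return False
    pairs = list(zip(shell1["exponents"], shell2["exponents"]))
    for row1, row2 in zip(shell1["coefficients"], shell2["coefficients"]):
        if len(row1) != len(row2):
            return False
        pairs.extend(zip(row1, row2))
    return all(rel_diff(x, y) <= rel_tol for x, y in pairs)


# Made-up numbers: the second shell differs in one exponent by about 1e-7.
shell_a = {"angular_momentum": [0], "exponents": ["1.0", "0.5"],
           "coefficients": [["0.3", "0.7"]]}
shell_b = {"angular_momentum": [0], "exponents": ["1.0000001", "0.5"],
           "coefficients": [["0.3", "0.7"]]}

exact = shells_close(shell_a, shell_b)                # rel_tol=0.0
loose = shells_close(shell_a, shell_b, rel_tol=1e-6)
```

In this sketch the exact comparison fails and the loose one succeeds, which mirrors how the default rel_tol=0.0 treats any numerical difference as a mismatch.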
- basis_set_exchange.curate.compare_elements(element1, element2, compare_electron_shells_meta=False, compare_ecp_pots_meta=False, compare_meta=False, rel_tol=0.0)
Determine if the basis information for two elements is the same
Exponents/coefficients are compared using a tolerance.
- Parameters:
element1 (dict) – Basis information for an element
element2 (dict) – Basis information for another element
compare_electron_shells_meta (bool) – Compare the metadata of electron shells
compare_ecp_pots_meta (bool) – Compare the metadata of ECP potentials
compare_meta (bool) – Compare the overall element metadata
rel_tol (float) – Maximum relative error that is considered equal
- basis_set_exchange.curate.component_file_refs(filelist)
Get a list of what elements/references exist in component JSON files
- Parameters:
filelist (list) – A list of paths to JSON files
- Returns:
Dictionary whose keys are file paths; each value is a list of tuples (compacted element string, refs tuple)
- Return type:
dict
- basis_set_exchange.curate.create_metadata_file(output_path, data_dir)
Creates a METADATA.json file from a data directory
The file is written to output_path
- basis_set_exchange.curate.diff_basis_dict(left_list, right_list)
Compute the difference between two sets of basis set dictionaries
The result is a list of dictionaries that correspond to each dictionary in left_list. Each resulting dictionary will contain only the elements/shells that exist in that entry and not in any of the dictionaries in right_list.
This only works on the shell level, and will only subtract entire shells that are identical. ECP potentials are not affected.
The return value contains deep copies of the input data
- Parameters:
left_list (list of dict) – Dictionaries to use as the base
right_list (list of dict) – Dictionaries of basis data to subtract from each dictionary of left_list
- Returns:
Each object in left_list containing data that does not appear in right_list
- Return type:
list
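The shell-level subtraction described above might look like the following sketch. It assumes the dictionaries keep shells in an "electron_shells" list under each entry of "elements"; subtract_shells is a hypothetical name, and as a simplification an element left without shells is kept with an empty list rather than removed.

```python
import copy


def subtract_shells(left_list, right_list):
    """Return deep copies of left_list without shells found in any right dictionary."""
    results = []
    for left in left_list:
        out = copy.deepcopy(left)  # the inputs are never modified
        for el_key, el_data in out["elements"].items():
            if "electron_shells" not in el_data:
                continue
            # Collect every shell the right-hand dictionaries hold for this element.
            to_remove = []
            for right in right_list:
                right_el = right.get("elements", {}).get(el_key, {})
                to_remove.extend(right_el.get("electron_shells", []))
            # Only shells that are identical in full are subtracted.
            el_data["electron_shells"] = [
                sh for sh in el_data["electron_shells"] if sh not in to_remove
            ]
        results.append(out)
    return results


# Made-up shells for hydrogen.
s_shell = {"angular_momentum": [0], "exponents": ["1.0"], "coefficients": [["1.0"]]}
p_shell = {"angular_momentum": [1], "exponents": ["0.8"], "coefficients": [["1.0"]]}

left = {"elements": {"1": {"electron_shells": [s_shell, p_shell]}}}
right = {"elements": {"1": {"electron_shells": [p_shell]}}}

diff = subtract_shells([left], [right])
```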
- basis_set_exchange.curate.diff_json_files(left_files, right_files)
Compute the difference between two sets of basis set JSON files
The output is a set of files that correspond to each file in left_files. Each resulting file will contain only the elements/shells that exist in that entry and not in any of the files in right_files.
This only works on the shell level, and will only subtract entire shells that are identical. ECP potentials are not affected.
left_files and right_files are lists of file paths. The output is written to files with the same names as those in left_files, but with .diff added to the end. If those files exist, they are overwritten.
- Parameters:
left_files (list of str) – Paths to JSON files to use as the base
right_files (list of str) – Paths to JSON files to subtract from each file of left_files
- Return type:
None
- basis_set_exchange.curate.ecp_pots_are_equal(pots1, pots2, compare_meta=False, rel_tol=0.0)
Determine if a list of ECP potentials is the same as another
The potentials are compared approximately (exponents/coefficients are within a tolerance)
If compare_meta is True, the metadata is also compared for exact equality.
- basis_set_exchange.curate.ecp_pots_are_subset(subset, superset, compare_meta=False, rel_tol=0.0)
Determine if a list of ECP potentials is a subset of another
If ‘subset’ is a subset of the ‘superset’, True is returned.
The potentials are compared approximately (exponents/coefficients are within a tolerance)
If compare_meta is True, the metadata is also compared for exact equality.
- basis_set_exchange.curate.electron_shells_are_equal(shells1, shells2, compare_meta=False, rel_tol=0.0)
Determine if a list of electron shells is the same as another
The shells are compared approximately (exponents/coefficients are within a tolerance)
If compare_meta is True, the metadata is also compared for exact equality.
- basis_set_exchange.curate.electron_shells_are_subset(subset, superset, compare_meta=False, rel_tol=0.0)
Determine if a list of electron shells is a subset of another
If ‘subset’ is a subset of the ‘superset’, True is returned.
The shells are compared approximately (exponents/coefficients are within a tolerance)
If compare_meta is True, the metadata is also compared for exact equality.
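The subset and equality checks on lists of shells can be sketched as follows. This is an illustration only: math.isclose stands in for the library's tolerance test, the shell layout is assumed, and treating two lists as equal when they have the same length and each is a subset of the other is one plausible definition rather than a confirmed description of the implementation.

```python
import math


def _values(shell):
    """Flatten exponents and coefficients into one list of floats."""
    flat = [float(x) for x in shell["exponents"]]
    for row in shell["coefficients"]:
        flat.extend(float(x) for x in row)
    return flat


def shell_matches(shell1, shell2, rel_tol=0.0):
    """True if two shells have the same shape and numerically close values."""
    if shell1["angular_momentum"] != shell2["angular_momentum"]:
        return False
    if len(shell1["exponents"]) != len(shell2["exponents"]):
        return False
    v1, v2 = _values(shell1), _values(shell2)
    return len(v1) == len(v2) and all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=0.0) for a, b in zip(v1, v2)
    )


def is_subset(subset, superset, rel_tol=0.0):
    """True if every shell in subset has a match somewhere in superset."""
    return all(any(shell_matches(s, t, rel_tol) for t in superset) for s in subset)


def are_equal(shells1, shells2, rel_tol=0.0):
    """True if both lists contain matching shells, regardless of order."""
    return (len(shells1) == len(shells2)
            and is_subset(shells1, shells2, rel_tol)
            and is_subset(shells2, shells1, rel_tol))


# Made-up shells.
s_shell = {"angular_momentum": [0], "exponents": ["1.0"], "coefficients": [["1.0"]]}
p_shell = {"angular_momentum": [1], "exponents": ["0.8"], "coefficients": [["1.0"]]}
```

Because each shell is searched for anywhere in the other list, the order of the shells does not affect the result in this sketch.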
- basis_set_exchange.curate.elements_in_files(filelist)
Get a list of what elements exist in JSON files
This works on table, element, and component data files
- Parameters:
filelist (list) – A list of paths to JSON files
- Returns:
Dictionary whose keys are file paths; each value is a compacted element string listing the elements in that file
- Return type:
dict
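A compacted element string groups consecutive elements into ranges. The sketch below shows one way such a string could be produced from atomic numbers; compact is a hypothetical helper, and writing a run of exactly two elements with a comma instead of a hyphen is an assumption of this example.

```python
# First 36 element symbols, indexed by atomic number - 1 (enough for this example).
SYMBOLS = ("H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca "
           "Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr").split()


def compact(atomic_numbers):
    """Turn atomic numbers into a string such as 'H,Li-B,Kr'."""
    runs = []
    for z in sorted(set(atomic_numbers)):
        if runs and z == runs[-1][1] + 1:
            runs[-1][1] = z          # extend the current run of consecutive elements
        else:
            runs.append([z, z])      # start a new run
    parts = []
    for first, last in runs:
        a, b = SYMBOLS[first - 1], SYMBOLS[last - 1]
        if first == last:
            parts.append(a)
        elif last == first + 1:
            parts.append(a + "," + b)
        else:
            parts.append(a + "-" + b)
    return ",".join(parts)


compacted = compact([1, 3, 4, 5, 36])
```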
- basis_set_exchange.curate.potentials_difference(p1, p2)
Computes and prints the differences between two lists of potentials
If the potentials contain a different number of primitives, or the lists are of different length, inf is returned. Otherwise, the maximum relative difference is returned.
- basis_set_exchange.curate.shells_difference(s1, s2)
Computes and prints the differences between two lists of shells
If the shells contain a different number of primitives, or the lists are of different length, inf is returned. Otherwise, the maximum relative difference is returned.
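The return rule (inf for mismatched shapes, otherwise the largest relative difference) can be sketched as below. The function max_rel_difference is hypothetical, the shell layout and the choice of scaling by the smaller magnitude are assumptions, and unlike the documented function this sketch returns a value without printing anything.

```python
def max_rel_difference(shells1, shells2):
    """Largest relative difference between two lists of shells, or inf on mismatch."""
    inf = float("inf")
    if len(shells1) != len(shells2):
        return inf
    worst = 0.0
    for sh1, sh2 in zip(shells1, shells2):
        if len(sh1["exponents"]) != len(sh2["exponents"]):
            return inf
        if len(sh1["coefficients"]) != len(sh2["coefficients"]):
            return inf
        pairs = list(zip(sh1["exponents"], sh2["exponents"]))
        for row1, row2 in zip(sh1["coefficients"], sh2["coefficients"]):
            if len(row1) != len(row2):
                return inf
            pairs.extend(zip(row1, row2))
        for x, y in pairs:
            a, b = float(x), float(y)
            if a == b:
                continue
            if a == 0.0 or b == 0.0:
                return inf
            worst = max(worst, abs(a - b) / min(abs(a), abs(b)))
    return worst


# Made-up shells: the exponents differ by ten percent of the smaller value.
one = [{"exponents": ["2.0"], "coefficients": [["1.0"]]}]
two = [{"exponents": ["2.2"], "coefficients": [["1.0"]]}]

largest = max_rel_difference(one, two)
```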