Computer Module
Abstract Base Class
- class orchestrator.computer.computer_base.Computer(**kwargs)
Bases: Recorder, ABC
Abstract base class for the computer.
- OUTPUT_KEY = None
- __init__(**kwargs)
Initialize the Recorder mixin class.
Sets up logging configuration and creates a logger instance named after the class of the object using it.
- Parameters:
args – Positional arguments passed to other supers in the MRO.
kwargs – Keyword arguments passed to other supers in the MRO.
- compute(atoms, **kwargs)
Runs the calculation for a single atomic configuration. This method is intended to be usable in a serial (non-distributed) manner, outside of a proper orchestrator workflow.
- Parameters:
atoms (Atoms) – the ASE Atoms object
- Return type:
ndarray
- Returns:
some value; depends upon sub-class
- compute_batch(list_of_atoms, **kwargs)
Runs the calculation for a batch of atomic configurations. This method is intended to be usable in a serial (non-distributed) manner, outside of a proper orchestrator workflow.
- Parameters:
list_of_atoms (list) – a list of ASE Atoms objects
args (dict) – any additional arguments to be passed to calculator method
- Returns:
a list of values equivalent to [self.compute(atoms, args) for atoms in list_of_atoms]
- Return type:
list
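The relationship between compute() and compute_batch() can be sketched with a stand-in class. This is a minimal illustration under stated assumptions, not the orchestrator implementation: ToyComputerBase, CoordinateSum, the scale keyword, and the use of plain lists of coordinate tuples in place of ASE Atoms objects and NumPy arrays are all hypothetical, chosen so the example runs without dependencies.

```python
from abc import ABC, abstractmethod


class ToyComputerBase(ABC):
    """Hypothetical stand-in mirroring the compute/compute_batch contract."""

    @abstractmethod
    def compute(self, atoms, **kwargs):
        """Return a value for a single configuration."""

    def compute_batch(self, list_of_atoms, **kwargs):
        # Batch evaluation is the per-configuration call applied in order.
        return [self.compute(atoms, **kwargs) for atoms in list_of_atoms]


class CoordinateSum(ToyComputerBase):
    """Hypothetical sub-class: sums all coordinates, optionally scaled."""

    def compute(self, atoms, scale=1.0, **kwargs):
        # `atoms` is a plain list of (x, y, z) tuples here, not an ASE Atoms object.
        return scale * sum(sum(position) for position in atoms)


configs = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], [(2.0, 0.0, 0.0)]]
computer = CoordinateSum()
batch_values = computer.compute_batch(configs, scale=2.0)
```

Keeping the batch method as a thin loop over compute() means a sub-class only has to get the single-configuration case right; an implementation with a genuinely vectorized batch path could override compute_batch() instead.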
- abstract get_run_command(**kwargs)
Return the command to run calculations within a workflow. This allows for distributed execution of compute().
This method formats the run command, while the args dictionary can be used to pass any necessary extra parameters to the specific implementations.
- Returns:
implementation dependent
- Return type:
implementation dependent
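Because the return value is implementation dependent, the following is only one possible shape for a run command: a shell string assembled from keyword arguments. The driver script name compute_driver.py, the --input flag, and the way extra keyword arguments become flags are all assumptions made for illustration and are not taken from the orchestrator code.

```python
import shlex


def get_run_command(script="compute_driver.py", input_file="input.json", **kwargs):
    """Hypothetical formatter: build a shell command that would run one calculation."""
    command = ["python", script, "--input", input_file]
    # Extra parameters are appended as --key value pairs in a stable order.
    for key, value in sorted(kwargs.items()):
        command += [f"--{key}", str(value)]
    # shlex.join quotes each token so paths with spaces survive the shell.
    return shlex.join(command)


run_command = get_run_command(input_file="my run/input.json", n_threads=4)
```

Quoting at formatting time keeps the string safe to hand to a job scheduler, which typically passes it through a shell.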
- abstract get_batched_run_command(**kwargs)
Similar to get_run_command(), this function is meant to support executing compute_batch() within a workflow.
- Returns:
implementation dependent
- Return type:
implementation dependent
- abstract run(path_type, workflow=None)
Executes the calculation across a provided workflow. Note that sub-classes may have implementations with additional arguments.
- Parameters:
path_type (str) – specifier for the workflow path, to differentiate calculation types.
workflow (Workflow) – the workflow for managing job submission. If none is supplied, the default workflow defined in this class is used. Default: None
- Returns:
a list of calculation IDs from the workflow.
- Return type:
list
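The fallback to a default workflow and the returned list of calculation IDs can be sketched as follows. ToyWorkflow, its submit_job method, the default_wf attribute, and the extra commands argument are hypothetical stand-ins; the real Workflow interface and any sub-class arguments may differ.

```python
class ToyWorkflow:
    """Hypothetical workflow that hands out sequential calculation IDs."""

    def __init__(self):
        self.jobs = []

    def submit_job(self, command, path_type):
        calc_id = len(self.jobs)
        self.jobs.append({"calc_id": calc_id, "path_type": path_type, "command": command})
        return calc_id


class ToyComputer:
    """Hypothetical computer holding a default workflow."""

    def __init__(self):
        self.default_wf = ToyWorkflow()

    def run(self, path_type, commands, workflow=None):
        # Fall back to the class default when no workflow is supplied.
        if workflow is None:
            workflow = self.default_wf
        # One calculation ID is collected per submitted job.
        return [workflow.submit_job(command, path_type) for command in commands]


computer = ToyComputer()
default_ids = computer.run("toy_calc", ["cmd_a", "cmd_b"])
custom_workflow = ToyWorkflow()
custom_ids = computer.run("toy_calc", ["cmd_c"], workflow=custom_workflow)
```

The IDs are only meaningful to the workflow that issued them, which is why later calls that consume calculation IDs should receive the same workflow.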
- save_labeled_configs(data_pointers, storage=None, dataset_name=None, dataset_handle=None, workflow=None, cleanup=True)
Extract and save computed data to storage.
Once the calculations are complete, the data they generate must be combined with the structural configurations in a consistent framework so that it can be used for training. This is done by parsing and ingesting each configuration and its attached data into a dataset handled by the Storage module.
- Parameters:
data_pointers (list (of Atoms or int or str)) – configs, calc_ids, or explicit paths associated with each config. If calc_ids or explicit paths are supplied, they should point to ASE-readable files from which to load the Atoms objects. If calc_ids are supplied, the path is extracted from the JobStatus. Calc IDs are generally preferred because they can also carry metadata with them.
storage (Storage) – specific module that handles the storage of data. Default: None
dataset_name (str) – name of the dataset in the database. If None, the class default (date stamped) is used. Default: None
dataset_handle (str) – the handle identifying where in Storage the configurations should be saved.
workflow (Workflow) – the workflow for managing job submission. If none is supplied, the default workflow defined in this class is used. Should be consistent with the workflow supplied for the run calls. Default: None
cleanup (bool) – a flag indicating whether to delete the temporary files. Default: True
- Returns:
dataset handle
- Return type:
str
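The three accepted kinds of data pointer (config, calc_id, explicit path) suggest a dispatch on type before anything is loaded. The sketch below shows one way such a dispatch could look; resolve_data_pointer, the calc_id_to_path lookup standing in for JobStatus, and the dictionary standing in for an Atoms object are all hypothetical.

```python
def resolve_data_pointer(pointer, calc_id_to_path):
    """Hypothetical helper: classify a pointer as a file path or an in-memory config."""
    if isinstance(pointer, bool):
        # bool is a subclass of int, so reject it before the calc_id branch.
        raise TypeError("bool is not a valid data pointer")
    if isinstance(pointer, int):
        # A calc_id: look up where that job wrote its output.
        return ("path", calc_id_to_path[pointer])
    if isinstance(pointer, str):
        # An explicit path to an ASE-readable file.
        return ("path", pointer)
    # Anything else is treated as an already-loaded configuration.
    return ("config", pointer)


job_paths = {7: "/scratch/job_7/out.xyz"}  # stand-in for a JobStatus lookup
toy_config = {"symbols": "Cu4"}            # stand-in for an ASE Atoms object
resolved = [
    resolve_data_pointer(pointer, job_paths)
    for pointer in (7, "manual/out.xyz", toy_config)
]
```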
- abstract write_input(run_path, input_args)
Writes any input data necessary for the calculation to the run path.
- abstract parse_for_storage(run_path, cleanup)
Process calculation output to extract data in a consistent format.
- Parameters:
run_path (str) – directory where the output resides
cleanup (bool) – a flag indicating whether to delete the temporary files. Default: True
- Returns:
depends upon implementation
- Return type:
depends upon implementation, but should always be a list
- abstract save_results(compute_results, save_dir, **kwargs)
Save calculation output to a file. Implementation dependent.
Note that this function should also store any metadata associated with the calculation.
- Parameters:
compute_results (np.ndarray or list[np.ndarray]) – the output of .compute() or .compute_batch()
save_dir (str) – folder in which to save the results
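Since the storage format is implementation dependent, the following is only a sketch of one way to write results together with their metadata. The JSON layout, the metadata and filename keyword arguments, and the use of nested lists in place of NumPy arrays are assumptions made so the example is self-contained.

```python
import json
import tempfile
from pathlib import Path


def save_results(compute_results, save_dir, metadata=None, filename="results.json"):
    """Hypothetical implementation: write results and metadata to one JSON file."""
    target_dir = Path(save_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Results and metadata are stored side by side so they cannot drift apart.
    payload = {"results": compute_results, "metadata": metadata or {}}
    out_file = target_dir / filename
    out_file.write_text(json.dumps(payload, indent=2))
    return out_file


with tempfile.TemporaryDirectory() as tmp_dir:
    written = save_results([[0.1, 0.2], [0.3]], tmp_dir, metadata={"code": "toy"})
    reloaded = json.loads(written.read_text())
```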
- cleanup(run_path=None)
Removes any temporary files that were created for job execution.
- Parameters:
run_path (str) – the parent directory containing the temp file subdir. If None, the method is not being called by a batch job, so it should delete the init_args.
- data_from_calc_ids(data_pointers, workflow=None, cleanup=True)
Return the parsed data from a list of calculation IDs.
- Parameters:
data_pointers (list) – list of calc_ids from which to extract the computed results
workflow (Workflow) – the workflow for managing job submission. If none is supplied, the default workflow defined in this class is used. Default: None
cleanup (bool) – a flag indicating whether to delete the temporary files. Default: True
- Returns:
a list of the computed values
- Return type:
list
- get_colabfit_property_definition(name=None)
A ‘property definition’ is a dictionary used by the ColabFit storage module for exactly specifying the details (data type, shape, description, etc.) of each field required for uniquely defining a given property. This function must be implemented in order to support storage of the computed results in the ColabFit module.
- Parameters:
name (str) – the name of the property. Only needs to be provided if the Computer can return multiple properties.
- Returns:
the property definition
- Return type:
dict
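As a rough illustration of what such a dictionary can look like, the sketch below follows the KIM-style layout shown in the colabfit-tools examples (property-level keys plus one entry per field). The property name toy-energy, the descriptions, and the helper function name are hypothetical, and the exact set of required keys should be checked against the ColabFit version in use.

```python
def get_toy_property_definition(name="toy-energy"):
    """Hypothetical property definition for a single scalar energy field."""
    return {
        # Property-level identification and documentation.
        "property-id": name,
        "property-name": name,
        "property-title": "Toy potential energy",
        "property-description": "Total energy of a configuration from a toy calculator",
        # One entry per field: data type, shape (extent), units flag, description.
        "energy": {
            "type": "float",
            "has-unit": True,
            "extent": [],  # empty extent denotes a scalar
            "required": True,
            "description": "Total potential energy of the configuration",
        },
    }


definition = get_toy_property_definition()
field_names = [key for key in definition if not key.startswith("property-")]
```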
- get_colabfit_property_map(name=None)
Returns a default property map that can be used to extract a ColabFit property from an ASE.Atoms object. This assumes that the values being extracted are stored in their default locations based on the specific Computer module (usually within the compute() or compute_batch() functions).
A ‘property map’ is similar to a ‘property definition’, but instead tells ColabFit how to extract the keys specified in the property definition from an ASE.Atoms object. This function must be implemented in order to support storage of the computed results in the ColabFit module.
- Parameters:
name (str) – the name of the property. Only needs to be provided if the Computer can return multiple properties.
- Returns:
the property map
- Return type:
dict
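A property map can be pictured as a lookup from each field in the property definition to the key under which the value is stored on the configuration, plus its units. The sketch below uses the field/units layout shown in the colabfit-tools examples; the property name toy-energy, the storage key toy_energy, and the extract_property helper are hypothetical, and the helper is not ColabFit's own extraction routine. A plain dictionary stands in for the info attribute of an ASE Atoms object.

```python
def get_toy_property_map(name="toy-energy"):
    """Hypothetical property map: definition field -> storage key and units."""
    return {
        name: [
            {
                "energy": {"field": "toy_energy", "units": "eV"},
            }
        ]
    }


def extract_property(info, property_map, name):
    """Illustrative only: pull mapped values out of an info-style dictionary."""
    entry = property_map[name][0]
    return {
        key: {"value": info[spec["field"]], "units": spec["units"]}
        for key, spec in entry.items()
    }


toy_info = {"toy_energy": -3.5}  # stand-in for atoms.info
extracted = extract_property(toy_info, get_toy_property_map(), "toy-energy")
```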