Trainer Module

Abstract Base Class

class orchestrator.trainer.trainer_base.Trainer(**kwargs)[source]

Bases: Recorder, ABC

Abstract base class to manage the training of different potentials

The trainer class is responsible for handling the loading/assignment of training data, as well as the actual process of training a potential.

__init__(**kwargs)[source]

set variables and initialize the recorder and default workflow

default_wf

default workflow to use within the trainer class

abstract checkpoint_trainer()[source]

checkpoint the trainer module into the checkpoint file

save necessary internal variables into a dict with key checkpoint_name and write to the (json) checkpoint file for restart capabilities

abstract restart_trainer()[source]

restart the trainer module from the checkpoint file

check if the checkpoint_file has an entry matching the checkpoint_name and set internal variables accordingly if so
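The checkpoint and restart behavior described above can be pictured with a minimal sketch. It assumes the checkpoint file is a single JSON object with one entry per module, keyed by checkpoint_name. The helper names write_checkpoint and read_checkpoint and the contents of the saved state are hypothetical, not part of the orchestrator API.

```python
import json
import os
import tempfile


def write_checkpoint(checkpoint_file, checkpoint_name, state):
    """Store `state` under `checkpoint_name`, keeping any other entries."""
    data = {}
    if os.path.exists(checkpoint_file):
        with open(checkpoint_file) as f:
            data = json.load(f)
    data[checkpoint_name] = state
    with open(checkpoint_file, "w") as f:
        json.dump(data, f, indent=4)


def read_checkpoint(checkpoint_file, checkpoint_name):
    """Return the saved state for `checkpoint_name`, or None if absent."""
    if not os.path.exists(checkpoint_file):
        return None
    with open(checkpoint_file) as f:
        return json.load(f).get(checkpoint_name)


# Usage with a temporary file; the keys in the state dict are made up.
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
write_checkpoint(ckpt_path, "trainer", {"outstanding_calc_ids": [3, 7]})
restored = read_checkpoint(ckpt_path, "trainer")
missing = read_checkpoint(ckpt_path, "oracle")
```

Returning None for a missing entry mirrors the documented "set internal variables accordingly if so" wording: a restart with no matching entry simply leaves the trainer in its freshly initialized state.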

abstract train(path_type, potential, storage, dataset_list, workflow=None, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, write_training_script=True, upload_to_kimkit=True)[source]

Train the potential based on the specific trainer details

This is a main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (Potential) – potential to be trained. The actual model itself is set as an attribute of the Potential object

  • storage (Storage) – an instance of the storage class

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

  • workflow (Workflow) – the workflow for managing path definition and job submission; if none is supplied, the default workflow defined in this class is used

    Default: None

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (either boolean or np.ndarray) – True to read per-atom weights from the dataset, or a numpy array of weights

    Default: False

  • write_training_script (bool) – True to write a training script in the working trainer directory

    Default: True

  • upload_to_kimkit (bool) – True to upload to kimkit repository

Returns:

trained model, loss object

Return type:

implementation dependent
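The role of eweight, fweight, and vweight can be illustrated with a small sketch. It assumes the loss is a weighted sum of mean squared errors over energy, forces, and stress; the actual functional form depends on the concrete trainer and its backend. The function name weighted_mse_loss and the dict layout of the inputs are hypothetical.

```python
def weighted_mse_loss(pred, ref, eweight=1.0, fweight=1.0, vweight=1.0):
    """Weighted sum of mean squared errors over energy, forces, and stress.

    `pred` and `ref` are dicts with keys "energy", "forces", and "stress",
    each holding a flat list of floats (an assumed layout for illustration).
    """
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

    weights = {"energy": eweight, "forces": fweight, "stress": vweight}
    return sum(w * mse(pred[k], ref[k]) for k, w in weights.items())


pred = {"energy": [1.0], "forces": [0.0, 2.0], "stress": [0.0]}
ref = {"energy": [0.0], "forces": [0.0, 0.0], "stress": [0.0]}

loss_default = weighted_mse_loss(pred, ref)
# Setting a weight to zero removes that property from the fit.
loss_forces_only = weighted_mse_loss(pred, ref, eweight=0.0, vweight=0.0)
```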

abstract submit_train(path_type, potential, storage, dataset_list, workflow, job_details, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Asynchronously train the potential based on the trainer details

This is a main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function. While train() works synchronously, this method submits training to a job scheduler.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (Potential) – potential to be trained. The actual model itself is set as an attribute of the Potential object

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission

  • job_details (dict) – job parameters such as walltime or number of nodes

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (either boolean or np.ndarray) – True to read from dataset, or numpy array

    Default: False

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

calculation ID of the submitted job

Return type:

int

abstract load_from_submitted_training(calc_id, potential, workflow)[source]

reload a potential that was trained via a submitted job
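The two-step pattern of submit_train() followed by load_from_submitted_training() can be sketched with stand-in classes. The signatures here are deliberately reduced (the documented methods also take path_type, storage, dataset_list, and workflow), and SketchTrainer, InMemoryTrainer, and the dict used in place of a Potential are hypothetical illustrations rather than orchestrator classes.

```python
from abc import ABC, abstractmethod


class SketchTrainer(ABC):
    """Stand-in with the same two-step shape as the documented interface."""

    @abstractmethod
    def submit_train(self, potential, job_details):
        """Queue a training job and return its calculation ID."""

    @abstractmethod
    def load_from_submitted_training(self, calc_id, potential):
        """Attach the model produced by job `calc_id` to `potential`."""


class InMemoryTrainer(SketchTrainer):
    """Toy implementation that 'trains' immediately and stores the result."""

    def __init__(self):
        self._results = {}

    def submit_train(self, potential, job_details):
        calc_id = len(self._results)
        self._results[calc_id] = {"trained": True, **job_details}
        return calc_id

    def load_from_submitted_training(self, calc_id, potential):
        potential["model"] = self._results[calc_id]


trainer = InMemoryTrainer()
potential = {"model": None}  # a dict stands in for a Potential object
calc_id = trainer.submit_train(potential, {"walltime": 60})
trainer.load_from_submitted_training(calc_id, potential)
```

The integer returned by the submit step is the only handle the caller needs to keep, which is why it is a natural candidate for the state saved by checkpoint_trainer().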

Concrete Implementations

KLIFF base class

class orchestrator.trainer.kliff.kliff.KLIFFTrainer(training_split=0.8, loss_method='mse', max_evals=1000, optimization_method='L-BFGS-B', scratch=None, **kwargs)[source]

Bases: Trainer

Train and deploy a potential using KLIFF

The trainer class is responsible for handling the loading/assignment of training data, as well as the actual process of training a potential. One should use specific subclasses of KLIFFTrainer instead of this base class.

Parameters:
  • training_split (float) – Fraction of the dataset to be allocated for training (e.g., 0.8 for 80%). Defaults to 0.8.

  • loss_method (str) – The type of loss function to be used during training (e.g., “mse” for mean squared error).

  • max_evals (int) – Maximum number of evaluations (e.g., iterations or function calls) for the optimizer. Defaults to 1000.

  • optimization_method (str) – The optimization algorithm to employ for training the potential (e.g., “L-BFGS-B”, “Adam”)

  • scratch (str, optional) – Path to a directory for storing temporary or scratch files during training. If None, it defaults to ‘./scratch_kliff’ within the execution directory.

  • kwargs (dict) – Arbitrary keyword arguments that may be used by specific subclasses or for advanced configuration options.
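As a rough illustration of training_split, the sketch below divides a list of configurations into training and validation subsets by fraction. It assumes a shuffled split with rounding to the nearest integer; whether KLIFFTrainer shuffles, and how it rounds, is not stated in this reference. The names split_dataset and seed are hypothetical.

```python
import random


def split_dataset(configs, training_split=0.8, seed=0):
    """Shuffle a copy of `configs` and split it by `training_split`."""
    if not 0.0 < training_split <= 1.0:
        raise ValueError("training_split must be in (0, 1]")
    shuffled = list(configs)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(training_split * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# 80% of ten configurations go to training, the rest to validation.
train_set, val_set = split_dataset(range(10), training_split=0.8)

# A split of 1.0 uses every configuration for training.
all_train, no_val = split_dataset(range(10), training_split=1.0)
```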

__init__(training_split=0.8, loss_method='mse', max_evals=1000, optimization_method='L-BFGS-B', scratch=None, **kwargs)[source]

set variables and initialize the recorder and default workflow

Parameters:
  • training_split (float) – Fraction of the dataset to be allocated for training (e.g., 0.8 for 80%). Defaults to 0.8.

  • loss_method (str) – The type of loss function to be used during training (e.g., “mse” for mean squared error).

  • max_evals (int) – Maximum number of evaluations (e.g., iterations or function calls) for the optimizer. Defaults to 1000.

  • optimization_method (str) – The optimization algorithm to employ for training the potential (e.g., “L-BFGS-B”, “Adam”)

  • scratch (str, optional) – Path to a directory for storing temporary or scratch files during training. If None, it defaults to ‘./scratch_kliff’ within the execution directory.

  • kwargs (dict) – Arbitrary keyword arguments that may be used by specific subclasses or for advanced configuration options.

train(path_type, potential, storage, dataset_list, workflow=None, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Train the potential based on the specific trainer details

KLIFFTrainer should not be used for training directly; it is a parent class for specific implementations.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (Potential) – potential to be trained. The actual model itself is set as an attribute of the Potential object

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (bool) – True to read per-atom weights from the dataset

    Default: False

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

trained model, loss object

Return type:

implementation dependent

submit_train(path_type, potential, storage_args, storage, dataset_list, workflow=None, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Asynchronously train the potential based on the trainer details

This is a main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function. While train() works synchronously, this method submits training to a job scheduler.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (Potential) – potential to be trained. The actual model itself is set as an attribute of the Potential object

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (bool) – Per-atom weights for the loss function. If a boolean value is provided, the weights are assumed to be present in the provided dataset.

    Default: False

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

calculation ID of the submitted job

Return type:

int

DNN sub-class

class orchestrator.trainer.kliff.kliff_dunn_trainer.DUNNTrainer(use_gpu=False, loss_method='mse', epochs=100, batch_size=32, learning_rate=0.001, training_split=0.8, optimizer='Adam', log_per_atom_pred=True, **kwargs)[source]

Bases: KLIFFTrainer

Train and deploy a fully connected neural network based on Behler-Parrinello symmetry functions. This trainer uses the KIM DUNN driver for deploying the potential, which has a higher-performance C++ backend and built-in support for UQ.

The trainer class is responsible for handling the loading/assignment of training data, as well as the actual process of training a potential. This trainer is intended to be used with KLIFF NeuralNetwork models, such as KliffBPPotential.

Parameters:
  • use_gpu (bool) – Whether to use a GPU for training

    Default: False

  • loss_method (str) – Loss function to use

    Default: ‘mse’

  • epochs (int) – Number of epochs to train the model

    Default: 100

  • batch_size (int) – Number of configurations per mini-batch

    Default: 32

  • learning_rate (float) – Learning rate used by the optimizer

    Default: 0.001

  • training_split (float) – Fraction of data to use for training (rest for validation)

    Default: 0.8

  • optimizer (str) – Optimizer to use for training

    Default: ‘Adam’

  • log_per_atom_pred (bool) – Whether to log per-atom predictions during training for both in-memory and submitted jobs

    Default: True

  • kwargs (dict) – Additional keyword arguments passed to the superclass.

__init__(use_gpu=False, loss_method='mse', epochs=100, batch_size=32, learning_rate=0.001, training_split=0.8, optimizer='Adam', log_per_atom_pred=True, **kwargs)[source]

Train and deploy a DNN potential using KLIFF

Parameters:
  • use_gpu (bool) – Whether to use a GPU for training

    Default: False

  • loss_method (str) – Loss function to use

    Default: ‘mse’

  • epochs (int) – Number of epochs to train the model

    Default: 100

  • batch_size (int) – Number of configurations per mini-batch

    Default: 32

  • learning_rate (float) – Learning rate used by the optimizer

    Default: 0.001

  • training_split (float) – Fraction of data to use for training (rest for validation)

    Default: 0.8

  • optimizer (str) – Optimizer to use for training

    Default: ‘Adam’

  • log_per_atom_pred (bool) – Whether to log per-atom predictions during training for both in-memory and submitted jobs

    Default: True

  • kwargs (dict) – Additional keyword arguments passed to the superclass.

checkpoint_trainer()[source]

checkpoint the trainer module into the checkpoint file

save necessary internal variables into a dict with key checkpoint_name and write to the (json) checkpoint file for restart capabilities

restart_trainer()[source]

restart the trainer module from the checkpoint file

check if the checkpoint_file has an entry matching the checkpoint_name and set internal variables accordingly if so

train(path_type, potential, storage, dataset_list, workflow=None, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Train a DNN potential using KLIFF

This is the main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (KliffBPPotential) – KliffBPPotential class object containing model to be trained as an attribute

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (bool) – Per-atom weights for the loss function. If a boolean value is provided, the weights are assumed to be present in the provided dataset.

    Default: False

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

trained model, loss object

Return type:

NeuralNetwork, Loss (KLIFF)

submit_train(path_type, potential, storage, dataset_list, workflow, job_details, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Asynchronously train the potential based on the trainer details

This is a main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function. While train() works synchronously, this method submits training to a job scheduler.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (Potential) – potential to be trained. The actual model itself is set as an attribute of the Potential object

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission; if none is supplied, the default workflow defined in this class is used

  • job_details (dict) – job parameters such as walltime or number of nodes

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (bool) – Per-atom weights for the loss function. If a boolean value is provided, the weights are assumed to be present in the provided dataset.

    Default: False

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

calculation ID of the submitted job

Return type:

int

load_from_submitted_training(calc_id, potential, workflow)[source]

reload a potential that was trained via a submitted job

Parameters:
  • calc_id (int) – calculation ID of the submitted training job

  • potential (KliffBPPotential) – KliffBPPotential class object that will be updated with the model saved to disk after the training job.

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

Parametric model sub-class

class orchestrator.trainer.kliff.kliff_parametric_trainer.ParametricModelTrainer(model_name, params_to_update, training_split=1.0, loss_method='mse', max_evals=1000, optimization_method='L-BFGS-B', scratch=None, **kwargs)[source]

Bases: KLIFFTrainer

Train and deploy a general parametric model potential using KLIFF

The trainer class is responsible for handling the loading/assignment of training data, as well as the actual process of training a potential. This trainer is intended to be used with KLIFF parametric models.

Parameters:
  • model_name (str) – name of the model to train

  • params_to_update (list) – List of model parameters to update during training

  • training_split (float) – Fraction of data to use for training (rest for validation)

    Default: 1.0

  • loss_method (str) – Loss function to use

    Default: ‘mse’

  • max_evals (int) – Maximum number of optimization evaluations

    Default: 1000

  • optimization_method (str) – Optimization algorithm to use

    Default: ‘L-BFGS-B’

  • scratch (str or None) – Path to scratch directory for temporary files

    Default: None

__init__(model_name, params_to_update, training_split=1.0, loss_method='mse', max_evals=1000, optimization_method='L-BFGS-B', scratch=None, **kwargs)[source]

Train and deploy a general parametric model potential using KLIFF

The trainer class is responsible for handling the loading/assignment of training data, as well as the actual process of training a potential. This trainer is intended to be used with KLIFF parametric models.

Parameters:
  • model_name (str) – name of the model to train

  • params_to_update (list) – List of model parameters to update during training

  • training_split (float) – Fraction of data to use for training (rest for validation)

    Default: 1.0

  • loss_method (str) – Loss function to use

    Default: ‘mse’

  • max_evals (int) – Maximum number of optimization evaluations

    Default: 1000

  • optimization_method (str) – Optimization algorithm to use

    Default: ‘L-BFGS-B’

  • scratch (str or None) – Path to scratch directory for temporary files

    Default: None

checkpoint_trainer()[source]

checkpoint the trainer module into the checkpoint file

save necessary internal variables into a dict with key checkpoint_name and write to the (json) checkpoint file for restart capabilities

restart_trainer()[source]

restart the trainer module from the checkpoint file

check if the checkpoint_file has an entry matching the checkpoint_name and set internal variables accordingly if so

train(path_type, potential, storage, dataset_list, workflow=None, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Train a parametric potential using KLIFF

This is the main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (KIMPotential) – KIMPotential class object containing model to be trained as an attribute

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (bool) – Per-atom weights for the loss function. If a boolean value is provided, the weights are assumed to be present in the provided dataset.

    Default: False

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

trained model, loss object

Return type:

KIMModel, None

submit_train(path_type, potential, storage_args, workflow, job_details, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Asynchronously train the potential based on the trainer details

Return type:

int

load_from_submitted_training(calc_id, potential, workflow)[source]

reload a potential that was trained via a submitted job

FitSnap class

class orchestrator.trainer.fitsnap.FitSnapTrainer(**kwargs)[source]

Bases: Trainer

Train and deploy a potential using FitSnap

The trainer class is responsible for handling the loading/assignment of training data, as well as the actual process of training a potential. This trainer is intended to be used with SNAP models trained with ASE training data.

__init__(**kwargs)[source]

Train and deploy a potential using FitSnap

checkpoint_trainer()[source]

checkpoint the trainer module into the checkpoint file

save necessary internal variables into a dict with key checkpoint_name and write to the (json) checkpoint file for restart capabilities

restart_trainer()[source]

restart the trainer module from the checkpoint file

check if the checkpoint_file has an entry matching the checkpoint_name and set internal variables accordingly if so

train(path_type, potential, storage, dataset_list, workflow=None, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, write_training_script=True, upload_to_kimkit=True)[source]

Train a Snap potential using FitSnap

This is the main method of the trainer class, and uses the parameters supplied in the FitSnap settings file to perform the potential training.

Parameters:
  • path_type (str) – if write_training_script=True, specifier for the workflow path, to differentiate training runs; else, the raw path to save files

  • potential (fitsnap instance) – FitSnapPotential class object containing fitsnap instance

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (either boolean or np.ndarray) – True to read from dataset, or numpy array, or a str for a numpy.loadtxt compatible filepath

    Default: False

  • write_training_script (bool) – True to write a training script in the workflow created directory

    Default: True. Leave this enabled unless train() is being called from within a submit_train() workflow.

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

trained model, error metrics

Return type:

fitsnap instance, fitsnap error attribute

submit_train(path_type, potential, storage, dataset_list, workflow, job_details, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Asynchronously train the potential based on the trainer details

This is a main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function. While train() works synchronously, this method submits training to a job scheduler.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (Potential) – potential to be trained. The actual model itself is set as an attribute of the Potential object

  • storage (Storage) – an instance of the storage class

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

  • job_details (dict) – job parameters such as walltime or # of nodes

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (either boolean or np.ndarray) – True to read from dataset, or numpy array, or a str for a numpy.loadtxt compatible filepath

    Default: False

  • upload_to_kimkit (bool) – True to upload to kimkit repository

  • dataset_list (list) – the list of dataset_handles (e.g. collabfit-IDs) within the storage object to use as the dataset

Returns:

calculation ID of the submitted job

Return type:

int

load_from_submitted_training(calc_id, potential, workflow)[source]

reload a potential that was trained via a submitted job

Parameters:
  • calc_id (int) – calculation ID of the submitted training job

  • potential (FitSnapPotential) – FitSnapPotential class object that will be updated with the model saved to disk after the training job.

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

ChIMES class

class orchestrator.trainer.chimes.ChIMESTrainer(exe_chimes_fit_1, exe_chimes_fit_2, fit_directory='_ChIMES_FIT', **kwargs)[source]

Bases: Trainer

Train and deploy a potential using ChIMES

The trainer class is responsible for handling the loading/assignment of training data, as well as the actual process of training a potential. This trainer is intended to be used with ChIMES model trained with ASE training data. WARNING: the fit directory location will be overwritten during any call to the train functions.

__init__(exe_chimes_fit_1, exe_chimes_fit_2, fit_directory='_ChIMES_FIT', **kwargs)[source]

Initialize the ChIMESTrainer.

Parameters:
  • exe_chimes_fit_1 (str) – Path to the first ChIMES fitting executable - /build/chimes_lsq (executable)

  • exe_chimes_fit_2 (str) – Path to the second ChIMES fitting executable - src/chimes_lsq.py (python script)

  • fit_directory (Optional[str]) – Directory for fitting outputs. WARNING: this directory location will be overwritten during any call to a training function

  • kwargs (dict) – Additional keyword arguments for the base Trainer.

checkpoint_trainer()[source]

checkpoint the trainer module into the checkpoint file

save necessary internal variables into a dict with key checkpoint_name and write to the (json) checkpoint file for restart capabilities

Return type:

None

restart_trainer()[source]

restart the trainer module from the checkpoint file

check if the checkpoint_file has an entry matching the checkpoint_name and set internal variables accordingly if so

Return type:

None

train(path_type, potential, storage, dataset_list, workflow=None, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, write_training_script=True, upload_to_kimkit=True)[source]

Train a ChIMES potential

This is the main method of the trainer class, and uses the parameters supplied in the ChIMES settings file to perform the potential training in the fit_directory location specified at instantiation.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs; currently unused in this function

  • potential (ChIMESPotential instance) – class object containing ChIMES instance

  • storage (Storage) – Storage instance to pull data from

  • dataset_list (list[str]) – List of dataset handles to train with

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

    Default: None

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (boolean) – True to read from dataset

    Default: False

  • write_training_script (bool) – True to write a training script in the working trainer directory

    Default: True

  • upload_to_kimkit (bool) – Upload to kimkit after training.

    Default: True

Returns:

Tuple of (trained ChIMES model, error metric).

Return type:

tuple[ChIMES, float]

submit_train(path_type, potential, storage, dataset_list, workflow, job_details, eweight=1.0, fweight=1.0, vweight=1.0, per_atom_weights=False, upload_to_kimkit=True)[source]

Asynchronously train the potential based on the trainer details

This is a main method of the trainer class, and uses the parameters supplied at instantiation to perform the potential training by minimizing a loss function. While train() works synchronously, this method submits training to a job scheduler. Unless fit_directory is set as an absolute path, it will be a local version in the working directory generated by the Workflow.

Parameters:
  • path_type (str) – specifier for the workflow path, to differentiate training runs

  • potential (Potential) – potential to be trained. The actual model itself is set as an attribute of the Potential object

  • storage (Storage) – Storage instance to pull data from

  • dataset_list (list[str]) – List of dataset handles to train with

  • workflow (Workflow) – the workflow for managing path definition and job submission, if none are supplied, will use the default workflow defined in this class

  • job_details (dict) – job parameters such as walltime or # of nodes

  • eweight (float) – weight of energy data in the loss function

  • fweight (float) – weight of the force data in the loss function

  • vweight (float) – weight of the stress data in the loss function

  • per_atom_weights (boolean) – True to read from dataset

    Default: False

  • upload_to_kimkit (bool) – Upload to kimkit after training

    Default: True

Returns:

calculation ID of the submitted job

Return type:

int
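The note in submit_train() about fit_directory (absolute paths are used as given, anything else ends up inside the working directory generated by the Workflow) can be sketched as follows. The function resolve_fit_directory and the example paths are hypothetical; the sketch only illustrates the path logic, not how ChIMESTrainer implements it.

```python
import os


def resolve_fit_directory(fit_directory, workflow_dir):
    """Return the directory a submitted fit would write to."""
    if os.path.isabs(fit_directory):
        # Absolute paths are used as given.
        return fit_directory
    # Relative paths are placed inside the workflow's working directory.
    return os.path.join(workflow_dir, fit_directory)


relative = resolve_fit_directory("_ChIMES_FIT", "/scratch/run_0")
absolute = resolve_fit_directory("/data/fits/chimes", "/scratch/run_0")
```

Because the fit directory is overwritten on every training call, an absolute fit_directory shared between several submitted jobs would have them write to the same location, whereas the relative default keeps each job's output inside its own working directory.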

load_from_submitted_training(calc_id, potential, workflow)[source]

reload a potential that was trained via a submitted job

Parameters:
  • calc_id (int) – calculation ID of the submitted training job

  • potential (ChIMESPotential) – ChIMESPotential class object that will be updated with the model saved to disk after the training job.

  • workflow (Workflow) – the workflow for managing path definition and job submission

Return type:

None

Trainer Builder

orchestrator.trainer.factory.trainer_factory = <orchestrator.utils.module_factory.ModuleFactory object>

default factory for trainers, includes DNN (kliff) and KLIFF (parametric model)

class orchestrator.trainer.factory.TrainerBuilder(factory=<orchestrator.utils.module_factory.ModuleFactory object>)[source]

Bases: ModuleBuilder

Constructor for trainers added in the factory

set the factory to be used for the builder. The default is to use the trainer_factory generated at the end of this module. A user defined ModuleFactory can optionally be supplied instead.

Parameters:

factory (ModuleFactory) – a trainer factory

Default: trainer_factory

__init__(factory=<orchestrator.utils.module_factory.ModuleFactory object>)[source]

constructor for the TrainerBuilder, sets the factory to build from

Parameters:

factory (ModuleFactory) – a trainer factory

Default: trainer_factory

build(trainer_type, trainer_args=None)[source]

Return an instance of the specified trainer

The build method takes the specifier and input arguments to construct a concrete trainer instance.

Parameters:
  • trainer_type (str) – token of a trainer which has been added to the factory

  • trainer_args (dict) – arguments to control trainer behavior

Returns:

instantiated concrete Trainer

Return type:

Trainer
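The relationship between the factory and the builder can be illustrated with a self-contained sketch of the general pattern: a registry maps a string token to a class, and build() instantiates it from a dict of arguments. SketchFactory, SketchBuilder, ToyTrainer, and the method names register and create are hypothetical stand-ins; the real ModuleFactory and ModuleBuilder interfaces are not shown in this reference and may differ.

```python
class SketchFactory:
    """Registry mapping a string token to a class (hypothetical stand-in)."""

    def __init__(self):
        self._classes = {}

    def register(self, token, cls):
        self._classes[token] = cls

    def create(self, token, args):
        if token not in self._classes:
            raise KeyError(f"unknown module token: {token}")
        return self._classes[token](**args)


class SketchBuilder:
    """Thin wrapper that builds trainers from a factory."""

    def __init__(self, factory):
        self.factory = factory

    def build(self, trainer_type, trainer_args=None):
        # None means "use the class defaults", so pass no keyword arguments.
        return self.factory.create(trainer_type, trainer_args or {})


class ToyTrainer:
    """Placeholder trainer with two of the hyperparameters named above."""

    def __init__(self, epochs=100, batch_size=32):
        self.epochs = epochs
        self.batch_size = batch_size


factory = SketchFactory()
factory.register("toy", ToyTrainer)
builder = SketchBuilder(factory)

default_trainer = builder.build("toy")
custom_trainer = builder.build("toy", {"epochs": 5})
```

Passing the arguments as a dict keeps the build call uniform across trainer types, so the choice of trainer and its settings can come straight from an input file.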

orchestrator.trainer.factory.trainer_builder = <orchestrator.trainer.factory.TrainerBuilder object>

trainer builder object which can be imported for use in other modules