.. _kliff_trainer.rst:

Trainer Manifest for KLIFF
==========================

KLIFF uses YAML configuration files for training interatomic force fields with machine learning models. The configuration file consists of several key sections:

1. ``workspace``: Manages where training runs are stored and defines random seeds for reproducibility.

   Example:

   .. code-block:: yaml

      workspace:
        name: test_run
        seed: 12345
        resume: False

2. ``dataset``: Configures how the training data is loaded, specifying dataset type (ASE, file paths, etc.), shuffling, and property keys.

   Example:

   .. code-block:: yaml

      dataset:
        type: ase
        path: Si.xyz
        shuffle: True
        keys:
          energy: Energy
          forces: forces

3. ``model``: Defines the model backend (e.g., KIM or Torch) and its properties such as path, name, and input arguments.

   Example (Torch model):

   .. code-block:: yaml

      model:
        path: ./model_dnn.pt
        name: "TorchDNN"

4. ``transforms``: Modifies data or model parameters before or during training (e.g., parameter transformations or graph construction).

   Example:

   .. code-block:: yaml

      transforms:
        parameter:
          - A
          - B
          - sigma:
              transform_name: LogParameterTransform
              value: 2.0
              bounds: [[1.0, 10.0]]

5. ``training``: Controls the training loop, including loss function, optimizer, learning rate, dataset splitting, and hyperparameters like batch size and epochs.

   Example:

   .. code-block:: yaml

      training:
        loss:
          function: MSE
          weights:
            energy: 1.0
            forces: 1.0
        optimizer:
          name: Adam
          learning_rate: 1.e-3
        batch_size: 2
        epochs: 20
        log_per_atom_pred: True

6. ``export`` (optional): Exports the trained model for external usage, such as creating a KIM-API model.

   Example:

   .. code-block:: yaml

      export:
        generate_tarball: True
        model_path: ./
        model_name: SW_StillingerWeber_trained_1985_Si__MO_405512056662_006

Example: Training a KIM Potential
---------------------------------

1. ``Dataset Setup``: Download and extract the training data.

   .. code-block:: bash

      wget https://raw.githubusercontent.com/openkim/kliff/main/examples/Si_training_set_4_configs.tar.gz
      tar -xzf Si_training_set_4_configs.tar.gz

2. ``Configuration``: Define workspace, dataset, model, and training settings.

   .. code-block:: python

      workspace = {"name": "SW_train_example", "random_seed": 12345}

      dataset = {"type": "path", "path": "Si_training_set_4_configs", "shuffle": True}

      model = {"name": "SW_StillingerWeber_1985_Si__MO_405512056662_006"}

      transforms = {"parameter": ["A", "B", "sigma"]}

      training = {
          "loss": {
              "function": "MSE",
              "weights": "weights.yaml",  # per-atom weights
          },
          "optimizer": {"name": "L-BFGS-B"},
          "training_dataset": {"train_size": 3},
          "validation_dataset": {"val_size": 1},
          "epoch": 10,
          "log_per_atom_pred": True,  # log per-atom predictions
          "verbose": True,
      }

      export = {"model_path": "./", "model_name": "MySW__MO_111111111111_000"}

      training_manifest = {
          "workspace": workspace,
          "model": model,
          "dataset": dataset,
          "transforms": transforms,
          "training": training,
          "export": export,
      }

3. ``Train``: Pass the configuration to the trainer and begin training.

   .. code-block:: python

      from kliff.trainer.kim_trainer import KIMTrainer

      trainer = KIMTrainer(training_manifest)
      trainer.train()
      trainer.save_kim_model()

The dictionary above follows the same section layout as the YAML manifest described at the top of this page.

Weights
=======

In the above example, the ``weights.yaml`` file (the extension must be ``yaml``, not ``yml``) defines the weights for each atom in the training set. The file has the following format:

.. code-block:: yaml

   - config: 1.0
     forces: [0.59918768, ...]
     energy: 1.0
   - config: 10.0
     forces: [0.97496481, ...]
     energy: 0.01
   - ...

Each entry corresponds to a configuration in the dataset. Any missing item from the YAML file is assumed to be 0.0 or ``None``. The weights scale the loss function during training, allowing more or less emphasis on certain configurations or properties. You can also provide weights as a dictionary or data file.

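
A file in this layout can also be generated from Python. The following is a minimal sketch using only the standard library. The helper names ``yaml_float`` and ``format_weights_yaml`` and the example values are hypothetical and not part of KLIFF, and treating ``forces`` as a list with one weight per atom is an assumption based on the description above.

.. code-block:: python

   def yaml_float(value):
       """Format a number so that common YAML parsers read it back as a float."""
       text = repr(float(value))
       if "e" in text:
           # Some YAML parsers treat "1e-05" as a string; "1.0e-05" is unambiguous.
           mantissa, exponent = text.split("e")
           if "." not in mantissa:
               mantissa += ".0"
           text = f"{mantissa}e{exponent}"
       return text


   def format_weights_yaml(entries):
       """Render a list of per-configuration weight dicts as YAML text.

       Each entry may hold ``config`` and ``energy`` (numbers) and ``forces``
       (a list of numbers). Keys that are absent are left out of the output.
       """
       lines = []
       for entry in entries:
           prefix = "- "  # the first key of an entry opens a new list item
           for key in ("config", "forces", "energy"):
               if key not in entry:
                   continue
               value = entry[key]
               if isinstance(value, (list, tuple)):
                   rendered = "[" + ", ".join(yaml_float(v) for v in value) + "]"
               else:
                   rendered = yaml_float(value)
               lines.append(f"{prefix}{key}: {rendered}")
               prefix = "  "  # later keys are indented under the same item
           if prefix == "- ":
               lines.append("- {}")  # entry without any weights
       return "\n".join(lines) + "\n"


   # Hypothetical weights for two configurations.
   example_entries = [
       {"config": 1.0, "forces": [0.5, 0.25], "energy": 1.0},
       {"config": 10.0, "energy": 0.01},  # force weights omitted
   ]

   weights_text = format_weights_yaml(example_entries)
   print(weights_text)

Writing ``weights_text`` to a file named ``weights.yaml`` gives a file that can be referenced from the ``loss`` section, as in the example above.
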
Per-atom predictions logging
============================

If the training manifest contains the ``log_per_atom_pred`` key, the trainer logs per-atom predictions during training (currently only forces). This is useful for analyzing the model's performance or uncertainty at the atomic level.

The predictions are written to an ``lmdb`` file named ``per_atom_pred_database.lmdb``, located in the current run directory under the ``workspace`` directory. Each record is stored under the key ``epoch_{i}|index_{j}``, where ``i`` is the epoch number and ``j`` is the index of the configuration in the dataset. The ``lmdb`` library must be installed to enable this functionality. For more details, refer to the KLIFF documentation.

Default artifacts
=================

Below is the list of default artifacts and files that KLIFF may generate during training. Most of these can be renamed to suit user requirements; the relevant keywords are listed in the KLIFF API documentation.

+-------------------------------+--------------------------------------------------------------+
| File / Folder                 | Description                                                  |
+===============================+==============================================================+
| ``kliff.log``                 | KLIFF’s own file logs, produced in the current working       |
|                               | directory (CWD)                                              |
+-------------------------------+--------------------------------------------------------------+
| ``fingerprints.pkl``          | Descriptors generated by the legacy descriptor module        |
+-------------------------------+--------------------------------------------------------------+
| ``finger...mean_and_std.pkl`` | Normalized descriptors generated by the legacy               |
|                               | descriptor module                                            |
+-------------------------------+--------------------------------------------------------------+
| ``final_model.pkl``           | Trained, serialized machine-learning model                   |
+-------------------------------+--------------------------------------------------------------+
| ``optimizer_state.pkl``       | Optimizer state for restarting                               |
+-------------------------------+--------------------------------------------------------------+
| ``orig_model.pkl``            | Original model serialization used by the UQ module           |
+-------------------------------+--------------------------------------------------------------+
| ``kliff_saved_model``         | Checkpoints and saved models                                 |
+-------------------------------+--------------------------------------------------------------+
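
After a run, it can be handy to check which of the default artifacts in the table were actually produced. The following is a sketch only: ``find_default_artifacts`` is a hypothetical helper and not a KLIFF function, and it assumes that the artifacts kept their default names and were written to the directory passed in. The normalization file is skipped because its full name is abbreviated in the table.

.. code-block:: python

   from pathlib import Path

   # Default names copied from the table above; users may have renamed them.
   DEFAULT_ARTIFACTS = [
       "kliff.log",
       "fingerprints.pkl",
       "final_model.pkl",
       "optimizer_state.pkl",
       "orig_model.pkl",
       "kliff_saved_model",
   ]


   def find_default_artifacts(directory="."):
       """Map each default artifact name to True if it exists in ``directory``."""
       root = Path(directory)
       return {name: (root / name).exists() for name in DEFAULT_ARTIFACTS}


   if __name__ == "__main__":
       # Report on the current working directory.
       for name, present in find_default_artifacts().items():
           status = "found" if present else "missing"
           print(f"{status:8s}{name}")
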