ALmoMD

# Tutorials

Initialization

This tutorial walks through ALmoMD using CuI as an example.

Ground truth

Running ALmoMD with DFT is too demanding for a tutorial because it requires a large number of DFT supercell calculations. Therefore, this tutorial uses a pretrained MLIP model as the ground truth instead of DFT. To do that, you need a REFER directory instead of DFT_INPUTS.

REFER: A directory containing the pretrained, deployed MLIP model (deployed-model_0_0.pth).

Accordingly, your ALmoMD inputs should be modified as below.

output_format : nequip             # Use the pretrained NequIP model as the ground truth
E_gs          : -0.142613646819514 # New corresponding reference potential energy (energy of geometry.in.supercell)

Train initial MLIP models

1) Split the aiMD trajectories into training and testing data. This is done with the command almomd utils split (# of testing data) (E_gs). In this tutorial, we use 100 testing data.

almomd utils split 100 -0.142613646819514

This creates two files, trajectory_train.son and trajectory_test.son, and a directory MODEL containing data_test.npz.
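Before moving on, it can help to confirm that the split produced the files listed above. The following is a minimal sketch, not part of ALmoMD: check_split_outputs is a hypothetical helper, and the file names are simply the ones mentioned in this tutorial.

```shell
# Hypothetical helper (not an ALmoMD command): report whether the files
# that this tutorial says `almomd utils split` creates are present.
check_split_outputs() {
    missing=0
    for path in trajectory_train.son trajectory_test.son MODEL/data_test.npz; do
        if [ -e "$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if nothing is missing, so the call can gate later steps.
    [ "$missing" -eq 0 ]
}
```

Calling check_split_outputs in the directory where the split was run prints one line per expected file and returns a non-zero status if any of them is absent.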

2) Create the training inputs for the initial MLIP models.

almomd init

This creates a directory 300K-0bar_0 inside MODEL.

cd MODEL/300K-0bar_0

You will find 3 training data files (data-train_*.npz), 6 NequIP inputs (input_*_*.yaml), and the corresponding job scripts (job-nequip-gpu_*_*.slurm). This is because input.in specifies 3 subsamplings and 2 random initializations, giving a total of 6 (= 3 × 2) different MLIP models.

3) Submit your job scripts to train MLIP models.

sbatch job-nequip-gpu_0.slurm; sbatch job-nequip-gpu_1.slurm; sbatch job-nequip-gpu_2.slurm; sbatch job-nequip-gpu_3.slurm; sbatch job-nequip-gpu_4.slurm; sbatch job-nequip-gpu_5.slurm
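Typing one sbatch call per script gets tedious as the number of models grows. As a rough sketch, the same submission can be written as a loop over a glob. Note that submit_all and the SUBMIT variable are hypothetical names introduced here for illustration; SUBMIT only exists so the loop can be dry-run with echo before anything is sent to the queue.

```shell
# Hypothetical helper: submit every job script matching a glob pattern.
# SUBMIT defaults to sbatch; set SUBMIT=echo for a dry run.
submit_all() {
    pattern="$1"
    submit="${SUBMIT:-sbatch}"
    count=0
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for script in $pattern; do
        [ -e "$script" ] || continue   # skip the literal pattern if nothing matched
        $submit "$script"
        count=$((count + 1))
    done
    echo "submitted $count job(s)" >&2
}
```

For example, SUBMIT=echo submit_all 'job-nequip-gpu_*.slurm' lists the scripts that would be submitted, and submit_all 'job-nequip-gpu_*.slurm' submits them. The pattern is passed in quotes so that it is expanded inside the function.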

4) When the training is done, you will get deployed MLIP models (deployed-model_*_*.pth).
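Since the next stage needs all models, it may be worth checking that every training job actually produced a deployed model. The sketch below counts the model files in the current directory. It assumes the files follow the deployed-model_*_*.pth pattern used for the reference model in REFER, and count_deployed_models is a hypothetical helper, not an ALmoMD command.

```shell
# Hypothetical helper: count deployed model files in the current directory
# and compare against the number of models expected (6 in this tutorial).
count_deployed_models() {
    expected="$1"
    found=0
    for model in deployed-model_*_*.pth; do
        if [ -e "$model" ]; then       # ignore the literal pattern if nothing matched
            found=$((found + 1))
        fi
    done
    echo "$found of $expected deployed models present"
    # Succeed only when at least the expected number of models exists.
    [ "$found" -ge "$expected" ]
}
```

Running count_deployed_models 6 inside the model directory prints the count and returns a non-zero status while some models are still missing.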

Active Learning Procedure

The active learning iterative loop in the ALmoMD consists of three major steps (MLIP exploration, DFT calculation, and MLIP training).

MLIP exploration

Once you have the MLIP models, ALmoMD explores the configurational space via MLIP-MD. This is done by submitting job-cont.slurm.

sbatch job-cont.slurm

This generates many files and directories. The most important ones for users are almomd.out, result.txt, and UNCERT/uncertainty-300K-0bar_*.txt.

1) almomd.out: It shows the overall process of the ALmoMD.

2) result.txt: It contains the testing results and their MLIP uncertainty at each active learning step.

3) UNCERT/uncertainty-300K-0bar_*.txt: It records the results of the MLIP-MD steps. It shows which MD snapshots were sampled.

Once all data are sampled, ALmoMD creates a directory CALC/300K-0bar_*, where all DFT inputs for the sampled snapshots are prepared.

DFT calculation

In each iteration, you need to go into the most recent CALC/300K-0bar_*.

cd CALC/300K-0bar_1
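After several iterations it is easy to lose track of which index is the most recent. As a minimal sketch, the directory with the highest numeric suffix can be found as follows. latest_iteration_dir is a hypothetical helper, and it assumes that a higher suffix means a more recent iteration.

```shell
# Hypothetical helper: print the directory <prefix>_<N> with the largest
# numeric N, e.g. latest_iteration_dir CALC/300K-0bar
latest_iteration_dir() {
    prefix="$1"
    best=-1
    for dir in "$prefix"_*/; do
        [ -d "$dir" ] || continue          # skip the literal pattern if nothing matched
        idx="${dir%/}"                     # strip the trailing slash
        idx="${idx##*_}"                   # keep the text after the last underscore
        case "$idx" in
            ''|*[!0-9]*) continue ;;       # ignore non-numeric suffixes
        esac
        if [ "$idx" -gt "$best" ]; then    # numeric comparison, so 10 > 2
            best="$idx"
        fi
    done
    [ "$best" -ge 0 ] || return 1          # no matching directory found
    echo "${prefix}_${best}"
}
```

From the main directory, cd "$(latest_iteration_dir CALC/300K-0bar)" then enters the most recent calculation directory. The same call with MODEL/300K-0bar should work for the model directories, under the same naming assumption.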

You need to submit all job scripts.

sbatch job-vibes_0.slurm; sbatch job-vibes_1.slurm; sbatch job-vibes_2.slurm; sbatch job-vibes_3.slurm; sbatch job-vibes_4.slurm; sbatch job-vibes_5.slurm; sbatch job-vibes_6.slurm; sbatch job-vibes_7.slurm; sbatch job-vibes_8.slurm; sbatch job-vibes_9.slurm; sbatch job-vibes_10.slurm; sbatch job-vibes_11.slurm; sbatch job-vibes_12.slurm; sbatch job-vibes_13.slurm; sbatch job-vibes_14.slurm; sbatch job-vibes_15.slurm

MLIP training

Once all DFT calculations are finished, go back to the main directory where result.txt is located and run:

almomd gen

This adds all new DFT results to the training data. The new training data, inputs, and corresponding job scripts are generated in the most recent MODEL/300K-0bar_*. Then, go into that directory and submit all job scripts.

sbatch job-nequip-gpu_0.slurm; sbatch job-nequip-gpu_1.slurm; sbatch job-nequip-gpu_2.slurm; sbatch job-nequip-gpu_3.slurm; sbatch job-nequip-gpu_4.slurm; sbatch job-nequip-gpu_5.slurm

When the training is done, you will get deployed MLIP models (deployed-model_*_*.pth). Then, go back to the MLIP exploration section to complete the loop.