The dataset is publicly available and can be accessed from Dataset.
Required Python packages include:
ase==3.22.1
config==0.5.1
lmdb==1.4.1
matplotlib==3.7.2
numpy==1.24.4
pandas==2.1.3
pymatgen==2023.5.10
scikit_learn==1.3.0
scipy==1.11.4
torch==1.13.1
torch_geometric==2.2.0
torch_scatter==2.1.0
tqdm==4.66.1
Alternatively, install the environment using the provided YAML file at ./environment/environment.yaml.
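Once the environment is set up, it can help to confirm that the installed versions match the pinned ones. The following is a minimal sketch using only the standard library; the helper names `parse_pins` and `check_environment` are illustrative and not part of this repository, and how distribution names such as scikit_learn are matched may depend on your Python version.

```python
from importlib import metadata

# Pinned versions copied from the package list above.
PINNED = """
ase==3.22.1 config==0.5.1 lmdb==1.4.1 matplotlib==3.7.2 numpy==1.24.4
pandas==2.1.3 pymatgen==2023.5.10 scikit_learn==1.3.0 scipy==1.11.4
torch==1.13.1 torch_geometric==2.2.0 torch_scatter==2.1.0 tqdm==4.66.1
"""


def parse_pins(text):
    """Return {package_name: version} from whitespace-separated 'name==version' items."""
    pins = {}
    for item in text.split():
        name, _, version = item.partition("==")
        pins[name] = version
    return pins


def check_environment(pins):
    """Return (name, expected, found) for every mismatch; found is None if not installed."""
    problems = []
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems.append((name, expected, found))
    return problems


if __name__ == "__main__":
    for name, expected, found in check_environment(parse_pins(PINNED)):
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty output means every listed package is installed at the pinned version.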
For logging, we recommend Weights & Biases (wandb); see https://wandb.ai/ for details. Training logs and trained models are stored in the ./wandb directory. The saved model can typically be found at ./wandb/run-xxx/files/model.pt, where xxx stands for the run-specific information.
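Because each run gets its own directory, locating the newest saved model by hand can be tedious. The sketch below assumes the layout described above (./wandb/run-xxx/files/model.pt) and picks the most recently modified file; the function name `find_latest_model` is hypothetical and not part of this repository.

```python
from pathlib import Path


def find_latest_model(wandb_dir="./wandb"):
    """Return the most recently modified model.pt under wandb_dir/run-*/files/, or None."""
    candidates = list(Path(wandb_dir).glob("run-*/files/model.pt"))
    if not candidates:
        return None
    # Modification time is used as a proxy for "latest run".
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    model_path = find_latest_model()
    print(model_path if model_path else "No saved model found under ./wandb")
```

The returned path can then be passed to the --model or --model_path arguments used in the commands further down.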
To begin working with the datasets, first download the necessary files from Zenodo and unzip them.
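If you would rather unzip the downloaded archives from Python than from a file manager, a minimal sketch with the standard library looks like this. The archive name in the usage example is a placeholder, since the actual file names on Zenodo are not listed here, and `unzip_dataset` is an illustrative helper rather than part of this repository.

```python
import zipfile
from pathlib import Path


def unzip_dataset(archive_path, dest_dir):
    """Extract archive_path into dest_dir and return the sorted top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    # "downloaded_archive.zip" is a placeholder for the file you downloaded.
    print(unzip_dataset("downloaded_archive.zip", "your_data_path"))
```

The printed names show which directories are now available to pass as --data_root.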
If you prefer to preprocess the data from scratch, use the following commands, ensuring you replace your_data_path with the appropriate path to your data:
For the high-density defect dataset:
python preprocess_defect_high_density.py --data_root your_data_path/high_density_defects --num_workers 1
For the low-density defect dataset:
python preprocess_defect_low_density.py --data_root your_data_path/low_density_defects --num_workers 1
To increase the processing speed, you can adjust the --num_workers parameter to a higher value, depending on your system's capabilities.
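One way to pick a value for --num_workers is to derive it from the number of CPU cores. The sketch below builds the preprocessing command with one core left free; the "leave one core free" heuristic and the helper name `preprocess_command` are assumptions made for illustration, not a recommendation from the authors.

```python
import os
import subprocess
import sys


def preprocess_command(script, data_root, reserve=1):
    """Build the argument list for a preprocessing run, leaving `reserve` cores free."""
    # os.cpu_count() may return None, so fall back to a single worker.
    workers = max(1, (os.cpu_count() or 1) - reserve)
    return [sys.executable, script, "--data_root", data_root, "--num_workers", str(workers)]


if __name__ == "__main__":
    cmd = preprocess_command(
        "preprocess_defect_high_density.py",
        "your_data_path/high_density_defects",
    )
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)
```

If memory rather than CPU is the bottleneck on your machine, a smaller worker count may be the safer choice.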
To train DefiNet, run the following commands, replacing your_data_path with the actual path to your dataset:
For the high-density defect dataset:
python train_high_density.py --data_root your_data_path/high_density_defects --num_workers 4 --save_model
For the low-density defect dataset:
python train_low_density.py --data_root your_data_path/low_density_defects --num_workers 4 --save_model
To evaluate DefiNet, specifically on the XMnO dataset, run the following commands, replacing your_data_path and your_model_path with the appropriate paths:
python test_high_density.py --data_root your_data_path/high_density_defects --model your_model_path/model.pt
python test_low_density.py --data_root your_data_path/low_density_defects --model your_model_path/model.pt
To predict relaxed structures and save them as .cif files:
python predict_relaxed_structure.py --data_root your_data_path/high_density_defects --materials MoS2_500 --unit_cell_fname MoS2.cif --model_path your_model_path/model.pt