Creating the Python environment and installing libraries

Source: https://github.com/lmuellender/gmx-nnpot-tools, https://doi.org/10.48550/arXiv.2604.21441

OS: AlmaLinux 8; GPU: NVIDIA TITAN-RTX.

git clone https://github.com/lmuellender/gmx-nnpot-tools.git
cd gmx-nnpot-tools
conda env create -f environment.yaml 
conda activate nnpot
# To use NVIDIA TITAN-RTX
pip install torch==2.7 --upgrade --index-url https://download.pytorch.org/whl/cu128
conda install conda-forge::jupyterlab
cd ..
git clone https://github.com/chemle/emle-engine.git
cd emle-engine
conda install conda-forge::ambertools \
              conda-forge::ase \
              conda-forge::loguru \
              conda-forge::psutil \
              conda-forge::pygit2 \
              conda-forge::pyyaml \
              conda-forge::xtb-python \
              --no-update-deps

pip install deepmd-kit --no-deps
pip install -e .

cd ../gmx-nnpot-tools
sed -i 's/self\.model(atomic_numbers, charges_mm, positions_nn, positions_mm, qm_charge)/self.model(atomic_numbers, charges_mm, positions_nn, positions_mm, None, qm_charge)/' \
    /home/user/data/gromacs/NNPOT/gmx-nnpot-tools/models/gmx_emle.py
jupyter-lab --no-browser --ip=xxx.xx.xxx.xx
# Open export.ipynb and run all cells. This writes ani2x.pt, ani2x_nnpops.pt, mace.pt, emle_ani2x_ala2.pt, and emle_mace_ala2.pt to the "models" directory.
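
The sed command above exits successfully even when its pattern does not match, so it is worth confirming that gmx_emle.py was really changed before exporting the models. A minimal Python sketch of the same substitution with such a check follows. The helper names patch_model_call and patch_file are hypothetical; the two call strings are copied from the sed expression.

```python
from pathlib import Path

# Call strings taken from the sed expression in these notes.
OLD_CALL = "self.model(atomic_numbers, charges_mm, positions_nn, positions_mm, qm_charge)"
NEW_CALL = "self.model(atomic_numbers, charges_mm, positions_nn, positions_mm, None, qm_charge)"


def patch_model_call(source: str) -> tuple[str, int]:
    """Return the patched source and the number of replacements made."""
    count = source.count(OLD_CALL)
    return source.replace(OLD_CALL, NEW_CALL), count


def patch_file(path: str) -> int:
    """Patch the file in place; raise if the old call is not present."""
    file_path = Path(path)
    patched, count = patch_model_call(file_path.read_text())
    if count == 0:
        raise ValueError(f"pattern not found in {file_path}; file left unchanged")
    file_path.write_text(patched)
    return count
```

Calling patch_file("models/gmx_emle.py") from the gmx-nnpot-tools directory would apply the change once; a second call raises, which shows that the file is already patched.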

Installing GROMACS 2026.2

conda activate nnpot
# Directory holding the downloaded tarballs; FFTW_PATH below refers to it
export BASEDIR=$(pwd)
wget https://ftp.gromacs.org/gromacs/gromacs-2026.2.tar.gz
wget https://www.fftw.org/fftw-3.3.10.tar.gz
tar xvzf gromacs-2026.2.tar.gz
cd gromacs-2026.2
mkdir build_nnpot && cd build_nnpot
export FFTW_PATH=${BASEDIR}/fftw-3.3.10.tar.gz
conda install cmake
export Torch_DIR=$(python -c "import torch; print(torch.utils.cmake_prefix_path)")/Torch
cmake .. -DCMAKE_PREFIX_PATH="${Torch_DIR};${CONDA_PREFIX}"\
 -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/2026.2_cuda_torch27 -DGMX_GPU=CUDA -DGMX_USE_CUFFTMP=OFF\
 -DGMX_NNPOT=TORCH -DCAFFE2_USE_CUDNN=ON -DCAFFE2_USE_CUSPARSELT=ON -DUSE_CUDSS=ON\
 -DPython_EXECUTABLE=/home/user/miniforge3/envs/nnpot/bin/python -DGMX_BUILD_OWN_FFTW=ON\
 -DGMX_BUILD_OWN_FFTW_URL=${FFTW_PATH} -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.9/bin/nvcc\
 -DCMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES=/usr/local/cuda-12.9/targets/x86_64-linux/lib\
 -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.9 -DGMX_INSTALL_NBLIB_API=OFF\
 -DCUDA_NVRTC_SHORTHASH=c3430e8b -DREGRESSIONTEST_DOWNLOAD=ON\
 -Dnvtx3_dir=/usr/local/cuda-12.9/targets/x86_64-linux/include/nvtx3 -DGMX_USE_TNG=OFF
make -j 37
make check -j 37
# 100% tests passed, 0 tests failed out of 98
sudo make install
source /usr/local/gromacs/2026.2_cuda_torch27/bin/GMXRC
gmx --version

                         :-) GROMACS - gmx, 2026.2 (-:

Executable:   /usr/local/gromacs/2026.2_cuda_torchcu129/bin/gmx
Data prefix:  /usr/local/gromacs/2026.2_cuda_torchcu129
Command line:
  gmx --version

GROMACS version:     2026.2
Precision:           mixed
Memory model:        64 bit
MPI library:         thread_mpi
MPI version:         built in
OpenMP support:      enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support:         CUDA
NBNxM GPU setup:     super-cluster 2x2x2 / cluster 8 (cluster-pair splitting on)
SIMD instructions:   AVX_512
CPU FFT library:     fftw-3.3.10-sse2-avx-avx2-avx2_128-avx512
GPU FFT library:     cuFFT
Multi-GPU FFT:       none
RDTSCP:              enabled
TNG support:         enabled
Hwloc support:       disabled
Tracing support:     disabled
Colvars support:     enabled (version 2025-10-13)
CP2K support:        disabled
Torch support:       enabled (version 2.7.0)
Plumed support:      enabled
C compiler:          /home/user/miniforge3/envs/nnpot/bin/x86_64-conda-linux-gnu-cc GNU 14.3.0
C compiler flags:    -fexcess-precision=fast -funroll-all-loops -march=skylake-avx512 -Wno-missing-field-initializers -D_GLIBCXX_USE_CXX11_ABI=1 -O3 -DNDEBUG
C++ compiler:        /home/user/miniforge3/envs/nnpot/bin/x86_64-conda-linux-gnu-c++ GNU 14.3.0
C++ compiler flags:  -fexcess-precision=fast -funroll-all-loops -march=skylake-avx512 -Wno-missing-field-initializers -D_GLIBCXX_USE_CXX11_ABI=1 -Wno-stringop-truncation -Wno-cast-function-type-strict SHELL:-fopenmp -O3 -DNDEBUG
BLAS library:        External - detected on the system
LAPACK library:      External - detected on the system
CUDA compiler:       /usr/local/cuda-12.9/bin/nvcc (NVIDIA 12.9.86)
CUDA compiler flags:  -D_FORCE_INLINES -DONNX_NAMESPACE=onnx_c2 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_100,code=sm_100 -gencode arch=compute_120,code=sm_120 -gencode arch=compute_50,code=compute_50 -gencode arch=compute_120,code=compute_120 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O3 -DNDEBUG -use_fast_math -static-global-template-stub=false -Xptxas=-warn-double-usage -Xptxas=-Werror -diag-suppress=177 -O3 -DNDEBUG
CUDA targets:        50;52-real;60-real;61-real;70-real;75-real;80-real;86-real;89-real;90-real;100-real;120
CUDA driver:         13.0
CUDA runtime:        12.90
NVSHMEM:             disabled
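
The report above is a list of "Key: value" lines, so the features needed for NNP runs (GPU support and Torch support) can be checked by a script instead of by eye. The following is a minimal sketch; the function names parse_gmx_version and missing_features are hypothetical, and the parser assumes each entry sits on one line, as in the output shown.

```python
def parse_gmx_version(text: str) -> dict[str, str]:
    """Parse 'Key:   value' lines from `gmx --version` output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip the banner line, blank lines, and keys without a value.
        if sep and key.strip() and value.strip():
            info[key.strip()] = value.strip()
    return info


def missing_features(info, required=("GPU support", "Torch support")):
    """Return the required keys that are absent or reported as disabled."""
    return [key for key in required
            if info.get(key, "disabled").startswith("disabled")]
```

The text can be obtained with subprocess.run(["gmx", "--version"], capture_output=True, text=True).stdout; an empty list from missing_features means both features are reported as available.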

Test (N-acetyl-L-alanine methylamide in water)

Source: https://github.com/lmuellender/gmx-nnpot-tools

Using ani2x.pt with CUDA (passed)

$ cd gmx-nnpot-tools/example
$ cp md.mdp npt_nnpot.mdp
$ export GMX_NN_DEVICE=gpu
$ LD_PRELOAD=$CONDA_PREFIX/lib/python3.13/site-packages/torchani/cuaev.cpython-313-x86_64-linux-gnu.so gmx grompp -f npt_nnpot.mdp -c conf.gro -p topol.top -o npt_nnpot
Neural network potential interface is active, topology was modified!
Number of embedded NNP atoms: 22
Number of regular atoms: 6765
Number of exclusions made: 22
Number of bonds removed: 62
Number of InteractionFunction::ConnectBonds (type 5 bonds) added: 21
Number of angles removed: 36
Number of dihedrals removed: 42
$ gmx mdrun -v -deffnm npt_nnpot -ntmpi 1 -ntomp 1
step 10000, remaining wall clock time:     0 s          
               Core t (s)   Wall t (s)        (%)
       Time:      228.351      228.351      100.0
                 (ns/day)    (hour/ns)    (ms/step)  (Matom*steps/s) 
Performance:        3.784        6.342       22.833            0.297 
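
The performance figures printed by mdrun are related by simple arithmetic, which gives a quick sanity check on a run. The sketch below assumes a 1 fs time step (inferred from the reported numbers; the dt value in md.mdp is not shown in these notes) and 22 + 6765 = 6787 atoms, taken from the grompp output above. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def ns_per_day(ms_per_step: float, dt_fs: float) -> float:
    """Simulated nanoseconds per wall-clock day."""
    steps_per_day = SECONDS_PER_DAY / (ms_per_step * 1e-3)
    return steps_per_day * dt_fs * 1e-6  # 1 fs = 1e-6 ns


def hours_per_ns(ns_day: float) -> float:
    """Wall-clock hours needed for one simulated nanosecond."""
    return 24.0 / ns_day


def matom_steps_per_s(n_atoms: int, ms_per_step: float) -> float:
    """Millions of atom-steps processed per second."""
    return n_atoms / (ms_per_step * 1e-3) / 1e6


# Values from the ani2x.pt run: 22.833 ms/step, 6787 atoms, dt assumed 1 fs.
print(round(ns_per_day(22.833, 1.0), 3))
print(round(hours_per_ns(3.784), 3))
print(round(matom_steps_per_s(6787, 22.833), 3))
```

Under these assumptions the three values agree with the ns/day, hour/ns, and Matom*steps/s columns above to three decimal places.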

Using ani2x_nnpops.pt (passed)

$ sed 's/ani2x.pt/ani2x_nnpops.pt/g' npt_nnpot.mdp > npt_nnpops.mdp
$ gmx grompp -f npt_nnpops.mdp -c conf.gro -p topol.top -o npt_nnpops -maxwarn 1
WARNING 1 [file npt_nnpops.mdp]:
  There was an error while checking NN model with a dummy input: Error
  during evaluation of the neural network model: The following operation
  failed in the TorchScript interpreter.
$ gmx mdrun -v -deffnm npt_nnpops -ntmpi 1 -ntomp 1
step 10000, remaining wall clock time:     0 s          
               Core t (s)   Wall t (s)        (%)
       Time:       15.254       15.254      100.0
                 (ns/day)    (hour/ns)    (ms/step)  (Matom*steps/s) 
Performance:       56.646        0.424        1.525            4.450 
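
The sed expression above replaces every occurrence of ani2x.pt in the file, including any in comments, and the dot in the pattern matches any character. A narrower alternative is to rewrite only the nnpot-modelfile line. The sketch below shows one way to do that in Python; the helper name set_mdp_option is hypothetical, and it assumes the option is written as "key = value" on a single line.

```python
import re


def set_mdp_option(mdp_text: str, key: str, value: str) -> str:
    """Set `key` to `value` in mdp text; append the option if it is absent."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    # A function replacement keeps backslashes in `value` literal.
    new_text, count = pattern.subn(lambda m: m.group(1) + value, mdp_text, count=1)
    if count == 0:
        new_text = mdp_text.rstrip("\n") + f"\n{key} = {value}\n"
    return new_text
```

For example, set_mdp_option(text, "nnpot-modelfile", "../models/ani2x_nnpops.pt") leaves every other line untouched. Anything after the "=" on the matched line, including a trailing comment, is replaced.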

Using mace.pt (passed)

$ sed 's/ani2x.pt/mace.pt/g' npt_nnpot.mdp > npt_mace.mdp
# Edit npt_mace.mdp so that the NNP options read:
nnpot-active        = yes
nnpot-modelfile     = ../models/mace.pt
nnpot-input-group   = protein
nnpot-pair-cutoff   = 0.45
nnpot-model-input1  = atom-positions
nnpot-model-input2  = atom-numbers
nnpot-model-input3  = atom-pairs
nnpot-model-input4  = pair-shifts
nnpot-model-input5  = box
nnpot-model-input6  = pbc
$ gmx grompp -f npt_mace.mdp -c conf.gro -p topol.top -o npt_mace
$ gmx mdrun -v -deffnm npt_mace -ntmpi 1 -ntomp 1
step 10000, remaining wall clock time:     0 s          
               Core t (s)   Wall t (s)        (%)
       Time:      115.923      115.923      100.0
                 (ns/day)    (hour/ns)    (ms/step)  (Matom*steps/s) 
Performance:        7.454        3.220       11.591            0.586 
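
With all three runs done, the throughput can be compared relative to ani2x.pt. The sketch below only divides the ns/day values reported above. Each value comes from a single 10000-step run, so the ratios are rough indications and not a benchmark. The function name speedups is hypothetical.

```python
# ns/day values copied from the three mdrun outputs above.
NS_PER_DAY = {
    "ani2x.pt": 3.784,
    "ani2x_nnpops.pt": 56.646,
    "mace.pt": 7.454,
}


def speedups(perf: dict[str, float], baseline: str) -> dict[str, float]:
    """Return each model's throughput relative to the baseline model."""
    base = perf[baseline]
    return {name: value / base for name, value in perf.items()}


for name, ratio in speedups(NS_PER_DAY, "ani2x.pt").items():
    print(f"{name:18s} {ratio:6.2f}x")
```

On these numbers ani2x_nnpops.pt runs about 15 times faster than ani2x.pt, and mace.pt about 2 times faster.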

Using emle_ani2x_ala2.pt (failed)

$ cp npt_nnpot.mdp npt_emle_ani2x.mdp
# Edit npt_emle_ani2x.mdp
nnpot-modelfile     = ../models/emle_ani2x_ala2.pt
nnpot-embedding     = electrostatic-model
$ LD_PRELOAD=$CONDA_PREFIX/lib/python3.13/site-packages/NNPOps/libNNPOpsPyTorch.so gmx grompp -f npt_emle_ani2x.mdp -c conf.gro -p topol.top -o npt_emle_ani2x -maxwarn 2
WARNING 1 [file npt_emle_ani2x.mdp]:
  There was an error while checking NN model with a dummy input: Error
  during evaluation of the neural network model: The following operation
  failed in the TorchScript interpreter.

WARNING 2 [file npt_emle_ani2x.mdp]:
  Embedding scheme is set to Electrostatic model, but MM positions and/or
  charges are not requested as model input. Is this intended?

$ LD_PRELOAD=$CONDA_PREFIX/lib/python3.13/site-packages/NNPOps/libNNPOpsPyTorch.so gmx mdrun -v -deffnm npt_emle_ani2x -ntmpi 1 -ntomp 1

Internal error in destructor of UpdateConstrainGpu: Freeing of the device buffer failed. CUDA error #1 (cudaErrorInvalidValue): invalid argument.
Internal error in destructor of LeapFrogGpu: Freeing of the device buffer failed. CUDA error #1 (cudaErrorInvalidValue): invalid argument.

WARNING: Could not free page-locked memory. An unhandled error from a previous CUDA operation was detected. CUDA error #1 (cudaErrorInvalidValue): invalid argument. 
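
Because -maxwarn lets grompp continue past its warnings, it helps to collect them from the saved output and read them before starting mdrun. The following is a minimal sketch of such a collector. The function name extract_warnings is hypothetical, and the parser assumes the layout seen above: a "WARNING n [file name]:" header followed by body lines indented by two spaces.

```python
import re

WARNING_HEADER = re.compile(r"^WARNING (\d+) \[file ([^\]]+)\]:\s*$")


def extract_warnings(output: str) -> list[tuple[int, str, str]]:
    """Return (number, file, message) for each warning block in grompp output."""
    blocks = []
    current = None
    for line in output.splitlines():
        header = WARNING_HEADER.match(line)
        if header:
            current = [int(header.group(1)), header.group(2), []]
            blocks.append(current)
        elif current is not None:
            if line.startswith("  ") and line.strip():
                current[2].append(line.strip())
            else:
                current = None  # a blank or unindented line ends the block
    return [(num, name, " ".join(body)) for num, name, body in blocks]
```

Applied to the grompp output above, this would return two entries, the second being the note that MM positions and/or charges are not requested as model input.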

Using emle_mace_ala2.pt (failed)

$ cp npt_mace.mdp npt_emle_mace.mdp
# Edit npt_emle_mace.mdp
nnpot-modelfile     = ../models/emle_mace_ala2.pt
nnpot-embedding     = electrostatic-model

$ LD_PRELOAD=$CONDA_PREFIX/lib/python3.13/site-packages/NNPOps/libNNPOpsPyTorch.so  gmx grompp -f npt_emle_mace.mdp -c conf.gro -p topol.top -o npt_emle_mace -maxwarn 2
WARNING 1 [file npt_emle_mace.mdp]:
  There was an error while checking NN model with a dummy input: Expected
  at most 6 argument(s) for operator 'forward', but received 7 argument(s).

WARNING 2 [file npt_emle_mace.mdp]:
  Embedding scheme is set to Electrostatic model, but MM positions and/or
  charges are not requested as model input. Is this intended?

$ LD_PRELOAD=$CONDA_PREFIX/lib/python3.13/site-packages/NNPOps/libNNPOpsPyTorch.so gmx mdrun -v -deffnm npt_emle_mace -ntmpi 1 -ntomp 1

Internal error in destructor of UpdateConstrainGpu: Freeing of the device buffer failed. CUDA error #1 (cudaErrorInvalidValue): invalid argument.
Internal error in destructor of LeapFrogGpu: Freeing of the device buffer failed. CUDA error #1 (cudaErrorInvalidValue): invalid argument.

WARNING: Could not free page-locked memory. An unhandled error from a previous CUDA operation was detected. CUDA error #1 (cudaErrorInvalidValue): invalid argument.
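
The first grompp warning above says that forward accepts at most 6 arguments but received 7. Assuming that TorchScript counts the module itself (self) as one argument, and that npt_emle_mace.mdp still lists the six nnpot-model-input entries inherited from npt_mace.mdp, the mdp would be passing six tensors to a model that takes five. The sketch below counts the declared inputs so the two numbers can be compared; the function name model_inputs is hypothetical.

```python
import re

# Matches lines such as "nnpot-model-input3 = atom-pairs".
INPUT_LINE = re.compile(
    r"^[ \t]*nnpot[-_]model[-_]input(\d+)[ \t]*=[ \t]*(\S+)", re.MULTILINE
)


def model_inputs(mdp_text: str) -> list[str]:
    """Return the nnpot-model-inputN values, ordered by N."""
    found = {int(index): value for index, value in INPUT_LINE.findall(mdp_text)}
    return [found[index] for index in sorted(found)]
```

For the six entries listed for mace.pt, len(model_inputs(text)) is 6, and 6 + 1 (self) equals the 7 arguments named in the warning. This is a consistency check under the stated assumptions, not a confirmed diagnosis of the failed run.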