XiaO

基于 Gromacs 和 OpenMM 的小分子分子动力学模拟

XiaO / 2022-08-13


Gromacs 分子动力学模拟的输入文件:

OpenMM 分子动力学模拟的输出文件:

准备输入文件

分子结构文件

此处,我们将模拟 pyrene 在水中的聚集行为,故在根目录下新建文件夹 pyrene,并将终端的工作目录改变到此文件夹。

在终端中安装 open-babel

brew install open-babel

在终端中将分子的 SMILES 格式 转换为其它格式文件:

obabel -:"C1=CC2=C3C(=C1)C=CC4=CC=CC(=C43)C=C2" -O pyrene.mol2 --gen3D
obabel -:"C1=CC2=C3C(=C1)C=CC4=CC=CC(=C43)C=C2" -O pyrene.pdb --gen3D

可利用网络服务器进行结构转换;亦可制作 Automator 脚本,快捷转换大量文件(pdb, mol2, sdf, mol):

export PATH=/usr/local/bin:$PATH

for f in "$@"
do
	file_path=${f%/*}
  	file=$(basename $f)
  	file_name=${file%.*}
  	file_extension=${file##*.}
	if [[ $file_extension == "sdf" ]]
	then
		obabel -h -isdf $f -omol > $file_path/$file_name.mol
		obabel -h -isdf $f -opdb > $file_path/$file_name.pdb
		obabel -h -isdf $f -omol2 > $file_path/$file_name.mol2
	elif [[ $file_extension == "pdb" ]]
	then
		obabel -h -ipdb $f -omol > $file_path/$file_name.mol
		obabel -h -ipdb $f -omol2 > $file_path/$file_name.mol2
		obabel -h -ipdb $f -osdf > $file_path/$file_name.sdf
	elif [[ $file_extension == "mol" ]]
	then
		obabel -h --gen3d -imol $f -omol2 > $file_path/$file_name.mol2
		obabel -h --gen3d -imol $f -opdb > $file_path/$file_name.pdb
		obabel -h --gen3d -imol $f -osdf > $file_path/$file_name.sdf
	elif [[ $file_extension == "mol2" ]]
	then
		obabel -h -imol2 $f -omol > $file_path/$file_name.mol
		obabel -h -imol2 $f -opdb > $file_path/$file_name.pdb
		obabel -h -imol2 $f -osdf > $file_path/$file_name.sdf
	else
	fi
done

对 mol2 文件中分子的键重新排序

下载该 Perl 脚本,将其放入我们的工作目录中,用该脚本对 mol2 文件中分子的键 @<TRIPOS>BOND 重新排序,并将排序后的文件命名为 pyrene_clean.mol2

perl sort_mol2_bonds.pl pyrene.mol2 pyrene_clean.mol2

力场文件

用于描述小分子的力场各不相同,譬如 CHARMM36GAFF

生成分子的力场文件 *.top

根据选择力场的不同,用于生成分子力场文件的工具也不同。该案例中我们使用 CHARMM36 力场,可选用 CGenff 网络服务器 生成 CHARMM 格式的力场文件。在 CGenFF 服务器上注册并创建一个免费帐户,上传我们整理好的 pyrene_clean.mol2 文件,它会生成 pyrene_clean.str 文件,该文件即是 CHARMM 格式的力场文件,下载该文件。

因为我们需在 Gromacs 中进行分子动力学模拟,故需要将上述 CHARMM 格式的力场文件转换为 Gromacs 格式的力场文件。下载该 python2 脚本,放入我们的工作目录中。创建一个适合该文件运行的 python2 环境:

# 创建虚拟环境
conda create cgenff_charmm2gmx
conda activate cgenff_charmm2gmx

# 安装相应的包(注意版本)
conda install python==2.7
conda install numpy==1.12.1
pip install networkx==1.11 
pip install decorator==3.4.2

# 执行力场文件的格式转换
python cgenff_charmm2gmx.py pyrene pyrene_clean.mol2 pyrene.str charmm36-jul2021.ff

成功运行后,我们即可得到 pyrene 小分子的力场文件 pyrene.top,该分子单独的力场文件为 pyrene.itp,并被引入到 pyrene.top 中,该分子所需的其他参数被写入 pyrene.prm 文件中,一并被引入到 pyrene.top 中。

NOTE1: Code tested with python 2.7.3. Your version: 2.7.18 |Anaconda, Inc.| (default, Apr 23 2020, 17:44:47) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

NOTE2: Please be sure to use the same version of CGenFF in your simulations that was used during parameter generation:
--Version of CGenFF detected in  pyrene.str : 4.6
--Version of CGenFF detected in  charmm36-jul2021.ff/forcefield.doc : 4.6

NOTE3: In order to avoid duplicated parameters, do NOT select the 'Include parameters that are already in CGenFF' option when uploading a molecule into CGenFF.
============ DONE ============
Conversion complete.
The molecule topology has been written to pyrene.itp
Additional parameters needed by the molecule are written to pyrene.prm, which needs to be included in the system .top
============ DONE ============

重复以上所有步骤,即可得到不同分子的力场文件 .top。

复制一份pyrene.top,并将其重命名为sys.top,用以存放系统中所有的分子力场信息。

安装 GROMACS

在终端中安装 GROMACS

brew install gromacs

*.gro 文件:定义盒子及分子在盒子中的座标信息

利用前面 cgenff_charmm2gmx.py 处理时得到的 pyrene_ini.pdb 文件,在终端执行如下代码,将其转换为 pyrene.gro 文件。该文件记录了分子中每个原子在一个长宽高均为 3 nm 的空间中的位置信息。

gmx editconf -f pyrene_ini.pdb   -o pyrene.gro -box 3 3 3

Read 26 atoms
No velocities found
    system size :  0.920  0.679  0.003 (nm)
    center      :  0.000 -0.000 -0.000 (nm)
    box vectors :  0.000  0.000  0.000 (nm)
    box angles  :   0.00   0.00   0.00 (degrees)
    box volume  :   0.00               (nm^3)
    shift       :  1.500  1.500  1.500 (nm)
new center      :  1.500  1.500  1.500 (nm)
new box vectors :  3.000  3.000  3.000 (nm)
new box angles  :  90.00  90.00  90.00 (degrees)
new box volume  :  27.00               (nm^3)

重复以上所有步骤,即可得到不同分子的 gro 文件。

复制一份pyrene.gro,并将其重命名为sys.gro,用以存放系统中所有的分子力场信息。

向盒子中添加更多的分子(分子浓度)

向该分子座标文件 sys.gro 中再随机添加 30 个 pyrene 分子,得到一个含有 31 个小分子的座标文件 sys_num.gro注意修改 sys.top 文件中, [ molecules ] 项下的 pyrene 的分子数量,1 ==> 31

gmx insert-molecules -f pyrene.gro -ci pyrene.gro -o sys_num.gro -nmol 30 -rot xyz -try 100

Reading solute configuration
Initialising inter-atomic distances...

WARNING: Masses and atomic (Van der Waals) radii will be guessed
         based on residue and atom names, since they could not be
         definitively assigned from the information in your input
         files. These guessed numbers might deviate from the mass
         and radius of the atom type. Please check the output
         files if necessary. Note, that this functionality may
         be removed in a future GROMACS version. Please, consider
         using another file format for your input.

NOTE: From version 5.0 gmx insert-molecules uses the Van der Waals radii
from the source below. This means the results may be different
compared to previous GROMACS versions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
A. Bondi
van der Waals Volumes and Radii
J. Phys. Chem. 68 (1964) pp. 441-451
-------- -------- --- Thank You --- -------- --------

Using random seed -219709635
Try 1 success (now 52 atoms)!
Try 2 success (now 78 atoms)!
Try 3 success (now 104 atoms)!
Try 4 success (now 130 atoms)!
Try 5 success (now 156 atoms)!
Try 7 success (now 182 atoms)!
Try 8 success (now 208 atoms)!
Try 12 success (now 234 atoms)!
Try 13 success (now 260 atoms)!
Try 14 success (now 286 atoms)!
Try 15 success (now 312 atoms)!
Try 16 success (now 338 atoms)!
Try 19 success (now 364 atoms)!
Try 22 success (now 390 atoms)!
Try 25 success (now 416 atoms)!
Try 30 success (now 442 atoms)!
Try 32 success (now 468 atoms)!
Try 34 success (now 494 atoms)!
Try 40 success (now 520 atoms)!
Try 44 success (now 546 atoms)!
Try 49 success (now 572 atoms)!
Try 59 success (now 598 atoms)!
Try 61 success (now 624 atoms)!
Try 62 success (now 650 atoms)!
Try 65 success (now 676 atoms)!
Try 66 success (now 702 atoms)!
Try 67 success (now 728 atoms)!
Try 75 success (now 754 atoms)!
Try 82 success (now 780 atoms)!
Try 83 success (now 806 atoms)!

Added 30 molecules (out of 30 requested)
Writing generated configuration to pyrene_num.gro

Output configuration contains 806 atoms in 31 residues

重复以上添加分子的步骤,即可向盒子中添加不同的分子:

# 向前面生成的 sys_num.gro 文件中,再添加 30 个 fav 分子,重新生成一个 sys_num2.gro 文件
gmx insert-molecules -f sys_num.gro -ci fav.gro -o sys_num2.gro -nmol 30 -rot xyz -try 100 

注意在 sys.top 文件中, [ molecules ] 项下添加 fav 分子以及相应的数量 30

向盒子中添加溶剂

gmx solvate -cp sys_num2.gro -cs spc216.gro -o sys_num2_solv.gro -p sys.top


Reading solute configuration
Reading solvent configuration

Initialising inter-atomic distances...

WARNING: Masses and atomic (Van der Waals) radii will be guessed
         based on residue and atom names, since they could not be
         definitively assigned from the information in your input
         files. These guessed numbers might deviate from the mass
         and radius of the atom type. Please check the output
         files if necessary. Note, that this functionality may
         be removed in a future GROMACS version. Please, consider
         using another file format for your input.

NOTE: From version 5.0 gmx solvate uses the Van der Waals radii
from the source below. This means the results may be different
compared to previous GROMACS versions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
A. Bondi
van der Waals Volumes and Radii
J. Phys. Chem. 68 (1964) pp. 441-451
-------- -------- --- Thank You --- -------- --------

Generating solvent configuration
Will generate new solvent configuration of 2x2x2 boxes
Solvent box contains 3618 atoms in 1206 residues
Removed 966 solvent atoms due to solvent-solvent overlap
Removed 942 solvent atoms due to solute-solvent overlap
Sorting configuration
Found 1 molecule type:
    SOL (   3 atoms):   570 residues
Generated solvent containing 1710 atoms in 570 residues
Writing generated configuration to pyrene_num_solv.gro

Output configuration contains 2516 atoms in 601 residues
Volume                 :          27 (nm^3)
Density                :     1017.14 (g/l)
Number of solvent molecules:    570   

Processing topology
Adding line for 570 solvent molecules with resname (SOL) to topology file (sys.top)

Back Off! I just backed up sys.top to ./#sys.top.1#

如此,我们便得到一个包含了 570 个溶剂分子,31 个溶质分子的立方体座标体系,pyrene_num_solv.gro,其相应的分子信息以可在其力场文件中查看到。

能量最小化,测试系统是否正确 (可省略)

此时,我们需要一个任务描述文件 *.mdp,即告诉 Gromacs 执行何种任务,以及如何执行,其我们在该任务文件中设定需要的参数。

譬如我们此时需要做一个能量最小化的模拟,Energy Minimization (em.mdp) 需要一个任务文件,该文件可在此处下载,其中参数可自行设定。而后执行如下命令,得到 ``

gmx grompp -f em.mdp -c pyrene_num_solv.gro -p pyrene.top

即可快速验证系统是否正确。

准备输入文件的代码汇总

# Get Mol2 files
obabel -:"C1=CC2=C3C(=C1)C=CC4=CC=CC(=C43)C=C2" -O pyrene.mol2 --gen3D
obabel -:"C1=CC2=C(C(=C1)O)C(=O)C3=C(C2=O)C=C(C=C3O)C(=O)O rhein" -O rhein.mol2 --gen3D

# Correct the bonds
perl sort_mol2_bonds.pl pyrene.mol2 pyrene_clean.mol2
perl sort_mol2_bonds.pl rhein.mol2 rhein_clean.mol2

# Get .str files
http://cgenff.umaryland.edu/

# Change .str to .top
python cgenff_charmm2gmx.py pyrene pyrene_clean.mol2 pyrene.str charmm36-jul2021.ff
python cgenff_charmm2gmx.py rhein rhein_clean.mol2 rhein.str charmm36-jul2021.ff

# Get .gro (atom location in a box)
gmx editconf -f pyrene_clean.pdb -o pyrene.gro -box 3 3 3
gmx editconf -f rhein_ini.pdb   -o rhein.gro -box 3 3 3

# Add more molecuels
gmx insert-molecules -f pyrene.gro -ci pyrene.gro -o sys_num.gro -nmol 2 -rot xyz -try 100
gmx insert-molecules -f sys_num.gro -ci rhein.gro -o sys_num2.gro -nmol 3 -rot xyz -try 100

# Add water (Remember to dupulicate and modify the sys.top file)
gmx solvate -cp pyrene_num.gro -cs spc216.gro -o pyrene_num_solv.gro -p sys.top

# Energy minimisaiton test
gmx grompp -f em.mdp -c pyrene_num_solv.gro -p pyrene.top
gmx mdrun -v

本地安装并启动 openmm-setup

conda deactivate # deactivate all virtual environments 
conda create -name openmm python=3 # create a virtual environment named openmm based on pyhton 3
conda activate openmm # # activate the openmm virtual environment
conda install -c conda-forge openmm-setup # install openmm 
openmm-setup # run openmm

在 google colab 中运行模拟

#@title  Install Dependencies
! wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh
! chmod +x Miniconda3-py37_4.8.2-Linux-x86_64.sh
! bash ./Miniconda3-py37_4.8.2-Linux-x86_64.sh -b -f -p /usr/local

import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')

#@title Install OpenMM
!yes|conda install -c conda-forge openmm

修改如下代码的相关地址,在 openmm 运行模拟。

#@title This script was generated by OpenMM-Setup on 2022-08-11.
%cd /content/drive/MyDrive/DSDS/openmm_gromacs/MolecularTOPGRO/pyrene

# This script was generated by OpenMM-Setup on 2022-08-14.

from openmm import *
from openmm.app import *
from openmm.unit import *

# Input Files

gro = GromacsGroFile('/content/drive/MyDrive/DSDS/openmm_gromacs/MolecularTOPGRO/pyrene/confout.gro')
top = GromacsTopFile('/content/drive/MyDrive/DSDS/openmm_gromacs/MolecularTOPGRO/pyrene/pyrene.top', includeDir='/content/drive/MyDrive/DSDS/openmm_gromacs/ForceFieldTOP',
    periodicBoxVectors=gro.getPeriodicBoxVectors())

# System Configuration

nonbondedMethod = PME
nonbondedCutoff = 1.0*nanometers
ewaldErrorTolerance = 0.0005
constraints = HBonds
rigidWater = True
constraintTolerance = 0.000001
hydrogenMass = 1.5*amu

# Integration Options

dt = 0.004*picoseconds
temperature = 300*kelvin
friction = 1.0/picosecond
pressure = 1.0*atmospheres
barostatInterval = 25

# Simulation Options

steps = 500000
equilibrationSteps = 1000
platform = Platform.getPlatformByName('CPU')
dcdReporter = DCDReporter('trajectory.dcd', 10000)
dataReporter = StateDataReporter('log.txt', 1000, totalSteps=steps,
    step=True, time=True, speed=True, progress=True, elapsedTime=True, remainingTime=True, potentialEnergy=True, kineticEnergy=True, totalEnergy=True, temperature=True, volume=True, density=True, separator='\t')
checkpointReporter = CheckpointReporter('checkpoint.chk', 10000)

# Prepare the Simulation

print('Building system...')
topology = top.topology
positions = gro.positions
system = top.createSystem(nonbondedMethod=nonbondedMethod, nonbondedCutoff=nonbondedCutoff,
    constraints=constraints, rigidWater=rigidWater, ewaldErrorTolerance=ewaldErrorTolerance, hydrogenMass=hydrogenMass)
system.addForce(MonteCarloBarostat(pressure, temperature, barostatInterval))
integrator = LangevinMiddleIntegrator(temperature, friction, dt)
integrator.setConstraintTolerance(constraintTolerance)
simulation = Simulation(topology, system, integrator, platform)
simulation.context.setPositions(positions)

# Minimize and Equilibrate

print('Performing energy minimization...')
simulation.minimizeEnergy()
print('Equilibrating...')
simulation.context.setVelocitiesToTemperature(temperature)
simulation.step(equilibrationSteps)

# Simulate

print('Simulating...')
simulation.reporters.append(dcdReporter)
simulation.reporters.append(dataReporter)
simulation.reporters.append(checkpointReporter)
simulation.currentStep = 0
simulation.step(steps)

模拟结果分析

可视化
gmx trjconv -f traj.trr -o system.pdb -pbc whole
gmx trjconv -f traj.trr -o system.pdb -pbc whole -b 100 -e 300 -skip 5 -sep

参考目录: