XiaO

基于 GROMACS 的小分子分子动力学模拟

XiaO / 2022-08-13


GROMACS 分子动力学模拟的输入文件:

GROMACS 分子动力学模拟的输出文件:

安装 GROMACS

在终端中安装 GROMACS

brew install gromacs

分子结构转换工具

在终端中安装 open-babel

brew install open-babel

在终端中转换 SMILES 格式

obabel -:"C1=CC2=C3C(=C1)C=CC4=CC=CC(=C43)C=C2" -O pyrene.pdb --gen3D

Automator 快捷转换不同文件(pdb, mol2, sdf, mol),适合转换大量的结构:

export PATH=/usr/local/bin:$PATH

for f in "$@"
do
	file_path=${f%/*}
  	file=$(basename $f)
  	file_name=${file%.*}
  	file_extension=${file##*.}
	if [[ $file_extension == "sdf" ]]
	then
		obabel -h -isdf $f -omol > $file_path/$file_name.mol
		obabel -h -isdf $f -opdb > $file_path/$file_name.pdb
		obabel -h -isdf $f -omol2 > $file_path/$file_name.mol2
	elif [[ $file_extension == "pdb" ]]
	then
		obabel -h -ipdb $f -omol > $file_path/$file_name.mol
		obabel -h -ipdb $f -omol2 > $file_path/$file_name.mol2
		obabel -h -ipdb $f -osdf > $file_path/$file_name.sdf
	elif [[ $file_extension == "mol" ]]
	then
		obabel -h --gen3d -imol $f -omol2 > $file_path/$file_name.mol2
		obabel -h --gen3d -imol $f -opdb > $file_path/$file_name.pdb
		obabel -h --gen3d -imol $f -osdf > $file_path/$file_name.sdf
	elif [[ $file_extension == "mol2" ]]
	then
		obabel -h -imol2 $f -omol > $file_path/$file_name.mol
		obabel -h -imol2 $f -opdb > $file_path/$file_name.pdb
		obabel -h -imol2 $f -osdf > $file_path/$file_name.sdf
	else
	fi
done

也可利用网络服务器进行结构转换。

该案例中,我们将模拟 pyrene 在水中的聚集行为,故在根目录下新建文件夹 pyrene

力场文件

用于描述小分子的力场各不相同,譬如 CHARMM36GAFF

准备小分子

下载 pyrne 的结构文件,将其放入工作目录文件夹 pyrene 中,并使用 open-babel 将其转换为 mol2 格式。亦可在终端中将其 SMILES 格式直接转换为 mol2 格式:

obabel -:"C1=CC2=C3C(=C1)C=CC4=CC=CC(=C43)C=C2" -O pyrene.mol2 --gen3D
@<TRIPOS>MOLECULE
pyrene
 26 29 0 0 0
SMALL
GASTEIGER

@<TRIPOS>ATOM
      1 C          -3.5106   -0.0399    0.0067 C.ar    1  UNL1       -0.0606
      2 C          -2.8279    1.1762    0.0094 C.ar    1  UNL1       -0.0540
      3 C          -1.4250    1.2088    0.0017 C.ar    1  UNL1       -0.0167
      4 C          -0.7054   -0.0086   -0.0013 C.ar    1  UNL1       -0.0025
      5 C          -1.3990   -1.2410   -0.0029 C.ar    1  UNL1       -0.0167
      6 C          -2.8013   -1.2403   -0.0021 C.ar    1  UNL1       -0.0540
      7 C          -0.6705   -2.4378   -0.0058 C.ar    1  UNL1       -0.0534
      8 C           0.7264   -2.4237   -0.0048 C.ar    1  UNL1       -0.0534
      9 C           1.4301   -1.2119   -0.0024 C.ar    1  UNL1       -0.0167
     10 C           2.8323   -1.1829   -0.0014 C.ar    1  UNL1       -0.0540
     11 C           3.5168    0.0315    0.0012 C.ar    1  UNL1       -0.0606
     12 C           2.8093    1.2334    0.0002 C.ar    1  UNL1       -0.0540
     13 C           1.4064    1.2373   -0.0022 C.ar    1  UNL1       -0.0167
     14 C           0.7114    0.0058   -0.0023 C.ar    1  UNL1       -0.0025
     15 C           0.6770    2.4345   -0.0062 C.ar    1  UNL1       -0.0534
     16 C          -0.7204    2.4205   -0.0041 C.ar    1  UNL1       -0.0534
     17 H          -4.5976   -0.0525    0.0110 H       1  UNL1        0.0618
     18 H          -3.4004    2.1011    0.0180 H       1  UNL1        0.0624
     19 H          -3.3523   -2.1781   -0.0069 H       1  UNL1        0.0624
     20 H          -1.1853   -3.3959   -0.0074 H       1  UNL1        0.0624
     21 H           1.2607   -3.3710   -0.0058 H       1  UNL1        0.0624
     22 H           3.4024   -2.1092   -0.0019 H       1  UNL1        0.0624
     23 H           4.6035    0.0418    0.0032 H       1  UNL1        0.0618
     24 H           3.3626    2.1700    0.0029 H       1  UNL1        0.0624
     25 H           1.1922    3.3926   -0.0110 H       1  UNL1        0.0624
     26 H          -1.2548    3.3679   -0.0081 H       1  UNL1        0.0624
@<TRIPOS>BOND
     1     1     2   ar
     2     2     3   ar
     3     3     4   ar
     4     4     5   ar
     5     5     6   ar
     6     1     6   ar
     7     5     7   ar
     8     7     8   ar
     9     8     9   ar
    10     9    10   ar
    11    10    11   ar
    12    11    12   ar
    13    12    13   ar
    14    13    14   ar
    15     9    14   ar
    16     4    14   ar
    17    13    15   ar
    18    15    16   ar
    19     3    16   ar
    20     1    17    1
    21     2    18    1
    22     6    19    1
    23     7    20    1
    24     8    21    1
    25    10    22    1
    26    11    23    1
    27    12    24    1
    28    15    25    1
    29    16    26    1

其实,亦可在生成 mol2 文件时,即将其名字 pyrene 放在引号内,如下:

obabel -:"C1=CC2=C3C(=C1)C=CC4=CC=CC(=C43)C=C2 pyrene" -O pyrene.mol2 --gen3D
perl sort_mol2_bonds.pl pyrene.mol2 pyrene_clean.mol2

生成分子的力场文件 *.top

根据选择力场的不同,用于生成分子力场文件的工具也不同。该案例中我们使用 CHARMM36 力场,可选用 CGenff 网络服务器 生成 CHARMM 格式的力场文件。在 CGenFF 服务器上注册并创建一个免费帐户,上传我们整理好的 pyrene_clean.mol2 文件,它会生成 pyrene_clean.str 文件,该文件即是 CHARMM 格式的力场文件,下载该文件。

因为我们需在 GROMACS 中进行分子动力学模拟,故需要将上述 CHARMM 格式的力场文件转换为 GROMACS 格式的力场文件。下载该 python2 脚本,放入我们的工作目录中。创建一个适合该文件运行 python2 环境:

# 创建虚拟环境
conda create cgenff_charmm2gmx
conda activate cgenff_charmm2gmx

# 安装相应的包(注意版本)
conda install python==2.7
conda install numpy==1.12.1
pip install networkx==1.11 
pip install decorator==3.4.2

# 执行力场文件的格式转换
python cgenff_charmm2gmx.py pyrene pyrene_clean.mol2 pyrene.str charmm36-jul2021.ff

成功运行后,我们即可得到 pyrene 小分子的力场文件 pyrene.top,该分子单独的力场文件为 pyrene.itp,并被引入到 pyrene.top 中,该分子所需的其他参数被写入 pyrene.prm 文件中,一并被引入到 pyrene.top 中。

NOTE1: Code tested with python 2.7.3. Your version: 2.7.18 |Anaconda, Inc.| (default, Apr 23 2020, 17:44:47) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

NOTE2: Please be sure to use the same version of CGenFF in your simulations that was used during parameter generation:
--Version of CGenFF detected in  pyrene.str : 4.6
--Version of CGenFF detected in  charmm36-jul2021.ff/forcefield.doc : 4.6

NOTE3: In order to avoid duplicated parameters, do NOT select the 'Include parameters that are already in CGenFF' option when uploading a molecule into CGenFF.
============ DONE ============
Conversion complete.
The molecule topology has been written to pyrene.itp
Additional parameters needed by the molecule are written to pyrene.prm, which needs to be included in the system .top
============ DONE ============

定义溶剂盒(分子座标,*.gro)

我们将需要模拟的分子和溶剂一并放入某个盒子(边界),以便观测分子在该体系(溶剂,离子,浓度)下的运动轨迹。

为此,我们需要格式为 pdb 的分子结构文件:

然后在终端执行如下代码,如此,我们便得到一个长宽高均为 3 nm 且仅含有一个小分子座标信息文件 pyrene.gro

gmx editconf -f pyrene_clean.pdb -o pyrene.gro -box 3 3 3
OR
gmx editconf -f pyrene_ini.pdb   -o pyrene.gro -box 3 3 3

Read 26 atoms
No velocities found
    system size :  0.920  0.679  0.003 (nm)
    center      :  0.000 -0.000 -0.000 (nm)
    box vectors :  0.000  0.000  0.000 (nm)
    box angles  :   0.00   0.00   0.00 (degrees)
    box volume  :   0.00               (nm^3)
    shift       :  1.500  1.500  1.500 (nm)
new center      :  1.500  1.500  1.500 (nm)
new box vectors :  3.000  3.000  3.000 (nm)
new box angles  :  90.00  90.00  90.00 (degrees)
new box volume  :  27.00               (nm^3)

向盒子中添加更多的分子(分子浓度)

向该分子座标文件 pyrene.gro 中随机添加 30 个相同的分子,得到一个含有 31 个小分子的座标文件 pyrene_num.gro注意修改 pyrene.top 文件中, [ molecules ] 项下的分子数量,1 ==> 31

gmx insert-molecules -f pyrene.gro -ci pyrene.gro -o pyrene_num.gro -nmol 30 -rot xyz -try 100

Reading solute configuration
Initialising inter-atomic distances...

WARNING: Masses and atomic (Van der Waals) radii will be guessed
         based on residue and atom names, since they could not be
         definitively assigned from the information in your input
         files. These guessed numbers might deviate from the mass
         and radius of the atom type. Please check the output
         files if necessary. Note, that this functionality may
         be removed in a future GROMACS version. Please, consider
         using another file format for your input.

NOTE: From version 5.0 gmx insert-molecules uses the Van der Waals radii
from the source below. This means the results may be different
compared to previous GROMACS versions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
A. Bondi
van der Waals Volumes and Radii
J. Phys. Chem. 68 (1964) pp. 441-451
-------- -------- --- Thank You --- -------- --------

Using random seed -219709635
Try 1 success (now 52 atoms)!
Try 2 success (now 78 atoms)!
Try 3 success (now 104 atoms)!
Try 4 success (now 130 atoms)!
Try 5 success (now 156 atoms)!
Try 7 success (now 182 atoms)!
Try 8 success (now 208 atoms)!
Try 12 success (now 234 atoms)!
Try 13 success (now 260 atoms)!
Try 14 success (now 286 atoms)!
Try 15 success (now 312 atoms)!
Try 16 success (now 338 atoms)!
Try 19 success (now 364 atoms)!
Try 22 success (now 390 atoms)!
Try 25 success (now 416 atoms)!
Try 30 success (now 442 atoms)!
Try 32 success (now 468 atoms)!
Try 34 success (now 494 atoms)!
Try 40 success (now 520 atoms)!
Try 44 success (now 546 atoms)!
Try 49 success (now 572 atoms)!
Try 59 success (now 598 atoms)!
Try 61 success (now 624 atoms)!
Try 62 success (now 650 atoms)!
Try 65 success (now 676 atoms)!
Try 66 success (now 702 atoms)!
Try 67 success (now 728 atoms)!
Try 75 success (now 754 atoms)!
Try 82 success (now 780 atoms)!
Try 83 success (now 806 atoms)!

Added 30 molecules (out of 30 requested)
Writing generated configuration to pyrene_num.gro

Output configuration contains 806 atoms in 31 residues

向盒子中添加溶剂

gmx solvate -cp pyrene_num.gro -cs spc216.gro -o pyrene_num_solv.gro -p pyrene.top


Reading solute configuration
Reading solvent configuration

Initialising inter-atomic distances...

WARNING: Masses and atomic (Van der Waals) radii will be guessed
         based on residue and atom names, since they could not be
         definitively assigned from the information in your input
         files. These guessed numbers might deviate from the mass
         and radius of the atom type. Please check the output
         files if necessary. Note, that this functionality may
         be removed in a future GROMACS version. Please, consider
         using another file format for your input.

NOTE: From version 5.0 gmx solvate uses the Van der Waals radii
from the source below. This means the results may be different
compared to previous GROMACS versions.

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
A. Bondi
van der Waals Volumes and Radii
J. Phys. Chem. 68 (1964) pp. 441-451
-------- -------- --- Thank You --- -------- --------

Generating solvent configuration
Will generate new solvent configuration of 2x2x2 boxes
Solvent box contains 3618 atoms in 1206 residues
Removed 966 solvent atoms due to solvent-solvent overlap
Removed 942 solvent atoms due to solute-solvent overlap
Sorting configuration
Found 1 molecule type:
    SOL (   3 atoms):   570 residues
Generated solvent containing 1710 atoms in 570 residues
Writing generated configuration to pyrene_num_solv.gro

Output configuration contains 2516 atoms in 601 residues
Volume                 :          27 (nm^3)
Density                :     1017.14 (g/l)
Number of solvent molecules:    570   

Processing topology
Adding line for 570 solvent molecules with resname (SOL) to topology file (pyrene.top)

Back Off! I just backed up pyrene.top to ./#pyrene.top.1#

如此,我们便得到一个包含了 570 个溶剂分子,31 个溶质分子的立方体座标体系,pyrene_num_solv.gro,其相应的分子信息以可在其力场文件中查看到。

能量最小化测试

此时,我们需要一个任务描述文件 *.mdp,即告诉 GROMACS 执行何种任务,以及如何执行,其我们在该任务文件中设定需要的参数。

譬如我们此时需要做一个能量最小化的模拟,Energy Minimization (em.mdp) 需要一个任务文件,该文件可在此处下载,其中参数可自行设定。而后执行如下命令,得到 ``

gmx grompp -f em.mdp -c pyrene_num_solv.gro -p pyrene.top

即可快速验证系统是否正确。

准备输入文件的代码集合:

perl sort_mol2_bonds.pl pyrene.mol2 pyrene_clean.mol2

python cgenff_charmm2gmx.py pyrene pyrene_clean.mol2 pyrene.str charmm36-jul2021.ff

gmx editconf -f pyrene_clean.pdb -o pyrene.gro -box 3 3 3

gmx insert-molecules -f pyrene.gro -ci pyrene.gro -o pyrene_num.gro -nmol 30 -rot xyz -try 100

gmx solvate -cp pyrene_num.gro -cs spc216.gro -o pyrene_num_solv.gro -p pyrene.top

gmx grompp -f em.mdp -c pyrene_num_solv.gro -p pyrene.top
gmx mdrun -v

在本地 openmm 中设定运行参数

conda install -c conda-forge openmm
conda activate openmm

openmm-setup

在 google colab 中运行模拟

#@title  Install Dependencies
! wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh
! chmod +x Miniconda3-py37_4.8.2-Linux-x86_64.sh
! bash ./Miniconda3-py37_4.8.2-Linux-x86_64.sh -b -f -p /usr/local

import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')

#@title Install OpenMM
!yes|conda install -c conda-forge openmm

修改如下代码的相关地址,在 openmm 运行模拟。

#@title This script was generated by OpenMM-Setup on 2022-08-11.
%cd /content/drive/MyDrive/DSDS/openmm_gromacs/MolecularTOPGRO/pyrene

# This script was generated by OpenMM-Setup on 2022-08-14.

from openmm import *
from openmm.app import *
from openmm.unit import *

# Input Files

gro = GromacsGroFile('/content/drive/MyDrive/DSDS/openmm_gromacs/MolecularTOPGRO/pyrene/confout.gro')
top = GromacsTopFile('/content/drive/MyDrive/DSDS/openmm_gromacs/MolecularTOPGRO/pyrene/pyrene.top', includeDir='/content/drive/MyDrive/DSDS/openmm_gromacs/ForceFieldTOP',
    periodicBoxVectors=gro.getPeriodicBoxVectors())

# System Configuration

nonbondedMethod = PME
nonbondedCutoff = 1.0*nanometers
ewaldErrorTolerance = 0.0005
constraints = HBonds
rigidWater = True
constraintTolerance = 0.000001
hydrogenMass = 1.5*amu

# Integration Options

dt = 0.004*picoseconds
temperature = 300*kelvin
friction = 1.0/picosecond
pressure = 1.0*atmospheres
barostatInterval = 25

# Simulation Options

steps = 500000
equilibrationSteps = 1000
platform = Platform.getPlatformByName('CPU')
dcdReporter = DCDReporter('trajectory.dcd', 10000)
dataReporter = StateDataReporter('log.txt', 1000, totalSteps=steps,
    step=True, time=True, speed=True, progress=True, elapsedTime=True, remainingTime=True, potentialEnergy=True, kineticEnergy=True, totalEnergy=True, temperature=True, volume=True, density=True, separator='\t')
checkpointReporter = CheckpointReporter('checkpoint.chk', 10000)

# Prepare the Simulation

print('Building system...')
topology = top.topology
positions = gro.positions
system = top.createSystem(nonbondedMethod=nonbondedMethod, nonbondedCutoff=nonbondedCutoff,
    constraints=constraints, rigidWater=rigidWater, ewaldErrorTolerance=ewaldErrorTolerance, hydrogenMass=hydrogenMass)
system.addForce(MonteCarloBarostat(pressure, temperature, barostatInterval))
integrator = LangevinMiddleIntegrator(temperature, friction, dt)
integrator.setConstraintTolerance(constraintTolerance)
simulation = Simulation(topology, system, integrator, platform)
simulation.context.setPositions(positions)

# Minimize and Equilibrate

print('Performing energy minimization...')
simulation.minimizeEnergy()
print('Equilibrating...')
simulation.context.setVelocitiesToTemperature(temperature)
simulation.step(equilibrationSteps)

# Simulate

print('Simulating...')
simulation.reporters.append(dcdReporter)
simulation.reporters.append(dataReporter)
simulation.reporters.append(checkpointReporter)
simulation.currentStep = 0
simulation.step(steps)

模拟结果分析

可视化
gmx trjconv -f traj.trr -o system.pdb -pbc whole
gmx trjconv -f traj.trr -o system.pdb -pbc whole -b 100 -e 300 -skip 5 -sep

参考目录: