XiaO

分子动力学模拟蛋白与小分子相互作用

XiaO / 2022-08-09


介绍

AMBER 是辅助模型构建和能量精化 (Assisted Model Building and Energy Refinement) 的首字母缩写, 它既指代分子动力学程序,也指代一组描述了生物分子相互作用的势能函数和参数的力场

准备蛋白

conda create --name AmberTools22 # 新建虚拟环境
conda activate AmberTools22 # 激活该虚拟环境
conda install -c conda-forge ambertools=22 compilers # 安装 AmberTools22

# 用上一步生成的 top crd 文件,合成模拟所需的 pdb 文件
ambpdb -p 0.15_80_10_pH6.5_1ODX.top -c 0.15_80_10_pH6.5_1ODX.crd > 0.15_80_10_pH6.5_1ODX.pdb 

准备小分子

从前述下载的蛋白文件 1ODX.pdb 中提取小分子,该文件是一个蛋白与一个小分子的复合文件,也可以使用其它小分子。

pdb4amber -i 1ODX.pdb -o 1ODX_clean.pdb # 可对文件标准化处理 (clean)
==================================================
Summary of pdb4amber for: /Users/urzone/pdb4amber/1odx.pdb
===================================================
----------Chains
The following (original) chains have been found:
A
B
---------- Alternate Locations (Original Residues!))

The following residues had alternate locations:
None
-----------Non-standard-resnames (小分子文件)
0E8

---------- Missing heavy atom(s)
None
awk '$4=="0E8"' 1ODX.pdb > ligand1.pdb # 提取 pdb 文件的第四列,即小分子

reduce ligand1.pdb > ligand2.pdb # 还原,即对小分子加氢,即 H
#如果加氢失败,请使用open babel 加氢。
#conda install -c conda-forge openbabel
#obabel -ipdb ligand1.pdb -opdb -O ligand2.pdb -h

pdb4amber -i ligand2.pdb -o ligand3.pdb # 对文件标准化处理,即 clean
antechamber -fi pdb -i ligand3.pdb -fo mol2 -o ligand4.mol2 -c bcc -pf y 
# This step may take a little bit of time, upon finish, check "sqm.out", the last line should be
#            --------- Calculation Completed ----------
antechamber -i ligand4.mol2 -fi mol2 -o ligand5.prepi -fo prepi -pf y

parmchk2 -f prepi -i ligand5.prepi -o ligand6.frcmod

合并小分子与蛋白

将前述处理好的小分子 (ligand3.pdb) 与蛋白 (0.15_80_10_pH6.5_1ODX.pdb) 合并为一个文件 (1ODX_H.pdb),并对复合物文件进行标准化 (1ODX_clean_H.pdb):

cat 0.15_80_10_pH6.5_1ODX.pdb ligand3.pdb > 1ODX_H.pdb 

pdb4amber -i 1ODX_H.pdb -o 1ODX_clean_H.pdb

生成 prmtop 与 inpcrd

source leaprc.protein.ff14SB              #Source leaprc file for ff14SB protein force field
source leaprc.gaff                        #Source leaprc file for gaff
source leaprc.water.tip3p                 #Source leaprc file for TIP3P water model
loadamberprep ligand5.prepi               #Load the prepi file for the ligand
loadamberparams ligand6.frcmod            #Load the additional frcmod file for ligand
mol = loadpdb 1ODX_clean_H.pdb            #Load PDB file for protein-ligand complex
solvatebox mol TIP3PBOX 8                 #Solvate the complex with a cubic water box
addions mol Cl- 0                         #Add Cl- ions to neutralize the system
saveamberparm mol 1ODX.prmtop 1ODX.inpcrd #Save AMBER topology and coordinate files
quit                                      # Quit tleap program

在终端运行该文件 1ODX_tleap.in,得到文件 prmtop 与 inpcrd:

tleap -s -f 1ODX_tleap.in > 1ODX_tleap.out

模拟条件设置

本地安装并启动 openmm-setup

conda deactivate # deactivate all virtual environments 
conda create -name openmm python=3 # create a virtual environment named openmm based on pyhton 3
conda activate openmm # # activate the openmm virtual environment
conda install -c conda-forge openmm-setup # install openmm 
openmm-setup # run openmm

对模拟条件进行设置:

启用 google colab 模拟

安装环境:

#@title  Install Dependencies
! wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh
! chmod +x Miniconda3-py37_4.8.2-Linux-x86_64.sh
! bash ./Miniconda3-py37_4.8.2-Linux-x86_64.sh -b -f -p /usr/local

import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')

安装 OpenMM:

#@title Install OpenMM
!yes|conda install -c conda-forge openmm

测试 OpenMM 并查看可用的平台:

#@title Verify OpenMM installation and check the available Platforms
!python -m openmm.testInstallation

模拟蛋白与分子相互作用:

复制 run_openmm_simulation.py 中的代码到此处进行模拟即可,注意修改输入文件的位置

得到蛋白与小分子相互作用的轨迹文件: trajectory.dcd。

轨迹分析: R 包 Bio3D

在终端中使用 brew 安装 r 及 rstudio:

brew indatll r rstudio

打开 rstudio,安装 bio3d:

install.packages("bio3d", dependencies=TRUE)
# load package
library(bio3d) 

# load demo
# demo('pdb')
# demo("md") 

# read data files: dcd and complex.pdb ()
mydcdfile <- "path/to/trajectory.dcd"
mypdbfile <- "path/to/complex.pdb"

# variables
dcd <- read.dcd(mydcdfile)
pdb <- read.pdb(mypdbfile)

# 查看文件信息
print(pdb)
print(dcd)

#  叠合以氨基酸的 Alpha 碳原子为基准,选中阿尔法碳
# Trajectory Frame Superposition on Calpha atoms
ca.inds <- atom.select(pdb, "ligand")
# http://thegrantlab.org/bio3d_v2/html/atom.select.html
#Input selection criteria include selection string keywords (such as "calpha", "backbone", "sidechain", "protein", "nucleic", "ligand", etc.) and individual named selection components (including ‘chain’, ‘resno’, ‘resid’, ‘elety’ etc.).


# 轨迹叠合
xyz <- fit.xyz(fixed=pdb$xyz, 
               mobile=dcd,
               fixed.inds=ca.inds$xyz,
               mobile.inds=ca.inds$xyz)

# 检查叠合是否完整 "[1] TRUE TRUE" 则无误
dim(xyz) == dim(dcd)


# RMSD 是阿尔法碳原子坐标相对于初始坐标的根平均离散方差,它反映了蛋白阿尔法碳原子在空间上随时间的扰动程度
# 如果 RMSD 不再变化,说明模拟已经达到了一个相对平衡的状态
# 输出 RMSD 曲线图
# Root Mean Square Deviation (RMSD)
rd <- rmsd(xyz[1, ca.inds$xyz], xyz[, ca.inds$xyz])
plot(rd, typ = "l", ylab = "RMSD", xlab = "Frame No.")


points(lowess(rd), typ = "l", col = "red", lty = 2, lwd = 2)
summary(rd)

# Root Mean Squared Fluctuations (RMSF)
rf <- rmsf(xyz[, ca.inds$xyz])
plot(rf, ylab = "RMSF", xlab = "Residue Position", typ="l")

# Principal Component Analysis
pc <- pca.xyz(xyz[, ca.inds$xyz])
plot(pc, col = bwr.colors(nrow(xyz)))

# Cluster in PC space
hc <- hclust(dist(pc$z[, 1:2]))
grps <- cutree(hc, k = 2)
plot(pc, col = grps)

# Cross-Correlation Analysis
cij <- dccm(xyz[, ca.inds$xyz])
plot(cij)

## view.dccm(cij, pdb, launch = TRUE)

GAFFTemplateGenerator


参考资料:

  1. 如何从零开始做一个蛋白小分子动力学模拟
  2. 蛋白 - 小分子动力学模拟轨迹文件分析 - R you ready?
  3. 使用 Bio3D 进行互相关分析