Modeling the interaction between proteins and ligands and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advancements in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as two prominent approaches. However, these methods have notable limitations. Sampling-based methods often suffer from low efficiency due to the need for generating multiple candidate structures for selection. On the other hand, regression-based methods offer fast predictions but may experience decreased accuracy. Additionally, the variation in protein sizes often requires external modules for selecting suitable binding pockets, further impacting efficiency. In this work, we propose FABind, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding. FABind incorporates a unique ligand-informed pocket prediction module, which is also leveraged for docking pose estimation. The model further enhances the docking process by incrementally integrating the predicted pocket to optimize protein-ligand binding, reducing discrepancies between training and inference. Through extensive experiments on benchmark datasets, our proposed FABind demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
Figure: ligand poses for selected complexes, each labeled with its PDB ID and RMSD. Red: ground-truth ligand pose. Green: ligand pose predicted by FABind.
For a more rigorous evaluation, we apply an additional filtering step that keeps only test samples whose protein UniProt IDs do not appear in the data seen during training and validation. The resulting subset of 144 complexes was then used to evaluate FABind. On this subset, FABind surpasses other deep learning and traditional methods by a significant margin across all evaluation metrics, which indicates strong generalization to unseen proteins.
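The filtering step can be illustrated with a minimal sketch. It assumes each sample is a dictionary carrying a `uniprot_id` field; the function name, field names, and toy identifiers below are hypothetical and are not taken from the FABind code base.

```python
def filter_unseen_proteins(test_samples, seen_uniprot_ids):
    """Keep only test samples whose UniProt ID is absent from train/validation.

    `test_samples` is assumed to be a list of dicts with a "uniprot_id" key
    (a hypothetical record layout chosen for illustration).
    """
    seen = set(seen_uniprot_ids)  # set lookup keeps the filter linear in size
    return [s for s in test_samples if s["uniprot_id"] not in seen]


# Toy placeholder records, not real PDBBind entries.
toy_test_set = [
    {"complex": "complex_a", "uniprot_id": "U001"},
    {"complex": "complex_b", "uniprot_id": "U002"},
    {"complex": "complex_c", "uniprot_id": "U003"},
]
toy_seen_ids = ["U001", "U003", "U999"]

unseen_subset = filter_unseen_proteins(toy_test_set, toy_seen_ids)
```

In this toy example only `complex_b` survives, since its UniProt ID is the only one missing from the seen set.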
In real-world applications, docking is often performed on apo structures or on holo structures that are bound to different ligands. A recent benchmark therefore pairs the crystal complex structures of PDBBind with protein structures predicted by ESMFold. To validate FABind in this apo-structure docking scenario, we evaluated it under the same settings as DiffDock. FABind outperforms DiffDock, achieving an RMSD below 2 Å on 24.9% of the complexes with ESMFold-predicted structures.
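As a rough illustration of the metric quoted above, the sketch below computes a plain RMSD between predicted and reference ligand coordinates and the fraction of complexes under a 2 Å threshold. It assumes both poses list the same atoms in the same order and applies no symmetry correction or alignment; the function names are hypothetical, and this is not necessarily the exact evaluation protocol used for FABind.

```python
import math


def ligand_rmsd(pred_coords, true_coords):
    """Plain RMSD (in the units of the inputs) between two atom-matched poses.

    Assumes identical atom ordering and no symmetry correction.
    """
    if not pred_coords or len(pred_coords) != len(true_coords):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared_error = sum(
        (p - t) ** 2
        for pred_atom, true_atom in zip(pred_coords, true_coords)
        for p, t in zip(pred_atom, true_atom)
    )
    return math.sqrt(squared_error / len(pred_coords))


def success_rate(rmsds, threshold=2.0):
    """Fraction of complexes whose RMSD is strictly below `threshold`."""
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    return sum(1 for r in rmsds if r < threshold) / len(rmsds)


# Toy coordinates: the predicted pose is shifted by 1 unit along z.
toy_pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_true = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = ligand_rmsd(toy_pred, toy_true)

# Toy RMSD values for four hypothetical complexes.
toy_rate = success_rate([1.0, 2.5, 1.9, 3.0])
```

A reported figure such as 24.9% corresponds to `success_rate` evaluated over the per-complex RMSDs of the whole benchmark.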
@inproceedings{
pei2023fabind,
title={{FAB}ind: Fast and Accurate Protein-Ligand Binding},
author={Qizhi Pei and Kaiyuan Gao and Lijun Wu and Jinhua Zhu and Yingce Xia and Shufang Xie and Tao Qin and Kun He and Tie-Yan Liu and Rui Yan},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=PnWakgg1RL}
}