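Installing and Training the GeoAB Model
GeoAB is an AI model system focused on antibody drug development. It tackles two core problems: using deep learning to predict the 3D structure of antibodies (especially the antigen-binding CDR regions) and generate antibody variants that respect physical laws and biological properties; and simulating the affinity maturation process of the natural immune system to optimize, in a directed way, how strongly an antibody binds its antigen.
Paper: GeoAB: Towards Realistic Antibody Design and Reliable Affinity Maturation
The GitHub repo does give installation steps, but they are hard to say anything nice about ... I could not get it installed for ages. The pip libraries and conda packages were mixed together, and it took a lot of fiddling before it finally worked.
The official GitHub link is: https://github.com/EDAPINENUT/GeoAB
I'll just post my system information and conda configuration here.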
[kl2@localhost ~]$ cat /etc/centos-release
CentOS Linux release 8.5.2111
[kl2@localhost ~]$ nvidia-smi
Mon Mar 31 10:26:43 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03 Driver Version: 560.35.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX A6000 Off | 00000000:17:00.0 Off | Off |
| 58% 82C P0 246W / 300W | 8805MiB / 49140MiB | 76% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 2931 G /usr/libexec/Xorg 4MiB |
| 0 N/A N/A 2378303 C python3 8782MiB |
+-----------------------------------------------------------------------------------------+
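Quite a few system-level dependencies are involved, so I install the conda part and the pip part separately. For conda, you can import the yml below directly: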
name: geoab
channels:
- https://conda.rosettacommons.org
- conda-forge
- bioconda
- defaults
dependencies:
- python==3.9
- _libgcc_mutex=0.1
- _openmp_mutex=4.5
- bzip2=1.0.8
- ca-certificates=2024.2.2
- c-ares=1.19.1
- krb5=1.21.2
- libblas=3.9.0
- libcblas=3.9.0
- liblapack=3.9.0
- libopenblas=0.3.24
- libstdcxx-ng=13.2.0
- libgcc-ng=13.1.0
- libgfortran-ng=13.2.0
- libgfortran5=13.2.0
- libzlib=1.2.13
- openssl=3.2.1
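A minimal sketch of importing the yml, assuming it has been saved as geoab.yml in the current directory (a hypothetical file name, not something the repository provides). The guard only skips the step when conda or the file is missing:

```shell
# Hypothetical file name for the environment definition shown above.
ENV_FILE="geoab.yml"

if command -v conda >/dev/null 2>&1 && [ -f "$ENV_FILE" ]; then
    # Creates the environment named in the file's "name:" field (geoab).
    conda env create -f "$ENV_FILE"
else
    echo "conda or $ENV_FILE not found; skipping environment creation" >&2
fi
```

After it finishes, `conda activate geoab` switches into the new environment.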
Then activate the environment and install the pip dependencies:
aiohttp==3.11.14
biopython==1.81
biotite==0.38.0
e3nn==0.5.1
easydict==1.13
fair-esm==2.0.0
matplotlib==3.8.0
nni==3.0
pip==25.0.1
pytorch-lightning==2.0.9
rdkit==2023.3.2
tensorboard==2.13.0
torch_geometric==2.3.1
torch-scatter==2.1.2+pt20cpu
torchaudio==2.0.2
torchvision==0.15.2
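Because torch-scatter did not match the system's CUDA version, I installed the prebuilt CPU version here.
Once everything is installed, stay inside the environment and clone the official repository: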
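A version such as 2.1.2+pt20cpu carries a local tag, and builds like that are, as far as I know, not served from PyPI, so pip needs to be pointed at an extra wheel index. The sketch below assumes the pinned list above is saved as requirements.txt (a hypothetical file name) and that the PyG wheel index for PyTorch 2.0.x CPU builds is the right place to find that wheel. Both are assumptions, not something stated in the GeoAB repository:

```shell
# Assumption: the pinned package list above is saved under this name.
REQ_FILE="requirements.txt"
# Assumption: PyG wheel index for PyTorch 2.0.x CPU builds (hosts the +pt20cpu wheels).
PYG_WHEELS="https://data.pyg.org/whl/torch-2.0.0+cpu.html"

if [ -f "$REQ_FILE" ]; then
    # -f (--find-links) adds the extra index so pip can resolve torch-scatter.
    python -m pip install -r "$REQ_FILE" -f "$PYG_WHEELS"
else
    echo "$REQ_FILE not found; save the pinned list first" >&2
fi
```

Run it with the geoab environment active so that `python -m pip` installs into that environment rather than the system Python.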
git clone https://github.com/EDAPINENUT/GeoAB.git
cd GeoAB
Next, download the dataset:
wget "https://drive.usercontent.google.com/download?id=1UwFDuQSE_7gEvrfQkEhqOp2h1NfGAHua&export=download&authuser=0&confirm=t&uuid=7e68c0a8-d842-4cb7-b09c-48e7841cb0b1&at=AEz70l7zL3Yrg_jnQ6f5lACx2v-e:1743385403752" -O all_data.zip
unzip all_data.zip
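If unzip complains about the file, one possible cause is that the Google Drive link returned an HTML page instead of the archive; the uuid and at parameters in the URL look session-specific and may stop working. A small sanity check, sketched here with a hypothetical helper name (looks_like_zip), is to test for the ZIP magic bytes:

```shell
# Succeeds only if the file exists and starts with the ZIP magic bytes "PK".
looks_like_zip() {
    [ -f "$1" ] && [ "$(head -c 2 "$1")" = "PK" ]
}

if looks_like_zip all_data.zip; then
    echo "all_data.zip looks like a ZIP archive"
else
    echo "all_data.zip is missing or not a ZIP archive; try downloading it in a browser" >&2
fi
```

This only inspects the first two bytes, so it catches an HTML error page but does not prove the archive is complete.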
Run the following command for training:
# Train GeoAB-Refiner
python train_refine.py
# Train GeoAB-Initializer
python train_init.py
# After GeoAB-Initializer is trained, train GeoAB-Designer
python train_design.py
For evaluation, run the following command:
# Evaluate GeoAB-Refiner
python eval.py --eval_dir H3_refine --run 1
# Evaluate GeoAB-Designer
python eval.py --eval_dir H3_design