TransTCR: Integrating TCRs and Transcriptomes through Optimal Transport for Antigen Specificity Prediction
TransTCR is an unsupervised multimodal representation learning framework that integrates single-cell TCR sequencing (scTCR-seq) and single-cell RNA sequencing (scRNA-seq) data through Optimal Transport (OT) for antigen specificity prediction and cell clustering.
This repository contains the official implementation of TransTCR.
Table of Contents
- Datasets
- Installation
- Usage
The raw data can be downloaded here:
| Dataset | Download |
|---|---|
| D1 | Link |
| D2 | Link |
| D3 | Link |
| D4 | Link |
The processed datasets are available on Zenodo.
To reproduce TransTCR, we suggest first creating a conda environment by:
conda create -n TransTCR python=3.9.21
conda activate TransTCR
Then install the required packages listed below:
- scanpy=1.9.1
- scib=1.1.7
- scipy=1.13.2
- torch=2.6.0
- pot=0.9.5
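Once the packages are installed, the environment can be checked against the pinned versions. The snippet below is a minimal sketch using only the standard library. It assumes the names in the list match the installed distribution names (POT, for example, is published as `POT`, and name lookup is expected to be case-insensitive). The helper `check_environment` is a hypothetical name and not part of this repository.

```python
from importlib import metadata

# Versions copied from the package list above.
REQUIRED = {
    "scanpy": "1.9.1",
    "scib": "1.1.7",
    "scipy": "1.13.2",
    "torch": "2.6.0",
    "pot": "0.9.5",
}


def check_environment(required):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # torch wheels may carry a local build tag such as "2.6.0+cu124",
        # so only the part before "+" is compared.
        if installed is None or installed.split("+")[0] != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_environment(REQUIRED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is present at the expected version.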
Before running TransTCR, the paired scTCR-seq and scRNA-seq data (in h5ad and csv formats) must be preprocessed.
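"Paired" here means that each cell has both a TCR record and an expression profile. The sketch below illustrates one way to align the two by cell barcode with the standard library alone. It assumes the TCR annotations are in the csv file and the expression data in the h5ad file. The column names (`barcode`, `cdr3a`, `cdr3b`), the barcodes, the sequences, and the helper `pair_by_barcode` are all hypothetical and may differ from what the scripts in this repository expect.

```python
import csv
import io


def pair_by_barcode(tcr_csv, rna_barcodes, key="barcode"):
    """Keep TCR rows whose barcode also appears in the scRNA-seq data.

    Rows are returned in the order of `rna_barcodes`, so row i of the
    result lines up with the i-th retained cell of the expression matrix.
    """
    rows = {row[key]: row for row in csv.DictReader(tcr_csv)}
    shared = [bc for bc in rna_barcodes if bc in rows]
    return shared, [rows[bc] for bc in shared]


# Toy stand-ins: column names, barcodes, and sequences are made up.
tcr_file = io.StringIO(
    "barcode,cdr3a,cdr3b\n"
    "AAAC-1,CAVRDSNYQLIW,CASSLGQAYEQYF\n"
    "CCCG-1,CAASGGSYIPTF,CASSPGTGGYEQYF\n"
)
rna_cells = ["CCCG-1", "TTTA-1", "AAAC-1"]  # "TTTA-1" has no TCR record

shared, tcr_rows = pair_by_barcode(tcr_file, rna_cells)
print(shared)  # → ['CCCG-1', 'AAAC-1']
```

With an h5ad file loaded through anndata, `rna_barcodes` would typically be `adata.obs_names`, and `shared` could then be used to subset the AnnData object.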
- Process scTCR-seq Data
We use the pre-trained TCR-BERT to encode CDR3 sequences from both TCR chains (TCR-BERT must be downloaded separately):
cd Process/TCR
bash get_emb.sh
- Process scRNA-seq Data
We use CellFM, a recently published foundation model for single-cell data, to process the scRNA-seq data (CellFM must be downloaded separately).
- Train and evaluate on intra-dataset classification and clustering:
bash run_Intra.sh
- Train and evaluate on inter-dataset classification and clustering:
bash run_Inter.sh
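For readers new to optimal transport, the following toy example shows how an entropy-regularised transport plan (Sinkhorn iterations) can softly match points from two modalities. It is an illustration only and not the TransTCR training code. It assumes both embeddings have already been mapped into a shared space of equal dimension, uses uniform cell weights, and is written in pure Python to stay dependency-free. The function names and toy coordinates are made up for the example.

```python
import math


def squared_distances(xs, ys):
    """Pairwise squared Euclidean distances between two lists of vectors."""
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]


def sinkhorn_plan(cost, reg=0.1, n_iter=200):
    """Entropy-regularised OT plan between two uniform distributions."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # uniform mass on the source points
    b = [1.0 / m] * m  # uniform mass on the target points
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy 2-D embeddings for three cells in each modality (made-up values).
tcr_emb = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
rna_emb = [[0.1, 0.0], [1.0, 0.9], [2.1, 0.1]]

plan = sinkhorn_plan(squared_distances(tcr_emb, rna_emb), reg=0.1)

# For each TCR embedding, the RNA embedding that receives the most mass.
matches = [max(range(len(row)), key=lambda j: row[j]) for row in plan]
print(matches)  # → [0, 1, 2]
```

In practice the same plan can be obtained from the POT package listed under Installation via `ot.sinkhorn(a, b, M, reg)`, which is considerably faster on real data.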
If you find our code useful, please consider citing our work:
@article{TransTCR,
title={TransTCR: Integrating TCRs and Transcriptomes through Optimal Transport for Antigen Specificity Prediction},
author={Yuansong Zeng and Wenbing Li and Ruipeng Huang and Yuanze Chen and Jinyun Niu and Ningyuan Shangguan and Siyuan He and Yuedong Yang},
journal={},
year={2025},
}
