ParGNN has been accepted by DAC 2025.

ParGNN is an efficient full-batch training system for GNNs. It adopts a profiler-guided adaptive load-balancing partition method (PGALB) and a subgraph pipeline algorithm to overlap communication and computation.
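The communication/computation overlap idea can be sketched as a simple two-stage pipeline: while the current subgraph is being computed, the next subgraph's remote features are fetched in the background. This is an illustrative, self-contained sketch, not ParGNN's actual implementation; `fetch_halo` and `compute` are hypothetical stand-ins for remote feature communication and the local GNN computation.

```python
# Illustrative sketch (not ParGNN's code) of overlapping per-subgraph
# communication with computation in a two-stage pipeline.
from concurrent.futures import ThreadPoolExecutor

def fetch_halo(sg):            # stand-in for remote halo-feature communication
    return [x * 2 for x in sg]

def compute(sg, halo):         # stand-in for the local GNN layer computation
    return sum(sg) + sum(halo)

def pipelined_train_step(subgraphs):
    results = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        fut = comm.submit(fetch_halo, subgraphs[0])   # prefetch the first halo
        for i, sg in enumerate(subgraphs):
            halo = fut.result()                       # wait for this subgraph's halo
            if i + 1 < len(subgraphs):
                fut = comm.submit(fetch_halo, subgraphs[i + 1])  # overlap next fetch
            results.append(compute(sg, halo))         # compute while next fetch runs
    return results

print(pipelined_train_step([[1, 2], [3, 4]]))  # → [9, 21]
```

With real communication, the background fetch hides transfer latency behind the compute of the previous subgraph, which is the effect the subgraph pipeline algorithm targets.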
```shell
cd pgalb                             ## go into the pgalb directory
python setup.py build_ext --inplace  ## install pgalb
python test_adpat.py                 ## check that the C extension installed correctly
```

- ParGNN uses a two-stage partition to address load imbalance and provides three functions:
- `graph_partition_dgl_metis`: loads the graph from a local path (or downloads it automatically), creates a DGL-format graph, and then runs the initial partition with METIS.
- `graph_eval`: profiles the subgraphs produced by the initial partition.
- `mapping`: maps the subgraphs to a cluster so that they run on the same GPUs.
- We provide two scripts to test the partition and repartition process; just run:

  ```shell
  python graph_partition.py
  python repart.py
  ```
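The profile-then-map idea above can be illustrated as a greedy load-balancing assignment. This is a minimal sketch assuming per-subgraph runtimes such as those measured by `graph_eval`; it uses a standard longest-processing-time heuristic, not the actual PGALB mapping algorithm.

```python
# Hypothetical sketch of the "mapping" stage: assign profiled subgraphs to
# GPUs so per-GPU load stays balanced (greedy longest-processing-time).
import heapq

def map_subgraphs_to_gpus(profiled_times, num_gpus):
    """profiled_times: {subgraph_id: runtime}; returns {gpu_id: [subgraph_ids]}."""
    heap = [(0.0, g) for g in range(num_gpus)]    # min-heap of (load, gpu_id)
    heapq.heapify(heap)
    assignment = {g: [] for g in range(num_gpus)}
    # Place the heaviest subgraphs first, each on the least-loaded GPU so far.
    for sg, t in sorted(profiled_times.items(), key=lambda kv: -kv[1]):
        load, g = heapq.heappop(heap)
        assignment[g].append(sg)
        heapq.heappush(heap, (load + t, g))
    return assignment

times = {0: 4.0, 1: 3.5, 2: 3.0, 3: 2.0, 4: 1.5}
print(map_subgraphs_to_gpus(times, 2))  # → {0: [0, 3, 4], 1: [1, 2]}
```

Here GPU 0 receives 7.5 units of work and GPU 1 receives 6.5, versus a worst-case 14/0 split without balancing; the repartition step can then refine the assignment using the measured runtimes.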
- Datasets used in the paper's evaluation (the scripts download them automatically when needed):
  - OGB graph datasets: ogbn-products and ogbn-proteins from https://ogb.stanford.edu/docs/nodeprop/
  - Yelp dataset: https://www.dgl.ai/dgl_docs/generated/dgl.data.YelpDataset.html#dgl.data.YelpDataset
  - Reddit dataset: https://www.dgl.ai/dgl_docs/generated/dgl.data.RedditDataset.html
- The scripts directory contains example scripts to run ParGNN:

  ```shell
  cd scripts
  sh train_all.sh
  ```
To cite this project, you can use the following BibTeX entries.
```bibtex
@inproceedings{11133102,
  author    = {Gu, Junyu and Li, Shunde and Cao, Rongqiang and Wang, Jue and Wang, Zijian and Liang, Zhiqiang and Liu, Fang and Li, Shigang and Zhou, Chunbao and Wang, Yangang and Chi, Xuebin},
  booktitle = {2025 62nd ACM/IEEE Design Automation Conference (DAC)},
  title     = {ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs},
  year      = {2025},
  pages     = {1--7},
  keywords  = {Training;Accuracy;Design automation;Pipelines;Graphics processing units;Load management;Graph neural networks;Partitioning algorithms;Faces;Convergence;Graph neural network;Full-batch distributed training;Load balancing;Computation and communication overlapping},
  doi       = {10.1109/DAC63849.2025.11133102}
}

@inproceedings{10.1145/3627535.3638488,
  author    = {Li, Shunde and Gu, Junyu and Wang, Jue and Yao, Tiechui and Liang, Zhiqiang and Shi, Yumeng and Li, Shigang and Xi, Weiting and Li, Shushen and Zhou, Chunbao and Wang, Yangang and Chi, Xuebin},
  title     = {POSTER: ParGNN: Efficient Training for Large-Scale Graph Neural Network on GPU Clusters},
  year      = {2024},
  isbn      = {9798400704352},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/3627535.3638488},
  doi       = {10.1145/3627535.3638488},
  pages     = {469--471},
  numpages  = {3},
  keywords  = {graph neural network, load balancing, data transfer hiding, distributed training},
  series    = {PPoPP '24},
  location  = {Edinburgh, United Kingdom}
}
```