Keywords: Vision-Language Models (VLMs); inference-time visual token reduction; token reduction; token pruning; token clustering and token merging; FlashAttention-compatible token reduction; positional-bias mitigation in token pruning
*(Figure: PACT vs other methods on LLaVA-OneVision-7B | PACT vs other methods on Qwen2-VL-7B)*

*(Figure: DBDPC vs other clustering algorithms on LLaVA-OneVision-7B | PACT vs other methods on LLaVA-1.6-Mistral-7B)*
```
conda env create -f environment.yml
conda activate pactenv
pip install flash-attn==2.6.3
```

Our repo lets you test PACT alongside other visual token reduction methods such as FastV, Visual Token Withdrawal, and ToMe, as well as four clustering algorithms: agglomerative clustering, k-means, Density Peaks Clustering, and DBSCAN.
You can find scripts in the scripts folder to reproduce results from the paper.
For example, to test PACT on LLaVA-OneVision-7B, you can run:
```
cd PACT/scripts
bash pact_llava-onevision7b.sh
```

This script demonstrates how to test all the methods supported by our repository. Each method is defined by a config file; the available config files are in the configs folder. For documentation on config file parameters, refer to this file.
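For orientation, a minimal config might look like the sketch below. The key names shown here are purely hypothetical placeholders, not the actual parameters; refer to the parameter documentation linked above for the real ones:

```json
{
  "reduction_method": "pact",
  "reduction_ratio": 0.5
}
```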
You can also test a custom pruning or clustering-based reduction method or combine both by using:
- custom_clustering.json for custom clustering-based methods,
- custom_pruning.json for custom pruning methods,
- custom_combined.json for combining pruning with clustering-based merging.
In addition to using the correct config file, you need to implement your reduction logic by modifying the custom_pruning function (which computes scores for token pruning) and/or the custom_token_reduction function (which typically defines a clustering-then-merging method) in utils.py. Please refer to the documentation of these functions for more details. Once implemented, you can test your custom pruning method, your custom clustering-based reduction method, or a combination of both by running:
```
cd PACT/scripts
bash test_custom.sh
```

The visual token reduction is implemented by modifying llava_arch.py and modeling_qwen2.py for LLaVA-OneVision, and modeling_qwen2_vl.py for Qwen2-VL. The modifications are based on functions defined in utils.py.
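As a rough illustration of the two customization points, the sketch below mimics their roles with a NumPy stand-in. This is not the actual utils.py code or its signatures (the real functions operate on PyTorch tensors inside the model); the scoring and clustering rules here are toy choices for demonstration only: L2-norm importance scores for pruning, and nearest-seed assignment followed by mean-merging for clustering-based reduction.

```python
import numpy as np

def custom_pruning_scores(tokens: np.ndarray) -> np.ndarray:
    """Toy importance score: the L2 norm of each token embedding.

    tokens: (num_tokens, dim) array of visual token embeddings.
    Returns a (num_tokens,) array; higher means more important.
    """
    return np.linalg.norm(tokens, axis=-1)

def custom_token_reduction(tokens: np.ndarray, num_clusters: int) -> np.ndarray:
    """Toy clustering-then-merging: assign each token to the nearest of
    `num_clusters` seed tokens, then merge each cluster by averaging."""
    seeds = tokens[:num_clusters]  # use the first tokens as cluster seeds
    # Distance of every token to every seed: shape (num_tokens, num_clusters).
    dists = np.linalg.norm(tokens[:, None, :] - seeds[None, :, :], axis=-1)
    assign = dists.argmin(axis=1)  # each seed is distance 0 to itself, so no cluster is empty
    return np.stack([tokens[assign == c].mean(axis=0) for c in range(num_clusters)])

# Example: score 16 tokens, keep the top half, then merge them into 4 tokens.
rng = np.random.default_rng(0)
toks = rng.standard_normal((16, 8))
scores = custom_pruning_scores(toks)
keep = np.argsort(scores)[-8:]  # indices of the 8 highest-scoring tokens
reduced = custom_token_reduction(toks[keep], num_clusters=4)
print(reduced.shape)  # (4, 8)
```

A combined method, as in custom_combined.json, would chain the two steps exactly like the example above: prune by score first, then cluster and merge the survivors.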
If you find our work useful, please consider citing our paper:
```
@InProceedings{Dhouib_2025_CVPR,
    author    = {Dhouib, Mohamed and Buscaldi, Davide and Vanier, Sonia and Shabou, Aymen},
    title     = {PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {14582-14592},
    doi       = {10.1109/CVPR52734.2025.01359}
}
```

This work received financial support from Crédit Agricole S.A. through the research chair with Ecole Polytechnique on Trustworthy and Responsible AI.



