great!
isayev added a commit that referenced this pull request on Dec 16, 2025
This PR adds torch.compile support with CUDA graphs for significant speedups on GPU molecular dynamics simulations. Based on community PR #20 from Acellera, but reworked with improvements:

**Changes:**
- Add `compile_mode=True` parameter to `AIMNet2Calculator` and `AIMNet2ASE`
- Add `compile_nb_mode` parameter throughout to avoid data-dependent control flow that breaks CUDA graph capture
- Add `get_model_definition_path()` for mapping model names to YAML definitions
- Add `cosine_cutoff_tensor()` for CUDA graphs compatibility
- Add `enable_compile_mode()` to AIMNet2Base to propagate compile settings
- Add `calc_masks_fixed_nb_mode()` for compile-time mask calculation

**Improvements over original PR #20:**
- Generalized model loading (not hardcoded to one model)
- Backward-compatible (original `cosine_cutoff` signature unchanged)
- Comprehensive test coverage
- Code style compliance (passes pre-commit)
- Based on current main branch

**Limitations:**
- Only `nb_mode=0` (single molecule, dense) supported
- Requires CUDA
- No PBC support in compile mode
- First call has compilation overhead

**Usage:**
```python
from aimnet.calculators import AIMNet2Calculator

calc = AIMNet2Calculator("aimnet2", compile_mode=True)
```
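The commit message lists `cosine_cutoff_tensor()` as added for CUDA graphs compatibility but does not show its body. As a rough illustration only, a cosine cutoff can be written without any Python-level branching on tensor values, which is the kind of formulation that graph capture tolerates. The function name `cosine_cutoff_sketch`, its signature, and the choice to pass the cutoff radius as a tensor are assumptions made for this sketch, not the actual AIMNet2 code.

```python
import math

import torch


def cosine_cutoff_sketch(d: torch.Tensor, rc: torch.Tensor) -> torch.Tensor:
    """Hypothetical branch-free cosine cutoff (not the AIMNet2 implementation).

    Returns 0.5 * (cos(pi * d / rc) + 1) for d < rc and 0 for d >= rc.
    """
    # Clamping the ratio to 1 makes every distance beyond rc evaluate to
    # cos(pi) = -1, so the result is 0 there without an `if` on tensor data.
    x = torch.clamp(d / rc, max=1.0)
    return 0.5 * (torch.cos(math.pi * x) + 1.0)


distances = torch.tensor([0.0, 2.5, 5.0, 7.0])
out = cosine_cutoff_sketch(distances, torch.tensor(5.0))
# out is approximately [1.0, 0.5, 0.0, 0.0]
```

Because the cutoff is applied with `torch.clamp` rather than a comparison evaluated in Python, the same sequence of kernels runs for any input values.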
As discussed by @giadefa, this PR adds the necessary changes to AIMNet2 so that the model can be compiled with `torch.compile` with CUDA graphs enabled.
This speeds up small-molecule MD significantly. The new example script `ase_md.py` demonstrates the speedup. The original runs 10000 steps in 76 seconds; the new version runs them in 15 seconds, roughly a 5x speedup.
Original:
New with `torch.compile(self.model, fullgraph=True, options={'triton.cudagraphs': True})`:

It is currently only implemented for `nb_mode=0` and a single molecule.
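For readers who want to try the same compile options outside AIMNet2, here is a minimal sketch of the `torch.compile` call quoted above. The helper name `compile_for_md` and the toy model are hypothetical. The fallback to the uncompiled model on machines without CUDA is an assumption of this sketch, not behaviour described in the PR.

```python
import torch
from torch import nn


def compile_for_md(model: nn.Module) -> nn.Module:
    """Hypothetical helper: wrap a model with the compile options quoted in this PR."""
    if not torch.cuda.is_available():
        # CUDA graphs need a GPU, so return the eager model unchanged.
        return model
    return torch.compile(
        model,
        fullgraph=True,  # raise an error on graph breaks instead of splitting the graph
        options={"triton.cudagraphs": True},
    )


# Toy stand-in for the real network, used only to show the call.
model = nn.Sequential(nn.Linear(3, 16), nn.SiLU(), nn.Linear(16, 1))
compiled = compile_for_md(model)
```

Compilation is lazy: the wrapper returns immediately and the cost is paid on the first forward call, so that call should be excluded from any timing comparison.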
The key required changes replace data-dependent control flow with control flow that is constant at compile time. To do this, I have added a `setup_for_compile_cudagraphs` method to some modules. The feature is supported by the ASE calculator interface.
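The idea of swapping a data-dependent branch for a compile-time constant can be sketched with a toy module. Everything here is illustrative: `MaskedSum`, its `setup_for_compile` method, and the `compile_nb_mode` attribute are hypothetical stand-ins modelled on the description above, not the AIMNet2 modules.

```python
import torch
from torch import nn


class MaskedSum(nn.Module):
    """Toy module (not AIMNet2 code): sums per-atom features over the atom axis."""

    def __init__(self) -> None:
        super().__init__()
        # Plain Python attribute. -1 means "inspect the data at runtime".
        self.compile_nb_mode = -1

    def setup_for_compile(self, nb_mode: int) -> None:
        # Hypothetical analogue of setup_for_compile_cudagraphs: fix the mode
        # once, before compiling, so forward() never has to look at the data.
        self.compile_nb_mode = nb_mode

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        if self.compile_nb_mode == 0:
            # Decided from a Python int, so the tracer sees a constant branch.
            return x.sum(dim=-2)
        # Data-dependent branch: bool() reads a tensor value on the host,
        # which forces a sync and cannot be part of a single captured graph.
        if bool(mask.all()):
            return x.sum(dim=-2)
        return (x * mask.unsqueeze(-1)).sum(dim=-2)


x = torch.ones(3, 4)                       # 3 atoms, 4 features each
mask = torch.tensor([True, True, False])   # last atom is padding

module = MaskedSum()
eager_out = module(x, mask)                # runtime path, padding masked out
module.setup_for_compile(0)
fixed_out = module(x, mask)                # fixed path, dense single molecule assumed
```

The fixed path is only correct when the input really is a dense single molecule with no padding, which matches the `nb_mode=0` restriction stated above.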