PRISM (Pruning Interface for Similar Molecules) is the modular similarity pruning code originally from FIRECODE, in a polished standalone package. It filters out duplicate structures from conformational ensembles, leaving behind non-redundant states.
The code implements a cached, iterative, divide-and-conquer approach on increasingly large subsets of the ensemble and removes duplicates as assessed by one of three metrics:
- Relative deviation of the moments of inertia on the principal axes
- Heavy-atom RMSD and maximum deviation
- Rotamer-corrected heavy-atom RMSD and maximum deviation
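The iterative divide-and-conquer idea described above can be illustrated with a minimal, pure-Python sketch. This is not PRISM's actual implementation: the function name `iterative_prune`, the `start_chunks` parameter, and the toy one-dimensional similarity test are all hypothetical, chosen only to show how pruning small subsets first and caching negative comparisons reduces work in the later, larger passes.

```python
from math import ceil
from typing import Callable, List, Sequence, Set, Tuple, TypeVar

T = TypeVar("T")


def iterative_prune(
    items: Sequence[T],
    similar: Callable[[T, T], bool],
    start_chunks: int = 4,
) -> List[bool]:
    """Return a keep-mask; the first member of each duplicate group survives.

    Hypothetical sketch, not the prism_pruner API.
    """
    keep = [True] * len(items)
    # Cache of index pairs already compared and found dissimilar,
    # so later (larger) passes never repeat a comparison.
    known_different: Set[Tuple[int, int]] = set()
    n_chunks = max(1, start_chunks)

    while True:
        survivors = [i for i, kept in enumerate(keep) if kept]
        n_eff = max(1, min(n_chunks, len(survivors)))
        size = max(1, ceil(len(survivors) / n_eff))

        # Prune within each chunk of the current survivors.
        for start in range(0, len(survivors), size):
            chunk = survivors[start:start + size]
            for pos, i in enumerate(chunk):
                if not keep[i]:
                    continue
                for j in chunk[pos + 1:]:
                    if not keep[j] or (i, j) in known_different:
                        continue
                    if similar(items[i], items[j]):
                        keep[j] = False
                    else:
                        known_different.add((i, j))

        if n_eff == 1:  # this pass compared all survivors together
            break
        n_chunks //= 2  # fewer, larger chunks on the next pass

    return keep


# Toy "ensemble": numbers stand in for structures, and two values
# count as duplicates when they differ by less than 0.1.
values = [0.0, 0.05, 1.0, 1.02, 2.0, 0.01, 2.05, 3.0]
mask = iterative_prune(values, lambda a, b: abs(a - b) < 0.1)
kept = [v for v, k in zip(values, mask) if k]
```

In a real setting the `similar` callable would be one of the three metrics listed above, and the mask plays the same role as the boolean mask returned by the package's pruning functions.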
The package is distributed through PyPI.
pip install prism_pruner
The main pruning functions are in prism_pruner.pruning, and a wrapper that can chain any or all of the three is also available. Each function returns the pruned ensemble structures together with the corresponding boolean mask.
from prism_pruner.conformer_ensemble import ConformerEnsemble
from prism_pruner.pruner import prune
ensemble = ConformerEnsemble.from_xyz("ensemble.xyz")
ensemble.coords.shape # (1086, 136, 3)
pruned, mask = prune(
    ensemble.coords,
    ensemble.atoms,
    # the third pruning routine can be
    # slow and is often not necessary,
    # so it's off by default
    rot_corr_rmsd_pruning=False,
    debugfunction=print,
)
pruned.shape # (387, 136, 3)
mask.shape # (1086,)
# where pruned is ensemble.coords[mask]
For additional performance, energies can also be read or provided, so that similarity is only evaluated between structures that are close in energy.
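As a rough illustration of why an energy window saves time, the sketch below wraps an expensive pairwise comparison so that it is skipped whenever two structures differ in energy by more than a threshold. The names `energy_gated` and `max_delta`, and the example energies, are assumptions made for illustration and are not part of the prism_pruner API.

```python
from typing import Callable, List, Sequence, Tuple


def energy_gated(
    similar: Callable[[int, int], bool],
    energies: Sequence[float],
    max_delta: float,
) -> Callable[[int, int], bool]:
    """Wrap an index-based similarity test with an energy pre-filter.

    Hypothetical helper: pairs whose energies differ by more than
    max_delta are reported as different without calling similar().
    """

    def gated(i: int, j: int) -> bool:
        if abs(energies[i] - energies[j]) > max_delta:
            return False  # too far apart in energy to be duplicates
        return similar(i, j)

    return gated


# Record which pairs actually reach the expensive comparison.
calls: List[Tuple[int, int]] = []


def expensive_similarity(i: int, j: int) -> bool:
    calls.append((i, j))
    return True  # placeholder for a geometric metric such as RMSD


energies = [0.0, 0.1, 5.0]  # illustrative relative energies
gated = energy_gated(expensive_similarity, energies, max_delta=1.0)

close_pair = gated(0, 1)  # energies within the window: metric is evaluated
far_pair = gated(0, 2)    # energies too different: metric is skipped
```

Only the first pair reaches `expensive_similarity`; the second is rejected on energy alone, which is where the saving comes from on large ensembles.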
For additional usage, see the examples folder.
This package was created with Cookiecutter and the jevandezande/pixi-cookiecutter project template.
