rahulkhorana/PolyatomicComplexes

PolyatomicComplexes

🚀 Installation

Using pip

  1. Ensure you have Python >= 3.11.11, then create and activate a virtual environment.
pip install virtualenv
virtualenv .env --python=python3.11.11
source .env/bin/activate
  2. Install the package from PyPI.
pip install -U polyatomic-complexes==1.0.8

Note: If you have trouble with the environment setup, see the Colab demo: Environment Setup
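
To confirm that the steps above took effect, a short check like the following can help. It is a minimal sketch using only the Python standard library: it compares the running interpreter against the 3.11.11 minimum stated above and looks up the installed version of the `polyatomic-complexes` distribution. The helper names `python_ok` and `installed_version` are illustrative and not part of the package.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 11, 11)  # minimum version stated in the install steps
PACKAGE = "polyatomic-complexes"  # distribution name used in the pip command


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter meets the minimum version."""
    return tuple(version_info[:3]) >= minimum


def installed_version(package=PACKAGE):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print(f"{PACKAGE} version:", installed_version())
```

Run it inside the activated virtual environment. If the version prints as `None`, the package was most likely installed into a different interpreter.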

Using the repo

  1. Clone the repo.

  2. Ensure you have Python >= 3.11.11, then create and activate a virtual environment.

pip install virtualenv
virtualenv .env --python=python3.11.11
source .env/bin/activate
  3. Install the relevant packages.

For standard/minimal usage:

pip install -Ur requirements/requirements.txt

For graph based experiments:

pip install -Ur requirements/requirements_graph.txt

For materials based experiments:

pip install -Ur requirements/requirements_mat.txt
  4. Fetch all large files from Git LFS.
git lfs fetch --all
git lfs pull
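
If the LFS step was skipped or failed, large files remain in the working tree as small text stubs. Git LFS pointer files begin with the line `version https://git-lfs.github.com/spec/v1`, so a scan like the one below can flag files that were not pulled. This is a rough sketch, not part of the repository; the helper names `is_lfs_pointer` and `find_unpulled` are hypothetical.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is still a Git LFS pointer stub."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unpulled(root="."):
    """List files under root that are still LFS pointers, skipping .git."""
    root = Path(root)
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(root).parts
        and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    for path in find_unpulled():
        print("still a pointer:", path)
```

Run it from the repository root. No output means no pointer stubs were found.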

🚀 Datasets

All dataset CSV files come from the original manuscripts cited in the paper, namely:

  1. ESOL: DELANEY, J. S. Esol: Estimating aqueous solubility directly from molecular structure. Journal of Chemical Information and Computer Sciences 44, 3 (2004), 1000–1005. PMID: 15154768.
  2. FreeSolv: MOBLEY, D. L., AND GUTHRIE, J. P. Freesolv: a database of experimental and calculated hydration free energies, with input files. Journal of computer-aided molecular design 28 (2014), 711–720.
  3. Lipophilicity: GAULTON, A., BELLIS, L. J., BENTO, A. P., CHAMBERS, J., DAVIES, M., HERSEY, A., LIGHT, Y., MCGLINCHEY, S., MICHALOVICH, D., AL-LAZIKANI, B., AND OVERINGTON, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Research 40, D1 (09 2011), D1100–D1107.
  4. Photoswitches: GRIFFITHS, R.-R., GREENFIELD, J. L., THAWANI, A. R., JAMASB, A. R., MOSS, H. B., BOURACHED, A., JONES, P., MCCORKINDALE, W., ALDRICK, A. A., FUCHTER, M. J., AND LEE, A. A. Data-driven discovery of molecular photoswitches with multioutput Gaussian processes. Chem. Sci. 13 (2022), 13541–13551.

📜 License

This project is licensed under the MIT License.

Community

We use GitHub Discussions for:

  • 💬 Asking questions or requesting help
  • 🐞 Reporting bugs or issues
  • 💡 Suggesting new features or improvements

Feel free to start a discussion if you are interested in contributing!

🔬 Reference

@article{Khorana2025,
    author = {Khorana, Rahul},
    title = {Polyatomic Complexes: A Software Framework for Topologically Accurate Representations of Molecules},
    doi = {10.21105/joss.08828},
    url = {https://doi.org/10.21105/joss.08828},
    year = {2025},
    publisher = {The Open Journal},
    volume = {10},
    number = {114},
    pages = {8828},
    journal = {Journal of Open Source Software}
}
@misc{khorana2024polyatomiccomplexestopologicallyinformedlearning,
      title={Polyatomic Complexes: A topologically-informed learning representation for atomistic systems}, 
      author={Rahul Khorana and Marcus Noack and Jin Qian},
      year={2024},
      eprint={2409.15600},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2409.15600}, 
}
