Description:
After extensive troubleshooting, I found that the standard Bioconda distribution of Foldseek (e.g., version 1.3c64211) is incompatible with Pathfinder's tmscore transformer. Users who install Foldseek with conda install -c bioconda foldseek will hit silent failures that produce NaN matrices and crash the PCA step.
The Root Cause:
Parameter Mismatch: The Bioconda version of Foldseek does not support the --exhaustive-search flag, which is currently hardcoded in src/run_foldseek.py.
Keyword Incompatibility: Standard versions do not recognize the fptmscore format keyword, resulting in empty alignment outputs.
Working Version: Successful runs were only achieved using a specific Foldseek build (e.g., commit 24020d2...), which supports these advanced/custom flags.
Key Discovery & Recommendation
The Bioconda version of Foldseek is NOT a drop-in replacement for the required build. To prevent users from falling into this "NaN trap," I recommend the following updates to the repository:
Update README/Installation Guide:
- Explicitly warn users NOT to use conda install -c bioconda foldseek.
- Provide clear instructions or a link to download the specific compatible Foldseek binary or install from source (https://github.com/steineggerlab/foldseek).
Add Environment Checks:
Implement a pre-run check in main.py to verify if the available foldseek binary supports --exhaustive-search.
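A minimal sketch of such a pre-run check follows. It assumes that running foldseek easy-search -h prints the supported options, so the flag can be detected by searching the help text; that behavior should be confirmed against the builds listed under "Environment for Reference". The names missing_flags, foldseek_help, and check_foldseek are hypothetical and do not exist in Pathfinder.

```python
import subprocess
from typing import List, Sequence

# Flags that src/run_foldseek.py currently relies on, per this issue.
REQUIRED_FLAGS = ("--exhaustive-search",)


def missing_flags(help_text: str, required: Sequence[str] = REQUIRED_FLAGS) -> List[str]:
    """Return the required flags that do not appear in the help text."""
    return [flag for flag in required if flag not in help_text]


def foldseek_help(binary: str = "foldseek", module: str = "easy-search") -> str:
    """Capture the help output of a Foldseek module.

    Assumption: `<binary> <module> -h` prints the option list. stdout and
    stderr are combined because the stream used may differ between builds.
    """
    result = subprocess.run(
        [binary, module, "-h"], capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr


def check_foldseek(binary: str = "foldseek") -> None:
    """Raise RuntimeError early instead of failing silently with NaN output."""
    try:
        help_text = foldseek_help(binary)
    except OSError as exc:
        raise RuntimeError(f"Could not execute Foldseek binary: {binary!r}") from exc

    missing = missing_flags(help_text)
    if missing:
        raise RuntimeError(
            f"{binary!r} does not appear to support {', '.join(missing)}. "
            "See the installation notes for a compatible Foldseek build."
        )
```

Calling check_foldseek() at the top of main.py would turn the silent NaN failure into an immediate, readable error. The same approach could be extended to the fptmscore keyword if the help text of a compatible build lists it, which has not been verified here.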
Path Configuration:
Allow users to specify a custom FOLDSEEK_PATH in the configuration or as a command-line argument to avoid PATH conflicts with Conda-installed versions.
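One possible resolution order is sketched below: an explicit command-line value first, then a FOLDSEEK_PATH environment variable, then whatever foldseek is on PATH. The function name resolve_foldseek and the --foldseek-path option mentioned in the error message are hypothetical; only the FOLDSEEK_PATH name comes from the suggestion above.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_foldseek(
    cli_path: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the Foldseek binary to use.

    Precedence: command-line value, then FOLDSEEK_PATH, then PATH lookup.
    An explicit choice that is not executable is an error; it never falls
    back silently to a Conda-installed binary found on PATH.
    """
    env = os.environ if env is None else env
    explicit = (
        ("--foldseek-path", cli_path),  # hypothetical CLI option
        ("FOLDSEEK_PATH", env.get("FOLDSEEK_PATH")),
    )
    for source, candidate in explicit:
        if candidate:
            resolved = shutil.which(candidate)
            if resolved is None:
                raise FileNotFoundError(
                    f"{source} points to {candidate!r}, which is not an executable file"
                )
            return resolved

    found = shutil.which("foldseek")
    if found is None:
        raise FileNotFoundError(
            "No foldseek binary found; set FOLDSEEK_PATH or install a compatible build"
        )
    return found
```

Refusing to fall back when an explicit path is invalid is deliberate: a quiet fallback to the PATH binary would reintroduce the conflict this option is meant to avoid.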
Environment for Reference
Failed: Foldseek 1.3c64211 (Bioconda)
Success: Foldseek build 24020d257933c362dd1c22fd64cf478f89d5efc6
Python: 3.10
Halen