Add astroquery.linelists.exomol — ExoMol database query module#3536
Add astroquery.linelists.exomol — ExoMol database query module#3536aditya-pandey-dev wants to merge 26 commits intoastropy:mainfrom
Conversation
Implements astroquery interface for ExoMol molecular line list database. Wraps RADIS ExoMol reader (radis.io.exomol) into BaseQuery pattern. Features: - query_lines() — fetch line lists as astropy.Table - get_molecule_list() — list all 91+ ExoMol molecules - get_databases() — list available databases per molecule - get_partition_function() — fetch Q(T) data - broadening_species support (H2, He, air, CO2, H2O, self) Tests: 9/9 passing (5 mocked + 4 remote_data) Related: radis/radis#925
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3536 +/- ##
==========================================
- Coverage 72.66% 72.57% -0.09%
==========================================
Files 219 222 +3
Lines 20478 20547 +69
==========================================
+ Hits 14880 14913 +33
- Misses 5598 5634 +36 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
keflavich
left a comment
There was a problem hiding this comment.
Thank you for this PR! I've wanted clean access to exomol for many years and never got around to writing it myself (...i think because the server wasn't mature yet?).
The implementation looks OK right now, but I have questions: Can this be refactored to avoid pandas and radis? Should it?
There are some minor code issues, and I haven't systematically reviewed the tests, but broadly it looks OK.
astroquery/linelists/exomol/core.py
Outdated
| from radis.api.exomolapi import get_exomol_database_list | ||
| from radis.api.exomolapi import get_exomol_full_isotope_name |
There was a problem hiding this comment.
We're introducing a new dependency here, which I don't love doing - but if it's necessary, we'll allow it.
These methods look fairly simple and maybe safe to reimplement here, but if they're heavy methods in any way, it's OK to leave them in radis. I'll invite @bsipocz to comment on this too.
| xfail_strict = true | ||
| remote_data_strict = true | ||
| addopts = ["--color=yes", "--doctest-rst", "--doctest-continue-on-failure"] | ||
| timeout = 300 |
docs/linelists/exomol/exomol.rst
Outdated
| load_wavenum_min=2000, | ||
| load_wavenum_max=2100, | ||
| ) | ||
| print(result) |
There was a problem hiding this comment.
the results should be shown here (and in the other doctest examples)
…re pytest-timeout, add dependency note to docs
|
Hi @keflavich I've addressed all your review comments:
Could you please re-review |
|
Could you say what it would take to remove RADIS as a dependency? Adding it as a dependency creates a circular dependency (https://github.com/radis/radis/blob/be14d4b101b9f068ea0ee04b4e902d57049ea813/environment.yml#L17), so it would be better if astroquery could stand alone. |
|
Hi @keflavich I go through what it would take to remove RADIS as a dependency: It is comparatively easy to reimplement without radis: get_exomol_database_list — simple BeautifulSoup HTML scraper (30 lines) Very complex to reimplement: fetch_exomol (352 lines) — handles HDF5 file download, caching, line intensity calculations, isotopologue filtering, and broadening parameters. Reimplementing this would essentially duplicate a major portion of RADIS. I can remove the two helper functions from RADIS and reimplement them natively. But for fetch_exomol, the options are: a) Keep RADIS as optional dependency (skip tests if unavailable) What would you prefer? |
|
Thanks @aditya-pandey-dev . I think the former functions, including the bs4 scraper, belong here. for fetch_exomol, that sounds like a more involved calculation - @bsipocz I'm loosely inclined to allow this function to use exomol. It's not purely database work, I think, but a little calculation on top. Are you OK with including radis only as a dependency for this function - i.e., import within the function? |
astroquery/linelists/exomol/core.py
Outdated
| Query CO lines between 2000-2100 cm^-1:: | ||
|
|
||
| from astroquery.linelists.exomol import ExoMol | ||
| result = ExoMol.query_lines('CO', | ||
| load_wavenum_min=2000, | ||
| load_wavenum_max=2100) |
There was a problem hiding this comment.
To follow the astroquery standard for APIs, these should be specified with units - so load_wavenum_min=2000*u.cm**-1, for example. Also, the keyword name should be wavenum_min - load doesn't fit here, afaict.
astroquery/linelists/exomol/core.py
Outdated
| load_wavenum_min=None, | ||
| load_wavenum_max=None, |
There was a problem hiding this comment.
see above
| load_wavenum_min=None, | |
| load_wavenum_max=None, | |
| wavenum_min=None, | |
| wavenum_max=None, |
| Minimum wavenumber in cm^-1. | ||
| load_wavenum_max : float, optional | ||
| Maximum wavenumber in cm^-1. | ||
| broadening_species : str or list of str, optional |
There was a problem hiding this comment.
these should be Quantitys
| "pytest>=7.4", | ||
| "pytest-doctestplus>=1.4", | ||
| "pytest-timeout", | ||
| "pytest-timeout", |
… lazy RADIS import
…DIS dependency, fix pyproject.toml whitespace
|
Hi @keflavich all your review comments have been addressed: Renamed load_wavenum_min/max → wavenum_min/max with ~astropy.units.Quantity support Could you please re-review? |
keflavich
left a comment
There was a problem hiding this comment.
There are a few more things to clean up in tests now
| try: | ||
| import radis # noqa: F401 | ||
| except ImportError as e: | ||
| pytest.skip(f"radis required for exomol tests: {e}", | ||
| allow_module_level=True) |
There was a problem hiding this comment.
radis is no longer required for most tests, so let's switch this global skip to skipif RADEX_NOT_AVAILABLE down below
| monkeypatch.setattr( | ||
| "radis.io.exomol.fetch_exomol", lambda *a, **kw: fake_linelist_df | ||
| ) | ||
| result = ExoMol.query_lines("CO", wavenum_min=2000, wavenum_max=2100) |
There was a problem hiding this comment.
these should now require Quantities (and this test should not be skipped)
There was a problem hiding this comment.
ah, actually, this test should be skipped if radis is unavailable, sorry - but still, these should be quantities
| monkeypatch.setattr( | ||
| "radis.io.exomol.fetch_exomol", lambda *a, **kw: fake_linelist_df | ||
| ) | ||
| result = ExoMol.query_lines("CO", wavenum_min=2000, wavenum_max=2100) |
| wavenum_min=1000, | ||
| wavenum_max=1100, |
| wavenum_min=2000, | ||
| wavenum_max=2100, |
| wavenum_min=2000, | ||
| wavenum_max=2050, |
| # =========================================================== | ||
|
|
||
|
|
||
| @pytest.mark.remote_data |
There was a problem hiding this comment.
all the remote_data-flagged tests should go into test_exomol_remote - it makes debugging and testing a lot easier when the non-remote tests and remote-requiring tests are separated
… tests to test_exomol_remote.py
|
Hi @keflavich I've addressed all the review comments: Replaced global pytest.skip with RADEX_NOT_AVAILABLE flag + @pytest.mark.skipif on each test Could you please re-review? |
Implements
astroquery.linelists.exomolmodule for querying theExoMol molecular line list database (https://www.exomol.com).
Closes radis/radis#479
Related: radis/radis#925
Changes:-
ExoMolClass(BaseQuery)with 4 methods:query_lines()— fetch line lists asastropy.Tableget_molecule_list()— list all 91+ moleculesget_databases()— list databases per moleculeget_partition_function()— fetch Q(T) databroadening_speciessupport (H2, He, air, CO2, H2O, self)