Parameterize compute_enso_index.py with CLI arguments #875

Merged
brianhenn merged 20 commits into main from parameterize-compute-enso-index on Feb 27, 2026

@brianhenn brianhenn commented Feb 25, 2026

Parameterizes scripts/compute_enso_index/compute_enso_index.py with CLI arguments so it can be run on different datasets without code changes, and adds a dedicated Makefile target for the ERA5 AIMIP dataset.

Changes:

  • scripts/compute_enso_index/compute_enso_index.py — added CLI arguments for SST variable, dimension names, ocean mask source, time range, output file, and optional detrending; added dask progress bar; fixed cftime/datetime64 compatibility

  • scripts/compute_enso_index/Makefile — added era5_aimip_enso_index target with explicit arguments for the ERA5 AIMIP zarr dataset

  • Tests added

  • If dependencies changed, "deps only" image rebuilt and "latest_deps_only_image.txt" file updated

brianhenn and others added 17 commits February 25, 2026 15:37
Add --sst-dataset, --ocean-mask-source, --lat-dim, and --lon-dim CLI
arguments to compute_enso_index.py. Existing hardcoded values are
retained as defaults so the script remains non-breaking when run
without arguments. Also adds an open_dataset() helper that dispatches
to xr.open_zarr or xr.open_dataset based on the path, enabling either
zarr or netCDF inputs for both the SST dataset and ocean mask source.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
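
The path-based dispatch described in this commit could look roughly like the sketch below. The exact rule the script uses is not shown in the PR; the `.zarr` suffix check and the `is_zarr_path` helper name are assumptions made for illustration.

```python
def is_zarr_path(path: str) -> bool:
    """Assumed heuristic: a path ending in ".zarr" names a zarr store."""
    return path.rstrip("/").endswith(".zarr")


def open_dataset(path: str, **kwargs):
    """Open a zarr store or a netCDF file with xarray, chosen by the path."""
    # Imported lazily so the path heuristic can be used without xarray installed.
    import xarray as xr

    if is_zarr_path(path):
        return xr.open_zarr(path, **kwargs)
    return xr.open_dataset(path, **kwargs)
```

With a helper like this, the same call site can serve both the SST dataset and the ocean mask source regardless of which storage format each uses.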
Add --ocean-mask-var, --ocean-mask-lat-dim, and --ocean-mask-lon-dim
CLI arguments so that get_ocean_mask works with datasets that use
different variable or coordinate names than the default FV3GFS zarr
source. Existing defaults (ocean_fraction, grid_yt, grid_xt) are
preserved for non-breaking behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
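
A minimal `argparse` sketch of the flags named in the two commits above. The flag names and the ocean-mask defaults come from the commit messages, assuming the defaults are listed in the same order as the flags; the `None` placeholders, help text, and the `build_parser` function name are hypothetical, since the script's actual hardcoded values are not shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Compute an ENSO index from an SST dataset."
    )
    # Placeholders: the real script keeps its previously hardcoded values here.
    parser.add_argument("--sst-dataset", default=None, help="zarr or netCDF path")
    parser.add_argument("--ocean-mask-source", default=None, help="zarr or netCDF path")
    parser.add_argument("--lat-dim", default=None)
    parser.add_argument("--lon-dim", default=None)
    # Defaults stated in the commit message for the FV3GFS zarr source.
    parser.add_argument("--ocean-mask-var", default="ocean_fraction")
    parser.add_argument("--ocean-mask-lat-dim", default="grid_yt")
    parser.add_argument("--ocean-mask-lon-dim", default="grid_xt")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Because every argument has a default, running the script with no arguments behaves as before, which is the non-breaking property the commits describe.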
Expose the start and end of the anomaly computation window as CLI
arguments, defaulting to the previously hardcoded values (1940-01-01
and 2021-01-01) so existing behavior is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a boolean --detrend CLI flag (default false) that, when set,
subtracts a linear trendline from the monthly anomaly index before
computing the 3-monthly running mean. Uses the existing (previously
unused) get_time_trendline function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
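
The order of operations behind `--detrend` can be sketched in plain Python: fit a least-squares line to the monthly anomaly index, subtract it, then take the centered 3-month running mean. The function names below are hypothetical stand-ins (the script's own `get_time_trendline` works on xarray objects and its signature is not shown in this PR), and the trend is fitted against the sample index, which assumes evenly spaced monthly values and at least two samples.

```python
from typing import List, Optional, Sequence


def linear_trendline(values: Sequence[float]) -> List[float]:
    """Least-squares line fitted against the sample index 0..n-1 (needs n >= 2)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    slope = sxy / sxx
    return [y_mean + slope * (i - x_mean) for i in range(n)]


def centered_3_month_mean(values: Sequence[float]) -> List[Optional[float]]:
    """Mean of each month with its two neighbours; the endpoints are left as None."""
    out: List[Optional[float]] = [None] * len(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3
    return out


def enso_index(monthly_anomalies: Sequence[float], detrend: bool = False):
    """Optionally remove the linear trend, then apply the 3-month running mean."""
    series = list(monthly_anomalies)
    if detrend:
        trend = linear_trendline(series)
        series = [v - t for v, t in zip(series, trend)]
    return centered_3_month_mean(series)
```

Detrending before smoothing means a steady warming signal is removed from the index itself rather than from the already-averaged values.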
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Makes the SST variable name configurable so the script works with
datasets like ERA5 that use a different variable name. Updates the
era5_enso_index Makefile target to pass --sst-var surface_temperature.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace timedelta(days=45) with np.timedelta64(45, "D") to support
  numpy datetime64 time coordinates
- Call .compute() before iterating to materialize dask-backed arrays
- Use .astype("datetime64[ms]").item() to safely extract year/month/day
  from datetime64[ns] time coordinates

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
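
A small NumPy illustration of why this commit casts before calling `.item()`: on a `datetime64[ns]` scalar, `.item()` returns an integer count of nanoseconds, whereas after casting to millisecond precision it returns a `datetime.datetime` with usable `year`, `month`, and `day` attributes. The timestamp is an arbitrary example value, not one taken from the dataset.

```python
import datetime

import numpy as np

# An arbitrary nanosecond-precision timestamp, as found in xarray time coordinates.
t = np.datetime64("2000-01-16T12:00:00.000000000")

# np.timedelta64 combines directly with datetime64 values.
shifted = t + np.timedelta64(45, "D")

# At nanosecond precision, .item() yields an int (nanoseconds since the epoch).
raw = shifted.item()

# Casting to milliseconds first yields a datetime.datetime instead.
as_datetime = shifted.astype("datetime64[ms]").item()

print(type(raw).__name__, as_datetime.year, as_datetime.month, as_datetime.day)
```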
Wraps the .compute() call with a dask ProgressBar context manager so
users get visibility into the computation instead of a silent wait.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Renames the Makefile target and its associated variables to reflect that
the ERA5 data used is specifically the AIMIP dataset.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use use_cftime=True in open_dataset to avoid datetime64 issues with
cftime-encoded time coordinates. Replace np.timedelta64 with
datetime.timedelta for compatibility with cftime objects, and use
.item() directly on cftime time coordinates instead of casting to
datetime64[ms] first.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@brianhenn brianhenn marked this pull request as ready for review February 27, 2026 19:43
Comment on lines 68 to 69
# this version of xarray's resample method doesn't allow
# data shifting to create a centered 3-month average, so do it manually
Contributor

[nit] Is this a requirement for ace's xarray version? I know for example under scripts/data_process we have a different xarray version.

Contributor Author

Good point. I am running this locally off of the fme conda environment, and at some point I hit the issue that required this. I think the best approach is to be clearer about which environment is expected to be used to run the script; I'll add a comment.

@@ -0,0 +1,2 @@
*index.py
Contributor

Just curious why we don't want to keep the files in git. Is this purely for analysis? In the past when I did this for E3SM, I ended up overriding the historical_index.py file so that they show up on wandb. Maybe that's not your intent here?

Contributor Author

@brianhenn brianhenn Feb 27, 2026

For the immediate AIMIP use case I am putting the output in a notebook/other script somewhere. Perhaps we should have a catalog of indices in historical_index.py for different SST datasets, but I'm not sure how pressing that need is.

Contributor

Not pressing. We can add it when there's a use case for that, just wanted to check.

brianhenn and others added 3 commits February 27, 2026 12:57
Documents the two Makefile targets as the primary entry points for
running the script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@brianhenn brianhenn merged commit 04bf72b into main Feb 27, 2026
7 checks passed
@brianhenn brianhenn deleted the parameterize-compute-enso-index branch February 27, 2026 21:28