Parameterize compute_enso_index.py with CLI arguments #875
Add --sst-dataset, --ocean-mask-source, --lat-dim, and --lon-dim CLI arguments to compute_enso_index.py. Existing hardcoded values are retained as defaults so the script remains non-breaking when run without arguments. Also adds an open_dataset() helper that dispatches to xr.open_zarr or xr.open_dataset based on the path, enabling either zarr or netCDF inputs for both the SST dataset and ocean mask source. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
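A minimal sketch of what such a dispatching helper could look like. The commit message only says the choice is made "based on the path"; the rule used here (a trailing ".zarr" suffix) and the is_zarr_path name are assumptions for illustration, not the script's confirmed logic.

```python
def is_zarr_path(path: str) -> bool:
    """Assumed rule: a path names a zarr store if it ends in '.zarr'."""
    return path.rstrip("/").endswith(".zarr")


def open_dataset(path: str, **kwargs):
    """Open a zarr store or a netCDF file with the matching xarray reader."""
    # Imported lazily so the path check above can be used without xarray.
    import xarray as xr

    if is_zarr_path(path):
        return xr.open_zarr(path, **kwargs)
    return xr.open_dataset(path, **kwargs)
```

With a rule like this, the same call site can serve both the SST dataset and the ocean mask source, whichever format each one happens to use.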
Add --ocean-mask-var, --ocean-mask-lat-dim, and --ocean-mask-lon-dim CLI arguments so that get_ocean_mask works with datasets that use different variable or coordinate names than the default FV3GFS zarr source. Existing defaults (ocean_fraction, grid_yt, grid_xt) are preserved for non-breaking behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Expose the start and end of the anomaly computation window as CLI arguments, defaulting to the previously hardcoded values (1940-01-01 and 2021-01-01) so existing behavior is unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
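The pattern of keeping the old hardcoded values as argparse defaults might look roughly like the sketch below. The ocean mask flags and all default values are taken from the commit messages; the flag names --anomaly-start and --anomaly-end are hypothetical, since the commit message does not name the time window arguments.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Compute an ENSO index from SST data."
    )
    # Flag names and defaults as stated in the commit messages.
    parser.add_argument("--ocean-mask-var", default="ocean_fraction")
    parser.add_argument("--ocean-mask-lat-dim", default="grid_yt")
    parser.add_argument("--ocean-mask-lon-dim", default="grid_xt")
    # Hypothetical flag names; only the default dates come from the source.
    parser.add_argument("--anomaly-start", default="1940-01-01")
    parser.add_argument("--anomaly-end", default="2021-01-01")
    return parser


# Parsing an empty argument list reproduces the previous hardcoded behavior.
args = build_parser().parse_args([])
print(args.ocean_mask_var, args.anomaly_start, args.anomaly_end)
```

Because every argument has a default, invoking the script with no arguments behaves as it did before the change.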
Add a boolean --detrend CLI flag (default false) that, when set, subtracts a linear trendline from the monthly anomaly index before computing the 3-monthly running mean. Uses the existing (previously unused) get_time_trendline function. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
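For readers unfamiliar with the ordering described here, a rough NumPy sketch of "detrend, then take a centered 3-month mean" follows. The body of get_time_trendline is not shown in this PR, so np.polyfit stands in for it, and the trend is fitted against the sample position (assuming evenly spaced monthly values); both function names are hypothetical.

```python
import numpy as np


def detrend_linear(index: np.ndarray) -> np.ndarray:
    """Subtract a least-squares straight line fitted against sample position."""
    t = np.arange(index.size, dtype=float)
    slope, intercept = np.polyfit(t, index, deg=1)
    return index - (slope * t + intercept)


def centered_three_month_mean(index: np.ndarray) -> np.ndarray:
    """Average each month with its two neighbours; the end points become NaN."""
    out = np.full(index.shape, np.nan)
    out[1:-1] = (index[:-2] + index[1:-1] + index[2:]) / 3.0
    return out


# Synthetic monthly anomalies: a slow warming trend plus an oscillation.
months = np.arange(120, dtype=float)
monthly_index = 0.01 * months + np.sin(2 * np.pi * months / 48)
smoothed = centered_three_month_mean(detrend_linear(monthly_index))
```

Detrending before smoothing means the running mean is taken over anomalies that no longer carry the long-term drift.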
Makes the SST variable name configurable so the script works with datasets like ERA5 that use a different variable name. Updates the era5_enso_index Makefile target to pass --sst-var surface_temperature. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace timedelta(days=45) with np.timedelta64(45, "D") to support
numpy datetime64 time coordinates
- Call .compute() before iterating to materialize dask-backed arrays
- Use .astype("datetime64[ms]").item() to safely extract year/month/day
from datetime64[ns] time coordinates
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
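The reason for the cast to datetime64[ms] can be seen in a small standalone example. This is an illustration of NumPy's behavior on a single made-up timestamp, not code from the script.

```python
import numpy as np

stamp = np.datetime64("2020-02-15T00:00:00.000000000")  # datetime64[ns]

# At nanosecond precision .item() returns an integer count of nanoseconds,
# because datetime.datetime cannot represent nanoseconds.
raw = stamp.item()

# Casting to millisecond precision first yields a datetime.datetime,
# which exposes .year, .month and .day.
as_datetime = stamp.astype("datetime64[ms]").item()
print(as_datetime.year, as_datetime.month, as_datetime.day)

# np.timedelta64 offsets can be added directly to datetime64 values.
shifted = stamp + np.timedelta64(45, "D")
```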
Wraps the .compute() call with a dask ProgressBar context manager so users get visibility into the computation instead of a silent wait. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Renames the Makefile target and its associated variables to reflect that the ERA5 data used is specifically the AIMIP dataset. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use use_cftime=True in open_dataset to avoid datetime64 issues with cftime-encoded time coordinates. Replace np.timedelta64 with datetime.timedelta for compatibility with cftime objects, and use .item() directly on cftime time coordinates instead of casting to datetime64[ms] first. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# this version of xarray's resample method doesn't allow
# data shifting to create a centered 3-month average, so do it manually
[nit] Is this a requirement for ace's xarray version? I know for example under scripts/data_process we have a different xarray version.
Good point, I am running this locally off of the fme conda environment, and at some point I hit the issue that required this. I think maybe the best is to just be clearer what environment is expected to be used to run the script; I'll add a comment.
@@ -0,0 +1,2 @@
*index.py
Just curious why we don't want to keep the files in git, is this purely for analysis only? In the past when I did this for E3SM, I ended up overriding historical_index.py file so that they show up on wandb. Maybe that's not your intent here?
For the immediate AIMIP use case I am putting the output in a notebook/other script somewhere. Perhaps we should have a catalog of indices in historical_index.py for different SST datasets, but not sure how pressing that need is.
Not pressing. We can add it when there's a use case for that, just wanted to check.
Documents the two Makefile targets as the primary entry points for running the script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Parameterizes scripts/compute_enso_index/compute_enso_index.py with CLI arguments so it can be run on different datasets without code changes, and adds a dedicated Makefile target for the ERA5 AIMIP dataset.

Changes:
- scripts/compute_enso_index/compute_enso_index.py: added CLI arguments for SST variable, dimension names, ocean mask source, time range, output file, and optional detrending; added dask progress bar; fixed cftime/datetime64 compatibility
- scripts/compute_enso_index/Makefile: added era5_aimip_enso_index target with explicit arguments for the ERA5 AIMIP zarr dataset

- Tests added
- If dependencies changed, "deps only" image rebuilt and "latest_deps_only_image.txt" file updated