Skip to content

Change 2nd dimension name to avoid crash for gauss cases#22

Merged
JamesMcClung merged 2 commits intopsc-code:mainfrom
JamesMcClung:rename-component-dim
Apr 14, 2025
Merged

Change 2nd dimension name to avoid crash for gauss cases#22
JamesMcClung merged 2 commits intopsc-code:mainfrom
JamesMcClung:rename-component-dim

Conversation

@JamesMcClung
Copy link
Collaborator

@JamesMcClung JamesMcClung commented Mar 12, 2025

Fixes #21

The name of the 2nd dimension doesn't matter, since it gets dropped. This fixes the case where multiple dataarrays exist in one dataset (e.g. rho and dive for gauss outputs), where previously, the name of the 2nd dimension shared by all dataarrays was set to contain the name of the first dataarray.

An alternative approach would have been to change gauss output on the PSC side to produce a single datarray with 2 components, dive and rho.

It might be nice to have this behavior enforced by tests, but there are no gauss outputs in the sample directory. That could merit a new issue.

@codecov
Copy link

codecov bot commented Mar 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Files with missing lines Coverage Δ
src/pscpy/psc.py 98.24% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JamesMcClung
Copy link
Collaborator Author

Thoughts on this? I know it's another step away from the concept of a jeh variable, but I would like to be able to load gauss datasets without having to manually edit my venv's copy of pscpy

ds = ds.rename_dims(
{
da.dims[0]: "step",
da.dims[1]: f"comp_{da.name}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think I did this kinda ugly naming to prevent problems with uniqueness. Specifically, if a dataset has fields and moments, one of them may have 9 components but the other 26 (or whatever). If the dimension is called "components" in both cases, xarray will complain, since a given dimension must have a unique length.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand exactly things went wrong for you, but I gather it's related if not somehow opposite. What I thought should have happened is that you get comp_rho and comp_dive dimension names, and while both them are equal to one, it shouldn't break. Are you saying they both get called comp_rho? That wouldn't be as intended, though I guess I also see why that'd break, since that dimension is length 1 no matter what it's called?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. Maybe the easiest solution would be to index by dimension number instead of dimension name, although that's marginally harder to read.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re. your second comment (which didn't exist when I was writing my reply to the first): I tried to explain in #21 , but yes, that's the gist of the problem. The dive and rho variables both happen to have a length-1 "component" dimension, so somewhere between adios2 and xarray, those dimension names default to "dim_1_1" and are assumed to be the same by xarray.

The error I was running into was that the line you highlighted above only occurs for the first variable, but doing it for each variable wouldn't help unless the dimension can be de-combined.

@JamesMcClung JamesMcClung force-pushed the rename-component-dim branch from 1142f77 to 002791c Compare March 25, 2025 15:40
@JamesMcClung JamesMcClung force-pushed the rename-component-dim branch from 002791c to bb6773b Compare April 14, 2025 16:34
@JamesMcClung JamesMcClung merged commit cbdcf1e into psc-code:main Apr 14, 2025
4 checks passed
@JamesMcClung JamesMcClung deleted the rename-component-dim branch April 14, 2025 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

decode_psc crashes for gauss output

2 participants