Merged
6 changes: 3 additions & 3 deletions src/pscpy/psc.py
@@ -111,7 +111,7 @@ def decode_psc(
 ds = ds.rename_dims(
     {
         da.dims[0]: "step",
-        da.dims[1]: f"comp_{da.name}",
Contributor

So I think I chose this somewhat ugly naming to avoid problems with uniqueness. Specifically, if a dataset has both fields and moments, one of them may have 9 components and the other 26 (or whatever). If the dimension is called "components" in both cases, xarray will complain, since a given dimension name must have the same length everywhere in a dataset.
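
The uniqueness constraint described above can be reproduced with a minimal sketch. The variable names, array shapes, and the `build` helper below are hypothetical and are not taken from pscpy; they only illustrate how xarray reacts when two variables share a dimension name but disagree on its length.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins: one variable with 9 components, one with 26.
fields = np.zeros((9, 4))
moments = np.zeros((26, 4))


def build(shared_name: bool) -> xr.Dataset:
    """Build a dataset using either one shared or two distinct component dims."""
    fields_dim = "component" if shared_name else "comp_fields"
    moments_dim = "component" if shared_name else "comp_moments"
    return xr.Dataset(
        {
            "fields": ((fields_dim, "x"), fields),
            "moments": ((moments_dim, "x"), moments),
        }
    )


# Same dimension name with lengths 9 and 26: xarray rejects the dataset.
try:
    build(shared_name=True)
    conflict_raised = False
except ValueError:
    conflict_raised = True

# Per-variable dimension names avoid the clash.
ds_ok = build(shared_name=False)
```

With per-variable names such as `comp_fields` and `comp_moments`, each component dimension keeps its own length, which is the motivation for the `comp_{da.name}` pattern.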

Contributor

I don't think I understand exactly how things went wrong for you, but I gather it's related, if not somehow the opposite. What I thought should have happened is that you get comp_rho and comp_dive dimension names, and while both of them have length one, it shouldn't break. Are you saying they both get called comp_rho? That wouldn't be as intended, though I guess I also see why that'd break, since that dimension is length 1 no matter what it's called?

Collaborator Author

Ah, that makes sense. Maybe the easiest solution would be to index by dimension number instead of dimension name, although that's marginally harder to read.
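
As a rough illustration of that trade-off, the sketch below selects the same component by name and by position. The array, its shape, and the `comp_jeh` dimension name are made up for the example and are not taken from pscpy.

```python
import numpy as np
import xarray as xr

# Hypothetical 4-D variable: (component, z, y, x).
da = xr.DataArray(
    np.arange(3 * 2 * 2 * 2).reshape(3, 2, 2, 2),
    dims=("comp_jeh", "z", "y", "x"),
    name="jeh",
)

# By name: readable, but breaks if the dimension ends up with another name.
by_name = da.isel({"comp_jeh": 1})

# By position: independent of the name, but relies on the axis order.
by_position = da[1, :, :, :]

# Middle ground: look the name up by axis number, then select by name.
by_lookup = da.isel({da.dims[0]: 1})
```

All three selections return the same `(z, y, x)` slice; they differ only in which assumption they depend on, the dimension's name or its axis position.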

Collaborator Author

Re: your second comment (which didn't exist when I was writing my reply to the first): I tried to explain in #21, but yes, that's the gist of the problem. The dive and rho variables both happen to have a length-1 "component" dimension, so somewhere between adios2 and xarray, those dimension names both default to "dim_1_1" and xarray treats them as the same dimension.

The problem I was running into was that the line you highlighted above is only applied for the first variable, but doing it for each variable wouldn't help unless the shared dimension can be de-combined.
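
A small sketch of why the shared name is a problem. The dataset below is built by hand, on the assumption that both variables arrive with a common "dim_1_1" dimension; it does not reproduce how adios2 or pscpy actually generate that name.

```python
import numpy as np
import xarray as xr

# Assumed layout: both variables share the default "dim_1_1" dimension.
shared_dims = ("step", "dim_1_1", "z")
ds = xr.Dataset(
    {
        "rho": (shared_dims, np.zeros((1, 1, 3))),
        "dive": (shared_dims, np.ones((1, 1, 3))),
    }
)

# Renaming the dimension on behalf of rho renames it for dive as well,
# because a dataset has only one dimension with that name.
renamed = ds.rename_dims({"dim_1_1": "comp_rho"})
rho_dims = renamed["rho"].dims
dive_dims = renamed["dive"].dims

# A lookup by "comp_dive" would now fail, while positional indexing of the
# component axis does not depend on what that axis is called.
dive_component = renamed["dive"][:, 0, :]
```

Here `dive` ends up with a `comp_rho` dimension and no `comp_dive`, which is why selecting by position sidesteps the naming issue.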

+        # dims[1] is the "component" dimension, which gets removed later
         da.dims[2]: "z",
         da.dims[3]: "y",
         da.dims[4]: "x",
@@ -123,8 +123,8 @@ def decode_psc(
 data_vars = {}
 for var_name in ds:
     if var_name in field_to_component:
-        for field, component in field_to_component[var_name].items():  # type: ignore[index]
-            data_vars[field] = ds[var_name].isel({f"comp_{var_name}": component})
+        for field, component_idx in field_to_component[var_name].items():  # type: ignore[index]
+            data_vars[field] = ds[var_name][component_idx, :, :, :]
         ds = ds.drop_vars([var_name])
 ds = ds.assign(data_vars)
