Implement xarray-like labeled tensors and semantics #1411
Merged
ricardoV94 merged 18 commits into main on Jun 21, 2025
Conversation
Codecov Report (patch coverage flagged ❌)
@@ Coverage Diff @@
## main #1411 +/- ##
==========================================
- Coverage 82.01% 81.99% -0.03%
==========================================
Files 214 231 +17
Lines 50439 52173 +1734
Branches 8907 9178 +271
==========================================
+ Hits 41370 42779 +1409
- Misses 6861 7088 +227
- Partials 2208 2306 +98
Member (Author)
If anybody wants to fix mypy, that's very welcome :)
twiecki reviewed on Jun 4, 2025
@@ -0,0 +1,219 @@
# HERE LIE DRAGONS
Member (Author)
@lucianopaz can we bring your pre-commit hook over?
Member (Author)
Can be a separate PR; wouldn't be surprised if we have files missing it in main.
Strategy
We implement xarray-like dummy Ops that respect / propagate dims semantics, and lower them to regular PyTensor graphs with rewrites.
Note that in the example above the dummy TensorFromXtensor and XTensorFromTensor Ops remain in the final graph. If we had instead created a function with Tensor inputs and outputs that are only converted (symbolically) to and from xtensor, respectively, the final graph would show no signs of the dimension operations, other than in how it was constructed.
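The example referenced above is not shown here, but a minimal sketch of the kind of graph being described might look like the following. The `xtensor` constructor and the ability to compile a function directly from xtensor inputs/outputs are assumptions about the API this PR introduces:

```python
import pytensor
from pytensor.xtensor import xtensor  # assumed import path for the labeled tensor constructor

# Symbolic labeled tensors: broadcasting and alignment are driven by dim names
x = xtensor("x", dims=("city", "year"))
y = xtensor("y", dims=("city",))

# Dim-aware elementwise addition; the result has dims ("city", "year")
out = x + y

# Compiling directly from xtensor inputs/outputs leaves the dummy conversion Ops
# (TensorFromXtensor / XTensorFromTensor) at the boundaries of the lowered graph
fn = pytensor.function([x, y], out)
pytensor.dprint(fn)
```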
I suggest registering those rewrites in an xtensor_lowering database.
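A rough sketch of what such a database could look like, built on the existing rewrite-database machinery; the name, tags, and position used here are placeholders, not necessarily what the PR registers:

```python
from pytensor.compile.mode import optdb
from pytensor.graph.rewriting.db import EquilibriumDB

# Dedicated database collecting all xtensor -> tensor lowering rewrites
xtensor_lowering = EquilibriumDB()

# Run it early so later tensor-level rewrites only ever see lowered graphs
# (the tags and position are guesses)
optdb.register("xtensor_lowering", xtensor_lowering, "fast_run", "fast_compile", position=0.1)


def register_xtensor_lowering(rewrite, name=None):
    """Convenience decorator so each lowering rewrite can self-register."""
    xtensor_lowering.register(name or rewrite.__name__, rewrite, "basic")
    return rewrite
```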
Coordinates
For now I'm playing with how far we can get without coordinates. This means the graphs produced by an xarray-like syntax are much more amenable to the numpy-like backend of PyTensor. Otherwise it involves a lot of Pandas-like stuff (e.g., MultiIndex) that we don't really have. It may be feasible, especially if nothing is symbolic, but I fear a rabbit hole of edge cases.
Gradients
These Ops are currently not differentiable, but one can lower the graph and then call the gradient. I do want to try the lazy grad approach from #788.
Help implementing more Ops so we have an MVP to try out with PyMC next. We need some Ops:
Open a PR on top of this branch and I'll try to merge quickly! Try to make it clean (one commit per Op, unless it's a factory of related Ops).
Implementing means:
3.1 The rewrites "box" the lower-level tensor operations between TensorFromXTensor and XTensorFromTensor calls, so that the replacements are valid in terms of types (see the sketch below). There are rewrites to remove chains of useless TensorFromXTensor/XTensorFromTensor that should clean up everything in the middle of the graph.
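A hypothetical sketch of that boxing pattern for an imaginary dims-aware `double(x)` Op. The conversion helpers are assumed to live in `pytensor.xtensor.basic`, mirroring the Ops named above; in practice the function would be wrapped with `node_rewriter` and registered in the lowering database:

```python
from pytensor.xtensor.basic import tensor_from_xtensor, xtensor_from_tensor  # assumed path


def lower_double(fgraph, node):
    """Lower an imaginary dims-aware `double(x)` Op to plain tensor operations."""
    [x] = node.inputs
    [old_out] = node.outputs

    # Unbox: strip the dims and work on the underlying tensor
    x_tensor = tensor_from_xtensor(x)
    out_tensor = 2 * x_tensor

    # Rebox: reattach the dims so the replacement matches the original XTensorType
    new_out = xtensor_from_tensor(out_tensor, dims=old_out.type.dims)
    return [new_out]
```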
Interplay between XTensorTypes and TensorTypes / weakly typed inputs
__add__ and the like, so you can do x + x)
Meta Ops
math.switch probably can't support drop=True)
time dim to the outputs, and perhaps use that to also align the sequences)
Math stuff
Shape stuff
Array creation stuff
self.x * 0, self.x * 0 + 1)? PyTensor will do the right thing when it gets lowered)
Indexing stuff
__getitem__ + isel (Implement indexing operations for XTensorVariables #1429)
__getitem__ + isel for boolean indices (should work fine, just need to test and lift the raise error)
It probably makes sense to convert the non-XTensor indices to XTensor indices if they can be rendered equivalent, to reduce the logic needed.
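For reference, a sketch of the xarray-style selection these items target; whether XTensorVariable exposes exactly these methods at this point is an assumption:

```python
from pytensor.xtensor import xtensor  # assumed import path

x = xtensor("x", dims=("time", "city"))

# Positional, dim-aware selection with xarray's isel semantics
a = x.isel(time=0)                # scalar index drops the "time" dim
b = x.isel(time=slice(None, 10))  # slice keeps a (length-10) "time" dim
c = x.isel({"city": [0, 2]})      # list indexers select along the named dim

# __getitem__ is meant to mirror this, e.g. x[0] indexing the leading "time" dim
```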
RandomVariables
This is quite important, as we'll need those for PyMC models! They are a mix of blockwise + a size argument (which may or may not be redundant).
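To make the "size may or may not be redundant" point concrete at today's tensor level (no xtensor involved), using the existing RandomVariable interface:

```python
import pytensor.tensor as pt

mu = pt.tensor("mu", shape=(3,))

# size omitted: the draw's shape follows the broadcast parameters -> (3,)
x1 = pt.random.normal(mu)

# explicit but redundant size: the same (3,) draw as above
x2 = pt.random.normal(mu, size=(3,))

# non-redundant size: independent replicates beyond the parameters' shape -> (10, 3)
x3 = pt.random.normal(mu, size=(10, 3))
```

A dims-aware wrapper would presumably let parameter dims and requested output dims play the roles that broadcasting and size play here.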
Graph transformations
📚 Documentation preview 📚: https://pytensor--1411.org.readthedocs.build/en/1411/