Skip to content

Conversation

@philip-paul-mueller
Copy link
Contributor

We have to expand some Memlets into copy Maps, because there is no suitable call to cudaMemcpy*, because we have to set their order right. But this is a gross hack, it should be solved once PR#1913 has been merged, but for that DaCe PR#1976 has to be merged.

The error was that there was the assumption that at least one of these newly created Maps survived. However, this does not seem to be the case, depending in which order MapFusion is applied. Thus the assert was removed.
We now assume that if we do not find a newly created MapEntry that is was integrated into a Map that was already present, we further assume that the Map parameter of that Map were used and that they are correct.

We have to expand some Memlets into copy Maps, because there is no suitable call to `cudaMemcpy*`, because we have to set their order right. But this is a gross hack, it should be solved once [PR#1913](GridTools#1913) has been merged, but for that DaCe [PR#1976](spcl/dace#1976) has to be merged.

The error was that there was the assumption that at least one of these newly created Maps survived.
However, this does not seem to be the case, depending in which order MapFusion is applied.
Thus the assert was removed.
We now assume that if we do not find a newly created MapEntry that is was integrated into a Map that was already present, we further assume that the Map parameter of that Map were used and that they are correct.
Copy link
Contributor

@edopao edopao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested, it works.

@philip-paul-mueller philip-paul-mueller merged commit b2df05c into GridTools:main Apr 29, 2025
31 checks passed
philip-paul-mueller added a commit that referenced this pull request Jun 11, 2025
If the DaCe GPU code generator encounters a Memlet that can not be
expressed as a `cudaMemcpy*()` call, then it converts it to a Map.
However, the issue is that these Maps have the wrong iteration order,
i.e. wrong memory access pattern and it might not even launch, because
of too many blocks in the `y` direction of the compute grid.
For this reason GT4Py has to handle these Memlets explicitly.

However, [DaCe PR#1976](spcl/dace#1976) changed
this slightly and thus GT4Py had to follow.
Note, that some of these changes were already introduced by [GT4Py
PR#2004](#2004), however, they
were made for the original version of the DaCe PR (and the GT4Py PR had
to be merged before the DaCe PR was merged).

Furthermore, this PR fixes a different issue, also related to the
expansion of Memlets, which can be found in [DaCe
PR#2033](spcl/dace#2033) (it is not yet merged
and currently at commit `19b6bba`).
It fixes a bug in how the Memlets are expanded.
The DaCe PR also adds the possibility to generate an error instead of
slightly converting Memlets into Maps and this PR enables this feature.

---------

Co-authored-by: edopao <edoardo.paone@cscs.ch>
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request Aug 19, 2025
…2004)

We have to expand some Memlets into copy Maps, because there is no
suitable call to `cudaMemcpy*`, because we have to set their order
right. But this is a gross hack, it should be solved once
[PR#1913](GridTools#1913) has been merged,
but for that DaCe [PR#1976](spcl/dace#1976) has
to be merged.

The error was that there was the assumption that at least one of these
newly created Maps survived. However, this does not seem to be the case,
depending in which order MapFusion is applied. Thus the assert was
removed.
We now assume that if we do not find a newly created MapEntry that is
was integrated into a Map that was already present, we further assume
that the Map parameter of that Map were used and that they are correct.
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request Aug 19, 2025
…#2070)

If the DaCe GPU code generator encounters a Memlet that can not be
expressed as a `cudaMemcpy*()` call, then it converts it to a Map.
However, the issue is that these Maps have the wrong iteration order,
i.e. wrong memory access pattern and it might not even launch, because
of too many blocks in the `y` direction of the compute grid.
For this reason GT4Py has to handle these Memlets explicitly.

However, [DaCe PR#1976](spcl/dace#1976) changed
this slightly and thus GT4Py had to follow.
Note, that some of these changes were already introduced by [GT4Py
PR#2004](GridTools#2004), however, they
were made for the original version of the DaCe PR (and the GT4Py PR had
to be merged before the DaCe PR was merged).

Furthermore, this PR fixes a different issue, also related to the
expansion of Memlets, which can be found in [DaCe
PR#2033](spcl/dace#2033) (it is not yet merged
and currently at commit `19b6bba`).
It fixes a bug in how the Memlets are expanded.
The DaCe PR also adds the possibility to generate an error instead of
slightly converting Memlets into Maps and this PR enables this feature.

---------

Co-authored-by: edopao <edoardo.paone@cscs.ch>
@philip-paul-mueller philip-paul-mueller deleted the fixed_some_memlet_expansion_bug branch October 9, 2025 05:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants