Skip to content

Conversation

@philip-paul-mueller
Copy link
Contributor

@philip-paul-mueller philip-paul-mueller commented Mar 17, 2025

Due to the update to DaCe in PR#1976 the code generator is now able to handle more Memlets directly as Cuda memcpy() calls. Thus we have to change this function to reflect this case.
It even stands to reason that these functions can be removed altogether.

Note that the merge has become possible because of our switch to our DaCe fork in PR#2012.

Once [PR#1976](spcl/dace#1976) is merged in DaCe the code generator is able to handle more Memlets directly as Cuda `memcpy()` calls.
This PR modifies the GPU transformation of GT4Py in such a way that these Memlets are no longer transformed into Maps.

However, it can only be merged if the DaCe dependency was bumped to a version that includes PR#1976!
Copy link
Contributor

@edopao edopao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GPU tests pass, although we are on a dace baseline that does not include the PR mentioned in the PR description. I suspect this is because gt4py (and icon4py as well) use symbolic field strides, by default. If this is the case, is it safe to merge this PR?

philip-paul-mueller added a commit to philip-paul-mueller/gt4py that referenced this pull request Apr 28, 2025
We have to expand some Memlets into copy Maps, because there is no suitable call to `cudaMemcpy*`, because we have to set their order right. But this is a gross hack, it should be solved once [PR#1913](GridTools#1913) has been merged, but for that DaCe [PR#1976](spcl/dace#1976) has to be merged.

The error was that there was the assumption that at least one of these newly created Maps survived.
However, this does not seem to be the case, depending in which order MapFusion is applied.
Thus the assert was removed.
We now assume that if we do not find a newly created MapEntry that is was integrated into a Map that was already present, we further assume that the Map parameter of that Map were used and that they are correct.
philip-paul-mueller added a commit that referenced this pull request Apr 29, 2025
We have to expand some Memlets into copy Maps, because there is no
suitable call to `cudaMemcpy*`, because we have to set their order
right. But this is a gross hack, it should be solved once
[PR#1913](#1913) has been merged,
but for that DaCe [PR#1976](spcl/dace#1976) has
to be merged.

The error was that there was the assumption that at least one of these
newly created Maps survived. However, this does not seem to be the case,
depending in which order MapFusion is applied. Thus the assert was
removed.
We now assume that if we do not find a newly created MapEntry that is
was integrated into a Map that was already present, we further assume
that the Map parameter of that Map were used and that they are correct.
@edopao
Copy link
Contributor

edopao commented May 2, 2025

This PR can merged, since the dace source now points to the GridTools fork repo.

@philip-paul-mueller philip-paul-mueller merged commit 95cffcf into GridTools:main May 7, 2025
31 checks passed
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request Aug 19, 2025
…2004)

We have to expand some Memlets into copy Maps, because there is no
suitable call to `cudaMemcpy*`, because we have to set their order
right. But this is a gross hack, it should be solved once
[PR#1913](GridTools#1913) has been merged,
but for that DaCe [PR#1976](spcl/dace#1976) has
to be merged.

The error was that there was the assumption that at least one of these
newly created Maps survived. However, this does not seem to be the case,
depending in which order MapFusion is applied. Thus the assert was
removed.
We now assume that if we do not find a newly created MapEntry that is
was integrated into a Map that was already present, we further assume
that the Map parameter of that Map were used and that they are correct.
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request Aug 19, 2025
…ridTools#1913)

Due to the update to DaCe in
[PR#1976](spcl/dace#1976) the code generator is
now able to handle more Memlets directly as Cuda `memcpy()` calls. Thus
we have to change this function to reflect this case.
It even stands to reason that these functions can be removed altogether.

Note that the merge has become possible because of our switch to our
DaCe fork in [PR#2012](GridTools#2012).
@philip-paul-mueller philip-paul-mueller deleted the updated-gpu-trafo branch October 9, 2025 05:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants