-
Notifications
You must be signed in to change notification settings - Fork 203
Update alpaka to 2.1.1 "Fixed Transformation" [16.0.x] #10260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: IB/CMSSW_16_0_X/master
Are you sure you want to change the base?
Update alpaka to 2.1.1 "Fixed Transformation" [16.0.x] #10260
Conversation
This release implements a new interface, similar to std::transform, that simplifies writing asynchronous parallel algorithms across all back-ends. SYCL support is extended to NVIDIA and AMD GPUs. The release introduces unified memory and expands asynchronous memory allocation to buffers of any dimension. Interoperability with standard C++ is improved through std::span support: alpaka buffers expose a span interface, and any std::span can be used as an alpaka view. It adds compile-time warp-size definitions, extends atomic increment and decrement operations and fixes their behaviour on CPU back-end; it introduces a C++ concept for alpaka accelerators together with new type traits, along with many smaller fixes and improvements. The CI has been updated to test newer operating systems and compilers, including Clang 20 and ROCm 6.3, 6.4, and 7.0. The full list of changes is available in the ChangeLog.
|
enable gpu |
|
please test |
|
A new Pull Request was created by @fwyzard for branch IB/CMSSW_16_0_X/master. @akritkbehera, @iarspider, @raoatifshad, @smuzaffar can you please review it and eventually sign? Thanks.
|
|
cms-bot internal usage |
|
backport #10250 |
|
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-927186/50364/summary.html Comparison SummarySummary:
AMD_MI300X Comparison SummarySummary:
AMD_W7900 Comparison SummarySummary:
NVIDIA_H100 Comparison SummaryThere are some workflows for which there are errors in the baseline: Summary:
NVIDIA_L40S Comparison SummarySummary:
NVIDIA_T4 Comparison SummarySummary:
|
This release implements a new interface, similar to std::transform, that simplifies writing asynchronous parallel algorithms across all back-ends. SYCL support is extended to NVIDIA and AMD GPUs.
The release introduces unified memory and expands asynchronous memory allocation to buffers of any dimension. Interoperability with standard C++ is improved through std::span support: alpaka buffers expose a span interface, and any std::span can be used as an alpaka view. It adds compile-time warp-size definitions, extends atomic increment and decrement operations and fixes their behaviour on CPU back-end; it introduces a C++ concept for alpaka accelerators together with new type traits, along with many smaller fixes and improvements. The CI has been updated to test newer operating systems and compilers, including Clang 20 and ROCm 6.3, 6.4, and 7.0.
The full list of changes is available in the ChangeLog.