
Support Julia 1.13 #3020

Open
eschnett wants to merge 13 commits into JuliaGPU:master from eschnett:eschnett/julia-1.13

Conversation

@eschnett
Contributor

Closes #3019.


@github-actions bot left a comment

CUDA.jl Benchmarks

Benchmark suite | Current: cb74c7a | Previous: 7a27d77 | Ratio
latency/precompile 44381498055.5 ns 44455759835 ns 1.00
latency/ttfp 13144733912 ns 13140153243 ns 1.00
latency/import 3768995999 ns 3755312424 ns 1.00
integration/volumerhs 9445468 ns 9442840 ns 1.00
integration/byval/slices=1 145662 ns 145598 ns 1.00
integration/byval/slices=3 422646.5 ns 422554 ns 1.00
integration/byval/reference 143785 ns 143811 ns 1.00
integration/byval/slices=2 284112 ns 284011 ns 1.00
integration/cudadevrt 102551 ns 102397 ns 1.00
kernel/indexing 13367.5 ns 13434 ns 1.00
kernel/indexing_checked 14039 ns 13908 ns 1.01
kernel/occupancy 643.6987951807229 ns 644.5636363636364 ns 1.00
kernel/launch 2085.6 ns 2090.3 ns 1.00
kernel/rand 14817.5 ns 14479 ns 1.02
array/reverse/1d 18616 ns 18661 ns 1.00
array/reverse/2dL_inplace 66221 ns 66252 ns 1.00
array/reverse/1dL 68793 ns 68893 ns 1.00
array/reverse/2d 20790 ns 21087 ns 0.99
array/reverse/1d_inplace 10501 ns 10503.833333333332 ns 1.00
array/reverse/2d_inplace 10453 ns 11399.5 ns 0.92
array/reverse/2dL 72780 ns 73163 ns 0.99
array/reverse/1dL_inplace 66066 ns 66146 ns 1.00
array/copy 18142 ns 18502.5 ns 0.98
array/iteration/findall/int 145556.5 ns 146476.5 ns 0.99
array/iteration/findall/bool 130349 ns 130795 ns 1.00
array/iteration/findfirst/int 84137 ns 84133 ns 1.00
array/iteration/findfirst/bool 81322 ns 81624.5 ns 1.00
array/iteration/scalar 66718 ns 65804 ns 1.01
array/iteration/logical 195274.5 ns 198187.5 ns 0.99
array/iteration/findmin/1d 84764.5 ns 86504 ns 0.98
array/iteration/findmin/2d 116397 ns 117154 ns 0.99
array/reductions/reduce/Int64/1d 39356 ns 41088.5 ns 0.96
array/reductions/reduce/Int64/dims=1 51634 ns 52190.5 ns 0.99
array/reductions/reduce/Int64/dims=2 58848 ns 59179 ns 0.99
array/reductions/reduce/Int64/dims=1L 87073.5 ns 87126 ns 1.00
array/reductions/reduce/Int64/dims=2L 84606 ns 84418.5 ns 1.00
array/reductions/reduce/Float32/1d 33956 ns 34001 ns 1.00
array/reductions/reduce/Float32/dims=1 39729.5 ns 39890 ns 1.00
array/reductions/reduce/Float32/dims=2 56655.5 ns 55899 ns 1.01
array/reductions/reduce/Float32/dims=1L 51474 ns 51535 ns 1.00
array/reductions/reduce/Float32/dims=2L 69730 ns 69798 ns 1.00
array/reductions/mapreduce/Int64/1d 39196 ns 40980.5 ns 0.96
array/reductions/mapreduce/Int64/dims=1 41696.5 ns 41741 ns 1.00
array/reductions/mapreduce/Int64/dims=2 58701 ns 59036 ns 0.99
array/reductions/mapreduce/Int64/dims=1L 87052 ns 87134 ns 1.00
array/reductions/mapreduce/Int64/dims=2L 84512 ns 84427 ns 1.00
array/reductions/mapreduce/Float32/1d 33698 ns 33457 ns 1.01
array/reductions/mapreduce/Float32/dims=1 48904 ns 48711 ns 1.00
array/reductions/mapreduce/Float32/dims=2 55684 ns 55941 ns 1.00
array/reductions/mapreduce/Float32/dims=1L 51470 ns 51352 ns 1.00
array/reductions/mapreduce/Float32/dims=2L 68897.5 ns 68956 ns 1.00
array/broadcast 20174 ns 20251 ns 1.00
array/copyto!/gpu_to_gpu 10408.166666666668 ns 10684.333333333334 ns 0.97
array/copyto!/cpu_to_gpu 214860 ns 214898 ns 1.00
array/copyto!/gpu_to_cpu 285587 ns 281876 ns 1.01
array/accumulate/Int64/1d 118031 ns 118336 ns 1.00
array/accumulate/Int64/dims=1 79306.5 ns 79780 ns 0.99
array/accumulate/Int64/dims=2 155444.5 ns 155968.5 ns 1.00
array/accumulate/Int64/dims=1L 1694458 ns 1694089 ns 1.00
array/accumulate/Int64/dims=2L 960288 ns 960949 ns 1.00
array/accumulate/Float32/1d 100200 ns 100823 ns 0.99
array/accumulate/Float32/dims=1 75596 ns 76350 ns 0.99
array/accumulate/Float32/dims=2 143914 ns 144365 ns 1.00
array/accumulate/Float32/dims=1L 1584412.5 ns 1584729 ns 1.00
array/accumulate/Float32/dims=2L 656263 ns 656302 ns 1.00
array/construct 1274.4 ns 1283.1 ns 0.99
array/random/randn/Float32 36142.5 ns 36610 ns 0.99
array/random/randn!/Float32 26781.5 ns 30335 ns 0.88
array/random/rand!/Int64 34387 ns 26934 ns 1.28
array/random/rand!/Float32 8197.25 ns 8186.666666666667 ns 1.00
array/random/rand/Int64 29742 ns 30201.5 ns 0.98
array/random/rand/Float32 12222 ns 12396 ns 0.99
array/permutedims/4d 51717.5 ns 52729 ns 0.98
array/permutedims/2d 52258 ns 52645 ns 0.99
array/permutedims/3d 52636 ns 53080 ns 0.99
array/sorting/1d 2735795 ns 2736443 ns 1.00
array/sorting/by 3305363 ns 3305811 ns 1.00
array/sorting/2d 1066907 ns 1071655.5 ns 1.00
cuda/synchronization/stream/auto 974.15 ns 1034.5263157894738 ns 0.94
cuda/synchronization/stream/nonblocking 7304.9 ns 7705.9 ns 0.95
cuda/synchronization/stream/blocking 803.0625 ns 784.4516129032259 ns 1.02
cuda/synchronization/context/auto 1169.5 ns 1133.5 ns 1.03
cuda/synchronization/context/nonblocking 7610.5 ns 7594.6 ns 1.00
cuda/synchronization/context/blocking 930.6842105263158 ns 885.6792452830189 ns 1.05

This comment was automatically generated by workflow using github-action-benchmark.

@eschnett
Contributor Author

The self-tests fail because the linear algebra functions (e.g. the matrix exponential) as implemented in LinearAlgebra use scalar iteration. See e.g. exp! in https://github.com/JuliaLang/LinearAlgebra.jl/blob/f55e4736fb6dce08fee8a7ac7f0aba1f2b54838e/src/dense.jl#L784.

How should this be handled? Rewrite exp!? Find a corresponding CUDA library function to call and add a new method to exp? Fall back to the Julia 1.12 implementation? How does this work in Julia 1.12?
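
For context, the failure mode and one possible stopgap (the exp_via_cpu helper below is hypothetical, just to illustrate the fallback option, not an API that CUDA.jl provides):

```julia
using CUDA, LinearAlgebra

A = CUDA.rand(Float32, 32, 32)

# Scalar getindex/setindex! on a CuArray is disallowed by default in the test
# suite, so CPU-style element loops like the one in LinearAlgebra's exp!
# error out (or crawl, if scalar indexing is allowed).
CUDA.allowscalar(false)

# Possible stopgap: round-trip through the CPU for the dense matrix exponential.
exp_via_cpu(A::CuMatrix) = CuArray(exp(Matrix(A)))

B = exp_via_cpu(A)   # works, but copies the matrix to the host and back
```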

@eschnett
Contributor Author

I think it's JuliaGPU/GPUArrays.jl#679.

@eschnett
Contributor Author

eschnett commented Feb 3, 2026

The Buildkite error is:

  ptxas /tmp/jl_PALmvKnqta.ptx, line 226; error   : Modifier '.NaN' requires .target sm_80 or higher
  ptxas /tmp/jl_PALmvKnqta.ptx, line 226; error   : Feature 'min.f16 or min.f16x2' requires .target sm_80 or higher

This seems unrelated to my changes, except that I am now running CI tests on Julia 1.12 and Julia 1.13...
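
As a quick sanity check (just a diagnostic sketch, not a fix), this is the kind of thing one can run to compare what the CI device supports against the sm_80 requirement in the error:

```julia
using CUDA

# min.NaN and min.f16/min.f16x2 in PTX require compute capability 8.0 (sm_80)
# or newer; print what the current device actually supports.
dev = CUDA.device()
@info "CUDA device" name = CUDA.name(dev) capability = CUDA.capability(dev)
```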

@maleadt
Member

maleadt commented Feb 4, 2026

I guess #3025 needs to be active for all LLVM versions.

@eschnett
Contributor Author

eschnett commented Feb 4, 2026

Good news: CUDA.jl now works for Julia 1.12.
Bad news: There's an LLVM segfault for Julia 1.13.

      From worker 5:	[271397] signal 11 (1): Segmentation fault
      From worker 5:	in expression starting at /var/lib/buildkite-agent/builds/gpuci-9/julialang/cuda-dot-jl/test/base/texture.jl:41
      From worker 5:	_ZN12_GLOBAL__N_124NVPTXReplaceImageHandles18findIndexForHandleERN4llvm14MachineOperandERNS1_15MachineFunctionERj.isra.0 at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
      From worker 5:	_ZN12_GLOBAL__N_124NVPTXReplaceImageHandles18findIndexForHandleERN4llvm14MachineOperandERNS1_15MachineFunctionERj.isra.0 at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
      From worker 5:	_ZN12_GLOBAL__N_124NVPTXReplaceImageHandles20runOnMachineFunctionERN4llvm15MachineFunctionE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
      From worker 5:	_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE.part.0 at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
      From worker 5:	_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
      From worker 5:	_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
      From worker 5:	_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)
      From worker 5:	_ZL21LLVMTargetMachineEmitP23LLVMOpaqueTargetMachineP16LLVMOpaqueModuleRN4llvm17raw_pwrite_streamE19LLVMCodeGenFileTypePPc at /root/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.13/julia-1.13-latest-linux-x86_64/bin/../lib/julia/libLLVM.so.20.1jl (unknown line)

@eschnett
Contributor Author

eschnett commented Feb 4, 2026

I think it's texture interpolation that is broken on 1.13. This line segfaults LLVM:

dst[i] = texture[u]

in test/base/texture.jl (function kernel_texture_warp_native).
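
For reference, a self-contained sketch along the lines of that test (my reconstruction, not the exact test code; the CuTextureArray/CuTexture setup and kernel body here are assumptions based on the failing line):

```julia
using CUDA

# 1D texture fetch with normalized coordinates; the `texture[u]` read is what
# makes LLVM's NVPTXReplaceImageHandles pass segfault on Julia 1.13.
function kernel_texture_warp(dst, texture)
    i = threadIdx().x
    @inbounds begin
        u = (Float32(i) - 0.5f0) / Float32(length(dst))
        dst[i] = texture[u]
    end
    return nothing
end

src = CuArray(collect(Float32, 1:32))
texarr = CuTextureArray(src)
tex = CuTexture(texarr; normalized_coordinates = true)
dst = CUDA.zeros(Float32, length(src))
@cuda threads = length(dst) kernel_texture_warp(dst, tex)
```

Per the backtrace above, the crash happens during PTX code generation (NVPTXReplaceImageHandles), before the kernel ever launches.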

@eschnett
Contributor Author

eschnett commented Feb 4, 2026

We will need to update KernelAbstractions.jl as well; see JuliaGPU/KernelAbstractions.jl#679.

@codecov

codecov bot commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.48%. Comparing base (7a27d77) to head (cb74c7a).

Files with missing lines Patch % Lines
lib/nvml/NVML.jl 50.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3020      +/-   ##
==========================================
+ Coverage   89.46%   89.48%   +0.01%     
==========================================
  Files         148      148              
  Lines       13047    13044       -3     
==========================================
- Hits        11673    11672       -1     
+ Misses       1374     1372       -2     

☔ View full report in Codecov by Sentry.

@eschnett
Contributor Author

All green!
