CUDA.jl Benchmarks
| Benchmark suite | Current: cb74c7a | Previous: 7a27d77 | Ratio |
|---|---|---|---|
| latency/precompile | 44381498055.5 ns | 44455759835 ns | 1.00 |
| latency/ttfp | 13144733912 ns | 13140153243 ns | 1.00 |
| latency/import | 3768995999 ns | 3755312424 ns | 1.00 |
| integration/volumerhs | 9445468 ns | 9442840 ns | 1.00 |
| integration/byval/slices=1 | 145662 ns | 145598 ns | 1.00 |
| integration/byval/slices=3 | 422646.5 ns | 422554 ns | 1.00 |
| integration/byval/reference | 143785 ns | 143811 ns | 1.00 |
| integration/byval/slices=2 | 284112 ns | 284011 ns | 1.00 |
| integration/cudadevrt | 102551 ns | 102397 ns | 1.00 |
| kernel/indexing | 13367.5 ns | 13434 ns | 1.00 |
| kernel/indexing_checked | 14039 ns | 13908 ns | 1.01 |
| kernel/occupancy | 643.6987951807229 ns | 644.5636363636364 ns | 1.00 |
| kernel/launch | 2085.6 ns | 2090.3 ns | 1.00 |
| kernel/rand | 14817.5 ns | 14479 ns | 1.02 |
| array/reverse/1d | 18616 ns | 18661 ns | 1.00 |
| array/reverse/2dL_inplace | 66221 ns | 66252 ns | 1.00 |
| array/reverse/1dL | 68793 ns | 68893 ns | 1.00 |
| array/reverse/2d | 20790 ns | 21087 ns | 0.99 |
| array/reverse/1d_inplace | 10501 ns | 10503.833333333332 ns | 1.00 |
| array/reverse/2d_inplace | 10453 ns | 11399.5 ns | 0.92 |
| array/reverse/2dL | 72780 ns | 73163 ns | 0.99 |
| array/reverse/1dL_inplace | 66066 ns | 66146 ns | 1.00 |
| array/copy | 18142 ns | 18502.5 ns | 0.98 |
| array/iteration/findall/int | 145556.5 ns | 146476.5 ns | 0.99 |
| array/iteration/findall/bool | 130349 ns | 130795 ns | 1.00 |
| array/iteration/findfirst/int | 84137 ns | 84133 ns | 1.00 |
| array/iteration/findfirst/bool | 81322 ns | 81624.5 ns | 1.00 |
| array/iteration/scalar | 66718 ns | 65804 ns | 1.01 |
| array/iteration/logical | 195274.5 ns | 198187.5 ns | 0.99 |
| array/iteration/findmin/1d | 84764.5 ns | 86504 ns | 0.98 |
| array/iteration/findmin/2d | 116397 ns | 117154 ns | 0.99 |
| array/reductions/reduce/Int64/1d | 39356 ns | 41088.5 ns | 0.96 |
| array/reductions/reduce/Int64/dims=1 | 51634 ns | 52190.5 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2 | 58848 ns | 59179 ns | 0.99 |
| array/reductions/reduce/Int64/dims=1L | 87073.5 ns | 87126 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84606 ns | 84418.5 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 33956 ns | 34001 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1 | 39729.5 ns | 39890 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 56655.5 ns | 55899 ns | 1.01 |
| array/reductions/reduce/Float32/dims=1L | 51474 ns | 51535 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 69730 ns | 69798 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 39196 ns | 40980.5 ns | 0.96 |
| array/reductions/mapreduce/Int64/dims=1 | 41696.5 ns | 41741 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 58701 ns | 59036 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1L | 87052 ns | 87134 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 84512 ns | 84427 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 33698 ns | 33457 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=1 | 48904 ns | 48711 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 55684 ns | 55941 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 51470 ns | 51352 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 68897.5 ns | 68956 ns | 1.00 |
| array/broadcast | 20174 ns | 20251 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 10408.166666666668 ns | 10684.333333333334 ns | 0.97 |
| array/copyto!/cpu_to_gpu | 214860 ns | 214898 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 285587 ns | 281876 ns | 1.01 |
| array/accumulate/Int64/1d | 118031 ns | 118336 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 79306.5 ns | 79780 ns | 0.99 |
| array/accumulate/Int64/dims=2 | 155444.5 ns | 155968.5 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1694458 ns | 1694089 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 960288 ns | 960949 ns | 1.00 |
| array/accumulate/Float32/1d | 100200 ns | 100823 ns | 0.99 |
| array/accumulate/Float32/dims=1 | 75596 ns | 76350 ns | 0.99 |
| array/accumulate/Float32/dims=2 | 143914 ns | 144365 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1584412.5 ns | 1584729 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 656263 ns | 656302 ns | 1.00 |
| array/construct | 1274.4 ns | 1283.1 ns | 0.99 |
| array/random/randn/Float32 | 36142.5 ns | 36610 ns | 0.99 |
| array/random/randn!/Float32 | 26781.5 ns | 30335 ns | 0.88 |
| array/random/rand!/Int64 | 34387 ns | 26934 ns | 1.28 |
| array/random/rand!/Float32 | 8197.25 ns | 8186.666666666667 ns | 1.00 |
| array/random/rand/Int64 | 29742 ns | 30201.5 ns | 0.98 |
| array/random/rand/Float32 | 12222 ns | 12396 ns | 0.99 |
| array/permutedims/4d | 51717.5 ns | 52729 ns | 0.98 |
| array/permutedims/2d | 52258 ns | 52645 ns | 0.99 |
| array/permutedims/3d | 52636 ns | 53080 ns | 0.99 |
| array/sorting/1d | 2735795 ns | 2736443 ns | 1.00 |
| array/sorting/by | 3305363 ns | 3305811 ns | 1.00 |
| array/sorting/2d | 1066907 ns | 1071655.5 ns | 1.00 |
| cuda/synchronization/stream/auto | 974.15 ns | 1034.5263157894738 ns | 0.94 |
| cuda/synchronization/stream/nonblocking | 7304.9 ns | 7705.9 ns | 0.95 |
| cuda/synchronization/stream/blocking | 803.0625 ns | 784.4516129032259 ns | 1.02 |
| cuda/synchronization/context/auto | 1169.5 ns | 1133.5 ns | 1.03 |
| cuda/synchronization/context/nonblocking | 7610.5 ns | 7594.6 ns | 1.00 |
| cuda/synchronization/context/blocking | 930.6842105263158 ns | 885.6792452830189 ns | 1.05 |
This comment was automatically generated by workflow using github-action-benchmark.
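For readers checking the numbers by hand: the Ratio column appears to be the current timing divided by the previous one, rounded to two decimals, so values above 1.00 mean the current commit is slower. The sketch below recomputes it for three rows copied from the table. The helper names and the 10% cutoff are hypothetical choices for illustration, not settings taken from the benchmark workflow.

```python
# Sketch: recompute the "Ratio" column as current / previous (both in ns).
# The 0.10 tolerance is a hypothetical cutoff for illustration only; it is
# not a setting read from the CUDA.jl benchmark workflow.

def ratio(current_ns: float, previous_ns: float) -> float:
    """Return current/previous rounded to two decimals, as shown in the table."""
    return round(current_ns / previous_ns, 2)


def classify(r: float, tolerance: float = 0.10) -> str:
    """Label a ratio as slower, faster, or unchanged within the tolerance."""
    if r > 1 + tolerance:
        return "slower"
    if r < 1 - tolerance:
        return "faster"
    return "unchanged"


# (current, previous) timings copied from the benchmark table above.
samples = {
    "array/random/rand!/Int64": (34387, 26934),
    "array/random/randn!/Float32": (26781.5, 30335),
    "array/reverse/2d_inplace": (10453, 11399.5),
}

for name, (current, previous) in samples.items():
    r = ratio(current, previous)
    print(f"{name}: ratio={r:.2f} ({classify(r)})")
```

With a cutoff this loose, only the two `array/random` rows stand out. Whether they reflect a real change or run-to-run noise is not something the table alone can settle.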

The self-tests fail because the linear algebra functions (e.g. matrix exponential) as implemented in How should this be handled? Rewrite

I think it's JuliaGPU/GPUArrays.jl#679.

The Buildkite error is This seems unrelated to my changes, except that I am now running CI tests on Julia 1.12 and Julia 1.13...

I guess #3025 needs to be active for all LLVM versions.

Good news: CUDA.jl now works for Julia 1.12.

I think it's texture interpolation that is broken on 1.13. This line segfaults LLVM: in

We will need to update KernelAbstractions.jl as well; see JuliaGPU/KernelAbstractions.jl#679.

Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #3020 +/- ##
==========================================
+ Coverage 89.46% 89.48% +0.01%
==========================================
Files 148 148
Lines 13047 13044 -3
==========================================
- Hits 11673 11672 -1
+ Misses 1374 1372 -2

All green!

Closes #3019.