Skip to content

Conversation

@Andy-Jost
Copy link
Contributor

@Andy-Jost Andy-Jost commented Jan 22, 2026

Summary

  • Fix 44 Cython warnings about nogil placement (should appear after except+; no functional difference, just clears a warning saying this will be an error in future version of Cython)
  • Fix unreachable code warning in ManagedMemoryResource
  • Optimize C++ handle-passing functions to use const& to avoid atomic ref count operations

Changes

  1. Cython nogil placement (_resource_handles.pxd, _resource_handles.pyx):

    • Changed from func(...) nogil except+ to func(...) except+ nogil
    • This will be required in a future Cython version
  2. Dead code fix (_managed_memory_resource.pyx):

    • Moved super().__init__() inside the IF CUDA_CORE_BUILD_MAJOR >= 13 block
    • The ELSE branch raises an exception, making code after it unreachable
  3. C++ const& optimization (.hpp, .cpp, .pxd, .pyx):

    • Functions that accept shared_ptr handles now take them by const&
    • Avoids atomic increment/decrement on function entry/exit
    • Affected functions: create_stream_handle, create_event_handle, deviceptr_alloc_from_pool, deviceptr_alloc_async, deviceptr_import_ipc, set_deallocation_stream

Test plan

  • Build compiles cleanly with no warnings
  • CI tests pass

- Fix Cython nogil placement: move nogil after except+ (44 warnings)
- Fix unreachable code in ManagedMemoryResource (move super().__init__
  inside CUDA 13+ conditional block)
- Optimize C++ functions to accept handles by const& instead of by value,
  avoiding unnecessary atomic ref count operations on shared_ptr
@Andy-Jost Andy-Jost added enhancement Any code-related improvements cuda.core Everything related to the cuda.core module labels Jan 22, 2026
@Andy-Jost Andy-Jost self-assigned this Jan 22, 2026
@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Jan 22, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Andy-Jost
Copy link
Contributor Author

/ok to test 6686b20

@Andy-Jost Andy-Jost requested a review from mdboom January 22, 2026 22:19
@github-actions

This comment has been minimized.

cdef ContextHandle create_context_handle_ref(cydriver.CUcontext ctx) nogil except+
cdef ContextHandle get_primary_context(int device_id) nogil except+
cdef ContextHandle get_current_context() nogil except+
cdef ContextHandle create_context_handle_ref(cydriver.CUcontext ctx) except+ nogil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you include in the PR description why nogil except+ is different compared to except+ nogil?

Copy link
Collaborator

@rparolin rparolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just add a single comment asking for clarification on a fix.

@Andy-Jost Andy-Jost merged commit 34a3871 into NVIDIA:main Jan 25, 2026
86 checks passed
@Andy-Jost Andy-Jost deleted the fix-build-warnings branch January 25, 2026 16:59
@github-actions
Copy link

Doc Preview CI
Preview removed because the pull request was closed or merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cuda.core Everything related to the cuda.core module enhancement Any code-related improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants