-
Notifications
You must be signed in to change notification settings - Fork 242
cuda.core.system: More basic device-related APIs #1462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR extends the cuda.core.system.Device class with additional device-related APIs. It adds new ways to initialize Device objects (via PCI bus ID or handle), new class methods for device enumeration, and new properties to query device metadata.
Changes:
- Extended
Device.__init__to support initialization viapci_bus_idorhandleparameters, in addition to existingindexanduuidoptions, with keyword-only parameters enforced - Added
get_device_count()class method,get_all_devices_with_cpu_affinity()class method, andindexandmodule_idproperties - Removed the public
handleproperty (keeping_handleas internal implementation detail)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| cuda_core/cuda/core/system/_device.pyx | Extended Device initialization options, added new class methods and properties, removed public handle property |
| cuda_core/tests/system/test_system_device.py | Removed test for handle property, added tests for new get_all_devices_with_cpu_affinity, index, and module_id features |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for device in system.Device.get_all_devices_with_cpu_affinity(i): | ||
| affinity = device.cpu_affinity | ||
| assert isinstance(affinity, list) | ||
| assert {i} == set(affinity) |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion {i} == set(affinity) expects the device's cpu_affinity to contain exactly and only the CPU index i. However, based on the cpu_affinity property documentation and example (which shows multiple CPU indices), a device's CPU affinity typically includes multiple CPUs in the same NUMA node. The assertion should be i in affinity instead to check that the CPU is included in the affinity, rather than requiring it to be the only CPU.
| assert {i} == set(affinity) | |
| assert i in affinity |
| def __init__( | ||
| self, | ||
| *, | ||
| index: int | None = None, | ||
| uuid: bytes | str | None = None, | ||
| pci_bus_id: bytes | str | None = None, | ||
| handle: int | None = None | ||
| ): |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of the * parameter to make all Device constructor parameters keyword-only is a breaking API change. Existing code that calls Device(0) positionally will now fail with "TypeError: Device() takes 0 positional arguments but 1 was given". This should be documented in the migration guide or release notes, or the * should be removed to maintain backward compatibility.
| def __init__( | ||
| self, | ||
| *, | ||
| index: int | None = None, | ||
| uuid: bytes | str | None = None, | ||
| pci_bus_id: bytes | str | None = None, | ||
| handle: int | None = None | ||
| ): |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Device class docstring (lines 313-331) needs to be updated to document the new pci_bus_id and handle parameters. The current docstring only documents index and uuid, and the Raises section describes the old validation behavior. Update the Parameters section to include all four options and update the Raises section to reflect the new error messages.
| raise ValueError("Handle requires only one of either device `index`, `uuid`, `pci_bus_id` or `handle`.") | ||
| if arg_count == 0: | ||
| raise ValueError("Handle requires either a device `index`, `uuid`, `pci_bus_id` or `handle`.") |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error messages refer to "Handle" but should refer to "Device" or "Device initialization". The messages currently say "Handle requires..." which is confusing since the user is constructing a Device, not a handle. Consider changing to "Device initialization requires..." or similar phrasing.
| raise ValueError("Handle requires only one of either device `index`, `uuid`, `pci_bus_id` or `handle`.") | |
| if arg_count == 0: | |
| raise ValueError("Handle requires either a device `index`, `uuid`, `pci_bus_id` or `handle`.") | |
| raise ValueError("Device initialization requires exactly one of `index`, `uuid`, `pci_bus_id`, or `handle`.") | |
| if arg_count == 0: | |
| raise ValueError("Device initialization requires one of `index`, `uuid`, `pci_bus_id`, or `handle` to be specified.") |
| *, | ||
| index: int | None = None, | ||
| uuid: bytes | str | None = None, | ||
| pci_bus_id: bytes | str | None = None, |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new pci_bus_id parameter for Device initialization lacks test coverage. Consider adding a test that creates a Device using the pci_bus_id parameter and validates the resulting device matches the expected device.
| index: int | None = None, | ||
| uuid: bytes | str | None = None, | ||
| pci_bus_id: bytes | str | None = None, | ||
| handle: int | None = None |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new handle parameter for Device initialization lacks test coverage. This is a particularly important parameter to test since it bypasses the NVML lookup functions and directly assigns the handle. Consider adding a test that validates this code path works correctly.
| @classmethod | ||
| def get_device_count(cls) -> int: | ||
| """ | ||
| Get the number of available devices. | ||
|
|
||
| Returns | ||
| ------- | ||
| int | ||
| The number of available devices. | ||
| """ | ||
| return nvml.device_get_count_v2() |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new get_device_count class method lacks test coverage. Since other similar methods in this file have comprehensive test coverage, consider adding a test to validate this method works correctly and returns the expected device count.
Description
closes
Checklist