Collaborator

@manoj-freyr manoj-freyr commented Dec 4, 2025

The existing code had a few issues in how it handled the various lists and lookup tables for GPU information. Per-GPU details such as KFD IDs, PCIe IDs, location info, node indices, etc. were split across multiple lists, so updating one list and missing another would result in inconsistency and usage issues.
To avoid this, a unified GpuInfo object now holds all of this information, and a list of these objects gives the static info for all devices in the system. This keeps the information encapsulated.
Along with this, the find() calls in lookups are removed, since each one is an O(n) scan that adds significant overhead on multi-GPU setups. Instead, a lookup hashmap is constructed once, and every lookup then takes O(1) time on average, which significantly improves run time on multi-GPU systems.
This file now also has a single place for init and shutdown of the amdsmi library, instead of scattering shutdown across multiple module-specific rvs_module.cpp files.
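
For readers skimming the description, here is a minimal sketch of the idea: one record per GPU plus hash indexes built as the list is filled. It is not the code in this PR. The field names, the `GpuRegistry` class, and its method names are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical record: the real GpuInfo in the PR may carry different fields.
struct GpuInfo {
  std::uint32_t gpu_id;       // KFD GPU id
  std::uint16_t location_id;  // PCIe location id
  std::uint16_t device_id;    // PCIe device id
  std::uint32_t node_id;      // KFD node index
};

// Hypothetical container: one list of records plus index maps into it.
class GpuRegistry {
 public:
  // Appends a record and updates every index in the same place,
  // so the list and the lookup tables cannot drift apart.
  void Add(const GpuInfo& info) {
    assert(by_gpu_id_.count(info.gpu_id) == 0);
    assert(by_location_.count(info.location_id) == 0);
    const std::size_t pos = gpus_.size();
    gpus_.push_back(info);
    by_gpu_id_.emplace(info.gpu_id, pos);
    by_location_.emplace(info.location_id, pos);
  }

  // Average O(1) hash lookup instead of an O(n) std::find() scan.
  // Returns nullptr when the id is unknown. The pointer is only valid
  // until the next Add(), because the vector may reallocate.
  const GpuInfo* FindByGpuId(std::uint32_t gpu_id) const {
    const auto it = by_gpu_id_.find(gpu_id);
    return it == by_gpu_id_.end() ? nullptr : &gpus_[it->second];
  }

  const GpuInfo* FindByLocation(std::uint16_t location_id) const {
    const auto it = by_location_.find(location_id);
    return it == by_location_.end() ? nullptr : &gpus_[it->second];
  }

  std::size_t size() const { return gpus_.size(); }

 private:
  std::vector<GpuInfo> gpus_;  // static info for all devices
  std::unordered_map<std::uint32_t, std::size_t> by_gpu_id_;
  std::unordered_map<std::uint16_t, std::size_t> by_location_;
};
```

The maps store positions in the vector rather than copies of the records, so each GPU's data lives in exactly one place and any key resolves to the same object.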

Added further tests to verify:
./bintest/unit.rvs.gpu_util


gpu_info_list.push_back(info);

// Build index maps for fast lookup[O(1)],instead of std::find(), whihc is O(n)
Collaborator

Typo: "whihc" should be "which".

if (RVS_STATE_INITIALIZED != rvs_state) {
  return RVS_STATUS_INVALID_STATE;
}
rvs::gpulist::Shutdown();
Collaborator

Could we abstract this kind of implementation detail out of the RVS interface and move it internal?
