
@briancoutinho briancoutinho commented Dec 30, 2025

Overview

Originally, this was part 1 of splitting PR #1148. It adds support for a new kind of GPU counter event that is published to the timeline as a time series. In the process I realized we should add more generic event types for accelerators rather than being tied to CUDA-specific naming. Historically, that naming has led each new accelerator to add its own events, which is a maintenance burden.

  1. Generalized Accelerator Event Types: Add generic event types to the ActivityType enum. Older events such as CUDA_RUNTIME and MTIA_RUNTIME are now aliases in the enum class definition. The dynamic plugin will use the generic events, so new accelerators will not require upstream source code changes.
  2. GPU PM Counter Event Type: Adds support for a performance counter event stream. Counter events can be serialized in Chrome Trace format and are rendered as time series in the UI.

Details

Accelerator-Agnostic Event Types

The change reorganizes the ActivityType enum class to introduce generic, accelerator-agnostic event types that work across all hardware backends (CUDA, MTIA, HPU, XPU, etc.). Device-specific types are now deprecated aliases that point to their generic counterparts. There are a few corner-case exceptions, such as MTIA_INSIGHT and CUDA_SYNC; see the header for details.

New Generic Event Types

| Event Type | Description | Replaces (Deprecated Aliases) |
| --- | --- | --- |
| RUNTIME | Host-side runtime events from any accelerator backend | CUDA_RUNTIME, MTIA_RUNTIME, GLOW_RUNTIME, XPU_RUNTIME, PRIVATEUSE1_RUNTIME, HPU_OP |
| DRIVER | Host-side driver events from any accelerator backend | CUDA_DRIVER, PRIVATEUSE1_DRIVER |
| CONCURRENT_KERNEL | On-device kernel execution across all accelerators | MTIA_CCP_EVENTS |
| GPU_PM_COUNTER | Performance monitoring counters for hardware profiling | (new) |
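
To make the aliasing scheme concrete, here is a minimal C++ sketch of how deprecated names can share a value with a generic type inside one enum class. It is an illustration only, not Kineto's actual header: the enumerator names come from the table, but the underlying values, their order, and the helper `isHostSide` are assumptions made for this example.

```cpp
#include <cstdint>

// Illustrative sketch: names follow the PR description; the values and
// ordering are assumptions, not the real Kineto definition.
enum class ActivityType : std::uint8_t {
  // Generic, accelerator-agnostic types.
  RUNTIME = 0,
  DRIVER,
  CONCURRENT_KERNEL,
  GPU_PM_COUNTER,

  // Deprecated aliases: each shares the value of its generic type.
  CUDA_RUNTIME = RUNTIME,
  MTIA_RUNTIME = RUNTIME,
  CUDA_DRIVER = DRIVER,
  MTIA_CCP_EVENTS = CONCURRENT_KERNEL,
};

// An alias and its generic type are the same value, so code written
// against the old name keeps compiling and behaves identically.
static_assert(ActivityType::CUDA_RUNTIME == ActivityType::RUNTIME,
              "alias must equal its generic type");

// Hypothetical helper: one case label covers the generic type and
// every alias that maps to it.
constexpr bool isHostSide(ActivityType t) {
  switch (t) {
    case ActivityType::RUNTIME:  // also matches CUDA_RUNTIME, MTIA_RUNTIME
    case ActivityType::DRIVER:   // also matches CUDA_DRIVER
      return true;
    default:
      return false;
  }
}
```

One consequence of this design: a `switch` that lists both an alias and its generic type as separate `case` labels no longer compiles, because the two labels have the same value.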

Guidance for future use of ActivityTypes

Existing code using deprecated aliases will continue to work, but new code should use the generic types:

// ❌ Deprecated (still works for backward compatibility)
ActivityType::CUDA_RUNTIME
ActivityType::MTIA_RUNTIME

// ✅ Preferred
ActivityType::RUNTIME

I have not yet changed the usage of these types in the code base. That can happen in a follow-up change.

Notes

  • Old string names like "cuda_runtime" still parse correctly via aliasMap, and old enum values like ActivityType::CUDA_RUNTIME still compile (as aliases).
  • defaultActivityTypesArray is now constexpr, enabling compile-time evaluation and eliminating runtime overhead.
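
The two notes above can be sketched together in C++17. This is a hedged illustration under stated assumptions, not the code in this PR: `kNameTable`, `parseActivityType`, `kDefaultTypes`, and `isDefault` are hypothetical names, only the string "cuda_runtime" is taken from the description, and the contents of the default list are invented for the example.

```cpp
#include <array>
#include <optional>
#include <string_view>
#include <utility>

// Reduced enum for the example; values are assumptions.
enum class ActivityType {
  RUNTIME,
  DRIVER,
  CONCURRENT_KERNEL,
  GPU_PM_COUNTER,
  CUDA_RUNTIME = RUNTIME,  // deprecated alias
};

// Hypothetical lookup table: generic and legacy spellings both resolve
// to the generic type. Only "cuda_runtime" appears in the PR text.
constexpr std::array<std::pair<std::string_view, ActivityType>, 4> kNameTable{{
    {"runtime", ActivityType::RUNTIME},
    {"cuda_runtime", ActivityType::RUNTIME},
    {"driver", ActivityType::DRIVER},
    {"gpu_pm_counter", ActivityType::GPU_PM_COUNTER},
}};

// Returns the generic type for a config string, or nullopt if unknown.
constexpr std::optional<ActivityType> parseActivityType(std::string_view name) {
  for (const auto& entry : kNameTable) {
    if (entry.first == name) {
      return entry.second;
    }
  }
  return std::nullopt;
}

// A constexpr default list (contents assumed) can be queried at compile time.
constexpr std::array<ActivityType, 3> kDefaultTypes{
    ActivityType::RUNTIME, ActivityType::DRIVER,
    ActivityType::CONCURRENT_KERNEL};

constexpr bool isDefault(ActivityType t) {
  for (ActivityType d : kDefaultTypes) {
    if (d == t) {
      return true;
    }
  }
  return false;
}

// Both checks are evaluated by the compiler, with no runtime cost.
static_assert(parseActivityType("cuda_runtime") == ActivityType::RUNTIME,
              "legacy name resolves to the generic type");
static_assert(isDefault(ActivityType::CUDA_RUNTIME),
              "alias is covered by the generic default entry");
```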

GPU PM Counter Events

This is a straightforward change to emit Chrome Trace counter events for counters obtained from the GPU. The event type can be used by any accelerator backend.

The counter values are embedded as key/value pairs in the "args" field of the output JSON:

{..., "name": "ctr", "ph": "C", "ts":  0, "args": {"my_counter":  42}},
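
As a rough sketch of how such a record might be produced, the C++ function below formats the fields shown in the example above. `counterEventJson` is a hypothetical helper, not Kineto's serializer: it emits only the fields visible in the sample (whatever the elided "..." contains is left out), and it assumes names need no JSON escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: formats one Chrome Trace counter ("ph": "C") event.
// Assumes names are plain identifiers, so no JSON string escaping is done.
std::string counterEventJson(
    const std::string& name,
    std::int64_t ts,
    const std::vector<std::pair<std::string, double>>& counters) {
  assert(!name.empty() && "a counter track needs a name");
  std::ostringstream out;
  out << "{\"name\": \"" << name << "\", \"ph\": \"C\", \"ts\": " << ts
      << ", \"args\": {";
  // Each counter becomes one key/value pair inside "args".
  for (std::size_t i = 0; i < counters.size(); ++i) {
    if (i > 0) {
      out << ", ";
    }
    out << '"' << counters[i].first << "\": " << counters[i].second;
  }
  out << "}}";
  return out.str();
}
```

Calling `counterEventJson("ctr", 0, {{"my_counter", 42}})` returns `{"name": "ctr", "ph": "C", "ts": 0, "args": {"my_counter": 42}}`. Several counters passed in one call share a timestamp and land in the same "args" object.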

Testing

make test
// or
ctest -R ActivityTypes

The ParseTest.ActivityTypes test validates that aliases in the Config string are correctly converted to the underlying base type. Since the enum class uses aliases, all existing code in Kineto that uses the original activity types continues to compile and work as expected.
The same test also checks that the default activity types are unchanged:
https://github.com/pytorch/kineto/blob/main/libkineto/test/ConfigTest.cpp#L89-L108

@meta-cla meta-cla bot added the cla signed label Dec 30, 2025
@briancoutinho briancoutinho force-pushed the bcoutinho/gpu_pm_activity_type branch from e9e530b to fabc0c7 Compare December 30, 2025 22:11
@briancoutinho briancoutinho marked this pull request as ready for review December 30, 2025 22:19
@briancoutinho briancoutinho force-pushed the bcoutinho/gpu_pm_activity_type branch from fabc0c7 to 93105fe Compare December 31, 2025 18:18
@KarhouTam

Hi @briancoutinho, this update is absolutely fantastic! As an out-of-tree accelerator developer, I’m genuinely excited about these advancements. If possible, could you kindly share an approximate timeline for the upcoming splitting changes? Can't wait to begin integrating our accelerator profiler with these impressive updates.

Additionally, may I ask whether Kineto will still accept contributions of third-party accelerator profilers as plugins after these changes are merged? Thank you!

Finally, happy new year to you and the community (maybe a little bit late, hah)!
