Add/Update telemetry events #27356

Open
dabhattimsft wants to merge 3 commits into main from user/dabhatti/winmlTelem

@dabhattimsft dabhattimsft commented Feb 16, 2026

Description

ModelLoadStart/End - InferenceSession::LoadWithLoader, InferenceSession::LoadOrtModelWithLoader
SessionCreationEnd - InferenceSession::Initialize
RegisterEpLibraryWithLibPath, RegisterEpLibraryStart/End - Environment::RegisterExecutionProviderLibrary

Update: the RuntimePerf event now also logs status. It is triggered more frequently, on an exponential-backoff schedule, and any time !retval.IsOK().
It is also now triggered from ~InferenceSession() to log data in the tail.
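
As a rough illustration of the backoff behavior described above, here is a minimal sketch. It is not the code from this PR: `PerfEventThrottle` and `ShouldLog` are hypothetical names, the doubling factor is an assumption (the description only says "exponential backoff"), and letting failed runs log without resetting the schedule is likewise an assumption. The interval values mirror the `kRuntimePerfInitialInterval` and `kRuntimePerfMaxInterval` constants quoted later in the review.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not an ONNX Runtime type.
class PerfEventThrottle {
 public:
  // Values mirror the constants quoted in this PR (microseconds).
  static constexpr int64_t kInitialIntervalUs = 2LL * 1000 * 1000;       // 2 seconds
  static constexpr int64_t kMaxIntervalUs = 10LL * 60 * 1000 * 1000;     // 10 minutes

  // Returns true if a perf event should be emitted for a run ending at now_us.
  bool ShouldLog(int64_t now_us, bool run_ok) {
    // Assumption: a failed run always logs and leaves the schedule untouched.
    if (!run_ok) return true;

    // Still inside the current backoff window: skip this run.
    if (now_us - last_log_us_ < interval_us_) return false;

    last_log_us_ = now_us;
    // Assumption: the interval doubles after each emitted event, capped at the max.
    const int64_t doubled = interval_us_ * 2;
    interval_us_ = doubled > kMaxIntervalUs ? kMaxIntervalUs : doubled;
    return true;
  }

 private:
  int64_t last_log_us_ = 0;
  int64_t interval_us_ = kInitialIntervalUs;
};
```

With this shape, a healthy session emits events at roughly 2 s, then 4 s, then 8 s gaps, settling at one event per 10 minutes, while any failing run is reported immediately.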

Motivation and Context

To better measure health

Darshak Bhatti added 3 commits February 15, 2026 22:48

@github-actions github-actions bot left a comment

You can commit the suggested changes from lintrunner.

}

void Telemetry::LogSessionCreationEnd(uint32_t session_id,
const common::Status& status) const {

Suggested change
const common::Status& status) const {
const common::Status& status) const {

}

void Telemetry::LogRegisterEpLibraryWithLibPath(const std::string& registration_name,
const std::string& lib_path) const {

Suggested change
const std::string& lib_path) const {
const std::string& lib_path) const {

}

void Telemetry::LogRegisterEpLibraryEnd(const std::string& registration_name,
const common::Status& status) const {

Suggested change
const common::Status& status) const {
const common::Status& status) const {

virtual void LogModelLoadEnd(uint32_t session_id, const common::Status& status) const;

virtual void LogSessionCreationEnd(uint32_t session_id,
const common::Status& status) const;

Suggested change
const common::Status& status) const;
const common::Status& status) const;

virtual void LogRunStart(uint32_t session_id) const;

virtual void LogRegisterEpLibraryWithLibPath(const std::string& registration_name,
const std::string& lib_path) const;

Suggested change
const std::string& lib_path) const;
const std::string& lib_path) const;

}

void WindowsTelemetry::LogRegisterEpLibraryWithLibPath(const std::string& registration_name,
const std::string& lib_path) const {

Suggested change
const std::string& lib_path) const {
const std::string& lib_path) const {

}

void WindowsTelemetry::LogRegisterEpLibraryEnd(const std::string& registration_name,
const common::Status& status) const {

Suggested change
const common::Status& status) const {
const common::Status& status) const {

void LogModelLoadEnd(uint32_t session_id, const common::Status& status) const override;

void LogSessionCreationEnd(uint32_t session_id,
const common::Status& status) const override;

Suggested change
const common::Status& status) const override;
const common::Status& status) const override;

void LogRegisterEpLibraryStart(const std::string& registration_name) const override;

void LogRegisterEpLibraryEnd(const std::string& registration_name,
const common::Status& status) const override;

Suggested change
const common::Status& status) const override;
const common::Status& status) const override;

Comment on lines +980 to +981
constexpr static long long kRuntimePerfInitialInterval = 2 * 1000 * 1000; // 2 seconds in (us)
constexpr static long long kRuntimePerfMaxInterval = 1000 * 1000 * 60 * 10; // 10 minutes in (us)

Suggested change
constexpr static long long kRuntimePerfInitialInterval = 2 * 1000 * 1000; // 2 seconds in (us)
constexpr static long long kRuntimePerfMaxInterval = 1000 * 1000 * 60 * 10; // 10 minutes in (us)
constexpr static long long kRuntimePerfInitialInterval = 2 * 1000 * 1000; // 2 seconds in (us)
constexpr static long long kRuntimePerfMaxInterval = 1000 * 1000 * 60 * 10; // 10 minutes in (us)

}
ORT_TRY {
const Env& env = Env::Default();
env.GetTelemetryProvider().LogModelLoadStart(session_id_);
Contributor

Non-blocking: this is more general than the changes here, but I guess a byproduct of the Start / Stop telemetry pattern is that the Stop call is not universally guaranteed to fire due to e.g., early returns from some of those ORT_RETURN macros. We should have scope guards (or whatever the equivalent pattern is in the code base). I can file an issue and try to fix this later.

Author

Yep, I thought about that. My thinking was to have those instances end up as a timeout in the MC state machine.
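
For readers unfamiliar with the scope-guard idea raised in this thread, the following is a minimal sketch of how an End event could be made to fire on every exit path. Everything here is hypothetical: `ScopeExit` and `LoadModel` are illustrative stand-ins, not ONNX Runtime types, and the counter substitutes for a real call such as `LogModelLoadEnd(session_id, status)`. The code base may already have an equivalent utility that would be preferable.

```cpp
#include <functional>
#include <utility>

// Hypothetical RAII guard: runs the stored callback when it goes out of scope.
class ScopeExit {
 public:
  explicit ScopeExit(std::function<void()> fn) : fn_(std::move(fn)) {}
  ~ScopeExit() {
    if (fn_) fn_();
  }
  ScopeExit(const ScopeExit&) = delete;
  ScopeExit& operator=(const ScopeExit&) = delete;

 private:
  std::function<void()> fn_;
};

// Stand-in for a load routine with an early return.
// end_events counts how many times the "End" telemetry would have fired.
int LoadModel(bool fail_early, int& end_events) {
  int status = 0;  // 0 means OK in this sketch

  // A real implementation would emit the Start event here.
  // The guard is declared after status, so it is destroyed first and
  // can still read the final status value.
  ScopeExit log_end([&] {
    (void)status;  // a real callback would pass status to the End event
    ++end_events;
  });

  if (fail_early) {
    status = 1;
    return status;  // early return: the guard still fires
  }

  return status;  // normal return: the guard fires here too
}
```

The trade-off mentioned in the reply still applies: without a guard, a missing End event has to be inferred downstream as a timeout, whereas a guard reports the failure status directly.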

@adrastogi
Contributor

> It is also now triggered from ~InferenceSession() to log data in the tail.

Where does that happen? Having trouble finding it.
