feat(realtime): WebRTC observability - diagnostics, stats & telemetry #85

Merged
AdirAmsalem merged 11 commits into main from AdirAmsalem/webrtc-observability on Feb 18, 2026
Conversation

Contributor

@AdirAmsalem AdirAmsalem commented Feb 18, 2026

Summary

  • Structured logger (src/utils/logger.ts) with configurable log levels and createConsoleLogger helper
  • Diagnostic events (src/realtime/diagnostics.ts) for full connection lifecycle: ICE candidates, signaling state, peer connection state, phase timing, reconnects, video stalls
  • WebRTC stats collector (src/realtime/webrtc-stats.ts) polling at 1s with delta computation for cumulative counters (packetsLostDelta, framesDroppedDelta, freezeCountDelta, freezeDurationDelta)
  • Telemetry reporter (src/realtime/telemetry-reporter.ts) batching stats + diagnostics every 10s with explicit Datadog tags (session_id, sdk_version, integration)
  • NullReporter pattern to eliminate conditional ?. checks when telemetry is disabled
  • Granular error codes (WEBRTC_NEGOTIATION_FAILED, ICE_CONNECTION_FAILED, MEDIA_STREAM_FAILED, DATA_CHANNEL_FAILED)
  • Quality limitation tracking from outbound-rtp stats (qualityLimitationReason)
  • Video stall detection (fps < 0.5 threshold)
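
The delta computation for cumulative counters mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in webrtc-stats.ts: the `DeltaTracker` class, the `CumulativeCounters` shape, the zero-valued first sample, and the clamp at zero are all assumptions made for the example. Only the four `*Delta` field names come from the summary.

```typescript
// Cumulative counters as they might be read from getStats() once per poll.
// The shape of this interface is hypothetical.
interface CumulativeCounters {
  packetsLost: number;
  framesDropped: number;
  freezeCount: number;
  totalFreezesDuration: number; // seconds
}

// Delta field names taken from the PR summary.
interface CounterDeltas {
  packetsLostDelta: number;
  framesDroppedDelta: number;
  freezeCountDelta: number;
  freezeDurationDelta: number;
}

// Assumption: report 0 for the first sample and clamp negative differences,
// which can appear if counters restart (for example after a reconnect).
function delta(current: number, previous: number | undefined): number {
  if (previous === undefined) return 0;
  return Math.max(0, current - previous);
}

class DeltaTracker {
  private previous: CumulativeCounters | undefined;

  // Returns the change since the previous poll and remembers the new sample.
  update(current: CumulativeCounters): CounterDeltas {
    const prev = this.previous;
    this.previous = { ...current };
    return {
      packetsLostDelta: delta(current.packetsLost, prev?.packetsLost),
      framesDroppedDelta: delta(current.framesDropped, prev?.framesDropped),
      freezeCountDelta: delta(current.freezeCount, prev?.freezeCount),
      freezeDurationDelta: delta(current.totalFreezesDuration, prev?.totalFreezesDuration),
    };
  }
}
```

Reporting per-interval deltas rather than raw totals lets a dashboard sum or average the values without having to know when a session started.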

New files

  • packages/sdk/src/utils/logger.ts
  • packages/sdk/src/realtime/diagnostics.ts
  • packages/sdk/src/realtime/webrtc-stats.ts
  • packages/sdk/src/realtime/telemetry-reporter.ts

Modified files

  • packages/sdk/src/index.ts — new exports, logger/telemetry wiring
  • packages/sdk/src/realtime/client.ts — telemetry reporter integration, stall detection, stats auto-start
  • packages/sdk/src/realtime/webrtc-manager.ts — diagnostic emitter threading
  • packages/sdk/src/realtime/webrtc-connection.ts — diagnostic event emission for ICE/signaling/connection states
  • packages/sdk/src/realtime/subscribe-client.ts — diagnostic emitter support
  • packages/sdk/src/utils/errors.ts — new WebRTC error codes
  • packages/sdk/tests/unit.test.ts — 10 new tests (121 total)

Test plan

  • All 121 unit tests pass (npx vitest run packages/sdk/tests/unit.test.ts)
  • TypeScript compiles (only pre-existing RTCStatsReport.values() DOM type issues)
  • Manual E2E: connect a realtime session with telemetry: true and verify telemetry reports reach the backend
  • Verify Datadog dashboard populates when backend forwards sdk.webrtc.* metrics with tags

🤖 Generated with Claude Code


Note

Medium Risk
Touches realtime/WebRTC connection, reconnection, and error-handling paths and adds background telemetry uploads, which could affect connection stability or performance if they misbehave, even though telemetry is opt-out.

Overview
Adds WebRTC observability to the SDK’s realtime client: new diagnostic and stats events emit connection phase timings, ICE/signaling/peer-state changes, reconnect outcomes, selected candidate pair, and video-stall detection, with periodic getStats() polling and delta/bitrate calculations.
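
For illustration, video-stall detection against the fps < 0.5 threshold could look roughly like the sketch below. The `StallDetector` class, the `StallEvent` shape, and the choice to emit only on transitions are assumptions for the example and are not taken from client.ts.

```typescript
// Threshold from the PR summary: a stream below 0.5 fps counts as stalled.
const STALL_FPS_THRESHOLD = 0.5;

// Hypothetical event shape: durationMs is set only when a stall ends.
interface StallEvent {
  stalled: boolean;
  durationMs?: number;
}

class StallDetector {
  private stalledSince: number | undefined;

  // Feed one fps sample per stats poll. Returns an event only when the
  // stalled state changes, so steady playback produces no events.
  update(fps: number, nowMs: number): StallEvent | undefined {
    const isStalled = fps < STALL_FPS_THRESHOLD;
    if (isStalled && this.stalledSince === undefined) {
      this.stalledSince = nowMs;
      return { stalled: true };
    }
    if (!isStalled && this.stalledSince !== undefined) {
      const durationMs = nowMs - this.stalledSince;
      this.stalledSince = undefined;
      return { stalled: false, durationMs };
    }
    return undefined;
  }
}
```

Passing the timestamp in as an argument keeps the detector deterministic, which makes it easy to unit test without timers.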

Introduces opt-out telemetry reporting (buffered until session_id, chunked uploads with auth headers and keepalive on disconnect) plus a structured Logger API, wires both through createDecartClient/realtime + subscribe flows, and replaces generic WebRTC errors with classified error codes; unit tests expanded to cover telemetry buffering, stats collection, logger filtering, and error classification.
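
A minimal sketch of "buffered until session_id, chunked uploads" is shown below, under stated assumptions: `BufferedReporter`, `TelemetryItem`, the injected `send` callback, and the default chunk size of 50 are hypothetical. The real reporter's HTTP call, auth headers, and keepalive handling are deliberately left out.

```typescript
// Hypothetical item and transport types for the example.
type TelemetryItem = { kind: "stats" | "diagnostic"; payload: unknown };
type SendFn = (sessionId: string, chunk: TelemetryItem[]) => void;

class BufferedReporter {
  private readonly send: SendFn;
  private readonly chunkSize: number;
  private sessionId: string | undefined;
  private buffer: TelemetryItem[] = [];

  constructor(send: SendFn, chunkSize = 50) {
    this.send = send;
    this.chunkSize = chunkSize;
  }

  add(item: TelemetryItem): void {
    this.buffer.push(item);
  }

  setSessionId(id: string): void {
    this.sessionId = id;
  }

  // Items stay buffered until a session id is known; after that each flush
  // drains the buffer in chunks of at most chunkSize items.
  flush(): void {
    if (this.sessionId === undefined || this.buffer.length === 0) return;
    const items = this.buffer;
    this.buffer = [];
    for (let i = 0; i < items.length; i += this.chunkSize) {
      this.send(this.sessionId, items.slice(i, i + this.chunkSize));
    }
  }
}
```

In the PR the flush is described as running every 10s; here it is left to the caller so the example has no timers.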

Written by Cursor Bugbot for commit 0920e6f. This will update automatically on new commits.

AdirAmsalem and others added 8 commits February 18, 2026 14:13
Add comprehensive WebRTC observability to the realtime client:

- Structured logger with configurable log levels (debug/info/warn/error)
- Diagnostic event system for connection lifecycle (ICE, signaling, phase timing, reconnects, video stalls)
- WebRTC stats collector polling at 1s intervals with delta computation for cumulative counters
- Telemetry reporter that batches stats + diagnostics and sends to backend every 10s
- NullReporter pattern to eliminate conditional checks when telemetry is disabled
- Granular error codes (WEBRTC_NEGOTIATION_FAILED, ICE_CONNECTION_FAILED, etc.)
- Explicit tags (session_id, sdk_version, integration) in telemetry reports for Datadog tagging
- Quality limitation tracking from outbound-rtp stats
- Video stall detection (fps < 0.5 threshold, Twilio pattern)
- 121 unit tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
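
As an illustration of the NullReporter pattern named in this commit message, the sketch below shows how a no-op implementation removes the need for `reporter?.method()` checks at call sites. The `Reporter` interface, `RecordingReporter`, and `createReporter` are hypothetical stand-ins, not the SDK's actual types.

```typescript
// Hypothetical reporter contract for the example.
interface Reporter {
  addStats(sample: Record<string, number>): void;
  addDiagnostic(name: string, data: unknown): void;
  flush(): void;
}

// No-op implementation used when telemetry is disabled.
class NullReporter implements Reporter {
  addStats(): void {}
  addDiagnostic(): void {}
  flush(): void {}
}

// Stand-in for a real reporter: it only records what it was given.
class RecordingReporter implements Reporter {
  readonly diagnostics: string[] = [];
  statsCount = 0;
  flushCount = 0;

  addStats(_sample: Record<string, number>): void {
    this.statsCount += 1;
  }
  addDiagnostic(name: string, _data: unknown): void {
    this.diagnostics.push(name);
  }
  flush(): void {
    this.flushCount += 1;
  }
}

// Callers always receive a Reporter, so they never branch on "enabled".
function createReporter(telemetryEnabled: boolean): Reporter {
  return telemetryEnabled ? new RecordingReporter() : new NullReporter();
}
```

The enabled/disabled decision is made once at construction, and every later call site stays unconditional.
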
The telemetry endpoint is always https://api.decart.ai/v1/telemetry.
Remove telemetryUrl from RealTimeClientOptions and TelemetryReporterOptions.
Also cleans up unused httpBaseUrl variable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve merge conflicts after merging origin/main:
- client.ts: keep observability code, adopt main's refactored initial prompt flow
- webrtc-connection.ts: combine imports, use main's initialImage with diagnostic timing
- webrtc-manager.ts: combine imports from both branches
- Fix telemetry URL in test to match production endpoint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RTCStatsReport.values() is not in TypeScript's DOM lib types.
Use forEach() which is properly typed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
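
The forEach() approach described in this commit message might look like the sketch below. To keep the example self-contained it uses a structural `StatsReportLike` interface instead of the DOM `RTCStatsReport` type, so `StatsReportLike`, `StatEntry`, and `collectByType` are assumptions. A plain `Map` satisfies the same shape and stands in for a real report.

```typescript
// Minimal structural stand-ins for the example; a real RTCStatsReport
// exposes a compatible forEach(callback).
interface StatEntry {
  type: string;
  [key: string]: unknown;
}

interface StatsReportLike {
  forEach(callback: (value: StatEntry, key: string) => void): void;
}

// Collect all entries of one stats type (e.g. "outbound-rtp") using only
// forEach, without calling values().
function collectByType(report: StatsReportLike, type: string): StatEntry[] {
  const matches: StatEntry[] = [];
  report.forEach((entry) => {
    if (entry.type === type) matches.push(entry);
  });
  return matches;
}

// Usage with a Map standing in for a stats report.
const sampleReport = new Map<string, StatEntry>([
  ["out-1", { type: "outbound-rtp", qualityLimitationReason: "cpu" }],
  ["in-1", { type: "inbound-rtp", framesDropped: 3 }],
  ["pair-1", { type: "candidate-pair", nominated: true }],
]);
const outbound = collectByType(sampleReport, "outbound-rtp");
```
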
pkg-pr-new bot commented Feb 18, 2026

npm i https://pkg.pr.new/DecartAI/sdk/@decartai/sdk@85

commit: 0920e6f

- remove startStats from RealTimeClient public surface
- keep internal auto stats collection for telemetry
- stop exporting StatsOptions from package index
@AdirAmsalem AdirAmsalem merged commit 4edbb09 into main Feb 18, 2026
5 checks passed
@AdirAmsalem AdirAmsalem deleted the AdirAmsalem/webrtc-observability branch February 18, 2026 17:43

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

// Auto-start stats when telemetry is enabled
if (opts.telemetryEnabled) {
  startStatsCollection();
}

Stats events never fire when telemetry is disabled

Medium Severity

The stats event is part of the public Events type, but startStatsCollection is only called when opts.telemetryEnabled is true. Both call sites — the auto-start at connection and the handleConnectionStateChange handler for reconnects — are gated behind telemetry checks. This means client.on("stats", ...) listeners never fire when telemetry is disabled, coupling local stats observation with remote telemetry reporting. A user who wants local WebRTC stats without sending data to Decart's servers has no way to get them.

Additional Locations (1)
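
One possible way to decouple local stats from telemetry, sketched under stated assumptions, is to run the collector whenever telemetry is enabled or at least one "stats" listener is registered. `StatsGate`, `Collector`, `onStats`, and `offStats` are hypothetical names and do not reflect how client.ts is structured.

```typescript
type StatsListener = (sample: { fps: number }) => void;

// Hypothetical handle to the polling loop.
interface Collector {
  start(): void;
  stop(): void;
}

class StatsGate {
  private readonly listeners = new Set<StatsListener>();
  private readonly collector: Collector;
  private readonly telemetryEnabled: boolean;
  private running = false;

  constructor(collector: Collector, telemetryEnabled: boolean) {
    this.collector = collector;
    this.telemetryEnabled = telemetryEnabled;
    this.sync();
  }

  onStats(listener: StatsListener): void {
    this.listeners.add(listener);
    this.sync();
  }

  offStats(listener: StatsListener): void {
    this.listeners.delete(listener);
    this.sync();
  }

  isRunning(): boolean {
    return this.running;
  }

  // Collection runs if telemetry wants it OR any local listener does.
  private sync(): void {
    const shouldRun = this.telemetryEnabled || this.listeners.size > 0;
    if (shouldRun === this.running) return;
    this.running = shouldRun;
    if (shouldRun) this.collector.start();
    else this.collector.stop();
  }
}
```

With this gating, a user who disables telemetry still receives stats locally, and polling stops again once the last listener is removed.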


phase: "initial-prompt",
durationMs: performance.now() - promptStart,
success: true,
});

Missing failure diagnostic events for avatar-image and prompt phases

Low Severity

The avatar-image and initial-prompt phases only emit phaseTiming diagnostic events on success. If setImageBase64 or sendInitialPrompt rejects (e.g., timeout, WebSocket close), the await throws and the emitDiagnostic call is skipped with no failure event emitted. This is inconsistent with the websocket and webrtc-handshake phases, which both emit phaseTiming with success: false on failure, making connection failure diagnosis incomplete for these phases.
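
A small wrapper along the following lines would emit a phaseTiming event on both success and failure. This is a sketch, not the fix that was applied: `timePhase`, the `error` field, and the injectable `now` clock are assumptions, and `Date.now()` stands in for the `performance.now()` call seen in the snippet above.

```typescript
// Event shape modeled on the snippet above; the error field is hypothetical.
interface PhaseTimingEvent {
  phase: string;
  durationMs: number;
  success: boolean;
  error?: string;
}

type Emit = (event: PhaseTimingEvent) => void;

// Runs one connection phase and always emits a timing event, then rethrows
// on failure so the caller's error handling is unchanged.
async function timePhase<T>(
  phase: string,
  emit: Emit,
  run: () => Promise<T>,
  now: () => number = () => Date.now(),
): Promise<T> {
  const start = now();
  try {
    const result = await run();
    emit({ phase, durationMs: now() - start, success: true });
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    emit({ phase, durationMs: now() - start, success: false, error: message });
    throw err;
  }
}
```

A call site would then read, for example, `await timePhase("initial-prompt", emitDiagnostic, () => sendInitialPrompt(prompt))`, where `emitDiagnostic` is assumed to accept this event shape and `prompt` is a placeholder for whatever the call takes.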

