High response time on sreproactive-vscode-39596; slot swap executed at 2026-03-02T17:38:22Z #78

@gderossilive

Summary:

  • Web App: sreproactive-vscode-39596
  • Baseline: 57.6513 ms at 2026-03-02T09:30:46.3005623Z (from baseline.txt)
  • Pre-swap (decision window): 240.3174 ms at 2026-03-02T17:34:55Z (window 17:29:55Z..17:34:55Z) → +316.8% over baseline
  • Action taken: Slot swap staging→production at 2026-03-02T17:38:22.592Z
  • Post-swap (initial check): 670.5941 ms at 2026-03-02T17:38:53Z (window 17:38:22Z..17:38:53Z) → +1063.2% over baseline; still elevated and higher than the pre-swap value, measured over a 31-second window only
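
For reference, the percentage figures in the summary can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not the agent's actual code; the function name `pct_over_baseline` is an assumption made for illustration, and the three constants are copied from the summary above.

```python
def pct_over_baseline(baseline_ms: float, current_ms: float) -> float:
    """Return how far current_ms sits above baseline_ms, as a percentage."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (current_ms - baseline_ms) / baseline_ms * 100.0


# Values taken from the issue summary (milliseconds).
BASELINE_MS = 57.6513
PRE_SWAP_MS = 240.3174
POST_SWAP_MS = 670.5941

pre = pct_over_baseline(BASELINE_MS, PRE_SWAP_MS)
post = pct_over_baseline(BASELINE_MS, POST_SWAP_MS)

print(f"pre-swap:  +{pre:.1f}% over baseline")
print(f"post-swap: +{post:.1f}% over baseline")
```

Both readings are compared against the same baseline from baseline.txt, so the two percentages are directly comparable.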

Evidence:

Queries used:
Pre-swap decision query:
let startTime = datetime(2026-03-02T17:29:55Z);
let endTime = datetime(2026-03-02T17:34:55Z);
requests
| where timestamp >= startTime and timestamp <= endTime
| where cloud_RoleName <> '' and not(cloud_RoleName contains "staging")
| summarize CurrentResponseTime = avg(duration) by cloud_RoleName
| extend CurrentTimestamp = endTime

Post-swap spot-check query:
let startTime = datetime(2026-03-02T17:38:22Z);
let endTime = datetime(2026-03-02T17:38:53Z);
requests
| where timestamp >= startTime and timestamp <= endTime
| where cloud_RoleName <> '' and not(cloud_RoleName contains "staging")
| summarize CurrentResponseTime = avg(duration) by cloud_RoleName
| extend CurrentTimestamp = endTime

Requested follow-ups:

  • Investigate recent code/config changes in the active slot that may be causing the regression
  • Review deployment diffs between staging and production
  • Check dependency latency (DB/Redis/external APIs) during the above windows
  • Add automated regression checks to gate swaps if post-swap metrics are worse
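
As a starting point for the last follow-up, an automated check could compare the post-swap average against the pre-swap average and refuse to give a verdict when the post-swap window is too short. The sketch below is hypothetical: `Sample`, `swap_verdict`, the 10% tolerance, and the 300-second minimum window are all assumptions chosen for illustration, not existing agent behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    avg_ms: float        # average request duration over the window
    window_seconds: int  # length of the measurement window


def swap_verdict(pre: Sample, post: Sample,
                 tolerance: float = 0.10,
                 min_window_seconds: int = 300) -> str:
    """Classify a swap as 'inconclusive', 'regressed', or 'ok'.

    tolerance and min_window_seconds are illustrative defaults (assumptions).
    """
    # Too little post-swap data to judge: wait and re-measure.
    if post.window_seconds < min_window_seconds:
        return "inconclusive"
    # Post-swap latency exceeds pre-swap latency by more than the tolerance.
    if post.avg_ms > pre.avg_ms * (1.0 + tolerance):
        return "regressed"
    return "ok"


# Figures from this issue: 300 s pre-swap window, 31 s post-swap window.
pre = Sample(avg_ms=240.3174, window_seconds=300)
post = Sample(avg_ms=670.5941, window_seconds=31)

print(swap_verdict(pre, post))                         # short window
print(swap_verdict(pre, post, min_window_seconds=30))  # accept short window
```

With the numbers reported here, the default settings treat the 31-second spot check as too short to judge; lowering the minimum window makes the same data count as a regression. Which behavior is appropriate depends on traffic volume and on how quickly a rollback decision is needed.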

This issue was created by sre-agent-proactive-demo--73aee8f4
Tracked by the SRE agent here