
Merged alert: High response time persists on sreproactive-vscode-39596; swap deferred pending slot verification #77

@gderossilive

Alert (merged): Proactive Reliability (App Service) High Response Time Alert (Sev2)
Resource: sreproactive-vscode-39596 (microsoft.web/sites)
Alert IDs: a52df162-cb55-4d55-9525-180c6d27f000 (merged with prior incident)

Baseline (baseline.txt): BaselineResponseTime=31.592433333333336 ms at BaselineTimestamp=2026-02-27T09:30:37.1188306Z

Current findings (App Insights):

  • 2026-03-02 15:50:06Z (last 5m): 1649.8204 ms
  • 2026-03-02 15:52:42Z (last 5m): 1365.0388 ms
  • 2026-03-02 15:56:02Z (last 5m): 1505.6725 ms
  • 2026-03-02 15:57:14Z (last 5m): 1543.6475 ms
    Deviation vs. baseline: more than +4000% on every sample (roughly 43x to 52x the baseline)
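
The deviation figure can be reproduced from the baseline and the four samples listed above. The following is a minimal Python sketch of that arithmetic; the names `deviation_pct`, `BASELINE_MS`, and `SAMPLES` are illustrative and are not part of the agent's tooling.

```python
BASELINE_MS = 31.592433333333336  # BaselineResponseTime from baseline.txt

# (sample end time, 5-minute average response time in ms) as reported above
SAMPLES = [
    ("2026-03-02 15:50:06Z", 1649.8204),
    ("2026-03-02 15:52:42Z", 1365.0388),
    ("2026-03-02 15:56:02Z", 1505.6725),
    ("2026-03-02 15:57:14Z", 1543.6475),
]


def deviation_pct(current_ms: float, baseline_ms: float) -> float:
    """Return the percentage change of current_ms relative to baseline_ms."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (current_ms - baseline_ms) / baseline_ms * 100.0


if __name__ == "__main__":
    for timestamp, current_ms in SAMPLES:
        pct = deviation_pct(current_ms, BASELINE_MS)
        print(f"{timestamp}: {current_ms:.1f} ms ({pct:+.0f}% vs baseline)")
```

Even the lowest sample (1365.0388 ms) is above the +4000% mark, so the alert condition holds across the whole observation window rather than on a single spike.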

Action status:

  • Slot verification via the CLI is currently failing because the commands are being interrupted. The swap is DEFERRED until slot existence and app reachability are confirmed.
  • The staging→production swap will be executed immediately after verification succeeds; post-swap metrics will then be re-sampled and this issue updated.
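
The verify-then-swap gate described above could be sketched as follows. This is an illustration under stated assumptions, not the agent's actual implementation: the resource group is not given in this issue and appears as a placeholder, the reachability probe is left as a caller-supplied function, and the helper names (`run_az`, `with_retries`, `slot_exists`, `swap_if_verified`) are hypothetical. The `az webapp deployment slot list` and `az webapp deployment slot swap` commands are standard Azure CLI commands.

```python
import json
import subprocess
import time
from typing import Callable, List

APP_NAME = "sreproactive-vscode-39596"
RESOURCE_GROUP = "<resource-group>"  # placeholder: not stated in this issue
SLOT = "staging"

Runner = Callable[[List[str]], str]


def run_az(args: List[str]) -> str:
    """Run an az CLI command; raises on non-zero exit or on timeout."""
    result = subprocess.run(
        ["az", *args], capture_output=True, text=True, check=True, timeout=120
    )
    return result.stdout


def with_retries(fn: Callable[[], str], attempts: int = 3, delay_s: float = 5.0) -> str:
    """Call fn until it succeeds, retrying on CLI failure or timeout."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay_s)
    raise RuntimeError(f"command failed after {attempts} attempts") from last_error


def slot_exists(run: Runner, app: str, group: str, slot: str) -> bool:
    """Check that the slot is listed for the app (read-only, so safe to retry)."""
    args = ["webapp", "deployment", "slot", "list",
            "--name", app, "--resource-group", group, "--output", "json"]
    out = with_retries(lambda: run(args))
    # Accept both "staging" and "<app>/staging" name forms.
    names = [entry["name"].split("/")[-1] for entry in json.loads(out)]
    return slot in names


def swap_if_verified(run: Runner, reachable: Callable[[], bool],
                     app: str, group: str, slot: str) -> bool:
    """Swap slot into production only if the slot exists and the app is reachable."""
    if not slot_exists(run, app, group, slot):
        return False
    if not reachable():
        return False
    # The swap is not retried: re-running an interrupted swap could swap back.
    run(["webapp", "deployment", "slot", "swap",
         "--name", app, "--resource-group", group,
         "--slot", slot, "--target-slot", "production"])
    return True
```

Only the read-only listing is retried. An interrupted swap is left for manual inspection, because repeating a swap that actually completed would reverse it.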

App Insights query used (example window):
let startTime = datetime(2026-03-02 15:52:14Z);
let endTime = datetime(2026-03-02 15:57:14Z);
requests
| where timestamp >= startTime and timestamp <= endTime
| where cloud_RoleName <> '' and not (cloud_RoleName contains "staging")
| summarize CurrentResponseTime = avg(duration) by cloud_RoleName
| extend CurrentTimestamp = endTime
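
For re-sampling after the swap, the same query can be regenerated for any 5-minute window. The sketch below builds the query text from an end time; `build_query` is a hypothetical helper name, and submitting the query to App Insights is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def build_query(end: datetime, window: timedelta = timedelta(minutes=5)) -> str:
    """Return the response-time query for the window ending at `end` (UTC)."""
    if end.tzinfo is None:
        raise ValueError("end must be timezone-aware")
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - window
    fmt = "%Y-%m-%d %H:%M:%SZ"
    return "\n".join([
        f"let startTime = datetime({start_utc.strftime(fmt)});",
        f"let endTime = datetime({end_utc.strftime(fmt)});",
        "requests",
        "| where timestamp >= startTime and timestamp <= endTime",
        "| where cloud_RoleName <> '' and not (cloud_RoleName contains \"staging\")",
        "| summarize CurrentResponseTime = avg(duration) by cloud_RoleName",
        "| extend CurrentTimestamp = endTime",
    ])


if __name__ == "__main__":
    # Reproduces the example window shown above.
    print(build_query(datetime(2026, 3, 2, 15, 57, 14, tzinfo=timezone.utc)))
```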

Next steps:

  • Resolve slot verification command interruptions
  • Perform slot swap and validate post-swap response time
  • Investigate recent code/config changes and dependencies contributing to latency

This issue was created by sre-agent-proactive-demo--73aee8f4
Tracked by the SRE agent here
