Description
Alert (merged): Proactive Reliability (App Service) High Response Time Alert (Sev2)
Resource: sreproactive-vscode-39596 (microsoft.web/sites)
Alert IDs: a52df162-cb55-4d55-9525-180c6d27f000 (merged with prior incident)
Baseline (baseline.txt): BaselineResponseTime=31.592433333333336 ms at BaselineTimestamp=2026-02-27T09:30:37.1188306Z
Current findings (App Insights):
- 2026-03-02 15:50:06Z (last 5m): 1649.8204 ms
- 2026-03-02 15:52:42Z (last 5m): 1365.0388 ms
- 2026-03-02 15:56:02Z (last 5m): 1505.6725 ms
- 2026-03-02 15:57:14Z (last 5m): 1543.6475 ms
Deviation vs baseline: >+4000% in every sample (roughly 43x to 52x the baseline response time)
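For reference, the deviation figure can be reproduced from the numbers listed in this issue. The snippet below is a minimal sketch, assuming deviation is defined as (current - baseline) / baseline * 100; the function name `deviation_pct` is illustrative and not part of the alerting tooling.

```python
# Baseline and samples copied from this issue (all values in milliseconds).
BASELINE_MS = 31.592433333333336

SAMPLES_MS = {
    "2026-03-02 15:50:06Z": 1649.8204,
    "2026-03-02 15:52:42Z": 1365.0388,
    "2026-03-02 15:56:02Z": 1505.6725,
    "2026-03-02 15:57:14Z": 1543.6475,
}


def deviation_pct(current_ms: float, baseline_ms: float) -> float:
    """Return the relative deviation from baseline, in percent.

    Assumed definition: (current - baseline) / baseline * 100.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (current_ms - baseline_ms) / baseline_ms * 100.0


# Deviation of every sampled 5-minute window against the baseline.
deviations = {ts: deviation_pct(ms, BASELINE_MS) for ts, ms in SAMPLES_MS.items()}

if __name__ == "__main__":
    for ts, pct in deviations.items():
        print(f"{ts}: {pct:+.0f}%")
```

Under that definition, every sampled window is more than 4000% above the baseline.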
Action status:
- Slot verification via the CLI is currently failing because the commands are being interrupted. The swap is DEFERRED until slot existence and app reachability are confirmed.
- Once verification succeeds, the staging→production swap will be executed immediately; post-swap metrics will then be re-sampled and this issue updated.
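The gating described above (swap only after slot existence and app reachability are confirmed) could look roughly like the following. This is a sketch under stated assumptions, not the agent's actual implementation: the resource group is not given in this issue and is left as a placeholder, reachability is assumed to mean a 2xx response from the health check URL, and the helper names are hypothetical. It uses the `az webapp deployment slot list` and `az webapp deployment slot swap` commands.

```python
import json
import subprocess
import urllib.request
from typing import Callable, Sequence

APP = "sreproactive-vscode-39596"
HEALTH_URL = "https://sreproactive-vscode-39596.azurewebsites.net/health"
RESOURCE_GROUP = "<resource-group>"  # placeholder: not stated in this issue

Runner = Callable[[Sequence[str]], str]


def run_cli(cmd: Sequence[str]) -> str:
    """Run a CLI command and return stdout; raises on non-zero exit or timeout."""
    result = subprocess.run(
        list(cmd), check=True, capture_output=True, text=True, timeout=120
    )
    return result.stdout


def http_ok(url: str) -> bool:
    """Assumed reachability check: True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False


def slot_exists(slot: str, run: Runner = run_cli) -> bool:
    """Check that the named deployment slot appears in the slot listing."""
    out = run([
        "az", "webapp", "deployment", "slot", "list",
        "--name", APP, "--resource-group", RESOURCE_GROUP, "--output", "json",
    ])
    # Accept both "staging" and "<app>/staging" style names.
    names = {str(item.get("name", "")).split("/")[-1] for item in json.loads(out)}
    return slot in names


def swap_if_verified(
    slot: str = "staging",
    run: Runner = run_cli,
    reachable: Callable[[str], bool] = http_ok,
) -> bool:
    """Swap the slot into production only if both checks pass; otherwise defer."""
    try:
        verified = slot_exists(slot, run) and reachable(HEALTH_URL)
    except (subprocess.SubprocessError, OSError, ValueError):
        verified = False  # interrupted or failed verification: keep swap deferred
    if not verified:
        return False
    run([
        "az", "webapp", "deployment", "slot", "swap",
        "--name", APP, "--resource-group", RESOURCE_GROUP,
        "--slot", slot, "--target-slot", "production",
    ])
    return True
```

Passing the command runner and the reachability check as parameters keeps the decision logic testable without touching the live app, and any interrupted verification command leaves the swap deferred rather than attempting it.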
App Insights query used (example window):
let startTime = datetime(2026-03-02 15:52:14Z);
let endTime = datetime(2026-03-02 15:57:14Z);
requests
| where timestamp >= startTime and timestamp <= endTime
| where cloud_RoleName <> '' and not (cloud_RoleName contains "staging")
| summarize CurrentResponseTime = avg(duration) by cloud_RoleName
| extend CurrentTimestamp = endTime
Next steps:
- Resolve slot verification command interruptions
- Perform slot swap and validate post-swap response time
- Investigate recent code/config changes and dependencies contributing to latency
References:
- Health check: https://sreproactive-vscode-39596.azurewebsites.net/health
- Alert details: /subscriptions/06dbbc7b-2363-4dd4-9803-95d07f1a8d3e/providers/Microsoft.AlertsManagement/alerts/a52df162-cb55-4d55-9525-180c6d27f000
This issue was created by sre-agent-proactive-demo--73aee8f4
Tracked by the SRE agent here