Fix latency calculation logic and add several missing props #1236

renuka-fernando merged 2 commits into wso2:main from
Conversation
Walkthrough

Refactors analytics latency computation to use consolidated intermediate timestamps and adds nil-time checks; extends the Moesif publisher to include additional metadata (response content type, size, latency breakdown, common name, egress flag); updates tests to populate the new latency timestamp fields.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🧹 Nitpick comments (1)
gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go (1)
232-264: Latency calculation logic looks correct, but consider guarding against negative values.

The formulas correctly model the Envoy access-log timeline:

| Metric | Formula |
| --- | --- |
| BackendLatency | lastUpRx - firstUpTx |
| RequestMediationLatency | firstUpTx - lastRx |
| ResponseLatency | lastDownTx - firstUpRx |
| ResponseMediationLatency | lastDownTx - lastUpRx |

Under abnormal conditions (e.g., clock skew, retries, or stream resets), individual timestamp durations from Envoy can occasionally be non-monotonic, producing negative latency values. If negative latencies would confuse dashboards or downstream analytics consumers, a clamp to 0 would be a cheap safeguard. Not critical given Moesif's tolerance, but worth considering.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go` around lines 232 - 264, The latency calculations in analytics.go (the block computing lastRx, firstUpTx, firstUpRx, lastUpRx, lastDownTx and creating dto.Latencies with BackendLatency, RequestMediationLatency, ResponseLatency, ResponseMediationLatency) can produce negative values under abnormal conditions; clamp each latency to zero before assigning into dto.Latencies (e.g., compute raw = lastUpRx - firstUpTx and set BackendLatency = raw if raw >= 0 else 0), doing the same for RequestMediationLatency, ResponseLatency and ResponseMediationLatency so downstream consumers never receive negative latencies.
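As a concrete illustration of the clamp suggestion, here is a minimal sketch. `clampLatency` is a hypothetical helper name, and the int64-millisecond field types are an assumption, since the actual `dto.Latencies` definition isn't shown in this diff:

```go
package main

import "fmt"

// clampLatency returns d unchanged when it is non-negative and 0
// otherwise, so non-monotonic Envoy timestamps can never surface as
// negative latencies. (Hypothetical helper; latency fields are
// assumed to be int64 milliseconds.)
func clampLatency(d int64) int64 {
	if d < 0 {
		return 0
	}
	return d
}

func main() {
	// Timestamps in ms, matching the review's timeline example.
	lastRx, firstUpTx, firstUpRx := int64(50), int64(100), int64(150)
	lastUpRx, lastDownTx := int64(200), int64(250)

	fmt.Println(clampLatency(lastUpRx - firstUpTx))   // BackendLatency: 100
	fmt.Println(clampLatency(firstUpTx - lastRx))     // RequestMediationLatency: 50
	fmt.Println(clampLatency(lastDownTx - firstUpRx)) // ResponseLatency: 100
	fmt.Println(clampLatency(lastDownTx - lastUpRx))  // ResponseMediationLatency: 50

	// A retry or clock skew could make a raw difference negative:
	fmt.Println(clampLatency(-20)) // clamped to 0
}
```

The clamp costs one comparison per field, which is why the review calls it a cheap safeguard.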
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go`:
- Around line 256-261: The Latencies struct's Duration field is never set when
constructing dto.Latencies, so total end-to-end latency is serialized as 0;
compute Duration as the end-to-end difference (e.g., lastDownTx - firstUpTx or
whichever represents end-to-end start-to-end in your timing model) and assign it
when creating the latencies value. Update the block that builds dto.Latencies
(where BackendLatency, RequestMediationLatency, ResponseLatency,
ResponseMediationLatency are set) to also set Duration = lastDownTx - firstUpTx
(or the appropriate pair: firstUpTx and lastDownTx) so dto.Duration reflects the
total duration before serializing/returning.
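A sketch of the suggested Duration fix, under the assumption (not confirmed by the diff) that `dto.Latencies` holds int64 millisecond fields; `buildLatencies` is a hypothetical helper name:

```go
package main

import "fmt"

// Latencies mirrors the field names cited in the review comment; the
// real dto in the repo may differ.
type Latencies struct {
	Duration                 int64
	BackendLatency           int64
	RequestMediationLatency  int64
	ResponseLatency          int64
	ResponseMediationLatency int64
}

// buildLatencies derives every field, including the previously
// missing Duration, from the five intermediate timestamps (ms).
func buildLatencies(lastRx, firstUpTx, firstUpRx, lastUpRx, lastDownTx int64) Latencies {
	return Latencies{
		Duration:                 lastDownTx - firstUpTx, // end-to-end, per the suggestion
		BackendLatency:           lastUpRx - firstUpTx,
		RequestMediationLatency:  firstUpTx - lastRx,
		ResponseLatency:          lastDownTx - firstUpRx,
		ResponseMediationLatency: lastDownTx - lastUpRx,
	}
}

func main() {
	l := buildLatencies(50, 100, 150, 200, 250)
	fmt.Println(l.Duration) // 150
}
```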
🧹 Nitpick comments (1)
gateway/gateway-runtime/policy-engine/internal/analytics/analytics_test.go (1)
406-417: Consider strengthening TestPrepareAnalyticEvent_WithLatencies assertions.

The test now exercises all five latency fields, but the only assertion is BackendLatency >= 0; this would also pass if all four latency fields computed to zero or if three of them were silently dropped. Given the deterministic test data in createLogEntryWithLatencies, the expected values are known exactly:

| Field | Expected |
| --- | --- |
| BackendLatency | 100ms |
| RequestMediationLatency | 50ms |
| ResponseLatency | 100ms |
| ResponseMediationLatency | 50ms |

Additionally, there is no test for the nil-guard paths that analytics.go now handles for TimeToLastRxByte and TimeToFirstUpstreamRxByte.

♻️ Suggested improvements
```diff
 func TestPrepareAnalyticEvent_WithLatencies(t *testing.T) {
 	cfg := &config.Config{}
 	analytics := NewAnalytics(cfg)
 	logEntry := createLogEntryWithLatencies()
 	event := analytics.prepareAnalyticEvent(logEntry)
 	require.NotNil(t, event)
 	require.NotNil(t, event.Latencies)
-	assert.True(t, event.Latencies.BackendLatency >= 0)
+	// 100ms = lastUpRx(200ms) - firstUpTx(100ms)
+	assert.Equal(t, int64(100), event.Latencies.BackendLatency)
+	// 50ms = firstUpTx(100ms) - lastRx(50ms)
+	assert.Equal(t, int64(50), event.Latencies.RequestMediationLatency)
+	// 100ms = lastDownTx(250ms) - firstUpRx(150ms)
+	assert.Equal(t, int64(100), event.Latencies.ResponseLatency)
+	// 50ms = lastDownTx(250ms) - lastUpRx(200ms)
+	assert.Equal(t, int64(50), event.Latencies.ResponseMediationLatency)
 }
```

And add a nil-guard test alongside createLogEntryWithLatencies:

```go
func TestPrepareAnalyticEvent_WithNilLatencyFields(t *testing.T) {
	cfg := &config.Config{}
	analytics := NewAnalytics(cfg)
	// TimeToLastRxByte and TimeToFirstUpstreamRxByte are nil; should not panic
	logEntry := &v3.HTTPAccessLogEntry{
		CommonProperties: &v3.AccessLogCommon{
			TimeToFirstUpstreamTxByte:  &durationpb.Duration{Seconds: 0, Nanos: 100000000},
			TimeToLastUpstreamRxByte:   &durationpb.Duration{Seconds: 0, Nanos: 200000000},
			TimeToLastDownstreamTxByte: &durationpb.Duration{Seconds: 0, Nanos: 250000000},
		},
		Request:  &v3.HTTPRequestProperties{RequestMethod: corev3.RequestMethod_GET},
		Response: &v3.HTTPResponseProperties{ResponseCode: wrapperspb.UInt32(200)},
	}
	assert.NotPanics(t, func() {
		event := analytics.prepareAnalyticEvent(logEntry)
		require.NotNil(t, event)
		require.NotNil(t, event.Latencies)
	})
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics_test.go` around lines 406 - 417, Update TestPrepareAnalyticEvent_WithLatencies to assert all computed latency fields against their exact expected values from createLogEntryWithLatencies (BackendLatency == 100ms, RequestMediationLatency == 50ms, ResponseLatency == 100ms, ResponseMediationLatency == 50ms) instead of only checking BackendLatency >= 0; reference the test TestPrepareAnalyticEvent_WithLatencies and the helper createLogEntryWithLatencies and verify the returned event.Latencies fields exactly match those durations after calling analytics.prepareAnalyticEvent. Also add a new test (e.g., TestPrepareAnalyticEvent_WithNilLatencyFields) that constructs a log entry where TimeToLastRxByte and TimeToFirstUpstreamRxByte are nil, calls analytics.prepareAnalyticEvent, and uses assert.NotPanics while asserting the returned event and event.Latencies are non-nil to exercise the nil-guard paths in prepareAnalyticEvent.
```go
if properties != nil && properties.TimeToLastRxByte != nil &&
	properties.TimeToFirstUpstreamTxByte != nil && properties.TimeToFirstUpstreamRxByte != nil &&
	properties.TimeToLastUpstreamRxByte != nil && properties.TimeToLastDownstreamTxByte != nil {
```
Now, to populate this dto.Latencies, all the conditions should be true, right?
Yes. Anyway, this shouldn't be an issue, because on the Moesif side they calculate their own latency info using the request and response timestamps. This additional latency info is only used to populate our custom dashboards, so if the required properties aren't there, there's no point in trying to calculate it.

Also, since we are deriving these from Envoy access logs, they'll be present almost every time.
Purpose
Summary by CodeRabbit