Skip to content

Fix latency calculation logic and add several missing props#1236

Merged
renuka-fernando merged 2 commits intowso2:mainfrom
O-sura:latency-fix
Feb 18, 2026
Merged

Fix latency calculation logic and add several missing props#1236
renuka-fernando merged 2 commits intowso2:mainfrom
O-sura:latency-fix

Conversation

@O-sura
Copy link
Contributor

@O-sura O-sura commented Feb 18, 2026

Purpose

  • This PR adds the changes for modify the latency calculation logic and add some properties which were missing from the analytics events previously along with the modified latency info

Summary by CodeRabbit

  • Improvements
    • Consolidated latency calculations for more accurate, consistent analytics metrics.
    • Expanded analytics metadata sent to the external publisher for richer reporting (additional response, latency and connection fields).
  • Tests
    • Updated analytics test data to reflect the revised latency breakdown and ensure coverage.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 18, 2026

Walkthrough

Refactors analytics latency computation to use consolidated intermediate timestamps and added nil-time checks; extends the Moesif publisher to include additional metadata (response content type, size, latency breakdown, common name, egress flag); updates tests to populate the new latency timestamp fields.

Changes

Cohort / File(s) Summary
Analytics Latency Refactoring
gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go
Replaces previous granular timestamp math with consolidated intermediate variables (lastRx, firstUpTx, firstUpRx, lastUpRx, lastDownTx), adds nil-time guards, and populates BackendLatency, RequestMediationLatency, ResponseLatency, ResponseMediationLatency in one struct.
Moesif Publisher Metadata Expansion
gateway/gateway-runtime/policy-engine/internal/analytics/publishers/moesif.go
Adds extraction of responseContentType, responseSize, commonName, isEgress, and advanced latency fields (backendLatency, requestMediationLatency, responseLatency, responseMediationLatency) into Moesif metadata with nil/type checks.
Test Data Update
gateway/gateway-runtime/policy-engine/internal/analytics/analytics_test.go
Updates test helper createLogEntryWithLatencies to populate the new latency timestamp fields (TimeToLastRxByte, TimeToFirstUpstreamRxByte, etc.) and adjusts the composite literal ordering/comments to reflect the expanded timestamp set.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hopped through timestamps, neat and small,

Merged the hops so latencies stand tall.
Moesif listens with extra treats to share,
Nil checks snug like blankets everywhere.
A tiny rabbit cheers — metrics, well cared for!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description is incomplete and only covers Purpose; it lacks Goals, Approach, User Stories, Documentation, Automation tests, Security checks, Samples, Related PRs, and Test environment sections required by the template. Complete the description by adding all required template sections including Goals, Approach, Documentation, test coverage details, security checks, and test environment information.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main changes: latency calculation logic fix and addition of missing properties to analytics events.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go (1)

232-264: Latency calculation logic looks correct, but consider guarding against negative values.

The formulas correctly model the Envoy access-log timeline:

Metric Formula
BackendLatency lastUpRx - firstUpTx
RequestMediationLatency firstUpTx - lastRx
ResponseLatency lastDownTx - firstUpRx
ResponseMediationLatency lastDownTx - lastUpRx

Under abnormal conditions (e.g., clock skew, retries, or stream resets), individual timestamp durations from Envoy can occasionally be non-monotonic, producing negative latency values. If negative latencies would confuse dashboards or downstream analytics consumers, a clamp to 0 would be a cheap safeguard. Not critical given Moesif's tolerance, but worth considering.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go` around
lines 232 - 264, The latency calculations in analytics.go (the block computing
lastRx, firstUpTx, firstUpRx, lastUpRx, lastDownTx and creating dto.Latencies
with BackendLatency, RequestMediationLatency, ResponseLatency,
ResponseMediationLatency) can produce negative values under abnormal conditions;
clamp each latency to zero before assigning into dto.Latencies (e.g., compute
raw = lastUpRx - firstUpTx and set BackendLatency = raw if raw >= 0 else 0),
doing the same for RequestMediationLatency, ResponseLatency and
ResponseMediationLatency so downstream consumers never receive negative
latencies.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go`:
- Around line 256-261: The Latencies struct's Duration field is never set when
constructing dto.Latencies, so total end-to-end latency is serialized as 0;
compute Duration as the end-to-end difference (e.g., lastDownTx - firstUpTx or
whichever represents end-to-end start-to-end in your timing model) and assign it
when creating the latencies value. Update the block that builds dto.Latencies
(where BackendLatency, RequestMediationLatency, ResponseLatency,
ResponseMediationLatency are set) to also set Duration = lastDownTx - firstUpTx
(or the appropriate pair: firstUpTx and lastDownTx) so dto.Duration reflects the
total duration before serializing/returning.

---

Nitpick comments:
In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics.go`:
- Around line 232-264: The latency calculations in analytics.go (the block
computing lastRx, firstUpTx, firstUpRx, lastUpRx, lastDownTx and creating
dto.Latencies with BackendLatency, RequestMediationLatency, ResponseLatency,
ResponseMediationLatency) can produce negative values under abnormal conditions;
clamp each latency to zero before assigning into dto.Latencies (e.g., compute
raw = lastUpRx - firstUpTx and set BackendLatency = raw if raw >= 0 else 0),
doing the same for RequestMediationLatency, ResponseLatency and
ResponseMediationLatency so downstream consumers never receive negative
latencies.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
gateway/gateway-runtime/policy-engine/internal/analytics/analytics_test.go (1)

406-417: Consider strengthening TestPrepareAnalyticEvent_WithLatencies assertions.

The test now exercises all five latency fields, but the only assertion is BackendLatency >= 0 — this would also pass if all four latency fields computed to zero or if three of them were silently dropped. Given the deterministic test data in createLogEntryWithLatencies, the expected values are known exactly:

Field Expected
BackendLatency 100ms
RequestMediationLatency 50ms
ResponseLatency 100ms
ResponseMediationLatency 50ms

Additionally, there is no test for the nil-guard paths that analytics.go now handles for TimeToLastRxByte and TimeToFirstUpstreamRxByte.

♻️ Suggested improvements
 func TestPrepareAnalyticEvent_WithLatencies(t *testing.T) {
 	cfg := &config.Config{}
 	analytics := NewAnalytics(cfg)

 	logEntry := createLogEntryWithLatencies()

 	event := analytics.prepareAnalyticEvent(logEntry)

 	require.NotNil(t, event)
 	require.NotNil(t, event.Latencies)
-	assert.True(t, event.Latencies.BackendLatency >= 0)
+	// 100ms = lastUpRx(200ms) - firstUpTx(100ms)
+	assert.Equal(t, int64(100), event.Latencies.BackendLatency)
+	// 50ms = firstUpTx(100ms) - lastRx(50ms)
+	assert.Equal(t, int64(50), event.Latencies.RequestMediationLatency)
+	// 100ms = lastDownTx(250ms) - firstUpRx(150ms)
+	assert.Equal(t, int64(100), event.Latencies.ResponseLatency)
+	// 50ms = lastDownTx(250ms) - lastUpRx(200ms)
+	assert.Equal(t, int64(50), event.Latencies.ResponseMediationLatency)
 }

And add a nil-guard test alongside createLogEntryWithLatencies:

func TestPrepareAnalyticEvent_WithNilLatencyFields(t *testing.T) {
    cfg := &config.Config{}
    analytics := NewAnalytics(cfg)

    // TimeToLastRxByte and TimeToFirstUpstreamRxByte are nil — should not panic
    logEntry := &v3.HTTPAccessLogEntry{
        CommonProperties: &v3.AccessLogCommon{
            TimeToFirstUpstreamTxByte:  &durationpb.Duration{Seconds: 0, Nanos: 100000000},
            TimeToLastUpstreamRxByte:   &durationpb.Duration{Seconds: 0, Nanos: 200000000},
            TimeToLastDownstreamTxByte: &durationpb.Duration{Seconds: 0, Nanos: 250000000},
        },
        Request:  &v3.HTTPRequestProperties{RequestMethod: corev3.RequestMethod_GET},
        Response: &v3.HTTPResponseProperties{ResponseCode: wrapperspb.UInt32(200)},
    }

    assert.NotPanics(t, func() {
        event := analytics.prepareAnalyticEvent(logEntry)
        require.NotNil(t, event)
        require.NotNil(t, event.Latencies)
    })
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics_test.go`
around lines 406 - 417, Update TestPrepareAnalyticEvent_WithLatencies to assert
all computed latency fields against their exact expected values from
createLogEntryWithLatencies (BackendLatency == 100ms, RequestMediationLatency ==
50ms, ResponseLatency == 100ms, ResponseMediationLatency == 50ms) instead of
only checking BackendLatency >= 0; reference the test
TestPrepareAnalyticEvent_WithLatencies and the helper
createLogEntryWithLatencies and verify the returned event.Latencies fields
exactly match those durations after calling analytics.prepareAnalyticEvent. Also
add a new test (e.g., TestPrepareAnalyticEvent_WithNilLatencyFields) that
constructs a log entry where TimeToLastRxByte and TimeToFirstUpstreamRxByte are
nil, calls analytics.prepareAnalyticEvent, and uses assert.NotPanics while
asserting the returned event and event.Latencies are non-nil to exercise the
nil-guard paths in prepareAnalyticEvent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@gateway/gateway-runtime/policy-engine/internal/analytics/analytics_test.go`:
- Around line 406-417: Update TestPrepareAnalyticEvent_WithLatencies to assert
all computed latency fields against their exact expected values from
createLogEntryWithLatencies (BackendLatency == 100ms, RequestMediationLatency ==
50ms, ResponseLatency == 100ms, ResponseMediationLatency == 50ms) instead of
only checking BackendLatency >= 0; reference the test
TestPrepareAnalyticEvent_WithLatencies and the helper
createLogEntryWithLatencies and verify the returned event.Latencies fields
exactly match those durations after calling analytics.prepareAnalyticEvent. Also
add a new test (e.g., TestPrepareAnalyticEvent_WithNilLatencyFields) that
constructs a log entry where TimeToLastRxByte and TimeToFirstUpstreamRxByte are
nil, calls analytics.prepareAnalyticEvent, and uses assert.NotPanics while
asserting the returned event and event.Latencies are non-nil to exercise the
nil-guard paths in prepareAnalyticEvent.

Comment on lines +232 to +234
if properties != nil && properties.TimeToLastRxByte != nil &&
properties.TimeToFirstUpstreamTxByte != nil && properties.TimeToFirstUpstreamRxByte != nil &&
properties.TimeToLastUpstreamRxByte != nil && properties.TimeToLastDownstreamTxByte != nil {
Copy link
Contributor

@renuka-fernando renuka-fernando Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, to populate this dto.Latencies, all the conditions should be true, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Anyway this shouldn't be an issue because for moesif side they calculate their own latency info using the request and response timestamps. These additional latency info are to populate the custom dashboards we have. So if the required properties aren't there then there's no point of trying to calculate them

Also since we are deriving these from envoy access logs, they'll be present in almost every time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack. Thanks

@renuka-fernando renuka-fernando merged commit f583a91 into wso2:main Feb 18, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments