Skip to content

RSPEED-2413: replace BaseHTTPMiddleware with pure ASGI middleware#1133

Open
major wants to merge 1 commit intolightspeed-core:mainfrom
major:rspeed-2413-fix-runtime-error-mcp-tools
Open

RSPEED-2413: replace BaseHTTPMiddleware with pure ASGI middleware#1133
major wants to merge 1 commit intolightspeed-core:mainfrom
major:rspeed-2413-fix-runtime-error-mcp-tools

Conversation

@major
Copy link
Contributor

@major major commented Feb 10, 2026

Description

Both rest_api_metrics and global_exception_middleware use Starlette's
@app.middleware() decorator, which wraps the endpoint in
BaseHTTPMiddleware. This middleware uses call_next() internally, which
spawns the handler inside an anyio TaskGroup. When inference requests
take 15-30s (normal for LLM + MCP tool calls), the TaskGroup context
loses the response reference, raising RuntimeError: No response returned
instead of returning the completed response.

This is a known limitation of BaseHTTPMiddleware:
Kludex/starlette#1914

This PR replaces both @app.middleware() decorators with pure ASGI
middleware classes that call self.app(scope, receive, send) directly.
This eliminates call_next(), the TaskGroup wrapping, and the response
buffering — the response flows straight through the middleware chain
regardless of how long the handler takes.

Type of change

  • Bug fix

Tools used to create PR

  • Assisted-by: Claude (opencode)
  • Generated by: N/A

Related Tickets & Documents

  • Related Issue # RSPEED-2413
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  1. All 1338 unit tests pass, mypy clean (0 issues / 268 files), ruff clean
  2. Manually verified using the lscore-poc compose stack:
    • Before fix: POST /v1/infer returns 500 with RuntimeError: No response returned on every request (100% failure rate)
    • After fix (middleware removed from test app): POST /v1/infer
      returns 200 with correct LLM response
    • Direct Python inference (bypassing HTTP) always succeeded,
      confirming the issue was isolated to BaseHTTPMiddleware

Summary by CodeRabbit

  • Refactor
    • Middleware reworked to an ASGI-based implementation for more reliable error handling and accurate request metrics; registration order updated.
  • Bug Fixes
    • Uncaught exceptions are logged and return JSON error responses only when no response has started; metrics endpoint and non-matching paths are excluded from counting.
  • Tests
    • Unit tests updated to validate the new ASGI middleware behavior and message-based responses.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 10, 2026

Walkthrough

Replaces decorator-based request/response middlewares with two ASGI middleware classes—GlobalExceptionMiddleware and RestApiMetricsMiddleware—registered via app.add_middleware, converts logic to scope/receive/send handling, and updates unit tests to exercise ASGI message flows.

Changes

Cohort / File(s) Summary
ASGI Middleware Implementation
src/app/main.py
Adds RestApiMetricsMiddleware and GlobalExceptionMiddleware as ASGI __call__(scope, receive, send) classes; removes decorator-based middlewares and call_next usage; updates typings/imports to ASGI types; computes app_routes_paths for path filtering and registers middlewares via app.add_middleware(...).
Middleware Tests (ASGI)
tests/unit/app/test_main_middleware.py
Reworks tests to instantiate the new ASGI middleware classes with mock ASGI apps; adds ASGI helpers (_make_scope, _noop_receive, _ResponseCollector); asserts on collected http.response.start / http.response.body messages; covers exception handling, HTTPException passthrough, response-start behavior, non-HTTP scope skipping, and metrics increments.
Public API Surface
src/app/main.py, tests/unit/...
Exports changed: GlobalExceptionMiddleware and RestApiMetricsMiddleware added; prior global_exception_middleware decorator/function removed from the public API.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant GEM as GlobalExceptionMiddleware
    participant RM as RestApiMetricsMiddleware
    participant App as Inner ASGI App
    participant Metrics as Metrics Collector

    Client->>GEM: ASGI request (scope, receive, send)
    GEM->>RM: forward scope/receive/send
    RM->>App: forward scope/receive/send
    alt App sends response events
        App-->>RM: http.response.start / http.response.body
        RM->>Metrics: record duration & status via send wrapper
        RM-->>GEM: response events
        GEM-->>Client: response delivered
    else App raises uncaught exception
        App--xRM: exception raised
        RM->>Metrics: increment error counter (path/status)
        RM-->>GEM: propagate exception
        GEM->>Client: send 500 JSON error if response not started
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: replacing BaseHTTPMiddleware-based middleware with pure ASGI middleware classes to fix long-running request handling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/app/main.py (1)

136-138: Consider enriching __init__ and __call__ docstrings with parameter info.

Per coding guidelines, functions require "complete type annotations for parameters and return types" and should "Follow Google Python docstring conventions including Parameters, Returns, Raises, and Attributes sections as needed." The current one-liner docstrings are minimal. For __call__, documenting the scope/receive/send contract and the exception-handling behavior would align better with the guideline. This is a minor nit for well-known ASGI dunder methods.

As per coding guidelines, All functions require docstrings with brief descriptions and complete type annotations for parameters and return types and Follow Google Python docstring conventions including Parameters, Returns, Raises, and Attributes sections as needed.

Also applies to: 184-186

tests/unit/app/test_main_middleware.py (2)

159-177: Consider adding a happy-path test for RestApiMetricsMiddleware.

The only RestApiMetricsMiddleware tests cover the non-HTTP passthrough and the exception path. There's no test for a successful request that verifies:

  1. send_wrapper correctly captures the status code from the inner app's response.
  2. The duration metric and call counter are recorded with the correct labels.
  3. The /metrics path exclusion actually skips the counter.

A happy-path test would be the most direct way to catch regressions in the core metrics-recording logic.


5-5: Prefer pytest-mock's mocker fixture over unittest.mock.patch.

The coding guidelines for tests/unit/**/*.py specify using pytest-mock. While only patch (not AsyncMock) is used here, switching to mocker.patch(...) would be more consistent with the project convention.

As per coding guidelines, Use pytest-mock for AsyncMock objects in unit tests.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/app/main.py`:
- Around line 119-171: The RestApiMetricsMiddleware.__call__ currently awaits
self.app(scope, receive, send_wrapper) inside the duration context so if the
inner app raises the metrics.rest_api_calls_total.labels(path,
status_code).inc() is never executed; wrap the await in a try/finally (or ensure
a finally block) so that regardless of exceptions you still call
metrics.rest_api_calls_total.labels(path, status_code).inc() for measured paths
(excluding "/metrics"); keep using send_wrapper to capture the status_code and
reference path and app_routes_paths to determine whether to record the counter.
- Around line 225-229: Swap the registration order of the two middlewares so
RestApiMetricsMiddleware is registered last (making it the outermost middleware
at runtime): register GlobalExceptionMiddleware first and then call
app.add_middleware(RestApiMetricsMiddleware) after it, ensuring
RestApiMetricsMiddleware wraps GlobalExceptionMiddleware and can observe
responses (including 500s) emitted by GlobalExceptionMiddleware.
🧹 Nitpick comments (4)
src/app/main.py (1)

136-138: Pylint W0621 (app shadowing outer scope) — acceptable for ASGI middleware constructors.

The app parameter in __init__ shadows the module-level app (the FastAPI instance). This is the standard ASGI middleware constructor signature and suppressing the warning is fine. The R0903 "too few public methods" warnings are also inherent to the ASGI middleware pattern. Consider adding a # pylint: disable=... inline or a project-level pylint config entry if you want a clean lint run.

Also applies to: 182-184

tests/unit/app/test_main_middleware.py (3)

30-52: _ResponseCollector is a handy test utility.

The pipeline flags missing docstrings on status_code and body_json properties (C0116). Since these are test-internal helpers the impact is minimal, but adding one-liners would silence the linter.


110-112: Prefix unused ASGI parameters with _ to silence W0613.

The pipeline flags scope, receive, and send as unused in the inner test apps. Prefixing with _ is the standard Python convention and would clean up the lint output.

Example for one of the inner apps
-    async def partial_response_app(scope: Scope, receive: Receive, send: Send) -> None:
+    async def partial_response_app(_scope: Scope, _receive: Receive, send: Send) -> None:

Apply the same pattern to inner_app in both non-HTTP tests (unused scope, receive, send).

Also applies to: 126-128, 145-147


140-151: RestApiMetricsMiddleware has only one test — consider adding coverage for the happy path and edge cases.

The only test verifies non-HTTP passthrough. The following scenarios are untested at the unit level:

  1. Metrics counter increment — a request to a known app route results in rest_api_calls_total being incremented with the correct path and status.
  2. /metrics endpoint exclusion — requests ending with /metrics do not increment the counter.
  3. Non-app-route passthrough — a path not in app_routes_paths bypasses metrics entirely.
  4. Status code capturesend_wrapper correctly extracts the status from http.response.start.

These are the core behaviors of the middleware and are worth covering, especially since the class was just rewritten.

@major major force-pushed the rspeed-2413-fix-runtime-error-mcp-tools branch from 549dc1e to 74ff64f Compare February 10, 2026 23:21
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/unit/app/test_main_middleware.py`:
- Around line 155-166: The test
test_rest_api_metrics_increments_counter_on_exception currently only checks that
RuntimeError propagates; update it to also assert that the metrics counter was
incremented by patching or mocking
metrics.rest_api_calls_total.labels(...).inc() (or monkeypatching
RestApiMetricsMiddleware to use a mock counter) before invoking middleware, then
after the awaited call (inside the pytest.raises block) assert that the mock
inc() was called once with the expected label values; reference the test
function name and RestApiMetricsMiddleware and
metrics.rest_api_calls_total.labels(...).inc() to locate where to add the mock
and assertion.
🧹 Nitpick comments (1)
tests/unit/app/test_main_middleware.py (1)

40-45: Add docstrings to the status_code and body_json properties.

The linter flags C0116: Missing function or method docstring on these properties. As per coding guidelines, "All functions require docstrings with brief descriptions and complete type annotations for parameters and return types".

Proposed fix
     `@property`
     def status_code(self) -> int:
+        """Return the HTTP status code from the collected response messages."""
         for msg in self.messages:
             if msg["type"] == "http.response.start":
                 return msg["status"]
         raise AssertionError("No http.response.start message")
 
     `@property`
     def body_json(self) -> dict:
+        """Return the response body decoded as JSON."""
         body = b""

- Convert rest_api_metrics and global_exception_middleware from
  @app.middleware decorators to pure ASGI middleware classes
- Eliminates call_next() which wraps handlers in anyio TaskGroup,
  causing "RuntimeError: No response returned" on long-running
  inference requests (15-30s)
- Update middleware tests to use ASGI scope/receive/send protocol

Signed-off-by: Major Hayden <major@redhat.com>
@major major force-pushed the rspeed-2413-fix-runtime-error-mcp-tools branch from 74ff64f to 2f62622 Compare February 10, 2026 23:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant