This repository was archived by the owner on Feb 26, 2026. It is now read-only.

fix: prevent HTTP client connection leaks in embedders #917

Open
choutos wants to merge 1 commit into timescale:main from choutos:fix/openai-client-connection-leak

Conversation


@choutos choutos commented Feb 4, 2026

Fixes #919

Problem

The AsyncOpenAI, Ollama, and VoyageAI embedders create HTTP clients that are never explicitly closed. This causes connections to accumulate in CLOSE_WAIT state and eventually exhaust file descriptors.

Investigation Details

In production, we observed:

  • ~35,000 file descriptors held by pgai-vectorizer-worker
  • All were sockets in CLOSE_WAIT state
  • Connections were to the OpenAI API (via Cloudflare)
  • The systemd error: Failed to allocate manager object: Too many open files

Root Cause

The _embedder property in openai.py creates an AsyncOpenAI client that holds an httpx.AsyncClient. When the embedder is garbage collected, the HTTP client's connections are not properly closed, leaving them in CLOSE_WAIT until the OS times them out.
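A minimal sketch of that pattern (`FakeAsyncClient` and `LeakyEmbedder` are illustrative stand-ins, not the actual pgai or httpx classes): every property access builds a fresh client, and no code path ever closes the previous one.

```python
import asyncio

class FakeAsyncClient:
    """Illustrative stand-in for httpx.AsyncClient (not the real library)."""
    instances_open = 0

    def __init__(self):
        # Each instance represents an open connection pool.
        FakeAsyncClient.instances_open += 1

class LeakyEmbedder:
    """Mirrors the problematic shape: a property that constructs a new
    client on every access and never closes the old one."""
    @property
    def _embedder(self) -> FakeAsyncClient:
        return FakeAsyncClient()

async def embed_batches(n: int) -> int:
    e = LeakyEmbedder()
    for _ in range(n):
        client = e._embedder  # each access opens a fresh client
        # ... requests would be issued via `client` here ...
    return FakeAsyncClient.instances_open

leaked = asyncio.run(embed_batches(3))
print(leaked)  # 3 clients created, none ever closed
```

Garbage collection eventually reclaims the Python objects, but the sockets they held linger in CLOSE_WAIT until the OS gives up on them.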

Similar patterns exist in ollama.py and voyageai.py.

Solution

  1. Add cleanup() method to the Embedder base class
  2. Implement cleanup() in OpenAI, Ollama, and VoyageAI embedders to close underlying HTTP clients
  3. Call cleanup() in Executor.run()'s finally block to ensure resources are released
  4. Reuse client instances instead of creating new ones for each request (Ollama, VoyageAI)
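The four steps above can be sketched roughly as follows (`FakeClient` and `run` are illustrative stand-ins for the real client and `Executor.run()`, not the actual pgai code):

```python
import asyncio

class FakeClient:
    """Illustrative stand-in for e.g. openai.AsyncOpenAI."""
    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True

class Embedder:
    """Base class: cleanup() is a no-op unless a subclass holds resources."""
    async def cleanup(self) -> None:
        pass  # step 1: hook for subclasses

class OpenAIEmbedder(Embedder):
    def __init__(self):
        self._client = None

    def _get_client(self) -> FakeClient:
        # Step 4: reuse one client (one connection pool) across requests.
        if self._client is None:
            self._client = FakeClient()
        return self._client

    async def cleanup(self) -> None:
        # Step 2: explicitly close the underlying HTTP client.
        if self._client is not None:
            await self._client.close()
            self._client = None

async def run(embedder: OpenAIEmbedder) -> None:
    try:
        client = embedder._get_client()
        # ... embedding batches would be processed here ...
    finally:
        # Step 3: release resources no matter how the loop exits.
        await embedder.cleanup()

e = OpenAIEmbedder()
client = e._get_client()
asyncio.run(run(e))
print(client.closed)  # True
```

Putting the `cleanup()` call in the `finally` block means even a crash mid-batch still closes the pooled connections.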

Changes

  • embeddings.py: Add cleanup() method to base class
  • openai.py: Store client reference, implement cleanup()
  • ollama.py: Add _get_client() for client reuse, implement cleanup()
  • voyageai.py: Add _get_client() for client reuse, implement cleanup()
  • vectorizer.py: Call cleanup() in finally block

Testing

This fix was developed in response to a production issue. We recommend:

  • Unit tests for cleanup() methods
  • Integration test that verifies no file descriptor leak after multiple embedding batches
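A unit test for the first suggestion could look like this (a sketch: `StubEmbedder` and `FakeClient` are hypothetical stand-ins for the real embedders and their HTTP clients; the fd-leak integration test would additionally compare `/proc/<pid>/fd` counts before and after several batches):

```python
import asyncio

class FakeClient:
    """Stand-in for the Ollama/VoyageAI HTTP client (illustrative)."""
    def __init__(self):
        self.closed = False

    async def aclose(self):
        self.closed = True

class StubEmbedder:
    """Mirrors the _get_client()/cleanup() pattern described above."""
    def __init__(self):
        self._client = None

    def _get_client(self) -> FakeClient:
        if self._client is None:
            self._client = FakeClient()
        return self._client

    async def cleanup(self) -> None:
        if self._client is not None:
            await self._client.aclose()
            self._client = None

def test_cleanup_closes_reused_client():
    e = StubEmbedder()
    first = e._get_client()
    assert first is e._get_client()   # client is reused, not recreated
    asyncio.run(e.cleanup())
    assert first.closed               # underlying connections released
    assert e._client is None          # safe to lazily re-create later

test_cleanup_closes_reused_client()
print("ok")
```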

@choutos choutos requested a review from a team as a code owner February 4, 2026 17:19
@choutos choutos had a problem deploying to external-contributors February 4, 2026 17:19 — with GitHub Actions Error

CLAassistant commented Feb 4, 2026

CLA assistant check
All committers have signed the CLA.

The AsyncOpenAI, Ollama, and VoyageAI embedders were creating HTTP clients
that were never explicitly closed, causing connections to accumulate in
CLOSE_WAIT state and eventually exhausting file descriptors.

Changes:
- Add cleanup() method to Embedder base class
- Implement cleanup() in OpenAI, Ollama, and VoyageAI embedders to close
  underlying HTTP clients
- Call cleanup() in Executor.run() finally block to ensure resources are
  released regardless of how the embedding loop exits
- Reuse client instances instead of creating new ones for each request
  (Ollama, VoyageAI)

This fixes a file descriptor leak that could cause 'Too many open files'
errors after prolonged operation.


Development

Successfully merging this pull request may close these issues.

HTTP client connection leak in embedders causes file descriptor exhaustion
