This repository was archived by the owner on Feb 26, 2026. It is now read-only.
fix: prevent HTTP client connection leaks in embedders #917
Open
choutos wants to merge 1 commit into timescale:main
Conversation
The AsyncOpenAI, Ollama, and VoyageAI embedders were creating HTTP clients that were never explicitly closed, causing connections to accumulate in CLOSE_WAIT state and eventually exhausting file descriptors.

Changes:

- Add a `cleanup()` method to the `Embedder` base class
- Implement `cleanup()` in the OpenAI, Ollama, and VoyageAI embedders to close the underlying HTTP clients
- Call `cleanup()` in `Executor.run()`'s finally block to ensure resources are released regardless of how the embedding loop exits
- Reuse client instances instead of creating new ones for each request (Ollama, VoyageAI)

This fixes a file descriptor leak that could cause "Too many open files" errors after prolonged operation.
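The lifecycle described in the changes above can be sketched as follows. This is an illustrative, stdlib-only sketch, not the actual pgai source: `FakeEmbedder` and the boolean flag stand in for a real embedder holding an `httpx.AsyncClient`.

```python
from abc import ABC, abstractmethod
import asyncio

class Embedder(ABC):
    """Base class: subclasses that own network clients override cleanup()."""

    @abstractmethod
    async def embed(self, text: str) -> list[float]: ...

    async def cleanup(self) -> None:
        """Release held resources; a no-op by default."""

class FakeEmbedder(Embedder):
    def __init__(self) -> None:
        self.client_open = True  # stands in for an open httpx.AsyncClient

    async def embed(self, text: str) -> list[float]:
        return [0.0]

    async def cleanup(self) -> None:
        self.client_open = False  # stands in for `await client.aclose()`

class Executor:
    def __init__(self, embedder: Embedder) -> None:
        self.embedder = embedder

    async def run(self, items: list[str]) -> list[list[float]]:
        try:
            return [await self.embedder.embed(item) for item in items]
        finally:
            # Release the embedder's client no matter how the loop exits.
            await self.embedder.cleanup()

emb = FakeEmbedder()
vectors = asyncio.run(Executor(emb).run(["a", "b"]))
print(len(vectors), emb.client_open)  # → 2 False
```

Putting `cleanup()` in the `finally` block means the client is closed on normal completion, on exceptions, and on cancellation alike.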
Force-pushed from 5e73f20 to 20648be
Fixes #919
Problem
The AsyncOpenAI, Ollama, and VoyageAI embedders create HTTP clients that are never explicitly closed. This causes connections to accumulate in CLOSE_WAIT state and eventually exhaust file descriptors.
Investigation Details
In production, we observed:
pgai-vectorizer-worker systemd error: Failed to allocate manager object: Too many open files

Root Cause
The `_embedder` property in `openai.py` creates an `AsyncOpenAI` client that holds an `httpx.AsyncClient`. When the embedder is garbage collected, the HTTP client's connections are not properly closed, leaving them in CLOSE_WAIT until the OS times them out.

Similar patterns exist in `ollama.py` and `voyageai.py`.

Solution
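The leaky shape described above looks roughly like this. A hypothetical sketch, not the pgai source: the `object()` placeholder stands in for constructing `AsyncOpenAI(...)`.

```python
class LeakyEmbedder:
    @property
    def _embedder(self):
        # In the real code this would build an AsyncOpenAI(...) client; the
        # httpx.AsyncClient inside is never aclose()'d, so its sockets sit
        # in CLOSE_WAIT until the OS reaps them.
        return object()  # placeholder for a freshly constructed client

e = LeakyEmbedder()
first, second = e._embedder, e._embedder
print(first is second)  # → False: every access builds a new client
```

Because each property access returns a brand-new object, every request can leave behind another unclosed connection pool.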
- Add a `cleanup()` method to the `Embedder` base class
- Implement `cleanup()` in the OpenAI, Ollama, and VoyageAI embedders to close the underlying HTTP clients
- Call `cleanup()` in `Executor.run()`'s finally block to ensure resources are released

Changes
- `embeddings.py`: Add `cleanup()` method to the base class
- `openai.py`: Store the client reference, implement `cleanup()`
- `ollama.py`: Add `_get_client()` for client reuse, implement `cleanup()`
- `voyageai.py`: Add `_get_client()` for client reuse, implement `cleanup()`
- `vectorizer.py`: Call `cleanup()` in the finally block

Testing
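The `_get_client()` reuse pattern mentioned for `ollama.py` and `voyageai.py` can be sketched like this. Names and the dict stand-in are illustrative; a real implementation would cache and later `aclose()` an `httpx.AsyncClient`.

```python
import asyncio

class ReusingEmbedder:
    def __init__(self) -> None:
        self._client = None
        self.clients_created = 0

    def _get_client(self) -> dict:
        # Build the client lazily, once, and hand back the cached instance.
        if self._client is None:
            self.clients_created += 1
            self._client = {"open": True}  # stands in for httpx.AsyncClient
        return self._client

    async def cleanup(self) -> None:
        if self._client is not None:
            self._client["open"] = False  # stands in for `await client.aclose()`
            self._client = None

e = ReusingEmbedder()
a, b = e._get_client(), e._get_client()
asyncio.run(e.cleanup())
print(a is b, e.clients_created, a["open"])  # → True 1 False
```

Reusing one client per embedder keeps connection pooling effective and gives `cleanup()` a single well-known object to close.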
This fix was developed in response to a production issue. We recommend:
- Adding tests for the `cleanup()` methods
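A test along the recommended lines could start from something like this sketch. `StubEmbedder` is a hypothetical stand-in; a real suite would exercise the actual embedders (likely via pytest-asyncio).

```python
import asyncio
import unittest

class StubEmbedder:
    """Stand-in for a real embedder; records whether cleanup() ran."""

    def __init__(self) -> None:
        self.cleaned = False

    async def cleanup(self) -> None:
        self.cleaned = True  # a real embedder would close its HTTP client here

class CleanupTest(unittest.TestCase):
    def test_cleanup_releases_client(self):
        emb = StubEmbedder()
        asyncio.run(emb.cleanup())
        self.assertTrue(emb.cleaned)

suite = unittest.TestLoader().loadTestsFromTestCase(CleanupTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # → True
```

A fuller test would also assert that cleanup runs when the embedding loop raises, mirroring the `finally` block in `Executor.run()`.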