This repository was archived by the owner on Feb 26, 2026. It is now read-only.
feat: add OpenAI's async batch API support for vectorizer #564
Draft
alejandrodnm wants to merge 1 commit into main from
Conversation
alejandrodnm force-pushed the branch from 0edcb8e to 846322f (compare)
Adds support for OpenAI's async batch API to process large amounts of embeddings at a lower cost and with higher rate limits.

Key features:

- New AsyncBatchEmbedder interface for handling async batch operations
- Support for OpenAI's batch API implementation
- New database tables for tracking batch status and chunks
- Configurable polling interval for batch status checks
- Automatic retry mechanism for failed batches

Database changes:

- New async_batch_queue_table for tracking batch status
- New async_batch_chunks_table for storing chunks pending processing
- Added async_batch_polling_interval column to vectorizer table
- New SQL functions for managing async batch operations

API changes:

- New async_batch_enabled parameter in ai.embedding_openai()
- New ai.vectorizer_enable_async_batches() and ai.vectorizer_disable_async_batches() functions
- Extended vectorizer configuration to support async batch operations

The async batch workflow:

1. Chunks are collected and submitted as a batch to OpenAI
2. Batch status is monitored through polling
3. When ready, embeddings are retrieved and stored
4. Batch resources are cleaned up after successful processing
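The four-step workflow above could be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the function names and the injected `submit`/`poll`/`fetch_results`/`cleanup` callables are assumptions chosen so the loop can be exercised without network access, while the status strings mirror OpenAI's batch lifecycle ("validating", "in_progress", "completed", "failed").

```python
import time

def process_async_batch(submit, poll, fetch_results, cleanup,
                        polling_interval=0.0, max_polls=100):
    """Hypothetical sketch of the PR's async batch workflow.

    submit()                 -> batch_id     (step 1: submit chunks as a batch)
    poll(batch_id)           -> status str   (step 2: monitor via polling)
    fetch_results(batch_id)  -> embeddings   (step 3: retrieve embeddings)
    cleanup(batch_id)                        (step 4: clean up batch resources)
    """
    batch_id = submit()
    for _ in range(max_polls):
        status = poll(batch_id)
        if status == "completed":
            # Step 3: retrieve and return embeddings, then step 4: clean up.
            embeddings = fetch_results(batch_id)
            cleanup(batch_id)
            return embeddings
        if status in ("failed", "expired", "cancelled"):
            # A failed batch would be retried by the worker's retry mechanism.
            raise RuntimeError(f"batch {batch_id} ended with status {status!r}")
        # polling_interval corresponds to the configurable
        # async_batch_polling_interval column described above.
        time.sleep(polling_interval)
    raise TimeoutError(f"batch {batch_id} did not complete after {max_polls} polls")
```

Injecting the polling function keeps the loop testable with a fake client that yields a scripted sequence of statuses.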
alejandrodnm force-pushed the branch from 846322f to f26b879 (compare)
alejandrodnm (Collaborator, Author):

@kolaente this is a first draft of the PR; we have some pending changes to our interfaces, and we need to see how they'll affect this work.
alejandrodnm force-pushed the branch from 35e80a5 to f15011e (compare)
https://www.loom.com/share/09ff363c52204bf6851a797d4c4c4d50?sid=79e95d06-017d-4de5-a740-aa5af916b971
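The new SQL surface described in the summary might be used along these lines. This is a hedged sketch: `async_batch_enabled` and the enable/disable functions come from the PR summary, while the `ai.create_vectorizer` call shape and the vectorizer-id argument are assumptions, not the merged code.

```sql
-- Enable async batching when defining the embedding configuration
-- (async_batch_enabled is the new parameter added by this PR):
SELECT ai.create_vectorizer(
    'public.documents'::regclass,
    embedding => ai.embedding_openai(
        'text-embedding-3-small',
        1536,
        async_batch_enabled => true
    )
);

-- Toggle async batches on an existing vectorizer
-- (the vectorizer-id argument here is an assumption):
SELECT ai.vectorizer_enable_async_batches(1);
SELECT ai.vectorizer_disable_async_batches(1);
```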