feat: serialize LangChain multimodal content to ingest content blocks #487
Open
5617134 to aefd885
Comment on lines +223 to +227

    @async_warn_catch_exception(logger=_logger)
    async def ingest_traces(self, traces_ingest_request: TracesIngestRequest) -> dict[str, Any]:
        if self.experiment_id:
            traces_ingest_request.experiment_id = UUID(self.experiment_id)
        elif self.log_stream_id:
Contributor
IngestTraces.ingest_traces now reimplements the experiment/log_stream wiring (and by extension the model_dump/_log_ingest_content_blocks flow) that already exists in Traces.ingest_traces; keeping two copies means every future change to request preparation or logging must be applied twice. Can we extract a shared helper that sets experiment_id/log_stream_id, dumps the request, logs the content blocks, and optionally sets logging_method, then call it from both ingestion clients so we only maintain that plumbing in one place?
Finding type: Code Dedup and Conventions
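One way to read the suggestion above is a single module-level helper that both clients call before posting. The sketch below is a minimal illustration, not the PR's code: `prepare_ingest_payload` is a hypothetical name, the `TracesIngestRequest` dataclass is a stand-in for the real pydantic model (so `dataclasses.asdict` replaces `model_dump`), and the trace shape (`{"input": [blocks]}`) is an assumption.

```python
import logging
from dataclasses import asdict, dataclass, field
from typing import Any, Optional
from uuid import UUID

_logger = logging.getLogger(__name__)


@dataclass
class TracesIngestRequest:
    """Stand-in for the real request model; field names follow the diff above."""

    traces: list[dict[str, Any]] = field(default_factory=list)
    experiment_id: Optional[UUID] = None
    log_stream_id: Optional[UUID] = None
    logging_method: Optional[str] = None


def prepare_ingest_payload(
    request: TracesIngestRequest,
    *,
    experiment_id: Optional[str] = None,
    log_stream_id: Optional[str] = None,
    logging_method: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical shared helper: wire IDs, dump the request, log block types."""
    # Same precedence as the diff: experiment first, then log stream.
    if experiment_id:
        request.experiment_id = UUID(experiment_id)
    elif log_stream_id:
        request.log_stream_id = UUID(log_stream_id)
    if logging_method is not None:
        request.logging_method = logging_method

    payload = asdict(request)  # the real code would call model_dump() here

    # Log only the block types, never the block data itself.
    for index, trace in enumerate(payload["traces"]):
        content = trace.get("input")  # assumed location of content blocks
        if isinstance(content, list):
            types = [b.get("type") for b in content if isinstance(b, dict)]
            _logger.debug("trace %d content block types: %s", index, types)
    return payload
```

With a helper along these lines, `Traces.ingest_traces` and `IngestTraces.ingest_traces` would each reduce to one call followed by their own HTTP request, and only the API-proxied client would pass `logging_method`.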
pratduv commented Feb 26, 2026
    elif self.experiment_id:
        self._traces_client = Traces(project_id=self.project_id, experiment_id=self.experiment_id)

    if os.environ.get("GALILEO_INGEST_URL"):
Contributor
Author
Only created when this URL is set
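The opt-in behaviour described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the logger's actual code: `select_traces_client` is a hypothetical function, and both client classes are bare stand-ins for `Traces` and `IngestTraces`. Only the `GALILEO_INGEST_URL` variable name comes from the PR.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Traces:
    """Stand-in for the API-proxied client."""

    project_id: str


@dataclass
class IngestTraces:
    """Stand-in for the direct ingest client."""

    project_id: str
    base_url: str


def select_traces_client(
    project_id: str, env: Optional[Mapping[str, str]] = None
) -> Union[Traces, IngestTraces]:
    """Hypothetical selector: use the ingest client only when the URL is set."""
    env = os.environ if env is None else env
    ingest_url = env.get("GALILEO_INGEST_URL")
    # An unset or empty variable keeps the default API-proxied path.
    if ingest_url:
        return IngestTraces(project_id=project_id, base_url=ingest_url)
    return Traces(project_id=project_id)
```

Passing `env` explicitly is a choice made for this sketch so the selection can be exercised in tests without mutating the process environment.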
Convert LangChain's multimodal message format (image_url, audio_url, etc.) into Galileo's IngestContentBlock schema (TextContentBlock, DataContentBlock) instead of flattening list content to a plain string. Adds IngestTraces client for the orbit ingest service (opt-in via GALILEO_INGEST_URL env var) and debug logging at serialization boundaries. Made-with: Cursor
1d6a39e to 30195c9
User description
Summary
- Convert LangChain's multimodal message format (`image_url`, `audio_url`, `video_url`, etc.) into Galileo's `IngestContentBlock` schema (`TextContentBlock`, `DataContentBlock`) instead of flattening list content to a plain string
- Add `IngestTraces` client for the orbit ingest service (opt-in via `GALILEO_INGEST_URL` env var)

Changes

Serialization (`src/galileo/utils/serialization.py`)
- `_convert_langchain_content_block()` maps LangChain's `{"type": "image_url", "image_url": {"url": "..."}}` format to `DataContentBlock(type="image", url="...")`, with base64 data URI detection
- `EventSerializer` now converts list content on both `AIMessage` and `BaseMessage` subclasses to content block arrays instead of extracting only the first text element

Ingest client (`src/galileo/traces.py`)
- `IngestTraces` class sends traces directly to the orbit ingest service via `httpx.AsyncClient`
- `_log_ingest_content_blocks()` helper logs content block types at DEBUG level before HTTP POST
- `Routes.ingest_traces` constant

Logger integration (`src/galileo/logger/logger.py`)
- `GalileoLogger` creates an `IngestTraces` client when `GALILEO_INGEST_URL` is set, preferring it over the API-proxied path

Handler (`src/galileo/handlers/langchain/handler.py`)
- Debug logging in `on_chat_model_start` showing multimodal message count

Test plan
- `TestConvertLangchainContentBlock` -- unit tests for text, image URL, base64, audio, unknown fallback
- `TestMultimodalContentSerialization` -- round-trip serialization of AIMessage/HumanMessage with multimodal content
- `test_on_chat_model_start_multimodal` -- integration test verifying LangChain callback produces structured content blocks

Dependencies

Requires galileo-core#feature/sc-56110 to be merged and released first (adds `IngestContentBlock`, `TextContentBlock`, `DataContentBlock` to `galileo_core.schemas.shared.content_blocks`).

Made with Cursor
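The conversion described under Serialization can be sketched roughly as below. This is not the PR's implementation: the two dataclasses are stand-ins for the `galileo_core` schemas, the `base64` and `mime_type` fields are assumed, the handling of a detected data URI is a guess at what "base64 data URI detection" does, and falling back to a text block for unknown shapes is likewise an assumption.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional, Union

# Matches simple data URIs such as "data:image/png;base64,AAAA".
# URIs with extra parameters are not matched and are kept as plain URLs.
_DATA_URI = re.compile(r"^data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,(?P<data>.+)$", re.DOTALL)

# LangChain block type -> content block modality (mapping assumed from the PR text).
_URL_KEYS = {"image_url": "image", "audio_url": "audio", "video_url": "video"}


@dataclass
class TextContentBlock:
    """Stand-in for the galileo_core schema."""

    text: str
    type: str = "text"


@dataclass
class DataContentBlock:
    """Stand-in for the galileo_core schema; base64/mime_type are assumed fields."""

    type: str
    url: Optional[str] = None
    base64: Optional[str] = None
    mime_type: Optional[str] = None


Block = Union[TextContentBlock, DataContentBlock]


def convert_block(block: Any) -> Block:
    """Map one LangChain content item to a content block."""
    if isinstance(block, str):
        return TextContentBlock(text=block)
    if isinstance(block, dict):
        kind = block.get("type")
        if kind == "text":
            return TextContentBlock(text=str(block.get("text", "")))
        if kind in _URL_KEYS:
            ref = block.get(kind)
            url = ref.get("url") if isinstance(ref, dict) else ref
            if isinstance(url, str):
                match = _DATA_URI.match(url)
                if match:  # inline payload: keep the data, drop the URL
                    return DataContentBlock(
                        type=_URL_KEYS[kind], base64=match["data"], mime_type=match["mime"]
                    )
                return DataContentBlock(type=_URL_KEYS[kind], url=url)
    # Unknown shape: assumed fallback is a text block with the repr.
    return TextContentBlock(text=str(block))


def convert_content(content: Union[str, list]) -> Union[str, list[Block]]:
    """Keep plain strings as-is; convert every item of list content."""
    if isinstance(content, str):
        return content
    return [convert_block(item) for item in content]
```

The point of `convert_content` is that every list item survives as its own block, in contrast to the previous behaviour of extracting only the first text element.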
Generated description
Below is a concise technical summary of the changes proposed in this PR:
Enhances the LangChain callback handler to natively serialize multimodal content, such as images and audio, into Galileo's structured `IngestContentBlock` schema, moving away from flattened string representations. Introduces a new `IngestTraces` client and a dedicated `/ingest/traces` route, enabling direct trace submission to the orbit ingest service, configurable via an environment variable.

- Converts LangChain multimodal content (`image_url`, `audio_url`) into Galileo's `TextContentBlock` and `DataContentBlock` schemas. This involves updating the `EventSerializer` to process list-based content in `AIMessage` and `BaseMessage` subclasses as structured content blocks, ensuring proper serialization for various message types and their streaming chunks. Modified files (4)
- Adds an `IngestTraces` client for direct communication with the orbit ingest service, bypassing the main API for trace submission. This includes defining a new `/ingest/traces/{project_id}` route, integrating the client into the `GalileoLogger` for conditional use based on the `GALILEO_INGEST_URL` environment variable, and updating project dependencies to reflect the required `galileo-core` schema changes. Modified files (5)
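As a small illustration of how the `/ingest/traces/{project_id}` route might be combined with the configured base URL, here is a hedged sketch. `build_ingest_url` and the `INGEST_TRACES_ROUTE` constant are hypothetical names; the PR only states that a `Routes.ingest_traces` constant exists and that the base URL comes from `GALILEO_INGEST_URL`.

```python
from urllib.parse import quote

# Route template as quoted in the description; the constant name is assumed.
INGEST_TRACES_ROUTE = "/ingest/traces/{project_id}"


def build_ingest_url(base_url: str, project_id: str) -> str:
    """Join the ingest base URL and the route, tolerating a trailing slash."""
    path = INGEST_TRACES_ROUTE.format(project_id=quote(project_id, safe=""))
    return base_url.rstrip("/") + path
```

Plain concatenation is used here instead of `urllib.parse.urljoin`, since `urljoin` with an absolute path would discard any path prefix already present in the base URL.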