Conversation
… separate thread because of sync facade in postgres introspector)
JulienArzul
left a comment
There was a problem hiding this comment.
@hsestupin what are the plans for the MCP server? Is it worth it to add new tools in this one? or should we stop adding anything here since I imagine it will be re-worked in the common-cli repo anyway?
| read_only: bool = True, | ||
| ): | ||
| ds = DatasourceId.from_string_repr(datasource_id) | ||
| res = await asyncio.to_thread(self._databao_context_engine.run_sql, ds, sql, read_only=read_only) |
There was a problem hiding this comment.
From a code perspective (without knowing the full specifics of how asyncio works in Python), it feels wrong to me that we need to accommodate calling a synchronous method in a new thread here, because somewhere down the calling chain, something has an async facade and wants to create an event loop in the thread
In theory, I think we should either:
- make everything async properly (make all plugin methods async)
- or better hide the Postgres async behaviour: the thread needed to run the method should be created within the Postgres async facade (when catching that an event loop can't be created in the current thread? or maybe there is an other way of making the calls to pgasync synchronous without taking up a full thread's event loop)
There was a problem hiding this comment.
We discussed this briefly with @annav1asova .
I don't think we should make changes in the higher abstraction only because some low level custom implementation (postgres plugin) has async bindings. I believe there should be some way to handle this inside Postgres Plugin.
There was a problem hiding this comment.
Yes agreed, that's what I was trying to explain in the second option
=> the deferral to a new thread (ie. calling await asyncio.to_thread) should belong directly in the Postgres plugin somewhere
| annotations=ToolAnnotations(readOnlyHint=False, idempotentHint=False, openWorldHint=False), | ||
| ) | ||
| async def run_sql_tool( | ||
| datasource_id: str, |
There was a problem hiding this comment.
As discussed, we probably need to add a new tool in the MCP server that allows to list all existing datasources, so that the LLM can figure out what the id they want to use is.
And maybe we could also accept a None ID, in cases the project has only one datasource configured
Well, while we definitely should not put any efforts in maintaining CLI commands in this repository, I'm not 100% sure about the MCP server destiny. I think it's a viable plan to provide some [mcp] extra in our library to allow users to run MCP via python API. |
MVP for running SQL queries using MCP server
Currently we wrap the synchronous run_sql in asyncio.to_thread to avoid “cannot create event loop inside an existing loop” errors. Each call creates a new thread and a local event loop, which is inefficient under load
Ideally, run_sql should be fully async and reuse a single connection/event loop