
LCORE-1216: Bump up to llama-stack 0.4.3#52

Open
are-ces wants to merge 2 commits into lightspeed-core:main from are-ces:llama-stack-0.4.x-bumpup

Conversation

@are-ces
Contributor

@are-ces are-ces commented Feb 8, 2026

Description

This is a significant refactoring of all the modules, mostly because the Agents API has been deprecated in favor of the Responses API in llama-stack (already from 0.3.x).

This upgrade is needed to keep lightspeed-providers on par with LCORE.

NOTE: run_moderation is designed to block a request outright, not to redact it. As a result, lightspeed-redactions will block the message entirely if an unauthorized string is detected, as opposed to run_shield, which can redact the original message.
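
The contrast can be sketched in Python. The result shapes below are simplified stand-ins invented for illustration, not the actual llama-stack response types:

```python
# Illustrative-only result shapes: a shield check may carry a rewritten
# (redacted) message, while a moderation check can only flag the input.
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShieldResult:
    violation: Optional[str]         # None when the message is clean
    redacted_message: Optional[str]  # shield may rewrite the message


@dataclass
class ModerationResult:
    flagged: bool                    # moderation can only block


def apply_shield(message: str, result: ShieldResult) -> str:
    # run_shield can redact: prefer the rewritten message when present
    if result.redacted_message is not None:
        return result.redacted_message
    if result.violation is not None:
        raise ValueError(f"blocked: {result.violation}")
    return message


def apply_moderation(message: str, result: ModerationResult) -> str:
    # run_moderation is all-or-nothing: block or pass the message unchanged
    if result.flagged:
        raise ValueError("blocked by moderation")
    return message
```

This is why lightspeed-redactions behaves differently on the two paths: there is no rewritten message to fall back to on the moderation path.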

Changes:

  • Bump up llama-stack library to 0.4.3
  • Refactor agent code to migrate from Agents API to Responses API
  • Refactor run_shield in the safety module; add run_moderation
  • Keep temperature override, prioritization of most recently used tools, and tool filtering
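
As a rough sketch of what the Agents-to-Responses migration looks like from the caller's side (the `responses.create` shape follows the OpenAI-compatible Responses API that llama-stack 0.4.x exposes; the exact parameters here are an assumption, not code from this PR):

```python
# Hypothetical one-shot query through the Responses API instead of an
# Agent turn; `client` is assumed to expose `responses.create(...)`.
def ask(client, model, question, temperature=None):
    kwargs = {"model": model, "input": question}
    if temperature is not None:
        # mirrors the temperature override kept by this PR
        kwargs["temperature"] = temperature
    return client.responses.create(**kwargs)
```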

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Partially generated by: Claude

Related Tickets & Documents

  • Related Issue # LCORE-1216
  • Closes # LCORE-1216

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

I tested the following manually via curl requests:

  • Question validity run_shield (valid/invalid questions)
  • Question validity run_moderation
  • Redaction run_shield (sensitive data redacted)
  • Redaction run_moderation (message with sensitive data BLOCKED)
  • Tool filtering (11→1 tools)
  • min_tools threshold
  • Previously called tools persistence
  • always_include_tools config
  • Temperature override (1.0 for GPT-5)

@are-ces are-ces marked this pull request as draft February 8, 2026 16:53
@are-ces are-ces force-pushed the llama-stack-0.4.x-bumpup branch 3 times, most recently from b2b25c6 to c84a80e Compare February 8, 2026 17:25
Contributor

@tisnik tisnik left a comment


I'd say LGTM on my side, but we definitely need at least one more reviewer, especially from the teams that use the provider(s).

Contributor


You are removing the inline::lightspeed_inline_agent we are using in the Ansible Lightspeed chatbot; if this PR is merged, it will break the chatbot functionality.

Contributor Author


inline::lightspeed_inline_agent still works; the logic has been moved from agent_instance.py to agents.py.

@are-ces are-ces force-pushed the llama-stack-0.4.x-bumpup branch 3 times, most recently from 218e6d4 to 3ad6905 Compare February 10, 2026 11:29
@TamiTakamiya

@are-ces @ldjebran I could run the updated lightspeed_inline_agent with ansible-chatbot-stack. The test setup uses:

The setup is somewhat complicated because it uses a number of changes that are not merged to main yet. I will create a memo on my test setup.

Note: My setup does not enable an MCP server yet. After writing the memo, I plan to test this with the MCP server enabled.

@are-ces are-ces force-pushed the llama-stack-0.4.x-bumpup branch from 3ad6905 to f99d3c1 Compare February 11, 2026 08:31
@are-ces are-ces marked this pull request as ready for review February 11, 2026 08:32
"llama-stack==0.2.22",
"llama-stack-client==0.2.22",
"llama-stack==0.4.3",
"llama-stack-api==0.4.4",
Contributor


Why is it not the same version (0.4.3)? Is this intentional?


I tried to use llama-stack-api 0.4.3 for ansible-chatbot-stack and it did not work. I think 0.4.3 is broken.

Contributor Author

@are-ces are-ces Feb 11, 2026


Yes; we updated LCORE to pin the api package to 0.4.4 because of a CVE (v0.4.4 shouldn't have breaking changes).

Contributor

@Jdubrick Jdubrick left a comment


@are-ces since we only consume the safety shield portion for my use case, that part LGTM, FYI.

@ldjebran
Contributor

@are-ces it seems the file https://github.com/lightspeed-core/lightspeed-providers/blob/main/resources/external_providers/inline/agents/lightspeed_inline_agent.yaml

needs to be updated to:

config_class: lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent.config.LightspeedAgentsImplConfig
module: lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent
api_dependencies: [ inference, safety, tool_runtime, tool_groups, conversations, prompts ]
optional_api_dependencies: [vector_io, files]

The agent lightspeed_inline_agent is passing the queries through and overriding the temperature when configured. Unfortunately I was not able to test MCP filtering, as it seems lightspeed-stack has a regression: it is not passing the MCP headers received from the client via the MCP-HEADERS header.

There is a lot of work done here, @are-ces, many thanks for your efforts.
Can we wait a little before merging, to see the team's comments about the MCP headers?

Contributor

@ldjebran ldjebran left a comment


@are-ces many thanks for the work. The changes I proposed in my last comment are still valid. I tested MCP, but it seems lightspeed_inline_agent is unfortunately not working as expected and breaks when the MCP configuration is enabled: I see MCP returning the list of tools, but the agent does not seem to detect those tools and sees only 2 instead of more than 300.
This will need more investigation.

@are-ces are-ces force-pushed the llama-stack-0.4.x-bumpup branch from 84d4bf7 to 622151e Compare February 12, 2026 11:08
@are-ces
Contributor Author

are-ces commented Feb 12, 2026

Hey @ldjebran, good catch! I had encountered the same problem: I was handling the tools the wrong way. The MCP servers were not being expanded into their tools, so we were counting the MCP servers themselves and comparing that count with min_tools.
I have tested it on my side and it works as expected; hopefully the same on your side 😄
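
The fix can be sketched as follows. The entry shapes and the `list_server_tools` helper are hypothetical, not this PR's actual code, but they show the difference between counting MCP servers and counting their expanded tools:

```python
# Count individual tools, expanding each MCP server entry into the tools
# it serves instead of counting the server itself as one "tool".
def count_available_tools(entries, list_server_tools):
    tools = []
    for entry in entries:
        if entry.get("type") == "mcp_server":
            # the bug: counting this entry as a single tool;
            # the fix: expand the server into its actual tool list
            tools.extend(list_server_tools(entry["url"]))
        else:
            tools.append(entry)
    return len(tools)
```

A min_tools comparison then operates on the expanded count, so a single MCP server exposing hundreds of tools is no longer treated as one tool.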
