add Llama Stack quickstart guide and notebook demo #107
base: main
Conversation
Walkthrough

Adds new Llama Stack documentation, a deployment configuration, and a Jupyter quickstart notebook that demonstrate creating an AI agent, defining tools, connecting clients, running an agent, and exposing a FastAPI chat endpoint for queries.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant FastAPI
    participant Agent
    participant Tool
    participant Model
    User->>FastAPI: POST /chat {message}
    FastAPI->>Agent: enqueue message / start session
    Agent->>Model: generate response (streaming)
    Agent->>Tool: call get_weather(...) (if needed)
    Tool-->>Agent: tool result
    Model-->>Agent: streaming tokens
    Agent-->>FastAPI: stream aggregated response
    FastAPI-->>User: stream final/partial responses
```
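The streaming turn in the diagram can be sketched as a small aggregation loop. This is a hypothetical sketch: the `(kind, payload)` event shapes below stand in for the actual llama-stack streaming chunk types, which are richer.

```python
def aggregate_stream(events):
    """Collect streamed (kind, payload) events into one response string.

    `events` is a hypothetical iterable standing in for the agent turn's
    streaming chunks; real llama-stack events have more structure.
    """
    parts = []
    for kind, payload in events:
        if kind == "token":
            # Model token chunk: append to the aggregated response
            parts.append(payload)
        elif kind == "tool_result":
            # Tool output (e.g. get_weather) is consumed by the model,
            # not echoed directly to the user
            print(f"[tool] {payload}")
    return "".join(parts)


events = [
    ("token", "It is "),
    ("tool_result", {"city": "Berlin", "temperature": "21"}),
    ("token", "21C in Berlin."),
]
print(aggregate_stream(events))
```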
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@docs/public/llama-stack/llama_stack_config.yaml`:
- Around lines 1-60: The metadata_store block omits an explicit db_path. Add a db_path entry to metadata_store, mirroring the pattern used for vector_io and files, so that metadata_store has type: sqlite and db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/llama-stack-demo}/registry.db.
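A sketch of the suggested block, assuming the db_path value quoted above (verify the key layout against the actual vector_io and files entries in llama_stack_config.yaml):

```yaml
metadata_store:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/llama-stack-demo}/registry.db
```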
In `@docs/public/llama-stack/llama_stack_quickstart.ipynb`:
- Around lines 462-467: Update the notebook metadata kernelspec so the kernel
name and display_name reflect the Llama Stack quickstart (e.g., change
kernelspec.name from "langchain-demo" and kernelspec.display_name from "Python
(langchain-demo)" to a clearer identifier like "llama-stack" and "Python (Llama
Stack)" respectively) by editing the kernelspec block in the notebook metadata.
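A sketch of the renamed kernelspec block (the identifiers are the reviewer's suggestions, not required values; keep whatever kernel name is actually installed):

```json
{
  "kernelspec": {
    "name": "llama-stack",
    "display_name": "Python (Llama Stack)",
    "language": "python"
  }
}
```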
- Around lines 122-148: The docstring for get_weather promises wind speed but the
returned dict only contains city, temperature, and humidity; update the function
to include wind speed by extracting it from the parsed API response (e.g.,
current['windspeedKmph'] or current['windspeedMiles'] depending on desired
units) and add a 'wind_speed' key to the returned dictionary, or alternatively
remove the "wind speed" mention from the docstring to make it match the existing
return value.
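The first option can be sketched as below, assuming a wttr.in-style JSON payload; the `parse_weather` helper and the sample payload are illustrative stand-ins, not the notebook's actual function:

```python
def parse_weather(city, payload):
    """Build the return dict, including the wind speed the docstring promises."""
    current = payload["current_condition"][0]
    return {
        "city": city,
        "temperature": current["temp_C"],
        "humidity": current["humidity"],
        # windspeedKmph gives km/h; use windspeedMiles instead for mph
        "wind_speed": current["windspeedKmph"],
    }


sample = {"current_condition": [{"temp_C": "21", "humidity": "40", "windspeedKmph": "13"}]}
print(parse_weather("Berlin", sample))
```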
- Around lines 194-208: Agent creation uses model_id which may be undefined if
the model listing try block failed; move the Agent(...) creation (the Agent
instantiation that references model_id, client, get_weather and instructions)
inside the try block that sets model_id or add an early exit/conditional guard
after the except (e.g., return or raise) so Agent(...) is only called when
model_id is successfully set; ensure you reference the same Agent(...) call and
the model_id assignment to relocate or gate the creation.
🧹 Nitpick comments (2)
docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md (1)
41-44: Consider varying the link descriptions.

All four resource links begin with "Llama Stack", which creates repetition. You could vary the wording:
💡 Suggested rewording
```diff
-- [Llama Stack Documentation](https://llamastack.github.io/docs) - The official Llama Stack documentation covering all usage-related topics, API providers, and core concepts.
-- [Llama Stack Core Concepts](https://llamastack.github.io/docs/concepts) - Deep dive into Llama Stack architecture, API stability, and resource management.
-- [Llama Stack GitHub Repository](https://github.com/llamastack/llama-stack) - Source code, example applications, distribution configurations, and how to add new API providers.
-- [Llama Stack Example Apps](https://github.com/llamastack/llama-stack-apps/) - Official examples demonstrating how to use Llama Stack in various scenarios.
+- [Official Documentation](https://llamastack.github.io/docs) - Covers all usage-related topics, API providers, and core concepts.
+- [Core Concepts Guide](https://llamastack.github.io/docs/concepts) - Deep dive into architecture, API stability, and resource management.
+- [GitHub Repository](https://github.com/llamastack/llama-stack) - Source code, example applications, and distribution configurations.
+- [Example Applications](https://github.com/llamastack/llama-stack-apps/) - Official examples demonstrating various use cases.
```

docs/public/llama-stack/llama_stack_quickstart.ipynb (1)
325-343: Consider session management for the chat endpoint.

The `/chat` endpoint creates a new session for every request (line 328). For a demo this works, but in production:
- Sessions accumulate without cleanup
- Conversation context is lost between requests
For a production-ready version, consider reusing sessions or implementing session cleanup:
```python
# Option 1: Single shared session (simple approach)
_session_id = None

@api_app.post("/chat")
async def chat(request: ChatRequest):
    global _session_id
    if _session_id is None:
        _session_id = agent.create_session('fastapi-weather-session')
    # ... rest of the code using _session_id
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md
- docs/public/llama-stack/llama_stack_config.yaml
- docs/public/llama-stack/llama_stack_quickstart.ipynb
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-13T11:25:34.596Z
Learnt from: jing2uo
Repo: alauda/knowledge PR: 104
File: docs/en/solutions/How_to_Migrate_VirtualMachine_From_VMware.md:131-172
Timestamp: 2026-01-13T11:25:34.596Z
Learning: In VMware migration documentation (docs/en/solutions), when describing the Forklift Operator workflow for VMware, specify that the VMware provider secret should set insecureSkipVerify=true to accommodate self-signed certificates commonly used in enterprise vCenter/ESXi environments. Include a note on the security trade-offs and ensure readers understand this is for internal, controlled environments.
Applied to files:
docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md
📚 Learning: 2025-12-30T08:43:22.275Z
Learnt from: davidwtf
Repo: alauda/knowledge PR: 95
File: docs/public/langchain/langchain_quickstart.ipynb:163-171
Timestamp: 2025-12-30T08:43:22.275Z
Learning: When reviewing LangChain notebooks, validate the agent creation pattern: ensure code uses create_agent(model=..., tools=[...], system_prompt="...") with model being a chat model instance (e.g., ChatOpenAI) and system_prompt provided as a string describing system instructions. This reflects a standard agent construction and can be applied across LangChain notebook files.
Applied to files:
docs/public/llama-stack/llama_stack_quickstart.ipynb
🪛 LanguageTool
docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md
[style] ~43-~43: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... stability, and resource management. - [Llama Stack GitHub Repository](https://github...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s, and how to add new API providers. - [Llama Stack Example Apps](https://github.com/...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 Ruff (0.14.11)
docs/public/llama-stack/llama_stack_quickstart.ipynb
42-42: Do not catch blind exception: Exception
(BLE001)
43-43: Use explicit conversion flag
Replace with conversion flag
(RUF010)
59-59: Abstract raise to an inner function
(TRY301)
59-59: Create your own exception
(TRY002)
59-59: Avoid specifying long messages outside the exception class
(TRY003)
73-73: Abstract raise to an inner function
(TRY301)
73-73: Create your own exception
(TRY002)
73-73: Avoid specifying long messages outside the exception class
(TRY003)
78-78: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (3)
docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md (1)
1-44: Documentation is clear and comprehensive.

The guide covers all essential aspects: overview, prerequisites, quickstart instructions, and additional resources. The structure is logical and user-friendly.
docs/public/llama-stack/llama_stack_quickstart.ipynb (2)
85-106: LGTM! Import setup is appropriate for notebook environments.

The custom path manipulation to include `~/packages` aligns with the `--target ~/packages` flag used in the pip install cell, which is a valid pattern for restricted notebook environments.
358-384: LGTM! Server startup approach is appropriate for notebook demo.

The daemon thread approach with clear comments about production alternatives is well-documented. Using `daemon=True` ensures cleanup when the kernel restarts.
```python
except Exception as e:
    print(f'Failed to get model list: {e}')
    print('Make sure the server is running')


# Create Agent
print('Creating Agent...')
agent = Agent(
    client,
    model=model_id,
    instructions='You are a helpful weather assistant. When users ask about weather, use the get_weather tool to query weather information, then answer based on the query results.',
    tools=[get_weather],
)

print('Agent created successfully')
```
Agent creation may fail if model listing failed.
The agent creation at lines 199-208 uses model_id which is only defined inside the try block (line 191). If the model listing fails, model_id will be undefined and agent creation will raise a NameError.
🔧 Suggested fix: Move agent creation inside the try block or add early exit

```diff
 except Exception as e:
     print(f'Failed to get model list: {e}')
     print('Make sure the server is running')
+    raise  # Re-raise to prevent subsequent cells from failing

 # Create Agent
```

Or wrap agent creation in a conditional:
```diff
+if 'model_id' in dir():
+    # Create Agent
+    print('Creating Agent...')
+    agent = Agent(
+        client,
+        model=model_id,
+        instructions='You are a helpful weather assistant...',
+        tools=[get_weather],
+    )
+    print('Agent created successfully')
+else:
+    print('Skipping agent creation - no model available')
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
except Exception as e:
    print(f'Failed to get model list: {e}')
    print('Make sure the server is running')
    raise  # Re-raise to prevent subsequent cells from failing

# Create Agent
print('Creating Agent...')
agent = Agent(
    client,
    model=model_id,
    instructions='You are a helpful weather assistant. When users ask about weather, use the get_weather tool to query weather information, then answer based on the query results.',
    tools=[get_weather],
)

print('Agent created successfully')
```
🤖 Prompt for AI Agents
In `@docs/public/llama-stack/llama_stack_quickstart.ipynb` around lines 194-208,
Agent creation uses model_id which may be undefined if the model listing try
block failed; move the Agent(...) creation (the Agent instantiation that
references model_id, client, get_weather and instructions) inside the try block
that sets model_id or add an early exit/conditional guard after the except
(e.g., return or raise) so Agent(...) is only called when model_id is
successfully set; ensure you reference the same Agent(...) call and the model_id
assignment to relocate or gate the creation.