diff --git a/docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md b/docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md
new file mode 100644
index 0000000..ac5d674
--- /dev/null
+++ b/docs/en/solutions/How_to_Create_an_AI_Agent_with_LlamaStack.md
@@ -0,0 +1,44 @@
+---
+products:
+  - Alauda AI
+kind:
+  - Solution
+ProductsVersion:
+  - 4.x
+---
+
+# How to Create an AI Agent with Llama Stack
+
+## Overview
+
+Llama Stack is a framework for building and running AI agents with tools. It provides a server-based architecture that enables developers to create agents that can interact with users, access external tools, and perform complex reasoning tasks. This guide provides a quickstart example for creating an AI Agent using Llama Stack.
+
+## Prerequisites
+
+- Llama Stack Server installed and running (see the notebook for installation and startup instructions)
+  - For deploying Llama Stack Server on Kubernetes, refer to the [Kubernetes Deployment Guide](https://llamastack.github.io/docs/deploying/kubernetes_deployment)
+- Access to a Notebook environment (e.g., Jupyter Notebook or JupyterLab)
+- Python environment with `llama-stack-client` and required dependencies installed
+- API key for the LLM provider (e.g., DeepSeek API key)
+
+## Quickstart
+
+A simple example of creating an AI Agent with Llama Stack is available here: [llama_stack_quickstart.ipynb](/llama-stack/llama_stack_quickstart.ipynb). The configuration file [llama_stack_config.yaml](/llama-stack/llama_stack_config.yaml) is also required. Download both files and upload them to your Notebook environment to run the example. A condensed sketch of the same flow is included at the end of this page for quick reference.
+
+The notebook demonstrates:
+- Llama Stack Server installation and configuration
+- Server startup and connection setup
+- Tool definition using the `@client_tool` decorator (weather query tool example)
+- Client connection to Llama Stack Server
+- Model selection and Agent creation with tools and instructions
+- Agent execution with session management and streaming responses
+- Result handling and display
+
+## Additional Resources
+
+For more resources on developing AI Agents with Llama Stack, see:
+
+- [Llama Stack Documentation](https://llamastack.github.io/docs) - The official Llama Stack documentation covering all usage-related topics, API providers, and core concepts.
+- [Llama Stack Core Concepts](https://llamastack.github.io/docs/concepts) - Deep dive into Llama Stack architecture, API stability, and resource management.
+- [Llama Stack GitHub Repository](https://github.com/llamastack/llama-stack) - Source code, example applications, distribution configurations, and how to add new API providers.
+- [Llama Stack Example Apps](https://github.com/llamastack/llama-stack-apps/) - Official examples demonstrating how to use Llama Stack in various scenarios.
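+
+## Minimal Example
+
+For quick reference, the following is a condensed sketch of the flow the quickstart notebook walks through: define a client-side tool with `@client_tool`, create an Agent against the running Llama Stack Server, and stream one conversation turn. It assumes the server started from `llama_stack_config.yaml` is reachable on the default port 8321 and exposes the `deepseek/deepseek-chat` model defined in that config; the weather tool body below is a placeholder (the notebook version queries the wttr.in API).
+
+```python
+import os
+
+from llama_stack_client import LlamaStackClient, Agent
+from llama_stack_client.lib.agents.client_tool import client_tool
+from llama_stack_client.lib.agents.event_logger import AgentEventLogger
+
+
+@client_tool
+def get_weather(city: str) -> dict:
+    """Get current weather information for a specified city.
+
+    :param city: City name, e.g., Beijing
+    :returns: Dictionary containing weather information
+    """
+    # Placeholder result; the notebook implementation calls the wttr.in API instead.
+    return {"city": city, "temperature": "20°C", "humidity": "50%"}
+
+
+# Connect to the running Llama Stack Server (same default port as in the notebook).
+client = LlamaStackClient(base_url=os.getenv("LLAMA_STACK_URL", "http://localhost:8321"))
+
+# Create an Agent with instructions and the client-side tool; the model id comes from llama_stack_config.yaml.
+agent = Agent(
+    client,
+    model="deepseek/deepseek-chat",
+    instructions="You are a helpful weather assistant. Use the get_weather tool to answer weather questions.",
+    tools=[get_weather],
+)
+
+# Run a single turn in a new session and stream the agent's response.
+session_id = agent.create_session("weather-agent-session")
+response_stream = agent.create_turn(
+    messages=[{"role": "user", "content": "What is the weather like in Beijing today?"}],
+    session_id=session_id,
+    stream=True,
+)
+for printable in AgentEventLogger().log(response_stream):
+    print(printable, end="", flush=True)
+```
+
+See the notebook for the full version, including model discovery via `client.models.list()` and an optional FastAPI wrapper around the agent.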
\ No newline at end of file diff --git a/docs/public/llama-stack/llama_stack_config.yaml b/docs/public/llama-stack/llama_stack_config.yaml new file mode 100644 index 0000000..c21d0fc --- /dev/null +++ b/docs/public/llama-stack/llama_stack_config.yaml @@ -0,0 +1,60 @@ +version: "2" +image_name: llama-stack-demo +apis: + - inference + - agents + - safety + - tool_runtime + - vector_io + - files + +providers: + inference: + - provider_id: openai + provider_type: remote::openai + config: + api_key: ${env.API_KEY} + base_url: https://api.deepseek.com/v1 + agents: + - provider_id: meta-reference + provider_type: inline::meta-reference + config: + persistence: + agent_state: + backend: kv_default + namespace: agents + responses: + backend: sql_default + table_name: responses + safety: + - provider_id: llama-guard + provider_type: inline::llama-guard + config: + excluded_categories: [] + tool_runtime: [] + vector_io: + - provider_id: sqlite-vec + provider_type: inline::sqlite-vec + config: + db_path: ${env.SQLITE_STORE_DIR:~/.llama/distributions/llama-stack-demo}/sqlite_vec.db + persistence: + backend: kv_default + namespace: vector_io::sqlite_vec + files: + - provider_id: localfs + provider_type: inline::localfs + config: + storage_dir: ${env.SQLITE_STORE_DIR:~/.llama/distributions/llama-stack-demo}/files + metadata_store: + backend: sql_default + table_name: files_metadata + +metadata_store: + type: sqlite + +models: + - metadata: {} + model_id: deepseek/deepseek-chat + provider_id: openai + provider_model_id: deepseek-chat + model_type: llm \ No newline at end of file diff --git a/docs/public/llama-stack/llama_stack_quickstart.ipynb b/docs/public/llama-stack/llama_stack_quickstart.ipynb new file mode 100644 index 0000000..4a33f7c --- /dev/null +++ b/docs/public/llama-stack/llama_stack_quickstart.ipynb @@ -0,0 +1,484 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Llama Stack Quick Start Demo\n", + "\n", + "This notebook demonstrates how to use Llama Stack to run an agent with tools." + ] + }, + { + "cell_type": "markdown", + "id": "937f1aab", + "metadata": {}, + "source": [ + "## 1. Prepare Llama Stack Server (Prerequisites)\n", + "\n", + "Before running this notebook, you need to deploy and start the Llama Stack Server." + ] + }, + { + "cell_type": "markdown", + "id": "07f59470", + "metadata": {}, + "source": [ + "### Install Llama Stack and Dependencies\n", + "\n", + "If you haven't installed the Llama Stack yet, install it along with the required provider packages:\n", + "\n", + "```bash\n", + "pip install llama-stack sqlite_vec\n", + "```\n", + "\n", + "The `llama-stack` package will automatically install its core dependencies. Since the configuration file uses the `sqlite-vec` provider for vector storage, you also need to install the `sqlite_vec` package." + ] + }, + { + "cell_type": "markdown", + "id": "6dc839f0", + "metadata": {}, + "source": [ + "### Start the Server\n", + "\n", + "Set the required environment variable for the API key (used by the DeepSeek provider in the config):\n", + "\n", + "```bash\n", + "export API_KEY=your-deepseek-api-key\n", + "```\n", + "\n", + "**Note:** Replace `your-deepseek-api-key` with your actual DeepSeek API key.\n", + "\n", + "Run the following command in your terminal to start the server:\n", + "\n", + "```bash\n", + "llama stack run llama_stack_config.yaml --port 8321\n", + "```\n", + "\n", + "**Note:** The server must be running before you can connect to it from this notebook." 
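+ ,"\n",
+ "\n",
+ "Once the server is running, you can optionally check that it is reachable before continuing with this notebook. A minimal check (assuming the default port 8321; the `/v1/models` path follows the current Llama Stack REST API and may differ between versions):\n",
+ "\n",
+ "```bash\n",
+ "curl http://localhost:8321/v1/models\n",
+ "```\n",
+ "\n",
+ "A JSON response listing the models registered in `llama_stack_config.yaml` (such as `deepseek/deepseek-chat`) indicates the server is ready."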
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Install Dependencies" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install \"llama-stack-client\" \"requests\" \"fastapi\" \"uvicorn\" --target ~/packages" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Import Libraries" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import sys\n", + "from pathlib import Path\n", + "\n", + "user_site_packages = Path.home() / \"packages\"\n", + "if str(user_site_packages) not in sys.path:\n", + " sys.path.insert(0, str(user_site_packages))\n", + "\n", + "import os\n", + "import requests\n", + "from typing import Dict, Any\n", + "from llama_stack_client import LlamaStackClient, Agent\n", + "from llama_stack_client.lib.agents.client_tool import client_tool\n", + "from llama_stack_client.lib.agents.event_logger import AgentEventLogger\n", + "\n", + "print('Libraries imported successfully')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Define a Tool\n", + "\n", + "Use the @client_tool decorator to define a weather query tool." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "@client_tool\n", + "def get_weather(city: str) -> Dict[str, Any]:\n", + " \"\"\"Get current weather information for a specified city.\n", + "\n", + " Uses the wttr.in free weather API to fetch weather data.\n", + "\n", + " :param city: City name, e.g., Beijing, Tokyo, New York, Paris\n", + " :returns: Dictionary containing weather information including city, temperature, description and humidity\n", + " \"\"\"\n", + " try:\n", + " url = f'https://wttr.in/{city}?format=j1'\n", + " response = requests.get(url, timeout=10)\n", + " response.raise_for_status()\n", + " data = response.json()\n", + "\n", + " current = data['current_condition'][0]\n", + " return {\n", + " 'city': city,\n", + " 'temperature': f\"{current['temp_C']}°C\",\n", + " 'humidity': f\"{current['humidity']}%\",\n", + " }\n", + " except Exception as e:\n", + " return {'error': f'Failed to get weather information: {str(e)}'}\n", + "\n", + "print('Weather tool defined successfully')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. Connect to Server and Create Agent\n", + "\n", + "Use LlamaStackClient to connect to the running server, create an Agent, and execute tool calls." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "base_url = os.getenv('LLAMA_STACK_URL', 'http://localhost:8321')\n", + "print(f'Connecting to Server: {base_url}')\n", + "\n", + "client = LlamaStackClient(base_url=base_url)\n", + "\n", + "# Get available models\n", + "print('Getting available models...')\n", + "try:\n", + " models = client.models.list()\n", + " if not models:\n", + " raise Exception('No models found')\n", + "\n", + " print(f'Found {len(models)} available models:')\n", + " for model in models[:5]: # Show only first 5\n", + " model_type = model.custom_metadata.get('model_type', 'unknown') if model.custom_metadata else 'unknown'\n", + " print(f' - {model.id} ({model_type})')\n", + "\n", + " # Select first LLM model\n", + " llm_model = next(\n", + " (m for m in models\n", + " if m.custom_metadata and m.custom_metadata.get('model_type') == 'llm'),\n", + " None\n", + " )\n", + " if not llm_model:\n", + " raise Exception('No LLM model found')\n", + "\n", + " model_id = llm_model.id\n", + " print(f'Using model: {model_id}\\n')\n", + "\n", + "except Exception as e:\n", + " print(f'Failed to get model list: {e}')\n", + " print('Make sure the server is running')\n", + " raise e\n", + "\n", + "\n", + "# Create Agent\n", + "print('Creating Agent...')\n", + "agent = Agent(\n", + " client,\n", + " model=model_id,\n", + " instructions='You are a helpful weather assistant. When users ask about weather, use the get_weather tool to query weather information, then answer based on the query results.',\n", + " tools=[get_weather],\n", + ")\n", + "\n", + "print('Agent created successfully')" + ] + }, + { + "cell_type": "markdown", + "id": "90c28b81", + "metadata": {}, + "source": [ + "## 6. 
Run the Agent" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "70e8d661", + "metadata": {}, + "outputs": [], + "source": [ + "# Create session\n", + "session_id = agent.create_session('weather-agent-session')\n", + "print(f'✓ Session created: {session_id}\\n')\n", + "\n", + "# First query\n", + "print('=' * 60)\n", + "print('User> What is the weather like in Beijing today?')\n", + "print('-' * 60)\n", + "\n", + "response_stream = agent.create_turn(\n", + " messages=[{'role': 'user', 'content': 'What is the weather like in Beijing today?'}],\n", + " session_id=session_id,\n", + " stream=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "ca2f26f2", + "metadata": {}, + "source": [ + "### Display the Result" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4728a638", + "metadata": {}, + "outputs": [], + "source": [ + "logger = AgentEventLogger()\n", + "for printable in logger.log(response_stream):\n", + " print(printable, end='', flush=True)\n", + "print('\\n')" + ] + }, + { + "cell_type": "markdown", + "id": "728530b0", + "metadata": {}, + "source": [ + "### Try Different Queries" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ed8cc5a0", + "metadata": {}, + "outputs": [], + "source": [ + "# Second query\n", + "print('=' * 60)\n", + "print('User> What is the weather in Shanghai?')\n", + "print('-' * 60)\n", + "\n", + "response_stream = agent.create_turn(\n", + " messages=[{'role': 'user', 'content': 'What is the weather in Shanghai?'}],\n", + " session_id=session_id,\n", + " stream=True,\n", + ")\n", + "\n", + "logger = AgentEventLogger()\n", + "for printable in logger.log(response_stream):\n", + " print(printable, end='', flush=True)\n", + "print('\\n')" + ] + }, + { + "cell_type": "markdown", + "id": "6f8d31d0", + "metadata": {}, + "source": [ + "## 7. FastAPI Service Example\n", + "\n", + "You can also run the agent as a FastAPI web service for production use. This allows you to expose the agent functionality via HTTP API endpoints." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a5d732e4", + "metadata": {}, + "outputs": [], + "source": [ + "# Import FastAPI components\n", + "from fastapi import FastAPI\n", + "from pydantic import BaseModel\n", + "from threading import Thread\n", + "import time\n", + "\n", + "# Create a simple FastAPI app\n", + "api_app = FastAPI(title=\"Llama Stack Agent API\")\n", + "\n", + "class ChatRequest(BaseModel):\n", + " message: str\n", + "\n", + "\n", + "@api_app.post(\"/chat\")\n", + "async def chat(request: ChatRequest):\n", + " \"\"\"Chat endpoint that uses the Llama Stack Agent\"\"\"\n", + " session_id = agent.create_session('fastapi-weather-session')\n", + "\n", + " # Create turn and collect response\n", + " response_stream = agent.create_turn(\n", + " messages=[{'role': 'user', 'content': request.message}],\n", + " session_id=session_id,\n", + " stream=True,\n", + " )\n", + "\n", + " # Collect the full response\n", + " full_response = \"\"\n", + " logger = AgentEventLogger()\n", + " for printable in logger.log(response_stream):\n", + " full_response += printable\n", + "\n", + " return {\"response\": full_response}\n", + "\n", + "print(\"FastAPI app created. Use the next cell to start the server.\")" + ] + }, + { + "cell_type": "markdown", + "id": "475997ba", + "metadata": {}, + "source": [ + "### Start the FastAPI Server\n", + "\n", + "**Note**: In a notebook, you can start the server in a background thread. 
For production, run it as a separate process using `uvicorn`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6f5db723", + "metadata": {}, + "outputs": [], + "source": [ + "# Start server in background thread (for notebook demonstration)\n", + "from uvicorn import Config, Server\n", + "\n", + "# Create a server instance that can be controlled\n", + "config = Config(api_app, host=\"127.0.0.1\", port=8000, log_level=\"info\")\n", + "server = Server(config)\n", + "\n", + "def run_server():\n", + " server.run()\n", + "\n", + "# Use daemon=True so the thread stops automatically when the kernel restarts\n", + "# This is safe for notebook demonstrations\n", + "# For production, use process managers instead of threads\n", + "server_thread = Thread(target=run_server, daemon=True)\n", + "server_thread.start()\n", + "\n", + "# Wait a moment for the server to start\n", + "time.sleep(2)\n", + "print(\"✓ FastAPI server started at http://127.0.0.1:8000\")" + ] + }, + { + "cell_type": "markdown", + "id": "715b2d47", + "metadata": {}, + "source": [ + "### Test the API\n", + "\n", + "Now you can call the API using HTTP requests:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "407b82af", + "metadata": {}, + "outputs": [], + "source": [ + "# Test the API endpoint\n", + "response = requests.post(\n", + " \"http://127.0.0.1:8000/chat\",\n", + " json={\"message\": \"What's the weather in Shanghai?\"},\n", + " timeout=60\n", + ")\n", + "\n", + "print(f\"Status Code: {response.status_code}\")\n", + "print(\"Response:\")\n", + "print(response.json().get('response'))" + ] + }, + { + "cell_type": "markdown", + "id": "945a776f", + "metadata": {}, + "source": [ + "### Stop the Server\n", + "\n", + "You can stop the server by calling its shutdown method:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c7795bba", + "metadata": {}, + "outputs": [], + "source": [ + "# Stop the server\n", + "if 'server' in globals() and server.started:\n", + " server.should_exit = True\n", + " print(\"✓ Server shutdown requested. It will stop after handling current requests.\")\n", + " print(\" Note: The server will also stop automatically when you restart the kernel.\")\n", + "else:\n", + " print(\"Server is not running or has already stopped.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 8. 
More Resources\n", + "\n", + "For more resources on developing AI Agents with Llama Stack, see:\n", + "\n", + "### Official Documentation\n", + "- [Llama Stack Documentation](https://llamastack.github.io/docs) - The official Llama Stack documentation covering all usage-related topics, API providers, and core concepts.\n", + "- [Llama Stack Core Concepts](https://llamastack.github.io/docs/concepts) - Deep dive into Llama Stack architecture, API stability, and resource management.\n", + "\n", + "### Code Examples and Projects\n", + "- [Llama Stack GitHub Repository](https://github.com/llamastack/llama-stack) - Source code, example applications, distribution configurations, and how to add new API providers.\n", + "- [Llama Stack Example Apps](https://github.com/llamastack/llama-stack-apps/) - Official examples demonstrating how to use Llama Stack in various scenarios.\n", + "\n", + "### Community and Support\n", + "- [Llama Stack GitHub Issues](https://github.com/llamastack/llama-stack/issues) - Report bugs, ask questions, and contribute to the project.\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python (llama-stack-demo)", + "language": "python", + "name": "llama-stack-demo" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}