From b95672753ee6a97d7779516eab758a8b703cd5aa Mon Sep 17 00:00:00 2001 From: Sachin Magar Date: Tue, 24 Feb 2026 13:55:37 +0530 Subject: [PATCH 1/3] Litellm documentation --- docs/integrations/saas-cloud/litellm.md | 380 ++++++++++++++++++++++++ 1 file changed, 380 insertions(+) create mode 100644 docs/integrations/saas-cloud/litellm.md diff --git a/docs/integrations/saas-cloud/litellm.md b/docs/integrations/saas-cloud/litellm.md new file mode 100644 index 0000000000..fd88e0b4f3 --- /dev/null +++ b/docs/integrations/saas-cloud/litellm.md @@ -0,0 +1,380 @@ +--- +id: litellm +title: LiteLLM +sidebar_label: LiteLLM +description: The Sumo Logic app for LiteLLM provides visibility into LLM proxy usage, cost, latency, deployment health, and performance across OpenAI, Bedrock, Groq, and other providers. +--- + +import useBaseUrl from '@docusaurus/useBaseUrl'; + +Thumbnail icon + +[LiteLLM](https://docs.litellm.ai/) is an open-source proxy that provides a unified interface to call 100+ LLM APIs (OpenAI, Anthropic, AWS Bedrock, Groq, and more). It routes requests, manages fallbacks, tracks budgets, and exposes Prometheus metrics for observability. + +The Sumo Logic app for LiteLLM provides preconfigured dashboards to monitor request volume, latency, token consumption, spend, budget and rate limits, deployment and fallback health, infrastructure (Redis, Postgres), and user or route visibility. Use the app to track cost by team or API key, identify slow models, detect failures and fallbacks, and ensure your LLM proxy is running smoothly. + +:::info +This app includes [built-in monitors](#litellm-monitors). For details on creating custom monitors, refer to [Create monitors for LiteLLM app](#create-monitors-for-litellm-app). +::: + +## Metric types + +This app collects Prometheus metrics from the LiteLLM proxy. LiteLLM exposes metrics at `/metrics/` when configured with `callbacks: ["prometheus"]` or `prometheus_metrics_config` in `litellm-config.yaml`. 
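For example, a minimal `litellm-config.yaml` that enables the metrics endpoint could look like the following (the model entry is illustrative only):

```yaml
model_list:
  - model_name: gpt-4              # alias clients request through the proxy (illustrative)
    litellm_params:
      model: openai/gpt-4          # upstream provider/model (illustrative)

litellm_settings:
  callbacks: ["prometheus"]        # exposes Prometheus metrics at /metrics/
```

With this configuration, the proxy serves Prometheus-format metrics at `http://localhost:4000/metrics/` by default.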
+ +The app uses the following metrics: + +* **Request metrics** — `litellm_proxy_total_requests_metric_total`, `litellm_proxy_failed_requests_metric_total`, `litellm_deployment_success_responses_total`, `litellm_deployment_failure_responses_total` +* **Latency metrics** — `litellm_request_total_latency_metric_sum`/`_count`, `litellm_llm_api_latency_metric_sum`/`_count`, `litellm_overhead_latency_metric_sum`/`_count`, `litellm_llm_api_time_to_first_token_metric_sum`/`_count` +* **Token metrics** — `litellm_total_tokens_metric_total`, `litellm_input_tokens_metric_total`, `litellm_output_tokens_metric_total` +* **Cost metrics** — `litellm_spend_metric_total` (by team, API key, alias) +* **Budget metrics** — `litellm_remaining_team_budget_metric`, `litellm_remaining_api_key_budget_metric`, `litellm_team_max_budget_metric`, `litellm_team_budget_remaining_hours_metric` +* **Deployment health** — `litellm_deployment_state`, `litellm_deployment_successful_fallbacks_total`, `litellm_deployment_failed_fallbacks_total`, `litellm_deployment_cooled_down_total` +* **Rate limits** — `litellm_remaining_requests_metric`, `litellm_remaining_tokens_metric` (provider-specific, e.g. Groq) +* **Infrastructure** — `litellm_redis_latency_sum`/`_count`, `litellm_postgres_latency_sum`/`_count`, `litellm_self_latency_sum`/`_count`, `litellm_callback_logging_failures_metric_total` + +For a complete list of metrics and dimensions, see the [LiteLLM metrics documentation](https://docs.litellm.ai/docs/proxy/configs#prometheus-metrics-config). + +## Fields creation in Sumo Logic for LiteLLM + +The following [fields](/docs/manage/fields/) are created as part of the LiteLLM app installation, if not already present: + +* **`sumo.datasource`**. Has fixed value of `litellm-metrics`. +* **`_sourceCategory`**. Set by the OpenTelemetry Collector resource processor. Use a value such as `otel/litellm/metrics` for consistent querying. +* **`deployment.environment`**. User configured. 
Enter a name to identify your deployment environment (e.g. `production`, `staging`, `dev`). + +## Prerequisites + +### For metrics collection + +* LiteLLM proxy with Prometheus metrics enabled (e.g. `callbacks: ["prometheus"]` or `prometheus_metrics_config` in `litellm-config.yaml`). +* LiteLLM metrics endpoint accessible (default: `http://localhost:4000/metrics/`). +* OpenTelemetry Collector (or Sumo Logic Distribution for OpenTelemetry Collector) to scrape Prometheus metrics and send to Sumo Logic. +* HTTP Source in a Sumo Logic Hosted Collector for receiving metrics. + +### For team and budget labels + +* `team` and `team_alias` dimensions are populated when API keys are created with a `team_id` via `/team/new` and `/key/generate`. Keys without a team show `team=None`, `team_alias=None`. Create teams and associate keys to enable team-level spend and budget dashboards. + +### Optional dimensions + +* `end_user`, `user`, `user_email`, `route`, `status_code` are populated when the application passes them (e.g. via headers or `prometheus_metrics_config` in `litellm-config.yaml`). Panels using these may return no data if not configured. + +## Collection configuration and app installation + +import ConfigAppInstall from '../../reuse/apps/opentelemetry/config-app-install.md'; + + + +### Step 1: Set up collector + +import SetupColl from '../../reuse/apps/opentelemetry/set-up-collector.md'; + + + +### Step 2: Configure integration + +Create an HTTP Source in your Hosted Collector for receiving metrics. Note the HTTP Source URL — you will use it as the `SUMOLOGIC_WEBHOOK_URL` environment variable. + +Set the following environment variables: + +* **`SUMOLOGIC_WEBHOOK_URL`**. Your Sumo Logic HTTP Source URL. The OpenTelemetry Collector will send metrics to this endpoint. +* **`SUMOLOGIC_INSTALLATION_TOKEN`**. Your Sumo Logic installation token (for collector registration if using the Sumo Logic Distribution). 
+ +Configure the OpenTelemetry Collector to scrape LiteLLM metrics and send them to Sumo Logic. Use a configuration similar to the following: + +```yaml +extensions: + sumologic: + installation_token: ${SUMOLOGIC_INSTALLATION_TOKEN} + collector_name: litellm-otel-collector + +receivers: + prometheus: + config: + scrape_configs: + - job_name: 'litellm' + scrape_interval: 30s + metrics_path: '/metrics/' + static_configs: + - targets: ['localhost:4000'] # Change if LiteLLM runs elsewhere + +processors: + memory_limiter: + check_interval: 1s + limit_mib: 512 + batch: + send_batch_size: 2048 + timeout: 5s + resourcedetection/system: + detectors: ["system"] + system: + hostname_sources: ["os"] + resource/common: + attributes: + - key: sumo.datasource + value: litellm-metrics + action: upsert + - key: _sourceCategory + value: otel/litellm/metrics + action: upsert + - key: service.name + value: litellm-proxy + action: upsert + - key: deployment.environment + value: production + action: upsert + +exporters: + sumologic: + endpoint: ${SUMOLOGIC_WEBHOOK_URL} + metric_format: prometheus + +service: + extensions: [sumologic] + pipelines: + metrics: + receivers: [prometheus] + processors: [memory_limiter, batch, resourcedetection/system, resource/common] + exporters: [sumologic] +``` + +:::important +* Update `targets` in the Prometheus scrape config if LiteLLM runs on a different host or port. +* Ensure `_sourceCategory` and `deployment.environment` match the values used in the app dashboards (e.g. template variables `{{source_category}}` and `{{deployment.environment}}`). +::: + +### Step 3: Verify LiteLLM configuration + +Ensure LiteLLM is configured to expose Prometheus metrics. In `litellm-config.yaml`: + +```yaml +litellm_settings: + callbacks: ["otel", "prometheus", "sumologic"] + service_callback: ["prometheus_system"] +``` + +For team and budget tracking, create teams and API keys with `team_id` via the LiteLLM API. 
See [LiteLLM team management](https://docs.litellm.ai/docs/proxy/team_management) for details. + +## Sample metrics + +
+Request and latency metrics + +```json +{ + "metric": "litellm_proxy_total_requests_metric_total", + "sumo.datasource": "litellm-metrics", + "_sourceCategory": "otel/litellm/metrics", + "deployment.environment": "production", + "requested_model": "gpt-4", + "team_alias": "platform-team", + "value": 1250, + "timestamp": "2025-02-21T10:30:00.000Z" +} +``` + +```json +{ + "metric": "litellm_request_total_latency_metric_sum", + "sumo.datasource": "litellm-metrics", + "deployment.environment": "production", + "requested_model": "gpt-4", + "value": 45.2, + "timestamp": "2025-02-21T10:30:00.000Z" +} +``` + +

## Sample queries

:::note
Sumo Logic query operators are separated by a single pipe (`|`). If a copied query shows the Markdown-escaped form `\|`, replace each `\|` with `|` before running it.
:::

```sql title="Total requests over time"
_sourceCategory=otel/litellm/metrics deployment.environment=production metric=litellm_proxy_total_requests_metric_total
| quantize using sum
| sum
```

```sql title="Total latency by requested model"
_sourceCategory=otel/litellm/metrics deployment.environment=production metric=litellm_request_total_latency_metric_sum
| quantize using sum
| sum by requested_model
```
Divide by the corresponding `litellm_request_total_latency_metric_count` series to compute the average latency.

```sql title="Spend by team"
_sourceCategory=otel/litellm/metrics deployment.environment=production metric=litellm_spend_metric_total
| quantize using sum
| sum by team_alias
```

```sql title="Success rate (success / total)"
_sourceCategory=otel/litellm/metrics deployment.environment=production litellm_model_name=* metric=litellm_deployment_success_responses_total
| quantize using sum
| sum
```
Divide the result by the total requests query to get a percentage.

## Installing the LiteLLM app

import AppInstallIndexV2 from '../../reuse/apps/app-install-index-option.md';

<AppInstallIndexV2/>

As part of the app installation process, the following fields will be created by default:

* **`sumo.datasource`**. Fixed value `litellm-metrics`.
* **`_sourceCategory`**. Source category for LiteLLM metrics (e.g. `otel/litellm/metrics`).
* **`deployment.environment`**. Deployment environment (e.g. `production`, `staging`).

## Viewing the LiteLLM dashboards

import ViewDashboardsIndex from '../../reuse/apps/view-dashboards-index.md';

<ViewDashboardsIndex/>

### Overview (Executive)

The **LiteLLM - Overview** dashboard provides high-level health and usage at a glance.

Use this dashboard to:
* Monitor total requests, success rate, and active deployments.
* Track total spend and average latency.
* Compare request volume and spend over time.
* Identify top models by request volume and top teams by spend.
+ +LiteLLM - Overview + +### Latency & Performance + +The **LiteLLM - Latency & Performance** dashboard provides a deep dive into request latency, time to first token, and overhead. + +Use this dashboard to: +* Track end-to-end latency and LLM API latency over time. +* Compare overhead latency by API provider. +* Identify slowest models and latency distribution by model. +* Drill down by requested model and API key alias. + +LiteLLM - Latency & Performance + +### Tokens & Cost + +The **LiteLLM - Tokens & Cost** dashboard tracks token consumption and spend. + +Use this dashboard to: +* Monitor total tokens, input vs output tokens, and token rate. +* Track spend over time and by team. +* Compare token usage by model and spend by API key alias. +* Identify top teams by spend. + +:::note +**`litellm_spend_metric_total`** uses `team`, `team_alias`, `hashed_api_key`, `api_key_alias` — not `model`. Use `requested_model` for token metrics. +::: + +LiteLLM - Tokens & Cost + +### Budget & Rate Limits + +The **LiteLLM - Budget & Rate Limits** dashboard provides visibility into remaining budgets and provider rate limits. + +Use this dashboard to: +* Track team and API key budget remaining. +* Monitor provider rate limit headroom (e.g. Groq remaining requests and tokens). +* View hours until budget reset. +* Compare budget and rate limits by model and API base. + +LiteLLM - Budget & Rate Limits + +### Deployment & Fallback Health + +The **LiteLLM - Deployment & Fallback Health** dashboard monitors LLM deployment health, fallbacks, and failures. + +Use this dashboard to: +* Track deployment state (healthy, partial, outage) by model. +* Compare success vs failure trends per deployment. +* Monitor successful and failed fallbacks. +* Identify deployment failures by exception class and status. +* Track cooled-down deployments. 
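If you need to investigate deployment health outside the dashboard, a query in the style of the Sample queries section might look like the following (a sketch; it assumes the `litellm_model_name` label is emitted for `litellm_deployment_state`):

```sql title="Deployment state by model"
_sourceCategory=otel/litellm/metrics deployment.environment=production metric=litellm_deployment_state
| quantize using max
| max by litellm_model_name
```

In LiteLLM, `litellm_deployment_state` reports `0` (healthy), `1` (partial outage), or `2` (complete outage), so any non-zero value indicates degraded service.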
+ +LiteLLM - Deployment & Fallback Health + +### Infrastructure & Callbacks + +The **LiteLLM - Infrastructure & Callbacks** dashboard provides visibility into Redis, Postgres, self latency, and callback health. + +Use this dashboard to: +* Monitor Redis, Postgres, and LiteLLM self latency. +* Track Redis failed requests and callback logging failures. +* View deployment latency per token by model. +* Monitor queue sizes (pod lock manager, spend update queues). +* Map deployment correlation (model ↔ provider ↔ API base). + +LiteLLM - Infrastructure & Callbacks + +### User & Route Visibility + +The **LiteLLM - User & Route Visibility** dashboard provides user/end-user segmentation and route-level metrics. + +Use this dashboard to: +* View requests by status code and by route. +* Track requests and spend by end user. +* Identify failed requests by status code. +* Rank top end users by spend. + +:::note +Panels use optional dimensions (`end_user`, `route`, `status_code`). Populate these when the application passes them (e.g. via `prometheus_metrics_config` in `litellm-config.yaml`). +::: + +LiteLLM - User & Route Visibility + +## Create monitors for LiteLLM app + +import CreateMonitors from '../../reuse/apps/create-monitors.md'; + + + +### LiteLLM monitors + +| Name | Description | Alert Condition | Recover Condition | +|:--|:--|:--|:--| +| `LiteLLM - High Error Rate` | Critical when proxy failure rate exceeds 10% of total requests. | Count (failure rate) > 10% | Count < 10% | +| `LiteLLM - High Latency` | Warning when average request latency exceeds 30 seconds. | Avg latency > 30s | Avg latency ≤ 30s | +| `LiteLLM - Budget Exceeded` | Critical when team budget remaining is zero or negative. | Remaining budget ≤ 0 | Remaining budget > 0 | +| `LiteLLM - Deployment Unhealthy` | Critical when deployment state indicates outage (state=2). 
| deployment_state = 2 | deployment_state < 2 |
| `LiteLLM - High Failed Fallbacks` | Warning when failed fallbacks exceed threshold for a requested model. | Failed fallbacks > 5 | Failed fallbacks ≤ 5 |

## Upgrading the LiteLLM app (Optional)

import AppUpdate from '../../reuse/apps/app-update.md';

<AppUpdate/>

## Uninstalling the LiteLLM app (Optional)

import AppUninstall from '../../reuse/apps/app-uninstall.md';

<AppUninstall/>

## Troubleshooting

### No data in dashboards

* Verify the OpenTelemetry Collector is running and scraping LiteLLM at the configured target (e.g. `localhost:4000`).
* Ensure `SUMOLOGIC_WEBHOOK_URL` is set correctly and the HTTP Source is receiving data.
* Check that `_sourceCategory` and `deployment.environment` in the collector config match the dashboard template variables.
* Confirm LiteLLM exposes Prometheus metrics at `/metrics/` and that `callbacks` includes `prometheus`.

### `team=None` or `team_alias=None` in metrics

* API keys must be created with `team_id` from the start. Keys created without `team_id` cannot be updated later.
* Create teams via `/team/new` and generate keys via `/key/generate` with `{"team_id": "..."}`. See [LiteLLM team management](https://docs.litellm.ai/docs/proxy/team_management).

### Invalid dimension in panel queries

* Each metric has a specific set of valid dimensions. Using `sum by <dimension>` with an invalid dimension returns no data. Refer to the dashboard label reference for valid dimensions per metric (for example, `litellm_spend_metric_total` uses `team`, `team_alias`, `hashed_api_key`, and `api_key_alias`, but not `model`).

### Query pipe syntax

* Sumo Logic query operators are separated by a single pipe (`|`). If a query copied from Markdown source contains the escaped form `\|`, replace each `\|` with `|` before running it.
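To make the substitution concrete, here is the same (illustrative) query in both forms. As it may appear in Markdown source:

```sql
_sourceCategory=otel/litellm/metrics metric=litellm_total_tokens_metric_total \| quantize using sum \| sum
```

And as it should be entered in Sumo Logic:

```sql title="Total tokens"
_sourceCategory=otel/litellm/metrics metric=litellm_total_tokens_metric_total
| quantize using sum
| sum
```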
From 5ebb72ae04ec73008937807e56ea92c7b8e31230 Mon Sep 17 00:00:00 2001 From: Sachin Magar Date: Wed, 25 Feb 2026 11:32:59 +0530 Subject: [PATCH 2/3] LiteLLM Beta doc --- docs/integrations/saas-cloud/litellm.md | 652 ++++++++++++++++-------- 1 file changed, 443 insertions(+), 209 deletions(-) diff --git a/docs/integrations/saas-cloud/litellm.md b/docs/integrations/saas-cloud/litellm.md index fd88e0b4f3..2b72d79f04 100644 --- a/docs/integrations/saas-cloud/litellm.md +++ b/docs/integrations/saas-cloud/litellm.md @@ -13,79 +13,346 @@ import useBaseUrl from '@docusaurus/useBaseUrl'; The Sumo Logic app for LiteLLM provides preconfigured dashboards to monitor request volume, latency, token consumption, spend, budget and rate limits, deployment and fallback health, infrastructure (Redis, Postgres), and user or route visibility. Use the app to track cost by team or API key, identify slow models, detect failures and fallbacks, and ensure your LLM proxy is running smoothly. -:::info -This app includes [built-in monitors](#litellm-monitors). For details on creating custom monitors, refer to [Create monitors for LiteLLM app](#create-monitors-for-litellm-app). -::: +## Collection configuration -## Metric types +### LiteLLM configuration changes -This app collects Prometheus metrics from the LiteLLM proxy. LiteLLM exposes metrics at `/metrics/` when configured with `callbacks: ["prometheus"]` or `prometheus_metrics_config` in `litellm-config.yaml`. +Before setting up data collection in Sumo Logic, configure LiteLLM to enable the required callbacks and expose logs and metrics. 
-The app uses the following metrics: +Add the following to your LiteLLM proxy configuration file (`/app/config.yaml` or your equivalent `litellm-config.yaml` path): -* **Request metrics** — `litellm_proxy_total_requests_metric_total`, `litellm_proxy_failed_requests_metric_total`, `litellm_deployment_success_responses_total`, `litellm_deployment_failure_responses_total` -* **Latency metrics** — `litellm_request_total_latency_metric_sum`/`_count`, `litellm_llm_api_latency_metric_sum`/`_count`, `litellm_overhead_latency_metric_sum`/`_count`, `litellm_llm_api_time_to_first_token_metric_sum`/`_count` -* **Token metrics** — `litellm_total_tokens_metric_total`, `litellm_input_tokens_metric_total`, `litellm_output_tokens_metric_total` -* **Cost metrics** — `litellm_spend_metric_total` (by team, API key, alias) -* **Budget metrics** — `litellm_remaining_team_budget_metric`, `litellm_remaining_api_key_budget_metric`, `litellm_team_max_budget_metric`, `litellm_team_budget_remaining_hours_metric` -* **Deployment health** — `litellm_deployment_state`, `litellm_deployment_successful_fallbacks_total`, `litellm_deployment_failed_fallbacks_total`, `litellm_deployment_cooled_down_total` -* **Rate limits** — `litellm_remaining_requests_metric`, `litellm_remaining_tokens_metric` (provider-specific, e.g. 
Groq) -* **Infrastructure** — `litellm_redis_latency_sum`/`_count`, `litellm_postgres_latency_sum`/`_count`, `litellm_self_latency_sum`/`_count`, `litellm_callback_logging_failures_metric_total` +```yaml +litellm_settings: + # Enable Prometheus metrics, OTel tracing, and Sumo Logic log callback + callbacks: ["prometheus", "otel", "sumologic"] + # Enable system-level Prometheus metrics (Redis, Postgres, self latency) + service_callback: ["prometheus_system"] + # Required for provider rate-limit headers (remaining requests/tokens metrics) + return_response_headers: true + # Enable end-user cost tracking in Prometheus metrics + enable_end_user_cost_tracking_prometheus_only: true + # Store audit logs in the database + store_audit_logs: true + # Initialize budget metrics on startup + prometheus_initialize_budget_metrics: true +``` -For a complete list of metrics and dimensions, see the [LiteLLM metrics documentation](https://docs.litellm.ai/docs/proxy/configs#prometheus-metrics-config). +Set the following environment variable before starting LiteLLM: -## Fields creation in Sumo Logic for LiteLLM +| Variable | Description | +|:--|:--| +| `PROMETHEUS_MULTIPROC_DIR` | Directory for Prometheus multiprocess metric aggregation. Set to `/prometheus_multiproc`. The directory must exist before LiteLLM starts. Required when running LiteLLM with multiple workers. | -The following [fields](/docs/manage/fields/) are created as part of the LiteLLM app installation, if not already present: +Configure the `prometheus_metrics_config` block to control which metrics and labels are exposed at `/metrics/`. 
Add the following to your `litellm-config.yaml`: + +```yaml +prometheus_metrics_config: + - group: "proxy_total_requests" + metrics: + - "litellm_proxy_total_requests_metric" + include_labels: + - "api_key_alias" + - "end_user" + - "hashed_api_key" + - "requested_model" + - "route" + - "status_code" + - "team" + - "team_alias" + - "user" + - "user_email" + + - group: "proxy_failed_requests" + metrics: + - "litellm_proxy_failed_requests_metric" + include_labels: + - "api_key_alias" + - "end_user" + - "exception_class" + - "exception_status" + - "hashed_api_key" + - "requested_model" + - "route" + - "team" + - "team_alias" + - "user" + - "user_email" + + - group: "latency_metrics" + metrics: + - "litellm_request_total_latency_metric" + - "litellm_llm_api_latency_metric" + include_labels: + - "api_key_alias" + - "end_user" + - "hashed_api_key" + - "model" + - "requested_model" + - "team" + - "team_alias" + - "user" + + - group: "token_metrics" + metrics: + - "litellm_input_tokens_metric" + - "litellm_output_tokens_metric" + - "litellm_total_tokens_metric" + include_labels: + - "end_user" + - "hashed_api_key" + - "api_key_alias" + - "requested_model" + - "team" + - "team_alias" + - "user" + - "model" + + - group: "team_budget_metrics" + metrics: + - "litellm_remaining_team_budget_metric" + - "litellm_team_max_budget_metric" + - "litellm_team_budget_remaining_hours_metric" + include_labels: + - "team" + - "team_alias" + + - group: "spend_metrics" + metrics: + - "litellm_spend_metric" + include_labels: + - "end_user" + - "hashed_api_key" + - "api_key_alias" + - "model" + - "team" + - "team_alias" + - "user" + + - group: "api_key_budget_metrics" + metrics: + - "litellm_api_key_max_budget_metric" + - "litellm_remaining_api_key_budget_metric" + - "litellm_api_key_budget_remaining_hours_metric" + include_labels: + - "hashed_api_key" + - "api_key_alias" + + # Note: api_key_rate_limit_metrics and callback_logging_metrics + # are not available in the current LiteLLM version + + 
- group: "deployment_success_responses_metric" + metrics: + - "litellm_deployment_success_responses" + include_labels: + - "requested_model" + - "litellm_model_name" + - "model_id" + - "api_base" + - "api_provider" + - "hashed_api_key" + - "api_key_alias" + - "team" + - "team_alias" + + - group: "deployment_failure_responses_metric" + metrics: + - "litellm_deployment_failure_responses" + include_labels: + - "requested_model" + - "litellm_model_name" + - "model_id" + - "api_base" + - "api_provider" + - "hashed_api_key" + - "api_key_alias" + - "team" + - "team_alias" + - "exception_status" + - "exception_class" + + - group: "deployment_total_requests_metric" + metrics: + - "litellm_deployment_total_requests" + include_labels: + - "requested_model" + - "litellm_model_name" + - "model_id" + - "api_base" + - "api_provider" + - "hashed_api_key" + - "api_key_alias" + - "team" + - "team_alias" + + - group: "provider_rate_limit_metrics" + metrics: + - "litellm_remaining_requests_metric" + - "litellm_remaining_tokens_metric" + include_labels: + - "model_group" + - "api_provider" + - "api_base" + - "litellm_model_name" + - "hashed_api_key" + - "api_key_alias" + + - group: "deployment_state_metric" + metrics: + - "litellm_deployment_state" + include_labels: + - "litellm_model_name" + - "model_id" + - "api_base" + - "api_provider" + + - group: "deployment_latency_per_output_token_metric" + metrics: + - "litellm_deployment_latency_per_output_token" + include_labels: + - "litellm_model_name" + - "model_id" + - "api_base" + - "api_provider" + - "hashed_api_key" + - "api_key_alias" + - "team" + - "team_alias" + + - group: "deployment_cooled_down_metric" + metrics: + - "litellm_deployment_cooled_down" + include_labels: + - "litellm_model_name" + - "model_id" + - "api_base" + - "api_provider" + + - group: "fallback_metrics" + metrics: + - "litellm_deployment_successful_fallbacks" + - "litellm_deployment_failed_fallbacks" + include_labels: + - "requested_model" + - "fallback_model" + 
- "hashed_api_key" + - "api_key_alias" + - "team" + - "team_alias" + - "exception_status" + - "exception_class" + + - group: "request_counting_metrics" + metrics: + - "litellm_requests_metric" + include_labels: + - "end_user" + - "hashed_api_key" + - "api_key_alias" + - "model" + - "team" + - "team_alias" + - "user" + - "user_email" + + - group: "overhead_latency_metric" + metrics: + - "litellm_overhead_latency_metric" + include_labels: + - "model_group" + - "api_provider" + - "api_base" + - "litellm_model_name" + - "hashed_api_key" + - "api_key_alias" + + - group: "time_to_first_token_metric" + metrics: + - "litellm_llm_api_time_to_first_token_metric" + include_labels: + - "model" + - "hashed_api_key" + - "api_key_alias" + - "team" + - "team_alias" + + - group: "system_health_metrics" + metrics: + - "litellm_pod_lock_manager_size" + - "litellm_in_memory_daily_spend_update_queue_size" + - "litellm_redis_daily_spend_update_queue_size" + - "litellm_in_memory_spend_update_queue_size" + - "litellm_redis_spend_update_queue_size" +``` -* **`sumo.datasource`**. Has fixed value of `litellm-metrics`. -* **`_sourceCategory`**. Set by the OpenTelemetry Collector resource processor. Use a value such as `otel/litellm/metrics` for consistent querying. -* **`deployment.environment`**. User configured. Enter a name to identify your deployment environment (e.g. `production`, `staging`, `dev`). +:::note +The `prometheus_metrics_config` block controls which labels are emitted per metric group. Labels not listed in `include_labels` will be stripped from the metric series. Ensure all labels used in dashboard panel queries are included in the corresponding group. The `api_key_rate_limit_metrics` and `callback_logging_metrics` groups are not available in the current LiteLLM version. 
+::: -## Prerequisites +### Logs collection -### For metrics collection +LiteLLM sends request and response logs to Sumo Logic via the `sumologic` callback, which POSTs log entries as JSON to an HTTP Logs and Metrics Source. -* LiteLLM proxy with Prometheus metrics enabled (e.g. `callbacks: ["prometheus"]` or `prometheus_metrics_config` in `litellm-config.yaml`). -* LiteLLM metrics endpoint accessible (default: `http://localhost:4000/metrics/`). -* OpenTelemetry Collector (or Sumo Logic Distribution for OpenTelemetry Collector) to scrape Prometheus metrics and send to Sumo Logic. -* HTTP Source in a Sumo Logic Hosted Collector for receiving metrics. +#### Step 1: Create a Hosted Collector (Sumo Logic) -### For team and budget labels +1. [**New UI**](/docs/get-started/sumo-logic-ui). In the Sumo Logic main menu select **Data Management**, and then under **Data Collection** select **Collection**. You can also click the **Go To...** menu at the top of the screen and select **Collection**.
[**Classic UI**](/docs/get-started/sumo-logic-ui-classic). In the main Sumo Logic menu, select **Manage Data > Collection > Collection**. +1. Click **Add Collector**. +1. Click **Hosted Collector**. +1. Provide a **Name** for the Collector. **Description** is optional. +1. **Category**. Enter any string to tag the logs collected from this Collector. This Source Category value is stored in a searchable metadata field called `_sourceCategory`. +1. Click the **+Add Field** link in the **Fields** section. Add any fields you want to associate with this Collector; each field needs a name (key) and value. +1. **Time Zone**. Set the default time zone when it is not extracted from the log timestamp. +1. Review your input and click **Save**. -* `team` and `team_alias` dimensions are populated when API keys are created with a `team_id` via `/team/new` and `/key/generate`. Keys without a team show `team=None`, `team_alias=None`. Create teams and associate keys to enable team-level spend and budget dashboards. +#### Step 2: Create an HTTP Logs and Metrics Source (Sumo Logic) -### Optional dimensions +1. In the Collectors page, click **Add Source** next to the Hosted Collector you just created. +1. Select **HTTP Logs & Metrics**. +1. Enter a **Name** to display for the Source. **Description** is optional. +1. **Source Category**. Enter a value such as `litellm/logs`. This value is stored in the `_sourceCategory` metadata field. +1. **Fields/Metadata**. Click **+Add** to define any additional fields you want to associate. +1. Click **Save**. +1. In the **HTTP Source Address** dialog box, copy the generated **Source URL**. You will use this as the value for `SUMOLOGIC_WEBHOOK_URL` in LiteLLM. -* `end_user`, `user`, `user_email`, `route`, `status_code` are populated when the application passes them (e.g. via headers or `prometheus_metrics_config` in `litellm-config.yaml`). Panels using these may return no data if not configured. 
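Optionally, confirm the Source accepts data before configuring LiteLLM by sending a test message (the URL below is a placeholder; use the Source URL you copied):

```shell
curl -X POST 'https://collectors.sumologic.com/receiver/v1/http/<your-source-token>' \
  -H 'Content-Type: application/json' \
  -d '{"message": "litellm source connectivity test"}'
```

A `200 OK` response indicates the Source accepted the payload; the test message should then be searchable under the `_sourceCategory` you configured.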
+#### Step 3: Configure LiteLLM to send logs (LiteLLM) -## Collection configuration and app installation +Set the following environment variables so LiteLLM sends logs to the HTTP Source URL copied above: -import ConfigAppInstall from '../../reuse/apps/opentelemetry/config-app-install.md'; +| Variable | Required | Description | +|:--|:--|:--| +| `SUMOLOGIC_WEBHOOK_URL` | Yes | HTTP Source URL copied from Step 2. Used by the `sumologic` callback to POST log entries. | +| `GENERIC_LOGGER_ENDPOINT` | Yes | Set to the same value as `SUMOLOGIC_WEBHOOK_URL`. Required by the LiteLLM generic logger fallback. | - +Ensure `sumologic` is included in the `callbacks` list in your `litellm-config.yaml` (for example, `/app/config.yaml`): -### Step 1: Set up collector +```yaml +litellm_settings: + callbacks: ["prometheus", "otel", "sumologic"] -import SetupColl from '../../reuse/apps/opentelemetry/set-up-collector.md'; +environment_variables: + SUMOLOGIC_WEBHOOK_URL: os.environ/SUMOLOGIC_WEBHOOK_URL +``` - +### Metrics collection -### Step 2: Configure integration +LiteLLM exposes Prometheus metrics at `/metrics/`. The Sumo Logic Distribution for OpenTelemetry (OTel) Collector scrapes these metrics every 30 seconds and forwards them to Sumo Logic using the `sumologic` exporter, authenticated via a Sumo Logic installation token. No separate HTTP Source is required for metrics. -Create an HTTP Source in your Hosted Collector for receiving metrics. Note the HTTP Source URL — you will use it as the `SUMOLOGIC_WEBHOOK_URL` environment variable. +#### Required environment variables -Set the following environment variables: +| Variable | Description | +|:--|:--| +| `SUMOLOGIC_INSTALLATION_TOKEN` | Installation token used by the `sumologic` extension to register the collector identity with Sumo Logic. | -* **`SUMOLOGIC_WEBHOOK_URL`**. Your Sumo Logic HTTP Source URL. The OpenTelemetry Collector will send metrics to this endpoint. -* **`SUMOLOGIC_INSTALLATION_TOKEN`**. 
Your Sumo Logic installation token (for collector registration if using the Sumo Logic Distribution). +#### OTel collector configuration (`otel-config.yaml`) -Configure the OpenTelemetry Collector to scrape LiteLLM metrics and send them to Sumo Logic. Use a configuration similar to the following: +Create the OTel collector configuration at `/etc/otelcol-sumo/config.yaml`. This is a **standalone full configuration** — not a supplement — that uses the `sumologic` extension to register with Sumo Logic and the Prometheus receiver to scrape LiteLLM's `/metrics/` endpoint: ```yaml extensions: sumologic: installation_token: ${SUMOLOGIC_INSTALLATION_TOKEN} + # Fixed collector name so re-registrations reuse the same collector ID. + # Persist /root/.sumologic-otel-collector across restarts to avoid + # creating a new registered collector on every startup. collector_name: litellm-otel-collector receivers: @@ -96,7 +363,7 @@ receivers: scrape_interval: 30s metrics_path: '/metrics/' static_configs: - - targets: ['localhost:4000'] # Change if LiteLLM runs elsewhere + - targets: ['localhost:4000'] processors: memory_limiter: @@ -123,10 +390,16 @@ processors: - key: deployment.environment value: production action: upsert + resource/sumologic: + attributes: + # Remove infrastructure labels not relevant to LiteLLM metrics + - key: cloud.availability_zone + action: delete + - key: k8s.pod.uid + action: delete exporters: sumologic: - endpoint: ${SUMOLOGIC_WEBHOOK_URL} metric_format: prometheus service: @@ -134,247 +407,208 @@ service: pipelines: metrics: receivers: [prometheus] - processors: [memory_limiter, batch, resourcedetection/system, resource/common] + processors: [memory_limiter, batch, resourcedetection/system, resource/common, resource/sumologic] exporters: [sumologic] ``` -:::important -* Update `targets` in the Prometheus scrape config if LiteLLM runs on a different host or port. 
-* Ensure `_sourceCategory` and `deployment.environment` match the values used in the app dashboards (e.g. template variables `{{source_category}}` and `{{deployment.environment}}`).
+:::note
+- Do not change the `_sourceCategory` (`otel/litellm/metrics`) or `deployment.environment` values. These must match the dashboard template variables for panels to populate correctly.
+- Save the OTel collector credentials directory (`/root/.sumologic-otel-collector`) across restarts. If it is lost, a new collector registration is created in Sumo Logic each time the collector starts.
:::

-### Step 3: Verify LiteLLM configuration
+## App installation

-Ensure LiteLLM is configured to expose Prometheus metrics. In `litellm-config.yaml`:
+Once collection is configured (see [Collection configuration](#collection-configuration) above), you can install the LiteLLM app in any of the following three ways:

-```yaml
-litellm_settings:
-  callbacks: ["otel", "prometheus", "sumologic"]
-  service_callback: ["prometheus_system"]
-```
+### Create a new collector and install the app

-For team and budget tracking, create teams and API keys with `team_id` via the LiteLLM API. See [LiteLLM team management](https://docs.litellm.ai/docs/proxy/team_management) for details.
+To set up collection and install the app, do the following:

-## Sample metrics
+:::note
+**Next-Gen App**: To install or update the app, you must be an account administrator or a user with Manage Apps, Manage Monitors, Manage Fields, Manage Metric Rules, and Manage Collectors capabilities depending upon the different content types part of the app.
+:::

-
-Request and latency metrics +1. Select **App Catalog**. +1. In the 🔎 **Search Apps** field, search for **LiteLLM**, then select it. +1. Click **Install App**. + :::note + Sometimes this button says **Add Integration**. + ::: +1. In the **Set Up Collection** section, select **Create a new Collector**. +1. **Collector Name**. Enter a name to display the source in the Sumo Logic web application. The description is optional. +1. **Timezone**. Set the default time zone when it is not extracted from the log timestamp. Time zone settings on sources override a Collector time zone setting. +1. (Optional) **Metadata**. Click **+Add Metadata** to add custom metadata fields. Define the fields you want to associate; each metadata field needs a name (key) and value. +1. Click **Next**. +1. Configure the OpenTelemetry Collector using the configuration provided in the [Collection configuration](#collection-configuration) section above. +1. In the **Configure** section, complete the following fields. + - **Field Name**. If you already have collectors and sources set up, select the configured metadata field name (for example, `_sourceCategory`) or specify other custom metadata (for example, `_collector`) along with its metadata **Field Value**. +1. Click **Next**. You will be redirected to the **Preview & Done** section. 
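After the collector configuration is in place, you can sanity-check that the scrape target (the `localhost:4000` default from the Prometheus receiver config) is actually serving LiteLLM metric families before waiting on dashboard panels to fill. A minimal sketch — the helper below is illustrative, not part of LiteLLM or the OTel collector, and the URL is an assumption taken from the scrape config:

```python
import urllib.request

def litellm_metric_families(exposition_text: str) -> set:
    """Collect LiteLLM metric family names from Prometheus text-format output
    (sample lines look like 'litellm_spend_metric_total{team="a"} 1.5')."""
    families = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        # The family name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith("litellm_"):
            families.add(name)
    return families

def check_endpoint(url: str = "http://localhost:4000/metrics/") -> set:
    """Fetch the exposition from a running proxy and return its LiteLLM families."""
    body = urllib.request.urlopen(url).read().decode("utf-8")
    return litellm_metric_families(body)

# Against a live proxy:
#   print(sorted(check_endpoint()))
```

If the returned set is empty, confirm that `callbacks: ["prometheus"]` is present in `litellm-config.yaml` before debugging the collector itself.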
-```json -{ - "metric": "litellm_proxy_total_requests_metric_total", - "sumo.datasource": "litellm-metrics", - "_sourceCategory": "otel/litellm/metrics", - "deployment.environment": "production", - "requested_model": "gpt-4", - "team_alias": "platform-team", - "value": 1250, - "timestamp": "2025-02-21T10:30:00.000Z" -} -``` +**Post-installation** -```json -{ - "metric": "litellm_request_total_latency_metric_sum", - "sumo.datasource": "litellm-metrics", - "deployment.environment": "production", - "requested_model": "gpt-4", - "value": 45.2, - "timestamp": "2025-02-21T10:30:00.000Z" -} -``` +Once your app is installed, it will appear in your **Installed Apps** folder, and dashboard panels will start to fill automatically. + +Each panel slowly fills with data matching the time range query received since the panel was created. Results will not immediately be available but will be updated with full graphs and charts over time. -
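Team- and budget-oriented panels (Spend by Team, the budget dashboards) only attribute data once API keys carry a `team_id`, and keys cannot be re-attached to a team after creation. A sketch of the request payloads for the proxy's `/team/new` and `/key/generate` endpoints — the alias, budget figure, and admin token below are placeholders, and the full set of accepted fields is covered in the LiteLLM team-management documentation:

```python
import json

# Illustrative payload for POST /team/new on the LiteLLM proxy.
team_payload = {
    "team_alias": "platform-team",  # placeholder alias
    "max_budget": 100.0,            # placeholder budget
}

def key_payload(team_id: str) -> dict:
    """Payload for POST /key/generate; team_id must be supplied at creation time,
    because keys created without it cannot be updated later."""
    return {"team_id": team_id}

# Shell equivalent (placeholder admin token and host):
#   curl http://localhost:4000/team/new \
#     -H "Authorization: Bearer sk-admin" -H "Content-Type: application/json" \
#     -d '{"team_alias": "platform-team", "max_budget": 100.0}'
print(json.dumps(key_payload("team-123")))
```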
+### Use an existing collector and install the app -## Sample queries +To set up the source in an existing collector and install the app, do the following: :::note -Pipes in queries are escaped as `\|` for Markdown. When pasting into Sumo Logic, use `|` (single pipe). +**Next-Gen App**: To install or update the app, you must be an account administrator or a user with Manage Apps, Manage Monitors, Manage Fields, Manage Metric Rules, and Manage Collectors capabilities depending upon the different content types part of the app. ::: -```sql title="Total requests over time" -_sourceCategory=otel/litellm/metrics deployment.environment=production metric=litellm_proxy_total_requests_metric_total -| quantize using sum -| sum -``` +1. Select **App Catalog**. +1. In the 🔎 **Search Apps** field, search for **LiteLLM**, then select it. +1. Click **Install App**. +1. In the **Set Up Collection** section, select **Use an existing Collector**. +1. From the **Select Collector** dropdown, select the collector that you want to set up your source with and click **Next**. +1. Configure the source as specified above, ensuring all required fields are included. +1. In the **Configure** section, complete the following fields. + - **Field Name**. If you already have collectors and sources set up, select the configured metadata field name (for example, `_sourceCategory`) or specify other custom metadata (for example, `_collector`) along with its metadata **Field Value**. +1. Click **Next**. You will be redirected to the **Preview & Done** section. 
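The latency dashboards and the `LiteLLM - High Latency` monitor are derived from Prometheus `_sum`/`_count` pairs such as `litellm_request_total_latency_metric_sum` and `_count`: the average is simply their ratio. A sketch of that arithmetic with made-up counter values:

```python
def avg_from_counters(latency_sum: float, latency_count: float) -> float:
    """Average latency in seconds from a cumulative-sum / sample-count pair
    (for example litellm_request_total_latency_metric_sum / _count)."""
    if latency_count == 0:
        return 0.0  # no samples yet
    return latency_sum / latency_count

def breaches_latency_monitor(latency_sum: float, latency_count: float,
                             threshold_s: float = 30.0) -> bool:
    # Mirrors the bundled "LiteLLM - High Latency" condition: avg latency > 30s.
    return avg_from_counters(latency_sum, latency_count) > threshold_s

print(avg_from_counters(45.0, 20))  # → 2.25
```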
-```sql title="Average latency by requested model" -_sourceCategory=otel/litellm/metrics deployment.environment=production metric=litellm_request_total_latency_metric_sum -| quantize using sum -| sum by requested_model -``` +**Post-installation** -```sql title="Spend by team" -_sourceCategory=otel/litellm/metrics deployment.environment=production metric=litellm_spend_metric_total -| quantize using sum -| sum by team_alias -``` +Once your app is installed, it will appear in your **Installed Apps** folder, and dashboard panels will start to fill automatically. -```sql title="Success rate (success / total)" -_sourceCategory=otel/litellm/metrics deployment.environment=production litellm_model_name=* metric=litellm_deployment_success_responses_total -| quantize using sum -| sum -``` -Divide by the total requests query result to get percentage. +Each panel slowly fills with data matching the time range query received since the panel was created. Results will not immediately be available but will be updated with full graphs and charts over time. -## Installing the LiteLLM app +### Use an existing source and install the app -import AppInstallIndexV2 from '../../reuse/apps/app-install-index-option.md'; +To skip collection and only install the app, do the following: - +:::note +**Next-Gen App**: To install or update the app, you must be an account administrator or a user with Manage Apps, Manage Monitors, Manage Fields, Manage Metric Rules, and Manage Collectors capabilities depending upon the different content types part of the app. +::: -As part of the app installation process, the following fields will be created by default: +1. Select **App Catalog**. +1. In the 🔎 **Search Apps** field, search for **LiteLLM**, then select it. +1. Click **Install App**. +1. In the **Set Up Collection** section, select **Skip this step and use existing source** and click **Next**. +1. In the **Configure** section, complete the following fields. + - **Field Name**. 
If you already have collectors and sources set up, select the configured metadata field name (for example, `_sourceCategory`) or specify other custom metadata (for example, `_collector`) along with its metadata **Field Value**.
+1. Click **Next**. You will be redirected to the **Preview & Done** section.

-* **`sumo.datasource`**. Fixed value `litellm-metrics`.
-* **`_sourceCategory`**. Source category for LiteLLM metrics (e.g. `otel/litellm/metrics`).
-* **`deployment.environment`**. Deployment environment (e.g. `production`, `staging`).
+**Post-installation**

-## Viewing the LiteLLM dashboards
+Once your app is installed, it will appear in your **Installed Apps** folder, and dashboard panels will start to fill automatically.

-import ViewDashboardsIndex from '../../reuse/apps/view-dashboards-index.md';
+Each panel slowly fills with data matching the time range query received since the panel was created. Results will not immediately be available but will be updated with full graphs and charts over time.

-
+## Viewing the LiteLLM dashboards

-### Overview (Executive)
+All dashboards have a set of filters that you can apply to the entire dashboard. Use these filters to drill down and examine the data to a granular level. You can change the time range for a dashboard or panel by selecting a predefined interval from a drop-down list, choosing a recently used time range, or specifying custom dates and times.

-The **LiteLLM - Overview** dashboard provides high-level health and usage at a glance.
+You can use template variables to drill down and examine the data on a granular level. For more information, see [Filtering Dashboards with Template Variables](/docs/dashboards/filter-template-variables/).

-Use this dashboard to:
-* Monitor total requests, success rate, and active deployments.
-* Track total spend and average latency.
-* Compare request volume and spend over time.
-* Identify top models by request volume and top teams by spend.
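The request-health panels and the bundled `LiteLLM - High Error Rate` monitor rest on the same counter arithmetic: failed requests as a share of total requests. A minimal sketch of that calculation — the 10% threshold matches the monitor described later on this page, and the counter values are made up:

```python
def failure_rate_pct(failed_total: float, total: float) -> float:
    """Failure rate (%) from litellm_proxy_failed_requests_metric_total
    over litellm_proxy_total_requests_metric_total."""
    if total == 0:
        return 0.0  # no traffic yet
    return 100.0 * failed_total / total

def breaches_error_monitor(failed_total: float, total: float,
                           threshold_pct: float = 10.0) -> bool:
    # "LiteLLM - High Error Rate" alerts when the failure rate exceeds 10%.
    return failure_rate_pct(failed_total, total) > threshold_pct

print(failure_rate_pct(125, 1250))  # → 10.0
```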
+Most Next-Gen apps allow you to provide the scope at the installation time and are comprised of a key (`_sourceCategory` by default) and a default value for this key. Based on your input, the app dashboards will be parameterized with a dashboard variable, allowing you to change the dataset queried by all panels. This eliminates the need to create multiple copies of the same dashboard with different queries. -LiteLLM - Overview +### Metrics Overview -### Latency & Performance +The **LiteLLM - Metrics Overview** dashboard provides a high-level summary of proxy health and usage from Prometheus metrics. Single-value panels give instant visibility into total requests, failed requests, active teams, active API keys, and active models. The Requests by Requested Model and Spend by Team honeycomb panels let you compare request volume and cost distribution at a glance. Time-series views for requests over time and failed requests over time surface usage trends and anomalies across the selected period. -The **LiteLLM - Latency & Performance** dashboard provides a deep dive into request latency, time to first token, and overhead. +LiteLLM - Metrics Overview -Use this dashboard to: -* Track end-to-end latency and LLM API latency over time. -* Compare overhead latency by API provider. -* Identify slowest models and latency distribution by model. -* Drill down by requested model and API key alias. +### Proxy Health and Performance -LiteLLM - Latency & Performance +The **LiteLLM - Proxy Health and Performance** dashboard provides operational visibility into proxy request health filtered by deployment environment, team, end user, and model. Total Requests Over Time and Failed Requests Over Time track throughput and failure trends side by side, while Traffic by Route surfaces which API paths (for example, `/chat/completions`, `/embeddings`) drive the most load. 
This dashboard is the primary starting point for detecting spikes, degraded success rates, or unexpected traffic patterns across routes. -### Tokens & Cost +LiteLLM - Proxy Health and Performance -The **LiteLLM - Tokens & Cost** dashboard tracks token consumption and spend. +### Latency and Performance -Use this dashboard to: -* Monitor total tokens, input vs output tokens, and token rate. -* Track spend over time and by team. -* Compare token usage by model and spend by API key alias. -* Identify top teams by spend. +The **LiteLLM - Latency and Performance** dashboard provides a deep dive into request latency, time to first token (TTFT), and LLM API latency. LLM API Latency Over Time and Latency by Requested Model track end-to-end and provider latency trends across the selected period. Time to First Token by Team measures streaming responsiveness per team, which is critical for interactive use cases. Request Count by Requested Model shows traffic distribution, while the Top 10 Slowest Models (Avg Latency) table helps pinpoint models that consistently contribute to slow responses and SLA breaches. -:::note -**`litellm_spend_metric_total`** uses `team`, `team_alias`, `hashed_api_key`, `api_key_alias` — not `model`. Use `requested_model` for token metrics. -::: +LiteLLM - Latency and Performance -LiteLLM - Tokens & Cost +### Budget and Rate Limits -### Budget & Rate Limits +The **LiteLLM - Budget and Rate Limits** dashboard provides visibility into remaining budgets and provider-side rate limits. The Budget by Team and Rate Limit Headroom by Model & API Base honeycomb panels give instant color-coded health of budget and rate limit status across teams and models. Team Budget Remaining and API Key Budget Remaining time-series panels track available spend over time, and Hours Until Budget Reset surfaces upcoming resets before limits are exhausted. Remaining Budget By Teams and Max Budget By Teams tables give a ranked view for governance. 
Remaining Requests and Remaining Tokens track provider rate limit headroom sourced from upstream response headers.

-The **LiteLLM - Budget & Rate Limits** dashboard provides visibility into remaining budgets and provider rate limits.
+LiteLLM - Budget and Rate Limits

-Use this dashboard to:
-* Track team and API key budget remaining.
-* Monitor provider rate limit headroom (e.g. Groq remaining requests and tokens).
-* View hours until budget reset.
-* Compare budget and rate limits by model and API base.
+### Deployment and Fallback Health

-LiteLLM - Budget & Rate Limits
+The **LiteLLM - Deployment and Fallback Health** dashboard monitors the health of individual LLM deployments and the effectiveness of fallback routing. The Deployment Health by Model & Provider honeycomb panel provides a color-coded status view across all deployments, while Deployment State By Model tracks health state (0 = healthy, 1 = partial, 2 = outage) over time. Success vs Failure Responses compares response outcomes, and Fallback Success by Requested Model and Fallback Failure by Requested Model show whether the router successfully recovered from primary failures. Proxy Failures by Exception Class and Deployment Failures by Exception Status break down failure root causes. Cooled Down by Model tracks how often deployments have been placed into a cooldown period (temporarily removed from routing) after repeated failures.

-### Deployment & Fallback Health
+LiteLLM - Deployment and Fallback Health

-The **LiteLLM - Deployment & Fallback Health** dashboard monitors LLM deployment health, fallbacks, and failures.
+### Infrastructure and Callbacks

-Use this dashboard to:
-* Track deployment state (healthy, partial, outage) by model.
-* Compare success vs failure trends per deployment.
-* Monitor successful and failed fallbacks.
-* Identify deployment failures by exception class and status.
-* Track cooled-down deployments.
+The **LiteLLM - Infrastructure and Callbacks** dashboard provides visibility into the supporting services and callback integrations that underpin LiteLLM's operation. Redis Latency, Postgres Latency, and LiteLLM Self Latency panels track dependency health over time. Redis Failed Requests and Callback Logging Failures surface error conditions in caching and observability pipelines. Queue size panels for the Pod Lock Manager Queue, In-Memory Spend Update Queue, and Redis Spend Update Queue serve as backpressure indicators that signal whether the system is keeping up with spend tracking workloads. The Deployment Correlation table maps each model to its provider and API base, giving operators a fast reference for routing topology. -LiteLLM - Deployment & Fallback Health +LiteLLM - Infrastructure and Callbacks -### Infrastructure & Callbacks +### Cost Analytics -The **LiteLLM - Infrastructure & Callbacks** dashboard provides visibility into Redis, Postgres, self latency, and callback health. +The **LiteLLM - Cost Analytics** dashboard provides comprehensive cost tracking sourced from LiteLLM request logs. Single-value panels for Total Cost ($), Input Token Cost ($), and Output Token Cost ($) give instant spend visibility. Cost Trend (LLM + MCP + Tools) breaks spend into input, output, tool, and MCP cost components over time, while Cost by Provider Over Time shows which LLM providers drive spending. Top 15 Models by Cost, Top 15 Teams by Cost, Top 15 API Key Users by Cost, and Top 20 End Users by Cost (B2B) tables provide ranked attribution for chargeback and governance. Top 15 Cost by Prompt Version (A/B Testing) and Top 15 MCP Tool Costs by Server and Tool enable prompt and tooling cost comparison. Token Cost Efficiency by Model surfaces cost per million tokens, Cache Cost Savings tracks cost avoided through cache hits, and User Budget Status monitors budget consumption per API key user. -Use this dashboard to: -* Monitor Redis, Postgres, and LiteLLM self latency. 
-* Track Redis failed requests and callback logging failures. -* View deployment latency per token by model. -* Monitor queue sizes (pod lock manager, spend update queues). -* Map deployment correlation (model ↔ provider ↔ API base). +LiteLLM - Cost Analytics -LiteLLM - Infrastructure & Callbacks +### Error Analysis and Debugging -### User & Route Visibility +The **LiteLLM - Error Analysis and Debugging** dashboard provides comprehensive error tracking sourced from LiteLLM request logs. Total Errors and Failure Rate (%) single-value panels give instant error visibility. Error Trend Over Time and Error Rate Trend Over Time track failure volume and percentage trends. Error Codes Distribution and Error Class Distribution pie charts identify the most common failure categories, while Errors Distribution by Provider breaks down failures by upstream LLM provider. Top 15 Models by Error Count and Top 15 Error Messages tables surface the highest-impact models and error strings. Recent Errors (Latest 100) and Detailed Error Analysis with Trace IDs tables support active debugging with distributed trace correlation. Guardrail Status Distribution and Guardrail Status Trend monitor guardrail execution outcomes, and Top 20 IP Addresses by Error Count and Top 20 Network Error Patterns support network-level security and connectivity debugging. Cost Calculation Failures surfaces requests where cost attribution failed. -The **LiteLLM - User & Route Visibility** dashboard provides user/end-user segmentation and route-level metrics. +LiteLLM - Error Analysis and Debugging -Use this dashboard to: -* View requests by status code and by route. -* Track requests and spend by end user. -* Identify failed requests by status code. -* Rank top end users by spend. +### Security and Compliance -:::note -Panels use optional dimensions (`end_user`, `route`, `status_code`). Populate these when the application passes them (e.g. via `prometheus_metrics_config` in `litellm-config.yaml`). 
-::: +The **LiteLLM - Security and Compliance** dashboard provides comprehensive security monitoring sourced from LiteLLM request logs. It covers detailed guardrail analytics, entity masking and PII detection, geographic access patterns, network security analysis, and compliance tracking — designed for security teams and compliance officers who need audit-ready visibility into how AI is being used across the organization. -LiteLLM - User & Route Visibility +LiteLLM - Security and Compliance -## Create monitors for LiteLLM app +### MCP Overview -import CreateMonitors from '../../reuse/apps/create-monitors.md'; +The **LiteLLM - MCP Overview** dashboard provides visibility into Model Context Protocol (MCP) tool usage sourced from LiteLLM request logs. Total MCP Tool Calls and Total MCP Tool Cost ($) single-value panels give instant MCP activity and spend visibility. Active Prompt Versions tracks the number of distinct prompt versions in use. MCP Tool Call Trends shows usage volume over time by tool, while Top 20 MCP Tools by Server and MCP Tool Performance & Success Rate tables identify the most-used tools and their latency and success metrics. MCP Cost by Tool & Server ranks tooling costs for attribution. RAG Request Trend (Vector Store Queries) tracks retrieval-augmented generation activity over time, Vector Store by Provider shows distribution across vector store backends, and Prompt Management Integration Usage Distribution and Prompt Version Usage Trend monitor prompt management integrations such as Langfuse and PromptLayer. - +LiteLLM - MCP Overview -### LiteLLM monitors +### Vector Overview -| Name | Description | Alert Condition | Recover Condition | -|:--|:--|:--|:--| -| `LiteLLM - High Error Rate` | Critical when proxy failure rate exceeds 10% of total requests. | Count (failure rate) > 10% | Count < 10% | -| `LiteLLM - High Latency` | Warning when average request latency exceeds 30 seconds. 
| Avg latency > 30s | Avg latency ≤ 30s | -| `LiteLLM - Budget Exceeded` | Critical when team budget remaining is zero or negative. | Remaining budget ≤ 0 | Remaining budget > 0 | -| `LiteLLM - Deployment Unhealthy` | Critical when deployment state indicates outage (state=2). | deployment_state = 2 | deployment_state < 2 | -| `LiteLLM - High Failed Fallbacks` | Warning when failed fallbacks exceed threshold for a requested model. | Failed fallbacks > 5 | Failed fallbacks ≤ 5 | +The **LiteLLM - Vector Overview** dashboard provides visibility into vector store and RAG usage sourced from LiteLLM request logs. RAG vs Non-RAG Requests shows the proportion of retrieval-augmented requests versus standard requests. Vector Store by Provider breaks down which vector store backends are in use. Top Searched Queries surfaces the most frequent queries sent to vector stores, and Top 10 Vector Stores with Highest Average Score identifies the best-performing stores by retrieval score — helping teams assess and improve RAG retrieval quality. -## Upgrading the LiteLLM app (Optional) +LiteLLM - Vector Overview -import AppUpdate from '../../reuse/apps/app-update.md'; +## Create monitors for LiteLLM app - +import CreateMonitors from '../../reuse/apps/create-monitors.md'; -## Uninstalling the LiteLLM app (Optional) +## Upgrading/Downgrading the LiteLLM app (Optional) -import AppUninstall from '../../reuse/apps/app-uninstall.md'; +To update the app, do the following: - +:::note +**Next-Gen App**: To install or update the app, you must be an account administrator or a user with Manage Apps, Manage Monitors, Manage Fields, Manage Metric Rules, and Manage Collectors capabilities depending upon the different content types part of the app. +::: -## Troubleshooting +1. Select **App Catalog**. +1. In the **Search Apps** field, search for and then select your app. Optionally, you can identify apps that can be upgraded in the **Upgrade available** section. +1. 
To upgrade the app, select **Upgrade** from the **Manage** dropdown. + - If the upgrade does not have any configuration or property changes, you will be redirected to the **Preview & Done** section. + - If the upgrade has any configuration or property changes, you will be redirected to the **Setup Data** page. +1. In the **Configure** section, complete the following fields. + - **Field Name**. If you already have collectors and sources set up, select the configured metadata field name (for example, `_sourceCategory`) or specify other custom metadata (for example, `_collector`) along with its metadata **Field Value**. +1. Click **Next**. You will be redirected to the **Preview & Done** section. -### No data in dashboards +**Post-update** -* Verify the OpenTelemetry Collector is running and scraping LiteLLM at the configured target (e.g. `localhost:4000`). -* Ensure `SUMOLOGIC_WEBHOOK_URL` is set correctly and the HTTP Source is receiving data. -* Check that `_sourceCategory` and `deployment.environment` in the collector config match the dashboard template variables. -* Confirm LiteLLM exposes Prometheus metrics at `/metrics/` and that `callbacks` includes `prometheus`. +Your upgraded app will be installed in the **Installed Apps** folder and dashboard panels will start to fill automatically. -### `team=None` or `team_alias=None` in metrics +:::note +See our [Release Notes changelog](/release-notes) for new updates in the app. +::: -* API keys must be created with `team_id` from the start. Keys created without `team_id` cannot be updated later. -* Create teams via `/team/new` and generate keys via `/key/generate` with `{"team_id": "..."}`. See [LiteLLM team management](https://docs.litellm.ai/docs/proxy/team_management). +To revert the app to a previous version, do the following: -### Invalid dimension in panel queries +1. Select **App Catalog**. +1. In the **Search Apps** field, search for and then select your app. +1. 
To version down the app, select **Revert to <previous version of your app>** from the **Manage** dropdown. -* Each metric has specific valid dimensions. Using `sum by ` with an invalid dimension returns no data. Refer to the dashboard label reference for valid dimensions per metric (e.g. `litellm_spend_metric_total` uses `team`, `team_alias`, `hashed_api_key`, `api_key_alias` — not `model`). +## Uninstalling the LiteLLM app (Optional) -### Query pipe syntax +To uninstall the app, do the following: -* In Sumo Logic, use `|` (single pipe) for query operators. In this documentation, pipes are escaped as `\|` for Markdown. When copying queries, replace `\|` with `|`. +1. Select **App Catalog**. +1. In the 🔎 **Search Apps** field, search for your app, then select it. +1. Click **Uninstall**. From 74976964a9ed39d0baaefae1a1f3dcd470f70d4c Mon Sep 17 00:00:00 2001 From: "Kim (Sumo Logic)" <56411016+kimsauce@users.noreply.github.com> Date: Wed, 25 Feb 2026 12:36:40 -0800 Subject: [PATCH 3/3] Apply suggestion from @kimsauce --- docs/integrations/saas-cloud/litellm.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/integrations/saas-cloud/litellm.md b/docs/integrations/saas-cloud/litellm.md index 2b72d79f04..8f1931d85b 100644 --- a/docs/integrations/saas-cloud/litellm.md +++ b/docs/integrations/saas-cloud/litellm.md @@ -596,7 +596,7 @@ To update the app, do the following: Your upgraded app will be installed in the **Installed Apps** folder and dashboard panels will start to fill automatically. :::note -See our [Release Notes changelog](/release-notes) for new updates in the app. +See our [Release Notes changelog](/docs/release-notes) for new updates in the app. ::: To revert the app to a previous version, do the following: