61 changes: 60 additions & 1 deletion .env.example
@@ -16,6 +16,65 @@
# 'Authorization' header as a Bearer token (e.g., "Authorization: Bearer YOUR_PROXY_API_KEY").
#PROXY_API_KEY="YOUR_PROXY_API_KEY"

# App runtime mode.
# dev = allows insecure defaults when ALLOW_INSECURE_DEFAULTS is not set
# prod = refuses startup when critical secrets are default/missing
#APP_ENV="dev"

# Override safety guard for local experimentation only.
# In prod, keep this false and set strong SESSION_SECRET/API_TOKEN_PEPPER.
#ALLOW_INSECURE_DEFAULTS=true

# Auth transition mode for /v1 API auth.
# users = only database-backed user API keys
# legacy = only PROXY_API_KEY
# both = accepts both user keys and PROXY_API_KEY
# Default is "both" for migration compatibility.
#AUTH_MODE="both"

# Pepper used when hashing user API tokens at rest:
# token_hash = SHA256(API_TOKEN_PEPPER + plaintext_token)
# Must be set to a strong random value in production.
# Keep this stable after issuing user API keys, or existing keys will no longer match.
#API_TOKEN_PEPPER="replace-with-random-token-pepper"

# Initial admin bootstrap user (created once on first startup).
# Required when AUTH_MODE includes "users" and you need dashboard/admin access.
#INITIAL_ADMIN_USERNAME="admin"
#INITIAL_ADMIN_PASSWORD="change-me"

# Dashboard session cookie settings.
# SESSION_SECRET should be a long random value in production.
# SESSION_COOKIE_SAMESITE supports: lax, strict, none
#SESSION_SECRET="replace-with-random-session-secret"
#SESSION_TTL_SECONDS=86400
#SESSION_COOKIE_SECURE=false
#SESSION_COOKIE_SAMESITE="lax"

# CORS settings for browser clients.
# Comma-separated origins, e.g. "http://localhost:3000,https://admin.example.com"
#CORS_ALLOW_ORIGINS=""
#CORS_ALLOW_CREDENTIALS=false
#CORS_ALLOW_METHODS="GET,POST,PUT,PATCH,DELETE,OPTIONS"
#CORS_ALLOW_HEADERS="Authorization,Content-Type,X-API-Key,X-Requested-With,X-CSRF-Token"

# CSRF protection for UI forms is enabled by default.
# Cookie/form field names are fixed internally: proxy_csrf / csrf_token.
# CSRF cookie security follows SESSION_COOKIE_SECURE + SESSION_COOKIE_SAMESITE.

# Optional DB override. By default uses sqlite file: data/proxy.db
#DATABASE_URL="sqlite+aiosqlite:///data/proxy.db"

# SQLite lock wait timeout in milliseconds.
#SQLITE_BUSY_TIMEOUT_MS=5000

# Usage event retention window (days). Older events are pruned on startup and
# can be pruned manually via POST /api/admin/usage/prune.
#USAGE_RETENTION_DAYS=30


# ------------------------------------------------------------------------------
# | [API KEYS] Provider API Keys |
@@ -418,4 +477,4 @@
#
# Set to "0" to show these warnings (useful for debugging).
# Default: "1" (suppress warnings)
# SUPPRESS_LITELLM_SERIALIZATION_WARNINGS=1
# SUPPRESS_LITELLM_SERIALIZATION_WARNINGS=1
13 changes: 12 additions & 1 deletion DOCUMENTATION.md
@@ -919,6 +919,18 @@ The proxy accepts both Anthropic and OpenAI authentication styles:
- `x-api-key` header (Anthropic style)
- `Authorization: Bearer` header (OpenAI style)

For `/v1/*` token validation, behavior is controlled by `AUTH_MODE`:

| AUTH_MODE | Accepted tokens | Migration intent |
|-----------|-----------------|------------------|
| `users` | Database-backed user API keys | Target steady-state |
| `legacy` | `PROXY_API_KEY` only | Legacy compatibility mode |
| `both` | User API keys + `PROXY_API_KEY` | Default migration mode |

Legacy compatibility: existing clients that still send `PROXY_API_KEY` remain supported when `AUTH_MODE=legacy` or `AUTH_MODE=both`.

Recommended rollout: bootstrap the admin user via `INITIAL_ADMIN_USERNAME`/`INITIAL_ADMIN_PASSWORD`, run with `AUTH_MODE=both` while clients migrate, then switch to `AUTH_MODE=users`.
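
The mode table above can be read as a small decision function. The following is a minimal sketch of that reading, not the proxy's actual implementation: `is_token_accepted` and the `is_user_key` callback (standing in for the database-backed key lookup) are hypothetical names introduced only for illustration.

```python
import hmac
from typing import Callable, Optional

VALID_AUTH_MODES = ("users", "legacy", "both")


def is_token_accepted(
    token: str,
    *,
    auth_mode: str,
    proxy_api_key: Optional[str],
    is_user_key: Callable[[str], bool],
) -> bool:
    """Return True if `token` is acceptable under the given AUTH_MODE."""
    if auth_mode not in VALID_AUTH_MODES:
        raise ValueError(f"unsupported AUTH_MODE: {auth_mode!r}")
    if not token:
        return False

    # legacy / both: compare against the single shared PROXY_API_KEY.
    # compare_digest avoids leaking match position through timing.
    if auth_mode in ("legacy", "both") and proxy_api_key:
        if hmac.compare_digest(token.encode(), proxy_api_key.encode()):
            return True

    # users / both: defer to the (hypothetical) database-backed lookup.
    if auth_mode in ("users", "both"):
        return is_user_key(token)

    return False
```

Under this sketch, moving from `both` to `users` only removes the `PROXY_API_KEY` branch, which is why the rollout can be done without touching clients that already hold user keys.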

### 3.5. Antigravity (`antigravity_provider.py`)

The most sophisticated provider implementation, supporting Google's internal Antigravity API for Gemini 3 and Claude models (including **Claude Opus 4.5**, Anthropic's most powerful model).
@@ -1925,4 +1937,3 @@ The GUI modifies the same environment variables that the `RotatingClient` reads:
3. **Proxy applies rules** → `get_available_models()` filters based on rules

**Note**: The proxy must be restarted to pick up rule changes made via the GUI (or use the Launcher TUI's reload functionality if available).

4 changes: 2 additions & 2 deletions Dockerfile
@@ -34,8 +34,8 @@ ENV PATH=/root/.local/bin:$PATH
# Copy application code
COPY src/ ./src/

# Create directories for logs and oauth credentials
RUN mkdir -p logs oauth_creds
# Create runtime directories for persisted state
RUN mkdir -p logs oauth_creds usage data

# Expose the default port
EXPOSE 8000
48 changes: 48 additions & 0 deletions PROJECT_OVERVIEW.md
@@ -0,0 +1,48 @@
# LLM API Key Proxy - Project Overview

## What this project is trying to accomplish

LLM API Key Proxy is a self-hosted gateway that gives teams a single, OpenAI-compatible endpoint for many model providers.

Instead of wiring every client directly to OpenAI, Anthropic, Gemini, OpenRouter, and others, apps call this proxy at `/v1/*`.
The proxy then routes requests to the right upstream provider while keeping a consistent API surface.

The current direction is to make this useful for real multi-user teams, not just single-user local setups.

## Core goals

1. **Unified API surface**
   - Expose familiar OpenAI-style endpoints (`/v1/chat/completions`, `/v1/embeddings`, `/v1/models`, etc.).
   - Support Anthropic-compatible endpoints (`/v1/messages`, `/v1/messages/count_tokens`).

2. **Provider abstraction + routing**
   - Route `provider/model` requests to the correct backend.
   - Keep client-side integrations simple while supporting multiple providers behind the scenes.

3. **Multi-user access control**
   - Allow admin-managed users (no self-signup).
   - Let users create/revoke personal API keys.
   - Support transition modes via `AUTH_MODE=users|legacy|both` for migration from a single legacy proxy key.

4. **Usage accounting and visibility**
   - Attribute request usage to user/API key.
   - Provide user and admin usage summaries, breakdowns, and dashboards.

5. **Safe deployment defaults**
   - Secure session cookies and CSRF protection for UI forms.
   - Hash-only API token storage.
   - Stricter CORS and secret handling for production mode.
   - SQLite reliability hardening and usage retention controls.
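
To make "hash-only API token storage" concrete, here is a minimal sketch based on the formula given in `.env.example` (`token_hash = SHA256(API_TOKEN_PEPPER + plaintext_token)`). The function names are hypothetical and the real code may differ; the point is only that the plaintext token is shown once at issue time and never stored.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_api_token(pepper: str, plaintext_token: str) -> str:
    """SHA256(pepper + token), hex-encoded, per the .env.example comment."""
    return hashlib.sha256((pepper + plaintext_token).encode("utf-8")).hexdigest()


def issue_api_token(pepper: str) -> Tuple[str, str]:
    """Return (plaintext, token_hash); only token_hash would be persisted."""
    plaintext = secrets.token_urlsafe(32)
    return plaintext, hash_api_token(pepper, plaintext)


def verify_api_token(pepper: str, presented: str, stored_hash: str) -> bool:
    """Re-hash the presented token and compare in constant time."""
    return hmac.compare_digest(hash_api_token(pepper, presented), stored_hash)
```

Because the pepper is part of every hash, changing `API_TOKEN_PEPPER` after keys have been issued makes every stored hash stop matching, which is why `.env.example` asks for it to stay stable.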

## Who this is for

- Teams that want centralized control of LLM access.
- Developers who need one endpoint for many providers.
- Admins who need per-user key lifecycle management and usage reporting.

## What success looks like

- Clients can switch providers without rewriting auth/routing logic.
- Admins can onboard users safely and audit usage.
- Existing OpenAI/Anthropic client compatibility remains stable.
- The proxy can run reliably in development and production with clear security expectations.
71 changes: 61 additions & 10 deletions README.md
@@ -54,20 +54,21 @@ docker run -d \
-v $(pwd)/oauth_creds:/app/oauth_creds \
-v $(pwd)/logs:/app/logs \
-v $(pwd)/usage:/app/usage \
-v $(pwd)/data:/app/data \
-e SKIP_OAUTH_INIT_CHECK=true \
ghcr.io/mirrowel/llm-api-key-proxy:latest
```

**Using Docker Compose:**

```bash
# Create your .env file and usage directory first, then:
# Create your .env file and runtime data directories first, then:
cp .env.example .env
mkdir usage
mkdir -p usage data
docker compose up -d
```

> **Important:** Create the `usage/` directory before running Docker Compose so usage stats persist on the host.
> **Important:** Create `usage/` and `data/` before running Docker Compose so usage stats and SQLite DB data persist on the host.

> **Note:** For OAuth providers, complete authentication locally first using the credential tool, then mount the `oauth_creds/` directory or export credentials to environment variables.

@@ -93,7 +94,15 @@ Once the proxy is running, configure your application with these settings:
| Setting | Value |
|---------|-------|
| **Base URL / API Endpoint** | `http://127.0.0.1:8000/v1` |
| **API Key** | Your `PROXY_API_KEY` |
| **API Key** | A user API key or your `PROXY_API_KEY`, depending on `AUTH_MODE` |

### Auth Transition Modes

- `AUTH_MODE=users`: only per-user API keys are accepted for `/v1/*`
- `AUTH_MODE=legacy`: only `PROXY_API_KEY` is accepted
- `AUTH_MODE=both`: accepts either user API keys or `PROXY_API_KEY` (recommended migration mode)

Recommended rollout: start with `AUTH_MODE=both`, move clients to user keys, then switch to `AUTH_MODE=users`.
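
Whichever key a client holds, it is sent the same way. The sketch below builds (but does not send) a request to `/v1/models` using either header style the proxy's documentation lists: `Authorization: Bearer` (OpenAI style) or `x-api-key` (Anthropic style). `build_models_request` is a hypothetical helper, and the base URL is the one from the settings table above.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:8000/v1"  # base URL from the settings table


def build_models_request(api_key: str, style: str = "openai") -> urllib.request.Request:
    """Build a GET /v1/models request carrying the key in the chosen header style."""
    if style == "openai":
        headers = {"Authorization": f"Bearer {api_key}"}
    elif style == "anthropic":
        headers = {"x-api-key": api_key}
    else:
        raise ValueError(f"unknown header style: {style!r}")
    return urllib.request.Request(f"{BASE_URL}/models", headers=headers, method="GET")
```

Against a running proxy, `urllib.request.urlopen(build_models_request(key))` would send the request; during migration the only client-side change is swapping the value of `key` from `PROXY_API_KEY` to a user API key.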

### Model Format: `provider/model_name`

@@ -234,13 +243,36 @@ print(response.content[0].text)
| `GET /v1/models` | List all available models with pricing & capabilities |
| `GET /v1/models/{model_id}` | Get details for a specific model |
| `GET /v1/providers` | List configured providers |
| `GET/POST /v1/quota-stats` | Provider quota stats (admin only) |
| `POST /v1/token-count` | Calculate token count for a payload |
| `POST /v1/cost-estimate` | Estimate cost based on token counts |
| `POST /api/admin/usage/prune` | Prune usage events older than `USAGE_RETENTION_DAYS` |

> **Tip:** The `/v1/models` endpoint is useful for discovering available models in your client. Many apps can fetch this list automatically. Add `?enriched=false` for a minimal response without pricing data.

---

## AUTH_MODE Migration (MVP)

`AUTH_MODE` controls which tokens are accepted on `/v1/*` endpoints.

| `AUTH_MODE` | Accepted token(s) | Intended phase |
|-------------|-------------------|----------------|
| `users` | User API keys created in the dashboard/API | Final state after migration |
| `legacy` | `PROXY_API_KEY` only | Backward-compatible legacy-only mode |
| `both` (default) | User API keys and `PROXY_API_KEY` | Transition window for gradual rollout |

Recommended transition path:

1. Start on `AUTH_MODE=both`
2. Bootstrap admin with `INITIAL_ADMIN_USERNAME` + `INITIAL_ADMIN_PASSWORD`
3. Create user API keys and rotate clients over to those keys
4. Switch to `AUTH_MODE=users` once all clients are migrated

Legacy compatibility: `PROXY_API_KEY` remains fully supported when `AUTH_MODE` is set to `legacy` or `both`.

---

## Managing Credentials

The proxy includes an interactive tool for managing all your API keys and OAuth credentials.
@@ -469,10 +501,27 @@ The proxy includes a powerful text-based UI for configuration and management.

| Variable | Description | Default |
|----------|-------------|---------|
| `PROXY_API_KEY` | Authentication key for your proxy | Required |
| `PROXY_API_KEY` | Legacy `/v1/*` key used in `legacy` or `both` mode | Optional |
| `AUTH_MODE` | `/v1/*` auth mode: `users`, `legacy`, `both` | `both` |
| `APP_ENV` | Runtime environment (`dev` or `prod`) | `dev` |
| `ALLOW_INSECURE_DEFAULTS` | Allow startup with default secrets | `true` in dev, `false` in prod |
| `SESSION_SECRET` | Dashboard session signing secret | Required in prod |
| `API_TOKEN_PEPPER` | Secret pepper mixed into user API token hashes | Required in prod |
| `CORS_ALLOW_ORIGINS` | Comma-separated browser origins | empty |
| `CORS_ALLOW_CREDENTIALS` | Allow credentialed CORS requests | `false` (unless origins are configured) |
| `SQLITE_BUSY_TIMEOUT_MS` | SQLite lock timeout in milliseconds | `5000` |
| `USAGE_RETENTION_DAYS` | Usage event retention window in days | `30` |
| `OAUTH_REFRESH_INTERVAL` | Token refresh check interval (seconds) | `600` |
| `SKIP_OAUTH_INIT_CHECK` | Skip interactive OAuth setup on startup | `false` |
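
As a rough illustration of what `SQLITE_BUSY_TIMEOUT_MS` and `USAGE_RETENTION_DAYS` control, the sketch below uses the standard-library `sqlite3` module rather than the SQLAlchemy/aiosqlite stack this PR adds. The `usage_events` table and its `created_at` column are hypothetical names, and timestamps are assumed to be stored as UTC ISO-8601 strings.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def open_db(path: str, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a connection that waits up to busy_timeout_ms on a locked database."""
    conn = sqlite3.connect(path)
    conn.execute(f"PRAGMA busy_timeout = {int(busy_timeout_ms)}")
    return conn


def prune_usage_events(
    conn: sqlite3.Connection,
    retention_days: int = 30,
    now: Optional[datetime] = None,
) -> int:
    """Delete events older than the retention window; return the number removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    # Hypothetical schema: usage_events(created_at TEXT, UTC ISO-8601).
    cur = conn.execute("DELETE FROM usage_events WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

A busy timeout makes a writer wait for a lock instead of failing immediately with "database is locked", which matters once several requests record usage concurrently.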

### Production Checklist

- Set `APP_ENV=prod`
- Set strong `SESSION_SECRET` and `API_TOKEN_PEPPER`
- Keep `ALLOW_INSECURE_DEFAULTS=false`
- Set explicit `CORS_ALLOW_ORIGINS` (do not use `*` with credentials)
- Keep persistent mounts for `/app/data`, `/app/logs`, `/app/oauth_creds`, and `/app/usage`
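
The checklist can be approximated as a pre-flight check. This is a sketch under stated assumptions, not the proxy's startup code: `check_startup_config` is a hypothetical function, and treating the `.env.example` placeholder strings as "default" secrets is an assumption made here for illustration.

```python
from typing import List, Mapping, Optional

# Placeholder values copied from .env.example; treating them as insecure
# defaults is an assumption of this sketch.
PLACEHOLDER_SECRETS = {
    "SESSION_SECRET": "replace-with-random-session-secret",
    "API_TOKEN_PEPPER": "replace-with-random-token-pepper",
}


def parse_bool(value: Optional[str], default: bool) -> bool:
    """Interpret common truthy strings; fall back to default when unset."""
    if value is None or not value.strip():
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")


def check_startup_config(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty means OK to start."""
    problems: List[str] = []
    app_env = env.get("APP_ENV", "dev")
    # Documented default: true in dev, false in prod.
    allow_insecure = parse_bool(env.get("ALLOW_INSECURE_DEFAULTS"), app_env != "prod")

    if not allow_insecure:
        for name, placeholder in PLACEHOLDER_SECRETS.items():
            value = env.get(name, "")
            if not value or value == placeholder:
                problems.append(f"{name} is missing or still a placeholder")

    origins = [o.strip() for o in env.get("CORS_ALLOW_ORIGINS", "").split(",") if o.strip()]
    if "*" in origins and parse_bool(env.get("CORS_ALLOW_CREDENTIALS"), False):
        problems.append("CORS_ALLOW_ORIGINS must not be '*' when credentials are allowed")

    return problems
```

Calling `check_startup_config(os.environ)` before starting the server and aborting on a non-empty result would mirror the "refuses startup" behavior described for `APP_ENV=prod`.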

### Per-Provider Settings

| Pattern | Description | Example |
@@ -859,8 +908,8 @@ The proxy is available as a multi-architecture Docker image (amd64/arm64) from G
cp .env.example .env
nano .env

# 2. Create usage directory (usage_*.json files are created automatically)
mkdir usage
# 2. Create runtime data directories
mkdir -p usage data

# 3. Start the proxy
docker compose up -d
@@ -869,13 +918,13 @@ docker compose up -d
docker compose logs -f
```

> **Important:** Create the `usage/` directory before running Docker Compose so usage stats persist on the host.
> **Important:** Create `usage/` and `data/` before running Docker Compose so usage stats and SQLite data persist on the host.

**Manual Docker Run:**

```bash
# Create usage directory if it doesn't exist
mkdir usage
# Create runtime data directories if they don't exist
mkdir -p usage data

docker run -d \
--name llm-api-proxy \
@@ -885,6 +934,7 @@ docker run -d \
-v $(pwd)/oauth_creds:/app/oauth_creds \
-v $(pwd)/logs:/app/logs \
-v $(pwd)/usage:/app/usage \
-v $(pwd)/data:/app/data \
-e SKIP_OAUTH_INIT_CHECK=true \
-e PYTHONUNBUFFERED=1 \
ghcr.io/mirrowel/llm-api-key-proxy:latest
@@ -905,6 +955,7 @@ docker compose -f docker-compose.dev.yml up -d --build
| `oauth_creds/` | OAuth credential files (persistent) |
| `logs/` | Request logs and detailed logging |
| `usage/` | Usage statistics persistence (`usage_*.json`) |
| `data/` | SQLite database persistence (`data/proxy.db`) |

**Image Tags:**

2 changes: 2 additions & 0 deletions docker-compose.dev.yml
@@ -21,6 +21,8 @@ services:
- ./logs:/app/logs
# Mount usage directory for usage statistics persistence
- ./usage:/app/usage
# Mount data directory for sqlite persistence (data/proxy.db)
- ./data:/app/data
# Optionally mount additional .env files (e.g., combined credential files)
# - ./antigravity_all_combined.env:/app/antigravity_all_combined.env:ro
environment:
9 changes: 9 additions & 0 deletions requirements.txt
@@ -1,5 +1,7 @@
# FastAPI framework for building the proxy server
fastapi
jinja2
python-multipart
# ASGI server for running the FastAPI application
uvicorn
# For loading environment variables from a .env file
@@ -20,8 +22,15 @@ colorlog

rich

sqlalchemy
aiosqlite

# GUI for model filter configuration
customtkinter

# For building the executable
pyinstaller

# Test dependencies
pytest
pytest-asyncio
17 changes: 17 additions & 0 deletions src/proxy_app/anthropic_errors.py
@@ -0,0 +1,17 @@
from fastapi.responses import JSONResponse


def anthropic_error_response(
    *,
    status_code: int,
    error_type: str,
    message: str,
) -> JSONResponse:
    payload = {
        "type": "error",
        "error": {
            "type": error_type,
            "message": message,
        },
    }
    return JSONResponse(status_code=status_code, content=payload)
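
For reference, the JSON body this helper produces can be reproduced without FastAPI. The sketch below is a stdlib-only stand-in: `anthropic_error_payload` is a hypothetical function that mirrors the dict built above, and the `authentication_error` type and message are illustrative values chosen here, not taken from this PR.

```python
import json


def anthropic_error_payload(error_type: str, message: str) -> dict:
    """Mirror the payload dict built by anthropic_error_response."""
    return {
        "type": "error",
        "error": {
            "type": error_type,
            "message": message,
        },
    }


# Illustrative values: what a client might receive alongside a 401 status.
body = json.dumps(anthropic_error_payload("authentication_error", "Invalid API key"))
```

Keeping the envelope (`"type": "error"` plus a nested `error` object) in one helper means every Anthropic-compatible endpoint can return failures in the shape Anthropic clients already parse.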