Go to https://build.nvidia.com/explore/discover, register an account, and generate an API key. NVIDIA hosts several models, including moonshotai/kimi-k2.5, z-ai/glm4.7, and minimaxai/minimax-m2.1. Then configure `config.json` and run the program, ensuring it listens on port 3001.
Expose POST /v1/messages (Anthropic/Claude style), convert to OpenAI Chat Completions, and proxy to NVIDIA (configured via config.json).
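The request conversion can be sketched as follows. This is an assumed, minimal mapping using the public Anthropic and OpenAI request schemas (string-only content, no tool use); the type names and fields are illustrative, not the project's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Anthropic-style request (subset of the public schema).
type anthropicMsg struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type anthropicReq struct {
	Model     string         `json:"model"`
	MaxTokens int            `json:"max_tokens"`
	System    string         `json:"system,omitempty"`
	Messages  []anthropicMsg `json:"messages"`
}

// OpenAI Chat Completions request (subset).
type openAIMsg struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type openAIReq struct {
	Model     string      `json:"model"`
	MaxTokens int         `json:"max_tokens"`
	Messages  []openAIMsg `json:"messages"`
}

func convert(a anthropicReq) openAIReq {
	out := openAIReq{Model: a.Model, MaxTokens: a.MaxTokens}
	// Anthropic carries the system prompt in a top-level field;
	// OpenAI expects it as the first chat message.
	if a.System != "" {
		out.Messages = append(out.Messages, openAIMsg{Role: "system", Content: a.System})
	}
	for _, m := range a.Messages {
		out.Messages = append(out.Messages, openAIMsg{Role: m.Role, Content: m.Content})
	}
	return out
}

func main() {
	in := []byte(`{"model":"z-ai/glm4.7","max_tokens":256,"system":"be brief","messages":[{"role":"user","content":"hello"}]}`)
	var a anthropicReq
	if err := json.Unmarshal(in, &a); err != nil {
		panic(err)
	}
	b, _ := json.Marshal(convert(a))
	fmt.Println(string(b))
}
```

The real proxy must additionally translate tool-use blocks and streaming deltas, which this sketch omits.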
Edit config.json:
- `nvidia_url` (default `https://integrate.api.nvidia.com/v1/chat/completions`)
- `nvidia_key` (required): used for upstream auth, sent as `Authorization: Bearer ...`
Do not commit your real nvidia_key.
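A minimal `config.json`, assuming only the two documented keys, might look like this (the key value is a placeholder):

```json
{
  "nvidia_url": "https://integrate.api.nvidia.com/v1/chat/completions",
  "nvidia_key": "nvapi-REPLACE_ME"
}
```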
- `CONFIG_PATH` (default `config.json`, relative to `go/`)
- `PROVIDER_API_KEY` (optional): overrides `nvidia_key` from config
- `UPSTREAM_URL` (optional): overrides `nvidia_url` from config
- `SERVER_API_KEY` (optional): enables inbound auth; accepts `Authorization: Bearer ...` or `x-api-key: ...`
- `ADDR` (default `:3001`)
- `UPSTREAM_TIMEOUT_SECONDS` (default `300`)
- `LOG_BODY_MAX_CHARS` (default `4096`; `0` disables body logging)
- `LOG_STREAM_TEXT_PREVIEW_CHARS` (default `256`; `0` disables stream preview logging)
go run .

Use the moonshotai/kimi-k2.5 model:
export ANTHROPIC_BASE_URL=http://localhost:3001
export ANTHROPIC_AUTH_TOKEN=nvapi-api-key
export ANTHROPIC_DEFAULT_HAIKU_MODEL=moonshotai/kimi-k2.5
export ANTHROPIC_DEFAULT_SONNET_MODEL=moonshotai/kimi-k2.5
export ANTHROPIC_DEFAULT_OPUS_MODEL=moonshotai/kimi-k2.5
Use the z-ai/glm4.7 model:
export ANTHROPIC_BASE_URL=http://localhost:3001
export ANTHROPIC_AUTH_TOKEN=nvapi-api-key
export ANTHROPIC_DEFAULT_HAIKU_MODEL=z-ai/glm4.7
export ANTHROPIC_DEFAULT_SONNET_MODEL=z-ai/glm4.7
export ANTHROPIC_DEFAULT_OPUS_MODEL=z-ai/glm4.7
claude

Use the minimaxai/minimax-m2.1 model:
export ANTHROPIC_BASE_URL=http://localhost:3001
export ANTHROPIC_AUTH_TOKEN=nvapi-api-key
export ANTHROPIC_DEFAULT_HAIKU_MODEL=minimaxai/minimax-m2.1
export ANTHROPIC_DEFAULT_SONNET_MODEL=minimaxai/minimax-m2.1
export ANTHROPIC_DEFAULT_OPUS_MODEL=minimaxai/minimax-m2.1
claude

- Inbound auth:
  - If `SERVER_API_KEY` is set, you must send `Authorization: Bearer <SERVER_API_KEY>` (or `x-api-key: <SERVER_API_KEY>`).
  - If `SERVER_API_KEY` is not set, inbound requests are accepted without authentication.
- Upstream auth:
  - Always sends `Authorization: Bearer <nvidia_key>` to NVIDIA.
Example (non-stream):
curl -sS http://127.0.0.1:3001/v1/messages \
-H 'Content-Type: application/json' \
-d '{
"model":"z-ai/glm4.7",
"max_tokens":256,
"messages":[{"role":"user","content":"hello"}]
}'

Example (stream):
curl -N http://127.0.0.1:3001/v1/messages \
-H 'Content-Type: application/json' \
-d '{
"model":"z-ai/glm4.7",
"max_tokens":256,
"stream":true,
"messages":[{"role":"user","content":"hello"}]
}'

This project uses only the Go standard library (no external dependencies). If your environment blocks the default Go build cache path, set:
export GOCACHE=/tmp/go-build-cache
export GOMODCACHE=/tmp/gomodcache

Linux (amd64):
mkdir -p dist
GOOS=linux GOARCH=amd64 go build -trimpath -ldflags "-s -w" -o dist/claude-nvidia-proxy_linux_amd64 .

Windows (amd64):
mkdir -p dist
GOOS=windows GOARCH=amd64 go build -trimpath -ldflags "-s -w" -o dist/claude-nvidia-proxy_windows_amd64.exe .

macOS:
# Build directly on a macOS machine
go build -trimpath -ldflags "-s -w" -o claude-nvidia-proxy .
# Build a macOS universal binary
# Build each architecture
GOOS=darwin GOARCH=amd64 go build -trimpath -ldflags "-s -w" -o dist/claude-nvidia-proxy_x86_64 .
GOOS=darwin GOARCH=arm64 go build -trimpath -ldflags "-s -w" -o dist/claude-nvidia-proxy_arm64 .
# Merge into a single universal binary
lipo -create dist/claude-nvidia-proxy_x86_64 dist/claude-nvidia-proxy_arm64 -output dist/claude-nvidia-proxy_universal

- Copy the binary to /usr/local/bin
sudo cp claude-nvidia-proxy /usr/local/bin/
sudo chmod +x /usr/local/bin/claude-nvidia-proxy

- Test that the binary runs correctly
CONFIG_PATH=/path/to/config.json /usr/local/bin/claude-nvidia-proxy

- Create the LaunchAgent plist
mkdir -p ~/Library/LaunchAgents
vi ~/Library/LaunchAgents/com.xxx.claude-nvidia-proxy.plist

- Complete plist example (usable as-is; change the config file path in the environment variables to the correct path on your machine)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<!-- Unique identifier -->
<key>Label</key>
<string>com.xxx.claude-nvidia-proxy</string>
<!-- Program and arguments -->
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/claude-nvidia-proxy</string>
</array>
<!-- Environment variables -->
<key>EnvironmentVariables</key>
<dict>
<key>CONFIG_PATH</key>
<string>/absolute/path/to/config.json</string>
</dict>
<!-- Start immediately after login -->
<key>RunAtLoad</key>
<true/>
<!-- Restart automatically after a crash or exit -->
<key>KeepAlive</key>
<true/>
<!-- Working directory (optional, but strongly recommended) -->
<key>WorkingDirectory</key>
<string>/usr/local/bin</string>
<!-- Logs -->
<key>StandardOutPath</key>
<string>/tmp/claude-nvidia-proxy.out.log</string>
<key>StandardErrorPath</key>
<string>/tmp/claude-nvidia-proxy.err.log</string>
</dict>
</plist>

- Load and start the service
launchctl load ~/Library/LaunchAgents/com.xxx.claude-nvidia-proxy.plist
# Start immediately (no need to log in again)
launchctl start com.xxx.claude-nvidia-proxy
# Check whether it is running
launchctl list | grep claude
# Stop
launchctl stop com.xxx.claude-nvidia-proxy
# Unload (disable auto-start)
launchctl unload ~/Library/LaunchAgents/com.xxx.claude-nvidia-proxy.plist
# Reload after changing the configuration
launchctl unload ~/Library/LaunchAgents/com.xxx.claude-nvidia-proxy.plist
launchctl load ~/Library/LaunchAgents/com.xxx.claude-nvidia-proxy.plist

chmod +x macos-install.sh
sudo ./macos-install.sh

- Streaming conversion supports `delta.content` text and `delta.tool_calls` tool-use blocks; other Anthropic blocks are not fully implemented.
- Logs show forwarded request bodies; keep `LOG_BODY_MAX_CHARS` small and avoid secrets in prompts.
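The streaming direction of the conversion can be sketched as below: an OpenAI `delta.content` fragment becomes an Anthropic `content_block_delta` SSE event. Event and field names follow the public Anthropic Messages stream format; this is an illustrative sketch, not the project's internals:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Anthropic text_delta payload.
type textDelta struct {
	Type string `json:"type"` // "text_delta"
	Text string `json:"text"`
}

// Anthropic content_block_delta event body.
type blockDelta struct {
	Type  string    `json:"type"`  // "content_block_delta"
	Index int       `json:"index"` // which content block this delta extends
	Delta textDelta `json:"delta"`
}

// toAnthropicEvent wraps one OpenAI delta.content fragment in the SSE
// framing used by the Anthropic Messages stream.
func toAnthropicEvent(content string, index int) string {
	ev := blockDelta{
		Type:  "content_block_delta",
		Index: index,
		Delta: textDelta{Type: "text_delta", Text: content},
	}
	b, _ := json.Marshal(ev)
	return fmt.Sprintf("event: content_block_delta\ndata: %s\n\n", b)
}

func main() {
	fmt.Print(toAnthropicEvent("hel", 0))
}
```

Tool-use deltas (`delta.tool_calls`) would map to `input_json_delta` payloads in the same event shape, which this sketch omits.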