Merged
72 changes: 72 additions & 0 deletions .github/workflows/deploy-chat-backend.yml
@@ -0,0 +1,72 @@
name: deploy-chat-backend
run-name: Deploy chat backend to Cloud Run

on:
  push:
    branches:
      - main
    paths:
      - 'chat-backend/**'
  workflow_dispatch:

env:
  PROJECT_ID: chipflow-platform
  REGION: us-central1
  SERVICE_NAME: chipflow-docs-chat

jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write  # Required for Workload Identity Federation

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Authenticate to Google Cloud
        id: auth
        uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ secrets.GCP_WORKLOAD_IDENTITY_PROVIDER }}
          service_account: ${{ secrets.GCP_SERVICE_ACCOUNT }}

      - name: Set up Cloud SDK
        uses: google-github-actions/setup-gcloud@v2

      - name: Configure Docker for GCR
        run: gcloud auth configure-docker --quiet

      - name: Build and push container
        working-directory: chat-backend
        run: |
          docker build -t gcr.io/$PROJECT_ID/$SERVICE_NAME:${{ github.sha }} \
            -t gcr.io/$PROJECT_ID/$SERVICE_NAME:latest .
          docker push gcr.io/$PROJECT_ID/$SERVICE_NAME:${{ github.sha }}
          docker push gcr.io/$PROJECT_ID/$SERVICE_NAME:latest

      - name: Deploy to Cloud Run
        run: |
          gcloud run deploy $SERVICE_NAME \
            --image gcr.io/$PROJECT_ID/$SERVICE_NAME:${{ github.sha }} \
            --region $REGION \
            --platform managed \
            --allow-unauthenticated \
            --memory 1Gi \
            --cpu 1 \
            --min-instances 0 \
            --max-instances 3 \
            --set-env-vars "DOCS_URL=https://docs.chipflow.io/llms-full.txt,GCP_PROJECT=$PROJECT_ID,GCP_LOCATION=$REGION"

      - name: Show service URL
        run: |
          URL=$(gcloud run services describe $SERVICE_NAME --region $REGION --format='value(status.url)')
          echo "## Deployed to Cloud Run" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "Service URL: $URL" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "Update \`chat-widget.js\` with:" >> $GITHUB_STEP_SUMMARY
          echo "\`\`\`javascript" >> $GITHUB_STEP_SUMMARY
          echo "apiUrl: '${URL}/api/chat'" >> $GITHUB_STEP_SUMMARY
          echo "\`\`\`" >> $GITHUB_STEP_SUMMARY
1 change: 1 addition & 0 deletions .gitignore
@@ -10,6 +10,7 @@ docs/source/*
!docs/source/support.rst
!docs/source/platform-api.rst
!docs/source/tutorial-intro-chipflow-platform.rst
!docs/source/_static/

# Misc
log
27 changes: 27 additions & 0 deletions chat-backend/Dockerfile
@@ -0,0 +1,27 @@
# Use official Python runtime as base image
FROM python:3.12-slim

# Set working directory
WORKDIR /app

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PORT=8080

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY main.py .

# Create non-root user for security
RUN useradd --create-home appuser && chown -R appuser:appuser /app
USER appuser

# Expose port
EXPOSE 8080

# Run the application
CMD ["python", "main.py"]
180 changes: 180 additions & 0 deletions chat-backend/README.md
@@ -0,0 +1,180 @@
# ChipFlow Docs Chat Backend

A FastAPI backend that provides AI-powered Q&A for ChipFlow documentation using Vertex AI.

## Architecture

- **FastAPI** - Web framework with async support
- **Vertex AI** - Embeddings (text-embedding-005) and LLM (Gemini 1.5 Flash)
- **In-memory RAG** - Simple vector search using numpy
- **Cloud Run** - Serverless deployment

## How it Works

1. On startup, fetches `llms-full.txt` from the docs site
2. Chunks the documentation into overlapping segments
3. Generates embeddings for each chunk using Vertex AI
4. When a question arrives:
   - Generates query embedding
   - Finds most similar chunks via cosine similarity
   - Sends relevant context + question to Gemini
   - Returns the response
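
The chunking and retrieval steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the function names, the character-based window, and the `size`/`overlap` defaults are assumptions, and plain Python is used in place of numpy so the snippet has no dependencies. The embedding vectors are taken as given, since producing them requires a Vertex AI call.

```python
import math


def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into character windows of `size` that overlap by `overlap`.

    The window size and overlap are illustrative assumptions.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k chunk vectors most similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in chunk_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

The indices returned by `top_k` would select the chunks whose text is placed in the prompt as context.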

## Local Development

### Prerequisites

- Python 3.12+
- Google Cloud SDK with authentication configured
- Access to a GCP project with Vertex AI enabled

### Setup

```bash
cd chat-backend

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export GCP_PROJECT=your-project-id
export GCP_LOCATION=us-central1
export DOCS_URL=https://docs.chipflow.io/llms-full.txt

# Run locally
python main.py
```

The server starts at http://localhost:8080.

### Test the API

```bash
# Health check
curl http://localhost:8080/health

# Ask a question
curl -X POST http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "What is Amaranth?"}'
```
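
The same request can be issued from Python using only the standard library. This is a sketch under assumptions: the helper names `build_chat_request` and `ask` are hypothetical, and the base URL simply mirrors the local address used in the `curl` example.

```python
import json
import urllib.request


def build_chat_request(
    question: str, base_url: str = "http://localhost:8080"
) -> urllib.request.Request:
    """Build the POST request that the curl example sends to /api/chat."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:8080", timeout: float = 30.0) -> dict:
    """Send the question to a running server and return the decoded JSON reply."""
    request = build_chat_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running locally, `ask("What is Amaranth?")` should return the decoded JSON body of the reply.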

## Deployment to Cloud Run

### Prerequisites

1. GCP project with billing enabled
2. Enable the required APIs:
   ```bash
   gcloud services enable \
     cloudbuild.googleapis.com \
     run.googleapis.com \
     aiplatform.googleapis.com \
     containerregistry.googleapis.com
   ```

3. Grant the Cloud Run service account Vertex AI access (the commands assume `PROJECT_ID` is set in your shell):
   ```bash
   PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format='value(projectNumber)')
   gcloud projects add-iam-policy-binding $PROJECT_ID \
     --member="serviceAccount:$PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
     --role="roles/aiplatform.user"
   ```

### Deploy with Cloud Build

```bash
cd chat-backend

# Deploy using Cloud Build
gcloud builds submit --config cloudbuild.yaml

# Or manually build and deploy
gcloud builds submit --tag gcr.io/$PROJECT_ID/chipflow-docs-chat
gcloud run deploy chipflow-docs-chat \
  --image gcr.io/$PROJECT_ID/chipflow-docs-chat \
  --region us-central1 \
  --platform managed \
  --allow-unauthenticated \
  --memory 1Gi \
  --set-env-vars "DOCS_URL=https://docs.chipflow.io/llms-full.txt,GCP_PROJECT=$PROJECT_ID,GCP_LOCATION=us-central1"
```

### After Deployment

1. Get the Cloud Run URL:
   ```bash
   gcloud run services describe chipflow-docs-chat --region us-central1 --format='value(status.url)'
   ```

2. Update the chat widget in `docs/source/_static/js/chat-widget.js`:
   ```javascript
   const CONFIG = {
     apiUrl: 'https://chipflow-docs-chat-xxxxx.a.run.app/api/chat',
     // ...
   };
   ```

## Configuration

Environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `DOCS_URL` | `https://docs.chipflow.io/llms-full.txt` | URL to fetch documentation |
| `GCP_PROJECT` | `chipflow-docs` | Google Cloud project ID |
| `GCP_LOCATION` | `us-central1` | Vertex AI region |
| `PORT` | `8080` | Server port |
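
As an illustration of how these variables and defaults could be read at startup, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names and are not taken from `main.py`; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    docs_url: str
    gcp_project: str
    gcp_location: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        docs_url=env.get("DOCS_URL", "https://docs.chipflow.io/llms-full.txt"),
        gcp_project=env.get("GCP_PROJECT", "chipflow-docs"),
        gcp_location=env.get("GCP_LOCATION", "us-central1"),
        port=int(env.get("PORT", "8080")),  # environment values arrive as strings
    )
```

Passing a plain dict instead of `os.environ` (for example `load_settings({"PORT": "9000"})`) makes the defaults easy to check without touching the real environment.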

## Cost Estimation

Based on ~50 users with infrequent queries (~100 queries/day):

- **Vertex AI Embeddings**: ~$0.01/1000 queries
- **Gemini 1.5 Flash**: ~$0.075/1M input tokens, $0.30/1M output tokens
- **Cloud Run**: Pay-per-use, scales to zero when idle

Estimated monthly cost: **$5-20** (well under the $100 budget)
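
The Vertex AI part of this estimate can be reproduced with a few lines of arithmetic. The per-unit prices are the ones listed above; the token counts per query (4,000 input, 500 output) and the 30-day month are assumptions chosen only for illustration, and Cloud Run compute is not modelled at all.

```python
def monthly_vertex_cost(
    queries_per_day: int = 100,
    days: int = 30,
    input_tokens_per_query: int = 4000,  # assumed: question plus retrieved context
    output_tokens_per_query: int = 500,  # assumed: length of a typical answer
) -> float:
    """Rough monthly Vertex AI cost in USD at the prices listed above."""
    queries = queries_per_day * days
    embeddings = queries / 1000 * 0.01  # ~$0.01 per 1000 queries
    gemini_input = queries * input_tokens_per_query / 1_000_000 * 0.075
    gemini_output = queries * output_tokens_per_query / 1_000_000 * 0.30
    return embeddings + gemini_input + gemini_output


print(f"Vertex AI: ~${monthly_vertex_cost():.2f} per month")
```

Under these assumptions the Vertex AI line items come to roughly $1.38 per month, so most of the estimate's headroom would be taken by Cloud Run usage and by longer prompts than assumed here.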

## API Reference

### `GET /health`

Health check endpoint.

**Response:**
```json
{
  "status": "healthy",
  "initialized": true,
  "chunks": 150
}
```

### `POST /api/chat`

Ask a question about the documentation.

**Request:**
```json
{
  "question": "How do I create an Amaranth module?",
  "conversation_history": [
    {"role": "user", "content": "previous question"},
    {"role": "assistant", "content": "previous answer"}
  ],
  "page": "/amaranth/guide/basics.html"
}
```

**Response:**
```json
{
  "answer": "To create an Amaranth module...",
  "sources": ["Getting Started", "Module Basics"]
}
```
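
A client could assemble the request body and decode the reply along these lines. This is a sketch: `build_payload`, `parse_response`, and `ChatResponse` are hypothetical helpers, and treating `conversation_history` and `page` as optional is an assumption based on the `curl` example, which sends only `question`.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatResponse:
    answer: str
    sources: list[str] = field(default_factory=list)


def build_payload(
    question: str,
    history: Optional[list[tuple[str, str]]] = None,
    page: Optional[str] = None,
) -> dict:
    """Build the JSON body for POST /api/chat from (role, content) pairs."""
    payload: dict = {"question": question}
    if history:
        payload["conversation_history"] = [
            {"role": role, "content": content} for role, content in history
        ]
    if page:
        payload["page"] = page
    return payload


def parse_response(raw: str) -> ChatResponse:
    """Decode a response body shaped like the example above."""
    data = json.loads(raw)
    return ChatResponse(answer=data["answer"], sources=list(data.get("sources", [])))
```

Keeping the history as simple `(role, content)` pairs lets the caller append each exchange and resend it with the next question.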
54 changes: 54 additions & 0 deletions chat-backend/cloudbuild.yaml
@@ -0,0 +1,54 @@
# Cloud Build configuration for deploying the chat backend to Cloud Run
steps:
  # Build the container image
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - 'gcr.io/$PROJECT_ID/chipflow-docs-chat:$COMMIT_SHA'
      - '-t'
      - 'gcr.io/$PROJECT_ID/chipflow-docs-chat:latest'
      - '.'

  # Push the container image to Container Registry
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'push'
      - 'gcr.io/$PROJECT_ID/chipflow-docs-chat:$COMMIT_SHA'

  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'push'
      - 'gcr.io/$PROJECT_ID/chipflow-docs-chat:latest'

  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'chipflow-docs-chat'
      - '--image'
      - 'gcr.io/$PROJECT_ID/chipflow-docs-chat:$COMMIT_SHA'
      - '--region'
      - 'us-central1'
      - '--platform'
      - 'managed'
      - '--allow-unauthenticated'
      - '--memory'
      - '1Gi'
      - '--cpu'
      - '1'
      - '--min-instances'
      - '0'
      - '--max-instances'
      - '3'
      - '--set-env-vars'
      - 'DOCS_URL=https://docs.chipflow.io/llms-full.txt,GCP_PROJECT=$PROJECT_ID,GCP_LOCATION=us-central1'

images:
  - 'gcr.io/$PROJECT_ID/chipflow-docs-chat:$COMMIT_SHA'
  - 'gcr.io/$PROJECT_ID/chipflow-docs-chat:latest'

options:
  logging: CLOUD_LOGGING_ONLY