Skip to content

Initial project setup#1

Merged
jreakin merged 1 commit intomainfrom
initial-setup
Jan 29, 2026
Merged

Initial project setup#1
jreakin merged 1 commit intomainfrom
initial-setup

Conversation

@jreakin
Copy link
Member

@jreakin jreakin commented Jan 28, 2026

TL;DR

Initial project setup for a Facebook Messenger AI bot that answers questions based on scraped website content using GitHub Copilot SDK.

What changed?

This PR establishes the foundational structure for the Facebook Messenger AI bot project:

  • Created project configuration files (pyproject.toml, .env.example, .gitignore)
  • Set up FastAPI application structure with webhook endpoints for Facebook Messenger
  • Implemented core services:
    • Website scraper for content extraction
    • Copilot SDK wrapper with OpenAI fallback
    • Reference document builder for content synthesis
    • Agent service for message processing
    • Facebook service for sending responses
  • Added Supabase database integration with repository layer
  • Created CLI setup tool for bot configuration
  • Added comprehensive documentation (architecture, guardrails, project structure)
  • Defined database schema with initial migration

How to test?

  1. Clone the repository
  2. Copy .env.example to .env and fill in your credentials
  3. Install dependencies: uv sync
  4. Run the CLI setup: uv run python -m src.cli.setup_cli setup
  5. Start the server: uv run uvicorn src.main:app --reload
  6. Test webhook verification with: curl "http://localhost:8000/webhook?hub.mode=subscribe&hub.verify_token=your_token&hub.challenge=challenge"

Why make this change?

This project creates a production-ready Facebook Messenger bot that can answer questions about a website using AI. The bot scrapes website content, synthesizes it into a reference document using GitHub Copilot SDK, and uses this knowledge to respond to user messages. This implementation provides a scalable foundation with proper error handling, fallback mechanisms, and a clean architecture.

Summary by CodeRabbit

Release Notes

  • New Features

    • Facebook Messenger AI Bot is now available with webhook support for receiving and responding to messages
    • Interactive CLI setup tool guides users through bot configuration and website integration
    • Health check endpoint for monitoring service availability
  • Documentation

    • Comprehensive architecture, project structure, and development guidelines added
    • Security guardrails and best practices documentation included
  • Chores

    • Initial project configuration and dependencies established
    • Database schema and migrations configured
    • Deployment setup for Railway platform

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 28, 2026

Walkthrough

This PR establishes a complete Facebook Messenger AI Bot project, introducing configuration files, database migrations, Pydantic data models, FastAPI endpoints, CLI setup workflow, and service layer components for agent orchestration, Copilot SDK integration, Facebook Graph API communication, and website content extraction.

Changes

Cohort / File(s) Summary
Configuration & Deployment
.env.example, .gitignore, .python-version, pyproject.toml, railway.toml, src/config.py
Environment template with secrets placeholders; ignore patterns for Python, build, venv, caches, and IDE files; Python 3.12.8 version spec; project metadata and dependency declaration (FastAPI, Pydantic, Supabase, Typer, etc.); Railway build (NIXPACKS) and deployment config (health checks, restart policy); Pydantic BaseSettings with environment-based config loading and cached getter.
Documentation
AGENTS.md, ARCHITECTURE.md, GUARDRAILS.md, PROJECT_STRUCTURE.md
Comprehensive guides covering toolchain, architecture (single-agent Copilot design), file structure with examples; production-ready system overview with data/setup flows, tool registry, error recovery; safety boundaries (input validation, risk classifications, escalation policies, incident response); complete directory tree and module responsibilities.
Database Layer
migrations/001_initial.sql, src/db/client.py, src/db/repository.py
PostgreSQL migration creating bot_configurations, reference_documents, message_history tables with foreign keys, indexes, and auto-update trigger; Supabase client factory; repository functions for bot config CRUD, reference doc management, message history persistence.
API & Application
src/api/health.py, src/api/webhook.py, src/api/setup.py, src/main.py
Health check endpoint (/health); Facebook webhook verification (GET) and message handling (POST, TODO-outlined); placeholder setup module; FastAPI app with lifecycle management, CORS, router registration, root endpoint.
Data Models
src/models/agent_models.py, src/models/config_models.py, src/models/messenger.py
Agent context and response schemas; bot/Facebook configuration structures with UUID defaults; incoming Messenger webhook payload and entry models.
Service Layer
src/services/agent_service.py, src/services/copilot_service.py, src/services/facebook_service.py, src/services/reference_doc.py, src/services/scraper.py
Agent orchestrator integrating Copilot with escalation logic; Copilot SDK wrapper with health checks, reference synthesis, and OpenAI fallback (stub); Facebook Graph API message sender; reference document builder with SHA-256 hashing; website scraper extracting and chunking text.
CLI Setup
src/cli/setup_cli.py
Interactive Typer-based command guiding end-to-end bot setup: scrape website, synthesize reference doc, collect tone/Facebook config, persist to database, print webhook URL.
Package Initialization
src/__init__.py, src/api/__init__.py, src/cli/__init__.py, src/db/__init__.py, src/models/__init__.py, src/services/__init__.py, main.py (root)
Module docstrings and package markers; root-level main function with greeting and entry-point guard.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as setup_cli.py
    participant Scraper
    participant CopilotSvc as copilot_service
    participant RefDoc as reference_doc.py
    participant DB as repository.py
    participant Config as config.py

    User->>CLI: Run setup command
    CLI->>Config: get_settings()
    CLI->>Scraper: scrape_website(url)
    Scraper-->>CLI: text_chunks[]
    CLI->>CopilotSvc: synthesize_reference(url, chunks)
    CopilotSvc-->>CLI: markdown_content
    CLI->>RefDoc: build_reference_doc(copilot, url, chunks)
    RefDoc->>CopilotSvc: synthesize_reference(...)
    RefDoc-->>CLI: (content, hash)
    CLI->>DB: create_reference_document(content, url, hash)
    DB-->>CLI: doc_id
    User->>CLI: Provide tone & Facebook config
    CLI->>DB: create_bot_configuration(page_id, website_url, doc_id, tone, tokens)
    DB-->>CLI: BotConfiguration
    CLI-->>User: Print webhook URL & next steps
Loading
sequenceDiagram
    participant FB as Facebook
    participant Webhook as webhook.py
    participant API as main.py
    participant Agent as agent_service.py
    participant Copilot as copilot_service
    participant FBService as facebook_service
    participant DB as repository.py

    FB->>Webhook: POST /webhook (message payload)
    Webhook->>API: Parse & extract message/sender
    API->>DB: get_bot_configuration_by_page_id(page_id)
    DB-->>API: bot_config + reference_doc
    API->>Agent: respond(context, user_message)
    Agent->>Copilot: chat(system_prompt, messages)
    Copilot-->>Agent: response_text
    Agent-->>API: AgentResponse{message, confidence, escalation}
    API->>DB: save_message_history(...)
    API->>FBService: send_message(token, recipient_id, text)
    FBService->>FB: POST /graph.facebook.com/v18.0/me/messages
    FB-->>FBService: 200 OK
    FBService-->>API: success
    API-->>Webhook: 200 OK
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A bot is born, with whiskers and cheer,
Scraping and thinking, the messenger's sphere,
From setup to Copilot, from database to chat,
Facebook will wonder: "How'd we become friends like that?"
Now escalate wisely, and let wisdom shine—
Our little rabbit built something quite fine!


Note

🎁 Summarized by CodeRabbit Free

Your organization is on the Free plan. CodeRabbit will generate a high-level summary and a walkthrough for each pull request. For a comprehensive line-by-line review, please upgrade your subscription to CodeRabbit Pro by visiting https://app.coderabbit.ai/login.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Member Author

jreakin commented Jan 28, 2026

@jreakin jreakin marked this pull request as ready for review January 28, 2026 03:39
Copilot AI review requested due to automatic review settings January 28, 2026 03:39
"""Fallback to OpenAI when Copilot is unavailable."""
# TODO: Implement OpenAI API call
# This requires openai package and API key from settings
raise NotImplementedError("OpenAI fallback not yet implemented")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The _fallback_to_openai function raises a NotImplementedError. This will cause a crash whenever the Copilot service is unavailable, disabled, or returns an error.
Severity: HIGH

Suggested Fix

Implement the OpenAI API call within the _fallback_to_openai function. This involves using the openai package, retrieving the openai_api_key from settings, and making the appropriate chat completion request to the OpenAI API, returning the content of the response.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: src/services/copilot_service.py#L118

Potential issue: The `_fallback_to_openai` method in `copilot_service.py` is intended to
provide a recovery mechanism when the Copilot service is unavailable. However, it
currently raises a `NotImplementedError`. This fallback is invoked when
`copilot_enabled` is `False`, when `is_available()` returns `False`, or when the Copilot
API call fails. In any of these scenarios, which are anticipated production conditions,
the application will crash, rendering the service non-functional. The feature is
documented and the configuration supports it, but the implementation is incomplete.

Did we get this right? 👍 / 👎 to inform future reviews.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Initial project scaffolding for a FastAPI-based Facebook Messenger bot that scrapes a website, synthesizes a reference document via a Copilot SDK wrapper (with planned fallback), and stores configuration/history in Supabase.

Changes:

  • Added FastAPI app + webhook/health endpoints and deployment config (Railway).
  • Implemented initial service layer (scraper, Copilot wrapper, reference doc builder, agent response service, Facebook send API helper).
  • Added Supabase client/repository plus initial DB migration and setup CLI, along with architecture/guardrails docs.

Reviewed changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
src/services/scraper.py Adds async website fetch + text extraction and chunking.
src/services/reference_doc.py Builds synthesized markdown reference doc + content hash.
src/services/facebook_service.py Adds async helper to send messages via Facebook Graph API.
src/services/copilot_service.py Introduces Copilot SDK wrapper with availability check + planned fallback.
src/services/agent_service.py Adds agent service building prompts and producing AgentResponse.
src/services/init.py Declares services package.
src/models/messenger.py Adds Pydantic models for webhook payload/message shapes.
src/models/config_models.py Adds configuration models for website/tone/Facebook/bot config.
src/models/agent_models.py Adds agent context/response Pydantic models.
src/models/init.py Declares models package.
src/main.py Creates FastAPI app, lifespan init, CORS, and router registration.
src/db/repository.py Adds Supabase repository functions for configs/docs/history.
src/db/client.py Adds Supabase client initialization.
src/db/init.py Declares db package.
src/config.py Adds Pydantic settings for env-based configuration.
src/cli/setup_cli.py Adds Typer CLI for interactive setup + persistence.
src/cli/init.py Declares CLI package.
src/api/webhook.py Adds webhook verification and placeholder POST handler.
src/api/setup.py Adds placeholder setup API module.
src/api/health.py Adds /health endpoint.
src/api/init.py Declares API package.
src/init.py Declares src package.
railway.toml Railway build/deploy configuration.
pyproject.toml Project metadata and dependencies.
migrations/001_initial.sql Initial DB schema for bot configs, reference docs, and message history.
main.py Adds a root entrypoint stub.
PROJECT_STRUCTURE.md Documents folder structure and file responsibilities.
GUARDRAILS.md Documents safety/guardrails policies and operational guidelines.
ARCHITECTURE.md Documents system architecture, data flow, and fallback logic.
AGENTS.md Conventions and contributor/developer guidance for the project.
.python-version Pins Python version.
.gitignore Adds Python/venv/env/test/cache ignores.
.env.example Adds environment variable template and explanations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +27 to +29
settings = get_settings()
supabase = get_supabase_client()
copilot = CopilotService(
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supabase = get_supabase_client() is assigned but never used in this CLI command (the repository functions re-create their own client). Remove the unused variable, or refactor the repository to accept an injected client so the CLI can reuse this instance.

Copilot uses AI. Check for mistakes.
Comment on lines +10 to +20
async def scrape_website(url: str, max_pages: int = 5) -> List[str]:
"""
Scrape website and return text chunks.

Args:
url: Root URL to scrape
max_pages: Maximum number of pages to scrape

Returns:
List of text chunks (500-800 words each)
"""
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max_pages parameter is unused and the implementation only fetches the root url once. Either implement multi-page crawling (respecting max_pages) or remove the parameter/update the docstring to avoid misleading callers.

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +91
if not self.enabled or not await self.is_available():
# Fallback to OpenAI or other LLM
logger.warning("Copilot SDK not available, falling back to OpenAI")
return await self._fallback_to_openai(system_prompt, messages)
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chat() calls await self.is_available() for every request, and is_available() creates a new HTTP client and performs a network call each time. This adds an extra round-trip per message. Consider caching the availability result for a short TTL, or performing a single startup check and then handling runtime failures via exception handling/retries.

Copilot uses AI. Check for mistakes.
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # Configure appropriately for production
allow_credentials=True,
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CORS middleware is configured with allow_origins=["*"] while also setting allow_credentials=True. Starlette/FASTAPI CORS middleware forbids wildcard origins when credentials are allowed and will error/behave incorrectly. Use an explicit origin allowlist (recommended), or set allow_credentials=False if wildcard origins are required.

Suggested change
allow_credentials=True,
allow_credentials=False,

Copilot uses AI. Check for mistakes.
Comment on lines +27 to +33
settings = get_settings()
supabase = get_supabase_client()
copilot = CopilotService(
base_url=settings.copilot_cli_host,
enabled=settings.copilot_enabled
)

Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI calls get_settings() before prompting for Facebook credentials, but Settings requires facebook_page_access_token and facebook_verify_token (non-optional). This means the CLI will fail with a validation error unless those env vars are already set—making the prompts redundant. Consider splitting settings (CLI vs server), or making the Facebook fields optional for CLI execution and validating only when needed.

Copilot uses AI. Check for mistakes.
Comment on lines +21 to +28
create table reference_documents (
id uuid primary key default gen_random_uuid(),
bot_id uuid not null references bot_configurations(id) on delete cascade,
content text not null,
source_url text not null,
content_hash text not null,
created_at timestamptz default now()
);
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table requires bot_id uuid not null, but the repository/CLI flow creates a reference document before a bot exists. To support that flow, bot_id must be nullable (or the flow must be inverted to create the bot first and then create the reference document with bot_id).

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +36
if settings.copilot_enabled:
is_available = await copilot.is_available()
if not is_available:
print("Warning: Copilot SDK not available, will use OpenAI fallback")

Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using print() for runtime warnings; it bypasses log configuration and is discouraged by the project’s own guidance (see AGENTS.md:312). Use the module logger (or logging.getLogger(__name__)) so warnings are captured consistently in production.

Copilot uses AI. Check for mistakes.
# Step 6: Create bot configuration
typer.echo("\nCreating bot configuration...")
try:
bot_config = create_bot_configuration(
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable bot_config is not used.

Suggested change
bot_config = create_bot_configuration(
create_bot_configuration(

Copilot uses AI. Check for mistakes.
"""GitHub Copilot SDK wrapper service."""

import logging
from typing import Any
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'Any' is not used.

Suggested change
from typing import Any

Copilot uses AI. Check for mistakes.

import asyncio
import typer
from typing_extensions import Annotated
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'Annotated' is not used.

Suggested change
from typing_extensions import Annotated

Copilot uses AI. Check for mistakes.
Copy link
Member Author

jreakin commented Jan 29, 2026

Merge activity

  • Jan 29, 11:55 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jan 29, 11:55 PM UTC: @jreakin merged this pull request with Graphite.

@jreakin jreakin merged commit 2528d6a into main Jan 29, 2026
10 checks passed
@jreakin jreakin deleted the initial-setup branch January 30, 2026 00:19
@notion-workspace
Copy link

Initial project setup

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants