Coding Agent Request

This file is used for communication between Manager and Coding Agent.

Completion Status

Feature: cleanup-001, docker-001
Status: ✅ Completed
Tests: Pass
Notes: Both features completed successfully. The API was cleaned up and dockerized, and the YouTube tables were preserved with correct row counts.

User Requirement:

  1. Remove TAQ, Curves, Futures, and ETS components from youtube_finder database and API code
  2. Clean up the API to only support YouTube-related functionality
  3. Dockerize the cleaned-up API so it runs in Docker without requiring a terminal to stay open

PART 1: Remove Unused Components (cleanup-001)

Current State

  • Database tables to remove (all empty, 0 rows):

    • ets_transactions
    • taq_data
    • curves_data
    • futures_data
  • Code files to remove:

    • api/routers/ets.py
    • api/routers/taq.py
    • api/routers/curves.py
    • api/routers/futures.py
    • api/models/ets.py
    • api/models/taq.py
    • api/models/curves.py
    • api/models/futures.py
    • api/schemas/ets.py
    • api/schemas/taq.py
    • api/schemas/curves.py
    • api/schemas/futures.py
  • Files to modify:

    • api/main.py - Remove imports and router includes
    • api/models/__init__.py - Remove model exports
    • test/test_api.py - Remove ETS tests

Step 1: Backup Current State

  1. Verify current row counts in YouTube tables:
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "SELECT 'videos' as table_name, COUNT(*) as row_count FROM videos UNION ALL SELECT 'video_labels', COUNT(*) FROM video_labels UNION ALL SELECT 'transcripts', COUNT(*) FROM transcripts;"
  2. Expected: videos (982), video_labels (774), transcripts (3)
  3. Document these counts for verification after cleanup
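
The same verification query is reused in Step 7 and in Part 2, so it can be generated from a table list rather than retyped; a minimal sketch (the `count_query` helper is illustrative, not part of the codebase):

```python
def count_query(tables: list[str]) -> str:
    """Build a UNION ALL query returning one (table_name, row_count) row per table."""
    parts = [
        f"SELECT '{t}' AS table_name, COUNT(*) AS row_count FROM {t}"
        for t in tables
    ]
    return " UNION ALL ".join(parts) + ";"
```

`count_query(["videos", "video_labels", "transcripts"])` produces the query used in the docker exec command above.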

Step 2: Drop Database Tables

  1. Drop unused tables from youtube_finder database:
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "DROP TABLE IF EXISTS ets_transactions CASCADE;"
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "DROP TABLE IF EXISTS taq_data CASCADE;"
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "DROP TABLE IF EXISTS curves_data CASCADE;"
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "DROP TABLE IF EXISTS futures_data CASCADE;"
  2. Verify tables are dropped:
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "\dt"
  3. The output should show only the YouTube-related tables: videos, video_labels, transcripts, video_notes, video_watch_positions, llm_responses

Step 3: Remove API Code Files

  1. Delete router files:

    • api/routers/ets.py
    • api/routers/taq.py
    • api/routers/curves.py
    • api/routers/futures.py
  2. Delete model files:

    • api/models/ets.py
    • api/models/taq.py
    • api/models/curves.py
    • api/models/futures.py
  3. Delete schema files:

    • api/schemas/ets.py
    • api/schemas/taq.py
    • api/schemas/curves.py
    • api/schemas/futures.py

Step 4: Update api/main.py

  1. Remove imports:

    # REMOVE these lines:
    from api.routers import health, ets, taq, curves, futures
    from api.models import ETSTransaction, TAQData, CurvesData, FuturesData
  2. Remove router includes:

    # REMOVE these lines:
    app.include_router(ets.router)
    app.include_router(taq.router)
    app.include_router(curves.router)
    app.include_router(futures.router)
  3. Keep only:

    from api.routers import health
    app.include_router(health.router)

Step 5: Update api/models/__init__.py

  1. Remove model exports for ETS, TAQ, Curves, Futures
  2. Keep only BaseModel if needed
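
A sketch of what the trimmed file might look like after the removals (the `api.models.base` import path is an assumption about this project's layout; adjust to the actual module structure):

```python
# api/models/__init__.py -- only the shared declarative base remains
# after the ETS/TAQ/Curves/Futures model exports are removed.
from api.models.base import Base  # assumed import path

__all__ = ["Base"]
```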

Step 6: Update test/test_api.py

  1. Remove all ETS-related tests
  2. Keep only health endpoint tests
  3. Update test summary to reflect removed tests

Step 7: Verify Cleanup

  1. Verify YouTube tables still exist and have correct row counts:
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "SELECT 'videos' as table_name, COUNT(*) as row_count FROM videos UNION ALL SELECT 'video_labels', COUNT(*) FROM video_labels UNION ALL SELECT 'transcripts', COUNT(*) FROM transcripts;"
  2. Should match pre-cleanup counts: videos (982), video_labels (774), transcripts (3)
  3. Verify API still starts without errors:
    python -m uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
  4. Verify that only the health endpoint exists: GET /health
  5. Verify that the API docs at http://localhost:8000/docs show only the health endpoint
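
The before/after row-count comparison can be scripted rather than eyeballed; a sketch that parses psql's default aligned output (the helper names are illustrative):

```python
# Expected pre-cleanup counts, per Step 1 of Part 1.
EXPECTED = {"videos": 982, "video_labels": 774, "transcripts": 3}

def parse_counts(psql_output: str) -> dict[str, int]:
    """Parse 'table_name | row_count' rows from psql's aligned output,
    skipping the header, separator, and row-count footer lines."""
    counts = {}
    for line in psql_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts

def verify(psql_output: str) -> bool:
    """True when every expected table has exactly the expected row count."""
    return parse_counts(psql_output) == EXPECTED
```

Feed `verify` the captured stdout of the docker exec command from item 1.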

PART 2: Dockerize Cleaned-Up API (docker-001)

Step 1: Create Dockerfile for API

  1. Create docker/Dockerfile.api:
    FROM python:3.11-slim
    
    WORKDIR /app
    
    # System packages: postgresql-client is useful for debugging;
    # psycopg2 built from source would additionally need libpq-dev and gcc
    RUN apt-get update && apt-get install -y \
        postgresql-client \
        && rm -rf /var/lib/apt/lists/*
    
    # Copy requirements
    COPY utils/requirements.txt /app/requirements.txt
    
    # Install Python dependencies
    RUN pip install --no-cache-dir -r requirements.txt
    
    # Copy application code
    COPY . /app
    
    # Expose API port
    EXPOSE 8000
    
    # Run FastAPI with uvicorn
    CMD ["python", "-m", "uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]

Step 2: Update api/config.py for Docker Detection

  1. Update api/config.py to detect Docker environment:
    # In Settings class, update POSTGRES_HOST:
    POSTGRES_HOST: str = os.getenv("POSTGRES_HOST", "localhost")
    
    # Add property to detect Docker:
    @property
    def postgres_host(self) -> str:
        """Get PostgreSQL host - use 'postgres' in Docker, 'localhost' outside."""
        if os.path.exists("/app") or os.getenv("DOCKER_ENV"):
            return os.getenv("POSTGRES_HOST", "postgres")
        return os.getenv("POSTGRES_HOST", "localhost")
    
    # Update database_url property to use postgres_host:
    @property
    def database_url(self) -> str:
        """Construct PostgreSQL database URL."""
        host = self.postgres_host
        return (
            f"postgresql://{self.POSTGRES_USER}:{self.POSTGRES_PASSWORD}"
            f"@{host}:{self.POSTGRES_PORT}/{self.POSTGRES_DB}"
        )

Step 3: Add API Service to docker-compose.yml

  1. Add API service to docker/docker-compose.yml:
    api:
      build:
        context: ..
        dockerfile: docker/Dockerfile.api
      container_name: database-youtube-api
      environment:
        POSTGRES_USER: ${POSTGRES_USER:-postgres}
        POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-changeme}
        POSTGRES_DB: ${POSTGRES_DB:-youtube_finder}
        POSTGRES_HOST: postgres  # Service name in Docker network
        POSTGRES_PORT: 5432
        API_HOST: 0.0.0.0
        API_PORT: 8000
        DOCKER_ENV: "true"  # Flag to detect Docker environment
      ports:
        - "${API_PORT:-8000}:8000"
      depends_on:
        postgres:
          condition: service_healthy
      restart: unless-stopped
      networks:
        - database-youtube-network

Step 4: Test Docker Deployment

  1. Build API image:

    docker-compose -f docker/docker-compose.yml build api
  2. Start all services:

    docker-compose -f docker/docker-compose.yml up -d
  3. Check API container logs:

    docker logs database-youtube-api
  4. Verify API is accessible:

    curl http://localhost:8000/health
    # Or open in browser: http://localhost:8000/docs
  5. Verify API docs show only health endpoint
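
The accessibility check in item 4 can be automated with a small stdlib-only poller, which is useful because the API container may take a moment to come up after docker-compose starts; a sketch (the function name is illustrative):

```python
import time
import urllib.error
import urllib.request

def wait_for_health(url: str, retries: int = 10, delay: float = 1.0) -> bool:
    """Poll a health endpoint until it returns HTTP 200 or retries run out."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # connection refused or timed out; retry after a short delay
        time.sleep(delay)
    return False
```

For example, `wait_for_health("http://localhost:8000/health")` returns True once the container is serving.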

Step 5: Verify YouTube Tables Still Accessible

  1. Verify YouTube tables still exist and have correct data:
    docker exec database-youtube-postgres psql -U postgres -d youtube_finder -c "SELECT 'videos' as table_name, COUNT(*) as row_count FROM videos UNION ALL SELECT 'video_labels', COUNT(*) FROM video_labels UNION ALL SELECT 'transcripts', COUNT(*) FROM transcripts;"
  2. Should match pre-cleanup counts: videos (982), video_labels (774), transcripts (3)

Step 6: Documentation

  1. Update README.md to reflect the cleaned-up API
  2. Document the Docker deployment process
  3. Document that the API now provides only the health endpoint (YouTube tables are managed by other processes)

Verification Checklist

Cleanup (cleanup-001)

  • Database tables dropped: ets_transactions, taq_data, curves_data, futures_data
  • Router files deleted: ets.py, taq.py, curves.py, futures.py
  • Model files deleted: ets.py, taq.py, curves.py, futures.py
  • Schema files deleted: ets.py, taq.py, curves.py, futures.py
  • api/main.py updated (removed imports and router includes)
  • api/models/__init__.py updated (removed model exports)
  • test/test_api.py updated (removed ETS tests)
  • YouTube tables still exist with correct row counts
  • API starts without errors
  • Only health endpoint available

Dockerization (docker-001)

  • Dockerfile.api created and builds successfully
  • API service added to docker-compose.yml
  • api/config.py detects Docker environment correctly
  • API container starts and connects to database
  • API is accessible at http://localhost:8000
  • Health endpoint works: GET /health
  • API docs accessible at http://localhost:8000/docs
  • API container restarts automatically if it crashes
  • No terminal window needed to keep API running
  • CRITICAL: YouTube-related tables still accessible (videos, video_labels, transcripts)
  • CRITICAL: No data loss in existing tables (verify row counts match before/after)
  • CRITICAL: Other processes can still connect to database (database port 5432 still exposed)

Files to Create/Modify

Create

  • docker/Dockerfile.api - Dockerfile for API container

Delete

  • api/routers/ets.py
  • api/routers/taq.py
  • api/routers/curves.py
  • api/routers/futures.py
  • api/models/ets.py
  • api/models/taq.py
  • api/models/curves.py
  • api/models/futures.py
  • api/schemas/ets.py
  • api/schemas/taq.py
  • api/schemas/curves.py
  • api/schemas/futures.py

Modify

  • docker/docker-compose.yml - Add API service
  • api/config.py - Add Docker environment detection
  • api/main.py - Remove unused imports and routers
  • api/models/__init__.py - Remove unused model exports
  • test/test_api.py - Remove ETS tests
  • README.md - Update documentation

Critical Safety Notes

  • DO NOT modify YouTube-related tables - They are used by OTHER processes
  • DO NOT drop YouTube-related tables - Only drop ETS, TAQ, Curves, Futures tables
  • Verify row counts before and after - YouTube tables must remain unchanged
  • Database port 5432 remains exposed - Other processes need access
  • Backward compatibility - API should still work when run outside Docker (for development)

Task History

2025-01-23 - Clean Up API and Dockerize

  • User decision: Remove TAQ, Curves, Futures, and ETS (all unused)
  • Features: cleanup-001, docker-001
  • Priority: HIGH
  • Status: ✅ COMPLETED - Both features implemented and verified