Backend Setup: Initial API and Database Structure for DocuTag #2

@ShadowEGGx

Backend Implementation Plan

1. Project Setup

  • Initialize Python project with virtual environment
  • Set up FastAPI framework
  • Configure project structure:
docutag/
├── app/
│   ├── __init__.py
│   ├── main.py
│   ├── core/
│   │   ├── __init__.py
│   │   ├── config.py
│   │   └── security.py
│   ├── api/
│   │   ├── __init__.py
│   │   └── v1/
│   │       ├── __init__.py
│   │       ├── endpoints/
│   │       └── routes.py
│   ├── models/
│   │   ├── __init__.py
│   │   └── document.py
│   ├── schemas/
│   │   ├── __init__.py
│   │   └── document.py
│   └── services/
│       ├── __init__.py
│       └── document_service.py
├── tests/
│   └── __init__.py
├── requirements.txt
└── README.md
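
As a rough illustration of what `app/core/config.py` in the layout above might hold, here is a minimal sketch that reads settings from environment variables using only the standard library. The `Settings` fields, the `DOCUTAG_*` variable names, and the default values are all assumptions made for this example; the plan does not specify any of them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names are illustrative only."""
    database_url: str
    upload_dir: str
    max_upload_bytes: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, using placeholder defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Assumed variable names and defaults, not taken from the plan.
        database_url=env.get("DOCUTAG_DATABASE_URL", "postgresql://localhost/docutag"),
        upload_dir=env.get("DOCUTAG_UPLOAD_DIR", "./uploads"),
        max_upload_bytes=int(env.get("DOCUTAG_MAX_UPLOAD_BYTES", str(10 * 1024 * 1024))),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in unit tests.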

2. Database Setup

  • Set up PostgreSQL database
  • Create SQLAlchemy models:
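# Basic Document Model Structure
import uuid
from datetime import datetime

from sqlalchemy import Boolean, Column, DateTime, ForeignKey, String, Text
from sqlalchemy.dialects.postgresql import ARRAY, JSON, UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Document(Base):
    __tablename__ = "documents"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    filename = Column(String, nullable=False)
    file_type = Column(String, nullable=False)
    content_type = Column(String)  # Classification result
    upload_date = Column(DateTime, default=datetime.utcnow)
    processed = Column(Boolean, default=False)
    summary = Column(Text)
    keywords = Column(ARRAY(String))
    # "metadata" is a reserved attribute on declarative classes, so the
    # Python attribute is renamed while the column keeps the name "metadata".
    doc_metadata = Column("metadata", JSON)
    user_id = Column(UUID(as_uuid=True), ForeignKey("users.id"))  # needs a "users" table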

3. API Endpoints Implementation

  • Document Upload endpoint
from typing import Optional
from uuid import UUID

from fastapi import APIRouter, UploadFile

router = APIRouter()

@router.post("/documents/upload")
async def upload_document(file: UploadFile):
    # Handle document upload (validation and storage still to be implemented)
    pass
  • Document Retrieval endpoints
@router.get("/documents/{doc_id}")
async def get_document(doc_id: UUID):
    # Retrieve document details
    pass

@router.get("/documents")
async def list_documents(
    skip: int = 0,
    limit: int = 10,
    content_type: Optional[str] = None
):
    # List documents with pagination, optionally filtered by content_type
    pass
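
The `skip`/`limit`/`content_type` parameters of the list endpoint could behave roughly as in the sketch below. It works on an in-memory list rather than a database query, and the `paginate` helper and its `max_limit` cap are assumptions added for illustration, not part of the plan.

```python
from typing import List, Optional, Sequence


def paginate(
    documents: Sequence[dict],
    skip: int = 0,
    limit: int = 10,
    content_type: Optional[str] = None,
    max_limit: int = 100,  # assumed upper bound to keep responses small
) -> List[dict]:
    """Filter by content_type (if given), then return one page of results."""
    if skip < 0 or limit < 1:
        raise ValueError("skip must be >= 0 and limit must be >= 1")
    limit = min(limit, max_limit)
    if content_type is not None:
        documents = [d for d in documents if d.get("content_type") == content_type]
    return list(documents[skip:skip + limit])
```

Filtering before slicing means `skip` counts matching documents only, which is usually what a client paging through a filtered list expects.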

4. File Storage System

  • Implement secure file storage system
  • Set up file naming conventions
  • Implement file type validation
  • Add file size limits and validation
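
One possible shape for the naming, type, and size checks listed above is sketched here. The allowed extensions, the 10 MiB limit, and the helper names are assumptions chosen for the example. Checking the extension alone does not prove what a file contains, so a real implementation would probably also inspect the content.

```python
import uuid
from pathlib import PurePosixPath

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}  # assumed whitelist
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed 10 MiB limit


class UploadRejected(ValueError):
    """Raised when an upload fails validation."""


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return the lowercased extension, or raise UploadRejected."""
    extension = PurePosixPath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise UploadRejected(f"file type {extension or '(none)'} is not allowed")
    if size_bytes <= 0:
        raise UploadRejected("file is empty")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise UploadRejected("file exceeds the size limit")
    return extension


def storage_name(filename: str, size_bytes: int) -> str:
    """Generate a random on-disk name that keeps only the validated extension."""
    extension = validate_upload(filename, size_bytes)
    return f"{uuid.uuid4().hex}{extension}"
```

Because the stored name is generated rather than taken from the client, path components such as `../` in the original filename never reach the file system; the original name can still be kept in the `filename` column.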

5. Document Processing Queue

  • Set up Celery for async processing
  • Implement task queue for document processing
  • Add status tracking for processing tasks
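
Status tracking can be thought of as a small state machine, independent of the queue that runs the work. The sketch below does not use Celery at all; the status names, the allowed transitions, and the in-memory `StatusTracker` are assumptions for illustration (a real version would presumably persist the status on the document row).

```python
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition rules: failed tasks may be retried, completed ones are final.
ALLOWED_TRANSITIONS = {
    TaskStatus.PENDING: {TaskStatus.PROCESSING},
    TaskStatus.PROCESSING: {TaskStatus.COMPLETED, TaskStatus.FAILED},
    TaskStatus.COMPLETED: set(),
    TaskStatus.FAILED: {TaskStatus.PENDING},
}


class StatusTracker:
    """In-memory stand-in for a persistent status store."""

    def __init__(self) -> None:
        self._statuses = {}

    def register(self, document_id: str) -> None:
        self._statuses[document_id] = TaskStatus.PENDING

    def advance(self, document_id: str, new_status: TaskStatus) -> None:
        current = self._statuses[document_id]
        if new_status not in ALLOWED_TRANSITIONS[current]:
            raise ValueError(f"cannot go from {current.value} to {new_status.value}")
        self._statuses[document_id] = new_status

    def status(self, document_id: str) -> TaskStatus:
        return self._statuses[document_id]
```

Rejecting invalid transitions makes duplicate or out-of-order task callbacks visible instead of silently overwriting a final state.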

6. Security Implementation

  • Add JWT authentication
  • Implement role-based access control
  • Set up input validation
  • Add rate limiting
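
For the rate-limiting item, one common approach is a token bucket per client. The sketch below is framework-agnostic and keeps state in memory; the class name, its parameters, and the choice of a token bucket are assumptions, since the plan does not say which algorithm or library will be used.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-client token bucket: each request spends one token."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        # client id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        tokens, last_seen = self._buckets.get(client_id, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.refill_per_second)
        if tokens >= 1:
            self._buckets[client_id] = (tokens - 1, now)
            return True
        self._buckets[client_id] = (tokens, now)
        return False
```

In an API, `allow` would typically be called from middleware or a dependency with the client's user id or IP address, returning HTTP 429 when it yields `False`. With several worker processes, the counters would need a shared store rather than a per-process dict.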

Dependencies to Add

fastapi==0.100.0
uvicorn==0.22.0
sqlalchemy==2.0.18
alembic==1.11.1
psycopg2-binary==2.9.6
python-multipart==0.0.6
celery==5.3.1
python-jose==3.3.0
passlib==1.7.4
pydantic==2.0.2

Initial Setup Steps

  1. Create virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  2. Install dependencies:
pip install -r requirements.txt
  3. Create initial FastAPI app:
from fastapi import FastAPI

app = FastAPI(
    title="DocuTag API",
    description="API for document processing and classification",
    version="1.0.0"
)

@app.get("/")
async def root():
    return {"message": "Welcome to DocuTag API"}

Next Steps

  1. Set up the basic project structure
  2. Initialize database and create models
  3. Implement document upload endpoint
  4. Add basic authentication
  5. Set up document storage system

Testing Plan

  • Unit tests for models
  • Integration tests for API endpoints
  • File upload/download tests
  • Authentication tests
  • Performance tests for file handling
