
PromptMill

AI-powered prompt generator for video, image, and creative content

Python 3.12+ Β· Gradio Β· Docker Β· Ruff Β· License: MIT

Features Β· Quick Start Β· Supported Targets Β· Models Β· Configuration


Overview

PromptMill is a self-contained web UI that runs entirely locally: no API keys, no cloud dependencies. It uses selectable local LLMs, sized to your GPU's VRAM, to generate optimized prompts for current AI video and image generators.

102 Preset Roles Β· 7 LLM Options Β· 1B-8B Parameters Β· 100% Local

πŸ“Έ Screenshots

Main Interface

PromptMill Main Interface

Clean dark UI with quick examples and customizable generation settings

102 AI Model Targets

PromptMill Model Selection

Support for Video, Image, Audio, 3D, and Creative AI tools


✨ Features

  • Smart GPU Detection - Automatically selects the best model for your VRAM
  • 7 LLM Tiers - From 1B (CPU) to 8B parameters (24GB+ VRAM) using Dolphin models
  • 102 Specialized Roles - Video (22), Image (21), Audio (13), 3D (12), and Creative (34)
  • Dark Mode UI - Modern interface with streaming generation
  • Model Cleanup - Delete downloaded models to free disk space
  • Zero Config - Works out of the box with Docker
  • Fully Offline - No API keys or internet required after setup
  • Thread-Safe - Concurrent request handling with proper locking
  • Configurable - Environment variables for server settings

πŸš€ Quick Start

Docker (Recommended)

```shell
# GPU (NVIDIA) - auto-detects VRAM
docker compose --profile gpu up -d

# CPU only
docker compose --profile cpu up -d
```

Open http://localhost:7610

Models auto-download on first use and persist in ./models/

Manual Installation

```shell
# GPU (CUDA)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
pip install gradio huggingface_hub
python -m promptmill

# CPU only
pip install llama-cpp-python gradio huggingface_hub
python -m promptmill
```

🎯 Supported Targets

🎬 Video (22)

Wan2.1, Wan2.2, Wan2.5, Hunyuan Video, Hunyuan 1.5, Runway Gen-3, Kling AI, Kling 2.1, Pika Labs, Pika 2.1, Luma Dream Machine, Luma Ray2, Sora, Veo, Veo 3, Hailuo AI, Seedance, SkyReels V1, Mochi 1, CogVideoX, LTX Video, Open-Sora

πŸ–ΌοΈ Image (21)

Stable Diffusion, SD 3.5, FLUX, FLUX 2, Midjourney, DALL-E 3, ComfyUI, Ideogram, Leonardo AI, Adobe Firefly, Recraft, Imagen 3, Imagen 4, GPT-4o Images, Reve Image, HiDream-I1, Qwen-Image, Recraft V3, FLUX Kontext, Ideogram 3, Grok Image

πŸ”Š Audio (13)

Suno AI, Udio, ElevenLabs, Eleven Music, Mureka AI, SOUNDRAW, Beatoven.ai, Stable Audio 2.0, MusicGen, Suno v4.5, ACE Studio, AIVA, Boomy

🧊 3D (12)

Meshy, Tripo AI, Rodin, Spline, Sloyd, 3DFY.ai, Luma Genie, Masterpiece X, Hunyuan3D, Trellis, TripoSR, Unique3D

✍️ Creative (34)

Story Writer, Code Generator, Technical Writer, Marketing Copy, SEO Content, Screenplay Writer, Social Media Manager, Video Script Writer, Song Lyrics, Email Copywriter, Product Description, Podcast Script, Resume Writer, Cover Letter, Speech Writer, Game Narrative, UX Writer, Press Release, Poetry Writer, Data Analysis, Business Plan, Academic Writing, Tutorial Creator, Newsletter Writer, Legal Document, Grant Writer, API Documentation, Course Creator, Pitch Deck, Meeting Notes, Changelog Writer, Recipe Creator, Travel Guide, Workout Plan


🧠 LLM Options

PromptMill automatically selects the best model based on your GPU. All models are uncensored Dolphin variants:

| VRAM | Model | Size | Quality |
|------|-------|------|---------|
| CPU | Dolphin 3.0 Llama 3.2 1B Q8 | ~1GB | ⭐ |
| 4GB | Dolphin 3.0 Llama 3.2 3B Q4_K_M | ~2.5GB | ⭐⭐ |
| 6GB | Dolphin 3.0 Llama 3.2 3B Q8 | ~4GB | ⭐⭐⭐ |
| 8GB | Dolphin 3.0 Llama 3.1 8B Q4_K_M | ~6GB | ⭐⭐⭐⭐ |
| 12GB | Dolphin 3.0 Llama 3.1 8B Q6_K_L | ~10GB | ⭐⭐⭐⭐ |
| 16GB+ | Dolphin 3.0 Llama 3.1 8B Q8 | ~12GB | ⭐⭐⭐⭐⭐ |
| 24GB+ | Dolphin 2.9.4 Llama 3.1 8B Q8 (131K ctx) | ~10GB | ⭐⭐⭐⭐⭐ |

βš™οΈ Configuration

The app auto-configures based on your hardware:

  • GPU detected β†’ Uses all layers on GPU, selects model by VRAM
  • No GPU β†’ CPU mode with lightweight 1B model

Manual override available in the UI for GPU layers and model selection.

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| SERVER_HOST | 127.0.0.1 | Server bind address (use 0.0.0.0 for network access) |
| SERVER_PORT | 7610 | Server port |
| MODELS_DIR | /app/models | Directory for model storage |

Security Note: the default 127.0.0.1 allows local access only. For network or Docker access, set SERVER_HOST=0.0.0.0; in production, put a reverse proxy (nginx/traefik) in front.

Example:

```shell
SERVER_PORT=8080 python -m promptmill
```
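
Reading these variables amounts to plain environment lookups with the documented defaults. A minimal sketch (the real settings code lives under infrastructure/config/ and may be structured differently):

```python
import os

def load_settings(env=os.environ):
    """Resolve server settings from the environment, with documented defaults."""
    return {
        "host": env.get("SERVER_HOST", "127.0.0.1"),
        "port": int(env.get("SERVER_PORT", "7610")),
        "models_dir": env.get("MODELS_DIR", "/app/models"),
    }
```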

πŸ”Œ API & Health Check

PromptMill exposes a health endpoint for container orchestration:

```shell
# Health check
curl http://localhost:7610/health
```

Response:

```json
{
  "status": "healthy",
  "version": "3.0.0",
  "model_loaded": false,
  "roles_count": 102
}
```

The Gradio API is also available at /api/ for programmatic access.
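
A deployment script can poll the health endpoint and gate on the status field. A minimal sketch, assuming the response shape shown above (`parse_health` and `check` are hypothetical helpers, not part of PromptMill):

```python
import json
from urllib.request import urlopen

def parse_health(raw: str) -> bool:
    """Return True when the payload reports a healthy service."""
    return json.loads(raw).get("status") == "healthy"

def check(url: str = "http://localhost:7610/health") -> bool:
    # Requires a running PromptMill instance.
    with urlopen(url, timeout=5) as resp:
        return parse_health(resp.read().decode())
```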


πŸ“ Project Structure

```
PromptMill/
β”œβ”€β”€ src/promptmill/          # Application source (Hexagonal Architecture)
β”‚   β”œβ”€β”€ __main__.py          # Entry point
β”‚   β”œβ”€β”€ container.py         # Dependency injection container
β”‚   β”œβ”€β”€ domain/              # Domain layer (entities, ports, exceptions)
β”‚   β”‚   β”œβ”€β”€ entities/        # Model, Role, GPUInfo
β”‚   β”‚   β”œβ”€β”€ value_objects/   # PromptGenerationRequest/Result
β”‚   β”‚   β”œβ”€β”€ ports/           # Abstract interfaces (LLM, Repository)
β”‚   β”‚   └── exceptions.py    # Domain exceptions
β”‚   β”œβ”€β”€ application/         # Application layer (use cases, services)
β”‚   β”‚   β”œβ”€β”€ use_cases/       # GeneratePrompt, LoadModel, etc.
β”‚   β”‚   └── services/        # PromptService, ModelService, HealthService
β”‚   β”œβ”€β”€ infrastructure/      # Infrastructure layer (adapters, config)
β”‚   β”‚   β”œβ”€β”€ adapters/        # LlamaCpp, HuggingFace, NvidiaSmi adapters
β”‚   β”‚   β”œβ”€β”€ config/          # Settings, ModelConfigs
β”‚   β”‚   └── persistence/     # RolesData (102 role templates)
β”‚   └── presentation/        # Presentation layer (Gradio UI)
β”‚       β”œβ”€β”€ gradio_app.py    # Main UI
β”‚       └── theme.py         # Dark theme configuration
β”œβ”€β”€ tests/                   # Unit & integration tests
β”œβ”€β”€ pyproject.toml           # Project config & dependencies
β”œβ”€β”€ assets/logo.svg          # Logo
β”œβ”€β”€ Dockerfile.gpu           # CUDA build
β”œβ”€β”€ Dockerfile.cpu           # CPU build
β”œβ”€β”€ docker-compose.yml       # Docker orchestration
└── models/                  # Downloaded LLMs (persisted)
```

πŸ› οΈ Development

Requires Python 3.12+ and uv (recommended) or pip.

```shell
# Install dependencies
uv sync

# Run application
uv run python -m promptmill

# Lint & format
uv run ruff check --fix
uv run ruff format

# Run tests
PYTHONPATH=src uv run pytest tests/unit -v
```

Architecture

PromptMill uses Hexagonal Architecture (Ports and Adapters) with Domain-Driven Design:

  • Domain Layer: Pure Python entities, value objects, and port interfaces
  • Application Layer: Use cases and services orchestrating business logic
  • Infrastructure Layer: Adapters implementing ports (LlamaCpp, HuggingFace, etc.)
  • Presentation Layer: Gradio UI adapter

πŸ”§ Troubleshooting

CUDA/GPU Errors

  • Set GPU Layers to 0 in the UI for CPU-only mode
  • Ensure NVIDIA drivers are installed: nvidia-smi
  • For Docker: use --profile gpu and ensure nvidia-container-toolkit is installed

Model Download Issues

  • Check internet connectivity
  • Models are cached in ./models/ directory
  • Delete and re-download: use "Model Management" in UI

Memory Issues

  • Try a smaller model (lower VRAM tier)
  • Close other GPU-intensive applications
  • Model auto-unloads after 10 seconds of inactivity

Port Already in Use

```shell
SERVER_PORT=8080 python -m promptmill
```

🀝 Contributing

Contributions welcome! Feel free to:

  • Report bugs or request features via Issues
  • Submit pull requests

πŸ“„ License

MIT License - see LICENSE for details.


⬆ Back to top

Made with ❀️ for the AI creative community
