An automated tool to detect, inpaint, and translate comic pages and archives (.cbr, .cbz) using YOLOv8, Florence-2, and Google Translate.
- Format Support: Processes individual images (.jpg, .png, .webp) and comic archives (.cbz, .cbr).
- Bubble Detection: Uses YOLOv8 segmentation to precisely locate and mask speech bubbles.
- Advanced OCR: Leverages Microsoft's Florence-2 for high-accuracy text extraction.
- Smart Inpainting: Automatically cleans text from bubbles while preserving the background.
- Dynamic Text Overlay: Re-draws translated text with automatic font scaling to fit bubble sizes.
- Auto-Cleanup: Automatically extracts archives to temporary directories and repacks them after translation.
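The font-scaling idea behind Dynamic Text Overlay can be illustrated with a minimal sketch. It is not the script's actual implementation: `fit_font_size`, its parameters, and the fixed glyph-width ratio are hypothetical, and a real version would measure text with the actual font rather than estimate it.

```python
import textwrap


def fit_font_size(text, box_w, box_h, max_size=40, min_size=8,
                  char_aspect=0.6, line_spacing=1.2):
    """Return (font_size, lines) for the largest size whose wrapped text fits.

    Glyph width is approximated as char_aspect * font_size (an assumption);
    measuring with the real font would be more accurate.
    """
    lines = [text]
    for size in range(max_size, min_size - 1, -1):
        char_w = size * char_aspect
        # Number of characters that fit on one line at this size.
        max_chars = max(1, int(box_w // char_w))
        lines = textwrap.wrap(text, width=max_chars) or [""]
        widest = max(len(line) for line in lines) * char_w
        block_h = len(lines) * size * line_spacing
        if widest <= box_w and block_h <= box_h:
            return size, lines
    # Nothing fits: fall back to the smallest size and the last wrapping tried.
    return min_size, lines
```

Starting from the largest size and shrinking keeps short lines of dialogue large while still letting long sentences wrap into small bubbles.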
- Python 3.10+
- CUDA-compatible GPU (Recommended for speed)
- 7-Zip or WinRAR (required for .cbr support; must be added to your Windows PATH)
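A quick way to verify these prerequisites before running the tool is sketched below. The function name and the executable names `7z` and `unrar` are assumptions; the command the script actually calls may differ.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), extractors=("7z", "unrar")):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required")
    # shutil.which searches PATH (and PATHEXT on Windows) for the executable.
    if not any(shutil.which(tool) for tool in extractors):
        problems.append("No archive tool found on PATH; .cbr files will fail")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```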
# Create and activate virtual environment
python -m venv comic_translator_env
.\comic_translator_env\Scripts\Activate
# Install core dependencies
pip install -r requirements.txt
Because of a bug in the basicsr package, install it manually without its dependencies, then install the remaining packages explicitly:
pip install basicsr --no-deps
pip install addict Cython pyyaml scipy tb-nightly tqdm
pip install realesrgan facexlib gfpgan
D:\ComicTranslator\
├── models/ # Place your .pt files here
├── fonts/ # Place animeace2_reg.otf here
├── output/ # Translated files will appear here
├── translate_comic.py # The main script
└── requirements.txt
python translate_comic.py "C:\Path\To\image.jpg"
python translate_comic.py "C:\Path\To\manga_vol_01.cbz"
The script will create a translated_manga_vol_01.cbz in the output/ folder and clean up all temporary images.
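The extract, translate, repack flow for a .cbz archive can be sketched roughly as follows. This is an illustration under assumptions, not the code in translate_comic.py: `process_cbz` and the `process_page` callback are hypothetical names, and only .cbz (a plain ZIP) is handled here.

```python
import tempfile
import zipfile
from pathlib import Path

IMAGE_EXTS = {".jpg", ".png", ".webp"}


def process_cbz(src, out_dir, process_page):
    """Extract src, run process_page on each image in place, repack to out_dir."""
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    dst = out_dir / f"translated_{src.stem}.cbz"

    # The temporary directory and its images are deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        with zipfile.ZipFile(src) as zin:
            zin.extractall(tmp_path)

        # Sort pages so the reading order is preserved in the new archive.
        pages = sorted(p for p in tmp_path.rglob("*")
                       if p.suffix.lower() in IMAGE_EXTS)
        with zipfile.ZipFile(dst, "w", zipfile.ZIP_STORED) as zout:
            for page in pages:
                process_page(page)  # hypothetical: translates the image in place
                zout.write(page, page.relative_to(tmp_path).as_posix())
    return dst
```

Images are stored without compression because JPEG, PNG, and WebP data is already compressed.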
You can adjust the following variables inside translate_comic.py:
- TARGET_LANG: Target language code (e.g., "en", "es", "fr").
- MIN_CONFIDENCE: YOLO detection threshold.
- FONT_SIZE: Base font size for overlays.
- DEVICE: Set to "cuda" or "cpu".
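For orientation, these settings might look something like the block below near the top of the script. The values shown are placeholders, not the script's real defaults, and `validate_settings` is a hypothetical helper added only to illustrate the expected ranges.

```python
TARGET_LANG = "en"      # target language code, e.g. "en", "es", "fr"
MIN_CONFIDENCE = 0.5    # placeholder value; YOLO detection threshold
FONT_SIZE = 24          # placeholder value; base font size for overlays
DEVICE = "cuda"         # "cuda" or "cpu"


def validate_settings(target_lang, min_confidence, font_size, device):
    """Return a list of problems with the given settings (empty if valid)."""
    errors = []
    if not isinstance(target_lang, str) or not target_lang:
        errors.append("TARGET_LANG must be a non-empty language code")
    if not 0.0 <= min_confidence <= 1.0:
        errors.append("MIN_CONFIDENCE must be between 0 and 1")
    if font_size <= 0:
        errors.append("FONT_SIZE must be positive")
    if device not in ("cuda", "cpu"):
        errors.append('DEVICE must be "cuda" or "cpu"')
    return errors
```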
- CBR Extraction: If .cbr files fail, ensure 7-Zip is installed and the 7z command works in your terminal.
- VRAM Usage: Florence-2 Large requires significant VRAM. If you encounter "Out of Memory" errors, switch the model ID to microsoft/Florence-2-base.
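Switching models by hand can also be automated. The sketch below tries a list of model IDs in order and moves to the next one on an out-of-memory error. `load_with_fallback` and the `load_model` callback are hypothetical, and the sketch assumes the error surfaces as a MemoryError or as a RuntimeError whose message mentions "out of memory", which is how PyTorch typically reports CUDA OOM.

```python
def load_with_fallback(load_model, model_ids):
    """Try each model ID in order; return (model_id, model) for the first that loads."""
    last_error = None
    for model_id in model_ids:
        try:
            return model_id, load_model(model_id)
        except (MemoryError, RuntimeError) as exc:
            # Re-raise unrelated RuntimeErrors; only OOM triggers the fallback.
            if isinstance(exc, RuntimeError) and "out of memory" not in str(exc).lower():
                raise
            last_error = exc
    raise RuntimeError("No model could be loaded within available memory") from last_error


# Assumed order: large model first, base model as the fallback.
MODEL_IDS = ["microsoft/Florence-2-large", "microsoft/Florence-2-base"]
```

Memory errors can also occur during inference rather than loading, so the same pattern may need to wrap the OCR call as well.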