guan404ming/blt

🥪 BLT - Better Lyrics Translation Toolkit

Python 3.11 · License: Apache

BLT is a toolkit for lyrics translation and singing-voice synthesis. It contains three modular components (Translator, Synthesizer, Pipeline) that can be used independently or combined through predefined pipelines.

Demo

demo.mp4

Quick Start

from blt.translators import SoramimiTranslationAgent

# Soramimi translation (phonetic matching)
agent = SoramimiTranslationAgent()
result = agent.translate(["Your lyrics here"])

print(result.soramimi_lines)  # Phonetically matched translation

Components

1. Translator

IPA-based lyrics translation tools with music constraints:

| Tool | Description |
| --- | --- |
| LyricsTranslationAgent | Main translator with syllable/rhyme preservation |
| SoramimiTranslationAgent | Soramimi (そらみみ / 空耳) translator - creates text that sounds like the original |

Music Constraints Extracted:

  1. syllable_counts: list[int] (e.g. [4, 3]), the number of syllables in each line

    • Chinese: Character-based
    • Other languages: IPA vowel nuclei
  2. syllable_patterns: list[list[int]] (e.g. [[1, 1, 2], [1, 2]]), the syllables per word within each line

    • With audio (WIP): Alignment problem (syncing syllable timing with the vocals)
    • Without audio: Word segmentation problem
      • Chinese: HanLP tokenizer
      • English: Space splitting
      • Other languages: LLM-based
  3. rhyme_scheme: str (e.g. AB), one letter per line

    • Chinese: Pinyin finals
    • Other languages: IPA phonemes
  4. ipa_similarity: float (e.g. 0.5)

    • Phonetic similarity threshold for soramimi translation
    • Measured using IPA phoneme matching between source and target
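
The constraints above can be illustrated with a small standard-library sketch. This is not BLT's implementation: the function names, the vowel inventory, and the use of `difflib.SequenceMatcher` as a stand-in for phoneme matching are all assumptions made for illustration, and the IPA strings are supplied by hand rather than produced by espeak-ng.

```python
import re
from difflib import SequenceMatcher

# Assumed, partial inventory of IPA vowel symbols (not taken from BLT).
IPA_VOWELS = "iyɨʉɯuɪʏʊeøɘɵɤoəɛœɜɞʌɔæɐaɶɑɒɚɝ"
# A nucleus is approximated as a maximal run of vowel symbols, optionally long.
# Limitation: two adjacent nuclei (hiatus) are counted as one.
_NUCLEUS = re.compile(f"[{IPA_VOWELS}]+ː?")


def count_syllables(ipa: str) -> int:
    """Count vowel nuclei in an IPA string (non-Chinese case)."""
    return len(_NUCLEUS.findall(ipa))


def count_chinese_syllables(line: str) -> int:
    """Character-based count: one syllable per CJK unified ideograph."""
    return sum(1 for ch in line if "\u4e00" <= ch <= "\u9fff")


def syllable_pattern(ipa_line: str) -> list[int]:
    """Syllables per word, using space splitting as for English."""
    return [count_syllables(word) for word in ipa_line.split()]


def rhyme_scheme(rhyme_keys: list[str]) -> str:
    """Assign one letter per line; lines with equal keys share a letter.

    A key could be a Pinyin final or a line-final IPA nucleus plus coda.
    """
    labels: dict[str, str] = {}
    for key in rhyme_keys:
        labels.setdefault(key, chr(ord("A") + len(labels)))
    return "".join(labels[key] for key in rhyme_keys)


def phoneme_similarity(source: list[str], target: list[str]) -> float:
    """Similarity in [0, 1] between two phoneme sequences (stand-in metric)."""
    return SequenceMatcher(None, source, target).ratio()


print(syllable_pattern("həˈloʊ wɜːld"))        # syllables per word
print(rhyme_scheme(["ang", "ian", "ang", "ian"]))
print(phoneme_similarity(list("kat"), list("kap")))
```

A translation would then be accepted only if its extracted values match the source constraints, and, for soramimi, if the similarity reaches the configured `ipa_similarity` threshold.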

Translation Flow

flowchart TD
    A[Source Lyrics] --> B[LyricsAnalyzer]
    B --> |Extract Constraints| C{TranslationAgent}
    C --> |Generate Translation| D[Validator]
    D --> |Check Constraints| E{Valid or Max Retries}
    E --> |No| C
    E --> |Yes| F[Target Lyrics]

    style B fill:#64b5f6,stroke:#1976d2,stroke-width:2px,color:#fff
    style C fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
    style D fill:#42a5f5,stroke:#1976d2,stroke-width:2px,color:#fff
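
The generate, validate, retry loop in the flowchart can be sketched as follows. This is a minimal illustration, not the LangGraph implementation used by BLT: `translate_with_retries`, `generate`, `validate`, and `max_retries` are hypothetical names, and passing validator feedback back into the generator is an assumption.

```python
from typing import Callable

# Hypothetical signatures: both callables receive the source lines.
# generate(source, feedback) -> candidate lines
# validate(source, candidate) -> list of constraint violations (empty if valid)
Generate = Callable[[list[str], list[str]], list[str]]
Validate = Callable[[list[str], list[str]], list[str]]


def translate_with_retries(
    source_lines: list[str],
    generate: Generate,
    validate: Validate,
    max_retries: int = 3,
) -> tuple[list[str], bool]:
    """Return (target_lines, is_valid).

    As in the flowchart, the loop ends when the candidate is valid or
    when the retry budget is exhausted; the last candidate is returned
    in both cases.
    """
    feedback: list[str] = []
    candidate: list[str] = []
    for _ in range(max_retries + 1):  # first attempt plus retries
        candidate = generate(source_lines, feedback)
        feedback = validate(source_lines, candidate)
        if not feedback:
            return candidate, True
    return candidate, False
```

Returning the validity flag alongside the lyrics lets a caller distinguish a translation that satisfied every constraint from one that was returned only because the retries ran out.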

2. Synthesizer

| Tool | Description |
| --- | --- |
| VocalSeparator | Vocal / instrumental separation |
| VoiceConverter | Voice conversion (RVC) |
| LyricsAligner | Timing alignment |
| AudioMixer | Audio mixing with automatic resampling |
| VideoGenerator | Video generation (KTV, Lip-Synced) |

3. Pipeline

| Pipeline | Description |
| --- | --- |
| RVCKTVPipeline | RVC voice conversion + KTV video with subtitles |

Requirements

  • Python 3.11+
  • espeak-ng (IPA analysis)
  • Ollama + Qwen3: ollama pull qwen3:30b-a3b-instruct-2507-q4_K_M
  • (Optional) LangSmith API key for tracing/monitoring
  • (Optional) RVC_ZERO for voice conversion

Setup

uv venv --python 3.11
source .venv/bin/activate
uv sync

Model Files

Download and place these model files in assets/:

Acknowledgments

Built with: LangGraph, LangChain, Ollama, PyTorch, Demucs, HanLP, Phonemizer, Panphon, RVC, Wav2Lip, Whisper, Qwen3

This project is intended for research and educational purposes only. All demo content is used for demonstration purposes. If you believe any content infringes on your rights, please contact us and we will remove it promptly.
