🥪 BLT - Better Lyrics Translation Toolkit

BLT is a toolkit for lyrics translation and singing-voice synthesis. It contains three modular components that can be used independently or combined through pre-defined pipelines.

Demo

demo.mp4

Quick Start

from blt.translators import SoramimiTranslationAgent

# Soramimi translation (phonetic matching)
agent = SoramimiTranslationAgent()
result = agent.translate(["Your lyrics here"])

print(result.soramimi_lines)  # Phonetically matched translation

Components

1. Translator

IPA-based lyrics translation tools with music constraints:

| Tool | Description |
| --- | --- |
| `LyricsTranslationAgent` | Main translator with syllable/rhyme preservation |
| `SoramimiTranslationAgent` | Soramimi (そらみみ, 空耳) translator: creates text that sounds like the original |
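
To make "sounds like the original" concrete, the sketch below scores two phoneme sequences with Python's standard `difflib.SequenceMatcher`. It is an illustrative stand-in, not BLT's actual metric: the function name `phoneme_similarity` and the hand-written IPA lists are assumptions made for this example, and the toolkit itself may use a different (for example feature-based) distance.

```python
from difflib import SequenceMatcher


def phoneme_similarity(source: list[str], target: list[str]) -> float:
    """Return a 0..1 similarity between two phoneme sequences.

    Hypothetical helper: uses difflib's ratio (2 * matches / total length)
    as a simple proxy for "how alike do these two lines sound".
    """
    return SequenceMatcher(None, source, target).ratio()


# Hand-written IPA for illustration only (not produced by BLT).
source_ipa = ["h", "ɛ", "l", "oʊ"]
target_ipa = ["h", "a", "l", "oʊ"]

score = phoneme_similarity(source_ipa, target_ipa)
print(score)  # 0.75: three of four phonemes line up
```

A soramimi candidate would then be accepted only if its score reaches the chosen threshold.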

Music Constraints Extracted:

- `syllable_counts`: list[int] (e.g. [4, 3])
  - Chinese: character-based
  - Other languages: IPA vowel nuclei
- `syllable_patterns`: list[list[int]] (e.g. [[1, 1, 2], [1, 2]])
  - With audio (WIP): alignment problem, syncing timing with the vocals
  - Without audio: word segmentation problem
    - Chinese: HanLP tokenizer
    - English: space splitting
    - Other languages: LLM-based
- `rhyme_scheme`: str (e.g. AB)
  - Chinese: Pinyin finals
  - Other languages: IPA phonemes
- `ipa_similarity`: float (e.g. 0.5)
  - Phonetic similarity threshold for soramimi translation
  - Measured using IPA phoneme matching between source and target
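
The constraints listed above can be illustrated with a small standard-library sketch. Everything here is an assumption for illustration: the function names are hypothetical, counting runs of vowel letters is only a crude substitute for IPA vowel nuclei, and the rhyme keys are supplied by hand rather than derived from Pinyin finals or IPA.

```python
import re

CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")
VOWEL_RUN = re.compile(r"[aeiouy]+", re.IGNORECASE)


def count_syllables_chinese(line: str) -> int:
    """Character-based count: one syllable per CJK character."""
    return len(CJK_CHAR.findall(line))


def syllable_pattern_english(line: str) -> list[int]:
    """Space splitting, then a rough per-word syllable estimate.

    Vowel-letter runs stand in for IPA vowel nuclei (an approximation).
    """
    return [max(1, len(VOWEL_RUN.findall(word))) for word in line.split()]


def rhyme_scheme(ending_keys: list[str]) -> str:
    """Label line endings A, B, C... in order of first appearance."""
    labels: dict[str, str] = {}
    scheme = []
    for key in ending_keys:
        if key not in labels:
            labels[key] = chr(ord("A") + len(labels))
        scheme.append(labels[key])
    return "".join(scheme)


pattern = syllable_pattern_english("hello my darling")
print(pattern, sum(pattern))               # [2, 1, 2] 5
print(count_syllables_chinese("你好世界"))  # 4
print(rhyme_scheme(["ing", "ay", "ing", "ay"]))  # ABAB
```

Note that each entry of `syllable_counts` is the sum of the matching entry in `syllable_patterns`, as in the examples [4, 3] and [[1, 1, 2], [1, 2]].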

Translation Flow

flowchart TD
    A[Source Lyrics] --> B[LyricsAnalyzer]
    B --> |Extract Constraints| C{TranslationAgent}
    C --> |Generate Translation| D[Validator]
    D --> |Check Constraints| E{Valid or Max Retries}
    E --> |No| C
    E --> |Yes| F[Target Lyrics]

    style B fill:#64b5f6,stroke:#1976d2,stroke-width:2px,color:#fff
    style C fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
    style D fill:#42a5f5,stroke:#1976d2,stroke-width:2px,color:#fff
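
The generate, validate, retry loop in the flowchart can be sketched in plain Python. This is a minimal sketch under stated assumptions, not BLT's implementation: `translate_with_retries`, its parameters, and the feedback string are hypothetical, only the syllable-count constraint is checked, and the `generate` callable stands in for the LLM-backed translation agent.

```python
from typing import Callable


def translate_with_retries(
    source_lines: list[str],
    target_counts: list[int],
    generate: Callable[[list[str], list[int], str], list[str]],
    count_syllables: Callable[[str], int],
    max_retries: int = 3,
) -> tuple[list[str], bool]:
    """Ask `generate` for a translation until it meets the syllable
    counts or the retry budget is spent. Returns (lines, is_valid)."""
    feedback = ""
    candidate: list[str] = []
    for _ in range(max_retries):
        candidate = generate(source_lines, target_counts, feedback)
        actual = [count_syllables(line) for line in candidate]
        if actual == target_counts:
            return candidate, True
        # Feed the mismatch back so the next attempt can correct it.
        feedback = f"expected {target_counts}, got {actual}"
    return candidate, False


# Toy usage: a fake generator that succeeds on its second attempt.
attempts = iter([["la la"], ["la la la"]])
lines, ok = translate_with_retries(
    ["source line"],
    [3],
    generate=lambda src, counts, fb: next(attempts),
    count_syllables=lambda line: len(line.split()),
)
print(lines, ok)  # ['la la la'] True
```

Returning the last candidate together with a validity flag mirrors the "Valid or Max Retries" node: the loop always ends with some output, and the caller can decide what to do with a translation that did not pass validation.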

2. Synthesizer

| Tool | Description |
| --- | --- |
| `VocalSeparator` | Vocal / instrumental separation |
| `VoiceConverter` | Voice conversion (RVC) |
| `LyricsAligner` | Timing alignment |
| `AudioMixer` | Audio mixing with automatic resampling |
| `VideoGenerator` | Video generation (KTV, Lip-Synced) |
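
As an illustration of what "mixing with automatic resampling" involves, the sketch below brings two mono tracks to a common sample rate before summing them. It does not show `AudioMixer`'s real interface: the function names, the linear-interpolation resampler, and the peak normalization are assumptions chosen to keep the example dependency-free, and a production mixer would typically use a proper band-limited resampler.

```python
def resample_linear(samples: list[float], src_rate: int, dst_rate: int) -> list[float]:
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in source samples
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def mix_tracks(
    vocals: list[float], vocals_rate: int,
    backing: list[float], backing_rate: int,
    target_rate: int = 44100, vocal_gain: float = 1.0,
) -> list[float]:
    """Resample both tracks to target_rate, pad the shorter one, and sum."""
    v = resample_linear(vocals, vocals_rate, target_rate)
    b = resample_linear(backing, backing_rate, target_rate)
    length = max(len(v), len(b))
    v += [0.0] * (length - len(v))
    b += [0.0] * (length - len(b))
    mixed = [vocal_gain * x + y for x, y in zip(v, b)]
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:                         # avoid clipping
        mixed = [s / peak for s in mixed]
    return mixed


print(resample_linear([0.0, 1.0], 1, 2))  # [0.0, 0.5, 1.0, 1.0]
```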

3. Pipeline

| Pipeline | Description |
| --- | --- |
| `RVCKTVPipeline` | RVC voice conversion + KTV video with subtitles |

Requirements

Python 3.11+
espeak-ng (IPA analysis)
Ollama + Qwen3: ollama pull qwen3:30b-a3b-instruct-2507-q4_K_M
(Optional) LangSmith API key for tracing/monitoring
(Optional) RVC_ZERO for voice conversion
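
A quick way to confirm the required tools are present is a small standard-library check like the one below. The helper name `check_requirements` is hypothetical and not part of BLT; it assumes the executables are installed under their usual names, `espeak-ng` and `ollama`, and only tests that they are on the `PATH`, not that the Qwen3 model has been pulled.

```python
import shutil
import sys


def check_requirements(
    min_python: tuple[int, int] = (3, 11),
    tools: tuple[str, ...] = ("espeak-ng", "ollama"),
) -> dict[str, bool]:
    """Report whether the interpreter and command-line tools are available."""
    status = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        status[tool] = shutil.which(tool) is not None
    return status


for name, found in check_requirements().items():
    print(f"{name}: {'ok' if found else 'missing'}")
```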

Setup

uv venv --python 3.11
source .venv/bin/activate
uv sync

Model Files

Download and place these model files in assets/:

Wav2Lip model (for lip-sync): assets/wav2lip_gan.pth
- Download: https://huggingface.co/Nekochu/Wav2Lip/blob/main/wav2lip_gan.pth
RVC model (for voice conversion): assets/model.pth and assets/model.index
- Download: https://huggingface.co/spaces/r3gm/rvc_zero or train your own
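
Before running the lip-sync or voice-conversion steps, it can help to verify that the files listed above are in place. The sketch below is an assumption-laden convenience script, not part of BLT: `missing_model_files` is a hypothetical helper that simply checks for the three file names given in this section.

```python
from pathlib import Path

# File names taken from the list above, grouped by the model that needs them.
EXPECTED_FILES = {
    "Wav2Lip (lip-sync)": ["wav2lip_gan.pth"],
    "RVC (voice conversion)": ["model.pth", "model.index"],
}


def missing_model_files(assets_dir: str = "assets") -> list[str]:
    """Return the expected model file names not found in assets_dir."""
    root = Path(assets_dir)
    return [
        name
        for names in EXPECTED_FILES.values()
        for name in names
        if not (root / name).is_file()
    ]


missing = missing_model_files()
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files found.")
```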

Acknowledgments

Built with: LangGraph, LangChain, Ollama, PyTorch, Demucs, HanLP, Phonemizer, Panphon, RVC, Wav2Lip, Whisper, Qwen3

This project is intended for research and educational purposes only. All demo content is used for demonstration purposes. If you believe any content infringes on your rights, please contact us and we will remove it promptly.
