Push-to-talk voice dictation for Linux.
Single binary - Private - Fast - Customizable
- Push-to-talk — hold a hotkey to record, release to transcribe and paste
- Local inference — runs Whisper locally, your audio never leaves your machine
- CUDA GPU acceleration — optional NVIDIA GPU support for sub-second transcription
- OpenAI API fallback — use the OpenAI Whisper API as an alternative backend
- Text replacements — custom post-processing rules for text replacement
dictr # Default: AltGr hotkey, local whisper, xdotool type
dictr --hotkey F9 # Use F9 instead of AltGr
dictr --backend api # Use OpenAI Whisper API (requires OPENAI_API_KEY)
dictr --api-url http://... # Custom API endpoint
dictr --model /path/to/model # Specific model file
dictr --paste # Use clipboard paste (better for accents/Unicode)
dictr --device AT2020 # Select mic by name substring
dictr --list-devices # List available input devices
dictr --language fr # Transcribe in French
dictr --initial-prompt '...' # Guide transcription with context
dictr --min-duration 500 # Min recording duration in ms (default: 300)
dictr --verbose # Debug output

Install with the install script:
curl -fsSL https://raw.githubusercontent.com/mwmdev/dictr/main/install.sh | sh

Or install with Cargo:
cargo install dictr

Then download a Whisper model to ~/.local/share/dictr/models/.
Requires Linux with X11, xdotool, xclip, ALSA or PipeWire, plus build deps: cmake, clang, pkg-config, libasound2-dev, libx11-dev, libxi-dev, libxtst-dev, libxrandr-dev, libssl-dev. For CUDA: NVIDIA CUDA toolkit.
cargo build --release # CPU only
cargo build --release --features cuda # With GPU

On NixOS, use nix-shell --run "cargo build --release".
~/.config/dictr/config.toml:
hotkey = "AltGr" # Supported hotkeys: AltGr, Alt, Ctrl, RCtrl, Shift, RShift, Super, CapsLock, Space, Escape, F1-F12
backend = "local" # "local" or "api"
model_path = "~/.local/share/dictr/models/ggml-base.bin"
api_key = "" # or set OPENAI_API_KEY env var
api_url = "https://api.openai.com/v1/audio/transcriptions"
typing_delay_ms = 2
min_duration_ms = 300
device = "AT2020USB+"
language = "en"
initial_prompt = "commit, readme, build, test, deploy, refactor" # Guide transcription with context (e.g. expected words, domain-specific terms)
[replacements]
"slash " = "/"
"new line" = "\n"

The [replacements] table substitutes text in the transcription output: each key found in the final transcribed text is replaced with its corresponding value. This is useful for spoken shortcuts such as "slash" → "/" or "new line" → "\n".
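As a rough illustration of how a [replacements] table could be applied, here is a minimal Rust sketch. It assumes plain, case-sensitive substring substitution with rules applied in the order given; the function name `apply_replacements` is hypothetical and dictr's actual implementation may differ.

```rust
/// Hypothetical helper (not taken from dictr's source): applies each
/// (key, value) rule in order as a plain substring substitution.
fn apply_replacements(text: &str, rules: &[(&str, &str)]) -> String {
    let mut out = text.to_string();
    for &(from, to) in rules {
        // Skip empty keys: replacing "" would insert `to` between every character.
        if from.is_empty() {
            continue;
        }
        // Assumption: matching is literal and case-sensitive.
        out = out.replace(from, to);
    }
    out
}
```

With the two rules from the config above, "open slash etc" would become "open /etc". Note that the "slash " key includes a trailing space, so the space after the spoken word is consumed together with it. Rules are passed as an ordered slice in this sketch because the result can depend on order when one rule's output contains another rule's key.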
Licensed under either of MIT or Apache-2.0 at your option.