Fix ydotool virtual input device leak (#7) #8
Merged
7 commits
af1416c Add typing-tool fallback chain to fix ydotool virtual input device le… (csheaff)
61c8284 Address review: add pgrep mock, ydotool+daemon test, warn test, fix e… (csheaff)
c174f7b Make type_text try-and-fallback instead of detect-then-run (csheaff)
3a3dc33 Refuse bare ydotool in auto mode, remove Parakeet from README (csheaff)
45751ca Allow bare ydotool as last-resort fallback with warning (csheaff)
5262a56 Skip xdotool on Wayland sessions (returns 0 but types nothing) (csheaff)
c98c5ea Fix README: update line count, correct typing tool descriptions (csheaff)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| # CLAUDE.md | ||
| | ||
| This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. | ||
| | ||
| ## What is talktype | ||
| | ||
| Push-to-talk speech-to-text for Linux. Press a hotkey to record, press again to transcribe and type at cursor. No GUI — just a keyboard shortcut bound to the `talktype` script. Works on Wayland (GNOME, Sway, Hyprland) and X11. | ||
| | ||
| ## Build and install | ||
| | ||
| ```bash | ||
| make install # Full setup: system deps + Python venv + symlink to ~/.local/bin/talktype | ||
| make deps # System packages only (requires sudo): ydotool, ffmpeg, pipewire, etc. | ||
| make venv # Python venv with faster-whisper only | ||
| make parakeet # Install Parakeet backend venv (in backends/.parakeet-venv/) | ||
| make moonshine # Install Moonshine backend venv (in backends/.moonshine-venv/) | ||
| make model # Pre-download Whisper model | ||
| make clean # Remove .venv | ||
| make uninstall # Remove ~/.local/bin/talktype symlink | ||
| ``` | ||
| | ||
| ## Testing | ||
| | ||
| Tests use [BATS](https://github.com/bats-core/bats-core) (Bash Automated Testing System): | ||
| | ||
| ```bash | ||
| make test # Run all tests | ||
| bats test/talktype.bats # Core tests (recording lifecycle, transcription, error handling) | ||
| bats test/server.bats # Server mode tests (daemon lifecycle, socket communication) | ||
| bats test/backends.bats # Integration tests against real backends + NASA audio fixture | ||
| ``` | ||
| | ||
| Tests use mocks in `test/mocks/` to avoid requiring actual GPU, models, or system tools. The mock daemon (`test/mock-daemon.py`) simulates server backends. | ||
| | ||
| ## Linting | ||
| | ||
| CI runs ShellCheck on all Bash scripts and Python syntax checks on all Python files: | ||
| | ||
| ```bash | ||
| shellcheck talktype transcribe-server backends/*-server | ||
| python3 -m py_compile transcribe whisper-daemon.py backends/*-daemon.py | ||
| ``` | ||
| | ||
| ## Architecture | ||
| | ||
| **Core flow:** hotkey → `talktype` (Bash) → record audio (ffmpeg/pw-record) → call `$TALKTYPE_CMD` with WAV path → type result via `type_text` (wtype/ydotool/xdotool). | ||
| | ||
| **Main script** (`talktype`, ~160 lines Bash): manages recording state via PID file (`$TALKTYPE_DIR/rec.pid`), sends desktop notifications, delegates transcription to `$TALKTYPE_CMD`. | ||
| | ||
| **Backend pattern — two modes per backend:** | ||
| - **Direct invocation** (`transcribe`, `backends/parakeet`, `backends/moonshine`): Python scripts that load model, transcribe, exit. Simple but slow (model reload each time). | ||
| - **Server mode** (`transcribe-server`, `backends/*-server` + `*-daemon.py`): Bash wrapper manages a Python Unix socket daemon that keeps the model in memory. Subcommands: `start`, `stop`, `transcribe`. Auto-starts daemon if not running. | ||
| | ||
| **Adding a custom backend:** Any executable that takes a WAV file path as its last argument and prints text to stdout. Set `TALKTYPE_CMD` in config. | ||
| | ||
| ## Configuration | ||
| | ||
| Config file: `~/.config/talktype/config` (sourced as shell script by `talktype`). Key variables: | ||
| | ||
| - `TALKTYPE_CMD` — transcription command (default: direct faster-whisper via `transcribe`) | ||
| - `TALKTYPE_VENV` — Python venv path (default: `.venv` in script dir) | ||
| - `TALKTYPE_DIR` — runtime dir for PID/audio files (default: `$XDG_RUNTIME_DIR/talktype`) | ||
| - `TALKTYPE_TYPE_CMD` — typing tool (`auto`, `wtype`, `ydotool`, `xdotool`, or custom command; default: `auto`) | ||
| - `WHISPER_MODEL`, `WHISPER_LANG`, `WHISPER_DEVICE`, `WHISPER_COMPUTE` — Whisper settings | ||
| | ||
| ## Key conventions | ||
| | ||
| - Core is intentionally pure Bash. Python is only used for ML model invocation. | ||
| - Follows Unix philosophy: small scripts, stdin/stdout interfaces, pluggable components. | ||
| - Server daemons communicate via Unix sockets using `socat`. | ||
| - State files (PID, audio, notification ID) live in `$TALKTYPE_DIR` (XDG runtime dir). |
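
The commit titles in this PR describe `type_text` as a try-and-fallback chain: skip xdotool on Wayland, detect `ydotoold` with `pgrep`, and allow bare ydotool only as a last resort with a warning. A minimal sketch of how such a chain could look is given here. The tool order, the helper names `ydotoold_running` and `on_wayland`, the warning text, and the exact tool arguments are all assumptions for illustration; this is not the code from the `talktype` script.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a try-and-fallback typing chain.
# Tool order, helper names, and messages are assumptions, not talktype's code.

# Succeeds when a ydotoold process is visible to pgrep.
ydotoold_running() {
  pgrep -x ydotoold >/dev/null 2>&1
}

# Succeeds when the session looks like Wayland.
on_wayland() {
  [ -n "${WAYLAND_DISPLAY:-}" ] || [ "${XDG_SESSION_TYPE:-}" = "wayland" ]
}

# Type "$1" at the cursor, moving on when a tool is missing or fails.
type_text() {
  local text=$1

  # 1. Try wtype; a non-zero exit falls through to the next tool.
  if command -v wtype >/dev/null 2>&1 && wtype "$text"; then
    return 0
  fi

  # 2. Try ydotool, but only when its daemon is already running.
  if command -v ydotool >/dev/null 2>&1 && ydotoold_running \
      && ydotool type "$text"; then
    return 0
  fi

  # 3. Try xdotool, skipped on Wayland where it can exit 0 without typing.
  if ! on_wayland && command -v xdotool >/dev/null 2>&1 \
      && xdotool type "$text"; then
    return 0
  fi

  # 4. Last resort: bare ydotool, with a warning on stderr.
  if command -v ydotool >/dev/null 2>&1; then
    echo "talktype: ydotoold is not running; using bare ydotool" >&2
    ydotool type "$text" && return 0
  fi

  echo "talktype: no working typing tool found" >&2
  return 1
}
```

Running each tool and checking its exit status, rather than only checking that the binary exists, is what lets the chain recover when a tool is installed but does not work in the current session.
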
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| #!/usr/bin/env bash | ||
| # Mock pgrep: report no matching process by default (exit 1) | ||
| # Set MOCK_PGREP_EXIT=0 in tests that need ydotoold detection | ||
| exit "${MOCK_PGREP_EXIT:-1}" |
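
As an illustration of how a test might use this mock, the sketch below recreates it in a temporary directory, puts that directory first on `PATH`, and checks the daemon in both states. The directory layout and the `daemon_status` helper are assumptions for the example; the project's own BATS tests may wire the mock up differently.

```bash
#!/usr/bin/env bash
# Sketch: flipping a pgrep mock between "no process" and "process found".
# mock_dir and daemon_status are hypothetical names for this example.

unset MOCK_PGREP_EXIT

# Recreate the mock in a throwaway directory and shadow the real pgrep.
mock_dir=$(mktemp -d)
cat > "$mock_dir/pgrep" <<'EOF'
#!/usr/bin/env bash
exit "${MOCK_PGREP_EXIT:-1}"
EOF
chmod +x "$mock_dir/pgrep"
export PATH="$mock_dir:$PATH"

# Print "running" or "absent" based on pgrep's exit status.
daemon_status() {
  if pgrep -x ydotoold >/dev/null 2>&1; then
    echo "running"
  else
    echo "absent"
  fi
}

# Default: the mock exits 1, so the daemon looks absent.
status_default=$(daemon_status)

# Override: export MOCK_PGREP_EXIT=0 inside the subshell only.
status_forced=$(export MOCK_PGREP_EXIT=0; daemon_status)

echo "default: $status_default"
echo "forced:  $status_forced"
```

Because the override is exported inside a command substitution, it does not leak into later test cases.
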
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| #!/usr/bin/env bash | ||
| # Mock wtype: log the command and args | ||
| echo "$@" >> "$TALKTYPE_DIR/wtype.log" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| #!/usr/bin/env bash | ||
| # Mock xdotool: log the command and args | ||
| echo "$@" >> "$TALKTYPE_DIR/xdotool.log" |
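
The logging mocks make it possible to assert which typing tool was called and with what arguments. The following sketch shows one way to do that outside BATS; the temporary directories and the sample text are assumptions for the example, not taken from the project's test suite.

```bash
#!/usr/bin/env bash
# Sketch: asserting on the log written by a logging mock.
# Directory names and the sample text are hypothetical.

# The mock writes its log under TALKTYPE_DIR, so it must be exported.
TALKTYPE_DIR=$(mktemp -d)
export TALKTYPE_DIR

# Recreate the xdotool mock and place it first on PATH.
mock_dir=$(mktemp -d)
cat > "$mock_dir/xdotool" <<'EOF'
#!/usr/bin/env bash
echo "$@" >> "$TALKTYPE_DIR/xdotool.log"
EOF
chmod +x "$mock_dir/xdotool"
export PATH="$mock_dir:$PATH"

# Call the tool the way a typing function might.
xdotool type "hello world"

# Read back what the mock recorded.
logged=$(cat "$TALKTYPE_DIR/xdotool.log")
echo "logged: $logged"
```

Note that `echo "$@"` joins the arguments with spaces, so the log cannot distinguish `"hello world"` passed as one argument from `hello` and `world` passed as two.
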