Cross-platform wrapper scripts for `llama-server` that simplify running `.gguf` models with intelligent configuration management.
Scripts: `llms` (UNIX/WSL) and `llms.ps1` (PowerShell).
- Optimal Settings Retention: Automatically saves and loads optimal settings for each model.
- Flexible Discovery: Discover and run models using partial names.
- Priority-based Config: Seamlessly handles CLI arguments, environment variables, and config files.
- Dry-run Capability: Preview commands without execution.
- Configure Model Directories: Create `~/.config/llms.ini` (or `%APPDATA%\llms.ini` on Windows):

  ```ini
  ModelsDirs = /path/to/your/models
  ```

- Run a Model:

  ```sh
  llms Mistral 64000   # First run: specify context size (saved automatically)
  llms Mistral         # Subsequent runs: settings remembered
  ```

- List Models:

  ```sh
  llms list
  ```
Usage:

```sh
llms <partial_name> [<context_size>] [llama-server args...] [--dry-run]
```

- Partial Name: Case-insensitive match against `.gguf` files.
- Context Size: Required for the first run, optional thereafter.
- Arguments: Any `llama-server` flag (e.g., `--mlock`, `--n-gpu-layers 30`).
- Dry Run: Use `--dry-run` to preview the command without executing or saving config (see the example after this list).
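For example, a first run that combines a partial name, a context size, and extra `llama-server` flags can be previewed with `--dry-run` before anything is saved (the model name and flag values here are illustrative):

```sh
# Print the full llama-server command without launching it or saving config
llms Mistral 32768 --mlock --n-gpu-layers 30 --dry-run
```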
Companion `.mmproj` files (for multi-modal models) are detected and loaded automatically if they follow a specific naming convention.

Naming Convention:
The companion file must be named `{BaseName}.mmproj{Suffix}.gguf`, where `{BaseName}` is a prefix of the main model's filename.

Example:
- Main Model: `Qwen3-VL-30B-A3B-Thinking-UD-Q4.gguf`
- Companion: `Qwen3-VL-30B-A3B-Thinking-UD.mmproj-F16.gguf`

The script automatically finds the companion file because `Qwen3-VL-30B-A3B-Thinking-UD` is a prefix of the main model's filename.
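To check which companion file will be used, a dry run prints the resolved command before anything is launched (this sketch assumes the wrapper hands the companion to `llama-server` via its `--mmproj` flag; the exact flag is determined by the script, not by this example):

```sh
# The printed command should include the resolved .mmproj path
llms Qwen3-VL --dry-run
```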
- Persistent: `llms Mistral --n-gpu-layers 30` (saves to `.ini`)
- Temporary (Global): `LLMS_PORT=9090 llms Mistral` (ENV only)
| Type | Priority Chain |
|---|---|
| Per-Model | CLI Args > .ini File > Default Values |
| Server-Wide | Environment Variables > llms.ini > Default Values |
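As a concrete illustration of the per-model chain (the flag and its values are illustrative): a value passed on the command line takes precedence over whatever is stored in the `.ini` file, and a run without the flag falls back to the stored value:

```sh
# First run: saves --n-gpu-layers 30 for Mistral (persisted per-model flag)
llms Mistral --n-gpu-layers 30
# CLI argument takes precedence over the stored value
llms Mistral --n-gpu-layers 10
# No CLI argument: the value stored in the .ini is used
llms Mistral
```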
Note: Since version 1.3.0, both scripts ignore per-model ENV variables (e.g., `LLMS_CTX_SIZE`) in favor of explicit CLI/config settings.
- Script directory: `./llms.ini`
- User config: `~/.config/llms.ini` (UNIX) or `%USERPROFILE%\AppData\Local\llms.ini` (Windows)
- Persisted (Per-Model): `--mlock`, `--no-mmap`, `--jinja`, `--cont-batching`, etc.
- Transient (Server-Wide): `--no-webui`, `--verbose`, `--dry-run`, `--help` (see the example below).
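To see the distinction in practice (the flag choices are illustrative): a persisted flag given once is remembered on later runs, while a transient flag only applies to the invocation it was given on:

```sh
# --mlock is written to the .ini; --verbose affects only this run
llms Mistral --mlock --verbose
# Next run: --mlock is applied from the .ini automatically, --verbose is not
llms Mistral --dry-run
```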
Run the Pester tests for PowerShell:

```powershell
Invoke-Pester ./llms.tests.ps1
```

- "No model file found": Check `ModelsDirs` in `llms.ini` or run `llms list`.
- "Specify <context_size>": The model has no saved config yet. Run it once with a numeric context size.
- "llama-server not found": Ensure `llama-server` is in your system `$PATH` (see the check below).
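A quick way to verify the last point on UNIX-like systems (on Windows/PowerShell, `Get-Command llama-server` serves the same purpose):

```sh
# Prints the resolved path if llama-server is on $PATH; prints nothing otherwise
command -v llama-server
```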