Conversation

@devin-ai-integration bot (Contributor) commented on Jan 2, 2026:

Summary

Updates the leap-bundle documentation to reflect that GGUF is now the default inference engine for model bundling (generating .gguf files for llama.cpp), with ExecuTorch available via the --executorch flag.
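
For reviewers unfamiliar with the CLI, the documented invocations look roughly like this (a sketch based on this PR's description only; `<model-path>` is a placeholder, and the exact arguments should be confirmed against `leap-bundle create --help`):

```sh
# Default engine: GGUF bundling for llama.cpp; output is a .gguf file.
# Q4_K_M is the default quantization per this PR's checklist.
leap-bundle create <model-path> --quantization Q4_K_M

# Opt in to the deprecated ExecuTorch engine; output is a .bundle file.
leap-bundle create <model-path> --executorch

# VL/audio models: pick the mmproj quantization (q4, q8, or f16; default q8).
leap-bundle create <model-path> --mmproj-quantization q8
```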

Changes across three files:

  • quick-start.mdx: Updated intro to document both inference engines, updated examples to show GGUF output
  • cli-spec.mdx: Added --executorch flag, updated --quantization options for both engines, added --mmproj-quantization option for VL/audio models
  • changelog.md: Added v0.9.0 entry documenting the new features

Updates since last revision

Addressed PR feedback:

  • Added deprecation notes for ExecuTorch inference across all documentation files
  • Fixed the llama.cpp quantization types link to point to specific source location

Review & Testing Checklist for Human

  • Verify the deprecation wording for ExecuTorch is appropriate ("may be removed in a future version")
  • Verify the --executorch flag name matches the actual CLI implementation
  • Verify GGUF quantization options (Q4_K_M default, Q8_0, F16, etc.) match the CLI
  • Verify --mmproj-quantization valid options (q4, q8, f16) and default (q8) are correct
  • Confirm v0.9.0 is the correct version number for this release

Recommended test plan (sketched as shell commands after the list):

  1. Run leap-bundle create --help and verify the documented options match
  2. Try creating a GGUF bundle and verify the output file extension is .gguf
  3. Try creating an ExecuTorch bundle with --executorch and verify it produces a .bundle file
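
As a concrete sketch of that plan (again assuming `<model-path>` stands in for whatever arguments the CLI actually accepts):

```sh
# 1. Check that the documented options match the CLI's own help text.
leap-bundle create --help

# 2. Default bundle should come out as GGUF.
leap-bundle create <model-path>
ls *.gguf

# 3. ExecuTorch bundle via the opt-in flag should come out as .bundle.
leap-bundle create <model-path> --executorch
ls *.bundle
```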

Notes

Documentation changes only; no code changes. I referenced the source code in leap/apps/leap-bundle-py to understand the implementation but could not test the CLI directly.

Link to Devin run: https://app.devin.ai/sessions/a0796c785423493fb2205ed3458ba6e9
Requested by: Liren (liren@liquid.ai) / @tuliren

- Update quick-start.mdx to document GGUF as default inference engine
- Update cli-spec.mdx with --executorch flag and GGUF quantization options
- Add v0.9.0 changelog entry for GGUF bundling features

Co-Authored-By: Liren <tuliren@gmail.com>
@devin-ai-integration bot (Contributor, Author) commented:

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@vercel bot commented on Jan 2, 2026:

The latest updates on your projects. Learn more about Vercel for GitHub.

Project   Deployment   Review             Updated (UTC)
docs      Ready        Preview, Comment   Jan 2, 2026 7:12am

@tuliren (Contributor) left a comment:

Note in the doc that executorch inference is deprecated, and may be removed in a future version.

…tion link

Co-Authored-By: Liren <tuliren@gmail.com>
@tuliren changed the title from "docs(leap-bundle): update documentation for GGUF bundling support" to "Update leap-bundle docs for new GGUF support" on Jan 2, 2026
@tuliren merged commit b58e0b0 into main on Jan 2, 2026 (4 checks passed)
@tuliren deleted the devin/1767337068-update-leap-bundle-docs-gguf branch on Jan 2, 2026 at 07:13