Update leap-bundle docs for new GGUF support #36
Merged · +49 −28
Summary
Updates the leap-bundle documentation to reflect that GGUF is now the default inference engine for model bundling (generating `.gguf` files for llama.cpp), with ExecuTorch available via the `--executorch` flag.

Changes across three files: added the `--executorch` flag, updated the `--quantization` options for both engines, and added the `--mmproj-quantization` option for VL/audio models.

Updates since last revision
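As a quick illustration of the documented options, here is a minimal sketch of possible invocations. The flag names come from this PR; the `<model>` positional argument and `<quant>` placeholder values are assumptions for illustration only and are not taken from the CLI or its docs:

```bash
# Sketch only: flag names are from the updated docs; <model> and <quant>
# are placeholders, not verified against the actual CLI.

# Default engine (GGUF): expected to produce a .gguf file for llama.cpp
leap-bundle create <model> --quantization <quant>

# ExecuTorch engine: opt in via the new flag, expected to produce a .bundle
leap-bundle create <model> --executorch --quantization <quant>

# VL/audio models: quantize the multimodal projector separately (default q8)
leap-bundle create <model> --mmproj-quantization q8

# Confirm the documented options appear in the CLI help text
leap-bundle create --help | grep -E -e '--(executorch|quantization|mmproj-quantization)'
```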
Addressed PR feedback.
Review & Testing Checklist for Human
- `--executorch` flag name matches the actual CLI implementation
- `--mmproj-quantization` valid options (q4, q8, f16) and default (q8) are correct

Recommended test plan:
- Run `leap-bundle create --help` and verify the documented options match
- Create a bundle with the default engine and verify it produces `.gguf`; create one with `--executorch` and verify it produces `.bundle`

Notes
Documentation changes only; no code changes. I referenced the source code in `leap/apps/leap-bundle-py` to understand the implementation but could not test the CLI directly.

Link to Devin run: https://app.devin.ai/sessions/a0796c785423493fb2205ed3458ba6e9
Requested by: Liren (liren@liquid.ai) / @tuliren