added selective filtering to kit import #1053
Open
Allows users to filter files during import from Hugging Face and Git using glob patterns.
Description
This PR introduces selective file filtering for the `kit import` command via a new `--filter` flag. This feature allows users to provide glob patterns to include only specific files or directories when importing from Hugging Face or Git repositories.
Key Benefits:
- Bandwidth Efficiency: For Hugging Face imports, filtering happens at the API level. `kit` only downloads files that match the user's patterns, avoiding the need to download large, unnecessary model formats (e.g., skipping `.safetensors` when only `.gguf` is needed).
- Reduced Storage Footprint: For Git imports, unmatched files are pruned from the temporary workspace before the ModelKit is packed.
- Improved Workflow: Users can now import only the branches or sub-folders they need from large repositories.
Technical Details:
- New Flag: Added `--filter` (StringSlice) to `importOptions`.
- Hugging Face Implementation: Integrated the filter into the `importUsingHF` flow. It modifies the `DirectoryListing` before the download phase begins.
- Git Implementation: Integrated the filter into the `importUsingGit` flow. It uses a new `removeUnmatchedFiles` helper to clean up the cloned repository after the clone completes and before the ModelKit is packed.
- Unified Filtering: Created a reusable `filterDirectoryListing` utility in `pkg/cmd/kitimport/util.go` that uses `path.Match` for cross-platform globbing support.
- Unit Testing: Added comprehensive tests in `pkg/cmd/kitimport/util_test.go` covering extension-based filtering, path-based filtering, and multiple filter combinations.
Linked issues
Fixes # (Manual contribution)
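For reviewers, the glob filtering described under Technical Details can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `filterPaths` and `matchesAny` are hypothetical names, the listing is simplified to a plain slice of slash-separated paths rather than the real `DirectoryListing` type, and trying each pattern against both the full path and the base name is an assumption made here because `path.Match` never lets `*` cross a `/`.

```go
package main

import (
	"fmt"
	"path"
)

// matchesAny reports whether p matches at least one glob pattern.
// Assumption (not confirmed by the PR): each pattern is tried against the
// full slash-separated path and against the base name, so "*.gguf" also
// matches "gguf/q4.gguf".
func matchesAny(p string, patterns []string) (bool, error) {
	for _, pat := range patterns {
		for _, candidate := range []string{p, path.Base(p)} {
			ok, err := path.Match(pat, candidate)
			if err != nil {
				// path.Match only fails on a malformed pattern.
				return false, fmt.Errorf("invalid filter %q: %w", pat, err)
			}
			if ok {
				return true, nil
			}
		}
	}
	return false, nil
}

// filterPaths keeps only the paths that match one of the patterns.
// An empty pattern list keeps everything (assumed default behavior).
func filterPaths(paths, patterns []string) ([]string, error) {
	if len(patterns) == 0 {
		return paths, nil
	}
	var kept []string
	for _, p := range paths {
		ok, err := matchesAny(p, patterns)
		if err != nil {
			return nil, err
		}
		if ok {
			kept = append(kept, p)
		}
	}
	return kept, nil
}

func main() {
	files := []string{"README.md", "model.safetensors", "gguf/q4.gguf", "tokenizer/vocab.json"}
	kept, err := filterPaths(files, []string{"*.gguf", "tokenizer/*"})
	if err != nil {
		panic(err)
	}
	fmt.Println(kept) // prints [gguf/q4.gguf tokenizer/vocab.json]
}
```

Using `path.Match` rather than `filepath.Match` keeps the separator fixed at `/` on every platform, which suits repository-relative paths. The trade-off is that there is no recursive `**` wildcard, so a pattern such as `tokenizer/*` matches only direct children of that directory.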
Example Usage: