TheCoder2010-create commented Jan 8, 2026

Allows users to filter files during import from Hugging Face and Git using glob patterns.

Description

This PR introduces selective file filtering for the kit import command via a new --filter flag. This feature allows users to provide glob patterns to include only specific files or directories when importing from Hugging Face or Git repositories.

Key Benefits:

  1. Bandwidth Efficiency: For Hugging Face imports, filtering happens at the API level. kit only downloads files that match the user's patterns, avoiding the need to download large, unnecessary model formats (e.g., skipping .safetensors when only .gguf is needed).

  2. Reduced Storage Footprint: For Git imports, unmatched files are pruned from the temporary workspace before the ModelKit is packed.

  3. Improved Workflow: Users can now import only the files or sub-folders they need from large repositories.

Technical Details:

  1. New Flag: Added --filter (StringSlice) to importOptions.

  2. Hugging Face Implementation: Integrated the filter into the importUsingHF flow. It modifies the DirectoryListing before the download phase begins.

  3. Git Implementation: Integrated the filter into the importUsingGit flow. It utilizes a new removeUnmatchedFiles helper to clean up the cloned repository post-download but pre-pack.

  4. Unified Filtering: Created a reusable filterDirectoryListing utility in pkg/cmd/kitimport/util.go that uses path.Match for cross-platform globbing support.

  5. Unit Testing: Added tests in pkg/cmd/kitimport/util_test.go covering extension-based filtering, path-based filtering, and multiple filter combinations.

Linked issues
Fixes # (Manual contribution)

Example Usage:

# Import only GGUF files
kit import bartowski/Llama-3.2-3B-Instruct-GGUF --filter "*.gguf"

# Import a specific module from a repository
kit import microsoft/phi-2 --filter "onnx/*"
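
One caveat worth checking against the patterns above: in Go's `path.Match`, `*` matches any run of characters except `/`. If the filter is applied to full repository-relative paths (an assumption about how this PR feeds paths to the matcher), then `*.gguf` only selects files at the repository root and `onnx/*` only selects direct children of `onnx/`. The sketch below illustrates this with a made-up file listing; `keepMatching` is a hypothetical helper, not the PR's `filterDirectoryListing`.

```go
package main

import (
	"fmt"
	"path"
)

// keepMatching returns the paths that match at least one of the patterns,
// preserving input order. Hypothetical helper for illustration only.
func keepMatching(paths, patterns []string) ([]string, error) {
	var kept []string
	for _, p := range paths {
		for _, pat := range patterns {
			ok, err := path.Match(pat, p)
			if err != nil {
				return nil, fmt.Errorf("invalid filter %q: %w", pat, err)
			}
			if ok {
				kept = append(kept, p)
				break // one matching pattern is enough
			}
		}
	}
	return kept, nil
}

func main() {
	// Made-up listing; not the contents of any real repository.
	listing := []string{
		"README.md",
		"model-Q4_K_M.gguf",
		"weights/model.gguf",
		"onnx/model.onnx",
		"onnx/int8/model.onnx",
	}

	for _, pat := range []string{"*.gguf", "onnx/*"} {
		kept, err := keepMatching(listing, []string{pat})
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s keeps %v\n", pat, kept)
	}
}
```

In this sketch `weights/model.gguf` and `onnx/int8/model.onnx` are not selected by either pattern, so users who expect recursive matching would need to pass additional patterns such as `*/*.gguf` or `onnx/*/*`.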

Allows users to filter files during import from Hugging Face and Git using glob patterns.

Signed-off-by: Manav Sutar <sutarmanav557@gmail.com>
TheCoder2010-create changed the title from "feat: add selective filtering to kit import" to "added selective filtering to kit import" Jan 9, 2026
amisevsk left a comment

I'm not sure this is a good approach for what this PR is trying to do. Is there a use-case covered here that isn't supported by the currently available options? The current kit import command supports

  1. Manually editing the generated Kitfile before proceeding with the import, which allows you to select which files to include in the ModelKit
  2. Providing an existing Kitfile, via the --file flag, to do the same as above

In both cases, the benefits (e.g. reduced bandwidth/storage) are present with the above options -- Kit will only download the files necessary (in the huggingface case, at least). The --filter flag seems like a less ergonomic and more error-prone way to achieve the same goal (since you have to match paths from a remote server).

Perhaps an issue explaining the problem you are trying to solve would be helpful here.

Edit to add: Additionally, the filter approach pushes users towards more error-prone usage, as it becomes easy to e.g. omit licenses and readme files, or necessary configuration metadata, without realizing it.

A PR that updates the CLI should not incidentally update pnpm-lock.yaml
