Description
Currently, sphinx-codelinks can optionally use the .gitignore file at the source root (src_dir) to exclude files during source discovery. However, several enhancements are needed to make this feature more robust and flexible.
Current Behavior
When gitignore = true is set in the configuration, the tool uses the .gitignore file at the source root to filter out files. This works for basic cases but has limitations.
Requested Enhancements
- Support nested .gitignore files: Git supports .gitignore files in subdirectories, which apply to files within that directory and its children. Currently, only the root .gitignore is considered.
- Support .gitignore patterns in configuration: Allow users to specify gitignore-style patterns directly in the codelinks.toml configuration without needing an actual .gitignore file. This is useful for:
  - CI/CD environments where the .gitignore might not be available
  - Excluding files that shouldn't be in .gitignore but should be excluded from traceability
  - Bazel builds where the source tree structure differs from the git repository
- Support global gitignore: Respect the global gitignore file (~/.gitignore_global or configured via core.excludesFile)
- Support .git/info/exclude: This file contains patterns specific to the local repository that aren't shared via .gitignore
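One way the nested .gitignore request could be approached is sketched below, assuming Python 3.9+. The helper names (find_gitignore_files, build_matchers, is_ignored) are hypothetical and not part of sphinx-codelinks; the parse argument stands in for a parser such as parse_gitignore from gitignore_parser. The sketch deliberately leaves out Git's rule that a deeper negation pattern (!pattern) can re-include a file ignored higher up, and it does not cover the global gitignore or .git/info/exclude.

```python
from pathlib import Path
from typing import Callable, List, Tuple

# A matcher takes a file path and returns True when the file is ignored.
Matcher = Callable[[str], bool]


def find_gitignore_files(src_dir: Path) -> List[Path]:
    """Return every .gitignore under src_dir, shallowest first."""
    return sorted(src_dir.rglob(".gitignore"), key=lambda p: len(p.parts))


def build_matchers(
    src_dir: Path, parse: Callable[[Path], Matcher]
) -> List[Tuple[Path, Matcher]]:
    """Pair each .gitignore's directory with the matcher parsed from it."""
    return [(gi.parent, parse(gi)) for gi in find_gitignore_files(src_dir)]


def is_ignored(path: Path, matchers: List[Tuple[Path, Matcher]]) -> bool:
    """True if any .gitignore at or above the file's directory ignores it.

    A nested .gitignore is only consulted for files inside its own
    directory. `path` and the stored directories must use the same form
    (for example, all absolute).
    """
    return any(
        path.is_relative_to(base) and match(str(path))
        for base, match in matchers
    )
```

With gitignore_parser installed, passing parse_gitignore as the parse argument, for example build_matchers(Path("./src").resolve(), parse_gitignore), would produce one matcher per .gitignore found under the source directory.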
Proposed Configuration
[codelinks.projects.my_project.source_discover]
src_dir = "./src"
# Existing option - use .gitignore at src_dir root
gitignore = true
# NEW: Use nested .gitignore files in subdirectories
nested_gitignore = true
# NEW: Additional gitignore-style patterns (applied after .gitignore)
gitignore_patterns = [
"**/__pycache__/",
"**/node_modules/",
"*.generated.py",
"vendor/**",
]
# NEW: Path to a custom gitignore file (useful for Bazel builds)
# Open question: is this really needed, i.e. is it an actual use case? Not required initially.
gitignore_file = "/path/to/custom/.gitignore"
Use Cases
- Monorepo with multiple projects: Each project directory has its own .gitignore with project-specific patterns
- Generated code exclusion: Exclude auto-generated files that match certain patterns without modifying .gitignore
- [optional] Bazel/Buck builds: Specify a custom gitignore file path or inline patterns when the build sandbox doesn't have access to the original .gitignore
- CI/CD pipelines: Exclude test fixtures, mock data, or vendored dependencies from traceability analysis
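For the use cases that rely on inline patterns rather than a file on disk, one possible approach is to write the configured patterns to a throwaway file and pass it to a file-based parser. The sketch below is only an illustration: matcher_from_patterns is a hypothetical helper, and it assumes the parse callable accepts (file path, base directory) and reads the file eagerly before returning, which is how parse_gitignore(full_path, base_dir) from gitignore_parser is expected to behave.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Sequence

Matcher = Callable[[str], bool]


def matcher_from_patterns(
    patterns: Sequence[str],
    base_dir: Path,
    parse: Callable[[str, str], Matcher],
) -> Matcher:
    """Build a matcher from inline gitignore-style patterns.

    The patterns are written to a temporary file, parsed relative to
    base_dir, and the file is removed again. This only works if `parse`
    reads the file before returning (an assumption of this sketch).
    """
    fd, tmp_name = tempfile.mkstemp(suffix=".gitignore", text=True)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write("\n".join(patterns) + "\n")
        return parse(tmp_name, str(base_dir))
    finally:
        os.unlink(tmp_name)
```

A file would then be excluded when either the .gitignore matcher or the inline-pattern matcher reports it as ignored, which matches the "applied after .gitignore" wording in the proposed configuration.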
Technical Details
File: src/sphinx_codelinks/source_discover/source_discover.py
The current implementation uses the gitignore_parser library to parse the .gitignore file:
from pathlib import Path

from gitignore_parser import parse_gitignore

# Currently only checks the .gitignore at the root of src_dir
if self.src_discover_config.gitignore:
    gitignore_path = Path(self.src_discover_config.src_dir) / ".gitignore"
    if gitignore_path.exists():
        self.gitignore_matcher = parse_gitignore(gitignore_path)
Acceptance Criteria
- Support gitignore_patterns configuration option for inline patterns
- Support gitignore_file configuration option for custom gitignore file path
- Support nested_gitignore option to recursively apply .gitignore files in subdirectories
- Add configuration validation for new options
- Add documentation for new configuration options
- Add unit tests for each new feature
- Maintain backward compatibility with existing gitignore = true/false option
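The validation and backward-compatibility criteria above could look roughly like the following. This is a sketch under assumptions, not the project's actual config class: SourceDiscoverOptions and validate are hypothetical names, and the default values are guesses chosen so that a configuration containing only the existing gitignore option stays valid.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class SourceDiscoverOptions:
    """Hypothetical container for the options proposed in this issue."""

    src_dir: str = "./src"
    gitignore: bool = True  # existing option (default assumed here)
    nested_gitignore: bool = False  # new, off unless requested
    gitignore_patterns: List[str] = field(default_factory=list)  # new
    gitignore_file: Optional[str] = None  # new, optional


def validate(opts: SourceDiscoverOptions) -> List[str]:
    """Return human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for name in ("gitignore", "nested_gitignore"):
        if not isinstance(getattr(opts, name), bool):
            errors.append(f"{name} must be a boolean")
    patterns = opts.gitignore_patterns
    if not isinstance(patterns, list) or not all(
        isinstance(p, str) and p.strip() for p in patterns
    ):
        errors.append("gitignore_patterns must be a list of non-empty strings")
    if opts.gitignore_file is not None and not Path(opts.gitignore_file).is_file():
        errors.append(f"gitignore_file does not exist: {opts.gitignore_file}")
    return errors
```

Because every new option has a default that disables it, a configuration that sets only gitignore = true or gitignore = false passes validation unchanged.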
Related
- Current gitignore implementation: src/sphinx_codelinks/source_discover/source_discover.py
- Configuration: src/sphinx_codelinks/source_discover/config.py
- Tests: tests/test_source_discover.py
Labels
enhancement, source-discover, configuration
Priority
Medium - This is a quality-of-life improvement that would benefit users with complex repository structures or non-standard build systems.