Implements a NimbusImage annotation worker that segments objects using few-shot learning with SAM2. Users annotate 5-20 training examples with a specific tag, and the worker uses SAM2's image encoder features to find similar objects across the dataset.

Key design:
- Phase 1: Extract SAM2 features from training annotations using mask-weighted pooling, averaged into a single prototype vector
- Phase 2: Run SAM2's automatic mask generator on inference images, then filter candidates by cosine similarity to the prototype
- Context padding ensures objects occupy ~20% of the crop area, for consistent feature extraction between training and inference
- Uses SAM2ImagePredictor for proper feature extraction, including no_mem_embed handling

Interface parameters: Training Tag, Model selection, Similarity Threshold, Target Occupancy, Points per side, Min/Max Mask Area, Smoothing, and Batch XY/Z/Time for multi-frame processing.

https://claude.ai/code/session_01SiA9ktGYkhfqo1c4YBqPKw
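The prototype-and-similarity design above can be sketched roughly as follows. This is a minimal NumPy illustration; the function names, the threshold default, and the exact normalization are illustrative, not the worker's actual API:

```python
import numpy as np

def build_prototype(feature_vectors):
    """Average per-example pooled features into one L2-normalized prototype."""
    proto = np.mean(feature_vectors, axis=0)
    return proto / (np.linalg.norm(proto) + 1e-8)

def cosine_similarity(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-8
    return float(np.dot(a, b) / denom)

def filter_candidates(candidate_features, prototype, threshold=0.7):
    """Return indices of candidate masks whose pooled features are
    similar enough to the training prototype."""
    return [i for i, feat in enumerate(candidate_features)
            if cosine_similarity(feat, prototype) >= threshold]
```

In this scheme the Similarity Threshold parameter maps directly onto `threshold`: raising it keeps only candidates that closely match the averaged training examples.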
…ot a dict

The 'type': 'tags' interface field returns a plain list of strings (e.g., ["DAPI blob"]), not a dict with 'tags' and 'exclusive' keys. Updated the parsing to handle this correctly, matching the pattern used by other workers (connect_to_nearest, cellpose_train, piscis). Also added early validation for the case where no training tag is selected.

https://claude.ai/code/session_01SiA9ktGYkhfqo1c4YBqPKw
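A minimal sketch of the corrected parsing, assuming the value is read from params['workerInterface'] under a key like 'Training Tag' (the key names here are illustrative):

```python
def get_training_tag(params):
    # 'tags'-type interface fields return a plain list of strings,
    # e.g. ["DAPI blob"] -- not a dict with 'tags'/'exclusive' keys.
    tags = params.get('workerInterface', {}).get('Training Tag', [])

    # Early validation: fail clearly if no training tag was selected.
    if not tags:
        raise ValueError("No training tag selected")
    return tags[0]
```

The buggy version would have done something like `value['tags'][0]`, which raises a TypeError once the field actually arrives as a list.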
CLAUDE.md: Added an "Interface Parameter Data Types" section documenting what each interface type returns in params['workerInterface'], with emphasis on the common pitfall that the 'tags' type returns a plain list of strings, not a dict. Includes correct/incorrect code examples and patterns for validation and annotation filtering.

SAM2_FEWSHOT.md: Added comprehensive worker documentation covering the algorithm overview, a parameter tuning guide, design decisions, and a TODO list for future work (tiled image support, multiple prototypes, full-image encoding optimization, negative examples, etc.).

https://claude.ai/code/session_01SiA9ktGYkhfqo1c4YBqPKw
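The annotation-filtering pattern mentioned above might look roughly like this, assuming each annotation dict carries a 'tags' list (that schema detail is an assumption for the example):

```python
def filter_annotations_by_tag(annotations, tag):
    # Keep only annotations labeled with the selected training tag;
    # annotations with no 'tags' key are treated as unlabeled.
    return [ann for ann in annotations if tag in ann.get('tags', [])]
```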
Summary
This PR introduces a new SAM2 few-shot segmentation worker that enables automatic polygon segmentation based on training annotations. The worker uses SAM2's image encoder to extract feature vectors from training examples, creates a prototype representation, and then applies similarity-based filtering to automatically segment similar objects in other frames.
Key Changes
New Worker Implementation (workers/annotations/sam2_fewshot_segmentation/)
- entrypoint.py: Main worker logic implementing the two-phase pipeline
- environment.yml: Conda environment with required dependencies (Python 3.10, SAM2, scikit-image, etc.)
- Dockerfile and Dockerfile_M1: Container definitions for standard and ARM64 architectures

Build Configuration
- build_machine_learning_workers.sh now includes the SAM2 few-shot segmentation worker build step

Comprehensive Test Suite (workers/annotations/sam2_fewshot_segmentation/tests/)
- extract_crop_with_context: Context-aware crop extraction with occupancy targeting
- pool_features_with_mask: Weighted feature pooling using binary masks
- ensure_rgb: Image format normalization
- annotation_to_mask: Polygon to binary mask conversion
- interface: Worker interface configuration

Notable Implementation Details
- Image encoder embedding (image_embed) at 256 channels, 64x64 resolution for semantic feature representation

https://claude.ai/code/session_01SiA9ktGYkhfqo1c4YBqPKw
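The two helpers exercised by the test suite, extract_crop_with_context and pool_features_with_mask, can be sketched as follows. Signatures, defaults, and the boundary handling are assumptions based on the descriptions above, not the worker's exact code:

```python
import math
import numpy as np

def extract_crop_with_context(image, bbox, target_occupancy=0.2):
    """Crop a square region around bbox = (y0, x0, y1, x1), padded so the
    object occupies roughly target_occupancy of the crop area."""
    h, w = image.shape[:2]
    y0, x0, y1, x1 = bbox
    obj_area = max((y1 - y0) * (x1 - x0), 1)
    # Choose a square side such that obj_area / side**2 ~= target_occupancy.
    side = int(math.ceil(math.sqrt(obj_area / target_occupancy)))
    cy, cx = (y0 + y1) // 2, (x0 + x1) // 2
    # Center the crop on the object, clamped to the image bounds.
    top = max(0, min(cy - side // 2, h - side))
    left = max(0, min(cx - side // 2, w - side))
    return image[top:min(h, top + side), left:min(w, left + side)]

def pool_features_with_mask(features, mask):
    """Mask-weighted average of a (C, H, W) feature map over an (H, W)
    binary mask; falls back to global mean pooling for an empty mask."""
    weights = mask.astype(np.float32)
    flat = features.reshape(features.shape[0], -1)
    if weights.sum() == 0:
        return flat.mean(axis=1)
    return (flat * weights.reshape(-1)).sum(axis=1) / weights.sum()
```

With the default target_occupancy of 0.2, a 20x20-pixel object yields a crop of roughly 45x45 pixels, matching the "objects occupy ~20% of the crop" design note above.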