Inconsistent CLIP implementation between research notebook and production code #2

@00hello

Issue Description

There's a mismatch between the CLIP implementation used in our research Jupyter notebook (validate_clip_rankings.ipynb) and the one used in our production code (calculate_scores_payout.py). As a result, the two may score the same input differently.

Details

Current Implementation

  • Research Notebook (validate_clip_rankings.ipynb):

    • Uses OpenAI's original CLIP implementation directly
    • Loads model with clip.load("ViT-B/32", device=device)
    • Tokenizes and encodes text with model.encode_text(clip.tokenize(text).to(device))
    • Calculates scores with matrix multiplication (text_features @ image_features.T).item()
  • Production Code (calculate_scores_payout.py):

    • Uses HuggingFace's Transformers implementation via our custom ClipEmbedder class
    • Initializes with self.embedder = ClipEmbedder()
    • Encodes text with self.embedder.get_text_embedding(guess)
    • Calculates scores with numpy dot product np.dot(text_features, image_features)
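
The two score expressions above are the same operation (a dot product) applied to differently shaped inputs, so any discrepancy would have to come from upstream: model weights, preprocessing, or whether the embeddings are L2-normalized before scoring. The issue does not say whether either path normalizes, so that is worth checking. A minimal sketch, assuming NumPy arrays as stand-ins for the real tensors and random 512-dimensional vectors in place of actual ViT-B/32 embeddings (the function names here are illustrative, not from the codebase):

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each vector along the last axis to unit length."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def notebook_style_score(text_features: np.ndarray, image_features: np.ndarray) -> float:
    # Mirrors (text_features @ image_features.T).item() on (1, d) inputs.
    return (text_features @ image_features.T).item()


def production_style_score(text_features: np.ndarray, image_features: np.ndarray) -> float:
    # Mirrors np.dot(text_features, image_features) on 1-D (d,) inputs.
    return float(np.dot(text_features, image_features))


# Stand-in embeddings (assumption: random data, not real CLIP output).
rng = np.random.default_rng(0)
text = rng.normal(size=(1, 512))
image = rng.normal(size=(1, 512))

raw_notebook = notebook_style_score(text, image)
raw_production = production_style_score(text[0], image[0])

# Same inputs, same arithmetic: the two formulas agree.
formulas_agree = bool(np.isclose(raw_notebook, raw_production))

# If only one path normalizes, the scores land on different scales.
cosine = production_style_score(l2_normalize(text)[0], l2_normalize(image)[0])
print(f"formulas agree: {formulas_agree}, raw: {raw_notebook:.3f}, cosine: {cosine:.3f}")
```

If the real embeddings from both paths match after normalization, the scoring step itself is not the source of the mismatch.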

Impact

This inconsistency means that:

  1. The baseline adjustment behavior might differ between research and production
  2. Scoring thresholds determined in the notebook might not directly transfer to production
  3. Research findings might not fully apply to the deployed system

Potential Solutions

  1. Update ScoreValidator to use the same direct CLIP approach as the notebook
  2. Modify the notebook to use ClipEmbedder for consistency
  3. Perform a comparative analysis to determine which approach performs better
  4. Standardize on a single CLIP implementation across all code

Reproduction Steps

  1. Run baseline adjustment tests in the notebook
  2. Run the same tests with the ScoreValidator implementation
  3. Compare the results for the same input texts and images
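
The comparison in step 3 could be sketched as a small harness that takes both scorers as callables. This is only an illustration: `compare_scorers`, the tolerance, and the two stand-in scorers below are hypothetical, and the real notebook and ScoreValidator scoring calls would need to be wrapped to match the `(text, image) -> float` shape assumed here.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

import numpy as np

ScoreFn = Callable[[str, Any], float]


def compare_scorers(
    score_a: ScoreFn,
    score_b: ScoreFn,
    pairs: Sequence[Tuple[str, Any]],
    atol: float = 1e-3,
) -> Dict[str, Any]:
    """Score every (text, image) pair with both scorers and summarize the gap."""
    a = np.array([score_a(t, i) for t, i in pairs], dtype=np.float64)
    b = np.array([score_b(t, i) for t, i in pairs], dtype=np.float64)
    diff = np.abs(a - b)
    return {
        "max_abs_diff": float(diff.max()),
        # Indices of pairs whose scores differ by more than atol.
        "mismatched_indices": [int(k) for k in np.flatnonzero(diff > atol)],
        # True when both scorers order the pairs identically (highest first).
        "same_ranking": bool(
            np.array_equal(
                np.argsort(-a, kind="stable"), np.argsort(-b, kind="stable")
            )
        ),
    }


# Hypothetical stand-ins for the notebook and production scorers.
def fake_notebook(text: str, image: Any) -> float:
    return len(text) * 0.01


def fake_production(text: str, image: Any) -> float:
    return len(text) * 0.01 + (0.05 if text == "a dog" else 0.0)


pairs = [("a dog", None), ("a red car", None), ("sunset over water", None)]
report = compare_scorers(fake_notebook, fake_production, pairs)
print(report)
```

Reporting ranking agreement alongside the absolute difference matters here because a small score offset can still reorder guesses, which is what affects payouts and thresholds.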

Priority

Medium: this won't break the system, but it should be addressed so that research and production behave consistently.
