Skip to content

Conversation

@KylevdLangemheen
Copy link
Contributor

@KylevdLangemheen KylevdLangemheen commented Jul 31, 2025

Issue #781 proposes adding DirectCLR to Lightly. In PR #963 a loss was proposed but never merged.

As the paper uses a version of infoNCE that is identical to NTX-ent, we can use our existing implementation of NTX-ent instead. This way, implementing DirectCLR becomes a simple matter of taking the first d values of the embedding and applying the NTX-ent loss to that instead (if my understanding is correct 😄 ).

As DirectCLR does away with a projection head, I did not see an obvious way to implement it under low level module blocks. The proposed implementation is added as a deprecated model, but perhaps it does not even need to get its own model file and can just live as an example.

This PR:

  • Adds DirectCLR as deprecated model
  • Adds example for DirectCLR
  • Adds test for DirectCLR

This PR does not:

  • Add DirectCLR to lightly cli
  • Implement benchmarks for DirectCLR
  • Write docs for DirectCLR

Let me know if you would prefer to see a variant outside of the deprecated high-level models. I could make a "dummy" projection head which just slices the embedding for example. Also let me know if you want me to include further features :)

Copy link
Contributor

@liopeer liopeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks correct. However I am not sure if this is really the right format to include this model in LightlySSL. I think I would prefer something slightly simpler, without a "model file", i.e. just the loss (using the already existing NTXentLoss)

# directclr_loss.py

class DirectCLRLoss(Module):
    def __init__(self, loss_dim: int, temperature, ... ): # all the other NTXentLoss params
        self.loss_dim = loss_dim
        self.ntxent_loss = NTXentLoss(...)
        
    def forward(x0, x1): # both (B, D)
        return self.ntxent_loss(
            x0[..., :self.loss_dim], x1[..., :self.loss_dim]
        )

and then the example basically exactly like https://github.com/lightly-ai/lightly/blob/master/examples/pytorch/simclr.py, just with the new loss function and without the projection head.

If that's too much work I can adjust it accordingly myself. :)

KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 5, 2025
@KylevdLangemheen KylevdLangemheen marked this pull request as draft August 5, 2025 19:08
KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 5, 2025
@KylevdLangemheen KylevdLangemheen marked this pull request as ready for review August 5, 2025 19:21
@KylevdLangemheen KylevdLangemheen marked this pull request as draft August 5, 2025 19:24
KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 5, 2025
@KylevdLangemheen KylevdLangemheen marked this pull request as ready for review August 5, 2025 19:55
Copy link
Contributor

@liopeer liopeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! It's looking really nice and many of my comments are really just nitpicking and beautification.

The two things we should really change however before merging: The *args, **kwargs story, and the slightly too comprehensive 🙃 unit tests. After that we're ready to merge.

@yutong-xiang-97
Copy link
Contributor

yutong-xiang-97 commented Aug 13, 2025

Quick note: the examples/ are for the supported models / SSL methods in the docs (like shown here). We give examples in pytorch, pytorch-lightning, and pytorch-lightning-distributed. Your example for PyTorch in this PR can be kept since it is working, but you have to generate the corresponding notebooks with make generate-example-notebooks following our contributing guide: https://github.com/lightly-ai/lightly/blob/master/CONTRIBUTING.md to let the checks pass.

We could then have a follow-up issue implementing the whole DirectCLR method at benchmarks/, from which we will benchmark the method on our side and add Lightning examples and docs for it later on. Up until then will your PyTorch examples be referenced.

Copy link
Contributor

@yutong-xiang-97 yutong-xiang-97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added my review as well.

KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 13, 2025
KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 13, 2025
KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 13, 2025
KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 13, 2025
KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 13, 2025
KylevdLangemheen added a commit to KylevdLangemheen/lightly that referenced this pull request Aug 13, 2025
@KylevdLangemheen
Copy link
Contributor Author

I have:

  • Generated the example notebooks with make generate-example-notebooks
  • Reduced dims in tests
  • Fixed typo
  • Added formatting for loss_dim
  • Directly import functions/classes at the module level
  • Use torch.randn consistently
  • Added NTXentLoss arguments directly

Is there a standard way to ensure or even automate that the parameters relating to NTXentLoss always match? This way the defaults can get out of sync pretty easily :(

@codecov
Copy link

codecov bot commented Aug 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.10%. Comparing base (e201e00) to head (12aacd1).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1874      +/-   ##
==========================================
+ Coverage   86.07%   86.10%   +0.02%     
==========================================
  Files         167      168       +1     
  Lines        6966     6979      +13     
==========================================
+ Hits         5996     6009      +13     
  Misses        970      970              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@yutong-xiang-97 yutong-xiang-97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements! I left some comments on tiny details to beautify the tests then you're good to go.

Is there a standard way to ensure or even automate that the parameters relating to NTXentLoss always match? This way the defaults can get out of sync pretty easily :(

In this specific case, I don't expect the interface of NTXent loss or DirectCLR loss to be changed. They're well-defined losses and with defaults from the original paper which everyone respects.

@KylevdLangemheen
Copy link
Contributor Author

Resolved the temperature parameter and default parameter comments.

Copy link
Contributor

@yutong-xiang-97 yutong-xiang-97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@liopeer liopeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything addressed from my side as well. Amazing work @KylevdLangemheen !

@yutong-xiang-97 yutong-xiang-97 merged commit 5f882cc into lightly-ai:master Aug 14, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants