Implementation of CSD loss

Hi, thank you for this open-sourced project.

I am wondering why the gradient in the CSD loss is defined as pred_fake_latents - pred_real_latents rather than pred_real_latents - pred_fake_latents.

As I understand it, in a VSD-like formulation pred_real_latents is meant to represent the real distribution, so shouldn't the gradient be pred_real_latents - pred_fake_latents? (References: SwiftBrush, ProlificDreamer.)
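To make the sign question concrete, here is a minimal toy sketch in plain Python. It is not this repository's code. It assumes the gradient is consumed by a plain gradient-descent update (x <- x - lr * grad), and it treats the two predictions as points in latent space, which is a simplification of what the real and fake models output. The helper names descent_step and sq_dist and the value of lr are hypothetical.

```python
# Toy illustration only: assumes a plain gradient-descent update and treats
# the predictions as points in latent space (a simplification).

def descent_step(latents, grad, lr=0.1):
    """One gradient-descent step: x <- x - lr * grad."""
    return [x - lr * g for x, g in zip(latents, grad)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

latents = [0.0, 0.0]
pred_real_latents = [1.0, -1.0]   # hypothetical values
pred_fake_latents = [-1.0, 1.0]   # hypothetical values

# The two candidate sign conventions from the question.
grad_fake_minus_real = [f - r for f, r in zip(pred_fake_latents, pred_real_latents)]
grad_real_minus_fake = [r - f for r, f in zip(pred_real_latents, pred_fake_latents)]

after_fake_minus_real = descent_step(latents, grad_fake_minus_real)
after_real_minus_fake = descent_step(latents, grad_real_minus_fake)

print("distance to real before step:     ", sq_dist(latents, pred_real_latents))
print("after descent on (fake - real):   ", sq_dist(after_fake_minus_real, pred_real_latents))
print("after descent on (real - fake):   ", sq_dist(after_real_minus_fake, pred_real_latents))
```

In this toy setting, descending along fake - real moves the latents closer to pred_real_latents, while descending along real - fake moves them away. So which sign is correct may depend on whether the quantity is applied as a descent gradient or as an update direction; I am not sure which convention the code here follows.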

Implementation of CSD loss #31