Skip to content

Conversation

@sbAsma
Copy link
Contributor

@sbAsma sbAsma commented Nov 26, 2025

Description

  1. changes in config/default_config.yml
    restored the file to the version in develop and put the diffusion specific changes in a separate config: config/diffusion_config.yml

  2. changes in src/weathergen/model/model.py
    added a code snippet that tests the config param fe_diffusion_model and adapts the corresponding forecast engine

Issue Number

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hegdedoc in the github issue with all the configurations and runs for this experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

moritzhauschulz and others added 5 commits November 22, 2025 18:58
This commit resolves architectural incompatibilities when integrating
diffusion-based forecast engines:

1. FSDP Sharding: DiffusionForecastEngine wraps ForecastingEngine
   as `self.net`, but trainer code assumed direct `fe_blocks` access. Fixed by:
   - Adding fe_diffusion_model conditional check in init_model_and_shard()
   - Routing to model.forecast_engine.net.fe_blocks for diffusion mode

2. Model Initialization: Reordered ForecastingEngine creation to handle both
   standard and diffusion-wrapped variants with proper fallback.

3. Target Format Handling: Autoencoder mode uses different target
   structure than diffusion mode. Added conditional formatting:
   - Diffusion: targets = {"targets": [targets], "aux_outputs": aux}
   - Autoencoder: targets = {"physical": batch[0]}

4. Config Updates: added file config/diffusion_config.yml for diffusion
   model config
@sbAsma
Copy link
Contributor Author

sbAsma commented Nov 26, 2025

assign me

@sbAsma sbAsma changed the title Sb asma/issue1279 noise conditioning sbAsma/issue1279 noise conditioning Nov 27, 2025
)
else:
self.forecast_engine = ForecastingEngine(cf, self.num_healpix_cells)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Asma! How is that different to the original implementation? What does it fix exactly?
The only difference I see is that DiffusionForecastEngine doesn't get self.forecast_engine but initializes directly the ForecastingEngine().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the getattr, which definitely is cleaner. Otherwise, I'd favor to stick to the single if without the else case.

Copy link
Collaborator

@tjhunter tjhunter Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please no getattr #1332

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's clean to have the diffusion config in a separate file. Not sure if this is needed tough at the moment. Maybe @MatKbauer has some thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, was thinking so too. Let's leave the config messy as of now during the diffusion forecast engine testing and later revert back to the original defaults (before we actually merge it into develop). @sbAsma, can you hold the changes to the default_config.yml back for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning the separate file for the diffusion engine. This would help us prevent the default_config.yml growing incredibly large. On the other hand, we were concerned that with a nested config structure, we will have hard times to find certain parameters (as we would not be able to search for them with ctrl+F5).

Copy link
Contributor

@MatKbauer MatKbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the code cleaner, @sbAsma. I'm afraid we will have to stick to the messy version for now (in particular concerning the config).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, was thinking so too. Let's leave the config messy as of now during the diffusion forecast engine testing and later revert back to the original defaults (before we actually merge it into develop). @sbAsma, can you hold the changes to the default_config.yml back for now?

)
else:
self.forecast_engine = ForecastingEngine(cf, self.num_healpix_cells)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the getattr, which definitely is cleaner. Otherwise, I'd favor to stick to the single if without the else case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning the separate file for the diffusion engine. This would help us prevent the default_config.yml growing incredibly large. On the other hand, we were concerned that with a nested config structure, we will have hard times to find certain parameters (as we would not be able to search for them with ctrl+F5).

@sbAsma
Copy link
Contributor Author

sbAsma commented Nov 28, 2025

@MatKbauer Brought back the previous config ;)

@tjhunter tjhunter changed the title sbAsma/issue1279 noise conditioning [Diff] sbAsma/issue1279 noise conditioning Dec 30, 2025
@clessig
Copy link
Collaborator

clessig commented Jan 5, 2026

@MatKbauer @sbAsma : Can we merge or close this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

6 participants