Skip to content

Conversation

@amoodie
Copy link
Member

@amoodie amoodie commented Oct 20, 2025

This PR fixes an issue with reproducibility of runs with different save_dt #205.

I finally traced this issue down to the seed not being set correctly on parallel jobs. This impacts the reproducibility of runs that were executed in parallel via the preprocessor. The specified seed may not match the parent-inherited seed in all cases. This is very unfortunate, but cannot be fixed in the past.

The fix seems to be resetting the seed explicitly in any job (serial or parallel) when that job is definitively on the thread it will run on, within the preprocesser _SerialJob or _ParallelJob.

@elbeejay I believe there were some problems with the tests demonstrating this bug in your original post, but they also did clearly reproduce the issue. I think the first failed in part because it expected the final beds to be the same, but the ModelB was only saved once at the initial time, so the final times would not be the same. I think the second passed because the jobs were run in serial. In any case, I think the new tests I have added show that the issue with save_dt being different and unreproducible results, is resolved.

@elbeejay sorry for the delay on merging #291 , I was afraid this old bug could impact it somehow and wanted to get this sorted first. After we review this, I'd like to get your PR there merged.

fixes #205

@amoodie amoodie requested a review from elbeejay October 20, 2025 23:07
@amoodie amoodie added the bug Something isn't working label Oct 20, 2025
@codecov
Copy link

codecov bot commented Oct 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.12%. Comparing base (8a4ab35) to head (3e3829f).
⚠️ Report is 5 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #294      +/-   ##
===========================================
+ Coverage    79.00%   79.12%   +0.11%     
===========================================
  Files           12       12              
  Lines         2643     2639       -4     
===========================================
  Hits          2088     2088              
+ Misses         555      551       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@amoodie
Copy link
Member Author

amoodie commented Oct 21, 2025

tests are passing; failure is due to existing links in docs, and some install quirk

Copy link
Member

@elbeejay elbeejay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, appreciate you resolving it!

@amoodie amoodie merged commit bcf33f9 into DeltaRCM:develop Oct 22, 2025
14 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reproducibility: save_dt influences random state?

2 participants