[WIP] Nexus: Enhanced simulations with error handling, branching and looping #5673
kayahans wants to merge 2 commits into QMCPACK:develop
Good to see! Question: can these simply replace the current classes? We learned from the QMCPACK C++ code that having two or more of anything is a bad idea because of the maintenance costs and the challenges for contributors, e.g. Driver.cpp, DriverEnhanced.cpp, DriverEnhancedNew2.cpp, etc.
@prckent They inherit from the current classes, so it should be easy to replace them. However, I thought it better to keep the current kind of implementation for now, so that testing is easier and nothing breaks. The examples I provided work with the --status-only and --generate-only options. I am open to ideas for new test cases where we can better see how the error recovery works.
Very interesting, Kayahan. I will have to look this over closely. I had considered this type of functionality a long time ago but never had time to pursue it. Moving from DAG to DG would be a big step forward.
Proposed changes
This PR introduces new capabilities in Nexus for simulation error handling, branching, and looping workflows. The implementation adds `EnhancedSimulation` and `EnhancedProjectManager` classes that extend the existing simulation framework with these advanced workflow features.
The new functionality is opt-in only and backward compatible: existing Nexus scripts continue to work unchanged. The enhanced capabilities are only activated when simulations are explicitly converted to `EnhancedSimulation` instances using the `make_enhanced()` wrapper function.
Examples in `nexus/examples/quantum_espresso/` (directories 03-08) demonstrate the new features:
- `create_branch()`
Note: The examples are designed for demonstration purposes. Error handlers may not be necessary for these simple test cases, but they illustrate the error handling capabilities available for more complex workflows.
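For readers unfamiliar with the opt-in pattern described above, here is a minimal, self-contained sketch of the idea. It does not use Nexus and is not this PR's implementation: the `Simulation` class is a toy stand-in, and the `make_enhanced()` signature, the `on_error` callback, and the `max_retries` parameter are all assumptions made purely for illustration.

```python
# Toy stand-ins only: these are NOT the Nexus classes or this PR's code.
# Every signature below is a hypothetical choice made for illustration.


class Simulation:
    """Minimal stand-in for an existing simulation class."""

    def __init__(self, identifier, run_fn):
        self.identifier = identifier
        self.run_fn = run_fn  # callable that performs the "run"
        self.finished = False

    def run(self):
        self.run_fn()
        self.finished = True


class EnhancedSimulation(Simulation):
    """Subclass that adds an optional error handler with bounded retries."""

    def __init__(self, identifier, run_fn, on_error=None, max_retries=1):
        super().__init__(identifier, run_fn)
        self.on_error = on_error        # hypothetical callback: (sim, error)
        self.max_retries = max_retries  # hypothetical retry limit
        self.attempts = 0

    def run(self):
        while True:
            self.attempts += 1
            try:
                super().run()
                return
            except RuntimeError as err:
                # Without a handler, or once retries are used up, re-raise
                # so the failure surfaces as it would for a plain Simulation.
                if self.on_error is None or self.attempts > self.max_retries:
                    raise
                self.on_error(self, err)


def make_enhanced(sim, on_error=None, max_retries=1):
    """Opt-in conversion: build an enhanced copy of a plain simulation."""
    return EnhancedSimulation(
        sim.identifier, sim.run_fn, on_error=on_error, max_retries=max_retries
    )


# Usage: a run function that fails once, then succeeds on the retry.
calls = {"n": 0}


def flaky_run():
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("simulated failure")


handled = []
plain = Simulation("scf", flaky_run)
enhanced = make_enhanced(
    plain,
    on_error=lambda sim, err: handled.append(str(err)),
    max_retries=2,
)
enhanced.run()
```

Because the enhanced class inherits from the plain one and conversion is an explicit call, code that never calls the wrapper keeps its original behavior, which is the backward-compatibility property claimed above.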
This PR is shared for collaboration and discussion to gather feedback on the implementation approach and identify potential improvements.
What type(s) of changes does this code introduce?
Does this introduce a breaking change?
What systems has this change been tested on?
Local development environment
Checklist