@Thanduriel (Member) commented Nov 17, 2025

Kokkos Thermodynamics

Work in progress on the Kokkos (GPU) thermodynamics (issue #672).

To track changes in buffers and to sync data between host and device, a more data-oriented approach is needed. To this end, the management of fields was made more explicit through the ModelArrayAccessor. The whole model and the tests have been updated to use this new interface, which requires:

  • explicitly listing all fields that have to be accessed for an operation
  • moving the update code into a lambda to capture the fields
  • updating member functions called in a kernel to get all field data explicitly through their arguments.
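
The three requirements above can be illustrated with a small self-contained sketch. It does not use the actual ModelArrayAccessor API: `FieldStore`, `forEachElement`, `trueThickness` and the field names are hypothetical stand-ins chosen only to show the shape of the pattern.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a field container; this is NOT the ModelArrayAccessor API.
using Field = std::vector<double>;

class FieldStore {
public:
    void add(const std::string& name, std::size_t size, double value)
    {
        fields[name] = Field(size, value);
    }

    // Fields are requested by name, so the store sees every access and could
    // mark buffers as modified or trigger a host/device sync at this point.
    Field& get(const std::string& name) { return fields.at(name); }

private:
    std::map<std::string, Field> fields;
};

// Helper used inside the kernel: all field data arrives through its arguments.
inline double trueThickness(double hice, double cice, double minConc)
{
    return (cice > minConc) ? hice / cice : 0.0;
}

// Hypothetical element loop that runs the captured update code per element.
template <typename Kernel>
void forEachElement(std::size_t nElements, Kernel kernel)
{
    for (std::size_t i = 0; i < nElements; ++i) {
        kernel(i);
    }
}

void updateTrueThickness(FieldStore& store, std::size_t nElements)
{
    // 1. List every field the operation accesses.
    const Field& hice = store.get("hice");
    const Field& cice = store.get("cice");
    Field& hiceTrue = store.get("hice_true");
    const double minConc = 1e-12;

    // 2. Move the update code into a lambda that captures those fields.
    forEachElement(nElements, [&hice, &cice, &hiceTrue, minConc](std::size_t i) {
        // 3. Pass the field data to helper functions explicitly.
        hiceTrue[i] = trueThickness(hice[i], cice[i], minConc);
    });
}
```

Because every field goes through `get()`, a container like this has a single place where reads and writes become visible, which is the property needed for tracking buffer changes.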

See, for example, physics/src/modules/LateralIceSpreadModule/HiblerSpread.cpp. The USE_KOKKOS block should be ignored since it is just a copy that has not been updated yet.

To actually enable device execution, the code needs to be ported to Kokkos, which involves further changes:

  • request the device buffers instead of the host ModelArray
  • all data used in the kernel has to be captured explicitly by copy (especially static member variables)
  • math functions (cos,min,sqrt) in the kernel have to be replaced with versions from Kokkos
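
The point about static member variables is easy to miss: a lambda cannot capture a variable with static storage duration, so a `[=]` capture silently refers to the host variable instead of copying it. The sketch below shows the copy-into-a-local pattern using only the standard library, so it runs without Kokkos. `ThermoSketch` and its `conductivity` parameter are hypothetical, and the comments note where the Kokkos equivalents would go.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

class ThermoSketch {
public:
    // Static parameter (hypothetical). A lambda cannot capture a static member;
    // using it directly inside a kernel would refer to the host variable.
    static double conductivity;

    static void update(const std::vector<double>& in, std::vector<double>& out)
    {
        // Copy everything the kernel needs into locals that [=] captures by value.
        const double k = conductivity;
        const double cap = 10.0;
        const double* inData = in.data(); // with Kokkos: a device buffer
        double* outData = out.data();

        // With Kokkos this would be a KOKKOS_LAMBDA, and std::sqrt / std::fmin
        // would be replaced by Kokkos::sqrt / Kokkos::fmin.
        auto kernel = [=](std::size_t i) {
            outData[i] = std::fmin(k * std::sqrt(inData[i]), cap);
        };

        // Stand-in for a parallel dispatch over all elements.
        for (std::size_t i = 0; i < in.size(); ++i) {
            kernel(i);
        }
    }
};

double ThermoSketch::conductivity = 2.0;
```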

See for example physics/src/modules/IceThermodynamicsModule/ThermoWinton.cpp.

The question is how these two code paths should be separated. In the dynamics, the Kokkos code is kept in separate files that build on top of the existing reference implementation, replacing all the relevant parts. This is necessary to achieve a tight integration for the actual computations. In the long term, the idea was to focus on maintaining the Kokkos version, since it runs on both CPU and GPU, and perhaps to remove the old implementation at some point.

Thermodynamics could be ported in the same way, but since the individual components are much smaller, it is also feasible to insert the switch inside the update routines and reuse more of the code. In fact, I built a light abstraction layer that hides all the specifics so there is just one version of the code; see ThermoWinton::update. Of course, the combined kernel for overElements still needs to adhere to the stricter rules of a Kokkos kernel, so using these abstractions would not automatically make all the code compatible. This abstraction is also somewhat redundant because Kokkos does effectively the same thing with its execution space concept. The abstraction just brings the added benefit of working directly with ModelArray and not needing Kokkos to build NeXtSIM-DG.
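
To make the idea of such a switch concrete, here is a minimal sketch of a compile-time abstraction that builds with or without Kokkos. It is not the layer from this PR: `SKETCH_LAMBDA`, `mathsketch::sqrt`, `forEachElement` and `scaledSqrt` are hypothetical names, and only `Kokkos::parallel_for`, `Kokkos::fence`, `Kokkos::sqrt`, `KOKKOS_LAMBDA` and `KOKKOS_INLINE_FUNCTION` are actual Kokkos facilities. The Kokkos branch assumes `Kokkos::initialize` has already been called.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

#ifdef USE_KOKKOS
#include <Kokkos_Core.hpp>
#define SKETCH_LAMBDA KOKKOS_LAMBDA
namespace mathsketch {
KOKKOS_INLINE_FUNCTION double sqrt(double x) { return Kokkos::sqrt(x); }
}
#else
#define SKETCH_LAMBDA [=]
namespace mathsketch {
inline double sqrt(double x) { return std::sqrt(x); }
}
#endif

// Single entry point for element loops; the backend is chosen at compile time.
template <typename Kernel>
void forEachElement(std::size_t nElements, const Kernel& kernel)
{
#ifdef USE_KOKKOS
    Kokkos::parallel_for("forEachElement", nElements, kernel);
    Kokkos::fence();
#else
    for (std::size_t i = 0; i < nElements; ++i) {
        kernel(i);
    }
#endif
}

// One version of the update code, written against the abstraction only.
void scaledSqrt(const std::vector<double>& in, std::vector<double>& out, double factor)
{
    // Raw pointers are captured by copy. They are valid here because the data
    // lives on the host; for GPU execution they would have to be device buffers.
    const double* inData = in.data();
    double* outData = out.data();
    forEachElement(in.size(), SKETCH_LAMBDA(const std::size_t i) {
        outData[i] = factor * mathsketch::sqrt(inData[i]);
    });
}
```

The kernel body still has to follow Kokkos rules (capture by copy, no host-only calls) even when it is compiled for the plain loop, which is the limitation mentioned above.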

@Thanduriel force-pushed the kokkos_ModelArrayStore branch from bba9367 to 23e8a6e on December 22, 2025 14:02
@Thanduriel (Member, Author) commented

Concerning virtual function calls in overElements kernels

@timspainNERSC

The files to look at:

  • IIceAlbedo.hpp changes to the interface in line with other ModelComponents
  • CCSMIceAlbedo.cpp / SMUIceAlbedo.cpp concrete implementations that I had to port: one is used in the test and the other is the default for ERA5Atmosphere
  • FiniteElementFluxes.cpp for the use in calculateIce() (previous) vs updateIce() (now)
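
As a rough illustration of the general pattern, assuming the goal is to keep virtual dispatch out of the per-element kernel: the interface gets one virtual `update()` per timestep that fills a whole field, and the flux kernel only reads that field. All names, signatures and numeric values below are hypothetical placeholders, not the actual IIceAlbedo interface or a real albedo parameterization.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical interface; names and signatures are illustrative only.
class IAlbedoSketch {
public:
    virtual ~IAlbedoSketch() = default;
    // One virtual call per timestep; the element loop is inside the implementation.
    virtual void update(const std::vector<double>& tsurf,
        const std::vector<double>& snowThickness, std::vector<double>& albedo) const = 0;
};

class TwoValueAlbedo : public IAlbedoSketch {
public:
    void update(const std::vector<double>& tsurf, const std::vector<double>& snowThickness,
        std::vector<double>& albedo) const override
    {
        for (std::size_t i = 0; i < albedo.size(); ++i) {
            // Placeholder values, not a real albedo parameterization.
            const bool snowCovered = snowThickness[i] > 0.0 && tsurf[i] < 0.0;
            albedo[i] = snowCovered ? 0.75 : 0.5;
        }
    }
};

// The flux kernel only reads the precomputed field; no virtual call per element.
void absorbedShortwave(const std::vector<double>& swIn, const std::vector<double>& albedo,
    std::vector<double>& swAbsorbed)
{
    for (std::size_t i = 0; i < swIn.size(); ++i) {
        swAbsorbed[i] = (1.0 - albedo[i]) * swIn[i];
    }
}
```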

Consequences of the new interface that I can't properly evaluate:

  • Since the inputs to surfaceShortWaveBalance() are no longer provided by the calling module, it is technically less flexible. However, in all the cases where it is currently used, it seems that snowThickness is always computed the same way, i0 is a configured constant parameter, and temperature is just tsurf.
  • i0 previously was a parameter of the calling module, e.g. FiniteElementFluxes, but is now part of the concrete IIceAlbedo implementation. If it is logically always the same, it could also go right into IIceAlbedo itself. I did not do that because there is no associated compilation unit (.cpp file) to define i0, and ModelComponents seem to always be implemented inline in the headers. Is there a reason for this?
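
If the only obstacle is the missing .cpp file, one option, assuming C++17 is available and that `i0` would be a static member, is an `inline static` data member, which can be defined directly in the interface header. The sketch below uses hypothetical names (`IAlbedoParamSketch`, `SnowSwitchAlbedo`) and a placeholder default value and formula; it is not the actual IIceAlbedo interface.

```cpp
// Header-only sketch: no separate compilation unit is needed for i0.
class IAlbedoParamSketch {
public:
    virtual ~IAlbedoParamSketch() = default;

    static void setI0(double value) { i0 = value; }
    static double getI0() { return i0; }

    virtual double albedo(double temperature, double snowThickness) const = 0;

protected:
    // C++17 inline variable: declared and defined in the header.
    // The default value is a placeholder, not a model constant.
    inline static double i0 = 0.25;
};

class SnowSwitchAlbedo : public IAlbedoParamSketch {
public:
    double albedo(double /*temperature*/, double snowThickness) const override
    {
        // Placeholder formula that only demonstrates access to the shared i0.
        return (snowThickness > 0.0) ? 0.75 : 0.5 - i0;
    }
};
```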

@timspainNERSC (Collaborator) commented

> Concerning virtual function calls in overElements kernels
>
> @timspainNERSC
>
> The files to look at:
>
> * _IIceAlbedo.hpp_ changes to the interface in line with other ModelComponents
> * _CCSMIceAlbedo.cpp_ / _SMUIceAlbedo.cpp_ concrete implementations that I had to port, one for the test and the other is the default for ERA5Atmosphere

The ice albedo changes look fine, as they stand. We only ever need to calculate the ice albedo once per timestep, so it makes sense to use a ModelComponent.

> * _FiniteElementFluxes.cpp_ for the use in `calculateIce()` (previous) vs `updateIce()` (now)

That looks fine, too.

> Consequences of the new interface that I can't properly evaluate:
>
> * Since the inputs to `surfaceShortWaveBalance()` are no longer provided by the calling module it is technically less flexible. However, in all the cases where it is currently used, it seems like `snowThickness` is always computed the same way, `i0` is a configured constant parameter and `temperature` is just `tsurf`.
> * `i0` previously was a parameter of the calling module, e.g. `FiniteElementFluxes` but is now part of the concrete `IIceAlbedo` implementation. If it is logically always the same it could also go right into `IIceAlbedo` itself. I did not do that because there is no associated compilation unit (.cpp file) to define `i0` and `ModelComponent` seem to always be implemented inline in the headers. Is there a reason for this?

Some of the peculiarities are due to me not fully analysing what the original code in Lagrangian nextSIM was doing and roughly replicating which parameters belonged to which components.

It's not that ModelComponents don't have a .cpp source file; rather, this is how the module system works. As it stands, there is an interface header (IModuleName) and then both header and source files for the implementation classes. The Module system is part of the code that you have been looking at much more recently than I have, so perhaps there is a simple way to include a .cpp file in the code generation and building. It's certainly not impossible, but I no longer have the shape of the code in my head well enough to say what changes would be necessary.

(The PrognosticData and SlabOcean classes are both derived from ModelComponent and have .cpp source files. Everything else uses the Module system, though.)
