Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 13, 2025

Adds unwrapSelection to _EditKernelsService to transform selected cell positions from periodic/modulo space to flat Euclidean space. This will be used for calculating mass center and relative positions.

Changes

EditKernelsService.cuh/cu:

  • New public method unwrapSelection(CudaSettings const&, SimulationData const&, float2 const& refPos) with GPU memory for iteration tracking

EditKernels.cuh/cu - CUDA kernels:

  • cudaInitUnwrapSelection: Initializes all selected cells for cluster-based unwrapping (clusterIndex, tempValue, shared1, shared2)
  • cudaFindUnwrapClusters: Finds connected components among selected cells using atomicMin propagation (same pattern as ClusterProcessor)
  • cudaFindNearestInCluster: Finds the nearest cell to refPos per cluster using atomicMin with distance+index encoding
  • cudaMarkUnwrapStartCells: Marks starting cells for all clusters in parallel
  • cudaUnwrapSelectionStep: BFS-style propagation through connected cells using atomicCAS for thread safety

Algorithm

  1. Initialize all selected cells with clusterIndex set to self
  2. Find connected components by propagating minimum clusterIndex through connected selected cells
  3. For each cluster, find the nearest cell to refPos using atomicMin (all clusters processed in parallel)
  4. Mark all starting cells with corrected positions relative to refPos
  5. Propagate unwrapping through all connected cells in all components simultaneously

Performance

The cluster-based approach processes all connected components in parallel, reducing kernel calls from O(components × iterations) to O(iterations). This is critical when many disconnected components exist in the selection.

void unwrapSelection(CudaSettings const& gpuSettings, SimulationData const& data, float2 const& refPos);
// After call: selected cells have unwrapped positions in Cell::shared1
// Cell::tempValue.as_uint64 == 1 indicates cell was processed
Original prompt

Implement a new method
void unwrapSelection(CudaSettings const& gpuSettings, SimulationData const& data, float2 const& refPos);
in _EditKernelsService which unrolls the selection from the periodic/modulo space (the particles are simulated in 2d space with periodic boundary conditions)
to a flat Euclidean space. This method will later be used for calculating the mass center and relative positions to it.
The unwrapping should be performed on the connected component of the selection (the cells where the member Cell::selected equals 1 or 2).
For unwrapSelection there are maybe several kernels necessary, which should be implemented in EditKernels.cuh/cu.
The refPosition (=referencePosition) should be used to identify the initial cells for unwrapping in the connected components.


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Add a new method that unrolls cell positions from periodic/modulo space
to flat Euclidean space. This is done by:
1. Marking cells near refPos as starting points for unwrapping
2. Propagating unwrapping through connected cells using BFS-like algorithm
3. Using atomicCAS to prevent race conditions during parallel execution

The unwrapped positions are stored in Cell::shared1 field.

Co-authored-by: chrxh <73127001+chrxh@users.noreply.github.com>
Copilot AI changed the title [WIP] Add unwrapSelection method to _EditKernelsService Implement unwrapSelection method for periodic boundary unwrapping Dec 13, 2025
Copilot AI requested a review from chrxh December 13, 2025 21:51
cell->shared1 = cell->pos;

// Mark cells within radius of refPos as starting points (already unwrapped)
if (data.cellMap.getDistance(refPos, cell->pos) <= radius) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot 3 problems in this method: (1) There could be no cells within the radius of refPosition.
(2) There could be 2 cells of the same connected cell component within radius which should be actually far distant in unwrapped space.
(3) There could be many other connected cell components which are selected and would not be captured by this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 9e18708. The revised implementation:

  1. No cells within radius: Now finds the closest cell to refPos regardless of distance using cudaFindClosestUnwrappedCell with atomicMin.

  2. Multiple cells in same component within radius: Only one cell is marked as starting point per component - the closest unprocessed cell.

  3. Multiple disconnected components: The outer loop continues until all selected cells are processed. Each component gets its own starting cell (the closest unprocessed cell to refPos at that iteration).

New kernels:

  • cudaInitUnwrapSelection: Initialize all selected cells
  • cudaFindClosestUnwrappedCell: Find closest unprocessed cell
  • cudaMarkUnwrapStartCell: Mark single cell as start
  • cudaUnwrapSelectionStep: BFS propagation (unchanged)

Addressed review feedback:
1. Fixed: No cells within radius - now finds closest cell regardless
2. Fixed: Multiple cells in same component - only one starting point
3. Fixed: Multiple disconnected components - each gets its own start

New algorithm:
- cudaInitUnwrapSelection: Initialize all selected cells
- cudaFindClosestUnwrappedCell: Find closest unprocessed cell to refPos
- cudaMarkUnwrapStartCell: Mark single cell as starting point
- cudaUnwrapSelectionStep: BFS propagation through component
- Repeat for each disconnected component

Co-authored-by: chrxh <73127001+chrxh@users.noreply.github.com>
Copilot AI requested a review from chrxh December 13, 2025 22:36
@chrxh
Copy link
Owner

chrxh commented Dec 14, 2025

@copilot Your approach is problematic for the performamce if there are many connected components. Instead determine all connected components first (see logic for cluster members in Cell). Then determin for each component its nearest to refPosition (can be done in one or two kernel calls for all components together). Here one can tag the cluster for the nearest cell.

Refactored algorithm to process all connected components in parallel:
1. Initialize clusterIndex for each selected cell
2. Find clusters using atomicMin propagation (ClusterProcessor pattern)
3. Find nearest cell to refPos per cluster using atomicMin
4. Mark all starting cells in parallel
5. Propagate unwrapping through all components simultaneously

This improves performance from O(components * iterations) to O(iterations)
kernel calls when many disconnected components exist.

Co-authored-by: chrxh <73127001+chrxh@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Dec 14, 2025

@copilot Your approach is problematic for the performamce if there are many connected components. Instead determine all connected components first (see logic for cluster members in Cell). Then deter...

Fixed in commit 8cebaf4. The new implementation uses the cluster-based parallel approach:

  1. Find clusters (cudaFindUnwrapClusters): Uses the same atomicMin pattern as ClusterProcessor to identify connected components among selected cells
  2. Find nearest per cluster (cudaFindNearestInCluster): Each cell computes its distance to refPos and atomically updates its cluster head's shared2 if closer (using distance+index encoding with atomicMin)
  3. Mark starting cells (cudaMarkUnwrapStartCells): Each cluster head marks its nearest cell as unwrapped

This processes all components in parallel, reducing kernel calls from O(components × iterations) to O(iterations).

Copilot AI changed the title Implement unwrapSelection method for periodic boundary unwrapping Implement unwrapSelection method with cluster-based parallel optimization Dec 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants