
Properly Support Cluster-Local and Cluster-Private Storage #38

Merged
viv-eth merged 4 commits into pulp-platform:devel from Xeratec:pr/cluster_private_storage on Mar 10, 2026

Xeratec (Member) commented Jan 16, 2026

Changelog

This PR introduces fixes to ensure the correct handling of cluster-local and cluster-private storage in a multi-cluster setup.

Usage Examples

  • L1 Data (initialized data in L1 of cluster X):
SNRT_CLUSTER_L1(static int32_t cluster0_private_var1, 0) = 0x1;
SNRT_CLUSTER_L1(static int32_t cluster0_private_var2[8], 0);
  • L1 Copy Data (initialized data in L1 of all clusters, each cluster has a local version):
SNRT_CLUSTER_L1_COPY(static int8_t cluster_local_var1) = 1;
  • L1 Zero Data (zero-initialized data in L1 of all clusters, each cluster has a local version):
SNRT_CLUSTER_L1_ZERO(static volatile int cluster_local_var2[64]);

How It Works

The cluster-local storage (CLS) mechanism uses a combination of linker script sections and runtime initialization to enable efficient per-cluster data storage:

1. L1 Alias Memory Region

The implementation requires a TCDM alias memory region that provides a unified address space mapping to each cluster's local L1 memory. When code running on a specific cluster accesses an address in the L1 alias region, it is automatically routed to that cluster's local L1 memory.
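The routing described above can be modeled in software as a simple address translation. This is only a minimal host-side sketch: the alias base, the L1 size, and the cluster base addresses are made-up values, and the `demo_*` names are hypothetical. On the real hardware the interconnect performs this mapping, not software.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical address map, for illustration only (not the Chimera map). */
#define DEMO_NUM_CLUSTERS 2u
#define DEMO_L1_SIZE      0x00040000u /* assumed L1 size per cluster */
#define DEMO_L1_ALIAS     0x30000000u /* assumed base of the L1 alias window */

/* Assumed physical base address of each cluster's L1. */
static const uintptr_t demo_cluster_base[DEMO_NUM_CLUSTERS] = {
    0x40000000u, 0x40040000u
};

/* Return 1 if addr falls inside the L1 alias window, 0 otherwise. */
int demo_in_l1_alias(uintptr_t addr) {
    return addr >= DEMO_L1_ALIAS && addr - DEMO_L1_ALIAS < DEMO_L1_SIZE;
}

/* Map an alias address to the physical L1 address seen by one cluster. */
uintptr_t demo_alias_to_physical(uintptr_t addr, unsigned cluster_idx) {
    assert(cluster_idx < DEMO_NUM_CLUSTERS);
    assert(demo_in_l1_alias(addr));
    /* Keep the offset inside the window, swap the base. */
    return addr - DEMO_L1_ALIAS + demo_cluster_base[cluster_idx];
}
```

The same alias address resolves to a different physical location depending on which cluster issues the access, which is what lets every cluster use identical code and symbol addresses.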

2. Linker Script Setup

The linker script defines special sections with split VMA (Virtual Memory Address) and LMA (Load Memory Address):

  • .cdata section (SNRT_CLUSTER_L1_COPY): Initialized cluster-local data

    • VMA: l1_alias (accessed via L1 alias region)
    • LMA: memisl (stored in main memory/flash)
  • .cbss section (SNRT_CLUSTER_L1_ZERO): Zero-initialized cluster-local data

    • VMA: l1_alias (accessed via L1 alias region)
    • LMA: memisl (metadata stored in main memory)
  • .l1_cX sections (SNRT_CLUSTER_L1): Cluster-private data for specific clusters

    • VMA: Cluster X's L1 memory
    • LMA: memisl (stored in main memory/flash)

For each cluster, the linker reserves space (.l1_cX_cls) equal to the combined size of the .cdata and .cbss sections.
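One way a macro can attach a declaration to one of these sections is a GCC-style section attribute. The sketch below is an assumption, not the SDK's actual definition: the `DEMO_*` macros are hypothetical stand-ins for the `SNRT_*` macros, and only the section names come from the list above. It assumes GCC or Clang targeting ELF. On a host build the sections are placed by the default linker script, so there is no aliasing and no per-cluster copy; the sketch only illustrates the mapping from declaration to section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real SNRT_* macros may be defined differently. */
#define DEMO_CLUSTER_L1(decl, idx) decl __attribute__((section(".l1_c" #idx)))
#define DEMO_CLUSTER_L1_COPY(decl) decl __attribute__((section(".cdata")))
#define DEMO_CLUSTER_L1_ZERO(decl) decl __attribute__((section(".cbss")))

/* Cluster-private: lives only in the section for cluster 0. */
DEMO_CLUSTER_L1(static int32_t cluster0_private_var1, 0) = 0x1;
/* Cluster-local, initialized: one copy per cluster on the real target. */
DEMO_CLUSTER_L1_COPY(static int8_t cluster_local_var1) = 1;
/* Cluster-local, zero-initialized. */
DEMO_CLUSTER_L1_ZERO(static volatile int cluster_local_var2[64]);

/* Read the cluster-private variable. */
int32_t demo_read_private(void) {
    return cluster0_private_var1;
}

/* Add the initialized value to the first zero-initialized slot. */
int demo_bump_counter(void) {
    cluster_local_var2[0] += cluster_local_var1;
    return cluster_local_var2[0];
}
```

Because the macro wraps only the declarator, the initializer stays outside the macro call, matching the usage shown in the examples.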

3. Runtime Initialization

During snrt_init(), the DM (Data Mover) core of each cluster:

  1. Copies cluster-local initialized data: DMA transfers .cdata contents from main memory (LMA) to the cluster's actual L1 memory. The destination address is calculated by translating the L1 alias address to the physical cluster L1 address:

    destination = __cdata_start - __base_l1_alias + _chimera_clusterBase[cluster_idx]
    
  2. Zeros cluster-local BSS data: DMA fills the .cbss section in the cluster's L1 memory with zeros using a similar address translation.

  3. Copies cluster-private data: For cluster-specific sections (.l1_cX), DMA copies data from main memory directly to the designated cluster's L1.
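The three steps above can be sketched as a host-side simulation. This is not the code in init.c: plain `memcpy` and `memset` stand in for the DMA transfers, a small array stands in for each cluster's L1, and every `demo_*` name, size, and offset is a hypothetical value chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NUM_CLUSTERS 2
#define DEMO_L1_BYTES     64 /* tiny stand-in for one cluster's L1 */

/* Simulated physical L1 memory of each cluster. */
static uint8_t demo_l1[DEMO_NUM_CLUSTERS][DEMO_L1_BYTES];

/* Cluster-local layout, as offsets from the L1 alias base. */
typedef struct {
    const uint8_t *cdata_lma; /* load image of .cdata in main memory */
    size_t cdata_offset;      /* models __cdata_start - __base_l1_alias */
    size_t cdata_size;
    size_t cbss_offset;
    size_t cbss_size;
} demo_cls_layout_t;

/* Steps 1 and 2: copy .cdata and zero .cbss in one cluster's L1. */
void demo_init_cluster(const demo_cls_layout_t *l, unsigned cluster_idx) {
    assert(cluster_idx < DEMO_NUM_CLUSTERS);
    assert(l->cdata_offset + l->cdata_size <= DEMO_L1_BYTES);
    assert(l->cbss_offset + l->cbss_size <= DEMO_L1_BYTES);
    uint8_t *l1 = demo_l1[cluster_idx]; /* models _chimera_clusterBase[idx] */
    memcpy(l1 + l->cdata_offset, l->cdata_lma, l->cdata_size);
    memset(l1 + l->cbss_offset, 0, l->cbss_size);
}

/* Step 3: copy a cluster-private image into its owning cluster only. */
void demo_copy_private(unsigned owner, size_t offset,
                       const uint8_t *lma, size_t size) {
    assert(owner < DEMO_NUM_CLUSTERS);
    assert(offset + size <= DEMO_L1_BYTES);
    memcpy(demo_l1[owner] + offset, lma, size);
}

static const uint8_t demo_cdata_image[4] = {1, 2, 3, 4};
static const uint8_t demo_c0_private_image[2] = {0x11, 0x22};

/* Run the sequence for all clusters; a loop replaces the per-cluster DM cores. */
void demo_boot_all(void) {
    const demo_cls_layout_t layout = {demo_cdata_image, 0, 4, 4, 8};
    memset(demo_l1, 0xAA, sizeof demo_l1); /* model uninitialized L1 */
    for (unsigned c = 0; c < DEMO_NUM_CLUSTERS; ++c) {
        demo_init_cluster(&layout, c);
    }
    demo_copy_private(0, 16, demo_c0_private_image,
                      sizeof demo_c0_private_image);
}
```

After `demo_boot_all()`, both simulated clusters hold their own copy of the initialized data, so a write to one cluster's copy leaves the other untouched, while the private image exists only in cluster 0.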

4. Access Pattern

After initialization, all cores access cluster-local variables using the same variable names and addresses (via the L1 alias region). The hardware automatically routes each access to the appropriate cluster's local copy, ensuring:

  • Zero-copy access: No runtime overhead once initialized
  • Uniform code: Same code works across all clusters
  • Per-cluster state: Each cluster maintains its own copy of the data

This design enables efficient multi-cluster programming with isolated per-cluster state while maintaining code simplicity.

Added

  • New macro for cluster-private storage: SNRT_CLUSTER_L1()
  • New macro for cluster-local copy data: SNRT_CLUSTER_L1_COPY()
  • New macro for cluster-local zero data: SNRT_CLUSTER_L1_ZERO()
  • TCDM alias support in address maps for both Chimera targets

Changed

  • Updated linker scripts for Chimera targets (chimera-convolve and chimera-open) to support cluster-local and cluster-private storage
  • Modified runtime initialization code in init.c to properly handle multi-cluster storage
  • Updated snrt.h with new cluster storage utilities
  • Enhanced util.h with cluster storage handling functions
  • Updated tests to use new cluster storage macros

Fixed

  • Properly support cluster-local and cluster-private storage for all clusters

Checklist

  1. The PR is rebased on the latest devel commit and pointing to devel.
  2. Your PR is reviewed and approved.
  3. The documentation is updated.
  4. All checks are passing.

Xeratec requested a review from viv-eth January 16, 2026 14:13
Xeratec self-assigned this Jan 16, 2026
Xeratec added the enhancement label Jan 16, 2026
Xeratec force-pushed the pr/cluster_private_storage branch from 5b9a178 to 20fa3a5 January 16, 2026 14:18
Xeratec force-pushed the pr/cluster_private_storage branch from 20fa3a5 to 2a0ce12 February 17, 2026 10:23
Xeratec marked this pull request as ready for review February 17, 2026 10:45
Xeratec force-pushed the pr/cluster_private_storage branch from 2a0ce12 to 267be12 February 17, 2026 13:51

viv-eth (Contributor) left a comment

Thanks for the contribution! I made a few comments on stability and portability that should be easy to address.

Xeratec added 2 commits March 6, 2026 10:51
…thmetic

Linker-script symbols are address tokens only — they carry no storage. Declaring them as `extern volatile uint32_t` implicitly dereferences the symbol address, which is wrong and can silently truncate addresses on wider address models.  Declare all such symbols as `extern char[]` and compute sizes via `(uintptr_t)__end - (uintptr_t)__start`.
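The declaration style described in this commit message can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `__cdata_end` is an assumed symbol name (only `__cdata_start` appears in the PR description), and since a host build has no such linker symbols, a plain array stands in for the section. All `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * On the target, the symbols would be declared as arrays of unknown size,
 * so that using the name yields the address itself and never loads from it:
 *
 *   extern char __cdata_start[];
 *   extern char __cdata_end[];   // assumed name
 */

/* Size in bytes of the region delimited by two address-only symbols. */
size_t demo_region_size(const char *start, const char *end) {
    assert((uintptr_t)end >= (uintptr_t)start);
    return (size_t)((uintptr_t)end - (uintptr_t)start);
}

/* Host stand-in for a section and its start and end symbols. */
static char demo_section[48];
#define demo_section_start (demo_section)
#define demo_section_end   (demo_section + sizeof demo_section)
```

With this style, `demo_region_size(demo_section_start, demo_section_end)` computes the size from the two addresses alone. Declaring the symbol as `extern volatile uint32_t` instead would make any use of its value a 32-bit load from that address.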

Xeratec (Member, Author) commented Mar 9, 2026

@viv-eth I agree with the changes you suggested; however, they are deeply incompatible with the current SDK version, where volatile uint32_t is used everywhere. Hence, I also implemented these changes there.

I implemented all changes in dd9541d

Xeratec force-pushed the pr/cluster_private_storage branch from f08e60d to 8ca2d32 March 9, 2026 16:04
Xeratec force-pushed the pr/cluster_private_storage branch from 8ca2d32 to 5f00dca March 9, 2026 16:05
Xeratec requested a review from viv-eth March 10, 2026 09:47
viv-eth merged commit c79b69b into pulp-platform:devel Mar 10, 2026
3 checks passed
Xeratec deleted the pr/cluster_private_storage branch March 10, 2026 10:36