[BUG] HW barrier needed after vector load (32 bit ew) #60

@callme-sam

Description

In some cases, a hardware barrier (snrt_cluster_hw_barrier()) must be inserted after vle32.v loads to get correct results. The failure does not occur on every run, so the issue is intermittent.

I was able to reproduce the problem by implementing a vectorized backward_solve function for upper-triangular linear systems. Because the number of elements processed in such a solver changes from row to row, I suspect the issue may be related to the vector length (vl) not being a multiple of 2, although this still needs verification.
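For context, here is a minimal scalar sketch of backward substitution for an upper-triangular system. It is not the kernel from the attached archive: the names backward_solve_ref and row_len are hypothetical, and the row-major layout is an assumption. It only illustrates why odd lengths are plausible. Row i consumes n - 1 - i previously solved unknowns, so for n = 64 the inner length takes every value from 63 down to 0. Whether the vectorized kernel sets vl to exactly this length has not been verified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of already-solved unknowns used by row i. */
size_t row_len(size_t n, size_t i) {
    return n - 1 - i;
}

/* Scalar reference: solve U x = b for an n x n upper-triangular U.
 * Assumes U is stored row-major and has a nonzero diagonal. */
void backward_solve_ref(const float *U, const float *b, float *x, size_t n) {
    /* Walk the rows from the last one up to the first. */
    for (size_t i = n; i-- > 0;) {
        assert(U[i * n + i] != 0.0f);
        float acc = b[i];
        const size_t len = row_len(n, i);
        /* Subtract the contribution of the unknowns solved so far.
         * This inner length shrinks by one per row, so it is odd
         * for every other row. */
        for (size_t k = 0; k < len; k++) {
            const size_t j = i + 1 + k;
            acc -= U[i * n + j] * x[j];
        }
        x[i] = acc / U[i * n + i];
    }
}
```

A reference like this could also be used to check the output of the vectorized kernel with and without the barrier.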

How to reproduce

  • Extract the attached zip archive into sw/spatzBenchmarks
  • Add the following lines to sw/spatzBenchmarks/CMakeLists.txt
add_library(backward-solve backward-solve/kernel/backward-solve.c)
add_spatz_test_oneParam(backward-solve backward-solve/main.c 64)
  • Build and run the test

Expected Result

The test should fail to compute the correct solution of the linear system.

Workaround

  • Uncomment the hw barrier after the two loads in kernel/backward-solve.c
asm volatile ("vle32.v v8, (%0)" :: "r"(p_dst));
asm volatile ("vle32.v v0, (%0)" :: "r"(p_mat));
snrt_cluster_hw_barrier(); 
  • Rebuild and run the test
  • The test now computes the correct solution of the linear system

backward-solve.zip
