Description:
On AArch64, the default libc memcpy in QNX SDP 8.0 takes a generic C path
with no SIMD or prefetch support, so copies fall short of the available memory bandwidth.
QNX SDP 7.0.4 offers LIBC_STRINGS=aarch64_neon for optimized copies,
but it still uses the generic path unless that override is set at runtime.
TI’s TDA4VM benchmarks show up to 3× lower throughput than Linux’s NEON-tuned memcpy.
Arm’s AdvSIMD-based memcpy (e.g., memcpy-advsimd.S) shows how to handle overlap safely
while using software-pipelined, unaligned 64-byte vector loops to saturate memory bandwidth.
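To make the 64-byte vector loop concrete, here is a minimal sketch in C using Neon intrinsics. It is not the code from memcpy-advsimd.S and not a QNX libc routine: the name copy64_sketch is hypothetical, the buffers are assumed not to overlap, and the tail is simply handed to the standard memcpy. It only illustrates the idea of issuing four unaligned 16-byte loads before the matching stores in each iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper for illustration only; assumes dst and src do not overlap. */
void *copy64_sketch(void *dst, const void *src, size_t n)
{
    uint8_t *d = (uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;

    /* Assumption of this sketch: no overlap, so no memmove-style handling. */
    assert((uintptr_t)d + n <= (uintptr_t)s || (uintptr_t)s + n <= (uintptr_t)d);

#if defined(__aarch64__)
    while (n >= 64) {
        /* Four unaligned 16-byte loads first, then four stores, so the
         * loads of one 64-byte block are all issued before its stores. */
        uint8x16_t v0 = vld1q_u8(s);
        uint8x16_t v1 = vld1q_u8(s + 16);
        uint8x16_t v2 = vld1q_u8(s + 32);
        uint8x16_t v3 = vld1q_u8(s + 48);
        vst1q_u8(d, v0);
        vst1q_u8(d + 16, v1);
        vst1q_u8(d + 32, v2);
        vst1q_u8(d + 48, v3);
        s += 64;
        d += 64;
        n -= 64;
    }
#endif

    /* Remaining tail (or the whole copy on non-AArch64 builds). */
    if (n > 0) {
        memcpy(d, s, n);
    }
    return dst;
}
```

On non-AArch64 hosts the vector loop compiles out and the function reduces to a plain memcpy, so the sketch can be built and checked anywhere. A production routine would also need the small-size paths, overlap handling, and tuning that the Arm implementation provides.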
Resources:
• QNX SDP 8.0 memcpy reference:
https://www.qnx.com/developers/docs/8.0/com.qnx.doc.neutrino.lib_ref/topic/m/memcpy.html
• QNX SDP 7.0.4 LIBC_STRINGS article:
https://www.qnx.com/developers/articles/rel_6694_0.html
• TI Forum: TDA4VM memcpy performance on QNX:
https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1084144/tda4vm-the-performance-of-memcpy-in-qnx?
• Arm Neon intrinsics reference:
https://developer.arm.com/documentation/101028/0010/Advanced-SIMD--Neon--intrinsics