-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Fix unaligned memory access in librt internal helper function _write_short #20474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| #define _READ(result, data, type) \ | ||
| do { \ | ||
| *(result) = *(type *)(((ReadBufferObject *)data)->ptr); \ | ||
| memcpy((void *) result, ((ReadBufferObject *)data)->ptr, sizeof(type)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spacing between the cast and variable is inconsistent:
(void *) result vs (ReadBufferObject *)data. Since the previous code uses no spacing here, I suggest changing it to (void *)result.
| do { \ | ||
| *(type *)(((WriteBufferObject *)data)->ptr) = v; \ | ||
| type temp = v; \ | ||
| memcpy(((WriteBufferObject *)data)->ptr, (const void *) &temp, sizeof(type)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same spacing inconsistency issue here.
| #define _WRITE(data, type, v) \ | ||
| do { \ | ||
| *(type *)(((WriteBufferObject *)data)->ptr) = v; \ | ||
| type temp = v; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the usage of _WRITE(), I'm not 100% sure whether the temporary variable is needed, but it's probably safer when dealing with variables passed as a function parameter.
|
Besides two minor nitpicks with the formatting (spacing), the changes look good to me and I think it's the way to go to avoid unaligned access which can cause performance issues on other architectures besides the Thanks for fixing this! |
Fixes: #20473
This PR replaces the static upcasting in the helper macros
_READand_WRITEwith amemcpycall.Some of the functions in
librt_internal.cuse these macros to write 16 or 32 bit integer values, while thedata->ptris byte-aligned and may point to a memory address that is not divisible by 2 or 4.This leads to a crash (with SIGBUS) on CPU architectures that have strict memory alignment, or possibly a performance penalty on others.
By replacing the direct assignment with
memcpy, the compiler will figure out the most optimal method to access the unaligned memory.I analyzed the resulting machine code generated by gcc 15 on amd64 and sparc64: The
memcpycall was in fact optimized away, with instructions that avoided the crash on sparc64. On amd64, it made no difference performance-wise (compared to the original code).