Fix data race in MemTable::GetBloomFilter() on ARM weak memory model#424
Merged
ti-chi-bot[bot] merged 3 commits intotikv:8.10.tikvfrom Nov 18, 2025
Merged
Fix data race in MemTable::GetBloomFilter() on ARM weak memory model#424ti-chi-bot[bot] merged 3 commits intotikv:8.10.tikvfrom
ti-chi-bot[bot] merged 3 commits intotikv:8.10.tikvfrom
Conversation
bb5a10e to
6b5bbd2
Compare
6b5bbd2 to
3d604a5
Compare
hbisheng
approved these changes
Nov 17, 2025
640465e to
25d02b2
Compare
Use release-acquire memory ordering instead of relaxed to ensure DynamicBloom object is fully initialized before pointer publication. Problem: - On ARM (weak memory model), using relaxed ordering allows the pointer to be published before DynamicBloom::data_ is initialized - Readers may see non-null pointer but access uninitialized data_, causing crashes Solution: - Use memory_order_acquire for load and memory_order_release for store - This establishes synchronizes-with relationship, guaranteeing readers only see fully initialized objects Performance impact: - The acquire load executes on every call (hot path in Get/Add operations) - Overhead is minimal: typically 0-2 cycles on x86 (same as relaxed), ~1-3 cycles on ARM (may add memory barrier) - This overhead is <1% of total latency compared to subsequent bloom filter operations (hash computation + memory access) Signed-off-by: hhwyt <hhwyt1@gmail.com>
25d02b2 to
7101ba0
Compare
Signed-off-by: hhwyt <hhwyt1@gmail.com>
glorv
approved these changes
Nov 17, 2025
hbisheng
approved these changes
Nov 17, 2025
overvenus
approved these changes
Nov 17, 2025
LykxSassinator
approved these changes
Nov 17, 2025
Author
|
/merge |
glorv
approved these changes
Nov 18, 2025
hbisheng
approved these changes
Nov 18, 2025
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: glorv, hbisheng, LykxSassinator, overvenus The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
9 tasks
This was referenced Nov 18, 2025
Connor1996
reviewed
Nov 18, 2025
Member
Connor1996
left a comment
There was a problem hiding this comment.
Nice find, could we contribute it to upstream?
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Close tikv/tikv#19106
Fix data race in MemTable::GetBloomFilter() on ARM weak memory model
introduced in #305 (commit 405de0e). Affected version: TiKV 7.1+.
The issue was also proved using Relacy formal verification tool, which confirmed
the data race exists when using relaxed memory ordering, see more detail:
tikv/tikv#19106.
Sulution: Uses release-acquire to prevent data race on weak memory models (e.g.,
ARM). Without it: Thread1 in DynamicBloom::new() first initializes
data_, then stores bloom_filter_ptr_. With relaxed ordering, these two
writes may reach another CPU's cache in reversed order, causing Thread2
to see non-null bloom_filter_ptr_ but uninitialized data_ (crash). With
release-acquire: store(release) ensures all prior writes (including
data_ initialization) are visible before bloom_filter_ptr_ update,
load(acquire) ensures we see them in order, guaranteeing fully
initialized object.
Performance: This load() executes on every read/write call (hot path).
Acquire vs relaxed overhead: On x86, both compile to the same
instruction (MOV), so cost is identical. On ARM, acquire may add a
memory barrier (~1-3 cycles). In practice, since bloom_filter_ptr_ is
written once (during initialization with release) and then only read,
the cache line is typically in Shared state at runtime, meaning both
acquire and relaxed loads read from local cache with similar
performance. The acquire overhead is typically negligible compared to
subsequent bloom filter operations (hash computation and memory access).