
Update seq when merging deltas from partial log merge.#289

Open
aversecat wants to merge 2 commits into main from auke/merge_read_item_stale_seq

Conversation

@aversecat
Contributor

Two different clients can write deltas for totl indexes at the same time, recording their changes. When merged, a reader should apply both in order, and only once. To do so, the seq determines whether a delta has already been applied.

The code fails to update the seq while walking the trees for deltas to apply. As a result, when processing subsequent trees, it could re-apply deltas that were already applied. In the case of a large negative delta (e.g. the removal of a large number of files), the totl value could go negative, resulting in quota lockout.

The fix is simple: advance the seq when reading partial delta merges to avoid double counting.
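The seq-guarded merge described above can be sketched as follows. This is an illustrative model only, not scoutfs's actual C code; the names (`DeltaItem`, `merge_deltas`) and the representation of log-tree items are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class DeltaItem:
    seq: int     # sequence number of the log tree the delta came from
    value: int   # signed change to the totl counter

def merge_deltas(found, candidates):
    """Combine delta items from several finalized log trees into `found`,
    applying each delta at most once by tracking the highest seq applied."""
    for item in sorted(candidates, key=lambda i: i.seq):
        if item.seq <= found.seq:
            continue  # already applied earlier; skipping avoids double counting
        found.value += item.value
        found.seq = item.seq  # the fix: advance seq as each delta is applied
    return found
```

Without the `found.seq = item.seq` line, a second pass over the same trees would re-apply every delta, which is exactly how a large negative delta could drive the total below zero.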

@aversecat force-pushed the auke/merge_read_item_stale_seq branch from 7c231fa to e4e718e on March 3, 2026 at 19:52
merge_read_item() fails to update found->seq when combining delta items
from multiple finalized log trees. Add a test case to replicate the
conditions of this issue.

The conditions to reproduce this are pretty tricky. We have to exceed the block limit, and that's nearly impossible at VM scale without significant load. Alternatively, we can force a partial merge with a trigger, which is what this changeset does.

If we spammed this from a shell script we'd have to deal with the exec overhead of `echo`, so I've dropped in a python script to avoid that; without it we'd lose the window to hit this bug. Similarly, all the readers need to run as background threads so they can race within the short window in which the double counting can occur. All of this makes the test case rather convoluted.

Instead of random values to add up, I've picked values that let us identify whether double counting happened, and that avoid the problem that mounts may not have added their totals yet while we're reading through a sliding-window data set. Now we can just look at the bit pattern and distinguish the valid combinations (3 bits for 1 delta; only the rightmost bit may be set) from invalid ones (the middle or left bit in the window set). Each mount gets its own "bit window".
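The bit-window validity check might look something like this. A minimal sketch, assuming a 3-bit window per mount; the helper name and window layout are illustrative, not the test's actual code.

```python
WINDOW_BITS = 3  # one 3-bit window per mount, as described above

def window_valid(total, nr_mounts):
    """Return True if no mount's window shows double counting: within each
    3-bit window only the rightmost bit may be set (0 is also valid, since a
    mount may not have added its total yet)."""
    mask = (1 << WINDOW_BITS) - 1
    for m in range(nr_mounts):
        window = (total >> (m * WINDOW_BITS)) & mask
        if window & ~1:  # middle or left bit set -> delta applied twice
            return False
    return True
```

Any middle or left bit set inside a window means that mount's single delta was counted more than once.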

This seems to hit the bug on my VM test machines about 60%-80% of the time, so it's not overly consistent. Without the embedded python trigger smashing, the hit rate drops dramatically, probably to under 1%; it's just too slow. Previously this test sorted the fs_nrs by RID name, but that's likely made obsolete by running the readers all in parallel as it does now. There was also a version that lowered the merge block limit to 64k, but that was deemed too crude, though I'll note that it made it particularly easy to demonstrate the underlying issue ;).

Signed-off-by: Auke Kok <auke.kok@versity.com>
Two different clients can write deltas for totl indexes at the same
time, recording their changes. When merged, a reader should apply both
in order, and only once. To do so, the seq determines whether a delta
has already been applied.

The code fails to update the seq while walking the trees for deltas to
apply. As a result, when processing subsequent trees, it could
re-apply deltas that were already applied. In the case of a large
negative delta (e.g. the removal of a large number of files), the totl
value could go negative, resulting in quota lockout.

The fix is simple: advance the seq when reading partial delta merges
to avoid double counting.

Signed-off-by: Auke Kok <auke.kok@versity.com>
@aversecat force-pushed the auke/merge_read_item_stale_seq branch from e4e718e to 5c73c8d on March 5, 2026 at 21:23