Update seq when merging deltas from partial log merge. #289
Open
merge_read_item() fails to update found->seq when combining delta items from multiple finalized log trees. Add a test case to replicate the conditions of this issue.

The conditions needed to reproduce this are pretty tricky. We have to exceed the block limit, which is nearly impossible at VM scale without significant load. Or we can force a partial merge with a trigger, which is what this changeset does.

If we'd spam the trigger from a shell script we'd have to deal with the overhead of exec from `echo`, and with that overhead we'd lose the window to hit this bug, so I've dropped in a python script to avoid it. Similarly, all the readers need to run as background threads so they can race against the short window in which they can hit the double counting. This all makes this test case extremely convoluted.

Instead of random values to add up, I've picked values that allow us to identify whether double-counting happens, and that avoid the problem that mounts may not have added their totals yet as we're reading through a sliding window data set. Now we can just look at the bit pattern and tell all the valid combinations (3 bits for 1 delta, only the rightmost bit may be set) from the invalid ones (middle or left bit in the window set). Each mount gets its own "bit window".

This seems to hit the bug on my VM test machines about 60%-80% of the time, so not overly consistent. Without the python embedded trigger smashing, it drops dramatically to ... well probably under 1% I think, it's just too slow.

Previously this test sorted the fs_nrs by RID name, but that's likely made obsolete by running the readers all in parallel as it does now. There was also a version that lowered the merge block limit to 64k, but that was deemed too crude, though I note that it made it particularly easy to demonstrate the underlying issue ;).

Signed-off-by: Auke Kok <auke.kok@versity.com>
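To illustrate the "bit window" check described above, here is a minimal Python sketch. It is not the script from this changeset: the function names, and the assumption that mount number `nr` contributes exactly `1 << (3 * nr)`, are hypothetical and only follow the "3 bits for 1 delta, only rightmost bit may be set" description.

```python
BITS_PER_MOUNT = 3  # assumption: one 3-bit window per mount


def mount_delta(mount_nr):
    """Hypothetical value a mount adds: the rightmost bit of its own window."""
    return 1 << (BITS_PER_MOUNT * mount_nr)


def double_counted_mounts(total, nr_mounts):
    """Return the mount numbers whose window has the middle or left bit set."""
    bad = []
    for nr in range(nr_mounts):
        window = (total >> (BITS_PER_MOUNT * nr)) & 0b111
        # 0b000: the mount has not added its total yet (still valid).
        # 0b001: the delta was applied exactly once (valid).
        # Anything else means the delta was counted more than once.
        if window not in (0b000, 0b001):
            bad.append(nr)
    return bad


# Mounts 0 and 2 applied once, mount 1 not yet visible: valid.
print(double_counted_mounts(mount_delta(0) + mount_delta(2), 3))

# Mount 1 counted twice: the middle bit of its window is set.
print(double_counted_mounts(mount_delta(0) + 2 * mount_delta(1), 3))
```

Because a window of all zeroes is still valid, a reader that runs before a mount has added its total does not produce a false positive, which is the property the chosen values are meant to give.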
Two different clients can write deltas for totl indexes at the same time, recording their changes. When merged, a reader should apply both in order, and only once. To do so, the seq determines whether a delta has already been applied.

The code fails to update the seq while walking the trees for deltas to apply. As a result, when processing later trees, it could re-process deltas that were already applied. In the case of a large negative delta (e.g. removal of large amounts of files), the totl value could become negative, resulting in quota lockout.

The fix is simple: advance the seq when reading partial delta merges to avoid double counting.

Signed-off-by: Auke Kok <auke.kok@versity.com>
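The effect of not advancing the seq can be shown with a toy model in Python. This is only a sketch of the idea and not the scoutfs code: `Item`, `apply_deltas` and the `advance_seq` flag are hypothetical names, and the way the same delta shows up twice in the list is a deliberate simplification of what a reader may see around a partial merge.

```python
from dataclasses import dataclass


@dataclass
class Item:
    val: int  # accumulated totl value
    seq: int  # seq of the newest delta already folded into val


def apply_deltas(found, deltas, advance_seq):
    """Fold (seq, delta) pairs into found, skipping ones already applied."""
    for seq, delta in deltas:
        if seq <= found.seq:
            continue  # this delta is already part of found.val
        found.val += delta
        if advance_seq:
            # The fix in spirit: remember how far we have applied.
            found.seq = max(found.seq, seq)
    return found


# The same large negative delta (seq 7) is encountered twice.
seen = [(7, -100), (7, -100)]

buggy = apply_deltas(Item(val=150, seq=5), seen, advance_seq=False)
fixed = apply_deltas(Item(val=150, seq=5), seen, advance_seq=True)
print(buggy.val, fixed.val)
```

Without advancing the seq, the second encounter passes the "already applied" check again and the total goes negative; with it, the delta is applied once.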