-
Notifications
You must be signed in to change notification settings - Fork 167
fix(spark): Strict check processed blockIds #2678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
If we disable the multi replicas of shuffle-data, we should strict check the processed blockIds. WDYT? @jerqi |
Test Results 2 765 files - 385 2 765 suites - 385 5h 30m 54s ⏱️ - 1h 18m 44s For more details on these errors, see this check. Results for commit 0e2d03e. ± Comparison against base commit de55bd9. This pull request removes 266 and adds 1 tests. Note that renamed tests count towards both. |
Does it cover the speculation execution and AQE? |
I haven't take a deep think for this case. Let's froze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces stricter validation for processed block IDs by ensuring that the processed blocks exactly match the expected blocks, rather than allowing processed blocks to be a superset of expected blocks. The change prevents scenarios where extra, unexpected blocks are processed without detection.
Key Changes:
- Modified the
checkProcessedBlockIdsmethod to detect and reject extra processed blocks that aren't in the expected set - Removed an overloaded version of
checkProcessedBlockIdsthat allowed processed blocks to be a superset of expected blocks - Added comprehensive unit tests covering both valid and invalid scenarios
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| common/src/main/java/org/apache/uniffle/common/util/RssUtils.java | Updated validation logic to use destructive checking and removed the permissive overloaded method |
| common/src/test/java/org/apache/uniffle/common/util/RssUtilsTest.java | Added test cases for the stricter block ID validation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| expectedCount++; | ||
| if (processedBlockIds.contains(it.next())) { | ||
| actualCount++; | ||
| if (processedBlockIds.remove(it.next())) { |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method now mutates the input parameter processedBlockIds by calling remove(). This is a breaking behavioral change that could affect callers who reuse this Set. Consider either documenting this side effect clearly, cloning the Set before modification, or renaming the parameter to indicate it will be modified (e.g., mutableProcessedBlockIds).
| RssUtils.checkProcessedBlockIds(exceptedBlockIds, processedBlockIds); | ||
| fail(); |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test should verify the exception message or type, not just that any exception is thrown. Consider using assertThrows with a specific exception type check and optionally validating the error message contains expected text like 'illegal' and '1 blocks'.
|
|
||
| @Test | ||
| public void testCheckProcessedBlockIds() throws Exception { | ||
| Roaring64NavigableMap exceptedBlockIds = new Roaring64NavigableMap(); |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'excepted' to 'expected'.
What changes were proposed in this pull request?
Strict check processed blockIds
Why are the changes needed?
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests