fix(spark): Strict check processed blockIds #2678

zuston · 2025-11-18T09:46:58Z

What changes were proposed in this pull request?

Strict check processed blockIds

Why are the changes needed?

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit tests

zuston · 2025-11-18T12:07:46Z

If we disable the multi replicas of shuffle-data, we should strict check the processed blockIds. WDYT? @jerqi

github-actions · 2025-11-18T15:47:59Z

Test Results

2 765 files - 385 2 765 suites - 385 5h 30m 54s ⏱️ - 1h 18m 44s
948 tests - 265 934 ✅ - 277 1 💤 ±0 0 ❌ ±0 13 🔥 + 12
13 712 runs - 1 632 13 527 ✅ - 1 801 15 💤 ±0 0 ❌ ±0 170 🔥 +169

For more details on these errors, see this check.

Results for commit 0e2d03e. ± Comparison against base commit de55bd9.

This pull request removes 266 and adds 1 tests. Note that renamed tests count towards both.

org.apache.hadoop.mapred.SortWriteBufferManagerTest ‑ testCombineBuffer
org.apache.hadoop.mapred.SortWriteBufferManagerTest ‑ testCommitBlocksWhenMemoryShuffleDisabled
org.apache.hadoop.mapred.SortWriteBufferManagerTest ‑ testOnePartition
org.apache.hadoop.mapred.SortWriteBufferManagerTest ‑ testWriteException
org.apache.hadoop.mapred.SortWriteBufferManagerTest ‑ testWriteNormal
org.apache.hadoop.mapred.SortWriteBufferManagerTest ‑ testWriteNormalWithRemoteMerge
org.apache.hadoop.mapred.SortWriteBufferManagerTest ‑ testWriteNormalWithRemoteMergeAndCombine
org.apache.hadoop.mapred.SortWriteBufferTest ‑ testReadWrite
org.apache.hadoop.mapred.SortWriteBufferTest ‑ testSortBufferIterator
org.apache.hadoop.mapreduce.RssMRUtilsTest ‑ applyDynamicClientConfTest
…

org.apache.uniffle.common.util.RssUtilsTest ‑ testCheckProcessedBlockIds

jerqi · 2025-11-19T02:22:20Z

If we disable the multi replicas of shuffle-data, we should strict check the processed blockIds. WDYT? @jerqi

Does it cover the speculation execution and AQE?

zuston · 2025-11-19T03:27:09Z

If we disable the multi replicas of shuffle-data, we should strict check the processed blockIds. WDYT? @jerqi

Does it cover the speculation execution and AQE?

I haven't take a deep think for this case. Let's froze

Copilot

Pull request overview

This PR introduces stricter validation for processed block IDs by ensuring that the processed blocks exactly match the expected blocks, rather than allowing processed blocks to be a superset of expected blocks. The change prevents scenarios where extra, unexpected blocks are processed without detection.

Key Changes:

Modified the checkProcessedBlockIds method to detect and reject extra processed blocks that aren't in the expected set
Removed an overloaded version of checkProcessedBlockIds that allowed processed blocks to be a superset of expected blocks
Added comprehensive unit tests covering both valid and invalid scenarios

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
common/src/main/java/org/apache/uniffle/common/util/RssUtils.java	Updated validation logic to use destructive checking and removed the permissive overloaded method
common/src/test/java/org/apache/uniffle/common/util/RssUtilsTest.java	Added test cases for the stricter block ID validation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-12T02:35:57Z

common/src/main/java/org/apache/uniffle/common/util/RssUtils.java

      expectedCount++;
-      if (processedBlockIds.contains(it.next())) {
-        actualCount++;
+      if (processedBlockIds.remove(it.next())) {


The method now mutates the input parameter processedBlockIds by calling remove(). This is a breaking behavioral change that could affect callers who reuse this Set. Consider either documenting this side effect clearly, cloning the Set before modification, or renaming the parameter to indicate it will be modified (e.g., mutableProcessedBlockIds).

Copilot · 2025-12-12T02:35:58Z

common/src/test/java/org/apache/uniffle/common/util/RssUtilsTest.java

+      RssUtils.checkProcessedBlockIds(exceptedBlockIds, processedBlockIds);
+      fail();


The test should verify the exception message or type, not just that any exception is thrown. Consider using assertThrows with a specific exception type check and optionally validating the error message contains expected text like 'illegal' and '1 blocks'.

Copilot · 2025-12-12T02:35:58Z

common/src/test/java/org/apache/uniffle/common/util/RssUtilsTest.java


+  @Test
+  public void testCheckProcessedBlockIds() throws Exception {
+    Roaring64NavigableMap exceptedBlockIds = new Roaring64NavigableMap();


Corrected spelling of 'excepted' to 'expected'.

fix(spark): Strict check processed blockIds

0e2d03e

jerqi requested a review from Copilot December 12, 2025 02:35

Copilot AI reviewed Dec 12, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(spark): Strict check processed blockIds #2678

fix(spark): Strict check processed blockIds #2678

Uh oh!

zuston commented Nov 18, 2025

Uh oh!

zuston commented Nov 18, 2025

Uh oh!

github-actions bot commented Nov 18, 2025

Uh oh!

jerqi commented Nov 19, 2025

Uh oh!

zuston commented Nov 19, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 12, 2025

Uh oh!

Copilot AI Dec 12, 2025

Uh oh!

Copilot AI Dec 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		RssUtils.checkProcessedBlockIds(exceptedBlockIds, processedBlockIds);
		fail();

fix(spark): Strict check processed blockIds #2678

Are you sure you want to change the base?

fix(spark): Strict check processed blockIds #2678

Uh oh!

Conversation

zuston commented Nov 18, 2025

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Uh oh!

zuston commented Nov 18, 2025

Uh oh!

github-actions bot commented Nov 18, 2025

Test Results

Uh oh!

jerqi commented Nov 19, 2025

Uh oh!

zuston commented Nov 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

zuston commented Nov 19, 2025 •

edited

Loading