Spark 4.1: Migrate to new version framework in DSv2 #15240

Open
aokolnychyi wants to merge 1 commit into apache:main from aokolnychyi:spark-dsv2-version-framework-rebased

@aokolnychyi aokolnychyi commented Feb 5, 2026

This PR migrates our Spark 4.1 connector to the new version framework in DSv2.

@aokolnychyi
Contributor Author

This PR is NOT ready for review just yet.

@aokolnychyi aokolnychyi marked this pull request as draft February 5, 2026 21:21
@manuzhang
Member

@aokolnychyi, thanks for the PR. Do we really call it a version framework in Spark? I can't find any documentation.

@aokolnychyi
Contributor Author

aokolnychyi commented Feb 10, 2026

@manuzhang, that's what I tend to call it; it's not official by any means :) The core idea is that Spark has a notion of versioned tables and keeps track of the versions.

@aokolnychyi aokolnychyi force-pushed the spark-dsv2-version-framework-rebased branch from 558e336 to 748cbec Compare February 18, 2026 15:23
@github-actions github-actions bot removed the API label Feb 18, 2026
@aokolnychyi aokolnychyi marked this pull request as ready for review February 18, 2026 15:24
@aokolnychyi aokolnychyi changed the title from "[WIP] Spark 4.1: Migrate to new version framework in DSv2" to "Spark 4.1: Migrate to new version framework in DSv2" Feb 18, 2026
TableProperties.MERGE_MODE, RowLevelOperationMode.COPY_ON_WRITE.modeName());
}

@TestTemplate

@aokolnychyi aokolnychyi Feb 18, 2026

This test verified a specific limitation: the table could not be modified concurrently from another thread. That is no longer an issue because we pin the scanned snapshot, so it doesn't matter if the underlying table evolves in the meantime.
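
To illustrate why pinning removes the limitation, here is a minimal toy sketch. It assumes nothing about the actual connector code: `ToyTable`, `ToySnapshot`, and `PinnedScan` are hypothetical names invented for this example, not Iceberg or Spark classes. The scan captures the snapshot once when it is created, so a commit from another thread changes what the table points to but not what the scan reads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: none of these types are Iceberg or Spark classes.
final class SnapshotPinningSketch {

  // An immutable table state: a snapshot id plus the rows visible at that snapshot.
  static final class ToySnapshot {
    private final long id;
    private final List<String> rows;

    ToySnapshot(long id, List<String> rows) {
      this.id = id;
      this.rows = List.copyOf(rows);
    }

    long id() { return id; }

    List<String> rows() { return rows; }
  }

  // A table whose current snapshot can be swapped atomically by any writer thread.
  static final class ToyTable {
    private final AtomicReference<ToySnapshot> current =
        new AtomicReference<>(new ToySnapshot(1L, List.of("a", "b")));

    ToySnapshot currentSnapshot() { return current.get(); }

    // Commits a new snapshot that appends one row to the latest state.
    void append(String row) {
      current.updateAndGet(s -> {
        List<String> rows = new ArrayList<>(s.rows());
        rows.add(row);
        return new ToySnapshot(s.id() + 1, rows);
      });
    }
  }

  // A scan that pins the snapshot once, at construction time.
  static final class PinnedScan {
    private final ToySnapshot pinned;

    PinnedScan(ToyTable table) { this.pinned = table.currentSnapshot(); }

    long snapshotId() { return pinned.id(); }

    List<String> rows() { return pinned.rows(); }
  }

  public static void main(String[] args) throws InterruptedException {
    ToyTable table = new ToyTable();
    PinnedScan scan = new PinnedScan(table);

    // A concurrent writer commits while the scan is still in use.
    Thread writer = new Thread(() -> table.append("c"));
    writer.start();
    writer.join();

    System.out.println(scan.rows());                    // [a, b]
    System.out.println(table.currentSnapshot().rows()); // [a, b, c]
  }
}
```

In this model the scan never re-reads the table reference, which is the property the comment relies on; how the real connector pins the snapshot is not shown in this conversation.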

* under the License.
*/
-package org.apache.iceberg.spark.sql;
+package org.apache.iceberg.spark.extensions;
Contributor Author

Branching will require extensions in 4.1 because we must pin the branch during analysis. Spark doesn't let us do this cleanly today (see ResolveBranch). It will be fixed in 4.2.
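
For readers unsure what "require extensions" implies in practice, the sketch below collects the standard Spark setting that turns on Iceberg's SQL extensions. The `spark.sql.extensions` key and the `IcebergSparkSessionExtensions` class are the usual way to enable them; the class name `BranchSessionConfSketch` and the helper `extensionsConf` are hypothetical, and whether this setting alone is sufficient for branch support in 4.1 is an assumption not confirmed in this conversation. The settings are returned as a plain map so the example runs without Spark on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Iceberg or Spark.
final class BranchSessionConfSketch {

  // Returns the Spark setting that enables Iceberg's SQL extensions.
  static Map<String, String> extensionsConf() {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put(
        "spark.sql.extensions",
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions");
    return conf;
  }

  public static void main(String[] args) {
    // Each entry would be passed to SparkSession.builder().config(key, value).
    extensionsConf().forEach((key, value) -> System.out.println(key + "=" + value));
  }
}
```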

@aokolnychyi aokolnychyi force-pushed the spark-dsv2-version-framework-rebased branch 3 times, most recently from 7de899b to 10fd253 Compare February 18, 2026 23:21
TableProperties.DELETE_MODE, RowLevelOperationMode.COPY_ON_WRITE.modeName());
}

@TestTemplate
Contributor Author

Same explanation as in MERGE below.

}

@TestTemplate
public synchronized void testUpdateWithConcurrentTableRefresh() throws Exception {
Contributor Author

Same explanation as in MERGE.

@aokolnychyi aokolnychyi force-pushed the spark-dsv2-version-framework-rebased branch from 10fd253 to c5a30b9 Compare February 18, 2026 23:33