Spark 4.1: Display write metrics on SQL UI #15104

Open
manuzhang wants to merge 2 commits into apache:main from manuzhang:spark-4.1-write-metrics
Member

@manuzhang manuzhang commented Jan 21, 2026

No description provided.

Copilot AI left a comment

Pull request overview

This PR ports write metrics display functionality from Spark 4.0 to Spark 4.1, enabling write operation metrics to be shown in the Spark SQL UI. The changes introduce custom metric classes for tracking various write operations and integrate them with Spark's connector API.

Changes:

  • Added 25 new custom metric classes extending CustomSumMetric to track data files, delete files, records, and file sizes for added/removed/total categories
  • Integrated metrics reporting into SparkWrite and SparkPositionDeltaWrite by implementing reportDriverMetrics() and supportedCustomMetrics()
  • Enhanced BaseTable to support combining multiple metrics reporters
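
The pattern described in the list above can be sketched without a Spark dependency. The block below is only an illustration: `TaskMetric` and `SumMetric` are local stand-ins shaped like Spark's `CustomTaskMetric` and `CustomSumMetric`, and the metric name, description, and the `reportDriverMetrics(long)` helper are assumptions rather than the PR's actual code.

```java
import java.util.Arrays;

public class WriteMetricsSketch {

  // Stand-in (assumption) for Spark's CustomTaskMetric: one named long value.
  interface TaskMetric {
    String name();
    long value();
  }

  // Stand-in (assumption) for Spark's CustomSumMetric: sums reported values for display.
  abstract static class SumMetric {
    abstract String name();
    abstract String description();

    String aggregateTaskMetrics(long[] taskMetrics) {
      return String.valueOf(Arrays.stream(taskMetrics).sum());
    }
  }

  // Hypothetical metric named after the AddedDataFiles class listed in this PR.
  static class AddedDataFiles extends SumMetric {
    static final String NAME = "addedDataFiles";

    @Override String name() { return NAME; }
    @Override String description() { return "number of added data files"; }
  }

  // What a write might report from the driver once the commit has finished.
  static TaskMetric[] reportDriverMetrics(long addedDataFiles) {
    TaskMetric metric = new TaskMetric() {
      @Override public String name() { return AddedDataFiles.NAME; }
      @Override public long value() { return addedDataFiles; }
    };
    return new TaskMetric[] {metric};
  }

  public static void main(String[] args) {
    TaskMetric[] reported = reportDriverMetrics(3L);
    long[] values = Arrays.stream(reported).mapToLong(TaskMetric::value).toArray();
    // The UI would show the aggregated string next to the metric description.
    System.out.println(reported[0].name() + " = " + new AddedDataFiles().aggregateTaskMetrics(values));
  }
}
```

In the real connector API the metric declared by `supportedCustomMetrics()` and the value returned by `reportDriverMetrics()` are matched by name, which is why the sketch shares a single `NAME` constant between the two.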

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| TotalRecords.java | Defines metric for tracking total record count |
| TotalPositionalDeletes.java | Defines metric for tracking total positional deletes |
| TotalFileSizeInBytes.java | Defines metric for tracking total file size |
| TotalEqualityDeletes.java | Defines metric for tracking total equality deletes |
| TotalDeleteFiles.java | Defines metric for tracking total delete files |
| TotalDataFiles.java | Defines metric for tracking total data files |
| RemovedRecords.java | Defines metric for tracking removed records |
| RemovedPositionalDeletes.java | Defines metric for tracking removed positional deletes |
| RemovedPositionalDeleteFiles.java | Defines metric for tracking removed positional delete files |
| RemovedFileSizeInBytes.java | Defines metric for tracking removed file size |
| RemovedEqualityDeletes.java | Defines metric for tracking removed equality deletes |
| RemovedEqualityDeleteFiles.java | Defines metric for tracking removed equality delete files |
| RemovedDeleteFiles.java | Defines metric for tracking removed delete files |
| RemovedDataFiles.java | Defines metric for tracking removed data files |
| AddedRecords.java | Defines metric for tracking added records |
| AddedPositionalDeletes.java | Defines metric for tracking added positional deletes |
| AddedPositionalDeleteFiles.java | Defines metric for tracking added positional delete files |
| AddedFileSizeInBytes.java | Defines metric for tracking added file size |
| AddedEqualityDeletes.java | Defines metric for tracking added equality deletes |
| AddedEqualityDeleteFiles.java | Defines metric for tracking added equality delete files |
| AddedDeleteFiles.java | Defines metric for tracking added delete files |
| AddedDataFiles.java | Defines metric for tracking added data files |
| SparkWriteBuilder.java | Adds custom metrics support to write builder |
| SparkWrite.java | Integrates metrics reporter and implements reportDriverMetrics() |
| SparkPositionDeltaWrite.java | Integrates metrics reporter and implements custom metrics methods |
| SparkWriteUtil.java | Provides utility methods for creating custom metrics and task metrics |
| InMemoryMetricsReporter.java | Adds commitReport() method to retrieve commit metrics |
| BaseTable.java | Adds combineMetricsReporter() method for combining metrics reporters |
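
The last two entries describe combining a table's existing metrics reporter with an in-memory one so that the write can read the commit metrics back. A rough sketch of that idea follows; `Reporter`, `InMemoryReporter`, and `combine` are hypothetical stand-ins written for illustration, with reports simplified to strings, and they are not the signatures used in this PR.

```java
import java.util.ArrayList;
import java.util.List;

public class CombinedReporterSketch {

  // Stand-in (assumption) for a metrics reporter: receives a report after an operation.
  interface Reporter {
    void report(String report);
  }

  // Keeps the most recent report in memory so the caller can read it back later.
  static class InMemoryReporter implements Reporter {
    private String last;

    @Override public void report(String report) { this.last = report; }

    String lastReport() { return last; }
  }

  // Fans each report out to both reporters; a null side leaves nothing to combine.
  static Reporter combine(Reporter first, Reporter second) {
    if (first == null) return second;
    if (second == null) return first;
    return report -> {
      first.report(report);
      second.report(report);
    };
  }

  public static void main(String[] args) {
    List<String> existingSink = new ArrayList<>(); // plays the table's configured reporter
    InMemoryReporter inMemory = new InMemoryReporter();

    Reporter combined = combine(existingSink::add, inMemory);
    combined.report("commit-report");

    // Both the pre-existing reporter and the in-memory one observed the report.
    System.out.println(existingSink + " / " + inMemory.lastReport());
  }
}
```

The point of combining rather than replacing is that a reporter the user has already configured keeps receiving reports while the write gains a local copy to surface in the UI.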

@manuzhang manuzhang force-pushed the spark-4.1-write-metrics branch 2 times, most recently from 1c63813 to 2da3d86 Compare January 27, 2026 16:14
@manuzhang manuzhang requested a review from nastra February 11, 2026 16:27
Contributor

@nastra nastra left a comment

@manuzhang can you please add some tests similar to TestSparkReadMetrics to make sure we actually get those commit metrics?

Contributor

@nastra nastra left a comment

Overall I think this is close, but we're missing some tests and changes to SparkPositionDeletesRewrite, since that one also implements Spark's Write API.

@manuzhang manuzhang force-pushed the spark-4.1-write-metrics branch 2 times, most recently from c2e3242 to 3e32551 Compare February 13, 2026 10:05
Contributor

Why not run tests for v3 tables by default?

Member Author

The current metrics are intended for v2 tables.

Contributor

Shouldn't the metrics work across different format versions? My point is: why are we limiting ourselves to a format version that isn't the latest one?

Member Author

@manuzhang manuzhang Feb 15, 2026

The main difference is that position deletes are replaced by deletion vectors in v3. I'd like to target the default table format version first. Besides, this PR was initially opened against Spark 3.5 in 2024.

Co-authored-by: copilot <copilot@github.com>
@manuzhang manuzhang force-pushed the spark-4.1-write-metrics branch from 3e32551 to c2b27e2 Compare February 13, 2026 14:08
@nastra nastra requested a review from singhpk234 February 18, 2026 06:44