@yew1eb commented Nov 24, 2025

Which issue does this PR close?

Closes #1658.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

How was this patch tested?

(image attached)

@yew1eb changed the title from "Add tpch suite" to "[AURON #1658] Add tpch suite" on Nov 24, 2025
@yew1eb force-pushed the add_tpch_suite branch 7 times, most recently from c7853b1 to a336113, on December 2, 2025 15:02
@cxzl25 requested a review from Copilot on December 12, 2025 11:51

Copilot AI left a comment

Pull request overview

This PR adds a comprehensive TPC-H benchmark test suite to the Auron Spark extension project. The implementation includes a test framework for running TPC-H queries (q1-q22), verifying results against expected outputs, and validating query execution plans for stability testing.

Key Changes:

  • Introduces AuronTPCHSuite abstract test class with result and plan verification capabilities
  • Adds AuronTPCHV1Suite variant for testing with V1 Parquet data source
  • Includes 22 TPC-H SQL queries, expected results, and plan stability files for Spark 3.5
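The result-verification idea in the overview can be illustrated with a minimal sketch. It is not the code from AuronTPCHSuite; the object name `GoldenResultCheck`, the whitespace normalization, and the mismatch message format are all assumptions made for illustration. It compares actual query output against an expected (golden) text and reports the first differing line.

```scala
// Hypothetical helper, not part of the PR: compares query output to a golden text.
object GoldenResultCheck {
  // Assumption: trailing whitespace and blank lines are not significant.
  def normalize(text: String): Seq[String] =
    text.split("\\r?\\n").toSeq.map(_.trim).filter(_.nonEmpty)

  // Returns None when both outputs match after normalization,
  // otherwise a description of the first mismatching line (1-based).
  def firstMismatch(expected: String, actual: String): Option[String] = {
    val padded = normalize(expected).zipAll(normalize(actual), "<missing>", "<missing>")
    val idx = padded.indexWhere { case (e, a) => e != a }
    if (idx < 0) None
    else {
      val (e, a) = padded(idx)
      Some(s"line ${idx + 1}: expected [$e] but got [$a]")
    }
  }
}
```

A test would typically fail with the returned message when `firstMismatch` is defined. Whether the actual suite normalizes output this way is not stated in this conversation.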

Reviewed changes

Copilot reviewed 68 out of 76 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| AuronTPCHSuite.scala | Test framework implementing TPC-H query execution and verification logic |
| tpch-queries/*.sql | 22 TPC-H benchmark SQL queries (q1-q22) |
| tpch-query-results/*.out | Expected query results for verification |
| tpch-plan-stability/*.txt | Expected query execution plans for Spark 3.5 |
| tpch-data-parquet/* | Sample TPC-H data in Parquet format |
| .rat-excludes | Updated to exclude TPC-H resource files from license checks |

@@ -0,0 +1,151 @@
== Physical Plan ==

Is there any way to update query plan and query result in batches?

I've added a switch: set REGEN_TPCH_GOLDEN_FILES to 1 to regenerate query plans and results.
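For readers unfamiliar with this pattern, here is a minimal sketch of how an environment-variable switch for regenerating golden files could work. Only the variable name `REGEN_TPCH_GOLDEN_FILES` and the value `1` come from the comment above; the object `GoldenFileSwitch`, its method names, and the file handling are hypothetical and may differ from what the PR actually does.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical helper, not the PR's implementation.
object GoldenFileSwitch {
  // Assumption: the switch is read from the process environment,
  // and only the exact value "1" enables regeneration.
  def regenerateEnabled(env: Map[String, String] = sys.env): Boolean =
    env.get("REGEN_TPCH_GOLDEN_FILES").contains("1")

  // In regeneration mode, overwrite the golden file with the actual output.
  // Otherwise, compare the actual output with the stored golden file.
  // Returns true when the file was written or when the contents match.
  def checkOrRegenerate(goldenFile: Path, actual: String, regenerate: Boolean): Boolean =
    if (regenerate) {
      Option(goldenFile.getParent).foreach(dir => Files.createDirectories(dir))
      Files.write(goldenFile, actual.getBytes(StandardCharsets.UTF_8))
      true
    } else {
      Files.exists(goldenFile) &&
        new String(Files.readAllBytes(goldenFile), StandardCharsets.UTF_8) == actual
    }
}
```

With a design like this, one run with the variable exported as `1` rewrites every plan and result file in a batch, and later runs without it go back to strict comparison.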

@yew1eb force-pushed the add_tpch_suite branch 2 times, most recently from 75c2bc3 to 3eba587, on December 15, 2025 18:49
@yew1eb commented Dec 16, 2025

@cxzl25 PTAL

@yew1eb requested a review from cxzl25 on December 17, 2025 02:23

Development

Successfully merging this pull request may close these issues.

Add TPC-H test suite
