Add Early Validation for Aggregation Function Data Types During Table Creation #2302

@wuchong

Search before asking

  • I searched in the issues and found nothing similar.

Description

Currently, Fluss does not validate at table creation time whether a column's data type is compatible with its aggregation function (e.g., MAX, MIN). As a result, using an unsupported type such as ARRAY, MAP, or ROW with an aggregation function that requires an orderable type leads to runtime failures or unexpected behavior.

To improve developer experience and system robustness, we should fail fast during DDL validation.

✅ Expected Behavior

  • During CREATE TABLE (or table descriptor validation), if an aggregation function is specified on a column with an unsupported data type, Fluss should throw a clear, actionable error immediately.
  • The validation should reside in:
    org.apache.fluss.server.utils.TableDescriptorValidation#validateAggregationFunctionParameters
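A minimal sketch of the fail-fast check described above, under loud assumptions: `FailFastDdlSketch`, `Column`, and `isOrderable` are hypothetical stand-ins, types are modeled as plain strings, and only MAX/MIN are treated as requiring orderable types. It is not the actual `TableDescriptorValidation` code, only an illustration of rejecting an incompatible column before the table is created.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a fail-fast DDL check; not Fluss's actual code. */
final class FailFastDdlSketch {

    /** Simplified stand-in for a column in a table descriptor. */
    static final class Column {
        final String name;
        final String typeName;    // e.g. "INT" or "ARRAY<INT>"
        final String aggFunction; // e.g. "max"; null when no aggregation is configured

        Column(String name, String typeName, String aggFunction) {
            this.name = name;
            this.typeName = typeName;
            this.aggFunction = aggFunction;
        }
    }

    /** Assumption: ARRAY, MAP, and ROW are the non-orderable types. */
    static boolean isOrderable(String typeName) {
        String upper = typeName.toUpperCase(Locale.ROOT);
        return !(upper.startsWith("ARRAY") || upper.startsWith("MAP") || upper.startsWith("ROW"));
    }

    /** Throws on the first column whose type cannot be used with its aggregation function. */
    static void validateAggregationColumns(List<Column> columns) {
        for (Column c : columns) {
            if (c.aggFunction == null) {
                continue; // nothing to validate for this column
            }
            boolean needsOrdering =
                    c.aggFunction.equalsIgnoreCase("max") || c.aggFunction.equalsIgnoreCase("min");
            if (needsOrdering && !isOrderable(c.typeName)) {
                throw new IllegalArgumentException(
                        "Column '" + c.name + "' has type " + c.typeName
                                + ", which is not supported by aggregation function '"
                                + c.aggFunction + "'. Use an orderable type or a different function.");
            }
        }
    }
}
```

The error message names the column, its type, and the function so that the user can fix the DDL without reading server logs.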

Solution

To keep it easy to add new aggregate functions in the future, we should introduce a standardized way to declare and validate the data types each function supports.

One approach is to add a method such as validateDataType() to the AggFunctionType interface, similar in spirit to the existing validateParameter(), that explicitly checks whether a given column type is compatible with the aggregate function.
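One possible shape for this, sketched with hypothetical names: `AggFunction` stands in for AggFunctionType, `TypeRoot` stands in for the column type, and the supported-type sets (including the SUM entry) are assumptions made only for illustration. Each function declares the types it accepts, so adding a new function means adding one entry rather than touching the validation logic.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch; these names only mirror the proposal, not Fluss's real classes. */
final class AggTypeValidationSketch {

    /** Simplified stand-in for the root of a column's data type. */
    enum TypeRoot { INT, BIGINT, DOUBLE, STRING, ARRAY, MAP, ROW }

    /** Stand-in for AggFunctionType: each function declares the type roots it accepts. */
    enum AggFunction {
        // Assumption: MAX and MIN accept any orderable scalar type.
        MAX(EnumSet.of(TypeRoot.INT, TypeRoot.BIGINT, TypeRoot.DOUBLE, TypeRoot.STRING)),
        MIN(EnumSet.of(TypeRoot.INT, TypeRoot.BIGINT, TypeRoot.DOUBLE, TypeRoot.STRING)),
        // Assumption: SUM accepts numeric types only.
        SUM(EnumSet.of(TypeRoot.INT, TypeRoot.BIGINT, TypeRoot.DOUBLE));

        private final Set<TypeRoot> supported;

        AggFunction(Set<TypeRoot> supported) {
            this.supported = supported;
        }

        /** Throws if the column's type is not among the types this function supports. */
        void validateDataType(String columnName, TypeRoot type) {
            if (!supported.contains(type)) {
                throw new IllegalArgumentException(
                        "Aggregation function " + name() + " does not support type " + type
                                + " of column '" + columnName + "'. Supported types: " + supported);
            }
        }
    }

    public static void main(String[] args) {
        AggFunction.MAX.validateDataType("price", TypeRoot.DOUBLE); // passes silently
        try {
            AggFunction.MAX.validateDataType("tags", TypeRoot.ARRAY);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The table-level validation would then only need to look up the function for each column and call its validateDataType().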

🧪 Test Coverage

  • Tests for this can be added to FlussAdminITCase, modeled on FlussAdminITCase#testCreateTableWithInvalidProperty.

Willingness to contribute

  • I'm willing to submit a PR!
