-
Notifications
You must be signed in to change notification settings - Fork 20
119-Introduce-CompactedKeyEncoder #124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
119-Introduce-CompactedKeyEncoder #124
Conversation
|
@luoyuxia @Kelvinyu1117 Would appreciate review here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces the CompactedKeyEncoder infrastructure to encode key columns of rows into compact binary format. The implementation closely follows the Java reference implementation and provides a foundation for encoding primary keys for various data lake formats.
Key Changes
- Added
CompactedKeyEncoderwith support for basic data types (integers, floats, strings, bytes, binary, boolean) - Introduced
BinaryWritertrait andValueWriterabstraction for extensible binary serialization - Added
FieldGettertrait to extract typed field values from rows - Extended
Datumenum withBorrowedBlobvariant to support borrowed byte slices
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 19 comments.
Show a summary per file
| File | Description |
|---|---|
crates/fluss/src/row/mod.rs |
Added module declarations for binary, encode, and field_getter; added helper method to create GenericRow from data |
crates/fluss/src/row/field_getter.rs |
New file implementing FieldGetter trait and type-specific getters for extracting fields from rows |
crates/fluss/src/row/encode/mod.rs |
New file defining KeyEncoder trait and factory method for creating encoders based on data lake format |
crates/fluss/src/row/encode/compacted_key_encoder.rs |
Core implementation of CompactedKeyEncoder with comprehensive unit tests |
crates/fluss/src/row/datum.rs |
Added BorrowedBlob variant and corresponding From implementation for borrowed byte slices |
crates/fluss/src/row/compacted/mod.rs |
Exposed CompactedKeyWriter module |
crates/fluss/src/row/compacted/compacted_key_writer.rs |
Wrapper around CompactedRowWriter that rejects null values for key encoding |
crates/fluss/src/row/binary/mod.rs |
New module defining BinaryRowFormat enum and re-exporting binary writer types |
crates/fluss/src/row/binary/binary_writer.rs |
Defines BinaryWriter and ValueWriter traits with implementations for all basic data types |
crates/fluss/src/metadata/datatype.rs |
Added test helper methods to create RowType from data types and field names |
crates/fluss/Cargo.toml |
Added delegate crate dependency for delegation pattern |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Copilot comments have been addressed |
luoyuxia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@leekeiabstraction Thanks for the pr. Left minor comment. PTAL
|
@luoyuxia Addressed all comments, PTAL! |
Purpose
Linked issue: close #119
Brief change log
Added Value enum (not in Java) to allow more graceful/polymorphic BinaryWriter implementationsSeveral TODOs that may benefit from being broken up into smaller task to prevent PR from becoming too large:
CompactedKeyEncoder's all data type unit testTests