Conversation
…cross FlatBuffers schema and language bindings, add `fflate` dependency, and remove memory bank documentation.
Jorgedanisc
|
🎉 This PR is included in version 2.3.1 🎉 The release is available on:
Your semantic-release bot 📦🚀 |
|
🎉 This PR is included in version 2.2.3 🎉 The release is available on:
Your semantic-release bot 📦🚀 |
|
🎉 This PR is included in version 2.1.1 🎉 The release is available on:
Your semantic-release bot 📦🚀 |
There was a problem hiding this comment.
Pull request overview
This PR migrates JSON string fields to compressed binary data in the FlatBuffers schema to reduce file sizes and improve performance. The changes affect three key fields: custom_data in _DucElementBase, localization in DucBlockMetadata, and patch in Delta.
Changes:
- Migrated JSON string fields to zlib-compressed binary format (
[ubyte]) in the FlatBuffers schema - Updated serialization and parsing logic across Rust, Python, and TypeScript implementations
- Made
BlockSourcetype optional in TypeScript and updated related documentation - Removed memory-bank documentation files
- Added compression library dependencies (flate2 for Rust, fflate for TypeScript)
Reviewed changes
Copilot reviewed 27 out of 30 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| schema/duc.fbs | Deprecated old string fields, added new [ubyte] fields for compressed data, added size_bytes to Delta |
| packages/ducrs/src/types.rs | Added serde traits to JSONPatchOperation for JSON serialization |
| packages/ducrs/src/serialize.rs | Added zlib compression helper, updated serialization for custom_data, localization, and patch |
| packages/ducrs/src/parse.rs | Added zlib decompression helper, updated parsing for binary JSON fields |
| packages/ducrs/src/flatbuffers/duc_generated.rs | Generated code reflecting schema changes with field offset updates |
| packages/ducrs/Cargo.toml | Added flate2 dependency, made serde/serde_json required dependencies |
| packages/ducpy/src/ducpy/serialize.py | Added gzip compression for custom_data, localization, and patch fields |
| packages/ducpy/src/ducpy/parse.py | Added gzip decompression helper for binary JSON fields |
| packages/ducpy/src/ducpy/Duc/_DucElementBase.py | Generated code reflecting custom_data as byte vector |
| packages/ducpy/src/ducpy/Duc/DucBlockMetadata.py | Generated code reflecting localization as byte vector |
| packages/ducpy/src/ducpy/Duc/Delta.py | Generated code reflecting patch as byte vector with size_bytes field |
| packages/ducjs/src/utils/index.ts | Added omitKeys utility function |
| packages/ducjs/src/types/index.ts | Added BaseExportedDataState type, imported MarkOptional |
| packages/ducjs/src/types/elements/index.ts | Changed BlockSource type definition, made DucBlockMetadata.source optional |
| packages/ducjs/src/serialize.ts | Added zlib compression for custom_data, localization, and patch fields |
| packages/ducjs/src/restore/restoreDataState.ts | Updated to handle optional source field |
| packages/ducjs/src/parse.ts | Added parseBinaryToJson with backward compatibility fallback, updated parsing logic |
| packages/ducjs/src/flatbuffers/duc/duc-element-base.ts | Generated code reflecting custom_data as byte vector |
| packages/ducjs/src/flatbuffers/duc/duc-block-metadata.ts | Generated code reflecting localization as byte vector |
| packages/ducjs/src/flatbuffers/duc/delta.ts | Generated code reflecting patch as byte vector with size_bytes field |
| packages/ducjs/package.json | Added fflate dependency |
| memory-bank/*.md | Removed all memory-bank documentation files |
| bun.lock | Updated lockfile with fflate dependency |
| Cargo.lock | Updated lockfile with flate2 dependency |
| .gitignore | Added tsconfig.tsbuildinfo |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| custom_data_offset = builder.CreateString(json.dumps(base.custom_data)) | ||
| # Compress JSON and create byte vector for custom_data | ||
| json_str = json.dumps(base.custom_data) | ||
| compressed = gzip.compress(json_str.encode("utf-8")) |
There was a problem hiding this comment.
Python is using gzip compression while Rust and TypeScript are using zlib compression. These are different compression formats and are not compatible with each other. This will cause cross-language serialization/deserialization to fail.
All three implementations (Rust, Python, and TypeScript) must use the same compression algorithm. Since Rust uses zlib (flate2::ZlibEncoder) and TypeScript uses zlib (fflate's zlibSync), Python should also use zlib instead of gzip.
In Python, you can use zlib instead:
- Import:
import zlib - Compress:
zlib.compress(data) - Decompress:
zlib.decompress(data)
This same issue affects all gzip usages in the Python codebase (localization and patch fields as well).
| if not data: | ||
| return None | ||
| try: | ||
| decompressed = gzip.decompress(data) |
There was a problem hiding this comment.
Python is using gzip decompression while Rust and TypeScript are using zlib decompression. These are different compression formats and are not compatible with each other. This will cause cross-language serialization/deserialization to fail.
Change from gzip to zlib to match the other language implementations:
- Import:
import zlib - Decompress:
zlib.decompress(data)
| # Already a dict (from parsed data) | ||
| patch_list.append(op) | ||
| json_str = json.dumps(patch_list) | ||
| compressed = gzip.compress(json_str.encode("utf-8")) |
There was a problem hiding this comment.
Python is using gzip compression for patch data while Rust and TypeScript are using zlib compression. Change from gzip to zlib to match the other language implementations.
| @@ -847,7 +848,7 @@ function writeBlockAttrDef(b: flatbuffers.Builder, d: DucBlockAttributeDefinitio | |||
|
|
|||
| function writeBlockMetadata(b: flatbuffers.Builder, metadata: DucBlockMetadata): number { | |||
| const source = b.createString(metadata.source); | |||
There was a problem hiding this comment.
The metadata.source field is now optional (according to the type definition at packages/ducjs/src/types/elements/index.ts:1073), but this code doesn't check if it's defined before calling createString. If metadata.source is undefined, this will fail. Consider adding a check or providing a default value like an empty string.
| const source = b.createString(metadata.source); | |
| const source = b.createString(metadata.source ?? ""); |
| localizationOffset = Duc.DucBlockMetadata.createLocalizationVector(b, localizationBin); | ||
| } | ||
|
|
||
| const source = b.createString(metadata.source); |
There was a problem hiding this comment.
The metadata.source field is now optional (according to the type definition at packages/ducjs/src/types/elements/index.ts:1073), but this code doesn't check if it's defined before calling createString. If metadata.source is undefined, this will fail. Consider adding a check or providing a default value like an empty string.
| const source = b.createString(metadata.source); | |
| const source = b.createString(metadata.source ?? ""); |
| fn parse_binary_json_to_string(vec: Option<flatbuffers::Vector<'_, u8>>) -> Option<String> { | ||
| match vec { | ||
| Some(v) if v.len() > 0 => { | ||
| // Collect the bytes into a Vec<u8> | ||
| let data: Vec<u8> = (0..v.len()).map(|i| v.get(i)).collect(); | ||
|
|
||
| // Parse zlib-compressed JSON | ||
| use flate2::read::ZlibDecoder; | ||
| use std::io::Read; | ||
|
|
||
| let mut d = ZlibDecoder::new(data.as_slice()); | ||
| let mut decompressed = Vec::new(); | ||
| if d.read_to_end(&mut decompressed).is_ok() { | ||
| String::from_utf8(decompressed).ok() | ||
| } else { | ||
| None | ||
| } | ||
| } | ||
| _ => None, | ||
| } | ||
| } |
There was a problem hiding this comment.
This decompression function silently returns None when decompression fails, which will break backward compatibility with files saved using the old format (where custom_data was stored as a plain JSON string). Consider adding a fallback to try parsing as a plain JSON string if zlib decompression fails, similar to the TypeScript implementation at packages/ducjs/src/parse.ts:224-238.
| try: | ||
| decompressed = gzip.decompress(data) | ||
| return json.loads(decompressed.decode("utf-8")) | ||
| except Exception: | ||
| return None |
There was a problem hiding this comment.
This function silently returns None when decompression fails, which will break backward compatibility with files saved using the old format. Consider adding a fallback to try parsing as a plain JSON string if decompression fails, similar to the TypeScript implementation at packages/ducjs/src/parse.ts:224-238.
| try: | |
| decompressed = gzip.decompress(data) | |
| return json.loads(decompressed.decode("utf-8")) | |
| except Exception: | |
| return None | |
| # First, try to treat the input as gzip-compressed JSON. | |
| try: | |
| decompressed = gzip.decompress(data) | |
| return json.loads(decompressed.decode("utf-8")) | |
| except Exception: | |
| # If decompression or JSON parsing of the decompressed data fails, | |
| # fall back to attempting to parse the original data as plain JSON. | |
| try: | |
| return json.loads(data.decode("utf-8")) | |
| except Exception: | |
| return None |
| /** Indicates whether the block belongs to the project, organization or community */ | ||
| source: BlockSourceType; | ||
| /** Drawing id this block originates from */ | ||
| source?: BlockSource; |
There was a problem hiding this comment.
Type inconsistency: The TypeScript type definition makes source optional (packages/ducjs/src/types/elements/index.ts:1073), but the Rust and Python types define it as required. This creates API inconsistency across language bindings. Either all three should make it optional, or all should make it required.
If the intention is to make it optional, the Rust type should be pub source: Option<String> and the Python type should be source: Optional[str], and the serialization code needs to handle None/undefined values.
| source?: BlockSource; | |
| source: BlockSource; |
| */ | ||
| instance_id: string; | ||
|
|
||
| /** Contains a JSON of custom key-value data. */ |
There was a problem hiding this comment.
The schema comment says "Contains a JSON of custom key-value data" but JSON is typically an object, not raw bytes. Consider clarifying this comment to say "Contains compressed (zlib) JSON data of custom key-value pairs" to better document the actual storage format.
| /** Contains a JSON of custom key-value data. */ | |
| /** Contains compressed (zlib) JSON data of custom key-value pairs. */ |
| [features] | ||
| default = [] | ||
| serde_support = ["serde", "serde_json"] No newline at end of file | ||
| # serde_support is now always enabled since serde and serde_json are required dependencies No newline at end of file |
There was a problem hiding this comment.
The comment says "serde_support is now always enabled" but the feature is still defined (although empty). Consider removing the serde_support feature entirely from the [features] section if it's no longer optional, or clarify that it exists for backward compatibility but does nothing.
| # serde_support is now always enabled since serde and serde_json are required dependencies | |
| # Serde support is always enabled via the required serde and serde_json dependencies; there is no separate 'serde_support' feature. |
No description provided.