Conversation
Walkthrough
This pull request adds Delta Lake support to CrateDB Toolkit, enabling bidirectional data transfer between Delta Lake and CrateDB. Changes include a new Delta Lake integration module, cluster core updates, dependency configuration, comprehensive documentation, and test coverage.
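For orientation, here is a minimal usage sketch of the cluster-level API referenced in the diagrams below. The import path, constructor arguments, connection strings, and the `deltalake+s3://` URL scheme with its query parameters are assumptions for illustration, not taken verbatim from this pull request.

```python
# Hypothetical sketch; import path, constructor, URL schemes, and query
# parameters are assumptions, not confirmed by this pull request.
from cratedb_toolkit import StandaloneCluster  # assumed import path

cluster = StandaloneCluster(address="crate://crate@localhost:4200/")  # assumed constructor

# Delta Lake -> CrateDB, routed through from_deltalake() per the first diagram.
cluster.load_table(
    source_url="deltalake+s3://my-bucket/path/to/table?versionAsOf=3",
    target_url="crate://crate@localhost:4200/doc/demo",
)

# CrateDB -> Delta Lake, routed through to_deltalake() per the second diagram.
cluster.save_table(
    source_url="crate://crate@localhost:4200/doc/demo",
    target_url="deltalake+s3://my-bucket/path/to/export?mode=overwrite",
)
```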
Sequence Diagrams

```mermaid
sequenceDiagram
    participant User as User/CLI
    participant Cluster as StandaloneCluster
    participant DLAdapter as from_deltalake()
    participant DLAddress as DeltaLakeAddress
    participant Polars as Polars
    participant CrateDB as CrateDB
    User->>Cluster: load_table(source_url, target_url)
    Cluster->>DLAdapter: from_deltalake(source_url, target_url)
    DLAdapter->>DLAddress: DeltaLakeAddress.from_url(source_url)
    DLAddress->>DLAddress: Parse URL & extract options
    DLAddress->>Polars: scan_delta(location, version, storage_options)
    Polars-->>DLAddress: LazyFrame
    DLAdapter->>Polars: load_table() → collect data
    Polars-->>DLAdapter: DataFrame
    DLAdapter->>CrateDB: polars_to_cratedb(batch_size)
    CrateDB-->>DLAdapter: Success
    DLAdapter-->>Cluster: True
    Cluster-->>User: Table loaded
```
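The load flow above can be approximated with plain Polars calls. Note that the pull request routes the final step through its own `polars_to_cratedb()` helper with batching; `write_database()` merely stands in for it here, and the bucket, credentials, and table names are placeholders.

```python
# Sketch of the Delta Lake -> CrateDB flow, using stock Polars APIs.
import polars as pl

lazy_frame = pl.scan_delta(
    "s3://my-bucket/path/to/delta-table",            # placeholder location
    version=3,                                       # optional time travel
    storage_options={"AWS_REGION": "eu-central-1"},  # forwarded to the object store
)

df = lazy_frame.collect()

# Requires the sqlalchemy-cratedb dialect to be installed.
df.write_database(
    table_name="doc.demo",
    connection="crate://crate@localhost:4200",
    if_table_exists="append",
)
```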
```mermaid
sequenceDiagram
    participant User as User/CLI
    participant Cluster as StandaloneCluster
    participant DLAdapter as to_deltalake()
    participant DLAddress as DeltaLakeAddress
    participant CrateDB as CrateDB
    participant Polars as Polars
    participant DeltaLake as Delta Lake Storage
    User->>Cluster: save_table(source_url, target_url)
    Cluster->>DLAdapter: to_deltalake(source_url, target_url)
    DLAdapter->>DLAddress: DeltaLakeAddress.from_url(target_url)
    DLAddress->>DLAddress: Parse URL & extract options
    DLAdapter->>CrateDB: read_cratedb(source_url, chunk_size)
    CrateDB-->>DLAdapter: DataFrame chunks
    loop For each chunk
        DLAdapter->>Polars: write_delta(chunk, mode=overwrite/append)
        Polars->>DeltaLake: Write data with mode
        DeltaLake-->>Polars: Success
    end
    Polars-->>DLAdapter: Complete
    DLAdapter-->>Cluster: True
    Cluster-->>User: Table saved
```
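The save flow can likewise be approximated with generic building blocks. The pull request uses a `read_cratedb()` helper for chunked reads; pandas' `read_sql()` with `chunksize` stands in for it here, and the connection string, source table, chunk size, and target path are placeholders.

```python
# Sketch of the CrateDB -> Delta Lake flow: first chunk overwrites, later chunks append.
import pandas as pd
import polars as pl
import sqlalchemy as sa

engine = sa.create_engine("crate://crate@localhost:4200")  # needs sqlalchemy-cratedb
target = "s3://my-bucket/path/to/delta-export"             # placeholder target

mode = "overwrite"
with engine.connect() as connection:
    for chunk in pd.read_sql("SELECT * FROM doc.demo", connection, chunksize=75_000):
        pl.from_pandas(chunk).write_delta(
            target,
            mode=mode,
            storage_options={"AWS_REGION": "eu-central-1"},
        )
        mode = "append"  # subsequent chunks are appended to the Delta table
```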
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~35 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
2612f2b to 7600ec0 (compare)
About
Import from and export to Delta Lake tables, for interoperability purposes.
Documentation
https://cratedb-toolkit--664.org.readthedocs.build/io/deltalake/
References