lgingerich/planar

Overview

Planar is an open table format with a lightweight database-catalog architecture. It records transaction-based deltas instead of snapshots, and is designed to eliminate metadata scaling bottlenecks, reduce maintenance burden, and handle streaming/CDC workloads well through efficient incremental updates.

Design Goals & Principles

  • Streaming as a first-class citizen
    • IDEA: Keep the latest writes in memory until a minimum data size is reached, then write them to a data file. This essentially replicates a memtable or WAL buffer and avoids the small-file problem.
  • Native support for next-gen file formats such as Vortex and Lance
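The buffering idea above could be sketched roughly as follows. This is a minimal illustration, not Planar's implementation: `WriteBuffer`, `min_flush_bytes`, and the choice to represent rows as already-encoded byte vectors are all assumptions made for the example. It also ignores durability, which a real WAL-style buffer would need to address.

```rust
/// Hypothetical in-memory write buffer (not part of Planar's API).
/// Rows accumulate until a minimum byte size is reached, then are
/// handed back as one batch to be written as a single data file.
pub struct WriteBuffer {
    rows: Vec<Vec<u8>>,
    buffered_bytes: usize,
    min_flush_bytes: usize,
}

impl WriteBuffer {
    pub fn new(min_flush_bytes: usize) -> Self {
        Self {
            rows: Vec::new(),
            buffered_bytes: 0,
            min_flush_bytes,
        }
    }

    /// Buffer one encoded row. Returns the drained batch once the
    /// buffered size reaches the threshold, otherwise `None`.
    pub fn push(&mut self, row: Vec<u8>) -> Option<Vec<Vec<u8>>> {
        self.buffered_bytes += row.len();
        self.rows.push(row);
        if self.buffered_bytes >= self.min_flush_bytes {
            Some(self.drain())
        } else {
            None
        }
    }

    /// Take everything buffered so far, e.g. on shutdown or on a timer.
    pub fn drain(&mut self) -> Vec<Vec<u8>> {
        self.buffered_bytes = 0;
        std::mem::take(&mut self.rows)
    }

    /// Number of bytes currently held in memory.
    pub fn buffered_bytes(&self) -> usize {
        self.buffered_bytes
    }
}
```

A caller would write each batch returned by `push` as one data file and register it in a single transaction, so many small writes turn into few larger files.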

Usage

Quick Start

Run the table lifecycle example to see Planar in action:

cargo run --example table_lifecycle

This example demonstrates creating tables, adding files, time travel queries, transaction deltas, and more. See examples/table_lifecycle.rs for the full code.

Data Model (Work in Progress)

Architecture Summary

  • TABLE is the root with pointers to current state (current_schema_uuid, current_transaction_id).

  • TRANSACTION is a monotonically increasing sequence forming an immutable version chain. Represents deltas (what changed), not complete snapshots.

  • SCHEMA has transaction-bounded validity ranges (valid_from_transaction_id, valid_to_transaction_id). Schema evolution is independent of data changes.

  • FILE represents physical data files. Tracks lifecycle (added_in_transaction_id, removed_in_transaction_id), format-specific metadata, partition values, and row counts.

  • SNAPSHOTS don't exist. Any point-in-time view is computed on demand by filtering files and schemas by transaction ID.
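The on-demand view described above can be sketched as a pair of filters. The struct and function names here (`FileEntry`, `SchemaEntry`, `files_as_of`, `schema_as_of`) are hypothetical; only the field names come from the data model. The sketch also assumes that `removed_in_transaction_id` and `valid_to_transaction_id` are exclusive upper bounds, so a file removed in transaction N is no longer visible at N. The source does not state which convention Planar uses.

```rust
/// Minimal stand-in for a FILE row (fields taken from the data model).
#[derive(Debug, Clone, PartialEq)]
pub struct FileEntry {
    pub file_path: String,
    pub added_in_transaction_id: i64,
    /// `None` while the file is still live.
    pub removed_in_transaction_id: Option<i64>,
}

/// Minimal stand-in for a SCHEMA row.
#[derive(Debug, Clone, PartialEq)]
pub struct SchemaEntry {
    pub schema_version: i32,
    pub valid_from_transaction_id: i64,
    /// `None` while this is the current schema.
    pub valid_to_transaction_id: Option<i64>,
}

/// Files visible as of `txn`: added at or before it and not yet removed.
/// Assumption: the removal bound is exclusive.
pub fn files_as_of(files: &[FileEntry], txn: i64) -> Vec<&FileEntry> {
    files
        .iter()
        .filter(|f| {
            f.added_in_transaction_id <= txn
                && f.removed_in_transaction_id.map_or(true, |r| r > txn)
        })
        .collect()
}

/// Schema in effect as of `txn`, if the table existed at that point.
/// Assumption: the `valid_to` bound is exclusive.
pub fn schema_as_of(schemas: &[SchemaEntry], txn: i64) -> Option<&SchemaEntry> {
    schemas.iter().find(|s| {
        s.valid_from_transaction_id <= txn
            && s.valid_to_transaction_id.map_or(true, |t| t > txn)
    })
}
```

In a database-backed catalog the same predicates would presumably be expressed as a `WHERE` clause over the FILE and SCHEMA tables rather than an in-memory scan.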

Entity Relationship Diagram

erDiagram
    TABLE ||--o{ SCHEMA : has_versions
    TABLE ||--o{ FILE : contains
    TABLE ||--o{ TRANSACTION : has_history
    TABLE ||--o| TABLE_STATS : aggregates
    
    SCHEMA ||--o{ COLUMN : defines
    TRANSACTION ||--o{ SCHEMA : valid_from
    TRANSACTION o|--o{ SCHEMA : valid_to
    
    TRANSACTION ||--o{ FILE : added_in
    TRANSACTION o|--o{ FILE : removed_in
    
    FILE ||--o{ FILE_COLUMN_STATS : has_stats
    
    TABLE {
        uuid table_uuid PK
        string table_name
        string namespace
        string location
        uuid current_schema_uuid FK
        bigint current_transaction_id FK
        timestamp created_at
        json properties
    }

    TABLE_STATS {
        uuid table_uuid PK
        bigint transaction_id FK
        bigint record_count
        bigint file_size_bytes
        integer file_count
        timestamp last_updated
    }

    TRANSACTION {
        bigint transaction_id PK
        uuid table_uuid FK
        timestamp transaction_timestamp
        bigint parent_transaction_id FK
    }

    SCHEMA {
        uuid schema_uuid PK
        uuid table_uuid FK
        integer schema_version
        bigint valid_from_transaction_id FK
        bigint valid_to_transaction_id FK
        timestamp created_at
    }

    COLUMN {
        uuid column_uuid PK
        uuid schema_uuid FK
        string column_name
        string column_type
        integer ordinal_position
        boolean is_nullable
    }
    
    FILE {
        uuid file_uuid PK
        uuid table_uuid FK
        string file_format
        string file_path
        bigint record_count
        bigint file_size_bytes
        bigint added_in_transaction_id FK
        bigint removed_in_transaction_id FK
        json partition_values
    }
    
    FILE_COLUMN_STATS {
        uuid file_uuid PK
        string column_name PK
        bigint null_count
        bigint nan_count
        binary min_value
        binary max_value
        bigint distinct_count
    }
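To illustrate how the added_in and removed_in relationships in the diagram might be populated, here is a rough in-memory sketch of committing one delta. `TableState`, `FileRow`, `commit`, and `live_record_count` are hypothetical names, and allocating transaction IDs by incrementing the current one is an assumption based on the "monotonically increasing sequence" description. A real catalog would do this inside a database transaction.

```rust
/// Simplified FILE row; only the fields this sketch needs.
#[derive(Debug)]
pub struct FileRow {
    pub file_path: String,
    pub record_count: i64,
    pub added_in_transaction_id: i64,
    pub removed_in_transaction_id: Option<i64>,
}

/// Hypothetical in-memory stand-in for a TABLE and its FILE rows.
#[derive(Debug, Default)]
pub struct TableState {
    /// 0 means no transaction has been committed yet.
    pub current_transaction_id: i64,
    pub files: Vec<FileRow>,
}

impl TableState {
    /// Commit one delta under a new transaction ID: mark the files named
    /// in `removes` as removed and append `adds` as (path, record count)
    /// pairs. Existing rows are never deleted. Returns the new ID.
    pub fn commit(&mut self, adds: Vec<(String, i64)>, removes: &[&str]) -> i64 {
        let txn = self.current_transaction_id + 1;
        for f in self.files.iter_mut() {
            if f.removed_in_transaction_id.is_none()
                && removes.contains(&f.file_path.as_str())
            {
                f.removed_in_transaction_id = Some(txn);
            }
        }
        for (file_path, record_count) in adds {
            self.files.push(FileRow {
                file_path,
                record_count,
                added_in_transaction_id: txn,
                removed_in_transaction_id: None,
            });
        }
        self.current_transaction_id = txn;
        txn
    }

    /// Record count over live files, one way a value like
    /// TABLE_STATS.record_count could be derived.
    pub fn live_record_count(&self) -> i64 {
        self.files
            .iter()
            .filter(|f| f.removed_in_transaction_id.is_none())
            .map(|f| f.record_count)
            .sum()
    }
}
```

Because removed files keep their rows and only gain a `removed_in_transaction_id`, earlier states of the table remain reconstructible, which is what makes point-in-time views possible without snapshots.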