daxis-io/arco

Arco

Incubating Project - Arco is under active development and not yet ready for production use. APIs may change without notice. We welcome early feedback and contributions.

Serverless lakehouse infrastructure - A file-native catalog and execution-first orchestration layer for modern data platforms.


Overview

Arco unifies a file-native catalog and an execution-first orchestration layer into one operational metadata system. It stores metadata as immutable, queryable files on object storage and treats deterministic planning, replayable history, and explainability as product requirements.
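
To make "deterministic planning" concrete: the same dependency graph should always produce the same plan. The following is a minimal sketch of that property, not Arco's planner. It assumes a plan is an ordered list of task names, and `plan_order` and the task names are hypothetical. It uses Kahn's topological sort with ordered collections so that ties are always broken lexicographically.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper: returns tasks in dependency order, or `None` on a cycle.
/// `deps` maps each task to the tasks it depends on.
pub fn plan_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    let mut indegree: BTreeMap<&'a str, usize> = BTreeMap::new();
    let mut dependents: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();

    // Register every task, including ones that only appear as a dependency.
    for (&task, upstream) in deps {
        indegree.entry(task).or_insert(0);
        for &up in upstream {
            indegree.entry(up).or_insert(0);
            *indegree.entry(task).or_insert(0) += 1;
            dependents.entry(up).or_default().push(task);
        }
    }

    // The ready set is ordered, so the smallest task name is always picked first.
    let mut ready: BTreeSet<&'a str> = indegree
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(&task, _)| task)
        .collect();

    let mut order: Vec<String> = Vec::with_capacity(indegree.len());
    while let Some(task) = ready.pop_first() {
        order.push(task.to_string());
        if let Some(children) = dependents.get(task) {
            for &child in children {
                let remaining = indegree.get_mut(child).expect("child was registered above");
                *remaining -= 1;
                if *remaining == 0 {
                    ready.insert(child);
                }
            }
        }
    }

    // Tasks that never reached indegree zero are part of a cycle.
    (order.len() == indegree.len()).then_some(order)
}
```

Because no hash-ordered collection is involved, repeated calls on the same graph return the same order, which is what makes a plan replayable.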

Key Differentiators

| Feature | Description |
|---|---|
| Metadata as files | Parquet-first storage for catalog and operational metadata, optimized for direct SQL access |
| Query-native reads | Browser and server query engines read metadata directly via signed URLs, eliminating always-on infrastructure |
| Lineage-by-execution | Lineage captured from real runs (inputs/outputs/partitions), not inferred from SQL parsing |
| Two-tier consistency | Strong consistency for DDL; eventual consistency for high-volume operational facts |
| Tenant isolation | Enforced at storage layout, service boundaries, and test gates |
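
As an illustration of lineage-by-execution, the sketch below derives lineage edges only from what completed runs read and wrote, then walks them with a hop limit. It is a simplified model under stated assumptions: `RunRecord`, `lineage_edges`, and `downstream` are hypothetical names rather than Arco types, and partitions are omitted for brevity.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical record of one completed run: what it read and what it wrote.
pub struct RunRecord {
    pub inputs: Vec<String>,
    pub outputs: Vec<String>,
}

/// Builds input -> outputs edges from observed runs only; no SQL is parsed.
pub fn lineage_edges(runs: &[RunRecord]) -> BTreeMap<String, BTreeSet<String>> {
    let mut edges: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for run in runs {
        for input in &run.inputs {
            edges
                .entry(input.clone())
                .or_default()
                .extend(run.outputs.iter().cloned());
        }
    }
    edges
}

/// Breadth-first walk returning every asset within `max_hops` downstream of `start`.
pub fn downstream(
    edges: &BTreeMap<String, BTreeSet<String>>,
    start: &str,
    max_hops: usize,
) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut frontier = vec![start.to_string()];
    for _ in 0..max_hops {
        let mut next = Vec::new();
        for asset in &frontier {
            for child in edges.get(asset).into_iter().flatten() {
                // Skip the start asset and anything already visited.
                if child.as_str() != start && seen.insert(child.clone()) {
                    next.push(child.clone());
                }
            }
        }
        if next.is_empty() {
            break;
        }
        frontier = next;
    }
    seen
}
```

An asset that no run has ever read or written simply has no edges here, which is the trade-off of recording lineage from executions rather than inferring it.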

Architecture

arco/
├── crates/
│   ├── arco-core/       # Core abstractions: types, storage traits, tenant context
│   ├── arco-catalog/    # Catalog service: registry, lineage, search, Parquet storage
│   ├── arco-flow/       # Orchestration: planning, scheduling, state machine
│   ├── arco-api/        # HTTP/gRPC composition layer
│   ├── arco-proto/      # Protobuf definitions
│   └── arco-compactor/  # Compaction binary for Tier 2 events
├── proto/               # Canonical .proto files
├── python/              # Python SDK
└── docs/                # Documentation and ADRs

Quick Start

Prerequisites

  • Rust 1.85+ (Edition 2024)
  • Protocol Buffers compiler (protoc)

Build

# Clone the repository
git clone https://github.com/daxis-io/arco.git
cd arco

# Build all crates
cargo build --workspace

# Run tests
cargo test --workspace

# Check formatting and lints
cargo fmt --check
cargo clippy --workspace -- -D warnings

Example Usage

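use arco_core::prelude::*;

// The boxed error type is an assumption; substitute arco-core's own error type if it differs.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a tenant context (the constructor is fallible, hence the `?`)
    let tenant = TenantId::new("acme-corp")?;

    // Generate a unique asset ID
    let asset_id = AssetId::generate();

    Ok(())
}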
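
The `?` on `TenantId::new` shows that constructing a tenant ID can fail. The sketch below shows one way a validated ID could support tenant isolation at the storage layout, and it is not arco-core's implementation. `DemoTenantId`, `InvalidTenantId`, the allowed character set, and the `tenants/<id>/` key prefix are all assumptions made for illustration.

```rust
use std::fmt;

/// Hypothetical stand-in for a validated tenant identifier; not arco-core's type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DemoTenantId(String);

/// Error returned when the raw string fails validation.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidTenantId(String);

impl fmt::Display for InvalidTenantId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid tenant id: {:?}", self.0)
    }
}

impl std::error::Error for InvalidTenantId {}

impl DemoTenantId {
    /// Accepts only lowercase ASCII letters, digits, and '-' (an assumed rule),
    /// so the tenant segment itself can never contain '/' or '.'.
    pub fn new(raw: &str) -> Result<Self, InvalidTenantId> {
        let valid = !raw.is_empty()
            && raw
                .bytes()
                .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'-');
        if valid {
            Ok(Self(raw.to_string()))
        } else {
            Err(InvalidTenantId(raw.to_string()))
        }
    }

    /// Builds an object key under this tenant's prefix (the layout is an assumption).
    /// The relative part is not validated in this sketch.
    pub fn object_key(&self, relative: &str) -> String {
        format!("tenants/{}/{}", self.0, relative)
    }
}
```

Validating once in the constructor means any code holding a `DemoTenantId` can build keys without re-checking the tenant segment.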

Crates

| Crate | Description | Status |
|---|---|---|
| arco-core | Shared primitives: tenant context, IDs, errors, storage traits | Alpha |
| arco-catalog | Catalog domain: asset registry, lineage, Parquet snapshots | Alpha |
| arco-flow | Orchestration domain: planning, scheduling, run state | Alpha |
| arco-api | HTTP/gRPC composition layer | Alpha |
| arco-proto | Protobuf definitions for cross-language contracts | Alpha |
| arco-compactor | Compaction binary for Tier 2 event consolidation | Alpha |

Performance Targets

| Operation | P95 Target |
|---|---|
| Catalog table lookup | < 50 ms |
| Full catalog scan (1000 tables) | < 500 ms |
| Lineage traversal (5 hops) | < 100 ms |
| Plan generation (100 tasks) | < 200 ms |
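
For readers who want to check measurements against these targets, here is a minimal sketch of a P95 computation using the nearest-rank method. The function names and the choice of nearest-rank are assumptions; the project may measure percentiles differently.

```rust
/// Returns the nearest-rank 95th percentile of latency samples in milliseconds,
/// or `None` when there are no samples.
pub fn p95_ms(samples: &[u64]) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // Nearest-rank: rank = ceil(0.95 * n), computed in integers to avoid rounding error.
    let rank = (sorted.len() * 95 + 99) / 100;
    Some(sorted[rank - 1])
}

/// True when the observed P95 is strictly below the target, matching the "<" above.
pub fn meets_target(samples: &[u64], target_ms: u64) -> bool {
    matches!(p95_ms(samples), Some(p95) if p95 < target_ms)
}
```

With samples of 1 through 100 ms, the nearest-rank P95 is 95 ms, which would pass a 100 ms target and fail a 50 ms one.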

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development

# Format code
cargo fmt

# Run clippy
cargo clippy --workspace --all-features -- -D warnings

# Run tests with coverage (requires the cargo-llvm-cov subcommand)
cargo llvm-cov --workspace

# Check supply chain security (requires the cargo-deny subcommand)
cargo deny check

Security

For security vulnerabilities, please see SECURITY.md.

License

Licensed under either of:

  • Apache License (LICENSE-APACHE)
  • MIT License (LICENSE-MIT)

at your option.

Acknowledgments

Arco is developed by Daxis and open-sourced to advance the data ecosystem.
