RVC is an experimental, vibe-coded reimagining of DVC. DVC is excellent - this project exists to explore what a data pipeline tool might look like if rebuilt from scratch with different trade-offs.
This is not production software. It is a playground for trying things out.
A pipeline is defined in a single rvc.yaml file. RVC resolves dependencies between steps, determines which steps are stale, caches outputs by content hash, and executes what needs running - in parallel where the dependency graph allows. Scripts receive all their paths and parameters as environment variables, requiring no awareness of RVC itself.
$ rvc status
⏺ download up to date
○ analyze deps changed (analyze.py)
◒ split up to date
○ train upstream changed (analyze)
◒ evaluate up to date
$ rvc run
Running analyze
$ rvc metrics analyze:STATS
analyze:STATS
average_age: 29.700000
rows: 891
survival_rate: 0.383800
survived: 342

DVC had a great insight: managing ML workflows alongside git. Because git already tracks code, you can track data versions too - the lockfile acts as a signature for the state of the pipeline at any commit. RVC leans into that.
Experiments are branches. DVC's experiment management system (dvc exp) is too complex for what I need. RVC assumes each experiment is a branch. You change code, data, or parameters in a branch, run the pipeline, and the lockfile + metrics get committed alongside everything else. Instead of focusing on tracking inputs, RVC focuses on being able to track and compare metrics across branches. It understands three metric shapes natively - flat key-values, timeseries, and histograms - and can diff and plot them in detail across any set of git refs.
Simpler step communication. DVC's insight of tracking DAG dependencies via inputs and outputs is the right idea, but the stage description is too verbose and communication with steps is broken - you end up writing argument-parsing boilerplate in every script. RVC makes writing steps more comfortable: variables can be interpolated anywhere in the config, and inputs, outputs, params, metrics, and dependencies are all named and passed to steps as environment variables. Your Python script just reads os.environ["MODEL"]. You can use pydantic-settings or any other env-based config pattern with zero friction. All of this is language-agnostic.
CI and automation as first-class consumers. DVC's idea of running pipelines in CI-like environments is great - it keeps a tight relationship between code versions and artifact versions, enables automatic retraining when data or libraries change, and keeps something that can get very complex relatively simple to manage. RVC takes this further: every command can produce machine-readable output (--json, --yaml). You can filter by step, by artifact kind, or by specific artifact name, and produce structured documents of pipeline state or performance that can be passed to other systems - experiment dashboards, deployment pipelines, monitoring, whatever.
Everything is in two files. rvc.yaml for configuration, rvc.lock for resolved state. No .rvc/config, no Python params files, no .dvc files scattered around the repo.
Small CLI, edit config yourself. dvc exp never worked for me. Editing pipelines through the CLI always felt cumbersome. In RVC you just edit rvc.yaml directly. The CLI has a handful of entry points (run, status, dag, metrics, plots, cache), and all structured output follows defined JSON schemas - so there is no need for a Python API or SDK.
- Install
- Quick Start
- Configuration
- What Gets Tracked
- Status
- Running the Pipeline
- Variables
- Caching
- Remote Cache
- The Lockfile
- Gitignore Management
- The DAG
- Metrics
- Diffing
- Plots
- Examples
- TODO
git clone <repository-url>
cd rvc
cargo build --release
cp target/release/rvc /usr/local/bin/

Requires Rust 1.70+.
Create rvc.yaml:
steps:
  prepare:
    cmd: python prepare.py
    deps:
      - prepare.py
    inputs:
      RAW: data/raw.csv
    outputs:
      CLEAN: data/clean.csv
  train:
    cmd: python train.py
    deps:
      - train.py
    params:
      EPOCHS: '100'
    inputs:
      DATA: data/clean.csv
    outputs:
      MODEL: models/model.pkl
    metrics:
      STATS: metrics/train.json

Run it:
rvc status # see what needs to run
rvc run # execute the pipeline
rvc metrics # view metricsScripts access paths and parameters through environment variables:
import os
data_path = os.environ["DATA"] # "data/clean.csv"
model_path = os.environ["MODEL"] # "models/model.pkl"
epochs = os.environ["EPOCHS"]      # "100"

Works with pydantic-settings, argparse defaults from env, or a plain os.environ call. Language-agnostic.
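The same env-based pattern composes with plain argparse: each flag can default to the variable RVC sets, so a script runs identically under rvc run and standalone. A minimal sketch (the flag names and fallback values here are illustrative, not RVC conventions):

```python
import argparse
import os

def build_parser() -> argparse.ArgumentParser:
    # Each flag falls back to the env var RVC injects, so explicit CLI
    # arguments still win when the script is run by hand.
    parser = argparse.ArgumentParser()
    parser.add_argument("--data", default=os.environ.get("DATA", "data/clean.csv"))
    parser.add_argument("--model", default=os.environ.get("MODEL", "models/model.pkl"))
    parser.add_argument("--epochs", type=int,
                        default=int(os.environ.get("EPOCHS", "10")))
    return parser

args = build_parser().parse_args([])  # empty argv: values come from env or defaults
```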
Everything lives in rvc.yaml. Runtime state lives in rvc.lock. You edit one YAML file directly - no stage-editing CLI commands, no .rvc/config, no external params files.
Here is the Titanic example, which exercises most features - variables, multiple step types, persistent outputs, three metric shapes, and parameter passing:
settings:
  algorithm: blake2b

vars:
  data_dir: data
  out_dir: out
  model_dir: models
  paths:
    data:
      TITANIC: ${data_dir}/titanic.csv
      TRAIN: ${data_dir}/train.csv
      VAL: ${data_dir}/val.csv
    train:
      MODEL: ${model_dir}/survival_net.pt
      TRAINING_LOG: ${out_dir}/training_log.json
      FINAL_METRICS: ${out_dir}/train_metrics.json
      EVAL_METRICS: ${out_dir}/eval_metrics.json

steps:
  download:
    cmd: curl -L -o $TITANIC https://csvbase.com/lyuehh/titanic
    outputs:
      TITANIC: ${paths.data.TITANIC}
  analyze:
    cmd: uv run analyze.py
    deps:
      - analyze.py
    params:
      AGE_BIN: '10'
    inputs:
      TITANIC: ${paths.data.TITANIC}
    outputs:
      REPORT:
        path: ${out_dir}/report.txt
        persist: true
    metrics:
      STATS: ${out_dir}/metrics.json
      AGE_SURVIVAL: ${out_dir}/age_survival.csv
  split:
    cmd: uv run split.py
    deps:
      - split.py
    params:
      VAL_RATIO: 0.2
      SPLIT_SEED: 42
    inputs:
      TITANIC: ${paths.data.TITANIC}
    outputs:
      TRAIN: ${paths.data.TRAIN}
      VAL: ${paths.data.VAL}
  train:
    cmd: uv run train.py
    deps:
      - train.py
    params:
      EPOCHS: 40
      LEARNING_RATE: 0.001
    inputs:
      TRAIN: ${paths.data.TRAIN}
    outputs:
      MODEL: ${paths.train.MODEL}
    metrics:
      TRAINING_LOG: ${paths.train.TRAINING_LOG}
      FINAL_METRICS: ${paths.train.FINAL_METRICS}
  evaluate:
    cmd: uv run evaluate.py
    deps:
      - evaluate.py
    inputs:
      VAL: ${paths.data.VAL}
      MODEL: ${paths.train.MODEL}
    metrics:
      EVAL_METRICS: ${paths.train.EVAL_METRICS}

| Field | Effect |
|---|---|
| cmd | Shell command to execute (required) |
| wdir | Working directory for the command (default: directory containing rvc.yaml) |
| deps | Source files to track for changes (scripts, configs) |
| inputs | Named data files the step reads |
| outputs | Named data files the step produces |
| params | Named string values passed as env vars |
| metrics | Named output files that RVC can parse and compare |
| frozen: true | Skip this step during rvc run |
| always_changed: true | Always re-run regardless of dependency state |
| sequential: true | Run in isolation, not parallel with other steps at the same DAG level |
Outputs support a short form (just a path) and a long form (with options):
outputs:
  MODEL: models/model.pkl   # short: cached, replaced with reflink/symlink
  REPORT:
    path: out/report.txt
    persist: true           # long: tracked but NOT cached; stays as a regular file

Persistent outputs are useful for files you want to commit to git (reports, READMEs, etc.). They are still hashed for change detection, but RVC leaves them in place instead of moving them to the cache.
Steps declare five kinds of artifacts, each with different behaviour:
| Kind | Hashed | Cached | Env var | Triggers re-run on change |
|---|---|---|---|---|
| deps | ✓ | - | - | ✓ |
| params | - | - | ✓ | ✓ |
| inputs | ✓ | ✓ | ✓ | ✓ |
| outputs | ✓ | ✓ | ✓ | ✓ |
| metrics | ✓ | - | ✓ | ✓ |
- deps - source files (scripts, configs). Hashed for change detection. Not cached, not exposed as env vars.
- params - string values exposed as env vars. A change in value triggers a re-run. The value itself is stored in the lockfile.
- inputs - data files the step reads. Hashed, cached, and exposed as env vars.
- outputs - data files the step produces. Hashed, cached (unless persist: true), and exposed as env vars.
- metrics - output files that RVC can parse and compare. Hashed and exposed as env vars, but not cached (they are typically small text files).
All five kinds trigger a re-run when they change. The distinction is in what else happens: deps are silent trackers, params carry values, inputs and outputs are cached, and metrics are additionally parseable for display and diffing.
rvc status checks every step against the lockfile and reports what is stale and why:
$ rvc status
⏺ download up to date
◒ analyze up to date
○ split params changed (VAL_RATIO)
○ train upstream changed (split)
◒ evaluate up to date

Status symbols (plain output):
| Symbol | Meaning |
|---|---|
| ⏺ | Step is up to date and pushed |
| ◒ | Step is up to date but not pushed |
| ○ | Step is not up to date |
The reason text is coloured to match the symbol.
Specific steps can be checked:
rvc status train evaluate

Changes propagate: if analyze is stale, downstream steps like train will show ○ upstream changed (analyze).
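The propagation rule itself is simple: a step is stale if it changed directly or if anything upstream of it is stale. A sketch of that closure (illustrative only - the upstream map below is hypothetical, and RVC derives it from inputs and outputs):

```python
def propagate_staleness(direct_stale: set[str], upstream: dict[str, set[str]]) -> set[str]:
    # A step is stale if it is directly stale or any upstream step is stale.
    stale = set(direct_stale)
    changed = True
    while changed:
        changed = False
        for step, ups in upstream.items():
            if step not in stale and ups & stale:
                stale.add(step)
                changed = True
    return stale
```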
Structured output:
$ rvc status --json
{
  "steps": {
    "download": { "status": "up_to_date", "pushed": true },
    "analyze": { "status": "up_to_date", "pushed": false },
    "split": { "status": "not_up_to_date", "reason": "params changed (VAL_RATIO)" },
    "train": { "status": "not_up_to_date", "reason": "upstream changed (split)" },
    "evaluate": { "status": "up_to_date", "pushed": false }
  }
}

Also available as --yaml.
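This shape is convenient to consume from scripts. A sketch that extracts the stale steps (the embedded document is a hand-written example in the shape shown above, not live output):

```python
import json

# Example document in the shape of `rvc status --json` output.
status_json = """
{
  "steps": {
    "download": { "status": "up_to_date", "pushed": true },
    "split": { "status": "not_up_to_date", "reason": "params changed (VAL_RATIO)" },
    "train": { "status": "not_up_to_date", "reason": "upstream changed (split)" }
  }
}
"""

doc = json.loads(status_json)
stale = [name for name, s in doc["steps"].items() if s["status"] != "up_to_date"]
```

A CI job could fail or trigger a retraining run whenever `stale` is non-empty.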
rvc run executes all steps that need updating, in dependency order, with independent steps running in parallel:
$ rvc run
▶ download
✓ completed in 0.8s
▶ analyze
✓ completed in 1.2s
▶ split
✓ completed in 0.4s
▶ train
✓ completed in 3.4s
▶ evaluate
✓ completed in 0.6s

If everything is already up to date, RVC says so and exits:
$ rvc run
Pipeline is up to date, nothing to run

Options:
rvc run -f # force re-run all steps regardless of status
rvc run <target> # run target and ensure upstream dependencies are up to date
rvc run <target> --single # run only target, skip dependency freshness checks
rvc run <target> --force # force target and all downstream dependents
rvc run -n # dry run - show what would execute without running
rvc run --no-parallel  # disable parallel execution (run steps one at a time)

Dry run with verbose (-n -v) shows the full execution plan - commands, working directories, and the environment each step receives:
$ rvc run -n -f -v
Dry run: download
Command: curl -L -o $TITANIC https://csvbase.com/lyuehh/titanic
Working dir: .
Environment:
TITANIC=data/titanic.csv
Dry run: analyze
Command: uv run analyze.py
Working dir: .
Environment:
TITANIC=data/titanic.csv
REPORT=out/report.txt
AGE_BIN=10
STATS=out/metrics.json
AGE_SURVIVAL=out/age_survival.csv
...

When a step finishes, RVC checks that all declared outputs and metrics actually exist on disk. If any are missing, the step is marked as failed. This catches a common mistake: declaring an output path that the script doesn't actually write to.
If a step fails, the pipeline stops. Steps that already succeeded are recorded in the lockfile, so a subsequent rvc run will only retry from the failed step onward.
Steps at the same DAG level (no dependencies between them) run concurrently by default, up to the number of available CPU cores. Steps marked sequential: true run alone in their own batch.
--no-parallel forces all steps to run one at a time, which can be useful for debugging or when steps compete for shared resources.
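The batching behaviour can be illustrated with Python's standard graphlib: steps whose predecessors are all done become ready together, and each ready set is one parallel batch. This sketches the scheduling idea only - it is not RVC's executor - using the Titanic pipeline's dependency structure:

```python
from graphlib import TopologicalSorter

# Upstream sets for the Titanic pipeline (as inferred from inputs/outputs).
upstream = {
    "download": set(),
    "analyze": {"download"},
    "split": {"download"},
    "train": {"split"},
    "evaluate": {"split", "train"},
}

ts = TopologicalSorter(upstream)
ts.prepare()
levels = []
while ts.is_active():
    ready = sorted(ts.get_ready())   # every step in `ready` could run concurrently
    levels.append(ready)
    ts.done(*ready)
```

The resulting levels match the topological levels shown by rvc dag --plain.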
Variables reduce repetition in rvc.yaml. They support nesting and can reference each other:
vars:
  data_dir: data
  out_dir: out
  paths:
    data:
      TITANIC: ${data_dir}/titanic.csv
      TRAIN: ${data_dir}/train.csv

steps:
  analyze:
    inputs:
      TITANIC: ${paths.data.TITANIC}   # resolves to data/titanic.csv

Variable substitution is plain string replacement applied to: cmd, wdir, deps, inputs, outputs, metrics, and params. Variables can reference other variables; references are resolved iteratively until stable.
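A sketch of that iterative resolution, assuming nested vars are addressed by dotted keys in a flat map (resolve_vars and that representation are illustrative, not RVC internals):

```python
import re

def resolve_vars(flat_vars: dict[str, str], max_rounds: int = 10) -> dict[str, str]:
    # Substitute ${name} repeatedly until no value changes; unknown
    # references are left untouched.
    out = dict(flat_vars)
    for _ in range(max_rounds):
        changed = False
        for key, value in out.items():
            new = re.sub(r"\$\{([^}]+)\}",
                         lambda m: out.get(m.group(1), m.group(0)), value)
            if new != value:
                out[key] = new
                changed = True
        if not changed:
            break
    return out
```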
When a step completes successfully, RVC hashes its outputs and stores copies in .rvc/cache/, organised by hash prefix. The original files are then replaced with reflinks or symlinks pointing to the cache.
On subsequent runs, if the inputs and deps haven't changed, the step is skipped - its outputs are already in the cache and linked into the working tree. If the step does need to re-run, existing outputs are backed up to the cache first (so you don't lose them if the new run fails), then removed before execution.
Reflinks vs symlinks: On filesystems that support copy-on-write (APFS on macOS, btrfs and XFS on Linux), RVC uses reflinks. These are instant, use no extra disk space, and behave like regular files. On other filesystems, symlinks are used as a fallback.
Persistent outputs: Outputs declared with persist: true are tracked (hashed for change detection) but not cached or replaced with links. They stay as regular files. Use this for files you want to commit to git.
The hash algorithm is configurable:
settings:
  algorithm: blake2b   # default: md5. Also: sha256

To clean the cache safely by lock/git refs, use rvc cache clean. A manual wipe is still possible with rm -rf .rvc/cache.
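Configurable content hashing maps directly onto Python's hashlib. A sketch of file hashing plus the two-character prefix layout, which is inferred from the .rvc/cache/9c/9c2e... paths shown in the cache status examples (not RVC's actual code):

```python
import hashlib

def file_digest(path: str, algorithm: str = "blake2b") -> str:
    # Stream the file in chunks so large artifacts never load fully into memory.
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def cache_relpath(digest: str) -> str:
    # Two-character prefix directory, e.g. "9c/9c2e..." (layout assumed).
    return f"{digest[:2]}/{digest}"
```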
RVC can sync cached artifacts to/from S3-compatible storage for collaboration and CI workflows.
Cache commands are lock-driven:
- they operate on hashes referenced by rvc.lock
- they use persist: true information stored in rvc.lock
- persisted outputs are excluded from cache sync/restore
Remote configuration is read from rvc.yaml, with AWS configuration used as fallback for region/endpoint resolution:
settings:
  remote: s3://my-bucket/project-prefix
  region: eu-west-1

settings.region and settings.endpoint are optional when AWS configuration already resolves them. For custom endpoints, set settings.region explicitly.
There is no per-command remote override flag.
Upload cached artifacts to remote storage:
rvc cache push # push all cached artifacts in lock
rvc cache push analyze # push only analyze step artifacts
rvc cache push analyze.output # push only analyze outputs
rvc cache push analyze.output.REPORT   # push specific output

Push tracks synced objects via ETag sidecars in cache (.rvc/cache/... .etag):
- regular files: <cache_rel>.etag
- manifests: <manifest_rel>.etag
- manifest members: each member file has its own .etag
rvc cache push uploads only what is missing for the selected artifacts (unless forced).
Options:
rvc cache push --force # re-upload even if ETag exists
rvc cache push --dry-run # show what would be uploaded
rvc cache push --verbose # detailed transfer progress
rvc cache push --jobs N    # concurrent transfers (default: CPU cores)

Persisted outputs are excluded from push (driven by persist: true in the lock).
Download artifacts from remote storage and restore them to the working tree:
rvc cache pull # pull all missing artifacts in lock
rvc cache pull train # pull only train step artifacts
rvc cache pull train.input.DATA   # pull specific input

Pull only considers artifacts referenced by the current lock and excludes persisted outputs.
Normal pull downloads artifacts that are missing from local cache.
--force re-downloads selected artifacts from remote.
Options:
rvc cache pull --force # re-download selected artifacts
rvc cache pull --dry-run # show what would be downloaded/restored
rvc cache pull --verbose # detailed transfer progress
rvc cache pull --jobs N    # concurrent transfers (default: CPU cores)

Artifacts are downloaded to .rvc/cache, then restored to the workspace.
For directory outputs, pull handles both the manifest and manifest member files.
Persisted outputs are never restored from cache.
Show cache presence and push status for locked artifacts:
rvc cache status # all artifacts in lock
rvc cache status train # train step only
rvc cache status train.output.MODEL   # specific artifact

Plain output uses symbols to indicate state:
| Symbol | State |
|---|---|
| ⏺ | cached + pushed (green) |
| ◒ | cached but not fully pushed (yellow) |
| 🞊 | persisted output, ignored by sync (gray) |
| ○ | not cached, expected (red) |
For manifest artifacts in plain output, ⏺ is shown only when manifest + all member files are pushed (all relevant .etag sidecars exist).
Example:
$ rvc cache status
⏺ data/clean.csv
◒ models/model.pkl
🞊 out/report.txt
○ metrics/train.json

Structured output:
$ rvc cache status --json
[
  {
    "path": "data/clean.csv",
    "cache": ".rvc/cache/9c/9c2e...",
    "remote": "s3://my-bucket/prefix/9c/9c2e...",
    "roles": ["prepare.output.CLEAN", "train.input.DATA"]
  },
  {
    "path": "models/model.pkl",
    "cache": ".rvc/cache/f1/f1b0...",
    "roles": ["train.output.MODEL"]
  },
  {
    "path": "out/report.txt",
    "ignored": true,
    "roles": ["analyze.output.REPORT"]
  }
]

Also available as --yaml.
Remove cache entries not referenced by any git branch or tag:
rvc cache clean # scan all branches/tags, delete unreferenced
rvc cache clean --ignore-branches # only scan tags
rvc cache clean --ignore-tags       # only scan branches

Clean always keeps:
- Artifacts from current working tree lock
- Artifacts from locks in scanned git refs
- Manifest member files for kept manifests
ETag sidecars are deleted alongside their cache objects.
Options:
rvc cache clean --dry-run # show what would be deleted
rvc cache clean --verbose   # list all refs scanned and entries deleted

Default output:
$ rvc cache clean
cache clean: scanned 8 refs, deleted 23 files, 142.3MB, remaining 1.8GBVerbose output includes full ref list and deleted entries:
$ rvc cache clean --verbose --dry-run
refs:
- (working tree)
- refs/heads/main
- refs/heads/feature-a
- refs/tags/v1.0
- refs/tags/v1.1
delete:
- ab/abcd1234... (12.4KB)
- cd/cdef5678... (8.1MB)
- ef/ef901234... (134.2MB)
cache clean: scanned 5 refs, would delete 3 files, 142.3MB, remaining 1.8GB

All cache commands support fine-grained selectors:
<step> # all inputs/outputs for step
<step>.input # all inputs for step
<step>.output # all outputs for step
<step>.input.<NAME> # specific input
<step>.output.<NAME>   # specific output

Multiple selectors can be combined:
rvc cache push analyze.output train.output.MODEL
rvc cache status train.input evaluate.inputIf a selector matches nothing, the command errors clearly:
$ rvc cache push nonexistent
Error: No artifacts match selector(s): nonexistent

Common CI patterns:
Before pipeline execution - pull only what's needed:
rvc cache pull
rvc run

After a successful run - push new artifacts:
rvc run
rvc cache push

Selective cache - for large pipelines, cache only expensive steps:
rvc cache pull train
rvc run
rvc cache push train evaluate

Cache validation:
rvc cache status --json > cache-status.json

After each successful run, RVC writes rvc.lock - a YAML file recording the state of each step: command, hashes, params, and output semantics.
version: '1.0'
algorithm: blake2b
steps:
  train:
    cmd: uv run train.py
    deps:
      train.py:
        path: train.py
        hash: 3a7f...
        size: 1024
    inputs:
      TRAIN:
        path: data/train.csv
        hash: 9c2e...
        size: 51200
    outputs:
      MODEL:
        path: models/survival_net.pt
        hash: f1b0...
        size: 204800
    params:
      EPOCHS: '40'
      LEARNING_RATE: '0.001'
    metrics:
      TRAINING_LOG:
        path: out/training_log.json
        hash: 7d3a...
        size: 4096
  analyze:
    cmd: uv run analyze.py
    outputs:
      REPORT:
        path: out/report.txt
        hash: ab12...
        size: 108
        persist: true

rvc status, rvc run, and all rvc cache commands use the lockfile to determine what is stale, what to sync, and where metrics live.
Commit both rvc.lock and your metric files to git. When diffing, RVC reads the lockfile from a git ref to find which metrics exist and where they live, then reads the metric file contents from that same ref. If either is missing from git history, the diff has nothing to compare against.
After each run, RVC updates .gitignore in the directory containing rvc.yaml. It maintains a clearly marked section:
# BEGIN RVC managed
/.rvc
/data/clean.csv
/models/model.pkl
# END RVC managed

This section lists the .rvc cache directory and all non-persistent output paths. RVC only touches lines between the markers; everything else in your .gitignore is preserved.
Persistent outputs (persist: true) are not added to .gitignore, since the intent is for those files to be committed.
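The managed section is a classic marker-delimited rewrite: find the markers, replace only what sits between them, append the block if it is missing. A sketch (update_managed_section is a hypothetical helper, not an RVC API):

```python
BEGIN = "# BEGIN RVC managed"
END = "# END RVC managed"

def update_managed_section(gitignore: str, entries: list[str]) -> str:
    # Replace only the lines between the markers, preserving everything else.
    block = [BEGIN, *entries, END]
    lines = gitignore.splitlines()
    if BEGIN in lines and END in lines:
        start, stop = lines.index(BEGIN), lines.index(END)
        lines[start : stop + 1] = block
    else:
        lines += ([""] if lines and lines[-1] else []) + block
    return "\n".join(lines) + "\n"
```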
RVC infers the dependency graph from step inputs and outputs. If step A produces data/clean.csv and step B declares it as an input (or dep), B depends on A. No need to declare dependencies between steps explicitly.
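A sketch of that inference: build a producer map from output path to step, then connect every reader to its producer (infer_edges is illustrative, not RVC's implementation):

```python
def infer_edges(steps: dict[str, dict[str, list[str]]]) -> set[tuple[str, str]]:
    # B depends on A when B reads (as input or dep) a path that A produces.
    producer = {
        path: name
        for name, spec in steps.items()
        for path in spec.get("outputs", [])
    }
    edges = set()
    for name, spec in steps.items():
        for path in spec.get("inputs", []) + spec.get("deps", []):
            if path in producer and producer[path] != name:
                edges.add((producer[path], name))
    return edges
```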
rvc dag supports two output formats:
- mermaid (default)
- plain (terminal-friendly topological levels + edge list)
Mermaid output (default):
rvc dag

flowchart TD
accTitle: Pipeline DAG
download[download]
analyze[analyze]
split[split]
train[train]
evaluate[evaluate]
download --> analyze
download --> split
split --> train
split --> evaluate
train --> evaluate
Plain output:
rvc dag --plain

DAG (topological levels)
[0] ○ download
[1] ○ analyze, ○ split
[2] ○ train
[3] ○ evaluate
Edges
download -> analyze
download -> split
split -> train
split -> evaluate
train -> evaluate
The Euro FX example shows a fan-out / fan-in pattern - three independent filter steps that converge on a summary:
cd example/eurofx && rvc dag

flowchart TD
accTitle: Pipeline DAG
download[download]
filter_majors[filter_majors]
monthly_avg[monthly_avg]
extremes[extremes]
summary[summary]
download --> filter_majors
download --> monthly_avg
download --> extremes
extremes --> summary
filter_majors --> summary
monthly_avg --> summary
Options:
rvc dag # Mermaid flowchart output (default)
rvc dag --plain # plain-text levels + edges for terminal viewing
rvc dag --direction left-right # Mermaid layout direction (also: bottom-up, right-left)
rvc dag > pipeline.md            # write to file via shell redirection

Frozen steps render as hexagons with ❄, always-changed steps as circles with ↻ in Mermaid output.
In plain output, ⏺ means up to date (green), ○ means needs running (red), and · means unknown (yellow).
RVC parses metric files, understands their structure, and renders them. It recognises three shapes - flat, timeseries, and histogram - and auto-detects both the file format (JSON, NDJSON, YAML, CSV) and the metric shape from the content.
Key-value pairs. The most common shape for summary statistics.
A JSON file like this:
{ "rows": 891, "survived": 342, "survival_rate": 0.3838, "average_age": 29.7 }

Renders as:
$ rvc metrics analyze:STATS
analyze:STATS
average_age: 29.700000
rows: 891
survival_rate: 0.383800
survived: 342

Nested JSON objects are flattened with dot notation: {"train": {"loss": 0.1}} becomes train.loss: 0.100000.
YAML key-value mappings and two-column CSV files (metric,value) are also recognised as flat metrics.
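The dot-notation flattening is a small recursion over nested mappings. A sketch of the idea (not RVC's parser):

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    # {"train": {"loss": 0.1}} -> {"train.loss": 0.1}
    flat = {}
    for key, value in obj.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat
```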
Sequences of timestamped data points. The expected use case is training logs - epoch-by-epoch loss and accuracy values streamed during model training.
An NDJSON file (one JSON object per line) like this:
{"timestamp": "2026-02-07T15:49:37.360054+00:00", "epoch": 1, "train_loss": 0.6742, "train_acc": 0.6087}
{"timestamp": "2026-02-07T15:49:37.370471+00:00", "epoch": 2, "train_loss": 0.6489, "train_acc": 0.6101}
{"timestamp": "2026-02-07T15:49:37.381126+00:00", "epoch": 3, "train_loss": 0.619, "train_acc": 0.6199}
...40 lines total...

Renders as a sampled summary. RVC picks points from the head, middle, and tail of the series, with ... indicating skipped regions. The default sample count is 12:
$ rvc metrics train:TRAINING_LOG
train:TRAINING_LOG (timeseries)
points: 40
start: 2026-02-07T15:49:37Z
end: 2026-02-07T15:49:37Z
duration: +289.084ms
Δt epoch train_acc train_loss
-------------------------------- ------------ ------------ ------------
+0µs 1 0.608700 0.674200
+10.417ms 2 0.610100 0.648900
+21.072ms 3 0.619900 0.619000
...
+60.375ms 8 0.792400 0.478400
...
+243.122ms 33 0.809300 0.428200
...
+282.293ms 39 0.820500 0.425100
+289.084ms 40 0.823300 0.421800

Control the number of sample points:
rvc metrics train:TRAINING_LOG --points 5 # fewer samples
rvc metrics train:TRAINING_LOG --all # every point, no sampling
rvc metrics train:TRAINING_LOG --timestamps   # absolute timestamps instead of Δt

RVC detects timeseries by looking for a timestamp, time, datetime, date, ts, or t field in the first record. JSON arrays, NDJSON, and CSV files with a matching column header are all supported. Headerless CSV where the first column looks like a timestamp is also handled.
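The sampling idea can be sketched as index selection that always preserves the endpoints. RVC's exact head/middle/tail weighting is not specified here, so this stand-in uses plain even spacing:

```python
def sample_indices(n_points: int, n_samples: int = 12) -> list[int]:
    # Evenly spaced indices that always keep the first and last point.
    # (Illustrative: RVC's head/middle/tail weighting may pick differently.)
    if n_points <= n_samples:
        return list(range(n_points))
    step = (n_points - 1) / (n_samples - 1)
    return sorted({round(i * step) for i in range(n_samples)})
```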
Binned data. Detected by the presence of bin or bin_start/bin_end keys/columns.
A JSON array like this:
[
{ "bin_start": 0, "bin_end": 10, "count": 62, "survival_rate": 0.6129 },
{ "bin_start": 10, "bin_end": 20, "count": 102, "survival_rate": 0.402 },
{ "bin_start": 20, "bin_end": 30, "count": 220, "survival_rate": 0.35 }
]

Renders as:
$ rvc metrics analyze:AGE_HIST_JSON
analyze:AGE_HIST_JSON (histogram)
bin count survival_rate
0..10 62 0.612903
10..20 102 0.401961
20..30 220 0.350000
30..40 167 0.437126
40..50 89 0.382022
50..60 48 0.416667
60..70 19 0.315789
70..80 6 0.000000
80..90 1 1.000000

If both bin (a label) and bin_start/bin_end (range boundaries) are present, bin is used as the display label. CSV with a bin column using range syntax (0..10) is also recognised.
Metrics can be narrowed using the step:metric:field selector pattern:
rvc metrics analyze # all metrics from the analyze step
rvc metrics analyze:STATS # one specific metric
rvc metrics train:TRAINING_LOG:train_acc # one field within a timeseries
rvc metrics analyze:AGE_HIST_JSON:survival_rate   # one value column in a histogram

For timeseries, field filtering reduces the table to a single value column:
$ rvc metrics train:TRAINING_LOG:train_acc
train:TRAINING_LOG (timeseries)
points: 40
start: 2026-02-07T15:49:37Z
end: 2026-02-07T15:49:37Z
duration: +289.084ms
Δt train_acc
-------------------------------- ------------
+0µs 0.608700
+10.417ms 0.610100
+21.072ms 0.619900
...
+282.293ms 0.820500
+289.084ms 0.823300

For histograms, it reduces to the selected value column while preserving bin labels. For flat metrics, it shows only the matching keys.
All metrics render in four formats:
Plain text (default):
$ rvc metrics analyze:STATS
analyze:STATS
average_age: 29.700000
rows: 891
survival_rate: 0.383800
survived: 342

JSON (--json):
$ rvc metrics analyze:STATS --json
{
  "steps": {
    "analyze": {
      "STATS": {
        "average_age": 29.7,
        "rows": 891,
        "survival_rate": 0.3838,
        "survived": 342
      }
    }
  }
}

Flat metrics emit plain key-value objects. Timeseries and histogram metrics include "kind" and "values" fields. JSON schemas for both metric output and diff output are in schema/.
YAML (--yaml):
$ rvc metrics analyze:STATS --yaml
steps:
  analyze:
    STATS:
      average_age: 29.7
      rows: 891
      survival_rate: 0.3838
      survived: 342

Markdown (--markdown):
$ rvc metrics analyze:STATS --markdown
### analyze:STATS
| metric | value |
|---------------|-----------|
| average_age | 29.700000 |
| rows | 891 |
| survival_rate | 0.383800 |
| survived      | 342       |

Only one format flag can be used at a time; combining them (e.g. --json --yaml) is an error.
rvc metrics --diff compares metrics across git refs. It reads metric files from each ref (using the lockfile from that ref to locate them), computes a diff, and renders the result.
With no --ref arguments, RVC picks sensible defaults based on git state:
| Git state | Comparison |
|---|---|
| Uncommitted changes exist | working tree vs HEAD |
| On a feature branch, clean | current branch vs main/master |
| On the default branch, clean | HEAD vs HEAD~1 |
rvc metrics --diff # auto-detect
rvc metrics --diff --ref main # working tree vs main
rvc metrics --diff --ref v1 --ref v2 # two specific refs
rvc metrics --diff --ref v1 --ref v2 --ref v3   # multi-way (v1 is base)

When one --ref is given, it becomes the base and the working tree is the target. When two or more are given, the first is the base and the rest are targets.
With two refs, RVC produces a side-by-side comparison with delta columns showing the difference from base to target:
$ rvc metrics --diff --ref HEAD~1 --ref HEAD evaluate:EVAL_METRICS
Comparing HEAD~1 against HEAD
evaluate:EVAL_METRICS
metric HEAD~1 HEAD Δ
-------- -------- -------- --------
fn 26 25 -1
fp 11 11 +0
tn 103 103 +0
tp 38 39 +1
val_acc 0.792100 0.797800 +0.0057
val_loss 0.436900 0.429900 -0.0070
val_rows 178 178 +0

For timeseries, RVC aligns both series to a shared duration axis using nearest-neighbour resampling. A summary table shows metadata (start, end, duration, point count) followed by the aligned value table. When one series is shorter than the other, ∅ marks points with no corresponding data:
$ rvc metrics --diff --ref HEAD~2 --ref HEAD train:TRAINING_LOG:train_acc --points 5
Comparing HEAD~2 against HEAD
train:TRAINING_LOG:train_acc
field HEAD~2 HEAD Δ
-------- -------------------- -------------------- ----------
start 2026-02-06T23:10:26Z 2026-02-07T15:49:37Z +16.654h
end 2026-02-06T23:10:26Z 2026-02-07T15:49:37Z +16.654h
duration +304.508ms +289.084ms -15.424ms
points 40 40 +0
Δt HEAD~2.train_acc HEAD.train_acc Δ(train_acc)
---------- ---------------- -------------- -------------
+0µs 0.610100 0.608700 -0.0014
+76.127ms 0.789600 0.793800 +0.0042
+152.254ms 0.791000 0.805000 +0.0140
+228.381ms 0.789600 0.807900 +0.0183
+304.508ms 0.823300 ∅

Histograms are aligned by bin label. Bins that appear in one ref but not the other show ∅, with deltas computed per value column.
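Nearest-neighbour alignment can be sketched as: for each offset on the shared axis, take the closest point from each series, and emit nothing when the closest point is too far away. The tolerance heuristic below (one axis step) is an assumption for illustration, not RVC's rule; None plays the role of ∅:

```python
def align_series(base, target, axis):
    # base/target: lists of (offset_seconds, value), sorted by offset.
    # For each shared-axis offset, take the nearest point from each series.
    def nearest(series, t, tolerance):
        best = min(series, key=lambda p: abs(p[0] - t))
        return best[1] if abs(best[0] - t) <= tolerance else None

    tol = (axis[-1] - axis[0]) / max(len(axis) - 1, 1)
    return [(t, nearest(base, t, tol), nearest(target, t, tol)) for t in axis]
```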
With three or more refs, the first ref is the base and each subsequent ref is compared against it. Delta columns are hidden by default in multi-way mode to keep tables readable:
$ rvc metrics --diff --ref HEAD~2 --ref HEAD~1 --ref HEAD train:FINAL_METRICS
Comparing snapshots: base (HEAD~2) + HEAD~1, HEAD
train:FINAL_METRICS
metric HEAD~2 HEAD~1 HEAD
---------------- -------- -------- --------
best_train_acc 0.823300 0.813500 0.823300
epochs 40 40 40
final_train_acc 0.823300 0.800800 0.823300
final_train_loss 0.431800 0.432400 0.421800

--deltas forces delta columns on; --no-deltas forces them off:
$ rvc metrics --diff --ref HEAD~2 --ref HEAD~1 --ref HEAD train:FINAL_METRICS --deltas
Comparing HEAD~2 against HEAD~1, HEAD
train:FINAL_METRICS
metric HEAD~2 HEAD~1 HEAD~1.Δ HEAD HEAD.Δ
---------------- -------- -------- -------- -------- ---------
best_train_acc 0.823300 0.813500 -0.0098 0.823300 +0.0000
epochs 40 40 +0 40 +0
final_train_acc 0.823300 0.800800 -0.0225 0.823300 +0.0000
final_train_loss 0.431800 0.432400 +0.0006 0.421800 -0.0100

In 2-way diffs, delta headers are shortened to Δ or Δ(field). In multi-way diffs, they include the ref name: HEAD.Δ or HEAD.Δ(field).
| Mode | Default deltas | Override |
|---|---|---|
| 2-way | shown | --no-deltas to hide |
| 3+ way | hidden | --deltas to show |
When a metric changes shape between refs (e.g. flat in v1, timeseries in v2), RVC cannot produce a structured comparison table. Instead, it shows each ref's snapshot independently.
Diff output supports all four formats. JSON and YAML include "base" and "targets" at the top level.
Flat diffs are ref-keyed objects (no "kind" field). Non-flat diffs include "kind" + "values". Timeseries diffs also include a "summary" object with start/end/duration/points per ref:
$ rvc metrics --diff --ref HEAD~1 --ref HEAD evaluate:EVAL_METRICS --json
{
  "base": "HEAD~1",
  "targets": ["HEAD"],
  "steps": {
    "evaluate": {
      "EVAL_METRICS": {
        "HEAD~1": {
          "fn": "26",
          "fp": "11",
          "val_acc": "0.792100"
        },
        "HEAD": {
          "fn": "25",
          "Δ(fn)": "-1",
          "fp": "11",
          "Δ(fp)": "+0",
          "val_acc": "0.797800",
          "Δ(val_acc)": "+0.0057"
        }
      }
    }
  }
}

Markdown (--markdown):
$ rvc metrics --diff --ref HEAD~1 --ref HEAD evaluate:EVAL_METRICS --markdown
### evaluate:EVAL_METRICS
| metric | HEAD~1 | HEAD | Δ |
|----------|----------|----------|----------|
| fn | 26 | 25 | -1 |
| fp | 11 | 11 | +0 |
| tn | 103 | 103 | +0 |
| tp | 38 | 39 | +1 |
| val_acc | 0.792100 | 0.797800 | +0.0057 |
| val_loss | 0.436900 | 0.429900 | -0.0070 |
| val_rows | 178      | 178      | +0       |

Example of structured output for timeseries and histogram diffs:
base: HEAD~1
targets:
  - HEAD
steps:
  analyze:
    AGE_HIST_JSON:
      kind: histogram
      values:
        - bin_start: '0'
          bin_end: '10'
          HEAD~1:
            count: '62'
            survival_rate: '0.612903'
          HEAD:
            count: '62'
            Δ(count): '+0'
            survival_rate: '0.612903'
            Δ(survival_rate): '+0.0000'
  train:
    TRAINING_LOG:
      kind: timeseries
      summary:
        HEAD~1:
          start: '2026-02-06T23:10:26Z'
          end: '2026-02-06T23:10:26Z'
          duration: '+304.508ms'
          points: '40'
        HEAD:
          start: '2026-02-07T15:49:37Z'
          Δ(start): '+16.654h'
          end: '2026-02-07T15:49:37Z'
          Δ(end): '+16.654h'
          duration: '+289.084ms'
          Δ(duration): '-15.424ms'
          points: '40'
          Δ(points): '+0'
      values:
        - Δt: '+0µs'
          HEAD~1:
            train_acc: '0.610100'
          HEAD:
            train_acc: '0.608700'
            Δ(train_acc): '-0.0014'

Selectors, output formats, and diff all compose freely:
# specific field + multi-way diff + markdown
rvc metrics train:TRAINING_LOG:train_acc --diff --ref HEAD~2 --ref HEAD~1 --ref HEAD --markdown
# all metrics as structured JSON diff
rvc metrics --diff --ref main --ref experiment --json
# histogram diff against working tree
rvc metrics analyze:AGE_HIST_JSON --diff
# timeseries diff with explicit sample density
rvc metrics train:TRAINING_LOG --diff --ref main --points 30
# alternate config file
rvc -d projects/titanic/rvc.yaml metrics --diff --ref main

rvc plots generates an interactive HTML page with charts for all metrics in the pipeline. It renders directly in the browser using Vega-Lite - no server needed.
Each metric shape maps to a chart type:
- Flat metrics → horizontal bar grid with delta labels, one panel per key, 3 columns
- Timeseries → line chart per value column, refs as coloured lines, cross-ref tooltip on hover at each x-step, summary stats table below
- Histogram → grouped bars by bin (one bar per ref), missing bins filled with 0, cross-ref tooltip on hover per bin
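The missing-bin filling can be sketched as a union of bins across refs, padded with zeros. This is a minimal Python sketch; the real alignment happens in Rust during data assembly:

```python
def align_bins(per_ref):
    """Align histogram bins across refs, filling missing bins with 0.

    `per_ref` maps ref name -> {(bin_start, bin_end): count}.
    Returns one row per bin in sorted order, each row carrying
    every ref's count so grouped bars always have a value to draw.
    """
    all_bins = sorted({b for bins in per_ref.values() for b in bins})
    return [
        {"bin": b, **{ref: bins.get(b, 0) for ref, bins in per_ref.items()}}
        for b in all_bins
    ]

# Hypothetical counts for two refs with partially overlapping bins:
rows = align_bins({
    "HEAD~1": {(0, 10): 62, (10, 20): 102},
    "HEAD":   {(0, 10): 62, (20, 30): 5},
})
# Bins absent from a ref read as 0, so every ref gets a bar in every bin.
```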
rvc plots # generate charts, open in browser
rvc plots --no-open # generate without opening
rvc plots train eval # only chart specific steps

By default, rvc plots diffs metrics against git (same ref logic as rvc metrics --diff). Use --no-diff to chart only the working tree:
rvc plots --no-diff # no diff, just current metrics

Ref selection works the same as rvc metrics --diff:
rvc plots # auto-detect (working vs HEAD, branch vs main, etc.)
rvc plots --ref main # working tree vs main
rvc plots --ref v1 --ref v2 # two specific refs

When diffing, all refs appear as separate series in timeseries/histogram charts, and as grouped bars in flat charts. The first ref is the base for delta computation.
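Delta computation against the base ref can be sketched like this. The signed formatting and 4-decimal rounding are guesses based on the diff tables earlier; non-numeric values such as durations are out of scope for the sketch:

```python
def fmt_delta(base, target):
    """Format target - base as a signed delta, e.g. '+0.0057' or '-1'.

    Sketch only: integer inputs keep integer form, floats are shown
    to 4 decimal places (a guess at RVC's rounding rules).
    """
    b, t = float(base), float(target)
    d = t - b
    if b.is_integer() and t.is_integer():
        return f"{int(d):+d}"
    return f"{d:+.4f}"

# Matches the earlier diff tables: fn 26 -> 25 is '-1',
# val_acc 0.792100 -> 0.797800 is '+0.0057'.
```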
rvc plots writes to a fixed plots/ directory next to rvc.yaml (and auto-adds /plots to the RVC-managed .gitignore block):
<dir-with-rvc.yaml>/plots/
index.html # self-contained interactive page
specs/ # individual Vega-Lite specs (copy-paste into editors)
eval-METRICS.vl.json
train-LOG-loss.vl.json
train-LOG-acc.vl.json
Spec files are standalone Vega-Lite specs with inline data - open them in the Vega Editor, embed in notebooks, or use in other tools.
rvc run --live opens a browser page before pipeline execution and updates charts as metric files change on disk:
rvc run --live # run pipeline with live chart updates
rvc run --live --force # force re-run with live charts
rvc run --live train # live charts filtered to train + its dependencies
rvc run --live --ref main # live comparison baseline/targets (same ref rules as plots/metrics)

The page shows a pulsing ● LIVE badge during execution, then switches to ✓ DONE or ✗ FAILED when the pipeline finishes. The browser polls a companion JS file for updates - no WebSocket server needed.
When targeting specific steps (rvc run --live train), live charts are automatically filtered to only the steps in scope (the target plus its dependency chain). This keeps the page focused on what's actually executing.
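The scoping is a transitive walk over the dependency graph; a sketch, where the `deps` mapping is hypothetical (echoing the Titanic example):

```python
from collections import deque

def steps_in_scope(target, deps):
    """Return the target step plus all steps it transitively depends on.

    `deps` maps each step to the steps it depends on.
    """
    scope, queue = {target}, deque([target])
    while queue:
        for dep in deps.get(queue.popleft(), []):
            if dep not in scope:
                scope.add(dep)
                queue.append(dep)
    return scope

# A hypothetical DAG shaped like the Titanic pipeline:
deps = {
    "download": [],
    "analyze": ["download"],
    "split": ["download"],
    "train": ["split"],
    "evaluate": ["train", "split"],
}
# `rvc run --live train` would chart train, split, and download only.
```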
The HTML page includes:
- Step sections - charts grouped by pipeline step, ordered by execution (topological) order
- Pipeline DAG - a Mermaid diagram of the dependency graph, rendered below the charts
Charts are generated with a fixed, shared Vega-Lite base width across chart types so exported SVGs are consistent, then scale responsively in the page. The page uses Tailwind CSS via CDN for styling and works from file:// URLs - no server required.
Rust handles data assembly: parsing metrics, computing deltas, aligning histogram bins, enriching data points with display labels. The results are Vega-Lite specs with inline data, plus optional structured summary tables.
The browser handles all rendering: Vega-Lite renders charts, JS renders summary tables from structured data, Mermaid renders the DAG. No HTML is generated in Rust.
Vega-Lite spec templates live in src/templates/ as standalone .vl.json files (valid Vega-Lite with empty data). Rust parses them as JSON and fills in data.values, title, and axis titles - no string-based template substitution.
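The fill step can be sketched in a few lines, shown here in Python rather than Rust; the template fragment is illustrative, not one of RVC's actual templates:

```python
import copy
import json

# An illustrative template: valid Vega-Lite with empty data,
# standing in for a file from src/templates/.
TEMPLATE = json.loads("""{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "title": "",
  "data": {"values": []},
  "mark": "line",
  "encoding": {
    "x": {"field": "step", "type": "quantitative"},
    "y": {"field": "value", "type": "quantitative"}
  }
}""")

def fill_spec(template, title, values):
    """Fill a parsed spec's data.values and title - no string templating."""
    spec = copy.deepcopy(template)
    spec["title"] = title
    spec["data"]["values"] = values
    return spec

spec = fill_spec(TEMPLATE, "train:TRAINING_LOG:train_acc",
                 [{"step": 0, "value": 0.61}, {"step": 1, "value": 0.63}])
```

Working on the parsed JSON tree, rather than substituting into strings, guarantees the output is always well-formed Vega-Lite.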
Metric loading, ref resolution, and step filtering are shared between rvc metrics and rvc plots via app/shared.rs - the same code paths serve both commands.
The example/ directory contains three complete pipelines that can be run directly. They are independent of the test suite.
A minimal pipeline: uppercase text, count words, generate random numbers, split into files, produce a report. Useful for understanding the mechanics without external dependencies.
$ cd example/basic && rvc run && rvc metrics report:STATS
report:STATS
files.count: 30
files.first_name: a.txt
word_count: 32

Download the Titanic dataset, analyse survival rates, split train/val, train a neural network, evaluate. Exercises all three metric shapes - flat statistics, a timeseries training log, and survival-by-age histograms. Requires Python with uv.
cd example/titanic && rvc run && rvc metrics

Euro exchange rate analysis: download historical FX data, run three parallel filter steps (majors, monthly averages, extremes), then produce a summary. Demonstrates parallel execution and fan-out / fan-in DAG structure.
cd example/eurofx && rvc run && rvc dag

| Flag | Effect |
|---|---|
| `-d path/to/rvc.yaml` | Use an alternate config file |
| `-v` | Verbose output |
| `-n` | Dry run (show plan, don't execute) |
These are global and can be combined with any subcommand.
Possible directions. This is a playground - these are ideas, not commitments.
- Metric thresholds - declare assertions like `accuracy > 0.9` in `rvc.yaml`. Fail the run if metrics regress past a threshold.
- Terminal sparklines - ASCII charts for timeseries metrics directly in the terminal.
- Step-level timing - record execution duration in the lockfile. Surface trends across runs.
- Parameterised sweeps - `rvc run --sweep LEARNING_RATE=0.001,0.01,0.1` to fan out a step across parameter values and collect metrics from each.
- `rvc bisect` - binary search across commits for the point where a metric regressed, similar to `git bisect` but driven by pipeline metrics.
- CI reporting - a `rvc report` command that produces a self-contained markdown summary for PR comments: diff against the base branch, show what changed.
- Partial restarts - if step 3 of 5 fails, resume from step 3 without re-validating earlier steps.
- Config validation - a `rvc check` command to validate `rvc.yaml` without running anything: catch typos, missing deps, and circular references.
- Notebook integration - metric collection from Jupyter notebooks, possibly via a lightweight Python package that writes NDJSON.
- Distributed execution - run steps on remote machines or clusters.
- Plugin system - custom step executors (Docker, Slurm, etc.) without hard-coding them into RVC.
- Metric annotations - attach notes to specific runs ("tried larger batch size", "new augmentation") that appear in diff output.

Explicitly out of scope:

- Web UI / server. `rvc plots` generates static HTML opened locally. No hosted dashboard or server component.
- Language-specific SDKs. The env-var approach is language-agnostic by design.
- DVC compatibility. RVC is a separate project with different trade-offs.
DVC - the tool that solved these problems first. For production use, DVC is the right choice.
MIT