a full binary file format validator for over 100 different filetypes, written in Zig with frontier AI assistance

pmarreck/validate

validate

Data silently rots.

If you aren't actively validating, you likely already have corrupt files that are being quietly re-copied to the cloud or your NAS as "good" backups. Family photos, legal documents, old projects, and cherished media are exactly the kind of files that get silently damaged and then preserved in that damaged state.

Drive failures are obvious. Silent sector failures, copy errors, and transmission errors are not. That's why validate exists: deterministic, byte-level validation across a wide range of file formats (100+, see FORMAT_VERIFICATIONS.md).

Components

  • Zig library (core validation)
  • C FFI (stable enough for integration, but not yet 1.0)
  • C CLI wrapper: validate

Status

The C FFI mirrors the current Zig validation API for ease of integration. It is expected to evolve before a 1.0 release.

Build

./build

The build script runs ./test first. When DEBUG is unset or 0, dependencies build in ReleaseFast and ./build defaults to -Doptimize=ReleaseFast.

CLI

./zig-out/bin/validate <path> [--jobs N]

--jobs 0 (default) uses all available cores (logical CPU count).

Environment variables:

  • MAX_FILES limits the number of files scanned when validating a directory.
  • MAX_VIDEO_SIZE limits deep video validation to files under N MB (unset = no limit).
  • MEM_TELEMETRY=1 logs per-file RSS memory samples (use MEM_TELEMETRY_PATH to log to a file, MEM_TELEMETRY_EVERY=N to sample every N files).
  • UNKNOWN_OUT=/path writes UNKNOWN entries to that path instead of stdout (supports /dev/null, /dev/fd/1, /dev/fd/2).
  • ZIP_TELEMETRY=1 logs slow ZIP entry validation details to stderr (adjust the threshold with ZIP_SLOW_SECONDS).
  • PDF_TELEMETRY=1 logs slow PDF deep-validation breakdowns to stderr (adjust the threshold with PDF_SLOW_SECONDS).
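As a rough sketch of how these settings combine, the wrapper below caps a directory scan and discards UNKNOWN entries. The function name validate_dir, the VALIDATE_BIN override, and the specific limit values are illustrative assumptions and not part of the tool. Only --jobs, MAX_FILES, MAX_VIDEO_SIZE, and UNKNOWN_OUT come from the description above.

```shell
#!/bin/sh
# Hypothetical helper: validate_dir DIR [JOBS]
# VALIDATE_BIN is an assumed convenience override for the binary path;
# the tool itself does not read it.
validate_dir() {
    dir="$1"
    jobs="${2:-0}"   # 0 = use all logical cores (the documented default)

    # Variables assigned on the command line apply to this one run only.
    # The limit values here are arbitrary examples.
    MAX_FILES=1000 \
    MAX_VIDEO_SIZE=500 \
    UNKNOWN_OUT=/dev/null \
        "${VALIDATE_BIN:-./zig-out/bin/validate}" "$dir" --jobs "$jobs"
}
```

With these example values, validate_dir ~/Pictures 4 should scan at most 1000 files using four jobs, limit deep video validation to files under 500 MB, and discard UNKNOWN entries. Setting the variables inline rather than exporting them keeps one run's limits from leaking into the next.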

Tests

./test

Windows Tests (CrossOver)

./test-windows

Requires a CrossOver bottle named windows-dev-test (or set CROSSOVER_BOTTLE). Note: this is a temporary external dependency; we plan to make the runner self-contained via flake.nix.
