Skip to content

Add vault migration framework with v3→v5 upgrade path#6

Open
johnzfitch wants to merge 1 commit intomainfrom
claude/secure-upgrade-path-A0bEk
Open

Add vault migration framework with v3→v5 upgrade path#6
johnzfitch wants to merge 1 commit intomainfrom
claude/secure-upgrade-path-A0bEk

Conversation

@johnzfitch
Copy link
Owner

Summary

Implement a forward-only vault migration framework that enables automatic and explicit upgrades from vault v3 to v5. The system includes version validation, anti-rollback protection, automatic backup creation, and sequential migration steps with domain-separated HKDF key derivation.

Key Changes

  • New migration module (src/vault/migrate.rs):

    • Version validation with anti-downgrade and anti-rollback checks
    • Forward-only migration chain (v3 → v4 → v5)
    • Version-specific wrapping key derivation using HKDF-Expand with purpose labels
    • Private key re-wrapping with fresh nonces during migration
    • Comprehensive test suite covering all migration paths and edge cases
  • Vault format updates (src/vault/format.rs):

    • Bump VAULT_VERSION from 4 to 5
    • Add min_version field for anti-rollback enforcement
    • Document version history and migration rationale
  • Automatic migration in unlock (src/vault/ops.rs):

    • unlock_vault() now automatically migrates older vaults to current version
    • Creates atomic backups before migration (with sequential suffix fallback)
    • Persists migrated vault atomically after successful upgrade
    • Derives wrapping keys based on vault's current version
  • New CLI command (src/cli/commands.rs, src/cli/mod.rs, src/main.rs):

    • upgrade command for explicit, controlled vault migration
    • Validates version before prompting for passphrase
    • Reports current version and skips if already up-to-date
  • UI updates (src/tui/mod.rs):

    • Display min_version field in vault info

Implementation Details

  • Key separation: v4+ uses distinct HKDF-derived wrapping keys for ML-KEM and X25519 (v3 used raw master key for both)
  • Domain separation: v4 and v5 use different HKDF labels (dota-v4-wrap-* vs dota-v5-wrap-*) to prevent key reuse across versions
  • Anti-rollback: After migration to v5, min_version is set to VAULT_VERSION, preventing older binaries from opening the vault
  • Backup safety: Pre-migration backups use O_CREAT | O_EXCL semantics with symlink rejection and sequential suffix fallback
  • Memory safety: Sensitive key material is zeroized after use via Zeroize and ZeroizeOnDrop traits
  • Deterministic migration: Same passphrase + vault always produces identical results (nonces are freshly generated per migration)

https://claude.ai/code/session_01119wsw4Dqok7UZsk8ZQbjH

Implement a secure upgrade path with no downgrade capability:

- Add migrate.rs: version validation, migration chain, per-version
  HKDF wrapping key derivation, anti-rollback via min_version field
- Bump VAULT_VERSION to 5, add MIN_SUPPORTED_VERSION = 3
- Add min_version field to Vault struct (serde default for compat)
- Auto-migrate on unlock with atomic backup (O_CREAT|O_EXCL, 0600)
- Add `dota upgrade` CLI command for controlled fleet upgrades
- Zeroize all intermediate key material (WrappingKeys, HKDF arrays,
  decrypted private keys during rewrap)
- Symlink protection on vault and backup paths
- 20+ unit tests covering migration, validation, and edge cases

Security hardening (two-pass review):
- H-1: Zeroize decrypted private keys after rewrap
- H-2: Replace fs::copy with atomic O_CREAT|O_EXCL backup creation
- M-1: Derive ZeroizeOnDrop on WrappingKeys struct
- M-2: Zeroize HKDF intermediate key arrays
- M-3: Validate version before early-exit in handle_upgrade
- L-2: Use explicit version match arms (no open-ended ranges)

https://claude.ai/code/session_01119wsw4Dqok7UZsk8ZQbjH
Copilot AI review requested due to automatic review settings March 3, 2026 03:48
@johnzfitch
Copy link
Owner Author

@codex — requesting code review on this PR.

Scope: Forward-only vault migration framework for dota (post-quantum secrets manager).

Key areas to review:

  1. src/vault/migrate.rs (~820 lines, new file) — Core migration logic:

    • validate_version(): 4-check version validation (anti-downgrade, anti-rollback, corruption, too-old)
    • derive_wrapping_keys(): version-specific HKDF-SHA256 key derivation with domain-separated labels
    • rewrap_private_keys(): decrypt-then-re-encrypt ML-KEM-768 and X25519 private keys
    • Migration chain: migrate_v3_to_v4 (raw→HKDF wrapping), migrate_v4_to_v5 (label refresh + anti-rollback)
    • 20+ tests covering happy path, error cases, and edge cases
  2. src/vault/ops.rs — Integration changes:

    • unlock_vault() now calls validate_version() and auto-migrates with atomic backup
    • create_backup(): race-free backup using O_CREAT|O_EXCL, symlink rejection, mode(0600)
    • create_vault(), change_passphrase(), rotate_keys() updated to use migrate::derive_wrapping_keys()
  3. src/vault/format.rs — Added min_version field with #[serde(default)] for backward compat

  4. Security-sensitive patterns to scrutinize:

    • Zeroization: WrappingKeys derives ZeroizeOnDrop; intermediate HKDF arrays and decrypted private keys explicitly .zeroize()'d
    • The min_version field is stored unauthenticated in JSON — an attacker with file write access could set it to u32::MAX to DoS the vault (documented as known limitation; full mitigation would require authenticated envelope or HMAC)
    • HKDF label constants use dota-v{N}-wrap-{algo} format — confirm domain separation is sufficient
  5. Potential concerns:

    • v3 wrapping uses raw master key (legacy compat) — is this documented clearly enough?
    • create_backup tries up to 1000 path candidates — is this reasonable?
    • No cargo check was possible in this environment (no network for cargo registry) — compilation verification pending

@johnzfitch
Copy link
Owner Author

@claude — requesting code review on this PR.

Context: This PR adds a forward-only vault migration framework to dota, a post-quantum secrets manager using hybrid ML-KEM-768 + X25519 encryption. The vault format is being upgraded from v4 to v5 with a migration chain supporting v3→v4→v5.

What changed (8 files, +983/-58 lines):

  • New: src/vault/migrate.rs — Migration framework with version validation, per-version wrapping key derivation, private key rewrap, and comprehensive tests
  • Modified: src/vault/ops.rs — Integrated migration into unlock_vault() with auto-backup; moved wrapping key derivation to migrate module
  • Modified: src/vault/format.rs — Added min_version: u32 field and MIN_SUPPORTED_VERSION constant
  • Modified: CLI (commands.rs, mod.rs, main.rs) — Added dota upgrade command
  • Modified: src/tui/mod.rs — Display min_version in info view

Review focus areas:

  1. Cryptographic correctness: Each vault version uses distinct HKDF-Expand labels (dota-v4-wrap-mlkem, dota-v5-wrap-mlkem, etc.) for wrapping key derivation. v3 uses raw master key (legacy). Is the domain separation correct and sufficient?

  2. Anti-rollback design: min_version is set during v4→v5 migration and checked on every unlock. Once upgraded, older binaries that don't understand v5 will refuse to open the vault. However, min_version is unauthenticated JSON metadata — is this acceptable given the threat model?

  3. Memory safety: Two-pass security review addressed:

    • Decrypted private keys zeroized after rewrap (mlkem_sk.zeroize(), x25519_sk.zeroize())
    • WrappingKeys struct derives ZeroizeOnDrop for automatic cleanup
    • HKDF intermediate key arrays zeroized before AesKey::from_bytes()
  4. Backup atomicity: create_backup() uses OpenOptions::create_new(true) (maps to O_CREAT|O_EXCL) to prevent TOCTOU races, rejects symlinks via symlink_metadata(), and sets mode(0o600).

  5. Test coverage: 20+ unit tests including:

    • Version validation (current, future, anti-rollback, too-old, corrupt, zero, u32::MAX)
    • Full migration chain (v3→v5, v4→v5)
    • Wrong passphrase rejection
    • Wrapping key determinism and cross-version divergence
    • Metadata preservation during migration
  6. Known limitation: cargo check/cargo test could not be run in this environment due to network restrictions (no cargo registry cache). Compilation verification is pending CI.

Please review for correctness, security, and any design concerns.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a forward-only vault migration framework and wires it into vault unlock + a new explicit CLI upgrade command, enabling automatic upgrades from older vault format versions to the current format while introducing anti-rollback protection.

Changes:

  • Introduces src/vault/migrate.rs with version validation, v3→v4→v5 migration chain, and version-scoped wrapping-key derivation.
  • Updates vault format to v5 and adds min_version for anti-rollback enforcement.
  • Updates unlock flow to auto-backup + auto-migrate, and adds a CLI upgrade command plus UI/CLI info display for min_version.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/vault/ops.rs Auto-migrate on unlock; create pre-migration backups; switch wrapping-key derivation to migration module.
src/vault/mod.rs Exposes new migrate module.
src/vault/migrate.rs Implements migration framework, wrapping-key derivation, and tests.
src/vault/format.rs Bumps vault version to 5; adds min_version and documents version history.
src/tui/mod.rs Displays min_version in vault info.
src/main.rs Wires Upgrade command dispatch.
src/cli/mod.rs Adds Upgrade CLI subcommand variant.
src/cli/commands.rs Implements handle_upgrade command and prints min_version in info output.
Comments suppressed due to low confidence (1)

src/vault/ops.rs:103

  • The unlock_vault doc comment says the backup is written to <vault_path>.v<old>.bak, but create_backup may fall back to <vault_path>.v<old>.bak.<N> if that name already exists. Consider updating the comment to reflect the suffix behavior so operators know where to look.
/// Unlock a vault with a passphrase.
///
/// If the vault is at an older (but supported) version, it is automatically
/// migrated to `VAULT_VERSION`. A backup of the pre-migration vault is
/// written to `<vault_path>.v<old>.bak` before any changes are persisted.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +422 to +426
let data = fs::read(vault_path).context("Failed to read vault for backup")?;
f.write_all(&data)
.context("Failed to write backup data")?;
f.sync_all().context("Failed to sync backup file")?;
return Ok(candidate);
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If fs::read(vault_path), write_all, or sync_all fails after the backup file has been created, the function returns an error but leaves a (possibly empty/partial) backup file behind. Consider cleaning up the just-created candidate on failure (best-effort remove_file) to avoid accumulating misleading backups.

Suggested change
let data = fs::read(vault_path).context("Failed to read vault for backup")?;
f.write_all(&data)
.context("Failed to write backup data")?;
f.sync_all().context("Failed to sync backup file")?;
return Ok(candidate);
let result: Result<()> = (|| {
let data = fs::read(vault_path).context("Failed to read vault for backup")?;
f.write_all(&data)
.context("Failed to write backup data")?;
f.sync_all().context("Failed to sync backup file")?;
Ok(())
})();
match result {
Ok(()) => return Ok(candidate),
Err(e) => {
// Best-effort cleanup of the just-created backup file on failure
let _ = fs::remove_file(p);
return Err(e);
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +29 to +33
/// Anti-rollback floor: the minimum binary version required to open this
/// vault. Set to VAULT_VERSION on creation and on every migration. Older
/// binaries that do not understand this field will already fail on the
/// `version` check, but this provides an additional explicit safeguard
/// against targeted downgrade attacks.
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The min_version docstring says it is the “minimum binary version required”, but the code compares it against VAULT_VERSION (vault format version) and sets it to VAULT_VERSION during creation/migration. To avoid confusion for operators/users, please adjust the documentation/error wording to refer to the vault format version (or rename the field) rather than implying it maps to the application’s semver.

Suggested change
/// Anti-rollback floor: the minimum binary version required to open this
/// vault. Set to VAULT_VERSION on creation and on every migration. Older
/// binaries that do not understand this field will already fail on the
/// `version` check, but this provides an additional explicit safeguard
/// against targeted downgrade attacks.
/// Anti-rollback floor: the minimum vault format version required to open
/// this vault. Set to VAULT_VERSION (the current vault format version) on
/// creation and on every migration. Older binaries that do not understand
/// this field will already fail on the `version` (format) check, but this
/// provides an additional explicit safeguard against targeted downgrade
/// attacks.

Copilot uses AI. Check for mistakes.
Comment on lines +61 to +63
"This vault requires dota v{} or later (anti-rollback). \
Please upgrade dota.",
vault.min_version,
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_version reports “This vault requires dota v{} or later” using min_version, but min_version is a vault format floor (compared to VAULT_VERSION), not the application semver. Consider rephrasing the message to reference the vault format version (or the minimum supporting release) to avoid telling users to install a non-existent “dota v5”.

Suggested change
"This vault requires dota v{} or later (anti-rollback). \
Please upgrade dota.",
vault.min_version,
"This vault requires vault format v{} or later (anti-rollback), \
but this binary only supports up to v{}. Please upgrade dota.",
vault.min_version,
VAULT_VERSION,

Copilot uses AI. Check for mistakes.
Comment on lines +392 to +418
fn create_backup(vault_path: &str, old_version: u32) -> Result<String> {
use std::os::unix::fs::OpenOptionsExt;

let candidates = (0u32..).map(|i| {
if i == 0 {
format!("{}.v{}.bak", vault_path, old_version)
} else {
format!("{}.v{}.bak.{}", vault_path, old_version, i)
}
});

for candidate in candidates.take(1000) {
let p = Path::new(&candidate);

// Refuse symlinks at the backup destination
if let Ok(meta) = fs::symlink_metadata(p) {
if meta.file_type().is_symlink() {
continue; // skip symlinks, try next suffix
}
}

// Atomically create-or-fail (O_CREAT | O_EXCL equivalent)
match fs::OpenOptions::new()
.write(true)
.create_new(true)
.mode(0o600)
.open(p)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_backup uses std::os::unix::fs::OpenOptionsExt and .mode(0o600) in non-test code, which will not compile on non-Unix targets. If this project is intended to build on Windows, consider moving the Unix-specific permission setting behind #[cfg(unix)] and providing a portable fallback (e.g., create the file with create_new(true) and set permissions conditionally).

Copilot uses AI. Check for mistakes.
Comment on lines +406 to +418
// Refuse symlinks at the backup destination
if let Ok(meta) = fs::symlink_metadata(p) {
if meta.file_type().is_symlink() {
continue; // skip symlinks, try next suffix
}
}

// Atomically create-or-fail (O_CREAT | O_EXCL equivalent)
match fs::OpenOptions::new()
.write(true)
.create_new(true)
.mode(0o600)
.open(p)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The symlink check in create_backup (symlink_metadata then open) is still vulnerable to a TOCTOU race: a path can be swapped to a symlink between the check and open, and OpenOptions::open will follow it. If the goal is to refuse symlinks reliably, use an atomic no-follow open (e.g., O_NOFOLLOW via OpenOptionsExt::custom_flags) or open then verify the opened file is not a symlink via fstat/metadata on the file descriptor.

Copilot uses AI. Check for mistakes.
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: de43ae842c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +127 to +130
let backup_path = create_backup(vault_path, old_version)?;

migrate::migrate_vault(&mut vault, &master_key)
.context("Vault migration failed (backup preserved)")?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Create migration backup only after passphrase validation

When opening a v3/v4 vault, unlock_vault writes a backup before it knows the passphrase can decrypt the private keys. If the passphrase is wrong (or key ciphertext is corrupted), migration fails but the backup remains, so repeated failed attempts keep consuming new *.bak suffixes until create_backup hits its 1000-file limit and then blocks unlock entirely even for the correct passphrase. This is a user-facing lockout path introduced by the new migration flow; defer backup creation until after in-memory migration succeeds, or clean up the backup on migration failure.

Useful? React with 👍 / 👎.

Comment on lines +231 to +235
let (enc_x25519, new_x25519_nonce) = aes_encrypt(&new.x25519, &x25519_sk)?;

// Zeroize decrypted private key material before dropping
mlkem_sk.zeroize();
x25519_sk.zeroize();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Badge Zeroize decrypted keys on all migration error paths

In rewrap_private_keys, decrypted private key buffers are scrubbed only after both re-encryptions succeed. Any error from the second aes_encrypt(...)? returns early and skips zeroization, leaving plaintext key material in heap memory until normal deallocation. Since this module is explicitly security-sensitive and claims zeroization guarantees, these buffers should be wrapped/guarded so they are cleared on every exit path (including ? errors).

Useful? React with 👍 / 👎.

johnzfitch pushed a commit that referenced this pull request Mar 7, 2026
- Fix UTF-8 zeroization flaw: explicitly zeroize bytes on FromUtf8Error
  in get_secret() to prevent plaintext lingering in error (PR #5 P1)
- Fix non-canonical commitment input: add length prefixes to variable-
  length fields in compute_key_commitment() (PR #5 P2)
- Fix unsafe signal handler: replace signal()/raise() with sigaction
  and _exit() for async-signal-safe operation (PR #5 P2)
- Fix TOCTOU symlink race in backup: use atomic temp file + rename
  instead of symlink check + fs::copy (PR #6/7 P2)
- Fix clippy needless_borrows_for_generic_args in migration.rs
- Fix cargo fmt formatting across all changed files

https://claude.ai/code/session_01QD9ukQ4n5q7aDmVXVWELNw
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants