-
Notifications
You must be signed in to change notification settings - Fork 4
fix(dotc1z): add fsync and size validation to prevent c1z corruption #627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Two defensive fixes to prevent corrupt c1z files from being saved: 1. Add explicit fsync on sqlite database file after Close() but before saveC1z() reads it for compression. On filesystems with aggressive caching (like ZFS with ARC), data may still be in kernel buffers. 2. Add size validation in SaveC1Z to detect truncated copies. Compares bytes copied to source file size and fails if they don't match. This catches the actual symptom we observed: truncated zstd streams missing the end-of-stream marker. Root cause investigation: Production incidents showed c1z corruption where sync succeeded and SaveC1Z logged success, but the uploaded file had a truncated zstd stream (missing end-of-stream marker). The size validation would catch this scenario and fail fast rather than uploading corrupt data. Cost: ~5ms for fsync + <1ms for stat - negligible for sync operations. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
WalkthroughAdds a pre-compression fsync step that opens the DB file read-write and calls Sync when modifications exist, plus additional file-size/stat validation after file-copy operations to detect truncation before finalizing C1Z save. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Repository UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
Comment |
Sync() on a read-only file descriptor fails on Windows because FlushFileBuffers requires write access. Changed from O_RDONLY to O_RDWR so the fsync works correctly on both Linux and Windows. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements two defensive fixes to prevent c1z file corruption observed in production where compressed files had truncated zstd streams missing end-of-stream markers.
Key changes:
- Added explicit fsync after SQLite database close but before compression to ensure data is on disk
- Added size validation after file copy operations to detect truncated transfers
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| pkg/dotc1z/c1file.go | Adds fsync call on database file after SQLite Close() and before compression to ensure all data is written to disk, particularly important on filesystems with aggressive caching |
| pkg/dotc1z/manager/local/local.go | Adds size validation by comparing bytes copied via io.Copy to source file size to detect and fail on truncated transfers before saving corrupt files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
Two defensive fixes to prevent corrupt c1z files from being saved/uploaded:
Close()but beforesaveC1z()reads itProblem
Production incidents showed c1z corruption where:
SaveC1Zlogged success with byte count"corrupt stream, did not find end of stream"The zstd end-of-stream marker is written by
zstd.Close(). For it to be missing, truncation must occur during or after compression - either insaveC1z()itself or during the copy inSaveC1Z().Root Cause Analysis
Why fsync helps (defense in depth)
On filesystems with aggressive caching (like ZFS with ARC), data written by sqlite during
Close()may still be in kernel buffers. Without explicit fsync,saveC1z()could theoretically read stale/incomplete data.Why size validation catches the actual symptom
The observed corruption was a truncated zstd stream. If
io.CopyinSaveC1Z()returns fewer bytes than the source file size, we now fail immediately rather than saving a corrupt file.Changes
pkg/dotc1z/c1file.gopkg/dotc1z/manager/local/local.goCost
Total: <15ms - negligible for sync operations that take minutes.
Test plan
go build ./pkg/dotc1z/...passesgo test ./pkg/dotc1z/...passes (113s, all tests pass)🤖 Generated with Claude Code
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.