fix: handle missing metadata files gracefully in Iceberg analysis #8
Merged
danielbeach merged 1 commit into danielbeach:main on Oct 20, 2025
Add resilient error handling for NoSuchKey errors when reading metadata files during:
- Schema evolution analysis
- Time travel metrics
- Table constraints analysis
- File compaction Z-order opportunity detection

This is common in large, actively updated Iceberg tables where old metadata files are cleaned up while the table is still being queried. The analysis now continues when metadata files are missing, and adds a warning to the health report about incomplete sections while still providing all basic metrics.

Fixes race condition where metadata files listed at the start of analysis are deleted/moved before being read.
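The race described above can be reproduced in a minimal sketch using only the standard library. Everything here is hypothetical and for illustration only: `Store` is an in-memory stand-in for the bucket, `ReadError` stands in for the object-store client's error type, and the metadata file names are made up. None of it is taken from `src/iceberg.rs`.

```rust
use std::collections::HashMap;

/// Hypothetical error type standing in for an object-store client's errors.
#[derive(Debug, PartialEq)]
enum ReadError {
    NoSuchKey(String),
}

/// In-memory stand-in for a bucket: key -> file contents.
struct Store {
    objects: HashMap<String, Vec<u8>>,
}

impl Store {
    /// Snapshot of the keys that exist right now, sorted for determinism.
    fn list_keys(&self) -> Vec<String> {
        let mut keys: Vec<String> = self.objects.keys().cloned().collect();
        keys.sort();
        keys
    }

    /// Read one object; fails with NoSuchKey if it has been removed.
    fn read(&self, key: &str) -> Result<&[u8], ReadError> {
        self.objects
            .get(key)
            .map(|bytes| bytes.as_slice())
            .ok_or_else(|| ReadError::NoSuchKey(key.to_string()))
    }
}

/// Read every previously listed key and return (keys read, keys missing).
/// A missing key is recorded instead of aborting the whole pass.
fn read_listed(store: &Store, listed: &[String]) -> (Vec<String>, Vec<String>) {
    let mut read = Vec::new();
    let mut missing = Vec::new();
    for key in listed {
        match store.read(key) {
            Ok(_bytes) => read.push(key.clone()),
            Err(ReadError::NoSuchKey(k)) => missing.push(k),
        }
    }
    (read, missing)
}

/// List first, delete one file (simulating cleanup), then read the stale listing.
fn simulate_race() -> (Vec<String>, Vec<String>) {
    let mut store = Store { objects: HashMap::new() };
    // Made-up file names, used only for this illustration.
    store.objects.insert("metadata/v1.metadata.json".to_string(), b"{}".to_vec());
    store.objects.insert("metadata/v2.metadata.json".to_string(), b"{}".to_vec());

    let listed = store.list_keys(); // listing taken at the start of analysis
    store.objects.remove("metadata/v1.metadata.json"); // concurrent cleanup
    read_listed(&store, &listed)
}
```

Calling `simulate_race()` yields one key that was read and one that was listed but had already disappeared, which is the situation the analysis needs to tolerate.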
danielbeach approved these changes on Oct 19, 2025

danielbeach (Owner) left a comment:
Many thanks for your contribution.

Author (Collaborator):
Would you mind adding a "hacktoberfest-accepted" label or a "hacktoberfest" topic to the repository before merge? Thanks :)

Owner:
Done
Fix: Handle Missing Metadata Files Gracefully in Iceberg Analysis
This fixes #7
Problem
When analyzing large, actively updated Iceberg tables, the analysis would fail with a generic `RuntimeError: Iceberg analysis failed: service error`. This error masked the underlying issue: AWS S3 `NoSuchKey` errors occurring when trying to read metadata files during various analysis phases.
Error Details
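The error occurred during:
- Schema evolution analysis
- Time travel metrics
- Table constraints analysis
- File compaction Z-order opportunity detection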
Root Cause
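This is a race condition common in large, actively updated Iceberg tables: metadata files listed at the start of the analysis are deleted or moved before they are read, so the read fails with a `NoSuchKey` error, causing the entire analysis to fail.
Solution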
Modified the Iceberg analysis to be resilient to missing metadata files:
Changes Made
- Graceful Error Handling: Updated the `analyze_schema_evolution`, `analyze_time_travel`, `analyze_table_constraints`, and `analyze_iceberg_z_order_opportunity` methods to handle `NoSuchKey` errors.
- Skip Missing Files: Instead of failing the entire analysis, the code now skips metadata files that are missing and continues.
- User-Friendly Warning: Added an informative warning about incomplete sections to the health report recommendations.
- Partial Reports: The analysis now returns a report with all basic metrics, even when some sections are incomplete.
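As a rough illustration of the pattern these changes describe, the sketch below uses `match` to skip `NoSuchKey` errors, propagate any other error, and record which report section ended up incomplete. It is not the code from `src/iceberg.rs`: `FetchError`, `HealthReport`, its fields, the `_sketch` function names, and the warning wording are all assumptions made up for this example.

```rust
/// Hypothetical error type; the project's real error types are not shown in this PR text.
#[derive(Debug, PartialEq)]
enum FetchError {
    NoSuchKey(String),
    Other(String),
}

/// Minimal stand-in for the health report (field names are assumptions).
#[derive(Debug, Default)]
struct HealthReport {
    schema_versions_read: usize,
    incomplete_sections: Vec<String>,
    recommendations: Vec<String>,
}

/// Count readable metadata files. Missing files are skipped and the section
/// is marked incomplete; any other error still aborts the analysis.
fn analyze_schema_evolution_sketch<F>(
    keys: &[&str],
    fetch: F,
    report: &mut HealthReport,
) -> Result<(), FetchError>
where
    F: Fn(&str) -> Result<String, FetchError>,
{
    let mut skipped = 0;
    for &key in keys {
        match fetch(key) {
            Ok(_contents) => report.schema_versions_read += 1,
            Err(FetchError::NoSuchKey(_)) => skipped += 1,
            Err(other) => return Err(other),
        }
    }
    if skipped > 0 {
        report.incomplete_sections.push("schema evolution".to_string());
    }
    Ok(())
}

/// Add a warning to the recommendations when any section is incomplete.
fn generate_recommendations_sketch(report: &mut HealthReport) {
    if !report.incomplete_sections.is_empty() {
        report.recommendations.push(format!(
            "Warning: some metadata files were missing; incomplete sections: {}",
            report.incomplete_sections.join(", ")
        ));
    }
}
```

Matching on the specific variant keeps the tolerance narrow: only the "file disappeared" case is downgraded to a warning, while errors such as a permissions failure would still surface to the caller.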
Code Changes
src/iceberg.rs
- `analyze_schema_evolution()`: use `match` statements for error handling
- `analyze_time_travel()`: skip missing metadata files
- `analyze_table_constraints()`: continue on errors
- `analyze_iceberg_z_order_opportunity()`: handle missing files
- `generate_recommendations()`: warn about incomplete sections
Testing
Tested On
Results
✅ Before Fix: Analysis failed with `RuntimeError: Iceberg analysis failed: service error`
✅ After Fix: Analysis completes successfully, with a warning in the health report about incomplete sections
Test Output (masked confidential info)
Impact
User Benefits
Backward Compatibility
Related Issues
This fix addresses a common scenario in production environments where old metadata files are cleaned up while the table is still being queried.
Checklist
- `cargo test`
- `pytest`