feat: add --time flag to status command, storage tier display, and multi-provider improvements #6
Closed
Ignore local example configuration file to prevent accidental commits of user-specific settings. 🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
- Separate AWS implementation into dedicated aws_client.py
- Add AzureBlobClient implementing S3Client interface for Azure Blob Storage
- Update s3client.py to contain only abstract base class and factory
- Add azure-storage-blob as optional dependency [azure]
- Map S3 concepts to Azure equivalents (buckets→containers, Glacier→Archive tier)
- Add GcpStorageClient implementing S3Client interface for GCS
- Map S3 concepts to GCS equivalents (Glacier→Archive storage class)
- Add google-cloud-storage as optional dependency [gcp]
- Update factory to support "gcp" provider
- All cloud clients now accept credentials as constructor parameters
- Credentials can be specified in config.yaml under 'storage' section
- Environment variables remain as fallback if config not provided
- Add load_storage_config() and get_storage_credentials() helpers
- Update s3_client_factory() to pass **kwargs to provider clients
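A minimal sketch of the "config first, environment as fallback" lookup described above. The function name `resolve_credentials`, the per-provider layout of the 'storage' section, and the credential key names are assumptions made for illustration; they are not taken from the PR's `get_storage_credentials()` helper.

```python
import os

# Assumed mapping of credential keys to the environment variables used as fallback.
ENV_FALLBACKS = {
    "aws": {
        "access_key_id": "AWS_ACCESS_KEY_ID",
        "secret_access_key": "AWS_SECRET_ACCESS_KEY",
    },
    "azure": {"connection_string": "AZURE_STORAGE_CONNECTION_STRING"},
    "gcp": {"credentials_file": "GOOGLE_APPLICATION_CREDENTIALS"},
}


def resolve_credentials(provider, storage_config, environ=None):
    """Prefer values from the config's storage section, else fall back to env vars."""
    environ = os.environ if environ is None else environ
    section = (storage_config or {}).get(provider) or {}
    resolved = {}
    for key, env_name in ENV_FALLBACKS.get(provider, {}).items():
        value = section.get(key) or environ.get(env_name)
        if value:
            resolved[key] = value
    return resolved
```

The resulting dict could then be passed as keyword arguments to a provider client, which is roughly what forwarding `**kwargs` through the factory allows.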
- Enable azure and gcp providers in CLI --provider option
- Update all READMEs with multi-provider documentation
- Update test mocks to use new aws_client module location
- Add tests for azure and gcp client factory methods
- Update constants test to expect all three providers

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Change max_results to results_per_page for azure-storage-blob 12.x compatibility.
Azure Blob Storage containers don't allow underscores in names. When using --provider azure, the CLI now detects underscores in bucket_name_prefix, repo_name_prefix, and base_path_prefix, and prompts the user to confirm converting them to hyphens before proceeding.
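The underscore check could look roughly like the sketch below. The three setting names come from the commit message; the function names and the `confirm` callback (standing in for the interactive CLI prompt) are hypothetical.

```python
# Settings the commit message says are checked when --provider azure is used.
NAME_SETTINGS = ("bucket_name_prefix", "repo_name_prefix", "base_path_prefix")


def find_underscore_settings(settings):
    """Return the name settings whose values contain an underscore."""
    return {
        key: settings[key]
        for key in NAME_SETTINGS
        if "_" in (settings.get(key) or "")
    }


def convert_for_azure(settings, confirm):
    """Replace underscores with hyphens after the user confirms.

    `confirm` is a hypothetical callback that receives the offending
    settings and returns True to proceed.
    """
    offending = find_underscore_settings(settings)
    if not offending:
        return dict(settings)
    if not confirm(offending):
        raise ValueError("Azure container names cannot contain underscores")
    converted = dict(settings)
    for key, value in offending.items():
        converted[key] = value.replace("_", "-")
    return converted
```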
- Update repository creation error to show provider-specific solutions
- Include plugin installation commands for AWS/Azure/GCP
- Add links to Elastic documentation for each provider's repository plugin
- Update preconditions check to verify correct plugin based on provider
- Update storage bucket creation errors to be provider-aware
- Add ES_PLUGIN_NAME, ES_PLUGIN_DOC_URL, STORAGE_TYPE, etc. class attributes to AwsS3Client, AzureBlobClient, and GcpStorageClient
- Update setup action to read error help text from client class
- Add class attributes to MockS3Client for test compatibility

This keeps provider-specific information in the provider modules rather than hardcoding it in the setup action.
- Remove non-existent "deepfreeze DELETE index" command from help text
- Add STORAGE_DELETE_CMD to each client class with provider-specific deletion commands (aws s3 rb, az storage container delete, gcloud storage rm)
- Update bucket exists error to show correct provider-specific command
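To illustrate the idea of reading help text from class attributes, here is a small sketch. The attribute names (`ES_PLUGIN_NAME`, `STORAGE_TYPE`, `STORAGE_DELETE_CMD`) come from the commit messages, but the classes are stand-ins rather than the real clients, and the attribute values and message wording are assumptions.

```python
class AwsSketch:
    # Stand-in for the AWS client; values are assumed, not taken from the PR.
    ES_PLUGIN_NAME = "repository-s3"
    STORAGE_TYPE = "S3 bucket"
    STORAGE_DELETE_CMD = "aws s3 rb s3://{name}"


class AzureSketch:
    ES_PLUGIN_NAME = "repository-azure"
    STORAGE_TYPE = "Azure container"
    STORAGE_DELETE_CMD = "az storage container delete --name {name}"


class GcpSketch:
    ES_PLUGIN_NAME = "repository-gcs"
    STORAGE_TYPE = "GCS bucket"
    STORAGE_DELETE_CMD = "gcloud storage rm --recursive gs://{name}"


def bucket_exists_help(client_cls, name):
    """Build a provider-aware hint without branching on the provider name."""
    return (
        f"{client_cls.STORAGE_TYPE} '{name}' already exists. "
        f"To remove it: {client_cls.STORAGE_DELETE_CMD.format(name=name)}"
    )
```

Because the caller only touches attributes, adding another provider would mean adding a class rather than editing the setup action.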
Change "AWS credentials" to use the provider display name from the storage client (e.g., "Azure credentials" for Azure provider).
The create_repo function was hardcoded to create S3 repositories. Updated to support:

- AWS: type "s3" with bucket, base_path, canned_acl, storage_class
- Azure: type "azure" with container, base_path
- GCP: type "gcs" with bucket, base_path

This was the root cause of Azure repository creation failures.
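A sketch of the provider-to-repository mapping listed above. The repository types and setting keys follow the commit message; the function name `build_repo_body` and its signature are hypothetical.

```python
def build_repo_body(provider, storage_name, base_path,
                    canned_acl=None, storage_class=None):
    """Return a snapshot repository body for the given provider (sketch)."""
    if provider == "aws":
        settings = {"bucket": storage_name, "base_path": base_path}
        if canned_acl:
            settings["canned_acl"] = canned_acl
        if storage_class:
            settings["storage_class"] = storage_class
        return {"type": "s3", "settings": settings}
    if provider == "azure":
        # Azure repositories take "container" where S3 and GCS take "bucket".
        return {
            "type": "azure",
            "settings": {"container": storage_name, "base_path": base_path},
        }
    if provider == "gcp":
        return {
            "type": "gcs",
            "settings": {"bucket": storage_name, "base_path": base_path},
        }
    raise ValueError(f"Unsupported provider: {provider}")
```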
Add storage account name tracking to AzureBlobClient to help diagnose credential configuration issues. Error messages now show which account is being used when containers already exist or creation fails. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix mount_repo() to use provider from settings for correct repo type (s3/azure/gcs) and settings (bucket vs container)
- Update CLI defaults to validate all providers (aws, azure, gcp)
- Update rotate.py log messages to use provider-specific storage type
- Update constants.py comments to remove S3-specific terminology

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix push_to_glacier to use refreeze() which handles provider-specific tier changes correctly (direct tier set for Azure/GCP vs copy-in-place for AWS)
- Update all refreeze methods to skip objects already in target tier, avoiding errors when archiving already-archived blobs
- Add Storage Tier column to status display showing actual blob tier (Archive, Hot, Cool, Mixed) to verify archiving is working

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
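The skip-if-already-archived rule and the tier summary shown in the status column can be sketched as two small helpers. The object shape (a dict with a `tier` key), the function names, and the `"N/A"` placeholder for an empty repository are assumptions.

```python
def objects_needing_tier_change(objects, target_tier):
    """Skip objects that already sit in the target tier."""
    return [obj for obj in objects if obj["tier"] != target_tier]


def summarize_tier(objects):
    """Collapse per-object tiers into one display value for a status table."""
    tiers = {obj["tier"] for obj in objects}
    if not tiers:
        return "N/A"  # assumed placeholder when there are no objects
    if len(tiers) == 1:
        return next(iter(tiers))
    return "Mixed"
```

A "Mixed" value after an archive run would suggest that some blobs were not moved.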
Add _cleanup_orphaned_policies() to rotate action that deletes ILM policies referencing repositories that have been unmounted. This runs automatically after archiving repos, so old versioned policies are cleaned up as part of the rotation process rather than requiring a separate cleanup run. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
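One way to find such policies is sketched below, assuming the input has the shape returned by the ILM get-lifecycle API (policy name mapped to an entry with `policy.phases`). The function name is hypothetical, and whether the real cleanup also checks that a policy is unused before deleting it is not stated here.

```python
def orphaned_policies(policies, mounted_repos):
    """Return names of policies whose searchable_snapshot repos are all unmounted."""
    mounted = set(mounted_repos)
    orphaned = []
    for name, entry in policies.items():
        phases = entry.get("policy", {}).get("phases", {})
        repos = {
            phase.get("actions", {})
            .get("searchable_snapshot", {})
            .get("snapshot_repository")
            for phase in phases.values()
        }
        repos.discard(None)  # phases without a searchable_snapshot action
        if repos and not (repos & mounted):
            orphaned.append(name)
    return sorted(orphaned)
```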
The unmount_repo function was hardcoded to look for "bucket" in repository settings, but Azure repos use "container" instead. This caused unmounting to fail for Azure repos, preventing archiving during rotation. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When blobs are moved to Archive tier, they become unreadable without rehydration. The date range update was happening during unmount (after archiving), so it failed because snapshot metadata couldn't be read. Now the date range is updated BEFORE pushing to archive tier, while the snapshot metadata is still readable. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two fixes for the archive/unmount logic:

1. Don't persist "unmounted" state if ES unmount actually failed (was causing mounted=Yes + thaw_state=frozen inconsistency)
2. Check for active searchable snapshot indices BEFORE archiving (prevents archiving blobs then failing to unmount, leaving repo in broken state where ES can't read archived blobs)

Repos with active indices are now skipped and will be retried on the next rotation after ILM deletes those indices.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
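The ordering of these checks can be sketched with injected callables. Everything here (function name, callbacks, return strings) is hypothetical and only meant to show the control flow implied by the two fixes.

```python
def archive_repo(repo, has_active_indices, push_to_archive, unmount, persist_unmounted):
    """Sketch of the archive/unmount ordering; all callables are hypothetical."""
    # Fix 2: check for active searchable snapshot indices before touching storage.
    if has_active_indices(repo):
        return "skipped"
    push_to_archive(repo)
    # Fix 1: only record the unmounted state if the ES unmount succeeded.
    if not unmount(repo):
        return "unmount_failed"
    persist_unmounted(repo)
    return "archived"
```

A repo that returns "skipped" is left untouched, so a later rotation can retry it once ILM has deleted the remaining indices.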
Add _update_date_ranges method to RepairMetadata action that computes and persists date ranges for mounted repositories with missing dates.

The date range for a repository is computed by querying @timestamp values from mounted searchable snapshot indices. This can only be done for mounted repos since unmounted repos cannot be queried.

Changes:

- Add _update_date_ranges() method to compute missing date ranges
- Update do_dry_run() to report missing date ranges
- Update do_action() to repair date ranges along with state discrepancies
- Update test to expect get_all_repos called twice (state + date ranges)

This fixes an issue where date ranges were missing from status display because update_repository_date_range was only called during archiving. Now users can run 'deepfreeze repair-metadata' to populate dates for mounted repos that predate the date range tracking feature.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The repair-metadata command was incorrectly flagging active repos in hot storage as "thawed" because both states use instant-access storage. The distinction between active (never archived) and thawed (restored from archive) is semantic, not storage-based.

Fix the discrepancy detection to only flag actual contradictions:

- active/thawed with archive storage → should be frozen
- frozen with instant-access storage → should be thawed
- thawing with all-archive storage → should be frozen
- thawing with all-accessible storage → should be thawed

This prevents false positives where active repos are incorrectly identified as needing repair.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
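The four contradiction rules can be written as a small decision function. This is a sketch: the function name, the `"archive"`/`"instant"` labels, and the choice to treat "with archive storage" as "all objects in archive storage" are assumptions.

```python
def expected_state(recorded_state, storage_kinds):
    """Return the corrected state, or None when there is no contradiction.

    `storage_kinds` is the set of storage categories observed for the
    repository's objects, using the assumed labels "archive" and "instant".
    """
    all_archive = storage_kinds == {"archive"}
    all_instant = storage_kinds == {"instant"}
    if recorded_state in ("active", "thawed") and all_archive:
        return "frozen"
    if recorded_state == "frozen" and all_instant:
        return "thawed"
    if recorded_state == "thawing":
        if all_archive:
            return "frozen"
        if all_instant:
            return "thawed"
    # active or thawed on instant-access storage is consistent: nothing to repair.
    return None
```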
Add logging before each date range update attempt so we can see which repo the command is processing. Also verify that the repository actually exists in Elasticsearch before trying to query its snapshots, to avoid hanging on repos that are marked as mounted in the status index but aren't actually present in ES. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…swallowing Separate GET and PUT operations into distinct try blocks in update_index_template_ilm_policy() for both composable and legacy template paths. Previously, a PUT failure after a successful GET was caught by a generic except handler, logged at DEBUG level, and silently swallowed — causing the code to fall through and report the template as "not found" when it was actually found but failed to update. 🤖 Commit generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
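The difference between "not found" and "found but failed to update" can be kept visible by giving each operation its own try block, as in the sketch below. The callables, the `LookupError` convention for a missing template, and the `TemplateUpdateError` class are hypothetical stand-ins for the real client calls and exception type.

```python
class TemplateUpdateError(Exception):
    """Hypothetical error raised when a template exists but the PUT fails."""


def set_template_policy(get_template, put_template, name, policy_name):
    """Return False if the template is missing; raise if the update fails."""
    try:
        body = get_template(name)
    except LookupError:
        # Only a failed GET means "not found".
        return False

    settings = body.setdefault("template", {}).setdefault("settings", {})
    settings["index.lifecycle.name"] = policy_name

    try:
        put_template(name, body)
    except Exception as exc:
        # A failed PUT is surfaced instead of being logged and swallowed.
        raise TemplateUpdateError(f"Failed to update template {name}") from exc
    return True
```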
Elasticsearch returns created_date in get_index_template responses but rejects it on put_index_template. Strip this field before both PUT call sites in update_index_template_ilm_policy and update_template_ilm_policy. 🤖 Commit generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Replace blacklist approach (popping created_date) with a whitelist of fields accepted by put_index_template. This ensures no system-managed metadata from the GET response leaks into the PUT body. 🤖 Commit generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
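A whitelist filter of this kind might look like the following. The field list is an assumption based on the documented request body of the put index template API and may need adjusting for a given Elasticsearch version; the function name is hypothetical.

```python
# Assumed set of body fields accepted by put_index_template.
PUT_INDEX_TEMPLATE_FIELDS = frozenset({
    "index_patterns",
    "template",
    "composed_of",
    "priority",
    "version",
    "_meta",
    "data_stream",
    "allow_auto_create",
    "ignore_missing_component_templates",
    "deprecated",
})


def put_body_from_get(index_template):
    """Keep only writable fields from a GET response's index_template object."""
    return {
        key: value
        for key, value in index_template.items()
        if key in PUT_INDEX_TEMPLATE_FIELDS
    }
```

Compared with popping `created_date`, this also drops any other system-managed field that a future server version might start returning.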
Date ranges were never populated because the only update call lived inside _archive_old_repos, which runs after searchable snapshot indices are already gone. Move date range capture to a dedicated _update_date_ranges() step that runs before archiving, while indices are still queryable.

- Add _update_date_ranges() method to Rotate class
- Call it in do_action() between ILM policy update and archive step
- Remove now-redundant update_repository_date_range call from _archive_old_repos()
- Remove unused get_repository import
- Add test verifying _update_date_ranges behaviour

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
When querying indices with no documents or no @timestamp field, Elasticsearch returns {"value": null} without a value_as_string key. This was causing an uncaught KeyError that silently propagated up to update_repository_date_range, making it impossible to distinguish between "no indices found" and "indices found but no @timestamp data".

- Add early return if all indices filtered out (empty index list)
- Use .get() instead of dict access for value_as_string
- Add explicit null check with debug logging explaining why
- Prevents KeyError from masking the real cause of empty date ranges

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
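The null-safe extraction can be sketched as follows. The aggregation names `earliest` and `latest` and the function name are assumptions; the `{"value": null}` response shape is the one described above.

```python
import logging

logger = logging.getLogger(__name__)


def extract_timestamp_range(aggregations):
    """Return (earliest, latest) strings, or None when no @timestamp data exists."""
    # .get() avoids a KeyError when the response holds {"value": None} only.
    earliest = aggregations.get("earliest", {}).get("value_as_string")
    latest = aggregations.get("latest", {}).get("value_as_string")
    if earliest is None or latest is None:
        logger.debug("Indices found, but no @timestamp data in them")
        return None
    return earliest, latest
```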
…d indices ILM force-merge creates snapshots with fm-clone-<random>- prefix (e.g., fm-clone-i6je-.ds-df-test-2026.01.28-000001) but when those snapshots are mounted as searchable snapshots, ES strips the prefix (restored-.ds-df-test-2026.01.28-000001). This mismatch prevented update_repository_date_range from finding the mounted indices, resulting in empty date ranges. Strip the fm-clone-xxxx- prefix from snapshot index names before attempting to match them to mounted indices. This allows the code to correctly find restored- indices and query their @timestamp ranges. Co-Authored-By: Claude <noreply@anthropic.com> 🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
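The prefix stripping can be done with a single anchored regular expression, sketched here. The function names and the `restored-` default are illustrative; the index names used in the example are the ones quoted above.

```python
import re

# "fm-clone-" followed by one random token and a hyphen, anchored at the start.
FM_CLONE_PREFIX = re.compile(r"^fm-clone-[^-]+-")


def strip_fm_clone_prefix(index_name):
    """Remove a leading fm-clone-<random>- prefix if present."""
    return FM_CLONE_PREFIX.sub("", index_name, count=1)


def mounted_index_name(snapshot_index_name, mount_prefix="restored-"):
    """Name to look for among mounted searchable snapshot indices."""
    return mount_prefix + strip_fm_clone_prefix(snapshot_index_name)
```

With this, `mounted_index_name("fm-clone-i6je-.ds-df-test-2026.01.28-000001")` gives `restored-.ds-df-test-2026.01.28-000001`, and names without the prefix pass through unchanged.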
The Phase column showed which ILM phase contained the searchable_snapshot action at the time each versioned policy was created. This was configuration archaeology that provided no actionable information for operating or monitoring the system. Remove the column and associated phase_name tracking, keeping only the useful operational data: policy name, repository, and usage counts. Co-Authored-By: Claude <noreply@anthropic.com> 🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
Show full date+time in status tables instead of date-only.

Usage: deepfreeze status --time (or -rt)

- Added -rt/--time click option to CLI status command
- Added show_time parameter to Status class
- Added _format_date_value() and _format_created_value() helpers
- Date ranges show full ISO datetime when flag is set
- Created timestamp shows full datetime when flag is set
- Added tests for both date-only and time display modes
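A minimal sketch of the date-only versus full-datetime formatting, assuming the stored values are ISO 8601 strings. This is not the PR's `_format_date_value()` helper; the function name, the `"N/A"` placeholder, and the seconds precision are assumptions.

```python
from datetime import datetime


def format_date_value(value, show_time=False):
    """Format an ISO 8601 string as date-only, or as full datetime with show_time."""
    if not value:
        return "N/A"  # assumed placeholder for missing dates
    # Older Python versions do not parse a trailing "Z", so normalise it first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if show_time:
        return parsed.isoformat(timespec="seconds")
    return parsed.date().isoformat()
```

The `show_time` argument would correspond to the value of the `--time`/`-rt` flag passed through from the CLI.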
Nope, this needs to be merged to dev, not pulled to main.
Summary

Adds a `--time`/`-rt` flag to the `status` command to show full date+time in tables (instead of date-only), and includes several improvements to multi-cloud provider support and bug fixes.

Changes

Status Command Enhancements

- `--time`/`-rt` flag: show full date+time (ISO) in status tables; defaults to date-only for cleaner output

Multi-Provider Support (Azure & GCP)

- … `config.yml`)
- … `list_containers`

Bug Fixes

- … `fm-clone` prefix when matching snapshot indices to mounted indices
- … `get_timestamp_range`
- … `created_date` before PUT on index templates
- … `ActionError` on template PUT failure instead of silently swallowing
- … `unmount_repo` to handle Azure container settings

Other

- … `repair-metadata` command