feat: add --time flag to status command, storage tier display, and multi-provider improvements by wortmanb · Pull Request #6 · elastic/deepfreeze

wortmanb · 2026-01-28T22:46:15Z

Summary

Adds a --time / -rt flag to the status command to show full date+time in tables (instead of date-only), and includes several improvements to multi-cloud provider support and bug fixes.

Changes

Status Command Enhancements

--time / -rt flag — show full date+time (ISO) in status tables; defaults to date-only for cleaner output
Storage Tier column — repositories table now shows the storage tier (Hot/Cool/Archive/Mixed) by sampling blob metadata
Removed Phase column from ILM policies table (redundant)

Multi-Provider Support (Azure & GCP)

Add Azure Blob Storage and Google Cloud Storage clients alongside existing AWS/S3
Config file support for storage provider credentials (config.yml)
Provider-specific error messages throughout the codebase
Azure container name validation (prompts to convert underscores to hyphens during setup)
Correct Azure SDK parameter usage for list_containers

Bug Fixes

Strip fm-clone prefix when matching snapshot indices to mounted indices
Handle null aggregation results in get_timestamp_range
Capture date ranges for all mounted repos during rotation
Whitelist PUT fields for composable template updates
Strip system-managed created_date before PUT on index templates
Raise ActionError on template PUT failure instead of silently swallowing
Check for active indices before archiving repos
Update date range before archiving to archive tier
Fix unmount_repo to handle Azure container settings
Don't treat active repos as thawed based on storage class

Other

Add date range repair to repair-metadata command
Delete orphaned ILM policies during rotation
Logging and ES verification for date range updates

Ignore local example configuration file to prevent accidental commits of user-specific settings. 🤖 Commit generated with [Claude Code](https://claude.com/claude-code)

- Separate AWS implementation into dedicated aws_client.py
- Add AzureBlobClient implementing S3Client interface for Azure Blob Storage
- Update s3client.py to contain only abstract base class and factory
- Add azure-storage-blob as optional dependency [azure]
- Map S3 concepts to Azure equivalents (buckets→containers, Glacier→Archive tier)

- Add GcpStorageClient implementing S3Client interface for GCS
- Map S3 concepts to GCS equivalents (Glacier→Archive storage class)
- Add google-cloud-storage as optional dependency [gcp]
- Update factory to support "gcp" provider

- All cloud clients now accept credentials as constructor parameters
- Credentials can be specified in config.yaml under 'storage' section
- Environment variables remain as fallback if config not provided
- Add load_storage_config() and get_storage_credentials() helpers
- Update s3_client_factory() to pass **kwargs to provider clients
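The precedence described above (config file first, environment as fallback) can be sketched as follows. This is an illustration only, not the PR's get_storage_credentials(): the function name resolve_credential, the key name, and the environment variable used in the usage line are assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_credential(
    storage_cfg: Optional[Mapping[str, str]],
    key: str,
    env_var: str,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the credential from the 'storage' config section, else from the environment.

    `storage_cfg` stands in for the parsed 'storage' section of the config file;
    the key and variable names are hypothetical.
    """
    if storage_cfg and storage_cfg.get(key):
        return storage_cfg[key]
    # Fall back to the environment when the config does not provide a value.
    return env.get(env_var)


# Usage (hypothetical key and variable names):
# resolve_credential(cfg.get("storage"), "connection_string", "AZURE_STORAGE_CONNECTION_STRING")
```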

- Enable azure and gcp providers in CLI --provider option
- Update all READMEs with multi-provider documentation
- Update test mocks to use new aws_client module location
- Add tests for azure and gcp client factory methods
- Update constants test to expect all three providers
🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Change max_results to results_per_page for azure-storage-blob 12.x compatibility.

Azure Blob Storage containers don't allow underscores in names. When using --provider azure, the CLI now detects underscores in bucket_name_prefix, repo_name_prefix, and base_path_prefix, and prompts the user to confirm converting them to hyphens before proceeding.
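A minimal sketch of the detection step, assuming the three prefixes arrive as a plain dict of strings. The helper name propose_hyphenated is hypothetical, and the confirmation prompt itself is left to the CLI.

```python
def propose_hyphenated(settings: dict) -> dict:
    """Return {setting_name: hyphenated_value} for every value containing an underscore.

    An empty result means nothing needs converting, so no prompt is required.
    """
    return {
        name: value.replace("_", "-")
        for name, value in settings.items()
        if "_" in value
    }


proposed = propose_hyphenated(
    {
        "bucket_name_prefix": "deep_freeze",
        "repo_name_prefix": "deepfreeze",
        "base_path_prefix": "snap_shots",
    }
)
# Only the two settings with underscores are reported.
```

The caller would show the proposed values and apply them only after the user confirms.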

- Update repository creation error to show provider-specific solutions
- Include plugin installation commands for AWS/Azure/GCP
- Add links to Elastic documentation for each provider's repository plugin
- Update preconditions check to verify correct plugin based on provider
- Update storage bucket creation errors to be provider-aware

- Add ES_PLUGIN_NAME, ES_PLUGIN_DOC_URL, STORAGE_TYPE, etc. class attributes to AwsS3Client, AzureBlobClient, and GcpStorageClient
- Update setup action to read error help text from client class
- Add class attributes to MockS3Client for test compatibility
This keeps provider-specific information in the provider modules rather than hardcoding it in the setup action.

- Remove non-existent "deepfreeze DELETE index" command from help text
- Add STORAGE_DELETE_CMD to each client class with provider-specific deletion commands (aws s3 rb, az storage container delete, gcloud storage rm)
- Update bucket exists error to show correct provider-specific command

Change "AWS credentials" to use the provider display name from the storage client (e.g., "Azure credentials" for Azure provider).

The create_repo function was hardcoded to create S3 repositories. Updated to support:
- AWS: type "s3" with bucket, base_path, canned_acl, storage_class
- Azure: type "azure" with container, base_path
- GCP: type "gcs" with bucket, base_path
This was the root cause of Azure repository creation failures.
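The mapping above can be sketched as a function that builds the repository body per provider. This is an illustration under stated assumptions, not the PR's create_repo: the function name build_repository_body and its parameters are hypothetical, while the repository types and setting names are the ones listed in the commit message.

```python
from typing import Optional


def build_repository_body(
    provider: str,
    location: str,
    base_path: str,
    canned_acl: Optional[str] = None,
    storage_class: Optional[str] = None,
) -> dict:
    """Build a snapshot repository body for the given provider.

    `location` is the bucket name (AWS, GCP) or the container name (Azure).
    """
    if provider == "aws":
        settings = {"bucket": location, "base_path": base_path}
        # canned_acl and storage_class only apply to S3 repositories.
        if canned_acl:
            settings["canned_acl"] = canned_acl
        if storage_class:
            settings["storage_class"] = storage_class
        return {"type": "s3", "settings": settings}
    if provider == "azure":
        return {"type": "azure", "settings": {"container": location, "base_path": base_path}}
    if provider == "gcp":
        return {"type": "gcs", "settings": {"bucket": location, "base_path": base_path}}
    raise ValueError(f"Unsupported provider: {provider}")
```

Raising on an unknown provider, rather than defaulting to S3, avoids repeating the original failure mode where every repository was silently created as type "s3".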

Add storage account name tracking to AzureBlobClient to help diagnose credential configuration issues. Error messages now show which account is being used when containers already exist or creation fails. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Fix mount_repo() to use provider from settings for correct repo type (s3/azure/gcs) and settings (bucket vs container)
- Update CLI defaults to validate all providers (aws, azure, gcp)
- Update rotate.py log messages to use provider-specific storage type
- Update constants.py comments to remove S3-specific terminology
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Fix push_to_glacier to use refreeze() which handles provider-specific tier changes correctly (direct tier set for Azure/GCP vs copy-in-place for AWS)
- Update all refreeze methods to skip objects already in target tier, avoiding errors when archiving already-archived blobs
- Add Storage Tier column to status display showing actual blob tier (Archive, Hot, Cool, Mixed) to verify archiving is working
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
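One way the Storage Tier value could be derived from a sample of blob tiers is sketched below. The function name summarize_tiers and the "Unknown" fallback for an empty sample are assumptions; the PR does not state what is shown in that case.

```python
from typing import Iterable, Optional


def summarize_tiers(sampled_tiers: Iterable[Optional[str]]) -> str:
    """Collapse sampled blob tiers into a single display value.

    A single distinct tier is shown as-is; more than one becomes "Mixed".
    """
    distinct = {tier for tier in sampled_tiers if tier}
    if not distinct:
        return "Unknown"  # assumption: nothing could be sampled
    if len(distinct) == 1:
        return next(iter(distinct))
    return "Mixed"
```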

Add _cleanup_orphaned_policies() to rotate action that deletes ILM policies referencing repositories that have been unmounted. This runs automatically after archiving repos, so old versioned policies are cleaned up as part of the rotation process rather than requiring a separate cleanup run. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The unmount_repo function was hardcoded to look for "bucket" in repository settings, but Azure repos use "container" instead. This caused unmounting to fail for Azure repos, preventing archiving during rotation. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When blobs are moved to Archive tier, they become unreadable without rehydration. The date range update was happening during unmount (after archiving), so it failed because snapshot metadata couldn't be read. Now the date range is updated BEFORE pushing to archive tier, while the snapshot metadata is still readable. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Two fixes for the archive/unmount logic:
1. Don't persist "unmounted" state if ES unmount actually failed (was causing mounted=Yes + thaw_state=frozen inconsistency)
2. Check for active searchable snapshot indices BEFORE archiving (prevents archiving blobs then failing to unmount, leaving repo in broken state where ES can't read archived blobs)
Repos with active indices are now skipped and will be retried on the next rotation after ILM deletes those indices.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add _update_date_ranges method to RepairMetadata action that computes and persists date ranges for mounted repositories with missing dates. The date range for a repository is computed by querying @timestamp values from mounted searchable snapshot indices. This can only be done for mounted repos since unmounted repos cannot be queried.
Changes:
- Add _update_date_ranges() method to compute missing date ranges
- Update do_dry_run() to report missing date ranges
- Update do_action() to repair date ranges along with state discrepancies
- Update test to expect get_all_repos called twice (state + date ranges)
This fixes an issue where date ranges were missing from status display because update_repository_date_range was only called during archiving. Now users can run 'deepfreeze repair-metadata' to populate dates for mounted repos that predate the date range tracking feature.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The repair-metadata command was incorrectly flagging active repos in hot storage as "thawed" because both states use instant-access storage. The distinction between active (never archived) and thawed (restored from archive) is semantic, not storage-based. Fix the discrepancy detection to only flag actual contradictions:
- active/thawed with archive storage → should be frozen
- frozen with instant-access storage → should be thawed
- thawing with all-archive storage → should be frozen
- thawing with all-accessible storage → should be thawed
This prevents false positives where active repos are incorrectly identified as needing repair.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
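The four rules above can be written as a small decision function. This is a sketch, not the PR's implementation: the name expected_state and the two boolean inputs are hypothetical, and it assumes "archive storage" and "instant-access storage" mean that all sampled objects are in that class.

```python
from typing import Optional


def expected_state(recorded: str, all_archive: bool, all_instant: bool) -> Optional[str]:
    """Return the corrected state if recorded state contradicts storage, else None."""
    if recorded in ("active", "thawed") and all_archive:
        return "frozen"
    if recorded == "frozen" and all_instant:
        return "thawed"
    if recorded == "thawing":
        if all_archive:
            return "frozen"
        if all_instant:
            return "thawed"
    # No contradiction: an active repo in instant-access storage is left alone.
    return None
```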

Add logging before each date range update attempt so we can see which repo the command is processing. Also verify that the repository actually exists in Elasticsearch before trying to query its snapshots, to avoid hanging on repos that are marked as mounted in the status index but aren't actually present in ES. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…swallowing
Separate GET and PUT operations into distinct try blocks in update_index_template_ilm_policy() for both composable and legacy template paths. Previously, a PUT failure after a successful GET was caught by a generic except handler, logged at DEBUG level, and silently swallowed, causing the code to fall through and report the template as "not found" when it was actually found but failed to update.
🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
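The control-flow change can be illustrated with a simplified stand-in. Everything here is hypothetical: the client is a plain object with get_template and put_template methods, KeyError stands in for "not found", and ActionError is a local placeholder for the project's exception of that name.

```python
class ActionError(Exception):
    """Local placeholder for the project's ActionError."""


def update_template_policy(client, name: str, policy: str) -> bool:
    """Return False only when the template is missing; raise if the update fails."""
    # First try block: only the lookup may report "not found".
    try:
        current = client.get_template(name)
    except KeyError:
        return False

    body = dict(current, policy=policy)

    # Second try block: a failed PUT is surfaced instead of being swallowed.
    try:
        client.put_template(name, body)
    except Exception as exc:
        raise ActionError(f"Failed to update template {name}: {exc}") from exc
    return True
```

Keeping the two operations in separate try blocks means a found-but-not-updated template can no longer be mistaken for a missing one.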

Elasticsearch returns created_date in get_index_template responses but rejects it on put_index_template. Strip this field before both PUT call sites in update_index_template_ilm_policy and update_template_ilm_policy. 🤖 Commit generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Replace blacklist approach (popping created_date) with a whitelist of fields accepted by put_index_template. This ensures no system-managed metadata from the GET response leaks into the PUT body. 🤖 Commit generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
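A sketch of the whitelist approach, assuming the GET response's inner index_template object is passed in as a dict. The helper name build_put_body is hypothetical, and the field tuple is illustrative and may not match the PR's list exactly.

```python
# Fields copied from the GET response into the PUT body. Illustrative list;
# the PR's actual whitelist may differ.
ALLOWED_PUT_FIELDS = (
    "index_patterns",
    "template",
    "composed_of",
    "priority",
    "version",
    "_meta",
    "data_stream",
    "allow_auto_create",
)


def build_put_body(index_template: dict) -> dict:
    """Keep only whitelisted fields, so system-managed metadata never reaches the PUT."""
    return {
        field: index_template[field]
        for field in ALLOWED_PUT_FIELDS
        if field in index_template
    }
```

Unlike popping created_date, this also drops any other system-managed field a future Elasticsearch version might add to the GET response.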

Date ranges were never populated because the only update call lived inside _archive_old_repos, which runs after searchable snapshot indices are already gone. Move date range capture to a dedicated _update_date_ranges() step that runs before archiving, while indices are still queryable.
- Add _update_date_ranges() method to Rotate class
- Call it in do_action() between ILM policy update and archive step
- Remove now-redundant update_repository_date_range call from _archive_old_repos()
- Remove unused get_repository import
- Add test verifying _update_date_ranges behaviour
Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Commit generated with [Claude Code](https://claude.com/claude-code)

When querying indices with no documents or no @timestamp field, Elasticsearch returns {"value": null} without a value_as_string key. This was causing an uncaught KeyError that silently propagated up to update_repository_date_range, making it impossible to distinguish between "no indices found" and "indices found but no @timestamp data".
- Add early return if all indices filtered out (empty index list)
- Use .get() instead of dict access for value_as_string
- Add explicit null check with debug logging explaining why
- Prevents KeyError from masking the real cause of empty date ranges
Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
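The null handling can be sketched as below, assuming a min/max aggregation over @timestamp. The aggregation names "earliest" and "latest" and the function name extract_date_range are hypothetical, not taken from get_timestamp_range.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger(__name__)


def extract_date_range(aggregations: dict) -> Optional[Tuple[str, str]]:
    """Return (earliest, latest) as strings, or None when the aggregation is null."""
    # .get() avoids a KeyError when Elasticsearch returns {"value": null}
    # without a value_as_string key.
    earliest = aggregations.get("earliest", {}).get("value_as_string")
    latest = aggregations.get("latest", {}).get("value_as_string")
    if earliest is None or latest is None:
        logger.debug("Indices were found but contain no @timestamp data")
        return None
    return earliest, latest
```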

…d indices
ILM force-merge creates snapshots with fm-clone-<random>- prefix (e.g., fm-clone-i6je-.ds-df-test-2026.01.28-000001) but when those snapshots are mounted as searchable snapshots, ES strips the prefix (restored-.ds-df-test-2026.01.28-000001). This mismatch prevented update_repository_date_range from finding the mounted indices, resulting in empty date ranges. Strip the fm-clone-xxxx- prefix from snapshot index names before attempting to match them to mounted indices. This allows the code to correctly find restored- indices and query their @timestamp ranges.
Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
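A minimal sketch of the prefix stripping, using the index name from the commit message. It assumes the random token contains no hyphens; the function name strip_fm_clone_prefix and the exact pattern are illustrative rather than the PR's code.

```python
import re

# Assumption: the token after "fm-clone-" is a run of non-hyphen characters.
_FM_CLONE_PREFIX = re.compile(r"^fm-clone-[^-]+-")


def strip_fm_clone_prefix(index_name: str) -> str:
    """Remove a leading fm-clone-<random>- prefix; other names pass through unchanged."""
    return _FM_CLONE_PREFIX.sub("", index_name, count=1)


snapshot_index = "fm-clone-i6je-.ds-df-test-2026.01.28-000001"
mounted_name = "restored-" + strip_fm_clone_prefix(snapshot_index)
# mounted_name == "restored-.ds-df-test-2026.01.28-000001"
```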

The Phase column showed which ILM phase contained the searchable_snapshot action at the time each versioned policy was created. This was configuration archaeology that provided no actionable information for operating or monitoring the system. Remove the column and associated phase_name tracking, keeping only the useful operational data: policy name, repository, and usage counts. Co-Authored-By: Claude <noreply@anthropic.com> 🤖 Commit generated with [Claude Code](https://claude.com/claude-code)

Show full date+time in status tables instead of date-only.
Usage: deepfreeze status --time (or -rt)
- Added -rt/--time click option to CLI status command
- Added show_time parameter to Status class
- Added _format_date_value() and _format_created_value() helpers
- Date ranges show full ISO datetime when flag is set
- Created timestamp shows full datetime when flag is set
- Added tests for both date-only and time display modes
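The two display modes might look like the following. This is a sketch only: the signature of the PR's _format_date_value() is not shown in the commit message, so the function below, its "N/A" placeholder, and the assumption that values arrive as ISO 8601 strings are all illustrative.

```python
from datetime import datetime
from typing import Optional


def format_date_value(value: Optional[str], show_time: bool = False) -> str:
    """Render an ISO 8601 timestamp as date-only, or as full date+time when show_time is set."""
    if not value:
        return "N/A"  # assumption: placeholder for a missing date
    # Replace a trailing "Z" so fromisoformat accepts it on older Python versions.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed.isoformat() if show_time else parsed.date().isoformat()
```

With the flag unset, "2026-01-28T22:46:15Z" renders as the date alone; with it set, the full ISO datetime is shown.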

wortmanb · 2026-01-28T22:47:09Z

Nope, this needs to be merged to dev, not pulled to main.

Bret Wortman and others added 30 commits December 16, 2025 11:45

🙈 chore: add examples/config.yml to .gitignore

d795688

✨ feat: add Google Cloud Storage support

af46908

🐛 fix: use correct Azure SDK parameter name for list_containers

e389110

🐛 fix: make unexpected error message provider-aware

a11ea81

wortmanb closed this Jan 28, 2026

wortmanb deleted the feature/status-time-display branch January 29, 2026 13:28

wortmanb wants to merge 30 commits into main from feature/status-time-display
