feat: add --time flag to status command, storage tier display, and multi-provider improvements #6

Closed
wortmanb wants to merge 30 commits into main from feature/status-time-display
@wortmanb
Collaborator

Summary

Adds a --time / -rt flag to the status command to show full date+time in tables (instead of date-only), and includes several improvements to multi-cloud provider support and bug fixes.

Changes

Status Command Enhancements

  • --time / -rt flag — show full date+time (ISO) in status tables; defaults to date-only for cleaner output
  • Storage Tier column — repositories table now shows the storage tier (Hot/Cool/Archive/Mixed) by sampling blob metadata
  • Removed Phase column from ILM policies table (redundant)

Multi-Provider Support (Azure & GCP)

  • Add Azure Blob Storage and Google Cloud Storage clients alongside existing AWS/S3
  • Config file support for storage provider credentials (config.yml)
  • Provider-specific error messages throughout the codebase
  • Azure container name validation (prompts to convert underscores to hyphens during setup)
  • Correct Azure SDK parameter usage for list_containers

Bug Fixes

  • Strip fm-clone prefix when matching snapshot indices to mounted indices
  • Handle null aggregation results in get_timestamp_range
  • Capture date ranges for all mounted repos during rotation
  • Whitelist PUT fields for composable template updates
  • Strip system-managed created_date before PUT on index templates
  • Raise ActionError on template PUT failure instead of silently swallowing
  • Check for active indices before archiving repos
  • Update date range before archiving to archive tier
  • Fix unmount_repo to handle Azure container settings
  • Don't treat active repos as thawed based on storage class

Other

  • Add date range repair to repair-metadata command
  • Delete orphaned ILM policies during rotation
  • Logging and ES verification for date range updates

Bret Wortman and others added 30 commits December 16, 2025 11:45
Ignore local example configuration file to prevent accidental commits
of user-specific settings.

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
- Separate AWS implementation into dedicated aws_client.py
- Add AzureBlobClient implementing S3Client interface for Azure Blob Storage
- Update s3client.py to contain only abstract base class and factory
- Add azure-storage-blob as optional dependency [azure]
- Map S3 concepts to Azure equivalents (buckets→containers, Glacier→Archive tier)
- Add GcpStorageClient implementing S3Client interface for GCS
- Map S3 concepts to GCS equivalents (Glacier→Archive storage class)
- Add google-cloud-storage as optional dependency [gcp]
- Update factory to support "gcp" provider
- All cloud clients now accept credentials as constructor parameters
- Credentials can be specified in config.yaml under 'storage' section
- Environment variables remain as fallback if config not provided
- Add load_storage_config() and get_storage_credentials() helpers
- Update s3_client_factory() to pass **kwargs to provider clients
- Enable azure and gcp providers in CLI --provider option
- Update all READMEs with multi-provider documentation
- Update test mocks to use new aws_client module location
- Add tests for azure and gcp client factory methods
- Update constants test to expect all three providers

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
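
For reviewers, the config-first, environment-fallback lookup described above could look roughly like the sketch below. It is not the PR's `get_storage_credentials()` implementation: the function name `resolve_credentials`, the per-provider layout under `storage`, and the credential key names are all assumptions made for illustration. Only the environment variable names are the conventional ones used by each provider's tooling.

```python
import os

# Hypothetical mapping of credential keys to conventional environment variables.
ENV_FALLBACKS = {
    "aws": {
        "access_key_id": "AWS_ACCESS_KEY_ID",
        "secret_access_key": "AWS_SECRET_ACCESS_KEY",
    },
    "azure": {"connection_string": "AZURE_STORAGE_CONNECTION_STRING"},
    "gcp": {"credentials_file": "GOOGLE_APPLICATION_CREDENTIALS"},
}


def resolve_credentials(config, provider, environ=None):
    """Prefer values from the 'storage' config section, else fall back to env vars."""
    environ = os.environ if environ is None else environ
    # Assumed layout: {"storage": {"<provider>": {"<key>": "<value>"}}}
    section = (config.get("storage") or {}).get(provider) or {}
    resolved = {}
    for key, env_name in ENV_FALLBACKS[provider].items():
        value = section.get(key) or environ.get(env_name)
        if value:
            resolved[key] = value
    return resolved
```

The resulting dict could then be passed as keyword arguments to the provider client, which matches the commit's note that `s3_client_factory()` forwards `**kwargs`.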
Change max_results to results_per_page for azure-storage-blob 12.x
compatibility.
Azure Blob Storage containers don't allow underscores in names. When
using --provider azure, the CLI now detects underscores in bucket_name_prefix,
repo_name_prefix, and base_path_prefix, and prompts the user to confirm
converting them to hyphens before proceeding.
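
A minimal sketch of the underscore check, assuming the three prefixes arrive as a plain dict. The helper name `hyphenate_for_azure` is hypothetical, and the interactive confirmation is left to the caller (for example via `click.confirm`), so the sketch only computes what would change.

```python
AZURE_NAME_KEYS = ("bucket_name_prefix", "repo_name_prefix", "base_path_prefix")


def hyphenate_for_azure(settings, keys=AZURE_NAME_KEYS):
    """Return (converted_settings, changes) without mutating the input.

    `changes` maps each affected key to an (old, new) pair so the CLI can
    show the user exactly what it proposes before asking for confirmation.
    """
    converted = dict(settings)
    changes = {}
    for key in keys:
        value = settings.get(key)
        if isinstance(value, str) and "_" in value:
            converted[key] = value.replace("_", "-")
            changes[key] = (value, converted[key])
    return converted, changes
```

If `changes` is empty, no prompt is needed and setup can proceed with the original values.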
- Update repository creation error to show provider-specific solutions
- Include plugin installation commands for AWS/Azure/GCP
- Add links to Elastic documentation for each provider's repository plugin
- Update preconditions check to verify correct plugin based on provider
- Update storage bucket creation errors to be provider-aware
- Add ES_PLUGIN_NAME, ES_PLUGIN_DOC_URL, STORAGE_TYPE, etc. class
  attributes to AwsS3Client, AzureBlobClient, and GcpStorageClient
- Update setup action to read error help text from client class
- Add class attributes to MockS3Client for test compatibility

This keeps provider-specific information in the provider modules
rather than hardcoding it in the setup action.
- Remove non-existent "deepfreeze DELETE index" command from help text
- Add STORAGE_DELETE_CMD to each client class with provider-specific
  deletion commands (aws s3 rb, az storage container delete, gcloud storage rm)
- Update bucket exists error to show correct provider-specific command
Change "AWS credentials" to use the provider display name from the
storage client (e.g., "Azure credentials" for Azure provider).
The create_repo function was hardcoded to create S3 repositories.
Updated to support:
- AWS: type "s3" with bucket, base_path, canned_acl, storage_class
- Azure: type "azure" with container, base_path
- GCP: type "gcs" with bucket, base_path

This was the root cause of Azure repository creation failures.
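
The provider-to-repository mapping listed above can be sketched as a small builder. The function name `build_repository_body` and its signature are assumptions, not the PR's `create_repo` code. The repository types and setting names follow the list in this commit message.

```python
def build_repository_body(provider, bucket, base_path, canned_acl=None, storage_class=None):
    """Build a snapshot repository body for the given provider (illustrative only)."""
    if provider == "aws":
        settings = {"bucket": bucket, "base_path": base_path}
        # canned_acl and storage_class only apply to S3 repositories.
        if canned_acl:
            settings["canned_acl"] = canned_acl
        if storage_class:
            settings["storage_class"] = storage_class
        return {"type": "s3", "settings": settings}
    if provider == "azure":
        # Azure repositories name the location "container", not "bucket".
        return {"type": "azure", "settings": {"container": bucket, "base_path": base_path}}
    if provider == "gcp":
        return {"type": "gcs", "settings": {"bucket": bucket, "base_path": base_path}}
    raise ValueError(f"Unsupported provider: {provider!r}")
```

Raising on an unknown provider, rather than defaulting to S3, avoids repeating the failure mode this commit fixes.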
Add storage account name tracking to AzureBlobClient to help diagnose
credential configuration issues. Error messages now show which account
is being used when containers already exist or creation fails.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix mount_repo() to use provider from settings for correct repo type
  (s3/azure/gcs) and settings (bucket vs container)
- Update CLI defaults to validate all providers (aws, azure, gcp)
- Update rotate.py log messages to use provider-specific storage type
- Update constants.py comments to remove S3-specific terminology

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix push_to_glacier to use refreeze() which handles provider-specific
  tier changes correctly (direct tier set for Azure/GCP vs copy-in-place
  for AWS)
- Update all refreeze methods to skip objects already in target tier,
  avoiding errors when archiving already-archived blobs
- Add Storage Tier column to status display showing actual blob tier
  (Archive, Hot, Cool, Mixed) to verify archiving is working

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
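
To illustrate the two behaviours in this commit, here is a rough sketch of skipping blobs already in the target tier and of collapsing sampled tiers into one display value. Both helper names, the dict shape with a `tier` key, and the `"Unknown"` fallback for an empty sample are assumptions, not taken from the PR.

```python
def objects_needing_tier_change(objects, target_tier):
    """Keep only objects whose current tier differs from the target tier."""
    return [obj for obj in objects if obj.get("tier") != target_tier]


def summarize_tiers(sampled_tiers):
    """Collapse sampled blob tiers into a single value for the status table."""
    distinct = {tier for tier in sampled_tiers if tier}
    if not distinct:
        return "Unknown"  # assumed placeholder when nothing could be sampled
    if len(distinct) == 1:
        return next(iter(distinct))
    return "Mixed"
```

A repository that reports "Mixed" shortly after rotation may simply be mid-archive, so the column is best read as a spot check and not as a guarantee.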
Add _cleanup_orphaned_policies() to rotate action that deletes ILM
policies referencing repositories that have been unmounted. This runs
automatically after archiving repos, so old versioned policies are
cleaned up as part of the rotation process rather than requiring a
separate cleanup run.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The unmount_repo function was hardcoded to look for "bucket" in
repository settings, but Azure repos use "container" instead.
This caused unmounting to fail for Azure repos, preventing
archiving during rotation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When blobs are moved to Archive tier, they become unreadable without
rehydration. The date range update was happening during unmount (after
archiving), so it failed because snapshot metadata couldn't be read.

Now the date range is updated BEFORE pushing to archive tier, while
the snapshot metadata is still readable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two fixes for the archive/unmount logic:

1. Don't persist "unmounted" state if ES unmount actually failed
   (was causing mounted=Yes + thaw_state=frozen inconsistency)

2. Check for active searchable snapshot indices BEFORE archiving
   (prevents archiving blobs then failing to unmount, leaving
   repo in broken state where ES can't read archived blobs)

Repos with active indices are now skipped and will be retried
on the next rotation after ILM deletes those indices.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add _update_date_ranges method to RepairMetadata action that computes
and persists date ranges for mounted repositories with missing dates.

The date range for a repository is computed by querying @timestamp
values from mounted searchable snapshot indices. This can only be done
for mounted repos since unmounted repos cannot be queried.

Changes:
- Add _update_date_ranges() method to compute missing date ranges
- Update do_dry_run() to report missing date ranges
- Update do_action() to repair date ranges along with state discrepancies
- Update test to expect get_all_repos called twice (state + date ranges)

This fixes an issue where date ranges were missing from status display
because update_repository_date_range was only called during archiving.
Now users can run 'deepfreeze repair-metadata' to populate dates for
mounted repos that predate the date range tracking feature.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The repair-metadata command was incorrectly flagging active repos in
hot storage as "thawed" because both states use instant-access storage.
The distinction between active (never archived) and thawed (restored
from archive) is semantic, not storage-based.

Fix the discrepancy detection to only flag actual contradictions:
- active/thawed with archive storage → should be frozen
- frozen with instant-access storage → should be thawed
- thawing with all-archive storage → should be frozen
- thawing with all-accessible storage → should be thawed

This prevents false positives where active repos are incorrectly
identified as needing repair.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
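
The four contradiction rules above can be written as a small decision function. This is a sketch under assumptions: the function name `expected_state` and the three storage summaries (`"archive"`, `"instant"`, `"mixed"`) are hypothetical stand-ins for however the PR classifies a repository's blobs.

```python
def expected_state(recorded_state, storage):
    """Return the corrected state, or None when there is no contradiction.

    `storage` summarises the repo's blobs: "archive" (all archived),
    "instant" (all instantly accessible), or "mixed".
    """
    if recorded_state in ("active", "thawed") and storage == "archive":
        return "frozen"
    if recorded_state == "frozen" and storage == "instant":
        return "thawed"
    if recorded_state == "thawing":
        if storage == "archive":
            return "frozen"
        if storage == "instant":
            return "thawed"
    # active and thawed both live on instant-access storage, so that
    # combination is never treated as a discrepancy.
    return None
```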
Add logging before each date range update attempt so we can see which
repo the command is processing. Also verify that the repository actually
exists in Elasticsearch before trying to query its snapshots, to avoid
hanging on repos that are marked as mounted in the status index but
aren't actually present in ES.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…swallowing

Separate GET and PUT operations into distinct try blocks in
update_index_template_ilm_policy() for both composable and legacy
template paths. Previously, a PUT failure after a successful GET was
caught by a generic except handler, logged at DEBUG level, and silently
swallowed — causing the code to fall through and report the template as
"not found" when it was actually found but failed to update.

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
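
The control-flow change can be sketched as follows. This is not the PR's `update_index_template_ilm_policy()`: the GET and PUT calls are passed in as plain callables, a missing template is modelled as a `KeyError`, and `ActionError` is redefined locally so the example is self-contained.

```python
import copy


class ActionError(Exception):
    """Local stand-in for the project's ActionError."""


def update_template_policy(get_template, put_template, name, policy_name):
    """Return False if the template is missing; raise ActionError if the PUT fails."""
    # GET has its own try block: only a missing template means "not found".
    try:
        current = get_template(name)
    except KeyError:
        return False

    body = copy.deepcopy(current)
    lifecycle = (
        body.setdefault("template", {})
        .setdefault("settings", {})
        .setdefault("index", {})
        .setdefault("lifecycle", {})
    )
    lifecycle["name"] = policy_name

    # PUT has a separate try block so a failure here is never reported as "not found".
    try:
        put_template(name, body)
    except Exception as exc:
        raise ActionError(f"Failed to update template {name!r}: {exc}") from exc
    return True
```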
Elasticsearch returns created_date in get_index_template responses but
rejects it on put_index_template. Strip this field before both PUT call
sites in update_index_template_ilm_policy and update_template_ilm_policy.

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Replace blacklist approach (popping created_date) with a whitelist of
fields accepted by put_index_template. This ensures no system-managed
metadata from the GET response leaks into the PUT body.

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
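
A minimal sketch of the whitelist approach, assuming the input is the `index_template` object from a GET response. The constant and function names are hypothetical, and the field list below is an illustrative subset of what the put index template API documents. The exact whitelist used in this PR is not shown here.

```python
# Assumed subset of body fields accepted when creating or updating an index template.
ALLOWED_PUT_FIELDS = (
    "index_patterns",
    "template",
    "composed_of",
    "priority",
    "version",
    "_meta",
    "data_stream",
    "allow_auto_create",
)


def build_put_body(index_template):
    """Copy only whitelisted fields, dropping system-managed ones such as created_date."""
    return {
        field: index_template[field]
        for field in ALLOWED_PUT_FIELDS
        if field in index_template
    }
```

Compared with popping `created_date`, this keeps working if a later Elasticsearch version adds further read-only fields to the GET response.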
Date ranges were never populated because the only update call lived
inside _archive_old_repos, which runs after searchable snapshot indices
are already gone. Move date range capture to a dedicated
_update_date_ranges() step that runs before archiving, while indices
are still queryable.

- Add _update_date_ranges() method to Rotate class
- Call it in do_action() between ILM policy update and archive step
- Remove now-redundant update_repository_date_range call from
  _archive_old_repos()
- Remove unused get_repository import
- Add test verifying _update_date_ranges behaviour

Co-Authored-By: Claude <noreply@anthropic.com>

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
When querying indices with no documents or no @timestamp field,
Elasticsearch returns {"value": null} without a value_as_string key.
This was causing an uncaught KeyError that silently propagated up to
update_repository_date_range, making it impossible to distinguish
between "no indices found" and "indices found but no @timestamp data".

- Add early return if all indices filtered out (empty index list)
- Use .get() instead of dict access for value_as_string
- Add explicit null check with debug logging explaining why
- Prevents KeyError from masking the real cause of empty date ranges

Co-Authored-By: Claude <noreply@anthropic.com>

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
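
The null-safe extraction might look like the sketch below. The aggregation names `earliest` and `latest` and the function name are assumptions. The point is only the `.get()` access plus an explicit check, so that `{"value": null}` yields `None` and not a `KeyError`.

```python
import logging

logger = logging.getLogger(__name__)


def extract_timestamp_range(response):
    """Return (earliest, latest) as strings, or None if no @timestamp data exists."""
    aggs = response.get("aggregations") or {}
    # Empty indices return {"value": None} with no value_as_string key at all.
    earliest = (aggs.get("earliest") or {}).get("value_as_string")
    latest = (aggs.get("latest") or {}).get("value_as_string")
    if earliest is None or latest is None:
        logger.debug("Aggregation returned null: indices are empty or lack @timestamp")
        return None
    return earliest, latest
```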
…d indices

ILM force-merge creates snapshots with fm-clone-<random>- prefix
(e.g., fm-clone-i6je-.ds-df-test-2026.01.28-000001) but when those
snapshots are mounted as searchable snapshots, ES strips the prefix
(restored-.ds-df-test-2026.01.28-000001). This mismatch prevented
update_repository_date_range from finding the mounted indices,
resulting in empty date ranges.

Strip the fm-clone-xxxx- prefix from snapshot index names before
attempting to match them to mounted indices. This allows the code to
correctly find restored- indices and query their @timestamp ranges.

Co-Authored-By: Claude <noreply@anthropic.com>

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
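
As a sketch of the prefix handling, assuming the random segment is lowercase alphanumeric as in the example above. The regular expression and both helper names are illustrative and are not copied from the PR.

```python
import re

# Matches the force-merge clone prefix, e.g. "fm-clone-i6je-".
FM_CLONE_PREFIX = re.compile(r"^fm-clone-[a-z0-9]+-")


def strip_fm_clone_prefix(index_name):
    """Remove a leading fm-clone-<random>- prefix if present."""
    return FM_CLONE_PREFIX.sub("", index_name, count=1)


def mounted_index_name(snapshot_index, mount_prefix="restored-"):
    """Name the mounted searchable snapshot index is expected to have."""
    return mount_prefix + strip_fm_clone_prefix(snapshot_index)
```

Index names without the prefix pass through unchanged, so the same lookup works for snapshots taken with and without a force-merge step.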
The Phase column showed which ILM phase contained the searchable_snapshot
action at the time each versioned policy was created. This was
configuration archaeology that provided no actionable information for
operating or monitoring the system.

Remove the column and associated phase_name tracking, keeping only the
useful operational data: policy name, repository, and usage counts.

Co-Authored-By: Claude <noreply@anthropic.com>

🤖 Commit generated with [Claude Code](https://claude.com/claude-code)
Show full date+time in status tables instead of date-only.
Usage: deepfreeze status --time  (or -rt)

- Added -rt/--time click option to CLI status command
- Added show_time parameter to Status class
- Added _format_date_value() and _format_created_value() helpers
- Date ranges show full ISO datetime when flag is set
- Created timestamp shows full datetime when flag is set
- Added tests for both date-only and time display modes
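
A rough sketch of what a date formatting helper for this flag could do, assuming stored values are ISO 8601 strings. It is not the PR's `_format_date_value()`: the function name, the `"N/A"` placeholder for missing values, and the handling of a trailing `Z` are assumptions.

```python
from datetime import datetime


def format_date_value(value, show_time=False):
    """Render an ISO 8601 string as date-only, or as full date+time when show_time is set."""
    if not value:
        return "N/A"  # assumed placeholder for missing dates
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if show_time:
        return parsed.isoformat(timespec="seconds")
    return parsed.date().isoformat()
```

With the flag off, `2026-01-28T13:45:10+00:00` renders as `2026-01-28`. With `--time` it keeps the full timestamp.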
@wortmanb
Collaborator Author

Nope, this needs to be merged to dev, not pulled to main.

@wortmanb wortmanb closed this Jan 28, 2026
@wortmanb wortmanb deleted the feature/status-time-display branch January 29, 2026 13:28