feat: Implement Smart Retry Logic with Exponential Backoff for Installations #658

KrishnaShuk · 2026-01-20T06:40:05Z

Related Issue

Closes #43

Summary

This PR implements a robust "Smart Retry" mechanism for the Cortex CLI to improve reliability during installations by automatically handling transient failures.

SmartRetry Utility: Introduced cortex/utils/retry.py with exponential backoff and intelligent error classification to retry transient issues (like network timeouts) while failing fast on permanent errors (like permission denied or disk space), ensuring no time is wasted on unrecoverable failures.
Integration & Visibility: Integrated the retry logic into InstallationCoordinator
and updated the status callback to print visible warnings ("⚠️ Transient error detected...") to stdout, keeping users informed of delays.
Configuration & Documentation: Configured the CLI to default to 5 retries and updated docs/COMMANDS.md to officially document this new reliability feature.
Testing: Added comprehensive unit tests in tests/test_retry.py with 100% coverage and updated existing coordinator tests to mock sleep for efficient execution.

AI Disclosure

No AI used
AI/IDE/Agents used (please describe below)
Google antigravity AI coding assistant(Claude Opus 4.5 ) was used for framing better tests and creating the command file.

Demonstration

I have manipulated coordinator.py for producing below behaviour.

Screencast.from.2026-01-22.12-59-37.webm

Checklist

PR title follows format: type(scope): description or [scope] description
Tests pass (pytest tests/)
MVP label added if closing MVP issue
Update "Cortex -h" (if needed)

Summary by CodeRabbit

New Features
- Installs now include a configurable smart retry mechanism that detects transient failures, retries with exponential backoff (default max attempts: 5) and emits progress/status callbacks; permanent errors fail fast.
- CLI option added to control maximum retry attempts per install.
Tests
- Added extensive tests for retry behaviors: success, transient and permanent failures, retry exhaustion, exception-based retries, and status-callbacks.
Documentation
- Install docs updated and a new retry configuration guide added describing strategies and environment overrides.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-20T06:40:12Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a SmartRetry utility and integrates retry-enabled command execution into InstallationCoordinator with a new max_retries parameter. CLI exposes max_retries on install and threads it into sequential installs. Tests and documentation added/updated; tests mock time.sleep for determinism.

Changes

Cohort / File(s)	Summary
Retry utility `cortex/utils/retry.py`	New `RetryStrategy` dataclass, `DEFAULT_STRATEGIES`, `PERMANENT_ERRORS`, `load_strategies_from_env()` and `SmartRetry` implementing categorized, configurable retries with exponential backoff and status callbacks.
Coordinator core `cortex/coordinator.py`	`InstallationCoordinator.__init__` and `from_plan` gain `max_retries`; `_execute_command` now delegates command execution to `SmartRetry.run(...)`, preserving timeout, rollback, logging, and progress callbacks.
CLI call site `cortex/cli.py`	Imports `DEFAULT_MAX_RETRIES`; `CortexCLI.install` signature adds `max_retries: int = DEFAULT_MAX_RETRIES`; sequential install constructs `InstallationCoordinator(..., max_retries=max_retries)`.
Tests `tests/test_retry.py`, `tests/test_coordinator.py`	New `tests/test_retry.py` exercising strategies, env overrides, callbacks, exception and retry flows; `tests/test_coordinator.py` patches `time.sleep` and updates test signatures/expectations.
Docs `docs/COMMANDS.md`, `docs/RETRY_CONFIGURATION.md`	Single-line note in `COMMANDS.md` about smart retry behavior; new `RETRY_CONFIGURATION.md` documents strategies, env overrides, examples, and runtime feedback.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI (cortex/cli.py)
    participant Coordinator as InstallationCoordinator
    participant SmartRetry as SmartRetry
    participant Subprocess as subprocess.run
    participant Logger as Logger

    CLI->>Coordinator: instantiate(..., max_retries)
    Coordinator->>SmartRetry: create(strategies, status_callback)

    loop for each command
        Coordinator->>SmartRetry: run(execute_command)
        SmartRetry->>Subprocess: execute command (subprocess.run)
        alt returncode == 0
            Subprocess-->>SmartRetry: success
            SmartRetry-->>Coordinator: result
        else transient error
            Subprocess-->>SmartRetry: failed result
            SmartRetry->>Logger: log/notify "Retrying" (status_callback)
            SmartRetry->>SmartRetry: sleep(backoff)
            SmartRetry->>Subprocess: retry
        else permanent error
            Subprocess-->>SmartRetry: failed result
            SmartRetry-->>Coordinator: propagate failure
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Added multi-step structured installation coordinator with plan support fixes #8 #190: Prior changes to InstallationCoordinator constructor/from_plan and execution flow that this PR extends to add max_retries and SmartRetry integration.

Poem

🐰 I hop through logs and backoff's beat,

one, two, then four — I won't admit defeat.
Transient storms I bravely retry,
Permanent woes I kindly pass by.
✨—A rabbit's cheer for resilient installs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title clearly and specifically describes the main change: implementing smart retry logic with exponential backoff for installations. It directly matches the primary objective of the changeset.
Description check	✅ Passed	The PR description covers all key sections including Related Issue, Summary with detailed implementation aspects, AI Disclosure, and a completed Checklist. All required information is present and well-organized.
Linked Issues check	✅ Passed	All coding requirements from `#43` are met: transient vs permanent failure detection [cortex/utils/retry.py], exponential backoff implementation [SmartRetry class], configurable max retries defaulting to 5 [CLI integration], retry logging [status callbacks], different strategies per error type [DEFAULT_STRATEGIES mapping], user-visible progress [status_callback warnings], and comprehensive unit tests [tests/test_retry.py with 100% coverage].
Out of Scope Changes check	✅ Passed	All changes directly support the retry logic feature. cortex/cli.py and cortex/coordinator.py additions thread max_retries through the installation path. cortex/utils/retry.py implements core retry logic. Tests and documentation appropriately support the feature. No unrelated changes detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-01-20T06:40:22Z

Summary of Changes

Hello @KrishnaShuk, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the installation process by introducing a smart retry mechanism with exponential backoff. This system is designed to automatically re-attempt installation commands that fail due to temporary issues, such as network glitches or resource contention, while immediately failing on permanent errors. This will lead to more reliable and self-healing installations, reducing manual intervention.

Highlights

Introduced Smart Retry Logic: A new SmartRetry class has been implemented to handle transient errors during installations with exponential backoff.
Integrated into Installation Coordinator: The InstallationCoordinator now utilizes the SmartRetry mechanism for executing commands, making installation steps more resilient.
Configurable Retries: The maximum number of retries (max_retries) is now a configurable parameter for the InstallationCoordinator.
Error Classification: The SmartRetry class leverages an ErrorParser to intelligently distinguish between transient errors (which are retried) and permanent errors (which cause immediate failure).
New Unit Tests: Dedicated unit tests for the SmartRetry class have been added to ensure its correct functionality and error handling.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

KrishnaShuk · 2026-01-20T06:40:24Z

Working on it!

gemini-code-assist

Code Review

This pull request introduces a robust smart retry mechanism with exponential backoff for installation processes, significantly improving the resilience of command execution. A new SmartRetry class has been added, which intelligently distinguishes between transient and permanent errors using ErrorParser to decide whether to retry. The InstallationCoordinator has been updated to integrate this new retry logic, and corresponding unit tests for SmartRetry have been added. While the core retry logic is well-implemented, some existing tests for InstallationCoordinator need to be updated to fully reflect and verify the new retry behavior. Additionally, the handling of subprocess.TimeoutExpired could be more explicit within the retry logic to prevent unnecessary retries.

github-actions · 2026-01-20T08:17:19Z

CLA Verification Passed

All contributors have signed the CLA.

Contributor	Signed As
@KrishnaShuk	@KrishnaShuk
@Anshgrover23	@Anshgrover23

Copilot

Pull request overview

This PR implements smart retry logic with exponential backoff for installation commands to handle transient errors more gracefully. The implementation adds a new SmartRetry class that uses error classification to distinguish between retryable transient errors (network issues, locks) and permanent failures (permission denied, missing packages).

Changes:

Added SmartRetry class with exponential backoff and error categorization
Integrated retry logic into InstallationCoordinator for command execution
Added comprehensive test coverage for retry scenarios
Added max_retries parameter to coordinator initialization

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
cortex/utils/retry.py	New retry handler with exponential backoff and error classification
cortex/coordinator.py	Integration of retry logic into command execution pipeline
tests/test_retry.py	Comprehensive test suite for retry behavior
tests/test_coordinator.py	Updated tests to mock `time.sleep` for retry integration
cortex/cli.py	Added `max_retries` parameter to coordinator instantiation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

cortex/coordinator.py

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cortex/cli.py (1)

1037-1044: Add --max-retries CLI flag to make retry attempts user-configurable.

The CLI hard-codes max_retries=5 when instantiating InstallationCoordinator, and there is no existing --max-retries flag or configuration mechanism to override it. Users cannot customize retry behavior for installation steps. Add a --max-retries argument to install_parser (default 5) and wire it through the install method to InstallationCoordinator.

🛠️ Suggested wiring

-                coordinator = InstallationCoordinator(
+                coordinator = InstallationCoordinator(
                     commands=commands,
                     descriptions=[f"Step {i + 1}" for i in range(len(commands))],
                     timeout=300,
                     stop_on_error=True,
                     progress_callback=progress_callback,
-                    max_retries=5,
+                    max_retries=max_retries,
                 )

Update install method signature:

     def install(
         self,
         software: str,
         execute: bool = False,
         dry_run: bool = False,
         parallel: bool = False,
+        max_retries: int = 5,
         json_output: bool = False,
     ):

Add to install_parser:

+install_parser.add_argument(
+    "--max-retries",
+    type=int,
+    default=5,
+    help="Maximum retry attempts for installation steps (default: 5)",
+)

Update dispatch call:

         elif args.command == "install":
             return cli.install(
                 args.software,
                 execute=args.execute,
                 dry_run=args.dry_run,
                 parallel=args.parallel,
+                max_retries=args.max_retries,
             )

🤖 Fix all issues with AI agents

In `@cortex/coordinator.py`:
- Around line 191-199: status_callback currently only calls self._log (which
writes to optional log_file) so retry updates are silent in the default CLI;
modify status_callback used for SmartRetry to also emit CLI-visible output
(e.g., call an existing CLI progress callback or print to stdout) so retry
progress is visible to users. Locate status_callback and update it to call both
self._log(msg) and the CLI-facing notifier (e.g., self._progress_callback(msg)
or a console print) before passing it into SmartRetry.

In `@cortex/utils/retry.py`:
- Around line 17-26: Validate inputs in the __init__ method: ensure max_retries
is an int >= 0 and backoff_factor is a float >= 0.0, and raise a clear
ValueError if either is negative or of the wrong type to fail fast (this
prevents negative durations passed to time.sleep later). Set self.max_retries
and self.backoff_factor only after validation; keep self.status_callback and
self.error_parser initialization unchanged.
- Around line 94-115: The current logic retries by default for unlisted
categories causing permanent errors (e.g., DISK_SPACE) to be retried; update the
retry decision in the function that calls error_parser.parse_error (uses
analysis = self.error_parser.parse_error(error_message) and reads
analysis.primary_category) to treat non-fixable errors as permanent: first check
analysis.is_fixable and immediately return False when is_fixable is False, then
keep the existing retry-on-transient list (ErrorCategory.NETWORK_ERROR,
LOCK_ERROR, UNKNOWN) and the permanent list (add DISK_SPACE plus other permanent
categories like PERMISSION_DENIED, PACKAGE_NOT_FOUND, CONFIGURATION_ERROR,
DEPENDENCY_MISSING, CONFLICT). This ensures permanent categories are not retried
while preserving current transient retry behavior.
- Around line 63-66: In the except block that sets last_exception and calls
self._should_retry (the block capturing "except Exception as e"), replace "raise
e" with a bare "raise" so the original traceback is preserved; locate the
handler that assigns last_exception and invokes self._should_retry(str(e)) and
change the re-raise to a bare raise to keep the original stack trace for
debugging.

🧹 Nitpick comments (1)

tests/test_retry.py (1)
9-13: Add explicit return type hints on test methods.
Type hints are required in Python files; please add -> None to setUp and each test_* method to keep the new test module compliant. As per coding guidelines, type hints are required in Python code.
♻️ Suggested update
-    def setUp(self):
+    def setUp(self) -> None:
         self.retry = SmartRetry(max_retries=3, backoff_factor=0.01)

-    def test_success_first_try(self):
+    def test_success_first_try(self) -> None:

cortex/coordinator.py

cortex/utils/retry.py

Anshgrover23

@KrishnaShuk Kindly address all coderabbit comments.

KrishnaShuk · 2026-01-20T10:06:14Z

will do it under a day.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@cortex/coordinator.py`:
- Around line 182-194: Add explicit type annotations to the inner helpers:
annotate run_cmd with a return type of subprocess.CompletedProcess[str] (i.e.,
def run_cmd() -> subprocess.CompletedProcess[str]:) and annotate status_callback
to return None (i.e., def status_callback(msg: str) -> None:). Ensure
subprocess.CompletedProcess is referenced via the subprocess module (or
imported) so the annotation resolves.

♻️ Duplicate comments (1)

cortex/utils/retry.py (1)

68-88: Preserve the original traceback when re-raising after retries.
Raising last_exception at the end loses the original traceback, which makes debugging harder after multiple retries. Consider storing exc_info and re-raising with its original traceback.

🛠️ Proposed fix

+import sys
@@
-        last_exception = None
+        last_exc_info = None
         last_result = None
@@
-            except Exception as e:
-                last_exception = e
+            except Exception as e:
+                last_exc_info = sys.exc_info()
                 if not self._should_retry(str(e)):
                     raise
@@
-        if last_exception:
-            raise last_exception
+        if last_exc_info:
+            _, exc, tb = last_exc_info
+            raise exc.with_traceback(tb)
         return last_result

🧹 Nitpick comments (1)

cortex/coordinator.py (1)

55-66: Document the new max_retries parameter in public docstrings.
Both __init__ and from_plan are public entry points; updating their docstrings keeps the API change discoverable.

Also applies to: 86-98

cortex/coordinator.py

KrishnaShuk · 2026-01-22T07:49:08Z

PR passes all acceptance criteria.
Resolved all comments.
Added demonstration video.
Passing all tests.

@Anshgrover23 PR is ready to be merged!

Anshgrover23

@KrishnaShuk added some comments, also there is missing test for ErrorCategory integration, add them.

Also, two things are missing from requirements:

Different strategies for different error types
Documentation with configuration examples

cortex/utils/retry.py

cortex/cli.py

cortex/utils/retry.py

cortex/coordinator.py

cortex/utils/retry.py

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@cortex/coordinator.py`:
- Around line 200-204: The code currently mutates a shared RetryStrategy by
setting strategies[ErrorCategory.NETWORK_ERROR].max_retries after calling
load_strategies_from_env(); instead, create and assign a new RetryStrategy
instance so you don't modify the default object. In the block that checks
ErrorCategory.NETWORK_ERROR, construct a fresh RetryStrategy (copying existing
fields or using its constructor) with max_retries=self.max_retries and set
strategies[ErrorCategory.NETWORK_ERROR] = new_instance; this preserves
load_strategies_from_env()'s defaults and avoids global side effects.

In `@docs/RETRY_CONFIGURATION.md`:
- Around line 30-32: The fenced code blocks in RETRY_CONFIGURATION.md are
missing language identifiers (MD040); update the block containing the formula
`wait_time = backoff_factor × 2^(attempt - 1)` and the example block showing the
retry message to include an explicit language (e.g., use ```text or ```math for
the formula and ```text for the retry message) so linting passes and syntax
highlighting is applied.

In `@tests/test_retry.py`:
- Around line 55-71: The tests test_network_override_from_env and
test_lock_override_from_env can be flaky because patch.dict is not clearing
existing environment variables; update both with patch.dict(..., clear=True) so
the CORTEX_RETRY_* values are set deterministically for
load_strategies_from_env; ensure the patch.dict calls surrounding each test use
clear=True to isolate the environment when asserting ErrorCategory.NETWORK_ERROR
and ErrorCategory.LOCK_ERROR strategy values.

♻️ Duplicate comments (2)

cortex/utils/retry.py (2)

55-90: Guard env overrides against invalid/negative values.

Right now int()/float() on CORTEX_RETRY_* will raise (or allow negative values), which can later feed time.sleep with a negative duration. Consider validating and falling back to defaults with a warning.

🛠️ Suggested guardrails for env parsing

 def load_strategies_from_env() -> dict[ErrorCategory, RetryStrategy]:
     """
@@
-    strategies = dict(DEFAULT_STRATEGIES)
+    strategies = dict(DEFAULT_STRATEGIES)
+
+    def _read_int(name: str, default: int) -> int:
+        raw = os.getenv(name)
+        if raw is None:
+            return default
+        try:
+            value = int(raw)
+        except ValueError:
+            logger.warning("Invalid %s=%r; using default %s", name, raw, default)
+            return default
+        if value < 0:
+            logger.warning("Negative %s=%r; using default %s", name, raw, default)
+            return default
+        return value
+
+    def _read_float(name: str, default: float) -> float:
+        raw = os.getenv(name)
+        if raw is None:
+            return default
+        try:
+            value = float(raw)
+        except ValueError:
+            logger.warning("Invalid %s=%r; using default %s", name, raw, default)
+            return default
+        if value < 0:
+            logger.warning("Negative %s=%r; using default %s", name, raw, default)
+            return default
+        return value
@@
-            max_retries=int(os.getenv("CORTEX_RETRY_NETWORK_MAX", "5")),
-            backoff_factor=float(os.getenv("CORTEX_RETRY_NETWORK_BACKOFF", "1.0")),
+            max_retries=_read_int("CORTEX_RETRY_NETWORK_MAX", 5),
+            backoff_factor=_read_float("CORTEX_RETRY_NETWORK_BACKOFF", 1.0),
@@
-            max_retries=int(os.getenv("CORTEX_RETRY_LOCK_MAX", "3")),
-            backoff_factor=float(os.getenv("CORTEX_RETRY_LOCK_BACKOFF", "5.0")),
+            max_retries=_read_int("CORTEX_RETRY_LOCK_MAX", 3),
+            backoff_factor=_read_float("CORTEX_RETRY_LOCK_BACKOFF", 5.0),
@@
-            max_retries=int(os.getenv("CORTEX_RETRY_UNKNOWN_MAX", "2")),
-            backoff_factor=float(os.getenv("CORTEX_RETRY_UNKNOWN_BACKOFF", "2.0")),
+            max_retries=_read_int("CORTEX_RETRY_UNKNOWN_MAX", 2),
+            backoff_factor=_read_float("CORTEX_RETRY_UNKNOWN_BACKOFF", 2.0),

147-151: Warn when a failed command produces no stderr/stdout.

If a command fails with no output, the retry is silently classified as UNKNOWN. A small warning makes this diagnosable without changing behavior.

🛠️ Suggested warning for empty outputs

                 if hasattr(result, "stderr") and result.stderr:
                     error_msg = result.stderr
                 elif hasattr(result, "stdout") and result.stdout:
                     error_msg = result.stdout
+
+                if not error_msg:
+                    logger.warning(
+                        "Command failed with returncode %s but produced no output; treating as UNKNOWN",
+                        getattr(result, "returncode", "unknown"),
+                    )

🧹 Nitpick comments (1)

tests/test_retry.py (1)
14-21: Add return type annotations for test methods.

Type hints are required in Python code here; please annotate setUp and all test_* methods with -> None.
🛠️ Example pattern (apply across test methods)
-    def test_strategy_creation(self):
+    def test_strategy_creation(self) -> None:
@@
-    def setUp(self):
+    def setUp(self) -> None:
@@
-    def test_success_first_try(self):
+    def test_success_first_try(self) -> None:
As per coding guidelines, type hints are required in Python code.

cortex/coordinator.py

docs/RETRY_CONFIGURATION.md

tests/test_retry.py

sonarqubecloud · 2026-01-23T19:05:01Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@cortex/coordinator.py`:
- Line 102: Replace the hardcoded default value 5 for the max_retries parameter
with the shared constant DEFAULT_MAX_RETRIES to keep defaults consistent; update
the function/method signature that currently declares max_retries: int = 5 to
use max_retries: int = DEFAULT_MAX_RETRIES and ensure any nearby references
(e.g., in the same class/method as __init__ or other methods) use the same
DEFAULT_MAX_RETRIES symbol so the module consistently relies on the imported
constant.

In `@tests/test_retry.py`:
- Around line 14-22: The test methods in TestRetryStrategy (e.g.,
test_strategy_creation) are missing return type annotations; update every test
method in this file (all 16) to include the return type -> None (for example:
def test_strategy_creation(self) -> None:) so each test method signature has an
explicit None return type per the guideline.

🧹 Nitpick comments (1)

cortex/coordinator.py (1)

70-78: Document the max_retries parameter in the docstring.

The __init__ docstring at line 72 doesn't document the new max_retries parameter. As per coding guidelines, docstrings should cover all public API parameters.

📝 Suggested docstring update

     def __init__(
         self,
         commands: list[str],
         descriptions: list[str] | None = None,
         timeout: int = 300,
         stop_on_error: bool = True,
         enable_rollback: bool = False,
         log_file: str | None = None,
         progress_callback: Callable[[int, int, InstallationStep], None] | None = None,
         max_retries: int = DEFAULT_MAX_RETRIES,
     ):
-        """Initialize an installation run with optional logging and rollback."""
+        """Initialize an installation run with optional logging and rollback.
+
+        Args:
+            commands: List of shell commands to execute.
+            descriptions: Optional descriptions for each command.
+            timeout: Command execution timeout in seconds.
+            stop_on_error: Whether to stop execution on first failure.
+            enable_rollback: Whether to run rollback commands on failure.
+            log_file: Optional file path for logging.
+            progress_callback: Optional callback for progress updates.
+            max_retries: Maximum retry attempts for transient errors.
+        """

coderabbitai · 2026-01-23T19:10:39Z

cortex/coordinator.py

        enable_rollback: bool | None = None,
        log_file: str | None = None,
        progress_callback: Callable[[int, int, InstallationStep], None] | None = None,
+        max_retries: int = 5,


⚠️ Potential issue | 🟡 Minor

Use DEFAULT_MAX_RETRIES constant instead of hardcoded value.

Line 102 uses a hardcoded 5 while __init__ at line 70 uses the imported DEFAULT_MAX_RETRIES constant. This inconsistency could cause confusion and maintenance issues if the default value changes.

🔧 Proposed fix

- max_retries: int = 5, + max_retries: int = DEFAULT_MAX_RETRIES,

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

max_retries: int = 5,

max_retries: int = DEFAULT_MAX_RETRIES,

🤖 Prompt for AI Agents

In `@cortex/coordinator.py` at line 102, Replace the hardcoded default value 5 for the max_retries parameter with the shared constant DEFAULT_MAX_RETRIES to keep defaults consistent; update the function/method signature that currently declares max_retries: int = 5 to use max_retries: int = DEFAULT_MAX_RETRIES and ensure any nearby references (e.g., in the same class/method as __init__ or other methods) use the same DEFAULT_MAX_RETRIES symbol so the module consistently relies on the imported constant.

coderabbitai · 2026-01-23T19:10:39Z

tests/test_retry.py

+class TestRetryStrategy(unittest.TestCase):
+    """Tests for the RetryStrategy dataclass."""
+
+    def test_strategy_creation(self):
+        strategy = RetryStrategy(max_retries=5, backoff_factor=1.0, description="Test")
+        self.assertEqual(strategy.max_retries, 5)
+        self.assertEqual(strategy.backoff_factor, 1.0)
+        self.assertEqual(strategy.description, "Test")
+


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, check if the file exists and view the content around lines 14-22 if [ -f "tests/test_retry.py" ]; then echo "=== File content (lines 1-40) ===" head -40 "tests/test_retry.py" | cat -n else echo "File not found, listing test files:" find . -name "test_retry.py" -o -name "*test*.py" | head -20 fi

Repository: cortexlinux/cortex

Length of output: 1743

🏁 Script executed:

# Run the verification script to identify missing return type annotations python - <<'PY' import ast, pathlib # Try to find the test file import os test_file = None for root, dirs, files in os.walk("."): if "test_retry.py" in files: test_file = os.path.join(root, "test_retry.py") break if test_file: print(f"Found: {test_file}") path = pathlib.Path(test_file) tree = ast.parse(path.read_text()) missing = [] for node in ast.walk(tree): if isinstance(node, ast.FunctionDef): status = "✓ has return hint" if node.returns else "✗ missing return hint" missing.append(f"{node.name} (line {node.lineno}): {status}") print("\n=== Method annotations ===") for item in sorted(missing): print(item) else: print("test_retry.py not found") PY

Repository: cortexlinux/cortex

Length of output: 148

🏁 Script executed:

# Simple grep to find all def statements and check for return type hints rg "^\s+def test_" tests/test_retry.py -A 0

Repository: cortexlinux/cortex

Length of output: 858

🏁 Script executed:

# Check how many methods in the test file lack return type annotations grep -c "def test_" tests/test_retry.py || echo "File check complete"

Repository: cortexlinux/cortex

Length of output: 64

🏁 Script executed:

# Look at the entire test file to understand scope wc -l tests/test_retry.py

Repository: cortexlinux/cortex

Length of output: 85

Add return type hints to all test methods for guideline compliance.

All 16 test methods in this file lack return type annotations. Type hints are required in Python code per coding guidelines. Add -> None to each test method.

Example update

- def test_strategy_creation(self): + def test_strategy_creation(self) -> None:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class TestRetryStrategy(unittest.TestCase):

"""Tests for the RetryStrategy dataclass."""

def test_strategy_creation(self):

strategy = RetryStrategy(max_retries=5, backoff_factor=1.0, description="Test")

self.assertEqual(strategy.max_retries, 5)

self.assertEqual(strategy.backoff_factor, 1.0)

self.assertEqual(strategy.description, "Test")

class TestRetryStrategy(unittest.TestCase):

"""Tests for the RetryStrategy dataclass."""

def test_strategy_creation(self) -> None:

strategy = RetryStrategy(max_retries=5, backoff_factor=1.0, description="Test")

self.assertEqual(strategy.max_retries, 5)

self.assertEqual(strategy.backoff_factor, 1.0)

self.assertEqual(strategy.description, "Test")

🤖 Prompt for AI Agents

In `@tests/test_retry.py` around lines 14 - 22, The test methods in TestRetryStrategy (e.g., test_strategy_creation) are missing return type annotations; update every test method in this file (all 16) to include the return type -> None (for example: def test_strategy_creation(self) -> None:) so each test method signature has an explicit None return type per the guideline.

added retry functionality

c3e5ac7

[autofix.ci] apply automated fixes

769bedc

gemini-code-assist bot reviewed Jan 20, 2026

View reviewed changes

Anshgrover23 marked this pull request as ready for review January 20, 2026 08:17

Anshgrover23 requested review from Anshgrover23, Suyashd999 and mikejmorgan-ai as code owners January 20, 2026 08:17

Copilot AI review requested due to automatic review settings January 20, 2026 08:17

Merge branch 'main' into feat/retryFunctionality

d1c7baf

Copilot AI reviewed Jan 20, 2026

View reviewed changes

cortex/coordinator.py Outdated Show resolved Hide resolved

Anshgrover23 assigned KrishnaShuk Jan 20, 2026

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

cortex/coordinator.py Outdated Show resolved Hide resolved

cortex/utils/retry.py Show resolved Hide resolved

cortex/utils/retry.py Outdated Show resolved Hide resolved

cortex/utils/retry.py Outdated Show resolved Hide resolved

Anshgrover23 requested changes Jan 20, 2026

View reviewed changes

Anshgrover23 and others added 4 commits January 21, 2026 22:29

Merge branch 'main' into feat/retryFunctionality

b74b3e5

added documentation

840f3ef

minor fixes

ae299f1

[autofix.ci] apply automated fixes

9046057

coderabbitai bot reviewed Jan 22, 2026

View reviewed changes

cortex/coordinator.py Outdated Show resolved Hide resolved

minor fix

af526dc

KrishnaShuk requested a review from Anshgrover23 January 22, 2026 09:40

Merge branch 'main' into feat/retryFunctionality

b66ebf7

Anshgrover23 requested changes Jan 22, 2026

View reviewed changes

KrishnaShuk and others added 2 commits January 23, 2026 23:11

added different strategies and documentation

c0107a5

[autofix.ci] apply automated fixes

af3e882

coderabbitai bot reviewed Jan 23, 2026

View reviewed changes

cortex/coordinator.py Show resolved Hide resolved

docs/RETRY_CONFIGURATION.md Outdated Show resolved Hide resolved

tests/test_retry.py Show resolved Hide resolved

KrishnaShuk and others added 4 commits January 23, 2026 23:31

resolved suggestions

f2bd1e8

[autofix.ci] apply automated fixes

4d01629

Merge branch 'main' into feat/retryFunctionality

76693aa

minor changes

3f2aa27

coderabbitai bot reviewed Jan 23, 2026

View reviewed changes

	max_retries: int = 5,
	max_retries: int = DEFAULT_MAX_RETRIES,

Uh oh!

feat: Implement Smart Retry Logic with Exponential Backoff for Installations #658

Are you sure you want to change the base?

feat: Implement Smart Retry Logic with Exponential Backoff for Installations #658

Conversation

KrishnaShuk commented Jan 20, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Issue

Summary

AI Disclosure

Demonstration

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

gemini-code-assist bot commented Jan 20, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

KrishnaShuk commented Jan 20, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

github-actions bot commented Jan 20, 2026

CLA Verification Passed

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Anshgrover23 left a comment

Choose a reason for hiding this comment

Uh oh!

KrishnaShuk commented Jan 20, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

KrishnaShuk commented Jan 22, 2026

Uh oh!

Anshgrover23 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sonarqubecloud bot commented Jan 23, 2026

Quality Gate passed

KrishnaShuk commented Jan 20, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 20, 2026 •

edited

Loading