
[tests, cuda] test: strengthen VRAM target and release assertions for keep sessions #64

Merged
Wangmerlyn merged 4 commits into main from tests/vram-assertions
Feb 15, 2026

Conversation

Owner

@Wangmerlyn Wangmerlyn commented Feb 15, 2026

Summary by CodeRabbit

  • Tests
    • Enhanced CUDA context manager tests to use dynamic polling for VRAM targets instead of fixed sleeps, improving reliability.
    • Added a new test verifying allocation toward a VRAM target and proper release back to baseline with timeout-based checks.
    • Introduced a reusable polling utility for wait-until semantics.
    • Improved diagnostics, tolerances, and descriptive failure messages for memory checks.

Contributor

coderabbitai bot commented Feb 15, 2026

📝 Walkthrough


Replaces fixed sleeps in CUDA controller tests with dynamic, timeout-bounded polling; adds a reusable wait_until helper; introduces pre-checks for available VRAM and delta-based monitoring of memory_allocated/memory_reserved during keep/release scenarios; adds detailed assertions and error messages.

Changes

  • Context manager test (tests/cuda_controller/test_context_manager.py): Increases the target VRAM, calls torch.cuda.empty_cache()/torch.cuda.synchronize(), adds pre-checks for free VRAM, replaces the fixed sleep with delta-based polling loops that monitor memory_allocated and memory_reserved, records before/after values, asserts attainment and release within tolerances, and adds descriptive error messages and a docstring.
  • Keep/release tests (tests/cuda_controller/test_keep_and_release.py): Adds a new test, test_cuda_controller_respects_vram_target_during_keep, that uses polling to verify allocation toward a VRAM target during keep() and release back to baseline; adjusts imports.
  • Test utility (tests/polling.py): Adds a wait_until(predicate, timeout_s=3.0, interval_s=0.05) helper used by the tests to poll a boolean predicate until success or timeout.
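
For orientation, a minimal sketch of what this wait_until helper plausibly looks like, reconstructed from the signature above and the diff hunks quoted later in this thread (the merged version polls with time.time(); a review nit below suggests time.monotonic()):

import time


def wait_until(predicate, timeout_s: float = 3.0, interval_s: float = 0.05) -> bool:
    """Poll `predicate` every `interval_s` seconds until it returns True or `timeout_s` elapses."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if predicate():
            return True
        time.sleep(interval_s)
    return False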

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I nudge the GPU, I count each byte,
I wait, I watch, till usage looks right.
No lazy sleeps — I hop and I poll,
Memory rises, then falls to its hole.
Hooray for tests that keep things tight!

🚥 Pre-merge checks: 3 passed, 1 failed
❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 50.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title accurately and specifically describes the main change: strengthening VRAM target and release assertions for keep sessions in the test suite.
  • Merge Conflict Detection (✅ Passed): No merge conflicts detected when merging into main.



No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
tests/polling.py (1)

1-16: LGTM — clean reusable polling helper.

Tiny nit: time.monotonic() is preferred over time.time() for measuring elapsed durations, as it's immune to system clock adjustments. Not critical for tests, but it's a good habit.

♻️ Use monotonic clock
-    deadline = time.time() + timeout_s
-    while time.time() < deadline:
+    deadline = time.monotonic() + timeout_s
+    while time.monotonic() < deadline:


@gemini-code-assist
Contributor

Summary of Changes

Hello @Wangmerlyn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly strengthens the testing of CUDA GPU VRAM management. It refactors an existing test to include more precise assertions for VRAM allocation and deallocation during context manager usage and introduces a new dedicated test to verify that the keep() and release() methods of the CudaGPUController correctly manage VRAM targets. These changes improve confidence in the CudaGPUController's VRAM management by verifying that VRAM is accurately reserved and freed.

Highlights

  • Enhanced VRAM Target Assertions: The test_cuda_controller_context_manager has been significantly refactored to include more rigorous and dynamic assertions for VRAM allocation and deallocation, ensuring that the controller accurately manages VRAM during its lifecycle.
  • New VRAM Keep and Release Test: A new test, test_cuda_controller_respects_vram_target_during_keep, was introduced to specifically validate that the keep() and release() methods of the CudaGPUController correctly consume and free VRAM according to the specified targets.
  • Asynchronous Test Utility: A new helper function, _wait_until, was added to facilitate robust asynchronous checks within the tests, allowing for more reliable assertions on VRAM state changes over time.


Changelog
  • tests/cuda_controller/test_context_manager.py
    • Refactored test_cuda_controller_context_manager to include detailed assertions for VRAM allocation and release.
    • Increased vram_to_keep from '8MB' to '64MB' for a more robust test.
    • Added checks for insufficient free VRAM and dynamic waiting for VRAM targets to be met and released.
  • tests/cuda_controller/test_keep_and_release.py
    • Added a new helper function _wait_until to facilitate asynchronous checks in tests.
    • Introduced test_cuda_controller_respects_vram_target_during_keep to specifically test the keep() and release() methods' adherence to VRAM targets.
Activity
  • Initial commit of the changes.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9692d7111e


alloc_tolerance = 8 * 1024 * 1024
reserve_tolerance = 16 * 1024 * 1024

ctrl.keep()


P1: Always release controller when post-keep assertions fail

test_cuda_controller_respects_vram_target_during_keep starts the background keep loop with ctrl.keep(), but cleanup is only done at the end of the happy path. If any assertion before ctrl.release() fails (for example the 3s target wait timing out on a busy CI GPU), the daemon thread keeps running and continues holding VRAM, which can cascade into unrelated CUDA test failures in the same pytest process. Use try/finally (or with ctrl:) so release is guaranteed on failure paths.
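
A minimal sketch of the failure-safe shape Codex is suggesting; wait_until and vram_target_reached are placeholders here, not the test's actual helpers:

ctrl.keep()
try:
    # placeholder assertion standing in for the test's real polling checks
    assert wait_until(lambda: vram_target_reached(ctrl)), "VRAM target not reached in time"
finally:
    # release always runs, so the background keep loop cannot keep holding
    # VRAM into later tests when an assertion above fails
    ctrl.release()

Alternatively, with ctrl: gives the same guarantee if the controller's context manager releases on exit, as the context-manager test suggests.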


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request significantly strengthens the tests for VRAM management by adding detailed assertions for memory allocation and release, which is a great improvement. My review includes two suggestions to enhance the consistency and reduce code duplication within the newly added test logic. One suggestion is to refactor the duplicated polling logic into a shared helper function, and the other is to make the wait conditions across the tests more consistent and robust.

Comment on lines 43 to 60
while time.time() < deadline:
alloc_delta = max(
0, torch.cuda.memory_allocated(ctrl.rank) - before_allocated
)
reserved_delta = max(
0, torch.cuda.memory_reserved(ctrl.rank) - before_reserved
)
peak_alloc_delta = max(peak_alloc_delta, alloc_delta)
peak_reserved_delta = max(peak_reserved_delta, reserved_delta)

# allocated should track payload; reserved may be larger due to allocator blocks
if (
alloc_delta >= int(target_bytes * 0.95)
and reserved_delta >= alloc_delta
):
reached_target = True
break
time.sleep(0.05)


Severity: medium

This polling loop is very similar to the release-checking loop below (lines 69-81). Furthermore, in test_keep_and_release.py, you've introduced a _wait_until helper function which abstracts this exact polling pattern.

To improve consistency and reduce code duplication across the test suite, consider refactoring this test to also use a polling helper. You could move _wait_until to a shared test utility file and import it in both test files.

This would make the test logic cleaner and easier to maintain.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/cuda_controller/test_context_manager.py`:
- Around lines 67-85: Initialize alloc_delta_after and reserved_delta_after to safe default values (e.g., 0) before entering the release-polling loop (the one gated on release_deadline that reads torch.cuda.memory_allocated(ctrl.rank) and torch.cuda.memory_reserved(ctrl.rank)), so both names are always defined for the final assertion. Let the loop update them as it currently does and keep the final assert that references alloc_delta_after and reserved_delta_after unchanged.
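
A sketch of that fix using the names quoted above; the loop's exit condition is a guess at the existing release check, not the file's exact code:

alloc_delta_after = 0  # safe defaults so the final assertion is always defined,
reserved_delta_after = 0  # even if the loop body never runs before the deadline
while time.time() < release_deadline:
    alloc_delta_after = max(
        0, torch.cuda.memory_allocated(ctrl.rank) - before_allocated
    )
    reserved_delta_after = max(
        0, torch.cuda.memory_reserved(ctrl.rank) - before_reserved
    )
    # assumed exit condition: memory has dropped back within the tolerances
    if alloc_delta_after <= alloc_tolerance and reserved_delta_after <= reserve_tolerance:
        break
    time.sleep(0.05)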
🧹 Nitpick comments (4)
tests/cuda_controller/test_keep_and_release.py (3)

8-14: Clean polling helper – consider using time.monotonic().

time.time() can jump on NTP adjustments; time.monotonic() is the idiomatic choice for measuring elapsed durations. Unlikely to matter at 3 s scale, but worth aligning with best practice.

♻️ Suggested tweak
 def _wait_until(predicate, timeout_s: float = 3.0, interval_s: float = 0.05) -> bool:
-    deadline = time.time() + timeout_s
-    while time.time() < deadline:
+    deadline = time.monotonic() + timeout_s
+    while time.monotonic() < deadline:
         if predicate():
             return True
         time.sleep(interval_s)
     return False

47-50: Consider adding a large_memory marker or documenting the expected footprint.

The coding guidelines mention a large_memory pytest marker for opt-in heavy VRAM tests. At 32 MB this test is lightweight, but if the project convention is to mark any test that allocates real VRAM, it may be worth adding. A brief comment documenting the expected allocation (~32 MB) would also help future maintainers gauge CI requirements.

As per coding guidelines: "Respect existing pytest markers: rocm for ROCm-only tests, large_memory for opt-in heavy VRAM tests."


85-88: Redundant re-assertion after _wait_until already confirmed the condition.

Lines 87–88 re-check the same >= 0.95 * target_bytes condition that _wait_until already verified (and reached was asserted True on line 83). The allocation could theoretically decrease between the poll and these lines if the keep-loop is cycling, making this a slightly stricter "still true right now" check. If that's intentional, a short comment would clarify the intent; otherwise these lines are just noise.

tests/cuda_controller/test_context_manager.py (1)

39-65: Consider extracting the keep-phase polling into a shared helper.

This inline polling loop duplicates the spirit of _wait_until from test_keep_and_release.py but adds peak-delta tracking. A small shared helper (e.g., in a tests/cuda_controller/conftest.py or a _helpers.py module) accepting an optional on_sample callback could serve both files, reducing copy-paste drift as tolerances or timeouts evolve.
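
One possible shape for such a shared helper, sketched under the assumption that an optional per-iteration callback is enough to cover the peak-delta tracking; the name and placement are for the maintainers to decide:

import time
from typing import Callable, Optional


def wait_until(
    predicate: Callable[[], bool],
    timeout_s: float = 3.0,
    interval_s: float = 0.05,
    on_sample: Optional[Callable[[], None]] = None,
) -> bool:
    """Poll `predicate` until True or timeout; call `on_sample` each iteration (e.g. to record peak deltas)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if on_sample is not None:
            on_sample()
        if predicate():
            return True
        time.sleep(interval_s)
    return False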

Wangmerlyn and others added 3 commits February 15, 2026 22:07
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@Wangmerlyn Wangmerlyn merged commit 6a39098 into main Feb 15, 2026
5 checks passed