refactor: replace custom DynamoDB checkpointer with langgraph-checkpoint-aws#45
📝 Walkthrough

Replaces the in-repo DynamoDB+S3 saver with a lightweight wrapper around `langgraph_checkpoint_aws.DynamoDBSaver`, updates docs and optional dependencies, adds an async script to clear AWS resources, converts tests to moto-backed fixtures, and adds an IAM admin policy in Terraform.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant Wrapper as AsyncDynamoDBSaver<br/>(wrapper)
    participant Lib as langgraph_checkpoint_aws.DynamoDBSaver
    participant Dynamo as DynamoDB
    participant S3 as S3
    App->>Wrapper: call aput / aget / alist / other ops
    Wrapper->>Lib: delegate operation (pass-through)
    alt metadata or small payload
        Lib->>Dynamo: Put/Get/Query metadata
        Dynamo-->>Lib: response
    else large payload offloaded
        Lib->>S3: Put/Get/Delete object
        S3-->>Lib: response
    end
    Lib-->>Wrapper: result
    Wrapper-->>App: return result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 4
🧹 Nitpick comments (2)
src/store/dynamodb.py (1)
53-59: `setup()` advertises verification, but currently only logs readiness.

Either perform a real connectivity/access check here, or tighten the docstring so it does not imply validation that is not happening.
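As a sketch of the first option, a `setup()` that actually verifies access could call `describe_table` on the DynamoDB client and fail loudly. The class and attribute names below (`SaverSetupMixin`, an injected `client`) are illustrative stand-ins, not the wrapper's real API:

```python
import logging

logger = logging.getLogger(__name__)


class SaverSetupMixin:
    """Illustrative mixin: verify table access instead of only logging."""

    def __init__(self, table_name, client):
        self.table_name = table_name
        self._client = client  # boto3-style DynamoDB client (injected)

    def setup(self):
        """Verify the DynamoDB table is reachable before declaring readiness."""
        try:
            self._client.describe_table(TableName=self.table_name)
        except Exception as exc:
            logger.error("DynamoDB table %s not reachable: %s", self.table_name, exc)
            raise
        logger.info("DynamoDB saver ready (table=%s)", self.table_name)
```

Injecting the client keeps the check testable with a fake; with the real boto3 client, `describe_table` surfaces both missing tables and missing IAM permissions.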
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/store/dynamodb.py` around lines 53 - 59, The setup() docstring claims verification but only logs readiness; either implement an actual connectivity check (for example, call AWS DynamoDB DescribeTable/describe_table using the existing DynamoDB client/resource stored on the instance—refer to self.table_name and any client like self._dynamodb_client or self._resource—and raise/log an error on failure) or change the setup() docstring to state it only logs readiness without performing validation and keep the single logger.info call unchanged; pick one approach and update the code/comments consistently.

tests/test_dynamodb_saver.py (1)
62-90: Consider consolidating separate `mock_aws()` contexts into a single fixture for both DynamoDB and S3.

The `mock_dynamodb` and `mock_s3` fixtures each use independent `mock_aws()` contexts. When both are used as dependencies in the `saver` fixture, they run in parallel scopes, which can lead to separate moto backend instances. Consolidating both services into a single `with mock_aws():` context ensures they share the same mock backend, simplifying state management and test isolation.

Proposed refactor
```diff
-@pytest.fixture()
-def mock_dynamodb(aws_credentials):
-    with mock_aws():
-        dynamodb = boto3.resource("dynamodb", region_name=REGION)
-        dynamodb.create_table(
-            TableName=TABLE_NAME,
-            KeySchema=[
-                {"AttributeName": "PK", "KeyType": "HASH"},
-                {"AttributeName": "SK", "KeyType": "RANGE"},
-            ],
-            AttributeDefinitions=[
-                {"AttributeName": "PK", "AttributeType": "S"},
-                {"AttributeName": "SK", "AttributeType": "S"},
-            ],
-            BillingMode="PAY_PER_REQUEST",
-        )
-        yield dynamodb
-
-
-@pytest.fixture()
-def mock_s3(aws_credentials):
-    with mock_aws():
-        s3 = boto3.client("s3", region_name=REGION)
-        s3.create_bucket(Bucket=BUCKET_NAME)
-        yield s3
+@pytest.fixture()
+def mock_aws_resources(aws_credentials):
+    with mock_aws():
+        dynamodb = boto3.resource("dynamodb", region_name=REGION)
+        dynamodb.create_table(
+            TableName=TABLE_NAME,
+            KeySchema=[
+                {"AttributeName": "PK", "KeyType": "HASH"},
+                {"AttributeName": "SK", "KeyType": "RANGE"},
+            ],
+            AttributeDefinitions=[
+                {"AttributeName": "PK", "AttributeType": "S"},
+                {"AttributeName": "SK", "AttributeType": "S"},
+            ],
+            BillingMode="PAY_PER_REQUEST",
+        )
+        s3 = boto3.client("s3", region_name=REGION)
+        s3.create_bucket(Bucket=BUCKET_NAME)
+        yield {"dynamodb": dynamodb, "s3": s3}


 @pytest.fixture()
-async def saver(mock_dynamodb, mock_s3):
+async def saver(mock_aws_resources):
     """Create a saver wired to moto-mocked AWS."""
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_dynamodb_saver.py` around lines 62 - 90, The two fixtures mock_dynamodb and mock_s3 each enter their own mock_aws() context causing separate moto backends; change them to share a single mock_aws() context by creating a new combined fixture (or modify saver) that calls with mock_aws(): then instantiates both boto3.resource("dynamodb", ...) and boto3.client("s3", ...) (use TABLE_NAME/BUCKET_NAME and create_table/create_bucket as before) and yields both resources to tests; update saver to depend on that combined fixture instead of mock_dynamodb and mock_s3 so DynamoDB and S3 share the same moto backend.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/aws-store.md`:
- Line 24: The class reference uses the wrong module name; replace the
hyphenated reference "langgraph-checkpoint-aws.DynamoDBSaver" with the correct
Python import path "langgraph_checkpoint_aws.DynamoDBSaver" so the docs point to
the actual Python module and class (update the occurrence of DynamoDBSaver in
the docs/aws-store.md line that contains the incorrect module string).
In `@scripts/clear_aws_store.py`:
- Around line 1-8: Add the required future-annotations import to the top of the
module by inserting "from __future__ import annotations" immediately after the
shebang and before the module docstring; ensure it appears before any other
imports or code in this file (the module-level docstring and the imports of
asyncio, os, sys, Path will follow it).
- Around line 16-20: The load_env docstring says it loads .env from the project
root but the code uses Path(".env") which checks the current working directory;
update the env_path computation to locate the repository root relative to the
script file (e.g. use Path(__file__).resolve().parent.parent or
Path(__file__).resolve().parents[1]) and join ".env" so env_path points to the
project root .env; adjust any tests or callers of load_env if they rely on cwd
behavior.
In `@src/store/dynamodb.py`:
- Around line 37-50: The from_config classmethod currently ignores
store_config.s3_prefix; update it to either include the prefix in the
s3_offload_config or, until prefix support is implemented, fail-fast: inside
from_config check store_config.s3_prefix (or its default/empty state) and if a
non-empty/non-default prefix is provided raise a clear ValueError indicating
s3_prefix is not yet supported; adjust the s3_offload_config logic (the
s3_offload_config dict assembly and the saver instantiation) to reflect this
guard so users cannot silently change object-key layout.
---
Nitpick comments:
In `@src/store/dynamodb.py`:
- Around line 53-59: The setup() docstring claims verification but only logs
readiness; either implement an actual connectivity check (for example, call AWS
DynamoDB DescribeTable/describe_table using the existing DynamoDB
client/resource stored on the instance—refer to self.table_name and any client
like self._dynamodb_client or self._resource—and raise/log an error on failure)
or change the setup() docstring to state it only logs readiness without
performing validation and keep the single logger.info call unchanged; pick one
approach and update the code/comments consistently.
In `@tests/test_dynamodb_saver.py`:
- Around line 62-90: The two fixtures mock_dynamodb and mock_s3 each enter their
own mock_aws() context causing separate moto backends; change them to share a
single mock_aws() context by creating a new combined fixture (or modify saver)
that calls with mock_aws(): then instantiates both boto3.resource("dynamodb",
...) and boto3.client("s3", ...) (use TABLE_NAME/BUCKET_NAME and
create_table/create_bucket as before) and yields both resources to tests; update
saver to depend on that combined fixture instead of mock_dynamodb and mock_s3 so
DynamoDB and S3 share the same moto backend.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (6)
- `docs/aws-store.md`
- `pyproject.toml`
- `scripts/clear_aws_store.py`
- `src/store/dynamodb.py`
- `terraform/main.tf`
- `tests/test_dynamodb_saver.py`
```markdown
- Single-table DynamoDB design with composite sort keys (`cp#...` for checkpoints, `wr#...` for writes)
- **DynamoDB** stores checkpoint metadata, intermediate writes, and small checkpoint payloads.
- **S3** stores larger checkpoint payloads (typically >350KB) that exceed DynamoDB's efficient item size limits.
- Unified storage design managed by `langgraph-checkpoint-aws.DynamoDBSaver`.
```
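The size-based split described in those doc lines can be sketched as a simple routing decision. The threshold and function name below are illustrative; the actual cutoff and key layout are internal to `langgraph-checkpoint-aws`:

```python
# Hypothetical sketch of size-based payload routing between DynamoDB and S3.
# The threshold value is an assumption for illustration, not the library's.
OFFLOAD_THRESHOLD_BYTES = 350 * 1024  # DynamoDB items cap at 400KB; leave headroom


def choose_backend(payload: bytes) -> str:
    """Return which store a checkpoint payload of this size would land in."""
    return "s3" if len(payload) > OFFLOAD_THRESHOLD_BYTES else "dynamodb"
```

Small metadata and writes stay in DynamoDB for cheap queries, while oversized blobs go to S3 where item-size limits do not apply.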
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify module-path usage in docs vs code
rg -n "langgraph-checkpoint-aws\.DynamoDBSaver|langgraph_checkpoint_aws\.DynamoDBSaver|from langgraph_checkpoint_aws import DynamoDBSaver" docs/aws-store.md src/store/dynamodb.py -C2
```

Repository: pokryfka/coding-agent

Length of output: 1069
Use the Python module path in this class reference.
Line 24 currently shows `langgraph-checkpoint-aws.DynamoDBSaver`, but the Python import path is `langgraph_checkpoint_aws.DynamoDBSaver` (with underscores, not hyphens).
Proposed fix

```diff
-- Unified storage design managed by `langgraph-checkpoint-aws.DynamoDBSaver`.
+- Unified storage design managed by `langgraph_checkpoint_aws.DynamoDBSaver`.
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```markdown
- Unified storage design managed by `langgraph_checkpoint_aws.DynamoDBSaver`.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/aws-store.md` at line 24, The class reference uses the wrong module
name; replace the hyphenated reference "langgraph-checkpoint-aws.DynamoDBSaver"
with the correct Python import path "langgraph_checkpoint_aws.DynamoDBSaver" so
the docs point to the actual Python module and class (update the occurrence of
DynamoDBSaver in the docs/aws-store.md line that contains the incorrect module
string).
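The hyphen/underscore mismatch reflects a general PyPI convention: distribution names may contain hyphens, but importable module names use underscores. A tiny illustrative helper (the mapping is a common convention, not a guarantee for every package, which may expose any top-level module it likes):

```python
def dist_to_module_name(dist_name: str) -> str:
    """Guess the importable module name from a PyPI distribution name.

    Many packages follow this convention (e.g. langgraph-checkpoint-aws
    installs the module langgraph_checkpoint_aws), but a distribution is
    free to expose a differently named top-level module.
    """
    return dist_name.replace("-", "_").lower()
```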
| """Load .env file from the project root.""" | ||
| env_path = Path(".env") | ||
| if env_path.exists(): | ||
| print(f"Loading environment from {env_path}") | ||
| with open(env_path) as f: |
load_env docstring says project root, but Line 17 reads current working directory.
If this script is launched outside the repository root, the .env lookup will miss the intended file.
Proposed fix

```diff
 def load_env():
     """Load .env file from the project root."""
-    env_path = Path(".env")
+    env_path = Path(__file__).resolve().parents[1] / ".env"
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| """Load .env file from the project root.""" | |
| env_path = Path(".env") | |
| if env_path.exists(): | |
| print(f"Loading environment from {env_path}") | |
| with open(env_path) as f: | |
| def load_env(): | |
| """Load .env file from the project root.""" | |
| env_path = Path(__file__).resolve().parents[1] / ".env" | |
| if env_path.exists(): | |
| print(f"Loading environment from {env_path}") | |
| with open(env_path) as f: |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/clear_aws_store.py` around lines 16 - 20, The load_env docstring says
it loads .env from the project root but the code uses Path(".env") which checks
the current working directory; update the env_path computation to locate the
repository root relative to the script file (e.g. use
Path(__file__).resolve().parent.parent or Path(__file__).resolve().parents[1])
and join ".env" so env_path points to the project root .env; adjust any tests or
callers of load_env if they rely on cwd behavior.
```diff
         s3_offload_config = None
         if store_config.s3_bucket:
             s3_offload_config = {
                 "bucket_name": store_config.s3_bucket,
             }
             if store_config.aws_endpoint_url:
                 s3_offload_config["endpoint_url"] = store_config.aws_endpoint_url

         session = aioboto3.Session(region_name=store_config.aws_region or None)
         saver = cls(
-            dynamodb_table_name=store_config.dynamodb_table_name,
-            s3_bucket=store_config.s3_bucket,
-            s3_prefix=store_config.s3_prefix,
+            table_name=store_config.dynamodb_table_name,
+            region_name=store_config.aws_region or None,
+            endpoint_url=store_config.aws_endpoint_url or None,
+            _session=session,
+            s3_offload_config=s3_offload_config,
         )
```
store_config.s3_prefix is silently ignored in from_config.
This makes a documented config field ineffective and can change object-key layout behavior without warning. At minimum, fail fast when a non-default prefix is set until it’s supported.
Proposed guard (fail-fast until support is implemented)

```diff
 async def from_config(cls, store_config: StoreConfig) -> AsyncIterator[AsyncDynamoDBSaver]:
     """Create from a StoreConfig."""
+    if store_config.s3_prefix and store_config.s3_prefix != "checkpoints/":
+        raise ValueError(
+            "store.s3_prefix is configured but AsyncDynamoDBSaver does not apply it yet."
+        )
+
     s3_offload_config = None
     if store_config.s3_bucket:
         s3_offload_config = {
             "bucket_name": store_config.s3_bucket,
         }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/store/dynamodb.py` around lines 37 - 50, The from_config classmethod
currently ignores store_config.s3_prefix; update it to either include the prefix
in the s3_offload_config or, until prefix support is implemented, fail-fast:
inside from_config check store_config.s3_prefix (or its default/empty state) and
if a non-empty/non-default prefix is provided raise a clear ValueError
indicating s3_prefix is not yet supported; adjust the s3_offload_config logic
(the s3_offload_config dict assembly and the saver instantiation) to reflect
this guard so users cannot silently change object-key layout.
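The fail-fast idea can be demonstrated standalone. `StoreConfig` below is a minimal stand-in dataclass, and the `"checkpoints/"` default mirrors the review's assumption about the field's default value:

```python
from dataclasses import dataclass


@dataclass
class StoreConfig:
    """Minimal stand-in for the project's real StoreConfig."""
    dynamodb_table_name: str
    s3_bucket: str = ""
    s3_prefix: str = "checkpoints/"  # assumed default value


def check_s3_prefix(store_config: StoreConfig) -> None:
    """Reject a customized prefix until the wrapper actually applies it."""
    if store_config.s3_prefix and store_config.s3_prefix != "checkpoints/":
        raise ValueError(
            "store.s3_prefix is configured but AsyncDynamoDBSaver does not apply it yet."
        )
```

With this guard, a default config passes silently, while a custom prefix raises immediately instead of silently writing objects under an unexpected key layout.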
🧹 Nitpick comments (3)
scripts/clear_aws_store.py (3)
97-100: Move `traceback` import to the top of the file.

Inline imports inside exception handlers add latency at the worst moment and make dependencies harder to track.
Proposed fix
```diff
 import asyncio
 import os
 import sys
+import traceback
 from pathlib import Path
```

Then remove the inline import:
```diff
 except Exception as e:
     print(f"Error emptying bucket {bucket_name}: {e}")
-    import traceback
     traceback.print_exc()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/clear_aws_store.py` around lines 97 - 100, The exception handler currently performs an inline import of traceback inside the except block that catches Exception as e (see the except Exception as e block referencing bucket_name and traceback.print_exc()); move the import statement for traceback to the module-level imports at the top of the file and remove the inline "import traceback" from the except block so the handler simply calls traceback.print_exc() and prints the error without performing an import at runtime.
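For reference, a module-level `traceback` import behaves identically inside the handler; `format_exc()` is used below so the captured text can be inspected (the bucket name and the raised error are placeholders, not the script's real failure):

```python
import traceback  # module-level, per the review suggestion

bucket_name = "example-bucket"  # placeholder value for illustration
try:
    raise RuntimeError("simulated S3 failure")
except Exception as e:
    print(f"Error emptying bucket {bucket_name}: {e}")
    tb = traceback.format_exc()  # same content traceback.print_exc() would emit
```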
110-112: Consider adding a confirmation prompt for destructive operations.

This script permanently deletes all items/objects without confirmation. While acceptable for scripted CI/CD use, an optional `--force` flag or interactive prompt would prevent accidental data loss during manual runs.
Verify each finding against the current code and only fix it if needed. In `@scripts/clear_aws_store.py` around lines 110 - 112, Add a confirmation step or a --force CLI flag to prevent accidental destructive runs: extend the script's argument parsing to accept a --force boolean (default false), and before any deletion logic that uses table_name and bucket_name (the block that currently prints the error and exits), if --force is not set prompt the user with a clear interactive confirmation (e.g., "Are you sure you want to permanently delete all items/objects in <table_name>/<bucket_name>? (yes/no)") and abort unless the user types an explicit affirmative; ensure this check runs immediately before calling the deletion routines that use table_name and bucket_name so automated CI can pass --force while manual runs require confirmation.
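One way to sketch the suggested guard with stdlib `argparse`. The flag name, prompt wording, and helper are the review's proposal, not existing script behavior; the `ask` parameter is injected so the prompt is testable:

```python
import argparse


def confirm_or_abort(force: bool, table_name: str, bucket_name: str, ask=input) -> bool:
    """Return True if deletion may proceed; prompt interactively unless forced."""
    if force:
        return True  # CI/CD path: --force skips the prompt
    answer = ask(
        f"Permanently delete all items/objects in {table_name}/{bucket_name}? (yes/no) "
    )
    return answer.strip().lower() == "yes"


parser = argparse.ArgumentParser(description="Clear AWS store resources")
parser.add_argument("--force", action="store_true", help="skip confirmation prompt")
args = parser.parse_args(["--force"])  # CI-style invocation
```

The deletion routines would then run only when `confirm_or_abort(args.force, table_name, bucket_name)` returns True.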
36-36: Add type hints to function signatures.

The async functions lack parameter and return type annotations. Adding hints improves readability and enables static analysis.
Proposed type hints
```diff
-async def empty_dynamodb(table_name, region, endpoint_url):
+async def empty_dynamodb(table_name: str, region: str, endpoint_url: str | None) -> None:
     """Scan and delete all items from a DynamoDB table."""
```

```diff
-async def empty_s3(bucket_name, region, endpoint_url):
+async def empty_s3(bucket_name: str, region: str, endpoint_url: str | None) -> None:
     """Delete all objects and versions from an S3 bucket."""
```

```diff
-async def main():
+async def main() -> None:
```

Also applies to: 71-71, 102-102
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/clear_aws_store.py` at line 36, The async function empty_dynamodb is missing type annotations—add parameter and return type hints (e.g., def empty_dynamodb(table_name: str, region: str, endpoint_url: Optional[str]) -> None) and import Optional from typing; apply the same pattern to the other async functions in this module (those declared later) by annotating their parameters with concrete types (str, Optional[str], bool, etc. as appropriate) and giving explicit return types (likely -> None or -> int), updating signatures and imports accordingly to enable static analysis.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/clear_aws_store.py`:
- Around line 97-100: The exception handler currently performs an inline import
of traceback inside the except block that catches Exception as e (see the except
Exception as e block referencing bucket_name and traceback.print_exc()); move
the import statement for traceback to the module-level imports at the top of the
file and remove the inline "import traceback" from the except block so the
handler simply calls traceback.print_exc() and prints the error without
performing an import at runtime.
- Around line 110-112: Add a confirmation step or a --force CLI flag to prevent
accidental destructive runs: extend the script's argument parsing to accept a
--force boolean (default false), and before any deletion logic that uses
table_name and bucket_name (the block that currently prints the error and
exits), if --force is not set prompt the user with a clear interactive
confirmation (e.g., "Are you sure you want to permanently delete all
items/objects in <table_name>/<bucket_name>? (yes/no)") and abort unless the
user types an explicit affirmative; ensure this check runs immediately before
calling the deletion routines that use table_name and bucket_name so automated
CI can pass --force while manual runs require confirmation.
- Line 36: The async function empty_dynamodb is missing type annotations—add
parameter and return type hints (e.g., def empty_dynamodb(table_name: str,
region: str, endpoint_url: Optional[str]) -> None) and import Optional from
typing; apply the same pattern to the other async functions in this module
(those declared later) by annotating their parameters with concrete types (str,
Optional[str], bool, etc. as appropriate) and giving explicit return types
(likely -> None or -> int), updating signatures and imports accordingly to
enable static analysis.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
scripts/clear_aws_store.py (1)
99-102: Move `import traceback` to the top of the module.

The inline import inside the exception handler works but is unconventional. Moving it to the module-level imports improves readability and follows Python conventions.
Proposed fix
Add at the top with other imports:
```python
import traceback
```

Then simplify the except block:
```diff
 except Exception as e:
     print(f"Error emptying bucket {bucket_name}: {e}")
-    import traceback
     traceback.print_exc()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/clear_aws_store.py` around lines 99 - 102, Move the inline import of traceback to the module-level imports and remove the in-except import; in the exception handler (the except Exception as e: block used when emptying the S3 bucket in scripts/clear_aws_store.py) call traceback.print_exc() and use the existing print/log statement without re-importing traceback so the module imports include "import traceback" at top and the except block simply logs the error and calls traceback.print_exc().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@scripts/clear_aws_store.py`:
- Around line 99-102: Move the inline import of traceback to the module-level
imports and remove the in-except import; in the exception handler (the except
Exception as e: block used when emptying the S3 bucket in
scripts/clear_aws_store.py) call traceback.print_exc() and use the existing
print/log statement without re-importing traceback so the module imports include
"import traceback" at top and the except block simply logs the error and calls
traceback.print_exc().
Summary by CodeRabbit
New Features
Documentation
Infrastructure
Tests
Chores