-
Notifications
You must be signed in to change notification settings - Fork 0
chore: implement recommended claude code fixes #168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: mainline
Are you sure you want to change the base?
Conversation
07de8be to
3e2dae0
Compare
| IndexName="UserCollectionsIndex", | ||
| KeyConditionExpression="user_id = :user_id", | ||
| ExpressionAttributeValues={":user_id": user_id}, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Missing pagination handling
This query doesn't handle pagination. DynamoDB query operations can return a maximum of 1 MB of data per call. If a user has many collections, the response may be paginated and include a LastEvaluatedKey.
| ) | |
| # Query GSI for all collections for this user | |
| response = self.table.query( | |
| IndexName="UserCollectionsIndex", | |
| KeyConditionExpression="user_id = :user_id", | |
| ExpressionAttributeValues={":user_id": user_id}, | |
| ) | |
| for item in response.get("Items", []): | |
| collection = CollectionData.from_dict(item) | |
| collections.append(collection) | |
| # Handle pagination | |
| while "LastEvaluatedKey" in response: | |
| response = self.table.query( | |
| IndexName="UserCollectionsIndex", | |
| KeyConditionExpression="user_id = :user_id", | |
| ExpressionAttributeValues={":user_id": user_id}, | |
| ExclusiveStartKey=response["LastEvaluatedKey"], | |
| ) | |
| for item in response.get("Items", []): | |
| collection = CollectionData.from_dict(item) | |
| collections.append(collection) |
| name: "name", | ||
| type: AttributeType.STRING, | ||
| }, | ||
| projectionType: ProjectionType.ALL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance: ProjectionType.ALL may be expensive
Using ProjectionType.ALL projects all attributes into the GSI, which doubles the storage cost for collection metadata items. Since list_collections only needs name, modified, count, and total_bytes, consider using ProjectionType.INCLUDE with specific attributes:
| projectionType: ProjectionType.ALL, | |
| projectionType: ProjectionType.INCLUDE, | |
| nonKeyAttributes: ["modified", "count", "total_bytes"], |
This reduces GSI storage costs and improves query performance.
3e2dae0 to
0faa454
Compare
PR Review SummaryThis PR implements several configuration improvements and a significant performance optimization. Overall the changes look good with strong test coverage, but there are a few issues that need attention. ✅ Strengths
|
| ":metadata": self._metadata_sk(), | ||
| }, | ||
| # Query GSI for all collections for this user | ||
| response = self.table.query( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical Bug: This query will return ALL items for the user, not just collection metadata!
The GSI only filters by user_id, so it will return:
- Collection metadata (SK=METADATA) ✅
- All BSO objects (SK=OBJECT#...) ❌
- Any other user data ❌
Solution: Add a filter expression:
response = self.table.query(
IndexName="UserCollectionsIndex",
KeyConditionExpression="user_id = :user_id",
FilterExpression="SK = :metadata",
ExpressionAttributeValues={
":user_id": user_id,
":metadata": self._metadata_sk()
},
)Note: FilterExpression is applied after the query, so you still get the GSI performance benefit but filter out non-metadata items.
|
|
||
| # Handle pagination if there are more results | ||
| while "LastEvaluatedKey" in response: | ||
| response = self.table.query( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same FilterExpression needed here for pagination queries.
| name: "user_id", | ||
| type: AttributeType.STRING, | ||
| }, | ||
| sortKey: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Design Question: The GSI defines name as the sort key, but the Python query doesn't use it in the KeyConditionExpression.
This has implications:
- Storage: DynamoDB will store the sort key value even if not queried
- Query efficiency: Not leveraging the sort key for ordering
- Potential waste: If
nameisn't needed for queries, consider removing it from the GSI to save on storage costs
Options:
- Use the sort key in queries:
KeyConditionExpression="user_id = :user_id AND begins_with(#name, :prefix)" - Or remove sort key from GSI if not needed: only define partition key
user_id
What's the intended use case?
| name: "name", | ||
| type: AttributeType.STRING, | ||
| }, | ||
| projectionType: ProjectionType.ALL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cost Consideration: ProjectionType.ALL projects all table attributes into the GSI, which doubles your storage costs for collection metadata items.
If queries only need name, modified, count, and usage, consider using ProjectionType.INCLUDE with specific attributes:
projectionType: ProjectionType.INCLUDE,
nonKeyAttributes: ["modified", "count", "usage"],This reduces storage costs while maintaining query performance. Only use ALL if you truly need every attribute.
| collections = storage_manager.list_collections("test-user-123") | ||
| assert collections == [] | ||
|
|
||
| def test_list_collections_with_pagination( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test Update Needed: Once you add the FilterExpression to fix the critical bug, these test stubs will need to be updated to include the filter parameters:
{
"TableName": storage_table_name,
"IndexName": "UserCollectionsIndex",
"KeyConditionExpression": "user_id = :user_id",
"FilterExpression": "SK = :metadata",
"ExpressionAttributeValues": {
":user_id": "test-user-123",
":metadata": "METADATA"
},
}
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||||||||||||||||||||||||||
Action ItemsBefore merging, please address: 🔴 Must Fix (Critical)
🟡 Should Consider
✅ What Works Well
Let me know if you need help with any of these issues! |
No description provided.