perf: use GROUP_CONCAT for efficient tag aggregation in db_get_all_im…#1115
perf: use GROUP_CONCAT for efficient tag aggregation in db_get_all_im…#1115moksha-hub wants to merge 2 commits intoAOSSIE-Org:mainfrom
Conversation
…ages Fixes AOSSIE-Org#1109 Refactored db_get_all_images() to use SQL GROUP_CONCAT to aggregate tags at the database layer instead of iterating in Python. Changes: - SQL query now uses GROUP_CONCAT(DISTINCT m.name) to aggregate tags - Added GROUP BY i.id to return one row per image - Simplified Python code to directly build image list (no more images_dict) - Also moved import to global scope (addresses AOSSIE-Org#1108) Performance: SQL-level aggregation via C-based SQLite is significantly faster than Python iteration when processing 100k+ image rows.
📝 WalkthroughWalkthroughReplaced Python-side per-row tag grouping with SQL aggregation using Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/database/images.py (1)
141-183:⚠️ Potential issue | 🟠 MajorAddress tag splitting and ordering issues in GROUP_CONCAT aggregation.
Tag names are not constrained to exclude commas, making the current
split(",")unsafe. Additionally,GROUP_CONCATwithoutORDER BYproduces non-deterministic ordering.Consider using a safe separator and explicit ordering:
Proposed fix (safe separator + stable ordering)
+# Separator unlikely to appear in tag names +TAG_SEPARATOR = "\x1f" - query = """ + query = f""" SELECT i.id, i.path, i.folder_id, i.thumbnailPath, i.metadata, i.isTagged, i.isFavourite, - GROUP_CONCAT(DISTINCT m.name) as tags + GROUP_CONCAT(DISTINCT m.name, '{TAG_SEPARATOR}') as tags FROM images i LEFT JOIN image_classes ic ON i.id = ic.image_id LEFT JOIN mappings m ON ic.class_id = m.class_id """ @@ - tags = tags_str.split(",") if tags_str else None + tags = tags_str.split(TAG_SEPARATOR) if tags_str else [] + tags.sort()SQLite supports both
GROUP_CONCAT(expr, separator)andGROUP_CONCAT(expr, sep ORDER BY col)syntax natively.
🤖 Fix all issues with AI agents
In `@backend/app/database/images.py`:
- Around line 187-190: The code is stringifying folder_id unconditionally which
produces "None" for NULL DB values; update the image dictionary construction
(the block where "folder_id": str(folder_id)) to guard against NULL by returning
None when folder_id is None (e.g., use a conditional that yields None if
folder_id is None otherwise str(folder_id)) so behavior matches
db_get_untagged_images; locate this change in the function building the image
dict (the same scope that references path, thumbnail_path, metadata_dict) and
replace the unconditional str(folder_id) with the guarded expression.
|
@moksha-hub first join the discord channel of this repo there you will come to know PRs to those issues which is not yet assigned will not be merged , see recent messages here Pictopy discord |
…ages
Fixes #1109
Refactored db_get_all_images() to use SQL GROUP_CONCAT to aggregate tags at the database layer instead of iterating in Python.
Changes:
Performance: SQL-level aggregation via C-based SQLite is significantly faster than Python iteration when processing 100k+ image rows.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.