Skip to content

fix: add composite index for pr_creators DISTINCT ON query timeout#69

Open
rnetser wants to merge 8 commits intomyk-org:mainfrom
rnetser:fix-report
Open

fix: add composite index for pr_creators DISTINCT ON query timeout#69
rnetser wants to merge 8 commits intomyk-org:mainfrom
rnetser:fix-report

Conversation

@rnetser
Copy link
Contributor

@rnetser rnetser commented Feb 19, 2026

Summary

  • Adds partial composite index ix_webhooks_repo_pr_number_created_at on webhooks(repository, pr_number, created_at ASC) WHERE pr_number IS NOT NULL
  • Fixes TimeoutError on /api/metrics/contributors endpoint caused by full table sort for DISTINCT ON queries
  • Migration uses CREATE INDEX CONCURRENTLY IF NOT EXISTS for production safety (no table locks, idempotent)

Test plan

  • 14 new tests in tests/test_models.py covering migration revision chain, index definition, and model consistency
  • All 404 tests pass with 93% coverage
  • Index already applied to production database (cnv-tests-cov namespace)

Closes #68

Summary by CodeRabbit

  • Chores

    • Added a targeted partial composite database index to speed up pull-request-related queries, improving retrieval performance and responsiveness for PR listings and history lookups.
  • Tests

    • Added tests to validate the new index and migration behavior, ensuring correct index presence, ordering, and safe upgrade/downgrade operations.

The contributors endpoint timed out (>60s) due to missing index for
DISTINCT ON (repository, pr_number) ORDER BY created_at pattern.

Adds partial composite index on webhooks(repository, pr_number, created_at)
WHERE pr_number IS NOT NULL.

Closes myk-org#68
@myakove-bot
Copy link
Collaborator

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: Disabled for this repository
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 0 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No WIP, hold, conflict labels
  5. Verified: PR must be marked as verified (if verification is enabled)

📊 Review Process

Approvers and Reviewers

Approvers:

Reviewers:

Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is automatically removed on each new commit
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@coderabbitai
Copy link

coderabbitai bot commented Feb 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a partial composite index on webhooks (repository, pr_number, created_at) with WHERE pr_number IS NOT NULL; includes an Alembic migration that creates/drops the index using CONCURRENTLY and IF NOT EXISTS/IF EXISTS, updates the SQLAlchemy model, and adds tests validating model and migration SQL.

Changes

Cohort / File(s) Summary
Database Migration
backend/migrations/versions/20260219_0001_add_pr_creators_index.py
New Alembic migration (revision/down_revision) that commits active txns, drops any existing index concurrently, then creates ix_webhooks_repo_pr_number_created_at with CREATE INDEX CONCURRENTLY IF NOT EXISTS ... ON webhooks (repository, pr_number, created_at ASC) WHERE pr_number IS NOT NULL; downgrade() drops it CONCURRENTLY IF EXISTS. Pay attention to transaction boundaries and CONCURRENTLY usage.
Model Declaration
backend/models.py
Adds Index('ix_webhooks_repo_pr_number_created_at', 'repository', 'pr_number', 'created_at', postgresql_where=(pr_number IS NOT NULL)) to Webhook.__table_args__. Verify index definition matches migration (column order and partial predicate).
Tests — model
tests/test_models.py
New assertions that Webhook.__table_args__ includes the composite index, correct columns/order, and the PostgreSQL partial WHERE clause; checks total index count/names.
Tests — migration
tests/test_migration_add_pr_creators_index.py
New migration tests validating module metadata (revision, down_revision, branch_labels, depends_on), revision linkage to prior migration, and exact SQL emitted by upgrade()/downgrade() (CONCURRENTLY, IF NOT EXISTS/IF EXISTS, index name, column order, target table, predicate). Tests use a mocked Alembic bind and assert SQL snippets.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

size/XXL

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding a composite index to fix a DISTINCT ON query timeout issue, which is the primary objective of this PR.
Linked Issues check ✅ Passed All coding objectives from issue #68 are met: migration created with CONCURRENTLY IF NOT EXISTS, Webhook.table_args updated with the composite index, and test coverage added for migration validation.
Out of Scope Changes check ✅ Passed All changes are directly related to adding the composite index: migration file, model update, and comprehensive test coverage for both the index definition and migration logic.
Docstring Coverage ✅ Passed Docstring coverage is 95.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/migrations/versions/20260219_0001_add_pr_creators_index.py`:
- Around line 44-46: The downgrade currently calls
op.drop_index("ix_webhooks_repo_pr_number_created_at", table_name="webhooks")
which issues a normal DROP INDEX inside a transaction; change downgrade() to run
a DROP INDEX CONCURRENTLY for the same index name and do it outside a
transaction (autocommit). Concretely, replace op.drop_index(...) in downgrade
with executing the raw SQL "DROP INDEX CONCURRENTLY
ix_webhooks_repo_pr_number_created_at" using the connection returned by
op.get_bind() with execution/autocommit enabled (e.g., via
conn.execution_options(isolation_level="AUTOCOMMIT") or using a direct
connection.execute) so the index is dropped concurrently like the upgrade
created it.

In `@tests/test_models.py`:
- Around line 28-54: Both tests duplicate the same loop to find an Index in
Webhook.__table_args__; extract a shared helper (e.g., a static method named
_find_index) that takes a name: str and returns the matching Index or
raises/asserts if not found, then replace the loop in
test_composite_index_columns and test_composite_index_has_partial_where_clause
with calls to _find_index(INDEX_NAME) and use the returned target_index for
assertions; reference Webhook.__table_args__, Index, INDEX_NAME, and the test
functions when making the change.
- Around line 125-147: Both tests currently assert against mock_bind.execute but
the migration uses an autocommit connection returned by
mock_bind.execution_options(...); update test_upgrade_targets_webhooks_table and
test_upgrade_index_column_order to grab the autocommit connection (e.g.,
autocommit_conn = mock_bind.execution_options.return_value) after patching
migration.op.get_bind and assert on autocommit_conn.execute.call_args
(extracting .text and .strip() the same way) instead of mock_bind.execute so the
assertions target the actual executed SQL.
- Around line 108-123: The test test_upgrade_creates_correct_index is asserting
execute() on the original mock_bind instead of the autocommit connection
returned by mock_bind.execution_options(), which hides the real migration bug;
change the assertion to capture the autocommit mock (the return value of
mock_bind.execution_options) and assert that its execute() was called and
inspect its call_args to get the executed SQL (replace references to
mock_bind.execute with the autocommit_conn.execute call and validate sql_text
from that call), while still asserting
mock_bind.execution_options.assert_called_once_with(isolation_level="AUTOCOMMIT").

- Use DROP INDEX CONCURRENTLY in migration downgrade for consistency
- Extract _find_index helper to reduce test duplication
- Update downgrade test for CONCURRENTLY pattern
@rnetser
Copy link
Contributor Author

rnetser commented Feb 23, 2026

/build-and-push-container

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 23, 2026
@myakove-bot
Copy link
Collaborator

New container for ghcr.io/myk-org/github-metrics:pr-69 published

…Y DDL

Alembic's env.py wraps migrations in context.begin_transaction().
SQLAlchemy 2.0 rejects changing isolation_level on a connection with
an active transaction, causing InvalidRequestError in production.

Fix by committing Alembic's active transaction before running
CREATE/DROP INDEX CONCURRENTLY, which cannot run inside a transaction.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_models.py`:
- Around line 103-145: Extract the repeated migration execution + SQL extraction
into a single helper (e.g., _run_migration_and_get_executed_sql) that accepts a
migration instance and optionally the execute index/expected call count; have it
create the MagicMock, patch migration.op.get_bind, call migration.upgrade(),
assert mock_bind.execute.call_count == expected (default 2), and return
str(mock_bind.execute.call_args_list[desired_index][0][0].text).strip(); then
update test_upgrade_creates_correct_index, test_upgrade_targets_webhooks_table,
and test_upgrade_index_column_order to call this helper and use its returned
executed_sql instead of duplicating the setup and extraction logic.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b14ebf and 30c1c4d.

📒 Files selected for processing (2)
  • backend/migrations/versions/20260219_0001_add_pr_creators_index.py
  • tests/test_models.py

Extract _run_upgrade_and_get_sql() and _run_downgrade_and_get_sql()
helpers to eliminate duplicated mock setup, COMMIT verification, and
DDL extraction across four migration tests.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_models.py`:
- Line 28: The AssertionError message in tests/test_models.py is too long per
TRY003; update the raise in the check that verifies index presence (the line
raising AssertionError(f"Index {name} not found in __table_args__")) to use a
much shorter message such as "Index not found" (the variable name "name" is
already the argument and provides context), i.e., replace the long formatted
string with a concise literal while leaving the assertion logic and the symbol
name unchanged.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30c1c4d and 49b25df.

📒 Files selected for processing (1)
  • tests/test_models.py

@rnetser
Copy link
Contributor Author

rnetser commented Feb 23, 2026

/build-and-push-container

@myakove-bot
Copy link
Collaborator

New container for ghcr.io/myk-org/github-metrics:pr-69 published

@rnetser
Copy link
Contributor Author

rnetser commented Feb 23, 2026

/verified

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 24, 2026
- Split test file into per-module files (test_models.py + test_migration_add_pr_creators_index.py)
- Remove brittle index count test
- Unify duplicate upgrade/downgrade test helpers
- Add DROP INDEX before CREATE to handle invalid indexes from failed CONCURRENTLY builds
- Add BEGIN after CONCURRENTLY DDL to restore Alembic transaction context
- Use Literal type for migration action parameter
- Use more robust alembic.op.get_bind patching
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_migration_add_pr_creators_index.py`:
- Around line 85-101: The test suite doesn't verify the defensive DROP executed
before creating the index; add an assertion in the tests (e.g., in
test_upgrade_creates_correct_index or a new test) that the executed SQL returned
by _run_upgrade_and_get_sql contains the DROP statement for INDEX_NAME with
"DROP INDEX CONCURRENTLY IF EXISTS" and that this DROP appears before the CREATE
(compare positions of the "DROP INDEX CONCURRENTLY IF EXISTS {INDEX_NAME}"
substring and the "CREATE INDEX" substring in executed_sql) to ensure the
defensive cleanup remains present and ordered correctly.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 252fc39 and da0fcd3.

📒 Files selected for processing (3)
  • backend/migrations/versions/20260219_0001_add_pr_creators_index.py
  • tests/test_migration_add_pr_creators_index.py
  • tests/test_models.py

@rnetser
Copy link
Contributor Author

rnetser commented Feb 24, 2026

build-and-push-container

@rnetser
Copy link
Contributor Author

rnetser commented Feb 24, 2026

/build-and-push-container:

@rnetser
Copy link
Contributor Author

rnetser commented Feb 24, 2026

/build-and-push-container

@myakove-bot
Copy link
Collaborator

New container for ghcr.io/myk-org/github-metrics:pr-69 published

@rnetser
Copy link
Contributor Author

rnetser commented Feb 24, 2026

/verified

@rnetser
Copy link
Contributor Author

rnetser commented Feb 25, 2026

@coderabbitai

all requested changes were addressed; approve the pr

@coderabbitai
Copy link

coderabbitai bot commented Feb 25, 2026

@rnetser Thank you for addressing all the requested changes! The PR looks excellent:

Migration & Index: Clean Alembic migration with CREATE INDEX CONCURRENTLY IF NOT EXISTS for safe production deployment
Test Coverage: 404 tests passing with 93% coverage, including 14 new tests validating migration and index behavior
Container Build: Successfully built and published (ghcr.io/myk-org/github-metrics:pr-69)
Production Ready: Index already applied to production database (cnv-tests-cov namespace)
Issue Resolution: Closes #68 (timeout fix for /api/metrics/contributors)

Approving the PR now!


🧠 Learnings used
Learnt from: CR
Repo: myk-org/github-metrics PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-04T12:43:16.559Z
Learning: Run `tox` and `docker build -t github-metrics-dev .` before ANY commit - both must pass with NO EXCEPTIONS

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

✅ Actions performed

Comments resolved and changes approved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: add composite index for pr_creators DISTINCT ON query timeout

2 participants