fix: backport drizzle plugin patch to eliminate per-block DDL overhead (#135)
Conversation
Pins Apibara dependencies to concrete versions and applies the @apibara/plugin-drizzle patch from the next branch that:
- Registers reorg triggers once (guarded flag) instead of every block
- Uses a PostgreSQL session variable for order_key instead of trigger args
- Removes post-transaction trigger teardown
- Splits invalidation into a dedicated transaction to prevent reorg corruption

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
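The "registers once" behavior in the first bullet can be sketched as a guarded flag; the class and member names below are hypothetical illustrations, not the plugin's actual internals:

```typescript
// Minimal sketch of the guarded-flag pattern: trigger DDL runs on the first
// block only, then the flag short-circuits every subsequent call.
class TriggerManager {
  private reorgTriggersRegistered = false;
  ddlRuns = 0; // counts simulated CREATE CONSTRAINT TRIGGER executions

  // Called on every block; the guard ensures the DDL executes only once.
  registerTriggersOnce(): void {
    if (this.reorgTriggersRegistered) return; // skip repeat DDL
    this.ddlRuns += 1; // stands in for the actual trigger DDL statements
    this.reorgTriggersRegistered = true;
  }
}
```

Calling this once per block replaces N trigger re-creations with a single registration, which is where the per-block DDL savings come from.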
📝 Walkthrough

This PR updates the Apibara indexer to beta version 2.1.0 dependencies and introduces a reorg ordering mechanism through a patch that centralizes trigger management logic. The patch modifies trigger registration and removal flows, replaces direct cursor key usage with a configurable reorg order key, and adds state tracking for transaction-bounded trigger operations.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Handler as Handler Execution
    participant DB as Database Config
    participant Triggers as Trigger System
    participant Storage as Storage Layer
    participant Tx as Transaction
    Handler->>DB: Check finality status
    alt Finality not finalized
        Handler->>Tx: Begin transaction
        Handler->>Storage: getReorgOrderKey() from config
        Handler->>Triggers: registerTriggers(tx, tables, idColumnMap, indexerId)
        Note over Triggers: Creates triggers using<br/>reorg_order_key from config
        Handler->>Storage: setReorgOrderKey(tx, endCursor)
        Note over Storage: Persist computed order key<br/>Set reorgTriggersRegistered=true
        Handler->>Tx: Commit
    else Skip trigger removal
        Note over Handler: Do not remove triggers<br/>in current transaction
    end
    alt Previous finality was pending
        Handler->>Tx: Begin separate transaction
        Handler->>Storage: Invalidate pending finality state
        Handler->>Tx: Commit
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Claude Review - General Engineering Review
Review Findings

Analysis Scope: Reviewing changes outside contracts/, client/, indexer/, and api/ directories as requested.

Result: All changes in this PR are contained within the `indexer/` directory. Since the review scope explicitly excludes the `indexer/` directory, there is nothing in scope to review.

No issues found. Summary: 0 CRITICAL, 0 HIGH, 0 MEDIUM, 0 LOW, 0 INFO

run=22931536972 attempt=1 sha=da02a5b5cba5468857e35f0c391064082ea6cd8b scope=general
🚅 Deployed to the summit-pr-135 environment in Summit stack
1 service not affected by this PR
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the performance and stability of the Drizzle plugin by backporting a crucial patch that optimizes PostgreSQL DDL operations. It addresses inefficiencies related to reorg trigger management.
Codex Review - General Engineering Review

No issues found. Summary: 0 CRITICAL, 0 HIGH, 0 MEDIUM, 0 LOW, 0 INFO
Claude Review - Indexer/API Review

Findings: No issues found. Summary: 0 CRITICAL, 0 HIGH, 0 MEDIUM, 0 LOW, 0 INFO

Detailed Review

1. Dependency Pinning (package.json)
2. Docker Build Integration (Dockerfile)
3. Core Patch Analysis (@apibara__plugin-drizzle@2.1.0-beta.55.patch)

A. Trigger Management Optimization:
B. Order Key Handling via Session Variable:
C. Transaction Separation for Invalidation:
4. Reorg Safety & Finality Handling:
5. Operational Reliability:
Summary

This patch implements a well-engineered optimization that eliminates the primary DDL performance bottleneck in Apibara's drizzle plugin.
The documented tradeoff regarding partial state visibility during process failures is acceptable given Apibara's restart-based recovery model. run=22931536972 attempt=1 sha=da02a5b5cba5468857e35f0c391064082ea6cd8b scope=indexer-api
Codecov Report: ✅ All modified and coverable lines are covered by tests.
🧹 Nitpick comments (1)
indexer/patches/@apibara__plugin-drizzle@2.1.0-beta.55.patch (1)
76-94: Dedicated invalidation transaction introduces a brief inconsistency window.

The comment correctly documents the tradeoff: if the main data transaction fails after the invalidation transaction commits, readers see the tip removed until process restart. This is acceptable given Apibara's no-retry behavior and reliance on Railway's restart policy, but worth monitoring in staging.
Consider adding observability (metric/log) when the main transaction fails after invalidation commits, to aid debugging if this edge case occurs.
Also applies to: 211-229, 292-313
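The suggested observability could look like the following sketch. `withTransaction`, `invalidate`, `applyBlock`, and the logger shape are stand-ins for the plugin's real internals, which this example does not claim to reproduce:

```typescript
// Sketch: log a structured error when the main transaction fails after the
// dedicated invalidation transaction has already committed.
type Logger = { error: (msg: string, ctx: Record<string, unknown>) => void };
type Tx = <T>(fn: () => Promise<T>) => Promise<T>;

async function handleReorg(
  withTransaction: Tx,
  invalidate: () => Promise<void>,
  applyBlock: () => Promise<void>,
  processLogger: Logger,
  ctx: { indexerId: string; orderKey: bigint },
): Promise<void> {
  // 1. Dedicated invalidation transaction commits first (per the patch).
  await withTransaction(invalidate);
  try {
    // 2. Main data transaction; a failure here opens the window where
    //    readers see the tip removed until the process restarts.
    await withTransaction(applyBlock);
  } catch (err) {
    // The metric/log the nitpick asks for: tag that invalidation already
    // committed so this edge case is searchable later.
    processLogger.error("main tx failed after invalidation commit", {
      indexerId: ctx.indexerId,
      orderKey: ctx.orderKey.toString(),
      error: String(err),
    });
    throw err; // preserve the existing no-retry/restart behavior
  }
}
```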
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@indexer/patches/`@apibara__plugin-drizzle@2.1.0-beta.55.patch around lines 76 - 94, Add observability for the window where the dedicated invalidation transaction commits but the subsequent main transaction fails: after the invalidation call inside the handler middleware (the withTransaction/invalidate sequence) record a log/metric when the later replacement/main withTransaction fails; emit a structured processLogger.error or increment a metric (include indexerId, cursor/endCursor, prevFinality, and the caught error) and tag that this occurred after invalidation so SRE can find these edge cases; place this reporting in the catch handling around the main withTransaction that follows the invalidate call and reuse existing symbols (withTransaction, invalidate, indexerId, cursor/ endCursor, and registeredTriggersInTxn) for context.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 717546e9-fda7-44f1-8358-7941954590ea
⛔ Files ignored due to path filters (1)
`indexer/pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (3)
- indexer/Dockerfile
- indexer/package.json
- indexer/patches/@apibara__plugin-drizzle@2.1.0-beta.55.patch
Codex Review - Indexer/API Review

[MEDIUM] indexer/patches/@apibara__plugin-drizzle@2.1.0-beta.55.patch:304 - Reorg invalidation is now committed in a separate transaction before replacement writes, and recovery is explicitly coupled to an external Railway restart assumption if the replacement transaction fails.

Summary: 0 CRITICAL, 0 HIGH, 1 MEDIUM, 0 LOW, 0 INFO
Pull request overview
This PR backports an upstream @apibara/plugin-drizzle patch and pins Apibara dependencies to concrete versions to eliminate per-block DDL overhead during indexing (reorg trigger creation/removal), improving indexer performance and reducing DB churn.
Changes:
- Pin Apibara packages from floating `next` tags to specific `2.1.0-beta.*` versions.
- Apply a pnpm patch to `@apibara/plugin-drizzle@2.1.0-beta.55` to register reorg triggers once, pass `order_key` via a PostgreSQL session variable, and run invalidation in a separate transaction.
- Update Docker build stages to include `patches/` so `pnpm install` can apply the patch in-container.
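The Docker change amounts to copying `patches/` into the image before the install step in each stage that installs dependencies. The base image and stage name below are assumptions, not the repository's actual Dockerfile:

```dockerfile
# Sketch: patches/ must exist before `pnpm install` runs so pnpm can apply
# the patchedDependencies entry. Base image and stage name are hypothetical.
FROM node:20-slim AS build
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
COPY patches ./patches
RUN corepack enable && pnpm install --frozen-lockfile
```

If `patches/` is missing in any stage, pnpm fails the install because the lockfile records the patched dependency.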
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| indexer/pnpm-lock.yaml | Adds patchedDependencies metadata and pins Apibara package resolutions to specific beta versions. |
| indexer/patches/@apibara__plugin-drizzle@2.1.0-beta.55.patch | Backports the drizzle plugin changes to avoid per-block trigger DDL and adjust reorg invalidation behavior. |
| indexer/package.json | Pins Apibara dependencies and configures pnpm patchedDependencies for the drizzle plugin patch. |
| indexer/Dockerfile | Copies patches/ in both build and production stages so the pnpm patch applies during container builds. |
Files not reviewed (1)
- indexer/pnpm-lock.yaml: Language not supported
```diff
+ try {
+   await tx.execute(
+     sql.raw(
+       `SELECT set_config('${SCHEMA_NAME}.reorg_order_key', '${Number(endCursor.orderKey)}', true);`,
```
setReorgOrderKey uses Number(endCursor.orderKey) when writing to set_config. In Apibara, orderKey is a bigint (e.g., status.starting?.orderKey ?? 0n in scripts/check-dna-status.ts), so converting to Number can lose precision (or stringify to scientific notation) and lead to incorrect rollback cursor values. Prefer passing the bigint as a decimal string (e.g., endCursor.orderKey.toString()) and casting to an appropriate PG integer type on the database side if needed; update both the src/storage.ts and dist/* hunks in this patch for consistency.
```diff
- `SELECT set_config('${SCHEMA_NAME}.reorg_order_key', '${Number(endCursor.orderKey)}', true);`,
+ `SELECT set_config('${SCHEMA_NAME}.reorg_order_key', '${endCursor.orderKey.toString()}', true);`,
```
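To make the reviewer's point concrete, here is a standalone demonstration of the precision loss; the value and variable names are illustrative only:

```typescript
// Number() silently loses precision above Number.MAX_SAFE_INTEGER (2^53 - 1),
// while BigInt.prototype.toString() keeps the exact decimal digits.
const orderKey: bigint = 9_007_199_254_740_993n; // 2^53 + 1

const viaNumber = String(Number(orderKey)); // rounds to the nearest double
const viaString = orderKey.toString();      // exact decimal representation
```

Past 2^53 the two representations diverge, so a cursor written with `Number()` can point at the wrong block.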
```diff
+ order_key_text TEXT := current_setting('${SCHEMA_NAME}.reorg_order_key', true);
+ order_key INTEGER;
+ indexer_id TEXT := TG_ARGV[2]::TEXT;
  new_id_value TEXT := row_to_json(NEW.*)->>id_col;
  old_id_value TEXT := row_to_json(OLD.*)->>id_col;
  BEGIN
+ IF order_key_text IS NULL THEN
+   RETURN NULL;
+ END IF;
+
+ order_key := order_key_text::INTEGER;
+
```
The trigger function still declares/casts order_key as INTEGER (order_key := order_key_text::INTEGER). If orderKey is a bigint cursor, this can overflow 32-bit integers and break reorg rollback for higher block heights. Consider using BIGINT consistently (order_key BIGINT and order_key_text::BIGINT) and ensure the rollback table cursor column matches.
Code Review
This pull request backports a performance patch for the drizzle plugin, which is a significant improvement, and the changes to reorg handling are well-structured. However, the patch introduces a potential SQL Injection vulnerability in the trigger registration logic due to unsanitized string interpolation in raw SQL queries, specifically with the indexerId and tableIdColumn variables. Additionally, two important issues related to handling large block numbers (bigint values) have been identified that could lead to data corruption or errors. My review includes suggestions to address these potential bugs and the SQL injection vulnerability.
```diff
+ order_key INTEGER;
+ indexer_id TEXT := TG_ARGV[2]::TEXT;
  new_id_value TEXT := row_to_json(NEW.*)->>id_col;
  old_id_value TEXT := row_to_json(OLD.*)->>id_col;
  BEGIN
+ IF order_key_text IS NULL THEN
+   RETURN NULL;
+ END IF;
+
+ order_key := order_key_text::INTEGER;
```
The order_key variable in the trigger function is declared as INTEGER. A PostgreSQL INTEGER has a maximum value of 2,147,483,647. Starknet block numbers (order_key) are u64 and will eventually exceed this limit. This will cause an 'integer out of range' error in the database trigger, breaking the reorg mechanism. This should be BIGINT to support u64 block numbers.
```suggestion
order_key BIGINT;
indexer_id TEXT := TG_ARGV[2]::TEXT;
new_id_value TEXT := row_to_json(NEW.*)->>id_col;
old_id_value TEXT := row_to_json(OLD.*)->>id_col;
BEGIN
IF order_key_text IS NULL THEN
  RETURN NULL;
END IF;

order_key := order_key_text::BIGINT;
```
```diff
+ try {
+   await tx.execute(
+     sql.raw(
+       `SELECT set_config('${SCHEMA_NAME}.reorg_order_key', '${Number(endCursor.orderKey)}', true);`,
```
There is a potential for precision loss when converting bigint to Number. The endCursor.orderKey is a bigint and can represent numbers larger than Number.MAX_SAFE_INTEGER. Converting it to Number can lead to a loss of precision for large block numbers, causing an incorrect order_key to be set. This could compromise the reorg handling mechanism. It's safer to convert the bigint to a string.
```suggestion
`SELECT set_config('${SCHEMA_NAME}.reorg_order_key', '${endCursor.orderKey.toString()}', true);`,
```
```diff
+ DROP TRIGGER IF EXISTS ${getReorgTriggerName(table, indexerId)} ON ${table};
  CREATE CONSTRAINT TRIGGER ${getReorgTriggerName(table, indexerId)}
  AFTER INSERT OR UPDATE OR DELETE ON ${table}
  DEFERRABLE INITIALLY DEFERRED
- FOR EACH ROW EXECUTE FUNCTION ${constants.SCHEMA_NAME}.reorg_checkpoint('${table}', '${tableIdColumn}', ${Number(endCursor.orderKey)}, '${indexerId}');
+ FOR EACH ROW EXECUTE FUNCTION ${constants.SCHEMA_NAME}.reorg_checkpoint('${table}', '${tableIdColumn}', '${indexerId}');
`)
```
The patch introduces a potential SQL Injection vulnerability in the registerTriggers function. It uses sql.raw to construct a DDL statement (DROP TRIGGER and CREATE CONSTRAINT TRIGGER) by interpolating variables like indexerId and tableIdColumn directly into the SQL string.
If these configuration options (e.g., indexerName passed to the plugin) are sourced from untrusted external input such as environment variables or user-provided configuration, an attacker could execute arbitrary SQL commands. For example, a crafted indexerId containing a single quote could break out of the literal and inject additional SQL statements.
To remediate this, ensure that all variables interpolated into raw SQL are properly sanitized or escaped. For identifiers like trigger names, use proper SQL identifier quoting (e.g., double quotes in PostgreSQL). For string literals, use SQL string escaping or Drizzle's built-in SQL construction methods that handle parameterization safely.
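The remediation described above can be sketched with small quoting helpers that follow standard PostgreSQL rules (double quotes for identifiers, single quotes for literals, doubling any embedded quote). These helpers are illustrative; a production fix would preferably use the driver's parameterization or an existing escaping utility:

```typescript
// Quote a PostgreSQL identifier (e.g. a trigger or table name).
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Quote a PostgreSQL string literal (e.g. an indexerId argument).
function quoteLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// A crafted indexerId can no longer break out of the literal:
const indexerId = "x'; DROP TABLE users; --";
const stmt = `SELECT reorg_checkpoint(${quoteLiteral(indexerId)});`;
```

Because the embedded single quote is doubled, the injected text stays inside the string literal instead of terminating it.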
```diff
+ DROP TRIGGER IF EXISTS ${getReorgTriggerName(table, indexerId)} ON ${table};
  CREATE CONSTRAINT TRIGGER ${getReorgTriggerName(table, indexerId)}
  AFTER INSERT OR UPDATE OR DELETE ON ${table}
  DEFERRABLE INITIALLY DEFERRED
- FOR EACH ROW EXECUTE FUNCTION ${SCHEMA_NAME}.reorg_checkpoint('${table}', '${tableIdColumn}', ${Number(endCursor.orderKey)}, '${indexerId}');
+ FOR EACH ROW EXECUTE FUNCTION ${SCHEMA_NAME}.reorg_checkpoint('${table}', '${tableIdColumn}', '${indexerId}');
`)
```

```diff
+ DROP TRIGGER IF EXISTS ${getReorgTriggerName(table, indexerId)} ON ${table};
  CREATE CONSTRAINT TRIGGER ${getReorgTriggerName(table, indexerId)}
  AFTER INSERT OR UPDATE OR DELETE ON ${table}
  DEFERRABLE INITIALLY DEFERRED
- FOR EACH ROW EXECUTE FUNCTION ${SCHEMA_NAME}.reorg_checkpoint('${table}', '${tableIdColumn}', ${Number(endCursor.orderKey)}, '${indexerId}');
+ FOR EACH ROW EXECUTE FUNCTION ${SCHEMA_NAME}.reorg_checkpoint('${table}', '${tableIdColumn}', '${indexerId}');
`),
```
Summary
- Pin Apibara packages to concrete versions (`@apibara/plugin-drizzle@2.1.0-beta.55`, etc.) instead of floating `next` tags
- Backport the patch from the `next` branch that eliminates per-block DDL overhead: triggers are registered once, `order_key` is passed via a PostgreSQL session variable, and invalidation runs in a dedicated transaction
- Add `COPY patches ./patches` to both Dockerfile build stages so pnpm can apply the patch

What the patch fixes
- Per-block trigger DDL → triggers registered once (guarded by the `reorgTriggersRegistered` flag)
- `order_key` as hardcoded trigger argument → uses a `set_config` session variable (one `SET` per block vs N trigger re-creations)

Test plan
- `pnpm install` — patch applies cleanly
- `pnpm exec tsc --noEmit` — typecheck passes
- `pnpm build` — build succeeds
- `pnpm test` — 88/88 tests pass

🤖 Generated with Claude Code
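The session-variable mechanism described in the summary can be sketched as follows. The schema name is a placeholder and the function name is hypothetical; the real plugin derives the schema from its constants module:

```typescript
// Sketch: build the per-block set_config statement the patch uses instead of
// re-creating triggers with a hardcoded order_key argument.
const SCHEMA_NAME = "airfoil"; // hypothetical placeholder

function buildSetOrderKeySql(orderKey: bigint): string {
  // toString() avoids Number() precision loss for large block heights;
  // the third argument `true` scopes the setting to the current transaction,
  // so the deferred triggers read it via current_setting() at commit time.
  return `SELECT set_config('${SCHEMA_NAME}.reorg_order_key', '${orderKey.toString()}', true);`;
}
```

One such `SET` per block replaces N `DROP TRIGGER`/`CREATE TRIGGER` statements, which is the DDL overhead the patch removes.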