fix: LOG_BASED replication bookmark not advancing between syncs#745

Merged
edgarrmondragon merged 1 commit into MeltanoLabs:main from ksohail22:fix/wal-log-flush
Mar 11, 2026

Conversation

@ksohail22
Contributor

Problem

All LOG_BASED streams in tap-postgres fail to advance their replication_key_value bookmark after a successful sync. This means:

  • Every sync re-reads the same WAL data from the starting LSN, regardless of how many records were processed.
  • The PostgreSQL replication slot is never flushed past the original LSN, causing unbounded WAL growth on source databases.
  • Sync durations grow over time as the accumulated WAL backlog increases.
  • Increased risk of disk exhaustion on production PostgreSQL instances.

INCREMENTAL streams are unaffected — their bookmarks advance correctly.

Root Cause

PostgresLogBasedStream inherits is_sorted = False from the Singer SDK base class and does not override it.

When is_sorted is False, the SDK's increment_state() function writes bookmark updates to a temporary progress_markers buffer rather than directly to replication_key_value in the stream state. These progress markers are supposed to be promoted into the main state at the end of the sync, but that promotion never takes effect for these streams, so the bookmark remains frozen at its initial value.

This is incorrect for WAL-based replication. PostgreSQL's logical replication protocol delivers messages in strict LSN order — the stream is inherently sorted.
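The branching described above can be illustrated with a small self-contained sketch. Note this is a simplified stand-in for the Singer SDK's increment_state(), not its actual code; the state-dict keys mirror the ones named in this PR.

```python
# Simplified sketch (not the real Singer SDK implementation) of how
# increment_state() branches on is_sorted, per the behavior described above.

def increment_state(state: dict, record: dict, replication_key: str, is_sorted: bool) -> None:
    value = record[replication_key]
    if is_sorted:
        # Sorted stream: write the bookmark directly into the main state.
        state["replication_key_value"] = value
    else:
        # Unsorted stream: buffer the candidate bookmark; it only becomes
        # the real bookmark if the end-of-sync promotion step takes effect.
        state.setdefault("progress_markers", {})["replication_key_value"] = value

state_unsorted: dict = {}
state_sorted: dict = {}
for lsn in ("0/16B3748", "0/16B3890"):
    record = {"_sdc_lsn": lsn}
    increment_state(state_unsorted, record, "_sdc_lsn", is_sorted=False)
    increment_state(state_sorted, record, "_sdc_lsn", is_sorted=True)

print(state_unsorted)  # bookmark sits in progress_markers, never promoted
print(state_sorted)    # bookmark written directly to replication_key_value
```

If the promotion step fails for the unsorted case, replication_key_value in state_unsorted never changes, which is exactly the frozen-bookmark symptom reported here.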

Fix

Set is_sorted = True on PostgresLogBasedStream:

class PostgresLogBasedStream(SQLStream):
    replication_key = "_sdc_lsn"  # bookmark on the WAL log sequence number
    is_sorted = True  # WAL messages are delivered in strict LSN order

With this change, increment_state() writes replication_key_value directly into the stream's main state dict after each record. No buffering, no promotion step, no risk of state mismatch.

Impact

  • Bookmark advancement: replication_key_value will correctly advance after every sync.
  • WAL flush: send_feedback(flush_lsn=...) will report the new LSN to PostgreSQL, allowing it to discard consumed WAL segments and reclaim disk space.
  • First run after deploy: Will process the backlog of WAL accumulated since the bookmark was last truly updated. May take longer than usual.
  • Subsequent runs: Will only process new WAL records since the last sync — fast and efficient.
  • Backward compatible: No state format changes. Existing bookmarks continue to work; they will simply start advancing from their current position.
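The WAL-flush hand-off in the second bullet can be sketched as follows. The _FakeCursor class below is a stand-in for psycopg2's ReplicationCursor so the example runs without a database; send_feedback(flush_lsn=...) mirrors the real method's keyword argument, and the LSN value is illustrative.

```python
# Illustration of reporting the final bookmark LSN back to PostgreSQL.
# _FakeCursor stands in for a psycopg2 ReplicationCursor (assumption:
# tap-postgres consumes WAL through psycopg2's logical replication API).

class _FakeCursor:
    """Minimal stand-in for a psycopg2 ReplicationCursor."""

    def __init__(self) -> None:
        self.flushed_lsn = 0

    def send_feedback(self, flush_lsn: int = 0) -> None:
        # Once flush_lsn is reported, the server may discard WAL up to it
        # and reclaim the disk space those segments occupied.
        self.flushed_lsn = max(self.flushed_lsn, flush_lsn)

def flush_bookmark(cursor, bookmark_lsn: int) -> None:
    """Report the sync's final bookmark LSN so the replication slot advances."""
    cursor.send_feedback(flush_lsn=bookmark_lsn)

cur = _FakeCursor()
flush_bookmark(cur, 0x16B3890)  # e.g. the LSN of the last processed record
print(hex(cur.flushed_lsn))
```

With the bookmark frozen (the pre-fix behavior), flush_bookmark would be called with the same stale LSN every sync and the slot would never advance.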

How to Verify

  1. Run any LOG_BASED stream twice after deploying this change.
  2. Compare the replication_key_value in the state between the two runs — it should advance.
  3. Confirm the log message "Stream is assumed to be unsorted, progress is not resumable if interrupted" no longer appears.
  4. Monitor pg_replication_slots.confirmed_flush_lsn on source databases — it should advance after each sync.
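Step 2 above can be scripted. A caveat worth encoding: PostgreSQL LSNs are "hi/lo" hexadecimal strings, so they must be compared numerically rather than lexically. The state shapes and LSN values below are illustrative, not taken from a real run.

```python
# Sketch of verifying bookmark advancement between two consecutive syncs.
# State dicts and LSN values are illustrative examples.

def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL "hi/lo" hex LSN string to a comparable integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

state_run1 = {"replication_key_value": "0/16B3748"}
state_run2 = {"replication_key_value": "0/16B3890"}

advanced = lsn_to_int(state_run2["replication_key_value"]) > lsn_to_int(
    state_run1["replication_key_value"]
)
print("bookmark advanced:", advanced)
```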

@edgarrmondragon edgarrmondragon changed the title Fix: LOG_BASED replication bookmark not advancing between syncs fix: LOG_BASED replication bookmark not advancing between syncs Mar 10, 2026
@edgarrmondragon edgarrmondragon self-assigned this Mar 10, 2026
@edgarrmondragon edgarrmondragon added the bug Something isn't working label Mar 10, 2026
Member

@edgarrmondragon edgarrmondragon left a comment


Thanks @ksohail22!

Just to confirm: records are extracted in increasing order of _sdc_lsn, right?

@ksohail22
Contributor Author

@edgarrmondragon Yes, records are always delivered in increasing _sdc_lsn order.

@edgarrmondragon edgarrmondragon added this pull request to the merge queue Mar 11, 2026
Merged via the queue into MeltanoLabs:main with commit f2d4ea4 Mar 11, 2026
11 checks passed