feat(replicache): Add bulk insertion optimization with putMany #5380

Open
arv wants to merge 6 commits into main from arv/basic-repl-btree-opt
arv commented Dec 28, 2025

feat(replicache): Add bulk insertion optimization with putMany

Overview

This PR adds bulk insertion optimization to Replicache's BTree and database layer, significantly improving performance for large batch operations like sync patches.

Changes

Core BTree Changes

packages/replicache/src/btree/node.ts

  • Added putMany() method to DataNodeImpl for efficient merging of sorted entries
  • Added putMany() method to InternalNodeImpl with child grouping and rebalancing
  • Added putManyMergeAndPartition() helper for node rebalancing during bulk operations
  • Extracted binarySearchFrom() to enable optimized searching from a start index
  • Refactored readTreeData() to accept getEntrySize parameter (test helper improvement)
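
The idea behind searching from a start index can be sketched as follows. This is a minimal illustration, not the PR's code: the names binarySearchFromSketch and lookupSorted, the tuple entry shape, the ~index convention for "not found", and plain < string comparison are all assumptions made for the example.

```typescript
// Hypothetical entry shape: a [key, value] pair, sorted by key.
type SketchEntry<V> = readonly [key: string, value: V];

/**
 * Binary search restricted to entries[start..].
 * Returns the index of `key` if found, otherwise the bitwise complement
 * (~index) of the position where it would be inserted.
 */
function binarySearchFromSketch<V>(
  entries: readonly SketchEntry<V>[],
  key: string,
  start: number,
): number {
  let lo = start;
  let hi = entries.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const midKey = entries[mid][0];
    if (midKey === key) {
      return mid;
    }
    if (midKey < key) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return ~lo;
}

/**
 * Looks up keys that are themselves sorted. Each search starts where the
 * previous one ended, so the searched range keeps shrinking.
 */
function lookupSorted<V>(
  entries: readonly SketchEntry<V>[],
  sortedKeys: readonly string[],
): number[] {
  const result: number[] = [];
  let start = 0;
  for (const key of sortedKeys) {
    const i = binarySearchFromSketch(entries, key, start);
    result.push(i);
    start = i >= 0 ? i : ~i;
  }
  return result;
}
```

Because both the node entries and the incoming keys are sorted, a later key can never land before the position found for an earlier key, which is what makes narrowing the range safe.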

packages/replicache/src/btree/write.ts

  • Added BTreeWrite.putMany() method with two optimized paths:
    • Fast path: Bulk loads empty trees bottom-up using optimal partitioning
    • Slow path: Merges entries into existing trees with efficient rebalancing
  • Validates entries are sorted and converts to sized entries in a single pass
  • Reuses arrays to minimize allocations during tree construction
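
A minimal sketch of the fast path's bottom-up idea, under simplifying assumptions: nodes are partitioned by a fixed entry count (fanOut) rather than by byte size, keys must be strictly increasing, and all type and function names here (BulkEntry, LeafSketch, InternalSketch, bulkLoadSketch) are hypothetical rather than taken from the PR.

```typescript
type BulkEntry = readonly [key: string, value: unknown];
type LeafSketch = {level: 0; maxKey: string; entries: BulkEntry[]};
type InternalSketch = {level: number; maxKey: string; children: TreeSketch[]};
type TreeSketch = LeafSketch | InternalSketch;

// Rejects unsorted input (and, in this sketch, duplicate keys) in one pass.
function assertSortedUnique(entries: readonly BulkEntry[]): void {
  for (let i = 1; i < entries.length; i++) {
    if (entries[i - 1][0] >= entries[i][0]) {
      throw new Error(`entries must be sorted by key (index ${i})`);
    }
  }
}

// Splits items into consecutive runs of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Builds leaves first, then parent levels, until a single root remains.
function bulkLoadSketch(
  entries: readonly BulkEntry[],
  fanOut = 4,
): TreeSketch | undefined {
  assertSortedUnique(entries);
  if (entries.length === 0) {
    return undefined;
  }
  let level: TreeSketch[] = chunk(entries, fanOut).map(
    (part): LeafSketch => ({
      level: 0,
      maxKey: part[part.length - 1][0],
      entries: part,
    }),
  );
  let height = 0;
  while (level.length > 1) {
    height++;
    const currentHeight = height;
    level = chunk(level, fanOut).map(
      (children): InternalSketch => ({
        level: currentHeight,
        maxKey: children[children.length - 1].maxKey,
        children,
      }),
    );
  }
  return level[0];
}
```

Each entry is placed exactly once and no node is split or rebalanced after creation, which is where a bulk load saves work compared with repeated single-key inserts.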

Database Layer

packages/replicache/src/db/write.ts

  • Added Write.putMany() method that delegates to BTreeWrite.putMany()
  • Handles index updates for all entries before bulk insertion
  • Maintains compatibility with existing put() semantics

Sync Layer Optimization

packages/replicache/src/sync/patch.ts

  • Added optimizePatch() function to eliminate redundant operations:
    • Drops operations before the last clear
    • Merges consecutive operations on the same key
    • Removes pointless del operations after clear
    • Sorts operations by key for optimal bulk loading
  • Modified apply() to use optimized patches with bulk loading
  • Extracted mergeUpdate() helper for update operation handling
  • Added bulkLoadPuts() to handle consecutive put operations efficiently
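
The patch optimization steps listed above can be sketched as follows. This is a simplified illustration that handles only put, del, and clear operations (update merging is left out), and the names PatchOpSketch and optimizePatchSketch are hypothetical, not the PR's optimizePatch().

```typescript
type PatchOpSketch =
  | {op: 'put'; key: string; value: unknown}
  | {op: 'del'; key: string}
  | {op: 'clear'};
type KeyedOpSketch = Exclude<PatchOpSketch, {op: 'clear'}>;

function optimizePatchSketch(patch: readonly PatchOpSketch[]): PatchOpSketch[] {
  // 1. Only operations after the last clear can affect the final state.
  let lastClear = -1;
  for (let i = patch.length - 1; i >= 0; i--) {
    if (patch[i].op === 'clear') {
      lastClear = i;
      break;
    }
  }
  const hasClear = lastClear !== -1;

  // 2. For each key, the last put or del replaces all earlier ones.
  const latest = new Map<string, KeyedOpSketch>();
  for (let i = lastClear + 1; i < patch.length; i++) {
    const op = patch[i];
    if (op.op !== 'clear') {
      latest.set(op.key, op);
    }
  }

  // 3. After a clear the tree is empty, so a del has nothing to delete.
  const keyed = Array.from(latest.values()).filter(
    op => !(hasClear && op.op === 'del'),
  );

  // 4. Sort by key so consecutive puts can be bulk loaded.
  keyed.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

  return hasClear ? [{op: 'clear'}, ...keyed] : keyed;
}
```

Sorting is safe at that point because each key has at most one remaining operation, so reordering across keys cannot change the final state.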

Performance Impact

The optimization targets common sync patterns:

  1. Initial sync: Loading thousands of entries into an empty tree
  2. Large patches: Applying batches of updates from the server
  3. Snapshot application: Replacing entire datasets

Benchmark Results

Comparison of putMany() vs sequential put() operations:

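| Scenario          | Entries | Value size | Speedup |
|-------------------|---------|------------|---------|
| Empty tree        | 100     | small      | 3.36x   |
| Empty tree        | 100     | large      | 1.49x   |
| Empty tree        | 1,000   | small      | 5.30x   |
| Empty tree        | 1,000   | large      | 1.58x   |
| Empty tree        | 10,000  | small      | 4.15x   |
| Empty tree        | 10,000  | large      | 1.13x   |
| Construction only | 10,000  | small      | 53.73x  |
| Update existing   | 1,000   | mixed      | 4.47x   |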

Key findings:

  • Small values show a consistent 3.4x to 5.3x speedup
  • Construction only (no flush) shows the largest improvement, 53.7x
  • Large values benefit less (1.1x to 1.6x) due to serialization overhead
  • Updating existing entries shows a 4.5x improvement

Additional benefits:

  • Reduces chunk writes through optimal tree construction
  • Minimizes redundant operations through patch optimization

Testing

New test files:

  • packages/replicache/src/btree/write.bench.ts - Performance benchmarks comparing sequential put() vs putMany()
  • Extensive test coverage in:
    • packages/replicache/src/btree/node.test.ts - 17 new tests for putMany() behavior
    • packages/replicache/src/db/write.test.ts - 3 new tests for database-level putMany()
    • packages/replicache/src/sync/patch.test.ts - 24 new tests for patch optimization

Test scenarios covered:

  • Empty tree bulk loading
  • Merging with existing entries
  • Tree rebalancing and partitioning
  • Index updates
  • Update operation merging
  • Patch optimization edge cases

Compatibility

  • No breaking changes to public APIs
  • Existing put() and del() methods remain unchanged
  • putMany() is an additive optimization that can be adopted incrementally
  • Works with both FormatVersion.V6 and FormatVersion.V7

Implementation Details

Key Algorithm Improvements

  1. Bottom-up tree construction: When building from scratch, constructs the optimal tree structure in a single pass
  2. Batch rebalancing: Groups entries by affected child node and rebalances once per group
  3. Restricted binary search: Uses previous search results to narrow search ranges for sorted input
  4. Patch deduplication: Eliminates redundant operations before applying to the tree
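
The grouping step behind batch rebalancing (item 2) can be sketched as below. It assumes each child of an internal node is identified by the largest key in its subtree and that keys beyond the last child's maximum go to the last child. The names GroupEntry and groupByChildSketch are hypothetical, and the actual rebalancing of each affected child is not shown.

```typescript
type GroupEntry<V> = readonly [key: string, value: V];

/**
 * Assigns each sorted entry to the index of the child whose key range
 * covers it. Both inputs are sorted, so one forward scan is enough.
 */
function groupByChildSketch<V>(
  childMaxKeys: readonly string[],
  sortedEntries: readonly GroupEntry<V>[],
): Map<number, GroupEntry<V>[]> {
  const groups = new Map<number, GroupEntry<V>[]>();
  if (childMaxKeys.length === 0) {
    return groups;
  }
  const lastChild = childMaxKeys.length - 1;
  let child = 0;
  for (const entry of sortedEntries) {
    // Advance to the first child whose max key is >= the entry key.
    while (child < lastChild && entry[0] > childMaxKeys[child]) {
      child++;
    }
    let group = groups.get(child);
    if (!group) {
      group = [];
      groups.set(child, group);
    }
    group.push(entry);
  }
  return groups;
}
```

With the entries grouped this way, each affected child receives one batch, so it only needs to be rewritten and rebalanced once instead of once per key.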

Memory Efficiency

  • Reuses arrays during tree construction to minimize allocations
  • Mutates entries in-place during tree node creation (for immutable node pattern)
  • Batch processes operations to reduce intermediate tree states

Future Work

  • Consider adding delMany() for bulk deletions
  • Explore parallel index updates for large batches

Add efficient bulk insertion methods (putMany) to BTree and database layers,
significantly improving performance for large batch operations like sync patches.

Core Changes

- Add putMany() to BTreeWrite with fast path for empty trees and slow path for merging
- Add putMany() to DataNodeImpl and InternalNodeImpl for node-level bulk operations
- Add Write.putMany() in database layer with index update support
- Add optimizePatch() to eliminate redundant operations in sync patches
- Extract binarySearchFrom() to enable optimized searching from start index

Performance

Benchmark results (putMany vs sequential put):
- **100 entries (small values)**: 3.36x faster
- **1,000 entries (small values)**: 5.30x faster
- **10,000 entries (small values)**: 4.15x faster
- **Construction only (10,000 entries)**: 53.73x faster
- **Update operations (1,000 entries)**: 4.47x faster

Additional benefits:
- Reduces chunk writes through optimal tree construction
- Minimizes redundant operations through patch optimization

Testing

- Add comprehensive test suite covering bulk operations, rebalancing, and edge cases
- Add performance benchmarks comparing sequential put() vs putMany()
- Add patch optimization tests with 24 scenarios

Implementation Details

Fast path (empty tree):

- Builds tree bottom-up using optimal partitioning
- Constructs ideal tree structure in single pass
- Reuses arrays to minimize allocations

Slow path (existing tree):

- Groups entries by affected child nodes
- Performs batch rebalancing per group
- Uses restricted binary search for sorted input

Patch optimization:

- Drops operations before last clear
- Merges consecutive operations on same key
- Removes pointless deletes after clear
- Sorts operations for optimal bulk loading

No breaking changes. Additive optimization compatible with V6 and V7 formats.

arv requested a review from grgbkr on December 28, 2025 10:45

github-actions bot commented Dec 28, 2025

🐰 Bencher Report

Branch: arv/basic-repl-btree-opt
Testbed: self-hosted

| Benchmark | Throughput (ops/s x 1e3) | Result Δ% | Baseline (ops/s x 1e3) | Lower boundary (ops/s x 1e3) | Limit % |
|---|---|---|---|---|---|
| src/client/custom.bench.ts > big schema | 146.36 | +7.26% | 136.45 | 120.72 | 82.48% |
| src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers) | 2.67 | +9.88% | 2.43 | 2.13 | 79.72% |
| src/client/zero.bench.ts > pk compare > pk = N | 68.64 | +9.82% | 62.51 | 54.62 | 79.57% |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | 4.04 | +8.08% | 3.74 | 3.29 | 81.60% |

github-actions bot commented Dec 28, 2025

🐰 Bencher Report

Branch: arv/basic-repl-btree-opt
Testbed: self-hosted

| Benchmark | Throughput (ops/s) | Result Δ% | Baseline (ops/s) | Lower boundary (ops/s) | Limit % |
|---|---|---|---|---|---|
| 1 exists: track.exists(album) | 13,248.78 | -2.35% | 13,568.11 | 11,726.40 | 88.51% |
| 10 exists (AND) | 217,268.61 | +10.23% | 197,108.58 | 165,059.45 | 75.97% |
| 10 exists (OR) | 3,970.80 | +1.57% | 3,909.60 | 3,377.88 | 85.07% |
| 12 exists (AND) | 194,538.65 | +12.19% | 173,394.43 | 144,732.98 | 74.40% |
| 12 exists (OR) | 3,256.62 | -2.02% | 3,323.87 | 2,860.63 | 87.84% |
| 12 level nesting | 2,925.18 | +1.35% | 2,886.12 | 2,481.14 | 84.82% |
| 2 exists (AND): track.exists(album).exists(genre) | 5,089.22 | -0.21% | 5,100.05 | 4,416.91 | 86.79% |
| 3 exists (AND) | 1,961.40 | -1.67% | 1,994.77 | 1,744.00 | 88.92% |
| 3 exists (OR) | 980.65 | -1.63% | 996.91 | 864.97 | 88.20% |
| 5 exists (AND) | 310.12 | -1.19% | 313.86 | 272.66 | 87.92% |
| 5 exists (OR) | 162.39 | -1.61% | 165.04 | 142.66 | 87.85% |
| Nested 2 levels: track > album > artist | 4,454.47 | -0.11% | 4,459.38 | 3,875.96 | 87.01% |
| Nested 4 levels: playlist > tracks > album > artist | 735.22 | +0.31% | 732.96 | 640.11 | 87.06% |
| Nested with filters: track > album > artist (filtered) | 3,625.67 | -2.60% | 3,722.41 | 3,255.74 | 89.80% |
| planned: playlist.exists(tracks) | 659.53 | +8.05% | 610.41 | 540.29 | 81.92% |
| planned: track.exists(album) OR exists(genre) | 172.00 | +5.35% | 163.27 | 146.14 | 84.97% |
| planned: track.exists(album) where title="Big Ones" | 7,889.80 | +6.05% | 7,439.85 | 6,655.42 | 84.35% |
| planned: track.exists(album).exists(genre) | 42.07 | +9.24% | 38.52 | 33.95 | 80.70% |
| planned: track.exists(album).exists(genre) with filters | 5,701.89 | +8.87% | 5,237.13 | 4,649.59 | 81.54% |
| planned: track.exists(playlists) | 4.25 | +8.07% | 3.94 | 3.50 | 82.31% |
| unplanned: playlist.exists(tracks) | 641.86 | +8.09% | 593.84 | 525.58 | 81.88% |
| unplanned: track.exists(album) OR exists(genre) | 47.81 | +9.15% | 43.80 | 38.31 | 80.13% |
| unplanned: track.exists(album) where title="Big Ones" | 59.80 | +8.46% | 55.14 | 48.85 | 81.68% |
| unplanned: track.exists(album).exists(genre) | 41.70 | +8.70% | 38.36 | 33.98 | 81.49% |
| unplanned: track.exists(album).exists(genre) with filters | 58.18 | +7.88% | 53.93 | 48.07 | 82.63% |
| unplanned: track.exists(playlists) | 4.20 | +6.81% | 3.93 | 3.50 | 83.44% |
| zpg: all playlists | 5.83 | +4.60% | 5.58 | 5.06 | 86.71% |
| zql: all playlists | 8.30 | +13.15% | 7.34 | 6.23 | 75.01% |
| zql: edit for limited query, inside the bound | 236,243.98 | +14.73% | 205,921.66 | 176,733.26 | 74.81% |
| zql: edit for limited query, outside the bound | 241,826.86 | +16.12% | 208,254.49 | 168,537.64 | 69.69% |
| zql: push into limited query, inside the bound | 115,442.52 | +10.07% | 104,877.52 | 90,152.32 | 78.09% |
| zql: push into limited query, outside the bound | 419,986.62 | +10.77% | 379,152.22 | 305,209.21 | 72.67% |
| zql: push into unlimited query | 352,967.36 | +12.78% | 312,983.55 | 267,596.15 | 75.81% |
| zqlite: all playlists | 1.88 | +10.52% | 1.71 | 1.46 | 77.50% |
| zqlite: edit for limited query, inside the bound | 82,254.27 | +11.68% | 73,654.38 | 59,981.43 | 72.92% |
| zqlite: edit for limited query, outside the bound | 84,532.17 | +15.88% | 72,951.16 | 56,100.14 | 66.37% |
| zqlite: push into limited query, inside the bound | 4,115.09 | +2.53% | 4,013.49 | 3,642.06 | 88.50% |
| zqlite: push into limited query, outside the bound | 94,092.16 | +8.68% | 86,577.73 | 76,592.61 | 81.40% |
| zqlite: push into unlimited query | 133,914.01 | +11.54% | 120,064.02 | 102,272.74 | 76.37% |

github-actions bot commented Dec 28, 2025

🐰 Bencher Report

Branch: arv/basic-repl-btree-opt
Testbed: Linux

| Benchmark | File size (KB) | Result Δ% | Baseline (KB) | Upper boundary (KB) | Limit % |
|---|---|---|---|---|---|
| zero-package.tgz | 1,800.69 | +0.21% | 1,796.88 | 1,832.82 | 98.25% |
| zero.js | 247.33 | +0.76% | 245.47 | 250.38 | 98.78% |
| zero.js.br | 67.60 | +0.59% | 67.20 | 68.55 | 98.61% |

rajczi commented Jan 16, 2026

We've been using this build of Replicache in our Expo React Native mobile app, with the SQLite kvStore on op-sqlite@15.2, for quite a while now and are very happy with it. It cuts our initial sync snapshot time (download complete to ready) by more than half.

rajczi commented Jan 22, 2026

We've discovered an issue with this branch where the output of watchers doesn't match the patch from the server. I've prepared a unit test to demonstrate the issue which passes on main and fails here. I will send it to @arv.

grgbkr (Contributor) left a comment

Very nice optimization work!

expect(structure2).toEqual(structure1);

// Also verify both trees have the same data
await withRead(dagStore1, async dagRead1 => {

grgbkr (Contributor):

Instead of nesting the reads like this, maybe add a helper and run them in sequence:

function checkContents(dagStore, hash) {
  return withRead(dagStore, async dagRead => {
    const tree = new BTreeRead(
      dagRead,
      formatVersion,
      hash,
      getEntrySize,
      chunkHeaderSize,
    );
    for (let i = 0; i < 500; i++) {
      const key = `key${i.toString().padStart(4, '0')}`;
      expect(await tree.get(key)).toBe(i);
    }
  });
}
await checkContents(dagStore1, hash1);
await checkContents(dagStore2, hash2);

}

for (const formatVersion of [FormatVersion.V6, FormatVersion.V7] as const) {
test(`putMany empty entries > v${formatVersion}`, async () => {

grgbkr (Contributor):

instead of having all the > v${formatVersion}, you could do

for (const formatVersion of [FormatVersion.V6, FormatVersion.V7] as const) {
  describe(`v${formatVersion}`, () => {
    test(`putMany empty entries`, async () => {
    });
    // etc
  });
}

arv (Contributor, Author):

I'm also thinking it is time to remove the old format.

});
});

test(`putMany triggers merge and partition > v${formatVersion}`, async () => {

grgbkr (Contributor):

How do we know in the tests that a merge or partition is being triggered?

Comments describing how these are triggering the various merges/partitions could help to clarify.

);

// Create internal nodes
currentLevel = parentPartitions.map(entries =>

grgbkr (Contributor):

Above, the comment says arrays are reused to avoid allocations, but doesn't this map create a new array?

Should this be

for (let i = 0; i < parentPartitions.length; i++) {
  currentLevel[i] = this.newInternalNodeImpl(parentPartitions[i], level);
}
currentLevel.length = parentPartitions.length;

Or maybe track the logical length of currentLevel in its own variable instead of resizing the array.

for (let i = 0; i < parentPartitions.length; i++) {
  currentLevel[i] = this.newInternalNodeImpl(parentPartitions[i], level);
}
currentLevelLength = parentPartitions.length;

return;
}

// Slow path: merge with existing tree

grgbkr (Contributor):

Is this slow path still faster than putting each entry with a separate put call?

);
await apply(lc, dbWrite, patch);

expect(await dbWrite.get('a')).toBe(2);

grgbkr (Contributor):

Maybe these tests should use scan to read and assert on the entire content of the db?

* 1. Dropping all operations before the last 'clear'
* 2. For each key: put/del replace all previous operations; updates accumulate
* 3. Removing standalone 'del' operations after a clear (deleting from empty tree)
* 4. Merging updates after puts into a single put operation

grgbkr (Contributor):

Why can't updates that don't have a put before them be merged?

arv (Contributor, Author):

A put followed by a merge becomes a put (put + merge -> put).

I don't recall whether I also did merge + merge -> merge, but it should be correct to do.

* 2. For each key: put/del replace all previous operations; updates accumulate
* 3. Removing standalone 'del' operations after a clear (deleting from empty tree)
* 4. Merging updates after puts into a single put operation
* Note: Order is preserved for operations on the same key, but operations

grgbkr (Contributor):

If we also merged updates, would there always be just one operation per key?

arv (Contributor, Author):

I think that is correct.

In theory we could get a del followed by a merge but I have to check again if that is an error or ignored.


// already sorted

// Use putMany which will use BTreeWrite.fromEntries if the map is empty

grgbkr (Contributor):

What is BTreeWrite.fromEntries?

arv (Contributor, Author):

Old reference. I removed it and dealt with it without a new API.

if (existing.length === 1 && existing[0].op === 'put') {
const {value} = existing[0];
assertObject(value);
const merged = mergeUpdate(p, value);

grgbkr (Contributor):

It would be good to add some tests around merging updates that have a constrain property. I feel uncertain if those are being merged correctly.

arv (Contributor, Author):

Good call.
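
A minimal sketch of folding an update into a preceding put on the same key, as discussed in this thread. It assumes an update operation carries an optional merge object and an optional constrain list, and that constrain, when present, keeps only the listed properties from both the existing value and the merged ones. Those semantics, and every name below, are assumptions for illustration rather than the PR's mergeUpdate() implementation.

```typescript
type JsonObjectSketch = {readonly [key: string]: unknown};
type PutOpSketch = {op: 'put'; key: string; value: JsonObjectSketch};
type UpdateOpSketch = {
  op: 'update';
  key: string;
  merge?: JsonObjectSketch;
  constrain?: readonly string[];
};

// Applies an update to an existing value (assumed semantics, see above).
function applyUpdateSketch(
  existing: JsonObjectSketch | undefined,
  update: UpdateOpSketch,
): JsonObjectSketch {
  const keep = update.constrain ? new Set(update.constrain) : undefined;
  const result: {[key: string]: unknown} = {};
  // Existing properties first, then merged ones so that merge wins.
  for (const source of [existing, update.merge]) {
    if (!source) {
      continue;
    }
    for (const [k, v] of Object.entries(source)) {
      if (!keep || keep.has(k)) {
        result[k] = v;
      }
    }
  }
  return result;
}

// put + update -> put: the update is applied to the put's value up front.
function foldPutThenUpdate(
  put: PutOpSketch,
  update: UpdateOpSketch,
): PutOpSketch {
  if (put.key !== update.key) {
    throw new Error('operations must target the same key');
  }
  return {op: 'put', key: put.key, value: applyUpdateSketch(put.value, update)};
}
```

Under these assumed semantics, folding two updates into one is less straightforward when the first has a constrain list, because the second update's merged properties would then be filtered by a list they were never subject to. That is one case tests around constrain could pin down.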
