feat(replicache): Add bulk insertion optimization with putMany #5380
Conversation
Add efficient bulk insertion methods (`putMany`) to the BTree and database layers, significantly improving performance for large batch operations such as sync patches.

Core Changes
- Add `putMany()` to `BTreeWrite` with a fast path for empty trees and a slow path for merging
- Add `putMany()` to `DataNodeImpl` and `InternalNodeImpl` for node-level bulk operations
- Add `Write.putMany()` in the database layer with index update support
- Add `optimizePatch()` to eliminate redundant operations in sync patches
- Extract `binarySearchFrom()` to enable optimized searching from a start index

Performance

Benchmark results (putMany vs sequential put):
- **100 entries (small values)**: 3.36x faster
- **1,000 entries (small values)**: 5.30x faster
- **10,000 entries (small values)**: 4.15x faster
- **Construction only (10,000 entries)**: 53.73x faster
- **Update operations (1,000 entries)**: 4.47x faster

Additional benefits:
- Reduces chunk writes through optimal tree construction
- Minimizes redundant operations through patch optimization

Testing
- Add comprehensive test suite covering bulk operations, rebalancing, and edge cases
- Add performance benchmarks comparing sequential `put()` vs `putMany()`
- Add patch optimization tests with 24 scenarios

Implementation Details

Fast path (empty tree):
- Builds the tree bottom-up using optimal partitioning
- Constructs the ideal tree structure in a single pass
- Reuses arrays to minimize allocations

Slow path (existing tree):
- Groups entries by affected child nodes
- Performs batch rebalancing per group
- Uses a restricted binary search for sorted input

Patch optimization:
- Drops operations before the last clear
- Merges consecutive operations on the same key
- Removes pointless deletes after a clear
- Sorts operations for optimal bulk loading

No breaking changes. This is an additive optimization compatible with the V6 and V7 formats.
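The fast path's bottom-up construction can be sketched roughly as follows. This is an illustrative model, not the actual Replicache implementation: `Entry`, `TreeNode`, `partition`, and the count-based node capacity are stand-ins for Replicache's size-based chunking.

```typescript
type Entry = [key: string, value: unknown];
type TreeNode = {level: number; items: unknown[]};

// Split an array into near-equal chunks of at most maxPerNode items each.
function partition<T>(items: T[], maxPerNode: number): T[][] {
  const count = Math.max(1, Math.ceil(items.length / maxPerNode));
  const size = Math.ceil(items.length / count);
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Build the tree bottom-up: leaves first, then internal levels until a
// single root remains. Assumes at least one entry.
function buildBottomUp(sorted: Entry[], maxPerNode: number): TreeNode {
  let level = 0;
  let nodes: TreeNode[] = partition(sorted, maxPerNode).map(items => ({level, items}));
  while (nodes.length > 1) {
    level++;
    nodes = partition(nodes, maxPerNode).map(items => ({level, items}));
  }
  return nodes[0];
}
```

Because each level is built in one pass over the level below it, the shape of the tree is decided up front rather than by repeated splitting, which is where the construction-only speedup comes from.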
| Branch | arv/basic-repl-btree-opt |
| Testbed | self-hosted |
| Benchmark | Result: ops/s x 1e3 (Δ%) | Lower Boundary: ops/s x 1e3 (limit %) |
|---|---|---|
| src/client/custom.bench.ts > big schema | 146.36 (+7.26%), baseline 136.45 | 120.72 (82.48%) |
| src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers) | 2.67 (+9.88%), baseline 2.43 | 2.13 (79.72%) |
| src/client/zero.bench.ts > pk compare > pk = N | 68.64 (+9.82%), baseline 62.51 | 54.62 (79.57%) |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | 4.04 (+8.08%), baseline 3.74 | 3.29 (81.60%) |
| Branch | arv/basic-repl-btree-opt |
| Testbed | self-hosted |
| Benchmark | Result: ops/s (Δ%) | Lower Boundary: ops/s (limit %) |
|---|---|---|
| 1 exists: track.exists(album) | 13,248.78 (-2.35%), baseline 13,568.11 | 11,726.40 (88.51%) |
| 10 exists (AND) | 217,268.61 (+10.23%), baseline 197,108.58 | 165,059.45 (75.97%) |
| 10 exists (OR) | 3,970.80 (+1.57%), baseline 3,909.60 | 3,377.88 (85.07%) |
| 12 exists (AND) | 194,538.65 (+12.19%), baseline 173,394.43 | 144,732.98 (74.40%) |
| 12 exists (OR) | 3,256.62 (-2.02%), baseline 3,323.87 | 2,860.63 (87.84%) |
| 12 level nesting | 2,925.18 (+1.35%), baseline 2,886.12 | 2,481.14 (84.82%) |
| 2 exists (AND): track.exists(album).exists(genre) | 5,089.22 (-0.21%), baseline 5,100.05 | 4,416.91 (86.79%) |
| 3 exists (AND) | 1,961.40 (-1.67%), baseline 1,994.77 | 1,744.00 (88.92%) |
| 3 exists (OR) | 980.65 (-1.63%), baseline 996.91 | 864.97 (88.20%) |
| 5 exists (AND) | 310.12 (-1.19%), baseline 313.86 | 272.66 (87.92%) |
| 5 exists (OR) | 162.39 (-1.61%), baseline 165.04 | 142.66 (87.85%) |
| Nested 2 levels: track > album > artist | 4,454.47 (-0.11%), baseline 4,459.38 | 3,875.96 (87.01%) |
| Nested 4 levels: playlist > tracks > album > artist | 735.22 (+0.31%), baseline 732.96 | 640.11 (87.06%) |
| Nested with filters: track > album > artist (filtered) | 3,625.67 (-2.60%), baseline 3,722.41 | 3,255.74 (89.80%) |
| planned: playlist.exists(tracks) | 659.53 (+8.05%), baseline 610.41 | 540.29 (81.92%) |
| planned: track.exists(album) OR exists(genre) | 172.00 (+5.35%), baseline 163.27 | 146.14 (84.97%) |
| planned: track.exists(album) where title="Big Ones" | 7,889.80 (+6.05%), baseline 7,439.85 | 6,655.42 (84.35%) |
| planned: track.exists(album).exists(genre) | 42.07 (+9.24%), baseline 38.52 | 33.95 (80.70%) |
| planned: track.exists(album).exists(genre) with filters | 5,701.89 (+8.87%), baseline 5,237.13 | 4,649.59 (81.54%) |
| planned: track.exists(playlists) | 4.25 (+8.07%), baseline 3.94 | 3.50 (82.31%) |
| unplanned: playlist.exists(tracks) | 641.86 (+8.09%), baseline 593.84 | 525.58 (81.88%) |
| unplanned: track.exists(album) OR exists(genre) | 47.81 (+9.15%), baseline 43.80 | 38.31 (80.13%) |
| unplanned: track.exists(album) where title="Big Ones" | 59.80 (+8.46%), baseline 55.14 | 48.85 (81.68%) |
| unplanned: track.exists(album).exists(genre) | 41.70 (+8.70%), baseline 38.36 | 33.98 (81.49%) |
| unplanned: track.exists(album).exists(genre) with filters | 58.18 (+7.88%), baseline 53.93 | 48.07 (82.63%) |
| unplanned: track.exists(playlists) | 4.20 (+6.81%), baseline 3.93 | 3.50 (83.44%) |
| zpg: all playlists | 5.83 (+4.60%), baseline 5.58 | 5.06 (86.71%) |
| zql: all playlists | 8.30 (+13.15%), baseline 7.34 | 6.23 (75.01%) |
| zql: edit for limited query, inside the bound | 236,243.98 (+14.73%), baseline 205,921.66 | 176,733.26 (74.81%) |
| zql: edit for limited query, outside the bound | 241,826.86 (+16.12%), baseline 208,254.49 | 168,537.64 (69.69%) |
| zql: push into limited query, inside the bound | 115,442.52 (+10.07%), baseline 104,877.52 | 90,152.32 (78.09%) |
| zql: push into limited query, outside the bound | 419,986.62 (+10.77%), baseline 379,152.22 | 305,209.21 (72.67%) |
| zql: push into unlimited query | 352,967.36 (+12.78%), baseline 312,983.55 | 267,596.15 (75.81%) |
| zqlite: all playlists | 1.88 (+10.52%), baseline 1.71 | 1.46 (77.50%) |
| zqlite: edit for limited query, inside the bound | 82,254.27 (+11.68%), baseline 73,654.38 | 59,981.43 (72.92%) |
| zqlite: edit for limited query, outside the bound | 84,532.17 (+15.88%), baseline 72,951.16 | 56,100.14 (66.37%) |
| zqlite: push into limited query, inside the bound | 4,115.09 (+2.53%), baseline 4,013.49 | 3,642.06 (88.50%) |
| zqlite: push into limited query, outside the bound | 94,092.16 (+8.68%), baseline 86,577.73 | 76,592.61 (81.40%) |
| zqlite: push into unlimited query | 133,914.01 (+11.54%), baseline 120,064.02 | 102,272.74 (76.37%) |
| Branch | arv/basic-repl-btree-opt |
| Testbed | Linux |
| Benchmark | File Size: KB (Δ%) | Upper Boundary: KB (limit %) |
|---|---|---|
| zero-package.tgz | 1,800.69 (+0.21%), baseline 1,796.88 | 1,832.82 (98.25%) |
| zero.js | 247.33 (+0.76%), baseline 245.47 | 250.38 (98.78%) |
| zero.js.br | 67.60 (+0.59%), baseline 67.20 | 68.55 (98.61%) |
We've now been using this build of Replicache in our Expo React Native mobile app, using the SQLite kvStore against op-sqlite@15.2, for quite a while and are very happy with it. It is saving us over half of our initial sync snapshot time (download complete to ready).
We've discovered an issue with this branch where the output of watchers doesn't match the patch from the server. I've prepared a unit test to demonstrate the issue which passes on main and fails here. I will send it to @arv. |
grgbkr left a comment:
Very nice optimization work!
```ts
expect(structure2).toEqual(structure1);

// Also verify both trees have the same data
await withRead(dagStore1, async dagRead1 => {
```
Instead of nesting the reads like this, maybe use a helper and do them in sequence:
```ts
function checkContents(dagStore, hash) {
  return withRead(dagStore, async dagRead => {
    const tree = new BTreeRead(
      dagRead,
      formatVersion,
      hash,
      getEntrySize,
      chunkHeaderSize,
    );
    for (let i = 0; i < 500; i++) {
      const key = `key${i.toString().padStart(4, '0')}`;
      expect(await tree.get(key)).toBe(i);
    }
  });
}
await checkContents(dagStore1, hash1);
await checkContents(dagStore2, hash2);
```
```ts
}

for (const formatVersion of [FormatVersion.V6, FormatVersion.V7] as const) {
  test(`putMany empty entries > v${formatVersion}`, async () => {
```
Instead of repeating the `> v${formatVersion}` suffix in every test name, you could do:
```ts
for (const formatVersion of [FormatVersion.V6, FormatVersion.V7] as const) {
  describe(`v${formatVersion}`, () => {
    test(`putMany empty entries`, async () => {
    });
    // etc
  });
}
```
I'm also thinking it is time to remove the old format.
```ts
  });
});

test(`putMany triggers merge and partition > v${formatVersion}`, async () => {
```
How do we know in the tests that a merge or partition is actually being triggered? Comments describing how these tests trigger the various merges/partitions would help to clarify.
```ts
);

// Create internal nodes
currentLevel = parentPartitions.map(entries =>
```
Above you comment "reuse arrays to minimize allocations", but doesn't this `map` create a new array? Should this be

```ts
for (let i = 0; i < parentPartitions.length; i++) {
  currentLevel[i] = this.newInternalNodeImpl(parentPartitions[i], level);
}
currentLevel.length = parentPartitions.length;
```

Or maybe track the logical length of `currentLevel` in its own variable instead of resizing the array:

```ts
for (let i = 0; i < parentPartitions.length; i++) {
  currentLevel[i] = this.newInternalNodeImpl(parentPartitions[i], level);
}
currentLevelLength = parentPartitions.length;
```
```ts
  return;
}

// Slow path: merge with existing tree
```
Is this slow path still faster than putting each entry with a separate put call?
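For context on what the slow path does before rebalancing, the "group entries by affected child nodes" step might look like the sketch below. `groupByChild` and the max-key pivot representation are hypothetical, not the actual Replicache internals; the point is that sorted input lets the child index move only forward, which is the same property the restricted `binarySearchFrom()` exploits.

```typescript
// Hypothetical sketch: assign each sorted incoming key to the child subtree
// that should absorb it. pivots[i] is the max key of child i; the last child
// takes everything greater than the earlier pivots.
function groupByChild(pivots: readonly string[], keys: readonly string[]): string[][] {
  const groups: string[][] = pivots.map(() => []);
  let child = 0;
  for (const key of keys) {
    // Sorted input: never look back at earlier children.
    while (child < pivots.length - 1 && key > pivots[child]) {
      child++;
    }
    groups[child].push(key);
  }
  return groups;
}
```

Each group then gets applied to its child in one batch, so rebalancing happens once per touched child rather than once per entry.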
```ts
);
await apply(lc, dbWrite, patch);

expect(await dbWrite.get('a')).toBe(2);
```
Maybe these tests should use `scan` to read and assert on the entire content of the db?
```ts
 * 1. Dropping all operations before the last 'clear'
 * 2. For each key: put/del replace all previous operations; updates accumulate
 * 3. Removing standalone 'del' operations after a clear (deleting from empty tree)
 * 4. Merging updates after puts into a single put operation
```
Why can't updates that don't have a put before them be merged?
A put + merge -> put. I don't recall if I did merge + merge -> merge, but it should be correct to do.
```ts
 * 2. For each key: put/del replace all previous operations; updates accumulate
 * 3. Removing standalone 'del' operations after a clear (deleting from empty tree)
 * 4. Merging updates after puts into a single put operation
 * Note: Order is preserved for operations on the same key, but operations
```
If we also merged updates, would there always be just one operation per key?
I think that is correct. In theory we could get a del followed by a merge, but I have to check again whether that is an error or ignored.
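The rules under discussion can be modeled end to end in a small sketch. The op shapes and the name `optimizePatchSketch` are illustrative and do not match the real `optimizePatch` signature; rule 4 here merges an update into a preceding put, as the doc comment describes.

```typescript
type Op =
  | {op: 'clear'}
  | {op: 'put'; key: string; value: Record<string, unknown>}
  | {op: 'del'; key: string}
  | {op: 'update'; key: string; merge: Record<string, unknown>};

function optimizePatchSketch(patch: Op[]): Op[] {
  // Rule 1: everything before the last 'clear' is dead.
  const lastClear = patch.map(p => p.op).lastIndexOf('clear');
  const hadClear = lastClear !== -1;
  const ops = hadClear ? patch.slice(lastClear + 1) : patch;

  // Rules 2-4: collapse operations per key.
  const byKey = new Map<string, Op>();
  for (const p of ops) {
    if (p.op === 'clear') continue; // only the last clear survives rule 1
    const prev = byKey.get(p.key);
    if (p.op === 'update' && prev !== undefined && prev.op === 'put') {
      // Rule 4: put followed by update becomes a single put.
      byKey.set(p.key, {op: 'put', key: p.key, value: {...prev.value, ...p.merge}});
    } else if (p.op === 'del' && hadClear && prev === undefined) {
      // Rule 3: deleting a key that the clear already removed is a no-op.
    } else {
      // Rule 2: put/del replace whatever came before on this key.
      byKey.set(p.key, p);
    }
  }

  // Sort by key so consecutive puts can be bulk loaded.
  const out: Op[] = hadClear ? [{op: 'clear'}] : [];
  for (const key of [...byKey.keys()].sort()) {
    out.push(byKey.get(key)!);
  }
  return out;
}
```

Under this model the answer to the question above would be yes: after collapsing, each key carries at most one surviving operation.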
```ts
// already sorted

// Use putMany which will use BTreeWrite.fromEntries if the map is empty
```
What is BTreeWrite.fromEntries?
Old reference. I removed it and dealt with it without a new API.
```ts
if (existing.length === 1 && existing[0].op === 'put') {
  const {value} = existing[0];
  assertObject(value);
  const merged = mergeUpdate(p, value);
```
It would be good to add some tests around merging updates that have a constrain property; I feel uncertain whether those are being merged correctly.
feat(replicache): Add bulk insertion optimization with putMany
Overview
This PR adds bulk insertion optimization to Replicache's BTree and database layer, significantly improving performance for large batch operations like sync patches.
Changes
Core BTree Changes
`packages/replicache/src/btree/node.ts`:
- Add `putMany()` method to `DataNodeImpl` for efficient merging of sorted entries
- Add `putMany()` method to `InternalNodeImpl` with child grouping and rebalancing
- Add `putManyMergeAndPartition()` helper for node rebalancing during bulk operations
- Extract `binarySearchFrom()` to enable optimized searching from a start index
- Update `readTreeData()` to accept a `getEntrySize` parameter (test helper improvement)

`packages/replicache/src/btree/write.ts`:
- Add `BTreeWrite.putMany()` method with two optimized paths: a fast path for empty trees and a slow path that merges into an existing tree

Database Layer
`packages/replicache/src/db/write.ts`:
- Add `Write.putMany()` method that delegates to `BTreeWrite.putMany()`
- Maintains index update support with the same semantics as `put()`

Sync Layer Optimization
`packages/replicache/src/sync/patch.ts`:
- Add `optimizePatch()` function to eliminate redundant operations, dropping operations before the last `clear` and removing pointless `del` operations after a `clear`
- Update `apply()` to use optimized patches with bulk loading
- Add `mergeUpdate()` helper for update operation handling
- Add `bulkLoadPuts()` to handle consecutive put operations efficiently

Performance Impact
The optimization targets common sync patterns, where large batches of sorted entries are applied at once.
Benchmark Results
Comparison of `putMany()` vs sequential `put()` operations:
- 100 entries (small values): 3.36x faster
- 1,000 entries (small values): 5.30x faster
- 10,000 entries (small values): 4.15x faster
- Construction only (10,000 entries): 53.73x faster
- Update operations (1,000 entries): 4.47x faster

Key findings: `putMany()` is roughly 3-5x faster for typical batch sizes and over 50x faster for construction-only workloads.

Additional benefits:
- Reduces chunk writes through optimal tree construction
- Minimizes redundant operations through patch optimization
Testing
New test files:
- `packages/replicache/src/btree/write.bench.ts`: performance benchmarks comparing sequential `put()` vs `putMany()`
- `packages/replicache/src/btree/node.test.ts`: 17 new tests for `putMany()` behavior
- `packages/replicache/src/db/write.test.ts`: 3 new tests for database-level `putMany()`
- `packages/replicache/src/sync/patch.test.ts`: 24 new tests for patch optimization

Test scenarios covered include bulk operations, rebalancing, and edge cases.
Compatibility
- No breaking changes: `put()` and `del()` methods remain unchanged
- `putMany()` is an additive optimization that can be adopted incrementally
- Compatible with both the V6 and V7 formats

Implementation Details
Key Algorithm Improvements
- Fast path (empty tree): builds the tree bottom-up with optimal partitioning in a single pass
- Slow path (existing tree): groups entries by affected child node and performs batch rebalancing per group, using a restricted binary search for sorted input
- Patch optimization: drops operations before the last clear, merges operations per key, and sorts for bulk loading

Memory Efficiency
- Reuses arrays during tree construction to minimize allocations
Future Work
- `delMany()` for bulk deletions