Hi! Have you seen the set reconciliation algorithm described here? https://arxiv.org/abs/2212.13567
I wrote a C++ library that defines and implements this as a concrete protocol: https://github.com/hoytech/negentropy
Before building negentropy, I looked into the algorithms that gensync implements, and my conclusion was that range-based reconciliation is superior under most conditions:
- Handles very large cardinalities (we routinely use it to sync datasets of 10 million items and greater)
- No tuning required: works predictably well with either a large or a small number of differences
- Bandwidth overhead is logarithmic in DB size and linear in number of symmetric differences
- Negligible CPU usage compared with CPISync/minisketch
- No false positives or failed decodes, unlike bloom-filter-based algorithms
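To make the comparison concrete, here is a minimal local simulation of the range-based idea from the paper: each side fingerprints a key range; matching fingerprints prune the range, mismatches cause a split and recursion, and small ranges are exchanged directly. This is an illustration only, not negentropy's actual wire protocol, and the hash, split strategy, and base-case threshold are my own assumptions.

```python
import hashlib

def fingerprint(items):
    # Hash the sorted items in a range; assumption: SHA-256 over decimal keys.
    h = hashlib.sha256()
    for it in sorted(items):
        h.update(str(it).encode())
    return h.digest()

def reconcile(a, b, lo, hi, diffs):
    # Find the symmetric difference of integer sets a and b within [lo, hi).
    ra = [x for x in a if lo <= x < hi]
    rb = [x for x in b if lo <= x < hi]
    if fingerprint(ra) == fingerprint(rb):
        return  # matching fingerprints: whole range pruned in one round trip
    if hi - lo <= 1 or max(len(ra), len(rb)) <= 2:
        diffs.update(set(ra) ^ set(rb))  # small range: exchange items directly
        return
    mid = (lo + hi) // 2  # mismatch: split the range and recurse on both halves
    reconcile(a, b, lo, mid, diffs)
    reconcile(a, b, mid, hi, diffs)

a = {1, 5, 9, 100}
b = {1, 5, 200}
diffs = set()
reconcile(a, b, 0, 256, diffs)
print(sorted(diffs))  # the symmetric difference: [9, 100, 200]
```

Ranges whose fingerprints already match cost only one fingerprint exchange regardless of how many items they contain, which is where the logarithmic-in-DB-size bandwidth behaviour comes from.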
But I would love to have some objective benchmarks to verify this (or be proven mistaken!).
Thanks for any help!