Range-Based Set Reconciliation (negentropy)

Hi! Have you seen the set reconcilliation algorithm described here? https://arxiv.org/abs/2212.13567

I wrote a C++ library that defines and implements this as a concrete protocol: https://github.com/hoytech/negentropy

Before building negentropy, I looked into the algorithms that gensync implements, and my conclusion was that range-based reconcilliation is superior under most conditions:

* Handles very large cardinalities (we are routinely using it for syncing data-sets of size 10 million, and greater)
* No tuning required: works predictably well with large or small number of differences
* Bandwidth overhead is logarithmic in DB size and linear in number of symmetric differences
* Negligible CPU usage compared with CPISync/[minisketch](https://github.com/sipa/minisketch)
* No false positives or failed decodes like with bloom-filter based algos

But I would love to have some objective benchmarks to verify this (or be proven mistaken!).

Thanks for any help!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Range-Based Set Reconciliation (negentropy) #9

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Range-Based Set Reconciliation (negentropy) #9

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions