-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Some remarks and suggestions to improve on the implementation:
- There are some unnecessary extra allocations, e.g. in the function
sha256_32you allocate a buffer where you move data. In practice, you can move the buffer out of the hasher:
#[inline(always)]
fn sha256_32(data: &[u8]) -> Bytes32 {
let mut hasher = Sha256::new();
hasher.update(data);
let out = hasher.finalize();
// Avoid an extra allocation/copy by returning `out`'s internal buffer
let arr: [u8; 32] = *(out.as_ref());
arr
}-
The implementation is fairly low level, which makes it difficult to follow and make the implementation error prone. For example, the function
subtree_root_sparsewill panic if an empty tree is passed as input. Also, some implementation details are not clear, e.g. why the checkif i + 1 < level_nodes.len() && level_nodes[i + 1].0 == (idx ^ 1) {is used to test if the sibling of a node is present. -
Judging from the implementation, the structure implemented is more a 256-ary sparse merkle tree, rather than a binary one. In general, using a size 256 as the radix of the tree might result in bigger proofs, and a smaller Radix is preferred (for example, many variations of Sparse Merkle trees such as Merkle Patricia Tries
(https://ethereum.org/developers/docs/data-structures-and-encoding/patricia-merkle-trie/) andJellyfish Merkle trees` (https://developers.diem.com/papers/jellyfish-merkle-tree/2021-01-14.pdf) have a radix of 16. -
In general, when programming in rust it is easier to leverage the NewType pattern (https://rust-unofficial.github.io/patterns/patterns/behavioural/newtype.html) to guarantee strong invariants at compile time. For example, the function
stem_bitreturns au8, but in practice one could define its ownBitenum to guarantee that the result is either0or1.
#[repr(u8)] // Will be represented in memory as a u8
pub enum Bit {
Zero, // will be represented in memory as `00000000`
One // will be represented in memory as `00000001`
}-
When implementing Sparse Merkle Tries or any variant, it is easier to store all nodes in a
HashMap<KeyPrefix, Node>, whereKeyPrefixis a sequence from 0 to 32 bytes, which needs to satisfy the trait bounds forHashMap::Key(e.g.Hash), andNodecontains a bitmap of which children are defined. If thei-thbit of the bitmap is set, then the node has a child at positioni, otherwise it has no child at that position. When traversing the tree from the root to a leaf, it makes sense to define an iterator that helps you select the next node to be traversed, rather than relying on a low-level implementation. -
Depending on whether a value of
000...000for a leaf is the same as the leaf not being present, the implementation might not be collision resistant. From what I understand, becauseh32or64returns00...0when hashing the concatenation of00...0and00...0, the empty tree and a tree whose all leaves have value00...0(or just a leaf with that value) will have the same root hash.
Another question from my side, what is the rationale behind the need for Sparse Merkle trees?