Remarks on implementation:

Some remarks and suggestions to improve on the implementation: 

1. There are some unnecessary extra allocations, e.g. in the function `sha256_32` you allocate a buffer where you move data. In practice, you can move the buffer out of the hasher: 

```rust
#[inline(always)]
fn sha256_32(data: &[u8]) -> Bytes32 {
    let mut hasher = Sha256::new();
    hasher.update(data);
    let out = hasher.finalize();
    // Avoid an extra allocation/copy  by returning `out`'s internal buffer
    let arr: [u8; 32] = *(out.as_ref());
    arr
}
```

2. The implementation is fairly low level, which makes it difficult to follow and make the implementation error prone. For example, the function `subtree_root_sparse` will panic if an empty tree is passed as input. Also, some implementation details are not clear, e.g. why the check `            if i + 1 < level_nodes.len() && level_nodes[i + 1].0 == (idx ^ 1) {`  is used to test if the sibling of a node is present. 

3. Judging from the implementation, the structure implemented is more a 256-ary sparse merkle tree, rather than a binary one. In general, using a size 256 as the radix of the tree might result in bigger proofs, and a smaller Radix is preferred (for example, many variations of Sparse Merkle trees such as Merkle Patricia Tries`(https://ethereum.org/developers/docs/data-structures-and-encoding/patricia-merkle-trie/) and `Jellyfish Merkle trees` (https://developers.diem.com/papers/jellyfish-merkle-tree/2021-01-14.pdf) have a radix of 16. 

4. In general, when programming in rust it is easier to leverage the NewType pattern (https://rust-unofficial.github.io/patterns/patterns/behavioural/newtype.html) to guarantee strong invariants at compile time. For example, the function `stem_bit`  returns a `u8`, but in practice one could define its own `Bit` enum to guarantee that the result is either `0` or `1`. 

```rust
#[repr(u8)] // Will be represented in memory as a u8
pub enum Bit {
Zero, // will be represented in memory as `00000000`
One  // will be represented in memory as `00000001`
}
```

5. When implementing Sparse Merkle Tries or any variant, it is easier to store all nodes in a `HashMap<KeyPrefix, Node>`, where `KeyPrefix` is a sequence from 0 to 32 bytes, which needs to satisfy the trait bounds for `HashMap::Key` (e.g. `Hash`), and `Node` contains a bitmap of which children are defined. If the `i-th` bit of the bitmap is set, then the node has a child at position `i`, otherwise it has no child at that position. When traversing the tree from the root to a leaf, it makes sense to define an iterator that helps you select the next node to be traversed, rather than relying on a low-level implementation.

6. Depending on whether a value of `000...000` for a leaf is the same as the leaf not being present, the implementation might not be collision resistant. From what I understand, because `h32or64` returns `00...0` when hashing the concatenation of `00...0` and `00...0`, the empty tree and a tree whose all leaves have value `00...0` (or just a leaf with that value) will have the same root hash.

Another question from my side, what is the rationale behind the need for Sparse Merkle trees?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Remarks on implementation: #1

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Remarks on implementation: #1

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions