Skip to content

Conversation

@rayozzie
Copy link
Contributor

These are huge optimizations that impact large transfers - arguably 10x-70x.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR brings significant performance optimizations to the COBS (Consistent Overhead Byte Stuffing) encoding and decoding functions in note-c. The optimizations replace byte-by-byte processing with bulk memory operations and hardware-accelerated searches, claiming 10x-70x performance improvements for large data transfers.

Changes:

  • Replaced byte-by-byte COBS decoding with bulk memory operations (memmove/memcpy) for improved performance
  • Replaced byte-by-byte COBS encoding with memchr()-based zero-finding and bulk copy operations
  • Updated _cobsEncodedLength() to use the same optimization strategy as encoding
  • Added comprehensive inline documentation explaining the optimization techniques

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

This comment was marked as off-topic.

@zfields zfields requested a review from haydenroche5 January 20, 2026 16:03
@rayozzie
Copy link
Contributor Author

⏺ Analysis Complete

After thoroughly studying both implementations and the COBS specification, I can report that the feedback is incorrect
and both implementations are correct.

The Key Finding

Both the Go and C implementations produce 256 bytes for encoding 254 non-zero input bytes. This is the correct COBS
encoding, not an error.

Detailed Trace of C Implementation (254 non-zero bytes)

Looking at cobs.c:18-88:

Initial: length=254, code=1, code_ptr=dst[0], dst advanced to position 1

Loop iteration:

  • maxBytes = 0xFF - 1 = 254
  • searchLen = min(254, 254) = 254
  • memchr finds no zero → zeroPos = NULL
  • chunkLen = 254
  • Copy 254 bytes to dst[1..254]
  • length -= 254 → length = 0
  • code += 254 → code = 255 (0xFF)

Condition check:

  • zeroPos != NULL && length > 0 → FALSE (zeroPos is NULL)
  • code == 0xFF → TRUE
    • Write 0xFF to code_ptr (position 0)
    • code = 1
    • code_ptr = dst (position 255)
    • dst++ → position 256

Loop exits (length = 0)

Post-loop: Write code (1) to code_ptr (position 255)
Return: 256 bytes

Output: [0xFF, data×254, 0x01] = 256 bytes

Go Implementation Verification

The Go test file at hub/note-go/notecard/cobs_test.go:198-215 explicitly confirms this is expected:

// 254 non-zero bytes: [0xFF, 254 data bytes, 0x01]
// The 0xFF means "254 data bytes follow, no implicit zero after"
// The trailing 0x01 terminates the stream (0 more data bytes)
require.Len(t, encoded254, 256, "254 non-zero bytes encode to 256 bytes")
require.Equal(t, byte(0xFF), encoded254[0], "first code byte should be 0xFF")
require.Equal(t, byte(0x01), encoded254[255], "trailing code byte should be 0x01")

Why 256 Bytes is Correct

Per the COBS specification (from the original https://stuartcheshire.org/papers/COBSforToN.pdf):

"To encode some bytes, first append a zero byte, then break them into groups..."

For input [1, 2, ..., 254]:

  1. Append zero: [1, 2, ..., 254, 0]
  2. Break into groups:
    - Group 1: 254 non-zero bytes (full block)
    - Group 2: 0 non-zero bytes followed by the appended zero
  3. Encode groups:
    - Group 1: prepend 255 (0xFF) → [0xFF, 1, ..., 254]
    - Group 2: delete zero, prepend 1 → [0x01]
  4. Result: [0xFF, 1, ..., 254, 0x01] = 256 bytes

cobsEncodedLength Verification

Tracing cobs.c:94-137 for 254 non-zero bytes:

  • encodedLen = 1, code = 1
  • chunkLen = 254, encodedLen = 255, code = 0xFF
  • code == 0xFF triggers: encodedLen++ → 256

Both functions return 256. There is no mismatch.

The Feedback's Error

The feedback suggests changing else if (code == 0xFF) to else if (code == 0xFF && length > 0). This would break the
implementation:

  • Without the trailing 0x01, the output would be [0xFF, data×254] (255 bytes)
  • A COBS decoder reading this would: read 0xFF, copy 254 bytes, then try to read the next code byte (since 0xFF means
    "no implicit zero, continue to next block")
  • No more data → truncated/malformed frame

The trailing 0x01 is required to properly terminate the encoded stream. It signals "0 data bytes follow" which
completes the COBS encoding.

CobsEncodedLength Formula Confirmation

Go uses: length + (1 + (length / 254))

  • For 253 bytes: 253 + (1 + 0) = 254 ✓
  • For 254 bytes: 254 + (1 + 1) = 256 ✓
  • For 255 bytes: 255 + (1 + 1) = 257 ✓

Conclusion

  1. Both Go and C implementations are correct and produce identical output
  2. 256 bytes for 254 non-zero input is correct per the COBS specification
  3. The feedback is wrong - it misunderstands the COBS algorithm
  4. There is no mismatch between cobsEncode and cobsEncodedLength
  5. The implementations are interoperable - they encode and decode each other's output correctly

The suggested "fix" in the feedback would actually break standard COBS compliance and could cause interoperability
issues with other COBS implementations.

Sources:

These are huge optimizations that impact large transfers - arguably 10x-70x.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@zfields zfields merged commit 0560b73 into master Jan 24, 2026
19 of 21 checks passed
@zfields zfields deleted the ray/translead branch January 24, 2026 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants