
streamsearch (multipart) can be made significantly faster with Buffer.indexOf #202

@davidje13

Description


Prerequisites

  • I have written a descriptive issue title
  • I have searched existing issues to ensure the feature has not already been requested

🚀 Feature Proposal

I recently ran a performance test comparing the original streamsearch implementation with an experimental change that makes heavy use of Buffer.indexOf (custom code is needed only for handling chunk boundaries). I found that, at least in recent versions of Node.js, a Buffer.indexOf-based approach is up to 30× faster in some cases, and typically ~50–100% faster.

In real-world terms, this translates to handling multipart form data (excluding network access) ~2–3× faster in almost all cases, and I have yet to find a situation where it is slower.

Motivation

Streamsearch was originally written to be fast, but it was implemented at a time when Node.js' own Buffer.indexOf did naive byte-by-byte checks. Buffer.indexOf is now fast too (internally it implements Boyer-Moore-Horspool, among other algorithms), and being written in a lower-level language it has advantages that cannot be matched in userspace. It even "wins" when there are lots of small chunks, which is potentially surprising: higher-level code can share pre-calculated data across chunks, whereas indexOf must recalculate its references on every call.

You can see more detail in this old thread: mscdex/streamsearch#2

Example

| Test needle | adapted streamsearch (100kB data) | streamsearch@1.1.0 (100kB data) | Buffer.indexOf (100kB data, for comparison) | adapted streamsearch (2kB batches) | streamsearch@1.1.0 (2kB batches) |
| --- | --- | --- | --- | --- | --- |
| `\r\n` | 4.0µs | 169.3µs | *3.9µs* | 6.8µs | 200.9µs |
| `\r\n--sep` | 4.0µs | 95.4µs | *3.8µs* | 6.9µs | 97.8µs |
| realistic | 4.5µs | 15.7µs | *4.5µs* | 8.0µs | 21.0µs |
| repetitive prefixed | 5.3µs | 13.8µs | *5.3µs* | 8.3µs | 26.0µs |
| long repetitive prefixed | 6.5µs | 36.6µs | *6.2µs* | 17.0µs | 157.4µs |
| repetitive suffixed | 4.6µs | 15.2µs | *4.5µs* | 10.0µs | 24.9µs |
| long repetitive suffixed | 14.0µs | 33.6µs | *13.6µs* | 76.7µs | 181.5µs |

The "adapted" column uses Buffer.indexOf as much as possible, along with a few other optimisations to how the lookbehind is handled, offloading some of the filtering and comparisons to Buffer methods. The full code is linked below.

The left three result columns are from an artificial scenario where the data arrives as a single 100kB chunk; a "dumb" Buffer.indexOf-loop implementation is shown there for comparison. The right two columns are a more realistic scenario where the same data is split into 2kB chunks, so "lookbehind" handling is required.

The table above is from a Mac running Node.js 24.

On Linux (a Pi 5 running Node.js 22), the results are similar:

| Test needle | adapted streamsearch (100kB data) | streamsearch@1.1.0 (100kB data) | Buffer.indexOf (100kB data) | adapted streamsearch (2kB batches) | streamsearch@1.1.0 (2kB batches) |
| --- | --- | --- | --- | --- | --- |
| `\r\n` | 12.9µs | 623.5µs | *10.5µs* | 24.4µs | 632.6µs |
| `\r\n--sep` | 10.9µs | 180.5µs | *10.7µs* | 24.6µs | 184.7µs |
| realistic | 11.5µs | 27.0µs | *11.3µs* | 26.8µs | 37.4µs |
| repetitive prefixed | 11.4µs | 29.9µs | *11.0µs* | 28.9µs | 47.8µs |
| long repetitive prefixed | 19.1µs | 120.1µs | *18.0µs* | 61.1µs | 287.6µs |
| repetitive suffixed | 12.9µs | 28.2µs | *12.4µs* | 45.1µs | 47.5µs |
| long repetitive suffixed | 64.8µs | 141.0µs | *64.2µs* | 255.9µs | 296.8µs |

The tests use randomised data so the numbers fluctuate a little.

Since running these tests, I've found further ways to tune the performance by changing the API slightly (specifically, breaking up the content and boundary callbacks).

You can find my forked code here (licensed under the same MIT license as the original):
