Allow `&[u8]` and `Vec<u8>` input #75

tomekb234 · 2025-10-28T21:04:05Z

This library immediately converts `&str` input to `&[u8]` with `as_bytes()` and, from then on, does not appear to rely on the input being UTF-8 encoded: non-UTF-8 input causes no undefined behavior, no panics, and no invalid generated DOM. Because there is no `parse()` variant that accepts `&[u8]`, a caller holding such input must either pay for an unnecessary UTF-8 check or awkwardly resort to the unsafe `str::from_utf8_unchecked`.
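To illustrate the workaround described above, here is a minimal sketch using only the standard library. `parse_str` is a hypothetical stand-in for a `&str`-only entry point (it just counts `<` bytes) and is not this library's API; `parse_checked` and `parse_unchecked` are likewise names invented for this example.

```rust
use std::str;

// Hypothetical stand-in for a parser entry point that only accepts `&str`.
// It merely counts `<` bytes; the real parser is not modelled here.
fn parse_str(input: &str) -> usize {
    input.as_bytes().iter().filter(|&&b| b == b'<').count()
}

// Option 1: validate first. Safe, but `str::from_utf8` scans the whole
// input before the parser ever sees it, and rejects non-UTF-8 bytes.
fn parse_checked(bytes: &[u8]) -> Result<usize, str::Utf8Error> {
    let text = str::from_utf8(bytes)?;
    Ok(parse_str(text))
}

// Option 2: skip validation. No scan, but the function must be `unsafe`.
//
// SAFETY contract: `bytes` must be valid UTF-8, because that is what
// `str::from_utf8_unchecked` requires of its caller.
unsafe fn parse_unchecked(bytes: &[u8]) -> usize {
    let text = unsafe { str::from_utf8_unchecked(bytes) };
    parse_str(text)
}
```

Neither option suits input that is not known to be UTF-8: the first rejects it and the second has a safety contract the caller cannot uphold, which is the gap a byte-slice entry point would close.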

This pull request:

- changes the type of the input parameter of `Parser::new` from `&str` to `&[u8]`,

- changes the type of the input parameter of `VDomGuard::new` from `String` to `Vec<u8>` (note that because `String` is backed by a `Vec<u8>`, converting the former to the latter is cheap),

- changes the type of the pointer stored in `VDomGuard` from `*mut str` to `*mut [u8]`,

- adds new public `parse_bytes()` and `parse_bytes_owned()` functions,

- adds a test for non-UTF-8 input.
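The shape of these changes can be sketched roughly as follows. This is an illustration only: the functions below are hypothetical stand-ins that count `<` bytes, and their signatures are assumptions, not the library's real ones (which are not shown in this description). The sketch shows string-based entry points delegating to byte-based ones, and checks the claim that `String` to `Vec<u8>` conversion is cheap.

```rust
// Hypothetical byte-based entry point: works on any bytes, UTF-8 or not.
fn parse_bytes(input: &[u8]) -> usize {
    input.iter().filter(|&&b| b == b'<').count()
}

// Hypothetical string-based entry point: `as_bytes()` is a free view
// of the same memory, so delegating adds no validation or copying.
fn parse(input: &str) -> usize {
    parse_bytes(input.as_bytes())
}

// Hypothetical owned variants, mirroring the `String` -> `Vec<u8>` change.
fn parse_bytes_owned(input: Vec<u8>) -> usize {
    parse_bytes(&input)
}

fn parse_owned(input: String) -> usize {
    // `String::into_bytes` hands over the existing buffer without copying.
    parse_bytes_owned(input.into_bytes())
}

// Returns true if the `Vec<u8>` produced by `into_bytes` still points at
// the buffer the `String` owned, i.e. no reallocation took place.
fn into_bytes_reuses_buffer(s: String) -> bool {
    let before = s.as_ptr();
    let bytes = s.into_bytes();
    before == bytes.as_ptr()
}
```

With this layout, existing `&str` and `String` callers keep working unchanged, while callers with raw bytes can skip both the UTF-8 check and the `unsafe` block.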

Fixes #61.

tomekb234 added 2 commits October 28, 2025 21:09

Allow &[u8] and Vec<u8> input

3f4ceb0

Fixes y21#61.

Add test for non-UTF-8 input

ff062a4
