perf: faster encode_url and decode_url #441
Conversation
lib/decode_url.ts
```diff
 const decodeURL = (str: string) => {
-  if (parse(str).protocol) {
+  if (hasProtocolLikeNode(str)) {
```
If we are going to implement some loose check, why not just `if (str.includes('://'))`?
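A quick sketch (not from the PR; the helper name is made up) of why a bare `includes('://')` check behaves differently from a real protocol check: it misses schemes that are not followed by `//`, and it matches `://` anywhere in the string, not just in a leading scheme.

```typescript
// Hypothetical naive check, for illustration only.
const hasProtocolNaive = (str: string): boolean => str.includes("://");

// Misses schemes that have no "//" after the colon:
console.log(hasProtocolNaive("mailto:user@example.com")); // false

// Matches "://" anywhere, not only in a leading scheme:
console.log(hasProtocolNaive("/redirect?to=https://example.com")); // true
```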
I am still not a fan of copying some regexp. How would we know if it is spec-compliant or not?
This regexp is from Node.js (https://github.com/nodejs/node/blob/a1244f04dea9148c44fd6daf60b5105f7a85ea12/lib/url.js#L96), and it is stricter than the one in RFC 3986 (https://www.rfc-editor.org/rfc/rfc3986#appendix-B). The regex in RFC 3986 is `/^(([^:/?#]+):)/`.
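A small sketch comparing the two patterns. The Node.js pattern below is my reading of the linked source and may differ slightly from the exact pattern there; the RFC 3986 pattern is the scheme part of the Appendix B regex quoted above.

```typescript
// Approximation of Node.js lib/url.js protocol pattern (assumed, see lead-in).
const nodeProtocolPattern = /^[a-z0-9.+-]+:/i;
// Scheme-extraction regex from RFC 3986, Appendix B.
const rfcProtocolPattern = /^(([^:/?#]+):)/;

// The Node.js pattern is stricter: scheme characters are limited to
// [a-z0-9.+-], while the RFC regex accepts anything except ":", "/", "?", "#".
console.log(nodeProtocolPattern.test("weird scheme://x")); // false (space rejected)
console.log(rfcProtocolPattern.test("weird scheme://x"));  // true
```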
But what happens if Node.js changes their implementation (maybe a ReDoS attack is discovered, maybe a CSRF PoC is created and a new CVE is filed)? How can we make sure our copied regexp stays up to date?
Since there is such a concern, let's consider a less radical optimization method.
Signed-off-by: Mimi <1119186082@qq.com>
```ts
if (index < 0) {
  return unescape(str);
}
if (parse(str.slice(0, index + 1)).protocol) {
```
`url.parse` is only used to determine whether the given input is "somewhat" a valid URL (because `try { new URL() } catch {}` is extremely slow, so we want to avoid having `new URL()` throw at all costs).
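A hedged sketch of the trade-off described above, with made-up helper names: a thrown exception from `new URL()` is far more expensive than a cheap regex pre-check, so obviously invalid inputs are filtered out before `URL` is ever constructed.

```typescript
// Hypothetical fast pre-check (pattern assumed, modeled on Node.js's).
const looksLikeProtocol = (str: string): boolean => /^[a-z0-9.+-]+:/i.test(str);

// Hypothetical validator: only reach the (slow-on-throw) URL constructor
// when the input at least starts with something scheme-shaped.
const isParsableURL = (str: string): boolean => {
  if (!looksLikeProtocol(str)) return false; // fast path, no exception
  try {
    new URL(str);
    return true;
  } catch {
    return false;
  }
};

console.log(isParsableURL("https://example.com")); // true
console.log(isParsableURL("relative/path"));       // false, without throwing
```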

Description
close #439