Version 0.12.3
The documentation of urlEncode for the boolean parameter is:
Whether to decode '+' to ' '
That doesn't appear to be the whole story, though. The flag also controls whether other characters, such as @, are percent-encoded:
ghci> import Prelude
ghci> URI.urlEncode False "test@example.com"
"test@example.com"
ghci> URI.urlEncode True "test@example.com"
"test%40example.com"
The documentation does appear to be accurate for urlDecode, where replacePlus is used in only one place:
-- | Percent-decoding.
urlDecode :: Bool -- ^ Whether to decode @\'+\'@ to @\' \'@
          -> B.ByteString -> B.ByteString
urlDecode replacePlus z = fst $ B.unfoldrN (B.length z) go z
  where
    go bs =
        case B.uncons bs of
            Nothing -> Nothing
            Just (43, ws) | replacePlus -> Just (32, ws) -- plus to space
            Just (37, ws) -> Just $ fromMaybe (37, ws) $ do -- percent
                (x, xs) <- B.uncons ws
                x' <- hexVal x
                (y, ys) <- B.uncons xs
                y' <- hexVal y
                Just (combine x' y', ys)
            Just (w, ws) -> Just (w, ws)
    hexVal w
        | 48 <= w && w <= 57  = Just $ w - 48 -- 0 - 9
        | 65 <= w && w <= 70  = Just $ w - 55 -- A - F
        | 97 <= w && w <= 102 = Just $ w - 87 -- a - f
        | otherwise = Nothing
    combine :: Word8 -> Word8 -> Word8
    combine a b = shiftL a 4 .|. b
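To make the decoding behaviour concrete, here is a minimal sketch that mirrors the logic above on plain Strings, using only base. The name decodeSketch is hypothetical and not part of the library; it only illustrates that the flag changes nothing except the handling of '+'.

```haskell
import Data.Char (chr, digitToInt, isHexDigit)

-- Hypothetical String-based stand-in for urlDecode (not the library function).
decodeSketch :: Bool -> String -> String
decodeSketch replacePlus = go
  where
    go [] = []
    -- '+' becomes a space only when the flag is set.
    go ('+' : rest) | replacePlus = ' ' : go rest
    -- "%XY" with two hex digits becomes the corresponding character.
    go ('%' : x : y : rest)
      | isHexDigit x && isHexDigit y =
          chr (16 * digitToInt x + digitToInt y) : go rest
    -- Everything else, including a malformed '%', passes through unchanged.
    go (c : rest) = c : go rest

main :: IO ()
main = do
  putStrLn (decodeSketch True "a+b%40c")   -- prints "a b@c"
  putStrLn (decodeSketch False "a+b%40c")  -- prints "a+b@c"
```

In both modes %40 decodes to @; only the '+' differs.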
In URL encoding, however, the flag isn't restricted to spaces. It selects which set of extra characters is left unescaped:
unreservedQS, unreservedPI :: [Word8]
unreservedQS = map ord8 "-_.~"
unreservedPI = map ord8 "-_.~:@&=+$,"

-- | Percent-encoding for URLs.
urlEncodeBuilder' :: [Word8] -> B.ByteString -> B.Builder
urlEncodeBuilder' extraUnreserved = mconcat . map encodeChar . B.unpack
    where
      encodeChar ch | unreserved ch = B.word8 ch
                    | otherwise     = h2 ch
      unreserved ch | ch >= 65 && ch <= 90  = True -- A-Z
                    | ch >= 97 && ch <= 122 = True -- a-z
                    | ch >= 48 && ch <= 57  = True -- 0-9
      unreserved c = c `elem` extraUnreserved
      -- must be upper-case
      h2 v = B.word8 37 `mappend` B.word8 (h a) `mappend` B.word8 (h b) -- 37 = %
          where (a, b) = v `divMod` 16
      h i | i < 10    = 48 + i -- zero (0)
          | otherwise = 65 + i - 10 -- 65: A

-- | Percent-encoding for URLs (using 'B.Builder').
urlEncodeBuilder
    :: Bool -- ^ Whether input is in query string. True: Query string, False: Path element
    -> B.ByteString
    -> B.Builder
urlEncodeBuilder True  = urlEncodeBuilder' unreservedQS
urlEncodeBuilder False = urlEncodeBuilder' unreservedPI

-- | Percent-encoding for URLs.
urlEncode :: Bool -- ^ Whether to decode @\'+\'@ to @\' \'@
          -> B.ByteString -- ^ The ByteString to encode as URL
          -> B.ByteString -- ^ The encoded URL
urlEncode q = BL.toStrict . B.toLazyByteString . urlEncodeBuilder q
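To see exactly which characters the flag affects, here is a rough sketch that mirrors the encoding logic above on plain Strings, using only base. The names encodeWith, encodeSketch and flagSensitive are hypothetical and not part of the library, and the sketch assumes ASCII input; the two unreserved sets are copied from the excerpt.

```haskell
import Data.Char (isAlphaNum, isAscii, ord, toUpper)
import Numeric (showHex)

-- Extra unreserved characters, copied from the excerpt above.
unreservedQS, unreservedPI :: String
unreservedQS = "-_.~"
unreservedPI = "-_.~:@&=+$,"

-- Hypothetical String-based stand-in for urlEncodeBuilder'.
-- Assumes ASCII input (code points below 128).
encodeWith :: String -> String -> String
encodeWith extra = concatMap encodeChar
  where
    encodeChar c
      | isAscii c && isAlphaNum c = [c]
      | c `elem` extra = [c]
      | otherwise = '%' : pad (map toUpper (showHex (ord c) ""))
    -- Always emit two upper-case hex digits.
    pad [d] = ['0', d]
    pad ds = ds

-- Hypothetical stand-in for urlEncode:
-- True = query string, False = path element.
encodeSketch :: Bool -> String -> String
encodeSketch True = encodeWith unreservedQS
encodeSketch False = encodeWith unreservedPI

-- Characters whose encoding depends on the flag.
flagSensitive :: String
flagSensitive = filter (`notElem` unreservedQS) unreservedPI

main :: IO ()
main = do
  putStrLn (encodeSketch False "test@example.com") -- prints "test@example.com"
  putStrLn (encodeSketch True "test@example.com")  -- prints "test%40example.com"
  putStrLn (encodeSketch True "a b")               -- prints "a%20b"
  putStrLn flagSensitive                           -- prints ":@&=+$,"
```

Under these assumptions the flag changes the encoding of : @ & = + $ and , while a space becomes %20 in both modes, so the "decode '+' to ' '" wording doesn't describe anything the encoder does.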
I'm not super familiar with this domain. Should URL decoding have an option to handle those other characters as well? Or should only the documentation for URL encoding change?