A UnicodeEscaper that escapes some set of Java characters using
the URI percent encoding scheme. The set of safe characters (those which
remain unescaped) can be specified on construction.
For details on escaping URIs for use in web pages, see section 2.4 of
RFC 3986.
In most cases this class should not need to be used directly. If you
have no special requirements for escaping your URIs, you should use either
CharEscapers#uriEscaper() or
CharEscapers#uriEscaper(boolean).
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0"
through "9" remain the same.
- Any additionally specified safe characters remain the same.
- If plusForSpace was specified, the space character " " is
converted into a plus sign "+".
- All other characters are converted into one or more bytes using UTF-8
encoding and each byte is then represented by the 3-character string
"%XY", where "XY" is the two-digit, uppercase, hexadecimal representation
of the byte value.
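The rules above can be sketched in a few lines of Java. This is a minimal illustration, not this class's actual implementation: the class name PercentEscapeSketch, the method escape and its parameters are hypothetical. It also assumes that an explicitly safe character takes precedence over the plusForSpace rule, and it does not reject unpaired surrogates (the JDK encoder substitutes "?" for them).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the encoding rules; not the real escaper class.
final class PercentEscapeSketch {
    private static final char[] UPPER_HEX = "0123456789ABCDEF".toCharArray();

    private PercentEscapeSketch() {}

    // safeChars: additional characters to leave unescaped (assumed parameter).
    // plusForSpace: if true, ' ' becomes '+' instead of "%20".
    static String escape(String input, String safeChars, boolean plusForSpace) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // Walk by code point so surrogate pairs are encoded as one UTF-8 sequence.
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);

            if (isAsciiAlphanumeric(cp) || safeChars.indexOf(cp) >= 0) {
                // Alphanumerics and explicitly safe characters pass through unchanged.
                out.appendCodePoint(cp);
            } else if (cp == ' ' && plusForSpace) {
                out.append('+');
            } else {
                // Everything else: UTF-8 bytes, each written as %XY in uppercase hex.
                byte[] utf8 = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%')
                       .append(UPPER_HEX[(b >> 4) & 0xF])
                       .append(UPPER_HEX[b & 0xF]);
                }
            }
        }
        return out.toString();
    }

    private static boolean isAsciiAlphanumeric(int cp) {
        return (cp >= 'a' && cp <= 'z')
            || (cp >= 'A' && cp <= 'Z')
            || (cp >= '0' && cp <= '9');
    }
}
```

With this sketch, escape("a b", "", true) yields "a+b", escape("a b", "", false) yields "a%20b", and a non-ASCII character such as "é" expands to one escape per UTF-8 byte, "%C3%A9".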
RFC 2396 specifies the set of unreserved characters as the alphanumerics plus
the marks "-", "_", ".", "!", "~", "*", "'", "(" and ")". It goes on to state:
Unreserved characters can be escaped without changing the semantics
of the URI, but this should not be done unless the URI is being used
in a context that does not allow the unescaped character to appear.
For performance reasons, UTF-8 is currently the only character encoding
supported by this class.
Note: This escaper produces uppercase hexadecimal sequences. From
RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits
for all percent-encodings."