Normalizes this IRI's components.
Because IRIs exist to identify resources, presumably they should be considered equivalent when they
identify the same resource. However, this definition of equivalence is not of much practical use, as
there is no way for an implementation to compare two resources unless it has full knowledge or control
of them. Therefore, IRI normalization is designed to minimize false negatives while strictly avoiding
false positives.
Case Normalization: the hexadecimal digits within a percent-encoding triplet (e.g., "%3a" versus
"%3A") are case-insensitive and are normalized to use uppercase letters for the digits A-F. The
scheme and host are case-insensitive and are normalized to lowercase.
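
The case rules above can be sketched in a few lines of Python. This is only an illustration, not this library's implementation; the function names `uppercase_triplets` and `normalize_case` are hypothetical, and the IRI is assumed to be already split into scheme, host, and path.

```python
import re
from typing import Tuple

# Matches one percent-encoding triplet such as "%3a".
_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def uppercase_triplets(component: str) -> str:
    """Uppercase the hex digits of every percent-encoding triplet."""
    return _TRIPLET.sub(lambda m: m.group(0).upper(), component)


def normalize_case(scheme: str, host: str, path: str) -> Tuple[str, str, str]:
    """Lowercase scheme and host, then uppercase triplet hex digits."""
    return (
        scheme.lower(),
        # Lowercase first so that triplets in the host end up uppercase.
        uppercase_triplets(host.lower()),
        uppercase_triplets(path),
    )
```

For example, `normalize_case("HTTP", "Example.COM", "/a%3a")` yields `("http", "example.com", "/a%3A")`.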
Character Normalization: the Unicode Standard defines various equivalences between sequences of
characters for various purposes. Unicode Standard Annex #15 defines Normalization Forms for these
equivalences, and this normalization is applied to the IRI components.
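
As a small illustration of character normalization, the sketch below uses Python's standard `unicodedata` module. The text above does not say which Normalization Form is applied; NFC is assumed here, and the helper name `normalize_characters` is hypothetical.

```python
import unicodedata


def normalize_characters(component: str) -> str:
    """Apply a Unicode Normalization Form to one IRI component.

    Assumption: NFC is the form in use; the source text does not name one.
    """
    return unicodedata.normalize("NFC", component)


# "e" followed by U+0301 (combining acute accent) composes to U+00E9.
decomposed = "caf" + "e\u0301"
composed = normalize_characters(decomposed)
```

After normalization the two spellings of the same word compare equal, which is the kind of false negative this step is meant to remove.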
Percent-Encoding Normalization: any percent-encoded octet sequence that corresponds to an
unreserved character is decoded, anywhere in the IRI.
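
A minimal sketch of this step follows. It is deliberately limited to the ASCII unreserved set of RFC 3986 (letters, digits, "-", ".", "_", "~"); the additional non-ASCII characters that IRIs treat as unreserved would need multi-octet UTF-8 decoding and are left out. The name `decode_unreserved` is hypothetical.

```python
import re
import string

# ASCII unreserved characters per RFC 3986: ALPHA / DIGIT / "-" / "." / "_" / "~".
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(component: str) -> str:
    """Decode triplets that encode an ASCII unreserved character."""

    def replace(match):
        char = chr(int(match.group(1), 16))
        # Reserved and non-ASCII octets stay percent-encoded.
        return char if char in UNRESERVED else match.group(0)

    return _TRIPLET.sub(replace, component)
```

For instance, `decode_unreserved("%7Euser/%41%2Fb")` gives `"~user/A%2Fb"`: the tilde and the letter are decoded, while the encoded slash is reserved and stays as it is.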
Path Segment Normalization: the process of removing unnecessary "." and ".." segments from the
path component of a hierarchical IRI. Each "." segment is simply removed. A ".." segment is removed
only if it is preceded by a non-".." segment or the start of the path.
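
The dot-segment rules can be sketched with a simple segment stack. This is an illustrative version, not necessarily how this library does it; it assumes that a removed ".." also removes the segment before it, that a ".." at the start of the path is simply dropped, and that a trailing "." or ".." leaves a trailing "/". The name `remove_dot_segments` is hypothetical.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments from a path."""
    absolute = path.startswith("/")
    segments = path.split("/")
    if absolute:
        segments = segments[1:]  # drop the empty string before the leading "/"

    output = []
    for index, segment in enumerate(segments):
        is_last = index == len(segments) - 1
        if segment == ".":
            if is_last:
                output.append("")  # keep the trailing "/"
        elif segment == "..":
            if output:
                output.pop()  # ".." cancels the preceding segment
            if is_last:
                output.append("")  # keep the trailing "/"
        else:
            output.append(segment)

    result = "/".join(output)
    return "/" + result if absolute else result
```

With this sketch, `"/a/b/c/./../../g"` becomes `"/a/g"`, `"/a/b/.."` becomes `"/a/"`, and `"/../a"` becomes `"/a"`.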
HTTP(S) Scheme Normalization: if the port is the scheme's default port number or is not given, it
is set to undefined. An empty path is replaced with "/".
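
The HTTP(S) rule can be sketched as follows, assuming the usual defaults of port 80 for "http" and 443 for "https" and an already lowercased scheme. Python's `None` stands in for "undefined", and the name `normalize_http` is hypothetical.

```python
from typing import Optional, Tuple

# Well-known default ports for the two schemes covered by this rule.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_http(scheme: str, port: Optional[int], path: str) -> Tuple[Optional[int], str]:
    """Drop a default port and replace an empty path with "/"."""
    if scheme not in DEFAULT_PORTS:
        return port, path  # other schemes are left untouched
    if port == DEFAULT_PORTS[scheme]:
        port = None  # None plays the role of "undefined"
    if path == "":
        path = "/"
    return port, path
```

So `normalize_http("http", 80, "")` returns `(None, "/")`, while a non-default port such as 8443 on "https" is preserved.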
File Scheme Normalization: if the host is "localhost" or empty, it is set to undefined.
Internationalized Domain Name Normalization: the host component is converted to its Unicode form.
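
To illustrate the conversion of a host to Unicode, the sketch below uses Python's built-in `idna` codec. That codec implements the older IDNA 2003 rules (RFC 3490), which may differ from whatever this library uses, so treat it as an approximation. The name `host_to_unicode` is hypothetical.

```python
def host_to_unicode(host: str) -> str:
    """Convert ASCII-compatible ("xn--") labels of a host to Unicode.

    Assumption: the host is pure ASCII on input; Python's "idna" codec
    (IDNA 2003) is used as a stand-in for the library's own conversion.
    """
    return host.encode("ascii").decode("idna")


unicode_host = host_to_unicode("xn--bcher-kva.example")
```

Here `unicode_host` is the host with its first label decoded to "bücher"; labels without the "xn--" prefix pass through unchanged.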