This behaviour is to be expected and should be natively supported by browsers as long as any `+` symbols follow the query delimiter, `?`.
Per RFC 3986 and its predecessors (RFC 2396, RFC 1738, and the many linked changes over the years), the space character (0x20) MUST be encoded in a URI.
For compatibility with all schemes, it MUST be percent-encoded as `%20`.
However, the query component of a URI with a scheme of `http:` or `https:` is conventionally treated as having a Content-Type of `application/x-www-form-urlencoded`, the format HTML forms produce. This content type specifies that spaces are to be encoded as `+`, that the reserved characters are to be escaped according to the URI spec, and that other non-alphanumeric characters are to be percent-encoded as `%HH`.
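To make the two encodings concrete, here is a minimal sketch using the standard `URLSearchParams` and `encodeURIComponent` globals available in browsers and Node.js. The variable names are only illustrative.

```javascript
// Form-style encoding (application/x-www-form-urlencoded): space becomes '+'.
const formEncoded = new URLSearchParams({ q: "RFC 3986" }).toString();

// Generic percent-encoding, valid in any URI component: space becomes '%20'.
const percentEncoded = "q=" + encodeURIComponent("RFC 3986");

// A literal '+' in the data has to be escaped, otherwise it would be
// read back as a space by a form-style decoder.
const literalPlus = new URLSearchParams({ q: "notepad++" }).toString();

console.log(formEncoded);    // q=RFC+3986
console.log(percentEncoded); // q=RFC%203986
console.log(literalPlus);    // q=notepad%2B%2B
```

Both `q=RFC+3986` and `q=RFC%203986` carry the same value once they sit in an `http`/`https` query string.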
So essentially, this means that `+` is converted back to a space after the `?` in a URI with a scheme of `http` or `https`.
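A quick way to observe this is with the standard `URL` API, sketched below. The file path used here is a shortened, made-up example; the point is only that `+` is treated as a space in the query but left alone in the path.

```javascript
// In the query component, '+' and '%20' both decode to a space.
const fromPlus = new URL("https://stackoverflow.com/search?q=RFC+3986")
  .searchParams.get("q");
const fromPercent = new URL("https://stackoverflow.com/search?q=RFC%203986")
  .searchParams.get("q");

// In the path component, '+' is just a '+'; nothing converts it to a space.
const filePath = new URL("file:///C:/Program+Files/notepad.exe").pathname;

console.log(fromPlus);    // RFC 3986
console.log(fromPercent); // RFC 3986
console.log(filePath);    // /C:/Program+Files/notepad.exe
```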
Examples:
// URI of a StackOverflow search for "RFC 3986"
"https://stackoverflow.com/search?q=RFC+3986" // ✔ (default used by StackOverflow)
"https://stackoverflow.com/search?q=RFC%203986" // ✔
"https://stackoverflow.com/search?q=RFC 3986" // ✖ (prohibited character)
// URI of "C:\Program Files (x86)\Notepad++\notepad++.exe"
"file:///C:/Program%20Files%20(x86)/Notepad++/notepad++.exe" // ✔
"file:///C:/Program%20Files%20(x86)/Notepad%2B%2B/notepad%2B%2B.exe" // ✔ (verbose, but valid)
"file:///C:/Program Files (x86)/Notepad++/notepad++.exe" // ✖ (spaces must be encoded)
"file:///C:/Program+Files+(x86)/Notepad++/notepad++.exe" // ✖ (not found, `+` not decoded to space)
From RFC3986 Section 2.2:
The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI. URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent. Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications. Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI.
Because of the statement above, it's important to note that many `decodeURI`/`decodeURIComponent` functions will not decode a `+` to a space. It is up to your code to add support for that, as the Google Closure Library does by swapping them out for spaces before decoding:
/**
 * URL-decodes the string. We need to specially handle '+'s because
 * the javascript library doesn't convert them to spaces.
 * @param {string} str The string to url decode.
 * @return {string} The decoded {@code str}.
 */
goog.string.urlDecode = function(str) {
  return decodeURIComponent(str.replace(/\+/g, ' '));
};
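If you are not using Closure, the same idea works as a standalone helper. The sketch below uses a hypothetical function name, `decodeQueryValue`, and is meant only for values taken from an `http`/`https` query string.

```javascript
// Hypothetical standalone equivalent of the Closure helper above.
// The '+' replacement must happen BEFORE percent-decoding, so that an
// escaped plus ('%2B') survives as a literal '+'.
function decodeQueryValue(str) {
  return decodeURIComponent(str.replace(/\+/g, " "));
}

const plain = decodeURIComponent("RFC+3986");  // '+' is left untouched
const spaced = decodeQueryValue("RFC+3986");   // '+' becomes a space
const mixed = decodeQueryValue("C%2B%2B+tips"); // '%2B' stays a plus

console.log(plain);  // RFC+3986
console.log(spaced); // RFC 3986
console.log(mixed);  // C++ tips
```

Doing the two steps in the opposite order would turn `%2B` into `+` and then into a space, corrupting values that legitimately contain a plus sign.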