6
votes

What's the default encoding of an HTTP POST request when the content-type is "application/json" with no explicit charset given?

It seems two specs are in conflict:

  • JSON spec says that "JSON text SHALL be encoded in Unicode. The default encoding is UTF-8."
  • HTTP spec says that "When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP."
3
You are not sending a request using a text/... media type, so that clause of the HTTP spec does not apply. - Remy Lebeau

3 Answers

4
votes

The application/json media type is formally defined in RFC 7158, The JavaScript Object Notation (JSON) Data Interchange Format (which obsoletes RFC 4627), and is registered with IANA as having NO required or optional parameters (thus, charset is not defined for application/json).

Section 8.1 Character Encoding says:

JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8, and JSON texts that are encoded in UTF-8 are interoperable in the sense that they will be read successfully by the maximum number of implementations; there are many implementations that cannot successfully read texts in other encodings (such as UTF-16 and UTF-32).

Implementations MUST NOT add a byte order mark to the beginning of a JSON text. In the interests of interoperability, implementations that parse JSON texts MAY ignore the presence of a byte order mark rather than treating it as an error.

application/... media types are typically defined as binary formats. It is very easy for a JSON parser to differentiate between UTF-8, UTF-16, and UTF-32 just by looking at the first few bytes, so there is no need for a BOM (which is not allowed, as noted above) or an explicit charset (which is not defined).
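
As a rough illustration of that byte-sniffing idea, here is a minimal Python sketch. It follows the NUL-byte pattern approach described in RFC 4627 section 3 and relies on the fact that a JSON text without a BOM always starts with an ASCII character. The function name `sniff_json_encoding` is made up for this example; it is not taken from any of the specs.

```python
import json

def sniff_json_encoding(data: bytes) -> str:
    """Guess the encoding of a BOM-less JSON text from where the NUL bytes fall.

    Hypothetical helper: assumes the first character of the text is ASCII,
    which holds for any valid JSON value.
    """
    if len(data) >= 4:
        b0, b1, b2, b3 = data[:4]
        if b0 == 0 and b1 == 0 and b2 == 0:
            return "utf-32-be"  # 00 00 00 xx
        if b1 == 0 and b2 == 0 and b3 == 0:
            return "utf-32-le"  # xx 00 00 00
    if len(data) >= 2:
        if data[0] == 0:
            return "utf-16-be"  # 00 xx
        if data[1] == 0:
            return "utf-16-le"  # xx 00
    return "utf-8"              # no NUL bytes near the start


def parse_json_bytes(data: bytes):
    # Decode with the sniffed encoding, then parse as usual.
    return json.loads(data.decode(sniff_json_encoding(data)))
```

For what it's worth, Python's own `json.loads` accepts `bytes` and performs a similar detection internally, so in practice the sketch above is only needed to see how the detection can work.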

1
votes

Here is the algorithm from the W3C XMLHttpRequest specification:

The JSON response entity body is either a JavaScript value representing the response entity body. If the JSON response entity body is null, set it to the return value of the following algorithm:

1. Let JSON text be the result of running utf-8 decode on byte stream response entity body.

2. Return the result of invoking the initial value of the parse property of the JSON object defined in JavaScript, with JSON text as its only argument, or null if that function throws an exception.

http://www.w3.org/TR/XMLHttpRequest/#json-response-entity-body

The server should set it as UTF-8 by default.
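
To make the two quoted steps concrete, here is a rough Python approximation. It is only a sketch: the function name `json_response_body` is hypothetical, the spec's "utf-8 decode" step is approximated with Python's `utf-8-sig` codec plus `errors="replace"` (an assumption on my part), and Python's `json.loads` stands in for JavaScript's `JSON.parse`.

```python
import json
from typing import Any, Optional

def _reject_constant(name: str) -> Any:
    # JSON.parse rejects NaN/Infinity, but Python's json accepts them by default.
    raise ValueError(f"invalid JSON constant: {name}")

def json_response_body(body: bytes) -> Optional[Any]:
    """Hypothetical stand-in for the quoted XMLHttpRequest algorithm."""
    # Step 1: always decode as UTF-8, regardless of any charset parameter.
    # Assumption: "utf-8-sig" drops a leading BOM and errors="replace"
    # turns malformed bytes into U+FFFD instead of raising.
    text = body.decode("utf-8-sig", errors="replace")
    # Step 2: parse, returning None (the analogue of null) on failure.
    try:
        return json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        return None
```

Note that a body containing the literal `null` also comes back as `None` here, so a caller cannot tell it apart from a parse failure. A UTF-16 body fails to parse and yields `None`, which is why sending UTF-8 is the safe choice.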

1
votes

You are looking at RFC 2616, which has been obsoleted by RFC 7231. The text from RFC 2616 is gone in RFC 7231. In any case, the clause only applies to the text/... media types, not application/... media types.