A remote service calls our Jetty server with a request encoded in ISO-8859-15. This request is mapped to a Spring controller. Jetty cannot decode the request correctly and throws the following exception:

exception=org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte F6 in state 3}
org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte F6 in state 3
    at org.eclipse.jetty.util.Utf8Appendable.appendByte(Utf8Appendable.java:168) ~[na:na]
    at org.eclipse.jetty.util.Utf8Appendable.append(Utf8Appendable.java:93) ~[na:na]
    at org.eclipse.jetty.util.UrlEncoded.decodeUtf8To(UrlEncoded.java:506) ~[na:na]
    at org.eclipse.jetty.util.UrlEncoded.decodeTo(UrlEncoded.java:554) ~[na:na]
    at org.eclipse.jetty.server.Request.extractParameters(Request.java:285) ~[na:na]
    at org.eclipse.jetty.server.Request.getParameter(Request.java:695) ~[na:na]
    ....

Solution

In Spring you can force the encoding of a request with a CharacterEncodingFilter, even if the rest of the application speaks UTF-8. With the filter below mapped to the affected URL, the exception should disappear.

<filter>
    <filter-name>encoding-filter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>ISO-8859-15</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>encoding-filter</filter-name>
    <url-pattern>/app/specialRequest.do</url-pattern>
</filter-mapping>
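
To illustrate why forcing the request encoding helps, here is a minimal sketch. It uses the JDK's own URLDecoder rather than Jetty's UrlEncoded class, and the parameter value "K%F6ln" is a made-up example, so treat it as an approximation of what happens inside the container and not as Jetty's actual code path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustration only: JDK URLDecoder, not Jetty's UrlEncoded.
final class FormDecodingSketch {

    // Decodes one percent-encoded form value using the given charset name.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String raw = "K%F6ln"; // hypothetical value sent by the remote service

        // Wrong charset: the JDK decoder substitutes U+FFFD for the bad byte,
        // whereas Jetty's stricter decoder throws NotUtf8Exception.
        System.out.println(decode(raw, "UTF-8"));

        // Charset the client actually used: the umlaut survives.
        System.out.println(decode(raw, "ISO-8859-15"));
    }
}
```

The point is only that the same bytes are valid in one charset and invalid in the other, so the server has to be told which one the client used before the parameters are parsed.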

If this does not work for you:

  • Find out which encoding the remote system uses.
  • Start Wireshark and analyze the incoming packets with the display filter ip.src == xxx.xxx.xxx.xxx.
  • Search the request body for special characters (convert the hex values to binary and try several frequently used encodings to find the one that matches the byte in the exception).
  • Set the encoding in Jetty's start.ini, e.g. with the following parameters:

    -Dorg.eclipse.jetty.util.URI.charset=ISO-8859-15

    -Dorg.eclipse.jetty.util.UrlEncoding.charset=ISO-8859-15
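
The "try several encodings" step can be sketched roughly like this. The captured bytes and the candidate list are assumptions for illustration; replace them with the bytes you actually see in Wireshark. The decoder is configured to report errors instead of silently replacing bad bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

final class CharsetProbe {

    // Strictly decodes the bytes; returns null if they are not valid in the charset.
    static String tryDecode(byte[] bytes, String charsetName) {
        try {
            return Charset.forName(charsetName)
                    .newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical bytes copied from a capture: 'K', 0xF6, 'l', 'n'.
        byte[] captured = {0x4B, (byte) 0xF6, 0x6C, 0x6E};
        String[] candidates = {"UTF-8", "ISO-8859-1", "ISO-8859-15", "windows-1252"};
        for (String name : candidates) {
            String text = tryDecode(captured, name);
            System.out.println(name + ": " + (text == null ? "invalid" : text));
        }
    }
}
```

Note that several single-byte charsets decode 0xF6 to the same character, so one byte rarely settles the question. Bytes that differ between the candidates help more: for example, 0xA4 is the currency sign in ISO-8859-1 but the euro sign in ISO-8859-15.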

Otherwise drop me a message if you have more questions.

What version were you running before? If it was an earlier version of 8.0, we can have a look at the changes in those versions and see if something pops up. - Tim
It was jetty-7.4.2.v20110526. But I am not 100% sure that this error comes from Jetty. I will test the same case with Jetty 7 again today. - Frank Szilinski
Hmm, with Jetty 7 I get an IllegalArgumentException: !utf8 at org.eclipse.jetty.util.Utf8Appendable.appendByte(Utf8Appendable.java:130). It looks like the same error in different words. So this seems to be a new issue that comes from "outside" and is independent of my migration. I hate it when you switch to something new and something else pops up at the same moment. - Frank Szilinski
It seems 0xF6 is the character "ö" in ISO-8859-1. But what does state 3 mean? - Frank Szilinski

1 Answer

It looks like the client is sending text that should be encoded as UTF8, but isn't encoding it.

In order to properly diagnose this issue you'll need to understand UTF8 (which you might do, I don't know)

In UTF8, any character with a code of 127 (0x7F) or less, i.e. one that uses only the lowest 7 bits, is included in the stream as is (no special encoding). Anything greater than 127 (i.e. with at least one bit above the 7th set) is encoded as a multi-byte sequence.

0xF6 is greater than 0x7F, so if a client wants to send that character, it should encode it.

0xF6 in binary is 11110110, which in UTF8 should be 11000011 10110110 (C3 B6)

So if the client wants to send the ISO-8859-1 character 0xF6, it should be sending the UTF8 byte sequence 0xC3 0xB6.
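
For anyone who wants to check the arithmetic, here is a small sketch of the two-byte UTF8 encoding applied to 0xF6. The class and method names are made up for this example. It packs the bits by hand and compares the result with what the JDK produces.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8TwoByteSketch {

    // Encodes a code point in the range U+0080..U+07FF as two UTF-8 bytes.
    static byte[] encodeTwoByte(int codePoint) {
        if (codePoint < 0x80 || codePoint > 0x7FF) {
            throw new IllegalArgumentException("not a two-byte code point");
        }
        byte lead = (byte) (0xC0 | (codePoint >> 6));           // 110xxxxx: upper bits
        byte continuation = (byte) (0x80 | (codePoint & 0x3F)); // 10xxxxxx: low 6 bits
        return new byte[] {lead, continuation};
    }

    public static void main(String[] args) {
        byte[] manual = encodeTwoByte(0xF6);
        byte[] library = "\u00F6".getBytes(StandardCharsets.UTF_8);

        System.out.printf("%02X %02X%n", manual[0] & 0xFF, manual[1] & 0xFF); // prints "C3 B6"
        System.out.println(Arrays.equals(manual, library));                   // prints "true"
    }
}
```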

You really need to work out what the client wants to be sending, what charset/encoding that data is in, and why it's not converting it to valid UTF8 before it sends it.

("state 3" has to do with Jetty's internal tables for UTF8 decoding; it's not really helpful for diagnosing this problem. It will only come in handy if you find the client, the client turns out to be doing the right thing, and you suspect that Jetty's UTF8 decoding is wrong.)