I must be missing something obvious about encoding, but given how many Stack Overflow questions cover it, I can't be the only one! I have a simple JSP page with a Struts action form containing a single text input whose default value is Pàmies Olivés.

<%@page contentType="text/html;charset=UTF-8"%>

<form method="get" action="login.jsp">

<tr><td>Full Name:</td><td><input type="text" name="fullName" value="Pàmies Olivés" size="35"></td></tr>


<tr><td colspan="2"><br><input type="submit" name="submit" value="submit"></td></tr>

When a user submits the form with that default value, a scriptlet prints the submitted value back to the page so I can inspect it. With both Tomcat and the page's charset set to UTF-8, I get the expected Pàmies Olivés. However, if I set Tomcat to ISO-8859-1, the output is Pàmies Olivés, and if I set both Tomcat and the page's charset to ISO-8859-1, the value becomes Pàmies Olivés. What causes this discrepancy with ISO-8859-1?

1 Answer

The encoding of POST form data is usually controlled by the accept-charset attribute on the form element.

A POST form definition should look like:

<form method="post" action="login.jsp" accept-charset="iso-8859-1">

For GET requests, HTML 4 recommends UTF-8 encoding for the query string, and that is what applies to your form because its method is "GET".
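To make the mismatch concrete, here is a minimal standalone sketch (not Tomcat's actual code path) of what happens when a query string is percent-encoded as UTF-8 but decoded as ISO-8859-1. The class name `QueryCharsetDemo` and its helper methods are made up for illustration, and the `Charset` overloads of `URLEncoder`/`URLDecoder` assume Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only mimics the encode/decode steps, not a real server.
public class QueryCharsetDemo {

    // Percent-encode a form value using the given charset (roughly what a browser does).
    public static String encode(String value, Charset charset) {
        return URLEncoder.encode(value, charset);
    }

    // Decode a percent-encoded value using the given charset (roughly what a server does).
    public static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of the file encoding.
        String name = "P\u00E0mies Oliv\u00E9s"; // Pàmies Olivés

        String sent = encode(name, StandardCharsets.UTF_8);
        System.out.println(sent); // P%C3%A0mies+Oliv%C3%A9s

        // Same charset on both sides: the value survives.
        String asUtf8 = decode(sent, StandardCharsets.UTF_8);
        System.out.println(asUtf8.equals(name)); // true

        // Mismatched charset: each UTF-8 byte becomes its own Latin-1 character.
        String asLatin1 = decode(sent, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.equals("P\u00C3\u00A0mies Oliv\u00C3\u00A9s")); // true
    }
}
```

The two-byte UTF-8 sequences for à and é are each read as two separate ISO-8859-1 characters, which is the usual source of garbled output when the sender and receiver disagree on the charset.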

Adding accept-charset="iso-8859-1" may force the browser to encode the query string as ISO-8859-1, but I don't think it will be very reliable.

Instead, try to use POST if possible.

I appreciate that your question may be academic, but I would recommend using UTF-8 for forms so that you're not limiting them to fewer than 256 unique characters.
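As a rough illustration of that limit, the sketch below checks whether a string can be represented in a given charset and shows what a lossy round trip does to it. `CharsetCoverageDemo` and its methods are hypothetical names for this example only; the calls used are from the standard `java.nio.charset` API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing what ISO-8859-1 and UTF-8 can represent.
public class CharsetCoverageDemo {

    // True if every character of the text can be encoded in the charset.
    public static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encode and decode with the same charset; unmappable characters are replaced.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String name = "P\u00E0mies Oliv\u00E9s"; // fits in ISO-8859-1
        String price = "\u20AC5";                 // euro sign, not in ISO-8859-1

        System.out.println(canEncode(name, StandardCharsets.ISO_8859_1));  // true
        System.out.println(canEncode(price, StandardCharsets.ISO_8859_1)); // false
        System.out.println(canEncode(price, StandardCharsets.UTF_8));      // true

        // The euro sign is silently replaced by '?' in ISO-8859-1.
        System.out.println(roundTrip(price, StandardCharsets.ISO_8859_1)); // ?5
    }
}
```

The name in the question happens to fit in ISO-8859-1, but any input outside that repertoire would be lost, which is the practical argument for UTF-8.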