I know this is a common issue, but I have not been able to find an existing post with an answer that solves this scenario.

I'm getting a character encoding issue with an application that's currently being migrated from an old rack server to an AWS EC2 instance. Nothing I've tried so far has resolved the problem.

I have a release build of my WAR file, say app-1.0.war, deployed to Tomcat 7.0.32 on JDK 1.7.0_07. (Yes, I know it's old; it's a legacy app, and for now I want the EC2 instance to be as close as possible to the existing server.) The same build is on both environments.

The old OS is SLES (SUSE Linux Enterprise Server); in AWS we are using Ubuntu 16.04.3 LTS.

On both environments the locale is set to en_GB.UTF-8.

Also, on both environments Tomcat's server.xml has URIEncoding="UTF-8" in the Connector.

On both systems, running file -i on the minified JavaScript file that has the problem detects it as UTF-8:

myjavascript.min.js: text/plain charset=utf-8

and when viewing the file content on the file system I can see that on both systems there is a string with a £ sign in it.

However, if I use curl to request that same file through Tomcat, the old host serves it correctly, while the new environment replaces the pound symbol with ��, even though both responses have the content type application/x-javascript;charset=UTF-8.
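To illustrate the symptom (a sketch only, not taken from either server): £ is two bytes in UTF-8, and a decoder that cannot interpret them, for example a 7-bit ASCII decoder running in replacement mode, emits one replacement character per byte. Python is used here purely for demonstration; where such a lossy decode would happen in the real stack is an assumption.

```python
# Illustration only: how a single "£" can turn into two replacement characters.
pound_utf8 = "£".encode("utf-8")  # the two bytes 0xC2 0xA3

# Decoding those UTF-8 bytes as ASCII with replacement yields one U+FFFD per byte.
mangled = pound_utf8.decode("ascii", errors="replace")

# Re-encoding the damaged text as UTF-8 gives the bytes a client such as curl would see.
served = mangled.encode("utf-8")

print(pound_utf8.hex())  # c2a3
print(len(mangled))      # 2
print(served.hex())      # efbfbdefbfbd
```

Seeing the byte sequence ef bf bd twice in the raw response would suggest the character was already lost before Tomcat wrote the response, rather than being mislabelled by the Content-Type header.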

I have tried adding -Dfile.encoding=UTF-8 to the JVM args, but that didn't help. It wasn't set in the old environment anyway.

1 Answer

This turned out not to be a file encoding problem at all, but a database one.

I echoed some special characters directly into the JavaScript file in the exploded directory of the webapp, then restarted Tomcat. When I retrieved the file with curl, the characters were not there. So I checked the file on the file system and, sure enough, they were there. I then ran a find to see whether another file of the same name existed elsewhere, perhaps in some kind of cache.
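The disk-versus-response check described above can be made byte-exact. The following is a minimal sketch, assuming the on-disk file and the curl output have both been read as raw bytes; the function names and the sample strings are hypothetical stand-ins, not part of the application.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest, for a quick same-or-different check."""
    return hashlib.sha256(data).hexdigest()


def first_difference(disk: bytes, served: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(disk, served)):
        if a != b:
            return offset
    if len(disk) != len(served):
        # One input is a strict prefix of the other.
        return min(len(disk), len(served))
    return None


# Hypothetical stand-ins for the file on disk and the HTTP response body.
disk_bytes = 'var price = "£5";'.encode("utf-8")
served_bytes = 'var price = "\ufffd\ufffd5";'.encode("utf-8")

print(digest(disk_bytes) == digest(served_bytes))  # False
print(first_difference(disk_bytes, served_bytes))  # 13
```

In practice disk_bytes would come from reading the file in binary mode and served_bytes from the saved curl output. If a fresh edit on disk never shows up in the response at all, the served content is coming from somewhere other than that file.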

By now it wasn't just me who was baffled, but a few colleagues as well. Then I realized that the JavaScript is part of a theme used by the CMS that is built into the webapp. The CMS content is all loaded into a database when the application is first deployed, so the copy being served was coming from the database rather than from the file on disk.

So after spending hours on a wild goose chase, it turned out to be something completely different from what I first expected.