82
votes

I learnt from Google that Internationalization is the process by which I can make my web application to use all languages. I want to understand Unicode for the process of internationalization, so I learnt about Unicode from here and there.

I am able to understand about Unicode that how a charset set in encoded to bytes and again bytes decoded to charset. But I don't know how to move forward further. I want to learn how to compare strings and I need to know how to implement internationalization in my web application. Any Suggestions Please? Please guide me.

My Objective:

My main objective is to develop a Web Application for Translation (English to Arabic & vice versa). I want to follow Internationalization. I wish to run my web Application for translation in all the three browsers namely FF, Chrome, IE. How do I achieve this?

3

3 Answers

223
votes

In case of a basic JSP/Servlet webapplication, the basic approach would be using JSTL fmt taglib in combination with resource bundles. Resource bundles contain key-value pairs where the key is a constant which is the same for all languages and the value differs per language. Resource bundles are usually properties files which are loaded by ResourceBundle API. This can however be customized so that you can load the key-value pairs from for example a database.

Here's an example how to internationalize the login form of your webapplication with properties file based resource bundles.


  1. Create the following files and put them in some package, e.g. com.example.i18n (in case of Maven, put them in the package structure inside src/main/resources).

    text.properties (contains key-value pairs in the default language, usually English)

     login.label.username = Username
     login.label.password = Password
     login.button.submit = Sign in
     

    text_nl.properties (contains Dutch (nl) key-value pairs)

     login.label.username = Gebruikersnaam
     login.label.password = Wachtwoord
     login.button.submit = Inloggen
     

    text_es.properties (contains Spanish (es) key-value pairs)

     login.label.username = Nombre de usuario
     login.label.password = Contraseña
     login.button.submit = Acceder
     

    The resource bundle filename should adhere the following pattern name_ll_CC.properties. The _ll part should be the lowercase ISO 693-1 language code. It is optional and only required whenever the _CC part is present. The _CC part should be the uppercase ISO 3166-1 Alpha-2 country code. It is optional and often only used to distinguish between country-specific language dialects, like American English (_en_US) and British English (_en_GB).


  2. If not done yet, install JSTL as per instructions in this answer: How to install JSTL? The absolute uri: http://java.sun.com/jstl/core cannot be resolved.


  3. Create the following example JSP file and put it in web content folder.

    login.jsp

     <%@ page pageEncoding="UTF-8" %>
     <%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
     <%@ taglib prefix="fmt" uri="http://java.sun.com/jsp/jstl/fmt" %>
     <c:set var="language" value="${not empty param.language ? param.language : not empty language ? language : pageContext.request.locale}" scope="session" />
     <fmt:setLocale value="${language}" />
     <fmt:setBundle basename="com.example.i18n.text" />
     <!DOCTYPE html>
     <html lang="${language}">
         <head>
             <title>JSP/JSTL i18n demo</title>
         </head>
         <body>
             <form>
                 <select id="language" name="language" onchange="submit()">
                     <option value="en" ${language == 'en' ? 'selected' : ''}>English</option>
                     <option value="nl" ${language == 'nl' ? 'selected' : ''}>Nederlands</option>
                     <option value="es" ${language == 'es' ? 'selected' : ''}>Español</option>
                 </select>
             </form>
             <form method="post">
                 <label for="username"><fmt:message key="login.label.username" />:</label>
                 <input type="text" id="username" name="username">
                 <br>
                 <label for="password"><fmt:message key="login.label.password" />:</label>
                 <input type="password" id="password" name="password">
                 <br>
                 <fmt:message key="login.button.submit" var="buttonValue" />
                 <input type="submit" name="submit" value="${buttonValue}">
             </form>
         </body>
     </html>
    

    The <c:set var="language"> manages the current language. If the language was supplied as request parameter (by language dropdown), then it will be set. Else if the language was already previously set in the session, then stick to it instead. Else use the user supplied locale in the request header.

    The <fmt:setLocale> sets the locale for resource bundle. It's important that this line is before the <fmt:setBundle>.

    The <fmt:setBundle> initializes the resource bundle by its base name (that is, the full qualified package name until with the sole name without the _ll_CC specifier).

    The <fmt:message> retrieves the message value by the specified bundle key.

    The <html lang="${language}"> informs the searchbots what language the page is in so that it won't be marked as duplicate content (thus, good for SEO).

    The language dropdown will immediately submit by JavaScript when another language is chosen and the page will be refreshed with the newly chosen language.


You however need to keep in mind that properties files are by default read using ISO-8859-1 character encoding. You would need to escape them by unicode escapes. This can be done using the JDK-supplied native2ascii.exe tool. See also this article section for more detail.

A theoretical alternative would be to supply a bundle with a custom Control to load those files as UTF-8, but that's unfortunately not supported by the basic JSTL fmt taglib. You would need to manage it all yourself with help of a Filter. There are (MVC) frameworks which can handle this in a more transparent manner, like JSF, see also this article.

26
votes

In addition to what BalusC said, you have to take care about directionality (since English is written Left-To-Right and Arabic the other way round). The easiest way would be to add dir attribute to html element of your JSP web page and externalize it, so the value comes from properties file (just like with other elements or attributes):

<html dir="${direction}">
...
</html>

Also, there are few issues with styling such application - you should to say the least avoid absolute positioning. If you cannot avoid that for some reason, you could either use different stylesheets per (each?) language or do something that is verboten, that is use tables for managing layout. If you want to use div elements, I'd suggest to use relative positioning with "symmetric" left and right style attributes (both having the same value), since this is what makes switching directionality work.

You could find more about Bi-Directional websites here.

2
votes

based on this tutorial, I am using the following on GAE - Google's App Engine:

A jsp file as follows:

<%@ page import="java.io.* %>
<% 
  String lang = "fr"; //Assign the correct language either by page or user-selected or browser language etc.
  ResourceBundle RB = ResourceBundle.getBundle("app", new Locale(lang));
%>                 

<!DOCTYPE html>
<%@ page contentType="text/html;charset=UTF-8" language="java"%>
<head>
</head>
<body>
  <p>      
    <%= RB.getString("greeting") %>
  </p>
</body>

And adding the files named: app.properties (default) and app_fr.properties (and so on for every language). Each of these files should contain the strings you need as follows: key:value_in_language, e.g. app_fr.properties contains:

greeting=Bonjour!

app.properties contains:

greeting=Hello!

That's all