I'm using PHP gettext with .mo files (generated with Poedit) to translate from French (fr_FR.UTF-8) to English (en_US.UTF-8). It mostly works, but not all strings are translated. I'm not talking about accented characters being rendered incorrectly under UTF-8: some entire strings are simply left untranslated.
I extracted the French strings to translate from all of my PHP files, where they are wrapped as _("stringtotranslate"), with this shell command:
find . -iname "*.php" | xargs xgettext -j --from-code=UTF-8 -o locale/default.pot
A default.pot file was generated containing all the strings; none is missing. From that .pot file I created a default.po file for the English version. The .po file is in UTF-8, and the PHP files that use the strings are in UTF-8 too (no BOM).
Caching is not the problem: I give the .mo file a different name after each modification ('default_' + filemtime).
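A minimal sketch of this kind of cache-busting, for readers unfamiliar with the trick. The helper name versioned_domain and the locale/en_US/LC_MESSAGES layout are assumptions made for illustration, not the exact code used on the site:

```php
<?php
// Hypothetical helper: returns a domain name that changes whenever the
// compiled catalogue changes, so a long-running server process cannot keep
// serving a stale copy of default.mo.
// Assumed layout: <dir>/<locale>/LC_MESSAGES/<domain>.mo
function versioned_domain($dir, $locale, $domain)
{
    $base   = $dir . '/' . $locale . '/LC_MESSAGES/';
    $source = $base . $domain . '.mo';

    if (!file_exists($source)) {
        // No catalogue for this locale: fall back to the plain domain name.
        return $domain;
    }

    // Build the 'default_' + filemtime style name.
    $versioned = $domain . '_' . filemtime($source);
    $target    = $base . $versioned . '.mo';

    // Copy the catalogue once per modification time.
    if (!file_exists($target)) {
        copy($source, $target);
    }

    return $versioned;
}
```

The returned name would then be passed to bindtextdomain(), bind_textdomain_codeset() and textdomain() in place of the fixed 'default'.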
What I have noticed:
- Short strings without accented (non-ASCII) characters are translated.
- Long strings, or strings containing accented characters (French accents), are not translated. However, I also found one short string with no accents that is not translated either (the only such example so far).
I set the locale according to what my server supports (Apache 1.3.37, PHP 4.4.4, built --with-gettext). The locale is chosen dynamically depending on the flag clicked on the website; here is the English version:
putenv('LANGUAGE=eng');
setlocale(LC_ALL, 'en_US.UTF-8');
bindtextdomain('default', './locale');
bind_textdomain_codeset('default', 'UTF-8');
textdomain('default');
As I said, some of the strings are correctly translated into English, but not all of them.
Here is the English version of the website: http://librairiedescarres.com/2015/en/
Here is part of the .po file:
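To narrow down where a lookup fails, the setup can be made to report each step instead of failing silently. This is only a diagnostic sketch: translation_status is a hypothetical helper, and the entry in $samples is a placeholder to be replaced with msgids copied verbatim from default.pot.

```php
<?php
// Hypothetical helper: gettext() returns the msgid unchanged when it finds
// no translation, so equality means "not translated" (or a translation that
// is identical to the source string).
function translation_status($msgid, $result)
{
    return ($result === $msgid) ? 'NOT translated' : 'translated';
}

if (function_exists('gettext')) {
    // setlocale() returns false when the requested locale is not installed.
    $applied = setlocale(LC_ALL, 'en_US.UTF-8');
    echo 'setlocale: ' . ($applied === false ? 'FAILED' : $applied) . "\n";

    // bindtextdomain() returns the directory actually bound to the domain.
    echo 'bindtextdomain: ' . bindtextdomain('default', './locale') . "\n";
    bind_textdomain_codeset('default', 'UTF-8');
    textdomain('default');

    // Placeholder msgids: replace with strings copied from default.pot.
    $samples = array('stringtotranslate');
    foreach ($samples as $msgid) {
        echo translation_status($msgid, gettext($msgid)) . ': ' . $msgid . "\n";
    }
}
```

If setlocale reports FAILED, the locale name is not available on the server. If only some msgids come back untranslated, the problem is more likely in the catalogue itself or in how the msgid is written in the PHP source.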
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: Librairie des Carrés\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2015-08-03 14:57+0200\n"
"PO-Revision-Date: 2015-08-04 10:08+0200\n"
"Last-Translator: \n"
"Language-Team: \n"
"Language: en\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Poedit 1.8.2\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
"X-Poedit-SourceCharset: UTF-8\n"
#: about-us.php:10 about-us.php:31
#, php-format
msgid ""
"Initialement spécialisé dans l'histoire des sciences et des voyages, notre stock est aujourd'hui largement diversifié : "
"%slivres illustrés%s, %shistoire%s, %sagriculture%s, %sjardinage et architecture%s, %smédecine%s, %sbibliophilie%s, "
"%slivres rares et curieux du XVème au XXème siècle%s."
msgstr ""
"Initially specialised in Travel and Science, our stock is now largely diversified and includes literature and "
"%sillustrated books%s, %shistory%s, %sagriculture%s, %sgardens and architecture%s, %smedicine%s, %sbibliophile%s and "
"%srare books from the XVth to the XXth century%s."
(...)
I have tried several fixes, but none of them works:
- Trying other locale names such as "en_US", "en_US.utf8", "en_US.utf-8"...
- putenv('LANG=fra'), putenv('LC_ALL=fra')...
- Forcing UTF-8 encoding on the generated .mo file, using Notepad++.
Is there a solution? Could this be an issue with Apache, PHP, or gettext itself?