I'm using PHP Gettext with .mo files (generated with PoEdit) to translate from French (fr_FR.UTF-8) to English (en_US.UTF-8). It mostly works, but not all strings are translated! I'm not talking about accented characters being mangled by UTF-8: some entire strings are simply left untranslated.

I extracted the French strings to translate from all of my PHP files, each wrapped as _("stringtotranslate"), with this shell command:

find . -iname "*.php" | xargs xgettext -j --from-code=UTF-8 -o locale/default.pot

A default.pot was generated containing every string; none is missing. From that .pot file I created a default.po file for the English version. That .po is in UTF-8, and the PHP files that use the translations are in UTF-8 too (no BOM).
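
One way to double-check the "no BOM" claim is a small shell sketch like the one below. The function names has_bom and scan_for_bom are made up for illustration, and the sketch assumes head, od, and tr are available on the machine.

```shell
#!/bin/sh
# has_bom FILE: succeed (exit 0) if FILE starts with the UTF-8 BOM bytes EF BB BF.
has_bom() {
    # Read the first three bytes and print them as hex with no separators.
    first=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first" = "efbbbf" ]
}

# scan_for_bom DIR: print every .php or .po file under DIR that carries a BOM.
# (Hypothetical helper; the file extensions are taken from the question.)
scan_for_bom() {
    find "$1" -type f \( -iname '*.php' -o -iname '*.po' \) | while IFS= read -r f; do
        if has_bom "$f"; then
            echo "BOM found: $f"
        fi
    done
}
```

Running scan_for_bom . from the project root should print nothing if all files are BOM-free.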

Caching is not the problem: I give the .mo file a different name after each modification ('default_' + filemtime).

What I've noticed:

  • Short strings with no UTF-8 characters are translated.
  • Long strings, or strings containing UTF-8 characters (French accents), are not translated. However, I also found one short string with no accents that is not translated (the only such example I found).

I set the locale according to my server (Apache 1.3.37, PHP 4.4.4, --with-gettext). The locale is chosen dynamically depending on the flag clicked on the website; here is the English version:

putenv('LANGUAGE=eng');
setlocale(LC_ALL, 'en_US.UTF-8');
bindtextdomain('default', './locale');
bind_textdomain_codeset('default', 'UTF-8');
textdomain('default');

As I said, some of the strings are correctly translated into English, but not all of them.

Here is the English version of the website: http://librairiedescarres.com/2015/en/

Here is part of the .po file:

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: Librairie des Carrés\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2015-08-03 14:57+0200\n"
"PO-Revision-Date: 2015-08-04 10:08+0200\n"
"Last-Translator: \n"
"Language-Team: \n"
"Language: en\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Poedit 1.8.2\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
"X-Poedit-SourceCharset: UTF-8\n"

#: about-us.php:10 about-us.php:31
#, php-format
msgid ""
"Initialement spécialisé dans l'histoire des sciences et des voyages, notre stock est aujourd'hui largement diversifié : "
"%slivres illustrés%s, %shistoire%s, %sagriculture%s, %sjardinage et architecture%s, %smédecine%s, %sbibliophilie%s, "
"%slivres rares et curieux du XVème au XXème siècle%s."
msgstr ""
"Initially specialised in Travel and Science, our stock is now largely diversified and includes literature and "
"%sillustrated books%s, %shistory%s, %sagriculture%s, %sgardens and architecture%s, %smedicine%s, %sbibliophile%s and "
"%srare books from the XVth to the XXth century%s."
(...)

I've tried several solutions, but none of them works:

  • Trying other locales such as "en_US", "en_US.utf8", "en_US.utf-8"...
  • putenv('LANG=fra'), putenv('LC_ALL=fra')...
  • Forcing UTF-8 encoding on the generated .mo file, using Notepad++.
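
When trying locale names like these, it can help to first check which ones the server actually has installed, since PHP's setlocale() returns false for a locale the system does not know. A minimal sketch, assuming shell access to the server and a locale command that supports -a; the helper names normalize_locale and locale_available are hypothetical:

```shell
#!/bin/sh
# normalize_locale NAME: lowercase NAME and drop hyphens, so that
# "en_US.UTF-8" and "en_US.utf8" compare equal.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-'
}

# locale_available NAME: succeed if `locale -a` lists NAME (after normalization).
locale_available() {
    wanted=$(normalize_locale "$1")
    # -x matches whole lines, -F treats the name as a fixed string (it contains a dot).
    locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | tr -d '-' | grep -qxF -- "$wanted"
}
```

For example, locale_available en_US.UTF-8 && echo installed reports whether the English locale from the question exists on the machine.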

Any solution? Is there an issue with Apache, PHP, or Gettext?


1 Answer


I finally found the answer, thanks to this post (in French): solution

Gettext doesn't seem to correctly handle accents or other UTF-8 characters in msgid (or something along those lines).

The solution is to generate the .mo file with the "--no-hash" option:

msgfmt --no-hash default.po -o default.mo
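
To confirm that an accented msgid really ends up in the compiled catalog, the command above can be wrapped in a small sketch like this one. It assumes the GNU gettext tools msgfmt and msgunfmt are installed; the function names compile_mo and mo_has_msgid are invented for illustration.

```shell
#!/bin/sh
# compile_mo PO MO: compile PO into MO without the hash table (same flags as above).
compile_mo() {
    msgfmt --no-hash -o "$2" "$1"
}

# mo_has_msgid MO MSGID: succeed if the compiled catalog MO contains MSGID.
# msgunfmt converts the binary .mo back to .po text; --no-wrap keeps long
# messages on one line so grep can match them.
# Limitation: MSGID must not contain double quotes or newlines, because those
# are escaped or split in the .po text.
mo_has_msgid() {
    msgunfmt --no-wrap "$1" | grep -qF -- "msgid \"$2\""
}
```

For example, compile_mo default.po default.mo followed by mo_has_msgid default.mo "some French string" checks that the entry is present in the file; it does not test the PHP lookup itself.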

I'd like PoEdit to do this by itself, but I couldn't find any setting for it...