18
votes

I remember running some tests with gettext a few months ago, and the following code worked perfectly:

putenv('LANG=l33t');
putenv('LANGUAGE=l33t');
putenv('LC_MESSAGES=l33t');

if (defined('LC_MESSAGES')) // available if PHP was compiled with libintl
{
    setlocale(LC_MESSAGES, 'l33t');
}

else
{
    setlocale(LC_ALL, 'l33t');
}

bindtextdomain('default', './locale'); // ./locale/l33t/LC_MESSAGES/default.mo
bind_textdomain_codeset('default', 'UTF-8');
textdomain('default');

echo _('Hello World!'); // h3110 w0r1d!

This worked perfectly (under Windows XP and CentOS, if I remember correctly), which was good because I could use arbitrary "locales" without caring whether they were installed on the system. However, it doesn't seem to work anymore, and I wonder why.


Red Hat + PHP 5.2.11:

I'm able to switch back and forth between various locales and the translations show up correctly as long as the setlocale() call doesn't return false (that is, as long as the locale is available/installed on the system).

This is not perfect (would be great if I could just point gettext to any arbitrary translation directory without having to test for the existence of the locale), but it's acceptable. I'll run some more tests later on.

Windows 7 + PHP 5.3.1 (XAMPP):

setlocale() always returns false (even when using LC_ALL instead of LC_MESSAGES) unless I use a valid Windows locale such as eng, deu or ptg. In that case the locale seems to be set correctly, but the translations still don't show up. I can't test right now because I have hundreds of tabs open, but I think the very first call to that script yields the correct translation (restarting Apache won't do the trick).

I'm not sure if this is related to PHP Bug #49349. I'll test this in a couple of hours.


Is there any way to use the gettext extension (not pure PHP implementations like php-gettext or the Zend Translate Adapter) reliably across different operating systems (possibly with custom locales like l33t)?

Also, is it absolutely necessary to use setlocale(LC_ALL, ...)? I would prefer leaving the TIME, NUMERIC and (especially) MONETARY locale settings untouched (defaulting to the POSIX locale).


I had an idea... Would it be possible to call setlocale() with a very common locale (like C, POSIX or en_US) and specify the language via the domain? Something like this:

/lang/C/LC_MESSAGES/domain.pt.mo
/lang/C/LC_MESSAGES/domain.de.mo
/lang/C/LC_MESSAGES/domain.en.mo
/lang/C/LC_MESSAGES/domain2.pt.mo
/lang/C/LC_MESSAGES/domain2.de.mo
/lang/C/LC_MESSAGES/domain2.en.mo

Would this work on *nix and Windows platforms without problems?

2
This is why I frigging hate gettext. It would be so easy if not for this horrible, horrible, unnecessary chaos. Interested to see whether anything comes up. – Pekka
For what it's worth, I'm using Zend_Locale - it can deal with gettext files, too. – Pekka
@Pekka: Yeah, I wonder what reliable solutions were available before ZF. Do you use Zend_Locale alone or in conjunction with Zend_Translate? – Alix Axel
@Alix sorry, I meant Zend_Translate here. Yup, that's what I use. – Pekka

2 Answers

19
votes

Gettext isn't overly practical for webapps.

  • For example, it doesn't honor Accept-Language style preferences by itself.
  • It typically incurs some caching issues on shared webhosts (mod_php SAPI).

So I sometimes wish that PHP module didn't exist, and that the convenient _() function name shortcut were available to userland implementations.
(I had my own gettext.php, which worked more reliably.)
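To illustrate the Accept-Language point above: since gettext does not look at the header, the application has to pick a language itself. Below is a minimal sketch of that step, assuming PHP 7+. The function name pick_language() and the list of supported languages are hypothetical, and the parsing is simplified (no wildcard handling beyond skipping "*").

```php
<?php
// Hypothetical helper: pick the best supported language from an
// Accept-Language header value such as "pt-PT,pt;q=0.8,en;q=0.5".
function pick_language(string $header, array $supported, string $default): string
{
    $candidates = [];
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue; // nothing usable in this entry
        }
        $q = 1.0; // default weight when no q parameter is given
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $candidates[] = [$q, $position, $tag];
    }

    // Highest q first; entries with equal q keep their header order.
    usort($candidates, function (array $a, array $b): int {
        return ($b[0] <=> $a[0]) ?: ($a[1] <=> $b[1]);
    });

    foreach ($candidates as $candidate) {
        if ($candidate[0] <= 0) {
            continue; // q=0 means "not acceptable"
        }
        $tag = $candidate[2];
        // Try the full tag ("pt-pt") first, then the primary subtag ("pt").
        foreach ([$tag, explode('-', $tag)[0]] as $try) {
            if (in_array($try, $supported, true)) {
                return $try;
            }
        }
    }
    return $default;
}

echo pick_language('pt-PT,pt;q=0.8,en;q=0.5', ['en', 'pt'], 'en'), "\n";
```

In a web request the header would typically come from $_SERVER['HTTP_ACCEPT_LANGUAGE'], and the result would then be fed into whatever locale or directory selection the application uses.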

Your options:

  1. Anyway, according to a few bug reports, the Windows port of gettext had some flaws with UTF-8. Maybe your version is affected again, so try bind_textdomain_codeset('default', 'ISO-8859-1'); for starters. Also, it seems to prefer the environment variables on Windows IIRC, so putenv("LC_ALL=fr_FR"); might work better than setlocale(). Especially workable if you dl(gettext.dll) later on.

    Also try including a charset right there: LANG=en_GB.ISO-8859-1. (Since your source text is English anyway, the charset isn't very relevant here, but it's probably a common case where gettext trips over itself.) Oh, and on occasion it's UTF8, not UTF-8; also try ASCII.

  2. Alternatively circumvent gettext. Your domain idea is close, but I'd simply use a pre-defined ./locale/ subdir for languages:

    ./lang/en/locale/C/LC_MESSAGES/domain.mo
    

    Then just invoke bindtextdomain("default", "./lang/{$APP_LANG}/locale") without giving gettext room to interpret much. It will always look up /C/, but the correct locale directory has been injected already. But try to have a symlink from the $LANG to /C/ in there anyway.

  3. Bite in the gnu. Give up on gettext. "PhpWiki" had a custom awk conversion script. It transforms .po files into .php array scripts (yeah very oldschool), and just utilizes a __() function instead. Close. And more reliable.
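As a rough illustration of option 3, here is a minimal sketch in PHP 7+ of the "array plus __()" approach. It is not the PhpWiki awk script: po_to_array() and the APP_MESSAGES global are hypothetical names, and the parser only understands single-line msgid/msgstr pairs (no plurals, contexts, or multi-line strings).

```php
<?php
// Hypothetical helper: parse single-line msgid/msgstr pairs from .po text.
// Multi-line strings, plural forms, and msgctxt are deliberately ignored.
function po_to_array(string $po): array
{
    $messages = [];
    $msgid = null;
    foreach (preg_split('/\R/', $po) as $line) {
        if (preg_match('/^msgid\s+"(.*)"\s*$/', $line, $m)) {
            $msgid = stripcslashes($m[1]);
        } elseif ($msgid !== null && preg_match('/^msgstr\s+"(.*)"\s*$/', $line, $m)) {
            $msgstr = stripcslashes($m[1]);
            // Skip the header entry (empty msgid) and untranslated strings.
            if ($msgid !== '' && $msgstr !== '') {
                $messages[$msgid] = $msgstr;
            }
            $msgid = null;
        }
    }
    return $messages;
}

// Sample catalogue; in practice this would be read from a .po file
// or from a pre-generated PHP array script.
$po = <<<'PO'
msgid ""
msgstr ""

msgid "Hello World!"
msgstr "h3110 w0r1d!"
PO;

// Hypothetical global catalogue, filled once per request.
$GLOBALS['APP_MESSAGES'] = po_to_array($po);

// Userland stand-in for _(): look the string up, fall back to the source text.
function __(string $text): string
{
    return $GLOBALS['APP_MESSAGES'][$text] ?? $text;
}

echo __('Hello World!'), "\n";     // translated via the array
echo __('Not in catalogue'), "\n"; // falls back to the source string
```

Because nothing here depends on setlocale() or on locales installed on the system, the same lookup should behave identically on *nix and Windows, at the cost of losing gettext features such as plural handling.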

6
votes

This code won't run perfectly on every system, because every system's locale repository and PHP version are different, among other things.

If you want consistency, you need to use something like Zend_Translate. If you install the same version of Zend on each system, they will all be consistent with one another because they use the same localization data, locale names and codebase.

There are numerous bugs with setlocale; it's just not reliable. See the comments at http://php.net/manual/en/function.setlocale.php
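If you stay with the gettext extension anyway, one defensive step is to check what setlocale() actually returns instead of assuming the locale was set. The sketch below assumes PHP 7+; set_message_locale() is a hypothetical helper, and the candidate names are only examples of spellings that differ between systems (deu is the Windows-style name mentioned in the question).

```php
<?php
// Hypothetical helper: try several spellings of a locale for message
// translation only, and report which one (if any) the system accepted.
function set_message_locale(array $candidates)
{
    // LC_MESSAGES is only defined when PHP was built with libintl;
    // otherwise fall back to LC_ALL, as in the question's code.
    $category = defined('LC_MESSAGES') ? LC_MESSAGES : LC_ALL;

    // setlocale() accepts an array of names and returns the first one
    // that could be set, or false if none of them is available.
    return setlocale($category, $candidates);
}

$result = set_message_locale(['de_DE.UTF-8', 'de_DE', 'deu']);
if ($result === false) {
    // None of the spellings is installed: keep the source strings
    // or switch to a userland translation layer here.
    echo "No matching locale available\n";
} else {
    echo "Locale set to {$result}\n";
}
```

This does not work around the bugs themselves, but it at least makes a failed locale switch visible so the application can fall back deliberately.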