8
votes

I'm developing a website in PHP and I'd like to let the user switch between German and English easily.

So I need to decide on a translation strategy:

Should I store the data and its translations in a database table, e.g. (1, "Hello", "Hallo"), (2, "Good morning", "Guten Tag"), etc.?

Or should I use ".mo" files to store them?
Which way is best?
What are the pros and cons?

7 Answers

5
votes

There are some factors you should consider.

Will the website be updated frequently? If so, by whom: you or the owner? How much data are you dealing with? And are you doing this frequently (for many clients)?

I doubt that using a relational database will cause any serious speed impact unless you have VERY high traffic (several hundred thousand pageviews per day).

If you are doing this frequently (for lots of clients), think no further: build a CMS (or use an existing one). If you really need to consider the speed impact, you can customize it so that, once the website is finished, you can export static HTML pages where possible.

If you are updating frequently, the same applies. If the client has to update the site (and not you), again, you need a CMS. If you are dealing with lots of information (many large articles), you need a CMS.

All in all, a CMS will help you build up your website structure fast, add content fast and not worry that much about code since it will be reusable.

Now, if you just need to create a small website fast, you can easily do this with hardcoded arrays and datafiles.
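As a rough illustration of the hardcoded-array approach for a small site, here is a minimal sketch. The array layout, the `lang` request parameter, and the helper names `pick_language()` and `t()` are assumptions made for this example, not something the answer prescribes.

```php
<?php
// Hypothetical in-code translation table for a small site.
$translations = [
    'en' => ['hello' => 'Hello', 'good_morning' => 'Good morning'],
    'de' => ['hello' => 'Hallo', 'good_morning' => 'Guten Tag'],
];

// Pick the language from a request value, falling back to a default
// when the value is missing, not a string, or not a supported language.
function pick_language($requested, array $translations, string $default = 'en'): string
{
    return (is_string($requested) && isset($translations[$requested])) ? $requested : $default;
}

// Look up a key; fall back to the key itself so missing entries stay visible.
function t(string $key, string $lang, array $translations): string
{
    return $translations[$lang][$key] ?? $key;
}

// Assumed convention: the language is chosen via ?lang=de in the URL.
$lang = pick_language($_GET['lang'] ?? null, $translations);
echo t('good_morning', $lang, $translations), PHP_EOL;
```

Validating the requested language against the known keys keeps arbitrary input from being used as an array index or, later, as part of a file name.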

12
votes

After having just tackled this myself recently (12 languages and counting) on a production system, and having run into some major performance issues along the way, I would suggest a hybrid system.

1) Store the language strings and translations in a database--this will make it easy to interact with/update/remove items plus will be part of your normal backup routines.

2) Cache the languages into flat files on the server and draw those out as necessary to display on the page.

The benefits here are many--mostly it is fast! I am not dealing with connection overhead for MySQL or any traffic slowdowns during the transfer (especially important if your DB server is not localhost).

This will also make it very easy to use. Store the data from your database in the file as a php serialized array and GZIP the contents of the file to shrink storage overhead (this also makes it faster in my benchmarking).

Example:

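$lang = array(
    'hello' => 'Hallo',
    'good_morning' => 'Guten Tag',
    'logout_message' => 'We are sorry to see you go, come again!'
);

// Serialize the array and compress it to shrink the cache file.
$storage_lang = gzcompress( serialize( $lang ) );

// Write this into a file such as 'my_page.de'.
file_put_contents( 'my_page.de', $storage_lang );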

When a user loads your system for the first time do a file_exists('/files/languages/my_page.de'). If the file exists then load the content, un-gzip, and un-serialize and it is ready to go.

Example

// Read the cached file, then decompress and unserialize it.
$file_contents = file_get_contents( 'my_page.de' );
$lang = unserialize( gzuncompress( $file_contents ) );

As you can see you can make the caching specific to each page in the system keeping the overhead even smaller and use the file extension to denote language... (my_page.en, my_page.de, my_page.fr)

If the file DOESN'T exist then query the DB, build your array, serialize it, gzip it and write the missing file--at the same time you have just constructed the array that the page needed so continue on to display the page and everyone is happy.
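The hit-or-rebuild flow described above might look roughly like the sketch below. The function name `load_language()` and the `$fetchFromDb` callback are hypothetical; the callback stands in for whatever database query builds the array, and the zlib extension is assumed to be available.

```php
<?php
// Load a page's language array from a gzip-compressed, serialized cache file.
// On a cache miss, call $fetchFromDb (a stand-in for the real DB query),
// write the cache file, and return the freshly built array.
function load_language(string $cacheFile, callable $fetchFromDb): array
{
    if (is_file($cacheFile)) {
        $raw = file_get_contents($cacheFile);
        $data = ($raw === false) ? false : @gzuncompress($raw);
        if ($data !== false) {
            // Refuse to instantiate objects from cache contents.
            $lang = unserialize($data, ['allowed_classes' => false]);
            if (is_array($lang)) {
                return $lang;
            }
        }
        // Unreadable or corrupt cache: fall through and rebuild it.
    }

    $lang = $fetchFromDb();
    // LOCK_EX reduces the chance of two requests interleaving their writes.
    file_put_contents($cacheFile, gzcompress(serialize($lang)), LOCK_EX);
    return $lang;
}
```

A call could then look like `load_language('/files/languages/my_page.de', $queryCallback)`, where `$queryCallback` is whatever closure runs the database query for that page and language.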

Finally, this allows you to build in update pages accessible to non-programmers but you also control when changes appear by deciding when to remove cache files so they can be rebuilt by the system.
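Controlling when changes appear then comes down to deleting the relevant cache files. A minimal sketch, assuming the `my_page.<language>` naming used above and a hypothetical helper name `clear_language_cache()`:

```php
<?php
// Delete every cached language file for one page, e.g. my_page.en, my_page.de.
// Returns the number of files removed.
function clear_language_cache(string $cacheDir, string $page): int
{
    $removed = 0;
    // glob() can return false on error, so normalize that to an empty list.
    $files = glob(rtrim($cacheDir, '/') . '/' . $page . '.*') ?: [];
    foreach ($files as $file) {
        if (is_file($file) && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```

The next request for that page finds no cache file and rebuilds it from the database.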

Warnings and Pitfalls

When I kept everything in the database directly we hit some MAJOR slowdowns when our traffic spiked.

Trying to keep them in flat-file arrays only was so much trouble because updates were painful and prone to errors.

Not GZIP compressing the contents of the cache files made the language system about 20% slower in my benchmarks.

Make sure all of your database fields containing language strings use a UTF-8 collation such as utf8_general_ci (or at least one of the UTF-8 options; I find general_ci best for my use). If you don't, you will not be able to store non-Latin character sets in your database (like Chinese, Japanese, etc.).

Extension: In response to a comment below, be sure to set your database tables up with page level language strings in mind.

id      string              page        global
1       hello               NULL          1
2       good_morning        my_page.php   0

Anything that shows up in headers or footers can have a global flag that will be queried in every cache file created, otherwise query them by page to keep your system responsive.
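A query along these lines could collect the keys for one page's cache file. This is only a sketch: the table name `lang_strings` is made up, the `global` column is renamed `is_global` here for clarity, and an in-memory SQLite database (via the pdo_sqlite extension) stands in for the real server.

```php
<?php
// Assumes pdo_sqlite; an in-memory database stands in for the real one.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE lang_strings (
    id INTEGER PRIMARY KEY, string TEXT, page TEXT, is_global INTEGER)');
$db->exec("INSERT INTO lang_strings (string, page, is_global) VALUES
    ('hello', NULL, 1),
    ('good_morning', 'my_page.php', 0),
    ('checkout_title', 'checkout.php', 0)");

// Fetch the string keys a page's cache file needs: globals plus its own.
function strings_for_page(PDO $db, string $page): array
{
    $stmt = $db->prepare(
        'SELECT string FROM lang_strings
         WHERE is_global = 1 OR page = :page
         ORDER BY id'
    );
    $stmt->execute([':page' => $page]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

print_r(strings_for_page($db, 'my_page.php'));
```

Strings that belong to other pages are left out, which is what keeps each cache file small.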

7
votes

PHP arrays are indeed the fastest way to load translations. However, you really don't want to update these files by hand in an editor. This might work in the beginning, and for one or two languages, but when your site grows this gets really hard to maintain.

I advise you to setup a few simple tables in a database where you keep the translations, and build a simple app that lets you update the translations (some forms to add and update texts). As for the database: use one table to store translation variables; use another to link translations to these variables.

Example:

`text`

id  variable
1   hello
2   bye

`text_translations`

id  textId  language  translation
1   1       en        hello
2   1       de        hallo
3   2       en        bye
4   2       de        tschüss

So what you do is:

  • create the variable in the first table

  • add translations for it in the second table (in whatever language you want)

After you've updated the translations, create/update a language file for each language that you're using:

  • select the variables you need and their translations (tip: use English if there's no translation)

  • create a big array with all this stuff, e.g.:

$texts = array('hello' => 'hallo', 'bye' => 'tschüss');

  • write the array to a file, e.g.:

file_put_contents('de.php', serialize($texts));

  • in your PHP/HTML create the array from the file (based on selected language by user), e.g.:

$texts = unserialize(file_get_contents('de.php'));

  • in your PHP/HTML use the variables, e.g.:

<h1><?php echo $texts['hello']; ?></h1>

or if you like/enabled PHP short tags:

<p><?=$texts['bye'];?></p>

This setup is very flexible, and with a few forms to update the translations it's easy to keep your site up to date in multiple languages.
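The "select the variables and their translations, using English when none exists" step could be done in a single query. The sketch below uses the two tables from this answer, an in-memory SQLite database (pdo_sqlite assumed) in place of the real one, and a hypothetical helper name `build_texts()`; the extra `thanks` row is invented to exercise the fallback.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE text (id INTEGER PRIMARY KEY, variable TEXT)');
$db->exec('CREATE TABLE text_translations (
    id INTEGER PRIMARY KEY, textId INTEGER, language TEXT, translation TEXT)');
$db->exec("INSERT INTO text (id, variable) VALUES (1, 'hello'), (2, 'bye'), (3, 'thanks')");
// 'thanks' has no German entry, so it should fall back to English.
$db->exec("INSERT INTO text_translations (textId, language, translation) VALUES
    (1, 'en', 'hello'), (1, 'de', 'hallo'),
    (2, 'en', 'bye'),   (2, 'de', 'tschüss'),
    (3, 'en', 'thanks')");

// Build variable => translation for one language, falling back to the
// fallback language and finally to the variable name itself.
function build_texts(PDO $db, string $language, string $fallback = 'en'): array
{
    $sql = 'SELECT t.variable,
                   COALESCE(tr.translation, fb.translation, t.variable) AS translation
            FROM text t
            LEFT JOIN text_translations tr ON tr.textId = t.id AND tr.language = :lang
            LEFT JOIN text_translations fb ON fb.textId = t.id AND fb.language = :fallback
            ORDER BY t.id';
    $stmt = $db->prepare($sql);
    $stmt->execute([':lang' => $language, ':fallback' => $fallback]);
    // FETCH_KEY_PAIR maps the first column to the second.
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

$texts = build_texts($db, 'de');
```

The resulting `$texts` array is what would then be serialized and written to the language file.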

6
votes

I'd also suggest the Zend Framework's Zend_Translate package.

The manual gives a good overview on How to decide which translation adapter to use. Even when not using ZF, this will give you some ideas about what is out there and what the pros and cons are.

Adapters for Zend_Translate

  • Array
    • Use PHP arrays
    • Small pages; simplest usage; only for programmers
  • Csv
    • Use comma separated (.csv/.txt) files
    • Simple text file format; fast; possible problems with unicode characters
  • Gettext
    • Use binary gettext (*.mo) files
    • GNU standard for Linux; thread-safe; needs tools for translation
  • Ini
    • Use simple ini (*.ini) files
    • Simple text file format; fast; possible problems with unicode characters
  • Tbx
    • Use termbase exchange (.tbx/.xml) files
    • Industry standard for inter application terminology strings; XML format
  • Tmx
    • Use tmx (.tmx/.xml) files
    • Industry standard for inter application translation; XML format; human readable
  • Qt
    • Use qt linguist (*.ts) files
    • Cross platform application framework; XML format; human readable
  • Xliff
    • Use xliff (.xliff/.xml) files
    • A simpler format than TMX but related to it; XML format; human readable
  • XmlTm
    • Use xmltm (*.xml) files
    • Industry standard for XML document translation memory; XML format; human readable
1
votes

If you need to provide a web interface for adding/editing translations, then a database is a good idea.

If, however, your translations are static, I would use gettext or even a plain PHP array.

Either way you can take advantage of Zend_Translate.
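For the gettext route without Zend_Translate, PHP's own gettext extension can be used directly. This is a sketch under several assumptions: the extension is loaded, the `de_DE.UTF-8` locale is installed on the server, and a compiled catalog exists at `locale/de_DE/LC_MESSAGES/messages.mo`. The path, the `messages` domain name, and the helpers `init_gettext()` and `tr()` are all made up for the example.

```php
<?php
// Point gettext at a locale and a catalog directory.
// Returns false when the gettext extension is not loaded.
function init_gettext(string $locale, string $domain, string $dir): bool
{
    if (!function_exists('gettext')) {
        return false;
    }
    // LC_MESSAGES is only defined when PHP was built with libintl.
    setlocale(defined('LC_MESSAGES') ? LC_MESSAGES : LC_ALL, $locale);
    bindtextdomain($domain, $dir);
    bind_textdomain_codeset($domain, 'UTF-8');
    textdomain($domain);
    return true;
}

// Translate a message; return it unchanged when gettext is unavailable
// or no translation is found in the catalog.
function tr(string $message): string
{
    return function_exists('gettext') ? gettext($message) : $message;
}

init_gettext('de_DE.UTF-8', 'messages', __DIR__ . '/locale');
echo tr('Good morning'), PHP_EOL;
```

With gettext the original string doubles as the lookup key, so a missing catalog or entry simply leaves the English text in place.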

A small comparison (the first two entries are from the Zend tutorial):

  1. Plain PHP arrays: Small pages; simplest usage; only for programmers.
  2. Gettext: GNU standard for Linux; thread-safe; needs tools for translation.
  3. Database: Dynamic; worst performance.
0
votes

I would recommend PHP arrays; a GUI can be built around them for easy editing.

0
votes

Keep in mind that most people who use computers already know the common English terms used on the web, such as About Us, Home, Send, Delete, and Read More. The question is: do these really need to be translated?

Honestly, translating those words is often less a matter of 'required' and more a matter of 'style'.

If you do want them translated, then for common words that will rarely or never change, it is better to use a PHP file that returns a language array, one for the local language and one for English. For content such as blog posts, news, and descriptions, use a database and store a translation for each language you need. You will have to translate these manually.

Relying on Google Translate? Think very carefully before doing that, at least for this decade.