
I'm using the ICU Message Format for i18n in an application.

Some of the strings involve dynamic-length comma separated lists. For instance, the string:

"There are three pets: a dog, a fish, a cat."

may be generated with this message:

"There are {count} pets: {list}"

Where count is the length of the list, and list is the joined string of the individual items. (As an aside, were this a real string I'd be pluralizing "pets" based on count, but let's keep it simple.)

In pseudocode, the list variable might be generated like so:

pets.join(', ');

That last bit is what I'm not a fan of. It seems to make sense only for LTR languages, and possibly only for a subset of them.

I have two questions:

  1. How should comma-separated lists be formatted in other languages, such as RTL languages?
  2. Does the ICU Message Format support that in any way, or does it require a system in addition to ICU to generate the lists?

For what it's worth, this is a JavaScript webapp, although the answers to these questions are probably language-agnostic.

1 Answer

List formatting is locale-sensitive: not all languages use the ASCII comma, or spaces, as separators. ICU has a ListFormatter for this: http://icu-project.org/apiref/icu4j/com/ibm/icu/text/ListFormatter.html

For JavaScript, the Closure Library has an equivalent: https://github.com/google/closure-library/blob/master/closure/goog/labs/i18n/listformat.js


July 2019 update

There is (finally) support for list formatting in ECMAScript, via Intl.ListFormat: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ListFormat

It is (still) not well supported, but give it some time.
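
As a rough sketch of how this could replace the pets.join(', ') call from the question: the helper name formatPetList and the join fallback are my own assumptions for illustration, not part of ICU or any library. Only Intl.ListFormat and its format method come from the standard.

```javascript
// Hypothetical helper (not part of any library): builds the {list} argument
// for the message using the locale's own separators and conjunction.
function formatPetList(pets, locale) {
  // Feature-detect, since Intl.ListFormat is not available everywhere.
  if (typeof Intl !== 'undefined' && typeof Intl.ListFormat === 'function') {
    const formatter = new Intl.ListFormat(locale, {
      style: 'long',       // full conjunction word rather than an abbreviated form
      type: 'conjunction', // an "and"-style list, as opposed to 'disjunction' ("or")
    });
    return formatter.format(pets);
  }
  // Assumed fallback: the original, locale-unaware behaviour.
  return pets.join(', ');
}

const pets = ['a dog', 'a fish', 'a cat'];
const list = formatPetList(pets, 'en');

// In the real app, count and list would be passed to the ICU message
// "There are {count} pets: {list}"; a template string stands in for it here.
console.log(`There are ${pets.length} pets: ${list}.`);
```

With the 'en' locale, and where Intl.ListFormat is available, the list comes out as "a dog, a fish, and a cat". Passing another locale tag should pick up that locale's separators and conjunction, provided the runtime ships the locale data for it.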