
I am in file-encoding hell with Puppet. Even the simplest test does not work:

hiera-data/test.yaml:
---
test: Äñö

init.pp:
  $test = hiera('test')
  file { "/root/encoding.txt":
    ensure  => file,
    content => $test
  }

On the Puppet server everything looks fine:

puppet:~ # file -i /etc/puppetlabs/puppet/hiera-data/env/test.yaml
/etc/puppetlabs/puppet/hiera-data/env/test.yaml: text/plain charset=utf-8
puppet:~ # cat /etc/puppetlabs/puppet/hiera-data/env/test.yaml
---
test: Äñö
puppet:~ # locale
LANG=POSIX
LC_CTYPE=en_US.UTF-8

On the puppet agent:

puppet-test:~ # locale
LANG=POSIX
LC_CTYPE=en_US.UTF-8

After a Puppet run, the file is reported as UTF-8:

puppet-test:~ # file -i encoding.txt
encoding.txt: text/plain charset=utf-8

but its content is not the expected string:

puppet-test:~ # cat encoding.txt

Here is the hex dump of encoding.txt that was asked for:

0000000: efbf bdef bfbd efbf bdef bfbd efbf bdef  ................
0000010: bfbd 0a                                  ...
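
To make the dump easier to read, here is a minimal sketch that compares the bytes expected for "Äñö" in UTF-8 with the bytes actually found. The three-byte sequence ef bf bd is the UTF-8 encoding of U+FFFD (REPLACEMENT CHARACTER), and it appears once for each of the six expected bytes, which suggests (but does not prove) that every byte of the original string was replaced individually somewhere along the way. The variable names are made up for illustration, and `grep -o` is assumed to be available.

```shell
#!/bin/sh
# Expected content: "Äñö" in UTF-8, written as octal escapes so that
# this script stays pure ASCII regardless of the editor's encoding.
expected_hex=$(printf '\303\204\303\261\303\266' | od -An -tx1 | tr -d ' \n')

# Content found in encoding.txt (without the trailing newline):
# six copies of ef bf bd, copied from the dump above.
actual_hex=$(printf '\357\277\275\357\277\275\357\277\275\357\277\275\357\277\275\357\277\275' | od -An -tx1 | tr -d ' \n')

echo "expected: $expected_hex"
echo "actual:   $actual_hex"

# Count how often the U+FFFD byte sequence occurs in the actual content.
replacement_count=$(printf '%s\n' "$actual_hex" | grep -o 'efbfbd' | wc -l | tr -d ' ')
echo "U+FFFD occurrences: $replacement_count"
```

The same pipeline (`od -An -tx1`) can be pointed at the real file on the agent instead of the hard-coded bytes.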

Running hiera directly does not provide any further insight. In particular, I can only try it on the server, since the agent does not have the sources.

My environment is quite outdated, but I am not allowed to move to a newer version, at least not yet, without a GOOD reason:

  • SuSE Enterprise Linux 11 Service Pack 3
  • Puppet Enterprise 3.8.6
  • pe-ruby-1.9.3.551-9.pe.sles11

I would appreciate any insight into this problem.

Are you sure that the problem is not your terminal? Open encoding.txt in a hex editor (or in a text editor with a binary / hex mode) and check whether the file contains the correct bytes. In particular, the UTF-8 encoding for the string you present would consist of these six bytes: c3 84 c3 b1 c3 b6. If it contains something different, then please add that to your question. - John Bollinger
I updated the question with the missing info. Thanks, @JohnBollinger - mmoossen
After a second look at this, this does not seem like a Puppet problem. This seems like a system environment problem. - Matt Schuchard
@MattSchuchard: could be, but what could I check to investigate further? - mmoossen
What do you get if you use hiera from the command line? e.g.: ` # hiera -d test > test-encoding.txt ` (you may have to use -c <config> to find the right files) - Peter Faller

1 Answer


Having LANG=POSIX can definitely make things screwy. LANG supplies the default for every locale category that is not set explicitly, so here everything except LC_CTYPE falls back to the POSIX locale. It is usually desirable to have LANG and LC_CTYPE matching.

See: "Explain the effects of export LANG, LC_CTYPE, LC_ALL".

And for more info on LANG=POSIX see POSIX Locale.
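
As a rough sketch of how the locale variables mentioned above interact, the helper below resolves which value applies to character handling, following the standard lookup order: LC_ALL first, then LC_CTYPE, then LANG. The function name `effective_ctype` is hypothetical, the final fallback to "POSIX" is an assumption (the real default is implementation-defined), and LC_ALL is assumed to be unset on the machines in the question because it does not appear in the `locale` output shown.

```shell
#!/bin/sh
# Resolve the locale used for character classification and encoding.
# Lookup order: LC_ALL, then LC_CTYPE, then LANG.
# $1 = LC_ALL, $2 = LC_CTYPE, $3 = LANG (any of them may be empty)
effective_ctype() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"
    else
        # Assumption: treat "nothing set" as the POSIX locale.
        printf 'POSIX\n'
    fi
}

# Values from the question (LC_ALL assumed unset): prints en_US.UTF-8
effective_ctype "" "en_US.UTF-8" "POSIX"

# The same resolution for the current shell environment.
effective_ctype "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"
```

To try matching values for a single run without changing the system configuration, something like `LANG=en_US.UTF-8 puppet agent --test` could be used on the agent. Whether that changes the result in this particular setup is untested.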

Separately, Puppet 3.8 definitely has defects in displaying and/or persisting Unicode characters. A lot of work has gone into more recent Puppet releases to fully internationalize and localize Puppet.