I am seeing strange behavior in Perl when decoding a JSON string that contains Unicode escapes and comes from a PHP script's json_encode
function. I reduced the problem to the following code:
#!/usr/bin/perl
use CGI;
use JSON;
print CGI::header(-type=>'text/html', -charset=>'UTF-8');
print decode_json('{"test_1" : "= \u00F9 ="}')->{'test_1'};
print '<br>';
print decode_json('{"test_2" : "= \u00F9 \u0121 ="}')->{'test_2'};
When I run this script in a browser, I see the following:
= � =
= ù ġ =
The first line contains a "broken character"; the second is correct. What I think is happening is that, for some reason, Perl outputs the first string in ISO-8859-1 encoding: if I change the page encoding to ISO-8859-1, the first line is displayed correctly, but the second is then broken.
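A minimal sketch for checking that suspicion, using only core Perl. It assumes that decode_json returns character strings equivalent to the two literals below; the helper names code_points and has_wide_char are made up for illustration. On a handle without an encoding layer, print writes a string whose characters all fit in one byte as raw bytes (so U+00F9 becomes the single byte F9, which is "ù" in ISO-8859-1 but invalid on its own in UTF-8), whereas a string containing any character above 0xFF is written as UTF-8 with a "Wide character" warning.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-ins for the two decoded values (assumption: decode_json returns
# character strings equivalent to these literals).
my $test_1 = "= \x{F9} =";          # highest code point is U+00F9
my $test_2 = "= \x{F9} \x{121} =";  # contains U+0121, which is above 0xFF

# Format every character of a string as a hexadecimal code point.
sub code_points {
    my ($string) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $string;
}

# Return 1 if at least one character does not fit in a single byte, else 0.
sub has_wide_char {
    my ($string) = @_;
    return $string =~ /[^\x00-\xFF]/ ? 1 : 0;
}

print code_points($test_1), ' wide=', has_wide_char($test_1), "\n";
print code_points($test_2), ' wide=', has_wide_char($test_2), "\n";
```

If the first string reports wide=0 and the second wide=1, that would be consistent with the two lines being written in different encodings.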
My Perl version is 5.10.1 and the JSON version is 2.51.
Question: how can I force Perl's decode_json to return UTF-8 characters in the first print?
Note: I can fix the problem by manually converting the first output to UTF-8, but this requires the installation of an additional "Encoder" module, which I want to avoid.
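For reference, the manual conversion mentioned in the note looks roughly like this. It is only a sketch: it assumes the module in question is Perl's Encode and its encode function, and it uses a string literal as a stand-in for the decoded test_1 value.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-in for the decoded "test_1" value (assumption: same characters
# as the string returned by decode_json).
my $test_1 = "= \x{F9} =";

# Convert the character string into UTF-8 encoded bytes.
my $utf8_bytes = encode('UTF-8', $test_1);

# Show the resulting bytes in hex so the encoding is visible.
my $hex = join ' ', map { sprintf('%02X', ord($_)) } split //, $utf8_bytes;
print "$hex\n";
```

After the conversion, U+00F9 is represented by the two bytes C3 B9, which is what a page declared as UTF-8 expects.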