3
votes

I'm dealing with a site transfer, and in the process my charset got fouled up. At first, I transferred all files with no alterations, and the files on the new server showed <?> icons for special characters. A glance at the browser's character encoding (Chrome and FF) told me it was auto-detecting UTF-8. The meta charset of pages is set to ISO-8859-1. Copy is drawn from various tables in multiple databases (don't ask).

On the original site, everything displayed as it should. On the new site, <?>... I dug into it, found default_charset = "UTF-8" in php.ini, and set it to nothing. Now the majority of pages on the site display fine, the browser recognizes the meta charset tag, and everybody's happy; that is, until I navigate to a folder off root.

The files in this folder, although their meta charset is ISO-8859-1, are somehow telling the browser to read them as UTF-8, which means I'm seeing <?> on these pages. If I set the browser to ISO-8859-1, the pages display fine, but auto-detect resets it to UTF-8. Any ideas?

Thank you!

Update (added from comment below):

I ran the page through the W3C checker as recommended by martinstoeckli, and it tells me that the HTTP Content-Type is Content-Type: text/html; charset=utf-8 while the meta tag is <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>, which gives me a Conflicting character encoding declarations error. Crazy thing is, I can't for the life of me figure out where the UTF-8 declaration is coming from! It's nowhere in any file, all files were saved UTF-8 w/out BOM, the php.ini is set to declare no default, the folder's .htaccess is set like PatomaS suggests.

(For what it's worth, Mozilla's Web Sniffer confirms HTTP header Content-Type of text/html; charset=utf-8.)

Update: While we did not reach a solution to this problem as I posed it, I did decide that the best way to resolve my character encoding issues is to refactor everything to use UTF-8 encoding. Of course, this probably means you will see me on here with more exciting newbie questions like "Just why won't utf8_encode() do my łâùñdrÿ?"

Of course, that means the mystery remains: what is causing the server to send HTTP Content-type charset headers of UTF-8 when it appears that everything is configured differently?

5 Answers

4
votes

Your files may have a BOM (Byte Order Mark) in them. To check, open a file in Notepad++ and look at the Encoding menu; as a test, you can also select the Convert to ANSI option.
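
If Notepad++ is not at hand, the same check can be sketched in PHP. This is only an illustration: the function name has_utf8_bom is made up for this example, and it relies on nothing more than the fact that a UTF-8 BOM is the three bytes EF BB BF at the very start of a file.

```php
<?php
// Hypothetical helper: reports whether a byte string starts with the UTF-8 BOM (EF BB BF).
function has_utf8_bom(string $bytes): bool
{
    // strncmp() compares at most the first three bytes, so longer input is fine.
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}
```

For a quick test against a real file, something like has_utf8_bom(file_get_contents('index.php')) should do; the file name here is just a placeholder.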

BTW, using UTF-8 everywhere is the better long-term approach, and the one I would suggest.
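
Moving to UTF-8 means converting the stored ISO-8859-1 text, not just relabeling it. A minimal sketch, assuming the input bytes really are ISO-8859-1 and that the iconv extension is available (it usually is, but that is an assumption); latin1_to_utf8 is a name invented for this example.

```php
<?php
// Hypothetical helper: converts an ISO-8859-1 byte string to UTF-8.
// Assumes the input really is ISO-8859-1; input that is already UTF-8
// would end up double-encoded.
function latin1_to_utf8(string $latin1): string
{
    $utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
    if ($utf8 === false) {
        throw new RuntimeException('iconv conversion failed');
    }
    return $utf8;
}

// "é" is the single byte E9 in ISO-8859-1 and the two bytes C3 A9 in UTF-8.
echo bin2hex(latin1_to_utf8("caf\xE9")), "\n"; // 636166c3a9
```

Pure ASCII text passes through unchanged, so only strings containing bytes above 7F actually differ after conversion.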

4
votes

There is a wonderful W3C checker for all kinds of encoding problems.

4
votes

Since PHP 5.6, the default_charset directive defaults to UTF-8. This can be a problem for pages whose meta tag declares latin1, and it can cause conflicts in validation services. You can override the directive by calling ini_set('default_charset', 'iso-8859-1') in your scripts.

To do that, put this piece of code at the beginning of each PHP file you want served as latin1:

example: index.php

<?php
  // Resolve the document root and load the shared config before any output.
  $server_root = realpath($_SERVER["DOCUMENT_ROOT"]);
  $config_serv = "$server_root/php/config.php";
  include($config_serv);
?>

Then create a folder "php" under your website root and put this piece of code into config.php:

example: config.php

<?php
  ##########################################################################
  # Server Directive - Override default_charset utf-8 to latin1 in php.ini #
  ##########################################################################
  @ini_set('default_charset', 'ISO-8859-1');
?>

If your php.ini is set to latin1 (ISO-8859-1) and you want to serve a UTF-8 (Unicode) page, you can force the encoding the same way, using utf-8 instead of iso-8859-1:

example: config.php

<?php
  ##########################################################################
  # Server Directive - Override default_charset latin1 to utf-8 in php.ini #
  ##########################################################################
  @ini_set('default_charset', 'UTF-8');
?>
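
To confirm that the override took effect, or to state the charset explicitly, something like the following could sit in the same config.php. It is a sketch, not part of the code above: content_type_header is a hypothetical helper, and sending an explicit Content-Type header is a different technique from the ini_set() override, shown here only for comparison.

```php
<?php
// Hypothetical helper: builds the Content-Type header line for an HTML page.
function content_type_header(string $charset): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Same override as in config.php; ini_get() shows the value PHP will now use.
ini_set('default_charset', 'ISO-8859-1');
$charset = ini_get('default_charset');

// An explicit header states the charset directly instead of relying on
// default_charset. It must be sent before any output.
if (!headers_sent()) {
    header(content_type_header($charset));
}
```

If the page still arrives as UTF-8 after this, the header is probably being set outside PHP, for example by the web server.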

I hope you find my answer useful; it solved my problem! Firefox's HTML/CSS validator was reporting my pages as latin1 and the headers as UTF-8, and this resolved the conflict.

0
votes

You can set this in your root .htaccess for all the files you want served as iso-8859-1:

<FilesMatch "\.(htm|html|xhtml|xml|css|js|php)$">
    AddDefaultCharset iso-8859-1
</FilesMatch>

Remember that the HTTP headers sent by the server take priority over inline meta declarations.
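
That priority rule can be illustrated with the two declarations quoted in the question. The sketch below is only a model of the rule, not browser or validator code, and both function names are hypothetical.

```php
<?php
// Hypothetical helper: extracts the charset parameter from a Content-Type value,
// lowercased, or null when none is present.
function charset_of(string $contentType): ?string
{
    if (preg_match('/charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $contentType, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Hypothetical helper: the HTTP header wins when it names a charset;
// the meta declaration is only a fallback.
function effective_charset(string $httpContentType, string $metaContent): ?string
{
    return charset_of($httpContentType) ?? charset_of($metaContent);
}

// Values taken from the question: the header says utf-8, the meta tag says ISO-8859-1.
echo effective_charset('text/html; charset=utf-8', 'text/html; charset=ISO-8859-1'), "\n"; // utf-8
```

This is why fixing the meta tag alone does not help: as long as the server sends charset=utf-8 in the header, the meta value is never consulted.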

0
votes

It appears that Apache can force the charset to a default value (UTF-8) even if you specify it in your code.

This option is in your httpd.conf file and is called AddDefaultCharset. You need to comment it out to let your code decide.

This has solved my problem.

source : https://major.io/2007/11/15/change-the-default-apache-character-set/
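
To track down where such a directive is set, a small scan of the configuration text may help. This is a rough sketch with a made-up helper name, find_add_default_charset; it only looks at the text you hand it, so httpd.conf, any included files, and each .htaccess would have to be checked separately, and their locations depend on the installation.

```php
<?php
// Hypothetical helper: returns the active (uncommented) AddDefaultCharset lines
// found in Apache configuration text, keyed by 1-based line number.
function find_add_default_charset(string $configText): array
{
    $hits = [];
    foreach (preg_split('/\R/', $configText) as $i => $line) {
        $trimmed = trim($line);
        // Lines starting with "#" are comments in Apache config files.
        if ($trimmed !== '' && $trimmed[0] !== '#'
            && stripos($trimmed, 'AddDefaultCharset') === 0) {
            $hits[$i + 1] = $trimmed;
        }
    }
    return $hits;
}

// Sample text: line 1 is commented out, line 2 is active.
$sample = "# AddDefaultCharset ISO-8859-1\nAddDefaultCharset UTF-8\n";
print_r(find_add_default_charset($sample));
```

For the sample text, only line 2 is reported, since line 1 is a comment.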