1
votes

My csv file contains special characters like 'æ', 'å' etc. When I read and print the file, the special characters get converted into '�'. I tried setting the page encoding to UTF-8 and to ISO 8859-1, but neither helped.

Could somebody advise a solution?

1
When I have no idea what an original file's code page is, I open it in a browser and play with Menu->Encoding->... - Andrey Volk
It is not UTF-8 encoded, and it does not look like ISO-8859-1 either. '�' means that the value could not be found in the encoding table. You have to find out which encoding the csv file uses, e.g. by doing as Andrey Volk has proposed. - Diblo Dk

1 Answer

4
votes

I think you have to detect and change the original encoding as follows (if you are using PHP):

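  <?php
        header( "Content-Type: text/html; charset=utf-8" );
        // $fileName is assumed to hold the path to the CSV file.
        $csvContent = file_get_contents( $fileName );
        // Candidates are tried in strict mode. ISO-8859-1 is an assumed fallback:
        // single-byte encodings cannot be told apart reliably, so adjust it to the file's real code page.
        $fileEncoding = mb_detect_encoding( $csvContent,
                                            array("UTF-8","ISO-8859-1"),
                                            TRUE );

        // Convert only when a non-UTF-8 encoding was detected.
        if( $fileEncoding !== FALSE && $fileEncoding !== "UTF-8" ) {
             $csvContent = mb_convert_encoding( $csvContent, "UTF-8", $fileEncoding );
        }

        // Split on any line ending, not just the server's PHP_EOL.
        foreach( preg_split( '/\r\n|\r|\n/', $csvContent ) as $item ) {
           var_dump( $item );
        }
 ?>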
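
If automatic detection is inconclusive, the browser trick from the comments can be approximated in code. This is only a sketch: `previewEncodings` is a hypothetical helper, and the candidate list is an assumption that you would swap for the code pages you actually suspect. It decodes the same raw bytes under each candidate so you can see which one produces the expected 'æ' and 'å'.

```php
<?php
// Hypothetical helper (not part of mbstring): decode the same raw bytes
// under several candidate encodings so the results can be compared by eye.
function previewEncodings(string $bytes, array $candidates): array
{
    $previews = [];
    foreach ($candidates as $candidate) {
        // Interpret $bytes as $candidate and re-encode as UTF-8 for display.
        $previews[$candidate] = mb_convert_encoding($bytes, 'UTF-8', $candidate);
    }
    return $previews;
}

// "æ" and "å" as they would be stored in an ISO-8859-1 file: bytes 0xE6 and 0xE5.
$sample = "\xE6\xE5";

// These bytes are not valid UTF-8, which is why a UTF-8 page shows '�'.
var_dump(mb_check_encoding($sample, 'UTF-8')); // bool(false)

// Assumed candidate list; replace with the encodings you suspect.
$candidates = ['ISO-8859-1', 'Windows-1252', 'ISO-8859-2'];
foreach (previewEncodings($sample, $candidates) as $name => $text) {
    echo $name, ': ', $text, PHP_EOL;
}
```

In practice you would pass the first few hundred bytes of the real file instead of `$sample`, then use whichever candidate reads correctly as the source encoding for `mb_convert_encoding`.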