0
votes

I use PHP to convert a file's encoding, but the output has an extra empty line after each line.

<?php
// Read all of stdin, convert it from ISO-8859-1 to UTF-8, and print the result.
$t_comment = file_get_contents('php://stdin');
$t_comment = iconv("ISO-8859-1", "UTF-8//IGNORE", $t_comment);
echo $t_comment;

Here is myfile.txt:

line1 [CR][LF]
line2 [CR][LF]
line3 [CR][LF]

Conversion command (BATCH):

C:\> type myfile.txt | php.exe myscript.php

or

C:\> php.exe myscript.php < myfile.txt

Result:

line1 [LF]
[LF]
line2 [LF]
[LF]
line3 [LF]
[LF]

Can you help me fix this?

2
Can you provide a hexdump of your input? - Janus Troelsen
Show how you call your script and how you provide STDIN from outside it. Is it cat myfile.txt | php myscript.php? Did you try php myscript.php < myfile.txt? Which shell are you using? - Kaii
@Kaii I have to use a Windows BATCH file. I tried both syntaxes and it doesn't change anything. It really seems that iconv behaves strangely... - sinsedrix

2 Answers

3
votes

There are alternative functions: utf8_encode and mb_convert_encoding. That said, I tested iconv locally and it works, although I am not using Windows.
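
As a rough illustration of the mb_convert_encoding alternative, here is a minimal sketch. It assumes the mbstring extension is enabled; latin1_to_utf8 is a hypothetical helper name, not part of PHP. It prints the converted bytes as hex so the line ending stays visible.

```php
<?php
// Sketch: convert ISO-8859-1 text to UTF-8 with mb_convert_encoding
// instead of iconv. latin1_to_utf8() is a hypothetical helper name.
function latin1_to_utf8(string $text): string
{
    // Argument order: string, target encoding, source encoding.
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// "\xE9" is "é" in ISO-8859-1; the CR LF line ending should pass through untouched.
$converted = latin1_to_utf8("caf\xE9\r\n");
echo bin2hex($converted), PHP_EOL; // prints 636166c3a90d0a
```

The conversion itself only rewrites the non-ASCII byte; the 0d 0a pair at the end is left as it was.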

Since you are on Windows and use type to feed the PHP script its input, you should check the active console code page, which may affect the output of the type command: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/chcp.mspx?mfr=true

First, rule out iconv as the cause by replacing the first line of your code with the following and seeing what happens:

$t_comment = "line 1 \r\nline 2 \r\nline 3 \r\n";
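
To check what iconv returns for that hard-coded string before the console or a pipe touches it, one option is to print the bytes as hex. This is only a sketch; hex_bytes is a hypothetical helper, not part of PHP. In the output, 0d is CR and 0a is LF.

```php
<?php
// Sketch: show the raw bytes of a string as space-separated hex pairs.
// hex_bytes() is a hypothetical helper name.
function hex_bytes(string $text): string
{
    return implode(' ', str_split(bin2hex($text), 2));
}

$t_comment = "line 1 \r\nline 2 \r\nline 3 \r\n";
$converted = iconv("ISO-8859-1", "UTF-8//IGNORE", $t_comment);

// Each line ending should appear as "0d 0a" if iconv leaves it alone.
echo hex_bytes($converted), PHP_EOL;
```

If every line still ends in 0d 0a here, the extra line breaks are being introduced somewhere after the conversion.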

0
votes

I found that the extra newlines were caused by chaining two pipes. So I decided to merge both scripts so that only one pipe is needed.

So I changed:

C:\> type myfile.txt | php.exe myscript1.php | php.exe myscript2.php

to:

C:\> type myfile.txt | php.exe myscript1-merge2.php

This fixes the problem, but it doesn't explain where the extra newlines came from.
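
One way to narrow down where the extra newlines appear is to count the CR and LF bytes arriving at each stage of the pipeline. The following is a sketch only; count_line_endings and the script name used below are hypothetical.

```php
<?php
// Sketch: report how many CR and LF bytes arrive on stdin.
// count_line_endings() is a hypothetical helper name.
function count_line_endings(string $data): array
{
    return [
        'CR' => substr_count($data, "\r"),
        'LF' => substr_count($data, "\n"),
    ];
}

$counts = count_line_endings((string) file_get_contents('php://stdin'));
printf("CR: %d, LF: %d%s", $counts['CR'], $counts['LF'], PHP_EOL);
```

Saved as, say, count-endings.php, it could replace the last stage of each pipeline in turn:

C:\> type myfile.txt | php.exe count-endings.php
C:\> type myfile.txt | php.exe myscript1.php | php.exe count-endings.php

Comparing the counts between the two runs would show whether the first script's output already contains more line-ending bytes than its input.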