9
votes

I am trying to write a UTF-16 encoded file using std::ofstream(). Even in binary mode writing "\n\0" is written as "\r\n\0". Sample code:

std::string filename = ...
std::ofstream fout(filename, std::ios_base::binary);
fout.write("\xff\xfe", 2);
fout.write("\n\0", 2);
fout.close();

The resulting file's hex data is:

ff fe 0d 0a 00

I must be doing something wrong. Any ideas to prevent the 0x0d being written?

I am using MS VisualStudio 2013.

Update: It inexplicably started working as expected. Chalk it up to ghosts in the machine.

2
In binary mode there should be no translation. If there is a translation there is an error somewhere. - Dietmar Kühl
What about fout.write("\x0a\0", 2);? - πάντα ῥεῖ
That would work, but the OP is going about this the wrong way. This path will only lead to more pain and misery and Unicode gotchas. - Jesse Weigert
"\x0a" has the same effect. Jesse, I need to do it this way for other reasons, but the gist of the problem is that \r should not be written in binary mode. - Jason
Write bytes to the stream, not text. Use something like this: const char buffer[] = {'\xff', '\xfe', '\x0a', '\x00'}; fout.write(buffer, 4); - Jesse Weigert

2 Answers

1
votes

You sent 4 bytes to be output. 5 were observed in the output.

You were somehow not using binary mode. There is no other way you could use .write(buf, 2) and .write(buf, 2) and get 5 bytes of output.

Most likely, while experimenting (as people always do when trying to figure out odd behavior), something you changed caused the stream to actually be opened in binary mode.

If you were earlier attempting to output to either stdout or stderr, it's entirely possible that the '\r' was being added to the stream automatically: on Windows, stdout and stderr are opened in text mode by default, and this could have been overriding your attempt to put the output into binary mode. (Yes, really, with Visual Studio. If you use Cygwin this isn't true, but you're using VS.)

-5
votes

That's by design. The \n character is converted to the EOL marker for your platform and so the ofstream::write function is correctly interpreting it. If you want to write a binary file, you can't use special text characters.

Clarification: I managed to create a bit of confusion over what the compiler is doing. Basically, \n is a special character which means "EOL/End of Line". This is different depending on what platform you're compiling on.

Now the write() function takes an array of bytes to write to the stream. The C standard doesn't really differentiate between a string (technically no such thing in C) and an array of chars (or bytes), so it lets you get away with this. What is happening at compile time is that those lines are getting converted to something like this:

fout.write({255, 254, 0}, 2);   // "\xff\xfe"
fout.write({13, 10, 0, 0}, 2);  // "\n\0"
fout.close();