15
votes

I recently ran into a problem caused by using fstream::eof(). I read the following line from here:

The function eof() returns true if the end of the associated input file has been reached, false otherwise.

and (mistakenly) assumed this meant that if I used fstream::read() and read past the end of the file, the function eof() would tell me. So I did something like this (very generalized):

for(int i = 0; i < max && !file.eof(); i++)
{
     file.read(mything, sizeof(mything));
}

The problem came because of what is explained later on the page linked above (which I failed to read initially, thanks to the misleading first paragraph):

Conversely, the stream does not go into EOF state if there happens to be any whitespace after the last token, but trying to read another token will still fail. Therefore, the EOF flag cannot be used as a test in a loop intended to read all stream contents until EOF. Instead, one should check for the fail condition after an attempt to read.

So I changed my loop to check file.fail() rather than file.eof(), and I understand HOW eof() works. My question is, why does it work that way? Are there situations where this is desirable? It seems to me that once you've passed EOF, you've passed EOF and eof() should return true.
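A minimal sketch of the difference between the two loop shapes. It assumes an in-memory std::istringstream standing in for the file and 4-byte records; both are illustrative choices, and the function names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Counts loop iterations when eof() is tested BEFORE each read
// (the pattern from the question). Assumption: 4-byte records.
int iterations_testing_eof_first(const std::string& data, int max) {
    std::istringstream file(data);
    char mything[4];
    int iterations = 0;
    for (int i = 0; i < max && !file.eof(); i++) {
        // On the final pass this read comes up short and fails,
        // but the loop body still runs.
        file.read(mything, sizeof(mything));
        ++iterations;
    }
    return iterations;
}

// Counts loop iterations when the read itself is the loop condition,
// so the body only runs after a complete, successful read.
int iterations_testing_read(const std::string& data, int max) {
    std::istringstream file(data);
    char mything[4];
    int iterations = 0;
    for (int i = 0; i < max && file.read(mything, sizeof(mything)); i++) {
        ++iterations;
    }
    return iterations;
}
```

With 8 bytes of input (exactly two records) the first function reports 3 iterations and the second reports 2: eof() is still false after the second read, because nothing has yet tried to read past the end.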

UPDATE: Thanks for the responses, I think I've got it. The only operation I'm performing is read(), and I immediately check for fail(), so I think I'm okay. Now, my question is, what would I use eof() for?

Comment (2 votes, Martin York):
for(int i = 0; i < max && file.read(mything, sizeof(mything)); i++) {}

2 Answers

17
votes

Because this way it can detect EOF without knowing how large the file is. All it has to do is simply attempt to read and if the read is short (but not an error), then you have reached the end of the file.
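The "attempt to read, then see whether it came up short" behaviour can be observed directly. This is only a sketch: the 5-byte string, the 4-byte buffer, and the ReadTrace struct are assumptions made for illustration.

```cpp
#include <ios>
#include <sstream>

// Stream state captured while reading 5 bytes in 4-byte chunks.
struct ReadTrace {
    bool eof_before_reading;
    bool eof_after_full_read;
    std::streamsize short_read_count;
    bool eof_after_short_read;
    bool fail_after_short_read;
};

ReadTrace trace_reads() {
    std::istringstream in("abcde");  // 5 bytes of input
    char buf[4];
    ReadTrace t{};

    t.eof_before_reading = in.eof();      // nothing attempted yet

    in.read(buf, sizeof buf);             // gets all 4 requested bytes
    t.eof_after_full_read = in.eof();     // still false: the read was not short

    in.read(buf, sizeof buf);             // only 1 byte is left
    t.short_read_count = in.gcount();     // bytes actually extracted
    t.eof_after_short_read = in.eof();    // set by the short read
    t.fail_after_short_read = in.fail();  // read() also sets failbit here
    return t;
}
```

The stream never needs to know the total size up front; eofbit appears only once a read asks for more than is left, and gcount() tells you how much of the last chunk is valid.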

This mirrors the functionality of the read system call, which file IO typically ends up calling (win32 stuff may call ReadFile but I believe that the functionality is similar).

From the read manpage "RETURN VALUE" section (emphasis added):

On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. On error, -1 is returned, and errno is set appropriately. In this case it is left unspecified whether the file position (if any) changes.
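The return values described in that passage can be seen with a small POSIX-only sketch. It assumes a pipe as the data source, 5 bytes of data, and a 4-byte buffer; the function name is hypothetical, and EINTR handling is left out for brevity.

```cpp
#include <unistd.h>  // pipe, read, write, close (POSIX only)
#include <vector>

// Returns every value read() reported, ending with the 0 that marks
// end of file (or with -1 if a read failed). Empty on setup failure.
std::vector<long> read_sizes_from_pipe() {
    std::vector<long> sizes;
    int fds[2];
    if (pipe(fds) != 0) return sizes;

    const char data[] = {'a', 'b', 'c', 'd', 'e'};
    ssize_t written = write(fds[1], data, sizeof data);
    close(fds[1]);  // with no writers left, the reader can see EOF
    if (written != static_cast<ssize_t>(sizeof data)) {
        close(fds[0]);
        return sizes;
    }

    char buf[4];
    for (;;) {
        ssize_t n = read(fds[0], buf, sizeof buf);
        sizes.push_back(static_cast<long>(n));
        if (n <= 0) break;  // 0 means end of file, -1 means error
    }
    close(fds[0]);
    return sizes;
}
```

Here the reads report 4, then 1 (a short read, not an error), then 0 for end of file.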

BTW: a good way to write what you wanted would be like this:

T something;  // T should be trivially copyable for a raw byte read to make sense
while(file.read(reinterpret_cast<char*>(&something), sizeof(something))) {
    // process your 'something'
}

This works because file.read (like many members of iostream) returns a reference to the stream itself, and streams provide a conversion to bool (explicit operator bool since C++11, operator void* before that) that is true as long as neither failbit nor badbit is set. Similarly, to read from std::cin, while(std::cin >> x) { ... } works as well.
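The same idiom with formatted extraction, as a sketch. The function names are hypothetical, and a std::istringstream is used so the example does not depend on std::cin.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Sums whitespace-separated integers until an extraction fails.
long sum_ints(std::istream& in) {
    long total = 0;
    int x;
    // operator>> returns the stream; the loop condition converts it to
    // bool, which is false once an extraction has failed.
    while (in >> x) {
        total += x;
    }
    return total;
}

// Convenience wrapper (assumption for illustration): read from a string.
long sum_ints_in(const std::string& text) {
    std::istringstream in(text);
    return sum_ints(in);
}
```

Trailing whitespace is harmless here: sum_ints_in("1 2 3 \n") returns 6, because the loop ends on the failed attempt to extract a fourth value rather than on an eof() check.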

EDIT: you should know that testing fail() before reading can be equally wrong for the same reason. From the page you linked to: fail() returns true if the previous operation failed. This means you need to perform a read or other relevant operation before testing it.
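One place eof() is useful is after a read has already failed, to find out why the loop stopped. A minimal sketch, with a hypothetical StopReason enum and function names:

```cpp
#include <istream>
#include <sstream>
#include <string>

enum class StopReason { EndOfInput, BadToken, IoError };

// Reads integers until extraction fails, then inspects the flags
// AFTER the failed read to classify the failure.
StopReason read_all_ints(std::istream& in, long& total) {
    total = 0;
    int x;
    while (in >> x) {
        total += x;
    }
    if (in.bad()) return StopReason::IoError;     // unrecoverable stream error
    if (in.eof()) return StopReason::EndOfInput;  // ran out of input
    return StopReason::BadToken;                  // failbit without eofbit
}

// Convenience wrapper (assumption for illustration): read from a string.
StopReason read_all_ints_from_text(const std::string& text, long& total) {
    std::istringstream in(text);
    return read_all_ints(in, total);
}
```

For "1 2 3\n" this reports EndOfInput with a total of 6; for "1 2 x 3" it reports BadToken with a total of 3. Treat it as a sketch: a malformed token that runs right up to the end of input (a lone "-", say) can set eofbit as well, so the classification is not airtight.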

0
votes
int n;
std::cin >> n >> std::ws;

This fixes the problem. At that point you can use either .good() or .eof(). I like to use .good(), since if there is a bad disk block, .good() will detect it, but that's me. .eof() will not; you would also have to check .fail() || .bad().
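A sketch of that approach using std::ws, the standard manipulator that skips whitespace. The function names are hypothetical, and a std::istringstream stands in for std::cin.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one int, then skips any whitespace that follows it.
// Returns true only if the int was read and nothing but whitespace
// remained afterwards.
bool read_last_int(std::istream& in, int& n) {
    if (!(in >> n)) return false;  // extraction itself failed
    in >> std::ws;                 // consume trailing whitespace
    return in.eof();               // true if std::ws ran into the end
}

// Convenience wrapper (assumption for illustration): read from a string.
bool read_last_int_from_text(const std::string& text, int& n) {
    std::istringstream in(text);
    return read_last_int(in, n);
}
```

With "42 \n" the function returns true and stores 42; with "42 7" it returns false, because std::ws stops at the 7 without reaching the end.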

I just found this out after some hard study about the problem of eating whitespace. I was going to propose an ECO to iostream and ifstream, and lo and behold, it's already been done. :-D