
I am attempting to read data files stored as .txt, some of which are very large (>1 GB). Every time QFile calls .open() on a file larger than about 600 MB, the program freezes and crashes. Is there a better way to open large files than QFile? None of the code below the if (_file.open(QIODevice::ReadOnly)) line shown below executes, so I believe the crash occurs in the call to open().

I understand from answers to similar questions that reading large text files is not a great way to handle huge amounts of data, but unfortunately these are log files that I have no control over. I need to be able to read these files OR gracefully handle/ignore an oversized file, but I can't find information on how to detect the maximum read size. I would rather not manually open and split these files in a text editor: I have about a terabyte of them to process, and manual splitting could lose important information. I am not overly concerned with the responsiveness of this program, and any method used to open files can sit and think for quite a while, as this program will be used for data processing, not any kind of user interaction.

Thanks for your help

Code:

void FileRead::openNewFile()
{
    if(_listOfFiles.size()>0)
    {
        _file.setFileName(_listOfFiles.at(0));
        if (_file.open(QIODevice::ReadOnly)) //file opened successfully
        {
            _file.reset();
            emit fileOpened();
            emit fileOpened(_file.fileName());
            qDebug()<<"File Opened";
            qDebug()<<_file.fileName();


        }
        else
        {
            qDebug()<<"Unable to open file";
            qDebug()<<_listOfFiles;
            _listOfFiles.removeAt(0);
            emit fileSent();
        }
    }
    else
    {
        qDebug()<<"All files processed";
    }
}
Usually open() by itself is not a problem, regardless of file size. My guess is that something else is in the way, for example the file being locked by another process. That is just one idea, since you mentioned these are log files. - yshurik
@yshurik : the logs sit in a directory and aren't open in any other program (they were generated months ago; I'm attempting to parse and summarize them). My code works on files <400MB, but always fails on files >600MB. Could it be due to running this in QtCreator, and not as the .exe? I don't see any of the qDebug messages in the output, which is why I believe the problem lies in the .open(). - french13
So you just open them? You could perhaps test std::ifstream instead? - user2672165
@user2672165 _file is a QFile member variable; a separate slot reads the file 100 lines at a time to avoid putting the entirety of a huge file into memory, or at least that's the design intent. This is the section of code that's giving me trouble, though. - french13

1 Answer


I think you're re-using a QFile that's already open, and this might be problematic.
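As a rough illustration of that pitfall, here is a sketch that uses the standard library's std::ifstream rather than QFile (a deliberate swap so the example needs no Qt). A second open() on a std::ifstream that is still open fails, so the guard closes the stream first. The helper name reopen is hypothetical. With QFile, the analogous guard would presumably be to check isOpen() and call close() before setFileName() and open().

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: points `stream` at `path`, closing whatever file
// the stream still holds from a previous use.
bool reopen(std::ifstream& stream, const std::string& path)
{
    if (stream.is_open())
        stream.close();   // release the previous file before re-using the object
    stream.clear();       // drop error flags left over from earlier operations
    stream.open(path, std::ios::binary);
    return stream.is_open();
}
```

Without the close() call, the second open() would set the stream's fail state and the stream would stay attached to the first file.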

The call to reset() is pointless - you've just opened the file, it is reset by definition.

You have not provided a backtrace showing exactly where your code crashes. I cannot reproduce your problem - I have a 16GB sparse file that I can open, read from, and close, successfully, on both Qt 4.8 and Qt 5.2, on both Windows 7 and OS X.

If you write a minimal test case for this (a stand-alone application that does nothing but opens the file, reads a few bytes from it, and closes it), you'll likely find that it doesn't crash - the problem is elsewhere in your code.
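A minimal sketch of such a test, written here with std::ifstream instead of QFile (a swap, as suggested in the comments, so it also shows whether the file itself can be opened outside Qt). The function name probeFile and the default of 16 bytes are assumptions made for illustration. A Qt version would do the same three steps with QFile::open(), read() and close().

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: opens `path`, reads up to `count` bytes, and closes
// the file again. Returns the number of bytes actually read, or -1 if the
// file could not be opened.
long long probeFile(const std::string& path, std::size_t count = 16)
{
    std::ifstream in(path, std::ios::binary);
    if (!in.is_open())
        return -1;

    std::string buffer(count, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(buffer.size()));

    // gcount() reports how many bytes the last read() delivered, even if
    // the file was shorter than `count`.
    return static_cast<long long>(in.gcount());
    // `in` is closed automatically when it goes out of scope.
}
```

Calling probeFile on one of the large logs from an otherwise empty main() should show whether opening the file is really the step that fails. Only the requested bytes are read, so the size of the file should not matter.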