
I am working on a big (~500 MB) raw text file with about 20,000,000 lines. Each line contains one double and one int. For example:

45782.1234852 10

Below is my simple code:

QTextStream rdStream(&qFile_Input);
while (!rdStream.atEnd())
{
    //QStringList qList_data = rdStream.readLine().split(" ", QString::SkipEmptyParts);
    rdStream.readLine();
}

It takes about 30 seconds just to read every line with QTextStream::readLine().

If I add .split(" ", QString::SkipEmptyParts) to build a QStringList, the total time jumps to 5 minutes. My question is threefold:

  1. Where does the time gap come from?
  2. Is there a way to get a shorter processing time?
  3. If my file is larger than the PC's RAM, will I encounter an error? If so, what can I do?

Thanks in advance!


1 Answer


Well, it seems that the splitting part adds enormous overhead. QTextStream::readLine() already converts each line into a UTF-16 QString, and split() then allocates a QStringList plus a new QString for every token, so over 20,000,000 lines that is tens of millions of extra heap allocations. Instead of using the Qt class QTextStream, you could just use the C++ standard library and read the numbers directly; that should give you much better performance than the 5 minutes you are seeing now.

#include <fstream>

int main()
{
   std::ifstream infile("thefile.txt");

   double a;
   int b;

   while(infile >> a >> b)
   {
      //Do something with a and b here, they've been read
   }

   return 0;
}
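
If you would rather stay with Qt, here is a minimal sketch along the same lines, reusing your qFile_Input/rdStream names and the thefile.txt name from above: QTextStream also has >> extraction operators for double and int, so you can read the two numbers per line directly and skip the QStringList allocation entirely.

#include <QFile>
#include <QTextStream>

int main()
{
   QFile qFile_Input("thefile.txt");
   if (!qFile_Input.open(QIODevice::ReadOnly | QIODevice::Text))
      return 1;

   QTextStream rdStream(&qFile_Input);

   double a;
   int b;

   while (!rdStream.atEnd())
   {
      rdStream >> a >> b;   // extract the two whitespace-separated numbers directly
      if (rdStream.status() != QTextStream::Ok)
         break;             // stop on a malformed line
      // Do something with a and b here, they've been read
   }

   return 0;
}

In either version the file is processed sequentially and only a small buffer is held in memory at a time, so the file does not need to fit in RAM.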