5
votes

I'm currently writing out billions of binary records to ASCII files (ugh). I've got things working just fine, but I'd like to optimize the performance if I can. The problem is, the user is allowed to select any number of fields to output, so I can't know at compile-time which of 3-12 fields they'll include.

Is there a faster way to construct lines of ASCII text? As you can see, the types of the fields vary quite a bit and I can't think of a way around the series of if() statements. The output ASCII file has one line per record, so I've tried using a template QString constructed with arg, but that just slowed things down about 15%.

A faster solution doesn't have to use QTextStream, or necessarily write directly to the file, but the output is too large to write the whole thing to memory.

Here's some sample code:

QFile outfile(outpath);
if(!outfile.open(QIODevice::WriteOnly | QIODevice::Text | QIODevice::Truncate))
{
    qWarning("Could not open ASCII for writing!");
    return false;
} else
{
    /* compute XYZ precision */
    int prec[3] = {0, 0, 0}; //these non-zero values are determined programmatically

    /* set up the writer */
    QTextStream out(&outfile);
    out.setRealNumberNotation(QTextStream::FixedNotation);
    out.setRealNumberPrecision(3);
    QString del(config.delimiter); //the user chooses the delimiter character (comma, tab, etc) - using QChar is slower since it has to be promoted to QString anyway

    /* write the header line */
    out << "X" << del << "Y" << del << "Z";
    if(config.fields & INTFIELD)
        out << del << "IntegerField";
    if(config.fields & DBLFIELD)
        out << del << "DoubleField";
    if(config.fields & INTFIELD2)
        out << del << "IntegerField2";
    if(config.fields & TRIPLEFIELD)
        out << del << "Tri1" << del << "Tri2" << del << "Tri3";
    out << "\n";

    /* write out the points */
    for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
    {
        pt = points.at(ptnum);
        out.setRealNumberPrecision(prec[0]);
        out << pt->getXYZ(0);
        out.setRealNumberPrecision(prec[1]);
        out << del << pt->getXYZ(1);
        out.setRealNumberPrecision(prec[2]);
        out << del << pt->getXYZ(2);
        out.setRealNumberPrecision(3);
        if(config.fields & INTFIELD)
            out << del << pt->getIntValue();
        if(config.fields & DBLFIELD)
            out << del << pt->getDoubleValue();
        if(config.fields & INTFIELD2)
            out << del << pt->getIntValue2();
        if(config.fields & TRIPLEFIELD)
        {
            out << del << pt->getTriple(0);
            out << del << pt->getTriple(1);
            out << del << pt->getTriple(2);
        }
        out << "\n";
    } //end for every point
    outfile.close();
}
6
You need to profile your app to find out what slows it down. Qt classes are already optimized and work fast if properly used. Your code is correct, and I see nothing obviously slow. Profiling is really required in your case. Maybe your disk is slow, maybe it's QTextStream, maybe it's QString. - Pavel Strakhov
I have yet to find a good Qt profiler for 64-bit Windows. VerySleepy has potential, but its output is so arcane I hardly understand it. Suggestions? - Phlucious
It depends on which compiler you're using. I like gprof for gcc. For the MSVC compiler, the standard vsprofiler can be used. - Pavel Strakhov

6 Answers

4
votes

(This doesn't answer the profiler question. It tries to answer the original question, which is the performance issue.)

I would suggest avoiding the use of QTextStream altogether in this case to see if that helps. The reason it might help with performance is that there's overhead involved, because text is encoded internally to UTF-16 for storage, and then decoded again to ASCII or UTF-8 when writing it out. You have two conversions there that you don't need.

Try using only the standard C++ std::ostringstream class instead. It's very similar to QTextStream and only minor changes are needed in your code. For example:

#include <sstream>

// ...

QFile outfile(outpath);
if (!outfile.open(QIODevice::WriteOnly | QIODevice::Text
                | QIODevice::Truncate))
{
    qWarning("Could not open ASCII for writing!");
    return false;
}

/* compute XYZ precision */
int prec[3] = {0, 0, 0};

std::ostringstream out;
out.precision(3);
std::fixed(out);
// I assume config.delimiter is a QChar.
char del = config.delimiter.toLatin1();

/* write the header line */
out << "X" << del << "Y" << del << "Z";
if(config.fields & INTFIELD)
    out << del << "IntegerField";
if(config.fields & DBLFIELD)
    out << del << "DoubleField";
if(config.fields & INTFIELD2)
    out << del << "IntegerField2";

if(config.fields & TRIPLEFIELD)
    out << del << "Tri1" << del << "Tri2" << del << "Tri3";
out << "\n";

/* write out the points */
for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
{
    pt = points.at(ptnum);
    out.precision(prec[0]);
    out << pt->getXYZ(0);
    out.precision(prec[1]);
    out << del << pt->getXYZ(1);
    out.precision(prec[2]);
    out << del << pt->getXYZ(2);
    out.precision(3);
    if(config.fields & INTFIELD)
        out << del << pt->getIntValue();
    if(config.fields & DBLFIELD)
        out << del << pt->getDoubleValue();
    if(config.fields & INTFIELD2)
        out << del << pt->getIntValue2();
    if(config.fields & TRIPLEFIELD)
    {
        out << del << pt->getTriple(0);
        out << del << pt->getTriple(1);
        out << del << pt->getTriple(2);
    }
    out << "\n";

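    // Write out the data and empty the stream.
    // Take a single copy of the buffer; every out.str() call returns a new std::string.
    const std::string line = out.str();
    outfile.write(line.data(), line.size());
    out.str("");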
}
outfile.close();
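
The loop above hands one record at a time to the file. A possible variation, sketched here in plain standard C++ so it can be tried without Qt, is to let several records accumulate in the string stream and write them out in larger chunks. The `Record` struct, `writeRecords`, and `recordsPerFlush` are hypothetical names made up for this illustration, not anything from the question's code, and whether batching actually helps is something to measure rather than assume.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type standing in for the question's point class.
struct Record {
    double x, y, z;
    int intValue;
};

// Formats all records and writes them to `sink` in chunks of
// `recordsPerFlush` records instead of one write per record.
// Returns the number of write() calls made on `sink`.
std::size_t writeRecords(const std::vector<Record>& records, char del,
                         bool withInt, std::ostream& sink,
                         std::size_t recordsPerFlush)
{
    assert(recordsPerFlush > 0);
    std::ostringstream out;
    out << std::fixed << std::setprecision(3);
    std::size_t buffered = 0;
    std::size_t writes = 0;

    auto flush = [&]() {
        if (buffered == 0)
            return;
        const std::string chunk = out.str();   // one copy per flush
        sink.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        out.str("");                           // empties the buffer, keeps the format flags
        buffered = 0;
        ++writes;
    };

    for (const Record& r : records) {
        out << r.x << del << r.y << del << r.z;
        if (withInt)                           // optional field, chosen at run time
            out << del << r.intValue;
        out << '\n';
        if (++buffered == recordsPerFlush)
            flush();
    }
    flush();                                   // write whatever is left
    return writes;
}
```

With a QFile, the `sink.write(...)` line would become `outfile.write(chunk.data(), chunk.size())`; memory use stays bounded by the chunk size.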
1
votes

Given that you are writing out billions of records, you might consider using the Boost Spirit.Karma library:

http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/karma.html

According to their benchmark it runs much faster than C++ streams and even sprintf with most compilers/libraries, including Visual C++ 2010:

http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/karma/performance_measurements/numeric_performance/format_performance.html

It will take some learning, but you will be rewarded with significant speedup.

1
votes

Use multiple cores (if available)! It seems to me that each point of your data is independent of the others. So you could split up the preprocessing using QtConcurrent::mappedReduced. e.g.:

  1. divide your data into a sequence of blocks consisting of N (e.g. 1000) points each,
  2. then let your mapFunction process each block into a memory buffer
  3. let the reduceFunction write the buffers to the file.

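Use QtConcurrent::OrderedReduce | QtConcurrent::SequentialReduce as the reduce options, so that the buffers are written in the original order and only one thread writes to the file at a time.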

This can be used in addition to the other optimizations!
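
The three steps above can be sketched without Qt. The following is not QtConcurrent::mappedReduced itself: it swaps in std::async from the standard library so the example is self-contained, and `Point3`, `formatBlock`, `writeParallel`, and `maxInFlight` are hypothetical names invented for the illustration. It caps the number of blocks being formatted at once, which keeps memory bounded for very large inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <deque>
#include <functional>
#include <future>
#include <ostream>
#include <sstream>   // std::ostringstream is handy as an in-memory sink for testing
#include <string>
#include <vector>

// Hypothetical record type standing in for the question's point class.
struct Point3 {
    double x, y, z;
};

// "Map" step: format the points in [first, last) into a memory buffer.
std::string formatBlock(const std::vector<Point3>& pts, std::size_t first,
                        std::size_t last, char del)
{
    std::string buf;
    char line[1024];
    for (std::size_t i = first; i < last; ++i) {
        const int n = std::snprintf(line, sizeof line, "%.3f%c%.3f%c%.3f\n",
                                    pts[i].x, del, pts[i].y, del, pts[i].z);
        if (n > 0)   // clamp in case the line did not fit into the buffer
            buf.append(line, std::min<std::size_t>(static_cast<std::size_t>(n),
                                                   sizeof line - 1));
    }
    return buf;
}

// Formats up to `maxInFlight` blocks concurrently and writes the finished
// buffers to `sink` one at a time, in their original order ("reduce" step).
void writeParallel(const std::vector<Point3>& pts, std::size_t blockSize,
                   std::size_t maxInFlight, char del, std::ostream& sink)
{
    assert(blockSize > 0 && maxInFlight > 0);
    std::deque<std::future<std::string>> pending;

    auto writeOldest = [&]() {
        const std::string buf = pending.front().get();   // waits for that block
        pending.pop_front();
        sink.write(buf.data(), static_cast<std::streamsize>(buf.size()));
    };

    for (std::size_t first = 0; first < pts.size(); first += blockSize) {
        if (pending.size() == maxInFlight)
            writeOldest();                               // make room before launching more
        const std::size_t last = std::min(first + blockSize, pts.size());
        pending.push_back(std::async(std::launch::async, formatBlock,
                                     std::cref(pts), first, last, del));
    }
    while (!pending.empty())
        writeOldest();
}
```

Because the oldest block is always written first, the output order matches a single-threaded run. Sensible values for the block size and the number of blocks in flight depend on the machine and are best found by measuring.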

0
votes

If you don't have a proper profiler, but you do have a debugger that lets you break into the running application, manual profiling is an option:

  1. start the app in your debugger and trigger the slow code path,
  2. break the execution at a random moment while the slow part is running,
  3. look at the call stack and note which subroutine was active,
  4. repeat several times (about 10x or so).

Now the probability is high that you found the same procedure in the majority of cases. That's the one you have to avoid or make faster in order to improve things.

0
votes

Here I rewrote your piece of code using the standard C library; maybe that's faster. I didn't test it, so you may need to check the fprintf format specification documentation, since the supported format flags can differ between compilers.

Take care with the return type of your getTriple() function: if it isn't float or double, you must change the %f's in the corresponding format string.

#include <stdio.h>

FILE* out;

out = fopen(outpath.toLocal8Bit().constData(), "w");   // assuming outpath is a QString
if (out == NULL)
{
    qWarning("Could not open ASCII for writing!");
    return false;
} else {
    /* compute XYZ precision */
    int prec[3] = {0, 0, 0}; //these non-zero values are determined programmatically

    /* set up the writer */
    char del = config.delimiter.toLatin1();   // assuming config.delimiter is a QChar

    /* write the header line; every optional field brings its own leading delimiter */
    fprintf(out, "X%cY%cZ", del, del);
    if(config.fields & INTFIELD)
        fprintf(out, "%cIntegerField", del);
    if(config.fields & DBLFIELD)
        fprintf(out, "%cDoubleField", del);
    if(config.fields & INTFIELD2)
        fprintf(out, "%cIntegerField2", del);
    if(config.fields & TRIPLEFIELD)
        fprintf(out, "%cTri1%cTri2%cTri3", del, del, del);
    fputs("\n", out);

    /* write out the points */
    for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
    {
        pt = points.at(ptnum);
        fprintf(out, "%.*f%c%.*f%c%.*f", prec[0], pt->getXYZ(0), del, prec[1], pt->getXYZ(1), del, prec[2], pt->getXYZ(2));
        /* each selected field is written straight away, so none overwrites another */
        if(config.fields & INTFIELD)
            fprintf(out, "%c%d", del, pt->getIntValue());
        if(config.fields & DBLFIELD)
            fprintf(out, "%c%.3f", del, pt->getDoubleValue());    // 3 decimals, as in the question
        if(config.fields & INTFIELD2)
            fprintf(out, "%c%d", del, pt->getIntValue2());
        if(config.fields & TRIPLEFIELD)
        {
            // assuming getTriple() returns double - adjust the %f to the real type
            fprintf(out, "%c%f%c%f%c%f", del, pt->getTriple(0), del, pt->getTriple(1), del, pt->getTriple(2));
        }
        fputs("\n", out);
    } //end for every point
    fclose(out);
}
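
A related option is to build each line with snprintf in a local buffer and collect the lines in one reusable chunk, so the file sees a single large write per chunk. This is only a sketch under assumptions: `Rec`, `INT_FIELD`, `DBL_FIELD`, and `appendLine` are hypothetical stand-ins for the question's record class and field flags, which are not shown in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical field flags and record type; the question's real ones are not shown.
enum : unsigned { INT_FIELD = 1u << 0, DBL_FIELD = 1u << 1 };

struct Rec {
    double xyz[3];
    int    intValue;
    double dblValue;
};

// Formats one record as a delimited line and appends it to `chunk`.
// `prec` holds the number of decimals for X, Y and Z.
void appendLine(std::string& chunk, const Rec& r, unsigned fields,
                char del, const int prec[3])
{
    assert(prec != nullptr);
    char line[1024];
    const std::size_t cap = sizeof line;
    std::size_t len = 0;

    // Advance `len` by what snprintf produced, clamped to the buffer capacity
    // (snprintf returns the length it *wanted* to write, which may be larger).
    auto advance = [&](int n) {
        if (n > 0)
            len = std::min(len + static_cast<std::size_t>(n), cap - 1);
    };

    advance(std::snprintf(line + len, cap - len, "%.*f%c%.*f%c%.*f",
                          prec[0], r.xyz[0], del, prec[1], r.xyz[1], del,
                          prec[2], r.xyz[2]));
    if (fields & INT_FIELD)
        advance(std::snprintf(line + len, cap - len, "%c%d", del, r.intValue));
    if (fields & DBL_FIELD)
        advance(std::snprintf(line + len, cap - len, "%c%.3f", del, r.dblValue));
    advance(std::snprintf(line + len, cap - len, "\n"));

    chunk.append(line, len);
}
```

A caller would append lines until the chunk reaches some size, pass it to `fwrite(chunk.data(), 1, chunk.size(), out)`, and then call `chunk.clear()`, which normally keeps the allocated capacity for the next round.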
0
votes

If text output is not mandatory, you might want to use binary output with QDataStream. As there is no formatting to perform, the time to write or read your file will be greatly reduced.

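#include <QDataStream>
#include <QFile>
#include <QString>
#include <QVector>

void saveData(const QString & filename, const QVector<double> & iVect){
   QFile file(filename);
   if( !file.open(QIODevice::WriteOnly) )
      return;
   QDataStream out(&file);   // QDataStream takes a pointer to the device
   for(int i=0;i<iVect.count();i++){
      out << iVect[i];
   }
   file.close();
}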