36
votes

In Qt, there is a foreach loop which is implemented using macros (Q_FOREACH). There are different implementations, depending on the compiler.

The definition for GCC is as follows:

#define Q_FOREACH(variable, container)                                \
for (QForeachContainer<__typeof__(container)> _container_(container); \
     !_container_.brk && _container_.i != _container_.e;              \
     __extension__  ({ ++_container_.brk; ++_container_.i; }))        \
    for (variable = *_container_.i;; __extension__ ({--_container_.brk; break;}))

... using the helper class QForeachContainer which is defined as follows:

template <typename T>
class QForeachContainer {
public:
    inline QForeachContainer(const T& t) : c(t), brk(0), i(c.begin()), e(c.end()) { }
    const T c;
    int brk;
    typename T::const_iterator i, e;
};

The container in a Q_FOREACH macro has to be a class T which at least has to provide a T::const_iterator type, a T.begin() and a T.end() method, as do all STL containers as well as most Qt containers like QList, QVector, QMap, QHash, ...

My question is now: How does this macro work?

One thing seems to be really odd: The variable only appears once in the macro definition. So e.g. foreach(QString item, list) has a QString item = but no item = afterwards at any time... How can the variable item then be changed in each step?

Even more confusing is the following definition of Q_FOREACH for the MS VC++ compiler:

#define Q_FOREACH(variable,container)                                                         \
if(0){}else                                                                                     \
for (const QForeachContainerBase &_container_ = qForeachContainerNew(container);                \
     qForeachContainer(&_container_, true ? 0 : qForeachPointer(container))->condition();       \
     ++qForeachContainer(&_container_, true ? 0 : qForeachPointer(container))->i)               \
    for (variable = *qForeachContainer(&_container_, true ? 0 : qForeachPointer(container))->i; \
         qForeachContainer(&_container_, true ? 0 : qForeachPointer(container))->brk;           \
         --qForeachContainer(&_container_, true ? 0 : qForeachPointer(container))->brk)

Why true : 0 ? ...? Doesn't this always get evaluated to 0? Is the function call qForeachPointer(container) executed even if the condition before ? is true?

And why do we need two for-loops?

It would be cool if anyone can make things a bit clearer for me!

1

1 Answers

74
votes

The GCC version


The GCC one is really quite simple. First of all it is used like this:

Q_FOREACH(x, cont)
{
    // do stuff
}

And that will be expanded to

for (QForeachContainer<__typeof__(cont)> _container_(cont); !_container_.brk && _container_.i != _container_.e; __extension__  ({ ++_container_.brk; ++_container_.i; }))
    for (x = *_container_.i;; __extension__ ({--_container_.brk; break;}))
    {
        // do stuff
    }

So first of all:

for (QForeachContainer<__typeof__(cont)> _container_(cont); !_container_.brk && _container_.i != _container_.e; __extension__  ({ ++_container_.brk; ++_container_.i; }))

This is the actual for loop. It sets up a QForeachContainer to help with the iteration. The brk variable is intitialised to 0. Then the condition is tested:

!_container_.brk && _container_.i != _container_.e

brk is zero so !brk is true, and presumably if the container has any elements i (the current element) doesn't equal e (the last element) yet.

Then the body of that outer for is entered, which is:

for (variable = *_container_.i;; __extension__ ({--_container_.brk; break;}))
{
    // do stuff
}

So x is set to *_container_.i which is the current element the iteration is on, and there is no condition so presumably this loop will continue forever. Then the body of the loop is entered, which is our code, and it's just a comment so it doesn't do anything.

Then the increment part of the inner loop is entered, which is interesting:

__extension__ ({--_container_.brk; break;})

It decrements brk so that's now -1, and breaks out of the loop (with __extension__ which makes GCC not emit warnings for using GCC extensions, like you now know).

Then the increment part of the outer loop is entered:

__extension__  ({ ++_container_.brk; ++_container_.i; })

which increments brk again and makes it 0 again, and then i is incremented so we get to the next element. The condition is checked, and since brk is now 0 and i presumably doesn't equal e yet (if we have more elements) the process is repeated.

Why did we decrement and then increment brk like that? The reason is because the increment part of the inner loop will not be executed if we used break in the body of our code, like this:

Q_FOREACH(x, cont)
{
    break;
}

Then brk would still be 0 when it breaks out of the inner loop, and then the increment part of the outer loop would be entered and increment it to 1, then !brk would be false and the outer loop's condition would evaluate to false, and the foreach would stop.

The trick is to realise that there are two for loops; the outer one's lifetime is the whole foreach, but the inner one only lasts for one element. It would be an infinite loop since it doesn't have a condition, but it is either breaked out of by it's increment part, or by a break in the code you provide it. That's why x looks like it is assigned to "only once" but actually it's assigned to on every iteration of the outer loop.

The VS version


The VS version is a little more complicated because it has to work around the lack of the GCC extension __typeof__ and block-expressions, and the version of VS it was written for (6) didn't have auto or other fancy C++11 features.

Let's look at an example expansion for what we used earlier:

if(0){}else
    for (const QForeachContainerBase &_container_ = qForeachContainerNew(cont); qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->condition(); ++qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->i)
        for (x = *qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->i; qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->brk; --qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->brk)
        {
            // stuff
        }

The if(0){}else is because VC++ 6 did the scoping of for variables wrong and a variable declared in the initialisation part of a for loop could be used outside the loop. So it's a workaround for a VS bug. The reason they did if(0){}else instead of just if(0){...} is so that you can't add an else after the loop, like

Q_FOREACH(x, cont)
{
    // do stuff
} else {
    // This code is never called
}

Second, let's look at the initialisation of the outer for:

const QForeachContainerBase &_container_ = qForeachContainerNew(cont)

The definition of QForeachContainerBase is:

struct QForeachContainerBase {};

And the definition of qForeachContainerNew is

template <typename T>
inline QForeachContainer<T>
qForeachContainerNew(const T& t) {
    return QForeachContainer<T>(t);
}

And the definition of QForeachContainer is

template <typename T>
class QForeachContainer : public QForeachContainerBase {
public:
    inline QForeachContainer(const T& t): c(t), brk(0), i(c.begin()), e(c.end()){};
    const T c;
    mutable int brk;
    mutable typename T::const_iterator i, e;
    inline bool condition() const { return (!brk++ && i != e); }
};

So to make up for the lack of __typeof__ (which analogous to the decltype of C++11) we have to use polymorphism. The qForeachContainerNew function returns a QForeachContainer<T> by value but due to lifetime extension of temporaries, if we store it in a const QForeachContainer&, we can prolong it's lifetime till the end of the outer for (actually the if because of VC6's bug). We can store a QForeachContainer<T> in a QForeachContainerBase because the former is a subclass of the latter, and we have to make it a reference like QForeachContainerBase& instead of a value like QForeachContainerBase to avoid slicing.

Then for the condition of the outer for:

qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->condition(); 

The definition of qForeachContainer is

inline const QForeachContainer<T> *qForeachContainer(const QForeachContainerBase *base, const T *) {
    return static_cast<const QForeachContainer<T> *>(base);
}

And the definition of qForeachPointer is

template <typename T>
inline T *qForeachPointer(const T &) {
    return 0;
}

This is where you might not be aware of what's going on since these functions seem kind of pointless. Well here's how they work and why you need them:

We have a QForeachContainer<T> stored in a reference to a QForeachContainerBase with no way to get it back out (that we can see). We have to cast it to the proper type somehow, and that's where the two functions come in. But how do we know what type to cast it to?

A rule of the ternary operator x ? y : z is that y and z must be of the same type. We need to know the type of the container, so we use the qForeachPointer function to do that:

qForeachPointer(cont)

The return type of qForeachPointer is T*, so we use template type deduction to deduce the type of the container.

The true ? 0 : qForeachPointer(cont) is to be able to pass a NULL pointer of the right type to qForeachContainer so it will know what type to cast the pointer we give it to. Why do we use the ternary operator for this instead of just doing qForeachContainer(&_container_, qForeachPointer(cont))? It's to avoid evaluating cont many times. The second (actually third) operand to ?: is not evaluated unless the condition is false, and since the condition is true itself, we can get the right type of cont without evaluating it.

So that solves that, and we use qForeachContainer to cast _container_ to the right type. The call is:

qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))

And again the definition is

inline const QForeachContainer<T> *qForeachContainer(const QForeachContainerBase *base, const T *) {
    return static_cast<const QForeachContainer<T> *>(base);
}

The second parameter will always be NULL because we do true ? 0 which always evaluates to 0, and we use qForeachPointer to deduce the type T, and use that to cast the first argument to a QForeachContainer<T>* so we can use its member functions/variables with the condition (still in the outer for):

qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->condition()

And condition returns:

(!brk++ && i != e)

which is the same as the GCC version above except that it increments brk after evaluating it. So !brk++ evaluates to true and then brk is incremented to 1.

Then we enter the inner for and begin with the initialisation:

x = *qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->i

Which just sets the variable to what the iterator i is pointing to.

Then the condition:

qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->brk

Since brk is 1, the body of the loop is entered, which is our comment:

// stuff

Then the increment is entered:

--qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->brk

That decrements brk back to 0. Then the condition is checked again:

qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->brk

And brk is 0 which is false and the loop is exited. We come to the increment part of the outer for:

++qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->i

And that increments i to the next element. Then we get to the condition:

qForeachContainer(&_container_, true ? 0 : qForeachPointer(cont))->condition()

Which checks that brk is 0 (which it is) and increments it to 1 again, and the process is repeated if i != e.

This handles break in client code only a little differently than the GCC version, since brk will not be decremented if we use break in our code and it will still be 1, and the condition() will be false for the outer loop and the outer loop will break.

And as GManNickG stated in the comments, this macro is a lot like Boost's BOOST_FOREACH which you can read about here. So there you have it, hope that helps you out.