140
votes

When using the same code, simply changing the compiler (from a C compiler to a C++ compiler) will change how much memory is allocated. I'm not quite sure why this is and would like to understand it more. So far the best response I've gotten is "probably the I/O streams", which isn't very descriptive and makes me wonder about the "you don't pay for what you don't use" aspect of C++.

I'm using the Clang and GCC compilers, versions 7.0.1-8 and 8.3.0-6 respectively. My system is running on Debian 10 (Buster), latest. The benchmarks are done via Valgrind Massif.

#include <stdio.h>

int main() {
    printf("Hello, world!\n");
    return 0;
}

The code does not change, but compiling it as C or as C++ changes the results of the Valgrind benchmark. The values remain consistent across compilers, however. The peak runtime allocations for the program are as follows:

  • GCC (C): 1,032 bytes (~1 KB)
  • G++ (C++): 73,744 bytes (~74 KB)
  • Clang (C): 1,032 bytes (~1 KB)
  • Clang++ (C++): 73,744 bytes (~74 KB)

For compiling, I use the following commands:

clang -O3 -o c-clang ./main.c
gcc -O3 -o c-gcc ./main.c
clang++ -O3 -o cpp-clang ./main.cpp
g++ -O3 -o cpp-gcc ./main.cpp

For Valgrind, I run valgrind --tool=massif --massif-out-file=m_compiler_lang ./compiler-lang on each compiler and language, then ms_print for displaying the peaks.

Am I doing something wrong here?

2
To begin with, how are you building? What options do you use? And how do you measure? How do you run Valgrind? - Some programmer dude
If I remember correctly, modern C++ compilers use an exception model where there is no performance hit on entering a try block, at the expense of a larger memory footprint, maybe with a jump table or something. Maybe try compiling without exceptions and see what impact that has. Edit: In fact, iteratively try disabling various C++ features to see what impact that has on the memory footprint. - François Andrieux
When compiling with clang++ -xc instead of clang, the same allocation was there, which strongly suggests it's due to linked libraries. - Justin
@bigwillydos This is indeed C++; I do not see any part of the C++ specification it breaks, other than potentially including stdio.h rather than cstdio, but this is allowed, at least in older C++ versions. What do you think is "malformed" in this program? - Vality
I find it suspicious that those GCC and Clang compilers generate the exact same number of bytes in C mode and the exact same number of bytes in C++ mode. Did you make a transcription error? - RonJohn

2 Answers

150
votes

The heap usage comes from the C++ standard library. It allocates memory for internal library use on startup. If you don't link against it, there should be zero difference between the C and C++ version. With GCC and Clang, you can compile the file with:

g++ -Wl,--as-needed main.cpp

This will instruct the linker to not link against unused libraries. In your example code, the C++ library is not used, so it should not link against the C++ standard library.

You can also test this with the C file. If you compile with:

gcc main.c -lstdc++

The heap usage will reappear, even though you've built a C program.

The heap use obviously depends on the specific C++ library implementation you're using. In your case, that's the GNU C++ library, libstdc++. Other implementations might not allocate the same amount of memory, or they might not allocate any memory at all (at least not on startup). The LLVM C++ library (libc++), for example, does not do heap allocation on startup, at least on my Linux machine:

clang++ -stdlib=libc++ main.cpp

The heap use is then the same as when not linking against a C++ library at all.

(If compilation fails, then libc++ is probably not installed. The package name usually contains "libc++" or "libcxx".)

16
votes

Neither GCC nor Clang is a compiler; they're actually toolchain driver programs. That means they invoke the compiler, the assembler, and the linker.

Whether you compile this code with a C or a C++ compiler, you will get the same assembly, and the assembler will produce the same objects. The difference is that the toolchain driver provides different input to the linker for the two languages: different startup code (C++ requires code for executing constructors and destructors for objects with static or thread-local storage duration at namespace level, and requires infrastructure for stack frames to support unwinding during exception processing, for example), the C++ standard library (which also has objects of static storage duration at namespace level), and probably additional runtime libraries (for example, libgcc with its stack-unwinding infrastructure).

In short, it's not the compiler causing the increase in footprint; it's the linking in of stuff you've chosen to use by choosing the C++ language.

It's true that C++ has the "pay only for what you use" philosophy, but by using the language, you pay for it. You can disable parts of the language (RTTI, exception handling), but then you're not using C++ any more. As mentioned in another answer, if you don't use the standard library at all, you can instruct the driver to leave it out (-Wl,--as-needed), but if you're not going to use any of the features of C++ or its library, why are you even choosing C++ as a programming language?