Why do programming languages take a long time to compile? I assumed C++ takes a long time because it must parse and compile each header every time it compiles a file. But I've heard that precompiled headers take just as long? I suspect C++ is not the only language that has this problem.
8 Answers
Compiling is a complicated process which involves quite a few steps:
- Scanning/Lexing
- Parsing
- Intermediate code generation
- Possibly Intermediate code optimization
- Target Machine code generation
- Optionally Machine-dependent code optimization
(Leaving aside linking.)
Naturally, this will take some time for longer programs.
Precompiled headers are way faster, as has been known at least since 1988.
The usual reason for a C compiler or C++ compiler to take a long time is that it has to #include, preprocess, and then lex gazillions of tokens.
As an exercise you might find out how long it takes just to run cpp over a typical collection of header files, then measure how long it takes to lex the output.
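To get a feel for the token volume involved, here is a minimal sketch of a crude token counter. `count_tokens` and its helpers are hypothetical names, not part of any compiler, and it is not a conforming C++ lexer: it ignores string literals, comments and multi-character operators, so treat its counts as a rough estimate only. It assumes C++14 or later.

```cpp
#include <cstddef>

// Whitespace separates tokens and is never counted.
constexpr bool is_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Characters that can appear in an identifier or a simple number.
constexpr bool is_word(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_';
}

// Crude count: each run of word characters is one token,
// every other non-space character is one token on its own
// (so "==" is counted as two tokens).
constexpr std::size_t count_tokens(const char* s) {
    std::size_t n = 0;
    while (*s != '\0') {
        if (is_space(*s)) {
            ++s;
        } else if (is_word(*s)) {
            while (is_word(*s)) ++s;  // stops at '\0' too
            ++n;
        } else {
            ++s;
            ++n;
        }
    }
    return n;
}

// "int", "x", "=", "42", ";"
static_assert(count_tokens("int x = 42;") == 5, "crude token count");
```

Fed the preprocessed text of a real source file (for example what `cpp` or `g++ -E` prints), a counter like this gives a rough idea of how much input the lexer has to get through for a single file.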
gcc -O uses a very effective but somewhat slow optimization technique developed by Chris Fraser and Jack Davidson. Most other optimizers can be slow because they involve repeated iteration over fairly large data structures.
Language design does have an effect on compiler performance. C++ compilers are typically slower than C# compilers, which has a lot to do with the design of the language. (This also depends on the compiler implementer: Anders Hejlsberg implemented C# and is one of the best around.)
The simplistic "header file" structure of C++ contributes to its slower performance, although precompiled headers can often help. C++ is a much more complex language than C, and C compilers are therefore typically faster.
Compilation does not need to take long: tcc compiles ANSI C fast enough to be useful as an interpreter.
Some things to think about:
- Complexity in the scanning and parsing passes. Presumably requiring long look-aheads will hurt, as will contextual (as opposed to context-free) languages.
- Internal representation. Building and working on a large and featureful AST will take some time. Presumably you should use the simplest internal representation that will support the features you want to implement.
- Optimization. Optimization is fussy. You need to check for a lot of different conditions. You probably want to make multiple passes. All of this is going to take time.
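As a small illustration of the contextual-parsing point, assuming standard C++ (C++14 or later): the same token sequence `a * b` is a declaration or a multiplication depending on what `a` was declared as earlier, so the parser cannot decide without symbol information. The namespace and function names are made up for the example.

```cpp
namespace a_is_a_type {
struct a {};
constexpr bool demo() {
    a * b = nullptr;      // parsed as a declaration: b is a pointer to a
    return b == nullptr;
}
}  // namespace a_is_a_type

namespace a_is_a_value {
constexpr int a = 6;
constexpr int b = 7;
constexpr int demo() {
    return a * b;         // parsed as an expression: multiplication
}
}  // namespace a_is_a_value

static_assert(a_is_a_type::demo(), "declaration case");
static_assert(a_is_a_value::demo() == 42, "expression case");
```

A language whose grammar can be parsed without consulting earlier declarations avoids this kind of lookup during parsing.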
They take as long as they take and it usually depends on how much extraneous stuff you inject into your compilation units. I'd like to see you hand-compile them any faster :-)
The first time you compile a file, you should have no headers at all. Then add them as you need them (and check when you're finished whether you still need them).
Other ways of reducing that time are to keep your compilation units small (even to the point of one function per file, in an extreme case) and to use a make-like tool to ensure you only build what's needed.
Some compilers (IDEs, really) do incremental compilation in the background so that they're (almost) always close to fully compiled.
I think the other answers here have missed some important parts of the situation that slow C++ compilation:
- Compilation model that saves .obj/.o files to disk, reads them back, then links them
- Linking in general and bad slow linkers in particular
- Overly complex macro preprocessor
- Arbitrarily complex Turing-complete template system
- Nested and repeated inclusion of source files, even with #pragma once
- User-inflicted fragmentation, splitting code into too many files (even to the point of one function per file, in an extreme case)
- Bloated or low-effort internal data structures in the compiler
- Overbloated standard library, template abuse
By contrast, these don't slow C++ compilation:
- Scanning/Lexing
- Parsing
- Intermediate code generation
- Target machine code generation
As an aside, optimization is one of the biggest slowdowns, but it is the only slowdown here that is actually necessary by some measure, and also it is entirely optional.
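To make the "Turing-complete template system" item concrete, here is a minimal sketch of the classic compile-time factorial. `Factorial` is a name chosen for illustration. The point is only that the compiler itself has to carry out the recursion by instantiating one class per step before it can emit any code.

```cpp
// Primary template: instantiating Factorial<N> forces the compiler
// to instantiate Factorial<N - 1>, and so on down to the base case.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Explicit specialization that ends the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// Evaluated entirely at compile time; nothing runs at run time.
static_assert(Factorial<5>::value == 120, "5! computed by the compiler");
```

Real template-heavy libraries do far more work than this per instantiation, which is one reason heavy template use tends to show up in build times.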
Run Idera RAD Studio (there is a free version). It comes with C++ and Delphi. The Delphi code compiles in a tiny fraction of the time that C++ code doing the same thing does. This is because C++ evolved horribly over the decades, with not much thought given to the compiler consequences of its complex, context-determined macros and, to some degree, the so-called ".hpp" hell. Ada has similar issues. The Delphi dialect of Pascal was designed from the ground up to be an efficient language to compile. So compile and run takes seconds instead of minutes, making iterative debugging fast and easy. Debugging slow-to-compile languages is a huge time waster, and a pain in the you know what! BTW Anders also wrote Delphi before M$ stole him!