I keep hearing about LLVM all the time. It's in Perl, then it's in Haskell, then someone uses it in some other language? What is it?
- What exactly distinguishes it from GCC (perspectives = safety etc.)?
LLVM is a library that is used to construct, optimize and produce intermediate and/or binary machine code.
LLVM can be used as a compiler framework, where you provide the "front end" (the lexer and parser that emit LLVM's intermediate representation) and, for a target LLVM does not already support, the "back end" (code that converts LLVM's representation to actual machine code).
LLVM can also act as a JIT compiler - it supports x86/x86_64 and PPC/PPC64 code generation, among other targets, with fast optimizations aimed at compilation speed.
There used to be a demo page where you could play with the machine code LLVM generates from C or C++ code; unfortunately, it has been disabled since 2013.
A good summary of LLVM is this:
At the frontend you have Perl and many other high-level languages. At the backend, you have the native code that runs directly on the machine.
At the centre is your intermediate code representation. If every high level language can be represented in this LLVM IR format, then analysis tools based on this IR can be easily reused - that is the basic rationale.
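To make the reuse argument concrete, here is a minimal sketch in Python. It is a toy model, not LLVM's API or its real IR: `Instr`, `frontend_infix`, `frontend_prefix` and `count_arithmetic` are all hypothetical names invented for this illustration. Two different "languages" are lowered to the same representation, so an analysis written once against that representation works for both.

```python
# Toy model only: this is NOT LLVM's API or IR, just the idea of a shared IR.
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str      # "const", "add" or "mul"
    args: tuple  # for "const": (value,); otherwise indices of earlier instructions


def frontend_infix(src):
    """Hypothetical front end for 'a + b' / 'a * b' style source."""
    left, op, right = src.split()
    return [Instr("const", (int(left),)),
            Instr("const", (int(right),)),
            Instr({"+": "add", "*": "mul"}[op], (0, 1))]


def frontend_prefix(src):
    """Hypothetical front end for Lisp-like '(add a b)' style source."""
    op, left, right = src.strip("()").split()
    return [Instr("const", (int(left),)),
            Instr("const", (int(right),)),
            Instr(op, (0, 1))]


def count_arithmetic(ir):
    """An 'analysis tool' written once against the IR, usable by every front end."""
    return sum(1 for instr in ir if instr.op in ("add", "mul"))


ir_a = frontend_infix("2 + 3")
ir_b = frontend_prefix("(add 2 3)")
```

Both front ends produce the same instruction list here, so `count_arithmetic` never needs to know which source language it came from.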
LLVM (used to mean "Low Level Virtual Machine" but not anymore) is a compiler infrastructure, written in C++, which is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages. Originally implemented for C/C++, the language-independent design (and the success) of LLVM has since spawned a wide variety of front-ends, including Objective C, Fortran, Ada, Haskell, Java bytecode, Python, Ruby, ActionScript, GLSL, and others.
Read this for more explanation
Also check out Unladen Swallow
According to the book 'Getting Started with LLVM Core Libraries':
In fact, the name LLVM might refer to any of the following:
- The LLVM project/infrastructure: This is an umbrella for several projects that, together, form a complete compiler: frontends, backends, optimizers, assemblers, linkers, libc++, compiler-rt, and a JIT engine. The word "LLVM" has this meaning, for example, in the following sentence: "LLVM is comprised of several projects".
- An LLVM-based compiler: This is a compiler built partially or completely with the LLVM infrastructure. For example, a compiler might use LLVM for the frontend and backend but use GCC and GNU system libraries to perform the final link. LLVM has this meaning in the following sentence, for example: "I used LLVM to compile C programs to a MIPS platform".
- LLVM libraries: This is the reusable code portion of the LLVM infrastructure. For example, LLVM has this meaning in the sentence: "My project uses LLVM to generate code through its Just-in-Time compilation framework".
- LLVM core: The optimizations that happen at the intermediate language level and the backend algorithms form the LLVM core where the project started. LLVM has this meaning in the following sentence: "LLVM and Clang are two different projects".
- The LLVM IR: This is the LLVM compiler intermediate representation. LLVM has this meaning when used in sentences such as "I built a frontend that translates my own language to LLVM".
LLVM is basically a library used to build compilers and/or language-oriented software. The gist is that although GCC is probably the most common suite of compilers, it is not built to be reusable, i.e. it is difficult to take components from GCC and use them to build your own application. LLVM addresses this issue by providing a set of "modular and reusable compiler and toolchain technologies" which anyone can use to build compilers and language-oriented software.
The LLVM Compiler Infrastructure is particularly useful for performing optimizations and transformations on code. It also includes a number of tools that serve distinct purposes. llvm-prof (shipped with older LLVM releases) is a profiling tool that lets you profile execution in order to identify program hotspots. opt is an optimization tool that offers various optimization passes (dead code elimination, for instance).
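As a rough idea of what a dead code elimination pass does, here is a toy version in Python. It is only a sketch under stated assumptions: the tuple-based IR and the function `eliminate_dead_code` are made up for this example (they are not what opt operates on), every name is assigned once, and no instruction has side effects.

```python
# Toy dead code elimination over a made-up IR; not LLVM's opt or its IR.
def eliminate_dead_code(instrs, live_out):
    """instrs: list of (dest, op, operands) tuples in program order.
    live_out: names whose values are needed after the block."""
    live = set(live_out)
    kept = []
    # Walk backwards: an instruction is kept only if its result is still needed.
    for dest, op, operands in reversed(instrs):
        if dest in live:
            kept.append((dest, op, operands))
            live.discard(dest)
            # String operands are names of earlier results, so they become live.
            live.update(o for o in operands if isinstance(o, str))
    kept.reverse()
    return kept


program = [
    ("a", "const", (2,)),
    ("b", "const", (3,)),
    ("unused", "mul", ("a", "a")),  # result is never read
    ("c", "add", ("a", "b")),
]
optimized = eliminate_dead_code(program, live_out={"c"})
```

The `unused` multiplication is dropped because nothing that is live after the block depends on it, while `a`, `b` and `c` survive in their original order.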
Importantly, LLVM provides you with the libraries to write your own Passes. For instance, if you need to add a range check on certain arguments that are passed into certain functions of a program, writing a simple LLVM Pass would suffice.
For more information on writing your own Pass, check this http://llvm.org/docs/WritingAnLLVMPass.html
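The range check idea can be sketched in a few lines of Python. This is not an LLVM Pass (real ones are written in C++ against LLVM's API, as the linked page describes); it is a toy transformation over a made-up IR, and `add_range_checks`, `run`, `set_volume` and the `check_range` instruction are all hypothetical names used only to show the shape of such a pass.

```python
# Toy instrumentation pass over a made-up IR; not an actual LLVM Pass.
def add_range_checks(instrs, limits):
    """Insert a ('check_range', value, lo, hi) instruction before each call
    to a function listed in `limits` (function name -> (arg index, lo, hi))."""
    out = []
    for instr in instrs:
        if instr[0] == "call" and instr[1] in limits:
            arg_index, lo, hi = limits[instr[1]]
            out.append(("check_range", instr[2][arg_index], lo, hi))
        out.append(instr)
    return out


def run(instrs):
    """Minimal interpreter: it only enforces the inserted checks."""
    for instr in instrs:
        if instr[0] == "check_range":
            _, value, lo, hi = instr
            if not lo <= value <= hi:
                raise ValueError(f"{value} outside [{lo}, {hi}]")


program = [("call", "set_volume", (150,)), ("call", "log", ("done",))]
checked = add_range_checks(program, {"set_volume": (0, 0, 100)})
```

Running `run(checked)` raises `ValueError` for the out-of-range argument, whereas the uninstrumented `program` passes through unchecked. The pass itself never changes the original calls; it only adds instructions in front of them.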
Low Level Virtual Machine (LLVM)
Alternative: GCC (GNU Compiler Collection), with GDB (GNU Debugger) as its debugging tool. It supports more languages and architectures.
LLVM is an umbrella project (a set of libraries). It is a brand name covering different projects (IR, debugging tool, ...), and it is no longer a virtual machine or an acronym. LLDB (LLVM Debugger) is its debugging tool. It is backed by big companies.
Compiler:
Language front end (many: Clang, Haskell, ...) -> Optimizer (single) -> Back end (many: ARM, x86, ...)
The front end generates the Intermediate Representation (IR). This common language makes the process easy to scale: if you are creating a new language, you are responsible only for the front end; if you are developing a new architecture, you only need to take care of the back end. IR plays a role similar to a .class file in the JVM, which is used by the ClassLoader.
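The "many back ends behind one IR" half of that pipeline can be sketched as follows. Again this is a toy in Python, not LLVM: the three-instruction `IR` and the two "targets" (`backend_stack_vm` and `backend_python`) are hypothetical and exist only to show that a new target means writing a new back end, with the IR left untouched.

```python
# Toy model: one made-up IR, two made-up "targets". Not LLVM's real back ends.
IR = [("push", 2), ("push", 3), ("add",)]  # what some front end might emit


def backend_stack_vm(ir):
    """Hypothetical target A: textual stack-machine assembly."""
    return [" ".join(str(part).upper() for part in instr) for instr in ir]


def backend_python(ir):
    """Hypothetical target B: a Python expression string."""
    stack = []
    for instr in ir:
        if instr[0] == "push":
            stack.append(str(instr[1]))
        elif instr[0] == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(f"({left} + {right})")
    return stack.pop()
```

Adding a third target would mean writing one more function that consumes `IR`; neither the front end nor the existing back ends would need to change.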
There are three equivalent IR forms:
llvm-dis can be used to convert bitcode into human readable