13
votes

I am developing a compiler that emits IL code. It is important that the Mono and Microsoft .NET JIT compilers turn the resulting IL into the fastest possible machine code.

My questions are:

  1. Does it make sense to optimize patterns like:

    'stloc.0; ldloc.0; ret' => 'ret' 
    'ldc.i4.0; conv.r8' => 'ldc.r8 0.0'
    

    and such, or are the JITs smart enough to take care of these?

  2. Is there a specification with the list of optimizations performed by Microsoft/Mono JIT compilers?

  3. Is there a good resource with practical recommendations / best practices for optimizing IL so that the JIT compilers can in turn generate optimal machine code (performance-wise)?

2
From what I gather, the JIT is quite good at eliminating stloc.0; ldloc.0;. For IronScheme, I tried to tweak the output IL to look much like what the C# compiler emits, based on the feeling that the JIT would likely try harder to optimize known patterns. But this is just a feeling :D You could always create some microbenchmarks to measure it. - leppie
.NET JITters aren't particularly smart (after all, they don't have much time). Why do you care about "fastest possible"? - Luaan
@Luaan, I care about "the fastest possible" because this compiler needs to produce code for intensive computations. Ideally, it should produce native machine code, but I'm considering IL for better portability and maintainability. However, performance is still a top priority. - Denis Yarkovoy
@DenisYarkovoy You could create some micro-benchmarks to analyze the result... - xanatos
I'd go for IL where performance isn't as critical, and native code where it is. Portability is tricky, but - YAGNI. Just make sure it's actually safe :) - Luaan

2 Answers

5
votes
  1. The two patterns you described are the easy stuff that the JIT actually gets right (except for non-primitive structs). In SSA form, constant propagation and elimination of dead values are very easy.
  2. No, you have to test what the JIT can do. Look into compiler literature to see what standard optimizations to expect. Then, test for them. The two JITs that we have right now optimize very little and sometimes do not get the most basic stuff right. For example, in MyStruct s; s.x = 1; s.x = 1; the redundant store is not eliminated by RyuJIT, and neither is s = s;. The expression s.x + s.x loads x twice from memory. Expect little.
  3. You need to understand what machine code the basic IL operations map to. This is not too complicated. Try a few things and look at the disassembly listing. You'll quickly get a feel for what the output is going to look like.
5
votes

Redundant conversions and load/stores like that are a pretty inevitable side effect of a recursive descent parser. You can technically get rid of them with a peephole optimizer. But it is nothing to worry about; the C# and VB.NET compilers generate them as well.

The existing .NET/Mono jitters are very good at optimizing them away. They focus on optimizing the code that really matters for execution speed, the machine code. This has the very nice advantage that anybody who writes a compiler that generates IL automatically benefits from these optimizations without having to do anything special.
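
If you still want to strip these sequences before they reach the jitter, a peephole pass over the emitted instruction list is small. The following is only a rough sketch in Python, not how any real compiler does it: the (opcode, operand) tuple representation and the peephole function are hypothetical, short forms such as stloc.0 are assumed to be normalized to ("stloc", 0), and the rewrite assumes that no branch targets an instruction inside a matched window and that the local's type matches the type of the value on the stack.

```python
# Hypothetical IL representation: each instruction is an (opcode, operand)
# tuple, e.g. ("stloc", 0) for stloc.0 and ("ret", None) for ret.

def peephole(instrs):
    """Single left-to-right pass applying the two rewrites from the question.

    Assumptions (not checked here): no branch lands inside a matched
    window, and the local's declared type equals the stack value's type,
    so dropping the store/load pair does not drop an implicit conversion.
    """
    out = []
    i = 0
    n = len(instrs)
    while i < n:
        a = instrs[i]
        b = instrs[i + 1] if i + 1 < n else None
        c = instrs[i + 2] if i + 2 < n else None

        # stloc.N; ldloc.N; ret  =>  ret
        # Safe only because the local is dead once the method returns.
        if a[0] == "stloc" and b == ("ldloc", a[1]) and c == ("ret", None):
            out.append(c)
            i += 3
        # ldc.i4 K; conv.r8  =>  ldc.r8 K.0
        # Every 32-bit integer is exactly representable as a float64.
        elif a[0] == "ldc.i4" and b == ("conv.r8", None):
            out.append(("ldc.r8", float(a[1])))
            i += 2
        else:
            out.append(a)
            i += 1
    return out


optimized = peephole([
    ("ldc.i4", 0), ("conv.r8", None),
    ("stloc", 0), ("ldloc", 0), ("ret", None),
])
```

Note that the first rule only fires when ret follows directly. In the general case stloc.N; ldloc.N cannot simply be deleted, because the local may be read again later.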

Jitter optimizations are covered in this post.