7
votes

While reading through all the questions on profiling tools, I was surprised to discover Intel's VTune, which I hadn't heard of before. At $700, it is even more expensive than AQTime.

But before I make the decision to put down the big bucks for AQTime, has anyone used VTune for Delphi, and if so, do you think it has any benefits that may make it a better choice than AQTime and the other profiling tools for Delphi that are out there?

3
Do you hand-tune a lot of SSE assembler? - Marco van de Voort
No, but I do look at it and have a lot of experience in low-level optimization. - lkessler

3 Answers

5
votes

You can download the 30-day VTune trial and try it yourself.

I have used AQTime and VTune.

VTune is good if you want to test a multithreaded application. It helped me find locks in the memory manager that were slowing down the multithreaded part of my application.

The second difference is that VTune is a sampling profiler and AQTime is an instrumenting profiler. Both have strong and weak points, but I personally prefer the instrumenting kind. With an instrumenting profiler you get exact information about how many times each function was called, all the callers of a procedure, and so on, at the cost of inaccurate timing results: instrumentation changes the way the processor executes the code, so branch prediction and caches behave differently than in the real application, and the tested application runs slowly.

But the most important thing is the GUI, and here AQTime wins. It is a powerful application but very easy to use. VTune is quite different. I lost too much time looking for the right command in VTune. Its GUI is very messy.

So, except for multithreading, I use AQTime.

8
votes

VTune can read low-level CPU counters for things like branch prediction, cache misses, etc. I used it to find out why TopMM (a multithreaded, scaling memory manager!) was very slow on my Hyper-Threading CPU. It had something to do with 64 KB memory cache aliasing. So it gives more in-depth information about how the code really runs on a CPU, and why something is slow due to cache misses and the like. For real optimization (the last few percent) I would use both; for normal optimization, use AQTime or another tool (like my asmprofiler :-) ).

1
votes

It's been over 10 years since the question was asked. Unfortunately, nothing seems to have changed so far.

I have successfully used VTune Amplifier with Delphi binaries more than a couple of times over many years. It's doable, but it's also time consuming. We did have a licence of AQTime at some point, but I didn't really like it. The execution slowdown was way too much for a big project, and I couldn't get used to the way results were presented.

Recent versions of VTune Amplifier look way cleaner, but they still work about the same. What you will be looking at most of the time is the right column with the memory addresses of the callstack. What you want to do is match those addresses with the ones in the map file that Delphi generates (if you enable the option). Just a minor caveat: the addresses in the callstack inside VTune Amplifier and those in the map file are offset from each other by the start address of the code section. The default value is 0x401000 (you can find it at the beginning of the map file). Therefore, you will want to search the map file for the address in the callstack minus the offset. Furthermore, it happens often enough that the address has an extra offset of a few bytes. Rather than searching for the exact (offset-adjusted) address, search in the vicinity of the address, then check which line the exact address belongs to. It also happens sometimes that an address doesn't seem to point to a proper place. Just ignore that address and go to the next one in the callstack.
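To make the arithmetic concrete, here is a minimal Python sketch of that lookup. It assumes the map file entries have already been collected into a sorted list of (offset, name) pairs; the symbol names and offsets below are made up for illustration, and `lookup` is a hypothetical helper, not part of VTune or Delphi.

```python
import bisect

# Default start address of the code section (check the top of your map file).
CODE_BASE = 0x401000

# Hypothetical (offset, symbol) pairs as they might be read from a map file,
# sorted by offset. These values are invented for the example.
SYMBOLS = [
    (0x0000, "System.InitUnits"),
    (0x1A40, "Unit1.TForm1.FormCreate"),
    (0x1B90, "Unit1.TForm1.Button1Click"),
    (0x2F00, "Unit2.ProcessData"),
]


def lookup(runtime_address, symbols=SYMBOLS, base=CODE_BASE):
    """Return (symbol, bytes past its start) for a callstack address, or None."""
    offset = runtime_address - base
    if offset < 0:
        return None  # below the code section: skip this callstack entry
    starts = [start for start, _ in symbols]
    # Vicinity search: the last symbol that starts at or before the offset.
    i = bisect.bisect_right(starts, offset) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, offset - start


print(lookup(0x402BA4))
```

The binary search is what handles the "extra offset of a few bytes": the address does not need to match an entry exactly, it only needs to fall after the start of one. Note that this sketch has no upper bound, so an address far past the last entry is still attributed to it; a large distance from the symbol start is a hint that the address should be ignored.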

Converting the map file (or an equivalent Delphi binary) to a pdb file could potentially make things a lot easier. I was unable to find an up-to-date tool to do the job, but I did find a description of the pdb file format in InformIt (Cracking PDB Symbol Files by Sven B. Schreiber).

An in-between solution would be to speed up the current process by having a tool that reads in the map file and allows for a quick search of an address (including offset adjustment and using vicinity search). Even better if it allows you to jump to the source file and display recently matched addresses.
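As a rough starting point for the reading part of such a tool, here is a small Python sketch. It rests on an assumption about the map file layout: that symbol lines look like `0001:00001B90       Unit1.TForm1.Button1Click` (segment, colon, eight hex digits, then a name without spaces). Check this against your own map file before relying on it. `parse_publics` and the sample text are hypothetical.

```python
import re

# Assumed line shape: "0001:00001B90       Unit1.TForm1.Button1Click".
# Lines with extra columns (such as the segment table) do not match.
PUBLIC_LINE = re.compile(r"^\s*([0-9A-Fa-f]{4}):([0-9A-Fa-f]{8})\s+(\S+)\s*$")


def parse_publics(map_text, segment=1):
    """Collect (offset, name) pairs for one segment, sorted by offset."""
    entries = set()  # a set drops symbols that are listed in more than one section
    for line in map_text.splitlines():
        m = PUBLIC_LINE.match(line)
        if m and int(m.group(1), 16) == segment:
            entries.add((int(m.group(2), 16), m.group(3)))
    return sorted(entries)


# Invented sample text in the assumed layout.
SAMPLE = """\
  Address             Publics by Value

 0001:00001A40       Unit1.TForm1.FormCreate
 0001:00001B90       Unit1.TForm1.Button1Click
 0002:00000010       Unit1.Form1
"""

for offset, name in parse_publics(SAMPLE):
    print(f"{offset:08X}  {name}")
```

Keeping the result sorted by offset means a vicinity search can be a simple binary search over the list. Jumping to the source file would additionally need the line-number sections of the map file, which this sketch does not read.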

Of course the nicest solution would be for Embarcadero to add support for generating PDB files to their compiler, but my experience with them is that they just hoard bugs and feature requests and rarely ever do something about them. We are on our own on this one.

Interestingly enough, Primož Gabrijelčič mentions Intel's VTune Amplifier in at least two Delphi related books. Mastering Delphi Programming (2019) mentions it along with a few other programs, but it's the one for which no further information is shown. It would be interesting to know if the author has actually used VTune Amplifier with Delphi binaries, and how he goes about it.