I have just rewritten a MATLAB program in C++ as a MEX function to speed things up, with fantastic results. The rewrite was well worth it: I get a speedup of up to a factor of 20, without any threading. I am still curious about where the MEX function spends its time, and I would like to identify possible bottlenecks.
I'm looking for a way to profile MEX functions. The MATLAB profiler is not much use here, and the other profilers I've downloaded (both free and trial versions) all want an executable to run. I'm no MEX guru, but as far as I understand, the only way to run a MEX function is from within MATLAB. The MEX function is compiled into a DLL, just with a .mexw64 extension, so this problem should be similar to profiling a DLL. To write the C++ MEX function I used a single-user edition of VS2005 (i.e., not the Team edition), and I am running on an x64 platform.
Does anyone know of a good way to profile a MEX function? What tool should I use, and how do I use it when the code has to be started from within MATLAB? Or is there any other way to profile the C++ code?