
GCC has an option that automatically instruments function entry and exit:

-finstrument-functions
Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit, the following profiling functions will be called with the address of the current function and its call site. (On some platforms, __builtin_return_address does not work beyond the current function, so the call site information may not be available to the profiling functions otherwise.)

void __cyg_profile_func_enter (void *this_fn, void *call_site);
void __cyg_profile_func_exit (void *this_fn, void *call_site);
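As a minimal sketch of what such hooks might look like, the two functions can be defined in the program itself. The `depth` counter and the log format below are illustrative assumptions, not part of GCC; the `no_instrument_function` attribute keeps the hooks from being instrumented and recursing into themselves.

```c
#include <stdio.h>

/* Hypothetical state for illustration: current call nesting depth. */
static int depth = 0;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    /* Indent the log line by the current nesting depth. */
    fprintf(stderr, "%*senter %p (called from %p)\n",
            depth * 2, "", this_fn, call_site);
    depth++;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    depth--;
    fprintf(stderr, "%*sexit  %p (called from %p)\n",
            depth * 2, "", this_fn, call_site);
}
```

Building the rest of the program with `gcc -finstrument-functions` should then make every instrumented function call these hooks on entry and exit.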

I would like to have something like this for every "basic block" so that I can log, dynamically, execution of every branch.

How would I do this?

You can't. Not with GCC alone at least. - Eugene Sh.
@EugeneSh. what would be the rough outline of how to do it? I'm thinking: code -> pre-processor output -> use Python to add your own functions at every {} -> continue compilation . . . Is there another way? - Bob
Some people will object, but you could use some macros to enclose the blocks of interest. Then yeah, you will get away with GCC only. - Eugene Sh.
The main problem I see is that the presence of {} is neither a necessary nor a sufficient property of a "basic block". - Eugene Sh.
Then just search/replace { with BLOCK_START and } with BLOCK_END. Well, it might produce some false positives, in the case of array initializers for instance... - Eugene Sh.
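The macro idea from the comments could be sketched roughly as follows. Only the names `BLOCK_START` and `BLOCK_END` come from the comment; the helper functions, the `blocks_entered` counter, and the `abs_value` example are hypothetical. Note that a `return`, `break`, or `goto` that leaves a block early skips its `BLOCK_END`, and that braces do not correspond exactly to basic blocks.

```c
#include <stdio.h>

/* Hypothetical counter so the effect can be observed. */
static unsigned long blocks_entered = 0;

static void block_enter(const char *file, int line)
{
    blocks_entered++;
    fprintf(stderr, "enter block at %s:%d\n", file, line);
}

static void block_exit(const char *file, int line)
{
    fprintf(stderr, "leave block at %s:%d\n", file, line);
}

/* Each macro supplies the brace itself plus a logging call. */
#define BLOCK_START { block_enter(__FILE__, __LINE__);
#define BLOCK_END   block_exit(__FILE__, __LINE__); }

/* Example: both branches of the if are logged separately. */
static int abs_value(int x)
{
    int result;
    if (x < 0)
    BLOCK_START
        result = -x;
    BLOCK_END
    else
    BLOCK_START
        result = x;
    BLOCK_END
    return result;
}
```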

1 Answer


There is a fuzzer called American Fuzzy Lop (AFL) that solves a very similar problem: it instruments jumps between basic blocks to gather edge coverage, that is, if basic blocks are vertices, which jumps (edges) were taken during execution. It may be worth looking at its sources. It offers three approaches:

  • afl-gcc is a wrapper for gcc that substitutes the assembler, as, with its own wrapper, which rewrites the assembly code according to basic block labels and jump instructions
  • a plugin for the Clang compiler
  • a patch for QEMU that instruments already compiled code
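For orientation, the edge-coverage bookkeeping that AFL's technical notes describe can be sketched in plain C as below. This is a simplified illustration, not AFL's actual source: the names `coverage_map`, `prev_location`, and `record_edge` are made up here, and in AFL the per-block ID is a random constant chosen at compile time rather than a function argument.

```c
#include <stdint.h>

#define MAP_SIZE 65536 /* 64 KiB map, matching AFL's default size */

static uint8_t  coverage_map[MAP_SIZE];
static uint16_t prev_location = 0;

/* Conceptually executed at the start of every basic block.
   cur_location stands in for the block's compile-time random ID. */
static void record_edge(uint16_t cur_location)
{
    /* The XOR of the two block IDs identifies the edge prev -> cur. */
    coverage_map[cur_location ^ prev_location]++;

    /* Shifting keeps A->B distinct from B->A and A->A from B->B. */
    prev_location = cur_location >> 1;
}
```

Each hit counter in the map then corresponds (up to collisions) to one edge between two basic blocks.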

Another option, and probably the simplest one, is the DynamoRIO dynamic instrumentation system. Unlike QEMU, it is specifically designed for implementing custom instrumentation, either by rewriting machine code by hand or by simply inserting calls, which may even be inlined automatically in some cases (if I read the documentation correctly). If you think dynamic instrumentation is very hard, look at their examples: they are only about 100-200 lines each. You still need to read their documentation, at least here and for the functions you use, since it may contain important points; for example, DR constructs dynamic basic blocks, which are distinct from a compiler's classic basic blocks. With dynamic instrumentation you can even instrument the system libraries the program uses. If that is not what you want, you can restrict instrumentation to the main module with something like

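static module_data_t *traced_module;

// in dr_client_main
// (release with dr_free_module_data() when the client exits)
traced_module = dr_get_main_module();

// in basic block event handler
// app_pc is a DynamoRIO type, so the variable needs a different name
app_pc pc = dr_fragment_app_pc(tag);
if (!dr_module_contains_addr(traced_module, pc)) {
    // block lies outside the main module: leave it uninstrumented
    return DR_EMIT_DEFAULT;
}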