1
votes

I am working on an LLVM project that aims to disassemble an ARM ELF executable into MCInst form, insert or modify some instructions, and then reassemble the MCInsts into an ELF binary.

I used llvm-objdump to do the first part of the job. But after searching for a long time, I still cannot figure out how to translate the MCInst back to a binary.

Could anyone tell me which LLVM tool or function is useful for this? Also, what is the best way to store the intermediate MCInsts: keep them in memory, or write them to a file? If a file, which functions can write and read them back cleanly?

I would really appreciate any help, even a single pointer.


1 Answer

4
votes

I don't have a full answer, because what you're trying to do is not explicitly exposed by LLVM for external use. But I do have some pointers that should lead you to the solution after some code reading.

The tool you want to look at is tools/llvm-mc/llvm-mc.cpp. When it does assembly, it goes through the steps of parsing ASM, creating MC-level data structures from it and emitting them into an ELF file (or some other file). These actions are implemented in some helper classes, which are quite strongly coupled; this is why you need to look at the tool to see how they're used in unison. The main class you want to look at is MCAssembler. It uses an "emitter" to actually emit MCInsts into binary form.

An emitter is (naturally) target-specific; for ARM it is ARMMCCodeEmitter. Note that the MC-level representation of a binary is not only MCInsts, though. There are also directives, section handling and so on. This is why it's important to study what MCAssembler does.
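To make the emitter's role concrete, here is a toy sketch of the idea in plain C++. It deliberately does not use the LLVM API: `ToyInst`, `ToyCodeEmitter`, `ToyARMCodeEmitter` and `encodeAll` are hypothetical names invented for illustration, and the real ARMMCCodeEmitter also handles fixups, operand kinds and many more encodings. The sketch only shows the shape of the job: a target-specific class takes an in-memory instruction and appends its machine encoding to a byte buffer. It assumes the A32 register form of ADD and SUB with condition "always".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for an MC-level instruction. This is NOT llvm::MCInst.
enum class ToyOpcode : uint8_t { Add, Sub };

struct ToyInst {
    ToyOpcode opcode;
    uint8_t rd;  // destination register, 0..15
    uint8_t rn;  // first source register, 0..15
    uint8_t rm;  // second source register, 0..15
};

// Hypothetical emitter interface: one implementation per target.
class ToyCodeEmitter {
public:
    virtual ~ToyCodeEmitter() = default;
    // Appends the binary encoding of `inst` to `out`.
    virtual void encode(const ToyInst &inst, std::vector<uint8_t> &out) const = 0;
};

// Encodes the A32 register form "OP Rd, Rn, Rm" with condition AL and no shift.
class ToyARMCodeEmitter : public ToyCodeEmitter {
public:
    void encode(const ToyInst &inst, std::vector<uint8_t> &out) const override {
        assert(inst.rd < 16 && inst.rn < 16 && inst.rm < 16);
        // Data-processing opcode field (bits 24:21): ADD = 0b0100, SUB = 0b0010.
        const uint32_t op = (inst.opcode == ToyOpcode::Add) ? 0x4u : 0x2u;
        const uint32_t word = (0xEu << 28)                  // cond = AL
                            | (op << 21)                    // opcode
                            | (uint32_t(inst.rn) << 16)     // Rn
                            | (uint32_t(inst.rd) << 12)     // Rd
                            | uint32_t(inst.rm);            // Rm, no shift
        // Emit the 32-bit word in little-endian byte order.
        for (int shift = 0; shift < 32; shift += 8)
            out.push_back(static_cast<uint8_t>(word >> shift));
    }
};

// Runs every instruction through the emitter and returns the raw bytes.
// A real assembler would also lay out sections, symbols and relocations.
std::vector<uint8_t> encodeAll(const ToyCodeEmitter &emitter,
                               const std::vector<ToyInst> &insts) {
    std::vector<uint8_t> bytes;
    for (const ToyInst &inst : insts)
        emitter.encode(inst, bytes);
    return bytes;
}
```

With this toy model, passing `ToyInst{ToyOpcode::Add, 0, 1, 2}` (that is, ADD r0, r1, r2) to `encodeAll` yields the bytes 02 00 81 E0, the little-endian form of 0xE0810002. Everything around those bytes (ELF headers, section tables, symbols) is what MCAssembler and its writer classes add on top, which is why the emitter alone is not enough to produce a loadable binary.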