Here's a possible approach I'm trying, which seems to work, at least for some well-behaved branch tables. After decoding the TBB instruction, I start looping over the branch bytes in the table that follows it. For each one, I compute the branch address it corresponds to, and keep track of the lowest of these addresses (the one closest to the end of the branch table). The table is assumed to end where that lowest branch target begins, so the loop stops once the current table byte reaches it. I also check that each branch address is after the currently decoded branch byte, since there may be zero padding at the end of the table.
This depends on there not being any code or data between the end of the table and the beginning of the code referenced from the table. If, for example, the default case were immediately after the table, but not referenced from the table, the loop would run past the end of the table and interpret the default case's code bytes as table entries. In the examples I have to test with, the compiler placed the default case after the other cases.
I'm using Capstone for disassembly. Here is some code that should make sense without much context:

    case ARM_INS_TBB:
        // Table branch byte
        if(insn->detail->arm.op_count == 1 &&
           insn->detail->arm.operands[0].type == ARM_OP_MEM &&
           insn->detail->arm.operands[0].mem.base == ARM_REG_PC){
            // PC relative TBB: the table starts right after the
            // instruction (insn_addr + 4, which is what PC reads as)
            u64 table = insn_addr + insn->size;
            u64 min = U64_MAX;
            // loop over table bytes
            for(u64 i = 0; ; ++i){
                // stop once the current table byte reaches the lowest
                // branch target seen so far
                if(table + i >= min)
                    break;
                // get branch address from table byte
                u64 branchaddr = table + ((u64)binary_image[table + i] << 1);
                // a branch address at or before the current table byte
                // means zero padding (or not a table at all)
                if(branchaddr <= table + i)
                    break;
                // only lower the minimum, never raise it
                if(branchaddr < min)
                    min = branchaddr;
                // do something with the code at branchaddr
            }
        }
        // Instructions immediately after this are junk, stop parsing
        return;
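
For anyone who wants to try the table scan without pulling in Capstone, here is a minimal standalone sketch of the same loop. The function name `tbb_scan_table`, its parameters, and the output array are all made up for illustration. It also assumes, like the snippet above, that the binary image is loaded at address 0 so that addresses can be used directly as indices, and it adds a bounds check on the image that the snippet above does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Capstone.
 * image       flat binary image, indexed by address (assumes load address 0)
 * image_size  number of bytes in image
 * table_base  address of the first table byte (TBB address + 4)
 * targets     receives the branch targets found
 * max_targets capacity of targets
 * Returns the number of table entries accepted. */
static size_t tbb_scan_table(const uint8_t *image, size_t image_size,
                             uint64_t table_base,
                             uint64_t *targets, size_t max_targets)
{
    assert(image != NULL && targets != NULL);

    uint64_t min_target = UINT64_MAX;
    size_t count = 0;

    for (uint64_t i = 0; count < max_targets; ++i) {
        uint64_t entry_addr = table_base + i;

        /* Stop at the end of the image, or once the table byte
         * reaches the lowest branch target seen so far. */
        if (entry_addr >= image_size || entry_addr >= min_target)
            break;

        /* TBB target = table base + 2 * table byte */
        uint64_t target = table_base + ((uint64_t)image[entry_addr] << 1);

        /* A target at or before the current table byte means
         * zero padding (or that this is not really a table). */
        if (target <= entry_addr)
            break;

        if (target < min_target)
            min_target = target;
        targets[count++] = target;
    }
    return count;
}
```

With a TBB at address 0 and the table bytes `2, 4, 3, 0` at addresses 4 to 7, this should accept three entries with targets 8, 12 and 10, and stop at the zero padding byte. The same caveat applies as before: it only works if nothing sits between the end of the table and the lowest branch target.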