[llvm-dev] How can I get the opcode length of an IR instruction in LLVM?

Mon Feb 27 07:55:10 PST 2017

On 27 February 2017 at 07:24, Mohsen Ahmadvand via llvm-dev <llvm->
> Is this in general possible?

Definitely not. Just about every pathology you can imagine could happen:

  * Multiple IR instructions can be combined into a single target
instruction (without any information tracking which instructions it
came from).
  * A single IR instruction can produce multiple target instructions.
  * Some target instructions don't correspond to any IR instruction
(ABI handling, register spills to the stack).
  * Some IR instructions produce no target instructions (unreachable
for example). This might be the easiest to handle.

> How to hack the backend to dump the required information? Is there a
> generic way to do so, or do I need to hack all backends?

The size is only really known at the very end of the compilation
pipeline (low-level optimizations like compressing branches can affect
the size and happen last). The functions where it happens are
MCObjectStreamer::EmitInstruction and friends.
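
To see why the size can't be pinned down earlier, here is a toy model
of branch sizing. It is only a sketch and not LLVM code: ToyInst,
relaxAndMeasure and the 2-byte/5-byte branch encodings are all made up
for illustration. A branch starts in its short form and is grown only
if its displacement doesn't fit in a signed byte, and growing one
branch can push another out of range, so the loop runs to a fixed
point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of one machine instruction. Hypothetical, not an LLVM type.
struct ToyInst {
  bool isBranch;       // true for a branch, false for anything else
  std::size_t target;  // index of the destination instruction (branches only)
  unsigned size;       // current encoded size in bytes
};

// Assumed encoding sizes, loosely modelled on x86 short/near jumps.
constexpr unsigned kShortBranch = 2;
constexpr unsigned kLongBranch = 5;

// Byte offset at which instruction `idx` starts (idx may equal insts.size()).
static std::int64_t offsetOf(const std::vector<ToyInst> &insts,
                             std::size_t idx) {
  std::int64_t off = 0;
  for (std::size_t i = 0; i < idx; ++i)
    off += insts[i].size;
  return off;
}

// Start every branch in its short form, then grow any branch whose
// displacement does not fit in a signed byte. Repeat until nothing
// changes. Returns the total size of the sequence in bytes.
std::size_t relaxAndMeasure(std::vector<ToyInst> &insts) {
  for (ToyInst &I : insts)
    if (I.isBranch)
      I.size = kShortBranch;

  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t i = 0; i < insts.size(); ++i) {
      ToyInst &I = insts[i];
      if (!I.isBranch || I.size == kLongBranch)
        continue;
      assert(I.target <= insts.size() && "branch target out of range");
      // Displacement is measured from the end of the branch itself.
      std::int64_t disp =
          offsetOf(insts, I.target) - (offsetOf(insts, i) + I.size);
      if (disp < -128 || disp > 127) {
        I.size = kLongBranch;  // branches only ever grow, so this terminates
        changed = true;
      }
    }
  }
  return static_cast<std::size_t>(offsetOf(insts, insts.size()));
}
```

The same branch ends up 2 or 5 bytes long depending on how big the
code between it and its target turned out to be, which is why a size
taken before final layout is only a guess.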

So bearing in mind that you'll only ever get an approximation, you
could attach debug-info to the IR pointing back at itself (i.e. debug
info for LLVM IR instead of a higher-level language). You could hack a
check for that during emission and count the bytes that came from any
particular line/inst.
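
The counting part could look roughly like the sketch below. It is
standalone C++ rather than a patch against the streamer: EmittedInst
and bytesPerLine are hypothetical names, and the sketch assumes that
at emission time you can read off, for each encoded instruction, its
size in bytes and the debug line attached to it (here the line of the
IR instruction in the .ll file, with 0 standing for "no location").

```cpp
#include <cstddef>
#include <map>
#include <vector>

// One encoded machine instruction as seen at emission time.
// Hypothetical record, not an LLVM type.
struct EmittedInst {
  unsigned line;      // debug line of the originating IR instruction, 0 if none
  std::size_t bytes;  // encoded size of this machine instruction
};

// Sum the encoded bytes attributed to each IR line. Instructions with no
// location (line 0) are collected in their own bucket rather than dropped.
std::map<unsigned, std::size_t>
bytesPerLine(const std::vector<EmittedInst> &stream) {
  std::map<unsigned, std::size_t> totals;
  for (const EmittedInst &E : stream)
    totals[E.line] += E.bytes;
  return totals;
}
```

The line-0 bucket is where things like spill code and ABI handling
would be expected to land, and a target instruction formed from
several IR instructions still carries at most one location, so the
per-line totals remain an approximation.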

There used to be a pass to add this kind of debug info to IR, but it
bit-rotted and got removed a while back. Should still be in the git
history somewhere though.

Cheers.

Tim.