[llvm-dev] Tracking all prologue and epilogue insertions through codegen/lowering

Tue Apr 27 11:35:32 PDT 2021

Hi all,

I’m doing some program analysis research within LLVM. One thing I’d like to be able to
do is track where LLVM generates function prologues and epilogues, particularly as
Functions are lowered to MachineFunctions and then passed through the target backend.
I only need this for x86, so I’ve been focusing my attention on that target.

So far, this is what I’ve done:

* Created two new pseudo instructions in Target.td, named PROLOGUE_ANCHOR and
EPILOGUE_ANCHOR. I’ve made sure that these instructions inherit from PseudoInstruction
and are marked as not having side effects.

* Modified X86FrameLowering::emitPrologue and X86FrameLowering::emitEpilogue
to unconditionally emit new PROLOGUE_ANCHOR and EPILOGUE_ANCHOR MachineInstrs.

* Modified AsmPrinter::EmitFunctionBody to handle PROLOGUE_ANCHOR and EPILOGUE_ANCHOR
with a new case that converts each into an MCSymbol, which I then emit as a label
using the output streamer.
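
To make the three steps concrete, here is a toy, self-contained C++ model of the idea.
It is only a sketch and uses no LLVM types: Opcode, Instr, lowerFrame, and printFunction
are hypothetical stand-ins for MachineInstr opcodes, X86FrameLowering, and AsmPrinter.
The anchor placement (immediately before the frame setup and teardown) and the label
names are assumptions made for illustration, not what my patch necessarily does.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for MachineInstr opcodes; not real LLVM types.
enum class Opcode { PrologueAnchor, EpilogueAnchor, Push, Mov, Pop, Ret };

struct Instr {
  Opcode Op;
  std::string Text; // assembly text for real instructions, empty for anchors
};

// Models step 2: frame lowering unconditionally brackets the frame setup
// and teardown with anchor pseudos (placement is an assumption).
std::vector<Instr> lowerFrame(const std::vector<Instr> &Body) {
  std::vector<Instr> Out;
  Out.push_back({Opcode::PrologueAnchor, ""});
  Out.push_back({Opcode::Push, "pushq %rbp"});
  Out.push_back({Opcode::Mov, "movq %rsp, %rbp"});
  Out.insert(Out.end(), Body.begin(), Body.end());
  Out.push_back({Opcode::EpilogueAnchor, ""});
  Out.push_back({Opcode::Pop, "popq %rbp"});
  Out.push_back({Opcode::Ret, "retq"});
  return Out;
}

// Models step 3: the printer turns each anchor into a label and emits
// no instruction for it; everything else is printed as-is.
std::string printFunction(const std::string &Name,
                          const std::vector<Instr> &Instrs) {
  std::ostringstream OS;
  unsigned PrologueCount = 0, EpilogueCount = 0;
  OS << Name << ":\n";
  for (const Instr &I : Instrs) {
    switch (I.Op) {
    case Opcode::PrologueAnchor:
      assert(I.Text.empty() && "anchors carry no assembly text");
      OS << ".L" << Name << "_prologue" << PrologueCount++ << ":\n";
      break;
    case Opcode::EpilogueAnchor:
      assert(I.Text.empty() && "anchors carry no assembly text");
      OS << ".L" << Name << "_epilogue" << EpilogueCount++ << ":\n";
      break;
    default:
      OS << "\t" << I.Text << "\n";
      break;
    }
  }
  return OS.str();
}
```

In this model the anchors occupy a slot in the instruction stream but contribute only
a label to the output, which is the property I want the real pseudos to have.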

All of this “works,” in the sense that the assembly I get from llc contains labels
everywhere that I expect. However, the resulting binaries are broken: anything
compiled above -O0 segfaults very early in process initialization.
My best guess is that the higher optimization levels enable frame pointer elimination
and that my pseudos interfere with it, but I’m at a bit of a loss for how best to
debug this (or whether a different approach would serve me better).

Could anybody offer some advice/pointers for this approach?

Best,
William Woodruff