[llvm-dev] Runtime inlining proof-of-concept (and questions)
D RTI via llvm-dev
llvm-dev at lists.llvm.org
Mon Oct 5 09:10:56 PDT 2020
I'm working on some code to re-compile the output from ahead-of-time
LLVM compilers at runtime, which allows inlining of function calls whose
targets are only known at runtime. This works by decorating selected
functions ahead of time, adding code for determining caller-callee
relationships and invoking the JIT compiler at runtime. The decoration
works on the IR from the front-end compiler (e.g. clang) before
generating object code with llc.
If anyone is interested in knowing more about the runtime inlining
project it's available on GitHub at https://github.com/drti/drti
Making this work got a bit tricky in places and I have some questions
about improvements:
1. To figure out when one "decorated" function has called another I pass
some information in the r14 register as well as in the instruction
stream accessible via the return address. The code is only supposed to
work on Linux x86_64 for now. What I wanted to do was extend the
existing X86TargetMachine to add in these features but I couldn't find
any way to do this cleanly - I couldn't see any target machine extension
points like RegisterPass and RegisterStandardPasses for IR passes. What
I did in the end was implement a new target type "x86_64_drti" which
delegates as much as possible to the real X86 target obtained via
TargetRegistry::lookupTarget. This is messy because many of the virtual
functions from TargetPassConfig that I want to delegate to X86PassConfig
are protected (e.g. addPreRegAlloc). So I'm wondering if I missed
something and if not, whether there's a reason the existing target
machines don't provide any extension points?
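To illustrate the access problem in question 1, here is a minimal C++ sketch using stand-in classes rather than the real LLVM ones. PassConfigBase, RealConfig, WrapperConfig and HookAccess are hypothetical names; only addPreRegAlloc is taken from the text above. It shows why a wrapper cannot call a protected virtual hook on a different object directly, and one language-level workaround (forming a pointer to member through a helper subclass). Whether that trick is acceptable for TargetPassConfig is an open question, not something the sketch settles.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a pass-config base class whose hooks are protected.
class PassConfigBase {
public:
    virtual ~PassConfigBase() = default;
    std::vector<std::string> log;  // records which hooks ran on this object

protected:
    virtual void addPreRegAlloc() { log.push_back("base pre-RA"); }
};

// Hypothetical stand-in for the real target's pass config.
class RealConfig : public PassConfigBase {
protected:
    void addPreRegAlloc() override { log.push_back("x86 pre-RA"); }
};

// Helper that does NOT override the hook, so &HookAccess::addPreRegAlloc
// names the base member and has type void (PassConfigBase::*)().
struct HookAccess : PassConfigBase {
    static void callPreRegAlloc(PassConfigBase &target) {
        void (PassConfigBase::*hook)() = &HookAccess::addPreRegAlloc;
        (target.*hook)();  // still dispatches virtually on target
    }
};

// Hypothetical wrapper that delegates to the real config, then adds its own step.
class WrapperConfig : public PassConfigBase {
public:
    explicit WrapperConfig(PassConfigBase *real) : real_(real) {
        assert(real_ != nullptr);
    }
    void runPreRegAlloc() { addPreRegAlloc(); }

protected:
    void addPreRegAlloc() override {
        // real_->addPreRegAlloc();  // does not compile: protected access
        //                           // through a PassConfigBase pointer
        HookAccess::callPreRegAlloc(*real_);
        log.push_back("drti pre-RA");
    }

private:
    PassConfigBase *real_;
};
```

Calling runPreRegAlloc() on a WrapperConfig built around a RealConfig runs the real object's override first and then the wrapper's own step.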
2. To make it more robust I'd like to convert CALL instructions into a
PUSH and JMP, so I can fake the return address to point at a block
containing raw data and a JMP back to the instruction after the original
CALL. I think we could call this a "return thunk". So instead of CALL
target [...] I would have something like the below:
    MOV my_thunk, R11
    PUSH R11
    JMP target
my_post_call:
    [...]
Where my_thunk would have this:
    .8byte [...]
my_thunk:
    JMP my_post_call
I don't know if this is even feasible since it splits the basic block
containing the CALL and quite likely breaks any pre-call or post-call
handling. To be honest I'm also not sure how this relates to instruction
"bundles", or whether the CALL is already more complicated than a
single instruction. Does anyone know what would be involved in this kind
of transformation from CALL to PUSH and JMP?
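As a rough illustration of the "return thunk" layout in question 2, the C++ sketch below builds the bytes in an ordinary buffer (nothing is executed) and shows how a callee holding the faked return address could read the 8 bytes stored just before it. ThunkImage, buildThunk and payloadBefore are hypothetical names, and the payload value is a placeholder, since the actual contents of the .8byte field are not given above. The only encoding fact assumed is that an x86-64 near JMP with a 32-bit displacement is opcode 0xE9 followed by the displacement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical in-memory image of: .8byte <payload>; my_thunk: JMP rel32
struct ThunkImage {
    std::vector<std::uint8_t> bytes;
    std::size_t thunkOffset;  // offset of the my_thunk label within bytes
};

// Lay out the raw data followed by the JMP back to the post-call code.
// jumpDisplacement is relative to the end of the 5-byte JMP instruction.
ThunkImage buildThunk(std::uint64_t payload, std::int32_t jumpDisplacement) {
    ThunkImage image;
    image.bytes.resize(8 + 5);
    std::memcpy(image.bytes.data(), &payload, 8);                // .8byte data
    image.bytes[8] = 0xE9;                                       // JMP rel32
    std::memcpy(image.bytes.data() + 9, &jumpDisplacement, 4);   // displacement
    image.thunkOffset = 8;
    return image;
}

// Given the faked return address (pointing at my_thunk), read the 8 bytes
// of data placed immediately before it. memcpy avoids alignment assumptions.
std::uint64_t payloadBefore(const std::uint8_t *returnAddress) {
    assert(returnAddress != nullptr);
    std::uint64_t payload;
    std::memcpy(&payload, returnAddress - 8, sizeof payload);
    return payload;
}
```

In the real scheme the return address would come from the stack slot written by the PUSH; here it is simply image.bytes.data() + image.thunkOffset.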
Regards,
Raoul Gough.