[LLVMdev] Why does the x86-64 JIT emit stubs for external calls?

Wed Jun 10 12:17:02 PDT 2009

In X86CodeEmitter.cpp, the following code appears in the handler used for
CALL64pcrel32 instructions:

        // Assume undefined functions may be outside the Small codespace.
        bool NeedStub =
          (Is64BitMode &&
              (TM.getCodeModel() == CodeModel::Large ||
               TM.getSubtarget<X86Subtarget>().isTargetDarwin())) ||
          Opcode == X86::TAILJMPd;
        emitGlobalAddress(MO.getGlobal(), X86::reloc_pcrel_word,
                          MO.getOffset(), 0, NeedStub);

This causes every external call to be emitted as a call to a stub
which then jumps to the real function.

I understand, thanks to the helpful folks on #llvm, that a call whose
target is more than 31 bits of address space away cannot use the simple
"call rip+ADDRESS" form, because its displacement is a signed 32-bit
value, and has to be emitted as a "mov $ADDRESS, r10; call *r10" pair
instead. But why isn't the mov+call pair emitted inline? And why are
Darwin and TAILJMPd special?
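
To make the distance constraint concrete, here is a minimal sketch of
the reachability check I have in mind. It is not LLVM code:
fitsInRel32 is a hypothetical helper, and it assumes the direct call
is the 5-byte "call rel32" encoding, whose displacement is measured
from the end of the instruction.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of LLVM: reports whether a direct
// "call rel32" placed at CallAddr can reach Target.
constexpr bool fitsInRel32(std::uint64_t CallAddr, std::uint64_t Target) {
  // The displacement is relative to the next instruction; the assumed
  // encoding (opcode E8 plus a 4-byte displacement) is 5 bytes long.
  const std::uint64_t NextInst = CallAddr + 5;
  // Unsigned subtraction wraps, so reinterpreting the difference as a
  // signed value yields the displacement on two's-complement targets.
  const std::int64_t Disp = static_cast<std::int64_t>(Target - NextInst);
  return Disp >= std::numeric_limits<std::int32_t>::min() &&
         Disp <= std::numeric_limits<std::int32_t>::max();
}
```

If the JIT knew the callee's address when it emitted the call, a
check along these lines could in principle choose between the direct
call and the mov+call pair, without going through a stub.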

Routing these calls through an out-of-line stub seems to lose up to 2%
performance on the Unladen Swallow benchmarks, so, while it's not
urgent, it'd be nice to figure out how to avoid the stubs.

What kind of patch would be welcome to fix this?

Thanks,
Jeffrey