[PATCH] D39065: Support nonlazybind attribute for X86 64-bit ELF (invoked via -fno-plt)

Sriraman Tallam via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 18 11:32:48 PDT 2017


tmsriram created this revision.
Herald added a subscriber: mehdi_amini.

GCC supports option -fno-plt to avoid calls via the PLT.  Allow the same in clang/llvm.

This LLVM patch avoids the PLT and instead calls indirectly through the GOT when the callee is an external declaration carrying the "nonlazybind" attribute. GCC supports this feature via option -fno-plt, https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00001.html. We noticed that, for large binaries, avoiding the PLT reduces iTLB misses and improves the performance of some of our critical benchmarks by more than 0.5%.
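As a rough illustration of the difference between the two call paths, the sketch below models a GOT slot as a plain function pointer and a PLT stub as a small trampoline that jumps through that pointer. It is a conceptual analogy only, not how the loader or linker is implemented, and every name in it (foo_impl, got_foo, plt_foo, call_via_plt, call_via_got) is hypothetical.

```cpp
// Conceptual model only: real GOT/PLT entries are created by the linker and
// filled in by the dynamic loader, not written by hand like this.
using Fn = int (*)();

// Hypothetical external function that would normally live in a shared library.
int foo_impl() { return 42; }

// Models a GOT slot: a pointer holding the resolved address of foo.
Fn got_foo = &foo_impl;

// Models a PLT stub: an extra piece of code that jumps through the GOT slot.
int plt_foo() { return got_foo(); }

// Default code generation: call the stub, which then jumps through the GOT
// (roughly "callq foo@PLT").
int call_via_plt() { return plt_foo(); }

// With the PLT avoided: one indirect call straight through the GOT slot
// (roughly "callq *foo@GOTPCREL(%rip)").
int call_via_got() { return got_foo(); }
```

Both paths reach the same function; the second skips the stub, which is the code whose instruction-page footprint the iTLB observation above refers to.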

This patch generates indirect calls for all external functions marked "nonlazybind".  In a non-LTO build, if the external function ends up linked into the binary, the linker can convert these indirect calls to direct calls.  GNU ld and gold already support this: https://sourceware.org/ml/binutils/2016-05/msg00322.html
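The decision this patch adds can be sketched as a small standalone model. This is a simplification under stated assumptions, not the code in X86Subtarget.cpp: RefKind, CalleeInfo, TargetInfo and classifyCall are hypothetical stand-ins, and the COFF, RegCall and Mach-O cases of the real function are collapsed into a single PLT fallback.

```cpp
// Hypothetical, simplified stand-ins; none of these names exist in LLVM.
enum class RefKind { Direct, ViaGOT, ViaPLT };

struct CalleeInfo {
  bool isDeclaration; // models GV->isDeclarationForLinker()
  bool nonLazyBind;   // models F->hasFnAttribute(Attribute::NonLazyBind)
  bool dsoLocal;      // models TM.shouldAssumeDSOLocal(M, GV)
};

struct TargetInfo {
  bool isELF;   // models isTargetELF()
  bool is64Bit; // models is64Bit()
};

RefKind classifyCall(const TargetInfo &T, const CalleeInfo &C) {
  // Check added by the patch. It runs before the DSO-local check, so a
  // nonlazybind declaration goes through the GOT even without PIC.
  if (T.isELF && T.is64Bit && C.nonLazyBind && C.isDeclaration)
    return RefKind::ViaGOT;

  // Existing behavior: symbols known to be local are called directly.
  if (C.dsoLocal)
    return RefKind::Direct;

  // Simplification: every remaining case is treated as a PLT call here.
  return RefKind::ViaPLT;
}
```

In this model, classifyCall({true, true}, {true, true, true}) yields RefKind::ViaGOT, which mirrors the test expectation that the nonlazybind declaration is called through GOTPCREL under both RUN lines.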

I will make a separate clang patch to add the -fno-plt option, which will annotate external functions with the "nonlazybind" attribute.


https://reviews.llvm.org/D39065

Files:
  lib/Target/X86/X86Subtarget.cpp
  test/CodeGen/X86/no-plt.ll


Index: test/CodeGen/X86/no-plt.ll
===================================================================
--- test/CodeGen/X86/no-plt.ll
+++ test/CodeGen/X86/no-plt.ll
@@ -0,0 +1,23 @@
+; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux-gnu -relocation-model=pic \
+; RUN:   | FileCheck -check-prefix=X64 %s
+; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux-gnu \
+; RUN:   | FileCheck -check-prefix=X64 %s
+
+define i32 @main() #0 {
+; X64: callq *_Z3foov@GOTPCREL(%rip)
+; X64: callq _Z3barv
+
+entry:
+  %retval = alloca i32, align 4
+  store i32 0, i32* %retval, align 4
+  %call1 = call i32 @_Z3foov()
+  %call2 = call i32 @_Z3barv()
+  ret i32 0
+}
+
+; Function Attrs: nonlazybind
+declare i32 @_Z3foov() #1
+
+declare i32 @_Z3barv() #2
+
+attributes #1 = { nonlazybind }
Index: lib/Target/X86/X86Subtarget.cpp
===================================================================
--- lib/Target/X86/X86Subtarget.cpp
+++ lib/Target/X86/X86Subtarget.cpp
@@ -160,6 +160,15 @@
 unsigned char
 X86Subtarget::classifyGlobalFunctionReference(const GlobalValue *GV,
                                               const Module &M) const {
+  const Function *F = dyn_cast_or_null<Function>(GV);
+
+  // Do not use the PLT when the function is explicitly marked nonlazybind
+  // on a 64-bit ELF target.
+  if (isTargetELF() && is64Bit() && F &&
+      F->hasFnAttribute(Attribute::NonLazyBind) &&
+      GV->isDeclarationForLinker())
+    return X86II::MO_GOTPCREL;
+
   if (TM.shouldAssumeDSOLocal(M, GV))
     return X86II::MO_NO_FLAG;
 
@@ -169,8 +178,6 @@
     return X86II::MO_DLLIMPORT;
   }
 
-  const Function *F = dyn_cast_or_null<Function>(GV);
-
   if (isTargetELF()) {
     if (is64Bit() && F && (CallingConv::X86_RegCall == F->getCallingConv()))
       // According to psABI, PLT stub clobbers XMM8-XMM15.

