[PATCH] D39065: Support nonlazybind attribute for X86 64-bit ELF (invoked via -fno-plt)
Sriraman Tallam via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 18 11:32:48 PDT 2017
tmsriram created this revision.
Herald added a subscriber: mehdi_amini.
GCC supports option -fno-plt to avoid calls via the PLT. Allow the same in clang/llvm.
With this patch, LLVM does not use the PLT for a function that is declared external and carries the nonlazybind attribute; it calls the function indirectly through the GOT instead. GCC provides the same behavior with -fno-plt, https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00001.html. For large binaries, we noticed that avoiding the PLT reduces iTLB misses and improves the performance of some of our critical benchmarks by more than 0.5%.
This patch generates an indirect call for every external function marked nonlazybind. In non-LTO builds, if the external function ends up linked into the same binary, the linker can convert the indirect call back to a direct call. GNU ld and gold already support this, https://sourceware.org/ml/binutils/2016-05/msg00322.html
I will send a separate clang patch that adds the -fno-plt option, which will annotate external functions with the "nonlazybind" attribute.
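As a rough analogy for the two call paths described above (not code from this patch), the sketch below models a PLT-style lazy stub and a GOT-style slot with plain function pointers. All names (`got_foo`, `lazy_slot`, `plt_foo`, `resolver_calls`) are hypothetical, and `foo` merely stands in for a function defined in another shared object.

```cpp
#include <cassert>

// Stand-in for a function that would live in another shared object.
int foo() { return 42; }

using Fn = int (*)();

// Model of a GOT slot with eager binding: the pointer is already filled in
// before the first call, so a call is a single indirect call through it.
// This corresponds to `callq *foo@GOTPCREL(%rip)`.
Fn got_foo = &foo;

// Model of a PLT entry with lazy binding: every call first goes to a stub,
// and the stub resolves the target on the first call only.
int resolver_calls = 0;   // counts how often the "resolver" runs
Fn lazy_slot = nullptr;   // filled in on first use

int plt_foo() {
  if (lazy_slot == nullptr) {
    ++resolver_calls;     // first call: resolve and cache the target
    lazy_slot = &foo;
  }
  return lazy_slot();     // every call: extra hop through the stub
}

int call_via_got() { return got_foo(); }
int call_via_plt() { return plt_foo(); }
```

In this model both paths reach `foo`, but the PLT path executes the stub on every call. In a real binary that stub sits in a separate section, which is where the extra iTLB pressure mentioned above would come from.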
https://reviews.llvm.org/D39065
Files:
lib/Target/X86/X86Subtarget.cpp
test/CodeGen/X86/no-plt.ll
Index: test/CodeGen/X86/no-plt.ll
===================================================================
--- test/CodeGen/X86/no-plt.ll
+++ test/CodeGen/X86/no-plt.ll
@@ -0,0 +1,23 @@
+; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux-gnu -relocation-model=pic \
+; RUN: | FileCheck -check-prefix=X64 %s
+; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux-gnu \
+; RUN: | FileCheck -check-prefix=X64 %s
+
+define i32 @main() #0 {
+; X64: callq *_Z3foov@GOTPCREL(%rip)
+; X64: callq _Z3barv
+
+entry:
+ %retval = alloca i32, align 4
+ store i32 0, i32* %retval, align 4
+ %call1 = call i32 @_Z3foov()
+ %call2 = call i32 @_Z3barv()
+ ret i32 0
+}
+
+; Function Attrs: nonlazybind
+declare i32 @_Z3foov() #1
+
+declare i32 @_Z3barv() #2
+
+attributes #1 = { nonlazybind }
Index: lib/Target/X86/X86Subtarget.cpp
===================================================================
--- lib/Target/X86/X86Subtarget.cpp
+++ lib/Target/X86/X86Subtarget.cpp
@@ -160,6 +160,15 @@
unsigned char
X86Subtarget::classifyGlobalFunctionReference(const GlobalValue *GV,
const Module &M) const {
+ const Function *F = dyn_cast_or_null<Function>(GV);
+
+ // Bypass the PLT when explicitly requested via nonlazybind on 64-bit ELF
+ // targets.
+ if (isTargetELF() && is64Bit() && F &&
+ F->hasFnAttribute(Attribute::NonLazyBind) &&
+ GV->isDeclarationForLinker())
+ return X86II::MO_GOTPCREL;
+
if (TM.shouldAssumeDSOLocal(M, GV))
return X86II::MO_NO_FLAG;
@@ -169,8 +178,6 @@
return X86II::MO_DLLIMPORT;
}
- const Function *F = dyn_cast_or_null<Function>(GV);
-
if (isTargetELF()) {
if (is64Bit() && F && (CallingConv::X86_RegCall == F->getCallingConv()))
// According to psABI, PLT stub clobbers XMM8-XMM15.
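The ordering in the X86Subtarget.cpp hunk matters: the nonlazybind check runs before the shouldAssumeDSOLocal shortcut, which is why the test expects the GOTPCREL call both with and without -relocation-model=pic. A minimal standalone sketch of that decision order follows. It does not use LLVM's types; `RefKind`, `TargetInfo`, `CalleeInfo` and `classifyCall` are hypothetical stand-ins, and everything after the DSO-local check is collapsed into a single PLT fallback as a simplification.

```cpp
#include <cassert>

// Hypothetical stand-ins for the X86II operand flags named in the patch.
enum class RefKind { NoFlag, GotPcRel, Plt };

// Hypothetical summary of the target properties the patch queries.
struct TargetInfo {
  bool isELF;
  bool is64Bit;
};

// Hypothetical summary of the callee properties the patch queries.
struct CalleeInfo {
  bool isFunction;       // the global value is a Function
  bool hasNonLazyBind;   // carries the nonlazybind attribute
  bool isDeclaration;    // declared here, defined elsewhere
  bool assumedDsoLocal;  // what shouldAssumeDSOLocal would answer
};

RefKind classifyCall(const TargetInfo &T, const CalleeInfo &C) {
  // Check added by the patch: it comes first, so it applies even when the
  // callee would otherwise be treated as DSO-local.
  if (T.isELF && T.is64Bit && C.isFunction && C.hasNonLazyBind &&
      C.isDeclaration)
    return RefKind::GotPcRel;

  // Existing shortcut: DSO-local callees get a plain direct call.
  if (C.assumedDsoLocal)
    return RefKind::NoFlag;

  // Simplification: the remaining cases of the real function are omitted.
  return RefKind::Plt;
}
```

Under these assumptions, a nonlazybind declaration on 64-bit ELF yields `GotPcRel` whether or not it is assumed DSO-local, while the same declaration without the attribute falls through to `NoFlag` or `Plt`.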