[llvm-dev] [lld] avoid emitting PLT entries for ifuncs

Mark Johnston via llvm-dev llvm-dev at lists.llvm.org
Tue Aug 21 06:47:41 PDT 2018


Hello,

We've recently started using ifuncs in the x86(_64) FreeBSD kernel.
Currently lld emits a PLT entry for each ifunc, so ifunc calls are
more expensive than calls to regular functions.  In our kernel, this
overhead isn't really necessary: if lld instead emits PC-relative
relocations for each ifunc call site, where each relocation references
a symbol of type GNU_IFUNC, then during boot we can resolve each
call site and apply the relocation before mapping the kernel text
read-only.  Then, ifunc calls have the same overhead as regular function
calls.
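
As a rough illustration of the boot-time step described above, here is a
minimal sketch of patching a PC-relative call site with the resolver's
result.  It is not the actual FreeBSD code: IfuncReloc and applyIfuncRelocs
are hypothetical names, the relocation record is a simplified stand-in for
an Elf64_Rela entry, and addresses are simulated with a byte buffer and a
base address.  It assumes each relocation is a 32-bit PC-relative field
computed as S + A - P.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified stand-in for a relocation against an ifunc symbol.
struct IfuncReloc {
  uint64_t offset;        // offset of the 32-bit field within the text image
  int64_t addend;         // typically -4 for the operand of a call rel32
  uint64_t (*resolver)(); // ifunc resolver; returns the chosen target address
};

// Patch every call site with S + A - P, where S is the address returned by
// the resolver and P is the runtime address of the field being patched.
// This must run before the text is mapped read-only.
inline void applyIfuncRelocs(uint8_t *text, uint64_t textBase,
                             const std::vector<IfuncReloc> &relocs) {
  for (const IfuncReloc &r : relocs) {
    int64_t s = static_cast<int64_t>(r.resolver());
    int64_t p = static_cast<int64_t>(textBase + r.offset);
    int64_t disp = s + r.addend - p;
    // The displacement has to fit in the signed 32-bit operand.
    assert(disp >= INT32_MIN && disp <= INT32_MAX);
    int32_t value = static_cast<int32_t>(disp);
    // Stored in host byte order (little-endian on x86).
    std::memcpy(text + r.offset, &value, sizeof(value));
  }
}
```

Once the field is patched, the call instruction reaches the selected
implementation directly, with no PLT stub in between.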

To implement this optimization, I wrote an lld patch to add
"-z ifunc-noplt".  When this option is specified, lld does not create
PLT entries for ifuncs and instead passes the existing PC-relative
relocation through to the output file.  The patch is below; I tested it
with lld 7.0, and it also applies to trunk without modification.

I'm wondering whether such an option would be acceptable in upstream lld,
and whether anyone has comments on my implementation.  The patch still
lacks tests, and I have some questions:
- How should "-z ifunc-noplt" interact with "-z text"? Should the
  invoker be required to additionally specify "-z notext"?
- Could "-z ifunc-noplt" be subsumed by a more general mechanism which
  tells lld not to apply constant relocations and instead pass them
  through to the output file?  I could imagine using such a mechanism
  to make it possible to dynamically enable retpoline at boot time.
  It could also be useful for implementing static DTrace trace points.

Thanks,
-Mark

diff --git a/ELF/Config.h b/ELF/Config.h
index 5dc7f5321..b5a3d3266 100644
--- a/ELF/Config.h
+++ b/ELF/Config.h
@@ -182,6 +182,7 @@ struct Configuration {
   bool ZCopyreloc;
   bool ZExecstack;
   bool ZHazardplt;
+  bool ZIfuncnoplt;
   bool ZInitfirst;
   bool ZKeepTextSectionPrefix;
   bool ZNodelete;
diff --git a/ELF/Driver.cpp b/ELF/Driver.cpp
index aced1edca..e7896cedf 100644
--- a/ELF/Driver.cpp
+++ b/ELF/Driver.cpp
@@ -340,7 +340,8 @@ static bool getZFlag(opt::InputArgList &Args, StringRef K1, StringRef K2,
 
 static bool isKnown(StringRef S) {
   return S == "combreloc" || S == "copyreloc" || S == "defs" ||
-         S == "execstack" || S == "hazardplt" || S == "initfirst" ||
+         S == "execstack" || S == "hazardplt" || S == "ifunc-noplt" ||
+         S == "initfirst" ||
          S == "keep-text-section-prefix" || S == "lazy" || S == "muldefs" ||
          S == "nocombreloc" || S == "nocopyreloc" || S == "nodelete" ||
          S == "nodlopen" || S == "noexecstack" ||
@@ -834,6 +835,7 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args) {
   Config->ZCopyreloc = getZFlag(Args, "copyreloc", "nocopyreloc", true);
   Config->ZExecstack = getZFlag(Args, "execstack", "noexecstack", false);
   Config->ZHazardplt = hasZOption(Args, "hazardplt");
+  Config->ZIfuncnoplt = hasZOption(Args, "ifunc-noplt");
   Config->ZInitfirst = hasZOption(Args, "initfirst");
   Config->ZKeepTextSectionPrefix = getZFlag(
       Args, "keep-text-section-prefix", "nokeep-text-section-prefix", false);
diff --git a/ELF/Relocations.cpp b/ELF/Relocations.cpp
index 8f60aa3d2..a54d87e43 100644
--- a/ELF/Relocations.cpp
+++ b/ELF/Relocations.cpp
@@ -361,6 +361,10 @@ static bool isStaticLinkTimeConstant(RelExpr E, RelType Type, const Symbol &Sym,
           R_TLSLD_HINT>(E))
     return true;
 
+  // The computation involves output from the ifunc resolver.
+  if (Sym.isGnuIFunc() && Config->ZIfuncnoplt)
+    return false;
+
   // These never do, except if the entire file is position dependent or if
   // only the low bits are used.
   if (E == R_GOT || E == R_PLT || E == R_TLSDESC)
@@ -808,6 +812,10 @@ static void processRelocAux(InputSectionBase &Sec, RelExpr Expr, RelType Type,
     Sec.Relocations.push_back({Expr, Type, Offset, Addend, &Sym});
     return;
   }
+  if (Sym.isGnuIFunc() && Config->ZIfuncnoplt) {
+    InX::RelaDyn->addReloc(Type, &Sec, Offset, &Sym, Addend, R_ADDEND, Type);
+    return;
+  }
   bool CanWrite = (Sec.Flags & SHF_WRITE) || !Config->ZText;
   if (CanWrite) {
     // R_GOT refers to a position in the got, even if the symbol is preemptible.
@@ -977,7 +985,7 @@ static void scanReloc(InputSectionBase &Sec, OffsetGetter &GetOffset, RelTy *&I,
   // all dynamic symbols that can be resolved within the executable will
   // actually be resolved that way at runtime, because the main exectuable
   // is always at the beginning of a search list. We can leverage that fact.
-  if (Sym.isGnuIFunc())
+  if (Sym.isGnuIFunc() && !Config->ZIfuncnoplt)
     Expr = toPlt(Expr);
   else if (!Sym.IsPreemptible && Expr == R_GOT_PC && !isAbsoluteValue(Sym))
     Expr = Target->adjustRelaxExpr(Type, RelocatedAddr, Expr);
diff --git a/ELF/Writer.cpp b/ELF/Writer.cpp
index 90462ecc7..418133ebd 100644
--- a/ELF/Writer.cpp
+++ b/ELF/Writer.cpp
@@ -1570,8 +1570,11 @@ template <class ELFT> void Writer<ELFT>::finalizeSections() {
   applySynthetic({InX::EhFrame},
                  [](SyntheticSection *SS) { SS->finalizeContents(); });
 
-  for (Symbol *S : Symtab->getSymbols())
+  for (Symbol *S : Symtab->getSymbols()) {
     S->IsPreemptible |= computeIsPreemptible(*S);
+    if (S->isGnuIFunc() && Config->ZIfuncnoplt)
+      S->ExportDynamic = true;
+  }
 
   // Scan relocations. This must be done after every symbol is declared so that
   // we can correctly decide if a dynamic relocation is needed.

