[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
Peter Smith via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 21 09:47:59 PDT 2018
Hello Mark,
On 21 August 2018 at 14:47, Mark Johnston via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hello,
>
> We've recently started using ifuncs in the x86(_64) FreeBSD kernel.
> Currently lld will emit a PLT entry for each ifunc, so ifunc calls are
> more expensive than those of regular functions. In our kernel, this
> overhead isn't really necessary: if lld instead emits PC-relative
> relocations for each ifunc call site, where each relocation references
> a symbol of type GNU_IFUNC, then during boot we can resolve each
> call site and apply the relocation before mapping the kernel text
> read-only. Then, ifunc calls have the same overhead as regular function
> calls.
>
> To implement this optimization, I wrote an lld patch to add
> "-z ifunc-noplt". When this option is specified, lld does not create
> PLT entries for ifuncs and instead passes the existing PC-relative
> relocation through to the output file. The patch is below; I tested it
> with lld 7.0 and the patch applied without modifications to the sources
> in trunk.
>
> I'm wondering if such an option would be acceptable in upstream lld, and
> whether anyone had comments on my implementation. The patch is lacking
> tests, and I had some questions:
I'm not the LLD maintainer so this is just a personal opinion. If I
understand the optimisation correctly, when it is used on a program then
either the loader for the program or the program itself is responsible
for running the ifunc resolvers and patching the call sites. I think it
would have to come with a big health warning, in at least the help text
and the documentation, that platform/OS support is needed to run the
program.
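
To make the required platform support concrete, here is a minimal sketch
of what a loader would do for one passed-through call-site relocation. It
assumes the usual x86-64 PC-relative calculation (S + A - P) with a 32-bit
field; the CallSiteReloc record and the applyIfuncReloc name are
hypothetical and are not taken from the patch or from FreeBSD.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical record for one passed-through relocation. A real loader
// would read Elf64_Rela entries and look the resolver up via the symbol.
struct CallSiteReloc {
  std::uint64_t offset;         // offset of the 32-bit field in the text image
  std::int64_t addend;          // typically -4 for an x86-64 call rel32
  std::uint64_t (*resolver)();  // ifunc resolver: returns the chosen target
};

// Patch one call site with S + A - P, where S is the resolver's result,
// A is the addend and P is the run-time address of the patched field.
// Returns false if the displacement does not fit in 32 bits.
bool applyIfuncReloc(std::uint8_t *text, std::uint64_t textAddr,
                     const CallSiteReloc &r) {
  std::uint64_t s = r.resolver();
  std::uint64_t p = textAddr + r.offset;
  std::int64_t value = static_cast<std::int64_t>(s + r.addend - p);
  if (value < INT32_MIN || value > INT32_MAX)
    return false;
  std::int32_t v32 = static_cast<std::int32_t>(value);
  std::memcpy(text + r.offset, &v32, sizeof v32);
  return true;
}
```

This has to run while the text is still writable, which is why the
original proposal applies the relocations before the kernel text is
mapped read-only.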
> - How should "-z ifunc-noplt" interact with "-z text"? Should the
> invoker be required to additionally specify "-z notext"?
I think it could either be that -z text -z ifunc-noplt is an error, with
-z ifunc-noplt implying -z notext; or that -z ifunc-noplt is an error
without -z notext.
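
The two alternatives can be sketched as small decision functions. This is
only an illustration of the policies, assuming the text option defaults to
on; TextOpt, Outcome, impliedNoText and requireNoText are hypothetical
names and the diagnostics are made up, none of them exist in lld.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of how the text option was spelled on the command line.
enum class TextOpt { Unspecified, Text, NoText };

struct Outcome {
  bool ok;            // false means the link should fail
  bool zText;         // effective value of the text option
  std::string error;  // diagnostic when !ok
};

// Policy A: -z ifunc-noplt implies -z notext; an explicit -z text is an error.
Outcome impliedNoText(bool ifuncNoPlt, TextOpt text) {
  if (!ifuncNoPlt)
    return {true, text != TextOpt::NoText, ""};
  if (text == TextOpt::Text)
    return {false, true, "-z ifunc-noplt is incompatible with -z text"};
  return {true, false, ""};
}

// Policy B: -z ifunc-noplt is an error unless -z notext is given explicitly.
Outcome requireNoText(bool ifuncNoPlt, TextOpt text) {
  if (ifuncNoPlt && text != TextOpt::NoText)
    return {false, true, "-z ifunc-noplt requires -z notext"};
  return {true, text != TextOpt::NoText, ""};
}
```

Policy A is more convenient for the user; policy B makes the writable-text
requirement visible on the command line.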
> - Could "-z ifunc-noplt" be subsumed by a more general mechanism which
> tells lld not to apply constant relocations and instead pass them
> through to the output file? I could imagine using such a mechanism
> to make it possible to dynamically enable retpoline at boot time.
> It could also be useful for implementing static DTrace trace points.
In theory, on RELA platforms --emit-relocs gets you pretty close. It
won't inhibit the generation of PLT or GOT entries, but I think it would
give enough information to redirect the call sites to the results of the
ifunc resolvers. I guess the problem here is where you stop, and how
portable the solution would be across different targets. For example, on
Arm you would ideally only want to deal with a small subset of the
instruction relocations at run/load time. I think it is a solvable
problem, but it does need some careful thought to avoid just
implementing something that works for a specific target/OS.
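
As a rough sketch of how retained relocations could be used, the function
below picks out the RELA entries whose symbol has type STT_GNU_IFUNC. The
Rela and Sym structs are minimal stand-ins with the ELF64 field layout,
and ifuncRelocs is a hypothetical helper; a real tool would use the
definitions from its platform's ELF headers and would also have to decide
which relocation types it is willing to handle.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal stand-ins for Elf64_Rela and Elf64_Sym (same field order).
struct Rela {
  std::uint64_t offset;
  std::uint64_t info;  // symbol index in the high 32 bits, type in the low 32
  std::int64_t addend;
};
struct Sym {
  std::uint32_t name;
  std::uint8_t info;  // binding in the high 4 bits, type in the low 4 bits
  std::uint8_t other;
  std::uint16_t shndx;
  std::uint64_t value;
  std::uint64_t size;
};

constexpr std::uint8_t kSttGnuIfunc = 10;  // STT_GNU_IFUNC

// Return the relocations that reference an ifunc symbol.
std::vector<Rela> ifuncRelocs(const std::vector<Rela> &relas,
                              const std::vector<Sym> &symtab) {
  std::vector<Rela> out;
  for (const Rela &r : relas) {
    std::uint64_t symIndex = r.info >> 32;  // same as ELF64_R_SYM
    if (symIndex < symtab.size() &&
        (symtab[symIndex].info & 0xf) == kSttGnuIfunc)  // ELF64_ST_TYPE
      out.push_back(r);
  }
  return out;
}
```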
Peter
>
> Thanks,
> -Mark
>
> diff --git a/ELF/Config.h b/ELF/Config.h
> index 5dc7f5321..b5a3d3266 100644
> --- a/ELF/Config.h
> +++ b/ELF/Config.h
> @@ -182,6 +182,7 @@ struct Configuration {
> bool ZCopyreloc;
> bool ZExecstack;
> bool ZHazardplt;
> + bool ZIfuncnoplt;
> bool ZInitfirst;
> bool ZKeepTextSectionPrefix;
> bool ZNodelete;
> diff --git a/ELF/Driver.cpp b/ELF/Driver.cpp
> index aced1edca..e7896cedf 100644
> --- a/ELF/Driver.cpp
> +++ b/ELF/Driver.cpp
> @@ -340,7 +340,8 @@ static bool getZFlag(opt::InputArgList &Args, StringRef K1, StringRef K2,
>
> static bool isKnown(StringRef S) {
> return S == "combreloc" || S == "copyreloc" || S == "defs" ||
> - S == "execstack" || S == "hazardplt" || S == "initfirst" ||
> + S == "execstack" || S == "hazardplt" || S == "ifunc-noplt" ||
> + S == "initfirst" ||
> S == "keep-text-section-prefix" || S == "lazy" || S == "muldefs" ||
> S == "nocombreloc" || S == "nocopyreloc" || S == "nodelete" ||
> S == "nodlopen" || S == "noexecstack" ||
> @@ -834,6 +835,7 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args) {
> Config->ZCopyreloc = getZFlag(Args, "copyreloc", "nocopyreloc", true);
> Config->ZExecstack = getZFlag(Args, "execstack", "noexecstack", false);
> Config->ZHazardplt = hasZOption(Args, "hazardplt");
> + Config->ZIfuncnoplt = hasZOption(Args, "ifunc-noplt");
> Config->ZInitfirst = hasZOption(Args, "initfirst");
> Config->ZKeepTextSectionPrefix = getZFlag(
> Args, "keep-text-section-prefix", "nokeep-text-section-prefix", false);
> diff --git a/ELF/Relocations.cpp b/ELF/Relocations.cpp
> index 8f60aa3d2..a54d87e43 100644
> --- a/ELF/Relocations.cpp
> +++ b/ELF/Relocations.cpp
> @@ -361,6 +361,10 @@ static bool isStaticLinkTimeConstant(RelExpr E, RelType Type, const Symbol &Sym,
> R_TLSLD_HINT>(E))
> return true;
>
> + // The computation involves output from the ifunc resolver.
> + if (Sym.isGnuIFunc() && Config->ZIfuncnoplt)
> + return false;
> +
> // These never do, except if the entire file is position dependent or if
> // only the low bits are used.
> if (E == R_GOT || E == R_PLT || E == R_TLSDESC)
> @@ -808,6 +812,10 @@ static void processRelocAux(InputSectionBase &Sec, RelExpr Expr, RelType Type,
> Sec.Relocations.push_back({Expr, Type, Offset, Addend, &Sym});
> return;
> }
> + if (Sym.isGnuIFunc() && Config->ZIfuncnoplt) {
> + InX::RelaDyn->addReloc(Type, &Sec, Offset, &Sym, Addend, R_ADDEND, Type);
> + return;
> + }
> bool CanWrite = (Sec.Flags & SHF_WRITE) || !Config->ZText;
> if (CanWrite) {
> // R_GOT refers to a position in the got, even if the symbol is preemptible.
> @@ -977,7 +985,7 @@ static void scanReloc(InputSectionBase &Sec, OffsetGetter &GetOffset, RelTy *&I,
> // all dynamic symbols that can be resolved within the executable will
> // actually be resolved that way at runtime, because the main exectuable
> // is always at the beginning of a search list. We can leverage that fact.
> - if (Sym.isGnuIFunc())
> + if (Sym.isGnuIFunc() && !Config->ZIfuncnoplt)
> Expr = toPlt(Expr);
> else if (!Sym.IsPreemptible && Expr == R_GOT_PC && !isAbsoluteValue(Sym))
> Expr = Target->adjustRelaxExpr(Type, RelocatedAddr, Expr);
> diff --git a/ELF/Writer.cpp b/ELF/Writer.cpp
> index 90462ecc7..418133ebd 100644
> --- a/ELF/Writer.cpp
> +++ b/ELF/Writer.cpp
> @@ -1570,8 +1570,11 @@ template <class ELFT> void Writer<ELFT>::finalizeSections() {
> applySynthetic({InX::EhFrame},
> [](SyntheticSection *SS) { SS->finalizeContents(); });
>
> - for (Symbol *S : Symtab->getSymbols())
> + for (Symbol *S : Symtab->getSymbols()) {
> S->IsPreemptible |= computeIsPreemptible(*S);
> + if (S->isGnuIFunc() && Config->ZIfuncnoplt)
> + S->ExportDynamic = true;
> + }
>
> // Scan relocations. This must be done after every symbol is declared so that
> // we can correctly decide if a dynamic relocation is needed.