[lld] r288314 - Add `isRelExprOneOf` helper

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 6 16:40:00 PST 2016


Interesting that it didn't speed things up as much for you. The only thing
I can think of that seems likely is that I was testing this on an i7
machine whereas your machine (I'm assuming mac pro) has a beefier xeon with
much more L3 cache. That would mitigate the benefit of this patch since I
suspect the main benefit is letting the processor's speculative execution
run farther ahead, hiding load latency (if my hypothesis is correct, it
must be LLC vs DRAM latency).

On Dec 6, 2016 2:11 PM, "Rafael Avila de Espindola" <
rafael.espindola at gmail.com> wrote:

>
> Thanks for the patch.
>
> I uploaded a new version of the benchmark to
>
>   https://drive.google.com/open?id=0B7iRtublysV6RjFBcVJhSjg1eHM
>
> The problem you had with it using files from my machine should be
> fixed. The issue was with linker scripts using absolute paths. It can be
> avoided with fakeroot, but I just edited the scripts to make it easier.
>
> I got a smaller benefit from this patch. The results were
>
> firefox-O0
>   master 4.150496547
>   patch  4.144422417 1.00146561556x faster
> firefox
>   master 6.901552864
>   patch  6.900340356 1.00017571713x faster
> firefox-gc
>   master 7.078655427
>   patch  7.069354855 1.00131561821x faster
> chromium
>   master 4.107906591
>   patch  4.094586237 1.0032531624x faster
> chromium fast
>   master 2.007441161
>   patch  1.999441702 1.00400084633x faster
> the gold plugin
>   master 0.337859787
>   patch  0.334852395 1.0089812468x faster
> clang
>   master 0.57712437
>   patch  0.574027448 1.00539507651x faster
> llvm-as
>   master 0.033570448
>   patch  0.033285747 1.00855324052x faster
> the gold plugin fsds
>   master 0.363855963
>   patch  0.36106724 1.00772355587x faster
> clang fsds
>   master 0.656084922
>   patch  0.652941395 1.00481440911x faster
> llvm-as fsds
>   master 0.033544621
>   patch  0.033198883 1.01041414556x faster
> scylla
>   master 3.112523401
>   patch  3.103663875 1.00285453785x faster
>
> Sean Silva via llvm-commits <llvm-commits at lists.llvm.org> writes:
>
> > Author: silvas
> > Date: Wed Nov 30 23:43:48 2016
> > New Revision: 288314
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=288314&view=rev
> > Log:
> > Add `isRelExprOneOf` helper
> >
> > In various places in LLD's hot loops, we have expressions of the form
> > "E == R_FOO || E == R_BAR || ..." (E is a RelExpr).
> >
> > Some of these expressions are quite long, and even though they usually
> go just
> > a very small number of ways and so should be well predicted, they can
> still
> > occupy branch predictor resources harming other parts of the code, or
> they
> > won't be predicted well if they overflow branch predictor resources or
> if the
> > branches are too dense and the branch predictor can't track them all (the
> > compiler can in theory avoid this, at a cost in text size). And some of
> these
> > expressions are so large and executed so frequently that even when
> > well-predicted they probably still have a nontrivial cost.
> >
> > This speedup should be pretty portable. The cost of these simple bit
> tests is
> > independent of:
> >
> > - the target we are linking for
> > - the distribution of RelExpr's for a given link (which can depend on
> how the
> >   input files were compiled)
> > - what compiler was used to compile LLD (it is just a simple bit test;
> >   hopefully the compiler gets it right!)
> > - adding new target-dependent relocations (e.g. needsPlt doesn't pay any
> extra
> >   cost checking R_PPC_PLT_OPD on x86-64 builds)
> >
> > I did some rough measurements on clang-fsds and this patch gives over
> about 4%
> > speedup for a regular -O1 link, about 2.5% for -O3 --gc-sections and
> over 5%
> > for -O0. Sorry, I don't have my current machine set up for doing really
> > accurate measurements right now.
> >
> > This also is just a bit cleaner. Thanks for Joerg for suggesting for
> > this approach.
> >
> > Differential Revision: https://reviews.llvm.org/D27156
> >
> > Modified:
> >     lld/trunk/ELF/Relocations.cpp
> >     lld/trunk/ELF/Relocations.h
> >
> > Modified: lld/trunk/ELF/Relocations.cpp
> > URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/
> Relocations.cpp?rev=288314&r1=288313&r2=288314&view=diff
> > ============================================================
> ==================
> > --- lld/trunk/ELF/Relocations.cpp (original)
> > +++ lld/trunk/ELF/Relocations.cpp Wed Nov 30 23:43:48 2016
> > @@ -62,12 +62,10 @@ namespace lld {
> >  namespace elf {
> >
> >  static bool refersToGotEntry(RelExpr Expr) {
> > -  return Expr == R_GOT || Expr == R_GOT_OFF || Expr ==
> R_MIPS_GOT_LOCAL_PAGE ||
> > -         Expr == R_MIPS_GOT_OFF || Expr == R_MIPS_GOT_OFF32 ||
> > -         Expr == R_MIPS_TLSGD || Expr == R_MIPS_TLSLD ||
> > -         Expr == R_GOT_PAGE_PC || Expr == R_GOT_PC || Expr ==
> R_GOT_FROM_END ||
> > -         Expr == R_TLSGD || Expr == R_TLSGD_PC || Expr == R_TLSDESC ||
> > -         Expr == R_TLSDESC_PAGE;
> > +  return isRelExprOneOf<R_GOT, R_GOT_OFF, R_MIPS_GOT_LOCAL_PAGE,
> R_MIPS_GOT_OFF,
> > +                        R_MIPS_GOT_OFF32, R_MIPS_TLSGD, R_MIPS_TLSLD,
> > +                        R_GOT_PAGE_PC, R_GOT_PC, R_GOT_FROM_END,
> R_TLSGD,
> > +                        R_TLSGD_PC, R_TLSDESC, R_TLSDESC_PAGE>(Expr);
> >  }
> >
> >  static bool isPreemptible(const SymbolBody &Body, uint32_t Type) {
> > @@ -302,16 +300,16 @@ template <class ELFT> static bool isAbso
> >  }
> >
> >  static bool needsPlt(RelExpr Expr) {
> > -  return Expr == R_PLT_PC || Expr == R_PPC_PLT_OPD || Expr == R_PLT ||
> > -         Expr == R_PLT_PAGE_PC || Expr == R_THUNK_PLT_PC;
> > +  return isRelExprOneOf<R_PLT_PC, R_PPC_PLT_OPD, R_PLT, R_PLT_PAGE_PC,
> > +                        R_THUNK_PLT_PC>(Expr);
> >  }
> >
> >  // True if this expression is of the form Sym - X, where X is a
> position in the
> >  // file (PC, or GOT for example).
> >  static bool isRelExpr(RelExpr Expr) {
> > -  return Expr == R_PC || Expr == R_GOTREL || Expr == R_GOTREL_FROM_END
> ||
> > -         Expr == R_MIPS_GOTREL || Expr == R_PAGE_PC || Expr ==
> R_RELAX_GOT_PC ||
> > -         Expr == R_THUNK_PC || Expr == R_THUNK_PLT_PC;
> > +  return isRelExprOneOf<R_PC, R_GOTREL, R_GOTREL_FROM_END,
> R_MIPS_GOTREL,
> > +                        R_PAGE_PC, R_RELAX_GOT_PC, R_THUNK_PC,
> R_THUNK_PLT_PC>(
> > +      Expr);
> >  }
> >
> >  template <class ELFT>
> > @@ -320,12 +318,11 @@ static bool isStaticLinkTimeConstant(Rel
> >                                       InputSectionBase<ELFT> &S,
> >                                       typename ELFT::uint RelOff) {
> >    // These expressions always compute a constant
> > -  if (E == R_SIZE || E == R_GOT_FROM_END || E == R_GOT_OFF ||
> > -      E == R_MIPS_GOT_LOCAL_PAGE || E == R_MIPS_GOT_OFF ||
> > -      E == R_MIPS_GOT_OFF32 || E == R_MIPS_TLSGD || E == R_GOT_PAGE_PC
> ||
> > -      E == R_GOT_PC || E == R_PLT_PC || E == R_TLSGD_PC || E == R_TLSGD
> ||
> > -      E == R_PPC_PLT_OPD || E == R_TLSDESC_CALL || E == R_TLSDESC_PAGE
> ||
> > -      E == R_HINT || E == R_THUNK_PC || E == R_THUNK_PLT_PC)
> > +  if (isRelExprOneOf<R_SIZE, R_GOT_FROM_END, R_GOT_OFF,
> R_MIPS_GOT_LOCAL_PAGE,
> > +                     R_MIPS_GOT_OFF, R_MIPS_GOT_OFF32, R_MIPS_TLSGD,
> > +                     R_GOT_PAGE_PC, R_GOT_PC, R_PLT_PC, R_TLSGD_PC,
> R_TLSGD,
> > +                     R_PPC_PLT_OPD, R_TLSDESC_CALL, R_TLSDESC_PAGE,
> R_HINT,
> > +                     R_THUNK_PC, R_THUNK_PLT_PC>(E))
> >      return true;
> >
> >    // These never do, except if the entire file is position dependent or
> if
> > @@ -659,12 +656,12 @@ static void scanRelocs(InputSectionBase<
> >
> >      // Ignore "hint" and TLS Descriptor call relocation because they are
> >      // only markers for relaxation.
> > -    if (Expr == R_HINT || Expr == R_TLSDESC_CALL)
> > +    if (isRelExprOneOf<R_HINT, R_TLSDESC_CALL>(Expr))
> >        continue;
> >
> > -    if (needsPlt(Expr) || Expr == R_THUNK_ABS || Expr == R_THUNK_PC ||
> > -        Expr == R_THUNK_PLT_PC || refersToGotEntry(Expr) ||
> > -        !isPreemptible(Body, Type)) {
> > +    if (needsPlt(Expr) ||
> > +        isRelExprOneOf<R_THUNK_ABS, R_THUNK_PC, R_THUNK_PLT_PC>(Expr) ||
> > +        refersToGotEntry(Expr) || !isPreemptible(Body, Type)) {
> >        // If the relocation points to something in the file, we can
> process it.
> >        bool Constant =
> >            isStaticLinkTimeConstant<ELFT>(Expr, Type, Body, C,
> RI.r_offset);
> >
> > Modified: lld/trunk/ELF/Relocations.h
> > URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/
> Relocations.h?rev=288314&r1=288313&r2=288314&view=diff
> > ============================================================
> ==================
> > --- lld/trunk/ELF/Relocations.h (original)
> > +++ lld/trunk/ELF/Relocations.h Wed Nov 30 23:43:48 2016
> > @@ -73,6 +73,35 @@ enum RelExpr {
> >    R_TLSLD_PC,
> >  };
> >
> > +// Build a bitmask with one bit set for each RelExpr.
> > +//
> > +// Constexpr function arguments can't be used in static asserts, so we
> > +// use template arguments to build the mask.
> > +// But function template partial specializations don't exist (needed
> > +// for base case of the recursion), so we need a dummy struct.
> > +template <RelExpr... Exprs> struct RelExprMaskBuilder {
> > +  static inline uint64_t build() { return 0; }
> > +};
> > +
> > +// Specialization for recursive case.
> > +template <RelExpr Head, RelExpr... Tail>
> > +struct RelExprMaskBuilder<Head, Tail...> {
> > +  static inline uint64_t build() {
> > +    static_assert(0 <= Head && Head < 64,
> > +                  "RelExpr is too large for 64-bit mask!");
> > +    return (uint64_t(1) << Head) | RelExprMaskBuilder<Tail...>::
> build();
> > +  }
> > +};
> > +
> > +// Return true if `Expr` is one of `Exprs`.
> > +// There are fewer than 64 RelExpr's, so we can represent any set of
> > +// RelExpr's as a constant bit mask and test for membership with a
> > +// couple cheap bitwise operations.
> > +template <RelExpr... Exprs> bool isRelExprOneOf(RelExpr Expr) {
> > +  assert(0 <= Expr && Expr < 64 && "RelExpr is too large for 64-bit
> mask!");
> > +  return (uint64_t(1) << Expr) & RelExprMaskBuilder<Exprs...>::build();
> > +}
> > +
> >  // Architecture-neutral representation of relocation.
> >  struct Relocation {
> >    RelExpr Expr;
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161206/29564557/attachment.html>


More information about the llvm-commits mailing list