[PATCH] D109516: [JITLink] Factor out forEachRelocation() function from addRelocations() in ELF Aarch64 backend (NFC)

Sun Sep 12 00:57:26 PDT 2021

lhames accepted this revision.
lhames added a comment.
This revision is now accepted and ready to land.

In D109516#2996242 <https://reviews.llvm.org/D109516#2996242>, @sgraenitz wrote:

> Thinking again now: Assuming we skip most sections immediately, this probably adds unreasonable overhead. Because now there's 2 function calls before checking the type against `ELF::SHT_REL(A)`. And there's not much hope for inlining even with the templated parameter, right? Would be really bad with `-ffunction-sections`.
>
> So, yes for performance reasons I'd change that and put the loop back inside. @lhames What do you think?

Thanks for making these changes Stefan. I'm happy with that approach. I think we would need to do some profiling to determine the practical performance impacts of these decisions before we bothered tuning further.

Now for a fun digression (only tangentially related to this, so you can skip it if you're busy): My motivation for asking you to remove the std::function was my intuition was that it would block inlining of the call to the per-relocation handling method. I didn't have any hard evidence to back that intuition up though, so for fun I thought I would write a quick test. In this test I let 'ints' stand in for relocation objects, since they should make things as easy as possible for the inliner:

  #include <functional>

  void doSomethingWith(int);

  class MyLinker {
  public:
    void handleRelocation(int X) { doSomethingWith(X); }
  };

  template <typename Func>
  void functionObjectWalk(const int *V, Func F) {
    while (*V != 0)
      F(*V++);
  }

  template <typename ClassT, typename MethT>
  void methodWalk(const int *V, ClassT *Instance, MethT Method) {
    functionObjectWalk(V,
                       [Instance, Method](int X) {
                         (Instance->*Method)(X);
                       });
  }

  void stdfunctionWalk(const int *V,
                       std::function<void(int)> F) {
    while (*V != 0)
      F(*V++);
  }

  void testFunctionObjectWalk() {
    MyLinker ML;
    int V[] = { 1, 2, 3, 0 };
    functionObjectWalk(V, [&](int X) { ML.handleRelocation(X); });
  }

  void testMethodWalk() {
    MyLinker ML;
    int V[] = { 1, 2, 3, 0 };
    methodWalk(V, &ML, &MyLinker::handleRelocation);
  }

  void testStdFunctionWalk() {
    MyLinker ML;
    int V[] = { 1, 2, 3, 0 };
    stdfunctionWalk(V, [&](int X) { ML.handleRelocation(X); });
  }

I compiled this with a released Xcode clang with: `clang++ -std=c++17 -fno-exceptions -fno-rtti -fno-asynchronous-unwind-tables -O3 -S -o testcase.s testcase.cpp` (the options helped to keep the assembly readable). As expected: `functionObjectWalk` and `methodWalk` produce equivalent results with neatly inlined calls to `doSomethingWith`, whereas `stdfunctionWalk` produces a longer loop with an indirect call. I'm going to call my intuition "tentatively confirmed". I'm not quite motivated enough to dig into the assembly for a release build of LLVM at the moment. ;)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109516/new/

https://reviews.llvm.org/D109516