[PATCH] D136877: [ORC][JITLink] Do not claim dead symbols

Lang Hames via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Oct 29 17:11:26 PDT 2022


lhames added a comment.

> However, this means that we are emitting more weak symbols than we need to, right?

It means that we claim and emit late-breaking definitions by default, rather than dead-stripping them by default. In my experience with code generated from C and C++, late-breaking weak defs are only inserted because something in the same TU needs them, and in that case both approaches behave the same (for late-breaking defs, at least).

> But I guess I just have to accept that you implement something if you don't agree with my patches...

I can't stress enough how much I appreciate you digging into the issue and proposing patches. This would have gone unsolved for much longer if it had not been for your hard work. That said, I thought there was an issue with your approach, which is why I implemented ba26b5ef15dc <https://reviews.llvm.org/rGba26b5ef15dcbfc69f062b1aea6424cdb186e5b0> instead. I tried to explain the issue in abstract terms, but a concrete example might be better. Consider the following two-file program:

  % cat main.c
  static int X = 42;
  int *P = &X;
  extern int getValuePointedToByP();
  int main(int argc, char *argv[]) {
    return getValuePointedToByP();
  }
  
  % cat ext.c
  static int Y = 42;
  int __attribute__((weak)) *P = &Y;
  int getValuePointedToByP() {
    return *P;
  }

In this case the definition of `P` from `main.c` is guaranteed to be chosen because it is strong, whereas `ext.c`'s definition is weak. When we jit-link `ext.o` we want to externalize its weak definition of `P` prior to dead-stripping, so that both `P` and `Y` from `ext.o` can be removed. If we wait until after dead-stripping, then the block for `Y` will already have been marked live and will be allocated memory unnecessarily.

In this case `Y` is just an int and the overhead is just a few bytes, but in practice `Y` could be anything (e.g. a struct, or a function) and could have its own references that pull in still more symbols (and potentially whole other files). This is the motivation for externalizing symbols that are not in the responsibility set prior to dead-stripping, rather than after.
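The ordering difference can be illustrated with a toy model of `ext.o`. This is only a sketch: it does not use JITLink's real data structures or API, and every name in it (`ToySym`, `mark_live`, `externalize`, `count_live_blocks`) is hypothetical. It assumes each symbol's block references at most one other symbol, and that `getValuePointedToByP` is the only liveness root.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of ext.o; not JITLink's actual representation. */
enum { SYM_GET = 0, SYM_P = 1, SYM_Y = 2, NUM_SYMS = 3 };

typedef struct {
  const char *name;
  bool external; /* true once externalized: no local block remains */
  bool live;     /* set by the mark phase of dead-stripping */
  int target;    /* index of the symbol this block references, or -1 */
} ToySym;

/* Build the reference chain getValuePointedToByP -> P -> Y. */
static void init_ext_o(ToySym syms[NUM_SYMS]) {
  syms[SYM_GET] = (ToySym){"getValuePointedToByP", false, false, SYM_P};
  syms[SYM_P] = (ToySym){"P", false, false, SYM_Y};
  syms[SYM_Y] = (ToySym){"Y", false, false, -1};
}

/* Mark phase: follow references from the root. An external symbol has
   no local block, so the walk stops when it reaches one. */
static void mark_live(ToySym syms[NUM_SYMS], int root) {
  assert(root >= 0 && root < NUM_SYMS);
  int i = root;
  while (i != -1 && !syms[i].external && !syms[i].live) {
    syms[i].live = true;
    i = syms[i].target;
  }
}

/* Drop the local definition: the symbol now resolves elsewhere.
   Anything it already caused to be marked live stays marked. */
static void externalize(ToySym *s) {
  s->external = true;
  s->live = false;
  s->target = -1;
}

/* Count the blocks of ext.o that would still be allocated memory. */
int count_live_blocks(bool externalize_first) {
  ToySym syms[NUM_SYMS];
  init_ext_o(syms);

  if (externalize_first)
    externalize(&syms[SYM_P]);
  mark_live(syms, SYM_GET);
  if (!externalize_first)
    externalize(&syms[SYM_P]);

  int count = 0;
  for (int i = 0; i < NUM_SYMS; ++i)
    if (syms[i].live && !syms[i].external)
      ++count;
  return count;
}
```

In this model `count_live_blocks(true)` leaves only the block for `getValuePointedToByP` live, while `count_live_blocks(false)` also keeps the block for `Y`, because `Y` was marked through `P` before `P` was externalized.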

As mentioned above: there may be other ways to solve this problem by introducing new dead-stripping phases, and there may also be ways to improve the performance of what we have by introducing explicit non-responsibility markers, but these are speculative ideas. I believe that ba26b5ef15dc <https://reviews.llvm.org/rGba26b5ef15dcbfc69f062b1aea6424cdb186e5b0> is the right approach for now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136877/new/

https://reviews.llvm.org/D136877


