[PATCH] D71639: [ELF][PPC64] Improve "call lacks nop" diagnostic and make it compatible with GCC<5.5 and GCC<6.4

Sean Fertile via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 3 12:51:41 PST 2020


sfertile added a comment.

I agree that the runtime symbol preemption makes the failure unlikely. I disagree its not a bug though: it doesn't matter that FORTRAN and C++ have the 'one-definition-rule' because the generated code isn't ABI compliant. A compiler producing a valid C++/Fortran program that isn't ABI compliant is definitely a bug. The case I am much more concerned about is code like this:

      .abiversion 2
  
      .text
      .globl  Nop_Needed
      .p2align        2
      .type   Nop_Needed, at function
  Nop_Needed:
  .Lfunc_begin0:
      blr
      .long   0
      .quad   0
  .Lfunc_end0:
      .size   Nop_Needed, .Lfunc_end0-.Lfunc_begin0
  
      .globl  No_Nop_Needed
      .p2align        2
      .type   No_Nop_Needed, at function
  No_Nop_Needed:                          # @No_Nop_Needed
  .Lfunc_begin1:
  .Lfunc_gep1:
      addis 2, 12, .TOC.-.Lfunc_gep1 at ha
      addi 2, 2, .TOC.-.Lfunc_gep1 at l
  .Lfunc_lep1:
      .localentry     No_Nop_Needed, .Lfunc_lep1-.Lfunc_gep1
      li 3, 0
      b Nop_Needed
      .long   0
      .quad   0
  .Lfunc_end1:
      .size   No_Nop_Needed, .Lfunc_end1-.Lfunc_begin1

I realize now though that if you assemble this with clang instead of gas, we don't get the error because clang will not emit a relocation for `b Nop_Needed`. This is by design ( clang may have done IPO between the 2 function definitions) but breaks ELF shared-object interposition semantics :frown:. If assembled with gas, or if the functions are in different sections, then with both bfd and gold I get the `call lacks nop, can't restore toc;` error with linkers from binutils  2.26,2.27 and 2.29 and don't with LLD after your change.

>   I can add an option to warn such cases, and make it off by default (if it is on by default, Linux distributions will have to turn off this option. And in the future, when we remove the code, the distributions will have to delete it)

Why would Linux distributions have to turn it off? How many Linux distros were compiled with a gcc with this problem? RHEL 7.x will be affected but I think RHEL 8.x has fixed system compilers. Ubuntu and Debian should not need to relax the error either. (Jessie might have the problem but that goes out of service soon).

Is there a way to configure the toolchain driver to add the option when targeting/configuring for RHEL 7.x only? For example I think Ubuntu switched to always PIE with release 18.04, that must be done through the driver configuration. Could we do something similar?

> Sorry for my being rechless...

Sorry I missed this earlier I should have got to reviewing it before Xmas break. FWIW I don't think your change is reckless. This is a huge problem for any of the distros affected and we do need to fix it, and relaxing the error to make LLD work on those systems is unavoidable. I feel strongly that the code given in my example should always be a fatal link time error. It can lead to a shared object that is corrupt when linked into one particular program but perfectly valid when linked into a different program. The case you presented I think can be relaxed although ideally we do it in a way where similar future codegen problems can't sneak in.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71639/new/

https://reviews.llvm.org/D71639





More information about the llvm-commits mailing list