[PATCH] D71100: [lld][RISCV] Fixup PC-relative relocations to undefined weak symbols.

James Clarke via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 5 17:47:58 PST 2019


jrtc27 added inline comments.


================
Comment at: lld/ELF/InputSection.cpp:982
       break;
+    case R_PC:
+      if (config->emachine == EM_RISCV && rel.sym->isUndefWeak() &&
----------------
jrtc27 wrote:
> jrtc27 wrote:
> > MaskRay wrote:
> > > Adding the code here is not correct. The most appropriate place is `getRelocTargetVA`. However, I don't recommend adding any logic at all because weak undefined symbols do not have defined semantics. This means multiple things:
> > > 
> > > 1) their behaviors may be different in different toolchains
> > > 2) their behavior may change with various configurations (-no-pie, -pie, -shared) even in the same toolchain
> > > 
> > > The binutils behaviors may sometimes not be easy to mimic in lld because their internals are so different.
> > > 
> > > I did add the following two lines for EM_PPC, but that was because I had no choice but to support that glibc use case.
> > > 
> > >       else if (config->emachine == EM_PPC)
> > >         dest = p;
> > > 
> > > For anything newly developed, we should try our best to avoid weak undefined symbols.
> > > 
> > > Can you share with me some instructions to build FreeBSD RISC-V?
> > *Calling* weak undefined symbols does not necessarily have well-defined semantics, depending on the architecture, but referencing them absolutely does (namely, the result of the code sequence is NULL), and this is required by many systems, not just FreeBSD. The issue here is that, with `-mcmodel=medany` (LLVM's medium) in position-dependent code generation, the addressing mode is PC-relative. That breaks if the code is linked at an address beyond 2 GiB, as the 32-bit immediate pair is too small to negate PC, yet the whole point of medany is to allow linking code at any address, including above the 2 GiB mark (otherwise you'd just use medlow).
> The alternative is to generate code using the GOT for extern weak symbols, as is done on AArch64 (which also otherwise uses PC-relative addressing), but that would require changes for both GCC and LLVM.
Moreover, even if calling undefined weak functions is not well-defined in the ABI, it still needs to link without error to *something*; otherwise you can't write this perfectly legitimate, common construct:

```
  extern void foo(void) __attribute__((weak));
  if (&foo)
    foo();
```

The call can be turned into whatever you like at link time if the symbol is undefined, because it should be guarded by an `if` (and it's undefined behaviour otherwise, I would assume), but the linker must do something other than fail with an error.
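
To make the medany range problem from the quoted comment concrete, here is a rough sketch of the arithmetic, not anything taken from lld. It assumes RV64, where the `auipc` immediate is a sign-extended 20-bit value shifted left by 12 and the `addi` immediate is a sign-extended 12-bit value. The macro names and the helper `can_reach_null` are hypothetical, chosen only for illustration. An undefined weak symbol resolves to address 0, so the pair has to encode an offset of `-pc`:

```c
#include <stdbool.h>
#include <stdint.h>

/* Offsets reachable by an auipc+addi pair on RV64 (assumption: upper part
   is a sign-extended 20-bit value << 12, lower part is a sign-extended
   12-bit value). */
#define PCREL_MIN (-(INT64_C(1) << 31) - 2048)
#define PCREL_MAX ((INT64_C(1) << 31) - 4096 + 2047)

/* Hypothetical helper: can code located at `pc` form address 0 (the value
   of an undefined weak symbol) with a PC-relative auipc+addi pair? */
static bool can_reach_null(uint64_t pc) {
  if (pc > (uint64_t)INT64_MAX)
    return false;                 /* -pc would not even fit in int64_t */
  int64_t offset = -(int64_t)pc;  /* target (0) minus pc */
  return offset >= PCREL_MIN && offset <= PCREL_MAX;
}
```

Under these assumptions, code placed at `0x10000` can still reach address 0, while code placed at `0x100000000` (4 GiB) cannot, which is the case where the relocation needs some other treatment.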


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71100/new/

https://reviews.llvm.org/D71100




