[PATCH] D72046: [ELF][RISCV] Add test for absolute/relative/branch relocations to undefined weak symbols

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 31 21:21:56 PST 2019


MaskRay added a comment.

In D72046#1800270 <https://reviews.llvm.org/D72046#1800270>, @jrtc27 wrote:

> In D72046#1800268 <https://reviews.llvm.org/D72046#1800268>, @MaskRay wrote:
>
> > In D72046#1800264 <https://reviews.llvm.org/D72046#1800264>, @jrtc27 wrote:
> >
> > > For calls, whilst it differs from BFD, it can work just fine as it’s what AArch64 does.
> >
> >
> >
> >
> > > For data references, however, this is highly inappropriate. They *need* to resolve to 0 as *real world code* relies on that, and it’s what happens on every architecture out there. Code generated by both GCC and LLVM uses PC-relative relocations to refer to external weak symbols. We can argue all we like about whether that’s good or not, but at the end of the day *that is what happens* and we need to not break that.
> >
> > No. GCC/clang emit either absolute or GOT-generating relocations (R_RISCV_HI20+R_RISCV_LO12_I; R_RISCV_GOT_HI20+R_RISCV_PCREL_LO12_I). Such relocations to weak undefined symbols can be relied on by real world code. The linker only has to resolve absolute relocations to 0 and fill GOT entries with 0.
> >
> > PC-relative relocations to undefined weak symbols should not be emitted by compilers. Such hand-written assembly is just broken.
>
>
> No, you’re missing the whole point behind John’s patch, which is to fix LLD so it can link code compiled by either GCC or LLVM. A copy of my comment on his patch so it’s in this discussion chain:
>
> The problem is and always has been `-mcmodel=medany`. Neither GCC nor LLVM use the GOT in this case, as it’s not `-fpic`/`-fpie`. This is the case we care about for the kernel. Would I have made GCC do this? No. But it does, and we have to deal with that in a sensible way. Making it an error and changing GCC and LLVM is one such way. But that hasn’t happened.


I checked GCC/clang -fno-pic behaviors on small and large code models. (Tiny/medium is similar to either).

  extern long v __attribute__((weak));
  long foo() { return v; }

gcc -fno-pic

aarch64 -mcmodel={small,large}: GOT variant (.rodata.cst8)
ppc64 -mcmodel={small,large}: GOT variant (.toc)
x86_64 -mcmodel=small: PC relative
x86_64 -mcmodel=large: absolute
riscv -mcmodel=medlow: absolute
riscv -mcmodel=medany: PC relative

clang -fno-pic

aarch64 -mcmodel=small: GOT variant (.rodata.cst8)
aarch64 -mcmodel=large: absolute (R_AARCH64_MOVW_UABS_G0_NC+R_AARCH64_MOVW_UABS_G1_NC+R_AARCH64_MOVW_UABS_G2_NC+R_AARCH64_MOVW_UABS_G3)
ppc64 -mcmodel={small,large}: GOT variant (.toc)
riscv -mcmodel=medlow: absolute (R_RISCV_HI20+R_RISCV_LO12_I)
riscv -mcmodel=medany: PC relative (R_RISCV_PCREL_HI20+R_RISCV_PCREL_LO12_I)

Note that aarch64 and ppc64 use either absolute or GOT variant, but not PC relative. So they can make `R_PC` to undefined weak resolved to an arbitrary value. x86_64 -mcmodel=small uses PC relative, so it has to use `sym.getVA(a)`.

I thought RISC-V was in the aarch64/ppc64 camp but it turns out that its -mcmodel=medany isn't. If it behaves like aarch64/ppc64,

  else if (config->emachine == EM_PPC || config->emachine == EM_RISCV)
    dest = p + a;

will be a very nice solution.

`-mcmodel=medany -fno-pic` codegen is indeed strange. The external variable access code sequence `lla a5, v` expands to `auipc a5, ...; addi a5,a5,...`. `riscv64-linux-musl-ld -Ttext=0x200000000 a.o` can rewrite the sequence to `lui a5,0; mv a5,a5` (set a5 to 0).

@bsdjhb I need to understand more why the FreeBSD kernel uses `-mcmodel=medany -fno-pic`, but not `-mcmodel=medany -fpie`.

I want to make a proposal: unify `-mcmodel=medany -fno-pic` and `-mcmodel=medany -fpic`. If the variable turns out to be defined locally, relax `ld a5,...(a5)` to `addi a5,a5,...` (think of x86 GOTPCRELX). @jrtc27 the code sequences are of the same length, so this should work in theory. Am I right? If so, I'll investigate how to make llvm codegen do this. medany is largely underdocumented (am i missing something?) I will make a comment at the psABI side. If there is better place discussing this issue, please tell me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72046/new/

https://reviews.llvm.org/D72046





More information about the llvm-commits mailing list