[lld] Add target support for SystemZ (s390x) (PR #75643)

Ulrich Weigand via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 20 09:23:51 PST 2024


uweigand wrote:

> .plt and .plt.got are mutually exclusive. You can think of .plt.got in mold as an optimization; if we already have a .got entry for a symbol, we don't need to resolve it again lazily at runtime because its address is already available in .got at load-time. That's why we have .plt.got besides .plt.

Agreed - this optimization seems fine in any case.
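
As a rough illustration of the rule quoted above, the choice between the two sections can be modeled like this. It is only a sketch: `Symbol`, `has_got_entry`, and `choose_plt_section` are hypothetical names made up for this example, not mold or lld APIs.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    # Hypothetical flag: True if some relocation already forced a .got slot.
    has_got_entry: bool


def choose_plt_section(sym: Symbol) -> str:
    """Return the section that would hold the symbol's PLT stub."""
    # A .got slot is filled in at load time, so the stub can jump through it
    # directly and lazy resolution would be redundant.
    if sym.has_got_entry:
        return ".plt.got"
    # Otherwise use a regular lazy stub that goes through .got.plt.
    return ".plt"
```

A symbol ends up in exactly one of the two sections, which is the "mutually exclusive" property described in the quote.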
 
> > Even in the "fast" path (once lazy resolution has happened), we now always use basr. This clobbers r0 - which is allowed by the ABI, but there might be some complications with other stubs maybe (e.g. mcount stubs? we'd need to check). Also, the basr might have performance implications as on some microarchitecture implementations it might confuse the call/return stack tracking by the branch predictor.

One update here: a regular `_mcount` call does not need to preserve r0. However, a call to `__fentry__` *does* need to preserve r0 - fentry works by adding the instruction
```
brasl %r0, __fentry__
```
as the first instruction of each function. The `__fentry__` routine in glibc expects r0 to point to the callee and r14 to the caller. See https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/s390/s390-64/s390x-mcount.h for more details. It looks like this would not work with the current mold PLT implementation, because the basr in the PLT stub would overwrite r0 before `__fentry__` runs, unless the linker specifically recognizes `__fentry__`?
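
To make the r0 concern concrete, here is a toy register model. It is a sketch under assumptions: it does not reproduce the actual mold stub or glibc code, it only models "brasl saves the return address in r0" and "a stub that links through r0 overwrites it". All function names and addresses are invented for illustration.

```python
def fentry(regs):
    """Toy __fentry__: reports what it believes the callee and caller are."""
    return {"callee": regs["r0"], "caller": regs["r14"]}


def brasl_r0(regs, return_addr):
    """Model of `brasl %r0, target`: the return address is saved in r0."""
    regs["r0"] = return_addr


def plt_stub_using_basr_r0(regs, stub_return_addr):
    """Model of a PLT stub that uses r0 as the basr link register (assumption)."""
    regs["r0"] = stub_return_addr


def profile_call(callee_ret, caller_ret, via_clobbering_plt, stub_ret=0x9999):
    # r14 holds the return address into the caller, per the Linux ABI.
    regs = {"r0": 0, "r14": caller_ret}
    # The instrumented function starts with `brasl %r0, __fentry__`.
    brasl_r0(regs, callee_ret)
    if via_clobbering_plt:
        # The call is routed through a stub that overwrites r0.
        plt_stub_using_basr_r0(regs, stub_ret)
    return fentry(regs)
```

With `via_clobbering_plt=False` the toy `fentry` sees the callee address that brasl stored; with `True` it sees the stub's address instead, which is the failure mode described above.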
 
> That's what I thought too. I thought that since the ABI requires r14 to be used as a return address, basr with other register as a return address wouldn't be considered as a function call in the microarchitecture and doesn't confuse the Return Address Stack, but that's just my assumption. It'd be awesome if you can ask the processor team how RAS works on s390x.

r14 is just a Linux ABI convention, not required by hardware.  And in fact other ABIs used on the platform tend to use different return registers (e.g. z/OS XPLINK uses r4).   It would never make sense to use r0 specifically since you cannot return via r0, but I'm not sure if the hardware call-return predictor checks for that.

> > The addresses in .got.plt now no longer implement the target function ABI, but implicitly require r0 and r1 to be set up correctly. This would break "PLT inlining" via the R_390_GOTPLT family of relocations. This is not currently used by the default toolchains, but as long as the relocations are there, I guess they need to work as expected ...
> 
> PLT inlining implies disabling lazy symbol resolution, no? 

The intent was that PLT inlining should specifically *not* disable lazy symbol resolution. That's the whole point of adding the new GOTPLT relocations - otherwise the compiler could just use normal GOT relocations ...
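
The combination of lazy resolution and PLT inlining can be sketched with a toy model in which a dictionary stands in for .got.plt and Python callables stand in for code addresses. This is an illustration only: `make_got_plt` and `inlined_plt_call` are hypothetical helpers, not part of any linker or runtime.

```python
from typing import Callable, Dict, List, Tuple


def make_got_plt(
    targets: Dict[str, Callable[..., object]]
) -> Tuple[Dict[str, Callable[..., object]], List[str]]:
    """Build a toy .got.plt whose slots start out pointing at lazy stubs."""
    got_plt: Dict[str, Callable[..., object]] = {}
    resolved: List[str] = []  # names resolved so far, for illustration only

    def lazy_stub(name: str) -> Callable[..., object]:
        def stub(*args):
            # First call: resolve the symbol, patch the slot, forward the call.
            got_plt[name] = targets[name]
            resolved.append(name)
            return got_plt[name](*args)
        return stub

    for name in targets:
        got_plt[name] = lazy_stub(name)
    return got_plt, resolved


def inlined_plt_call(got_plt, name, *args):
    # "PLT inlining": the caller loads the slot itself and calls through it,
    # passing only the normal function arguments. This works only if whatever
    # the slot points at follows the target function's calling convention.
    return got_plt[name](*args)
```

In this model the first inlined call still triggers resolution and later calls go straight to the target. If the slot's initial contents needed extra hidden inputs (the analogue of r0 and r1 being set up), `inlined_plt_call` would have no way to supply them.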


https://github.com/llvm/llvm-project/pull/75643

