[PATCH] D125036: [RISCV] Alignment relaxation
Greg McGary via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun May 22 16:54:56 PDT 2022
gkm marked 17 inline comments as done.
gkm added inline comments.
================
Comment at: lld/ELF/Arch/RISCV.cpp:40
uint64_t val) const override;
+ void finalizeSections() const override;
};
----------------
MaskRay wrote:
> The name is unnecessarily the same as `Writer::finalizeSections`. `relaxSections` may be a better name.
This is an override of target-independent `virtual void TargetInfo::finalizeSections()`, which is meant to be a generic hook for targets to do target-specific things. For RISC-V, that target-specific thing happens to be relaxation.
================
Comment at: lld/ELF/Arch/RISCV.cpp:522
+ if (incr) {
+ adjustRanges.push_back({r.offset, incr});
+ delta += incr;
----------------
MaskRay wrote:
> emplace_back
```
/scratch/llvm-project/lld/ELF/Arch/RISCV.cpp:523:18: error: no matching member function for call to 'emplace_back'
adjustRanges.emplace_back({r.offset, incr});
~~~~~~~~~~~~~^~~~~~~~~~~~
/scratch/llvm-project/llvm/include/llvm/ADT/SmallVector.h:927:45: note: candidate template ignored: substitution failure: deduced incomplete pack <(no value)> for template parameter 'ArgTypes'
template <typename... ArgTypes> reference emplace_back(ArgTypes &&... Args) {
~~~~~~~~ ^
```
The only way I have found to code this without complaint from clang is this:
```
AdjustRange range {r.offset, incr};
adjustRanges.push_back(range);
```
Godbolt reveals that the generated code is substantially the same at -O2 for `push_back()` and `emplace_back()`.
================
Comment at: lld/ELF/Arch/RISCV.cpp:591
+void RISCV::finalizeSections() const {
+ // Can't perform relaxation if it is not a final link.
+ if (config->relocatable)
----------------
MaskRay wrote:
> Don't perform relaxation for a relocatable link.
The comment is redundant and provides no insight, so I removed it.
================
Comment at: lld/ELF/InputSection.cpp:188
+ SmallVector<SymbolAddr, 0> symbolAddrs;
+ for (Symbol *sym : file->getSymbols())
+ if (Defined *d = dyn_cast<Defined>(sym))
----------------
MaskRay wrote:
> If a file compiled with -ffunction-sections has N text sections/function symbols, this takes O(N^2). I think we should build a map to decrease the time complexity. You can add a map to `InputFile`. It's unfortunate to increase the size of `InputFile` which it is acceptable.
Adding a map to `InputFile` is unnecessary, especially for something specific to one target.
================
Comment at: lld/ELF/InputSection.cpp:194
+ }
+ llvm::sort(symbolAddrs, [](const SymbolAddr &a, const SymbolAddr &b) {
+ return a.offset < b.offset;
----------------
MaskRay wrote:
> stable_sort
>
> Two symbols may be defined at the same offset.
For this algorithm, retaining the original sequence for symbols at the same offset is irrelevant.
================
Comment at: lld/ELF/InputSection.cpp:238
+ size_t length = srcN - src0;
+ if (dest < src0)
+ std::copy_backward(src0, srcN, dest + length);
----------------
MaskRay wrote:
> The condition isn't correct. It may trigger the undefined behavior src0 < dest+length < srcN.
>
> ---
>
> We should avoid `std::copy` when
>
> > The behavior is undefined if d_first is within the range [first, last)
>
> Otherwise we should use `std::copy`.
>
> ---
>
> I am not sure the code works with adding bytes. In that case we need to get the NOP pattern.
You are correct that the condition was wrong. This is an arch-independent loop that is responsible only for moving code that falls outside the adjust ranges. Arch-dependent code is responsible for filling contents of expanded adjust ranges. See `fillAlignNops()` in `Arch/RISCV.cpp`.
It is not yet possible to test expanding adjust ranges. The first pass of `relaxOnce()` will only shrink ranges. Only on a subsequent pass will we ever grow an alignment pad, or undo a call/jump/load/store/arithmetic relaxation. Alignment alone without other relaxations will always complete in one pass.
================
Comment at: lld/test/ELF/riscv-relax-align-rvc.s:38
+ c.add a0, a1
+.balign 16
+ c.add s0, s1
----------------
MaskRay wrote:
> The test needs many `.balign {8,16}` directives to test `InputSectionBase::adjustRanges`.
>
> The current `InputSectionBase::adjustRanges` is flawed but the tests aren't able to detect that.
How //many// `.balign` directives do you think are necessary? I think the important case is where moved bytes overlap. Since ranges can only shrink in this diff, we can only test for `src0 < dest+length < srcN`, but not for `src0 < dest < srcN`.
Note that in libc++, the implementations of `std::copy()` and `std::copy_backward()` both call `__builtin_memmove()`, which is bidirectionally safe and works no matter how the regions overlap.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D125036/new/
https://reviews.llvm.org/D125036
More information about the llvm-commits
mailing list