[PATCH] D125036: [RISCV] Alignment relaxation

Greg McGary via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun May 22 16:54:56 PDT 2022


gkm marked 17 inline comments as done.
gkm added inline comments.


================
Comment at: lld/ELF/Arch/RISCV.cpp:40
                 uint64_t val) const override;
+  void finalizeSections() const override;
 };
----------------
MaskRay wrote:
> The name is unnecessarily the same as `Writer::finalizeSections`. `relaxSections` may be a better name.
This is an override of target-independent `virtual void TargetInfo::finalizeSections()`, which is meant to be a generic hook for targets to do target-specific things. For RISC-V, that target-specific thing happens to be relaxation.


================
Comment at: lld/ELF/Arch/RISCV.cpp:522
+  if (incr) {
+    adjustRanges.push_back({r.offset, incr});
+    delta += incr;
----------------
MaskRay wrote:
> emplace_back
```
/scratch/llvm-project/lld/ELF/Arch/RISCV.cpp:523:18: error: no matching member function for call to 'emplace_back'
    adjustRanges.emplace_back({r.offset, incr});
    ~~~~~~~~~~~~~^~~~~~~~~~~~
/scratch/llvm-project/llvm/include/llvm/ADT/SmallVector.h:927:45: note: candidate template ignored: substitution failure: deduced incomplete pack <(no value)> for template parameter 'ArgTypes'
  template <typename... ArgTypes> reference emplace_back(ArgTypes &&... Args) {
                        ~~~~~~~~            ^
```
The only way I have found to code this without complaint from clang is this:
```
AdjustRange range {r.offset, incr};
adjustRanges.push_back(range);
```
Godbolt reveals that the generated code is substantially the same at -O2 for `push_back()` and `emplace_back()`.


================
Comment at: lld/ELF/Arch/RISCV.cpp:591
+void RISCV::finalizeSections() const {
+  // Can't perform relaxation if it is not a final link.
+  if (config->relocatable)
----------------
MaskRay wrote:
> Don't perform relaxation for a relocatable link.
The comment is redundant and provides no insight, so I removed it.


================
Comment at: lld/ELF/InputSection.cpp:188
+  SmallVector<SymbolAddr, 0> symbolAddrs;
+  for (Symbol *sym : file->getSymbols())
+    if (Defined *d = dyn_cast<Defined>(sym))
----------------
MaskRay wrote:
> If a file compiled with -ffunction-sections has N text sections/function symbols, this takes O(N^2). I think we should build a map to decrease the time complexity. You can add a map to `InputFile`. It's unfortunate to increase the size of `InputFile` which it is acceptable.
Adding a map to `InputFile` is unnecessary, especially for something specific to one target.


================
Comment at: lld/ELF/InputSection.cpp:194
+      }
+  llvm::sort(symbolAddrs, [](const SymbolAddr &a, const SymbolAddr &b) {
+    return a.offset < b.offset;
----------------
MaskRay wrote:
> stable_sort
> 
> Two symbols may be defined at the same offset. 
For this algorithm, retaining the original sequence for symbols at the same offset is irrelevant.


================
Comment at: lld/ELF/InputSection.cpp:238
+    size_t length = srcN - src0;
+    if (dest < src0)
+      std::copy_backward(src0, srcN, dest + length);
----------------
MaskRay wrote:
> The condition isn't correct. It may trigger the undefined behavior src0 < dest+length < srcN.
> 
> ---
> 
> We should avoid `std::copy` when
> 
> > The behavior is undefined if d_first is within the range [first, last)
> 
> Otherwise we should use `std::copy`.
> 
> ---
> 
> I am not sure the code works with adding bytes. In that case we need to get the NOP pattern.
You are correct that the condition was wrong. This is an arch-independent loop that is responsible only for moving code that falls outside the adjust ranges. Arch-dependent code is responsible for filling contents of expanded adjust ranges. See `fillAlignNops()` in `Arch/RISCV.cpp`.

It is not yet possible to test expanding adjust ranges. The first pass of `relaxOnce()` will only shrink ranges. Only on a subsequent pass will we ever grow an alignment pad, or undo a call/jump/load/store/arithmetic relaxation. Alignment alone without other relaxations will always complete in one pass.


================
Comment at: lld/test/ELF/riscv-relax-align-rvc.s:38
+  c.add a0, a1
+.balign 16
+  c.add s0, s1
----------------
MaskRay wrote:
> The test needs many `.balign {8,16}` directives to test `InputSectionBase::adjustRanges`.
> 
> The current `InputSectionBase::adjustRanges` is flawed but the tests aren't able to detect that.
How //many// `.balign` directives do you think are necessary? I think the important case is where moved bytes overlap. Since ranges can only shrink in this diff, we can only test for `src0 < dest+length < srcN`, but not for `src0 < dest < srcN`.

Note that in libc++, the implementations of `std::copy()` and `std::copy_backward()` both call `__builtin_memmove()`, which is bidirectionally safe and works no matter how the regions overlap.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125036/new/

https://reviews.llvm.org/D125036



More information about the llvm-commits mailing list