[PATCH] D134600: [RISCV][LLD] Add RISCV zcmt optimise in linker relaxation

Tue Nov 21 04:54:09 PST 2023

nemanjai added a comment.

I am only starting my involvement with RISC-V and thereby have much less experience than most of the other reviewers. Especially when it comes to `lld`. So if any of the more experienced reviewers disagree with any of my suggestions, please favour their suggestions over mine.

I for one certainly would like to see support for `Zcmt` land in `lld` sooner rather than later, so I have provided a number of comments that I am hoping can at least help advance the discussion even if they do not need to be incorporated.

Finally, it is not clear to me where the `JVT` is set (i.e. `csrw  jvt, <reg with base address>`). Is that set in some library or the `crt` files? I imagine that is documented somewhere, but I think it might be useful to add a link to that documentation as a comment to the entry point for this relaxation (`relaxTableJump()`).

================
Comment at: lld/ELF/Arch/RISCV.cpp:1167
+
+uint32_t TableJumpSection::getEntry(
+    const Symbol *symbol, uint32_t maxSize,
----------------
If a function returns an index, I think it should be called `getIndex()` rather than `getEntry()`.

================
Comment at: lld/ELF/Arch/RISCV.cpp:1172-1181
+  uint32_t i = 0;
+
+  for (; i < entriesList.size() && i <= maxSize; i++) {
+    // If this is a duplicate addition, do not add it and return the address
+    // offset of the original entry.
+    if (symbol == entriesList[i].first) {
+      return i;
----------------
Can this not just be something like:
```
  // Find this symbol in the ordered list of entries if it exists.
  assert(maxSize >= entriesList.size() &&
         "Finalized vector of entries exceeds maximum");
  auto idx = std::find_if(
      entriesList.begin(), entriesList.end(),
      [symbol](llvm::detail::DenseMapPair<const Symbol *, int> &e) {
        return e.first == symbol;
      });
  if (idx == entriesList.end())
    return entriesList.size();
  return idx - entriesList.begin();
```

================
Comment at: lld/ELF/Arch/RISCV.cpp:1199
+
+      int gain = 6;
+      if (sec.relaxAux->relocTypes[i] == R_RISCV_RVC_JUMP)
----------------
I think it is useful to not name the opposite of what they mean (even though I do understand that `gain` is wrt. the goal of code size reduction). It is probably to change all references to `gain` to `codeSizeReduction` or `csReduction` if that's to long.

================
Comment at: lld/ELF/Arch/RISCV.cpp:1225
+
+  if (finalizedCMJALTEntries.size() > 0) {
+    int gainRequired = maxCMJTEntrySize * config->wordsize;
----------------
I think it would be useful to document this. Perhaps something like:
```
// The total reduction in code size is J - JA + JTS - JAE.
// Where:
// J = number of bytes saved for all the cm.jt instructions emitted
// JA = number of bytes saved for all the cm.jalt instructions emitted
// JTS = size of the part of the table for cm.jt jumps (i.e. 32 x worsize)
// JAE = number of entries emitted for the cm.jalt jumps x wordsize
```
Plus it seems like a much more readable implementation would be something like:
```
int sizeIncrease = getSize();
for (auto entry : finalizedCMJTEntries) {
  if (sizeIncrease < 0)
    break;
  sizeIncrease -= entry.second;
}
for (auto entry : finalizedCMJALTEntries) {
  if (sizeIncrease < 0)
    break;
  sizeIncrease -= entry.second;
}
```
In fact, it might be useful to just implement a function `getSizeReduction()` similar to `getSize()` that would simply return the amount of code size reduction for all the table entries.

================
Comment at: lld/ELF/Arch/RISCV.cpp:1244
+
+SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+TableJumpSection::finalizeEntry(llvm::DenseMap<const Symbol *, int> EntryMap,
----------------
```
/// Sort the map in decreasing order of the amount of code reduction provided
/// by the entries. Drop any entries that can't fit in the map from the tail
/// end since they provide less code reduction. Drop any entries that cause
/// an increase in code size (i.e. the reduction from instruction conversion
/// does not cover the code size gain from adding a table entry).
```

================
Comment at: lld/ELF/Arch/RISCV.cpp:1264
+
+  // drop the item which has negitive effect
+  while (finalizedVector.size()) {
----------------
`// Drop any items that have a negative effect (i.e. increase code size).`

================
Comment at: lld/ELF/Arch/RISCV.cpp:1292
+               finalizedCMJTEntries);
+  padWords(buf + ((startCMJTEntryIdx + finalizedCMJTEntries.size()) *
+                  config->wordsize),
----------------
Is this needed if there are no entries in `finalizedCMJALTEntries`?

================
Comment at: lld/ELF/Arch/RISCV.cpp:1313-1314
+  for (const auto &entry : entriesList) {
+    if (entry.second == 0)
+      continue;
+    // Use the symbol from in.symTab to ensure we have the final adjusted
----------------
What is this for? How can there be a symbol in the vector that has a zero "gain"? Wouldn't that just mean that it doesn't have any calls? Seems more like this should be an assert.

================
Comment at: lld/ELF/Options.td:323

+def riscv_tbljal: FF<"riscv-tbljal">,
+  HelpText<"(RISC-V only) Enable table jump instructions from the Zcmt extension">;
----------------
I find the name and description somewhat unnatural for what this does. I think something like this would feel more natural:

```
riscv-tbljmp
"Enable conversion of call instructions to table jump instruction from the Zcmt extension for frequently called functions (RISC-V only)"
```
Also, is an option perhaps too coarse-grained? Is it feasible that a user would want to use table jumps within some objects but not in others? Namely, if the user compiles some objects with no `Zcmt`, shouldn't the linker respect that?

================
Comment at: lld/ELF/Target.h:38
+  virtual void writeTableJumpHeader(uint8_t *buf) const {};
+  virtual void writeTableJump(uint8_t *buf, const uint64_t symbol) const {};
   virtual void writeIgotPlt(uint8_t *buf, const Symbol &s) const {}
----------------
I think it would be clearer if this had `entry` or `target` in the name since it is writing a table entry (a target address).

================
Comment at: lld/ELF/Writer.cpp:1680
+        // Jump table.
+        if (in.riscvTableJumpSection) {
+          for (InputSectionBase *inputSection : ctx.inputSections) {
----------------
I wonder if it would be possible and/or advantageous to produce the jump table and convert the instructions before we begin the process of relaxation. I am not familiar enough with the process here so this may not make sense, but this is how I see it and why I'm suggesting this...

1. This adds another source of volatility with relaxation, possibly making it more difficult to reach a fixed point
2. If we compile with `-ffunction-sections`, the linker has the ability to lay out the sections that contain functions for which we will use the jump table as far away from the rest of text as possible. This will keep the remaining functions closer and make them more likely to be possible to relax. Of course, the functions for which we're emitting jump table entries will themselves have calls, so an optimal layout is not feasible to compute, but it is likely useful to sort such functions late.

================
Comment at: lld/test/ELF/riscv-tbljal-call.s:1
+# REQUIRES: riscv
+
----------------
I think it would be useful to add checks for the layout of the jump table including the padding.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134600/new/

https://reviews.llvm.org/D134600