[PATCH] D94627: [PowerPC][PC Rel] Implement option to omit P10 instructions from stubs

Wed Jan 27 04:37:41 PST 2021

stefanp added a comment.

I have a few comments. Most of them are nits but there is a functional issue as well.

For the testing:
Do we have a test for the `PPC64R2SaveStub` in the situation where the offset fits in 34 bits?

================
Comment at: lld/ELF/Driver.cpp:768
+  // This handles the --no-power10-stubs option.
+  bool NoP10 = args.hasArg(OPT_no_power10_stubs);
+
----------------
nit:
You can probably get rid of this temp and have the condition down below change to:
```
if (!args.hasArg(OPT_power10_stubs_eq) &&
    args.hasArg(OPT_no_power10_stubs))
  return P10Stub::No;
```
Mainly because you only use `NoP10` in one place at this point.

================
Comment at: lld/ELF/Options.td:445

+def power10_stubs: F<"power10-stubs">, HelpText<"Alias for --power10-stubs=yes">;
+
----------------
nit:
This is not checked in `getP10StubOpt`.
This is no longer an alias for `power10-stubs=yes` it is an alias for default.

================
Comment at: lld/ELF/Thunks.cpp:932
+        write32(buf + 4, addis); // addis r12, 0, top of offset
+        write32(buf + 8, addi);  // addi  r12, r12, bottom of offset
+        nextInstOffset = 12;
----------------
This sequence does not match with what you are using as an offset.
The `tocOffset` is the difference between the TOC pointer (in r2) and the address of the destination function (the callee). However, the add instructions are using that offset and adding it to zero (`addis r12, 0, top of offset`) which is not the TOC pointer. At the end of this sequence the register `r12` will not have the address of the callee but simply the difference between that address and the TOC pointer.

Since you have a valid `r2` and are computing the offset from the TOC pointer you should be adding to that.

================
Comment at: lld/ELF/Thunks.cpp:935
+      } else {
+        write32(buf + 4, addi); // addi r12, 0, offset
+        nextInstOffset = 8;
----------------
Same here as above.

================
Comment at: lld/ELF/Thunks.cpp:971
+    uint32_t d = destination.getVA(addend);
+    uint32_t off = d - getThunkTargetSym()->getVA();
+    write32(buf + 0, 0x7c0802a6);                      // mflr r12
----------------
nit:
The variable `d` is only used in one place so you can just inline it.
Also, for the future try to avoid single letter variable names.

================
Comment at: lld/ELF/Thunks.cpp:977
+    write32(buf + 16, 0x3d8c0000 | ha(off));           // addis r12,r11,off at ha
+    write32(buf + 20, 0x398c0000 | (uint16_t)(off)-8); // addi r12,r12,off at l
+    nextInstOffset = 24;
----------------
You need to have the `off-8` in both cases. The way that this is done is quite confusing beause in one case you subtract 8 in the lambda and in the other you just subtract here inline. I would say just subtract in the computation of `off` and then you don't have to do it twice in two different places.

Secondly, instead of casting to `uint16_t` you are better off just using a mask (`off & 0xffff`).

================
Comment at: lld/ELF/Thunks.cpp:1010
+    uint32_t d = destination.getVA(addend);
+    uint32_t off = d - getThunkTargetSym()->getVA();
+    write32(buf + 0, 0x7c0802a6);            // mflr r12
----------------
nit:
Similar to above it is probably better to compute `off` with the `-8` included than to add the adjustment to the lambdas. 

================
Comment at: lld/ELF/Thunks.cpp:1057
+  if (config->Power10Stub == P10Stub::No) {
+    auto ha = [](uint32_t v) -> uint16_t { return (v + 0x8000 - 8) >> 16; };
+    uint32_t d = destination.getVA(addend);
----------------
I see that these lambdas are used in a lot of places to do the same thing. It might be better to have a static function instead of defining the same lambda three times.

================
Comment at: lld/ELF/Thunks.cpp:1059
+    uint32_t d = destination.getVA(addend);
+    uint32_t off = d - getThunkTargetSym()->getVA();
+    write32(buf + 0, 0x7c0802a6);                      // mflr r12
----------------
nit: Same as above. Just add compute the offset with the 8 difference here. 

================
Comment at: lld/test/ELF/ppc64-pcrel-call-to-toc.s:18

+# RUN: llvm-mc -filetype=obj -triple=powerpc64 %s -o %t.o
+# RUN: ld.lld -T %t.script %t.o -o %t --no-power10-stubs
----------------
nit:
Do the LE version of the test instead of the BE version. 

================
Comment at: llvm/include/llvm/Object/ELF.h:94
+  LD_R12_NO_DISP = 0xE9800000,
+  LD_R12_TO_R12_NO_DISP = 0xE98C0000,
   MTCTR_R12 = 0x7D8903A6,
----------------
Are these two instruction masks used?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94627/new/

https://reviews.llvm.org/D94627