[llvm] [BOLT][AArch64] Implement PLTCall optimization (PR #93584)

Pavel Samolysov via llvm-commits llvm-commits at lists.llvm.org
Wed May 29 20:29:00 PDT 2024


================
@@ -1055,6 +1055,52 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     return true;
   }
 
+  bool convertCallToIndirectCall(BinaryBasicBlock &BB,
+                                 BinaryBasicBlock::iterator &It,
+                                 const MCSymbol *TargetLocation,
+                                 MCContext *Ctx) override {
+    // Generated code:
+    // adrp	x16, <symbol>
+    // ldr	x17, [x16, #<offset>]
+    // bl <label> -> blr	x17  (or convert 'b -> br' for tail calls)
+
+    MCInst &InstCall = *It;
+    bool IsTailCall = isTailCall(InstCall);
+    assert((InstCall.getOpcode() == AArch64::BL ||
+            (InstCall.getOpcode() == AArch64::B && IsTailCall)) &&
+           "64-bit direct (tail) call instruction expected");
+
+    // Convert the call to an indirect one by modifying the instruction.
+    InstCall.clear();
+    InstCall.setOpcode(IsTailCall ? AArch64::BR : AArch64::BLR);
+    InstCall.addOperand(MCOperand::createReg(AArch64::X17));
+    if (IsTailCall)
+      setTailCall(*It);
+
+    // Prepend instructions to load PLT call address from the input symbol.
+
+    MCInst InstLoad;
+    InstLoad.setOpcode(AArch64::LDRXui);
+    InstLoad.addOperand(MCOperand::createReg(AArch64::X17));
+    InstLoad.addOperand(MCOperand::createReg(AArch64::X16));
+    InstLoad.addOperand(MCOperand::createImm(0));
+    setOperandToSymbolRef(InstLoad, /* OpNum */ 2, TargetLocation,
+                          /* Addend */ 0, Ctx, ELF::R_AARCH64_LD64_GOT_LO12_NC);
+    It = BB.insertInstruction(It, InstLoad);
+
+    MCInst InstAdrp;
----------------
samolisov wrote:

Why is the `LDR` created before the corresponding `ADRP` even though it ends up after it in the output? If I understand correctly, the resulting code will contain the following snippet:

```
adrp reg1, <symbol>
ldr  reg2, [reg1, <offset>]
```

I believe the code is more readable when the instructions are created in the same order in which they appear in the result, without juggling the insertion point (`It`). This would also let us increment `It` after every instruction instead of using a magic constant (`2` in `It = It + 2`). Is there a dependency between `InstLoad` and `InstAdrp` that makes it impossible to mirror the generated code?
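
To make the two orderings concrete, here is a minimal sketch that models a basic block as a plain `std::vector<std::string>` rather than a `BinaryBasicBlock`. The `Block` alias and the helpers `insertReversed` and `insertInOrder` are hypothetical names for illustration only, not part of BOLT, and the assumption is that `insertInstruction` behaves like `std::vector::insert` (it inserts before the iterator and returns an iterator to the new element).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block: a vector of instruction strings.
using Block = std::vector<std::string>;

// Ordering as in the patch: rewrite the call first, then insert LDR and
// ADRP at the same position, so the instruction inserted last lands first.
Block::iterator insertReversed(Block &BB, Block::iterator It) {
  *It = "blr x17";
  It = BB.insert(It, "ldr x17, [x16, #<offset>]");
  It = BB.insert(It, "adrp x16, <symbol>");
  return It + 2; // magic constant: skip ADRP and LDR to reach the call
}

// Ordering suggested in the review: emit instructions in output order and
// advance the iterator by one after each insertion.
Block::iterator insertInOrder(Block &BB, Block::iterator It) {
  It = BB.insert(It, "adrp x16, <symbol>");
  ++It; // now points at the original call again
  It = BB.insert(It, "ldr x17, [x16, #<offset>]");
  ++It; // now points at the original call again
  *It = "blr x17";
  return It;
}
```

In this model both helpers leave the block in the same state and return an iterator to the rewritten call, so the choice between them is purely about readability, at least as long as nothing in the real code depends on the order in which the `MCInst` objects are built.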

https://github.com/llvm/llvm-project/pull/93584

