[PATCH] D79785: [ARM] Register pressure with -mthumb forces register reload before each call

Prathamesh via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 12 07:30:02 PDT 2020


prathamesh created this revision.
prathamesh added a reviewer: john.brawn.
Herald added subscribers: llvm-commits, danielkiss, hiraditya, kristof.beyls.
Herald added a project: LLVM.

Hi,
Compiling the following test case (reduced from the uECC_shared_secret function in the tinycrypt library) with -Oz on armv6-m:

typedef unsigned char uint8_t;
extern uint8_t x1;
extern uint8_t x2;

void foo(uint8_t *, unsigned, unsigned);

void uECC_shared_secret(uint8_t *private_key, unsigned num_bytes, unsigned num_words)
{
  foo(private_key, num_bytes, num_words);
  foo(private_key, num_bytes, num_words);
  foo(private_key, num_bytes, num_words);
}

results in an ldr of the function's address before each blx call:

  ldr     r3, .LCPI0_0
  blx     r3
  mov     r0, r6
  mov     r1, r5
  mov     r2, r4
  ldr     r3, .LCPI0_0
  blx     r3
  ldr     r3, .LCPI0_0
  mov     r0, r6
  mov     r1, r5
  mov     r2, r4
  blx     r3

.LCPI0_0:
  .long   foo
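
To put rough numbers on why the reload matters at -Oz, here is a small sketch of the code-size arithmetic. It assumes the usual Thumb-1 encoding sizes (bl is 4 bytes, blx with a register operand and a pc-relative ldr are 2 bytes each, and the literal pool entry is 4 bytes), and it ignores the argument moves and any literal pool alignment padding. The function names are made up for illustration.

```cpp
// Assumed Thumb-1 sizes in bytes; illustrative constants, not taken from the patch.
constexpr unsigned kBL = 4;       // bl <symbol>
constexpr unsigned kBLXr = 2;     // blx <reg>
constexpr unsigned kLDRpci = 2;   // ldr <reg>, <literal>
constexpr unsigned kLiteral = 4;  // one .long entry in the literal pool

// n direct calls: one bl per call.
constexpr unsigned directSize(unsigned n) { return n * kBL; }

// n indirect calls where the address is loaded once and stays in a register.
constexpr unsigned indirectSize(unsigned n) {
  return kLDRpci + kLiteral + n * kBLXr;
}

// n indirect calls where the address is reloaded before every call,
// all loads sharing a single literal pool entry.
constexpr unsigned reloadedSize(unsigned n) {
  return kLiteral + n * (kLDRpci + kBLXr);
}
```

Under these assumptions, three calls cost 12 bytes as direct calls, 12 bytes as indirect calls with a single load, and 16 bytes with a reload before each call, so the reloaded form is the worst of the three.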

As John Brawn explained in http://lists.llvm.org/pipermail/llvm-dev/2020-April/140712.html,
this happens because:
(1) ARMTargetLowering::LowerCall prefers an indirect call when there are 3 or more calls to the same function in the same basic block.
(2) For Thumb1, we have only 3 callee-saved registers available (r4-r6).
(3) The function has 3 arguments and needs one extra register to hold its address.
(4) So one of the registers has to be spilled. The register holding the function's address gets split since it can be rematerialized, and we end up with an ldr before each call.
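
The four points above can be sketched as a toy model. This is not the LLVM implementation: CallSite, prefersIndirectCall, liveAcrossCalls and mustSpill are hypothetical names, and the real check in LowerCall has further conditions that are not modeled here.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a call site: the callee name and the id of the
// basic block that contains the call.
struct CallSite {
  std::string callee;
  int block;
};

// Point (1): prefer an indirect call when the same callee is called three or
// more times within one basic block.
bool prefersIndirectCall(const std::vector<CallSite> &calls,
                         const std::string &callee, int block) {
  auto n = std::count_if(calls.begin(), calls.end(), [&](const CallSite &c) {
    return c.callee == callee && c.block == block;
  });
  return n >= 3;
}

// Point (3): values that must stay live across the calls are the arguments
// reused by later calls, plus the callee address if the call is indirect.
unsigned liveAcrossCalls(unsigned numArgs, bool indirect) {
  return numArgs + (indirect ? 1u : 0u);
}

// Points (2) and (4): something must be spilled (or rematerialized) when the
// demand exceeds the number of available callee-saved registers.
bool mustSpill(unsigned live, unsigned calleeSaved) {
  return live > calleeSaved;
}
```

For the test case, three calls to foo in one block select the indirect form, which needs 3 + 1 = 4 values live across calls against 3 registers (r4-r6), whereas direct calls would need only 3.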

As per the suggestion, the patch implements the foldMemoryOperandImpl hook in Thumb1InstrInfo to convert the call back to a bl when the address would otherwise be reloaded.
Does the patch look OK?
make check-llvm shows no unexpected failures.
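
The idea behind the fold can be illustrated outside of LLVM with a simplified model. The Inst struct, the Opcode enum and foldLoadIntoCall below are hypothetical and only mirror the shape of the transformation. The register-match check is an extra assumption of this sketch, not something the patch tests explicitly.

```cpp
#include <optional>
#include <string>

// Hypothetical, heavily simplified instruction model for illustration only.
enum class Opcode { LdrLiteral, BlxReg, Bl, Other };

struct Inst {
  Opcode op;
  int reg;             // register defined (LdrLiteral) or called (BlxReg), else -1
  std::string symbol;  // symbol loaded (LdrLiteral) or call target (Bl)
  int block;           // id of the containing basic block
};

// If `call` is "blx Rd" and `load` is "ldr Rd, =symbol" in the same block,
// return the equivalent direct call "bl symbol"; otherwise return nothing,
// which corresponds to the hook returning nullptr.
std::optional<Inst> foldLoadIntoCall(const Inst &call, const Inst &load) {
  if (call.op != Opcode::BlxReg || load.op != Opcode::LdrLiteral)
    return std::nullopt;
  if (call.block != load.block || call.reg != load.reg)
    return std::nullopt;
  return Inst{Opcode::Bl, -1, load.symbol, call.block};
}
```

With a load of foo into r3 and a blx r3 in the same block, the sketch yields a bl to foo; any other combination is left unchanged.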

Thanks,
Prathamesh


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D79785

Files:
  llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
  llvm/lib/Target/ARM/Thumb1InstrInfo.h


Index: llvm/lib/Target/ARM/Thumb1InstrInfo.h
===================================================================
--- llvm/lib/Target/ARM/Thumb1InstrInfo.h
+++ llvm/lib/Target/ARM/Thumb1InstrInfo.h
@@ -53,6 +53,15 @@
                             const TargetRegisterInfo *TRI) const override;
 
   bool canCopyGluedNodeDuringSchedule(SDNode *N) const override;
+
+  // foldMemoryOperand
+  using TargetInstrInfo::foldMemoryOperandImpl;
+
+  virtual MachineInstr *foldMemoryOperandImpl(
+      MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
+      MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,
+      LiveIntervals *LIS = nullptr) const override;
+
 private:
   void expandLoadStackGuard(MachineBasicBlock::iterator MI) const override;
 };
Index: llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
===================================================================
--- llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
+++ llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
@@ -152,3 +152,34 @@
 
   return false;
 }
+
+MachineInstr *Thumb1InstrInfo::foldMemoryOperandImpl(
+      MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
+      MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,
+      LiveIntervals *LIS) const
+{
+  // Replace:
+  // ldr Rd, func address
+  // blx Rd
+  // with:
+  // bl func
+
+  if (MI.getOpcode() == ARM::tBLXr && LoadMI.getOpcode() == ARM::tLDRpci
+      && MI.getParent() == LoadMI.getParent()) {
+    unsigned CPI = LoadMI.getOperand(1).getIndex();
+    const MachineConstantPool *MCP = MF.getConstantPool();
+    const MachineConstantPoolEntry &CPE = MCP->getConstants()[CPI];
+    assert(!CPE.isMachineConstantPoolEntry() && "Invalid constpool entry");
+    const Constant *callee = cast<Constant>(CPE.Val.ConstVal);
+    const char *func_name = MF.createExternalSymbolName(callee->getName());
+    MachineInstrBuilder MIB = BuildMI(*MI.getParent(), InsertPt, MI.getDebugLoc(), get(ARM::tBL))
+                                      .add(predOps(ARMCC::AL))
+                                      .addExternalSymbol(func_name)
+                                      .addReg(ARM::R0, RegState::Implicit | RegState::Kill)
+                                      .addReg(ARM::R1, RegState::Implicit | RegState::Kill)
+                                      .addReg(ARM::R2, RegState::Implicit | RegState::Kill);
+    return MIB.getInstr();
+  }
+
+  return nullptr;
+}

