[PATCH] D79785: [ARM] Register pressure with -mthumb forces register reload before each call

Prathamesh via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 10 23:02:25 PDT 2020


prathamesh updated this revision to Diff 284579.
prathamesh added a comment.

Thanks for the suggestions. I finally was able to build llvm natively on arm. Benchmarking SPEC2006 for code-size with -march=armv6 revealed minimal differences with previous patch. It works for the original test case from tinycrypt library, but causes a code-size regression of ~10% for another function from same library, resulting in overall worse code-size. Likewise disabling the transform, gives similar results.
Should we go instead with the approach suggested by John (reattaching in current patch) ?
Implement foldMemoryOperand hook to fold tLDRpci, tBLX pair to tBL.
This has one issue above if COPY insns come in between and foldMemoryOperand gets passed tLDRpci and COPY instead.
I am not sure how to handle that. However, this approach improves somewhat on the original behavior without causing regressions for any test-cases. Benchmarking showed minimal difference in code-size for SPEC2006, but improves the original test-case in tinycrypt and avoids the 10% code-size regression.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79785/new/

https://reviews.llvm.org/D79785

Files:
  llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
  llvm/lib/Target/ARM/Thumb1InstrInfo.h
  llvm/test/CodeGen/ARM/minsize-call-cse-2.ll


Index: llvm/test/CodeGen/ARM/minsize-call-cse-2.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/ARM/minsize-call-cse-2.ll
@@ -0,0 +1,20 @@
+; RUN: llc < %s | FileCheck %s
+
+target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv6m-arm-none-eabi"
+
+; CHECK-LABEL: f:
+; CHECK: bl g
+; CHECK: blx r
+; CHECK: bl g
+; CHECK: bl g
+define void @f(i32* %p, i32 %x, i32 %y, i32 %z) minsize optsize {
+entry:
+  call void @g(i32* %p, i32 %x, i32 %y, i32 %z)
+  call void @g(i32* %p, i32 %x, i32 %y, i32 %z)
+  call void @g(i32* %p, i32 %x, i32 %y, i32 %z)
+  call void @g(i32* %p, i32 %x, i32 %y, i32 %z)
+  ret void
+}
+
+declare void @g(i32*,i32,i32,i32)
Index: llvm/lib/Target/ARM/Thumb1InstrInfo.h
===================================================================
--- llvm/lib/Target/ARM/Thumb1InstrInfo.h
+++ llvm/lib/Target/ARM/Thumb1InstrInfo.h
@@ -53,6 +53,13 @@
                             const TargetRegisterInfo *TRI) const override;
 
   bool canCopyGluedNodeDuringSchedule(SDNode *N) const override;
+
+protected:
+  virtual MachineInstr *foldMemoryOperandImpl(
+      MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
+      MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,
+      LiveIntervals *LIS = nullptr) const override;
+
 private:
   void expandLoadStackGuard(MachineBasicBlock::iterator MI) const override;
 };
Index: llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
===================================================================
--- llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
+++ llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
@@ -152,3 +152,33 @@
 
   return false;
 }
+
+MachineInstr *Thumb1InstrInfo::foldMemoryOperandImpl(
+    MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
+    MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,
+    LiveIntervals *LIS) const {
+  // Replace:
+  // ldr Rd, func address
+  // blx Rd
+  // with:
+  // bl func
+
+  if (MI.getOpcode() == ARM::tBLXr && LoadMI.getOpcode() == ARM::tLDRpci &&
+      MI.getParent() == LoadMI.getParent()) {
+    unsigned CPI = LoadMI.getOperand(1).getIndex();
+    const MachineConstantPool *MCP = MF.getConstantPool();
+    const MachineConstantPoolEntry &CPE = MCP->getConstants()[CPI];
+    assert(!CPE.isMachineConstantPoolEntry() && "Invalid constpool entry");
+    const Constant *Callee = cast<Constant>(CPE.Val.ConstVal);
+    const char *FuncName = MF.createExternalSymbolName(Callee->getName());
+    MachineInstrBuilder MIB =
+        BuildMI(*MI.getParent(), InsertPt, MI.getDebugLoc(), get(ARM::tBL))
+            .add(predOps(ARMCC::AL))
+            .addExternalSymbol(FuncName);
+    for (auto &MO : MI.implicit_operands())
+      MIB.add(MO);
+    return MIB.getInstr();
+  }
+
+  return nullptr;
+}


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D79785.284579.patch
Type: text/x-patch
Size: 2841 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200811/33306a3f/attachment.bin>


More information about the llvm-commits mailing list