[PATCH] D157002: [AArch64] Add an option to do machine cse at all time regardless of profitable checking

JinGu Kang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 3 07:21:50 PDT 2023


jaykang10 created this revision.
jaykang10 added reviewers: efriedma, craig.topper, t.p.northover, dmgreen.
Herald added subscribers: hiraditya, kristof.beyls.
Herald added a project: All.
jaykang10 requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

The MachineCSE pass has heuristics that check whether eliminating a common subexpression would increase register pressure. For example:

  %1 = mov 0
  %2 = add %1, ...
  ...
  %5 = mov 0
  %6 = copy %5 --> this copy is for subreg access.
  ...

We can see a common subexpression `mov 0` in the example above, and we might expect `%5` to be removed. However, the MachineCSE pass does not remove it because doing so could increase register pressure.
I agree with Evan's commit message: if the MI does not use a vreg, it just creates a live range and does not close any other vreg's live range. However, I feel this could be too conservative. I am not sure how to improve the heuristics without causing performance regressions, so I would like to suggest adding an option that always performs CSE.
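
As a rough illustration of the behaviour described above, here is a minimal standalone sketch. It is not the real `MachineCSE::isProfitableToCSE`: the `ToyInstr` struct, its fields, and the `selfCheck` helper are hypothetical names for this example, and the single rule modelled (refuse CSE when the instruction uses no vreg and all users of the redundant result are copies) is a simplified assumption about the heuristic.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a machine instruction.
// These fields are assumptions for illustration, not LLVM's MachineInstr API.
struct ToyInstr {
  bool UsesVReg;         // does the instruction read any virtual register?
  bool AllUsesAreCopies; // are all users of the redundant result copies?
};

// Toy model of the profitability check plus the proposed override flag.
bool isProfitableToCSE(const ToyInstr &MI, bool AlwaysDoMachineCSE) {
  // Proposed option: skip every heuristic and always allow CSE.
  if (AlwaysDoMachineCSE)
    return true;

  // Assumed heuristic: an instruction such as `mov 0` uses no vreg, so
  // removing the duplicate only lengthens the surviving value's live range
  // and ends no other live range. Be conservative and refuse.
  if (!MI.UsesVReg && MI.AllUsesAreCopies)
    return false;

  return true;
}

// Mirrors the `%5 = mov 0` / `%6 = copy %5` example from the description.
void selfCheck() {
  ToyInstr SecondMov{/*UsesVReg=*/false, /*AllUsesAreCopies=*/true};
  assert(!isProfitableToCSE(SecondMov, /*AlwaysDoMachineCSE=*/false));
  assert(isProfitableToCSE(SecondMov, /*AlwaysDoMachineCSE=*/true));
}
```

In this model the second `mov 0` is kept by default and removed only when the flag is set, which matches the difference between the two RUN lines in the test.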


https://reviews.llvm.org/D157002

Files:
  llvm/lib/CodeGen/MachineCSE.cpp
  llvm/test/CodeGen/AArch64/machine-cse-profitable-check.ll


Index: llvm/test/CodeGen/AArch64/machine-cse-profitable-check.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AArch64/machine-cse-profitable-check.ll
@@ -0,0 +1,31 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
+; RUN: llc -mtriple aarch64-none-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK-BASE
+; RUN: llc -mtriple aarch64-none-linux-gnu -always-do-machine-cse < %s | FileCheck %s --check-prefixes=CHECK-ALWAYS-CSE
+
+define void @foo(ptr %buf, <8 x i16> %a) {
+; CHECK-BASE-LABEL: foo:
+; CHECK-BASE:       // %bb.0: // %entry
+; CHECK-BASE-NEXT:    movi v2.2d, #0000000000000000
+; CHECK-BASE-NEXT:    // kill: def $q0 killed $q0 def $q0_q1
+; CHECK-BASE-NEXT:    zip2 v2.8h, v0.8h, v2.8h
+; CHECK-BASE-NEXT:    movi v1.2d, #0000000000000000
+; CHECK-BASE-NEXT:    st2 { v0.4h, v1.4h }, [x0], #16
+; CHECK-BASE-NEXT:    str q2, [x0]
+; CHECK-BASE-NEXT:    ret
+;
+; CHECK-ALWAYS-CSE-LABEL: foo:
+; CHECK-ALWAYS-CSE:       // %bb.0: // %entry
+; CHECK-ALWAYS-CSE-NEXT:    // kill: def $q0 killed $q0 def $q0_q1
+; CHECK-ALWAYS-CSE-NEXT:    movi v1.2d, #0000000000000000
+; CHECK-ALWAYS-CSE-NEXT:    st2 { v0.4h, v1.4h }, [x0], #16
+; CHECK-ALWAYS-CSE-NEXT:    zip2 v0.8h, v0.8h, v1.8h
+; CHECK-ALWAYS-CSE-NEXT:    str q0, [x0]
+; CHECK-ALWAYS-CSE-NEXT:    ret
+entry:
+  %vzip.i = shufflevector <8 x i16> %a, <8 x i16> <i16 0, i16 0, i16 0, i16 0, i16 poison, i16 poison, i16 poison, i16 poison>, <8 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11>
+  %vzip1.i = shufflevector <8 x i16> %a, <8 x i16> <i16 poison, i16 poison, i16 poison, i16 poison, i16 0, i16 0, i16 0, i16 0>, <8 x i32> <i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15>
+  store <8 x i16> %vzip.i, ptr %buf, align 4
+  %add.ptr = getelementptr inbounds i32, ptr %buf, i64 4
+  store <8 x i16> %vzip1.i, ptr %add.ptr, align 4
+  ret void
+}
Index: llvm/lib/CodeGen/MachineCSE.cpp
===================================================================
--- llvm/lib/CodeGen/MachineCSE.cpp
+++ llvm/lib/CodeGen/MachineCSE.cpp
@@ -65,6 +65,10 @@
     CSUsesThreshold("csuses-threshold", cl::Hidden, cl::init(1024),
                     cl::desc("Threshold for the size of CSUses"));
 
+static cl::opt<bool> AlwaysDoMachineCSE("always-do-machine-cse", cl::Hidden,
+                                        cl::init(false),
+                                        cl::desc("Always do machine cse"));
+
 namespace {
 
   class MachineCSE : public MachineFunctionPass {
@@ -439,6 +443,9 @@
 /// defined.
 bool MachineCSE::isProfitableToCSE(Register CSReg, Register Reg,
                                    MachineBasicBlock *CSBB, MachineInstr *MI) {
+  if (AlwaysDoMachineCSE)
+    return true;
+
   // FIXME: Heuristics that works around the lack the live range splitting.
 
   // If CSReg is used at all uses of Reg, CSE should not increase register



