[PATCH] Fix PR16938 - rewrite CSE heuristic for same block uses

Eric Christopher echristo at gmail.com
Tue Apr 22 16:21:23 PDT 2014


Hi Evan,

In taking a look at 16938 it seems that what you want is to check the
uses of the original register in all of the basic blocks and then make
sure that the new CSE'd expression is in one of them rather than the
inverse?

This will fix the PR mentioned above and no regressions anywhere. I
ran the performance tests and didn't see anything outside of the noise
as well.

Thoughts?

-eric
-------------- next part --------------
commit 5d22cd1cdc55a0d73e1bf26e69f653a41384cbd7
Author: Eric Christopher <echristo at gmail.com>
Date:   Tue Apr 22 16:15:08 2014 -0700

    Rewrite PHI cse heuristic to check that the use that we're verifying
    is the original use and not the new use. The new instruction would
    never have a use, however, the value it's computing would via the
    old instruction's uses.
    
    Fixes PR16938

diff --git a/lib/CodeGen/MachineCSE.cpp b/lib/CodeGen/MachineCSE.cpp
index 7da439c..845d146 100644
--- a/lib/CodeGen/MachineCSE.cpp
+++ b/lib/CodeGen/MachineCSE.cpp
@@ -414,15 +414,15 @@ bool MachineCSE::isProfitableToCSE(unsigned CSReg, unsigned Reg,
   // Heuristics #3: If the common subexpression is used by PHIs, do not reuse
   // it unless the defined value is already used in the BB of the new use.
   bool HasPHI = false;
-  SmallPtrSet<MachineBasicBlock*, 4> CSBBs;
-  for (MachineInstr &MI : MRI->use_nodbg_instructions(CSReg)) {
+  SmallPtrSet<MachineBasicBlock*, 4> UseBBs;
+  for (MachineInstr &MI : MRI->use_nodbg_instructions(Reg)) {
     HasPHI |= MI.isPHI();
-    CSBBs.insert(MI.getParent());
+    UseBBs.insert(MI.getParent());
   }
 
   if (!HasPHI)
     return true;
-  return CSBBs.count(MI->getParent());
+  return UseBBs.count(CSMI->getParent());
 }
 
 void MachineCSE::EnterScope(MachineBasicBlock *MBB) {
diff --git a/test/CodeGen/X86/cse-fp-with-phi.ll b/test/CodeGen/X86/cse-fp-with-phi.ll
new file mode 100644
index 0000000..c695830
--- /dev/null
+++ b/test/CodeGen/X86/cse-fp-with-phi.ll
@@ -0,0 +1,50 @@
+; RUN: llc -mtriple=x86_64-pc-linux-gnu < %s | FileCheck %s
+
+; Make sure we CSE the extra xorps and movaps here.
+; CHECK: f
+; CHECK: movaps
+; CHECK: xorps
+; CHECK-NOT: xorps
+; CHECK: addsd
+; CHECK-NOT: movaps
+
+; Function Attrs: nounwind readnone uwtable
+define double @f(double %p, i32 %n) #0 {
+entry:
+  %tobool = icmp eq i32 %n, 0
+  br i1 %tobool, label %if.end, label %if.then
+
+if.then:                                          ; preds = %entry
+  %add = fadd double %p, 0.000000e+00
+  br label %if.end
+
+if.end:                                           ; preds = %entry, %if.then
+  %s.0 = phi double [ %add, %if.then ], [ 0.000000e+00, %entry ]
+  ret double %s.0
+}
+
+; Make sure we CSE the extra xorps.
+; CHECK: g
+; CHECK: xorps
+; CHECK-NOT: xorps
+
+define double @g(double* nocapture readonly %p, i32 %n) #0 {
+entry:
+  %tobool = icmp eq i32 %n, 0
+  br i1 %tobool, label %if.end, label %if.then
+
+if.then:                                          ; preds = %entry
+  %0 = load double* %p, align 8
+  %add = fadd double %0, 0.000000e+00
+  br label %if.end
+
+if.end:                                           ; preds = %entry, %if.then
+  %s.0 = phi double [ %add, %if.then ], [ 0.000000e+00, %entry ]
+  ret double %s.0
+}
+
+attributes #0 = { nounwind readnone uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
+
+!llvm.ident = !{!0}
+
+!0 = metadata !{metadata !"clang version 3.5.0 (trunk 206920) (llvm/trunk 206919)"}


More information about the llvm-commits mailing list