[PATCH] D35014: [X86] PR32755 : Improvement in CodeGen instruction selection for LEAs.

Sun Oct 29 06:12:41 PDT 2017

RKSimon added inline comments.

================
Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:105

+    bool isLegalScale() {
+      return (Scale == 1 || Scale == 2 || Scale == 4 || Scale == 8);
----------------
```
bool isLegalScale() const {
```

================
Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:199

+    void setAggressiveOperandFolding(bool val = false) {
+      OptForAggressingFolding = val;
----------------
Is a default argument for a setter a good idea? Especially one that is the inverse of what the setter says it is.

================
Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:1537
+      } else if (AM.IndexReg == N) {
+        AM.Scale++;
+        return false;
----------------
These AM.Scale increments are scary - better to set it with AM.Scale = 2?

================
Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:1178
+  if ((!EnableAggressiveFolding && Depth > 5) ||
+        (EnableAggressiveFolding && Depth > 8) )
     return matchAddressBase(N, AM);
----------------
RKSimon wrote:
> Two hard coded depths like this is weird - better to have a getMaxOperandFoldingDepth() helper?
My comment still stands - try to avoid hard coded values embedded in the source - add a getMaxOperandFoldingDepth() helper.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:29
 #include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/MachineDominators.h"
 #include "llvm/CodeGen/Passes.h"
----------------
RKSimon wrote:
> (style) Insert the include in alphabetical order (so before MachineFunctionPass.h)
Include ordering still broken

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:839
+               MachineInstr *NewMI = 
+                  BuildMI(*(const_cast<MachineBasicBlock *>(&MBB)),
+                     &LI1,DL,TII->get(LI1.getOpcode()))
----------------
lsaba wrote:
> This could end up in an assertion failure if LI1 is at the beginning of the BB, need to handle it separately, for example in this reproducer :
> 
>  ; ModuleID = 'bugpoint-reduced-simplified.bc'
>  source_filename = "bugpoint-output-2ef2e5d.bc"
>  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>  target triple = "x86_64-unknown-linux-gnu"
>  
>  ; Function Attrs: norecurse nounwind readnone uwtable
>  define i32 @foo(i32 %a, i32 %b, i32 %d, i32 %y, i32 %x) local_unnamed_addr #0 {
>  entry:
>    %mul1 = shl i32 %b, 1
>    %add2 = add i32 %a, 4
>    %add3 = add i32 %add2, %mul1
>    %mul4 = shl i32 %b, 2
>    %add6 = add i32 %add2, %mul4
>    br label %for.body
>  
>  for.cond.cleanup:                                 ; preds = %for.body
>    ret i32 %add
>  
>  for.body:                                         ; preds = %for.body, %entry
>    %x.addr.015 = phi i32 [ %x, %entry ], [ %add3, %for.body ]
>    %y.addr.014 = phi i32 [ %y, %entry ], [ %add6, %for.body ]
>    %mul = mul nsw i32 %x.addr.015, %y.addr.014
>    %add = add nsw i32 0, %mul
>    %exitcond = icmp eq i32 undef, %d
>    br i1 %exitcond, label %for.cond.cleanup, label %for.body, !llvm.loop !1
>  }
>  
>  attributes #0 = { norecurse nounwind readnone uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-       math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+   mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
>  
>  !llvm.ident = !{!0}
>  
>  !0 = !{!"clang version 6.0.0 (cfe/trunk 309511)"}
>  !1 = distinct !{!1, !2}
>  !2 = !{!"llvm.loop.unroll.disable"}
> 
Has @lsaba test been added to the patch? I couldn't see it.

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:327
+static bool isDefCopyLike(MachineRegisterInfo *MRI, const MachineOperand &Opr) {
+  bool isInstrErased = Opr.isReg() && Opr.getParent()->getParent() ? 0 : 1;
+  if (!Opr.isReg() || isInstrErased ||
----------------
Do you mean:
```
bool isInstrErased = !(Opr.isReg() && Opr.getParent()->getParent()); 
```

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:1013
+  // Legal scale value (1,2,4 & 8) vector.
+  int LegalScale[9] = {0, 1, 1, 0, 1, 0, 0, 0, 1};
+
----------------
Really don't like this - write a helper instead like you did in X86ISelDAGToDAG.cpp
```
auto IsLegalScale = [](int S) {
  return S == 1 || S == 2 || S == 4 || S == 8;
};
```

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:1019
+      return false;
+    return true;
+  };
----------------
```
return Arg1->getOperand(2).getImm() >= Arg2->getOperand(2).getImm();
```

================
Comment at: lib/Target/X86/X86OptimizeLEAs.cpp:1074
+          MachineInstrBuilder NewMI =
+              BuildMI(*(const_cast<MachineBasicBlock *>(&MBB)), &LI1, DL,
+                      TII->get(LI1.getOpcode()))
----------------
DL is only used here - just use LI1.getDebugLoc() directly?

================
Comment at: test/CodeGen/X86/lea-opt-cse2.ll:3
+; RUN: llc < %s -mtriple=x86_64-unknown  -mattr=+slow-3ops-lea | FileCheck %s -check-prefix=X64
+; RUN: llc < %s -mtriple=i686-unknown  -mattr=+slow-3ops-lea  | FileCheck %s -check-prefix=X86

----------------
Why have you changed these tests?

================
Comment at: test/CodeGen/X86/lea-opt-cse4.ll:3
+; RUN: llc < %s -mtriple=x86_64-unknown -mattr=+slow-3ops-lea | FileCheck %s -check-prefix=X64
+; RUN: llc < %s -mtriple=i686-unknown -mattr=+slow-3ops-lea | FileCheck %s -check-prefix=X86

----------------
Why have you changed these tests?

https://reviews.llvm.org/D35014