[PATCH] Reassociate GEP operands for loop invariant code motion

Tue Apr 21 16:51:58 PDT 2015

First round.

That's a *lot* of refactoring work. Appreciate it!


REPOSITORY
  rL LLVM

================
Comment at: lib/Target/NVPTX/NVPTXTargetMachine.cpp:172
@@ -172,1 +171,3 @@
+  // generates significantly better code than EarlyCSE for some of our
+  // benchmarks.
   if (getOptLevel() == CodeGenOpt::Aggressive)
----------------
Should we run LICM after ReassociateGEPs? 

================
Comment at: lib/Target/PowerPC/PPCTargetMachine.cpp:266
@@ -265,3 +265,3 @@
   if (TM->getOptLevel() == CodeGenOpt::Aggressive && EnableGEPOpt) {
-    // Call SeparateConstOffsetFromGEP pass to extract constants within indices
+    // Call ReassociateGEPs pass to extract constants within indices
     // and lower a GEP with multiple indices to either arithmetic operations or
----------------
extract loop invariants

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:126
@@ -86,2 +125,3 @@
+// codegen). Such transformation can have following benefits:
 // (1) It can always extract constants in the indices of structure type.
 // (2) After such Lowering, there are more optimization opportunities such as
----------------
minor: (1) and (2) are used before; try other numbering. 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:332
@@ +331,3 @@
+protected:
+  virtual bool visit(Value *V, bool &ContinueSearch);
+
----------------
I think LLVM generally prefers
```
bool visit(Value *V, bool &ContinueSearch) override;
```

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:616
@@ +615,3 @@
+                            /* NonNegative */ false, UserChain)) {
+      Found = true;
+      Operand = Op;
----------------
Is `Found` necessary? Do we need to look at the second operand if already found what we want in the first operand? 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:629
@@ +628,3 @@
+  if (Found)
+    UserChain.clear();
+
----------------
What's the invariant between UserChain and Found? My understanding was, when Found is true, UserChain is non-empty containing the user chain from V to the found value, but the code here clears UserChain when Found. 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:634
@@ +633,3 @@
+    // We cannot do much with Values that are not a User, such as an Argument.
+    if (User *U = dyn_cast<User>(V)) {
+      if (BinaryOperator *BO = dyn_cast<BinaryOperator>(V)) {
----------------
Looks like we cannot do much with `ConstantExprs` either. Maybe just `dyn_cast<Instruction>` to pre-filter out more cases. 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:688
@@ +687,3 @@
+
+int64_t ConstantOffsetFinder::FindConstOffset(
+    GetElementPtrInst *GEP, unsigned OperandIdx,
----------------
s/FindConstOffset/findConstOffset

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:760
@@ +759,3 @@
+
+int LICMRanker::getRank(Value *V) {
+  Instruction *I = dyn_cast<Instruction>(V);
----------------
Can you comment on what the rank computes? Is it the deepest loop-level V can recursively reach? 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:770
@@ +769,3 @@
+  if (Loop *L = LI->getLoopFor(I->getParent())) {
+    int MaxRank = L->getLoopDepth();
+    if (isImmovableInstruction(I))
----------------
An instruction can use a value in a deeper loop if the loop is in a non-LCSSA form. Does `MaxRank = L->getLoopDepth()` still make sense? 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:784
@@ +783,3 @@
+
+bool LoopInvariantFinder::FindLoopInvariantValue(
+    GetElementPtrInst *GEP, unsigned OperandIdx, LICMRanker &Ranker,
----------------
s/FindLoopInvariantValue/findLoopInvariantValue/

or may be just `findLoopInvariant`? 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:786
@@ +785,3 @@
+    GetElementPtrInst *GEP, unsigned OperandIdx, LICMRanker &Ranker,
+    SmallVectorImpl<UserChainElement> &UserChain) {
+  int BaseRank = 0;
----------------
I've seen `UserChain` passed to lots of methods of HoistableValueFinder. Can we make it global to HoistableValueFinder? 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:830
@@ +829,3 @@
+  else if (BinaryOperator *BO = dyn_cast<BinaryOperator>(V)) {
+    // Don't recurse into subtract expressions.
+    if (BO->getOpcode() == Instruction::Sub)
----------------
Is recursing into subtract a TODO or problematic? 

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:1135
@@ +1134,3 @@
+//   %idx = add %a, %b
+//        = gep %base, 42, %idx, %foo, %bar
+// The gep can be split at operand 2 (%idx) with IdxPart1 equal to %a
----------------
I was initially confused when I thought I saw
```
add %a, %b = gep %base, 42, %idx, %foo, %bar
```
:)

How about giving the gep a name instead of assigning it to blank? e.g. 
```
%idx = add %a, %b
%p = gep %base, 42, %idx, %foo, %bar
```

================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:1248
@@ -897,11 +1247,3 @@
 
-  // Lowers a GEP to either GEPs with a single index or arithmetic operations.
-  if (LowerGEP) {
-    // As currently BasicAA does not analyze ptrtoint/inttoptr, do not lower to
-    // arithmetic operations if the target uses alias analysis in codegen.
-    if (TM && TM->getSubtargetImpl(*GEP->getParent()->getParent())->useAA())
-      lowerToSingleIndexGEPs(GEP, AccumulativeByteOffset);
-    else
-      lowerToArithmetics(GEP, AccumulativeByteOffset);
-    return true;
-  }
+bool ReassociateGEPs::splitGEP(GetElementPtrInst *GEP, LICMRanker &Ranker) {
+  bool NeedsExtraction;
----------------
Can we split splitGEPForConst and splitGEPForLICM? The general logic seems:
```
if we can strip const offset from gep
  gep = GEP with const offset stripped
splitGEPForLICM()
```


================
Comment at: lib/Transforms/Scalar/ReassociateGEPs.cpp:1359
@@ -989,2 +1358,3 @@
 
-bool SeparateConstOffsetFromGEP::runOnFunction(Function &F) {
+bool ReassociateGEPs::splitAndLowerGEP(GetElementPtrInst *GEP,
+                                       LICMRanker &Ranker) {
----------------
I'll come back to this function later. Looks like after we refactor splitGEP, splitGEP and splitAndLowerGEP can be merged? 

================
Comment at: test/Transforms/ReassociateGEPs/NVPTX/split-licm-gep.ll:122
@@ +121,3 @@
+; be split at two of the indices forming two new loop-invariant geps which are
+; hoisted to different loop-depths.
+define void @double_loop([5 x [6 x i32]]* %input, i32* %output, i64 %a) {
----------------
Nice! :)

http://reviews.llvm.org/D9136

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/