[PATCH] CSE on GEP indices

Thu Apr 24 23:10:05 PDT 2014

================
Comment at: include/llvm/LinkAllPasses.h:56
@@ -55,3 +55,3 @@
       (void) llvm::createBasicAliasAnalysisPass();
       (void) llvm::createLibCallAliasAnalysisPass(nullptr);
       (void) llvm::createScalarEvolutionAliasAnalysisPass();
----------------
This file is changed because I did a "git rebase master". These changes won't be in the final commit. 

================
Comment at: lib/Target/NVPTX/NVPTXTargetMachine.cpp:150
@@ -149,5 +149,3 @@
   addPass(createNVPTXFavorNonGenericAddrSpacesPass());
-  // The FavorNonGenericAddrSpaces pass may remove instructions and leave some
-  // values unused. Therefore, we run a DCE pass right afterwards. We could
-  // remove unused values in an ad-hoc manner, but it requires manual work and
-  // might be error-prone.
+  addPass(createSeparateConstOffsetFromGEPPass());
+  // The SeparateConstOffsetFromGEP pass creates variadic bases that can be used
----------------
Eli Bendersky wrote:
> Since this pass is not target independent, I'm not sure this is the right place to put it. It makes more sense to add it as part of a custom compiler optimization pipeline, or maybe even in opt itself?
> 
> Note that the other passes added here are in fact NVPTX specific. Also the GVN vs. CSE logic seems out of place here. We already have a place where optimization pipeline contents are customized according to the optimization level - opt itself :)
> 
> We can add it under a off-by-default flag to PassManagerBuilder, until others experiment with the pass. If it proves beneficial for other archs, it may be enabled by default down the road.
As discussed offline, we add this pass and its clean up passes (-dce, -early-cse, and -gvn) in NVPTXTargetMachine.cpp for now. If this pass turned out to really benefit other targets, we can move that in TargetPassConfig. 

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:1
@@ +1,2 @@
+//===-- SeparateConstOffsetFromGEP.cpp - -------------------*- C++ -*-===//
+//
----------------
Eli Bendersky wrote:
> Inconsistent comment line length
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:98
@@ +97,3 @@
+
+static cl::opt<bool> DisableSplitGEP("disable-split-gep", cl::init(false),
+                                     cl::desc("Do not split GEPs for CSE"),
----------------
Eli Bendersky wrote:
> Make the flag name consistent with pass name
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:113
@@ +112,3 @@
+/// index, and tries to find a constant integer that can be hoisted to the
+/// outermost level of the expression as an addition. Not every constants in an
+/// expression can jump out. e.g., we cannot transform (b * (a + 5)) to (b * a +
----------------
Eli Bendersky wrote:
> s/constants/constant/
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:121
@@ +120,3 @@
+  /// numeric value of the extracted constant offset (0 if failed), and a
+  /// new index representing the remainder (equal to the original index - the
+  /// constant offset).
----------------
Eli Bendersky wrote:
> Do you mean "minus" here? The dash is unclear as it can be just part of the sentence
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:130
@@ +129,3 @@
+                         Instruction *IP);
+  static int64_t Find(Value *Idx, const DataLayout *DL);
+
----------------
Eli Bendersky wrote:
> Document "Find"
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:204
@@ +203,3 @@
+INITIALIZE_PASS_BEGIN(
+    SeparateConstOffsetFromGEP, "split-gep",
+    "Split GEPs to a variadic base and a constant offset for better CSE", false,
----------------
Eli Bendersky wrote:
> Consistent name - separate / split
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:210
@@ +209,3 @@
+INITIALIZE_PASS_END(
+    SeparateConstOffsetFromGEP, "split-gep",
+    "Split GEPs to a variadic base and a constant offset for better CSE", false,
----------------
Eli Bendersky wrote:
> ditto
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:226
@@ +225,3 @@
+  if (ConstantOffset != 0) return ConstantOffset;
+  ConstantOffset = find(Usr->getOperand(1));
+  // If Usr is a sub operator, negate the constant offset found in the right
----------------
Eli Bendersky wrote:
> Do you assert somewhere that users passed here have at least two operands?
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:270
@@ +269,3 @@
+  // If we found a non-zero constant offset, adds it to the path for future
+  // analysis and surgery. Zero is a valid constant offset, but doesn't help
+  // this optimization.
----------------
Eli Bendersky wrote:
> surgery :) ?
done. explicitly say rebuildWithoutConstantOffset

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:278
@@ +277,3 @@
+unsigned ConstantOffsetExtractor::FindFirstUse(User *U, Value *Used) {
+  for (unsigned i = 0, e = U->getNumOperands(); i < e; ++i) {
+    if (U->getOperand(i) == Used)
----------------
Eli Bendersky wrote:
> s/i/I/ and s/e/E/
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:318
@@ +317,3 @@
+  unsigned OpNo = FindFirstUse(U, C); // U->getOperand(OpNo) == C
+  assert(OpNo != (unsigned)-1 && "UserChain wasn't built correctly");
+  Value *TheOther = U->getOperand(1 - OpNo); // The other operand of U
----------------
Eli Bendersky wrote:
> Maybe asserting OpNo < 2 here is more inclusive?
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:383
@@ +382,3 @@
+  gep_type_iterator GTI = gep_type_begin(*GEP);
+  for (unsigned i = 1, e = GEP->getNumOperands(); i != e; ++i, ++GTI) {
+    if (isa<SequentialType>(*GTI)) {
----------------
Eli Bendersky wrote:
> I, E
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:416
@@ +415,3 @@
+  // trace into these casts.
+  if (GEP->isInBounds()) {
+    // Doing this to inbounds GEPs is safe because their indices are guaranteed
----------------
Eli Bendersky wrote:
> You can do this even if the offset doesn't need extraction? Is this OK?
This is OK in the sense that the program still behaves as before. Almost all such sext/zext are added by clang and some optimizations (e.g. instcombine) to make their lives easier (http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-April/072242.html). They are unnecessary anyway. 

Shortcutting sext/zext makes ConstantOffsetExtractor::find much easier to implement, so I am inclined to keep this logic.

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:441
@@ +440,3 @@
+  TargetTransformInfo &TTI = getAnalysis<TargetTransformInfo>();
+  if (!TTI.isLegalAddressingMode(GEP->getType()->getElementType(),
+                                 /*BaseGV=*/nullptr, AccumulativeByteOffset,
----------------
Eli Bendersky wrote:
> I think a comment here would be useful
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:450
@@ +449,3 @@
+  gep_type_iterator GTI = gep_type_begin(*GEP);
+  for (unsigned i = 1, e = GEP->getNumOperands(); i != e; ++i, ++GTI) {
+    if (isa<SequentialType>(*GTI)) {
----------------
Eli Bendersky wrote:
> I, E
done

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:549
@@ +548,3 @@
+  }
+  return Changed;
+}
----------------
Eli Bendersky wrote:
> Where do you modify Changed?
good catch. done

================
Comment at: test/Transforms/SeparateConstantOffsetFromGEP/NVPTX/gvn-pre.ll:1
@@ +1,2 @@
+; RUN: llc < %s -march=nvptx -mcpu=sm_20 | FileCheck %s
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 | FileCheck %s
----------------
Eli Bendersky wrote:
> Can you change the directory name to be consistent the pass name? [I mean s/Constant/Const/]
done

================
Comment at: test/Transforms/SeparateConstantOffsetFromGEP/NVPTX/gvn-pre.ll:2
@@ +1,3 @@
+; RUN: llc < %s -march=nvptx -mcpu=sm_20 | FileCheck %s
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 | FileCheck %s
+
----------------
Eli Bendersky wrote:
> Woudln't it be much more useful to test this on IR->IR level with opt?
> 
> test/Transforms/LoopStrengthReduce/ has some examples of tests that use opt with tripel flags to provide target info
done. I added an IR test and kept the PTX test with a different check-prefix.

http://reviews.llvm.org/D3462