[PATCH] D127727: [SeparateConstOffsetFromGEPPass] Added optional modification strategy

Thu Aug 11 02:58:31 PDT 2022

mkazantsev requested changes to this revision.
mkazantsev added a comment.
This revision now requires changes to proceed.

Found some bugs, some style proposals as well. The general point still holds. If the patch is purposed to reduce register pressure on some platform, please provide a test which shows that this actually happens. This can only be shown on a `llc` test.

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:247
   /// it. It returns the numeric value of the extracted constant offset (0 if
   /// failed). The meaning of the arguments are the same as Extract.
   static int64_t Find(Value *Idx, GetElementPtrInst *GEP,
----------------
This comment is obsolete now, `Extract` does not have these new parameters.

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:272
+  /// \p NonConstantBaseValue The second non-constant operand if V is binary
+  ///                         operator.
+  /// \p CheckProfitability   Check the possible profit of optimization and
----------------
And if `V` is not binop, should it change?

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:362
+///
+/// GEPPointer - %array
+/// PreviousIndices - [%a]
----------------
This is usually called `BasePointer` in other parts of optimizer.

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:363
+/// GEPPointer - %array
+/// PreviousIndices - [%a]
+/// NonConstantBaseValue - %i
----------------
Shouldn't `%b` also be a part of it? Or where does it go?
Maybe more elaborate example on how there can be more than one previous index?

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:388
+  GetElementPtrInst *GEPInstruction;
+  int64_t AccumulativeByteOffset;
+  SmallVector<const Value *> ConstantIndices;
----------------
APInt? Just to make sure this doesn't overflow.

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:389
+  int64_t AccumulativeByteOffset;
+  SmallVector<const Value *> ConstantIndices;
+
----------------
Naturaly I'd expect this to be `SmallVector<const ConstantInt *>`, but the code below suggests there might not be constants. Misleading name?

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:557
+  static inline GEPBaseInfo getTombstoneKey() {
+    return GEPBaseInfo((Value *)(-1), SmallVector<const Value *>(),
+                       (Value *)(-1));
----------------
Use `DenseMapInfo<Value *>::getTombstoneKey()` and same above

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:561
+  static unsigned getHashValue(const GEPBaseInfo &Val) {
+    return llvm::hash_combine(Val.GEPPointer, Val.PreviousIndices.size(),
+                              Val.NonConstantBaseValue);
----------------
Why PreviousIndices size but not contents?

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:709

+  if (!isa<ConstantInt>(BO->getOperand(0))) {
+    NonConstantBaseValue = BO->getOperand(0);
----------------
No `{ }`

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:710
+  if (!isa<ConstantInt>(BO->getOperand(0))) {
+    NonConstantBaseValue = BO->getOperand(0);
+  }
----------------
`undef` and `poison` are constants but not `ConstantInt`. Are you OK with them?

================
Comment at: llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:1364
+            for (unsigned J = I + 1;
+                 J < SortedInstructionsList.size() && CurrentChanged; J++) {
+              auto RemoveIt = remove_if(SortedInstructionsList[J].first,
----------------
eklepilkina wrote:
> mkazantsev wrote:
> > The complexity of this is `SortedInstructionsList.size() * SortedInstructionsList.size() * sum(SortedInstructionsList[J])` if I'm reading this correctly. Looks very expensive. Is there a cheaper way of doing this? Imagine you have 10k instructions on your list. It will just be stuck forever.
> > Imagine you have 10k instructions on your list
>  
> I amn't sure we should optimize this case, because it's mostly impossible, because this list is always quite small. I'll think some more, but I amn't sure that the optimization here is more important than readability.
"Mostly impossible" means "possible". We generally bail out on non-linear algorithms with some thresholds. This could also be the case here.

================
Comment at: llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll:286
+; CHECK-NEXT:    [[GEP8:%.*]] = getelementptr inbounds [50 x i32], [50 x i32]* [[ARRAY]], i64 [[SEXT4]], i64 [[SEXT7]]
+; CHECK-NEXT:    store i32 [[I]], i32* [[GEP8]], align 4
 ; CHECK-NEXT:    ret i32 undef
----------------
eklepilkina wrote:
> mkazantsev wrote:
> > This code is bigger than it used to be. Can you explain why is it better?
> This code is bigger on IR, and it's so becuase of repeating sext opertaions, but `sext` isn't so critical in assembly, at the same time pass generates 2 new GEP instructions that are used as base and we need registers for them
Then please provide a llc test that demonstrates a positive change.

The fact that "sext isn't so critical" is a way not obvious to me. Filling the upper part of the registry may sometimes be an extra operation.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127727/new/

https://reviews.llvm.org/D127727