[llvm] [InstCombine] Try to fold add into GEP x, C (PR #85090)

Thu Mar 14 04:02:02 PDT 2024

================
@@ -2893,6 +2893,37 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
         }
       }
     }
+
+    // Try to replace GEP p, (x + C1), C2 with GEP p, x, C2+C1*S
+    if (GEP.getNumIndices() > 1) {
+      gep_type_iterator GTI = gep_type_begin(GEP);
+      for (User::op_iterator I = GEP.op_begin() + 1, E = GEP.op_end() - 1;
+           I != E; ++I, ++GTI) {
+        if (!GTI.isSequential())
+          break;
+        Value *X;
+        const APInt *C1, *C2;
+        User::op_iterator Next = std::next(I);
+        if (match(I->get(), m_Add(m_Value(X), m_APInt(C1))) &&
+            match(Next->get(), m_APInt(C2))) {
+          TypeSize Scale1 = GTI.getSequentialElementStride(DL);
+          if (Scale1.isScalable() || !(++GTI).isSequential())
+            break;
+          TypeSize Scale2 = GTI.getSequentialElementStride(DL);
+          if (Scale2.isScalable())
+            break;
+
+          // Update the GEP instruction indices, and add Add to the worklist
+          // so that it can be DCEd.
+          Instruction *Add = cast<Instruction>(*I);
+          *I = X;
+          *Next = ConstantInt::get((*Next)->getType(),
+                                    *C2 + *C1 * (Scale1 / Scale2));
+          addToWorklist(Add);
+          return &GEP;
----------------
davemgreen wrote:

Yeah, it would need to be dropped. This is in an `if (!GEP.isInBounds())` block, so we already know the GEP is not inbounds. I wasn't sure which is better for optimization: `a = add x, c1; gep inbounds p, a, c2`, or `gep p, x, c3` without the inbounds. As the motivating case I had didn't have inbounds (even though it is from a global), I opted for the simpler route that hopefully wouldn't make anything worse.

I believe it should be OK to change it around, though. I don't think this comes up a huge amount either way.
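
For reference, the arithmetic behind the fold can be checked with a small standalone sketch. This is not the InstCombine code: `gepOffset`, `foldedOffset`, and `checkFold` are hypothetical helpers, the strides are plain integers rather than `TypeSize`, and it assumes the outer stride is an exact multiple of the inner one. Offsets use wrapping unsigned arithmetic, which roughly models a GEP without inbounds.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of a two-index sequential GEP: `gep p, i, j`.
// scale1 and scale2 are the element strides of the outer and inner index.
constexpr std::uint64_t gepOffset(std::uint64_t i, std::uint64_t j,
                                  std::uint64_t scale1, std::uint64_t scale2) {
  return i * scale1 + j * scale2;
}

// Offset after the fold: `gep p, x, c2 + c1 * (scale1 / scale2)`.
// Assumption of this sketch: scale1 is an exact multiple of scale2.
constexpr std::uint64_t foldedOffset(std::uint64_t x, std::uint64_t c1,
                                     std::uint64_t c2, std::uint64_t scale1,
                                     std::uint64_t scale2) {
  return gepOffset(x, c2 + c1 * (scale1 / scale2), scale1, scale2);
}

// Runtime check that `gep p, (x + c1), c2` and the folded form agree.
inline void checkFold(std::uint64_t x, std::uint64_t c1, std::uint64_t c2,
                      std::uint64_t scale1, std::uint64_t scale2) {
  assert(scale2 != 0 && scale1 % scale2 == 0 && "sketch assumes exact ratio");
  assert(gepOffset(x + c1, c2, scale1, scale2) ==
         foldedOffset(x, c1, c2, scale1, scale2));
}

// Example: [N x [8 x i32]] has outer stride 32 and inner stride 4.
static_assert(gepOffset(5 + 3, 2, 32, 4) == foldedOffset(5, 3, 2, 32, 4),
              "folded GEP addresses the same byte");
```

With an outer stride of 32 and an inner stride of 4, `c1 = 3` becomes `3 * 8 = 24` extra inner elements, so `c3 = c2 + 24`.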

https://github.com/llvm/llvm-project/pull/85090