[PATCH] D127392: [InstCombine] Combine consecutive loads which are being merged to form a wider load.

Fri Jun 24 01:58:23 PDT 2022

nikic added a comment.

The main question I see here is whether this needs to be TTI based or not -- if yes, then it also shouldn't be in InstCombine. I think there's two reason why we might want TTI:

- Do we want to create unaligned loads? Creating them is legal, but if the target does not support fast unaligned loads, the backend will break them up again. Should we only perform this optimization if `TTI.allowsMisalignedMemoryAccesses` reports a fast unaligned access?
- Do we want to create loads with illegal sizes? For example, if we have a 64-bit target, so we want to combine two `i64` loads into an `i128` load, even though it will later be broken up again? (At least for the current implementation where both loads must have the same size, the question of whether to allow `i24` loads or similar does not come up.)

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3004
+  if (match(&I, m_c_Or(m_Value(X), m_OneUse(m_Shl(m_Value(Y), m_APInt(ShAmt2)))))) {
+    Value *ShAmt1 = NULL;
+    Instruction *L1, *L2;
----------------
nullptr

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3017
+      LoadInst *LI1, *LI2;
+      if (!((LI1 = dyn_cast<LoadInst>(L1)) && (LI2 = dyn_cast<LoadInst>(L2))))
+        return nullptr;
----------------
Move dyn_cast into initialization, having it in the if is hard to read.

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3029
+      // Second load must be a GEP. Extract load info.
+      if (isa<GetElementPtrInst>(Op2)) {
+        GetElementPtrInst *GEP2 = dyn_cast<GetElementPtrInst>(Op2);
----------------
GEPOperator

Also directly dyn_cast in the condition.

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3035
+        if (ConstantInt *CI = dyn_cast<ConstantInt>(GEP2->getOperand(1)))
+          Idx2 = CI->getZExtValue();
+
----------------
You are probably looking for `GEP2->accumulateConstantOffsets()` here -- your current code will treat GEPs with multiple indices incorrectly.

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3077
+        if (!FindAvailableLoadedValue(LI1, LoadBB, BBIt, DefMaxInstsToScan, AA,
+                                      &IsLoadCSE))
+          return nullptr;
----------------
I don't get this check -- you're checking that there is an available value, but not that this value is the previous load. There might still be a clobber in between.

What you're probably looking for is canInstructionRangeModRef() between the two instructions.

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3108
+        } else
+          NewOp = (Value *)NewLoad;
+
----------------
Unnecessary cast

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3114
+          NewOp = BinaryOperator::CreateShl(NewOp, ShAmt1);
+          Builder.Insert(NewOp);
+        }
----------------
Why does this not use the builder to create the instruction, only to insert it?

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:3119
+          Value *Zero = Constant::getNullValue(X->getType());
+          return BinaryOperator::CreateOr(NewOp, Zero);
+        } else
----------------
Uh, what's the point of oring something with zero?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127392/new/

https://reviews.llvm.org/D127392