[llvm] [X86] Attempt to fold trunc(srl(load(p), amt) -> load(p+amt/8) (PR #165266)

Tue Oct 28 01:35:32 PDT 2025

================
@@ -54652,6 +54653,39 @@ static SDValue combineTruncate(SDNode *N, SelectionDAG &DAG,
   if (SDValue V = combinePMULH(Src, VT, DL, DAG, Subtarget))
     return V;
 
+  // Fold trunc(srl(load(p),amt) -> load(p+amt/8)
+  // If we're shifting down whole byte+pow2 aligned bit chunks from a larger
+  // load for truncation, see if we can convert the shift into a pointer
+  // offset instead. Limit this to normal (non-ext) scalar integer loads.
+  if (SrcVT.isScalarInteger() && Src.getOpcode() == ISD::SRL &&
+      Src.hasOneUse() && Src.getOperand(0).hasOneUse() &&
+      ISD::isNormalLoad(Src.getOperand(0).getNode())) {
+    auto *Ld = cast<LoadSDNode>(Src.getOperand(0));
+    if (Ld->isSimple() && VT.isByteSized() &&
+        isPowerOf2_64(VT.getSizeInBits())) {
+      SDValue ShAmt = Src.getOperand(1);
----------------
RKSimon wrote:

What did you have in mind? We check that VT is byte sized (a multiple of 8 bits) and that it's a power of 2, then check that ShAmt is zero in the lowest bits matching the alignment of VT.

https://github.com/llvm/llvm-project/pull/165266