[PATCH] D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.

Mon Jul 18 11:12:39 PDT 2022

dmgreen added inline comments.

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:444-445
+// 1. (zExt(L1) << shift1) | (zExt(L2) << shift2) -> zExt(L3) << shift1
+// 2. (? | (zExt(L1) << shift1)) | (zExt(L2) << shift2) -> ? | (zExt(L3) <<
+// shift1)
+static bool foldConsecutiveLoads(Value *V, LoadOps &LOps, const DataLayout &DL,
----------------
The comment could flow better:
```
// 2. (? | (zExt(L1) << shift1)) | (zExt(L2) << shift2)
//  -> ? | (zExt(L3) << shift1)
```

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:475
+    // Check if loads are same, atomic or volatile.
+    if ((LI1 == LI2) || !LI1 || !LI2 || (!LI1->isSimple() && !LI2->isSimple()))
+      return false;
----------------
Drop extra brackets from (LI1 == LI2)

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:485
+
+    // Is second Load a GEP
+    Value *Load2Ptr;
----------------
I think this can use getPointersDiff to check the two pointers are the right distance apart.

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:527
+    int MaxScan = 0;
+    if (LI1->comesBefore(LI2)) {
+      BasicBlock::iterator BBIt(LI2);
----------------
Do we check anywhere that LI1 and LI2 are in the same block?

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:529-532
+      for (BasicBlock::iterator II = LI1->getIterator(),
+                                IE = LI2->getIterator();
+           II != IE; ++II)
+        MaxScan++;
----------------
Does std::distance work?

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:579
+  LoadOps LOps;
+  if (!foldConsecutiveLoads((Value *)(&I), LOps, DL, DT) || !LOps.FoundRoot)
+    return false;
----------------
This doesn't need to cast to a Value*

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:588
+  // TTI based checks if we want to proceed with wider load
+  bool Allows =
+      TTI.isTypeLegal(IntegerType::get(I.getContext(), LOps.LoadSize));
----------------
Allows -> Allowed

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:589
+  bool Allows =
+      TTI.isTypeLegal(IntegerType::get(I.getContext(), LOps.LoadSize));
+  if (!Allows)
----------------
This could also be DL.isLegalInteger, which would avoid the need to create the Type.

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:597
+                                              LI1->getAlign(), &Fast);
+  if (!Allows)
+    return false;
----------------
Should this check the Fast flag too?
```
if (!Allowed || !Fast)
```

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:601-608
+  Value *Op1 = LI1->getOperand(0);
+  if (isa<GetElementPtrInst>(Op1)) {
+    GetElementPtrInst *GEP1 = dyn_cast<GetElementPtrInst>(Op1);
+    NewLoad = new LoadInst(IntegerType::get(GEP1->getContext(), LOps.LoadSize),
+                           GEP1, "", LI1->isVolatile(), LI1->getAlign(),
+                           LI1->getOrdering(), LI1->getSyncScopeID());
+  } else {
----------------
I think this is unneeded, and this can always just create:
```
NewLoad =
        new LoadInst(IntegerType::get(Load1Ptr->getContext(), LOps.LoadSize),
                     LI1->getPointerOperand(), "", LI1->isVolatile(), LI1->getAlign(),
                     LI1->getOrdering(), LI1->getSyncScopeID());
```

Or possibly use Builder.CreateLoad to avoid the separate Insert.

================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:621-624
+  if (LOps.zext) {
+    NewOp = Builder.CreateZExt(NewLoad, LOps.zext);
+  } else
+    NewOp = (Value *)NewLoad;
----------------
This could avoid the NewOp variable and just do:
```
  if (LOps.zext) {
    NewLoad = Builder.CreateZExt(NewLoad, LOps.zext);
```

================
Comment at: llvm/test/Transforms/AggressiveInstCombine/AArch64/or-load.ll:25-26
+
+  %s1 = shl i16 %e1, 0
+  %s2 = shl i16 %e2, 8
+
----------------
Can you run these tests through `opt -O1` (without this patch) and use the result as the tests (maybe with a little cleanup). LLVM-IR will almost never include shl 0 nodes, and we should make sure we are testing what will appear in reality.
https://godbolt.org/z/raxKnEE9a

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127392/new/

https://reviews.llvm.org/D127392