[PATCH] D127392: [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 18 11:12:39 PDT 2022
dmgreen added inline comments.
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:444-445
+// 1. (zExt(L1) << shift1) | (zExt(L2) << shift2) -> zExt(L3) << shift1
+// 2. (? | (zExt(L1) << shift1)) | (zExt(L2) << shift2) -> ? | (zExt(L3) <<
+// shift1)
+static bool foldConsecutiveLoads(Value *V, LoadOps &LOps, const DataLayout &DL,
----------------
The comment could flow better:
```
// 2. (? | (zExt(L1) << shift1)) | (zExt(L2) << shift2)
// -> ? | (zExt(L3) << shift1)
```
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:475
+ // Check if loads are same, atomic or volatile.
+ if ((LI1 == LI2) || !LI1 || !LI2 || (!LI1->isSimple() && !LI2->isSimple()))
+ return false;
----------------
Drop extra brackets from (LI1 == LI2)
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:485
+
+ // Is second Load a GEP
+ Value *Load2Ptr;
----------------
I think this can use getPointersDiff to check the two pointers are the right distance apart.
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:527
+ int MaxScan = 0;
+ if (LI1->comesBefore(LI2)) {
+ BasicBlock::iterator BBIt(LI2);
----------------
Do we check anywhere that LI1 and LI2 are in the same block?
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:529-532
+ for (BasicBlock::iterator II = LI1->getIterator(),
+ IE = LI2->getIterator();
+ II != IE; ++II)
+ MaxScan++;
----------------
Does std::distance work?
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:579
+ LoadOps LOps;
+ if (!foldConsecutiveLoads((Value *)(&I), LOps, DL, DT) || !LOps.FoundRoot)
+ return false;
----------------
This doesn't need to cast to a Value*
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:588
+ // TTI based checks if we want to proceed with wider load
+ bool Allows =
+ TTI.isTypeLegal(IntegerType::get(I.getContext(), LOps.LoadSize));
----------------
Allows -> Allowed
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:589
+ bool Allows =
+ TTI.isTypeLegal(IntegerType::get(I.getContext(), LOps.LoadSize));
+ if (!Allows)
----------------
This could also be DL.isLegalInteger, which would avoid the need to create the Type.
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:597
+ LI1->getAlign(), &Fast);
+ if (!Allows)
+ return false;
----------------
Should this check the Fast flag too?
```
if (!Allowed || !Fast)
```
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:601-608
+ Value *Op1 = LI1->getOperand(0);
+ if (isa<GetElementPtrInst>(Op1)) {
+ GetElementPtrInst *GEP1 = dyn_cast<GetElementPtrInst>(Op1);
+ NewLoad = new LoadInst(IntegerType::get(GEP1->getContext(), LOps.LoadSize),
+ GEP1, "", LI1->isVolatile(), LI1->getAlign(),
+ LI1->getOrdering(), LI1->getSyncScopeID());
+ } else {
----------------
I think this is unneeded, and this can always just create:
```
NewLoad =
new LoadInst(IntegerType::get(Load1Ptr->getContext(), LOps.LoadSize),
LI1->getPointerOperand(), "", LI1->isVolatile(), LI1->getAlign(),
LI1->getOrdering(), LI1->getSyncScopeID());
```
Or possibly use Builder.CreateLoad to avoid the separate Insert.
================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:621-624
+ if (LOps.zext) {
+ NewOp = Builder.CreateZExt(NewLoad, LOps.zext);
+ } else
+ NewOp = (Value *)NewLoad;
----------------
This could avoid the NewOp variable and just do:
```
if (LOps.zext) {
NewLoad = Builder.CreateZExt(NewLoad, LOps.zext);
```
================
Comment at: llvm/test/Transforms/AggressiveInstCombine/AArch64/or-load.ll:25-26
+
+ %s1 = shl i16 %e1, 0
+ %s2 = shl i16 %e2, 8
+
----------------
Can you run these tests through `opt -O1` (without this patch) and use the result as the tests (maybe with a little cleanup). LLVM-IR will almost never include shl 0 nodes, and we should make sure we are testing what will appear in reality.
https://godbolt.org/z/raxKnEE9a
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D127392/new/
https://reviews.llvm.org/D127392
More information about the llvm-commits
mailing list