[PATCH] D147471: [VPlan] Don't add live-outs if scalar epilogue is required.
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Apr 8 05:49:43 PDT 2023
Ayal added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3723
Loop *VectorLoop = LI->getLoopFor(State.CFG.VPBB2IRBB[LatchVPBB]);
- if (Cost->requiresScalarEpilogue(VF)) {
- // No edge from the middle block to the unique exit block has been inserted
- // and there is nothing to fix from vector loop; phis should have incoming
- // from scalar loop only.
- Plan.clearLiveOuts();
- } else {
+ if (!Cost->requiresScalarEpilogue(VF)) {
// If we inserted an edge from the middle block to the unique exit block,
----------------
nit: retain explanation for the emptied else?
TODO: better understand from VPlan if any external (live-out?) users of IVs need to be fixed up, than to ask Cost?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8806
VPBasicBlock *MiddleVPBB, Loop *OrigLoop,
- VPlan &Plan) {
+ VPlan &Plan, bool RequiresScalarEpilogue) {
+ if (RequiresScalarEpilogue) {
----------------
Passing a parameter only to early-exit; better guard the call instead?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8811
+ // from scalar loop only.
+
+ return;
----------------
nit: redundant empty line?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8997
+ addUsersInExitBlock(HeaderVPBB, MiddleVPBB, OrigLoop, *Plan,
+ all_of(Range, RequiresScalarEpilogue));
----------------
How about delegating to CM the check if a Range requires a scalar epilog or not:
```
bool RequiresScalarEpilogue = CM.requiresScalarEpilogue(Range);
if (!RequiresScalarEpilogue) // No users to add when scalar epiloque must be reached.
addUsersInExitBlock(HeaderVPBB, MiddleVPBB, OrigLoop, *Plan);
```
?
================
Comment at: llvm/test/Transforms/LoopVectorize/loop-form.ll:201
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i32, ptr [[PTR:%.*]], i64 [[TMP0]]
; CHECK-NEXT: [[TMP2:%.*]] = getelementptr i32, ptr [[TMP1]], i32 0
; CHECK-NEXT: store <2 x i32> <i32 10, i32 10>, ptr [[TMP2]], align 4
----------------
Load eliminated from vectorized loop body (only), because we now recognize it is useless?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D147471/new/
https://reviews.llvm.org/D147471
More information about the llvm-commits
mailing list