[llvm] [LV] Fix FindLastIV reduction for epilogue vectorization. (PR #120395)
Mel Chen via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 20 02:55:45 PST 2024
================
@@ -3405,15 +3406,13 @@ void VPReductionPHIRecipe::execute(VPTransformState &State) {
}
} else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK)) {
// [I|F]FindLastIV will use a sentinel value to initialize the reduction
- // phi or the resume value from the main vector loop when vectorizing the
- // epilogue loop. In the exit block, ComputeReductionResult will generate
- // checks to verify if the reduction result is the sentinel value. If the
- // result is the sentinel value, it will be corrected back to the start
- // value.
+ // phi. In the exit block, ComputeReductionResult will generate checks to
+ // verify if the reduction result is the sentinel value. If the result is
+ // the sentinel value, it will be corrected back to the start value.
// TODO: The sentinel value is not always necessary. When the start value is
// a constant, and smaller than the start value of the induction variable,
// the start value can be directly used to initialize the reduction phi.
- Iden = StartV;
+ StartV = Iden = RdxDesc.getSentinelValue();
----------------
Mel-Chen wrote:
This is for correctness. For example:
```
int rdx = 3;
for (int i = 0; i < n; ++i)
  rdx = a[i] > 3 ? i : rdx;
return rdx;
```
The induction variable {0, +, 1} generates the sequence <0, 1, 2, 3, ..., n-1>. This sequence may contain the start value (3 in the example), so during the reduction the start value has to be replaced by a number (the sentinel) that does not appear in the sequence and is smaller than all of its members. After the reduction, the result is corrected with `rdx_res = max_rdx != sentinel ? max_rdx : start_value`.
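To make the problem concrete, here is a minimal scalar emulation of a 4-lane vectorized loop, written only as an illustration. The function names, the fixed `VF = 4`, and the choice of `INT_MIN` as the sentinel are assumptions made for this sketch; it is not the code the vectorizer generates.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <climits>
#include <vector>

// Scalar reference: the loop from the first example.
int findLastScalar(const std::vector<int> &a, int start) {
  int rdx = start;
  const int n = static_cast<int>(a.size());
  for (int i = 0; i < n; ++i)
    rdx = a[i] > 3 ? i : rdx;
  return rdx;
}

// Emulated vector loop. Every lane starts at `init`, the lanes are combined
// with a max reduction, and the scalar remainder resumes from the result.
// If `useSentinel` is true, the max is corrected back to `start` first.
static int findLastVectorized(const std::vector<int> &a, int start, int init,
                              bool useSentinel) {
  constexpr int VF = 4; // assumed vectorization factor
  std::array<int, VF> lanes;
  lanes.fill(init);
  const int n = static_cast<int>(a.size());
  int i = 0;
  for (; i + VF <= n; i += VF)
    for (int l = 0; l < VF; ++l)
      lanes[l] = a[i + l] > 3 ? i + l : lanes[l];
  const int maxRdx = *std::max_element(lanes.begin(), lanes.end());
  // rdx_res = max_rdx != sentinel ? max_rdx : start_value
  int rdx = (useSentinel && maxRdx == init) ? start : maxRdx;
  for (; i < n; ++i) // scalar remainder
    rdx = a[i] > 3 ? i : rdx;
  return rdx;
}

// Incorrect in general: lanes are initialized with the start value itself.
int findLastNaive(const std::vector<int> &a, int start) {
  return findLastVectorized(a, start, /*init=*/start, /*useSentinel=*/false);
}

// Lanes are initialized with a sentinel (INT_MIN is an assumption here).
int findLastSentinel(const std::vector<int> &a, int start) {
  return findLastVectorized(a, start, /*init=*/INT_MIN, /*useSentinel=*/true);
}
```

With `a = {0, 9, 0, 0}` and a start value of 3, the only match is at index 1. The naive version computes `max(3, 1, 3, 3)` and returns 3, while the sentinel version agrees with the scalar loop and returns 1.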
For epilogue vectorization, the same approach must be applied to implement FindLastIV, especially since the main vector loop may be skipped. In that case, the resume value entering the epilogue vector loop is the start value itself. The epilogue vector loop must handle this in the same way as the main vector loop.
Interestingly, in most real-world code, it probably looks like this:
```
int rdx = -1;
for (int i = 0; i < n; ++i)
  rdx = a[i] > 3 ? i : rdx;
return rdx;
```
In this case, directly initializing the phi with the start value (-1 in the second example) is entirely correct. As long as the start value does not appear in the {0, +, 1} sequence and is smaller than all the sequence members, a sentinel is unnecessary, and the start value can be used to initialize the phi directly. This is a possible direction for improving the FindLastIV idiom.
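The condition above can be sketched as a small check plus a sentinel-free reduction. Both `startIsSafeInit` and `findLastNoSentinel` are hypothetical names used only for this illustration, and the sketch assumes an increasing, non-wrapping induction variable {ivStart, +, 1} and an emulated vectorization factor of 4.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Hypothetical helper: for an increasing, non-wrapping IV {ivStart, +, 1},
// every member is >= ivStart, so a start value below ivStart can never
// collide with the sequence and is smaller than all of its members.
constexpr bool startIsSafeInit(int start, int ivStart) {
  return start < ivStart;
}

// Emulated 4-lane reduction that initializes the lanes with the start value
// directly. No sentinel and no fix-up after the max reduction are needed,
// provided startIsSafeInit(start, 0) holds.
int findLastNoSentinel(const std::vector<int> &a, int start) {
  constexpr int VF = 4; // assumed vectorization factor
  std::array<int, VF> lanes;
  lanes.fill(start);
  const int n = static_cast<int>(a.size());
  int i = 0;
  for (; i + VF <= n; i += VF)
    for (int l = 0; l < VF; ++l)
      lanes[l] = a[i + l] > 3 ? i + l : lanes[l];
  int rdx = *std::max_element(lanes.begin(), lanes.end());
  for (; i < n; ++i) // scalar remainder
    rdx = a[i] > 3 ? i : rdx;
  return rdx;
}
```

For the second example, `startIsSafeInit(-1, 0)` holds, so the lanes can start at -1. For the first example, `startIsSafeInit(3, 0)` is false and the sentinel is still required.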
https://github.com/llvm/llvm-project/pull/120395