[llvm] [LV] Fix FindLastIV reduction for epilogue vectorization. (PR #120395)
Mel Chen via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 8 02:28:17 PST 2025
================
@@ -3405,15 +3406,13 @@ void VPReductionPHIRecipe::execute(VPTransformState &State) {
}
} else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK)) {
// [I|F]FindLastIV will use a sentinel value to initialize the reduction
- // phi or the resume value from the main vector loop when vectorizing the
- // epilogue loop. In the exit block, ComputeReductionResult will generate
- // checks to verify if the reduction result is the sentinel value. If the
- // result is the sentinel value, it will be corrected back to the start
- // value.
+ // phi. In the exit block, ComputeReductionResult will generate checks to
+ // verify if the reduction result is the sentinel value. If the result is
+ // the sentinel value, it will be corrected back to the start value.
// TODO: The sentinel value is not always necessary. When the start value is
// a constant, and smaller than the start value of the induction variable,
// the start value can be directly used to initialize the reduction phi.
- Iden = StartV;
+ StartV = Iden = RdxDesc.getSentinelValue();
----------------
Mel-Chen wrote:
AnyOf can do this because there is only 1 element in the sequence.
FindLastIV possibly needs to be adjusted like this:
Change bc.merge.rdx from
```
; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i64 [ [[RDX_SELECT]], %[[VEC_EPILOG_ITER_CHECK]] ], [ 3, %[[VECTOR_MAIN_LOOP_ITER_CHECK]] ]
```
to
```
; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i64 [ [[TMP6]], %[[VEC_EPILOG_ITER_CHECK]] ], [ -9223372036854775808, %[[VECTOR_MAIN_LOOP_ITER_CHECK]] ]
;; TMP6 is defined by `[[TMP6:%.*]] = call i64 @llvm.vector.reduce.smax.v4i64(<4 x i64> [[TMP4]])`
```
The method is to pattern match `rdx_res = max_rdx != sentinel ? max_rdx : start_value`, and directly use the result of reduce.smax (i.e. `max_rdx`) to resume the reduction result instead of the adjusted result (i.e. `rdx_res`).
I think this is a little more complicated. What do you think?
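
To make the difference between resuming from `max_rdx` and resuming from `rdx_res` concrete, here is a minimal scalar model in C++. It is only a sketch, not the LoopVectorize implementation: the function names (`findLastIVScalar`, `maxReduce`, `fixup`, `resumeWithRawMax`, `resumeWithAdjusted`), the condition `a[i] > 0`, and the `split` point between main and epilogue loop are all hypothetical, and the per-lane select followed by `reduce.smax` is collapsed into a running `std::max`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel used to initialize the reduction; matches the i64 minimum
// (-9223372036854775808) seen in the CHECK line above.
constexpr int64_t Sentinel = std::numeric_limits<int64_t>::min();

// Scalar reference: the last index i with a[i] > 0, or `start` if none.
int64_t findLastIVScalar(const std::vector<int> &a, int64_t start) {
  int64_t rdx = start;
  for (int64_t i = 0; i < static_cast<int64_t>(a.size()); ++i)
    if (a[i] > 0)
      rdx = i;
  return rdx;
}

// Models one vector loop over [begin, end) followed by reduce.smax.
// `init` is the value the reduction phi starts from.
int64_t maxReduce(const std::vector<int> &a, int64_t begin, int64_t end,
                  int64_t init) {
  int64_t rdx = init;
  for (int64_t i = begin; i < end; ++i)
    if (a[i] > 0)
      rdx = std::max(rdx, i);
  return rdx;
}

// rdx_res = max_rdx != sentinel ? max_rdx : start_value
int64_t fixup(int64_t maxRdx, int64_t start) {
  return maxRdx != Sentinel ? maxRdx : start;
}

// Epilogue resumes from the raw reduce.smax result (max_rdx).
int64_t resumeWithRawMax(const std::vector<int> &a, int64_t split,
                         int64_t start) {
  int64_t n = static_cast<int64_t>(a.size());
  int64_t mainMax = maxReduce(a, 0, split, Sentinel);
  int64_t epiMax = maxReduce(a, split, n, mainMax);
  return fixup(epiMax, start);
}

// Epilogue resumes from the adjusted result (rdx_res).
int64_t resumeWithAdjusted(const std::vector<int> &a, int64_t split,
                           int64_t start) {
  int64_t n = static_cast<int64_t>(a.size());
  int64_t mainRes = fixup(maxReduce(a, 0, split, Sentinel), start);
  int64_t epiMax = maxReduce(a, split, n, mainRes);
  return fixup(epiMax, start);
}

// Small self-check: the only match is at index 5, inside the epilogue range.
void checkExample() {
  std::vector<int> a = {0, 0, 0, 0, 0, 1, 0, 0};
  assert(resumeWithRawMax(a, 4, 100) == findLastIVScalar(a, 100));
  assert(resumeWithAdjusted(a, 4, 100) != findLastIVScalar(a, 100));
}
```

In this model, with a start value of 100 and the only match at index 5, the main loop finds nothing. Resuming from the adjusted result carries 100 into the epilogue, where the smax then hides index 5. Resuming from the raw maximum carries the sentinel instead, so the epilogue result is still corrected properly in the exit block.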
https://github.com/llvm/llvm-project/pull/120395