[PATCH] D150851: [LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable
Mel Chen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 10 03:08:10 PDT 2023
Mel-Chen added a comment.
In D150851#4480526 <https://reviews.llvm.org/D150851#4480526>, @artagnon wrote:
> Yes, I can confirm that there is indeed a bug. Thanks for catching it! I'm thinking about a fix now.
Perhaps Ayal's approach, quoted below, can help. Its final `found_func` step maps the recovered index back to a result, which makes the approach more widely applicable. If we focus only on increasing or decreasing induction variables, we may be able to streamline the process further.
In D150851#4467261 <https://reviews.llvm.org/D150851#4467261>, @Ayal wrote:
> Here's a sketch minimizing the size of the indices maintained throughout the loop, so they would avoid wrapping, provide out-of-bound values, and possibly use narrower types depending on trip-count and vl:
>
> return_type FindLast(return_type unfound_value, vec_predicate_func, found_func) {
>   vec_unsigned_int select_red_part = splat(0); // Zero indicates unfound.
>   vec_unsigned_int step_vec = splat(1); // Count vector iterations starting at 1.
>
>   for (unsigned int i = 0; i < n; i += vl, step_vec += splat(1))
>     select_red_part = vec_predicate_func(i) ? step_vec : select_red_part;
>
>   unsigned vec_indices_ored = reduce.or(select_red_part);
>   if (vec_indices_ored == 0)
>     return unfound_value;
>   vec_unsigned_int inflated_red_part = (select_red_part - splat(1)) * vl + <0,1,...,vl-1>;
>   unsigned last_index = reduce.umax(inflated_red_part);
>   return found_func(last_index);
> }
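
To make the sketch above easier to check by hand, here is a minimal scalar emulation in C++, next to the plain scalar loop it is meant to match. This is only an illustration under assumptions, not the patch's implementation: the names `VL`, `findLastScalar` and `findLastLanes` are hypothetical, `found_func` is taken to be the identity, and the tail is handled with a simple bounds check. One further assumption is that lanes which never matched are skipped before the final max, since `(0 - 1) * vl` would wrap in unsigned arithmetic and win the `umax`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical lane count standing in for vl in the sketch.
constexpr unsigned VL = 4;

// Scalar reference: index of the last element satisfying pred, else unfound.
long findLastScalar(const std::vector<int> &a,
                    const std::function<bool(int)> &pred, long unfound) {
  long last = unfound;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (pred(a[i]))
      last = static_cast<long>(i);
  return last;
}

// Lane-by-lane emulation of the quoted sketch (found_func assumed identity).
long findLastLanes(const std::vector<int> &a,
                   const std::function<bool(int)> &pred, long unfound) {
  assert(pred && "predicate must be callable");
  std::array<unsigned, VL> selectRedPart{}; // Zero indicates unfound.
  unsigned step = 1;                        // Vector iteration count, from 1.

  for (std::size_t i = 0; i < a.size(); i += VL, ++step)
    for (unsigned lane = 0; lane < VL; ++lane)
      // Bounds check models a predicated tail (an assumption, not in the sketch).
      if (i + lane < a.size() && pred(a[i + lane]))
        selectRedPart[lane] = step;

  unsigned ored = 0; // reduce.or over the lanes
  for (unsigned v : selectRedPart)
    ored |= v;
  if (ored == 0)
    return unfound;

  unsigned lastIndex = 0; // reduce.umax over the inflated indices
  for (unsigned lane = 0; lane < VL; ++lane) {
    if (selectRedPart[lane] == 0)
      continue; // Skip unfound lanes: (0 - 1) * VL would wrap.
    lastIndex = std::max(lastIndex, (selectRedPart[lane] - 1) * VL + lane);
  }
  return static_cast<long>(lastIndex);
}
```

Each lane stores only the 1-based vector iteration in which it last matched, so the values carried through the loop stay as small as the vector trip count; the full index is rebuilt once, after the loop.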
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D150851/new/
https://reviews.llvm.org/D150851