[PATCH] D150851: [LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable

Fri Jul 7 04:40:18 PDT 2023

Mel-Chen added a comment.

Update: I found an example where the approach in D152693 <https://reviews.llvm.org/D152693> leads to an incorrect result:
Assume `start_value` is 3 and `red_part` ends up as {0, 1, 2, 3}. If that 3 was written by the loop rather than carried over from `start_value`, `red` should be 3, but the approach yields 2, because the final mask treats every lane equal to `start_value` as not updated.
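
To make the case easy to check, here is a rough lane-by-lane emulation of the quoted pseudocode in plain C++. It is only a sketch under my own assumptions: a fixed vector length of 4, `n` a multiple of that length, `DataTypeMin` taken as 0 for `unsigned`, and the helper names `scalar_reduce`, `sentinel_reduce` and `check_counterexample` are made up for illustration. It is not the code from either patch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <limits>

constexpr unsigned VL = 4; // assumed vector length, for illustration only

// Scalar reference loop from the quoted example.
unsigned scalar_reduce(const int *a, const int *b, unsigned n,
                       unsigned start_value) {
  unsigned red = start_value;
  for (unsigned i = 0; i < n; ++i)
    red = (a[i] > b[i]) ? i : red;
  return red;
}

// Lane-by-lane emulation of the quoted vectorized form.
// Assumes n is a multiple of VL (no tail handling).
unsigned sentinel_reduce(const int *a, const int *b, unsigned n,
                         unsigned start_value) {
  std::array<unsigned, VL> red_part;
  std::array<unsigned, VL> step_vec;
  for (unsigned lane = 0; lane < VL; ++lane) {
    red_part[lane] = start_value; // splat(start_value)
    step_vec[lane] = lane;        // {0, 1, 2, ...}
  }
  for (unsigned i = 0; i < n; i += VL) {
    for (unsigned lane = 0; lane < VL; ++lane) {
      if (a[i + lane] > b[i + lane])
        red_part[lane] = step_vec[lane];
      step_vec[lane] += VL;
    }
  }
  // Epilogue: lanes equal to start_value are treated as "never updated".
  bool may_update = false;
  unsigned max_masked = std::numeric_limits<unsigned>::min(); // DataTypeMin
  for (unsigned lane = 0; lane < VL; ++lane) {
    bool ne_start_value = red_part[lane] != start_value;
    may_update = may_update || ne_start_value;
    unsigned masked = ne_start_value ? red_part[lane]
                                     : std::numeric_limits<unsigned>::min();
    max_masked = std::max(max_masked, masked); // reduce.umax
  }
  return may_update ? max_masked : start_value;
}

// The case described above: start_value = 3, every lane is updated.
void check_counterexample() {
  const int a[VL] = {1, 1, 1, 1};
  const int b[VL] = {0, 0, 0, 0};
  assert(scalar_reduce(a, b, VL, 3) == 3u);   // loop wrote i = 3 last
  assert(sentinel_reduce(a, b, VL, 3) == 2u); // lane 3 is masked out
}
```

With `a[i] > b[i]` true for every `i` in 0..3, `red_part` becomes {0, 1, 2, 3}; lane 3 compares equal to `start_value`, gets masked to the minimum, and the max reduction returns 2.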

@artagnon, could you please help verify it?

> If we focus on removing the wrapping and bound restrictions, I think we can consider the approach proposed by @artagnon in D152693 <https://reviews.llvm.org/D152693>. This method cleverly extends the technique used by @david-arm in `SelectICmp`. The approach can be summarized as follows: 
> Consider the loop:
>
>   unsigned int red = start_value;
>   for (unsigned int i = 0; i < n; ++i)
>     red = (a[i] > b[i]) ? i : red;
>
> vectorizes to:
>
>   unsigned int red = start_value;
>   vec_unsigned_int red_part = splat(start_value);
>   vec_unsigned_int step_vec = {0, 1, 2, ...};
>   for (unsigned int i = 0; i < n; i+=vl) {
>     red_part = (vec_a[i] > vec_b[i]) ? step_vec : red_part;
>     step_vec += {vl, vl, vl, ...};
>   }
>   vec_bool ne_start_value = red_part != splat(start_value);
>   bool may_update = reduce.or(ne_start_value);
>   vec_unsigned_int masked_red_part = ne_start_value ? red_part : splat(DataTypeMin);
>   red = may_update ? reduce.smax|umax(masked_red_part) : start_value;

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150851/new/

https://reviews.llvm.org/D150851