[PATCH] D71432: [AArch64][SVE] Proposal to use op+select to match scalable predicated operations

Thu Dec 12 17:57:11 PST 2019

cameron.mcinally added a comment.

In D71432#1782606 <https://reviews.llvm.org/D71432#1782606>, @efriedma wrote:

> Adding patterns for vselect of various operations seems reasonable in general.  The patterns are simple enough that it's not a big deal to repeat for a bunch of instructions.

I'm sure you know, but for others, this is how AVX512 handles predication. We originally began adding predicated intrinsics for every AVX512 instruction, but they were later replaced with op+select patterns (except for masked loads/stores/gathers/scatters and a few others). I'll admit that this isn't an apples-to-apples comparison, though, since AVX512 has 5000+ predicated instructions. I don't know the SVE predicated instruction count, but my intuition says it's much lower (please correct me if I'm wrong).

> For floating-point ops in particular, I'm sort of wondering how this interacts with STRICT_* operations. I think these patterns should not match in that case? We'd be suppressing exceptions that would otherwise trigger.  Not sure how important that is.

It's actually the other way around. Vectorization would need the selects in place to suppress exceptions on lanes where the guarding condition is false. E.g.:

  for(...)
    if (b[i] != 0)
      a[i] = a[i]/b[i];
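
To make the lane-level picture concrete, here is a rough sketch in plain C (not SVE code) of what a merging predicated divide has to do for that loop. The LANES width and the predicated_div name are made up for illustration; the point is only that inactive lanes keep their old value and never execute the divide, so they cannot raise a divide-by-zero exception.

```c
#include <stddef.h>

#define LANES 4 /* hypothetical vector width, for illustration only */

/* Scalar emulation of a merging predicated divide (assumed semantics):
   lanes where b != 0 compute a / b, all other lanes leave a untouched
   and perform no division at all. */
static void predicated_div(float *a, const float *b, size_t n) {
  for (size_t i = 0; i < n; i += LANES) {
    int mask[LANES] = {0};

    /* Step 1: build the predicate from the guard condition. */
    for (size_t l = 0; l < LANES && i + l < n; ++l)
      mask[l] = (b[i + l] != 0.0f);

    /* Step 2: apply the operation only on active lanes. */
    for (size_t l = 0; l < LANES && i + l < n; ++l)
      if (mask[l])
        a[i + l] = a[i + l] / b[i + l];
  }
}
```

An unpredicated divide followed by a select would produce the same values in a[], but it would still execute the divide on the b[i] == 0 lanes, which is exactly where the exception would be raised.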

My current understanding (and I could be wrong) is that the native predication intrinsics and the constrained intrinsics will be merged into one set. So once we have native predication + constrained intrinsics, these op+select patterns can go away on all targets. The op+select patterns are just a stopgap.

https://reviews.llvm.org/D71432