[PATCH] D129501: Redefine get.active.lane.mask to allow a more scalar lowering

Mon Jul 11 17:40:42 PDT 2022

efriedma added a comment.

> As an aside, I am wondering if we need this intrinsic at all. The lowering chosen here could be used by the vectorizer directly, and the AArch64 whilelo pattern match for the EVL form would seem straightforward. Maybe we'd have trouble folding the SUB back in, but has anyone played with this?

The intrinsic was originally designed to be pattern-matched by MVE loop optimizations (llvm/lib/Target/ARM/MVETailPredication.cpp).  I think there were issues with making the pattern-matching work reliably without the intrinsic.

================
Comment at: llvm/docs/LangRef.rst:19962
+numbers and not in machine numbers.  If ``%n`` unsigned less than ``%base``, then
+the result is a poison value. The above is equivalent to:

----------------
A potential issue I see with this change is that it doesn't play well with unrolling in the vectorizer.  For example, if each iteration of a loop handles 8 elements at a time with vector width 4, you get two calls to llvm.get.active.lane.mask, I think.
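
To make the unrolling case concrete, here is a rough Python model of the mask computation. It is only a sketch under stated assumptions: `active_lane_mask` and `unrolled_masks` are hypothetical helper names, `None` stands in for poison, and the assumption that the second unrolled part uses `index + VF` as its base is one reading of the concern, not something confirmed by the patch. Lane `i` is modeled as active when `base + i < n`, evaluated without overflow.

```python
from typing import List, Optional


def active_lane_mask(base: int, n: int, vf: int,
                     poison_if_n_lt_base: bool) -> Optional[List[bool]]:
    """Toy model of llvm.get.active.lane.mask(base, n) for vf lanes.

    Returns None to represent poison. Python ints do not wrap, so the
    comparison is done in natural numbers rather than machine numbers.
    """
    if poison_if_n_lt_base and n < base:
        # Proposed wording: %n unsigned less than %base yields poison.
        return None
    return [base + i < n for i in range(vf)]


def unrolled_masks(index: int, n: int, vf: int, uf: int,
                   poison_if_n_lt_base: bool) -> List[Optional[List[bool]]]:
    """One mask per unrolled part (assumed bases: index, index + vf, ...)."""
    return [active_lane_mask(index + part * vf, n, vf, poison_if_n_lt_base)
            for part in range(uf)]


if __name__ == "__main__":
    # Trip count 10, vector width 4, unrolled by 2: 8 elements per iteration.
    # In the final iteration (index 8) the second part has base 12 > n.
    print(unrolled_masks(8, 10, 4, 2, poison_if_n_lt_base=False))
    print(unrolled_masks(8, 10, 4, 2, poison_if_n_lt_base=True))
```

In this model, the last iteration's second call has a base past `%n`: without the poison rule it yields an all-false mask, while with the rule it yields poison, which is the kind of interaction with unrolling that would need handling.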

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129501/new/

https://reviews.llvm.org/D129501