[llvm] [AArch64] Optimized generated assembly for bool to svbool_t conversions (PR #83001)

Tue Feb 27 03:09:24 PST 2024

paulwalker-arm wrote:

> > The main concern I have about this commit is that it relies on splat_vector always being implemented using a zeroing instruction.
> > Do you think it is okay to embed this reliance into the code?
> 
> The real question is what is the canonical representation in the AArch64 backend of types like `<vscale x N x i1>` where `N < 16`. Taking `<vscale x 8 x i1>` as an example, we know it has to be represented with a bit-pattern that is `vscale * 16` bits wide. Naturally, the "active" bits are the even-numbered bits and the "inactive" bits are the odd-numbered bits. The question is about the inactive bits. We can say they must be zero. Or we can say they can be undefined. We are going through a lot of effort to zero the "inactive" bits (that's the whole point of convert to/from svbool), which makes me think the canonical representation is with zero in "inactive" bits. Hence `splat_vector` must be lowered to instructions which yield zero in the inactive bits; other possible lowerings would be semantically incorrect.

By design the "invisible bits" within a predicate type are undefined. When changing the visibility of bits via the to/from_svbool intrinsics we didn't want to leave newly visible bits hanging and so to_svbool defines them as zero. `isZeroingInactiveLanes` exists to reduce the need to explicitly zero these lanes by entering certain operations into a contract to guarantee all isel related to them will zero their invisible lanes.
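
As a rough illustration of that distinction, here is a toy model in plain C++. It is not LLVM code: it assumes `vscale == 1` so a predicate register is 16 bits, and the helper names (`visibleMask`, `toSvbool`, `samePredicate`) are invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: one 16-bit predicate granule, i.e. vscale == 1 (assumption).
// For <vscale x N x i1>, each element owns 16/N bits of the register and only
// the lowest bit of each element is "visible". All names here are hypothetical.
constexpr uint16_t visibleMask(unsigned N) {
  uint16_t Mask = 0;
  for (unsigned Bit = 0; Bit < 16; Bit += 16 / N)
    Mask |= uint16_t(1u << Bit);
  return Mask;
}

// Models to_svbool: bits that become newly visible are defined as zero.
constexpr uint16_t toSvbool(uint16_t Raw, unsigned N) {
  return Raw & visibleMask(N);
}

// Two raw bit patterns denote the same <vscale x N x i1> value whenever their
// visible bits agree; the invisible bits are undefined and may differ.
constexpr bool samePredicate(uint16_t A, uint16_t B, unsigned N) {
  return ((A ^ B) & visibleMask(N)) == 0;
}

// A few spot checks of the model.
inline void checkModel() {
  // <vscale x 8 x i1>: the even-numbered bits are visible.
  assert(visibleMask(8) == 0x5555);
  // An all-true nxv8i1 with garbage in the invisible bits is still all-true...
  assert(samePredicate(0xFFFF, 0x5555, 8));
  // ...but widening it to svbool must clear that garbage.
  assert(toSvbool(0xFFFF, 8) == 0x5555);
}
```

In this model an operation covered by the `isZeroingInactiveLanes` contract corresponds to one whose raw result already satisfies `Raw == toSvbool(Raw, N)`, which is what makes the explicit masking step redundant.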

I suspect the reason that non-constant `SPLAT_VECTOR`s were omitted is because they don't typically come from ACLE code where to_svbool is prevalent. Out of interest, in what context are you seeing them today?

https://github.com/llvm/llvm-project/pull/83001