[all-commits] [llvm/llvm-project] 3033f2: [IR] Add llvm.vector.[de]interleave{4, 6, 8} (#139893)
Luke Lau via All-commits
all-commits at lists.llvm.org
Mon May 26 10:45:34 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 3033f202f6707937cd28c2473479db134993f96f
https://github.com/llvm/llvm-project/commit/3033f202f6707937cd28c2473479db134993f96f
Author: Luke Lau <luke at igalia.com>
Date: 2025-05-26 (Mon, 26 May 2025)
Changed paths:
M llvm/docs/LangRef.rst
M llvm/include/llvm/IR/Intrinsics.h
M llvm/include/llvm/IR/Intrinsics.td
M llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
M llvm/lib/IR/Intrinsics.cpp
M llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll
M llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll
M llvm/test/CodeGen/RISCV/rvv/vector-interleave-fixed.ll
M llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
Log Message:
-----------
[IR] Add llvm.vector.[de]interleave{4,6,8} (#139893)
This adds [de]interleave intrinsics for factors of 4,6,8, so that every
interleaved memory operation supported by the in-tree targets can be
represented by a single intrinsic.
For context, [de]interleaves of fixed-length vectors are represented by
a series of shufflevectors. The intrinsics are needed for scalable
vectors, and we don't currently scalably vectorize all possible factors
of interleave groups supported by RISC-V/AArch64.
The underlying reason for this is that higher factors are currently
represented by interleaving multiple interleaves themselves, which made
sense at the time in the discussion in
https://github.com/llvm/llvm-project/pull/89018.
But after trying to integrate these for higher factors on RISC-V I think
we should revisit this design choice:
- Matching these in InterleavedAccessPass is non-trivial: We currently
only support factors that are a power of 2, and detecting this requires
a good chunk of code
- The shufflevector masks used for [de]interleaves of fixed-length
vectors are much easier to pattern match as they are strided patterns,
but for the intrinsics it's much more complicated to match as the
structure is a tree.
- Unlike shufflevectors, there's no optimisation that happens on
[de]interleave2 intriniscs
- For non-power-of-2 factors e.g. 6, there are multiple possible ways a
[de]interleave could be represented, see the discussion in #139373
- We already have intrinsics for 2,3,5 and 7, so by avoiding 4,6 and 8
we're not really saving much
By representing these higher factors are interleaved-interleaves, we can
in theory support arbitrarily high interleave factors. However I'm not
sure this is actually needed in practice: SVE only has instructions
for factors 2,3,4, whilst RVV only supports up to factor 8.
This patch would make it much easier to support scalable interleaved
accesses in the loop vectorizer for RISC-V for factors 3,5,6 and 7, as
the loop vectorizer and InterleavedAccessPass wouldn't need to
construct and match trees of interleaves.
For interleave factors above 8, for which there are no hardware memory
operations to match in the InterleavedAccessPass, we can still keep the
wide load + recursive interleaving in the loop vectorizer.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list