[llvm] [InterleavedAccess] Construct interleaved access store with shuffles (PR #164000)
Rajveer Singh Bharadwaj via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 22 03:49:01 PDT 2025
================
@@ -18173,6 +18180,135 @@ bool AArch64TargetLowering::lowerInterleavedStore(Instruction *Store,
return true;
}
+/// If the interleaved vector elements are greter than supported MaxFactor
+/// then, interleaving the data with additional shuffles can be used to
+/// achieve the same.
+/// Below shows how 8 interleaved data are shuffled to store with stN
+/// instructions. Data need store in this order v0,v1,v2,v3,v4,v5,v6,v7
+///  v0     v4    v2     v6    v1     v5    v3     v7
+///   |     |      |     |      |     |      |     |
+///    \   /        \   /        \   /        \   /
+/// [zip v0,v4]  [zip v2,v6]  [zip v1,v5]  [zip v3,v7]  ==> stN = 4
+///      |            |            |            |
+///       \          /              \          /
+///        \        /                \        /
+///         \      /                  \      /
+///   [zip [v0,v2,v4,v6]]       [zip [v1,v3,v5,v7]]  ==> stN = 2
+///
+/// In stN = 4 level upper half of interleaved data V0,V1,V2,V3 is store
+/// withone st4 instruction. Lower half V4,V5,V6,V7 store with another st4.
+///
+/// In stN = 2 level first upper half of interleaved data V0,V1 is store
+/// with one st2 instruction. Second set V2,V3 with store with another st2.
+/// Total of 4 st2 are required.
----------------
Rajveer100 wrote:
Nit.
```suggestion
/// If the number of interleaved vectors is greater than the supported
/// MaxFactor, the data can be interleaved with additional shuffles to
/// achieve the same result.
///
/// Consider 8 interleaved vectors that are shuffled so that they can be
/// stored with stN instructions. The data needs to be stored in this order:
/// [v0, v1, v2, v3, v4, v5, v6, v7]
///
///  v0     v4    v2     v6    v1     v5    v3     v7
///   |     |      |     |      |     |      |     |
///    \   /        \   /        \   /        \   /
/// [zip v0,v4]  [zip v2,v6]  [zip v1,v5]  [zip v3,v7]  ==> stN = 4
///      |            |            |            |
///       \          /              \          /
///        \        /                \        /
///         \      /                  \      /
///   [zip [v0,v2,v4,v6]]       [zip [v1,v3,v5,v7]]  ==> stN = 2
///
/// For stN = 4, the upper half of the interleaved data, V0, V1, V2, V3, is
/// stored with one st4 instruction. The lower half, i.e., V4, V5, V6, V7, is
/// stored with another st4.
///
/// For stN = 2, the first set of interleaved data, V0, V1, is stored with
/// one st2 instruction. The second set, V2, V3, is stored with another st2.
/// A total of 4 st2 instructions are required here.
```
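For anyone following along, the zip decomposition described in the comment can be checked with a small standalone model. This is only a sketch and not the code in the PR: `zip`, `slice`, `storeInterleaved`, `makeInputs`, and the `interleave8*` helpers are hypothetical names, plain `std::vector<int>` stands in for vector registers, and `storeInterleaved` models what an stN instruction would write to memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Full zip of two equal-length vectors: a0,b0,a1,b1,...
// (i.e. the results of zip1 and zip2 concatenated).
Vec zip(const Vec &A, const Vec &B) {
  assert(A.size() == B.size() && "zip operands must have equal length");
  Vec R;
  R.reserve(A.size() * 2);
  for (std::size_t I = 0; I < A.size(); ++I) {
    R.push_back(A[I]);
    R.push_back(B[I]);
  }
  return R;
}

// Returns lanes [Part * Len, (Part + 1) * Len) of V.
Vec slice(const Vec &V, std::size_t Part, std::size_t Len) {
  auto First = V.begin() + static_cast<std::ptrdiff_t>(Part * Len);
  return Vec(First, First + static_cast<std::ptrdiff_t>(Len));
}

// Models stN: appends the lane-wise interleaving of Regs to Mem.
void storeInterleaved(const std::vector<Vec> &Regs, Vec &Mem) {
  for (std::size_t Lane = 0; Lane < Regs[0].size(); ++Lane)
    for (const Vec &R : Regs)
      Mem.push_back(R[Lane]);
}

// Reference result: a direct factor-8 interleave of v0..v7.
Vec interleave8(const std::vector<Vec> &V) {
  Vec Mem;
  storeInterleaved(V, Mem);
  return Mem;
}

// stN = 4 level: one round of zips, then two st4-style stores.
Vec interleave8ViaSt4(const std::vector<Vec> &V) {
  const std::size_t N = V[0].size();
  std::vector<Vec> Z;
  for (std::size_t I = 0; I < 4; ++I)
    Z.push_back(zip(V[I], V[I + 4])); // zip(v0,v4), zip(v1,v5), ...
  Vec Mem;
  for (std::size_t Half = 0; Half < 2; ++Half) {
    std::vector<Vec> Regs;
    for (const Vec &R : Z)
      Regs.push_back(slice(R, Half, N));
    storeInterleaved(Regs, Mem); // one st4 per half
  }
  return Mem;
}

// stN = 2 level: two rounds of zips, then four st2-style stores.
Vec interleave8ViaSt2(const std::vector<Vec> &V) {
  const std::size_t N = V[0].size();
  Vec Even = zip(zip(V[0], V[4]), zip(V[2], V[6])); // v0,v2,v4,v6 per lane
  Vec Odd = zip(zip(V[1], V[5]), zip(V[3], V[7]));  // v1,v3,v5,v7 per lane
  Vec Mem;
  for (std::size_t Q = 0; Q < 4; ++Q)
    storeInterleaved({slice(Even, Q, N), slice(Odd, Q, N)}, Mem); // one st2
  return Mem;
}

// Test data: lane I of vector vJ holds J * 100 + I.
std::vector<Vec> makeInputs(std::size_t N) {
  std::vector<Vec> V(8, Vec(N));
  for (std::size_t J = 0; J < 8; ++J)
    for (std::size_t I = 0; I < N; ++I)
      V[J][I] = static_cast<int>(J * 100 + I);
  return V;
}
```

For inputs from `makeInputs(4)`, both `interleave8ViaSt4` and `interleave8ViaSt2` should produce the same memory image as `interleave8`, which matches the two st4 stores and four st2 stores described in the comment.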
https://github.com/llvm/llvm-project/pull/164000