[PATCH] D135564: [AArch64-SVE]: Force generating code compatible to streaming mode for (masked/extending/truncating) load/store
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 20 06:13:24 PDT 2022
sdesmalen added a comment.
Hi @hassnaa-arm, I think you have the tests the wrong way around. The tests from D136147 <https://reviews.llvm.org/D136147> should be part of //this// patch, because this is the patch where you're implementing the lowering of the operations you're testing in D136147 <https://reviews.llvm.org/D136147>.
After this patch, you get the masked/truncating/extending load/store operations "for free", so the tests for those operations could be moved to a separate test-only patch like D136147 <https://reviews.llvm.org/D136147>.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1394
+ for (MVT VT : {MVT::v8i8, MVT::v16i8, MVT::v4i16, MVT::v8i16, MVT::v2i32,
+ MVT::v4i32, MVT::v1i64, MVT::v2i64})
+ addTypeForStreamingSVE(VT);
----------------
It should be able to use standard scalar instructions for v1i64 in streaming-compatible mode, so this one can be removed from the list.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1397
+
+ for (MVT VT : {MVT::v4f16, MVT::v8f16, MVT::v2f32, MVT::v4f32, MVT::v1f64,
+ MVT::v2f64})
----------------
Most scalar FP operations are valid in streaming mode, so we probably don't need to do anything custom for this type.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1609-1611
+ setOperationAction(ISD::ANY_EXTEND, VT, Custom);
+ setOperationAction(ISD::ZERO_EXTEND, VT, Custom);
+ setOperationAction(ISD::SIGN_EXTEND, VT, Custom);
----------------
nit: Perhaps it doesn't lead to an error, but these operations only operate on integers, so should be guarded by:
if (VT.isInteger()) { ... }
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12264
- // If this is extracting the upper 64-bits of a 128-bit vector, we match
- // that directly.
- if (Size == 64 && Idx * InVT.getScalarSizeInBits() == 64 &&
- InVT.getSizeInBits() == 128)
- return Op;
+ if (!Subtarget->forceStreamingCompatibleSVE()) {
+ // If this is extracting the upper 64-bits of a 128-bit vector, we match
----------------
nit: rather than wrapping this in another condition, can you just add it to the existing condition with `&& !Subtarget->forceStreamingCompatibleSVE()` ?
================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:3035-3036
// Extract element from vector with immediate index that's within the bottom 128-bits.
- let AddedComplexity = 1 in {
- def : Pat<(i32 (vector_extract (nxv16i8 ZPR:$vec), VectorIndexB:$index)),
- (i32 (UMOVvi8 (v16i8 (EXTRACT_SUBREG ZPR:$vec, zsub)), VectorIndexB:$index))>;
- def : Pat<(i32 (vector_extract (nxv8i16 ZPR:$vec), VectorIndexH:$index)),
- (i32 (UMOVvi16 (v8i16 (EXTRACT_SUBREG ZPR:$vec, zsub)), VectorIndexH:$index))>;
- def : Pat<(i32 (vector_extract (nxv4i32 ZPR:$vec), VectorIndexS:$index)),
- (i32 (UMOVvi32 (v4i32 (EXTRACT_SUBREG ZPR:$vec, zsub)), VectorIndexS:$index))>;
- def : Pat<(i64 (vector_extract (nxv2i64 ZPR:$vec), VectorIndexD:$index)),
- (i64 (UMOVvi64 (v2i64 (EXTRACT_SUBREG ZPR:$vec, zsub)), VectorIndexD:$index))>;
- }
-
- def : Pat<(sext_inreg (vector_extract (nxv16i8 ZPR:$vec), VectorIndexB:$index), i8),
- (i32 (SMOVvi8to32 (v16i8 (EXTRACT_SUBREG ZPR:$vec, zsub)), VectorIndexB:$index))>;
- def : Pat<(sext_inreg (anyext (vector_extract (nxv16i8 ZPR:$vec), VectorIndexB:$index)), i8),
- (i64 (SMOVvi8to64 (v16i8 (EXTRACT_SUBREG ZPR:$vec, zsub)), VectorIndexB:$index))>;
+ let Predicates = [NotInStreamingSVEMode] in {
+ let AddedComplexity = 1 in {
+ def : Pat<(i32 (vector_extract (nxv16i8 ZPR:$vec), VectorIndexB:$index)),
----------------
nit: Can you change this into:
let Predicates = [NotInStreamingSVEMode], AddedComplexity = 1 in {
def : Pat<...>
..
}
let Predicates = [NotInStreamingSVEMode] in {
def : Pat<..>
...
}
Rather than indenting?
================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-store.ll:312
+
+define void @masked_store_v32f32(<32 x float>* %dst, <32 x i1> %mask) #0 {
+; CHECK-LABEL: masked_store_v32f32:
----------------
Can you remove all tests that are larger than "twice the size" of a 128bit vector (v32f32 is 8x the size, I'm not sure what value that adds for the testing of this functionality)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D135564/new/
https://reviews.llvm.org/D135564
More information about the llvm-commits
mailing list