[PATCH] D136858: [AArch64-SVE]: Force generating code compatible to streaming mode for sve-fixed-length tests.
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 16 10:11:10 PST 2022
sdesmalen added a comment.
Hi @hassnaa-arm, I've not been able to go through the entire patch yet, but I think it makes sense to split it up to make the changes a bit easier to review. I've left some comments to suggest how to split it up and also some comments on the code-changes itself.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1664
+ setCondCodeAction(ISD::SETOLT, VT, Expand);
+ setCondCodeAction(ISD::SETLT, VT, Expand);
+ setCondCodeAction(ISD::SETOLE, VT, Expand);
----------------
SETLT and SETLE should not be in this list, because they have a 1-1 mapping with instructions.
Most of the other nodes need expanding into two nodes (one for ordered/unordered and one for LE/LT), with SETO need expanding to `not(unordered)`.
It would also be nice if these changes could be moved to a separate patch.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1674-1682
+ setTruncStoreAction(MVT::v2f32, MVT::v2f16, Custom);
+ setTruncStoreAction(MVT::v4f32, MVT::v4f16, Custom);
+ setTruncStoreAction(MVT::v8f32, MVT::v8f16, Custom);
+ setTruncStoreAction(MVT::v1f64, MVT::v1f16, Custom);
+ setTruncStoreAction(MVT::v2f64, MVT::v2f16, Custom);
+ setTruncStoreAction(MVT::v4f64, MVT::v4f16, Custom);
+ setTruncStoreAction(MVT::v1f64, MVT::v1f32, Custom);
----------------
These actions are unrelated to `VT` as passed into the function, so they can be moved out of this function.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:3840
- if (useSVEForFixedLengthVectorVT(SrcVT))
+ if (useSVEForFixedLengthVectorVT(SrcVT,
+ Subtarget->forceStreamingCompatibleSVE()))
----------------
It would be nice if you could split up your patch such that each such a code change, lives in its own patch with a set of corresponding tests. That makes the patch a bit more manageable to review.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12772
- if (useSVEForFixedLengthVectorVT(Op.getOperand(0).getValueType()))
+ if (useSVEForFixedLengthVectorVT(Op.getOperand(0).getValueType(),
+ Subtarget->forceStreamingCompatibleSVE()))
----------------
Please move this change and corresponding tests to a separate patch.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14066
bool UseScalable;
- if (!Subtarget->hasNEON() ||
+ if (Subtarget->forceStreamingCompatibleSVE() ||
+ !Subtarget->hasNEON() ||
----------------
Rather than adding this condition here, you can add the condition to `isLegalInterleavedAccessType` like this:
```- if (Subtarget->useSVEForFixedLengthVectors() &&
- (VecSize % Subtarget->getMinSVEVectorSizeInBits() == 0 ||
- (VecSize < Subtarget->getMinSVEVectorSizeInBits() &&
- isPowerOf2_32(NumElements) && VecSize > 128))) {
+ if (Subtarget->forceStreamingCompatibleSVE() ||
+ (Subtarget->useSVEForFixedLengthVectors() &&
+ (VecSize % Subtarget->getMinSVEVectorSizeInBits() == 0 ||
+ (VecSize < Subtarget->getMinSVEVectorSizeInBits() &&
+ isPowerOf2_32(NumElements) && VecSize > 128)))) {
```
When you add `vscale_range(1,16)` to the attributes of the test file, you will get the code you expect.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12931
useSVEForFixedLengthVectorVT(
SrcVT, OverrideNEON && Subtarget->useSVEForFixedLengthVectors())) {
----------------
sdesmalen wrote:
> Instead of updating the OverrideNEON variable, I suspect that you actually want to do something like this:
>
> if (SrcVT.isScalableVector() ||
> useSVEForFixedLengthVectorVT(
> SrcVT, OverrideNEON && Subtarget->useSVEForFixedLengthVectors()) ||
> useSVEForFixedLengthVectorVT(
> SrcVT, Subtarget->forceStreamingCompatibleSVE()))
>
> so to not alter the behaviour for non-streaming fixed-length vectors. This avoids the regressions in sve-fixed-length-int-reduce.ll where the SVE variants require an additional predicate (whereas the NEON reduction operations are unpredicated and thus only 1 instruction).
This can actually be simplified to:
```
bool OverrideNEON = Subtarget->forceStreamingCompatibleSVE() ||
(Subtarget->useSVEForFixedLengthVectors() &&
(Op.getOpcode() == ISD::VECREDUCE_AND ||
Op.getOpcode() == ISD::VECREDUCE_OR ||
Op.getOpcode() == ISD::VECREDUCE_XOR ||
Op.getOpcode() == ISD::VECREDUCE_FADD ||
(Op.getOpcode() != ISD::VECREDUCE_ADD &&
SrcVT.getVectorElementType() == MVT::i64)));
if (SrcVT.isScalableVector() ||
useSVEForFixedLengthVectorVT(SrcVT, OverrideNEON)) {```
It would be nice if you could move this change, the changes in `addTypeForStreamingSVE` and corresponding reduction tests to a separate patch.
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:3656
AArch64::FPR128RegClass.contains(SrcReg)) {
- if (Subtarget.hasNEON()) {
+ if (Subtarget.forceStreamingCompatibleSVE()) {
+ BuildMI(MBB, I, DL, get(AArch64::ORR_ZZZ))
----------------
Can you move this change to a separate patch and test it with something very simple, such as:
```define fp128 @test_streaming_compatible_register_mov(fp128 %q0, fp128 %q1) {
; CHECK-LABEL: test_streaming_compatible_register_mov:
; CHECK: // %bb.0:
; CHECK-NEXT: mov z0.d, z1.d
; CHECK-NEXT: ret
ret fp128 %y
}```
================
Comment at: llvm/test/CodeGen/AArch64/-streaming-mode-fixed-length-fp-arith.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -force-streaming-compatible-sve < %s | FileCheck %s
----------------
The name of this file is incorrect, it should start with `sve-` instead of `-`
================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-ld2-alloca.ll:20
+ %strided.vec = shufflevector <8 x double> %load, <8 x double> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
+ store <8 x double> zeroinitializer, ptr %ptr
+ ret void
----------------
This test seems broken, because it's not using the result `%strided.vec` from the shufflevector, which as we can see from the assembly causes the load + shuffle to be removed entirely.
I've fixed `sve-fixed-ld2-alloca.ll` in c2600244fc14, can you update this test accordingly?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136858/new/
https://reviews.llvm.org/D136858
More information about the llvm-commits
mailing list