[PATCH] D136858: [AArch64-SVE]: Force generating code compatible to streaming mode for sve-fixed-length tests.

Tue Nov 8 06:45:37 PST 2022

david-arm added a comment.

Hi @hassnaa-arm, could you rebase this off D137093 <https://reviews.llvm.org/D137093> please because I'd like to see whether or not this patch fixes up the gathers and scatters present in CodeGen/AArch64/sve-streaming-mode-fixed-length-addressing-modes.ll from that patch.

I've only reviewed about 1/3 of this patch so far, since it's so big! But I'm leaving the comments I have so far ...

================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:3658
+      BuildMI(MBB, I, DL, get(AArch64::ORR_ZZZ), DestReg)
+          .addReg(AArch64::Z0 + (SrcReg - AArch64::Q0), RegState::Define)
+          .addReg(AArch64::Z0 + (DestReg - AArch64::Q0), RegState::Define);
----------------
Given you're trying to mov SrcReg into DstReg I think this is incorrect for two reasons:

1. You're marking the source registers as being Defined, which isn't right since they're only being read.
2. The second source should also be SrcReg.

i.e. something like this:

  BuildMI(MBB, I, DL, get(AArch64::ORR_ZZZ))
        .addReg(AArch64::Z0 + (DestReg - AArch64::Q0), RegState::Define)
        .addReg(AArch64::Z0 + (SrcReg - AArch64::Q0))
        .addReg(AArch64::Z0 + (SrcReg - AArch64::Q0))

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-int-reduce.ll:1111
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    umaxv b0, v0.8b
+; CHECK-NEXT:    // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT:    ptrue p0.b, vl8
----------------
There is nothing technically wrong with these changes - we're getting the same result. I just wonder if we want to change the behaviour for NEON-like vectors when not in streaming mode?

@paulwalker-arm any thoughts?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll:321
+
+define void @fcvtzu_v16f32_v16i16(<16 x float>* %a, <16 x i16>* %b) #0 {
+; CHECK-LABEL: fcvtzu_v16f32_v16i16:
----------------
I think we can remove this test because the input vector > 256 bits.

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll:1070
+
+define void @fcvtzs_v16f32_v16i16(<16 x float>* %a, <16 x i16>* %b) #0 {
+; CHECK-LABEL: fcvtzs_v16f32_v16i16:
----------------
Remove this test, since input > 256 bits?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-scatter.ll:2401
+
+define void @masked_scatter_32b_scaled_sext_f64(<32 x double>* %a, <32 x i32>* %b, double* %base) #0 {
+; CHECK-LABEL: masked_scatter_32b_scaled_sext_f64:
----------------
Given we know we're just going to scalarise this operation I wonder if there is much value in producing tests for large types? Perhaps we can just ignore tests for vectors that are > 256 bits?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136858/new/

https://reviews.llvm.org/D136858