[PATCH] D135324: [AArch64-SVE]: force using SVE in streaming mode to lower arithmetic and logical fixed-width vector ops.

Thu Oct 6 07:33:47 PDT 2022

david-arm added a comment.

Thanks for this @hassnaa-arm! I had some comments about how to tidy up the tests a bit. I also think there are some load/store test changes that shouldn't be part of this patch.

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-masked-stores.ll:420
 attributes #0 = { "target-features"="+sve" }
+
----------------
nit: whitespace

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-ext-loads.ll:8

-define <8 x i16> @load_zext_v8i8i16(<8 x i8>* %ap)  #0 {
-; CHECK-LABEL: load_zext_v8i8i16:
+define <8 x i16> @load_zext_v16i8i32(<8 x i8>* %ap)  #0 {
+; CHECK-LABEL: load_zext_v16i8i32:
----------------
I don't think these changes should be part of this patch, since it's not changing loads and stores?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll:12
+;
+define <8 x i8> @add_v8i8(<8 x i8> %op1, <8 x i8> %op2) #0 {
+; CHECK-LABEL: add_v8i8:
----------------
Could you also add a test for an illegal NEON type, e.g. `<4 x i8>` or `<2 x i16>`?
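
For example, a sketch along these lines (the function name `add_v4i8` is only illustrative, and this assumes `#0` is the same attribute group the other tests in this file use; the CHECK lines would be generated with `utils/update_llc_test_checks.py`):

```llvm
; Hypothetical test: <4 x i8> is not a legal NEON type, so it has to be
; legalised before the add can be lowered.
define <4 x i8> @add_v4i8(<4 x i8> %op1, <4 x i8> %op2) #0 {
  %res = add <4 x i8> %op1, %op2
  ret <4 x i8> %res
}
```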

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll:472
+
+define <8 x i8> @mul_v8i8(<8 x i8> %op1, <8 x i8> %op2) #0 {
+; CHECK-LABEL: mul_v8i8:
----------------
Again, could you add at least one illegal type - e.g. `<4 x i8>` or `<2 x i16>`?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll:1409
+
+define <8 x i8> @abs_v8i8(<8 x i8> %op1) #0 {
+; CHECK-LABEL: abs_v8i8:
----------------
Can you add an illegal NEON type such as `<2 x i16>`?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll:91
+; VBITS_GE_128-NEXT:    add z1.b, z1.b, z5.b
+; VBITS_GE_128-NEXT:    stp q0, q1, [x0, #32]
+; VBITS_GE_128-NEXT:    add z0.b, z3.b, z7.b
----------------
david-arm wrote:
> Again, this is illegal in streaming mode.
Please ignore this comment! `stp q0, q1` is legal - my mistake!

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-arith.ll:1587
+; VBITS_GE_128-NEXT:    abs z1.h, p0/m, z2.h
+; VBITS_GE_128-NEXT:    stp q0, q1, [x0]
+; VBITS_GE_128-NEXT:    ret
----------------
david-arm wrote:
> This looks like a NEON instruction - can you investigate where this is coming from?
Please ignore this - my mistake!

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll:298
+
+define void @sdiv_v64i8(<64 x i8>* %a, <64 x i8>* %b) vscale_range(16,0) #0 {
+; CHECK-LABEL: sdiv_v64i8:
----------------
This still has the `vscale_range(16,0)` attribute. Can you remove it and recreate the CHECK lines please?
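
Assuming the body is the usual load/sdiv/store pattern (this is a guess, so keep whatever the test already has), the definition without the attribute would look something like:

```llvm
; Sketch only: the same function with vscale_range(16,0) dropped.
; The body below is assumed, not copied from the patch.
define void @sdiv_v64i8(<64 x i8>* %a, <64 x i8>* %b) #0 {
  %op1 = load <64 x i8>, <64 x i8>* %a
  %op2 = load <64 x i8>, <64 x i8>* %b
  %res = sdiv <64 x i8> %op1, %op2
  store <64 x i8> %res, <64 x i8>* %a
  ret void
}
```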

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll:1062
+
+define void @udiv_v64i8(<64 x i8>* %a, <64 x i8>* %b) vscale_range(16,0) #0 {
+; CHECK-LABEL: udiv_v64i8:
----------------
Again, this still has the `vscale_range(16,0)` attribute. Can you remove it and regenerate the CHECK lines?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll:16
+
+define <8 x i8> @smulh_v8i8(<8 x i8> %op1, <8 x i8> %op2) #0 {
+; CHECK-LABEL: smulh_v8i8:
----------------
Can you add a test for an illegal type such as `<4 x i8>` too?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll:217
+define void @smulh_v64i8(<64 x i8>* %a, <64 x i8>* %b) #0 {
+; VBITS_GE_128-LABEL: smulh_v64i8:
+; VBITS_GE_128:       // %bb.0:
----------------
Wow, this code surely gets an award for being so impressively bad?!

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:471
+
+define void @srem_v128i8(<128 x i8>* %a, <128 x i8>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v128i8:
----------------
I think you can remove the tests for vectors wider than 512 bits, e.g. `<128 x i8>`. If the tests already work for `<64 x i8>` they are likely to work for anything larger too.

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:773
+
+define void @srem_v256i8(<256 x i8>* %a, <256 x i8>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v256i8:
----------------
Again, maybe remove this test since I'm not sure what extra value it gives us?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:1607
+
+define void @srem_v64i16(<64 x i16>* %a, <64 x i16>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v64i16:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:1774
+
+define void @srem_v128i16(<128 x i16>* %a, <128 x i16>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v128i16:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:2239
+
+define void @srem_v32i32(<32 x i32>* %a, <32 x i32>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v32i32:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:2334
+
+define void @srem_v64i32(<64 x i32>* %a, <64 x i32>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v64i32:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:2651
+
+define void @srem_v16i64(<16 x i64>* %a, <16 x i64>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v16i64:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:2746
+
+define void @srem_v32i64(<32 x i64>* %a, <32 x i64>* %b) #0 {
+; VBITS_GE_128-LABEL: srem_v32i64:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:3383
+
+define void @urem_v128i8(<128 x i8>* %a, <128 x i8>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v128i8:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:3685
+
+define void @urem_v256i8(<256 x i8>* %a, <256 x i8>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v256i8:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:4519
+
+define void @urem_v64i16(<64 x i16>* %a, <64 x i16>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v64i16:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:4686
+
+define void @urem_v128i16(<128 x i16>* %a, <128 x i16>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v128i16:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:5151
+
+define void @urem_v32i32(<32 x i32>* %a, <32 x i32>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v32i32:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:5246
+
+define void @urem_v64i32(<64 x i32>* %a, <64 x i32>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v64i32:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:5563
+
+define void @urem_v16i64(<16 x i64>* %a, <16 x i64>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v16i64:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll:5658
+
+define void @urem_v32i64(<32 x i64>* %a, <32 x i64>* %b) #0 {
+; VBITS_GE_128-LABEL: urem_v32i64:
----------------
Again, maybe remove this test?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-loads.ll:2
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -aarch64-sve-vector-bits-min=128 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=256 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=384 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=512 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=640 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=768 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=896 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1024 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1152 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1280 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1408 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1536 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1664 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1792 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=1920 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=2048 -force-sve-when-streaming-compatible < %s | FileCheck %s
+; RUN: llc -aarch64-sve-vector-bits-min=128 -force-sve-when-streaming-compatible < %s | FileCheck %s -check-prefixes=CHECK,VBITS_GE_128
+; RUN: llc -aarch64-sve-vector-bits-min=256 -force-sve-when-streaming-compatible < %s | FileCheck %s -check-prefixes=CHECK,VBITS_GE_256
----------------
Not part of this patch?

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-store.ll:2
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -aarch64-sve-vector-bits-min=128 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=256 -force-sve-when-streaming-compatible < %s | FileCheck %s
-; RUN: llc -aarch64-sve-vector-bits-min=512 -force-sve-when-streaming-compatible < %s | FileCheck %s
+; RUN: llc -aarch64-sve-vector-bits-min=128 -force-sve-when-streaming-compatible < %s | FileCheck %s -check-prefixes=CHECK,VBITS_GE_128_STREAMING
+; RUN: llc -aarch64-sve-vector-bits-min=256 -force-sve-when-streaming-compatible < %s | FileCheck %s -check-prefixes=CHECK,VBITS_GE_256_STREAMING
----------------
Not part of this patch?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135324/new/

https://reviews.llvm.org/D135324