[PATCH] D150336: [LV][AArch64] Disable maximising bandwidth for streaming compatible sve

Thu May 18 03:46:19 PDT 2023

david-arm added a comment.

Thanks for making these changes, @dtemirbulatov. The tests look a lot better now! I just have a couple of minor suggestions for improving the test a bit further and reducing the number of CHECK lines, but I think it's almost ready to go!

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/streaming-compatible-sve-no-maximize-bandwidth.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=loop-vectorize -force-streaming-compatible-sve -mattr=+sve -force-target-instruction-cost=1 -scalable-vectorization=off -S 2>&1 | FileCheck %s --check-prefix=SC_SVE
+; RUN: opt < %s -passes=loop-vectorize -mattr=+sve -force-target-instruction-cost=1 -scalable-vectorization=off -S 2>&1 | FileCheck %s --check-prefix=NO_SC_SVE
----------------
nit: This is just a suggestion, but if you add `-force-vector-interleave=1` to each of the RUN lines it should significantly reduce the number of CHECK lines.
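
Something along these lines, as a sketch: everything is copied from the existing RUN lines except the added `-force-vector-interleave=1`, and where the flag sits on the line is just my choice. The CHECK lines would then need regenerating with utils/update_test_checks.py.

```
; RUN: opt < %s -passes=loop-vectorize -force-vector-interleave=1 -force-streaming-compatible-sve -mattr=+sve -force-target-instruction-cost=1 -scalable-vectorization=off -S 2>&1 | FileCheck %s --check-prefix=SC_SVE
; RUN: opt < %s -passes=loop-vectorize -force-vector-interleave=1 -mattr=+sve -force-target-instruction-cost=1 -scalable-vectorization=off -S 2>&1 | FileCheck %s --check-prefix=NO_SC_SVE
```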

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/streaming-compatible-sve-no-maximize-bandwidth.ll:225
+;
+entry:
+  %cmp17 = icmp sgt i32 %n, 0
----------------
I think you can fold these two blocks into one and remove the `%n > 0` check (`%cmp17`), i.e.:

```
entry:
  %0 = sext i32 %lag to i64
  %wide.trip.count = zext i32 %n to i64
  br label %for.body

for.body:
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
...
  br i1 %exitcond.not, label %for.end, label %for.body

for.end:
  %ret.0.lcssa = phi i32 [ %add9, %for.body ]
  ret i32 %ret.0.lcssa
```

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150336/new/

https://reviews.llvm.org/D150336