[PATCH] D147040: [AArch64][CodeGen] Use interleave store for streaming compatible functions

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 29 05:49:54 PDT 2023


david-arm added a comment.

Thanks for the new tests @CarolineConcatto! I just had a couple more suggestions that might improve the tests further:



================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-shuffle.ll:21
 
+define void @interleave_store_without_splat(ptr %a, <8 x i32> %v1, <8 x i32> %v2) #0 {
+; CHECK-LABEL: interleave_store_without_splat:
----------------
I don't think you need the second `%v2` argument here, since it's never actually used. You can rewrite the IR below to just be:

  %interleaved = shufflevector <8 x i32> %v1, <8 x i32> undef, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
  store <8 x i32> %interleaved, ptr %a, align 1


================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-shuffle.ll:35
+
+define void @interleave_store_legalization(ptr %a, <8 x i32> %b) #0 {
+; CHECK-LABEL: interleave_store_legalization:
----------------
This test has the same problem as `@hang_when_merging_stores_after_legalisation`, because it's using splats. I think you can do this:

```
define void @interleave_store_legalization(ptr %p, <8 x i32> %a, <8 x i32> %b) #0 {
  %interleaved = shufflevector <8 x i32> %a, <8 x i32> %b, <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15>
  store <16 x i32> %interleaved, ptr %p, align 1
  ret void
}
```
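
For reference, both shuffle masks suggested in this comment follow the same interleave pattern: lane `i` of the first half (or first operand) is followed by lane `i` of the second half (or second operand). Below is a minimal Python sketch of that pattern. It is not part of the patch, and the helper names `interleave_mask` and `as_ir_mask` are made up purely for illustration.

```python
def interleave_mask(half_len):
    """Return shuffle indices [0, h, 1, h+1, ...] where h = half_len.

    Hypothetical helper, not part of LLVM: it only shows how the
    interleave masks in the suggested IR are laid out.
    """
    mask = []
    for i in range(half_len):
        mask.append(i)             # lane i of the first half / first operand
        mask.append(half_len + i)  # lane i of the second half / second operand
    return mask


def as_ir_mask(indices):
    """Format a list of indices as an LLVM IR shufflevector mask body."""
    return "<" + ", ".join(f"i32 {i}" for i in indices) + ">"


if __name__ == "__main__":
    # Single-source case: interleave the low and high halves of one <8 x i32>.
    print(as_ir_mask(interleave_mask(4)))
    # Two-source case: interleave two <8 x i32> operands into a <16 x i32>.
    print(as_ir_mask(interleave_mask(8)))
```

With `half_len = 4` this gives the mask used for `@interleave_store_without_splat`, and with `half_len = 8` the one used for `@interleave_store_legalization`.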


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147040/new/

https://reviews.llvm.org/D147040


