[llvm] [IA] Generalize the support for power-of-two (de)interleave intrinsics (PR #123863)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 22 19:06:10 PST 2025


================
@@ -37,8 +41,12 @@ define {<vscale x 16 x i8>, <vscale x 16 x i8>} @vector_deinterleave_load_nxv16i
 ; CHECK-NEXT:    vlseg2e8.v v8, (a0)
 ; CHECK-NEXT:    ret
   %vec = load <vscale x 32 x i8>, ptr %p
-  %retval = call {<vscale x 16 x i8>, <vscale x 16 x i8>} @llvm.vector.deinterleave2.nxv32i8(<vscale x 32 x i8> %vec)
-  ret {<vscale x 16 x i8>, <vscale x 16 x i8>} %retval
+  %deinterleaved.results = call {<vscale x 16 x i8>, <vscale x 16 x i8>} @llvm.vector.deinterleave2.nxv32i8(<vscale x 32 x i8> %vec)
+  %t0 = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %deinterleaved.results, 0
+  %t1 = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %deinterleaved.results, 1
+  %res0 = insertvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } undef, <vscale x 16 x i8> %t0, 0
+  %res1 = insertvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %res0, <vscale x 16 x i8> %t1, 1
+  ret {<vscale x 16 x i8>, <vscale x 16 x i8>} %res1
----------------
lukel97 wrote:

I'm concerned that InstCombine (or a similar pass) might fold away these extractvalues/insertvalues, since for factor 2 they just rebuild the same struct.

For factors 4 and 8 we'll always need the extractvalues + insertvalues to put the segments in the correct order, but for factor 2 they aren't needed.
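
To illustrate why the reordering matters only for the higher factors, here is a rough Python model, not LLVM code. It assumes a factor-4 deinterleave is expressed as a two-level tree of deinterleave2 calls, and it models deinterleave2 as splitting a vector into its even- and odd-indexed lanes. The helper names (`deinterleave2`, `deinterleave4_leaf_order`) and the list-based vectors are hypothetical and exist only for this sketch.

```python
def deinterleave2(vec):
    """Toy model of llvm.vector.deinterleave2: (even lanes, odd lanes)."""
    return vec[0::2], vec[1::2]


def deinterleave4_leaf_order(vec):
    """Factor-4 deinterleave built from a tree of deinterleave2 calls.

    Returns the four leaves in the order the tree produces them,
    without any reordering.
    """
    even, odd = deinterleave2(vec)
    a, b = deinterleave2(even)  # leaves taken from the even half
    c, d = deinterleave2(odd)   # leaves taken from the odd half
    return [a, b, c, d]


# Tag every lane with the field (segment) index it belongs to.
vec4 = [i % 4 for i in range(16)]
leaves = deinterleave4_leaf_order(vec4)
leaf_fields = [leaf[0] for leaf in leaves]

# Factor 2 for comparison: a single deinterleave2 call.
vec2 = [i % 2 for i in range(8)]
f2_fields = [half[0] for half in deinterleave2(vec2)]

print(leaf_fields)  # tree order of the fields for factor 4
print(f2_fields)    # order of the fields for factor 2
```

In this model the factor-4 leaves come out in tree order rather than field order, so something has to permute them back, whereas the factor-2 result is already in field order.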

But looking at the original AArch64 changes in #89276, I can see that AArch64 also requires the extractvalues now, so as long as we stay in sync with AArch64, I think that's fine.

I think it would be good to handle the factor-2-without-extractvalue case in a follow-up PR.

https://github.com/llvm/llvm-project/pull/123863

