[llvm] [IA] Generalize the support for power-of-two (de)interleave intrinsics (PR #123863)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 22 19:06:10 PST 2025
================
@@ -37,8 +41,12 @@ define {<vscale x 16 x i8>, <vscale x 16 x i8>} @vector_deinterleave_load_nxv16i
; CHECK-NEXT: vlseg2e8.v v8, (a0)
; CHECK-NEXT: ret
%vec = load <vscale x 32 x i8>, ptr %p
- %retval = call {<vscale x 16 x i8>, <vscale x 16 x i8>} @llvm.vector.deinterleave2.nxv32i8(<vscale x 32 x i8> %vec)
- ret {<vscale x 16 x i8>, <vscale x 16 x i8>} %retval
+ %deinterleaved.results = call {<vscale x 16 x i8>, <vscale x 16 x i8>} @llvm.vector.deinterleave2.nxv32i8(<vscale x 32 x i8> %vec)
+ %t0 = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %deinterleaved.results, 0
+ %t1 = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %deinterleaved.results, 1
+ %res0 = insertvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } undef, <vscale x 16 x i8> %t0, 0
+ %res1 = insertvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %res0, <vscale x 16 x i8> %t1, 1
+ ret {<vscale x 16 x i8>, <vscale x 16 x i8>} %res1
----------------
lukel97 wrote:
I fear that InstCombine or something might fold away these extractvalues/insertvalues.
For factors 4 and 8 we'll always need the extractvalues + insertvalues to reorder the segments correctly, but for factor 2 we don't.
But looking at the original AArch64 changes in #89276, I can see that AArch64 also requires the extractvalues now, so as long as we stay in sync with them I think that's fine.
I think it would be good to handle the factor-2-without-extractvalue case in a follow-up PR.
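
To illustrate why the higher factors need reordering while factor 2 does not, here is a rough lane-level sketch in Python. It assumes that a factor-4 deinterleave is expressed as a two-level tree of deinterleave2 calls; plain lists stand in for vectors, and the function names are hypothetical, not part of LLVM.

```python
def deinterleave2(vec):
    """Model of llvm.vector.deinterleave2: (even-indexed lanes, odd-indexed lanes)."""
    return vec[0::2], vec[1::2]


def deinterleave4_tree_order(vec):
    """Factor-4 deinterleave as a tree of deinterleave2 (assumed shape).

    Returns the four fields in the order the tree's leaves produce them.
    """
    even, odd = deinterleave2(vec)   # even holds fields 0 and 2, odd holds 1 and 3
    f0, f2 = deinterleave2(even)
    f1, f3 = deinterleave2(odd)
    return [f0, f2, f1, f3]          # leaf order is 0, 2, 1, 3


def deinterleave4(vec):
    """Same tree, with the leaves put back into field order 0, 1, 2, 3.

    This reordering is the role the extractvalue/insertvalue pairs play in IR.
    """
    f0, f2, f1, f3 = deinterleave4_tree_order(vec)
    return [f0, f1, f2, f3]


# Two segments of four interleaved fields a, b, c, d.
vec = ["a0", "b0", "c0", "d0", "a1", "b1", "c1", "d1"]

print(deinterleave4_tree_order(vec))  # fields come out as a, c, b, d
print(deinterleave4(vec))             # fields come out as a, b, c, d
print(deinterleave2(["a0", "b0", "a1", "b1"]))  # factor 2 is already in order
```

With a single deinterleave2 the two results are already in field order, which is why the factor-2 case could in principle be matched without the extractvalue/insertvalue pairs.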
https://github.com/llvm/llvm-project/pull/123863