[llvm] [NVPTX] Fixup v2i8 parameter and return lowering (PR #145585)
Princeton Ferro via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 26 11:46:34 PDT 2025
================
@@ -247,7 +243,14 @@ define <4 x i8> @test_v4i8(<4 x i8> %a) {
; CHECK: call.uni (retval0), test_v5i8,
; CHECK-DAG: ld.param.v4.b8 {[[RE0:%rs[0-9]+]], [[RE1:%rs[0-9]+]], [[RE2:%rs[0-9]+]], [[RE3:%rs[0-9]+]]}, [retval0];
; CHECK-DAG: ld.param.b8 [[RE4:%rs[0-9]+]], [retval0+4];
-; CHECK-DAG: st.param.v4.b8 [func_retval0], {[[RE0]], [[RE1]], [[RE2]], [[RE3]]}
+; CHECK-DAG: cvt.u32.u16 [[R3:%r[0-9]+]], [[RE3]];
+; CHECK-DAG: cvt.u32.u16 [[R2:%r[0-9]+]], [[RE2]];
+; CHECK-DAG: prmt.b32 [[P0:%r[0-9]+]], [[R2]], [[R3]], 0x3340U;
+; CHECK-DAG: cvt.u32.u16 [[R1:%r[0-9]+]], [[RE1]];
+; CHECK-DAG: cvt.u32.u16 [[R0:%r[0-9]+]], [[RE0]];
+; CHECK-DAG: prmt.b32 [[P1:%r[0-9]+]], [[R0]], [[R1]], 0x3340U;
+; CHECK-DAG: prmt.b32 [[P2:%r[0-9]+]], [[P1]], [[P0]], 0x5410U;
+; CHECK-DAG: st.param.b32 [func_retval0], [[P2]];
----------------
Prince781 wrote:
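This looks like worse PTX (four cvt.u32.u16 and three prmt.b32 feeding one st.param.b32, in place of a single st.param.v4.b8), but maybe the resulting SASS is not much different.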
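
For reference, the new sequence appears to pack the four loaded bytes into one 32-bit register before the store. Below is a rough Python emulation of that packing, assuming the default (generic) mode of prmt.b32 as I read the PTX ISA: each selector nibble picks one byte out of the 8-byte value {b, a}, and bit 3 of the nibble replicates that byte's sign bit. The helper names prmt_b32 and pack_v4i8 are hypothetical and only illustrate the check lines above; this is not code from the PR.

```python
def prmt_b32(a, b, selector):
    """Emulate PTX prmt.b32 d, a, b, c in default mode (assumed semantics)."""
    # Bytes 0-3 come from a, bytes 4-7 come from b.
    src = (a & 0xFFFFFFFF) | ((b & 0xFFFFFFFF) << 32)
    result = 0
    for i in range(4):
        nibble = (selector >> (4 * i)) & 0xF
        byte = (src >> (8 * (nibble & 0x7))) & 0xFF
        if nibble & 0x8:
            # Bit 3 set: replicate the sign bit of the selected byte.
            byte = 0xFF if byte & 0x80 else 0x00
        result |= byte << (8 * i)
    return result


def pack_v4i8(e0, e1, e2, e3):
    """Mirror the three prmt.b32 instructions in the updated CHECK lines."""
    # Inputs model the zero-extended results of cvt.u32.u16.
    p0 = prmt_b32(e2, e3, 0x3340)    # low 16 bits: e2 | e3 << 8
    p1 = prmt_b32(e0, e1, 0x3340)    # low 16 bits: e0 | e1 << 8
    return prmt_b32(p1, p0, 0x5410)  # p1 low half | p0 low half << 16


if __name__ == "__main__":
    print(hex(pack_v4i8(0x11, 0x22, 0x33, 0x44)))
```

Under those assumptions the result is just e0 | e1 << 8 | e2 << 16 | e3 << 24, i.e. the same four bytes the old st.param.v4.b8 wrote, only assembled in a register first.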
https://github.com/llvm/llvm-project/pull/145585