[llvm] [NVPTX] Generalize and extend upsizing when lowering 8/16-bit-element vector loads/stores (PR #119622)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 13 11:26:33 PST 2024
================
@@ -172,30 +172,34 @@ define float @ff(ptr %p) {
define void @combine_v16i8(ptr noundef align 16 %ptr1, ptr noundef align 16 %ptr2) {
; ENABLED-LABEL: combine_v16i8(
; ENABLED: {
-; ENABLED-NEXT: .reg .b32 %r<40>;
+; ENABLED-NEXT: .reg .b32 %r<36>;
; ENABLED-NEXT: .reg .b64 %rd<3>;
; ENABLED-EMPTY:
; ENABLED-NEXT: // %bb.0:
; ENABLED-NEXT: ld.param.u64 %rd1, [combine_v16i8_param_0];
-; ENABLED-NEXT: ld.v4.u32 {%r1, %r2, %r3, %r4}, [%rd1];
+; ENABLED-NEXT: ld.v4.b32 {%r1, %r2, %r3, %r4}, [%rd1];
; ENABLED-NEXT: ld.param.u64 %rd2, [combine_v16i8_param_1];
-; ENABLED-NEXT: bfe.u32 %r9, %r1, 0, 8;
-; ENABLED-NEXT: bfe.u32 %r10, %r1, 8, 8;
-; ENABLED-NEXT: bfe.u32 %r11, %r1, 16, 8;
-; ENABLED-NEXT: bfe.u32 %r12, %r1, 24, 8;
-; ENABLED-NEXT: bfe.u32 %r13, %r2, 0, 8;
-; ENABLED-NEXT: bfe.u32 %r14, %r2, 8, 8;
-; ENABLED-NEXT: bfe.u32 %r15, %r2, 16, 8;
-; ENABLED-NEXT: bfe.u32 %r16, %r2, 24, 8;
-; ENABLED-NEXT: bfe.u32 %r17, %r3, 0, 8;
-; ENABLED-NEXT: bfe.u32 %r18, %r3, 8, 8;
-; ENABLED-NEXT: bfe.u32 %r19, %r3, 16, 8;
-; ENABLED-NEXT: bfe.u32 %r20, %r3, 24, 8;
-; ENABLED-NEXT: bfe.u32 %r21, %r4, 0, 8;
-; ENABLED-NEXT: bfe.u32 %r22, %r4, 8, 8;
-; ENABLED-NEXT: bfe.u32 %r23, %r4, 16, 8;
-; ENABLED-NEXT: bfe.u32 %r24, %r4, 24, 8;
-; ENABLED-NEXT: add.s32 %r25, %r9, %r10;
+; ENABLED-NEXT: bfe.u32 %r5, %r1, 0, 8;
----------------
Artem-B wrote:
OK. It looks like there may be more opportunities to switch to PRMT, but that is outside the scope of this patch.
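
For context, here is a rough host-side sketch of why a byte-sized `bfe.u32` like the ones in the hunk above could in principle be expressed as a `prmt.b32`. It is not code from this patch or from the NVPTX backend. It models only the default `prmt` mode with selector bit 3 (sign replication) left clear, and the helper names (`bfe_u32`, `prmt_b32`, `byte_extract_selector`) are hypothetical.

```cpp
#include <cstdint>

// Simplified model of `bfe.u32 d, a, pos, len`.
// Assumption: 0 < len < 32 and pos + len <= 32, which covers the
// byte extracts in the diff (pos in {0, 8, 16, 24}, len == 8).
inline uint32_t bfe_u32(uint32_t a, uint32_t pos, uint32_t len) {
  return (a >> pos) & ((1u << len) - 1u);
}

// Simplified model of `prmt.b32 d, a, b, c` in default mode.
// Bytes 0..3 come from `a`, bytes 4..7 from `b`. Each 4-bit field of `c`
// picks the source byte for one destination byte. Only the low 3 bits of
// each field are honored here; sign replication (bit 3) is not modeled.
inline uint32_t prmt_b32(uint32_t a, uint32_t b, uint32_t c) {
  const uint64_t src = (static_cast<uint64_t>(b) << 32) | a;
  uint32_t d = 0;
  for (int i = 0; i < 4; ++i) {
    const uint32_t sel = (c >> (4 * i)) & 0x7u;
    const uint32_t byte = static_cast<uint32_t>((src >> (8 * sel)) & 0xFFu);
    d |= byte << (8 * i);
  }
  return d;
}

// Hypothetical helper: selector that puts byte `k` of `a` into the low byte
// of the result and fills the upper three bytes from byte 4, i.e. from `b`.
// With b == 0 this zero-extends the selected byte.
inline uint32_t byte_extract_selector(uint32_t k) {
  return 0x4440u | (k & 0x3u);
}

// True when both forms agree for every byte position of `x`.
inline bool byte_extract_matches(uint32_t x) {
  for (uint32_t k = 0; k < 4; ++k) {
    if (bfe_u32(x, 8 * k, 8) != prmt_b32(x, 0u, byte_extract_selector(k)))
      return false;
  }
  return true;
}
```

Under these assumptions, `bfe.u32 %r, %r1, 8, 8` and a `prmt.b32` with `b = 0` and selector `0x4441` produce the same value. Whether that rewrite is profitable in any given lowering is a separate question that this sketch does not address.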
https://github.com/llvm/llvm-project/pull/119622