[llvm] [NVPTX] Optimize v2x16 BUILD_VECTORs to PRMT (PR #116675)

Justin Fargnoli via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 18 13:11:44 PST 2024


================
@@ -197,8 +196,7 @@ define <2 x bfloat> @test_faddx2(<2 x bfloat> %a, <2 x bfloat> %b) #0 {
 ; SM70-NEXT:    setp.nan.f32 %p2, %f6, %f6;
 ; SM70-NEXT:    or.b32 %r21, %r17, 4194304;
 ; SM70-NEXT:    selp.b32 %r22, %r21, %r20, %p2;
-; SM70-NEXT:    { .reg .b16 tmp; mov.b32 {tmp, %rs11}, %r22; }
-; SM70-NEXT:    mov.b32 %r23, {%rs11, %rs7};
+; SM70-NEXT:    prmt.b32 %r23, %r22, %r12, 0x7632U;
----------------
justinfargnoli wrote:

What's the SASS diff for this change? 

https://github.com/llvm/llvm-project/pull/116675


More information about the llvm-commits mailing list