[llvm] [NVPTX] Optimize v2x16 BUILD_VECTORs to PRMT (PR #116675)
Fraser Cormack via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 19 01:49:46 PST 2024
================
@@ -1148,8 +1147,7 @@ define <2 x bfloat> @fma_bf16x2_expanded_no_nans_multiple_uses_of_fma(<2 x bfloa
; CHECK-SM70-NEXT: setp.nan.f32 %p8, %f18, %f18;
; CHECK-SM70-NEXT: or.b32 %r58, %r54, 4194304;
; CHECK-SM70-NEXT: selp.b32 %r59, %r58, %r57, %p8;
-; CHECK-SM70-NEXT: { .reg .b16 tmp; mov.b32 {tmp, %rs23}, %r59; }
-; CHECK-SM70-NEXT: mov.b32 %r60, {%rs23, %rs20};
----------------
frasercrmck wrote:
I don't think so, because it's still used for `(trunc (srl s, 16))` or `(extractelt $vec, 1)`. Perhaps if we generally matched both of those to PRMTs, we could remove the code, but I suspect we'll always need the option to fall back to these patterns.
https://github.com/llvm/llvm-project/pull/116675