[llvm] [NVPTX] Optimize v2x16 BUILD_VECTORs to PRMT (PR #116675)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 18 11:20:47 PST 2024


================
@@ -159,8 +159,8 @@ define <2 x bfloat> @test_faddx2(<2 x bfloat> %a, <2 x bfloat> %b) #0 {
 ; SM70-LABEL: test_faddx2(
 ; SM70:       {
 ; SM70-NEXT:    .reg .pred %p<3>;
-; SM70-NEXT:    .reg .b16 %rs<13>;
-; SM70-NEXT:    .reg .b32 %r<24>;
+; SM70-NEXT:    .reg .b16 %rs<9>;
+; SM70-NEXT:    .reg .b32 %r<25>;
----------------
Artem-B wrote:

Since the registers are 'virtual' anyway, this is a cosmetic issue. ptxas does the actual register allocation, so the number of registers declared in the PTX does not matter for anything other than syntax checks.
If you look at the old code, it declares 13 `rs` registers but uses only 11. As long as we declare enough of them, we're fine. We may want to tighten the register accounting, but that is a low-priority cleanup.
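
To illustrate the "declare enough" point, here is a rough sketch of a checker that compares declared register counts against the indices actually referenced. The PTX snippet, the regexes, and the helper names (`register_usage`, `declarations_suffice`) are made up for illustration and are not part of LLVM or this PR. It assumes the usual PTX convention that `.reg .b16 %rs<N>;` declares `%rs0` through `%rs(N-1)`.

```python
import re

# Hypothetical toy PTX, not taken from the test in this PR.
TOY_PTX = """
.reg .b16 %rs<5>;
.reg .b32 %r<3>;
mov.b16 %rs1, 7;
add.s16 %rs2, %rs1, %rs1;
cvt.u32.u16 %r1, %rs2;
"""

# Matches declarations such as ".reg .b16 %rs<5>;"
DECL_RE = re.compile(r"\.reg\s+\.\w+\s+%(\w+?)<(\d+)>;")
# Matches uses such as "%rs1"; does not match "%rs<5>" (no digit after the prefix).
USE_RE = re.compile(r"%([A-Za-z]+)(\d+)")


def register_usage(ptx):
    """Return {prefix: (declared_count, sorted list of indices used)}."""
    declared = {m.group(1): int(m.group(2)) for m in DECL_RE.finditer(ptx)}
    used = {}
    for m in USE_RE.finditer(ptx):
        used.setdefault(m.group(1), set()).add(int(m.group(2)))
    return {p: (n, sorted(used.get(p, ()))) for p, n in declared.items()}


def declarations_suffice(ptx):
    """True if every referenced index is below its declared count."""
    return all(
        all(i < count for i in indices)
        for count, indices in register_usage(ptx).values()
    )


usage = register_usage(TOY_PTX)
print(usage)
print(declarations_suffice(TOY_PTX))
```

In the toy input, `%rs<5>` is declared but only `%rs1` and `%rs2` are referenced. That is the same kind of over-declaration as in the old test output, and it is harmless because ptxas does the real allocation.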

https://github.com/llvm/llvm-project/pull/116675

