[llvm] [SelectionDAG] Improve value type selection for inline asm within selected register class (PR #135097)
Da Li 李达 via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 9 15:50:06 PDT 2025
https://github.com/dlee992 created https://github.com/llvm/llvm-project/pull/135097
### Summary
This patch improves value type selection for inline assembly operands that are constrained to a register class. Previously, `getRegistersForValue` always used the first legal value type of the selected register class, which could produce load/store instructions of the wrong type and unnecessary type extensions or truncations. With this patch, if the operand's own value type is legal and is one of the register class's legal types, that type is used instead.
### Example (NVPTX)
In the NVPTX target, a new `bf16` addition test previously generated `ld.param.u16` instead of the expected `ld.param.b16`, because the first value type in the register class's type list (`i16` here) was always selected:
```llvm
def Int16Regs : NVPTXRegClass<[i16, f16, bf16], 16, (add (sequence "RS%u", 0, 4))>;
```
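The selection rule can be sketched outside of LLVM as follows. This is only a simplified model, not the code in the patch: `SimpleVT`, `RegClassModel`, `pickFirstLegalVT`, and `pickRegVT` are hypothetical names introduced for illustration, and the `RefVTIsLegal` flag stands in for the `TLI.isTypeLegal(...)` check.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for llvm::MVT::SimpleValueType (illustration only).
enum class SimpleVT { i16, f16, bf16, i32, f32 };

// Hypothetical model of a register class: an ordered, non-empty list of
// legal value types, mirroring the [i16, f16, bf16] list of Int16Regs.
struct RegClassModel {
  std::vector<SimpleVT> LegalTypes;
};

// Old behavior: always take the first legal type of the class.
SimpleVT pickFirstLegalVT(const RegClassModel &RC) {
  return RC.LegalTypes.front();
}

// New behavior: start from the first legal type, but prefer the operand's
// own type when it is legal for the target and listed in the class.
SimpleVT pickRegVT(const RegClassModel &RC, SimpleVT RefVT, bool RefVTIsLegal) {
  SimpleVT RegVT = RC.LegalTypes.front();
  const bool InClass =
      std::find(RC.LegalTypes.begin(), RC.LegalTypes.end(), RefVT) !=
      RC.LegalTypes.end();
  if (RefVTIsLegal && InClass)
    RegVT = RefVT;
  return RegVT;
}
```

Under this model, a `bf16` operand in a class listing `[i16, f16, bf16]` keeps `bf16`, while an operand whose type is not in the list (or is not legal) falls back to the first entry, as before.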
### Impact
The change is in target-independent SelectionDAG code, so other targets whose register classes list several value types may benefit in similar scenarios.
From 42a0aff4b21b620cf808fa0bb5f566d84b8aeb8a Mon Sep 17 00:00:00 2001
From: dlee992 <lidanuaa at gmail.com>
Date: Wed, 9 Apr 2025 17:30:06 -0500
Subject: [PATCH 1/2] patch
---
.../CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 89793c30f3710..528c8bd648332 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -9738,8 +9738,9 @@ getRegistersForValue(SelectionDAG &DAG, const SDLoc &DL,
// register class, find it.
unsigned AssignedReg;
const TargetRegisterClass *RC;
+ const MVT RefValueVT = RefOpInfo.ConstraintVT;
std::tie(AssignedReg, RC) = TLI.getRegForInlineAsmConstraint(
- &TRI, RefOpInfo.ConstraintCode, RefOpInfo.ConstraintVT);
+ &TRI, RefOpInfo.ConstraintCode, RefValueVT);
// RC is unset only on failure. Return immediately.
if (!RC)
return std::nullopt;
@@ -9747,7 +9748,17 @@ getRegistersForValue(SelectionDAG &DAG, const SDLoc &DL,
// Get the actual register value type. This is important, because the user
// may have asked for (e.g.) the AX register in i32 type. We need to
// remember that AX is actually i16 to get the right extension.
- const MVT RegVT = *TRI.legalclasstypes_begin(*RC);
+ MVT RegVT = *TRI.legalclasstypes_begin(*RC);
+
+ // If the reference value type is legal and belongs to the register class,
+ // use it instead of the first legal value type. This avoids generating
+ // inaccurate load/store instructions or unnecessary type extensions and
+ // truncations.
+ if (TLI.isTypeLegal(RefValueVT) &&
+ llvm::is_contained(llvm::make_range(TRI.legalclasstypes_begin(*RC),
+ TRI.legalclasstypes_end(*RC)),
+ RefValueVT.SimpleTy))
+ RegVT = RefValueVT.SimpleTy;
if (OpInfo.ConstraintVT != MVT::Other && RegVT != MVT::Untyped) {
// If this is an FP operand in an integer register (or visa versa), or more
From 72f7612fadbc7d0c7ccd15416f83b5b28a159b3a Mon Sep 17 00:00:00 2001
From: dlee992 <lidanuaa at gmail.com>
Date: Wed, 9 Apr 2025 17:30:30 -0500
Subject: [PATCH 2/2] update tests for NVPTX
---
llvm/test/CodeGen/NVPTX/bf16-instructions.ll | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/llvm/test/CodeGen/NVPTX/bf16-instructions.ll b/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
index 9be54a746cacd..4db68f28b3c0a 100644
--- a/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
+++ b/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
@@ -81,6 +81,23 @@ define bfloat @test_fadd(bfloat %0, bfloat %1) {
ret bfloat %3
}
+define bfloat @test_fadd_inlineasm(bfloat %0, bfloat %1) nounwind {
+; SM90-LABEL: test_fadd_inlineasm(
+; SM90: {
+; SM90-NEXT: .reg .b16 %rs<4>;
+; SM90-EMPTY:
+; SM90-NEXT: // %bb.0:
+; SM90-NEXT: ld.param.b16 %rs2, [test_fadd_inlineasm_param_0];
+; SM90-NEXT: ld.param.b16 %rs3, [test_fadd_inlineasm_param_1];
+; SM90-NEXT: // begin inline asm
+; SM90-NEXT: add.rn.bf16 %rs1, %rs2, %rs3
+; SM90-NEXT: // end inline asm
+; SM90-NEXT: st.param.b16 [func_retval0], %rs1;
+; SM90-NEXT: ret;
+ %3 = tail call bfloat asm "add.rn.bf16 $0, $1, $2", "=h,h,h"(bfloat %0, bfloat %1) nounwind
+ ret bfloat %3
+}
+
define bfloat @test_fsub(bfloat %0, bfloat %1) {
; SM70-LABEL: test_fsub(
; SM70: {