[Mlir-commits] [mlir] [mlir][NVVM] Tighten result-type predicate on special-register ops (PR #195030)
Bastian Hagedorn
llvmlistbot at llvm.org
Thu Apr 30 00:54:21 PDT 2026
https://github.com/bastianhagedorn created https://github.com/llvm/llvm-project/pull/195030
The NVVM "read special register" ops (`read.ptx.sreg.tid.x`, `read.ptx.sreg.clock`, `read.ptx.sreg.clock64`, ...) currently inherit the generic `LLVM_Type` result-type predicate from `LLVM_IntrOpBase`. The underlying `llvm.nvvm.read.ptx.sreg.*` intrinsics each have a single fixed return type, so `LLVM_Type` is broader than the intrinsic actually supports: the dialect verifier accepts e.g. `f32` for `tid.x` even though that would crash or miscompile during LLVM lowering.
## Changes
- `NVVMOps.td`: parameterize the special-register base classes (`NVVM_PureSpecialRegisterOp`, `NVVM_SpecialRegisterOp`, `NVVM_PureSpecialRangeableRegisterOp`) by `Type resultType` with a default of `I32`. Override to `I64` for the two i64-returning ops: `read.ptx.sreg.clock64` and `read.ptx.sreg.globaltimer`.
- `mlir/test/Dialect/LLVMIR/invalid.mlir`: new verifier negative tests covering both i32 and i64 special registers with mismatched result types.
- `mlir/test/python/dialects/nvvm.py`: new positive test constructing special-register ops with no arguments (`nvvm.ThreadIdXOp()` etc.) and checking the printed types.
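
As a rough illustration of what the tightened predicate amounts to, here is a toy Python model of the check. It is only a sketch: `I64_REGISTERS`, `expected_result_type`, and `verify_result_type` are hypothetical names, not part of MLIR. The real check is generated by `mlir-tblgen` from the `I32` / `I64` constraints in `NVVMOps.td`. The diagnostic wording is copied from the new `invalid.mlir` cases.

```python
# Toy model only: these names are illustrative and not part of MLIR.
# The two special registers this PR overrides to I64; all others default to I32.
I64_REGISTERS = {"read.ptx.sreg.clock64", "read.ptx.sreg.globaltimer"}

# Human-readable type descriptions, as they appear in the expected diagnostics.
_DESCRIPTIONS = {
    "i32": "32-bit signless integer",
    "i64": "64-bit signless integer",
}


def expected_result_type(mnemonic):
    """Return the single result type the modeled op accepts."""
    return "i64" if mnemonic in I64_REGISTERS else "i32"


def verify_result_type(mnemonic, result_type):
    """Return None if the type is accepted, otherwise a diagnostic string."""
    expected = expected_result_type(mnemonic)
    if result_type == expected:
        return None
    return (
        f"'nvvm.{mnemonic}' op result #0 must be "
        f"{_DESCRIPTIONS[expected]}, but got '{result_type}'"
    )
```

Under the previous `LLVM_Type` predicate, the equivalent of `verify_result_type` would have returned `None` for any LLVM-compatible type, which is how `f32` on `tid.x` got past the verifier.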
## Effect
- The dialect verifier rejects type mismatches up-front instead of crashing at LLVM lowering time.
- Because each result type is now a buildable concrete type, MLIR auto-attaches `InferTypeOpInterface` (via `Operator::populateTypeInferenceInfo`). In turn, `mlir-tblgen -gen-python-op-bindings` emits the inferred-result form for these ops:
```python
def read_ptx_sreg_tid_x(*, range=None, results=None, loc=None, ip=None)
```
instead of the previous form that required a positional `res` argument. Python callers can now write `nvvm.ThreadIdXOp()` with no arguments and the i32 type is filled in via the interface.
## Compatibility
This is a strict tightening on the IR side: every caller already had to use the concrete LLVM intrinsic result type to pass LLVM lowering, so no valid existing IR is rejected. Existing dialect and translation tests pass unchanged.
The MLIR textual form still requires the `: i32` / `: i64` suffix; the assembly format is unchanged. Making that optional in textual MLIR would require custom parse/print methods (the declarative ``(`:` type($res)^)?`` form is rejected by `mlir-tblgen` for non-variadic results) and is left as a possible follow-up.
For Python callers that previously constructed these ops by passing a positional `res` (e.g. `nvvm.ThreadIdXOp(i32_type)`), this is a source-breaking change. Those calls need to drop the argument or pass `results=[i32_type]` as a keyword.
## Testing
- `ninja check-mlir` passes locally (no new unexpected failures).
- The new `invalid.mlir` cases each trigger the expected verifier diagnostic.
- `mlir/test/python/dialects/nvvm.py` runs cleanly under FileCheck against the new test function.
From da4e4a433ed2b29596d0a8491e079407aa077534 Mon Sep 17 00:00:00 2001
From: Bastian Hagedorn <bhagedorn at nvidia.com>
Date: Wed, 29 Apr 2026 12:55:18 +0000
Subject: [PATCH 1/2] [mlir][NVVM] Tighten result-type predicate on
special-register ops
The NVVM "read special register" ops (read.ptx.sreg.tid.x, ...clock,
...clock64, ...) currently inherit the generic LLVM_Type result-type
predicate from LLVM_IntrOpBase. The underlying llvm.nvvm.read.ptx.sreg.*
intrinsics each have a single fixed return type, so LLVM_Type is broader
than the intrinsic actually supports.
This patch parameterizes the special-register base classes
(NVVM_PureSpecialRegisterOp, NVVM_SpecialRegisterOp,
NVVM_PureSpecialRangeableRegisterOp) by `Type resultType` (default
`I32`), and overrides it to `I64` for the two i64-returning ops:
read.ptx.sreg.clock64 and read.ptx.sreg.globaltimer.
After this change:
- The dialect verifier rejects type mismatches up-front instead of
crashing at LLVM lowering time.
- The MLIR Python op-binding generator emits the inferred-result form
for these ops, e.g.
def read_ptx_sreg_tid_x(*, range=None, results=None, loc=None, ip=None)
instead of requiring a positional `res` argument.
This is a strict tightening: no valid existing IR is rejected, since
every caller already had to use the concrete LLVM intrinsic result type
to pass LLVM lowering. Existing dialect and translation tests continue
to pass.
Assisted-by: Claude Opus 4.7 (Anthropic)
---
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td | 18 ++++++++----
mlir/test/Dialect/LLVMIR/invalid.mlir | 32 +++++++++++++++++++++
2 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index 95fe1e0535843..a62cb80fb0fb7 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -306,22 +306,28 @@ class NVVM_SingleResultIntrinsicOp<string mnemonic, list<Trait> traits = [], str
// NVVM special register op definitions
//===----------------------------------------------------------------------===//
-class NVVM_PureSpecialRegisterOp<string mnemonic, list<Trait> traits = []> :
+class NVVM_PureSpecialRegisterOp<string mnemonic, list<Trait> traits = [],
+ Type resultType = I32> :
NVVM_IntrOp<mnemonic, !listconcat(traits, [Pure]), 1> {
let arguments = (ins);
+ let results = (outs resultType:$res);
let assemblyFormat = "attr-dict `:` type($res)";
}
-class NVVM_SpecialRegisterOp<string mnemonic, list<Trait> traits = []> :
+class NVVM_SpecialRegisterOp<string mnemonic, list<Trait> traits = [],
+ Type resultType = I32> :
NVVM_IntrOp<mnemonic, traits, 1> {
let arguments = (ins);
+ let results = (outs resultType:$res);
let assemblyFormat = "attr-dict `:` type($res)";
}
-class NVVM_PureSpecialRangeableRegisterOp<string mnemonic, list<Trait> traits = []> :
+class NVVM_PureSpecialRangeableRegisterOp<string mnemonic, list<Trait> traits = [],
+ Type resultType = I32> :
NVVM_PureSpecialRegisterOp<mnemonic,
!listconcat(traits,
- [DeclareOpInterfaceMethods<InferIntRangeInterface, ["inferResultRanges"]>])> {
+ [DeclareOpInterfaceMethods<InferIntRangeInterface, ["inferResultRanges"]>]),
+ resultType> {
let arguments = (ins OptionalAttr<LLVM_ConstantRangeAttr>:$range);
let assemblyFormat = "(`range` $range^)? attr-dict `:` type($res)";
let llvmBuilder = baseLlvmBuilder # setRangeRetAttrCode # baseLlvmBuilderCoda;
@@ -421,8 +427,8 @@ def NVVM_AggrSmemSize : NVVM_PureSpecialRegisterOp<"read.ptx.sreg.aggr.smem.s
//===----------------------------------------------------------------------===//
// Clock registers
def NVVM_ClockOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.clock">;
-def NVVM_Clock64Op : NVVM_SpecialRegisterOp<"read.ptx.sreg.clock64">;
-def NVVM_GlobalTimerOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.globaltimer">;
+def NVVM_Clock64Op : NVVM_SpecialRegisterOp<"read.ptx.sreg.clock64", [], I64>;
+def NVVM_GlobalTimerOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.globaltimer", [], I64>;
def NVVM_GlobalTimerLoOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.globaltimer.lo">;
//===----------------------------------------------------------------------===//
diff --git a/mlir/test/Dialect/LLVMIR/invalid.mlir b/mlir/test/Dialect/LLVMIR/invalid.mlir
index e849b59b846f7..10bc62803f28a 100644
--- a/mlir/test/Dialect/LLVMIR/invalid.mlir
+++ b/mlir/test/Dialect/LLVMIR/invalid.mlir
@@ -2115,3 +2115,35 @@ module attributes { dlti.dl_spec = #dlti.dl_spec<
%0 = llvm.ptrtoaddr %arg0 : !llvm.ptr to i64
}
}
+
+// -----
+
+func.func @nvvm_read_sreg_tid_x_wrong_type() {
+ // expected-error @+1 {{'nvvm.read.ptx.sreg.tid.x' op result #0 must be 32-bit signless integer, but got 'i64'}}
+ %0 = nvvm.read.ptx.sreg.tid.x : i64
+ return
+}
+
+// -----
+
+func.func @nvvm_read_sreg_clock_wrong_type() {
+ // expected-error @+1 {{'nvvm.read.ptx.sreg.clock' op result #0 must be 32-bit signless integer, but got 'i64'}}
+ %0 = nvvm.read.ptx.sreg.clock : i64
+ return
+}
+
+// -----
+
+func.func @nvvm_read_sreg_clock64_wrong_type() {
+ // expected-error @+1 {{'nvvm.read.ptx.sreg.clock64' op result #0 must be 64-bit signless integer, but got 'i32'}}
+ %0 = nvvm.read.ptx.sreg.clock64 : i32
+ return
+}
+
+// -----
+
+func.func @nvvm_read_sreg_globaltimer_wrong_type() {
+ // expected-error @+1 {{'nvvm.read.ptx.sreg.globaltimer' op result #0 must be 64-bit signless integer, but got 'i32'}}
+ %0 = nvvm.read.ptx.sreg.globaltimer : i32
+ return
+}
From 12d847b4ef0144715a17c41beab9c1d14447e314 Mon Sep 17 00:00:00 2001
From: Bastian Hagedorn <bhagedorn at nvidia.com>
Date: Thu, 30 Apr 2026 07:27:23 +0000
Subject: [PATCH 2/2] [mlir][NVVM][python] Test inferred-result form for
special-register ops
Exercise the user-facing benefit of the previous patch from Python:
construct a few special-register ops with no arguments at all
(nvvm.ThreadIdXOp() etc.) and verify the printed IR has the correct
inferred result types (i32 / i64).
This complements the dialect-verifier negative tests in invalid.mlir by
locking down the positive Python API behavior: if the .td were ever
loosened back to LLVM_Type, mlir-tblgen would re-introduce a positional
`res` argument and these constructor calls would fail.
Assisted-by: Claude Opus 4.7 (Anthropic)
---
mlir/test/python/dialects/nvvm.py | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/mlir/test/python/dialects/nvvm.py b/mlir/test/python/dialects/nvvm.py
index 24abf617548b8..f5e057812642e 100644
--- a/mlir/test/python/dialects/nvvm.py
+++ b/mlir/test/python/dialects/nvvm.py
@@ -377,3 +377,16 @@ def reductions(mask, vi32, vf32):
# CHECK: %[[REDUX_35:.*]] = nvvm.redux.sync fmax %[[ARG2]], %[[ARG1]] : f32 -> f32
# CHECK: return
# CHECK: }
+
+
+# CHECK-LABEL: TEST: testSpecialRegisterInferredResults
+@constructAndPrintInModule
+def testSpecialRegisterInferredResults():
+ # CHECK: %{{.*}} = nvvm.read.ptx.sreg.tid.x : i32
+ nvvm.ThreadIdXOp()
+ # CHECK: %{{.*}} = nvvm.read.ptx.sreg.clock : i32
+ nvvm.ClockOp()
+ # CHECK: %{{.*}} = nvvm.read.ptx.sreg.clock64 : i64
+ nvvm.Clock64Op()
+ # CHECK: %{{.*}} = nvvm.read.ptx.sreg.globaltimer : i64
+ nvvm.GlobalTimerOp()