[Mlir-commits] [mlir] [MLIR][ROCDL] Add conversion for gpu.subgroup_id to ROCDL (PR #136405)

Tue Apr 22 14:49:25 PDT 2025

================
@@ -80,6 +80,24 @@ static constexpr StringLiteral amdgcnDataLayout =
     "64-S32-A5-G1-ni:7:8:9";
 
 namespace {
+
+// Truncate or extend the result depending on the index bitwidth specified
+// by the LLVMTypeConverter options.
+static Value truncOrExtToLLVMType(ConversionPatternRewriter &rewriter,
+                                  Location loc, Value value,
+                                  const LLVMTypeConverter *converter) {
+  auto intWidth = cast<IntegerType>(value.getType()).getWidth();
+  auto indexBitwidth = converter->getIndexTypeBitwidth();
+  if (indexBitwidth > intWidth) {
+    return rewriter.create<LLVM::SExtOp>(
----------------
krzysz00 wrote:

I don't know. I suspect that we want `zext nneg` when we know it's equivalent, e.g. on these IDs that are guaranteed to be in `[0, i32_signed_max)` or some stricter range. However, there's an assumption in much of `affine` and co. that `index` is signed when there's an ambiguity.
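
As a rough illustration of why the two extensions are interchangeable on such IDs (a standalone sketch, not code from the PR; `signExtend` and `zeroExtend` are hypothetical helpers that model `sext`/`zext` from i32 to i64 on raw bit patterns):

```cpp
#include <cstdint>

// Models `sext i32 -> i64`: copy the sign bit (bit 31) into the upper 32 bits.
constexpr uint64_t signExtend(uint32_t bits) {
  return (bits & 0x80000000u) ? (0xFFFFFFFF00000000ull | bits)
                              : static_cast<uint64_t>(bits);
}

// Models `zext i32 -> i64`: the upper 32 bits are always zero.
constexpr uint64_t zeroExtend(uint32_t bits) {
  return static_cast<uint64_t>(bits);
}

// For any value whose sign bit is clear, both extensions agree.
static_assert(signExtend(31u) == zeroExtend(31u), "small ID");
static_assert(signExtend(0x7FFFFFFFu) == zeroExtend(0x7FFFFFFFu),
              "largest non-negative i32");
// Once the sign bit is set, they diverge, so the choice matters.
static_assert(signExtend(0x80000000u) != zeroExtend(0x80000000u),
              "negative i32");
```

The two results only differ when bit 31 is set, which cannot happen for an ID known to be non-negative.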

Int range optimization often creates sext/trunc pairs for these things in practice.

I suspect that for now we want `sext` for MLIR-semantic reasons and then to filter it down later in the backend, especially if we can stick a `range(i32 0, [upper bound we actually know])` on the intrinsic.

... Heck, even in the absence of the actual subgroup size, we do know said upper bound: 1024 / [wave size], aka 1024 / 32 = 32 ... which we can just hint to LLVM.
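
The arithmetic behind that bound can be sketched as follows (a standalone sketch, not code from the PR; the constant and function names are hypothetical, and the values 1024 and 32 are the ones quoted above):

```cpp
#include <cstdint>

// Assumed values from the comment above: at most 1024 work-items per
// workgroup, and a smallest wave (subgroup) size of 32.
constexpr uint32_t kMaxWorkgroupSize = 1024;
constexpr uint32_t kMinWaveSize = 32;

// Hypothetical helper: number of subgroups needed to cover a workgroup,
// rounding up. Subgroup IDs then lie in [0, result).
constexpr uint32_t maxSubgroupsPerWorkgroup(uint32_t workgroupSize,
                                            uint32_t waveSize) {
  return (workgroupSize + waveSize - 1) / waveSize;
}

// The conservative exclusive upper bound when the wave size is unknown.
constexpr uint32_t kSubgroupIdUpperBound =
    maxSubgroupsPerWorkgroup(kMaxWorkgroupSize, kMinWaveSize);

static_assert(kSubgroupIdUpperBound == 32, "1024 / 32 = 32");
// A wave size of 64 only tightens the bound, so 32 stays safe.
static_assert(maxSubgroupsPerWorkgroup(kMaxWorkgroupSize, 64) == 16,
              "1024 / 64 = 16");
```

Under these assumptions every subgroup ID fits in 5 bits, so its sign bit is never set.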

(Though see also my note about how this intrinsic basically doesn't exist on any GPU of interest)

https://github.com/llvm/llvm-project/pull/136405