[Mlir-commits] [mlir] [MLIR][NVVM][Tests]: Update FileCheck primitives (PR #173252)

Mon Dec 22 04:36:24 PST 2025

https://github.com/durga4github created https://github.com/llvm/llvm-project/pull/173252

This patch updates a few FileCheck directives in the TMA test
to use CHECK-PTX-DAG instead of CHECK-PTX, to accommodate
a slightly different ordering of basic blocks.
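
To illustrate why the -DAG variant tolerates reordering, here is a minimal Python sketch. It is a simplified, line-based model and not FileCheck's actual algorithm; the helper names and the two PTX orderings are hypothetical and chosen only for illustration.

```python
# Simplified model of ordered (CHECK) vs. unordered (CHECK-DAG) matching.
# Assumption: one instruction per line and substring matching only;
# real FileCheck also supports regexes and matches within a line.

ARRIVE = "mbarrier.arrive.expect_tx.shared.b64"
COPY = ("cp.async.bulk.tensor.2d.shared::cluster.global.tile"
        ".mbarrier::complete_tx::bytes")


def match_in_order(lines, patterns):
    """Each pattern must match on a line after the previous match."""
    pos = 0
    for pat in patterns:
        for i in range(pos, len(lines)):
            if pat in lines[i]:
                pos = i + 1
                break
        else:
            return False
    return True


def match_any_order(lines, patterns):
    """Each pattern must claim its own line; relative order is free."""
    used = set()
    for pat in patterns:
        for i, line in enumerate(lines):
            if i not in used and pat in line:
                used.add(i)
                break
        else:
            return False
    return True


# The four directives from the test, in the order they are written.
patterns = [ARRIVE, COPY, COPY, ARRIVE]

# Hypothetical instruction orders produced by two different backends.
ptx_a = [ARRIVE, COPY, COPY, ARRIVE]
ptx_b = [ARRIVE, COPY, ARRIVE, COPY]

print(match_in_order(ptx_a, patterns), match_in_order(ptx_b, patterns))
print(match_any_order(ptx_a, patterns), match_any_order(ptx_b, patterns))
```

In this model the ordered check accepts only the first layout, while the unordered check accepts both, which is the property the patch relies on.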

The dump-ptx integration test fails when the PTX is generated
through nvcc (intermediates) from the public toolkit. This patch fixes
it by accepting the banner strings emitted by both backends.
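
As a rough check of the alternation used in the updated CHECK-LABEL lines, the same pattern can be tried in Python. This is only a sketch: FileCheck uses its own regex engine, the function name is hypothetical, and the sample comment lines are assumed to be what each backend writes at the top of its PTX.

```python
import re

# Same alternation as the {{...}} body in the updated CHECK-LABEL lines.
BANNER = re.compile(
    r"Generated by (LLVM NVPTX Back-End|NVIDIA NVVM Compiler)"
)


def has_known_banner(ptx_text):
    """Return True if the text contains either backend's banner."""
    return BANNER.search(ptx_text) is not None


# Assumed sample header lines, one per backend.
print(has_known_banner("// Generated by LLVM NVPTX Back-End"))
print(has_known_banner("// Generated by NVIDIA NVVM Compiler"))
print(has_known_banner("// Generated by some other tool"))
```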

From 242abdca9b1f6a871347e185ffacc41c134ea44c Mon Sep 17 00:00:00 2001
From: Durgadoss R <durgadossr at nvidia.com>
Date: Mon, 22 Dec 2025 12:20:58 +0000
Subject: [PATCH] [MLIR][NVVM][Tests]: Update FileCheck primitives

This patch updates a few FileCheck primitives for the
TMA test to use CHECK-PTX-DAG instead of CHECK-PTX
to accommodate a slightly different ordering of BB's.

The dump-ptx integration test fails when the PTX is
generated through nvcc (intermediates) from public
toolkit. This patch fixes it by allowing regex
strings from both the backends.

Signed-off-by: Durgadoss R <durgadossr at nvidia.com>
---
 mlir/test/Integration/GPU/CUDA/dump-ptx.mlir              | 4 ++--
 .../tma_load_64x8_8x128_noswizzle-transform.mlir          | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir b/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
index efffcaaf23b2e..5b40946ce3537 100644
--- a/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
+++ b/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
@@ -2,7 +2,7 @@
 // RUN:  | mlir-opt -gpu-lower-to-nvvm-pipeline="allow-pattern-rollback=0" -debug-only=serialize-to-isa \
 // RUN:  2>&1 | FileCheck %s
 
-// CHECK-LABEL: Generated by LLVM NVPTX Back-End
+// CHECK-LABEL: {{Generated by (LLVM NVPTX Back-End|NVIDIA NVVM Compiler)}}
 // CHECK: .visible .func kernel_a()
 // CHECK: ret;
 gpu.module @bar {
@@ -12,7 +12,7 @@ gpu.module @bar {
   }
 }
 
-// CHECK-LABEL: Generated by LLVM NVPTX Back-End
+// CHECK-LABEL: {{Generated by (LLVM NVPTX Back-End|NVIDIA NVVM Compiler)}}
 // CHECK: .visible .func  ({{.+}}) fma(
 // CHECK: fma.rn.f32
 
diff --git a/mlir/test/Integration/GPU/CUDA/sm90/transform-dialect/tma_load_64x8_8x128_noswizzle-transform.mlir b/mlir/test/Integration/GPU/CUDA/sm90/transform-dialect/tma_load_64x8_8x128_noswizzle-transform.mlir
index a1e2729146c64..6ba9c16390192 100644
--- a/mlir/test/Integration/GPU/CUDA/sm90/transform-dialect/tma_load_64x8_8x128_noswizzle-transform.mlir
+++ b/mlir/test/Integration/GPU/CUDA/sm90/transform-dialect/tma_load_64x8_8x128_noswizzle-transform.mlir
@@ -19,10 +19,10 @@
 
 // Basic PTX check to make sure we are generating the right instructions.
 // CHECK-PTX: mbarrier.init.shared.b64
-// CHECK-PTX: mbarrier.arrive.expect_tx.shared.b64
-// CHECK-PTX: cp.async.bulk.tensor.2d.shared::cluster.global.tile.mbarrier::complete_tx::bytes
-// CHECK-PTX: cp.async.bulk.tensor.2d.shared::cluster.global.tile.mbarrier::complete_tx::bytes
-// CHECK-PTX: mbarrier.arrive.expect_tx.shared.b64
+// CHECK-PTX-DAG: mbarrier.arrive.expect_tx.shared.b64
+// CHECK-PTX-DAG: cp.async.bulk.tensor.2d.shared::cluster.global.tile.mbarrier::complete_tx::bytes
+// CHECK-PTX-DAG: cp.async.bulk.tensor.2d.shared::cluster.global.tile.mbarrier::complete_tx::bytes
+// CHECK-PTX-DAG: mbarrier.arrive.expect_tx.shared.b64
 // CHECK-PTX: mbarrier.try_wait.parity.shared.b64
 
 // RUN: mlir-opt %s \