[Mlir-commits] [mlir] [MLIR][NVGPU] Fix the cga_cluster.mlir test (PR #112191)

Mon Oct 14 06:02:44 PDT 2024

https://github.com/durga4github updated https://github.com/llvm/llvm-project/pull/112191

From 56fcfdf14bab4c7546ab8e7a5ff4f9cb44a9cb6c Mon Sep 17 00:00:00 2001
From: Durgadoss R <durgadossr at nvidia.com>
Date: Thu, 10 Oct 2024 08:58:34 +0000
Subject: [PATCH] [MLIR][NVGPU] Fix the cga_cluster.mlir test

This patch fixes the sm90 cluster test in two ways:
* It fixes a typo in LowerGpuOpsToNVVMOps, where one of the
  ClusterDimOp conversion patterns should actually be for
  the ClusterDimBlocksOp.
  This resolves the compilation error for this test.
* It changes the grid size from (2,2,1) to (4,4,1).
  With the larger grid, the scf.if check against the
  threshold of 3 below passes, so the GPU actually
  generates the required prints.

Signed-off-by: Durgadoss R <durgadossr at nvidia.com>
---
 mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp | 5 +++--
 mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir   | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)
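
The grid-size reasoning in the commit message can be sketched in a few
lines of standalone C++. This is only an illustration, not code from the
patch or from MLIR: the helper names (indexCanReach, gridCanReach,
checkExampleGrids) are hypothetical, and it assumes the test's scf.if
guard needs both the x and y index to reach the threshold of 3. The exact
predicate in cga_cluster.mlir is not shown in this diff.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper, not part of MLIR: indices along a dimension of
// extent N range over 0..N-1, so an index can reach `threshold` only
// when N - 1 >= threshold.
constexpr bool indexCanReach(int extent, int threshold) {
  return extent - 1 >= threshold;
}

// Assumption: the guarded print requires both the x and the y index to
// reach the threshold; the z dimension is ignored here.
constexpr bool gridCanReach(const std::array<int, 3> &grid, int threshold) {
  return indexCanReach(grid[0], threshold) &&
         indexCanReach(grid[1], threshold);
}

// Compares the old and new grid sizes against the threshold of 3.
inline void checkExampleGrids() {
  assert(!gridCanReach({2, 2, 1}, 3)); // old grid: indices stop at 1
  assert(gridCanReach({4, 4, 1}, 3));  // new grid: indices reach 3
}
```

Under that assumption, a (2,2,1) grid can never satisfy the guard, which
would explain why the test produced no prints before this change.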

diff --git a/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp b/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
index e83574b7342725..04e85c2b337dec 100644
--- a/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
+++ b/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
@@ -373,8 +373,9 @@ void mlir::populateGpuToNVVMConversionPatterns(
       NVVM::BlockInClusterIdYOp, NVVM::BlockInClusterIdZOp>>(
       converter, IndexKind::Other, IntrType::Id);
   patterns.add<gpu::index_lowering::OpLowering<
-      gpu::ClusterDimOp, NVVM::ClusterDimXOp, NVVM::ClusterDimYOp,
-      NVVM::ClusterDimZOp>>(converter, IndexKind::Other, IntrType::Dim);
+      gpu::ClusterDimBlocksOp, NVVM::ClusterDimBlocksXOp,
+      NVVM::ClusterDimBlocksYOp, NVVM::ClusterDimBlocksZOp>>(
+      converter, IndexKind::Other, IntrType::Dim);
   patterns.add<gpu::index_lowering::OpLowering<
       gpu::BlockIdOp, NVVM::BlockIdXOp, NVVM::BlockIdYOp, NVVM::BlockIdZOp>>(
       converter, IndexKind::Grid, IntrType::Id);
diff --git a/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir b/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir
index 5c11d80178f727..c70c940564a264 100644
--- a/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir
+++ b/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir
@@ -18,7 +18,7 @@ module attributes {gpu.container_module} {
     return
   }
   gpu.module @gpumodule {
-    gpu.func @kernel_cluster() kernel attributes {gpu.known_block_size = array<i32: 1, 1, 1>, gpu.known_grid_size = array<i32: 2, 2, 1>} {
+    gpu.func @kernel_cluster() kernel attributes {gpu.known_block_size = array<i32: 1, 1, 1>, gpu.known_grid_size = array<i32: 4, 4, 1>} {
       %cidX = gpu.cluster_id  x
       %cidY = gpu.cluster_id  y
       %cidZ = gpu.cluster_id  z