[Mlir-commits] [mlir] [mlir][gpu] Use `known_block_size` to set `maxntid` for NVVM target (PR #77301)

Mon Jan 8 05:32:53 PST 2024

================
@@ -85,8 +85,26 @@ GPUFuncOpLowering::matchAndRewrite(gpu::GPUFuncOp gpuFuncOp, OpAdaptor adaptor,
   // Add a dialect specific kernel attribute in addition to GPU kernel
   // attribute. The former is necessary for further translation while the
   // latter is expected by gpu.launch_func.
-  if (gpuFuncOp.isKernel())
+  if (gpuFuncOp.isKernel()) {
     attributes.emplace_back(kernelAttributeName, rewriter.getUnitAttr());
+
+    // Set the block size attribute if it is present.
+    if (kernelBlockSizeAttributeName.has_value()) {
+      std::optional<int32_t> dimX =
+          gpuFuncOp.getKnownBlockSize(gpu::Dimension::x);
+      std::optional<int32_t> dimY =
+          gpuFuncOp.getKnownBlockSize(gpu::Dimension::y);
+      std::optional<int32_t> dimZ =
+          gpuFuncOp.getKnownBlockSize(gpu::Dimension::z);
+      if (dimX.has_value() || dimY.has_value() || dimZ.has_value()) {
+        // If any of the dimensions are missing, fill them in with 1.
+        attributes.emplace_back(
+            kernelBlockSizeAttributeName.value(),
+            rewriter.getI32ArrayAttr(
----------------
grypp wrote:

Sure, I can use DenseArray and I also think I should. 

Some adjustments are needed [here](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp#L1061-L1074), and need to change some test. I can do it in a separate PR if it's okay

https://github.com/llvm/llvm-project/pull/77301