[all-commits] [llvm/llvm-project] 477c0b: [mlir][affine][gpu] Replace DivSIOp to CeilDivSIOp...

Mon Nov 27 00:06:08 PST 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 477c0b67a3ab30e74f3563b3f0b9d4d53caba465
      https://github.com/llvm/llvm-project/commit/477c0b67a3ab30e74f3563b3f0b9d4d53caba465
  Author: Hsiangkai Wang <hsiangkai.wang at arm.com>
  Date:   2023-11-27 (Mon, 27 Nov 2023)

  Changed paths:
    M mlir/lib/Conversion/SCFToGPU/SCFToGPU.cpp
    M mlir/test/Conversion/SCFToGPU/step_positive.mlir

  Log Message:
  -----------
  [mlir][affine][gpu] Replace DivSIOp to CeilDivSIOp when lowering to GPU launch (#73328)

When converting affine.for to GPU launch operator, we have to calculate
the block dimension and thread dimension for the launch operator.

The formula of the dimension size is

(upper_bound - lower_bound) / step_size

When the difference is indivisible by step_size, we use rounding-to-zero
as the division result. However, the block dimension and thread
dimension is right-open range, i.e., [0, block_dim) and [0, thread_dim).
So, we will get the wrong result if we use DivSIOp. In this patch, we
replace it with CeilDivSIOp to get the correct block and thread
dimension values.