[all-commits] [llvm/llvm-project] 477c0b: [mlir][affine][gpu] Replace DivSIOp to CeilDivSIOp...
Hsiangkai Wang via All-commits
all-commits at lists.llvm.org
Mon Nov 27 00:06:08 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 477c0b67a3ab30e74f3563b3f0b9d4d53caba465
https://github.com/llvm/llvm-project/commit/477c0b67a3ab30e74f3563b3f0b9d4d53caba465
Author: Hsiangkai Wang <hsiangkai.wang at arm.com>
Date: 2023-11-27 (Mon, 27 Nov 2023)
Changed paths:
M mlir/lib/Conversion/SCFToGPU/SCFToGPU.cpp
M mlir/test/Conversion/SCFToGPU/step_positive.mlir
Log Message:
-----------
[mlir][affine][gpu] Replace DivSIOp to CeilDivSIOp when lowering to GPU launch (#73328)
When converting affine.for to a GPU launch operation, we have to calculate
the block dimension and thread dimension for the launch operation.
The formula for the dimension size is
(upper_bound - lower_bound) / step_size
When the difference is not divisible by step_size, DivSIOp rounds the
quotient toward zero. However, the block dimension and thread dimension
are right-open ranges, i.e., [0, block_dim) and [0, thread_dim), so a
truncated quotient leaves the final, partial step of the loop uncovered
and we get the wrong result. In this patch, we replace DivSIOp with
CeilDivSIOp to get the correct block and thread dimension values.
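
The rounding difference described above can be checked with a small
standalone sketch. It is not code from the patch: truncDiv, ceilDiv and
countIterations are hypothetical helper names that only mimic the rounding
of DivSIOp and CeilDivSIOp on plain 64-bit integers, assuming a positive
step.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: signed division rounding toward zero,
// the rounding behaviour the log message attributes to DivSIOp.
constexpr int64_t truncDiv(int64_t a, int64_t b) { return a / b; }

// Hypothetical helper: signed division rounding toward positive infinity,
// the rounding behaviour of CeilDivSIOp.
constexpr int64_t ceilDiv(int64_t a, int64_t b) {
  int64_t q = a / b;
  int64_t r = a % b;
  // In C++ a nonzero remainder has the sign of the dividend, so the exact
  // quotient is positive (and q must be bumped) when r and b share a sign.
  return (r != 0 && ((r < 0) == (b < 0))) ? q + 1 : q;
}

// Reference count: how many times does
//   for (i = lb; i < ub; i += step)
// actually execute? Assumes step > 0.
int64_t countIterations(int64_t lb, int64_t ub, int64_t step) {
  assert(step > 0 && "this sketch only handles positive steps");
  int64_t n = 0;
  for (int64_t i = lb; i < ub; i += step)
    ++n;
  return n;
}
```

For a loop from 0 to 10 with step 3, the body runs for i = 0, 3, 6, 9, so
countIterations(0, 10, 3) is 4. truncDiv(10, 3) gives 3, which would drop
the iteration at i = 9, while ceilDiv(10, 3) gives 4. When the difference
is divisible by the step, both helpers agree.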