[Mlir-commits] [llvm] [mlir] [openmp] [Flang][OpenMP] Add support for schedule clause for GPU (PR #81618)
Johannes Doerfert
llvmlistbot at llvm.org
Tue Feb 20 11:41:53 PST 2024
================
@@ -685,17 +685,22 @@ template <typename Ty> class StaticLoopChunker {
Ty KernelIteration = NumBlocks * BlockChunk;
----------------
jdoerfert wrote:
Do we need `* ThreadChunk` here too?
Let's say we have 5 blocks, and each block does a chunk of 3.
Each block has 11 threads, and each thread has a chunk size of 2.
What I'd expect to work on in one iteration of the do loop below is:
```
Iteration   : 0     1     2     3     ...  20     21
Block/Thread: B0T0, B0T0, B0T1, B0T1, ..., B0T10, B0T10
Iteration   : 66    67    68    69    ...  86     87
Block/Thread: B1T0, B1T0, B1T1, B1T1, ..., B1T10, B1T10
...
Iteration   : 264   265   266   267   ...  284    285
Block/Thread: B4T0, B4T0, B4T1, B4T1, ..., B4T10, B4T10
```
So, in one iteration of the do loop, that is 2 * 11 = 22 iterations per block and 5 * 22 = 110 iterations for the whole kernel.
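To make the arithmetic concrete, here is a minimal sketch of the mapping shown in the table. It is not the `StaticLoopChunker` code: `ChunkConfig`, `perBlockPass`, `blockStride`, `firstIteration`, and `iterationsPerPass` are hypothetical names, and the block stride of 3 * 22 = 66 is an assumption read off the table (block 1 starts at iteration 66).

```cpp
#include <cstdint>

// Hypothetical configuration for the example above; not part of the runtime.
struct ChunkConfig {
  int64_t NumBlocks;   // 5 in the example
  int64_t BlockChunk;  // 3 in the example
  int64_t NumThreads;  // 11 in the example
  int64_t ThreadChunk; // 2 in the example
};

// Iterations one block covers in a single pass of the do loop: 11 * 2 = 22.
constexpr int64_t perBlockPass(const ChunkConfig &C) {
  return C.NumThreads * C.ThreadChunk;
}

// Assumed distance between the first iterations of consecutive blocks:
// BlockChunk * perBlockPass = 3 * 22 = 66, as in the table.
constexpr int64_t blockStride(const ChunkConfig &C) {
  return C.BlockChunk * perBlockPass(C);
}

// First iteration handled by thread TId of block BId in the first pass.
// The thread then covers ThreadChunk consecutive iterations from there.
constexpr int64_t firstIteration(const ChunkConfig &C, int64_t BId,
                                 int64_t TId) {
  return BId * blockStride(C) + TId * C.ThreadChunk;
}

// Iterations covered by all blocks together in a single pass: 5 * 22 = 110.
constexpr int64_t iterationsPerPass(const ChunkConfig &C) {
  return C.NumBlocks * perBlockPass(C);
}

constexpr ChunkConfig Example{5, 3, 11, 2};
```

With these values, `firstIteration(Example, 1, 0)` is 66 and `firstIteration(Example, 4, 10)` is 284, matching the B1T0 and B4T10 columns above.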
https://github.com/llvm/llvm-project/pull/81618