[Mlir-commits] [mlir] [mlir][affine][gpu] support unroll dynamic value and apply it to gpu.thread_id op (PR #128113)

Sun Feb 23 21:01:12 PST 2025

================
@@ -117,7 +118,8 @@ static void replaceIterArgsAndYieldResults(AffineForOp forOp) {
 /// was known to have a single iteration.
 LogicalResult mlir::affine::promoteIfSingleIteration(AffineForOp forOp) {
   std::optional<uint64_t> tripCount = getConstantTripCount(forOp);
-  if (!tripCount || *tripCount != 1)
+  std::optional<uint64_t> maxTripCount = getMaxConstantTripCount(forOp);
+  if (!tripCount || *tripCount != 1 || !maxTripCount || *maxTripCount != 1)
----------------
linuxlonelyeagle wrote:

`maxTripCount` is always greater than or equal to `tripCount`. On the CPU, the two are equal. The original commit also included removal of the invalid loop; that part has now been removed.
```  
std::optional<uint64_t> tripCount = getConstantTripCount(forOp);
std::optional<uint64_t> maxTripCount = getMaxConstantTripCount(forOp);
```
* Keep the loop in this case:
tripCount = 0
maxTripCount = 1

* The other case is where the IR inside the loop is moved out of the loop (the loop is promoted):
tripCount = 1
maxTripCount = 1

* Core idea:
tripCount = (upper - (blockSize - 1)) div stride
maxTripCount = (upper - 0) div stride
Here blockSize - 1 is maxThreadId and 0 is minThreadId.
The above rules apply to all scoped Values.
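
To make the two cases concrete, here is a minimal standalone sketch of the arithmetic. It is not the code in this PR. It assumes a loop of the form `for (i = threadId; i < upper; i += stride)` with `threadId` in `[0, blockSize - 1]`, and it assumes that `div` means ceiling division. The names `tripCountFor`, `TripCountRange`, `tripCountRange`, and `canPromote` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: number of iterations of
//   for (i = lower; i < upper; i += stride)
// Assumes stride > 0 and uses ceiling division (an assumption).
constexpr uint64_t tripCountFor(int64_t lower, int64_t upper, int64_t stride) {
  if (upper <= lower)
    return 0;
  return static_cast<uint64_t>((upper - lower + stride - 1) / stride);
}

// Smallest and largest trip count over all possible thread ids.
struct TripCountRange {
  uint64_t minTripCount; // reached at lower = maxThreadId = blockSize - 1
  uint64_t maxTripCount; // reached at lower = minThreadId = 0
};

constexpr TripCountRange tripCountRange(int64_t upper, int64_t stride,
                                        int64_t blockSize) {
  return {tripCountFor(blockSize - 1, upper, stride),
          tripCountFor(0, upper, stride)};
}

// Promote the body out of the loop only if every thread runs exactly once.
constexpr bool canPromote(TripCountRange r) {
  return r.minTripCount == 1 && r.maxTripCount == 1;
}

// upper = 1, stride = 1, blockSize = 2: thread 1 runs zero times and
// thread 0 runs once, so the loop has to stay.
static_assert(!canPromote(tripCountRange(1, 1, 2)));

// upper = 4, stride = 4, blockSize = 4: every thread runs exactly once,
// so the body can be moved out of the loop.
static_assert(canPromote(tripCountRange(4, 4, 4)));
```

The `static_assert` lines mirror the two cases listed above: `tripCount = 0, maxTripCount = 1` keeps the loop, and `tripCount = 1, maxTripCount = 1` allows promotion.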

For details, please see the discussion between me and [krzysz00](https://github.com/krzysz00)  above. 

https://github.com/llvm/llvm-project/pull/128113