[Mlir-commits] [mlir] [mlir][gpu] reverse parallel loop to gpu dimension mapping order. (PR #79592)

Mon Jan 29 04:14:00 PST 2024

================
@@ -78,23 +77,23 @@ static Processor getHardwareIdForMapping(MappingLevel level, int dimension) {
   case MapGrid:
     switch (dimension) {
     case 0:
-      return Processor::BlockX;
+      return Processor::BlockZ;
     case 1:
       return Processor::BlockY;
     case 2:
-      return Processor::BlockZ;
+      return Processor::BlockX;
----------------
ftynse wrote:

I don't think it makes sense to invert _block_ dimensions. Only the _thread_ dimensions should be inverted, since there we may want `x` to map to the innermost loop to enable things like memory coalescing.
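
To illustrate the coalescing point, here is a minimal host-side sketch. It assumes CUDA-style linearization, where `x` varies fastest across the lanes of a warp, and a row-major array. `ThreadId`, `laneToThread`, `rowMajorOffset`, `laneStride`, and the block shape of 32 threads along `x` are hypothetical names and values chosen for the example, not MLIR API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration only; none of this is MLIR API.
struct ThreadId {
  int x, y;
};

// CUDA-style linearization: within a block, x varies fastest, so
// consecutive lanes of a warp differ in x first.
inline ThreadId laneToThread(int lane, int dimX) {
  assert(dimX > 0 && "block x-dimension must be positive");
  return {lane % dimX, lane / dimX};
}

// Element offset of A[i][j] in a row-major array with `cols` columns.
inline std::int64_t rowMajorOffset(int i, int j, int cols) {
  return static_cast<std::int64_t>(i) * cols + j;
}

// Distance in elements between the addresses touched by lanes 0 and 1
// for the 2-D loop nest `for i { for j { use A[i][j] } }`.
inline std::int64_t laneStride(bool innermostOnX, int cols) {
  const int dimX = 32; // assumed block shape: 32 threads along x
  auto offsetFor = [&](int lane) {
    ThreadId t = laneToThread(lane, dimX);
    int i = innermostOnX ? t.y : t.x; // outer loop index
    int j = innermostOnX ? t.x : t.y; // inner (contiguous) loop index
    return rowMajorOffset(i, j, cols);
  };
  return offsetFor(1) - offsetFor(0);
}
```

With the innermost loop on `x`, `laneStride(true, 128)` is 1, so adjacent lanes touch adjacent elements. With the outermost loop on `x`, `laneStride(false, 128)` is 128, a full row apart. Block indices do not affect which addresses the lanes of a single warp access, so this argument does not carry over to the grid mapping.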

https://github.com/llvm/llvm-project/pull/79592