[PATCH] D75052: [MLIR][GPU] Properly model step in parallel loop to gpu conversion.
Uday Bondhugula via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 25 03:49:02 PST 2020
bondhugula added inline comments.
================
Comment at: mlir/test/Conversion/LoopsToGPU/parallel_loop.mlir:230
loop.parallel (%arg5, %arg6) = (%c0, %c0) to (%3, %5) step (%c1, %c1) {
- %17 = load %6[%arg5, %arg6] : memref<?x?xf32, #map2>
- %18 = load %11[%arg5, %arg6] : memref<?x?xf32, #map2>
- %19 = load %16[%arg5, %arg6] : memref<?x?xf32, #map2>
+ %17 = load %6[%arg5, %arg6] : memref<?x?xf32, #map3>
+ %18 = load %11[%arg5, %arg6] : memref<?x?xf32, #map3>
----------------
Side question: why aren't we using affine.load/store instead of load/store, and affine.parallel instead of loop.parallel, here? With the former, you get things like store-to-load forwarding, redundant load elimination, and composition of the ops supplying subscript values into the load/store itself; the infrastructure for all of these already exists and is available whenever you need it. All the mapping metadata should fit nicely into affine.parallel as well.
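For illustration, a rough sketch of what this body could look like on the affine dialect, assuming the bounds and steps here are affine-expressible (SSA names carried over from the snippet above, modulo exact bound syntax):

  affine.parallel (%arg5, %arg6) = (0, 0) to (%3, %5) step (1, 1) {
    %17 = affine.load %6[%arg5, %arg6] : memref<?x?xf32, #map3>
    %18 = affine.load %11[%arg5, %arg6] : memref<?x?xf32, #map3>
    ...
  }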
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D75052/new/
https://reviews.llvm.org/D75052