[PATCH] D76700: [MLIR] Introduce full/partial tile separation using if/else

Uday Bondhugula via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 24 06:57:09 PDT 2020


bondhugula created this revision.
bondhugula added reviewers: andydavis1, flaub.
Herald added subscribers: llvm-commits, Joonsoo, liufengdb, aartbik, lucyrfox, mgester, arpith-jacob, nicolasvasilache, antiagainst, shauheen, burmako, jpienaar, rriddle, mehdi_amini.
Herald added a project: LLVM.
bondhugula added a parent revision: D76701: [MLIR] Add flat affine constraints method to round trip integer set.
bondhugula edited the summary of this revision.

This patch introduces a utility to separate full tiles from partial
tiles when tiling affine loop nests where trip counts are unknown or
where tile sizes don't divide trip counts. A conditional guard is
generated to separate out the full tile (with constant trip count loops)
into the then block of an 'affine.if' and the partial tile to the else
block. The separation allows the 'then' block (which has constant trip
count loops) to be optimized better subsequently: for example, for
unroll-and-jam, register tiling, vectorization without leading to
cleanup code, or to offload to accelerators. Among techniques from the
literature, the if/else based separation leads to the most compact
cleanup code for multi-dimensional cases (because a single version is
used to model all partial tiles).

INPUT

  affine.for %i0 = 0 to %M {
    affine.for %i1 = 0 to %N {
      "foo"() : () -> ()
    }
  }

OUTPUT AFTER TILING W/O SEPARATION

  #map0 = affine_map<(d0) -> (d0)>
  #map1 = affine_map<(d0)[s0] -> (d0 + 32, s0)>
  
  affine.for %arg2 = 0 to %M step 32 {
    affine.for %arg3 = 0 to %N step 32 {
      affine.for %arg4 = #map0(%arg2) to min #map1(%arg2)[%M] {
        affine.for %arg5 = #map0(%arg3) to min #map1(%arg3)[%N] {
          "foo"() : () -> ()
        }
      }
    }
  }
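
As a rough illustration of what the 'min' upper bounds do, the tiled nest
above can be modeled in plain Python. This is only a sketch: the function
names original_points and tiled_points and the tile parameter are made up
for this example and are not part of the patch.

```python
def original_points(M, N):
    """Iteration points of the untiled nest: 0 <= i < M, 0 <= j < N."""
    return [(i, j) for i in range(M) for j in range(N)]


def tiled_points(M, N, tile=32):
    """Iteration points of the tiled nest with min-clamped intra-tile bounds."""
    points = []
    for ii in range(0, M, tile):          # plays the role of %arg2
        for jj in range(0, N, tile):      # plays the role of %arg3
            # Upper bound min(ii + tile, M) mirrors `min #map1(%arg2)[%M]`.
            for i in range(ii, min(ii + tile, M)):        # %arg4
                # Upper bound min(jj + tile, N) mirrors `min #map1(%arg3)[%N]`.
                for j in range(jj, min(jj + tile, N)):    # %arg5
                    points.append((i, j))
    return points
```

Every intra-tile loop here has a trip count that depends on M or N, even
for tiles that lie entirely inside the iteration space.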

OUTPUT AFTER TILING WITH SEPARATION

  #map0 = affine_map<(d0) -> (d0)>
  #map1 = affine_map<(d0) -> (d0 + 32)>
  #map2 = affine_map<(d0)[s0] -> (d0 + 32, s0)>
  
  #set0 = affine_set<(d0, d1)[s0, s1] : (-d0 + s0 - 32 >= 0, -d1 + s1 - 32 >= 0)>
  
  affine.for %arg2 = 0 to %M step 32 {
    affine.for %arg3 = 0 to %N step 32 {
      affine.if #set0(%arg2, %arg3)[%M, %N] {
        // Full tile.
        affine.for %arg4 = #map0(%arg2) to #map1(%arg2) {
          affine.for %arg5 = #map0(%arg3) to #map1(%arg3) {
            "foo"() : () -> ()
          }
        }
      } else {
        // Partial tile.
        affine.for %arg4 = #map0(%arg2) to min #map2(%arg2)[%M] {
          affine.for %arg5 = #map0(%arg3) to min #map2(%arg3)[%N] {
            "foo"() : () -> ()
          }
        }
      }
    }
  }
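
The guard in #set0 holds exactly when a tile fits entirely inside the
iteration space in both dimensions. A minimal Python model of the
separated nest, assuming a tile size of 32 as in the listing, is sketched
below; the name separated_tiles and its return shape are hypothetical and
exist only for this illustration.

```python
def separated_tiles(M, N, tile=32):
    """Visit each point of the M x N space once; count full and partial tiles."""
    full, partial, points = 0, 0, []
    for ii in range(0, M, tile):
        for jj in range(0, N, tile):
            # Mirrors #set0: -d0 + s0 - 32 >= 0 and -d1 + s1 - 32 >= 0.
            if M - ii - tile >= 0 and N - jj - tile >= 0:
                # Full tile: both intra-tile loops run exactly `tile` times.
                full += 1
                for i in range(ii, ii + tile):
                    for j in range(jj, jj + tile):
                        points.append((i, j))
            else:
                # Partial tile: bounds are clamped with min, as without separation.
                partial += 1
                for i in range(ii, min(ii + tile, M)):
                    for j in range(jj, min(jj + tile, N)):
                        points.append((i, j))
    return full, partial, points
```

For M = 70 and N = 45, this model classifies 2 of the 6 tiles as full and
4 as partial, and still visits all 70 * 45 points exactly once. A single
else branch covers every kind of partial tile (right edge, bottom edge,
and corner).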

The separation is tested via a command-line flag on the loop tiling pass.
The utility itself allows one to pass in any band of contiguously nested
loops, and can be used by other transforms/utilities. The current
implementation works for hyperrectangular loop nests.

Signed-off-by: Uday Bondhugula <uday at polymagelabs.com>


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D76700

Files:
  mlir/include/mlir/Analysis/AffineStructures.h
  mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
  mlir/include/mlir/Transforms/LoopUtils.h
  mlir/lib/Analysis/AffineStructures.cpp
  mlir/lib/Analysis/Utils.cpp
  mlir/lib/Dialect/Affine/Transforms/LoopTiling.cpp
  mlir/lib/Transforms/Utils/LoopUtils.cpp
  mlir/test/Dialect/Affine/loop-tiling.mlir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D76700.252298.patch
Type: text/x-patch
Size: 34740 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200324/e4b96497/attachment.bin>

