[PATCH] D72022: [mlir][Linalg] Extend generic ops to allow tensors

Nicolas Vasilache via Phabricator via llvm-commits <llvm-commits@lists.llvm.org>
Thu Jan 2 10:17:37 PST 2020


nicolasvasilache added a comment.

@mravishankar actually this will simplify things quite a lot.
Copy-pasting from the mlir@tensorflow.org discussion: https://groups.google.com/a/tensorflow.org/g/mlir/c/pbWk9a-t3Xc

  This has a number of implications for some of the transformations that are traditionally done at the level of the HLO dialect.
  In particular, everything related to trivial fusion of pointwise operators can be done immediately using the region.
  This avoids the need for the current, more cumbersome and phase-ordered flow that does:
  1. mark fusion with XLA fusion nodes,
  2. allocate buffers for everything,
  3. convert to Linalg,
  4. apply fusion in Linalg,
  5. perform an analysis and remove temporary buffers that have been fused.
  
  Note that step 4 may not necessarily do what one intended at step 1, since these are different systems that were not really designed to talk to each other.
  
  Instead, all of this can be replaced by a single step:
  1. apply fusion of ops using regions
  
  Temporary buffers are never materialized in the first place.
  This becomes especially handy when implicit or explicit broadcast semantics are involved: such patterns are trivial to fuse at the level of Linalg on tensors, and the unnecessary intermediate memory is never allocated.
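  
  As a rough sketch, here is what such a fusion looks like in the tensor form this patch enables (the #pointwise_1d trait and all value names are made up for illustration, and exact attribute spellings may differ from the revision under review):
  
      // Before fusion: %tmp is just an SSA value, not a buffer.
      %tmp = linalg.generic #pointwise_1d %A, %B {
      ^bb0(%a: f32, %b: f32):
        %0 = addf %a, %b : f32
        linalg.yield %0 : f32
      } : tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>
      %res = linalg.generic #pointwise_1d %tmp, %C {
      ^bb0(%t: f32, %c: f32):
        %1 = mulf %t, %c : f32
        linalg.yield %1 : f32
      } : tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>
  
      // After fusion: a single region computes both ops; %tmp is gone
      // and no intermediate memory is ever allocated.
      %res = linalg.generic #pointwise_1d %A, %B, %C {
      ^bb0(%a: f32, %b: f32, %c: f32):
        %0 = addf %a, %b : f32
        %1 = mulf %0, %c : f32
        linalg.yield %1 : f32
      } : tensor<?xf32>, tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>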
  
  There are many other implications for the types of transforms that become available at this level (hint: look at the TASO compiler), but I only listed the most obvious one.
  
  In my mind, the codegen path where things are most natural is:
  
  User
  -> Language / Framework 
  -> HLO + Linalg on tensors 
  -> LHLO + Linalg on buffers 
  (note that buffer allocation in Linalg on tensors -> Linalg on buffers can be very progressive, intermixing ops on tensors and on buffers arbitrarily; see the sketch below)
  -> Affine/StructuredControlFlow (still named Loops atm)
  -> backends
  
  Different transformations apply at each level. 
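  
  To make the "very progressive" point concrete, here is a hypothetical mid-bufferization snapshot in which a producer has already been rewritten to buffers while its consumer still operates on tensors (again a sketch with made-up names; tensor_load is the std op bridging the two worlds):
  
      %buf = alloc(%d) : memref<?xf32>
      // Already bufferized: writes its result into %buf.
      linalg.generic #pointwise_1d %Abuf, %Bbuf, %buf {
      ^bb0(%a: f32, %b: f32, %out: f32):
        %0 = addf %a, %b : f32
        linalg.yield %0 : f32
      } : memref<?xf32>, memref<?xf32>, memref<?xf32>
      // Bridge back to tensor land for the not-yet-bufferized consumer.
      %t = tensor_load %buf : memref<?xf32>
      %res = linalg.generic #pointwise_1d %t, %C {
      ^bb0(%a: f32, %c: f32):
        %1 = mulf %a, %c : f32
        linalg.yield %1 : f32
      } : tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>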

This does not, of course, remove the need for the buffer allocation pass, but that pass is just a trivial extension of what already exists.
The particular case you mention, Linalg + tensors -> GPU, is illegal as-is: the IR must first be legalized with buffer allocation.
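
Concretely, that legalization is just the tensor-to-buffer rewrite of each op (sketch, same made-up trait as above, regions elided):

    // Illegal input for the GPU path: the result is a tensor value.
    %res = linalg.generic #pointwise_1d %A, %B { ... }
             : tensor<?xf32>, tensor<?xf32> -> tensor<?xf32>

    // Legal after buffer allocation: the same op on memrefs, with the
    // result materialized into an explicitly allocated buffer.
    %out = alloc(%d) : memref<?xf32>
    linalg.generic #pointwise_1d %Abuf, %Bbuf, %out { ... }
      : memref<?xf32>, memref<?xf32>, memref<?xf32>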


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72022/new/

https://reviews.llvm.org/D72022




