[PATCH] D71904: [mlir] GPU: introduce utilities for promotion to workgroup memory

Nicolas Vasilache via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 2 09:40:05 PST 2020


nicolasvasilache added a comment.
Herald added a subscriber: arpith-jacob.

I think this can be significantly simplified by reusing existing infra better.

In the longer term, composing the following steps should allow to write all this in a few lines of code:

1. (optionally, not sure if really needed) use a proper `linalg.reshape` to insert your dummy dimensions (once available)
2. use the `linalg.pointwise` builder (https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h#L113) with the proper indexings and a region that just yields the input (see e.g. https://github.com/llvm/llvm-project/blob/master/mlir/test/EDSC/builder-api-test.cpp#L837)
3. use `linalgOpToLoops` (https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Dialect/Linalg/Transforms/LinalgTransforms.h#L75) (bonus points extend the infra to capture the loops so you can do 4. directly)
4. apply `mapLoopToProcessorIds` (https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Transforms/LoopUtils.h#L221)
5. (alternative to 3. + 4.) use the linalg -> SPMD loops mapping transformation (future work)

Please add a TODO so we can revisit later and simplify.



================
Comment at: mlir/lib/Dialect/GPU/Transforms/MemoryPromotion.cpp:78
+  bounds.reserve(rank);
+  ConstantGenerator constants(builder, loc);
+
----------------
No need to reimplement EDSCs for the special case of constants, you can just use them: `constant_index`
See https://github.com/llvm/llvm-project/blob/master/mlir/test/EDSC/builder-api-test.cpp#L473 and code around for usage.


================
Comment at: mlir/lib/Dialect/GPU/Transforms/MemoryPromotion.cpp:82
+  // stepping from 0 to the number of elements in the corresponding dimension.
+  if (type.getRank() > numRequiredLoops) {
+    for (int64_t s : type.getShape().drop_back(numRequiredLoops)) {
----------------
Since you're already using  EDSCs, note that you already have support for this and the code above here:
https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/EDSC/Helpers.h#L67


================
Comment at: mlir/lib/Dialect/GPU/Transforms/MemoryPromotion.cpp:108
+  auto indexType = builder.getIndexType();
+  for (unsigned i = 0; i < numRequiredLoops; ++i) {
+    auto dimName = builder.getStringAttr(getDimName(numRequiredLoops - i - 1));
----------------
Why roll your own special cases?
You could just emit the loops as you do (note the loops themselves are captured since you use the EDSCs) and then apply:
```
void mapLoopToProcessorIds(loop::ForOp forOp, ArrayRef<Value> processorId,
                           ArrayRef<Value> numProcessors);
```
https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Transforms/LoopUtils.h#L221


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71904/new/

https://reviews.llvm.org/D71904





More information about the llvm-commits mailing list