[PATCH] D71904: [mlir] GPU: introduce utilities for promotion to workgroup memory
Nicolas Vasilache via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 2 09:40:05 PST 2020
nicolasvasilache added a comment.
Herald added a subscriber: arpith-jacob.
I think this can be significantly simplified by reusing existing infra better.
In the longer term, composing the following steps should allow to write all this in a few lines of code:
1. (optionally, not sure if really needed) use a proper `linalg.reshape` to insert your dummy dimensions (once available)
2. use the `linalg.pointwise` builder (https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Dialect/Linalg/EDSC/Builders.h#L113) with the proper indexings and a region that just yields the input (see e.g. https://github.com/llvm/llvm-project/blob/master/mlir/test/EDSC/builder-api-test.cpp#L837)
3. use `linalgOpToLoops` (https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Dialect/Linalg/Transforms/LinalgTransforms.h#L75) (bonus points extend the infra to capture the loops so you can do 4. directly)
4. apply `mapLoopToProcessorIds` (https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Transforms/LoopUtils.h#L221)
5. (alternative to 3. + 4.) use the linalg -> SPMD loops mapping transformation (future work)
Please add a TODO so we can revisit later and simplify.
================
Comment at: mlir/lib/Dialect/GPU/Transforms/MemoryPromotion.cpp:78
+ bounds.reserve(rank);
+ ConstantGenerator constants(builder, loc);
+
----------------
No need to reimplement EDSCs for the special case of constants, you can just use them: `constant_index`
See https://github.com/llvm/llvm-project/blob/master/mlir/test/EDSC/builder-api-test.cpp#L473 and code around for usage.
================
Comment at: mlir/lib/Dialect/GPU/Transforms/MemoryPromotion.cpp:82
+ // stepping from 0 to the number of elements in the corresponding dimension.
+ if (type.getRank() > numRequiredLoops) {
+ for (int64_t s : type.getShape().drop_back(numRequiredLoops)) {
----------------
Since you're already using EDSCs, note that you already have support for this and the code above here:
https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/EDSC/Helpers.h#L67
================
Comment at: mlir/lib/Dialect/GPU/Transforms/MemoryPromotion.cpp:108
+ auto indexType = builder.getIndexType();
+ for (unsigned i = 0; i < numRequiredLoops; ++i) {
+ auto dimName = builder.getStringAttr(getDimName(numRequiredLoops - i - 1));
----------------
Why roll your own special cases?
You could just emit the loops as you do (note the loops themselves are captured since you use the EDSCs) and then apply:
```
void mapLoopToProcessorIds(loop::ForOp forOp, ArrayRef<Value> processorId,
ArrayRef<Value> numProcessors);
```
https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Transforms/LoopUtils.h#L221
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71904/new/
https://reviews.llvm.org/D71904
More information about the llvm-commits
mailing list