[PATCH] D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.
Stephan Herhut via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed May 25 08:26:59 PDT 2022
herhut added inline comments.
================
Comment at: mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:158
+// by the same divisor.
+struct ExpandDivF16 : public ConvertOpToLLVMPattern<LLVM::FDivOp> {
+ using ConvertOpToLLVMPattern<LLVM::FDivOp>::ConvertOpToLLVMPattern;
----------------
This pattern is a bit misplaced here, as `LLVM::FDivOp` is not really a GPU dialect operation. Instead, should this be a special lowering of the arith dialect to NVVM (which we do not have yet) or a rewrite at the LLVM dialect level?
When lowering to LLVM, we already typically configure a different lowering for math dialect, so configuring the lowering of arith dialect differently seems like an OK option. That would mean a specialized pattern for `arith.divf` with higher priority. That would also give users a choice.
================
Comment at: mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:304
+ patterns.add<ExpandDivF16>(converter);
+
----------------
I assume this is to differentiate this pattern somehow but there is no need for an extra `patterns.add` here.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D126158/new/
https://reviews.llvm.org/D126158
More information about the cfe-commits
mailing list