[PATCH] D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.

Christian Sigg via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon May 30 00:03:26 PDT 2022


csigg added inline comments.


================
Comment at: mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:158
+// by the same divisor.
+struct ExpandDivF16 : public ConvertOpToLLVMPattern<LLVM::FDivOp> {
+  using ConvertOpToLLVMPattern<LLVM::FDivOp>::ConvertOpToLLVMPattern;
----------------
herhut wrote:
> This pattern is a bit misplaced here, as `LLVM::FDivOp` is not really a GPU dialect operation. Instead, should this be a special lowering of the arith dialect to NVVM (which we do not have yet) or a rewrite at the LLVM dialect level?
> 
> When lowering to LLVM, we already typically configure a different lowering for math dialect, so configuring the lowering of arith dialect differently seems like an OK option. That would mean a specialized pattern for `arith.divf` with higher priority. That would also give users a choice.
Yes, I agree it's a bit misplaced. I considered it the best of all questionable options.

Adding it to ArithToLLVM doesn't really work, because we don't want it to depend on the NVVM dialect.

How about adding it as a separate pass to `mlir/include/mlir/Dialect/LLVMIR/Transforms/Passes.td`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126158/new/

https://reviews.llvm.org/D126158



More information about the cfe-commits mailing list