[Mlir-commits] [mlir] [NvGpu Dialect] add rcp approxe op (PR #100965)
Guray Ozen
llvmlistbot at llvm.org
Sun Jul 28 23:27:08 PDT 2024
================
@@ -802,4 +803,16 @@ def NVGPU_WarpgroupMmaInitAccumulatorOp : NVGPU_Op<"warpgroup.mma.init.accumulat
let hasVerifier = 1;
}
+def NVGPU_RcpApproxOp : NVGPU_Op<"rcp_approx", [
----------------
grypp wrote:
`rcp.approx` is quite narrow in scope for an NVGPU op. [The PTX instruction](https://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-rcp) comes in many flavors, and I think a single op can cover them. What do you think?
Could we rename the operation to `nvgpu.rcp` and add enumerators for `approx`, `rnd`, and `ftz`? It's okay if NVVM doesn't support all of them yet; you can emit an error message for the unsupported ones for the time being.
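
To illustrate the suggestion, here is a minimal standalone C++ sketch of the idea, not actual MLIR or ODS code. The names `RcpRoundingMode`, `RcpOptions`, and `verifyRcp` are hypothetical, and the rule that only `approx` with `ftz` is accepted is an assumption about what the lowering could handle today. It shows one set of options covering the PTX flavors, with a verifier-style check that rejects the combinations that are not supported yet.

```cpp
#include <optional>
#include <string>

// Hypothetical rounding modes, named after the PTX `rcp` modifiers
// (.approx, or .rnd with one of .rn/.rz/.rm/.rp).
enum class RcpRoundingMode { Approx, RN, RZ, RM, RP };

// Hypothetical attribute bundle for a single `nvgpu.rcp` op.
struct RcpOptions {
  RcpRoundingMode rounding = RcpRoundingMode::Approx;
  bool ftz = false; // flush-to-zero modifier
};

// Verifier-style check: returns an error message for flavors that cannot be
// lowered yet, or std::nullopt when the combination is accepted.
// Assumption: only approx + ftz is supported for the time being.
std::optional<std::string> verifyRcp(const RcpOptions &opts) {
  if (opts.rounding != RcpRoundingMode::Approx)
    return "nvgpu.rcp: only the approx rounding mode is supported for now";
  if (!opts.ftz)
    return "nvgpu.rcp: ftz must be set for now";
  return std::nullopt;
}
```

With this shape, supporting a new flavor later only means relaxing the check, while the op name and its attributes stay the same.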
https://github.com/llvm/llvm-project/pull/100965