[clang] [llvm] [mlir] [AMDGPU] [ROCDL] Added Intrinsics for smed, umed, to support ISA instructions from ROCDL (PR #157748)
Keshav Vinayak Jha via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 11 08:05:19 PDT 2025
keshavvinayak01 wrote:
> It is not necessary to add new intrinsics for these operations. You are better off writing the med3 in terms of min and max and letting the backend deal with it. The effort of fully supporting all analyses and optimizations on a new operation is very high
@arsenm I see that we already support lowering fmed3 all the way down to the AMDGPU V_MED3_F32 / V_MED3_F16 instructions; see the [MI300 CDNA3 ISA guide](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf). Why can't we also add similar intrinsics for SMED and UMED (V_MED3_I32 / V_MED3_U32) when the hardware already supports those instructions?
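For reference, the min/max formulation suggested in the quoted review can be sketched as below. This is only an illustration of the identity `med3(a, b, c) = max(min(a, b), min(max(a, b), c))`; the names `med3`, `smed3`, and `umed3` are hypothetical helpers, not existing LLVM or ROCDL APIs, and the sketch does not check whether the backend selects a med3 instruction for this exact shape.

```cpp
#include <algorithm>
#include <cstdint>

// Median of three using only min and max:
//   max(min(a, b), min(max(a, b), c))
// The comparison semantics (signed vs. unsigned) come from the type T.
template <typename T>
constexpr T med3(T a, T b, T c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Hypothetical wrappers (names are assumptions for illustration only).
// Signed median: operands compared as two's-complement integers.
constexpr int32_t smed3(int32_t a, int32_t b, int32_t c) {
  return med3<int32_t>(a, b, c);
}

// Unsigned median: the same bit patterns compared as unsigned integers.
constexpr uint32_t umed3(uint32_t a, uint32_t b, uint32_t c) {
  return med3<uint32_t>(a, b, c);
}

// Compile-time checks: the same bit pattern (-5 == 0xFFFFFFFB) yields a
// different median depending on signedness.
static_assert(smed3(-5, 3, 0) == 0, "signed median");
static_assert(umed3(0xFFFFFFFBu, 3u, 0u) == 3u, "unsigned median");
```

The signed and unsigned cases differ only in how the operands are compared, which is why they would map to separate instructions.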
https://github.com/llvm/llvm-project/pull/157748