[libclc] Reduce bithacking for INF/NAN values (PR #129738)
Fraser Cormack via cfe-commits
cfe-commits at lists.llvm.org
Tue Mar 18 03:50:32 PDT 2025
================
@@ -46,9 +46,7 @@ _CLC_DEF _CLC_OVERLOAD __CLC_GENTYPE __clc_hypot(__CLC_GENTYPE x,
__CLC_GENTYPE retval = __clc_sqrt(__clc_mad(fx, fx, fy * fy)) * fx_exp;
retval = (ux > PINFBITPATT_SP32 || uy == 0) ? __CLC_AS_GENTYPE(ux) : retval;
- retval = (ux == PINFBITPATT_SP32 || uy == PINFBITPATT_SP32)
- ? __CLC_AS_GENTYPE((__CLC_UINTN)PINFBITPATT_SP32)
- : retval;
+ retval = __clc_isinf(x) || __clc_isinf(y) ? __CLC_GENTYPE_INF : retval;
----------------
frasercrmck wrote:
I was looking at `V_FREXP_EXP_I32_F32` and `V_FREXP_MANT_F32` in the [RDNA 3.5 docs](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna35_instruction_set_architecture.pdf).
I was previously looking at the lowering of `llvm.frexp` in CodeGen tests and saw more than just those two instructions, but now I realise I was only seeing the GFX6 output. The lowering is indeed just those two instructions for other architectures.
My broader point is that I need to align the semantics of these instructions with what other targets would have to do for the equivalent operations. Matching the AMDGPU behaviour like-for-like on other targets (separating out the two exp/mant operations, supporting subnormals, etc.) would be far more expensive than the bithacking `hypot` currently does to just shift out the mantissa. But if the AMDGPU version of `frexp_mant` does subnormal scaling and other targets' versions don't, that could cause bugs between platforms, because we could no longer guarantee the bits that come out of the `frexp_mant` and `frexp_exp` helpers.
That's why I suggested it might just be better to have AMDGPU override `hypot` directly, rather than rely on a "generic" `hypot` that calls into mant/exp helpers which AMDGPU specializes.
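To make the cross-platform concern concrete, here is a minimal sketch in plain C (not libclc code). The helper names `float_bits`, `bithack_exp` and `frexp_exp` are hypothetical and exist only for this illustration. The C library's `frexpf` stands in for a helper that normalizes subnormal inputs; whether any particular target's instruction behaves exactly like it is an assumption to check against that target's documentation.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a float as a uint32_t. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical "bithack" exponent: read the biased exponent field and
 * rebase it to the frexp convention (mantissa in [0.5, 1)).
 * It does no subnormal scaling, so every subnormal input yields -126. */
static int bithack_exp(float f) {
    return (int)((float_bits(f) >> 23) & 0xFFu) - 126;
}

/* Exponent as reported by the C library's frexpf, which normalizes
 * subnormal inputs before reporting the exponent. */
static int frexp_exp(float f) {
    int e;
    (void)frexpf(f, &e);
    return e;
}
```

For a normal input such as `8.0f` the two helpers agree. For the smallest subnormal, `0x1p-149f`, they do not: the bit-field version reports the fixed subnormal exponent, while `frexpf` reports the exponent of the normalized value. A generic `hypot` built on helpers that differ in this way per target would see different bits on each platform.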
https://github.com/llvm/llvm-project/pull/129738