[libclc] Refine __clc_fp*_subnormals_supported and __clc_flush_denormal_if_not_supported (PR #157633)

Mészáros Gergely via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 3 01:34:43 PDT 2025


================
@@ -127,9 +127,9 @@ _CLC_DEF _CLC_OVERLOAD float __clc_sw_fma(float a, float b, float c) {
     return c;
   }
 
-  a = __clc_flush_denormal_if_not_supported(a);
-  b = __clc_flush_denormal_if_not_supported(b);
-  c = __clc_flush_denormal_if_not_supported(c);
+  a = __clc_soft_flush_denormal(a);
+  b = __clc_soft_flush_denormal(b);
+  c = __clc_soft_flush_denormal(c);
----------------
Maetveis wrote:

> surely compiler-rt already has an implementation?

It doesn't. LLVM libc has one, but it uses FP64, so I don't think it is of much help. I'd expect most targets without hardware FMA to lack FP64 as well.

I think dropping sw fma would impact:
- SPIR-V, which would then start generating the `GLSL.std.450` extended instruction FMA. The problem there is that this instruction is (AFAICT) allowed to round the intermediate product, which the OpenCL spec doesn't allow. I'm not sure whether drivers actually implement it as fused or not.
Arguably the lowering is a bug in LLVM: `@llvm.fma` is specified to be fused even without fast math.
- Old R600 targets, not all of which have FMA. I think this change would break them. These GPUs are more than 10 years old at this point, though.

https://github.com/llvm/llvm-project/pull/157633

