[clang] [clang] constexpr built-in fma function. (PR #113020)

Tue Oct 29 11:34:06 PDT 2024

================
@@ -549,6 +562,22 @@ static bool interp__builtin_fpclassify(InterpState &S, CodePtr OpPC,
   return true;
 }
 
+static bool interp__builtin_fma(InterpState &S, CodePtr OpPC,
+                                const InterpFrame *Frame, const Function *Func,
+                                const CallExpr *Call) {
+  const Floating &X = getParam<Floating>(Frame, 0);
+  const Floating &Y = getParam<Floating>(Frame, 1);
+  const Floating &Z = getParam<Floating>(Frame, 2);
+  Floating Result;
+
+  llvm::RoundingMode RM = getActiveRoundingMode(S, Call);
+  Floating::mul(X, Y, RM, &Result);
+  Floating::add(Result, Z, RM, &Result);
----------------
lntue wrote:

Sorry, I'm not that familiar with the `Floating` type here, but does it perform these operations in arbitrary precision?  If they are performed in the same precision as the inputs, then this will not work correctly because of double rounding errors, overflow, and underflow.  If a higher intermediate precision is available, then the implementation can be simplified a bit.

A simple example of a double rounding error that you can try for a floating-point type `T` is:
```
x = 1 + std::numeric_limits<T>::epsilon();
y = 1 + std::numeric_limits<T>::epsilon();
z = std::numeric_limits<T>::epsilon() * 0.5;
```
then for the default rounding mode (round-to-nearest, ties-to-even):
```
T(T(x * y) + z) = 1 + 2 * std::numeric_limits<T>::epsilon();
FMA(x, y, z) = 1 + 3 * std::numeric_limits<T>::epsilon();
```
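
For anyone who wants to reproduce this outside the interpreter, here is a minimal standalone sketch of the same check. It assumes `T` is `float` or `double`, the default rounding mode, and a correctly rounded `std::fma` in the standard library. The helper names (`mul_then_add`, `fused`, `check_double_rounding`) are made up for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Multiply, round to T, then add and round again (two roundings).
// This mirrors the mul-then-add sequence in the hunk above.
template <typename T>
T mul_then_add(T x, T y, T z) {
  // volatile forces the product to be stored in type T and keeps the
  // compiler from contracting the expression into a hardware FMA.
  volatile T product = x * y;
  return product + z;
}

// Fused multiply-add: x * y + z is computed exactly and rounded once.
template <typename T>
T fused(T x, T y, T z) {
  return std::fma(x, y, z);
}

// Checks the example above for a given floating-point type T.
template <typename T>
void check_double_rounding() {
  const T eps = std::numeric_limits<T>::epsilon();
  const T x = T(1) + eps;
  const T y = T(1) + eps;
  const T z = eps * T(0.5);

  // x * y = 1 + 2*eps + eps^2 rounds to 1 + 2*eps; adding z then gives
  // the tie 1 + 2.5*eps, which ties-to-even resolves to 1 + 2*eps.
  assert(mul_then_add(x, y, z) == T(1) + 2 * eps);

  // The exact value 1 + 2.5*eps + eps^2 is just above the tie, so the
  // single rounding goes up to 1 + 3*eps.
  assert(fused(x, y, z) == T(1) + 3 * eps);
}
```

Calling `check_double_rounding<float>()` and `check_double_rounding<double>()` from `main` should pass on a conforming implementation. The same inputs could serve as a `static_assert` test for the constexpr builtin, since the mul-then-add result differs from the fused result by one ulp.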

https://github.com/llvm/llvm-project/pull/113020