[libc-commits] [PATCH] D152923: [libc] Add support for FMA in the GPU utilities

Wed Jun 14 11:07:49 PDT 2023

arsenm added a comment.

In D152923#4421830 <https://reviews.llvm.org/D152923#4421830>, @lntue wrote:

> - Resolve when the FMA instructions are available, which is straightforward for other architectures but not so much for early x86-64 AVX and AVX2 CPUs.

This is the backend's job. At best you are papering over gaps / bugs in legalization.

> - We cannot rely on `__builtin_fma` since it causes back-ref to libc's fma functions.

This is just a bug. The backend should always be able to handle llvm.fma. Whether or not x86 respects nobuiltin when lowering it is another question, but it should always be able to inline expand it or call into compiler-rt.
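
To make the strict case concrete, here is a minimal sketch of what single-rounding FMA semantics give you. It uses `std::fma` from `<cmath>` as a portable stand-in for `__builtin_fma`; the helper name `fused_residual` is hypothetical and is not part of the patch.

```cpp
#include <cmath>

// Hypothetical helper (not from the patch): computes x*x - c with a single
// rounding, so low-order bits of the product survive the subtraction.
double fused_residual(double x, double c) {
  // std::fma evaluates x*x + (-c) as if to infinite precision, then rounds once.
  return std::fma(x, x, -c);
}
```

With x = 1 + 2^-27 and c = 1 + 2^-26, the exact product is 1 + 2^-26 + 2^-54, so the fused result is 2^-54. A separately rounded multiply would lose that last term before the subtraction.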

> - Also when `fma` instructions are not available, we need precise control over falling back to either emulated fma functions or just multiply + add. With the current setup, we can simply control it by calling either `fputil::fma` or `fputil::multiply_add`.

If you do not care about the precision semantics of FMA, you really don't need to know anything about the target. You should just emit fmul contract or the fmuladd intrinsic (which you get using FP_CONTRACT on the basic expression). The backend then trivially introduces fma only when it is profitable.
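
A minimal sketch of the target-independent form, assuming a compiler that honors the standard pragma (Clang does; some other compilers ignore it, possibly with a warning). The helper name `multiply_add_contractable` is hypothetical and is not the existing `fputil::multiply_add` implementation.

```cpp
// Hypothetical helper (not the libc implementation): a plain multiply-add
// that the compiler is allowed, but not required, to fuse.
double multiply_add_contractable(double x, double y, double z) {
  // Must be the first thing in the compound statement. It permits contracting
  // x * y + z into one fused operation when the target makes that worthwhile.
#pragma STDC FP_CONTRACT ON
  return x * y + z;
}
```

The source contains no target checks. Whether this becomes an fma instruction or a separate multiply and add is left to the backend, so the result may differ in the last bit between targets. That is acceptable only when the exact FMA rounding is not required.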

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D152923