[libc-commits] [PATCH] D152923: [libc] Add support for FMA in the GPU utilities

Wed Jun 14 10:00:16 PDT 2023

lntue added a comment.

In D152923#4421746 <https://reviews.llvm.org/D152923#4421746>, @arsenm wrote:

> In D152923#4421674 <https://reviews.llvm.org/D152923#4421674>, @lntue wrote:
>
>> I would assume that the fma instructions on GPU will be more performant than normal multiply + add.  Do you want to let generic math functions use fma's for GPUs?
>>
>> https://github.com/llvm/llvm-project/blob/main/libc/src/__support/macros/properties/cpu_features.h#L39
>> https://github.com/llvm/llvm-project/blob/main/libc/src/__support/FPUtil/multiply_add.h#L30
>
> This is trying to resolve the problem the fmuladd intrinsic solves. The target macros should be dropped and you should simply implement multiply_add with FP_CONTRACT on and let the backend decide

Currently these macros are used for a few things:

- Determine when FMA instructions are available, which is straightforward for other architectures but less so for early x86-64 AVX and AVX2 CPUs.
- We cannot rely on `__builtin_fma`, since it can lower to a call back into libc's own fma functions.
- We use them to control code generation, building the math functions both with and without fma instructions for testing in the current setup, somewhat similar to the memory functions.
- When `fma` instructions are not available, we need precise control over whether to fall back to the emulated fma functions or to a plain multiply + add.  With the current setup, we can control this simply by calling either `fputil::fma` or `fputil::multiply_add`.
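
As a rough illustration of the last point, here is a minimal sketch of that split. It is not the actual libc code: `SKETCH_TARGET_HAS_FMA`, the `sketch` namespace, and the use of `std::fma` as a stand-in for a target-specific fused instruction are all assumptions made for the example (the real library cannot call back into `fma` this way, as noted above).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical feature macro for this sketch only. GCC and Clang define
// __FMA__ when targeting x86-64 with FMA enabled (e.g. -mfma); AArch64
// always provides a scalar fused multiply-add.
#if defined(__FMA__) || defined(__aarch64__)
#define SKETCH_TARGET_HAS_FMA
#endif

namespace sketch {

// Always a single rounding, even if that means software emulation.
// std::fma is a stand-in for the library's own implementation.
template <typename T> inline T fma(T x, T y, T z) {
  return std::fma(x, y, z);
}

// Fast path: fused only when the hardware instruction is available,
// otherwise a plain multiply followed by an add (two roundings).
template <typename T> inline T multiply_add(T x, T y, T z) {
#ifdef SKETCH_TARGET_HAS_FMA
  return std::fma(x, y, z);
#else
  return x * y + z;
#endif
}

} // namespace sketch
```

In this sketch, a caller that needs the single-rounding result regardless of cost would use `sketch::fma`, while one that only wants the fastest available multiply-add would use `sketch::multiply_add`.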

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152923/new/

https://reviews.llvm.org/D152923