[libc-commits] [PATCH] D152923: [libc] Add support for FMA in the GPU utilities
Tue Ly via Phabricator via libc-commits
libc-commits at lists.llvm.org
Wed Jun 14 10:00:16 PDT 2023
lntue added a comment.
In D152923#4421746 <https://reviews.llvm.org/D152923#4421746>, @arsenm wrote:
> In D152923#4421674 <https://reviews.llvm.org/D152923#4421674>, @lntue wrote:
>
>> I would assume that the fma instructions on GPU will be more performant than normal multiply + add. Do you want to let generic math functions use fma's for GPUs?
>>
>> https://github.com/llvm/llvm-project/blob/main/libc/src/__support/macros/properties/cpu_features.h#L39
>> https://github.com/llvm/llvm-project/blob/main/libc/src/__support/FPUtil/multiply_add.h#L30
>
> This is trying to resolve the problem the fmuladd intrinsic solves. The target macros should be dropped and you should simply implement multiply_add with FP_CONTRACT on and let the backend decide
Currently these macros are used for a few things:
- Resolving when FMA instructions are available, which is straightforward for other architectures but not so much for early x86-64 CPUs with AVX or AVX2.
- We cannot rely on `__builtin_fma`, since it causes a back-reference to libc's own fma functions.
- We use them to generate code both with and without fma instructions for testing math functions in the current settings, somewhat similar to what is done for memory functions.
- Also, when `fma` instructions are not available, we need precise control over whether to fall back to emulated fma functions or to a plain multiply + add. With the current setup, we can control this simply by calling either `fputil::fma` or `fputil::multiply_add`.
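
To make the last point concrete, here is a minimal sketch of the two entry points. It is not the actual libc code: the `sketch` namespace and the `SKETCH_TARGET_HAS_FMA` macro are hypothetical stand-ins for `fputil` and the target-feature macros, and `std::fma` stands in for libc's own fma implementation so that the example is self-contained.

```cpp
#include <cmath>

namespace sketch {

// Always a fused multiply-add with a single rounding. In this sketch it
// forwards to std::fma; the real code must not depend on the C library's fma.
inline double fma(double x, double y, double z) { return std::fma(x, y, z); }

// Uses the fused operation only when the (hypothetical) target macro says the
// hardware has it; otherwise falls back to a plain multiply followed by an add.
inline double multiply_add(double x, double y, double z) {
#if defined(SKETCH_TARGET_HAS_FMA)
  return sketch::fma(x, y, z);
#else
  return x * y + z;
#endif
}

} // namespace sketch
```

A caller that needs the single-rounding result regardless of the target would call `sketch::fma`, while a caller that only wants the fastest available multiply-add would call `sketch::multiply_add`. Building with and without `-DSKETCH_TARGET_HAS_FMA` exercises both code paths.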
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D152923/new/
https://reviews.llvm.org/D152923