[PATCH] D109658: [X86][FP16] Change the order of the operands in complex FMA intrinsics to allow swap between the mul operands.

Mon Sep 13 07:28:13 PDT 2021

craig.topper added a comment.

In D109658#2997395 <https://reviews.llvm.org/D109658#2997395>, @pengfei wrote:

> In D109658#2996767 <https://reviews.llvm.org/D109658#2996767>, @craig.topper wrote:
>
>> In D109658#2996714 <https://reviews.llvm.org/D109658#2996714>, @pengfei wrote:
>>
>>> In D109658#2996412 <https://reviews.llvm.org/D109658#2996412>, @craig.topper wrote:
>>>
>>>> Does gcc use the same builtin name? Our general policy is to have the same interface as gcc if we have a builtin. So if gcc has these builtins the should work the same way.
>>>
>>> No. We don't sync with GCC on the builtin name during the development. We had a disscussion and decided to not keep them aligned due to 1) target specific builtins are compiler private names that no need to keep it compatible with other compilers; and 2) we already differentiate the target builtins with GCC long ago on the naming, masking etc. Currently, regardless the name, GCC uses the same C, A, B order with our existing implementation. https://gitlab.com/x86-gcc/gcc/-/blob/users/intel/liuhongt/independentfp16_wip/gcc/config/i386/avx512fp16intrin.h#L6672
>>
>> I thought we were pretty consistent on names with gcc for most of sse and avx and most of avx512. The names aren't completely private occasionally users due try to use them. If we happen to have the same name we should have the same behavior to avoid confusion.
>
> I'm not so optimistic. I had a coarse-grained statistics on the use of x86 builtins in Clang and GCC. It shows Clang only defines 2/3 of GCC's builtins and 1/4 of Clang builtins have different names with GCC's. Command below:
>
>   cat gcc/config/i386/*.h | grep -o "\b__builtin_ia32_\w\+" |sort|uniq|tee gcc.txt|wc -l
>   2788
>   
>   ls clang/lib/Headers/*.h |grep -v fp16 |xargs cat |grep -o "\b__builtin_ia32_\w\+" |sort|uniq|tee clang.txt|wc -l
>   1808
>   
>   comm -12 gcc.txt clang.txt |wc -l
>   1347

Not implementing something that gcc does at least gives a compile error if some tries to use the gcc name so I’m fine with that.

Using a different name in clang for something gcc also has is kinda silly. But I guess we’re worse at this then I thought.

Using the same name and having different behavior should be avoided if the it won’t give a compile error.

> Regarding this case, we already have a different name with GCC, I think it worthwhile to use a different order for the swapping optimization.
> With a bit research on AVX512IFMA, I found:
>
> 1. The use of C, A, B order in GCC is not consistent on its AVX512IFMA builtins. It supposes GCC should change to A, B, C order if considering consistency;
> 2. We aren't consistent on AVX512IFMA builtins with GCC either due to the use of select.
>
> By the way, GCC folks told me GCC has ability to specify arbitrary operands that can be commutative. But I found both SDNode and MI only have ability on the first 2 operands, which is insufficient for instruction like CFMA. Do you know if we have other mechanism for commutable operands?

That’s true for SDNode, but you can always manually add more isel patterns. We do this for FMA3.

MI uses two target specific hooks in X86InstrInfo.cpp. findCommutedOpIndices and commuteInstructionImpl are the names if I remember right.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109658/new/

https://reviews.llvm.org/D109658