[llvm-dev] [X86] FMA transformation restrictions
Michael Kuperstein via llvm-dev
llvm-dev at lists.llvm.org
Mon Sep 12 12:07:22 PDT 2016
Assuming I understood the question correctly - Intel doesn't specify which
FMA instruction is going to be used, but it does specify what it expects
the result to be.
For example, for
__m128 _mm_fmadd_ss (__m128 a, __m128 b, __m128 c)
the specified semantics are:
dst[31:0] := (a[31:0] * b[31:0]) + c[31:0]
dst[127:32] := a[127:32]
dst[MAX:128] := 0
The user is allowed to rely on the upper bits of the result being the upper
bits of a, and the compiler is required to choose an appropriate instruction
form that will make this happen.
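
To make the consequence for commuting concrete, here is a minimal sketch of
the semantics quoted above as a plain C++ reference model. Vec4 and
fmadd_ss_model are hypothetical names made up for illustration; this is not
the intrinsic itself and not LLVM code, just the pseudo-code transcribed
lane by lane (lane 0 stands for bits [31:0]).

```cpp
#include <array>
#include <cmath>

// Hypothetical 4-lane stand-in for __m128; index 0 is bits [31:0].
using Vec4 = std::array<float, 4>;

// Reference model of the documented _mm_fmadd_ss semantics:
//   dst[31:0]   := (a[31:0] * b[31:0]) + c[31:0]
//   dst[127:32] := a[127:32]
Vec4 fmadd_ss_model(const Vec4& a, const Vec4& b, const Vec4& c) {
    Vec4 dst = a;                         // upper three lanes pass through from a
    dst[0] = std::fma(a[0], b[0], c[0]);  // fused multiply-add in lane 0 only
    return dst;
}
```

In this model, fmadd_ss_model(a, b, c) and fmadd_ss_model(b, a, c) always
agree in lane 0, since the multiplication is commutative, but their upper
lanes come from a and b respectively. Swapping the two multiplicands is
therefore only safe when nothing observes the upper lanes.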
On Mon, Sep 12, 2016 at 10:24 AM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I noticed that the operand commuting code in X86InstrInfo.cpp treats
> scalar FMA intrinsics specially. It prevents operand commuting on these
> scalar instructions because the scalar FMA instructions preserve the
> upper bits of the vector. Presumably, the restrictions are there
> because commuting operands potentially changes the result upper bits.
> However, AFAIK the Intel and GNU FMA intrinsics don't actually specify
> which FMA (213, 132, 231) is going to be used and so the user can't rely
> on knowing which operand is tied to the destination. Thus the user
> can't rely on knowing what the upper bits will be.
> Is there some other reason these scalar FMA commuting restrictions are
> in place?