[llvm-dev] [X86] FMA transformation restrictions

Mon Sep 12 13:17:00 PDT 2016

Michael Kuperstein <mkuper at google.com> writes:

> Hi David,
>
> Assuming I understood the question correctly - Intel doesn't specify
> which FMA instruction is going to be used, but it does specify what it
> expects the result to be. 
>
> E.g. for
> __m128 _mm_fmadd_ss (__m128 a, __m128 b, __m128 c) 
> the specified semantics are:
>
> dst[31:0] := (a[31:0] * b[31:0]) + c[31:0]
> dst[127:32] := a[127:32]
> dst[MAX:128] := 0
>
> The user is allowed to rely on the upper bits of the result being the
> upper bits of a, and the compiler is required to choose an appropriate
> instruction form that will make this happen.
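
For reference, the lane behaviour quoted above can be modelled in plain C.
This is only a sketch: vec4f and model_fmadd_ss are hypothetical names used
for illustration, not part of any header. The low lane is computed in double
precision, which is close to the fused result but does not reproduce its
single rounding in every case. The point is just to show which operand
supplies the upper lanes.

```c
/* Hypothetical stand-in for __m128: four packed single-precision lanes. */
typedef struct {
    float lane[4];
} vec4f;

/*
 * Scalar model of the semantics quoted for _mm_fmadd_ss (an illustration,
 * not the real intrinsic):
 *   dst[31:0]   := (a[31:0] * b[31:0]) + c[31:0]
 *   dst[127:32] := a[127:32]
 */
static vec4f model_fmadd_ss(vec4f a, vec4f b, vec4f c)
{
    vec4f dst = a; /* start from a, so lanes 1..3 are a's upper lanes */

    /* Assumption: double arithmetic approximates the fused multiply-add.
       The product of two floats is exact in double, but the final result
       is rounded twice (to double, then to float). */
    dst.lane[0] = (float)((double)a.lane[0] * (double)b.lane[0]
                          + (double)c.lane[0]);
    return dst;
}
```

With a = {2, 10, 20, 30}, b = {3, 100, 200, 300} and c = {4, 1000, 2000,
3000}, the model returns {10, 10, 20, 30}: only the low lane is computed,
and the remaining lanes come from a, never from b or c.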

Ah, thank you.  I missed that line about grabbing the upper bits
from a.

Not a big deal, I was just wondering.  Carry on.  :)

                      -David