[llvm-dev] [RFC] Should -ffast-math affect intrinsics?

Sanjay Patel via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 14 06:26:36 PDT 2021


To be clear, there are no target-specific intrinsics in this particular
example because clang translates the source-level intrinsics to generic IR:
https://godbolt.org/z/q4YYs6PxM
(That shows -O1 to make it easier to read, but there are no intrinsics at
-O0 either.)

We chose that form to give the IR and codegen optimizers full opportunity
to perform generic transforms because the target-specific source ops have
the same semantics as generic IR.
So changing the header file to avoid a subset of those optimizations will
likely cause perf regressions/complaints.
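
For reference, the effect the quoted intrin_prob.c depends on can be
reproduced with plain scalar doubles, which is roughly what the optimizer
sees once the intrinsics have become generic fadd/fsub. This is only a
sketch, not code from the thread: the helper names magic_constant and
round_via_magic are made up for illustration, and the volatile temporary is
my assumption about one way to keep the add and the subtract from being
folded together, even under -ffast-math.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 0x4338000000000000 is the bit pattern of
 * 1.5 * 2^52, the same constant the quoted program builds with
 * _mm_set_epi32. At that magnitude a double has no fraction bits. */
static double magic_constant(void) {
    const uint64_t bits = 0x4338000000000000ULL;
    double d;
    memcpy(&d, &bits, sizeof d);  /* reinterpret the bits as a double */
    return d;
}

/* Hypothetical helper: add, then subtract, the constant. The sum is
 * rounded to an integer by the FP add itself (default rounding mode
 * assumed: round to nearest, ties to even). The volatile store forces
 * the sum to be materialized, so (x + C) - C cannot be simplified to x. */
static double round_via_magic(double x) {
    volatile double t = x + magic_constant();
    return t - magic_constant();
}
```

With IEEE doubles and the default rounding mode, round_via_magic(1.5) gives
2.0 and round_via_magic(3.255) gives 3.0, matching the -O2 output quoted
below. Dropping the volatile and building with -ffast-math lets the compiler
cancel the add and subtract, which is the 1.5 / 3.255 result Kevin reported.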


On Wed, Jul 14, 2021 at 5:24 AM Wang, Pengfei via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi Kevin,
>
> AFAIK, it is expected behavior that the fast-math flags affect llvm
> intrinsics. An example is llvm.vector.reduce.fadd.*
> https://llvm.org/docs/LangRef.html#llvm-vector-reduce-fadd-intrinsic.
> But how fast-math flags affect target-dependent intrinsics is a bit vague,
> because target intrinsics express the inherent characteristics of native
> instructions, so they sometimes imply a special FP model. E.g., on X86,
> some intrinsics are meant to be used under fp-model=strict, such as
> _mm512_add_round_ps, while others assume a given constraint (similar to
> fast-math flags), such as _mm_max_ps.
> In general, I think we should respect fast math flags on target intrinsics
> too. We don't do much of it simply because we don't put the emphasis on the
> performance of target intrinsics. There was an optimization under fast math
> flag in X86InstCombineIntrinsic.cpp, which I removed in D85385 for other
> purposes.
>
> Thanks
> Pengfei
>
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Smith,
> Kevin B via llvm-dev
> Sent: Tuesday, July 13, 2021 5:46 AM
> To: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] [RFC] Should -ffast-math affect intrinsics?
>
> Sorry, missed a NOT or two.
>
> This is what I meant to say:
> It seems to me that the fast-math flags really should NOT affect
> intrinsics implementations themselves, and that the fast-math flags should
> NOT allow reassociation across the intrinsic calls. So, is this expected
> behavior, or just something that no-one has noticed before?
>
> It surprised me.
>
> I have also checked GCC behavior, which is consistent with clang, or vice
> versa.  Intel C/C++ compiler does not have fast math flags affect
> intrinsics, at least it doesn't allow reassociation across the call
> boundaries and I haven't checked the Microsoft compiler yet.
>
> Kevin Smith
>
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Smith,
> Kevin B via llvm-dev
> Sent: Monday, July 12, 2021 2:28 PM
> To: llvm-dev at lists.llvm.org
> Subject: [llvm-dev] [RFC] Should -ffast-math affect intrinsics?
>
> I've got the following little program that illustrates what I think is a
> problem. This is for X86/Intel64 intrinsics.
>
> If compiled using
> $ clang -O2 intrin_prob.c
> $ a.out
> 2.000000, 3.000000
>
> This is the expected result.  But if compiled using
> $ clang -O2 -ffast-math intrin_prob.c
> $ a.out
> 1.500000, 3.255000
>
> This gets incorrect results, because reassociation happens across the
> calls to the _mm_add_pd, and _mm_sub_pd intrinsics and the value that
> should have been added and subtracted gets constant folded to zero.  It
> seems to me that the fast-math flags really should not affect intrinsics
> implementations themselves, and that the fast-math flags should allow
> reassociation across the intrinsic calls. So, is this expected behavior, or
> just something that no-one has noticed before?  It surprised me.
> I have also checked GCC behavior, which is consistent with clang, or vice
> versa.  Intel C/C++ compiler does not have fast math flags affect
> intrinsics, at least not for reassociation across the call boundaries and I
> haven't checked the Microsoft compiler yet.
>
> An easy "fix" would be to add
> #pragma float_control(precise, on)
> or
> #pragma clang fp  reassociate(off)
> near the top of immintrin.h to cause all intrinsics to ignore all
> fast-math flags, or at least ignore reassociation.
>
> $ cat intrin_prob.c
> #include <immintrin.h>
> #include <stdio.h>
>
> static union {
>   double u1[2];
>   __m128d u2;
> } t1[1] = {1.25, 3.25};
>
> int main(int argc, char **argv) {
>   __m128d t2;
>   __m128d t3;
>   // This is just so the compiler cannot constant fold
>   // and know the values of t1.
>   t1[0].u1[0] += argc * 0.25;
>   t1[0].u1[1] += argc * .005;
>
>   // This value when added, then subtracted should cause
>   // the values to be rounded to integer. If the compiler
>   // optimizes the add and subtract out by doing
>   // reassociation, then the printed values will have
>   // fractional parts.  If the compiler does the intrinsics
>   // as expected, then the values printed will have no fractional part.
>   t2 = _mm_castsi128_pd(_mm_set_epi32((int)((0x4338000000000000uLL) >> 32),
>                                       (int)((0x4338000000000000uLL) >> 0),
>                                       (int)((0x4338000000000000uLL) >> 32),
>                                       (int)((0x4338000000000000uLL) >> 0)));
>   t3 = _mm_add_pd(t1[0].u2, t2);
>   t3 = _mm_sub_pd(t3, t2);
>   t1[0].u2 = t3;
>
>   printf("%f, %f\n", t1[0].u1[0], t1[0].u1[1]);
>   return 0;
> }
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>