[PATCH] D98708: [LoopVectorize] relax FMF constraint for FP induction

Wed Mar 17 05:30:21 PDT 2021

spatel added a comment.

In D98708#2631295 <https://reviews.llvm.org/D98708#2631295>, @david-arm wrote:

> Hi @dmgreen, yes of course you're right. I'd forgotten about the nsz requirement. It's definitely needed at compile time for vectorising FP reduction loops, i.e. `clang -fassociative-math -fno-trapping-math -fno-signed-zeros`. I guess adding a check for nsz here is consistent with that?

Yes - clang derived its requirements from gcc, so we've passed that into the optimizer in some places (instcombine at least). I don't know of any practical examples where you could have FP reassociation and still guarantee sign-of-zero, but maybe I'm not being imaginative. :)
So currently there's no easy way (starting from C/C++ at least) to have IR that has `reassoc` without `nsz`.

Ok if I push this change, so we're consistent within the vectorizer? Then, I'll push a follow-up (we'll need a pile of new regression tests) to add the `nsz` requirement for both induction and reduction. That way, we'll be conservatively correct in requiring the extra flag, and we'll match the expected IR coming out of clang.

Note that the FMF requirements for fmul/fadd reduction/induction are different from those for the fmin/fmax patterns that we've also recently updated; fmin/fmax require `nnan` and `nsz` to rearrange, but not `reassoc` (since there's no FP math involved in those ops).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98708/new/

https://reviews.llvm.org/D98708